Meltdown and Spectre: Systems Management Challenges
The Meltdown and Spectre vulnerabilities have a reach that exceeds any in recent memory. From servers to desktops, tablets to phones, and Intel to AMD and ARM, these security vulnerabilities affect systems and devices across the entire spectrum of computing.
With the public disclosure of these vulnerabilities come the patches, fixes, and mitigation techniques that try to close the attack surface. When each security fix is considered on its own, the solution is straightforward. But for any business, this wave of security fixes will shine a light on how important systems management is to staying secure.
Systems management can be broken into two core areas: state management and operations management. State management means understanding the current state or configuration of the target system and how to manage or change that state. State information can include the hardware, operating system (OS), applications, and data. State management tools must understand what the current status is (inventory and control), how to maintain that status (backup and recovery), and how to change that status (hardware configurations, patches, and app deployment). The potential fixes needed for Meltdown and Spectre, from CPU microcode updates through OS patches to application updates and changes, all require the state management part of systems management.
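To make the inventory-and-control idea concrete, here is a minimal sketch in Python of tracking which fix layers each system still lacks. The layer names follow the three kinds of fixes mentioned above; the class, function, and host names are hypothetical and do not come from any particular systems management product.

```python
from dataclasses import dataclass

# The three fix layers named in the text; the identifiers are illustrative.
LAYERS = ("microcode", "os_patch", "app_update")


@dataclass
class SystemState:
    name: str            # hypothetical host name
    kind: str            # e.g. "server", "desktop", "mobile", "cloud"
    applied: frozenset   # layers whose fix has already been applied

    def missing_layers(self):
        """Return the fix layers not yet applied, in LAYERS order."""
        return [layer for layer in LAYERS if layer not in self.applied]


def unmitigated(inventory):
    """Map each incompletely patched system to the layers it still lacks."""
    return {s.name: s.missing_layers() for s in inventory if s.missing_layers()}


# Made-up inventory for illustration only.
inventory = [
    SystemState("db-01", "server", frozenset(LAYERS)),
    SystemState("laptop-17", "desktop", frozenset({"os_patch"})),
    SystemState("web-vm-3", "cloud", frozenset({"microcode", "os_patch"})),
]

report = unmitigated(inventory)
for host, layers in report.items():
    print(host, "still needs:", ", ".join(layers))
```

A real tool would populate this inventory from agents or vendor APIs rather than by hand. The point of the sketch is only that a system counts as mitigated when every layer is accounted for, not when a single patch has landed.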
Further complicating this is the wide variety of systems that need to be updated and the tools used to update them. Desktops, mobile devices, servers, and cloud resources often have different tool sets for systems management. For a business to know that its systems are secure, all these tools will have to coordinate and validate the deployment of security fixes. This complication may be why we are seeing growing demand for cross-function, integrated systems management.
While the initial focus of the Meltdown and Spectre news has been on the patches and fixes, those security fixes come with potential performance penalties. Well after the state management tasks are done, the operations management side of systems management will still be dealing with the consequences. Operations management enables the monitoring and management of running applications and systems, from back-end server performance to the end-user experience. Gone are the days when IT could simply watch the performance metrics captured from back-end servers. Today’s systems, including cloud-native and serverless workloads, also need end-user experience monitoring.
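As a rough illustration of watching both sides, the following Python sketch compares median latency before and after patching for one back-end metric and one end-user metric. The metric names and sample values are invented for the example, and the choice of the median as the summary statistic is an assumption rather than a recommendation from any monitoring product.

```python
from statistics import median


def percent_change(baseline, current):
    """Relative change of the median in percent (positive means slower)."""
    base = median(baseline)
    return (median(current) - base) / base * 100.0


# Hypothetical latency samples in milliseconds: (before patch, after patch).
samples = {
    "backend_query_ms": ([20.0, 22.0, 21.0], [23.0, 24.0, 25.0]),
    "page_load_ms": ([800.0, 820.0, 790.0], [880.0, 900.0, 860.0]),
}

regressions = {
    name: round(percent_change(before, after), 1)
    for name, (before, after) in samples.items()
}

for name, change in regressions.items():
    print(f"{name}: {change:+.1f}% after patching")
```

Tracking the two metrics side by side matters because a slowdown measured on the back end does not necessarily translate one-to-one into what the end user experiences.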
The issue is the performance impact of the security fixes. Initial estimates put it in the single digits to low teens in percentage terms, with the possibility of a greater impact on workloads that are CPU or network intensive. Those workloads are exactly what businesses run, especially in public cloud services. In today’s leaner IT operations, most workloads are sized exactly for the required resources, especially in the public cloud, where you pay by the size of the VM. If a workload loses 5-15% in performance, a resizing exercise is required, potentially across all impacted workloads. That is why operations management is key. The performance impact of these security fixes must be measured across all areas, from back-end to end users, and must be continuously monitored as more fixes are rolled out.
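The resizing arithmetic can be sketched as follows. The sketch assumes, as a simplification, that throughput scales linearly with capacity, so a workload that loses a fraction p of its performance needs 1 / (1 - p) times its current capacity to get back to where it was. The VM size ladder and function names are hypothetical and do not reflect any specific cloud provider.

```python
def required_capacity(current, penalty):
    """Capacity needed to restore pre-patch throughput.

    Assumes throughput scales linearly with capacity (a simplification).
    `penalty` is the fractional slowdown, e.g. 0.15 for 15%.
    """
    if not 0 <= penalty < 1:
        raise ValueError("penalty must be in [0, 1)")
    return current / (1.0 - penalty)


def next_size(required, sizes):
    """Smallest available size that covers `required`, or None if none does."""
    for size in sorted(sizes):
        if size >= required:
            return size
    return None


# Hypothetical VM size ladder, in vCPUs.
VM_SIZES = [2, 4, 8, 16, 32]

# A workload that fully uses an 8-vCPU VM and slows down by 15%.
exact_need = required_capacity(8, 0.15)
exact_size = next_size(exact_need, VM_SIZES)

# A workload that uses only 6 of its 8 vCPUs, with the same 15% slowdown.
headroom_need = required_capacity(6, 0.15)
headroom_size = next_size(headroom_need, VM_SIZES)

print(f"exactly sized: needs {exact_need:.2f} vCPUs -> {exact_size}-vCPU VM")
print(f"with headroom: needs {headroom_need:.2f} vCPUs -> {headroom_size}-vCPU VM")
```

Under these assumptions, the exactly sized workload is pushed to the next tier on the ladder, while the workload with headroom stays where it is. That is why lean sizing makes even a modest performance penalty expensive.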
The impact of Meltdown and Spectre will likely be felt by businesses for a long time to come. The many facets of systems management will play a critical role in how these vulnerabilities are mitigated. IT needs to look beyond the headlines and start looking at how it will leverage its systems management tools to deploy these fixes quickly, efficiently, and with as little impact on operations as possible.