Patching the (Hypervisor) Platform: How Do You Manage Risk?
Hi. Me again.
These blogs focused mainly on virtualization-powered IaaS/PaaS offerings, and whilst they targeted “Cloud Computing,” they applied equally to the heavily virtualized enterprise. On this point, I wrote another in 2008 titled “On Patch Tuesdays For Virtualization Platforms.”
The operational impacts of managing change control, vulnerability management and threat mitigation have always intrigued me, especially at scale.
I was reminded this morning of the importance of the question posed above as VMware released a series of security advisories detailing ten vulnerabilities across many products, some of which are remotely exploitable. While security vulnerabilities in hypervisors are not new, it’s unclear to me how many heavily virtualized enterprises or Cloud providers actually deal with what it means to patch this critical layer of infrastructure.
Once virtualized, we expect/assume that VMs and the guest OSes within them should operate with functional equivalence when compared to non-virtualized instances. We have, however, seen that this is not always the case. It’s rare, but OSes and applications, once virtualized, do sometimes suffer from issues that cause faults in the underlying virtualization platform itself.
So here’s the $64,000 question – feel free to answer anonymously:
While virtualization is meant to effectively isolate the hardware from the resources atop it, the VMM/Hypervisor itself maintains a delicate position arbitrating this abstraction. When the VMM/Hypervisor needs patching, how do you regression test the impact across all your VM images (across test/dev, production, etc.)? More importantly, how are you assessing/measuring compound risk across shared/multi-tenant environments with respect to patching and its impact?
P.S. It occurs to me, having written last night’s blog on ‘high assurance (read: TPM-enabled)’ virtualization/cloud environments, that with respect to change control, the reference images for trusted launch environments would be impacted by patches like this. How are we going to scale this from a management perspective?
Related articles by Zemanta
- Incomplete Thought: Virtual Machines Are the Problem, Not the Solution… (rationalsurvivability.com)
- More On High Assurance (via TPM) Cloud Environments (rationalsurvivability.com)
- Redux: Patching the Cloud (rationalsurvivability.com)
- On Patch Tuesdays For Virtualization Platforms (rationalsurvivability.com)