Archive for March, 2009

Introducing the Cloud Security Alliance

March 31st, 2009 5 comments

I’m a founding member and serve as the technical advisor for the Cloud Security Alliance (CSA).  This is an organization you may not have heard of yet, so I wanted to introduce you.

The more formal definition of the role and goals of the CSA appears below, but it’s most easily described as a member-driven forum for industry, providers and “consumers” of Cloud Computing services to discuss issues and opportunities for security in this emerging space and help craft awareness, guidance and best practices for secure Cloud adoption.  It’s not a standards body. It’s not a secret cabal of industry-only players shuffling for position.

It’s a good mix of vendors, practitioners and interested parties who are concerned with framing the most pressing concerns related to Cloud security and working together to bring ideas to life on how we can address them. 

From the website, here’s the more formal definition:

The CSA is a non-profit organization formed to promote the use of best practices for providing security assurance within Cloud Computing, and provide education on the uses of Cloud Computing to help secure all other forms of computing.

The Cloud Security Alliance is comprised of many subject matter experts from a wide variety of disciplines, united in our objectives:

  • Promote a common level of understanding between the consumers and providers of cloud computing regarding the necessary security requirements and attestation of assurance.
  • Promote independent research into best practices for cloud computing security.
  • Launch awareness campaigns and educational programs on the appropriate uses of cloud computing and cloud security solutions.
  • Create consensus lists of issues and guidance for cloud security assurance.

The Cloud Security Alliance will be launched at the RSA Conference 2009 in San Francisco, April 20-24, 2009.

It’s clear that people will likely draw parallels between the CSA and the Open Cloud Manifesto given the recent announcement of the latter.  

The key difference between the two efforts relates to the CSA’s engagement and membership by both providers and consumers of Cloud Services and the organized non-profit structure of the CSA.  The groups are complementary in nature and goals.

You can see who is participating in the CSA now based upon the pre-release of the working draft of our initial whitepaper.  Full attribution of company affiliation will be posted as the website is updated:



Nils Puhlmann
Jim Reavis

Founding Members and Contributors

Todd Barbee
Alan Boehme
Jon Callas
Sean Catlett
Shawn Chaput
Dave Cullinane
Ken Fauth
Pam Fusco
Francoise Gilbert
Christofer Hoff
Dennis Hurst
Michael Johnson
Shail Khiyara
Subra Kumaraswamy
Paul Kurtz
Mark Leary
Liam Lynch
Tim Mather
Scott Matsumoto
Luis Morales
Dave Morrow
Izak Mutlu
Jean Pawluk
George Reese
Jeff Reich
Jeffrey Ritter
Ward Spangenberg
Jeff Spivey
Michael Sutton
Lynn Terwoerds
Dave Tyson
John Viega
Dov Yoran
Josh Zachry

Founding Charter Companies


If you’d like to get involved, here’s how:


Individuals with an interest in cloud computing and expertise to help make it more secure receive a complimentary individual membership based on a minimum level of participation. If you are interested in becoming a member, apply to join our LinkedIn Group


Not-for-profit associations and industry groups may form an affiliate partnership with the Cloud Security Alliance to collaborate on initiatives of mutual concern. Contact us at for more information.


Information on corporate memberships and sponsorship programs will be available soon. Contact for more information.


Meditating On the Manifesto: It’s Good To Be King…

March 29th, 2009 6 comments

By now you’ve heard of ManifestoGate, no?  If not, click on that link and read all about it as James Urquhart does a nice job summarizing it all.

In the face of all of this controversy, tonight Reuven Cohen twittered that the website was live.

So I moseyed over to take a look at the promised list of supporters of said manifesto since I’ve been waiting for a definition of the “we” who developed/support it.

It’s a very interesting list.

There are lots of players. Some of them are just starting to bring their Cloud visions forward.

But clearly there are some noticeable absences, namely Google, Microsoft, Salesforce and Amazon, the four largest established Cloud players in the Cloudusphere.

I think it’s been said in so many words before, but let me make it perfectly clear why, despite the rhetoric both acute and fluffy from both sides, these four Cloud giants aren’t listed as supporters.

Here are the listed principles of the Open Cloud from the manifesto itself:

Of course, many clouds will continue to be different in a number of important ways, 
providing unique value for organizations. It is not our intention to define standards for 
every capability in the cloud and create a single homogeneous cloud environment. 

Rather, as cloud computing matures, there are several key principles that must be 
followed to ensure the cloud is open and delivers the choice, flexibility and agility 
organizations demand:

1. Cloud providers must work together to ensure that the challenges to 
cloud adoption (security, integration, portability, interoperability, 
governance/management, metering/monitoring) are addressed through 
open collaboration and the appropriate use of standards.

2. Cloud providers must not use their market position to lock customers 
into their particular platforms and limit their choice of providers.

3. Cloud providers must use and adopt existing standards wherever 
appropriate. The IT industry has invested heavily in existing standards 
and standards organizations; there is no need to duplicate or reinvent 
them.

4. When new standards (or adjustments to existing standards) are needed, 
we must be judicious and pragmatic to avoid creating too many 
standards. We must ensure that standards promote innovation and do 
not inhibit it.

5. Any community effort around the open cloud should be driven by 
customer needs, not merely the technical needs of cloud providers, and 
should be tested or verified against real customer requirements.

6. Cloud computing standards organizations, advocacy groups, and 
communities should work together and stay coordinated, making sure 
that efforts do not conflict or overlap.

Fact is, from a customer’s point of view, I find all of these principles agreeable and despite calling it a manifesto, I could see using it as a nice set of discussion points with which I can chat about my needs from the Cloud.   It’s interesting to note that, given the audience as stated in the manifesto, the only listed supporters are vendors and not “customers.”

I think the more discussion we have on the matter, the better.  Personally, I grok and support the principles herein.  I’m sure this point will be missed as I play devil’s advocate, but so be it.  

However, from the “nice theory, wrong universe” vendor’s point-of-view, why/how could I sign it?

See #2 above?  It relates to exactly the point made by James when he said “Those who have publicly stated that they won’t sign have the most to lose.”

Yes they do.  And the last time I looked, all four of them have notions of what the Cloud ought to be, and how and to what degree it ought to interoperate and with whom.

I certainly expect they will leverage every ounce of “lock-in” enhanced customer experience through a tightly-coupled relationship they can muster and capitalize on the de facto versus de jure “standardization” that naturally occurs in a free market when you’re in the top 4.  Someone telling me I ought to sign a document to the contrary would likely not get offered a free coffee at the company cafe.

Trying to socialize (in every meaning of the word) goodness works wonders if you’re a kibbutz.  With billions up for grabs in a technology land-grab, not so much.

This is where the ever-hopeful consumer, the idealist integrator, and the vendor-realist personalities in me begin to battle.

Oh, you should hear the voices in my head…


Categories: Cloud Computing Tags:

Incomplete Thought: Looking At An “Open & Interoperable Cloud” Through Azure-Colored Glasses

March 29th, 2009 4 comments

As with the others in my series of “incomplete thoughts,” this one is focused on an issue that has been banging around in my skull for a few days.  I’m not sure how to properly articulate my thought completely, so I throw this up for consideration, looking for your discussion to clarify my thinking.

You may have heard of the little dust-up involving Microsoft and the folk(s) behind the Open Cloud Manifesto. The drama here reminds me of the Dallas episode where everyone tried to guess who shot J.R., and it’s really not the focus of this post.  I use it here for color.

What is the focus of this post is the notion of “open(ness),” portability and interoperability as it relates to Cloud Computing — or more specifically how these terms relate to the infrastructure and enabling platforms of Cloud Computing solution providers.

I put “openness” in quotes because definitionally, there are as many representations of this term as there are for “Cloud,” which is a big part of the problem.  Just to be fair, before you start thinking I’m unduly picking on Microsoft, I’m not. I challenged VMware on the same issues.

So here’s my question as it relates to Microsoft’s strategy regarding Azure given an excerpt from Microsoft’s Steven Martin as he described his employer’s stance on Cloud in regard to the Cloudifesto debacle above in his blog post titled “Moving Toward an Open Process On Cloud Computing Interoperability”:

From the moment we kicked off our cloud computing effort, openness and interop stood at the forefront. As those who are using it will tell you, the  Azure Services Platform is an open and flexible platform that is defined by web addressability, SOAP, XML, and REST.  Our vision in taking this approach was to ensure that the programming model was extensible and that the individual services could be used in conjunction with applications and infrastructure that ran on both Microsoft and non-Microsoft stacks. 

What got me going was this ZDNet interview in which Mary Jo Foley spoke with Julius Sinkevicius, Microsoft’s Director of Product Management for Windows Server. In it she loosely compares Cisco’s Cloud strategy to Microsoft’s and touches on an apparent lack of interoperability between Microsoft’s own virtualization and Cloud Computing platforms:

MJF: Did Cisco ask Microsoft about licensing Azure? Will Microsoft license all of the components of Azure to any other company?

Sinkevicius: No, Microsoft is not offering Windows Azure for on premise deployment. Windows Azure runs only in Microsoft datacenters. Enterprise customers who wish to deploy a highly scalable and flexible OS in their datacenter should leverage Hyper-V and license Windows Server Datacenter Edition, which has unlimited virtualization rights, and System Center for management.

MJF: What does Microsoft see as the difference between Red Dog (Windows Azure) and the OS stack that Cisco announced?

Sinkevicius: Windows Azure is Microsoft’s runtime designed specifically for the Microsoft datacenter. Windows Azure is designed for new applications and allows ISVs and Enterprises to get geo-scale without geo-cost.  The OS stack that Cisco announced is for customers who wish to deploy on-premise servers, and thus leverages Windows Server Datacenter and System Center.

The source of the on-premise Azure hosting confusion appears to be this: All apps developed for Azure will be able to run on Windows Server, according to the Softies. However — at present — the inverse is not true: Existing Windows Server apps ultimately may be able to run on Azure. For now only some can do so, and only with a fairly substantial amount of tweaking.

Microsoft’s cloud pitch to enterprises who are skittish about putting their data in the Microsoft basket isn’t “We’ll let you host your own data using our cloud platform.” Instead, it’s more like: “You can take some/all of your data out of our datacenters and run it on-premise if/when you want — and you can do the reverse and put some/all of your data in our cloud if you so desire.”

What confuses me is how Azure, as a platform, will be limited to deployment only in Microsoft’s operating environment (i.e. their datacenters) and not for use outside of that environment and how that compares to the statements above regarding the interoperability described by Martin.

Doesn’t the proprietary nature of the Azure runtime platform, “open” or not via API, by definition limit its openness and interoperability? If I can’t take my applications and information and operate them anywhere without major retooling, how does that imply openness, portability and interoperability?

If one cannot do that fully between Windows Server and Azure — both from the same company —  what chance do we have between instances running across different platforms not from Microsoft?

The issue at hand to consider is this:

If you do not have one-to-one parity between the infrastructure that provides your cloud “externally” versus “internally,” (and similarly public versus private clouds) can you truly claim openness, portability and interoperability?

What do you think?

Pimping My Friends: Joshua Corman on Virtualization Security

March 29th, 2009 1 comment
Josh Corman - Virtualization Security Tutorial

Joshua Corman is IBM/ISS’ Principal Security Strategist and a longtime friend.

Josh has a great virtualization security tutorial up at the Internet Evolution “macro-site.”

I like the layout and functionality as well as the content; there is a ton of great information here.

Check it out.


Update on the Cloud (Ontology/Taxonomy) Model…

March 28th, 2009 3 comments

A couple of months ago I kicked off a community-enabled project to build an infrastructure-centric ontology/taxonomy model of Cloud Computing.

You can see the original work with all the comments here.  Despite the distracting haggling over the use of the words “ontology and taxonomy,”  the model (as I now call it) has been well received by those for whom it was created.

Specifically, my goal was to be able to help a network or security professional do these things:

  1. Settle on a relevant and contextual technology-focused definition of Cloud Computing and its various foundational elements beyond the existing academic & 30,000 foot-view models
  2. Understand how this definition maps to the classical SPI (SaaS, PaaS, IaaS) models with which many people are familiar
  3. Deconstruct the SPI model and present it in a layered format similar to the OSI model showing how each of the SPI components interact with and build atop one another
  4. Provide a further relevant mapping of infrastructure, applications, etc. at each model so as to relate well-understood solutions experiences to each
  5. Map a set of generally-available solutions from a catalog of compensating controls (from the security perspective) to each layer
  6. Ultimately map the SPI layers to the compensating controls and in turn to a set of governance and regulatory requirements (SoX, PCI, HIPAA, etc.)

This is very much, and unapologetically so, a controls-based model.  I assume that there exists no utopic state of proper architectural design, secure development lifecycle, etc. Just like the real world.  So rather than assume that we’re going to have universal goodness thanks to proper architecture, design and execution, I thought it more reasonable to think about plugging the holes (sadly) and going from there.

At the end of the day, I wanted an IT/Security professional to use the model like an “Annie Oakley Secret Decoder Ring” in order to help rapidly assess offerings, map them to the specific model layers, and understand what controls they or a vendor need to have in place by mapping those, in turn, to compliance requirements.  This would allow for a quick and accurate way to perform a gap analysis, which in turn can be used to feed into a risk assessment/analysis.
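To make the "decoder ring" idea concrete, here is a minimal sketch of a gap analysis: controls in place at each layer are compared against what a regulation would require at that layer. Everything in it is an assumption for illustration only. The layer names, control names, the `REQUIREMENTS` table, the `Offering` class and the `gap_analysis` function are hypothetical and are not taken from the model or from any real regulatory text.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set

# Hypothetical data: regulation -> model layer -> controls required there.
# These entries are placeholders, not actual PCI or HIPAA requirements.
REQUIREMENTS: Dict[str, Dict[str, Set[str]]] = {
    "PCI": {
        "network": {"firewall", "ids"},
        "data": {"encryption_at_rest"},
    },
    "HIPAA": {
        "data": {"encryption_at_rest", "access_logging"},
    },
}


@dataclass
class Offering:
    """A Cloud offering described by the controls in place at each layer."""
    name: str
    controls_in_place: Dict[str, Set[str]] = field(default_factory=dict)


def gap_analysis(
    offering: Offering, regulations: Iterable[str]
) -> Dict[str, Dict[str, Set[str]]]:
    """Return regulation -> layer -> required controls the offering lacks."""
    gaps: Dict[str, Dict[str, Set[str]]] = {}
    for regulation in regulations:
        for layer, required in REQUIREMENTS[regulation].items():
            missing = required - offering.controls_in_place.get(layer, set())
            if missing:
                gaps.setdefault(regulation, {})[layer] = missing
    return gaps


# Hypothetical offering with a firewall and encryption, but nothing else.
example = Offering(
    name="example-iaas",
    controls_in_place={"network": {"firewall"}, "data": {"encryption_at_rest"}},
)
print(gap_analysis(example, ["PCI", "HIPAA"]))
```

The resulting dictionary of missing controls is the sort of output that could feed a risk assessment. A real version would need the full controls catalog and the regulatory mappings described in the goals above.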

We went through 5 versions in a relatively short period of time and arrived at a solid fundamental model based upon feedback from the target audience:


The model is CLEARLY not complete.  The next four steps for improving it are:

  1. Reference other solution taxonomies to complete the rest of the examples and expand upon the various layers with key elements and offerings from vendors/solutions providers.  See here.
  2. Polish up the catalog of compensating controls
  3. Start mapping to various regulatory/compliance requirements
  4. Find a better way of interactively presenting this whole mess.

For my Frogs presentation, I presented the first stab at the example controls mapping and it seemed to make sense given the uptake/interest in the model. Here’s an example:

Frogs: Cloud Model Aligned to Security Controls Model

This still has a ways to go, but I’ve been able to present this to C-levels, bankers, technologists and lay people (with explanation) and it’s gone over well.

I look forward to making more progress on this shortly and would welcome the help, commentary, critique and collaboration.

I’ll start adding more definition to each of the layers so people can provide feedback appropriately.



P.S. A couple of days ago I discovered that Kevin Jackson had published an extrapolation of the UCSB/IBM version titled “A Tactical Cloud Computing Ontology.”

Kevin’s “ontology” is at the 20,000 foot view compared to the original 30,000 foot UCSB/IBM model but is worth looking at.

Categories: Cloud Computing, Cloud Security Tags:

The Most Comprehensive Review Of the Open Cloud Computing Manifesto Debacle, Ever…

March 28th, 2009 3 comments

[This Page Intentionally Left Blank]

That is all.


Categories: Cloud Computing, Cloud Security Tags:

Joanna Rutkowska: Making Invisible Things Visible…

March 25th, 2009 No comments

I’ve had my issues in the past with Joanna Rutkowska; the majority of which have had nothing to do with the technical content of her work, but more along the lines of how it was marketed.  That was then, this is now.

Recently, the Invisible Things Lab team have released some really interesting work regarding attacking the SMM.  What I’m really happy about is that Joanna and her team are really making an effort to communicate the relevance and impact  the team’s research and exploits really have in ways they weren’t doing before.  As much as I was critical previously, I must acknowledge and thank her for that, too.

I’m reasonably sure that Joanna couldn’t care less what I think, but I think the latest work is great and really does indicate the profoundly shaky foundation upon which we’ve built our infrastructure and I am thankful for what this body of work points out.

Here’s a copy of Joanna’s latest blog titled “The Sky is Falling?” explaining such:

A few reporters asked me if our recent paper on SMM attacking via CPU cache poisoning means the sky is really falling now?

Interestingly, not many people seem to have noticed that this is the 3rd attack against SMM our team has found in the last 10 months. OMG 😮

But anyway, does the fact we can easily compromise the SMM today, and write SMM-based malware, does that mean the sky is falling for the average computer user?

No! The sky has actually fallen many years ago… Default users with admin privileges, monolithic kernels everywhere, most software unsigned and downloadable over plaintext HTTP — these are the main reasons we cannot trust our systems today. And those pathetic attempts to fix it, e.g. via restricting admin users on Vista, but still requiring full admin rights to install any piece of stupid software. Or selling people illusion of security via A/V programs, that cannot even protect themselves properly…

It’s also funny how so many people focus on solving the security problems by “Security by Correctness” or “Security by Obscurity” approaches — patches, patches, NX and ASLR — all good, but it is not gonna work as an ultimate protection (if it could, it would worked out already).

On the other hand, there are some emerging technologies out there that could allow us to implement effective “Security by Isolation” approach. Such technologies as VT-x/AMD-V, VT-d/IOMMU or Intel TXT and TPM.

So we, at ITL, focus on analyzing those new technologies, even though almost nobody uses them today. Because those technologies could actually make the difference. Unlike A/V programs or Patch Tuesdays, those technologies can change the level of sophistication required for the attacker dramatically.

The attacks we focus on are important for those new technologies — e.g. today Intel TXT is pretty much useless without protection from SMM attacks. And currently there is no such protection, which sucks. SMM rootkits sound sexy, but, frankly, the bad guys are doing just fine using traditional kernel mode malware (due to the fact that A/V is not effective). Of course, SMM rootkits are just yet another annoyance for the traditional A/V programs, which is good, but they might not be the most important consequence of SMM attacks.

So, should the average Joe Dow care about our SMM attacks? Absolutely not!

I really appreciate the way this is being discussed; I think the ITL work is (now) moving the discussion forward by framing the issues instead of merely focusing on sensationalist exploits that whip people into a frenzy and cause them to worry about things they can’t control instead of the things they unfortunately choose not to 😉 

I very much believe that we can and will see advancements with the “security by isolation” approach; a lot of other bad habits and classes of problems can be eliminated (or at least significantly reduced) with the benefit of virtualization technology. 


Cloud Catastrophes (Cloudtastrophes?) Caused by Clueless Caretakers?

March 22nd, 2009 4 comments
You'll ask "How?" Everytime... 




Enter the dawn of the Cloudtastrophe…

I read a story today penned by Maureen O’Gara titled “Carbonite Loses Cloud-Based Data, Sues Storage Vendor.”

I thought this was going to be another story regarding a data breach (loss) of customer data by a Cloud Computing service vendor.

What I found, however, was another hyperbolic illustration of how the messaging of the Cloud by vendors has set expectations for service and reliability that are out of alignment with reality when you take a lack of: sound enterprise architecture, proper contingency planning, solid engineering and common sense and add the economic lubricant of the Cloud.

Stir in a little journalistic sensationalism, and you’ve got CloudWow!

Carbonite, the online backup vendor, says it lost data belonging to over 7,500 customers in a number of separate incidents in a suit filed in Massachusetts charging Promise Technology Inc with supplying it with $3 million worth of defective storage, according to a story in Saturday’s Boston Globe.

The catastrophe is the latest in a series of cloud failures.

The widgetry was supposed to detect disk failures and transfer the data to a working drive. It allegedly didn’t.

The story says Promise couldn’t fix the errors and “Carbonite’s senior engineers, senior management and senior operations personnel…spent enormous amounts of time dealing with the problems.”

Carbonite claims the data losses caused “serious damage” to its business and reputation for reliability. It’s demanding unspecified damages. Promise told the Globe there was “no merit to the allegations.”

Carbonite, which sells largely to consumers and small businesses and competes with EMC’s Mozy, tells its customers: “never worry about losing your files again.”

The abstraction of infrastructure and democratization of applications and data that Cloud Computing services can bring does not mean that all services are created equal.  It does not make our services or information more secure (or less, for that matter).  Just because a vendor brands itself as a “Cloud” provider does not mean that “their” infrastructure is any more implicitly reliable, stable or resilient than traditional infrastructure or that proper enterprise architecture as it relates to people, process and technology is in place.  How the infrastructure is built and maintained is just as important as ever.

If you take away the notion of Carbonite being a “Cloud” vendor, would this story read any differently?

We’ve seen a few failures recently of Cloud-based services, most of them sensationally lumped into the Cloud pile: Google, Microsoft, and even Amazon; most of the stories about them relate the impending doom of the Cloud…

Want another example of how Cloud branding, the Web2.0 experience and blind faith make for another FUDtastic “catastrophe in the cloud”?  How about the well-known service Ma.gnolia?

There was a meltdown at bookmark sharing website Ma.gnolia Friday morning. The service lost both its primary store of user data, as well as its backup. The site has been taken offline while the team tries to reconstruct its databases, though some users may never see their stored bookmarks again.

The failure appears to be catastrophic. The company can’t say to what extent it will be able to restore any of its users’ data. It also says the data failure was so extensive, repairing the loss will take “days, not hours.”

So we find that a one-man shop was offering a service that people liked and it died a horrible death.  Because it was branded as a Cloud offering, it “seemed” bigger than it was.  This is where perception definitely was not reality and now we’re left with a fluffy bad taste in our mouths.

Again, what this illustrates is that just because a service is “Cloud-based” does not imply it’s any more reliable or resilient than one that is not. It’s just as important that enterprises looking to move to the Cloud perform as much due diligence on their providers as makes sense. We’ll see a weeding out of the ankle-biters in Cloud Computing.

Nobody ever gets fired for buying IBM…

What we’ll also see is that even though we’re not supposed to care what our Cloud providers’ infrastructure is powered by and how, we absolutely will in the long term and the vendors know it.

This is where people start to freak about how standards and consolidation will kill innovation in the space but it’s also where the realities of running a business come crashing down on early adopters.

Large enterprises will move to providers who can demonstrate that their services are solid by way of co-branding with the reputation of the providers of infrastructure coupled with the compliance to “standards.”

The big players like IBM see this as an opportunity and as early as last year introduced a Cloud certification program:

IBM to Validate Resiliency of Cloud Computing Infrastructures

Will Consult With Businesses of All Sizes to Ensure Resiliency, Availability, Security; Drive Adoption of New Technology

ARMONK, NY – 24 Nov 2008: In a move that could spur the rise of the nascent computing model known as “cloud,” IBM (NYSE: IBM) today said it would introduce a program to validate the resiliency of any company delivering applications or services to clients in the cloud environment. As a result, customers can quickly and easily identify trustworthy providers that have passed a rigorous evaluation, enabling them to more quickly and confidently reap the business benefits of cloud services.

Cloud computing is a model for network-delivered services, in which the user sees only the service and does not view the implementation or infrastructure required for its delivery. The success to date of cloud services like storage, data protection and enterprise applications, has created a large influx of new providers. However, unpredictable performance and some high-profile downtime and recovery events with newer cloud services have created a challenge for customers evaluating the move to cloud.

IBM’s new “Resilient Cloud Validation” program will allow businesses who collaborate with IBM on a rigorous, consistent and proven program of benchmarking and design validation to use the IBM logo: “Resilient Cloud” when marketing their services.

Remember the “Cisco Powered Network” program?  How about a “Cisco Powered Cloud?”  See how GoGrid advertises that their load balancers are F5?

In the long term, like the CapitalOne credit card commercials challenging the company providing your credit card services by asking “What’s in your wallet?” you can expect to start asking the same thing about your Cloud providers’ offerings, also.



Azure Users Seeing Red: When Patching the Cloud Causes Cracks

March 19th, 2009 4 comments

No, this isn’t one of those posts that suggests we can’t depend on the Cloud just because of one (ok, many) outages of note lately.  That’s so dystopic.  Besides, everyone else is already doing that.

I mean just because Azure was offline for 22 hours isn’t cause for that much concern, right?  It’s a beta community technology preview, anyway… 😉  Just like Google’s a beta.

What I found interesting was what Microsoft reported as the root cause for the outage, however:


The Windows Azure Malfunction This Weekend

First things first: we’re sorry.  As a result of a malfunction in Windows Azure, many participants in our Community Technology Preview (CTP) experienced degraded service or downtime.  Windows Azure storage was unaffected.

In the rest of this post, I’d like to explain what went wrong, who was affected, and what corrections we’re making.

What Happened?

During a routine operating system upgrade on Friday (March 13th), the deployment service within Windows Azure began to slow down due to networking issues.  This caused a large number of servers to time out and fail.

You catch that bit about “…a routine operating system upgrade?”  Sometimes we call those things “patches.”  Even if this wasn’t a patch, let’s call it one for argument’s sake, okay?

As such, I was reminded of a blog post that I wrote last year titled: “Patching the Cloud” in which I squawked about my concerns regarding patching and change management/roll-back in Cloud services.  It seems apropos:


Your application is sitting atop an operating system and underlying infrastructure that is managed by the cloud operator.  This “datacenter OS” may not be virtualized or could actually be sitting atop a hypervisor which is integrated into the operating system (Xen, Hyper-V, KVM) or perhaps reliant upon a third party solution such as VMware.  The notion of cloud implies shared infrastructure and hosting platforms, although it does not imply virtualization.

A patch affecting any one of the infrastructure elements could cause a ripple effect on your hosted applications.  Without understanding the underlying infrastructure dependencies in this model, how does one assess risk and determine what any patch might do up or down the stack?  …

Huh.  Go figure.  
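The ripple effect in that quoted passage can be sketched as a walk over a dependency graph: given the component being patched, find everything that sits on top of it, directly or transitively. This is only an illustration. The component names and the `DEPENDS_ON` table are hypothetical and do not describe Azure or any real provider's stack, and `affected_by_patch` is a name made up for this sketch.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical stack: each component lists what it directly depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "customer_app": ["guest_os"],
    "guest_os": ["hypervisor"],
    "hypervisor": ["host_os"],
    "host_os": ["network_fabric"],
    "deployment_service": ["network_fabric"],
    "network_fabric": [],
}


def affected_by_patch(patched: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every component that directly or transitively depends on `patched`."""
    # Invert the edges so we can walk "up the stack" from the patched component.
    dependents: Dict[str, List[str]] = {}
    for component, dependencies in depends_on.items():
        for dependency in dependencies:
            dependents.setdefault(dependency, []).append(component)

    affected: Set[str] = set()
    queue = deque([patched])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


print(sorted(affected_by_patch("hypervisor", DEPENDS_ON)))
print(sorted(affected_by_patch("network_fabric", DEPENDS_ON)))
```

The catch, and the point of the original post, is that a Cloud customer usually cannot see the provider's equivalent of `DEPENDS_ON`, so this walk is exactly the assessment they are unable to perform.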



Bypassing the Hypervisor For Performance & Network “Simplicity” = Bypassing Security?

March 18th, 2009 4 comments

As part of his coverage of Cisco’s UCS, Alessandro Perilli highlighted this morning something I’ve spoken about many times since it was a one-slider at VMworld (latest, here) but that we’ve not had a lot of details about: the technology evolution of Cisco’s Nexus 1000v & VN-Link to the “Initiator:”

Chad Sakac, Vice President of VMware Technology Alliance at EMC, adds more details on his personal blog:

…[The Cisco] VN-Link can apply tags to ethernet frames –  and is something Cisco and VMware submitted together to the IEEE to be added to the ethernet standards.

It allows ethernet frames to be tagged with additional information (VN tags) that mean that the need for a vSwitch is eliminated.   The vSwitch is required by definition as you have all these virtual adapters with virtual MAC addresses, and they have to leave the vSphere host on one (or at most a much smaller number) of ports/MACs.   But, if you could somehow stretch that out to a physical switch, that would mean that the switch now has “awareness” of the VM’s attributes in network land – virtual adapters, ports and MAC addresses.   The physical world is adapting to and gaining awareness of the virtual world…
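As a rough mental model of what Chad describes, the sketch below tags each frame with an identifier for the virtual interface it came from, so that a physical switch can look up per-VM policy without a vSwitch in between. It is a conceptual toy under stated assumptions: the `vif_tag` field, the `tag_frame` helper and the `PhysicalSwitch` class are hypothetical names, and nothing here reflects the actual VN-Link tag format or any Cisco or VMware API.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class Frame:
    """A simplified ethernet frame; vif_tag is a hypothetical stand-in for a VN tag."""
    src_mac: str
    dst_mac: str
    vif_tag: Optional[int] = None


def tag_frame(frame: Frame, vif_id: int) -> Frame:
    """Return a copy of the frame marked with the virtual interface it came from."""
    return replace(frame, vif_tag=vif_id)


class PhysicalSwitch:
    """A toy switch that applies policy per virtual interface, not per physical port."""

    def __init__(self) -> None:
        self._policy_by_vif: Dict[int, str] = {}

    def set_policy(self, vif_id: int, policy: str) -> None:
        self._policy_by_vif[vif_id] = policy

    def policy_for(self, frame: Frame) -> Optional[str]:
        # Untagged frames carry no VM identity, so no per-VM policy can apply.
        if frame.vif_tag is None:
            return None
        return self._policy_by_vif.get(frame.vif_tag)


switch = PhysicalSwitch()
switch.set_policy(7, "web-tier-acl")  # hypothetical policy name

untagged = Frame(src_mac="00:50:56:00:00:01", dst_mac="00:50:56:00:00:02")
print(switch.policy_for(untagged))
print(switch.policy_for(tag_frame(untagged, 7)))
```

In this model the per-VM decision is made in the physical switch, which is why it matters where inspection hooks live once the virtual switching layer is removed.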


Bundle that with Scott Lowe’s interesting technical exploration of some additional elements of UCS as it relates to abstracting — or more specifically completely removing virtual networking from the hypervisor — and things start to get heated.  I’ve spoken about this in my Four Horsemen presentation:

Today, in the VMware space, virtual machines are connected to a vSwitch because connecting them directly to a physical adapter just isn’t practical. Yes, there is VMDirectPath, but for VMDirectPath to really work it needs more robust hardware support. Otherwise, you lose useful features like VMotion. (Refer back to my VMworld 2008 session notes from TA2644.) So, we have to manage physical switches and virtual switches—that’s two layers of management and two layers of switching. Along comes the Cisco Nexus 1000V. The 1000V helps to centralize management but we still have two layers of switching.

That’s where the “Palo” adapter comes in. Using VMDirectPath “Gen 2” (again, refer to my TA2644 notes) and the various hardware technologies I listed and described above, we now gain the ability to attach VMs directly to the network adapter and eliminate the virtual switching layer entirely. Now we’ve both centralized the management and eliminated an entire layer of switching. And no matter how optimized the code may be, the fact that the hypervisor doesn’t have to handle packets means it has more cycles to do other things. In other words, there’s less hypervisor overhead. I think we can all agree that’s a good thing.


So here’s what I am curious about. If we’re clawing back networking from the hosts and putting it back into the network, regardless of flow/VM affinity AND we’re bypassing the VMM (where the dvfilters/fastpath drivers live for VMsafe,) do we just lose all the introspection capabilities and the benefits of VMsafe that we’ve been waiting for?  Does this basically leave us with having to shunt all traffic back out to the physical switches (and thus physical appliances) in order to secure traffic?  Note, this doesn’t necessarily impact the other components of VMsafe (memory, CPU, disk, etc.) but the network portion, it would seem, is obviated.

Are we trading off security once again for performance and “efficiency?”  How much hypervisor overhead (as Scott alluded to) are we really talking about here for network I/O?

Anyone got any answers?  Is there a simple  answer to this or if I use this option, do I just give up what I’ve been waiting 2 years for in VMsafe/vNetworking?


Categories: Cisco, Virtualization, VMware Tags: