Rational Survivability

Incomplete Thought: On Horseshoes & Hand Grenades – Security In Enterprise Virt/Cloud Stacks

May 22nd, 2012 beaker 7 comments

It’s not really *that* incomplete of a thought, but I figure I’d get it down on vPaper anyway…be forewarned, it’s massively over-simplified.

Over the last five years or so, I’ve spent my time working with enterprises who are building and deploying large scale (relative to an Enterprise’s requirements, that is) virtualized data centers and private cloud environments.

For the purpose of this discussion, I am referring to VMware-based deployments given the audience and solutions I will reference.

To this day, I’m often shocked with regard to how many of these organizations that seek to provide contextualized security for intra- and inter-VM traffic seem to position an either-or decision with respect to the use of physical or virtual security solutions.

For the sake of example, I’ll reference the architectural designs which were taken verbatim from my 2008 presentation “The Four Horsemen of the Virtualization Security Apocalypse.“

If you’ve seen/read the FHOTVA, you will recollect that there are many tradeoffs involved when considering the use of virtual security appliances and their integration with physical solutions. Notably, an all-virtual or all-physical approach will constrain you in one form or another from the perspective of efficacy, agility, and the impact architecturally, operationally, or economically.

The topic that has a bunch of hair on it is where I see many enterprises trending: obviating virtual solutions and using physical appliances only:

…the bit that’s missing in the picture is the external physical firewall connected to that physical switch. People are still, in this day and age, ONLY relying on horseshoeing all traffic between VMs (in the same or different VLANs) out of the physical cluster machine and to an external firewall.

Now, there are many physical firewalls that allow for virtualized contexts, zoning, etc., but that’s really dependent upon dumping trunked VLAN ports from the firewall/switches into the server and then “extending” virtual network contexts, policies, etc. upstream in an attempt to flatten the physical/virtual networks in order to force traffic through a physical firewall hop — sometimes at layer 2, sometimes at layer 3.

It’s important to realize that physical firewalls DO offer benefits over the virtual appliances in terms of functionality, performance, and some capabilities that depend on hardware acceleration, etc. but from an overall architectural positioning, they’re not sufficient, especially given the visibility and access to virtual networks that the physical firewalls often do not have if segregated.

Here’s a hint, physical-only firewall solutions alone will never scale with the agility required to service the virtualized workloads that they are designed to protect. Further, a physical-only solution won’t satisfy the needs to dynamically provision and orchestrate security as close to the workload as possible, when the workloads move the policies will generally break, and it will most certainly add latency and ultimately hamper network designs (both physical and virtual.)

Virtual security solutions — especially those which integrate with the virtualization/cloud stack (in VMware’s case, vCenter & vCloud Director) — offer the ability to do the following:

…which is to say that there exists the capability to utilize virtual solutions for “east-west” traffic and physical solutions for “north-south” traffic, regardless of whether these VMs are in the same or different VLAN boundaries or even across distributed virtual switches which exist across hypervisors on different physical cluster members.

For east-west traffic (and even north-south models depending upon network architecture) there’s no requirement to horseshoe traffic physically.

It’s probably important to mention that while the next slide is out-of-date from the perspective of the advancement of VMsafe APIs, there’s not only the ability to inject a slow-path (user mode) virtual appliance between vSwitches, but also utilize a set of APIs to instantiate security policies at the hypervisor layer via a fast path kernel module/filter set…this means greater performance and the ability to scale better across physical clusters and distributed virtual switching:

Interestingly, there also exists the capability to actually integrate policies and zoning from physical firewalls and have them “flow through” to the virtual appliances to provide “micro-perimeterization” within the virtual environment, preserving policy and topology.

There are at least three choices for hypervisor management-integrated solutions on the market for these solutions today:

VMware vShield App,
Cisco VSG+Nexus 1000v and
Juniper vGW

Note that the solutions above can be thought of as “layer 2” solutions — it’s a poor way of describing them, but think “inter-VM” introspection for workloads in VLAN buckets. All three vendors above also have, or are bringing to market, complementary “layer 3” solutions that function as virtual “edge” devices and act as a multi-function “next-hop” gateway between groups of VMs/applications (nee vDC.) For the sake of brevity, I’m omitting those here (they are incredibly important, however.)

They (layer 2 solutions) are all reasonably mature and offer various performance, efficacy and feature set capabilities. There are also different methods for plumbing the solutions and steering traffic to them…and these have huge performance and scale implications.

It’s important to recognize that the lack of thinking about virtual solutions often seem to be based largely on ignorance of need and availability of solutions.

However, other reasons surface such as cost, operational concerns and compliance issues with security teams or assessors/auditors who don’t understand virtualized environments well enough.

From an engineering and architectural perspective, however, obviating them from design consideration is a disappointing concern.

Enterprises should consider a hybrid of the two models; virtual where you can, physical where you must.

If you’ve considered virtual solutions but chose not to deploy them, can you comment on why and share your thinking with us (even if it’s for the reasons above?)

/Hoff

Incomplete Thought: Will the Public Cloud Create a Generation Of Network Stupid?(rationalsurvivability.com)
Virtual Networking is more than VMs and VLAN duct tape(ioshints.info)

Overlays: Wasting Away Again In Abstractionville… (rationalsurvivability.com)
Virtualized security routers for cloud security (net-security.org)

Categories: Cloud Security, Virtualization, Virtualization Security Tags: Cloud Computing, Data center, Nicira, OpenStack, Security, Star Trek: Enterprise, Virtual machine, VMware

QuickQuip: Don’t run your own data center if you’re a public IaaS < Sorta...

January 10th, 2012 beaker 7 comments

Patrick Baillie, the CEO of Swiss IaaS provider, CloudSigma, wrote a very interesting blog published on GigaOm titled “Don’t run your own data center if you’re a public IaaS.”

Baillie leads off by describing how AWS’ recent outage is evidence as to why the complexity of running facilities (data centers) versus the services running atop them is best segregated and outsourced to third parties in the business of such things:

Why public IaaS cloud providers should outsource their data centers

While there are some advantages for cloud providers operating data centers in-house, including greater control, capacity, power and security, the challenges, such as geographic expansion, connectivity, location, cost and lower-tier facilities can often outweigh the benefits. In response to many of these challenges, an increasing number of cloud providers are realizing the benefits of working with a third-party data center provider.

It’s a very interesting blog, sprinkled throughout with pros and cons of rolling your own versus outsourcing but it falls down in being able to carry the burden in logic of some the assertions.

Perhaps I misunderstood, but the article seemed to focus on single DC availability as though (per my friend @CSOAndy’s excellent summarization) “…he missed the obvious reason: you can arbitrage across data centers” and “…was focused on single DC availability. Arbitrage means you just move your workloads automagically.”

I’ll let you read the setup in its entirety, but check out the conclusion:

In reality, taking a look at public cloud providers, those with legacy businesses in hosting, including Rackspace and GoGrid, tend to run their own facilities, whereas pure-play cloud providers, like my company CloudSigma, tend to let others run the data centers and host the infrastructure.

The business of operating a data center versus operating a cloud is very different, and it’s crucial for such providers to focus on their core competency. If a provider attempts to do both, there will be sacrifices and financial choices with regards to connectivity, capacity, supply, etc. By focusing on the cloud and not the data center, public cloud IaaS providers don’t need to make tradeoffs between investing in the data center over the cloud, thereby ensuring the cloud is continually operating at peak performance with the best resources available.

The points above were punctuated as part of a discussion on Twitter where @georgereese commented “IaaS is all about economies of scale. I don’t think you win at #cloud by borrowing someone else’s”

Fascinating. It’s times like these that I invoke the widsom of the Intertubes and ask “WWWD” (or What Would Werner Do?)

If we weren’t artificially limited in this discussion to IaaS only, it would have been interesting to compare this to SaaS providers like Google or Salesforce or better yet folks like Zynga…or even add supporting examples like Heroku (who run atop AWS but are now a part of SalesForce o_O)

I found many of the points raised in the discussion intriguing and good food for thought but I think that if we’re talking about IaaS — and we leave out AWS which directly contradicts the model proposed — the logic breaks almost instantly…unless we change the title to “Don’t run your own data center if you’re a [small] public IaaS and need to compete with AWS.”

Interested in your views…

/Hoff

Categories: Cloud Computing Tags: AWS, Cloud Computing, CloudSigma, Data center, Patrick Baillie

Incomplete Thought: Cloudbursting Your Bubble – I call Bullshit…

April 5th, 2011 beaker 6 comments

My wife is in the midst of an extended multi-phasic, multi-day delivery process of our fourth child. In between bouts of her moaning, breathing and ultimately sleeping, I’m left to taunt people on Twitter and think about Cloud.

Reviewing my hot-button list of terms that are annoying me presently, I hit upon a favorite: Cloudbursting.

It occurred to me that this term brings up a visceral experience that makes me want to punch kittens. It’s used by people to describe a use case in which workloads that run first and foremost within the walled gardens of an enterprise, magically burst forth into public cloud based upon a lack of capacity internally and a plethora of available capacity externally.

I call bullshit.

Now, allow me to qualify that statement.

Ben Kepes suggests that cloud bursting makes sense to an enterprise “Because you’ve spent a gazillion dollars on on-prem h/w that you want to continue using. BUT your workloads are spiky…” such that an enterprise would be focused on “…maximizing returns from on-prem. But sending excess capacity to the clouds.” This implies the problem you’re trying to solve is one of scale.

I just don’t buy this.

Either you build a private cloud that gives you the scale you need in the first place in which you pattern your operational models after public cloud and/or design a solid plan to migrate, interconnect or extend platforms to the public [commodity] cloud using this model, therefore not bursting but completely migrating capacity, but you don’t stop somewhere in the middle with the same old crap internally and a bright, shiny public cloud you “burst things to” when you get your capacity knickers in a twist:

The investment and skillsets needed to rectify two often diametrically-opposed operational models doesn’t maximize returns, it bifurcates and diminishes efficiencies and blurs cost allocation models making both internal IT and public cloud look grotesquely inaccurate.

Christian Reilly suggested I had no legs to stand on making these arguments:

Fair enough, but…

Short of workloads such as HPC in which scale really is a major concern, if a large enterprise has gone through all of the issues relevant to running tier-1 applications in a public cloud, why on earth would you “burst” to the public cloud versus execute on a strategy that has those workloads run there in the first place.

Christian came up with another ringer during this exchange, one that I wholeheartedly agree with:

Ultimately, the reason I agree so strongly with this is because of the architectural, operational and compliance complexity associated with all the mechanics one needs to allow for interoperable, scaleable, secure and manageable workloads between an internal enterprise’s operational domain (cloud or otherwise) and the public cloud.

The (in)ability to replicate capabilities exactly across these two models means that gaps arise — gaps that unfairly amplify the immaturity of cloud for certain things and it’s stellar capabilities in others. It’s no wonder people get confused. Things like security, networking, application intelligence…

NOTE: I make a wholesale differentiaton between a strategy that includes a structured hybrid cloud approach of controlled workload placement/execution versus a purely overflow/capacity movement of workloads.*

There are many workloads that simply won’t or can’t *natively* “cloudburst” to public cloud due to a lack of supporting packaging and infrastructure.** Some of them are pretty important. Some of them are operationally mission critical. What then? Without an appropriate way of understanding the implications and complexity associated with this issue and getting there from here, we’re left with a strategy of “…leave those tier-1 apps to die on the vine while we greenfield migrate new apps to public cloud.” That doesn’t sound particularly sexy, useful, efficient or cost-effective.

There are overlay solutions that can allow an enterprise to leverage utility computing services as an abstracted delivery platform and fluidly interconnect an enterprise with a public cloud, but one must understand what’s involved architecturally as part of that hybrid model, what the benefits are and where the penalties lay. Public cloud needs the same rigor in its due diligence.

[update] My colleague James Urquhart summarized well what I meant by describing the difference in DC-DC (cloud or otherwise) workload execution as what I see as either end of a spectrum: VM-centric package mobility or adopting a truly distributed application architecture. If you’re somewhere in the middle, things like cloudbursting get really hairy. As we move from IaaS -> PaaS, some of these issues may evaporate as the former (VM’s) becomes less relevant and the latter (Applications deployed directly to platforms) more prevalent.

Check out this zinger from JP Morgenthal which much better conveys what I meant:

If your Tier-1 workloads can run in a public cloud and satisfy all your requirements, THAT’S where they should run in the first place! You maximize your investment internally by scaling down and ruthlessly squeezing efficiency out of what you have as quickly as possible — writing those investments off the books.

That’s the point, innit?

Cloud bursting — today — is simply a marketing term.

Thoughts?

/Hoff

* This may be the point that requires more clarity, especially in the wake of examples that were raised on Twitter after I posted this such as using eBay and Netflix as examples of successful “cloudbursting” applications. My response is that these fine companies hardly resemble a typical enterprise but that they’re also investing in a model that fundamentally changes they way they operate.

** I should point out that I am referring to the use case of heterogeneous cloud platforms such as VMware to AWS (either using an import/conversion function and/or via VPC) versus a more homogeneous platform interlock such as when the enterprise runs vSphere internally and looks to migrate VMs over to a VMware vCloud-powered cloud provider using something like vCloud Director Connector, for example. Either way, the point still stands, if you can run a workload and satisfy your requirements outright on someone else’s stack, why do it on yours?

Incomplete Thought: Cloud Capacity Clearinghouses & Marketplaces – A Security/Compliance/Privacy Minefield? (rationalsurvivability.com)
The Year of the Hybrid Cloud (blogs.cisco.com)
Take the Fast Track to Private Clouds (pcworld.com)
New Innovations in Data Center and Virtualization Promise More Agility (blogs.cisco.com)
Dedicated AWS VPC Compute Instances – Strategically Defensive or Offensive? (rationalsurvivability.com)
Incomplete Thought: Why Security Doesn’t Scale…Yet. (rationalsurvivability.com)
AWS’ New Networking Capabilities – Sucking Less 😉 (rationalsurvivability.com)

Categories: Cloud Computing Tags: Application Service Providers, Ben Kepes, Cloud Computing, Data center, EBay, High-performance computing, James Urquhart, Twitter, VMware

Archive

Incomplete Thought: On Horseshoes & Hand Grenades – Security In Enterprise Virt/Cloud Stacks

QuickQuip: Don’t run your own data center if you’re a public IaaS < Sorta...

Incomplete Thought: Cloudbursting Your Bubble – I call Bullshit…

Recent Posts

Recent Comments

Categories

Archives

Rational Survivability

Archive

Incomplete Thought: On Horseshoes & Hand Grenades – Security In Enterprise Virt/Cloud Stacks

Related articles

QuickQuip: Don’t run your own data center if you’re a public IaaS < Sorta...

Incomplete Thought: Cloudbursting Your Bubble – I call Bullshit…

Related articles

Recent Posts

Recent Comments

Tag Cloud

Categories

Archives