Search Results

Keyword: ‘patching the cloud’

All For One, One For All? On Standardizing Virtual Appliance Operating Systems

June 11th, 2010 beaker 6 comments
SuSE logo
Image via Wikipedia

Hot on the tail of the announcement that VMware and Novell are entering into a deeper “strategic partnership” in order to deliver and support SUSE Linux Enterprise Server (SLES) for VMware vSphere environments, was an interesting blog post from Stu (@vinternals) titled “Enter the Appliance.

Now, before we get to Stu’s post, let’s look at the language from the press release (the emphasis is mine):

VMware and Novell today announced an expansion to their strategic partnership with an original equipment manufacturer (OEM) agreement through which VMware will distribute and support the SUSE® Linux Enterprise Server operating system. Under the agreement, VMware also intends to standardize its virtual appliance-based product offerings on SUSE Linux Enterprise Server.

Customers who want to deploy SUSE Linux Enterprise Server for VMware® in VMware vSphere™ virtual machines will be entitled to receive a subscription to SUSE Linux Enterprise Server that includes patches and updates as part of their newly purchased qualifying VMware vSphere license and Support and Subscription. Under this agreement, VMware and its extensive network of solution provider partners will also be able to offer customers the option to purchase technical support for SUSE Linux Enterprise Server delivered directly by VMware for a seamless support experience. This expanded relationship between VMware and Novell benefits customers by reducing the cost and complexity of deploying and maintaining an enterprise operating system with VMware solutions.

As a result of this expanded collaboration, both companies intend to provide customers the ability to port their SUSE Linux-based workloads across clouds.  Such portability will deliver choice and flexibility for VMware vSphere customers and is a significant step forward in delivering the benefits of seamless cloud computing.

Several VMware products are already distributed and deployed as virtual appliances. A virtual appliance is a pre-configured virtual machine that packages an operating system and application into a self-contained unit that is easy to deploy, manage and maintain. Standardizing virtual appliance-based VMware products on SUSE Linux Enterprise Server for VMware® will further simplify the deployment and ongoing management of these solutions, shortening the path to ROI.

What I read here is that VMware virtual appliances — those VMware products packaged as virtual appliances distributed by VMware — will utilized SLES as the underlying operating system of choice. I don’t see language or the inference that other virtual appliance ISVs will be required to do so

To that point, Stu’s blog post said:

VMware will be adopting SUSE Linux Enterprise Server, SLES, as the single platform for their virtual appliances.

I’ve ranted in the past about the problem with virtual appliances. Everything from the lack of a standard Linux platform even within a single vendor (let alone amongst multiple vendors), to the additional overhead such a model of software distribution would place upon software vendors, to the security needs of the Enterprise around patch response times etc. And today, every single one of those arguments has been nullified in one fell swoop. Hallelujah, someone was listening after all!

So far, so good. Seems pretty much in-line with what VMware said.

Here’s the interesting assertion Stu makes that inspired my commentary:

If you’re a software vendor looking to adopt the virtual appliance model to distribute your wares then I have some advice for you – if you’re not using SLES for the base of your appliance, start doing so. Now. This partnership will mean doors that were previously closed to virtual appliances will now be opened, but not to any old virtual appliance – it will need to be built on an Enterprise grade distro. And SLES is most certainly that.

Chris Wolf, Stu and I had a bit of banter on Twitter regarding this announcement wherein I suggested there’s a blurring of the lines and a conflation of messaging as well as a very unique perspective that’s not being discussed.

Specifically, I don’t see where it was implied that ISV’s would be forced to adopt SLES as their OS of choice for virtual appliances.  I’m not suggesting it’s not compelling to do so for the support and distribution reasons stated above, but I suggest that the notion that “…doors that were previously closed to virtual appliances” from the perspective of support and uniformity of disto will also have and equal and opposite effect caused by a longer development lifecycle for many vendors.

Especially networking and security ISVs looking to move their products into a virtual appliance offering.

I summed up many of the issues associated with virtual security and networking appliances in my Four Horsemen of the Virtualization Security Apocalypse presentation, and given how the definition and capabilities of “the network” are (d)evolving (depending upon how you view abstraction) you might also find Where Are the Network Virtual Appliances? Hobbled By the Virtual Network, That’s Where… an interesting read:

What does this mean?  It means that ultimately to ensure their own survival, virtualization and cloud providers will depend less upon virtual appliances and add more of the basic connectivity AND security capabilities into the VMMs themselves as its the only way to guarantee performance, scalability, resilience and satisfy the security requirements of customers. There will be new generations of protocols, APIs and control planes that will emerge to provide for this capability, but this will drive the same old integration battles we’re supposed to be absolved from with virtualization and Cloud.

Tell me that’s not what’s happening *right* now.

Unlike most user-facing or service-delivery applications that are not topology sensitive (that is, they simply expect to be able to speak to “the network” without knowing anything about it,) network and security ISVs do very interesting things with drivers and kernel-space code in order to deal with topology, where they sit in the stack, and how they improve performance and stability that are extremely dependent upon direct access to hardware or at the very least, customer drivers or extended/hacked kernels.

One of the reasons you see a slow trickle of network and security virtual appliances is because of these bespoke OS builds and what virtualization has done to how these services are delivered, scaled and deal with resilience.  We’ve already seen the challenge of ISVs having to re-write code to fit the VMsafe fast/slow-path driver model.

You can imagine the consternation involved if what Stu alluded to is actually required — that you must build your virtual appliances on a specific OS.  It’s going to slow down innovation and delivery of solutions if the ISV does not (for any number of valid reasons) use SLES.  This is also one of the downsides of a JEOS approach.

Stu’s warnings about compliance to SLES development notwithstanding, this puts ISVs in a delicate position — one they’ve faced before but is now exacerbated by virtualization and Cloud.  Security vendors generally minimize and harden OS stacks to fit their “application” and then tune the environment accordingly.  We’re already introducing new monocultures and uniformity in attack surfaces with hypervisors.  Are we going to do the same with the operating systems that power the virtual appliances/virtual machines that run atop them — especially those designed to protect these very systems?

Diversity is a good thing — at least when it comes to your networking and security infrastructure.  While I happen to work for a networking vendor, we all recognize that uniformity brings huge benefits as well as the potential for nasty concerns.  If you want an example, check out how a simple software error affected tens of millions of users of WordPress (WordPress and the dark side of multitenancy.) While we’re talking about a different layer in the stack, the issue is the same.

I totally grok the standardization argument for the cost control, support and manageability reasons Stu stated but I am also fearful of the extreme levels of lock-in and monoculture this approach can take.

/Hoff

Enhanced by Zemanta
  • Share/Bookmark

Incomplete Thought: The DevOps Disconnect

May 31st, 2010 beaker 16 comments

DevOps — what it means and how it applies — is a fascinating topic that inspires all sorts of interesting reactions from people, polarized by their interpretation of what this term really means.

At CloudCamp Denver, adjacent to Gluecon, Aaron Pederson of OpsCode gave a lightning talk titled: ”Operations as Code.”  I’ve seen this presentation on-line before, but listened intently as Aaron presented.  You can see John Willis’ version on Slideshare here.  Adrian Cole (@adrianfcole) of jClouds fame (and now Opscode) and I had an awesome hour-long discussion afterwards that was the genesis for this post.

“Operations as Code” (I’ve seen it described also as “Infrastructure as Code”) is really a fantastically sexy and intriguing phrase.  When boiled down, what I extract is that the DevOps “movement” is less about developers becoming operators, but rather the notion that developers can be part of the process whereby they help enable operations/operators to repeatably and with discipline, automate processes that are otherwise manual and prone to error.

[Ed: great feedback from Andrew Shafer: "DevOps isn't so much about developers helping operations, it's about operational concerns becoming more and more programmable, and operators becoming more and more comfortable and capable with that.  Further, John Allspaw (@allspaw) added some great commentary below - talking about DevOps really being about tools + culture + communication. Adam Jacobs from Opscode *really* banged out a great set of observations in the comments also. All good perspective.]

Automate, automate, automate.

While I find the message of DevOps totally agreeable, it’s the messaging that causes me concern, not because of the groups it includes, but those that it leaves out.  I find that the most provocative elements of the DevOps “manifesto” (sorry) are almost religious in nature.  That’s to be expected as most great ideas are.

In many presentations promoting DevOps, developers are shown to have evolved in practice and methodology, but operators (of all kinds) are described as being stuck in the dark ages. DevOps evangelists paint a picture that compares and contrasts the Agile-based, reusable componentized, source-controlled, team-based and integrated approach of “evolved” development environments with that of loosely-scripted, poorly-automated, inefficient, individually-contributed, undisciplined, non-source-controlled operations.

You can see how this might be just a tad off-putting to some people.

In Aaron’s presentation, the most interesting concept to me is the definition of “infrastructure.” Take the example to the right, wherein various “infrastructure” roles are described.  What should be evident is that to many — especially those in enterprise (virtualized or otherwise) or non-Cloud environments — is that these software-only components represent only a fraction of what makes up “infrastructure.”

The loadbalancer role, as an example makes total sense if you’re using HAproxy or Zeus ZXTM. What happens if it’s an F5 or Cisco appliance?

What about the routers, switches, firewalls, IDS/IPS, WAFs, SSL engines, storage, XML parsers, etc. that make up the underpinning of the typical datacenter?  The majority of these elements — as most of them exist today — do not present consistent interfaces for automation/integration. Most of them utilize proprietary/closed API’s for management that makes automation cumbersome if not impossible across a large environment.

Many will react to that statement by suggesting that this is why Cloud Computing is the great equalizer — that by abstracting the “complexity” of these components into a more “simplified” set of software resources versus hardware, it solves this problem and without the hardware-centric focus of infrastructure and the operations mess that revolves around it today, we’re free to focus on “building the business versus running the business.”

I’d agree.  The problem is that these are two different worlds.  The mass-market IaaS/PaaS providers who provide abstracted representations of infrastructure are still the corner-cases when compared to the majority of service providers who are entering the Cloud space specifically focused on serving the enterprise, and the enterprise — even those that are heavily virtualized — still very dependent upon hardware.

This is where the DevOps messaging miss comes — at least as it’s described today. DevOps is really targeted (today) toward the software-homogeneity of public, mass-market Cloud environments (such as Amazon Web Services) where infrastructure can be defined as abstract component, software-only roles, not the complex mish-mash of hardware-focused IT of the enterprise as it currently stands. This may be plainly obvious to some, but the messaging of DevOps is obscuring the message which is unfortunate.

DevOps is promoted today as a target operational end-state without explicitly defining that the requirements for success really do depend upon the level of abstraction in the environment; it’s really focused on public Cloud Computing.  In and of itself, that’s not a bad thing at all, but it’s a “marketing” miss when it comes to engaging with a huge audience who wants and needs to get the DevOps religion.

You can preach to the choir all day long, but that’s not going to move the needle.

My biggest gripe with the DevOps messaging is with the name itself. If you expect to truly automate “infrastructure as code,” we really should call it NetSecDevOps. Leaving the network and security teams — and the infrastructure they represent — out of the loop until they are either subsumed by software (won’t happen) or get religion (probable but a long-haul exercise) is counter-productive.

Take security, for example. By design, 95% of security technology/solutions are — by design — not easily automatable or are built to require human interaction given their mission and lack of intelligence/correlation with other tools.  How do you automate around that?  It’s really here that the statement I’ve made that “security doesn’t scale” is apropos. Believe me, I’m not making excuses for the security industry, nor am I suggesting this is how it ought to be, but it is how it currently exists.

Of course we’re seeing the next generation of datacenters in the enterprise become more abstract. With virtualization and cloud-like capabilities being delivered with automated provisioning, orchestration and governance by design for BOTH hardware and software and the vision of private/public cloud integration baked into enterprise architecture, we’re actually on a path where DevOps — at its core — makes total sense.

I only wish that (NetSec)DevOps evangelists — and companies such as Opscode — would  address this divide up-front and start to reach out to the enterprise world to help make DevOps a goal that these teams aspire to rather than something to rub their noses in.  Further, we need a way for the community to contribute things like Chef recipes that allow for flow-through role definition support for hardware-based solutions that do have exposed management interfaces (Ed: Adrian referred to these in a tweet as ‘device’ recipes)

/Hoff

Related articles by Zemanta

Reblog this post [with Zemanta]
  • Share/Bookmark

Patching the (Hypervisor) Platform: How Do You Manage Risk?

April 12th, 2010 beaker 7 comments

Hi. Me again.

In 2008 I wrote a blog titled “Patching the Cloud” which I followed up with material examples in 2009 in another titled “Redux: Patching the Cloud.

These blogs focused mainly on virtualization-powered IaaS/PaaS offerings and whilst they targeted “Cloud Computing,” they applied equally to the heavily virtualized enterprise.  To this point I wrote another in 2008 titled “On Patch Tuesdays For Virtualization Platforms.

The operational impacts of managing change control, vulnerability management and threat mitigation have always intrigued me, especially at scale.

I was reminded this morning of the importance of the question posed above as VMware released a series of security advisories detailing ten vulnerabilities across many products, some of which are remotely exploitable. While security vulnerabilities in hypervisors are not new, it’s unclear to me how many heavily-virtualized enterprises or Cloud providers actually deal with what it means to patch this critical layer of infrastructure.

Once virtualized, we expect/assume that VM’s and the guest OS’s within them should operate with functional equivalence when compared to non-virtualized instances. We have, however, seen that this is not the case. It’s rare, but it happens that OS’s and applications, once virtualized, suffer from issues that cause faults to the underlying virtualization platform itself.

So here’s the $64,000 question – feel free to answer anonymously:

While virtualization is meant to effectively isolate the hardware from the resources atop it, the VMM/Hypervisor itself maintains a delicate position arbitrating this abstraction.  When the VMM/Hypervisor needs patching, how do you regression test the impact across all your VM images (across test/dev, production, etc.)?  More importantly, how are you assessing/measuring compound risk across shared/multi-tenant environments with respect to patching and its impact?

/Hoff

P.S. It occurs to me that after I wrote the blog last night on ‘high assurance (read: TPM-enabled)’ virtualization/cloud environments with respect to change control, the reference images for trust launch environments would be impacted by patches like this. How are we going to scale this from a management perspective?

Reblog this post [with Zemanta]
  • Share/Bookmark

To Achieve True Cloud (X/Z)en, One Must Leverage Introspection

January 6th, 2010 beaker No comments

Back in October 2008, I wrote a post detailing efforts around the Xen community to create a standard security introspection API (Xen.Org Launches Community Project To Bring VM Introspection to Xen :)

The Xen Introspection Project is a community effort within Xen.org to leverage the existing research presented above with other work not yet public to create a standard API specification and methodology for virtual machine introspection.

That blog was focused on introspection for virtualization proper but since many of the larger cloud providers utilize Xen virtualization as an underpinning of their service architecture and as an industry we’re suffering from a lack of visibility and deployable security capabilities, the relevance of VM and VMM introspection to cloud computing is quite relevant.

I thought I’d double around and see where we are.

It looks as though there’s been quite a bit of recent activity from the folks at Georgia Tech (XenAccess Project) and the University of Alaska at Fairbanks (Virtual Introspection for Xen) referenced in my previous blog.  The vCloud API proffered via the DMTF seems to also leverage (at least some of) the VMsafe API capabilities present in VMware‘s vSphere virtualization platform.

While details are, for obvious reasons sketchy, I am encouraged in speaking to representatives from a few cloud providers who are keenly interested in including these capabilities in their offerings.  Wouldn’t that be cool?

Adoption and inclusion of introspection capabilities will overcome some of the inherent security and visibility limitations we face in highly-virtualized multi-tenant environments due to networking constraints for integrating security functionality that I wrote about here.

I plan a follow-on blog in more detail once I finish some interviews.

/Hoff

Reblog this post [with Zemanta]
  • Share/Bookmark

Redux: Patching the Cloud

September 23rd, 2009 beaker 3 comments

Back in 2008 I wrote a piece titled “Patching the Cloud” in which I highlighted the issues associated with the black box ubiquity of Cloud and what that means to patching/upgrading processes:

Your application is sitting atop an operating system and underlying infrastructure that is managed by the cloud operator.  This “datacenter OS” may not be virtualized or could actually be sitting atop a hypervisor which is integrated into the operating system (Xen, Hyper-V, KVM) or perhaps reliant upon a third party solution such as VMware.  The notion of cloud implies shared infrastructure and hosting platforms, although it does not imply virtualization.

A patch affecting any one of the infrastructure elements could cause a ripple effect on your hosted applications.  Without understanding the underlying infrastructure dependencies in this model, how does one assess risk and determine what any patch might do up or down the stack?  How does an enterprise that has no insight into the “black box” model of the cloud operator, setup a dev/test/staging environment that acceptably mimics the operating environment?

What happens when the underlying CloudOS gets patched (or needs to be) and blows your applications/VMs sky-high (in the PaaS/IaaS models?)

How does one negotiate the process for determining when and how a patch is deployed?  Where does the cloud operator draw the line?   If the cloud fabric is democratized across constituent enterprise customers, however isolated, how does a cloud provider ensure consistent distributed service?  If an application can be dynamically provisioned anywhere in the fabric, consistency of the platform is critical.

I followed this up with a practical example when Microsoft’s Azure services experienced a hiccup due to this very thing.  We see wholesale changes that can be instantiated on a whim by Cloud providers that could alter service functionality and service availability such as this one from Google (Published Google Documents to appear in Google search) — have you thought this through?

So now as we witness ISP’s starting to build Cloud service offerings from common Cloud OS platforms and espouse the portability of workloads (*ahem* VM’s) from “internal” Clouds to Cloud Providers — and potentially multiple Cloud providers — what happens when the enterprise is at v3.1 of Cloud OS, ISP A is at version 2.1a and ISP B is at v2.9? Portability is a cruel mistress.

Pair that little nugget with the fact that even “global” Cloud providers such as Amazon Web Services have not maintained parity in terms of functionality/services across their regions*. The US has long had features/functions that the european region has not.  Today, in fact, AWS announced bringing infrastructure capabilities to parity for things like elastic load balancing and auto-scale…

It’s important to understand what happens when we squeeze the balloon.

/Hoff

*corrected – I originally said “availability zones” which was in error as pointed out by Shlomo in the comments. Thanks!

  • Share/Bookmark

The Six Worst Cloud Security Mistakes? I Can Do You One Better…

June 6th, 2009 beaker 2 comments

I recently read a story from Kelly Jackson Higgins of Dark Reading outlining what are described as the “Six Worst Cloud Security Mistakes:

  1. Assuming the cloud is less secure than your data
  2. Not verifying, testing, or auditing the security of your cloud-based service provider.
  3. Failing to vet your cloud provider’s viability as a business.
  4. Assuming you’re no longer responsible for securing data once it’s in the cloud.
  5. Putting insecure apps in the cloud and expecting that to make them more secure.
  6. Having no clue that your business units are already using some cloud-based services.

A very interesting list, for sure, and a reasonable set of potential “mistakes” to ponder, but I’m really having trouble with one in particular.

The one that’s getting my goose honking is #1: Assuming the cloud is less secure than your data.

Really? I maintain that this generalization about Cloud being more or less secure (in regards to one’s own capabilities) is a silly thing to argue; let’s see why.

We start off with what I think is a strange bit of contradiction:

It’s only natural for security pros to be control freaks. Being charged with securing a company’s data and intellectual property requires a healthy dose of paranoia and protectionism. But sometimes that leads to false impressions about cloud security. “One common mistake is that as soon as you talk about the cloud, [organizations] assume it’s less secure than their own IT security operation,” says Chenxi Wang, principal analyst at Forrester Research. “More control does not necessarily lead to more security.”

Assuming that one of the reasons a company might consider outsourcing their IT security operations to a third party [Cloud] provider IS the fact that they have more control or at least equal to what a company can provide themselves, it occurs to me this sort of statement can be interpreted many ways.  Here’s one, for example.

I find myself confused by the highlighted sentence regarding control and security within the context of what is written.  In fact, if you read the next paragraph, it seems to imply that the because a Cloud provider has more control they can offer better security:

In fact, with services such as Google’s SaaS, data loss is less likely because the information is accessible from anywhere and anytime without saving it to an easily lost or stolen USB stick or CD, according to Eran Feigenbaum, director of security for Google Apps. And Google’s security-patching process is more streamlined than a typical enterprise because its server architecture is homogeneous, he says. “Many attacks [come from a] lack of patch management and server misconfiguration…For Google, when the time comes to patch, we can do so across the entire platform in a uniform fashion,” he said.

I’ll say it again: SaaS is a convenient way of dumbing down “Cloud Computing” to a singular instance/application/service but it completely obviates Platform and Infrastructure as a Service offerings, which are wildly different animals, especially from a security perspective.  Please see my latest commentary about this in my response to Bruce Schneier’s equation of SaaS with Cloud Computing to the exclusion of PaaS/IaaS.

I’ve made the point before that comparing managing/patching a single application and its supporting infrastructure in a SaaS offering to an enterprise that would otherwise have to support not only that service but potentially hundreds more is a completely unfair comparison.  If you want to compare apples to apples, I’d maintain that any organization with a mature security program whose only charter was to support (securely) a single application could do it just as well as a SaaS provider, all other things being equal.

The differences here become scale and multi-tenancy in the case of the Cloud provider, I think these issues actually make a Cloud environment more difficult to secure.

Also, suggesting with the Google example that “data loss is less likely” because it’s “accessible from anywhere” and doesn’t involve “…lost or stolen USB stick(s) or CD(s)” seems an awfully arbitrary one given the fact that one of the most interesting data loss/leakage incidents in recent Cloud history came from Google’s Docs offering due to an operator (Google) system misconfiguration.  USB sticks and CDs are also a very narrow definition of data loss/leakage.

Then there’s the more global view SaaS and other cloud providers have, Feigenbaum says. “As an enterprise, you only see a small slice of what’s affecting you [threat-wise],” Feigenbaum said during a panel on cloud security at the RSA Conference in April. “A cloud provider can have the economy of scale for a holistic vision…the cloud shifts security and also makes it better,” he said.

I don’t have anything to argue about here; a wider perspective and better visibility is a good thing.  Again, however, this depends upon the type of service, what is being monitored and protected, on behalf of whom and from whom.

But that doesn’t mean you should blindly trust your cloud provider, though the larger ones do tend to have a better handle on threats due to their size, Forrester’s Wang says. “These people deal with security issues at more complex levels than your own IT team sees on a daily basis,” Wang says. “It’s a misconception to say cloud security is definitely less capable or more problematic.”

No, you shouldn’t blindly trust your providers but that last statement suggests we should similary trust that providers do a better job and deal with security issues at more complex levels?  What does that even mean? Please do NOT tell me that a SAS70 Type II is your answer.  Just as “It’s a misconception to say cloud security is definitely less capable or more problematic,” I can just as easily suggest the converse is true without evidence.

I would like to see the empirical data that backs that set of statements up and the common metrics I can use to measure across providers and enterprises alike.  Thought so.

Thus far, security has been one of the main hurdles to adoption of cloud-based services, says Michelle Dennedy, chief governance officer for cloud computing at Sun Microsystems. “Trust in the cloud, more than technical abilities, has been hindering adoption,” Dennedy says. “But the cloud can be more secure than a private environment in many cases.”

Michelle is definitely correct; trust represents a fundamental issue with Cloud adoption, and it rolls both ways.  Asking us to “trust but verify” when what we’re being asked to verify can’t easily be trusted poses a very difficult scenario indeed.

By the way, I think the worst Cloud Security mistake is not knowing what Cloud Security even means.

/Hoff

  • Share/Bookmark

Cloud Security Will NOT Supplant Patching…Qualys Has Its Head Up Its SaaS

May 4th, 2009 beaker 4 comments

“Cloud Security Will  Supplant Patching…”

What a sexy-sounding claim in this Network World piece which is titled with the opposite suggestion from the title of my blog post.  We will still need patching.  I agree, however, that how it’s delivered needs to change.

Before we get to the issues I have, I do want to point out that the article — despite it’s title –  is focused on the newest release of Qualys’ Laws of Vulnerability 2.0 report (pdf,) which is the latest version of the Half Lives of Vulnerability study that my friend Gerhardt Eschelbeck started some years ago.

In the report, the new author, Qualys’ current CTO Wolfgang Kandek, delivers a really disappointing statistic:

In five years, the average time taken by companies to patch vulnerabilities had decreased by only one day, from 60 days to 59 days, at a time when the number of flaws and the speed at which they are being exploited has accelerated from weeks to, in some cases, days. During the same period, the number of IP scanned on an anonymous basis by the company from its customer base had increased from 3 million to a statistically significant 80 million, with the number of vulnerabilities uncovered rocketing from 3 million to 680 million. Of the latter, 72 million were rated by Qualys as being of ‘critical’ severity.

That lack of progress is sobering, right? So far I’m intrigued, but then that article goes off the reservation by quoting Wolfgang as saying:

Taken together, the statistics suggested that a new solution would be needed in order to make further improvement with the only likely candidate on the horizon being cloud computing. “We believe that cloud security providers can be held to a higher standard in terms of security,” said Kandek. “Cloud vendors can come in and do a much better job.”  Unlike corporate admins for whom patching was a sometimes complex burden, in a cloud environment, patching applications would be more technically predictable – the small risk of ‘breaking’ an application after patching it would be nearly removed, he said.

Qualys has its head up its SaaS.  I mean that in the most polite of ways… ;)

Let me make a couple of important observations on the heels of those I’ve already made and an excellent one Lori MacVittie made today in here post titled “The Real Meaning Of Cloud Security Revealed:

  1. I’d like a better definition of the context of “patching applications.”  I don’t know whether Kandek mean applications in an enterprise or those hosted by a Cloud Provider or both?
  2. There’s a difference between providing security services via the Cloud versus securing Cloud and its application/data.  The quotes above mix the issues.  A “Cloud Security” provider like Qualys can absolutely provide excellent solutions to many of the problems we have today associated with point product deployments of security functions across the enterprise. Anti-spam and vulnerability management are excellent examples.  What that does not mean is that the applications that run in an enterprise can be delivered and deployed more “securely” thanks to the efforts of the same providers.
  3. To that point, the Cloud is not all SaaS-based.  Not every application is going to be or can be moved to a SaaS.  Patching legacy applications (or hosting them for that matter) can be extremely difficult.  Virtualization certainly comes into play here, but by definition, that’s an IaaS/PaaS opportunity, not a SaaS one.
  4. While SaaS providers who do “own the entire stack” are in a better position through consolidated multi-tenancy to transfer the responsibility of patching “their” infrastructure and application(s) on your behalf, it doesn’t really mean they do it any better on an application-by-application basis.  If a SaaS provider only has 1-2 apps to manage (with lots of customers) versus an enterprise with hundreds (and lost of customers,) the “quality” measurements as it relates to management of defect (from any perspective) would likely look better were you the competent SaaS vendor mentioned in this article.  You can see my point here.
  5. If you add in PaaS and IaaS as opposed to simply SaaS (as managed by a third party.) then the statement that “…patching applications would be more technically predictable – the small risk of ‘breaking’ an application after patching it would be nearly removed” is false.

It’s really, really important to compare apples to apples here. Qualys is a fantastic company with a visionary leader in Phillipe Courtot.  I was an early adopter of his SaaS service.  I was on his Customer Advisory Board.  However, as I pointed out to him at the Jericho event where I was a panelist, delivering a security function via the Cloud is not the same thing as securing it and SaaS is merely one piece of the puzzle.

I wrote a couple of other blogs about this topic:

/Hoff

  • Share/Bookmark

The Cloud Is a Fickle Mistress: DDoS&M…

April 2nd, 2009 beaker 6 comments

It’s interesting to see how people react when they are reminded that the “Cloud” still depends upon much of the same infrastructure and underlying protocols that we have been using for years.

BGP, DNS, VPNs, routers, swtiches, firewalls…

While it’s fun to talk about new attack vectors and sexy exploits, it’s the oldies and goodies that will come back to haunt us:

Simplexity

Building more and more of our business’ ability to remain an on-going concern on infrastructure that was never designed to support it is a scary proposition.  We’re certainly being afforded more opportunity to fix some of these problems as the technology improves, but it’s a patching solution to an endemic problem, I’m afraid.  We’ve got two ways to look at Cloud:

  • Skipping over the problems we have and “fixing” crappy infrastructure and applications by simply adding mobility and orchestration to move around an issue, or
  • Actually starting to use Cloud as a forcing function to fundamentally change the way we think about, architect, deploy and manage our computing capabilities in a more resilient, reliable and secure fashion

If I were a betting man…

Remember that just because it’s in the “Cloud” doesn’t mean someone’s sprinkled magic invincibility dust on your AppStack…

That web service still has IP addresses, open sockets. It still gets transported over MANY levels of shared infrastructure, from the telcos to the DNS infrastructure…you’re always at someone elses’ mercy.

Dan Kaminsky has done a fabulous job reminding us of that.

A more poignant reminder of our dependency on the Same Old Stuff™ is the recent DDoS attacks against Cloud provider Go-Grid:

ONGOING DDoS ATTACK

Our network is currently the target of a large, distributed DDoS attack that began on Monday afternoon.   We took action all day yesterday to mitigate the impact of the attack, and its targets, so that we could restore service to GoGrid customers.  Things were stabilized by 4 PM PDT and most customer servers were back online, although some of you continued to experience intermittent loss in network connectivity.

This is an unfortunate thing.  It’s also a good illustration of the sorts of things you ought to ask your Cloud service providers about.  With whom do they peer? What is their bandwidth? How many datacenters do they have and where? What DoS/DDoS countermeasures do you have in place? Have they actually dealt with this before?  Do they drill disaster scenarios like this?

We’re told we shouldn’t have to worry about the underlying infrastructure with Cloud, that it’s abstracted and someone else’s problem to manage…until it’s not.

This is where engineering, architecture and security meet the road.  Your provider’s ability to sustain an attack like this is critical.  Further, how you’ve designed your BCP/DR contingency plans is pretty important, too.  Until we get true portability/interoperability between Cloud providers, it’s still up to you to figure out how to make this all work.  Remember that when you’re assuming those TCO calculations accurately reflect reality.

Big providers like eBay, Amazon, and Microsoft invest huge sums of money and manpower to ensure they are as survivable as they can be during attacks like this.  Do you?  Does your Cloud Provider? How many do you have.

Again, even Amazon goes down.  At this point, it’s largely been operational issues on their end and not the result of a massive attack. Imagine, however, if someday it is.  What would that mean to you?

As more and more of our applications and information are moved from inside our networks to beyond the firewalls and exposed to a larger audience (or even co-mingled with others’ data) the need for innovation and advancement in security is only going to skyrocket to start to deal with many of these problems.

/Hoff

  • Share/Bookmark
Categories: Cloud Computing, Cloud Security Tags:

Azure Users Seeing Red: When Patching the Cloud Causes Cracks

March 19th, 2009 beaker 4 comments

No, this isn’t one of those posts that suggests we can’t depend on the Cloud just because of one (ok, many) outages of note lately.  That’s so dystopic.  Besides, everyone else is already doing that.

I mean just because Azure was offline for 22 hours isn’t cause for that much concern, right?  It’s a beta community technology preview, anyway… ;)  Just like Google’s a beta.

azureWhat I found interesting was what Microsoft reported as the root cause for the outage, however:

 

The Windows Azure Malfunction This Weekend

First things first: we’re sorry.  As a result of a malfunction in Windows Azure, many participants in our Community Technology Preview (CTP) experienced degraded service or downtime.  Windows Azure storage was unaffected.

In the rest of this post, I’d like to explain what went wrong, who was affected, and what corrections we’re making.

What Happened?

During a routine operating system upgrade on Friday (March 13th), the deployment service within Windows Azure began to slow down due to networking issues.  This caused a large number of servers to time out and fail.

You catch that bit about “…a routine operating system upgrade?”  Sometimes we call those things “patches.”  Even if this wasn’t a patch, let’s call it one for argument’s sake, okay?

As such, I was reminded of a blog post that I wrote last year titled: “Patching the Cloud” in which I squawked about my concerns regarding patching and change management/roll-back in Cloud services.  It seems apropos:

 

Your application is sitting atop an operating system and underlying infrastructure that is managed by the cloud operator.  This “datacenter OS” may not be virtualized or could actually be sitting atop a hypervisor which is integrated into the operating system (Xen, Hyper-V, KVM) or perhaps reliant upon a third party solution such as VMware.  The notion of cloud implies shared infrastructure and hosting platforms, although it does not imply virtualization.

A patch affecting any one of the infrastructure elements could cause a ripple effect on your hosted applications.  Without understanding the underlying infrastructure dependencies in this model, how does one assess risk and determine what any patch might do up or down the stack?  …

Huh.  Go figure.  

/Hoff

 

  • Share/Bookmark

Patching The Cloud?

October 25th, 2008 beaker 2 comments

PatchingJust to confuse you, as a lead-in on this topic, please first read my recent rant titled “Will You All Please Shut-Up About Securing THE Cloud…NO SUCH THING…”

Let’s say the grand vision comes to fruition where enterprises begin to outsource to a cloud operator the hosting of previously internal complex mission-critical enterprise applications.

After all, that’s what we’re being told is the Next Big Thing™

In this version of the universe, the enterprise no longer owns the operational elements involved in making the infrastructure tick — the lights blink, packets get delivered, data is served up and it costs less for what is advertised is the same if not better reliability, performance and resilience.

Oh yes, “Security” is magically provided as an integrated functional delivery of service.

Tastes great, less datacenter filling.

So, in a corner case example, what does a boundary condition like the out-of-cycle patch release of MS08-067 mean when your infrastructure and applications are no longer yours to manage and the ownership of the “stack” disintermediates you from being able to control how, when or even if vulnerability remediation anywhere in the stack (from the network on up to the app) is assessed, tested or deployed.

Your application is sitting atop an operating system and underlying infrastructure that is managed by the cloud operator.  This “datacenter OS” may not be virtualized or could actually be sitting atop a hypervisor which is integrated into the operating system (Xen, Hyper-V, KVM) or perhaps reliant upon a third party solution such as VMware.  The notion of cloud implies shared infrastructure and hosting platforms, although it does not imply virtualization.

A patch affecting any one of the infrastructure elements could cause a ripple effect on your hosted applications.  Without understanding the underlying infrastructure dependencies in this model, how does one assess risk and determine what any patch might do up or down the stack?  How does an enterprise that has no insight into the “black box” model of the cloud operator, setup a dev/test/staging environment that acceptably mimics the operating environment?

What happens when the underlying CloudOS gets patched (or needs to be) and blows your applications/VMs sky-high (in the PaaS/IaaS models?)

How does one negotiate the process for determining when and how a patch is deployed?  Where does the cloud operator draw the line?   If the cloud fabric is democratized across constituent enterprise customers, however isolated, how does a cloud provider ensure consistent distributed service?  If an application can be dynamically provisioned anywhere in the fabric, consistency of the platform is critical.

I hate to get all “Star Trek II: The Wrath of Khan” on you, but as Spock said, “The needs of the many outweigh the needs of the few.”  How, when and if a provider might roll a patch has a broad impact across the entire customer base — as it has had in the hosting markets for years — but again the types of applications we are talking about here are far different than what we we’re used to today where the applications and the infrastructure are inextricably joined at the hip.

Hosting/SaaS providers today can scale because of one thing: standardization.  Certainly COTS applications can be easily built on standardized tiered models for compute, storage and networking, but again, we’re being told that enterprises will move all their applications to the cloud, and that includes bespoke creations.

If that’s not the case, and we end up with still having to host some apps internally and some apps in the cloud, we’ve gained nothing (from a cost reduction perspective) because we won’t be able to eliminate the infrastructure needed to support either.

Taking it one step further, what happens if there is standardization on the underlying Cloud platform (CloudOS?) and one provider “patches” or updates their Cloud offering but another does or cannot? If we ultimately talk about VM portability between providers running the “same” platform, what will this mean?  Will things break horribly or be instantiated in an insecure manner?

What about it?  Do you see cloud computing as just an extension of SaaS and hosting of today?  Do you see dramatically different issues arise based upon the types of information and applications that are being described in this model?  We’ve seen issues such as data ownership, privacy and portability bubble up, but these are much more basic operational questions.

This is obviously a loaded set of questions for which I have much to say — some of which is obvious — but I’d like to start a discussion, not a rant.

/Hoff

*This little ditty was inspired by a Twitter exchange with Bob Rudis who was complaining that Amazon’s EC2 service did not have the MS08-067 patch built into the AMI…Check out this forum entry from Amazon, however, as it’s rather apropos regarding the very subject of this blog…

  • Share/Bookmark