Dear Public Cloud Providers: Please Make Your Networking Capabilities Suck Less. Kthxbye
There are lots of great discussions these days about how infrastructure and networking need to become more dynamic and intelligent in order to more fully enable the mobility and automation promised by both virtualization and cloud computing. There are many examples of how that’s taking place in the enterprise.
Incumbent networking vendors and emerging cloud/network startups are coming to terms with the impact of virtualization and cloud as juxtaposed with that of (and you’ll excuse the term) “pure” cloud vendors and those more traditional (Inter)networking service providers who have begun to roll out Cloud services atop or alongside their existing portfolio of offerings.
- On the one hand we see hardware-based networking vendors adding software-based virtual switching and virtual appliance extensions in order to claw back the networking and security functions which have been abstracted into the virtualization and cloud stacks. This is a big deal in the enterprise and especially with vendors looking to stake a claim in the private cloud space which is the evolution of traditional datacenter capabilities extended with virtualization and leverages the attributes of Cloud to provide for a more frictionless computing experience. Here is where we see innovation and evolution with the likes of converged data and storage networking and unified fabric solutions.
- On the other hand we see massively-scaled public cloud providers and evolving (Inter)networking service providers who have essentially absorbed the networking layers into their cloud operating platforms and rely on the software functionality embedded within to manifest the connectivity required to enable service. There is certainly networking hardware sitting beneath these offerings, but depending upon their provenance, there are remarkable differences in the capabilities and requirements between them and those mentioned above. Mostly, these providers are really shouting for multi-terabit layer two switching fabric interconnects to which they interface their software-enabled compute platforms. The secret sauce is primarily in software.
For the purpose of this post, I’m not going to focus on the private Cloud camp and enterprise cloud plays, or those “Cloud” providers who replicate the same architectures to serve these customers, rather, I want to focus on those service providers/Cloud providers who offer massively scalable Infrastructure and Platform-as-a-Service offerings as in the second example above and highlight two really important points:
- From a physical networking perspective, most of these providers rely, in some large part, on giant, flat, layer two physical networks with the actual “intelligence,” segmentation, isolation and logical connectivity provided by the hypervisor and their orchestration/provisioning/automation layers.
- Most of the networking implementations in these environments are seriously retarded as it relates to providing flexible and extensible networking topologies which make for n-Tier application mapping nightmares for an enterprise looking to move a reasonable application stack to their service.
I’ve been experimenting with taking several reasonably basic n-Tier app stacks which require mutiple levels of security, load balancing and message bus capabilities and design them using several cloud platform providers offerings today.
The dirty little secret is that there are massive trade-offs with each of them, mostly due to constraints related to the very basic networking and security functionality offered by the hypervisors that power their services today. The networking is basic. Just the way they like it. It sucks for me.
This is a problem I demonstrated in enterprise virtualization in my Four Horsemen of the Virtualization Apocalypse presentation two years ago. It’s much, much worse in Cloud.
Not supporting multiple virtual interfaces, not supporting multiple IP addresses per instance/VM, not supporting multicast or broadcast capabilities for software-based load balancing (and resiliency of the LB engines themselves)…these are nasty issues that in many cases require wholesale re-engineering of app stacks and push things like resiliency and high availability into uncertain waters.
It’s also going to cost me more.
Sure, there are ways of engineering around these inadequacies, but they require additional levels of complexity, more cost, additional providers or instances and still leave me without many introspection options and detective and preventative security controls that I’m used to being able to rely on in traditional networking environments using colocation services or natively within the enterprise.
I’m sure I’ll see comments (public and private) suggesting all sorts of reasons why these are non-issues and how it’s silly to try and replicate the enterprise approach in the cloud. I have 500 reasons why they’re wrong…the Fortune 500, that is. You should also know I’m not apologizing for the sorry state of non-dynamic infrastructure, but I am suggesting that forcing me to re-tool app stacks to fit your flat network topologies without giving me better security and flexible connectivity options simply sucks.
In may cases, people just can’t get there from here.
I don’t want to have to re-architect my app stacks to work in the cloud simply because of a lack of maturity from a networking perspective. I shouldn’t have to. That’s simply backward. If the power of Cloud is its ability to quickly, flexibly, and easily allow me to provision, orchestrate and deploy services, that must include the network, also!
The networking and security capabilities of public Cloud providers needs to improve — and quickly. Applications that are not network topology-dependent and only require a single interface (or more specifically an IP address/socket) to communicate aren’t the problem. It’s when you need to integrate applications and/or infrastructure solutions that require multiple interfaces, that *are* topology dependent and require insertion between these monolithic applications that things break down. Badly.
The “app on a stick” model doesn’t work when enterprises struggle with taking isolated clusters of applications (tiers) and isolate/protect them with physical or virtual appliances that require multiple interfaces to do so. ACL’s don’t cut it, not when I need FW, IPS, DLP, WAF, etc. functionality. Let’s not forget dedicated management, storage or backup interfaces. These are many of the differences between public and private cloud offerings.
I can’t do many of the things I need to do easily in the Cloud today, not without serious trade-offs that incur substantial cost and given the immaturity of the market as a whole put me at risk.
For the large enterprise, if the fundamental networking and security architectures don’t allow for easy portability that does not require massive re-engineering of app stacks, these enterprises are going to turn to niche or evolving (Inter)networking providers who offer them the capability to do so, even if they’re not as massively scaleable, or they’ll simply build private clouds instead.