Incomplete Thought: Storage In the Cloud: Winds From the ATMOS(fear)
I never metadata I didn’t like…
I first heard about EMC’s ATMOS Cloud-optimized storage “product” months ago:
EMC Atmos is a multi-petabyte offering for information storage and distribution. If you are looking to build cloud storage, Atmos is the ideal offering, combining massive scalability with automated data placement to help you efficiently deliver content and information services anywhere in the world.
I had lunch with Dave Graham (@davegraham) from EMC a ways back and while he was tight-lipped, we discussed ATMOS in lofty, architectural terms. I came away from our discussion with the notion that ATMOS was more of a platform and less of a product with a focus on managing not only stores of data, but also the context, metadata and policies surrounding it. ATMOS tasted like a service provider play with a nod to very large enterprises who were looking to seriously trod down the path of consolidated and intelligent storage services.
I was really intrigued with the concept of ATMOS, especially when I learned that at least one of the people who works on the team developing it also contributed to the UC Berkeley project called OceanStore from 2005:
OceanStore is a global persistent data store designed to scale to billions of users. It provides a consistent, highly-available, and durable storage utility atop an infrastructure comprised of untrusted servers.
Any computer can join the infrastructure, contributing storage or providing local user access in exchange for economic compensation. Users need only subscribe to a single OceanStore service provider, although they may consume storage and bandwidth from many different providers. The providers automatically buy and sell capacity and coverage among themselves, transparently to the users. The utility model thus combines the resources from federated systems to provide a quality of service higher than that achievable by any single company.
OceanStore caches data promiscuously; any server may create a local replica of any data object. These local replicas provide faster access and robustness to network partitions. They also reduce network congestion by localizing access traffic.
Pretty cool stuff, right? This just goes to show that plenty of smart people have been working on “Cloud Computing” for quite some time.
Ah, the ‘Storage Cloud.’
Now, while we’ve heard of and seen storage-as-a-service in many forms, including the Cloud, today I saw a really interesting article titled “EMC, AT&T open up Atmos-based cloud storage service:”
EMC Corp.’s Atmos object-based storage system is the basis for two cloud computing services launched today at EMC World 2009 — EMC Atmos onLine and AT&T’s Synaptic Storage as a Service.
EMC’s service coincides with a new feature within the Atmos Web services API that lets organizations with Atmos systems already on-premise “federate” data – move it across data storage clouds. In this case, they’ll be able to move data from their on-premise Atmos to an external Atmos computing cloud.
Boston’s Beth Israel Deaconess Medical Center is evaluating Atmos for its next-generation storage infrastructure, and storage architect Michael Passe said he plans to test the new federation capability.
Organizations without an internal Atmos system can also send data to Atmos onLine by writing applications to its APIs. This is different than commercial graphical user interface services such as EMC’s Mozy cloud computing backup service. “There is an API requirement, but we’re already seeing people doing integration” of new Web offerings for end users such as cloud computing backup and iSCSI connectivity, according to Mike Feinberg, senior vice president of the EMC Cloud Infrastructure Group. Data-loss prevention products from RSA, the security division of EMC, can also be used with Atmos to proactively identify confidential data such as social security numbers and keep them from being sent outside the user’s firewall.
AT&T is adding Synaptic Storage as a Service to its hosted networking and security offerings, claiming to overcome the data security worries many conservative storage customers have about storing data at a third-party data center.
The federation of data across storage clouds using API’s? Information cross-pollenization and collaboration? Heavy, man.
Take plays like Cisco’s UCS with VMware’s virtualization and stir in VN-Tag with DLP/ERM solutions and sit it on top of ATMOS…from an architecture perspective, you’ve got an amazing platform for service delivery that allows for some slick application of policy that is information centric. Sure, getting this all to stick will take time, but these are issues we’re grappling with in our discussions related to portability of applications and information.
Settling Back Down to Earth
This brings up a really important set of discussions that I keep harping on as the cold winds of reality start to blow.
From a security perspective, storage is the moose on the table that nobody talks about. In virtualized environments we’re interconnecting all our hosts to islands of centralized SANs and NAS. We’re converging our data and storage networks via CNAs and unified fabrics.
In multi-tenant Cloud environments all our data ends up being stored similarly with the trust that segregation and security are appropriately applied. Ever wonder how storage architectures never designed to do these sorts of things at scale can actually do so securely? Whose responsibility is it to manage the security of these critical centerpieces of our evolving “centers of data.”
So besides my advice that security folks need to run out and get their CCIE certs, perhaps you ought to sign up for a storage security class, too. You can also start by reading this excellent book by Himanshu Dwivedi titled “Securing Storage.”
What are YOU doing about securing storage in your enterprise our Cloud engagements? If your answer is LUN masking, here’s four Excedrin, call me after the breach.