
Does Centralized Data Governance Equal Centralized Data?

I’ve been trying to construct a palette of blog entries over the last few months which communicates the need for a holistic network, host and data-centric approach to information security and information survivability architectures. 

I’ve been paying close attention to the dynamics of the DLP/CMF market/feature positioning as well as what’s going on in enterprise information architecture with the continued emergence of WebX.0 and SOA.

That’s why I found this Computerworld article written by Jay Cline very interesting, as it focused on the need for a centralized data governance function within an organization in order to manage the risk associated with the information management lifecycle (which includes security and survivability). The article went on to discuss how the roles within the organization, namely the CIO/CTO, will also evolve in parallel.

The three primary indicators for this evolution were summarized as:

1. Convergence of information risk functions
2. Escalating risk of information compliance
3. Fundamental role of information

Nothing terribly earth-shattering here, but the exclamation point of this article is that enabling a centralized data governance organization takes a (gasp!) tricky combination of people, process and technology:

"How does this all add up? Let me connect the dots: Data must soon become centralized,
its use must be strictly controlled within legal parameters, and information must drive the
business model. Companies that don’t put a single, C-level person in charge of making
this happen will face two brutal realities: lawsuits driving up costs and eroding trust in the
company, and competitive upstarts stealing revenues through more nimble use of centralized
information."

Let’s deconstruct this a little, because I totally get the essence of what is proposed, but there are some realities that must be discussed.  Working backwards:

  • I agree that data and its use must be strictly controlled within legal parameters.
  • I agree that a single, C-level person needs to be accountable for the data lifecycle.
  • However, whilst I don’t disagree that it would be fantastic to centralize data, I think it’s a nice theory in the wrong universe.

Interestingly, Richard Bejtlich focused his response to the article on this very notion, but I can’t get past a couple of issues, some of them technical and some of them business-related.

There’s a confusing mish-mash alluded to in Richard’s blog of "second home" data repositories that maintain copies of data yet somehow also magically enforce data control and protection schemes outside of the repository, while simultaneously allowing the flexibility of data creation "locally."  The competing theme for me is that centralization of data is really irrelevant (it’s convenient, but not the point); what you really need is the (and you’ll excuse the lazy use of a politically-charged term) "DRM" functionality to work irrespective of where the data is created, stored, or used.
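To make the distinction concrete, here’s a minimal sketch of what "policy that travels with the data" might look like. To be clear, this is purely illustrative and entirely my own construction (the envelope format, the policy model and the role check are all assumptions, not anything Richard or the article proposes), using Python’s cryptography package:

```python
# Hypothetical sketch: bind a usage policy to the data itself, so the
# control travels with the object rather than living in a central repository.
# Requires: pip install cryptography
import json
from cryptography.fernet import Fernet

def seal(data: bytes, policy: dict, key: bytes) -> bytes:
    """Encrypt the data and embed its policy in the same envelope."""
    envelope = {
        "policy": policy,
        "payload": Fernet(key).encrypt(data).decode(),
    }
    return json.dumps(envelope).encode()

def open_if_permitted(envelope: bytes, key: bytes, context: dict) -> bytes:
    """Release plaintext only if the requesting context satisfies the
    embedded policy, wherever the envelope was created, stored, or used."""
    env = json.loads(envelope)
    if context.get("role") not in env["policy"].get("allowed_roles", []):
        raise PermissionError("policy denies access in this context")
    return Fernet(key).decrypt(env["payload"].encode())

key = Fernet.generate_key()
sealed = seal(b"Q3 revenue forecast", {"allowed_roles": ["finance"]}, key)
print(open_if_permitted(sealed, key, {"role": "finance"}))  # b'Q3 revenue forecast'
```

The point of the sketch is simply that the policy rides inside the envelope with the ciphertext, so enforcement doesn’t depend on which repository the object happens to live in.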

Centralized storage is good (and selfishly so for someone like Richard) for performing forensics and auditing, but it’s not necessarily technically or fiscally efficient and doesn’t necessarily align with an agile business model.

The timeframe for the evolution of this data centralization was not really established, but we don’t have the most difficult part licked yet: the application of either the accompanying metadata describing the information assets we wish to protect OR the ability to uniformly classify and enforce its creation, distribution, utilization and destruction.
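For illustration, here’s roughly what that missing metadata layer would have to cover at a minimum. This is a hypothetical sketch of my own; the class names, levels and lifecycle stages are assumptions, not any standard or product:

```python
# Hypothetical illustration of the metadata problem: every asset would need
# a classification and lifecycle state before centralized governance could
# enforce anything. Applying this uniformly is the unsolved part.
from dataclasses import dataclass
from enum import Enum

class Classification(Enum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4

class LifecycleStage(Enum):
    CREATED = "created"
    DISTRIBUTED = "distributed"
    UTILIZED = "utilized"
    DESTROYED = "destroyed"

@dataclass
class AssetMetadata:
    owner: str
    classification: Classification
    stage: LifecycleStage = LifecycleStage.CREATED

    def may_distribute(self) -> bool:
        # Toy rule: anything above INTERNAL needs explicit handling.
        return self.classification.value <= Classification.INTERNAL.value

doc = AssetMetadata(owner="finance", classification=Classification.CONFIDENTIAL)
print(doc.may_distribute())  # False
```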

Now we’re supposed to be able to magically centralize all our data, too?  I know that large organizations have embraced the notion of data warehousing, but it’s not the underlying data stores I’m truly worried about; it’s the combination of data from multiple silos within the data warehouses that concerns me, and its distribution to multi-dimensional analytic consumers.

You may be able to protect a DB’s table, row, column or a file, but how do you apply a policy to a distributed ETL function across multiple datasets and paths?

ATAMO?  (And Then A Miracle Occurs) 
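If you wanted to attempt an answer rather than a miracle, the shape of it might look like the following. Again, this is my own toy sketch, not anything the article proposes: a derived dataset inherits the strictest classification of its inputs, so the policy question at least follows the data through the transform:

```python
# My own sketch, not a real product: propagate classification through an
# ETL join so the derived dataset inherits the strictest label of its
# inputs. Doing this across every path and consumer is the hard part.
from dataclasses import dataclass

LEVELS = {"public": 0, "internal": 1, "confidential": 2}

@dataclass
class Dataset:
    name: str
    rows: list
    label: str  # one of LEVELS

def etl_join(a: Dataset, b: Dataset, key: str) -> Dataset:
    """Join two silos; the result carries the max of the input labels."""
    index = {row[key]: row for row in b.rows}
    merged = [{**row, **index[row[key]]} for row in a.rows if row[key] in index]
    label = max(a.label, b.label, key=lambda lvl: LEVELS[lvl])
    return Dataset(f"{a.name}+{b.name}", merged, label)

hr = Dataset("hr", [{"emp": 1, "salary": 90000}], "confidential")
crm = Dataset("crm", [{"emp": 1, "region": "EMEA"}], "internal")
combined = etl_join(hr, crm, "emp")
print(combined.label)  # 'confidential' -- the combination is the risk
```

Even granting the sketch, enforcing that propagation across every distributed ETL path in a real warehouse is exactly the part we haven’t licked.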

What I find intriguing about this article is that the so-described pendulum effect of data centralization (data warehousing, BI/DI) and resource centralization (data center virtualization, WAN optimization/caching, thin client computing) seems to be on a direct collision course with the way in which applications and data are being distributed via Web2.0/Service Oriented architectures and delivery underpinnings such as rich(er) client-side technologies like mash-ups and AJAX…

So what I don’t get is how one balances centralizing data when today’s emerging infrastructure and information architectures are constructed to do just the opposite: distribute data, processing and data re-use/transformation across the Enterprise.  We’ve already let the data genie out of the bottle, and now we’re trying to cram it back in?
(*please see below for a perfect illustration)

I ask this again within the scope of deploying a centralized data governance organization and its associated technology and processes within an agile business environment. 

/Hoff

P.S. I expect that a certain analyst friend of mine will be emailing me in T-Minus 10, 9…

*Here’s a perfect illustration of the futility of centrally storing "data."  Click on the image and notice the second bullet item…:

[Image: Google Gears]

  1. June 17th, 2007 at 04:36 | #1

    I've often thought that security architects, when they consider "data", should take a page from urban planners…
    As highways grew in popularity, traffic was thought to behave like a liquid. It would flow at a certain volume and rate. If that rate dropped, the government could simply add more lanes or other highways to increase the capacity to handle the new traffic.
    Recently, urban planning has realized that traffic does not behave like a liquid at all. It behaves more like a gas – expanding to meet whatever capacity you give it.
    My current theory is that data operates the same way. The more capacity we have, the more human use will cause the data to expand to meet that capacity. From core to app servers. From app servers to clients (clients now being everything from PCs and laptops to smartphones and thumb drives). The more capacity the individual is given, the more data will move toward filling that capacity.
    Unfortunately for security, we've spent the past 12-15 years building IT architectures that operate under the premise that giving users the capacity to store data "empowers" them to be more productive. If we put PCs on their desks, they'll be productive (and they were). If we network them, they'll be more productive (and they were). If we give them laptops, they'll be more productive (and they were). If we give them smartphones, they'll be more productive (and they can't stop typing on their dadburn crackberries).
    So while from a security architecture standpoint, I'm in love with data centralization and thin clients – I'm afraid Pandora's box is already open. We've given the users power, and now who wants to take it back from them in the name of "compliance"?

  2. June 17th, 2007 at 06:12 | #2

    Wow! Alex, that's a fantastic post and right on. Great analogy.
    That last sentence:
    "So while from a security architecture standpoint, I'm in love with data centralization and thin clients – I'm afraid Pandora's box is already open. We've given the users power, and now who wants to take it back from them in the name of "compliance"?"
    …is really powerful.
    /Hoff

  3. June 17th, 2007 at 08:35 | #3

    C-
    Nah, it was just more "me too-ism" for the points you make in your bold paragraphs 🙂
    The most interesting thing to me (from a security architecture standpoint) is the "DRM" angle and the considerations we make for human behavior. The fact of the matter is that we're in the "ctrl-c, ctrl-v" society, which means that users (esp. younger ones) will inevitably break whatever barriers they can in order to use the data in the way they want. Now if you think about it, people who know better "pirate" entertainment data all the time without even attempting to rationalize their actions. What happens when the user can actually come up with a "business case" for breaking DRM?
    Now don't get me wrong, DRM does have risk-reducing effects on business processes. The question in my mind is whether extreme centralization efforts aren't at least equally effective at reducing risk, and might have a drastically lower TCO in terms of IT infrastructure (acquisition and support).
    The inherent problems with Web 2.0 and SOA are even worse. We're essentially using a remarkably flexible and revolutionary platform that was built to present physics papers for all sorts of uses it was never designed to perform. This foundation, from both a support and security standpoint, is fundamentally poor. But we can't stop using HTML and JavaScript, can we? From a network perspective, IP is the same way. IP wasn't built to provide secure communications from a C or I standpoint (or even built to help prevent, detect, or respond); it was just built to solve a rudimentary A problem.
    This is why IMHO people who claim that security is broken are wrong. Our foundational premises are absolutely correct; it's just that the problems we're applying them to are not only remarkably complex, but were never built to do what our IT departments and vendors have adapted them to do.

  4. Eric Hacker
    June 18th, 2007 at 11:23 | #4

    You are all correct.
    I really don't have the time to explain all the details now. Perhaps at the BeanSec meeting.
    Some data must always be centralized, based on its classification. Using hash techniques, as suggested by Jeff Jonas, some data can be given to users while the sensitive parts are retained. Later, the data can be matched and merged.
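    Roughly, the idea might look like the toy sketch below (my own construction, assuming a shared salt and made-up field names; Jonas's actual technique is considerably more sophisticated):

```python
# Toy version of the hash-and-match idea: hand users records with the
# sensitive identifier replaced by a salted one-way hash, then match and
# merge later against the retained originals.
import hashlib

SALT = b"shared-secret-salt"  # assumption: both sides hold the same salt

def tokenize(ssn: str) -> str:
    return hashlib.sha256(SALT + ssn.encode()).hexdigest()

# The central store retains the sensitive part, keyed by token.
retained = {tokenize("123-45-6789"): {"ssn": "123-45-6789"}}

# Users get the non-sensitive fields plus the token.
distributed = {"token": tokenize("123-45-6789"), "region": "EMEA"}

# Later: match and merge without the sensitive value leaving the core.
merged = {**distributed, **retained[distributed["token"]]}
print(merged)
```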
