Archive for the ‘IP/Data Leakage’ Category

Profiling Data at the Network Layer and Controlling Its Movement Is a Bad Thing?

June 3rd, 2007 2 comments

I’ve been watching what appears like a car crash in slow-motion and for some strange reason I share some affinity and/or responsibility for what is unfolding in the debate between Rory and Rob.

What motivated me to comment on this ongoing exploration of data-centric security was Rory's last post, in which he appears to refer to some points I raised in my original post but still seems bent on the idea that the crux of my concept was tied to DRM:

So… am I anti-security? Nope, I'm extremely pro-security. My feeling is however that the best way to implement security is in ways in which it's invisible to users. Every time you make ordinary business people think about security (e.g., usernames/passwords) they try their darndest to bypass those requirements.

That’s fine and I agree.  The concept of ADAPT is completely transparent to "users."  This doesn’t obviate the fact that someone will have to be responsible for characterizing what is important and relevant to the business in terms of "assets/data," attaching weight/value to them, and setting some policies regarding how to mitigate impact and ultimately risk.

Personally I’m a great fan of network segregation and defence in
depth at the network layer. I think that devices like the ones
crossbeam produce are very useful in coming up with risk profiles, on a
network by network basis rather than a data basis and
managing traffic in that way. The reason for this is that then the
segregation and protections can be applied without the intervention of
end-users and without them (hopefully) having to know about what
security is in place.

So I think you’re still missing my point.  The example I gave of the X-Series using ADAPT takes a combination of best-of-breed security software components such as FW, IDP, WAF, XML, AV, etc. and provides you with segregation as you describe.  HOWEVER, the (r)evolutionary delta here is that the ADAPT profiling of content set by policies which are invisible to the user at the network layer allows one to make security decisions on content in context and control how data moves.

So to use the phrase that I've seen in other blogs on this subject, I think that the "zones of trust" are a great idea, but the zones shouldn't be based on the data that flows over them, but on the user/machine that are used. It's the idea of tagging all that data with the right tags and controlling its flow that bugs me.

…and thus it’s obvious that I completely and utterly disagree with this statement.  Without tying some sort of identity (pseudonymity) to the user/machine AND combining it with identifying the channels (applications) and the content (payload) you simply cannot make an informed decision as to the legitimacy of the movement/delivery of this data.

I used the case of being able to utilize client-side tagging as an extension to ADAPT, NOT as a dependency.  Go back and re-read the post; it’s a network-based transparent tagging process that attaches the tag to the traffic as it moves around the network.  I don’t understand why that would bug you?

So that's where my points in the previous post came from, and I still reckon they're correct. Data tagging and parsing relies on the existence of standards and their uptake in the first instance and then users *actually using them* and personally I think that's not going to happen in general companies and therefore is not the best place to be focusing security effort…

Please explain this to me?  What standards need to exist in order to tag data — unless of course you’re talking about the heterogeneous exchange and integration of tagging data at the client side across platforms?  Not so if you do it at the network layer WITHIN the context of the architecture I outlined; the clients, switches, routers, etc. don’t need to know a thing about the process as it’s done transparently.

I wasn’t arguing that this is the end-all-be-all of data-centric security, but it’s worth exploring without deadweighting it to the negative baggage of DRM and the existing DLP/Extrusion Prevention technologies and methodologies that currently exist.

ADAPT is doable and real; stay tuned.


For Data to Survive, It Must ADAPT…

June 1st, 2007 2 comments


Now that I've annoyed you by suggesting that network security will over time become irrelevant given lost visibility due to advances in OS protocol transport and operation, allow me to give you another nudge towards the edge and further reinforce my theories with some additional practical data-centric security perspectives.

If any form of network-centric security solution is to succeed in adding value over time, the mechanics of applying policy and effecting disposition on flows as they traverse the network must be made on content in context.  That means we must get to a point where we can make “security” decisions based upon information and its “value” and classification as it moves about.

It's not good enough to make decisions on how flows/data should be characterized and acted on with criteria focused solely on the 5-tuple (header), signature-driven profiling, or even behavioral analysis that doesn't characterize the content in context of where it's coming from, where it's going, and who (machine, virtual machine and "user") or what (application, service) intends to access and consume it.

In the best of worlds, we'd like to be able to classify data before it makes its way through the IP stack and enters the network, and use this metadata as an attached descriptor of the 'type' of content that the data represents.  We could do this as the data is created by applications (thick or thin, rich or basic), either using the application itself or using an agent (client-side) that profiles the data prior to storage or transmission.

Since I’m on my Jericho Forum kick lately, here’s how they describe how data ought to be controlled:

Access to data should be controlled by security attributes of the data itself.

  • Attributes can be held within the data (DRM/Metadata) or could be a separate system.
  • Access / security could be implemented by encryption.
  • Some data may have “public, non-confidential” attributes.
  • Access and access rights have a temporal component.

You would probably need client-side software to provide this functionality.  We do something like this today with email compliance solutions that offer primitive versions of this capability, forcing users to declare the classification of an email before they can hit the send button, or with the document info that can be created when one authors a Word document.

There are a bunch of ERM/DRM solutions in play today that are bandied about and sold as "compliance" solutions, but their value goes much deeper than that.  IP leakage/extrusion prevention systems (with or without client-side tie-ins) try to do similar things as well.

Ideally, this metadata would be used as a fixed descriptor of the content that permanently attaches itself and follows that content around so it can be used to decide what content should be “routed” based upon policy.

If we’re not able to use this file-oriented static metadata, we’d like then for the “network” (or something in/on it) to be able to dynamically profile content at wirespeed and characterize the data as it moves around the network from origin to destination in the same way.

So, this is where Applied Data & Application Policy Tagging (ADAPT) comes in.  ADAPT is an approach that can make use of existing and new technology to profile and characterize content (by using content matching, signatures, regular expressions and behavioral analysis in hardware or software) and then apply policy-driven information "routing" functionality as flows traverse the network, by using 802.1Q q-in-q VLAN tags (an open approach) or by applying a proprietary ADAPT tag-header as a descriptor to each flow as it moves around the network.

Think of it like a VLAN tag that describes the data within the packet/flow, defined however you see fit:

The ADAPT tag/VLAN is user-defined and can use any taxonomy that best suits the types of content that are interesting; one might use asset classifications such as "confidential" or taxonomies such as "HIPAA" or "PCI" to describe what is contained in the flows.  One could combine and/or stack the tags, too.  The tag maps to one of these arbitrary categories, which could be fed by interpreting metadata attached to the data itself (if in file form) or dynamically by on-the-fly profiling at the network level.

As data moves across the network and across what we call boundaries (zones) of trust, the policy tags are parsed and disposition is effected based upon the language governing the rules.  If you use the "open" version using the q-in-q VLANs, you have on the order of 4096 VLAN IDs to choose from…more than enough to accommodate most asset classifications and still leave room for normal VLAN usage.  Enforcing the ACLs can be done by pretty much ANY modern switch that supports q-in-q, very quickly.

Just like an ACL for IP addresses or VLAN policies, ADAPT does the same thing for content routing, but uses VLAN IDs (or the proprietary ADAPT header) to enforce it.
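To make the tagging idea concrete, here's a minimal sketch of content profiling mapped to a q-in-q tag.  The taxonomy, regex patterns and VLAN ID assignments are entirely hypothetical (and far cruder than real content matching), and in the architecture described here the profiling would happen in hardware at wirespeed rather than in Python:

```python
import re

# Hypothetical taxonomy: each category maps to an outer q-in-q VLAN ID.
# The patterns are deliberately simplistic stand-ins for real profiling.
ADAPT_TAXONOMY = {
    "HIPAA": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 100),   # SSN-like pattern
    "PCI":   (re.compile(r"\b(?:\d[ -]?){16}\b"), 200),     # 16-digit PAN-like pattern
}
DEFAULT_VLAN = 1  # untagged / uninteresting traffic

def classify_payload(payload: str) -> list[tuple[str, int]]:
    """Return every (category, vlan_id) whose pattern matches the payload.
    Multiple matches model 'stacked' tags."""
    return [(name, vlan) for name, (rx, vlan) in ADAPT_TAXONOMY.items()
            if rx.search(payload)]

def adapt_tag(payload: str) -> int:
    """Pick the first matching category's VLAN ID, else the default."""
    matches = classify_payload(payload)
    return matches[0][1] if matches else DEFAULT_VLAN
```

A downstream q-in-q-capable switch would then only need an ordinary VLAN ACL keyed on the outer tag; it never has to understand the content itself.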

To enable this sort of functionality, every switch/router in the network would need to be either q-in-q capable (which most switches are these days) or ADAPT-enabled (which would be difficult, since you'd need every network vendor to support the protocol).  Alternatively, you could use an overlay UTM security services switch sitting on top of the network plumbing, through which all traffic moving from one zone to another would be subject to the ADAPT policy, since each flow has to go through said device.

Since the only device that needs to be ADAPT-aware is this UTM security services switch (see the example below), you can let the network do what it does best and use this solution to enforce the policy for you across these boundary transitions.  Said UTM security services switch needs an extremely high-speed content security engine that is able to characterize the data at wirespeed and add a tag to the frame as it moves through the switching fabric, before it pops back out onto the network.

Clearly this switch would have to have coverage across every network segment.  It wouldn’t work well in virtualized server environments or any topology where zoned traffic is not subject to transit through the UTM switch.

I’m going to be self-serving here and demonstrate this “theoretical” solution using a Crossbeam X80 UTM security services switch plumbed into a very fast, reliable, and resilient L2/L3 Cisco infrastructure.  It just so happens to have a wire-speed content security engine installed in it.  The reason the X-Series can do this is because once the flow enters its switching fabric, I own the ultimate packet/frame/cell format and can prepend any header functionality I like onto the structure to determine how it gets “routed.”

Take the example below where the X80 is connected to the layer-3 switches using 802.1q VLAN trunked interfaces.  I’ve made this an intentionally simple network using VLANs and L3 routing; you could envision a much more complex segmentation and routing environment, obviously.

This network is chopped up into 4 VLAN segments:

  1. General Clients (VLAN A)
  2. Finance & Accounting Clients (VLAN B)
  3. Financial Servers (VLAN C)
  4. HR Servers (VLAN D)

Each of the clients/servers in the respective VLANs default-routes to a firewall cluster IP address presented by the firewall application modules providing service in the X80.

Thus, to get from one VLAN to another, one must pass through the X80 and be profiled by the content security engine and whatever additional UTM services are installed in the chassis (such as firewall, IDP, AV, etc.).

Let's say, then, that a user in VLAN A (General Clients) attempts to access one or more resources in VLAN D (HR Servers).

Using solely IP addresses and/or L2 VLANs, let's say the firewall and IPS policies allow this behavior, as the clients in that VLAN have a legitimate need to access the HR intranet server.  However, let's say that this user tries to access data on the HR intranet server that contains personally identifiable information falling under the governance/compliance mandates of HIPAA.

Let us further suggest that the ADAPT policy states the following:

Rule   Source            Destination       ADAPT Descriptor      Action

1      VLAN A (IP.1.1)   VLAN D (IP.3.1)   HIPAA, Confidential   Deny

2      VLAN B (IP.2.1)   VLAN C (IP.4.1)   PCI                   Allow

Using rule 1 above, as the client makes the request, he transits from VLAN A to VLAN D.  The reply containing the requested information is profiled by the content security engine, which is able to characterize the data as containing information that matches our definition of either "HIPAA" or "Confidential" (purely arbitrary for the sake of this example).

This could be done by reading the metadata if it exists as an attachment to the content’s file structure, in cooperation with an extrusion prevention application running in the chassis, or in the case of ad-hoc web-based applications/services, done dynamically.

According to the ADAPT policy above, this data would then either be silently dropped or, depending upon what "deny" means, the user could be redirected to a webpage that informs them of a policy violation.

Rule 2 above would allow authorized IPs in VLAN B to access PCI-classified data in VLAN C.
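As a sketch of how the rule table above might be evaluated (first-match, ACL-style; the VLAN names, descriptor sets and default action are illustrative assumptions, not a specification):

```python
# A minimal model of the ADAPT rule table above (names/values illustrative).
ADAPT_RULES = [
    # (src_vlan, dst_vlan, descriptors_that_trigger, action)
    ("A", "D", {"HIPAA", "Confidential"}, "deny"),
    ("B", "C", {"PCI"}, "allow"),
]

def adapt_decision(src: str, dst: str, flow_descriptors: set[str]) -> str:
    """First-match evaluation, like an ACL: a rule fires when the
    source/destination VLANs match and the flow carries any of the
    rule's descriptors."""
    for rule_src, rule_dst, descriptors, action in ADAPT_RULES:
        if src == rule_src and dst == rule_dst and descriptors & flow_descriptors:
            return action
    return "allow"  # default-permit here purely for illustration
```

A real deployment would obviously want a default-deny posture and richer match criteria (identity, application, direction); this only shows how content descriptors slot into an otherwise ordinary ACL lookup.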

You can imagine how one could integrate IAM and extend the policies to include pseudonymity/identity as a function of access, also.  Or, one could profile the requesting application (browser, for example) to define whether or not this is an authorized application.  You could extend the actions to lots of stuff, too.

In fact, I alluded to it in the first paragraph, but if we back up a step and look at where consolidation of functions/services is being driven by virtualization, one could also use the principles of ADAPT to extend the ACL functionality that exists in switching environments to control/segment/zone access to/from virtual machines (VMs) of different asset/data/classification/security zones.

What this translates to is a workflow/policy instantiation that would use the same logic to prevent VM1 from communicating with VM2 if there were a "zone" mismatch; as we add data classification in context, you could have various levels of granularity defining access based not only on the VM, but on the VM and the data trafficked by it.
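A minimal sketch of that VM-level logic follows; the zone labels, VM names and the confidential-data rule are all hypothetical examples of the kind of policy one might express:

```python
# Hypothetical: extend zone enforcement to inter-VM traffic. Each VM carries
# a zone label; data descriptors add a second level of granularity.
VM_ZONES = {"vm1": "finance", "vm2": "general", "vm3": "finance"}

def vm_flow_allowed(src_vm: str, dst_vm: str,
                    data_tags: set[str] = frozenset()) -> bool:
    """Deny on a zone mismatch; within a zone, still deny flows carrying
    tags the zone is not cleared for (an illustrative rule)."""
    if VM_ZONES[src_vm] != VM_ZONES[dst_vm]:
        return False
    if "confidential" in data_tags and VM_ZONES[src_vm] != "finance":
        return False
    return True
```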

Furthermore, assuming this service was deployed internally and you could establish a trusted CA with certs that would support transparent MITM SSL decrypts, you could do this (with appropriate scale) with encrypted traffic also.

This is data-centric security that uses the network when needed, the host when it can and the notion of both static and dynamic network-borne data classification to enforce policy in real-time.


Comments/blogs on this entry you might be interested in that have no trackbacks set:

MCWResearch Blog

Rob Newby’s Blog

Alex Hutton’s Blog

Security Retentive Blog

Security: “Built-in, Overlay or Something More Radical?”

May 10th, 2007 No comments

I was reading Joseph Tardo's (Nevis Networks) new Illuminations blog and found the topic of his latest post, "Built-in, Overlay or Something More Radical?", regarding the possible future of network security, quite interesting.

Joseph (may I call you Joseph?) recaps the topic of a research draft from Stanford, funded by the "Stanford Clean Slate Design for the Internet" project, that discusses an approach to network security called SANE.  The notion of SANE (AKA Ethane) is a policy-driven security services layer that utilizes intelligent, centrally-located services to replace many of the underlying functions provided by routers, switches and security products today:

Ethane is a new architecture for enterprise networks which provides a powerful yet simple management model and strong security guarantees.  Ethane allows network managers to define a single, network-wide, fine-grain policy, and then enforces it at every switch.  Ethane policy is defined over human-friendly names (such as "bob", "payroll-server", or "http-proxy") and dictates who can talk to whom and in which manner.  For example, a policy rule may specify that all guest users who have not authenticated can only use HTTP and that all of their traffic must traverse a local web proxy.

Ethane has a number of salient properties difficult to achieve with network technologies today.  First, the global security policy is enforced at each switch in a manner that is resistant to spoofing.  Second, all packets on an Ethane network can be attributed back to the sending host and the physical location in which the packet entered the network.  In fact, packets collected in the past can also be attributed to the sending host at the time the packets were sent — a feature that can be used to aid in auditing and forensics.  Finally, all the functionality within Ethane is provided by very simple hardware switches.

The trick behind the Ethane design is that all complex functionality, including routing, naming, policy declaration and security checks, is performed by a central controller (rather than in the switches as is done today).  Each flow on the network must first get permission from the controller, which verifies that the communication is permissible by the network policy.  If the controller allows a flow, it computes a route for the flow to take, and adds an entry for that flow in each of the switches along the path.

With all complex function subsumed by the controller, switches in Ethane are reduced to managed flow tables whose entries can only be populated by the controller (which it does after each successful permission check).  This allows a very simple design for Ethane switches, using only SRAM (no power-hungry TCAMs) and a little bit of logic.
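The controller-centric flow setup the excerpt describes can be sketched roughly as follows.  The class names, the policy callable and the pre-computed path are all my own simplifications for illustration, not Ethane's actual design:

```python
# Sketch of Ethane-style flow setup: switches hold only flow tables that a
# central controller populates after a per-flow permission check.
class Switch:
    def __init__(self, name: str):
        self.name = name
        self.flow_table: dict[tuple[str, str], str] = {}  # (src, dst) -> next hop

class Controller:
    def __init__(self, policy, topology):
        self.policy = policy          # callable: (src, dst) -> bool
        self.topology = topology      # dict: switch name -> Switch

    def request_flow(self, src: str, dst: str, path: list[str]) -> bool:
        """First packet of a flow: verify the policy, then install an entry
        in every switch along the (pre-computed, for simplicity) path."""
        if not self.policy(src, dst):
            return False
        for i, sw_name in enumerate(path):
            next_hop = path[i + 1] if i + 1 < len(path) else dst
            self.topology[sw_name].flow_table[(src, dst)] = next_hop
        return True
```

This also makes the bottleneck concern below concrete: every new flow costs a round trip to one central controller before any packet is forwarded.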


I like many of the concepts here, but I’m really wrestling with the scaling concerns that arise when I forecast the literal bottlenecking of admission/access control proposed therein.

Furthermore, and more importantly, while SANE speaks to being able to define who "Bob" is and what infrastructure makes up the "payroll server," this solution seems to provide no way of enforcing policy based on content in context of the data flowing across it.  Integrating access control with the pseudonymity offered by identity management in policy enforcement is only half the battle.

The security solutions of the future must evolve to divine and control not only vectors of transport but also the content and relative access that the content itself defines dynamically.

I’m going to suggest that by bastardizing one of the Jericho Forum’s commandments for my own selfish use, the network/security layer of the future must ultimately respect and effect disposition of content based upon the following rule (independent of the network/host):

Access to data should be controlled by security attributes of the data itself.

  • Attributes can be held within the data (DRM/Metadata) or could be a separate system.
  • Access / security could be implemented by encryption.
  • Some data may have “public, non-confidential” attributes.
  • Access and access rights have a temporal component. 


Deviating somewhat from Jericho’s actual meaning, I am intimating that somehow, somewhere, data must be classified and self-describe the policies that govern how it is published and consumed and ultimately this security metadata can then be used by the central policy enforcement mechanisms to describe who is allowed to access the data, from where, and where it is allowed to go.

…Back to the topic at hand, SANE:

As Joseph alluded, SANE would require replacing (or not using much of the functionality of) currently-deployed routers, switches and security kit.  I’ll let your imagination address the obvious challenges with this design.

Without delving deeply, I'll use Joseph's categorization of "interesting-but-impractical."


Intellectual Property/Data Leakage/Content Monitoring & Protection – Another Feature, NOT a Market.

April 8th, 2007 8 comments

Besides having the single largest collection of vendors that begin with the letter 'V' in one segment of the security space (Vontu, Vericept, Verdasys, Vormetric…what the hell!?), it's interesting to see how quickly content monitoring and protection functionality is approaching the inflection point of market-versus-feature definition.

The "evolution" of the security market marches on.

Known by many names, what I describe as content monitoring and protection (CMP) is also known as extrusion prevention, data leakage prevention or intellectual property management toolsets.  I think for most, the anchor concept of digital rights management (DRM) within the enterprise becomes the glue that makes CMP attractive and compelling; knowing what and where your data is and how its distribution needs to be controlled is critical.

The difficulty with this technology is that, just like any other feature, it needs a delivery mechanism.  Usually this means yet another appliance: one that's positioned either as close to the data as possible or right back at the perimeter, in order to profile and control data based upon policy before it leaves the "inside" and goes "outside."

I made the point previously that I see this capability becoming a feature in a greater amalgam of functionality; I see it becoming table stakes, included in application delivery controllers, FW/IDP systems and the inevitable smoosh of WAF/XML/database security gateways (which I think will also further combine with ADCs).

I see CMP becoming part of UTM suites.  Soon.

That being said, the deeper we go to inspect content in order to make decisions in context, the more demanding the requirements for the applications and "appliances" that perform this functionality become.  Making line-speed decisions on content, in context, is going to be a difficult problem to solve.

CMP vendors are making a push seeing this writing on the wall, but it’s sort of like IPS or FW or URL Filtering…it’s going to smoosh.

Websense acquired PortAuthority.  McAfee acquired Onigma.  Cisco will buy…?


Categories: DLP, IP/Data Leakage Tags: