Thinning the Herd & Chlorinating the Malware Gene Pool…

December 28th, 2007 Leave a comment Go to comments

Alan Shimel pointed us to an interesting article written by Matt Hines in his post here regarding the "herd intelligence" approach toward security.  He followed it up here. 

All in all, I think both the original article that Andy Jaquith was quoted in as well as Alan’s interpretations shed an interesting light on a problem solving perspective.

I’ve got a couple of comments on Matt and Alan’s scribbles.

I like the notion of swarms/herds.  The picture to the right from Science News describes the
notion of "rapid response," wherein "mathematical modeling is
explaining how a school of fish can quickly change shape in reaction to
a predator."  If you’ve ever seen this in the wild or even in film,
it’s an incredible thing to see in action.

It should then come as no surprise that I think that trying to solve the "security problem" is more efficiently performed (assuming one preserves the current construct of detection and prevention mechanisms) by distributing both functions and coordinating activity as part of an intelligent "groupthink" even when executed locally.  This is exactly what I was getting at in my "useful predictions" post for 2008:

Grid and distributed utility computing models will start to creep into security
really interesting by-product of the "cloud compute" model is that as
data, storage, networking, processing, etc. get distributed, so shall
security.  In the grid model, one doesn’t care where the actions take
place so long as service levels are met and the experiential and
business requirements are delivered.  Security should be thought of in
exactly the same way. 

The notion that you can point to a
physical box and say it performs function ‘X’ is so last Tuesday.
Virtualization already tells us this.  So, imagine if your security
processing isn’t performed by a monolithic appliance but instead is
contributed to in a self-organizing fashion wherein the entire
ecosystem (network, hosts, platforms, etc.) all contribute in the
identification of threats and vulnerabilities as well as function to
contain, quarantine and remediate policy exceptions.

of sounds like that "self-defending network" schpiel, but not focused
on the network and with common telemetry and distributed processing of
the problem.
Check out Red Lambda’s cGrid technology for an interesting view of this model.

This basically means that we should distribute the sampling, detection and prevention functions across the entire networked ecosystem, not just to dedicated security appliances; each of the end nodes should communicate using a standard signaling and telemetry protocol so that common threat, vulnerability and effective disposition can be communicated up and downstream to one another and one or more management facilities.

This is what Andy was referring to when he said:

As part of the effort, security vendors may also need to begin sharing more of that information with their rivals to create a larger network effect for thwarting malware on a global basis, according to the expert.

may be hard to convince rival vendors to work together because of the
perception that it could lessen differentiation between their
respective products and services, but if the process clearly aids on
the process of quelling the rising tide of new malware strains, the
software makers may have little choice other than to partner, he said.

Secondly, Andy suggested that basically every end-node would effectively become its own honeypot:

turning every endpoint into a malware collector, the herd network
effectively turns into a giant honeypot that can see more than existing
monitoring networks," said Jaquith. "Scale enables the herd to counter
malware authors’ strategy of spraying huge volumes of unique malware
samples with, in essence, an Internet-sized sensor network."

I couldn’t agree more!  This is the sort of thing that I was getting at back in August when I was chatting with Lance Spitzner regarding using VM’s for honeypots on distributed end nodes:

I clarified that what I meant was actually integrating a
HoneyPot running in a VM on a production host as part of a standardized
deployment model for virtualized environments.  I suggested that this
would integrate into the data collection and analysis models the same
was as a "regular" physical HoneyPot machine, but could utilize some of
the capabilities built into the VMM/HV’s vSwitch to actually make the
virtualization of a single HoneyPot across an entire collection of VM’s
on a single physical host.

Thirdly, the notion of information sharing across customers has been implemented cross-sectionally in industry verticals with the advent of the ISAC’s such as the Financial Services Information Sharing and Analysis Center which seeks to inform and ultimately leverage distributed information gathering and sharing to protect it’s subscribing members.  Generally-available services like Symantec’s DeepSight have also tried to accomplish similar goals.

Unfortunately, these offerings generally lack the capacity to garner ubiquitous data gathering and real-time enforcement capabilities.

As Matt pointed out in his article, gaining actionable intelligence on the monstrous amount of telemetric data from participating end nodes means that there is a need to really prune for false positives.  This is the trade-off between simply collecting data and actually applying intelligence at the end-node and effecting disposition. 

This requires technology that we’re starting to see emerge with a small enough footprint when paired with the compute power we have in endpoints today. 

Finally, as the "network" (which means the infrastructure as well as the "extrastructure" delivered by services in the cloud) gains more intelligence and information-centric granularity, it will pick up some of the slack — at least from the perspective of sloughing off the low-hanging fruit by using similar concepts.

I am hopeful that as we gain more information-centric footholds, we shouldn’t actually be worried about responding to every threat but rather only those that might impact the most important assets we seek to protect. 

Ultimately the end-node is really irrelevant from a protection perspective as it should really be little more than a presentation facility; the information is what matters.  As we continue to make progress toward more resilient operating systems leveraging encryption and mutual authentication within communities of interest/trust, we’ll start to become more resilient and information assured.

The sharing of telemetry to allow these detective and preventative/protective capabilities to self-organize and perform intelligent offensive/evasive actions will evolve naturally as part of this process.



  1. December 30th, 2007 at 19:08 | #1

    Thinning the Herd & Chlorinating the Malware Gene …

    Bookmarked your post over at Blog!

  2. Zach
    January 2nd, 2008 at 19:15 | #2

    Awesome post, Chris! TBH, I'm a little perplexed as to why this is suddenly garnering so much attention as it seems like something that was talked about ages ago. I remember discussing topics like distributed intrusion analysis, anomaly detection, (semi-) intelligent agents (for security) and the like back in my early college years…granted I'm not *that* old.
    Is it that it's all *just now* coming to fruition or have I missed something?

  3. January 3rd, 2008 at 17:14 | #3

    What's old is new again. That's the cyclic innovation curves in effect.
    Quite frankly, the technology was not quite baked enough and combined with being ahead of the adoption curve, the productized versions of these products never made it across the chasm.
    NBAD is another good example of how technology has come to roost as a solution to a specific problem (such as DDoS) but hasn't quite escaped low-earth orbit in the product demand curve and "usefulness" categories to become pervasive as a named core technology across many product lines.
    This is oversimplifying the case, I know, but there are many applications for this technology and you'll see them become more important here over time (see my 2008 predictions re: interconnected sensornets)

  1. April 29th, 2009 at 07:50 | #1