Incomplete Thought: In-Line Security Devices & the Fallacies Of Block Mode

June 28th, 2013 beaker

The results of a long-running series of extremely scientific studies have produced a Metric Crapload™ of anecdata.

Namely, hundreds of detailed discussions (read: lots of booze and whining) over the last 5 years have resulted in the following:

Most in-line security appliances (excluding firewalls) with the ability to actively dispose of traffic, services such as IPS, WAF, and anti-malware, are deployed in “monitor” or “learning” mode and are rarely, if ever, enabled with automated blocking. In essence, they are deployed as detective rather than preventative security services.

I have many reasons compiled for this.

I am interested in hearing whether you agree/disagree and your reasons for such.

/Hoff

Categories: Active Defense, Automation, De-Perimeterization, General Rants & Raves, Information Centricity, Information Security, Information Survivability, Intrusion Detection, Intrusion Prevention Tags: Firewall, Intrusion prevention system, Malware, Security, Security appliance

Comments (16)

Jack

June 28th, 2013 at 13:11 | #1

Yep. Add my pile of anecdata to yours. I have seen many devices in many environments, including IPS, WAFs, and “next-gen” or application firewalls, in monitor-only mode. Often they were installed in blocking mode and someone got tired of the effort of maintaining and updating exceptions and tuning; other times they were put in monitor mode “temporarily” many years ago.
Dave Walker

June 28th, 2013 at 13:26 | #2

I have similar anecdata (and I like that term 🙂 ).

I have 2 likely explanations, heard one way or another from various folk involved:

1. Define “normal”. A network needs to operate at a sufficiently high and sufficiently mixed traffic rate, for sufficient time, for a heuristics-based system to baseline itself. During this time, “Bad Stuff Can’t Happen”, lest the system misinterpret it as normal. Extra point: depending on the learning system used, you can’t tell whether what’s being learned is the right kind of thing. Recall the famous case from the military back in the ’80s, of the neural network-based image recognition and target identification system which became expert at distinguishing different kinds of tree from each other, while completely ignoring the tanks hidden among them.

2. Lack of contextual understanding, by such a box. To a WAF or IPS installed on news.bbc.co.uk, for instance, hugely popular stories (and groups of stories) such as the Royal Wedding, the Diamond Jubilee and the Olympics would be seen as sustained DDoS attacks.

Ultimately, a combination of low risk appetite for an effective self-DoS, combined with lack of transparency of what the box is learning and defining as normal, stops people turning preventative mode on.
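
Both points can be illustrated with a small sketch. It assumes a deliberately naive detector that learns the mean and standard deviation of requests per minute and flags anything more than three standard deviations above the mean. The function names and all traffic figures are hypothetical, and no real product is claimed to work this way.

```python
from statistics import mean, stdev


def learn_baseline(samples):
    """Return (mean, stdev) of the requests-per-minute seen in learning mode."""
    return mean(samples), stdev(samples)


def is_anomalous(rate, baseline, sigmas=3.0):
    """Flag a rate more than `sigmas` standard deviations above the learned mean."""
    mu, sd = baseline
    return rate > mu + sigmas * sd


# Hypothetical learning window: an ordinary, quiet week.
quiet_week = [900, 1000, 1100, 950, 1050, 1000, 980]
clean_baseline = learn_baseline(quiet_week)

# Point 2: a legitimate flash crowd looks exactly like an attack.
flash_crowd_flagged = is_anomalous(25_000, clean_baseline)

# Point 1: if an attack occurs during learning, it becomes part of "normal".
poisoned_baseline = learn_baseline(quiet_week + [30_000, 28_000])
later_attack_flagged = is_anomalous(25_000, poisoned_baseline)

print(flash_crowd_flagged, later_attack_flagged)
```

In block mode the first case is a self-inflicted outage and the second is a missed attack, which is the risk trade-off described above.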
Robert David Graham

June 28th, 2013 at 13:38 | #3

Of course.

BlackICE Guard was the first inline appliance (outside firewalls). We advertised it this way. I think we coined the term “learning mode”. It was designed around this concept. It was easy to configure with a range of policies, from blocking “nothing”, “nearly nothing”, or “a lot”. Our first big sale was a major financial network that put it in place in monitor mode, with the intent of flipping a switch when worms hit to block the worms. It actually worked this way, blocking CodeRed. It was kinda cool.

I don’t know about other products, but I think most Proventia deployments (Proventia being the IBM product that BlackICE became) are configured in “nearly nothing” mode, which is to say, it errs on the side of network performance instead of on the side of security, thus only blocking those things that it is absolutely, positively certain of, but not things it is merely certain of.
Andre Gironda

June 28th, 2013 at 13:41 | #4

Additionally you hear that companies who pull network monitoring and detective capabilities into their SIEM or log management systems don’t have the expert humans necessary to read and understand the logs, let alone act on them as if they are or are not a security incident. Maybe it’s because their IPS/WAF platforms have not turned on blocking mode…

Really the answer is simple: design and implement detection and protection in an Enterprise product or ISV product way before it ships. First problem: we have a bunch of legacy code. First solution: Enterprise app developers need to learn how to refactor older languages/frameworks as well as, or better than, they can with new products. First solution point person: the app owner.

The second problem is that we don’t have enough standardized, Enterprise-ready secure coding frameworks that provide everything a framework starter would want. Second solution: implement it by first using an evolved STRIDE or Trike model iteratively throughout your app lifecycle management in order to deliver results in appsec risk management. Second solution point person: the appsec lead.

Security appliances have never been the one-day answer to security monitoring or information security management. You are allowed to cite the heydays of firewalls, or even IPS (but not WAF). However, my instant retort is that no firewall was ever anything except a simple boundary condition placed in between 2 or more networks. Firewall rules are imperfect. Firewall code is imperfect. Finding conditions around the boundar(ies) was easy from the start… there have always been fragrouter and derivatives.

The primary issue throughout is data validation. Only 3 percent of the world’s data can even be accurately described by informal/ad-hoc specifications. How are we supposed to validate, encode, or re-encode this data when most of it can’t even be specified? Perhaps a solution here (some claim that 50 percent or more of Enterprise network traffic is XML these days) is structured data, but then we go back to our 1990s issue: vulnerable, critical-path parsers. In 2013 and going-on, we really have both I/O and parser issues to deal with.

What better time to best-practicize all appdev and IT/Ops (some say DevOps or DevOpsSec) towards the goal of fixing our I/O data flows and our structured-data-parser technologies? Seems legit.
Mike

June 28th, 2013 at 17:59 | #5

Can’t speak for others but in our environment, ours are in block mode. We set them to learn for a few months to profile/tune and then set to block with the various sigs we think are important.
Allen Baranov

June 28th, 2013 at 21:14 | #6

Absolutely…

The Security guy who is four layers below the CIO in the IT Department finally gets a device that will clear up an audit finding. He gets permission to implement it and puts it in, in learning mode.

He wants to switch it to block mode but all the layers above him to CIO level have annual bonuses based on uptime delivery so the question is asked.. “can you promise that this won’t block genuine traffic, ever?” and the answer is “no.” The picture in the CIO’s head is his annual bonus floating away and the project is delayed.

If the security guy manages to get permission to deploy the device in block mode then the first time that there is a false positive and real traffic is blocked – the device is turned back into learning mode until the “investigation” is completed and the cause is established and the device is guaranteed never to block legitimate traffic again.

In devices like Firewalls where it is impossible to run them as non-blocking devices, rules are created to make them almost useless. A project is running late, the weird port configuration is not working so a broader rule is created than what is needed.. it works and is never closed down again.

To be fair, what works quite well is an idea of kaizen where certain traffic is blocked and the amount blocked is slowly increased over time. I have done this with IPS traffic – block all “critical” alerts, turn on “high” level alert blocking one at a time. Etc.
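
The incremental approach described above can be sketched roughly as follows. This is a minimal illustration, not any vendor's policy format: the `Signature` type, the signature IDs, and the two-stage rollout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    sig_id: str
    severity: str  # "critical", "high", "medium" or "low"


def action_for(sig, blocked_severities, blocked_ids):
    """Return "block" if the signature is in the enforced set, else "alert"."""
    if sig.severity in blocked_severities or sig.sig_id in blocked_ids:
        return "block"
    return "alert"


def policy(signatures, blocked_severities, blocked_ids):
    """Map every signature ID to its current action."""
    return {s.sig_id: action_for(s, blocked_severities, blocked_ids) for s in signatures}


# Hypothetical signature set.
signatures = [
    Signature("SIG-1", "critical"),
    Signature("SIG-2", "high"),
    Signature("SIG-3", "high"),
    Signature("SIG-4", "medium"),
]

# Stage 1: block every "critical" signature; everything else only alerts.
stage1 = policy(signatures, {"critical"}, set())

# Stage 2: promote one "high" signature after reviewing its alert history.
stage2 = policy(signatures, {"critical"}, {"SIG-2"})

print(stage1)
print(stage2)
```

Each promotion is a small, reversible change, so a false positive only rolls back one signature rather than sending the whole device back to learning mode.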

In some cases monitoring is elected even when blocking is available. For example – some companies I visited when I was putting a DLP solution together chose not to block information leakage so that malicious users don’t find more cunning ways to leak information. They would rather see and track who was releasing information and use HR ways to stop the offenders (“please clear your desk and hand in your access card”) than technical ways.
Chris Swan

June 29th, 2013 at 02:08 | #7

Rather than another ‘me too’ reply I’ll try to add a little colour to what I’ve seen going on here…

These devices are normally deployed at network choke points where there are many things behind them. Whilst it should be fairly easy to figure out an appropriate rule set for one thing or a handful of things it gets ‘too hard’ to figure out for an entire branch/country/enterprise or whatever. Once determinism is abandoned in favour of heuristics trust starts to dissolve.

I think this is fixable with a more application centric approach to containment and rule management, whereby our put upon network/security admin guy can stand fast and say something like ‘the WAF rules for this recruiting site *will not* accidentally take out our client trading portal’.
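
One way to picture that kind of containment is a rule table keyed by application, so a rule can only ever fire for the application it was written for. This is a sketch under assumed names: the hostnames, the single rule, and the request shape are hypothetical and do not reflect any particular WAF.

```python
def contains_sql_keyword(request):
    """Crude, deliberately false-positive-prone rule used for illustration."""
    return "union select" in request["body"].lower()


# Rules are scoped per application; the trading portal enforces nothing yet.
RULES_BY_APP = {
    "recruiting.example.com": [contains_sql_keyword],
    "trading.example.com": [],
}


def decide(request):
    """Apply only the rules that belong to the request's own application."""
    for rule in RULES_BY_APP.get(request["host"], []):
        if rule(request):
            return "block"
    return "allow"


body = "Quarterly report: UNION SELECT committee minutes"
recruiting = decide({"host": "recruiting.example.com", "body": body})
trading = decide({"host": "trading.example.com", "body": body})

print(recruiting, trading)
```

The same request body is blocked on the recruiting site and allowed on the trading portal, because the lookup never consults another application's rules.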

PS on the subject of IDS/IPS talk to CC about how the entire financial services industry deployed in anticipation of regulation that never actually came.
Andy

June 29th, 2013 at 18:02 | #8

@Robert David Graham
I actually put BlackICE into the .com environment I was working in. We did block stuff but one of our critical apps had issues, and our CTO decided the project had impacted customer user-experience too much. It was subsequently relegated to monitor only mode. I was pretty impressed at the time, but of course, anything that upset our customers even slightly got a rough ride…..

I have gone on to encourage and help customers to design, integrate and deploy both IPS and SIEMs over the years. Sadly, very few enterprise customers ever realise the value of either class of security control, typically due to lack of skills and/or discipline over continuing a regime of incident response/analysis and tuning. “Forth Bridge” Syndrome typically kicks in about 18 months in, when either staff churn or budget cuts inhibit the resources required to continue on...

Ultimately, we need to increase ease of use, improve reliability and limit the opportunity for false-positives/misses at the detection/enforcement end, whilst simplifying the information/response options at the control plane (SIEMs) end. Then the limited security resources employed in many organisations can handle the workload as upper-echelon interest wanes. Alternatively, smart organisations with the ability to retain and progress good staff can run DevOps(Sec) teams and get to the core of the problem – insecure application coding etc.
Donny

July 1st, 2013 at 06:47 | #9

I have always viewed IPS/IDS/etc. as having a core architectural problem. Today, most of these solutions are implemented in a choke point (as Chris identified) and attempt to inspect multiple unique flows with a plethora of requirements.

For IDS/IPS to work, they need to become distributed. I would argue, down to the host. At a host level, the flow makes sense as each has a unique traffic pattern. Tuning can be done in a finite manner as the workload and expected interactions can be easily mapped.

The key would be to centralize monitoring and reporting while distributing enforcement. This model would allow for correlation when multiple entities report traffic deltas. Selective enforcement would be possible as deployment would be per node/application/cluster.
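
A rough sketch of that split might look like the following. The `Collector` and `HostAgent` classes, the host names, and the addresses are hypothetical, and each host's "expected peers" set stands in for whatever per-host tuning would be done in practice.

```python
from collections import defaultdict


class Collector:
    """Central monitoring: receives reports and correlates them, never blocks."""

    def __init__(self, min_hosts=2):
        self.min_hosts = min_hosts
        self.reports = defaultdict(set)  # source address -> hosts that reported it

    def report(self, host, source):
        self.reports[source].add(host)

    def correlated_sources(self):
        """Sources reported as unexpected by at least `min_hosts` hosts."""
        return {s for s, hosts in self.reports.items() if len(hosts) >= self.min_hosts}


class HostAgent:
    """Per-host enforcement, tuned to that host's own expected traffic."""

    def __init__(self, name, expected_peers, collector, enforce):
        self.name = name
        self.expected_peers = set(expected_peers)
        self.collector = collector
        self.enforce = enforce  # selective enforcement, decided per node

    def handle(self, source):
        if source in self.expected_peers:
            return "allow"
        self.collector.report(self.name, source)
        return "block" if self.enforce else "alert"


collector = Collector()
web = HostAgent("web-1", {"10.0.0.5"}, collector, enforce=True)
db = HostAgent("db-1", {"10.0.0.7"}, collector, enforce=False)

web_result = web.handle("203.0.113.9")
db_result = db.handle("203.0.113.9")
known_result = web.handle("10.0.0.5")

print(web_result, db_result, known_result, collector.correlated_sources())
```

Here the web host blocks, the database host only alerts, and the collector still sees that two nodes reported the same unexpected source.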

I believe centralized traffic management is a farce and always has been. The granular control required to maintain service availability and mitigate threats requires finite implementation.

On the original question, most (~90) of the implementations I have worked on are monitoring with manual intervention.
Ted Doty

July 1st, 2013 at 15:12 | #10

The only blocking IPS deployments I remember in widespread use used the “Nifty Fifty” signatures – the ones that essentially never False Positived. 3 or 4 years ago it was the Nifty Two Hundred, but all of these were for more or less trivial attacks. All the other signatures were in alert mode, for the reasons that Allen Baranov describes very well indeed.

This is great if you spend sleepless hours worrying about smurf attacks, but for everyone else it really reduced the perceived value of IPS.
Erik Freeland

July 1st, 2013 at 17:14 | #11

100% correct in my viewpoint. The ONLY time that I see value in blocking would be an IPS in network garbage man mode.
Kevin

July 1st, 2013 at 21:26 | #12

We initially deployed our firewall in permissive mode, then tightened it step by step. A recent external security assessment got the last groups pulled under it. It blocks an absurd percentage of our total incoming traffic, all bad.

Our TippingPoint is an IPS. Hence we have our TippingPoint behind the FW in block mode, with auto quarantine of anything that shows it is connecting to a C&C network, as well as using it to block peer-to-peer and other undesirable traffic. We have had about 3-5 false positives in 8 years, all fairly minor. It requires tuning, but mostly to avoid CPU/RAM saturation. TippingPoint will tell you how to find and disable filters that are using resources but not actually finding bad traffic.

Our Ironport WSA is in block mode, fed by a WCCP connection to the core. It is incredibly effective at stopping people from getting to places that will require IT re-imaging their computer. We found we lost about 1 machine per 6 minutes when we had to disable it. We find a false positive about once per week and we promptly get it accessible.

Our Snort-based IDS is an IDS, so it gets fed by a tap. It’s also how we find most of the compromised systems, about 1 per day. It requires that someone with some skill looks at it every day and that someone with more skill is available to look at the more interesting cases. Then we use TP to block the compromised system from internet access until it is re-imaged.

What is amazing to me is how many large organizations claim they don’t have a security problem and hence don’t have to put any effort into IDS because “The FBI only contacts us once or twice a year about a compromised system.” If the FBI is regularly contacting you on a yearly basis because of stuff your hosts are doing you probably have many hundreds or thousands of compromised systems on your network, they just haven’t yet done anything that gets the Feds excited. “None so blind as those that will not see.”
curmudgeon

July 3rd, 2013 at 16:37 | #13

Thank the gods for the drive to wrap everything in SSL and use cert pinning. It quickly removes 90% of false positives. *grin* Well, at least for IPS deployed in front of a user LAN.
LonerVamp

July 5th, 2013 at 07:30 | #14

Oh man, I can’t wait for your post on this, since it surely will be huge. I like the answers up above, and I’ll try to be succinct and not echo too many of them. Definitely like Allen’s response.

– Network changes will trump security device changes. There are times when changes (for better or worse) to a network render

– Auditors (PCI compliance, etc) don’t have the manpower or expertise to also properly audit any of these devices that have been changed from their (always lax) defaults. Most of the time even simple configurations are glossed over. “You have firewall rules? Default deny? Pass.” The check-and-balance system for security is weak, if existent at all.

– Anything that interrupts communication, systems, or people doing their jobs is considered less than ideal. Most admins running security tools will get in immediate and deep trouble if their devices interrupt the business; and they will get in less trouble if their security devices are just not as tight as they could be.

– Staff. I would say “most” organizations do not have discrete staff dedicated to tuning security devices. Instead, they likely share duties with other things, including adminning things. Doing pretty much any other task will trump security tuning just because of what they will get evaluated for, what will get them in the most trouble, etc. Pretty much exposing that measuring security efforts is also weak.

– Statistics. Like above, I think “most” organizations are not like the biggest ones that have security teams. I think most are SMBs, which drives statistics down. Same thing with packaged security tools in other devices, like an IPS in a firewall. I imagine “most” don’t use these features heavily, so they get lumped into the group that don’t have them utilized.

– With something like a WAF, multiple teams need deep understanding of applications and how to combine them with the WAF to provide decent security. Getting this cross-team work done pretty much never happens. Let alone having the knowledge in other teams to even begin it. I’ve too often seen blank stares when developers are asked to explain their own apps, let alone how to properly integrate with security or secure them.

– Many security devices are treated as stand-alone projects with a discrete start and finish. You tune it to where it should be, then it runs on its own! This is the secret belief of pretty much all management, otherwise they eat the cost of on-going security. Even if this works out, it only takes a year of changes in the network or updates to the product to weaken the posture. Even something as benign as upgrading from Windows 2003 to Windows 2008 can render log mgmt rules completely obsolete.

– Desktop staff (much like anyone else) are evaluated on their customer service and getting things done. Turning off a host firewall or making a huge (bad) rule allowance will be a job well done!

– In similar fashion, everyone just wants to get their jobs done. Even if security is proactive and blocks some things, users will find ways around what they want to get around (just like water flowing downhill). This means they may expose issues, which people then harp on, and you wind up with the *perception* that tuning is lax. Really, that’s just the whole point of ongoing security: to keep tuning it tighter as things change. This perception is sometimes leveraged for whoever’s agenda…
Clerkendweller

July 7th, 2013 at 13:30 | #15

Lack of user context, which is available within applications themselves. Build it to the application layer like OWASP AppSensor http://www.crosstalkonline.org/storage/issue-archives/2011/201109/201109-watson.pdf
MarketingGuy

July 29th, 2013 at 23:59 | #16

Because marketing guy sold the WAF to his chum who was just promoted? Implementation by sales douche and security guy sent home.