Reflections on Recent Failures in the Fragile Internet Ecosystem Due to Service Monoculture…

September 2nd, 2007

Our digital lives and the transactions that enable them are based upon crumbling service delivery foundations and we’re being left without a leg to stand on…

I’ve blogged about this subject before, and it’s all a matter of perspective, but the latest high-profile Internet-based service to fail, with a crippling effect on the users who depend upon it, is PayPal.

Due to what looks to be a recent roll-out of code gone bad, subscription payment processing went belly-up. 

On September 1st, PayPal advised those affected that the issue should be fixed "…by September 5 or 6, and that all outstanding subscription payments would be collected." That’s at least four more days on top of the downtime already sustained.

This has been a tough few weeks for parent company eBay, as one of its other famous children, Skype, suffered its own highly visible flame-out, which the company blamed on infrastructure overwhelmed in the wake of a Microsoft Patch Tuesday download. The outage left several million users who were "dependent" upon Skype without a way to communicate with others.

We have reached the point where the services we take for granted as always being up are showing their vulnerable side, for lots of different reasons. Some of these services are free, which introduces a confusing debate about what service levels and availability one can expect when one doesn’t pay for the service.

The failures are increasing in both frequency and duration. Scarier still, I now count five service failures in the last four months that have affected me directly. Not all of them are Internet-based, but they indicate a reliance on networked infrastructure that is obviously fragile:

1) United Airlines Flight Operations Computer System Failure
2) San Francisco Power Grid Failure
3) LAX Passenger Screening System Computer System Failure
4) Skype Down for Days, and finally…
5) PayPal Subscription Processing Down

That’s quite a few, isn’t it?  Did you realize these were all during the last few months?

Most of these failures caused me inconvenience at worst: some missed flights, an inability to blog, failed subscription processing for web services, an inability to communicate with folks… None of them was life-threatening, and none of them dramatically impacted my ability to earn a wage. But that’s me and my "luck." Other people have not been so lucky.

Some have reasonably argued that these services do not represent "critical" infrastructure at the level of national defense or health and safety, and I’d have to agree. But they could, and if our dependence on these services increases, they will.

As these services evolve and enable the economic plumbing of an entire generation of folks who expect ever-presence and conduct the bulk of their lives online, this sort of thing will turn from an inconvenience to a disaster. 

Even more interesting is that a number of these services are now owned and delivered by what I call service monocultures: eBay provides not only auction services but PayPal and Skype, too. Google gives you mail, apps, search, video, ads and, soon, wireless and payment.

While the investment that this M&A and consolidation activity generates means bigger and better services, it also increases the likelihood of failures cascading across an ever-expanding web of connected services, especially when they are all operated by a single entity.

There’s a lot of run-and-gun architecture servicing these utilities in the software-driven world that isn’t as resilient as it ought to be up and down the stack. We haven’t even scratched the surface on this one, folks…it’s going to get nasty. Web 2.0 is just the beginning.

I think we’d have a civil war if YouTube, Facebook, Orkut or MySpace went down.

What would people do without Google if it were to disappear for 2-3 days?


Knock on (virtual) wood.


Categories: Software as a Service (SaaS)
  1. hapkido
    September 3rd, 2007 at 05:59 | #1

    As much as we complain or get inconvenienced, there is always an alternative or no alternative. You get what you pay for or don't pay for.

  2. colin
    September 3rd, 2007 at 18:01 | #2

    I recall, several years ago when I actually read NANOG, a Level3 outage that resulted in such wailing and gnashing of teeth that I had to comment that we are relying on an infrastructure that has not so much evolved as been the result of accretion. The Internet has been promoted as, and thus has become, a business tool, and that was never its charter – these infrastructure failures will continue (albeit to a lesser degree with redundancy of redundant redundant systems), but to *rely* on the Internet for even a percentage of your business is, to my mind, an unacceptable risk.
    It wasn't a troll, but I was thankful that I had my asbestos underwear on. I felt like I had blasphemed. Cries of "HERETIC!" echoed throughout NANOG for days.
    And I do agree with the above comment that as far as free services go – you get exactly what you pay for.

  3. July 27th, 2008 at 13:12 | #3

    Very interesting post. I really enjoyed reading it.
