
Web Performance is a Journey, Not a Destination

Mehdi Daoudi




Sorting Through the Wreckage of Last Week’s Outages

Every now and then, when you step back and consider how reliant our society has become on online systems, it can really blow your mind. When you do it because those systems seem to be crashing all around us, it can be downright terrifying.

Such was the case last Wednesday, when United Airlines grounded its flights around the world due to a software glitch, the New York Stock Exchange suspended trading for four hours due to problems with its internal systems, and the Wall Street Journal homepage experienced significant problems, with localized outages around the country.

Now, whether or not you retreated to your bunker on Wednesday and started making plans to repopulate the Earth, Colbert did make an important point: our reliance on technology has made our lives and businesses far more efficient, but the more we rely on those systems, the more their fragility is transposed onto us.

To make matters worse, companies are usually hesitant to make the investment of both time and money that’s necessary to completely overhaul their software systems. This typically leads to technical debt – software updates being built on top of the old code (particularly for companies that have been using it for a long time, like airlines) rather than building new, advanced systems from scratch. The end result is a product that gets the job done most of the time, but has significant holes.

That’s the problem pointed out by Zeynep Tufekci, who says that while people were panicking last week about cyber-terrorism possibly playing a role in the outages, they were ignoring the much greater risk of relying on outdated and flawed software systems.

Of course, this vulnerability only increases the need for proactive monitoring in order to catch the problems that crop up in these systems before they cause widespread outages and slowness. By running continuous synthetic tests on your software and infrastructure, you can not only catch the major problems like those suffered by United and the NYSE, but also the smaller “micro-outages” that the WSJ experienced.
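
To make the idea of a synthetic test concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any particular monitoring product: the names `Sample`, `probe`, and `find_outages`, along with the timeout and slowness thresholds, are hypothetical. The sketch fetches a URL on a schedule, records whether each check passed, and then groups consecutive failures so that short "micro-outages" stand out alongside longer ones.

```python
import http.client
import time
import urllib.request
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # seconds since the epoch when the check started
    ok: bool          # True if the check passed
    elapsed: float    # time the check took, in seconds


def probe(url, timeout=5.0, slow_threshold=2.0):
    """Run one synthetic check: fetch the URL and time the response.

    The thresholds are illustrative assumptions; a response slower than
    slow_threshold is treated as a failure, since slowness hurts users too.
    """
    start = time.time()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            status_ok = 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        status_ok = False
    elapsed = time.time() - start
    return Sample(start, status_ok and elapsed <= slow_threshold, elapsed)


def find_outages(samples, min_failures=2):
    """Group consecutive failed samples into (start, end, count) windows."""
    outages, run = [], []
    for sample in samples:
        if not sample.ok:
            run.append(sample)
            continue
        if len(run) >= min_failures:
            outages.append((run[0].timestamp, run[-1].timestamp, len(run)))
        run = []
    # A run of failures may still be open when the samples end.
    if len(run) >= min_failures:
        outages.append((run[0].timestamp, run[-1].timestamp, len(run)))
    return outages
```

In practice, `probe` would be called at a fixed interval, ideally from several locations, and the collected samples passed to `find_outages`. Lowering `min_failures` to 1 surfaces even single failed checks, at the cost of more noise.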

In the meantime, 2015 continues to be the year of the outage. In addition to the three major ones last week, United and rival airlines like American have suffered other problems due to third-party issues; tech giants like Facebook and Apple (specifically, iTunes) have gone down for hours at a time; and Starbucks was forced to close thousands of locations around the country for a night in April due to a problem with its POS systems.

And short of a complete change of mindset on the part of the entire tech industry, it’s probably not going to get better anytime soon. The best we can do is try to stay on top of it as much as possible.

 


The post Sorting Through the Wreckage of Last Week’s Outages appeared first on Catchpoint's Blog.


More Stories By Mehdi Daoudi

Catchpoint radically transforms the way businesses manage, monitor, and test the performance of online applications. Truly understand and improve user experience with clear visibility into complex, distributed online systems.

Founded in 2008 by four DoubleClick / Google executives with a passion for speed, reliability and overall better online experiences, Catchpoint has now become the most innovative provider of web performance testing and monitoring solutions. We are a team with expertise in designing, building, operating, scaling and monitoring highly transactional Internet services used by thousands of companies and impacting the experience of millions of users. Catchpoint is funded by top-tier venture capital firm, Battery Ventures, which has invested in category leaders such as Akamai, Omniture (Adobe Systems), Optimizely, Tealium, BazaarVoice, Marketo and many more.