Web Performance is a Journey, Not a Destination

Mehdi Daoudi

Managing an Outage: How Catchpoint Eats its Own Dog Food

We talk a lot about analyzing and responding to outages. The truth is, much of the advice we give comes from watching how our clients deal with performance issues. We recently got a chance to manage an issue of our own, so we wanted to document what we saw and what we did. We think of it as a chance to eat our own dog food and to show off what we learned (and even what we’ve got left to learn) from the masters of customer experience management.

Like our clients, Catchpoint depends on third-party services to deliver the customer experience we aim for. We have strong partners and partnerships, but we all know that systems go down sometimes. We rely on Zendesk to support our help portal, and on the morning of April 22nd our system alerted us to an issue that would clearly affect our customers:

[Chart: Catchpoint synthetic test results for support.catchpoint.com]

As this chart shows, our own synthetic testing against support.catchpoint.com detected connection, timeout, and 404 errors. At least some of our users were likely to find the help portal slow or unreachable.
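To make those three error classes concrete, here is a minimal sketch of a single synthetic check written with Python's standard library. It is not Catchpoint's implementation; the function names `probe` and `classify_outcome`, the labels they return, and the five-second timeout are all assumptions chosen for illustration.

```python
import socket
import urllib.error
import urllib.request


def classify_outcome(exc):
    """Map the exception from a probe (or None on success) to a coarse label.

    Hypothetical helper: the label names are illustrative, not a product schema.
    """
    if exc is None:
        return "ok"
    # HTTPError subclasses URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"http_{exc.code}"  # e.g. "http_404"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, urllib.error.URLError):
        # urlopen wraps connect-phase failures; the underlying cause is in .reason.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "connection_error"
    return "connection_error"  # any other OSError, such as a reset connection


def probe(url, timeout=5.0):
    """Fetch url once and return a label such as "ok", "timeout", or "http_404"."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read(1)  # make sure at least the start of the body arrives
        return "ok"
    except OSError as exc:
        return classify_outcome(exc)
```

Calling `probe("https://support.catchpoint.com/")` on a schedule, ideally from several locations, and counting the labels over time would give a rough version of the kind of signal described above. A real monitoring service measures far more than this sketch does.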

Almost immediately, Zendesk confirmed what we saw and that we weren’t the only client affected. This is a good example of a proactive response to an issue. Virtually every Zendesk client knew right away that the company was experiencing a technical issue, that it was actively working on a fix, and that it had traced the problem to a specific data center. We wrote recently about app rage, and how little it helps to badger a trusted partner to fix an issue that you know they’re dealing with. Zendesk communicated openly about the problem. We didn’t need to become another problem.

[Image: Zendesk incident notice]

Instead, we followed their lead and made sure that our own customers knew we were aware of the problem. Within ten minutes of detecting the outage, we posted a clear update at status.catchpoint.com with what we knew about the issue and offered an alternative way to reach our support team. Our customers didn’t need to call us or search for information because we found and addressed the issue so early.

Our team acted fast to get ahead of the situation, but that’s not really why our response was so quick. We have a playbook for service disruptions and our team followed it. Most of our clients have far more complex contingencies for detecting, diagnosing, and responding to outages, but our commitment to plan ahead and prepare for failures is based on best practices they developed.

The entire incident lasted only a few minutes. Zendesk communicated proactively, Catchpoint kept customers informed, and we continued to manage support requests without interruption. Solid preparation by our partner and our team, together with early detection through synthetic monitoring, minimized the impact of a service outage. Ultimately our solution did its job and protected our customer experience, so at least on this day our dog food tasted pretty good.

The post Managing an Outage: How Catchpoint Eats its Own Dog Food appeared first on Catchpoint's Blog.

More Stories By Mehdi Daoudi

Catchpoint radically transforms the way businesses manage, monitor, and test the performance of online applications. Truly understand and improve user experience with clear visibility into complex, distributed online systems.

Founded in 2008 by four DoubleClick / Google executives with a passion for speed, reliability and overall better online experiences, Catchpoint has now become the most innovative provider of web performance testing and monitoring solutions. We are a team with expertise in designing, building, operating, scaling and monitoring highly transactional Internet services used by thousands of companies and impacting the experience of millions of users. Catchpoint is funded by the top-tier venture capital firm Battery Ventures, which has invested in category leaders such as Akamai, Omniture (Adobe Systems), Optimizely, Tealium, BazaarVoice, Marketo and many more.