Key Learnings from the Facebook Status Page

Eduardo Messuti

Founder and CTO

April 09, 2021

Yesterday, April 8th, 2021, at around 22:00 UTC, Facebook experienced a major outage in which Facebook, Messenger, WhatsApp Web, and Instagram were down for as long as 3 hours.

This was reported on Facebook's status page (https://developers.facebook.com/status), which was a good example of how to communicate an incident.

Facebook status page incident

So, what could have been improved?

We think a few aspects of how this incident was communicated could have been handled better.

We would suggest learning from the following:

1. Status page should not have been hosted within their own infrastructure

Their status page went down together with their services, as it seems to be hosted within the same infrastructure, or at least to share some resources with it.

We explain why this is a strategy to avoid in our blog post Learning from Facebook: Keep your Status Page Separately from your Infrastructure.

2. Lack of information within the incident report

There is no status update regarding when the incident was first identified, and no information about the investigation or the resolution either. There is only a short text marking it as resolved.

So there is no way to tell for sure when the incident started and ended. There is a start time and a last update time, but no end time or "duration" field.

Here is an example of an incident reported with appropriate start date and duration information.

Incident duration example
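
As a rough illustration of the point above, here is a minimal Python sketch of an incident record that always carries a start time, an optional end time, and a derived duration. The `Incident` class, its field names, and `format_duration` are hypothetical and do not reflect any particular status page provider's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    title: str
    started_at: datetime
    resolved_at: Optional[datetime] = None  # None while the incident is ongoing

    def duration(self, now: Optional[datetime] = None) -> timedelta:
        """Elapsed time from start to resolution, or to `now` if still open."""
        end = self.resolved_at or now or datetime.now(timezone.utc)
        return end - self.started_at


def format_duration(delta: timedelta) -> str:
    """Render a timedelta as hours and minutes for display on a status page."""
    total_minutes = int(delta.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m"
```

Storing both timestamps and deriving the duration from them means readers never have to guess when an incident ended from a "last update" time.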

3. Status page did not reflect the current state of the services

At some point, Facebook's status page was reporting "Platform is Healthy" while the platform was clearly still undergoing an outage, at least for some users, as reported on Hacker News.

HackerNews comments 1

This is why it's so important to automate incident reporting on your status page: connecting your monitoring services to your status page helps ensure it reflects the current state of your services.
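
To make the idea of connecting monitoring to the status page more concrete, here is a minimal Python sketch that derives the page-level status from per-service health check results instead of relying on a manual update. The `Status` values, the `overall_status` function, and the shape of the `checks` input are all assumptions made for illustration; a real setup would receive these results from whatever monitoring service you use.

```python
from enum import Enum
from typing import Dict


class Status(Enum):
    """Hypothetical page-level states; names are illustrative only."""
    OPERATIONAL = "operational"
    DEGRADED = "degraded"
    MAJOR_OUTAGE = "major_outage"


def overall_status(checks: Dict[str, bool]) -> Status:
    """Derive the page-level status from per-service health checks.

    `checks` maps a service name to True (healthy) or False (failing).
    """
    if not checks:
        # Without monitoring data we cannot claim the platform is healthy.
        raise ValueError("no monitoring data available")
    if all(checks.values()):
        return Status.OPERATIONAL
    if not any(checks.values()):
        return Status.MAJOR_OUTAGE
    return Status.DEGRADED
```

With this approach the page can only show a healthy state when every monitored service is actually passing its checks, which is the failure the Hacker News comments pointed out.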

4. No way to subscribe for notifications

There doesn't seem to be a way to subscribe to notifications about the status of Facebook's services. The Subscribe button takes you to their developer notification settings page, where there is no clear way to subscribe to service status updates.

Subscribing should be easy and clear, so any relevant user can get notified when there are outages. Furthermore, channels other than email, such as SMS or even Slack, are great options.
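
As a rough sketch of what multi-channel subscriptions could look like, the Python example below keeps a list of subscribers per channel and fans each status update out to all of them. The `Notifier` class and its methods are hypothetical, and the senders are plain callables standing in for real email, SMS, or Slack integrations.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

# A sender takes (recipient address, message) and delivers the message.
Sender = Callable[[str, str], None]


class Notifier:
    """Hypothetical subscription registry that fans out status updates."""

    def __init__(self) -> None:
        self._senders: Dict[str, Sender] = {}
        self._subscribers: DefaultDict[str, List[str]] = defaultdict(list)

    def register_channel(self, channel: str, sender: Sender) -> None:
        """Make a delivery channel (e.g. "email", "sms") available."""
        self._senders[channel] = sender

    def subscribe(self, channel: str, address: str) -> None:
        """Subscribe an address to a registered channel, ignoring duplicates."""
        if channel not in self._senders:
            raise ValueError(f"unknown channel: {channel}")
        if address not in self._subscribers[channel]:
            self._subscribers[channel].append(address)

    def notify(self, message: str) -> int:
        """Send the message to every subscriber; return the number sent."""
        sent = 0
        for channel, addresses in self._subscribers.items():
            for address in addresses:
                self._senders[channel](address, message)
                sent += 1
        return sent
```

Keeping the channels behind a single `subscribe` call is one way to keep the user-facing flow simple while still supporting more than email.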

Conclusion

Successful incident communication is key to keeping your customers' trust during downtime. More and more companies are opting for a status page as their primary tool for this, so keep these points in mind when choosing your status page provider, as well as when reporting outages.

If you are considering taking incident communication seriously, take a look at StatusPal. We cover all of the points mentioned above, and much more.

Eduardo is a software engineer and entrepreneur with a passion for building digital products. He has been working in the tech industry for over 10 years and has experience in a wide range of technologies and industries.