Transparency is key ... Azure South Central US Outage

Hello to all of my readers. I wanted to reach out because I, like many of you, was heavily impacted as a customer when Azure services went down in the South Central US region (San Antonio, TX). When I got to work that morning, my team was definitely in fire-fighting mode, as many of our services were offline or impacted during the outage.

While planning for business continuity is important, reacting with the best information possible is the first step in the response. After I logged into the system, I checked Twitter, tech blogs, and news sites to see what was being published about the outage, and what I saw was horrible. Much like the AWS Eastern US Storage outage of February 28th, 2017, many companies were knocked offline by this outage, including systems at Microsoft, both internally and externally focused.

One of the keys for any technology team has to be transparency with its customers. As a former Director of IT and current member of an SRE team, I know that balancing transparency against putting out so much information that it scares your customers is a tightrope we have to walk. Many folks feel too much information will scare users and customers away. On the other side of the spectrum, not enough information makes users and customers leave the service because they feel the service "is a black box" and they get no information about it.

After reading the Post Mortem from the Azure DevOps Team (formerly Visual Studio Team Services) and the preliminary Post Mortem from Azure, I think that transparency has been reached. I have always been proud to be part of the VSTS/Azure DevOps teams and of our transparency with internal and external customers. At the same time, I have wanted more transparency from other teams at Microsoft, and now I am seeing that from Azure.

Give both of these post mortems a quick read and decide whether they are transparent enough or too transparent for your tastes. Figure out with your teams how much transparency to give your customers, and plan for that in your communications, including post mortems. Remember that you want a certain level of transparency from your providers, so think about what your customers want from you.

 

Lightning Image - Copyright 2007, Mike Switzerland
