IT and the Empathy Deficit

This post is REALLY late, but I think the topic is still relevant, even if the trigger events have faded from our memory.

The Information Technology field is completely devoid of any ability to self-reflect. The whole damn thing, from companies to boards of directors, to developers, to system admins. How easily and quickly we wag our finger when someone else fails, yet when we ourselves fall down, there’s a “perfectly logical explanation”.

In case you were under a rock last Friday, many of Google’s services went down in an extended outage. I know that in our fast-paced world of hyper-connectivity, 25 minutes without email or documents feels like the end of the world. There’s the entrepreneur who finally got his chance to pitch in front of a venture capital firm, but couldn’t get to his presentation. The college kid who was trying to print his assignment before making a mad dash to beat the deadline. I get it, these services impact our lives in major ways.

But it’s alarming to see how the people who should understand most are the first to pile on. Yahoo just couldn’t help themselves and tweeted about the issue multiple times. They have since apologized, but honestly, at this point who cares?

But as the Twitterverse collectively freaked out, everyone in my office was calm as a cucumber. Sure, we couldn’t access email, but we knew Google would fix the problem and be back up as soon as possible. How did we know?

Because it’s what we would do.

News flash. Sometimes people make mistakes. Sometimes process fails. Sometimes gaps we didn’t know about are found. Sometimes test cases are missed. As a developer, tester or system admin, have you never made a mistake? Have you never let a bug slip into production? Have you never underestimated the impact of a change? If you’re perfect, then this message isn’t for you. But if you’re like the other 99.999% (see what I did there?) of people in our field, I’m sure we can agree on a few things.

That last one sounds crazy, but seriously. For someone on that Site Reliability Team, the outage wasn’t a laughing matter. It probably doesn’t feel good to know that the Internet is collectively dismayed and disgusted by a mistake you made, even though 50% of people wouldn’t understand the mistake if you explained it to them. Instead of ridicule, we should encourage open dialogue about how mistakes like this are made, so everyone, not just Google, can learn from them.

Outages are learning opportunities for everyone. Why did it happen? Was it a tools failure? I’m sure others would like to know if it’s a tool they use as well. Was it a process failure? Open dialogue about the failures of traditional IT Operations shops had a huge hand in forming the DevOps movement. Was it human error? Why did that person think the action they took was the right one? If it made sense to them, it will make sense to someone else, which means you might have a documentation or a training issue.

All of these problems are correctable, but only if we feel comfortable talking about our failures. The constant ridicule and cynicism our industry shows when someone fails threatens that necessary dialogue.

Google has shared some details about the outage, and I’m happy to say that sharing seems to be a growing trend among companies. But what about at a lower, more personal level?

I challenge those in our field:

Be fallible.
Be open with your failures.
Get to the heart of why the failure happened. Don’t just call it a derp moment and move on.
Recognize when someone is trying to do these things and encourage it.
