Made a Mistake? Don’t Worry – Review, Talk and Learn from It.
When I was a bit younger and faster, and didn’t have kids, I was part of an adventure racing team that raced together for a few years, building up to an 800km, 5-10 day race in Australia.
One thing we did after most races and training missions was a review of how things went. Most top athletes and teams do it, so why shouldn’t we?
We sat down and discussed the good, the bad, and the ugly. There were some interesting discussions; we were not afraid to say that we, as individuals or as a team, had made mistakes or forgotten to do something. We knew we were doing the review so we could learn from it and become better. It worked, and I am sure it made us better as a team.
Throughout my career in IT I have been involved with or watched many projects; recently these projects have used an Agile methodology. The Agile (Scrum) projects that I have seen accelerate their velocity and become successful are the ones that have taken Review and Retrospective (R n R) seriously after each sprint.
R n R is where a team reviews the last sprint and talks about what went well and what didn’t go so well. The output is generally a list of what the team should continue doing, start doing or stop doing. R n R helps a sprint team learn from what happened and become a better team.
Unfortunately, when working with service desks I seldom see a review after an incident. I am a SAS Administrator for OptimalBI’s Managed SAS Admin service and have been part of SAS support teams in other roles. Although incident management is slightly different from my two examples above, I see the review as a critical part of the process. The point is not to run a witch hunt and blame someone, but to let the team learn from the incident and the response, so that it can ensure the incident doesn’t happen again or, at the very least, restore service faster.
A recent example: an outage occurred over the weekend. The usual fixes didn’t work, so the plan was that on Monday everyone would brainstorm and resolve the issue. But on Monday morning the issue had magically fixed itself between 7.45 and 8am. How? Good question. We don’t know, but I suspect someone came in on Monday, saw an “unrelated” issue and fixed it, which in turn resolved the issue that caused the outage. Afterwards there was a little finger pointing and discussion, but that died out pretty quickly, and we are none the wiser about what happened or how we could resolve it if it occurred again.
OK, there are many problems with the above scenario as a way to run a service desk (and I don’t claim to be an expert on how to run one), but let’s focus on the review. If the team (everyone involved) had got together, talked about what happened and looked at what else had occurred over the weekend, for example seemingly unrelated issues in other parts of the business, we may have been able to work out what happened and why service was restored, and potentially stop it from happening again.
Is it because, with incidents, something has broken, so people are afraid to admit they made a mistake, broke it, or even fixed it? Is it human nature to hide from that, or from what they perceive the consequences to be? I don’t know the answer to that …
However, I do know that, be it in sport, at work or at home, you shouldn’t be afraid to make a mistake or to say you made one. Just review it and talk about it, as that is how we all learn, gain experience and become better at what we do.
Barry, Preventer of Chaos
Barry blogs about how to stop chaos in your systems