Crisis? What crisis?
It is the middle of the night and the phone rings… somewhere a data center is down, somewhere a province has lost power, a project cannot go into production… “Start driving… I will update you in the car…” says the voice on the other end of the line. A conference call number is forwarded to me… I have no idea what is going on. I enter the address in my navigation system and watch the windshield wipers fight the rain.
While driving I look up the conference number. As I join, I hear that there are 21 people on the call. That is not good. Apparently everyone has already been dragged out of bed.
After 13 years as a crisis manager, I regularly wonder just one thing: “How did it come to this?” Can it not be done differently?
Halon and servers at risk…
Different customer… different situation. The technician on duty had to test the Halon fire suppression system. The poor man does what he has been doing for years and flips a series of switches. The continuity plan provides for such tests. But what you see so often is that it has not been thought through. The reasoning goes: a fire could start, so an extinguishing system must be installed. To limit the risk, the data center has been split in two, so that when something happens in one data center, operations can be diverted to the other. Simple, right?
Wrong… and it went wrong. Not through a fire, but through human error. The test went horribly wrong. In his routine, the poor man had forgotten to turn off the Halon system. The Halon was blown into the server cabinets via the backplanes. Someone had decided that the cold air to cool the servers should come from under the raised floor. Nobody had stopped to think that the second data center, built on the same floor, drew its cool air from under that same raised floor… and yes… this time the cool air came with Halon… thick… greasy, corroding Halon…
Someone had designed two data centers and then adapted them as the company grew. The backup tapes only turned out to be usable for a restore on the sixth attempt. Restoring the full 6 TB backup would take days, if not weeks… but restore it onto what? All the servers were going to die. The Halon would eat its way through the gaskets and destroy them. It was not a question of whether, but of how long those systems would keep running.
Any idea how many companies make servers? And yes, you can call IBM or HP… and no, they do not deliver 125 blade servers after a single call.
Crisis on the way…
New case… In the evening I watch the news and see that an entire province is without power. I go upstairs, grab a weekend bag and throw in some underwear, a toothbrush and toothpaste. I no longer wait for the call. When something is wrong, I get on the road quickly. After all these years, you know when it is your turn.
When I arrive, I ask my contact person if I can sit in with management as an advisor. It was not my turn yet: the crisis team was running at full speed and all procedures had been started. For a moment… for a moment I cursed myself for being so clever as to jump in the car when I could have been in my bed. My domain was running, my employer had not been called out of bed, I was just sitting there for show… I thought. I was slowly starting to wonder how to spend the night… whether I could use an empty hospital bed.
As I thought about that, my gaze drifted outside… and suddenly all my alarm bells went off.
Being trained for crises
In a critical situation, a pilot or soldier has only one thing on his or her mind: what is going to kill me first? He fixes that… or she does… and then the next thing… and the next… until the situation is under control. That is how we have been trained… that is how we think. A pilot has no time to calculate the risk that the engine will fail. He spends his whole career training for emergencies. As soon as the situation actually occurs, he knows instinctively what to do. All that training buys him precious time… sometimes just the few seconds that make the difference between life and death.
I looked outside and saw people in another building in a meeting… lights on, projector on… business as… business as (un)usual.
I asked the director of ICT: “How much diesel do you have?” The man was busy and bothered by my question. He looked slightly irritated and asked me to repeat it… So I asked him again: “How much diesel do you have?”
He said the diesel generators had all come up and that they had stock for three days. They had been running for just over an hour, so there was still plenty…
I felt a little tense. This was what I had been afraid of. This is not good. I wondered how I could put this diplomatically. How was I going to tell him he had a bigger problem than he thought… I asked him: “When are the fuel trucks coming?” Now visibly annoyed, he replied that if the power had not returned within two days, he would call the supplier and order the diesel tanks to be filled…
I can only conclude that a slight panic started to take hold of me. These people are thinking about it the wrong way. Too linearly… I understand why. But it is not good.
The agitated director looks up from behind his laptop and puts down his pen… he looks at me and tells me to say whatever is on my mind.
I start to talk… at first jumping from one point to the next… but gradually with more structure. I have to hold back so as not to say everything at once.
So I advise him to measure the current diesel stock every hour with a dipstick and to map the current consumption. Then make a shutdown plan. All non-essential lighting must be turned off. The fuel trucks must deliver diesel every 6 hours. The tanks must be filled to the brim every 6 hours. This must be agreed with the supplier.
The reason behind this is that the longer the power failure lasts, the greater the demand for diesel will be, and the harder it becomes to get a delivery. That is when you need that three-day reserve, so you keep it full for as long as you still can. As a crisis manager you think differently. You think of scenarios that will probably never occur.
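To make the hourly dipstick routine concrete, here is a minimal sketch in Python of how the readings could be turned into a burn rate and a remaining runtime. It is an illustration only, not the tooling used that night: the names `Reading`, `burn_rate`, `hours_remaining` and `litres_to_top_up` are made up for this example, the figures are invented, and it assumes no delivery took place between the first and the last reading.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    hour: float    # hours since the generators started (assumed time base)
    litres: float  # tank level measured with the dipstick


def burn_rate(readings):
    """Average consumption in litres per hour between first and last reading.

    Assumes the tank was not refilled in between.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    first, last = readings[0], readings[-1]
    elapsed = last.hour - first.hour
    if elapsed <= 0:
        raise ValueError("readings must be taken at different times")
    return (first.litres - last.litres) / elapsed


def hours_remaining(readings):
    """Hours of runtime left at the measured burn rate."""
    rate = burn_rate(readings)
    if rate <= 0:
        return float("inf")  # level is not dropping, nothing to extrapolate
    return readings[-1].litres / rate


def litres_to_top_up(readings, capacity_litres):
    """Volume to request from the supplier to fill the tank to the brim."""
    return capacity_litres - readings[-1].litres


# Invented example figures: a 10,000 litre tank, measured every hour.
log = [Reading(0, 10_000), Reading(1, 9_850), Reading(2, 9_690)]
print(f"burn rate: {burn_rate(log):.0f} l/h")
print(f"runtime left: {hours_remaining(log):.1f} h")
print(f"top-up needed: {litres_to_top_up(log, 10_000):.0f} l")
```

The point of recomputing this every hour is that the runtime figure tracks real consumption instead of the "three days" on paper, and that the shutdown plan shows up directly as a lower burn rate.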
Then there is the key plan. Key card readers are of course a great invention, but are the batteries replaced and tested on time, so that you can rely on them when the power actually goes out? Is there a key plan? Is there a documented protocol for issuing and returning keys? The key to the pharmacy is really a ticket to paradise when there is no control over it. How long can you do without ICT? Have you ever tested that? How long does it take to switch from an EPD (electronic patient record) to paper? Do you need everything? What information do you need to keep a patient alive? What do you need to know, and when?
Look beyond crisis management: welcome to Business Resilience Management
It is my vision, and the vision of S23k, to help organizations become more resilient: to put our knowledge of crisis management to work for your organization, in the sincere hope that you will never need us as crisis managers.
Business Resilience Management helps you prepare for known and unknown risks. We train and practice all kinds of situations. In an exercise, for example, we would have learned that this scenario involves a lot more than placing a few generators. You can think of many things in advance, but you really want to know what else you could be confronted with, and you want to discover before the crisis, not during it, that you need more than just a generator. During a crisis you simply do not have the time. You will be too late.
Please feel free to contact us for a no-obligation intake. And who knows, you might also sleep a little better.