Thursday 15 August 2019

Whiners never win

Whiners never win!
What simple yet powerful three words!
These are the words my electrical engineering professor used on me back in my second year, when I challenged my paper being wrongly graded. I remember waiting a long time at his office just to see him, and the prof gave me just one minute: "Mr. Khan, here's some advice for the rest of your life - whiners never win". That was it, and then he left, my paper unchanged.


This stuck with me as I completed my degree, started my first job, took a chance one year after graduating by emigrating to two countries, and went on to marriage, parenthood, etc. I keep coming back to this simple message, "whiners never win", and so just get on with life or work...

Yet, coming back to South Africa, now 8 years and counting, I was surprised to learn how much whining happens in the workplace! It is emotionally draining. Almost everyone has a complaint, is not happy with this or that, and is unable to separate personal emotions from being professional at the workplace. Just a whole lot of whining going on. I wonder if it's just the context of the nation, just whining, starting with the commute into work listening to radio stations (like 702), which I've stopped listening to completely. I actually don't listen to such news anymore: it starts the day off on the wrong note, stepping into the office with negativity and stress. Don't forget the stress of driving on SA roads...

I was also surprised how people tend to bring their home issues into the workplace as well, affecting how they show up, perform, etc. My mindfulness, listening and empathy skills have seen accelerated growth since 2011...

Still, eight years on, I see a lot of whining in the workplace...come on folks, stop whining.

Whining never wins. 

Bring your best self to work, show up and get through the struggle...


Stop whining. "Whining never wins"...

Don't get me wrong, I've got loads of empathy...but at some point, people need to step up and just stop whining as whiners never win!

Sunday 11 August 2019

On using metaphors for Tech Ops


I like finding metaphors from other worlds outside of technology or engineering that can help simplify concepts for business people. Using such comparisons ("is like a...") can actually be quite a powerful way of connecting or contextualising, especially when these people (business stakeholders mostly) are not interested in the technical details at all. All these folks care about, when there's an operational issue impacting business/customers, is: "What's the issue? Why did it happen? When will it be fixed? I need 15 minute updates please. Not interested in detail. Just want to know when it will be resolved. And don't forget Root Cause Analysis (RCA) document & Consequence Management!" When I got these remarks, I was left dumbfounded...and decided to find another way of communicating to manage expectations better...

Enter...
#draintheswamp
#ICU

#draintheswamp

Despite the term being popularised by Trump last year (which was incidentally around the same time I introduced the term), in the software engineering community "draining the swamp" is used to describe some refactoring and possibly restructuring of code and architecture, to address key bottlenecks, instabilities and optimisations. This could also mean revisiting original design & implementation decisions. Some people in software might put this under the bucket of "paying down technical debt", whilst others might just put this down as a natural part of software evolution.

I used this term to prepare business people for the impact of the changes we planned to make. This was in response to a business ask of anticipating the biggest load on the tech platform stack - will the system cope with a "Black Friday-like event"? The simple answer was NO, not in its current form: we would need to make some aggressive changes to the platform, like scaling out to multiple datacentres. The stack was built as a monolith (multiple services & components, some micro services, running on traditional VM infrastructure, not containerised, etc.) and so the best chance of scaling for load was to move the stack to run off multiple datacentres, where we could scale up on each datacentre as well. Even then, we still had bottlenecks with a shared database cluster running off one primary datacentre.

Suffice to say, this stuff goes above the heads of the business people. So I said, it's like draining the swamp. When that happens, we are bound to find some nasty things lying beneath the surface, things we don't see today, and can only see when we start draining the swamp. When this happens, and we hit some boulders or some unforeseen monsters of the swamp, we are bound to experience downtime and an outage or two. Don't panic, this is expected, given our working constraints, that is, making the changes on a live production environment.

#draintheswamp went down pretty well with business. They got it. Left us alone. They trusted us! They managed the customer & business impact when we needed help, and our tech teams worked heroic hours to get the stack running off multiple datacentres, just in time for the biggest event of the business for the year (2018). Our user numbers reached record highs, and the platform did not fall over! So round one of #draintheswamp was done, but the swamp was not completely drained out...

Until earlier this year (2019), when we kicked off another round of #draintheswamp to start the migration to cloud stacks, starting with containerisation...which led the platform to suffer a series of outages at the most inappropriate moments...customers were pissed, business impact teams were on the back foot, social media was killing us, the changes for #draintheswamp were starting to kill the platform...when I resorted to introducing another metaphor: #ICU. Folks, the tech stack is in need of some severe TLC, we are instigating #ICU mode. Life support is initiated, vitals are not great, but the patient is surviving, and needs a high level of critical care...

Technical Operations / Site Reliability Engineering is not so different to running a hospital's ER Emergency Department...at any time, despite monitoring vital signs of existing patients (system components of tech stack), or doing day-to-day operational management of the ward (infra maintenance), an event could occur that can just spike, like a natural disaster, accident or terrorist attack (hardware failure, critical component dies, database crash unanticipated, etc.)...ER staff have to triage quickly (Tech Ops also have to triage), make life/death calls (Tech Ops when to call a P1 incident & inform business), decide on severity of the injury (requires immediate attention, operate now, or can wait??)...the same holds true with bringing back a technical platform from death to living healthy operations...so why wouldn't we want to reuse medical terms?? I think it makes perfect sense!


#ICU

The gist of this metaphor is simple: ICU/CCU means Intensive/Critical Care Unit. When a patient is in ICU, it is serious stuff, urgent priority, focused attention to monitoring, diagnostics & multiple treatment options, using whatever means necessary to enable a positive outcome for the patient.

The typical flow from recovery back to health goes through these transitions:

  • Start in ICU/CCU, remain there until the interventions have been applied & life-support is no longer necessary, then...
  • Transfer to High-Care facility, remaining close to ICU, but stable enough to warrant reduced focus and attention as compared to being in ICU, but be prepared for surprises, so best be close to ICU ward and not far away...wait until doctors give the green light to move on to...
  • Normal / General Ward...the last stay before checking out to go home. Vitals and all other required checks all pass before discharged for home...
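
For the techies reading this, the ICU to High Care to General Ward flow above can be sketched as a tiny state machine. This is only an illustrative sketch, not how we actually tracked the platform: the names CareLevel and next_care_level are hypothetical, and the rule that any relapse sends the platform straight back to ICU is my assumption for the example.

```python
from enum import Enum


class CareLevel(Enum):
    """Hypothetical recovery stages for a tech platform, mirroring the hospital flow."""
    ICU = 1
    HIGH_CARE = 2
    GENERAL_WARD = 3
    DISCHARGED = 4


def next_care_level(level: CareLevel, stable: bool) -> CareLevel:
    """Return the care level after one review of the platform's vitals."""
    if not stable:
        # Assumption: any relapse sends the platform straight back to ICU.
        return CareLevel.ICU
    if level is CareLevel.DISCHARGED:
        # Already healthy and stable: nothing to step down to.
        return level
    # Stable: step down exactly one level, never skipping High Care.
    return CareLevel(level.value + 1)


# Example: one relapse while in High Care, then a steady recovery.
level = CareLevel.ICU
for stable in [True, False, True, True, True]:
    level = next_care_level(level, stable)
print(level.name)
```

The point of stepping down one level at a time is the same as in the hospital: you stay close to ICU until the doctors (your tech leads) give the green light.
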
Being in #ICU is rather stressful for everyone: there is a high sense of urgency, and the pressure to respond to business is quite intense (especially when Risk/Governance demands "Consequence Management"). Still, we can draw additional parallels from medical ICU, like:
  • Remain calm, address the topics / unknowns in a logical manner, using the tools of diagnosis you've trained for (Technical tools like Five Whys, etc.)
  • Run tests, take blood samples, etc. (Tech/Data forensics, logging, test hypotheses, etc.)
  • Call on other medical experts to bounce diagnosis / brainstorm (Involve as many tech experts to offer new perspectives)
  • Implement treatment plan according to hypothesis, wait for new results (Fix something, wait, analyse result, before making additional changes).
  • Communicate clearly to patient's stakeholders (Communicate to business stakeholders transparently without hiding or being defensive...come clean).
  • Pray :-)

Unpacking the medical terms...

According to Wikipedia, an ICU:
An intensive care unit (ICU), also known as an intensive therapy unit or intensive treatment unit (ITU) or critical care unit (CCU), is a special department of a hospital or health care facility that provides intensive treatment medicine.
Intensive care units cater to patients with severe or life-threatening illnesses and injuries, which require constant care, close supervision from life support equipment and medication in order to ensure normal bodily functions. They are staffed by highly trained physicians, nurses and respiratory therapists who specialize in caring for critically ill patients. ICUs are also distinguished from general hospital wards by a higher staff-to-patient ratio and access to advanced medical resources and equipment that is not routinely available elsewhere.
Patients may be referred directly from an emergency department if required, or from a ward if they rapidly deteriorate, or immediately after surgery if the surgery is very invasive and the patient is at high risk of complications.
When a technical platform goes into #ICU, we do the following:

  • We're on red alert - life threatening to business (customer experience is tanking)
  • Stop all other development work, or reduce planned work as much as possible by pulling in all the people we need to help (speaks to higher-staff-to-patient ratio)
  • Pull out all the stops, bring in experts, tool-up with advanced resources and equipment
  • Communicate daily to all business stakeholders
  • Perform multiple surgeries if needed (hotfixes, patches, etc.)
  • Run multiple diagnostics in parallel (think Dr. House)

According to Wikipedia, a High Care/Dependency Unit:
A high-dependency unit (HDU) is an area in a hospital, usually located close to the intensive care unit, where patients can be cared for more extensively than on a normal ward, but not to the point of intensive care. It is appropriate for patients who have had major surgery and for those with single-organ failure. Many of these units were set up in the 1990s when hospitals found that a proportion of patients was requiring a level of care that could not be delivered in a normal ward setting.[1] This is thought to be associated with a reduction in mortality.[1] Patients may be admitted to an HDU bed because they are at risk of requiring intensive care admission, or as a step-down between intensive care and ward-based care.
According to HealthTalk, the last point to recovery is General/Normal Ward:
People are transferred from the intensive care unit to a general ward when medical staff decide that they no longer need such close observation and one-to-one care. For many people, this move is an important step in their progress from being critically ill to recovering.
'Nuff said, IMHO the similitude mentioned above should be more than self-explanatory ;-)

Tuesday 6 August 2019

On company values


"Disagree and commit"
"Embrace the elephant"
"Don't be a d!ck"
"Listen. Challenge. Commit. A good leader has the humility to listen, the confidence to challenge, the wisdom to know when to stop arguing and commit."
"We use data to drive decision making. Data-driven-decisions."

These are just some of the catchy phrases that some "modern" workplaces aspire to implement as company culture...great words, but easier said than done in practice IMHO. There's this somewhat unreasonable expectation that human beings can just implement this stuff - but we all know too well, humans are emotional and predictably irrational...

If you're a leader...
You're bound to issue instructions, ideas or directives that are not going to sit well with your teams. You have a duty to listen to feedback and alternative points of view, use these as inputs into your thought process, and ultimately make the call. It's not about paying lip service and saying "I've listened" without truly listening...what if you're wrong, and you might even be the HIPPO in the room? Even if your decision doesn't change, and you set the directive, you're ultimately responsible and accountable for the outcome.
What if the directive appears to be wrong? Do you stick it out till the end because of your ego and risk losing credibility? Some might say a real leader will step up, and own up saying "Hey, I was wrong guys, but we gained a lot of learning from this."
Still, as a leader, you then trust (expect) your team to follow through no matter what...but is this even as simple as it seems? Probably not...how do you behave when you find out that there's still resistance? Do you flip your lid? What kind of a leader does that make you? Some might argue that being decisive, forthright, firm and tough on naysayers are great attributes for a leader..."the buck stops here!".

If you're on the receiving end of the instruction / strategic directive and you disagree...
You've said your piece, provided feedback, you may have even fundamentally disagreed with the decision...and now you're faced with understanding yourself: Can you let go, can you commit, even though you violently disagreed? Are you serious about the best outcome for the team regardless of your personal opinion? How will you stop yourself from unconsciously falling back into dissent-mode?
As a team player, your leader expects you to do so. But have you prepared yourself to work towards that outcome?
How do you stop yourself from sounding like a stuck record?
My humble advice: stop being this guy. If you can't let go, then you're limiting your own growth.
BUT...ask yourself this: What am I willing to walk away from?
If the directive conflicts fundamentally with your core professional or personal value system, then what do you do?
My view: if it gets to that level of personal dilemma, and you're so sure of your value system, then leave, quit the company altogether, or change teams...it depends on how serious you are about this value system of yours.
If the conflict is not even near to compromising your value system, then ask yourself how you can adapt your own behaviour & approach, how you can help solve the problem, and how you can help influence the outcome. How can you show your leader you're committed, no matter what?

This isn't always easy...the emotional forces at play can be intense, pulling you back...but once committed, your leader is expecting you to follow through and be a team player.

Both Leader and Follower need to have a keen handle on self-awareness, "Know thyself"...some call this "mindfulness"...