#draintheswampDespite the term being popularised by Trump last year, which was incidentally around the same time I introduced the term - in the software engineering community, "draining the swamp" is used to describe some refactoring and possibly restructuring of code and architecture, to address key bottlenecks, instabilities and optimisations. This could also mean revisiting original design & implementation decisions. Some people in software might put this under the bucket of "paying down technical debt" whilst others might just put this down as a natural part of the software evolution.
I used this term to prepare business people on the impact of the changes we planned to make. This was in response to a business ask of anticipating the biggest load of the tech platform stack - will the system cope with a "Black Friday-like event"? The simple answer was NO, not in its current form, that we would need to make some aggressive changes to the platform - like scaling out to multiple datacentres. The stack was built as a monolith (multiple services & components, some micro services, running on traditional VM infrastructure, not containerised, etc.) and so the best chances of scaling for load was to move the stack to run off multiple data centres, where we could scale up on each datacentre as well, but still had bottlenecks with share database cluster off one primary datacentre.
#draintheswamp went down pretty well with business. They got it. Left us alone. They trusted us! They managed the customer & business impact when we needed help, and our tech teams worked heroic hours to get the stack running off multiple datacentres, just in time for the biggest event of the business for the year (2018), our user numbers reached record highs, and the platform did not fall over! So round one of #draintheswamp was done, but the swamp was not completely drained out...
Until earlier this year (2019), when we kicked-off another round of #draintheswamp to start the migration to cloud stacks, starting with containerisation...which led the platform to suffer a series of outages at the most inappropriate moments...customers were pissed, business impact teams were on the back-foot, social media was killing us, the changes for #draintheswamp was starting to kill the platform...when I resorted to introducing another metaphor: #ICU. Folks, the tech stack is in some severe TLC, we are instigating #ICU mode. Life support is initiated, vitals are not great, but patient is surviving, and needs high level of critical care...
The typical flow to recovery to health looks for these transitions:
- Start in ICU/CCU, remain there until a period of time where interventions applied & life-support is no longer necessary, then...
- Transfer to High-Care facility, remaining close to ICU, but stable enough to warrant reduced focus and attention as compared to being in ICU, but be prepared for surprises, so best be close to ICU ward and not far away...wait until doctors give the green light to move on to...
- Normal / General Ward...the last stay before checking out to go home. Vitals and all other required checks all pass before discharged for home...
- Remaining calm, address the topics / unknowns in logical manner, using tools of diagnosis you've trained for (Technical tools like Five Whys, etc.)
- Running tests, take blood samples, etc. (Tech/Data forensics, logging, test hypotheses, etc.)
- Call on other medical experts to bounce diagnosis / brainstorm (Involve as many tech experts to offer new perspectives)
- Implement treatment plan according to hypothesis, wait for new results (Fix something, wait, analyse result, before making additional changes).
- Communicate clearly to patient's stakeholders (Communicate to business stakeholders transparently without hiding or being defensive...come clean).
- Pray :-)
Unpacking the medical terms...According to Wikipedia, an ICU:
When a technical platform goes into #ICU, we do the following:An intensive care unit (ICU), also known as an intensive therapy unit or intensive treatment unit (ITU) or critical care unit (CCU), is a special department of a hospital or health care facility that provides intensive treatment medicine.Intensive care units cater to patients with severe or life-threatening illnesses and injuries, which require constant care, close supervision from life support equipment and medication in order to ensure normal bodily functions. They are staffed by highly trained physicians, nurses and respiratory therapists who specialize in caring for critically ill patients. ICUs are also distinguished from general hospital wards by a higher staff-to-patient ratio and to access to advanced medical resources and equipment that is not routinely available elsewhere.Patients may be referred directly from an emergency department if required, or from a ward if they rapidly deteriorate, or immediately after surgery if the surgery is very invasive and the patient is at high risk of complications
- We're on red alert - life threatening to business (customer experience is tanking)
- Stop all other development work, or reduce planned work as much as possible by pulling in all the people we need to help (speaks to higher-staff-to-patient ratio)
- Pull out all the stops, bring in experts, tool-up with advanced resources and equipment
- Communicate daily to all business stakeholders
- Perform multiple surgeries if needed (hotfixes, patches, etc.)
- Run multiple diagnostics in parallel (think Dr. House)
According to Wikipedia, a High Care/Dependency Unit:
A high-dependency unit is an area in a hospital, usually located close to the intensive care unit, where patients can be cared for more extensively than on a normal ward, but not to the point of intensive care. It is appropriate for patients who have had major surgery and for those with single-organ failure. Many of these units were set up in the 1990s when hospitals found that a proportion of patients was requiring a level of care that could not be delivered in a normal ward setting. This is thought to be associated with a reduction in mortality. Patients may be admitted to an HDU bed because they are at risk of requiring intensive care admission, or as a step-down between intensive care and ward-based care.According to HealthTalk, the last point to recovery is General/Normal Ward:
People are transferred from the intensive care unit to a general ward when medical staff decide that they no longer need such close observation and one-to-one care. For many people, this move is an important step in their progress from being critically ill to recovering..'Nuff said, IMHO the similitude mentioned above should be more than self explanatory ;-)