Friday, 16 November 2012

Process for Fast-Tracking Issues - The Daily Scrub

In this post I'd like to share a simple yet effective process for managing the defect discovery phase, typically experienced during the early stabilisation phase of launching an STB product. The process does not only apply to STB projects; it can also be used for Headend projects, or even general Software Projects that involve a number of components requiring system integration. Since the bulk of my experience comes from STB projects, the information presented here has a bias towards STB projects.

The material presented here is not that ground-breaking, really. If you've been in the business of delivering DTV projects for a long time, then you probably have a tried and tested way of managing the system integration processes. However, this may be the first time you're involved in system integration: you are either a PayTV Operator taking more ownership of SI (System Integration), or you might be a professional services outfit that has won its first big project. If you fall into either of these categories, then I have no doubt that you'll find this useful or, at the very least, a trigger to review your currently planned strategy.

As an aspiring independent consultant, the Scrub Process is yet another addition to my PM Toolbox. Recently I had to present this process to a project team whose processes were a bit old-fashioned, rather rigid, and slow. I needed a medium to express the concepts of gaining more efficiencies out of the SI Process, and thus put together this presentation to communicate the idea, which I'll only briefly touch on here. For more details, please download the presentation.

The Problem with classic SI Process
Classic SI processes call for a structured way of working. Component vendors supply unit-tested components to the SI team. SI puts the components together - the act of integration - i.e. creates a build, does basic Smoke/Sanity testing, and passes it on to further QA teams. Issues picked up by the QA/Test teams then feed back into the SI team for Triage or Characterisation. An SI engineer investigates the issue by first attempting to recreate the problem, then debugging it, and finally pin-pointing the offending component that's causing the problem. Once the component is identified, the SI engineer packages a defect, supplying all the information the component owner requires, and assigns the defect to the component vendor. At some point in the release process, the SI team prioritises the defects for fixing in the next component release, and the component owner commits to fixing them. Usually this is a well-defined process that is repeated like clockwork.

There is nothing inherently wrong with this process, except that following it to the letter takes time - the schedule becomes open-ended, and the launch will take as long as it takes. This is not what a Project Manager wants to hear. A real Showstopper defect might take an SI engineer 3-5 days to track down before identifying the offending component, but a seasoned architect could make a decent intuitive guess and assign the offending component in 10 minutes. In the time it takes an SI engineer to do the investigation, you could've had a fix from the component vendor.

So the real problem is one of Bandwidth, Time & Efficiency, hence the Scrub Process - as depicted here:
Problem with SI is Bandwidth Bottleneck

The Scrub Process
To address this problem, the Scrub process basically adopts a fast-track mentality. Don't spend too much time investigating; get the right people in a room and assign components as fast as possible:
Scrub in a nutshell

It is essentially a forum where all key component owners get together with the main Product Release Manager and respective QA teams, review the incoming defects (newly raised), assess the information reported, and collaboratively agree on which component to assign ownership to. The forum also agrees on the defect's Severity & Priority rating.

The Scrub, essentially controlled by the Release Manager, involves the following. On a daily basis, the Release Manager collates all the feedback from the various test team leads (the test leads have already filtered out duplicates, etc.). Well in advance of the next Scrub meeting, the new Issues List is sent out to all the impacted participants for review. The next morning, the forum reviews the issues one by one, agreeing/debating the ownership, with the Release Manager providing the priority steer. Agreed defects then fold into the natural Release cycle process...
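For readers who think in code, the daily loop above could be sketched roughly as follows. This is only an illustration, not the tooling we used: the `Defect` fields, the function names and the severity/priority values are all hypothetical, and a real project would drive this from its defect tracker.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Defect:
    # All field names here are illustrative assumptions.
    defect_id: str
    summary: str
    duplicate_of: Optional[str] = None  # set by the test lead during pre-filtering
    component: Optional[str] = None     # agreed in the Scrub
    severity: Optional[str] = None      # agreed in the Scrub
    priority: Optional[str] = None      # steered by the Release Manager


def build_scrub_list(reports_by_team: Dict[str, List[Defect]]) -> List[Defect]:
    """Collate newly raised defects from every test team into one Issues List.

    Defects already marked as duplicates by a test lead are skipped, and a
    defect reported by more than one team appears only once.
    """
    seen = set()
    scrub_list = []
    for defects in reports_by_team.values():
        for defect in defects:
            if defect.duplicate_of is not None or defect.defect_id in seen:
                continue
            seen.add(defect.defect_id)
            scrub_list.append(defect)
    return scrub_list


def record_decision(defect: Defect, component: str,
                    severity: str, priority: str) -> Defect:
    """Record what the forum agreed for one defect during the Scrub."""
    defect.component = component
    defect.severity = severity
    defect.priority = priority
    return defect
```

The Release Manager would call something like `build_scrub_list` the evening before and circulate the result; `record_decision` stands in for the outcome of each one-by-one review the next morning.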

Does the Scrub Process work?
Yes, it does!
The process works - but as always, the effectiveness of any process lies in its execution. I've written about the various aspects of maintaining Effective Defect Management, which relates most of the best practices I've experienced from a past project. We had integrated the Scrub process into our overall QA/Defect Process, as indicated here:
Scrub Process Integration with Overall Defect Management Process
In that particular project, we were responsible for the full STB Software stack, owning the EPG, Middleware and SI, although SI was actually controlled by the customer (i.e. we provided the engineers). The Middleware at the time consisted of 80+ modular components, which made it possible not just to assign a defect to the Middleware blob, but to assign it directly to the suspected component of the Middleware. We actively managed the backlog on SI, and reserved only the issues that were really difficult to reproduce, lacked detail, or were really challenging bugs for SI to investigate. The rest flowed directly to component owners, including external vendors. There was great participation from all teams, internal & external.
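The routing rule described above can be written down in a few lines. This is a minimal sketch under my own assumptions: the `NewDefect` fields and the `route` function are hypothetical names, and in practice the judgement on "really challenging" was made by people in the room, not by a flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NewDefect:
    # Illustrative fields only; a real defect record carries far more.
    defect_id: str
    reproducible: bool
    has_enough_detail: bool
    suspected_component: Optional[str] = None  # the forum's best guess, if any


def route(defect: NewDefect) -> str:
    """Return the owner of a newly scrubbed defect.

    SI keeps only what cannot sensibly be assigned yet: issues that are hard
    to reproduce, lack detail, or have no suspected component. Everything
    else goes straight to the suspected component's owner.
    """
    if not defect.reproducible or not defect.has_enough_detail:
        return "SI"
    if defect.suspected_component is None:
        return "SI"
    return defect.suspected_component
```

The point of the sketch is the default: assignment to a component owner is the normal path, and SI investigation is the exception.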

If we hadn't implemented this simple, practical way of fast-tracking defects, the project would not have delivered or launched in the time it did. The Scrub process has become a de facto process in all projects based on this STB Platform, and has been going strong for four years running. As for context, the initial project launched on 4 million STBs on more than four variants of hardware...

I was actively involved in the Scrub process & overall Defect Management Process, performing the role of Middleware Development Owner and often standing in for the Architect in making the judgement calls, which relied on being technical and having sound knowledge of the software architecture.

Tell me more please - What about Challenges, Roles & Responsibilities?
Yes, indeed - there are certainly challenges with implementing this process. Primarily, it requires a mindset change. It also requires participation, transparency, flexibility and a willingness to collaborate & mutually co-operate on the part of all vendors involved in the project. This isn't so easy to achieve, as vendors are generally not very receptive to new ways of working. If you're a young organization, that is, your maturity level with System Integration or Product Management is low, then you will possibly face organizational challenges w.r.t. Product, Project & SI management. You may have to reconsider some existing roles & responsibilities. Also, like any process, it takes practice. It will not work the first time round: you will face people, infrastructural and administration challenges... For more detailed info on the specific problem areas, you can refer to the attached presentation, or download the template process.

I'll gladly help...
If you're interested in bouncing ideas around, networking, or even sharing your current processes & tools with me for review or feedback, I'm more than willing to work, share & collaborate with fellow professionals. Please do get in touch!

1 comment:

  1. This makes a lot of sense, and is a very pragmatic approach. Probably very hard to practice, but hats off to you guys for making it work!