Thursday, 28 June 2012

Managing Large-Scale Projects using Agile, Part 3 - Product Management Methodology


Agile Product & Project Management Model - Large-Scale Set-Top-Box Development Project
A while ago I started writing about my experiences with managing large-scale software development projects using the Agile philosophy of very Disciplined Agile Delivery (DAD). In order to do this, I first had to introduce some context & background in the three previous posts leading up to this one, the last of which dealt with overall organizational challenges. Now the scene is set to discuss the actual Agile model we eventually implemented, which continues to stand the test of time: four years on, it is still running strong to this day.


Some Background Material
Quick introduction to the world of digital TV
What is a Set-Top-Box?
What is an EPG?
What in the world is Sky TV?


Disclaimer
The experiences I'm writing about come from past projects that I personally worked on or touched in my time as a Middleware Development Owner & Middleware Products Manager, and that are very much in the public domain:
BSkyB replacing OpenTV HDPVR with NDS technology
Fusion is the new NDS Mediahighway to displace OpenTV
BSkyB Launches new HDPVR
Sky Italia replacing OpenTV HDPVR with NDS Mediahighway
Sky Deutschland delivers enhanced VOD powered by NDS Mediahighway
UPC being powered by NDS MediaHighway


The Project
Take a look at this video clip:





Yes, this is the background project I was fortunate to work on, which forms the basis for my writing on a large-scale software development project. We were responsible for delivering the core software component (Middleware / Operating System) that enabled the EPG (UI Application) to be developed to offer the Sky Anytime+ User Experience.


On the surface, when interacting with an EPG, the user experience might suggest this is a simple application compared to powerful PC software products like Excel/Word or Adobe Photoshop; surely it can't be a complicated endeavour to create this system? That's not entirely true: beneath the seemingly simple UI is a complicated piece of software and electronics, so much so that typical projects such as this one take at least three years to complete (if all goes well)...


The project was greenfield development on all fronts for the software vendors employed, especially on our side for the Middleware component. But for the broadcaster it was a massively complicated migration project, because essentially we were replacing the full software stack of a legacy deployed base of 4 million+ STBs, with the option for rollback!!

The project, though massive in size, relied on a core product & architecture team to manage the roadmap and project planning, supported by a very large technical team (software engineers, integrators & testers) that was geographically distributed throughout the world.  I joined the project a year and a half into its development cycle, then spent the next two years executing and driving through the implementation and delivery to Launch Candidate builds, primarily in the role of Middleware Development Owner. I inherited the Agile founding principles of the Programme, and two full years of practical exposure left me with a deep appreciation for Agile, such that all my subsequent projects re-used and evolved the model based on those principles. I later joined the Product Management team that championed the processes through roadshows, training and ramping up new project managers on the "Project Toolbox" required to deliver all future customer projects...

This post is split into the following sections to make for easy reading:

Monday, 4 June 2012

Managing Large-Scale Projects using Agile, Part 2 - Organisational Challenges


Organisational Challenges of Large-Scale SDPs
In this post I will describe the organisational challenges of supporting large-scale software projects, especially when the management philosophy is one that promotes agility, or has its roots in adopting Agile/Scrum early with a small team and then ramps up to a globally distributed project team. Specifically, I'll describe the approach we took in managing the example project described in the previous post. As with any organisation adopting Agile, there must be management support. In some cases organisations have transformed through a bottom-up approach, but in an LSSDP project some initial strategic planning is expected, so it's imperative that management support is won at the outset (more about this later...).

Whether you're running a classic, relatively small project (say a team of 15) locally or multiple teams (250+ people), the essential elements of team/project management remain the same, the only difference is the scaling factor as the team expands. In my opinion, the following factors are integral to any software development initiative:
  • Communications: The classic example is the communication model that Brooks provides: with n being the size of your team, the number of communication paths is n*(n-1)/2, so a team of 10 would have 45 communication paths, while a team of 200 would have 19 900!  With this in mind, pains must be taken to ensure that, organisationally, a suitable structure is put in place to sustain manageable communications.
  • Organisational Team Structures: Flowing from the above problem of communications, each team needs to have an identity: roles and responsibilities need to be understood. This is a trait common to both small and large teams alike. Defining the structure for the project/organisation through the classic org chart does help in clarifying the structure.  Even though Agile promotes a flat team, collaborative decision making and participation, it is still useful to ensure the roles are identified and understood.  With a large-scale project, collaborative discussions and decision making can still sometimes happen, but are extremely challenging. Large-scale projects call for a really strong management structure (which involves technical people as well) to ensure momentum in decision making.
  • Workspace Environments:  A challenge common to both large and small teams alike: one has to ensure the team's physical working environment is not only comfortable from a personal space perspective, but can also support the needs of the work. For example, Set-Top-Box development/testing requires a desk-space large enough to cater for a few screens (monitors / TVs), ample power supplies, space for mobile stream players, etc.  Agile/Scrum promotes the use of whiteboards for tracking work, for example: so do you have mobile whiteboards, fixed boards or what? How do you solve this problem when your team is distributed?? How do you ensure that all your people in all geographic locations have a similar setup to the rest of the world?
  • People Skills / Training needs: Whether you have a small or large team, you will experience the same people challenges: Does the team have the knowledge/skill-set to do the job or task at hand, effectively, efficiently?? How do you build competencies across your team base?  What are the basic fundamental prerequisites of skills/knowledge you need to have? How do you promote transfer of knowledge & training?
  • Peopleware Issues & Cross-Cultural Challenges: Building software is more about people dynamics than implementing technology. If you don't have a well-formed team (Agile/Scrum promotes self-organising, well-formed teams), which is true for both large and small teams alike, you will experience problems that inhibit the smooth flow of development, impacting your project's delivery.  In a small team, the Team Leader, Scrum Master or Development Manager has more control over this, and has the time to see his/her team through the various stages of Forming, Storming, Norming & Performing. Even with small teams this process takes time.  With a large contingent of 200+ people, the problem is exacerbated: it is very difficult to manage and assess due to the spatial and temporal differences in managing distributed teams.  In our project we had people from UK (two locations), India, Israel (Russian, Polish, American, S. African), France & Korea, with third parties from Poland, UK, Ireland & India.   How can you achieve a well-gelled team of international players? How do you solve the cultural issues? How do you avoid the frequent Lost In Translation moments?
  • Maintaining Integrity of Software Architecture: All teams need a custodian of the software architecture, typically this is your architect, in our context, the STB architect. Where the team is small, you can have just one STB architect. On a team as large as 200+ people, all implementing a complex architecture, how do you ensure the technical integrity of the architecture is maintained?
  • Tools/Infrastructure/Configuration Management: Like any other trade, a software team needs the right tools for the job. When managing your local team it's easy to notice gaps and provide tools; it's far more complicated with distributed teams, since the challenge is in maintaining consistent adoption of tools, i.e. tools should not proliferate and differ from team to team (defect tracking tools spring to mind). Establishing a common infrastructure plan does help in creating consistency for the programme, and in maintaining a sense of harmony within the distributed team itself.
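
To make the communications arithmetic above concrete, here is a minimal Python sketch of the n*(n-1)/2 formula. The function name `communication_paths` and the team sizes in the loop are purely illustrative choices of mine, not anything taken from the project tooling.

```python
def communication_paths(team_size: int) -> int:
    """Pairwise communication paths in a team of n people: n*(n-1)/2."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    # n*(n-1) is always even, so integer division is exact.
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    # Illustrative team sizes only: a small local team versus a large distributed one.
    for size in (10, 15, 200, 250):
        print(f"{size:>4} people -> {communication_paths(size):>6} paths")
```

The growth is quadratic: doubling the team roughly quadruples the number of paths, which is why the organisational structure matters far more at 200+ people than at 15.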
In the remaining sections I'll touch on how we addressed each of the above areas referring to the example case study mentioned in my starting post. This post is arranged as follows:

Tuesday, 8 May 2012

Managing Large-Scale Projects using Agile, Part 1 - What is a LSSDP anyway?

What is a LSSDP anyway?
In my previous post I introduced the subject of LSSDP: Large-Scale Software Development Projects.
In this post I'll try to describe what I've come to understand as a Large-Scale Software Development Project (LSSDP) based on my real-world experience of successfully delivering such projects.  There are many buzzwords flying around, recapping my own attempt to capture headlines on my CV:  Experience with large-scale, geographically globally distributed-development projects spanning 200+ people, using Agile principles, multi-project
  • Large-Scale Software Development Projects
  • Global Software Development Projects
  • Multisite Software Development Projects
  • Globally Distributed Software Projects
  • Geographically Distributed Software Projects
What do I mean by Large-Scale?
I consider a project as being "Large-Scale" where the project in question has the following ingredients:
  • Significant financial investment in the project spanning tens of millions of dollars
  • Technology  being produced by the engineering team is of a complex nature
  • The engineering team (developers/testers/integrators), product and project management teams span in excess of 200 people
  • Exceedingly high Expectations placed on the project team to ensure a successful outcome
  • Strong implementation of processes & methodologies to co-ordinate and steer the project along
  • Globally Distributed teams (where co-location is impossible)
  • Commercial/Proprietary
What do I mean by Globally Distributed?
Let's start with defining "Distributed" by using its opposite, which in software terms is usually "Centralised": a single point of control. This isn't strictly applicable here; we are instead focused on the physical locality of the project's team members, i.e. the Locality of the team.  Locality, in my opinion, has many variations. At the highest level, we sometimes say a team is co-located when all the team members are physically present in the same region/area (e.g. Office park, Campus R&D building, third-floor, west-wing) - but that is quite specific. Some might say that a team is co-located if they're all present in the same office park; for instance, your company might have a campus with many buildings on-site, and just being on campus can be interpreted as being local, co-located.  But I prefer to depart from this view and rather choose to say that a distributed team is one where the team members are not all housed together and are separated by one or more arbitrary barriers.

How deep does one go? If my team are not all in the same building, on the same floor, then my team is distributed: Same building, different floor. Different buildings same campus park. Same country different province. Same province different campuses.   Get my drift?  As soon as there's some barrier that's going to impact communications, co-ordination, team building and coherency -- we then have a distributed team.

We have a Globally Distributed Project where the team consists of participants located from other geographic areas: Same continent, different countries. Different Continents, Different Countries - i.e. a Geographically Dispersed Team.

A Quick Note on Open Source Projects
Before anyone jumps up and down and waxes lyrical on how the big-name open source projects are all large-scale and distributed and that we should learn from this community, etc. etc -- I accept, respect and admire the open source movement myself; but as we all know, the corporate world is an altogether different machine. Distributed projects can be run along the same principles as the open source movement, in the context of the corporate shell.  Delivering large-scale, commercial programmes comes with a lot of risk to the business (which is not applicable to open source projects), and as such these projects have to be run a little differently.... All I would like to do is share my exposure in running such projects, not really advocating one methodology over another (although when you dig deeper into how decisions are made in open source projects, there is usually a single entity, some call it the BDFL, that eventually calls the shots, much like a commercial programme!)...

Case Study: Developing Set-Top-Box Software, Large-Scale, Globally Distributed Project
The remaining posts are all based on a past project, which I can truly say was one of the best project experiences in my career to date, and which allowed me to grow in more ways than I can imagine.  It is no secret that NDS is on a mission to become the world's leading Middleware provider with its MediaHighway Portfolio.  Like all successful software product companies, there is always the challenge of supporting your existing customers using legacy technology while trying to work on cutting-edge, innovative new products & features.  NDS was no exception: over the 10-25 years of providing Middleware to many different customers, satisfying the various needs and demands of these customers, the products started to fragment and reach their maturity. Development centres were situated across the world (US, UK, France, Israel, India, Korea), each supporting a slice of the product (Basic, Evolution, Advanced) and each wishing to innovate and move the product forwards in its own way. To cut a long story short, the company decided to use the vast knowledge-base and experience from its global technical team to produce the next generation, advanced Middleware platform, which was known at one stage to the outside world as "MediaHighway Advanced", but this often changes whichever way the marketing wind is blowing. Internally we (and for a short while, the press) referred to this product as MediaHighway Fusion; I'll stick to "Fusion" for the remaining posts.

Fusion was not just about new technology and new architecture. Yes, it was born from learning the hard lessons of its predecessors (Core4, Evolution, Pantalk, XTV), taking the best design principles, merging with modern Linux & general operating system architectures, and turning some models upside down, for example the concept of CDI (Common-Driver-Interface) that departs from the classic HAL (Hardware Abstraction Layer)-STB manufacturer tight coupling....(future post will discuss STB architecture)....Fusion was more about improving response times to deliver new features to market: Agility, Flexibility and Continuous Delivery. Faster time to market: can we deliver new features in three-week sprints, i.e. sustain product increments every three weeks??

So what made Fusion a Large-Scale Project anyway??  The core Fusion architecture and principles were conceived by a small team of technical experts. The first major milestone was the production of the CDI specification, which is 1500+ pages of technical specifications aimed at Chipset Vendors to comply with the underlying broad platform infrastructure, the low-level interface definitions. That was the ultimate starting point as it laid the foundation for services supplied to the upper layers of the software stack.  The Middleware architecture was component based, consisting of in excess of 100 components.  The components were to be written and owned by development teams located throughout the world: two sites in UK, France, Israel & India (Note the language and cultural dynamics!). The CDI team was located in Korea.  The architecture was very complex, and was owned by a team of 20-30 architects, with a team of 3 chief architects controlling the overall decision making process. We referred to those 3 elite techno gurus as the Trinity. So IMHO, we have all the ingredients for a LSSDP...In fact, I think the method of Agile implemented, although not purist in its form, was very disciplined and proved itself well in managing this LSSDP project...

Even in the R&D phase, this internal product development project was considered large-scale: ticking boxes of complex architecture and geographically dispersed teams.  On landing the first customer win, Fusion switched focus to becoming both a development and delivery project (the Middleware was not complete, but we had to submit to project urgency). The customer (BSkyB) being quite bullish and demanding in its launch targets, resulted in an overdrive focus.  The Project teams grew and expanded, new component teams were created, methodologies were introduced to control multiregion planning, management, development and integration. In a relatively short period of time the core development team scaled from 50 to 250 people. Two years later, there would be between 300-700 people all engaged on project deliveries (for major customers) all on the Fusion platform (that was still maturing).

In the next post I'll go deeper into the organizational structures - but for now, let's summarise how Fusion meets the requirements for LSSDP:
- Complex Architecture: 100+ components, new technologies, strict architecture rules
- Trinity of Chief Architects - Geographically Distributed Globally
- Group of System Architects (20-30+) - Geographically Distributed Globally
- Development & Integration Team (250+) - Geographically Distributed Globally
- Project Management Team (10+) - Geographically Distributed Globally
- Product Management Team (5+) - Geographically Distributed Globally
- Financial Investment (Tools, Infrastructure, People) - Circa $100 million or more
- Delivery Pressures - Customer Launch delivery in 2-3 years

Do you agree this is an LSSDP?? Read Part 2: Organisational Challenges of LSSDP Agile Projects...

Monday, 7 May 2012

Managing Large-Scale Projects using Agile, Part 0 - Introduction...


A few years back, I updated my Resume (CV) and as part of the headline included the text "experience with large-scale, geographically globally distributed-development projects spanning 200+ people, using Agile principles, multi-project" and was rightly challenged by a colleague as to exactly what message those words were actually trying to convey.  Of course, CVs are supposed to contain all the attention-seeking buzzwords, but it really did sound and read like a mouthful; most people would not understand what I hoped it would mean, and it sounded too arty-farty and dressed up. The advice was "Please remove it, it makes no sense" - and this was coming from a guy I hold in great esteem, one of the best software system architects I know (and worked with closely); and so I removed that sentence from existence...until now.

In the next series of posts I aim to describe my experiences from a very large-scale software development project, that was not only based on agile principles in general but also adopted a highly structured & rigorous approach as well. Whilst I was not part of the original group that conceived and instigated the processes & framework foundations, I joined the team to take over the mainstream project management (relieving the founding team to concentrate on product management and future customers) to help get the first major product release delivered. This was at a stage when the founding management team had already experimented with various agile strategies twelve months into the project and, having made numerous mistakes and learnt from them, were just transitioning to their third attempt. That attempt ended up proving itself as the final methodology, and was used for the subsequent two years with only minor, natural evolutionary updates to accommodate new needs and demands as the project evolved through development into delivery & final release phases. However, the underlying fundamental key principles were solidly embedded and continue to work reasonably well to this day IMHO. So I ended up being an executor of the process for a period of just over two years, and then joined the product management team in promoting the methodology for new projects, providing teaching, training & coaching to other projects as well as introducing new tools and concepts, until I figured I'd seen enough and moved back into mainstream software development.

However, in those solid 3+ years of working on that programme, I learnt quite a bit about the theory of Agile versus appreciating and adopting much of the spirit of Agile, taking into account organisational & project maturity, structures and processes, theory vs practice, resistance vs acceptance, people vs management, processes & controls vs fluidity & flexibility, not forgetting customer focus vs ideal product management (customer always comes first).  So now I'd like to communicate the guiding principles that made and continue to make the above-mentioned endeavour (could this become a model for other companies perhaps?), which the industry refers to as "Large-scale Software Development", work in the real world:  where a truly geographically dispersed, distributed multidisciplinary team of developers, architects, integrators and testers, project & programme management teams are all synchronised on a common development & delivery methodology having its roots in Agile -- and are actually sustaining high profile, time-critical, multimillion dollar projects of an exceptionally high quality and complex nature, including the challenges of using just one single codebase to support multiple independent customers -- a case study in Distributed Development using Disciplined Agile Methodologies for a Digital TV Set-Top-Box Software Project.

The underlying reason behind this post however, like all my previous posts, is a focus on Digital TV projects specifically around Set-Top-Box UI EPG & Middleware development initiatives, where increasingly organisations are embarking on implementing Agile end-to-end, even on my current project which I've inherited, Agile had been chosen as the development methodology.  Don't get me wrong, I am all for agile, I respect its philosophy; as someone who loves autonomy myself, I wouldn't have it any other way.  But nevertheless, as I've written in the past, Agile can go very wrong if being implemented for the very first time.  This is why (and this is not just my opinion, see this report), Agile should not be chosen on a mainstream project that is the company's crown jewels, instead, the recommendation has generally been to start with a pilot programme that you can afford to throw away, fail and start over, expect to fail many times before getting the ingredients just right....but I digress...getting back to the purpose behind the next series of posts:

I have seen a model of Agile (Disciplined Agile Delivery) implemented successfully in the real world on a large-scale development & delivery project - and would like to share this experience with others as I think it would be useful & helpful, at the very least can be used as input material if you're considering adopting Agile in your own DTV projects.

I've broken down the concepts into several themes/parts - I'll update with links to each post as each is published:
And finally, if you have read (or end up reading) all the parts of this topic; and would like to pursue this discussion further with me one-on-one, I am always available to discuss possible consulting opportunities - so please do get in touch!

Wednesday, 11 April 2012

Project RAG Status Conventions


Large-scale projects involving multiple streams of work (sub-projects) most of the time require a way of communicating the essence of progress updates without getting into the details. This is especially relevant if your audience is senior managers and above who have little time to appreciate the detail and want a bird's-eye view of the status of, say, the key high-level milestones being delivered by the various work-streams.  It is often the responsibility of the Project/Program manager to feed this information in a format that is visually appealing and simple; this format is often referred to as a RAG status report:

R A G  =  Red  Amber  Green
This is a convention that project management use to summarise the status of work, largely around the following areas (which by the way is what I'm using currently in managing my own projects):

  • Project / Task Status 
    • Are we on track / target / according to plan? Green
    • Are we on track but have highlighted risks that could impact the original plan? Amber
    • Have we missed our original targets? Amber
    • Are we so screwed that the project launch / completion is delayed? Red
  • Issue Status
    • Is this a top priority Showstopper Critical issue? Red
    • Is this a medium priority issue that can wait a week to resolve? Amber
    • Is the issue resolved and closed? Green
  • Risks
    • Does this risk rank in top 5 (High Probability/Impact)? Red
    • Does this risk rank between 6-20 (Medium Probability/Impact)? Amber
    • Is the risk for completeness, to be monitored but very low probability/Impact? Green
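
The Project Status and Risk conventions above can be expressed as a couple of small rules. The following Python sketch is only one hypothetical encoding: the names `Rag`, `project_status` and `risk_status` are mine, and treating every risk ranked below 20 as Green is an assumption based on the "to be monitored" description.

```python
from enum import Enum


class Rag(Enum):
    RED = "Red"
    AMBER = "Amber"
    GREEN = "Green"


def project_status(launch_delayed: bool, missed_targets: bool, risks_flagged: bool) -> Rag:
    """Map the Project / Task Status questions onto a RAG colour."""
    if launch_delayed:
        return Rag.RED      # launch / completion is delayed
    if missed_targets or risks_flagged:
        return Rag.AMBER    # original targets missed, or on track with highlighted risks
    return Rag.GREEN        # on track / according to plan


def risk_status(rank: int) -> Rag:
    """Map a risk's position in the ranked risk register onto a RAG colour."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    if rank <= 5:
        return Rag.RED      # top 5: high probability/impact
    if rank <= 20:
        return Rag.AMBER    # ranks 6-20: medium probability/impact
    return Rag.GREEN        # assumption: anything lower is monitored only


if __name__ == "__main__":
    print(project_status(launch_delayed=False, missed_targets=False, risks_flagged=True).value)
    print(risk_status(3).value, risk_status(12).value, risk_status(40).value)
```

Writing the rules down this explicitly, in whatever form, is what makes it possible for the whole team to review and accept the convention before the first report goes out.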
Generally the RAG status is meant to communicate at a high level to senior management the status of various streams of work. This is well known in the project management knowledge-base, although the actual usage and value of the RAG does vary.  Used correctly, the RAG report can be invaluable: I've seen RAG statuses used in real-world projects not just for reporting purposes, but also to trigger active management intervention that saved projects, rather than to apportion blame.  This is not always possible, hence some people believe the RAG report doesn't make much sense because it triggers actions too late. However, if the convention is designed such that problems are anticipated in advance, and you have the support of the senior stakeholders, and you have frequent RAG updates, then it is in fact possible to provide valuable output from the RAG report...

As highlighted in my previous post on the importance of defining clear defect severity and priority definitions to prevent ambiguity and confusion amongst the entire project team, the RAG status conventions need to be defined and accepted in pretty much the same way.  The RAG status is a sensitive topic, depending on your audience, you have to be careful how the RAG is applied.

Here are a few example references:

Real-world Examples
Speaking from experience again, the RAG status can be a source of contention. Whatever you do, make sure you review the terms and conditions of the RAG conventions before publicising your report. Whether you're a vendor delivering software components to your customer, or you're just simply reporting to your executive committee for your project, care must be taken to ensure your interpretation of the status accurately reflects the status as you best understand it. Sometimes you have to temper the status, keeping an internal RAG for your internal management, and an external RAG to soften the message upwards to the key stakeholders.  

Some might argue this is insane, that we need to be as transparent as possible and that the stakeholders must appreciate the detail, otherwise what value is the report if we're hiding behind a wall of misrepresentation, putting up a farce??

I am very much a strong-willed, principle-minded person who pretty much stands by the transparency argument, but I've been burnt in the past and had to succumb to the wishes of senior management NOT to be so transparent. As a vendor, you want to promote a sense of order, that you've got things under control. Present a sense of calm, even though in the background you're frantically trying to put out the fire - in other words, when reporting back to the customer Never Ever Report a Red RAG status!!  Unbelievable, I know, but it actually causes less headache and keeps the customer off your back!

But what if this is an internal project and your stakeholders are CEOs of your own company who are too busy to get involved with the detail? What if you've been clear and honest in the reporting, and are following the conventions to the letter and are reporting on the RAG convention as published? What if all the milestones are Red?? Should you hide this from the Exec Committee?? Again, my initial reaction is No, absolutely not.  But as a seasoned manager, one has to gauge the culture of the organisation, taking appropriate actions to manage both upwards and downwards.  Recently though, I did provide a report that was flagged as Red for pretty much the whole programme, this didn't go down so well.  I am now thinking about tempering the format yet again, to avoid the avalanche from above...

However, the purist in me really shares the same view as this post "Starting Green", which proposes that the RAG actually be turned on its head: start off with Red and work your way towards Green.  Really interesting experiment to try, but you have to have an open-minded organisation to be willing to give this a go!

The reality, though, is that the RAG status is quite entrenched in classic management's mindset: how can a project start off in the Red? Didn't you do enough planning? We wouldn't have baselined the plan unless the project managers had produced the initial baseline plans using best/worst/realistic estimates, so the projects/tasks must definitely start off in the Green state??

My Own Templates that Seem to Work
As always I'd like to share some of the stuff that has proved to work for me, and that I'm currently using in my day-to-day activities (hopefully these are all self-explanatory):
Generic RAG for Internal - External Translation (Notice NO Red to External Players)
Download PDF

Example of Sub-Project RAG for Multiple Workstreams
Download PDF

Component Software Development per User Story / Work Package
Download PDF

Other Formats
I've seen other RAG / Traffic light formats being used before.  Typical SAP projects use the Up / Down / Right icon graphics to signal progress:
  • Vertical Up Arrow - Amber
  • Vertical Down Arrow -  Red
  • Horizontal Green Arrow - Green (Amber is also used if tasks are on Target with some associated Risk)
There are also others that use primarily the colours to signify severity, but also include arrows to forecast/predict the status anticipated for the next update:
  • Horizontal Arrow - Status anticipated to be the same next report
  • Down Arrow - Status anticipated to be down next report (e.g. Amber next report from being Red this report)
  • Up Arrow - Status anticipated to be up next report (e.g. Amber next report from being Green this report)