
Saturday, 14 September 2024

How I used ChatGPT o1-preview to build a rough interactive software engineering resource forecasting model

Seed Prompt for tool (click here to play with it): I am a software engineering manager. A large part of my job is understanding the engineering resource utilization for my engineering team. I manage multiple engineering teams of varying sizes from 5 people to 20 people in a team. When estimating software projects, understanding availability of productive engineers is important for planning purposes. Help me build an interactive resource forecasting tool, that can be fully self-contained within browser app, embedded in a google blogspot page. The tool must include inputs for: 1\ Funded team size; 2\ Current Available engineers (i.e. actual current size of team). The tool must account for the time it takes for hiring new engineers, ramp-up time, learning, etc. before they can be deemed as productive for committing to project work. The tool must show a table the resource plan over 52 weeks in the year, and include a visual graph to show how the team size grows to meet funded size. Research the best methods in software engineering management, and create the tool that can help software managers have a good model for resource utilization. The tool can store information in the local user browser storage as needed. Remember, I need the full code to be fully self-contained so I can run the tool from within a google blogspot page.

Prompt for this article: Great I think we have good baseline version to start with. Now using your PhD capabilities, write up an accompanying blog post that details the model, the thinking behind it, and call out the limitations using a critical eye, suggesting improvements for the next version.

Prompt for Image using Dall-E: "An isometric illustration of a diverse engineering team collaborating in a modern office space. They are gathered around a large table filled with charts, graphs, and laptops, discussing and planning. Behind them, a wall-sized calendar and a whiteboard display timelines, Gantt charts, and upward-trending graphs. Include elements like gears and clock icons subtly integrated into the design to symbolize engineering and time management. Use a professional, clean art style with vibrant yet muted colors to convey a serious but engaging atmosphere."

Engineering Resource Forecasting: A Comprehensive Model for Team Planning

By ChatGPT o1-Preview (took about 4 hours to get the tool to version 1)

Effective resource planning is crucial for engineering teams aiming to meet project deadlines, optimize productivity, and manage budgets efficiently. The Engineering Resource Forecasting Tool is designed to help managers and team leads forecast their engineering resources over a year, considering various real-world factors such as hiring rates, attrition, ramp-up times, efficiency, leave days, and public holidays.

In this blog post, we'll delve into the model behind the tool, explore the thinking that shaped its development, critically analyze its limitations, and suggest improvements for future iterations.

Understanding the Model

1. Objectives of the Tool

The primary goals of the Engineering Resource Forecasting Tool are to:

Forecast Team Size: Predict how the engineering team will grow or shrink over time.
Calculate Effective Capacity: Determine the actual productive capacity of the team, considering factors that reduce availability.
Assist in Planning: Provide insights for hiring strategies, resource allocation, and project scheduling.

2. Key Components of the Model

The model incorporates several essential elements:

Funded Team Size: The maximum number of engineers that the budget allows.
Current Available Engineers: The starting point for the forecasting, representing the existing team.
Hiring Rate and Time: The rate at which new engineers are hired and the average time it takes for them to join.
Ramp-Up Time: The period new hires require to reach full productivity.
Attrition Rate: The expected percentage of engineers who will leave the team annually.
Efficiency Factor: The proportion of time engineers spend on productive work.
Leave Days and Public Holidays: Time when engineers are not available due to vacations and holidays.

3. Model Calculations

a. Effective Engineers

The number of effective engineers is calculated weekly by adjusting the total number of fully ramped-up engineers for efficiency and availability:

Effective Engineers = Total Ramped-Up Engineers × Efficiency Factor × (Weekly Available Days / 5)

b. Weekly Available Days

Engineers aren't available all five working days each week due to leave and public holidays:

Weekly Available Days = 5 - ((Annual Leave Days + Public Holidays) / 52)
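To make the two formulas above concrete, here is a minimal Python sketch that evaluates them. The function names and the input values (25 leave days, 12 public holidays, 10 ramped-up engineers, 80% efficiency) are illustrative assumptions, not values taken from the tool.

```python
def weekly_available_days(annual_leave_days: float, public_holidays: float) -> float:
    """Average working days per week once leave and holidays are spread over 52 weeks."""
    return 5 - (annual_leave_days + public_holidays) / 52


def effective_engineers(ramped_up: float, efficiency: float, available_days: float) -> float:
    """Ramped-up headcount scaled by efficiency and by the fraction of the week worked."""
    return ramped_up * efficiency * (available_days / 5)


# Illustrative inputs only (assumptions, not the tool's defaults).
days = weekly_available_days(annual_leave_days=25, public_holidays=12)
capacity = effective_engineers(ramped_up=10, efficiency=0.8, available_days=days)
print(f"{days:.2f} available days per week, {capacity:.2f} effective engineers")
```

With these inputs, ten ramped-up engineers translate into roughly 6.9 effective engineers per week, which shows how quickly efficiency and time off erode nominal headcount.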

c. Attrition Modeling

Attrition reduces the team size over time:

Weekly Attrition Rate:

Weekly Attrition Rate = Annual Attrition Rate / 52

Cumulative Attrition: The total number of engineers who have left the team up to a given week.
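As a rough illustration of the attrition arithmetic, the sketch below converts an annual rate to a weekly one and keeps a running total of expected departures. It assumes the weekly rate is applied to a fixed headcount, which is a simplification of my own; the tool may apply it to the changing team size instead. The function names and sample numbers are hypothetical.

```python
def weekly_attrition_rate(annual_attrition_rate: float) -> float:
    """Annual rate spread evenly over 52 weeks (a linear approximation, not compounding)."""
    return annual_attrition_rate / 52


def cumulative_attrition(headcount: float, annual_attrition_rate: float, weeks: int) -> list:
    """Running total of expected departures, week by week.

    Assumption: the rate is applied to a fixed headcount every week.
    """
    rate = weekly_attrition_rate(annual_attrition_rate)
    total = 0.0
    running = []
    for _ in range(weeks):
        total += headcount * rate  # fractional engineers: an expected value, not a head count
        running.append(total)
    return running


# Illustrative inputs only: 20 engineers and 10% annual attrition.
departures = cumulative_attrition(headcount=20, annual_attrition_rate=0.10, weeks=52)
print(f"Expected departures after 52 weeks: {departures[-1]:.2f}")
```

The fractional values are expected departures rather than whole people, so they are best read as a planning average.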

d. Hiring and Ramp-Up Process

Hiring Pipeline: Represents engineers who have accepted offers but haven't started yet.
Ramping Engineers: New hires who are in the process of ramping up to full productivity.
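The two stages above can be sketched as a pair of fixed-length queues that each hire passes through in turn. This is a simplified illustration under my own assumptions (a constant weekly offer rate, fixed hiring and ramp-up delays, no attrition), and every name and parameter in it is hypothetical rather than taken from the tool's code.

```python
from collections import deque


def simulate_headcount(funded, current, hires_per_week, hiring_weeks, ramp_weeks, weeks=52):
    """Week-by-week headcount, moving hires through pipeline -> ramping -> ramped-up.

    Assumptions: hiring_weeks and ramp_weeks are at least 1, and attrition is ignored.
    """
    # Each slot holds the engineers due to leave that stage after that many more weeks.
    pipeline = deque([0.0] * hiring_weeks)
    ramping = deque([0.0] * ramp_weeks)
    ramped = float(current)
    rows = []
    for week in range(1, weeks + 1):
        # Engineers who finish ramping this week become fully productive.
        ramped += ramping.popleft()
        # Engineers whose start date arrives move from the pipeline into ramp-up.
        ramping.append(pipeline.popleft())
        # Make new offers only for the unfilled part of the funded headcount.
        committed = ramped + sum(ramping) + sum(pipeline)
        new_hires = min(hires_per_week, max(0.0, funded - committed))
        pipeline.append(new_hires)
        rows.append({"week": week, "pipeline": sum(pipeline),
                     "ramping": sum(ramping), "ramped": ramped})
    return rows


# Illustrative inputs only.
rows = simulate_headcount(funded=10, current=8, hires_per_week=1,
                          hiring_weeks=2, ramp_weeks=3, weeks=10)
for row in rows:
    print(row)
```

With those inputs, the first offer goes out in week 1, that engineer starts ramping in week 3 and counts as ramped-up in week 6, which is the kind of delay the model is trying to make visible.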

e. SDE-Weeks and SDE-Days

SDE-Weeks: Sum of effective engineer contributions per week within a month.
SDE-Days: Sum of effective engineer contributions per day within a month.

These metrics provide a more granular view of the team's productive capacity.
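One possible way to roll weekly values up into monthly SDE-Weeks and SDE-Days is sketched below. The week-to-month mapping and the five-days-per-week conversion are my assumptions, since the post does not spell out how the tool assigns weeks to months; the function name is hypothetical.

```python
def monthly_sde(effective_by_week):
    """Aggregate weekly effective-engineer values into 12 monthly totals.

    Assumption: week w (0-based) falls in month (w * 12) // 52, so months get 4 or 5 weeks.
    """
    sde_weeks = [0.0] * 12
    for week, effective in enumerate(effective_by_week):
        month = min(11, (week * 12) // 52)
        sde_weeks[month] += effective
    # Assumption: one SDE-week is five SDE-days, because leave and holidays
    # are already netted out in the effective-engineer figure.
    sde_days = [w * 5 for w in sde_weeks]
    return sde_weeks, sde_days


# Illustrative input: a flat 6 effective engineers for all 52 weeks.
sde_weeks, sde_days = monthly_sde([6.0] * 52)
print(sde_weeks[0], sde_days[0])
```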

Thinking Behind the Model

1. Realistic Representation

The model aims to mirror real-world scenarios by considering:

Time Delays: Hiring and ramp-up times introduce delays between initiating a hire and gaining full productivity.
Non-Productive Time: Leave, holidays, and non-productive activities reduce actual working time.
Attrition Effects: Engineers may leave the team, impacting capacity and necessitating replacement hires.

2. Balancing Simplicity and Complexity

While the model incorporates multiple factors, it maintains usability by:

Using Average Rates: Distributes leave, holidays, and attrition evenly over the year for simplicity.
Assuming Linear Processes: Hiring and ramp-up are modeled as linear processes without accounting for fluctuations.

3. Providing Actionable Insights

By visualizing the forecast through charts and tables, the tool helps managers:

Identify Hiring Needs: Determine the hiring rate required to meet team size goals.
Plan for Attrition: Anticipate capacity reductions due to attrition.
Adjust Strategies: Modify inputs to explore different scenarios and their outcomes.

Limitations and Critical Analysis

Despite its usefulness, the model has several limitations:

1. Uniform Distribution Assumptions

Leave and Holidays: The model assumes that leave days and public holidays are evenly distributed throughout the year. In reality, leave may cluster around certain periods (e.g., summer vacations, year-end holidays), leading to temporary drops in capacity.
Attrition Timing: Attrition is modeled as a steady rate, but actual departures may occur unpredictably and in clusters.

2. Lack of Granular Ramp-Up Modeling

Binary Productivity: The model assumes engineers are either ramping up or fully productive after the ramp-up period. It doesn't account for gradual increases in productivity during ramp-up.
Individual Variations: All new hires are treated identically in terms of ramp-up time and efficiency, ignoring individual differences.

3. Hiring Constraints and Real-World Challenges

Hiring Rate Feasibility: The model calculates the hiring rate needed to meet targets but doesn't consider the practical challenges of achieving that rate, such as market conditions or recruitment resource limitations.
Candidate Pipeline Variability: It doesn't account for fluctuations in candidate availability or acceptance rates.

4. Simplified Efficiency Factor

Static Efficiency: The efficiency factor is constant over time, not accounting for potential improvements or declines due to training, process changes, or team dynamics.
Team Composition: It doesn't differentiate between engineers with different skill levels or specializations.

5. Absence of Project Dependencies

No Task Modeling: The model focuses on resource capacity without linking it to specific project tasks or timelines.
Overtime and Workload Peaks: It doesn't account for periods when engineers might work overtime or when workload exceeds capacity.

6. Economic and External Factors

Budget Changes: The funded team size is assumed constant, ignoring potential budget cuts or increases.
External Events: Unforeseen events like economic downturns, pandemics, or natural disasters are not modeled.

Suggestions for Improvement

To enhance the model's accuracy and applicability, future versions could consider:

1. Incorporating Seasonal Variations

Variable Leave and Holiday Distribution: Model leave and public holidays based on historical patterns to reflect periods of low and high availability.
Attrition Peaks: Adjust attrition rates to account for known periods of higher turnover.

2. Enhanced Ramp-Up Modeling

Gradual Productivity Increase: Implement a ramp-up curve where productivity increases incrementally over the ramp-up period.
Individualized Ramp-Up: Allow different ramp-up times for different roles or experience levels.
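A gradual ramp-up could be sketched as a simple linear curve, shown here next to the all-or-nothing treatment noted under the limitations. The linear shape, the function names and the 12-week ramp are illustrative assumptions; a real team might prefer an S-curve or role-specific values.

```python
def ramp_productivity(weeks_since_start: int, ramp_weeks: int) -> float:
    """Fraction of full productivity, rising linearly from 0 to 1 across the ramp-up period."""
    if ramp_weeks <= 0:
        return 1.0
    return min(1.0, max(0.0, weeks_since_start / ramp_weeks))


def binary_productivity(weeks_since_start: int, ramp_weeks: int) -> float:
    """All-or-nothing alternative: no contribution until ramp-up is complete."""
    return 1.0 if weeks_since_start >= ramp_weeks else 0.0


ramp_weeks = 12  # illustrative value
linear_total = sum(ramp_productivity(w, ramp_weeks) for w in range(ramp_weeks))
binary_total = sum(binary_productivity(w, ramp_weeks) for w in range(ramp_weeks))
print(f"Engineer-weeks during ramp-up: linear={linear_total:.1f}, binary={binary_total:.1f}")
```

Under these assumptions a single hire contributes about 5.5 engineer-weeks during a 12-week ramp-up with the linear curve, versus none with the binary treatment, so the choice of curve materially changes early-year capacity.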

3. Realistic Hiring Constraints

Recruitment Capacity Limits: Introduce maximum hiring rates based on recruitment team capacity.
Candidate Pipeline Dynamics: Factor in time-to-fill variations, candidate drop-off rates, and competition in the job market.

4. Dynamic Efficiency Factor

Efficiency Over Time: Allow the efficiency factor to change over time, reflecting process improvements or team fatigue.
Role-Based Efficiency: Differentiate efficiency factors for various roles or seniority levels.

5. Integration with Project Planning

Task-Level Planning: Link resource capacity to project tasks and milestones to assess whether capacity meets project demands.
Risk Modeling: Incorporate risks and uncertainties that could impact timelines and capacity.

6. Economic Scenario Planning

Budget Flexibility: Allow for adjustments in the funded team size based on potential budget changes.
External Shocks: Introduce scenarios for external events that could affect hiring, attrition, or productivity.

7. User Input Flexibility

Custom Time Frames: Enable users to adjust the time frame of the forecast beyond a single year.
Data Import: Allow users to import historical data to better tailor the model to their organization's patterns.

Conclusion

The Engineering Resource Forecasting Tool provides a solid foundation for planning and visualizing engineering team capacity over time. By accounting for critical factors like hiring, ramp-up, attrition, efficiency, leave, and holidays, it offers valuable insights for managers.

However, like any model, it simplifies complex real-world dynamics to remain practical and user-friendly. Recognizing its limitations is essential for interpreting the results accurately and making informed decisions. By incorporating suggested improvements, future versions can offer even more robust and precise forecasting, aiding organizations in navigating the challenges of resource planning.

About the Author

[Your Name] holds a PhD in [Your Field] and has extensive experience in project management, resource planning, and organizational development. Passionate about leveraging data-driven models to solve practical problems, [Your Name] contributes to advancing tools and methodologies that enhance operational efficiency.

Note: The Engineering Resource Forecasting Tool is intended as a planning aid and should be used in conjunction with professional judgment and other planning tools. Always consider your organization's specific circumstances when making staffing decisions.

Wednesday, 14 June 2023

A blast from the past: my experience building a large-scale tech platform

From 2003 to 2011, I worked for a pure technology service provider, NDS (acquired by Cisco in 2012, and later became Synamedia), which was considered at the time the world leader in end-to-end digital TV software systems. I was fortunate enough to experience, as an engineer, every major area of platform development for this complex ecosystem. Later, as a software manager, I owned the software delivery for a core piece of the software stack known as "middleware" for NDS's primary anchor customer, BSkyB/Sky Darwin, and then the full-stack delivery of NDS's flagship Mediahighway Fusion/Unity product. This experience marked my entry into very complex, large-scale technology delivery initiatives. Even to this day, thirteen years later, as I work with the world's largest cloud provider, Amazon AWS, building out its enterprise cloud support systems (AWS Support Center / Technical contact systems), Fusion still takes the prize for the most intense professional experience, learning and growth, technical complexity, risk and high-stakes projects. So yeah, I find myself having to dig deep into my memory to recall this work experience. It's funny that 13 years on, I'm encountering the same topics of engineering management even though it is supposed to be a different domain. Turns out "software is just software"!

NDS had captured almost every top-tier PayTV operator around the globe at the time: Sky, DirecTV, UPC, Sky Italia, Sky Deutschland, Foxtel, Sky LA, Yes, Bharti, etc. NDS was prominently known for its conditional access product, a video content protection system called NDS Videoguard. However, NDS offered more than just security: it offered customers a fully vertically integrated ecosystem (think "Apple" ecosystem for PayTV customers). Whilst digital TV was built on open standards and interoperability, most customers limited their integration points. So when they opted for NDS as their security provider, they also had the option of integrating all other services - from broadcast backend services in the headend to consumer device hardware development and software service integration with chipset vendors. The consumer device software was known as TV Middleware. At the time, the main players were NDS Mediahighway, OpenTV & TiVo. NDS was known for convincing customers to migrate to NDS Mediahighway; its technology migration programs were demanding, complex and executed flawlessly. As an engineer, I contributed software to replace TiVo, an overnight win for 40 million devices. Later, as a software delivery manager for the Sky Darwin migration project, we would replace OpenTV software, almost obliterating its presence from Sky, save for a few ancient, ageing hardware profiles.

NDS, with an increasing number of customers using its security, middleware and application services, couldn't afford to scale out with engineering teams for each custom build. A platform strategy was needed, consolidating the best of software from across the globe (US, UK, India, Israel, France) into a new shared technology stack that offered flexible customisation and tailoring for any type of customer profile (from Tier-1 customers like Sky with advanced applications to Tier-3/4 customers in territories just starting off with basic digital TV), using a shared engineering resource pool and an extensible configuration engine for producing tailored custom releases. So was born NDS Mediahighway Fusion.

The flagship customer for Fusion was Sky, which went live in 2010, replacing up to ten variants of its consumer device software services with new Fusion components and Sky's own custom-developed consumer application "EPG", known then as the "Orchid EPG". Fusion provided an SDK/API for customers to develop their own primary applications, along with an interactive HTML engine that allowed PayTV operators to add additional mini apps to their devices, like games and weather apps. With Sky being the anchor customer, Fusion had proved itself in the market and thus was ready to onboard new customers like Sky Italia, UPC, Foxtel, Yes, etc. Post Darwin launch, I took the lead for building the new platform vision, called Fusion Snowflake EPG, through project Sunrise - birthing the platform that would create custom, tailorable configurations for any customer, maximising reuse and minimising customisation but allowing for a selection of custom user experiences.

Why am I claiming Fusion as large-scale (even in 2023, 13 years later)?

I write this in 2023, after spending 2.5 years with Amazon AWS. I am part of the group that builds AWS Support Center and related Contact Center services. We are a team of under 100 people, deemed large-scale and building complex systems. Yet, if I have to be brutally honest with myself, I'm mildly impressed by my exposure to date, because my current work pales in comparison to my work on Fusion, 13 years ago. Yes, I know it's a different domain, a different paradigm and culture of Amazon's 2-Pizza team model for software product ownership (which I actually find quite cool)... still, I'm finding it hard to rationalise my move to AWS almost 2.5 years on. Have I gone too far backwards? Am I living too much in the past & not ready to view things from a new perspective? What am I not seeing? (Topics for another post). So whilst I've definitely adapted my mental models since joining Amazon, I really can't ignore some software engineering truths, which is the reason for my bringing up the past now.

In 2012, I wrote the first story about Fusion, introducing the term LSSDP I coined to mean Large Scale Software Development Project. I also dived deep, writing lengthy white papers about the product and engineering management processes:

Fast forward to 2023, now using my Amazon AWS experience as a lens for defining a large-scale initiative and indirectly checking engineering manager role guidelines for large-scale:

Business Impact - Fusion started off with a $75 million investment and later a joint venture with the flagship customer, Sky. The entire company pivoted to focus on Fusion as its next-generation software platform, with up to 3000 engineers worldwide working on multiple streams, some strategic foundational streams kicking off at least 2 years before the mainstream program. In my role as software delivery owner for the Sky Darwin project, it was critical that the project delivered successfully, flawlessly - as it involved migrating software in 10 million people's homes (their living room TVs) seamlessly with no rollback. The end customer (the person sitting at home watching TV) would notice very little change to their experience. Overall, Fusion software components were delivered to multiple middleware stacks; when I departed NDS in 2011, our software was running in more than 60 million people's homes daily, globally.

Scope and Size - Fusion introduced a new paradigm of the TV software ecosystem, end-to-end, including broadcast headend components as well as embedded software architecture. The stack was open, based on Linux/POSIX, and a complete departure from the initial decade of TV software operating systems. This was before the advent of Android TV or fully open source middleware. Fusion's product backlog captured over 2000 epics in the form of work packages, cutting across multiple customer needs, in parallel. The scope included all layers of the device software stack: chipset drivers, hardware abstraction layer, Linux kernel, Linux abstraction, middleware services, application SDK/APIs, and multiple frontend application engine proxies for C / C++ / Java / HTML / Flash applications. Take a look at the software architecture diagram - it is multi-layered, with multiple service teams. Another point on scope: we managed initiatives or epics in the form of work packages (WPs), and a single WP could impact up to 25 service teams, see here.

Product Plan visuals - concepts & examples from real-world programs

I recently wrote about my role as project leader for the original DStv Explora consumer device launched in 50 territories across the African continent from 2012-2014. In this post, I will share some visual tools I used to communicate the planning and release strategy. Suffice it to say, I am a big fan of visual planning tools over detailed text narratives any day. There is power in visualizing the plan, on a single piece of paper that beats reading pages of text.

The launch is when the work actually starts

Here's a sample of a post-launch plan that mashes big-picture milestones for executives whilst providing enough detail to software delivery and integration owners. With this single piece of paper, managers can use this schedule as their primary map to navigate their work plans.

Visualizing an end-to-end technology program on one page

Building a new consumer device such as a digital TV set-top-box, from the ground up, end-to-end is a large-scale program with many moving parts. The challenge is how to show as much high-level and low-level detail as possible, starting with output milestones and cascading to detailed team expectations like agile sprints. I can't claim to have authored this view from scratch since I borrowed concepts from my previous projects and other program managers I looked up to, when I worked with Sky/NDS in the UK.

The timeline below is a snapshot from the early days of Explora planning, where I was the primary plan owner and designer.

Below is a view with extra commentary showing business leaders the hotspots with the plan and calling to action for workstream owners:

For CEOs, I created much-simplified views since they weren't interested in the agile sprints:

A sample project charter for launching a consumer device end-to-end

In a previous post, I wrote about the importance of the project charter and the various forms it can take. In this post, I will go deeper and share a rather detailed sample of a project charter that I authored for a real-world project that involved launching a new original equipment manufacturer (OEM) consumer device, a digital TV set-top box, called the "DStv Explora" end-to-end. My role was overall program manager. My task was basically - to fix everything and get the project on track to deliver, owning the entire plan, directing and steering multiple businesses, project offices and engineering organisations. My stakeholders were C-suite from at least five firms. The project cost in the region of R2 billion ($200m).

By end-to-end, I mean all workstreams covering technology innovation & development, satellite infrastructure, device hardware engineering, device software engineering, infrastructure systems & software (including configuring satellites in space), application software development, software testing, end-to-end security, consumer field trials and go-to-market (finance, supply chain, marketing, communications, sales, customer support), launching in 50 countries in Africa, catering for the unique rules & regulations per country, bespoke marketing & comms plans and at times supporting a different look-and-feel brand per country. I insisted on project charters for each go-to-market stream for significant territories like South Africa, Nigeria, Kenya & Ghana. The structure for the business at that time was rather loose: even though there were centralised project offices co-ordinating major launches, we had to partner with in-country business owners for launch planning. I don't share these business charters here, as they're go-to-market workstreams one can find online (or ask ChatGPT).

The Explora project enabled me to make a significant impact on the company, the people and teams I worked with and, most importantly, opened my eyes to my potential as a prominent leader, boosting my confidence. Prior to the Explora, I'd just relocated from the UK, having worked with NDS for the previous ten years, building and releasing a variety of TV software services, working in engineering and answerable to a number of customer and account delivery managers, taking instruction from bigger program managers and owning a few core technical workstreams. With Explora, my position was elevated to the highest level, giving me the opportunity I had long sought: to learn everything there was to know about running and operating a PayTV business. For Explora, I not only enjoyed a bird's-eye view of the bigger picture - actually I created the bigger picture for everyone to follow. I was responsible for ensuring engineering teams were set up for delivery success end-to-end, instigating and driving the changes needed to ensure project success. My voice was heard. My opinions were listened to. My advice was heeded. I was granted autonomy once I'd earned the trust of all stakeholders involved (a post for another day).

As far as I can tell, many of the engineering disciplines I introduced for Explora still remain in place today, 10 years later. Here are some anecdotes as public endorsements on LinkedIn (so I'm not exaggerating my resume here, check out my LI profile recommendations for additional evidence):

Here's an email from Phil on the interventions I introduced in the last stretch of Explora launch:

Hi Muhammad

This one week cycle you have introduced on Explora is a stroke of genius. We have a lot to thank you for on this project, as you have saved our collective asses several times. I for one, really appreciate the quality and quantity of effort you put into supporting us. When we make it on the 1st August you should be able to look back on this project with a great deal of satisfaction. MCA is not the easiest place to bring order to, but you can't fault the guys on their commitment to making things happen:)

Have a great weekend my friend!
Regards Phil

Here's a video about the DStv Explora from 2013/14:

The project charter for Explora set the high bar for modelling all future consumer device launches. Following Explora, I would manage the launch of the DStv HD decoder, followed by software updates launching new features like Catch Up Plus for streaming over the internet, including remote recordings. I left the consumer device division in 2015, to spend the next year running the program for launching Showmax, a new streaming video platform business, end-to-end from zero to launch in 8 months. With these successful launches behind me, I'd developed enough credibility in the market to open up further opportunities for consulting, which later took me one step closer to reaching my original ambition of having a seat at the business table, the C-suite round table. Once I'd experienced that view, I decided to seek a new venture, risking starting over again (a future post, stay tuned).

Sample Project Charter

[Disclaimer: Please note I write about my past work and have permission to share my work experiences through my blog as was part of my contract with employers over the years. I've waited more than 10 years to share this particular work experience, the technology & business have moved on since then, such that this sharing is rather informational and can be seen as training material for engineering/delivery managers.]

A product roadmap visual depicting a single tech platform journey

I spent the first decade of my software engineering life building technology stacks for digital TV businesses for the likes of DirecTV, Sky, Multichoice DStv, Liberty, US Cable, etc - working at that time, for what was the world's leading Digital TV Technology Services company, NDS. We were in the business of selling full-stack embedded software (like Android / iOS SDK) tailored for set-top boxes (STBs), along with the backend infrastructure services needed for digitising, encrypting and transmitting TV signals over-the-[air/cable/internet] to these consumer device STBs, so people can basically watch TV. We would sell the tech stack along with a suite of TV applications that could essentially be tailored for any type of customer need, including changing different look-and-feel frontend/user experiences, configurable features like live recording or basic watching without recording, on-demand or internet streaming -- all without having to run multiple versions of software codebase per customer. In modern software parlance, we built multi-tenanted technology stacks in a way.

In 2010, we set out to harmonize these efforts by creating the NextGen version of the platform - taking the best of all customer engineering projects and core platform enhancements, and creating the next-generation stack, to scale to as many video entertainment providers across the world as possible. We called this initiative Project Sunrise, symbolizing a new dawn for the next-generation experience built on Fusion Mediahighway Advanced, offering a fully customizable Snowflake Unity UI experience.

Here's a short clip from 2011 on Snowflake:

Back then in 2010, I'd just come off delivering what was the biggest migration program in the history of the company, launching a new service for Sky - and thereafter landed another client build for UPC Horizon Gateway STB. In addition, we had 5+ other customers all lined up for new tenants! It was going to be a busy next few years indeed.

Our approach to building this technology stack included foresight from the very beginning. We were intentional about using a single stack, configurable architecture end-to-end, including customizable applications for custom experiences - to avoid rework and duplication, but most importantly quick delivery turnaround times. Our customers also benefited from leveraging features and capabilities they didn't have to pay for, because some other customers would have already funded the development anyway :-)

Technical Program & Product Management - Visualising the Roadmap

As our company was primarily a technology engineering company and a high-growth start-up, resources were constrained such that people took on multiple roles. I led the Sunrise project covering technical product and program management. I was responsible for creating the roadmap, backlog and overall sequencing, coordinating with multiple customer-delivery streams, along with the main core platform engineering deliveries - building the next-generation stack. The engineering activities were a mix of software integrations and application customizations through configuration, building out the default flagship application, that would come "out-of-the-box" for selling to prospective customers. The sales team would close the sale by signing off on the profiling customizations and configurations - and the Sunrise factory would eventually produce a release for the customer. This pattern of template-driven, profile-based software configuration approach was not new to us, but the technologies we used had changed over time (see the slide deck at the end). Project Sunrise was never a fully funded initiative though, so we had to partner with customer project teams and core platform engineering, being scrappy and inventive - but still ensuring we have a reference stack available, at all times, for new sales.

How did I communicate the Roadmap then?

I had a ton of detail to manage, spanning multiple customer requirements backlogs - working with teams across the globe, managing a unified backlog, understanding the features and gaps, prioritizing features for the base profile & then owning a delivery plan (which I'll expand upon in future posts). The one mechanism that earned the trust of senior leadership was a visualization I produced, that showed on a single piece of paper how all the streams fit together. Once the executives saw the roadmap, they then had an easy mental model to understand the complex pieces and stages of convergence - that went into building out the NextGen Sunrise platform.

I decided to write this blog piece today, 13+ years later because, it so happens, I find myself now again responsible for building a Nextgen product (V2), with V1 (single tenant) that is currently supporting existing customers with an active roadmap - and V2 targeting multiple-tenants onboarding new customers with their own specific configs/capabilities - and my team are considering ways of communicating the plan!

Check this out:

The above roadmap accomplishes the goal of showing the interplay between Customer engineering deliveries (above the Sunrise line) and simultaneously showing the core platform engineering feature deliveries (below the Sunrise line) - all contributing to the holistic platform called Sunrise. So the benefit is mutual: customers benefit from the platform core, and the platform in turn benefits from the customer deliveries. All of these deliveries are contained within a timeline that serves as the roadmap.

Revisiting this picture thirteen years later, I can still appreciate its value - a powerful visualization that will beat any detailed text narrative IMHO.

And here's our scrappy Project Kick-Off Charter

Oh, and here is the initial MVP we demoed on Sunrise:

Wednesday, 31 May 2023

Why I never ran a program without a Project Charter

Lessons on large-scale delivery program management ...

I continue to dig into my past artefacts to showcase my work portfolio. I'm using a multi-pronged approach here: 1\ Showcase my work to prospective employers; 2\ Openly share my work so that others (people I coach, my colleagues and boss, etc.) can benefit; and 3\ Act as my own living knowledge repository.

I spent a decade climbing up the project management ladder, in the same way I climbed up the software engineering ladder (from junior engineer to principal engineer) - I first started project managing small software product development (2-4 teams with 10 services), then scaled up to large middleware services (20+ teams, 50+ services) as lead delivery owner, then up to full stack systems integration (full stack of all major components: kernel, middleware, integration layers, applications), then program managed a full go-to-market product launch scaling out to include Tech, Business (Finance, Marketing, Supply Chain) & Operations (Customer Care, Retention, Content, Legal, Regulatory) - as senior program manager. I also owned the full plan of starting a business from scratch to launch (a video streaming company) in 8 months. I did a stint in management consulting, running the top 5 business projects for a $3 billion run-rate business, which some companies might call Tier 0/1 initiatives - where I co-ordinated these large-scale programs, as Chief Program Director - delivering through multiple business lines, multiple project management offices and multiple product and engineering teams. In a sense, I served as the CxO program manager, advisor and delivery owner.

It is with this experience and knowledge that I dare to share my work experiences - and I'm not making these things up - you can check my LinkedIn recommendations page for proof.

During my tenure as the lead program director mentioned above, I often found myself picking up and repairing distressed programs - and along the way, I'd help improve team processes and coach the management teams as well. I also ran new business & technology initiatives from scratch, start-to-finish-then-handover. So with this diverse experience, I developed a simple method that helped me navigate both types of program scenarios, either resetting or starting from scratch: the simple, powerful mechanism of a Project Charter document. To this day, I'm surprised to see many program & project managers failing to use the Project Charter in the way it was meant to be used (clarifying the essence), and I often find less-experienced, newly minted PMP/Prince2 certified professionals doing it "by the book". My approach to project charters went much deeper than that...

So what do I mean by using the Project Charter in "clarifying the essence" then?

The job of a seasoned, experienced project leader, chief program director, end-to-end project manager, senior technical program manager, etc - call the roles what you like - is, in my view, not just about putting a plan together, working backwards from a deadline or target completion date. No, I believe senior program leaders must apply their minds to appreciate the bigger picture and create a program structure that becomes the north star in guiding and leading multiple delivery teams. I never started a program without first establishing my project charter, which at the top level, focuses on the following:

Start by understanding the why. Why is this program needed? Why is it important?
Move on to understanding the who. Who are the sponsors, stakeholders and teams impacted? Who will be working on the program? "First who, then what"

A program manager must be sufficiently well-versed with all the roles expected from the program, and work hard to secure the roles needed. Yes, this means the program manager must escalate to get the people needed for the program (on the bus, as well as off the bus). A responsible program manager would raise all these risks & concerns up-front, before officially kicking off the program.

Clarify the what, including calling out what's missing - Set up the mental model for the program. What is this program about? What is it not about? What's in scope? What's not in scope? What workstreams make up the program? How do all pieces come together?
Agree, Align, Action - The 3 As of project execution involve agreement on the deliverable, alignment of all parties involved which includes acceptance of their workstreams and ultimately agreeing on the action plan to execute.

Project charters don't necessarily have to be communicated in a written document; a slide deck is more than adequate to communicate the essence. The right format depends on the business environment and culture: some business cultures prefer slide decks over detailed documents to save reading time, whilst others, like Amazon, insist on detailed text narratives. So a seasoned project leader must adapt their style to suit the particular business need & culture of the teams.

In this post, I'm sharing a version of a Project Charter as a slide deck. In a future post, I will share a detailed 50+ page project charter document for the launch of a consumer electronics device, a program that covered a mix of engineering, business and operations workstreams.

Example Program: Transform Digital Self-Service of a $3 billion run-rate business

I was called in to help reset and kickstart an overarching cross-cutting program to improve a selection of key metrics that would result in increased usage of digital self-service channels, improving customer satisfaction and overall reducing operational costs. This program covered the full value chain delivering the service: 3rd party technology vendors developing phone "mobi" apps using USSD, iOS/Android self-service app, Website, Payments, a hardware kiosk station, set-top box interactive application, integration with internal & 3rd party CRM/Billing systems - and resulting business workflows: finance, customer care & banking channels. Technology teams were spread between the CTO/CIO lines (3 IT pillars), and business teams reported separately to the CEO. The program also served the needs of Group Strategy, Risk & Regulatory. Bringing all these things together requires a steady hand, a tactful negotiator, a strategic and business mindset as well as a strong technology leader. This is why I enjoyed such challenges: these programs were never boring, and never limited to only tech/engineering.

Enough said, let the slides talk and let me know what you think in the comments!

Thursday, 23 March 2023

Sense making, apples v oranges, finding a path forward from multiple options by asking searching questions

I've been writing about my experiences as a technology executive when I was placed in the midst of uncertainty and high ambiguity that impacted both my personal and professional aspirations in a big way. Making the decision to leave my experiment into boutique management consulting behind, after building a solid reputation as a high-level program manager, and switching to a deeply technical role in a business unit that was going through disruption due to both external and internal forces -- was not an easy, straightforward transition to make. I experienced classic imposter syndrome (see this post). Nevertheless, looking back now, more than five years on, those experiences helped shape me into becoming more well-rounded, what some would call a diverse Business, Technology and Operations (BTO) executive - or - as Amazon calls it, a Strong General Athlete (SGA).

My writing this month is on communication methods, mechanisms and tools:

How should CTOs (engineering leaders / technology executives) communicate to all groups of stakeholders?
What tools of writing and visualisations to use?
How to use critical thinking and the art of reflection to deep dive on the technology strategy - calling out the good, the bad and the ugly?
How to dive deep to sense make by asking searching questions, that force upwards stakeholder management to engage in guiding the teams on strategy?
How to find a common ground and build bridges between two (perceived) competing technology organisations?

Questions & Answers Tree - Seeking Clarity from Executives

Let's recap the situation:

In 2017, I took on the role of CTO for an online video streaming technology platform. The business unit was part of a traditional satellite PayTV company that, by investing in a digital media division, created an online companion application to let its existing TV subscribers watch TV on the go, initially through web & mobile applications ("Delta" platform). Not long after this value-added service was created, about two years later, the parent investment company started up a new video streaming business ("Sierra" platform, born in the cloud like Netflix, with no attachments to traditional PayTV), completely independent from the existing PayTV business. The two businesses hardly interacted or shared common product, marketing or technology elements for the first two years. When I joined in 2017, there was talk about potential synergies and closer partnerships - which directed my three year turnaround strategy - to modernise Delta, closing the gap on Sierra, thus creating a comparable modern video consumer experience (Netflix was the bar). A year later, additional complexity and uncertainty came in when the parent investment company decided to unbundle its independent video businesses to allow itself to focus solely on e-commerce ventures. What happened? Naturally, the Sierra business was folded into Delta - creating a new business with two product & engineering organisations running in parallel: 2 CPOs, 2 CTOs - tasked to figure out what the future world could look like in creating a Delta 2.0 strategy.

As part of the interactions, still being the management consultant (at the time, I was regarded as independent without any affiliations to taking any sides - since I worked with all businesses before and had existing relationships with all), I helped the executives tackle their options.

The first one - let's understand the assumptions and questions that challenge assumptions. Can executives be clear about their end game? What is the vision? Why are you so caught up about the apparent duplication in tech platforms?

Here's the tree:

Comparing Apples to Oranges: The decision table view

When two engineering teams are challenged about their platforms doing essentially the same thing, especially when the ask comes from non-technical executives, very often engineering leaders become defensive and say "Ah, you can't compare apples with oranges, you must compare apples with apples". Whilst this might be technically accurate, this is not the way to manage communications with stakeholders. Part of a technology leader's job is to simplify technical and product capabilities, meeting your customers and stakeholders where they're at. Even if you feel a visual oversimplifies, you still need to tell a story, like the one I used to gain approval that cemented the Delta 2.0 roadmap:

Who knew that four years later, I would be digging out the same mechanisms to help AWS executives (in my first month, having barely completed onboarding, by the way) decide on a technology stack that my engineering team would need to build/deploy/operate - for providing calls / chats support as their technical call centre platform, servicing one of their highly-regulated, strictly controlled, private cloud partitions? See the decision table below, a similar oranges to apples story:

Tuesday, 14 March 2023

How to Visualise & Communicate Technology Migration 3 Year Plan

My previous post shared my journey into my early days as a CTO, taking you through my challenges and opportunities in a new role that came with high degrees of uncertainty. Despite the ambiguity, I had to not only come up with a 100 day plan, but also win hearts-and-minds all-round.

[Disclaimer: I write about my past work experiences, dating back to 2017, referring to entities that no longer exist today (in 2023). Previous mentions of such entities are widely in the public domain through news media outlets, press briefings, launch announcements, etc. I take time to ensure that nothing I share exposes commercially sensitive material. My intent is rather to showcase my work portfolio to current and future prospective employers, through my writing].

Despite the uncertainty with the mothership corporate's complete overhaul of its business operating models, we had to go on as business-as-usual. As such, I had to create a credible three year plan for the technology platform: Create future online video platform V2.0 that replaced the existing V1.0 (let's call this platform Delta) and combined the best of a standalone partner platform (Sierra).

Both platforms were roughly the same age, with Delta created to pretty much replicate the traditional PayTV's Set Top Box (STB) experience (via a Satellite broadcast) but over the internet and consumed through devices other than STBs (i.e. mobile phones, web browsers, game consoles, smart TVs). Delta wasn't a truly home-grown, native internet application, being tightly coupled to existing broadcast and satellite workflows. (To understand the world of difference between Internet Over-the-Top video and traditional Broadcast TV systems, refer to my white paper here).

Sierra, on-the-other-hand, was built from the ground-up, as a pure internet streaming platform, but only for on-demand content, like Netflix - but didn't provide live broadcast TV & thus didn't experience the realtime expectations that the Delta platform did.

Nevertheless, the business strategy was to create a more unified, homogenous viewing experience for the customer, simplify the multiple applications problem by providing the customer (end-users that consume video over the internet), a single container application to enjoy any type of media content, on-demand - i.e. xVOD. From a product and user experience perspective, this made perfect sense. For the technology platforms though, it meant choosing a path forward that involved a technology modernisation program: Build out the new platform, migrate as much as possible, reach parity and provide the new 2.0 experience.

The job of creating this technology strategy naturally falls on the CTO, me. To be a success, this initiative would need the full co-operation of 5 separate commercial entities, impacting the day-to-day work and operations of 500+ people, including 50+ resources from our partners.

How do I simplify a complex technology strategy & execution plan on a single piece of paper?

How do I ensure the picture shows all the key elements (workstreams) that tell a credible story?

How do I present and communicate progress overall to non-technical stakeholders?

How do I influence stakeholders & bring in a wider audience than just the engineering teams?

My One Page Project Scorecard

Back when I was an independent management consultant, I used to lead very large enterprise-wide programs that cut across multiple business units, each with its own project management office. My job was to lead, direct, coach and deliver through others, without myself having any hierarchical power - apart from referent power as my sponsors were the C-suite themselves. The job itself was interesting as I had to wear multiple hats: dive into the detail working with implementation teams whilst at the same time, be ready to communicate with my higher-level stakeholders, abstracting the detail. But if asked any questions, I had to have the answers for them, without deferring to the workstream owners.

Typically my programs would entail any number of workstreams, from ten to fifty. Some workstreams (or work packages) themselves would be considered programs in their own right. A program is a collection of multiple projects; a project is a unit of work, usually involving a single group, that delivers a series of tasks. I would be leading and executing through many program and project managers, as well as individual functional managers.

Over time, I'd developed my own mechanisms for structuring and managing these large-scale initiatives. One such mechanism is a simple project dashboard, on a single piece of paper, that shows the full map of all the initiatives, calls out the owners responsible and overall status - highlighting a call to action.

As a consultant however, my role was to guide, raise risks and mitigate as much as I could (within my scope of influence and control), and then escalate upwards for decisions outside my control. What's a consultant to do, eh?

Let me know what you think of this visual.

An example One Page Project Report from 2015: large-scale media workflows automation program

Thursday, 28 October 2021

Simplifying complex projects using visuals

In my previous work experience as a program director, I led very large programs consisting of hundreds of people distributed across geographies. I had multiple stakeholders to manage, mostly C-Suite folks that depended on me to simplify the details and present the essence of the program to them, so that they could make effective decisions in a timely manner. Whilst I managed the detailed project plans and task breakdowns with my team of project, program & engineering managers, I adopted varying styles of communications to suit my audience level.

To this day, even though I left program and project leadership behind some years ago and have since served as a SteerCo member myself, sponsoring projects and programs to deliver KPIs, I still prefer the art of simple visuals as a means of communication. A picture, presented in a way that directs a conversation, can be so much more efficient and powerful than reading lines and lines of verbose text, IMHO.

Sort of a #throwbackthursday post: I came across these old visuals I used on a project SteerCo six years ago. This particular program was pretty tight, with very little buffer contingency, executed in under six months: a major upgrade and launch of a feature end-to-end, with hundreds of people contributing. The SteerCo group consisted of a dozen executives. The detailed project plan cut across 20 workstreams, and the launch release updated 55 countries in Africa.

Paths to Launch

Launch Go/No-Go Checklist & Plan

Some program and project managers can overcomplicate messaging and communications. I've learnt over time that whilst detailed rigour is still necessary for tracking project deliveries, there is actually an art to managing projects - there is also a style to stakeholder engagement - and the methods of communication play a major role. Whilst some might argue that pictures remove much of the thinking and thought process behind them, I believe in the power of visuals...after all, a picture is worth a thousand words!

Tuesday, 16 February 2021

Some simple but powerful, useful reminders

A LinkedIn connection shared this post that resonated with me on so many levels, I just had to capture these for posterity on my blog. A picture is worth a thousand words, I could write so much about each one!

Sunday, 11 August 2019

On using metaphors for Tech Ops

I like finding metaphors from other worlds outside of technology or engineering that can help simplify concepts for business people. Using such comparisons "is like a..." can actually be quite a powerful way of connecting or contextualising, especially when these people (business stakeholders mostly) are not interested in the technical details at all. All these folks care about, when there's an operational issue impacting business/customers, are: "What's the issue? Why did it happen? When will it be fixed? I need 15 minute updates please. Not interested in detail. Just want to know when it will be resolved. And don't forget Root Cause Analysis RCA document & Consequence Management!" When I get these remarks, I'm like...left dumbfounded...and decided to find another way of communicating to manage expectations better...

Enter...
#draintheswamp
#ICU

#draintheswamp

Despite the term being popularised by Trump last year, which was incidentally around the same time I introduced the term - in the software engineering community, "draining the swamp" is used to describe some refactoring and possibly restructuring of code and architecture, to address key bottlenecks, instabilities and optimisations. This could also mean revisiting original design & implementation decisions. Some people in software might put this under the bucket of "paying down technical debt" whilst others might just put this down as a natural part of the software evolution.

I used this term to prepare business people for the impact of the changes we planned to make. This was in response to a business ask of anticipating the biggest load of the tech platform stack - will the system cope with a "Black Friday-like event"? The simple answer was NO, not in its current form: we would need to make some aggressive changes to the platform - like scaling out to multiple datacentres. The stack was built as a monolith (multiple services & components, some micro services, running on traditional VM infrastructure, not containerised, etc.) and so the best chance of scaling for load was to move the stack to run off multiple datacentres, where we could scale up on each datacentre as well, but we still had bottlenecks with a shared database cluster running off one primary datacentre.

Suffice to say, this stuff goes above the heads of the business people. So I said, it's like draining the swamp. When that happens, we are bound to find some nasty things lying beneath the surface, things we don't see today, and can only see when we start draining the swamp. When this happens, and we hit some boulders or some unforeseen monsters of the swamp, we are bound to experience downtime and an outage or two. Don't panic, this is expected, given our working constraints, that is, making the changes on a live production environment.

#draintheswamp went down pretty well with business. They got it. Left us alone. They trusted us! They managed the customer & business impact when we needed help, and our tech teams worked heroic hours to get the stack running off multiple datacentres, just in time for the biggest event of the business for the year (2018), our user numbers reached record highs, and the platform did not fall over! So round one of #draintheswamp was done, but the swamp was not completely drained out...

Until earlier this year (2019), when we kicked-off another round of #draintheswamp to start the migration to cloud stacks, starting with containerisation...which led the platform to suffer a series of outages at the most inappropriate moments...customers were pissed, business impact teams were on the back-foot, social media was killing us, the changes for #draintheswamp were starting to kill the platform...when I resorted to introducing another metaphor: #ICU. Folks, the tech stack is in need of some severe TLC, we are instigating #ICU mode. Life support is initiated, vitals are not great, but the patient is surviving, and needs a high level of critical care...

Technical Operations / Site Reliability Engineering is not so different to running a hospital's ER Emergency Department...at any time, despite monitoring vital signs of existing patients (system components of tech stack), or doing day-to-day operational management of the ward (infra maintenance), an event could occur that can just spike, like a natural disaster, accident or terrorist attack (hardware failure, critical component dies, database crash unanticipated, etc.)...ER staff have to triage quickly (Tech Ops also have to triage), make life/death calls (Tech Ops when to call a P1 incident & inform business), decide on severity of the injury (requires immediate attention, operate now, or can wait??)...the same holds true with bringing back a technical platform from death to living healthy operations...so why wouldn't we want to reuse medical terms?? I think it makes perfect sense!

#ICU

The gist of this metaphor is simple: ICU/CCU means Intensive/Critical Care Unit. When a patient is in ICU, it is serious stuff, urgent priority, focused attention to monitoring, diagnostics & multiple treatment options, using whatever means necessary to enable a positive outcome for the patient.

The typical flow of recovery to health follows these transitions:

Start in ICU/CCU, remain there until a period of time where interventions applied & life-support is no longer necessary, then...
Transfer to High-Care facility, remaining close to ICU, but stable enough to warrant reduced focus and attention as compared to being in ICU, but be prepared for surprises, so best be close to ICU ward and not far away...wait until doctors give the green light to move on to...
Normal / General Ward...the last stay before checking out to go home. Vitals and all other required checks all pass before discharged for home...
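For the engineers reading, the step-down flow above can be sketched as a tiny state machine. This is only an illustrative sketch: the names `CareLevel`, `RECOVERY_PATH` and `next_level` are hypothetical, and the rule that any failed health check sends the platform straight back to ICU is my simplifying assumption, not a process we formally coded.

```python
from enum import Enum


class CareLevel(Enum):
    """Hypothetical care levels for a tech platform, mirroring the hospital wards."""
    ICU = "ICU/CCU"
    HIGH_CARE = "High Care"
    GENERAL_WARD = "Normal / General Ward"
    DISCHARGED = "Discharged (home)"


# Recovery only ever steps down one level at a time, in this order.
RECOVERY_PATH = [
    CareLevel.ICU,
    CareLevel.HIGH_CARE,
    CareLevel.GENERAL_WARD,
    CareLevel.DISCHARGED,
]


def next_level(current: CareLevel, checks_pass: bool) -> CareLevel:
    """Return the next care level for the platform.

    Assumption: a failed check at any level means a return to ICU
    (or staying there), which is why High Care stays close to the ICU ward.
    """
    if not checks_pass:
        return CareLevel.ICU
    if current is CareLevel.DISCHARGED:
        return current  # already healthy, nothing further to step down to
    return RECOVERY_PATH[RECOVERY_PATH.index(current) + 1]


if __name__ == "__main__":
    level = CareLevel.ICU
    # Example run: one surprise while in High Care, then a clean recovery.
    for checks_pass in [True, False, True, True, True]:
        level = next_level(level, checks_pass)
        print(level.value)
```

The point of the sketch is the shape of the flow: progress is gradual, one ward at a time, while a relapse is immediate.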

Although being in #ICU is rather stressful for everyone, there is a high sense of urgency, and the pressure to respond to business is quite intense (especially when Risk/Governance demands "Consequence Management"), we can draw additional parallels from medical ICU, like:

Remain calm, address the topics / unknowns in a logical manner, using the tools of diagnosis you've trained for (Technical tools like Five Whys, etc.)
Run tests, take blood samples, etc. (Tech/Data forensics, logging, test hypotheses, etc.)
Call on other medical experts to bounce diagnosis / brainstorm (Involve as many tech experts to offer new perspectives)
Implement treatment plan according to hypothesis, wait for new results (Fix something, wait, analyse result, before making additional changes).
Communicate clearly to patient's stakeholders (Communicate to business stakeholders transparently without hiding or being defensive...come clean).
Pray :-)

Unpacking the medical terms...

According to Wikipedia, an ICU:

An intensive care unit (ICU), also known as an intensive therapy unit or intensive treatment unit (ITU) or critical care unit (CCU), is a special department of a hospital or health care facility that provides intensive treatment medicine.

Intensive care units cater to patients with severe or life-threatening illnesses and injuries, which require constant care, close supervision from life support equipment and medication in order to ensure normal bodily functions. They are staffed by highly trained physicians, nurses and respiratory therapists who specialize in caring for critically ill patients. ICUs are also distinguished from general hospital wards by a higher staff-to-patient ratio and access to advanced medical resources and equipment that is not routinely available elsewhere.

Patients may be referred directly from an emergency department if required, or from a ward if they rapidly deteriorate, or immediately after surgery if the surgery is very invasive and the patient is at high risk of complications

When a technical platform goes into #ICU, we do the following:

We're on red alert - life threatening to business (customer experience is tanking)
Stop all other development work, or reduce planned work as much as possible by pulling in all the people we need to help (speaks to higher-staff-to-patient ratio)
Pull out all the stops, bring in experts, tool-up with advanced resources and equipment
Communicate daily to all business stakeholders
Perform multiple surgeries if needed (hotfixes, patches, etc.)
Run multiple diagnostics in parallel (think Dr. House)

According to Wikipedia, a High Care/Dependency Unit:

A high-dependency unit is an area in a hospital, usually located close to the intensive care unit, where patients can be cared for more extensively than on a normal ward, but not to the point of intensive care. It is appropriate for patients who have had major surgery and for those with single-organ failure. Many of these units were set up in the 1990s when hospitals found that a proportion of patients was requiring a level of care that could not be delivered in a normal ward setting. This is thought to be associated with a reduction in mortality. Patients may be admitted to an HDU bed because they are at risk of requiring intensive care admission, or as a step-down between intensive care and ward-based care.

According to HealthTalk, the last point to recovery is General/Normal Ward:

People are transferred from the intensive care unit to a general ward when medical staff decide that they no longer need such close observation and one-to-one care. For many people, this move is an important step in their progress from being critically ill to recovering.

'Nuff said, IMHO the similitude mentioned above should be more than self explanatory ;-)
