
Saturday, 17 May 2025

How I used AI coding assistants (Gemini) to build an MVP for a Software Management Tool

A walkthrough of an app I decided to build to help me with my work as a software engineering leader. Core thesis: I wanted a planning management tool centered around core system architecture, working backwards from there to model the organizational and team structures that feed the planning process. I found most project management tools lacking support for software architecture inputs - so I decided to build something to close this gap.

My journey started in late 2024 (October), when LLMs started exposing coding capabilities. I wasn't too impressed by the likes of ChatGPT and Claude (at the time). But with Gemini 2.5 entering the scene, I picked this project up again in April 2025 - and slowly built out the MVP, first as a single page application (html/javascript) with a basic old skool UI (this video), then refreshed the UI and refactored the code (next video, "Building apps with AI (2) - MVP..."). Since I stopped coding professionally in 2010, this entire journey was quite the learning experience. I'm going back to school again, but I've learnt so much about how to engage with the LLM that I'm increasingly confident I will find a place in the new world of AI. AI-assisted coding is definitely the future, but it is not that straightforward yet, even with the latest craze in vibe coding.

Just take a look at my document sharing my conversation with Gemini here.

50 pages and counting!!! Essentially you need to be very clear in articulating the requirements and to interact with the model as a true assistant, like you would with any human engineer, business analyst, or UX designer.


Source code on Github here

You can play with the app here

Listen to the podcast generated by NotebookLM here
You can interact with my Google NotebookLM here




The rest of this blog post was written by my partner, Gemini >>>>

From Vision to MVP: Building a Software Management & Planning Powerhouse (with an AI Co-Pilot!)

For any software engineering leader, manager, or even a senior individual contributor, the landscape of modern software development is a complex beast. We juggle intricate system architectures, dynamic team structures, ever-shifting priorities, and the constant challenge of aligning finite resources with ambitious goals. What if there was a tool that not only helped model this complexity but also empowered smarter, data-driven planning?

That's the question that sparked the creation of the Software Management & Planning Tool, an MVP we've been passionately building. This isn't just about another planning application; it's about a foundational belief: effective planning starts by working backwards from your systems and organizational architecture.

This blog post shares the journey of bringing this vision to life – from a core idea to a feature-rich Minimum Viable Product – and the unique experience of co-developing it with a generative AI coding partner, Gemini.

The Core Idea: Architecture-First Planning

The tool is built upon a hierarchical understanding of a software organization:

  • Systems: The overarching products or platforms.
  • Services & APIs: The functional components within those systems, each with defined interfaces.
  • Teams (2-Pizza Teams): Agile squads owning services, each with software development managers (SDMs) and engineers.
  • Engineers: The talent pool, with defined levels.
  • Management Structure: Senior Managers overseeing SDMs.
  • Dependencies: Critical platform components and inter-service relationships.

This rich data model allows the application to provide a holistic view, forming the bedrock for all subsequent planning and visualization.
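
To make this hierarchy concrete, here is a minimal sketch of how such a model might be represented in plain JavaScript (in keeping with the single-file HTML/JavaScript origins of the MVP). The entity and field names are illustrative assumptions, not the tool's actual schema:

```javascript
// Illustrative data model for the architecture-first hierarchy.
// Field names are assumptions for this sketch, not the tool's actual schema.
const orgModel = {
  systems: [
    {
      id: "sys-streaming",
      name: "Streaming Platform",
      services: [
        {
          id: "svc-playback",
          name: "Playback Service",
          apis: ["StartPlayback", "GetManifest"],
          owningTeamId: "team-alpha",
          platformDependencies: ["cdn", "auth"],
        },
      ],
    },
  ],
  teams: [
    {
      id: "team-alpha",
      name: "Team Alpha", // a 2-pizza team
      sdmId: "sdm-1",
      engineerIds: ["eng-1", "eng-2"],
      fundedHeadcount: 8,
    },
  ],
  engineers: [{ id: "eng-1", name: "A. Dev", level: "SDE II" }],
  sdms: [{ id: "sdm-1", name: "B. Manager", seniorManagerId: "snr-1" }],
};
```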

Our MVP Journey: An Iterative Evolution with an AI Partner

What started as a concept rapidly took shape, built incrementally feature by feature. The initial scaffolding was a single HTML/JavaScript file – a testament to getting ideas off the ground quickly. But the vision was always much larger.

Key Milestones in Our Development:

  1. System & Org Modeling: The first step was enabling the definition of systems, services, APIs, teams, and their intricate relationships. This involved creating UI elements for adding, editing, and connecting these entities.
  2. Visualizing Complexity: Static data isn't enough. We introduced dynamic visualizations:
    • System Architecture Diagrams: Showing services, APIs, and platform dependencies.
    • Team Relationship Graphs: Illustrating how teams connect based on service ownership.
    • Service Dependency Views: Drilling down into specific service connections.
    • Interactive Org Charts & Team Breakdowns: Providing clear views of the organizational structure and team compositions. Initially, these were separate views, but based on user experience feedback, we evolved the system overview into a slick carousel navigation allowing users to seamlessly glide through these critical perspectives.
  3. The Game Changer: Yearly Planning: This became a cornerstone. The goal was an interactive table to manage initiatives against team capacity, showing what's "Above The Line" (ATL) and "Below The Line" (BTL). We implemented:
    • Dynamic team columns and SDE Year estimations.
    • Drag-and-drop prioritization.
    • "Protected" initiatives that lock to the top.
    • The crucial concepts of "Funded Headcount" vs. "Builders In Seats" (BIS).
  4. Introducing "Away Teams": Recognizing that real-world capacity isn't static, we enhanced the model to include "Away Team Members" – borrowed resources. This led to the "Effective BIS" metric, providing a more realistic picture of available capacity.
  5. Deep Dive into Capacity Tuning: This was a pivotal enhancement. We built a dedicated "Tune Capacity Constraints" module where users can define:
    • Global defaults (working days, holidays, org-wide events).
    • Team-specific adjustments (leave uptake, variable leave, team activities, recurring overhead). The output? A Calculated Net Project Capacity for each team, which can then feed back into the Yearly Planning view for a much more accurate ATL/BTL assessment. This module also includes a detailed narrative explanation and a waterfall chart to visualize capacity erosion.
  6. SDM Resource Forecasting: Leveraging the detailed capacity tuning, we integrated a forecasting model. SDMs can now select their team, input hiring and ramp-up estimates, and see a 52-week projection of their headcount and effective SDE availability, all adjusted by the previously configured capacity constraints.
  7. UI/UX Modernization: More recently, we've focused on enhancing the user experience, refactoring the codebase into modular files, and adopting a cleaner, card-based UI reminiscent of modern web applications.
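
As a rough illustration of the ATL/BTL mechanics described in milestone 3 above, the sketch below walks a prioritized initiative list against a capacity number and marks what fits as Above The Line. Function and field names are assumptions for illustration, not the tool's actual code:

```javascript
// Rough sketch of the Above/Below-The-Line calculation (illustrative only).
// capacitySdeYears comes from the selected scenario (Funded HC, Team BIS,
// Effective BIS, or Net Project Capacity); initiatives are assumed to be in
// priority order, with "protected" ones pinned to the top.
function planAtlBtl(initiatives, capacitySdeYears) {
  const ordered = [
    ...initiatives.filter((i) => i.protected),
    ...initiatives.filter((i) => !i.protected),
  ];
  let remaining = capacitySdeYears;
  return ordered.map((initiative) => {
    const aboveTheLine = initiative.estimateSdeYears <= remaining;
    if (aboveTheLine) remaining -= initiative.estimateSdeYears;
    return { ...initiative, aboveTheLine };
  });
}

// Example usage with made-up numbers
const plan = planAtlBtl(
  [
    { name: "Compliance work", estimateSdeYears: 2, protected: true },
    { name: "New checkout flow", estimateSdeYears: 5, protected: false },
    { name: "Perf tuning", estimateSdeYears: 4, protected: false },
  ],
  8
);
// -> Compliance (ATL), New checkout flow (ATL), Perf tuning (BTL)
```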

The Gemini Co-Pilot Experience:

This journey wasn't a solo flight. As the human visionary and lead developer, I partnered extensively with Gemini, my AI coding assistant. This collaboration was an experiment in itself:

  • Acceleration: Gemini was instrumental in rapidly generating boilerplate code, implementing UI components based on descriptions, and refactoring complex functions. This significantly sped up the development of the MVP.
  • Brainstorming & Ideation: We bounced ideas for UI/UX improvements, data model structures, and feature refinements (like the Team Load Summary table in the planning view).
  • The Learning Curve (for both of us!): It wasn't always smooth sailing. There were moments where context was lost between sessions, leading to regressions or suggestions that didn't quite fit the current state. Clarifying requirements meticulously, sometimes repeatedly, became key. Debugging involved a human-AI partnership – I'd often describe the unexpected behavior, and Gemini would help pinpoint potential issues in the code it (or we) had written. It was a process of learning how to "prompt" effectively and how to guide the AI when it drifted.

This experience has been a powerful demonstration of how GenAI can be a force multiplier for developers, turning ambitious ideas into working software at an accelerated pace, even if it requires a new kind of collaborative patience and precision.

Key Features That Shine

While the tool is packed with interconnected features, some of the "wow" factors that have emerged include:

  • Holistic System Modeling: The ability to define not just services, but also the teams that own them, the engineers within those teams (including AI SWEs and their attributes like skills and experience!), and the management structure, all in one place.
  • Insightful Visualizations: The carousel of system, team, and service dependency graphs provides powerful, at-a-glance insights into complex relationships.
  • Realistic Capacity Tuning: Moving beyond simple headcount, the "Tune Capacity Constraints" feature allows for granular adjustments, factoring in everything from public holidays and company events to team-specific leave uptake and meeting overhead. The resulting "Net Project Capacity" and its narrative explanation offer a truer picture of what teams can achieve.
  • Interactive Yearly Planning: The drag-and-drop planning table, combined with dynamic ATL/BTL calculations based on selectable scenarios (Funded HC, Team BIS, Effective BIS) and the option to apply the "Net Capacity Constraints," makes for a powerful "what-if" analysis tool. The Team Load Summary provides immediate feedback on team-level impacts.
  • Integrated SDM Resource Forecasting: This module closes the loop by allowing managers to see how their hiring plans and attrition realities, combined with the tuned capacity constraints, will affect their team's SDE availability throughout the year.
  • An Evolving, Modern UI: The recent shift towards a cleaner, card-based design with a persistent top navigation bar significantly improves usability.
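
To give a feel for the arithmetic behind the capacity tuning described above, here is a simplified sketch of a net capacity calculation; the constraint names and numbers are made-up assumptions, and the real module is considerably more detailed:

```javascript
// Simplified net project capacity calculation (illustrative assumptions only).
// Starts from gross working days per engineer and erodes them step by step,
// mirroring the waterfall view of capacity erosion described above.
function netProjectCapacity(team, globals) {
  const grossDays = globals.workingDaysPerYear; // e.g. 260
  const perEngineerDays =
    grossDays -
    globals.publicHolidays -
    team.avgAnnualLeaveDays -
    team.teamActivityDays -
    grossDays * team.recurringOverheadFraction; // meetings, on-call, etc.
  const netSdeYears = (perEngineerDays / grossDays) * team.effectiveBis;
  return { perEngineerDays, netSdeYears };
}

// Example usage with made-up numbers
const result = netProjectCapacity(
  { avgAnnualLeaveDays: 20, teamActivityDays: 5, recurringOverheadFraction: 0.15, effectiveBis: 9 },
  { workingDaysPerYear: 260, publicHolidays: 12 }
);
// perEngineerDays = 260 - 12 - 20 - 5 - 39 = 184; netSdeYears ≈ 6.37
```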

The Journey Continues: The Power of Iteration

This MVP is a significant milestone, but it's also a foundation. The backlog, as detailed in our project's README.md, is rich with possibilities: enhanced yearly planning with ROI tracking and versioning, a dedicated roadmap and backlog management module, deeper detailed planning with work packages, and, most excitingly, integrating more AI-powered enhancements for analysis and interaction.

The development of this tool underscores the power of iterative development and the transformative potential of collaborating with AI coding assistants. It shows how a complex vision can be broken down and built, piece by piece, learning and refining along the way. While the path of AI-assisted development has its unique challenges (like ensuring context retention and guiding the AI through ambiguities), the ability to rapidly prototype, generate code, and explore solutions is undeniable.

For any engineering leader wrestling with the complexities of software planning, or any developer curious about the potential of AI partnerships, we hope this journey provides some inspiration. With the right vision and the right tools (both human and AI!), ambitious ideas can indeed be brought to life.


Saturday, 14 September 2024

How I used ChatGPT o1-preview to build a rough interactive software engineering resource forecasting model

Seed Prompt for tool (click here to play with it): I am a software engineering manager. A large part of my job is understanding the engineering resource utilization for my engineering team. I manage multiple engineering teams of varying sizes from 5 people to 20 people in a team. When estimating software projects, understanding availability of productive engineers is important for planning purposes. Help me build an interactive resource forecasting tool, that can be fully self-contained within browser app, embedded in a google blogspot page. The tool must include inputs for: 1\ Funded team size; 2\ Current Available engineers (i.e. actual current size of team). The tool must account for the time it takes for hiring new engineers, ramp-up time, learning, etc. before they can be deemed as productive for committing to project work. The tool must show a table the resource plan over 52 weeks in the year, and include a visual graph to show how the team size grows to meet funded size. Research the best methods in software engineering management, and create the tool that can help software managers have a good model for resource utilization. The tool can store information in the local user browser storage as needed. Remember, I need the full code to be fully self-contained so I can run the tool from within a google blogspot page.

Prompt for this article: Great I think we have good baseline version to start with. Now using your PhD capabilities, write up an accompanying blog post that details the model, the thinking behind it, and call out the limitations using a critical eye, suggesting improvements for the next version.

Prompt for Image using Dall-E: "An isometric illustration of a diverse engineering team collaborating in a modern office space. They are gathered around a large table filled with charts, graphs, and laptops, discussing and planning. Behind them, a wall-sized calendar and a whiteboard display timelines, Gantt charts, and upward-trending graphs. Include elements like gears and clock icons subtly integrated into the design to symbolize engineering and time management. Use a professional, clean art style with vibrant yet muted colors to convey a serious but engaging atmosphere."

Engineering Resource Forecasting: A Comprehensive Model for Team Planning

By ChatGPT o1-Preview (took about 4 hours to get the tool to version 1)

Effective resource planning is crucial for engineering teams aiming to meet project deadlines, optimize productivity, and manage budgets efficiently. The Engineering Resource Forecasting Tool is designed to help managers and team leads forecast their engineering resources over a year, considering various real-world factors such as hiring rates, attrition, ramp-up times, efficiency, leave days, and public holidays.

In this blog post, we'll delve into the model behind the tool, explore the thinking that shaped its development, critically analyze its limitations, and suggest improvements for future iterations.


Understanding the Model

1. Objectives of the Tool

The primary goals of the Engineering Resource Forecasting Tool are to:

  • Forecast Team Size: Predict how the engineering team will grow or shrink over time.
  • Calculate Effective Capacity: Determine the actual productive capacity of the team, considering factors that reduce availability.
  • Assist in Planning: Provide insights for hiring strategies, resource allocation, and project scheduling.

2. Key Components of the Model

The model incorporates several essential elements:

  • Funded Team Size: The maximum number of engineers that the budget allows.
  • Current Available Engineers: The starting point for the forecasting, representing the existing team.
  • Hiring Rate and Time: The rate at which new engineers are hired and the average time it takes for them to join.
  • Ramp-Up Time: The period new hires require to reach full productivity.
  • Attrition Rate: The expected percentage of engineers who will leave the team annually.
  • Efficiency Factor: The proportion of time engineers spend on productive work.
  • Leave Days and Public Holidays: Time when engineers are not available due to vacations and holidays.

3. Model Calculations

a. Effective Engineers

The number of effective engineers is calculated weekly by adjusting the total number of fully ramped-up engineers for efficiency and availability:

Effective Engineers = Total Ramped-Up Engineers × Efficiency Factor × (Weekly Available Days / 5)

b. Weekly Available Days

Engineers aren't available all five working days each week due to leave and public holidays:

Weekly Available Days = 5 - ((Annual Leave Days + Public Holidays) / 52)

c. Attrition Modeling

Attrition reduces the team size over time:

  • Weekly Attrition Rate:

Weekly Attrition Rate = Annual Attrition Rate / 52

  • Cumulative Attrition: The total number of engineers who have left the team up to a given week.

d. Hiring and Ramp-Up Process

  • Hiring Pipeline: Represents engineers who have accepted offers but haven't started yet.
  • Ramping Engineers: New hires who are in the process of ramping up to full productivity.

e. SDE-Weeks and SDE-Days

  • SDE-Weeks: Sum of effective engineer contributions per week within a month.
  • SDE-Days: Sum of effective engineer contributions per day within a month.

These metrics provide a more granular view of the team's productive capacity.
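
Putting the pieces above together, a minimal week-by-week simulation might look like the sketch below (plain JavaScript, in keeping with the self-contained browser tool). The parameter names are assumptions, and the hiring and attrition handling is deliberately simplified relative to the actual tool:

```javascript
// Minimal 52-week forecast combining the formulas above (illustrative sketch).
function forecast(params) {
  const {
    fundedSize, currentEngineers, hiresPerWeek, hiringLagWeeks, rampUpWeeks,
    annualAttritionRate, efficiencyFactor, annualLeaveDays, publicHolidays,
  } = params;

  const weeklyAvailableDays = 5 - (annualLeaveDays + publicHolidays) / 52;
  const weeklyAttritionRate = annualAttritionRate / 52;

  let ramped = currentEngineers;  // fully productive engineers (fractional for simplicity)
  const pipeline = [];            // weeks remaining until each new hire is fully ramped
  const rows = [];

  for (let week = 1; week <= 52; week++) {
    // Attrition erodes the ramped-up pool.
    ramped -= ramped * weeklyAttritionRate;

    // Start new hires while total headcount (ramped + in-flight) is below funded size.
    const gap = fundedSize - (ramped + pipeline.length);
    const toHire = Math.min(hiresPerWeek, Math.max(0, Math.round(gap)));
    for (let i = 0; i < toHire; i++) pipeline.push(hiringLagWeeks + rampUpWeeks);

    // Advance the pipeline; hires whose countdown hits zero become fully ramped.
    for (let i = pipeline.length - 1; i >= 0; i--) {
      if (--pipeline[i] <= 0) { ramped += 1; pipeline.splice(i, 1); }
    }

    const effectiveEngineers = ramped * efficiencyFactor * (weeklyAvailableDays / 5);
    rows.push({ week, headcount: ramped + pipeline.length, effectiveEngineers });
  }
  return rows;
}

// Example usage with made-up inputs
const plan = forecast({
  fundedSize: 15, currentEngineers: 10, hiresPerWeek: 1, hiringLagWeeks: 8,
  rampUpWeeks: 6, annualAttritionRate: 0.10, efficiencyFactor: 0.75,
  annualLeaveDays: 20, publicHolidays: 10,
});
```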


Thinking Behind the Model

1. Realistic Representation

The model aims to mirror real-world scenarios by considering:

  • Time Delays: Hiring and ramp-up times introduce delays between initiating a hire and gaining full productivity.
  • Non-Productive Time: Leave, holidays, and non-productive activities reduce actual working time.
  • Attrition Effects: Engineers may leave the team, impacting capacity and necessitating replacement hires.

2. Balancing Simplicity and Complexity

While the model incorporates multiple factors, it maintains usability by:

  • Using Average Rates: Distributes leave, holidays, and attrition evenly over the year for simplicity.
  • Assuming Linear Processes: Hiring and ramp-up are modeled as linear processes without accounting for fluctuations.

3. Providing Actionable Insights

By visualizing the forecast through charts and tables, the tool helps managers:

  • Identify Hiring Needs: Determine the hiring rate required to meet team size goals.
  • Plan for Attrition: Anticipate capacity reductions due to attrition.
  • Adjust Strategies: Modify inputs to explore different scenarios and their outcomes.

Limitations and Critical Analysis

Despite its usefulness, the model has several limitations:

1. Uniform Distribution Assumptions

  • Leave and Holidays: The model assumes that leave days and public holidays are evenly distributed throughout the year. In reality, leave may cluster around certain periods (e.g., summer vacations, year-end holidays), leading to temporary drops in capacity.
  • Attrition Timing: Attrition is modeled as a steady rate, but actual departures may occur unpredictably and in clusters.

2. Lack of Granular Ramp-Up Modeling

  • Binary Productivity: The model assumes engineers are either ramping up or fully productive after the ramp-up period. It doesn't account for gradual increases in productivity during ramp-up.
  • Individual Variations: All new hires are treated identically in terms of ramp-up time and efficiency, ignoring individual differences.

3. Hiring Constraints and Real-World Challenges

  • Hiring Rate Feasibility: The model calculates the hiring rate needed to meet targets but doesn't consider the practical challenges of achieving that rate, such as market conditions or recruitment resource limitations.
  • Candidate Pipeline Variability: It doesn't account for fluctuations in candidate availability or acceptance rates.

4. Simplified Efficiency Factor

  • Static Efficiency: The efficiency factor is constant over time, not accounting for potential improvements or declines due to training, process changes, or team dynamics.
  • Team Composition: It doesn't differentiate between engineers with different skill levels or specializations.

5. Absence of Project Dependencies

  • No Task Modeling: The model focuses on resource capacity without linking it to specific project tasks or timelines.
  • Overtime and Workload Peaks: It doesn't account for periods when engineers might work overtime or when workload exceeds capacity.

6. Economic and External Factors

  • Budget Changes: The funded team size is assumed constant, ignoring potential budget cuts or increases.
  • External Events: Unforeseen events like economic downturns, pandemics, or natural disasters are not modeled.

Suggestions for Improvement

To enhance the model's accuracy and applicability, future versions could consider:

1. Incorporating Seasonal Variations

  • Variable Leave and Holiday Distribution: Model leave and public holidays based on historical patterns to reflect periods of low and high availability.
  • Attrition Peaks: Adjust attrition rates to account for known periods of higher turnover.

2. Enhanced Ramp-Up Modeling

  • Gradual Productivity Increase: Implement a ramp-up curve where productivity increases incrementally over the ramp-up period.
  • Individualized Ramp-Up: Allow different ramp-up times for different roles or experience levels.
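
For example, a gradual ramp-up could be modelled as a simple curve that returns a productivity fraction for each week since joining; the linear shape below is just one plausible assumption:

```javascript
// One possible ramp-up curve: linear climb to full productivity (illustrative only).
// weeksSinceStart: how long the engineer has been on the team.
// rampUpWeeks: configured ramp-up period for the role or experience level.
function rampUpFactor(weeksSinceStart, rampUpWeeks) {
  if (weeksSinceStart >= rampUpWeeks) return 1.0;
  return Math.max(0, weeksSinceStart / rampUpWeeks);
}

// e.g. with a 12-week ramp-up, a hire in their 6th week contributes at 50%:
// rampUpFactor(6, 12) === 0.5
```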

3. Realistic Hiring Constraints

  • Recruitment Capacity Limits: Introduce maximum hiring rates based on recruitment team capacity.
  • Candidate Pipeline Dynamics: Factor in time-to-fill variations, candidate drop-off rates, and competition in the job market.

4. Dynamic Efficiency Factor

  • Efficiency Over Time: Allow the efficiency factor to change over time, reflecting process improvements or team fatigue.
  • Role-Based Efficiency: Differentiate efficiency factors for various roles or seniority levels.

5. Integration with Project Planning

  • Task-Level Planning: Link resource capacity to project tasks and milestones to assess whether capacity meets project demands.
  • Risk Modeling: Incorporate risks and uncertainties that could impact timelines and capacity.

6. Economic Scenario Planning

  • Budget Flexibility: Allow for adjustments in the funded team size based on potential budget changes.
  • External Shocks: Introduce scenarios for external events that could affect hiring, attrition, or productivity.

7. User Input Flexibility

  • Custom Time Frames: Enable users to adjust the time frame of the forecast beyond a single year.
  • Data Import: Allow users to import historical data to better tailor the model to their organization's patterns.

Conclusion

The Engineering Resource Forecasting Tool provides a solid foundation for planning and visualizing engineering team capacity over time. By accounting for critical factors like hiring, ramp-up, attrition, efficiency, leave, and holidays, it offers valuable insights for managers.

However, like any model, it simplifies complex real-world dynamics to remain practical and user-friendly. Recognizing its limitations is essential for interpreting the results accurately and making informed decisions. By incorporating suggested improvements, future versions can offer even more robust and precise forecasting, aiding organizations in navigating the challenges of resource planning.


About the Author

[Your Name] holds a PhD in [Your Field] and has extensive experience in project management, resource planning, and organizational development. Passionate about leveraging data-driven models to solve practical problems, [Your Name] contributes to advancing tools and methodologies that enhance operational efficiency.


Note: The Engineering Resource Forecasting Tool is intended as a planning aid and should be used in conjunction with professional judgment and other planning tools. Always consider your organization's specific circumstances when making staffing decisions.

Saturday, 1 June 2024

The Hawthorne Effect, a short story

So I was going through my Trello board today, catching up on my growing list of todos - things to learn, be curious about and explore (from years ago) - and tackled this card: "Check out Hawthorne effect". As usual, Googling takes you to Wikipedia to read... I went to ChatGPT and instead prompted it with the following:

Explain the Hawthorne effect through a short-story, framing the narrative in a style, similar to Goldratt's book "The Goal". Limit the short-story to no longer than two A4 pages

The Workshop Challenge


Chapter 1: A Problem Arises


Alex Raines, the plant manager of Evergreen Manufacturing, walked into his office to find a stack of reports on his desk. The company had been struggling with productivity issues for months. Despite numerous initiatives and new policies, the output remained stagnant. Frustration mounted as Alex tried to pinpoint the problem.


Chapter 2: A Visit from HQ


One afternoon, an unexpected visit from Emma Collins, a senior executive from headquarters, threw Alex off his routine. Emma, known for her innovative problem-solving skills, had a knack for seeing through complexities.


“Alex, I’ve been reviewing the numbers,” Emma began, taking a seat in his office. “We need to figure out why the productivity is lagging. Let’s start by observing the workshop directly.”


Together, they walked to the workshop floor, where the hum of machinery mixed with the chatter of workers. Emma suggested they conduct a series of interviews and observe the workflow discreetly.


Chapter 3: The Observation Begins


Emma and Alex set up a temporary observation station in a corner of the workshop. Over the next week, they meticulously noted the workers’ behaviors, machine efficiency, and workflow patterns. To Alex’s surprise, there was a noticeable uptick in productivity during this period.


“I don’t get it,” Alex said, scratching his head. “The only change we made was being present here.”


Emma smiled knowingly. “That, my friend, is the Hawthorne effect.”


Chapter 4: Unveiling the Mystery


Emma explained, “In the 1920s and 30s, the Western Electric Company conducted a series of experiments at their Hawthorne Works plant. They discovered that workers’ productivity improved when they were being observed. It’s a psychological phenomenon where individuals alter their behavior due to the awareness of being watched.”


Alex pondered this revelation. “So, our presence here made the workers more conscious and thus more productive?”


“Exactly,” Emma confirmed. “But this effect is usually temporary. The key is to understand why the workers felt the need to change their behavior and how we can sustain productivity improvements without constant observation.”


Chapter 5: Digging Deeper


Determined to find a long-term solution, Alex and Emma conducted anonymous surveys and held informal discussions with the workers. They discovered several underlying issues: lack of clear communication, inadequate recognition of hard work, and an outdated incentive system.


Emma suggested a few strategic changes:


1. Improve Communication: Implement regular team meetings to discuss goals and progress.

2. Recognize Achievements: Introduce a recognition program to celebrate workers’ contributions.

3. Revamp Incentives: Align the incentive system with productivity and quality metrics.


Chapter 6: Implementing Change


With a clear action plan, Alex rolled out the changes. They started with weekly team huddles, where workers shared updates and suggestions. A “Star of the Week” program was introduced, celebrating top performers. The incentive structure was reworked to reward not just quantity but also the quality of output.


Chapter 7: Sustained Improvement


Over the next few months, the changes bore fruit. The workshop buzzed with newfound energy. Productivity rose steadily, and the workers felt more engaged and valued.


During a follow-up visit, Emma observed the transformed environment with satisfaction. “This is the real impact we were aiming for, Alex. By addressing the core issues and fostering a supportive environment, you’ve achieved sustainable improvement.”


Alex nodded, grateful for the insights. “It all started with the Hawthorne effect, but we’ve turned it into something lasting.”


Epilogue: Reflecting on the Journey


Alex reflected on the journey. The initial spike in productivity from mere observation had been a catalyst. By addressing the root causes and implementing meaningful changes, they had created a culture of continuous improvement. The Hawthorne effect was just the beginning—a lesson in understanding human behavior and leveraging it for lasting success.


In this narrative, the short-lived productivity boost due to observation, known as the Hawthorne effect, served as a springboard for deeper changes that ultimately led to long-term improvements in the workplace. The story mirrors the style of Goldratt’s “The Goal,” emphasizing problem-solving and continuous improvement.

Wednesday, 24 April 2024

Advice for Technology Executives eyeing out AWS L7 Snr Manager role

As I write this post, my phonetool (a cool internal service where employees can look up people) says I've been working at Amazon for 3 years, 3 months and 13 days (it actually feels much longer than that!!).
 
I thought it might be a good idea to provide some advice for people who are considering applying for L7 Senior Software Engineering Manager roles at Amazon, even though I am just a sample of one. Amazon is a huge, gigantic, complex entity - so by no means is my experience or advice representative of the company... rather, I'm sharing my very own personal experience, aimed mostly at South African professionals who might be considering interviewing for senior management roles -- and are curious to learn from others who walked the path before. We are so lucky to have a global high-tech giant like Amazon operating in South Africa, offering engineering opportunities. I was even contemplating relocating back to the UK in 2020, but decided to stick it out in this beautiful country, try Amazon and live & work from Cape Town... As far as I'm aware, no other global hi-tech company apart from Amazon offers software engineering roles in South Africa.
 
If you're considering leaving a C-level or Executive Head position behind, coming from a South African corporate, and are now eyeing out senior engineering manager roles in Amazon/AWS locally or internationally, then this post might just be of use to you. 

DISCLAIMER: I'm sharing my own personal experiences and recommendations in the hope that my story could help others make an informed decision. Yes, I'm currently employed at AWS as I write this, and I have so far enjoyed my journey of transition - pretty settled in now as an Amazonian, despite a bit of a rough start of complete change and disruption to both my professional and personal life.

My experience going into AWS L7 Senior Manager Role

Before Amazon, from May 2017 to October 2020, I was the single threaded owner (STO aka CTO) for a video streaming platform serving 50+ countries in Africa. Reporting to me was a group comprising executive heads of department and senior managers responsible for:
 
1\ Software Delivery & Testing (Backend + Apps running on Smart TVs, Set Top Boxes, Browser Web Apps (Chrome, IE, Firefox, Safari), iOS and Android mobile phones and tablets, Apple TVs, Android TVs). In this org, reporting to the senior manager were: 3x Software Dev Managers (Frontend + Backend - 10 dev teams), 1x Test/QA Manager.

2\ Platform, Networking & Video Infrastructure, DevOps: Data Centres in Johannesburg, Cape Town, Durban, Nigeria, London, Ghana. AWS & Azure cloud. CDN infrastructure. Networking backbone, internet transit infrastructure. Devops. Application Web hosting.

3\ Site Reliability Engineering and 24/7/365 Platform/Application Operations monitoring, 2nd and 3rd line technical support team.

4\ Platform Intelligence Health Dashboards, Alerting & Analytics, Content Discovery, Search and Recommendations. An AI/ML team responsible for building video and content discovery services. This team was also responsible for building Infra/Platform/App Health Analytics consoles.

5\ Enterprise & Solution Architecture. A group of architects cutting across software architecture, video streaming, encoding and media processing platform architects. Live streaming video headends, CDN architecture.

6\ Agile Program and Project Management Office. A group of cross-functional project management experts, including program managers, release managers, scrum masters and general program management.

7\ Cyber Security, Piracy and Governance - risk management, legal, compliance and regulatory ownership. Overall IT/Security and Compliance.

I owned all the above (I designed the org structure around cloud modernization), reporting directly to the CEO, and was part of the Top Leadership team. I had direct P&L responsibilities in the order of billions of rands, with authority to manage commercial relationships with technology vendors, suppliers, etc. - signing licensing agreements and SOPs, reviewing RFPs, etc. I serviced business, technology and customer operations (marketing, content, acquisition, HR tech, customer support). Under my leadership, the platform scaled to 10X, paved the way for a cloud-first future and created NextGen AI streams for personalized content discovery. I had delivered significant cost-savings, created a cloud-first transformation strategy with a clear roadmap, and built a strong leadership succession plan before leaving. I had a team of 200+ people, which extended to partner contractors & professional services. I decided to leave and was attracted to Amazon because of my exposure to AWS as an enterprise customer; I wanted to learn about what happens in the engine room of AWS.

Between 2013 and May 2017, before becoming a CTO, I was a freelance senior management consultant, taking on program director contracts where I worked with C-level stakeholders as my primary customer, running very large enterprise programs that cut across multiple lines of business - effectively having a virtual seat at the C-suite table, but without skin in the game, helping executives deliver their top business, technology and product transformational goals. For these programs, I directed and executed large-scale program deliveries across the African continent, with project team sizes from 350-1000 people; most of my programs were tagged "billion rand programs" costing "a million rand a day". I enjoyed working 4-day weeks, took my time off according to my schedule, and worked just over 10 months a year during that period.

What made me leave executive roles behind and take a risk with a mid-senior management position? In this post, I shared a deep reflection that motivated the change. I had set my sights on a new domain, i.e. cloud services; an opportunity presented itself with AWS, I applied - and the rest is history. I had made a one-way door decision, there was no looking back - but I did have a bumpy start, which in retrospect, if I had access to these pointers beforehand, who knows...

Tuesday, 29 August 2023

Be the leader you wish you had

BE THE LEADER YOU WISH YOU HAD

I use this saying often in my 1:1s with my directs and in my private coaching sessions. It is a powerful way to make one pause for a moment, reflect, adjust to the discomfort, then embrace the excitement of a new energy that is created.

Adopting this mindset has transformed me from standard "manager" to empathetic "leader". Reading Seth Godin's "The Song of Significance" reinforced my instinctual leadership practices. 

Quoting from "13. Let's Get Real or Let's Not Play", Seth says this:

<quote> No one goes to the gym to willingly get punched in the face by the senior vice president of boxing. But some folks eagerly pay for a sparring partner when it's time to get better.  The difference is obvious, but we've forgotten to say it out aloud.  No grades, no check marks, no badges. I'm not in charge of you, and I'm not manipulating you. I'm simply establishing the conditions for you to get to where you said you wanted to go.  You tell me where you're going and what you need. You make promises about your commitment and skills development.  I'll show up to illuminate, question, answer, spar with, and challenge you. I'll make sure you're part of a team of people who are ready to care as much as you do. We can get real. Or let's not play. </quote>
This is not some leadership mumbo jumbo. Some time ago, I developed a model for personal development that borrowed concepts from agile product management by way of user stories (search the RAGE tag on this blog). I then used the same methods in the way I work with my direct reports. HR people might call this "contracting with the employee", but I take it further. I get real. It's not just about objectives, KPIs & delivering results. I put myself on the line. I reach out. And so when it comes to performance reviews, my reviews are a two-way conversation. My direct also evaluates Mo's performance - because I believe leaders mirror & contribute to the performance of their direct reports.

What's my mechanism then?

I ask each person to write a user story in this format:

In order for me, [Name] to do [XYZ] (e.g. my job | grow | be inspired | learn | etc.) I need my manager (Mo) to support me by doing [....insert your wish-list here] so that I can ....

So I start the year with level setting on our contracts together, and in our 1:1s, we check-in and inspect, comment, re-calibrate, adjust.  

Guess what? 

This mechanism might seem simple but it's quite challenging for people. Usually, it's the first time they're experiencing a manager doing it this way. There's a hook both ways. Often, it takes a few iterations to get the user stories crafted in a way that is mutually relatable and agreeable. My mechanism goes beyond the standard business SMART goal setting. I make it human. Real. Personal. For me, this is my song of significance.

Here are some real-world examples in play, from senior managers that report into me - see how doing so puts me, Mo, on the hook?

* In order for me to do my job, I need my manager (Mo) to support me by throwing me in the deep end and exposing me to as much as possible so that I can quickly learn and understand this business

* In order for me to be inspired, I need my manager (Mo) to support me by leading by example so that I can learn from his vast experience

* In order for me to grow, I need my manager (Mo) to support me by pushing me out of my comfort zone so that I can grow in all directions.

* In order for me to grow my skillset, I need my manager to support me in blocking out time on my calendar so I can complete the ‘make great hiring decisions’ course (5hrs)

* In order for me to get promoted to L7, I need my manager to support me by identifying key opportunities so that I can start building a roadmap of promotional milestones

Wednesday, 31 May 2023

Why I never ran a program without a Project Charter

Lessons on large-scale delivery program management ...

I continue to dig into my past artefacts to showcase my work portfolio. I'm using a multi-pronged approach here: 1\ Showcase my work to prospective employers; 2\ Openly share my work so that others (people I coach, my colleagues and boss, etc.) can benefit; and 3\ Act as my own living knowledge repository.  

I spent a decade climbing up the project management ladder, in the same way I climbed up the software engineering ladder (from junior engineer to principal engineer). I first started project managing small software product development (2-4 teams with 10 services), then scaled up to large middleware services (20+ teams, 50+ services) as lead delivery owner, then up to full stack systems integration (the full stack of all major components: kernel, middleware, integration layers, applications), then program managed a full go-to-market product launch, scaling out to include Tech, Business (Finance, Marketing, Supply Chain) & Operations (Customer Care, Retention, Content, Legal, Regulatory) - as senior program manager. I also owned the full plan of starting a business from scratch to launch (a video streaming company) in 8 months. I did a stint in management consulting, running the top 5 business projects for a $3 billion run-rate business - which some companies might call Tier 0/1 initiatives - where I co-ordinated these large-scale programs as Chief Program Director, delivering through multiple business lines, multiple project management offices and multiple product and engineering teams. In a sense, I served as the CxO program manager, advisor and delivery owner.

It is with this experience and knowledge that I dare to share my work experiences - and I'm not making these things up - you can check my LinkedIn recommendations page for proof.

During my tenure as the lead program director mentioned above, I often found myself picking up and repairing distressed programs - and along the way, I'd help improve team processes and coach the management teams as well. I also ran new business & technology initiatives from scratch, start-to-finish-then-handover. So with this diverse experience, I developed a simple method that helped me navigate both types of program scenarios: either resetting or starting from scratch, the simple, powerful mechanism of a Project Charter document. To this day, I'm surprised to see many program & project managers failing to use the Project Charter in the way it was meant to be used (clarifying the essence), and often find less-experienced, newly minted PMP/Prince2 certified professionals, doing it "by the book". My approach to project charters went much deeper than that...

So what do I mean by using the Project Charter in "clarifying the essence" then?

Being a seasoned, experienced project leader, chief program director, end-to-end project manager, senior technical program manager, etc. - call the roles what you like - is, in my view, not just about putting a plan together, working backwards from a deadline or target completion date. No, I believe senior program leaders must apply their minds to appreciate the bigger picture and create a program structure that becomes the north star in guiding and leading multiple delivery teams. I never started a program without first establishing my project charter, which at the top level, focuses on the following:
  • Start by understanding the why. Why is this program needed? Why is it important?
  • Move on to understanding the who. Who are the sponsors, stakeholders and teams impacted? Who will be working on the program? "First who, then what"
    • A program manager must be sufficiently well-versed with all the roles expected from the program, and work hard to secure the roles needed. Yes, this means the program manager must escalate to get the people needed for the program (on the bus, as well as off the bus). A responsible program manager would raise all these risks & concerns up-front, before officially kicking off the program.
  • Clarify the what, including calling out what's missing - Set up the mental model for the program. What is this program about? What is it not about? What's in scope? What's not in scope? What workstreams make up the program? How do all pieces come together?
  • Agree, Align, Action - The 3 As of project execution involve agreement on the deliverable, alignment of all parties involved which includes acceptance of their workstreams and ultimately agreeing on the action plan to execute.
Project charters don't necessarily have to be communicated in a written document; a slide deck is more than adequate to communicate the essence, depending on the business environment and culture (for example, some business cultures prefer slide decks over detailed documents to save reading time, whilst others, like Amazon, insist on detailed text narratives). So a seasoned project leader must adapt their style to suit the particular business need & culture of the teams.

In this post, I'm sharing a version of a Project Charter as a slide deck. In a future post, I will share a detailed 50+ page document project charter that involved the launch of a consumer electronics device, the program covered a mix of engineering, business and operations workstreams.

Example Program: Transform Digital Self-Service of a $3 billion run-rate business

I was called in to help reset and kickstart an overarching cross-cutting program to improve a selection of key metrics that would result in increased usage of digital self-service channels, improving customer satisfaction and overall reducing operational costs. This program covered the full value chain delivering the service: 3rd party technology vendors developing phone "mobi" apps using USSD, an iOS/Android self-service app, Website, Payments, a hardware kiosk station, a set-top box interactive application, integration with internal & 3rd party CRM/Billing systems - and the resulting business workflows: finance, customer care & banking channels. Technology teams were spread between the CTO/CIO lines (3 IT pillars), and business teams reported separately to the CEO. The program also served the needs of Group Strategy, Risk & Regulatory. Bringing all these things together requires a steady hand, a tactful negotiator, a strategic and business mindset as well as a strong technology leader. This is why I enjoyed such challenges - these programs were never boring, nor limited to only tech/engineering.

Enough said, let the slides talk and let me know what you think in the comments!

Sunday, 9 April 2023

How I scaled engineering ops excellence to ±10X with Mission Control


Continuing with sharing my experiences as CTO, in this post I share the actions I took to help improve an engineering organisation's operational health in our journey of scaling an online video streaming platform from 1X to 10X, from May 2017 to October 2020. Getting to a 10X improvement takes a journey, which I achieved in under 3 years; after reaching the goal, I decided I'd learnt enough of the CTO experience and exited, having set up a strong leadership succession pipeline.

To get an idea of some of the major themes that I tackled during this time: as a leader I had to lead from the front, back, left and right, setting the direction for my managers to follow (as all of these interventions were new to them), whilst doing my best to respect what came before:
  • Establishing the team despite constant re-orgs going on at parent company - getting the right people in the right roles at the right time
  • Transforming a rag-tag undisciplined team to a disciplined, clear-headed, focused organised unit 
  • Introducing laser focus on product engineering by unbundling non-core video apps to other businesses
  • Being critical on the technology platform by establishing a baseline of the architecture, using third party auditors to rate the scalability of the platform
  • Improving physical infrastructure: networking, compute, storage and data centres - moving away from self-hosted and self-managed data centres to partnering, shutting down data centres as needed.
  • Building an industrial-grade networking stack, leveraging modern peering facilities and overhauling the server infrastructure
  • Setting the roadmap for cloud by transitioning first from single region data centres, to multiple data centre deployments, to running multiple stacks simultaneously, introducing containers and microservices then finally getting ready for cloud and leaping first into serverless paradigms
  • Embracing cloud partnerships with big players: Akamai, Microsoft, AWS, etc.
  • Improving product and engineering delivery by revamping and overhauling the agile work processes and backlog management.
  • Introducing communications mechanisms that helped remove doubt and earned trust across the many different business units and teams (we were known as the online pirates doing their own thing)
  • Improving risk, governance and security - bringing it to the top, raising awareness
  • Creating strategic partnerships internally and externally to leverage skills and expertise I couldn't get in-house or afford to build or manage ourselves
  • Introducing technical operations controls - Mission Control, more active management of operations daily, 24/7, with increased focus, planning and prep for peak times, like weekends and major events.
  • Aggressively reducing costs on key platform components whilst capitalising on gains through economy of scale
In this post, I'm going to share some of the early context and interventions I introduced in my first 3 months on the job, that remain effective to this day, more than five years later: Mission Control Ops.

The dreaded 403: "We're sorry, something went wrong"

I took over a team that wasn't prepared for the intense discipline needed to run and operate a highly available 24/7/365 platform. There were many reasons for this, which I might touch on in another post. I recall coining the term "Bloody May" as the month of so many outages that I wondered: 1\ What on earth have I taken on? 2\ Is my life going to be consumed by work from now on? 3\ Is there any hope for this platform? 4\ How am I going to turn this platform around? 5\ How much is the job worth to me?

It turned out there were going to be many more "Bloody Mays" in 2017 whilst my team set about improving stability. In 2017, the platform experienced outages that racked up about 20 days of downtime in one year, which equates to roughly 94.5% availability - unacceptable for a video streaming platform. When I left the team in Oct 2020, we had turned the platform around to reach 99.5% availability, trending higher. Today, 5 years on, I'm told the availability is much higher, but their usage profile has drastically changed (reduced the number of concurrent streams to one device only, reduced the devices supported, and moved most of the services to AWS).