Continuing the series on my experiences as CTO, this post covers the actions I took to improve an engineering organisation's operational health while scaling an online video streaming platform from 1X to 10X, from May 2017 to October 2020. Getting to a 10X improvement is a journey, one I completed in under 3 years. After reaching the goal, I decided I had learnt enough from the CTO experience and exited, having first put a strong leadership succession pipeline in place.
- Establishing the team despite constant re-orgs at the parent company - getting the right people in the right roles at the right time
- Transforming a rag-tag, undisciplined team into a disciplined, clear-headed, focused and organised unit
- Introducing laser focus on product engineering by unbundling non-core video apps to other businesses
- Taking a critical view of the technology platform by establishing an architecture baseline and using third-party auditors to rate the platform's scalability
- Improving physical infrastructure (networking, compute, storage and data centres): moving away from self-hosted, self-managed data centres to partnering, and shutting down data centres as needed
- Building an industrial-grade networking stack, leveraging modern peering facilities and overhauling the server infrastructure
- Setting the roadmap for cloud: transitioning first from single-region data centres to multiple data centre deployments, then to running multiple stacks simultaneously, introducing containers and microservices, and finally getting ready for cloud, leaping first into serverless paradigms
- Embracing cloud partnerships with big players: Akamai, Microsoft, AWS, etc.
- Improving product and engineering delivery by overhauling the agile work processes and backlog management
- Introducing communication mechanisms that helped remove doubt and earned trust across the many different business units and teams (we were known as the online pirates doing their own thing)
- Improving risk, governance and security - bringing them to the top of the agenda and raising awareness
- Creating strategic partnerships internally and externally to leverage skills and expertise we couldn't get in-house or afford to build or manage ourselves
- Introducing technical operations controls - Mission Control: more active daily management of operations, 24/7, with increased focus, planning and preparation for peak times such as weekends and major events
- Aggressively reducing costs on key platform components whilst capitalising on gains through economies of scale