Tuesday, 2 December 2025

The Antigravity Chronicles: A Week of Refactoring, Regret, and Redemption

In a previous post I shared my first impressions of Antigravity. Here's what I was actually able to achieve in less than a week. For those of you still doubting the revolution and its impact on software developers: it's time to start waking up. I believe AI coding agents will disrupt frontend software development.

The rest of this blog post was written by Gemini, from within the Antigravity IDE (another use case for Antigravity: you don't need to leave the console to write a blog post!).

I asked it to review our changes from the past week and write a blog post reflecting on the journey so far. As usual, AI models are ever the optimists; in reality I spent a lot of time coaxing, cursing, shouting at, and fixing up both Claude's and Gemini's mistakes, sometimes undoing a lot of work and starting over again!

The Spark

It started with a simple frustration: Mermaid.js wasn't cutting it.

On November 24th, I looked at our Gantt charts and realized they were static, rigid, and frankly, boring. I wanted interaction. I wanted to drag, drop, and resize. I wanted "Antigravity"—a branch where we would defy the weight of our legacy code and build something modern.

I didn't know I was about to break everything.

The Good: High Velocity & The "Frappe" Win

The early days were a rush. We integrated Frappe Gantt, and suddenly, the application felt alive.

  • Dynamic Resizing: We built a split-pane view that finally let users control their workspace.
  • Smart Scrolling: We conquered the "sync scroll" beast, ensuring the task table and chart moved in perfect harmony.
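The heart of that "sync scroll" fix is mapping one pane's scroll ratio onto the other, with a guard so the two handlers don't feed back into each other. A minimal sketch of the idea (function names are mine, not the app's actual code):

```javascript
// Map a source pane's scroll position onto a target pane of a different
// height, preserving the relative position. Pure function, so it can be
// unit-tested without a DOM.
function mapScrollTop(sourceScrollTop, sourceMaxScroll, targetMaxScroll) {
  if (sourceMaxScroll <= 0) return 0; // source pane doesn't scroll
  const ratio = sourceScrollTop / sourceMaxScroll;
  return Math.round(ratio * targetMaxScroll);
}

// Wiring sketch: the `syncing` flag prevents the two scroll handlers
// from re-triggering each other in an endless feedback loop.
function syncPanes(tablePane, chartPane) {
  let syncing = false;
  const link = (src, dst) =>
    src.addEventListener('scroll', () => {
      if (syncing) return;
      syncing = true;
      dst.scrollTop = mapScrollTop(
        src.scrollTop,
        src.scrollHeight - src.clientHeight,
        dst.scrollHeight - dst.clientHeight
      );
      requestAnimationFrame(() => { syncing = false; });
    });
  link(tablePane, chartPane);
  link(chartPane, tablePane);
}
```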

The velocity was intoxicating. We were shipping features daily. But speed has a price.

The Bad: The Monolith & The Mess

As we pushed new features, the cracks in our foundation began to show.

  • The 2600-Line Monster: Our style.css had grown into a terrifying monolith. Changing a button color in the "Settings" view would inexplicably break the layout in "Roadmaps."
  • Dependency Hell: At one point, we were loading multiple versions of libraries, causing race conditions that made the Gantt chart flicker like a strobe light.
  • The "Safety" Check-ins: You can see the panic in the git logs.
    Commit c628846: "safety checkin - fixed and broke service dependencies caroussel"

We were fixing one thing and breaking two others. The codebase was fighting back.

The Ugly: "I've Lost a Feature"

The low point came on December 1st. In our zeal to refactor the OrgView, we got careless. We deleted a chunk of legacy code that generated the engineer table. It wasn't just a bug; it was a regression.

Commit d3ebd4f: "checkin even if I've lost a feature in roadmaps :-("

This was the "Ugly." The moment where you question if the refactor is worth it. We had to go dumpster diving into the git history (specifically commit a0020189) to retrieve the generateEngineerTable function. It was a humbling reminder that "legacy" often means "working."

The Redemption: The Monolith Buster

We couldn't keep patching holes. We needed a structural change.

On December 2nd, we launched Operation Monolith Buster.

  • The Strategy: Divide and conquer. We identified every view (Org, Gantt, Roadmap) and gave it its own CSS file.
  • The Execution: We slashed style.css from 2,600 lines down to a manageable core.
  • The Result: 12 new modular files. 100% component isolation. Peace of mind.
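For anyone attempting a similar monolith bust: the discipline that kept us honest was making the core stylesheet a pure manifest of modules plus shared tokens. A sketch of the pattern (illustrative, not our literal file, and assuming @import is acceptable for your load profile):

```css
/* style.css — core only: shared design tokens plus a manifest of modules. */
:root {
  --accent: #4a90d9; /* shared tokens live in the core */
}

/* Foundation components */
@import url("foundation-components/buttons.css");
@import url("foundation-components/modals.css");

/* One file per view: a selector in org-view.css can no longer
   break the layout in roadmap-view.css. */
@import url("views/org-view.css");
@import url("views/roadmap-view.css");
@import url("views/gantt-planning-view.css");
```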

The AI Tag Team: Gemini & Claude

This week wasn't just a test of code; it was a test of the "Multi-Model Workflow." We switched between Gemini and Claude depending on the problem at hand, and the results were illuminating.

Where Claude Shined: The Architect

When it came to the CSS Monolith Buster, Claude was the surgeon.

  • Strength: Precision in refactoring. I could paste a 500-line CSS file and ask it to "extract the Gantt styles," and it would do so with surgical accuracy, rarely missing a bracket.
  • The "Aha" Moment: Claude suggested the folder structure reorganization that finally made sense of our chaotic js/ directory.

Where Gemini Shined: The Visionary & The Fixer

Gemini was our "big picture" thinker and our debugger.

  • Strength: Contextual awareness. When we broke the OrgView, Gemini was the one that helped us "time travel" through the git history to find the missing generateEngineerTable function. It understood the intent of the missing feature, not just the code.
  • The "Aha" Moment: The narrative you're reading right now? That's Gemini. Its ability to synthesize logs, commits, and user intent into a coherent story is unmatched.

The Trip-Ups (The "Hallucinations")

It wasn't all smooth sailing.

  • The Over-Confidence: Both models struggled when I asked for "blind" fixes without enough context. The regression in commit d3ebd4f happened because we trusted an AI to "clean up unused code" without verifying if it was actually unused.
  • The Lesson: AI is a powerful accelerator, but it lacks object permanence. It doesn't "know" you need that feature you haven't touched in 3 months unless you tell it.

Lessons Learned

  1. Respect the Legacy: Don't delete code you don't understand until you're sure you don't need it.
  2. Commit Often, Even the Broken Stuff: Those "broken" commits saved us. They gave us a checkpoint when we needed to backtrack.
  3. Use the Right Model for the Job: Use Claude for structural refactoring and strict logic. Use Gemini for creative synthesis, debugging, and "big context" reasoning.
  4. AI is a Co-pilot, Not an Autopilot: The AI helped generate code fast, but it took human oversight to spot that we were await-ing outside of an async function.
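That last lesson deserves a concrete illustration; this is the class of bug the agents kept introducing (a minimal reproduction, not our actual code):

```javascript
// In a classic <script>, `await` is only legal inside an async function.
// The agent-generated version looked roughly like this and failed to parse:
//
//   const plan = await loadPlan();   // SyntaxError: await is only valid
//                                    // in async functions (or ES modules)
//
// The fix is to make the enclosing function async (or switch the file to
// an ES module, where top-level await is allowed):
async function refreshPlan(loadPlan) {
  const plan = await loadPlan(); // now legal: we're inside an async function
  return plan;
}

// Usage sketch with a stubbed loader:
refreshPlan(() => Promise.resolve({ initiatives: 3 }))
  .then((plan) => console.log(plan.initiatives)); // prints 3
```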

Conclusion

The antigravity branch lives up to its name. We aren't just weighed down by technical debt anymore. We have a modern navigation system, a modular CSS architecture, and a powerful new Gantt engine.

It wasn't a straight line. It was a messy, chaotic, beautiful week of coding. And I wouldn't have it any other way.


By The Numbers

  • Total Commits: 80+
  • Net LoC Change: -1,200 (We deleted more than we wrote!)
  • Panic Commits: ~3
  • Features Resurrected: 1
Before Refactor (commit 0b82efa):

css/
    components/
        Modal.css
        aiChatPanel.css
        enhancedTableWidget.css
        ganttChart.css
        header.css
        management.css
        notifications.css
        roadmapModal.css
        settings.css
    layout/
        main-layout.css
        sidebar.css
    style.css                  // 2600+ lines (MONOLITH)
js/
    accomplishmentsView.js
    capacityTuning.js
    dashboard.js
    data.js
    documentation.js
    editSystem.js
    featureFlags.js
    ganttPlanning.js
    goalsView.js
    impactView.js
    main.js
    mermaidGenerator.js
    orgView.js
    roadmap.js
    roadmapTableView.js
    sdmForecasting.js
    utils.js
    visualizations.js
    yearPlanning.js
    components/
        HeaderComponent.js
        ManagementView.js
        RoadmapInitiativeModal.js
        SettingsView.js
        SidebarComponent.js
        SystemsView.js
        WorkspaceComponent.js
        enhancedTableWidget.js
    gantt/
        FrappeGanttRenderer.js
        GanttFactory.js
        GanttRenderer.js
        MermaidGanttRenderer.js
        ganttAdapter.js
        ganttGenerator.js
    managers/
        NavigationManager.js
        NotificationManager.js

After Refactor (current state):

css/
    style.css                  // Core only
    views/                     // NEW: Modular
        accomplishments-view.css
        capacity-tuning-view.css
        dashboard-view.css
        documentation-view.css
        gantt-planning-view.css
        goals-view.css
        org-view.css
        roadmap-table-view.css
        roadmap-view.css
        sdm-forecasting-view.css
        system-edit-view.css
        visualizations-view.css
        welcome-view.css
        year-planning-view.css
    foundation-components/     // NEW: Design Sys
        ai-chat.css
        buttons.css
        cards.css
        collapsible.css
        d3-visualizations.css
        dependencies.css
        dropdowns.css
        dual-list.css
        forms.css
        legends.css
        loading.css
        modals.css
        system-selector.css
        tooltips.css
js/
    repositories/              // NEW: Data
        SystemRepository.js
    state/                     // NEW: State
        AppState.js
    components/
        DashboardView.js
        HeaderComponent.js
        ManagementView.js
        OrgView.js
        RoadmapInitiativeModal.js
        RoadmapView.js
        SettingsView.js
        SidebarComponent.js
        SystemsView.js
        WorkspaceComponent.js
        enhancedTableWidget.js

Monday, 1 December 2025

Living in Antigravity: A Weekend with Gemini 3, Claude Opus 4.5, Codex 5.1 Max and the Death of "Clean Code"

I spent my entire weekend inside Google’s Antigravity IDE, juggling the three titans of late 2025: Gemini 3 Pro, Claude Opus 4.5, and OpenAI Codex 5.1 Max.

My test subject? A "Franken-app" I started building back in the stone age of September 2024. Back then, I was using GPT-4o and the early, limited versions of Claude Sonnet. At the time, those models felt like over-eager interns: they generated code fast, without thinking, and prioritized "showing it works" over best practices.

Just over a year later, the landscape has shifted. The "Vibe Coding" hype suggested that software engineering was dead. I’m here to tell you that it’s not dead—but it has mutated into something entirely different.

Here is a video showing the new App refresh versus the old style side-by-side:


Here is what I learned re-wiring a year-old messy codebase into an enterprise-grade web app in 48 hours.

The Antigravity Experience: 13 Days In

As I write this, Google Antigravity is less than two weeks old. I installed it on Day 2, and honestly? It was rough. It froze, crashed, and felt heavy. But after the stability patch earlier this week, something clicked.

The "Killer App" isn't just the code generation; it’s the Conversational Context. With Gemini 3 Pro integrated directly into the IDE, I found myself managing a large-scale refactor not by typing code, but by debating architecture. I could propose ideas, ask Gemini to critique them, and even generate UI mockups right there in the console—no context switching to Figma or Photoshop.

However, it’s not perfect.

  • The "Human Bottleneck": Antigravity encourages you to run parallel workspaces with multiple agents. But I found that while the machine is multi-threaded, I am not. Context switching between three different agents causing chaos in three different branches is mentally exhausting.

  • The "Browser Agent": Promising, but slow. It’s fascinating to watch an AI navigate your app to test it, essentially replacing the QA role, but right now, it struggles to "drive" complex applications without crashing.

The Model Showdown: December 2025 Edition

Since Antigravity lets you swap models, I played them against each other. Here is the reality check:

1. Claude Opus 4.5 (The "Academic"): Supposedly the best coding model in the world. It provides the best analysis and writes detailed, "Senior Engineer" style impact assessments. However, it still makes junior mistakes. It would change a CSS class to fix one page, completely forgetting the cascading impact on the rest of the site. High IQ, low spatial awareness.

2. Gemini 3 Pro (The "Collaborator"): The best conversationalist. It holds the "big picture" context better than the others, but it struggles with long-running threads. Eventually, the context gets so heavy that it stalls or starts hallucinating old patterns.

3. OpenAI Codex 5.1 Max (The "Brute Force"): When I hit a wall, I called in Codex. It lacks foresight and nuance, but it is 100% confident and usually finds a way to "just make it work."


The Big Question: Is Spaghetti Code Actually Bad??

This brings me to my most controversial observation.

We spend so much time prompting these models to respect encapsulation, avoid global namespaces, and write "Clean Code." But why? We are forcing AI to mimic human limitations.

I successfully transformed a 90s-style interface into a Jira-level Enterprise UX in one weekend. The code under the hood? It’s a mess. It is clearly "vibe coded." A human engineer would weep looking at the dependency graph.

But I ask you: So What?

If the AI writes the code, maintains the code, and fixes the bugs, why does a human need to understand it?

  • Machines are better than us at reading large datasets and understanding complex, spaghetti-like relationships.

  • If the application is secure and performant, the "readability" of the code is purely a vanity metric for human ego.

We are moving toward a future where we trust the machine to do the job. The role of the software engineer is shifting from "Author" to "Architect" and "Reviewer." You still need a technical background to know what to ask for—you need to know what a "race condition" is to spot when the agent creates one—but you no longer need to care about how the syntax is structured.

The Missing Pieces: How Antigravity Can Win

While Antigravity is impressive, my weekend of deep work exposed three critical gaps where the "Agent" promise falls short of the reality. If Google wants this to replace VS Code, here is what needs to happen next.

1. We Need a "Time Machine" for Code

Right now, Antigravity operates in a linear flow. If an Agent goes down a rabbit hole for 20 minutes and ruins the codebase, your only option is to manually git revert or undo endlessly. The Fix: We need Conversation Forking and System Restore Points.

  • Imagine a "DVR" slider for your development session. I want to scrub back to 2:00 PM, right before I asked Gemini to "refactor the auth module," and fork the reality there to try a different prompt.

  • We need "Save States" for the Agent’s brain, not just the code. I want to revert the context, not just the files.

2. The Context "Amnesia" Problem

I found that even with massive context windows, I am still the only thread tying everything together.

  • The Problem: If I start a new workspace for a "UI refresh," that agent has no idea I just spent three hours discussing database schema changes in a different thread.

  • The Fix: Cross-Conversation Memory. Agents need a shared "hive mind" or a persistent Project Knowledge Graph. If Agent A makes a decision in Thread 1, Agent B in Thread 2 should instantly know about it without me having to copy-paste the summary.

3. "Architect Mode" vs. "Intern Mode"

It is baffling that in late 2025, I still have to explicitly prompt a state-of-the-art model to "avoid global namespaces" or "separate concerns."

  • The Reality: These models are still trained on the average of all GitHub code—which means they default to "average" (spaghetti) quality. They prioritize working code over maintainable code.

  • The Fix: I shouldn't have to prompt-engineer basic competence. Antigravity needs a toggle for "Strict Architect Mode"—a system-level guardrail that forces the Agent to adhere to SOLID principles and Design Patterns before it writes a single line of code. I don't want an eager intern; I want a Senior Engineer who pushes back when I ask for something stupid.

Observations So Far: The Paradox of Productivity

I have successfully rewired a 90s-style web app into a modern enterprise platform in 48 hours. The result is undeniable, but the process felt like herding cats.

  • The "Vibe Coding" Reality: The app looks great, but under the hood, it is a "Vibe Coded" mess. The dependency graph looks like a bowl of spaghetti.

  • The Stability Tax: I found myself having to "close-and-restart" Antigravity multiple times to clear out "ghosts" in the machine.

  • The Human Bottleneck: The promise of agents is parallelism, but I am still single-threaded. Trying to manage three agents simultaneously is mentally exhausting. Instead of writing code, I am now a Project Manager constantly unblocking three different junior developers who type at the speed of light but forget instructions every hour.

The value proposition is compelling, but we are not there yet. We have replaced "syntax errors" with "context errors," and I am not sure which one is harder to debug.

The Verdict

Antigravity is buggy, resource-hungry, and requires a "close-and-restart" every few hours. But it is undeniable that the tide has turned. Frontend engineering as we knew it in 2024 is gone.

The future isn't about writing code. It's about managing the agents that write it for you—and perhaps, learning to love the spaghetti.

Note: This blog post was initially written by me and later tidied up with Gemini Pro.

Sunday, 23 November 2025

How to use AI to generate software architecture diagrams from simple prompts

In a previous post, I shared some of my experience adding AI features to this proof-of-concept application I've been building entirely with AI coding assistants. This weekend, I found a way past the obstacle I'd faced when attempting to use Gemini to generate images through its Imagen API. A much simpler way to realise my goal of getting an AI to generate architecture diagrams from a simple prompt is to instruct the LLM to produce Mermaid diagrams.

All it needed was a carefully crafted system prompt instructing the AI to create the Mermaid diagram syntax from the data model, and voilà! We now have a way to generate almost any diagram one can think of.
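The pattern is simple enough to sketch: serialise the data model into the system prompt and constrain the model to reply with nothing but Mermaid syntax. The function and field names below are illustrative, not the app's actual code:

```javascript
// Build a system prompt that turns an LLM into a Mermaid generator.
// The data model is embedded as JSON; the reply is expected to be raw
// Mermaid text we can hand straight to the renderer.
function buildMermaidPrompt(systemModel, diagramType = "flowchart TD") {
  return [
    "You are a software-architecture diagramming assistant.",
    `Produce a single Mermaid "${diagramType}" diagram and nothing else.`,
    "Use only entities and relationships present in this data model:",
    JSON.stringify(systemModel, null, 2),
    "Reply with raw Mermaid syntax only - no prose, no markdown fences.",
  ].join("\n\n");
}

// Example: a toy model of two services and one dependency.
const model = {
  services: ["OrderService", "PaymentService"],
  dependencies: [{ from: "OrderService", to: "PaymentService" }],
};
const prompt = buildMermaidPrompt(model);
```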

Take a look at this video:


And here are the built-in native features of the app that show different Mermaid diagrams for the system under inspection:



Play with the app here
Source code
Project Wiki


Thursday, 20 November 2025

Demonstrating the power of bulk changes to delivery plans

So I've added some bulk operations to the SMT app.

Bulk AI Agent Scenarios (Macro Operations)

Empower the agent to perform complex, multi-entity updates that are tedious to do manually. Each plan highlights the scope before execution (e.g., “This will affect 12 teams...”).

1) Capacity & Resourcing Manager

  • User intent: “Reduce all teams’ capacity by 20% to account for burnout.” / “Set everyone’s AI productivity gain to 15%.”
  • Tool: bulkUpdateTeamCapacity (supports capacityReductionPercent, aiProductivityGainPercent, avgOverheadHoursPerWeekPerSDE, plus filters by teamIds or orgIdentifier).

2) Strategic Portfolio Manager

  • User intent: “Move all ‘Low ROI’ initiatives to the Backlog.” / “Approve all initiatives under the ‘Cloud Migration’ goal.”
  • Tool: bulkUpdateInitiatives (criteria: goalId, themeId, roiValue, confidenceLevel, status, isProtected).

3) Scope Trimmer (“Haircut” Tool)

  • User intent: “Reduce the scope of all committed initiatives by 10% to fit the plan.”
  • Tool: bulkAdjustInitiativeEstimates (adjustmentFactor scales SDE-year assignments; same criteria options as above).

4) Org Restructurer (Advanced)

  • User intent: “Move all teams from John Doe to Jane Smith.”
  • Tool: bulkReassignTeams (moves all teams from one SDM to another).

Safety checks

  • Bulk operations can be destructive; the agent surfaces an impact summary before running (e.g., “Reducing capacity for 15 teams”).
  • Plans are shown to the user before execution; review the description to confirm scope.

Example scenarios

  • Scenario A: “Austerity” Plan — “Increase overhead to 10 hours/week for everyone and move all ‘Low’ confidence initiatives to the backlog.”
    • Plan: bulkUpdateTeamCapacity({ avgOverheadHoursPerWeekPerSDE: 10 }) + bulkUpdateInitiatives({ status: 'Backlog' }, { confidenceLevel: 'Low' })
  • Scenario B: “AI Boost” — “Assume 20% productivity gain for all teams.”
    • Plan: bulkUpdateTeamCapacity({ aiProductivityGainPercent: 20 })
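Under the hood, each plan is just data, which makes the impact-summary step straightforward: count what the criteria match before mutating anything. A simplified model of bulkUpdateInitiatives (the field names follow the tool descriptions above, but this implementation is my own sketch, not the app's):

```javascript
// Simplified bulk tool: filter initiatives by criteria, report the blast
// radius, and only mutate when the returned plan is explicitly executed.
function planBulkUpdateInitiatives(initiatives, update, criteria = {}) {
  const matches = initiatives.filter((ini) =>
    Object.entries(criteria).every(([key, value]) => ini[key] === value)
  );
  return {
    summary: `This will affect ${matches.length} initiative(s).`,
    execute: () => matches.forEach((ini) => Object.assign(ini, update)),
  };
}

// "Austerity" scenario: move all Low-confidence initiatives to the Backlog.
const initiatives = [
  { name: "Search revamp", confidenceLevel: "Low", status: "Committed" },
  { name: "Billing v2", confidenceLevel: "High", status: "Committed" },
];
const plan = planBulkUpdateInitiatives(
  initiatives,
  { status: "Backlog" },
  { confidenceLevel: "Low" }
);
console.log(plan.summary); // shown to the user before anything changes
plan.execute();            // only runs once the user confirms
```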

Here's a transcript of the chat - using AI agent to tweak the year plan:


How AI tooling can help software engineering and product managers improve productivity

Have you ever been curious to learn how popular systems like Excel, Visual Studio, Adobe Photoshop, Call of Duty, Amazon, Uber, AirBnB, ChatGPT, Netflix, and the Apple App Store are architected and built? Consider:

  • If you're a software manager designing a technical organisation, wouldn't you like insights into how the best technology teams are organised, drawing on how the big tech companies do it?
  • If you're stepping into product and program management, need to create a 3-Year Plan roadmap along with business goals and Return on Investment metrics, and want a single-page dashboard of overall status?
  • If you're a software delivery manager who needs to manage multiple delivery teams, co-ordinate the plan, and identify bottlenecks?
  • If you're a senior software leader who wishes to analyse where your engineering resources are being consumed and model capacity and hiring scenarios, wouldn't you like a tool to provide insights?
  • If you received a top-down instruction to reduce headcount by 25% and must model the impact on your roadmap, or need to optimize your delivery plan against various funding scenarios, wouldn't it be nice to do this quickly?
  • If you could run scenarios and see the impact in near real-time, wouldn't that save you a lot of time?
  • Wouldn't you like a tool that generates product and strategic narratives for you without you writing a single sentence?
  • If you're preparing for interviews in domains you're not familiar with, wouldn't you like a tool that generates a simulated mock system, giving you rich context so you show up prepared and informed?

If you've answered yes to at least one of these questions, then you might just be interested in an app I'm POC'ing :-) Click here to learn about the app, or just click here to start playing with it.

I had some good fun this past weekend building out AI features for my app, "Software Management Tools", which is currently a single-page web application built entirely in vanilla JavaScript/HTML. It is hosted on GitHub and the source code is freely available here. This tool allows you to do everything above and much more. All you need to get started and experimenting is a Gemini API key (free tier). What's more interesting is that I didn't write a single line of code myself; thanks to Google's Gemini, OpenAI's Codex, GitHub and Visual Studio Code, I've become a copy-and-paste developer. Seriously though, I do believe frontend engineers are at risk of losing their jobs to AI, and with the concepts my tool demonstrates, the need for technical managers is also questionable!

AI Features Implemented (Using Gemini 2.5 Flash model)
  • Users can create a mock system, organisational structure and 3-year plan from a single simple prompt like "Create a spreadsheet platform like Microsoft Excel"
  • Users now have access to an integrated chat assistant that has context of the data model and the screen currently selected, where they can:
    • Learn more about the system architecture
    • Find out about platform dependencies like AWS, GCP services
    • Inspect bottlenecks in the architecture
    • View, Edit, Inspect and modify organisational and team structures through the chat
    • Inspect and Optimise the year plan using AI assistant
    • Write up business plans and narratives for a 3 year plan
    • Critique the engineering skills set, look for risks
    • Tweak the capacity constraints and model different capacity scenarios - with AI-generated suggestions
Where am I going next?
The possibilities are near endless - there's so much more to explore, but I've hit the classic stage of deciding to refactor the codebase to a more modern architecture, scrap the MVP and build again. Here are some ideas I'll progress next weekend:
  • Enable self-generating code contributions to the app itself
  • Build richer agents to perform complex operations from the chat like 
    • Reduce all teams' capacity by 20% and show the updated ATL/BTL plan
    • Move all initiatives under the Goal XYZ to BTL to free up capacity
    • Create a new system from the chat panel - an AI-wizard
    • Export the plan to Excel or Project
    • Create a detailed quarterly release plan for team XYZ
    • Merge these two teams into one team, reporting to a new SDM, move these services to the new team, keep remaining services in unallocated team
    • etc.
  • Build an appstore feature for users to share and publish their generated systems
  • Build an internal app API for the app to make agent creation more accessible
  • Refactor the codebase to cleanly separate out concerns - and make future porting easier
  • Start building out detailed planning features
  • Explore a backend-as-a-service to make simpler API key integrations possible for folks who are not software engineers
What have I learnt?
In just two days, I learnt a lot:
  • How to integrate AI-features into an app, navigating the API Key challenges
  • How to build a model router
  • How to fine tune a system prompt for the task at hand
  • How to pass JSON syntax to the LLM to model your data structure you need the LLM to return
  • How to validate generated data for compliance
  • How to build a chat assistant, with memory, history and context
  • How to build an agent framework
  • How to structure your code to make AI integrations portable and extensible
  • How to build effective retry and backoff mechanisms when AI is overloaded
  • How clunky the AI provider APIs are: whilst apps like ChatGPT and Gemini are multimodal, it is not as simple or clean to do this from an API. There should be a simple multimodal API for developers, with one endpoint to call and without overly complex security concerns
  • How powerful and fun coding can be again!
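The retry-and-backoff point deserves a snippet, since overload errors (HTTP 429/503) from the model APIs were a daily occurrence. A minimal exponential-backoff wrapper (my own sketch, not the app's actual code):

```javascript
// Retry an async call with exponential backoff plus a little jitter.
// Intended for transient failures such as HTTP 429/503 from an LLM API.
async function withBackoff(fn, { retries = 4, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of attempts: surface the error
      const delay = baseMs * 2 ** attempt + Math.random() * 100; // jittered
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage sketch: `callGemini` is a stand-in for your real API call.
// withBackoff(() => callGemini(prompt)).then(handleReply);
```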
Do you have some demos, I can't access the tool?
I thought you'd never ask. 

This video shows the AI creating a new system from prompt, using "Create with AI" feature:



This video shows using AI to inspect system architecture...

This video shows how AI can help manage product roadmaps...

This video shows how AI can assist with inspecting software delivery plans...

This video shows how managers can have fun using AI to manage their org structures...