AI Product Development Lifecycle
Took a forty-year-old Clarion codebase serving more than 1,500 hotels worldwide and rebuilt it as a cloud-native PMS using a 55-agent AI Product Development Lifecycle. Seven major launches in six months at roughly ten times conventional velocity.
- 7+ launches in 6 months
- ~10× velocity vs conventional
- 55 agents in pipeline
- 4-person core team
Problem
roomMaster is a 40-year-old Property Management System running on a Clarion codebase, deployed across more than 1,500 hotels worldwide. The schema had been denormalized and patched for two decades. There were no architecture documents. The discovery-to-ship loop for a single feature was 12 to 18 months.
The strategic ask wasn't a refactor. It was a cloud-native successor — roomMaster Nova — that customers could migrate to without losing their workflows, ahead of competitors who were already shipping AI-native PMS products. Conventional team scaling wouldn't get there: hiring fifty engineers takes nine months and won't fix the bottleneck, which is the cycle time of every decision in the pipeline.
Approach
I designed and shipped what we now call the AI Product Development Lifecycle (AI PDLC): an orchestrated multi-agent pipeline that takes a market signal and walks it through discovery, PRD generation, code synthesis, QA, release, and post-launch analysis with typed handovers and quality gates between each stage.
- Five orchestrators.
Divide the lifecycle into discovery, design, build, ship, and learn. Each orchestrator owns its stage's outputs as typed artifacts (intent docs, PRDs, ADRs, PRs, runbooks, dashboards). Handovers are schema-validated; a failed gate kicks back to the previous orchestrator with a structured error (a sketch of one such gate follows this list).
- Twenty-six skills.
The reusable atomic units the orchestrators compose: competitive analysis, JTBD synthesis, schema diff, migration generator, telemetry plan, etc. Each skill is a prompt + tool contract pinned to a specific model and a per-run budget (a skill-definition sketch also follows this list).
- Fifty-five agents.
Execute the skills with role-specific system prompts, retrieval contexts, and evaluation harnesses. The agent population is small enough to hold in your head but large enough to specialize.
- A coding PM, not a spec-writer.
I ship production PRs to the AI Support Agent platform daily. The pipeline can't be designed by someone who hands work over the wall; it has to be designed by someone who feels the round-trip cost of every gate.
- Instrument everything.
Every agent run, prompt, eval score, and human-in-the-loop intervention writes to LangFuse. The pipeline isn't a black box — it's a profilable system you can A/B test like any other product.
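To make the handover mechanics concrete, here is a minimal sketch of a schema-validated gate between the discovery and design orchestrators, assuming pydantic v2. The artifact fields, the gate name, and the GateRejection error type are illustrative assumptions, not the production schema.

```python
from pydantic import BaseModel, Field, ValidationError

# Illustrative artifact schema for the discovery -> design handover.
# Field names are assumptions; the real typed artifacts differ.
class IntentDoc(BaseModel):
    feature_id: str
    problem_statement: str = Field(min_length=50)    # force a real problem statement
    target_personas: list[str] = Field(min_length=1)
    success_metrics: list[str] = Field(min_length=1)

class GateRejection(Exception):
    """Structured error returned to the upstream orchestrator on a failed gate."""
    def __init__(self, stage: str, errors: list[dict]):
        super().__init__(f"{stage} gate failed")
        self.stage = stage
        self.errors = errors  # machine-readable, so the upstream agent can retry

def design_gate(raw_artifact: dict) -> IntentDoc:
    """Validate the discovery output before the design orchestrator accepts it."""
    try:
        return IntentDoc.model_validate(raw_artifact)
    except ValidationError as exc:
        # Kick back to the discovery orchestrator with a structured error,
        # not a free-text review comment.
        raise GateRejection("design", exc.errors()) from exc
```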
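And here is one way "a prompt + tool contract pinned to a model and a budget" can be expressed as a declarative skill definition. The dataclass fields, the example skill, and the pinned model string are assumptions for illustration, not the real registry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Skill:
    """One reusable atomic unit an orchestrator can compose."""
    name: str
    system_prompt: str
    tools: tuple[str, ...]    # tool contract: which tools the agent may call
    model: str                # pinned per skill, never inherited globally
    max_usd_per_run: float    # hard budget; the runner aborts past this
    eval_suite: str           # eval harness that must pass before rollout

# Hypothetical instance; names, model version, and numbers are illustrative.
SCHEMA_DIFF = Skill(
    name="schema_diff",
    system_prompt="Compare the legacy and Nova schemas and emit a migration plan...",
    tools=("read_schema", "run_sql_readonly"),
    model="gpt-4.1-2025-04-14",
    max_usd_per_run=0.50,
    eval_suite="evals/schema_diff.yaml",
)
```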
AI PDLC is not a productivity tool story. It is a new operating architecture for how software gets built.
What shipped
Inside the same six months the AI PDLC produced:
- roomMaster Web App.
DAU +700% on US launch day; a single API contract unblocked seven dependent products.
- Booking Engine.
+35% conversion, +15% booking value, +60% mobile share.
- Channel Management.
99.95% uptime across the launch quarter.
- AI Support Agent.
Multichannel (chat / voice / email) with 30–40% of cases resolved automatically; PCI guard, PII redactor, and hallucination guard layered into a 5-stage safety pipeline (sketched after this list).
- AI Revenue Management.
+35% RevPAR, +40% ADR, 29 hours per month saved per property.
- AI Concierge.
+35% bookings, +60% CSAT, –35% operational cost.
- Plus.
Housekeeping mobile app, Admin Portal, and Metasearch integrations.
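For the AI Support Agent's safety layers, the shape is a sequential pipeline where any stage can block or rewrite the outgoing reply before it reaches a guest. The sketch below is a toy stand-in: the Draft type, the stage ordering, and the checks themselves are assumptions, and only three of the five stages are shown.

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class Draft:
    text: str
    blocked: bool = False
    reason: str | None = None

# Each stage takes a draft reply and either passes it through, redacts it, or blocks it.
def pci_guard(d: Draft) -> Draft:
    if re.search(r"\b\d{13,16}\b", d.text):  # looks like a card number
        return Draft(d.text, blocked=True, reason="possible PAN in reply")
    return d

def pii_redactor(d: Draft) -> Draft:
    redacted = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[email redacted]", d.text)
    return Draft(redacted, d.blocked, d.reason)

def hallucination_guard(d: Draft) -> Draft:
    # Placeholder: the real guard checks claims against retrieved sources.
    return d

SAFETY_PIPELINE: list[Callable[[Draft], Draft]] = [
    pci_guard, pii_redactor, hallucination_guard,
    # ...two further stages omitted from this sketch
]

def run_safety_pipeline(reply: str) -> Draft:
    draft = Draft(text=reply)
    for stage in SAFETY_PIPELINE:
        draft = stage(draft)
        if draft.blocked:
            break  # escalate to a human instead of auto-resolving the case
    return draft
```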
Every launch shared the same instrumentation, the same release runbook, the same SLO scaffolding — because the pipeline produced them, not seven different teams writing seven different conventions.
Outcome
- Cycle time: weeks → days.
PRD-to-merged-PR median dropped from weeks to days; single-feature ideas now move from intent doc to production behind a flag inside one calendar day.
- Headcount efficiency: 4 people, ~10× velocity.
A four-person core team outpaces the conventional baseline we measured against on the legacy product by roughly an order of magnitude.
- Quality bar: zero critical incidents.
No critical post-release incidents on the launches above; the 5-stage safety pipeline on the AI Support Agent passes 45+ unit tests per change.
- Durability: compounds, doesn't decay.
The pipeline is the team's institutional memory. New skills and agents compose into it without re-architecting the orchestrators.
What I’d do differently
- Invest in agent observability earlier. LangFuse went in at month three. The two months before that, we were debugging by re-reading logs. Profile from day one.
- Pin model versions per skill, not globally. A model upgrade is a regression risk. Treat models like any other dependency.
- Build the eval harness before the skill, not after. Backfilling evals into already-shipped skills is twice the work (a minimal harness sketch follows this list).
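A minimal illustration of "eval harness before the skill": write the golden cases and the assertion loop first, then let the skill implementation grow into them. The JSONL case format, the must_contain check, and the pass threshold are assumptions, not the production harness.

```python
import json
from pathlib import Path

def evaluate_skill(run_skill, cases_path: str, pass_threshold: float = 0.9) -> bool:
    """Run a skill against a golden-case file and gate rollout on the pass rate.

    `run_skill` is whatever callable executes the skill; `cases_path` points to a
    JSONL file of {"input": ..., "must_contain": [...]} records (illustrative format).
    """
    cases = [json.loads(line) for line in Path(cases_path).read_text().splitlines() if line]
    if not cases:
        raise ValueError("no golden cases: write the cases before the skill")
    passed = 0
    for case in cases:
        output = run_skill(case["input"])
        if all(snippet in output for snippet in case["must_contain"]):
            passed += 1
    pass_rate = passed / len(cases)
    print(f"{passed}/{len(cases)} golden cases passed ({pass_rate:.0%})")
    return pass_rate >= pass_threshold
```

The point is the ordering, not the sophistication: the harness exists before the first prompt is written, so every prompt or model change is measured against the same cases.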