A Team of 4, 12 Microservices, and an AI Teammate

There's a specific kind of dread that settles in on a Sunday evening when your on-call week is about to start. For our OMS team (one PM and three engineers), that feeling was amplified by the twelve microservices we shepherded. We were perpetually drowning in context-switching, boilerplate code, and a growing backlog. We weren't building. We were bailing water.

The problem: death by context switches

Our microservice architecture, born from the best of intentions, had matured into a constellation of siloed knowledge. Each service was its own kingdom of forgotten lore and undocumented tribal wisdom. Adding a new return reason felt like orchestrating diplomacy between twelve mutually suspicious nations.

The pain points:

Context switching: Tracing a single user journey meant jumping between multiple codebases, deployment pipelines, and dashboards. Productivity was theoretical.
Boilerplate treadmill: Writing the same NestJS modules, DTOs, repository patterns, and Jest tests across services. We were human copy-paste machines.
Documentation as afterthought: Hard to update Confluence when production is on fire. Onboarding was a nightmare. Incident response was an archaeological dig.
Cursory code reviews: Everyone was stretched thin. Small inconsistencies slipped through and greeted us with PagerDuty alerts at 3 AM.

We couldn't hire 12 more engineers. We couldn't delete 8 microservices (though the thought crossed my mind during certain 3 AM calls). We needed a force multiplier.

An AI-augmented workflow

We didn't throw a chatbot at the problem. We integrated Claude into the development lifecycle with a specific goal: eliminate the drudgery and free our brains for the complex architectural problems that actually need human judgment.

Universal intern: context retrieval

The first step was feeding Claude our entire repository, along with architectural decision records, API docs, and sanitized Slack conversations from #oms-dev. Now, instead of interrupting the one engineer who remembers why a specific gRPC endpoint was designed that way six years ago, we just ask.

Typical query:

"What's the full data flow for a return request, from returns-api to inventory-service? Include key database tables and event topics."

Response:

returns-api receives POST to /returns.

Validates payload via CreateReturnDto.

ReturnsService creates entry in returns table with PENDING status.

Publishes return.created event to oms-events Kafka topic.

logistics-service consumes event, arranges pickup, updates to IN_TRANSIT.

inventory-service consumes return.received event, updates product_inventory, triggers refund.

Key files: returns.service.ts in returns-api and InventoryEventHandler.ts in inventory-service.
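To make that flow easier to picture, here is a minimal in-memory sketch of it. Treat it as an illustration under assumptions, not the real services: InMemoryTopic stands in for the oms-events Kafka topic, and every type, function, and trace label is hypothetical. The response above doesn't say which service publishes return.received, so the sketch publishes it by hand.

```typescript
// Illustrative model of the return flow; every name here is hypothetical.
type ReturnStatus = 'PENDING' | 'IN_TRANSIT';

interface ReturnEvent {
  type: 'return.created' | 'return.received';
  returnId: string;
}

type Handler = (event: ReturnEvent) => void;

// In-memory stand-in for the oms-events Kafka topic.
class InMemoryTopic {
  private readonly handlers: Handler[] = [];

  subscribe(handler: Handler): void {
    this.handlers.push(handler);
  }

  publish(event: ReturnEvent): void {
    for (const handler of this.handlers) handler(event);
  }
}

const omsEvents = new InMemoryTopic();
const returnsTable = new Map<string, ReturnStatus>();
const trace: string[] = []; // records each step so the flow can be inspected

function setStatus(returnId: string, status: ReturnStatus): void {
  returnsTable.set(returnId, status);
  trace.push(`${returnId}:${status}`);
}

// returns-api: store the return as PENDING, then announce it.
function createReturn(returnId: string): void {
  setStatus(returnId, 'PENDING');
  omsEvents.publish({ type: 'return.created', returnId });
}

// logistics-service: react to return.created (pickup arrangement omitted).
omsEvents.subscribe((event) => {
  if (event.type === 'return.created') setStatus(event.returnId, 'IN_TRANSIT');
});

// inventory-service: react to return.received by restocking and refunding.
omsEvents.subscribe((event) => {
  if (event.type === 'return.received') {
    trace.push(`${event.returnId}:restocked`, `${event.returnId}:refund-triggered`);
  }
});

createReturn('r-1');
// Assumption: something outside this sketch publishes return.received.
omsEvents.publish({ type: 'return.received', returnId: 'r-1' });
console.log(trace.join(' -> '));
```

The trace array exists only so the order of steps is visible; a real consumer would write to the database and publish follow-up events instead.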

Onboarding went from a 3-week headache to a 3-day guided tour.

Boilerplate buster: code generation

This is where we saw the biggest productivity gains. We created standardized prompts for common tasks in our NestJS/TypeScript environment.
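One way to keep prompts like these standardized is to build them from a small field spec instead of retyping them each time. The following is a rough sketch of that idea, not the exact tooling described here: FieldSpec and buildDtoPrompt are hypothetical names, and the wording of the template is an assumption.

```typescript
// Hypothetical prompt template; names and wording are illustrative only.
interface FieldSpec {
  name: string;
  description: string; // free-text rules, e.g. "string, UUID format, required"
}

// Builds a one-line instruction followed by one bullet per field.
function buildDtoPrompt(dtoName: string, fields: FieldSpec[]): string {
  const header = `Create a NestJS DTO for '${dtoName}' using class-validator.`;
  const bullets = fields.map((field) => `- '${field.name}': ${field.description}.`);
  return [header, ...bullets].join('\n');
}

const dtoPrompt = buildDtoPrompt('UpdateShipmentStatus', [
  { name: 'shipmentId', description: 'string, UUID format, required' },
  { name: 'status', description: "enum of 'IN_TRANSIT', 'DELIVERED', 'DELAYED', required" },
  { name: 'notes', description: 'string, optional, max 500 characters' },
  { name: 'eventTimestamp', description: 'ISO8601 date string, required' },
]);

console.log(dtoPrompt);
```

Keeping the field rules as data means the same spec could later drive other prompts (tests, docs) without drifting apart.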

Prompt:

Create a NestJS DTO for 'UpdateShipmentStatus' using class-validator.
- 'shipmentId': string, UUID format, required.
- 'status': enum of 'IN_TRANSIT', 'DELIVERED', 'DELAYED', required.
- 'notes': string, optional, max 500 characters.
- 'eventTimestamp': ISO8601 date string, required.

Generated code:

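import { IsString, IsUUID, IsEnum, IsOptional, MaxLength, IsISO8601 } from 'class-validator';

// Allowed shipment states, matching the values listed in the prompt.
export enum ShipmentStatus {
  IN_TRANSIT = 'IN_TRANSIT',
  DELIVERED = 'DELIVERED',
  DELAYED = 'DELAYED',
}

export class UpdateShipmentStatusDto {
  // Required; must be a string in UUID format.
  @IsUUID()
  @IsString()
  readonly shipmentId: string;

  // Required; must be one of the ShipmentStatus values.
  @IsEnum(ShipmentStatus)
  readonly status: ShipmentStatus;

  // Optional; validated only when present, capped at 500 characters.
  @IsOptional()
  @IsString()
  @MaxLength(500)
  readonly notes?: string;

  // Required; must be an ISO 8601 date string.
  @IsISO8601()
  readonly eventTimestamp: string;
}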

We extended this to generate entire modules, services with CRUD operations, and Jest test skeletons. The time saved from boilerplate goes to solving actual business problems.
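For readers who don't live in class-validator, here is a dependency-free sketch of roughly what the DTO's rules check at runtime. It is a hand-rolled stand-in, not how class-validator is implemented: validateUpdateShipmentStatus, the regular expressions, and the error messages are all assumptions made for illustration, and the real decorators are more thorough (for example, about which ISO 8601 variants they accept).

```typescript
// Hand-rolled approximation of the DTO rules; not class-validator itself.
const SHIPMENT_STATUSES = ['IN_TRANSIT', 'DELIVERED', 'DELAYED'] as const;

// Simplified patterns, chosen for this sketch only.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const ISO_PATTERN = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

// Returns a list of human-readable errors; an empty list means valid.
function validateUpdateShipmentStatus(input: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const { shipmentId, status, notes, eventTimestamp } = input;

  if (typeof shipmentId !== 'string' || !UUID_PATTERN.test(shipmentId)) {
    errors.push('shipmentId must be a UUID string');
  }

  if (typeof status !== 'string' || !SHIPMENT_STATUSES.some((s) => s === status)) {
    errors.push('status must be IN_TRANSIT, DELIVERED, or DELAYED');
  }

  // Optional field: null and undefined are skipped, anything else is checked.
  if (notes !== undefined && notes !== null) {
    if (typeof notes !== 'string' || notes.length > 500) {
      errors.push('notes must be a string of at most 500 characters');
    }
  }

  if (
    typeof eventTimestamp !== 'string' ||
    !ISO_PATTERN.test(eventTimestamp) ||
    Number.isNaN(Date.parse(eventTimestamp))
  ) {
    errors.push('eventTimestamp must be an ISO 8601 date string');
  }

  return errors;
}

const validErrors = validateUpdateShipmentStatus({
  shipmentId: '3f2b8c1e-9d4a-4c6b-8e7f-1a2b3c4d5e6f',
  status: 'DELIVERED',
  eventTimestamp: '2024-05-01T12:30:00Z',
});

const invalidErrors = validateUpdateShipmentStatus({
  shipmentId: 'abc',
  status: 'LOST',
  notes: 'x'.repeat(501),
  eventTimestamp: 'yesterday',
});

console.log(validErrors.length, invalidErrors.length);
```

In a NestJS service the decorators plus a validation pipe do this work for you; the point of the sketch is only to show what "required", "optional", and "max 500 characters" translate to.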

Quality guardian: automated first-pass reviews

We integrated Claude into CI/CD via a custom GitHub Action. On every PR, it performs a first-pass review. It doesn't replace human oversight of logic and architecture; it catches the common stuff that slows us down.

Example PR comment from claude-bot:

Suggestion (Performance): In ProductService.ts line 84, this Array.prototype.find inside a loop is O(n²). Convert productsToFind to a Map before the loop for O(1) lookup.

Suggestion (Clarity): Variable data on line 112 is vague. customerOrderHistory would improve readability.
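The performance suggestion is worth spelling out, since it's the kind of fix that's easy to wave through. Below is a small before-and-after sketch. The Product shape and both function names are hypothetical, and the real ProductService.ts code isn't shown here; only the technique (index once with a Map, then look up) comes from the review comment.

```typescript
// Hypothetical shapes; the real ProductService code is not shown in this post.
interface Product {
  id: string;
  name: string;
}

// Before: Array.prototype.find rescans productsToFind for every id,
// so the work grows with ids.length * productsToFind.length.
function lookupWithFind(ids: string[], productsToFind: Product[]): Product[] {
  const found: Product[] = [];
  for (const id of ids) {
    const match = productsToFind.find((product) => product.id === id);
    if (match) found.push(match);
  }
  return found;
}

// After: index the products once, then each lookup is O(1) on average.
function lookupWithMap(ids: string[], productsToFind: Product[]): Product[] {
  const productsById = new Map<string, Product>();
  for (const product of productsToFind) {
    // Keep the first product per id so results match find() when ids repeat.
    if (!productsById.has(product.id)) productsById.set(product.id, product);
  }

  const found: Product[] = [];
  for (const id of ids) {
    const match = productsById.get(id);
    if (match) found.push(match);
  }
  return found;
}

const catalog: Product[] = [
  { id: 'a', name: 'Anvil' },
  { id: 'b', name: 'Bucket' },
  { id: 'c', name: 'Crate' },
];

const viaFind = lookupWithFind(['b', 'x', 'a'], catalog);
const viaMap = lookupWithMap(['b', 'x', 'a'], catalog);
console.log(viaFind.map((p) => p.name), viaMap.map((p) => p.name));
```

For a handful of items the difference is invisible; it starts to matter when both lists grow, which is exactly when nobody is looking at that loop anymore.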

The results

Did we become a 16-person team? No. Headcount is still four. But we started shipping like a team three times our size.

Cycle time from ticket to production: cut in half.
Production bugs: down by more than 60%. The AI catches simple mistakes, so humans can focus on complex logic.
Developer morale: Engineers spend less time on toil and more on creative problem-solving. The Sunday dread is still there, but it's lighter.

Your next hire might be an API call

AI won't fix bad architecture or a toxic culture. It's a tool, and its effectiveness depends on who's using it.

But if you treat it as a tireless pair programmer, it can free up your most valuable resource: your team's collective brainpower. It lets your best people do their best work.

Stop bailing water. Start building a better boat.