Comprehension Debt: The Silent Killer in the Age of AI-Generated Code

I Merged 1,200 Lines I Didn't Understand

A 1,200-line pull request. AI-generated. Clean interfaces, dependency injection, a full suite of unit tests glowing green. CI passed. I approved it.

A week later, it brought down our staging environment during a client demo. While the team scrambled, I just sat there. I knew what broke. I could not explain why. I had reviewed the code but never actually understood it.

That was my introduction to Comprehension Debt.

We're producing more code. We understand less of it.

Andrej Karpathy pointed out that we're moving toward a world where AI writes the majority of our code. Teams on the cutting edge already merge dozens of AI-authored PRs a day. PR sizes and merge volumes have jumped over 100%, according to Faros AI and DORA metrics.

Sounds great until you look at what happened to review times. They nearly doubled. We sped up the typing, but the thinking got slower. The time savings from generation are being eaten by coordination overhead, context switching, and the raw volume of changes nobody fully grasps.

This is not regular technical debt. Technical debt is a conscious tradeoff: we know we're cutting a corner and plan to fix it later. Comprehension Debt is different. It accumulates silently. A developer merges a "plausible" AI-generated solution that passes tests and ships. The feedback loop feels positive: green checks, feature delivered. But the codebase is gradually becoming a house where nobody understands the plumbing.

The bugs are different now

Old bugs were usually localized. A null pointer, an off-by-one error, a typo in a variable name. Frustrating but diagnosable. The failures that come from Comprehension Debt are structural. They are rooted in an assumption the AI made in the first few milliseconds of generation.

Here's a concrete example. You ask an AI to build a checkout system:

Your prompt: "Create a robust service in TypeScript to handle user discounts during checkout."

The AI makes one assumption: a "discount" is always a percentage. It generates this:

// cart.ts
export interface CartItem {
  productId: string;
  price: number;
  quantity: number;
}
 
export interface Cart {
  items: CartItem[];
  getTotal: () => number;
}
 
// discount.strategy.ts
export interface IDiscountStrategy {
  applyDiscount: (total: number) => number;
}
 
export class PercentageDiscountStrategy implements IDiscountStrategy {
  private percentage: number;
 
  constructor(percentage: number) {
    if (percentage < 0 || percentage > 100) {
      throw new Error("Percentage must be between 0 and 100");
    }
    this.percentage = percentage;
  }
 
  applyDiscount(total: number): number {
    return total * (1 - this.percentage / 100);
  }
}
 
// discount.service.ts
import { Cart } from './cart';
import { IDiscountStrategy } from './discount.strategy';
 
export class DiscountService {
  private strategy: IDiscountStrategy;
 
  constructor(strategy: IDiscountStrategy) {
    this.strategy = strategy;
  }
 
  calculateFinalPrice(cart: Cart): number {
    const total = cart.getTotal();
    return this.strategy.applyDiscount(total);
  }
}

Clean, testable, follows the Strategy pattern. You approve it. A week later, marketing asks for a "$10 off your first order" coupon.

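Now the entire foundation cracks. Because the AI assumed percentages, it modeled a discount as a pure function of the cart total: applyDiscount receives one number and nothing else. A percentage fits that shape. "$10 off your first order" does not, because the strategy never sees the cart or the customer and has no way to check eligibility. Supporting the coupon means changing the IDiscountStrategy signature, the concrete class that implements it, the service layer that calls it, and the tests written against all three. The AI's single assumption propagated through the whole structure, and the bill is a full refactor.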

How to manage this

If AI writes most of the code, our job changes. We stop being typists and start being architects. The value isn't in the volume of code we produce. It's in how well we understand the systems we're responsible for.

Three practices that have worked for me:

1. Write the architecture yourself

Don't give the AI a vague instruction and hope for the best. Define the contracts, the constraints, the boundaries. Use the AI to fill in the implementation details.

Compare the original vague prompt to something like this:

Better prompt: "Design a discount system in TypeScript. Create an interface 'IDiscountStrategy' with a method 'apply(cart: Cart): number' that returns the final price. Implement two strategies: 'PercentageDiscountStrategy' (takes a percentage) and 'FixedAmountDiscountStrategy' (takes a currency amount). Create a 'DiscountService' that accepts any IDiscountStrategy."

You've defined the architecture. The AI is filling in blanks. Because the contract names both discount types and hands each strategy the whole cart, there is far less room for a silent assumption about what a discount is.
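As a rough sketch, the contract in that prompt might come out looking something like the code below. The Cart types mirror the earlier cart.ts. The makeCart helper, the validation of the fixed amount, and the choice to clamp the final price at zero are my assumptions, not something the prompt specifies. Prices are plain numbers, as in the original example.

```typescript
// Sketch of the contract described in the prompt. Cart mirrors cart.ts above.
export interface CartItem {
  productId: string;
  price: number;
  quantity: number;
}

export interface Cart {
  items: CartItem[];
  getTotal: () => number;
}

export interface IDiscountStrategy {
  // Receives the whole cart and returns the final price.
  apply(cart: Cart): number;
}

export class PercentageDiscountStrategy implements IDiscountStrategy {
  private readonly percentage: number;

  constructor(percentage: number) {
    if (percentage < 0 || percentage > 100) {
      throw new Error("Percentage must be between 0 and 100");
    }
    this.percentage = percentage;
  }

  apply(cart: Cart): number {
    return cart.getTotal() * (1 - this.percentage / 100);
  }
}

export class FixedAmountDiscountStrategy implements IDiscountStrategy {
  private readonly amount: number;

  constructor(amount: number) {
    // Assumption: a negative discount is treated as a programming error.
    if (amount < 0) {
      throw new Error("Amount must not be negative");
    }
    this.amount = amount;
  }

  apply(cart: Cart): number {
    // Assumption: a coupon larger than the cart makes it free, never negative.
    return Math.max(0, cart.getTotal() - this.amount);
  }
}

export class DiscountService {
  private readonly strategy: IDiscountStrategy;

  constructor(strategy: IDiscountStrategy) {
    this.strategy = strategy;
  }

  calculateFinalPrice(cart: Cart): number {
    return this.strategy.apply(cart);
  }
}

// Hypothetical helper for building a cart in examples and tests.
export function makeCart(items: CartItem[]): Cart {
  return {
    items,
    getTotal: () => items.reduce((sum, item) => sum + item.price * item.quantity, 0),
  };
}

const demoCart = makeCart([{ productId: "classic-tee", price: 75, quantity: 1 }]);
const tenOff = new DiscountService(new FixedAmountDiscountStrategy(10));
console.log(tenOff.calculateFinalPrice(demoCart)); // 65
```

Because each strategy receives the cart rather than a bare total, a later rule that needs item-level detail can be added as one more class without touching the interface.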

2. Make developers explain AI code as if they wrote it

The old code review approach doesn't work here. Scanning for obvious bugs is not enough when the AI also wrote the tests: the code and its checks share the same assumptions, creating a closed loop of self-validation.

I added a rule to our PR process: the owner of a PR must be able to trace a request from entry to exit and explain why the code is structured the way it is. We added a required section to the PR template called "Architectural Narrative," where the author explains the reasoning behind the design. If you can't explain it, the PR is not ready. Green checks are irrelevant.

This forces the human to actually internalize the AI's output before it enters the codebase.
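A rule like this can be backed by a small automated gate. The sketch below is one possible shape, not our actual tooling: it assumes the PR description is Markdown, that the section sits under a heading named "Architectural Narrative", and that 40 words is a reasonable floor. The function names and the threshold are hypothetical. A word count cannot tell whether the author understands the code. It only stops the section from being left empty, and the real check is still the conversation in review.

```typescript
// Hypothetical CI helper: counts the words under the "Architectural Narrative"
// heading of a Markdown PR description.
export function narrativeWordCount(
  prBody: string,
  heading: string = "Architectural Narrative",
): number {
  const lines = prBody.split(/\r?\n/);
  const isHeading = (line: string): boolean => /^#{1,6}\s+/.test(line);
  const headingText = (line: string): string =>
    line.replace(/^#{1,6}\s+/, "").trim().toLowerCase();

  const start = lines.findIndex(
    (line) => isHeading(line) && headingText(line) === heading.toLowerCase(),
  );
  if (start === -1) {
    return 0; // Section is missing entirely.
  }

  // The section runs until the next heading or the end of the description.
  const rest = lines.slice(start + 1);
  const end = rest.findIndex(isHeading);
  const section = (end === -1 ? rest : rest.slice(0, end)).join(" ");

  // Ignore HTML comments, which templates often use for placeholder hints.
  const text = section.replace(/<!--[\s\S]*?-->/g, " ").trim();
  return text === "" ? 0 : text.split(/\s+/).length;
}

// Assumption: 40 words is an arbitrary floor; tune it to your team.
export function hasArchitecturalNarrative(prBody: string, minWords: number = 40): boolean {
  return narrativeWordCount(prBody) >= minWords;
}
```

In a pipeline, the PR description would be passed to hasArchitecturalNarrative and the job would fail when it returns false.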

3. Test concepts, not just implementations

AI tests confirm AI logic. That's a circular reference. Unit tests still matter, but they shouldn't be your only defense.

Integration and end-to-end tests validate what the user actually experiences. They don't care if the code used a Strategy pattern, a function, or a chain of decorators. They care about outcomes.

// tests/checkout.spec.ts
import { test, expect } from '@playwright/test';
 
test.describe('Checkout Flow', () => {
  test('should apply a fixed amount discount correctly', async ({ page }) => {
    await page.goto('/products/classic-tee');
    await page.getByRole('button', { name: 'Add to Cart' }).click();
 
    await page.goto('/checkout');
    await page.getByLabel('Discount Code').fill('WELCOME10');
    await page.getByRole('button', { name: 'Apply' }).click();
 
    // Web-first assertions retry until the page settles, so the prices are
    // not read before the discount has been applied.
    await expect(page.locator('.original-price')).toContainText('$75.00');
    await expect(page.locator('.discount-applied')).toHaveText('-$10.00');
    await expect(page.locator('.final-price')).toContainText('$65.00');
  });
});

This test would have caught the flaw in our first example immediately. A "$10 off" coupon on a $75 item should give $65, not some percentage-based result. The test is agnostic to implementation. It only validates the business rule.
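The same idea works below the browser, where tests are cheaper to run. Here is a minimal sketch: the business rules live in a table that knows nothing about strategies or services, and any implementation is checked against it through a plain function. Only the first rule comes from the example above. The other two rules, the Checkout signature, and both stand-in implementations are assumptions for illustration.

```typescript
// Anything that turns a cart total and a coupon code into a final price.
// The rules below never see how the price is computed.
type Checkout = (cartTotal: number, couponCode: string) => number;

interface BusinessRule {
  name: string;
  cartTotal: number;
  couponCode: string;
  expected: number;
}

export const discountRules: BusinessRule[] = [
  { name: "$10 off a $75 cart", cartTotal: 75, couponCode: "WELCOME10", expected: 65 },
  // Assumed rules, added for illustration:
  { name: "coupon never makes the cart negative", cartTotal: 5, couponCode: "WELCOME10", expected: 0 },
  { name: "unknown code leaves the price unchanged", cartTotal: 75, couponCode: "NOPE", expected: 75 },
];

// Returns one message per broken rule; an empty array means all rules hold.
export function findViolations(checkout: Checkout, rules: BusinessRule[]): string[] {
  const violations: string[] = [];
  for (const rule of rules) {
    const actual = checkout(rule.cartTotal, rule.couponCode);
    // Compare to within half a cent to tolerate floating-point noise.
    if (Math.abs(actual - rule.expected) > 0.005) {
      violations.push(`${rule.name}: expected ${rule.expected}, got ${actual}`);
    }
  }
  return violations;
}

// Stand-in for the first generated design: every coupon is a percentage.
export const percentageOnlyCheckout: Checkout = (total, code) =>
  code === "WELCOME10" ? total * (1 - 10 / 100) : total;

// Stand-in for a design that treats WELCOME10 as a fixed amount.
export const fixedAmountCheckout: Checkout = (total, code) =>
  code === "WELCOME10" ? Math.max(0, total - 10) : total;

console.log(findViolations(percentageOnlyCheckout, discountRules)); // two violations
console.log(findViolations(fixedAmountCheckout, discountRules)); // []
```

The rule table is something a human can write and read without looking at the generated code, which is the point: it breaks the loop where the AI's tests only confirm the AI's logic.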

The job changed. The title didn't.

Nobody is writing a eulogy for developers. But the job is shifting. The repetitive, boilerplate parts of coding are disappearing. What's left is harder: system design, critical thinking, understanding the problem before touching the keyboard.

I've started thinking of myself less as "someone who writes code" and more as "someone who is responsible for software systems." The distinction matters. Writing code is a task. Being responsible means you can explain every decision, trace every failure, and defend every tradeoff.

The next time an AI hands you a thousand lines of passing code, don't just check the CI output. Ask yourself one question: Do I actually understand this?

Your codebase depends on an honest answer.