AI Coding Is Rotting Your Codebase

Q: Is AI coding bad for software teams?

AI coding is not bad by itself. It becomes dangerous when teams use it without clear specifications, automated tests, architectural ownership, and explicit review checkpoints.

Q: What is spec-driven development?

Spec-driven development means the team defines intent, acceptance criteria, states, edge cases, and tests before generating or writing implementation code.

Q: What should engineering leaders change first?

Move rigor upstream. Before asking AI to generate code, write clearer requirements, decision tables, state machines, non-goals, and tests that prove the behavior.

Q: What skills matter most for developers in the AI era?

System design, problem decomposition, edge-case prediction, test design, operational judgment, and the ability to express intent clearly matter more than syntax memorization.

AI will not destroy your engineering team because it writes bad code.

It will destroy your engineering team because it writes plausible code faster than your organization can understand, review, test, and own it.

That is the real danger. Not AI. Not code generation. Not agents. The danger is a company culture that treats velocity as the only truth. If a team already has weak requirements, shallow tests, unclear ownership, and rushed reviews, AI does not solve those problems. It gives them a jet engine.

At Vasilkoff, our position is simple: AI writes the code. Humans own the result. That sounds clean, but it has serious consequences. If humans own the result, humans must own the specification, the architecture, the tests, the operating model, and the incident response. You cannot outsource meaning to a probabilistic system and then act surprised when production behaves like nobody was responsible.

This post lays out 10 theses on how resilient IT companies need to operate in the AI era.

The new bottleneck is intent, not typing

For decades, software delivery was constrained by implementation speed. Developers had to translate ideas into syntax by hand. That created a natural limit on how fast complexity could enter a codebase.

AI removed part of that limit.

Now a small team can generate thousands of lines of code, migrations, tests, API clients, UI states, documentation, and infrastructure definitions in hours. That is powerful. It is also dangerous when the organization still manages engineering as if code volume were the scarce asset.

The bottleneck moved. The hard work is no longer "Can someone type this implementation?" The hard work is:

Do we know what the system must do?
Do we know what it must never do?
Do we know how every state changes?
Do we know what failure looks like?
Do we have tests that prove the important behavior?
Do we know who owns the result at 3:00 AM?

If you cannot answer those questions, AI will not save you. It will help you build the wrong thing faster.

1. The code is dispensable; the specification is the product

Generated code is cheap. Clear intent is expensive.

That is the philosophical shift most companies still resist. They still treat code as the primary artifact and requirements as the messy preface. In AI-assisted development, that model is backwards.

The real intellectual property of a modern software company is not raw syntax. It is the system definition:

product requirements that remove ambiguity,
state machines that describe valid transitions,
decision tables that capture business rules,
acceptance criteria that can be verified,
test suites that prove the behavior,
operational runbooks that explain what to do when reality breaks.

If those artifacts are strong, the implementation can be regenerated, rewritten, ported to another language, or refactored with confidence. If those artifacts are weak, even beautiful code becomes fragile because nobody can tell whether it still means the right thing.

This is why we treat specification as product infrastructure. A spec is not paperwork. It is the instruction layer that lets humans and AI cooperate without guessing.

2. Shift engineering rigor upstream

Traditional engineering often applies rigor after implementation. A developer writes code, opens a pull request, and reviewers try to detect architectural mistakes, missing edge cases, security concerns, and product misunderstandings after the implementation already exists.

That model becomes expensive with AI.

When AI generates a large change, reviewers are not evaluating a carefully hand-built argument. They are evaluating a huge amount of plausible output. The code may compile. The types may pass. The UI may look fine. The implementation may still be semantically wrong.

So the quality gate must move earlier.

Before generation, the team should define:

the exact user outcome,
the non-goals,
the inputs and outputs,
the valid states,
the invalid states,
the edge cases,
the migration risks,
the rollback plan,
the tests that must pass.

This is not bureaucracy. It is leverage. Every minute spent clarifying intent before generation prevents hours of review, rewrite, and production confusion after generation.
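One way to make this gate concrete is to treat the list above as data and refuse to start generation while any item is empty. The sketch below is illustrative only: `ChangeSpec`, its field names, `missing_fields`, and `ready_for_generation` are hypothetical names chosen for this example, not part of any existing tool. It assumes Python 3.9 or newer.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ChangeSpec:
    """Hypothetical pre-generation spec; the fields mirror the list above."""
    user_outcome: str = ""
    non_goals: list[str] = field(default_factory=list)
    inputs_outputs: str = ""
    valid_states: list[str] = field(default_factory=list)
    invalid_states: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)
    # If there is no migration, record that explicitly, e.g. ["none: no schema change"].
    migration_risks: list[str] = field(default_factory=list)
    rollback_plan: str = ""
    required_tests: list[str] = field(default_factory=list)


def missing_fields(spec: ChangeSpec) -> list[str]:
    """Return the names of spec items that are still empty."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


def ready_for_generation(spec: ChangeSpec) -> bool:
    """In this sketch, a change goes to the AI only when every item is filled in."""
    return not missing_fields(spec)
```

A check like this does not judge whether the spec is good. It only makes an unanswered question visible before code exists, which is the cheapest moment to answer it.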

3. Redefine developers as supervisors and architects

The old junior, mid-level, senior ladder is becoming unstable.

Junior developers can produce more output than ever because AI removes many syntax and boilerplate barriers. Senior developers remain valuable because they understand systems, tradeoffs, risk, ownership, and failure modes. The most difficult transition is often the middle layer: developers who became strong through implementation fluency but have not yet built deep architectural judgment.

Companies need to redesign engineering roles around supervision and architecture.

That does not mean juniors become useless. It means juniors need structured work, strong tests, and senior-defined boundaries so they can learn safely while producing value. It does not mean seniors should spend all day reviewing giant AI diffs. It means seniors must spend more time defining architecture, constraints, interfaces, invariants, and failure playbooks.

The role split becomes clearer:

Execution supervisors manage AI-assisted implementation inside narrow boundaries.
System architects define the boundaries, state models, test strategy, and integration contracts.
Human owners remain accountable for production behavior.

If your most expensive engineers become cleanup staff for unbounded generation, you are not accelerating engineering. You are burning senior judgment on preventable mess.

4. Fight vibe coding with spec-driven development

"Vibe coding" is fun until the bill arrives.

The pattern is familiar: prompt, generate, run, patch, prompt again, accept a workaround, ask for a quick fix, then repeat until the codebase contains five partial ideas that almost agree with each other.

This feels fast because the feedback loop is immediate. It feels productive because the screen changes. But without a governing spec, the system slowly loses coherence.

Spec-driven development is the antidote.

In spec-driven development (SDD), the AI does not receive a vague wish. It receives steering documents:

feature goals,
domain vocabulary,
interface contracts,
data models,
state transitions,
test expectations,
coding constraints,
review checklist.

The difference is not academic. A vague prompt asks the AI to invent the missing product thinking. A spec-driven prompt asks the AI to execute defined intent.

That is the right division of labor. Humans define meaning. AI accelerates implementation.

5. Build an institutional subconscious

AI reads what exists. It does not magically know what your senior engineer learned during an outage two years ago.

Every company has hidden operating knowledge:

why a database pool was tuned a certain way,
why a customer import job runs at a strange hour,
why a legacy field cannot be renamed,
why a retry policy exists,
why one provider fails under a specific traffic pattern,
why a "temporary" workaround is still in production.

If that knowledge lives only in people's heads, AI will default to generic advice. It may restart a crashing service when the real problem is a slow leak in the database connection pool. It may simplify a weird condition that exists because of a real customer edge case. It may remove "dead" code that is only used during a monthly reconciliation.

Resilient companies need an institutional subconscious: a living knowledge layer that captures not only what the system does, but why.

This can be built through:

incident writeups,
architecture decision records,
runbooks,
internal knowledge graphs,
domain glossaries,
postmortem-linked code comments,
searchable project memory.

The goal is not documentation for its own sake. The goal is to make tribal knowledge available to people and agents before they change production systems.

6. Deploy angry agents

Most AI assistants are designed to be helpful, polite, and compliant. That is useful for routine work. It is risky for high-stakes engineering.

In an outage, migration, security review, or architectural decision, a yes-man assistant can reinforce the wrong assumption. If a developer says, "I think the cache is the issue," a compliant assistant may help debug the cache while the real issue sits in the queue system.

That is why teams need adversarial AI roles: angry agents.

An angry agent is not rude for entertainment. It is deliberately prompted to challenge assumptions, search for contradictions, ask what evidence is missing, and propose failure modes the team has not considered.

Use angry agents to ask:

What assumption is weakest?
What would make this rollback fail?
Which test gives false confidence?
What dependency is treated as reliable but is not?
What user path is not represented in the spec?
What security boundary did we assume instead of prove?

In serious systems, agreement is cheap. Useful opposition is valuable.
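The questions above can be packaged as a reusable review prompt so that the challenge happens every time, not only when someone remembers. This is a minimal sketch: `build_adversarial_review_prompt` and the instruction wording are assumptions made for illustration, and the function only builds text. How the prompt is sent to a model is left out on purpose.

```python
ADVERSARIAL_QUESTIONS = [
    "What assumption is weakest?",
    "What would make this rollback fail?",
    "Which test gives false confidence?",
    "What dependency is treated as reliable but is not?",
    "What user path is not represented in the spec?",
    "What security boundary did we assume instead of prove?",
]


def build_adversarial_review_prompt(plan: str) -> str:
    """Wrap a technical plan in instructions that ask for opposition, not agreement."""
    questions = "\n".join(
        f"{number}. {question}"
        for number, question in enumerate(ADVERSARIAL_QUESTIONS, start=1)
    )
    return (
        "You are reviewing the plan below as a skeptical senior engineer.\n"
        "Do not agree by default. For each question, quote the part of the plan "
        "you are challenging and state what evidence is missing.\n\n"
        f"Questions:\n{questions}\n\n"
        f"Plan:\n{plan.strip()}\n"
    )
```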

7. Prevent code ownership evaporation

AI can create a strange psychological failure: developers merge code they do not really understand.

At first, this looks efficient. A feature ships. A bug is fixed. A migration completes. But over time, the team becomes a stranger to its own codebase.

Then production fails at 3:00 AM.

The team opens the code and realizes nobody truly knows why the system is shaped this way. The AI can explain what the code appears to do, but it cannot provide the original human intent if nobody captured it. The developer on call is now debugging a machine-generated decision with no memory attached.

Companies must prevent ownership evaporation with explicit checkpoints:

Every significant AI-generated change must include a human-readable design note.
Reviewers must understand the behavior, not only scan the diff.
AI should explain why it selected the design, and humans should accept or reject that explanation.
High-risk changes need architecture review before merge.
Runbooks must be updated when behavior changes.

If your team cannot explain a change without asking the model to re-derive it, you do not own that change yet.
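The first checkpoint can be partly automated. The sketch below checks that a change description contains a filled-in design note before merge. The section names in `REQUIRED_SECTIONS` and the "Name: text" layout are assumptions for this example. A team would substitute its own template.

```python
import re

# Hypothetical design-note template: each section is a line "Name: some text".
REQUIRED_SECTIONS = ["Intent", "Design rationale", "Risks", "Runbook impact"]


def missing_design_note_sections(description: str) -> list[str]:
    """Return required sections that are absent or have no text on their line."""
    missing = []
    for name in REQUIRED_SECTIONS:
        # Match "Name:" at the start of a line, followed by non-blank text on that line.
        pattern = rf"^{re.escape(name)}:[ \t]*\S"
        if not re.search(pattern, description, flags=re.MULTILINE | re.IGNORECASE):
            missing.append(name)
    return missing
```

A check like this proves only that the note exists. Whether the explanation is true, and whether the reviewer understands it, remains a human judgment.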

8. Hire for mental models, not syntax mastery

Syntax memory used to be a strong signal. It still matters, but it is no longer enough.

The premium skill now is the ability to think clearly about systems. A developer who can name edge cases, decompose ambiguous problems, design a test matrix, identify failure modes, and express constraints precisely will outperform a faster typist with weaker judgment.

Hiring should test for:

problem decomposition,
architecture tradeoffs,
domain modeling,
test design,
debugging reasoning,
incident thinking,
communication clarity,
ability to supervise AI output.

Ask candidates to write a spec before writing code. Ask them to review an AI-generated design. Ask them where the implementation will fail. Ask them what they would test first if production money were at risk.

The best AI-era engineer is not the person who never uses AI. It is the person who can govern AI output with a strong mental model.

9. Treat the test suite as the ultimate software guardrail

In AI-assisted development, tests are not a nice-to-have. They are the safety boundary that makes speed possible.

A strong test suite lets you regenerate code, refactor aggressively, migrate frameworks, and change architecture without relying on optimism. It turns "the model says it works" into "the system proves it still behaves correctly."

The most valuable tests are not only unit tests around isolated functions. They include:

business-rule tests,
integration tests for critical workflows,
state transition tests,
contract tests between services,
migration tests,
permission tests,
regression tests from real incidents.

The test suite becomes the executable memory of the organization. It captures what must remain true even when implementation changes.

This is especially important for legacy modernization. If you want to feed a legacy backend into agents and ask for a major migration, the test suite is the contract. Without it, you are asking the model to preserve behavior nobody has formally described.
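When legacy behavior has never been written down, one common way to build that contract is a characterization test: record what the current implementation returns for representative inputs, then require any rewrite to match. The sketch below is a toy illustration. `legacy_shipping_fee`, `rewritten_shipping_fee`, and the sample inputs are invented for this example.

```python
def legacy_shipping_fee(weight_kg, express):
    """Hypothetical legacy rule whose behavior must be preserved."""
    fee = 5.0 if weight_kg <= 10 else 5.0 + (weight_kg - 10) * 0.5
    if express:
        fee *= 2
    return round(fee, 2)


def rewritten_shipping_fee(weight_kg, express):
    """Candidate replacement, for example one produced by an agent."""
    base = 5.0 + max(0, weight_kg - 10) * 0.5
    return round(base * (2 if express else 1), 2)


# Representative inputs, including the boundary at exactly 10 kg.
SAMPLE_INPUTS = [(0, False), (10, False), (10, True), (10.5, False), (40, True)]


def record_behavior(func, inputs):
    """Capture the observed output for each input as the reference."""
    return {args: func(*args) for args in inputs}


def behavior_differences(old, new, inputs):
    """Return {input: (old_result, new_result)} for every input where they disagree."""
    golden = record_behavior(old, inputs)
    differences = {}
    for args in inputs:
        result = new(*args)
        if result != golden[args]:
            differences[args] = (golden[args], result)
    return differences
```

An empty result from `behavior_differences` means the rewrite agrees with the legacy code on the sampled inputs only. The value of the contract depends on how well those samples cover the behavior that matters.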

10. Retrain mid-level mindsets quickly

The hardest cultural transition may not be technical. It may be identity.

Junior developers often adapt quickly because they have less attachment to old workflows. Senior developers can adapt because their real value was never syntax. It was judgment.

Mid-level developers can get trapped. They are good enough to produce code independently, but not always experienced enough to define the system boundaries that AI needs. If their identity is built around implementation speed, AI feels like a threat instead of a tool.

Companies need to retrain this group deliberately.

The goal is to move them from "I write the code" to "I supervise the system outcome." That requires practice in:

writing acceptance criteria,
designing tests,
creating decision tables,
reviewing generated code,
explaining architecture,
running incident simulations,
using adversarial review,
documenting intent.

This is not optional. If mid-level engineers do not become supervisors and system thinkers, they will either become passive prompt operators or overloaded reviewers of work they did not shape.

What this means for CTOs and founders

If you run a software company, the temptation is obvious: use AI to ship more. That is not wrong. But "more" is not a strategy.

The better question is: What must become stricter so speed can increase safely?

Our answer:

stricter specs,
stricter tests,
stricter state models,
stricter ownership,
stricter review boundaries,
stricter incident memory,
stricter human accountability.

AI is a signal amplifier. If the signal is clear, it compounds productivity. If the signal is noise, it compounds chaos.

The winners will not be the teams that generate the most code. They will be the teams that preserve intent while generating code at speed.

A practical starting checklist

If your team wants to use AI without rotting the codebase, start here:

Pick one production workflow and write its state machine.
Convert the top business rules into decision tables.
Add tests for the money path, permission path, and failure path.
Write an architecture decision record for the next AI-assisted feature.
Require every large generated change to include a design note.
Add one adversarial review prompt to every major technical plan.
Create an incident memory folder and link real regressions to tests.
Train developers to write specs before prompts.
Protect senior engineers from giant review queues.
Measure ownership, not just velocity.

This is not anti-AI. It is the opposite. It is how AI becomes usable in serious software delivery.

What a responsible AI-assisted change looks like

The principles above become useful only when they change how work moves through the team. A responsible workflow does not need a hundred-page specification, but it does need explicit checkpoints.

Imagine a team adding subscription cancellation to a SaaS product. A weak AI workflow begins with: "Add a cancel subscription button." The agent generates a button, an endpoint, and a database update. The happy path works, so the change is merged.

But the real feature contains questions the prompt ignored:

Does cancellation happen immediately or at the end of the billing period?
What happens to prepaid credits?
Can an account owner cancel while an invoice is overdue?
Which roles have permission?
What must be sent to the payment provider?
Can the operation be retried safely?
What happens if the provider succeeds but the local database update fails?
How does support reverse an accidental cancellation?

A resilient team handles the same change in six stages.

1. Define the outcome

Write the user outcome, business rule, non-goals, and acceptance criteria in plain language. Name the owner who can resolve ambiguity.
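The business rules behind the outcome can be captured as a decision table small enough to review line by line. The sketch below shows one hypothetical policy: the roles, the overdue-invoice rule, and the end-of-period default are assumptions made for illustration, not rules from any real product.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    role: str              # who is asking to cancel
    invoice_overdue: bool  # whether an invoice is currently overdue
    decision: str          # what the system must do


# Hypothetical policy. Every combination of inputs is listed explicitly,
# so a missing business decision shows up as a missing row.
CANCELLATION_RULES = [
    Rule("owner", False, "schedule_at_period_end"),
    Rule("owner", True, "reject_until_invoice_paid"),
    Rule("admin", False, "schedule_at_period_end"),
    Rule("admin", True, "reject_until_invoice_paid"),
    Rule("member", False, "reject_not_permitted"),
    Rule("member", True, "reject_not_permitted"),
]


def decide_cancellation(role: str, invoice_overdue: bool) -> str:
    """Look up the decision; unknown inputs fail loudly instead of guessing."""
    for rule in CANCELLATION_RULES:
        if rule.role == role and rule.invoice_overdue == invoice_overdue:
            return rule.decision
    raise LookupError(
        f"no rule for role={role!r}, invoice_overdue={invoice_overdue}"
    )
```

Keeping the table as data means the named owner can review the rows without reading the implementation, and a test can assert that every role and invoice combination has exactly one row.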

2. Model the states

Describe the valid subscription states and transitions. For example: active -> cancellation_scheduled -> cancelled, with separate failure and recovery paths. If a state transition matters to money, permissions, or customer access, it should not live only inside generated code.
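A state model like this fits in a few lines of reviewable code. The sketch below encodes the active, cancellation_scheduled, and cancelled states from the example. The `CANCELLATION_FAILED` state and the specific recovery transitions are assumptions added to illustrate "separate failure and recovery paths"; a real product would define its own.

```python
from enum import Enum


class SubscriptionState(Enum):
    ACTIVE = "active"
    CANCELLATION_SCHEDULED = "cancellation_scheduled"
    CANCELLATION_FAILED = "cancellation_failed"  # assumed failure state
    CANCELLED = "cancelled"


S = SubscriptionState

# Allowed transitions. Anything not listed here is invalid by definition.
TRANSITIONS = {
    S.ACTIVE: {S.CANCELLATION_SCHEDULED, S.CANCELLATION_FAILED},
    # Back to ACTIVE models support reversing an accidental cancellation.
    S.CANCELLATION_SCHEDULED: {S.CANCELLED, S.ACTIVE},
    # After a failure, either recover to ACTIVE or retry the scheduling.
    S.CANCELLATION_FAILED: {S.ACTIVE, S.CANCELLATION_SCHEDULED},
    S.CANCELLED: set(),  # terminal in this sketch
}


class InvalidTransition(Exception):
    """Raised when code attempts a transition the state model does not allow."""


def transition(current: SubscriptionState, target: SubscriptionState) -> SubscriptionState:
    """Return the new state, or raise if the model does not allow the move."""
    if target not in TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target
```

Because the table lives outside the generated implementation, a reviewer can check it against the spec directly, and generated code that tries an unlisted transition fails immediately instead of silently corrupting state.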

3. Define proof before implementation

List the tests that will prove the feature works: successful cancellation, unauthorized cancellation, duplicate requests, provider timeout, partial failure, and rollback. This gives the agent a target more precise than "make it work."
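Several of these proofs can be written before any production code exists. The sketch below uses a tiny in-memory `CancellationService` and a `FakeProvider`, both hypothetical stand-ins invented for this example, to show the shape of tests for success, unauthorized access, duplicate requests, and a provider timeout followed by a retry. It assumes an owner-only permission rule and assumes that a timed-out provider call did not take effect, which a real integration would have to verify.

```python
class ProviderTimeout(Exception):
    pass


class FakeProvider:
    """Stand-in for a payment provider; records calls instead of sending them."""
    def __init__(self, fail_times=0):
        self.fail_times = fail_times  # how many initial calls should time out
        self.calls = []

    def cancel(self, subscription_id):
        self.calls.append(subscription_id)
        if len(self.calls) <= self.fail_times:
            raise ProviderTimeout(subscription_id)


class CancellationService:
    def __init__(self, provider):
        self.provider = provider
        self.state = {}  # subscription_id -> state name

    def cancel(self, subscription_id, role):
        if role != "owner":  # assumed permission rule
            raise PermissionError("only the owner may cancel")
        if self.state.get(subscription_id) == "cancellation_scheduled":
            return "cancellation_scheduled"  # duplicate: no second provider call
        try:
            self.provider.cancel(subscription_id)
        except ProviderTimeout:
            self.state[subscription_id] = "active"  # nothing changed locally
            raise
        self.state[subscription_id] = "cancellation_scheduled"
        return "cancellation_scheduled"


def test_successful_cancellation():
    service = CancellationService(FakeProvider())
    assert service.cancel("sub_1", "owner") == "cancellation_scheduled"


def test_unauthorized_cancellation():
    provider = FakeProvider()
    service = CancellationService(provider)
    try:
        service.cancel("sub_1", "member")
    except PermissionError:
        pass
    else:
        raise AssertionError("a member must not be able to cancel")
    assert provider.calls == []  # the provider was never contacted


def test_duplicate_request_is_idempotent():
    provider = FakeProvider()
    service = CancellationService(provider)
    service.cancel("sub_1", "owner")
    service.cancel("sub_1", "owner")
    assert provider.calls == ["sub_1"]  # exactly one provider call


def test_provider_timeout_then_retry():
    service = CancellationService(FakeProvider(fail_times=1))
    try:
        service.cancel("sub_1", "owner")
    except ProviderTimeout:
        pass
    else:
        raise AssertionError("the first call should time out")
    assert service.state["sub_1"] == "active"
    assert service.cancel("sub_1", "owner") == "cancellation_scheduled"
```

The tests are plain functions with `assert`, so they can be called directly or collected by a test runner. The harder partial-failure case, where the provider succeeds but the local update fails, needs a reconciliation rule from the spec before it can be tested.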

4. Generate inside boundaries

Give the agent the relevant interfaces, architecture constraints, coding conventions, and files it may change. Ask it to identify assumptions before writing code. Smaller, bounded changes are easier to review and easier to discard.

5. Review behavior, not code volume

The human reviewer should trace the state transitions, failure handling, permissions, observability, and test evidence. A large diff is not proof of substantial work. A passing happy-path demo is not proof of correctness.

6. Record what changed

Update the design note, runbook, and incident-relevant documentation. The next engineer or agent should be able to understand why the behavior exists without reconstructing the decision from a thousand-line diff.

The code may take minutes to generate. The value comes from the structure around it.

How to tell whether AI is creating leverage or debt

Lines of code, prompt counts, and agent task completion rates are weak measures. They show activity, not engineering health.

Watch for these indicators instead:

Change failure rate: Are more releases causing incidents, rollbacks, or urgent patches?
Review latency: Are pull requests growing faster than qualified reviewers can understand them?
Regression rate: Are previously fixed bugs returning because generated changes ignore system history?
Mean time to understand: How long does an engineer need before they can explain a generated component well enough to change it safely?
Test signal quality: Do tests catch real regressions, or do they merely confirm the implementation the agent just wrote?
Ownership coverage: Does every production area have a human who understands its purpose, risks, and recovery path?
Architectural drift: Are new features following established boundaries, or creating parallel abstractions for the same concept?
Deletion confidence: Can the team remove or replace generated code safely because behavior is specified and tested?

One practical warning sign is the ratio between generated output and reviewed intent. If code volume rises sharply while design notes, acceptance criteria, and meaningful tests remain flat, the organization is accumulating ambiguity faster than software.
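Two of these signals are simple arithmetic once the counts exist. The sketch below computes change failure rate and a rough "generated lines per intent artifact" ratio for two periods. `PeriodStats`, its fields, and the default `growth_threshold` of 2.0 are assumptions for illustration, not established benchmarks.

```python
from dataclasses import dataclass


@dataclass
class PeriodStats:
    """Hypothetical per-period counts a team might already collect."""
    releases: int
    failed_releases: int   # releases that caused an incident, rollback, or urgent patch
    generated_lines: int
    intent_artifacts: int  # design notes + acceptance criteria + meaningful tests added


def change_failure_rate(stats: PeriodStats) -> float:
    """Share of releases that failed; 0.0 when nothing was released."""
    if stats.releases == 0:
        return 0.0
    return stats.failed_releases / stats.releases


def lines_per_intent_artifact(stats: PeriodStats) -> float:
    """Generated code volume per unit of recorded intent."""
    if stats.intent_artifacts == 0:
        return float("inf")
    return stats.generated_lines / stats.intent_artifacts


def ambiguity_warning(previous: PeriodStats, current: PeriodStats,
                      growth_threshold: float = 2.0) -> bool:
    """Flag a period where code per intent artifact grew by the threshold factor or more."""
    if current.generated_lines == 0:
        return False  # nothing generated, nothing to flag
    before = lines_per_intent_artifact(previous)
    after = lines_per_intent_artifact(current)
    return after >= before * growth_threshold
```

For example, going from 4,000 generated lines to 12,000 while intent artifacts stay at 40 moves the ratio from 100 to 300 lines per artifact, which this sketch would flag. The numbers are a prompt for a conversation, not a verdict.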

The target is not maximum generation. The target is minimum time from clear intent to verified outcome.

Frequently asked questions

Is AI coding bad for software teams?

No. AI coding becomes dangerous when a team uses it to bypass product thinking, architecture, testing, and ownership. Inside clear boundaries, it can remove repetitive work and give experienced engineers more time for higher-value decisions.

What is spec-driven development?

Spec-driven development means defining intent before implementation: the desired outcome, acceptance criteria, states, edge cases, constraints, and proof of correctness. The specification can be concise, but it must remove the important ambiguity the AI would otherwise invent.

What should engineering leaders change first?

Start with one critical workflow. Document its states and business rules, add tests for its success and failure paths, and require the next AI-assisted change to work from those artifacts. A focused pilot reveals more than a company-wide policy nobody follows.

What skills matter most for developers in the AI era?

System design, problem decomposition, test design, debugging, edge-case prediction, operational judgment, and precise communication are becoming more valuable. Syntax still matters, but it is no longer the main constraint on output.

The bottom line

AI does not remove the need for engineering discipline. It punishes the absence of it faster.

The future software company is not a room full of people manually typing every line. It is also not a room full of prompt operators blindly accepting generated output.

It is a company where humans manage intent, machines accelerate execution, tests enforce truth, and senior engineers remain responsible for meaning.

That is the standard we are building toward at Vasilkoff: AI writes the code; humans own the result.

You can see that model in Vasilkoff.com's AI estimation platform, where automated reasoning and generation sit inside explicit workflows, guardrails, and human ownership.

If you need software delivery that combines AI speed with accountable engineering, explore our AI development services or talk to us about your project.

Last updated: June 14, 2026