If you look at the AI engineering stack today, it feels like we are buying a lot of very fast engines but forgetting to build the steering wheel.
There are coding assistants that help developers write and change code at incredible speeds. There are review tools that comment on pull requests. There are analytics platforms that measure activity and cycle time.
Each category is useful. But none of them, by itself, gives leadership a reliable operating model for AI software delivery.
The missing layer is the AI-native SDLC operating layer: the system that turns product intent into approved engineering context, distributes that context to agents, verifies output against the approved artifacts, and gives leadership a measurable view of outcomes.
Without that layer, AI adoption stays fragmented, and throughput stays flat.
The Stack Optimizes Pieces, Not the System
Most software organizations now have three kinds of tools in the AI engineering conversation: coding tools (like Cursor or Copilot), review tools, and analytics tools.
The issue isn't that these tools are weak. The issue is that they assume something important already exists: a clear, approved, shared understanding of what should be built, how it should be built, and how success will be verified.
In many organizations, that intent layer is fragile.
It lives in vague tickets, Slack threads, tribal knowledge, and the heads of senior engineers. It is rarely explicit enough for autonomous agents. It is rarely durable enough for audit. It is rarely structured enough to become review criteria.
That is the gap.
Coding Tools Cannot Own the Intent Layer
AI coding tools operate inside a session. They are incredibly powerful when the session has the right context. They are dangerous when the context is incomplete, stale, or ambiguous.
An engineer can prompt an agent with a ticket and a handful of files. But the agent may still miss the customer problem behind the request, a hidden non-goal, a relevant architecture decision record (ADR), or a test expectation from QA.
The agent may generate correct-looking code that solves the wrong problem.
That isn't a model problem. It is an operating model problem. The organization needs a layer that decides what context is authoritative before coding starts.
Review Tools Are Too Late to Be the Control Plane
AI review tools help at the PR stage. That is valuable, but the PR is the most expensive place to find out you misunderstood the ticket.
If a PR misses product intent, the team has already spent the implementation time. If it violates an architecture pattern, the wrong design is already in code. Either way, the fix is rework.
AI-native delivery needs checks earlier in the flow:
- Before coding: is the work clearly specified?
- Before implementation: is the technical approach approved?
- During coding: does the in-progress diff satisfy the approved artifacts?
The point isn't to eliminate PR review. The point is to stop using PR review as the first moment the organization discovers whether the work was understood.
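As a rough illustration, the three earlier checks could be modeled as stage gates that each return a list of blocking issues. This is a minimal sketch, not how any particular product implements it: `Stage`, `WorkItem`, `blocking_issues`, and every field name are hypothetical, and a real system would derive these values from its own spec, plan, and diff analysis.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    BEFORE_CODING = "before_coding"
    BEFORE_IMPLEMENTATION = "before_implementation"
    DURING_CODING = "during_coding"


@dataclass
class WorkItem:
    # Hypothetical fields; a real system would populate them from its own records.
    spec_approved: bool = False
    tech_plan_approved: bool = False
    unmet_requirements: tuple[str, ...] = ()  # requirement IDs the current diff misses


def blocking_issues(stage: Stage, item: WorkItem) -> list[str]:
    """Return the reasons work should not proceed past the given stage."""
    issues: list[str] = []
    # Every stage requires a clearly specified, approved spec.
    if not item.spec_approved:
        issues.append("spec is not approved")
    # From implementation onward, the technical approach must also be approved.
    if stage is not Stage.BEFORE_CODING and not item.tech_plan_approved:
        issues.append("tech plan is not approved")
    # During coding, the in-progress diff is compared against the approved artifacts.
    if stage is Stage.DURING_CODING:
        issues.extend(f"diff does not satisfy {req}" for req in item.unmet_requirements)
    return issues


item = WorkItem(spec_approved=True, unmet_requirements=("REQ-2",))
print(blocking_issues(Stage.BEFORE_CODING, item))          # []
print(blocking_issues(Stage.BEFORE_IMPLEMENTATION, item))  # ['tech plan is not approved']
```

Each gate is cumulative here, so a missing spec approval still blocks work that has already reached the coding stage. PR review would then confirm what these gates checked earlier, rather than being the first place a misunderstanding surfaces.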
Analytics Tools Need Better Source Data
Engineering analytics tools are great for observing motion. They show activity, cycle time, and bottlenecks.
But AI adoption makes a new question more important: Did the work deliver the intended outcome?
Activity is not enough. More commits are not enough. Faster PRs are not enough. Leaders need to know whether implementation satisfied approved intent, followed the technical plan, covered the test plan, respected architecture decisions, and avoided rework.
That requires a system that produces and tracks the intent layer, not just a system that measures downstream activity.
What the Missing Layer Must Do
An AI-native SDLC operating layer has five jobs:
- Define Intent: Turn vague work into structured, reviewable, approved intent (Specs).
- Plan Execution: Convert approved intent into an engineering plan (Tech Plans).
- Define Verification: Create a test plan with concrete scenarios tied back to the Spec.
- Distribute Context to Agents: Make approved artifacts available to coding agents through a standard interface, the Model Context Protocol (MCP). Only approved artifacts should become default execution context.
- Verify Output and Measure Outcomes: Check implementation against the artifacts approved before coding began. Surface requirement gaps and ADR drift, and measure the rework avoided.
This is how the enterprise moves from AI activity to AI delivery.
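The last two jobs can be sketched in a few lines. The example below assumes a simple artifact record with a status field and assumes that requirements carry stable IDs; `Artifact`, `execution_context`, `requirement_gaps`, and the status values are all hypothetical names chosen for illustration. It deliberately leaves out the transport: how the filtered context is actually served to agents is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    kind: str    # e.g. "spec", "tech_plan", "test_plan", "adr"
    status: str  # hypothetical status values: "draft" or "approved"
    body: str


def execution_context(artifacts: list[Artifact]) -> list[Artifact]:
    """Keep only approved artifacts; drafts never reach the agent by default."""
    return [a for a in artifacts if a.status == "approved"]


def requirement_gaps(spec_requirements: set[str], satisfied: set[str]) -> list[str]:
    """Requirement IDs from the approved Spec that the implementation has not satisfied."""
    return sorted(spec_requirements - satisfied)


artifacts = [
    Artifact("spec", "approved", "Customers can export invoices as CSV."),
    Artifact("tech_plan", "draft", "Add an export endpoint."),
]
print([a.kind for a in execution_context(artifacts)])  # ['spec']
print(requirement_gaps({"REQ-1", "REQ-2", "REQ-3"}, {"REQ-1", "REQ-3"}))  # ['REQ-2']
```

Filtering on approval status before anything is handed to an agent is what makes the approved artifacts authoritative. Comparing requirement IDs afterward turns "did we build the right thing" into a concrete, reportable gap list.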
CodeMerlin as the Operating Layer
At Amulent, we built CodeMerlin to provide that missing layer.
It sits between work intake and autonomous implementation. Jira and Linear provide work signals. CodeMerlin creates and manages approved artifacts. Agents consume approved context through MCP. GitHub and PR workflows become the verification surface.
The key is that CodeMerlin doesn't replace the stack. It coordinates it.
Product and technology organizations don't want another disconnected workflow. They want AI to work inside the tools they already use, with enterprise controls around it.
The winning layer in AI engineering isn't just another coding assistant. It is the system that defines intent, distributes it to agents, governs execution, verifies output, and proves the outcome. For technology-driven companies trying to move faster without losing control, that layer is becoming essential.