Every engineering leader I talk to is trying to transform. They want their teams to be AI-native. They're rolling out tools, tracking adoption, watching dashboards. But the harder questions are the ones most leaders aren't asking out loud. What are we actually trying to change? What does success look like? How should we be thinking about this?
Justin Reock, CTO at DX, spends his days talking to engineering leaders about exactly these questions. DX sits on top of telemetry and survey data from hundreds of thousands of engineers across hundreds of companies, including a longitudinal study of 500 companies tracking PR velocity from November 2024 through February 2025. I sat down with him for the latest episode of We Built What. The data tells a clear story: AI is exposing a systems problem that was always there. The companies winning right now are the ones who see it.
The data says calm down. It also says you've been looking at the wrong thing.
Here's the headline number from DX's study: a 7.5% median uplift in PR velocity. The average was 13%. The top performer hit 70%.
"Who wouldn't invest in the 10 or 15 percent uplift in overall productivity?" Justin asked. "I think we just need to start allowing ourselves the grace that 10 or 15 percent is actually successful."
That's a recalibration most leaders need to hear, but it's not the most important thing the data is telling us.
Atlassian's State of DevEx research has consistently shown that engineers only spend about 16% of their time actually writing code. Justin made the implication clear: "Even if you had a tool that was 100% accurate, required no rewrite, no refactor or review, and was instantaneous, you're still only attacking 16% of the problem."
If you point AI at that 16% and leave the other 84% untouched, single-digit productivity gains are exactly what you should expect.
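The ceiling here is just Amdahl's law applied to an engineer's week: the overall speedup is bounded by the fraction of work you actually accelerate. A minimal sketch of the arithmetic, using the 16% coding share cited above (the specific tool speedups are illustrative assumptions):

```python
# Amdahl's-law-style ceiling: if coding is only 16% of an engineer's time,
# even a perfect code-generation tool caps the overall productivity gain.

def overall_speedup(fraction_accelerated: float, tool_speedup: float) -> float:
    """Overall speedup when only a fraction of the work is accelerated."""
    return 1.0 / ((1.0 - fraction_accelerated) + fraction_accelerated / tool_speedup)

coding_share = 0.16  # Atlassian's State of DevEx figure cited above

# A tool that makes coding itself 2x faster moves the overall needle ~8.7%.
print(overall_speedup(coding_share, 2.0))

# An instantaneous, perfect tool still tops out around a 19% overall gain,
# because the other 84% of the work is untouched.
print(overall_speedup(coding_share, 1e9))
```

Single-digit to low-teens overall gains from coding tools alone aren't a failure of the tools; they're the mathematical ceiling of attacking only the 16%.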
The systems lesson is older than DevEx
This isn't a new finding. The coding war games of the 1970s found something that has held up for fifty years. Across different organizations, top performers produced at 11 times the rate of the bottom performers. Within the same organization, the spread between individuals was only about 20%.
W. Edwards Deming, decades before that, articulated it even more cleanly: 90 to 95% of an organization's productivity output is determined by the system, not the worker.
Google has rediscovered the same thing more recently. When they studied 180 teams to figure out what made the best ones work, the composition of the team didn't matter. The norms did. The same individual could thrive on one team and struggle on another. The strongest predictor of performance is the system, the manager, and the team they're embedded in.
AI doesn't change this. It amplifies it. Drop a powerful AI tool into a system designed for slow, sequential, gate-heavy work, and you'll get a marginal improvement on a fundamentally constrained pipeline. Drop the same tool into a system designed for flow (modular code, fresh docs, fast CI, psychological safety to experiment) and you'll get the 70% uplift.
What "the system" actually means
Justin laid out what the highest-performing companies in DX's data have in common:
- Code modularity and accessible, current documentation. Stale docs and tangled code aren't just engineer pain points. They're inference inputs.
- Fast CI/CD pipelines. If your build takes 40 minutes, your agent's feedback loop takes 40 minutes too.
- Education and time to absorb it. DX's data showed something counterintuitive. Light AI adoption actually decreases productivity. Only moderate-to-heavy adoption outperforms non-adoption. The learning curve is real, and it requires time, safe-to-fail projects, and psychological safety to experiment.
- Agent orchestration and good inference pipeline design. Justin's example: have a higher-temperature model generate a quality rubric, then have a lower-temperature reasoning model validate against it.
"What's good for humans is also good for agents. All this stuff that we've been saying organizations need to do for a good developer experience over the last decade, they're finally starting to care about this because they're burning so much money in tokens."
Knowing that the system drives outcomes is one thing. Knowing where in your system the friction actually lives is another.
Map the value stream, not the engineer
Before you can fix the system, you have to see it. Justin's recommendation is to do the slow, expensive work most teams skip: a real value stream map. Sit down with leaders. Trace value from ideation through to revenue or proven customer value.
Most teams don't do this because cycle times are easier to measure. PR approval to release. Ticket open to ticket close. Those numbers tell you about friction in the middle of the process. But the real bottlenecks usually live somewhere else: upstream in ideation and prioritization, or downstream in rollout, adoption, and monetization.
Justin invoked Eli Goldratt: an hour saved on something that isn't the bottleneck is worthless.
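Goldratt's point falls out of simple arithmetic: in a serial pipeline, throughput equals the rate of the slowest stage, so accelerating any other stage changes nothing. A toy illustration (the stage names and rates below are made up, not DX data):

```python
# Toy illustration of the theory of constraints: a serial pipeline's
# throughput is set by its slowest stage.

def throughput(stage_rates: dict[str, float]) -> float:
    """Items per hour through a serial pipeline = rate of the slowest stage."""
    return min(stage_rates.values())

pipeline = {"ideation": 8.0, "coding": 12.0, "review": 3.0, "release": 6.0}
print(throughput(pipeline))  # review is the bottleneck

# Double coding speed with AI: overall throughput does not move.
faster_coding = {**pipeline, "coding": 24.0}
print(throughput(faster_coding))

# Relieve the actual bottleneck instead, and throughput doubles.
faster_review = {**pipeline, "review": 6.0}
print(throughput(faster_review))
```

This is why a value stream map matters: it tells you which stage is the `review` in your organization before you spend on making `coding` faster.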
Agent experience is the new developer experience
The systems argument extends into the next era. Hybrid human-agent teams are already normal. The system has to work for both populations.
Justin shared something DX is already doing: they're starting to send their developer experience surveys to agents. After an agent completes a task, it gets the same survey a human would. They're calling it the Agent Experience Index.
"If developer experience is the leading indicator of developer productivity, then agent experience is going to be the leading indicator of agent productivity."
The clean docs, modular code, and fast feedback loops we've been building for our humans are exactly the inputs that determine how well our agents perform. The system either works for both, or it works for neither. Which means the work you've already done on developer experience isn't legacy investment. It's the foundation your agent strategy is going to run on.
The job of an engineering leader is still about the system
Justin put it directly: developer experience is much more about the system than the people. That shifts accountability back onto leadership. Your job isn't to extract more from your individual engineers. Your job is to design a system where engineers and agents can convert effort into outcomes.
"We tend to focus too much on the individual and not enough on the system. And I think that's something we should correct. If we really widen our aperture, we can find much more creative applications to really create that flow and that payoff that we want to see from these investments."
Trace your value stream. Find your real bottlenecks. Fix the docs, the build pipeline, the modularity, the learning culture. Build a system where humans and agents both thrive, because the same things make both of them productive.
Code generation alone won't make you AI-native. The system will.
Listen to the full episode of We Built What with Justin Reock, CTO at DX, on YouTube, Spotify, or Apple Podcasts. We talk about why the 7.5% median is good news, the trap of optimizing the wrong 16% of an engineer's day, and why Justin thinks there's never been a better time to be a software engineer.
Written by

Emma Webb
Member of Staff
