Everything Tracked, Nothing Measured

Picture the engineer who just demoed the platform to their exec team. It pulls event data from a dozen systems: every issue tracker, code repository, CI pipeline, Slack channel, Zoom call, AI coding session, and cloud environment in the company. Feed it a name and a date range, and it reconstructs that person's day to the minute: when they were in a meeting, when they were actively coding, when their computer sat idle. It can guess when they went to get lunch.

The executives love it. The engineer who built it now dreads what comes next.

This isn't a hypothetical. It's a pattern showing up across engineering orgs this year, and the dread is well-founded. AI coding tools genuinely increased individual output, in some orgs dramatically. That surge created a question nobody had a clean answer to: is this actually good? When your team goes from shipping one feature a week to three, the board wants to know if the improvement is real. When you can't evaluate quality at the leadership level, you reach for quantity. When quantity accelerates, you build dashboards to track it.

The impulse is understandable. The outcome is predictable.

The CPU Problem

The engineer who built that platform offered an analogy worth sitting with. 100% CPU usage doesn't tell you your computer is being useful. It might be rendering video. It might be stuck in a loop. CPU usage tells you the processor is doing something. It tells you nothing about whether that something matters.

An engineer submitting thirty commits a day, attending eight meetings, and logging AI prompts from morning to evening looks exceptional on an activity dashboard. They might be the most stuck person on your team, refactoring the same component in circles, reviewing PRs that should have been smaller, firefighting a production issue that proper architecture would have prevented. The dashboard captures the activity and captures nothing about whether the work moved the company forward.

What makes this dangerous right now is the timing. AI tools accelerated output faster than most teams developed practices to evaluate it. That gap, real gains that no one knows how to validate, is precisely what produces the visibility trap. Management reaches for data because something is clearly happening, and data feels like an answer. This is the deeper shift AI is forcing on every business: it's turning all of them into technology companies whether they planned for it or not, and the ones that don't build real judgment about their own engineering get run over by the ones that do.

What Gets Measured, What Gets Done

There's a principle every experienced engineer has watched play out: once a metric becomes a target, it stops measuring what you think it measures. The fastest validation of that principle comes from the engineers themselves. Within days of dashboards going live, the response is coordinated and immediate: pull requests with thousands of commits, activity spikes timed to the logging window, prompts fired at AI tools between tasks to keep the numbers moving. Metrics look strong. Product delivery is unchanged.

This isn't a failure of the engineers. It's the correct response to a changed evaluation system. Engineers are rational. When the criteria for "productive" shift from "shipped something customers value" to "generated enough activity for the dashboard," they optimize for the new criteria. The most senior people on your team, the ones you need thinking about hard problems, tend to be first to recognize what the new game is. They're also the least willing to play it. They update their LinkedIn profiles and start taking recruiter calls.

What you're left with is a team optimized for visibility and metrics that look strong in the board deck. The dashboard won't show what departed with your senior engineers, or the technical debt accumulating in corners where no one's watching. It also won't show the failure mode that's already surfacing at some companies: when a data connector breaks, the system doesn't log a gap. It logs an engineer who wasn't working. The AI summarizing that data doesn't caveat its conclusions. It presents an engineer as unproductive for days. Guilty until proven innocent, with an accuser that can't be cross-examined.
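
One way to avoid that failure mode is to record connector health alongside the activity itself, so that "no data" and "no activity" never collapse into the same value. The snippet below is a minimal sketch under that assumption, not a description of any particular platform: `summarize_days`, `DayReport`, and the two input dictionaries are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DayReport:
    day: date
    status: str  # "active", "idle", or "unknown"
    events: int


def summarize_days(event_counts, connector_ok, start, end):
    """Label each day, keeping 'no data' separate from 'no activity'.

    event_counts: dict mapping date -> number of events ingested (assumed shape)
    connector_ok: dict mapping date -> True if ingestion succeeded that day
    """
    reports = []
    day = start
    while day <= end:
        events = event_counts.get(day, 0)
        if not connector_ok.get(day, False):
            # Ingestion failed or was never confirmed: any count we hold is
            # partial at best, so we make no claim about the engineer.
            status = "unknown"
        elif events == 0:
            # The connector ran and saw nothing: a real zero.
            status = "idle"
        else:
            status = "active"
        reports.append(DayReport(day, status, events))
        day += timedelta(days=1)
    return reports
```

Anything downstream, including an AI summary, can then be required to report "unknown" days as gaps in the data rather than as days without work.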

What a CTO Actually Does Here

When an exec team falls in love with an activity dashboard, someone needs to be in the room asking the harder question: are we measuring what's easy to see, or what actually matters?

That question is nearly impossible to raise from inside the engineering org. An engineer who asks it sounds defensive. A manager who raises it looks like they're blocking accountability. Finance doesn't know to ask. Product doesn't have the context. And so the dashboard gets built, the C-suite circulates it in board prep, and the engineering organization quietly starts optimizing for the wrong thing.

This is where technical leadership earns its keep, not in architecture reviews, but in the room before a bad idea becomes infrastructure. We've been in enough of these conversations to recognize the pattern before it sets. The instinct behind the dashboard is legitimate: you want to know whether your investment in AI tooling is paying off. The problem is that activity data answers a different question than the one you're actually asking.

The metrics that tell you whether engineering investment is working look different. Are features reaching customers faster? Is the error rate in production going down quarter over quarter? Are your senior engineers staying? When engineers push back on scope, do they win sometimes, meaning they're engaged rather than just executing? These signals are harder to pull from a dashboard. They require judgment, which is uncomfortable. But they're the only signals that tell you whether the machine is useful, not just running.
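
Two of those signals can at least be written down as arithmetic, even if interpreting them still takes judgment. The sketch below assumes you can pull, for each feature, the date work started and the date it reached customers, plus a production incident count per quarter. The function names and input shapes are illustrative assumptions, not a standard.

```python
from datetime import date
from statistics import median


def median_lead_time_days(features):
    """Median days from work starting to the feature reaching customers.

    features: iterable of (started, shipped) date pairs (assumed shape).
    """
    return median((shipped - started).days for started, shipped in features)


def relative_change(previous, current):
    """Fractional change between two periods; negative means a decrease."""
    if previous == 0:
        raise ValueError("previous period value must be non-zero")
    return (current - previous) / previous


# Illustrative numbers only.
features = [
    (date(2024, 1, 8), date(2024, 1, 29)),
    (date(2024, 2, 5), date(2024, 2, 15)),
    (date(2024, 3, 4), date(2024, 3, 18)),
]
lead_time = median_lead_time_days(features)
incident_change = relative_change(previous=10, current=6)
```

Neither number says anything about how busy anyone was, which is the point: both describe what reached customers and how it behaved once it got there.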

The Reckoning That's Already Here

The market for engineering surveillance tools exists because something real is missing: a credible way to evaluate engineering output at the leadership level. If you're a CTO, that gap is yours to fill. There are two alternatives: you build the judgment yourself, or someone above you fills the gap with a spreadsheet.

Building the judgment means getting specific. Not "the team is productive" but "we shipped the authentication rewrite in six weeks instead of twelve, and production incidents dropped forty percent the following quarter." Specifics require you to be close enough to the work to know what good looks like. Most of the conversations we're brought in for happen after the dashboard is already built, after the senior engineers have started leaving, after the metrics look great and the product has quietly stalled.

The earlier version of that conversation is cheaper.

When your board asks whether your engineering team is being productive, the answer they need is about outcomes. If you don't have that answer ready, a dashboard will fill the gap. And once it's in place, it takes on a life of its own, measuring what's visible, rewarding what's legible, and steadily driving out the engineers who know the difference.

Before the dashboard becomes infrastructure.

If your leadership team is reaching for activity metrics to evaluate engineering output, the earlier conversation is the cheaper one. Whether you're a tech company or a business that's becoming one because AI left you no choice, it's worth talking to founding-level operators who've been on both sides of that room.