Goodhart’s Law
When a measure becomes a target, it ceases to be a good measure
This is the cardinal rule of metrics and the most common trap in management: the moment a measure is used as a target for reward or evaluation, its value as an objective indicator is destroyed. People inside the system optimize for the metric itself, often at the expense of the real outcome the metric was meant to represent. In engineering, this means if you make “lines of code” a target, you’ll get verbose, bloated code. If you make “story points” a target, you’ll get inflated estimates and artificially small tasks.
Why it happens:
- The Path of Least Resistance: It is almost always easier to game a proxy metric than it is to achieve the real goal the metric was supposed to represent. Hitting a numeric target becomes the job, displacing the more complex work of delivering genuine value.
- Malicious Compliance: When people are pressured to hit a specific number, they will comply with the letter of the law while violating its spirit. They will close tickets without properly fixing the bug or deploy code that meets the spec but fails to solve the user’s problem.
- Focus on Outputs, Not Outcomes: Simple, countable metrics (outputs) like “commits” or “deployments” are easy to measure and target. The real goal (outcome), such as “improved user satisfaction” or “reduced system downtime,” is harder to quantify. Goodhart’s Law pushes organizations to focus on the easy-to-measure outputs, even when they become disconnected from valuable outcomes.
What to do about it:
- Measure Processes, Not People: Use metrics to understand the health of your development system, not to evaluate individual performance. Track metrics like cycle time (how long a change takes to go from idea to production) to identify bottlenecks in your process, not to rank engineers against each other.
- Use a Dashboard of Counter-Balancing Metrics: Never rely on a single metric. Instead, use a “dashboard” of metrics that balance each other out. For example, pair a speed metric like “Deployment Frequency” with a quality metric like “Change Fail Rate.” This makes it much harder to game the system, as optimizing for one metric at the expense of the other will be immediately visible. The DORA metrics are an excellent example of such a balanced dashboard.
- Tie Targets to Business Outcomes: When you must set targets, tie them as closely as possible to real business or customer value. Instead of targeting “10 features shipped,” target a “5% reduction in customer churn.” This forces the team to focus on solving the real problem, rather than just hitting a proxy. The “how” is left to the team, but the “why” is clear and difficult to game.
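The counter-balancing idea above can be sketched in a few lines. This is a minimal, hypothetical example: the deployment log and field names are invented for illustration, and the point is simply that a speed metric (Deployment Frequency) reported alongside a quality metric (Change Fail Rate) makes gaming one at the expense of the other immediately visible.

```python
from datetime import date

# Hypothetical deployment log for a 30-day window: (date, caused_incident)
deployments = [
    (date(2024, 6, 3), False),
    (date(2024, 6, 5), True),
    (date(2024, 6, 10), False),
    (date(2024, 6, 12), False),
    (date(2024, 6, 17), False),
]

def deployment_frequency(deploys, window_days=30):
    """Speed metric: deployments per week over the window."""
    return len(deploys) / (window_days / 7)

def change_fail_rate(deploys):
    """Quality metric: fraction of deployments that caused an incident."""
    failures = sum(1 for _, failed in deploys if failed)
    return failures / len(deploys)

# Always report the pair together: shipping faster at the cost of
# quality shows up immediately as a rising fail rate.
print(f"Deployment frequency: {deployment_frequency(deployments):.1f}/week")
print(f"Change fail rate: {change_fail_rate(deployments):.0%}")
```

In practice these numbers would come from your CI/CD and incident-tracking systems rather than a hand-written list, but the pairing principle is the same.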
On the other hand, I’ve written about Why Goodhart’s Law Isn’t All That Useful…
Gilb’s Law
Anything you need to quantify can be measured in some way that is superior to not measuring it at all
This law is a direct challenge to the common excuse, “But that’s too subjective/complex to measure.” It argues that for any desirable quality—be it user satisfaction, code maintainability, or developer productivity—an imperfect proxy metric is vastly superior to having no metric at all. The act of defining a measurement, even a flawed one, forces clarity of thought and provides a baseline against which to judge progress. If you cannot measure it, you cannot meaningfully improve it.
Why it happens (why this principle is necessary):
- Vagueness Prevents Action: Without measurement, goals remain ambiguous platitudes. “Improve user experience” is a wish, not a plan. “Reduce user clicks for the core workflow from an average of 7 to 4” is a measurable, actionable goal.
- Measurement Forces Definition: The process of trying to measure a fuzzy concept like “code quality” forces you to define what you mean. Is it test coverage? Cyclomatic complexity? The rate of production bugs? This act of definition is often more valuable than the resulting number, as it creates a shared understanding within the team.
- Perfection is the Enemy of Progress: Teams often get stuck in “analysis paralysis,” endlessly debating the flaws of every proposed metric. Gilb’s Law encourages a pragmatic approach: start with a “good enough” metric now, learn from it, and improve it later. An imperfect signal is better than no signal.
What to do about it:
- Start with a Proxy Metric: When faced with a seemingly unmeasurable quality, find a reasonable proxy. To measure “developer morale,” you could start by tracking voluntary team event attendance or responses to a simple weekly poll. It’s not perfect, but it’s a data point where before you had none.
- Decompose the Concept: Break down large, abstract goals into smaller, measurable components. “Improving platform stability” can be decomposed into “increase mean time between failures (MTBF),” “decrease mean time to recovery (MTTR),” and “reduce the number of P1 incidents per month.”
- Iterate on Your Metrics: Treat your metrics like code. Your first attempt will likely have flaws. Use it, observe its shortcomings, and then refactor or replace it. The goal is not to find the perfect metric on day one, but to continuously evolve your ability to measure what matters.
- Use Metrics for Insight, Not as a Weapon: This is the crucial companion to Goodhart’s Law. The initial purpose of a new metric should be for observation and learning, not for setting performance targets. Use it to understand the system and spark conversations. Only once a metric is well-understood and stable should it be cautiously considered as part of a balanced dashboard of targets.
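The decomposition bullet above is concrete enough to compute. Here is a minimal sketch, using an invented list of P1 incident records, showing how “platform stability” breaks down into the two standard measurable components, MTTR and MTBF.

```python
from datetime import datetime, timedelta

# Hypothetical P1 incident records: (started, resolved) timestamps.
incidents = [
    (datetime(2024, 6, 1, 9, 0),  datetime(2024, 6, 1, 10, 30)),
    (datetime(2024, 6, 8, 14, 0), datetime(2024, 6, 8, 14, 45)),
    (datetime(2024, 6, 20, 2, 0), datetime(2024, 6, 20, 5, 0)),
]

def mttr(records):
    """Mean time to recovery: average outage duration."""
    total = sum((end - start for start, end in records), timedelta())
    return total / len(records)

def mtbf(records):
    """Mean time between failures: average gap between consecutive incident starts."""
    starts = sorted(start for start, _ in records)
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return sum(gaps, timedelta()) / len(gaps)

print("MTTR:", mttr(incidents))  # average outage length
print("MTBF:", mtbf(incidents))  # average time between incident starts
```

A rising MTBF with a flat MTTR tells a different improvement story than the reverse, which is exactly why the abstract goal is worth splitting into separately measurable parts.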