Topic: adr review checklist

ADR Review Checklist — What to Look For Before Merging an Architecture Decision Record

Q: How do I review the Alternatives section without deep domain expertise?

Focus on trade-off quality, not coverage. The reviewable signal is this: each alternative should have at least one 'this one was rejected because X' statement where X is more than a single word. If the alternatives section says 'Option B: MongoDB (rejected)' with no further detail, ask for one sentence describing what made Postgres the better fit. That sentence is the irreplaceable part — it is what a new engineer three years from now will search for when they ask 'why not MongoDB?' You do not need domain expertise to tell whether the rejection rationale is present; you do need it to assess whether the rationale is correct, which is not your job as a documentation reviewer.

Q: Does the ADR review have to happen before the code ships?

Ideally yes, but in practice ADRs often trail the implementation by days or weeks, especially for fast-moving decisions under deadline pressure. The review still matters because an ADR that ships two weeks late with accurate trade-offs and honest alternatives is more valuable than one that ships before the code with placeholder content. The goal is to keep the review fast enough that authors feel comfortable drafting ADRs on real decisions rather than waiting until they have a sanitized summary prepared, which usually means never. Three of the four checklist categories (structure, traceability, longevity) can be verified in under two minutes; only the reasoning-quality items require reading the decision carefully.

Most ADR practices fail for one of two reasons: teams never write ADRs in the first place, or they write them but the review process lets in low-quality records that accumulate into a directory full of vague, incomplete, or actively misleading history. The review stage is where the practice either strengthens over time or quietly degrades. This page is the PR-reviewer's counterpart to the ADR template: a 12-item checklist across four categories that takes under five minutes to apply and catches the failure modes CI can't.

TL;DR

ADR review is not a second decision gate: it assesses documentation quality, not the correctness of the choice. The 12 checklist items split into four categories: structure (title is a decision statement, Status is valid, Context doesn't presuppose the solution, Decision is active voice), reasoning quality (at least two alternatives are named, each rejected alternative has a real rejection reason, Consequences has at least one downside, Consequences stays at the level of the architectural call), traceability (PR or issue number present, supersession is bidirectional if applicable), and longevity (no version numbers that will rot in a year, no implementation details that belong in code comments). The ADR GitHub Action workflow handles mechanical structure enforcement; the reasoning-quality items are where human review adds irreplaceable value.

Why ADR review differs from code review

In a code review you're assessing correctness — will this code work, is it safe, does it match the intent. In an ADR review you're assessing documentation quality — is this decision well-enough documented that a future engineer can understand what was decided, what was considered, and what trade-offs were accepted. Those are different questions, and conflating them is the root cause of almost every dysfunctional ADR review.

The practical consequence: blocking an ADR PR because you disagree with the decision is the most common way ADR practices die. Once the team learns that ADR PRs get re-litigated the same way implementation PRs do, they stop writing ADRs for time-sensitive calls. The ADR directory degrades from "decisions that actually happened" to "decisions that had enough political runway to survive a double review." That is a worse audit trail than no ADRs at all, because it creates the illusion of documentation without the reality.

The reviewer's scope is limited to four questions:

Is the decision legible — can a future engineer understand what was decided, and why?
Is the reasoning honest — does it acknowledge trade-offs and rejected options with real rationale?
Is the record traceable — is there a PR or issue link, and are supersession pointers bidirectional?
Is the record durable — will it still make sense in two years, or has it embedded version numbers and implementation details that will rot?

You can raise an alternative you believe was overlooked. You can flag a risk you think the Consequences section undersells. You can request clarity when the Context section is ambiguous. What you shouldn't do is block approval pending agreement on the decision itself — if you have the authority to change the decision, that conversation belongs before the ADR PR exists, not inside it.

The 12-item review checklist

Structure (4 items)

These four checks are mechanical enough that CI can partially enforce them, but a human reviewer catches the subtler violations the regex misses.

The title is a decision statement, not a question or a topic. "Choose Postgres for the user database" is a decision statement. "Database choice" is a topic. "Should we use Postgres or MongoDB?" is a question. Only decision statements make the directory scannable and make the index useful. The ADR GitHub Action's lint job checks that the title starts with a capital letter and has reasonable length; it can't check that the title actually states the decision made.
Status is one of the valid lifecycle states. The canonical set is: Proposed, Trial Period, Accepted, Superseded, Deprecated, Rejected. Variants like "Draft," "WIP," "Under review," or "Pending" are not valid Status values — they describe the PR state, not the ADR's lifecycle state. An ADR that merges with Status: Draft was merged too early; block it or ask the author to set Status: Proposed (which is the correct state for a decision that is submitted but not yet ratified).
Context states the problem without presupposing the solution. The Context section should describe the situation that made a decision necessary, not the decision itself. "We need to choose Postgres" is not a context; "Our write throughput is exceeding the current SQLite ceiling and we need a transactional database that supports concurrent writers" is. If Context already names the chosen option, the author has merged problem and solution, which makes the record hard to use for supersession later, when the context has changed but the decision may still be valid.
Decision is written in active voice, first-person plural. "We chose Postgres over MongoDB" is the canonical form. "Postgres was chosen" obscures agency and makes the record feel like a legal artifact rather than a team commitment. "The team decided on Postgres" adds indirection without adding information. Active voice with a "We ..." subject is the form the Nygard template established for the Decision section; consistency across the directory makes the index scannable and the records quotable.
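The two most mechanical structure checks can be sketched in a few lines. This is a minimal illustration, not the ADR GitHub Action's actual implementation: the function names are hypothetical, and it assumes each ADR carries a plain "Status: <value>" line. The passive-voice pattern only catches the obvious "was chosen" family; subtler constructions still need a reviewer.

```python
import re

# Lifecycle states listed in the checklist item above.
VALID_STATUSES = {
    "Proposed", "Trial Period", "Accepted",
    "Superseded", "Deprecated", "Rejected",
}

# Assumption: the ADR contains a line of the form "Status: <value>".
STATUS_RE = re.compile(r"^Status:[ \t]*(.+?)[ \t]*$", re.MULTILINE)

# A handful of obvious passive constructions; this list is illustrative only.
PASSIVE_RE = re.compile(
    r"\b(?:was|were|has been|have been)\s+(?:chosen|selected|decided|adopted)\b",
    re.IGNORECASE,
)


def check_status(adr_text: str) -> list[str]:
    """Return a list of problems with the Status line (empty if none)."""
    match = STATUS_RE.search(adr_text)
    if match is None:
        return ["no Status line found"]
    status = match.group(1)
    if status not in VALID_STATUSES:
        return [f"invalid Status: {status!r}"]
    return []


def check_decision_voice(decision_text: str) -> list[str]:
    """Flag the first obvious passive construction in the Decision section."""
    match = PASSIVE_RE.search(decision_text)
    if match:
        return [f"passive construction in Decision: {match.group(0)!r}"]
    return []
```

A check like this would reject "Status: Draft" and flag "Postgres was chosen", while passing "We chose Postgres over MongoDB". It cannot tell whether the title states a decision or whether Context presupposes the solution.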

Reasoning quality (4 items)

These four checks are where the human review adds irreplaceable value. CI cannot assess whether a trade-off is real, whether an alternative was actually considered, or whether the Consequences section is honest.

At least two alternatives are named. An ADR with one alternative listed ("we considered MongoDB, rejected it, chose Postgres") is better than one with none. But two is the threshold where the record becomes useful for supersession: when the team revisits the decision, they need to know what was evaluated, not just what was chosen. If the author truly considered only one alternative, that's worth capturing as context ("at the time, only two options were evaluated because X") rather than presenting a single-alternative section.
Each rejected alternative has a real rejection reason. "Option B: MongoDB (rejected)" is not a rejection reason; it is a label. The reviewable signal is that each alternative's rejection gets at least one sentence where the sentence does more than restate the chosen option's name. "Rejected because Postgres has stronger ACID guarantees for our dual-write pattern" is a real reason. "Rejected because we preferred Postgres" is circular. This is the single checklist item most worth spending time on — the rejection rationale is what a future engineer will search for when they're considering revisiting the decision, and it is impossible to reconstruct from the git history if it wasn't written at the time.
Consequences lists at least one negative consequence. An ADR whose Consequences section lists only benefits wasn't written honestly — every architectural choice has trade-offs, and a record that doesn't acknowledge them provides false assurance rather than a useful audit trail. The downside doesn't have to be large; it just has to be real. "Postgres introduces operational overhead that SQLite didn't require; we've committed to using RDS to offload most of that, which adds cost" is honest. "Postgres provides all the reliability and performance we need" is a marketing pitch, not a Consequences section.
The Consequences section is bounded to the architectural call, not the implementation. "We will need to configure connection pooling" is an implementation detail. "We accept the operational dependency on a managed database service" is an architectural consequence. The distinction matters because implementation details go stale quickly (connection pool config changes weekly; the managed-service dependency is still true in three years) and because mixing them makes the record longer without making it more useful. The implementation details belong in the PR description, the runbook, or the code comments — not the ADR.
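Substance cannot be automated, but a script can point the reviewer at the rejection reasons most likely to be empty. The sketch below is a heuristic under stated assumptions: the function name, the filler-word list, and the (name, reason) input shape are all hypothetical, and a reason that passes this filter may still be wrong or vague. It flags label-only entries and reasons that contain nothing beyond the chosen option's name.

```python
import re

# Words that carry no rationale on their own. Illustrative, not exhaustive.
FILLER = {"rejected", "because", "we", "preferred", "prefer",
          "chose", "picked", "wanted", "the"}


def flag_rejection_reasons(chosen, alternatives):
    """alternatives: list of (name, rejection_text) pairs.

    Returns human-readable flags for a reviewer to follow up on.
    """
    flags = []
    if len(alternatives) < 2:
        flags.append(
            f"only {len(alternatives)} alternative(s) named; "
            "the checklist asks for at least two"
        )
    for name, reason in alternatives:
        words = re.findall(r"[a-z0-9'-]+", reason.lower())
        # Drop filler and the chosen option's own name; what remains is
        # the candidate rationale.
        substantive = [w for w in words
                       if w not in FILLER and w != chosen.lower()]
        if substantive:
            continue
        if len(words) <= 1:
            flags.append(f"{name}: label only, no rejection reason")
        else:
            flags.append(
                f"{name}: reason may be circular, "
                f"it only restates the preference for {chosen}"
            )
    return flags
```

Run against the three MongoDB examples from the text, "(rejected)" is flagged as a label, "Rejected because we preferred Postgres" is flagged as possibly circular, and the ACID-guarantees sentence passes. The reviewer still has to read the ones that pass.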

Traceability (2 items)

A PR link or issue number is present. The ADR should trace back to the conversation that produced the decision. A PR link ("see discussion in #412") or issue number ("arising from INFRA-88") is sufficient. This matters for supersession: when a new ADR supersedes this one, the author of the new ADR needs to reconstruct the original reasoning, and the git history of the ADR file alone rarely contains the full context. The link is the bridge between the formal record and the informal discussion where the real reasoning happened — including, increasingly, the AI chat conversation where the trade-offs were first worked through.
If this ADR supersedes another, both files are in the PR. Supersession must be atomic: the new ADR gets a Supersedes: NNNN-old-slug.md line, and the old ADR gets Status: Superseded and a Superseded-by: MMMM-new-slug.md back-pointer. If you see a PR that adds a new ADR with a Supersedes line but doesn't include the old ADR as a changed file, block it — one-sided supersession is the most common integrity bug in ADR directories. The ADR supersession pattern page documents the full two-file atomic protocol; the GitHub Action's supersession job catches existing violations on every PR and nightly.
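Both traceability items reduce to pattern matching, so they are easy to sketch. The code below assumes the header lines quoted above ("Supersedes:", "Superseded-by:", "Status: Superseded") each sit on their own line, and that ADRs are passed in as a filename-to-text mapping. The function names and the example filenames in the usage note are hypothetical, and this is not the Action's supersession job.

```python
import re

# "#412"-style PR references or "INFRA-88"-style issue keys.
LINK_RE = re.compile(r"#\d+\b|\b[A-Z][A-Z0-9]+-\d+\b")
SUPERSEDES_RE = re.compile(r"^Supersedes:[ \t]*(\S+)", re.MULTILINE)
SUPERSEDED_BY_RE = re.compile(r"^Superseded-by:[ \t]*(\S+)", re.MULTILINE)
STATUS_SUPERSEDED_RE = re.compile(r"^Status:[ \t]*Superseded\b", re.MULTILINE)


def has_discussion_link(adr_text: str) -> bool:
    """True if the text contains something shaped like a PR or issue reference."""
    return LINK_RE.search(adr_text) is not None


def check_supersession(adrs: dict[str, str]) -> list[str]:
    """adrs maps filename -> file text. Returns one-sided supersession problems."""
    problems = []
    for name, text in adrs.items():
        # Forward direction: the new ADR names an old one.
        for old in SUPERSEDES_RE.findall(text):
            old_text = adrs.get(old)
            if old_text is None:
                problems.append(f"{name} supersedes {old}, which is missing")
                continue
            if not STATUS_SUPERSEDED_RE.search(old_text):
                problems.append(f"{old} is not marked Status: Superseded")
            if name not in SUPERSEDED_BY_RE.findall(old_text):
                problems.append(f"{old} lacks Superseded-by: {name}")
        # Reverse direction: the old ADR names a new one.
        for new in SUPERSEDED_BY_RE.findall(text):
            if name not in SUPERSEDES_RE.findall(adrs.get(new, "")):
                problems.append(f"{new} lacks Supersedes: {name}")
    return problems
```

Given a new file such as 0007-use-postgres.md that declares "Supersedes: 0003-use-sqlite.md" while the old file is unchanged, the check reports both the missing Status change and the missing back-pointer. The link pattern is deliberately loose and will also match strings like "UTF-8", so treat a hit as "probably linked", not as proof.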

Longevity (2 items)

No version numbers in the Status or Context sections that will become misleading within a year. "We chose React 18.2 because of its concurrent rendering features" embeds a version number that makes the ADR look outdated the moment React 19 ships — even if the decision to use React remains valid. Write "React (then at 18.2)" in the body if the version matters for historical context, but write the Status and Context in terms of the capability or the constraint, not the version. The ADR should be true forever; the version was true at the time. A future engineer reading "we chose the concurrent-rendering model of React" understands the decision; one reading "we chose React 18.2" learns nothing useful about whether the decision is still valid today.
No implementation details that belong in code comments or runbooks. Connection pool configuration, specific environment variable names, deployment targets, migration commands, and operational runbooks are not architectural consequences; they are implementation details. An ADR that embeds them becomes a maintenance liability: every infrastructure change requires an ADR update, and the ADR directory becomes indistinguishable from the ops wiki. The test: would this sentence still be true after a migration to a different cloud region, a different orchestration tool, or a different deployment model, assuming the core architectural choice stays the same? If not, it's an implementation detail.
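A reviewer can get a quick hint about rotting version numbers by scanning the Status and Context text for "Name 1.2"-shaped mentions. This is a rough sketch with a hypothetical function name. It only matches a capitalised name followed by a dotted version, so it misses bare majors like "React 19", and it cannot judge whether a given version is legitimate historical context.

```python
import re

# A capitalised product-like name followed by a dotted version, e.g. "React 18.2".
VERSION_MENTION_RE = re.compile(
    r"\b[A-Z][A-Za-z0-9+#.-]*\s+v?\d+(?:\.\d+)+\b"
)


def find_version_mentions(section_text: str) -> list[str]:
    """Return every 'Name X.Y' mention found, in order of appearance."""
    return [m.group(0) for m in VERSION_MENTION_RE.finditer(section_text)]
```

On the React example above, the first phrasing yields one hit ("React 18.2") and the capability-based phrasing yields none. A hit is a prompt to ask the author whether the sentence can be rewritten in terms of the capability or constraint.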

Common review failures that kill ADR practices

Beyond the 12 checklist items, three review patterns consistently damage the long-term health of ADR practices.

Rubber-stamping without reading

An ADR PR approval without a comment is almost always a rubber-stamp. The problem isn't the reviewer's efficiency — it's what it signals to the author: that the review adds no value, and that the ADR is ceremony rather than signal. The minimum useful review is one comment on one checklist item (even "looks good, Consequences section is honest about the operational overhead" tells the author the review was real). A blanket approve-without-comment culture produces ADRs written to minimize review friction, not to document decisions honestly.

Blocking on decision disagreement

The failure mode described under "Why ADR review differs from code review," restated: if a reviewer's block is "I would have chosen MongoDB" rather than "this ADR doesn't document why MongoDB was rejected," the block is inappropriate. The distinction is between a documentation review and a decision review. The reviewer who disagrees with the decision should escalate through whatever governance process the team uses for architecture decisions (an Architecture Board review, a team sync, a separate RFC), not through a hold on the ADR PR. Blocking the ADR PR delays documentation without having any effect on the decision, because the code has already shipped or the decision has already been made.

Requiring perfect alternatives coverage under time pressure

The "at least two alternatives" item should be applied in proportion to the decision's reversibility and impact. For a foundational stack choice that will shape the codebase for five years, comprehensive alternatives coverage is worth a slower review. For a time-sensitive operational decision made during an incident, an ADR with one alternative and a note that "time constraints limited evaluation" is more honest and more useful than an ADR delayed until the incident is resolved. The ADR practice dies when the review bar makes authors feel that writing the ADR will cost them more than not writing it.

The automation ceiling

The ADR GitHub Action workflow enforces a subset of the checklist automatically on every PR that touches doc/decisions/:

Checklist item	CI can catch?	Human required?
Title is a decision statement	Partially — checks length and capitalisation	Yes — CI can't distinguish "Choose Postgres" from "Database considerations"
Status is a valid lifecycle value	Yes — vocabulary check via grep	No — if the regex covers all valid values
Context doesn't presuppose the solution	No	Yes
Decision is active voice, first-person plural	Partially — checks for "was chosen" passive patterns	Yes — for subtler passive constructions
At least two alternatives listed	Yes — count alternatives section items	No — if the CI check covers the template format
Each alternative has a rejection reason	No — requires reading for substance	Yes
Consequences has at least one downside	No	Yes
Consequences is bounded to the architectural call	No	Yes
PR or issue link present	Yes — regex for #NNN or JIRA-NNN patterns	No
Supersession is bidirectional (both files in PR)	Yes — the supersession integrity job	No — when the CI job covers both directions
No rotting version numbers in Status/Context	No	Yes
No implementation details in Consequences	No	Yes

The six items CI cannot catch at all (context presupposition, rejection rationales, downside consequences, architectural boundary, rotting version numbers, and stray implementation details) are exactly the items that determine whether the ADR is useful three years from now. Two more, decision-statement titles and active-voice Decisions, are only partially checked and still need a human glance. The four items CI can enforce fully are mechanical; getting them wrong is embarrassing but not damaging. The reviewer's time is well spent on the six.
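The table can also be kept as data, so that a review tool can print only the items a human still has to read for. The sketch below transcribes the table's "CI can catch?" column into three levels. The level names and the function are hypothetical conveniences, not part of the workflow described above.

```python
from collections import Counter

# (checklist item, CI coverage) pairs transcribed from the table above.
CI_COVERAGE = [
    ("Title is a decision statement", "partial"),
    ("Status is a valid lifecycle value", "full"),
    ("Context doesn't presuppose the solution", "none"),
    ("Decision is active voice, first-person plural", "partial"),
    ("At least two alternatives listed", "full"),
    ("Each alternative has a rejection reason", "none"),
    ("Consequences has at least one downside", "none"),
    ("Consequences is bounded to the architectural call", "none"),
    ("PR or issue link present", "full"),
    ("Supersession is bidirectional (both files in PR)", "full"),
    ("No rotting version numbers in Status/Context", "none"),
    ("No implementation details in Consequences", "none"),
]


def human_worklist(coverage=CI_COVERAGE):
    """Items a reviewer still has to read for, human-only items first."""
    order = {"none": 0, "partial": 1}
    todo = [(item, level) for item, level in coverage if level in order]
    # sorted() is stable, so table order is preserved within each level.
    return sorted(todo, key=lambda pair: order[pair[1]])


# Tally of how many items fall in each coverage level.
counts = Counter(level for _, level in CI_COVERAGE)
```

The tally comes out to four fully automated items, two partially automated, and six human-only, so the reviewer's worklist has eight entries with the six human-only items at the top.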

How WhyChose fits in

The hardest checklist items to pass are the ones that require original reasoning at writing time: real alternatives with real rejection rationales, an honest Consequences section with at least one downside, a Context section that states the problem without presupposing the solution. The WhyChose extractor reads your ChatGPT or Claude chat exports and surfaces decision-shaped exchanges — conversations where the team considered multiple options, traded off their consequences, and reached a conclusion. The extractor's output pre-fills exactly those sections: the alternatives considered in the chat become the Alternatives section, the trade-offs discussed become the Consequences section, the context established before the decision was made becomes the Context section. An ADR drafted from an extractor output typically passes the reasoning-quality checklist items on the first review, because the reasoning was done in the chat and the extractor makes it legible.
