Blog · 2026-06-11 · ~11 min read
The ADR review checklist: what to verify before merging
An engineer opens a PR for an ADR. The review takes four minutes. The reviewer leaves two comments: one corrects a typo, one asks for the status field to be capitalized. The PR merges. Fourteen months later, a new engineer asks why the team chose TypeScript over Go for the API service. The ADR exists. It says: "TypeScript — chosen for team familiarity. Go — not chosen." The original author left eight months ago. The reviewer left six months ago. "Team familiarity" is the whole record. What the ADR doesn't say: that Go was benchmarked and hit the latency target; that TypeScript was chosen because the team already owned an internal ORM that would have taken six weeks to rebuild in Go; that the ORM dependency cascaded into the database schema choice; that this decision created a hiring constraint that the team later felt but couldn't trace to its origin. The reviewer approved an ADR that was a filing artifact, not a decision record. The PR passed because it had the right sections. It failed as a document because nobody checked what was inside them.
TL;DR
Most ADR reviews are format reviews — checking that sections exist, status is set, and the title is descriptive. Format reviews produce correctly structured records with thin reasoning. A decision review checks whether the ADR captures the three things that make it useful in 18 months: named alternatives with concrete rejection reasons, the constraint that drove the choice, and honest consequences that name a real trade-off. Before opening a PR, authors should run five checks. During review, reviewers should run three. The Alternatives Considered section is the most consistently under-populated field in any decision log — the failure mode is entries that name options without explaining rejections. The fix is to run the WhyChose extractor on the AI chat session that preceded the ADR before writing the section: the extraction output shows which alternatives were actually deliberated, with the reasoning that was live at the time rather than reconstructed from memory after the decision was made.
The difference between a format review and a decision review
Code reviews have a clear success criterion: the code compiles, tests pass, behavior is correct, the approach is maintainable. ADR reviews have a less obvious one because an ADR is a document, not a program — it can't fail a test. The result is that most ADR reviews default to the only criterion that's visible on the surface: format correctness.
Format reviews ask: Are all required sections present? Is the status field set? Is the title descriptive? Is the Nygard or MADR template followed correctly? These questions are answerable in two minutes and produce clear pass/fail signals.
Decision reviews ask: Are the alternatives real ones that were actually considered, or are they strawmen included to fill the section? Does the rejection reason for each alternative explain why this option failed for this team at this time, not just describe the option? Is the constraint that drove the choice named explicitly — not "performance" but "we needed p99 under 50ms for the payment flow given the existing infrastructure"? Are the consequences honest — does the record acknowledge what was given up, not just what was gained?
The two types of review are not alternatives. Format is necessary — a structurally broken ADR is hard to parse and fails search. But format is not sufficient. An ADR with all the right sections, correctly capitalized, with a valid status enum value, can still be a filing artifact if the sections contain thin content. The checklist below covers both, with the author responsible for format correctness before the PR is opened and the reviewer responsible for decision quality once the PR is open.
The reason format and decision review are often confused is that most ADR tooling — Nygard's original template, MADR, adr-tools, Log4Brains — enforces structure, not substance. The tool can verify that a Considered Options section exists. It cannot verify that the entries in it contain useful information. The checklist fills that gap.
Five checks before opening the PR (author)
These checks should run before the PR is created, not as PR comments. Catching a thin Considered Options section before review means fixing it while the reasoning is still fresh — the decision was made recently, the AI chat session is still accessible, and the rejected alternatives are still in working memory. Catching it after review means reconstructing reasoning that may have already faded.
Check 1: Is the decision actually made?
The Status field should be Accepted or Rejected before the PR opens. An ADR with Status: Proposed or Status: Under Discussion is a proposal document, not a decision record. Proposals belong in an RFC or draft document — opening a PR for a Proposed ADR signals that review is part of the decision-making process, which creates confusion about what the review is for. Reviewers don't know whether to weigh in on the decision itself or on the documentation of a decision that's already been made.
The rule: ADR PRs are for documentation review, not for making the decision. The decision happens first — in a meeting, an async thread, an RFC comment window, or a Slack message. The ADR records it. If you need consensus on the decision before writing the ADR, that's what an RFC is for.
Exception: some teams use ADR PRs for the final stage of an RFC process, where the PR is the comment window and an accepted PR equals an accepted decision. If your team does this, be explicit: the ADR template should carry a distinct "RFC-in-progress" status separate from "Proposed", and the PR description should say "this PR accepts the decision when merged." Reviewers need to know which mode they're in.
Check 2: Are there at least two named alternatives with rejection reasons?
The Alternatives Considered section (or Considered Options in MADR) is the most load-bearing part of an ADR for long-term value. Six months after a decision, the question is almost never "what did we choose?" — that's in the code. The question is "what did we not choose, and why?" The chosen option is visible. The rejected options are invisible unless they're written down.
Two named alternatives is a floor, not a ceiling. Most architectural decisions involve three to five real contenders. If the Alternatives Considered section lists only one alternative, the ADR should explain why — either there was genuinely only one alternative (rare), or the section is incomplete.
Each entry needs a rejection reason. "Redis — not chosen" fails. "Redis — evaluated for session storage; rejected because the team has no operational experience with Redis and the additional on-call burden for a 5ms latency gain didn't meet the risk threshold given our 3-engineer infrastructure team" passes. The rejection reason doesn't need to be long. It needs to be specific enough that an engineer who joins in two years understands why this option was off the table at the time, and whether the constraint that rejected it might have changed.
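One way to make this check hard to skip is to model each entry so that it cannot exist without a reason. The sketch below is illustrative only: the `Alternative` class, the `render_considered_options` helper, and the eight-word threshold are assumptions, not part of any ADR template or tool. The word count is a crude proxy that catches empty entries like "not chosen"; it cannot judge whether a reason is actually good.

```python
from dataclasses import dataclass

# Arbitrary floor (an assumption): shorter reasons are treated as missing.
MIN_REASON_WORDS = 8


@dataclass(frozen=True)
class Alternative:
    """One rejected option in a Considered Options section (hypothetical model)."""

    name: str
    rejection_reason: str

    def __post_init__(self) -> None:
        # Refuse entries such as "Redis: not chosen"; the reason must carry content.
        if len(self.rejection_reason.split()) < MIN_REASON_WORDS:
            raise ValueError(f"{self.name}: rejection reason is missing or too thin")


def render_considered_options(alternatives: list[Alternative]) -> str:
    """Render the entries as a bullet list, enforcing the two-alternative floor."""
    if len(alternatives) < 2:
        raise ValueError("list at least two alternatives, or explain why only one existed")
    return "\n".join(
        f"- {a.name}: rejected because {a.rejection_reason}" for a in alternatives
    )
```

With this model, `Alternative("Redis", "not chosen")` raises a `ValueError` at construction time, while an entry with a full sentence of reasoning renders as a normal bullet.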
Check 3: Is the constraint named?
The Consequences or Decision Outcome section in most templates has room for the rationale: why this option won. The most common failure mode is that the rationale describes the chosen option's properties without naming the constraint that selected it.
"TypeScript was chosen because it offers strong typing and good tooling" is not a constraint — it describes the option's features. "TypeScript was chosen because the team already owns the payment-flow ORM in TypeScript and rebuilding it in Go would have required six engineer-weeks we didn't have before the Q3 feature freeze" is a constraint. The difference matters in 18 months when the team is considering a partial Go migration: the first rationale sounds like a permanent endorsement of TypeScript's virtues; the second clearly shows a context-dependent decision that should be revisited if the ORM dependency changes.
Good constraint language includes: timeline pressure, team capability gaps, existing system dependencies, cost ceilings, compliance requirements, vendor lock-in decisions, performance requirements with specific numbers. Abstract rationale — "better long-term maintainability," "aligns with industry best practices" — is almost never a real constraint. Real constraints have specificity.
Check 4: Are the Consequences honest?
The Consequences section of a decision record is not a features list. Every architectural choice accepts a trade-off — that's what makes it a decision rather than an obvious correct answer. An ADR whose Consequences section lists only gains and no costs is a sign that the trade-offs were either not analyzed or not written down.
"This will improve performance and maintainability" is not an honest consequences section. "This improves latency for the read path at the cost of write complexity — every update now requires two writes and a cache invalidation. The engineering team will need to own the invalidation logic and add monitoring for cache drift" is honest.
The question to ask before opening the PR: "What did we give up?" If the answer is "nothing," the ADR is either documenting a trivially correct decision that didn't need an ADR, or the trade-offs haven't been surfaced. In a real decision with real contenders, something is always traded. Write what it is.
This is especially important for security decisions, where the Consequences section needs to name accepted risks explicitly — not just desired outcomes. An ADR that says "we chose to store session tokens in the database rather than a dedicated secret store" needs to acknowledge the accepted risk, not just say "simpler to operate."
Check 5: Is the ADR self-contained?
A new engineer with no context beyond the decisions/ folder should be able to read the ADR and understand the decision without needing to search Slack, watch a recording, or interview the author. This check is often violated by ADRs that reference context that exists elsewhere: "as discussed in the architecture review on [date]," "given the constraints from the infrastructure team's Q2 planning," "following the decision from ADR-0032."
References to other ADRs are fine — linking to a prior decision that established a constraint is exactly what the supersedes/linked-by field is for, and those links remain navigable inside the decisions/ folder. References to external surfaces (meetings, Slack threads, documents outside the repo) are a risk: those surfaces may not exist in 18 months, and even if they do, a new engineer shouldn't need to retrieve them to understand the decision.
The test: take the ADR, strip the header metadata, and read only the Context, Considered Options, Decision, and Consequences sections. Can you reconstruct why this decision was made? If not, the referenced context needs to be brought into the ADR, not linked out to an ephemeral source.
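A rough pre-PR aid for this check is to scan the draft for phrases that usually point at context living outside the repo. The sketch below is a heuristic under stated assumptions: the phrase list is hypothetical, it will produce false positives (for example, "capacity planning" used as a plain noun), and a clean result does not prove the ADR is self-contained.

```python
import re

# Hypothetical phrase list; tune it to the surfaces your team actually uses.
EPHEMERAL_REFERENCE = re.compile(
    r"as discussed|slack|meeting|recording|architecture review|planning",
    re.IGNORECASE,
)


def find_ephemeral_references(adr_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that mention context outside the repo."""
    hits = []
    for number, line in enumerate(adr_text.splitlines(), start=1):
        if EPHEMERAL_REFERENCE.search(line):
            hits.append((number, line.strip()))
    return hits
```

Each hit is a prompt to inline the referenced context, not an automatic failure. Links to other ADRs are deliberately left alone.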
Three checks during review (reviewer)
The reviewer's job is different from the author's. The author checks completeness from the inside — they know what the decision was and whether the ADR captures it. The reviewer checks utility from the outside — they're simulating the new engineer who will read this ADR in 18 months with no context other than the document itself.
Reviewer check 1: Can you reconstruct the deliberation from the Alternatives Considered section alone?
Read the Alternatives Considered section without looking at the Decision section. Do the entries contain enough information to understand why each option was in contention and why it was ultimately ruled out? If you can't tell which options were serious contenders versus token inclusions, or if you can't tell why each non-chosen option was rejected, the section is incomplete.
This check surfaces the most common failure mode: alternatives that are named but not explained. "PostgreSQL, MySQL, MongoDB: PostgreSQL chosen" gives you a list but not a deliberation. "PostgreSQL: chosen; strong consistency guarantees, team has operational expertise, existing backup tooling. MySQL: evaluated; compatibility concerns with the JSON column type we need for the configuration store. MongoDB: evaluated; team has strong preference against schemaless stores for financial data after an incident at a previous company" is a deliberation whose reasoning you can reconstruct.
If the review comment is "I can't tell why X was rejected," the author's fix isn't to argue the point in the PR thread — it's to add the rejection reason to the ADR. The PR thread won't be readable in 18 months; the ADR will be.
Reviewer check 2: Is the constraint specific enough to expire correctly?
Every constraint has a half-life. "The team doesn't have Go expertise" is a constraint that expires when the team hires a Go engineer. "The Q3 feature freeze made a six-week migration infeasible" is a constraint that expired in Q4. "Our compliance requirements at the time prohibited storing PII in third-party services" is a constraint that may expire when compliance requirements are renegotiated. Writing the constraint with enough specificity that its expiry is detectable is what lets future engineers decide whether to revisit the decision.
The reviewer's question: "If this constraint changed, would a future engineer be able to tell this decision should be reconsidered?" If the constraint is so abstract that it couldn't change (it's described as a timeless best practice rather than a context-dependent limitation), it's not the real constraint. Ask for the real one.
This connects directly to the ADR lifecycle: well-scoped constraints let engineers set appropriate review triggers. "Revisit once more than three engineers on the team have Go expertise" is a specific condition. "Revisit when TypeScript no longer makes sense" is not.
Reviewer check 3: Does the Consequences section name what was given up?
As a reviewer, your specific job is to be the new engineer who inherits this system with no prior context. Read the Consequences section as that person. Are you being told what you'll have to own, build, or work around as a result of this decision? Are you told what the system won't do because of this choice?
If the Consequences section reads as a description of the chosen option's benefits rather than as a list of trade-offs accepted, it's incomplete. A test: replace the chosen option's name with a competitor's name in the Consequences section. Does the section still read as true? If so, the section is describing the option's general properties, not the specific trade-offs accepted in this context. Ask for the trade-offs that are specific to this team, this system, and this decision.
Why Alternatives Considered is consistently under-populated
Templates that include an Alternatives Considered section treat it as a required field. In practice, it's the field most often filled with thin content. Three structural reasons explain why, and understanding them points to the fix.
The deliberation happened in AI chat, not in a document
The engineer who made the decision didn't evaluate alternatives in a structured document. They opened ChatGPT or Claude, described the problem, and explored options in natural language over three or four messages. The AI responded with pros, cons, trade-offs, questions. The engineer refined their thinking. By the end of the session, the decision was essentially made — but the reasoning lived in a chat session, not in a structured record.
When the engineer sits down to write the ADR hours or days later, the chat session is closed. They're writing from memory. The alternatives they can remember are the ones that survived most clearly — usually the chosen option and one or two prominent rejected options. The subtler alternatives, the constraints that ruled out whole categories, the trade-off reasoning that made the final choice clear — those are harder to reconstruct from memory than they were to generate in real time.
The fix: run the WhyChose extractor on the AI chat session that preceded the ADR before writing the Considered Options section. The extractor surfaces the deliberation candidates — the question-shaped messages, the trade-off markers, the commit phrases — that represent the decision-making content of the session. Running the extractor while the chat is still fresh (before opening the PR, not after) produces extraction output that maps directly onto the Considered Options entries. The alternatives are already documented; they just need to be moved from the chat session into the ADR.
This is the practical connection between the quarterly extraction pass and the per-decision ADR review: you don't have to wait for the quarterly review to extract from a chat session. If you wrote an ADR and know the deliberation happened in AI chat, extracting from that specific session before the PR opens produces a better ADR with less reconstruction effort.
The engineer already knows the answer
When you made the decision, you had full context: all the alternatives, all the constraints, the conversation that shaped the final choice. Writing the Alternatives Considered section from that context feels redundant — you know why Redis was rejected, so writing it down feels like explaining something obvious.
The problem is that the section isn't for you — it's for the engineer who joins in 18 months and has none of that context. What's obvious to you is invisible to them. The rule of thumb for alternatives: if you can say out loud why a specific option was rejected in a way that would satisfy a skeptical colleague, that reasoning belongs in the ADR. If you find yourself thinking "it's obvious why we rejected it," that's usually the reasoning most worth writing down — because it's only obvious to the people who were there.
The format doesn't prompt for rejection reasons
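The Nygard template has no dedicated alternatives section at all: its headings are Title, Status, Context, Decision, and Consequences, so rejected options appear only if the author works them into Context or Decision. The MADR 4.0 template lists Considered Options as a simple list of names and adds a Pros and Cons section per option, which is better, but that section is still structured as a feature list rather than a rejection argument. Neither format explicitly prompts: "For each option you didn't choose, write the reason it was rejected."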
Adopting an explicit prompt in your team's ADR template is a low-cost fix for this. Add a line under the Considered Options heading: "For each non-chosen option, include a one-sentence rejection reason." The prompt doesn't add ceremony; it just makes the expectation visible. Teams that add this prompt tend to get better Alternatives Considered sections than teams that rely on engineers to fill the section correctly by default.
The architecture decision record template comparison page shows how different templates handle this section: MADR's structured approach versus Nygard's minimal template, with a guide to which approach produces better downstream value.
What good ADR feedback looks like
Good review feedback on an ADR asks about decisions. Bad review feedback asks about format. The difference is visible in the questions.
Good feedback:
- "The Consequences section says this will be more maintainable. What's the trade-off you're accepting? Every architectural choice gives something up."
- "You list Redis as a considered option but there's no rejection reason. What made Memcached the right choice here?"
- "The constraint is listed as 'performance requirements.' Can you add the actual numbers? Future engineers won't know if those requirements have changed or whether this decision is still appropriate."
- "This references the Q3 architecture review for context. A new engineer in 2028 won't be able to attend that meeting. Can you inline the relevant context from that review?"
- "The Consequences section mentions operational complexity. Can you be specific about what that means — is there monitoring to add, an on-call runbook to write, a process the team needs to own?"
Unhelpful feedback:
- "Should 'Accepted' be capitalized?"
- "The date format should be ISO 8601."
- "Can you add a blank line before the headers?"
- "The title might be more descriptive."
Format feedback isn't wrong — formatting matters and CI often can't catch all of it. But a review session that produces only format feedback means the reviewer read the sections for structure and not for content. Set the expectation in your team's review norms: at least one decision-quality comment before a review is approved. This pushes reviewers to engage with the content and surfaces thin sections that would otherwise slip through.
The most common push-back on decision-quality feedback is "I wasn't in the room when this decision was made." The answer is: that's exactly the point. If you weren't in the room and you can't reconstruct the deliberation from the ADR alone, the ADR isn't doing its job. Your confusion as a reviewer is data — it's the new-engineer experience in miniature.
Automation: what CI can check, what it can't
CI tooling can automate structural validation. Tools like Log4Brains, the adr-tools linter, and custom GitHub Actions can verify: required sections present, Status is a valid enum value, Considered Options has at least one entry, no placeholder text remains (no "TODO" or "TBD" in a field meant to be filled), the ADR file is named correctly and indexed.
Setting up structural validation as a CI gate — blocking PR merges on structural failures — is worth doing. It catches the class of problems that a reviewer shouldn't need to catch: the ADR that's missing the Consequences section, the draft that was opened before Status was set, the template with unfilled sections.
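As a rough illustration of what such a gate can look like, here is a minimal custom lint script. It is a sketch under assumptions rather than the behavior of any existing tool: it assumes Markdown headings, a `Status: <value>` line, and the section names and status enum shown in the constants. Adjust all of these to your own template.

```python
import re
from typing import Optional

# Assumed conventions: Markdown headings, a "Status: <value>" line, this enum.
REQUIRED_SECTIONS = ("Context", "Considered Options", "Decision", "Consequences")
VALID_STATUSES = {"Proposed", "Accepted", "Rejected", "Superseded"}
PLACEHOLDER = re.compile(r"\b(TODO|TBD)\b")


def section_body(markdown: str, name: str) -> Optional[str]:
    """Return the text between a heading and the next heading, or None if absent."""
    pattern = rf"^#+[ \t]+{re.escape(name)}[ \t]*$(.*?)(?=^#+[ \t]|\Z)"
    match = re.search(pattern, markdown, re.MULTILINE | re.DOTALL)
    return match.group(1) if match else None


def lint_adr(markdown: str) -> list[str]:
    """Return structural problems; an empty list means the gate passes."""
    problems = []
    for section in REQUIRED_SECTIONS:
        if section_body(markdown, section) is None:
            problems.append(f"missing section: {section}")
    status = re.search(r"^Status:[ \t]*(\S+)", markdown, re.MULTILINE)
    if status is None:
        problems.append("missing Status field")
    elif status.group(1) not in VALID_STATUSES:
        problems.append(f"invalid status: {status.group(1)}")
    # Structural only: counts bullet entries, says nothing about their quality.
    options = section_body(markdown, "Considered Options")
    if options is not None and not re.search(r"^[ \t]*[-*][ \t]+\S", options, re.MULTILINE):
        problems.append("Considered Options has no entries")
    if PLACEHOLDER.search(markdown):
        problems.append("placeholder text (TODO/TBD) remains")
    return problems
```

In CI, a wrapper would read each changed ADR file, call `lint_adr`, and exit non-zero if any list is non-empty.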
What CI cannot check: whether the entries in a structurally valid section contain useful information. "Redis — not chosen" passes a structural check. The check that would catch it — "does each Considered Options entry include a rejection reason?" — requires semantic understanding that current linters don't have. This is the gap the human review checklist fills.
The practical division: CI handles the structural gate (format correctness, section presence, valid enum values); human review handles decision quality (rejection reasons, constraint specificity, honest consequences). Neither replaces the other. Teams that rely only on CI produce structurally valid but thin decision records. Teams that rely only on human review produce inconsistent formatting. Both gates are needed.
One emerging pattern for teams with high ADR volume: a separate pull request template for ADR files (via GitHub's PULL_REQUEST_TEMPLATE directory) that includes the review checklist as a checkbox list in the PR description. Reviewers check the boxes as they work through the review, making the checklist visible in the PR rather than in a separate doc. Teams that use this report better decision-quality feedback because the checklist is in the reviewer's view during the review, not in a page they have to navigate to separately.
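A small script can generate that template from the checklist in this post so the wording stays in one place. This is a sketch: the file name `adr.md` and the function names are assumptions. GitHub reads additional templates from a `PULL_REQUEST_TEMPLATE` directory (for example under `.github/`) and selects one through the `template` query parameter on the PR creation URL.

```python
from pathlib import Path

AUTHOR_CHECKS = [
    "The decision is made: Status is Accepted or Rejected, not Proposed",
    "At least two named alternatives, each with a concrete rejection reason",
    "The constraint that drove the choice is named",
    "Consequences name a real trade-off",
    "The ADR is self-contained",
]
REVIEWER_CHECKS = [
    "I can reconstruct the deliberation from Alternatives Considered alone",
    "The constraint is specific enough to expire correctly",
    "The Consequences section names what was given up",
]


def build_pr_template() -> str:
    """Build the PR description body with one checkbox per check."""
    lines = ["## Author checks (before opening the PR)"]
    lines += [f"- [ ] {check}" for check in AUTHOR_CHECKS]
    lines += ["", "## Reviewer checks"]
    lines += [f"- [ ] {check}" for check in REVIEWER_CHECKS]
    return "\n".join(lines) + "\n"


def write_pr_template(repo_root: Path) -> Path:
    """Write the template to .github/PULL_REQUEST_TEMPLATE/adr.md (assumed name)."""
    target = repo_root / ".github" / "PULL_REQUEST_TEMPLATE" / "adr.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(build_pr_template(), encoding="utf-8")
    return target
```

Running `write_pr_template(Path("."))` from the repository root creates the file; authors then open ADR PRs with `?template=adr.md` appended to the compare URL.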
Review triggers: approved ADRs that need re-review after time passes
Not all ADRs age at the same rate. An ADR that documents a formatting convention or a process norm is likely stable for years. An ADR that documents a technology choice, a vendor selection, or a performance-driven architecture decision may need re-review when the conditions that drove it change.
The Consequences section is the right place to write review triggers: explicit conditions under which the decision should be reconsidered. "Revisit if monthly infrastructure cost exceeds $5k — at that threshold a move to a dedicated service is cost-justified." "Revisit when the team includes a Go engineer with >2 years of production experience — the language choice was constrained by the current team's expertise." "Revisit if we add a compliance requirement that affects data residency — current vendor was selected under EU-only requirements."
Review triggers aren't mandatory for every ADR. They're most valuable for constraint-dependent decisions — decisions where the winning option was driven by a specific context that could change. Decisions driven by timeless principles (prefer reversibility, minimize operational complexity) may not need triggers. Decisions driven by current-team constraints, current-budget limits, or current-technology maturity usually do.
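If your team wants triggers to be checkable rather than just readable, one option is to mirror them as small predicates evaluated against whatever numbers you already track. The sketch below is hypothetical throughout: the `ReviewTrigger` class, the ADR identifiers, and the metric names are invented for illustration, and the thresholds come from the example triggers above.

```python
from dataclasses import dataclass
from typing import Callable

Metrics = dict[str, float]


@dataclass(frozen=True)
class ReviewTrigger:
    """A condition under which an accepted ADR should be reconsidered."""

    adr_id: str
    description: str
    fired: Callable[[Metrics], bool]


# Hypothetical ADR ids and metric names; a missing metric counts as 0.
EXAMPLE_TRIGGERS = [
    ReviewTrigger(
        "ADR-0007",
        "Revisit if monthly infrastructure cost exceeds $5k",
        lambda m: m.get("monthly_infra_cost_usd", 0) > 5000,
    ),
    ReviewTrigger(
        "ADR-0012",
        "Revisit when the team includes a Go engineer with >2 years of production experience",
        lambda m: m.get("max_go_experience_years", 0) > 2,
    ),
]


def fired_triggers(triggers: list[ReviewTrigger], metrics: Metrics) -> list[ReviewTrigger]:
    """Return the triggers whose condition holds for the current metrics."""
    return [t for t in triggers if t.fired(metrics)]
```

A trigger that fires is a reason to open a superseding ADR, not a decision in itself. The written trigger in the Consequences section remains the source of truth.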
The ADR lifecycle treats the Superseded status as the mechanism for handling these re-reviews: when a review trigger fires, a new ADR is written that supersedes the old one, with the old record marked Superseded and linked to the new one. The review process for the new ADR is the same as for any ADR — the checklist applies, and the "new" ADR should explain why the original decision is being revisited and what has changed since.
Using AI chat extraction to improve Alternatives Considered before the PR
The most practical application of the WhyChose extraction approach at the per-decision level is as a pre-PR writing aid for the Alternatives Considered section.
The workflow:
- Make the decision. If the deliberation happened in ChatGPT or Claude, leave the chat session open.
- Before opening the ADR draft, export the chat session (one conversation at a time if the tool supports it — both ChatGPT and Claude allow single-conversation exports or JSON history exports) and run the extractor on it.
- Read the extractor's output. The candidates that map to considered alternatives are typically the question-shape messages ("which approach is better for this use case?", "what would happen if we used X instead?") and the trade-off marker messages ("the advantage of X is... but the downside is..."). These are the deliberation moments.
- Use the extraction candidates to write the Considered Options section. Each candidate that represents a serious alternative becomes an entry. The extractor's output often contains the rejection reasoning verbatim — the trade-off language from the session maps directly onto the rejection reason field.
- Run author check 2 (named alternatives with rejection reasons) against what you wrote. If the extractor output didn't surface a specific alternative, either the alternative wasn't seriously deliberated (and may not belong in the ADR) or the deliberation happened somewhere other than AI chat (in which case the reconstruction effort is the same as writing from memory, but at least you know that now).
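To make the marker categories in the workflow above concrete, here is a toy tagger for question-shaped messages, trade-off markers, and commit phrases. It is not the WhyChose extractor and makes no claim about how that tool works. The regular expressions, the `tag_candidates` name, and the plain list-of-strings input are all assumptions chosen for illustration.

```python
import re

# Illustrative patterns only; real sessions need a much richer vocabulary.
MARKERS = {
    "question": re.compile(r"\?\s*$"),
    "trade_off": re.compile(
        r"\b(but|downside|trade-?off|at the cost of|advantage)\b", re.IGNORECASE
    ),
    "commit": re.compile(
        r"\b(let's go with|we'll use|i'll go with|decided on)\b", re.IGNORECASE
    ),
}


def tag_candidates(messages: list[str]) -> list[tuple[str, list[str]]]:
    """Pair each message that matches at least one marker with its marker names."""
    tagged = []
    for message in messages:
        tags = [name for name, pattern in MARKERS.items() if pattern.search(message)]
        if tags:
            tagged.append((message, tags))
    return tagged
```

Messages tagged `trade_off` are the likeliest source of rejection reasons, and `commit` messages mark where the decision was effectively made. Messages that match nothing are dropped.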
This approach is most useful for decisions where the deliberation was extensive — technology choices with multiple serious contenders, architecture decisions with significant long-term consequences, hiring and team-structure decisions with real trade-offs. For lightweight decisions — process adoptions, minor dependency choices, naming conventions — the overhead of running the extractor before writing the ADR isn't justified. The ADR review checklist applies regardless; the extraction approach is a tool for filling it well, not a gate.
For distributed teams, where the deliberation may be split across multiple engineers' AI chat sessions, pooling extraction output before the PR opens produces an even more complete Considered Options section — the proposing engineer's pre-RFC deliberation and the objecting engineer's counter-argument session together give a more complete picture of what was considered than either alone.
Where to start
If your team already has ADRs but no explicit review process, the fastest improvement is a PR template for ADR files with the checklist as checkboxes. This doesn't require changing how you write ADRs — it just makes the review criteria visible to reviewers and authors at the point of review.
If you're auditing existing ADRs, the Considered Options section is the highest-value place to look. Find the five decisions that have generated the most "why did we do it this way?" questions in the last year — the ones where you've had to explain the reasoning to new engineers or justify it in a post-mortem. Read those ADRs' Alternatives Considered sections. If they fail author check 2 (named alternatives with rejection reasons), they're candidates for retrospective enrichment: finding the original AI chat session or PR thread and adding the rejection reasoning that wasn't captured at the time.
If you're building the review process from scratch, start with the three reviewer checks. They're faster to adopt than the five author checks because reviewers read ADRs that already exist rather than writing new ones, and the three questions ("Can I reconstruct the deliberation? Is the constraint specific enough to expire? Does the Consequences section name what was given up?") can be applied to any ADR regardless of template format.
The payoff is not in the review session itself — the payoff is in the next new engineer who reads the ADR and gets the decision right without asking anyone. A four-minute review that produces a correctly formatted thin ADR is a missed opportunity to make that future conversation unnecessary. A twelve-minute review that surfaces thin Alternatives Considered and makes the author write the rejection reasoning is the investment that prevents the same question from being asked fourteen months later.
See also: why ADRs go stale after 60 days (the review checklist prevents staleness at entry; that post explains what to do when records are already stale), how to document architecture decisions (the end-to-end process from decision to documented record), and the GitHub-hosted ADR workflow (CI linting and CODEOWNERS configuration that automates the structural checks).
Frequently asked questions
What should an ADR author check before opening a PR?
Five checks: (1) Is the decision made — Status is Accepted or Rejected, not Proposed? (2) Are there at least two named alternatives with concrete rejection reasons? (3) Is the constraint named — the actual reason one option won? (4) Are the Consequences honest — do they name a real trade-off? (5) Is the ADR self-contained — can a new engineer understand it without reading the Slack thread?
Why is Alternatives Considered consistently under-populated?
Three reasons: the deliberation happened in AI chat and not in a document (so alternatives are reconstructed from memory), the author already knows the answer and finds the alternatives obvious, and the standard templates don't prompt for rejection reasons. Running the WhyChose extractor on the relevant AI chat session before writing the section is the most effective fix — the extraction output shows the actual alternatives deliberated with the reasoning that was live at the time.
What does good ADR review feedback look like?
Good feedback asks about decisions: "What trade-off are you accepting?" "Can you add the rejection reason for this alternative?" "The constraint needs specific numbers — what were the performance targets?" Unhelpful feedback asks about format: capitalization, date formats, blank lines. Format matters and CI should handle it, but a review with only format comments means the reviewer didn't engage with the content.
Can CI automate ADR review?
CI can automate structural checks: required sections present, valid status enum, no placeholder text. Tools like Log4Brains and adr-tools provide this. What CI can't check is decision quality — whether alternatives have rejection reasons, whether the constraint is specific, whether the consequences are honest. The human review checklist exists because structural validity doesn't imply decision quality.