The interface decision: why the contracts between your components deserve their own records

Every public interface is a promise. A REST endpoint promises a JSON response shape. A gRPC service promises a Protobuf contract and streaming semantics. A message queue event schema promises that consumers will receive payloads in a specific format. The promise constrains everyone who depends on it, and the choice of what to promise — which protocol, which fields, which versioning strategy — is a decision that was made against alternatives. In most codebases, it exists as an artifact without a record of how it was chosen.

The typical history of an undocumented interface goes like this. An engineer builds the first endpoint, defines the request and response shapes, chooses the status codes, picks the error format, decides on pagination strategy. Each of these choices was made against alternatives — probably in an AI chat session, possibly in a Slack thread, maybe in a brief conversation at a whiteboard. The API that results is the record. But the API itself captures what was decided, not why. The developer considered putting the list of items at the top level but wrapped it in an object for forward extensibility. They chose HTTP 422 over HTTP 400 for validation errors because of a semantic distinction the team had discussed. They chose cursor pagination over offset because a teammate had argued for stable result ordering. None of this is visible in the endpoint.

Six months later, a new consumer team discovers the API. They build an integration. They depend on the consistent ordering of results — not because anyone promised it, but because it has always been true in their testing. They depend on the error object always including a `details` array — not because the schema marks it as required, but because it has always been present. Then the producing team changes something. Result ordering becomes nondeterministic in a new implementation path. The error object omits `details` for a new error category. The consumer team's integration breaks. And the producing team, reviewing the change, can't determine whether these were promises or implementation details — because the promises were never written down.

This is the interface documentation problem, and it isn't primarily a technical problem. It's a decision record problem. The producing team made decisions about what to promise. The record of those decisions — the reasoning, the constraints, the alternatives considered — would have made the distinction clear. Without it, every change to an interface is a negotiation about what was ever intended.

What makes an interface a decision, not just an artifact

An implementation detail is something a producer controls and can change without informing consumers, because consumers are not supposed to depend on it. An interface contract is the opposite: it's the surface that producers have — explicitly or implicitly — committed to maintaining. Consumers build against it. When it changes, consumers break.

The trouble with most interface decisions is that they're made at the moment of implementation without being separated from the implementation. A developer writing the first endpoint is making two kinds of choices simultaneously: decisions about the implementation (which data store to query, how to compute derived fields, which cache to invalidate) and decisions about the contract (what fields to expose, what error semantics to commit to, what ordering guarantee to make). The first kind is entirely within the producer's control. The second kind creates obligations. But both kinds get committed in the same pull request, and neither kind gets a decision record.

The distinction matters because it determines who can change the interface and when a change requires coordination. A producer who knows that cursor pagination was chosen deliberately — because offset pagination produces inconsistent results across concurrent modifications — knows that switching to offset pagination is not a refactoring. A producer who doesn't know this may switch to offset pagination as an "internal optimization," discover that it breaks consumers who depended on result stability, and have no documentation to explain the original intent. The record of the cursor pagination decision would have made the choice visible at the point of change, not discoverable through a consumer-reported regression.
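The property that made cursor pagination a contract decision rather than an implementation detail can be shown in a few lines. The sketch below is a minimal in-memory illustration, not a real data store: the row shape, the function names, and the single concurrent delete are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def offset_page(rows: List[dict], offset: int, limit: int) -> List[dict]:
    """Offset pagination: skip `offset` rows, then take `limit` rows."""
    return rows[offset:offset + limit]


def cursor_page(rows: List[dict], after_id: Optional[int], limit: int) -> List[dict]:
    """Cursor pagination: take `limit` rows whose id is greater than the cursor."""
    remaining = [r for r in rows if after_id is None or r["id"] > after_id]
    return remaining[:limit]


def walk_two_pages() -> Tuple[List[int], List[int]]:
    rows = [{"id": i} for i in range(1, 7)]            # ids 1..6, ordered by id
    after_delete = [r for r in rows if r["id"] != 1]   # a concurrent writer deletes id 1

    # Offset: page 1 is read before the delete, page 2 after it.
    seen_offset = offset_page(rows, 0, 2) + offset_page(after_delete, 2, 2)

    # Cursor: same sequence of events, resuming after the last id the client saw.
    first = cursor_page(rows, None, 2)
    seen_cursor = first + cursor_page(after_delete, first[-1]["id"], 2)

    return [r["id"] for r in seen_offset], [r["id"] for r in seen_cursor]


if __name__ == "__main__":
    offset_ids, cursor_ids = walk_two_pages()
    print("offset saw:", offset_ids)
    print("cursor saw:", cursor_ids)
```

Under these assumptions the offset walk never returns id 3, because the delete shifted every later row up by one position, while the cursor walk returns it. A producer who swaps one for the other has changed what consumers observe, even though both functions "return the next page."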

Three types of interface decisions worth documenting

Transport and protocol decisions. The choice of REST, gRPC, GraphQL, or a message queue protocol is one of the highest-leverage interface decisions a team makes. It determines which consumers can integrate without additional tooling, which capabilities are available without custom engineering (streaming, introspection, multiplexing, connection management), and what the per-operation overhead is at the team's traffic volume. Once made, the protocol decision is one of the most expensive to reverse — a migration from REST to gRPC rebuilds every consumer in the process and often requires bridging infrastructure during the transition.

The transport decision is also the most commonly undocumented because it feels like a posture, not a choice. "We use REST" is a statement about the codebase, not a record of the conversation where REST was chosen over gRPC given specific constraints. But that conversation happened — in an AI chat session, in a team meeting, in a Slack thread — and the constraints that made REST the right choice for a consumer-facing API in 2024 may not apply to the internal service being built in 2026, where all consumers are backend services and bidirectional streaming would eliminate a polling loop. Without a record, the REST choice propagates by inertia, not by evaluation.

Schema and shape decisions. The structure of a request or response is a decision with a specific set of alternatives. Fields at the root versus wrapped in an envelope object. Flat versus nested for related resources. Pagination by offset, cursor, or page number. Error format as a plain string versus a structured object with a code and details array. Timestamp format as Unix milliseconds versus ISO 8601 with timezone. Each of these choices was made against the alternatives, and each creates a consumer expectation that persists across the lifetime of the interface.
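Two of these shape choices, the envelope and the structured error, are easier to weigh with a concrete payload in view. The sketch below is illustrative only: the field names `items`, `next_cursor`, `code`, `message`, and `details` are assumptions for the example, not a recommended standard.

```python
import json
from typing import List, Optional


def list_response(items: List[dict], next_cursor: Optional[str] = None) -> str:
    """Envelope shape: the list lives under a key, so new siblings are additive."""
    body = {"items": items}
    if next_cursor is not None:
        body["next_cursor"] = next_cursor  # added later without moving `items`
    return json.dumps(body)


def validation_error(field: str, message: str) -> str:
    """Structured error shape: a machine-readable code plus a details array."""
    return json.dumps({
        "error": {
            "code": "validation_failed",
            "message": "Request validation failed",
            "details": [{"field": field, "message": message}],
        }
    })


def read_items(payload: str) -> List[dict]:
    """A consumer written against the first release of the envelope."""
    return json.loads(payload)["items"]


if __name__ == "__main__":
    print(read_items(list_response([{"id": "a"}])))
    print(read_items(list_response([{"id": "a"}], next_cursor="abc")))
    print(validation_error("name", "must not be empty"))
```

The consumer keeps working when `next_cursor` appears because the envelope left room for it. Had the first release returned a bare JSON array, the same addition would have changed the type of the top-level value, which is the forward-extensibility argument that usually goes unrecorded.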

Schema decisions are particularly prone to the incremental extension problem: they're made once at initialization and then extended field by field as new requirements arrive, without any of the extension decisions being recorded. The schema that exists after eighteen months of use is a palimpsest of many individual decisions, none of which is documented separately. The "not building this" record type has an exact analogue here: the field that was considered and rejected ("we could add a full user object here, but we're including only the ID for now to avoid consumer coupling to user schema changes") is as important as the field that was included, and the reasoning behind both is equally invisible in the schema.

Versioning strategy decisions. Every API that will have breaking changes needs a versioning strategy before the first breaking change arrives — ideally before the API is first consumed, since the strategy affects the initial URL structure (/v1/) and the consumer's integration expectations. The choice between URL versioning, header versioning, content negotiation, or no versioning at all is a decision with long-term consequences. Made correctly, it enables graceful evolution. Made by default — "no strategy" is an implicit choice that breaking changes are breaking — it creates a constraint that is expensive to retrofit later, when the API is consumed by teams who weren't part of the original design.

The consumer-producer asymmetry

An interface decision has an unusual property compared to most architectural decisions: the consumer and the producer have asymmetric information about what the interface is meant to promise.

The producer knows which behaviors are intentional and which are implementation details. The producer knows that the `created_at` field is always present in production data but technically optional in the schema. The producer knows that the API currently returns results sorted by creation date even though the response documentation doesn't specify ordering. The producer knows that the maximum page size of 1000 is an undocumented implementation limit, not a committed contract. The producer holds all of this as implicit knowledge, distributed across the memory of the engineers who wrote the implementation.

Consumers infer intent from observed behavior. If `created_at` is always present and the consumer's code handles its absence nowhere, it becomes a de facto requirement regardless of the schema's optional marking. If results are consistently ordered by creation date and a consumer displays them in a time-ordered list, the consumer has taken a dependency on ordering that the producer didn't commit to. If a consumer's pagination logic assumes the maximum page size is 1000, it's depending on an undocumented limit that the producer may not know anyone is relying on.
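For contrast, here is roughly what a consumer looks like when it depends only on what the schema states. This is a sketch under assumptions: the envelope keys `items` and `next_cursor` are hypothetical, timestamps are assumed to be ISO 8601 strings with an explicit offset, and `fetch_page` stands in for whatever HTTP call the consumer actually makes.

```python
from datetime import datetime, timezone
from typing import Callable, List, Optional


def parse_created_at(item: dict) -> Optional[datetime]:
    """Treat created_at as optional, because the schema marks it optional."""
    raw = item.get("created_at")
    return datetime.fromisoformat(raw) if raw else None


def newest_first(items: List[dict]) -> List[dict]:
    """Sort explicitly instead of trusting the order the API happened to return."""
    floor = datetime.min.replace(tzinfo=timezone.utc)  # undated items sort last
    return sorted(items, key=lambda i: parse_created_at(i) or floor, reverse=True)


def collect_all(fetch_page: Callable[[Optional[str]], dict]) -> List[dict]:
    """Page until the producer signals the end, not until a page has fewer than 1000 items."""
    items: List[dict] = []
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        items.extend(page["items"])
        cursor = page.get("next_cursor")
        if cursor is None:
            return items
```

Most consumers are not written this way, and the point of the passage above is that they have little reason to be: nothing tells them which of the three observed behaviors is a promise.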

Without a record of what the interface was meant to promise, this asymmetry generates two failure modes. The first is the consumer that depends on an unspecified behavior: the producer "fixes" a bug in the ordering logic, and the consumer breaks because their UI was depending on the stability of a behavior the producer considered internal. The second is the inverse: the producer avoids fixing a known implementation problem because "consumers might be depending on it" — not because anyone actually is, but because without a record distinguishing promises from implementation details, the producer cannot determine which behaviors are safe to change. Both failure modes are caused by the same missing document: the record of what the producer decided to promise, and why.
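One way to picture the missing document is as a lookup the producer consults before changing a behavior. The sketch below is only a model of that idea: the entries paraphrase the examples in this section, and which ones count as promised or incidental is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PROMISED = "promised"
    INCIDENTAL = "incidental"


@dataclass(frozen=True)
class Behavior:
    status: Status
    reason: str


# Hypothetical record for one API; a real one would live next to the ADRs.
RECORD = {
    "cursor pagination is stable under concurrent writes":
        Behavior(Status.PROMISED, "chosen over offset pagination for this property"),
    "results are sorted by creation date":
        Behavior(Status.INCIDENTAL, "ordering is unspecified in the documentation"),
    "created_at is always present":
        Behavior(Status.INCIDENTAL, "the field is optional in the schema"),
    "maximum page size is 1000":
        Behavior(Status.INCIDENTAL, "implementation limit, not a contract"),
}


def change_verdict(behavior: str) -> str:
    """Say whether changing a behavior needs consumer coordination."""
    entry = RECORD.get(behavior)
    if entry is None:
        return "undetermined: no record, so intent has to be negotiated"
    if entry.status is Status.PROMISED:
        return "coordinate with consumers: " + entry.reason
    return "free to change: " + entry.reason


if __name__ == "__main__":
    print(change_verdict("results are sorted by creation date"))
    print(change_verdict("error object always includes details"))
```

The third branch is the one both failure modes fall into: with no entry, the producer can neither change the behavior safely nor defend leaving it alone.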

This is a specific instance of the cross-team decision problem: one team's design decision becomes another team's constraint, and without a record, neither team can reconstruct the original intent when the constraint becomes inconvenient. The interface ADR is the document that makes the producer's intentions legible to consumers — not as a runtime schema, but as a human-readable record of what was committed to and what conditions would make the commitment reviewable.

The incremental extension problem

Interface schemas don't stay static. New fields are added as new requirements arrive. "Backward-compatible" additions accumulate. What was a simple response envelope after six months becomes, after eighteen months, a nested structure with optional arrays, fields with subtly different semantics across different response states, and a metadata catch-all object that contains a mixture of structured and unstructured data.

Each individual addition felt like a non-decision: adding one field is backward compatible, and backward compatible means safe. But a sequence of additions that individually felt like non-decisions can collectively represent a set of structural choices that nobody made explicitly. After eighteen months:

The response envelope that wraps the list object was added "for forward extensibility" in the original implementation. Now it's a permanent consumer contract: consumers parse the outer object, then the nested list. But adding metadata fields to the outer envelope rather than the root level was never explicitly decided — it just became the pattern. The next consumer to write integration code for this API will follow the pattern without knowing that it was itself a choice.

An `id` field was added before the team had a stable identifier strategy. It became a string UUID when the team later standardized on UUIDs. A `legacy_id` field was added for consumers who had already integrated with the numeric database ID. Neither the original `id` decision (why string rather than numeric at a time when the underlying store used auto-increment integers?) nor the `legacy_id` decision (maintaining two ID representations indefinitely versus requiring existing consumers to migrate) was recorded. An engineer reading the response schema today sees two ID fields with no explanation of why both exist.

A metadata object was added as an escape hatch for fields that didn't fit the schema cleanly. Now it contains a mixture of fields with consistent semantics across all resources and fields with resource-specific semantics. The original decision — escape hatch versus extending the typed schema, accepting that the metadata object would accumulate heterogeneous content — was made once and never recorded. The technical debt is now invisible because there's no record of the decision that created it.

The interface ADR for an extension decision doesn't need to be a full Nygard-format document. A brief decision note in the API changelog — what was added, what was considered instead, what constraint drove the choice — is enough to convert each incremental extension from an invisible accumulation into a retrievable record. "Added cursor to the pagination envelope instead of adding a second offset-based endpoint. Considered: separate endpoint for cursor-based pagination (rejected: two pagination paradigms in one API creates consumer confusion and doubles the surface to maintain). Consequence: offset-based consumers must migrate to cursor-based pagination when they need stable ordering under concurrent writes."
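If the team wants these notes to be consistent enough to search later, the three parts can be given a fixed shape. The following is a minimal sketch assuming a hypothetical `DecisionNote` structure; the rendered labels mirror the example note above and are not a standard changelog format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RejectedAlternative:
    option: str
    reason: str


@dataclass
class DecisionNote:
    added: str
    considered: List[RejectedAlternative]
    consequence: str

    def render(self) -> str:
        """Render the note as plain text for an API changelog entry."""
        lines = [f"Added: {self.added}"]
        for alt in self.considered:
            lines.append(f"Considered: {alt.option} (rejected: {alt.reason})")
        lines.append(f"Consequence: {self.consequence}")
        return "\n".join(lines)


cursor_note = DecisionNote(
    added="cursor in the pagination envelope",
    considered=[
        RejectedAlternative(
            option="separate endpoint for cursor-based pagination",
            reason="two pagination paradigms in one API confuse consumers "
                   "and double the surface to maintain",
        )
    ],
    consequence="offset-based consumers must migrate to cursor-based pagination "
                "when they need stable ordering under concurrent writes",
)

if __name__ == "__main__":
    print(cursor_note.render())
```

The structure forces the one part that changelogs usually omit: an entry cannot be constructed without saying what was considered instead.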

The versioning decision specifically

The versioning strategy deserves its own record because it's the decision that determines the cost of all future breaking changes. A team that makes no explicit versioning decision is implicitly choosing "breaking changes are breaking" — which is sometimes the right choice, but should be made deliberately.

The versioning decision is also a statement about the team's relationship to its consumers. URL versioning (/v1/, /v2/) announces that breaking changes are anticipated and provides an explicit migration path. Header versioning (API-Version: 2) is lower-friction for consumers but requires more producer-side routing infrastructure. Simultaneous support — running v1 and v2 in parallel during a migration window — has different operational costs than sunset-and-migrate, where consumers receive a deadline to upgrade before v1 is removed. The team's choice of versioning strategy is also a choice about how much migration cost it's willing to absorb versus how much it's willing to transfer to consumers.
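The difference in producer-side routing can be seen in how each strategy recovers the version from a request. This is a framework-free sketch under assumptions: the `/vN/` prefix and the `API-Version` header follow the examples above, and the default of version 1 for header-less requests is a choice made here for illustration.

```python
import re
from typing import Mapping, Optional

DEFAULT_VERSION = 1  # assumption: requests without a version header get v1


def version_from_path(path: str) -> Optional[int]:
    """URL versioning: the version is part of the route, e.g. /v2/resources."""
    match = re.match(r"^/v([0-9]+)(/|$)", path)
    return int(match.group(1)) if match else None


def version_from_header(headers: Mapping[str, str]) -> int:
    """Header versioning: one route, so the producer dispatches on a header value."""
    for name, value in headers.items():
        if name.lower() == "api-version" and re.fullmatch(r"[0-9]+", value.strip()):
            return int(value.strip())
    return DEFAULT_VERSION


if __name__ == "__main__":
    print(version_from_path("/v2/resources"))
    print(version_from_header({"API-Version": "2"}))
    print(version_from_header({}))
```

With URL versioning an unversioned path is simply a different route. With header versioning the producer has to decide what a missing header means, and that default is itself a contract decision worth recording.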

The versioning decision interacts with the API lifecycle decision — what is the process for deprecating an endpoint, when is a feature considered stable enough to commit to, what is the sunset timeline for an old version — and these downstream decisions are easier to make consistently when the versioning strategy is written down and the reasoning is available. The ADR lifecycle for interface versioning follows the same pattern as other decisions: the v1 strategy record is superseded by a v2 migration decision record, which is superseded by a deprecation record when v1 is sunset. The chain of custody shows how the interface evolved and why each evolution was the right response to the constraints at the time.

What makes the versioning decision particularly consequential is that it's often made implicitly, before the first consumer has been added, in a moment when the cost of getting it wrong feels theoretical. The team builds the first endpoint, another team integrates with it, and the answer to the versioning strategy question is "we'll deal with that when we need to break something." By the time they need to break something, there are five consuming teams with deeply integrated clients, and "just version it" is a multi-sprint migration. A record that says "we made no versioning decision, and here is why that was acceptable at the time: single internal consumer, shared deployment cycle" would at least make the implicit decision visible for re-evaluation when the consumer profile changes.

The REST / gRPC / GraphQL decision

The protocol selection decision for a new service is one of the most common technical architecture discussions that happens in AI chat and one of the most consistently undocumented. The conversation spans one to three sessions — "should we use REST or gRPC for this internal service?", "what are the trade-offs of GraphQL for a public API with diverse consumers?", "does REST make sense if we need streaming?" — and it concludes with a selection that shapes every integration that follows.

The reason this decision almost never gets documented is that it feels like a category-level posture rather than a specific decision with named constraints. "We use REST for our public APIs" is a statement that ends the conversation, not a record that explains it. But posture without constraint is architectural drift: the next team building a new service will replicate the posture by default — or deviate from it by local preference — without knowing whether the original choice was deliberate or incidental.

A REST vs. gRPC decision record should name the consumer profile that was decisive. Consumer types are rarely uniform: a browser-native consumer, a mobile app, and two third-party integrations with unknown tooling have very different integration costs for gRPC versus REST. Browser-native consumption of gRPC requires a grpc-web proxy layer that the team must operate. Mobile SDK generation from Protobuf schemas is excellent for gRPC but unfamiliar tooling for teams without prior experience. Third-party consumers whose integration stack is unknown face unpredictable friction with gRPC's binary protocol in environments where REST is a commodity. The consumer profile is often the single most decisive constraint, and it's completely invisible from the interface itself.

The GraphQL decision deserves its own treatment because the trade-offs are different from the REST/gRPC comparison. GraphQL's primary advantage (consumer-defined query composition, which avoids over-fetching and under-fetching for diverse client needs) is most valuable when the consuming clients have substantially different data requirements and evolve their requirements independently. Its primary costs (per-query parsing and validation overhead, the operational complexity of subscription implementation, and the difficulty of caching responses at a CDN) are most significant at high traffic volume and with clients who share similar data requirements. A team building a public API for a product with one web client and one mobile client faces a very different GraphQL trade-off than a team building a public API for a platform with dozens of third-party integrations.

The record should name what was explicitly rejected and why. "We considered GraphQL for the consumer-facing product API because it would allow mobile and web clients to co-evolve their query shapes without requiring API versioning. We rejected it because our mobile team has two engineers and the runtime cost of per-query parsing and validation at our traffic volume would require dedicated infrastructure we can't staff at current headcount. We'll reconsider when mobile team grows past four engineers or when we adopt an edge caching strategy that makes per-query parsing cost negligible." This record converts a protocol decision from a posture into an evaluatable constraint — one that is explicitly revisable when the named condition changes, rather than implicitly permanent because it was never stated.

Platform team interface decisions have additional stakes: when a platform team's service is the infrastructure that product teams build on, the interface decision becomes a constraint for every product team simultaneously. A platform team that adopted gRPC for internal services in 2023 because it had the engineering capacity to maintain the proxy layer creates a constraint for product teams who haven't used gRPC and are now required to adopt it for any service that integrates with platform infrastructure. The record of the platform team's protocol decision — including the staffing constraint that made gRPC workable and the consumer profile assumption that all consumers would be backend services — gives product teams the context they need to understand whether the constraint is still applicable to their integration pattern.

Writing the interface ADR

The Nygard ADR format works directly for interface decisions. The decision-statement title convention applies, with the interface or contract as the subject.

The title should make the comparison visible at the list level without opening the file. A new engineer scanning a decisions directory can answer "how was the API protocol selected?" in one second per title, rather than opening files to reconstruct the reasoning.

Context. The interface being defined, the consumers it will have, and the specific requirements that made the protocol or schema choice non-obvious. This section should name the consumer profile explicitly: which consumer types exist, what their integration tooling is, and what capabilities each consumer type requires. "We're defining the public API for the resource management service. Consumers are: the React web dashboard (browser-native, no proxy layer available), the iOS app (native HTTP client, no Protobuf tooling), and two third-party integrations where the consumer's integration stack is unknown. The service needs to support real-time state updates as resources transition between states."

Alternatives Considered. For a transport decision: the protocols evaluated with their specific constraints relative to the consumer profile. REST (universal consumer support, browser-native, real-time requires SSE or polling), gRPC (not browser-native without grpc-web proxy, excellent bidirectional streaming, requires Protobuf tooling for third-party consumers), GraphQL (consumer-defined queries, subscription complexity, cache-unfriendly for CDN). For a schema decision: the shape alternatives with the reason one was chosen and the specific consumer constraint that was decisive.

Decision. What was chosen and the primary constraint that drove the selection. "REST with Server-Sent Events for real-time updates, not gRPC. Browser-native consumption without a proxy layer and third-party consumer tooling compatibility were the decisive constraints. GraphQL deferred — the schema is small enough at launch that consumer-defined query composition adds operational overhead without a current benefit; the single consumer per resource type means REST's fixed query surface is not producing over-fetching problems."

Consequences. The interface ADR's Consequences section is especially important because it should name explicitly which capabilities the team accepted not having. "We accepted that real-time updates use SSE rather than bidirectional streaming — no server-side acknowledgment of client receipt, no backpressure from client to server. We accepted that third-party consumers must implement HTTP retry logic rather than relying on gRPC's built-in retry semantics. We accepted that schema evolution requires explicit versioning when breaking changes arrive, since REST has no built-in protocol-level negotiation for schema evolution."

Revisitation condition. "Revisit this decision if: (1) real-time update acknowledgment becomes a product requirement for resource state machines; (2) mobile team requirements include high-volume state updates where the overhead of long-lived SSE connections is significant at scale; (3) third-party consumers explicitly request a gRPC endpoint and represent enough integration volume to justify operating the grpc-web proxy layer; (4) GraphQL is adopted for a different service and the team builds operational expertise, because the cost comparison changes once the infrastructure is shared."

Opening the ADR template before finalizing the decision is what surfaces the revisitation condition. An engineer who must write a named revisitation condition may discover that they can't name one, which means either that the decision is being treated as permanent when it should be provisional, or that the constraint driving the decision hasn't been identified precisely enough to state what would change it. A blank Revisitation section is diagnostic.
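The sections above map onto a small record type, and the diagnostic value of a blank Revisitation section can be made mechanical. The sketch below is one possible shape, not part of the Nygard format: the `InterfaceADR` class, its field names, the review rules, and the sample title are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InterfaceADR:
    title: str
    context: str
    alternatives: List[str]
    decision: str
    consequences: List[str]
    revisit_if: List[str] = field(default_factory=list)

    def review_findings(self) -> List[str]:
        """Return the gaps a reviewer should question before the ADR is accepted."""
        findings = []
        if len(self.alternatives) < 2:
            findings.append("fewer than two alternatives: the comparison is not recorded")
        if not any(condition.strip() for condition in self.revisit_if):
            findings.append("no revisitation condition: the decision is being treated "
                            "as permanent, or its driving constraint is not yet named")
        return findings


def draft_adr(revisit_if: List[str]) -> InterfaceADR:
    """Build a sample ADR from the running example; the title is illustrative."""
    return InterfaceADR(
        title="Use REST with SSE, not gRPC, for the resource management API",
        context="Browser, iOS, and third-party consumers; real-time state updates needed",
        alternatives=["REST with SSE", "gRPC", "GraphQL"],
        decision="REST with Server-Sent Events for real-time updates",
        consequences=["No server-side acknowledgment of client receipt"],
        revisit_if=revisit_if,
    )


if __name__ == "__main__":
    print(draft_adr([]).review_findings())
    print(draft_adr(["Update acknowledgment becomes a product requirement"]).review_findings())
```

A check like this cannot judge whether a condition is a good one. It only stops the section from being skipped silently.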

Finding interface decisions in AI chat

Protocol selection decisions produce a characteristic session shape in AI chat. They begin as trade-off comparison questions — "what are the trade-offs between REST and gRPC for an internal service?", "should I use GraphQL if my consumers have very different data needs?", "does REST make sense here if I need streaming?" — and they often span multiple sessions before concluding with a selection. These are among the highest-confidence extraction targets for the WhyChose extractor because they contain an explicit comparison structure, named alternatives, and a stated outcome.

The distinguishing feature of a protocol selection session is that it ends with a protocol choice, not an implementation plan. The engineer knows what they're going to build at the end of the session. The reasoning that produced the choice is fully present in the session — the consumer types, the tooling constraints, the capability requirements — but it exists nowhere else. There's no implementation artifact that captures why REST was chosen over gRPC; there's only the REST API itself, which says nothing about the evaluation.

Schema design decisions are harder to find because they're often embedded in implementation sessions rather than stated as explicit trade-off questions. The envelope vs. root-level decision appears as a design question in the middle of an implementation conversation: "should I wrap this in an object or return the array directly?" The pagination decision appears as a question about how to handle a specific edge case: "if I'm using offset pagination and a record is deleted during pagination, does the next page skip a record?" These questions contain the alternatives (offset vs. cursor) and the constraint (consistent results under concurrent modification), but they're embedded in a session that's nominally about implementation, not design.

The versioning strategy session is often brief — a single exchange before the first breaking change is needed — and it frequently concludes with deferred-decision language: "we'll figure out versioning when we need to break something." The WhyChose extractor identifies deferred decisions through this pattern: a question about versioning strategy that concludes with "we'll cross that bridge when we get there" is itself a decision (implicit no-versioning-now) with a revisitation condition (when the first breaking change is required). The deferred decision is worth recording: "We made no versioning decision at API launch. All consumers are internal and share our deployment cycle. We'll adopt URL versioning (/v2/) when we have either an external consumer or a breaking change required. Expected: Q3 2026 when the public beta begins."
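To make the deferral pattern concrete, here is a deliberately naive keyword sketch. It is not how the WhyChose extractor works, which this article does not describe; the phrase list and the function name are assumptions, and a real detector would need far more than two regular expressions.

```python
import re

# Hypothetical phrase patterns that signal "decide later" language.
DEFERRAL_PATTERNS = [
    re.compile(r"\bcross that bridge\b"),
    re.compile(r"\b(figure out|deal with|worry about|decide on)\b.*\bwhen we (need|have) to\b"),
]

# The message must also be about versioning for the deferral to count.
TOPIC_PATTERN = re.compile(r"\bversion(s|ed|ing)?\b")


def looks_like_deferred_versioning(message: str) -> bool:
    """Flag a chat message that mentions versioning and defers the decision."""
    text = message.lower()
    has_topic = TOPIC_PATTERN.search(text) is not None
    has_deferral = any(pattern.search(text) for pattern in DEFERRAL_PATTERNS)
    return has_topic and has_deferral


if __name__ == "__main__":
    samples = [
        "We'll figure out versioning when we need to break something.",
        "We will use URL versioning with /v1/ from day one.",
    ]
    for sample in samples:
        print(looks_like_deferred_versioning(sample), "|", sample)
```

Even this crude version shows why the deferred decision is recoverable: the topic and the deferral both appear in the text of the session, so a flagged message is a prompt to write down the implicit choice and its revisitation condition.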

The quarterly decision review is the right mechanism for finding undocumented interface decisions from the past 90 days. The pattern to search for: sessions that discuss API protocol, response shape, or versioning strategy and conclude with a selection or deferral. These sessions are in the chat history; they produced the interfaces that exist today; and the constraints that drove the choices are recoverable from the session content even months after the implementation was completed.

The interface as a record of intentions

A well-documented interface is two artifacts, not one. The first artifact is the interface itself — the schema, the protocol, the endpoint structure. The second artifact is the record of what the interface was designed to promise: which behaviors were intentional, which were implementation details, which constraints drove the design, and which conditions would make the design worth revisiting.

The first artifact exists in every codebase. The second artifact is missing from almost all of them. The gap between the two is where the consumer-producer asymmetry lives, where the implicit dependency accumulates, where the incremental extension problem compounds, and where the versioning decision gets deferred until it's expensive to make well.

The new technical leader encountering an undocumented interface faces a specific version of the context problem: they see the API, they don't know what was promised. They must infer intent from the schema, from the consumer integrations, from the incidental behaviors that have calcified into implicit requirements. A new technical leader who can't determine whether the response envelope is a deliberate extensibility choice or an accidental artifact of the original implementation cannot evaluate whether simplifying it is a refactoring or a breaking change. The interface ADR is the document that converts this inference exercise into a retrieval exercise. The intent is stated, the constraints are named, and the condition for revisiting the design is explicit.

Interface decisions are a category where the standard case for architecture decision records is especially strong. The decision is high-leverage — a protocol choice at the interface level affects every integration that follows. The decision is made once and long-lived — interfaces are rarely replaced, only versioned forward. The decision is made against alternatives that a new engineer might propose when they arrive — REST seems like an obvious choice until the team reaches a state where gRPC would have been better, at which point the record of why REST was chosen is the only artifact that explains whether that choice was deliberate or a default. And the decision's reasoning degrades rapidly from memory — the constraints that drove a protocol choice in 2024 are difficult to reconstruct accurately in 2026 when the next engineer asks why this API is REST.

The interface record converts an implicit promise into an explicit one. It doesn't change the promise itself — the API behaves the way it behaves regardless of whether a record exists. What it changes is the producing team's ability to distinguish intentional promises from implementation details, and the consuming team's ability to understand what they're depending on and under what conditions the producer might revisit the design. That distinction is the difference between an interface that evolves gracefully and one that accumulates silent coupling until change is impossible.
