The monorepo vs. polyrepo decision record: why the repository structure you chose in year one shapes your CI costs, code ownership, and dependency management in year four
Repository structure is chosen once — at project inception, when the codebase has one service and one engineer — and rarely revisited. Four years later, the build graph determines whether a one-line change to a shared utility takes 4 minutes or 40 minutes in CI. The CODEOWNERS model determines whether a cross-team PR review takes one day or one week. The dependency sharing model determines whether a breaking change in a shared package can be deployed atomically or requires coordinating five separate service teams. None of this was visible at the beginning. None of it is written down.
A platform engineer joins a four-year-old SaaS. The codebase is a monorepo containing 40 packages — services, shared utilities, component libraries, tooling. Their first task: upgrade the logging library, a shared utility used by 35 of the 40 packages. They run the upgrade, the type errors resolve, the unit tests pass. CI reports green across all affected packages. They merge.
Two days later, the payments service fails in a way that isn't caught by CI or end-to-end tests. The failure is subtle: structured log records from the payments service are missing the traceId field that the observability platform uses to link distributed spans. Transactions cannot be traced, but they still process correctly, so the failure surfaces in the SRE dashboard rather than in customer support.
The root cause: the payments service imports createLogger from @internal/shared-utils, which re-exports it from @internal/logging. The logging upgrade renamed the trace context parameter from traceId to trace.id in a new log schema. The payments service's tests mock the logger, so the schema change would not have surfaced there even if they had run. @internal/shared-utils was not declared as a dependency in the payments service's package.json: it was a phantom dependency, resolved only because it happened to be installed in the workspace. The monorepo's build tool never marked the payments service as affected by the logging upgrade, because it had no record that payments → shared-utils → logging was a dependency path, so the payments type check and tests were never part of the run. CI was green. The deployed payments code was still written against the old logger interface.
Like most foundational infrastructure decisions, the repository structure is visible as a fact — the codebase is a monorepo, it uses Turborepo, the packages are in packages/ — but invisible as a decision. The fact tells the new engineer where to find the code. The decision record answers the questions that determine whether the codebase works correctly: what does the affected computation rely on, what policy governs phantom dependency prevention, and what is the accuracy requirement for package dependency declarations in package.json? Without the record, the phantom dependency incident is a surprise. With it, the incident is a known consequence of a known gap in the documented policy.
What "we use a monorepo" means across four structural patterns
The repository structure decision is not binary. "Monorepo vs. polyrepo" describes two ends of a spectrum with four distinct structural patterns in common use, each carrying a different set of CI, ownership, and dependency consequences.
The flat monorepo — a single git repository containing all code, with no formal package boundaries. All services share a single build step, a single dependency lock file, and a single CI pipeline. This is the most common starting point: small team, one or two services, shared libraries imported by relative path. The flat monorepo has no build caching problem (there is only one build), no phantom dependency problem (everything is in scope), and no affected computation problem (CI runs everything on every change). The problem that accumulates is that as the codebase grows, CI time grows linearly with code size regardless of change scope, every change to any file triggers the full test suite, and ownership is undefined because there are no formal boundaries. Teams that started here and never re-evaluated the structure have a CI pipeline that runs 20 minutes to test a one-line change to a utility function. The revisitation condition — "when does the flat monorepo become a problem?" — is the question most teams cannot answer because it was never documented as a condition.
The modular monorepo: a single git repository with explicit package boundaries, a workspace manager (pnpm workspaces, npm workspaces, Yarn Berry workspaces), and a task runner with build caching (Turborepo, Nx, Bazel). This is the most common monorepo pattern among teams that have evaluated the structure deliberately. The key capability is affected-package computation: the task runner computes a dependency graph from the workspace configuration and package dependency declarations, and on each CI run determines which packages are affected by the change, either directly or through a chain of dependencies, and runs only the tasks for those packages. A one-line change to a shared utility runs CI for the utility and its dependents, not for the packages that don't depend on it. The accuracy of the affected computation is the critical invariant that the modular monorepo depends on for correctness, and it is the invariant most teams never document as a policy requirement.
The polyrepo — one git repository per service or deployment unit. Services are independently versioned, independently deployed, and independently owned. Shared code is published to an internal package registry (Verdaccio, GitHub Packages, AWS CodeArtifact) and consumed via versioned dependencies. A breaking change in a shared library is published as a new semver major version; consumers upgrade on their own schedule. The key capability is independent release cadence: a service team can ship without waiting for any other team, and a dependency upgrade can be rolled out to services in stages rather than atomically. The key cost is cross-service coordination: a change that spans multiple services requires multiple PRs, multiple CI pipelines, and multiple merge approvals, and the integration between services cannot be verified until all PRs are deployed. Refactoring a shared interface requires publishing the new interface, upgrading each service, and maintaining backward compatibility for the duration of the migration — a multi-month process for a multi-service codebase.
The hybrid structure: a monorepo for closely related services (a domain cluster) with repository boundaries between domains. For example: frontend, backend API, and shared components in a single monorepo; a separate monorepo for infrastructure tooling; a third repository for data pipeline services. The hybrid structure inherits the affected computation complexity of the modular monorepo within each repository, and the cross-repository coordination cost of the polyrepo between repositories. It is the correct answer for organizations with clear domain boundaries and unclear intra-domain boundaries, but it is also the most common emergent structure, arrived at not by deliberate decision but through a sequence of "should we add this service to the existing repo?" decisions made locally without reference to an organizational repository policy.
The build graph and CI affected computation
In a modular monorepo with a task runner, the affected computation is not an operational feature — it is a correctness invariant. If the computation is inaccurate, CI produces results that do not reflect the actual dependency relationships in the codebase, and the deployed artifacts may be built against outdated versions of their transitive dependencies.
The affected computation works by constructing a dependency graph from the workspace's package dependency declarations and then walking that graph from the changed packages to every package that depends on them, directly or transitively. If packages/payments declares "@internal/logging": "workspace:*" in its package.json, and packages/logging changes, the task runner marks packages/payments as affected and includes its tasks in the CI run. If the declaration is absent, the task runner has no record of the relationship and excludes packages/payments from the affected set, even if the import still resolves because the package happens to be installed elsewhere in the workspace.
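The traversal described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of any particular task runner: the DeclaredDeps shape and the computeAffected function are hypothetical, and @internal/billing is an invented package added only to show a correctly declared consumer.

```typescript
// Hypothetical shape: package name -> internal packages it declares in package.json.
type DeclaredDeps = Record<string, string[]>;

// Returns the changed packages plus every direct and transitive dependent,
// using only declared edges (the only edges a task runner can see).
function computeAffected(graph: DeclaredDeps, changed: string[]): Set<string> {
  // Invert the graph: dependency -> packages that declare it.
  const dependents = new Map<string, string[]>();
  for (const pkg of Object.keys(graph)) {
    for (const dep of graph[pkg]) {
      const list = dependents.get(dep) ?? [];
      list.push(pkg);
      dependents.set(dep, list);
    }
  }

  // Breadth-first walk from the changed packages along the inverted edges.
  const affected = new Set<string>(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependents.get(current) ?? []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return affected;
}

// The incident's graph: payments imports shared-utils but never declares it.
const incidentGraph: DeclaredDeps = {
  "@internal/logging": [],
  "@internal/shared-utils": ["@internal/logging"],
  "@internal/billing": ["@internal/shared-utils"], // hypothetical, declared correctly
  "@internal/payments": [], // the edge to shared-utils is missing from package.json
};

const affected = computeAffected(incidentGraph, ["@internal/logging"]);
console.log(Array.from(affected).sort());
```

With this input the affected set contains logging, shared-utils, and billing. The payments package is absent, which is exactly the gap that let the incident through.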
The phantom dependency problem is the most common source of affected computation inaccuracy. A phantom dependency is a package that a module imports directly but does not declare in its own package.json. The import resolves anyway because some other package (a sibling, the workspace root, or a third-party dependency) declares it and the package manager places it in a node_modules directory that the importing module can reach. At runtime, the import resolves. In CI, under task runner affected computation, the dependency relationship is invisible, and the affected set is wrong.
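The definition above translates directly into a check: collect the package names a module imports and subtract the names its package.json declares. The sketch below assumes the import specifiers have already been extracted from source files by some other tool. PackageManifest, packageNameOf, and findPhantomDependencies are hypothetical names, and Node built-ins are recognized only when written with the node: prefix.

```typescript
// Hypothetical minimal view of a package.json.
interface PackageManifest {
  name: string;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
  peerDependencies?: Record<string, string>;
}

// Reduce an import specifier to its package name:
// "@scope/pkg/sub/path" -> "@scope/pkg", "lodash/fp" -> "lodash".
// Relative paths, absolute paths, and "node:" built-ins return null.
function packageNameOf(specifier: string): string | null {
  if (specifier.startsWith(".") || specifier.startsWith("/")) return null;
  if (specifier.startsWith("node:")) return null;
  const parts = specifier.split("/");
  if (specifier.startsWith("@")) {
    return parts.length >= 2 ? `${parts[0]}/${parts[1]}` : null;
  }
  return parts[0];
}

// Packages that are imported but not declared anywhere in the manifest.
function findPhantomDependencies(manifest: PackageManifest, importSpecifiers: string[]): string[] {
  const declared = new Set<string>([
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.devDependencies ?? {}),
    ...Object.keys(manifest.peerDependencies ?? {}),
  ]);
  const phantoms = new Set<string>();
  for (const specifier of importSpecifiers) {
    const name = packageNameOf(specifier);
    if (name !== null && name !== manifest.name && !declared.has(name)) {
      phantoms.add(name);
    }
  }
  return Array.from(phantoms).sort();
}

const paymentsManifest: PackageManifest = {
  name: "@internal/payments",
  dependencies: { express: "^4.18.0" },
};

const phantoms = findPhantomDependencies(paymentsManifest, [
  "./handlers/charge",
  "node:crypto",
  "express",
  "@internal/shared-utils",
  "@internal/shared-utils/logger",
]);
console.log(phantoms);
```

Here the only undeclared package is @internal/shared-utils, reported once even though it is imported through two different subpaths.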
The consequence of phantom dependencies accumulates silently. Each phantom dependency represents a package whose CI results are decoupled from changes to its actual transitive dependencies. Over four years of development, a codebase with 40 packages and no phantom dependency prevention policy can accumulate dozens of phantom dependencies — each one a potential "CI green, production broken" scenario waiting for the upstream package to make a breaking change.
For platform teams that own the monorepo tooling, the phantom dependency policy is the most consequential undocumented constraint imposed on product teams. A platform team that deploys strict workspace dependency enforcement (for example pnpm's isolated node_modules layout, tightened with hoist=false and an empty public-hoist-pattern so that undeclared packages are not placed where imports can reach them) converts phantom dependencies from a silent risk to an immediate module resolution failure the next time the code is built or tested. This policy must be communicated to product teams because it changes the development contract. A product team that discovers its build fails after pnpm install because of a phantom dependency it has relied on for two years encounters a breaking change in the developer environment that it was not warned was coming.
The remote build cache adds a second accuracy requirement: cache invalidation. Turborepo and Nx both support remote caches — build outputs are stored in a shared backend (Turborepo Remote Cache, Nx Cloud, or a self-hosted S3 bucket with a cache key mechanism) and reused across CI runs when the inputs hash matches. The inputs hash is computed from the files in the package and, critically, the declared dependencies' input hashes. A phantom dependency whose output changes does not invalidate the consuming package's cache key, because the phantom dependency's hash is not part of the key computation. The remote cache hit for the consuming package serves an artifact built against the outdated phantom dependency's output.
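The effect of an undeclared dependency on the cache key can be shown with a toy model. This sketch is not Turborepo's or Nx's hashing scheme: it uses a small FNV-1a hash only to stay dependency-free, and the Workspace shape, cacheKey function, and @internal/billing package are hypothetical.

```typescript
// FNV-1a, a small non-cryptographic hash used here only to keep the sketch
// self-contained. Real task runners use stronger hashes over more inputs.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

interface WorkspacePackage {
  files: Record<string, string>; // path -> file contents
  declaredDeps: string[]; // internal packages listed in package.json
}
type Workspace = Record<string, WorkspacePackage>;

// Cache key = hash(own files + cache keys of declared dependencies).
// Assumes the declared graph has no cycles.
function cacheKey(workspace: Workspace, name: string): string {
  const pkg = workspace[name];
  const fileParts = Object.keys(pkg.files)
    .sort()
    .map((path) => `${path}:${fnv1a(pkg.files[path])}`);
  const depParts = [...pkg.declaredDeps]
    .sort()
    .map((dep) => `${dep}:${cacheKey(workspace, dep)}`);
  return fnv1a([...fileParts, ...depParts].join("|"));
}

// Same workspace before and after the logging change; only the logger source differs.
function makeWorkspace(loggerSource: string): Workspace {
  return {
    "@internal/logging": { files: { "index.ts": loggerSource }, declaredDeps: [] },
    "@internal/billing": {
      files: { "index.ts": "import '@internal/logging'" },
      declaredDeps: ["@internal/logging"],
    },
    "@internal/payments": {
      files: { "index.ts": "import '@internal/logging'" },
      declaredDeps: [], // imports logging, but the declaration is missing
    },
  };
}

const before = makeWorkspace("log({ traceId })");
const after = makeWorkspace("log({ 'trace.id': traceId })");

const billingInvalidated = cacheKey(before, "@internal/billing") !== cacheKey(after, "@internal/billing");
const paymentsInvalidated = cacheKey(before, "@internal/payments") !== cacheKey(after, "@internal/payments");
console.log({ billingInvalidated, paymentsInvalidated });
```

The billing key changes because the logging key feeds into it. The payments key is computed from identical inputs both times, so a remote cache would serve the old payments artifact.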
The build cache is analogous to a request cache in a web application. Like the application-layer caching decisions that determine which writes propagate immediately versus which silently diverge until TTL expiry, the build cache invalidation decision determines which dependency changes propagate immediately to all downstream packages versus which silently produce stale build artifacts. The difference is that build cache staleness produces incorrect deployed artifacts rather than stale data — a correctness failure rather than a freshness failure. The policy for preventing phantom dependencies, and the policy for validating that the dependency graph declarations are accurate, must be documented as correctness requirements of the CI system, not implementation details.
Code ownership: CODEOWNERS as a policy document
In a polyrepo, code ownership is implicit and structural: the team that owns the repository owns the code in it. Membership in the repository's maintainer group or the GitHub team with write access is the ownership record. When a new engineer joins the team, they are added to the repository's maintainer group. When an engineer leaves, they are removed. The ownership record is a fact of the repository access control system.
In a monorepo, code ownership must be explicit and documented. The GitHub CODEOWNERS file (or the GitLab equivalent, or an ownership mechanism provided by the monorepo tooling) defines which paths or packages require review from which teams or individuals. A PR that modifies packages/payments/ requires approval from the team or individuals listed as CODEOWNERS of that path. The CODEOWNERS file is a policy document: it encodes the organizational ownership structure as a repository configuration artifact that must be maintained as the organization changes.
The CODEOWNERS staleness problem is the monorepo equivalent of the permission cache staleness problem. Like a permission cache with a 30-minute TTL that continues to authorize access to cancelled subscriptions, a stale CODEOWNERS entry continues to gate PR reviews on an engineer who left the team eight months ago. The consequence is a PR blocked on a reviewer who is no longer on the team, or a PR that cannot be merged because the required reviewer no longer has access to the repository. The engineering overhead is the same: someone must investigate, discover that the reviewer is no longer the owner, update CODEOWNERS, and restart the review process. Without a CODEOWNERS maintenance policy — who audits it, on what cadence, what triggers an update — the staleness accumulates and the review friction accumulates with it.
The more subtle CODEOWNERS problem is the ownership gap: code that is not owned by any team. In a growing monorepo, a new package added by a cross-functional feature team may not be added to CODEOWNERS, or may be added with a placeholder that was never updated to reflect the team that actually owns it in production. Code without CODEOWNERS entries has no required reviewers, which means it can be modified by any PR without an ownership gate. This is sometimes intentional (low-risk utilities shared across teams where any reviewer is acceptable) and sometimes accidental (a critical package added under time pressure that was never assigned to an owning team). Without a documented ownership policy that distinguishes intentional ungated packages from accidentally ungated packages, a security or compliance audit cannot distinguish ownership gaps from ownership choices.
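Distinguishing accidental gaps from intentional ones is mechanical once the ungated list is written down. The sketch below is a simplification: real CODEOWNERS files use gitignore-style patterns where the last matching rule wins, while this version only handles directory prefixes and treats any matching rule with at least one owner as coverage. The team handles, package paths, and function names are all hypothetical.

```typescript
interface OwnerRule {
  prefix: string; // directory prefix without the leading slash
  owners: string[];
}

// Simplified parsing: "<path-prefix> <owner> [<owner>...]" per line.
function parseCodeowners(text: string): OwnerRule[] {
  const rules: OwnerRule[] = [];
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const [pattern, ...owners] = line.split(/\s+/);
    rules.push({ prefix: pattern.replace(/^\//, ""), owners });
  }
  return rules;
}

// Packages that no rule covers and that are not on the documented ungated list.
function findOwnershipGaps(
  packageDirs: string[],
  rules: OwnerRule[],
  ungatedByDesign: string[],
): string[] {
  return packageDirs.filter((dir) => {
    const covered = rules.some((rule) => rule.owners.length > 0 && dir.startsWith(rule.prefix));
    const intentional = ungatedByDesign.indexOf(dir) !== -1;
    return !covered && !intentional;
  });
}

const codeowners = `
# Payments domain
/packages/payments/ @acme/payments-team
/packages/logging/ @acme/platform-team
`;

const gaps = findOwnershipGaps(
  [
    "packages/payments/",
    "packages/logging/",
    "packages/string-utils/",
    "packages/invoice-export/",
  ],
  parseCodeowners(codeowners),
  ["packages/string-utils/"], // ungated on purpose, with a documented rationale
);
console.log(gaps);
```

Only packages/invoice-export/ is reported: it has no owner and no entry on the ungated list, so it is a gap rather than a choice.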
Cross-team contribution is where the CODEOWNERS model most directly shapes the developer experience. In a modular monorepo, an engineer who needs to modify a shared utility owned by another team submits a PR that automatically requests review from the owners of that package. The review process and the CI pipeline are identical to a within-team PR. The friction is the review turnaround time from the owning team. In a polyrepo, the same contribution requires finding the separate repository, cloning it, understanding its CI pipeline, submitting a PR in a different repository with a different context, and tracking two separate PRs (one in the service repo, one in the utility repo) to completion. The monorepo reduces cross-team contribution friction at the cost of requiring the CODEOWNERS model to be accurate: an inaccurate CODEOWNERS entry routes the PR to the wrong reviewer, and the delay while it is re-routed can cost more time than the extra repository hop would have cost in a polyrepo.
The inner-source model — where engineers contribute across team boundaries freely, with lightweight review by the owning team — is easier to support in a monorepo where all code is visible in one place and all contribution uses the same PR process. Whether the organization intends to support inner-source contribution at scale, and what the CODEOWNERS policy should be to enable it rather than impede it, is a repository structure decision that must be made explicitly. Like most architectural decisions, the default (no written CODEOWNERS policy) is itself a policy — one that produces inconsistent review gates across packages and ownership gaps that accumulate until a compliance audit or a production incident reveals them.
Dependency management across repository boundaries
The dependency sharing model is the repository structure decision with the most direct consequence for how teams deploy and for how long breaking changes accumulate before they are fully adopted.
In a modular monorepo using workspace:* package references, all packages always use the current version of every shared internal package. There is no publication step for internal packages, no versioning, no consumer that is behind on upgrades. When a shared package changes its API, all consumers in the monorepo see the change in the same commit. If the change is a breaking change, the PR must update every consumer before it can merge. This is the atomic refactoring argument for monorepos: a breaking change to a shared interface can be deployed atomically across all consumers in one PR, with a single CI run validating the entire refactoring. The cost is that a breaking change to a widely-used shared package requires touching 30 consumers in one PR, which is a large PR that is hard to review and hard to revert if the change turns out to be wrong.
In a polyrepo, shared code is published to an internal registry as versioned packages. A breaking change is published as a semver major version — @internal/logging@2.0.0. Each consuming service upgrades independently: some services may upgrade in the next sprint, others may defer for months. The duration of the version skew window is the period during which both the old and the new API are in production simultaneously. Like a dependency upgrade decision that must account for the "why now?" forcing function, the upgrade schedule in a polyrepo is driven by each team's own priorities, and in the absence of a mandatory upgrade policy, the "why not defer?" question has no answer. Teams that deferred one dependency upgrade for a reasonable reason may defer the next, and the next, until they are multiple major versions behind and the migration cost is a multi-sprint project.
The version skew window creates a compatibility maintenance obligation. The publishing team must maintain backward compatibility for as long as any consumer is on the old version. Without a documented maximum version skew policy — "all consumers must upgrade to a new major version within 6 weeks of release" — the publishing team cannot safely deprecate old API surface. The compatibility obligation accumulates: every old major version that still has consumers is an API surface that the publishing team must not break. In a polyrepo without a version skew policy, a shared library may simultaneously maintain compatibility for three or four major versions, each with consumers at different upgrade stages.
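A maximum version skew policy like the six-week example above is straightforward to check once release dates and consumer versions are recorded somewhere. The sketch below assumes that data is already available as plain arrays. The Release and Consumer shapes, the service names, and the dates are invented for illustration.

```typescript
interface Release {
  major: number;
  releasedAt: Date;
}
interface Consumer {
  service: string;
  major: number; // major version of the shared package the service runs
}

// A consumer violates the policy when the earliest release that superseded
// its major version is older than the grace period.
function findSkewViolations(
  releases: Release[],
  consumers: Consumer[],
  now: Date,
  graceDays: number,
): string[] {
  const graceMs = graceDays * 24 * 60 * 60 * 1000;
  return consumers
    .filter((consumer) => {
      const superseding = releases
        .filter((release) => release.major > consumer.major)
        .sort((a, b) => a.releasedAt.getTime() - b.releasedAt.getTime())[0];
      return (
        superseding !== undefined &&
        now.getTime() - superseding.releasedAt.getTime() > graceMs
      );
    })
    .map((consumer) => consumer.service);
}

const releases: Release[] = [
  { major: 1, releasedAt: new Date("2023-01-10T00:00:00Z") },
  { major: 2, releasedAt: new Date("2024-01-15T00:00:00Z") },
  { major: 3, releasedAt: new Date("2024-05-01T00:00:00Z") },
];

const consumers: Consumer[] = [
  { service: "payments-service", major: 1 }, // superseded by v2 months ago
  { service: "billing-service", major: 2 }, // superseded by v3, still inside the window
  { service: "search-service", major: 3 }, // current
];

const violations = findSkewViolations(
  releases,
  consumers,
  new Date("2024-05-20T00:00:00Z"),
  42, // six weeks
);
console.log(violations);
```

Measuring each consumer against the first release that superseded it, rather than against the latest release, keeps a team that skipped one major from having its clock reset by the next one.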
The internal package versioning policy is the third dimension of the dependency sharing model. Modular monorepos with workspace packages generally do not version internal packages for production — workspace:* pins everything to the current version. But some monorepo architectures do publish internal packages to an internal registry as versioned artifacts, either for external consumption (some packages are also published publicly) or for supply-chain auditability (the deployed artifact is a published version, not an unversioned workspace reference). The versioning policy — whether internal packages are versioned, how semver is applied to internal packages, whether CHANGELOG maintenance is required for internal packages — is a decision that determines the audit trail for production artifacts and the migration mechanism for breaking changes.
Like the interface decision record for component contracts, the dependency sharing model determines what change is "backward compatible" (consumers need not update) and what change is "breaking" (consumers must update). In a workspace monorepo, there is no semver enforcement for internal packages — a consumer does not pin to a version, so a "breaking" change is any change that requires consumers to update their code. The definition of "breaking" must be documented as a policy because the threshold determines how often consumers must actively change in response to a shared package update, and how much communication is required before a shared package change is merged.
The migration cost: what the structure decision commits to
The most consequential aspect of the repository structure decision is the migration cost of changing it later. Both directions of migration (monorepo to polyrepo, and polyrepo to monorepo) are feasible, but each requires engineering effort that is proportional to the number of packages, the number of cross-package dependencies, and the maturity of the CI pipelines involved.
A monorepo-to-polyrepo migration requires splitting the git history. The standard tools (git filter-repo, git subtree split) can produce a per-package repository with the full history of that package's files, but not with cross-package history context. A commit that modified both packages/payments and packages/shared-utils in a single atomic change becomes two separate commits in two separate repositories, with no reference between them. The cross-package context — why this change was made atomically, what invariant was maintained across the two packages — is lost in the split unless the commit message was written to be self-contained. CI pipelines must be rebuilt per-service. CODEOWNERS-based ownership must be reconstructed as repository-level team membership. Internal packages must begin publishing to a registry. The migration is a multi-week project for a medium-sized codebase with established CI infrastructure.
A polyrepo-to-monorepo migration requires merging multiple git histories into one. The standard approaches are to import each repository as a subtree (preserving history but producing a large merge commit), or to simply copy the files (losing history but producing a clean slate). The merged history contains commit messages that reference issue numbers from the original repository, PR URLs that no longer exist in the new context, and CI configuration that must be merged into the monorepo's workspace-level CI. Establishing CODEOWNERS requires agreeing on path-based ownership across all teams simultaneously. Configuring the task runner requires mapping the per-service CI pipelines to workspace-level task definitions. The monorepo tooling evaluation — Turborepo vs. Nx vs. Bazel vs. custom scripts — must happen before the migration can begin, because the task runner determines the build model that all teams must adopt. Like a build-vs-buy decision for a platform tool, the task runner choice carries integration cost and lock-in consequences that must be evaluated before the migration, not discovered after.
The migration cost is why the repository structure decision should document revisitation conditions. A flat monorepo that was correct at 5 engineers and 3 services may become a CI-speed problem at 20 engineers and 15 services. A modular monorepo that was correct at 20 engineers and 15 services may become a coordination problem at 100 engineers across 8 teams where cross-team refactoring requires touching 40 packages in one PR and the review process is a bottleneck. A polyrepo that was correct at 8 independent service teams may become an integration problem when a new platform initiative requires coordinating changes across 12 services simultaneously. The revisitation condition names the checkable threshold at which the current structure's costs exceed its benefits — and converting that condition from a vague intuition ("when it stops working") to a measurable criterion ("when cross-team PRs average more than 3 days to merge, or when P99 CI time exceeds 20 minutes for any single-package change, or when the number of simultaneous internal package versions in production exceeds 3") makes it possible to recognize the trigger without a post-mortem.
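Once the criteria are numeric, checking them is a comparison that can run on a schedule. The sketch below encodes the three example thresholds from the paragraph above. The StructureMetrics field names are hypothetical, and how each metric is collected is outside the scope of the sketch.

```typescript
// Field names are hypothetical; the limits are the example values from the text.
interface StructureMetrics {
  avgCrossTeamPrMergeDays: number;
  p99SinglePackageCiMinutes: number;
  internalMajorVersionsInProduction: number;
}

const limits: StructureMetrics = {
  avgCrossTeamPrMergeDays: 3,
  p99SinglePackageCiMinutes: 20,
  internalMajorVersionsInProduction: 3,
};

// Returns one human-readable line per trigger whose limit is exceeded.
function firedTriggers(metrics: StructureMetrics, max: StructureMetrics): string[] {
  const fired: string[] = [];
  if (metrics.avgCrossTeamPrMergeDays > max.avgCrossTeamPrMergeDays) {
    fired.push(
      `cross-team PRs average ${metrics.avgCrossTeamPrMergeDays} days to merge (limit ${max.avgCrossTeamPrMergeDays})`,
    );
  }
  if (metrics.p99SinglePackageCiMinutes > max.p99SinglePackageCiMinutes) {
    fired.push(
      `P99 CI time for a single-package change is ${metrics.p99SinglePackageCiMinutes} minutes (limit ${max.p99SinglePackageCiMinutes})`,
    );
  }
  if (metrics.internalMajorVersionsInProduction > max.internalMajorVersionsInProduction) {
    fired.push(
      `${metrics.internalMajorVersionsInProduction} internal package versions are in production at once (limit ${max.internalMajorVersionsInProduction})`,
    );
  }
  return fired;
}

const fired = firedTriggers(
  {
    avgCrossTeamPrMergeDays: 2.1,
    p99SinglePackageCiMinutes: 24,
    internalMajorVersionsInProduction: 2,
  },
  limits,
);
console.log(fired);
```

With these sample numbers only the CI time trigger fires, which is the signal to reopen the decision record rather than wait for an incident.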
Writing the repository structure decision record
The Nygard ADR format adapts for repository structure decisions with five sections that most structure choices leave entirely undocumented.
The structure decision with alternatives evaluated. Name the structural model chosen, the build tooling adopted, the workspace manager, and the alternatives evaluated with specific rejection reasons. "We evaluated three repository structures in March 2023: (1) flat monorepo (single git repository, no package boundaries, single CI pipeline): evaluated for the initial setup and rejected because even at current scale (6 services), the CI pipeline rebuilds all 6 services on every change with no caching; at projected 18-month scale (15 services), CI time would exceed 30 minutes on any commit; (2) polyrepo (one repository per service): evaluated as the alternative to a monorepo and rejected because 5 of the 6 services share at least two internal utility packages that would need versioning and publication workflows; coordinating breaking changes across 6 separate repositories would require 6 PRs per cross-cutting change and a version skew window of unknown duration; (3) modular monorepo with Turborepo: selected; pnpm workspaces as the workspace manager (public hoisting disabled by setting public-hoist-pattern[] to empty, to prevent phantom dependencies from resolving silently), Turborepo as the task runner with remote caching via Vercel's Remote Cache API. Packages in packages/ (shared libraries) and services/ (deployable services). A hybrid structure (one modular monorepo per domain) was considered for the future if team count exceeds 8 and cross-team PR review bottlenecks exceed a documented threshold (see Revisitation Conditions)."
The CI build model and caching policy. Name how affected computation works, what it depends on for accuracy, and what prevents phantom dependencies. "Affected computation: Turborepo computes the dependency graph from the dependencies, devDependencies, and peerDependencies fields in each package's package.json. All internal workspace packages must declare their internal dependencies with workspace:* references. A package that imports from another package without declaring it in package.json is a phantom dependency: the affected computation will not mark the importing package as affected when the imported package changes, producing a stale build cache hit. Phantom dependency prevention: pnpm is configured with public-hoist-pattern[] empty (strict mode), preventing phantom dependencies from resolving silently. If a package attempts to import a non-declared dependency, the import fails with a module resolution error the first time the code is built, tested, or run, making the phantom dependency visible during development rather than silently at production deployment. Additionally, dependency-cruiser runs as a pre-merge CI check and fails if any import statement resolves to a package not listed in package.json. Remote cache: Turborepo Remote Cache stores build outputs keyed by the hash of the package's input files and the hashes of its declared dependencies. Cache hits serve the stored output without running the build task. Cache invalidation is driven by the dependency graph declared in package.json, so accurate declarations are required for cache correctness. Cache backend: Vercel Remote Cache with a 14-day TTL. Local developer cache is enabled by default and is stored inside the repository checkout (the exact directory depends on the Turborepo version)."
The code ownership model. Name the CODEOWNERS mechanism, the maintenance policy, and the cross-team contribution process. "Ownership is defined in .github/CODEOWNERS. Each entry maps a path pattern to the GitHub team that owns it. Teams are responsible for keeping their CODEOWNERS entries current; the mechanism is a PR requirement — any PR that modifies a path covered by CODEOWNERS requires approval from the listed team. CODEOWNERS maintenance policy: (1) when an engineer joins a team, the team lead updates the relevant CODEOWNERS entries if the engineer is joining as a primary reviewer; (2) when an engineer leaves a team, the team lead removes them from CODEOWNERS entries within 5 business days; (3) quarterly, engineering-leads run a CODEOWNERS audit against the current GitHub team membership and open PRs to fix divergences; the audit is a 15-minute process using the gh api /repos/:owner/:repo/teams endpoint cross-referenced against CODEOWNERS entries. Cross-team contributions: a PR that modifies a path under another team's CODEOWNERS requires approval from that team. Expected turnaround: 1 business day. If approval is not received within 2 business days, the contributing engineer pings the owning team's channel. Escalation path: if a review is blocked for more than 3 business days, the contributing engineer and the owning team's tech lead resolve via a 15-minute sync. Packages intentionally without CODEOWNERS (ungated by design): documented in docs/ungated-packages.md. Adding a new ungated package requires a PR that explicitly adds it to the ungated list with a rationale — the absence of a CODEOWNERS entry must be a documented choice, not a gap."
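The quarterly audit in the sample policy reduces to a set difference between the owners named in CODEOWNERS and the owners that currently exist. The sketch below assumes the list of current owners has already been fetched by other means and is passed in as plain strings. StaleOwner and findStaleOwners are hypothetical names, the handles are invented, and inline comment handling is simplified (a literal # inside a path pattern is not supported).

```typescript
interface StaleOwner {
  owner: string;
  patterns: string[]; // CODEOWNERS patterns that still reference this owner
}

// Owners referenced in CODEOWNERS that are not in the current owner list.
function findStaleOwners(codeownersText: string, currentOwners: string[]): StaleOwner[] {
  const current = new Set(currentOwners.map((owner) => owner.toLowerCase()));
  const stale = new Map<string, string[]>();
  for (const rawLine of codeownersText.split("\n")) {
    const line = rawLine.split("#")[0].trim(); // drop full-line and inline comments
    if (line === "") continue;
    const [pattern, ...owners] = line.split(/\s+/);
    for (const owner of owners) {
      if (!current.has(owner.toLowerCase())) {
        const patterns = stale.get(owner) ?? [];
        patterns.push(pattern);
        stale.set(owner, patterns);
      }
    }
  }
  return Array.from(stale.keys())
    .sort()
    .map((owner) => ({ owner, patterns: stale.get(owner) ?? [] }));
}

const codeownersFile = `
# Ownership map
/packages/payments/ @acme/payments-team
/packages/logging/  @acme/platform-team @former-engineer  # added during the 2022 migration
`;

const staleOwners = findStaleOwners(codeownersFile, [
  "@acme/payments-team",
  "@acme/platform-team",
]);
console.log(staleOwners);
```

The output pairs each stale handle with the patterns that still reference it, which is the information a fix-up PR needs.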
The dependency sharing model. Name whether internal packages are versioned, how breaking changes are deployed, and what the compatibility policy is. "Internal packages use workspace:* version references — no publication step, no semver versioning for internal-only packages. All packages in the workspace always use the current version of all internal dependencies. Breaking change policy for shared packages: a breaking API change (removal of an export, a type signature change that requires call-site updates, a behavioral change that requires call-site adaptation) must update all consumers in the same PR. A breaking change PR that updates a shared package without updating all consumers will fail CI because TypeScript type errors and test failures in the unadapted consumers will surface in the affected computation. Reviewers of a breaking change PR should verify that: (1) all consumers are adapted in the same commit, (2) the PR description explains the reason for the breaking change and the migration pattern for any future consumers that were not in the workspace at the time of the change. External publication: two packages are published to npm (@whychose/extractor and @whychose/schema). These follow semver and have a separate changelog policy — see packages/extractor/CHANGELOG.md. Deprecation policy for internal package APIs: internal packages may deprecate an export with a @deprecated JSDoc annotation. A deprecated export must be removed no earlier than 60 days after the deprecation annotation is merged, and only after all in-workspace consumers have migrated off the deprecated API."
The revisitation conditions. Name the checkable triggers under which the repository structure decision should be re-evaluated. "Re-evaluate the modular monorepo structure if any of the following triggers occur: (1) P99 CI time for a single-package change (one package modified, its dependents rebuilt) exceeds 20 minutes, which indicates that the dependency fan-out from a single change is too large for the monorepo to deliver CI results at an acceptable speed, and that either the dependency graph requires pruning or a polyrepo split of high-fan-out packages is warranted; (2) the number of engineering teams exceeds 8, and cross-team PR review turnaround (the time from PR open to approval from the CODEOWNERS team) averages more than 3 business days across a rolling 4-week window, which indicates that the CODEOWNERS model is producing review friction that exceeds the benefit of atomic cross-team refactoring; (3) a compliance requirement is identified that requires per-service git history isolation, per-service access control below the path level, or per-service audit logs independent of cross-package commits, none of which are achievable within a monorepo structure; (4) the number of external packages published from the monorepo to npm exceeds 5, at which point the publication workflow complexity may justify a hybrid structure where publicly published packages live in separate repositories with their own release pipelines."
Finding monorepo vs. polyrepo decisions in AI chat
The WhyChose extractor surfaces repository structure decisions from four session types that contain the reasoning most teams cannot reconstruct when a new platform engineer asks why the current structure was chosen, or when a CI speed incident prompts someone to ask whether a different structure would have prevented it.
The initial setup session. "Should we use a monorepo or separate repos for our microservices?", "how do we set up a pnpm workspace with shared packages?", "Turborepo vs. Nx — which should we use for a Node.js monorepo?", "should we have one Git repo per service or one repo for everything?", "how do I share code between services without publishing to npm?", "what's the difference between a monorepo and a multi-package repo?". These sessions hold the structure choice, the build tooling selection, and the alternatives considered before the decision was made. The initial setup session is the most important to recover because it contains the specific alternatives evaluated and their rejection reasons — the information that prevents the same alternatives from being re-evaluated by a new engineer who arrives four years later and wonders why Nx wasn't used instead of Turborepo, or why the team chose pnpm over Yarn Berry.
The CI speed session. "Our monorepo CI takes 40 minutes — how do we fix it?", "how does Turborepo remote caching work?", "Nx affected computation — why is CI rebuilding packages that didn't change?", "how do I configure Turborepo to only run tests for changed packages?", "why is my GitHub Actions matrix strategy rebuilding everything?", "how do I speed up monorepo CI with parallel jobs?". These sessions contain the build caching decision: whether remote caching was adopted, what backend was chosen, and what the cache key design is. Like performance optimization decisions that determine latency behavior under load, the build caching decision determines CI throughput under the traffic pattern of a team making frequent small commits — and the caching mechanism determines whether the throughput gain is real (accurate dependency declarations drive accurate affected computation) or illusory (phantom dependencies produce cache hits that serve stale artifacts).
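The link between cache key design and phantom dependencies can be shown with a toy model. The sketch below is not how Turborepo or Nx compute their keys; it assumes a simplified key made of a package's own source fingerprint plus the keys of its declared internal dependencies, and the WorkspacePackage shape, the fingerprint strings, and the buildWorkspace helper are all hypothetical.

```typescript
// Toy model of a workspace package: a fingerprint of its own files
// plus the internal dependencies declared in its package.json.
interface WorkspacePackage {
  sourceFingerprint: string;
  declaredInternalDeps: string[];
}

// The key covers the package itself and, recursively, everything it DECLARES.
// A real tool would hash this string; it is left readable for illustration.
function cacheKey(
  name: string,
  workspace: Map<string, WorkspacePackage>,
  path: string[] = []
): string {
  const pkg = workspace.get(name);
  if (pkg === undefined) throw new Error(`unknown package: ${name}`);
  if (path.indexOf(name) !== -1) throw new Error(`dependency cycle through ${name}`);
  const depKeys = pkg.declaredInternalDeps
    .slice()
    .sort()
    .map((dep) => cacheKey(dep, workspace, path.concat(name)));
  return `${name}@${pkg.sourceFingerprint}[${depKeys.join(",")}]`;
}

// Hypothetical three-package workspace mirroring the payments incident.
function buildWorkspace(
  loggingFingerprint: string,
  paymentsDeps: string[]
): Map<string, WorkspacePackage> {
  return new Map<string, WorkspacePackage>([
    ["@internal/logging", { sourceFingerprint: loggingFingerprint, declaredInternalDeps: [] }],
    ["@internal/shared-utils", { sourceFingerprint: "u1", declaredInternalDeps: ["@internal/logging"] }],
    ["payments", { sourceFingerprint: "p1", declaredInternalDeps: paymentsDeps }],
  ]);
}

// Phantom case: payments declares nothing, so a logging change leaves its key unchanged.
const phantomBefore = cacheKey("payments", buildWorkspace("l1", []));
const phantomAfter = cacheKey("payments", buildWorkspace("l2", []));
console.log(phantomBefore === phantomAfter); // true: a stale cache hit

// Declared case: the same logging change now invalidates the payments key.
const declaredBefore = cacheKey("payments", buildWorkspace("l1", ["@internal/shared-utils"]));
const declaredAfter = cacheKey("payments", buildWorkspace("l2", ["@internal/shared-utils"]));
console.log(declaredBefore === declaredAfter); // false: payments is rebuilt
```

In this model the key can only be as accurate as the declarations it is computed from, which is the sense in which a phantom dependency makes the throughput gain illusory.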
The phantom dependency session. "Why does my package build locally but fail in CI after removing an unrelated package?", "what is a phantom dependency in pnpm?", "how do I fix 'module not found' errors in a pnpm strict workspace?", "why is my Nx build not detecting changes to a package it depends on?", "my monorepo CI shows green but the deployed service is broken — how?", "how do I find all undeclared dependencies in a pnpm workspace?". These sessions document the specific phantom dependency incident that triggered the accuracy policy requirement. Like error handling incident sessions that reveal the error propagation policy gaps, the phantom dependency incident session contains the specific failure mode, the diagnosis, and the fix applied — which is the policy requirement that should have been in the ADR from the beginning. Recovering this session produces the phantom dependency prevention section of the decision record without requiring the team to reconstruct the policy from the theoretical risk analysis.
The cross-team contribution session. "How do CODEOWNERS files work in a GitHub monorepo?", "how do I set up code ownership per team in a monorepo?", "why does my PR require review from three different teams?", "how do I reduce review friction for cross-team contributions in our monorepo?", "our PR reviews are slow because reviewers own too many packages — how do we fix CODEOWNERS?", "should we require CODEOWNERS review for every PR or only for specific paths?". These sessions contain the CODEOWNERS model decision: the granularity of ownership entries, the review requirement policy, and the cross-team contribution process. For platform teams, recovering cross-team contribution sessions identifies the CODEOWNERS design decisions that determined the current review friction level — and provides the evidence for whether a restructuring of the CODEOWNERS granularity or the review requirement policy would reduce friction without eliminating accountability.
What the decision record prevents
A documented repository structure decision prevents three recurring problems that teams encounter as their monorepo grows and their engineering team turns over.
It prevents the phantom dependency production failure. A team without a documented phantom dependency policy accumulates phantom dependencies over years of development as engineers import convenience utilities from sibling packages without adding them to package.json. The practice is invisible — the imports resolve, the tests pass, CI shows green. The production failure surfaces when an upstream package makes a breaking change and the affected computation does not include the phantom-dependent package in the rebuild set. The decision record that names "phantom dependency prevention: strict pnpm hoisting + dependency-cruiser pre-merge check" converts the phantom dependency risk from an accumulating silent failure mode into a development-time error — engineers receive an immediate signal when they create a phantom dependency rather than discovering it as a production incident months later.
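The pre-merge check named in that policy amounts to comparing what a package imports against what its package.json declares. The following sketch illustrates the idea only; it is not dependency-cruiser's API, it assumes the import specifiers have already been collected from the source files, and the function names are hypothetical.

```typescript
// The subset of package.json fields this check reads.
interface PackageManifest {
  name: string;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
  peerDependencies?: Record<string, string>;
}

// Reduce an import specifier to a package name:
// "@scope/pkg/sub" -> "@scope/pkg", "lodash/fp" -> "lodash".
// Relative paths, absolute paths, and "node:" built-ins return null.
function packageNameOf(specifier: string): string | null {
  if (
    specifier.startsWith(".") ||
    specifier.startsWith("/") ||
    specifier.startsWith("node:")
  ) {
    return null;
  }
  const parts = specifier.split("/");
  if (specifier.startsWith("@")) {
    return parts.length >= 2 ? `${parts[0]}/${parts[1]}` : null;
  }
  return parts[0] ?? null;
}

// Returns every imported package that the manifest does not declare.
// Limitation of this sketch: bare built-ins such as "fs" are not filtered out.
function findPhantomDependencies(
  manifest: PackageManifest,
  importSpecifiers: string[]
): string[] {
  const declared = new Set<string>(
    Object.keys(manifest.dependencies ?? {})
      .concat(Object.keys(manifest.devDependencies ?? {}))
      .concat(Object.keys(manifest.peerDependencies ?? {}))
  );
  const phantom = new Set<string>();
  for (const specifier of importSpecifiers) {
    const name = packageNameOf(specifier);
    if (name !== null && name !== manifest.name && !declared.has(name)) {
      phantom.add(name);
    }
  }
  return Array.from(phantom).sort();
}
```

Applied to the payments incident, a manifest that declares only @internal/logging while the source imports @internal/shared-utils would report @internal/shared-utils as undeclared at review time.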
It prevents the CODEOWNERS review bottleneck. A monorepo without a CODEOWNERS maintenance policy accumulates stale entries over years of team changes. Engineers who left the company are still listed as required reviewers. Engineers who moved to different teams still own packages they no longer maintain. PRs pile up waiting for reviewers who are no longer responsible for the code. The specific review bottleneck — "the payments team reviewer left six months ago and nobody updated CODEOWNERS" — is discovered during the incident, not before it. The decision record that names "CODEOWNERS maintenance policy: remove within 5 business days of team departure, quarterly audit" converts the maintenance obligation from an implicit expectation into an explicit process that each team knows they are responsible for. Like ADR lifecycle policies that define when a decision requires revisitation, the CODEOWNERS maintenance policy is only effective if it names a specific trigger and a specific process — "keep it current" is not a policy.
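The quarterly audit in that policy can be partly automated by checking each CODEOWNERS entry against a list of current owners. This is a minimal sketch under stated assumptions: the active roster is supplied by the caller (where it comes from is not specified in the policy), the parsing covers only the common "pattern owner owner" line shape with "#" comments, and the names used are hypothetical.

```typescript
interface StaleOwnerFinding {
  line: number; // 1-based line number in the CODEOWNERS file
  pattern: string;
  staleOwners: string[];
}

// Reports every CODEOWNERS entry that lists an owner missing from the active roster.
function findStaleOwners(
  codeowners: string,
  activeOwners: Set<string>
): StaleOwnerFinding[] {
  const findings: StaleOwnerFinding[] = [];
  codeowners.split("\n").forEach((raw, index) => {
    const tokens = raw.trim().split(/\s+/);
    const pattern = tokens[0];
    // Skip blank lines and full-line comments.
    if (pattern === undefined || pattern === "" || pattern.startsWith("#")) return;
    const owners: string[] = [];
    for (const token of tokens.slice(1)) {
      if (token.startsWith("#")) break; // the rest of the line is an inline comment
      owners.push(token);
    }
    const staleOwners = owners.filter((owner) => !activeOwners.has(owner));
    if (staleOwners.length > 0) {
      findings.push({ line: index + 1, pattern, staleOwners });
    }
  });
  return findings;
}
```

A check like this covers the audit half of the policy; the removal deadline after a departure still needs an owner and a process.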
It prevents the version skew accumulation failure. A polyrepo or hybrid structure without a maximum version skew policy accumulates version skew as each team upgrades shared packages on their own schedule. The first team to upgrade gains the benefit of the new version. The last team to upgrade may be behind by a year when the publishing team needs to remove the old API surface. The publishing team cannot remove the old API without breaking the last consumer, and the last consumer cannot upgrade quickly because the version gap is now a multi-sprint migration. The decision record that names "maximum version skew: all consumers must upgrade to a new major version within 6 weeks of release" creates a coordination mechanism that limits skew accumulation. A technical leader who inherits a polyrepo without a version skew policy cannot assess the migration debt without reading every service's dependency lock file — a repository archaeology exercise that a documented policy would have made unnecessary.
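A maximum version skew policy is only enforceable if someone can list who is past the deadline. The sketch below assumes the consumers' pinned major versions have already been gathered from each repository (the archaeology step a documented policy is meant to make routine); the 6-week window comes from the quoted policy, and the types and names are hypothetical.

```typescript
interface MajorRelease {
  packageName: string;
  major: number;
  releasedAt: Date;
}

// One consumer's pinned major version of one shared package.
interface ConsumerPin {
  consumer: string;
  packageName: string;
  major: number;
}

const MAX_SKEW_WEEKS = 6; // from the quoted policy
const MS_PER_WEEK = 7 * 24 * 60 * 60 * 1000;

// Lists consumers still below the released major once the upgrade window has closed.
// Before the deadline nobody is overdue, so the result is empty.
function overdueConsumers(
  release: MajorRelease,
  pins: ConsumerPin[],
  now: Date
): string[] {
  const deadline = release.releasedAt.getTime() + MAX_SKEW_WEEKS * MS_PER_WEEK;
  if (now.getTime() <= deadline) return [];
  return pins
    .filter((pin) => pin.packageName === release.packageName && pin.major < release.major)
    .map((pin) => pin.consumer)
    .sort();
}
```

The output is the migration debt list: the publishing team cannot remove the old API surface until it is empty.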
Further reading
- Decisions that never get written down — the repository structure decision is one of the decisions most likely to be undocumented: it is made at project inception before the codebase is large enough to reveal the decision's consequences, it feels like a technical preference rather than an architectural commitment, and the consequences — CI cost, ownership model, dependency sharing model — are only visible years later when they have accumulated enough to become problems
- ADRs for platform teams: how infrastructure decisions become constraints for product teams — the phantom dependency prevention policy, the CODEOWNERS maintenance process, and the task runner configuration are platform team decisions that create constraints for every product team working in the monorepo; documenting these as ADRs converts them from implicit platform team conventions into explicit constraints that product teams can rely on and argue against explicitly
- The caching strategy decision record — the monorepo build cache and the application-layer request cache share a structural problem: both are cache invalidation problems where the correctness of the cache hit depends on the accuracy of the dependency declaration; a phantom dependency in a monorepo is the build cache equivalent of a cache-aside implementation that doesn't call invalidation on write — the cache returns a result that does not reflect the current state of its dependencies
- The dependency upgrade decision record: documenting the 'why now?' of a breaking migration — in a polyrepo, the dependency upgrade decision recurs for every service every time a shared library publishes a new major version; the "why now?" question is the forcing function that determines how long version skew accumulates before a given consumer upgrades; without a documented maximum version skew policy, the "why now?" question has no answer and deferred upgrades accumulate
- The rejected dependency: why the libraries you didn't install deserve a decision record — the monorepo tooling selection (Turborepo vs. Nx vs. Bazel vs. custom scripts) is a build-tool dependency decision with a rejection record; a team that evaluated Nx and chose Turborepo for specific reasons has a documented position that prevents re-evaluation of Nx from scratch when a new engineer arrives with Nx experience and proposes the switch; without the rejection record, the evaluation must be repeated in full
- The interface decision: why the contracts between your components deserve their own records — the breaking change policy for shared packages in a monorepo is a contract decision: what constitutes a breaking change, what update is required from consumers, and what the migration timeline is; without a documented breaking change contract, each shared package maintainer applies their own judgment and consumers encounter inconsistent migration requirements across packages
- The build vs. buy decision record: why the make-or-buy choice is the hardest to document honestly — the task runner selection (Turborepo, Nx, Bazel, or a custom affected-computation script) is a build-vs-buy decision with a vendor lock-in dimension; Turborepo Remote Cache requires Vercel as the cache backend (or a self-hosted compatible server); Nx remote caching is designed around Nx Cloud as the cache backend; Bazel is self-hosted but requires Starlark build file authoring; the vendor implications of each choice belong in the build-vs-buy rejection reasoning
- ADR lifecycle: superseding and deprecating decisions — repository structure decisions are the most likely to be superseded rather than amended: a flat monorepo that becomes a modular monorepo, a modular monorepo that splits into a hybrid, a polyrepo that consolidates into a monorepo — each of these is a structure change that supersedes the prior decision; the supersession record captures the accumulated cost that triggered the migration and the migration approach chosen, so the next migration can learn from both rather than treating the migration as a novel problem
- Three months of AI chat history, undocumented — repository structure decisions appear in AI chat in four session types: the initial setup session (structure choice, build tooling selection, alternatives evaluated); the CI speed session (build caching decision, affected computation configuration); the phantom dependency session (accuracy policy decision surfaced through incident); and the cross-team contribution session (CODEOWNERS design and review process); the CI speed session and the phantom dependency session are the highest-value targets because they document the build model consequences discovered through production rather than theoretical analysis
- The new-CTO onboarding problem: when nobody can tell you why — a technical leader who inherits a monorepo cannot determine why the structure was chosen, which alternatives were considered, what the phantom dependency prevention policy is, or what the CODEOWNERS maintenance process is without reading CONTRIBUTING.md files (if they exist) or asking the engineers who were present at project inception; the repository structure ADR converts these questions from an oral history exercise into a lookup against a documented record that was written at decision time, not reconstructed from memory
- Nygard ADR template — the standard format adapts to repository structure decisions, with the revisitation conditions section as the most important addition; unlike most architectural decisions where the consequences are visible within months, the repository structure decision's costs (CI time, CODEOWNERS friction, version skew) accumulate over years; naming the checkable trigger conditions at decision time makes it possible to recognize when the decision should be revisited without waiting for a pain threshold to be crossed
- WhyChose extractor — repository structure decisions appear in AI chat in four session types: the initial setup session (structure choice and tooling selection), the CI speed session (build caching and affected computation), the phantom dependency session (accuracy policy discovery through incident), and the cross-team contribution session (CODEOWNERS design); the phantom dependency session is the most valuable for the decision record because it documents the specific correctness failure in terms of a real production incident rather than a theoretical risk