The developer experience decision record: why the local development environment you chose determines your onboarding time and your production-parity gap
The local development environment is configured during the first sprint of a project. Someone adds a docker-compose.yml with three services, runs docker-compose up, and the team begins working. Six months later, a fourth service has been added. A year after that, the Postgres version in docker-compose.yml is still 13 because nobody wanted to break the setup, and production is running 15. Nobody wrote down why Docker Compose over Dev Containers, what the acceptable production-parity divergence was, or who owns the file when it drifts. The decision was implicit, load-bearing, and invisible, and it stays invisible until a new engineer spends four days getting a working setup, or until a production incident traces back to a behavior difference between Postgres 13 and Postgres 15 that was present in every PR for eleven months.
The developer experience decision record makes the local environment strategy explicit: which tooling was chosen, why it was chosen over the alternatives that were available at the time, what the production-parity assumptions are and where the known divergences are, what the onboarding time target is and the mechanism for detecting when it has been exceeded, and who owns the environment setup as the production topology evolves. This is not documentation for documentation's sake — it is the record that allows a future engineer to evaluate whether the current environment strategy still fits the team's size and the application's topology, or whether the original decision's assumptions have been quietly invalidated by two years of growth.
Two things that happen when the decision is not written down
The four-day onboarding
A 28-person B2B SaaS company began development three years ago with a single engineer who set up Docker Compose with Postgres, Redis, and the application server. The setup worked. As the team grew, services were added: a background job worker, an async email service, a separate analytics database, an internal admin API. Each service was added to docker-compose.yml with the configuration that worked for the engineer who added it. By the time the team reached 28 people, the compose file had nine services, two of which required environment variables that were not in the .env.example file because they had been added directly to individual engineers' local .env files and never synced back. One service required a specific version of a native library that was not documented anywhere — it was installed on every existing engineer's machine because they had been around when the library was needed, but new engineers discovered the missing dependency only at runtime, after starting all nine services, by reading a cryptic error message from a library that was not in the service's own README.
New engineer onboarding took an average of four days. Three of those days were spent finding undocumented manual steps: the missing native library, the undocumented environment variables, and the fact that the Postgres 13 image in docker-compose.yml did not include the pgvector extension that one service now required. That service had been added eighteen months after the original compose file was written, by an engineer who was using a separately installed local Postgres with pgvector; they had updated their own local compose file but not the repository's. The onboarding steps existed, spread across five READMEs, a wiki page, and a Slack message archived in a channel the new engineer was not yet in, but they were not sequenced, not authoritative, and not maintained against the current state of the repository. The gap between the documented setup and the working setup represented the accumulated undocumented decisions of three years of development: each small configuration change had been made by someone who already had a working environment, and none had been recorded as a decision with explicit reasoning and explicit impact on the environment setup.
The production-parity incident cadence
A 15-person developer tools company used Dev Containers with a full local Kubernetes cluster (kind plus Telepresence proxying) to maximize production parity. The original decision to use this configuration was made by the founding engineer who had been burned by environment parity issues at a previous company and chose the highest-fidelity option available. The setup was never written down as a decision with explicit reasoning: no documentation of why kind over minikube, no documentation of what the acceptable parity divergence was for the components that were still running natively (the build toolchain, the test runner, the IDE language server), and no documentation of what the migration path was if the environment setup became incompatible with new hardware.
When the company issued MacBook Pro M-series laptops, five engineers received them in the same week. The arm64 architecture exposed a gap that had been invisible on x86: two of the nine container images in the local cluster were published only for linux/amd64 and had no arm64 variants. The kind cluster failed to start. No documentation existed for this failure mode because the original environment decision record, which would have included a section on "supported hardware architectures and escalation path when a new architecture is introduced", had never been written. The founding engineer who had made the original tooling choice had left the company eight months earlier. The five engineers with new laptops spent a combined ten days investigating the failure, attempting cross-compilation, and finally setting up Docker Desktop's QEMU emulation layer as a workaround. The workaround introduced a new parity gap (emulated amd64 images run more slowly than native ones and can behave differently) that was not documented because the resolution was a workaround, not a decision.
Both companies illustrate the same failure: the developer experience decision record was never written, so the load-bearing assumptions of the environment (which hardware architectures it supports, which service versions are required, what the onboarding sequence is, who owns the configuration when it drifts) could not be maintained, evaluated, or migrated when circumstances changed. The decision is not recoverable from the docker-compose.yml, the Dockerfile, or the README. Recovering it requires the reasoning at the time of the choice: what the alternatives were, why those alternatives were rejected, and what conditions would warrant revisiting.
Three structural properties that are set at environment selection time
1. The onboarding time floor and the manual-step surface
The onboarding time floor is the minimum time required for a new engineer to reach a working local development environment from a fresh machine, assuming all documented steps are followed correctly and all dependencies are available at the expected versions and architectures. The floor is determined by: the number of required downloads and their sizes; the number of build steps required (building container images locally, compiling native dependencies, running database migrations); and the number of decisions the engineer must make during the setup that are not resolved by the documented steps. Each undocumented step that requires judgment — "which version of the runtime should I install?", "do I need to configure this environment variable before or after running the migration?", "is this error expected or does it indicate a problem I need to resolve?" — adds time and unpredictability to the floor. An environment strategy that minimizes undocumented judgment steps minimizes the floor.
The manual-step surface is the set of setup steps that must be performed manually rather than being automated by the environment setup tooling. Dev Containers and Nix expressions minimize the manual-step surface by defining the environment declaratively and running setup automatically on first use; Docker Compose reduces but does not eliminate it by automating service orchestration while leaving native tool installation, environment variable setup, and database migration to manual steps documented outside the compose file; native host installs maximize the manual-step surface by requiring each engineer to install every dependency manually using their platform's package manager. The manual-step surface grows over the life of the project as new services and dependencies are added; without a documented owner and a policy for keeping the automated setup in sync with the current state of the repository, the gap between the automated setup and the working setup widens with each new addition.
Document the expected onboarding time at selection time and the mechanism for detecting when it has been exceeded. A simple mechanism is a periodic test: onboard a new contractor or an intern using only the documented steps, without assistance from existing engineers, and measure the time from fresh checkout to first passing test suite run. If the measured time exceeds the documented target, the delta is the accumulated undocumented manual-step surface — the undocumented decisions of the period since the last test. Without a documented target and a periodic test, the onboarding time grows invisibly until it becomes a productivity incident.
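The periodic test described above reduces to one comparison: measured time from fresh checkout to first passing test run, against the documented target. A minimal Python sketch follows; the class name, function name, and the four-hour target are illustrative assumptions, not values from any real team.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class OnboardingRun:
    """One unassisted setup attempt on a fresh machine."""
    started_at: datetime      # moment of the fresh checkout
    first_green_at: datetime  # first passing test suite run

    @property
    def elapsed(self) -> timedelta:
        return self.first_green_at - self.started_at


def onboarding_overrun(run: OnboardingRun, target: timedelta) -> timedelta:
    """Return how far the run exceeded the target (zero if within target)."""
    return max(run.elapsed - target, timedelta(0))


# Hypothetical numbers: a 4-hour documented target, a run that took 11 hours.
run = OnboardingRun(datetime(2026, 1, 12, 9, 0), datetime(2026, 1, 12, 20, 0))
overrun = onboarding_overrun(run, target=timedelta(hours=4))
print(f"overrun against target: {overrun}")
```

A non-zero overrun is the signal to go looking for the manual steps that were added since the last measured run.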
2. The production-parity gap and its failure-mode categories
The production-parity gap is not a single number — it is a set of specific divergences between the local environment and the production environment, each of which creates a category of bugs that are not reproducible locally. A Postgres version divergence creates a gap in query planner behavior, in supported syntax, and in extension availability. A local mock for an external API creates a gap in error response shapes, rate-limiting behavior, and retry semantics. A Docker bridge network creates a gap in DNS resolution behavior relative to Kubernetes service discovery. A local filesystem creates a gap in file permission semantics, in inode limit behavior, and in latency relative to a network-attached storage volume. Each specific divergence corresponds to a category of bugs that can be introduced in code that passes local testing and review but fails in production.
Some parity gaps are acceptable: local development does not need to replicate multi-region failover behavior, production traffic load, or datacenter network latency. The question is not whether to accept any parity gap — some gaps are always acceptable — but which gaps are explicitly accepted and which are explicitly closed. A documented parity gap list enables the team to maintain deliberate integration test coverage for the accepted-gap behaviors in a staging or CI environment that runs against production-equivalent versions. An undocumented parity gap means that certain failure categories have no test coverage at any layer below production, because nobody knows the gap exists to design tests around it.
The failure mode of an undocumented parity gap is characteristically confusing: the bug is reproducible in production, not reproducible locally, and not reproducible in CI when CI runs against the same versions as the local environment. The investigation starts from the production behavior, works backwards through the code, and eventually identifies a behavioral difference between the production and local versions of a dependency that nobody knew was different. The resolution requires either updating the local environment to match production (which may break other engineers' local environments if the update is not coordinated) or adding a specific integration test that exercises the production-equivalent behavior. The cost of the resolution grows with the undocumented duration of the gap: the longer the divergence has gone undetected, the more code has been written and reviewed against the wrong behavior.
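Version divergences are the easiest part of the parity gap to detect mechanically. The sketch below compares locally declared service versions against production versions and reports every service whose major version differs or that is missing locally. The service names and version numbers are made up for illustration, and comparing only major versions is a simplifying assumption; some dependencies change behavior in minor releases too.

```python
def major_version(version: str) -> str:
    """Return the leading component of a version string, e.g. '15' from '15.4'."""
    return version.split(".")[0].split("-")[0]


def find_version_gaps(
    local: dict[str, str], production: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Map each divergent service to (local_version, production_version).

    A service that production runs but the local environment lacks is
    reported with the local version 'absent'.
    """
    gaps = {}
    for service, prod_version in production.items():
        local_version = local.get(service, "absent")
        if (local_version == "absent"
                or major_version(local_version) != major_version(prod_version)):
            gaps[service] = (local_version, prod_version)
    return gaps


# Hypothetical inventory: Postgres has drifted, and the pooler is missing locally.
local = {"postgres": "13.12", "redis": "7.2"}
production = {"postgres": "15.4", "redis": "7.0", "pgbouncer": "1.21"}
for service, (have, want) in find_version_gaps(local, production).items():
    print(f"{service}: local={have} production={want}")
```

Run in CI, a check like this turns a silent divergence into a visible one, which is the precondition for deciding whether to accept it or close it.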
3. The environment maintenance ownership and the version drift rate
The environment setup — docker-compose.yml, .devcontainer/devcontainer.json, flake.nix, or equivalent — is a software artifact with dependencies that require maintenance. Container image tags become stale. Base images receive security patches that change behavior. Native library versions in use on macOS diverge from those in use on Linux. The version drift rate is the speed at which the environment's declared dependency versions fall behind the versions in use in production and CI. Without a documented owner and a documented update policy, the drift rate is proportional to the speed at which production dependencies are updated and inversely proportional to the frequency with which someone updates the local environment configuration.
The drift rate compounds into two distinct problems. The first is the parity gap: as production dependencies are updated and local environment dependencies are not, the parity gap widens and new failure-mode categories are introduced. The second is the setup fragility: as the environment's declared versions fall further behind, the probability of encountering a conflict between the declared versions and the current state of a new engineer's machine increases. A docker-compose.yml that specifies Postgres 13 does not cause onboarding problems while Postgres 13 is still a supported release; it begins causing problems once Postgres 13 reaches end of life, when the image stops receiving fixes and newer extensions and client tooling may no longer support it. Document the version update policy at environment selection time: who runs the update, on what cadence, and what the validation step is (which tests are run after an environment version bump to confirm that the application still works correctly with the updated dependencies). Without a documented policy, the update happens reactively (when a new engineer cannot complete setup, or when a production incident traces back to a parity gap) rather than proactively, when the cost is minimal.
The five ADR sections for a developer experience decision
1. Local environment strategy and technology selection
Document the chosen local environment strategy — Docker Compose, Dev Containers, Nix/NixOS development shells, Minikube, kind with Telepresence, native host installs, or a combination — with explicit rejection reasoning for the alternatives that were evaluated. For each alternative rejected, document the specific property that made it unsuitable at the time of the decision: Dev Containers rejected because the team's existing macOS toolchain required native library versions that did not cross-compile cleanly into a container; Nix rejected because the team's collective experience with Nix expressions was insufficient to maintain them as the service dependency graph grew; kind rejected because the memory requirements for running a local Kubernetes cluster on a 16 GB machine exceeded what the team's laptops could sustain without degrading IDE performance.
Document the supported hardware architectures explicitly. As of 2026, a team issuing both Intel Macs and M-series Macs, or both macOS and Linux development machines, must document which architectures are supported, whether multi-platform container images are required for all services (and the build and publish tooling that produces them), and what the documented procedure is when a required image is not available for a team member's architecture. Document the expected setup time on each supported platform, measured by timing the setup on a freshly provisioned machine of each type. If the setup time varies significantly between platforms, document the reason for the variance and whether it is expected to persist or is a temporary condition while a platform-specific issue is resolved.
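The architecture question can be checked before new hardware arrives rather than after. The sketch below assumes the list of published platforms for each image has already been collected (for instance from the registry manifest, which docker manifest inspect can display) and reports the images with no variant for a required platform. The image names and platform sets are hypothetical.

```python
def images_missing_platform(
    published: dict[str, set[str]], required: str
) -> list[str]:
    """Return, sorted, the images that publish no variant for `required`."""
    return sorted(
        image for image, platforms in published.items()
        if required not in platforms
    )


# Hypothetical stack: two images are amd64-only.
published = {
    "example/api": {"linux/amd64", "linux/arm64"},
    "example/worker": {"linux/amd64", "linux/arm64"},
    "example/legacy-mailer": {"linux/amd64"},
    "example/report-renderer": {"linux/amd64"},
}
blocked = images_missing_platform(published, "linux/arm64")
print("no arm64 variant:", ", ".join(blocked))
```

A non-empty result is the trigger for the documented procedure: either publish a multi-platform build of the image or record emulation as an accepted parity gap.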
Document the migration cost model. What is the effort required to migrate from the chosen environment strategy to an alternative if the current strategy's properties (setup time, parity gap, maintenance cost, hardware compatibility) become unacceptable at the team's future scale? Migrating from Docker Compose to Dev Containers requires writing a devcontainer.json for each service, validating that all native tools used in development are available in the container or can be installed deterministically, and coordinating the migration across all engineers simultaneously (a split environment — some on Docker Compose, some on Dev Containers — produces systematic 'works on my machine' problems during the transition). A team that makes the Docker Compose choice knowing the migration cost to Dev Containers at 30 engineers is approximately three engineering weeks of setup work plus one week of coordination is making an informed trade-off. A team that makes the same choice without modeling the migration cost discovers the trade-off when they are executing the migration, with a larger team and a larger surface area than they anticipated.
2. Production-parity surface and acceptable divergence policy
Document every known divergence between the local development environment and the production environment at the time of the decision. For each divergence, document whether it is explicitly accepted and the rationale, or whether it is a known gap that will be closed in a future milestone. The divergence inventory should include: software version differences for every persistent service (database, cache, message broker, search engine); network topology differences (local single-node vs. production multi-node, local Docker bridge network vs. production Kubernetes service mesh, local TLS termination approach vs. production TLS termination approach); authentication and authorization differences (local mocked auth vs. production identity provider, local static API keys vs. production secrets manager); and external dependency differences (local mocks vs. production third-party API, local seeded data vs. production live data, local filesystem vs. production object storage).
For each accepted divergence, document the category of bugs that the divergence makes undetectable locally. A local mock for a Stripe webhook that returns only success responses makes undetectable locally any bug that depends on Stripe's webhook retry behavior (in live mode, failed deliveries are retried with exponential backoff for up to three days, and a retried event carries the same event ID, so handlers must tolerate duplicate deliveries). Document where coverage for that failure category is provided instead: integration tests against a Stripe test mode account in CI, a staging environment that runs against Stripe test mode directly, or a documented manual testing procedure for webhook retry scenarios before merging code that changes webhook handling logic. The parity gap inventory, the failure-mode categories it creates, and the compensating coverage in non-local environments form a complete picture of the project's test coverage architecture. Without the explicit divergence documentation, the coverage architecture cannot be reasoned about, and gaps between "what is covered locally" and "what is covered in production" are invisible.
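The divergence inventory is easier to keep honest when it is structured data rather than prose. The following Python sketch shows one possible shape: each entry records the divergence, whether it is accepted or scheduled to be closed, the bug category it hides, and where compensating coverage lives. The field names and example entries are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    ACCEPTED = "accepted"          # deliberately left open, with a rationale
    TO_BE_CLOSED = "to_be_closed"  # known gap scheduled to be closed


@dataclass
class Divergence:
    component: str
    local: str
    production: str
    status: Status
    hidden_bug_category: str
    compensating_coverage: list[str] = field(default_factory=list)


def uncovered(inventory: list[Divergence]) -> list[str]:
    """Accepted divergences that have no compensating coverage anywhere."""
    return [
        d.component for d in inventory
        if d.status is Status.ACCEPTED and not d.compensating_coverage
    ]


# Hypothetical entries.
inventory = [
    Divergence("stripe webhooks", "success-only mock", "live deliveries",
               Status.ACCEPTED, "retry and duplicate-delivery handling",
               ["CI integration tests against Stripe test mode"]),
    Divergence("object storage", "local filesystem", "object store",
               Status.ACCEPTED, "permission and latency semantics"),
]
print("accepted but uncovered:", uncovered(inventory))
```

An entry that shows up in the uncovered list is a failure category with no test coverage below production, which is exactly the condition the inventory exists to expose.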
The service mesh decision record and the container orchestration decision record define the production networking topology. The developer experience decision record defines how that topology is approximated locally — and the deviation between the production service discovery model and the local Docker Compose bridge network model is a parity gap that should be documented in both records, with explicit notes about which network-dependent behaviors are and are not reproducible in the local environment.
3. Onboarding sequence and time-to-first-PR target
Document the onboarding sequence as an ordered, validated list of steps that a new engineer can follow from a fresh machine to a working development environment. Validated means that the sequence has been tested on a machine that did not previously have any of the project's dependencies installed, and that following the steps in order, without assistance, produces a working environment. Document the expected duration for each step group (downloads, builds, migrations, validation) and the total expected duration for a first-time setup on each supported hardware platform. Document the validation command that confirms the environment is correctly configured — typically a command that starts all required services, runs the test suite against the local services, and exits zero if everything is working.
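The exit-code contract of the validation command can be sketched in a few lines. The version below covers only the final step (run a set of named checks and return zero only if all pass); starting services and running the test suite are left out. The check names, hosts, and ports are hypothetical defaults for a Postgres and Redis stack.

```python
import socket
from typing import Callable

Check = Callable[[], bool]


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_validation(checks: dict[str, Check]) -> int:
    """Run every check, print each result, and return a process exit code."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception as exc:  # a crashing check counts as a failure
            print(f"FAIL {name}: {exc}")
            ok = False
        else:
            print(f"{'ok  ' if ok else 'FAIL'} {name}")
        if not ok:
            failed.append(name)
    return 1 if failed else 0


def example_checks() -> dict[str, Check]:
    """Hypothetical checks; real ones would mirror the documented services."""
    return {
        "postgres accepts connections": lambda: tcp_port_open("localhost", 5432),
        "redis accepts connections": lambda: tcp_port_open("localhost", 6379),
    }
```

A script entry point would call sys.exit(run_validation(example_checks())), so that onboarding instructions and CI can both rely on the same exit code.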
Establish a time-to-first-PR target: the maximum elapsed calendar time from a new engineer's first day to the point at which their first PR is opened for review. The elapsed time includes environment setup and codebase familiarization but not the review cycle for the PR itself. A team that values rapid onboarding should target two to three days. A team with a complex local environment that genuinely requires longer setup may target five days but should document the reason the target is five rather than three, and should document the condition under which a future team would revise the target downward (typically: migrating to a higher-reproducibility environment strategy that reduces the manual-step surface). Without a documented target, the time-to-first-PR is an unmeasured quantity that grows with the environment's complexity and with the accumulation of undocumented manual steps.
Document the escalation path when a new engineer cannot complete the setup using the documented steps. The escalation path should include: the first contact (typically the engineer most recently onboarded, whose memory of the current state of the setup is freshest), the maximum time before escalation (a new engineer blocked for more than two hours on an undocumented step should escalate — the block indicates either a documentation gap or an environment bug that affects all engineers who do not yet have the workaround), and the follow-up responsibility (whoever unblocks the new engineer is responsible for either updating the documentation or opening an issue to fix the environment bug before the next onboarding). Without a documented escalation path and follow-up responsibility, undocumented steps accumulate — each new engineer encounters the same gaps but resolves them ad-hoc, without updating the shared documentation, because no process requires it.
4. Service dependency management and local overrides
Document the approach to managing service dependencies in the local environment. For Docker Compose-based environments, document the versioning policy for service images: whether service images are pinned to specific version tags (deterministic, but requires manual updates), pinned to major-version-only tags (semi-deterministic, updated automatically on minor/patch releases but not on major version bumps), or floating on latest (non-deterministic, updated automatically to whatever the registry considers latest, which changes without notice). Floating on latest optimizes for always having the newest service version at the cost of non-reproducible environments: two engineers who run docker-compose pull on different days may end up with different service versions, and a bug that appears in one engineer's environment but not another's may trace back to a service version difference introduced by a registry update, not a code change.
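The three tag policies above can be told apart from the image reference alone, which makes the policy checkable. The sketch below classifies a reference as pinned, major-only, or floating. The regular expressions are a heuristic assumption about common tag shapes, not a registry rule, and even a full version tag can be re-pushed by its publisher; only a digest reference is truly immutable.

```python
import re

MAJOR_ONLY = re.compile(r"v?\d+(-[\w.]+)?")          # 15, 15-alpine
FULL_VERSION = re.compile(r"v?\d+(\.\d+)+(-[\w.]+)?")  # 15.4, 7.2.4-alpine


def tag_policy(image: str) -> str:
    """Classify an image reference as 'pinned', 'major-only', or 'floating'."""
    if "@" in image:  # digest reference such as name@sha256:...
        return "pinned"
    _, sep, tag = image.rpartition(":")
    # No tag at all, or the colon belonged to a registry port (host:5000/app).
    if not sep or "/" in tag:
        return "floating"
    if FULL_VERSION.fullmatch(tag):
        return "pinned"
    if MAJOR_ONLY.fullmatch(tag):
        return "major-only"
    return "floating"  # latest, stable, alpine, and anything unrecognized


for ref in ["postgres:15.4", "postgres:15", "postgres:latest", "postgres"]:
    print(f"{ref}: {tag_policy(ref)}")
```

A check like this can fail the build when an image reference violates whichever policy the decision record names.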
Document the secrets management approach for local development. Production secrets are stored in a secrets manager; local development requires equivalent credentials for services that require authentication (database, cache, external APIs). Document whether local development uses: static plaintext values in a .env.example file that are committed and safe for local use only (simplest, but creates a habit of treating secrets as plaintext that conflicts with production secrets hygiene); a local secrets manager instance (higher fidelity with production, but adds a service to the local setup that must be running before the application starts); or a secrets manager SDK configured to use a development-mode local store (closest to production authentication patterns, but requires SDK integration in the application's startup path). The choice affects not just local setup but the application's secrets handling code — an application that reads secrets from environment variables in development and from a secrets manager SDK in production has two code paths for the same operation, which is a source of bugs if the paths diverge.
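One way to avoid the two-code-path problem is to have the application depend on a small interface and swap only the backend. The sketch below is an illustration under assumptions: SecretSource, EnvSecretSource, and InMemorySecretSource are hypothetical names, and the in-memory class stands in for whatever secrets manager client or development-mode store a team actually uses.

```python
import os
from typing import Mapping, Protocol


class SecretSource(Protocol):
    def get(self, name: str) -> str: ...


class EnvSecretSource:
    """Reads secrets from environment variables (or any mapping)."""

    def __init__(self, environ: Mapping[str, str] = os.environ) -> None:
        self._environ = environ

    def get(self, name: str) -> str:
        try:
            return self._environ[name]
        except KeyError:
            raise LookupError(f"secret {name!r} is not set") from None


class InMemorySecretSource:
    """Stand-in for a secrets manager client or development-mode store."""

    def __init__(self, values: Mapping[str, str]) -> None:
        self._values = dict(values)

    def get(self, name: str) -> str:
        try:
            return self._values[name]
        except KeyError:
            raise LookupError(f"secret {name!r} is not set") from None


def database_url(secrets: SecretSource) -> str:
    """Application code: identical regardless of where the secret lives."""
    password = secrets.get("DB_PASSWORD")
    return f"postgresql://app:{password}@localhost:5432/app"
```

With this shape, the only difference between development and production is which SecretSource is constructed at startup; database_url and every other caller run the same code in both.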
Document the database seeding strategy. Local development requires a populated database to exercise the application's behavior; the database migration strategy decision record defines the migration tooling, and the developer experience decision record defines what data is loaded after migrations run. Options include: a seed script that inserts a minimal set of records sufficient to exercise the application's main flows; a seed script that inserts a larger set of records mimicking a realistic usage volume; an anonymized production snapshot loaded via a restore script; or no seeding, with engineers creating data manually as needed. The seeding strategy affects the categories of bugs that are detectable locally: a minimal seed does not exercise pagination behavior, high-volume query performance, or data distribution edge cases; an anonymized production snapshot exercises all of these but requires a pipeline to produce, anonymize, and distribute the snapshot regularly, and introduces a parity gap if the snapshot's schema version lags the current migration head.
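The difference between a minimal seed and a volume seed shows up as soon as pagination is involved. The sketch below uses an in-memory SQLite database purely as a stand-in so the example runs anywhere; the table, the profile names, and the row counts are assumptions, and in a real setup the table would be created by migrations rather than by the seed script.

```python
import sqlite3

PROFILES = {"minimal": 5, "volume": 500}  # hypothetical row counts


def seed(conn: sqlite3.Connection, profile: str) -> int:
    """Insert deterministic customer rows for a profile; return the row count."""
    count = PROFILES[profile]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO customers (id, email) VALUES (?, ?)",
        [(i, f"user{i}@example.test") for i in range(1, count + 1)],
    )
    conn.commit()
    return count


def page_count(conn: sqlite3.Connection, page_size: int = 25) -> int:
    """Number of pages a list view would need at the given page size."""
    (total,) = conn.execute("SELECT COUNT(*) FROM customers").fetchone()
    return -(-total // page_size)  # ceiling division


for profile in PROFILES:
    conn = sqlite3.connect(":memory:")
    seed(conn, profile)
    print(f"{profile}: {page_count(conn)} page(s)")
    conn.close()
```

With the minimal profile every list fits on one page, so any bug in the second-page code path is invisible locally; the volume profile exercises it. Because the rows are generated deterministically, two engineers seeding on different days get identical data.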
5. Environment maintenance ownership and version drift policy
Assign explicit ownership of the local development environment configuration as a maintained software artifact. In a team with a platform or infrastructure engineering function, the local environment configuration is typically owned by that function alongside the CI/CD configuration and the production infrastructure definitions — the CI/CD pipeline decision record and the infrastructure-as-code strategy decision record govern the production infrastructure, and the developer experience decision record governs the local simulation of that infrastructure. In a smaller team without a dedicated platform function, assign ownership to the engineer most recently responsible for infrastructure changes, and document the handoff process when that engineer transitions to a different role.
Document the version update cadence and trigger conditions. Proactive updates run on a fixed schedule (typically monthly, aligned with the production dependency update cadence) and consist of bumping service versions in the local environment configuration to match the current production versions, running the validation command to confirm the application still works with the updated versions, and committing the update with a changelog entry. Reactive updates are triggered by specific conditions: a new engineer cannot complete onboarding because a required image version is no longer available in the registry; a production incident is traced to a local environment parity gap; the application gains a new dependency that is not present in the local environment configuration. Document both triggers in the decision record so that the maintenance cadence is explicit and the conditions that warrant an unscheduled update are clear.
Document the policy for handling environment divergence between engineers during a major version migration. When the local environment is being updated from Postgres 13 to Postgres 15 (to match a production migration), some engineers will have already updated their local environments and some will not. The application code that runs against Postgres 15 locally may behave differently from the application code that runs against Postgres 13 locally if the code uses features or behaviors that differ between versions. Document the maximum window during which split-version local environments are acceptable — typically one sprint — and the process for communicating to the team that the local environment update is required. Without a documented window and a communication process, the split persists indefinitely, and reports of "this test fails on my machine but passes on yours" become impossible to diagnose without first establishing which Postgres version each machine is running. The observability platform decision record defines production monitoring; the developer experience decision record should include a note on whether local development generates observable signals — logs, traces, metrics — in the same format as production, so that engineers can use the same diagnostic tooling locally and in production rather than maintaining separate local debugging workflows.
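The split-version window described above can be monitored with very little machinery. The sketch below assumes each engineer's local server version has been collected somehow (for example from the output of SHOW server_version in Postgres), groups engineers by major version, and flags a split that has outlived the window. The names, dates, and the 14-day default (one two-week sprint) are illustrative assumptions.

```python
from collections import defaultdict
from datetime import date


def split_version_report(
    reported: dict[str, str],
    migration_started: date,
    today: date,
    window_days: int = 14,
) -> tuple[dict[str, list[str]], bool]:
    """Group engineers by major version; flag a split older than the window."""
    by_version: dict[str, list[str]] = defaultdict(list)
    for engineer, version in sorted(reported.items()):
        by_version[version.split(".")[0]].append(engineer)
    is_split = len(by_version) > 1
    overdue = is_split and (today - migration_started).days > window_days
    return dict(by_version), overdue


# Hypothetical team state 18 days into a Postgres 13 -> 15 migration.
reported = {"ana": "15.4", "ben": "13.12", "chen": "15.4"}
groups, overdue = split_version_report(
    reported, migration_started=date(2026, 3, 2), today=date(2026, 3, 20)
)
print(groups, "window exceeded" if overdue else "within window")
```

The grouping also answers the first diagnostic question for any "fails on my machine" report: which version each machine is actually running.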
The FAQ section from your own AI chat history
The developer experience decision record belongs to the class of decisions that are often made in an AI chat conversation rather than in a synchronous meeting. "What's the best local development setup for a Node.js + Postgres + Redis stack?" is a natural ChatGPT or Claude question. The answer, including the trade-off reasoning, the specific version recommendations, the alternatives-considered section, and the caveats about M-series Mac compatibility, lives in that conversation, not in any document. The decisions that never get written down are disproportionately the decisions made via AI chat: they get thorough reasoning in the moment, from an interlocutor who can enumerate the alternatives and their trade-offs, and they produce zero durable documentation unless the engineer explicitly creates it.
The WhyChose open-source extractor recovers exactly these decisions: the conversation where you asked Claude why Dev Containers versus Docker Compose, where you discussed the M-series compatibility concerns, where you worked through the database seeding strategy options. The developer experience decision record begins with the reasoning from that conversation, structured into the five ADR sections above, and committed to the repository where it can be maintained alongside the environment configuration it documents. The alternative is four days of onboarding for the next engineer who joins after the person who had the original conversation has moved on.
Further reading
- The CI/CD pipeline decision record — the CI environment should mirror the local development environment's service versions closely enough that bugs reproducible in CI are also reproducible locally; the CI pipeline decision record and the developer experience decision record together define the full test environment architecture across local, CI, and production
- The container orchestration decision record — the production container orchestration topology (Kubernetes, ECS, Nomad) defines the networking, service discovery, and resource constraint model that the local development environment must approximate; the fidelity of the local approximation determines the categories of orchestration-dependent bugs that are detectable before merge
- The service mesh decision record — local development environments rarely replicate the service mesh's mTLS, traffic shaping, and circuit-breaking behaviors; documenting the parity gap between local service communication and production service mesh behavior makes explicit which network-level failure modes are not exercised until staging or production
- The secrets management decision record — the local development approach to secrets (plaintext .env files vs. local secrets manager instance vs. SDK-configured development mode) determines whether the application's secrets handling code has one code path or two, and whether the local environment builds the habit of treating secrets as secrets or as configuration values
- The database migration strategy decision record — migration tooling and migration application procedures are a core part of local environment setup; a migration strategy that requires manual steps beyond npm run migrate or alembic upgrade head adds to the onboarding manual-step surface and increases the probability of a new engineer starting with a schema that differs from the current migration head
- The build system decision record — the local build system is the primary tool a developer uses dozens of times per day; its incremental build behavior, its remote cache configuration, and its affected-package detection strategy directly determine how long a developer waits between making a change and seeing whether the change works, which is the most frequent feedback loop in local development
- The test strategy decision record — the test strategy defines what runs in local development versus CI versus staging; the developer experience decision record defines whether the local environment can actually run the full test suite (all required services present and running at the correct versions) or only a subset (unit tests, with integration tests deferred to CI)
- The feature flag decision record — local development needs a working feature flag evaluation path; the choice of local flag evaluation (SDK-in-local-evaluation-mode with a static flag file vs. network calls to a flag service running locally vs. environment variables overriding flag values) affects both setup complexity and the fidelity of local testing for flag-dependent code paths
- The logging strategy decision record — if production logs are structured JSON shipped to a log aggregation service, but local logs are pretty-printed console output, the local development environment does not exercise the logging code path that production uses; a log format bug or a missing field in the structured log output is not detectable locally, only in production
- The infrastructure-as-code strategy decision record — IaC definitions describe the production environment; the developer experience decision record describes the local simulation of that environment; the two should be read together by any engineer who wants to understand the full set of differences between the environment they work in daily and the environment their code runs in production
- The observability platform decision record — a local development environment that produces traces, metrics, and logs in the same format as production enables engineers to develop and test observability instrumentation locally; a local environment that produces console output only means that observability bugs are not found until the code is deployed to a production-equivalent environment
- The database connection pooling decision record — connection pooling behavior differs between a local single-Postgres-instance environment and a production environment with a connection pooler (PgBouncer, RDS Proxy) in front of it; connection exhaustion bugs and prepared statement caching bugs that depend on the pooler's behavior are not reproducible locally if the pooler is absent from the local environment
- The decisions that never get written down — developer experience decisions join the class of consequential undocumented infrastructure choices: made during the first sprint as obvious configuration defaults, discovered as architectural constraints when team growth makes the original decision's assumptions invalid
- The WhyChose open-source extractor — recover the original local environment discussion from your AI chat history, including the Docker Compose vs. Dev Containers trade-off analysis and the M-series compatibility concerns that informed the original choice before anyone wrote them down