TCP #127: The ADR template that turns architecture into a written contract

Decision context, tradeoffs, risk, cost, compliance, and rollback captured in a format leadership can read.

Jun 21, 2026

Architecture Decision Records are not new.

The format has been documented for over a decade. Most engineering organizations have adopted it. Most engineering organizations also continue to ship architectures that look fine on day one and become liabilities by month 18.

The ADR is not the problem. The way most ADRs are written is.

Most ADR templates capture three things: context, decision, consequences.

The format is clean, the discipline is real, but the output is rarely actionable.

The “consequences” section is a paragraph of speculation. The decision is recorded. The risks are not modeled. The architecture ships, the document is filed, and the next team to touch the system has no instrument for evaluating whether the original decision still holds.

A well-designed ADR template fixes this.

It forces the team to write down what would change the decision, what the cost trajectory looks like, what compliance scope is closed off, and what the rollback path is.

The template does not produce better decisions because the team is smarter. It produces better decisions because the format will not accept the easy answers.

What an ADR Is For

An ADR is a written contract between the team that makes a decision and the team that will inherit it.

The contract states: this is what we knew, this is what we evaluated, this is what we chose, this is what we accepted. The future team reads it and knows whether the conditions that justified the decision still hold. If they do, the architecture continues. If they do not, the team has the documentation to argue for change.

An ADR is not a justification document. It is not a sales pitch for the decision the team has already made. The most useful ADRs document the alternatives the team rejected with the same rigor as the alternative the team accepted. A reader who cannot tell from the ADR why the second-place option lost is reading a document that failed to do its job.

An ADR is also not a one-time artifact. It has a lifecycle: proposed, accepted, deprecated, superseded. A team that accepts an ADR commits to revisiting it on a defined cadence and updating its status when the conditions change.

The discipline of writing an ADR slowly compounds organizational memory. Decisions that would have been forgotten and re-debated three years later instead become the foundation for the next decision.

The Eleven-Section ADR Template

The template below adds structure to the canonical Michael Nygard ADR format, which covers title, status, context, decision, and consequences. The additional structure is what produces the behavioral change.

1. Title

2. Status

3. Context

4. Decision

5. Alternatives Considered

6. Operational Risk

7. Cost Trajectory

8. Compliance and Tenancy Impact

9. Rollback Path

10. Review Cadence

11. Related ADRs

Each section has a defined purpose, a length target, and a quality bar. ADRs that fail to meet the quality bar are returned for revision before acceptance.
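The eleven headings can be stamped into a blank file so that no section is skipped by accident. The sketch below is one minimal way to do that. The function name, the heading layout, and the ADR-NNN zero-padding are assumptions for illustration; only the section names come from the list above.

```python
# Section names are taken from the eleven-section list above.
# Everything else (function name, layout) is an illustrative assumption.
SECTIONS = [
    "Title",
    "Status",
    "Context",
    "Decision",
    "Alternatives Considered",
    "Operational Risk",
    "Cost Trajectory",
    "Compliance and Tenancy Impact",
    "Rollback Path",
    "Review Cadence",
    "Related ADRs",
]


def blank_adr(number: int) -> str:
    """Return a plain-text skeleton with one numbered heading per section."""
    lines = [f"ADR-{number:03d}", ""]
    for index, name in enumerate(SECTIONS, start=1):
        lines.append(f"{index}. {name}")
        lines.append("")  # blank line left for the author to fill in
    return "\n".join(lines)


if __name__ == "__main__":
    print(blank_adr(74))
```

A skeleton like this does not enforce the quality bar. It only makes an empty section visible at review time.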

Section-by-Section Specifications

1. Title. A complete sentence stating the decision in the active voice. Not “Compute Choice for Service X.” Instead: “Service X will run on ECS Fargate rather than Lambda or EKS.” The title alone should communicate what was decided.

2. Status. One of: Proposed, Accepted, Deprecated, Superseded by ADR-NNN. Status is dated. A Proposed ADR with no status change in 14 days is escalated to the architecture review forum.

3. Context. Two to four paragraphs. Why is the decision being made now? What problem is being solved? What constraints (deadline, budget, customer commitment, compliance scope) are operating? The reader must be able to judge from the context section alone whether the conditions that drove the decision still hold today, whatever the decision turned out to be.

4. Decision. Two paragraphs maximum. What is being decided? Specific service names, specific configurations, specific topology. Avoid abstraction. A decision that reads “we will use a managed service for queueing” is not a decision. A decision that reads “we will use SQS Standard with a 14-day retention period and a redrive policy that moves messages to a dead-letter queue after 5 receive attempts” is.

5. Alternatives Considered. Three or more alternatives. Each alternative gets a paragraph. The paragraph names the alternative, names two reasons it could have been the right choice, and names the specific reason it was not. Do not write “we considered X but it did not fit.” Write “we considered EKS because three teams already operate it and the operational tooling exists, but rejected it because the workload is single-tenant and bursty and would require a dedicated node group with low utilization.”

6. Operational Risk. Three paragraphs. The first names the most likely failure mode the architecture introduces and the recovery path. The second states the recovery-time objective the architecture supports. The third names the operational burden the team accepts: on-call obligations, required runbooks, and required monitoring instrumentation.

7. Cost Trajectory. A table with three rows: cost at current workload, cost at 10x current workload, cost at the volume the team forecasts in 24 months. Each row shows the monthly figure and the assumption set. Below the table, one paragraph naming the cost cliff (the point at which a different architecture would be cheaper) and the conditions that would trigger reconsideration.

8. Compliance and Tenancy Impact. Three paragraphs. The first names the compliance frameworks the architecture supports today (SOC 2, FedRAMP, HIPAA, ISO 27001, PCI). The second names the compliance frameworks or customer types the architecture excludes or makes harder. The third names the tenancy model the architecture commits to (shared, isolated, hybrid) and what would change if that commitment is revisited.

9. Rollback Path. Two paragraphs. The first names the conditions under which this decision would be reversed. The second names the cost, duration, and team capacity required to reverse it. The rollback path is the most important section to write honestly. A decision with no rollback path is not a decision. It is a one-way door.

10. Review Cadence. A single sentence stating the date the ADR will be reviewed and the trigger for an earlier review. Default cadence: 12 months from acceptance. Earlier triggers: cost trajectory variance above 20 percent, an incident attributable to the decision, a customer or compliance scope change.

11. Related ADRs. A list of ADR numbers with one-line annotations describing the relationship: depends on, supersedes, related to, conflicts with. This section is what produces organizational memory. Future teams use the related ADRs to navigate the decision history.
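Parts of the quality bar are mechanical enough to check before a human reviewer reads the document. The sketch below covers four of them: a valid status, the 14-day escalation for Proposed ADRs, the three-alternative minimum, and a non-empty rollback path. It also computes the default review date. The `AdrDraft` class and its field names are hypothetical; the thresholds come from the specifications above.

```python
from dataclasses import dataclass, field
from datetime import date

VALID_STATUSES = ("Proposed", "Accepted", "Deprecated", "Superseded")
ESCALATION_DAYS = 14   # Proposed with no status change for this long
MIN_ALTERNATIVES = 3   # "Three or more alternatives"


@dataclass
class AdrDraft:
    """Hypothetical in-memory form of an ADR; field names are assumptions."""
    title: str
    status: str               # e.g. "Proposed" or "Superseded by ADR-073"
    status_date: date
    alternatives: list[str] = field(default_factory=list)
    rollback_path: str = ""


def status_kind(status: str) -> str:
    """Collapse 'Superseded by ADR-NNN' to 'Superseded'."""
    return "Superseded" if status.startswith("Superseded by ADR-") else status


def quality_issues(adr: AdrDraft, today: date) -> list[str]:
    """Return the mechanical problems found; an empty list means none."""
    issues = []
    kind = status_kind(adr.status)
    if kind not in VALID_STATUSES:
        issues.append(f"unknown status: {adr.status!r}")
    if kind == "Proposed" and (today - adr.status_date).days >= ESCALATION_DAYS:
        issues.append("Proposed for 14 days or more: escalate to the review forum")
    if len(adr.alternatives) < MIN_ALTERNATIVES:
        issues.append("fewer than three alternatives considered")
    if not adr.rollback_path.strip():
        issues.append("rollback path is empty")
    return issues


def default_review_date(accepted_on: date) -> date:
    """Default cadence: 12 months from acceptance."""
    try:
        return accepted_on.replace(year=accepted_on.year + 1)
    except ValueError:  # accepted on Feb 29; fall back to Feb 28
        return accepted_on.replace(year=accepted_on.year + 1, day=28)
```

A check like this catches omissions, not weak reasoning. Whether the alternatives were evaluated honestly is still a judgment for the review forum.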

Worked Example: Compute Selection for a New Service

Title: The customer-events ingestion service will run on ECS Fargate rather than Lambda or EKS.

Status: Accepted, 2026-06-10.

Context: The customer-events service ingests webhook events from third-party integrations and writes them to an internal event bus. Forecast volume is 200 RPS sustained, 2,000 RPS burst, with an average payload size of 4KB. The service is expected to be a foundation for customer-facing event replay capabilities, which will increase volume by an estimated 5x within 12 months. The team operating this service has two engineers familiar with ECS and one engineer familiar with Lambda. EKS is operated by the platform team, but no application teams currently deploy to it.

Decision: The service runs on ECS Fargate with a minimum of 2 tasks and an autoscaling target of 60% CPU utilization, scaling up to a maximum of 20 tasks. The service is deployed to the platform team’s standard ECS cluster in production with the existing observability stack. The container image is built on the standard Python 3.12 base image and includes the platform team’s standard observability sidecar.
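The scaling numbers in this decision map onto two AWS Application Auto Scaling calls. The sketch below builds the arguments for `register_scalable_target` and `put_scaling_policy` as plain dictionaries so they can be reviewed alongside the ADR. The cluster name, service name, and policy name are hypothetical, since the ADR does not state them, and using CPU target tracking is one reasonable reading of "an autoscaling target of 60% CPU utilization".

```python
# Hypothetical identifiers: the ADR does not name the cluster or the service.
CLUSTER_NAME = "platform-standard"
SERVICE_NAME = "customer-events"

RESOURCE_ID = f"service/{CLUSTER_NAME}/{SERVICE_NAME}"
SCALABLE_DIMENSION = "ecs:service:DesiredCount"


def scalable_target_kwargs(min_tasks: int = 2, max_tasks: int = 20) -> dict:
    """Arguments for Application Auto Scaling's register_scalable_target."""
    return {
        "ServiceNamespace": "ecs",
        "ResourceId": RESOURCE_ID,
        "ScalableDimension": SCALABLE_DIMENSION,
        "MinCapacity": min_tasks,
        "MaxCapacity": max_tasks,
    }


def cpu_policy_kwargs(target_cpu_percent: float = 60.0) -> dict:
    """Arguments for put_scaling_policy with a target-tracking CPU policy."""
    return {
        "PolicyName": f"{SERVICE_NAME}-cpu-target",  # hypothetical name
        "ServiceNamespace": "ecs",
        "ResourceId": RESOURCE_ID,
        "ScalableDimension": SCALABLE_DIMENSION,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu_percent,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    }


# Applying it needs boto3 and AWS credentials, so it is left commented out:
#   import boto3
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**scalable_target_kwargs())
#   client.put_scaling_policy(**cpu_policy_kwargs())
```

Raising the floor during an incident is the same call with a larger minimum, for example `scalable_target_kwargs(min_tasks=6)`.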

Alternatives Considered:

Lambda was considered for the speed-of-shipping advantage and the lack of cluster management overhead. Two engineers had already prototyped the workload on Lambda. Rejected because at the 24-month forecast volume of 1,000 RPS sustained, Lambda costs are 3.4x ECS costs based on internal calculator output, and the cold-start variance was incompatible with the latency SLO of P99 under 200ms.

EKS was considered for consistency with the platform team’s longer-term direction and access to the broader Kubernetes ecosystem. Rejected because the team operating this service has no Kubernetes experience and the workload does not justify the operational learning curve. The platform team confirmed that ECS is supported indefinitely as a first-class option.

A self-managed EC2 fleet was considered briefly for cost predictability at high volume. Rejected because the operational burden of patching, scaling, and logging would consume engineering time disproportionate to the cost savings.

Operational Risk: The most likely failure mode is autoscaling lag during sudden traffic spikes from third-party integration partners. The recovery path is to increase the autoscaling minimum capacity manually via the ECS console; runbook RB-042 documents the procedure. The recovery time objective is 5 minutes for autoscaling-related capacity issues. The team accepts on-call rotation for this service, integration into the existing PagerDuty schedule, and the maintenance of two runbooks.

Cost Trajectory:

| Scenario | Sustained volume | Estimated monthly cost | Assumption |
| --- | --- | --- | --- |
| Current load | 200 RPS | $480 | Average of 4 ECS Fargate tasks running in us-east-1 using the current task configuration. |
| 24-month forecast | 1,000 RPS | $2,200 | Average of 9 ECS Fargate tasks running with the same task sizing and configuration. |
| 10× growth scenario | 2,000 RPS | $4,200 | Average of 18 ECS Fargate tasks running with the same task sizing and configuration. |

The cost cliff is at approximately 4,000 RPS sustained, where moving to ECS on EC2 with reserved instances would produce a 35 percent saving. The trigger for reconsideration is sustained volume above 3,500 RPS for two consecutive months.
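A trigger written this precisely can be checked by a script instead of remembered by a person. The sketch below tests the two-consecutive-month volume condition and computes cost variance against a forecast. The function names and the shape of the input (one sustained-RPS figure per month, oldest first) are assumptions; the 3,500 RPS and two-month thresholds come from the paragraph above.

```python
RECONSIDER_RPS = 3_500    # sustained volume that triggers reconsideration
CONSECUTIVE_MONTHS = 2    # how many months in a row it must be exceeded


def should_reconsider(monthly_sustained_rps: list[float]) -> bool:
    """True if volume stayed above the threshold for two months in a row."""
    streak = 0
    for rps in monthly_sustained_rps:
        streak = streak + 1 if rps > RECONSIDER_RPS else 0
        if streak >= CONSECUTIVE_MONTHS:
            return True
    return False


def cost_variance_pct(forecast: float, actual: float) -> float:
    """Signed variance of actual cost against the forecast, in percent."""
    return (actual - forecast) / forecast * 100


if __name__ == "__main__":
    print(should_reconsider([3_000, 3_600, 3_700]))  # two months above 3,500
    print(should_reconsider([3_600, 3_400, 3_600]))  # streak broken in month 2
    print(cost_variance_pct(forecast=480, actual=600))
```

A single month above the threshold does not fire the trigger. That is deliberate: the ADR asks for sustained volume, not a spike.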

Compliance and Tenancy Impact: This architecture supports SOC 2 Type II without modification using the platform team’s standard logging and access controls. FedRAMP Moderate is supported by the addition of the GovCloud deployment pattern, as documented in ADR-039. The architecture is shared-tenant by default; isolating a specific customer to a dedicated task fleet is a 1-week effort if required.

Rollback Path: Reversing this decision to Lambda would require approximately 3 weeks of engineering effort (rewriting the deployment pipeline, refactoring the service initialization, and recreating the observability instrumentation). Reversing to EKS would require approximately 6 weeks, including team training. Either rollback would require a coordinated cutover with the third-party integration partners. The rollback decision would be triggered by a sustained variance in cost trajectory of more than 30 percent or by a platform-level deprecation of ECS Fargate.

Review Cadence: This ADR will be reviewed on 2027-06-10. Earlier review triggers: sustained volume above 3,500 RPS, cost variance above 20 percent, FedRAMP customer commitment.

Related ADRs: ADR-014 (platform observability standard, depends on), ADR-027 (ECS cluster topology, depends on), ADR-039 (GovCloud deployment pattern, related to).
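If every Related ADRs entry follows the same shape, the decision history becomes something a script can walk. The sketch below parses the line above into structured entries and filters them by relationship. The regular expression assumes each entry reads "ADR-NNN (subject, relationship)" with no comma inside the subject; that convention is inferred from this one example, not stated by the template.

```python
import re

RELATED_ADRS = (
    "ADR-014 (platform observability standard, depends on), "
    "ADR-027 (ECS cluster topology, depends on), "
    "ADR-039 (GovCloud deployment pattern, related to)."
)

# Assumed entry shape: ADR-NNN (subject, relationship)
ENTRY = re.compile(r"ADR-(\d+) \(([^,()]+), ([^()]+)\)")


def parse_related(text: str) -> list[dict]:
    """Turn a Related ADRs line into one dict per referenced ADR."""
    return [
        {
            "adr": f"ADR-{number}",
            "subject": subject.strip(),
            "relationship": relationship.strip(),
        }
        for number, subject, relationship in ENTRY.findall(text)
    ]


def with_relationship(entries: list[dict], relationship: str) -> list[str]:
    """ADR identifiers that carry the given relationship annotation."""
    return [e["adr"] for e in entries if e["relationship"] == relationship]


if __name__ == "__main__":
    entries = parse_related(RELATED_ADRS)
    print(with_relationship(entries, "depends on"))
```

The "depends on" entries are the ones to re-read first when this ADR comes up for review, because a change in any of them can invalidate the decision.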

How to Roll Out the ADR Template

Adopting this template across a team is a behavioral change. Three rollout steps reduce friction.

1. Start with one new decision. Do not retroactively author ADRs for existing systems. Pick the next non-trivial architecture decision the team faces and write it in the new format. Have the team review the ADR before approving the decision. The team learns the format by using it on a real decision under real pressure.

2. Treat the first three ADRs as calibration. The first ADR will be too long. The second will be too short. The third will start to feel right. Resist the temptation to fix the first two. The format is the point; the calibration is automatic.

3. Make ADRs visible to leadership. ADRs are not an internal team artifact. They are how the platform team communicates architectural commitment to the rest of the engineering organization. Publish accepted ADRs to a known location and reference them in monthly business review (MBR) documentation. Leadership that can see the ADRs makes better resource decisions about the platform team.

What to Measure to Evaluate the ADR Practice

The ADR practice is itself an investment. It should be measured.

Time-to-acceptance for new ADRs. From Proposed to Accepted, the target is 7 business days. Longer than that indicates either the decision is not ready to be made or the review forum is not meeting often enough.
Percentage of architecture decisions documented as ADRs. Track this manually for the first quarter. Target: above 80 percent of decisions touching shared infrastructure, compliance scope, or production tenancy. Below 60 percent indicates the format is being seen as overhead and worked around.
Frequency of ADR reference in subsequent decisions. Count how often new ADRs cite prior ADRs in the Related ADRs section. This is the leading indicator that organizational memory is compounding. Target: at least 60 percent of new ADRs reference at least one prior ADR by month 6.
Outcome accuracy. When an ADR comes up for review, evaluate how well the original cost trajectory, operational risk, and compliance impact predictions matched reality. Predictions that miss by more than 30 percent indicate the team is not modeling rigorously enough at decision time.
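Three of these metrics reduce to small calculations. The sketch below counts business days from Proposed to Accepted, computes the share of new ADRs that cite a prior ADR, and flags a prediction that missed by more than 30 percent. Function names and input shapes are assumptions, and the business-day count ignores public holidays.

```python
from datetime import date, timedelta

TARGET_BUSINESS_DAYS = 7      # Proposed -> Accepted
TARGET_REFERENCE_RATE = 0.60  # share of new ADRs citing a prior ADR
MISS_THRESHOLD = 0.30         # prediction error that signals weak modeling


def business_days_between(proposed: date, accepted: date) -> int:
    """Count weekdays after `proposed`, up to and including `accepted`."""
    count = 0
    current = proposed
    while current < accepted:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            count += 1
    return count


def reference_rate(prior_adrs_cited: list[int]) -> float:
    """Share of new ADRs that cite at least one prior ADR.

    `prior_adrs_cited` holds one count per new ADR.
    """
    if not prior_adrs_cited:
        return 0.0
    citing = sum(1 for n in prior_adrs_cited if n > 0)
    return citing / len(prior_adrs_cited)


def prediction_missed(predicted: float, actual: float) -> bool:
    """True when actual is off by more than 30 percent of the prediction."""
    return abs(actual - predicted) / predicted > MISS_THRESHOLD


if __name__ == "__main__":
    days = business_days_between(date(2026, 6, 1), date(2026, 6, 10))
    print(days, days <= TARGET_BUSINESS_DAYS)
    print(reference_rate([2, 0, 1, 0, 3]) >= TARGET_REFERENCE_RATE)
    print(prediction_missed(predicted=2_200, actual=3_000))
```

The fourth metric, percentage of decisions documented, needs a manual count of decisions that never became ADRs, so it is left out of the sketch.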

Review these metrics in the platform's monthly operating review. The ADR practice that is not measured is the practice that will be abandoned in the next quarter under deadline pressure.

When ADRs Become the Architecture Conversation

A mature ADR practice produces a specific change in how the engineering organization talks about architecture.

Discussions move from speculation to documents. “What did we decide about X?” becomes “let me pull up ADR-073.”

Disagreements become more productive: the team is arguing about the document, not about memories. New engineers ramp faster: the architecture has a written history, and the history reads as a sequence of decisions rather than a state of the world.

Most importantly, the architecture stops accumulating decisions that nobody can defend. The ADR forces every load-bearing choice through a format that requires defense at the moment of the choice. Choices that cannot be defended in the format are reconsidered before they ship.

That’s it for today!

Until next week — Amrut

The Cloud Playbook
