<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The Cloud Playbook]]></title><description><![CDATA[Most platform teams optimize for speed. The best ones optimize for predictability on AWS in high‑stakes environments.]]></description><link>https://www.thecloudplaybook.com</link><image><url>https://substackcdn.com/image/fetch/$s_!7MI5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png</url><title>The Cloud Playbook</title><link>https://www.thecloudplaybook.com</link></image><generator>Substack</generator><lastBuildDate>Thu, 30 Apr 2026 06:19:13 GMT</lastBuildDate><atom:link href="https://www.thecloudplaybook.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Amrut Patil]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[thecloudplaybook@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[thecloudplaybook@substack.com]]></itunes:email><itunes:name><![CDATA[Amrut Patil]]></itunes:name></itunes:owner><itunes:author><![CDATA[Amrut Patil]]></itunes:author><googleplay:owner><![CDATA[thecloudplaybook@substack.com]]></googleplay:owner><googleplay:email><![CDATA[thecloudplaybook@substack.com]]></googleplay:email><googleplay:author><![CDATA[Amrut Patil]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[TCP #117: Your platform team doesn’t have a capacity problem.]]></title><description><![CDATA[4 structure checks to recover 30&#8211;40% of their time without hiring.]]></description><link>https://www.thecloudplaybook.com/p/platform-team-bottleneck</link><guid 
isPermaLink="false">https://www.thecloudplaybook.com/p/platform-team-bottleneck</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 26 Apr 2026 14:21:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xxQU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Requests pile up. Developers escalate to their managers, who escalate to platform leadership.</p><p>The SLA misses compound. Engineers work hard and still fall behind.</p><p>Every VP who sees this situation reaches the same conclusion: the platform team needs more headcount.</p><p>That conclusion is almost always wrong.</p><p>Platform team bottlenecks do not come from teams that are too small. They come from work that arrives through unclear intake channels, gets routed to ambiguous owners, and waits for undocumented approvals.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xxQU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xxQU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png 424w, https://substackcdn.com/image/fetch/$s_!xxQU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png 848w, 
https://substackcdn.com/image/fetch/$s_!xxQU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png 1272w, https://substackcdn.com/image/fetch/$s_!xxQU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xxQU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png" width="1122" height="1402" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1402,&quot;width&quot;:1122,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1492691,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/193221322?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xxQU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png 424w, 
https://substackcdn.com/image/fetch/$s_!xxQU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png 848w, https://substackcdn.com/image/fetch/$s_!xxQU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png 1272w, https://substackcdn.com/image/fetch/$s_!xxQU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4d86b01-1368-4adf-8e0b-e116e611887d_1122x1402.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<div><hr></div><h2>Why Faster Ticket Response Does Not Fix the Platform Engineering Bottleneck</h2><p>The reflexive response to a growing platform team backlog is to optimize throughput.</p><p>Run intake meetings twice a week instead of once. Add a triage rotation. Write SLA targets. Bring in a TPM to route requests. Some leaders introduce a tiered priority system: P0 gets a 24-hour response, P1 gets a five-day response, and P2 gets a two-week response.</p><p>Each change makes the intake process marginally more efficient. None of them fixes what actually causes requests to stall.</p><p>They do not tell a developer where to submit a request when their Slack message from two weeks ago went unanswered. They do not clarify which approval an engineer needs to unblock a security exception. They do not identify who owns an ambiguous request when it lands in the queue with no routing context.</p><p>Adding a priority label to an unrouted request does not route it.</p><p>Faster throughput into an unclear structure is still unclear structure.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The Cloud Playbook is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>Four Structural Gaps That Make Platform Teams a Bottleneck</h2><p>Platform engineering scalability problems follow a consistent pattern: four structural elements are usually missing at once.</p><ol><li><p><strong>Intake clarity.</strong> There is no single, well-defined path to request platform work. Some teams submit tickets. Others use Slack. Some skip both and corner a platform engineer directly. </p><p>Because the platform team intake process is informal, everything arrives marked urgent. The team cannot distinguish a genuine blocker from a request that can wait two weeks.</p></li><li><p><strong>Routing clarity.</strong> Once a request lands, no one is certain who will handle it. The team is large enough that ownership is ambiguous.</p><p>Requests get forwarded, sit in limbo, or wait for whoever happens to know the most about that area. There is no platform team request routing logic written down anywhere.</p></li><li><p><strong>Approval clarity.</strong> New infrastructure, security exceptions, and networking changes: each requires sign-off. But the approval chain is not documented.</p><p>Requests stall while engineers chase the right approver. Without a defined process, there is no predictable SLA for anything requiring sign-off, and every blocked request becomes a separate escalation path.</p></li><li><p><strong>Ownership clarity.</strong> When something breaks or a decision needs to be made, &#8220;Who owns this?&#8221; takes too long to answer. 
If developer platform ownership is ambiguous during normal operations, it becomes a crisis under pressure. Every incident starts with a 20-minute conversation that should take 90 seconds.</p></li></ol><p>These four gaps appear to be a capacity problem from the outside. Inside, they feel like everyone is working hard, but nothing is moving.</p><p>Adding engineers to this structure does not fix it. It replicates it. Each new hire spends their first months navigating the same ambiguity the current team has learned to live with.</p><div><hr></div><h2>The Four Questions That Confirm a Structure Problem</h2><p>Before approving a headcount requisition, run this diagnostic.</p><ol><li><p>If a developer needs a new service account today, do they know exactly where to submit the request? Or does the answer depend on who they know?</p></li><li><p>When a request arrives, can your platform engineer identify the owner in under five minutes without asking three colleagues?</p></li><li><p>For a security exception request, can you name the approver and the expected response time right now, without looking it up?</p></li><li><p>If you ask five engineers on your platform team, &#8220;Who owns the API gateway?&#8221; do you get the same answer within five minutes?</p></li></ol><p>One &#8220;it depends&#8221; in those answers means you have a platform team structure problem, not a headcount problem. Hiring more engineers will not change those answers.</p><div><hr></div><h2>How to Make Platform Team Structure Explicit Before You Hire</h2><p>These structural fixes cost less than a single hire and last longer than any retrospective.</p><ul><li><p><strong>Define one intake channel.</strong> One Slack channel. One ticket form. One entry point for all requests.</p><p>Not &#8220;it depends on the request type.&#8221; One place. 
This makes the queue visible and eliminates the parallel-path problem where the same work gets started twice by two people who each received a slightly different version of the request.</p></li><li><p><strong>Build a routing matrix.</strong> For each request category, define who handles it by role, not name.</p><p>New service account: Platform Infrastructure team, reviewed Mondays. Security exception: Security guild plus Platform lead, SLA 5 business days. The matrix need not be complex. It needs to exist.</p></li><li><p><strong>Document the approval chain.</strong> For every request type requiring sign-off, name the role and the expected turnaround. Post it in your intake channel.</p><p>Approvals do not need to be fast. They need to be predictable.</p></li><li><p><strong>Assign single owners.</strong> Every platform component, every shared service, every critical decision needs one named person, not a team. Ownership rotates on a schedule. The clarity does not.</p></li></ul><p>The goal is not to eliminate judgment from the platform team. It is to remove the structural overhead that consumes judgment before real work begins. When intake, routing, approvals, and ownership are clear, engineers spend more time engineering.</p><div><hr></div><h3>Run this check this week</h3><p>Pull the last five platform requests that missed your SLA.</p><p>For each one, trace its entry into the system, its routing, who needed to approve it, and at which step it stopped moving.</p><p>That step is your structural gap. Fix it before opening a headcount requisition.</p><p>Teams that define intake, routing, and ownership before their next hire recover 30 to 40 percent of effective capacity without adding a single engineer. 
That is the capacity that the structural ambiguity was absorbing.</p><p>Every time I have traced a chronic platform team backlog to its root cause, the issue was structural: a missing routing matrix, an undocumented approval chain, and no one who could answer &#8220;who owns the API gateway&#8221; in under thirty seconds. </p><p>The team was not too small. The structure was invisible.</p><div><hr></div><h3>Upgrade If You Need Implementation, Not Just Ideas</h3><p>If you&#8217;re using these emails to guide real decisions on your platform, you&#8217;ll get more leverage from the paid version of The Cloud Playbook.</p><p>The free newsletter gives you patterns and language.</p><p>The paid newsletter turns those patterns into implementation kits you can ship inside a quarter:</p><ul><li><p>Concrete rollout plans (90&#8209;day roadmaps for each pattern)</p></li><li><p>Templates and checklists (policies, runbooks, tagging schemes, review checklists)</p></li><li><p>Real examples from high&#8209;stakes AWS environments (what we actually shipped and why)</p></li></ul><p>If the paid side doesn&#8217;t save you more than the subscription in <strong>one</strong> incident, audit cycle, or bad migration you avoid, you should cancel and keep the playbooks.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade to the Paid Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade to the Paid Cloud Playbook</span></a></p><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #116: Most teams don't have a technical debt problem.]]></title><description><![CDATA[They have a decision debt problem. 
The distinction changes what you measure, build, and protect.]]></description><link>https://www.thecloudplaybook.com/p/technical-debt-vs-decision-debt-platform-engineering</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/technical-debt-vs-decision-debt-platform-engineering</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 12 Apr 2026 14:30:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Zf9E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Most engineering leaders talk about technical debt as if it is a coding problem.</p><p>It is not.</p><p>The systems that break expensively, the ones that consume quarters of remediation work, delay IPOs, and create the audit findings nobody can explain, almost never break because the code was bad.</p><p>They break because the decisions behind the code were never documented.</p><p>That is a different problem. 
And it requires a different fix.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Zf9E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zf9E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png 424w, https://substackcdn.com/image/fetch/$s_!Zf9E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png 848w, https://substackcdn.com/image/fetch/$s_!Zf9E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png 1272w, https://substackcdn.com/image/fetch/$s_!Zf9E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Zf9E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png" width="1456" height="407" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:407,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1787664,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190867678?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zf9E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png 424w, https://substackcdn.com/image/fetch/$s_!Zf9E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png 848w, https://substackcdn.com/image/fetch/$s_!Zf9E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png 1272w, https://substackcdn.com/image/fetch/$s_!Zf9E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09ac1b8-33af-4d9f-a2c6-8d00d230521e_1948x544.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<h2>How Most Engineering Leaders Frame The Problem</h2><p>Technical debt is the dominant frame for platform problems.</p><p>It is a useful shortcut. But it points to the wrong layer.</p><p>When leaders say &#8220;we have technical debt,&#8221; they usually mean one of three things: </p><ul><li><p>The codebase is harder to change than it should be</p></li><li><p>The system is harder to reason about than it should be, or </p></li><li><p>The architecture does not match the current scale of the organization.</p></li></ul><p>All of those things can be true.</p><p>But they are usually symptoms of a deeper problem: the team cannot explain why the system works the way it does, because the decisions that produced it were made informally, without documentation, by people who may no longer be at the company.</p><p>The technical debt is real. 
But it is downstream of the decision debt that created it.</p><h2>Why Framing This As Technical Debt Produces The Wrong Fix</h2><p>The technical debt frame leads to the same response: allocate engineering time to refactor, migrate, or modernize. That work often creates genuine value.</p><p>It does not prevent the same problem from recurring.</p><p>Teams that pay down technical debt without addressing the decision practices that created it are running a maintenance loop. They clean up the current accumulation. New decision debt creates new technical debt. The cycle repeats.</p><p>The root issue is structural: most engineering organizations do not treat decisions as artifacts that need to be created, stored, and maintained. They treat decisions as conversations that produce outcomes.</p><p><strong>Decisions made in conversations evaporate. The outcome, the code, the architecture, and the policy persist.</strong></p><p>The reasoning does not.</p><p>Three months later, a new engineer inherits the system and asks why it works the way it does. 
The answer is: nobody knows.</p><p>That is decision debt.</p><h2>The Better Frame: Decisions Are Durable Artifacts</h2><p>Decision debt is the accumulation of choices that were made but not documented: the AWS account structure rationale, the secrets management approach, the deployment ownership model, and the trade-offs accepted under time pressure.</p><p>Unlike technical debt, decision debt is invisible.</p><p>You cannot run a linter against it. It does not surface in code reviews. It shows up during an audit, a compliance review, a post-incident retrospective, or a due diligence process when someone needs to understand why the platform works the way it does, and no one can answer.</p><p>Reframing from technical debt to decision debt changes what you measure, what you build, and what you protect.</p><h2>What Changes When You See It This Way</h2><p>When you treat decision debt as the primary problem, the fix shifts from code to documentation, but not the kind of documentation most teams write.</p><p>Not README files. Not wiki pages that go stale in 90 days.</p><p>The artifact that matters is a decision record: a brief, durable document that captures what was decided, what the alternatives were, why this option was chosen, and what conditions would cause you to revisit it.</p><p><strong>Architecture Decision Records (ADRs)</strong> are the most common format. The format matters less than the habit.</p><p>Platform teams that practice decision documentation accumulate something more valuable than clean code: they accumulate institutional reasoning.</p><p>When an auditor asks why the platform has three separate IAM policies, the team with decision records can answer in 10 minutes. The team without them spends six weeks reconstructing the rationale.</p><p>When a new CTO joins and asks why the organization chose multi-account over single-account AWS, the team with decision records shows them the 2023 evaluation. 
The team without them shrugs.</p><p>The gap compounds at every leadership transition, every compliance review, and every architecture evolution.</p><h2>One Action: Start The Record</h2><p>Identify the five most consequential platform decisions made in the last 24 months.</p><p>For each, write a single paragraph capturing what was decided, what was rejected, why, and what would cause you to revisit it. Date it. Name the decision owner.</p><p>Store it somewhere that the next engineer and the next auditor can find it.</p><p>That is your starting point for a decision debt practice. It will not eliminate the backlog overnight. But it will stop the accumulation.</p><div><hr></div><h2>What to do this week</h2><p>Pull your last three post-incident retrospectives.</p><p>For each incident, identify whether the root cause was a technical failure or a decision that was made without documentation.</p><p>If you cannot answer that question, the decision record does not exist, and the same incident will recur under different conditions.</p><p>Platform reliability is not a code quality problem. It is a decision quality problem. The documentation is the practice.</p><p><em>Every time I&#8217;ve worked through a platform audit or due diligence process, the hardest questions to answer are not technical. </em></p><p><em>They are: &#8220;Why does this work the way it does?&#8221; and &#8220;Who decided this?&#8221; </em></p><p><em>Teams with decision records answer in minutes. Teams without them answer in months.</em></p><div><hr></div><p>Tools make noise. 
Boundaries create signal.</p><p>I build platforms by drawing the right lines between teams, not by adding more stacks.</p><div><hr></div><p><strong>Upgrade If You Need Implementation, Not Just Ideas</strong></p><p>If you&#8217;re using these emails to guide real decisions on your platform, you&#8217;ll get more leverage from the paid version of The Cloud Playbook.</p><p>The free newsletter gives you patterns and language.</p><p>The paid newsletter turns those patterns into implementation kits you can ship inside a quarter:</p><ul><li><p>Concrete rollout plans (90&#8209;day roadmaps for each pattern)</p></li><li><p>Templates and checklists (policies, runbooks, tagging schemes, review checklists)</p></li><li><p>Real examples from high&#8209;stakes AWS environments (what we actually shipped and why)</p></li></ul><p>If the paid side doesn&#8217;t save you more than the subscription in <strong>one</strong> incident, audit cycle, or bad migration you avoid, you should cancel and keep the playbooks.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade to the Paid Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade to the Paid Cloud Playbook</span></a></p><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #115: You don't have a platform ROI problem.]]></title><description><![CDATA[You have a translation problem. Here's the framework for making platform investment visible in terms executives actually use.]]></description><link>https://www.thecloudplaybook.com/p/platform-investment-roi-engineering-leaders</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/platform-investment-roi-engineering-leaders</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Wed, 08 Apr 2026 16:30:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!eOEm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Most platform engineering leaders know their work creates value.</p><p>Most cannot explain it in terms that survive a budget review.</p><p>When a CFO asks, &#8220;What is the ROI of our platform team?&#8221;, the answer most platform leaders give is a list of what the team built.</p><p>That is not an answer to the question asked.</p><p>The question is not: &#8220;What did you ship?&#8221;</p><p><strong>The question is: &#8220;What changed in the business because you shipped it?&#8221;</strong></p>
<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eOEm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eOEm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!eOEm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png 848w, 
https://substackcdn.com/image/fetch/$s_!eOEm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!eOEm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eOEm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7700970,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190867653?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eOEm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png 424w, 
https://substackcdn.com/image/fetch/$s_!eOEm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!eOEm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!eOEm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47098f-6854-40c5-9d5b-03685b976e37_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2>The Real Fork in the Road for Platform Investment</h2><p>Platform engineering ROI justification typically arrives in one of three situations: annual budget cycles, headcount requests, or post-incident reviews after something expensive broke.</p><p>Each creates a different conversation. All three expose the same gap.</p><p>The decision most platform leaders avoid making explicitly: are we measuring platform investment as engineering spend or as a business capability?</p><p>Engineering spend framing produces a cost conversation.</p><p>Business capability framing produces an investment conversation.</p><p>These are not the same conversation. The frame you choose determines the outcome you get.</p><h2>Platform Investment Arguments: What Works and What Doesn&#8217;t</h2><h3>Option 1: Technical output metrics</h3>
      <p>
          <a href="https://www.thecloudplaybook.com/p/platform-investment-roi-engineering-leaders">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[TCP #114: Buy vs. Build for Compliance Automation: The Decision That Stalls Most Platform Teams]]></title><description><![CDATA[There are three options, not two. Here's which tradeoffs your org can actually absorb.]]></description><link>https://www.thecloudplaybook.com/p/compliance-automation-buy-vs-build</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/compliance-automation-buy-vs-build</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 05 Apr 2026 14:25:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tsDt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every engineering team in a regulated environment eventually hits the same moment.</p><p>Compliance evidence is piling up. Auditors want controls you haven&#8217;t mapped yet. </p><p>Someone says, &#8220;We should automate this.&#8221; And then the room splits: buy a compliance automation platform, or build the tooling yourselves.</p><p>The decision seems tactical. It isn&#8217;t. 
</p><p>The wrong call costs six to eighteen months of engineering time and still leaves gaps in your audit readiness.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tsDt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tsDt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!tsDt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!tsDt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!tsDt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tsDt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7630173,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190867299?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tsDt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!tsDt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!tsDt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!tsDt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bfe628-7126-418a-a705-ca17fa9d8257_2816x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p><strong>Building an IDP from Scratch &#8212; Live 2-day Workshop</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MXLt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MXLt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png 424w, 
https://substackcdn.com/image/fetch/$s_!MXLt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!MXLt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!MXLt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MXLt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png" width="1280" height="640" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:193957,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190867299?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!MXLt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!MXLt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!MXLt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!MXLt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7f6441a-57b0-456c-bfa8-d7292e22d87d_1280x640.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Design and build an Internal Developer Platform that scales and gets adopted. This hands-on, 2-day workshop led by Ajay Chankramath (Founder of Platformetrics, former ThoughtWorks leader, author of Platform Engineer&#8217;s Handbook) covers platform-as-a-product thinking, cloud-native architecture, Infrastructure as Code, automation patterns, and production readiness.<br><br>Ideal for platform engineers, DevOps teams, and engineering leaders building or stabilizing IDPs.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.eventbrite.com/e/building-an-internal-developer-platform-from-scratch-tickets-1978960034736?aff=cloudplaybook&quot;,&quot;text&quot;:&quot;Sign-up today&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.eventbrite.com/e/building-an-internal-developer-platform-from-scratch-tickets-1978960034736?aff=cloudplaybook"><span>Sign-up today</span></a></p><p>Use discount code <strong>CLOUD40</strong> during sign-up to get <strong>40% off</strong></p><div><hr></div><h2>Three Paths On The Table. Most Teams Only See Two.</h2><p>There are three real options for compliance automation.</p><ol><li><p><strong>Buy a compliance automation platform.</strong> Products like Vanta, Drata, or Secureframe sit on top of your cloud infrastructure. They provide automated evidence collection, control mapping, and audit readiness dashboards. You configure integrations. 
They handle framework updates when the standard changes.</p></li><li><p><strong>Build compliance tooling internally.</strong> Your platform team writes custom scripts or a compliance-as-code layer that pulls evidence from your environment, structures it for auditors, and stores it with full version history. You own every line. You control every integration.</p></li><li><p><strong>Hybrid.</strong> Buy a platform for standard controls and automated evidence collection. Build custom tooling only for what the vendor doesn&#8217;t cover: proprietary systems, non-standard integrations, or regulatory requirements outside the vendor&#8217;s framework mappings.</p></li></ol><p>Most engineering leaders treat this as a binary. It isn&#8217;t.</p><div><hr></div><h2>What Each Option Actually Costs You</h2><p><strong>Buying costs flexibility.</strong></p><p>Commercial compliance automation tools map well to recognized frameworks: SOC 2, ISO 27001, HIPAA, and FedRAMP Moderate. They handle common integrations well: AWS, GitHub, Okta, Jira. 
But if your architecture diverges from what the vendor built for, expect gaps.</p><p>When your regulatory requirements don&#8217;t align with the vendor&#8217;s control library, you spend significant time on manual evidence uploads and exception management. Platform engineers end up maintaining the GRC platform instead of building a product.</p><p>That hidden maintenance cost rarely appears in any vendor&#8217;s ROI calculator.</p><p><strong>Building costs time you don&#8217;t have &#8212; and creates a new risk surface.</strong></p><p>Internal compliance tooling gives you full control. You can map exactly to your control set, pull compliance evidence from any system, and structure the output precisely the way your auditors want it.</p><p>But there is a cost most teams don&#8217;t price in: the moment you build your own compliance automation, that codebase becomes part of your risk surface. You need to validate, secure, and audit it as you would any other production system. Your internal compliance tooling is itself a compliance artifact.</p><p>And a credible internal automated evidence collection pipeline, with versioning, access controls, and audit trails, takes three to six months to build and maintain as a first-class capability. Most platform teams don&#8217;t have that capacity during an active audit cycle.</p><p><strong>Hybrid costs coordination.</strong></p><p>You get coverage where the vendor is strong and control where you need custom logic. The cost is maintaining two systems and keeping compliance evidence consistent across both.</p><p>Gaps appear at the seams. Auditors notice when evidence from your internal tooling doesn&#8217;t align with what the compliance platform reports. Reconciling those gaps during an audit is expensive and avoidable.</p><p>No option is clean. 
The question is which tradeoffs your organization can absorb right now.</p><div><hr></div><h2>Why I Default To Buy First And Build Only At The Edges</h2><p>My recommendation is hybrid, weighted toward buy for the first two years.</p><p>Buy a commercial platform for standard framework controls. You are not going to out-engineer a vendor&#8217;s SOC 2 compliance automation mappings. They have mapped thousands of audits. You have mapped one, maybe two. Get automated evidence collection running quickly. Let the vendor handle framework updates when the standard changes.</p><p><strong>Build for the gaps.</strong> If you run on-premises infrastructure, proprietary data pipelines, or a regulated data classification system that vendors don&#8217;t support, build a lightweight evidence collector for those specific controls. Keep it narrow. Keep it maintainable. Resist the urge to expand scope.</p><p>The reason I lean toward buying early is simple: compliance automation is not a core differentiator. Your platform team&#8217;s time is finite.</p><p>Six months spent building an internal evidence pipeline is six months not spent on developer experience, deployment infrastructure, or reliability tooling. Buy the commodity. Build only what the market doesn&#8217;t cover.</p><p>The exception is scale. With more than 500 engineers, or when you operate across multiple regulatory frameworks simultaneously, vendor licensing costs and integration maintenance overhead can exceed what a well-resourced internal team would spend.</p><p>Multi-framework compliance at scale is where commercial platforms often show their seams. 
At that point, building becomes economically rational and strategically worth resourcing.</p><div><hr></div><h2>Two Signals That Flip The Answer Toward Build</h2><p>This framework holds when you are in your first or second audit cycle and your scope maps to a recognized framework.</p><p>Two signals change the answer.</p><p><strong>1/ Your architecture is genuinely non-standard.</strong></p><p>Air-gapped environments, on-premises data processing, proprietary protocols. No compliance automation tool covers these well. You will spend more time managing exceptions than building. In these environments, a commercial platform becomes a workaround, not a solution. Build.</p><p><strong>2/ Compliance is your product.</strong></p><p>If customers buy from you because of your compliance posture, continuous compliance is a core platform capability, not overhead. Build it. Own it. Treat it as a product with its own roadmap and dedicated engineering investment. The compliance platform decision is a competitive question when compliance is what you sell.</p><p>Everything else defaults to buy.</p><div><hr></div><p><strong>Run this check before your next vendor conversation:</strong></p><ol><li><p>List every system in your environment that generates compliance evidence. Flag the ones your shortlisted compliance automation tools do not cover natively.</p></li><li><p>Price the gap: estimate how many engineering hours per quarter you would spend on manual evidence collection for uncovered controls.</p></li><li><p>Calculate the build cost for a lightweight internal evidence collector for just those controls, and factor in the ongoing security and maintenance overhead of owning that codebase. Compare it against the manual overhead.</p></li></ol><p>That comparison tells you whether you are buying a solution or renting a workaround. 
The answer changes how you negotiate, what integrations you demand, and whether you sign at all.</p><p>Teams that run this analysis before signing with a compliance automation vendor reduce their post-implementation integration work by 40 to 60 percent.</p><p>Every team that skips it ends up in the same place: one engineer maintaining a patchwork of scripts and manual uploads six months after go-live, during an active audit.</p><div><hr></div><p><em>If you want the implementation details, I go one level deeper in the <strong>paid Cloud Playbook tier</strong>:</em></p><ul><li><p><em>The exact RFP checklist I use to pressure&#8209;test compliance automation vendors</em></p></li><li><p><em>A build&#8209;vs&#8209;buy spreadsheet you can plug your own engineer costs and audit scope into</em></p></li><li><p><em>Example &#8220;hybrid&#8221; reference architectures that keep vendors in their lane and your team focused on product</em></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. 
(<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share it with your friends and colleagues, or on your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #113: Why your platform team is a bottleneck (and why hiring won’t fix it)]]></title><description><![CDATA[Six engineers, eighty tickets, twelve product teams queued. The problem isn&#8217;t headcount. 
It&#8217;s the org structure nobody touches.]]></description><link>https://www.thecloudplaybook.com/p/platform-team-bottleneck-organizational-structure</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/platform-team-bottleneck-organizational-structure</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Wed, 01 Apr 2026 16:31:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!40sM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f4413-4c84-46f3-847a-fcc3db9c3f2a_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>SIX ENGINEERS. EIGHTY OPEN TICKETS. ONE STRUCTURAL FAILURE.</h2><p>One platform team. Twelve product teams queued behind them.</p><p>Every deploy request was a ticket. Every new AWS account required platform sign-off. Every tool decision got routed through a weekly sync.</p><p>The team had six engineers and eighty open tickets. Leadership called it a headcount problem.</p><p>It wasn&#8217;t.</p>
      <p>
          <a href="https://www.thecloudplaybook.com/p/platform-team-bottleneck-organizational-structure">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[TCP #112: The single question that predicts platform team success]]></title><description><![CDATA[Why &#8220;can a stranger deploy without help?&#8221; predicts adoption better than NPS or toil charts.]]></description><link>https://www.thecloudplaybook.com/p/single-question-predicts-platform-team-success</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/single-question-predicts-platform-team-success</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 29 Mar 2026 14:21:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Y9T4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>If you run a platform or infra team, you only need one question:</p><p><strong>Can a developer I&#8217;ve never met deploy a 
production service without asking anyone for help?</strong></p><p>That is it.</p><p>Not &#8220;what is our deployment frequency?&#8221; </p><p>Not &#8220;what is our platform NPS score?&#8221; </p><p>Not &#8220;how much toil have we eliminated?&#8221;</p><p>Those metrics matter. But they are lagging indicators. They tell you what your platform did last quarter.</p><p>This question tells you what your platform is, right now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y9T4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y9T4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!Y9T4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!Y9T4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!Y9T4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Y9T4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6622315,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190867254?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y9T4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!Y9T4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!Y9T4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!Y9T4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb40c162d-0443-4fd6-9c65-5318b533edcc_2816x1536.png 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2>WHERE THIS CAME FROM</h2><p>I started asking this question after a specific pattern repeated itself across three different organizations.</p><p>Each team had strong DORA metrics. Deployment frequency was high. Lead times were short. On the surface, the platform was working.</p><p>Then we&#8217;d onboard a new team and watch what happened. 
Engineers would read the documentation, hit a wall, open a ticket, wait, get an answer, try again, hit another wall, open another ticket.</p><p>Two weeks of this and they&#8217;d either give up on the platform or build their own path around it.</p><p>The metrics hadn&#8217;t lied. They reflected what existing users could do after months of accumulating tribal knowledge.</p><p>They did not reflect what the platform actually offered to someone arriving cold.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The Cloud Playbook is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>WHY THIS QUESTION PREDICTS WHAT METRICS MISS</h2><p>The deployment frequency metric measures how often your most experienced teams ship.</p><p>The platform NPS score measures whether developers who already use the platform are satisfied with it.</p><p>The toil reduction metric measures how much manual work you eliminated for teams that were doing it manually before.</p><p>All three measure existing behavior in existing users.</p><p>The question &#8220;can a developer I&#8217;ve never met deploy without asking anyone for help&#8221; measures something different. </p><p>It measures the platform&#8217;s legibility to a stranger.</p><p><strong>Legibility is the real test. Not performance. 
Not satisfaction. Not throughput.</strong></p><p><strong>If a new engineer can&#8217;t even find the path, it doesn&#8217;t matter how fast your existing teams can run it.</strong></p><p>A platform that requires tribal knowledge to operate has a known failure mode that it hasn&#8217;t solved. Every new team that joins the organization will pay the onboarding tax. Every engineer who moves between teams will pay it again.</p><p>The tax compounds quietly. It shows up as a two-week onboarding period that should take two days. It shows up as a Slack message that interrupts a senior engineer at 2 pm. It shows up as a ticket queue that the platform team treats as normal operational load, when it is actually evidence that the platform has not done its job.</p><div><hr></div><h2>WHAT TO DO WITH THIS INSIGHT</h2><p>Run the test literally.</p><p>Find a developer who has not used your platform before. Give them your documentation and nothing else. Watch where they stop. Watch what they search for. Watch what they eventually ask a human to explain.</p><p>Every stopping point is a design failure, not a user failure.</p><p>The goal is not a platform that experienced engineers find fast. The goal is a platform that a new engineer can navigate to a first successful deployment without human intervention.</p><p>That bar is higher than most platform teams think it is. Most teams build for their current users, optimizing for the paths they already know. 
New users are left to discover the path themselves.</p><div><hr></div><h2>RUN THIS CHECK</h2><p>What to do this week (30&#8211;60 minutes):</p><p>&#8226; Identify one engineer who joined your organization in the last 60 days.</p><p>&#8226; Ask them to walk you through their first deployment experience: where they got stuck and who they had to ask for help.</p><p>&#8226; Count the number of human interventions between &#8220;first commit&#8221; and &#8220;service running in production.&#8221; Each intervention is a platform gap, not a developer gap.</p><p>&#8226; Set a target: <strong>zero human interventions for a standard service deployment</strong> and track it alongside your DORA metrics.</p><div><hr></div><p>Reducing new-engineer onboarding time from two weeks to two days returns compounding capacity across every hiring cycle. The teams that run this test consistently report that the gaps it surfaces are more actionable than any NPS survey because they are specific, reproducible, and owned by the platform team, not the developers experiencing them.</p><p>Every platform I have seen struggle with adoption had the same root cause. It was built for the people who built it. The documentation assumed knowledge that the documentation was supposed to provide. The golden path was golden for the people who paved it.</p><div><hr></div><p><em>If you&#8217;re serious about how your platform serves new users, this is exactly what the <strong>Paid</strong> version of this newsletter is for.</em></p><p><em>Paid subscribers get the full <strong>Platform Team Scorecard</strong>: nine capability areas, a copy&#8209;paste audit worksheet, and example questions you can run with your teams to find the gaps that block new engineers from shipping. 
You also unlock the back catalog of scorecards and implementation guides, so you&#8217;re not inventing your own framework from scratch.</em></p><p><em>If you want to go beyond reading and actually instrument your platform, <strong>upgrade to Paid</strong> and run the Scorecard with your team in the next week.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. 
(<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #111: One standardized policy that killed 4 months of IAM escalations]]></title><description><![CDATA[How we went from recurring tickets to zero and made audit evidence boring again.]]></description><link>https://www.thecloudplaybook.com/p/standardized-iam-policy-cross-team-conflict</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/standardized-iam-policy-cross-team-conflict</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Wed, 25 Mar 2026 16:31:07 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!I1CL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>If three app teams share an S3 bucket in your AWS org, you probably have three different IAM policies and a hidden audit problem.</p><p>Here&#8217;s how we eliminated four months of IAM permission escalations with a single customer-managed policy in a single sprint.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I1CL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!I1CL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!I1CL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!I1CL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!I1CL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I1CL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5875188,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190867223?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I1CL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!I1CL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!I1CL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!I1CL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb01b23-7035-45d8-94e1-49e0f6bc6e06_2816x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2>THE PERMISSION CONFLICT THAT KEPT ESCALATING</h2><p>Before the change, each team had written its own inline IAM policy for accessing the same shared S3 bucket.</p><p>Team A had scoped their policy by prefix. Team B had scoped by action, then added a wildcard when something broke in a hurry. Team C had copied Team B&#8217;s policy six months earlier, before Team B&#8217;s wildcard was added, and was missing two actions their pipeline now needed.</p><p>Every two to three weeks, one of the three teams would open a ticket. A deployment would fail. An access denied error would appear in CloudTrail. Someone would ping the platform team. The platform engineer on rotation would spend 45 minutes reading three different policy documents to figure out which one was the source of the conflict.</p>
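<p>The manual comparison described above can be done mechanically. What follows is a minimal Python sketch under stated assumptions, not the tooling from this story: the team names, the bucket name <code>shared-data-bucket</code>, the action lists, and the helper functions are all hypothetical. It reports, for each team, the wildcard grants it holds and the concrete actions that other teams have but it lacks.</p>

```python
import fnmatch
import json

# Hypothetical inline policies for three teams sharing one bucket.
# Bucket name, prefixes, and action lists are illustrative assumptions.
POLICIES = {
    "team-a": {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::shared-data-bucket/team-a/*",
        }],
    },
    "team-b": {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "s3:*",  # the wildcard added in a hurry
            "Resource": "arn:aws:s3:::shared-data-bucket/*",
        }],
    },
    "team-c": {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::shared-data-bucket/*",
        }],
    },
}


def as_list(value):
    """IAM accepts either a single string or a list for Action."""
    return [value] if isinstance(value, str) else list(value)


def allowed_actions(policy):
    """Collect every action pattern granted by Allow statements."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    actions = set()
    for statement in statements:
        if statement.get("Effect") == "Allow":
            actions.update(as_list(statement.get("Action", [])))
    return actions


def is_wildcard(action):
    return "*" in action or "?" in action


def is_covered(action, patterns):
    """True if any pattern matches the action (IAM actions ignore case)."""
    return any(
        fnmatch.fnmatchcase(action.lower(), pattern.lower())
        for pattern in patterns
    )


def compare(policies):
    """Per team: wildcard grants held, and concrete actions it lacks."""
    per_team = {team: allowed_actions(doc) for team, doc in policies.items()}
    concrete = {
        action
        for patterns in per_team.values()
        for action in patterns
        if not is_wildcard(action)
    }
    report = {}
    for team, patterns in per_team.items():
        report[team] = {
            "wildcards": sorted(a for a in patterns if is_wildcard(a)),
            "missing_vs_others": sorted(
                a for a in concrete if not is_covered(a, patterns)
            ),
        }
    return report


if __name__ == "__main__":
    print(json.dumps(compare(POLICIES), indent=2))
```

<p>With these sample inputs, the report flags the <code>s3:*</code> grant on team-b and shows team-c missing two actions. The sketch compares <code>Action</code> only; differences in <code>Resource</code> prefixes, <code>Condition</code> blocks, and <code>Deny</code> statements would need their own comparison.</p>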
      <p>
          <a href="https://www.thecloudplaybook.com/p/standardized-iam-policy-cross-team-conflict">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[TCP# 110: The question that quietly kills your incident response]]></title><description><![CDATA[Platform teams don&#8217;t fail because of bad tools. They fail because, at 2 am, nobody can answer one question: who owns this service?]]></description><link>https://www.thecloudplaybook.com/p/platform-team-failure-unclear-ownership</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/platform-team-failure-unclear-ownership</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 22 Mar 2026 14:28:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nDHA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>Platform teams don&#8217;t fail because of bad tools.</p><p>They fail because nobody can answer one 
question: who owns this?</p><p>Not &#8220;who built it.&#8221; Not &#8220;who is on-call for it this week.&#8221; </p><p><strong>Who is accountable for its behavior, its health, and its evolution over time?</strong></p><p>That question sounds simple. In most organizations, it has no clean answer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nDHA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nDHA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png 424w, https://substackcdn.com/image/fetch/$s_!nDHA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png 848w, https://substackcdn.com/image/fetch/$s_!nDHA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png 1272w, https://substackcdn.com/image/fetch/$s_!nDHA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nDHA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png" width="1456" height="1726" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1726,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5474221,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190864181?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nDHA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png 424w, https://substackcdn.com/image/fetch/$s_!nDHA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png 848w, https://substackcdn.com/image/fetch/$s_!nDHA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png 1272w, https://substackcdn.com/image/fetch/$s_!nDHA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54656ef4-ae59-4c57-8280-34b194997879_1888x2238.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2>THE SIGNAL MOST LEADERS MISS UNTIL IT&#8217;S TOO LATE</h2><p>The pattern shows up the same way every time.</p><p>An incident occurs at 2 am. The alert lands in a shared channel. Engineers from three teams join the call. The first 15 minutes are spent not on the fix, but on the question: whose service is this?</p><p>Nobody is lying. Nobody is avoiding the work. The system was just never designed to answer that question clearly.</p><p>This is not a tooling gap. No amount of better observability surfaces an owner. No dashboard tells you who is responsible for making a decision. That is an organizational design problem.</p><p>The cost is not just the 15 minutes of confusion at 2 am. 
</p><p>The cost compounds: a slower mean time to resolution, higher engineer burnout, recurring incidents because no one has a clear mandate to fix the root cause, and audit findings where evidence collection stalls because nobody owns the system that should automatically produce it.</p><div><hr></div><h2>WHAT UNCLEAR OWNERSHIP ACTUALLY LOOKS LIKE IN PRODUCTION</h2><p>It does not look like chaos. It looks like reasonable ambiguity.</p><p>A service was built by one team, migrated to another, and is now consumed by six. The original team documented it two years ago. The documentation is stale. The consuming teams have added workarounds. No one has updated the service catalog entry because nobody feels like the owner.</p><p>During normal operations, this is invisible. The service runs. Nobody asks questions.</p><p>During an incident, a compliance audit, or a cost-optimization review, ambiguity becomes costly. Three teams each spend four hours gathering evidence for the same control because none of them is sure who should do it. You pay for that coordination overhead every single time.</p><p>The research backs this up. 
Organizations with explicit ownership models, where every service has a named team and that assignment is enforced in the deployment pipeline, resolve incidents measurably faster. The ownership metadata is not just a cultural artifact. It is operational infrastructure.</p><div><hr></div><h2>THE OPERATING PRINCIPLE AT WORK</h2><p>Ownership is not a feeling. It is a system-level declaration that must be maintained like code.</p><p>The fix is not a new tool. It is a new invariant: no service runs in production without a named owner encoded in its configuration, linked to an on-call rotation, and tied to a support tier. That invariant is enforced at deploy time, not suggested in a wiki.</p><p>Here is what that looks like concretely.</p><p>Every resource in your service catalog carries three fields: owning team, support tier, and deprecation status. Deployments that do not carry valid owner labels are rejected at the infrastructure layer, not flagged afterward. The ownership registry is queried automatically at incident time, so the first alert includes the owning team, not just the service name.</p><p>This is not complex to build. It is a webhook, a registry, and a policy. Most teams have the technical capability to build it in a few sprints.</p><p>What they lack is the mandate. Ownership enforcement feels bureaucratic until the first incident where it saves 45 minutes of confusion. After that, engineers stop objecting to it.</p><p>The deeper shift is cultural. When ownership is enforced at the infrastructure layer, it stops being a conversation and starts being a constraint. Constraints are honest. They tell you exactly what the system expects of you. That honesty reduces the cognitive overhead that ambiguous shared ownership creates for everyone.</p><div><hr></div><h2>RUN THIS CHECK</h2><p>What to do this week:</p><p>Pull your current service catalog. Count the services with no owner or an owner field pointing to a team that no longer exists. 
If that number is above 10% of the catalog, you have a structural problem, not a documentation problem.</p><p>Pick one critical service with ambiguous ownership. Assign it to a named team, add the assignment to the deployment configuration, and run a tabletop incident drill where that team is the first call. Track resolution time.</p><p>Write down the three services that caused the most escalation overhead in the last quarter. In each case, determine whether the escalation was driven by unclear ownership. It almost always is.</p><div><hr></div><p>Teams that enforce ownership at the infrastructure layer cut incident mean-time-to-resolution by removing the ownership-discovery step entirely. That step costs more time than most engineering leaders realize, because it rarely appears in post-incident reviews as a distinct line item.</p><p>Every time I&#8217;ve seen a platform team struggle with recurring incidents in the same service area, the root cause has been the same: the on-call team did not feel accountable because they did not feel like the owner. Ownership clarity is not a nice-to-have. 
It is the precondition for accountability.</p><div><hr></div><p><em><strong>Free tool: Score your AWS platform&#8217;s predictability in 5 minutes</strong><br>If this hit close to home, you probably have other places where ownership and accountability are fuzzy but invisible until something breaks.</em></p><p><em>To make this practical, I put together a free <strong>AWS Platform Predictability Starter Kit</strong> for readers:</em></p><ul><li><p><em>5&#8209;minute predictability checklist</em></p></li><li><p><em>&#8220;Where are we bleeding?&#8221; team scorecard</em></p></li><li><p><em>Platform risk radar with 12 early&#8209;warning signals</em></p></li><li><p><em>10 executive questions with weak vs strong answers + debrief worksheet<br>Most leaders run through these in a week of normal meetings and come away with a clear &#8220;top 3&#8221; to fix next.</em></p></li></ul><p><em>&#128073; Grab the free PDF here: <a href="https://thecloudplaybook.gumroad.com/l/aws-platform-predictability-check">https://thecloudplaybook.gumroad.com/l/aws-platform-predictability-check</a></em></p><p><em>If you run an AWS platform in production, <strong>do this before your next incident review</strong> so you&#8217;re not guessing which part of the platform will bite you next.</em></p><p><em>In the paid Cloud Playbook tier, I share the exact &#8220;Platform vs Team Contract&#8221; template and review checklist I use to draw these boundaries without starting a turf war.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free 
guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. (<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #109: The trust event that killed your platform adoption]]></title><description><![CDATA[Platforms rarely &#8220;fade.&#8221; One unannounced breaking change quietly trains teams to avoid you. 
This issue shows you how to surface that moment.]]></description><link>https://www.thecloudplaybook.com/p/developer-platform-adoption-trust-internal-developer-platform</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/developer-platform-adoption-trust-internal-developer-platform</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Wed, 18 Mar 2026 16:31:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!HpeI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>When developers stop using your platform, the problem is not adoption. It is trust.</p><p>Adoption is a behavior. Trust is what drives it. 
</p><p>When developers route around your platform, open tickets in the wrong queue, or write their own Terraform instead of using your modules, they are telling you something. </p><p>Most platform teams respond by addressing the behavior. The right response is to diagnose what broke the signal underneath it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HpeI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HpeI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png 424w, https://substackcdn.com/image/fetch/$s_!HpeI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png 848w, https://substackcdn.com/image/fetch/$s_!HpeI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png 1272w, https://substackcdn.com/image/fetch/$s_!HpeI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HpeI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png" width="1456" height="1726" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1726,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5649274,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190239745?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HpeI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png 424w, https://substackcdn.com/image/fetch/$s_!HpeI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png 848w, https://substackcdn.com/image/fetch/$s_!HpeI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png 1272w, https://substackcdn.com/image/fetch/$s_!HpeI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd71e13cf-93d5-45b6-8e18-9b973134da23_1888x2238.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2>WHAT PLATFORM TEAMS SEE FIRST</h2><p>The symptom surfaces quietly.</p><p>Golden path usage drops. A team submits a request to bypass the CI pipeline for a one-off deployment. Another team forks the shared Terraform module instead of requesting a change. A senior engineer casually mentions that the platform adds a step that wasn&#8217;t there before.</p><p>Nobody files a ticket saying, &#8220;I do not trust this platform.&#8221; They just stop using it.</p><p>The metrics follow a few weeks later. Deployment frequency drops on platform-managed services. </p><p>Support requests thin out, which looks like a good thing until you realize it means teams stopped asking and started working around. Usage of the internal developer portal flatlines.</p><p>The signal is not loud. 
That is what makes it easy to misread.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The Cloud Playbook is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>THE REFLEXIVE RESPONSE THAT MAKES IT WORSE</h2><p>Most platform teams respond to dropping adoption the same way.</p>
      <p>
          <a href="https://www.thecloudplaybook.com/p/developer-platform-adoption-trust-internal-developer-platform">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[TCP #108: Who owns this service? (If you need Slack to answer, you have a problem)]]></title><description><![CDATA[Turning ownership from a stale spreadsheet into an enforced AWS constraint wired to tags, on-call, and cost.]]></description><link>https://www.thecloudplaybook.com/p/service-ownership-enforcement-platform-engineering</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/service-ownership-enforcement-platform-engineering</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 15 Mar 2026 14:29:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rMd4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>Average teams document ownership. 
Great teams enforce it.</p><p>Most engineering teams have a spreadsheet, a Confluence page, or a service catalog entry that says who owns what. That document gets created during a planning cycle, reviewed once, and then ignored. </p><p>Nobody updates it when engineers leave, when services get refactored, or when a new team inherits an old codebase.</p><p>The document exists. Ownership does not.</p><p>This gap is not a knowledge problem. <strong>It is a mechanism problem.</strong> </p><p><em>In the paid version of this issue, I include the exact AWS policies, Config rules, and Terraform patterns I use to enforce ownership in regulated environments. This free version explains the pattern so you can decide if it&#8217;s worth wiring into your platform.</em></p><p>Until you treat it that way, your incidents will keep revealing owners who did not know they were owners, your audits will surface gaps nobody saw coming, and your cost anomalies will have no clear path to resolution.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rMd4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rMd4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!rMd4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png 848w, 
https://substackcdn.com/image/fetch/$s_!rMd4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!rMd4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rMd4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6484041,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/190236661?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rMd4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png 424w, 
https://substackcdn.com/image/fetch/$s_!rMd4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!rMd4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!rMd4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f56a653-7c8b-414c-8648-2f7a3319f8ad_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2>THE CATALOG THAT DRIFTS WHILE YOUR ORG MOVES</h2><p>Most teams treat ownership as a one-time labeling exercise.</p><p>They build a service catalog. They assign owners. They add a field in Backstage or a column in a spreadsheet. A few quarters later, engineers rotate, services split, and the catalog drifts from reality.</p><p>The result is a document that looks complete but carries no accountability.</p><p>When an incident fires at 2 am, the on-call engineer finds an owner who left eight months ago and starts pinging Slack channels. </p><p>The incident resolution time climbs. Post-mortems list &#8220;unclear ownership&#8221; as a contributing factor. Nobody changes the underlying system.</p><p>The same pattern appears in compliance reviews. The team points to documentation. The documentation points to a person who no longer holds that role. The audit finding stems from the fact that the document said one thing, while the organization reflected another.</p><p>Documentation without enforcement is not governance. It is the appearance of governance.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">If you&#8217;re on the free tier, you&#8217;ll get the concepts. 
Paid subscribers get the concrete templates and walkthroughs to ship this inside your AWS org.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>OWNERSHIP WIRED TO FUNCTION, NOT IDENTITY</h2><p>High-performing teams make ownership structural, not declarative.</p><p>They do not ask teams to record who owns a service. They build systems that make it impossible to provision a resource without declaring ownership.</p><p>They tie that declaration to on-call rotation, cost attribution, and access controls. When ownership changes, the system reflects it within one sprint, or the pipeline breaks.</p><p>The mechanism starts at provisioning. Service Control Policies in AWS Organizations deny resource creation when mandatory ownership tags are absent, and Tag Policies keep the tag keys and values consistent.</p><p>For paid subscribers, I break down the specific tag keys, example SCPs, and the safe rollout sequence I&#8217;ve used so you don&#8217;t brick existing pipelines.</p><p>The tags include service name, team alias, on-call contact, environment, and cost center. A resource without those tags cannot be deployed. The enforcement is not a reminder. It is a gate.</p><p>The owner field points to a team alias or rotation, not a person&#8217;s name. When an engineer leaves, the alias stays active. The team updates the rotation. The on-call system stays intact.</p><p>Compliance reporting then runs against the live state, not the catalog. AWS Config Rules continuously check tagging compliance.</p><p>Drift gets surfaced to platform dashboards weekly. Teams see their own compliance score. Remediation is self-service.</p><p>This approach shifts ownership from an administrative task to an engineering constraint. 
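</p><p>To make the gate concrete, here is a rough sketch of the same rule as a pre-deploy check you could run in a pipeline before anything reaches AWS. It is not an SCP or a Tag Policy, and the tag key names are assumptions chosen to mirror the five tags listed above, not a required schema.</p>

```python
# Tag keys are illustrative assumptions, not a required schema.
REQUIRED_TAGS = ("service", "team", "on-call", "environment", "cost-center")


def missing_ownership_tags(tags):
    """Return the required tag keys that are absent or blank, in a stable order."""
    return [
        key for key in REQUIRED_TAGS
        if not (tags.get(key) or "").strip()
    ]


def assert_deployable(resource_name, tags):
    """Raise ValueError if the resource lacks any required ownership tag."""
    missing = missing_ownership_tags(tags)
    if missing:
        raise ValueError(
            f"{resource_name}: missing required tags: {', '.join(missing)}"
        )


if __name__ == "__main__":
    # Hypothetical resource with only two of the five tags set.
    try:
        assert_deployable("orders-queue", {"service": "orders", "team": "checkout"})
    except ValueError as err:
        print(f"deploy blocked: {err}")
```

<p>A check like this fails fast in CI and gives engineers a readable error. The organization-level policy remains the backstop for anything that bypasses the pipeline.</p><p>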
It does not rely on discipline. It relies on design.</p><div><hr></div><h2>FASTER INCIDENTS. CLEANER AUDITS. NO ARCHAEOLOGY</h2><p>The operational gap between these two approaches is not theoretical.</p><p>Teams that enforce ownership at provisioning resolve incidents faster. The on-call contact is visible in the resource tag, surfaced by the monitoring tool, and reachable in under two minutes. There is no Slack archaeology.</p><p>Ownership is queryable. It is attached to the resource, not stored in someone&#8217;s memory. When auditors ask about access controls or cost attribution, the answer is a tag report.</p><p>Cost anomalies get routed to the right team without a platform team investigation.</p><p>A platform team that enforces ownership is running governance as a system. A platform team that documents ownership is running governance as a hope. The first team gets the budget. The second team gets audit findings.</p><div><hr></div><h2>VISIBILITY FIRST. ENFORCEMENT SECOND. IN THAT ORDER.</h2><p>The path from documentation to enforcement does not require rebuilding your platform. It requires picking the right starting point.</p><p>Pull an AWS Config report or use Resource Groups Tag Editor to identify every resource missing an owner tag. Show that report to your engineering leads. Not as a compliance finding. As a shared problem.</p><p>Then apply SCP-based tag enforcement to new accounts first. Existing accounts get a remediation window of 30 to 60 days. The Terraform modules in your golden path should include required tags by default, so new services deploy compliant from day one.</p><p>Wire ownership to function next. Link your on-call rotation to a team alias that appears in the service&#8217;s owner tag. When PagerDuty fires, the routing is automatic.</p><p>Surface each team&#8217;s ownership compliance score in your developer portal. Teams respond to scorecards they can see. 
They do not respond to spreadsheets they cannot find.</p><p>This shift takes one to two quarters. It is a governance layer applied to the infrastructure you already own.</p><div><hr></div><h2>RUN THIS CHECK THIS WEEK</h2><p>Pull your AWS Resource Groups Tag Editor report across all accounts. Filter for resources missing an &#8220;owner&#8221; or &#8220;team&#8221; tag.</p><p>If more than 20% of resources lack an owner declaration, you have a gap that will surface in your next audit or incident.</p><p>Pick one account. Apply SCP-based tag enforcement only to new resources. Update your Terraform module defaults to include owner, team, environment, and cost-center tags. Ship that in the next sprint.</p><p>If your team does not have a standard Terraform module or a defined set of required tags, that is the starting point.</p><div><hr></div><p><em>If this resonated, you&#8217;re probably already picturing where your own ownership model would crack during an incident or audit.</em></p><p><em>You can close that gap in two ways:</em></p><ul><li><p><em>Assemble the mechanisms yourself from this essay, AWS docs, and a few painful incidents, or</em></p></li><li><p><em>Start from a baseline that has already survived real FedRAMP / HIPAA / ISO environments.</em></p></li></ul><p><em>The paid version of The Cloud Playbook takes essays like this and turns them into implementation kits.</em></p><p><em>For this issue, paid subscribers get:</em></p><ul><li><p><em>A 90&#8209;day rollout plan for enforcing ownership at provisioning</em></p></li><li><p><em>Example AWS Tag Policies and SCPs to block untagged resources</em></p></li><li><p><em>AWS Config rules to surface ownership drift weekly</em></p></li><li><p><em>An ownership scorecard spec you can drop into your portal</em></p></li></ul><p><em>If enforcing ownership like this doesn&#8217;t save you more than the subscription in one incident or one audit cycle, you should cancel.</em></p><p><em>If you want that kit, upgrade to the 
paid newsletter here</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p><div><hr></div><h2><strong>Whenever you&#8217;re ready</strong></h2><p>There are two ways I can help you further:</p><ol><li><p><strong>Get the AWS Platform Predictability Starter Kit (Free)</strong><br>Four short tools to baseline where your platform is strong vs where it&#8217;s bleeding:</p><ul><li><p>5&#8209;minute predictability checklist</p></li><li><p>&#8220;Where are we bleeding?&#8221; team scorecard</p></li><li><p>Platform risk radar with 12 early&#8209;warning signals</p></li><li><p>10 executive questions with weak vs strong answers + debrief worksheet<br></p></li></ul><p>Most leaders run through these in a week of normal meetings and come away with a clear &#8220;top 3&#8221; to fix.</p><p><br>&#8594; <strong>Grab the Starter Kit</strong>: <a href="https://thecloudplaybook.gumroad.com/l/aws-platform-predictability-starter-kit">https://thecloudplaybook.gumroad.com/l/aws-platform-predictability-starter-kit</a></p></li><li><p><strong>Keep getting essays like this every week</strong><br>Stay on the free list, apply one check per week, and share this with your platform peers so you&#8217;re solving the same problems with the same language.</p></li></ol><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #107: Database Architecture for 
Multi-Tenant Platforms: The Tradeoffs Nobody Explains Well]]></title><description><![CDATA[What I would build differently and the one rule I enforce at tenant onboarding now.]]></description><link>https://www.thecloudplaybook.com/p/multi-tenant-database-architecture-tradeoffs-pooled-siloed-schema</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/multi-tenant-database-architecture-tradeoffs-pooled-siloed-schema</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Wed, 11 Mar 2026 12:03:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!HEg0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>When we onboarded our ninth external tenant, we ran into a wall.</p><p>A compliance audit required per-tenant evidence of data isolation. 
Our pooled RDS instance, with row-level security as the only enforcement layer, could not produce that evidence cleanly.</p><p>We spent six weeks generating audit documentation that a siloed architecture would have produced automatically.</p><p>The database architecture decision I made at tenant two was still costing us at tenant nine.</p><p>This is how I evaluate it now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HEg0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HEg0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!HEg0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!HEg0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!HEg0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HEg0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7424181,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/188976623?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HEg0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!HEg0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!HEg0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!HEg0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbc967b-3269-4151-8336-42fdfe60648d_2816x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div><hr></div><h3>THREE DATABASE ISOLATION MODELS. ONE CHOICE. NO EASY UNDO.</h3><p>Every multi-tenant platform on AWS eventually faces the same fork: how do you store tenant data?</p><p>Three models dominate the decision.</p>
      <p>
          <a href="https://www.thecloudplaybook.com/p/multi-tenant-database-architecture-tradeoffs-pooled-siloed-schema">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[TCP #106: Developers want autonomy. Platform wants consistency. Both are wrong.]]></title><description><![CDATA[Not because either position is bad. Because neither one is a strategy.]]></description><link>https://www.thecloudplaybook.com/p/developer-autonomy-vs-platform-consistency-where-both-go-wrong</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/developer-autonomy-vs-platform-consistency-where-both-go-wrong</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 08 Mar 2026 12:02:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Qja7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This tension is as old as platform engineering itself.</p><p>Developers want to move fast. 
They want to choose the tools they know. They want to deploy without filing a ticket or waiting for a pipeline they did not build.</p><p>Platform teams want predictability. They want one observability stack, one deployment model, and one set of guardrails that applies across every team.</p><p>Both are right.</p><p>Both, taken to their conclusion, produce organizations that cannot scale.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qja7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qja7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!Qja7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!Qja7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!Qja7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qja7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png" width="1456" 
height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2dce921a-530e-44fe-8d99-529235884356_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7038885,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/188972957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qja7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!Qja7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!Qja7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!Qja7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dce921a-530e-44fe-8d99-529235884356_2816x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div><hr></div><h3>AUTONOMY IS RIGHT UNTIL IT ISN&#8217;T</h3><p>The case for developer autonomy in platform engineering is real.</p><p>Engineers closest to the problem understand the constraints better than anyone else.</p><p>A team building a real-time data pipeline knows what latency profile they need. A team managing a compliance-critical workflow knows where the edge cases are.</p><p>Giving them the tools and the freedom to solve their problem without platform overhead produces faster decisions and better systems for that specific problem.</p><p>Autonomy also drives platform adoption.</p><p>Teams that feel constrained by a platform find workarounds. They build shadow infrastructure. They use unapproved tooling. 
They create exactly the inconsistency the platform was designed to prevent, except now it is invisible to the platform team.</p><p>Developer autonomy, when it works, is not chaos. It is trust.</p><p>It signals that the platform team respects engineering judgment and is not building for control.</p><p>The failure mode is not autonomy itself. It is autonomy without any shared foundation.</p><p>Twelve teams. Eight deployment mechanisms. Six observability stacks. No consistent tagging. No shared on-call model. No golden path anyone actually walks.</p><p>At that point, developer autonomy has produced a system nobody can operate at scale.</p><p>Incidents cross service boundaries that no single engineer understands. Compliance audits require twelve different evidence formats. New engineers spend months learning the local conventions of each team before they can contribute.</p><div><hr></div><h3>CONSISTENCY IS RIGHT UNTIL IT ISN&#8217;T</h3><p>The case for platform consistency is equally real.</p><p>When every team deploys through the same pipeline, tags resources consistently, and emits metrics in the same format, the platform team can support them all.</p><p>Incidents become diagnosable because the signal looks the same across every service.</p><p>Compliance audits become repeatable because the evidence structure does not change between tenants.</p><p>Cost attribution becomes automatic because the tagging model is enforced rather than aspirational.</p><p>Consistency is what makes a platform team of seven supportable across seventy-five developers.</p><p>Without it, every team becomes its own operational burden.</p><p>The failure mode is not consistency itself. It is consistency applied to the wrong layer.</p><p>When a platform mandates a specific logging library, a specific test framework, a specific database client, it has crossed from enforcing operational standards into controlling engineering decisions that belong to the team.</p><p>That is where platform adoption collapses.</p><p>Engineers do not fight the pipeline. They route around it. 
They build locally and push to production via the path with the least friction, which is now outside the platform&#8217;s visibility.</p><p>The platform team has achieved consistency on paper and lost it in practice.</p><div><hr></div><h3>WHERE BOTH GO WRONG AT THE SAME TIME</h3><p>The trap is not picking the wrong side.</p><p>The trap is treating this as a values conflict between platform control and developer freedom, then oscillating between them based on whoever complained most recently.</p><p>Platform teams that get burned by inconsistency clamp down. They add mandatory steps. They restrict tool choices. Developers push back. The platform team softens the requirements. Inconsistency returns.</p><p>The cycle repeats.</p><p>Neither position is wrong. Both are responding to real failure modes.</p><p>The problem is that swinging between them is not a strategy. It is a symptom of not having a clear model for where consistency is non-negotiable and where developer autonomy is not just acceptable but preferable.</p><div><hr></div><h3>HOW TO HOLD THE TENSION</h3><p>The resolution is not a compromise. It is a boundary.</p><p>Define what the platform owns and what the team owns. State it explicitly. Enforce the platform&#8217;s layer. Leave the team&#8217;s layer genuinely open.</p><p>The platform owns: tagging, account structure, network topology, deployment gates, security baselines, compliance controls, and observability standards.</p><p>These are non-negotiable because variation here creates systemic risk. One team&#8217;s non-standard deployment gate becomes a compliance gap that blocks the entire organization&#8217;s certification.</p><p>The team owns: language, framework, internal libraries, database choice within approved types, caching strategy, and service architecture.</p><p>These decisions belong to the engineers closest to the problem. The platform&#8217;s job is not to make these decisions for them. 
It is to make sure those decisions do not create operational or compliance risk at the system level.</p><p>The golden path in platform engineering is not a mandate. It is an offer.</p><p>Here is the fastest way to build and ship something that is secure, compliant, and observable. Use it if it fits.</p><p>If it does not fit, tell the platform team why, and we will decide together whether the standard needs to change or the exception needs guardrails.</p><p>That conversation is what separates a platform developers trust from one they tolerate.</p><p>When developers can see exactly where the boundary is, and it is set at the right layer, the autonomy vs. consistency tension stops being a conflict.</p><p>It becomes a design.</p><p>The platform&#8217;s job is not to eliminate developer judgment. It is to make sure that judgment operates within boundaries that the whole organization can rely on.</p><div><hr></div><p>This week, write a one&#8209;page &#8220;platform vs team&#8221; RACI:</p><ul><li><p>Platform owns: tagging, accounts, network, deploy gates, security, observability</p></li><li><p>Team owns: language, framework, DB within approved list, caching, service architecture</p></li></ul><p>Circle which ones are fuzzy today. Those fuzzy items are where your incidents and platform fights are coming from.</p><div><hr></div><p><em><strong>Free tool: Score your AWS platform&#8217;s predictability in 5 minutes</strong><br>I just shipped a new free tool for you: a 6&#8209;page, 18&#8209;question checklist to score how predictable your AWS platform really is across deployments, incidents, onboarding, cost, compliance, and throughput.</em></p><p><em>It takes 5 minutes and tells you if you&#8217;re in Reactive, Stabilizing, or Predictable territory, plus what to fix first.</em></p><p><em>&#128073; Grab the free PDF here: <a href="https://thecloudplaybook.gumroad.com/l/aws-platform-predictability-check">https://thecloudplaybook.gumroad.com/l/aws-platform-predictability-check</a></em></p><p><em>If you run an AWS platform in production, <strong>do this before your next incident review</strong> so you&#8217;re not guessing which part of the platform will bite you next.</em></p><p><em>In the paid Cloud Playbook tier, I share the exact &#8220;Platform vs Team Contract&#8221; template and review checklist I use to draw these boundaries without starting a turf war.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. 
(<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share it with your friends and colleagues, or on your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic, feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #105: The Multi-Tenant Architecture I'd Never Build Again]]></title><description><![CDATA[Nine tenants. Eleven services. One pooled model. 
This is what we got wrong.]]></description><link>https://www.thecloudplaybook.com/p/multi-tenant-architecture-mistakes-lessons-aws-platform-engineering</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/multi-tenant-architecture-mistakes-lessons-aws-platform-engineering</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Wed, 04 Mar 2026 13:04:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!EL2x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We built a shared-everything multi-tenant platform on AWS.</p><p>One database per service. Tenant data separated by row-level filters. One deployment pipeline. One observability stack. 
One set of IAM roles scoped to the service, not the tenant.</p><p>It looked clean on a whiteboard.</p><p>It did not survive contact with production.</p><p>This is what we got wrong, what it cost us, and what I would build instead.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EL2x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EL2x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png 424w, https://substackcdn.com/image/fetch/$s_!EL2x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png 848w, https://substackcdn.com/image/fetch/$s_!EL2x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png 1272w, https://substackcdn.com/image/fetch/$s_!EL2x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EL2x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png" width="1456" height="618" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:618,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7827201,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/188755684?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EL2x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png 424w, https://substackcdn.com/image/fetch/$s_!EL2x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png 848w, https://substackcdn.com/image/fetch/$s_!EL2x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png 1272w, https://substackcdn.com/image/fetch/$s_!EL2x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fe7feed-4165-4fa6-8fd6-f7741eae3533_3168x1344.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div><hr></div><h3>THE INCIDENT THAT EXPOSED EVERYTHING</h3><p>Eighteen months after launch, a single tenant&#8217;s batch job consumed enough database connection pool capacity to degrade response times for every other tenant on the platform.</p><p>No data breach. No data loss.</p><p>Just one tenant&#8217;s workload bleeding into every other tenant&#8217;s experience.</p><p>Leadership called it a performance issue.</p>
      <p>
          <a href="https://www.thecloudplaybook.com/p/multi-tenant-architecture-mistakes-lessons-aws-platform-engineering">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[TCP #104: When teams route around your standards, the standards are wrong.]]></title><description><![CDATA[Not the teams. Here is how to tell the difference.]]></description><link>https://www.thecloudplaybook.com/p/standardization-vs-team-autonomy-tradeoff-platform-engineering</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/standardization-vs-team-autonomy-tradeoff-platform-engineering</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 01 Mar 2026 13:00:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8XK5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Most platform conversations eventually hit this fork.</p><p>One side wants consistency.</p><p>One runtime. One deployment pipeline. 
One observability stack. One way to provision infrastructure.</p><p>The other side wants freedom.</p><p>Teams should choose the tools that fit their problem. Constraints slow engineers down. Autonomy produces better outcomes.</p><p>Both positions are partially right.</p><p>Both, when taken to extremes, produce systems that fail in predictable ways.</p><p>Here is how I honestly evaluate the trade-off.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8XK5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8XK5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!8XK5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!8XK5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!8XK5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8XK5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png" 
width="727" height="396.4546703296703" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:727,&quot;bytes&quot;:6568933,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/188728685?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8XK5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!8XK5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!8XK5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!8XK5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b4480e8-b2f4-49da-8247-bfded8c09051_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3>STANDARDIZATION HAS A REAL COST</h3><p>Start with what standardization actually takes from teams.</p><p>When a platform mandates a single runtime, a single pipeline, and a single logging format, it removes decision-making from the engineers closest to the problem.</p><p>Sometimes, that decision-making was producing divergence that the platform could not support. Sometimes it was producing genuine innovation that the platform killed.</p><p>Standardization shifts cognitive load from individual teams to the platform team. 
Engineers stop thinking about infrastructure choices and start working within guardrails.</p><p>That is the intended outcome.</p><p>But guardrails set in the wrong place constrain the right behaviors alongside the wrong ones.</p><p>A Python team forced into a Java deployment pipeline does not get faster. A team building real-time data pipelines constrained by a batch-processing standard does not become more reliable.</p><p>Standardization applied without context produces friction that engineers route around, which is worse than no standard at all.</p><p>The cost of standardization is real. It is measured in slowed onboarding for edge cases, frustrated senior engineers who know a better path exists, and workarounds that accumulate outside the platform&#8217;s visibility.</p><p>Ignore that cost, and you build a platform that teams tolerate instead of one they trust.</p><div><hr></div><h3>AUTONOMY HAS A REAL COST TOO</h3><p>Autonomy sounds right because it respects engineering judgment.</p><p>Teams pick the tools they know. They move faster in familiar environments. 
They own their decisions and their outcomes.</p><p>In theory, this produces better systems built by motivated engineers.</p><p>In practice, unconstrained autonomy produces something else entirely.</p><p>Twelve teams. Eight languages. Six deployment mechanisms. Four observability stacks.</p><p>No shared on-call playbook. No consistent tagging. No standard for how infrastructure gets provisioned or decommissioned.</p><p>When an incident crosses service boundaries, no single engineer understands the full system.</p><p>When a compliance audit requires evidence across all services, each team produces it differently.</p><p>When a senior engineer leaves, their infrastructure choices leave with them.</p><p>Autonomy without boundaries does not produce ownership. It produces silos.</p><p>Each team optimizes locally, and the system pays the cost globally.</p><p>In regulated environments, the cost is sharper.</p><p>One team&#8217;s non-standard deployment mechanism produces an audit finding that blocks the entire organization&#8217;s certification. One team&#8217;s custom observability tooling creates a gap in the evidence library.</p><p>The audit does not care that the team had good reasons.</p><p>Autonomy is not free. Its cost is paid in coordination overhead, incident complexity, and compliance risk.</p><div><hr></div><h3>WHERE MOST TEAMS GET IT WRONG</h3><p>The mistake is treating this as a binary choice.</p><p>Platform teams that have been burned by inconsistency push toward full standardization. Platform teams accused of slowing engineers down push toward full autonomy.</p><p>Both overcorrect. Both create the failure mode they were trying to avoid.</p><p>Full standardization produces a platform that works for 80% of use cases and creates an adversarial relationship with the 20% that do not fit.</p><p>Engineers in that 20% stop engaging with the platform and build outside it. 
The platform team loses visibility into what those teams are running, which is exactly the opposite of what standardization was supposed to produce.</p><p>Full autonomy produces a platform that nobody calls a platform.</p><p>It is just a collection of individual team choices with a shared AWS account. When something breaks at the system level, nobody owns it.</p><p>The mistake is picking a pole and defending it.</p><p>The trade-off is not between standardization and autonomy. It is figuring out which layer of the stack each one applies to.</p><div><hr></div><h3>HOW I EVALUATE IT</h3><p><strong>The frame I use: standardize the floor, not the ceiling.</strong></p><p>The floor is everything that creates systemic risk when it varies.</p><p>Tagging. Account structure. Network topology. Security baselines. Compliance controls. Deployment gates.</p><p>These are non-negotiable. Variation here does not produce innovation. It produces incidents, audit findings, and cost spikes that nobody can attribute.</p><p>The ceiling is everything teams use to solve their specific problem.</p><p>Language. Framework. Internal libraries. Caching strategy. Database choice within approved types.</p><p>Autonomy here produces genuine value. Engineers make better decisions when they understand the problem space, and they understand it better than the platform team does.</p><p>The question I ask for any proposed standard: what is the systemic cost of variation here?</p><p>If variation in this layer creates on-call complexity, compliance gaps, or cost-assignment issues, standardize it. The cost of enforcement is lower than the cost of the failure mode.</p><p>If variation in this layer reflects legitimate differences in team context and problem type, leave it alone. The platform&#8217;s job is not to eliminate judgment. 
It is to channel it toward the right decisions.</p><p>One signal: if a standard requires significant ongoing enforcement, it is probably set at the wrong layer.</p><p>Good standards are adopted because they reduce friction, not because they are mandated. If teams are routing around a standard consistently, the standard is wrong, not the teams.</p><p>A second signal: watch where senior engineers spend their time.</p><p>If your best engineers are spending hours navigating platform constraints to do work that the platform should be making simple, the standard is in the wrong place.</p><p>If they are spending hours cleaning up after teams that went off-road, autonomy is in the wrong place.</p><p>The answer to this trade-off is not a policy. It is an ongoing calibration.</p><p>Your platform design should reflect the actual failure modes your organization has experienced, not the theoretical ones you are trying to prevent. Start with where things have broken. Standardize there first. Leave everything else open until you have evidence that variation is creating systemic cost.</p><p>Most teams skip this conversation because it requires admitting that both standardization and autonomy have failed them at some point.</p><p>Naming that honestly is the starting point for building a platform that repeats neither failure.</p><div><hr></div><p><em>The Cloud Playbook publishes every Wednesday and Sunday. 
In the paid playbook, I walk through how to actually map your current stack into &#8216;floor vs ceiling&#8217; and codify it into standards and exception paths.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. (<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a 
href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #103: We had 47 dashboards. Nobody knew which one to open during an incident.]]></title><description><![CDATA[We fixed it by deleting 28 of them.]]></description><link>https://www.thecloudplaybook.com/p/deleted-dashboards-observability-got-better</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/deleted-dashboards-observability-got-better</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Wed, 25 Feb 2026 13:01:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nw4I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Eighteen months ago, we had 47 dashboards across our platform.</p><p>Engineers had built them incrementally, one incident at a time. Each one made sense when it was created.</p><p>Collectively, they had become noise.</p><p>We deleted 28 of them. No migration. No archiving strategy. Gone.</p><p>Observability improved within two weeks.</p><h3>THE DECISION NOBODY WANTED TO MAKE</h3><p>Before the deletion, every dashboard had a defender.</p><p>Someone built it. Someone remembered why. Deleting it felt like losing institutional knowledge, even if nobody had opened it in six months.</p><p>The instinct was to keep everything, refine it later, and consolidate someday.</p><p>Someday never came.</p><p>The dashboards multiplied instead. New services spun up. Engineers cloned existing dashboards and modified them. Naming conventions drifted.</p><p>By the time we audited, we had dashboards with overlapping metrics, conflicting definitions of the same signals, and no clear owner for most of them.</p><p>When an incident occurred, engineers opened dashboards at random, cross-referencing panels that told different stories.</p><p>The time to diagnosis stretched. 
Attention fractured across too many surfaces.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nw4I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nw4I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!nw4I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!nw4I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!nw4I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nw4I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dac37688-d69c-4cd7-955e-14122b257731_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6652269,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/188725780?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!nw4I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!nw4I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!nw4I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!nw4I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac37688-d69c-4cd7-955e-14122b257731_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3>WHAT DELETION ACTUALLY PRODUCED</h3><p>We ran a usage audit first.</p><p>Grafana logs showed which dashboards had been opened in the prior 90 days and by whom.</p><p>Twenty-eight dashboards had zero views. Eleven more had been opened once, by the person who created them.</p><p>We deleted the zero-view dashboards immediately. We reviewed the low-use group and cut all but three.</p><p>What remained: 19 dashboards with clear owners, consistent naming, and defined purposes.</p><p>Service health. Infrastructure cost. Deployment pipeline. Per-tenant SLA tracking.</p><p>Within two weeks, the on-call rotation reported faster incident orientation.</p><p>Not because the dashboards had better charts. 
Because engineers knew exactly which dashboard to open and what question each one answered.</p><p>The cognitive load of choosing dropped out of the incident response process entirely.</p><p>The mean time to diagnosis dropped 35% over the following quarter. That number came from incident retrospectives, not tooling. Engineers reported it themselves.</p><div><hr></div><h3>WHY THIS NEVER MADE IT INTO A RETRO</h3><p>Nobody celebrated this work.</p><p>Deleting dashboards does not ship a feature. It does not close a ticket. It does not appear in a sprint demo.</p><p>The engineers who did the audit and ran the deletion spent two days on it. The output was absence, not presence. Fewer things, not more.</p><p>In most engineering cultures, that work is invisible.</p><p>It does not generate a Slack notification. It does not move a burndown chart. 
Leadership does not ask about it in planning.</p><p>The benefit showed up indirectly: faster incident response, less confusion, and cleaner on-call handoffs.</p><p>Those outcomes were credited to the teams that responded well, not to the people who removed the friction that slowed responses down.</p><div><hr></div><h3>SIGNAL DENSITY IS AN ENGINEERING DECISION</h3><p>The instinct in observability is to add.</p><p>Add metrics. Add panels. Add dashboards. More data means more visibility.</p><p>That logic is wrong.</p><p>More data means more surface area for attention to scatter across.</p><p>Observability is not a volume problem. It is a signal density problem.</p><p>The question is not how much your platform can surface. It is how quickly an engineer on call at 2 a.m. can move from alert to diagnosis to action.</p><p>Every dashboard that does not answer a specific question in a specific context is a tax on that process. The tax compounds across every incident, every rotation, every engineer new to the system.</p><p>Deleting 28 dashboards was not a cleanup task.</p><p>It was an architectural decision about what the platform communicates and to whom. That decision belongs to the platform team. It requires judgment about which signals matter, what noise looks like, and what engineers actually need when things break.</p><div><hr></div><p><em>The Cloud Playbook publishes every Wednesday and Sunday. If your team is drowning in dashboards nobody opens, forward this to the person who owns your observability stack.</em></p><p><em>P.S. This article is part of a deeper series on observability. 
Paid readers get the full implementation kit.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p>]]></content:encoded></item><item><title><![CDATA[TCP #102: What does your platform produce when a developer does the minimum?]]></title><description><![CDATA[That answer is your real DX score.]]></description><link>https://www.thecloudplaybook.com/p/developer-experience-right-thing-default</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/developer-experience-right-thing-default</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 22 Feb 2026 13:01:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UZbA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Most platform teams frame developer experience as a friction problem.</p><p>Developers are slowing down. Onboarding takes too long. 
The internal tools are clunky. Engineers complain about the deployment process.</p><p>The answer, in this frame, is to reduce friction. Simplify the interface. Add documentation. Run enablement sessions. Make it easier.</p><p>That frame is not wrong. It is just incomplete.</p><p>And the part it is missing is the part that actually determines whether your platform produces reliable, compliant, secure software at scale.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UZbA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UZbA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!UZbA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!UZbA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!UZbA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UZbA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6081153,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/188450798?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UZbA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!UZbA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!UZbA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!UZbA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9efd95a-aaaa-4cfa-bbb9-5c5d377cd4d1_2816x1536.png 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div><hr></div><h3>SATISFACTION IS NOT AN OBJECTIVE</h3><p>Developer experience, as most platform teams describe it, is about developer satisfaction.</p><p>Are engineers happy with the tools? Is the onboarding smooth? Does the internal developer portal have good UX? Are NPS scores improving?</p><p>These are real questions.</p><p>Platforms that ignore them produce friction that slows engineers down and drives adoption toward workarounds.</p><p>But satisfaction is an outcome. 
It is not an objective.</p><p>Building toward satisfaction without a more precise goal produces platforms that are pleasant to use and dangerous to operate.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">If you&#8217;re on the free tier, you&#8217;ll get the concepts. Paid subscribers get the concrete templates and walkthroughs to ship this inside your AWS org.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h3>THE PLATFORM ALLOWED IT</h3><p>The satisfaction frame has a hidden assumption: that developers know what good looks like.</p><p>Sometimes they do. Slow pipelines, brittle test environments, and manual provisioning steps are real friction points. Engineers identify them accurately, and reducing them produces genuine speed gains.</p><p>But the gaps that cause incidents, compliance failures, and cost spikes are not usually friction gaps.</p><p>They are judgment gaps.</p><p>Engineers make reasonable local decisions that are wrong at the system level. They deploy without tags because tagging was never enforced. They skip security scanning because the scanner was optional. They provision resources outside the approved account structure because nobody blocked it.</p><p>In each case, the developer experience was fine. 
The path of least resistance was right there.</p><p>The problem is that the path of least resistance led to a place the platform should never have allowed.</p><p>When you optimize purely for satisfaction, you optimize for ease of use. You do not optimize for correctness.</p><p>And in regulated environments where one misconfigured resource triggers a compliance finding, correctness is the product.</p><div><hr></div><h3>THE DEFAULT IS THE PRODUCT</h3><p>Developer experience is not a UX problem. It is an architecture problem.</p><p>The question is not &#8220;how do we make this easier?&#8221;</p><p>The question is &#8220;what does the platform make the default, and is the default correct?&#8221;</p><p>A well-designed platform makes the secure path the obvious path. It makes the compliant choice the low-friction choice. It makes the tagged resource, the approved account, the scanned image, and the reviewed deployment the default.</p><p>Developers do not fight the platform. They follow it.</p><p>And when they follow it, the output is correct without the developer having to know why.</p><p>This is the reframe.</p><p>Developer experience, done right, is not about removing barriers to doing things. 
It is about removing barriers to doing the right things, while adding barriers to everything else.</p><div><hr></div><h3>COMPLIANCE IS DEVELOPER EXPERIENCE</h3><p>When you adopt this frame, the platform roadmap looks different.</p><p>You stop treating developer experience as a track that runs parallel to security and compliance.</p><p>You recognize that those things are developer experience.</p><p>Making it fast to deploy to a compliant environment is better DX than making it fast to deploy anywhere.</p><p>Making it automatic to produce audit evidence is better DX than making it easy to skip steps.</p><p>Making it impossible to provision an untagged resource is better DX than making manual tagging easy.</p><p>The team conversations change, too.</p><p>When a developer asks, &#8220;Why do I have to go through this process?&#8221; the answer is no longer &#8220;because compliance requires it.&#8221;</p><p>The answer becomes &#8220;because the platform ships this for you, so you do not have to think about it.&#8221;</p><p>One is bureaucracy. The other is product design.</p><p>Metrics shift as well. Satisfaction scores remain useful signals, but the primary question is: what percentage of deployments follow the golden path, and what percentage deviates?</p><p>Deviation rate is a developer experience metric.</p><p>High deviation means the default is wrong. Low deviation means you have built a platform that makes correct behavior the easy choice.</p><p>In regulated industries, that metric matters more than NPS.</p><p>A satisfied developer who routinely sidesteps the guardrails is a liability. A developer who follows the platform without friction is an asset.</p><div><hr></div><h3>TRACE THE PATH OF LEAST RESISTANCE</h3><p>Audit your platform for the path of least resistance.</p><p>Pick one workflow that your developers do every week. 
Trace the easiest possible route through it.</p><p>Ask: if an engineer does the minimum required to complete this workflow, what does the output look like?</p><p>Is it tagged? Is it scanned? Is it in the right account? Is it documented?</p><p>If the answer is no, the platform has a developer experience problem.</p><p>Not because it is too slow or too hard. Because the default produces the wrong thing.</p><p>Fix the default before you improve the UX.</p><div><hr></div><p>Speed is easy. Predictability is hard. I build platforms that deliver both.</p><div><hr></div><p><em>The Cloud Playbook publishes every Wednesday and Sunday. If this issue reframes something you have been thinking about, forward it to one platform leader who is solving DX by improving their portal instead of their defaults.</em></p><p><em>P.S. This article is part of a deeper series on AWS cloud cost ownership. Paid readers get the full implementation kit. </em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. 
(<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP#101: Your AWS bill is lying to you]]></title><description><![CDATA[The real problem is two layers deeper than the invoice.]]></description><link>https://www.thecloudplaybook.com/p/aws-cloud-cost-ownership-accountability</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/aws-cloud-cost-ownership-accountability</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Thu, 19 Feb 2026 12:30:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Al7i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>Your AWS bill is up 30% quarter over quarter. The first call goes to FinOps.</p><p>They run Cost Explorer. Flag unused EC2 instances, orphaned snapshots, and over-provisioned RDS clusters. Set budgets. Send reports.</p><p>The bill keeps climbing.</p><p>Most engineering leaders misread the AWS cloud cost problem they are actually facing. They see a spending problem. They reach for a cost tool. The bill is not the problem. It is a symptom. 
What it points at sits two layers deeper: cloud resource ownership.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Al7i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Al7i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png 424w, https://substackcdn.com/image/fetch/$s_!Al7i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png 848w, https://substackcdn.com/image/fetch/$s_!Al7i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png 1272w, https://substackcdn.com/image/fetch/$s_!Al7i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Al7i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png" width="728.0000610351562" height="1304.5001093686283" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:2609,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728.0000610351562,&quot;bytes&quot;:6318354,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/188447963?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Al7i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png 424w, https://substackcdn.com/image/fetch/$s_!Al7i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png 848w, https://substackcdn.com/image/fetch/$s_!Al7i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png 1272w, https://substackcdn.com/image/fetch/$s_!Al7i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9aa52a6-cf83-4eec-8f4f-a26ea0222bf1_1536x2752.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><div><hr></div><h3>NOBODY OWNS THE SPIKE</h3><p>You open Cost Explorer. The numbers do not add up.</p><p>EC2 costs are up across six accounts. S3 storage doubled. Data transfer charges appeared from services nobody can name. You ask which team owns the resources driving the spike.</p><p>Silence.</p><p>Tags are inconsistent. The account structure does not map to any org chart. Costs are real. Owners are not.</p><div><hr></div><h3>COST HYGIENE IS NOT COST MANAGEMENT</h3><p>The reflex is to fix the spending.</p><p>Rightsize instances. Set budget alerts. Implement S3 lifecycle policies. Turn off what is unused. These actions are not wrong. They are aimed at the wrong layer.</p><p>FinOps produces a dashboard. Leadership gets a weekly cost report. 
Engineers get Slack alerts when a resource exceeds its threshold. Costs dip, then creep back up two months later. New resources appear untagged. New services spin up in accounts that were supposed to be off-limits.</p><p>The cycle repeats.</p><p>This is cost hygiene. It is not cloud cost management. The gap between those two things is ownership.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">If you&#8217;re on the free tier, you&#8217;ll get the concepts. Paid subscribers get the concrete templates and walkthroughs to ship this inside your AWS org.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h3>RESOURCES OUTLIVE THEIR OWNERS</h3><p>Rising cloud costs are a forcing function. They expose what was always true about your infrastructure: nobody owns it with enough cloud cost accountability to keep it predictable.</p><p>In most engineering organizations, cloud resources are created by whoever needs them, when they need them, with whatever access they have. Tagging is optional. Account structure reflects history, not design.</p><p>Cloud resource ownership is assumed but never formalized. The developer who provisioned that cluster moved to a different team six months ago. The resource stayed. The ownership did not transfer.</p><p>When costs rise, you are not seeing a new problem. You are seeing the accumulated weight of every ownership gap on your platform. 
Every resource with no accountable owner. Every account with no cost center mapping. Every service with no team responsible for its AWS footprint.</p><p>The tell is in the conversation after the cost alert fires. If your first question is &#8220;what is this resource,&#8221; your second question should not be &#8220;can we delete it.&#8221; It should be <strong>&#8220;why did we not already know.&#8221;</strong></p><p>The answer to that second question is organizational. Not technical.</p><div><hr></div><h3>ONE QUESTION EXPOSES THE REAL PROBLEM</h3><p>Before you touch a cost optimization tool or kick off a rightsizing exercise, ask one question:</p><p><strong>For every resource in every account, can you name the owning team, the person accountable for its cost, and the product or business function it maps to?</strong></p><p>If answering that requires querying three different systems, cross-referencing a spreadsheet, and making assumptions, you do not have a spending problem. You have an AWS cloud cost ownership problem.</p><p>Run a secondary diagnostic. Pull your tagging compliance rate. Not the tags that exist. The tags that are enforced. If any team can deploy untagged resources without triggering an automated block or remediation, cloud resource ownership is optional in your platform. Optional ownership produces unpredictable costs.</p><div><hr></div><h3>FOUR MOVES TO MAKE OWNERSHIP MANDATORY</h3><p>The fix is not a FinOps tool. It is an ownership model, enforced at the infrastructure layer.</p><p>Four moves.</p><p><strong>One:</strong> Enforce three mandatory tags on every AWS resource. Owner team. Service or product. Environment. Not suggestions. Enforced at deploy time via an AWS tagging strategy built into your CI/CD pipeline. OPA policies in the pipeline block untagged resources from provisioning. AWS Config rules flag resources that drift post-creation and trigger automated remediation. No exceptions.</p><p><strong>Two:</strong> Align your AWS account structure for cost control. 
Accounts are not just security boundaries. They are cloud cost accountability boundaries. When a team owns an account, they are responsible for the bill. Cost Explorer data becomes meaningful because it maps to a real team. Monthly cost reviews are no longer abstract reports. They become conversations with accountable owners.</p><p><strong>Three:</strong> Build a lightweight ownership registry. One YAML file per service, committed to your central platform repo. It contains the owning team, the primary AWS account, the on-call rotation, and the estimated monthly cost baseline. Update it as part of service onboarding. It becomes your source of truth the next time costs spike and nobody knows where to look.</p><p><strong>Four:</strong> Run a quarterly orphan audit. Every resource with no matching registry entry is a liability. Automated Lambda functions query the AWS API for untagged or unregistered resources. Results route to the platform team for triage. Orphans surface before they compound.</p><p>None of this requires new tooling. It requires a decision: <strong>cloud resource ownership is mandatory, not aspirational.</strong></p><p>When you build that into your platform design, cost trends become readable. Spikes have named owners. Anomalies surface in context. Budget conversations happen between accountable parties, not between engineers and dashboards.</p><p>The teams that get cloud cost management right do not spend less. They know who owns what. That precision is the only foundation on which cloud cost accountability works.</p><p>If you are running a cost-optimization sprint right now and do not have a clear answer to &#8220;who owns this?&#8221;, pause the sprint. First, fix the AWS cloud cost ownership model. Everything else builds on it.</p><div><hr></div><p><em>P.S. This article is part of a deeper series on AWS cloud cost ownership. Paid readers get the full implementation kit. 
</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe&quot;,&quot;text&quot;:&quot;Upgrade Here&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.thecloudplaybook.com/subscribe"><span>Upgrade Here</span></a></p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. (<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or 
X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #100: Your platform is fast. But is it predictable?]]></title><description><![CDATA[A practical lens for building leadership trust in your systems.]]></description><link>https://www.thecloudplaybook.com/p/predictability-over-speed-platform-engineering</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/predictability-over-speed-platform-engineering</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 15 Feb 2026 13:00:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!q51i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>Speed is easy. Predictability is hard. 
</p><p>I built my platform engineering strategy around that distinction because predictable systems create sustainable speed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q51i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q51i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png 424w, https://substackcdn.com/image/fetch/$s_!q51i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png 848w, https://substackcdn.com/image/fetch/$s_!q51i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png 1272w, https://substackcdn.com/image/fetch/$s_!q51i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q51i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png" width="1456" height="1807" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1807,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7400035,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/186567632?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q51i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png 424w, https://substackcdn.com/image/fetch/$s_!q51i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png 848w, https://substackcdn.com/image/fetch/$s_!q51i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png 1272w, https://substackcdn.com/image/fetch/$s_!q51i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2c959d1-1e36-485e-bd16-a8910644f45e_1856x2304.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><div><hr></div><h2>Where did it come from?</h2><p>Most engineering organizations prioritize speed metrics.</p><p>Deployment frequency. Lead time. Story throughput. Sprint velocity. Teams push for faster pipelines, faster reviews, faster shipping. The assumption is simple. Move faster, and outcomes improve.</p><p>But speed without predictability creates a different class of failure. One that rarely shows up on dashboards until leadership feels it.</p><p>Unexpected outages. Cost spikes without explanation. Compliance work that blocks releases at the worst time. Incidents where no one is certain who owns the fix. Deployments that technically succeed but introduce downstream instability.</p><p>Early in my platform leadership experience, I realized something uncomfortable. 
</p><p><strong>Speed is the easiest capability to create in a modern cloud environment. Predictability is the hardest.</strong></p><p>Any team can ship faster with enough pressure. Fewer checks. More autonomy. Fewer guardrails. </p><p>But the resulting system becomes harder to reason about with each release. Teams move quickly. Leadership loses confidence. Eventually, velocity slows because trust erodes.</p><p>This is the moment most platform strategies go wrong. They optimize for visible speed and ignore invisible predictability.</p><p>I built my platform strategy around the opposite assumption. Speed without predictability is fragile. Predictability creates durable velocity.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Cloud Playbook! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>Why It Matters Now?</h2><p>Cloud-native architecture made speed accessible.</p><p>Infrastructure can be provisioned in minutes. CI/CD pipelines can deploy continuously. Teams can spin up services without waiting on centralized operations. From a tooling perspective, the barriers to speed have largely disappeared.</p><p>Predictability did not become easier.</p><p>As organizations scale across teams, services, and environments, the number of possible failure paths multiplies. 
</p><p>Ownership becomes less obvious. Compliance requirements intersect with delivery timelines. Observability data increases, but clarity does not automatically follow.</p><p><strong>Leadership does not worry about how fast teams can ship. Leadership worries about whether systems behave as expected after they ship.</strong></p><ul><li><p>Will costs remain within the forecast?</p></li><li><p>Will uptime remain stable?</p></li><li><p>Will audits pass without disruption?</p></li><li><p>Will incidents be contained quickly?</p></li><li><p>Will new tenants onboard without surprises?</p></li></ul><p><strong>Predictability is what allows executives to make commitments externally. Revenue targets. Customer SLAs. Regulatory timelines. Market launches.</strong></p><p>Without predictable systems, every commitment carries hidden risk. Engineering becomes a source of uncertainty rather than leverage.</p><p>This is why I anchor platform strategy on predictability first. Speed becomes meaningful only when outcomes are consistent.</p><h3>Predictability maps to delivery stability, not vibes</h3><p>The industry already has language for this.</p><p>DORA frames delivery performance with velocity and stability signals. Change failure rate and time to restore service capture instability. Deployment frequency and lead time capture speed.</p><p>Predictability occurs when speed improves without stability degrading.</p><div><hr></div><h2>What To Do With It?</h2><p>Design the platform to reduce variance, not just increase throughput.</p><h3>1) Make releases boring through standard paths</h3><p>If every team deploys differently, outcomes will vary. Predictability drops.</p><p>Use golden paths as the default route for common workflows. </p><p>Golden paths are designed to reduce cognitive load and help teams operate safely and consistently. 
</p><p>Internal developer platforms are commonly described as tools that glue together golden paths, reducing cognitive load and enabling self-service.</p><p>When you reduce cognitive load, you reduce variance. When you reduce variance, you get predictability.</p><h3>2) Invest in rollback confidence before you chase faster deploys</h3><p>Fast deployments matter only if rollback is trivial and well understood.</p><p>Standardize deployment patterns. Standardize rollback patterns. Encode ownership. Instrument the deploy so you can tell within minutes whether the release behaved as expected.</p><p>If leaders cannot trust release outcomes, they will eventually slow you down. That is predictable too.</p><h3>3) Treat reliability as a control system with error budgets</h3><p>Predictability is not the absence of failure. It is failure kept within known limits.</p><p>Error budgets are an SRE mechanism for balancing reliability and the pace of change. The budget is the unreliability your SLO permits: a 99.9% availability target leaves 0.1%. When error budgets are consumed, attention shifts from feature work to stability work.</p><p>This is a platform design pattern, not an SRE process detail. If the platform cannot enforce the tradeoff, teams will negotiate it during incidents. That increases organizational load.</p><h3>4) Design compliance and cost as defaults, not after-the-fact work</h3><p>If audits require manual evidence gathering, the system is not predictable. If cost spikes require detective work, the system is not predictable.</p><p>Predictability comes from defaults and guardrails that make the right behavior the easy behavior.</p><p>Logging, access controls, and change management should produce proof by default. Self-service should provision compliant, observable, and cost-tagged infrastructure by default.</p><div><hr></div><h2>How To Measure Predictability?</h2><p>Do not measure predictability as a feeling. Measure it as variance reduction.</p><p>Use DORA stability signals as leading indicators. Change failure rate. 
Time to restore service.</p><p>Then add platform-specific questions that expose surprise.</p><ul><li><p>How often do incidents surprise the team?</p></li><li><p>How often do cost spikes require investigation?</p></li><li><p>How often do releases behave differently than expected?</p></li><li><p>How often does ownership confusion delay resolution?</p></li></ul><p>These are operational proxies for predictability. They tell you whether the platform is creating confidence or creating work.</p><p>Speed is easy. Predictability is hard. I build platforms that deliver both.</p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. (<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a 
href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP#99: Platform engineering is load reduction, not tooling]]></title><description><![CDATA[Platform success is not adoption charts.]]></description><link>https://www.thecloudplaybook.com/p/platform-engineering-reduce-organizational-load</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/platform-engineering-reduce-organizational-load</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Wed, 11 Feb 2026 12:31:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!FZbs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the 
app</button></a></div><div><hr></div><p>Platform engineering is usually framed as a tooling function.</p><p>Build an internal developer platform. Ship golden paths. Standardize CI/CD. Centralize observability. Offer paved roads and self-service infrastructure.</p><p>Success becomes adoption. </p><p>How many teams are using the platform? How quickly do new services spin up? How many templates get reused?</p><p>Under this framing, platform teams are measured in the same way as internal product teams. </p><p>Ship features. Improve developer experience. Reduce friction. Increase autonomy.</p><p>This framing is incomplete. It focuses on what platform teams produce rather than what they remove.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FZbs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FZbs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png 424w, https://substackcdn.com/image/fetch/$s_!FZbs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png 848w, https://substackcdn.com/image/fetch/$s_!FZbs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png 1272w, 
https://substackcdn.com/image/fetch/$s_!FZbs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FZbs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png" width="1456" height="1807" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1807,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7161413,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/186567568?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FZbs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png 424w, https://substackcdn.com/image/fetch/$s_!FZbs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png 848w, 
https://substackcdn.com/image/fetch/$s_!FZbs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png 1272w, https://substackcdn.com/image/fetch/$s_!FZbs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc510cd97-74e1-41ad-958b-3714e2fe6dc8_1856x2304.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2>Tools can increase cognitive load</h2><p>When platform engineering is treated as a tooling discipline, organizations 
end up adding complexity rather than reducing it.</p><p>Every new tool adds choices. Every template adds a maintenance surface. Every internal product requires documentation, support, and governance. Instead of simplifying the system, the platform becomes another layer teams must navigate.</p><p>This is what cognitive load looks like in practice. </p><p>Developers must remember the &#8220;right&#8221; path across environments. Teams must learn which guardrails are real and which are optional. </p><p>Leaders must arbitrate escalation paths when incidents span boundaries. The organization pays coordination overhead because the system is not explicit about defaults.</p><p>Industry language often describes the platform&#8217;s job as reducing cognitive load. That is a useful entry point, and it is widely stated in platform engineering and IDP definitions.</p><p>But the deeper failure mode is not just cognitive load for developers. </p><p><strong>It is organizational load across the entire operating model. Decision load. Coordination load. Ownership load.</strong></p><p>Engineers spend time choosing among paths rather than following defaults. </p><p>Leaders spend time resolving ownership conflicts instead of delivering outcomes. Incidents escalate across teams because boundaries are unclear. </p><p>Compliance evidence is gathered manually because systems were not designed to produce it automatically.</p><p>Most platform initiatives fail not because the tools are bad. They fail because they increase the organization's total operational burden.</p><p>A platform that adds options without removing decisions is not a platform. 
It is an additional system to manage.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Cloud Playbook! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>Platform engineering reduces organizational load</h2><p>Platform engineering is not about building tools. It is about reducing organizational load.</p><p><strong>The core function of a platform team is to remove friction from the company's operating model. </strong></p><p>Reduce the number of decisions teams must make repeatedly. Reduce the number of handoffs required to ensure safe shipping. Reduce the coordination required during incidents. Reduce the effort required to prove compliance. Reduce the cognitive overhead of deploying and operating software.</p><p>This is a systems design problem. Not a tooling problem.</p><p>Tools are one mechanism. They are not the objective.</p><p>Platform teams exist to enable stream-aligned teams by reducing the cognitive load they must carry to deliver value.</p><p>If your platform work does not reduce recurring decisions and coordination, it is not reducing load. 
It is relocating it.</p><div><hr></div><h2>What changes when you see it this way</h2><h3>Your roadmap stops being feature lists.</h3><p>Priorities shift immediately.</p><p>Instead of asking what tools to build next, you ask which recurring decisions are consuming the most organizational energy. </p><p>Instead of measuring feature adoption, you measure the reduction in manual coordination. </p><p>Instead of optimizing for developer happiness alone, you optimize for clarity of ownership and default behavior.</p><p>This is where &#8220;platform as a product&#8221; becomes useful. Not because you want to behave like an internal SaaS vendor. </p><p>Because you need discipline to eliminate friction end-to-end. Research, understand, simplify, ship defaults, and remove escape hatches that recreate the problem.</p><h3>You start hunting load generators.</h3><p>You start identifying sources of load across the system.</p><ul><li><p>Ambiguous service ownership during incidents.</p></li><li><p>Manual tenant onboarding that requires cross-team coordination.</p></li><li><p>Compliance evidence gathered through ad hoc requests.</p></li><li><p>Inconsistent infrastructure patterns that increase debugging time.</p></li><li><p>Cost anomalies that require investigation across multiple teams.</p></li></ul><p>Each of these is a load generator. Each consumes time and attention across engineering, security, and leadership.</p><p>The platform response is not to build more dashboards or add more documentation. The response is to eliminate the underlying coordination requirement.</p><p>Ownership becomes encoded into infrastructure and runbooks. </p><p>Onboarding becomes automated and standardized. </p><p>Evidence is generated by default through logging and configuration baselines. 
</p><p>Cost controls are enforced through guardrails rather than after-the-fact reporting.</p><p>Organizational load drops because the system becomes explicit.</p><h3>Your success metrics become absence metrics.</h3><p>This also changes how you evaluate platform investments.</p><p>A new service template is valuable only if it removes repeated decisions. </p><p>A new observability layer is justified only if it reduces incident ambiguity and reduces escalation time. </p><p>A compliance automation initiative matters only if it eliminates manual evidence collection. </p><p>A cost governance mechanism succeeds only if it reduces executive uncertainty about spend.</p><p>If a platform initiative does not reduce organizational load, it is not platform work. It is additional complexity.</p><div><hr></div><h2>Kill one coordination loop this quarter</h2><p>Map the top five recurring coordination loops in your organization.</p><ol><li><p>Where do teams wait on each other to ship?</p></li><li><p>Where do incidents stall because ownership is unclear?</p></li><li><p>Where does compliance require manual effort?</p></li><li><p>Where do cost reviews require investigation instead of explanation?</p></li><li><p>Where do new environments require meetings instead of automation?</p></li></ol><p>Treat each loop as a design flaw, not an operational inconvenience.</p><p>Assign your platform team to eliminate one of these loops completely within the next quarter. </p><p>Do not improve it. Remove it. Replace it with a default, automated, or enforced path that requires no coordination.</p><p>Measure success by the absence of meetings, tickets, and escalations that used to exist. If the loop is still there, the load is still there.</p><p>When those disappear, organizational load drops. 
When organizational load drops, everything else moves faster without forcing speed.</p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. (<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share with your friends, colleagues, and your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic to read, please feel free to contact me directly via LinkedIn or X.</p>]]></content:encoded></item><item><title><![CDATA[TCP #98: AWS Multi-Account vs. Single-Account Strategy]]></title><description><![CDATA[Here is the framework I use to evaluate single-account vs. 
multi-account.]]></description><link>https://www.thecloudplaybook.com/p/multi-account-vs-single-account-aws-strategy</link><guid isPermaLink="false">https://www.thecloudplaybook.com/p/multi-account-vs-single-account-aws-strategy</guid><dc:creator><![CDATA[Amrut Patil]]></dc:creator><pubDate>Sun, 08 Feb 2026 13:01:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UeBf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can also read my newsletters from the Substack mobile app and be notified when a new issue is available.</p><div class="install-substack-app-embed install-substack-app-embed-web" data-component-name="InstallSubstackAppToDOM"><img class="install-substack-app-embed-img" src="https://substackcdn.com/image/fetch/$s_!7MI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1ca555-b578-4ae3-8dda-a03cbc6b1d18_500x500.png"><div class="install-substack-app-embed-text"><div class="install-substack-app-header">Get more from Amrut Patil in the Substack app</div><div class="install-substack-app-text">Available for iOS and Android</div></div><a href="https://substack.com/app/app-store-redirect?utm_campaign=app-marketing&amp;utm_content=author-post-insert&amp;utm_source=thecloudplaybook" target="_blank" class="install-substack-app-embed-link"><button class="install-substack-app-embed-btn button primary">Get the app</button></a></div><div><hr></div><p>Every growing platform team hits this fork in their AWS account strategy.</p><p>You have multiple workloads, multiple teams, and compliance requirements stacking up. 
Do you run everything in a single AWS account with tight IAM boundaries, or split into a multi-account structure with AWS Organizations?</p><p>I have managed 9 AWS accounts across 11 services for 9 tenants in regulated environments, including FedRAMP, ISO, and HIPAA.</p><p>The choice between single-account and multi-account AWS shapes your security posture, operational overhead, and audit readiness for years to come.</p><p>Get it wrong, and you are either retrofitting account isolation under audit pressure or drowning in cross-account complexity you did not need.</p><p>This is my decision framework for evaluating it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UeBf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UeBf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png 424w, https://substackcdn.com/image/fetch/$s_!UeBf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png 848w, https://substackcdn.com/image/fetch/$s_!UeBf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png 1272w, https://substackcdn.com/image/fetch/$s_!UeBf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UeBf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png" width="1456" height="1807" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1807,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7425897,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thecloudplaybook.com/i/186567423?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UeBf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png 424w, https://substackcdn.com/image/fetch/$s_!UeBf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png 848w, https://substackcdn.com/image/fetch/$s_!UeBf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png 1272w, https://substackcdn.com/image/fetch/$s_!UeBf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4265a44-3f84-4ee7-a4d8-d58637360d75_1856x2304.png 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2>The Options (Single Account, Multi-Account, or Hybrid)</h2><p><strong>Option A: Single account with strong IAM boundaries.</strong></p><p>You keep all workloads in one AWS account. You use IAM policies, resource tags, and permissions boundaries to isolate teams and environments.</p><p>Billing stays centralized. Networking stays flat. You manage one set of CloudTrail logs, one GuardDuty configuration, and one Security Hub deployment.</p><p>Your Terraform state files share a single S3 backend. 
Everything lives under one roof, and your team operates with minimal AWS organizational overhead.</p><p><strong>Option B: Multi-account with AWS Organizations.</strong></p><p>You create dedicated accounts for each environment (dev, staging, prod), for each team, or for each tenant. You use AWS Organizations with SCPs for guardrails.</p><p>Each account gets its own blast radius boundary. Billing is segmented by default. You need a landing zone, centralized logging accounts, and cross-account role assumptions.</p><p><strong>Option C: Hybrid, phased approach.</strong></p><p>You start with a single account and split as triggers emerge. You define the specific conditions that force the migration: team count thresholds, compliance scope changes, blast radius incidents.</p><p>You build the abstraction layer early, so the move is not a rewrite. Your Terraform modules reference account IDs as variables. Your CI/CD pipelines use role assumptions that work the same way whether the target is the same account or a different one.</p><div><hr></div><h2>The Tradeoffs of Each AWS Account Strategy</h2><p><strong>A single account</strong> is simpler to operate at small scale.</p><p>One set of IAM policies. 
One networking layer. One billing dashboard. Your platform team spends less time on account vending and cross-account permissions.</p><p>Onboarding a new engineer takes minutes, not hours. Your CI/CD pipelines do not need cross-account role assumptions. Your Terraform state lives in one place.</p><p>The cost shows up later.</p><p>When an S3 bucket policy misconfiguration in development exposes production data, you realize the blast radius was never contained.</p><p>When your auditor asks for evidence of environment isolation, IAM policies alone do not satisfy the control. The auditor wants assurance that a compromised set of dev credentials cannot access production resources, and tag-based policies are not sufficient proof for most assessors.</p><p>When two teams hit the same service quota, and one of them is production, you scramble. I have seen Lambda concurrency limits, API Gateway throttling, and DynamoDB throughput all become shared bottlenecks across environments within a single account.</p><p><strong>Multi-account</strong> gives you AWS account isolation by default.</p><p>A compromised dev account does not touch production. Service quotas are per-account, so a runaway Lambda in staging does not throttle your production workloads.</p><p>Cost attribution is automatic. Each account maps cleanly to a cost center or tenant. Auditors see clean environment separation, and your evidence collection for compliance becomes straightforward.</p><p>The cost is operational complexity.</p><p>Cross-account networking requires Transit Gateway or VPC peering, each with its own routing tables and security groups to manage. Centralized logging needs a dedicated account with S3 replication and cross-account CloudWatch access.</p><p>IAM roles must be assumed across accounts, adding latency and failure points to your CI/CD pipelines. 
Every new account needs a baseline: CloudTrail, Config, GuardDuty, Security Hub, and proper SCPs.</p><p>Without automation, provisioning and maintaining this baseline becomes a full-time job for your platform team.</p><p><strong>The hybrid path</strong> sounds smart but carries its own risk.</p><p>If you do not define the migration triggers upfront, you will always find a reason to delay. And the longer you wait, the more tightly coupled your single-account architecture becomes, which makes the eventual split harder.</p><div><hr></div><h2>What We Chose and Why</h2><p>We went multi-account from the start. Three factors drove the decision.</p><p><strong>First, we operated in regulated industries.</strong></p><p>FedRAMP, ISO, HIPAA. Each framework requires environment separation beyond IAM policies.</p><p>Auditors want to see account-level isolation between production and non-production. They want separate CloudTrail trails in separate accounts with restricted access.</p><p>They want to verify that a developer with access to a staging account cannot escalate to production resources through any path, not through IAM, not through networking, not through shared credentials.</p><p>Starting with an AWS multi-account strategy meant we never had to retrofit this evidence.</p><p><strong>Second, we supported 75+ developers across the US, India, and Colombia.</strong></p><p>At that team size, the blast radius risk of a single account was too high. One misconfigured AWS CDK module in development should not affect production.</p><p>Account boundaries enforce this without relying on everyone getting their IAM policies right. The boundary is structural, not behavioral. That matters at scale.</p><p><strong>Third, we built the account vending automation early.</strong></p><p>Account provisioning through AWS CDK modules. Baseline security controls applied through AWS Organizations SCPs. 
Centralized logging with cross-account replication on day one.</p><p>The upfront investment was significant: roughly three weeks of dedicated platform engineering time. The payoff was that every new account was production-ready in under an hour, with a full compliance baseline applied automatically.</p><div><hr></div><h2>When This AWS Multi-Account Framework Applies</h2><p>Use this decision framework when you see any of these conditions:</p><p>Your compliance scope includes frameworks that require environment isolation. Your developer count is above 30. Your production workloads serve external customers. You run multi-tenant infrastructure where tenant data must be segregated.</p><p>If you are a 5-person startup with one product and no compliance requirements, stay on a single account. Do not over-engineer.</p><p>But define the triggers now. Write them down.</p><p>&#8220;When our team hits 20 engineers.&#8221;</p><p>&#8220;When our first enterprise customer asks for a SOC 2 report.&#8221;</p><p>&#8220;When our first outage traces back to a dev environment change hitting production.&#8221;</p><p>These are not hypothetical. They are inevitable at scale.</p><p>The decision is not which AWS account strategy is better in the abstract. It is which strategy matches your current constraints and the trajectory you are building toward.</p><p>Evaluate accordingly.</p><div><hr></div><h2><strong>Whenever you&#8217;re ready, there are 2 ways I can help you:</strong></h2><ol><li><p><strong>Free guides and helpful resources: </strong><a href="https://thecloudplaybook.gumroad.com/">https://thecloudplaybook.gumroad.com/</a></p></li><li><p>Get certified as an <strong>AWS AI Practitioner</strong> in 2026. Sign up today to elevate your cloud skills. 
(<em><a href="https://www.udemy.com/course/aws-certified-ai-practitioner-practice-exams-aif-c01/">link</a></em>)</p></li></ol><div><hr></div><h2><strong>That&#8217;s it for today!</strong></h2><p>Did you enjoy this newsletter issue?</p><p>Share it with your friends and colleagues, or on your favorite social media platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share The Cloud Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.thecloudplaybook.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share The Cloud Playbook</span></a></p><p><strong>Until next week &#8212; Amrut</strong></p><div><hr></div><h2><strong>Get in touch</strong></h2><p>You can find me on <a href="https://www.linkedin.com/in/patilamrut/">LinkedIn</a> or <a href="https://twitter.com/realamrutpatil">X</a>.</p><p>If you would like to request a topic, contact me directly via LinkedIn or X.</p>]]></content:encoded></item></channel></rss>