<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Protomota Lab]]></title><description><![CDATA[Solo founder. Team of AI agents. I build products and run everything with orchestrated agent stacks — no human team behind it. This is where I share what actually works.]]></description><link>https://newsletter.protomota.com</link><image><url>https://substackcdn.com/image/fetch/$s_!gWIA!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfb5009-bc5b-49a9-b0ce-83b241af7782_1000x1000.png</url><title>Protomota Lab</title><link>https://newsletter.protomota.com</link></image><generator>Substack</generator><lastBuildDate>Thu, 30 Apr 2026 12:38:34 GMT</lastBuildDate><atom:link href="https://newsletter.protomota.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Protomota]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[braddunlap@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[braddunlap@substack.com]]></itunes:email><itunes:name><![CDATA[Brad Dunlap]]></itunes:name></itunes:owner><itunes:author><![CDATA[Brad Dunlap]]></itunes:author><googleplay:owner><![CDATA[braddunlap@substack.com]]></googleplay:owner><googleplay:email><![CDATA[braddunlap@substack.com]]></googleplay:email><googleplay:author><![CDATA[Brad Dunlap]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Fewer People, Less Process: How AI Changes the Way Teams Ship]]></title><description><![CDATA[Linear says issue tracking is dead. They are half right.]]></description><link>https://newsletter.protomota.com/p/fewer-people-less-process-how-ai</link><guid isPermaLink="false">https://newsletter.protomota.com/p/fewer-people-less-process-how-ai</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Sun, 12 Apr 2026 16:31:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ad96ddee-b02f-4eef-90bf-ce149cfabb5a_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Linear just declared <a href="https://linear.app/next">issue tracking dead</a>. Their CEO Karri Saarinen: "Complexity started to look like sophistication. Overhead kept growing, and the process became the work." They launched an AI agent that generates issues, triages automatically, and will soon write code. 75% of their enterprise workspaces already have coding agents installed. Agents now author nearly 25% of new issues.</p><p>Their pitch is "context over process." I think Saarinen is right about the diagnosis. But he's solving the wrong layer of the problem.</p><h2>Process became the product a long time ago</h2><p>If you've used Jira or Azure DevOps at a company with more than 50 engineers, you know what he's talking about. Jira's the one everyone loves to hate, but Azure DevOps is the quiet offender. Same disease in a Microsoft suit: work items linked to test plans linked to boards nobody opens, welded to your CI/CD so the process overhead becomes literally impossible to remove without breaking your builds.</p><p>Theo Browne <a href="https://www.youtube.com/watch?v=example">talked about this</a> &#8212; his Jira at Twitch took over two minutes to load. 
That's not a performance bug. That's what happens when every stakeholder gets to add a field.</p><p>I've lived both sides of this. For a stretch of my career I was a Jira admin at a 100-person company. It consumed about half my day. Not building. Not shipping. Tweaking workflows, fixing broken automations, cleaning up schemes that had drifted into nonsense. Half my working hours maintaining a tool whose purpose was supposed to be helping other people work. I also spent time as a product owner on a large enterprise team, basically living in Azure DevOps. Grooming backlogs, writing acceptance criteria, sitting in sprint ceremonies, managing boards. The work was real, but it wasn't building. It was managing the process around building.</p><p>At my core I'm a builder. I went back to building. Those process management roles no longer provide value in the ways we need to work today. The future belongs to builders, not process managers.</p><p>The Agile Manifesto said "working software over comprehensive documentation" in 2001. Then an industry of consultants, certification bodies, and framework vendors built a multi-billion-dollar empire around the ceremony of being agile. SAFe. Scrum certifications. "Agile transformations" led by people who haven't shipped code in a decade. Microsoft built an entire product around it. Atlassian built an entire company around it. The tools designed to make teams agile became the heaviest things in the building.</p><p>The <a href="https://theagilemindset.substack.com/p/agile-is-dead-long-live-agile">agile industrial complex</a> is real. The principles were never the problem. The industry that grew around them was.</p><h2>What AI actually changes</h2><p>Theo's approach at Twitch was to skip the spec and build a rough prototype to discover requirements. Before AI, that worked if you had the right engineer. But in the AI era, I think you actually need the spec more, not less. Not the 20-page Google Doc. A markdown file produced by the subject matter experts. What you're building, who it's for, what done looks like. Enough context to hand an agent and get a useful first pass.</p><p>You can't "skip the spec" when your builder is an LLM. The prompt IS the spec. The question is whether you write a thoughtful one that gets a working prototype on the first pass, or wing it and spend three hours course-correcting.</p><p>Write the spec, prompt the agent, get a prototype, refine. That's the loop. The cost of a first pass is approaching zero. When it costs an afternoon and a good prompt, you iterate faster than any sprint cycle could keep up with.</p><h2>A new type of team is emerging</h2><p>A solo founder with AI agents can ship what used to take 15 engineers. A team of three can operate at the scale of thirty. The leverage is compounding fast, and it changes what tools you need.</p><p>Jira and Azure DevOps exist because large teams need coordination overhead. Tickets, workflows, sprint boards, estimation rituals &#8212; all coordination tax. The cost of having a lot of humans in the same codebase. But the teams forming now don't look like that. They don't need sprint boards because there's no sprint. They don't need estimation because one person holds the whole context.</p><p>Smaller doesn't mean solo, though. There are unicorns who can design, code, manage product, and test. They exist. But they're rare, and even the best of them usually can't match the quality a small team of specialists produces. The unicorn gets you to market. 
The team gets you to quality.</p><p>The roles haven't gone away. You still need a designer's eye, an engineer's instinct for what breaks at scale, a PM who manages the project and works with clients, a QA engineer who thinks adversarially. What changes is headcount and throughput. A designer with AI explores ten directions in the time it used to take to mock up two. An engineer ships in a day what used to take a sprint. AI is a productivity accelerator, not a role eliminator. A team of five with the right specializations can operate like fifty used to.</p><p>That's the market Linear is actually chasing. Not better issue tracking for big teams. Lighter tools for smaller teams that punch above their weight.</p><h2>The business model is breaking too</h2><p>This isn't just a team structure problem. It's a pricing problem.</p><p>Software projects used to be measured in developer hours. Entire sales organizations were built around scoping engagements at X hours times Y rate. That math is collapsing. The honest unit of work is shifting from hours to tokens, and while nobody has a clean way to estimate token cost for a project yet, the direction is obvious: it's a lot less. That's all that matters.</p><p>Software can't be sold at the prices it was even two years ago. A project that a consultancy would have quoted at six months and half a million dollars can now be built by a small team in weeks for a fraction of the cost. The buyers are figuring this out. The sellers, a lot of them, haven't.</p><p>Large firms that only chase big deals are going to get their lunch taken by smaller indie dev shops that move faster, charge less, and ship better work with leaner teams. The overhead that justified premium pricing, the project managers, the Jira admins, the sprint ceremonies, the 30-person delivery teams, none of that scales the way it used to. It's just cost now.</p><p>Offshoring is facing the same math. The whole model was built on labor cost arbitrage: cheaper hourly rates in other markets. But when the unit of work shifts from hours to tokens, that gap shrinks fast. Tokens cost the same no matter where the developer sits. Meanwhile, the coordination costs haven't changed. Time zone gaps, communication overhead, context getting lost across languages and cultures, losing direct control over the taste and style of the output, the quality variance that comes from managing work at a distance. Those tradeoffs used to be worth it because the savings were significant. When a small local team with AI tools can compete on price, the economic case for offshoring gets a lot harder to make.</p><p>Sales teams are struggling and a lot of them don't understand why. The pipeline looks the same. The pitch decks look the same. But the deals aren't closing because the buyers can see the math changing. The teams that adapt, that price for the new reality and sell speed and quality instead of headcount, will win. The ones still quoting based on 2023 economics are going to have a rough year.</p><h2>What AI doesn't change</h2><p>You still don't know what your users want. <a href="https://www.inflectra.com/Ideas/Whitepaper/Is-Agile-Dead.aspx">Inflectra put it well</a>: "If code shows up faster than requirements learning, you can ship the wrong thing even sooner."</p><p>AI makes shipping faster. It does not make learning faster. Skip the feedback loops and you just automate failure. The Agile principle of "optimize for learning" matters more now than it did in 2001. 
Short feedback loops aren't optional when your agent can ship a feature before lunch.</p><p>What actually dies is the handoff model. Not the roles, but the assembly line where each role produces an artifact for the next to consume. When your team is small enough that everyone shares context directly, you don't need the proxy. The teams that win will be the ones with taste &#8212; who know what's worth building and can spot "this is wrong" before users have to tell them.</p><p>When building is cheap, judgment becomes the bottleneck.</p><h2>The manifesto was right</h2><p>Issue tracking isn't dead. Bad issue tracking deserved to die. The bloated Jira instances, the Azure DevOps boards nobody opens, the velocity charts that measured activity instead of outcomes.</p><p>What survives is what the Agile Manifesto said 25 years ago. Build working software. Collaborate with your users. Respond to change. Value people over process. We buried those principles under dashboards and certifications and two-week rituals. AI is digging them back out.</p><p>The future isn't post-Agile. It's Agile without the industry that grew around it.</p><p>Build something. Learn from it. Do it again.</p>]]></content:encoded></item><item><title><![CDATA[bringing it back home: gemma 4 on a 60-watt box]]></title><description><![CDATA[I put Gemma 4 on a Jetson AGX Orin. It's better than it should be.]]></description><link>https://newsletter.protomota.com/p/bringing-it-back-home-gemma-4-on</link><guid isPermaLink="false">https://newsletter.protomota.com/p/bringing-it-back-home-gemma-4-on</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Fri, 10 Apr 2026 20:52:25 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/adf0785c-952f-40a8-9259-052936c1ec37_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few weeks ago I wrote about getting Qwen 3.5 35B-A3B running on my Jetson AGX Orin through vLLM and OpenClaw. It worked. It was fast enough. I was happy with it.</p><p>But every time I looked at my setup there was a small mental asterisk: I was running my local agent stack on weights trained by Alibaba. Not because I had any concern about the model, but because I'm an American indie builder and my favorite local model was coming out of Hangzhou. It felt slightly off-brand for what I'm trying to build.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Protomota Lab is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Gemma 4 26B-A4B is the first genuinely capable small US open weights model I've been able to run on my own hardware. Not "capable for a small model." Capable, full stop.</p><h2>Why 26B on an Orin should not work</h2><p>The AGX Orin is a 275mm x 87mm module that draws about 60 watts under load. 64GB of unified memory shared between CPU and GPU. 
Ampere GPU, 2048 CUDA cores. It sits on my workbench next to my dev machine.</p><p>Gemma 4 26B-A4B is not a normal 26B model. It has 25.2 billion total parameters, but only 3.8 billion are active during any given inference pass. 128 experts per MoE layer, router picks 2 per token, the rest sit idle. The file is 16.8GB at Q4_K_M, but the compute per token is closer to a 4B model.</p><p>That's the whole trick. Training quality of a 26B, inference cost of a 4B. On hardware where every watt matters, that's the difference between "runs" and "doesn't."</p><p>NVIDIA's Jetson AI Lab lists the AGX Orin as a supported platform and ships a Docker container with llama.cpp pre-configured:</p><pre><code>sudo docker run -it --rm --pull always --runtime=nvidia --network host \
  -v $HOME/.cache/huggingface:/root/.cache/huggingface \
  ghcr.io/nvidia-ai-iot/llama_cpp:gemma4-jetson-orin \
  llama-server -hf ggml-org/gemma-4-26B-A4B-it-GGUF:Q4_K_M</code></pre><p>One line. Model takes ~24GB of RAM. On my 64GB Orin, that leaves plenty of room for the rest of the system.</p><h2>What "runs" actually means</h2><p>Set expectations. At Q4 on the Orin, Gemma 4 26B-A4B is not fast. I'm getting around 11 tokens per second, which I'd call "comfortable for non-interactive use." Background tasks, batch processing, offline analysis. Not real-time chat.</p><p>For my use cases this is fine. I'm not building a chatbot on the Orin. I'm running background agent tasks: log analysis, code review, config generation, structured extraction. None of those need sub-second latency. They need good output on short-to-medium prompts.</p><p>And the output is where 26B earns its place. Sub-10B models on this hardware work for simple stuff (summarize this, extract these fields, classify this log line), but they fall apart on anything that needs real reasoning. A 3B model asked to analyze a stack trace gives you something plausible that's wrong half the time. A 7B does better but hallucinates function names. Gemma 4 26B is a different animal. It understands code, handles tool calling, follows multi-step instructions without losing the thread.</p><p>The 256K context window doesn't hurt either. I can feed it a whole config file, a stack trace, and the relevant source all at once.</p><p>For the first time, I have a model on the Orin that I trust to do work I'd previously have to ship to an API.</p><h2>Rough edges</h2><p><strong>Skip Ollama on Jetson.</strong> There's an open bug where <code>gemma4:26b</code> throws HTTP 500s on moderately long context on the Orin specifically. Looks like a memory management issue with Ollama's CUDA integration on Jetson. Stick with llama.cpp.</p><p><strong>Long context is slow.</strong> The 256K window exists but filling it is painful. Prefill scales with context length, and the Orin's GPU takes a while on 50K+ tokens. I keep my prompts under 10K for anything that needs to respond in under a minute.</p><p><strong>Multimodal is experimental.</strong> Vision through llama.cpp on Jetson isn't mature yet. The text capabilities are solid.</p><h2>The punchline</h2><p>I've been waiting for a model smart enough to be useful and small enough to run on hardware I actually own. Not a rented H100. Not a cloud API I'm paying per token for. Not a Mac Studio I'd have to buy specifically for this. A 60-watt box I already have on my desk.</p><p>Gemma 4 26B-A4B is the first model that meets both criteria on the Orin.</p><p>It's not going to replace Claude or GPT for complex, multi-turn reasoning. It's not going to write this newsletter. But for the work that needs to happen locally, without a cloud round-trip, it is the best option available today. And it's US open weights, which means I can finally stop the mental asterisk.</p><p>The interesting question isn't whether it's as good as a cloud model. It obviously isn't. The interesting question is whether it's good enough that you stop needing the cloud model for a meaningful chunk of your workload.</p><p>On my bench, the answer is yes. For the first time.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Protomota Lab is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA["Just ship it" is bad advice for solo founders]]></title><description><![CDATA[The decision framework I actually use before pressing deploy]]></description><link>https://newsletter.protomota.com/p/just-ship-it-is-bad-advice-for-solo</link><guid isPermaLink="false">https://newsletter.protomota.com/p/just-ship-it-is-bad-advice-for-solo</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Thu, 02 Apr 2026 17:25:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2d401c14-dace-4b55-a3cc-2f71ec5f6cb4_1376x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>"Just ship it" sounds brave. It sounds like the kind of thing someone with a Y Combinator hoodie says while sipping cold brew in a WeWork. And for a team of six with a safety net of funding and a product manager who can course-correct after launch, it's fine advice.</p><p>For a solo founder, it's a trap.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Protomota Lab is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>When you ship something broken and you're the only person who can fix it, you don't get to "learn from the market." You get buried in support emails, refund requests, and a reputation hit you can't delegate to anyone. There's no PM absorbing the fallout while you iterate. There's just you, triaging at midnight.</p><p>Shipping too early as a solo founder is worse than shipping too late, because the recovery tax falls entirely on one person. You can't parallelize damage control and feature work when you're the whole company.</p><h2>The framework</h2><p>Five questions before anything goes live. Not a checklist in a project management tool. Just five things worth thinking through before you hit deploy.</p><p><strong>1. Can a new user finish the core loop without help?</strong></p><p>Not "can they figure it out if they're technical." Can a regular person sign up, do the main thing, and get value without messaging you? If the answer is no, it's not ready. Doesn't matter how clever the architecture is.</p><p>That dead end on one screen size, the missing empty state, the confusing button label. Nobody complains on day one. By day seven you're answering the same question in every support thread.</p><p><strong>2. 
What breaks if twice as many people use it as you expect?</strong></p><p>Solo founders don't have an SRE team. If your database melts or your API rate limits get hit on launch day, you're the one SSHing in while your phone blows up. Spend 30 minutes thinking about what happens at 2x your optimistic traffic estimate. Usually the answer is "nothing, because my traffic estimates are delusional." But sometimes it catches a real problem: a missing index, a webhook that retries infinitely on failure.</p><p><strong>3. Is there a way to undo this if it goes wrong?</strong></p><p>Feature flags. Database backups verified within the last 24 hours. A rollback path that doesn't require rewriting migrations at 2 AM. If you can't reverse the deploy in under 10 minutes, maybe don't ship it on a Friday.</p><p>The trap here: migrations that are technically reversible but practically aren't, because the rollback would wipe user data created after the deploy. Test your rollback, don't just confirm it exists.</p><p><strong>4. Are you shipping this because it's ready, or because you're tired of looking at it?</strong></p><p>There's a specific kind of fatigue that hits around week three of working on a feature. You start telling yourself "good enough" when you mean "I'm sick of this." Those are not the same thing.</p><p>If you're excited to announce it, it's probably ready. If you're relieved to be done with it, you're probably trying to escape the project, not finish it.</p><p><strong>5. What's the support burden for the first 48 hours?</strong></p><p>Can you actually be available? Or are you shipping right before going offline for six hours? Solo founders don't get to ship and disappear. The three people who try it and hit a bug during that window will leave and never come back.</p><p>Block your calendar for 48 hours after any significant release. No deep work on other projects. Just monitoring, responding, and fixing whatever surfaces. Boring, but it's the difference between a launch and a mess.</p><h2>When "just ship it" actually works</h2><p>Some things genuinely benefit from speed over polish: internal tools only you use, blog posts, landing page copy, config changes. Anything where the blast radius is you and only you.</p><p>There's also a middle ground: give it away free during a beta period. People who aren't paying have a completely different tolerance for rough edges. They'll tell you what's broken instead of demanding a refund. They'll stick around through the awkward phase because they feel like they're part of building something, not customers who got shortchanged.</p><p>A free beta buys you the one thing solo founders don't have: room to be wrong. You still need to care about the experience, but the stakes on any single release drop significantly when nobody's credit card is attached.</p><p>If a stranger will interact with it, run the five questions. If only you will, ship it. If it's a free beta, you can be a little more aggressive &#8212; but don't skip question one. A broken core loop wastes everyone's time, paying or not.</p><h2>The real problem with "just ship it"</h2><p>The advice assumes you have buffer. Someone to catch what you miss.</p><p>Solo founders have none of that. Every hour spent recovering from a bad launch is an hour not spent building the next thing. The math is brutal when there's only one of you.</p><p>Five questions, maybe 15 minutes total. 
Worth it every time.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Protomota Lab is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Stop trying to build a platform. Build a tool.]]></title><description><![CDATA[Platforms need ecosystems. Tools need one user with one problem.]]></description><link>https://newsletter.protomota.com/p/stop-trying-to-build-a-platform-build</link><guid isPermaLink="false">https://newsletter.protomota.com/p/stop-trying-to-build-a-platform-build</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Fri, 20 Mar 2026 19:42:08 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/34a84190-8e93-4f93-8021-6569c480b8e6_1264x848.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every indie founder has the same inflection point. The product works for a handful of people. Usage is growing. And then the thought arrives: "What if we opened this up? What if other people could build on top of it? What if we became the platform?"</p><p>That thought has killed more promising products than bad code, bad marketing, and bad timing combined.</p><h2>The platform trap</h2><p>A tool solves a problem. A platform hosts other people's solutions to other people's problems. The difference in engineering effort, go-to-market strategy, and operational complexity is not 2x. It's 10x or more.</p><p>When Shopify was a tool, it helped merchants set up online stores. When it became a platform, it needed an app ecosystem, a developer relations team, an API stability guarantee, a review process for third-party apps, documentation for external developers, a partner program, and a fraud detection system for apps that abuse merchant data. Each of those is a company-sized problem on its own.</p><p>Shopify pulled it off because they had hundreds of millions in revenue and thousands of employees when they made that transition. They didn't start as a platform. They earned the right to become one.</p><p>Most indie products try to skip directly to platform. They add plugin systems before they have 100 users. They build APIs before anyone has asked for one. They design extensibility frameworks before the core product is stable. Then they spend six months maintaining infrastructure that nobody uses while the core experience rots.</p><h2>Tools have gravity. Platforms have overhead.</h2><p>A good tool attracts users because it solves their problem right now. No ecosystem required. No third-party developer community needed. Someone finds it, tries it, and either it works or it doesn't. The feedback loop is fast.</p><p>A platform attracts users only when the ecosystem around it has enough value to justify the switching cost. That's a chicken-and-egg problem that burns time and money. 
You need developers to build apps before users will adopt the platform, but developers won't build apps until there are users. Solving this requires either subsidizing developers (expensive) or building the first wave of apps yourself (which means you're back to building tools anyway).</p><p>The tool path: build something, ship it, get users, charge money. Six weeks to first revenue if the product is good.</p><p>The platform path: build something, build the developer tools around it, write documentation, recruit developers, hope they build things, hope users find those things. Twelve to eighteen months before the ecosystem generates value, if it ever does.</p><p>For a solo founder or small team, the platform path is almost always wrong. Not because platforms aren't valuable. Because you can't afford the timeline.</p><h2>The premature API</h2><p>The most common form of platform thinking in early-stage products: building an API before anyone has asked for one.</p><p>APIs are expensive to maintain. Once external developers depend on your API, every endpoint becomes a contract. Changing a response format breaks someone's integration. Deprecating a field requires a migration path. Version management becomes a permanent line item in your engineering budget.</p><p>If you have 50 users and zero of them have asked for API access, you don't need an API. You need a better core product. The time spent building and documenting an API could have gone toward fixing the three things your actual users complain about.</p><p>Build the API when someone emails you saying "I need to integrate this with my system and I'll pay more for it." That's demand. Everything before that is speculation.</p><h2>The plugin system nobody uses</h2><p>Same pattern. A product with 200 users adds a plugin architecture. Now the codebase has a plugin loader, a sandboxing layer, a configuration schema, and documentation for plugin developers. Total plugins written by external developers: zero.</p><p>The plugin system exists because the founder imagined an ecosystem. Not because users asked for extensibility. The fantasy is compelling: other people building features for your product, for free, that attract more users. In practice, plugin ecosystems only work at scale. WordPress has plugins because it has millions of sites. Figma has plugins because it has millions of designers. Your product with 200 users does not have the gravity to attract plugin developers.</p><p>Build the ten features your users actually need instead of building a system for other people to maybe build features someday.</p><h2>When to actually become a platform</h2><p>There are real signals that a product should add platform capabilities:</p><p>Users are building workarounds. They're scraping your UI, exporting CSVs and transforming them in scripts, or building unofficial integrations. This means the core product has value but doesn't connect to their workflow. An API makes sense here because the demand already exists.</p><p>Power users are asking for customization that would fracture the core product. If adding every custom request would turn the product into an unmanageable mess, a plugin or extension system lets power users solve their own edge cases without bloating the main product.</p><p>Revenue supports the investment. Platform infrastructure is ongoing cost. API maintenance, developer support, documentation updates, backwards compatibility testing. 
If the business can't absorb that cost for 12+ months with no direct revenue from the platform layer, it's too early.</p><p>A third party offers to build on top of your product and pay for the privilege. This is the clearest signal. Someone with money wants to extend your product for their own commercial purposes. That's real demand.</p><p>Everything else is founder fantasy about what the product could become if it were ten times bigger. Build for what it is now.</p><h2>The tool-first path</h2><p>Build a tool. Make it good. Charge for it. Get to the point where the tool generates enough revenue and has enough users that platform capabilities become a natural extension rather than a premature bet.</p><p>Basecamp was a project management tool for years before it became anything resembling a platform. Notion was a note-taking tool before it had an API. Linear was an issue tracker before it had integrations. Each of them nailed the core experience first. The platform layer came later, funded by tool revenue, pulled by user demand.</p><p>There's a version of your product that solves one problem really well for a specific group of people. That version is shippable in weeks, not months. It doesn't need an app store or a developer portal or an API reference. It needs to work, reliably, for the people who have that specific problem.</p><p>Ship that. Everything else is a distraction dressed up as ambition.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Protomota Lab is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Building a product nobody asked for]]></title><description><![CDATA[The market research industrial complex is killing more startups than bad ideas.]]></description><link>https://newsletter.protomota.com/p/building-a-product-nobody-asked-for</link><guid isPermaLink="false">https://newsletter.protomota.com/p/building-a-product-nobody-asked-for</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Fri, 13 Mar 2026 14:21:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0592ca1e-c7dc-4b97-882a-d382884c550f_1264x848.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Two products I'm building right now started the same way. One generates and schedules content across multiple brands. The other packages AI agent capabilities into shareable, sellable units. Nobody asked for either. There was no survey. No customer discovery call. Just problems I kept running into and solutions I started building.</p><p>Neither has gone through a traditional validation process. They might fail. 
But they exist because I needed them, and that matters more than most founders think.</p><h2>The validation trap</h2><p>The standard startup advice goes like this: talk to 50 potential customers, identify a pain point, validate willingness to pay, build an MVP, test it with early adopters, iterate based on feedback.</p><p>This process is not wrong. It's just slow, expensive, and biased toward problems that are easy to articulate. The best products often solve problems people don't know how to describe yet. Nobody was asking for "a marketplace for AI agent skills" in 2024 because the concept barely existed. You can't validate demand for a category that hasn't been named.</p><p>The validation loop also selects for crowded markets. If 50 people can describe the problem clearly, 10 other teams are already building a solution. You've validated yourself into a competition.</p><h2>What works instead</h2><p>Build for yourself first. Not as a growth hack or a marketing angle. As a filter.</p><p>If you're building something you actually use, three things happen automatically:</p><p>You know the problem is real because you have it. Not "I think users might want this" but "I need this right now and nothing else does it." That's a different level of conviction. It survives the first bad week.</p><p>You're your own first tester. Every edge case, every friction point, every moment where the product falls short hits you before it hits anyone else. The feedback loop is instant. No surveys, no analytics dashboards, no waiting for support tickets. You just feel it.</p><p>You build faster because you skip the specification phase. When you're building for yourself, you already know the requirements. The PRD is in your head because you lived it. The "what should this do?" question has an obvious answer: whatever you need it to do right now.</p><h2>The market research industrial complex</h2><p>There's a whole ecosystem that profits from making founders feel like building is the risky part. Market research firms, customer discovery consultants, survey tools, validation frameworks. They all sell the same premise: the biggest risk is building the wrong thing, and their process reduces that risk.</p><p>The premise isn't entirely false. Building the wrong thing does happen. But for solo founders and small teams, the bigger risk is building nothing at all. Analysis paralysis kills more projects than bad product-market fit. You can pivot a launched product. You can't pivot a spreadsheet full of interview notes.</p><p>A founder who ships something ugly in two weeks and gets 5 real users has more useful information than a founder who spent three months interviewing 50 people and hasn't written a line of code.</p><h2>When validation actually matters</h2><p>There are situations where building without validation is genuinely reckless:</p><p>Enterprise software where each sale takes 6 months and costs $200K to close. You need to know the buyer exists and has budget before you invest a year of engineering.</p><p>Hardware products where the manufacturing minimum is 10,000 units and $500K. The cost of being wrong is bankruptcy, not wasted weekend hours.</p><p>Products that require regulatory approval. If you need FDA clearance or financial licensing, validate first because the compliance timeline is measured in years.</p><p>For software products built by a small team with low overhead? Just build it. The cost of being wrong is a few weeks of work. 
The cost of over-validating is months of delay while someone else ships the thing you were researching.</p><h2>There are no new ideas</h2><p>The other thing founders get wrong: chasing originality. Spending months looking for an idea nobody's had before. That's backwards.</p><p>If nobody has built it, there's usually a reason. Either the problem doesn't matter enough, the timing is wrong, or someone tried and failed for reasons you haven't discovered yet. Searching for a completely novel idea is one of the slowest ways to start a company.</p><p>The better approach: find something that already exists, that people already pay for, and build a version that's better for a specific group. Content scheduling tools exist. Dozens of them. But none of them were built for someone running five brands from one desk with AI agents handling the drafting. That's a niche. That's a differentiator. The category is validated. The specific angle is not.</p><p>Stripe didn't invent online payments. PayPal existed. Stripe built a payments API for developers instead of for merchants. Same category, different angle, better product for a specific audience.</p><p>Notion didn't invent note-taking or project management. They combined them in a way that attracted a specific kind of user who wanted flexibility over structure.</p><p>Linear didn't invent issue tracking. They built a version of it that was fast when everything else was slow, and opinionated when everything else tried to be configurable.</p><p>Competition is validation. If other companies are making money in the space, the demand is real. Your job is not to find a market with zero competitors. Your job is to find the gap in an existing market where nobody is serving a specific need well.</p><p>Look at what's already working. Find the group of users that existing products are ignoring or underserving. Build for them. That's faster, cheaper, and more likely to work than chasing a completely original idea.</p><h2>The unfair advantage of building what you need</h2><p>Products built for yourself have a quality that's hard to replicate: they're opinionated. They make decisions about how things should work instead of trying to accommodate every possible use case.</p><p>Basecamp was built because 37signals needed project management for their own client work. Stripe was built because the Collison brothers needed a payment API that didn't make them want to throw their laptop out a window. Slack was a chat tool built inside a gaming company that realized the chat was more valuable than the game.</p><p>None of these started with "let's research the project management / payments / enterprise chat market." They started with "this sucks, let me build something better." The product reflected specific, strong opinions about how work should happen. Those opinions attracted users who agreed and repelled users who didn't. That's not a bug. That's positioning.</p><p>Generic products built from surveys and focus groups tend toward the middle. They satisfy the average case and delight nobody. Products built from frustration tend toward extremes. They nail one workflow perfectly and ignore everything else. Users who have that exact workflow become devoted fans. Everyone else uses something else, and that's fine.</p><h2>The practical version</h2><p>If you're sitting on an idea and debating whether to validate or build:</p><p>Ask one question: do you personally have the problem this solves? If yes, build it this week. Use whatever tools get you to a working version fastest. 
Don't worry about scale, don't worry about the tech stack, don't worry about what happens at 10,000 users. You don't have 10,000 users. You have one: yourself.</p><p>Use your own product for two weeks. Write down every time it frustrates you, every time it's missing something, every time you work around a limitation. Fix the worst one. Repeat.</p><p>After a month of daily use, show it to three people who have the same problem. Not 50. Three. If two of them say "can I use this?" you have something. If all three shrug, you learned that in a month instead of three months of interviews.</p><p>The validation happened. It just happened through building and usage instead of through research and surveys. The information is better because it came from real behavior, not hypothetical answers to hypothetical questions.</p><h2>The thing nobody tells you about market research</h2><p>Survey responses are aspirational. Interview answers are performative. People describe the version of themselves they want to be, not the version that actually opens their laptop at 9 AM.</p><p>"Would you pay $20/month for a tool that does X?" gets a 70% yes rate in a survey. Actual conversion when the tool launches is closer to 3%. The gap between stated intent and real behavior is enormous, and no amount of careful survey design closes it.</p><p>Usage data is the only honest signal. And you can only get usage data from something that exists. Which means you have to build it first.</p><p>The entire market validation industry exists because of a reasonable fear (building the wrong thing) applied to the wrong solution (asking people what they want instead of watching what they do). Asking is cheap and feels productive. Building is expensive and feels risky. But building gives you the only data that actually predicts whether the product works.</p><p>Build the thing. Use the thing. Fix the thing. Show three people the thing.</p><p>That's the whole process.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Protomota Lab is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Memory is the hard part]]></title><description><![CDATA[Most agent setups forget everything between sessions. 
Here's what a working memory system actually looks like.]]></description><link>https://newsletter.protomota.com/p/memory-is-the-hard-part</link><guid isPermaLink="false">https://newsletter.protomota.com/p/memory-is-the-hard-part</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Tue, 10 Mar 2026 15:45:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2b75b68c-58a8-401e-959c-7feb598ca356_1264x848.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you're running AI agents for anything beyond one-shot tasks, you've probably noticed the same thing: they forget everything between sessions.</p><p>The LinkedIn agent doesn't know what it posted yesterday. The security agent doesn't remember which servers it already patched. The publishing agent has no idea which drafts are staged and which were rejected. Every session starts from zero. Every session wastes its first few minutes rediscovering context that should have been obvious.</p><p>Memory is what turns a chatbot into something useful. Almost nobody builds it right.</p><h2>What memory actually means in an agent system</h2><p>When people say "memory" in the context of AI agents, they usually mean one of three things, and they're usually conflating all three.</p><p><strong>Conversation history</strong> is what ChatGPT gives you. The model remembers what you said earlier in the thread. Cheapest form of memory, least useful for real work. It dies when the session ends. It fills the context window. It can't be shared across agents.</p><p><strong>Retrieved context</strong> is RAG. Embed documents, store them in a vector database, retrieve relevant chunks at query time. Works for knowledge bases. Terrible for operational memory because relevance scoring doesn't understand time, recency, or task state. A vector search for "what's the current deploy status" might return a doc from three months ago because the words match.</p><p><strong>Persistent state</strong> is what agents actually need. Not "what documents are similar to this query" but "what happened yesterday, what's in progress right now, and what decisions were already made." Most setups skip this entirely.</p><p>My system uses all three, but the third one does the heavy lifting.</p><h2>The MEMORY.md pattern</h2><p>The pattern that works: give every agent a MEMORY.md file. Plain markdown. Human-readable. Version-controlled by nature because it lives in a git repo.</p><p>A LinkedIn agent's MEMORY.md tracks voice guidelines, which posts performed well, which topics to avoid, and the current posting schedule. A security agent's MEMORY.md tracks which servers have been audited, what vulnerabilities were found, and which patches are pending.</p><p>Not a database. A text file the agent reads at session start and updates when something important changes. The format is just markdown:</p><pre><code># MEMORY.md - Agent Name

## Role
What this agent does.

## Current State
What's in progress right now.
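- e.g. "Deployed v2.3 to production, all tests passing, monitoring for 24h"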

## Decisions Made
Things that were decided and shouldn't be relitigated.

## Preferences
How the human wants things done.</code></pre><p>Why markdown instead of a database?</p><p><strong>Debuggability.</strong> When an agent does something wrong, open the MEMORY.md and read it. You can see exactly what the agent knew. No query logs, no embedding inspection, no "why did the retrieval return that chunk?" detective work. Just text.</p><p><strong>Editability.</strong> Open the file and change it. If an agent has a wrong assumption baked into memory, fix it in thirty seconds. Try doing that with a vector store.</p><p><strong>Portability.</strong> Every agent's memory is a text file in a git repo. You can grep across all agent memories in one command. Diff changes over time. Copy one agent's memory pattern to bootstrap a new agent.</p><h2>Semantic search on top of flat files</h2><p>MEMORY.md handles persistent state. But agents also need to search their memory when a question comes up that isn't covered by the top-level file.</p><p>Agents can also have a <code>memory/</code> directory with dated entries and topic-specific files. When an agent gets a question about something from last Tuesday, it runs a semantic search across MEMORY.md and everything in <code>memory/</code>. The search uses embeddings (text-embedding-3-small works fine) and returns the top snippets with file paths and line numbers.</p><p>The flow: agent gets a question, runs <code>memory_search</code>, gets back relevant snippets with citations, pulls the specific lines it needs with <code>memory_get</code>, and answers with full context.</p><p>This is where the "retrieved context" layer comes in, but it's searching the agent's own operational history, not a generic document corpus. A RAG system pointed at your company wiki retrieves information. This retrieves experience.</p><h2>Why RAG alone fails for operational memory</h2><p>A common mistake: building agent memory with RAG and nothing else. Embed all the docs, all the Slack messages, all the meeting notes into a vector store and point the agent at it.</p><p>It works for answering questions about static knowledge. "What's our refund policy?" gets the right answer because the refund policy doc is sitting in the index and the embedding similarity is high.</p><p>It fails for anything time-sensitive or state-dependent. "What did we decide about the pricing change?" might return four different documents from four different meetings because they all discuss pricing. The agent has no way to know which one is current. "What's the status of the deployment?" returns nothing useful because deployment status isn't a document, it's a state that changes every hour.</p><p>The fix isn't better embeddings or fancier retrieval. The fix is a separate memory layer that tracks state explicitly. MEMORY.md handles "what is true right now." Semantic search handles "what happened before that might be relevant." RAG handles "what do we know about this topic in general." Three layers, three purposes.</p><h2>The memory lifecycle</h2><p>Memory isn't write-once. It has a lifecycle.</p><p><strong>Capture</strong>: Something important happens during a session. The agent writes it to memory. Not everything, just decisions, outcomes, and state changes. An agent that logs every API call to memory will drown in noise. An agent that logs "deployed v2.3 to production, all tests passing, monitoring for 24h" gives its future self exactly what it needs.</p><p><strong>Recall</strong>: At the start of every session, the agent loads its MEMORY.md. This is automatic in my system. The agent's workspace files (SOUL.md, MEMORY.md, TOOLS.md) are injected into every session. For deeper recall, the agent runs semantic search when a question requires historical context.</p>
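<p>As a rough sketch of both recall paths, here's what this can look like in Python. The file names follow this article's convention; the workspace path and the <code>embed</code> function are placeholders for whatever your stack provides, and a production <code>memory_search</code> would chunk files, cache embeddings, and return line-level citations rather than embedding whole files per query:</p><pre><code>from pathlib import Path
import math

# Hypothetical layout: agents/&lt;name&gt;/ holds SOUL.md, MEMORY.md, TOOLS.md,
# plus a memory/ directory with dated entries.
WORKSPACE = Path("agents/linkedin")
INJECTED = ["SOUL.md", "MEMORY.md", "TOOLS.md"]

def build_system_prompt(base_prompt: str) -&gt; str:
    """Session start: prepend the workspace files to the system prompt."""
    sections = [
        f"## {name}\n{(WORKSPACE / name).read_text()}"
        for name in INJECTED
        if (WORKSPACE / name).exists()
    ]
    return "\n\n".join(sections + [base_prompt])

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def memory_search(query: str, embed, top_k: int = 5):
    """Deeper recall: rank memory files against a query by similarity.

    embed() is any text-to-vector function, e.g. a thin wrapper around
    text-embedding-3-small.
    """
    files = [WORKSPACE / "MEMORY.md", *sorted((WORKSPACE / "memory").glob("*.md"))]
    q = embed(query)
    scored = [(cosine(q, embed(f.read_text())), f) for f in files if f.exists()]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]</code></pre>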
<p><strong>Decay</strong>: Old memory needs to age out or get compressed. A MEMORY.md that grows forever becomes useless. I handle this with periodic consolidation: the agent reviews its memory, keeps what's still relevant, archives what isn't, and summarizes patterns. This happens on a schedule, not continuously.</p><p><strong>Correction</strong>: Sometimes memory is wrong. The agent believed something that turned out to be false, or a decision was reversed. The human edits the MEMORY.md directly. This is why plain text matters. Correcting a vector embedding is a research project. Correcting a markdown file is a text edit.</p><h2>What breaks when memory is wrong</h2><p>Bad memory is worse than no memory.</p><p>An agent with no memory starts fresh every session. It's slow but safe. It asks questions it's asked before. It redoes work it's done before. Annoying but not dangerous.</p><p>An agent with wrong memory acts on false beliefs with full confidence. The security agent that "remembers" a server was patched when it wasn't. The publishing agent that "remembers" a draft was approved when it was actually rejected. The financial agent that "remembers" an invoice was sent when it's still in queue.</p><p>This is why MEMORY.md being human-readable matters. Review agent memories periodically. Not every day, but often enough to catch drift. When an agent starts making decisions that don't make sense, check its memory first. Nine times out of ten, something in the MEMORY.md is stale or wrong.</p><h2>The cost math</h2><p>Memory isn't free. Every MEMORY.md that gets loaded into a session consumes tokens. Semantic search costs embedding API calls. Storing files costs disk space (trivial) and git history (also trivial).</p><p>The real cost is context window usage. A 500-line MEMORY.md takes roughly 2,000 tokens. Across a fleet of agents running multiple sessions per day, that adds up. But prompt caching covers most of it. MEMORY.md files load at session start, which means they hit the cache on subsequent messages within the same session. First load costs full price. Everything after costs roughly 10% because the cached prefix matches.</p><p>Without memory, agents waste tokens rediscovering context. An agent that spends 500 tokens re-asking "what's the current status?" every session burns more than the 2,000-token memory file that would have answered it upfront. The net cost of memory is negative.</p><h2>Getting started</h2><p>If you're running agents and haven't built a memory layer, start here:</p><ol><li><p>Create a MEMORY.md for your most important agent. Put the basics in it: what the agent does, what's currently in progress, and any decisions that shouldn't be repeated.</p></li><li><p>Tell the agent to update its MEMORY.md when something important changes. Not after every message. After outcomes: task completed, decision made, error encountered.</p></li><li><p>After a week, read the MEMORY.md. Is it useful? Does it contain things the agent actually needs to know? Trim the noise, keep the signal.</p></li><li><p>Add semantic search when you outgrow a single file. This usually happens when the agent has been running for a month and the memory directory has enough entries to make search worthwhile.</p></li><li><p>Set a reminder to review agent memories monthly.
Catch stale beliefs before they cause problems.</p></li></ol><p>The whole system is plain text files with a search layer on top. No specialized infrastructure. No vector database to manage (the embeddings are computed at query time or cached locally). No migration path to worry about because markdown doesn't have schema changes.</p><p>Model quality matters less than you think. Memory quality matters more than anyone talks about.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Protomota Lab is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Running Qwen 3.5 35B on an NVIDIA Jetson AGX Orin with OpenClaw]]></title><description><![CDATA[A real-world tutorial &#8212; every command included]]></description><link>https://newsletter.protomota.com/p/running-qwen-35-35b-on-a-jetson-agx</link><guid isPermaLink="false">https://newsletter.protomota.com/p/running-qwen-35-35b-on-a-jetson-agx</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Sat, 07 Mar 2026 14:02:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6af672d9-3246-4281-89dd-dd872ceb8e55_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div><hr></div><h2>What I Built</h2><p>A NVIDIA Jetson AGX Orin 64GB running Qwen 3.5 35B-A3B (MoE, custom quantized) as a local AI model provider, fully integrated into an OpenClaw agent stack. My Mac calls the Jetson over LAN using a simple alias (<code>agx</code>) and gets 35B-level reasoning back at ~30 tok/sec &#8212; $0/month, 60 watts.</p><div><hr></div><h2>Hardware &amp; Specs</h2><ul><li><p><strong>Device:</strong> NVIDIA Jetson AGX Orin 64GB</p></li><li><p><strong>OS:</strong> Ubuntu, JetPack R36.4.7 (aarch64)</p></li><li><p><strong>CUDA:</strong> 12.6</p></li><li><p><strong>RAM:</strong> 64GB unified memory (CPU + GPU share it)</p></li><li><p><strong>Storage:</strong> 3.7TB NVMe</p></li><li><p><strong>Power:</strong> ~60W under load</p></li><li><p><strong>Model:</strong> <code>Kbenkhaled/Qwen3.5-35B-A3B-quantized.w4a16</code> (w4a16 quantized, vLLM)</p></li></ul><h3>Why MoE matters</h3><p>Qwen 3.5 35B-A3B is a <strong>Mixture of Experts</strong> model. 35B total parameters, but only ~3B active per token at inference. That's ~10x less memory bandwidth per inference compared to a dense model of the same size. The dense 27B Qwen variant is <em>slower</em> on the same hardware. MoE wins on edge hardware every time.</p><div><hr></div><h2>What NOT to Do: The Ollama Trap</h2><p>The standard Ollama build of Qwen does <strong>not</strong> optimize for Orin's CUDA architecture the same way. 
If you want real performance out of the Jetson, skip Ollama and use a custom quantized build served via <strong>vLLM</strong> with CUDA acceleration.</p><p>NVIDIA's Jetson AI Lab documents this model officially &#8212; that's where I found it: &#128073; <a href="https://www.jetson-ai-lab.com/models/qwen3-5-35b-a3b/">jetson-ai-lab.com/models/qwen3-5-35b-a3b</a></p><p>The specific quantized build I used is the <code>w4a16</code> variant optimized for Orin's architecture.</p><p>vLLM serves it with an OpenAI-compatible API on port 8000.</p><div><hr></div><h2>Step 1: Verify the Model is Running</h2><p>SSH into your Jetson and confirm vLLM is serving:</p><pre><code># Check what's listening on port 8000
ss -tlnp | grep 8000
# Should show: LISTEN 0 2048 0.0.0.0:8000 0.0.0.0:*
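# If nothing is listening, check that the vLLM server process is up at all:
pgrep -af vllm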

# Verify the model API is responding
curl -s http://localhost:8000/v1/models | python3 -m json.tool</code></pre><p>Test a completion:</p><pre><code>curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Kbenkhaled/Qwen3.5-35B-A3B-quantized.w4a16",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 100
  }' | python3 -c "import json,sys; r=json.load(sys.stdin); print(r['choices'][0]['message']['content'])"</code></pre><div><hr></div><h2>Step 2: Configure OpenClaw on the Jetson</h2><p>Check what OpenClaw sees:</p><pre><code>openclaw models list</code></pre><p>Set the local model as the default:</p><pre><code>openclaw config set agents.defaults.model qwen
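# Restart the gateway so the new default takes effect: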
openclaw gateway restart</code></pre><blockquote><p><strong>Gotcha:</strong> <code>openclaw config set model qwen</code> doesn't work &#8212; <code>model</code> is not a root-level key. The correct path is <code>agents.defaults.model</code>.</p></blockquote><div><hr></div><h2>Step 3: Clean Up Stale Model Entries</h2><p>If you have leftover model entries (e.g., from an old Ollama provider or a stale alias), remove them:</p><pre><code># View current models config
openclaw config get models
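
# Safety net: back up the config before hand-editing it
cp /home/agx/.openclaw/openclaw.json /home/agx/.openclaw/openclaw.json.bak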

# Remove stale provider (e.g. ollama pointing at port 11434 that isn't running)
python3 -c "
import json
with open('/home/agx/.openclaw/openclaw.json') as f:
    cfg = json.load(f)
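# .pop() with a default is a no-op if the key is already gone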
cfg['models']['providers'].pop('ollama', None)
with open('/home/agx/.openclaw/openclaw.json', 'w') as f:
    json.dump(cfg, f, indent=2)
print('Done')
"

# Remove stale model alias from agents config
python3 -c "
import json
with open('/home/agx/.openclaw/openclaw.json') as f:
    cfg = json.load(f)
cfg['agents']['defaults']['models'].pop('kbenkhaled/Qwen3.5-35B-A3B-quantized.w4a16', None)
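# write the cleaned config back in place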
with open('/home/agx/.openclaw/openclaw.json', 'w') as f:
    json.dump(cfg, f, indent=2)
print('Done')
"

openclaw gateway restart</code></pre><blockquote><p><strong>Gotcha:</strong> There's no <code>openclaw models remove</code> command. You have to edit the JSON directly.</p></blockquote><blockquote><p><strong>Note on Ollama errors:</strong> OpenClaw has built-in Ollama auto-discovery that tries port 11434 at startup. If Ollama isn't running, you'll see <code>Failed to discover Ollama models: TypeError: fetch failed</code> in logs. This is cosmetic &#8212; it doesn't affect functionality. There's no config key to disable it in 2026.3.2.</p></blockquote><div><hr></div><h2>Step 4: Allow Remote Exec on the Jetson Node</h2><p>By default, agx requires approval for every <code>system.run</code> command from a remote session. To allow your main machine to run commands freely:</p><pre><code># Set security to full (no restrictions &#8212; fine for a trusted local node)
openclaw config set tools.exec.ask off
openclaw config set tools.exec.security full
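# Optional sanity check (assumes config get reads the same key paths that config set writes):
openclaw config get tools.exec.security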
openclaw gateway restart</code></pre><blockquote><p><strong>Gotcha:</strong> <code>security=allowlist</code> without defined safeBins will give you <code>allowlist miss</code> errors. Use <code>full</code> for a local trusted node, or define your safeBins list explicitly.</p></blockquote><div><hr></div><h2>Step 5: Add the Jetson as a Provider on Your Main Machine</h2><p>First, confirm your Mac can reach the Jetson's model server:</p><pre><code># Run this on your Mac
curl -s http://YOUR_JETSON_IP:8000/v1/models | python3 -m json.tool | head -10</code></pre><p>Then add it as a custom provider in OpenClaw (run on your Mac, or use the gateway tool):</p><pre><code>openclaw config set models.providers.agx-qwen '{
  "baseUrl": "http://YOUR_JETSON_IP:8000/v1",
  "apiKey": "none",
  "api": "openai-completions",
  "models": [{
    "id": "Kbenkhaled/Qwen3.5-35B-A3B-quantized.w4a16",
    "name": "AGX Qwen3.5-35B (local)",
    "input": ["text"],
    "contextWindow": 16000,
    "maxTokens": 4096
  }]
}'</code></pre><p>Replace <code>YOUR_JETSON_IP</code> with your Jetson's actual LAN IP:</p><pre><code># Check Jetson IP
hostname -I | awk '{print $1}'</code></pre><div><hr></div><h2>Step 6: Add a Model Alias</h2><pre><code># Run on your Mac
openclaw models aliases add agx agx-qwen/Kbenkhaled/Qwen3.5-35B-A3B-quantized.w4a16</code></pre><p>Now you can reference it anywhere as <code>agx</code>.</p><div><hr></div><h2>Step 7: Test It from Your Mac</h2><pre><code>curl -s http://YOUR_JETSON_IP:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Kbenkhaled/Qwen3.5-35B-A3B-quantized.w4a16",
    "messages": [{"role": "user", "content": "Write a one-paragraph story about a robot."}],
    "max_tokens": 300
  }' | python3 -c "import json,sys; r=json.load(sys.stdin); print(r['choices'][0]['message']['content'])"</code></pre><div><hr></div><h2>GPU Health Check</h2><pre><code># Real-time stats (run on Jetson)
tegrastats
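# Ctrl+C to exit; --interval <ms> adjusts the refresh rate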

# Or one-shot summary
nvidia-smi --query-gpu=name,temperature.gpu,utilization.gpu,memory.used,memory.total,power.draw --format=csv,noheader,nounits
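# jtop (from the jetson-stats package) is an interactive alternative if you prefer one:
# sudo pip3 install jetson-stats   # then re-login and run: jtop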

# System overview
free -h          # RAM
df -h /          # Disk
uptime           # Load average</code></pre><p>Healthy idle numbers on the AGX 64GB running Qwen 3.5 35B-A3B:</p><ul><li><p>RAM: ~56GB used (model loaded in unified memory)</p></li><li><p>GPU temp: ~46&#176;C</p></li><li><p>GPU utilization: ~10% idle, spikes during inference</p></li><li><p>Power draw: ~3.5W idle</p></li></ul><div><hr></div><h2>Gotchas Summary</h2><ul><li><p>Issue: <code>Unrecognized key: "model"</code></p><p>Cause: Wrong config path</p><p>Fix: Use <code>agents.defaults.model</code> not <code>model</code></p></li></ul><ul><li><p>Issue: <code>SYSTEM_RUN_DENIED: approval required</code></p><p>Cause: Default node security</p><p>Fix: Run <code>openclaw config set tools.exec.security full</code> on the node</p></li></ul><ul><li><p>Issue: <code>SYSTEM_RUN_DENIED: allowlist miss</code></p><p>Cause: <code>security=allowlist</code> with no bins defined</p><p>Fix: Switch to <code>full</code> or define <code>safeBins</code></p></li></ul><ul><li><p>Issue: <code>openclaw models remove</code> not found</p><p>Cause: Command doesn't exist</p><p>Fix: Edit openclaw.json directly with python3</p></li></ul><ul><li><p>Issue: Ollama errors at startup</p><p>Cause: Built-in discovery, can't disable</p><p>Fix: Ignore &#8212; cosmetic only</p></li></ul><ul><li><p>Issue: Model output includes thinking chain</p><p>Cause: Reasoning mode baked into model</p><p>Fix: Add system prompt telling it to skip thinking, or disable reasoning in vLLM config</p></li></ul><div><hr></div><h2>Hardware Note: Orin Nano Super (8GB)</h2><p>Same approach works on an Orin Nano Super (8GB) &#8212; just use a smaller model. The Qwen 3.5 Small series (just released March 2026, 0.8B&#8211;9B range) is built for on-device/edge and fits the 8GB form factor. Methods are identical.</p><div><hr></div><h2>Why Bother?</h2><ul><li><p>$0/month operating cost</p></li><li><p>No API latency, no rate limits, no data leaving your network</p></li><li><p>60W power draw &#8212; runs all night on overnight tasks</p></li><li><p>35B-level reasoning for background jobs: research, batch processing, coding runs</p></li><li><p>Full tool-use and thinking capabilities</p></li><li><p>Still might become the brain for a robot someday</p></li></ul><div><hr></div><p><em>Originally set up March 6, 2026. All commands verified in production.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">This Substack is reader-supported. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Why your AI workflow keeps skipping steps]]></title><description><![CDATA[The failure mode nobody warns you about, and how to design around it.]]></description><link>https://newsletter.protomota.com/p/why-your-ai-workflow-keeps-skipping</link><guid isPermaLink="false">https://newsletter.protomota.com/p/why-your-ai-workflow-keeps-skipping</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Fri, 06 Mar 2026 22:40:48 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/611facb3-e64f-4b16-a0b7-80879ea0a6a4_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you've built a workflow with an AI agent, you've hit this: the agent does most of the task, declares victory, and stops. Step 6 never happened. The output looks plausible enough that you almost miss it.</p><p>This isn't a bug. It's how language models work. Understanding it changed how I build everything.</p><h2>Two kinds of workflows</h2><p>Deterministic workflows (n8n, Zapier, custom code) follow the same path every run. Step 2 fires after Step 1. If it breaks, it breaks the same way, which means you can find it and fix it. The limitation: you specify everything upfront. Every edge case. Every branch.</p><p>AI skill workflows don't have a fixed path. The model reads instructions, reasons about the task, and decides what to do. It handles ambiguity and adapts to context you never anticipated.</p><p>The tradeoff: execution quality depends on the model doing the work. Same skill file. Different model. Different results.</p><h2>How AI workflows actually fail</h2><p>Traditional software fails with wrong logic. You find it, you fix it.</p><p>AI workflows fail by <strong>premature completion</strong>. The model decides the task is "done enough" and wraps up. No error. No warning. It just stops.</p><p>This happens because language models are trained to be helpful and responsive. Finishing feels like the right move. The longer and more complex the task, the more likely the model cuts a corner, especially on steps that feel administrative or repetitive.</p><p>I see this constantly. My daily content pipeline has 7 steps. A weaker model hits Step 4, feels like the main work is done, and summarizes. A stronger model follows through to Step 7 even when it's grinding through the fifth piece of content in a row.</p><h2>Design for compliance, not just clarity</h2><p>Tightening the skill file helps more than switching models (though model choice matters too).</p><p><strong>Be explicit about completion.</strong> "DO NOT skip this step" and "This step is MANDATORY" aren't redundant. They counteract the model's natural tendency to treat later steps as optional.</p><p><strong>State completion criteria.</strong> Instead of "write social posts," write "write social posts for all four platforms: Twitter, Instagram, TikTok, YouTube Shorts. 
All four must be present before this step is complete."</p><p><strong>Use memory for standing rules.</strong> If a step gets skipped repeatedly, add it to memory with the date it was corrected. Models read prior corrections as high-priority context.</p><p><strong>Verify outputs, not just completion.</strong> Don't check whether the model said it finished. Check whether the output files exist, the word counts are right, the required sections are present.</p><h2>The hybrid architecture</h2><p>The most practical solution for complex pipelines: use deterministic tools for orchestration and AI for content.</p><p>n8n handles the skeleton. Trigger at 8am, pass Step 1 output to Step 2, wait for approval gate, continue. The structure is reliable. AI fills in the variable parts: writing the summary, picking the angle, adapting tone.</p><p>The mental model: n8n is the project manager, AI is the writer. The project manager doesn't forget steps. The writer doesn't need to think about pipeline logic.</p><h2>What to actually do</h2><p>If you're running AI workflows today:</p><ol><li><p><strong>Audit your skill files for vague step language.</strong> "Write social content" is not a complete instruction.</p></li><li><p><strong>Add explicit completion checks.</strong> List every required output, not just the task.</p></li><li><p><strong>Test with a weaker model.</strong> If a cheaper model skips steps, your skill isn't tight enough. Fix the skill, not the model tier.</p></li><li><p><strong>Consider hybrid architecture</strong> for any pipeline longer than 4-5 steps. The complexity cost of n8n pays off fast.</p></li></ol><p>The goal isn't to make AI workflows as reliable as traditional software. They can't be. The goal is to design them so the model's flexibility fills the right gaps, and none of the mandatory ones.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">This Substack is reader-supported. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[What a 3-person team that writes zero code is telling us]]></title><description><![CDATA[StrongDM built production security software with no human writing or reviewing a single line. 
Here's what they actually did &#8212; and what it means.]]></description><link>https://newsletter.protomota.com/p/what-a-3-person-team-that-writes</link><guid isPermaLink="false">https://newsletter.protomota.com/p/what-a-3-person-team-that-writes</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Fri, 06 Mar 2026 21:24:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!LUTR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LUTR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LUTR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png 424w, https://substackcdn.com/image/fetch/$s_!LUTR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png 848w, https://substackcdn.com/image/fetch/$s_!LUTR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png 1272w, https://substackcdn.com/image/fetch/$s_!LUTR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LUTR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LUTR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png 424w, https://substackcdn.com/image/fetch/$s_!LUTR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png 848w, https://substackcdn.com/image/fetch/$s_!LUTR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LUTR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd13b495f-1318-406c-bcca-c1ca07304fab_3168x1344.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>Three engineers at StrongDM built production security software in 2025 under two rules: no human writes code, no human reviews code. They shipped it. It's running in production.</p><p>I've been following this story since they published their methodology in February. My reaction was something between "obviously this is where things are headed" and "I genuinely don't know how I feel about that."</p><p>The domain matters here. StrongDM isn't building a todo app. They're building access management software &#8212; the kind that controls who can touch what across Okta, Jira, Slack, and Google Drive. If it has a flaw, the blast radius is real. The fact that no human reviewed the code doesn't make it smaller.</p><h2>The testing problem they actually solved</h2><p>The part that stuck with me: agents cheat. Not deliberately, but effectively. If a test checks whether a function returns a specific value, the agent will hardcode that value. Test passes. Software is broken. The model found the shortest path to green and didn't care whether it was useful.</p><p>This isn't a new problem. Goodhart's Law has been around since 1975. What's new is that the cheater is your software, and it's faster at gaming metrics than you are at writing them.</p><p>StrongDM's fix: treat validation like a machine learning holdout set. Store test scenarios completely outside the codebase, where the agent can't read them. Their evaluation framework tests user-level outcomes &#8212; did the software do what the user needed, not did this function return the right value.</p><blockquote><p>They call this measuring "satisfaction." I'd call it the right question.</p></blockquote><h2>The fake infrastructure play</h2><p>They also built behavioral clones of every third-party service the software integrates with. Full replicas of Okta, Jira, Slack, Google Drive &#8212; their APIs, edge cases, observable behaviors &#8212; running locally with no rate limits and no production risk. They call it a Digital Twin Universe.</p><p>With it, they run thousands of test scenarios per hour. The setup lets them:</p><ol><li><p>Simulate failure modes that would be dangerous to test against live systems</p></li><li><p>Run the same scenario thousands of times without rate limits</p></li><li><p>Have the agents building the software also build the testing environment</p></li></ol><p>Six months ago, faithfully replicating even one major SaaS API was economically absurd. Now it's table stakes for this team.</p><h2>The accountability question nobody has answered</h2><p>When no human has read the code, who's responsible for what it does?</p><p>There's no good answer yet. Stanford Law flagged it two days after StrongDM's announcement. Existing software liability frameworks assume a human made decisions about what shipped. The legal infrastructure for "the model decided" doesn't exist.</p><p>This matters for anyone building AI-first. Your outputs have consequences regardless of whether a human touched the code.</p><h2>The number</h2><blockquote><p>StrongDM's benchmark: if you're not spending at least $1,000 per engineer per day on tokens, your factory has room to improve.</p></blockquote><p>That's $20K/month per engineer in inference before salaries. 
The math works if three engineers can build and maintain production security software without reviewers. It doesn't work for most teams today.</p><p>But the cost comes down as models get cheaper, and the methodology &#8212; scenario holdouts, digital twins, probabilistic validation &#8212; scales in both directions.</p><p>Whether this becomes standard practice is a question 2026 is answering right now. I'm watching closely.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Brad Dunlap's Substack is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Your agent config is code. Start treating it like it.]]></title><description><![CDATA[Why your SKILL.md is infrastructure, not a preference &#8212; and what happens when you treat it that way.]]></description><link>https://newsletter.protomota.com/p/your-agent-config-is-code-start-treating</link><guid isPermaLink="false">https://newsletter.protomota.com/p/your-agent-config-is-code-start-treating</guid><dc:creator><![CDATA[Brad Dunlap]]></dc:creator><pubDate>Wed, 04 Mar 2026 15:48:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Id5u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Id5u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Id5u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png 424w, https://substackcdn.com/image/fetch/$s_!Id5u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png 848w, https://substackcdn.com/image/fetch/$s_!Id5u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!Id5u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Id5u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Id5u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png 424w, https://substackcdn.com/image/fetch/$s_!Id5u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png 848w, https://substackcdn.com/image/fetch/$s_!Id5u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!Id5u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b29ffa-1bd2-4a14-81a8-540dd09a7ae4_2752x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>Most people are still calling this "better prompting." I don't think that's right.</p><p>What's actually happening: we're moving from one-off instructions to software-defined environments. When I started building agent teams for my projects, I thought in terms of prompts. Now I think in terms of files. SKILL.md, CLAUDE.md, .clinerules, action schemas &#8212; each one is a deployable artifact with a specific scope and lifecycle.</p><p>The difference matters more than it sounds.</p><p>A prompt is temporary. A skill file is infrastructure. When I write a SKILL.md for one of my agents, I'm authoring behavior: what it does, in what order, under what constraints, and when it stops. That's configuration code. It belongs in version control, gets reviewed when it changes, and gets maintained like anything else in your stack.</p><p>I learned this the hard way. Early versions of my agent setups were sprawling instruction blobs I'd partially remember and inconsistently apply. New behavior introduced in one session would disappear in the next. The agent's reliability was inversely proportional to how long it had been since I'd touched the config.</p><p>Once I started treating these files like actual code &#8212; committed to git, organized by concern, with clear inheritance &#8212; everything got more predictable. Not because the models improved. 
Because the configuration did.</p><h2>The layering pattern</h2><p>A setup worth copying in 2026:</p><ol><li><p>A root config for global defaults and guardrails</p></li><li><p>Skill modules for work you do repeatedly</p></li><li><p>Project-level overrides that inherit from the base</p></li><li><p>Memory files that persist context across sessions</p></li></ol><p>One configuration for writing. Another for product work. Another for ops. Each tuned, versioned, owned by you.</p><blockquote><p>The alternative is relying on vibes and hoping the agent remembers what you told it last week.</p></blockquote><h2>The security surface nobody's thinking about</h2><p>If behavior lives in files, those files are part of your attack surface. Not theoretically &#8212; practically.</p><p>Third-party skill files should be treated like third-party dependencies:</p><ul><li><p>Review them before running them</p></li><li><p>Scope permissions to what's actually needed</p></li><li><p>Don't give an agent write access when it only needs to read</p></li></ul><p>Most people haven't applied basic hygiene standards to their agent configs yet. That gap will close, probably after something goes wrong.</p><h2>What to actually do</h2><p>Put your config files in git. Write commit messages that explain why a rule changed, not just what changed. Build skills that do one thing and can be tested in isolation.</p><p>Think through your precedence model deliberately: what's global, what's project-specific, what can a single user override. Those answers exist &#8212; they just require thinking it through rather than letting the config accumulate by accident.</p><blockquote><p>A well-built agent stack can be cloned, backed up, handed off, and iterated. It's your operating layer. Treat it like one.</p></blockquote><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.protomota.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Brad Dunlap's Substack is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item></channel></rss>