Blog Post: What a Closed-Loop Coding Agent Actually Is #1188
Merged
5 commits
d6d8b0d
Blog Post: What a Closed-Loop Coding Agent Actually Is
hubyrod 05c97ba
Address Scipio917 review feedback on closed-loop post
hubyrod e85c13b
Tighten prose in closed-loop post
hubyrod 4f1b237
Merge branch 'main' into blog_article
hubyrod 25dcc1a
Update intro and publication date
hubyrod
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| --- | ||
| title: What a Closed-Loop Coding Agent Actually Is | ||
| description: Closed-loop control requires a feedback signal trustworthy enough to act on. Most AI coding agents close their loops on lossy sensors (tests, unsound types, model judgment) and inherit the instability that follows. Skipper is built on sound signals with SKJS, deterministic execution, and reactive computation. | ||
| slug: closed_loop_coding_agent | ||
| date: 2026-05-18 | ||
| authors: hubyrod | ||
| image: /img/skip.png | ||
| --- | ||
|
|
||
| Skipper is a closed-loop coding agent. That label carries more weight than it appears to. The industry adopted "closed-loop" without retaining its original meaning, so many tools labeled that way don't truly qualify. Once you understand the distinction, you see what sets Skipper apart. | ||
|
|
||
| ## What closed-loop meant before we borrowed it | ||
|
|
||
| Closed-loop control comes from control theory. An autopilot reads altitude, compares it to the target, and nudges the elevators based on the error. The defining property isn't that a loop exists. It's that **the feedback signal is trustworthy enough to act on**. | ||
|
|
||
| An autopilot with a broken altimeter will fly the plane into the ground. Confidently. Autonomously. In perfect mechanical closure. | ||
|
|
||
| Control theorists know this. They obsess over sensor fidelity, because a closed loop with a bad sensor isn't a stable system. It's an unstable one with extra steps. | ||
|
|
||
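The autopilot loop described above can be sketched as a toy proportional controller. Everything here is an illustrative assumption (the `flyTo` helper, the gain, the 500-unit sensor bias), not a model of any real autopilot. The point it shows: with a biased altimeter the loop still settles smoothly, just at the wrong altitude.

```typescript
// Hypothetical toy model: names, gain, and bias are assumptions for illustration.
type Sensor = (trueAltitude: number) => number;

// Run a proportional controller for a fixed number of steps and
// return the true altitude the aircraft ends up at.
function flyTo(
  target: number,
  start: number,
  sensor: Sensor,
  steps = 200,
  gain = 0.2,
): number {
  let altitude = start;
  for (let i = 0; i < steps; i++) {
    const measured = sensor(altitude); // what the loop believes
    const error = target - measured;   // compare to the target
    altitude += gain * error;          // nudge toward the target
  }
  return altitude;
}

const soundAltimeter: Sensor = (h) => h;
const biasedAltimeter: Sensor = (h) => h + 500; // reads 500 units too high

const withSound = flyTo(1000, 0, soundAltimeter);
const withBiased = flyTo(1000, 0, biasedAltimeter);

console.log(Math.round(withSound), Math.round(withBiased)); // 1000 500
```

Both runs converge. Only the first converges on the target, and nothing inside the loop can tell the difference.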
| {/* truncate */} | ||
|
|
||
| Medical devices go further. An artificial pancreas is closed-loop. A glucose monitor that beeps at a human to dose insulin is not. The line isn't drawn at the loop. It's drawn at whether the signal is sound enough to take the human out. You earn closed-loop autonomy through signal quality. You can't declare it through architecture. | ||
|
|
||
| ## What most agent loops actually close on | ||
|
|
||
|
|
||
| So look at what a typical coding agent feeds back into its next decision: | ||
|
|
||
| - **Test output.** Tests are samples, not proofs. The agent can pass them without satisfying the intent. Sometimes by editing the test. | ||
| - **Linter and type-checker output.** Useful, but TypeScript's type system is unsound by design. A green check means "I couldn't find a contradiction in the bit I can see." It doesn't mean correct. | ||
| - **Runtime logs.** Noisy, incomplete, silent on the failures you actually care about. | ||
| - **Model-based judgment.** The agent's own confidence, or a second model with the same blind spots. Either way, the signal comes from the very thing you were trying to check. | ||
|
|
||
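To make "tests are samples, not proofs" concrete, here is a minimal sketch. The function and the sample cases are hypothetical, chosen only to show an implementation that passes every test it was given while missing the intent.

```typescript
// Intent: the Gregorian leap-year rule.
// This (deliberately incomplete) version only checks divisibility by 4.
function isLeapYear(year: number): boolean {
  return year % 4 === 0;
}

// The sample tests an agent might be handed: all of them pass.
const samples: Array<[number, boolean]> = [
  [2024, true],
  [2023, false],
  [2000, true],
];
const allSamplesPass = samples.every(
  ([year, expected]) => isLeapYear(year) === expected,
);

// An input outside the samples: 1900 was not a leap year,
// but this implementation says it was.
const verdictFor1900 = isLeapYear(1900);

console.log(allSamplesPass, verdictFor1900); // true true
```

A green test run here reports agreement on three points, not correctness of the rule.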
| Every one of these is a lossy sensor. It drops information about the thing it's supposed to measure. The loop still closes mechanically. Output feeds back into the next decision. But what flows through it is a low-fidelity proxy for correctness. | ||
|
|
||
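The type-checker case can be shown in a few lines. This sketch uses a well-known gap in TypeScript (arrays are treated covariantly); the `Animal` and `Dog` types are hypothetical. The code is accepted by the compiler and still fails at runtime.

```typescript
interface Animal { name: string }
interface Dog extends Animal { bark(): string }

const dogs: Dog[] = [];
const animals: Animal[] = dogs;  // accepted: Dog[] is assignable to Animal[]
animals.push({ name: "Tom" });   // accepted: a plain Animal now sits in a Dog[]

function barks(pack: Dog[]): string {
  try {
    // Type-checks: every element is declared to be a Dog.
    return pack.map((d) => d.bark()).join(",");
  } catch (e) {
    // At runtime the first element has no bark method.
    return "runtime error: " + (e instanceof TypeError ? "TypeError" : "other");
  }
}

const outcome = barks(dogs);
console.log(outcome); // runtime error: TypeError
```

The checker's green light was honest about what it examined. It simply did not examine enough to rule this out.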
| And because the agent acts on that proxy with full autonomy, small signal errors compound into the failure modes we all recognize: | ||
|
|
||
| - The agent deletes the failing test and declares victory. | ||
| - The agent ships code that type-checks fine but violates an invariant the types can't express. | ||
| - The agent loops for 40 minutes polishing a solution that was already correct on attempt 3, because nothing in the signal told it to stop. | ||
| - The agent drifts. Each step is locally plausible. The trajectory is nonsense. | ||
|
|
||
| These aren't bugs in the agents. They're what closed-loop control does when the sensor is lossy. Control theory called this decades ago. We're rediscovering it the hard way. | ||
|
|
||
| ## A physics problem, not a craftsmanship problem | ||
|
|
||
| Here's the part worth sitting with. No amount of prompt engineering, scaffolding, or agent-framework cleverness fixes a lossy feedback signal. You can tune the controller all day. If the altimeter is wrong, the altitude is wrong. The ceiling on an agent's reliability is the trustworthiness of what it closes the loop on. | ||
|
|
||
| So the real question when you're evaluating an AI coding tool isn't *how autonomous is the loop*. It's *what does the loop close on, and how much can you trust it?* | ||
|
|
||
| A sound type checker closes a better loop than an unsound one. A deterministic execution environment closes a better loop than a flaky one. A reactive system that recomputes only what changed closes a better loop than one that re-runs everything and re-derives confidence from noise. These are sensor upgrades. They're what lets the loop actually control something. | ||
|
|
||
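As a rough illustration of "recomputes only what changed", here is a minimal sketch of dependency-driven re-evaluation. The `Check` shape and `createReactiveChecks` helper are hypothetical and say nothing about how any particular tool implements reactivity.

```typescript
// Hypothetical model: each check declares the files it depends on.
type Check = { name: string; deps: string[]; run: () => boolean };

function createReactiveChecks(checks: Check[]) {
  const results = new Map<string, boolean>();
  const runLog: string[] = []; // records every evaluation, in order

  const evaluate = (c: Check): void => {
    runLog.push(c.name);
    results.set(c.name, c.run());
  };

  checks.forEach((c) => evaluate(c)); // initial full evaluation

  return {
    // Re-evaluate only the checks that depend on the changed file.
    onChange(file: string): void {
      checks.filter((c) => c.deps.includes(file)).forEach((c) => evaluate(c));
    },
    results,
    runLog,
  };
}

const graph = createReactiveChecks([
  { name: "parser", deps: ["parser.ts"], run: () => true },
  { name: "printer", deps: ["printer.ts"], run: () => true },
  { name: "roundtrip", deps: ["parser.ts", "printer.ts"], run: () => true },
]);

graph.onChange("printer.ts");
console.log(graph.runLog.join(","));
// parser,printer,roundtrip,printer,roundtrip
```

After the edit to `printer.ts`, the `parser` check is not re-run, so any new result is attributable to the change that caused it.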
| This is what Skipper is built on. SKJS is a fully sound TypeScript-compatible type checker, so when Skipper's loop closes on a type signal, that signal means what it says. Execution is deterministic, so "it worked" is reproducible instead of probabilistic. Computation is reactive, so feedback is precise instead of a fog of re-runs. Three sensor upgrades. Together they're what lets Skipper close the loop at all. | ||
|
|
||
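What "reproducible instead of probabilistic" buys can be sketched with a seeded random source. This is a generic illustration, not Skipper's mechanism: `seededRandom` is a small linear congruential generator written for this example, and `runTrial` is a hypothetical stand-in for any execution that consumes randomness.

```typescript
// A small seeded generator (linear congruential). Illustrative only.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // value in [0, 1)
  };
}

// Hypothetical "execution": its outcome depends on random draws.
function runTrial(random: () => number): number[] {
  return Array.from({ length: 5 }, () => Math.floor(random() * 100));
}

// Same seed, same run: a result observed once can be observed again.
const first = runTrial(seededRandom(42));
const second = runTrial(seededRandom(42));
const reproducible = first.join(",") === second.join(",");

console.log(reproducible); // true
```

With an unseeded source such as `Math.random`, two runs would generally differ, and a single passing run would say little about the next one.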
| ## Why this matters | ||
|
|
||
| The industry has been iterating hard on the controller: better prompts, better scaffolds, better planning. All the while, it has treated the feedback signal as given. The feedback signal was never given. It's the variable that decides whether your autonomous loop is closed-loop or just an unstable one in a costume. | ||
|
|
||
| Skipper closes the loop because Skipper took the sensor seriously. Everything else in the category is running open-loop with confidence it hasn't earned. | ||