RFC / Roadmap: moving Agentic-Flow toward a meta-harness architecture (freeze the model, evolve the harness) #173

## TL;DR

Agentic-Flow is shifting from "an AI agent orchestrator" to **an agentic *meta-harness*** — a runtime whose main job is to build, improve, and verify the **harness around a model**, not to be a model. The slogan: **freeze the model, evolve the harness.** The first pieces shipped in **`agentic-flow@2.1.0`**. This issue explains the move in plain language and invites feedback on the direction.

---

## What's a "harness," and what's a "meta-harness"?

The **model** is the LLM (Claude, GPT, a local model, etc.).

The **harness** is *everything around the model* that turns it into a useful agent:
- how it plans a task,
- what context/files it's shown,
- how it reviews its own output,
- when it retries,
- which tools it can call,
- what it remembers,
- how "success" is scored.

A **meta-harness** is a system whose product is *that harness* — it chooses models, improves the harness, runs agents on it, and verifies the whole thing is safe and trustworthy.
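To make the model/harness split concrete, here is a minimal sketch, not agentic-flow's API. Every name in it (`Model`, `Harness`, `runTask`, the toy `frozenModel`) is hypothetical and exists only for illustration. It assumes a harness can be reduced to prompt building, scoring, and a retry budget, which is far simpler than a real one.

```typescript
// Hypothetical types for illustration; none of these names come from agentic-flow.
type Model = (prompt: string) => string;

interface Harness {
  // What context the model is shown, including feedback from a prior attempt.
  buildPrompt: (task: string, previous: string | null) => string;
  // How "success" is scored.
  score: (output: string) => number;
  passThreshold: number;
  // When it retries: the total number of attempts allowed.
  maxAttempts: number;
}

interface RunResult {
  output: string;
  attempts: number;
  passed: boolean;
}

// Run one task: call the model, score the output, retry until the budget runs out.
function runTask(model: Model, harness: Harness, task: string): RunResult {
  let previous: string | null = null;
  for (let attempt = 1; attempt <= harness.maxAttempts; attempt++) {
    const output = model(harness.buildPrompt(task, previous));
    if (harness.score(output) >= harness.passThreshold) {
      return { output, attempts: attempt, passed: true };
    }
    previous = output;
  }
  return { output: previous ?? '', attempts: harness.maxAttempts, passed: false };
}

// Toy frozen model: it only produces a good answer when shown a prior attempt.
const frozenModel: Model = (prompt) =>
  prompt.includes('Previous attempt:') ? 'fixed' : 'draft';

const baseHarness: Harness = {
  buildPrompt: (task, previous) =>
    previous === null ? task : `${task}\nPrevious attempt: ${previous}`,
  score: (output) => (output === 'fixed' ? 1 : 0),
  passThreshold: 1,
  maxAttempts: 1,
};

// "Evolved" harness: identical except that it allows one retry.
const evolvedHarness: Harness = { ...baseHarness, maxAttempts: 2 };

const before = runTask(frozenModel, baseHarness, 'Fix the failing test');
const after = runTask(frozenModel, evolvedHarness, 'Fix the failing test');
console.log(before.passed, after.passed);
```

The model is the same function in both runs; only the harness changed. A meta-harness would search over harness changes like this one instead of hand-picking them.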

## Why move this way?

Because the lever that shows up in our measurements is **the harness, not a bigger model**. A cheap model inside a well-built, self-improving harness can often match an expensive model at a fraction of the cost. So instead of always reaching for a bigger model, we make the *harness* smarter.

## The four pillars

| Pillar | In plain terms |
|---|---|
| 🧭 **Route** | Send each request to the cheapest model that's still good enough for it. |
| 🧬 **Evolve** | Let the system improve its own harness and repair code automatically — same model, better results. |
| 🤝 **Orchestrate** | Run the agents, tools, memory, and swarms on top. |
| 🔏 **Verify** | A safety gate on every harness change, plus signed provenance so you can trust what shipped. |
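The Route row above boils down to a simple selection rule. The sketch below illustrates that rule only; it is not the `CostOptimalRouter` implementation. `ModelOption`, `pickCheapestAboveBar`, the costs, and the quality numbers are all made up for the example, and a real router would predict quality per request rather than use one fixed number per model.

```typescript
// Hypothetical shape; not the agentic-flow API.
interface ModelOption {
  name: string;
  costPerCall: number;      // arbitrary cost units
  predictedQuality: number; // 0..1, e.g. estimated from eval logs
}

// Pick the cheapest model whose predicted quality clears the bar.
// If none clears it, fall back to the highest-quality model available.
function pickCheapestAboveBar(models: ModelOption[], qualityBar: number): ModelOption {
  if (models.length === 0) {
    throw new Error('no models to route to');
  }
  const eligible = models.filter((m) => m.predictedQuality >= qualityBar);
  if (eligible.length > 0) {
    return eligible.reduce((best, m) => (m.costPerCall < best.costPerCall ? m : best));
  }
  return models.reduce((best, m) => (m.predictedQuality > best.predictedQuality ? m : best));
}

// Illustrative numbers only.
const models: ModelOption[] = [
  { name: 'small', costPerCall: 1, predictedQuality: 0.72 },
  { name: 'medium', costPerCall: 4, predictedQuality: 0.91 },
  { name: 'large', costPerCall: 20, predictedQuality: 0.97 },
];

const choice = pickCheapestAboveBar(models, 0.9);
console.log(choice.name); // "medium"
```

The fallback branch is a design choice in this sketch: when nothing clears the bar, it prefers quality over cost. A production router might instead reject the request or escalate.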

## What already shipped in 2.1.0

- **Route** — cost-optimal model routing: learn from your own eval logs and pick the cheapest model predicted to clear a quality bar. *Measured: ~28.5% cheaper than always using the top model while keeping ~98% of answers above the bar.*
- **Evolve** — `agentic-flow-repair`: an autonomous "freeze the model, evolve the harness" loop that repairs a repo, gated by the repo's own tests in a shell-free, secret-scrubbed sandbox.
- **Verify** — harness MCP tools (`harness_repair` / `harness_manifest` / `harness_verify`) and an Ed25519 *witness manifest* so you can sign your agent/harness config and detect tampering.
- **Positioning** — README + package now lead with the meta-harness identity (ADR-073/074/075/076).
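The witness-manifest idea in the Verify item can be sketched with Node's built-in `node:crypto` module. This is an assumption-laden illustration, not the package's `signFiles` / `verifySignedManifest` implementation: the manifest shape (path mapped to SHA-256 digest), the `buildManifest` helper, and the file names are hypothetical.

```typescript
import { createHash, generateKeyPairSync, sign, verify } from 'node:crypto';

// Hypothetical manifest shape: file path -> SHA-256 hex digest of its contents.
type Manifest = Record<string, string>;

function buildManifest(files: Record<string, string>): Manifest {
  const manifest: Manifest = {};
  // Sort paths so the serialized manifest is deterministic.
  for (const path of Object.keys(files).sort()) {
    manifest[path] = createHash('sha256').update(files[path]).digest('hex');
  }
  return manifest;
}

// Throwaway key pair for the example; real keys need proper storage.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');

// In-memory stand-ins for config files on disk.
const files: Record<string, string> = {
  'harness/config.json': '{"retries":2}',
  'agents/reviewer.md': 'Review the diff.',
};

// Sign the serialized manifest. Ed25519 takes no separate digest algorithm,
// so the first argument to sign/verify is null.
const payload = Buffer.from(JSON.stringify(buildManifest(files)));
const signature = sign(null, payload, privateKey);

// Verification succeeds while the files are unchanged...
const okUntouched = verify(null, payload, publicKey, signature);

// ...and fails once any file content changes, because its digest changes.
const tamperedFiles = { ...files, 'harness/config.json': '{"retries":99}' };
const tamperedPayload = Buffer.from(JSON.stringify(buildManifest(tamperedFiles)));
const okTampered = verify(null, tamperedPayload, publicKey, signature);

console.log({ okUntouched, okTampered });
```

Signing digests rather than raw file contents keeps the signed payload small and lets a verifier report which file changed by comparing digests entry by entry.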

## Try it

```bash
npm i agentic-flow@2.1.0
# autonomous repair (deterministic, no Docker needed):
npx agentic-flow-repair ./your-repo --mock
```
```ts
import { CostOptimalRouter } from 'agentic-flow/router/cost-optimal';
import { repair } from 'agentic-flow/repair';
import { signFiles, verifySignedManifest } from 'agentic-flow/harness/provenance';
```

## Honest scope

The in-package repair **engine** is fully working and tested. The headline SWE-bench-Lite "Test-Driven Repair" (TDR) *product* numbers (~58–68%) come from the upstream `@metaharness/darwin` Docker harness. That is the documented deployment path, and it is not bundled in this package.

## Where we're heading (feedback welcome)

1. **Route:** turn real usage into routing training data automatically; a native (FastGRNN) backend by default.
2. **Evolve:** evolve agentic-flow's *own* agent policies against its benchmark suite; wire the full SWE-bench TDR path.
3. **Verify:** `harness verify` as a CI/pre-publish gate; key-management guidance.
4. **Docs:** an end-to-end "build your own meta-harness" guide.

**Questions for the community:**
- Does the "harness, not the model, is the lever" framing match your experience?
- Which pillar is most useful to you first — routing, repair, or provenance?
- What would make you adopt cost-optimal routing in production (data format, integrations, guardrails)?

Background: ADR-073/074/075/076 in `docs/adr/`. Related packages: [`@metaharness/router`](https://www.npmjs.com/package/@metaharness/router), [`@metaharness/darwin`](https://www.npmjs.com/package/@metaharness/darwin).

