16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,22 @@ checklist.

## [Unreleased]

## [0.7.0] - 2026-06-12

hands becomes scriptable. v0.6.0 gave Claude Login mode real sessions, audit, and guardrails; v0.7.0 exposes that machinery to scripts, cron jobs, and orchestrators — `hands run` no longer has to end in an interactive prompt, and the now-dual-mode audit log is queryable. Minor-version bump for the new flags.

PRs in this release: #81 (one-shot scripting mode), #82 (audit list filters).

### Added — one-shot scripting mode: `--once`, `--json`, exit-code contract (#81)

- **`hands run --once "<task>"`** runs a single task and exits — no "What next?" loop. The session pointer is saved before returning, so `hands run -c --once "<next step>"` chains multi-step automation across invocations (and reboots).
- **`hands run --json "<task>"`** (implies `--once`) emits exactly one machine-readable JSON line on stdout: `{ ok, mode, result, turns, costUsd, tokens: {input, output}, sessionId? }`. Decorative output is silenced (the spinner included), and failures still emit one JSON line (`{ ok: false, error }`) — a parsing script never sees pretty text. The field set is stable: fields get added, never renamed. SDK mode honors `--json` too, with a `dryRun: true` marker when `--dry-run` forced it.
- **Exit codes**, pinned in tests: `0` task completed, `1` setup/config error, `2` task did not complete cleanly (max-turns cutoff or execution error, surfaced from the stream's result envelope).
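A caller consuming the `--json` line might model it as a discriminated union on `ok`. The sketch below is illustrative only: the interface names and the `parseHandsLine` helper are hypothetical, and the field types (for example `result` and `error` being strings) are assumptions inferred from the field list above, not taken from the hands source.

```typescript
// Hypothetical types inferred from the documented field list; not part of hands itself.
interface HandsSuccess {
  ok: true;
  mode: string;
  result: string; // assumed to be a string
  turns: number;
  costUsd: number;
  tokens: { input: number; output: number };
  sessionId?: string;
  dryRun?: boolean; // documented as present when --dry-run forced SDK mode
}

interface HandsFailure {
  ok: false;
  error: string; // assumed to be a string
}

type HandsResult = HandsSuccess | HandsFailure;

// Parse the single JSON line from stdout. Throws if the text is not JSON
// or lacks a boolean `ok`, which the documented contract says should not happen.
function parseHandsLine(stdout: string): HandsResult {
  const parsed: unknown = JSON.parse(stdout.trim());
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  if (typeof record.ok !== "boolean") {
    throw new Error("expected a boolean `ok` field");
  }
  // Unknown extra fields are kept: the contract says fields get added, never renamed.
  return parsed as HandsResult;
}
```

Checking `ok` first lets the compiler narrow to the success or failure shape, so a script cannot read `result` from a failed run by accident.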

### Added — audit list filters: `--mode`, `--tool`, `--failed`, `--json` (#82)

`hands audit list` can now answer "what did Claude Login bash do?" (`--mode cli --tool bash`) and "what went wrong?" (`--failed` — including guardrail blocks) and emit the result as JSON. Filtering never renumbers: printed indexes stay positions in the full log, because they're what `audit show/replay <index>` accept. CLI-mode entries carry a `[cli]` marker in listings; pre-0.6 entries count as `sdk`.
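The "filtering never renumbers" rule can be sketched as: attach each entry's position in the full log first, then filter. Everything named here is hypothetical (the `AuditEntry` shape, the `ok` flag used for `--failed`, zero-based indexes, and treating a missing `mode` as a pre-0.6 entry); it illustrates the idea rather than the actual implementation.

```typescript
// Hypothetical audit entry shape, for illustration only.
interface AuditEntry {
  tool: string;
  ok: boolean; // assumed flag; guardrail blocks would be recorded as ok: false
  mode?: "sdk" | "cli"; // assumed absent on pre-0.6 entries
}

interface AuditFilters {
  mode?: "sdk" | "cli";
  tool?: string;
  failed?: boolean;
}

interface IndexedEntry {
  index: number; // position in the FULL log (zero-based here, an assumption)
  entry: AuditEntry;
}

function filterAudit(log: readonly AuditEntry[], filters: AuditFilters): IndexedEntry[] {
  return log
    // Pair each entry with its original position before any filtering.
    .map((entry, index) => ({ index, entry }))
    .filter(({ entry }) => {
      const mode = entry.mode ?? "sdk"; // entries without a mode count as sdk
      if (filters.mode !== undefined && mode !== filters.mode) return false;
      if (filters.tool !== undefined && entry.tool !== filters.tool) return false;
      if (filters.failed === true && entry.ok) return false;
      return true;
    });
}
```

Because the index is captured before filtering, a number printed in a filtered listing still refers to the same entry when passed to a command that expects a position in the full log.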

## [0.6.0] - 2026-06-11

Claude Login mode grows up. The default, $0 mode used to run blind: `--dangerously-skip-permissions` with nothing but prompt text for protection, no audit trail ("only SDK mode is covered"), stderr string-scraping for the action display, and "session memory" that was 200-char task summaries re-injected into the system prompt. All four are gone — replaced with the claude CLI's real primitives. Minor-version bump for the new `--continue` flag and audit-log `mode` field.
20 changes: 19 additions & 1 deletion README.md
@@ -317,6 +317,8 @@ hands voice-setup # download whisper.cpp + speech model for --voice
hands run "<prompt>" # interactive computer control session
hands run --continue # resume the most recent session (works after exit/reboot)
hands run -c "<follow-up>" # resume and immediately run a follow-up task
hands run --once "<task>" # one task, no interactive loop — for scripts and cron
hands run --json "<task>" # one JSON result line on stdout (implies --once)
hands run "<prompt>" --voice # voice input via local whisper
hands run "<prompt>" --dry-run # plan + audit-log without executing (SDK mode)
hands run "<prompt>" -m claude-opus-4-6 # override model (this run only)
@@ -326,10 +328,26 @@ hands run "<prompt>" --persona thorough # named system-prompt override (SDK
hands run "<prompt>" --no-dario # skip the dario auto-detect probe at startup
```

### Scripting & automation

`--once` + `--json` + `-c` turn hands into a composable building block — cron jobs, CI steps, and orchestrators get `spawn → parse → branch` semantics:

```bash
hands run --once "open the spreadsheet and add June's numbers"
hands run -c --once "now export it as PDF to ~/reports" # same conversation, next step

result=$(hands run --json "check if the nightly build passed")
echo "$result" | jq -r .result
```

The JSON line is `{ ok, mode, result, turns, costUsd, tokens, sessionId? }` — stable field set, fields get added but never renamed. Failures emit `{ ok: false, error }` so a parser never sees pretty text. Exit codes: `0` task completed, `1` setup/config error, `2` task didn't complete cleanly.
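For an orchestrator written in TypeScript, the exit-code contract above could be mapped to named outcomes before branching. This is a minimal sketch: `classifyExit` and `mayChainNextStep` are hypothetical helpers, and the policy of only chaining after a clean completion is an assumption, not something hands prescribes.

```typescript
type ExitOutcome = "completed" | "setup-error" | "incomplete" | "unknown";

// Map a child process exit status to the documented contract.
// `null` covers a process that was terminated by a signal.
function classifyExit(code: number | null): ExitOutcome {
  switch (code) {
    case 0:
      return "completed"; // task completed
    case 1:
      return "setup-error"; // setup/config error
    case 2:
      return "incomplete"; // task did not complete cleanly
    default:
      return "unknown"; // null or a code outside the documented set
  }
}

// Hypothetical policy: only run the next `-c --once` step after a clean completion.
function mayChainNextStep(code: number | null): boolean {
  return classifyExit(code) === "completed";
}
```

The status passed in would typically come from whatever spawned the process, for example the `status` field returned by Node's `child_process.spawnSync`.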

### Audit

```bash
hands audit list --last 20 # recent tool calls (both modes) with replay index
hands audit list --mode cli --tool bash # filter: Claude Login bash calls only
hands audit list --failed --json # everything that went wrong, as JSON
hands audit show <index> # full JSON detail for one entry
hands audit replay <index> # dry-run replay of one tool call; --execute fires it
```
@@ -464,7 +482,7 @@ Env wins over config: `ANTHROPIC_BASE_URL`, `ANTHROPIC_API_KEY` (for SDK + dario
- **Security issues** — email **security@askalf.org**, not a public issue. See [SECURITY.md](SECURITY.md).
- **PRs welcome.** See [CONTRIBUTING.md](CONTRIBUTING.md) for build / test flow. Code style matches dario / agent / deepdive: small TypeScript, pure decision functions where possible, `strict: true`, no `any`, no unused imports.

-Run `npm install && npm run build && npm test` to get a working dev tree (192 tests across 22 test files).
+Run `npm install && npm run build && npm test` to get a working dev tree (204 tests across 24 test files).

---

2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "@askalf/hands",
-"version": "0.6.0",
+"version": "0.7.0",
"description": "Cross-platform computer-use agent. Your LLM on your mouse, keyboard, and screen. Windows (PowerShell), macOS (open + osascript), Linux (xdotool / ydotool). Voice optional, safety guardrails. Routes through dario or any Anthropic-compat endpoint.",
"main": "./dist/cli.js",
"types": "./dist/cli.d.ts",