a cli that turns 7b reasoning into 70b

miii vs pi

pi is a flexible harness with a great extension ecosystem. miii is a privacy-first agent that runs local by default — and ships Beacon, a context engine that makes a 7B model outperform a 70B on long agentic tasks.

The differentiator is not model size. It's what gets fed into context. Beacon compresses every tool output at the moment it's produced and reinjects your goal before every response. pi has no equivalent. Neither does any cloud agent.

local default

max depth

60–70%

context saved

LLM calls for compression

pi — flexible harness, 50+ extensions, 15+ cloud providers, session branching. miii — privacy-first defaults, Beacon, shadow git, OS sandbox, call graph — all built-in. pi wins on extensibility. miii wins on defaults.

Feature matrix

	miii	pi
Default provider	Ollama (local, offline)	Cloud (Anthropic / OpenAI / etc.)
API key required to start	✗	✓
Works fully offline	✓	✗
Code sent to cloud	Only when configured	Every request
Goal-aware context (Beacon)★	✓	✗
Per-tool context compression★	✓	✗
Dynamic context window★	✓ auto-detected	hardcoded guess
Shadow context store★	✓	✗
OS-level shell sandbox	✓ built-in	opt-in extension
Shadow git (model edit log)	✓	✗
File snapshots + Esc rollback	✓	✗
Live diff permission modal	✓ built-in	opt-in extension
Static call graph (AST)	✓	✗
Windowed file reads	✓	✗
Semantic vector index	✓	✗
Anthropic support	✓	✓
OpenAI support	✓	✓
OpenAI-compat endpoints	✓	Via models.json
15+ cloud providers native	✗	✓
Session branching / tree	✗	✓
HTML export / gist share	✗	✓
Extension ecosystem	small	50+ TypeScript extensions
MCP support	✓	✓
License	MIT	MIT

1 — Privacy & local-first

miii defaults to offline.
pi defaults to the cloud.

miii defaults to Ollama — fully offline, no API keys, no telemetry. When you need more power, you configure a cloud provider. pi supports local models as one option among many but is built around cloud APIs. Privacy is not a first-class design constraint in pi.

miii — local by default

terminal

# local default — no key needed

miii

# switch to Claude Opus 4

ANTHROPIC_API_KEY=sk-ant-... miii

/config model claude-opus-4-7

# one-turn cloud escalation

/cloud explain why this segfault happens

Cloud escalation via /cloud fires a single turn to Opus 4 or o3 without switching the session permanently.

pi — cloud by default

pi defaults to and is built around cloud APIs. Local models are available as one provider choice, but the UX and defaults assume a cloud connection.

For regulated industries — fintech, healthcare, legal, defence — pi's default means every task, every file read, every prompt leaves the machine.

	miii	pi
Default provider	Ollama (local, offline)	Cloud (Anthropic / OpenAI)
API key required to start	✗	✓
Works fully offline	✓	✗
Zero cost local runs	✓	✗
Code sent to cloud	Only when configured	Every request

2 — Context engine

Beacon vs basic compaction.
pi has no equivalent.

pi auto-compacts older messages when approaching context limits: wait for context to fill, call the LLM to summarise, pay tokens, wait for latency, resume. On Ollama this also means queue contention — the model serialises requests, so a summarise call blocks the agent call. miii's Beacon engine replaces this entirely.

Context window at each depth

pi — bulk compactionstalls at depth 9

depth 1

depth 2

depth 3

depth 4

depth 5

depth 6

depth 7

depth 8

LLM call to summarise ✗

miii — Beaconcompletes at depth 20

depth 1

depth 2

depth 3

depth 4

depth 5

depth 6

depth 7

depth 8

Goal Block

· · ·

depth 20

✓ complete

pi — LLM compaction on overflow

miii — Beacon continuous compression

2a — Per-tool compression

zero LLM calls · microseconds

Beacon applies a different compression strategy per tool type based on age and position. No LLM is called. No tokens spent. No latency. File reads become filename + line count + first 4 lines. Command output becomes first 4 lines + last line. Failures always kept verbatim.

2b — Goal injection

prevents model drift

After ~10 tool calls, models routinely forget the original task. Beacon extracts your goal synchronously at depth 0 (no LLM call), then injects a live goal-state block just before the last message at every depth. The model cannot lose the thread.

2c — Dynamic context window

live detection vs hardcoded guess

pi hardcodes a conservative context limit and compacts early. miii detects the actual window at session start: Ollama via live /api/show, Anthropic via model prefix lookup (200K for Opus 4 / Sonnet 4). Claude users get 14× more headroom before compaction triggers.

2d — Shadow store

nothing is lost

Everything Beacon compresses is preserved in a shadow store. Content is condensed, not deleted. The full original remains accessible. pi has no equivalent.

Goal injection block

Extracted at depth 0, injected just before the last message at every subsequent depth. No LLM call. Single split. Every response.

miii · depth 12

╔══════════════════════════════════════════════════════╗

║ Beacon — Goal State ║

║ Goal: fix token expiry bug in auth middleware ║

║ ║

║ Steps taken: ║

║ • read_file(src/middleware/auth.ts) → 147 lines ║

║ • update_file(auth.ts) → changed < to <= ║

║ • run_tests(auth.test.ts) → 12 passed ║

║ ║

║ Remaining: verify expired-at-boundary edge case ║

╚══════════════════════════════════════════════════════╝

How Beacon compresses each tool

Tool result	pi (no compression)	miii Beacon	Reduction
read_file (800 lines)	800 lines verbatim	filename + line count + first 4 lines	97%
list_files (full tree)	full tree	file count + first 8 paths	84%
run_command (stdout)	full stdout	first 4 lines + last line	95%
run_tests (output)	full output	first 10 lines (failures always kept)	90%
Error messages	verbatim	always verbatim — never touched	—

For Ollama users specifically

Beacon is the difference between a task completing and the context window crashing. An 8K-context model hits the wall at depth 6–8 on a complex task. With Beacon, the same model runs to depth 20. pi's LLM-based compaction creates queue contention on Ollama — the compaction call blocks the agent call.

3 — Codebase understanding

Call graph + windowed reads.
pi has no equivalent.

Every structural question in pi costs file reads — the full file, verbatim, into context. miii builds a complete symbol call graph of the codebase using pure AST analysis. No model. No network. Sub-second build.

Static call graph

AST · no model

Every function, class, method, and call edge across the codebase. Sub-second build. When a structural question arises mid-task, the agent queries the graph instead of reading files.

/graph build

# 847 symbols, 1203 edges — <1s

/graph query "auth"

# auth → verifyToken → decodeJWT

Windowed file reads

imports + region + footer

Reads imports, a focused region around the edit target, and the file footer. A 500-line file costs ~480 tokens instead of ~2000. pi reads full files.

Semantic vector index

local embeddings

Embed every file, store vectors locally via Ollama embedding model. Semantic search over the codebase without sending data to the cloud.

/index build

Why 7B beats 70B here

Context beats parameters. A 70B model asked to answer architectural questions from raw file dumps fails before a 7B model given the exact call graph it needs. The call graph is not a convenience feature. It's why miii's local models stay coherent on structural tasks that stall every other agent.

4 — Safety & reversibility

miii ships safety built-in.
pi offers it as extensions.

pi offers permission gates, sandboxing, and shadow git as optional extensions — not defaults. miii ships all of these built-in and hardened. You don't configure safety; it's on.

	miii	pi
Live diff + permission modal	Built-in — every file write	Opt-in extension
Shadow git model edit log	Built-in — every successful edit	Not included (partial via extension)
File snapshots + Esc rollback	Built-in — survives crashes	Not included
OS-level shell sandbox	Built-in (sandbox-exec / bwrap)	Opt-in extension
Failure decompose nudge	Built-in — triggers on 2nd failure	Not included

Permission modal

miii · write prompt

⚡ Edit src/middleware/auth.ts?

- if (exp < Date.now() / 1000) {

+ if (exp <= Date.now() / 1000) {

❯ Yes / Yes, for this session / No

Shadow git — model edit log

miii · /history

3 edits · current session

a3f2c1 — auth.ts: token expiry fix

7b8e2a — routes/user.ts: middleware wire

c91d4f — auth.test.ts: add edge case

/undo ← revert last edit

5 — Provider support

pi wins on cloud breadth.
miii wins on local depth.

pi natively supports 15+ cloud providers. miii covers Anthropic + OpenAI + any compatible endpoint — which covers ~95% of developer use cases — plus the deepest Ollama integration of any agent.

Provider	miii	pi
Ollama (local)	First-class: auto-start, exact context window via /api/show, in-TUI model pull	Supported as one provider
Anthropic (Opus 4, Sonnet 4)	Yes (ANTHROPIC_API_KEY)	Yes
OpenAI (o3, o4-mini, GPT-4o)	Yes (openai-compat mode)	Yes
Azure / Bedrock / Groq / xAI	Via openai-compat endpoint	Native support for all
Custom endpoints (vLLM, LM Studio, llama.cpp)	Yes, context auto-detected	Via models.json
Per-turn cloud escalation	/cloud → Opus 4 / o3	No equivalent

6 — Extensibility

pi wins

pi's TypeScript extension architecture with 50+ examples lets you wire in anything: sub-agents, plan mode, permission gates, sandboxing, SSH execution, path protection. The tradeoff: you need to configure them.

miii ships with the most important capabilities built-in. The extension surface is smaller but the defaults are production-grade.

7 — Session & history

Session branching / tree	✗	✓
HTML export	✗	✓
GitHub gist sharing	✗	✓
Shadow git model log	✓	✗
Esc full rollback	✓	✗
Memory extraction on /compact	✓	✗

pi wins on session navigation. miii wins on safety.

8 — Agentic depth & task modes

	miii	pi
Max tool-call depth	20 (explicit cap)	Unbounded / extension-defined
/plan mode	Built-in	Extension
/tdd mode	Built-in (red → green loop)	Not included
/refactor with checkpoints	Built-in	Not included
/compact with memory extraction	Built-in	Auto-compaction (no extraction)
Cloud escalation	/cloud (one-turn Opus 4 / o3)	Switch model mid-session
MCP support	✓	✓

Verdict

miii wins when

→

Privacy matters

Local by default — code only leaves when you configure it to.

→

Local models

Deepest Ollama integration: auto-start, exact context window, in-TUI pull.

→

Long agentic tasks

Beacon keeps any model (local or cloud) on-task to depth 20.

→

Zero cost

No subscription. Local runs free. Cloud BYOK same rate as pi.

→

Safety defaults

Shadow git, file snapshots, OS sandbox, live diffs — on by default.

→

Structural codebase work

Call graph + windowed reads = fewer tokens, better answers.

→

Beacon on cloud too

Per-tool compression and goal injection make Opus 4 and o3 more efficient.

pi wins when

→

Niche cloud providers

Azure, Bedrock, Groq, Cerebras, xAI, Mistral, HuggingFace — all natively.

→

Extension / customisation

TypeScript extensions, 50+ examples, full composability.

→

Session branching

Tree navigation, HTML export, gist sharing.

→

SSH execution

Remote machine support built-in.

→

Non-Node integrations

RPC protocol for embedding pi in other stacks.

Bottom line

Pi is the right choice if you need maximum provider flexibility and a composable extension system.

miii is the right choice if you want an autonomous coding agent that defaults to your hardware — free, private, offline — but can escalate to Claude Opus 4 or o3 when the task demands it, all while Beacon keeps the model on-task through deep tool chains that would stall any naive agent. The local default is the differentiator; the cloud support means you never hit a wall.

Try it in three steps.

01Install Ollama

# install ollama first → ollama.ai

brew install ollama

ollama pull llama3.2

02Install miii-cli

npm i -g miii-cli

03Run it

miii

npm i -g miii-cli ← miii vs Claude Code / Cursor / Codex ← Back to miii

miii vs pi

miii defaults to offline.pi defaults to the cloud.

Beacon vs basic compaction.pi has no equivalent.