🪟

Part 3 — A Tour of AgentZero Lite: Multi-CLI From a Single Window

TL;DR — AgentZero Lite is a Windows-native, single-binary shell built around one idea: in the AI era, most of your day is spent talking to command-line tools. Each AI CLI (Claude, Codex, the next one) wants its own terminal; you want them all visible, addressable, and able to talk to each other. This post is a feature tour grounded in the actual UI and code.

What it is, in one sentence

A Windows desktop shell that runs N ConPTY terminals side by side, with a small chat pane that can either type into the active terminal or (in AIMODE) let an on-device LLM coordinate them all for you. One binary, no cloud calls, ~60 MB.

1 — The main window

The main window is three regions: a left rail of workspaces and sessions, a top tab strip of terminal tabs, and a status bar at the bottom. A workspace is just a folder context — pick one and every new tab opens with cd already done. Tabs are real ConPTY (Microsoft.Terminal.Control over the Windows pseudo-console API), so claude, codex, gh, pwsh, a build log tail — anything you'd run in a shell — runs unmodified.
The tab title is load-bearing: it's the peer name the AgentBot uses to address that tab, both for the AIMODE relay and for the cross-terminal dialogue trick (next section).

2 — The signature trick: two AI CLIs talking to each other

This is the use case the Lite edition was built for, and it takes about a minute to set up:
  1. Open two tabs in the same workspace, say group 0 tab 0 = claude and group 0 tab 1 = codex.
  2. In each tab, paste a one-line briefing: "Learn AgentZeroLite.ps1 help and use it for cross-terminal talk. Use terminal-send <grp> <tab> "text" to talk, terminal-read <grp> <tab> to read."
  3. In the Claude tab, ask it to "Greet the tab named Codex and propose we co-design a REST endpoint." Claude runs the CLI, Codex sees the message at its prompt and replies, and you watch the conversation stream in both tabs.
What makes this work, mechanically:
  • Each AI runs in its own ConPTY. No shared memory, no context leakage.
  • Messages traverse AgentZero's IPC (WM_COPYDATA + memory-mapped files) — not a cloud relay. Nothing leaves the machine.
  • The "broker" is just a CLI command (AgentZeroLite.exe -cli terminal-send …): any AI CLI that already knows how to run a shell command can use it.
  • You can interrupt, nudge, or splice in at any step. The human stays the supervisor.
Terminal multiplexers let you watch many prompts; AgentZero Lite lets them talk.
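A minimal sketch of how a script (or an AI CLI) might drive the two broker commands quoted in the briefing above. The terminal-send and terminal-read command shapes come from this post; the helper names, the use of subprocess, and the assumption that AgentZeroLite.exe resolves from PATH are illustrative and not taken from the project's code.

```python
import subprocess

EXE = "AgentZeroLite.exe"  # assumed to be on PATH (Settings -> Register PATH)


def send_argv(group: int, tab: int, text: str) -> list[str]:
    """Build argv for: AgentZeroLite.exe -cli terminal-send <grp> <tab> "text"."""
    return [EXE, "-cli", "terminal-send", str(group), str(tab), text]


def read_argv(group: int, tab: int) -> list[str]:
    """Build argv for: AgentZeroLite.exe -cli terminal-read <grp> <tab>."""
    return [EXE, "-cli", "terminal-read", str(group), str(tab)]


def run(argv: list[str]) -> str:
    """Run one CLI call and return its stdout. Needs the GUI to be running."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical exchange: greet the Codex tab (group 0, tab 1), then read it back.
#   run(send_argv(0, 1, "Hello Codex, shall we co-design a REST endpoint?"))
#   print(run(read_argv(0, 1)))
```

Passing the message as a single argv element, rather than building a shell string, sidesteps quoting problems when the text itself contains quotes or spaces.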

3 — The AgentBot pane (text broker, not an AI by default)

There's a chat pane that can dock at the bottom of the main window or float as its own window. Out of the box it is not an AI — it's a typed-input broker:
  • CHT mode types whatever you write into the active tab and hits Enter.
  • KEY mode sends raw control keys (Ctrl+C, Tab, arrows) to the active tab.
The [+] menu has three setup helpers worth naming:
  • AgentZeroCLI Helper — drops a ready-made briefing into the chat input that teaches any terminal AI how to call AgentZeroLite.exe -cli once. Review, hit Send, done. (If the CLI isn't on PATH the menu nudges you to Settings → Register PATH first.)
  • Import Starter Skills — copies the shipped agent-zero-lite skill into the active workspace's .claude/skills/ so Claude Code picks it up.
  • Skill Sync — reads /skills out of an already-running Claude tab and turns it into a slash-command menu in the chat box. Type /, pick, Enter — the macro fires at the terminal. No LLM round-trip.
So the AgentBot starts as a tiny utility for "type once, send to whichever terminal is in focus." That's already useful. AIMODE (next) is what turns it into a coordinator.
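The difference between CHT and KEY mode can be sketched as a small encoder. The byte sequences below are the standard ASCII/VT encodings for these keys; whether AgentZero Lite uses exactly these sequences internally is an assumption, and encode_input is a hypothetical name.

```python
# Standard ASCII/VT encodings for a few control keys. That AgentZero Lite
# sends exactly these sequences is an assumption made for illustration.
KEY_BYTES = {
    "Ctrl+C": "\x03",
    "Tab": "\t",
    "Enter": "\r",
    "Up": "\x1b[A",
    "Down": "\x1b[B",
    "Right": "\x1b[C",
    "Left": "\x1b[D",
}


def encode_input(mode: str, payload: str) -> str:
    """Turn chat-pane input into the characters written to the active tab."""
    if mode == "CHT":
        # CHT: type the text as written, then press Enter.
        return payload + KEY_BYTES["Enter"]
    if mode == "KEY":
        # KEY: send one raw control key, with nothing appended.
        try:
            return KEY_BYTES[payload]
        except KeyError:
            raise ValueError(f"unknown key: {payload!r}") from None
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, encode_input("CHT", "git status") yields the text followed by a carriage return, while encode_input("KEY", "Ctrl+C") yields the single byte 0x03.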

4 — AIMODE: a local LLM as in-shell coordinator

Hit Shift+Tab in the bot pane and a small on-device LLM (Gemma 4 today; Nemotron staged) takes over. It is not trying to out-think Claude or Codex. Its job is the secretary role: take a fuzzy ask in Korean or English, route it to the right terminal AI, wait for the reply, summarise back. Less than a PM, more than a bash alias.
Concretely, when you type "Claude한테 토론 시작해" (start a discussion with Claude) the bot does:
  1. send_to_terminal(Claude, <handshake intro> + <opener>)
  2. wait(5): terminal AIs need 5–15 s to start replying; reading immediately would only catch a Crafting… spinner.
  3. read_terminal(Claude): checks for a real reply.
  4. If only the spinner is visible → wait(5), then read_terminal again, up to 3 times.
  5. done("Claude greeted you back and asked about your goals."): short summary back to you.
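The sequence above can be written out as plain code to make the control flow explicit. This is only a sketch: in AIMODE the model itself chooses each tool call, so nothing like run_cycle necessarily exists in the application. The spinner heuristic, the function names, and reading "up to 3 times" as three retries after the first read are all assumptions.

```python
from typing import Callable

MAX_RETRIES = 3  # assumed reading of "up to 3 times": retries after the first read


def looks_like_spinner(output: str) -> bool:
    """Hypothetical heuristic: empty output or a trailing 'Crafting…' means no reply yet."""
    text = output.strip()
    return not text or text.endswith("Crafting…")


def run_cycle(
    peer: str,
    message: str,
    send: Callable[[str, str], None],
    read: Callable[[str], str],
    wait: Callable[[float], None],
) -> str:
    """One send -> wait -> read -> done round-trip with a single peer."""
    send(peer, message)
    for _ in range(1 + MAX_RETRIES):
        wait(5)  # give the terminal AI time to start replying
        output = read(peer)
        if not looks_like_spinner(output):
            return f"done: {peer} replied: {output.strip()}"
    # Give up for this cycle; a later user prompt or peer signal starts a new one.
    return f"done: {peer} has not replied yet"
```

Injecting send, read, and wait as callables keeps the sketch testable without a running terminal: a fake read that returns the spinner twice and then a reply exercises the retry path.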
The model is physically constrained to emit only that 6-tool catalog (list_terminals, read_terminal, send_to_terminal, send_key, wait, done), enforced by a GBNF grammar at the sampler level. It cannot accidentally produce free-form prose. The full reasoning is in Part 2 — An LLM Is Not an Agent.
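To illustrate what a closed tool catalog means in practice, here is a small after-the-fact validator. Note that this is a different technique from the one the post describes: the GBNF grammar constrains the sampler so invalid output can never be produced, whereas this sketch only rejects it afterwards. The name(args) surface syntax is hypothetical, since the actual grammar is not shown here.

```python
import re

# The six tool names listed in this post.
TOOLS = frozenset(
    {"list_terminals", "read_terminal", "send_to_terminal", "send_key", "wait", "done"}
)

# Hypothetical surface syntax: name(args). The real grammar may differ.
CALL_RE = re.compile(r"^([a-z_]+)\((.*)\)$", re.DOTALL)


def parse_tool_call(text: str) -> tuple[str, str]:
    """Return (tool_name, raw_args), or raise ValueError for anything off-catalog."""
    match = CALL_RE.match(text.strip())
    if match is None:
        raise ValueError("not a tool call")
    name, raw_args = match.group(1), match.group(2)
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return name, raw_args
```

Free-form prose such as "Sure, I can help with that!" fails the first check, and a well-formed call to a tool outside the catalog fails the second.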
One cycle per run. Each StartReactor does one short round-trip with one peer (send → wait → read → react → done) and then stops. The LLM does not try to script a 5-turn discussion in a single giant tool chain — that path leads to truncation and hallucinated replies. Subsequent cycles are triggered by the user OR by an arriving peer signal.
Two-way channel. When AgentBot first contacts a tab, it sends a handshake intro telling the terminal AI: "You are <peerName>; here's how to call back via bot-chat CLI; acknowledge with DONE(handshake-ok)." From that point on, the peer terminal can push messages back via the existing CLI — no extra polling needed. The active conversation FSM (HandshakeState: NotConnected → HandshakeSent → Connected) lives in an Akka actor (AgentReactorActor), which also hosts the KV cache so multi-turn history is preserved across cycles.
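The three handshake states named above form a very small state machine. The sketch below models it in Python for illustration only; the real FSM lives in the Akka.NET AgentReactorActor, and the class name, method names, and substring match on the acknowledgement are assumptions.

```python
from enum import Enum


class HandshakeState(Enum):
    NOT_CONNECTED = "NotConnected"
    HANDSHAKE_SENT = "HandshakeSent"
    CONNECTED = "Connected"


class PeerHandshake:
    """Tracks handshake progress for one peer tab (hypothetical helper)."""

    def __init__(self, peer_name: str) -> None:
        self.peer_name = peer_name
        self.state = HandshakeState.NOT_CONNECTED

    def mark_handshake_sent(self) -> None:
        # Only the first contact moves the peer out of NotConnected.
        if self.state is HandshakeState.NOT_CONNECTED:
            self.state = HandshakeState.HANDSHAKE_SENT

    def on_peer_message(self, text: str) -> None:
        # Assumed matching rule: the acknowledgement token appears in the message.
        if self.state is HandshakeState.HANDSHAKE_SENT and "DONE(handshake-ok)" in text:
            self.state = HandshakeState.CONNECTED
```

An acknowledgement that arrives before the intro has been sent is ignored, so a peer cannot jump straight from NotConnected to Connected.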

5 — One bot, four peers (the multi-CLI dock)

The most useful layout for AIMODE is a 2×2 split with one Gemma-driven AgentBot docked at the bottom and four CLI tabs on top — for example Codex, Claude 1, Claude 2, Claude 3. The bot can address any of them by tab name, route a question to whichever fits, and merge the replies back.
A single handshake cycle plays out as four states:
flowchart LR
    T0["t=0 · idle<br/>4 panes idle<br/>bot ready"]
    T1["t=1 · send<br/>bot picks Claude 2<br/>handshake intro<br/>MarkHandshakeSent"]
    T2["t=2 · wait<br/>peer Crafting…<br/>bot calls wait(5)<br/>HandshakeSent"]
    T3["t=3 · reply<br/>peer bot-chat DONE<br/>TerminalSentToBot<br/>HandshakeState→Connected"]
    T0 --> T1 --> T2 --> T3
    T3 -.->|"next user prompt OR<br/>next peer signal"| T0
Only the pane the bot picks wakes up; the other three sit idle, ready to be addressed in the next cycle. Same machine, no cloud, all of it visible to you. The bot deliberately does not poll: wait is a first-class tool the model owns, so timing decisions live in the model rather than in application code that has to guess.

6 — Settings: minimal on purpose

Two tabs only.
  • CLI Definitions — register shells AgentZero can spawn (cmd, pwsh, claude, custom entries). Built-ins cannot be deleted; new definitions appear in the + menu of every workspace.
  • AgentZero CLI — one-click button to register the app directory in user PATH, so AgentZeroLite.exe -cli and the AgentZeroLite.ps1 wrapper resolve from any shell.
Plus an AgentBot AI Mode panel (experimental) for choosing the on-device model and backend (GBNF grammar for Gemma; native Llama-3.1 chat template for Nemotron), turning AIMODE on/off, and tuning the per-turn token cap.
Persistence is SQLite at %LOCALAPPDATA%\AgentZeroLite\agentZeroLite.db, migrated by EF Core on first run.
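If you want to locate that database from a script, the path can be resolved from the environment. This sketch only builds the path quoted above; the function name is hypothetical, and nothing here assumes anything about the schema.

```python
import ntpath
import os
from typing import Mapping


def database_path(env: Mapping[str, str] = os.environ) -> str:
    """Resolve %LOCALAPPDATA%\\AgentZeroLite\\agentZeroLite.db using Windows path rules."""
    local_app_data = env["LOCALAPPDATA"]  # raises KeyError if not on Windows
    return ntpath.join(local_app_data, "AgentZeroLite", "agentZeroLite.db")
```

ntpath is used instead of os.path so the result has Windows separators even if the helper is exercised on another platform.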

7 — CLI mode: drive the GUI from any script

AgentZeroLite.exe is a single binary. With -cli <command> it skips the WPF message loop entirely (decided at App.OnStartup) and acts as an IPC client into the running GUI. The most useful commands:
| Command | Effect |
| --- | --- |
| terminal-list | JSON list of all workspace/tab sessions |
| terminal-send <g> <t> "text" | Send text to tab <t> in workspace <g> |
| terminal-key <g> <t> <key> | Send a control key (Ctrl+C, Enter, Tab, arrows) |
| terminal-read <g> <t> [-n N] | Read the last N bytes of a tab's scrollback |
| bot-chat [--from X] "text" | Display a chat bubble in the bot window (this is also the peer-signal channel for AIMODE) |
So any shell script — or any AI CLI smart enough to invoke a shell command — can drive the GUI. Combined with the per-tab handshake, that is the entire reverse channel for AIMODE.
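The single-binary, two-mode split described at the start of this section comes down to one decision on the argument list. The sketch below models that decision in Python for illustration; the real check happens in the app's C# App.OnStartup, and the Launch type and decide_mode name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Launch:
    mode: str                       # "gui" or "cli"
    command: Optional[str] = None   # e.g. "terminal-send" in CLI mode
    args: List[str] = field(default_factory=list)


def decide_mode(argv: List[str]) -> Launch:
    """Pick GUI or CLI-client mode from argv (program name excluded)."""
    if "-cli" not in argv:
        return Launch(mode="gui")
    rest = argv[argv.index("-cli") + 1:]
    if not rest:
        raise ValueError("-cli requires a command")
    # Everything after the command is passed through to the IPC client.
    return Launch(mode="cli", command=rest[0], args=rest[1:])
```

With no arguments the sketch selects the GUI; with -cli terminal-send 0 1 "hi" it selects the CLI client with command terminal-send and the remaining three arguments.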

8 — Brief — what's underneath

The why and how of the architecture is covered in Part 2 — An LLM Is Not an Agent. Quick recap so this doc is self-contained:
 
  • Single binary, two modes — GUI (default) vs CLI client (-cli flag flips at App.OnStartup).
  • Akka.NET actor system: /user/stage is the supervisor; /bot and /ws-<name>/term-<id> are its children. Inference runs inside AgentReactorActor so it never blocks the WPF UI thread.
  • ZeroCommon is UI-free and covered by a headless xUnit suite (Akka.TestKit) so the actors and message routing are testable without a desktop session.
  • EF Core + SQLite for persistence: CLI definitions, workspace layout, clipboard history.
  • Native deps are pulled from NuGet (CI.Microsoft.Terminal.Wpf for ConPTY; LLamaSharp + a self-built llama.cpp for the local model) and copied into output. No system-level installs.

Closing — the bet

Terminal multiplexers (Windows Terminal, ConEmu, Hyper) optimise for one prompt at a time. The AI era's actual unit of work is several prompts thinking in parallel. A multi-CLI shell with a small in-app coordinator turns that pattern into a first-class workflow instead of a bunch of windows you tab through.
This is alpha and Windows-only on purpose — Windows-native AI agent CLIs are scarce, and .NET 10 + Akka.NET gives a path to grow this from a single-device shell into a remote/cluster runtime later (the PRO roadmap). For now: install on a machine you trust, audit the source, and if you're curious about the how, Part 2 is the deeper read.

TECH LINKS