Documentation

vibeCodeCLI by ThinkingDBx · a terminal-based agentic platform for your code and your data. Fan out autonomous agents across isolated git worktrees, or connect a warehouse and ship governed changes; bring your own key.

Questions or early access → [email protected]

Overview

Most terminal coding agents run a single loop against your working tree. vibeCodeCLI is built around parallel agents from the ground up: git worktrees give every agent a physically separate checkout of the same repo (sharing one .git), so agents never stomp each other. You launch a fleet, walk away, and review diffs side by side.

The four ideas that make this one product:

Autonomy / vibe mode · loose intent in, finished branches out.
Parallel multi-agent · many agents at once as a first-class concept.
Git-native · every run is an isolated worktree → reviewable branch.
Speed · Rust + tokio; orchestrating a fleet is concurrency-heavy.

And it's one agent across two surfaces: the same loop · fleet, memory, outcome-check · works on your codebase (where it ships reviewable branches) and on your data stack (where it ships governed warehouse changes). See Data engineering.

Install

$ curl -fsSL vibecodecli.com/install.sh | sh
$ vibecodecli --version

Installs a prebuilt binary for macOS & Linux (Intel or Apple Silicon / arm64) · a single static binary, no Rust or compiler required. It lands in ~/.local/bin; set VIBECODECLI_INSTALL_DIR to change that. The installer verifies a SHA-256 checksum before installing.

Stay current · vibeCodeCLI checks for updates at startup and self-updates in the background (next launch runs the new version). Update now with vibecodecli update or /update in-session. Remove config with vibecodecli uninstall, or the binary too with --purge.

First run & providers

You don't write config. On first run with nothing set up, vibeCodeCLI asks you · pick from 25 built-in providers and paste your key:

$ vibecodecli run "fix the bug"
vibeCodeCLI setup · pick from 25 providers, paste your key.

  1) anthropic  (e.g. claude-opus-4-8)
  2) openai     3) gemini    4) grok      5) cerebras
  6) groq       7) deepseek  8) mistral   9) kimi    10) qwen
 11) cohere   …  22) openrouter   23) ollama  (local / self-hosted)

Provider [1-25]: 3
Model [gemini-2.5-flash]: gemini-2.5-pro
gemini API key: ********
✓ Saved google/gemini-2.5-pro

It saves to ~/.config/vibecodecli/ (the key in a separate owner-only file). Re-run setup anytime with vibecodecli init. Already have a key in the environment (e.g. ANTHROPIC_API_KEY)? It's used automatically · no prompt.

Running

One-shot from the shell:

$ vibecodecli run "fix the failing test in src/foo.rs"     # single agent, streams live
$ vibecodecli run --agents 3 "make the parser 2x faster"   # fleet: 3 isolated agents

Fan-out launches N agents on the same prompt, each in its own worktree, runs them concurrently, and shows a live fleet dashboard (ratatui) with a pane per agent · status, activity, tokens, and the resulting diffstat. Press q to dismiss it; a side-by-side summary then prints, followed by a ★ recommendation that picks the strongest candidate. Piped output or --plain falls back to labeled plain text.
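
To make the fan-out concrete, here is a minimal Python sketch of the idea. It is not vibeCodeCLI's implementation (which is Rust + tokio). It builds one `git worktree add` command per agent and runs stand-in agent coroutines concurrently. The branch prefix, the checkout layout, and `fake_agent` are assumptions made up for illustration.

```python
import asyncio
from pathlib import Path


def worktree_commands(repo: Path, run_id: str, agents: int) -> list[list[str]]:
    """Build one `git worktree add` command per agent (hypothetical naming scheme)."""
    cmds = []
    for i in range(agents):
        branch = f"sketch/{run_id}-agent{i}"               # assumed branch name
        path = repo.parent / f"{repo.name}-{run_id}-{i}"   # assumed checkout location
        # `git worktree add -b <branch> <path>` creates a new branch in a separate checkout
        # that shares the original repository's .git directory.
        cmds.append(["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)])
    return cmds


async def fake_agent(index: int, prompt: str) -> dict:
    """Stand-in for a real agent loop; yields control once and reports back."""
    await asyncio.sleep(0)
    return {"agent": index, "prompt": prompt, "status": "done"}


async def fan_out(prompt: str, agents: int) -> list[dict]:
    """Run every agent concurrently and collect results in launch order."""
    return await asyncio.gather(*(fake_agent(i, prompt) for i in range(agents)))


if __name__ == "__main__":
    for cmd in worktree_commands(Path("/tmp/repo"), "r1", 3):
        print(" ".join(cmd))
    print(asyncio.run(fan_out("make the parser 2x faster", 3)))
```

Because each command targets a different path and branch, the agents never share a working directory, which is the property the fleet relies on.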

A worktree isolates the filesystem, not the warehouse. So in a fleet, workers author freely (read schema, plan blast-radius, write SQL/models in their branch) but writes to shared external state — dbt run, warehouse writes, mutating MCP calls — are deferred (shown as ⛔ deferred in fleet). You materialize once, on the branch the fan-in selects — so N parallel agents never collide on one warehouse.
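
The deferral rule above can be pictured with a short Python sketch. It is an illustration under assumptions, not the product's code: the set of tool names treated as external writes and the `Worker` class are hypothetical, chosen only to show "record now, materialize once on the selected branch".

```python
from dataclasses import dataclass, field

# Assumed classification: tools that touch shared external state.
EXTERNAL_WRITES = {"dbt_run", "run_write"}


@dataclass
class Worker:
    branch: str
    deferred: list = field(default_factory=list)

    def act(self, tool: str, payload: str) -> str:
        if tool in EXTERNAL_WRITES:
            self.deferred.append((tool, payload))  # recorded, not executed
            return "deferred"
        return "executed"                          # in-branch work runs immediately


def materialize(workers: list, selected_branch: str) -> list:
    """Return only the selected branch's deferred writes, to be run once."""
    for worker in workers:
        if worker.branch == selected_branch:
            return list(worker.deferred)
    return []


if __name__ == "__main__":
    a, b = Worker("run-a"), Worker("run-b")
    a.act("dbt_run", "build")
    b.act("run_write", "INSERT ...")
    print(materialize([a, b], "run-b"))
```

Only the winner's queue is replayed, so the warehouse sees one set of writes regardless of how many agents ran.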

Interactive session

Run it with no arguments to drop into the interactive session · a persistent REPL:

$ vibecodecli

The footer is pinned: a bordered input box and a status bar (model · branch · cwd · tokens) stay put while the transcript scrolls into your native scrollback above them. A spinner shows the current step and elapsed time; Esc cancels an in-flight turn. ↑/↓ recall previous prompts. Each tool call shows what it did · inline diffs for edits, ✓/✗ exit N for commands, and a live pipeline board (animated per-stage progress) when it runs dbt or a data pipeline.

The palette is calm by default · only errors and the theme accent carry color, so routine confirmations don't flash green; /palette switches to vivid, and NO_COLOR is honored. Mention @path in a prompt to attach a file's contents. Every conversation is saved automatically as a session: vibecodecli sessions lists them, vibecodecli --continue resumes the latest (-c <id> for a specific one).

Slash commands

Setup commands are guided, not manual. Run one bare — /connect, /model, /memory, /search, /mode, /config, /mcp, /palette, /theme — and it opens an interactive picker or a paste-a-key prompt (with a signup link where one's needed), so you never have to hand-edit config or memorize flags. Pass arguments to skip the picker (the power path), e.g. /connect tl <token> or /model anthropic claude-opus-4-8. Type / for the menu; ↑/↓ choose, Enter runs.

/plan <task>	Propose a plan first, without making changes.
/fleet N <prompt>	Fan out N parallel agents, each in its own worktree.
/model	Show or switch the provider/model (live list).
/mode <level>	Autonomy: `careful` · `auto` · `full-auto`.
/theme · /palette · /mouse	Accent color (orange, yellow, …) · palette (calm vs vivid — how much color the chat uses) · toggle mouse capture (wheel-scroll vs. drag-to-copy).
/update	Self-update to the latest release (restart to apply). Also `vibecodecli update`.
/undo	Revert the last turn's file changes.
/verify · /review	Run the build/test gate · review the working diff for issues.
/diff · /compact	Working-tree diff · shed old context to save tokens.
/memory · /search	Connect ThinkingMemory · connect web search (fetch_url is always on).
/connect tl|dbt|local|cloud|airflow · list · rm	Connect / manage a data backend or MCP integration in-session · hot-loads its tools (see Data engineering).
/mcp	Enable/disable connected MCP servers in-session · mute the ones you're not using to reclaim prompt tokens (their tool schemas are withheld from the model). The choice persists across launches.
/editor · /copy · /export	Compose in `$EDITOR` · copy last reply · save conversation to markdown.
/config · /init	Provider/model/mode/keys · generate a `VIBECODECLI.md` for the repo.
/doctor · /tokens	Health check (config, keys, git, MCP) · token usage this session.
/sessions · /resume <id> · /new	Switch saved sessions.
/list · /land <branch> · /clean <id|--all>	Manage fleet runs.
/clear · /help · /quit	Reset history · help · exit.

Aliases (q, h, exit) are omitted. Custom commands: drop a Markdown file in .vibecodecli/commands/<name>.md to add /<name> ($ARGUMENTS expands).
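
As a rough picture of how a custom command could expand, the Python sketch below reads the Markdown file and substitutes `$ARGUMENTS`. Plain textual substitution is an assumption (the exact expansion rules are not specified here), and the `changelog` command name is hypothetical.

```python
import tempfile
from pathlib import Path


def expand_custom_command(root: Path, name: str, arguments: str) -> str:
    """Read .vibecodecli/commands/<name>.md and substitute $ARGUMENTS (assumed literal)."""
    path = root / ".vibecodecli" / "commands" / f"{name}.md"
    return path.read_text(encoding="utf-8").replace("$ARGUMENTS", arguments)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        commands = Path(tmp) / ".vibecodecli" / "commands"
        commands.mkdir(parents=True)
        # "changelog" is a made-up command used only for this demo.
        (commands / "changelog.md").write_text(
            "Summarize changes since $ARGUMENTS.", encoding="utf-8"
        )
        print(expand_custom_command(Path(tmp), "changelog", "v1.2.0"))
```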

Memory

Run /memory and paste a key (there's a free-key link right in the prompt) — the session gains long-term memory scoped per-repo. Run it again any time for a status view where you enable/disable auto-recall, replace or test the key, or forget this repo's store. Each turn auto-recalls a ranked, token-budget-packed, cited slice of relevant history into context, and auto-remembers the turn's outcome · so recall improves as the agent runs and context carried across sessions stops evaporating. The model can also call the recall / remember tools deliberately.

Prefer config? Set it in vibecodecli.toml instead (or export THINKINGMEMORY_API_KEY):

# vibecodecli.toml
[memory]
base_url = "https://memory.thinkingdbx.com"  # or localhost:8091 to self-host

Recall is visible. A per-turn receipt shows what entered context and what it saved (↳ recalled 20 memories · ~785 tok, saved ~6.3k vs dump). Recalled text is fenced as untrusted data, so a stored memory can't smuggle instructions back into the agent; repeated outcomes are de-duplicated so ten identical dbt run successes don't crowd out real context. Auto-recall is best-effort with a short deadline: a healthy-but-slow backend reports "memory slow — skipped this turn" (not a false "unreachable"), and the turn proceeds without stalling.
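
One way to picture "ranked, token-budget-packed, de-duplicated" recall is a greedy pass over best-first results. The Python sketch below is an assumption about the general technique, not ThinkingMemory's algorithm; the normalization used for de-duplication is deliberately crude.

```python
def pack_memories(ranked, budget_tokens):
    """Greedily pack ranked (text, tokens) memories under a budget, skipping duplicates."""
    seen, packed, used = set(), [], 0
    for text, tokens in ranked:                # assumed already sorted best-first
        key = " ".join(text.lower().split())   # crude normalization for de-duplication
        if key in seen or used + tokens > budget_tokens:
            continue                           # duplicate, or it would overflow the budget
        seen.add(key)
        packed.append(text)
        used += tokens
    return packed, used


if __name__ == "__main__":
    ranked = [
        ("dbt run succeeded", 5),
        ("dbt run  succeeded", 5),             # duplicate after normalization
        ("orders.id renamed to order_id", 8),
        ("very long note", 100),               # does not fit a 20-token budget
    ]
    print(pack_memories(ranked, 20))
```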

Unconfigured, it's a silent no-op · the agent runs exactly as before. Toggle it live with /memory (which also tests the key: rejected vs unreachable), or vibecodecli doctor to probe reachability. ThinkingMemory on GitHub →

Safety & autonomy

Autonomy dial

Three levels: careful (ask before every command), auto (ask only before destructive commands or writes outside the repo · the default), or full-auto. Set it with -m or /mode; the current level is shown in the status bar.

Outcome-aware actions (fail-closed)

Before any action that touches external or irreversible state, a forward check predicts its effect and proceeds, defers, or rejects it. It's fail-closed: a filesystem delete, a DROP TABLE, a terraform destroy, a kubectl delete, or a mutating MCP tool is judged the same way · only provably read-only actions and reversible in-repo edits skip it. A predicted-unsafe action is rejected even in full-auto. If the judge is unavailable, auto/careful fall back to asking you, and full-auto won't silently run an un-cleared stateful action.
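
The fail-closed rule can be written down as a small decision function. This Python sketch models only the outcome check, not the autonomy prompts, and it is an illustration rather than the shipped logic. One point is an explicit assumption: when the judge is unavailable in full-auto, the sketch defers the action, since the text above only says it will not run silently.

```python
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"
    DEFER = "defer"
    REJECT = "reject"
    ASK = "ask"


def gate(mode, exempt, prediction):
    """Decide what to do with one action.

    mode:       "careful" | "auto" | "full-auto"
    exempt:     True only for provably read-only actions or reversible in-repo edits
    prediction: "safe" | "unsafe" | "defer", or None when the judge is unavailable
    """
    if exempt:
        return Verdict.PROCEED            # skips the check entirely
    if prediction is None:                # judge unavailable: never run silently
        # Assumption: full-auto defers; careful and auto fall back to asking.
        return Verdict.DEFER if mode == "full-auto" else Verdict.ASK
    if prediction == "unsafe":
        return Verdict.REJECT             # rejected in every mode, including full-auto
    if prediction == "defer":
        return Verdict.DEFER
    return Verdict.PROCEED


if __name__ == "__main__":
    print(gate("full-auto", False, "unsafe"))
    print(gate("auto", False, None))
```

The key property is that nothing stateful reaches `PROCEED` without either an exemption or an explicit "safe" prediction.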

/undo

Every turn snapshots the files it touches, so /undo reverts the last turn's file changes in one step.

Untrusted content & secrets

Output from external sources — a fetched web page, a search result, any MCP/warehouse tool — is fenced as untrusted data before it reaches the model, with a standing rule to treat it as information, never instructions. So a hostile page or a compromised tool server can't smuggle in "ignore your instructions and…". fetch_url resolves and pins the vetted IP (no DNS-rebinding into internal/metadata addresses), and secrets at rest — credentials, the local runner config, session transcripts — are written owner-only (0600). Text bound for cloud memory is secret-scrubbed first.
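
The resolve-and-pin idea behind fetch_url can be sketched with Python's standard `ipaddress` and `socket` modules. This is a sketch of the general technique under assumptions, not the tool's actual code; `resolve_and_pin` and its injectable `resolver` parameter are hypothetical names.

```python
import ipaddress
import socket


def is_public(ip: str) -> bool:
    """True when the address is not private, loopback, link-local, or otherwise special."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_private or addr.is_loopback or addr.is_link_local
        or addr.is_reserved or addr.is_multicast or addr.is_unspecified
    )


def resolve_and_pin(host: str, resolver=None) -> str:
    """Resolve once, vet every candidate, and return a single IP to connect to.

    Connecting to the returned IP, rather than re-resolving the hostname later,
    is what closes the DNS-rebinding window. `resolver` is injectable for testing.
    """
    if resolver is None:
        resolver = lambda h: [info[4][0] for info in socket.getaddrinfo(h, 443)]
    candidates = resolver(host)
    if not candidates:
        raise ValueError(f"{host} did not resolve")
    for ip in candidates:
        if not is_public(ip):
            raise ValueError(f"{host} resolves to blocked address {ip}")
    return candidates[0]


if __name__ == "__main__":
    print(is_public("169.254.169.254"))  # cloud metadata address is link-local
    print(is_public("8.8.8.8"))
```

Rejecting the whole lookup when any candidate is internal is a conservative choice made for this sketch.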

Data engineering

Connect a data backend and the same agent · fleet, memory, outcome-check and all · gains a data-engineering toolset. It grounds in your real schema and lineage before writing SQL, runs across 15 connectors, and can execute without the cloud ever holding your credentials.

Just type /connect — it opens a menu of every database, framework, and platform, and picking one starts a guided wizard (local vs cloud, a name, the connection string), so you don't need to know a token format or connstring syntax up front. Want to script it? Pass arguments to skip the wizard (the power path below), or run the same from your shell with vibecodecli connect:

❯ /connect tl <token>                  # ThinkingLanguage cloud · your whole workspace
❯ /connect dbt ./my-dbt-project        # a dbt project · schema + lineage
❯ /connect local prod "postgresql://…" # creds stay on THIS machine (a runner executes)
❯ /connect cloud wh "postgresql://…"   # creds stored ENCRYPTED in the cloud (no runner)
❯ /connect airflow https://af.co <jwt>  # Apache Airflow, as MCP tools (uvx server)
❯ /connect list                        # every connection · where creds live · writable?
❯ /connect rm prod                     # remove (from cloud + local)

tl · paste a workspace MCP token from tl.thinkingdbx.com (Settings → MCP server). Every connection in your TL workspace becomes available; the warehouse tools appear to the model as mcp__tl__*. Do this once; the others build on it.
dbt · point at a dbt project directory; the agent reads manifest.json for grounding & blast-radius and can run dbt build/run/test.
local · register a database whose credentials stay on this machine. The cloud stores only the name + connector type; a runner (embedded in the CLI) executes against it. Connector inferred from the connstring, or pass --connector; --writable to allow writes.
cloud · register a database the cloud manages · the connstring is sent to TL and stored encrypted at rest (AES-256-GCM), and the cloud executes directly, no runner needed. A one-time consent line reminds you the credentials leave the machine (the opposite of local).
list / rm · /connect list shows every connection and where its credentials live; /connect rm <name> deletes it from the cloud workspace and your local runner config.

How a specific database connects. There's no per-engine verb (no /connect sqlite). An individual database · SQLite, Postgres, MySQL, SQL Server, DuckDB, ClickHouse, Redshift, Snowflake, BigQuery, Databricks · connects with /connect local (creds stay put) or /connect cloud (creds encrypted in the cloud), e.g. /connect local app ./app.sqlite. The connector type is inferred for postgresql://, mysql://, *.duckdb, and *.sqlite; for the rest pass --connector <type>. Databases already in your TL cloud workspace need no per-database step · /connect tl exposes them all at once.
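
The inference rule in this paragraph amounts to a prefix and suffix check. The Python sketch below mirrors the four forms named above; the returned connector identifiers are illustrative guesses, not the CLI's actual type names.

```python
def infer_connector(target: str):
    """Return a connector type for the forms described as inferred, else None."""
    lowered = target.lower()
    if lowered.startswith("postgresql://"):
        return "postgres"   # identifier strings here are illustrative
    if lowered.startswith("mysql://"):
        return "mysql"
    if lowered.endswith(".duckdb"):
        return "duckdb"
    if lowered.endswith(".sqlite"):
        return "sqlite"
    return None             # caller would need to pass --connector <type>


if __name__ == "__main__":
    for target in ("./app.sqlite", "postgresql://user@host/db", "analytics.duckdb"):
        print(target, "->", infer_connector(target))
```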

Orchestrators & other tools (MCP). Things that aren't SQL backends come in as MCP servers. Apache Airflow is turnkey: /connect airflow <base_url> <jwt-token> registers mcp-server-apache-airflow (run via uvx) so the agent gets mcp__airflow__* tools · list/trigger/inspect DAG runs, task states, etc. It's read-only by default (add --writable to allow triggering/mutating; the outcome-check still gates those). The JWT stays in your local credential store; only the host goes in config. Any other MCP tool (GitHub, Slack, …) is configured directly under [mcp.<name>].

Credential-local runner

For /connect local connections, a runner executes queries against your warehouse from your machine · the cloud orchestrates the work but never sees a password. The runner is embedded in the CLI: a normal vibecodecli session serves your local connections automatically. It long-polls the cloud for queries, writes, and cross-source pipelines, runs them locally via tl, and returns only results · outbound HTTP only. Run it standalone with vibecodecli runner for a dedicated data plane. Credentials live in ~/.config/vibecodecli/runner.json (owner-only); only results cross the wire.

Tools & connectors

Once a backend is connected the model gains these tools. Read-only ones skip the outcome gate; writes are opt-in per connection and cost-previewed first.

introspect_connection	Exact tables, columns, and types from the live warehouse.
dbt_catalog · dbt_impact	Real columns of a model · blast-radius of a rename/drop (reads the DAG, nothing runs).
explain	Query plan / bytes scanned before the query runs · the cost preview.
run_query	Read-only SQL against a connection.
run_write	INSERT / UPDATE / DELETE / DDL · opt-in, gated by the outcome check.
run_transform	In-warehouse transform (CTAS) · pushed down to the warehouse's compute, no rows pulled.
run_pipeline	Cross-source ETL · joins/transforms run in the engine (DataFusion), never in the LLM.
compare_tables	Row-count + schema parity check (e.g. verify a Redshift → Snowflake migration).
dbt_run	Materialize models via `dbt build/run/test` · the framework write path. Streams a live pipeline board (per-model progress); `Esc` stops the warehouse run.

15 connectors: PostgreSQL, MySQL, SQL Server, Snowflake, BigQuery, Databricks, Redshift, ClickHouse, DuckDB, SQLite, MongoDB, Redis, Apache Iceberg (read today · writes coming soon), S3, SFTP. Writes (INSERT / UPDATE / DELETE / DDL) span 10+ SQL engines.

A pipeline action — dbt run, a warehouse/pipeline execution — renders a live stage board: each stage with its status and timing, a progress bar, and per-stage completion as it runs, so a long materialization reads as alive rather than frozen. In a fleet these writes are deferred (see Running).

The outcome check treats a DROP TABLE or a destructive write exactly like a filesystem delete · it predicts the effect, shows the cost, then proceeds, defers, or rejects. Big data never flows through the model: in-warehouse work pushes down to the warehouse; cross-source work runs in the engine.

Bonacci Studio

Bonacci Studio is ThinkingDBx's visual, agentic data-engineering & data-science platform · build database, API, file, Kafka and Spark pipelines on a drag-and-drop canvas, with SQL, PySpark, DAGs, and ThinkingLanguage. vibeCodeCLI connects to your Studio account over MCP, so the agent can run, create, troubleshoot, and monitor your real pipelines from the terminal.

Connect

In Studio, open Settings → API Tokens and generate a token (bcs_pat_…, shown once). Then pick Bonacci Studio from the /connect menu and paste it — or use the direct form:

❯ /connect studio <token>        # or: vibecodecli connect studio <token>

The token authenticates as you and is stored locally; the agent gains mcp__studio__* tools scoped to your account. (Self-hosted Studio? point at <host>/api/mcp.)

What it can do

design_pipeline	Describe a pipeline in plain language · any type (database, API, file, Kafka, Spark/PySpark, SQL, DAG, ThinkingLanguage) · and Studio's AI pipeline designer builds it. Conversational: pass the returned `conversationId` to iterate.
list_pipelines · get_pipeline · run_pipeline · create_pipeline	Inspect, trigger, and create pipelines (the typed create covers DATABASE / API / FILE / CODEGEN).
pipeline_status · pipeline_executions · execution_logs · pipeline_logs · pipeline_metrics	Monitor runs and troubleshoot failures · status, run history, logs, and metrics.
list_connections · list_schedules · schedules_for_pipeline · create_schedule	Connections (name + type only · never secrets) and schedules.

Mutating tools (run / create / schedule) ride the outcome-aware confirm gate. Tools run in your account's security context; credentials stay in Studio.

Config & roles

Roles map jobs to models, so cost-tiering is a config choice · not a fragile runtime router. To set them by hand, edit vibecodecli.toml in the working directory or ~/.config/vibecodecli/config.toml:

[roles.orchestrator]   # the main / single agent
provider = "anthropic"
model = "claude-opus-4-8"
effort = "xhigh"

[roles.worker]          # each parallel fan-out agent
provider = "cerebras"
model = "gpt-oss-120b"

# [roles.judge] runs the outcome check · defaults to your own model at low
# effort. Set explicitly for a cheaper/faster or INDEPENDENT (decorrelated) judge.

orchestrator	The main / single agent. Default: Opus 4.8 @ xhigh.
worker	Each parallel fan-out agent. Default: Sonnet 4.6 @ low.
judge	The outcome check. Default: your own model, low effort.
cheap	Mechanical ops (reserved). Default: Haiku 4.5.

25 providers are built in (anthropic, openai, gemini, grok, mistral, deepseek, kimi, glm, qwen, cohere, perplexity, groq, cerebras, together, fireworks, deepinfra, openrouter, cloudflare (Workers AI), huggingface, ollama, …) · reference them by name, no base URL needed. Two wire shapes cover the ecosystem: native Anthropic and the OpenAI-compatible chat API. Self-hosted & local servers (Ollama, vLLM, llama.cpp) work too · keyless, point at any local or remote host (e.g. OLLAMA_HOST). /model fetches each provider's current models live.

Choosing a local model. vibeCodeCLI is a tool-using agent · it drives work through function/tool calls, so a self-hosted model must emit structured tool calls (not just describe them in prose). Not every Ollama model does. We test against these:

✓ Recommended: qwen3 (e.g. qwen3:14b), llama3.3, mistral-small · drive the full read → edit → verify loop reliably. Bigger is better; a ~14B+ model is the practical floor for agentic work.
✗ Avoid for agent use: qwen2.5-coder emits tool calls as plain text in Ollama, so the agent can't act on them. Small models (≤8B) often half-use tools, then revert to printing answers.

Rule of thumb: if vibecodecli reads files but never edits, the model isn't returning real tool calls · switch to one of the recommended models. Frontier hosted models (Claude, GPT, Gemini, GLM via zai) all tool-call natively.

Mind the context window. On a long task the replayed history can exceed a local model's window; the server then silently drops the oldest tokens · the system prompt and your task · and the agent forgets what it's doing. Give the model a large window and set the compaction budget below it: e.g. OLLAMA_CONTEXT_LENGTH=32768 VIBECODECLI_CONTEXT_BUDGET=24000.
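
The arithmetic behind that example is simple enough to check in a few lines of Python. The helper name `headroom` is hypothetical; the two numbers are the ones from the example above. The gap presumably has to cover whatever is added between compactions plus the model's reply.

```python
def headroom(context_length: int, budget: int) -> int:
    """Tokens left between the compaction budget and the model's window."""
    if budget >= context_length:
        raise ValueError("compaction budget must be below the context window")
    return context_length - budget


if __name__ == "__main__":
    # OLLAMA_CONTEXT_LENGTH=32768, VIBECODECLI_CONTEXT_BUDGET=24000
    print(headroom(32768, 24000))  # 8768
```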

MCP, hooks & custom commands

MCP · configure Model Context Protocol servers in vibecodecli.toml ([mcp.<name>]); their tools are exposed to the model as mcp__<name>__<tool>. Two transports: stdio (set command/args; the server is launched as a child process) and http (set url; JSON-RPC is POSTed to a remote endpoint with a Bearer token, which is how tl and studio connect).

[mcp.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
# muted = true   # connected but tools withheld — toggle live with /mcp

Token control. Every connected server advertises all its tools on every request (the model can only call a tool it's shown), so a big server's schemas cost prompt tokens each turn even when unused. /mcp lets you mute a server: it stays connected but its tools are withheld from the model, reclaiming those tokens. The choice persists across launches (muted = true). If a request still overflows the model's context window, vibecodecli degrades gracefully: it sheds the repo map and recalled memory and retries, so small-window models still answer instead of dead-ending.
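
Both halves of token control, muting and graceful degradation, fit in a short Python sketch. It is a simplified model under assumptions: the server dictionaries, the `create_issue` tool name, and the character-count stand-in for a tokenizer are all hypothetical.

```python
def visible_tools(servers: dict) -> list:
    """Tool names shown to the model; muted servers stay connected but contribute none."""
    return [
        f"mcp__{name}__{tool}"
        for name, cfg in servers.items()
        if not cfg.get("muted", False)
        for tool in cfg["tools"]
    ]


def build_request(system, task, repo_map, memory, window, count_tokens=len):
    """Try the full prompt; on overflow shed the repo map and recalled memory, then retry."""
    full = [system, repo_map, memory, task]
    if sum(count_tokens(part) for part in full) <= window:
        return full
    lean = [system, task]                      # optional context removed
    if sum(count_tokens(part) for part in lean) <= window:
        return lean
    raise ValueError("request exceeds the context window even after shedding")


if __name__ == "__main__":
    servers = {
        "github": {"tools": ["create_issue"]},          # hypothetical tool name
        "big": {"muted": True, "tools": ["a", "b"]},
    }
    print(visible_tools(servers))
    print(build_request("sys", "task", "m" * 100, "r" * 100, window=50))
```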

Hooks · pre/post_tool_use guardrails.
Custom commands · drop Markdown files in .vibecodecli/commands/*.md.
Repo map + project memory · on start, a git ls-files map of files & symbols plus your AGENTS.md / VIBECODECLI.md load into context.
Explore subagent · delegate investigation to a read-only subagent that returns only a digest.
Web access · fetch_url reads any page as text (always on, no key; private/loopback hosts are blocked); web_search adds Tavily / Brave / Exa with a key.

Workspace trust · a project-local vibecodecli.toml (in the dir you start from · attacker-controlled in a cloned repo) is loaded for providers/models, but its code-executing [mcp] / [hooks] are ignored until you run vibecodecli trust there. Your global config and an explicit $VIBECODECLI_CONFIG are always trusted.
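
Workspace trust can be thought of as a filter applied to the project-local config before it is used. The Python sketch below works on an already-parsed config dictionary; the function name and the exact set of section keys are assumptions based on the description above.

```python
# Sections described above as code-executing.
CODE_EXECUTING_SECTIONS = ("mcp", "hooks")


def effective_project_config(config: dict, trusted: bool) -> dict:
    """Drop code-executing sections from an untrusted project-local config."""
    if trusted:
        return dict(config)
    return {k: v for k, v in config.items() if k not in CODE_EXECUTING_SECTIONS}


if __name__ == "__main__":
    project = {
        "roles": {"worker": {"provider": "cerebras", "model": "gpt-oss-120b"}},
        "mcp": {"github": {"command": "npx"}},
        "hooks": {},
    }
    print(sorted(effective_project_config(project, trusted=False)))
    print(sorted(effective_project_config(project, trusted=True)))
```

Provider and model settings survive either way; only the sections that could launch processes wait for an explicit trust decision.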

Workspace & isolation

--cwd <dir> points the agent at any directory · a repo, a plain folder, anywhere.
--isolation picks the strategy: worktree (each agent in its own git worktree — the default in a repo, and what makes parallel fan-out collision-safe) or none (operate in place — for single-agent or cross-cutting work, and the default outside a git repo).

Managing runs

Each run leaves a reviewable branch and worktree.

$ vibecodecli list                       # show every run's branch + worktree
$ vibecodecli land vibecodecli/run-…      # merge a winner into the current branch
$ vibecodecli clean run-…                 # remove one run's worktree + branch
$ vibecodecli clean --all                 # remove every run

Closing the loop

Generation quality is mostly a property of the scaffold, not just the model · so the harness enforces what a careful engineer would do.

Verification gate · after editing, vibecodecli runs the project's build/test gate in code and, on failure, feeds the error back and keeps working. Auto-detects per project (Cargo → cargo check, Go → go build, npm build, tsconfig → tsc --noEmit); override with [verify] command.
Thrash guard · an identical tool call that already failed twice is blocked; the model is told to change approach.
Fan-in · a synthesis pass picks the strongest candidate and names the runner-up's weakness (★ recommendation).
Decorrelated review (opt-in, [verify] review = true) · a fresh-context reviewer (set the judge role to a different model family) checks the diff against the goal and bounces it back once if it finds real defects. Fail-open.
Cost / prompt caching · the loop re-sends a growing transcript each turn, so prompt caching is the dominant cost lever; each request sets a cache breakpoint and re-reads the prior prefix at ~0.1× input price. Every run prints a token summary.
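
The thrash guard in the list above can be sketched as a failure counter keyed by the exact call. This Python version is an illustration under assumptions, not the harness's code; the class name and the `run_command` tool name are hypothetical.

```python
import json
from collections import Counter


class ThrashGuard:
    """Block a tool call once the identical call has already failed twice."""

    def __init__(self, max_failures: int = 2):
        self.max_failures = max_failures
        self.failures = Counter()

    @staticmethod
    def _key(tool: str, args: dict) -> str:
        # Canonical JSON so argument order does not create distinct keys.
        return tool + ":" + json.dumps(args, sort_keys=True)

    def allowed(self, tool: str, args: dict) -> bool:
        return self.failures[self._key(tool, args)] < self.max_failures

    def record_failure(self, tool: str, args: dict) -> None:
        self.failures[self._key(tool, args)] += 1


if __name__ == "__main__":
    guard = ThrashGuard()
    call = ("run_command", {"cmd": "cargo check"})   # hypothetical tool name
    guard.record_failure(*call)
    guard.record_failure(*call)
    print(guard.allowed(*call))                      # False: time to change approach
```

Keying on the full argument set means a genuinely different attempt is still allowed, while a verbatim retry is not.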