Domain model

Concepts

Orca organises work as a tree of tasks, groups autonomous runs as missions, and gates risk through an overseer decision engine. This page covers the vocabulary you'll see across the daemon, CLI, and web UI.

Tasks

A task is a unit of work. Tasks form a tree via parent_id: an epic (root task) contains sub-tasks. Tasks can also declare dependencies (rows in the task_deps table); every dependency must be closed or cancelled before the task becomes ready.

Task lifecycle

open → in_progress → closed
open or in_progress → blocked (escalated to a human)
blocked → open (retry via manual unblock)
blocked → cancelled

Status	Meaning
`open`	Waiting to be picked up
`in_progress`	Assigned to an agent session
`blocked`	Escalated to human — stuck detector exceeded the relaunch budget, or a post-done review rejected a dependency
`closed`	Completed successfully
`cancelled`	Abandoned

Blocked tasks are excluded from readiness.ready(), so the engine tick skips them. To retry, a human must manually unblock the task (set it back to open).

Labels

Tasks carry string labels used for routing and agent naming:

exec:<spec> — route to a specific agent executor (e.g. exec:sonnet, exec:opencode:ollama-cloud/deepseek-v4-flash, exec:codex:gpt-5.5).
agent:<name> — pin a specific agent name so the deriver/janitor/stuck detector can resolve the task from a session without first-in-progress fallback.
started:<epoch-ms> — precise spawn timestamp for correct usage attribution under concurrency.
stuck:<n> — relaunch counter; incremented each time the stuck detector reverts this task, capped at maxRelaunch (2).
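The label grammar above can be sketched as a small parser. This is a minimal illustration, not the daemon's code: the TaskLabels shape and the parseLabels name are assumptions. The one detail worth showing is that the key is split on the first colon only, because an exec spec can itself contain colons.

```typescript
// Hypothetical parsed form of the routing labels; field names are assumptions.
interface TaskLabels {
  exec?: string;      // executor spec, e.g. "codex:gpt-5.5"
  agent?: string;     // pinned agent name
  startedMs?: number; // spawn timestamp in epoch milliseconds
  stuck: number;      // relaunch counter, 0 when the label is absent
}

function parseLabels(labels: string[]): TaskLabels {
  const parsed: TaskLabels = { stuck: 0 };
  for (const label of labels) {
    // Split on the first colon only: an exec spec may contain further colons.
    const idx = label.indexOf(":");
    if (idx < 0) continue; // not a key:value label, ignore it
    const key = label.slice(0, idx);
    const value = label.slice(idx + 1);
    if (value === "") continue;
    if (key === "exec") {
      parsed.exec = value;
    } else if (key === "agent") {
      parsed.agent = value;
    } else if (key === "started") {
      const ms = Number(value);
      if (Number.isFinite(ms)) parsed.startedMs = ms;
    } else if (key === "stuck") {
      const n = Number.parseInt(value, 10);
      if (Number.isInteger(n) && n >= 0) parsed.stuck = n;
    }
  }
  return parsed;
}
```

Unknown or malformed labels are skipped rather than rejected, so a task with free-form labels still parses.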

Readiness

A task is ready when it is open, not an epic, and every dependency (task_deps) is closed or cancelled. The Readiness service computes this at query time with a single NOT EXISTS deps check. The check runs either across a project (ready(projectId)) or scoped to one epic's direct children (readyForEpic(epicId)), so parallel missions don't walk each other's tasks.
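The readiness rule can be illustrated in memory. The real service runs a SQL NOT EXISTS check; the sketch below only mirrors its logic. The Task and TaskDep shapes, including the taskId and dependsOn field names, are assumptions, and an epic is modelled as a task whose parentId is null.

```typescript
type TaskStatus = "open" | "in_progress" | "blocked" | "closed" | "cancelled";

interface Task {
  id: string;
  parentId: string | null; // null marks an epic (root task) in this sketch
  status: TaskStatus;
}

// One entry per dependency edge, loosely mirroring the task_deps table.
interface TaskDep {
  taskId: string;
  dependsOn: string;
}

function isReady(task: Task, tasks: Map<string, Task>, deps: TaskDep[]): boolean {
  if (task.status !== "open") return false;
  if (task.parentId === null) return false; // epics are never ready
  // Equivalent of NOT EXISTS: no dependency that is still unsettled.
  // A dependency row pointing at a missing task is ignored (an assumption).
  return !deps.some((d) => {
    if (d.taskId !== task.id) return false;
    const dep = tasks.get(d.dependsOn);
    return dep !== undefined && dep.status !== "closed" && dep.status !== "cancelled";
  });
}

// Scoped variant: only the direct children of one epic are considered.
function readyForEpic(epicId: string, tasks: Map<string, Task>, deps: TaskDep[]): Task[] {
  return Array.from(tasks.values()).filter(
    (t) => t.parentId === epicId && isReady(t, tasks, deps),
  );
}
```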

Missions

A mission groups tasks under an epic for autonomous execution. The mission engine (MissionEngine) ticks active missions, picks each epic's ready tasks, and spawns agents up to max_sessions. The Overseer is not consulted at dispatch — it gates the agents' permission prompts (via the Deriver) and optional post-phase reviews.

Mission lifecycle

engage → active → disengaged
active → paused → active (resume)
active → stalled → active (blocked child unblocked or resumed)

State	Meaning
`active`	Engine processes this mission on each tick
`paused`	Skipped by the engine; running agents killed, tasks reverted to `open`
`stalled`	Active but no agent running and a child is `blocked` — waiting for human intervention
`disengaged`	All children closed/cancelled; mission complete

Engine tick

The tick loop runs every 90 seconds, one tick per active mission:

1. Load the mission, its epic, and its project.
2. If all children are closed/cancelled, auto-disengage.
3. Count running = this epic's own in_progress children (not global sessions).
4. Walk readiness.readyForEpic(epicId). For each ready task, while running < max_sessions: skip if autonomy is L0; otherwise resolve the executor from labels, pick an agent name, set in_progress, and spawn via tmux.
5. Detect stalled: zero running plus any blocked child marks the mission stalled; if work resumes, it flips back to active.
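The decision part of a tick can be sketched as a pure function. This is an illustration under assumptions, not the MissionEngine implementation: planTick, Child, and TickPlan are hypothetical names, and the side effects (loading rows, resolving executors, spawning via tmux) are left to the caller.

```typescript
type TaskStatus = "open" | "in_progress" | "blocked" | "closed" | "cancelled";
type Autonomy = "L0" | "L1" | "L2" | "L3";

interface Child {
  id: string;
  status: TaskStatus;
}

interface TickPlan {
  disengage: boolean; // every child settled, mission complete
  spawn: string[];    // ready task ids to set in_progress and spawn
  stalled: boolean;   // nothing running and at least one blocked child
}

function planTick(
  children: Child[],
  readyIds: string[],
  maxSessions: number,
  autonomy: Autonomy,
): TickPlan {
  // All children closed or cancelled: auto-disengage.
  // (An epic with no children also lands here; that edge case is an assumption.)
  if (children.every((c) => c.status === "closed" || c.status === "cancelled")) {
    return { disengage: true, spawn: [], stalled: false };
  }
  // Count only this epic's own in_progress children.
  let running = children.filter((c) => c.status === "in_progress").length;
  // Fill free session slots from the ready list, unless autonomy is L0.
  const spawn: string[] = [];
  if (autonomy !== "L0") {
    for (const id of readyIds) {
      if (running >= maxSessions) break;
      spawn.push(id);
      running += 1;
    }
  }
  // Zero running plus any blocked child means the mission is stalled.
  const stalled = running === 0 && children.some((c) => c.status === "blocked");
  return { disengage: false, spawn, stalled };
}
```

Keeping the decision separate from the spawning makes the slot arithmetic easy to test without tmux.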

Autonomy levels

Level	Name	Auto-spawn	Prompt gate	Confidence bar
L0	Recommend	Never	Always escalate to human	—
L1	Assist	Yes	Overseer gate (stricter)	0.85
L2	Pilot	Yes	Overseer gate (standard)	0.6
L3	Auto	Yes	Overseer gate (standard)	0.6

L0 — the engine never auto-spawns; the deriver escalates every detected permission prompt to human (needs_input).
L1 — auto-spawns ready tasks, but the deriver routes prompts through the overseer with a stricter threshold (0.85). Only clearly-safe steps auto-clear.
L2/L3 — auto-spawn, standard threshold (0.6). L3 additionally waves non-destructive prompts through when no overseer is configured at all; L2 escalates in that case.
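The confidence bars can be expressed as a small gate. The sketch below is illustrative only: gatePrompt and its types are hypothetical, the comparison is assumed to be inclusive, and a rejecting or low-confidence verdict is assumed to escalate rather than auto-reject. Destructive prompts are assumed to escalate at every level.

```typescript
type Autonomy = "L0" | "L1" | "L2" | "L3";

interface Verdict {
  approve: boolean;
  confidence: number; // 0..1
}

// Confidence bar per level; null means the level never auto-clears a prompt.
function confidenceBar(level: Autonomy): number | null {
  switch (level) {
    case "L0":
      return null;
    case "L1":
      return 0.85;
    case "L2":
    case "L3":
      return 0.6;
  }
}

function gatePrompt(
  level: Autonomy,
  verdict: Verdict,
  destructive: boolean,
): "auto_clear" | "escalate" {
  const bar = confidenceBar(level);
  // L0 always escalates; destructive prompts escalate regardless of verdict (assumption).
  if (bar === null || destructive) return "escalate";
  // Inclusive comparison is an assumption; the source only names the thresholds.
  return verdict.approve && verdict.confidence >= bar ? "auto_clear" : "escalate";
}
```

With these bars, an approving verdict at confidence 0.7 clears at L2 but escalates at L1.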

Overseer (decision gate)

Two decision paths, controlled by config.autopilot.overseerExec:

Relay path (default)

overseerExec is empty. Permission-prompt decisions go through a RelayClient using config.autopilot.overseerModel (falls back to the planner model). When no relay is wired at all, the daemon applies a conservative fallback: only L3 waves a non-destructive prompt through; L0–L2 escalate, and destructive prompts always escalate. Post-done reviews cannot run on the relay path — they require a parked overseer.

All decisions pass through the centralised gateVerdict() function, which applies the MIN_CONFIDENCE (0.6) threshold as a single source of truth.

Agent path (parked overseer)

overseerExec is set (e.g. sonnet). On mission engage, one Overseer agent is parked per active mission and runs a long-poll loop:

1. orca overseer poll blocks until a decision is needed and returns {id, kind, context}.
2. Judge the request.
3. orca overseer decide --id <id> --approve --confidence 0.85 --rationale "..." submits the verdict.
4. Back to step 1.

The local destructive heuristic (computed at enqueue time) is always authoritative — the agent cannot override it. A timeout (120s) or mission disengage conservatively escalates all pending decisions. The heuristic covers: rm -rf, DROP TABLE, DELETE FROM, TRUNCATE, migrations, .env, secrets/credentials, force-push, git reset --hard, chmod 777, curl/wget pipes to shell, python/node/perl -e, netcat, bash -c, eval(), os.system, subprocess, exec().
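A heuristic of this kind is typically a list of patterns matched against the prompt text. The sketch below covers only a subset of the categories listed above, and the regular expressions are assumptions written for illustration; the daemon's actual patterns are not shown on this page.

```typescript
// Illustrative subset of the documented categories; patterns are assumptions.
const DESTRUCTIVE_PATTERNS: RegExp[] = [
  /\brm\s+-rf\b/i,
  /\bDROP\s+TABLE\b/i,
  /\bDELETE\s+FROM\b/i,
  /\bTRUNCATE\b/i,
  /\.env\b/,
  /\bgit\s+push\b.*(--force|-f)\b/i,
  /\bgit\s+reset\s+--hard\b/i,
  /\bchmod\s+777\b/,
  /\b(curl|wget)\b[^|]*\|\s*(ba|z)?sh\b/i, // download piped into a shell
  /\b(python|node|perl)\d*\s+-e\b/i,
  /\bbash\s+-c\b/,
  /\beval\(/,
];

// True when any pattern matches. Computed once, at enqueue time,
// so a later verdict cannot change the result.
function isDestructive(promptText: string): boolean {
  return DESTRUCTIVE_PATTERNS.some((p) => p.test(promptText));
}
```

A pattern list like this errs on the side of false positives, which fits a gate whose failure mode is escalation to a human.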

Deriver

The deriver monitors agent sessions in near real time: it polls tmux every 5 seconds and infers agent state from the pane output.

Signal	Meaning
`working`	Agent is progressing normally
`needs_input`	Agent is waiting for user input (prompt detected, escalated)
`complete`	Task is closed

Prompt detection is implemented for all three supported agent programs (shellPatterns.ts): OpenCode's "Permission required" dialog, Claude Code's workspace-trust gate (auto-accepted) and "Do you want to proceed?" permission gate, and Codex's "Allow command?" / "Approve this command?" gate. Each detected prompt is hashed to avoid re-emitting the same signal on consecutive polls.
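Detection plus de-duplication can be sketched as follows. The patterns are built from the prompt strings quoted above, but the real shellPatterns.ts may match differently. The hash function is not specified on this page, so the FNV-1a style hash here is an assumption, as are the PromptDeduper name and the choice to forget a session's hash once its prompt disappears.

```typescript
// Patterns derived from the quoted prompt strings; the real file may differ.
const PROMPT_PATTERNS: RegExp[] = [
  /Permission required/,
  /Do you want to proceed\?/,
  /Allow command\?/,
  /Approve this command\?/,
];

// FNV-1a style 32-bit hash over UTF-16 code units (an assumed choice of hash).
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class PromptDeduper {
  // session name -> hash of the last prompt emitted for it
  private lastHash = new Map<string, number>();

  // True when a prompt is visible and differs from the last one emitted.
  shouldEmit(session: string, pane: string): boolean {
    const line = pane
      .split("\n")
      .find((l) => PROMPT_PATTERNS.some((p) => p.test(l)));
    if (line === undefined) {
      // Prompt gone: forget it so an identical prompt later emits again.
      this.lastHash.delete(session);
      return false;
    }
    const h = fnv1a(line.trim());
    if (this.lastHash.get(session) === h) return false; // same prompt as last poll
    this.lastHash.set(session, h);
    return true;
  }
}
```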

No-overseer fallback

L3: non-destructive prompts are waved through; destructive ones escalate.
L0–L2: all prompts escalate to human — no blanket approval.
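The fallback rule reduces to one condition. The function name below is hypothetical; the logic restates the two lines above.

```typescript
type Autonomy = "L0" | "L1" | "L2" | "L3";

// With no overseer configured, only L3 may wave a prompt through,
// and only when the prompt is non-destructive.
function noOverseerFallback(level: Autonomy, destructive: boolean): "allow" | "escalate" {
  return level === "L3" && !destructive ? "allow" : "escalate";
}
```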

PR-native workflow (optional)

Off by default. When Settings → Autopilot → PR workflow is enabled, each mission runs isolated and ships a real GitHub pull request instead of leaving uncommitted changes in the main checkout. It complements the overseer review (which still gates phases) — the PR is the final human gate plus a feedback loop.

Lifecycle, orchestrated by MissionGit and tracked in the mission_pr table:

Engage → a dedicated branch orca/<slug>-<epicId> and a sibling git worktree (<repo-parent>/.orca-worktrees/<slug>-<missionId>) are created; the mission's agents run there, not in the main checkout.
Per phase → on the approving review verdict (or on close when review-on-done is off), the daemon commits that phase's worktree changes with the phase title. A rejected phase never commits.
Epic done → the optional prVerifyCommand runs in the worktree. Non-zero holds the mission (stalled, escalation surfaced) and opens nothing. Green → push the branch and open the PR (auto), or wait for a manual POST /missions/:id/pr.
Feedback loop → a ~60s poller reads each open PR's reviews, line-level diff comments, and conversation comments. Any fresh actionable feedback — changes requested, a line comment, a COMMENTED review with a body (bots and human comments both count), or a conversation comment — is routed through the pilot, which plans 1..N fix phases under the epic and re-engages the mission. A fix budget (2 rounds) bounds the bot↔autopilot ping-pong; once spent, the mission parks as stalled for a human. Dedup is by last_review_ts; a merged/closed PR stops the watch and clears the budget.

The worktree is torn down on pause/disengage (the branch is kept). GitHub-only via the gh CLI: a missing gh/token/remote degrades to a no-op + warning, leaving the rest of autopilot unaffected.
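The feedback loop's dedup and fix budget can be sketched as a state update per poll. This is a guess at the shape, not MissionGit's code: PrWatch, Feedback, and onPoll are hypothetical, timestamps are assumed to be epoch milliseconds, and the mission is assumed to stall at the moment fresh feedback arrives with the budget already spent.

```typescript
interface PrWatch {
  lastReviewTs: number; // mirrors last_review_ts; epoch ms is an assumption
  fixRounds: number;    // fix rounds already spent
}

interface Feedback {
  ts: number;
  actionable: boolean;
}

const FIX_BUDGET = 2; // documented fix budget (2 rounds)

type Outcome = "ignore" | "plan_fix" | "stall";

function onPoll(watch: PrWatch, feedback: Feedback[]): { outcome: Outcome; watch: PrWatch } {
  // Dedup: only actionable feedback newer than the last seen timestamp counts.
  const fresh = feedback.filter((f) => f.actionable && f.ts > watch.lastReviewTs);
  if (fresh.length === 0) return { outcome: "ignore", watch };
  const newest = Math.max(...fresh.map((f) => f.ts));
  if (watch.fixRounds >= FIX_BUDGET) {
    // Budget spent: park for a human instead of planning another round.
    return { outcome: "stall", watch: { lastReviewTs: newest, fixRounds: watch.fixRounds } };
  }
  return { outcome: "plan_fix", watch: { lastReviewTs: newest, fixRounds: watch.fixRounds + 1 } };
}
```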

Pilot agent (AI planning)

When config.autopilot.pilotExec is set, POST /tasks/plan spawns a Pilot agent in the repository instead of using the relay planner. The Pilot reads the relevant files plus AGENTS.md, CLAUDE.md, and the README for conventions; decomposes the goal into 3–7 ordered phases; submits the plan via orca plan submit; and stops. It must not implement anything or spawn agents.

The PlanJobStore tracks the async job. Autopilot mode is always async — both backends return 202 Accepted with a jobId the web UI polls via GET /plan/:jobId. Only manual phases mode is synchronous (201). Plan jobs are in-memory and ephemeral: a daemon restart drops in-flight jobs (surfaced as failed), and a finished job is pruned after a 10-minute TTL.
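An in-memory job store with a TTL can be sketched as below. It is not the PlanJobStore itself: the class name, the job id format, and the injected clock parameter are assumptions chosen to keep the example testable. Because the map lives in process memory, a restart loses it, which matches the ephemeral behaviour described above.

```typescript
type JobState = "running" | "done" | "failed";

interface PlanJob {
  id: string;
  state: JobState;
  finishedAt?: number; // epoch ms, set once the job settles
}

const JOB_TTL_MS = 10 * 60 * 1000; // finished jobs are pruned after 10 minutes

class PlanJobsSketch {
  private jobs = new Map<string, PlanJob>();
  private nextId = 1;

  start(): string {
    const id = `job-${this.nextId++}`; // hypothetical id format
    this.jobs.set(id, { id, state: "running" });
    return id;
  }

  finish(id: string, state: "done" | "failed", now: number): void {
    const job = this.jobs.get(id);
    if (job !== undefined) {
      job.state = state;
      job.finishedAt = now;
    }
  }

  // Prune expired finished jobs, then look the job up.
  get(id: string, now: number): PlanJob | undefined {
    this.jobs.forEach((job, key) => {
      if (job.finishedAt !== undefined && now - job.finishedAt > JOB_TTL_MS) {
        this.jobs.delete(key);
      }
    });
    return this.jobs.get(id);
  }
}
```

A client polling GET /plan/:jobId would see running, then done or failed, then nothing once the TTL has passed.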

Authentication & authorisation

The daemon supports optional token-based auth. When a UserStore is configured, all endpoints except GET /health, GET /setup, and POST /auth/login require a bearer token.

login (username + password) → receive token → pass as Authorization: Bearer <token>

Tokens are issued via POST /auth/login (scrypt password verification), stored in the auth_tokens table, and revocable via POST /auth/logout. For SSE, where the browser EventSource API cannot set request headers, the token can be passed as ?token=<value> instead.
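On the client side, the rules above come down to three small helpers. The function names and the base URL in the usage note are hypothetical; the endpoint paths, the Bearer header, and the ?token= query parameter are the ones named in this section. GET /events is the SSE stream described under the event bus.

```typescript
// Endpoints that stay reachable without a token when a UserStore is configured.
const PUBLIC_ENDPOINTS = ["GET /health", "GET /setup", "POST /auth/login"];

function requiresToken(method: string, path: string): boolean {
  return !PUBLIC_ENDPOINTS.includes(`${method.toUpperCase()} ${path}`);
}

// Header form, used for ordinary requests.
function authHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}

// Query-string form, for SSE clients that cannot set headers.
function sseUrl(baseUrl: string, token: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/events?token=${encodeURIComponent(token)}`;
}
```

For example, with a daemon assumed to listen on http://localhost:4000, sseUrl("http://localhost:4000", token) yields the URL to hand to an EventSource.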

Token scopes

Scope	Purpose	Restrictions
`full`	Interactive user session (login via browser/CLI)	Bounded by the user's role and project assignments
`agent`	Spawned agent (worker, overseer, pilot) — injected via `ORCA_TOKEN`	Verb + path allow-list; project scope confined to the agent's live working set
`advisor`	Per-user assistant session (`orca-advisor-<userId>`)	Mapped to `full` at the guard so it has the user's rights, but isolated from login tokens

Agent-scoped tokens prevent a prompt-injected agent from creating users, performing admin operations, accessing projects it isn't actively working in, listing tokens, or spawning sessions. The agentAllowed() gate admits only the verbs the agent CLI actually drives (see CLI reference). Project ownership of the affected row is still enforced downstream by canAccessProject, so the agent cannot cross tenancy even within the allow-list.
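A verb and path allow-list can be sketched as a list of rules checked before any handler runs. Everything concrete below is hypothetical: the rule entries, the route paths, and the function name are invented for illustration and do not reflect the daemon's real allow-list, which follows the agent CLI.

```typescript
interface AllowRule {
  method: string;
  path: RegExp;
}

// Hypothetical entries for illustration only; not the daemon's real list.
const EXAMPLE_AGENT_RULES: AllowRule[] = [
  { method: "GET", path: /^\/tasks(\/[^/]+)?$/ },
  { method: "POST", path: /^\/tasks\/[^/]+\/close$/ },
];

// Default-deny: a request passes only if some rule matches both verb and path.
function agentAllowedSketch(rules: AllowRule[], method: string, path: string): boolean {
  const verb = method.toUpperCase();
  return rules.some((r) => r.method === verb && r.path.test(path));
}
```

The default-deny shape is the point: a new endpoint is unreachable for agent tokens until a rule is added, and project ownership is still checked afterwards.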

Event bus & phone push

The EventBus decouples services. It serves SSE streams at GET /events, invalidates React Query caches in the web UI, and drives two background subscribers:

PushDispatcher — maps lifecycle events (review escalation, needs_input, stall, completion, blocked task) to web-push phone notifications for the mission owner + admins.
UsageRecorder — snapshots each task's token/cost usage into the task_usage table the moment a task settles, so the stats page reads DB aggregates instead of rescanning CLI session stores.

Push dispatch mapping

Event	Trigger	Payload
`review` (not approved)	Overseer rejected or timed out a phase review	Inline Approve / Re-run
`signal` (`needs_input`)	Agent waiting on a permission prompt	Inline Allow / Reject (or tap-to-open for multi-choice)
`mission` (`stalled`)	No running agents and a blocked child	Tap-to-open
`mission` (`disengaged`)	Natural mission completion	FYI (mentions PR if one was opened)
`task` (`blocked`)	An agent died too many times (relaunch budget exceeded)	Tap-to-open
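The table above maps naturally onto a discriminated union. The event field names (approved, signal, state, status) and the PushKind labels are assumptions for this sketch; only the event kinds and their triggers come from the table.

```typescript
type OrcaEvent =
  | { type: "review"; approved: boolean }
  | { type: "signal"; signal: "working" | "needs_input" | "complete" }
  | { type: "mission"; state: "active" | "paused" | "stalled" | "disengaged" }
  | { type: "task"; status: "open" | "in_progress" | "blocked" | "closed" | "cancelled" };

type PushKind = "inline_approve_rerun" | "inline_allow_reject" | "tap_to_open" | "fyi";

// Returns the notification style for an event, or null when no push is sent.
function pushFor(e: OrcaEvent): PushKind | null {
  switch (e.type) {
    case "review":
      return e.approved ? null : "inline_approve_rerun";
    case "signal":
      return e.signal === "needs_input" ? "inline_allow_reject" : null;
    case "mission":
      if (e.state === "stalled") return "tap_to_open";
      if (e.state === "disengaged") return "fyi";
      return null;
    case "task":
      return e.status === "blocked" ? "tap_to_open" : null;
  }
}
```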

A VAPID keypair is generated on first boot and the private key never leaves the daemon. Subscriptions are opt-in per device from the Account page.

Stuck detector & post-done review

The stuck detector (src/overseer/stuckDetector.ts) runs every 60s with a 120s grace period. If an agent exits or crashes without running orca close, the detector increments the stuck:<n> label; past maxRelaunch (2) it sets the task blocked, otherwise it reverts it to open so the mission re-spawns it. A one-shot zombie reconcile runs on startup — same logic, no grace, no counter — to revert orphaned in-progress tasks.
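The relaunch decision can be sketched as a pure function over a task's labels. The boundary is an interpretation: the counter is assumed to cap at maxRelaunch, so the task is reverted twice and blocked on the third death. The function and type names are hypothetical, and the 60s interval and 120s grace period are left to the caller.

```typescript
const MAX_RELAUNCH = 2;

interface StuckResult {
  status: "open" | "blocked";
  labels: string[];
}

// Called when an agent has exited without closing its task.
function onAgentGone(labels: string[], maxRelaunch: number = MAX_RELAUNCH): StuckResult {
  // Read the current stuck:<n> counter, defaulting to 0.
  const current = labels.reduce((n, l) => {
    const m = /^stuck:(\d+)$/.exec(l);
    return m ? Math.max(n, Number(m[1])) : n;
  }, 0);
  const rest = labels.filter((l) => !l.startsWith("stuck:"));
  if (current >= maxRelaunch) {
    // Budget exhausted: escalate to a human, counter stays capped (assumption).
    return { status: "blocked", labels: [...rest, `stuck:${current}`] };
  }
  // Budget left: bump the counter and revert to open for a re-spawn.
  return { status: "open", labels: [...rest, `stuck:${current + 1}`] };
}
```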

Post-done review is an optional hard sequential gate, off by default. It applies when config.autopilot.reviewOnDone is true and an agent overseer is configured; the relay fallback cannot drive it. Closing a mission phase then synchronously sets all open direct dependents to blocked and enqueues a review-kind decision. On an approving, non-destructive verdict the dependents are released back to open and engine.tick() fires immediately, so the next phase spawns without waiting for the 90s interval.
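The hold-and-release behaviour of the gate can be sketched with two functions over a status map. The names and the ReviewVerdict shape are assumptions; enqueueing the decision and triggering the engine tick are left to the caller.

```typescript
type Status = "open" | "in_progress" | "blocked" | "closed" | "cancelled";

interface ReviewVerdict {
  approve: boolean;
  destructive: boolean;
}

// Hold: block every open direct dependent of the phase that just closed.
// Returns the ids that were held so they can be released later.
function holdDependents(status: Map<string, Status>, dependents: string[]): string[] {
  const held = dependents.filter((id) => status.get(id) === "open");
  held.forEach((id) => status.set(id, "blocked"));
  return held;
}

// Release: reopen the held tasks only on an approving, non-destructive verdict.
// Returns true when the caller should trigger an immediate engine tick.
function applyReview(status: Map<string, Status>, held: string[], v: ReviewVerdict): boolean {
  if (!v.approve || v.destructive) return false; // stays blocked for a human
  held.forEach((id) => status.set(id, "open"));
  return true;
}
```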

Assistant (per-user advisor)

Each user gets a persistent assistant agent (orca-advisor-<userId>) that drives Orca on their behalf through a built-in MCP server. It auto-starts on login (when a saved advisor_exec and advisor_autostart: true are set), remembers its model, and runs in a docked IDE-style side panel with a real-PTY terminal. Any session can be popped out into its own chromeless window.

The advisor acts through POST /mcp, handled statelessly with a fresh McpServer bound to the caller's bearer token — so every connection acts with exactly its user's rights. The toolset: orca_request (generic escape hatch), orca_tasks, orca_create_task, orca_plan, orca_sessions. Every tool delegates to the shared callOrcaApi core — the same forward path as the orca api CLI verb, so a new REST endpoint works in both with zero edits.

Continue to the CLI reference for the exact commands, or to Architecture for the module-level map.