Domain model

Concepts

Orca organises work as a tree of tasks, groups autonomous runs as missions, and gates risk through an overseer decision engine. This page covers the vocabulary you'll see across the daemon, CLI, and web UI.

Tasks

A task is a unit of work. Tasks form a tree via parent_id — an epic (root task) contains sub-tasks. Tasks can also declare dependencies (the task_deps table) that must be closed before the task becomes ready.

Task lifecycle

open → in_progress → closed
          ↓            ↑
       blocked ────────┘   (retry via manual unblock)
          ↓
      cancelled
| Status | Meaning |
| --- | --- |
| open | Waiting to be picked up |
| in_progress | Assigned to an agent session |
| blocked | Escalated to human: the stuck detector exceeded the relaunch budget, or a post-done review rejected a dependency |
| closed | Completed successfully |
| cancelled | Abandoned |

Blocked tasks are excluded from readiness.ready(), so the engine tick skips them. A human must manually unblock (set back to open) to retry.

Labels

Tasks carry string labels used for routing and agent naming.

Readiness

A task is ready when it is open, not an epic, and every dependency (task_deps) is closed or cancelled. The Readiness service computes this at query time with a single NOT EXISTS deps check — across a project (ready(projectId)) or scoped to one epic's direct children (readyForEpic(epicId)), so parallel missions don't walk each other's tasks.
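
The readiness rule above can be sketched in memory as follows. This is a minimal illustration, not the Readiness service itself: the `Task` and `TaskDep` shapes, the field names, and the sample data are assumptions, and the real check is a SQL NOT EXISTS over task_deps.

```typescript
// Hypothetical in-memory shapes; the real data lives in the tasks and task_deps tables.
type Status = "open" | "in_progress" | "blocked" | "closed" | "cancelled";

interface Task {
  id: string;
  parentId: string | null; // null marks a root task (epic)
  status: Status;
}

interface TaskDep {
  taskId: string;    // the dependent task
  dependsOn: string; // the task it waits for
}

// A dependency stops blocking once it is closed or cancelled.
const isSettled = (s: Status): boolean => s === "closed" || s === "cancelled";

// Ready = open direct child of the epic with no unsettled dependency.
function readyForEpic(epicId: string, tasks: Task[], deps: TaskDep[]): Task[] {
  const byId = new Map(tasks.map((t) => [t.id, t] as const));
  return tasks.filter((t) => {
    if (t.parentId !== epicId || t.status !== "open") return false;
    // In-memory stand-in for the NOT EXISTS check.
    // Assumption: a dependency row pointing at a missing task does not block.
    return !deps.some((d) => {
      if (d.taskId !== t.id) return false;
      const dep = byId.get(d.dependsOn);
      return dep !== undefined && !isSettled(dep.status);
    });
  });
}

const sampleTasks: Task[] = [
  { id: "e1", parentId: null, status: "open" },
  { id: "a", parentId: "e1", status: "closed" },
  { id: "b", parentId: "e1", status: "open" },    // depends on a (closed)
  { id: "c", parentId: "e1", status: "open" },    // depends on b (still open)
  { id: "d", parentId: "e1", status: "blocked" }, // blocked tasks are never ready
  { id: "x", parentId: "e2", status: "open" },    // belongs to another epic
];
const sampleDeps: TaskDep[] = [
  { taskId: "b", dependsOn: "a" },
  { taskId: "c", dependsOn: "b" },
];

const readyIds = readyForEpic("e1", sampleTasks, sampleDeps).map((t) => t.id);
console.log(readyIds); // ["b"]
```

Scoping the filter to one epic's direct children is what keeps parallel missions from picking up each other's tasks.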


Missions

A mission groups tasks under an epic for autonomous execution. The mission engine (MissionEngine) ticks active missions, picks each epic's ready tasks, and spawns agents up to max_sessions. The Overseer is not consulted at dispatch — it gates the agents' permission prompts (via the Deriver) and optional post-phase reviews.

Mission lifecycle

engage → active → disengaged
           ↓
         paused → active (resume)
           ↓
         stalled → active (blocked child unblocked or resumed)
| State | Meaning |
| --- | --- |
| active | Engine processes this mission on each tick |
| paused | Skipped by the engine; running agents killed, tasks reverted to open |
| stalled | Active but no agent running and a child is blocked; waiting for human intervention |
| disengaged | All children closed/cancelled; mission complete |

Engine tick

The tick loop runs every 90 seconds, one tick per active mission:

  1. Load the mission, its epic, and its project.
  2. If all children are closed/cancelled → auto-disengage.
  3. Count running = this epic's own in_progress children (not global sessions).
  4. Walk readiness.readyForEpic(epicId); for each, while running < max_sessions: skip if autonomy is L0; otherwise resolve the executor from labels, pick an agent name, set in_progress, spawn via tmux.
  5. Detect stalled: zero running + any blocked child → mark stalled; if work resumes → flip back to active.
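
The decision part of one tick (steps 2 to 5) can be sketched as a pure function. This is an illustration under stated assumptions, not MissionEngine itself: `planTick`, `TickInput`, and `TickResult` are hypothetical names, readiness is passed in as a precomputed flag, and executor resolution, agent naming, and the tmux spawn are left out.

```typescript
type Autonomy = "L0" | "L1" | "L2" | "L3";
type ChildStatus = "open" | "in_progress" | "blocked" | "closed" | "cancelled";

interface Child {
  id: string;
  status: ChildStatus;
  ready: boolean; // assumed to come from readiness.readyForEpic(epicId)
}

interface TickInput {
  autonomy: Autonomy;
  maxSessions: number;
  children: Child[]; // direct children of the mission's epic
}

type TickResult =
  | { kind: "disengage" }
  | { kind: "stalled" }
  | { kind: "spawn"; taskIds: string[] };

function planTick(input: TickInput): TickResult {
  const { autonomy, maxSessions, children } = input;

  // Step 2: everything closed or cancelled means the mission is complete.
  // Note: an epic with no children also matches here; the source does not
  // say how the real engine treats that case.
  if (children.every((c) => c.status === "closed" || c.status === "cancelled")) {
    return { kind: "disengage" };
  }

  // Step 3: count only this epic's own in_progress children.
  let running = children.filter((c) => c.status === "in_progress").length;

  // Step 4: fill free slots with ready tasks; L0 never auto-spawns.
  const taskIds: string[] = [];
  if (autonomy !== "L0") {
    for (const c of children) {
      if (running >= maxSessions) break;
      if (c.status === "open" && c.ready) {
        taskIds.push(c.id);
        running++;
      }
    }
  }

  // Step 5: nothing running and a blocked child means the mission is stalled.
  if (running === 0 && children.some((c) => c.status === "blocked")) {
    return { kind: "stalled" };
  }
  return { kind: "spawn", taskIds };
}
```

Keeping the decision separate from the side effects (status updates, tmux spawn) makes the slot arithmetic easy to test in isolation.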

Autonomy levels

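| Level | Name | Auto-spawn | Prompt gate | Confidence bar |
| --- | --- | --- | --- | --- |
| L0 | Recommend | Never | Always escalate to human | |
| L1 | Assist | Yes | Overseer gate (stricter) | 0.85 |
| L2 | Pilot | Yes | Overseer gate (standard) | 0.6 |
| L3 | Auto | Yes | Overseer gate (standard) | 0.6 |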

Overseer (decision gate)

Two decision paths, controlled by config.autopilot.overseerExec:

Relay path (default)

overseerExec is empty. Permission-prompt decisions go through a RelayClient using config.autopilot.overseerModel (falls back to the planner model). When no relay is wired at all, the daemon applies a conservative fallback: only L3 waves a non-destructive prompt through; L0–L2 escalate, and destructive prompts always escalate. Post-done reviews cannot run on the relay path — they require a parked overseer.
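
The no-relay fallback described above reduces to a two-line rule. The sketch below is illustrative only; `noRelayFallback` is a hypothetical name, and the `destructive` flag is assumed to come from the local heuristic.

```typescript
type AutonomyLevel = "L0" | "L1" | "L2" | "L3";
type FallbackOutcome = "allow" | "escalate";

// Conservative fallback when no relay is wired (hypothetical helper).
function noRelayFallback(level: AutonomyLevel, destructive: boolean): FallbackOutcome {
  if (destructive) return "escalate";          // destructive prompts always escalate
  return level === "L3" ? "allow" : "escalate"; // only L3 lets a safe prompt through
}
```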

All decisions pass through the centralised gateVerdict() function, which applies the MIN_CONFIDENCE (0.6) threshold as a single source of truth.
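
One way such a gate could combine the threshold with the per-level confidence bars from the autonomy table is sketched below. This is an assumption-heavy illustration, not the actual gateVerdict() signature: `gateVerdictSketch`, the `Verdict` shape, the outcome names, and the choice to escalate (rather than reject) on low confidence are all hypothetical.

```typescript
type Level = "L0" | "L1" | "L2" | "L3";
type Outcome = "allow" | "reject" | "escalate";

interface Verdict {
  approve: boolean;
  confidence: number; // 0..1, as passed to `orca overseer decide --confidence`
}

const MIN_CONFIDENCE = 0.6;

// Bars taken from the autonomy table; null means "always escalate".
const CONFIDENCE_BAR: Record<Level, number | null> = {
  L0: null,
  L1: 0.85,
  L2: MIN_CONFIDENCE,
  L3: MIN_CONFIDENCE,
};

function gateVerdictSketch(level: Level, verdict: Verdict, destructive: boolean): Outcome {
  // The local destructive heuristic is authoritative: the verdict cannot override it.
  if (destructive) return "escalate";
  const bar = CONFIDENCE_BAR[level];
  if (bar === null) return "escalate";
  // Assumption: a verdict below the bar goes to a human instead of being applied.
  if (verdict.confidence < bar) return "escalate";
  return verdict.approve ? "allow" : "reject";
}
```

Routing every decision through one function keeps the threshold in a single place, so the relay path and the agent path cannot drift apart.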

Agent path (parked overseer)

overseerExec is set (e.g. sonnet). On mission engage, one Overseer agent is parked per active mission and runs a long-poll loop:

  1. orca overseer poll — blocks until a decision is needed, returns {id, kind, context}.
  2. Judge the request.
  3. orca overseer decide --id <id> --approve --confidence 0.85 --rationale "..." — submit the verdict.
  4. Back to step 1.

The local destructive heuristic (computed at enqueue time) is always authoritative — the agent cannot override it. A timeout (120s) or mission disengage conservatively escalates all pending decisions. The heuristic covers: rm -rf, DROP TABLE, DELETE FROM, TRUNCATE, migrations, .env, secrets/credentials, force-push, git reset --hard, chmod 777, curl/wget pipes to shell, python/node/perl -e, netcat, bash -c, eval(), os.system, subprocess, exec().
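
A pattern-based heuristic of this kind might look like the sketch below. The regular expressions are illustrative and cover only part of the list above; the real patterns, their exact wording, and the name `looksDestructive` are assumptions.

```typescript
// Illustrative subset of the heuristic; not the daemon's actual pattern list.
const DESTRUCTIVE_PATTERNS: RegExp[] = [
  /\brm\s+-rf\b/,                               // recursive force delete
  /\b(DROP\s+TABLE|DELETE\s+FROM|TRUNCATE)\b/i, // destructive SQL
  /\bgit\s+reset\s+--hard\b/,                   // discards local changes
  /\bgit\s+push\b.*(--force|-f)\b/,             // force-push
  /\bchmod\s+777\b/,                            // world-writable permissions
  /\b(curl|wget)\b[^|]*\|\s*(ba|z)?sh\b/,       // download piped to a shell
  /\.env\b/,                                    // touches an env file
];

// Computed once at enqueue time; a true result always escalates.
function looksDestructive(promptText: string): boolean {
  return DESTRUCTIVE_PATTERNS.some((re) => re.test(promptText));
}

console.log(looksDestructive("rm -rf build"));               // true
console.log(looksDestructive("git push origin my-feature")); // false
```

A heuristic like this errs on the side of false positives, which is acceptable here because the cost of a false positive is only an extra human escalation.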


Deriver

The deriver monitors agent sessions in real time. It polls tmux every 5 seconds and detects agent state by examining the pane output.

| Signal | Meaning |
| --- | --- |
| working | Agent is progressing normally |
| needs_input | Agent is waiting for user input (prompt detected, escalated) |
| complete | Task is closed |

Prompt detection is implemented for all three supported agent programs (shellPatterns.ts): OpenCode's "Permission required" dialog, Claude Code's workspace-trust gate (auto-accepted) and "Do you want to proceed?" permission gate, and Codex's "Allow command?" / "Approve this command?" gate. Each detected prompt is hashed to avoid re-emitting the same signal on consecutive polls.
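
The hash-based de-duplication can be sketched as follows. This is a minimal illustration: `PromptDeduper` is a hypothetical class, FNV-1a is a stand-in for whatever hash the deriver actually uses, and clearing the stored hash once the prompt disappears is an assumption.

```typescript
// FNV-1a (32-bit): a small non-cryptographic hash, used here as a stand-in.
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

class PromptDeduper {
  // session name -> hash of the last prompt that was emitted
  private lastHash = new Map<string, number>();

  // Called once per poll; promptText is null when no prompt is on screen.
  shouldEmit(session: string, promptText: string | null): boolean {
    if (promptText === null) {
      // Assumption: once the prompt is gone, the same text may be emitted again later.
      this.lastHash.delete(session);
      return false;
    }
    const h = fnv1a(promptText);
    if (this.lastHash.get(session) === h) return false; // same prompt as last poll
    this.lastHash.set(session, h);
    return true;
  }
}
```

With a 5-second poll, a prompt that stays on screen for a minute would otherwise produce a dozen identical needs_input signals.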

No-overseer fallback


PR-native workflow (optional)

Off by default. When Settings → Autopilot → PR workflow is enabled, each mission runs isolated and ships a real GitHub pull request instead of leaving uncommitted changes in the main checkout. It complements the overseer review (which still gates phases) — the PR is the final human gate plus a feedback loop.

Lifecycle, orchestrated by MissionGit and tracked in the mission_pr table:

  1. Engage → a dedicated branch orca/<slug>-<epicId> and a sibling git worktree (<repo-parent>/.orca-worktrees/<slug>-<missionId>) are created; the mission's agents run there, not in the main checkout.
  2. Per phase → on the approving review verdict (or on close when review-on-done is off), the daemon commits that phase's worktree changes with the phase title. A rejected phase never commits.
  3. Epic done → the optional prVerifyCommand runs in the worktree. Non-zero holds the mission (stalled, escalation surfaced) and opens nothing. Green → push the branch and open the PR (auto), or wait for a manual POST /missions/:id/pr.
  4. Feedback loop → a ~60s poller reads each open PR's reviews, line-level diff comments, and conversation comments. Any fresh actionable feedback — changes requested, a line comment, a COMMENTED review with a body (bots and human comments both count), or a conversation comment — is routed through the pilot, which plans 1..N fix phases under the epic and re-engages the mission. A fix budget (2 rounds) bounds the bot↔autopilot ping-pong; once spent, the mission parks as stalled for a human. Dedup is by last_review_ts; a merged/closed PR stops the watch and clears the budget.
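
The dedup and fix-budget logic of the feedback loop (step 4) can be sketched as a pure function. All names here (`onPoll`, `PrWatch`, `Feedback`, the kind strings) are hypothetical, timestamps are plain numbers, and advancing last_review_ts even when the budget is spent is an assumption.

```typescript
type FeedbackKind =
  | "changes_requested"
  | "line_comment"
  | "commented_review"
  | "conversation_comment";

interface Feedback {
  kind: FeedbackKind;
  body: string;
  ts: number; // creation time of the review or comment
}

interface PrWatch {
  lastReviewTs: number; // dedup watermark
  fixRounds: number;    // fix rounds already spent
}

type PollAction = "none" | "plan_fix" | "stall";

const FIX_BUDGET = 2;

// A COMMENTED review only counts when it has a body.
function isActionable(f: Feedback): boolean {
  return f.kind !== "commented_review" || f.body.trim().length > 0;
}

function onPoll(watch: PrWatch, items: Feedback[]): { action: PollAction; watch: PrWatch } {
  const fresh = items.filter((f) => f.ts > watch.lastReviewTs && isActionable(f));
  if (fresh.length === 0) return { action: "none", watch };

  const lastReviewTs = Math.max(...fresh.map((f) => f.ts));
  if (watch.fixRounds >= FIX_BUDGET) {
    // Budget spent: park the mission as stalled for a human.
    return { action: "stall", watch: { ...watch, lastReviewTs } };
  }
  // Otherwise hand the feedback to the pilot and count the round.
  return { action: "plan_fix", watch: { lastReviewTs, fixRounds: watch.fixRounds + 1 } };
}
```

The budget is what stops a review bot and the autopilot from answering each other indefinitely.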

The worktree is torn down on pause/disengage (the branch is kept). GitHub-only via the gh CLI: a missing gh/token/remote degrades to a no-op + warning, leaving the rest of autopilot unaffected.


Pilot agent (AI planning)

When config.autopilot.pilotExec is set, POST /tasks/plan spawns a Pilot agent in the repository instead of using the relay planner. The Pilot reads relevant files, AGENTS.md, CLAUDE.md, and the README for conventions; decomposes the goal into 3–7 ordered phases; submits the plan via orca plan submit; and then stops. It must not implement anything or spawn agents.

The PlanJobStore tracks the async job. Autopilot mode is always async — both backends return 202 Accepted with a jobId the web UI polls via GET /plan/:jobId. Only manual phases mode is synchronous (201). Plan jobs are in-memory and ephemeral: a daemon restart drops in-flight jobs (surfaced as failed), and a finished job is pruned after a 10-minute TTL.
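
An in-memory job store with a 10-minute TTL could be sketched like this. `PlanJobStoreSketch`, its method names, and the prune-on-read strategy are assumptions made for illustration; the clock is passed in explicitly so the behaviour is easy to test.

```typescript
type JobState = "running" | "done" | "failed";

interface PlanJob {
  id: string;
  state: JobState;
  finishedAt: number | null; // ms timestamp, null while running
}

const JOB_TTL_MS = 10 * 60 * 1000; // finished jobs are pruned after 10 minutes

class PlanJobStoreSketch {
  // In-memory only: a daemon restart loses everything in this map.
  private jobs = new Map<string, PlanJob>();

  start(id: string): void {
    this.jobs.set(id, { id, state: "running", finishedAt: null });
  }

  finish(id: string, state: "done" | "failed", now: number): void {
    const job = this.jobs.get(id);
    if (job) {
      job.state = state;
      job.finishedAt = now;
    }
  }

  // Backs a GET /plan/:jobId style lookup; undefined maps to "not found".
  get(id: string, now: number): PlanJob | undefined {
    this.prune(now);
    return this.jobs.get(id);
  }

  private prune(now: number): void {
    this.jobs.forEach((job, id) => {
      if (job.finishedAt !== null && now - job.finishedAt > JOB_TTL_MS) {
        this.jobs.delete(id);
      }
    });
  }
}
```

Running jobs are never pruned in this sketch; only finished ones age out.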


Authentication & authorisation

The daemon supports optional token-based auth. When a UserStore is configured, all endpoints except GET /health, GET /setup, and POST /auth/login require a bearer token.

login (username + password) → receive token → pass as Authorization: Bearer <token>

Tokens are issued via POST /auth/login (scrypt password verification), stored in the auth_tokens table, revocable via POST /auth/logout, and passable as ?token=<value> for SSE EventSource.
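
The guard's first two steps, deciding whether a route is public and locating the token, can be sketched as below. The helper names are hypothetical, and the precedence of the Authorization header over the ?token= query parameter is an assumption.

```typescript
// The three endpoints the source lists as exempt from auth.
const PUBLIC_ROUTES = ["GET /health", "GET /setup", "POST /auth/login"];

function isPublicRoute(method: string, path: string): boolean {
  return PUBLIC_ROUTES.indexOf(method.toUpperCase() + " " + path) !== -1;
}

// Returns the bearer token from the header, or from ?token= (used by SSE
// EventSource, which cannot set headers). Returns null when neither is present.
function extractToken(authorization: string | undefined, queryString: string): string | null {
  if (authorization !== undefined) {
    const m = /^Bearer\s+(\S+)$/.exec(authorization);
    if (m) return m[1] ?? null;
  }
  for (const pair of queryString.replace(/^\?/, "").split("&")) {
    const [key, value] = pair.split("=");
    if (key === "token" && value) return decodeURIComponent(value);
  }
  return null;
}

console.log(extractToken("Bearer abc123", ""));      // "abc123"
console.log(extractToken(undefined, "?token=xyz"));  // "xyz"
```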

Token scopes

| Scope | Purpose | Restrictions |
| --- | --- | --- |
| full | Interactive user session (login via browser/CLI) | Bounded by the user's role and project assignments |
| agent | Spawned agent (worker, overseer, pilot), injected via ORCA_TOKEN | Verb + path allow-list; project scope confined to the agent's live working set |
| advisor | Per-user assistant session (orca-advisor-<userId>) | Mapped to full at the guard so it has the user's rights, but isolated from login tokens |

Agent-scoped tokens prevent a prompt-injected agent from creating users, performing admin operations, accessing projects it isn't actively working in, listing tokens, or spawning sessions. The agentAllowed() gate admits only the verbs the agent CLI actually drives (see CLI reference). Project ownership of the affected row is still enforced downstream by canAccessProject, so the agent cannot cross tenancy even within the allow-list.


Event bus & phone push

The EventBus decouples services. It serves SSE streams at GET /events, invalidates React Query caches in the web UI, and drives two background subscribers.

Push dispatch mapping

| Event | Trigger | Payload |
| --- | --- | --- |
| review (not approved) | Overseer rejected or timed out a phase review | Inline Approve / Re-run |
| signal (needs_input) | Agent waiting on a permission prompt | Inline Allow / Reject (or tap-to-open for multi-choice) |
| mission (stalled) | No running agents and a blocked child | Tap-to-open |
| mission (disengaged) | Natural mission completion | FYI (mentions PR if one was opened) |
| task (blocked) | An agent died too many times (relaunch budget exceeded) | Tap-to-open |

A VAPID keypair is generated on first boot and the private key never leaves the daemon. Subscriptions are opt-in per device from the Account page.


Stuck detector & post-done review

The stuck detector (src/overseer/stuckDetector.ts) runs every 60s with a 120s grace period. If an agent exits or crashes without running orca close, the detector increments the stuck:<n> label; past maxRelaunch (2) it sets the task blocked, otherwise it reverts it to open so the mission re-spawns it. A one-shot zombie reconcile runs on startup — same logic, no grace, no counter — to revert orphaned in-progress tasks.
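
The relaunch accounting can be sketched as a pure function over a task's labels. `nextAfterCrash` is a hypothetical helper, and the boundary is an assumption: "past maxRelaunch" is read here as the counter exceeding 2, so the third crash blocks the task.

```typescript
const MAX_RELAUNCH = 2;

interface CrashOutcome {
  labels: string[];
  status: "open" | "blocked";
}

// Bumps the stuck:<n> label and decides whether the task is retried or blocked.
function nextAfterCrash(labels: string[]): CrashOutcome {
  let count = 0;
  const rest: string[] = [];
  for (const label of labels) {
    const m = /^stuck:(\d+)$/.exec(label);
    if (m) {
      count = Math.max(count, Number(m[1]));
    } else {
      rest.push(label); // keep routing and naming labels untouched
    }
  }
  const next = count + 1;
  return {
    labels: [...rest, `stuck:${next}`],
    status: next > MAX_RELAUNCH ? "blocked" : "open",
  };
}

console.log(nextAfterCrash(["backend", "stuck:1"]));
// { labels: ["backend", "stuck:2"], status: "open" }
```

Storing the counter as a label keeps the retry history visible on the task itself rather than in detector-local state.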

Post-done review is an optional hard sequential gate: when config.autopilot.reviewOnDone is true and an agent overseer is configured, closing a mission phase sets all open direct dependents to blocked synchronously, enqueues a review-kind decision, and on the approving, non-destructive verdict releases them back to open and fires engine.tick() immediately so the next phase spawns without waiting for the 90s interval. Default off; relay fallback cannot drive it.


Assistant (per-user advisor)

Each user gets a persistent assistant agent (orca-advisor-<userId>) that drives Orca on their behalf through a built-in MCP server. It auto-starts on login (when a saved advisor_exec and advisor_autostart: true are set), remembers its model, and runs in a docked IDE-style side panel with a real-PTY terminal — pop any session out into its own chromeless window.

The advisor acts through POST /mcp, handled statelessly with a fresh McpServer bound to the caller's bearer token — so every connection acts with exactly its user's rights. The toolset: orca_request (generic escape hatch), orca_tasks, orca_create_task, orca_plan, orca_sessions. Every tool delegates to the shared callOrcaApi core — the same forward path as the orca api CLI verb, so a new REST endpoint works in both with zero edits.

Continue to the CLI reference for the exact commands, or to Architecture for the module-level map.

© 2026 ORCA · MIT Licensed · View source on GitHub