Oh My Opencode Specialised Agents Deep Dive and Model Guide
Meet Sisyphus and its specialist agent crew.
The biggest capability jump in OpenCode comes from specialised agents: deliberate separation of orchestration, planning, execution, and research.
Oh My Opencode packages that idea into a batteries-included harness where Sisyphus coordinates a full “virtual team” of agents with different permissions, prompts, and model preferences.

This is the deep dive into agents and model routing. If you are earlier in the journey:
- OpenCode quickstart — install and configure the base agent
- Oh My Opencode quickstart — install the plugin, run your first ultrawork task
- Oh My Opencode experience — real-world results and community benchmarks
For the wider AI coding toolchain context, see the AI developer tools overview.
What Is Oh My Opencode and How Does It Extend OpenCode
OpenCode is an open-source AI coding agent built for the terminal. It ships with a TUI, and the CLI starts that TUI by default when you run opencode with no arguments. It is provider-flexible: it supports a large provider catalog including local models, exposes provider configuration through its config file and /connect flow, and handles everything from cloud APIs to Ollama endpoints without patching.
Oh My Opencode (also known as oh-my-openagent, or just “omo”) is a community plugin that transforms OpenCode into a full multi-agent engineering system. It adds:
- the Sisyphus orchestration system with parallel background execution
- 11 specialised agents with distinct roles, prompts tuned per model family, and explicit tool permissions
- LSP + AST-Grep for IDE-quality refactoring inside agents
- Hashline — a hash-anchored edit tool that eliminates stale-line errors (see below)
- Built-in MCPs: Exa (web search), Context7 (official docs), Grep.app (GitHub search), all on by default
- `/init-deep` — auto-generates hierarchical `AGENTS.md` files throughout your project for lean context injection
One naming quirk: the upstream repository is now branded as oh-my-openagent, but the plugin package and install commands still use oh-my-opencode. The maintainer suggests calling it “oh-mo” or just “Sisyphus.”
Why Oh My Opencode Assigns Different Models to Different Agents
Oh My Opencode is built around one foundational idea: different models think differently, and each agent’s prompt is written for one mental model. Claude follows mechanics-driven prompts — detailed checklists, templates, step-by-step procedures. More rules mean more compliance. GPT (especially 5.2+) follows principle-driven prompts — concise principles, XML structure, explicit decision criteria. Give GPT a 1,100-line Claude prompt and it contradicts itself. Give Claude a 121-line GPT prompt and it drifts.
This is not a quirk you configure around. It is the system’s design.
The practical consequence: when you change an agent’s model, you change which prompt fires. Agents that support multiple model families (Prometheus, Atlas) auto-detect your model at runtime via isGptModel() and switch prompts automatically. Agents that don’t (Sisyphus, Hephaestus) have prompts written for one family only — and swapping them to the wrong family degrades the output significantly.
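A minimal sketch of what that runtime switching amounts to — the real `isGptModel()` lives inside the plugin, and the model-ID patterns below are illustrative assumptions, not its actual logic:

```typescript
// Illustrative sketch of model-family detection (not the plugin's actual code).
// Assumption: model IDs look like "openai/gpt-5.4" or "anthropic/claude-opus-4-6".
function isGptModel(modelId: string): boolean {
  const name = modelId.split("/").pop() ?? "";
  return name.startsWith("gpt-") || name.includes("codex");
}

// A dual-prompt agent (Prometheus, Atlas) would pick its prompt variant at runtime:
function selectPrompt(modelId: string, prompts: { claude: string; gpt: string }): string {
  return isGptModel(modelId) ? prompts.gpt : prompts.claude;
}
```

Single-family agents skip this branch entirely, which is why pointing them at the wrong family silently degrades output instead of failing loudly.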
How Oh My Opencode Specialised Agents Collaborate
The four agent personality groups
Agents fall into four groups based on which model family they are optimised for. This matters for both understanding the system and for self-hosting decisions.
Group 1 — Communicators (Claude / Kimi / GLM): Sisyphus and Metis. Long, mechanics-driven prompts (~1,100 lines for Sisyphus). Need models that reliably follow complex multi-layered instructions across dozens of tool calls. Claude Opus is the reference. Kimi K2.5 and GLM-5 are strong, cost-effective alternatives that behave similarly. Do not override these to older GPT models.
Group 2 — Dual-Prompt (Claude preferred, GPT supported): Prometheus and Atlas. Auto-detect your model family at runtime and switch to the appropriate prompt. Claude gets the full mechanics-driven version. GPT gets a compact, principle-driven version that achieves the same outcome in ~121 lines. Safe to use either; the system handles the switching.
Group 3 — GPT-Native (GPT-5.3-codex / GPT-5.4): Hephaestus, Oracle, Momus. Principle-driven, autonomous execution style. Their prompts assume goal-oriented, independent reasoning — which is what GPT is built for. Hephaestus has no fallback and requires GPT access. Do not override these to Claude; the behaviour degrades.
Group 4 — Utility Runners (speed over intelligence): Explore, Librarian, Multimodal Looker. They handle grep, search, and retrieval, and intentionally use the fastest, cheapest models available. “Upgrading” Explore to Opus is like hiring a senior engineer to file paperwork. These are also the best candidates for local model replacement.
Delegation mechanisms
Oh My Opencode uses two complementary tools for delegation:
- `task()` — category-based delegation: choose a category like `visual-engineering` or `deep`, optionally inject skills, and optionally run in the background
- `call_omo_agent()` — direct invocation of a specific agent by name, bypassing category routing
Both support parallel background execution, with concurrency enforced per provider and per model.
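The difference between the two call shapes can be sketched as follows — the parameter names and return values here are illustrative assumptions, not the tools' verified signatures:

```typescript
// Sketch of the two delegation shapes (fields are assumptions, not plugin source).
type CategoryName = "visual-engineering" | "deep" | "quick";

// task(): category-based — the category, not the caller, decides the model.
function task(args: { category: CategoryName; prompt: string; skills?: string[]; background?: boolean }) {
  return { routedBy: "category", target: args.category, background: args.background ?? false };
}

// call_omo_agent(): direct — names the agent explicitly, bypassing category routing.
function call_omo_agent(args: { agent: string; prompt: string }) {
  return { routedBy: "agent", target: args.agent, background: false };
}
```

The practical rule of thumb: reach for `task()` when you care about the kind of work, and `call_omo_agent()` when you care about the specific specialist.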
Categories are model routing presets
When Sisyphus delegates to a subagent, it picks a category, not a model name. The category maps to the right model automatically.
| Category | What it is for | Default model |
|---|---|---|
| `visual-engineering` | Frontend, UI/UX, CSS, design | Gemini 3.1 Pro (high) |
| `artistry` | Creative, novel approaches | Gemini 3.1 Pro → Claude Opus → GPT-5.4 |
| `ultrabrain` | Hard logic, architecture decisions | GPT-5.4 (xhigh) → Gemini 3.1 Pro → Claude Opus |
| `deep` | Deep coding, complex multi-file logic | GPT-5.3 Codex → Claude Opus → Gemini 3.1 Pro |
| `unspecified-high` | General complex work | Claude Opus → GPT-5.4 (high) → GLM-5 |
| `unspecified-low` | General standard work | Claude Sonnet → GPT-5.3 Codex → Gemini Flash |
| `quick` | Single-file changes, simple tasks | Claude Haiku → Gemini Flash → GPT-5-Nano |
| `writing` | Text, documentation, prose | Gemini Flash → Claude Sonnet |
Categories are the right abstraction for self-hosting too: map a category to a local model and every task routed to that category automatically uses it.
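A minimal override might look like this — assuming an Ollama provider named `ollama` is already configured in OpenCode and the model ID matches what you have pulled:

```jsonc
// .opencode/oh-my-opencode.jsonc — illustrative category override
{
  "categories": {
    "quick": { "model": "ollama/qwen2.5-coder:7b" }
  }
}
```

Every task subsequently routed to `quick` uses the local model, with no per-agent changes needed.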
Model resolution order
Agent Request → User Override (if configured) → Fallback Chain → System Default
Provider priority when the same model is available through multiple providers:
Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > Venice > OpenCode Go > Z.ai Coding Plan
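The resolution order above can be sketched as a single function — a simplified illustration of the described behaviour, not the plugin's actual resolver:

```typescript
// Sketch of model resolution: user override → fallback chain → system default.
// (Illustrative only; the real resolver also applies provider priority.)
function resolveModel(
  agent: string,
  overrides: Record<string, string>, // user config, highest priority
  fallbackChain: string[],           // per-agent defaults, in order
  available: Set<string>,            // models the user can actually reach
  systemDefault: string
): string {
  const override = overrides[agent];
  if (override && available.has(override)) return override;
  for (const model of fallbackChain) {
    if (available.has(model)) return model;
  }
  return systemDefault;
}
```

Note how an unreachable override falls through to the chain rather than failing, which matches the "Fallback Chain" stage in the diagram above.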
Oh My Opencode Agents: Full Catalogue with Roles and Model Requirements
Orchestrators
Sisyphus
Purpose: Main orchestrator. Plans, delegates, and drives tasks to completion through aggressive parallel execution.
Group: Communicator (Claude / Kimi / GLM)
Role: The team lead who coordinates across the whole codebase — its ~1,100-line mechanics-driven prompt needs a model that can follow every step across dozens of tool calls without losing track.
⚠️ Never override Sisyphus to older GPT models. GPT-5.4 has a dedicated prompt path but is not the recommended default. Claude Opus is the reference.
Fallback chain: anthropic/claude-opus-4-6 (max) → opencode-go/kimi-k2.5 → k2p5 → gpt-5.4 → glm-5 → big-pickle
Self-hosted: Sisyphus is the hardest agent to run locally. Its prompt complexity makes it dependent on models with strong instruction-following over long tool-call sequences. A local Qwen3-coder or DeepSeek-Coder-V3 may work for simple tasks, but expect degradation on workflows that require multi-agent coordination. If you self-host, test with a single-agent task before enabling parallel execution.
Atlas
Purpose: “Todo-list orchestrator.” Keeps a structured plan moving by enforcing completion and sequencing.
Group: Dual-prompt (Claude preferred, GPT supported)
Role: While Sisyphus handles the big picture, Atlas drives the checklist. Auto-detects your model family at runtime and switches prompts accordingly.
Fallback chain: anthropic/claude-sonnet-4-6 → opencode-go/kimi-k2.5
Self-hosted: A fast, reliable local coder model handles Atlas-style “drive the checklist” work reasonably well because the tasks are more structured than Sisyphus’s orchestration. Qwen3-coder at 32k+ context is a workable starting point.
Planning agents
The planning layer enforces “think before act”: requirements gathering, gap detection, and plan critique all happen before any execution agent sees the task.
Prometheus
Purpose: Strategic planner with an interview-style workflow. Activates when you press Tab or run /start-work.
Group: Dual-prompt (Claude preferred, GPT supported)
Role: Interviews you like a real engineer — identifies scope, surfaces ambiguities, and produces a verified plan before a single line of code is touched. The GPT version achieves the same in ~121 lines; the Claude version uses ~1,100 lines across 7 files.
Collaborates with: Metis (gap detection) and Momus (plan validation) before handing off to execution.
Fallback chain: anthropic/claude-opus-4-6 (max) → openai/gpt-5.4 (high) → opencode-go/glm-5 → google/gemini-3.1-pro
Self-hosted: Workable with a strong instruction-following local model at low temperature. Planning quality degrades when the model cannot hold your constraints and acceptance criteria in-context across a long multi-turn interview. Minimum 64k context window recommended.
Metis
Purpose: Pre-planning consultant and gap analyser. Runs at a higher temperature than most agents to encourage creative gap detection.
Group: Communicator (Claude preferred)
Role: “What did we miss?” reviewer before execution — not a code-writing worker, but part of the plan quality control story.
Collaborates with: Invoked by Prometheus before the plan is finalised.
Fallback chain: anthropic/claude-opus-4-6 (max) → opencode-go/glm-5 → k2p5
Self-hosted: A local reasoning-capable model is fine. Keep temperature non-zero if you want Metis to actually surface edge cases — set it to 0 and it becomes a rubber-stamp.
Momus
Purpose: Ruthless plan reviewer. Enforces clarity and verification standards. Can operate as a strict “OK or reject” gate.
Group: GPT-native
Role: QA-minded critic for plans. Tool restrictions keep it in review mode rather than execution mode.
Collaborates with: Used after plan creation to challenge feasibility before work begins.
Fallback chain: openai/gpt-5.4 (medium) → anthropic/claude-opus-4-6 (max) → google/gemini-3.1-pro (high)
Self-hosted: If you self-host, keep sampling very low. The entire point of Momus is stable, reproducible critique — creativity is the last thing you want here. A strong local reasoning model at temperature 0.1 or lower is the right configuration.
Worker agents
Hephaestus
Purpose: Autonomous deep worker. Give it a goal, not a recipe.
Group: GPT-native — GPT-5.3 Codex only
Role: The specialist who stays in their room coding all day. Explores the codebase, researches patterns, and executes end-to-end without constant supervision. The maintainer calls it “the Legitimate Craftsman” (a deliberate reference to Anthropic’s decision to block OpenCode).
⚠️ No fallback chain — requires GPT access. There is no Claude prompt for this agent. Running it without OpenAI or GitHub Copilot means it cannot execute. “GPT-5.3-codex-spark” exists but is explicitly not recommended — it compacts context so aggressively that Oh My Opencode’s context management breaks.
Fallback chain: openai/gpt-5.3-codex (medium) — no fallback
Self-hosted: There is no viable local replacement for Hephaestus today. Its prompt is built around GPT-Codex’s principle-driven, autonomous exploration style. If you need a deep worker on a fully local stack, use Sisyphus-Junior with the deep category instead (which routes to GPT-5.3 Codex, or falls back to Claude Opus if that is what you have).
Sisyphus-Junior
Purpose: Category-spawned executor used by the delegation system.
Group: Inherits from whichever category launched it
Role: The “specialist contractor” that inherits its model from category config. Created dynamically via task(), often with skills injected, and can be run in the background for parallelism. Think of it as a blank slate worker whose capability is determined entirely by which category you assign.
Fallback chain: anthropic/claude-sonnet-4-6 (default); inherits from the launching category in practice
Self-hosted: Sisyphus-Junior is the most practical place to start self-hosting. Map each category to a local model in oh-my-opencode.jsonc and every category-spawned task automatically uses it. Start with quick (simple tasks), verify it works, then expand to unspecified-low before touching anything that routes to deep or ultrabrain.
Specialist subagents
Oracle
Purpose: Read-only consultation for architecture decisions and complex debugging.
Group: GPT-native
Role: Senior architect and “last resort” debugger. Intentionally restricted from writing and delegating tools so its output stays advisory. Call Oracle after major work, after repeated failures, or before making a high-stakes architectural decision.
Fallback chain: openai/gpt-5.4 (high) → google/gemini-3.1-pro (high) → anthropic/claude-opus-4-6 (max)
Self-hosted: If you self-host Oracle, pick your strongest local reasoning model and keep sampling very low. The output quality difference between a capable local reasoner and GPT-5.4 is significant for complex architecture questions. In a hybrid setup, Oracle is one of the agents worth keeping on a cloud model while moving utility work local.
Librarian
Purpose: External docs and open-source research.
Group: Utility runner
Role: Documentation and evidence collector. Tool restrictions prevent editing, so it stays focused on sourcing and summarising. Designed to run in parallel with Explore for combined “inside the repo + outside the repo” evidence gathering.
Fallback chain: opencode-go/minimax-m2.5 → minimax-m2.5-free → claude-haiku-4-5 → gpt-5-nano
Self-hosted: The best agent to move fully local on day one. Librarian’s job is retrieval and summarisation, not deep reasoning. Any local model with reliable tool calling handles it well. Even a 7B or 13B model is sufficient if it can follow the “search, collect, report” pattern without drifting.
Explore
Purpose: Contextual grep and fast codebase search.
Group: Utility runner
Role: The “find me the relevant files and patterns” agent. Fire 10 of these in parallel for non-trivial questions, each scoped to a different area of the codebase, then let the orchestrator synthesise the results.
Fallback chain: grok-code-fast-1 → opencode-go/minimax-m2.5 → minimax-m2.5-free → claude-haiku-4-5 → gpt-5-nano
Self-hosted: Along with Librarian, Explore is the best starting point for local inference. Its job is pattern matching and structured reporting — the model does not need deep reasoning, just fast, reliable tool calling and decent instruction following. A small local coder model (Qwen2.5-Coder-7B or similar) at high throughput works well.
Multimodal Looker
Purpose: Vision analyst and “diagram reader.” Analyses images and PDFs via a look_at workflow.
Group: Utility runner (vision required)
Role: Heavily tool-restricted (read-only) to prevent side effects and keep it purely interpretive. Used when you need to feed UI screenshots, architecture diagrams, or PDF pages into the workflow.
Kimi K2.5 is specifically called out as excelling at multimodal understanding — that is why it sits high in this fallback chain.
Fallback chain: openai/gpt-5.4 → opencode-go/kimi-k2.5 → zai-coding-plan/glm-4.6v → gpt-5-nano
Self-hosted: Local vision requires a multimodal model with solid tool calling and enough context. If your local stack is not there yet, keep Multimodal Looker on a cloud model — a misconfigured local vision pipeline produces silent garbage, not useful errors.
Oh My Opencode Model Routing: Fallback Chains and Provider Priority
Per-agent defaults and the “no single global model” design
Oh My Opencode ships with per-agent model defaults and fallback chains, not a single global model. The design is deliberately opinionated:
- Explore and Librarian use the cheapest, fastest models because they do not need deep reasoning
- Oracle and Momus use the highest-capability models because their outputs gate execution
- Sisyphus and Prometheus get the best orchestration-class models by default
The OpenCode Go tier ($10/month)
OpenCode Go is a subscription tier that provides reliable access to Chinese frontier models through OpenCode’s infrastructure. It appears in the middle of many fallback chains as a bridge between premium native providers and free-tier alternatives.
| Model via OpenCode Go | Used by |
|---|---|
| `opencode-go/kimi-k2.5` | Sisyphus, Atlas, Sisyphus-Junior, Multimodal Looker |
| `opencode-go/glm-5` | Oracle, Prometheus, Metis, Momus |
| `opencode-go/minimax-m2.5` | Librarian, Explore |
If you do not have Anthropic or OpenAI subscriptions, OpenCode Go plus GitHub Copilot covers most of the fallback chain at low cost.
Provider mappings for GitHub Copilot
When GitHub Copilot is the best available provider, agent assignments are:
| Agent | Model |
|---|---|
| Sisyphus | github-copilot/claude-opus-4-6 |
| Oracle | github-copilot/gpt-5.4 |
| Explore | github-copilot/grok-code-fast-1 |
| Librarian | github-copilot/gemini-3-flash |
Prompt variants track model families
If you switch an agent from Claude to GPT or Gemini, Oh My Opencode does not use the same prompt. Agents that support multiple families (Prometheus, Atlas) auto-detect via isGptModel() and switch. Agents that do not support multiple families (Sisyphus, Hephaestus) have one prompt — switch them to the wrong family and the output degrades.
If your agent outputs feel off after a model change, check whether you crossed a model family boundary and revert.
Running Oh My Opencode with Self-Hosted and Local Models
There are two layers to configure:
- OpenCode must know about your local provider and model IDs
- Oh My Opencode must be told which agent uses which model (because most agents ignore your UI-selected model by design)
What you can realistically run locally today
| Agent | Local viability | Recommended approach |
|---|---|---|
| Explore | ✅ Excellent | Any fast local coder model (Qwen2.5-Coder-7B+) |
| Librarian | ✅ Excellent | Any fast local model with reliable tool calling |
| Sisyphus-Junior (`quick` category) | ✅ Good | Small coder model for quick tasks |
| Atlas | ⚠️ Workable | Mid-size model (13B+), 32k+ context |
| Prometheus | ⚠️ Workable | Strong instruction-follower, 64k+ context, low temperature |
| Metis | ⚠️ Workable | Reasoning-capable, keep temperature non-zero |
| Momus | ⚠️ Workable | Reasoning-capable, very low temperature |
| Sisyphus | ⚠️ Partial | Only for simple single-agent tasks; multi-agent orchestration needs Claude-class models |
| Oracle | ❌ Not recommended | Keep on cloud; quality gap is significant for complex queries |
| Hephaestus | ❌ No local path | Requires GPT-5.3-codex; no Claude or local equivalent |
Step 1 — Add a local provider to OpenCode
OpenCode supports local models and custom baseURL values in provider config — Ollama, vLLM, and any OpenAI-compatible endpoint are first-class options. The OpenCode quickstart covers provider authentication in detail.
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen2.5-coder:7b": { "name": "qwen2.5-coder:7b" },
"qwen2.5-coder:32b": { "name": "qwen2.5-coder:32b" }
}
}
}
}
For vLLM or LM Studio, the same pattern applies — just point baseURL to your server’s /v1 endpoint and list the models you have loaded.
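A vLLM variant of the same provider block might look like this — the port and model ID are assumptions; use whatever your server actually reports at its `/v1/models` endpoint:

```jsonc
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "vllm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "vLLM",
      "options": {
        // vLLM's default OpenAI-compatible port; adjust to your deployment
        "baseURL": "http://localhost:8000/v1"
      },
      "models": {
        "qwen2.5-coder-32b-instruct": { "name": "qwen2.5-coder-32b-instruct" }
      }
    }
  }
}
```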
OpenCode requires at least a 64k context window for orchestration agents. Anything smaller and you will see truncation errors mid-workflow.
Step 2 — Override agent models in Oh My Opencode config
Config locations (project takes precedence over user-level):
- `.opencode/oh-my-opencode.jsonc` (project-level, highest priority)
- `~/.config/opencode/oh-my-opencode.jsonc` (user-level)
A practical hybrid config — local inference for utility agents, cloud for reasoning:
{
"$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json",
"agents": {
// Utility agents: fast local model is more than enough
"explore": { "model": "ollama/qwen2.5-coder:7b", "temperature": 0.1 },
"librarian": { "model": "ollama/qwen2.5-coder:7b", "temperature": 0.1 },
// Sisyphus-Junior in quick mode: local is fine
// (controlled via categories below)
// Keep the reasoning agents on cloud
"oracle": { "model": "openai/gpt-5.4", "variant": "high" },
"momus": { "model": "openai/gpt-5.4", "variant": "xhigh" },
// Hephaestus: do not touch — it needs GPT-5.3-codex, no fallback
},
"categories": {
// Route simple spawned tasks to local model
"quick": { "model": "ollama/qwen2.5-coder:7b" },
"writing": { "model": "ollama/qwen2.5-coder:7b" },
// Keep heavy reasoning on cloud
"deep": { "model": "openai/gpt-5.3-codex", "variant": "medium" },
"ultrabrain": { "model": "openai/gpt-5.4", "variant": "xhigh" }
},
"background_task": {
"defaultConcurrency": 2,
"providerConcurrency": {
"ollama": 4, // local endpoint can handle more parallelism
"openai": 2, // stay inside plan limits
"anthropic": 2
},
"modelConcurrency": {
"ollama/qwen2.5-coder:7b": 4
}
}
}
The cost-conscious alternative to full self-hosting
Before committing to a local GPU setup, consider the OpenCode Go + Kimi for Coding stack. At around $11/month total, it covers:
- Kimi K2.5 for Sisyphus and Atlas (Claude-class orchestration quality at low cost)
- GLM-5 for Prometheus, Metis, and Momus (solid reasoning, free tier available)
- MiniMax M2.5 for Librarian and Explore (fast retrieval)
For most workloads this is cheaper than running a local inference server and does not require GPU hardware.
Oh My Opencode Built-in Tools: Hashline, Init-Deep, Ralph Loop, and MCPs
Hashline — hash-anchored edit tool
One of the most practical improvements in Oh My Opencode is how it handles code edits. Every line the agent reads comes back tagged with a content hash:
1#VK| function hello() {
2#XJ|   return "world";
3#MB| }
When the agent edits by referencing those tags, if the file changed since the last read the hash will not match and the edit is rejected before corruption. This eliminates the entire class of “stale line” errors where agents confidently edit lines that no longer exist. Grok Code Fast’s success rate on edit tasks went from 6.7% to 68.3% just from this change.
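The mechanism can be sketched in a few lines — the hash function and tag length here are illustrative (the real tool uses shorter two-character tags, as in the listing above), not the plugin's actual implementation:

```typescript
import { createHash } from "node:crypto";

// Illustrative hash-anchored edit check (not the plugin's actual code).
// Tag a line with a short digest of its content at read time:
function tagLine(line: string): string {
  return createHash("sha256").update(line).digest("hex").slice(0, 8).toUpperCase();
}

// An edit references the tag it saw when it read the file; if the line
// changed underneath, the tag no longer matches and the edit is rejected
// before any corruption occurs.
function applyEdit(lines: string[], lineNo: number, expectedTag: string, replacement: string): boolean {
  if (tagLine(lines[lineNo]) !== expectedTag) return false; // stale read — reject
  lines[lineNo] = replacement;
  return true;
}
```

The key property: staleness is detected from content, not line position, so concurrent edits and shifted line numbers both fail safely instead of corrupting the file.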
/init-deep — hierarchical context injection
Run /init-deep and Oh My Opencode generates AGENTS.md files at every relevant level of your project tree:
project/
├── AGENTS.md ← project-wide context
├── src/
│ ├── AGENTS.md ← src-specific context
│ └── components/
│ └── AGENTS.md ← component-specific context
Agents auto-read relevant context at their scope. Instead of loading the entire repo into context at the start of every run, each agent only pulls in what is relevant to where it is working.
Prometheus planning mode — /start-work
For complex tasks, do not just type a prompt and hope. Press Tab to enter Prometheus mode or use /start-work. Prometheus interviews you like a real engineer: identifies scope, surfaces ambiguities, builds a verified plan before any execution agent runs. The “Decision Complete” standard means the plan leaves zero decisions to the implementer.
Ralph Loop — /ulw-loop
A self-referential execution loop that does not stop until the task is 100% complete. Use this for large, multi-step tasks where you want the system to self-verify and continue without your involvement. It is aggressive — make sure your concurrency limits are set before running it on an expensive cloud provider.
Built-in MCPs
Three MCP servers are pre-configured and always on:
- Exa — web search
- Context7 — official documentation lookup
- Grep.app — GitHub code search across public repositories
You do not need to configure these. They are available to all agents by default.
For hands-on results and community benchmarks on how these agents perform in practice, see the Oh My Opencode experience article. To install the plugin from scratch, start with the Oh My Opencode quickstart.