Multi-model AI collaboration for Claude Code. Get consensus-driven code reviews and second opinions from multiple AI models in parallel.
| Command | Purpose |
|---|---|
| `/review` | Multi-model code review on your staged changes |
| `/consult` | Get a second opinion when stuck on a problem |

```
     /review                               /consult "why is this broken?"
        │                                       │
        ├── Codex ───► reviews diff             ├── Codex ───► reads conversation
        ├── Gemini ──► reviews diff             ├── Gemini ──► reads conversation
        ├── Claude ──► reviews diff             ├── Claude ──► reads conversation
        └── ...                                 └── ...
        │                                       │
        ▼                                       ▼
  Consensus: issues flagged               Synthesis: consensus suggestions,
  by 2+ models highlighted                unique perspectives, disagreements
```
Why multiple models?
- Different training data → different blind spots
- 2+ models flagging the same issue → stronger signal
- Diverse perspectives surface better solutions
Inspired by LLM Council — the idea that multiple LLMs reviewing the same problem surfaces stronger signals than any single model.
```sh
git clone https://github.com/caiopizzol/conclave ~/dev/conclave
cd ~/dev/conclave
bun run register
```

To unregister:

```sh
bun run unregister
```

Example configuration:

```json
{
"persistence": {
"enabled": true,
"required": false,
"data_dir": "~/.local/share/conclave/reviews"
},
"tools": {
"codex": {
"enabled": true,
"scope": ["review", "consult"],
"command": "codex exec --full-auto -",
"model": "gpt-5.2-codex",
"description": "OpenAI Codex CLI"
},
"claude-opus": {
"enabled": true,
"scope": ["consult"],
"command": "claude --print",
"model": "opus",
"description": "Claude Code (Opus)"
},
"gemini": {
"enabled": true,
"scope": ["review", "consult"],
"command": "gemini -o text",
"model": "gemini-3-pro-preview",
"description": "Google Gemini CLI"
}
},
"prompts": {
"review": "~/.config/conclave/prompt.md",
"consult": "~/.config/conclave/consult-prompt.md"
}
}
```

| Field | Required | Description |
|---|---|---|
| `enabled` | Yes | Whether to use this tool |
| `scope` | No | Array of commands: `["review"]`, `["consult"]`, or `["review", "consult"]`. If omitted, the tool is used for all commands |
| `command` | Yes | CLI command to run |
| `model` | No | Model to use (injected via `--model` or `-m` flag) |
| `description` | No | Human-readable description |

You can define multiple entries for the same provider with different models (e.g., `claude-opus` and `claude-sonnet`).
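
For instance, a config with one Claude entry scoped to consultations and another scoped to reviews might look like this (the scope split and descriptions are illustrative):

```json
{
  "tools": {
    "claude-opus": {
      "enabled": true,
      "scope": ["consult"],
      "command": "claude --print",
      "model": "opus",
      "description": "Claude Code (Opus) for second opinions"
    },
    "claude-sonnet": {
      "enabled": true,
      "scope": ["review"],
      "command": "claude --print",
      "model": "sonnet",
      "description": "Claude Code (Sonnet) for diff reviews"
    }
  }
}
```
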
| Field | Default | Description |
|---|---|---|
| `enabled` | `true` | Save review results to disk for later analysis |
| `required` | `false` | If `true`, halt on persistence failure instead of continuing |
| `data_dir` | `~/.local/share/conclave/reviews` | Directory for saved review data |

Review results are saved as JSON files containing raw model outputs, timestamps, and investigation results. This enables:
- Tracking review history across PRs
- Analyzing model performance over time
- Recovering from context compaction
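
The on-disk schema isn't documented here, but conceptually a saved review contains something like the following (all field names below are hypothetical placeholders, not the actual format):

```json
{
  "timestamp": "2025-06-01T12:34:56Z",
  "branch": "feature/table-paste",
  "target_branch": "main",
  "outputs": {
    "codex": "raw review text from Codex...",
    "gemini": "raw review text from Gemini..."
  },
  "investigation": {
    "confirmed": ["issue flagged by 2+ models"],
    "dismissed": []
  }
}
```
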
Supported models:

| Tool | Models | Documentation |
|---|---|---|
| Codex | `gpt-5.2-codex`, `gpt-5.1-codex-mini`, `gpt-5.1-codex-max`, `gpt-5.2` | Codex Models |
| Claude | `opus`, `sonnet`, `haiku` (aliases) or full names like `claude-opus-4-5-20251101` | CLI Reference |
| Gemini | `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-3-pro-preview`, `gemini-3-flash-preview` | Gemini CLI |
| Qwen | `coder-model` (default), `vision-model` | Qwen Code Docs |
| Mistral | Config-based (`~/.vibe/config.toml`) | Mistral Vibe Docs |
| Grok | `grok-code-fast-1`, `grok-4-1-fast-*`, `grok-4-fast-*`, `grok-3`, `grok-3-mini` | xAI API Models |
| Ollama | `qwen3-coder:480b-cloud`, `devstral-2:123b-cloud`, or any model from the library | Ollama Library |

Note: Ollama cloud models use the `:cloud` suffix and require the `OLLAMA_API_KEY` environment variable. Get your API key at ollama.com. You can also run local models (e.g., `qwen2.5-coder:7b`), but they are slow and require significant memory (~8GB+ RAM for 7B models).

Note: Mistral and Grok use command-line argument passing (not stdin), which has a ~200KB limit on macOS. Very large diffs may cause these tools to fail while other tools succeed.

Note: Grok uses the community CLI (`@vibe-kit/grok-cli`) until xAI releases the official "Grok Build" CLI.
Customize prompts for each command:
| File | Template Variables |
|---|---|
| `~/.config/conclave/prompt.md` | `{{branch}}`, `{{target_branch}}`, `{{diff}}` |
| `~/.config/conclave/consult-prompt.md` | `{{history_file}}`, `{{question}}`, `{{cwd}}` |

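For instance, a minimal custom review prompt using these variables might look like the following (the wording is only a sketch, not the default prompt shipped with Conclave):

```markdown
You are reviewing changes on branch {{branch}} targeting {{target_branch}}.
Focus on correctness, security issues, and breaking changes. Be concise.

Diff to review:

{{diff}}
```
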
| Tool | Install |
|---|---|
| Codex | `npm install -g @openai/codex` |
| Claude | Built-in |
| Gemini | `npm install -g @google/gemini-cli` |
| Qwen | `npm install -g @qwen-code/qwen-code` |
| Mistral | `pipx install mistral-vibe` |
| Grok | `bun add -g @vibe-kit/grok-cli`; `export GROK_API_KEY="key"` in `~/.zshrc` |
| Ollama | ollama.com/download; cloud: `export OLLAMA_API_KEY="key"` in `~/.zshrc`; local: `ollama pull <model>` |

```sh
# Review staged changes
git add -p
/review

# When stuck on a problem
/consult "why is the table rendering broken after paste?"

# When going in circles
/consult "I've tried X, Y, Z but none work"

# Validate an approach
/consult "is this the right way to handle state here?"
```

`/consult` passes your Claude Code conversation history to external models, so they can see what's already been tried and avoid suggesting the same things.
More models ≠ better. The value is consensus:
- 1 model flags issue → might be noise
- 2+ models flag same issue → likely real
- Different perspectives → surface blind spots
Conclave surfaces what matters.
`/review` follows a state machine pattern with checkpoints to ensure reliability:

```
[INIT] → [GATHERING] → [SPAWNING] → [TOOLS_COMPLETE]
                                          │
                                    ⛔ CHECKPOINT
                                          │
[PERSISTED] → [SYNTHESIZING] → [INVESTIGATING] → [COMPLETE]
```

The ⛔ checkpoint ensures review data is persisted before synthesis. If `persistence.required` is `true`, the workflow halts on persistence failure.
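
A rough sketch of what the checkpoint implies, assuming a Bun/TypeScript implementation (the function name, file naming, and error handling below are illustrative, not Conclave's actual code):

```typescript
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Sketch only: halt-or-continue decision at the persistence checkpoint.
// Field names mirror the `persistence` config block above.
interface PersistenceConfig {
  enabled: boolean;
  required: boolean;
  data_dir: string;
}

function persistCheckpoint(results: Record<string, string>, persistence: PersistenceConfig): void {
  if (!persistence.enabled) return; // persistence disabled: go straight to synthesis

  try {
    mkdirSync(persistence.data_dir, { recursive: true });
    const file = join(persistence.data_dir, `review-${Date.now()}.json`);
    writeFileSync(
      file,
      JSON.stringify({ timestamp: new Date().toISOString(), results }, null, 2),
    );
  } catch (err) {
    if (persistence.required) {
      // required: true → halt the workflow before synthesis
      throw new Error(`Persistence failed, halting review: ${err}`);
    }
    // required: false → warn and continue to synthesis anyway
    console.warn("Persistence failed, continuing without saved results:", err);
  }
}
```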
MIT