Troubleshooting
Common errors when running Aether Forge agents and how to fix them.
Install & Setup
forge: command not found
The forge CLI installs to your Python env’s bin/. Activate the venv:
source .venv/bin/activate
which forge
If still missing: pip install -e . from the repo root.
forge doctor shows skipped optional checks
The [skip] lines mean the optional package isn’t installed. Add it:
pip install 'aether-forge[wallet]' # OWS
pip install 'aether-forge[knowledge]' # MemPalace
pip install 'aether-forge[security]' # cryptography
pip install 'aether-forge[a2a]' # a2a-sdk
pip install 'aether-forge[all]' # everything
LLM / Planner
Error: openai-compatible planner mode is missing required settings: api_key
Your provider’s API key env var isn’t set. Set the right one for your --planner-mode:
| Mode | Env var |
|---|---|
| openrouter | OPENROUTER_API_KEY |
| anthropic | ANTHROPIC_API_KEY |
| openai | OPENAI_API_KEY |
| gemini / google | GEMINI_API_KEY |
| ollama | (none; needs ollama serve running) |
export OPENROUTER_API_KEY=sk-or-v1-...
forge run . --environment sandbox --mode paper
Planner times out
DeepSeek R1 and other reasoning models can take 1-3 minutes per response. Either:
- Increase tick_timeout_seconds in aether-forge.json
- Switch to a faster model:
--planner-mode openrouter --planner-model anthropic/claude-sonnet-4
Could not convert string to float: '50%'
A numeric field somewhere has % in it. Check strategy.json and aether-forge.json — values should be raw numbers (0.5 not "50%").
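To find the offending field without reading both files by hand, a small script can walk the JSON and list every string value that ends in %. This is a minimal sketch using only the standard library; it assumes both files are plain JSON in the current directory and makes no assumption about which field names Aether Forge expects.

```python
import json
from pathlib import Path


def find_percent_strings(node, path=""):
    """Recursively collect (json_path, value) for string values ending in '%'."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += find_percent_strings(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits += find_percent_strings(value, f"{path}[{index}]")
    elif isinstance(node, str) and node.strip().endswith("%"):
        hits.append((path, node))
    return hits


def scan_config_files(filenames=("strategy.json", "aether-forge.json")):
    """Print every percent-style string found in the given config files."""
    for name in filenames:
        file = Path(name)
        if not file.exists():
            continue  # skip files that are not present in this agent dir
        for json_path, value in find_percent_strings(json.loads(file.read_text())):
            print(f"{name}: {json_path} = {value!r}")
```

Run scan_config_files() from the agent directory, then replace each reported value with a raw number.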
Ollama: Connection refused
Ollama isn’t running. Start it:
ollama serve # in one terminal
ollama pull gemma4:latest
forge doctor fails because planner is autodetected
Production and staging profiles require an explicit planner choice. Regenerate with explicit flags or edit aether-forge.json intentionally:
forge generate-fast \
--name "research-agent" \
--idea "summarize trusted sources" \
--output ./research-agent \
--deployment-profile staging \
--planner-mode anthropic \
--planner-model claude-sonnet-4-5 \
--planner-api-key-env ANTHROPIC_API_KEY
local keeps autodetected planners advisory-only. staging and production fail because planner provenance must be auditable before promotion.
A cloud key is set but Aether Forge tries Ollama
Autodetect probes cloud providers before Ollama. Check for misspelled env vars:
env | grep -E 'ANTHROPIC_API_KEY|OPENAI_API_KEY|GOOGLE_API_KEY|GEMINI_API_KEY|OPENROUTER_API_KEY'
If a cloud key is set, Aether Forge should not open localhost:11434 unless AETHER_FORGE_ALLOW_OLLAMA_AUTODETECT is truthy. Unset that variable if you want cloud-first behavior.
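The grep only prints correctly spelled names, so a typo shows up as a missing line rather than a hit. A rough sketch of a check that also flags near-miss names follows. The list of expected names comes from the grep pattern; the 0.85 similarity cutoff is an arbitrary assumption, and the script is not part of Aether Forge.

```python
import difflib
import os

# Names taken from the grep pattern in this section.
EXPECTED_KEYS = [
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "GEMINI_API_KEY",
    "OPENROUTER_API_KEY",
]


def audit_planner_env(env):
    """Return (set_keys, suspects): exact matches and likely misspellings."""
    set_keys = [key for key in EXPECTED_KEYS if env.get(key)]
    suspects = {}
    for name in env:
        if name in EXPECTED_KEYS:
            continue
        # cutoff=0.85 is an assumed threshold; tune it if it is too noisy.
        close = difflib.get_close_matches(name, EXPECTED_KEYS, n=1, cutoff=0.85)
        if close:
            suspects[name] = close[0]
    return set_keys, suspects


if __name__ == "__main__":
    found, typos = audit_planner_env(dict(os.environ))
    print("set:", found)  # prints names only, never the secret values
    for wrong, right in typos.items():
        print(f"possible typo: {wrong} (did you mean {right}?)")
```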
Memory & Storage
sqlite3.OperationalError: database is locked
Two processes are writing to the same memory.db. Use one runner per agent dir, or set --memory-db /tmp/different-path.db.
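To confirm that another process really holds the database, you can probe for the write lock from Python. This is a minimal sketch with the standard sqlite3 module; it briefly requests the write lock and rolls back at once, so it does not modify memory.db. The 0.2 second wait is an assumption.

```python
import sqlite3


def is_write_locked(db_path, wait_seconds=0.2):
    """Return True if another connection currently holds the write lock."""
    # isolation_level=None gives manual transaction control (autocommit mode).
    conn = sqlite3.connect(db_path, timeout=wait_seconds, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # asks for the write lock up front
        conn.execute("ROLLBACK")         # release it without changing anything
        return False
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc).lower():
            return True
        raise  # some other problem, e.g. the file is not a database
    finally:
        conn.close()
```

If is_write_locked("memory.db") returns True while your runner is stopped, look for a second forge process using the same agent directory.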
no such table: memory_records
Memory database is missing or corrupted:
rm memory.db memory.db-wal memory.db-shm
forge run . --environment sandbox --mode paper
Memory keeps growing
Memory has no auto-eviction. To trim:
sqlite3 memory.db "DELETE FROM memory_records WHERE created_at < datetime('now', '-7 days')"
sqlite3 memory.db "VACUUM"
Or set expires_at when writing memory records.
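If you trim on a schedule, the same two statements can run from a short Python script. This sketch uses the table and column named in the commands above (memory_records, created_at) and assumes created_at is stored as UTC text that SQLite's datetime() can compare. Stop the agent first so the database is not locked.

```python
import sqlite3


def prune_memory(db_path, max_age_days=7):
    """Delete rows older than max_age_days, then reclaim disk space."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits the DELETE, or rolls back on error
            cursor = conn.execute(
                "DELETE FROM memory_records "
                "WHERE created_at < datetime('now', ?)",
                (f"-{int(max_age_days)} days",),
            )
        deleted = cursor.rowcount
        conn.execute("VACUUM")  # must run outside a transaction
        return deleted
    finally:
        conn.close()
```

prune_memory("memory.db") returns the number of deleted rows, which is handy for a cron log line.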
Wallet & Crypto
OWS_API_KEY not found
Either install OWS (pip install 'aether-forge[wallet]') or run with the simulated wallet provider (default if OWS isn’t installed).
Install support and credentials are separate:
pip install 'aether-forge[wallet] @ git+https://github.com/HeyElsa/aether-forge.git'
export OWS_API_KEY=...
For the first run, do not install wallet extras and do not pass --wallet. Follow the First 10 Minutes guide instead.
BalanceCheckFailed: insufficient ETH for gas
The agent’s wallet needs ~$0.003 in ETH on Base for transaction gas. Check wallet:
cat wallet.json | jq '.addresses.evm'
# Send a small amount of ETH to that address on Base mainnet
forge agent-register fails with RPC error
Public Base RPC may be rate-limited. Use your own:
forge agent-register <id> --rpc-url https://base-mainnet.g.alchemy.com/v2/YOUR_KEY
A2A & Multi-Agent
Connection refused when calling another agent
The peer agent isn’t running, or its --a2a-port doesn’t match your agent-send URL. Verify:
curl http://localhost:9001/.well-known/a2a-card
A2A task stays in submitted forever
The peer agent’s planner errored. Check the peer’s logs / replay files:
forge replay-show ./peer-agent/replays/$(ls -1t ./peer-agent/replays | head -1)
429 Too Many Requests on A2A
You hit the rate limit (60 req/min per IP). Slow down or run multiple instances.
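One way to slow down on the caller side is to space requests evenly so they stay under 60 per minute. The class below is a hypothetical helper, not an Aether Forge API. It is a minimal sketch that enforces a minimum interval between calls; the clock and sleep functions can be injected for testing.

```python
import time


class MinIntervalThrottle:
    """Space out calls so they never exceed max_per_minute."""

    def __init__(self, max_per_minute=60, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # no request sent yet

    def wait(self):
        """Block until the next request is allowed; return seconds slept."""
        now = self.clock()
        delay = 0.0
        if self.next_allowed is not None and now < self.next_allowed:
            delay = self.next_allowed - now
            self.sleep(delay)
            now = self.next_allowed
        self.next_allowed = now + self.interval
        return delay
```

Call throttle.wait() before each request to the peer. Setting max_per_minute slightly below 60 leaves headroom for other callers behind the same IP.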
x402 Payments
Payment required: $0.005 USDC
Caller didn’t include the X-PAYMENT header. Either:
- Use forge x402-call ... which handles this automatically
- Manually sign EIP-3009 and add the X-PAYMENT header
Payment header rejected before execution
X402PaymentGate verifies payment headers before running paid capabilities. Common causes:
| Error shape | Fix |
|---|---|
| Wrong receiving address | Match the gate’s configured pay-to address |
| Insufficient amount | Pay at least the requested amount |
| Payer not allowed | Add the caller to allowed_payers or use an allowed wallet |
| Malformed header | Use forge x402-call ... instead of hand-building the header |
Rejections happen before the paid capability executes.
Budget exceeded: session_spent_usd 1.05 > max_session_usd 1.00
You hit your configured cap. Either:
- Wait for the next session
- Increase max_session_usd under x402_budget in aether-forge.json
- Run with --confirm-live --max-session-usd 5.00
Budget checks and payment execution share one lock. If concurrent callers wait or fail around x402_state.json, serialize calls or give each agent its own state path.
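The comparison in the error message can be reproduced before a call to see whether it would fit in the remaining budget. This is only an illustration of the arithmetic, assuming the cap applies to the running session total plus the next payment; the function name is hypothetical and the gate's real logic may differ. Decimal avoids float rounding on small USD amounts.

```python
from decimal import Decimal


def would_exceed_budget(session_spent_usd, call_cost_usd, max_session_usd):
    """Return True if paying call_cost_usd would push the session past its cap."""
    # str() first so float inputs like 0.005 convert to exact decimal values.
    spent = Decimal(str(session_spent_usd))
    cost = Decimal(str(call_cost_usd))
    cap = Decimal(str(max_session_usd))
    return spent + cost > cap
```

For example, with 1.00 already spent and a 1.00 cap, a further 0.05 payment gives 1.05 > 1.00, which matches the error shown above.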
Tx reverts on Base
Use simulate_tx() from aether_forge.defi_safety before signing. Check the revert reason for clues (insufficient balance, slippage, deadline).
Runtime / Production
Agent stuck on a tick (no output)
The LLM call is hanging. The default tick_timeout_seconds=120 should cut it off after 2 min. Check logs:
tail -f logs/agent.jsonl | jq -c 'select(.level=="ERROR" or .level=="WARNING")'
/ready returns 503
Check the reason in the response body:
curl localhost:8080/ready
# {"ready": false, "reason": "kill switch active"}
Common reasons:
- kill switch active → forge resume .
- last 5 ticks failed → check planner config and LLM availability
- warming up → just wait for the first tick to complete
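For a monitoring script, the reasons listed above can be turned into a lookup. This sketch assumes the response body has the shape shown in the curl example (ready and reason keys) and that the endpoint is on localhost:8080. The function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

# Reasons and remedies as listed in this section.
REMEDIES = {
    "kill switch active": "run forge resume .",
    "last 5 ticks failed": "check planner config and LLM availability",
    "warming up": "wait for the first tick to complete",
}


def explain_ready(body):
    """Map a /ready JSON body to a one-line hint."""
    payload = json.loads(body)
    if payload.get("ready"):
        return "ready"
    reason = payload.get("reason", "")
    return f"{reason} -> {REMEDIES.get(reason, 'see agent logs')}"


def fetch_ready(url="http://localhost:8080/ready"):
    """Fetch /ready; a 503 still carries the JSON body we want."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.read().decode()
    except urllib.error.HTTPError as err:
        return err.read().decode()  # urllib raises on 503, body is still readable
```

print(explain_ready(fetch_ready())) gives a single line suitable for an alert message.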
Circuit breaker tripped
After 5 consecutive failures, the agent enters a 60-second cooldown. Check what’s failing:
forge replays . --limit 5
forge replay-show ./replays/<latest-failed>.json
Docker container exits immediately
Check logs:
docker logs <container-id>
Common causes: missing env vars (OPENROUTER_API_KEY), missing wallet vault (.ows/ not mounted), wrong working dir.
MCP
MCP server starts but no tools are available
Check the server entry in aether-forge.json:
{
"mcp_servers": [
{
"name": "filesystem",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
"tools": { "include": ["read_file"], "exclude": [] }
}
]
}
tools.include and tools.exclude are enforced per server. A tool excluded by config will not appear to the planner.
MCP server cannot see an env var
Stdio MCP subprocesses receive a safe baseline environment plus only the env vars explicitly declared on the server config. Add required variables under that server’s env field instead of relying on the parent shell.
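The effect is easier to see in code. The sketch below illustrates the general pattern (baseline plus declared variables), not Aether Forge's actual implementation. The SAFE_BASELINE list is a hypothetical stand-in, since the real baseline is not documented here.

```python
# Hypothetical baseline; the real allow-list is an assumption for illustration.
SAFE_BASELINE = ("PATH", "HOME", "LANG")


def build_child_env(parent_env, declared):
    """Baseline variables from the parent, plus explicitly declared ones."""
    env = {key: parent_env[key] for key in SAFE_BASELINE if key in parent_env}
    env.update(declared)  # only declared variables pass beyond the baseline
    return env
```

A token exported in the parent shell never reaches the child unless it appears in declared, which is why a variable that works in your terminal can still be missing inside the MCP server.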
MCP result looks like prompt injection
External tool results are scanned before entering planner context. If a tool returns instructions such as “ignore previous instructions” or asks for secrets, Aether Forge blocks or flags the result. Treat that as a data-source issue, not a planner bug.
Docs Site
npm run dev fails with peer dep warnings
Use --legacy-peer-deps:
cd docs-site
npm install --legacy-peer-deps
Videos don’t load
Videos live in docs-site/public/videos/. Posters live in docs-site/public/video-posters/ so pages have a useful visual state before playback. If cloning fresh, ensure both sets downloaded:
ls -lh docs-site/public/videos/*.mp4 | wc -l
ls -lh docs-site/public/video-posters/*.jpg | wc -l
# Both should be 34
Every video page also has the written guide text as the canonical path, so a missing video should not block setup.
next start returns 404 for docs routes
Build the docs site before starting production mode:
cd docs-site
npm run build
npm run start
The postbuild step patches the Next routes manifest for MDX docs routes. If you bypass npm run build, production route serving will not be prepared.
Still stuck?
- Run forge doctor: most issues show up here
- Run with -v for a full traceback
- Check replays/ for the last tick’s step ledger
- Open an issue with forge doctor output and the failing replay