
Troubleshooting

Common errors when running Aether Forge agents and how to fix them.

Install & Setup

forge: command not found

The forge CLI installs to your Python env’s bin/. Activate the venv:

source .venv/bin/activate
which forge

If forge is still missing, run pip install -e . from the repo root.

forge doctor shows skipped optional checks

The [skip] lines mean the optional package isn’t installed. Add it:

pip install 'aether-forge[wallet]'     # OWS
pip install 'aether-forge[knowledge]'  # MemPalace
pip install 'aether-forge[security]'   # cryptography
pip install 'aether-forge[a2a]'        # a2a-sdk
pip install 'aether-forge[all]'        # everything

LLM / Planner

Error: openai-compatible planner mode is missing required settings: api_key

Your provider’s API key env var isn’t set. Set the right one for your --planner-mode:

Mode             Env var
openrouter       OPENROUTER_API_KEY
anthropic        ANTHROPIC_API_KEY
openai           OPENAI_API_KEY
gemini / google  GEMINI_API_KEY
ollama           (none; needs ollama serve running)

export OPENROUTER_API_KEY=sk-or-v1-...
forge run . --environment sandbox --mode paper
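
A pre-flight check can catch a missing key before forge run does. The sketch below copies the mode-to-variable table above into a Python dict and reports which variable is unset. REQUIRED_ENV and missing_key are illustrative names, not part of Aether Forge.

```python
import os

# Mode -> API key variable, copied from the table above.
# ollama needs no key, so it maps to None.
REQUIRED_ENV = {
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "google": "GEMINI_API_KEY",
    "ollama": None,
}


def missing_key(mode, environ=None):
    """Return the env var name that is unset for this mode, or None if nothing is missing."""
    environ = os.environ if environ is None else environ
    var = REQUIRED_ENV[mode]  # raises KeyError for modes not in the table
    if var is None or environ.get(var, "").strip():
        return None
    return var


if __name__ == "__main__":
    for mode in REQUIRED_ENV:
        var = missing_key(mode)
        print(f"{mode}: {'ok' if var is None else 'set ' + var}")
```

A variable that is set but contains only whitespace is treated as missing here, which is a choice of this sketch rather than documented planner behavior.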

Planner times out

DeepSeek R1 and other reasoning models can take 1-3 minutes per response. Either:

  • Increase tick_timeout_seconds in aether-forge.json
  • Switch to a faster model: --planner-mode openrouter --planner-model anthropic/claude-sonnet-4

Could not convert string to float: '50%'

A numeric field somewhere has % in it. Check strategy.json and aether-forge.json — values should be raw numbers (0.5 not "50%").
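
Finding the offending field by eye can be slow in a large config. As a rough aid, the following sketch walks both JSON files and prints every string value that contains a percent sign, with its location. find_percent_values is a hypothetical helper, and it will also flag legitimate text fields that happen to contain %, so treat the output as candidates only.

```python
import json
from pathlib import Path


def find_percent_values(node, path="$"):
    """Yield (json_path, value) for every string value containing '%'."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from find_percent_values(child, f"{path}.{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_percent_values(child, f"{path}[{index}]")
    elif isinstance(node, str) and "%" in node:
        yield path, node


if __name__ == "__main__":
    # File names come from the paragraph above; missing files are skipped.
    for name in ("strategy.json", "aether-forge.json"):
        file = Path(name)
        if not file.exists():
            continue
        for json_path, value in find_percent_values(json.loads(file.read_text())):
            print(f"{name}: {json_path} = {value!r}")
```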

Ollama: Connection refused

Ollama isn’t running. Start it:

ollama serve   # in one terminal
ollama pull gemma4:latest

forge doctor fails because planner is autodetected

Production and staging profiles require an explicit planner choice. Regenerate with explicit flags or edit aether-forge.json intentionally:

forge generate-fast \
  --name "research-agent" \
  --idea "summarize trusted sources" \
  --output ./research-agent \
  --deployment-profile staging \
  --planner-mode anthropic \
  --planner-model claude-sonnet-4-5 \
  --planner-api-key-env ANTHROPIC_API_KEY

local keeps autodetected planners advisory-only. staging and production fail because planner provenance must be auditable before promotion.

A cloud key is set but Aether Forge tries Ollama

Autodetect probes cloud providers before Ollama. Check for misspelled env vars:

env | grep -E 'ANTHROPIC_API_KEY|OPENAI_API_KEY|GOOGLE_API_KEY|GEMINI_API_KEY|OPENROUTER_API_KEY'

If a cloud key is set, Aether Forge should not open localhost:11434 unless AETHER_FORGE_ALLOW_OLLAMA_AUTODETECT is truthy. Unset that variable if you want cloud-first behavior.
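
The grep above only shows correctly spelled names, so a typo such as ANTHROPIC_APIKEY stays invisible. One way to surface near misses is fuzzy matching with the standard library's difflib, sketched below. The 0.85 cutoff is an arbitrary assumption, and near_misses is an illustrative helper, not an Aether Forge function.

```python
import difflib
import os

# Names taken from the grep pattern above.
EXPECTED = [
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "GEMINI_API_KEY",
    "OPENROUTER_API_KEY",
]


def near_misses(environ=None, cutoff=0.85):
    """Map each suspicious variable name to the expected name it most resembles."""
    environ = os.environ if environ is None else environ
    hits = {}
    for name in environ:
        if name in EXPECTED:
            continue  # spelled correctly, nothing to report
        # Compare in upper case so wrong-case names are flagged too.
        match = difflib.get_close_matches(name.upper(), EXPECTED, n=1, cutoff=cutoff)
        if match:
            hits[name] = match[0]
    return hits


if __name__ == "__main__":
    for wrong, right in near_misses().items():
        print(f"{wrong} looks like a misspelling of {right}")
```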

Memory & Storage

sqlite3.OperationalError: database is locked

Two processes are writing to the same memory.db. Use one runner per agent dir, or set --memory-db /tmp/different-path.db.

no such table: memory_records

Memory database is missing or corrupted:

rm memory.db memory.db-wal memory.db-shm
forge run . --environment sandbox --mode paper

Memory keeps growing

Memory has no auto-eviction. To trim:

sqlite3 memory.db "DELETE FROM memory_records WHERE created_at < datetime('now', '-7 days')"
sqlite3 memory.db "VACUUM"

Or set expires_at when writing memory records.
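
If you would rather run the trim from a script or a scheduled job, the same two statements can be issued through Python's sqlite3 module. This is a minimal sketch that assumes created_at holds UTC timestamps in SQLite's text format, as the SQL above implies. trim_memory is a hypothetical helper. Stop the agent first so the script does not run into the "database is locked" error described earlier.

```python
import sqlite3


def trim_memory(db_path, days=7):
    """Delete memory_records older than `days` days, then VACUUM. Returns rows deleted."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits the DELETE on success, rolls back on error
            cur = conn.execute(
                "DELETE FROM memory_records WHERE created_at < datetime('now', ?)",
                (f"-{int(days)} days",),
            )
            deleted = cur.rowcount
        # VACUUM cannot run inside a transaction, so it comes after the commit.
        conn.execute("VACUUM")
        return deleted
    finally:
        conn.close()


if __name__ == "__main__":
    print(f"deleted {trim_memory('memory.db')} rows")
```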

Wallet & Crypto

OWS_API_KEY not found

Either install OWS (pip install 'aether-forge[wallet]') or run with the simulated wallet provider (default if OWS isn’t installed).

Install support and credentials are separate:

pip install 'aether-forge[wallet] @ git+https://github.com/HeyElsa/aether-forge.git'
export OWS_API_KEY=...

For the first run, do not install wallet extras and do not pass --wallet. Follow the First 10 Minutes guide instead.

BalanceCheckFailed: insufficient ETH for gas

The agent’s wallet needs ~$0.003 in ETH on Base for transaction gas. Look up the wallet address:

cat wallet.json | jq '.addresses.evm'
# Send a small amount of ETH to that address on Base mainnet

forge agent-register fails with RPC error

Public Base RPC may be rate-limited. Use your own:

forge agent-register <id> --rpc-url https://base-mainnet.g.alchemy.com/v2/YOUR_KEY

A2A & Multi-Agent

Connection refused when calling another agent

The peer agent isn’t running, or its --a2a-port doesn’t match your agent-send URL. Verify:

curl http://localhost:9001/.well-known/a2a-card

A2A task stays in submitted forever

The peer agent’s planner errored. Check the peer’s logs / replay files:

forge replay-show ./peer-agent/replays/$(ls -1t ./peer-agent/replays | head -1)

429 Too Many Requests on A2A

You hit the rate limit (60 requests per minute per IP). Slow down or run multiple instances.
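
On the calling side, "slow down" can be as simple as spacing requests evenly. The sketch below is a generic client-side throttle, not something Aether Forge ships. It keeps calls at least 60 / max_per_minute seconds apart. Throttle is a hypothetical class, and the clock and sleep parameters exist only so it can be tested without real waiting.

```python
import time


class Throttle:
    """Keep successive calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute=60, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # no call made yet

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval


if __name__ == "__main__":
    throttle = Throttle(max_per_minute=50)  # a little under the limit for margin
    for i in range(3):
        throttle.wait()
        print("send request", i)  # replace with the real A2A call
```

Call wait() immediately before each request. Setting max_per_minute somewhat below 60 leaves headroom for other clients that share your IP.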

x402 Payments

Payment required: $0.005 USDC

Caller didn’t include the X-PAYMENT header. Either:

  • Use forge x402-call ... which handles this automatically
  • Manually sign EIP-3009 and add the X-PAYMENT header

Payment header rejected before execution

X402PaymentGate verifies payment headers before running paid capabilities. Common causes:

Error shape              Fix
Wrong receiving address  Match the gate’s configured pay-to address
Insufficient amount      Pay at least the requested amount
Payer not allowed        Add the caller to allowed_payers or use an allowed wallet
Malformed header         Use forge x402-call ... instead of hand-building the header

Rejections happen before the paid capability executes.

Budget exceeded: session_spent_usd 1.05 > max_session_usd 1.00

You hit your configured cap. Either:

  • Wait for next session
  • Increase max_session_usd in aether-forge.json x402_budget
  • Run with --confirm-live --max-session-usd 5.00

Budget checks and payment execution share one lock. If concurrent callers wait or fail around x402_state.json, serialize calls or give each agent its own state path.
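
To see why the check and the spend sit under one lock, consider the toy model below. It is an illustration of the pattern only, not Aether Forge's implementation. SessionBudget and try_spend are hypothetical names, and amounts are passed as strings so Decimal keeps the arithmetic exact.

```python
import threading
from decimal import Decimal


class SessionBudget:
    """Check and spend under one lock so concurrent callers cannot overshoot the cap."""

    def __init__(self, max_session_usd):
        self.max_session_usd = Decimal(max_session_usd)
        self.session_spent_usd = Decimal("0")
        self._lock = threading.Lock()

    def try_spend(self, amount_usd):
        """Record the spend and return True, or return False if it would exceed the cap."""
        amount = Decimal(amount_usd)
        with self._lock:
            # Without the lock, two callers could both pass this check
            # and together push the total past the cap.
            if self.session_spent_usd + amount > self.max_session_usd:
                return False
            self.session_spent_usd += amount
            return True


if __name__ == "__main__":
    budget = SessionBudget("1.00")
    print(budget.try_spend("0.005"))  # within the cap
    print(budget.try_spend("1.00"))   # would exceed the cap
```

The cost of this pattern is that callers queue behind the lock, which is why heavy concurrency around a single state file shows up as waiting or failures.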

Tx reverts on Base

Use simulate_tx() from aether_forge.defi_safety before signing. Check the revert reason for clues (insufficient balance, slippage, deadline).

Runtime / Production

Agent stuck on a tick (no output)

The LLM call is hanging. The default tick_timeout_seconds=120 should cut it off after 2 min. Check logs:

tail -f logs/agent.jsonl | jq -c 'select(.level=="ERROR" or .level=="WARNING")'

/ready returns 503

Check the reason in the response body:

curl localhost:8080/ready
# {"ready": false, "reason": "kill switch active"}

Common reasons:

  • kill switch active → forge resume .
  • last 5 ticks failed → check planner config and LLM availability
  • warming up → just wait for the first tick to complete
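
For a monitoring script, the reasons listed above can be turned into a lookup. The sketch below assumes the reason strings match the list exactly, which the source only confirms for "kill switch active". NEXT_STEP, suggest_fix and check_ready are illustrative names.

```python
import json
import urllib.error
import urllib.request

# Reason text -> suggested action, taken from the list above.
NEXT_STEP = {
    "kill switch active": "run: forge resume .",
    "last 5 ticks failed": "check planner config and LLM availability",
    "warming up": "wait for the first tick to complete",
}


def suggest_fix(body):
    """Turn a /ready JSON body into a one-line suggestion."""
    payload = json.loads(body)
    if payload.get("ready"):
        return "ready"
    reason = payload.get("reason", "")
    return NEXT_STEP.get(reason, f"unrecognized reason: {reason!r}")


def check_ready(url="http://localhost:8080/ready"):
    """Fetch /ready. urllib raises on a 503, so the body is read from the HTTPError."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            body = resp.read().decode()
    except urllib.error.HTTPError as err:
        body = err.read().decode()
    return suggest_fix(body)


if __name__ == "__main__":
    print(check_ready())
```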

Circuit breaker tripped

After 5 consecutive failures, the agent enters a 60-second cooldown. Check what’s failing:

forge replays . --limit 5
forge replay-show ./replays/<latest-failed>.json
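
If the trip-and-cooldown behavior is unfamiliar, the following toy model may help when reading the logs. It is a generic circuit breaker using the two numbers quoted above (5 consecutive failures, 60-second cooldown) and is not the Aether Forge implementation. What the real agent does after the cooldown is not stated here, so resetting the failure count is an assumption of this sketch.

```python
import time


class CircuitBreaker:
    """Open after `threshold` consecutive failures; allow calls again after `cooldown` seconds."""

    def __init__(self, threshold=5, cooldown=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0       # current streak of consecutive failures
        self.opened_at = None   # time the breaker tripped, or None if closed

    def allow(self):
        """Return True if a tick may run now."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: close the breaker and start counting afresh.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, success):
        """Update the failure streak after a tick finishes."""
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

Note that a single success resets the streak, so four failures followed by a success never trip the breaker in this model.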

Docker container exits immediately

Check logs:

docker logs <container-id>

Common causes: missing env vars (OPENROUTER_API_KEY), missing wallet vault (.ows/ not mounted), wrong working dir.

MCP

MCP server starts but no tools are available

Check the server entry in aether-forge.json:

{
  "mcp_servers": [
    {
      "name": "filesystem",
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
      "tools": { "include": ["read_file"], "exclude": [] }
    }
  ]
}

tools.include and tools.exclude are enforced per server. A tool excluded by config will not appear to the planner.
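
One plausible reading of these two lists is sketched below, to help reason about why the planner sees nothing. The exact semantics are an assumption here: this sketch treats an empty include list as "no restriction" and lets exclude win over include. visible_tools is a hypothetical function, and the tool names in the usage lines are placeholders.

```python
def visible_tools(advertised, include=None, exclude=None):
    """Apply include/exclude lists to the tool names a server advertises."""
    include = list(include or [])
    exclude = set(exclude or [])
    # An empty include list is assumed to mean "no restriction".
    allowed = [name for name in advertised if not include or name in include]
    # Exclusion is applied last, so it overrides include.
    return [name for name in allowed if name not in exclude]


if __name__ == "__main__":
    # Placeholder names: the include entry matches nothing the server offers.
    print(visible_tools(["tool_a", "tool_b"], include=["tool_c"]))
    print(visible_tools(["tool_a", "tool_b"], include=["tool_a"]))
```

Under this reading, an include entry that does not exactly match a name the server advertises leaves the planner with no tools, which is worth ruling out first.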

MCP server cannot see an env var

Stdio MCP subprocesses receive a safe baseline environment plus only the env vars explicitly declared on the server config. Add required variables under that server’s env field instead of relying on the parent shell.

MCP result looks like prompt injection

External tool results are scanned before entering planner context. If a tool returns instructions such as “ignore previous instructions” or asks for secrets, Aether Forge blocks or flags the result. Treat that as a data-source issue, not a planner bug.

Docs Site

npm run dev fails with peer dep warnings

Use --legacy-peer-deps:

cd docs-site
npm install --legacy-peer-deps

Videos don’t load

Videos live in docs-site/public/videos/. Posters live in docs-site/public/video-posters/ so pages have a useful visual state before playback. If cloning fresh, ensure both sets downloaded:

ls -lh docs-site/public/videos/*.mp4 | wc -l
ls -lh docs-site/public/video-posters/*.jpg | wc -l
# Both should be 34

Every video page also has the written guide text as the canonical path, so a missing video should not block setup.

next start returns 404 for docs routes

Build the docs site before starting production mode:

cd docs-site
npm run build
npm run start

The postbuild step patches the Next routes manifest for MDX docs routes. If you bypass npm run build, production route serving will not be prepared.

Still stuck?

  1. Run forge doctor — most issues show up here
  2. Run with -v for full traceback
  3. Check replays/ for the last tick’s step ledger
  4. Open an issue with forge doctor output and the failing replay