Discovery → key → score → iterate.

Discover — read /llms.txt (and /.well-known/llms.txt); the OpenAPI spec and interactive docs carry full schemas.

Get a key — a signed-in org user mints one once at /developers (POST /api/v1/agent-keys); the sk-otter-<40 hex> secret is shown once. Hand it to the agent.

Connect — drop the .mcp.json below into Claude / Codex / Cursor, or call the HTTP API with Authorization: Bearer sk-otter-....

Score — POST /api/v1/eval/feedback with the artifact + the prompt the agent was given.

Read flaws — each flaw has criterion, severity, evidence, detail, anchor; upgrades[] are concrete fixes.

Iterate — POST /api/v1/eval/runs/{id}/iterate until band clears the gate (e.g. ship).

Workflow / benchmark — POST /api/v1/eval/workflows/{id}/topology for a composite + chain critique.
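The score, read-flaws, and iterate steps above can be sketched as one loop. This is a minimal sketch under stated assumptions, not the service's client: the `post` transport and the `revise` callback are hypothetical and supplied by the caller, the severity labels are invented for illustration, and the placement of `band` and `flaws` inside `verdict`, as well as the body accepted by the iterate endpoint, are guesses. The OpenAPI spec is the authority on the real schema.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: takes (path, json_body) and returns the decoded JSON response.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]
# Hypothetical reviser: takes (current_draft, flaws) and returns a new draft.
Revise = Callable[[str, List[Dict[str, Any]]], str]

# Assumed severity labels; the source does not list the allowed values.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


def worst_first(flaws: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort flaws so the most severe come first; unknown severities sort last."""
    return sorted(
        flaws,
        key=lambda f: SEVERITY_ORDER.get(f.get("severity"), len(SEVERITY_ORDER)),
    )


def score_until_gate(post: Post, prompt: str, draft: str, revise: Revise,
                     gate: str = "ship", max_rounds: int = 5) -> Dict[str, Any]:
    """Grade once, then revise and iterate until the band equals the gate."""
    part = lambda text: [{"mime_type": "text/plain", "text": text}]
    result = post("/api/v1/eval/feedback",
                  {"modality": "text", "prompt": prompt, "artifact_parts": part(draft)})
    run_id, verdict = result["run_id"], result["verdict"]
    for _ in range(max_rounds):
        # Assumed: the band lives at verdict["band"].
        if verdict.get("band") == gate:
            break
        draft = revise(draft, worst_first(verdict.get("flaws", [])))
        # Assumed: iterate takes the revised artifact in the same shape as feedback
        # and returns the next verdict under the "verdict" key.
        result = post(f"/api/v1/eval/runs/{run_id}/iterate",
                      {"artifact_parts": part(draft)})
        verdict = result["verdict"]
    return {"run_id": run_id, "verdict": verdict, "artifact": draft}
```

The `max_rounds` cap is a design choice of this sketch: it stops an agent from looping forever when the band never reaches the gate.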

{
  "mcpServers": {
    "otterloop": {
      "command": "python",
      "args": ["-m", "otterloop.mcp_server"],
      "env": {
        "OTTERLOOP_API_URL": "https://api.seaotter.ai",
        "OTTERLOOP_API_KEY": "sk-otter-...",
        "OTTERLOOP_POLICY_ID": "acme-prod-acceptance"
      }
    }
  }
}
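If the key should not be pasted into the file by hand, the same config can be rendered from a value held elsewhere. The sketch below is one possible approach; the function name `render_mcp_config` is hypothetical, and the keys and values simply mirror the .mcp.json shown above.

```python
import json


def render_mcp_config(api_key: str, policy_id: str,
                      api_url: str = "https://api.seaotter.ai") -> str:
    """Return the .mcp.json text for the otterloop MCP server entry."""
    config = {
        "mcpServers": {
            "otterloop": {
                "command": "python",
                "args": ["-m", "otterloop.mcp_server"],
                "env": {
                    "OTTERLOOP_API_URL": api_url,
                    "OTTERLOOP_API_KEY": api_key,
                    "OTTERLOOP_POLICY_ID": policy_id,
                },
            }
        }
    }
    return json.dumps(config, indent=2)
```

A caller would write the returned string to .mcp.json, passing a key read from an environment variable or secret store instead of a literal.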

curl -s https://api.seaotter.ai/api/v1/eval/feedback \
  -H "Authorization: Bearer $OTTER_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "modality": "text",
    "policy_id": "acme-prod-acceptance",
    "prompt": "Draft the Q3 incident postmortem",
    "artifact_parts": [{"mime_type": "text/plain", "text": "...your work..."}],
    "return_feedback_artifacts": true
  }'
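For agents that call the HTTP API from Python, the curl request above could be built with the standard library alone. This is a sketch of the same request, not an official client; the function name `build_feedback_request` is hypothetical, and the body fields are copied from the curl example.

```python
import json
import urllib.request


def build_feedback_request(api_key: str, prompt: str, text: str, policy_id: str,
                           base_url: str = "https://api.seaotter.ai") -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/eval/feedback request."""
    payload = {
        "modality": "text",
        "policy_id": policy_id,
        "prompt": prompt,
        "artifact_parts": [{"mime_type": "text/plain", "text": text}],
        "return_feedback_artifacts": True,
    }
    return urllib.request.Request(
        url=base_url + "/api/v1/eval/feedback",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is then a matter of passing the request to `urllib.request.urlopen` and decoding the JSON body, which per the endpoint list carries `run_id` and `verdict`.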

Every eval call: Authorization: Bearer sk-otter-...

Bases: https://api.seaotter.ai (prod) / https://dev-api.seaotter.ai (dev).
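A small helper can keep the base URLs and the bearer header in one place and catch a malformed key before any request is sent. This is an illustrative sketch: the names `BASES`, `auth_headers`, and `endpoint` are hypothetical, and the pattern assumes the documented `sk-otter-<40 hex>` shape with either letter case accepted.

```python
import re

# Base URLs as listed above.
BASES = {
    "prod": "https://api.seaotter.ai",
    "dev": "https://dev-api.seaotter.ai",
}

# Assumed key shape: "sk-otter-" followed by exactly 40 hex characters.
KEY_PATTERN = re.compile(r"sk-otter-[0-9a-fA-F]{40}")


def auth_headers(api_key: str) -> dict:
    """Return the Authorization header, rejecting keys that do not match the shape."""
    if not KEY_PATTERN.fullmatch(api_key):
        raise ValueError("expected 'sk-otter-' followed by 40 hex characters")
    return {"Authorization": f"Bearer {api_key}"}


def endpoint(path: str, env: str = "prod") -> str:
    """Join a path such as /api/v1/eval/policies onto the chosen base URL."""
    return BASES[env] + path
```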

Method    Path                                    What it does
GET       /api/v1/eval/policies                   Org acceptance policies to condition grading on.
GET       /api/v1/eval/rubrics                    List rubrics (acceptance criteria); /{id} for one.
POST      /api/v1/eval/feedback                   One-shot grade -> { run_id, verdict }.
POST      /api/v1/eval/runs                       Create a run + first verdict (full conditioning slots).
POST      /api/v1/eval/runs/{id}/iterate          Submit a revision, get the next verdict.
GET       /api/v1/eval/runs/{id}/score            Fetch a run's latest score.
POST      /api/v1/eval/workflows/{id}/topology    Workflow composite + per-step + chain critique.
GET/POST  /api/v1/agent-keys                      List / mint eval keys (signed-in org user, not an eval key).

Fully programmatic self-signup is not shipped yet.

Today the first eval key is minted once by a human: a signed-in org user at /developers (POST /api/v1/agent-keys requires a product-user JWT, not an eval key). After that, the agent runs the whole loop with no human in the loop. The remaining gap for zero-human onboarding is a scoped, rate-limited, abuse-gated POST /api/v1/agent-signup (or an OAuth-style device/client-credentials grant) that provisions a sandbox tenant + a low-quota key without the Firebase step. Until then, the one-time human key mint is the single manual step.

An agent can onboard itself.
