API reference

This catalog documents 23 endpoints across 11 groups. All request and response bodies are application/json unless noted.

Analytics

GET `/api/v1/analytics`

Model performance analytics

Aggregates execution statistics from the last 100 tasks: success rate, average duration, and failure counts per model. Returns a ranked top-5 list by success rate alongside full per-model stats.

Responses

Status	Meaning
`200`	Aggregate model performance stats with top-5 ranking
`503`	Storage unavailable while reading task/step records

200 response shape — AnalyticsResponse

Field	Type	Required	Description
`total_models`	`integer`	✓	—
`total_executions`	`integer`	✓	—
`models`	`ModelStatsView[]`	✓	—
`top_models`	`ModelStatsView[]`	✓	—

Example response:

{
  "total_models": 0,
  "total_executions": 0,
  "models": [
    {
      "model_id": "abc123",
      "total_executions": 0,
      "successful": 0,
      "failed": 0,
      "success_rate": 0,
      "avg_duration_ms": 0
    }
  ],
  "top_models": [
    {
      "model_id": "abc123",
      "total_executions": 0,
      "successful": 0,
      "failed": 0,
      "success_rate": 0,
      "avg_duration_ms": 0
    }
  ]
}
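
The top-5 ranking described above could be reproduced client-side from the `models` array. The following is a minimal sketch, assuming each entry carries the `success_rate` and `total_executions` fields shown in the example. The tie-break on `total_executions` and the helper name `rank_top_models` are assumptions for illustration, not documented server behavior.

```python
from typing import Any


def rank_top_models(models: list[dict[str, Any]], limit: int = 5) -> list[dict[str, Any]]:
    """Return up to `limit` entries ordered by success_rate, highest first.

    Ties are broken by total_executions (an assumption, not a documented rule).
    """
    return sorted(
        models,
        key=lambda m: (m["success_rate"], m["total_executions"]),
        reverse=True,
    )[:limit]


sample = [
    {"model_id": "a", "success_rate": 0.5, "total_executions": 10},
    {"model_id": "b", "success_rate": 0.9, "total_executions": 4},
    {"model_id": "c", "success_rate": 0.9, "total_executions": 8},
]
top_ids = [m["model_id"] for m in rank_top_models(sample)]
```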

Batch

POST `/api/v1/batch`

Execute multiple prompts in parallel

Runs a single model on up to 20 prompts concurrently. Returns per-prompt results with individual success/failure status.

Request body

Field	Type	Required	Description
`prompts`	`string[]`	✓	—
`model`	`string`		(default: `"claude-sonnet-4-6"`)
`dry_run`	`boolean`		(default: `false`)

{
  "prompts": [
    "Summarize the latest incident report."
  ],
  "model": "claude-sonnet-4-6",
  "dry_run": false
}

Responses

Status	Meaning
`200`	Batch execution results with per-prompt status and aggregate counts
`400`	Invalid model or no adapter available for the requested provider
`422`	Validation error (empty prompts list, prompt too long, or too many prompts)

200 response shape — BatchResponse

Field	Type	Required	Description
`results`	`BatchItemResponse[]`	✓	—
`total`	`integer`	✓	—
`succeeded`	`integer`	✓	—
`failed`	`integer`	✓	—

Example response:

{
  "results": [
    {
      "prompt": "Summarize the latest incident report.",
      "answer": "string",
      "status": "string"
    }
  ],
  "total": 0,
  "succeeded": 0,
  "failed": 0
}
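
Because a single request accepts at most 20 prompts, a client with a longer list has to split it. The sketch below builds one request body per chunk; the helper name `build_batch_payloads` is hypothetical, and sending the bodies is left to whichever HTTP client you use.

```python
import json

MAX_PROMPTS_PER_BATCH = 20  # limit stated in the endpoint description


def build_batch_payloads(prompts, model="claude-sonnet-4-6", dry_run=False):
    """Split prompts into request bodies that each respect the 20-prompt limit."""
    if not prompts:
        raise ValueError("prompts must not be empty (the API answers 422)")
    return [
        {
            "prompts": prompts[i:i + MAX_PROMPTS_PER_BATCH],
            "model": model,
            "dry_run": dry_run,
        }
        for i in range(0, len(prompts), MAX_PROMPTS_PER_BATCH)
    ]


payloads = build_batch_payloads([f"Prompt {n}" for n in range(45)], dry_run=True)
bodies = [json.dumps(p) for p in payloads]  # each one is a body for POST /api/v1/batch
```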

Compare

POST `/api/v1/compare`

Compare answers from multiple models

Runs the same prompt on each requested model concurrently. When analyze=true and at least two models succeed, an additional LLM call summarizes differences, strengths, and the best answer.

Request body

Field	Type	Required	Description
`prompt`	`string`	✓	—
`models`	`string[]`		—
`analyze`	`boolean`		(default: `true`)
`dry_run`	`boolean`		(default: `false`)

{
  "prompt": "Summarize the latest incident report.",
  "analyze": true,
  "dry_run": false
}

Responses

Status	Meaning
`200`	Per-model answers with optional comparative analysis and aggregate success counts
`422`	Validation error (empty models list or prompt too long)

200 response shape — CompareResponse

Field	Type	Required	Description
`answers`	`ModelAnswer[]`	✓	—
`analysis`	`string \| null`	✓	—
`total_models`	`integer`	✓	—
`succeeded`	`integer`	✓	—
`failed`	`integer`	✓	—

Example response:

{
  "answers": [
    {
      "model_id": "abc123",
      "answer": "string",
      "status": "string"
    }
  ],
  "analysis": "string",
  "total_models": 0,
  "succeeded": 0,
  "failed": 0
}
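
As a rough illustration of the rule above (analysis runs only when `analyze` is true and at least two models succeed), the sketch below builds a request body and predicts whether `analysis` should be populated. Both helper names are hypothetical, and the check relies only on the `succeeded` count from the response.

```python
def build_compare_payload(prompt, models=None, analyze=True, dry_run=False):
    """Build a /api/v1/compare body, omitting `models` when the caller gives none."""
    body = {"prompt": prompt, "analyze": analyze, "dry_run": dry_run}
    if models is not None:
        if not models:
            raise ValueError("models must not be empty (the API answers 422)")
        body["models"] = list(models)
    return body


def analysis_expected(payload, response):
    """True when the documented conditions for the extra analysis call are met."""
    return bool(payload["analyze"]) and response["succeeded"] >= 2


payload = build_compare_payload("Summarize the latest incident report.")
one_success = {"succeeded": 1, "failed": 1, "total_models": 2, "analysis": None}
two_successes = {"succeeded": 2, "failed": 0, "total_models": 2, "analysis": "string"}
```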

Consensus

POST `/api/v1/consensus`

Run iterative consensus V1

Generates multiple response variations per round, clusters them by semantic similarity, and iterates until the top cluster reaches the consensus target. Requires an embeddings client to be configured.

Request body

Field	Type	Required	Description
`prompt`	`string`	✓	—
`model`	`string`		(default: `"claude-sonnet-4-6"`)
`similarity_threshold`	`number`		(default: `0.85`)
`consensus_target`	`number`		(default: `0.95`)
`max_rounds`	`integer`		(default: `5`)
`variations_per_round`	`integer`		(default: `3`)
`use_confidence_weighting`	`boolean`		(default: `true`)
`dry_run`	`boolean`		(default: `false`)

{
  "prompt": "Summarize the latest incident report.",
  "model": "claude-sonnet-4-6",
  "similarity_threshold": 0.85,
  "consensus_target": 0.95,
  "max_rounds": 5,
  "variations_per_round": 3,
  "use_confidence_weighting": true,
  "dry_run": false
}

Responses

Status	Meaning
`200`	Consensus result with score, cluster count, best response, and round statistics
`400`	Invalid model or no adapter available for the requested provider
`422`	Validation Error
`500`	Consensus execution failed (internal error)
`503`	Embeddings client is not configured

200 response shape — ConsensusResponse

Field	Type	Required	Description
`consensus_score`	`number`	✓	—
`num_clusters`	`integer`	✓	—
`best_response`	`string`	✓	—
`weighted_score`	`number \| null`	✓	—
`total_rounds`	`integer`	✓	—
`total_llm_calls`	`integer`	✓	—
`needs_debate`	`boolean`	✓	—
`top_cluster_size`	`integer`	✓	—

Example response:

{
  "consensus_score": 0,
  "num_clusters": 0,
  "best_response": "string",
  "weighted_score": 0,
  "total_rounds": 0,
  "total_llm_calls": 0,
  "needs_debate": false,
  "top_cluster_size": 0
}
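
To make the clustering idea concrete, here is a toy sketch of grouping responses by cosine similarity and checking the largest cluster against `consensus_target`. It is not the service's implementation: the greedy single-pass clustering, the definition of the score as the share of responses in the largest cluster, and the hand-written two-dimensional vectors standing in for real embeddings are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(embeddings, similarity_threshold=0.85):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of indices; the first index is the seed
    for i, vec in enumerate(embeddings):
        for members in clusters:
            if cosine(embeddings[members[0]], vec) >= similarity_threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def consensus_score(clusters):
    """Share of all responses that fall in the largest cluster (assumed definition)."""
    total = sum(len(c) for c in clusters)
    return max(len(c) for c in clusters) / total if total else 0.0


vectors = [(1.0, 0.0), (0.99, 0.05), (0.0, 1.0)]  # stand-ins for response embeddings
groups = cluster(vectors)
score = consensus_score(groups)
reached = score >= 0.95  # compare against consensus_target
```

In this toy run two of three responses agree, so the target is not reached and another round would be needed, up to `max_rounds`.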

Debate

POST `/api/v1/debate`

Run a Devil’s Advocate debate round

Generates an initial position on the topic, then runs a structured debate: challenge (Devil’s Advocate), defense, and improved final response. Optionally supply your own initial_position to skip the first LLM call.

Request body

Field	Type	Required	Description
`topic`	`string`	✓	—
`initial_position`	`string \| null`		—
`model`	`string`		(default: `"claude-sonnet-4-6"`)
`dry_run`	`boolean`		(default: `false`)

{
  "topic": "string",
  "model": "claude-sonnet-4-6",
  "dry_run": false
}

Responses

Status	Meaning
`200`	Debate transcript: initial position, challenge, defense, improved response, and call count
`400`	Invalid model or no adapter available for the requested provider
`422`	Validation Error

200 response shape — DebateResponse

Field	Type	Required	Description
`initial_position`	`string`	✓	—
`challenge`	`string`	✓	—
`defense`	`string`	✓	—
`improved_response`	`string`	✓	—
`model_id`	`string`	✓	—
`total_llm_calls`	`integer`	✓	—

Example response:

{
  "initial_position": "string",
  "challenge": "string",
  "defense": "string",
  "improved_response": "string",
  "model_id": "abc123",
  "total_llm_calls": 0
}

Health

GET `/api/v1/health/detailed`

Detailed adapter and embeddings health

Returns per-adapter status (`ok` or `no_key`) and embeddings client state. Overall status is `healthy` only when all adapters and the embeddings client report `ok`. Protect this endpoint with `GRACEKELLY_API_KEY` in internet-facing deployments.

Responses

Status	Meaning
`200`	Detailed health with adapter list, embeddings status, and uptime

200 response shape — DetailedHealthResponse

Field	Type	Required	Description
`status`	`string`	✓	—
`uptime_seconds`	`integer`	✓	—
`adapters`	`AdapterStatus[]`	✓	—
`embeddings`	`EmbeddingsStatus`	✓	—
`total_adapters`	`integer`	✓	—

Example response:

{
  "status": "string",
  "uptime_seconds": 0,
  "adapters": [
    {
      "name": "string",
      "status": "string"
    }
  ],
  "embeddings": {
    "status": "string",
    "cache_size": 0
  },
  "total_adapters": 0
}
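
The overall-status rule can be restated in a few lines. This sketch assumes the embeddings object reports `ok` in its `status` field just as adapters do; the label `degraded` for the non-healthy case is an assumption, since the reference names only the healthy value.

```python
def overall_status(adapters, embeddings):
    """Return 'healthy' only when every adapter and the embeddings client report ok."""
    all_ok = (
        all(a["status"] == "ok" for a in adapters)
        and embeddings["status"] == "ok"
    )
    return "healthy" if all_ok else "degraded"  # "degraded" is an assumed label


good = overall_status(
    [{"name": "api", "status": "ok"}],
    {"status": "ok", "cache_size": 0},
)
bad = overall_status(
    [{"name": "api", "status": "ok"}, {"name": "browser", "status": "no_key"}],
    {"status": "ok", "cache_size": 0},
)
```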

GET `/api/v1/readiness`

Readiness

Responses

Status	Meaning
`200`	Successful Response

GET `/health`

Health

Responses

Status	Meaning
`200`	Successful Response

GET `/healthz/live`

Liveness

Responses

Status	Meaning
`200`	Successful Response

GET `/healthz/ready`

Readiness Probe

Responses

Status	Meaning
`200`	Successful Response

GET `/metrics`

Metrics

Responses

Status	Meaning
`200`	Successful Response

Models

GET `/api/v1/models`

List all registered models

Returns the full model catalog with availability status. Browser-backed models include live observation metadata: when the model menu was last checked and whether the model label was confirmed.

Responses

Status	Meaning
`200`	Model catalog with adapter kind, provider, availability status, and observation timestamps

POST `/api/v1/models/refresh`

Refresh model catalog

Returns the current model catalog snapshot with a `refreshed_at` timestamp. Browser model availability reflects the last Playwright observation; updating the model menu itself requires a live Perplexity query.

Responses

Status	Meaning
`200`	Successful Response

Orchestration

POST `/api/v1/orchestrate`

Submit a prompt for orchestrated execution

Submits a prompt to one or more models according to the specified execution plan. Supports dry-run, quorum, merge strategies, reasoning mode, and optional trace correlation. Executes synchronously and returns the final task snapshot for the completed request.

Request body

Field	Type	Required	Description
`prompt`	`string`	✓	—
`model`	`string \| null`		—
`models`	`string[]`		—
`adapter_hint`	`AdapterHint`		(default: `"auto"`)
`quorum`	`integer`		(default: `1`)
`merge_strategy`	`MergeStrategy`		(default: `"first_success"`)
`cancel_on_quorum`	`boolean`		(default: `true`)
`reasoning`	`boolean`		(default: `false`)
`metadata`	`{ [key]: object }`		—
`dry_run`	`boolean`		(default: `false`)
`decompose`	`boolean`		Enable automatic decomposition for complex prompts (default: `true`)
`session_id`	`string \| null`		Session ID for conversation chaining

{
  "prompt": "Summarize the latest incident report.",
  "adapter_hint": "auto",
  "quorum": 1,
  "merge_strategy": "first_success",
  "cancel_on_quorum": true,
  "reasoning": false,
  "dry_run": false,
  "decompose": true
}

Responses

Status	Meaning
`200`	Final task snapshot with execution plan, steps, and terminal status
`422`	Validation error (unsupported model, invalid merge strategy, quorum conflict)
`501`	Requested capability is not implemented
`503`	Storage temporarily unavailable
`504`	Orchestration timed out (GRACEKELLY_ORCHESTRATE_TIMEOUT_SECONDS exceeded)

200 response shape — OrchestrateResponse

Field	Type	Required	Description
`task_id`	`string`	✓	—
`status`	`string`	✓	—
`accepted_at`	`string <date-time>`	✓	—
`completed_at`	`string <date-time> \| null`		—
`duration_ms`	`integer \| null`		—
`execution_mode`	`string`	✓	—
`adapter_name`	`string`	✓	—
`failure_code`	`string \| null`		—
`failure_message`	`string \| null`		—
`model`	`ModelView \| null`		—
`requested_models`	`ModelView[]`	✓	—
`output_text`	`string \| null`		—
`was_decomposed`	`boolean`		(default: `false`)
`subtask_count`	`integer`		(default: `0`)

Example response:

{
  "task_id": "abc123",
  "status": "string",
  "accepted_at": "2026-01-01T00:00:00Z",
  "execution_mode": "string",
  "adapter_name": "string",
  "requested_models": [
    {
      "id": "abc123",
      "display_name": "string"
    }
  ],
  "was_decomposed": false,
  "subtask_count": 0
}
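
A request to this endpoint might be prepared as in the sketch below, which uses only the Python standard library and does not send anything. `BASE_URL` is a hypothetical host and port, and treating `503` as the only status worth retrying is an assumption rather than documented guidance.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical host and port

# One-line reminders for the documented status codes.
STATUS_HINTS = {
    200: "final task snapshot returned",
    422: "fix the request (model, merge strategy, or quorum)",
    501: "requested capability is not implemented",
    503: "storage temporarily unavailable",
    504: "orchestration timed out",
}


def build_orchestrate_request(prompt, models=None, quorum=1,
                              merge_strategy="first_success"):
    """Prepare (but do not send) a POST to /api/v1/orchestrate."""
    body = {"prompt": prompt, "quorum": quorum, "merge_strategy": merge_strategy}
    if models:
        body["models"] = list(models)
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/orchestrate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def should_retry(status_code):
    """Assumption: only 'temporarily unavailable' is worth an automatic retry."""
    return status_code == 503


req = build_orchestrate_request(
    "Summarize the latest incident report.",
    models=["claude-sonnet-4-6"],
)
```

Passing `req` to `urllib.request.urlopen` would perform the call; because the endpoint executes synchronously, a generous client-side timeout is likely needed.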

POST `/api/v1/orchestrate/upload`

Submit a prompt with file uploads for orchestrated execution

Responses

Status	Meaning
`200`	Successful Response
`422`	Validation Error

200 response shape — OrchestrateResponse

Field	Type	Required	Description
`task_id`	`string`	✓	—
`status`	`string`	✓	—
`accepted_at`	`string <date-time>`	✓	—
`completed_at`	`string <date-time> \| null`		—
`duration_ms`	`integer \| null`		—
`execution_mode`	`string`	✓	—
`adapter_name`	`string`	✓	—
`failure_code`	`string \| null`		—
`failure_message`	`string \| null`		—
`model`	`ModelView \| null`		—
`requested_models`	`ModelView[]`	✓	—
`output_text`	`string \| null`		—
`was_decomposed`	`boolean`		(default: `false`)
`subtask_count`	`integer`		(default: `0`)

Example response:

{
  "task_id": "abc123",
  "status": "string",
  "accepted_at": "2026-01-01T00:00:00Z",
  "execution_mode": "string",
  "adapter_name": "string",
  "requested_models": [
    {
      "id": "abc123",
      "display_name": "string"
    }
  ],
  "was_decomposed": false,
  "subtask_count": 0
}

GET `/api/v1/tasks`

List recent tasks

Returns a paginated list of recent tasks with step and event summaries. Supports filtering by status, execution mode, dry_run flag, and failure code. Use the before cursor (ISO timestamp) for keyset pagination.

Responses

Status	Meaning
`200`	List of task summaries ordered by accepted_at descending
`422`	Validation Error
`503`	Storage temporarily unavailable
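
Keyset pagination with the `before` cursor can be sketched as a loop that passes the oldest `accepted_at` seen so far back to the server. The sketch makes several assumptions: the response is a plain list of summaries, `before` is exclusive, and the page size is controlled by a `limit` query parameter (that name is hypothetical). An in-memory function stands in for the real endpoint so the example runs offline.

```python
from urllib.parse import parse_qsl, urlencode


def iter_tasks(fetch_page, limit=2):
    """Yield task summaries newest first, following the `before` cursor.

    fetch_page takes a query string and returns a list of task summaries.
    """
    before = None
    while True:
        params = {"limit": limit}  # "limit" is an assumed parameter name
        if before is not None:
            params["before"] = before
        page = fetch_page(urlencode(params))
        yield from page
        if len(page) < limit:
            return  # a short page means there is nothing older
        before = page[-1]["accepted_at"]


# In-memory stand-in for GET /api/v1/tasks, ordered by accepted_at descending.
TASKS = [
    {"task_id": f"t{i}", "accepted_at": f"2026-01-01T00:00:{59 - i:02d}Z"}
    for i in range(5)
]


def fake_fetch(query):
    params = dict(parse_qsl(query))
    before = params.get("before")
    older = [t for t in TASKS if before is None or t["accepted_at"] < before]
    return older[: int(params["limit"])]


task_ids = [t["task_id"] for t in iter_tasks(fake_fetch, limit=2)]
```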

GET `/api/v1/tasks/{task_id}`

Get full task detail

Returns the complete execution context for a task: plan scalars, all steps, and events. Events are paginated via events_limit / events_offset query parameters.

Responses

Status	Meaning
`200`	Full task view including steps and paginated events
`404`	Task not found
`422`	Validation Error
`503`	Storage temporarily unavailable

200 response shape — TaskView

Field	Type	Required	Description
`task_id`	`string`	✓	—
`status`	`string`	✓	—
`accepted_at`	`string <date-time>`	✓	—
`completed_at`	`string <date-time> \| null`		—
`duration_ms`	`integer \| null`		—
`execution_mode`	`string`	✓	—
`adapter_name`	`string`	✓	—
`failure_code`	`string \| null`		—
`failure_message`	`string \| null`		—
`model`	`ModelView \| null`		—
`requested_models`	`ModelView[]`	✓	—
`output_text`	`string \| null`		—
`was_decomposed`	`boolean`		(default: `false`)
`subtask_count`	`integer`		(default: `0`)
`prompt`	`string`	✓	—
`reasoning`	`boolean`	✓	—
`metadata`	`{ [key]: object }`	✓	—
`quorum`	`integer`	✓	—
`merge_strategy`	`string`	✓	—
`adapter_hint`	`string`	✓	—
`cancel_on_quorum`	`boolean`	✓	—
`retry_of_task_id`	`string \| null`		—
`winning_step_index`	`integer \| null`		—
`cancelled_steps`	`integer[]`		—
`cancel_reason`	`string \| null`		—
`execution_details`	`{ [key]: object }`		—
`steps`	`TaskStepView[]`		—
`events`	`TaskEventView[]`		—
`events_total`	`integer \| null`		—

Example response:

{
  "task_id": "abc123",
  "status": "string",
  "accepted_at": "2026-01-01T00:00:00Z",
  "execution_mode": "string",
  "adapter_name": "string",
  "requested_models": [
    {
      "id": "abc123",
      "display_name": "string"
    }
  ],
  "was_decomposed": false,
  "subtask_count": 0,
  "prompt": "Summarize the latest incident report.",
  "reasoning": false,
  "metadata": {},
  "quorum": 2,
  "merge_strategy": "string",
  "adapter_hint": "string",
  "cancel_on_quorum": false
}
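
Reading every event of a long task means stepping `events_offset` forward until `events_total` is covered. The sketch below only computes the relative URLs; the helper name and the page size of 100 are assumptions.

```python
from urllib.parse import quote, urlencode


def event_page_urls(task_id, events_total, events_limit=100):
    """Return the relative URLs needed to read every event of one task."""
    base = f"/api/v1/tasks/{quote(task_id, safe='')}"
    return [
        base + "?" + urlencode({"events_limit": events_limit, "events_offset": offset})
        for offset in range(0, events_total, events_limit)
    ]


urls = event_page_urls("abc123", events_total=250, events_limit=100)
```

Here `events_total` would come from the first response, so a real client fetches page one, reads the total, and then requests the remaining pages.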

GET `/api/v1/tasks/{task_id}/export`

Export task as Markdown

Responses

Status	Meaning
`200`	Task exported as Markdown
`404`	Task not found
`422`	Validation Error
`503`	Storage temporarily unavailable

POST `/api/v1/tasks/{task_id}/retry`

Retry a failed or cancelled task

Synchronously creates and executes a new task that replays the original prompt and execution plan. Only tasks with status failed or cancelled can be retried. The new task carries a retry_of_task_id link back to the original.

Responses

Status	Meaning
`200`	Final task snapshot for the retry linked back to the original task
`404`	Original task not found
`409`	Task status does not allow retry (not failed or cancelled)
`422`	Validation error reconstructing the retry request
`503`	Storage temporarily unavailable

200 response shape — OrchestrateResponse

Field	Type	Required	Description
`task_id`	`string`	✓	—
`status`	`string`	✓	—
`accepted_at`	`string <date-time>`	✓	—
`completed_at`	`string <date-time> \| null`		—
`duration_ms`	`integer \| null`		—
`execution_mode`	`string`	✓	—
`adapter_name`	`string`	✓	—
`failure_code`	`string \| null`		—
`failure_message`	`string \| null`		—
`model`	`ModelView \| null`		—
`requested_models`	`ModelView[]`	✓	—
`output_text`	`string \| null`		—
`was_decomposed`	`boolean`		(default: `false`)
`subtask_count`	`integer`		(default: `0`)

Example response:

{
  "task_id": "abc123",
  "status": "string",
  "accepted_at": "2026-01-01T00:00:00Z",
  "execution_mode": "string",
  "adapter_name": "string",
  "requested_models": [
    {
      "id": "abc123",
      "display_name": "string"
    }
  ],
  "was_decomposed": false,
  "subtask_count": 0
}
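
A client can avoid a `409` by checking the task status before asking for a retry. A minimal sketch, assuming the status strings are exactly `failed` and `cancelled` as written in the description above; the helper name is hypothetical.

```python
RETRYABLE_STATUSES = {"failed", "cancelled"}  # per the endpoint description


def retry_path(task):
    """Return the retry URL for a task, or None when its status does not allow retry."""
    if task["status"] not in RETRYABLE_STATUSES:
        return None
    return f"/api/v1/tasks/{task['task_id']}/retry"


failed_task = {"task_id": "abc123", "status": "failed"}
done_task = {"task_id": "def456", "status": "completed"}
paths = [retry_path(failed_task), retry_path(done_task)]
```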

Pipeline

POST `/api/v1/pipeline`

Execute a reliability-level pipeline

Runs a prompt through a reliability-level-selected execution pattern. Set multi_model=true to fan out across all configured API providers and aggregate results.

Request body

Field	Type	Required	Description
`prompt`	`string`	✓	—
`model`	`string`		(default: `"claude-sonnet-4-6"`)
`reliability_level`	`string \| null`		—
`multi_model`	`boolean`		(default: `false`)
`dry_run`	`boolean`		(default: `false`)

{
  "prompt": "Summarize the latest incident report.",
  "model": "claude-sonnet-4-6",
  "multi_model": false,
  "dry_run": false
}

Responses

Status	Meaning
`200`	Pipeline answer with pattern, reliability level, model list, and LLM call count
`400`	Invalid model, unknown reliability level, or no adapter available
`422`	Validation Error

200 response shape — PipelineResponse

Field	Type	Required	Description
`answer`	`string`	✓	—
`task_type`	`string`	✓	—
`pattern_used`	`string`	✓	—
`reliability_level`	`string`	✓	—
`total_llm_calls`	`integer`	✓	—
`model_id`	`string`	✓	—
`models_used`	`string[]`		—

Example response:

{
  "answer": "string",
  "task_type": "string",
  "pattern_used": "string",
  "reliability_level": "string",
  "total_llm_calls": 0,
  "model_id": "abc123"
}

Smart

POST `/api/v1/smart`

Auto-routing smart execution

Classifies the prompt, assesses complexity, and selects the optimal execution pattern (single call, consensus V1, role-based, or decomposition) automatically. Override via reliability_level (quick/standard/high) or an explicit pattern name.

Request body

Field	Type	Required	Description
`prompt`	`string`	✓	—
`model`	`string`		(default: `"claude-sonnet-4-6"`)
`reliability_level`	`string \| null`		—
`pattern`	`string \| null`		—
`dry_run`	`boolean`		(default: `false`)

{
  "prompt": "Summarize the latest incident report.",
  "model": "claude-sonnet-4-6",
  "dry_run": false
}

Responses

Status	Meaning
`200`	Answer with routing metadata: pattern used, complexity level, LLM call count
`400`	Invalid model, unknown pattern/reliability level, or conflicting options
`422`	Validation Error

200 response shape — SmartResponse

Field	Type	Required	Description
`answer`	`string`	✓	—
`task_type`	`string`	✓	—
`complexity_level`	`string`	✓	—
`pattern_used`	`string`	✓	—
`reliability_level`	`string`	✓	—
`was_decomposed`	`boolean`	✓	—
`used_consensus`	`boolean`	✓	—
`used_roles`	`boolean`	✓	—
`total_llm_calls`	`integer`	✓	—
`model_id`	`string`	✓	—

Example response:

{
  "answer": "string",
  "task_type": "string",
  "complexity_level": "string",
  "pattern_used": "string",
  "reliability_level": "string",
  "was_decomposed": false,
  "used_consensus": false,
  "used_roles": false,
  "total_llm_calls": 0,
  "model_id": "abc123"
}
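
The override fields can be validated before the request is sent. This sketch checks `reliability_level` against the three values named above and leaves optional fields out of the body when unset. The helper name is hypothetical, and the pattern name `consensus` in the usage line is a placeholder, since the valid pattern identifiers are not listed here.

```python
RELIABILITY_LEVELS = ("quick", "standard", "high")  # values named in the description


def build_smart_payload(prompt, reliability_level=None, pattern=None,
                        model="claude-sonnet-4-6", dry_run=False):
    """Build a /api/v1/smart body, rejecting unknown reliability levels early."""
    if reliability_level is not None and reliability_level not in RELIABILITY_LEVELS:
        raise ValueError(f"unknown reliability_level: {reliability_level!r}")
    body = {"prompt": prompt, "model": model, "dry_run": dry_run}
    if reliability_level is not None:
        body["reliability_level"] = reliability_level
    if pattern is not None:
        body["pattern"] = pattern  # pattern names are not validated here
    return body


auto = build_smart_payload("Summarize the latest incident report.")
forced = build_smart_payload("Summarize the latest incident report.",
                             reliability_level="high")
explicit = build_smart_payload("Summarize the latest incident report.",
                               pattern="consensus")  # placeholder pattern name
```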

POST `/api/v1/smart/v2`

Auto-routing smart execution with Consensus V2

Like /smart but uses the Consensus V2 engine when consensus is required: HAC clustering, cross-pollination, debate rounds, divergence handling, and dissenting-view extraction. Returns cluster confidence and dissenting perspectives alongside the best answer.

Request body

Field	Type	Required	Description
`prompt`	`string`	✓	—
`model`	`string`		(default: `"claude-sonnet-4-6"`)
`reliability_level`	`string \| null`		—
`pattern`	`string \| null`		—
`dry_run`	`boolean`		(default: `false`)

{
  "prompt": "Summarize the latest incident report.",
  "model": "claude-sonnet-4-6",
  "dry_run": false
}

Responses

Status	Meaning
`200`	Answer with V2 consensus metadata: cluster confidence, dissenting views, consensus score
`400`	Invalid model, unknown pattern/reliability level, or conflicting options
`422`	Validation Error

200 response shape — SmartV2Response

Field	Type	Required	Description
`answer`	`string`	✓	—
`task_type`	`string`	✓	—
`complexity_level`	`string`	✓	—
`pattern_used`	`string`	✓	—
`reliability_level`	`string`	✓	—
`was_decomposed`	`boolean`	✓	—
`used_consensus`	`boolean`	✓	—
`used_roles`	`boolean`	✓	—
`total_llm_calls`	`integer`	✓	—
`model_id`	`string`	✓	—
`consensus_status`	`string \| null`	✓	—
`consensus_score`	`number \| null`	✓	—
`cluster_confidence`	`number \| null`	✓	—
`dissenting_views`	`DissentingViewResponse[]`	✓	—

Example response:

{
  "answer": "string",
  "task_type": "string",
  "complexity_level": "string",
  "pattern_used": "string",
  "reliability_level": "string",
  "was_decomposed": false,
  "used_consensus": false,
  "used_roles": false,
  "total_llm_calls": 0,
  "model_id": "abc123",
  "consensus_status": "string",
  "consensus_score": 0,
  "cluster_confidence": 0,
  "dissenting_views": [
    {
      "perspective": "string",
      "support_ratio": 0
    }
  ]
}
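
Dissenting views are easiest to read when ordered by support. The sketch below formats them for display, assuming `support_ratio` is a fraction between 0 and 1; the helper name and the sample perspectives are made up for illustration.

```python
def summarize_dissent(response, min_support=0.0):
    """List dissenting perspectives, strongest first, as 'NN% text' strings."""
    views = sorted(
        response["dissenting_views"],
        key=lambda v: v["support_ratio"],
        reverse=True,
    )
    return [
        f"{v['support_ratio']:.0%} {v['perspective']}"
        for v in views
        if v["support_ratio"] >= min_support
    ]


sample_response = {
    "answer": "string",
    "dissenting_views": [
        {"perspective": "Root cause was the cache", "support_ratio": 0.1},
        {"perspective": "Root cause was the deploy", "support_ratio": 0.25},
    ],
}
lines = summarize_dissent(sample_response)
```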

Streaming

POST `/api/v1/orchestrate/stream`

Orchestrate Stream

Request body

Field	Type	Required	Description
`prompt`	`string`	✓	—
`model`	`string \| null`		—
`models`	`string[]`		—
`adapter_hint`	`AdapterHint`		(default: `"auto"`)
`quorum`	`integer`		(default: `1`)
`merge_strategy`	`MergeStrategy`		(default: `"first_success"`)
`cancel_on_quorum`	`boolean`		(default: `true`)
`reasoning`	`boolean`		(default: `false`)
`metadata`	`{ [key]: object }`		—
`dry_run`	`boolean`		(default: `false`)
`decompose`	`boolean`		Enable automatic decomposition for complex prompts (default: `true`)
`session_id`	`string \| null`		Session ID for conversation chaining

{
  "prompt": "Summarize the latest incident report.",
  "adapter_hint": "auto",
  "quorum": 1,
  "merge_strategy": "first_success",
  "cancel_on_quorum": true,
  "reasoning": false,
  "dry_run": false,
  "decompose": true
}

Responses

Status	Meaning
`200`	Successful Response
`422`	Validation Error
