API reference

This catalog documents 23 endpoints across 11 groups. All request and response bodies are application/json unless noted.

Model performance analytics

Aggregates execution statistics from the last 100 tasks: success rate, average duration, and failure counts per model. Returns a ranked top-5 list by success rate alongside full per-model stats.

Responses

Status  Meaning
200     Aggregate model performance stats with top-5 ranking
503     Storage unavailable while reading task/step records

200 response shape: AnalyticsResponse

Field             Type              Required  Description
total_models      integer
total_executions  integer
models            ModelStatsView[]
top_models        ModelStatsView[]

Example response:

{
"total_models": 0,
"total_executions": 0,
"models": [
{
"model_id": "abc123",
"total_executions": 0,
"successful": 0,
"failed": 0,
"success_rate": 0,
"avg_duration_ms": 0
}
],
"top_models": [
{
"model_id": "abc123",
"total_executions": 0,
"successful": 0,
"failed": 0,
"success_rate": 0,
"avg_duration_ms": 0
}
]
}
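
The aggregation described for this endpoint can be sketched in a few lines. This is a minimal illustration, not the service's implementation: the task record keys (`succeeded`, `duration_ms`) and the choice to express `success_rate` as a fraction between 0 and 1 are assumptions, as is the ordering of `tasks` from oldest to newest.

```python
from collections import defaultdict


def aggregate_model_stats(tasks, window=100, top_n=5):
    """Summarize the most recent `window` task records per model."""
    recent = tasks[-window:]  # assumes `tasks` is ordered oldest to newest
    buckets = defaultdict(list)
    for task in recent:
        buckets[task["model_id"]].append(task)

    models = []
    for model_id, rows in buckets.items():
        total = len(rows)
        successful = sum(1 for r in rows if r["succeeded"])  # hypothetical key
        models.append({
            "model_id": model_id,
            "total_executions": total,
            "successful": successful,
            "failed": total - successful,
            "success_rate": successful / total,  # assumed to be a 0..1 fraction
            "avg_duration_ms": sum(r["duration_ms"] for r in rows) / total,
        })

    # Rank by success rate, highest first, and keep the top entries.
    ranked = sorted(models, key=lambda m: m["success_rate"], reverse=True)
    return {
        "total_models": len(models),
        "total_executions": len(recent),
        "models": models,
        "top_models": ranked[:top_n],
    }
```

Both `models` and `top_models` hold the same per-model objects, which matches the ModelStatsView[] type listed for both fields.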

Execute multiple prompts in parallel

Runs a single model on up to 20 prompts concurrently. Returns per-prompt results with individual success/failure status.

Request body

Field    Type      Required  Description
prompts  string[]
model    string              default: "claude-sonnet-4-6"
dry_run  boolean             default: false

Example request:

{
"prompts": [
"Summarize the latest incident report."
],
"model": "claude-sonnet-4-6",
"dry_run": false
}

Responses

Status  Meaning
200     Batch execution results with per-prompt status and aggregate counts
400     Invalid model or no adapter available for the requested provider
422     Validation error (empty prompts list, prompt too long, or too many prompts)

200 response shape: BatchResponse

Field      Type                 Required  Description
results    BatchItemResponse[]
total      integer
succeeded  integer
failed     integer

Example response:

{
"results": [
{
"prompt": "Summarize the latest incident report.",
"answer": "string",
"status": "string"
}
],
"total": 0,
"succeeded": 0,
"failed": 0
}
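
Because the endpoint accepts at most 20 prompts and rejects an empty list with 422, a client may want to validate and split its input before sending. The sketch below only builds request bodies; the helper names are hypothetical and no HTTP call is made, since the endpoint path is not stated here.

```python
MAX_PROMPTS = 20  # limit stated in the endpoint description


def build_batch_request(prompts, model="claude-sonnet-4-6", dry_run=False):
    """Return a request body, rejecting inputs the API would answer with 422."""
    if not prompts:
        raise ValueError("prompts must not be empty")
    if len(prompts) > MAX_PROMPTS:
        raise ValueError(f"at most {MAX_PROMPTS} prompts per request")
    return {"prompts": list(prompts), "model": model, "dry_run": dry_run}


def chunk_prompts(prompts, size=MAX_PROMPTS):
    """Split a long prompt list into batches of at most `size` prompts."""
    return [prompts[i:i + size] for i in range(0, len(prompts), size)]
```

A caller with more than 20 prompts would run `chunk_prompts` first and send one request per chunk.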

Compare answers from multiple models

Runs the same prompt on each requested model concurrently. When analyze=true and at least two models succeed, an additional LLM call summarizes differences, strengths, and the best answer.

Request body

Field    Type      Required  Description
prompt   string
models   string[]
analyze  boolean             default: true
dry_run  boolean             default: false

Example request:

{
"prompt": "Summarize the latest incident report.",
"models": ["claude-sonnet-4-6", "gpt-5.4"],
"analyze": true,
"dry_run": false
}

Responses

Status  Meaning
200     Per-model answers with optional comparative analysis and aggregate success counts
422     Validation error (empty models list or prompt too long)

200 response shape: CompareResponse

Field         Type           Required  Description
answers       ModelAnswer[]
analysis      string | null
total_models  integer
succeeded     integer
failed        integer

Example response:

{
"answers": [
{
"model_id": "abc123",
"answer": "string",
"status": "string"
}
],
"analysis": "string",
"total_models": 0,
"succeeded": 0,
"failed": 0
}
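
The fan-out and the "analyze only when at least two models succeed" rule can be sketched with asyncio. This is an illustration under assumptions: `call_model` and `summarize` are hypothetical async callables supplied by the caller, and the status strings "ok" and "error" are placeholders because the real status values are not listed here.

```python
import asyncio


async def compare(prompt, models, call_model, analyze=True, summarize=None):
    """Run `prompt` on every model concurrently and optionally analyze."""

    async def run(model_id):
        try:
            answer = await call_model(model_id, prompt)
            return {"model_id": model_id, "answer": answer, "status": "ok"}
        except Exception as exc:  # one failing model must not fail the batch
            return {"model_id": model_id, "answer": str(exc), "status": "error"}

    answers = await asyncio.gather(*(run(m) for m in models))
    succeeded = [a for a in answers if a["status"] == "ok"]

    analysis = None
    # The extra LLM call only happens with two or more successful answers.
    if analyze and summarize is not None and len(succeeded) >= 2:
        analysis = await summarize(prompt, succeeded)

    return {
        "answers": list(answers),
        "analysis": analysis,
        "total_models": len(models),
        "succeeded": len(succeeded),
        "failed": len(answers) - len(succeeded),
    }
```

With a single successful answer, `analysis` stays null, which is consistent with the `string | null` type in the response shape.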

Run iterative consensus V1

Generates multiple response variations per round, clusters them by semantic similarity, and iterates until the top cluster reaches the consensus target. Requires an embeddings client to be configured.

Request body

Field                     Type     Required  Description
prompt                    string
model                     string             default: "claude-sonnet-4-6"
similarity_threshold      number             default: 0.85
consensus_target          number             default: 0.95
max_rounds                integer            default: 5
variations_per_round      integer            default: 3
use_confidence_weighting  boolean            default: true
dry_run                   boolean            default: false

Example request:

{
"prompt": "Summarize the latest incident report.",
"model": "claude-sonnet-4-6",
"similarity_threshold": 0.85,
"consensus_target": 0.95,
"max_rounds": 5,
"variations_per_round": 3,
"use_confidence_weighting": true,
"dry_run": false
}

Responses

Status  Meaning
200     Consensus result with score, cluster count, best response, and round statistics
400     Invalid model or no adapter available for the requested provider
422     Validation Error
500     Consensus execution failed (internal error)
503     Embeddings client is not configured

200 response shape: ConsensusResponse

Field             Type           Required  Description
consensus_score   number
num_clusters      integer
best_response     string
weighted_score    number | null
total_rounds      integer
total_llm_calls   integer
needs_debate      boolean
top_cluster_size  integer

Example response:

{
"consensus_score": 0,
"num_clusters": 0,
"best_response": "string",
"weighted_score": 0,
"total_rounds": 0,
"total_llm_calls": 0,
"needs_debate": false,
"top_cluster_size": 0
}
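
To make the round loop concrete, here is a minimal sketch of the generate, cluster, check cycle. It is not the service's algorithm: the greedy clustering against each cluster's first member, the definition of the consensus score as top-cluster size divided by total responses, and the accumulation of responses across rounds are all assumptions. `generate` and `embed` are hypothetical callables standing in for the LLM and the embeddings client, and confidence weighting is left out.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(embeddings, threshold):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of indices into `embeddings`
    for i, emb in enumerate(embeddings):
        for members in clusters:
            if cosine(embeddings[members[0]], emb) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def run_consensus(generate, embed, similarity_threshold=0.85,
                  consensus_target=0.95, max_rounds=5, variations_per_round=3):
    """Iterate until the largest cluster covers `consensus_target` of responses."""
    responses = []
    for round_no in range(1, max_rounds + 1):  # assumes max_rounds >= 1
        responses += [generate() for _ in range(variations_per_round)]
        # Re-embeds everything each round; a real client would cache vectors.
        clusters = cluster([embed(r) for r in responses], similarity_threshold)
        top = max(clusters, key=len)
        score = len(top) / len(responses)
        if score >= consensus_target:
            break
    return {
        "consensus_score": score,
        "num_clusters": len(clusters),
        "best_response": responses[top[0]],
        "total_rounds": round_no,
        "total_llm_calls": len(responses),
        "top_cluster_size": len(top),
    }
```

With the defaults, a prompt whose first three variations all land in one cluster finishes after a single round and three LLM calls.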

Run a Devil’s Advocate debate round

Generates an initial position on the topic, then runs a structured debate: challenge (Devil’s Advocate), defense, and improved final response. Optionally supply your own initial_position to skip the first LLM call.

Request body

Field             Type           Required  Description
topic             string
initial_position  string | null
model             string                   default: "claude-sonnet-4-6"
dry_run           boolean                  default: false

Example request:

{
"topic": "string",
"model": "claude-sonnet-4-6",
"dry_run": false
}

Responses

Status  Meaning
200     Debate transcript: initial position, challenge, defense, improved response, and call count
400     Invalid model or no adapter available for the requested provider
422     Validation Error

200 response shape: DebateResponse

Field              Type     Required  Description
initial_position   string
challenge          string
defense            string
improved_response  string
model_id           string
total_llm_calls    integer

Example response:

{
"initial_position": "string",
"challenge": "string",
"defense": "string",
"improved_response": "string",
"model_id": "abc123",
"total_llm_calls": 0
}
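
The four debate stages, and the effect of supplying `initial_position`, can be sketched as follows. The prompt wording is invented for illustration and `llm` is a hypothetical callable that takes a prompt and returns text. The call counts (four, or three when the initial position is supplied) are inferred from the description above rather than confirmed.

```python
def run_debate(topic, llm, initial_position=None):
    """Run challenge, defense, and improvement around an initial position."""
    calls = 0

    def ask(prompt):
        nonlocal calls
        calls += 1
        return llm(prompt)

    # Supplying a position skips the first LLM call.
    position = initial_position or ask(f"State a position on: {topic}")
    challenge = ask(f"As a Devil's Advocate, challenge this position: {position}")
    defense = ask(f"Defend the position against this challenge: {challenge}")
    improved = ask(
        "Write an improved response given the exchange below.\n"
        f"Position: {position}\nChallenge: {challenge}\nDefense: {defense}"
    )
    return {
        "initial_position": position,
        "challenge": challenge,
        "defense": defense,
        "improved_response": improved,
        "total_llm_calls": calls,
    }
```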

Detailed adapter and embeddings health

Returns per-adapter status (ok / no_key) and embeddings client state. Overall status is ‘healthy’ only when all adapters and the embeddings client report ok. Protect this endpoint with GRACEKELLY_API_KEY in internet-facing deployments.

Responses

Status  Meaning
200     Detailed health with adapter list, embeddings status, and uptime

200 response shape: DetailedHealthResponse

Field           Type              Required  Description
status          string
uptime_seconds  integer
adapters        AdapterStatus[]
embeddings      EmbeddingsStatus
total_adapters  integer

Example response:

{
"status": "string",
"uptime_seconds": 0,
"adapters": [
{
"name": "string",
"status": "string"
}
],
"embeddings": {
"status": "string",
"cache_size": 0
},
"total_adapters": 0
}
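
The "healthy only when everything reports ok" rule is easy to mirror on the client side, for example in a monitoring script. In this sketch the value "degraded" is a placeholder: the description names only "healthy", so the non-healthy status string is an assumption.

```python
def overall_status(adapters, embeddings):
    """Return "healthy" only if every adapter and the embeddings client are ok."""
    statuses = [a["status"] for a in adapters] + [embeddings["status"]]
    # "degraded" is a hypothetical label for any non-healthy state.
    return "healthy" if all(s == "ok" for s in statuses) else "degraded"


def adapters_missing_keys(adapters):
    """List adapter names that report the no_key status."""
    return [a["name"] for a in adapters if a["status"] == "no_key"]
```

A single adapter in the no_key state is enough to move the overall status away from healthy.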

Readiness

Responses

Status  Meaning
200     Successful Response

Health

Responses

Status  Meaning
200     Successful Response

Liveness

Responses

Status  Meaning
200     Successful Response

Readiness Probe

Responses

Status  Meaning
200     Successful Response

Metrics

Responses

Status  Meaning
200     Successful Response

List all registered models

Returns the full model catalog with availability status. Browser-backed models include live observation metadata: when the model menu was last checked and whether the model label was confirmed.

Responses

Status  Meaning
200     Model catalog with adapter kind, provider, availability status, and observation timestamps

Refresh model catalog

Returns the current model catalog snapshot with a refreshed_at timestamp. Browser model availability reflects the last Playwright observation. A live Perplexity query is needed to update the model menu itself.

Responses

Status  Meaning
200     Successful Response

Submit a prompt for orchestrated execution

Submits a prompt to one or more models according to the specified execution plan. Supports dry-run, quorum, merge strategies, reasoning mode, and optional trace correlation. Executes synchronously and returns the final task snapshot for the completed request.

Request body

Field             Type               Required  Description
prompt            string
model             string | null
models            string[]
adapter_hint      AdapterHint                  default: "auto"
quorum            integer                      default: 1
merge_strategy    MergeStrategy                default: "first_success"
cancel_on_quorum  boolean                      default: true
reasoning         boolean                      default: false
metadata          { [key]: object }
dry_run           boolean                      default: false
decompose         boolean                      Enable automatic decomposition for complex prompts (default: true)
session_id        string | null                Session ID for conversation chaining

Example request:

{
"prompt": "Summarize the latest incident report.",
"adapter_hint": "auto",
"quorum": 1,
"merge_strategy": "first_success",
"cancel_on_quorum": true,
"reasoning": false,
"dry_run": false,
"decompose": true
}

Responses

Status  Meaning
200     Final task snapshot with execution plan, steps, and terminal status
422     Validation error (unsupported model, invalid merge strategy, quorum conflict)
501     Requested capability is not implemented
503     Storage temporarily unavailable
504     Orchestration timed out (GRACEKELLY_ORCHESTRATE_TIMEOUT_SECONDS exceeded)

200 response shape: OrchestrateResponse

Field             Type                       Required  Description
task_id           string
status            string
accepted_at       string <date-time>
completed_at      string <date-time> | null
duration_ms       integer | null
execution_mode    string
adapter_name      string
failure_code      string | null
failure_message   string | null
model             ModelView | null
requested_models  ModelView[]
output_text       string | null
was_decomposed    boolean                              default: false
subtask_count     integer                              default: 0
subtask_countinteger(default: 0)

Example response:

{
"task_id": "abc123",
"status": "string",
"accepted_at": "2026-01-01T00:00:00Z",
"execution_mode": "string",
"adapter_name": "string",
"requested_models": [
{
"id": "abc123",
"display_name": "string"
}
],
"was_decomposed": false,
"subtask_count": 0
}
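
One way to read the `quorum`, `cancel_on_quorum`, and `first_success` options together is sketched below. It is an interpretation, not the orchestrator's code: `call_model` is a hypothetical async callable, the returned keys are illustrative rather than the OrchestrateResponse shape, and "first_success" is assumed to mean "keep the first successful answer".

```python
import asyncio


async def run_with_quorum(prompt, models, call_model, quorum=1, cancel_on_quorum=True):
    """Run `call_model` on every model and stop once `quorum` calls succeed."""
    tasks = {asyncio.create_task(call_model(m, prompt)): m for m in models}
    pending, successes, failures = set(tasks), [], []

    while pending and len(successes) < quorum:
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            if task.exception() is None:
                successes.append((tasks[task], task.result()))
            else:
                failures.append(tasks[task])

    cancelled = []
    if cancel_on_quorum and len(successes) >= quorum:
        for task in pending:
            task.cancel()
            cancelled.append(tasks[task])
    # Let cancelled or still-running calls settle before returning.
    await asyncio.gather(*pending, return_exceptions=True)

    return {
        "quorum_met": len(successes) >= quorum,
        # Assumed reading of "first_success": keep the first successful answer.
        "output_text": successes[0][1] if successes else None,
        "succeeded": [m for m, _ in successes],
        "failed": failures,
        "cancelled": cancelled,
    }
```

With `quorum=1` and three models, the slowest calls are cancelled as soon as one answer arrives, which is the behavior the `cancel_on_quorum` default suggests.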

Submit a prompt with file uploads for orchestrated execution

Responses

Status  Meaning
200     Successful Response
422     Validation Error

200 response shape: OrchestrateResponse

Field             Type                       Required  Description
task_id           string
status            string
accepted_at       string <date-time>
completed_at      string <date-time> | null
duration_ms       integer | null
execution_mode    string
adapter_name      string
failure_code      string | null
failure_message   string | null
model             ModelView | null
requested_models  ModelView[]
output_text       string | null
was_decomposed    boolean                              default: false
subtask_count     integer                              default: 0

Example response:

{
"task_id": "abc123",
"status": "string",
"accepted_at": "2026-01-01T00:00:00Z",
"execution_mode": "string",
"adapter_name": "string",
"requested_models": [
{
"id": "abc123",
"display_name": "string"
}
],
"was_decomposed": false,
"subtask_count": 0
}

List recent tasks

Returns a paginated list of recent tasks with step and event summaries. Supports filtering by status, execution mode, dry_run flag, and failure code. Use the before cursor (ISO timestamp) for keyset pagination.

Responses

Status  Meaning
200     List of task summaries ordered by accepted_at descending
422     Validation Error
503     Storage temporarily unavailable
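
Keyset pagination with the `before` cursor can be sketched as a generator that feeds the last `accepted_at` of each page into the next request. `fetch_page` is a hypothetical callable that wraps the HTTP call, and the `limit` parameter name is an assumption; only `before` is named in the description.

```python
def iter_tasks(fetch_page, page_size=50):
    """Yield every task, newest first, by walking the `before` cursor."""
    before = None  # no cursor on the first request
    while True:
        page = fetch_page(before=before, limit=page_size)
        if not page:
            return
        yield from page
        # Results are ordered by accepted_at descending, so the last item
        # on the page carries the cursor for the next request.
        before = page[-1]["accepted_at"]
```

Tasks that share an identical `accepted_at` timestamp across a page boundary could be skipped by a strict "before" comparison, so a client that needs exact results may want to deduplicate by `task_id`.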

Get full task detail

Returns the complete execution context for a task: plan scalars, all steps, and events. Events are paginated via events_limit / events_offset query parameters.

Responses

Status  Meaning
200     Full task view including steps and paginated events
404     Task not found
422     Validation Error
503     Storage temporarily unavailable

200 response shape: TaskView

Field               Type                       Required  Description
task_id             string
status              string
accepted_at         string <date-time>
completed_at        string <date-time> | null
duration_ms         integer | null
execution_mode      string
adapter_name        string
failure_code        string | null
failure_message     string | null
model               ModelView | null
requested_models    ModelView[]
output_text         string | null
was_decomposed      boolean                              default: false
subtask_count       integer                              default: 0
prompt              string
reasoning           boolean
metadata            { [key]: object }
quorum              integer
merge_strategy      string
adapter_hint        string
cancel_on_quorum    boolean
retry_of_task_id    string | null
winning_step_index  integer | null
cancelled_steps     integer[]
cancel_reason       string | null
execution_details   { [key]: object }
steps               TaskStepView[]
events              TaskEventView[]
events_total        integer | null

Example response:

{
"task_id": "abc123",
"status": "string",
"accepted_at": "2026-01-01T00:00:00Z",
"execution_mode": "string",
"adapter_name": "string",
"requested_models": [
{
"id": "abc123",
"display_name": "string"
}
],
"was_decomposed": false,
"subtask_count": 0,
"prompt": "Summarize the latest incident report.",
"reasoning": false,
"metadata": {},
"quorum": 2,
"merge_strategy": "string",
"adapter_hint": "string",
"cancel_on_quorum": false
}

Export task as Markdown

Responses

Status  Meaning
200     Task exported as Markdown
404     Task not found
422     Validation Error
503     Storage temporarily unavailable

Retry a failed or cancelled task

Synchronously creates and executes a new task that replays the original prompt and execution plan. Only tasks with status failed or cancelled can be retried. The new task carries a retry_of_task_id link back to the original.

Responses

Status  Meaning
200     Final task snapshot for the retry linked back to the original task
404     Original task not found
409     Task status does not allow retry (not failed or cancelled)
422     Validation error reconstructing the retry request
503     Storage temporarily unavailable

200 response shape: OrchestrateResponse

Field             Type                       Required  Description
task_id           string
status            string
accepted_at       string <date-time>
completed_at      string <date-time> | null
duration_ms       integer | null
execution_mode    string
adapter_name      string
failure_code      string | null
failure_message   string | null
model             ModelView | null
requested_models  ModelView[]
output_text       string | null
was_decomposed    boolean                              default: false
subtask_count     integer                              default: 0

Example response:

{
"task_id": "abc123",
"status": "string",
"accepted_at": "2026-01-01T00:00:00Z",
"execution_mode": "string",
"adapter_name": "string",
"requested_models": [
{
"id": "abc123",
"display_name": "string"
}
],
"was_decomposed": false,
"subtask_count": 0
}
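
The retry precondition maps directly onto the 404 and 409 responses listed above. The following sketch shows that mapping on plain dictionaries; the function name and the shape of the new task stub are hypothetical, and only `retry_of_task_id` and the retryable statuses come from the description.

```python
RETRYABLE_STATUSES = {"failed", "cancelled"}


def plan_retry(original, new_task_id):
    """Return (http_status, new_task_stub) for a retry request."""
    if original is None:
        return 404, None  # original task not found
    if original["status"] not in RETRYABLE_STATUSES:
        return 409, None  # status does not allow retry
    retry = {
        "task_id": new_task_id,
        "prompt": original["prompt"],  # replay the original prompt
        "retry_of_task_id": original["task_id"],  # link back to the original
    }
    return 200, retry
```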

Execute a reliability-level pipeline

Runs a prompt through a reliability-level-selected execution pattern. Set multi_model=true to fan out across all configured API providers and aggregate results.

Request body

Field              Type           Required  Description
prompt             string
model              string                   default: "claude-sonnet-4-6"
reliability_level  string | null
multi_model        boolean                  default: false
dry_run            boolean                  default: false

Example request:

{
"prompt": "Summarize the latest incident report.",
"model": "claude-sonnet-4-6",
"multi_model": false,
"dry_run": false
}

Responses

Status  Meaning
200     Pipeline answer with pattern, reliability level, model list, and LLM call count
400     Invalid model, unknown reliability level, or no adapter available
422     Validation Error

200 response shape: PipelineResponse

Field              Type      Required  Description
answer             string
task_type          string
pattern_used       string
reliability_level  string
total_llm_calls    integer
model_id           string
models_used        string[]

Example response:

{
"answer": "string",
"task_type": "string",
"pattern_used": "string",
"reliability_level": "string",
"total_llm_calls": 0,
"model_id": "abc123"
}

Auto-routing smart execution

Classifies the prompt, assesses complexity, and selects the optimal execution pattern (single call, consensus V1, role-based, or decomposition) automatically. Override via reliability_level (quick/standard/high) or an explicit pattern name.

Request body

Field              Type           Required  Description
prompt             string
model              string                   default: "claude-sonnet-4-6"
reliability_level  string | null
pattern            string | null
dry_run            boolean                  default: false

Example request:

{
"prompt": "Summarize the latest incident report.",
"model": "claude-sonnet-4-6",
"dry_run": false
}

Responses

Status  Meaning
200     Answer with routing metadata: pattern used, complexity level, LLM call count
400     Invalid model, unknown pattern/reliability level, or conflicting options
422     Validation Error

200 response shape: SmartResponse

Field              Type     Required  Description
answer             string
task_type          string
complexity_level   string
pattern_used       string
reliability_level  string
was_decomposed     boolean
used_consensus     boolean
used_roles         boolean
total_llm_calls    integer
model_id           string

Example response:

{
"answer": "string",
"task_type": "string",
"complexity_level": "string",
"pattern_used": "string",
"reliability_level": "string",
"was_decomposed": false,
"used_consensus": false,
"used_roles": false,
"total_llm_calls": 0,
"model_id": "abc123"
}

Auto-routing smart execution with Consensus V2

Like /smart but uses the Consensus V2 engine when consensus is required: HAC clustering, cross-pollination, debate rounds, divergence handling, and dissenting-view extraction. Returns cluster confidence and dissenting perspectives alongside the best answer.

Request body

Field              Type           Required  Description
prompt             string
model              string                   default: "claude-sonnet-4-6"
reliability_level  string | null
pattern            string | null
dry_run            boolean                  default: false

Example request:

{
"prompt": "Summarize the latest incident report.",
"model": "claude-sonnet-4-6",
"dry_run": false
}

Responses

Status  Meaning
200     Answer with V2 consensus metadata: cluster confidence, dissenting views, consensus score
400     Invalid model, unknown pattern/reliability level, or conflicting options
422     Validation Error

200 response shape: SmartV2Response

Field               Type                      Required  Description
answer              string
task_type           string
complexity_level    string
pattern_used        string
reliability_level   string
was_decomposed      boolean
used_consensus      boolean
used_roles          boolean
total_llm_calls     integer
model_id            string
consensus_status    string | null
consensus_score     number | null
cluster_confidence  number | null
dissenting_views    DissentingViewResponse[]

Example response:

{
"answer": "string",
"task_type": "string",
"complexity_level": "string",
"pattern_used": "string",
"reliability_level": "string",
"was_decomposed": false,
"used_consensus": false,
"used_roles": false,
"total_llm_calls": 0,
"model_id": "abc123",
"consensus_status": "string",
"consensus_score": 0,
"cluster_confidence": 0,
"dissenting_views": [
{
"perspective": "string",
"support_ratio": 0
}
]
}
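
A client that surfaces dissenting perspectives next to the best answer might order them by support. The helper below is a small sketch working on a parsed SmartV2Response; it assumes `support_ratio` is a fraction between 0 and 1, which is not stated here, and the function name is hypothetical.

```python
def summarize_dissent(response, min_support=0.0):
    """Return dissenting views as display strings, strongest support first."""
    views = response.get("dissenting_views") or []
    kept = [v for v in views if v["support_ratio"] >= min_support]
    kept.sort(key=lambda v: v["support_ratio"], reverse=True)
    # support_ratio is assumed to be a 0..1 fraction, shown as a percentage.
    return [f'{v["support_ratio"]:.0%} {v["perspective"]}' for v in kept]
```

Raising `min_support` filters out fringe views when consensus was not required and the list is long.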

Orchestrate Stream

Request body

Field             Type               Required  Description
prompt            string
model             string | null
models            string[]
adapter_hint      AdapterHint                  default: "auto"
quorum            integer                      default: 1
merge_strategy    MergeStrategy                default: "first_success"
cancel_on_quorum  boolean                      default: true
reasoning         boolean                      default: false
metadata          { [key]: object }
dry_run           boolean                      default: false
decompose         boolean                      Enable automatic decomposition for complex prompts (default: true)
session_id        string | null                Session ID for conversation chaining

Example request:

{
"prompt": "Summarize the latest incident report.",
"adapter_hint": "auto",
"quorum": 1,
"merge_strategy": "first_success",
"cancel_on_quorum": true,
"reasoning": false,
"dry_run": false,
"decompose": true
}

Responses

Status  Meaning
200     Successful Response
422     Validation Error