Project overview

GraceKelly

Multi-model LLM orchestrator powered by your Perplexity Pro subscription. Access GPT-5.4, Claude, Gemini, Kimi, and other models through Perplexity’s browser interface, then compare, debate, and reach consensus across models, all within a single subscription.

The current operating target is a single-user local deployment: browser execution via Perplexity is primary, and direct provider APIs remain optional fallbacks. New here? Jump to Documentation for setup and design docs.

Quick Start

1. Start the backend:

pip install -e ".[dev,browser]"
cp .env.example .env
# Set GRACEKELLY_BROWSER_ENABLED=true
# Set GRACEKELLY_BROWSER_AUTOMATION_BACKEND=playwright
# Set GRACEKELLY_EXECUTION_PROFILE=hybrid
# Set GRACEKELLY_BROWSER_PROFILE_DIR to your Chrome profile with Perplexity login
python -m uvicorn gracekelly.main:create_app --factory --host 127.0.0.1 --port 8011

2. Open the web UI:

Open http://localhost:8011

Docker alternative:

docker-compose up
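
Before opening the UI or pointing a client at the backend, it can help to wait until the service reports ready. The sketch below polls the /healthz/ready probe listed in the API table. It assumes the probe answers HTTP 200 once the service is ready, which this README does not state; `wait_until_ready` and `http_status` are illustrative names, not part of GraceKelly.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

BASE_URL = "http://127.0.0.1:8011"  # host and port from the uvicorn command above


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, and similar


def wait_until_ready(
    base_url: str = BASE_URL,
    attempts: int = 30,
    delay_seconds: float = 1.0,
    fetch: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll /healthz/ready until it returns 200 (assumed) or attempts run out."""
    url = f"{base_url}/healthz/ready"
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # no sleep after the final attempt
    return False
```

Calling `wait_until_ready()` with no arguments targets the local backend; the `fetch` and `sleep` parameters exist only so the loop can be exercised without a running server.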

Operating Risks

GraceKelly is a personal-use tool. Three risks are inherent to the design — read before relying on it.

Browser automation against Perplexity. The primary execution path drives the Perplexity Pro web UI through Playwright. Perplexity’s Terms of Service do not explicitly authorise automated access; Cloudflare bot detection may classify the traffic as automation and block the account. There is no public API fallback for the Perplexity-routed models — the API adapters here cover OpenAI and Anthropic only. Use at your own risk and avoid running multiple parallel sessions or aggressive polling.

Chrome profile lock. GRACEKELLY_BROWSER_PROFILE_DIR must point to a dedicated profile that is not in use by any other Chrome instance. Opening Perplexity manually in a Chrome window using the same profile causes BrowserProfileBusyError and cascades into the circuit breaker. The recommended setup is a profile created by scripts/bootstrap_chrome_profile.py, opened only by GraceKelly.
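
A pre-flight check along the following lines may catch a busy profile before GraceKelly starts. It is a heuristic sketch, not GraceKelly's own detection logic: it assumes the lock-file names that Chromium-based browsers conventionally create in the user data directory (SingletonLock on Linux and macOS, lockfile on Windows), and `profile_looks_busy` is a hypothetical helper.

```python
import os
from pathlib import Path

# Assumption: conventional Chromium lock-file names in the user data directory.
LOCK_FILE_NAMES = ("SingletonLock", "lockfile")


def profile_looks_busy(profile_dir: str) -> bool:
    """Heuristic: True if a Chrome lock file is present in profile_dir."""
    root = Path(profile_dir)
    # lexists() also reports dangling symlinks, which is how SingletonLock
    # usually appears on Linux and macOS.
    return any(os.path.lexists(root / name) for name in LOCK_FILE_NAMES)
```

A stale lock left behind by a crashed browser can produce a false positive, so treat a True result as a prompt to check for running Chrome processes rather than as proof.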

Perplexity UI drift. The browser adapter relies on CSS selectors for the model picker, response area, and authentication overlays. UI redesigns by Perplexity (typically every 2–3 months) require re-running the gracekelly-capture-perplexity-recon tool and updating selector constants. Symptoms of drift: model selection silently picks the wrong model, response polling returns empty, or the auth banner is not dismissed. See docs/perplexity-dom-recon.md.

Integration / Clients

Known local integrators target the V2 API surface on http://127.0.0.1:8011:

RAG_Support_Assistant: provider-aware support bot. Smoke harness: scripts/gracekelly_smoke.py in that repo (8 steps).
agent_toolkit: LangGraph agent building blocks. Migrated from V1 on 2026-04-25 (commits 09e2632…1276a06 in that repo). 66 unit + 10 integration tests.
juhub: AI Daily Debate scheduled at 08:30. Migrated from the V1 orchestrator on 2026-04-25. Requires V2 to be running before cron fires; no auto-start.

The legacy V1 orchestrator (:8001, /api/gk/*) was deprecated 2026-04-25.

Operations

Ecosystem smoke (one-shot health check across all clients)

.venv\Scripts\python scripts\ecosystem_smoke.py

Pre-flight check on :8011 → direct V2 sanity (/smart, /orchestrate) → RAG smoke (if :8000 reachable) → agent_toolkit integration tests (if repo present) → juhub debate dry-run (if repo present). Exits 0 if all PASS or SKIP, 1 on first FAIL. CLI flags: --skip-rag, --skip-agent-toolkit, --skip-juhub, --verbose.
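
The exit-code rule described above (0 when every stage passes or is skipped, 1 at the first failure) can be sketched as a small stage runner. This is an illustration of the rule only, not the contents of scripts/ecosystem_smoke.py; `run_stages` and the stage tuples are hypothetical.

```python
from typing import Callable, Iterable, Tuple

PASS, SKIP, FAIL = "PASS", "SKIP", "FAIL"
Stage = Tuple[str, Callable[[], str]]  # (stage name, check returning a status)


def run_stages(stages: Iterable[Stage]) -> int:
    """Run stages in order; return 1 at the first FAIL, otherwise 0."""
    for name, check in stages:
        result = check()
        print(f"{name}: {result}")
        if result == FAIL:
            return 1  # stop immediately; later stages are not run
    return 0  # every stage reported PASS or SKIP
```

A caller would pass the result to `sys.exit`, so that a skipped optional stage (for example, a client repo that is not present) does not turn the whole run red.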

Windows always-on (optional)

scripts\win-autostart\ ships a Task Scheduler XML + .bat installers so V2 boots on user logon and survives crashes (3 retries / 5 min). Run install_autostart.bat once as Administrator. Switch execution profile without editing files: set_profile.bat hybrid. Logs land in %LOCALAPPDATA%\GraceKelly\uvicorn.log. See scripts\win-autostart\README.md for details.

Configuration

Copy .env.example and configure:

Required: Perplexity (browser execution)

Variable	Default	Description
`GRACEKELLY_BROWSER_ENABLED`	`false`	Set `true` to enable browser execution
`GRACEKELLY_BROWSER_AUTOMATION_BACKEND`	`null`	`playwright` for real browser, `scripted` for testing
`GRACEKELLY_BROWSER_PROFILE_DIR`	-	Path to Chrome profile with Perplexity login
`GRACEKELLY_BROWSER_CALL_TIMEOUT_SECONDS`	`120`	Per-call budget for one Perplexity submit. Raise for very long prompts; lower for aggressive fail-fast.
`GRACEKELLY_BROWSER_SCREENSHOTS_DIR`	-	Directory for per-step PNGs (session start, auth, model select, submit, response). Leave empty to disable.
`GRACEKELLY_MODEL_CATALOG_REFRESH_INTERVAL_HOURS`	`24`	Periodic refresh interval for the browser-backed model catalog while browser execution is enabled.

Optional: API fallbacks

API adapters are optional. Use them only if you have separate API keys and want direct provider access alongside browser execution. Mistral remains embeddings-only.

Variable	Description
`GRACEKELLY_MISTRAL_API_KEY`	Optional. Used only for consensus-pattern embeddings (semantic clustering), not as an LLM provider
`GRACEKELLY_OPENAI_API_KEY`	OpenAI-compatible API
`GRACEKELLY_ANTHROPIC_API_KEY`	Anthropic API

General

Variable	Default	Description
`GRACEKELLY_STORAGE_BACKEND`	`memory`	`memory` or `postgres`
`GRACEKELLY_POSTGRES_DSN`	-	PostgreSQL connection string
`GRACEKELLY_API_KEY`	-	Optional bearer token for protected API/metrics endpoints; static UI assets stay public
`GRACEKELLY_EXECUTION_PROFILE`	`dry-run`	one of: `dry-run`, `api-only`, `hybrid`
`GRACEKELLY_RATE_LIMIT_RPM`	`60`	Per-IP steady-state request limit enforced by the API middleware
`GRACEKELLY_RATE_LIMIT_BURST`	`10`	Extra burst capacity allowed above the steady-state per-minute limit
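
The README does not say which algorithm the rate-limit middleware uses. One common way to combine a steady per-minute rate with a burst allowance is a per-IP token bucket, sketched below. Treating the burst value as the bucket size is an assumption, and `TokenBucket` and `PerIpLimiter` are illustrative names rather than GraceKelly classes.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Token bucket: refills at rpm/60 tokens per second, holds at most `burst`."""

    def __init__(self, rpm: int, burst: int, clock: Callable[[], float]) -> None:
        self.rate = rpm / 60.0        # tokens added per second
        self.capacity = float(burst)  # assumption: burst is the bucket size
        self.tokens = self.capacity   # start full so a fresh client can burst
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may pass."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerIpLimiter:
    """Keeps one independent bucket per client IP."""

    def __init__(self, rpm: int = 60, burst: int = 10,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm, self.burst, self.clock = rpm, burst, clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, ip: str) -> bool:
        if ip not in self.buckets:
            self.buckets[ip] = TokenBucket(self.rpm, self.burst, self.clock)
        return self.buckets[ip].allow()
```

Under these assumptions and the default values, a client can send 10 requests back to back and then about one per second afterwards; a rejected request would typically map to HTTP 429.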

Full reference: .env.example
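
A quick sanity check of the environment before starting the server might look like the sketch below. The variable names and allowed values come from the tables above; the cross-field rules (Playwright needs a profile directory, the postgres backend needs a DSN) are assumptions about what a working setup requires, and `config_problems` is a hypothetical helper, not part of GraceKelly.

```python
import os
from typing import List, Mapping

EXECUTION_PROFILES = ("dry-run", "api-only", "hybrid")
STORAGE_BACKENDS = ("memory", "postgres")


def config_problems(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return human-readable problems found in the GRACEKELLY_* settings."""
    problems: List[str] = []

    profile = env.get("GRACEKELLY_EXECUTION_PROFILE", "dry-run")
    if profile not in EXECUTION_PROFILES:
        problems.append(f"unknown execution profile: {profile!r}")

    browser_enabled = env.get("GRACEKELLY_BROWSER_ENABLED", "false").lower() == "true"
    backend = env.get("GRACEKELLY_BROWSER_AUTOMATION_BACKEND", "")
    # Assumption: a real Playwright session needs a signed-in profile directory.
    if browser_enabled and backend == "playwright" \
            and not env.get("GRACEKELLY_BROWSER_PROFILE_DIR"):
        problems.append("GRACEKELLY_BROWSER_PROFILE_DIR is not set")

    storage = env.get("GRACEKELLY_STORAGE_BACKEND", "memory")
    if storage not in STORAGE_BACKENDS:
        problems.append(f"unknown storage backend: {storage!r}")
    # Assumption: the postgres backend cannot work without a DSN.
    elif storage == "postgres" and not env.get("GRACEKELLY_POSTGRES_DSN"):
        problems.append("GRACEKELLY_POSTGRES_DSN is not set")

    return problems
```

An empty list means nothing obviously inconsistent was found; it does not guarantee that the profile is actually logged in to Perplexity.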

UI

Built-in HTML SPA served at http://localhost:8011/ by the same FastAPI process. The pattern is chosen from the model menu at the top of the main panel:

Sonar, Best, Claude 4.6, GPT-5.4, Gemini 3.1: single-model patterns streaming into the chat panel
Claude + GPT, Claude + Gemini, Claude + Best, GPT + Best: pairwise consensus
5М.Все мнения ("All opinions"), 5М.Сравнение ("Comparison"), 5М.Консенсус ("Consensus"): five-model compare / consensus bundles
Умный выбор ("Smart choice", pattern smart), Дебаты ("Debates", pattern debate): auto-routing patterns in the Авто ("Auto") group; both pin to claude-sonnet-4-6 and hit /api/v1/smart and /api/v1/debate respectively

Sidebar: task history with drill-down into steps and events, voice capture, export, file attachments.

Static UI routes (/, /*.html, /js/*, /css/*, /icons/*) are served without an API key so the shell can load in a browser. Protected API calls still require Authorization: Bearer <key> or X-API-Key: <key> when GRACEKELLY_API_KEY is set.
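
For a script that calls the protected endpoints, the bearer header described above can be attached as in this sketch. It uses only the standard library; `build_request` is a hypothetical helper, and any JSON payload passed to it is the caller's responsibility, since request schemas are not documented in this README (see the interactive docs for those).

```python
import json
import urllib.request
from typing import Any, Mapping, Optional

BASE_URL = "http://127.0.0.1:8011"


def build_request(
    path: str,
    payload: Optional[Mapping[str, Any]] = None,
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build a GET (no payload) or JSON POST request, adding auth if a key is given."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Same value as GRACEKELLY_API_KEY on the server side.
        headers["Authorization"] = f"Bearer {api_key}"
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)
```

The returned object can be passed straight to `urllib.request.urlopen`; for example, `build_request("/api/v1/tasks", api_key=...)` prepares an authenticated listing of recent tasks.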

The dedicated analytics page at /analytics.html reads the current /api/v1/analytics schema (total_executions, models, top_models) and no longer calls the removed /api/analytics/* routes.

On a first dry-run/no-browser start, /api/v1/models serves a static browser catalog (dry-run-static) plus API models so the UI menu can populate before an authenticated Perplexity profile has refreshed the live catalog. When browser execution is enabled, the live browser catalog also refreshes on a scheduled loop using GRACEKELLY_MODEL_CATALOG_REFRESH_INTERVAL_HOURS.

API

#	Method	Path	Description
1	GET	`/health`	Service health
2	GET	`/healthz/live`	Liveness probe
3	GET	`/healthz/ready`	Readiness probe
4	GET	`/api/v1/readiness`	Component readiness
5	GET	`/metrics`	Prometheus metrics
6	POST	`/api/v1/orchestrate`	Multi-model execution
7	POST	`/api/v1/orchestrate/upload`	Multi-model execution with file attachments
8	POST	`/api/v1/orchestrate/stream`	Streaming execution (SSE)
9	GET	`/api/v1/tasks`	List recent tasks
10	GET	`/api/v1/tasks/{task_id}`	Task detail + steps + events
11	GET	`/api/v1/tasks/{task_id}/export`	Export task as Markdown
12	POST	`/api/v1/tasks/{task_id}/retry`	Retry a failed or cancelled task
13	GET	`/api/v1/models`	Model catalog
14	POST	`/api/v1/models/refresh`	Refresh model catalog snapshot
15	POST	`/api/v1/consensus`	Majority-vote consensus
16	GET	`/api/v1/analytics`	Model performance analytics
17	POST	`/api/v1/smart`	Auto-profile execution
18	POST	`/api/v1/smart/v2`	Consensus V2 (HAC clustering)
19	POST	`/api/v1/batch`	Parallel multi-prompt
20	POST	`/api/v1/pipeline`	Sequential task graph
21	GET	`/api/v1/health/detailed`	Per-component health
22	POST	`/api/v1/debate`	Devil’s Advocate debate
23	POST	`/api/v1/compare`	Multi-model comparison

Interactive docs: http://localhost:8011/docs
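
Clients of /api/v1/orchestrate/stream receive Server-Sent Events. The sketch below parses the generic SSE wire format (event and data fields, blank-line delimiters, comment lines) from an iterable of text lines. The event names and payload shapes that GraceKelly actually emits are not documented here, so the parser makes no assumptions about them; `iter_sse_events` is an illustrative name.

```python
from typing import Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield {'event': ..., 'data': ...} for each complete event in an SSE stream."""
    event_type = "message"  # default event type in the SSE format
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is not data
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
    # An event not terminated by a blank line is incomplete and is discarded.
```

With a streaming HTTP response, the decoded lines of the body can be fed to this generator, and each yielded `data` string can then be handed to `json.loads` if the server sends JSON.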

Development

pip install -e ".[dev,postgres,browser]"
python -m pytest -p no:schemathesis --tb=short -q
python -m mypy src/ tests/
python -m ruff check src/ tests/
python -m pytest -p no:schemathesis --cov=gracekelly --cov-report=term --cov-fail-under=94 -q

CI runs this matrix on Python 3.11 and 3.12. If your local interpreter is newer (e.g. 3.13), run mypy under 3.11 (a 3.11 virtualenv, or py -3.11 -m mypy src/ tests/) before pushing — version-sensitive # type: ignore placement can otherwise pass locally and fail in CI.

Optional local security parity with CI:

pip install pip-audit "bandit[toml]"
pip-audit --ignore-vuln PYSEC-2022-42969
bandit -r src/gracekelly/ -ll -x src/gracekelly/adapters/browser/

Live end-to-end smoke

scripts/live_smart_smoke.py drives the SPA through a separate bundled Chromium and captures the /api/v1/smart or /api/v1/debate response. It expects uvicorn to be already running with the browser env vars set and the Chrome profile signed in to Perplexity; no chrome.exe process may be using the profile.

python scripts/live_smart_smoke.py --pattern smart --tag smoke-1
python scripts/live_smart_smoke.py --pattern debate --tag smoke-1 \
    --prompt "Your debate topic here."

Artifacts land in .workflow/outbox/<tag>-<SMART|DEBATE>-* (response.json, before/after screenshots, report.md). Exit code 0 on a meaningful answer that hits the topic keywords and carries no [auth_failed] or streaming-chrome markers; 1 otherwise.

Architecture

Routes → Orchestrator → Router → Adapters (API / Browser) → Storage (Memory / PostgreSQL)

See docs/architecture.md for the component breakdown, typed task/step/event contracts, and execution flow.
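
As a rough mental model of the flow above, the following toy sketch wires an orchestrator to a router that prefers the browser adapter and falls back to an API adapter, then stores the result in memory. Every class and method name here is hypothetical; the real contracts are described in docs/architecture.md and will differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class Adapter(Protocol):
    """Minimal shape an execution adapter might expose (hypothetical)."""
    name: str

    def available(self) -> bool: ...
    def run(self, prompt: str) -> str: ...


@dataclass
class EchoAdapter:
    """Stand-in adapter; a real one would drive a browser session or call an API."""
    name: str
    up: bool = True

    def available(self) -> bool:
        return self.up

    def run(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class Router:
    """Prefers the browser adapter and falls back to the API adapter."""
    browser: Adapter
    api: Adapter

    def pick(self) -> Adapter:
        return self.browser if self.browser.available() else self.api


@dataclass
class MemoryStorage:
    """In-memory task store keyed by an incrementing task id."""
    tasks: Dict[int, str] = field(default_factory=dict)

    def save(self, answer: str) -> int:
        task_id = len(self.tasks) + 1
        self.tasks[task_id] = answer
        return task_id


@dataclass
class Orchestrator:
    router: Router
    storage: MemoryStorage

    def handle(self, prompt: str) -> int:
        """Route the prompt to an adapter, persist the answer, return the task id."""
        adapter = self.router.pick()
        return self.storage.save(adapter.run(prompt))
```

In this picture a route handler would call `Orchestrator.handle` and return the task id; swapping `MemoryStorage` for a PostgreSQL-backed class with the same `save` method leaves the rest unchanged.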

Documentation

Architecture: components, contracts, and execution flow
Onboarding: first-time Chrome-profile setup and daily use
Operator runbook: running, monitoring, and recovery
Perplexity DOM recon: refreshing selectors after UI drift
Phased roadmap: what shipped and what is next
Quick start (RU): a short cheat sheet in Russian
Full documentation site: Astro Starlight reference