Operator Runbook
Last updated: 2026-04-27
This runbook covers the current operating surface for GraceKelly:
- API authentication
- web UI startup
- browser execution via Perplexity as the primary path
- service liveness and readiness
- metrics scraping
- browser-adapter recovery
- storage validation and task-scoped snapshot restore
It is intentionally limited to the current in-process deployment model.
Quickstart
- Step 1: Boot. Start the backend with the browser runtime enabled:

    set GRACEKELLY_BROWSER_ENABLED=true
    set GRACEKELLY_BROWSER_AUTOMATION_BACKEND=playwright
    set GRACEKELLY_EXECUTION_PROFILE=hybrid
    set GRACEKELLY_BROWSER_PROFILE_DIR=<repo>\tmp\browser-recon\perplexity-profile
    python -m uvicorn gracekelly.main:create_app --factory --host 127.0.0.1 --port 8011

  Keep that process running, then open http://127.0.0.1:8011/.
- Step 2: Authenticate browser. Bootstrap a dedicated Chrome profile once:

    gracekelly-create-perplexity-profile

  Finish the Perplexity login manually in that profile, close every Chrome window using it, and reuse the same GRACEKELLY_BROWSER_PROFILE_DIR for the backend.
- Step 3: First smoke.

    python scripts/live_smart_smoke.py --pattern smart

  Expected result: HTTP 200, a meaningful answer, and roughly 1-3 browser submits for the SMART flow. With GRACEKELLY_EXECUTION_PROFILE=dry-run, all eight sync routes auto-gate to dry-run execution without requiring dry_run: true in the request body.
For deeper operations see the sections below:
- Ecosystem smoke
- Windows always-on autostart
- Usage telemetry
- Selectors weekly recon
- Live smoke harness
- Browser triage
- Harness limitations
- Known integrators
Ecosystem smoke
scripts/ecosystem_smoke.py is the single-command health check across the V2 backend
and all three known clients (RAG_Support_Assistant, agent_toolkit, juhub).

    .venv\Scripts\python scripts\ecosystem_smoke.py

Step order: pre-flight :8011/healthz/ready → V2 direct (/smart + /orchestrate)
→ RAG smoke (if :8000 reachable) → agent_toolkit pytest tests/integration/
(if agent_toolkit exists) → juhub --dry-run debate (if
Perplexity_Orchestrator2\juhub exists). Missing components are reported as
SKIP, not FAIL. The exit code is 0 if every step is PASS or SKIP, and 1 on the first FAIL.
Useful flags:
- --skip-rag, --skip-agent-toolkit, --skip-juhub: narrow the run.
- --gracekelly-url, --rag-url: override base URLs.
- --verbose: show each subprocess stdout.
This script does not start uvicorn itself; boot V2 first.
Windows always-on autostart
scripts\win-autostart\ ships a Windows Task Scheduler XML and .bat helpers to
keep V2 running on user logon. This is optional: it is purely a convenience for
single-user local deploy where juhub cron at 08:30 and RAG async traffic both rely
on V2 being already up.
Install once, as Administrator:

    cd <repo>\scripts\win-autostart
    install_autostart.bat

Verify:

    schtasks /Query /TN "GraceKelly Autostart" /V /FO LIST

Switch execution profile without editing files:

    set_profile.bat hybrid   :: or dry-run / api-only

The wrapper gracekelly_uvicorn.bat reads %LOCALAPPDATA%\GraceKelly\profile.env
on each start; restart the task to pick up changes. Logs land in
%LOCALAPPDATA%\GraceKelly\uvicorn.log. Uninstall: uninstall_autostart.bat
(also as Administrator). See scripts\win-autostart\README.md for full reference
and troubleshooting.
Usage telemetry
Optional per-request JSONL log appended to <repo>/logs/usage.jsonl. Designed for
honest 30-day usage audits before any simplify-driven refactor (see
audit_opus_2026-04-26.md §R1).
Enable

    set GRACEKELLY_USAGE_TELEMETRY_ENABLED=true
    :: optional path override, default: <cwd>/logs/usage.jsonl
    set GRACEKELLY_USAGE_TELEMETRY_PATH=

Then restart uvicorn so the new .env is read.
Record format
One JSON object per line, written after call_next completes:

    {"ts":"2026-04-26T16:55:30.740374Z", "endpoint":"/api/v1/orchestrate", "method":"POST", "status":200, "duration_ms":14, "request_id":"308deca4-c1c7-4c84-981c-2a27ed6dd95e", "prompt_hash":"e32ca25d4ac598a59600c9b6dcc10eaf4f0636acd4e0db2ce70560adc7df146f"}

- endpoint is UUID-normalised (e.g. /api/v1/tasks/{id}/retry).
- prompt_hash is sha256(body_bytes) for orchestration POST routes only (/orchestrate, /orchestrate/upload, /orchestrate/stream, /consensus, /compare, /debate, /smart, /smart/v2, /batch, /pipeline); null elsewhere. The body itself is not persisted, only its hash.
- request_id falls back to the X-Request-ID request header when no correlation middleware is wired, otherwise picks the response header. If a Redis rate-limit 429 returns before correlation middleware runs, telemetry generates a UUID and returns it as X-Request-ID on that 429 response.

To count requests per endpoint from the log (kept on one line, because cmd does not treat ^ as a line continuation inside double quotes):

    python -c "import json,collections; c=collections.Counter(); [c.update([json.loads(l)['endpoint']]) for l in open('logs/usage.jsonl')]; print(c.most_common())"

The middleware never blocks: write failures emit one usage_telemetry.write_failed
warning per process, then degrade silently. The body is replayed via
request._receive so downstream handlers see it intact.
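
The record fields above can be illustrated with a minimal sketch. It is not the middleware's actual code: the helper names normalise_endpoint and build_record, the UUID regular expression, and the hash_body switch are assumptions made for illustration only.

```python
import hashlib
import json
import re

# Assumed normalisation rule: any canonical 8-4-4-4-12 hex UUID in the path.
_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def normalise_endpoint(path: str) -> str:
    """Replace UUID path segments with the {id} placeholder."""
    return _UUID_RE.sub("{id}", path)


def build_record(path: str, method: str, status: int, body: bytes, hash_body: bool) -> dict:
    """Build one telemetry record; only the body hash is kept, never the body."""
    return {
        "endpoint": normalise_endpoint(path),
        "method": method,
        "status": status,
        "prompt_hash": hashlib.sha256(body).hexdigest() if hash_body else None,
    }


# Hypothetical example: a non-orchestration route, so no prompt hash is stored.
record = build_record(
    "/api/v1/tasks/308deca4-c1c7-4c84-981c-2a27ed6dd95e/retry", "POST", 200, b"{}", False
)
print(json.dumps(record))
```

Hashing the raw body bytes keeps repeated prompts countable in an audit without storing their content.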
Selectors weekly recon
Weekly Friday 03:00 scheduled task that captures the live Perplexity DOM and
diffs it against a stored baseline so UI drift is detected before it breaks a
live run (see audit_opus_2026-04-26.md §R4). The task runs the
gracekelly-recon-weekly console entry point.
It loads the repo .env before resolving CLI defaults, so the scheduled task
uses the same GRACEKELLY_BROWSER_PROFILE_DIR as Settings.from_env() unless
--profile-dir is passed explicitly.
Install
Right-click <repo>\scripts\win-autostart\install_recon_cron.bat →
Run as administrator. The installer renders recon-task.xml with the current
%USERDOMAIN%\%USERNAME% substituted in and converts the file to UTF-16 LE
before calling schtasks /Create /XML.
Verify:

    schtasks /Query /TN "GraceKelly Selectors Recon" /V /FO LIST

What recon writes

| Path | Meaning |
|---|---|
| .workflow/state/perplexity-selectors-baseline.json | Reference snapshot. Created on first run, updated only on explicit acknowledgement. |
| .workflow/state/perplexity-selectors-latest.json | Most recent capture, always overwritten. |
| .workflow/state/perplexity-selectors-drift.flag | Present iff drift was detected on the latest run; deleted automatically when the next run matches the baseline again. |
| logs/recon-drift.jsonl | Append-only {ts, added, removed, changed} lines for every drifted run. |

The captured snapshot is structural: home-button labels, the model menu list,
manifest flags (direct_model_button_visible, more_button_visible,
more_clicked, model_button_visible_after_more), and the artefact-file
inventory. Screenshots and intermediate HTML are written to a temporary
directory and discarded after the snapshot is extracted.
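
A drift line of the form {ts, added, removed, changed} can be sketched as a key-level comparison of two snapshots. This sketch assumes each snapshot is a flat mapping, which the runbook does not state; diff_snapshots and has_drift are hypothetical names, and the sample values are invented.

```python
from datetime import datetime, timezone


def diff_snapshots(baseline: dict, latest: dict) -> dict:
    """Return the keys added, removed, and changed between two flat snapshots."""
    added = sorted(k for k in latest if k not in baseline)
    removed = sorted(k for k in baseline if k not in latest)
    changed = sorted(k for k in baseline if k in latest and baseline[k] != latest[k])
    return {"added": added, "removed": removed, "changed": changed}


def has_drift(diff: dict) -> bool:
    """Drift means at least one key was added, removed, or changed."""
    return any(diff[key] for key in ("added", "removed", "changed"))


# Invented sample snapshots using manifest flag names from the text above.
baseline = {"more_button_visible": True, "models": ["Sonar", "GPT-5.4"]}
latest = {"more_button_visible": False, "models": ["Sonar", "GPT-5.4"], "more_clicked": True}

diff = diff_snapshots(baseline, latest)
drift_line = {"ts": datetime.now(timezone.utc).isoformat(), **diff}
print(drift_line, has_drift(diff))
```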
Acknowledging drift
When the flag is present:
- Inspect logs/recon-drift.jsonl for the structural diff.
- Decide whether the drift is benign (a new model added, a button renamed) or breaking (a selector path no longer resolves).
- Breaking drift: fix the selector module and rerun integration tests.
- To accept the new state as the baseline:

    copy /Y "<repo>\.workflow\state\perplexity-selectors-latest.json" ^
      "<repo>\.workflow\state\perplexity-selectors-baseline.json"
    del "<repo>\.workflow\state\perplexity-selectors-drift.flag"

The next run will exit 0 again until the next drift.
Manual run

    .\.venv\Scripts\gracekelly-recon-weekly.exe

Exit codes: 0 no drift, 1 drift detected, 2 missing --profile-dir /
GRACEKELLY_BROWSER_PROFILE_DIR after .env loading. The manual run requires that
no other Chrome window is holding the same profile directory open; otherwise
Playwright hits BrowserProfileBusyError. Stop the autostart task or close
stray Chrome processes first.
Uninstall
Right-click uninstall_recon_cron.bat → Run as administrator. The script
runs schtasks /Delete /TN "GraceKelly Selectors Recon" /F.
Web UI startup
The built-in web UI is served from the main app at http://127.0.0.1:8011/. Run the backend, then open that address in the browser.
Static UI shell paths (/, /*.html, /js/*, /css/*, /icons/*) remain
public even when GRACEKELLY_API_KEY is configured. This lets the browser load
the SPA and linked tools. API calls from that UI still hit protected
/api/v1/* routes and need a bearer token or X-API-Key header when endpoint
auth is enabled.
HTML pages use a static-compatible CSP because the current vanilla UI still has
inline handlers/scripts/styles and analytics.html loads Chart.js from
https://cdn.jsdelivr.net. Non-HTML routes keep the stricter CSP without
'unsafe-inline'.
/analytics.html reads only GET /api/v1/analytics. It renders totals,
per-model rows, and top models from the current response fields:
total_models, total_executions, models, and top_models.
API security
Authentication
Set GRACEKELLY_API_KEY to require an API key on all protected endpoints. Clients must include one of:
- Authorization: Bearer <key> header
- X-API-Key: <key> header
Public endpoints (no key required): /health, /healthz/live,
/healthz/ready, /docs, /openapi.json, /redoc, /, /*.html,
/js/*, /css/*, and /icons/*.
When GRACEKELLY_API_KEY is not set, all endpoints are open (development default).
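
A client can attach either header as in the sketch below. It uses only the Python standard library; build_request and the key value are hypothetical, and no request is actually sent.

```python
import urllib.request
from typing import Optional


def build_request(url: str, api_key: Optional[str], use_bearer: bool = True) -> urllib.request.Request:
    """Attach the API key as a bearer token or as an X-API-Key header."""
    request = urllib.request.Request(url)
    if api_key:
        if use_bearer:
            request.add_header("Authorization", f"Bearer {api_key}")
        else:
            request.add_header("X-API-Key", api_key)
    return request


# Hypothetical key; with no key configured the request is built without auth headers.
bearer_request = build_request("http://127.0.0.1:8011/api/v1/tasks", "example-key")
header_request = build_request("http://127.0.0.1:8011/api/v1/tasks", "example-key", use_bearer=False)
open_request = build_request("http://127.0.0.1:8011/health", None)
print(bearer_request.header_items(), header_request.header_items(), open_request.header_items())
```

urllib stores header names in capitalised form (X-api-key); HTTP header names are case-insensitive, so the server sees the same header.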
Browser execution (primary)
GraceKelly executes models through your Perplexity Pro subscription via browser automation. Direct provider APIs remain optional fallbacks when you need separate provider access.
- Create a Chrome profile logged into Perplexity Pro
- Set in .env:

    GRACEKELLY_BROWSER_ENABLED=true
    GRACEKELLY_BROWSER_AUTOMATION_BACKEND=playwright
    GRACEKELLY_BROWSER_PROFILE_DIR=/path/to/chrome/profile

- Available models depend on your Perplexity subscription tier
Circuit breaker
If the browser adapter fails 3 times consecutively, the circuit breaker opens
for 60 seconds. Check /metrics for gracekelly_browser_circuit_breaker_state.
MODEL_MISMATCH does not count toward the breaker (Sonar auto-route is
recovered by retry, not by tripping). Only PROVIDER_UNAVAILABLE, TIMEOUT,
and UNKNOWN_ERROR are counted.
Configure via:
- GRACEKELLY_BROWSER_CIRCUIT_BREAKER_FAILURE_THRESHOLD (default 3)
- GRACEKELLY_BROWSER_CIRCUIT_BREAKER_COOLDOWN_SECONDS (default 60)
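
The breaker behaviour described above can be sketched as a small state machine. This is an illustration under assumptions, not the adapter's implementation: the class name, method names, and injectable clock are hypothetical, and the sketch assumes that an uncounted failure such as model_mismatch leaves the consecutive-failure count unchanged.

```python
import time

# Failure codes that count toward the breaker, per the text above.
COUNTED_FAILURES = {"provider_unavailable", "timeout", "unknown_error"}


class CircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown_seconds=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._consecutive_failures = 0
        self._opened_at = None  # None means the breaker is closed

    def allow(self) -> bool:
        """Fail fast while open; allow a probe once the cooldown has expired."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        """A successful call (including a probe) closes the breaker."""
        self._consecutive_failures = 0
        self._opened_at = None

    def record_failure(self, failure_code: str) -> None:
        """Count only infrastructure failures; open or reopen at the threshold."""
        if failure_code not in COUNTED_FAILURES:
            return
        self._consecutive_failures += 1
        if self._consecutive_failures >= self.failure_threshold:
            self._opened_at = self._clock()


# Usage with a fake clock so the cooldown can be stepped deterministically.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for code in ("model_mismatch", "timeout", "timeout", "unknown_error"):
    breaker.record_failure(code)
print(breaker.allow())  # the third counted failure has opened the breaker
```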
Stability behaviors (2026-04-26)
The browser adapter has five layered protections to keep sessions healthy across long runs:
- Cold-start navigation: the initial page.goto(perplexity.ai) and home re-navigations use a 30s timeout (was 5s). Cold Chromium launches no longer fail the first request.
- Sonar auto-route retry: when Perplexity overrides the requested model to Sonar, the adapter retries select_model up to 2 extra times with a 1.5s delay before returning MODEL_MISMATCH. The class constants _MODEL_SELECT_RETRIES / _MODEL_SELECT_RETRY_DELAY_S live in adapters/browser/perplexity.py.
- Force session reset on exception: after TIMEOUT or unknown exceptions, the adapter best-effort-closes Playwright/Chromium so the next request relaunches a fresh session. Without this, a degraded session cascades through the breaker.
- Thinking-toggle memoization: if Perplexity’s UI does not surface a separate “Thinking” toggle for the active model, the adapter records the miss once per session and skips the menu probe on subsequent calls (otherwise ~2s is wasted per call).
- Submit click force=True: the prompt-submit button uses force=True to bypass actionability waits when an overlay briefly covers it.
Live smoke verification: 12/12 sequential /api/v1/smart calls landed clean
(0 failures, 0 warnings, 0 breaker trips) on HEAD ceeb27d.
After the 2026-04-26 cold-start refactor, three test doubles in
tests/test_playwright_driver.py (_FakePage.goto, _HomeNavigationPage.goto)
needed *, timeout: int | None = None to match the production signature. Without
the kwarg, production goto(..., timeout=30_000) raised TypeError, the
surrounding try/except swallowed it, and model_selection_attempted returned
False. Fixed in b166de8; if a future stability change adds a new kwarg to a
real page.goto call, propagate it to those test doubles.
Primary endpoints
- GET /health
  - fast summary for service, environment, storage backend, active model executions, saturated models
- GET /api/v1/readiness
  - component-by-component status for storage, execution router, and adapters
- GET /metrics
  - Prometheus-style gauges for readiness, component states, execution saturation, storage counts when available, and browser circuit-breaker state
- GET /api/v1/tasks
  - recent operator task summaries with status, execution_mode, dry_run, and failure_code filters
- GET /api/v1/tasks/{task_id}
  - full execution context: plan scalars, steps, events, terminal execution details
Normal startup checks
- Confirm the process is live:

    curl http://127.0.0.1:8011/health

- Confirm readiness semantics:

    curl http://127.0.0.1:8011/api/v1/readiness

- Confirm scrape surface:

    curl http://127.0.0.1:8011/metrics

Expected development baseline:
- storage_backend=memory
- readiness may be ok even if browser is optional and degraded under the active execution profile
- gracekelly_execution_active_model_executions 0 when idle
Local security preflight
CI installs security scanners on demand rather than keeping them in the default dev extra. To reproduce the CI security gates locally:

    pip install pip-audit "bandit[toml]"
    pip-audit --ignore-vuln PYSEC-2022-42969
    bandit -r src/gracekelly/ -ll -x src/gracekelly/adapters/browser/

Readiness interpretation
storage component:
- ok: repository reachable and schema report acceptable
- degraded: connectivity or schema drift issue
- Action:
  - for PostgreSQL, run the validation CLI
  - for memory, restart the process if the in-memory store itself is corrupted
execution-router component:
- use active_model_executions, active_by_model, model_limits, and saturated_models
- saturated_models means requests are being rejected with rate_limited for those models
browser.perplexity component:
- session shows configuration and last session error
- automation shows live-driver or scripted-driver detail
- circuit_breaker shows whether repeated infrastructure failures have opened the browser adapter
Metrics interpretation
Key metric groups:
- gracekelly_readiness_state
- gracekelly_component_state
- gracekelly_execution_active_model_executions
- gracekelly_execution_model_active
- gracekelly_execution_model_limit
- gracekelly_execution_model_saturated
- gracekelly_storage_task_count, gracekelly_storage_step_count, gracekelly_storage_event_count
- gracekelly_browser_circuit_breaker_state
- gracekelly_browser_circuit_breaker_consecutive_failures
- gracekelly_browser_circuit_breaker_open_count
- gracekelly_browser_circuit_breaker_fail_fast_rejections
Storage-count gauges are present on both the in-memory backend and PostgreSQL when the repository healthcheck can read the durable tables successfully.
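
For ad hoc checks outside Prometheus, the /metrics text can be read with a small parser. The sketch below assumes plain `name value` sample lines without trailing timestamps. The sample text is invented; only the metric names come from this runbook, and the function name parse_gauges is hypothetical.

```python
def parse_gauges(text: str) -> dict:
    """Map each sample name (including any label block) to its float value."""
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        try:
            samples[name] = float(value)
        except ValueError:
            continue  # ignore lines that do not end in a number
    return samples


# Invented sample scrape; the values are illustrative only.
sample_scrape = """\
# TYPE gracekelly_execution_active_model_executions gauge
gracekelly_execution_active_model_executions 0
gracekelly_browser_circuit_breaker_open_count 2
gracekelly_browser_circuit_breaker_consecutive_failures 1
"""

gauges = parse_gauges(sample_scrape)
idle = gauges["gracekelly_execution_active_model_executions"] == 0
print(gauges, idle)
```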
Browser triage
Common task-level failure codes:
auth_failed:
- browser profile is not logged in or Perplexity showed a late sign-in overlay
- Recovery:
  - create or refresh a dedicated profile:

      gracekelly-create-perplexity-profile

  - point runtime to that directory:

      set GRACEKELLY_BROWSER_PROFILE_DIR=<repo>\tmp\browser-recon\perplexity-profile

  - rerun the live smoke
- Diagnostics:
  - the adapter logs a structured browser_auth_unknown warning with url/title/body_length/prompt_input state whenever auth still resolves to logged_out after the settle retry. Grep the uvicorn log for browser_auth_unknown to see the actual page state that defeated the auth check.
provider_unavailable:
- browser driver missing, profile directory busy, browser disabled, or circuit breaker currently open
- Recovery:
  - confirm browser_enabled=true
  - close any Chrome windows using the same profile directory
  - inspect /api/v1/readiness for browser.perplexity.details.circuit_breaker
  - if the circuit breaker is open, wait for cooldown or restart the service
model_mismatch:
- requested browser model was not confirmed in the current authenticated UI
- Recovery:
  - inspect GET /api/v1/models
  - if model availability drift is suspected, capture fresh recon artifacts
Dry-run first start:
- when GRACEKELLY_EXECUTION_PROFILE=dry-run and browser automation is disabled, GET /api/v1/models returns a dry-run-static browser catalog plus API models so the UI can populate before any live Perplexity refresh exists.
- after browser automation is enabled, startup treats that static snapshot as refreshable and replaces it with the authenticated Perplexity menu when the catalog refresh succeeds.
timeout or unknown_error:
- live UI or automation state unstable
- If an external integrator receives failure_code: "unknown_error" with a Playwright traceback while the backend is running with GRACEKELLY_EXECUTION_PROFILE=dry-run, treat it as the known dry-run profile-gate regression (the dry-run profile must not execute real adapters).
- Recovery:
  - inspect browser.perplexity health details and breaker counters
  - capture fresh DOM recon
  - rerun the live smoke with debug enabled
- Per-call budget is controlled by GRACEKELLY_BROWSER_CALL_TIMEOUT_SECONDS (default 120s). Raise it for very long prompts or when SMART fan-out sub-calls are still within the budget but tight.
Fan-out / decomposition (SMART used_roles=True or DEBATE):
- each sub-exec is routed through the same browser session. The adapter calls reset_page_state() (navigates the UI back to the home ask-input) before every submit so consecutive sub-execs do not extract stale body_after_prompt from the previous thread. If you see multiple sub-execs completing with identical output lengths or anomalously short durations (<2s) in the log, confirm the “Navigating Perplexity UI back to” log line is present between them. If it is missing, the reset pathway itself has regressed.
Browser recovery commands
Create or refresh a dedicated authenticated profile:

    gracekelly-create-perplexity-profile

Capture fresh authenticated recon:

    gracekelly-capture-perplexity-recon --prompt "Reply with only OK" --timeout-seconds 60

Run the manual-gated live smoke after the backend is already running with the browser env settings from the Quickstart:

    python scripts/live_smart_smoke.py --pattern smart

Circuit breaker recovery
Browser circuit breaker semantics:
- counts only provider_unavailable, timeout, and unknown_error
- opens after the configured threshold
- fail-fast blocks new browser executions until cooldown expires
- the next allowed probe closes the breaker on success or reopens it on another counted failure
Runtime knobs:

    set GRACEKELLY_BROWSER_CIRCUIT_BREAKER_ENABLED=true
    set GRACEKELLY_BROWSER_CIRCUIT_BREAKER_FAILURE_THRESHOLD=3
    set GRACEKELLY_BROWSER_CIRCUIT_BREAKER_COOLDOWN_SECONDS=60

Operational guidance:
- prefer waiting for cooldown if the root cause is transient UI or provider instability
- restart the service if the browser runtime itself is wedged and cooldown alone is not enough
- investigate repeated open_count growth before increasing thresholds
Storage validation
Validate PostgreSQL connectivity and schema:

    set GRACEKELLY_POSTGRES_DSN=postgresql://postgres:postgres@localhost:5432/gracekelly
    set GRACEKELLY_POSTGRES_CONNECT_TIMEOUT_SECONDS=5
    python -m gracekelly.tools.validate_postgres

Use --no-bootstrap if the target database should not be modified during validation.
Export a JSON snapshot of recent durable-state records:

    set GRACEKELLY_POSTGRES_DSN=postgresql://postgres:postgres@localhost:5432/gracekelly
    gracekelly-export-postgres --limit 100

Export specific tasks only:

    gracekelly-export-postgres --task-id task-1 --task-id task-2

Export artifacts now carry snapshot_format_version, gracekelly_version, and snapshot_sha256 so restores can reject incompatible or corrupted JSON before task rows are touched.
The export command summary now also echoes generated_at, compressed_output, output_exists, output_size_bytes, manifest_status, snapshot_status_consistency_status, selection_status, missing_task_ids_status, field-level manifest verification statuses, requested_task_ids, exported_task_ids, missing_task_ids, task_count, step_count, event_count, repository_health, and repository_schema, so the operator can capture both selection results and storage state without opening the snapshot file immediately.
If export fails after the snapshot manifest was already assembled, the error payload preserves that manifest context too.
If the export path ends with .gz, the snapshot is written as gzip-compressed JSON.
Inspect a snapshot artifact offline before restore:

    gracekelly-inspect-snapshot --input <repo>\tmp\postgres-export\selected.json

That command verifies snapshot_sha256 when present and reports manifest details such as manifest_status, snapshot_status_consistency_status, selection_status, missing_task_ids_status, field-level manifest verification statuses, selection, task_count, step_count, event_count, exported_task_ids, missing_task_ids, input_size_bytes, and import_ready without requiring database connectivity. If the file cannot be parsed, the error payload still includes compressed_input and input_size_bytes.
Restore a snapshot back into PostgreSQL:

    set GRACEKELLY_POSTGRES_DSN=postgresql://postgres:postgres@localhost:5432/gracekelly
    gracekelly-import-postgres --input <repo>\tmp\postgres-export\selected.json

Restore semantics:
- imported task_id values are replaced in place
- related step and event rows are replaced together with the task
- unrelated tasks remain in the database
- snapshot_format_version is verified when present
- snapshot_sha256 is verified when present
Restore only selected task bundles from a larger snapshot:

    gracekelly-import-postgres --input <repo>\tmp\postgres-export\selected.json --task-id task-1 --task-id task-2

If one or more requested task_id values are absent, the command still restores the bundles that exist and returns status=partial plus missing_task_ids in the JSON summary.
Use --allow-degraded-schema only for deliberate manual recovery when the guardrail would otherwise block a needed restore.
Validate restore inputs without writing:

    gracekelly-import-postgres --input <repo>\tmp\postgres-export\selected.json --dry-run

That success payload includes repository_health and repository_schema, so operators can confirm the target backend state in the same preflight call.
It also echoes compressed_input, input_size_bytes, source_format_status, source_migration_status, source_checksum_status, source_snapshot_sha256, source_import_ready, source_status_consistency_status, source_manifest_status, source_selection_status, source_selection, source_task_count, source_step_count, source_event_count, source_exported_task_ids, source_missing_task_ids, and source_missing_task_ids_status, so the restore report preserves the source artifact manifest context.
Failed import preflights also include compressed_input, input_size_bytes, and the source compatibility verdict fields derivable from the parsed artifact, so the operator can still identify and classify the rejected snapshot from the error payload.
Compressed .json.gz snapshot input is supported directly.
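
For a quick look at a snapshot outside the bundled CLIs, plain and gzip-compressed files can be read the same way. The sketch below is a hypothetical helper, not part of the GraceKelly tooling; it selects the reader from the .gz suffix, and the sample payload is invented apart from the snapshot_format_version field name.

```python
import gzip
import json
import tempfile
from pathlib import Path


def load_snapshot(path: Path) -> dict:
    """Read a snapshot as JSON, decompressing when the path ends with .gz."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


# Round-trip an invented payload through both formats in a temporary directory.
payload = {"snapshot_format_version": 1}
with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = Path(tmp_dir) / "selected.json"
    gzip_path = Path(tmp_dir) / "selected.json.gz"
    plain_path.write_text(json.dumps(payload), encoding="utf-8")
    with gzip.open(gzip_path, "wt", encoding="utf-8") as handle:
        json.dump(payload, handle)
    plain_snapshot = load_snapshot(plain_path)
    gzip_snapshot = load_snapshot(gzip_path)

print(plain_snapshot == gzip_snapshot)
```

This only parses the file. Checksum and format verification stay with gracekelly-inspect-snapshot, because the runbook does not document how snapshot_sha256 is computed.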
Live smoke harness
scripts/live_smart_smoke.py is a manual-gated operator harness for end-to-end browser-backed smoke checks. It is not scheduled and is not part of CI; run it only when you explicitly want to spend live browser quota.
Preconditions
- Chrome profile is already authenticated to Perplexity Pro, for example <repo>/chrome-profile/.
- Uvicorn is running on http://127.0.0.1:8011/ with at least:

    $env:GRACEKELLY_BROWSER_ENABLED="true"
    $env:GRACEKELLY_EXECUTION_PROFILE="hybrid"

- No other chrome.exe process is using that profile:

    Get-CimInstance Win32_Process -Filter "name = 'chrome.exe'" |
      Where-Object { $_.CommandLine -like '*<repo>\chrome-profile*' } |
      Select-Object ProcessId, CommandLine

  The command should return no rows before you launch the smoke. The -like pattern uses a single backslash because PowerShell wildcards treat the backslash as a literal character.
Supported patterns

| Pattern | API path | UI label | Default prompt summary | Expected quota | Min answer length |
|---|---|---|---|---|---|
| smart | /api/v1/smart | Умный выбор (Smart choice) | EV market comparison across Europe, USA, China | 1-3 submits | 500 chars |
| debate | /api/v1/debate | Дебаты (Debate) | EV market comparison with challenge/defense loop | 3-5 submits | 500 chars |
| consensus | /api/v1/consensus | not surfaced; direct POST fallback | 3 leading EV manufacturers in China | 3-5 submits | 300 chars |
| compare | /api/v1/compare | not surfaced; direct POST fallback | Claude Sonnet 4.6 vs GPT-5.4 reasoning comparison | 5 submits | 400 chars |
| upload | /api/v1/orchestrate/upload | n/a; composer attachment flow | summarize attached file | 1 submit | 150 chars |

Quota expectations are approximate and assume a healthy authenticated browser session. smart may fan out into 1-3 submits, debate usually needs 3-5, consensus usually needs 3-5, compare fans out across five models, and upload is expected to be a single submit.
The UI upload path intentionally collapses any current multi-model menu selection to one model form field before POSTing /api/v1/orchestrate/upload. With the default Claude + GPT menu item, the upload smoke uses the first resolved model (Claude Sonnet 4.6) and should report model_count=1 with no quorum cancellation.
Usage examples

    python scripts/live_smart_smoke.py --pattern smart
    python scripts/live_smart_smoke.py --pattern debate
    python scripts/live_smart_smoke.py --pattern consensus
    python scripts/live_smart_smoke.py --pattern compare
    python scripts/live_smart_smoke.py --pattern upload --attachment <path>

Artifacts and interpretation
Reports are written to .workflow/outbox/<tag>-<PATTERN>-report.md and raw payloads to .workflow/outbox/<tag>-<PATTERN>-response.json.
Status: success means the harness completed prompt-to-response end-to-end and the evaluator accepted the pattern-specific response without AUTH_FAILED, shell-chrome, forbidden markers, or length/topic failures.
Status: failure means the report contains explicit rejection reasons such as non-200 status, missing answer field, too-short output, forbidden markers, or missing topic keywords. Inspect the paired response.json to see the captured HTTP status and the raw response body fields that the evaluator examined.
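
The status and length checks can be sketched as below. This is a simplified illustration, not the harness evaluator: rejection_reasons is a hypothetical function, the reason strings are invented, and the forbidden-marker and topic-keyword checks are left out. The minimum lengths come from the supported-patterns table.

```python
from typing import List, Optional

# Minimum answer lengths from the supported-patterns table.
MIN_ANSWER_CHARS = {"smart": 500, "debate": 500, "consensus": 300, "compare": 400, "upload": 150}


def rejection_reasons(pattern: str, http_status: int, answer: Optional[str]) -> List[str]:
    """Return the rejection reasons; an empty list means these checks passed."""
    reasons = []
    if http_status != 200:
        reasons.append(f"non-200 status: {http_status}")
    if not answer:
        reasons.append("missing answer field")
    elif len(answer) < MIN_ANSWER_CHARS[pattern]:
        reasons.append(f"answer too short: {len(answer)} < {MIN_ANSWER_CHARS[pattern]}")
    return reasons


accepted = rejection_reasons("consensus", 200, "x" * 300)
rejected = rejection_reasons("smart", 502, "short")
print(accepted, rejected)
```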
Coverage notes
Fallback behaviour is validated via unit tests in tests/test_router_fallback.py, not through the live harness; browser-adapter failure is not reproduced artificially in smoke runs.
This harness does not cover smart/v2, batch, or pipeline. Those paths stay validated through unit tests and route-level smoke coverage such as tests/test_routes_*.
Harness limitations
Cyrillic prompts via PowerShell pipe
When the harness or any other CLI tool passes a Cyrillic prompt through a PowerShell pipe
(echo 'привет' | python ...), PowerShell’s default encoding can downgrade the text to ?
placeholders before the child process sees it. That is a PowerShell / harness issue, not a
GraceKelly backend bug.
Workarounds:
- pass the prompt directly with --prompt, for example python scripts/live_smart_smoke.py --prompt "привет"
- set $OutputEncoding = [System.Text.Encoding]::UTF8 before piping in the current session
- use --ascii-fallback for deterministic ASCII smoke prompts
Reference incident: Phase 17 / batch-82 live SMART failure recorded in docs/phased-roadmap.md.
Persistent session reuse
Authentication is persisted through the dedicated Chrome profile directory (default
chrome-profile/, configurable via GRACEKELLY_BROWSER_PROFILE_DIR). There is no separate session
token file to rotate or copy.
Current local operation is single-account: GRACEKELLY_BROWSER_PROFILE_DIR=<repo>/chrome-profile
and no GRACEKELLY_ACCOUNTS pool. Keep that profile signed in with the intended Gmail-backed
Perplexity account; do not add alternate browser profiles or API fallback keys unless the operating
mode changes deliberately.
If another Chrome process is still using that profile, startup can fail with the live-profile guard
or BrowserProfileBusyError. Use a dedicated profile created by gracekelly-create-perplexity-profile
and follow docs/onboarding.md for the bootstrap / recovery flow.
Selector drift symptom: if model selection reports Upload files or images, Search, or a stray
New model, rerun the live smoke and inspect PerplexitySelectors.model_button. The model button can
include a mode suffix such as Gemini 3.1 Pro Thinking; the selector must prefer known model labels
over the broad composer menu fallback.
For the current Pro-backed profile, keep GPT-5.4 as the GPT browser model; do not admit Max-only
menu labels such as GPT-5.5, Claude Opus, or Max into the runtime browser catalog.
Task inspection workflow
- Find the recent failures:

    GET /api/v1/tasks?status=failed&dry_run=false

- Narrow by backend shape:

    GET /api/v1/tasks?execution_mode=browser

- Narrow by failure class:

    GET /api/v1/tasks?failure_code=provider_unavailable

- Inspect one task deeply:

    GET /api/v1/tasks/{task_id}

Use execution_details, terminal event payloads, and step event details together. That is where current adapter diagnostics, browser driver metadata, and circuit-breaker-origin failures surface without widening storage tables.
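
The filter queries in this workflow can be composed programmatically. The sketch below only builds URLs. The tasks_url helper is hypothetical, and the base URL is the local address used throughout this runbook.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8011"


def tasks_url(**filters) -> str:
    """Build a /api/v1/tasks URL, rendering booleans as lowercase true/false."""
    query = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in filters.items()
        if value is not None
    }
    suffix = f"?{urlencode(query)}" if query else ""
    return f"{BASE_URL}/api/v1/tasks{suffix}"


recent_failures = tasks_url(status="failed", dry_run=False)
browser_only = tasks_url(execution_mode="browser")
by_failure_code = tasks_url(failure_code="provider_unavailable")
print(recent_failures, browser_only, by_failure_code, sep="\n")
```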
Log correlation
If callers supply metadata.trace_id on POST /api/v1/orchestrate, GraceKelly now echoes that value in:
- route-level orchestrate.request / orchestrate.accepted
- orchestrator-level task.submit.started / task.submit.completed
- task.event_persistence_failed warnings
That gives a minimal correlation key across HTTP entry, task creation, and best-effort event logging without requiring an external tracing system.
Health endpoint security
The GET /health endpoint returns a minimal summary by default (status, environment, backend name, saturation counts). Internal component details are hidden.
To expose full details (storage schema, browser circuit-breaker state, adapter keys present/absent):

    set GRACEKELLY_HEALTH_EXPOSE_DETAILS=true

Security implications:
- The detailed view reveals which adapters have API keys configured and whether the browser session is authenticated.
- Keep GRACEKELLY_HEALTH_EXPOSE_DETAILS=false (default) in any internet-facing deployment.
- The detailed view is safe on an internal monitoring network or when the health endpoint is behind API key authentication.
- GET /api/v1/health/detailed always returns full adapter and embeddings status; protect it with GRACEKELLY_API_KEY if health endpoints are public.
Request timeout (orchestrate)
POST /api/v1/orchestrate runs synchronously in a thread pool. To cap execution time and return HTTP 504 to the caller instead of holding the connection indefinitely:

    set GRACEKELLY_ORCHESTRATE_TIMEOUT_SECONDS=60

Setting 0 (default) disables the timeout.
How it works:
- The orchestration coroutine is wrapped in asyncio.wait_for with the configured timeout.
- On breach, the endpoint returns 504 Gateway Timeout with detail: "Orchestration request timed out."
- The background thread continues running until the underlying adapter call completes or fails on its own. The timeout only affects the HTTP response, not the execution itself.
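
The interaction between asyncio.wait_for and a worker thread can be reproduced in isolation. The sketch below is not the endpoint code: slow_adapter_call and handle are hypothetical stand-ins, and the durations are shortened for demonstration. It shows the caller receiving a 504 while the thread still runs to completion.

```python
import asyncio
import time


def slow_adapter_call(seconds: float, finished: list) -> str:
    """Stand-in for a blocking adapter call running in a worker thread."""
    time.sleep(seconds)
    finished.append("finished")
    return "answer"


async def handle(timeout_seconds: float, work_seconds: float, finished: list):
    """Return (status, detail); a timeout of 0 disables the cap."""
    work = asyncio.to_thread(slow_adapter_call, work_seconds, finished)
    try:
        if timeout_seconds > 0:
            result = await asyncio.wait_for(work, timeout=timeout_seconds)
        else:
            result = await work
        return 200, result
    except asyncio.TimeoutError:
        return 504, "Orchestration request timed out."


async def demo():
    finished = []
    status, detail = await handle(0.05, 0.2, finished)
    seen_at_response = list(finished)  # thread has not finished yet
    await asyncio.sleep(0.5)           # let the worker thread complete
    return status, detail, seen_at_response, finished


status, detail, seen_at_response, seen_later = asyncio.run(demo())
print(status, detail, seen_at_response, seen_later)
```

Because the thread cannot be interrupted, a timed-out request still consumes browser quota and execution slots until the adapter call returns.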
Tuning guidance:
- Start with the slowest expected model timeout + 10 s of overhead (e.g. Anthropic 120 s -> set 130 s).
- For dry-run mode, 5 s is sufficient.
- For consensus V2 with multiple rounds, account for max_rounds x variations_per_round x model_timeout_seconds.
- Pair this setting with load-balancer / reverse-proxy timeouts: both must be larger than the orchestrate timeout.
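
The tuning arithmetic can be worked through as follows. The function names are hypothetical, and the consensus inputs (2 rounds, 3 variations) are illustrative values rather than documented defaults.

```python
def single_model_timeout(model_timeout_seconds: int, overhead_seconds: int = 10) -> int:
    """Slowest expected model timeout plus a fixed overhead."""
    return model_timeout_seconds + overhead_seconds


def consensus_budget(max_rounds: int, variations_per_round: int, model_timeout_seconds: int) -> int:
    """Worst-case budget when every variation in every round hits the model timeout."""
    return max_rounds * variations_per_round * model_timeout_seconds


anthropic_setting = single_model_timeout(120)   # the 120 s -> 130 s example above
consensus_setting = consensus_budget(2, 3, 120)  # illustrative inputs
print(anthropic_setting, consensus_setting)
```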
Known integrators
V2 is the only active orchestrator. All three known clients connect to it at http://127.0.0.1:8011:
- RAG_Support_Assistant (RAG_Support_Assistant, port 8000)
  - Smoke: python RAG_Support_Assistant\scripts\gracekelly_smoke.py
  - Failover provider: ollama (when V2 returns 5xx).
- agent_toolkit (agent_toolkit)
  - LangGraph wrapper (OrchestratorChatModel) → V2 endpoints by GKPattern.
  - Test: cd agent_toolkit && uv run pytest tests/integration/
- juhub (Perplexity_Orchestrator2\juhub, scheduled 08:30 daily)
  - backend/scheduler.py does pre-flight :8011/healthz/ready; if V2 is down, the run is skipped with an error log (no auto-start).
  - Manual dry-run: cd Perplexity_Orchestrator2 && set GK_DRY_RUN=1 && python -m juhub.backend.scheduler --now
The legacy V1 orchestrator at Perplexity_Orchestrator2 (:8001, /api/gk/*) was deprecated on 2026-04-25 and is not used by any client. See Perplexity_Orchestrator2\DEPRECATED.md.