Ratio metrics: analyzable + delta-method sizing + demo (Phase 3 T3.1, 2026-06-26)
Workstream (2) of Юля’s chosen next steps (handoff #20). The ratio live-stats block (F2 delta
method) only renders for an analyzed project, but routes/analysis._build_calculation_payload
rejected ratio sizing with a 422 — so a ratio experiment could never be analyzed and its live block
never surfaced, and no ratio demo could exist. This makes ratio a first-class, analyzable metric and
adds a ratio demo.
Approach — delta-method sizing reduces to the continuous formula
A ratio metric R = E[Y]/E[X] is sized by the delta method: the per-user linearized value
Y - R*X is the analysis unit, with mean R (the baseline ratio) and a per-user standard deviation
σ_L. The two-sample sample-size formula is then exactly the continuous one with
baseline = R, std_dev = σ_L. So ratio sizing reuses the verified continuous math (no new
statistic) and — crucially — produces a normal calculation_result, leaving the report builder,
the frontend CalculationsSection, and the decision service unchanged. This is a far smaller
blast radius than threading “optional sizing” through the whole report/UI stack, and it makes ratio
fully first-class (sizing and the existing live delta-method analysis).
Verified parity: a ratio payload and the equivalent continuous payload produce an identical planned
sample size (test_ratio_calculation_reduces_to_continuous_delta_method).
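The reduction can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the relative-MDE convention, and the default alpha/power are assumptions, and σ_L is assumed to be the standard deviation of Y - R*X divided by the mean denominator so that it sits on the same scale as R.

```python
import math
from statistics import NormalDist


def continuous_sample_size_sketch(baseline, std_dev, mde_rel, alpha=0.05, power=0.8):
    """Per-arm n for a two-sided, two-sample z-test on a mean (equal arms, equal variance)."""
    delta = baseline * mde_rel  # absolute effect implied by a relative MDE (assumed convention)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * std_dev ** 2 / delta ** 2)


def linearized_std(numerators, denominators):
    """Return (R, sigma_L) where sigma_L = sd(Y - R*X) / mean(X) (assumed scaling)."""
    n = len(numerators)
    mean_x = sum(denominators) / n
    ratio = sum(numerators) / sum(denominators)
    resid = [y - ratio * x for y, x in zip(numerators, denominators)]
    mean_resid = sum(resid) / n  # zero up to floating-point error
    var = sum((r - mean_resid) ** 2 for r in resid) / (n - 1)
    return ratio, math.sqrt(var) / mean_x


def ratio_sample_size_sketch(baseline_ratio, sigma_l, mde_rel, alpha=0.05, power=0.8):
    """Ratio sizing is the continuous formula with baseline = R and std_dev = sigma_L."""
    return continuous_sample_size_sketch(baseline_ratio, sigma_l, mde_rel, alpha, power)


# Parity check with the demo template's inputs and a hypothetical 10% relative MDE.
n_ratio = ratio_sample_size_sketch(0.05, 0.09, 0.10)
n_continuous = continuous_sample_size_sketch(0.05, 0.09, 0.10)
print(n_ratio == n_continuous)
```

Because the ratio path only renames its inputs before calling the continuous formula, the two results are equal by construction, which is the property the parity test pins down.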
Changes
Backend
- schemas/api.py: CalculationRequest.metric_type accepts "ratio"; its validator requires a positive baseline ratio and a positive std_dev for ratio (ratio-specific i18n messages).
- services/calculations_service.py: the ratio branch reduces to calculate_continuous_sample_size (baseline R, std_dev σ_L), then restamps metric_type="ratio" and the lead assumption to name the delta-method linearization. CUPED stays continuous-only; the bayesian path already routes non-binary through the continuous formula; sequential is metric-agnostic.
- routes/analysis.py: _build_calculation_payload no longer blanket-rejects ratio; it raises a clear 422 only when a ratio metric is missing std_dev (otherwise sizing proceeds like continuous).
- app/backend/app/i18n/*.json (7 locales): errors.schemas.ratio_baseline_positive / ratio_std_positive.
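The ratio validation rule described above could look roughly like the following. This is a stand-alone sketch using a plain dataclass: CalcRequestSketch and validate_ratio_fields are hypothetical names, not the project's schema or validator, and only the two i18n keys come from the change list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CalcRequestSketch:
    """Hypothetical stand-in for the request schema; only the fields needed here."""
    metric_type: str
    baseline_value: float
    std_dev: Optional[float] = None


def validate_ratio_fields(req: CalcRequestSketch) -> List[str]:
    """Return i18n error keys for a ratio request; an empty list means valid."""
    errors: List[str] = []
    if req.metric_type != "ratio":
        return errors  # other metric types are validated elsewhere
    if not req.baseline_value > 0:
        errors.append("errors.schemas.ratio_baseline_positive")
    if req.std_dev is None or not req.std_dev > 0:
        errors.append("errors.schemas.ratio_std_positive")
    return errors


print(validate_ratio_fields(CalcRequestSketch("ratio", 0.05, 0.09)))
print(validate_ratio_fields(CalcRequestSketch("ratio", 0.05)))
```

In this sketch a non-empty list is what the route layer would translate into a 422; how the real code raises and localizes the error is not shown here.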
Demo (data + template)
- templates/ad_ctr_ratio.yaml: new built-in ratio template (Feed Ad Click-Through Ratio: ad_ctr = ad_clicks / ad_impressions, R=0.05, σ_L=0.09).
- startup_seed.py: adds the ratio demo to SAMPLE_PROJECTS (now 4 demo projects). It flows through the ordinary analyze path automatically (now that ratio is analyzable).
- demo_execution.py: build_ad_ctr_execution seeds 1200 users/arm with a per-user variable number of impressions (the denominator differs per user, which is the point of a ratio) and a binomial number of clicks at the arm’s true rate (control 0.046 → treatment 0.062), so the live ratio comparison reads significant, the always-valid view crosses, and the decision reads ship.
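The shape of the seeded demo data can be sketched as follows. This is not build_ad_ctr_execution: build_ratio_arm and pooled_ratio are hypothetical helpers, and the impression range (5 to 40 per user) and the seeds are assumptions. Only the 1200 users per arm and the two true rates come from the notes above.

```python
import random


def build_ratio_arm(rate, users=1200, seed=0):
    """Return one (clicks, impressions) pair per user for a single arm."""
    rng = random.Random(seed)  # local RNG so the output is deterministic per seed
    rows = []
    for _ in range(users):
        impressions = rng.randint(5, 40)  # assumed range; the denominator varies per user
        # Binomial(impressions, rate) drawn as a sum of Bernoulli trials
        clicks = sum(rng.random() < rate for _ in range(impressions))
        rows.append((clicks, impressions))
    return rows


def pooled_ratio(rows):
    """Ratio of sums: total clicks divided by total impressions."""
    return sum(c for c, _ in rows) / sum(i for _, i in rows)


control = build_ratio_arm(0.046, seed=1)
treatment = build_ratio_arm(0.062, seed=2)
print(len(control), len(treatment))
print(pooled_ratio(control) < pooled_ratio(treatment))
```

Seeding a local random.Random instead of the module-level generator keeps the builder reproducible regardless of what else has drawn random numbers, which is what a determinism test needs.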
Frontend
- lib/field-config.ts: the std_dev field is shown for ratio (not just continuous); the baseline_value / std_dev tooltips explain the ratio meaning (baseline ratio R; per-user linearized std).
- hooks/useCalculationPreview.ts: canCompute allows ratio once R>0 and std_dev>0.
- lib/payload.ts: buildCalculationPayload no longer throws for ratio; it sizes ratio like a continuous metric.
- lib/generated/api-contract.ts: regenerated (CalculateRequest metric_type now includes ratio).
Verification
- New tests: calculator ratio↔continuous parity + std_dev requirement (test_calculations.py); /api/v1/design sizes ratio (200) and rejects ratio-without-std_dev (422) (test_api_routes.py); ratio demo builder shape + determinism (test_demo_execution.py); the seeded ratio demo’s live block is significant on the demo path (test_startup_seed.py); ratio preview path (useCalculationPreview.test.tsx).
- Updated stale counts: built-in templates 10→11, demo projects 3→4.
- Gate (serial, Windows): ruff ✓ · mypy --strict 67 ✓ · backend suite ✓ · tsc ✓ · targeted vitest ✓ · vite build 493.59 kB < 500 ✓ · contract --check ✓ · locale-content 14 clean ✓.
- End-to-end (seeded demo, SQLite): ratio R control 0.0445 vs treatment 0.0628, p≈0, frequentist + always-valid significant, SRM ok, decision = ship, ratio read ~6 ms (fast after the live-read indexing work).
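For readers who want to see how a frequentist ratio comparison of this kind can be computed, here is a minimal delta-method z-test sketch. It is an illustration under assumptions, not the live-stats code: ratio_and_se and ratio_z_test are hypothetical names, the test is a plain two-sided z-test, and the toy rows are made up for the example.

```python
import math
from statistics import NormalDist


def ratio_and_se(rows):
    """rows: one (numerator, denominator) pair per user. Returns (R, delta-method SE of R)."""
    n = len(rows)
    sum_y = sum(y for y, _ in rows)
    sum_x = sum(x for _, x in rows)
    ratio = sum_y / sum_x
    mean_x = sum_x / n
    resid = [y - ratio * x for y, x in rows]  # sums to zero by construction
    var_resid = sum(r * r for r in resid) / (n - 1)
    return ratio, math.sqrt(var_resid / n) / mean_x


def ratio_z_test(control, treatment):
    """Two-sided z-test on the difference of two ratio metrics."""
    r_c, se_c = ratio_and_se(control)
    r_t, se_t = ratio_and_se(treatment)
    z = (r_t - r_c) / math.sqrt(se_c ** 2 + se_t ** 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return r_t - r_c, z, p_value


# Toy data (made up): 200 users per arm, four repeating user profiles.
toy_control = [(1, 20), (0, 15), (2, 30), (1, 25)] * 50
toy_treatment = [(3, 20), (2, 15), (4, 30), (3, 25)] * 50
diff, z, p_value = ratio_z_test(toy_control, toy_treatment)
print(diff > 0, p_value < 0.05)
```

The standard error uses the per-user residual Y - R*X rather than treating each impression as independent, which is why a varying denominator per user does not understate the variance.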