Architecture

Optra Prism is three stacked measurement layers and one parallel intelligence pipeline. Layers are numbered bottom-up by data flow — signals rise from Layer 0 through Layer 1 into Layer 2 — and each layer holds exactly one kind of thing.

flowchart LR
    L0["Layer 0 — Telemetry<br/>OpenTelemetry from Claude Code"] --> L1["Layer 1 — Measurement<br/>Metrics · PES · SSE"]
    L1 --> L2["Layer 2 — Prism Score<br/>Speed · Skill · Efficiency"]
    L1 -.-> PIQ["Insight<br/>Intent-adaptive agents"]
    PIQ -.-> L1

    style L0 fill:#22c55e,stroke:#16a34a,color:#fff
    style L1 fill:#f59e0b,stroke:#d97706,color:#fff
    style L2 fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style PIQ fill:#a855f7,stroke:#9333ea,color:#fff

Layer 0 is the single source of truth. Claude Code emits cost, tokens, events, links, and labels via OpenTelemetry. There is no git webhook, no repo scanner, and no CI integration.

Three flags need to be set for full functionality:

CLAUDE_CODE_ENABLE_TELEMETRY=1
OTEL_LOG_USER_PROMPTS=1
OTEL_LOG_TOOL_DETAILS=1

Without OTEL_LOG_USER_PROMPTS, Prompt Efficiency scoring and sub-session detection both run in degraded mode — the prompt text isn’t available to the rubric agents. Pure-counter metrics (tokens, turns, active time) are unaffected. Without OTEL_LOG_TOOL_DETAILS, Quality Retention file-path extraction breaks; sub-session efficiency still runs on timestamps and token counters.

The /prism:setup flow wires these flags up automatically.
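For a manual setup, a quick pre-flight check can report which flags are missing and what degrades as a result. This is a minimal sketch: the flag names and their effects come from the text above, while the helper name `missing_flags` and the rule that only the value `1` counts as enabled are assumptions made for illustration.

```python
import os

# Flag names and impacts are taken from the section above.
# Treating only the literal value "1" as enabled is an assumption.
REQUIRED_FLAGS = {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "telemetry export (Layer 0)",
    "OTEL_LOG_USER_PROMPTS": "Prompt Efficiency scoring and sub-session detection",
    "OTEL_LOG_TOOL_DETAILS": "Quality Retention file-path extraction",
}


def missing_flags(env=None):
    """Return {flag: affected feature} for every flag that is not set to "1"."""
    env = os.environ if env is None else env
    return {
        name: impact
        for name, impact in REQUIRED_FLAGS.items()
        if env.get(name) != "1"
    }


# Report against the current shell environment.
for name, impact in missing_flags().items():
    print(f"{name} is not set to 1: {impact} will be degraded or unavailable")
```

Passing a plain dictionary instead of the real environment makes the check easy to exercise in isolation.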

Layer 1 has three parallel verticals, each measuring something structurally different:

  • Metrics — aggregations over sub-session or week. Iteration Efficiency (IE), Context Reset Rate (CRR), Flow Continuity (FC), Response Latency Ratio, Quality Retention, Weekly Token Usage.
  • Prompt Efficiency Score (PES) — per-prompt anchor. The Efficiency Rubric combines four LLM-judged dimensions — Context Leverage, Information Density, Turn Economy, Ambiguity Cost — into a single 0–10 score. Language-agnostic by construction: the agents judge semantic content, so prompts in any human language score the same for the same underlying behavior.
  • Sub-Session Efficiency Score (SSE) — ground truth. Weighted log-mean of four baseline ratios (tokens, turns, human think time, response latency), clipped per-axis and mapped to 0–10 with a B-centered asymmetric curve. One goal = one sub-session = one score.
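The "weighted log-mean" in the SSE description is a weighted geometric mean of the per-axis ratios. The sketch below shows only that step, with per-axis clipping applied first. The weights, the clip bounds, and the convention that a ratio above 1 means better than baseline are all assumptions; the source does not give them, and the B-centered mapping to 0–10 is left out because its shape is not specified here.

```python
import math

# Hypothetical values: the source names the four axes but not the weights or bounds.
WEIGHTS = {"tokens": 0.25, "turns": 0.25, "think_time": 0.25, "latency": 0.25}
CLIP = (0.25, 4.0)


def weighted_log_mean(ratios, weights=WEIGHTS, clip=CLIP):
    """Weighted geometric mean of per-axis ratios, each clipped to `clip` first.

    `ratios` maps axis name -> baseline ratio (assumed > 0, and assumed to be
    oriented so that values above 1 beat the baseline).
    """
    lo, hi = clip
    total_weight = sum(weights[axis] for axis in ratios)
    log_sum = sum(
        weights[axis] * math.log(min(max(ratio, lo), hi))
        for axis, ratio in ratios.items()
    )
    return math.exp(log_sum / total_weight)
```

Averaging in log space means a 2× gain on one axis exactly cancels a 2× loss on an equally weighted axis, and clipping keeps a single extreme axis from dominating the result.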

The two scoring verticals form a calibration pair: the per-prompt score predicts, the sub-session score verifies. Target correlation is Pearson ≥ 0.65 on a rolling window.
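The calibration check can be sketched as a Pearson correlation over the most recent paired scores. The pairing rule (one mean PES per sub-session matched to that sub-session's SSE), the window size, and the function names are assumptions for illustration; only the 0.65 target comes from the text.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant series; treated as 0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def calibration_ok(pairs, window=50, target=0.65):
    """Check the last `window` (pes, sse) pairs against the target correlation.

    `window=50` is a hypothetical default; the source only says "rolling window".
    """
    recent = pairs[-window:]
    pes_scores = [pes for pes, _ in recent]
    sse_scores = [sse for _, sse in recent]
    return pearson(pes_scores, sse_scores) >= target
```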

Layer 2 holds three developer-facing scores, derived from Layer 1 by pure arithmetic; no LLM judgment happens at this layer.

| Score | Formula (summary) | Unit |
| --- | --- | --- |
| Speed | Σ(active_cli_seconds) / 3600, discounted when Quality Retention < 85% | Hours/week |
| Skill | 100 × (0.45·SSE + 0.20·PES + 0.15·IE + 0.10·CRR + 0.10·FC) | 0–100 |
| Efficiency | Σ(tokens) / Σ(active_cli_hours) (lower is better) | Tokens/hour |

See PRISM Scores for the full definitions and tier tables.
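The summary formulas translate directly into arithmetic. The sketch below assumes every Skill component has already been normalized to the 0–1 range with higher meaning better (for example, SSE and PES divided by 10), which the summary formula implies but does not state. It leaves out the Quality Retention discount on Speed because its exact form is not given here. Function names are illustrative.

```python
# Weights are taken from the Skill formula in the table above.
SKILL_WEIGHTS = {"sse": 0.45, "pes": 0.20, "ie": 0.15, "crr": 0.10, "fc": 0.10}


def speed_hours(active_cli_seconds):
    """Speed: total active CLI time in hours (Quality Retention discount omitted)."""
    return sum(active_cli_seconds) / 3600


def skill(components):
    """Skill: weighted sum scaled to 0-100.

    Assumes each component is already normalized to 0-1, higher is better.
    """
    return 100 * sum(SKILL_WEIGHTS[name] * components[name] for name in SKILL_WEIGHTS)


def efficiency(tokens, active_cli_seconds):
    """Efficiency: tokens per active CLI hour (lower is better)."""
    hours = speed_hours(active_cli_seconds)
    if hours == 0:
        raise ValueError("no active time recorded")
    return sum(tokens) / hours
```

For example, two sessions of 1,800 and 5,400 active seconds give a Speed of 2.0 hours; if they consumed 30,000 and 90,000 tokens, Efficiency is 60,000 tokens/hour.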

Insight runs alongside the layers, not above them. It’s a multi-agent pipeline that turns raw prompts into the per-prompt score that feeds Layer 1:

  1. Language detection + intent classification — classifies each prompt into one of seven intents (new_code, fix, refactor, question, meta, continuation, system_callback), with confidence.
  2. Rubric routing — intents that describe actionable work are routed to an intent-specific rubric agent (authoring, debugging, planning). Intents like question, meta, continuation, and system_callback short-circuit and don’t get rubric-scored.
  3. Rubric scoring — the rubric agent scores the four PES dimensions on a 0–10 scale.
  4. Aggregation + confidence — the four dimensions combine into the PES score; a confidence value is stored alongside for audit.

Everything Insight produces is auditable end-to-end: each agent’s output is logged so a human can see why a prompt scored the way it did.
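The control flow of the four steps can be sketched as follows. The intent names and the short-circuit set come from the list above; the intent-to-agent mapping, the unweighted mean used for aggregation, and the `classify` / `score` callables (stand-ins for the LLM agents) are assumptions for illustration. Language detection is omitted.

```python
from dataclasses import dataclass, field
from typing import Optional

SHORT_CIRCUIT = {"question", "meta", "continuation", "system_callback"}
# Hypothetical mapping: the source lists the agents but not which intent goes where.
RUBRIC_AGENT = {"new_code": "authoring", "fix": "debugging", "refactor": "planning"}
DIMENSIONS = ("context_leverage", "information_density", "turn_economy", "ambiguity_cost")


@dataclass
class InsightResult:
    intent: str
    agent: Optional[str]
    pes: Optional[float]
    confidence: float
    audit: list = field(default_factory=list)  # one entry per pipeline step


def run_insight(prompt, classify, score):
    """Classify, route, score, and aggregate one prompt, logging every step."""
    intent, confidence = classify(prompt)
    audit = [("classify", intent, confidence)]
    if intent in SHORT_CIRCUIT:
        audit.append(("route", "short-circuit"))
        return InsightResult(intent, None, None, confidence, audit)
    agent = RUBRIC_AGENT[intent]
    audit.append(("route", agent))
    dims = score(agent, prompt)  # expected: {dimension: value on a 0-10 scale}
    audit.append(("score", dims))
    # Assumed aggregation: unweighted mean of the four dimensions.
    pes = sum(dims[d] for d in DIMENSIONS) / len(DIMENSIONS)
    audit.append(("aggregate", pes))
    return InsightResult(intent, agent, pes, confidence, audit)
```

Keeping the audit trail on the result object mirrors the requirement that a human can trace why a prompt received its score.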

sequenceDiagram
    participant Dev as You
    participant Plugin as Prism Plugin
    participant Ingest as Ingest Service
    participant Engine as Prism Engine
    participant Dash as Dashboard

    Dev->>Plugin: Write a prompt
    Plugin-->>Dev: Single-line nudge (if needed)
    Plugin->>Ingest: Send OTLP + prompt text
    Ingest->>Engine: NATS publish
    Engine->>Engine: Parquet write → Insight → SSE → Skill
    Dev->>Dash: Open /prism
    Dash-->>Dev: Speed / Skill / Efficiency

Everything the dashboard shows is recomputable from S3 Parquet + Postgres — no scores live only in memory.

Plugin traffic authenticates with your gck_* API key; the dashboard uses a separate login:

  • The plugin includes your key on every request to Ingest
  • Ingest validates the key and associates data with your organization
  • Dashboard access uses Supabase login (email/password or OAuth)

Your key is stored locally in ~/.prism/config.json with restricted file permissions.
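To confirm the permissions on a POSIX system, a short check like the one below can help. It assumes "restricted" means no access for group or others (for example mode 0600); the source does not state the exact mode, and the helper name is illustrative.

```python
import os
import stat
from pathlib import Path

# Path is taken from the text above.
CONFIG_PATH = Path.home() / ".prism" / "config.json"


def is_owner_only(path):
    """Return True if neither group nor others have any permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


if CONFIG_PATH.exists() and not is_owner_only(CONFIG_PATH):
    print(f"{CONFIG_PATH} is readable by other users; consider chmod 600")
```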

There are two kinds of gck_* keys in the system, and they do different jobs:

| Key | Who holds it | What it’s used for |
| --- | --- | --- |
| Plugin key | You, locally in ~/.prism/config.json | Authenticates your Claude Code traffic when it’s redirected through the Optra gateway; this is the key you set with /prism:setup |
| Platform key | The Prism Engine, as OPTRA_GATEWAY_KEY in its environment | Authenticates server-side LLM calls for rubric scoring, session summaries, and the dashboard advisor |

The two are not interchangeable. The plugin key is scoped to your developer identity and gateway governance; the platform key is a service credential the engine uses to call the Optra gateway on your behalf. Mixing them up is the usual cause of “gateway 401” errors during engine boot.