PRISM Score

PRISM Score is one number:

The percentage of your sessions that crushed their goal.

PRISM Score = 100 × crushed_count / total_count

Trivial chat and goal-less ambient sessions are excluded from both sides of the ratio. They don’t pad your score and they don’t drag it down.
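
As a minimal sketch of the ratio above: the `Session` class and its two fields are hypothetical names chosen for illustration, and returning `None` when nothing remains in the denominator is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Session:
    crushed: bool   # hypothetical flag: the session crushed its goal
    excluded: bool  # hypothetical flag: trivial chat or goal-less ambient session


def prism_score(sessions: List[Session]) -> Optional[float]:
    """Return 100 * crushed_count / total_count over non-excluded sessions."""
    # Excluded sessions leave both the numerator and the denominator.
    scored = [s for s in sessions if not s.excluded]
    if not scored:
        return None  # assumption: no score when nothing is left to count
    crushed_count = sum(1 for s in scored if s.crushed)
    return 100 * crushed_count / len(scored)


example_sessions = [
    Session(crushed=True, excluded=False),
    Session(crushed=True, excluded=False),
    Session(crushed=True, excluded=False),
    Session(crushed=False, excluded=False),
    Session(crushed=False, excluded=True),  # ignored on both sides of the ratio
]
example_score = prism_score(example_sessions)
print(example_score)  # 75.0
```

Three of the four counted sessions crushed, so the excluded fifth session neither raises nor lowers the result.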

This page is the hub. The sub-pages below break it down into supporting metrics — each one is also a sidebar entry under PRISM Score v3.

What “crushed” means

A session crushes when all four of these are true:

Substance floor passed. Real work happened: ≥3 turns, ≥10 net lines of code, or ≥1 mutating tool call.
Goal complete. An LLM outcome judge says the session landed its goal against per-intent rules.
Not rework. A later session didn’t revert or rewrite the same code.
Intent established. The rubric judge committed to a clear class (Question, Bug fix, Feature, Refactor, etc.).

If any one fails, the session doesn’t crush. If the substance floor or the intent step fails, the session also drops out of the denominator — see Trivia.
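
The four conditions and the denominator rule can be sketched as a three-way classification. This is an illustration only: the `SubSession` fields and the outcome labels are hypothetical, the goal and rework verdicts are taken as precomputed inputs, and the substance floor is read here as "any one of the three thresholds", which is an assumption about the wording above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubSession:
    turns: int
    net_lines: int
    mutating_tool_calls: int
    intent: Optional[str]  # None when the rubric judge did not commit to a class
    goal_complete: bool    # verdict of the outcome judge (taken as given here)
    reworked_later: bool   # a later session reverted or rewrote the same code


def passes_substance_floor(s: SubSession) -> bool:
    # Assumption: meeting any single threshold is enough.
    return s.turns >= 3 or s.net_lines >= 10 or s.mutating_tool_calls >= 1


def classify(s: SubSession) -> str:
    # Failing the floor or the intent step removes the session from the ratio.
    if not passes_substance_floor(s) or s.intent is None:
        return "excluded"
    # The two remaining conditions decide between crushed and not crushed.
    if s.goal_complete and not s.reworked_later:
        return "crushed"
    return "not_crushed"


landed = SubSession(5, 40, 2, "Bug fix", goal_complete=True, reworked_later=False)
reverted = SubSession(5, 40, 2, "Bug fix", goal_complete=True, reworked_later=True)
chatter = SubSession(1, 0, 0, "Question", goal_complete=True, reworked_later=False)
print(classify(landed), classify(reverted), classify(chatter))
```

Only "crushed" and "not_crushed" sessions count toward the denominator; "excluded" sessions are the ones that surface on the Trivia page.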

For the full algorithm — boundary detection, per-intent completion criteria, anti-fragmentation merge — see Algorithm Overview.

The sub-score pages

The hub shows the headline percentage and a trend line. The supporting pages drill into what’s driving it.

Page	Route	What it shows
Speed	`/score-v3/speed`	CSPW — crushed sessions per week. TTC — median time from session start to crush.
Skill	`/score-v3/skill`	APG — average Prompt Grade (letter grade rolled up from the per-prompt rubric).
Prompt Grade	`/score-v3/prompt-grade`	Per-prompt rubric drilldown — which prompts graded well, which didn’t, and why.
Token Usage	`/score-v3/tokens`	Crush Weight — share of tokens spent in crushed sessions. Plus TET, TPT, and $CPCS (dollar cost per crushed session).
Trivia	`/score-v3/trivia`	Sessions excluded by the substance floor — what they were, why they dropped. Useful for sanity-checking the filter.
Integrity	`/score-v3/integrity`	Anti-gaming flags — sessions the judge couldn’t confidently rule on, or that look engineered to crush.
Sub-sessions	`/score-v3/sub-sessions`	Explorer for the underlying scored unit (a goal arc inside a Claude Code session).
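
A rough sketch of how three of the metrics in the table could be computed from per-session records. The dictionary keys (`tokens`, `crushed`, `minutes_to_crush`) are hypothetical, and the exact production definitions (for example, which sessions enter the token total) are not stated on this page, so treat the details as assumptions.

```python
from statistics import median


def crush_weight(sessions):
    """Share of tokens spent in crushed sessions (0..1), or None with no tokens."""
    total = sum(s["tokens"] for s in sessions)
    if total == 0:
        return None
    return sum(s["tokens"] for s in sessions if s["crushed"]) / total


def ttc_minutes(sessions):
    """Median time from session start to crush, over crushed sessions only."""
    times = [s["minutes_to_crush"] for s in sessions if s["crushed"]]
    return median(times) if times else None


def cspw(sessions, weeks):
    """Crushed sessions per week over a window of `weeks` weeks."""
    return sum(1 for s in sessions if s["crushed"]) / weeks


window = [
    {"tokens": 6000, "crushed": True, "minutes_to_crush": 12},
    {"tokens": 3000, "crushed": True, "minutes_to_crush": 20},
    {"tokens": 1000, "crushed": False, "minutes_to_crush": None},
]
print(crush_weight(window), ttc_minutes(window), cspw(window, weeks=2))
```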

Reading the headline number

The PRISM Score is intentionally outcome-based. You can write beautiful prompts and still fail to land bug fixes — and the older Skill composite would still grade you well for the prompts. The PRISM Score doesn’t.

That makes it harder to game but also more sensitive to how you finish:

Bug fix without verifying the fix → won’t crush. The completion criterion is fix applied + verification evidence.
Feature with no acceptance → won’t crush. You need scaffolding + acceptance + (tests OR explicit “tests later”).
Refactor without running tests → won’t crush. Behavior-preservation evidence is required.
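
The three completion criteria above can be written as boolean checks over evidence flags. In the real pipeline an LLM outcome judge makes this call; the function below is only a sketch, and the flag names are hypothetical. Intents whose criteria are not listed on this page raise an error rather than guess, and the refactor rule encodes only the one requirement stated here.

```python
def goal_complete(intent, evidence):
    """Check the per-intent completion rule against a set of evidence flags."""
    if intent == "Bug fix":
        # Fix applied + verification evidence.
        return {"fix_applied", "verification"} <= evidence
    if intent == "Feature":
        # Scaffolding + acceptance + (tests OR explicit "tests later").
        has_test_plan = "tests" in evidence or "tests_later" in evidence
        return {"scaffolding", "acceptance"} <= evidence and has_test_plan
    if intent == "Refactor":
        # Behavior-preservation evidence, e.g. the test suite was run.
        return "behavior_preserved" in evidence
    raise ValueError(f"no completion rule sketched for intent: {intent!r}")


print(goal_complete("Bug fix", {"fix_applied"}))
print(goal_complete("Feature", {"scaffolding", "acceptance", "tests_later"}))
```

The first call is the unverified bug fix from the list above and evaluates to `False`; the second is a feature with an explicit "tests later" note and evaluates to `True`.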

A few quick habits raise the hit rate without changing how you work much. See How to crush a session explicitly.

The older PRISM Score (v2.1)

The previous composite — Speed (hours/week), Skill (0.45·SSE + 0.20·PES + 0.15·IE + 0.10·CRR + 0.10·FC), Efficiency (tokens per hour) — still runs in parallel during the calibration window. Toggle to it from the sidebar if you want to compare.
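
For comparison, the v2.1 Skill composite is a plain weighted sum. The sketch below uses the weights quoted above; the component scale (0 to 100 here) and the function name are assumptions made for illustration.

```python
# Weights as quoted for the v2.1 Skill composite; they sum to 1.
SKILL_WEIGHTS = {"SSE": 0.45, "PES": 0.20, "IE": 0.15, "CRR": 0.10, "FC": 0.10}


def skill_v21(components):
    """Weighted sum of the five component scores (assumed to share one scale)."""
    return sum(weight * components[name] for name, weight in SKILL_WEIGHTS.items())


example_components = {"SSE": 90, "PES": 70, "IE": 80, "CRR": 60, "FC": 50}
example_skill = skill_v21(example_components)
print(round(example_skill, 2))
```

Because the weights sum to 1, the composite stays on the same scale as its inputs, and SSE alone accounts for nearly half of the result.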

When v3 reaches steady-state agreement against the hand-labeled set, v2.1 retires.