06 · LAMBDA-BER Use Case

Re-analysis triggers:
when the tools improve, the results catch up

Structure software never stops improving — a new reconstruction algorithm, a better CTF model, an updated sequence database. Old data often holds more than it gave the first time. Because every result records the exact software and inputs that produced it, the agent can find precisely which past results a new release affects, re-run only those, and report whether the numbers actually moved.

Trigger: new version / reference
Scope: by software_version + type
Action: re-run affected only
Output: a metric diff

What sets it off

Three kinds of trigger

A re-analysis sweep starts from a change in the world, not a calendar. The schema's pinned software_name / software_version on each WorkflowRun is what makes "which results are now stale?" a query rather than a guess.

software

New release

RELION 5, a new cryoSPARC release, or a Phenix point release with a better refinement target.

reference

Updated database

A refreshed sequence DB, a new ligand restraint library, or an improved reference model.

method

Better algorithm

Improved motion correction or CTF estimation that can squeeze more from existing raw frames.

Not Phenix-specific. The same mechanism re-runs a RELION refinement, re-normalizes an XAS scan, or re-fits a SAXS curve. What matters is that the workflow recorded its version — the trigger is technique-agnostic.

The sweep

Find the affected set, re-run, diff

trigger: RELION 5 lands
affected set: software_name='RELION' AND version < 5
re-run: same inputs, new version (N rows)
metric diff: resolution · R-free · clashscore (improved? regressed? flat?)
write: new WorkflowRun versions (old ones kept)
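The `version < 5` condition in the flow only works if versions are compared as numbers. Compared as text, '10.2' sorts before '5'. A minimal sketch of one way to do that comparison, assuming version strings are dot-separated with numeric leading components; the helper names `parse_version` and `is_superseded` are hypothetical and not part of the schema:

```python
from __future__ import annotations

import re


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '3.1.4' or '4.0-beta-2' into a tuple of integers, e.g. (3, 1, 4)."""
    parts = []
    for piece in text.strip().split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def is_superseded(recorded: str, released: str) -> bool:
    """True when the recorded version is strictly older than the new release."""
    a, b = parse_version(recorded), parse_version(released)
    width = max(len(a), len(b))
    # Pad with zeros so that '5' and '5.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


if __name__ == "__main__":
    for recorded in ["3.1.4", "4.0-beta-2", "5.0", "10.2"]:
        print(recorded, is_superseded(recorded, "5"))
```

Note that an unparseable version string reduces to an empty tuple and is therefore treated as superseded. Whether such runs should be swept or skipped is a policy choice this sketch does not settle.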

Scope the work

Re-run only what the change touches

The expensive mistake is reprocessing everything. The point of pinned versions is surgical scope: a query returns exactly the runs a release can improve, and nothing else is touched.

the affected-set query
-- which completed refinement runs used a now-superseded RELION 3.x?
SELECT w.workflow_code, w.software_version, e.experiment_code
FROM   workflow_run w
JOIN   workflow_input_assoc i USING (workflow_id)
JOIN   experiment_run e        ON e.id = i.experiment_id
WHERE  w.software_name = 'RELION'
  AND  w.workflow_type = 'refinement'
  AND  w.software_version LIKE '3.%'
  AND  w.processing_status = 'completed';
-- → 18 runs. Re-run these 18, not the other 3,400.
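To try the query without a real catalog, here is a minimal sketch that runs it against an in-memory SQLite database. The table and column names come from the query itself. The column types, the toy rows, the `ctf_estimation` workflow type, and the `affected_runs` helper are assumptions made for illustration.

```python
import sqlite3

# Toy schema: only the columns the query names. Types and keys are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE workflow_run (
    workflow_id       INTEGER PRIMARY KEY,
    workflow_code     TEXT,
    workflow_type     TEXT,
    software_name     TEXT,
    software_version  TEXT,
    processing_status TEXT
);
CREATE TABLE experiment_run (id INTEGER PRIMARY KEY, experiment_code TEXT);
CREATE TABLE workflow_input_assoc (workflow_id INTEGER, experiment_id INTEGER);

-- Made-up rows: one match, plus one near-miss per filter.
INSERT INTO experiment_run VALUES (1, 'EXP-TOY-1'), (2, 'EXP-TOY-2'), (3, 'EXP-TOY-3');
INSERT INTO workflow_run VALUES
    (10, 'WF-010', 'refinement',     'RELION', '3.1.2', 'completed'),
    (11, 'WF-011', 'refinement',     'RELION', '5.0',   'completed'),  -- already current
    (12, 'WF-012', 'refinement',     'Phenix', '1.21',  'completed'),  -- other software
    (13, 'WF-013', 'refinement',     'RELION', '3.0.8', 'failed'),     -- never completed
    (14, 'WF-014', 'ctf_estimation', 'RELION', '3.1.2', 'completed');  -- other type
INSERT INTO workflow_input_assoc VALUES (10, 1), (11, 2), (12, 3), (13, 1), (14, 2);
""")

# Same shape as the query above, with the literals turned into parameters.
AFFECTED_SQL = """
SELECT w.workflow_code, w.software_version, e.experiment_code
FROM   workflow_run w
JOIN   workflow_input_assoc i USING (workflow_id)
JOIN   experiment_run e ON e.id = i.experiment_id
WHERE  w.software_name = ?
  AND  w.workflow_type = ?
  AND  w.software_version LIKE ?
  AND  w.processing_status = 'completed'
ORDER  BY w.workflow_code
"""


def affected_runs(db, software_name, workflow_type, version_pattern):
    """Return (workflow_code, software_version, experiment_code) rows in scope."""
    return db.execute(
        AFFECTED_SQL, (software_name, workflow_type, version_pattern)
    ).fetchall()


affected = affected_runs(conn, "RELION", "refinement", "3.%")
print(affected)  # [('WF-010', '3.1.2', 'EXP-TOY-1')]
```

Passing the software name, type, and version pattern as parameters keeps the same statement reusable for any other tool that records its version.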

The result

A diff that says whether it was worth it

Re-running produces new WorkflowRun rows beside the old ones — never overwriting history. The agent reports the metric deltas so a scientist can accept the improvements, investigate regressions, and ignore the unchanged.

Run                  Metric           Was (v3)   Now (v5)   Δ
EXP-NCP-CRYOEM-001   resolution (Å)   3.4        3.0        ▲ 0.4 better
EXP-COMPLEX-001      map-model FSC    0.71       0.78       ▲ improved
EXP-APO-014          resolution (Å)   2.9        2.9        = flat
EXP-LIG-022          clashscore       4.1        5.8        ▼ regressed (flag)
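Deciding "improved", "regressed", or "flat" depends on which direction is better for each metric: lower for resolution and clashscore, higher for map-model FSC. A minimal sketch of that classification, using the four rows from the table. The `classify` function, the `MetricDiff` record, and the single 0.01 tolerance are assumptions for illustration, not the agent's actual logic.

```python
from dataclasses import dataclass

# Whether a smaller value is the better one, per metric.
LOWER_IS_BETTER = {
    "resolution (Å)": True,
    "clashscore": True,
    "map-model FSC": False,
}


@dataclass(frozen=True)
class MetricDiff:
    run: str
    metric: str
    old: float
    new: float
    verdict: str  # "improved", "regressed", or "flat"


def classify(run: str, metric: str, old: float, new: float,
             tolerance: float = 0.01) -> MetricDiff:
    """Label a metric change. The shared tolerance is an assumed placeholder."""
    delta = new - old
    if abs(delta) <= tolerance:
        verdict = "flat"
    else:
        got_lower = delta < 0
        verdict = "improved" if got_lower == LOWER_IS_BETTER[metric] else "regressed"
    return MetricDiff(run, metric, old, new, verdict)


rows = [
    ("EXP-NCP-CRYOEM-001", "resolution (Å)", 3.4, 3.0),
    ("EXP-COMPLEX-001", "map-model FSC", 0.71, 0.78),
    ("EXP-APO-014", "resolution (Å)", 2.9, 2.9),
    ("EXP-LIG-022", "clashscore", 4.1, 5.8),
]
diffs = [classify(*row) for row in rows]
for d in diffs:
    print(f"{d.run:20s} {d.metric:15s} {d.old} -> {d.new}  {d.verdict}")
```

In practice each metric would likely need its own tolerance, since a 0.01 change means something different for an FSC value than for a clashscore.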

Pair this with the lakehouse (use case 05) and every sweep becomes a versioned snapshot: you can always answer "what did this structure look like before the RELION 5 reprocessing?"
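The "new rows beside the old ones" rule can be sketched as an append-only table: a re-run inserts a row and never updates one, so the earlier state stays queryable. This is a simplified illustration, not the real schema. The `supersedes_id` column, the denormalized `experiment_code`, the version strings '3.1' and '5.0', and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE workflow_run (
    workflow_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    experiment_code  TEXT NOT NULL,   -- denormalized here for brevity
    software_name    TEXT NOT NULL,
    software_version TEXT NOT NULL,
    resolution       REAL,
    supersedes_id    INTEGER REFERENCES workflow_run (workflow_id)  -- hypothetical
)
""")


def record_run(experiment_code, software_version, resolution, supersedes_id=None):
    """Append a run. There is deliberately no UPDATE or DELETE path."""
    cur = conn.execute(
        "INSERT INTO workflow_run (experiment_code, software_name, software_version,"
        " resolution, supersedes_id) VALUES (?, 'RELION', ?, ?, ?)",
        (experiment_code, software_version, resolution, supersedes_id),
    )
    return cur.lastrowid


def state_before(experiment_code, major):
    """Latest recorded run whose major version predates the given release."""
    return conn.execute(
        "SELECT software_version, resolution FROM workflow_run"
        " WHERE experiment_code = ?"
        "   AND CAST(software_version AS INTEGER) < ?"  # '3.1' casts to 3 in SQLite
        " ORDER BY workflow_id DESC LIMIT 1",
        (experiment_code, major),
    ).fetchone()


old_id = record_run("EXP-NCP-CRYOEM-001", "3.1", 3.4)
new_id = record_run("EXP-NCP-CRYOEM-001", "5.0", 3.0, supersedes_id=old_id)

print(state_before("EXP-NCP-CRYOEM-001", 5))  # ('3.1', 3.4)
```

Linking the new row back to the one it supersedes is one possible design. It makes the diff a simple self-join, though the catalog may track lineage differently.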

The agent at work

A release lands; the sweep runs itself

beril · claude-code · ~/lab-catalog
> RELION 5 is out — what of ours could benefit?

⏺ scope software_name='RELION' AND version LIKE '3.%' AND type='refinement'
→ 18 runs affected (of 3,418 total) · est. 14 GPU-hours

⏺ re-run (same inputs by storage_uri, RELION 5)
18/18 complete · new WorkflowRun rows written · old versions retained

⏺ diff metrics
▲ 11 improved (best: 3.4→3.0 Å) — flat 6
▼ 1 regressed (clashscore 4.1→5.8 on EXP-LIG-022) — flagged for review

⏺ summary
→ 3 structures now cross the 3.0 Å deposition threshold (see use case 05)

Payoff

What re-analysis triggers buy

[ ↻ ] Old data, new value

Raw collections keep paying off as methods improve, without anyone having to remember to revisit them.

[ ⌖ ] Surgical scope

Pinned versions mean you re-run the 18 runs a release touches, not the 3,400 it doesn't.

[ ⚑ ] Regressions surface

A diff catches the rare case where "newer" is worse, before it quietly ships into a deposition.

[ ⏱ ] History preserved

New results sit beside old ones as versioned rows, so every past state stays reproducible.