LLM EvalOps Lab
Release Evidence Dashboard
Portfolio-grade static dashboard for deterministic EvalOps, regression, RAG retrieval quality, safety red-team, experiment tracking, and release readiness artifacts.
Status: READY
Reports by type: EvalOps 2, RAG 1, Safety 1, Readiness 1
Gate Evidence
READY reflects the current release candidate, which passes every gate. The controlled blocked-release scenario shows that gates fail closed: when required retrieval evidence is missing, the release is blocked.
| Evidence Type | Status | Detail | Path |
|---|---|---|---|
| Current release candidate | READY | Aggregated EvalOps, RAG, safety, and readiness reports for the current candidate. | |
| Controlled blocked-release scenario | BLOCKED AS EXPECTED | rag-blocked-failure: pass_rate=0.000; expected missing evidence to fail closed. | reports/rag-blocked-failure.json |
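
The fail-closed behavior described above can be sketched as follows. This is a minimal illustration, not the project's actual gate: the function name `gate_passes`, the `passed` / `total` report fields, and the default threshold of 1.0 are all assumptions made for the example.

```python
from typing import Optional


def gate_passes(report: Optional[dict], min_pass_rate: float = 1.0) -> bool:
    """Return True only when evidence exists and meets the threshold.

    Hypothetical report shape: {"passed": int, "total": int}.
    """
    # Missing report: fail closed rather than assume success.
    if report is None:
        return False

    passed = report.get("passed")
    total = report.get("total")

    # Malformed or empty evidence is also treated as a failure.
    if not isinstance(passed, int) or not isinstance(total, int) or total <= 0:
        return False

    return passed / total >= min_pass_rate


if __name__ == "__main__":
    print(gate_passes(None))                        # no evidence at all
    print(gate_passes({"passed": 0, "total": 0}))   # empty evidence
    print(gate_passes({"passed": 5, "total": 5}))   # complete, passing evidence
```

The design choice is that every path that cannot positively confirm passing evidence returns `False`, so a missing or malformed report can never be mistaken for a pass.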

| Report | Type | Pass Rate | Passed / Total | Gate | Path |
|---|---|---|---|---|---|
| sample-fake | evalops | 1.000 | 3 / 3 | true | reports/sample-fake.json |
| expanded-fake | evalops | 1.000 | 7 / 7 | true | reports/expanded-fake.json |
| rag-sample | rag-evaluation | 1.000 | 5 / 5 | n/a | reports/rag-sample.json |
| redteam-fake | safety-redteam | 1.000 | 5 / 5 | n/a | reports/redteam-fake.json |
| release-readiness | release-readiness | 1.000 | 4 / 4 | n/a | reports/release-readiness.json |
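
As a rough sketch of how the figures in the report table could roll up into the dashboard summary, the snippet below recomputes pass rates, per-type counts, and an overall status from the rows shown above. The helper names and the rule that READY requires every report to pass all of its cases are assumptions for illustration; the source does not state the actual aggregation rule.

```python
from collections import Counter

# (report, type, passed, total) copied from the report table above.
REPORTS = [
    ("sample-fake", "evalops", 3, 3),
    ("expanded-fake", "evalops", 7, 7),
    ("rag-sample", "rag-evaluation", 5, 5),
    ("redteam-fake", "safety-redteam", 5, 5),
    ("release-readiness", "release-readiness", 4, 4),
]


def pass_rate(passed: int, total: int) -> float:
    """Fraction of passing cases; an empty report counts as 0.0."""
    return passed / total if total > 0 else 0.0


def count_by_type(reports) -> Counter:
    """Number of reports for each report type."""
    return Counter(report_type for _, report_type, _, _ in reports)


def overall_status(reports) -> str:
    """Assumed rule: READY only if there is evidence and every case passed."""
    if not reports:
        return "BLOCKED"
    all_pass = all(total > 0 and passed == total for _, _, passed, total in reports)
    return "READY" if all_pass else "BLOCKED"


if __name__ == "__main__":
    for name, _, passed, total in REPORTS:
        print(f"{name}: {pass_rate(passed, total):.3f}")
    print(dict(count_by_type(REPORTS)))
    print(overall_status(REPORTS))
```

Under this assumed rule, an empty report list yields BLOCKED, which keeps the aggregation consistent with the fail-closed behavior described in the Gate Evidence section.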