LLM EvalOps Lab

Release Evidence Dashboard

Portfolio-grade static dashboard for deterministic EvalOps, regression, RAG retrieval quality, safety red-team, experiment tracking, and release readiness artifacts.

Status: READY
EvalOps: 2
RAG: 1
Safety: 1
Readiness: 1

Gate Evidence

READY reflects the current passing release candidate. Controlled blocked-release evidence proves gates fail closed when required retrieval evidence is missing.
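
The fail-closed behavior described above can be sketched roughly as follows. This is a minimal illustration, not the lab's actual gate code: the function name `gate_status`, the `passed` and `total` report fields, and the default threshold of 1.0 are all assumptions made for the example.

```python
from typing import Optional


def gate_status(report: Optional[dict], threshold: float = 1.0) -> str:
    """Return "READY" only when evidence is present and meets the threshold.

    Hypothetical sketch: the field names and threshold are assumptions.
    """
    # Fail closed: a missing or empty report blocks the release.
    if not report:
        return "BLOCKED"

    passed = report.get("passed")
    total = report.get("total")

    # Fail closed: malformed counts or zero evaluated cases also block.
    if not isinstance(passed, int) or not isinstance(total, int) or total <= 0:
        return "BLOCKED"

    pass_rate = passed / total
    return "READY" if pass_rate >= threshold else "BLOCKED"
```

The point of the design is that the default outcome is BLOCKED: the gate has to find positive evidence to return READY, so a missing retrieval report can never pass by omission.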

| Evidence Type | Status | Detail | Path |
| --- | --- | --- | --- |
| Current release candidate | READY | Aggregated EvalOps, RAG, safety, and readiness reports for the current candidate. | |
| Controlled blocked-release scenario | BLOCKED AS EXPECTED | rag-blocked-failure: pass_rate=0.000; expected missing evidence to fail closed. | reports/rag-blocked-failure.json |

| Report | Type | Pass Rate | Passed / Total | Gate | Path |
| --- | --- | --- | --- | --- | --- |
| sample-fake | evalops | 1.000 | 3 / 3 | true | reports/sample-fake.json |
| expanded-fake | evalops | 1.000 | 7 / 7 | true | reports/expanded-fake.json |
| rag-sample | rag-evaluation | 1.000 | 5 / 5 | n/a | reports/rag-sample.json |
| redteam-fake | safety-redteam | 1.000 | 5 / 5 | n/a | reports/redteam-fake.json |
| release-readiness | release-readiness | 1.000 | 4 / 4 | n/a | reports/release-readiness.json |
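
As a rough illustration of how the Pass Rate column and the overall READY status could be derived from the Passed / Total counts, here is a small sketch. The `ReportRow` class, the helper names, and the rule that every report must pass all of its cases are assumptions for the example; the dashboard's actual aggregation logic is not shown in this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReportRow:
    """One row of the report table (hypothetical structure)."""
    name: str
    passed: int
    total: int

    @property
    def pass_rate(self) -> float:
        # Treat a report with no cases as a zero pass rate.
        return self.passed / self.total if self.total > 0 else 0.0


# Counts copied from the Passed / Total column of the table.
ROWS = [
    ReportRow("sample-fake", 3, 3),
    ReportRow("expanded-fake", 7, 7),
    ReportRow("rag-sample", 5, 5),
    ReportRow("redteam-fake", 5, 5),
    ReportRow("release-readiness", 4, 4),
]


def format_rate(row: ReportRow) -> str:
    """Format the pass rate with three decimals, as the table displays it."""
    return f"{row.pass_rate:.3f}"


def overall_status(rows: List[ReportRow]) -> str:
    """Assumed rule: READY only if reports exist and every case passed."""
    if not rows:
        return "BLOCKED"
    all_passed = all(r.total > 0 and r.passed == r.total for r in rows)
    return "READY" if all_passed else "BLOCKED"
```

With the five rows listed, `overall_status(ROWS)` returns `"READY"` and `format_rate` yields `"1.000"` for each row.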