>Scope
QA harness for LLM agents: scenario suites, flake controls, tool sandboxing, LLM-as-judge scoring, and regression protocols.