Tasks

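20 tasks across 5 tiers. Each task pairs a prompt with a deterministic pass rubric and a curator-built reference solve. Every task.yaml carries a canary GUID, which is our contamination-detection signal: if that GUID shows up in a model's output or in a training corpus, the task contents have leaked.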
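A minimal sketch of how the canary check could be scripted. It assumes the tasks live in a directory tree with one task.yaml per task, and that any GUID-shaped string in a task.yaml is the canary. The field name is not specified here, so the sketch matches on the GUID pattern instead of parsing YAML. The function names and the directory layout are hypothetical, not part of this repo.

```python
import re
from pathlib import Path

# Matches the canonical 8-4-4-4-12 hexadecimal GUID layout.
GUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def extract_guids(text: str) -> set[str]:
    """Return every GUID-shaped string in text, lowercased."""
    return {match.lower() for match in GUID_RE.findall(text)}


def collect_canaries(tasks_root: Path) -> set[str]:
    """Gather GUIDs from every task.yaml under tasks_root (assumed layout)."""
    canaries: set[str] = set()
    for path in tasks_root.rglob("task.yaml"):
        canaries |= extract_guids(path.read_text(encoding="utf-8"))
    return canaries


def leaked_canaries(sample: str, canaries: set[str]) -> set[str]:
    """Return the canaries that appear in sample, ignoring letter case."""
    lowered = sample.lower()
    return {canary for canary in canaries if canary in lowered}
```

Running `leaked_canaries(model_output, collect_canaries(Path("tasks")))` on a model transcript or a corpus shard would return the set of canaries found; a non-empty result suggests that the corresponding task files were seen.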