Opponent                        Chaotic Balanced State  Coset Codes  Population Genetics  Targeting Interventions  Average
vs coarse (DeepSeek V3.2)       6.00                    6.00         6.00                 6.00                     6.00
vs reviewer3 (independent)      6.00                    5.88         6.00                 6.00                     5.97
vs stanford (independent)       6.00                    6.00         5.88                 6.00                     5.97
vs coarse (Qwen 3.5 Plus)       6.00                    6.00         5.38                 5.50                     5.72
vs coarse (Kimi K2.5)           6.00                    5.88         5.38                 5.38                     5.66
vs coarse (GPT-5 mini)          6.00                    6.00         4.88                 5.12                     5.50
vs coarse (Claude Sonnet 4.6)   5.88                    5.25         5.00                 5.50                     5.41
vs refine.ink                   5.88                    5.25         4.25                 5.88                     5.31
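
The Average column appears to be the unweighted mean of the four per-paper scores in each row. This is inferred from the numbers, not stated in the source. A minimal sketch that recomputes it from the printed values is given below; the names `rows`, `mean`, and `max_deviation` are illustrative only. Because the printed scores are already rounded to two decimals, the recomputed mean can differ from the reported one by a few thousandths.

```python
# Scores copied from the table, in column order:
# Chaotic Balanced State, Coset Codes, Population Genetics, Targeting Interventions.
# The second tuple element is the Average as reported in the table.
rows = {
    "coarse (DeepSeek V3.2)": ([6.00, 6.00, 6.00, 6.00], 6.00),
    "reviewer3 (independent)": ([6.00, 5.88, 6.00, 6.00], 5.97),
    "stanford (independent)": ([6.00, 6.00, 5.88, 6.00], 5.97),
    "coarse (Qwen 3.5 Plus)": ([6.00, 6.00, 5.38, 5.50], 5.72),
    "coarse (Kimi K2.5)": ([6.00, 5.88, 5.38, 5.38], 5.66),
    "coarse (GPT-5 mini)": ([6.00, 6.00, 4.88, 5.12], 5.50),
    "coarse (Claude Sonnet 4.6)": ([5.88, 5.25, 5.00, 5.50], 5.41),
    "refine.ink": ([5.88, 5.25, 4.25, 5.88], 5.31),
}


def mean(scores):
    """Unweighted arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)


def max_deviation(table):
    """Largest absolute gap between recomputed and reported averages."""
    return max(abs(mean(scores) - reported) for scores, reported in table.values())


if __name__ == "__main__":
    for name, (scores, reported) in rows.items():
        print(f"{name:28s} recomputed={mean(scores):.4f} reported={reported:.2f}")
    print(f"max deviation: {max_deviation(rows):.4f}")
```

Under this assumption every reported average is reproduced to within rounding of the printed scores.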