| Opponent | Chaotic Balanced State | Coset Codes | Population Genetics | Targeting Interventions | Average |
|---|---:|---:|---:|---:|---:|
| vs reviewer3 (independent) | 6.00 | 6.00 | 5.83 | 6.00 | 5.96 |
| vs stanford (independent) | 6.00 | 6.00 | 5.67 | 6.00 | 5.92 |
| vs coarse (Kimi K2.5) | 6.00 | 5.83 | 5.67 | 6.00 | 5.88 |
| vs coarse (Qwen 3.5 Plus) | 6.00 | 5.67 | 5.83 | 6.00 | 5.88 |
| vs coarse (DeepSeek V3.2) | 6.00 | 6.00 | 5.50 | 5.83 | 5.83 |
| vs coarse (GPT-5 mini) | 5.50 | 5.67 | 5.50 | 6.00 | 5.67 |
| vs refine.ink | 6.00 | 5.83 | 4.67 | 6.00 | 5.62 |
| vs coarse (Claude Sonnet 4.6) | 5.83 | 5.50 | 5.17 | 6.00 | 5.62 |