@john-b-yang
Contributor

Creates a blog post describing CC:Ladder.

  • Describes what CC:Ladder is in detail
  • Walks through what we did for Core War and RobotRumble: scraped human solutions, then evaluated them against each other to get relative Elo ratings

The main thing still missing is the ranks that frontier models actually achieve on the ladder, but this branch and PR have been sitting in limbo for a while, so I'm deciding to just merge it as is.

@john-b-yang john-b-yang merged commit 75f8abd into main Jan 26, 2026
@john-b-yang john-b-yang deleted the john/ladder branch January 26, 2026 18:06
@john-b-yang
Contributor Author

Leaving a note for myself: update the blog post with numbers once the evaluation runs finish.
