Will GPT-4.5 top the LLMSys Chatbot Arena leaderboard within a month of its release?
78% chance · Ṁ433 · closes Oct 1

If OpenAI skips GPT-4.5, this resolves to N/A.

Otherwise, starting from the first time it appears on the leaderboard at all, this resolves YES if it hits the top spot and NO if it does not.

Does not apply to any future versions that are also called "GPT 4.5-preview"; only the first iteration to appear on the LMSYS leaderboard counts. If multiple versions appear at once (like Opus/Sonnet/Haiku), any of them count.

  • Update 2024-12-30 (PST): If multiple models are tied for the #1 rank, the top spot will be determined based on Elo ratings. (AI summary of creator comment)


Multiple models can share the #1 rank on the LMSYS leaderboard, so does it still count if it shares the #1 rank with two or three other models? Or does it have to visually show up at the top of the leaderboard?

@LuigiD Models tied for first count, based on Elo.

@Sketchy Hmm, but are you taking margin of error into account or not? Right now three models are tied for the #1 rank, even on the style-controlled leaderboard, because their margins of error overlap, so a single #1 winner cannot be confidently asserted. The Elo you see listed is simply the single average/median value within that margin of error.

So if multiple models are all within each other's margin of error for the #1 spot, does it resolve as a tie? Or are you just going with the single average/median score listed, regardless of the margin of error that LMSYS warns about?
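For reference, here is a minimal sketch of the rank-sharing rule LMSYS is generally understood to apply: a model's displayed rank is 1 plus the number of models whose confidence interval lies entirely above its own, so models with overlapping intervals share the same rank. This is an illustration, not LMSYS's actual code, and the model names and scores below are hypothetical.

```python
def shared_ranks(models):
    """models: list of (name, elo_lower_bound, elo_upper_bound) tuples."""
    ranks = {}
    for name, _low, high in models:
        # A model's rank is 1 + the number of models whose interval lies
        # entirely above its own (i.e., models that are clearly better).
        better = sum(1 for _, other_low, _ in models if other_low > high)
        ranks[name] = 1 + better
    return ranks

# Hypothetical scores: three models with overlapping intervals all get rank 1.
example = [
    ("model-a", 1290, 1310),
    ("model-b", 1288, 1305),
    ("model-c", 1285, 1302),
    ("model-d", 1250, 1270),
]
print(shared_ranks(example))
# {'model-a': 1, 'model-b': 1, 'model-c': 1, 'model-d': 4}
```

Under this rule, resolving by the single listed Elo value would pick one winner among models the leaderboard itself treats as tied at #1.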
