If OpenAI skips GPT-4.5, this resolves to N/A.
Otherwise, starting from the first time it appears on the leaderboard at all, this resolves YES if it reaches the top spot, and NO if it does not.
Does not apply to any future versions also called "GPT-4.5-preview", just the first iteration to appear on the LMSYS leaderboard. If multiple versions appear at once (as with Opus/Sonnet/Haiku), any of them counts.
Update 2024-12-30 (PST): If multiple models are tied for the #1 rank, the top spot will be determined based on Elo ratings. (AI summary of creator comment)
@Sketchy hmm, but are you taking margin of error into account or not? Right now three models are tied for the #1 rank, even on the style-controlled leaderboard, because their margins of error overlap with each other, so a single #1 winner cannot be confidently asserted. The Elo you see listed is just the single point estimate (average/median) within that margin of error.
So if multiple models are all within each other's margin of error for the #1 spot, does it resolve as a tie? Or are you just going with the single listed score, regardless of the margin of error that lmsys warns about?
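For concreteness, here's a minimal sketch of the overlap issue. It assumes the rank-by-confidence-bound convention LMSYS describes for the Arena (a model only outranks another if its interval sits entirely above the other's); the model names, ratings, and interval widths are made-up illustrative numbers, not real leaderboard data:

```python
# Sketch (not LMSYS's actual code) of how overlapping confidence intervals
# can leave several models statistically tied at rank #1.

# name: (elo_point_estimate, ci_half_width) -- hypothetical values
models = {
    "model-a": (1365, 5),
    "model-b": (1362, 6),
    "model-c": (1359, 5),
    "model-d": (1340, 4),
}

def rank_by_upper_bound(models):
    """Rank each model as 1 + the number of models whose CI lower bound
    exceeds its CI upper bound, so models with overlapping intervals
    share the same rank."""
    ranks = {}
    for name, (mu, half_width) in models.items():
        upper = mu + half_width
        strictly_better = sum(
            1
            for other, (mu2, hw2) in models.items()
            if other != name and (mu2 - hw2) > upper
        )
        ranks[name] = 1 + strictly_better
    return ranks

print(rank_by_upper_bound(models))
# -> {'model-a': 1, 'model-b': 1, 'model-c': 1, 'model-d': 4}
```

Under that convention, three of the hypothetical models share rank #1 even though their listed point estimates differ, which is exactly the situation the tie-breaking rule in the update has to disambiguate.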