If perplexity on Common Crawl is not available for a model, I will use other benchmarks as a surrogate. This will inherently be a judgement call. If a model has not been announced by EOY 2025 and no benchmarks have been posted publicly, it will not be counted for the purposes of this market.
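For concreteness, here is a minimal sketch of the metric in question: perplexity is exp of the mean token-level negative log-likelihood. The model name and text below are placeholders, not the actual evaluation setup for this market.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Stand-in for a passage drawn from a Common Crawl-style corpus
text = "The quick brown fox jumps over the lazy dog."
input_ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    # HF shifts labels internally; .loss is the mean NLL over predicted tokens
    loss = model(input_ids, labels=input_ids).loss

perplexity = torch.exp(loss).item()
print(f"perplexity = {perplexity:.2f}")
```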
"Based on transformers" for the purpose of this question will be anything with multi-headed self-attention that feeds into an MLP.
Update 2025-04-10 (PST) (AI summary of creator comment): Clarification on what constitutes 'based on transformers':
DeepSeek-style MLA with MoE is considered based on transformers.
All current models, except for SSMs and LSTMs, are assumed to be based on transformers.
The status of RWKV remains open for discussion.
Update 2025-04-12 (PST) (AI summary of creator comment): Hunyuan-turbo{s} Classification Update:
Transformer-SSM Hybrid: Despite being a hybrid, it will be counted as non-transformer.
This establishes that for models with mixed components (e.g., transformer and SSM), the creator will treat them as non-transformer for the purposes of market resolution, unless persuaded otherwise.
Of the top 20 lmarena models, hunyuan-turbo{s} is the only non-transformer one. It appears to be a transformer-SSM hybrid, which I'll count as non-transformer. I'm willing to be persuaded on this one. https://cloud.tencent.com/document/product/1729/104753
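Putting the resolution rules above into code, here is an illustrative encoding of how I'd currently classify architectures. The string labels are my own shorthand for this sketch, not an official taxonomy.

```python
def counts_as_transformer(architecture: str) -> bool | None:
    """True if 'based on transformers' per the rules above,
    False if not, None if still open for discussion."""
    arch = architecture.lower()
    if arch in {"ssm", "lstm"}:
        return False   # explicitly excluded
    if "hybrid" in arch:
        return False   # e.g. hunyuan-turbo{s}, a transformer-SSM hybrid
    if arch == "rwkv":
        return None    # status remains open
    return True        # default: MHSA + MLP variants, incl. MLA with MoE

for a in ["mla+moe", "ssm", "transformer-ssm hybrid", "rwkv"]:
    print(a, "->", counts_as_transformer(a))
```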
@ConnorMcCormick oh yeah that's definitely confusing people. Well, better for those of us who do understand it :)
@jacksonpolack The API only refreshes the data every 15 seconds, so if you're quick on the draw, it's totally doable.