Llamas on Pixel 7s: https://github.com/rupeshs/alpaca.cpp/tree/linux-android-build-support (I know, I know, it's not over 13B yet, just sharing progress)
There are people who successfully run 30B LLaMA on a consumer PC, and even 65B (though that is extremely slow).
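For a rough sense of why that works, here's a back-of-the-envelope sketch (my own assumed numbers, not from the comment above): approximate weight memory for each LLaMA size at fp16 versus the ~4-bit quantization that llama.cpp-style ports use. That's roughly why 30B fits in a beefy consumer PC's RAM and 65B is possible but painful.

```python
# Approximate LLaMA parameter counts, in billions (published figures).
PARAMS_BILLIONS = {"7B": 6.7, "13B": 13.0, "30B": 32.5, "65B": 65.2}

for name, b in PARAMS_BILLIONS.items():
    fp16_gb = b * 2   # 2 bytes per weight
    q4_gb = b * 0.5   # ~4 bits per weight, ignoring small per-block overheads
    print(f"{name}: ~{fp16_gb:.0f} GB at fp16, ~{q4_gb:.1f} GB at ~4-bit")
```

This ignores KV cache and activation memory, but the weights dominate: ~16 GB for 30B at 4-bit fits consumer hardware, while 65B at ~33 GB starts swapping on most machines, hence the slowness.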
@ValeryCherepanov By "run on a single GPU" I mean the weights + one full input vector can fit on a consumer GPU at once. Otherwise the question would be meaningless: you can always split up matrices into smaller blocks and run the computation sequentially.
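To make the "otherwise meaningless" point concrete, here's a toy numpy sketch (my own illustration, nothing from the comment) of applying a weight matrix in row-blocks, so only a small slice would ever need to be resident on the device at once:

```python
import numpy as np

def blocked_matvec(W, x, block_rows=1024):
    """Compute W @ x while only ever touching `block_rows` rows of W at a time."""
    out = np.empty(W.shape[0], dtype=W.dtype)
    for start in range(0, W.shape[0], block_rows):
        stop = min(start + block_rows, W.shape[0])
        # In a real offloading setup, this block would be copied host -> GPU here.
        out[start:stop] = W[start:stop] @ x
    return out

rng = np.random.default_rng(0)
W = rng.standard_normal((4096, 4096), dtype=np.float32)
x = rng.standard_normal(4096, dtype=np.float32)
assert np.allclose(blocked_matvec(W, x), W @ x, atol=1e-4)
```

Since any model can be run this way given enough time and host memory, "fits on a single GPU" is only a meaningful criterion if it excludes this kind of sequential streaming.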
This is now extremely close to being resolved by LLaMA (LLaMA 13B does not actually beat GPT-3 on every measured benchmark, but it comes very close). 72% is way too low, though, so I guess whoever reads this comment first can collect some free mana in expectation.
FLAN-T5 3B can very likely resolve this now, but I suspect it will be a while before anyone actually bothers to run it on all of the benchmarks.