Which, if any, GPT-n will outperform AlphaGeometry merely via prompting, by 2030?
GPT-4: 7%
GPT-5: 19%
GPT-6: 21%
GPT-7: 12%
GPT-8: 12%
GPT-9: 12%
None: 16%

Resolves to the lowest-numbered GPT that scores higher than 25 (out of 30) on the benchmark test set of 30 Olympiad geometry problems used in the AlphaGeometry paper: https://twitter.com/GoogleDeepMind/status/1747651826730610696

Either GPT-n or a derivative fine-tuned version of GPT-n counts. The model also cannot use any special scaffolding: it must take the problem description in its prompt and give the geometry problem solution in its first output (potentially after some chain of thought).

If the architecture changes so significantly that the question is no longer applicable, I will resolve as N/A.
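
For illustration only, the resolution rule above could be sketched in Python roughly as follows. The function name `resolve`, the `scores` mapping, and the `architecture_changed` flag are hypothetical names introduced here, not part of the market, and the scores in the usage note are made up. The sketch assumes each score is the number of problems solved out of 30 under the no-scaffolding condition.

```python
THRESHOLD = 25        # score to beat, as stated in the description
TOTAL_PROBLEMS = 30   # size of the benchmark test set


def resolve(scores: dict[int, int], architecture_changed: bool = False) -> str:
    """Return the market answer for hypothetical per-model scores.

    `scores` maps the GPT number n to the number of problems GPT-n
    (or a fine-tuned derivative) solved via plain prompting.
    """
    if architecture_changed:
        # The description says the market resolves N/A in this case.
        return "N/A"
    for n, solved in scores.items():
        if not 0 <= solved <= TOTAL_PROBLEMS:
            raise ValueError(f"GPT-{n}: score {solved} is outside 0..{TOTAL_PROBLEMS}")
    # Strictly higher than the threshold; a tie at 25 does not count.
    qualifying = [n for n, solved in scores.items() if solved > THRESHOLD]
    return f"GPT-{min(qualifying)}" if qualifying else "None"
```

With made-up scores such as `resolve({4: 3, 5: 25, 6: 27, 7: 30})`, the sketch returns `"GPT-6"`: GPT-5 only ties the threshold, and GPT-6 is the lowest-numbered model that exceeds it.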
