Gemini 3's 50% time horizon, per METR
38
Ṁ8202
Oct 1
<1.5h: 3%
1.5h - 2h: 7%
2h - 2.5h: 15%
2.5h - 3h: 26%
3h - 3.5h: 27%
3.5h - 4h: 12%
4h - 5h: 5%
5h - 6h: 3%
6h - 7h: 1.1%
7h - 8h: 0.3%
8h - 9h: 0.3%
9h - 10h: 0.3%
10h - 11h: 0.3%
11h - 12h: 0.3%
>=12h: 0.6%

This market will resolve to the highest 50% time horizon, as reported by METR, for any Gemini 3 model released within a month of the first Gemini 3 announcement.

50% time horizon is a measure of AI autonomy based on the length of tasks that AI can do: roughly, it is the time that humans take to complete tasks that an AI system can successfully do 50% of the time. See METR's "Measuring AI Ability to Complete Long Tasks" for the technical definition. Claude 3.7 Sonnet, released in February 2025, was the leading model with a 50% horizon of 59 minutes.
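As a rough illustration of the definition above: METR's paper describes fitting a logistic curve for success probability against the log of human task time, and the 50% horizon is where that curve crosses 0.5. The sketch below assumes that functional form. The coefficients A and B are hypothetical, picked only so that the horizon comes out at the 59 minutes quoted above; they are not METR's fitted values.

```python
import math

# Hypothetical coefficients (NOT METR's fitted values): B < 0 means success
# gets less likely as tasks get longer; A is chosen so the horizon is 59 min.
B = -0.6
A = -B * math.log2(59.0)

def success_probability(human_minutes: float, a: float, b: float) -> float:
    """Assumed model: P(success) = sigmoid(a + b * log2(human task minutes))."""
    return 1.0 / (1.0 + math.exp(-(a + b * math.log2(human_minutes))))

def fifty_percent_horizon(a: float, b: float) -> float:
    """Task length (human minutes) at which the assumed model predicts 50% success.

    sigmoid(x) = 0.5 exactly when x = 0, so solve a + b * log2(t) = 0 for t.
    """
    return 2.0 ** (-a / b)

horizon = fifty_percent_horizon(A, B)
print(f"50% horizon: {horizon:.0f} min")
print(f"P(success) on a 4h task: {success_probability(240.0, A, B):.2f}")
```

The point is only that the horizon is a property of the fitted curve, not of any single task: tasks shorter than the horizon are predicted to succeed more than half the time, longer ones less.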

Left bounds inclusive, right bounds exclusive.
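To make the boundary rule concrete, here is a minimal sketch of how a reported horizon would map to an answer, assuming the bucket edges in the answer list above. The function name bucket_label is made up for illustration and is not part of any Manifold API.

```python
import bisect

# Bucket edges in hours, copied from the answer list.
EDGES = [1.5, 2, 2.5, 3, 3.5, 4, 5, 6, 7, 8, 9, 10, 11, 12]

def bucket_label(hours: float) -> str:
    """Return the answer bucket for a horizon, left-inclusive and right-exclusive."""
    if hours < EDGES[0]:
        return "<1.5h"
    if hours >= EDGES[-1]:
        return ">=12h"
    # bisect_right gives the index of the first edge strictly greater than
    # `hours`, so a value sitting exactly on an edge lands in the bucket
    # that starts at that edge.
    i = bisect.bisect_right(EDGES, hours)
    return f"{EDGES[i - 1]:g}h - {EDGES[i]:g}h"

print(bucket_label(2.0))          # exactly 2h falls in the bucket starting at 2h
print(bucket_label(2 + 17 / 60))  # the 2h17min figure mentioned in the comments
print(bucket_label(59 / 60))      # the 59-minute figure quoted above
```

So a reported horizon of exactly 3h would resolve to "3h - 3.5h", not "2.5h - 3h".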

Time horizon could vary based on the set of tasks used to measure it, so this market will be based on the time horizon for the most comprehensive set of tasks reported by METR (as of 2025, largely software and engineering tasks). This will be ambiguous if METR stops publishing time horizons across all of their autonomy tasks and only publishes separate results for different subsets; I might N/A in that scenario.

See also:

/Bayesian/gemini-3s-50-time-horizon-per-metr (this market)

/Bayesian/gpt5s-50-time-horizon-per-metr

/Bayesian/grok-5s-50-time-horizon-per-metr

/Bayesian/r2s-50-time-horizon-per-metr


you guys are criminally retarded

gemini 3.0 will make the best METR 50% time horizon go from 2h17min to 5h+, noting that gemini 2.5 pro, which was already very good, had a 50% time horizon of 39 minutes.

i wanna have what you're having jim

@Bayesian what I'm having is faith in METR-style time-horizon extrapolation. 5h is at the high end, but it's not <2% (nor <5% tbc)
