In "Situational Awareness: The Decade Ahead", Leopold Aschenbrenner estimates that the largest training clusters will cost over one trillion dollars around 2030.
Clarifications:
$1T worth of computers and associated data center infrastructure (e.g. building, cooling, networking; does not include the cost of the power plant)
Computers must be networked together and training one model. They do not need to synchronize weights at each gradient step.
Value of data centers will be estimated with reasonable depreciation. So, $1T of purchase price in TPUv1s would not count.
Nominal dollars.
So, if the nominal value, less depreciation, of the data center(s) being used to train a model ever crosses $1T, this market resolves YES.
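As a concrete illustration of the depreciation criterion, here is a minimal sketch of the resolution check. The straight-line schedule, the 4-year hardware lifetime, and the tranche values are illustrative assumptions, not part of the market's stated terms.

```python
# Hypothetical sketch: would a cluster's depreciated value cross the $1T
# resolution threshold? The depreciation schedule here is assumed
# (straight-line to zero over 4 years), not specified by the market.

THRESHOLD = 1e12      # $1T, nominal dollars
LIFETIME_YEARS = 4.0  # assumed useful life of the hardware

def depreciated_value(purchase_price: float, age_years: float) -> float:
    """Straight-line depreciation to zero over LIFETIME_YEARS."""
    remaining = max(0.0, 1.0 - age_years / LIFETIME_YEARS)
    return purchase_price * remaining

# Example: hardware tranches (purchase price in $, age in years) all
# networked into one training run. Values are made up for illustration.
cluster = [
    (400e9, 0.5),  # newest accelerators
    (500e9, 1.5),
    (300e9, 3.0),  # older gear, mostly depreciated
]

total = sum(depreciated_value(price, age) for price, age in cluster)
print(f"Depreciated cluster value: ${total / 1e9:.0f}B")
print("Resolves YES" if total >= THRESHOLD else "Below threshold")
```

Note that under a schedule like this, a cluster built from several years of purchases needs well over $1T in total purchase price for its depreciated value to ever reach the threshold.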
This is one of a series of markets on claims made in Leopold Aschenbrenner's Situational Awareness report(s).
Other markets about Leopold's predictions:
Will models be able to do the work of an AI researcher/engineer before 2027? (31%)
Before 2028, will anyone train a GPT-4-level model in a minute? (14%)
Will a tech company buy an aluminum smelting company before 2030? (39%)
I think asynchronous training distributed across multiple datacenters will cause issues with this question's resolution.
Considering how quickly GPUs are improving, I'm guessing the training-FLOPs-vs-intelligence curve would be basically flat well before you reached $1 trillion in 2031.
If Google has spent $1T on all of their TPUs and does distributed training, does that count?
Or does this require a single data center?
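To make concrete what the clarification "they do not need to synchronize weights at each gradient step" can mean for multi-datacenter training, here is a minimal local-SGD-style sketch in which workers (e.g. separate datacenters) take many local steps and only periodically average weights. The worker count, sync interval, model, and objective are all illustrative assumptions, not anything from the market description.

```python
import torch

# Minimal local-SGD sketch: each worker trains independently and weights
# are averaged only every SYNC_EVERY steps, instead of synchronizing at
# every gradient step. All sizes and intervals here are illustrative.

SYNC_EVERY = 100  # local steps between weight averaging (assumed)

def make_model() -> torch.nn.Module:
    return torch.nn.Linear(512, 512)

workers = [make_model() for _ in range(4)]
opts = [torch.optim.SGD(w.parameters(), lr=1e-3) for w in workers]

for step in range(1000):
    # Each worker takes an independent local gradient step.
    for model, opt in zip(workers, opts):
        x = torch.randn(32, 512)       # stand-in for a real batch
        loss = model(x).pow(2).mean()  # dummy objective for illustration
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Periodic synchronization: average parameters across workers,
    # then every worker resumes from the averaged weights.
    if (step + 1) % SYNC_EVERY == 0:
        with torch.no_grad():
            averaged = [sum(params) / len(workers)
                        for params in zip(*(w.parameters() for w in workers))]
            for w in workers:
                for p, a in zip(w.parameters(), averaged):
                    p.copy_(a)
```

Under a scheme like this, the inter-datacenter link only carries weights once every SYNC_EVERY steps, which is why a single training run spread across sites with modest interconnect is plausible and why the resolution criteria only require that the computers be "networked together and training one model."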