The LLM must not be limited to playing chess. For example, an LLM that can both play chess and hold a conversation counts. However, it cannot make use of a non-LLM-based chess engine or other non-LLM computation, such as a calculator or a Python script.
A super grandmaster has a FIDE Elo rating of 2700 or more.
The super GM must be trying to win the game.
The game format is unspecified. The GM can play blindfold, or play on a board whose moves are relayed to the LLM via standard notation, or anything else.
duplicate, with different resolution criteria, to
Update 2025-06-19 (PST) (AI summary of creator comment): The creator has specified that they would count an AI that is like a 'human brain in AI form' for the purposes of this market. This indicates a broad interpretation of what constitutes a 'large language model'.
@patrik Do you find any loophole or flaw with this draft def?
A large language model is a neural network with millions to trillions of parameters, trained on vast amounts of text data, that can generate coherent human-like text and perform diverse language tasks through a unified learned representation of language.
@Bayesian I propose that you limit it to only verbal reasoning plus no new major architectural changes (not sure how to quantify that one tbh).
@patrik Yeah, that is tricky. If it has non-verbal reasoning using an internal representation, I'd definitely count that, I think. It's tricky to tell which future architectural changes are 'too big' before they are created. I am tentatively fine with a pretty weakly limited architecture, where it feels like the LLM is an at least partly text-based intelligence that thinks about and plays chess in a way that doesn't feel like it's cheating against the human, i.e. it's using its brain and reasoning without something like a specialized chess computer program.
@Bayesian I don't think something centered around language can do it but if it's just partially using language then sure. It might not make a lot of sense to call it LLM tho.
@patrik I think something centered around language can do it? What do you have in mind that would partially use language and be superhuman at chess, but that it would be a stretch to call an LLM?
@patrik I see. Yeah, I'd want something like a human brain in AI form to count for the purpose of this market, so it doesn't seem like a problem?