When will LLMs be better at Paradox grand strategy games than the in-game AI for NPCs?


Resolution Criteria

This market resolves to the date when Large Language Models (LLMs) are demonstrably better at playing Paradox grand strategy games (such as Europa Universalis, Crusader Kings, Hearts of Iron, Stellaris, or Victoria) than the built-in AI that controls non-player characters (or nations).

The relevant Paradox games are those current at the time of resolution.

If Paradox integrates LLMs into the AI for NPCs, that counts as admitting that LLMs are better at the task, and this market will resolve to the date the relevant game (or patch, or DLC) is released to the public.

Otherwise, this market will resolve when there is publicly available code that I can run alongside a copy of one of the then-current generation of Paradox grand strategy games (GSGs) and that consistently plays the game well in single-player mode. It doesn't need to achieve world conquest or anything, or even play as well as any given human player would. But it needs to consistently avoid faceplanting. If it semi-consistently achieves success (relative to its starting position), the way even a significantly below-median human player can, that's enough to resolve the market.

The level of skill I'm talking about here is one a human player can reach within tens of hours of play time; this isn't meant to be a high bar.

The LLM-based AI can be specialized for playing Paradox games, or for one particular game. It can be fine-tuned for the task, or include e.g. specialized tool-calling. I need to be able to run it against a game running on my computer (or in a virtual machine), but the model itself need not be a local one; i.e. it can call the API of a proprietary hosted LLM like Claude or GPT.

As the resolution criteria are somewhat subjective, I will not bet on this market.
