This resolves YES if an AI is able to prove that there are infinitely many prime numbers by the close date.
The AI has to operate without human-generated training data. This is somewhat subjective, but is intended in the spirit of the difference between AlphaGo and AlphaZero - the AI can be taught the rules of the theorem-proving game, but has to come up with strategies on its own. In particular, it should not be considered acceptable for inputs to the AI algorithm at any stage to include datasets of previously-formalized theorems (many such datasets include this theorem anyway). It will be acceptable, however, to provide the AI with formal definitions of the natural numbers, primality, and the statement it's trying to prove (rather than having the AI creatively generate the theorem itself, for example).
Many theorem provers, rather than introducing the concept of sets and infinity, formalize this theorem as "for any natural number N, there exists a prime p greater than N". This is acceptable for the purpose of this market.
Nov 9, 5:54pm: Will an AI prove the infinitude of primes by 2025-11-03? → Will a blank-slate AI prove the infinitude of primes by 2025-11-03?
@wadimiusz wait do you mean chatgpt is allowed?
then it’s trivial chatgpt 4 can do it alr im pretty sure
Have you heard of Euclid's theorem? A proof of the infinitude of primes was written down about 300 BCE. Why does it need to be proved again by AI, which uses fuzzy logic to imitate human intelligence and pre-existing knowledge! (https://t5k.org/notes/proofs/infinite/euclids.html)
@michelangelo I understand your line of reasoning, but if you think about it for another ten seconds you'll realize that the question is meant to be a test of AI's mathematical skills, rather than having AI solve, say, the millennium problems.
@BoltonBailey hmm. then we'd probably need to generate some random bullshit theorems, and ask the prover to prove or disprove them, or something...
Eurisko would totally breeze through that theorem.
Would it count as blank-slate? Probably not considering the human operator needs to steer the AI towards interesting concepts.... But with modern compute, one can still wonder if brute-force unsupervised Eurisko could do it...
Pretty sure will actually do it, though.
I'm up to 1150 lines of Coq, trying to prove this without the standard library. I had to use the library slightly to define a decomposition function that takes a natural number and returns a list of it's primes. Coq wouldn't let me use my own nat type to prove that it's well-founded, but after some wresting I've got that function defined, and I'm close to proving this.
I've labeled all my sections, I've got sections for plus, times, le and lt, truncated subtraction, division, modules, the divides predicate, lists of numbers, decomposition and the final supporting lemmas section.
The most I've proven so far is that if you give me any number n, I can find a number that isn't divisible by any number less than or equal to n.
Theorem always_ndless : forall (n : nat), exists (m : nat), forall (o : nat), le (S (S o)) n -> not (divides m (S (S o))).
At least in Coq, the AI would have it's work cut out for it.
@AnthonyPeterson Cool. Obviously with the usual proof, you would have to go through redefining the factorial function. I would be interested to see how many LoC it ends up at. I'd also be interested to know if some of the other proof approaches are any more parsimonious.
@BoltonBailey I was stuck trying to define a prime decomposition function that provably terminates for a month, but I'm finally past that after bashing my head against various attempts Program Fixpoint is awful, the Function keyword with a helper function is what I settled on.
Now I'm working on proving simple things about the prime decomposition function, to try to build back up to the final proof.
My factorial is pretty simple since I'm using Peano numbers:
```Fixpoint fact (n : nat) :=
match n with
| O => S O
| S n' => times n (fact n')
end.```
@AnthonyPeterson I was writing a divisors function to assist in proving stuff related to the decomposition function, and I realized I can do my proof with just the divisors function.
Theorem: for any natural number n, I can find a prime larger than n.
(all results are setup for n>=2, I can special case n=0 and 1).
Part 1: construct n! + 1. This number is not divisible by any number less than or equal to n.
(done)
Part 2: The number n! +1 has some number of proper divisors, excluding 1 and n itself. Find the minimum of the divisors. This number must be prime, if it wasn't there would be smaller divisors.
Part 3: We have a prime divisor of n! + 1. But n! + 1 doesn't have any divisors less than or equal to n, so this prime must be greater than n.
I just have to finish out parts 2 and 3, which look to be pretty easy.
@AnthonyPeterson I finally did it! I proved the infinitude of primes from scratch. Working on the decomposition function was a huge distraction. There's 1642 lines of Coq, but a lot of lemmas probably won't be needed, so I plan to see how far I can trim it down.
Proof Method (roughly):
Prove that for any number n, there exists a larger number m which is prime.
Consider n! + 1. By Lemma 1, we can show that n! +1 does not divide by anything <= n. By Lemma 2, we know that there exists some number d, such that n! + 1 is divisible by d, and d is prime. But n! + 1 is not divisible by any number <= n, so d must be a prime > n.
Qed.
Lemma 1 is simple conceptually, but Lemma 2 is a bit interesting. I wrote a lemma that any property of natural numbers is either always false, or there is some number n, such that it's true for n, and false for all values less than n.
Then applying that to the divisibility property, I was able to select the minimum divisor of n! + 1, and prove it must be prime since it's minimal.
Since everything mathematical seems to be going towards Lean, I rewrote the proof in Lean. It took a couple of months to deal with Lean's oddities and differences from Coq, but I've finished rewriting it.
Definitely some nice advantages to Lean, but there were places where I had to restructure the detailed proof steps. The main flow of the proof is the same.
My eyes glazed over at some of the most complicated parts of the proof. Definitely a case of looking at old code and not quite understanding it, but I was able to muddle through.
It's about 400 lines longer than the Coq version, but I'm still learning Lean.
I've been trying to prove this in Coq myself, and there's been a number of obstacles. I'm sure there are simpler ways of proving it that a blank-slate AI could discover, but I don't think I'm going to be writing a blank-slate AI to do this. It's been a fun distraction and wake up call on what is involved to formalize a significant mathematical statement.
There's a lot of inequality, subtraction, division and modulus lemmas that I'm bogged down in, just to prove stuff like x + a is not divisible by x as long as a is not divisible by x. I've got 600 lines of Coq definitions and lemmas. I've been doing everything from scratch just like a Blank Slate AI would.
Ah, I hate text editors in which pressing enter causes you to post your comment. Anyway, cont: would providing an interface to a theorem prover also disqualify a method from consideration? Like, if the system implicitly learns to define things in the language of a theorem prover, and uses the theorem prover to check its guesses to bootstrap its quality?
@MrR Re: "restricting the play", this would depend on how heavily the play was "restricted". If the algorithm was told to look for formulas that provably didn't have factors in a set of prime numbers, that would be implicitly telling the AI something about Euclid's proof and wouldn't count. Simply telling it the axioms of Peano Arithmetic would be fine though. Re: Interface to theorem prover - this would be fine, as long as the infrastructure that the theorem prover software used didn't include any human-proved theorems about prime numbers and the like. For example, having the AI interface to Lean would be fine, but it would be unacceptable if the theorem prover had access to mathlib.