I presented my first sole-authored paper (based on this blog post) at the AGI conference in December of last year, and it was cited by Marcus Hutter in an interesting paper about approximating universal intelligence (which was presented at this year's AGI conference, which was once again a summer conference, so already took place).

When I set out to write the paper, my main goal was to show that we gain something by representing beliefs as something like logical statements, rather than as something like programs. This allows our beliefs to decompose more easily, readily allows inference in "any direction" (whereas programs are naturally executed in one direction, producing specific results in a specific order), and also allowing incomputable hypotheses to be dealt with in a partial way (dealing somewhat more gracefully with the possibility by explicitly representing it in the hypothesis class, but incompletely).

My desire to put forward this thesis was partly out of an annoyance with people invoking the Curry-Howard isomorphism all-too-often, to claim that logic and computation are really one and the same. I still think this is misguided, and not what the curry-howard isomorphism really says when you get down to it. The "programs are proofs" motto is misleading. There is no consensus on how to deal with Turing-complete programs in this way; turing-complete programming languages seem to correspond to trivial logics where you can prove anything from anything!*

Annoyed, I wanted to show that there was a material difference between the two ways of representing knowledge.

As I wrote the paper and got feedback from colleagues, it became clear that I was fighting a losing fight for that thesis: although the first-order prior represented a new mathematical object with interesting features along the lines I was advocating, it would be possible to write a somewhat program-like representation with the same features. I would still argue each of the advantages I mentioned, and still argue against naive invocations of Curry-Howard, but I was trying to make these arguments too strong, and it wasn't working. In any case, this was a point that didn't need to be made in order for the paper to be interesting, for two reasons:

- If desired, you could re-do everything in a more computational way. It would still be a new, interesting distribution with features similar to but different from the Solomonoff distribution.
- A universal distribution over logic, done right, is interesting even if it had turned out to be somehow equivalent to the Solomonoff distribution.

So, all told, I downplayed the "logic is different from computation" side of the paper, and tried to focus more on the prior itself.

After submitting the paper, I went back to working on other things. Although I still thought about logical priors every so often, I didn't make very much conceptual progress for a while.

At the July MIRI workshop, I got the opportunity to spend time on the topic again, with other smart folks. We spend roughly a day going over the paper, and then discussed how to take things further.

The main problem with the first-order prior is that the probability of a universal statement does not approach 1 as we see more and more examples. This is because all the examples in the world will still be consistent with the statement "there exists a counterexample"; so, if we are randomly sampling sentences to compose a logical theory, the probability that we add that sentences doesn't drop below a certain minimum.

So, for example, if we are observing facts about the natural numbers, we will not converge to arbitrarily high probability for generalizations of these facts. To make it more concrete, we cannot arrive at arbitrarily high probabilities for the Goldbach conjecture by observing more and more examples of even numbers being written as the sum of two primes.

This isn't a bad thing in all cases. Holding back some fixed probability for the existence of a counterexample matches with the semantics of first-order logic, which is not supposed to be able to rule out omega-inconsistent theories. (Omega inconsistency is the situation where we deny a universal statement while simultaneously believing all the examples.)

For some domains, though, we really do want to rule out omega-inconsistency; the natural numbers are one of these cases. The reason the first-order prior allows some probability for omega-inconsistent possibilities is that first-order logic is unable to express the fact that natural numbers correspond exactly to the finite ones. ("Finite" cannot be properly characterized in first-order logic.) More expressive logics, such as second-order logic, can make this kind of assertion; so, we might hope to specify reasonable probability distributions over those logics which have the desired behavior.

Unfortunately, it is not difficult to show that the desired behavior is not approximable. If the probability of universal statements approaches 1 as we observe increasingly many examples, then it must equal 1 if we believe all the examples. Let's take an example. If we believe all the axioms of peano arithmetic, then we may be able to prove all the examples of the Goldbach conjecture. In fact, we end up believing all true Pi_1 statements in the arithmetic hierarchy. But this implies that we believe all true Sigma_2 statements, if our beliefs are closed under implication. This in turn means that we believe all the examples of the Pi_3 universal statements, which means we must believe the true Pi_3 with probability 1, since we supposed that we believe universal statements if we believe their examples. And so on. This process can be used to argue that we must believe the true statements on every level of the hierarchy.

Since the hierarchy transcends every level of hypercomputation, there can be no hope of a convergent approximation for it. So, convergence of universal statements to probability 1 as we see more examples is (very) uncomputable. This may seem a bit surprising, given the naturalness of the idea.

Marcus Hutter has discussed distributions like this, and argues that it's OK: this kind of distribution doesn't try to capture our uncertainty about logically undecidable statements. Instead, his probability distribution represents the strong inductive power that we could have if we could infallibly arrive at correct mathematical beliefs.

Personally, though, I am much more interested in approximable distributions, and approaches which

*do*try to represent the kind of uncertainty we have about undecidable mathematical statements.

My idea has been that we can get something interesting by requiring convergence on the Pi_1 statements only.

One motivation for this is that Pi_1 convergence guarantees that a logical probability distribution will eventually recognize the consistency of any axiomatic system, which sort-of gets around the 2nd incompleteness theorem: an AI based on this kind of distribution would eventually recognize that any axioms you give it to start with are consistent, which would allow it to gradually increase its logical strength as it came to recognize more mathematical truth. This plausibly seems like a step in the direction of self-trusting AI, one of the goals of MIRI.

The immediate objection to this is that the system still won't trust itself, because it is not a set of axioms, but rather, is a convergent approximation of a probability distribution. Convergence facts are higher up in the arithmetic hierarchy, which suggests that the system won't be able to trust itself even if it does become able to (eventually) trust axiomatic systems.

This intuition turns out to be wrong! There is a weak sense in which Pi_1 convergence implies self-trust. Correctness for Pi_1 implies that we believe the true Sigma_2 statements, which are statements of the form "There exists x such that for all y, R(x,y)" where R is some primitive recursive relation. Take R to be "y is greater than x, and at time y in the approximation process, our probability of statement S is greater than c." (The arithmetic hierarchy can discuss the probability approximation process through a godel-encoding.) The relevant Sigma_2 statements place lower bounds on the limiting probabilities from our probability approximation. We can state upper bounds in a similar way.

This shows that a probability distribution which has Pi_1 convergence will obey something strikingly like the probabilistic reflection principle which came out of a previous MIRI workshop. If its probabilities fall within specific bounds, it will believe that (but the converse, that if it believes they fall within specific bounds, they do, does not hold). This gives such a probability distribution a significant amount of self-knowledge.

So, Pi_1 convergence looks like a nice thing to have. But is it?

During the MIRI workshop, Will Sawin proved that this leads to bad (possibly unacceptable) results: any logically coherent, approximable probability distribution over statements in arithmetic which assigns probability 1 to true pi_1 statements will assign probability 0 to some true pi_2 statements. This seems like a rather severe error; the whole purpose of using probabilities to represent uncertainty about mathematical truth would be to allow "soft failure", where we don't have complete mathematical knowledge, but can assign reasonable probabilities so as to be less than completely in the dark. This theorem shows that we get hard failures if we try for pi_1 convergence.

How concerned should we be? Some of the "hard failures" here correspond to the necessary failures in probabilistic reflection. These actually seem quite tolerable. There could be a lot more errors than that, though.

One fruitful idea might be to weaken the coherence requirement. The usual argument for coherence is the dutch book argument; but this makes the assumption that bets will pay off, which does not apply here, since we may never face the truth or falsehood of certain mathematical statements. Intuitionistic probability comes out of a variation of the dutch book argument for the case when bets not paying off at all is a possible outcome. This does not require that probabilities sum to 1, which means we can have a gap between the probability of X and the probability of not-X.

An extreme version of this was proposed by Marcello Herreschoff at the MIRI workshop; he suggested that we can get Pi_1 convergence by

*only*sampling Pi_1 statements. This gets what we want, but results in probability gaps at higher levels in the hierarchy; it's possible that a sampled theory will never prove or disprove some complicated statements. (This is similar to the intuitionistic probability idea, but doesn't actually satisfy the intuitionistic coherence requirements. I haven't worked this out, though, so take what I'm saying with a grain of salt.)

We may even preserve some degree of probabilistic reflection this way, since the true Pi_1 still imply the true Sigma_2.

That particular approach seems rather extreme; perhaps too limiting. The general idea, though, may be promising: we may be able to get the advantages of Pi_1 convergence without the disadvantages.

*(Source: Last paragraph of this section on wikipedia.)

## No comments:

## Post a Comment