Sunday, May 8, 2011

Current Probability Ideas

Follow-up to this post (and the series preceding it). I think that while my distinction between subjective and objective probability makes sense, it's not the right direction for answering the questions I'm asking-- it overcomplicates the issues. That change of view is based on a few papers, which I still need to read in more detail. (IE, this post does not summarise insights from those papers, it summarises my reaction from discovering their existence.)

Aside: my previous post, this one, and perhaps the next one or two do not represent me "thinking aloud" as my posts usually do... instead, I'm summarising thoughts that I had over the semester, but felt too busy to blog about.

So. How well can we do with just a single layer of probability, over a standard logic? How can we achieve something similar to Solomonoff induction?

One idea is to imagine that the universe we are observing was constructed by randomly writing down sentences in logic to determine each fact, only taking care not to ever write something which contradicts what we've already written. For example, we can  use a length-based prior to choose a sentence at random from first-order logic.  This prior obviously contains the Solomonoff prior ("is dominant over" as they say), but it's not clear to me whether the relationship goes the other way ("is dominated by") so that they are roughly equivalent. I'm thinking the prior I'm specifying is "even more uncomputable," since we have to check for consistency of theories.

This prior lends itself to an interesting approximation scheme, though.

• Each sentence has at least log-probability inversely proportional to its length
• To this, we can add some probability mass for each consistent sequence of sentences which we find that implies the sentence. (To test for "consistent" we just try to find a contradiction for some reasonable amount of time, perhaps trying harder later if it's important.)
• Normalise this probability by dividing by (1 - probability we encounter an inconsistent sequence drawing randomly), which is estimated based on the cumulative probability of the inconsistent sequences we've identified so far.
• This gives a rough lower bound for the probability. To get an upper bound, we apply the same technique to the negation of the sentence.
• If A implies B, then B is at least as probable as A (ie, it can inherit A's lower bound). Likewise, A is at most as probable as B (inheriting B's upper bound). In other words, we can use existing ideas about propagating probability intervals to work with these estimates if we like.
• To update on evidence, we use compatibility with that evidence as an extra criteria for consistency of sequences of statements. Intuitively, this will throw out a bunch of possibilities (and increase the amount of normalisation by doing that, which increases the probability of the remaining possibilities).

Unfortunately, these bounds would tend to be very wide, I think. The system would have to do a lot of thinking to narrow them down. More generous approximations might find one model and stick to it until it's disproved or a better model is found, say. This would at least give the system something to work with.

Now that we have a little bit of foundation in place, I think we can more usefully consider what to do if we want higher-order probabilities. Just as we can randomly choose sentences of first-order logic, we can randomly choose sentences of any system of probabilistic models we like (such as BLP, BLog, etc). This allows us to use the inference algorithms, et cetera from our favourite formalism. As long as first-order logic is a special case of the formalism, it will be just as universal.

For example, we could perfectly well use a prior based on random theories stated with the (objective-)probability-theoretic quantification which I was discussing in the previous posts. If we do this, we basically solve the puzzles I was running up against there: the inference from objective probabilities over classes to subjective probabilities over individuals is justified by the regularity assumption in the prior, IE, by the idea that the objective probability might be an "essential feature" of the class... something that was written down in the book of rules for the universe before the individual cases were decided.

Saturday, May 7, 2011

Impredicativity

Lukasz Stafiniak sent me this about impredicativity. I had seen the concern before, but never seen why it was worth any worry. Impredicative comprehension principles are a big convenience for normal mathematics, and don't lead to any paradoxes that we know of.

However, looking at it again, I realised that the basic complaint could be read as "impredicative comprehension is not valid in Kripke's theory of truth." This is more concerning to me. In fact, the concern can be stated just in terms of the Tarski hierarchy: 2nd-order logic does not exist at any stage in the Tarski hierarchy. I was foolishly thinking that there was a direct correspondence between stages in the hierarchy and orders of logic. There is, if we use predicative comprehension! But, impredicative comprehension is "too powerful" to exist at any stage of any recursive hierarchy.

This causes trouble for me, because impredicative comprehension is necessary to my understanding of the natural numbers (and other important things).

Fortunately, the article suggests a solution: impredicative quantifications should be understood at the level of provability. IE, we aren't stating that all instances are true; we are stating something stronger, that they are all provable. (But we use infinitary logic, so that this is not as much of a restriction.) This is a very intuitionistic suggestion.

Unfortunately, we can see that this really just introduces another metalanguage (the language of provability). Hence, comprehension interpreted this way does not include all possible predicates after all (even if we extend the provability interpretation to the point where it covers all cases of 2nd-order comprehension): it skips the predicates which can be defined in terms of provability.

Building a self-referential theory of logical implication which follows the rules we want is hard. The obvious way of doing it just leads directly to Curry's paradox. I still have not finished reading Hartry Field's "Saving Truth from Paradox", but it includes an account of implication which may be helpful. These posts also contain some notes which may be helpful.

I should have known for about a year now that the immediate challenge is to built up to 2nd-order logic, not to build up to set theory. However, this made it hit home: I can't rely on 2nd order logic to be there for me when I need it. I don't yet know what 2nd-order logic is, or where it comes from.

Still, the fact remains that 2nd-order logic is a fairly natural, very useful, and almost certainly consistent set of axioms. It seems probable that aliens would stumble upon it, etc. It does not just poof out of existence. Even just as a syntactic entity rather than one with a semantics, it's still important!