Wednesday, August 4, 2010

Logical Systems

This is a rehash of some ideas from my older blog. Basically, I'm trying to state the problem anew, in light of reading Jon Cogburn's paper Are Turing Machines Platonists? (Unfortunately not available online, but do email him if you want a copy; he is a nice fellow.)

Inferentialism is, roughly speaking, the view that the only thing important to our understanding of a statement is the way that statement interacts with the surrounding web of statements in our belief system. This is made precise by saying that we understand a statement exactly when we could recognize a proof or disproof of it.

Computationalism is the view that a mind can be represented as a computer program, that is, there is no fundamentally non-computable stuff going on up there: if a computer was fast enough, it could compute the proper outputs to the nerves based on the microsecond-to-microsecond inputs received.

Together, these two views entail that we cannot fully understand math in its present form. Goedel's semantic incompleteness theorem shows that for any computer program, there will exist statements in basic number theory of which that computer program can recognize no proof or disproof. Perhaps one might respond that this seems acceptable, since it becomes very difficult to understand complicated mathematical statements, and it doesn't seem implausible that we have some limit corresponding to Goedel-style incompleteness. However, in general our ability to understand the meaning of mathematical statements does not seem to correspond that well to our ability to prove or disprove them (or to understand proofs provided by others). The continuum hypothesis, for example, seems understandable; yet it is known not to have a proof or disproof in any widely accepted set of axioms (it is independent of ZFC, for instance). It seems implausible (to me, at least) that its understandability comes from its being provable or disprovable in our mental logic. The halting problem provides numerous other examples which I would claim are understandable yet not amenable to proof or disproof.
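
To make the halting-problem point concrete, here is a toy Python sketch (my own illustration; the Goldbach example is a stock one and isn't taken from Cogburn's paper). The statement "this program never halts" seems perfectly understandable, since it just restates the conjecture, even though nobody currently has a proof or disproof of it; and for any fixed consistent, computably axiomatized proof system, there are programs of this general shape whose non-halting that system can neither prove nor refute.

```python
# Toy sketch: a program whose (non-)halting restates Goldbach's conjecture.
# "search_for_counterexample() never halts" is understandable whether or not
# we can prove or disprove it.

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def search_for_counterexample():
    """Halts iff some even number greater than 2 is not a sum of two primes."""
    n = 4
    while True:
        if not any(is_prime(p) and is_prime(n - p) for p in range(2, n - 1)):
            return n      # found a counterexample, so the program halts
        n += 2            # otherwise, keep searching forever
```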

This means we've got to give up inferentialism, computationalism, or classical mathematics. Very roughly speaking: people who give up classical mathematics are some variety of intuitionist or constructivist; people who give up computationalism are some variety of dualist or hypercomputationalist (like Roger Penrose).

Now, I don't disagree in principle with restructuring math from the bottom up, but it seems desirable for a foundational program to capture as much as possible of the way mathematicians intuitively reason, so I'm hesitant to be a constructivist (though I may yet be converted). Similarly, I don't have anything against the possibility that some processes yet-unknown to physicists are endowing minds with special non-computational behaviors (and since I'm not a constructivist, I even believe that such behaviors can be well-defined and deterministic!). However, I don't know that this is the case, and neuroscience seems to suggest that much of neural processing can be accounted for in a computational way. Furthermore, as an artificial intelligence researcher, I hope that the essence of "mind" can be captured computationally. So I fall into the remaining category: I want to give up inferentialism.

Giving up inferentialism is not to be done lightly. It's a highly plausible assertion, especially for the computationalist: what else should matter about a statement than the computational interactions it has with other statements?

The solution I find plausible I'll call weak inferentialism: we understand a statement if we can compute a defining set of inferences. The statement's meaning is precisely that defining set; it is merited when all inferences in that set are merited, and (optionally?) false when one of them is false. (Should falseness be taken as a basic concept?) This does not mean that we can compute all of the statement's consequences, though. For example, the defining set of a universal statement P: "For all x, S is true of x" will be all the statements "S is true of A", "S is true of B", ... It's possible that another statement, Q, has a list of consequences which is some subset of P's list. In that case, Q would be a consequence of P which does not appear in P's defining list. In some sense, however, Q does not add anything to the list: it just summarizes a portion of it. The weak inferentialist argues that this allows us to understand P without necessarily knowing that Q follows from it.
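
Here is a minimal sketch of that picture in Python, under my own toy encoding: a statement is identified with a program that enumerates its defining consequences, and Q's defining list is a sub-list of P's. (The names P and Q_even and the predicate S are illustrative only.)

```python
from itertools import count, islice

def P():
    """'For all x, S is true of x', over a toy domain of natural numbers."""
    for n in count(0):
        yield f"S is true of {n}"

def Q_even():
    """A statement whose defining consequences form a subset of P's list."""
    for n in count(0, 2):
        yield f"S is true of {n}"

# We can compute any finite portion of P's defining set...
print(list(islice(P(), 5)))
# ...but no finite part of that enumeration announces that Q_even follows from P;
# that fact lies outside P's defining list, which is the weak inferentialist's point.
print(list(islice(Q_even(), 5)))
```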

(There may be some interesting connections between these two types of inferentialism and the "Principle of Harmony" from proof theory, which states that the introduction rules and elimination rules for symbols should precisely mirror each other. This basically corresponds to a connection between the inferences from which we can conclude a statement and the inferences we can make from that statement. This principle may have to be spelled out differently for the two types of inferentialism. I don't know enough about the principle of harmony to make a well-considered connection, though.)

Now, the question: what foundational logics do the two different inferentialist pictures recommend? In particular, if we're also computationalist?

Strong inferentialism will only care about notions of logical consequence which have complete proof theories, like first-order logic. An inferentialist will only care about what structures of reasoning can be implemented in the logic. In particular, it seems natural to consider a logic as a programming language. Think of it like this: we have some basic domain of discourse we wish to talk about (such as the actual world), and we have the logic, which allows us to make assertions that cause some statements about the domain of discourse to entail others. The logic is nothing more than a means for expressing these entailment relationships between the domain-level facts.
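
As a rough illustration of "logic as a programming language" (my own toy encoding, not anything from the paper): treat domain-level facts as plain strings, let the "logic" be nothing but a set of rules recording which facts entail which others, and compute consequences by forward chaining.

```python
def closure(facts, rules):
    """Saturate a set of facts under rules of the form (premises, conclusion)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                changed = True
    return facts

# The "logic" is just these entailment relationships between domain-level facts.
rules = [
    ({"raining"}, "streets are wet"),
    ({"streets are wet", "freezing"}, "streets are icy"),
]
print(closure({"raining", "freezing"}, rules))
```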

Ignoring computational efficiency and one or two other practical matters, it seems that little about the logic matters once we've determined that it is Turing complete. Classical first-order logic, intuitionistic first-order logic, and a host of others will all do equally well.

Interestingly, though, more powerful logics can be motivated even in this minimalistic worldview (if we bring practical matters like speed back into the picture). Goedel showed that mathematically more powerful logics (in a specific sense) will always have the following property: there will be some inferences which require an astronomical number of inference steps in the less-powerful logic, but which the more-powerful logic proves in just a few steps. This theorem only holds for arbitrarily large domains of discourse, though, so it is an empirical question whether the phenomenon occurs in practical situations. The papers "A Curious Inference" by George Boolos and "Some More Curious Inferences" by Jeffrey Ketland discuss the issue (taking the affirmative).
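
To convey the flavor of the size explosion (the exact function and numbers in Boolos's paper differ; this is only my own illustration): the conclusion of his inference involves an Ackermann-style function, and a first-order proof essentially has to spell out a numeral for its value, which grows like a tower of exponentials.

```python
def tower(height, base=2):
    """base ** base ** ... ** base, iterated `height` times."""
    value = 1
    for _ in range(height):
        value = base ** value
    return value

# The numeral a first-order proof would have to construct grows absurdly fast:
for h in range(1, 6):
    print(h, len(str(tower(h))), "digits")   # 1, 1, 2, 5, then 19729 digits
```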

Happily, the notion of "more powerful" here coincides at least to an extent with the more typical notions, which seems to mean that we can justify a good amount of normal math via this sort of reasoning, despite the fact that strong inferentialism will deny that math its standard meaning. However, I don't know the precise connection here, and I won't try to explore (in this blog post) precisely what of mathematics could be justified in that way.

In any case: what sort of view of logic does weak inferentialism suggest? Well, based on the idea of the defining set of consequences, we could say that a (non-basic) statement is a computer program for listing its own consequences. The "most expressive" logic will be one which uses a Turing-complete notation to do this. The key difference between this system and the previous one is that we may not be able to infer a statement even if we can infer all of its defining consequences: we cannot implement the truth conditions computationally. Hence, we still have a (highly abstract, irrespective of speed issues) concept of a more powerful logic: a more powerful logic will know more about which statements follow from which others. This is equivalent to knowing more about the halting problem (since we could check for implication A->B by making a program that halts when B implies something A does not, but keeps looping otherwise).
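
Here is a sketch of that picture in Python, using my own encoding (the statements A and B and the search bound are illustrative). Each statement is a program that lists its defining consequences; whether A entails B is then the question of whether everything on B's list eventually shows up on A's list, and settling that outright is a halting-problem-like question rather than something a bounded search can decide.

```python
from itertools import count, islice

def A():                      # "S holds of every number"
    for n in count(0):
        yield f"S({n})"

def B():                      # "S holds of every even number"
    for n in count(0, 2):
        yield f"S({n})"

def looks_entailed(stmt_a, stmt_b, bound):
    """Check, up to a bound, that B's consequences all appear among A's.
    A False here only means the search didn't find them within the bound;
    it is not a refutation of the entailment. That gap is where the
    halting problem hides."""
    b_consequences = list(islice(stmt_b(), bound))
    a_consequences = set(islice(stmt_a(), 10 * bound))   # dovetail deeper into A
    return all(c in a_consequences for c in b_consequences)

print(looks_entailed(A, B, 100))   # True, up to the bound
```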

Fortunately, the extra information can always be expressed in the same logic! We never need to add more expressiveness, only more knowledge. The work done by any additional symbols can evidently be done without them, because the notation is Turing-complete.

The weak inferentialist logic includes the Liar sentence, i.e., the sentence whose defining consequence is just its own negation. This can be dealt with via Kripke's fixed-point valuation: we enforce the constraint that a statement is considered true exactly when its defining inferences are merited, but we don't require that every sentence is either true or false. The inference "The Liar sentence is false" is neither right nor wrong; it remains undefined, since there is nothing for it to take its truth or falsehood from. The Liar sentence is ungrounded.
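
A toy version of the fixed-point idea, in Python (my own coding of it, with a deliberately simplified truth-functional reading of the sentences rather than the full inferential story): start with every sentence undefined and repeatedly fill in whatever values are forced; the Liar never receives one.

```python
UNDEF = None   # the third, "undefined" value

def neg(v):
    return UNDEF if v is UNDEF else (not v)

# Each sentence is a function from the current valuation to True/False/UNDEF.
sentences = {
    "snow is white": lambda val: True,                    # grounded base fact
    "liar":          lambda val: neg(val["liar"]),        # says its own negation
    "liar is false": lambda val: (UNDEF if val["liar"] is UNDEF
                                  else val["liar"] is False),
}

valuation = {name: UNDEF for name in sentences}
while True:
    new = {name: rule(valuation) for name, rule in sentences.items()}
    if new == valuation:      # a fixed point: nothing further is forced
        break
    valuation = new

print(valuation)
# {'snow is white': True, 'liar': None, 'liar is false': None}
```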

Exactly how this works will depend on whether we want falsehood as a basic concept, which I left open at the beginning. If we don't take it as basic, then falsehood might be defined as the property of implying everything. The Liar paradox then becomes something very reminiscent of the Curry paradox: "This sentence implies everything." What the fixed-point construction tells us is that we can't in general use hypothetical reasoning to see if inferences are indeed justified: if we want to know whether sentence X, which asserts just X|-Y, is true, then it appears we should assume X and see if we can derive Y. (Read X|-Y as "from X we can infer Y".) If we assume X, then we know X|-Y, and combining those two, we know Y. This hypothetical reasoning proves X|-Y, so we know X (un-hypothetically). But this entails Y, which might be "everything"! Logicians have provided weaker forms of hypothetical reasoning which conform to the fixed-point construction in order to avoid this problem. (Specifically, we can only make hypothetical assumptions which we already know are grounded.)
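
The explosion can be mechanized. Below is a toy prover (entirely my own construction, just to illustrate the point) with modus ponens and unrestricted conditional proof; given a sentence C defined to mean C|-Y, it "proves" the arbitrary atom Y from no premises at all. Restricting the conditional-proof rule to assumptions already known to be grounded, as described above, blocks the very first step.

```python
# Formulas: an atom is a string; an implication is ("->", antecedent, consequent).
DEFS = {"C": ("->", "C", "Y")}      # the Curry sentence: C is defined to say C |- Y

def unfold(f):
    """Replace a defined name by the formula it abbreviates (one step)."""
    return DEFS.get(f, f)

def proves(goal, assumptions=frozenset(), fuel=8):
    """Naive prover: modus ponens plus *unrestricted* conditional proof."""
    if fuel == 0:
        return False
    g = unfold(goal)
    if goal in assumptions or g in {unfold(a) for a in assumptions}:
        return True
    # Unrestricted conditional proof: to show A -> B, just assume A and derive B.
    if isinstance(g, tuple):
        return proves(g[2], assumptions | {g[1]}, fuel - 1)
    # Modus ponens: find an implication whose consequent is the goal, then
    # establish both the implication itself and its antecedent.
    for c in [unfold(a) for a in assumptions] + [unfold(n) for n in DEFS]:
        if isinstance(c, tuple) and unfold(c[2]) == g:
            if proves(c, assumptions, fuel - 1) and proves(c[1], assumptions, fuel - 1):
                return True
    return False

print(proves("Y"))   # True: Y follows from nothing, i.e. "everything" is provable
```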

It's interesting that this means the sentence which just claims that inferring Y is justified is radically different from the sentence which claims that inferring Y from itself is justified, despite the fact that once we believe either, they justify the same inferences. The two even have the same conditions for being true: if we know Y, we can conclude both (since A|-B is true when B is inferable, that is, |-B implies A|-B, regardless of A). However, when Y is false, then the statement "infer Y" is false, but "From this statement, infer Y" is ungrounded and thus considered undefined.

The final thing to note is that, although no further expressiveness appears to be justified by weak inferentialism, the system described cannot fully express the concept of "groundedness" I've been using. (It can only mark it true, never false; but I've noted that statements are ungrounded more than once in this discussion.) Hence, we have an immediate example of a concept which appears to be mathematically well-defined, but which weak inferentialism does not seem to be able to account for. Yet, what is lost? After all, these supposed statements can't even be given a computable list of defining inferences they justify! Is it useful to state that something is ungrounded? (I think the more basic notion called into question here is negation itself: is it always meaningful to know that something is not the case? Negation has no defining set of inferences!)