Tuesday, January 24, 2012

Why Logic?

Continuing my "what are you doing and why are you doing it" series, I'll try to explain why I'm interested in logic.

Previously, I talked about why relational methods are needed. However, that argument was really incomplete. My reason for wanting relational methods was expressive power. As I said:

My major recurring rant has been referential generality. The idea is that a general machine intelligence system should be able to represent any kind of pattern that a human can. This is motivated by the fact that it's impossible to learn a pattern that you can't represent in the first place.
In other words, my original motivation was learning theory, but I realized that in order to learn any kind of pattern that humans can learn, a system needs to be able to represent any kind of pattern a human can. So, I went from learning theory to knowledge representation.

However, there is already a very general and powerful theory that tells us what kinds of patterns can be represented: the theory of computation. The space of patterns can be taken to be the space of computer programs. Algorithmic information theory develops this line of thought; Solomonoff Induction is a theory of learning based on it, and Levin Search is a general problem-solving method which takes advantage of the same idea.
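For readers who want the core formula, here are the standard definitions from the algorithmic information theory literature (I'm adding them here; they aren't spelled out in this post). Solomonoff's prior weights each finite string x by the programs that produce it on a universal prefix machine U:

    m(x) = \sum_{p \,:\, U(p) = x} 2^{-\ell(p)}

So shorter programs, i.e. simpler patterns, get exponentially more weight. Levin Search exploits the same weighting: it runs all programs in parallel, giving program p a share of compute proportional to 2^{-\ell(p)}, so that a solution some program p finds in time t(p) is found by the search in time on the order of 2^{\ell(p)} t(p).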

These ideas are all very good in my opinion (perhaps I'll write some posts explaining them a bit). Yet, I've chosen to focus on logic. Why?

There is a long-standing argument in the artificial intelligence field between procedural and declarative knowledge representation. Those on the procedural side believe that it is better to represent knowledge as computer programs that are executed to produce behavior, whereas those on the declarative side prefer logic and related formalisms.

I think both are needed, for different purposes. However, I am more interested in the declarative side. When declarative methods are applied, the resulting knowledge can be more understandable. Knowledge can be more easily carried over from one task to another. The system can be instructed declaratively, given constraints, and so on.

All of those things are nice, but perhaps my "real" reason for preferring logic is that I want to capture human reasoning (at least, the successful/useful aspects of it). A procedural representation is somewhat opaque. I can't tell for sure what elements of human reasoning it captures. Most of the time, the answer is something like "if that behavior is useful, then the system can learn to duplicate it in principle". That's not quite satisfying for me. For example, I am very interested in the foundations of mathematics. A procedural learner doesn't fundamentally care about mathematics; I can only say that if math is useful, it could learn to do it. With a logical representation, however, I can more explicitly see how the system could reason with specific mathematical constructs.

Of course, procedural representations also have advantages. The behavior of the system can be easier to examine. The system can be instructed procedurally (which can be easier in some cases). Inference is typically more efficient, since inference is just execution. As I said, the procedural approach is more appropriate for some purposes.

One response to this situation might be to argue that the two options are equivalent from a theoretical point of view, the differences being pragmatic. Logic can be thought of as a programming language, with the logic engine being the interpreter. Or, programs can be thought of as a sort of logic. An advocate of this view might point to Prolog, the logic programming language, or to the Curry-Howard isomorphism with its "programs are proofs" motto.
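To make the "programs are proofs" motto concrete, here is a minimal sketch in Haskell (my own illustration, with made-up names; nothing from the original post). Reading the function arrow as implication, a well-typed term is a proof that its type, viewed as a proposition, is inhabited:

    -- Under the Curry-Howard reading, the type 'a -> b' is the
    -- implication A => B, and a closed, well-typed term of that type
    -- is a proof of it (in intuitionistic propositional logic).

    -- A proof of A => (B => A) (the K combinator): given A, discard B.
    k :: a -> b -> a
    k a _ = a

    -- A proof of (A => B) => ((B => C) => (A => C)): implications chain.
    compose :: (a -> b) -> (b -> c) -> (a -> c)
    compose f g = g . f

    main :: IO ()
    main = print (k (42 :: Int) "ignored")  -- a proof is still a runnable program

As the next paragraph argues, this is also the limit of the correspondence: the term only witnesses that some program of its type exists.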

I believe this is misguided. Prolog is not logic. Programs are not proofs (or rather, they are very boring proofs: the Curry-Howard isomorphism just says that writing a program proves that a program of that type exists). Declarative and procedural knowledge have essentially different characters. One difference is that declarative knowledge justifies a wide variety of inferences, without telling you what order to make those inferences in; that's why the same knowledge can be used in many different ways. Another difference is that logical theories can be essentially incomplete, as Gödel proved: no consistent, computably axiomatized extension of such a theory settles every sentence. There is no analogous phenomenon for programs. [A program can be syntactically incomplete (i.e., unfinished, partly unwritten), but in that case there always exists a completion. A program can contain an infinite loop or otherwise fail to halt, in which case we call it a partial program; but in that case there is no relevant notion of "completion", so we can't define essential incompleteness.]
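As a small illustration of the first difference, consider the declarative rule grandparent(X, Z) <- parent(X, Y), parent(Y, Z). A logic engine can use that single rule to answer queries in several directions, whereas a procedural encoding commits to one direction per function. Here is a hypothetical sketch in Haskell (the names and data are made up for illustration):

    -- A tiny fact base: (parent, child) pairs.
    parents :: [(String, String)]
    parents = [("alice", "bob"), ("bob", "carol"), ("bob", "dave")]

    -- Direction 1: who are x's grandchildren?
    grandchildrenOf :: String -> [String]
    grandchildrenOf x =
      [ z | (x', y) <- parents, x' == x
          , (y', z) <- parents, y' == y ]

    -- Direction 2: who are z's grandparents?
    grandparentsOf :: String -> [String]
    grandparentsOf z =
      [ x | (y, z') <- parents, z' == z
          , (x, y') <- parents, y' == y ]

    main :: IO ()
    main = do
      print (grandchildrenOf "alice")  -- ["carol","dave"]
      print (grandparentsOf "carol")   -- ["alice"]

Each direction had to be programmed separately; the one declarative rule licenses both (and more), which is the sense in which procedural knowledge fixes an order of inference and declarative knowledge does not.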

Therefore, I am interested in studying learning theory from the perspective of logic. It turns out that logic-based learners are strictly more expressive, in ways related to the above observations, without becoming uncomputable. (Logic can represent any computable theory that procedural learning would generate, but it can also represent some theories which can't be precisely translated into procedural knowledge, even though inference on those theories remains computable.) I wouldn't say this is a decisive victory for logic (it's not clear what practical advantage a logic-based learner would have over a procedural one), but it's certainly interesting. I'm not done yet, though... I can explain some nice properties of human reasoning, but others still escape me.