Honestly, the theory is motivated more by "this is what we can do" than by "this is what Truth should taste like". However, the result was surprisingly nice, so I think it's worth doing a post-hoc analysis.
I came up with the mechanism based on the realisation that Peano Arithmetic and its relatives are self-referential not because they talk about numbers, but because they talk about arbitrary computations on numbers. Robinson Arithmetic was formulated by examining what parts of Peano Arithmetic are actually used to get self-reference; it is a significantly weaker theory which still supports the diagonal lemma and all it entails. The fact is, all total computable functions have fixed points. This means that for any (halting) computable function F which takes as input one program and as output generates another, there will always be a program P which is identical in behaviour to F(P). This result is almost silly! However, it tells us something about the structure of computation. It is what allows us to construct self-referential sentences in Peano Arithmetic (and Robinson Arithmetic).
So, in order to get a truth predicate in the same language ("self-referential in the good way"), all we need to do is deny the ability to talk about arbitrary computations on sentences while discussing their truth (so that it isn't "self-referential in the bad way").
The basic idea is to avoid introducing new, undesired sentences which turn out to be pathological. Talking about the truth of arbitrarily computed sentences seems perfectly plausible until we see that we're getting more than we bargained for. The strict language-metalanguage barrier introduced by Tarski ensures that we only talk about the truth of sentences which are in our original language, avoiding the problem. My 'impredicative' approach weakens this barrier; we can talk about the truth of sentences which make use of the truth predicate, which means we still get some "unexpected" sentences where truth is discussing itself rather than discussing the base-level domain. We're just restricted to making use of relatively simple manipulations on these sentences, so that we don't have the machinery to make fixed-point constructions.
My formalism allows predicates to be recursively defined, using constructions similar to the definition of the natural numbers in 2nd-order logic. However, it does not allow self-referential predicate definitions. It is interesting that this distinction can be made. We can employ recursion at the object level, but not at the concept level.
Recursion at the object level looks like this (where 0 is a logical constant and s() is a function symbol):
All P: (P(0) and [All y: P(y)->P(s(y))]) -> P(x).Here, we are defining a predicate for "natural number" based on a constant for "zero" and a function for "+1". Notice that although this is a recursive definition, we do not need to use the predicate itself in its definition. This is very important; if we were allowed to refer to a predicate in its own definition, then we would run into trouble very quickly:
Predicate P(x): not P(x).More generally, a self-referential definition may not be consistent or complete, whereas a recursive definition will be (although it may not be decidable).
So, it doesn't make sense to form concepts using self-referential definitions (at least not just any self-referential definition). And since forming predicates from arbitrary computations is enough to give us that ability, it's got to go, too.
That's all well and good, but this doesn't seem like anything really new for theories of truth: we know that complete expressive power results in inconsistency, so various theories of truth give us different restrictions on expressive power which (hopefully) result in a consistent theory. So what?
The interesting thing here is that, because of the closeness to 2nd-order logic, we get a nice property: we can represent 3rd-order logic, 4th-order logic, .. Nth-order logic where N is an ordinal definable within 2nd-order logic. The basic story is that the powerset relation is definable at the object level, which means that we can add axioms which force special 1st-order entities to behave as classes. Despite the separation in the language, concepts can be examined as objects after all!
If we do this, we can even discuss the (1st-order) truth of arbitrarily computed sentences, and more or less anything else which we would have wanted to do.
In Kripke's theory of truth, we had the "revenge problem": truth can be completely discussed within the language, but a new semantic value, the "indeterminate" or "paradoxical" state, cannot be discussed. This preserved Tarski's theorem, in a generalised sense: the system could discuss its own truth predicate, but not its own semantics.
If we wanted to add a predicate for this 3rd value, in such a way as to allow the assertion of true statements about it, we would be taking on a rather significant modification of the logic. Certainly the semantics of the predicate could not be described with a few sentences in the existing language.
The same theme extends to other theories: typically, the semantics of the theory are a mathematical object too complex to be defined in the theory of truth at hand. That's what concerned me when trying to use theories of truth as foundations for mathematics; in general, it seemed as if the exposition of the theory itself provided an example of a mathematical discussion which could not be captured in the theory.
For the theory of impredicative truth, on the other hand, we find that although the theory cannot represent its own semantics with no modification, the modification needed is rather light. Adding axioms defining object-level classes seems rather easy. Granted, the extended system technically has a different semantics; in asserting the new axioms to be true, we are changing the meaning of some logical constants (assigning a relation to the role of an object-level membership relation, for example). It may look at first glance like a system describing its own semantics, but really, the extended system is describing the old system. (If we attempted to make objects corresponding to all the classes in the new system, we would be making axioms equivalent to naive set theory.)
So, to talk about the semantics of the system, we still make a meta-language. The close relationship between the language and the metalanguage is very encouraging, though. Specifically, this seems to indicate that a system capable of learning new axioms will be able to incrementally learn its own semantics.
What this gives us is a hierarchy of languages which are increasingly expressive. Is it any better than the Tarski hierarchy? To some extent, I think this solution is very similar in spirit; it is a very "conservative" solution, relying on a stratified language and maintaining classical logic. The nice relationship between language and metalanguage is arguably only a little nicer than Tarski's. However, the impredicative stratification is more powerful; I feel it is closer to the foundations of mathematics. What we get out of this is a recommendation to view a finite axiomization in 2nd-order logic as, to some extent, more basic than a formulation of a theory in 1st-order logic via axiom schema. 2nd-order arithmetic is a more fundamental foundation than 1st-order, and NBG set theory is preferred over ZFC. Proper classes are seen as a natural thing to have around, rather than an unnatural partial level.
I'm actually disappointed that this view seems to endorse a more standard well-founded view of set theory, as opposed to what I view as more natural non-well founded theories (in which we have a set of all sets, which is itself a set). Oh well.
What I need next is a natural way to specify a prior distribution over statements in this logic, so that we can reason probabilistically over it.
In any case, that's enough for now. I think I might have given a more results-oriented account, despite hoping to give intuition prior to results. Perhaps this theory is better justified in terms of what we can do with it, rather than what we wanted truth to be like beforehand.
Cautionary note: I have not yet spelled out what the semantics of this theory might be; in particular, I have not spelled out how similar or different it might be from the standard semantics of 2nd-order logic. This could bring into doubt some of the parallels with 2nd-order logic used in this post.