I just did a post on inference guidance, but I've got a lot more thoughts (which may or may not be worth anything).
The first one is a curiosity I've had for a while: a good way of predicting is to compress sensory information. Can compressing good inference methods lead to a similarly good method of inference?
The previous post indicated at least two ways in which compression can be relevant. First, the Levin-like search gives more priority to shorter inference guidance methods, so although it's not actually trying to compress anything, the result will tend to be compressive of the solution (and of its proof). One might expect that the more compressive a solution, the better it will generalize. However, note that what we're worried about is speed, not correctness: an inference method cannot harm the correctness of a result, only the speed with which it is inferred. (It can delay the inference perpetually, though.) So we want the speed to generalize, and here compression cuts the other way: a program that is more compressive of the same data tends to take longer to execute.
Still, I'd expect some connection between compressiveness and the generalization of speed. Wouldn't you?
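To make the Levin-like prioritization mentioned above a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the two-instruction set (`inc`, `dbl`), the task of reaching a target integer from 1, and the names `run` and `levin_search`. It is not the search from the previous post, just the usual budget rule: in phase k, a program of length l is allowed 2^(k-l) steps, so shorter programs get exponentially more time.

```python
from itertools import product

# Hypothetical instruction set: each op transforms an integer.
OPS = {"inc": lambda x: x + 1, "dbl": lambda x: x * 2}

def run(program, budget, start=1):
    """Run a straight-line program; give up (return None) if it needs
    more steps than the budget allows."""
    if len(program) > budget:
        return None
    x = start
    for op in program:
        x = OPS[op](x)
    return x

def levin_search(target, max_phase=20):
    """In phase k, every program of length l <= k gets 2**(k - l) steps."""
    for phase in range(1, max_phase + 1):
        for length in range(1, phase + 1):
            budget = 2 ** (phase - length)
            for program in product(OPS, repeat=length):
                if run(program, budget) == target:
                    return program, phase
    return None, None

program, phase = levin_search(10)
```

Note that the search never tries to compress anything; short solutions come out first only because of how the time budget is split, which is the sense in which the result "tends to be compressive".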
The second definite connection is that it's useful to compress the problems the system has solved so far, in order to anticipate the coming problems and fine-tune to them. One way of thinking about this is that we look for a good language to describe the problems so far. Then, to generate new problems, we generate random descriptions in that language.
It would be nice to apply this to the search for strategies: rather than looking for a good strategy, look for a good programming language in which randomly generated strategies tend to be good! Remember the best strategies, and use them to continue revising and improving the language. The question is, does this scheme have normative value? Will it systematically approximate the optimal behavior?
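One very simple reading of this loop can be sketched in Python, with heavy caveats. Here the "language" is reduced to a probability weight per primitive, a "strategy" is a short sequence of primitives, and the toy `score` function (how close the strategy gets to a target number) stands in for whatever "good" really means. All of these names and choices are hypothetical. The result resembles an estimation-of-distribution or cross-entropy-style loop, which may be much cruder than a real language-revision scheme, and it says nothing about the normative question.

```python
import random
from collections import Counter

OPS = ("inc", "dbl")  # hypothetical primitives of the "language"

def score(strategy, target=10, start=1):
    """Toy quality measure: 0 is best, more negative is worse."""
    x = start
    for op in strategy:
        x = x + 1 if op == "inc" else x * 2
    return -abs(x - target)

def sample(weights, rng, length=4):
    """Generate a random strategy in the current language."""
    return tuple(rng.choices(OPS, weights=[weights[o] for o in OPS], k=length))

def refit(elite, smoothing=1.0):
    """Revise the language so the remembered strategies become more probable."""
    counts = Counter(op for s in elite for op in s)
    total = sum(counts[o] + smoothing for o in OPS)
    return {o: (counts[o] + smoothing) / total for o in OPS}

def evolve_language(rounds=10, batch=30, keep=5, seed=0):
    rng = random.Random(seed)
    weights = {o: 1 / len(OPS) for o in OPS}
    elite = []  # the best strategies remembered so far
    for _ in range(rounds):
        candidates = elite + [sample(weights, rng) for _ in range(batch)]
        unique = list(dict.fromkeys(candidates))  # dedupe, keep order
        elite = sorted(unique, key=score, reverse=True)[:keep]
        weights = refit(elite)
    return weights, elite

weights, elite = evolve_language()
```

A unigram weighting like this can only shift how often each primitive appears; a richer language would also need to capture structure, such as reusable sub-strategies, for the idea to be interesting.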