So: if we want to make good inferences, is it enough to look for a programming language which describes good inferences succinctly, and then proceed to carry out actions which have succinct descriptions in that language?
Specifically, can this strategy be arranged so that it is (in some sense) equivalent to searching for the inference algorithm with the lowest expected time (or the lowest average time on randomly generated test problems typical of what has been seen so far)?
If we've got a bunch of trial runs of different inference algorithms, then the absolute best language in the sense that I'm looking for would be one that could state only the algorithm with the best average time. Lesser languages would be judged by how good the inference algorithms they can state are. This should make clear part of the divergence from the simple concept of compression: we're not trying to compress all the data, just the "best" data. Our data samples span a range of goodness; we want very short descriptions for the best cases, but very long ones for the worst cases.
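To make the judging criterion concrete, here is a minimal sketch of one way a language could be scored from trial runs. It rests on an assumption that is mine, not part of the argument above: each algorithm a language can state is weighted by 2^-(description length in bits), and the language's score is the weighted average runtime. The names `language_score`, `trial_times`, `good_language` and `bad_language`, along with all the numbers, are hypothetical.

```python
from statistics import mean


def language_score(description_bits, trial_times):
    """Weighted average runtime of the algorithms a language can state.

    Assumption (illustrative only): an algorithm whose description is
    `bits` long gets weight 2**-bits, so short descriptions dominate.
    Lower scores are better.
    """
    weights = {name: 2.0 ** -bits for name, bits in description_bits.items()}
    total = sum(weights.values())
    return sum(w * mean(trial_times[name]) for name, w in weights.items()) / total


# Hypothetical trial runs: observed runtimes for three inference algorithms.
trial_times = {
    "fast": [1.0, 2.0, 3.0],
    "medium": [4.0, 6.0],
    "slow": [20.0, 40.0],
}

# A language that states the good algorithm briefly and the bad one at length...
good_language = {"fast": 1, "medium": 4, "slow": 12}
# ...and one that does the opposite.
bad_language = {"fast": 12, "medium": 4, "slow": 1}
# The limiting case: a language that can state only the best algorithm.
ideal_language = {"fast": 1}

for name, lang in [("good", good_language), ("bad", bad_language), ("ideal", ideal_language)]:
    print(name, round(language_score(lang, trial_times), 3))
```

Under this weighting, the language that can state only the fastest algorithm gets the lowest possible score, and a language is penalised exactly to the extent that it makes slow algorithms cheap to describe.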
Another difference is that we don't care in the same way about the simplicity of the result. It might be a good idea to favour simpler languages when we only know how good they are on some data, since simpler ones will probably generalise their behaviour more cleanly to more data. Yet, if we could calculate the actual average runtime on truly average data, shortness would no longer be a concern, unless we ran a risk of exhausting the memory. With compression, this is not the case: there, shortness is itself the objective.
One might still base an effective algorithm on the loose metaphor: perhaps implement a Levin-like search for good inference strategies, and then some compression-like meta-search for strategy programming languages that tend to result in good strategies. However, the basis in compression doesn't appear to be totally rigorous.
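For readers unfamiliar with the search being alluded to, here is a minimal sketch of a Levin-style schedule. It assumes each candidate strategy comes with a description length in bits and can be run under a step budget; in phase k, a candidate of length l gets 2^(k - l) steps, so shorter descriptions get more time. The function names, the toy problem (finding an integer square root by scanning), and the assigned lengths are all hypothetical, and the meta-search over languages is not shown.

```python
def levin_search(candidates, max_phase=20):
    """Levin-style search over (length_bits, run) pairs.

    `run(budget)` must return a solution, or None if it does not find
    one within `budget` steps. Each phase restarts every candidate
    with a doubled budget.
    """
    for phase in range(1, max_phase + 1):
        for bits, run in candidates:
            if bits > phase:
                continue  # too long to be considered in this phase
            budget = 2 ** (phase - bits)
            result = run(budget)
            if result is not None:
                return result, phase
    return None


def make_scan(target, start, step):
    """Toy strategy: test x = start, start + step, ... for x * x == target."""
    def run(budget):
        for i in range(budget):
            x = start + i * step
            if x * x == target:
                return x
        return None
    return run


# Two hypothetical strategies with made-up description lengths.
candidates = [
    (2, make_scan(144, start=0, step=1)),     # short description: count up from 0
    (5, make_scan(144, start=100, step=-1)),  # longer description: count down from 100
]

found = levin_search(candidates)
print(found)
```

The relevance to the metaphor is that the description lengths are set by the choice of language: a language that gives good strategies short descriptions makes this search finish in an earlier phase.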