Normally, I'm a hard-core Bayesian. This means that I believe all uncertainty is essentially probabilistic uncertainty, and also, goal-oriented behavior is essentially utility-maximization. These are beliefs which make predictions: I predict that AI techniques which attempt to approximate the rules of probability theory and utility maximization will tend to work the best.
In my previous post, I gave a reason to doubt this: Bayesian systems rely on a pre-specified model space, and this model space will be inherently incomplete, for several different reasons. This does not have profoundly bad consequences for probability theory (a Bayesian system will still do its best to make reasonable predictions, in some sense); however, it may have worse consequences for utility theory (it isn't clear to me that the system will do its best to achieve its given goal, in any strong sense).
This, too, is a testable belief. I've been discussing some experiments with Lukasz Stafiniak which will help here (but we have set no definite deadline to get around to this, as we both have other things to do). (I should mention that his motivation for being interested in these experiments is not the same as mine.) It could also be addressed on purely theoretical grounds, if it could be proven that (specific sorts of?) Bayesian systems can or cannot learn specific behaviors (behaviors which other sorts of systems are capable of learning).
What is the competitor to Bayesianism? Model-free learning attempts to directly learn the policy for the environment based on feedback, without trying to make predictions about the world or directly represent world knowledge.
In this view, trial-and-error learning becomes more fundamental than Bayesian learning. This makes some philosophical sense. After all, if we reason in an approximately Bayesian way, it is because we have evolved to do so through a process of mutation and natural selection.
The model-free approach has been popular in the past, and there is still research being done in the area, but model-based methods have the technique of choice for complex problems. To take a somewhat Bayesian line of argument, this is natural, because refusing to state your assumptions doesn't actually exempt you from having assumptions, and explicitly modeling the world allows for data to be used in a more efficient way: we separately optimize the world model based on the data, and then optimize the policy based on the world model.