I've been talking recently about robust statistics and the consequences of replacing means with medians. However, I've only looked at this in a fairly limited way, asking about one particular distribution (the bell curve). Mean values are everywhere in statistics, perhaps to a greater degree than you realize, because we often refer to the mean value as the "expected value". It's simply another name for the same thing, but that is easy to forget when we are taking expectations everywhere.
In some sense, the "expectation" seems to be a more basic concept than the "mean". We could think of the mean as simply one way of formalizing the intuitive notion of expected value. What happens if we choose a different formalization? What if we choose the median?
The post on altering the bell curve is (more or less) an exploration of what happens to some of classical statistics if we do this. What happens to Bayesian theory?
The foundations of Bayesian statistics are really not touched at all by this. A Bayesian does not rely as heavily on "statistics" in the way a frequentist statistician does. A statistic is a number derived from a dataset which gives some sort of partial summary. We can look at mean, variance, and higher moments; correlations; and so on. We distinguish between the sample statistic (the number derived from the data at hand) and the population statistic (the "true" statistic which we could compute if we had all the examples, ever, of the phenomenon we are looking at). We want to estimate the population statistics, so we talk about estimators; these are numbers derived from the data which are supposed to be similar to the true values. Unbiased estimators are an important concept: ways of estimating population statistics whose expected values are exactly the population statistics.
These concepts are not exactly discarded by Bayesians, since they may be useful approximations. However, to a Bayesian, a distribution is a more central object. A statistic may be a misleading partial summary. The mean (or the mode, or the median) tells us very little when a distribution is multimodal. Correlation does not imply... much of anything (because it only measures linear dependence!). Bayesian statistics still has distribution parameters, which are directly related to population statistics, but frequentist "estimators" are not fundamental because they only provide point estimates. Fundamentally, it makes more sense to keep a distribution over the possibilities, assigning some probability to each option.
However, there is one area of Bayesian thought where expected value makes a great deal of difference: Bayesian utility theory. The basic law of utility theory is that we choose actions so as to maximize expected value. Changing the definition of "expected" would change everything! The current idea is that in order to judge between different actions (or plans, policies, designs, et cetera) we look at the average utility achieved with each option, according to our probability distribution over the possible results. What if we computed the median utility rather than the average? Let's call this "robust utility theory".
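To make the contrast concrete, here is a minimal sketch in Python. The two actions and their utilities are made-up numbers for illustration, and the function names are my own. The median here uses a "lower median" tie-breaking convention, which is an assumption rather than the only reasonable choice.

```python
def expected_utility(outcomes):
    """Probability-weighted average over (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


def median_utility(outcomes):
    """Lower weighted median: the smallest utility at which the
    cumulative probability reaches one half."""
    cumulative = 0.0
    for p, u in sorted(outcomes, key=lambda pu: pu[1]):
        cumulative += p
        if cumulative >= 0.5:
            return u
    raise ValueError("probabilities must sum to at least 0.5")


# Hypothetical actions: a sure thing versus a long shot.
actions = {
    "safe": [(1.0, 10.0)],
    "gamble": [(0.9, 0.0), (0.1, 200.0)],
}

best_by_mean = max(actions, key=lambda a: expected_utility(actions[a]))
best_by_median = max(actions, key=lambda a: median_utility(actions[a]))

print("average utility picks:", best_by_mean)
print("median utility picks: ", best_by_median)
```

With these numbers the two rules disagree: the long shot has the higher average, but its middle outcome is worse than the sure thing.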
From the usual perspective, robust utility would perform worse: to the extent that we take different actions, we would get a lower average utility. This begs the question of whether we care about average utility or median utility, though. If we are happy to maximize median utility, then we can similarly say that the average-utility maximizers are performing poorly by our standards.
At first, it might not be obvious that the median is well-defined for this purpose. The median of a probability distribution can be thought of as the median in the limit of infinitely many independent samples from that distribution, though: each case contributes instances in proportion to its probability. Equivalently, it is a value with at least half of the probability mass at or below it and at least half at or above it. What we end up doing is lining up all the possible consequences of our choice in order of utility, with a "width" determined by the probability of each, and taking the utility value of whatever consequence ends up in the middle. So long as we are willing to break ties somehow (as is usually needed with the median), it is actually well-defined more often than the mean! We avoid problems with infinite expected value. (Suppose I charge you to play a game where I start with a $1 pot, and start flipping a coin. I triple the pot every time I get heads. Tails ends the game, and I give you the pot. Money is all you care about. How much should you be willing to pay to play?)
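The coin-flipping game can be worked through numerically. The sketch below assumes a fair coin, so a game with k heads followed by a tail has probability (1/2)^(k+1) and pays 3^k dollars. The function names are my own, and the cap on the number of heads is only there so the sums are finite. Ties are broken by taking the lower median, which is one convention among several.

```python
from fractions import Fraction


def game_outcomes(max_heads):
    """(probability, payout) pairs for games with at most max_heads heads."""
    return [(Fraction(1, 2 ** (k + 1)), 3 ** k) for k in range(max_heads + 1)]


def truncated_mean(max_heads):
    """Expected payout counting only games with at most max_heads heads."""
    return sum(p * payout for p, payout in game_outcomes(max_heads))


def lower_median(outcomes):
    """Smallest payout at which cumulative probability reaches one half."""
    cumulative = Fraction(0)
    for p, payout in sorted(outcomes, key=lambda pair: pair[1]):
        cumulative += p
        if cumulative >= Fraction(1, 2):
            return payout
    raise ValueError("probabilities must sum to at least 1/2")


# The truncated mean keeps growing as we allow longer runs of heads...
for n in (1, 10, 20, 40):
    print(n, float(truncated_mean(n)))

# ...while the median does not depend on the cap at all.
print("median payout:", lower_median(game_outcomes(40)))
```

Each extra allowed head adds a term that is 3/2 times the previous one, so the average has no finite limit. The median stays put because half of all games end on the very first flip. Strictly, any value between $1 and $3 is a median here, which is exactly the kind of tie mentioned above.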
Since the median is more robust than the mean, we also avoid problems dealing with small-probability but disproportionately high-utility events. The typical example is Pascal's Mugging. Pascal walks up to you and says that if you don't give him your wallet, God will torture you forever in hell. Before you object, he says: "Wait, wait. I know what you are thinking. My story doesn't sound very plausible. But I've just invented probability theory, and let me tell you something! You have to evaluate the expected value of an action by considering the average payoff. You multiply the probability of each case by its utility. If I'm right, then you could have an infinitely large negative payoff by ignoring me. That means that no matter how small the probability of my story, so long as it is above zero, you should give me your wallet just in case!"
A Robust Utility Theorist avoids this conclusion, because a small-probability event has a correspondingly small effect on the end result, no matter how extreme a utility we assign: it can shift the position of the middle by at most its own probability.
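Here is a rough sketch of Pascal's offer under both rules. The probability of his story being true and the cost of losing the wallet are invented numbers, and I am taking his word that handing over the wallet avoids hell for certain. The median again breaks ties downward, which is my own choice of convention.

```python
import math


def expected_utility(outcomes):
    """Probability-weighted average over (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


def median_utility(outcomes):
    """Lower weighted median of (probability, utility) pairs."""
    cumulative = 0.0
    for p, u in sorted(outcomes, key=lambda pu: pu[1]):
        cumulative += p
        if cumulative >= 0.5:
            return u
    raise ValueError("probabilities must sum to at least 0.5")


# Assumed numbers, for illustration only.
p_story_true = 1e-12   # how plausible we find Pascal's story
wallet_cost = -100.0   # utility of losing the wallet

choices = {
    "keep_wallet": [(p_story_true, -math.inf), (1 - p_story_true, 0.0)],
    "hand_over": [(1.0, wallet_cost)],
}

for name, outcomes in choices.items():
    print(name, expected_utility(outcomes), median_utility(outcomes))

mean_choice = max(choices, key=lambda c: expected_utility(choices[c]))
median_choice = max(choices, key=lambda c: median_utility(choices[c]))
print("average utility says:", mean_choice)
print("median utility says: ", median_choice)
```

The infinite penalty drags the average for keeping the wallet down to negative infinity however small the probability is, whereas the median only notices which outcome sits in the middle of the probability mass.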
Now, a lot of nice results (such as the representation theorem) have been derived for average utilities over the years. Naturally, taking a median utility might do all kinds of violence to these basic ideas in utility theory. I'm not sure how it would all play out. It's interesting to think about, though.