First, I dove into Toward a Synthesis of Cognitive Biases, by Martin Hilbert: a work which claims to explain how eight different biases observed in the literature are an inevitable result of noise in the information-processing channels in the brain.

The paper starts out with what it calls the

*conservatism*bias. (The author complains that the literature is inconsistent about naming biases, both giving one bias multiple names and using one name for multiple biases. Conservatism is what is used for this paper, but this may not be standard terminology. What's important is the mathematical idea.)

The idea behind conservatism is that when shown evidence, people tend to update their probabilities more conservatively than would be predicted by probability theory. It's as if they didn't observe all the evidence, or aren't taking the evidence fully into account. A well-known study showed that subjects were overly conservative in assigning probabilities to gender based on height; an earlier study had found that the problem is more extreme when subjects are asked to aggregate information, guessing the gender of a random selection of same-sex individuals from height. Many studies were done to confirm this bias. A large body of evidence accumulated which indicated that subjects irrationally avoided extreme probabilities, preferring to report middling values.

The author construed conservatism very broadly. Another example given was: if you quickly flash a set of points on a screen and ask subjects to estimate their number, then subjects will tend to over-estimate the number of a small set of points, and under-estimate the number of a large set of points.

The hypothesis put forward in

*Toward a Synthesis*is that conservatism is a result of random error in the information-processing channels which take in evidence. If all red blocks are heavy and all blue blocks are light, but you occasionally mix up red and blue, you will conclude that

*most*red blocks are heavy and

*most*blue blocks are light. If you are trying to integrate some quantity of information, but some of it is mis-remembered, small probabilities will become larger and large will become smaller.

One thing that bothered me about this paper was that it did not directly contrast processing-error conservitism with the

*rational*conservatism which can result from quantifying uncertainty. My estimate of the number of points on a screen

*should*tend toward the mean if I only saw them briefly; this bias will increase my overall accuracy rate. It seems that previous studies established that people were over-conservative compared to the rational amount, but I didn't take the time to dig up those analyses.

All eight biases explained in

*Toward a Synthesis*were effectively consequences of conservatism in different ways.

Two rare events X and Y which are independent appear correlated as a result of their probabilities being inflated by conservatism bias. I found this to be the most interesting application. The standard example of illusory correlation is stereotyping of minority groups. The race is X, and some rare trait is Y. What was found was that stereotyping could be induced in subjects by showing them artificial data in which the traits were entirely independent of the races. Y could be either a positive or a negative trait; illusory correlation occurs either way. The effect that conservatism has on the judgements will depend on how you ask the subject about the data, which is interesting, but illusory correlation emerges regardless. Essentially, because all the frequencies are smaller within the minority group, the conservatism bias operates more strongly; the trait Y is inflated so much that it's seen as being about 50-50 in that group, whereas the judgement about its frequency in the majority group is much more realistic.*Illusory correlation:**Self-Other Placement:*People with low skill tend to overestimate their abilities, and people with high skill tend to underestimate theirs; this is known as the Dunning-Kruger effect. This is a straightforward case of conservatism. Self-other placement refers to the further effect that people tend to be even*more*conservative about estimating*other*people's abilities, which paradoxically means that people of high ability tend to over-estimate the probability that they are better than a specific other person, despite the Dunning-Kruger effect; ans similarly, people of low ability tend to over-estimate the probability that they are worse as compared with specific individuals, despite over-estimating their ability overall. The article explains this as a result of having less information about others, and hence, being more conservative.*(I'm not sure how this fits with the previously-mentioned result that people get more conservative as they have more evidence.)**Sub-Additivity:*This bias is a class of inconsistent probability judgements. The estimated probability of an event will be higher if we ask for the probability of a set of sub-events, rather than merely asking for the overall probability. From Wikipedia:*For instance, subjects in one experiment judged the probability of death from cancer in the United States was 18%, the probability from heart attack was 22%, and the probability of death from "other natural causes" was 33%. Other participants judged the probability of death from a natural cause was 58%. Natural causes are made up of precisely cancer, heart attack, and "other natural causes," however, the sum of the latter three probabilities was 73%, and not 58%. According to Tversky**and Koehler (1994) this kind of result is observed consistently*. The bias is explained with conservativism again. The smaller probabilities are inflated more by the conservatism bias than the larger probability is, which makes their sum much more inflated than the original event.*Hard-Easy Bias:*People tend to overestimate the difficulty of easy tasks, and underestimate the difficulty of hard ones. This is straightforward conservatism, although the paper framed it in a somewhat more complex model (it was the 8th bias covered in the paper, but I'm putting it out of order in this blog post).

That's 5 biases down, and 3 to go. The article has explained conservatism as a mistake made by a noisy information-processor, and explains 4 other biases as consequences of conservatism. So far so good.

Here's where things start to get... weird.

## Simultaneous Overestimation and Underestimation

Bias 5 is termed

Similarly, we can turn the relationship around. The conservatism bias was based on looking at P(estimate|evidence). We can turn it around with Bayes' Law, to examine P(evidence|estimate). If there is noise in one direction, there is noise in the other direction. This has a surprising implication: the

If you're like me, at this point you're saying

The section refers to a paper about this, so before moving further I took a look at that reference. The paper is

What is going on? How can both of these be true?

One possible answer is that the experimental conditions are different.

The Erev paper points out that revision-of-opinion experiments used

Both techniques compared the objective probability, OP, with the subject's reported probability, SP. OP is the empirical frequency, while SP is whatever the subject writes down to represent their degree of belief. However, revision-of-opinion studies started with a desired OP for each situation and calculated the

This is exactly what the

Does this mean that the two biases are mere statistical artifacts, and humans are actually fairly good information systems whose beliefs are neither under- nor over- confident? No, not really. The statistical phenomena are real: humans

Both papers propose a model which accounts for overconfidence as the result of noise during the creation of an estimate, although they are put in different terms. The next section of

Both models

This is made worse by the fact that human memories and judgement are actually fallible, not perfect, and subject to the same effects. Information is subject to bias-inducing-noise at each step of the way, from first observation, through interpretation and storage in the brain, modification by various reasoning processes, and final transmission to other humans. In fact, most information we consume is subject to distortion before we even touch it (as I discussed in my previous post). I was a bit disappointed when the

Overall, I find

Both papers come off as quite critical of the state of the research, and I walk away from these with a bitter taste in my mouth:

A lot of analysis is still needed to clear up the issues raised by these two papers. One problem which strikes me is the use of averaging to aggregate data, which has to do with the statistical phenomenon of simultaneous over- and under- confidence. Averaging isn't really the right thing to do to a set of probabilities to see whether it has a tendency to be over or under a mark. What we really want to know, I take it, is whether there is some

Finally, to keep the reader from thinking that this is the only theory trying to account for a broad range of biases: go read this paper too! It's good, I promise.

*exaggerated expectation*in the paper. This is a relatively short section which reviews a bias dual to conservatism. Conservatism looks at the statistical relationship*from*the evidence,*to*the estimate formed in the brain. If there is noise in the information channel connecting the two, then conservatism is a statistical near-certainty.Similarly, we can turn the relationship around. The conservatism bias was based on looking at P(estimate|evidence). We can turn it around with Bayes' Law, to examine P(evidence|estimate). If there is noise in one direction, there is noise in the other direction. This has a surprising implication: the

*evidence*will be conservative with respect to the*estimate*, by essentially the same argument which says that the estimate will tend to be conservative with respect to the evidence. This implies that (under statistical assumptions spelled out in the paper), our estimates will tend to be more extreme than the data. This is the exaggerated expectation effect.If you're like me, at this point you're saying

*what???**The*

*whole idea*of conservatism was that the estimates tend to be*less*extreme than the data! Now "by the same argument" we are concluding the opposite?The section refers to a paper about this, so before moving further I took a look at that reference. The paper is

*Simultaneous Over- and Under- Confidnece: the Role of Error in Judgement Process*by Erev et. al. It's a very good paper, and I recommend taking a look at it.*Simultaneous Over- and Under- Estimation*reviews two separate strains of literature in psychology. A large body of studies in the 1960s found systematic and reliable*underestimation*of probabilities. This*revision-of-opinion*literature concluded that it was difficult to take the full evidence into account to change your beliefs. Later, many studies on*calibration*found systematic*overestimation*of probabilities: when subjects are asked to give probabilities for their beliefs, the probabilities are typically higher than their frequency of being correct.What is going on? How can both of these be true?

One possible answer is that the experimental conditions are different.

*Revision-of-opinion*tests give a subject evidence, and then test how well the subject has integrated the evidence to form a belief.*Calibration*tests are more like trivia sessions; the subject is asked an array of questions, and assigns a probability to each answer they give. Perhaps humans are*stubborn but boastful*: slow to revise their beliefs, but quick to over-estimate the accuracy of those beliefs. Perhaps this is true. It's difficult to test this against the data, though, because we can't always distinguish between calibration tests and revision-of-opinion tests. All question-answering involves drawing on world knowledge combined with specific knowledge given in the question to arrive at an answer. In any case, a much more fundamental answer is available.The Erev paper points out that revision-of-opinion experiments used

*different data analysis.*Erev re-analysed the data for studies on both sides, and found that the statistical techniques used by revision-of-opinion researchers found underconfidence, while the techniques of calibration researchers found overconfidence,*in the same data-set!*Both techniques compared the objective probability, OP, with the subject's reported probability, SP. OP is the empirical frequency, while SP is whatever the subject writes down to represent their degree of belief. However, revision-of-opinion studies started with a desired OP for each situation and calculated the

*average SP for a given OP.*Calibration literature instead starts with the numbers written down by the subjects, and then asks how often they were correct; so, they're computing the*average OP for a given SP.**When we look at data and try to find functions from X to Y like that, we're creating statistical estimators. A very general principle is that*

*estimators tend to be regressive*: my Y estimate will tend to be closer to the Y average than the actual Y. Now, in the first case, scientists were using X=OP and Y=SP; lo and behold, they found it to be regressive. In later decades, they took X=SP and Y=OP, and found*that*to be regressive! From a statistical perspective, this is plain and ordinary business as usual. The problem is that one case was termed under-confidence and the other over-confidence, and they appeared from those names to be contrary to one another.This is exactly what the

*Toward a Synthesis*paper was trying to get across with the reversed channel, P(estimate|evidence) vs P(evidence|estimate).Does this mean that the two biases are mere statistical artifacts, and humans are actually fairly good information systems whose beliefs are neither under- nor over- confident? No, not really. The statistical phenomena are real: humans

*are*both under- and over-confident in these situations. What*Toward a Synthesis*and*Simultaneous Over- and Under- Confidence*are trying to say is that these are not mutually inconsistent, and can be accounted for by noise in the information-processing system of the brain.Both papers propose a model which accounts for overconfidence as the result of noise during the creation of an estimate, although they are put in different terms. The next section of

*Toward a Synthesis*is about overconfidence bias specifically (which it sees as a special case of exaggerated expectations, as I understand them; the 7th bias to be examined in the paper, for those keeping count). The model shows that even with accurate memories (and therefore the theoretical ability to reconstruct accurate frequencies), an overconfidence bias should be observed (under statistical conditions outlined in the paper). Similarly,*Simultaneous Over-and Under- confidence*constructs a model in which people have perfectly accurate probabilities*in their heads*, and the noise occurs when they put pen to paper: their explicit reflection on their belief adds noise which results in an observed overconfidence.Both models

*also*imply underconfidence. This means that in situations where you expect perfectly rational agents to reach 80% confidence in a belief, you'd expect rational agents with noisy*reporting*of the sort postulated to give estimates averaging lower (say, 75%). This is the apparent underconfidence. On the other hand, if you are*ignorant*of the empirical frequency and one of these agents*tells you*that it is 80%, then it is*you*who is best advised to revise the number down to 75%.This is made worse by the fact that human memories and judgement are actually fallible, not perfect, and subject to the same effects. Information is subject to bias-inducing-noise at each step of the way, from first observation, through interpretation and storage in the brain, modification by various reasoning processes, and final transmission to other humans. In fact, most information we consume is subject to distortion before we even touch it (as I discussed in my previous post). I was a bit disappointed when the

*Toward a Synthesis*paper dismissed the relevance of this, stating flatly "false input does not make us irrational".Overall, I find

*Toward a Synthesis of Cognitive Biases*a frustrating read and recommend the shorter, clearer*Simultaneous Over- and Under- Confidence*as a way to get most of the good ideas with less of the questionable ones. However, that's for people who already read this blog post and so have the general idea that these effects can actually explain a*lot*of biases. By itself,*Simultaneous Over- and Under- Confidence*is one step away from dismissing these effects as mere statistical artifacts. I was left with the impression that Erev doesn't even fully dismiss the model where our internal probabilities are perfectly calibrated and it's only the error in conscious reporting that's causing over- and under- estimation to be observed.Both papers come off as quite critical of the state of the research, and I walk away from these with a bitter taste in my mouth:

*is this the best we've got?*The extend of the statistical confusion observed by Erev is saddening, and although it was*cited*in*Toward a Synthesis*, I didn't get the feeling that it was sharply understood (another reason I recommend the Erev paper instead).*Toward a Synthesis*also discusses a lot of confusion about the names and definitions of biases as used by different researchers,which is not quite as problematic, but also causes trouble.A lot of analysis is still needed to clear up the issues raised by these two papers. One problem which strikes me is the use of averaging to aggregate data, which has to do with the statistical phenomenon of simultaneous over- and under- confidence. Averaging isn't really the right thing to do to a set of probabilities to see whether it has a tendency to be over or under a mark. What we really want to know, I take it, is whether there is some

*adjustment*which we can do after-the-fact to systematically improve estimates. Averaging tells us whether we can improve a square-loss comparison, but that's not the notion of error we are interested in; it seems better to use a proper scoring rule.Finally, to keep the reader from thinking that this is the only theory trying to account for a broad range of biases: go read this paper too! It's good, I promise.

## No comments:

## Post a Comment