This post is part 2 of a series.
In a previous post, I partially outlined calibrationism, the view that judgments of probability are trustworthy only if they are produced by ways of thinking that are “well calibrated”. But calibrationism also says that more than calibration is necessary for trustworthy judgments.
The Second Ingredient of Trustworthiness: Inclusivity
Another important ingredient is inclusivity: that is, the extent to which we consider all the relevant evidence. After all, calibration isn’t everything we care about, since someone could be perfectly well calibrated merely by assigning 50 percent probabilities to a series of “yes/no” questions. Additionally, some evidence suggests that people who consider more of the relevant evidence form more accurate judgments than those who consider less, and it’s obvious how this can improve accuracy when, for example, including DNA evidence exonerates defendants who would otherwise be wrongly convicted.
What we also care about, then, is whether judgments of probability are informative in the sense that they tell us whether something is true or not in a particular case. This, in turn, is largely a matter of including evidence. For example, one could be well calibrated by assigning 50 percent probabilities to those “yes/no” questions, but one would likely be omitting relevant evidence and saying nothing particularly informative about any given case.
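To make that concrete, here is a toy sketch in Python (my own illustration, with made-up outcomes rather than anything from real forecasters): one forecaster answers 50 percent to every question, another tracks the evidence, and a simple binning check shows that both are calibrated while only the second says anything about individual cases.

```python
# Toy illustration (made-up outcomes): a forecaster who says 50% to every
# yes/no question can be perfectly calibrated yet uninformative, while a
# forecaster who uses the evidence is calibrated *and* discriminating.
from collections import defaultdict

def calibration_table(forecasts, outcomes):
    """Bin forecasts (to the nearest 10%) and compare each bin's stated
    probability with the observed frequency of 'yes' outcomes in that bin."""
    bins = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        bins[round(p, 1)].append(y)
    return {p: sum(ys) / len(ys) for p, ys in sorted(bins.items())}

outcomes = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]          # what actually happened
constant = [0.5] * 10                               # says 50% every time
informed = [0.9 if y else 0.1 for y in outcomes]    # tracks the evidence

print(calibration_table(constant, outcomes))  # {0.5: 0.5} -- calibrated, but silent on each case
print(calibration_table(informed, outcomes))  # {0.1: 0.0, 0.9: 1.0} -- calibrated and informative
```

Both forecasters pass the calibration check; only the second discriminates between cases, and that discrimination is exactly what attending to the relevant evidence buys us.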
Calibrationism then says that, for judgments to be trustworthy, they must also include all the evidence that we regard as relevant.
Getting Practical: How to Implement Calibrationism
So that’s calibrationism in a rough nutshell: Our judgments are trustworthy to the extent we have evidence that (a) they are produced in ways that are well calibrated and (b) they are inclusive of all the relevant evidence. What, then, are the implications of this? Four come to mind:
- We should measure calibration to determine which judgments are trustworthy;
- We should trust individuals who have evidence of their calibration;
- But only if they are inclusive of all the evidence;
- And where necessary, we should aim to improve our calibration.
Practically, I provide more ideas about how to do these things elsewhere. For example, one can measure calibration by plugging some judgments into a spreadsheet template, as I discuss here. Calibration can also be improved with the recommendations I discuss here. Lastly, if we want to assess the inclusivity and trustworthiness of someone’s thinking, we can list the evidence we think is relevant, ask them about their reactions to each item, and, if their responses seem to reflect calibrated engagement with all of that evidence, trust their judgments.
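For anyone who prefers code to a spreadsheet, here is a minimal sketch of the same measurement idea (my own illustration, not the linked template; the file name and column names are made up): record each judgment with the probability you gave and whether it came true, then compare, within each probability band, what you said with what actually happened.

```python
import csv
from collections import defaultdict

# Hypothetical file of recorded judgments (illustrative format), e.g.:
# statement,probability,came_true
# "It will rain on Saturday",0.7,1
# "The defendant reoffends within a year",0.6,0
JUDGMENTS_FILE = "judgments.csv"

def calibration_report(path):
    """Group judgments by forecast (rounded to the nearest 10%) and report how
    often each group came true -- the numbers behind a calibration graph."""
    bands = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            band = round(float(row["probability"]), 1)
            bands[band].append(int(row["came_true"]))
    for band, results in sorted(bands.items()):
        hit_rate = sum(results) / len(results)
        print(f"Said ~{band:.0%}: came true {hit_rate:.0%} of the time "
              f"({len(results)} judgments)")

calibration_report(JUDGMENTS_FILE)
```

If the stated probabilities and the observed frequencies roughly match across bands, that is evidence of calibration; large gaps show where recalibration is most needed.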
Put simply, when deciding which judgments to trust, we might want to see more calibration graphs and evidence checklists to assess calibration and inclusivity, at least when the stakes are high. This might help make a world with fewer fatal misdiagnoses, false criminal convictions, and other costly inaccuracies that compromise the functioning and well-being of our societies.