Bias Or Bayesianism? Criminologists Don't Want To Believe Stereotypes Are True

Having watched some true crime shows recently, I’m reminded that murders tend to be harder than most other crimes to solve, precisely because the best witness, the victim, is dead. It seems like it would be pretty easy to get worked up over one theory of a murder and try to railroad a suspect.

Also, people who go into criminal justice don’t tend to be brilliant. I suspect that, say, the National Transportation Safety Board investigators and consultants tend to be smarter in raw IQ terms than their law enforcement counterparts.

Moreover, a large fraction of homicides are never “cleared,” often because surviving witnesses are terrified of the murderers or are criminals themselves or generally hate the police.

From Science:


Itiel Dror is determined to reveal the role of bias in forensics, even if it sparks outrage
12 MAY 2022


A version of this story appeared in Science, Vol 376, Issue 6594.

In February 2021, cognitive psychologist Itiel Dror set off a firestorm in the forensics community. In a paper, he suggested forensic pathologists were more likely to pronounce a child’s death a murder versus an accident if the victim was Black and brought to the hospital by the mother’s boyfriend than if they were white and brought in by the grandmother. It was the latest of Dror’s many experiments suggesting forensic scientists are subconsciously influenced by cognitive biases—biases that can put innocent people in jail.

Or maybe they are Bayesians?

Dror, a researcher at University College London (UCL), has spent decades using real-world cases and data to show how experts in fields as diverse as hospital care and aviation can reverse themselves when presented with the same evidence in different contexts. But his most public work has involved forensic science, a field reckoning with a history of unscientific methods. In 2009, the National Research Council published a groundbreaking report concluding that most forensic sciences—including the analysis of bullets, hair, bite marks, and even fingerprints—are based more on tradition than on quantifiable science. Since then, hundreds of studies and legal cases have revealed flaws in forensic sciences.

Dror’s work forms a connective tissue among them. He has shown that most problems with forensics do not originate with “bad apple” technicians who have infiltrated crime labs. Rather they come from the same kind of subconscious bias that affects everyone’s daily decisions—the shortcuts and generalizations our brains rely on to process reality. “We don’t actually see the environment,” Dror says. “We perceive stimuli from the environment that our brain represents to us,” shaped by feelings and past experience.

… Dror now travels the world testifying in trials, taking part in commissions, and offering training to police departments, forensic laboratories, judges, militaries, corporations, government agencies, and hospitals. National agencies, forensic labs, and police forces have adopted his approach to shielding experts from information that could bias them.

… Dror’s previous studies on bias in forensics caused grumbling, but nothing like the reaction to the 2021 paper. This time, he used a survey to see whether bias could affect decision-making among medical examiners. He concluded that nonmedical evidence such as the race of the decedent or their relation to the caregiver—details that most medical examiners routinely consider—were actually a source of bias.

Eighty-five of the country’s most prominent pathologists demanded its retraction. The National Association of Medical Examiners (NAME) alleged ethical misconduct and demanded that Dror’s employer, UCL, stop his research. The editor of the Journal of Forensic Sciences wrote that he hadn’t seen so many arguments in the journal’s 65-year history, or so much anger. After decades challenging forensic experts, Dror had gotten into a fight that threatened his career.

… He speaks with a mixture of accents and intonations from his upbringing in Israel, his graduate work in the United States, and his professional life in the United Kingdom.

… They found five fingerprint experts who knew about the Mayfield case but had not seen the fingerprints. Dror and Charlton sent each expert a pair of prints from one of the expert’s own previous cases, which that expert had personally verified as “matched,” but told them the prints came from the FBI’s notorious mismatch of Mayfield’s prints with the terrorist’s.

In other words, Prof. Dror lied. Criminal examiners should be on the lookout for lies, but they probably don’t expect them from other academics.

Four of the five experts contradicted their previous decision: Three now concluded the pair was a mismatch, and one felt he needed more information. They seemed to have been influenced by the passage of time and extraneous information.

“It was so simple and elegant,” Peter Neufeld, co-founder of the Innocence Project, says of the study. “And when people in the forensic community read it, they got it.”

… Dror looked at other biasing factors in fingerprint analysis, some of which were shockingly innocuous. When police retrieve a print from a crime scene, they consult an FBI computer database containing millions of fingerprints and receive several possible matches, in order of the most likely possibilities. Dror found that experts were likely to pick “matches” near the top of the list even after he had scrambled their order, perhaps because of the subconscious tendency to overly trust computer technology. …

Dror and his colleagues are quick to point out that bias does not always equal prejudice, but it can foster injustice. Studies have shown, for example, that Black schoolchildren get punished more readily than white children for the same misbehavior, because many teachers subconsciously assume Black children will continue to misbehave. And in forensic science, bias can subconsciously influence experts to interpret data in a way that incriminates a suspect. …

Itiel Dror and his collaborators have coined various terms to describe how bias sneaks into forensic analysis—and how experts perceive and react to their biases.

TARGET-DRIVEN BIAS Subconsciously working backward from a suspect to crime scene evidence, and thus fitting the evidence to the suspect—akin to shooting an arrow at a target and drawing a bull’s-eye around where it hits

I can recall hearing a rumor about a scientist who sounded suspicious in the then unsolved 2001 anthrax case. I looked into his past and talked myself into believing he should be considered a suspect. But fortunately, before hitting “Publish” I ran through the evidence I had assembled one more time, but from a skeptical perspective. I suddenly realized I didn’t have anything.

CONFIRMATION BIAS Focusing on one suspect and highlighting the evidence that supports their guilt, while ignoring or dismissing evidence to the contrary

BIAS CASCADE When bias spills from one part of the investigation to another, such as when the same person who collects evidence from a crime scene later does the laboratory analysis and is influenced by the emotional impact of the crime scene

I suspect this is more of a problem in small towns that can’t afford as much division of labor in crime investigations.

BIAS SNOWBALL A kind of echo chamber effect in which bias gets amplified because those who become biased then bias others, and so on

BIAS BLIND SPOT The belief that although other experts are subject to bias, you certainly are not

EXPERT IMMUNITY The belief that being an expert makes a person objective and unaffected by bias

ILLUSION OF CONTROL The belief that when an expert is aware of bias, they can overcome it by a sheer act of will

BAD APPLES The belief that bias is a matter of incompetence or bad character

TECHNOLOGICAL PROTECTION The belief that the use of technology, such as computerized fingerprint matching or artificial intelligence, guards against bias

Dror says the best approach to fighting bias is to shield experts from extraneous information, similar to the “blinding” in scientific experiments. He calls the process Linear Sequential Unmasking, in which the analyst only sees the evidence that’s directly relevant to their task. Some authorities have endorsed the approach. The United Kingdom’s Forensic Science Regulator recommends it as “the most powerful means of safeguarding against the introduction of contextual bias.” FBI adopted the process following the Mayfield case: Because humans tend to see similarities between objects viewed side by side, agents now document the features of a crime scene fingerprint on its own before comparing it to a suspect’s prints.

After consultation with Dror, police in the Netherlands began to blind fingerprint examiners to details of a crime investigation that might influence their analysis, such as the condition of the body or the urgency of the case, says John Riemen, the police force’s lead biometrics specialist. The approach ensures “you’re looking at fingerprints, and not at your biases,” he says.

Sounds reasonable. But it also sounds like it would slow down investigations, especially in less densely populated countries than the Netherlands.

This is like the now common recommendation for researchers that they publicly register their hypotheses before collecting their data so that they don’t p-hack their way to statistical significance. There is much to be said for that, but I usually don’t do it, in large part because I usually come up with much better hypotheses after looking at the data. Why? Because knowledge is good.
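The p-hacking worry has simple arithmetic behind it: test enough post-hoc hypotheses at the conventional p < 0.05 threshold and a spurious “significant” result becomes likely. A minimal sketch of that familywise error rate (assuming independent tests, which real data rarely are):

```python
# Familywise error rate: probability of at least one false positive
# when running n independent hypothesis tests at significance alpha.
def familywise_error(n, alpha=0.05):
    return 1 - (1 - alpha) ** n

for n in (1, 5, 20):
    print(n, round(familywise_error(n), 3))
# With 20 independent tries, a spurious "significant" finding
# is more likely than not.
```

Pre-registration blocks this by forcing the researcher to commit to one hypothesis before the fishing expedition can start.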

Similarly, when it comes to murder investigations, a major question is whether America has a bigger problem with convicting innocent men of murder or with failing to bring murderers to justice. It’s widely assumed that, because so many more black men than white men are convicted of murder, lots of innocent black men must be getting wrongly convicted of crimes committed by white men.

But the L.A. Times murder reporter Jill Leovy memorably argued that the bigger problem is the criminal justice system failing to arrest anybody for a very large fraction of murders.

IT WAS AN ATTEMPT to win medical examiners over to this approach that landed Dror in hot water. In 2019, he got a message from Daniel Atherton, a pathologist at the University of Alabama, Birmingham, who wanted him to look at some data he had collected. Atherton had sent a survey to 713 pathologists across the country positing one of two scenarios in which a toddler with a skull fracture and brain hemorrhage was brought to an emergency room and died shortly thereafter. In one scenario, the child was white and was brought in by the grandmother. In the other, the child was Black and brought in by the mother’s boyfriend. The survey asked participants to decide whether the manner of death was undetermined, accidental, or homicide.

Dror analyzed the results and found that of the 133 people who answered the survey, 32 concluded the death was a homicide. And a disproportionate number of those—23—had received the scenario with the Black child and the boyfriend. Participants reading the “Black condition” were five times more likely to conclude homicide than accident, whereas participants in the “White condition” ruled accident more than twice as frequently as homicide.
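As a rough check on the reported numbers: the excerpt gives 133 respondents, 32 homicide rulings, and 23 of those in the “Black condition,” but not the per-condition denominators, so the near-even randomization split below is an assumption. (The article’s “five times” figure compares homicide to accident rulings within a condition, which requires accident counts not quoted here.)

```python
# Rough check of the survey numbers reported above: 133 respondents,
# 32 homicide rulings, 23 of those in the "Black condition." The
# per-condition denominators are not given in the excerpt, so the
# near-even split below is an assumption for illustration only.
homicide_black = 23
homicide_white = 32 - homicide_black       # 9 homicide rulings
n_black, n_white = 67, 66                  # assumed split of 133

rate_black = homicide_black / n_black      # share ruling homicide
rate_white = homicide_white / n_white
print(round(rate_black / rate_white, 2))   # between-condition ratio
```

Under that assumed split, respondents in the Black condition ruled homicide at roughly two and a half times the rate of those in the White condition.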

“Their decisions were noticeably affected by medically irrelevant contextual information,” Dror, Atherton, and their colleagues wrote in their paper, published in the Journal of Forensic Sciences.

The paper also included a survey of 10 years of Nevada death certificates showing an apparent correlation between Black deaths and findings of homicide versus accident—influenced, perhaps, by cultural biases. “I just wanted to get that information out there to begin a discussion,” Dror says of the study.

He got more of a discussion than he expected. The journal was swamped with angry letters from medical examiners. One derided the study as “rank pseudoscience.” Another, signed by the president of NAME along with 84 other pathologists, excoriated the study as “fatally flawed” and “an abject failure of the peer review process,” and demanded its retraction. (Michael Peat, editor of the journal, declined to retract the article, saying it had been peer reviewed before publication and rereviewed by a respected biostatistician following the complaints.)

Many pathologists pointed out that the experimental design linked two unrelated variables—the race of the child and their relationship to the caretaker. They were further inflamed by Dror’s labeling the scenarios “Black condition” and “White condition,” when they had reason to suspect that the caretaker, not the race, was the relevant variable. Statistics show a boyfriend of any race is far more likely to harm a child in his care than a grandmother.

Statistics, on the other hand, don’t say anything about the race of homicide perps, apparently.

“To introduce race … appears to be an effort to label the survey responders, and their colleagues by proxy, as racist,” said the letter from the 85 practitioners. “Had this survey been done with the races reversed … White cases were more likely to be called homicide and Black cases more likely to be called accident.” They contended that Dror was using inflammatory language to get headlines. And they noted that other factors could have played a role in the pathologists’ decisions, such as their level of experience, local crime statistics, and office policies, none of which Dror had considered.

Stephen Soumerai, an expert in research design at Harvard Medical School, agrees that linking a known risk factor for homicide (caregiver relationship) to a nonwhite race is problematic. And the survey of Nevada death certificates failed to investigate other possible explanations beyond race, he says. “The hypothesis is reasonable and important, but the research does not adhere to basic principles of research design,” he says.

Dror admits he would have been wise to use neutral terms to designate the two experimental groups. But he doesn’t concede that the study is flawed. “It is a first study to examine and establish that there is bias in forensic pathology,” he says. Dror agrees that statistics do show an unrelated caretaker is more likely to harm a child than a grandmother. But such generalizations should not affect how examiners diagnose individual cases.

It’s a general problem in any kind of decision-making. In the real world, a white grandmother is perhaps two to three orders of magnitude less likely to murder a child in her care than a black boyfriend is. That’s a pretty good reason not to waste many investigative resources on white grandmothers. On the other hand, it’s not proof beyond a reasonable doubt of the black boyfriend’s guilt. But where does the examiner fit into the chain of decision-making? How important is it for him not to shut off an investigation into what could be a homicide vs. not to take a step toward putting what could be an innocent man on trial?
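That trade-off is, at bottom, Bayes’ rule. A minimal sketch, with all numbers hypothetical (the priors stand in for the two-to-three-orders-of-magnitude base-rate gap, the likelihoods for an ambiguous injury pattern), shows why a rational prior can justify investigating without coming close to proof beyond a reasonable doubt:

```python
# Bayes' rule sketch of the examiner's dilemma described above: the
# prior comes from base rates (who brought the child in), and is then
# updated by the medical evidence. All numbers are hypothetical,
# chosen only to illustrate the trade-off, not drawn from any study.
def posterior(prior, p_evidence_if_homicide, p_evidence_if_accident):
    """P(homicide | evidence) via Bayes' rule."""
    num = prior * p_evidence_if_homicide
    return num / (num + (1 - prior) * p_evidence_if_accident)

# Same ambiguous injury pattern (twice as likely under homicide as
# under accident), very different priors by caretaker:
for caretaker, prior in (("grandmother", 0.001), ("boyfriend", 0.10)):
    print(caretaker, round(posterior(prior, 0.6, 0.3), 3))
```

Even with a prior a hundred times higher, the boyfriend’s posterior here is only about 18 percent: ample grounds for an investigation, nowhere near grounds for a conviction.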

… The question of bias in autopsies rocketed to the headlines after Minneapolis police officer Derek Chauvin killed George Floyd on 25 May 2020. During the trial in April 2021, the local medical examiner for Hennepin County in Minnesota testified that the manner of death was “homicide,” as did other pathologists. But an expert hired by Chauvin’s defense team, former Maryland Chief Medical Examiner David Fowler, testified that Floyd had so many underlying health challenges that the manner of death was “undetermined.”

Chauvin was found guilty, but Fowler’s testimony outraged other pathologists and physicians, who saw in his conclusions a pro-police bias.

In contrast, there couldn’t be any bias against the Great White Defendant, as proven by how the entire American Establishment didn’t suddenly throw a fit over the story and egg on 500+ Mostly Peaceful Protests and boost the murder and car fatality rates. And trying to ruin the career of an expert witness testifying for the defendant will of course keep the prosecution from framing innocent men in the future by unspecified mechanisms.

More than 400 of them signed a petition to Maryland Attorney General Brian Frosh demanding an investigation into all the death-in-police-custody cases during Fowler’s 17 years in office. Frosh recruited seven international experts to design the study, including Dror. And despite all the blowback Dror has received for trespassing in the field of forensic pathology, he agreed to participate.

“If my work results even in one person not getting wrongly convicted, or one guilty person not going free, then it’s worth all the grief I’ve been getting,” he says. “And maybe not just one person. Hopefully this is going to change the domain.”

It seems like there are trade-offs rather than just absolutes. The old American legal saying is that it is better that nine guilty men go free than that one innocent man be convicted. But there are also costs to society in general, such as more murders, from not clearing cases, as reporter Jill Leovy documented in her 2015 book Ghettoside.

