Bryan Caplan writes:
Years ago, I told Tyler Cowen, "It's surprising that IQ tests predict life outcomes so well, because there's usually no financial incentive to get a high score." He replied, "People try out of pride - an under-rated motive." So when Tyler blogged Duckworth et al., "Role of Test Motivation in Intelligence Testing," I naturally took notice. Key claims:
1. Material incentives boost IQ scores: ... "The authors reasonably infer that IQ is more of a composite intelligence/motivation measure than usually believed - especially by inter-disciplinary researchers."
As far as I can tell, the authors do nothing to show that their results make IQ less predictive. They don't even show that IQ is more mutable than earlier studies find; boosting incentives boosts scores while the incentives remain in place, but there's no reason to think the boost lasts after the test-takers receive their pay. All the researchers require us to reconsider is the reason why IQ is so predictive and hard to durably improve.
I made Duckworth's point in my 2007 FAQ on IQ:
Q. So, you're saying that IQ testing can tell us more about group differences than about individual differences?
A. If the sample sizes are big enough and all else is equal, a higher IQ group will virtually always outperform a lower IQ group on any behavioral metric....
Of course, everything else is seldom equal. A more conscientious group may well outperform a higher IQ group. On the other hand, conscientiousness, like many virtues, is positively correlated with IQ, so IQ tests work surprisingly well.
Q. Wait a minute, does that mean that maybe some of the predictive power of IQ comes not from intelligence itself, but from virtues associated with it like conscientiousness?
A. Most likely. But perhaps smarter people are more conscientious because they are more likely to foresee the bad consequences of slacking off. It's an interesting philosophical question, but, in a practical sense, so what? We have a test that can predict behavior. That's useful.
Keep in mind that the notorious average group gaps in cognitive test scores show up not only on low-stakes tests, but also on high-stakes tests where the test takers are highly motivated: the SAT, ACT, LSAT, MCAT, GMAT, GRE, the military's AFQT enlistment test, NYC firefighter hiring tests, New Haven fire department promotion tests, Chicago cop tests, the NFL's Wonderlic IQ test, insurance agent licensing tests, and so on ad infinitum.
I can think of only one example where different levels of group motivation had a sizable effect: the military's AFQT enlistment test was renormed in 1980 on the National Longitudinal Survey of Youth sample of about 12,000 young people, most of whom weren't trying to enlist. The test was 105 pages long. It was found years later that the anomalously large white-black gap on this renorming (18.6 IQ points rather than the usual 15 or 16) was caused by blacks being more likely to give up from discouragement partway through this long and hard test and leave the later questions unanswered or just "bubbled in." (Keep in mind that this was a low-stakes test for the participants, who were just taking part in a social science project, not trying to enlist.)
In 1997, the AFQT was renormed using computer-adaptive testing, in which wrong answers lead to easier questions and thus less discouragement. The white-black gap was only 14.7 points.
This finding is worth keeping in mind when evaluating school performance test scores, which are usually low-stakes tests for the students.
Some of the difference in performance among schools on achievement tests therefore depends upon how well the principal and teachers manage to motivate students to keep working until the end of the test.
So, a lot of reports of miracle schools that seem to fizzle out after a while have to do with higher scores ginned up just by getting students not to bubble in.
On the other hand, I'd rather send my kid to a school where the management has enough on the ball to figure out how to look better and is persuasive enough to motivate students to work for an extra 20 minutes than to a school where management doesn't and isn't. And a school that manages to motivate students on their state tests is likely to attract the children of more motivated and smarter parents in the future.
So, once again, the question of intelligence v. motivation turns out to be more philosophical than predictive.
One thing to keep in mind is that in experimental situations involving low-stakes tests, if the experimenters _want_ one group of test takers to be unmotivated, it's easy to demotivate them. The test administrator can convey that a lackadaisical attitude is okay just through word choice, tone of voice, body language, and so forth.
I suspect this is a major feature of the popular stereotype threat experiments where low-stakes tests are given to blacks. In the test group, blacks are told that they are expected to score low on the following test; in the control group, they aren't. Not surprisingly, on these tests that are meaningless to the test takers, the first group is more likely to pick up on the experimenters' hopes that they will work less hard, and they do work less hard.
I've never seen stereotype threat confirmed experimentally on high-stakes tests. I can't see how such an experiment would pass an ethical review board.
You'll note that stereotype threat experiments aren't about getting blacks to perform better on tests but about getting them to perform worse. Big difference.