The statistical accident that founded an industry

In 1904, a British psychologist named Charles Spearman sat down to look at the grades of a group of schoolchildren in a rural school in Berkshire. He was after something dull: whether maths grades bore any relation to language grades, to music grades, to the ability to tell tones apart. What he found, however, was that all the grades correlated with each other — not perfectly, but systematically. The child who did well in arithmetic also tended to do well in French and to be better at telling whether two musical sounds were the same or different [1].

That could have had a thousand explanations — motivation, family, sleep, hunger — but Spearman put forward one that has lasted a hundred and twenty years and is still being fought over in conferences: there is a common factor, something shared by every mental task, that makes it so if you’re good at one you’re more likely to be good at the rest. He called it the g factor.
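To make the idea of a common factor concrete, here is a minimal simulation sketch. It assumes each score is a shared factor plus subject-specific noise, with loadings that are invented for illustration (they are not Spearman's estimates), and it uses the leading eigenvalue of the correlation matrix as a rough stand-in for g. That is principal components, not Spearman's own method.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical loadings: how strongly each subject depends on the shared factor.
# Invented for illustration; these are not Spearman's estimates.
LOADINGS = {"arithmetic": 0.8, "french": 0.7, "music": 0.6, "pitch": 0.5}
N_CHILDREN = 2000

def simulate_child():
    """One child's standardised scores: shared factor plus subject-specific noise."""
    g = random.gauss(0, 1)
    return [w * g + (1 - w * w) ** 0.5 * random.gauss(0, 1)
            for w in LOADINGS.values()]

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def leading_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric matrix, found by power iteration."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * mv[i] for i in range(n))  # Rayleigh quotient

columns = list(zip(*(simulate_child() for _ in range(N_CHILDREN))))
corr = [[pearson(a, b) for b in columns] for a in columns]

# Every pair of different subjects should correlate positively.
off_diagonal = [corr[i][j] for i in range(len(corr))
                for j in range(len(corr)) if i != j]
# Share of total variance on the first component (trace of corr = number of subjects).
g_share = leading_eigenvalue(corr) / len(corr)

print("smallest correlation between subjects:", round(min(off_diagonal), 2))
print("share of variance on the first component:", round(g_share, 2))
```

Under these assumptions every pair of subjects ends up positively correlated even though no subject is measured twice, which is the pattern Spearman's explanation is meant to account for.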

To this day we still don’t know exactly what it is, and yet that statistical construct predicts your academic performance, your life expectancy and your risk of a workplace accident better than just about any other psychological variable.

Fluid, crystallized, and the rest of the zoo

In 1941, Raymond Cattell, an indirect disciple of Spearman, started to suspect that g was too coarse and too general to be useful. He proposed splitting it in two: fluid intelligence (Gf), the ability to reason through new problems without leaning on anything you learned before; and crystallized intelligence (Gc), everything you’ve accumulated thanks to schooling, culture and books — vocabulary, general knowledge, verbal comprehension [2].

Curiously, his own student John Horn went on to expand the catalogue. By the sixties there were no longer two capacities but eight or nine: short-term memory, visual processing, auditory processing, processing speed, long-term memory and a few more.

And then, to make the whole thing even more complicated, John Carroll published Human Cognitive Abilities in 1993, a book in which he re-analysed 461 datasets covering sixty years of psychometric research [3]. Out came a three-level hierarchy. At the base, some eighty specific, narrow abilities — speed of naming objects, phoneme discrimination, digit memory. In the middle layer, around ten broad capacities: fluid, crystallized, memory, visual processing, auditory processing, speed, and so on. And at the very top, crowning it all, g again.

To his dying day, Horn never accepted that g was a real thing. Carroll thought the opposite. Their colleagues welded the two positions together anyway, into a model called Cattell-Horn-Carroll (CHC), synthesised by Kevin McGrew in the nineties. It’s the model that today structures nearly every serious intelligence test used in clinical and educational settings: WISC, WAIS, Woodcock-Johnson, Kaufman [4]. McGrew himself admitted in 2023 that calling it “one theory” is cheating: it’s several related theories, not always fully compatible, huddled under the same label [5].

What an IQ test actually measures

When someone tells you “I have an IQ of 130”, what they’re really saying is this: compared with people of their age in a normative sample, they scored better than roughly 98 % of them on that test. It isn’t a measure of anything absolute. It’s a ranking. The intelligence quotient is built so that, by definition, the mean is 100 and the standard deviation is 15. If tomorrow the whole of humanity got smarter, IQs would still have a mean of 100, because the tests get renormalised every fifteen or twenty years, whether average performance has drifted up or down.
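The ranking logic can be sketched in a few lines. This assumes scores follow a normal distribution with the mean of 100 and standard deviation of 15 given above, which is how the scale is constructed; the function names are illustrative.

```python
from statistics import NormalDist

# The IQ scale as defined in the text: mean 100, standard deviation 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)

def iq_to_percentile(iq):
    """Percentage of the normative sample expected to score below this IQ."""
    return 100 * IQ_SCALE.cdf(iq)

def percentile_to_iq(percentile):
    """IQ score that sits at the given percentile (strictly between 0 and 100)."""
    return IQ_SCALE.inv_cdf(percentile / 100)

for iq in (85, 100, 115, 130):
    print(f"IQ {iq}: ahead of about {iq_to_percentile(iq):.1f} % of the sample")
```

Because the scale is defined relative to the sample, the same raw performance can map to a different IQ after a renormalisation.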

In a 2007 meta-analysis of 85 longitudinal studies, Tarmo Strenze found that IQ measured in childhood or adolescence correlates with the level of education reached in adulthood at around 0.56, with occupational status at around 0.45, and with income at around 0.20 [6]. In plain terms: a good predictor of school success, a decent one of how prestigious your job ends up being, and a rather weak one of what you’re going to earn. Studies of job performance report correlations in a similar range, especially for cognitively demanding jobs.

Now it gets interesting. In 1932 the Scottish Council for Research in Education administered an intelligence test to practically every eleven-year-old in the country on a single day. Decades later, Ian Deary and Lawrence Whalley followed up on that cohort. They found that a fifteen-point disadvantage in IQ at eleven was associated with a 21 % lower chance of still being alive at seventy-six [7]. The effect remained after controlling for social class. Nobody fully knows why. The usual guesses are that higher scorers adhere better to medical treatments, understand health instructions better, have fewer traffic accidents and make more informed lifestyle decisions. But the exact mechanism is still an open question.

What an IQ test doesn’t measure, on the other hand, makes for a longer list: divergent creativity, practical wisdom, emotional regulation, motivation, perseverance, ethics, teamwork, common sense, social skill, adaptation to cultural contexts different from the one that designed the test. And still, it goes on predicting big things about your life. Not because it measures “everything that matters”, but because the little it does measure (the efficiency with which you solve abstract problems under pressure) happens to spill over into a lot of areas.

The brain doing all this — where?

For decades, looking for where intelligence lived in the brain was like looking for where a car’s volume lives: nowhere in particular. In 2007, Rex Jung and Richard Haier reviewed 37 neuroimaging studies of intelligence and put forward the Parieto-Frontal Integration Theory (P-FIT) [8]. There is no intelligence zone; there’s a distributed network that connects the dorsolateral prefrontal cortex with the parietal lobe, passing through the anterior cingulate and some temporal regions.

Described as a process, it would go something like this: information enters through sensory areas, gets elaborated in the parietal cortex and is integrated and evaluated in the frontal cortex, while the white matter that links those regions determines how much bandwidth your processing has. People with more efficient connections between those areas tend to score higher on tests of fluid reasoning. Later studies are broadly consistent with the big picture: functional connectivity within the P-FIT network correlates with performance on matrix reasoning tasks [9].

Careful, though: no single region has been shown to account, on its own, for more than a small percentage of the variance in intelligence. And the model captures fluid reasoning and matrices well, but does worse once you move to verbal, social or creative tasks.

Is bigger always better?

For a long time it was taken for granted that the bigger the brain, the greater the intelligence. The correlation does exist, but it’s weak. A meta-analysis of 88 studies found a correlation of roughly 0.24 between brain volume and IQ [10]. That is: brain size explains around 6 % of individual differences in intelligence. The other 94 % depends on something else. Elephants have bigger brains than humans and they don’t publish papers in scientific journals. What seems to matter is not raw volume but the efficiency and organisation of the connections. In blunt terms, contemporary neuroscience leans toward the hypothesis that intelligence looks more like the bandwidth of a network than the raw power of a processor.
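The step from a correlation of 0.24 to "around 6 %" is just the square of the correlation coefficient. A minimal sketch, using only correlations quoted in this article (the labels are shorthand, not the studies' own variable names):

```python
def variance_explained(r):
    """Share of variance in one variable accounted for by a linear
    relationship with the other: the square of the correlation."""
    return r ** 2

# Correlations quoted in the text (Strenze 2007; Pietschnig et al. 2015).
CORRELATIONS = {
    "IQ and education": 0.56,
    "IQ and occupational status": 0.45,
    "IQ and income": 0.20,
    "brain volume and IQ": 0.24,
}

for label, r in CORRELATIONS.items():
    print(f"{label}: r = {r:.2f}, variance explained = {variance_explained(r):.0%}")
```

Squaring shrinks modest correlations quickly, which is why a correlation that sounds respectable can leave most of the variation unexplained.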

The genes, or the relative failure of the hype

In 2018, a team led by Danielle Posthuma published in Nature Genetics the largest genome-wide association study (GWAS) of intelligence ever done: 269,867 people and 205 genetic loci associated with cognitive performance [11]. The heritability of intelligence, meaning the share of the differences in intelligence between people in a population that can be attributed to genetic differences, is estimated at between 50 % and 80 %. It is not the percentage of any one person's intelligence that comes from their genes. The figure hasn’t moved much for decades, though it comes from twin studies.

And yet, when researchers tried to build a polygenic score (a weighted sum of the effects of all the identified genetic variants, used to predict a person’s IQ), the result explained less than 5 % of the actual variation in intelligence [12].
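In its simplest form a polygenic score is a weighted sum: each variant's estimated effect is multiplied by how many copies of the effect allele a person carries. The sketch below shows only that arithmetic. The variant names and effect sizes are hypothetical placeholders, not values from the studies cited, and real pipelines add steps (quality control, handling of correlated variants) that are left out here.

```python
# Hypothetical per-allele effect sizes, standing in for GWAS estimates.
# Names and numbers are invented for illustration only.
EFFECT_SIZES = {
    "variant_a": 0.021,
    "variant_b": -0.015,
    "variant_c": 0.009,
}

def polygenic_score(genotype, effect_sizes=EFFECT_SIZES):
    """Weighted sum of effect-allele counts (0, 1 or 2 copies per variant).

    Variants missing from the genotype are counted as 0 copies.
    """
    return sum(beta * genotype.get(variant, 0)
               for variant, beta in effect_sizes.items())

# A made-up person: two copies of variant_a, one of variant_b, none of variant_c.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_score(person)
print(f"polygenic score: {score:.3f}")
```

Each individual weight is tiny, so the score only becomes informative when thousands of variants are summed, and even then it captures a small part of the variation.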

The upshot is that, even though intelligence has a strong genetic component at the population level, as of today nobody can look at your DNA and tell you what you’re going to score on a WISC. And probably won’t be able to for quite some time.

The Flynn effect, and why it’s disappearing

Here’s another piece of information that rarely makes it into textbooks. Throughout the twentieth century, in developed countries, IQ scores rose steadily: roughly three points per decade on average. James Flynn documented this in the eighties, which is why the phenomenon carries his name. The most widely accepted explanations are improved childhood nutrition, the reduction of infectious disease and — above all — mass schooling and exposure to abstract modes of thought.
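A rough way to see what three points per decade means in practice: if norms are not updated, average scores against those old norms creep upward. The sketch below assumes the drift is linear at the rate quoted above, which is a simplification; the function names are illustrative.

```python
# Rough twentieth-century average quoted in the text; treated as a constant rate.
FLYNN_POINTS_PER_DECADE = 3.0

def norm_drift(years_since_norming, points_per_decade=FLYNN_POINTS_PER_DECADE):
    """IQ points by which the population average has moved since the norms were set."""
    return points_per_decade * years_since_norming / 10

def score_on_fresh_norms(observed_iq, years_since_norming):
    """Approximate score the same performance would earn against up-to-date norms."""
    return observed_iq - norm_drift(years_since_norming)

for years in (10, 20, 30):
    print(f"{years} years after norming: average drifts by {norm_drift(years):.1f} points")
```

Passing a negative rate to `norm_drift` gives the mirror image for a population whose scores are falling.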

The problem is that since the nineties the effect has started to reverse in several countries. In Norway, Bratsberg and Rogeberg published a study in PNAS in 2018 on military conscripts showing a clear turnaround: those born from the mid-seventies onwards score lower, on average, than their predecessors [13]. And the most interesting bit: the drop shows up even when comparing brothers within the same family. That rules out easy hypotheses like “it’s just that lower-IQ people are having more children”.

In the United States, a 2023 study of nearly 400,000 adults analysed data from 2006 to 2018 and found declines in matrix reasoning and number series, though not in verbal reasoning or three-dimensional rotation (this last one actually went up) [14]. In Germany, a recent analysis of student samples between 2012 and 2022 detected declines of between 4.7 and 5.2 IQ points per decade in figural reasoning [15]. Five points in ten years is a lot.

Nobody knows exactly what’s going on, but several hypotheses are on the table: changes in the education system, less deep reading, more screen time, changes in diet and sleep, and the saturation of the very environmental causes that had driven the Flynn effect in its day. The interesting bit is that the pattern isn’t uniform: not everything falls — what falls is the abstract, the stuff you have to solve without cultural scaffolding. And that, if it’s real, has uncomfortable implications.

So let me float the following: if fluid intelligence is going down in literate populations and crystallized intelligence is holding steady or even improving, does that imply we’re outsourcing our reasoning to external tools — search engines, calculators, assistants based on language models — in a way that leaves our brains allocating fewer resources to the muscle of unaided reasoning? It’s just a conjecture, but bear in mind that these data predate the rise of artificial intelligence, which in many cases can end up substituting for our mental effort. Some authors call it “atrophy by delegation”, even though there is no solid experimental evidence yet.

What the tests don’t see

Intelligence tests were designed in the early twentieth century in a very specific context, as usual: European and North American schools, written culture, Western logic, pencil and paper, an examiner giving instructions in a particular language. Even today, most studies on intelligence are done on WEIRD populations (the acronym stands for Western, Educated, Industrialised, Rich and Democratic), which make up less than 15 % of humanity and yet supply the overwhelming majority of experimental subjects in psychology [16].

That means a large part of what we call “the structure of human intelligence” is inferred from a very small and very peculiar subset of humans. When tests are applied outside that context — rural communities in sub-Saharan Africa, Indigenous Amazonian populations, illiterate adults anywhere in the world — anomalous results appear that rarely make it into popular reviews.

A classic case is Sylvia Scribner’s work with the Kpelle of Liberia in the seventies: unschooled farmers failed logical classification problems that a seven-year-old Western child would solve. But when the same problem was presented in terms of the social relations of their village, they solved it without trouble. Decontextualised abstraction, the queen of IQ tests, turns out to be a culturally learned skill rather than a universal natural capacity.

And then there’s everything the tests don’t even try to measure. Creativity, emotional wisdom, moral judgement, social intuition, the ability to deal with real uncertainty — not textbook problems — and resilience. Robert Sternberg has been arguing for forty years that academic intelligence is only one of the kinds relevant to human success. His triarchic proposal — analytical, practical and creative — hasn’t dethroned CHC in the clinic, but it has made it clear that if someone reduces a person to their score on Raven’s matrices, they’re cheating.

So what?

Intelligence, as a scientific construct, is a strange object. It’s real enough to predict your life expectancy and your job performance. It’s slippery enough that a century of intensive research hasn’t managed to define it without a fight. It’s distributed across a parieto-frontal network that we can already trace with functional MRI, but each specific region explains little. It’s 50–80 % heritable, yet the genes that make it up escape even the best GWAS. It rose for a hundred years and now, in some countries, it’s going down without anyone being sure why.

The reality is that there are so many variants and types of intelligence, and so many ways of measuring them, that being fully objective is probably close to impossible. And still, current methods, flaws and biases included, are valid enough to give an approximate measure.

References

[1] Spearman, C. (1904). “General Intelligence, Objectively Determined and Measured”. American Journal of Psychology, 15(2), 201–292. Reliable

[2] Cattell, R. B. (1963). “Theory of fluid and crystallized intelligence: A critical experiment”. Journal of Educational Psychology, 54(1), 1–22. / Horn, J. L. & Cattell, R. B. (1966). “Refinement and test of the theory of fluid and crystallized general intelligences”. Journal of Educational Psychology, 57(5), 253–270. Reliable

[3] Carroll, J. B. (1993). Human Cognitive Abilities: A Survey of Factor-Analytic Studies. Cambridge University Press. Book

[4] Schneider, W. J. & McGrew, K. S. (2018). “The Cattell–Horn–Carroll theory of cognitive abilities”. In Flanagan & McDonough (Eds.), Contemporary Intellectual Assessment: Theories, Tests, and Issues (4th ed., pp. 73–163). Guilford Press. Reliable

[5] McGrew, K. S. (2023). “Carroll’s Three-Stratum (3S) Cognitive Ability Theory at 30 Years: Impact, 3S-CHC Theory Clarification, Structural Replication, and Cognitive–Achievement Psychometric Network Analysis Extension”. Journal of Intelligence, 11(2), 32. Reliable

[6] Strenze, T. (2007). “Intelligence and socioeconomic success: A meta-analytic review of longitudinal research”. Intelligence, 35(5), 401–426. Reliable

[7] Whalley, L. J. & Deary, I. J. (2001). “Longitudinal cohort study of childhood IQ and survival up to age 76”. BMJ, 322(7290), 819. Reliable

[8] Jung, R. E. & Haier, R. J. (2007). “The Parieto-Frontal Integration Theory (P-FIT) of intelligence: Converging neuroimaging evidence”. Behavioral and Brain Sciences, 30(2), 135–154. With reservations, small sample

[9] Hilger, K., Ekman, M., Fiebach, C. J. & Basten, U. (2017). “Intelligence is associated with the modular structure of intrinsic brain networks”. Scientific Reports, 7, 16088. With reservations, low replicability

[10] Pietschnig, J., Penke, L., Wicherts, J. M., Zeiler, M. & Voracek, M. (2015). “Meta-analysis of associations between human brain volume and intelligence differences: How strong are they and what do they mean?”. Neuroscience & Biobehavioral Reviews, 57, 411–432. Reliable

[11] Savage, J. E. et al. (2018). “Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence”. Nature Genetics, 50(7), 912–919. Reliable

[12] Plomin, R. & von Stumm, S. (2018). “The new genetics of intelligence”. Nature Reviews Genetics, 19(3), 148–159. With reservations

[13] Bratsberg, B. & Rogeberg, O. (2018). “Flynn effect and its reversal are both environmentally caused”. PNAS, 115(26), 6674–6678. Reliable

[14] Dworak, E. M., Revelle, W., Doebler, P. & Condon, D. M. (2023). “Looking for Flynn effects in a recent online U.S. adult sample: Examining shifts within the SAPA Project”. Intelligence, 98, 101734. With reservations

[15] Breit, M., Scherrer, V., Blickle, J. & Preckel, F. (2024). “Measurement-Invariant Fluid Anti-Flynn Effects in Population-Representative German Student Samples (2012–2022)”. Journal of Intelligence, 12(1), 9. With reservations

[16] Henrich, J., Heine, S. J. & Norenzayan, A. (2010). “The weirdest people in the world?”. Behavioral and Brain Sciences, 33(2–3), 61–83. Reliable
