Education


At last, after 24 years of reflection, Professor Flynn has felt able to disburden himself of an ambitious general book[1] aimed at explaining the “massive rise in IQ” during the 20th century, an effect which now carries his name.

 

The results are curious. To be sure, Flynn has wished to write a popular book of general interest, not devoid of some philosophical self-indulgence, and to achieve both “critical acumen” (a pet concept) and judicious inoffensiveness. For aficionados he includes, in tabular form, much of the psychometric test data which support his finding of long-term secular gains in IQ.

 

Many debates regarding intelligence are both vexed and thorny, so it is helpful that his general account is careful to avoid animosity and welcomes the contending parties within the congenial circle of his “humane egalitarianism”.

 

However, I do have some criticisms:  

  1. It is actually quite difficult to discover what he does think, partly because his manner of exposition is amazingly diffuse. Eventually, in the final chapter, he admits to what must be described as an extreme form of environmentalism (an increasingly complex post-industrial society has amplified cognitive skills).
  2. Flynn seems surprisingly uninterested in psychometrics and test construction as such. What is universally known as the ACID profile he calls, for some reason, the AICD profile. Yet, as I will argue, these topics may contain a large part of the relevant explanation.
  3. Flynn mentions his background as a moral philosopher and seems disinclined to do any original research into the questions that interest him (though he is generous in suggesting research designs to others). The consequence is that the book is a hash of opinions, mostly unsupported by factual evidence. This does not detract from its stimulating properties but cumulatively leads to some confusion and frustration in the specialist reader.
  4. The book is quite carelessly written, much of it apparently dictated, and poorly proofed: words are missing and the expression is often rough. It reads like a draft. Though often amusing and direct, the manner is frequently somewhat cracker-barrel.

 

Missing not only from Flynn’s account, but from most other discussions of this topic, which has excited so many experts, is any consideration of the vastly changed technology of psychometric assessment. We expect a car built in 2000 to be a great deal more powerful than one built in 1910. Particularly since 1960, when Georg Rasch published his innovative account of test construction theory,[2] and since 1980, when the penny finally dropped after a 20-year lag, the technology of item response theory (IRT) has radically streamlined test administration. Older tests now look like what they are, museum pieces, simply because of their gross superfluity of items. We no longer find it acceptable that a new test should be published that does not utilise IRT-based scaling.
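Why a superfluity of items is wasteful can be illustrated with the simplest IRT model, the one-parameter (Rasch) model, in which the probability of a correct response depends only on the gap between a person's ability and an item's difficulty. The following is a minimal sketch, not a reconstruction of any published scaling: the function names and the ability and difficulty values (in logits) are assumptions chosen for illustration.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model (logit scale)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_information(ability: float, difficulty: float) -> float:
    """Fisher information an item yields about a person: p * (1 - p)."""
    p = rasch_probability(ability, difficulty)
    return p * (1.0 - p)


# Hypothetical examinee of average ability (0 logits).
ability = 0.0

# An item pitched at the examinee's level versus one far too hard.
well_targeted = item_information(ability, difficulty=0.0)
poorly_targeted = item_information(ability, difficulty=3.0)

print(f"well targeted:   {well_targeted:.3f}")
print(f"poorly targeted: {poorly_targeted:.3f}")
```

On these assumed values the well-targeted item carries several times the information of the badly targeted one, which is the sense in which a short, well-scaled test can do the work of a long, unscaled one.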

 

For instance, the family of Raven tests (the Coloured Progressive Matrices,[3] Standard Progressive Matrices[4] and Advanced Progressive Matrices[5]) between them offer 108 items, probably enough for three equivalent all-age tests. Using Andrich’s Rasch analysis of the difficulty values,[6] it is possible to select every third item, once ranked for difficulty, and come up with something like an economical modern test.[7]
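The selection procedure just described can be sketched in a few lines. This is an illustration only: the item names and difficulty values below are randomly generated placeholders, not Andrich’s published values, and the function name `thin_item_pool` is hypothetical.

```python
import random


def thin_item_pool(difficulties: dict, step: int = 3, offset: int = 0) -> list:
    """Rank items from easiest to hardest and keep every `step`-th one."""
    ranked = sorted(difficulties.items(), key=lambda pair: pair[1])
    return [name for name, _ in ranked[offset::step]]


# Placeholder pool of 108 items with made-up difficulties (in logits).
rng = random.Random(0)
pool = {f"item_{i:03d}": rng.gauss(0.0, 1.5) for i in range(1, 109)}

# Three interleaved forms, each spanning the whole difficulty range.
forms = [thin_item_pool(pool, step=3, offset=k) for k in range(3)]

for k, form in enumerate(forms, start=1):
    print(f"form {k}: {len(form)} items")
```

Because the three forms are drawn from interleaved positions in the same ranking, each covers the full range of difficulty with a third of the items, which is what makes them roughly equivalent under these assumptions.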

 

Because of the worldwide establishment and acceptance of the Wechsler tests, there has been a sustained reluctance to modernise them. The Wechsler Intelligence Scale for Children – 3rd edition (WISC-III)[8] was a remarkably tame revision and, even with one new test added, psychologists continued to give the test as if it were the Wechsler Intelligence Scale for Children – Revised (WISC-R),[9] omitting Symbol Search (the new addition) and reporting subtests in their old order. They continued hunting for the subjectively perceived ACID profiles even when provided with the quantitatively based factor or Index scores.[10]

 

Things changed, however, when the Psychological Corporation assembled a team to revise the WAIS. Originally known as the Wechsler-Bellevue and first published in 1939, this was standardised in the most limited way on a single convenience sample of adults in a suburb of New York, but became the progenitor of the entire family of Wechsler tests. Desiring a downward extension for children, David Wechsler produced the first WISC in 1949. Although the new test was better sampled, all the children were white:

 

The national standardization sample for the WISC, stratified by geographic region and parental occupation, consisted of 2200 white children (1100 boys, 1100 girls): 100 boys and 100 girls at each age from 5 to 15 years inclusive […] Miele (1958) reported the mean raw scores obtained by the examinees in the WISC standardization sample for each subtest by age and sex. Because standard deviations for WISC raw scores were never reported, total raw-score variability had to be estimated from the tables of norms in the WISC manual, which were used to calculate the interquartile range (the difference between the score at the 25th percentile and the 75th percentile in a normal distribution of scores) … [T]he standard deviation is 25% smaller than the interquartile range in a normal distribution of scores (Rosenthal & Rosnow, 1984) […] [11]
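The estimation step described in the quotation can be checked with a short calculation. In a normal distribution the 75th percentile lies about 0.674 standard deviations above the mean, so the interquartile range is about 1.349 standard deviations; the standard deviation is therefore about 0.74 of the interquartile range, in line with the rounded “25% smaller” quoted above. The sketch below assumes normality, and the quartile scores in the example are invented for illustration, not taken from the WISC manual.

```python
from statistics import NormalDist


def sd_from_iqr(q25: float, q75: float) -> float:
    """Estimate a standard deviation from the quartiles of a normal distribution."""
    z75 = NormalDist().inv_cdf(0.75)  # z-score of the 75th percentile, about 0.674
    return (q75 - q25) / (2.0 * z75)


# Ratio of SD to IQR under normality (IQR of 1 unit).
ratio = sd_from_iqr(0.0, 1.0)
print(f"SD is about {ratio:.2f} of the IQR")

# Hypothetical raw-score quartiles for one subtest at one age.
estimated_sd = sd_from_iqr(q25=10.0, q75=18.0)
print(f"estimated SD: {estimated_sd:.2f}")
```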

 

The test was to prove a permanent success in that it added the vital dimension of visuospatial problem-solving ability to the heavily verbal emphasis embodied in the older Stanford-Binet (Terman Merrill) test.

 

Flynn does not seem to appreciate the extent to which the content of subtests was revised with each new edition, nor the progressive refinement of technical standards indicated above. By the time the team got to work to produce the Wechsler Adult Intelligence Scale – Third Edition (WAIS-III),[12] the courage to modernise radically had at last been found;[13] the same team was shortly afterwards responsible for the WISC-IV revision also.[14] The latter two tests give Flynn a problem because of the radical extent of the alterations. Yet if one accepts the factorial logic of g (the statistical general factor in such tests), it should not really matter how it is measured. In particular, Flynn overlooks the case lying within his own data that modernisation seems to result in more intelligence.[15]

 

The technological argument must be that, as the measurement of intelligence threw off its historical legacy of sensorimotor testing,[16] so the tests revealed more and more of what we might nowadays regard as true intelligence, namely higher-order abstract logical reasoning and novel problem-solving. Engineers speak of signal-to-noise ratio and the history of psychometric technology is one of increasing signal and decreasing noise. In particular, the Wechsler tests included an enormous contamination of fine motor skills (especially timed tests) in what seems a mélange of sensorimotor activities. The most streamlined modern tests of general cognitive ability, such as the Differential Ability Scales – Second Edition,[17] are virtually free of such noise and show why such psychometric testing is now regarded as all but a finished technology. This may indeed result in a cessation of further upward movement in population IQs, as is apparently already evident in the Scandinavian countries.

 

Yet this is counterintuitive. We are all familiar with the antique tests, beloved of Mensa, which happily produce IQs of 180.[18] Is it therefore the case that older tests inflate, and more rigorous modern tests deflate, IQ? This would not constitute an explanation of the Flynn Effect! I believe the opposite is the case. Because of the poor technical standards — to our contemporary eyes — of the older tests, there was poor targeting of intellectual ability and a greater element of randomness. Paradoxically, the more rigorous modern technology actually reveals more of the intelligence that is there. Thus we do not overestimate contemporaries but underestimate, for technical reasons, previous generations.

 

The special case here, and the counterargument which must be addressed, is that of the Raven tests. If only raw scores were reported in the literature, as they often are by experimental psychologists typically reporting highly focused enquiries, we should be on firm ground. But the massive gains that show up on this test of non-verbal reasoning typically involve feeding raw scores through the wholly inadequate apparatus of the various standardisations published since the 1930s[19] in order to report derived scores of some kind. None of these standardisations has ever commanded general acceptance, with the partial exception of the 1979 norms for the SPM.

 

Nevertheless, I remain open to the very considerable evidence of the Flynn effect, to which some reliable Raven raw score data must powerfully contribute, but suggest that many of the competing explanations at present remain unsatisfactory. The most parsimonious explanation, or part-explanation, should concern the measurement technology itself — before we start speculating about the complexity of modern society and the scientific training of modern populations. Flynn certainly reveals the extent to which so-called IQ tests are relativistic from top to bottom, though he does not choose to emphasise their powerful clinical accuracy and utility. But he seems unwilling to recognize their central role in contributing to his undoubted effect.

 

We are left, then, with widely accepted and scarcely challenged evidence of “massive IQ gains” ever since such tests were invented, together with an excessive number of explanations and non-explanations. Though we may feel more sympathy for this one, and less for that, I suggest that a modest but fundamental source of influence lies in the progress of the technology itself. Before we resort to Bonapartist social explanations, we ought surely to consider the little black boxes from which the grand effect arises.

 


[1] Flynn, J.R. What Is Intelligence? New York: Cambridge University Press, 2007.

[2] Rasch, G. Probabilistic Models For Some Intelligence And Attainment Tests. Expanded edition. Chicago: University of Chicago Press, 1960/1980.

[3] Raven, J.C. Coloured Progressive Matrices (CPM). London: H.K. Lewis, 1956, 1962.

[4] Raven, J.C. Standard Progressive Matrices (SPM). London: H.K. Lewis, 1958.

[5] Raven, J.C. Advanced Progressive Matrices (APM). London: H.K. Lewis, 1947.

[6] Andrich, D. and Dawes, I. Conversion Tables CPM/SPM/APM. Chapter 4 in Raven, J. and Court, J.H. Manual For Raven’s Progressive Matrices And Vocabulary Scales, Research Supplement No. 4. London: H.K. Lewis, 1989.

[7] Sadly, it appears that the 2008 Pearson UK standardisation of Raven’s classical tests, although the first proper standardisation they have ever received, has missed the opportunity for the radical restructuring needed.

[8] Wechsler, D. Wechsler Intelligence Scale For Children – Third Edition UK (WISC-III). New York: Harcourt Brace Jovanovich, The Psychological Corporation, 1992.

[9] Wechsler, D. The Wechsler Intelligence Scale For Children – Revised (WISC-R). San Antonio, Texas: The Psychological Corporation, 1974.

[10] A simple spreadsheet, using published values and weightings, was sufficient to convert the valuable ACID profile, unique to the WISC-R, into an objective third, LD (or FD) factor which could be evaluated according to gradated criteria.

[11] Feingold, A. Cognitive gender differences: a developmental perspective. In: Sex Roles: A Journal of Research,  July 1993.

[12] Wechsler, D. Wechsler Adult Intelligence Scale – Third Edition (WAIS-III). New York: Harcourt Brace Jovanovich, The Psychological Corporation, 1997.

[13] The first public presentation of the revised test drew a standing ovation from its professional audience in San Diego.

[14] Wechsler, D. Wechsler Intelligence Scale for Children – 4th UK edition. (WISC-IVUK). London NW1 7BY: The Psychological Corporation, 2004.

[15] See Flynn (2007), p. 185, last paragraph.

[16] Because of their sensorimotor, timed aspects, the subtests which were among those showing the greatest gains in Flynn (2007), Appendix 1, Table 1 (Object Assembly, Coding, Picture Arrangement) were among those dropped or sidelined in the revisions: Full Scale IQ was no longer based on these as core tests.

[17] Elliott, C. Differential Ability Scales – Second Edition (DAS-II). San Antonio, Texas: Harcourt Assessment/Psychological Corporation, 2007.

[18] In real life, IQs above 145 are vanishingly rare.

[19] These often involve a few hundred males living in the Dumfries district.

 

We now know that in the 2007 PIRLS survey Britain has plummeted to 19th of 45 countries in primary reading standards (it was third in 2001); and in the still more recent OECD survey of 57 nations, our fifteen-year-olds are barely average in reading (we were seventh in 2000).

 

Somehow, one is too inclined to think of education as an event that occurs between a teacher, instructing, and a pupil, learning.  Like all such themes, this is subject to an infinitude of variations, including instructional methods, class sizes, curriculum resources and individualisation.  At all events, the result — if the lesson is one in literacy — is that children learn to read, write and perhaps spell.

 

It is increasingly apparent that this is wishful thinking.  It completely ignores the content of literacy and the cultural context in which such instruction takes place.  One might as well build sandcastles on a beach which is being approached by a tsunami after Krakatoa — or indeed describe a tea-party being assembled and conducted in a gale.

 

We are all familiar with instances of children, perhaps dyslexic, who have been taught against all the odds to read and actually do so with zest and enjoyment only a little diminished; conversely, there are children who have struggled up the adverse gradient of their reading difficulty, acquiring in painstaking and laboured fashion the necessary techniques, who refuse ultimately to pick up a book unless at gunpoint. In other words, facility with this set of techniques to be learned, ultimate success and the deployment of prowess are all dissociable. It is with the last of these that we need to concern ourselves now.

 

What is going on outside of the classroom is quite as important as what is going on inside it.  After they leave the bosom of their family, children are subject to the overwhelming influence of their peer group.  Nowadays this means the pervasive, roaring vortex of popular culture.  Specifically it means the world-wide electronic mass media.  Unless children are actually reared on Lindisfarne, and boarding schools may be more protective than day schools from this point of view, children are going to be indoctrinated early with the cult of films and film-stars, Hello! and OK!, Big Brother, rock and pop music, premier league football and meaningless celebrity.

 

It does seem that the vortex of popular culture rushes in like a cyclone where there is little in the way of a family culture to withstand its force; but just how powerful can family culture be?  One imagines a family with a well-defined religious tradition, with structures of discipline and routine, with limited television viewing and (perhaps) children in independent schools.  Both parents read, discuss current affairs and listen to and encourage their children’s opinions.  Immediately we have reduced our sample to about 5% of the population, but such a family has hardly a hope in hell of instilling a literate culture in their children.

 

Beyond the doors to peaceful hearth and home, the cyclone rages, picking up farmhouses and outbuildings, pigs and cows, and hurling them up to heaven.  And in case you don’t know what heaven looks like, try switching on the X-Factor. There is a vast floodlit auditorium, bathed in oceanic blue, studded with searchlights, with electronic snow dappling down across three different stages, each dais welcoming briefly a different population of sequinned guests.  Concealed musicians produce undemanding, indeed long-familiar music.  Converted door-to-door salesmen of central heating appear with microphones to elicit a mumble of infinitely repeated responses of gratitude for support.  These supporters are not far away; indeed they are leaping up and down in T-shirts in the audience. Thus at home in Edinburgh or Cardiff an intense pack of tweenies have sent their representatives to participate, but these in turn hope to see their hero ascend to heaven on their behalf. The show begins at a climactic level of hysteria and gradually intensifies until the tearful Leon experiences actual apotheosis on behalf of us all.

 

This conduces to immediate, overwhelming sensation. Fame and Fashion are the presiding deities, objects of unexampled devotion, and the cause served — and make no mistake, we are talking about a value system here — is the great lie of outwardness. To describe this combination of power and inanity as a vortex is just; to describe the dismal absence of any educational or critical nous as mediocrity is to commit that grave breach of manners known as over-optimism. It may be such things can happen in modern Britain and other comparable nations solely because of the fading of religious faith and the rushing in of secularism into the vacuum; but even primitive instincts of self-preservation seem to have been lost or willingly abandoned.

 

Reading is not like this.  Reading implies thinking, critical comparison, memory, reflection and the gradual sifting of what comes to matter to one most.  Culture remains what it always was: the best that has been thought and said.  The reader exists in a simultaneous world in which Marcus Aurelius speaks to Gabriel García Márquez, Plotinus discusses the descent of the soul and each new generation of lyric poet tries its hand at redefining Dante and Horace in a perpetual carousel of harmonious voices.  All of this is inward, in the sense that culture is all that we value most and, as Seamus Heaney puts it, takes the form of a shared inwardness. To those who can read but don’t, and to those who can’t read, alike, is denied the opportunity of spiritual autonomy which is the essence of freedom.

 

The substitute is deeply discouraging.  The power of the international electronic media is the tyranny of our times, just as at various former times in our history kings, barons, aristocrats and trade unionists posed apparently insuperable obstacles to control by the Common-Wealth (Hobbes’s term). The monster is always hungry and roars about seeking whom it may devour. Youngsters are eager alike to be sucked in as star-struck angels ascending the stairway to heaven and as recruits in the ever-expanding army of journalists and technicians who must keep the monster fed.

 

Quiet, to the mighty Danube as it flows,

hushed, out onto the balcony.

Quiet, to the drinks machine that takes the new coins.

Quiet. Let us think what to feed the monster next.

 

Six floors down, in the foyer, the National Lottery Live

is cast up to foraging shoulder-cameras.

This is real time. Through dusk and plate-glass walls

the public peers in. The cameras need no one.[1]

 

In short, there is no external support system that values longer term, grounded, reflective values of the kind that are pursued through reading.  What happens outside the classroom negates what small successes there may be within it.  We are in the grip of a powerful tyranny which purveys entertainment, success and style at the expense of ancient humane values, which it tends to mock. Every act of private reading may be seen as a difficult gesture of resistance to the prevailing popular culture.  Schools themselves, by substituting busyness for learning, often discourage reading.  Increasingly they are places where young children, having donned their uniforms and stuffed their school bags, go to interact with the surrounding electronic culture.  So reading and creative writing have become acts of resistance even to the culture of schools.

 

It is impossible to delude ourselves any longer.  Consequently we should cease all attempts to deceive ourselves.  The defence of the humanities has always been slippery (in a Philistine age when ‘theory’ and pseudoscience have reigned) and has never been very convincingly conducted.  Today it is in full retreat in the face of an unprecedented threat.  The values of literacy are those of the individuation of spirit, the colloquy of genius across the centuries and the promise of personality.  In an era of shameless cliché to have an original or critical thought is instantly recognized as an act of subversion.  Given the well documented shrinkage of vocabulary, one would be well advised, on Newsnight, not to employ a word like solipsistic.

 

In conclusion, though it has taken seventeen years for government to endorse specific teaching methods likely to promote technical literacy (seventeen years is nothing in a country like this), objective benchmarks of functional literacy remain extremely disappointing in relation to money spent.  The cause lies not inside but outside the classroom, where very little of a literate culture remains to provide an ecosystem for the young reader. Parents and families seem powerless to counteract the vortex of popular culture.  Illiteracy, innumeracy and general stupidity are celebrated everywhere. And the concept of education, which must always be “reformed”, is everywhere too narrow to permit a meaningful linkage between cause and effect, the decline of a literate culture and the decline of literacy.


[1] See Turner, M. The Deer of Tamniès. PublishAmerica, 2006.