  Pearson, already a professor of mathematics at University College London, first took on this task with the analysis of an odd distribution of measurements of the foreheads of Neapolitan crabs that troubled his colleague W.F.R. Weldon. Using analytical tools borrowed from mechanics, he was able to resolve this into two superimposed normal distributions, implying that here was a population observed at the moment of evolving into two distinct forms. Sadly, the data may have been at fault, not accounting sufficiently for the crabs’ usual habit of snipping bits off one another.

  Pearson went on to derive from a single differential equation five Types or families of distribution, both skew and symmetrical, of which the Normal was Type V; where it appeared “we may probably assume something approaching a stable condition; there is production and destruction impartially around the mean.” The young pea plants of exceptional parents will disappoint by their conformism, but the children of the average will also include the exceptional—life goes on, churning but stable, down the generations. Other Types governed—or so Pearson confidently claimed—data “from paupers to cricket-scores”; he fit his curves to St. Louis schoolgirls, barometric pressure, Bavarian skulls, property valuations, divorce, buttercups, and mortality.

  You would think that Galton, with his omnivorous delight in the measurable, would have been overjoyed. Here was a man, a “fighting Quaker” like himself, reducing the complexity of the world into forms that might, in time, reveal the inner dynamics of evolution and inheritance. And yet, in his letters to Pearson, Galton seems a man who has conjured up a djinn of great power but worrying intent. “[Your charts] are very interesting, but have you got the rationale quite clear, for your formulae, and your justification for applying them?” Galton knew he had not the mathematics to dispute Pearson’s formulae; but he fretted that a curve expressed as

  [Pearson’s formula, a displayed equation not reproduced in this text]

  might be difficult to reconcile with the reality of a group of Ovambo warriors, watchful but willing, lining up in their quartiles. Weldon (the man with the crabs) was worried too: “The position now seems to be that Pearson, whom I do not trust as a clear thinker when he writes without symbols, has to be trusted implicitly when he hides in a table of Γ-functions, whatever they may be.”
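  Whatever the lost display looked like, Pearson’s whole family of curves flows from the single differential equation mentioned above. A standard statement of the Pearson system (the textbook form, not necessarily the exact formula Galton was puzzling over) is

  \[ \frac{1}{y}\,\frac{dy}{dx} = -\,\frac{x + a}{b_0 + b_1 x + b_2 x^2} \]

  where the roots of the quadratic in the denominator decide which Type emerges. In the special case $b_1 = b_2 = 0$ the equation integrates at once to

  \[ y = C\, e^{-(x + a)^2 / 2 b_0}, \]

  the Normal curve itself, centered on $-a$ with variance $b_0$.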

  There are two significant points here—one cultural and one philosophical. The first has to do with the dismal reputation of statistics: that dry and empty sensation in the stomach’s core when the subject arises, either in assignment or conversation. Any newspaper reader would be able to make sense of one of Galton’s essays, but Pearson carried statistics off into a mathematical thicket that has obscured it ever since. He was a scientist and wrote for scientists—but this meant that the ultimate tool of science, its measure of certainty, became something the rest of us had to take on trust. Now that the calculation is further entombed, deep within the software that generates statistical analyses, it becomes even more like the pronouncement of an oracle: the adepts may interpret, but we in the congregation can only raise our hands and wonder.

  The larger point has to do with that certainty. Let us put aside utter skepticism and agree that the stars are really there—or at least something very like them. To admit that our observations of their positions show unavoidable error is no great leap: we are human; we make mistakes. To say these errors fall into a pattern is also an acceptable assumption about reality. Pearson, though, took things further: to his mind, measurements were not attempts at certainty about a thing, but the results of a random process. Observation is like rolling a die—one with many faces and an unknown shape. The scatter of measurements appears across our chart with as little intrinsic meaning as the record of red and black over two months at Monte Carlo. It is only when we have plotted our distribution curve on that scatter and judged its “goodness of fit” that we can begin to do science: determining the mean value, judging the symmetry of the curve, calculating the standard deviation, looking for outlying values. These parameters—a word we toss around with such facility—are the apparent qualities of a curve that has been fitted to the plot of selected observed measurements of a random variable, in itself an apparent quality of some real thing. This is the reality; this is the truth. Whatever is out there—whether a new star in Cassiopeia or your child’s intelligence—is only an approximation: an individual manifestation of a random collective process. This view may be right mathematically—after all, the whole point of probability and statistics is to be rigorous about uncertainty—but it makes the journey back from rigor to reality a long and hazardous expedition.
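  Pearson’s sequence of operations—plot the scatter, fit a curve, judge its goodness of fit, read off the parameters—is exactly what statistical software still performs. Here is a minimal sketch of that workflow in Python, assuming NumPy and SciPy are available (the data are simulated and every number is my own invention; only the general procedure is Pearson’s):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    data = rng.normal(loc=100.0, scale=15.0, size=500)  # stand-in measurements

    mu, sigma = stats.norm.fit(data)  # the fitted curve's parameters

    # Pearson's chi-squared test of goodness of fit: bin the observations
    # and compare them with the counts the fitted curve predicts per bin.
    edges = np.linspace(data.min(), data.max(), 11)
    observed, _ = np.histogram(data, edges)
    expected = len(data) * np.diff(stats.norm.cdf(edges, mu, sigma))
    expected *= observed.sum() / expected.sum()            # match the totals
    chi2, p = stats.chisquare(observed, expected, ddof=2)  # 2 fitted parameters

    print(f"mean {mu:.1f}, sd {sigma:.1f}, chi-squared {chi2:.1f}, p {p:.2f}")

  Everything the analysis reports—mean, standard deviation, the quality of the fit itself—describes the curve, not the thing measured; the journey back to reality is left to us.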

  Florence Nightingale had once approached her great friend Benjamin Jowett, Master of Balliol, with the idea of starting a statistics faculty at Oxford, but nothing came of it. Pearson, thanks to a substantial legacy from Francis Galton, did establish a Department of Eugenics at University College London and the first great statistical journal, Biometrika. He fought hard for the recognition of statistics as a separate discipline—under, of course, his own banner; practitioners who took too independent a line were usually cast into outer darkness. He brought in armies of assistants to plot curves, judge goodness of fit, and churn out parameters, using new mechanical aids like the Millionaire, a sewing-machine-size calculator that (as its name implied) could handle inputs in seven digits. Many of these assistants were female; indeed, one distinction of statistics is that it served as a secret passage into the fortress of academia for many women who later built distinguished careers. Under Pearson’s rule, statistics was no longer just a philosophical vogue, a useful method, a tool for observers and activists, a side branch of mathematics—it had become a Department with Professors, almost an end in itself. That is how we find it today, in universities, companies, hospitals, and government departments all over the world; and this now seems . . . perfectly normal.

  7

  Healing

  Non est vivere, sed valere vita est. (Life is not living, but living in health)

  —Martial, Epigrams, VI, lxx

  Watching the balls rattle down Galton’s quincunx, you feel amazed at order arising out of chaos—random ricochets settling into the satisfying curve of a normal distribution—but only if each ball is anonymous. If one ball represents you, why should you care that your haphazard path combines with the others to make a neat, coherent picture? You want to know where you’re going.
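  The quincunx is easy to imitate. In this toy version in Python with NumPy (the twelve rows of pins and the fair left-or-right bounce are my assumptions, not Galton’s dimensions), each ball’s path is pure accident, yet the slots fill out the bell:

    import numpy as np

    rng = np.random.default_rng(1)
    rows, balls = 12, 10_000
    # each bounce shifts a ball one pin left (-1) or right (+1) at random;
    # its final slot is simply the sum of its twelve bounces
    slots = rng.choice([-1, 1], size=(balls, rows)).sum(axis=1)

    for value, count in zip(*np.unique(slots, return_counts=True)):
        print(f"{value:+3d} {'#' * int(count // 50)}")  # crude histogram: a bell

  Trace any single ball through the program, though, and it can tell you nothing but a list of coin flips—which is precisely the dilemma above.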

  This has always been the dilemma in medicine. In Malraux’s La voie royale, the hero concludes: “There is no Death . . . there is only I, who am dying.” When the body—oldest and closest companion—begins to give out, the problem is not abstract. I hurt, I tire, I fear: something is wrong with me; there must be something you can do for me. The doctor’s answer may also be couched in those terms, but the science on which it is based has nothing to do with the individual. It is collective and probabilistic. That standby of medical dialogue—“What chance do I have, Doctor?” “I’d say about sixty percent”—does not mean what the patient thinks it means: “You have a sixty percent chance of coming through this crisis” but rather, at best: “I remember a study reporting that, of a thousand people roughly like you, roughly in the same position, four hundred of them died.”

  Medicine is a profession long held in honor because it averts fate. Asclepius was considered the son of a god because only someone in touch with the divine could legitimately interfere with illness, previously the province of prayer and sacrifice. Medieval doctors were like priests, drawing knowledge from ancient texts, proceeding by deduction and comparison. As late as the mid-eighteenth century, Galen, Avicenna, and Hippocrates remained the set books in every European medical school: although anatomists had long since shown the heart to be a pump, students still learned that it was some sort of furnace. As of 1750, there were two effective medicines: cinchona bark for malaria and mercury for syphilis; one effective surgical procedure: cutting for bladder stones; and one sensible prescription based on observation: vaccination against smallpox with cowpox. Apart from that (and despite a long tradition of excellent descriptive anatomy going back to Leonardo) all was fluctuating humors, plague-repellent pomanders, purgings, bleedings, clysters, and humbug. Indeed, it is debatable whether going to a doctor at any time before 1880 would have increased or decreased one’s chance of survival.

  Medicine lacks the key experiments associated with other sciences. Walter Reed’s mosquito, Pasteur’s rabid boy, and Fleming’s gardening boots don’t have the clinching perfection of Galileo on his tower. The body has too few ways of letting us know what’s wrong, and there are too many potential causes for the same phenomenon—think how baffled medicine still seems by vague but persistent viral infections. If each symptom pointed to a single cause, we could reason about it deductively—a simple procedure, which may explain why the works of the deductive theorists were preserved, in contradiction to observed fact, for so long.

  When Quetelet’s work first appeared, his new Statistics briefly promised real value in medicine. In 1828 Pierre Charles Alexandre Louis made a quantitative study of the effect of bloodletting on pneumonia and found it essentially useless, thus sparing countless exhausted patients the ordeal of lancet and leech. The vogue for numerical medicine soon died out, however, both because the limited studies hardly merited their broad conclusions, and because of powerful resistance from within the medical profession.

  It’s tempting to believe that this was simply the reactionary tendency of an elite anxious to hold on to its fees, but there were also more interesting and more sincere objectors. Risueño d’Amador thought that statistics breached the intuitive connection between doctor and patient, the “tact” that allows this unique interaction of person and disease to be understood. Considering disease collectively might actually reduce the chance of curing the patient in front of you, since he would be bound to differ from the average. Claude Bernard saw statistics as standing in the way of developing exact knowledge: medicine was simply a young science, which in time would grow into the certainty of physics. Physiologists, he said, “must never make average descriptions of experiments, because the true relations of phenomena disappear in the average.”

  The problem arose from having to draw collective conclusions when the way things vary is probably more important than the way they stay the same. Although statistics could reveal broad correlations, as in John Snow’s cholera map and Florence Nightingale’s bat-wing charts, it had as yet no power to define degrees of action, no method for separating confounded variables, no hope of being scientifically rigorous about the uncertain.

  Yet there was an entirely different field of study that tackled these same difficulties: agronomy. In the thirteenth century, when the red-robed magister at Oxford’s medical schools would explain that all disease was the result of uneven mixture of the hot, wet, cold, and dry elements in the body, a lecturer further down the muddy street, Walter of Henley, was telling students in the estate-management course:

  Change yearly your seed corn at Michaelmas, for more increase shall you have of the seed that grew on another man’s land than by that which groweth upon your own land. And will you see it? Cause the two lands to be plowed in one day and sow the one with the bought seed and the other with the seed which grew of your own and at harvest you shall find that I say truth.

  “And will you see it?” is a refrain running all through Walter of Henley’s works. Each problem was approached through experiment and scrupulous accounting. Plant two fields together; watch through the year; keep track of the overall costs “and you shall find that I say truth.”

  Francis Bacon continued this same experimental tradition, testing to see how well seeds germinated in separate, revolting concoctions; the seeds soaked in urine showed a marked advantage over the untreated, or those soaked in wine. Nowadays, we could make an educated assumption about the roles of urea and nitrogen; but Bacon’s experiment made it possible to decide what to do even without the fundamental knowledge.

  Eighteenth-century agronomy was approaching ever closer to what we would now call the scientific method. Arthur Young’s Course of Experimental Agriculture appeared in 1770: not only did he insist on split-field trials of any new technique or treatment, but he said that those trials should be repeated in several different fields to exclude the effects of variation in soil fertility or drainage. He measured value down to the farthing and tested it by real sales on the same day in the same market. Most of all, he deplored hypothesis: “adopting a favorite notion, and forming experiments with an eye to confirm it.” Each step toward discipline made agronomy more scientific: a modern researcher would find very little recognizable in a medical laboratory of 1820, but he could walk onto an experimental farm of the same date and feel entirely at home.

  Whenever a collective experiment is being planned; whenever researchers are collating and preparing data; whenever government agencies, pharmaceutical companies, or hospital authorities decide a result is “statistically significant”—there stands the blinking, bearded, pipe-smoking spirit of Ronald Aylmer Fisher.

  Fisher combined great abilities with great hatreds, collegial warmth with an ungovernable temper, broad interests with painstaking precision. He had such weak eyesight that his schoolmasters arranged for him to do as little reading and writing as possible: he learned mathematics not from blackboards and textbooks, but from conversation and the development of a precise imagination. This gave him an uncanny talent for inner visualization: the shape of a scatter of points in eight dimensions was as intuitively clear to him as if they had been in two. He studied mathematics at Cambridge, but always with an eye to its applications in astronomy, biology, and genetics.

  Academic feuds, like the wars of hill tribes, are as tiresome as they are endless, except when they spur some unexpected creation: an epic poem, a theory. In 1917, Karl Pearson published, without prior warning, a paper criticizing Fisher’s emerging ideas on likelihood, claiming that they were essentially the same as Laplace’s inverse probability. The prickly Fisher felt snubbed and deliberately misunderstood. When, two years later, Pearson offered him a position as his assistant, Fisher spurned it and went off to be statistician at Rothamsted Agricultural Experimental Station. The student of genetic variation, heir to the statistical tradition of Galton, had come back to the land.

  He found a spread of rolling, well-tended fields; at their center, a cluster of sturdy brick buildings; and within one of them, a room filled with leather-bound data: ninety years of daily rainfall, temperature, soil conditions, fertilizer application, and crop yields. The proprietors of Rothamsted, a family enriched by the invention of artificial guano, had understood the importance of raw numbers in the study of variation.

  Fisher plunged into this multidimensional world, where every factor was at once separate and correlated, and set it running in that mental theater where eight dimensions seemed like two. He learned how to strip out variables one after the other: cycles of weather, exhaustion of land, regression to mean, annual rainfall—filtering out the noise so that only the signal of interest remained. He was even able to isolate what had been until then an inexplicable phenomenon and determine its cause: there had been a deterioration of yield beginning in 1876, accelerating in 1880, suddenly improving in 1901, dropping off thereafter. Why? Because the Education Acts of 1876 and 1880 made attendance at school compulsory, so the little boys who had previously earned their pocket money by weeding disappeared; then, in 1901, the vigorous master of a local girls’ school thought weeding would be a healthy outdoor activity for his charges—but they soon disagreed.

  His progress in the analysis of real data prompted Fisher to look at how the experiments themselves were set up. There had been, since the days of Young, constant debate about the layout of field tests. Let’s say you want to test your superphosphate fertilizer (a polite term for bird droppings) against a control. You might think that setting out your field in alternate strips, A|B|A|B, would be a reasonable proposal—and if your B strips outgrew your A strips you could confidently recommend superphosphate to your friends. But what if there was a natural gradient in fertility across the whole field from right to left? Each of your B strips would be just that bit more fertile than its neighboring A; if you added up the total yields, you might be seeing an effect that wasn’t there. The recognized solution to this problem was the “Latin square,” an ingenious anagram that allowed many small areas of different treatment to be grown evenly across a field in such a way that no two adjacent plots received the same treatment, thus:
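  The square printed at this point in the original is not reproduced here, but any Latin square will serve. With four treatments A, B, C, D, one valid arrangement (my example, not necessarily the book’s figure) is:

    A B C D
    B A D C
    C D B A
    D C A B

  Each treatment appears exactly once in every row and every column, so no two neighboring plots match, and each treatment samples any left-to-right or top-to-bottom gradient of fertility evenly.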

  In Fisher’s view, however, any repeated system, no matter how balanced and ingenious, introduced an element of bias that would make it difficult to separate out natural variation from the final data—and without the natural variation, there would be nothing with which to compare the effects of treatment. What, on the other hand, was the only confounder that was easily washed out of results? Error—thanks to the error curve. And how could you make sure that the extra variation expressed error and nothing else? Randomize. Fisher suggested—and science took up—the rule that the only way to assure an unbiased distribution of treatments to subjects is to flip a coin or roll a die.

  Fisher’s rule was tested in a remarkable nonexperiment by the New Zealander A. W. Hudson, who planted potatoes and then applied six entirely imaginary “treatments” in random or systematic patterns. Although nothing had been done to the potatoes, there was less variation in those “treated” systematically than in the random. Even when the only variation is natural, a regular system of observation can introduce a spurious appearance of order. Fisher was vindicated.
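  Both Fisher’s rule and Hudson’s result are easy to imitate with invented numbers. In this Python sketch (a toy model, not Hudson’s data: 120 plots, a smooth fertility gradient, and six dummy treatments of my own devising), the “yields” contain nothing but natural variation, yet the systematic layout shows the smaller, falsely reassuring spread:

    import numpy as np

    rng = np.random.default_rng(0)
    n_plots, n_treatments = 120, 6
    # natural variation only: a smooth fertility gradient plus plot-level noise
    yields = np.linspace(0.0, 10.0, n_plots) + rng.normal(0.0, 1.0, n_plots)

    systematic = np.arange(n_plots) % n_treatments  # A B C D E F A B C ...
    randomized = rng.permutation(systematic)        # Fisher's coin-flip layout

    def spread(layout):
        # how far apart the six "treatment" means land
        return np.std([yields[layout == t].mean() for t in range(n_treatments)])

    print("systematic layout:", round(spread(systematic), 2))
    print("randomized layout:", round(spread(randomized), 2))

  Because the repeating pattern samples the gradient evenly, the dummy treatments look more uniform than the error curve says they should—order manufactured by the layout itself, just as Hudson found among his potatoes.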