The Riddle of Sweden's COVID-19 Numbers

Comparing Sweden's COVID-19 statistics to those of other European countries, two peculiar features emerge:

  1. Despite very different policies, Sweden has a similar pattern of cases.
  2. Despite a similar pattern of cases, Sweden has a very different pattern of deaths.

Sweden's Strategy

What exactly has Sweden done (and not done) in response to COVID-19?

  • The government has banned large public gatherings.
  • The government has partially closed schools and universities: lower secondary schools remained open while older students stayed at home.
  • The government recommends voluntary social distancing. High-risk groups are encouraged to isolate.
  • Those with symptoms are encouraged to stay at home.
  • The government does not recommend the use of masks, and surveys confirm that very few people use them (79% "not at all" vs 2% in France, 0% in Italy, 11% in the UK).
  • There was a ban on visits to care homes which was lifted in September.
  • There have been no lockdowns.

How has it worked? Well, Sweden is roughly at the same level as other western European countries in terms of per capita mortality, but it's also doing much worse than its Nordic neighbors. Early apocalyptic predictions have not materialized. Economically it doesn't seem to have gained much, as its Q2 GDP drop was more or less the same as that of Norway and Denmark.1

Case Counts

Sweden has followed a trajectory similar to other Western countries with the first wave in April, a pause during the summer (Sweden took longer to settle down, however), and now a second wave in autumn.2

The fact that the summer drop-off in cases happened in Sweden without lockdowns and without masks suggests that perhaps those were not the determining factors? It doesn't necessarily mean that lockdowns are ineffective in general, just that in this particular case the no-lockdown counterfactual probably looks similar.

The similarity of the trajectories plus the timing points to a common factor: climate.


This sure looks like a seasonal pattern, right? And there are good a priori reasons to think COVID-19 will be slow to spread in summer: the majority of respiratory diseases all but disappear during the warmer months. This chart from Li, Wang & Nair (2020) shows the monthly activity of various viruses sorted by latitude:

The exact reasons are unclear, but it's probably a mix of temperature, humidity,3 behavioral factors, UV radiation, and possibly vitamin D.

However, when it comes to COVID-19 specifically there are reasons to be skeptical. The US did not have a strong seasonal pattern:

And in the southern hemisphere, Australia's two waves don't really fit a clear seasonal pattern. [Edit: or perhaps it does fit? Their second wave was the winter wave; climate differences and lockdowns could explain the differences from the European pattern?]

The WHO (yes, yes, I know) says it's all one big wave and that COVID-19 has no seasonal pattern like influenza's. A report from the National Academy of Sciences is also very skeptical about seasonality, making comparisons to SARS and MERS, which do not exhibit seasonal patterns.

A review of 122 papers on the seasonality of COVID-19 is mostly inconclusive, citing lack of data and problems with confounding from control measures and social, economic, and cultural conditions. The results in the papers themselves "offer mixed statistical support (none, weak, or strong relationships) for the influence of environmental drivers." Overall I don't think there's compelling evidence in favor of climatic variables explaining a large percentage of variation in COVID-19 deaths. So if we can't attribute the summer "pause" and autumn "second wave" in Europe to seasonality, what is the underlying cause?


If not the climate, then my next guess would be schools, but the evidence suggests they play a very small role. I like this study from Germany which uses variation in the timing of summer breaks across states, finding no evidence for an effect on new cases. This paper utilizes the partial school closures in Sweden and finds open schools had only "minor consequences". Looking at school closures during the SARS epidemic, the results are similar. The ECDC is not particularly worried about schools, arguing that outbreaks in educational facilities are "exceptional events" that are "limited in number and size".

So what are we left with? Confusion.


This chart shows daily new cases and new deaths for all of Europe:

There's a clear relationship between cases & deaths, with a lag of a few weeks as you would expect (and a change in magnitude due to increased testing and decreasing mortality rates). Here's what Sweden's chart looks like:

What is going on here? Fatality rates have been dropping everywhere, but cases and deaths appear to be completely disconnected in Sweden. Even the first death peak doesn't coincide with the first case peak, but that's probably because of early spread in nursing homes.
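The case-to-death lag discussed above can be estimated directly from the two series by sliding one against the other and maximizing the correlation. Here is a toy sketch on synthetic data — the case curve, the 21-day lag, and the 1.5% fatality ratio are all made-up illustrative numbers, not fitted to any real country:

```python
import numpy as np

# Synthetic daily case counts: two "waves" (spring and autumn).
days = np.arange(300)
cases = 1000 * np.exp(-((days - 60) / 25) ** 2) + 4000 * np.exp(-((days - 250) / 30) ** 2)

# Deaths as a lagged, scaled copy of cases (lag 21 days, fatality 1.5%).
true_lag, ifr = 21, 0.015
deaths = np.zeros_like(cases)
deaths[true_lag:] = ifr * cases[:-true_lag]

def best_lag(cases, deaths, max_lag=60):
    """Estimate the lag by maximizing Pearson correlation over candidate shifts."""
    corrs = [np.corrcoef(cases[:len(cases) - k], deaths[k:])[0, 1]
             for k in range(1, max_lag + 1)]
    return 1 + int(np.argmax(corrs))

print(best_lag(cases, deaths))  # recovers the built-in 21-day lag
```

On real data the relationship is noisier (testing volume and fatality rates change over time), which is exactly why the Swedish chart, where no shift lines the two curves up, is so puzzling.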

Are they undercounting deaths? I don't think so; total deaths seem to be below normal levels (data from EuroMOMO):

So how do we explain the lack of deaths in Sweden?


Could it be that only young people are catching it in Sweden? I haven't found any up-to-date, day-by-day breakdowns by age, but comparing broad statistics for Sweden and Europe as a whole, they look fairly similar. Even if age could explain it, why would that be the case in Sweden and not in other countries? Why aren't the young people transmitting it to vulnerable old people? Perhaps it's happening and the lag is long enough that it's just not reflected in the data yet?

[Edit: thanks to commenter Frank Suozzo for pointing out that cases are concentrated in lower ages. I have found data from July 31 on the internet archive; comparing it to the latest figures, it appears that old people have managed to avoid getting covid in Sweden! Here's the chart showing total case counts:]

Improved Treatment?

Mortality has declined everywhere, and part of that is probably down to improved treatment. But I don't see Sweden doing anything unique which could explain the wild discrepancy.

Again I'm left confused about these cross-country differences. If you have any good theories I would love to hear them. [Edit: looks like age is the answer.]

  1. I think the right way to look at this is to say that Sweden has underperformed given its cultural advantages. The differences between Italian-, French-, and German-speaking cantons in Switzerland suggest a large role for cultural factors. Sweden should've followed a trajectory similar to its neighbors rather than one similar to Central/Southern Europe. Of course it's hard to say how things will play out in the long run.
  2. Could this all be just because of increased testing? No. While testing has increased, the rate of positive tests has also risen dramatically. The second wave is not a statistical artifact.
  3. Humidity seems very important, at least when it comes to influenza. See eg Absolute Humidity and the Seasonal Onset of Influenza in the Continental United States and Absolute humidity modulates influenza survival, transmission, and seasonality. There's even experimental evidence; some papers: High Humidity Leads to Loss of Infectious Influenza Virus from Simulated Coughs, Humidity as a non-pharmaceutical intervention for influenza A.

When the Worst Man in the World Writes a Masterpiece

Boswell's Life of Johnson is not just one of my favorite books, it also engendered some of my favorite book reviews. While praise for the work is universal, the main question commentators try to answer is this: how did the worst man in the world manage to write the best biography?

The Man

Who was James Boswell? He was a perpetual drunk, a degenerate gambler, a sex addict, whoremonger, exhibitionist, and rapist. He gave his wife an STD he caught from a prostitute.

Selfish, servile and self-indulgent, lazy and lecherous, vain, proud, obsessed with his aristocratic status, yet with no sense of propriety whatsoever, he frequently fantasized about the feudal affection of serfs for their lords. He loved to watch executions and was a proud supporter of slavery.

“Where ordinary bad taste leaves off,” John Wain comments, “Boswell began.” The Thrales were long-time friends and patrons of Johnson; a single day after Henry Thrale died, Boswell wrote a poem fantasizing about the elderly Johnson and the just-widowed Hester: "Convuls'd in love's tumultuous throws, / We feel the aphrodisian spasm". The rest of his verse is of a similar quality; naturally he considered himself a great poet.

Boswell combined his terrible behavior with a complete lack of shame, faithfully reporting every transgression, every moronic ejaculation, every faux pas. The first time he visited London he went to see a play and, as he happily tells us himself, he "entertained the audience prodigiously by imitating the lowing of a cow."

By all accounts, including his own, he was an idiot. On a tour of Europe, his tutor said to him: "of young men who have studied I have never found one who had so few ideas as you."

As a lawyer he was a perpetual failure, especially when he couldn't get Johnson to write his arguments for him. As a politician he didn't even get the chance to be a failure despite decades of trying.

His correspondence with Johnson mostly consists of Boswell whining pathetically and Johnson telling him to get his shit together.

He commissioned a portrait from his friend Joshua Reynolds and stiffed him on the payment. His descendants hid the portrait in the attic because they were ashamed of being related to him.

Desperate for fame, he kept trying to attach himself to important people, mostly through sycophancy. In Geneva he pestered Rousseau,1 leading to this conversation:

Rousseau: You are irksome to me. It’s my nature. I cannot help it.
Boswell: Do not stand on ceremony with me.
Rousseau: Go away.

Later, Boswell was given the task of escorting Rousseau's mistress Thérèse Le Vasseur to England—they had an affair on the way.

When Adam Smith and Edward Gibbon were elected to The Literary Club, Boswell considered leaving because he thought the club had now "lost its select merit"!

On the positive side, his humor and whimsy made for good conversation; he put people at ease; he gave his children all the love his own father had denied him; and, somehow, he wrote one of the great works of English literature.

The Masterpiece

The Life of Samuel Johnson, LL.D. was an instant sensation. While the works of Johnson were quickly forgotten,2 his biography has never been out of print in the 229 years since its initial publication. It went through 41 editions just in the 19th century.

Burke told King George III that he had never read anything more entertaining. Coleridge said "it is impossible not to be amused with such a book." George Bernard Shaw compared Boswell's dramatization of Johnson to Plato's dramatization of Socrates, and placed old Bozzy in the middle of an "apostolic succession of dramatists" from the Greek tragedians through Shakespeare and ending, of course, with Shaw himself.

It is a strange work, an experimental collage of different modes: part traditional biography, part collection of letters, and part direct reports of Johnson's life as observed by Boswell.3 His inspiration came not from literature, but from the minute naturalistic detail of Flemish paintings. It is difficult to convey its greatness in compressed form: Boswell is not a great writer at the sentence level, and all the famous quotes are (hilarious) Johnsonian bon mots. The book succeeds through a cumulative effect.

Johnson was 54 years old when he first met Boswell, and most of his major accomplishments (the poetry, the dictionary, The Rambler) were behind him; his wife had already died; he was already the recipient of a £300 pension from the King; his edition of Shakespeare was almost complete. All in all they spent no more than 400 days together. Boswell had limited material to work with, but what he doesn't capture in fact, he captures in feeling. An entire life is contained in this book: love and friendship, taverns and work, the glory of success and recognition, the depressive bouts of failure and penury, the inevitable tortures of aging and death.

Out of a person, Boswell created a literary personality. His powers of characterization are positively Shakespearean, and his Johnson resembles none other than the bard's greatest creation: Sir John Falstaff. Big, brash, and deeply flawed, but also lovable. He would "laugh like a rhinoceros":

Johnson could not stop his merriment, but continued it all the way till he got without the Temple-gate. He then burst into such a fit of laughter that he appeared to be almost in a convulsion; and in order to support himself, laid hold of one of the posts at the side of the foot pavement, and sent forth peals so loud, that in the silence of the night his voice seemed to resound from Temple-bar to Fleet ditch.

And around Johnson he painted an entire dramatic cast, bringing 18th century London to life: Garrick the great actor, Reynolds the painter, Beauclerk with his banter, Goldsmith with his insecurities. Monboddo and Burke, Henry and Hester Thrale, the blind Mrs Williams and the Jamaican freedman Francis Barber.

Borges (who was also a big fan) finds his parallels not in Shakespeare and Falstaff, but in Cervantes and Don Quixote. He (rather implausibly) suggests that every Quixote needs his Sancho, and "Boswell appears as a despicable character" deliberately to create a contrast.4

And in the 1830s, two brilliant and influential reviews were written by two polar opposites: arch-progressive Thomas Babington Macaulay and radical reactionary Thomas Carlyle. The first thing you'll notice is their sheer magnitude: Macaulay's is 55 pages long, while Carlyle's review in Fraser's Magazine reaches 74 pages!5 And while they both agree that it's a great book and that Boswell was a scoundrel, they have very different theories about what happened.


Never in history, Macaulay says, has there been "so strange a phænomenon as this book". On the one hand he has effusive praise:

Homer is not more decidedly the first of heroic poets, Shakspeare is not more decidedly the first of dramatists, Demosthenes is not more decidedly the first of orators, than Boswell is the first of biographers. He has no second. He has distanced all his competitors so decidedly that it is not worth while to place them.

On the other hand, he spends several paragraphs laying into Boswell with gusto:

He was, if we are to give any credit to his own account or to the united testimony of all who knew him, a man of the meanest and feeblest intellect. [...] He was the laughing-stock of the whole of that brilliant society which has owed to him the greater part of its fame. He was always laying himself at the feet of some eminent man, and begging to be spit upon and trampled upon. [...] Servile and impertinent, shallow and pedantic, a bigot and a sot, bloated with family pride, and eternally blustering about the dignity of a born gentleman, yet stooping to be a talebearer, an eavesdropper, a common butt in the taverns of London.

Macaulay's theory is that while Homer and Shakespeare and all the other greats owe their eminence to their virtues, Boswell is unique in that he owes his success to his vices.

He was a slave, proud of his servitude, a Paul Pry, convinced that his own curiosity and garrulity were virtues, an unsafe companion who never scrupled to repay the most liberal hospitality by the basest violation of confidence, a man without delicacy, without shame, without sense enough to know when he was hurting the feelings of others or when he was exposing himself to derision; and because he was all this, he has, in an important department of literature, immeasurably surpassed such writers as Tacitus, Clarendon, Alfieri, and his own idol Johnson.

Of the talents which ordinarily raise men to eminence as writers, Boswell had absolutely none. There is not in all his books a single remark of his own on literature, politics, religion, or society, which is not either commonplace or absurd. [...] Logic, eloquence, wit, taste, all those things which are generally considered as making a book valuable, were utterly wanting to him. He had, indeed, a quick observation and a retentive memory. These qualities, if he had been a man of sense and virtue, would scarcely of themselves have sufficed to make him conspicuous; but, because he was a dunce, a parasite, and a coxcomb, they have made him immortal.

The work succeeds partly because of its subject: if Johnson had not been so extraordinary, then airing all his dirty laundry would have just made him look bad.

No man, surely, ever published such stories respecting persons whom he professed to love and revere. He would infallibly have made his hero as contemptible as he has made himself, had not his hero really possessed some moral and intellectual qualities of a very high order. The best proof that Johnson was really an extraordinary man is that his character, instead of being degraded, has, on the whole, been decidedly raised by a work in which all his vices and weaknesses are exposed.

And finally, Boswell provided Johnson with a curious form of literary fame:

The reputation of [Johnson's] writings, which he probably expected to be immortal, is every day fading; while those peculiarities of manner and that careless table-talk the memory of which, he probably thought, would die with him, are likely to be remembered as long as the English language is spoken in any quarter of the globe.


Carlyle rates Johnson's biography as the greatest work of the 18th century. In a sublime passage that brings tears to my eyes, he credits the Life with the power of halting the inexorable passage of time:

Rough Samuel and sleek wheedling James were, and are not. [...] The Bottles they drank out of are all broken, the Chairs they sat on all rotted and burnt; the very Knives and Forks they ate with have rusted to the heart, and become brown oxide of iron, and mingled with the indiscriminate clay. All, all has vanished; in every deed and truth, like that baseless fabric of Prospero's air-vision. Of the Mitre Tavern nothing but the bare walls remain there: of London, of England, of the World, nothing but the bare walls remain; and these also decaying (were they of adamant), only slower. The mysterious River of Existence rushes on: a new Billow thereof has arrived, and lashes wildly as ever round the old embankments; but the former Billow with its loud, mad eddyings, where is it? Where! Now this Book of Boswell's, this is precisely a revocation of the edict of Destiny; so that Time shall not utterly, not so soon by several centuries, have dominion over us. A little row of Naphtha-lamps, with its line of Naphtha-light, burns clear and holy through the dead Night of the Past: they who are gone are still here; though hidden they are revealed, though dead they yet speak. There it shines, that little miraculously lamplit Pathway; shedding its feebler and feebler twilight into the boundless dark Oblivion, for all that our Johnson touched has become illuminated for us: on which miraculous little Pathway we can still travel, and see wonders.

Carlyle disagrees completely with Macaulay: it is not because of his vices that Boswell could write this book, but rather because he managed to overcome them. He sees in Boswell a hopeful symbol for humanity as a whole, a victory in the war between the base and the divine in our souls.

In fact, the so copious terrestrial dross that welters chaotically, as the outer sphere of this man's character, does but render for us more remarkable, more touching, the celestial spark of goodness, of light, and Reverence for Wisdom, which dwelt in the interior, and could struggle through such encumbrances, and in some degree illuminate and beautify them.

Boswell's shortcomings were visible: he was "vain, heedless, a babbler". But if that was the whole story, would he really have chosen Johnson? He could have picked more illustrious targets, richer ones, perhaps some powerful statesman or an aristocrat with a distinguished lineage. "Doubtless the man was laughed at, and often heard himself laughed at for his Johnsonism". Boswell must have been attracted to Johnson by nobler motives. And to do that he would have to "hurl mountains of impediment aside" in order to overcome his nature.

The plate-licker and wine-bibber dives into Bolt Court, to sip muddy coffee with a cynical old man, and a sour-tempered blind old woman (feeling the cups, whether they are full, with her finger); and patiently endures contradictions without end; too happy so he may but be allowed to listen and live.

The Life is not great because of Boswell's foolishness, but because of his love and his admiration, an admiration that Macaulay considered a disease. Boswell wrote that in Johnson's company he "felt elevated as if brought into another state of being".

His sneaking sycophancies, his greediness and forwardness, whatever was bestial and earthy in him, are so many blemishes in his Book, which still disturb us in its clearness; wholly hindrances, not helps. Towards Johnson, however, his feeling was not Sycophancy, which is the lowest, but Reverence, which is the highest of human feelings.

On Johnson's personality, Carlyle writes: "seldom, for any man, has the contrast between the ethereal heavenward side of things, and the dark sordid earthward, been more glaring". And this is what Johnson wrote about Falstaff in his Shakespeare commentary:

Falstaff is a character loaded with faults, and with those faults which naturally produce contempt. [...] the man thus corrupt, thus despicable, makes himself necessary to the prince that despises him, by the most pleasing of all qualities, perpetual gaiety, by an unfailing power of exciting laughter, which is the more freely indulged, as his wit is not of the splendid or ambitious kind, but consists in easy escapes and sallies of levity, which make sport but raise no envy.

Johnson obviously enjoyed the comparison to Falstaff, but would it be crazy to also see Boswell in there? The Johnson presented to us in the Life is a man who had to overcome poverty, disease, depression, and a constant fear of death, but never let those things poison his character. Perhaps Boswell crafted the character he wished he could become: Johnson was his Beatrice—a dream, an aspiration, an ideal outside his grasp that nonetheless thrust him toward greatness. Through a process of self-overcoming Boswell wrote a great book on self-overcoming.

Mediocrities Everywhere...I Absolve You

The story of Boswell is basically the plot of Amadeus, with the role of Salieri being played by Macaulay, by Carlyle, by me, and perhaps even by yourself, dear reader. The line between admiration, envy, and resentment is thin, and crossing it is easier when the subject is a scoundrel. But if Bozzy could set aside resentment for genuine reverence, perhaps there is hope for us all. And it would be an error to see in Boswell the Platonic Form of Mankind.

Shaffer and Forman's film portrays Mozart as vulgar, arrogant, a womanizer, bad with money—but, like Bozzy, still somehow quite likable. In one of the best scenes of the film, we see Mozart transform the screeching of his mother-in-law into the Queen of the Night Aria; thus Boswell transformed his embarrassments into literary gold. He may be vulgar, but his productions are not. He may be vulgar, but he is not ordinary.

Perhaps it is in vain that we seek correlations among virtues and talents: perhaps genius is ineffable. Perhaps it's Ramanujans all the way down. You can't even say that genius goes with independence: there's nothing Boswell wanted more than social approval. I won't tire you with clichés about the Margulises and the Musks.

Would Johnson have guessed that he would be the mediocrity, and Bozzy the genius? Would he have felt envy and resentment? What would he say, had he been given the chance to read in Carlyle that Johnson's own writings "are becoming obsolete for this generation; and for some future generation may be valuable chiefly as Prolegomena and expository Scholia to this Johnsoniad of Boswell"?

If you want to read The Life of Johnson, I recommend a second-hand copy of the Everyman's Library edition: cheap, reasonably sized, and the paper & binding are great.

  1. In the very first letter Boswell wrote to Rousseau, he described himself as "a man of singular merit".
  2. They were "rediscovered" in the early 1900s.
  3. While some are quick to dismiss the non-direct parts, I think they're necessary, especially the letters which illuminate a different side of Johnson's character.
  4. Lecture #10 in Professor Borges: A Course on English Literature.
  5. What happened to the novella-length book review? Anyway, many of those pages are taken up by criticism of John Wilson Croker's incompetent editorial efforts.

Links & What I've Been Reading Q3 2020

High Replicability of Newly-Discovered Social-behavioral Findings is Achievable: a replication of 16 papers that followed "optimal practices" finds a high rate of replicability and virtually identical effect sizes as the original studies.

How do you decide what to replicate? This paper attempts to build a model that can be used to pick studies to maximize utility gained from replications.

Guzey on that deworming study, tracks which variables are reported across 5 different drafts of the paper starting in 2011. "But then you find that these variables didn’t move in the right direction. What do you do? Do you have to show these variables? Or can you drop them?"

I've been enjoying the NunoSempre forecasting newsletter, a monthly collection of links on forecasting.

COVID-19 made weather forecasts worse by limiting the meteorological data coming from airplanes.

The 16th paragraph in this piece on the long-term effects of coronavirus mentions that 2 out of 3 people with "long-lasting" COVID-19 symptoms never had COVID to begin with.

An experiment with working 120 hours in a week goes surprisingly well.

Gwern's giant GPT-3 page. The Zizek Navy Seal Copypasta is incredible, as are the poetic imitations.

Ethereum is a Dark Forest. "In the Ethereum mempool, these apex predators take the form of “arbitrage bots.” Arbitrage bots monitor pending transactions and attempt to exploit profitable opportunities created by them."

Tyler Cowen in conversation with Nicholas Bloom, lots of fascinating stuff on innovation and progress. "Just in economics — when I first started in economics, it was standard to do a four-year PhD. It’s now a six-year PhD, plus many of the PhD students have done a pre-doc, so they’ve done an extra two years. We’re taking three or four years longer just to get to the research frontier." Immediately made me think of Scott Alexander's Ars Longa, Vita Brevis.

The Progress Studies for Young Scholars youtube channel has a bunch of interesting interviews, including Cowen, Collison, McCloskey, and Mokyr.

From the promising new Works in Progress magazine, Progress studies: the hard question.

I've written a parser for your Kindle's My Clippings.txt file. It removes duplicates, splits them up by book, and outputs them in convenient formats. Works cross-platform.
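For the curious, the file format itself is simple; here is a minimal sketch of the kind of parsing involved (this is an illustration, not the actual tool): the Kindle appends each highlight as a block — title line, metadata line, blank line, highlight text — terminated by a line of ten `=` signs.

```python
from collections import defaultdict

SEPARATOR = "=========="  # each clipping ends with a line of ten '=' signs

def parse_clippings(text):
    """Group highlights by book and drop exact duplicates (a sketch)."""
    books = defaultdict(list)
    seen = set()
    for entry in text.split(SEPARATOR):
        lines = [line.strip() for line in entry.strip().splitlines()]
        if len(lines) < 3:
            continue  # empty or malformed entry
        title = lines[0].lstrip("\ufeff")  # the file often starts with a BOM
        body = "\n".join(lines[2:]).strip()
        if body and (title, body) not in seen:
            seen.add((title, body))
            books[title].append(body)
    return dict(books)

sample = """\
Life of Johnson (Boswell, James)
- Your Highlight on page 1 | Added on Monday, 1 January 2020

Depend upon it, Sir, when a man knows he is to be hanged, it concentrates his mind wonderfully.
==========
Life of Johnson (Boswell, James)
- Your Highlight on page 1 | Added on Monday, 1 January 2020

Depend upon it, Sir, when a man knows he is to be hanged, it concentrates his mind wonderfully.
==========
"""

books = parse_clippings(sample)
print({title: len(quotes) for title, quotes in books.items()})
# The duplicated highlight is collapsed into a single entry.
```

Duplicates arise because re-highlighting an overlapping passage appends a fresh entry rather than replacing the old one, which is why deduplication is the main feature of any such tool.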

Generative bad handwriting in 280 characters. You can find a lot more of that sort of thing by searching for #つぶやきProcessing on twitter.

A new ZeroHPLovecraft short story, Key Performance Indicators. Black Mirror-esque.

A great skit about Ecclesiastes from Israeli sketch show The Jews Are Coming. Turn on the subs.

And here's some sweet Dutch prog-rock/jazz funk from the 70s.

What I've Been Reading

  • Piranesi by Susanna Clarke. 16 years after Jonathan Strange & Mr Norrell, a new novel from Susanna Clarke! It's short and not particularly ambitious, but I enjoyed it a lot. A tight fantastical mystery that starts out similar to The Library of Babel but then goes off in a different direction.

  • The Poems of T. S. Eliot: the great ones are great, and there's a lot of mediocre stuff in between. Ultimately a bit too grey and resigned and pessimistic for my taste. I got the Faber & Faber hardcover edition and would not recommend it, it's unwieldy and the notes are mostly useless.

  • Antkind by Charlie Kaufman. A typically Kaufmanesque work about a neurotic film critic and his discovery of an astonishing piece of outsider art. Memory, consciousness, time, doubles, etc. Extremely good and laugh-out-loud funny for the first half, but the final 300–400 pages were a boring, incoherent psychedelic smudge.

  • Under the Volcano by Malcolm Lowry. Very similar to another book I read recently, Lawrence Durrell's Alexandria Quartet. I prefer Durrell. Lowry doesn't have the stylistic ability to make the endless internal monologues interesting (as eg Gass does in The Tunnel), and I find the central allegory deeply misguided. Also, it's the kind of book that has a "central allegory".

  • Less than One by Joseph Brodsky. A collection of essays, mostly on Russian poetry. If I knew more about that subject I think I would have enjoyed the book more. The essays on his life in Soviet Russia are good.

  • Science Fictions: Exposing Fraud, Bias, Negligence and Hype in Science by Stuart Ritchie. Very good, esp. if you are not familiar with the replication crisis. Some quibbles about the timing and causes of the problems. Full review here.

  • The Idiot by "Dostoyevsky". Review forthcoming.

  • Borges and His Successors: The Borgesian Impact on Literature and the Arts: a collection of fairly dull essays with little to no insight.

  • Samuel Johnson: Literature, Religion and English Cultural Politics from the Restoration to Romanticism by J.C.D. Clark: a dry but well-researched study on an extraordinarily narrow slice of cultural politics. Not really aimed at a general audience.

  • Dhalgren by Samuel R. Delany. A wild semi-autobiographical semi-post-apocalyptic semi-science fiction monster. It's a 900 page slog, it's puerile, the endless sex scenes (including with minors) are pointless at best, the characters are uninteresting, there's barely any plot, the 70s counterculture stuff is just comical, and stylistically it can't reach the works it's aping. So I can see why some people hate it. But I actually enjoyed it, it has a compelling strangeness to it that is difficult to put into words (or perhaps I was just taken in by all the unresolved plot points?). Its sheer size is a quality in itself, too. Was it worth the effort? Could I recommend it? Probably not.

  • Novum Organum by Francis Bacon. While he did not actually invent the scientific method, his discussion of empiricism, experiments, and induction was clearly a step in that direction. The first part deals with science and empiricism and induction from an abstract perspective, and it feels almost contemporary, like it was written by a time traveling 19th century scientist or something like that. The quarrel between the ancients and the moderns is already in full swing here; Bacon dunks on the Greeks constantly and upbraids people for blindly listening to Aristotle. Question received dogma and popular opinions, he says. He points to inventions like gunpowder and the compass and printing and paper and says that surely these indicate that there's a ton of undiscovered ideas out there; we should go looking for them. He talks about cognitive biases and scientific progress:

    we are laying the foundations not of a sect or of a dogma, but of human progress and empowerment.

    Then you get to the second part and the Middle Ages hit you like a freight train: you suddenly realize this is no contemporary man at all and his conception of how the world works is completely alien. Ideas that to us seem bizarre and just intuitively nonsensical (about gravity, heat, light, biology, etc.) are only common sense to him. He repeats absurdities about worms and flies arising spontaneously out of putrefaction, that light objects are pulled to the heavens while heavy objects are pulled to the earth, and so on. Not just surface-level opinions, but fundamental things that you wouldn't even think someone else could possibly perceive differently.

    You won't learn anything new from Bacon, but it's a fascinating historical document.

  • The Book of Marvels and Travels by John Mandeville. This medieval bestseller (published around 1360) combines elements of travelogue, ethnography, and fantasy. It's unclear how much of it people believed, but there was huge demand for information about far-off lands and marvelous stories. Mostly compiled from other works, it was incredibly popular for centuries. In the age of exploration (Columbus took it with him on his trip) people were shocked when some of the fantastical stories (eg about cannibals) actually turned out to be true. The tricks the author uses to generate verisimilitude are fascinating: he adds small personal touches about people he met, sometimes says that he doesn't know anything about a particular region because he hasn't been there, etc.

What's Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers

I've seen things you people wouldn't believe.

Over the past year, I have skimmed through 2578 social science papers, spending about 2.5 minutes on each one. This was due to my participation in Replication Markets, a part of DARPA's SCORE program, whose goal is to evaluate the reliability of social science research. 3000 studies were split up into 10 rounds of ~300 studies each. Starting in August 2019, each round consisted of one week of surveys followed by two weeks of market trading. I finished in first place in 3 out of 10 survey rounds and 6 out of 10 market rounds. In total, about $200,000 in prize money will be awarded.

The studies were sourced from all social science disciplines (economics, psychology, sociology, management, etc.) and were published between 2009 and 2018 (in other words, most of the sample came from the post-replication crisis era).

The average replication probability in the market was 54%; while the replication results are not out yet (250 of the 3000 papers will be replicated), previous experiments have shown that prediction markets work well.1

This is what the distribution of my own predictions looks like:2

My average forecast was in line with the market. A quarter of the claims were above 76%. And a quarter of them were below 33%: we're talking hundreds upon hundreds of terrible papers, and this is just a tiny sample of the annual academic production.

Criticizing bad science from an abstract, 10000-foot view is pleasant: you hear about some stuff that doesn't replicate, some methodologies that seem a bit silly. "They should improve their methods", "p-hacking is bad", "we must change the incentives", you declare Zeuslike from your throne in the clouds, and then go on with your day.

But actually diving into the sea of trash that is social science gives you a more tangible perspective, a more visceral revulsion, and perhaps even a sense of Lovecraftian awe at the sheer magnitude of it all: a vast landfill—a great agglomeration of garbage extending as far as the eye can see, effluvious waves crashing and throwing up a foul foam of p=0.049 papers. As you walk up to the diving platform, the deformed attendant hands you a pair of flippers. Noticing your reticence, he gives a subtle nod as if to say: "come on then, jump in".

They Know What They're Doing

Prediction markets work well because predicting replication is easy.3 There's no need for a deep dive into the statistical methodology or a rigorous examination of the data, no need to scrutinize esoteric theories for subtle errors—these papers have obvious, surface-level problems.

There's a popular belief that weak studies are the result of unconscious biases leading researchers down a "garden of forking paths". Given enough "researcher degrees of freedom" even the most punctilious investigator can be misled.

I find this belief impossible to accept. The brain is a credulous piece of meat4 but there are limits to self-delusion. Most of them have to know. It's understandable to be led down the garden of forking paths while producing the research, but when the paper is done and you give it a final read-over you will surely notice that all you have is an n=23, p=0.049 three-way interaction effect (one of dozens you tested, and with no multiple testing adjustments of course). At that point it takes more than a subtle unconscious bias to believe you have found something real. And even if the authors really are misled by the forking paths, what are the editors and reviewers doing? Are we supposed to believe they are all gullible rubes?

People within the academy don't want to rock the boat. They still have to attend the conferences, secure the grants, publish in the journals, show up at the faculty meetings: all these things depend on their peers. When criticizing bad research it's easier for everyone to blame the forking paths rather than the person walking them. No need for uncomfortable unpleasantries. The fraudster can admit, without much of a hit to their reputation, that indeed they were misled by that dastardly garden, really through no fault of their own whatsoever, at which point their colleagues on twitter will applaud and say "ah, good on you, you handled this tough situation with such exquisite virtue, this is how progress happens! hip, hip, hurrah!" What a ridiculous charade.

Even when they do accuse someone of wrongdoing they use terms like "Questionable Research Practices" (QRP). How about Questionable Euphemism Practices?

  • When they measure a dozen things and only pick their outcome variable at the end, that's not the garden of forking paths but the greenhouse of fraud.
  • When they do a correlational analysis but give "policy implications" as if they were doing a causal one, they're not walking around the garden, they're doing the landscaping of forking paths.
  • When they take a continuous variable and arbitrarily bin it to do subgroup analysis or when they add an ad hoc quadratic term to their regression, they're...fertilizing the garden of forking paths? (Look, there's only so many horticultural metaphors, ok?)

The bottom line is this: if a random schmuck with zero domain expertise like me can predict what will replicate, then so can scientists who have spent half their lives studying this stuff. But they sure don't act like it.

...or Maybe They Don't?

The horror! The horror!

Check out this crazy chart from Yang et al. (2020):

Yes, you're reading that right: studies that replicate are cited at the same rate as studies that do not. Publishing your own weak papers is one thing, but citing other people's weak papers? This seemed implausible, so I decided to do my own analysis with a sample of 250 articles from the Replication Markets project. The correlation between citations per year and (market-estimated) probability of replication was -0.05!
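The calculation itself is trivial. Here's a sketch in Python with made-up stand-in data (the RM dataset isn't reproduced here, so the numbers are purely illustrative):

```python
import numpy as np

def pearson_r(x, y):
    """Plain Pearson correlation, which is all the citations-vs-replication
    analysis described above requires."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Made-up stand-in data: if citation counts carry no information about
# replication, r hovers near zero, roughly what the real sample showed.
rng = np.random.default_rng(0)
cites = rng.exponential(scale=10, size=250)   # citations per year
p_rep = rng.uniform(0.1, 0.9, size=250)       # market-estimated P(replicate)
print(round(pearson_r(cites, p_rep), 2))
```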

You might hypothesize that the citations of non-replicating papers are negative, but negative citations are extremely rare.5 One study puts the rate at 2.4%. Astonishingly, even after retraction the vast majority of citations are positive, and those positive citations continue for decades after retraction.6

As in all affairs of man, it once again comes down to Hanlon's Razor. Either:

  1. Malice: they know which results are likely false but cite them anyway; or
  2. Stupidity: they can't tell which papers will replicate even though it's quite easy.

Accepting the first option would require a level of cynicism that even I struggle to muster. But the alternative doesn't seem much better: how can they not know? I, an idiot with no relevant credentials or knowledge, can fairly accurately tell good research from bad, but all the tenured experts cannot? How can they not tell which papers are retracted?

I think the most plausible explanation is that scientists don't read the papers they cite, which I suppose involves both malice and stupidity.7 Gwern has a nice write-up on this question citing some ingenious analyses based on the proliferation of misprints: "Simkin & Roychowdhury venture a guess that as many as 80% of authors citing a paper have not actually read the original". Once a paper is out there nobody bothers to check it, even though they know there's a 50-50 chance it's false!

Whatever the explanation might be, the fact is that the academic system does not allocate citations to true claims.8 This is bad not only for the direct effect of basing further research on false results, but also because it distorts the incentives scientists face. If nobody cited weak studies, we wouldn't have so many of them. Rewarding impact without regard for the truth inevitably leads to disaster.

There Are No Journals With Strict Quality Standards

Naïvely you might expect that the top-ranking journals would be full of studies that are highly likely to replicate, and the low-ranking journals would be full of p<0.1 studies based on five undergraduates. Not so! Like citations, journal status and quality are not very well correlated: there is no association between statistical power and impact factor, and journals with higher impact factor have more papers with erroneous p-values.

This pattern is repeated in the Replication Markets data. As you can see in the chart below, there's no relationship between h-index (a measure of impact) and average expected replication rates. There's also no relationship between h-index and expected replication within fields.

Even the crème de la crème of economics journals barely manage a ⅔ expected replication rate. 1 in 5 articles in QJE scores below 50%, and this is a journal that accepts just 1 out of every 30 submissions. Perhaps this (partially) explains why scientists are undiscerning: journal reputation acts as a cloak for bad research. It would be fun to test this idea empirically.

Here you can see the distribution of replication estimates for every journal in the RM sample:

As far as I can tell, for most journals the question of whether the results in a paper are true is a matter of secondary importance. If we model journals as wanting to maximize "impact", then this is hardly surprising: as we saw above, citation counts are unrelated to truth. If scientists were more careful about what they cited, then journals would in turn be more careful about what they publish.

Things Are Not Getting Better

Before we got to see any of the actual Replication Markets studies, we voted on the expected replication rates by year. Gordon et al. (2020) has that data: replication rates were expected to steadily increase from 43% in 2009/2010 to 55% in 2017/2018.

This is what the average predictions looked like after seeing the papers: from 53.4% in 2009 to 55.8% in 2018 (difference not statistically significant; black dots are means).

I frequently encounter the notion that after the replication crisis hit there was some sort of great improvement in the social sciences, that people wouldn't even dream of publishing studies based on 23 undergraduates any more (I actually saw plenty of those), etc. Stuart Ritchie's new book praises psychologists for developing "systematic ways to address" the flaws in their discipline. In reality there has been no discernible improvement.

The results aren't out yet, so it's possible that the studies have improved in subtle ways which the forecasters have not been able to detect. Perhaps the actual replication rates will be higher. But I doubt it. Looking at the distribution of p-values over time, there's a small increase in the proportion of p<.001 results, but nothing like the huge improvement that was expected.

Everyone is Complicit

Authors are just one small cog in the vast machine of scientific production. For this stuff to be financed, generated, published, and eventually rewarded requires the complicity of funding agencies, journal editors, peer reviewers, and hiring/tenure committees. Given the current structure of the machine, ultimately the funding agencies are to blame.9 But "I was just following the incentives" only goes so far. Editors and reviewers don't actually need to accept these blatantly bad papers.

Journals and universities certainly can't blame the incentives when they stand behind fraudsters to the bitter end. Paolo Macchiarini "left a trail of dead patients" but was protected for years by his university. Andrew Wakefield's famously fraudulent autism-MMR study took 12 years to retract. Even when the author of a paper admits the results were entirely based on an error, journals still won't retract.

Elisabeth Bik documents her attempts to report fraud to journals. It looks like this:

The Editor in Chief of Neuroscience Letters [Yale's Stephen G. Waxman] never replied to my email. The APJTM journal had a new publisher, so I wrote to both current Editors in Chief, but they never replied to my email.

Two papers from this set had been published in Wiley journals, Gerodontology and J Periodontology. The EiC of the Journal of Periodontology never replied to my email. None of the four Associate Editors of that journal replied to my email either. The EiC of Gerodontology never replied to my email.

Even when they do take action, journals will often let scientists "correct" faked figures instead of retracting the paper! The rate of retraction is about 0.04%; it ought to be much higher.

And even after being caught for outright fraud, about half of the offenders are allowed to keep working: they "have received over $123 million in federal funding for their post-misconduct research efforts".

Just Because a Paper Replicates Doesn't Mean it's Good

First: a replication of a badly designed study is still badly designed. Suppose you are a social scientist, and you notice that wet pavements tend to be related to umbrella usage. You do a little study and find the correlation is bulletproof. You publish the paper and try to sneak in some causal language when the editors/reviewers aren't paying attention. Rain is never even mentioned. Of course if someone repeats your study, they will get a significant result every time. This may sound absurd, but it describes a large proportion of the papers that successfully replicate.

Economists and education researchers tend to be relatively good with this stuff, but as far as I can tell most social scientists go through 4 years of undergrad and 4-6 years of PhD studies without ever encountering ideas like "identification strategy", "model misspecification", "omitted variable", "reverse causality", or "third-cause". Or maybe they know and deliberately publish crap. Fields like nutrition and epidemiology are in an even worse state, but let's not get into that right now.

"But Alvaro, correlational studies can be usef-" Spare me.

Second: the choice of claim for replication. For some papers it's clear (eg math educational intervention → math scores), but other papers make dozens of different claims which are all equally important. Sometimes the Replication Markets organisers picked an uncontroversial claim from a paper whose central experiment was actually highly questionable. In this way a study can get the "successfully replicates" label without its most contentious claim being tested.

Third: effect size. Should we interpret claims in social science as being about the magnitude of an effect, or only about its direction? If the original study says an intervention raises math scores by .5 standard deviations and the replication finds that the effect is .2 standard deviations (though still significant), that is considered a success that vindicates the original study! This is one area in which we absolutely have to abandon the binary replicates/doesn't replicate approach and start thinking more like Bayesians.

Fourth: external validity. A replicated lab experiment is still a lab experiment. While some replications try to address aspects of external validity (such as generalizability across different cultures), the question of whether these effects are relevant in the real world is generally not addressed.

Fifth: triviality. A lot of the papers in the 85%+ chance-to-replicate range are just really obvious. "Homeless students have lower test scores", "parent wealth predicts their children's wealth", that sort of thing. These are not worthless, but they're also not really expanding the frontiers of science.

So: while about half the papers will replicate, I would estimate that only half of those are actually worthwhile.

Lack of Theory

The majority of journal articles are almost completely atheoretical. Even if all the statistical, p-hacking, publication bias, etc. issues were fixed, we'd still be left with a ton of ad-hoc hypotheses based, at best, on (WEIRD) folk intuitions. But how can science advance if there's no theoretical grounding, nothing that can be refuted or refined? A pile of "facts" does not a progressive scientific field make.

Michael Muthukrishna and the superhuman Joe Henrich have written a paper called A Problem in Theory which covers the issue better than I ever could. I highly recommend checking it out.

Rather than building up principles that flow from overarching theoretical frameworks, psychology textbooks are largely a potpourri of disconnected empirical findings.

There's Probably a Ton of Uncaught Frauds

This is a fairly lengthy topic, so I made a separate post for it. tl;dr: I believe about 1% of falsified/fabricated papers are retracted, but overall they represent a very small portion of non-replicating research.

Power: Not That Bad

[Warning: technical section. Skip ahead if bored.]

A quick refresher on hypothesis testing:

  • α, the significance level, is the probability of a false positive.
  • β, or type II error, is the probability of a false negative.
  • Power is (1-β): if a study has 90% power, there's a 90% chance of successfully detecting the effect being studied. Power increases with sample size and effect size.
  • The probability that a significant p-value indicates a true effect is not 1-α. It is called the positive predictive value (PPV), and is calculated as follows: PPV = \frac{prior \cdot power}{prior \cdot power + (1-prior) \cdot \alpha}
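In code, the formula is a one-liner; a minimal sketch (the example values match the calculation later in this section):

```python
def ppv(prior, power, alpha):
    """Positive predictive value: P(effect is real | p < alpha)."""
    true_pos = prior * power           # real effects that get detected
    false_pos = (1 - prior) * alpha    # null effects crossing the threshold
    return true_pos / (true_pos + false_pos)

# e.g. with a 25% prior, 60% power, and alpha = 5%:
print(round(ppv(prior=0.25, power=0.60, alpha=0.05), 3))  # 0.8
```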

This great diagram by Felix Schönbrodt gives the intuition behind PPV:

This model makes the assumption that effects can be neatly split into two categories: those that are "real" and those that are not. But is this accurate? In the opposite extreme you have the "crud factor": everything is correlated so if your sample is big enough you will always find a real effect.10 As Bakan puts it: "there is really no good reason to expect the null hypothesis to be true in any population". If you look at the universe of educational interventions, for example, are they going to be neatly split into two groups of "real" and "fake" or is it going to be one continuous distribution? What does "false positive" even mean if there are no "fake" effects, unless it refers purely to the direction of the effect? Perhaps the crud factor is wrong, at least when it comes to causal effects? Perhaps the pragmatic solution is to declare that all effects with, say, d<.1 are fake and the rest are real? Or maybe we should just go full Bayesian?

Anyway, let's pretend the previous paragraph never happened. Where do we find the prior? There are a few different approaches, and they're all problematic.11

The exact number doesn't really matter that much (there's nothing we can do about it), so I'm going to go ahead and use a prior of 25% for the calculations below. The main takeaways don't change with a different prior value.

Now the only thing we're missing is the power of the typical social science study. To determine that we need to know 1) sample sizes (easy), and 2) the effect size of true effects (not so easy).14 I'm going to use the results of extremely high-powered, large-scale replication efforts:

Surprisingly large, right? We can then use the power estimates in Szucs & Ioannidis (2017): they give an average power of .49 for "medium effects" (d=.5) and .71 for "large effects" (d=.8). Let's be conservative and split the difference.

With a prior of 25%, power of 60%, and α=5%, PPV is equal to 80%. Assuming no fraud and no QRPs, 20% of positive findings will be false.

These averages hide a lot of heterogeneity: it's well-established that studies of large effects are adequately powered whereas studies of small effects are underpowered, so the PPV is going to be smaller for small effects. There are also large differences depending on the field you're looking at. The lower the power the bigger the gains to be had from increasing sample sizes.

This is what PPV looks like for the full range of prior/power values, with α=5%:

At the current prior/power levels, PPV is more sensitive to the prior: we can only squeeze small gains out of increasing power. That's a bit of a problem given the fact that increasing power is relatively easy, whereas increasing the chance that the effect you're investigating actually exists is tricky, if not impossible. Ultimately scientists want to discover surprising results—in other words, results with a low prior.

I made a little widget so you can play around with the values:

[Interactive widget: true/false positives and negatives for adjustable prior, power, and α]

Assuming a 25% prior, increasing power from 60% to 90% would require more than twice the sample size and would only increase PPV by 5.7 percentage points. It's something, but it's no panacea. However, there is something else we could do: sample size is a budget, and we can allocate that budget either to higher power or to a lower significance cutoff. Lowering alpha is far more effective at reducing the false discovery rate.15
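To see the tradeoff concretely, reusing the PPV formula from the refresher above with the same assumed 25% prior:

```python
def ppv(prior, power, alpha):
    """P(effect is real | significant result) -- the PPV formula above."""
    tp = prior * power
    fp = (1 - prior) * alpha
    return tp / (tp + fp)

prior = 0.25
baseline    = ppv(prior, power=0.60, alpha=0.05)   # 0.800 -> FDR 20%
more_power  = ppv(prior, power=0.90, alpha=0.05)   # 0.857 -> FDR ~14%
lower_alpha = ppv(prior, power=0.60, alpha=0.005)  # 0.976 -> FDR ~2.4%
```

Raising power from 60% to 90% buys 5.7 percentage points of PPV; dropping α from .05 to .005 at the same power buys nearly 18.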

Let's take a look at 4 different power/alpha scenarios, assuming a 25% prior and d=0.5 effect size.16 The required sample sizes are for a one-sided t-test.

[Chart: false discovery rate and required sample size for each power/α scenario]
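For reference, the required sample sizes can be approximated with the usual normal-approximation formula for a one-sided two-sample t-test; the specific power/α grid below is my assumption about what the four scenarios look like, based on the surrounding discussion:

```python
import math
from scipy.stats import norm

def n_per_group(d, alpha, power):
    """Per-group n for a one-sided two-sample t-test (normal approximation)."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return math.ceil(2 * (z / d) ** 2)

# d = 0.5; the power/alpha grid is my assumption about the four scenarios.
for power in (0.60, 0.90):
    for alpha in (0.05, 0.005):
        print(f"power={power:.0%} alpha={alpha}: n={n_per_group(0.5, alpha, power)} per group")
# power=60% at alpha=.05 needs ~29/group; pushing power to 90% at the same
# alpha needs ~69/group -- "more than twice the sample size", as noted above.
```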

To sum things up: power levels are decent on average and improving them wouldn't do much. Power increases should be focused on studies of small effects. Lowering the significance cutoff achieves much more for the same increase in sample size.

Field of Dreams

Before we got to see any of the actual Replication Markets studies, we voted on the expected replication rates by field. Gordon et al. (2020) has that data:

This is what the predictions looked like after seeing the papers:

Economics is Predictably Good

Economics topped the charts in terms of expectations, and it was by far the strongest field. There are certainly large improvements to be made—a 2/3 replication rate is not something to be proud of. But reading their papers you get the sense that at least they're trying, which is more than can be said of some other fields. 6 of the top 10 economics journals participated, and they did quite well: QJE is the behemoth of the field and it managed to finish very close to the top. A unique weakness of economics is the frequent use of absurd instrumental variables. I doubt there's anyone (including the authors) who is convinced by that stuff, so let's cut it out.

EvoPsych is Surprisingly Bad

You were supposed to destroy the Sith, not join them!

Going into this, my view of evolutionary psychology was shaped by people like Cosmides, Tooby, DeVore, Boehm, and so on. You know, evolutionary psychology! But the studies I skimmed from evopsych journals were mostly just weak social psychology papers with an infinitesimally thin layer of evolutionary paint on top. Few people seem to take the "evolutionary" aspect really seriously.

Also, underdetermination problems are particularly difficult in this field, and nobody seems to care.

Education is Surprisingly Good

Education was expected to be the worst field, but it ended up being almost as strong as economics. When it came to interventions there were lots of RCTs with fairly large samples, which made their claims believable. I also got the sense that p-hacking is more difficult in education: there's usually only one math score which measures the impact of a math intervention, there's no early stopping, etc.

However, many of the top-scoring papers were trivial (eg "there are race differences in science scores"), and the field has a unique problem which is not addressed by replication: educational intervention effects are notorious for fading out after a few years. If the replications waited 5 years to follow up on the students, things would look much, much worse.

Demography is Good

Who even knew these people existed? Yet it seems they do (relatively) competent work. googles some of the authors Ah, they're economists. Well.

Criminology Should Just Be Scrapped

If you thought social psychology was bad, you ain't seen nothin' yet. Other fields have a mix of good and bad papers, but criminology is a shocking outlier. Almost every single paper I read was awful. Even among the papers that are highly likely to replicate, it's de rigueur to confuse correlation for causation.

If we compare criminology to, say, education, the headline replication rates look similar-ish. But the designs used in education (typically RCT, diff-in-diff, or regression discontinuity) are at least in principle capable of detecting the effects they're looking for. That's not really the case for criminology. Perhaps this is an effect of the (small number of) specific journals selected for RM, and there is more rigorous work published elsewhere.

There's no doubt in my mind that the net effect of criminology as a discipline is negative: to the extent that public policy is guided by these people, it is worse. Just shameful.


Marketing & Management

In their current state these are a bit of a joke, but I don't think there's anything fundamentally wrong with them. Sure, some of the variables they use are a bit fluffy, and of course there's a lack of theory. But the things they study are a good fit for RCTs, and if they just quintupled their sample sizes they would see massive improvements.

Cognitive Psychology

Much worse than expected. Cognitive psychology generally has a reputation as one of the more solid subdisciplines of psychology, and it has done well in previous replication projects, so I'm not sure what went wrong here. It's only 50 papers, all from the same journal, so perhaps it's simply an unrepresentative sample.

Social Psychology

More or less as expected. All the silly stuff you've heard about is still going on.

Limited Political Hackery

Some of the most highly publicized social science controversies of the last decade happened at the intersection between political activism and low scientific standards: the implicit association test,17 stereotype threat, racial resentment, etc. I thought these were representative of a wider phenomenon, but in reality they are exceptions. The vast majority of work is done in good faith.

While blatant activism is rare, there is a more subtle background ideological influence which affects the assumptions scientists make, the types of questions they ask, and how they go about testing them. It's difficult to say how things would be different under the counterfactual of a more politically balanced professoriate, though.

Interaction Effects Bad

A paper whose main finding is an interaction effect is about 10 percentage points less likely to replicate. Their usage is not inherently wrong; sometimes it's theoretically justified. But all too often you'll see blatant fishing expeditions with a dozen double and triple ad hoc interactions thrown into the regression. They make it easy to do naughty things and tend to be underpowered.
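One way to see why: in a balanced 2×2 design, the interaction contrast has twice the standard error of a main effect estimated from the same data, so matching its power requires roughly 4× the sample. A quick simulation sketch (mine, not from any of the papers discussed):

```python
import numpy as np

rng = np.random.default_rng(1)
n_cell, reps = 100, 10000
# Pure-noise 2x2 design: cell means for four cells of n_cell observations each.
m = rng.standard_normal((reps, 4, n_cell)).mean(axis=2)
m11, m12, m21, m22 = m[:, 0], m[:, 1], m[:, 2], m[:, 3]

main_effect = (m11 + m12) / 2 - (m21 + m22) / 2  # row main effect
interaction = (m11 - m12) - (m21 - m22)          # difference-in-differences

# The interaction estimate is about twice as noisy as the main effect
# (SD ~0.2 vs ~0.1 here), so equal power needs roughly 4x the sample.
print(main_effect.std(), interaction.std())
```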

Nothing New Under the Sun

All is mere breath, and herding the wind.

The replication crisis did not begin in 2010, it began in the 1950s. All the things I've written above have been written before, by respected and influential scientists. They made no difference whatsoever. Let's take a stroll through the museum of metascience.

Sterling (1959) analyzed psychology articles published in 1955-56 and noted that 97% of them rejected their null hypothesis. He found evidence of a huge publication bias, and a serious problem with false positives which was compounded by the fact that results are "seldom verified by independent replication".

Nunnally (1960) noted various problems with null hypothesis testing, underpowered studies, over-reliance on student samples (it doesn't take Joe Henrich to notice that using Western undergrads for every experiment might be a bad idea), and much more. The problem (or excuse) of publish-or-perish, which some portray as a recent development, was already in place by this time.18

The "reprint race" in our universities induces us to publish hastily-done, small studies and to be content with inexact estimates of relationships.

Jacob Cohen (of Cohen's d fame) in a 1962 study analyzed the statistical power of 70 psychology papers: he found that underpowered studies were a huge problem, especially for those investigating small effects. Successive studies by Sedlmeier & Gigerenzer in 1989 and Szucs & Ioannidis in 2017 found no improvement in power.

If we then accept the diagnosis of general weakness of the studies, what treatment can be prescribed? Formally, at least, the answer is simple: increase sample sizes.

Paul Meehl (1967) is highly insightful on problems with null hypothesis testing in the social sciences, the "crud factor", lack of theory, etc. Meehl (1970) brilliantly skewers the erroneous (and still common) tactic of automatically controlling for "confounders" in observational designs without understanding the causal relations between the variables. Meehl (1990) is downright brutal: he highlights a series of issues which, he argues, make psychological theories "uninterpretable". He covers low standards, pressure to publish, low power, low prior probabilities, and so on.

I am prepared to argue that a tremendous amount of taxpayer money goes down the drain in research that pseudotests theories in soft psychology and that it would be a material social advance as well as a reduction in what Lakatos has called “intellectual pollution” if we would quit engaging in this feckless enterprise.

Rosenthal (1979) covers publication bias and the problems it poses for meta-analyses: "only a few studies filed away could change the combined significant result to a nonsignificant one". Cole, Cole & Simon (1981) present experimental evidence on the evaluation of NSF grant proposals: they find that luck plays a huge factor as there is little agreement between reviewers.

I could keep going to the present day with the work of Goodman, Gelman, Nosek, and many others. There are many within the academy who are actively working on these issues: the CASBS Group on Best Practices in Science, the Meta-Research Innovation Center at Stanford, the Peer Review Congress, the Center for Open Science. If you click those links you will find a ton of papers on metascientific issues. But there seems to be a gap between awareness of the problem and implementing policy to fix it. You've got tons of people doing all this research and trying to repair the broken scientific process, while at the same time journal editors won't even retract blatantly fraudulent research.

There is even a history of government involvement. In the 70s there were battles in Congress over questionable NSF grants, and in the 80s Congress (led by Al Gore) was concerned about scientific integrity, which eventually led to the establishment of the Office of Scientific Integrity. (It then took the federal government another 11 years to come up with a decent definition of scientific misconduct.) After a couple of embarrassing high-profile prosecutorial failures they more or less gave up, but they still exist today and prosecute about a dozen people per year.

Generations of psychologists have come and gone and nothing has been done. The only difference is that today we have a better sense of the scale of the problem. The one ray of hope is that at least we have started doing a few replications, but I don't see that fundamentally changing things: replications reveal false positives, but they do nothing to prevent those false positives from being published in the first place.

What To Do

The reason nothing has been done since the 50s, despite everyone knowing about the problems, is simple: bad incentives. The best cases for government intervention are collective action problems: situations where the incentives for each actor cause suboptimal outcomes for the group as a whole, and it's difficult to coordinate bottom-up solutions. In this case the negative effects are not confined to academia, but spill over into society as a whole when these false results are used to inform business and policy.

Nobody actually benefits from the present state of affairs, but you can't ask isolated individuals to sacrifice their careers for the "greater good": the only viable solutions are top-down, which means either the granting agencies or Congress (or, as Scott Alexander has suggested, a Science Czar). You need a power that sits above the system and has its own incentives in order: this approach has already had success with requirements for pre-registration and publication of clinical trials. Right now I believe the most valuable activity in metascience is not replication or open science initiatives but political lobbying.19

  • Earmark 60% of funding for registered reports (ie accepted for publication based on the preregistered design only, not results). For some types of work this isn't feasible, but for ¾ of the papers I skimmed it's possible. In one fell swoop, p-hacking and publication bias would be virtually eliminated.20
  • Earmark 10% of funding for replications. When the majority of publications are registered reports, replications will be far less valuable than they are today. However, intelligently targeted replications still need to happen.
  • Earmark 1% of funding for progress studies. Including metascientific research that can be used to develop a serious science policy in the future.
  • Increase sample sizes and lower the significance threshold to .005. This one needs to be targeted: studies of small effects probably need to quadruple their sample sizes in order to get their power to reasonable levels. The median study would only need 2x or so. Lowering alpha is generally preferable to increasing power. "But Alvaro, doesn't that mean that fewer grants would be funded?" Yes.
  • Ignore citation counts. Given that citations are unrelated to (easily-predictable) replicability, let alone any subtler quality aspects, their use as an evaluative tool should stop immediately.
  • Open data, enforced by the NSF/NIH. There are problems with privacy but I would be tempted to go as far as possible with this. Open data helps detect fraud. And let's have everyone share their code, too—anything that makes replication/reproduction easier is a step in the right direction.
  • Financial incentives for universities and journals to police fraud. It's not easy to structure this well because on the one hand you want to incentivize them to minimize the frauds published, but on the other hand you want to maximize the frauds being caught. Beware Goodhart's law!
  • Why not do away with the journal system altogether? The NSF could run its own centralized, open website; grants would require publication there. Journals are objectively not doing their job as gatekeepers of quality or truth, so what even is a journal? A combination of taxonomy and reputation. The former is better solved by a simple tag system, and the latter is actually misleading. Peer review is unpaid work anyway, it could continue as is. Attach a replication prediction market (with the estimated probability displayed in gargantuan neon-red font right next to the paper title) and you're golden. Without the crutch of "high ranked journals" maybe we could move to better ways of evaluating scientific output. No more editors refusing to publish replications. You can't shift the incentives: academics want to publish in "high-impact" journals, and journals want to selectively publish "high-impact" research. So just make it impossible. Plus as a bonus side-effect this would finally sink Elsevier.
  • Have authors bet on replication of their research. Give them fixed odds, say 1:4—if it's good work, it's +EV for them. This sounds a bit distasteful, so we could structure the same cashflows as a "bonus grant" from the NSF when a paper you wrote replicates successfully.22
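The sample-size arithmetic in the fourth bullet can be sketched with the usual normal approximation for a two-sample comparison of means (a back-of-the-envelope sketch, not a full power analysis; the helper name is mine):

```python
from statistics import NormalDist

def n_per_group(d, alpha, power):
    """Approximate per-group n for a two-sample test of means.

    d: standardized effect size (Cohen's d)
    alpha: two-sided significance threshold
    power: desired power (1 - beta)
    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return 2 * ((z_alpha + z_beta) / d) ** 2

# A small effect (d = 0.2) at 80% power:
n_05 = n_per_group(0.2, 0.05, 0.8)    # ~392 per group at alpha = .05
n_005 = n_per_group(0.2, 0.005, 0.8)  # ~666 per group at alpha = .005
print(round(n_05), round(n_005), round(n_005 / n_05, 2))
```

Moving alpha from .05 to .005 alone costs about 1.7x the sample for a small effect, which is in the same ballpark as the rough multipliers in the bullet.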

And a couple of points that individuals can implement today:

  • Just stop citing bad research, I shouldn't need to tell you this, jesus christ what the fuck is wrong with you people.
  • Read the papers you cite. Or at least make your grad students do it for you. It doesn't need to be exhaustive: the abstract, a quick look at the descriptive stats, a good look at the table with the main regression results, and then a skim of the conclusions. Maybe a glance at the methodology if they're doing something unusual. It won't take more than a couple of minutes. And you owe it not only to SCIENCE!, but also to yourself: the ability to discriminate between what is real and what is not is rather useful if you want to produce good research.23
  • When doing peer review, reject claims that are likely to be false. The base replication rate for studies with p>.001 is below 50%. When reviewing a paper whose central claim has a p-value above that, you should recommend against publication unless the paper is exceptional (good methodology, high prior likelihood, etc.)24 If we're going to have publication bias, at least let that be a bias for true positives. Remember to subtract another 10 percentage points for interaction effects. You don't need to be complicit in the publication of false claims.
  • Stop assuming good faith. I'm not saying every academic interaction should be hostile and adversarial, but the good guys are behaving like dodos right now and the predators are running wild.

...My Only Friend, The End

The first draft of this post had a section titled "Some of My Favorites", where I listed the silliest studies in the sample. But I removed it because I don't want to give the impression that the problem lies with a few comically bad papers in the far left tail of the distribution. The real problem is the median.

It is difficult to convey just how low the standards are. The marginal researcher is a hack and the marginal paper should not exist. There's a general lack of seriousness hanging over everything—if an undergrad cites a retracted paper in an essay, whatever; but if this is your life's work, surely you ought to treat the matter with some care and respect.

Why is the Replication Markets project funded by the Department of Defense? If you look at the NSF's 2019 Performance Highlights, you'll find items such as "Foster a culture of inclusion through change management efforts" (Status: "Achieved") and "Inform applicants whether their proposals have been declined or recommended for funding in a timely manner" (Status: "Not Achieved"). Pusillanimous reports repeat tired clichés about "training", "transparency", and a "culture of openness" while downplaying the scale of the problem and ignoring the incentives. No serious actions have followed from their recommendations.

It's not that they're trying and failing—they appear to be completely oblivious. We're talking about an organization with an 8 billion dollar budget that is responsible for a huge part of social science funding, and they can't manage to inform people that their grant was declined! These are the people we must depend on to fix everything.

When it comes to giant bureaucracies it can be difficult to know where (if anywhere) the actual power lies. But a good start would be at the top: NSF director Sethuraman Panchanathan, SES division director Daniel L. Goroff, NIH director Francis S. Collins, and the members of the National Science Board. The broken incentives of the academy did not appear out of nowhere, they are the result of grant agency policies. Scientists and the organizations that represent them (like the AEA and APA) should be putting pressure on them to fix this ridiculous situation.

The importance of metascience is inversely proportional to how well normal science is working, and right now it could use some improvement. The federal government spends about $100b per year on research, but we lack a systematic understanding of scientific progress, we lack insight into the forces that underlie the upward trajectory of our civilization. Let's take 1% of that money and invest it wisely so that the other 99% will not be pointlessly wasted. Let's invest it in a robust understanding of science, let's invest it in progress studies, let's invest it in—the future.

Thanks to Alexey Guzey and Dormin for their feedback. And thanks to the people at SCORE and the Replication Markets team for letting me use their data and for running this unparalleled program.

  1. Dreber et al. (2015), Using prediction markets to estimate the reproducibility of scientific research.
    Camerer et al. (2018), Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015.
  2. The distribution is bimodal because of the way p-values are typically reported: there's a huge difference between p<.01 and p<.001. If actual p-values were reported instead of cutoffs, the distribution would be unimodal.
  3. Even laypeople are half-decent at it.
  4. Ludwik Fleck has an amusing bit on the development of anatomy: "Simple lack of 'direct contact with nature' during experimental dissection cannot explain the frequency of the phrase "which becomes visible during autopsy" often accompanying what to us seem the most absurd assertions."
  5. Another possible explanation is that importance is inversely related to replication probability. In my experience that is not the case, however. If anything it's the opposite: important effects tend to be large effects, and large effects tend to replicate. In general, any "conditioning on a collider"-type explanation doesn't work here because these citations also continue post-retraction.
  6. Some more:
  7. Though I must admit that after reading the papers myself I understand why they would shy away from the task.
  8. I can tell you what is rewarded with citations though: papers in which the authors find support for their hypothesis.
  9. Perhaps I don't understand the situation at places like the NSF or the ESRC but the problem seems to be incompetence (or a broken bureaucracy?) rather than misaligned incentives.
  10. Theoretically there's the possibility of overpowered studies being a problem. Meehl (1967) argues that 1) everything in psychology is correlated (the "crud factor"), and 2) theories only make directional predictions (as opposed to point predictions in eg physics). So as power increases the probability of finding a significant result for a directional prediction approaches 50% regardless of what you're studying.
  11. In medicine there are plenty of cohort-based publication bias analyses, but I don't think we can generalize from those to the social sciences.
  12. But RRs are probably not representative of the literature, so this is an overestimate. And who knows how many unpublished pilot studies are behind every RR?
  13. Dreber et al. (2015) use prediction market probabilities and work backward to get a prior of 9%, but this number is based on unreasonable assumptions about false positives: they don't take into account fraud and QRPs. If priors were really that low, the entire replication crisis would be explained purely by normal sampling error: no QRPs!
  14. Part of the issue is that the literature is polluted with a ton of false results, which actually pushes estimates of true effect sizes downwards. There's an unfortunate tendency to lump together effect sizes of real and non-existent effects (eg Many Labs 2: "ds were 0.60 for the original findings and 0.15 for the replications"), but that's a meaningless number.
  15. False negatives are bad too, but they're not as harmful as false positives. Especially since they're almost never published. Also, there's been a ton of stuff written on lowering alpha, a good starting point is Redefine Statistical Significance.
  16. These figures actually understate the benefit of a lower alpha, because it would also change the calculus around p-hacking. With an alpha of 5%, getting a false positive is quite easy. Simply stopping data collection once you have a significant result has a hit rate of over 20%! Add some dredging and HARKing to that and you can squeeze a result out of anything. With a lower alpha, the chances of p-hacking success will be vastly lower and some researchers won't even bother trying.
  17. The original IAT paper is worth revisiting. You only really need to read page 1475. The construct validity evidence is laughable. The whole thing is based on N=26 and they find no significant correlation between the IAT and explicit measures of racism. But that's OK, Greenwald says, because the IAT is meant to find secret racists ("reveal explicitly disavowed prejudice")! The question of why a null correlation between implicit and explicit racial attitudes is to be expected is left as an exercise to the reader. The correlation between two racial IATs (male and female names) is .46 and they conveniently forget to mention the comically low test-retest reliability. That's all you need for 13k citations and a consulting industry selling implicit bias to the government for millions of dollars.
  18. I suspect psychologists today would laugh at the idea of the 1960s being an over-competitive environment. Personally I highly doubt that this situation can be blamed on high (or increasing) productivity.
  19. You might ask: well, why haven't the independent grant agencies already fixed the problem then? I'm not sure if it's a lack of competence, or caring, or power, or something else. But I find Garrett Jones' arguments on the efficacy of independent government agencies convincing: this model works well in other areas.
  20. "But Alvaro, what if I make an unexpected discovery during my investigation?" Well, you start writing a new registered report, and perhaps publish it as an exploratory result. You may not like it, but that's how we protect against false positives. In cases where only one dataset is available (eg historical data) we must rely on even stricter standards of evidence, to protect against multiple testing.
  21. Another idea to steal from the SEC: whistleblower rewards.
  22. This would be immediately exploited by publishing a bunch of trivial results. But that's a solvable problem. In any case, it's much better to have systematic, automatic mechanisms instead of relying on subjective factors and the prosecution of individual cases.
  23. I believe the SCORE program intends to use the data from Replication Markets to train an ML model that predicts replicability. If scientists had the ability to just run that on every reference in their papers, perhaps they could go back to not reading what they cite.
  24. Looking at Replication Markets data, about 1 in 4 studies with p>.001 had more than a 50% chance to replicate. Of course I'd consider 50-50 odds far too low a threshold, but you have to start somewhere. "But Alvaro, science is not done paper by paper, it is a cumulative enterprise. We should publish marginal results, even if they're probably not true. They are pieces of evidence that, brick by brick, raise the vast edifice that we call scientific knowledge". In principle this is a good argument: publish everything and let the meta-analyses sort it out. But given the reality of publication bias we must be selective. If registered reports became the standard, this problem would not exist.
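Footnote 16's claim that simply stopping data collection once you reach significance has a hit rate of over 20% is easy to check by simulation. A minimal sketch (the function and its parameters are my own choices, not from the post): run a one-sample z-test on pure noise and peek after every new observation.

```python
import random
from statistics import NormalDist

def optional_stopping_fpr(sims=1000, start=10, max_n=300, alpha=0.05, seed=1):
    """Simulate 'collect until significant': a one-sample z-test on pure
    noise (true effect = 0), peeking after every new observation from
    n = start to n = max_n. Returns the fraction of simulated 'studies'
    that ever reach p < alpha."""
    rng = random.Random(seed)
    norm = NormalDist()
    hits = 0
    for _ in range(sims):
        total = 0.0
        for n in range(1, max_n + 1):
            total += rng.gauss(0.0, 1.0)
            if n >= start:
                z = abs(total / n) * n ** 0.5      # z = mean / (1 / sqrt(n))
                if 2 * (1 - norm.cdf(z)) < alpha:  # "significant": stop here
                    hits += 1
                    break
    return hits / sims

print(optional_stopping_fpr())  # far above the nominal 5% false-positive rate
```

Each individual look is a valid 5% test; it is the stopping rule that inflates the overall error rate. This is exactly the loophole that preregistered sample sizes (and registered reports) close.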

How Many Undetected Frauds in Science?

0.04% of papers are retracted. At least 1.9% of papers have duplicate images "suggestive of deliberate manipulation". About 2.5% of scientists admit to fraud, and they estimate that 10% of other scientists have committed fraud. 27% of postdocs said they were willing to select or omit data to improve their results. More than 50% of published findings in psychology are false. The ORI, which makes about 13 misconduct findings per year, gives a conservative estimate of over 2000 misconduct incidents per year.

That's a wide range of figures, and all of them suffer from problems if we try to use them as estimates of the real rate of fraud. While the vast majority of false published claims are not due to fabrication, it's clear that there is a huge iceberg of undiscovered fraud hiding underneath the surface.

Part of the issue is that the limits of fraud are unclear. While fabrication/falsification are easy to adjudicate, there's a wide range of quasi-fraudulent but quasi-acceptable "Questionable Research Practices" (QRPs) such as HARKing which result in false claims being presented as true. Publishing a claim that has a ~0%1 chance of being true is the worst thing in the world, but publishing a claim that has a 15% chance of being true is a totally normal thing that perfectly upstanding scientists do. Thus the literature is inundated by false results that are nonetheless not "fraudulent". Personally I don't think there's much of a difference.

There are two main issues with QRPs: first, there's no clear line in the sand, which makes it difficult to single out individuals for punishment. Second, the majority of scientists engage in QRPs. In fact they have been steeped in an environment full of bad practices for so long that they are no longer capable of understanding that they are behaving badly.

Let him who is without QRPs cast the first stone.

The case of Brian Wansink (who committed both clear fraud and QRPs) is revealing: in the infamous post that set off his fall from grace, he brazenly admitted to extreme p-hacking. The notion that any of this was wrong had clearly never crossed his mind: he genuinely believed he was giving useful advice to grad students. When commenters pushed back, he justified himself by writing that "P-hacking shouldn’t be confused with deep data dives".

Anyway, here are some questions that might help us determine the size of the iceberg:

  • Are uncovered frauds high-quality, or do we only have the ability to find low-hanging fruit?
  • Are frauds caught quickly, or do they have long careers before anyone finds out?
  • Are scientists capable of detecting fraud or false results in general (regardless of whether they are produced by fraud, QRPs, or just bad luck)?
  • How much can we rely on whistleblowers?

Fraud Quality

Here's an interesting case recently uncovered by Elisabeth Bik: 8 different published, peer-reviewed papers, by different authors, on different subjects, with literally identical graphs. The laziness is astonishing! It would take just a few minutes to write an R script that generates random data so that each fake paper could at least have unique charts. But the paper mill that wrote these articles won't even do that. This kind of extreme sloppiness is a recurring theme when it comes to frauds that have been caught.
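To see just how low the bar is, here is the kind of script described above, sketched in Python rather than R (the "dose-response" setup and names are invented for illustration): a few lines would give every fabricated paper its own unique chart data.

```python
import random

def fake_dose_response(seed):
    """Generate a unique-looking 'dose-response' dataset per paper:
    a noisy upward trend, different for every seed, so no two
    fabricated figures would be identical."""
    rng = random.Random(seed)
    doses = [0, 5, 10, 20, 40]
    return [(d, 1.0 + 0.05 * d + rng.gauss(0.0, 0.2)) for d in doses]

# Each "paper" gets its own chart:
paper_1 = fake_dose_response(seed=1)
paper_2 = fake_dose_response(seed=2)
print(paper_1 != paper_2)  # True: no duplicated graphs to catch
```

That a paper mill won't even do this much is what makes the duplication detectable at all.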

In general the image duplication that Bik uncovers tends to be rather lazy: people just copy paste to their heart's content and hope nobody will notice (and peer reviewers and editors almost certainly won't notice).

The Bell Labs physicist Jan Hendrik Schön was found out because he used identical graphs for multiple, completely different experiments.

This guy not only copy-pasted a ton of observations, he forgot to delete the Excel sheet he used to fake the data! Managed to get three publications out of it.

Back to Wansink again: he was smart enough not to copy-paste charts, but he made other stupid mistakes. For example in one paper (The office candy dish) he reported impossible means and test statistics (detected through granularity testing). If he had just bothered to create a plausible sample instead of directly fiddling with summary statistics, there's a good chance he would not have been detected. (By the way, the paper has not been retracted, and continues to be cited. I Fucking Love Science!)
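Granularity testing of this kind (the GRIM test of Brown & Heathers) exploits the fact that the mean of n integer-valued responses can only take values k/n for integer k. A minimal sketch (the helper name is mine, not the published tool):

```python
def grim_consistent(reported_mean, n, decimals=2):
    """GRIM-style check: with n integer-valued responses, the true mean
    must be k/n for some integer k. Return True if some achievable mean
    rounds to the reported value at the reported precision."""
    target = round(reported_mean, decimals)
    k0 = round(reported_mean * n)
    return any(
        k >= 0 and round(k / n, decimals) == target
        for k in (k0 - 1, k0, k0 + 1)
    )

print(grim_consistent(3.50, 10))  # True: 35/10 = 3.5 is achievable
print(grim_consistent(3.47, 10))  # False: no integer total gives 3.47
```

A mean of 3.47 from 10 integer responses is simply impossible; that is the kind of inconsistency Wansink's summary statistics failed.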

In general Wansink comes across as a moron, yet he managed to amass hundreds of publications, 30k+ citations, and half a dozen books. What percentile of fraud competence do you think Wansink represents?

The point is this: generating plausible random numbers is not that difficult! Especially considering the fact that these are intelligent people with extensive training in science and statistics. It seems highly likely that there are more sophisticated frauds out there.

Career Length

Do frauds manage to have long careers before they get caught? I don't think there's any hard data on this (though someone could probably compile it with the Retraction Watch database). Obviously the highest-profile frauds are going to be those with a long history, so we have to be careful not to be misled. Perhaps there's a vast number of fraudsters who are caught immediately.

Overall the evidence is mixed. On the one hand, a relatively small number of researchers account for a fairly large proportion of all retractions. So while these individuals managed to evade detection for a long time (Yoshitaka Fujii published close to 200 papers over a 25-year career), most frauds do not have such vast track records.

On the other hand just because we haven't detected fraudulent papers doesn't necessarily mean they don't exist. And repeat fraud seems fairly common: simple image duplication checks reveal that "in nearly 40% of the instances in which a problematic paper was identified, screening of other papers from the same authors revealed additional problematic papers in the literature."

Even when fraud is clearly present, it can take ages for the relevant authorities to take action. The infamous Andrew Wakefield vaccine autism paper, for example, took 12 years to retract.

Detection Ability

I've been reading a lot of social science papers lately and a thought keeps coming up: "this paper seems unlikely to replicate, but how can I tell if it's due to fraud or just bad methods?" And the answer is that in general we can't tell. In fact things are even worse, as scientists seem to be incapable of detecting even really obviously weak papers (more on this in the next post).

In cases such as Wansink's, people went over his work with a fine-tooth comb after the infamous blog post and discovered all sorts of irregularities. But nobody caught those signs earlier. Part of the issue is that nobody's really looking for fraud when they casually read a paper. Science tends to work on a kind of honor system where everyone just assumes the best. Even if you are looking for fraud, it's time-consuming, difficult, and in many cases unclear. The evidence tends to be indirect: noticing that two subgroups are a bit too similar, or that the effects of an intervention are a bit too consistent. But these can be explained away fairly easily. So unless you have a whistleblower it's often difficult to make an accusation.

The case of the 5-HTTLPR gene is instructive: as Scott Alexander explains in his fantastic literature review, a huge academic industry was built up around what should have been a null result. There are literally hundreds of non-replicating papers on 5-HTTLPR—suppose there was one fraudulent article in this haystack, how would you go about finding it?

Some frauds (or are they simply errors?) are detected using statistical methods such as the granularity testing mentioned above, or with statcheck. But any sophisticated fraud would simply check their own numbers using statcheck before submitting, and correct any irregularities.
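statcheck's core move is recomputing the p-value from the reported test statistic and comparing it to the reported p. The same idea for a reported z statistic fits in a few lines of stdlib Python; a simplified sketch (the real tool also parses t, F, r, and χ² values out of papers):

```python
from statistics import NormalDist

def p_from_z(z):
    """Two-sided p-value for a reported z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def consistent(reported_z, reported_p, decimals=2):
    """Does the recomputed p match the reported p at the reported precision?"""
    return round(p_from_z(reported_z), decimals) == round(reported_p, decimals)

print(consistent(2.10, 0.04))  # True: recomputed p ~ .036 rounds to .04
print(consistent(1.50, 0.04))  # False: z = 1.50 gives p ~ .134
```

This is also why the check is so easy to evade: anyone who runs it on their own fabricated numbers before submission will sail through.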

Detecting weak research is easy. Detecting fraud and then prosecuting it is extremely difficult.


Whistleblowers

Some cases are brought to light by whistleblowers, but we can't rely on them for a variety of reasons. A survey of scientists finds that potential whistleblowers, especially those without job security, tend not to report fraud due to the potential career consequences. They understand that institutions will go to great lengths to protect frauds—do you want a career, or do you want to do the right thing?

Often there simply is no whistleblower available. Scientists are trusted to collect data on their own, and they often collaborate with people in other countries or continents who never have any contact with the data-gathering process. Under such circumstances we must rely on indirect means of detection.

South Korean celebrity scientist Hwang Woo-suk was uncovered as a fraud by a television program which used two whistleblower sources. But things only got rolling when image duplication was detected in one of his papers. Both whistleblowers lost their jobs and were unable to find other employment.

In some cases people blow the whistle and nothing happens. The report from the investigation into Diederik Stapel, for example, notes that "on three occasions in 2010 and 2011, the attention of members of the academic staff in psychology was drawn to this matter. The first two signals were not followed up in the first or second instance." By the way, these people simply noticed statistical irregularities, they never had direct evidence.

And let's turn back to Wansink once again: in the blog post that sank him, he recounted tales of instructing students to p-hack data until they found a result. Did those grad students ever blow the whistle on him? Of course not.

This is the End...

Let's say that about half of all published research findings are false. How many of those are due to fraud? As a very rough guess I'd say that for every 100 papers that don't replicate, 2.5 are due to fabrication/falsification, and 85 are due to lighter forms of methodological fraud. This would imply that about 1% of fraudulent papers are retracted.

This is both good and bad news. On the one hand, while most fraud goes unpunished, it only represents a small portion of published research. On the other hand, it means that we can't fix reproducibility problems by going after fabrication/falsification: if outright fraud completely disappeared tomorrow, it would be no more than an imperceptible blip in the replication crisis. A real solution needs to address the "questionable" methods used by the median scientist, not the fabrication used by the very worst of them.

Book Review: Science Fictions by Stuart Ritchie

In 1945, Robert Merton wrote:

There is only this to be said: the sociology of knowledge is fast outgrowing a prior tendency to confuse provisional hypothesis with unimpeachable dogma; the plenitude of speculative insights which marked its early stages are now being subjected to increasingly rigorous test.

Then, 16 years later:

After enjoying more than two generations of scholarly interest, the sociology of knowledge remains largely a subject for meditation rather than a field of sustained and methodical investigation. [...] these authors tell us that they have been forced to resort to loose generalities rather than being in a position to report firmly grounded generalizations.

In 2020, the sociology of science is stuck more or less in the same place. I am being unfair to Ritchie (who is a Merton fanboy), because he has not set out to write a systematic account of scientific production—he has set out to present a series of captivating anecdotes, and in those terms he has succeeded admirably. And yet, in the age of progress studies surely one is allowed to hope for more.

If you've never heard of Daryl Bem, Brian Wansink, Andrew Wakefield, John Ioannidis, or Elisabeth Bik, then this book is an excellent introduction to the scientific misconduct that is plaguing our universities. The stories will blow your mind. For example you'll learn about Paolo Macchiarini, who left a trail of dead patients, published fake research saying he healed them, and was then protected by his university and the journal Nature for years. However, if you have been following the replication crisis, you will find nothing new here. The incidents are well-known, and the analysis Ritchie adds on top of them is limited in ambition.

The book begins with a quick summary of how science funding and research work, and a short chapter on the replication crisis. After that we get to the juicy bits as Ritchie describes exactly how all this bad research is produced. He starts with outright fraud, and then moves onto the gray areas of bias, negligence, and hype: it's an engaging and often funny catalogue of misdeeds and misaligned incentives. The final two chapters address the causes behind these problems, and how to fix them.

The biggest weakness is that the vast majority of the incidents presented (with the notable exception of the Stanford prison experiment) occurred in the last 20 years or so. And Ritchie's analysis of the causes behind these failures also depends on recent developments: his main argument is that intense competition and the pressure to publish large quantities of papers are harming research quality.

Not only has there been a huge increase in the rate of publication, there’s evidence that the selection for productivity among scientists is getting stronger. A French study found that young evolutionary biologists hired in 2013 had nearly twice as many publications as those hired in 2005, implying that the hiring criteria had crept upwards year-on-year. [...] as the number of PhDs awarded has increased (another consequence, we should note, of universities looking to their bottom line, since PhD and other students also bring in vast amounts of money), the increase in university jobs for those newly minted PhD scientists to fill hasn’t kept pace.

By only focusing on recent examples, Ritchie gives the impression that the problem is new. But that's not really the case. One can go back to the 60s and 70s and find people railing against low standards, underpowered studies, lack of theory, publication bias, and so on. Imre Lakatos, in an amusing series of lectures at the London School of Economics in 1973, said that "the social sciences are on a par with astrology, it is no use beating about the bush."

Let's play a little game. Go to the Journal of Personality and Social Psychology (one of the top social psych journals) and look up a few random papers from the 60s. Are you going to find rigorous, replicable science from a mythical era when valiant scientists followed Mertonian norms and were not incentivized to spew out dozens of mediocre papers every year? No, you're going to find exactly the same p<.05, tiny N, interaction effect, atheoretical bullshit. The only difference being the questionable virtue of low productivity.

If the problem isn't new, then we can't look for the causes in recent developments. If Ritchie had moved beyond "loose generalities" to a more systematic analysis of scientific production I think he would have presented a very different picture. The proposals at the end mostly consist of solutions that are supposed to originate from within the academy. But they've had more than half a century to do that—it feels a bit naive to think that this time it's different.

Finally, is there light at the end of the tunnel?

...after the Bem and Stapel affairs (among many others), psychologists have begun to engage in some intense soul-searching. More than perhaps any other field, we’ve begun to recognise our deep-seated flaws and to develop systematic ways to address them – ways that are beginning to be adopted across many different disciplines of science.

Again, the book is missing hard data and analysis. I used to share his view (surely after all the publicity of the replication crisis, all the open science initiatives, all the "intense soul searching", surely things must change!) but I have now seen some data which makes me lean in the opposite direction. Check back toward the end of August for a post on this issue.

Ritchie's view of science is almost romantic: he goes on about the "nobility" of research and the virtues of Mertonian norms. But the question of how conditions, incentives, competition, and even the Mertonian norms themselves actually affect scientific production is an empirical matter that can and should be investigated systematically. It is time to move beyond "speculative insights" and onto "rigorous testing", exactly in the way that Merton failed to do.

Links Q2 2020

Tyler Cowen reviews Status and Beauty in the Global Party Circuit. "In this world, girls function as a form of capital." The podcast is good too.

Lots of good info on education: Why Conventional Wisdom on Education Reform is Wrong (a primer)

Scott Alexander on the life of Herbert Hoover.

Longer-Run Economic Consequences of Pandemics [speculative]:

Measured by deviations in a benchmark economic statistic, the real natural rate of interest, these responses indicate that pandemics are followed by sustained periods—over multiple decades—with depressed investment opportunities, possibly due to excess capital per unit of surviving labor, and/or heightened desires to save, possibly due to an increase in precautionary saving or a rebuilding of depleted wealth.

Do cognitive biases go away when the stakes are high? A large pre-registered study with very high stakes finds that effort increases significantly but performance does not.

Disco Elysium painting turned into video using AI.

Long-run consequences of the pirate attacks on the coasts of Italy: "in 1951 Rome would have been 15% more populous without piracy."

“A” Business by Any Other Name: Firm Name Choice as a Signal of Firm Quality (2014): "The average plumbing firm whose name begins with A or a number receives five times more service complaints than other firms and also charges higher prices."

Yarkoni: The Generalizability Crisis [in psychology].

Lakens: Review of "The Generalizability Crisis" by Tal Yarkoni.

Yarkoni: Induction is not optional (if you’re using inferential statistics): reply to Lakens.

Estimating the deep replicability of scientific findings using human and artificial intelligence - ML model does about as well as prediction markets when it comes to predicting replication success. "the model’s accuracy is higher when trained on a paper’s text rather than its reported statistics and that n-grams, higher order word combinations that humans have difficulty processing, correlate with replication." Also check out the horrific Fig 1.

Wearing a weight vest leads to weight loss, fairly huge (suspiciously huge?) effect size. The hypothesized mechanism is the "gravitostat": your body senses how heavy you are and adjusts accordingly.

Tyler Cowen on uni- vs multi-disciplinary policy advice in the time of Corona

...and here's Señor Coconut, "A Latin Tribute to Kraftwerk". Who knew "Autobahn" needed a marimba?

Memetic Defoundation

The bunny ears sign used to be a way of calling someone a cuck. In fact they're not bunny ears at all, they're cuckold horns. The original meaning has been lost, and today clueless children across the world use it as nothing more than a vaguely teasing gesture. This is an amusing case of a wider phenomenon I like to call memetic defoundation.

A general formulation would look something like this:

  • Start with a couple of ideas of the form "[foundation] therefore [meme]"1
  • [foundation] is forgotten, disproved, or rendered obsolete
  • [meme] persists regardless

Dead beliefs

Organizational decay is a hotspot for memetic defoundation. Luttwak tells us of a unit in the Rhine legions led by a Praefectus Militum Balistariorum long after the Roman army had lost the ability to construct and use ballistae. Gene Wolfe uses this effect in The Book of the New Sun to evoke the image of an ancient, ossified, slowly crumbling civilization: my favorite example is a prison called the "antechamber" where the inmates are still served coffee and pastries every morning.

E. R. Dodds offers another example in The Greeks and the Irrational, where he describes the decline of religion in Hellenistic times:

Gods withdraw, but their rituals live on, and no one except a few intellectuals notices that they have ceased to mean anything.

Scott Alexander comments on the relation between science and policy: "The science did a 180, but the political implications stayed exactly the same."

John Stuart Mill writes that memetic defoundation "is illustrated in the experience of almost all ethical doctrines and religious creeds" and argues that free speech is necessary to prevent it, as open debate preserves the arguments behind ideas:2

If, however, the mischievous operation of the absence of free discussion, when the received opinions are true, were confined to leaving men ignorant of the grounds of those opinions, it might be thought that this, if an intellectual, is no moral evil, and does not affect the worth of the opinions, regarded in their influence on the character. The fact, however, is, that not only the grounds of the opinion are forgotten in the absence of discussion, but too often the meaning of the opinion itself. The words which convey it, cease to suggest ideas, or suggest only a small portion of those they were originally employed to communicate. Instead of a vivid conception and a living belief, there remain only a few phrases retained by rote; or, if any part, the shell and husk only of the meaning is retained, the finer essence being lost. [...] Truth, thus held, is but one superstition the more, accidentally clinging to the words which enunciate a truth.

Sometimes a meme will spread because it captures a true relation, but will use an unrelated foundation to do so. Greg Cochran suggests that Christian Science (a sect that avoids all medical care) developed as a response to the high fatality rates of pre-modern medicine. But the meme only spread when the foundation was put in theological rather than medical terms. What really matters for defoundation is the implicit relation that is captured (pseudoscientific medicine → avoid medical care) rather than the explicit one (sickness results from spiritual error → avoid medical care). When medicine improved, the true basis of the meme was gone, but of course that did nothing to change people's religious beliefs.

Finally, many (including Schumpeter,3 Santayana,4 and Saint Max5) have identified an instance of memetic defoundation in the relation between Protestantism and political liberalism (in the most general sense of the word). In broad strokes, the argument is that liberalism dropped God but kept the Protestant morality. Moldbug6 erroneously places this transition after WWII, while Barzun argues it happened 300 years earlier7. Tom Holland thinks this is an awesome development,8 while others are more skeptical. My old buddy Freddie makes the same diagnosis in Twilight of the Idols:

In England, in response to every little emancipation from theology one has to reassert one’s position in a fear-inspiring manner as a moral fanatic. That is the penance one pays there. – With us it is different. When one gives up Christian belief one thereby deprives oneself of the right to Christian morality. For the latter is absolutely not self-evident: one must make this point clear again and again, in spite of English shallowpates. Christianity is a system, a consistently thought out and complete view of things. If one breaks out of it a fundamental idea, the belief in God, one thereby breaks the whole thing to pieces: one has nothing of any consequence left in one’s hands. Christianity presupposes that man does not know, cannot know what is good for him and what evil: he believes in God, who alone knows. Christian morality is a command: its origin is transcendental; it is beyond all criticism, all right to criticize; it possesses truth only if God is truth – it stands or falls with the belief in God. – If the English really do believe they will know, of their own accord, ‘intuitively’, what is good and evil; if they consequently think they no longer have need of Christianity as a guarantee of morality; that is merely the consequence of the ascendancy of Christian evaluation and an expression of the strength and depth of this ascendancy: so that the origin of English morality has been forgotten, so that the highly conditional nature of its right to exist is no longer felt.

Things are in the saddle

Which brings us to the question of how memetic defoundation happens. In Nietzsche's model you start with the foundation and the meme is derived from it, but once the ideas have been entrenched deeply enough, the foundation can evaporate without affecting the meme. Just as a fish doesn't notice water, people no longer notice the assumptions behind their beliefs. I call this the foundation-first model.

But I think he's wrong: in some cases, including the question of Christianity, the correct approach is a meme-first model. In this view, the foundation is simply a post-hoc justification (or a spandrel) glued onto a preëxisting meme. That is not to say the foundation is irrelevant, just that its role in supporting the meme is viral rather than logical.

Where did the meme come from? In his brilliant essay The Three Sources of Human Values, Hayek argues that ideas come from three sources:

  1. Consciously directed rational thought
  2. Biology
  3. Cultural evolution

We can use this classification to look at memetic defoundation. The first case is the easiest: the Roman army uses siege weapons, so someone in charge creates a siege unit and a Praefectus to lead it (a clear foundation-first instance). Eventually it loses those capabilities, but the structure remains.

Biologically instilled tendencies and values are more challenging to analyze: their aims tend to be inaccessible to introspection or hidden through self-deception. And they are not necessarily moral judgements: it could be something as simple as folkbiological classifications predisposed to certain patterns, which then influence values.9

Behaviors and social structures generated by cultural evolution also tend to be opaque: they were created by a process of random variation and selection, then sustained by a distributed system of knowledge accumulation and replication—no individual understands how they work (and they generally don't even try to, simply attributing them to custom or one's ancestors). Henrich details how the tendency of modern westerners to search for causal, explicable reasons is an anomaly.

Even when we try, we don't always succeed: the age of reason didn't necessarily make culturally evolved behaviors transparent. For example, traditional societies in the New World had various processes for nixtamalizing corn before eating it, which makes the niacin nutritionally available and prevents the disease of pellagra. It took until the 1940s(!) and hundreds of thousands of deaths until scientists finally understood the problem. And that's a simple nutritional issue rather than a question of complex social organization. As Scott Alexander puts it:

Our ancestors lived in Epistemic Hell, where they had to constantly rely on causally opaque processes with justifications that couldn’t possibly be true, and if they ever questioned them then they might die.

In a world filled with vital customs and weak explanations it's important to make sure nobody ever questions tradition—thus it is safeguarded by indoctrination, preference falsification,10 ostracism, or the promise of divine punishment. And now we have a second level of selective forces which are shaped by the needs of the memes: they mould their biological and social substrate to maximize their spread. And what are the traits they select for? Conformity, homogeneity, mimesis, self-ignorance, lack of critical thought: the herd-instinct. An overbearing society for a myopic, servile species domesticated under the yoke of ideas. That is the price we pay for the "secret of our success".11

Now consider what happens after a rapid shift in our environment (such as the introduction of agriculture, large-scale hierarchical societies, or the industrial revolution): both biological and cultural evolution are slow processes, and the latter has built-in safeguards to prevent modification. That is how we end up with a lag of ideas: baseless memes designed for a different habitat. Like a saltwater fish thrown into a lake, modern man depends on ideas he thinks are universal when they are really made for a different time and place. Hayek:

The gravest deficiency of the older prophets was their belief that the intuitively perceived ethical values, divined out of the depth of man's breast, were immutable and eternal.

What kind of ideas are most likely to take hold? "Doctrines intrinsically fitted to make the deepest impression upon the mind"12 that also increase fitness. Successful cultural adaptations tend to capture true relations, in false yet convincing ways. This is why religious memes are particularly susceptible to defoundation, and why most defoundation is meme-first. While many of these ideas may appear altruistic, they are really "subtly selfish" as George Williams put it—otherwise they would not have survived.

For example, G. E. Pugh in The Biological Origin of Human Values talks about the ubiquitous sharing norms in primitive human societies. Christopher Boehm in Hierarchy in the Forest (a work that blatantly plagiarizes Nietzsche) discusses the "egalitarian ethos" of primitive societies and its evolutionary origin, which expresses itself as a "drive to parity", which became possible to enforce with the evolution of tool use and greater coordination abilities:

Because the united subordinates are constantly putting down the more assertive alpha types in their midst, egalitarianism is in effect a bizarre type of political hierarchy.

The collective power of resentful subordinates is at the base of human egalitarian society, and we can see important traces of this group approach in chimpanzee behavior. [...] It is obvious that symbolic communication and possession of an ethos make a very large difference for humans. Yet it would appear that the underlying emotions and behavioral orientations are similar to those of chimpanzees, as are group intimidation strategies that have the effect of terminating resented behaviors of aggressors.

To re-work Nietzsche's argument into a more plausible form: the drive to parity came first. Christian morality is simply a post-hoc justification of this innate tendency, in a highly contagious and highly effective prosocial package. God is now dead, but that does nothing to change our evolved moral intuitions, so this drive simply finds new outlets: humanism, democracy, liberalism, socialism, etc. As this shift of ideas happens, we inevitably bring along some old baggage.

The sentiments necessary to thrive in a band or a tribe are not those that we need today, but they are largely those we are stuck with. Modern civilization and its markets are inhuman and unintuitive (if not actively repulsive) and exist largely because we are able to suspend, disregard, and master our innate impulses. Seemingly new ideologies directed against the market are nothing but an atavism: the incompatibility between our innate tendencies and the external environment explains their peculiar combination of perpetual failure and perpetual popularity.

Clean sweep

Counterintuitively, the memes can be strengthened by abandoning the thing they're (supposedly) based on. You can attack Christianity-the-religion-and-ethical-system by attacking God: if morality comes from God, when you take down God you also take down his morality. But it didn't work out that way in practice: people dropped the God but kept his system; where do you attack now? In theory "that which can be asserted without evidence can be dismissed without evidence." In reality, that which is asserted without evidence is difficult to refute regardless of the evidence.13

Another issue, as I argued above, is that we don't comprehend them, either because of self-deception, limited introspection, or the blind forces of cultural evolution. The solution to both of these problems is the genealogical method. The ultimate aims of our values and customs lie in their (genetic or cultural) evolutionary history; by understanding their development we can understand their purpose and the selective forces that shaped them. Through genealogy we can reach truths we have been designed not to see.14

Which brings us back to Nietzsche. How should one argue against God? Forget the old debate tactics, he says in Daybreak 95, and just treat it as an anthropological problem:

In former times, one sought to prove that there is no God – today one indicates how the belief that there is a God arose and how this belief acquired its weight and importance: a counter-proof that there is no God thereby becomes superfluous. – When in former times one had refuted the 'proofs of the existence of God' put forward, there always remained the doubt whether better proofs might not be adduced than those just refuted: in those days atheists did not know how to make a clean sweep.

It is this approach that we should deploy against foundationless memes. Don't bother with arguments attacking the foundation or the meme itself, rather go for a "clean sweep". The case of Christian Science mentioned above is a perfect example: providing theological arguments against it is futile (and fundamentally aiming at the wrong target). But understanding how it came to be makes the situation crystal clear.

The Hansonian approach of noticing a disconnect between stated and revealed preferences is also useful for spotting these memes in the first place. Hanson combines both techniques in his analysis of The Evolution of Health Altruism.

What if some lies are useful and life-preserving? What if such lies are fundamentally necessary for societies to work well? Isn't this just a naïve overexpression of the drive to truth? That may well be the case, but just because some lies are useful does not mean that the particular lies we live by right now are the best ones. In fact the tyranny of mediocrity that flourished in our recent evolutionary past appears to be fundamentally incompatible with the modern world (not to mention the world of tomorrow). Understanding is a precondition for designing superior replacements, or as Nietzsche put it "we must become physicists in order to be able to be creators".

Genealogy allows us to understand the selective forces at play, and once we understand that we (and by we I refer to a tiny minority) have the power to overcome our self-ignorance and ingrained limitations in order to choose from a higher point of view. Not a position of "transcendent leverage", but at least an informed valuing of values, consistent with the world as it is.

  1. I deliberately avoid the use of "assumptions" and "conclusion" because they're not always assumptions and/or conclusions.
  2. He also supports an early version of steelmanning for the same purpose: "So essential is this discipline to a real understanding of moral and human subjects, that if opponents of all important truths do not exist, it is indispensable to imagine them, and supply them with the strongest arguments which the most skilful devil’s advocate can conjure up."
  3. "Though the classical doctrine of collective action may not be supported by the results of empirical analysis, it is powerfully supported by that association with religious belief to which I have adverted already. This may not be obvious at first sight. The utilitarian leaders were anything but religious in the ordinary sense of the term. In fact they believed themselves to be anti-religious and they were so considered almost universally. They took pride in what they thought was precisely an unmetaphysical attitude and they were quite out of sympathy with the religious institutions and the religious movements of their time. But we need only cast another glance at the picture they drew of the social process in order to discover that it embodied essential features of the faith of protestant Christianity and was in fact derived from that faith. For the intellectual who had cast off his religion the utilitarian creed provided a substitute for it.", Capitalism, Socialism, and Democracy
  4. "The chief fountains of this [genteel] tradition were Calvinism and transcendentalism. Both were living fountains; but to keep them alive they required, one an agonised conscience, and the other a radical subjective criticism of knowledge. When these rare metaphysical preoccupations disappeared—and the American atmosphere is not favourable to either of them—the two systems ceased to be inwardly understood; they subsisted as sacred mysteries only; and the combination of the two in some transcendental system of the universe (a contradiction in principle) was doubly artificial.", The Genteel Tradition in American Philosophy
  5. "Take notice how a “moral man” behaves, who today often thinks he is through with God and throws off Christianity as a bygone thing. [...] Much as he rages against the pious Christians, he himself has nevertheless as thoroughly remained a Christian — to wit, a moral Christian.", The Ego and His Own
  6. "Progressive Christianity, through secular theologians such as Harvey Cox, abandoned the last shreds of Biblical theology and completed the long transformation into mere socialism. [...] Creedal declarations of Universalism are not hard to find. I am fond of the Humanist Manifestos, which pretty much say it all. The UN Declaration of Human Rights is good as well. No mainline Protestant will find anything morally objectionable in any of these documents."
  7. "The outcome of what has been reviewed here—late 17C critical thought, the events of 1688, and the writings of Locke, Voltaire, and Montesquieu— may be summed up in a few points [...] the political ideas of the English Puritans aiming at equality and democracy were now in the main stream of thought, minus the religious component.", From Dawn to Decadence
  8. His book Dominion: How the Christian Revolution Remade the World is all about this topic. "If secular humanism derives not from reason or from science, but from the distinctive course of Christianity’s evolution – a course that, in the opinion of growing numbers in Europe and America, has left God dead – then how are its values anything more than the shadow of a corpse? What are the foundations of its morality, if not a myth?" Holland also likes to quote the Indian historian S. N. Balagangadhara: "Christianity spreads in two ways: through conversion and through secularisation."
  9. Henrich has a very interesting paper with Scott Atran: The Evolution of Religion: How Cognitive By-Products, Adaptive Learning Heuristics, Ritual Displays, and Group Competition Generate Deep Commitments to Prosocial Religions. "Most religious beliefs minimally violate the expectations created by our intuitive ontology and these modes of construal, thus creating cognitively manageable and memorable supernatural worlds."
  10. I highly recommend Timur Kuran's Private Truths, Public Lies; his analysis of how social pressures cause people to display and sustain false beliefs is brilliant.
  11. Nietzsche also brings up another related issue: the incompatibility between the older animalistic values and the new ones imposed by selective forces downstream of cultural accumulation, turning man into a "sick animal". But that's a story for another day.
  12. Mill, On Liberty.
  13. It might be interesting to approach this from the POV of Zizekian "ideology". Perhaps the issue is a kind of a-priori faith (because belief by conviction isn't really—it has already been mediated through our subjectivity) which disintegrates once you instrumentalize the idea. Of course people are resistant to instrumentalizing sacred values. From The Sublime Object of Ideology: "Pascal's final answer, then, is: leave rational argumentation and submit yourself simply to ideological ritual".
  14. There's a Chesterton's Fence aspect to all of this: you need to understand the lie before you try to tear it down.

Links Jan-Feb 2020

Word2vec: fish + music = bass

fish + music = bass
fish + friend = chum
fish + hair = mullet
fish + struggle = flounder
oink - pig + bro = wassup
yeti – snow + economics = homo economicus
music – soul = downloadable ringtones
good : bad :: rock : nü metal

Related, The (Too Many) Problems of Analogical Reasoning with Word Vectors.
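The analogies above come from simple arithmetic on word vectors: add (or subtract) the embeddings, then return the vocabulary word nearest to the result. A minimal sketch of the idea, using invented toy vectors purely for illustration (a real model such as a trained word2vec would supply high-dimensional learned embeddings, and these particular numbers are assumptions, not actual embedding values):

```python
import math

# Hand-made toy vectors, purely illustrative.
vectors = {
    "fish":   [1.0, 0.0, 0.2],
    "music":  [0.0, 1.0, 0.1],
    "bass":   [0.9, 0.9, 0.3],
    "chum":   [0.8, -0.5, 0.1],
    "guitar": [0.1, 0.8, 0.0],
}

def cosine(u, v):
    # Cosine similarity: dot product over the product of norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def analogy(a, b):
    # "a + b": add the two vectors, then return the nearest word
    # by cosine similarity, excluding the inputs themselves.
    target = [x + y for x, y in zip(vectors[a], vectors[b])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

print(analogy("fish", "music"))  # with these toy vectors: "bass"
```

Note that excluding the input words from the candidate set is doing real work here; as the paper linked above points out, much of the apparent success of analogical reasoning with word vectors depends on exactly this kind of filtering.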

We always knew meta-analyses are somewhat flawed because of publication bias and the "file drawer problem", but exactly how bad is it? A new paper compares meta-analyses to pre-registered replications and finds that meta-analyses overstate effect sizes by 3x.

In related news, registered reports in psychology have 44% positive results vs 96% in the standard literature.

Female orgasm frequency by male income quartile. Obviously confounded in all sorts of ways, but still.

Effective Altruists tackle the problem of tfw no gf. h/t @SilverVVulpes

Mark Koyama reviews Scheidel's Escape from Rome, with some very interesting comments on the use of counterfactuals by historians vs economists doing history. "There is no control group for Europe had Archduke Ferdinand not been assassinated."

A review of Dietz Vollrath's new book, Fully Grown:

Vollrath’s preferred decomposition of the causes of the 1.25% annual slowdown in real GDP per capita growth is:

  • 0.80pp - Declining growth in human capital
  • 0.20pp - The shift of spending from goods to services
  • 0.15pp - Declining reallocation of workers and firms
  • 0.10pp - Declining geographic mobility

Pay-as-you-go pension systems are going to have serious trouble in countries with rapidly aging populations. Just how bad is it going to be? If you're a <40 yo worker today, it's probably safe to assume you won't be getting much out of the money you're paying into the pension system.

RCA summarizes his views on US healthcare costs with a ton of great charts: Why conventional wisdom on health care is wrong (a primer).

Should we be worrying about automation in the near future? Scholl and Hanson argue no.

Disco Elysium (which I highly recommend) lead designer and writer Robert Kurvitz talks about the development process and how twitter inspired their dialogue engine: The Feature That Almost Sank Disco Elysium.

It has long been established that asking the same question twice in the same questionnaire will often result in the same person giving two different responses. But what happens if you place the repeated questions right next to each other?

Human-cat co-evolution: "We found that the population density of free-ranging cats is linearly related to the proportion of female students in the university. [...] suggests that the cats may have the ability to distinguish human sex and adopt a sociable skill to human females."

The dril Turing test.

And here's some sweet Afro-Cuban jazz fusion.

Reading Notes: Civilization & Capitalism, 15th-18th Century, Vol. I: The Structures of Everyday Life

I first discovered Fernand Braudel when Tyler Cowen answered the question: "whose entire body of work is worth reading?", placing him next to people like Nietzsche and Hume. It was good advice.

Braudel starts working on his doctoral dissertation in 1923, at age 21, intending to concentrate on the policies of Philip II of Spain in the form of a conventional history. To support himself, he teaches at an Algerian high school for a decade, then at the University of São Paulo until 1937. During this period he keeps up with developments in France, especially Marc Bloch and Lucien Febvre's Annales School, which focuses on long-term history and statistical data.

In 1934, 11 years after he began, Braudel starts to find quantitative data. Population figures, ship cargoes, prices, arrivals and departures. These will form the basis of his novel, data-driven approach. Five years later, in 1939, he finally has an outline ready.

Then the Nazis capture him. He spends the next 5 years in a POW camp where he writes the first draft of La Méditerranée without access to any materials, mailing notebooks back to Paris. When the war ends, he becomes the de facto leader of the second generation of the Annales School. An additional four years after that, 26 years after he started working on it, The Mediterranean and the Mediterranean World in the Age of Philip II is published.

The general argument of this work is that history moves at different speeds, and one must distinguish them: the short term (daily events as perceived by contemporaries), the medium term ('economic systems, states, societies, civilisations'), and la longue durée – a perspective of centuries or millennia without which the shorter timeframes cannot be understood.

In the preface, Braudel declares: "I have always believed that history cannot be really understood unless it is extended to cover the entire human past." Civilization and Capitalism is built on similar principles.

The initial seeds for C&C were planted in 1950, when Febvre asked Braudel to contribute to a volume for a series on world history. Braudel would simply provide a summary of existing work on the development of capitalism. But Febvre died before the volume could be completed, and Braudel took responsibility for what turned out to be a three-volume series on capitalism. The first volume came out 17 years after work began, in 1967. The final volume would not be published until 1979.

Reading Braudel one gets the impression of an infinite curiosity at work for decades, mining every source for the tiniest piece of data, and then magisterially combining everything together. Despite fairly brutal editing these notes are still way too long, and yet they struggle to capture even a tiny part of the detail and depth that the book contains.

Vol. I: The Structures of Everyday Life

A good starting point might be what is left out: politics, wars, dynasties, religion, ideology, peoples. The index of maps & graphs gives the reader a taste of what is to come: "Budget of a mason's family in Berlin about 1800"; "Bread weights and grain prices in Venice at the end of the sixteenth century"; "French Merchants registered as living in Antwerp, 1450-1585".

The first volume aims to illuminate every aspect of material life: agriculture, food, dress, housing, towns, cities, energy, metals, machines, animals, transportation, money. Braudel's goal is not simply to examine each of these in isolation, but to show how all the elements of material life interact to form cultures, economies, systems of governance, power structures, long-term cycles or trends. He comes remarkably close to achieving this absurdly ambitious task. For people into worldbuilding this tome is pure gold. The first volume also has the greatest general appeal: unlike the other two which are somewhat esoteric, I think this is a book everyone will love.

In short, at the very deepest levels of material life, there is at work a complex order, to which the assumptions, tendencies and unconscious pressures of economies, societies and civilizations all contribute.

It is here that Braudel shows off his greatest skill, which is the combination of the microscopic with the panoramic. At the top level: Geography. Climate. Land. Crops. ZOOM IN. Trading routes. Piracy. Economy. Cities. Technology. And then zoom into details like the price of wheat relative to oats in 1351 Paris. He shifts effortlessly between the global, long-term perspective and minute, specific data and anecdotes, combining the two to form a coherent understanding.

The Weight of Numbers

Everything, both in the short and long term, and at the level of local events as well as on the grand scale of world affairs, is bound up with the numbers and fluctuations of the mass of people.

The predominant feature of the ancien régime is Malthusianism. From the 16th century on, Europe was constantly on the brink of overpopulation. Epidemics and famines established balance, and occasional recessions in population created great wealth for the survivors. "Thus in Languedoc between 1350 and 1450, the peasant and his patriarchal family were masters of an abandoned countryside. Trees and wild animals overran fields that once had flourished." France had 26 general famines just in the 11th century; 16 in the 18th.

Famine recurred so insistently for centuries on end that it became incorporated into man's biological regime and built into his daily life. Dearth and penury were continual, and familiar even in Europe, despite its privileged position. [...] Things were far worse in Asia, China and India. Famines there seemed like the end of the world. In China everything depended on rice from the southern provinces; in India, on providential rice from Bengal, and on wheat and millet from the northern provinces, but vast distances had to be crossed and this contribution only covered a fraction of the requirements.

Slowly, expansion and improvements in agricultural productivity doubled the global population, which Braudel calls "indubitably the basic fact in world history from the fifteenth to the eighteenth century".

Almost all of these people lived in the countryside. "The towns the historian discovers in his journeys back into pre-nineteenth-century times are small; and the armies miniature." The towns were also great population sinks, drawing in men from the countryside and killing them. Wild animals were everywhere and often a real threat, even in Europe, which was full of wolves and bears.

A lapse in vigilance, an economic setback, a rough winter, and they multiplied. In 1420, packs entered Paris through a breach in the ramparts or unguarded gates. They were there again in September 1438, attacking people this time outside the town, between Montmartre and the Saint-Antoine gate. In 1640, wolves entered Besançon by crossing the Doubs near the mills of the town and 'ate children along the roads'.

Braudel writes about the global ebb and flow of epidemics over the course of centuries, and how they were aided by global trade. And to illustrate their effect, he brings up statistics like the annual number of plague victims in the town of Strauling between 1623 and 1635 (702 people). He tells us of Montaigne, who as mayor of Bordeaux fled the town (like all rich people would) and abandoned his post during the 1585 plague. He quotes the diaries of Samuel Pepys ("the plague making us cruel, as doggs, one to another"). He quotes Francois Dragonet of Fogasses, a rich Avignon citizen of Italian origin, whose leases provided for a time when he would be obliged to leave the town (which he did in 1588, during a fresh plague) and lodge with his farmers: 'In case of contagion (God forbid), they will give me a room at the house... and I will be able to put my horses in the stable on my way there and back, and they will give me a bed for myself.' The dead pile up in the streets (Defoe: "for the most part on to a cart like common dung"), the palaces of the rich are looted.

Montaigne tells how he wandered in search of a roof when the epidemic reached his estate, 'serving six months miserably as a guide' to his 'distracted family, frightening their friends and themselves and causing horror wherever they tried to settle'.

Daily Bread


Diets in this period were almost universally vegetable-based, especially outside Europe, for the simple reason that land devoted to crops feeds far more people than land devoted to raising animals. Braudel focuses on three major crops: wheat, rice, and maize. These crops sit at the basis of everything: they determine population size, and their required inputs determine labor relations, animal usage (animals which in turn need their own crops), and so on.

Thus there became established in Europe, with certain regional variations, 'a complicated system of relationships and habits', based on wheat and other grains, which was 'so firmly cemented together that no fissure was possible' according to Ferdinand Lot. Plants, animals and people each had their place in it. In fact the whole system was inconceivable without the peasants, the harnessed teams of animals, and the seasonal labourers at harvest and threshing time, since reaping and threshing was all done by hand. The fertile lowlands called on labour from poor land, inevitably wild highland regions. Innumerable examples (the southern Jura and Dombes, the Massif Central and Languedoc) demonstrate that the partnership was a basic rule of life, repeated on many occasions. An immense crowd of harvesters arrived every summer in the Tuscan Maremma, where fever was so prevalent, in search of high wages (up to five paoli a day in 1796). Malaria regularly claimed innumerable victims there.


Wheat's unpardonable fault was its low yield: it did not provide for its people adequately. All recent studies establish the fact with an overwhelming abundance of detail and figures. Wherever one looks, from the fifteenth to the eighteenth century, the results were disappointing. For every grain sown, the harvest was usually no more than five and sometimes less.

Until very late, agricultural production was fertilizer-limited. In southern Europe half the field would lie fallow every year, and this only really changed after the industrial revolution. Trade happened on local exchanges, which, combined with laws against "hoarding", made local shortages problematic. In the 16thC total maritime trade was perhaps 1% of total consumption. White bread was a luxury until the latter half of the 18thC. Flour doesn't keep well, so every town had a mill that worked daily (about 1 mill per 400 people); any interruption (e.g. because of the river freezing) immediately created supply problems.

Rice and maize

Rice is an even more tyrannical and enslaving crop than wheat.

The key difference between rice and wheat is that the former can produce ~7.3 million kcal per hectare, whereas wheat can only reach 1.5 million. Unlike wheat, there was no need for fallow land, and by the 13thC in China a system of double (or sometimes triple) cropping was established. "And thus the great demographic expansion of southern China began."
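A quick back-of-envelope calculation makes the demographic point vivid. The per-hectare yields are the ones quoted above; the 2,000 kcal/day subsistence diet and the neglect of seed grain, fallow, and storage losses are my own simplifying assumptions, not Braudel's:

```python
# How many people can one hectare feed? (rough sketch, my assumptions)
KCAL_PER_PERSON_PER_DAY = 2_000  # assumed subsistence diet
KCAL_PER_PERSON_PER_YEAR = KCAL_PER_PERSON_PER_DAY * 365

def people_fed_per_hectare(kcal_per_ha_per_harvest, harvests_per_year=1):
    """People one hectare supports, ignoring seed grain, fallow, losses."""
    return kcal_per_ha_per_harvest * harvests_per_year / KCAL_PER_PERSON_PER_YEAR

wheat = people_fed_per_hectare(1_500_000)                       # single crop
rice = people_fed_per_hectare(7_300_000, harvests_per_year=2)   # double-cropped

print(f"wheat: ~{wheat:.1f} people/ha, rice: ~{rice:.1f} people/ha")
```

Roughly 2 people per hectare versus 20: an order of magnitude, before even counting the fallow half of the wheat fields.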

The high population density created by rice, combined with the necessity for elaborate top-down irrigation systems, resulted in strong state authority that constantly pursued large-scale works.

The problem then is that on one hand we have a series of striking achievements, on the other, human misery. As usual we must ask: who is to blame? Man of course. But maize as well.

While wheat yielded maybe 5 grains for every one planted, maize would yield 150 or more. It grows easily and requires little effort on the part of the farmer (perhaps 50 days per year). "The maize-growing societies on the irrigated terraces of the Andes or on the lakesides of the Mexican plateaux resulted in theocratic totalitarian systems and all the leisure of the peasants was used for gigantic public works of the Egyptian type."

After the discovery of the New World, potatoes and maize flowed back toward Eurasia, but very slowly. It took until the 18thC for maize to see widespread cultivation in Europe. The potato was strongly resisted everywhere, as people thought it caused leprosy or flatulence; it only spread rapidly in the face of famine or war.

Hoe cultivation belt

There is also an enormous region that spans the globe where work was done with a digging stick or hoe, and animals were generally not used. These societies were surprisingly homogeneous:

The world of men with hoes was characterized - and this is the most striking fact about it - by a fairly marked homogeneity of goods, plants, animals, tools and customs. We can say that the house of the peasant with a hoe, wherever it may be, is almost invariably rectangular and has only one storey. He is able to make coarse pottery, uses a rudimentary hand loom for weaving, prepares and consumes fermented drinks (but not alcohol), and raises small domestic animals - goats, sheep, pigs, dogs, chickens and sometimes bees (but not cattle). He lives off the vegetable world round about him: bananas, bread-fruit trees, oil palms, calabashes, taros and yams.

Superfluity and Sufficiency: Food and Drink

Eating Habits

Prices and therefore diets followed population numbers. Large-scale death from war or plague made meat accessible; overpopulation meant the peasants didn't even eat the wheat they produced.

Things had begun to change in the West by the middle of the sixteenth century. Heinrich Muller wrote in 1550 that in Swabia 'in the past they ate differently at the peasant's house. Then, there was meat and food in profusion every day; tables at village fairs and feasts sank under their load. Today, everything has truly changed. Indeed, for some years now, what a calamitous time, what high prices! And the food of the most comfortably-off peasants is almost worse than that of day-labourers and valets in the old days.'


The peasant often sold more than his 'surpluses', and above all, he never ate his best produce: he ate millet and maize and sold his wheat; he ate salt pork once a week and took his poultry, eggs, kids, calves and lambs to market.

Spoons and knives were old customs, but the fork dates to the 16thC and spread from Venice.

Anne of Austria ate her meat with her fingers all her life. And so did the Court of Vienna until at least 1651. Who used a fork at the Court of Louis XIV? The Duke of Montausier, whom Saint-Simon describes as being 'of formidable cleanliness'. Not the king, whose skill at eating chicken stew with his fingers without spilling it is praised by the same Saint-Simon! When the Duke of Burgundy and his brothers were admitted to sup with the king and took up the forks they had been taught to use, the king forbade them to use them. This anecdote is told by the Princess Palatine, with great satisfaction: she has 'always used her knife and fingers to eat with'.

The Baron de Tott has left a humorous description of a reception in the country house near Istanbul of 'Madame the wife of the First Dragoman', in 1760. This class of rich Greeks in the service of the Grand Turk adopted local customs, but liked to make some difference felt. 'A circular table, with chairs all round it, spoons, forks - nothing was missing except the habit of using them. But they did not wish to omit any of our manners which were just becoming as fashionable among the Greeks as English manners are among ourselves, and I saw one woman throughout the dinner taking olives with her fingers and then impaling them on her fork in order to eat them in the French manner'.

In the West, eggs were accessible to most people, as were cheese and milk. Butter remained limited to Northern Europe. Fish were generally an important source of nourishment, but with large regional variation. The Atlantic coast was particularly advanced in its exploitation of the ocean.

Fish was all the more important here as religious rulings multiplied the number of fast days: 166 days, including Lent, observed extremely strictly until the reign of Louis XIV. Meat, eggs and poultry could not be sold during those forty days except to invalids and with a double certificate from doctor and priest. To facilitate control, the 'Lent butcher' was the only person authorized to sell prohibited foods at that time in Paris, and only inside the area of the Hotel Dieu.

Sugar was brought from the East, with a lot of regional variation in consumption. "In 1800 England consumed 150,000 tons of sugar annually, almost fifteen times more than in 1700." But in other parts of Europe it was virtually unknown. Cultivation of sugar was a labor- and capital-intensive enterprise, and often in sugar colonies there was no space left for any other crops: food had to be imported.

Drinks, stimulants and drugs

Water was generally hard to come by. It did not keep well on ships, and many cities (like Venice) lacked a real supply and instead relied on filtered rain water and water brought from the mainland. Few aqueducts remained in use, though some were restored in the 15thC (Rome, Paris). Some places used hydraulic wheels to pump water from rivers. The late 18thC saw steam pumps in London and Paris, replacing water-carrying laborers. Snow water was reserved for the wealthy; there was a trade in it, with ships filled with snow moving around the Mediterranean.

Everyone drank wine, and alcoholism was increasingly a problem. The production was generally in the south of Europe, and trade brought it to the north. But it was all new wine, as it did not keep well: regular use of corks would take until the 17thC. The non-wine growing regions had beer, which the south "vigorously opposed". In some areas consumption reached 3 liters per day. "Beer of superior quality was being exported as far as the East Indies from Brunswick and Bremen by the end of the seventeenth century." Cider only started making headway in the 16thC, among the poor. Other civilizations fermented maple juice, agave, or maize.

The great innovation, the revolution in Europe was the appearance of brandy and spirits made from grain - in a word: alcohol. The sixteenth century created it; the seventeenth consolidated it; the eighteenth popularized it.

Stills existed in the West before the 12thC, but things took a while to get going, and the stills would remain primitive until 1773. The drinks started out as medicine. Various guilds fought hard for the privilege of producing brandy in France. Further north, where there were no vines for brandy, grain spirits were most popular. "By the early eighteenth century, the whole of London society, from top to bottom, was determinedly getting drunk on gin."

At nearly the same time as the discovery of alcohol, Europe, at the centre of the innovations of the world, discovered three new drinks, stimulants and tonics: coffee, tea and chocolate. All three came from abroad: coffee was Arab (originally Ethiopian); tea, Chinese; chocolate, Mexican.

Samuel Pepys drank his first cup of tea on September 25, 1660. A century later the English were consuming it by the boatload.

Superfluity and Sufficiency: Houses, Clothes and Fashion

Houses and interiors

The basic constraint on housing was local materials, and as such houses only changed very slowly. Stone mostly for the upper classes; wood (which was gradually replaced by brick) and thatched roof for most people. Earthen dwellings in places where neither stone nor wood existed. In rural areas homes were extremely simple.

Villages were often mobile, "they grew up, expanded, contracted, and also shifted their sites. Sometimes these 'desertions' were total and final - the Wustungen mentioned by German historians and geographers. More often the centre of gravity within a given cultivated area shifted, and everything - furniture, people, animals, stones - was moved out of the abandoned village to a site a few kilometres away."

On 3 February 1695 the Princess Palatine wrote: 'At the king's table the wine and water froze in the glasses.' [...] When the severity of the weather increased, as in Paris in 1709, 'the people died of cold like flies' (2 March). In the absence of heating since January (again according to the Princess Palatine) 'all entertainments have ceased as well as law suits'.

There were no fireplaces set in the wall before the 12thC. They spread fast, but the design was deficient and they were not very useful for warming homes. It took until the early 18thC for new chimney designs to come along: by utilizing the draught they vastly improved the fireplace's ability to warm the home.

People had almost no furnishings or other possessions. "Official reports for Burgundy between the sixteenth and the eighteenth centuries are full of 'references to people [sleeping] on straw... with no bed or furniture' who were only separated 'from the pigs by a screen'." Outside Europe even chairs were a rarity. In general there was very limited production of such items, and renovations were a large expense even for the rich.

Costume and fashion

Subject to incessant change, costume everywhere is a persistent reminder of social position. The sumptuary laws were therefore an expression of the wisdom of governments but even more of the resentment of the upper classes when they saw the nouveaux riches imitate them.

In societies that remained stable over time, so did dress. China, Japan, even Algiers. "The Indian women in New Spain in Cortes' day wore long tunics, sometimes embroidered, made of cotton and later of wool: and so they did still in the eighteenth century. Male costume, on the other hand, changed - but only to the extent that the conquerors and missionaries demanded clothing decently concealing the nudity of the past." Even in Western Europe in the early 19thC, peasants were still wearing simple coarse cloth that had not changed much for centuries. "In fact, the further back in time one goes, even in Europe, one is more likely to find the still waters of ancient situations like those we have described in India, China and Islam. The general rule was changelessness." The long robes which had persisted from Roman times were only abandoned around 1350.

Tradition was both a strength and a straitjacket. Perhaps if the door is to be opened to innovation, the source of all progress, there must be first some restlessness which may express itself in such trifles as dress, the shape of shoes and hairstyles? Perhaps too, a degree of prosperity is needed to foster any innovating movement?

People in Europe were dirty. Late 18thC Parisians might bathe once or twice per year.

The West even experienced a significant regression from the point of view of body baths and bodily cleanliness from the fifteenth to the seventeenth centuries. [...] After the sixteenth century, public baths became less frequent and almost disappeared, it was said because of the risk of infection and in particular the terrible disease of syphilis. Another reason was no doubt the influence of preachers, both Catholic and Calvinist, who fulminated against the moral dangers and ignominy of the baths. Although rooms for bathing survived in private homes for a long time, the bath became a means of medication rather than a habit of cleanliness.

The Spread of Technology: Sources of Energy, Metallurgy

There are times when technology represents the possible, which for various reasons - economic, social or psychological - men are not yet capable of achieving or fully utilizing; and other times when it is the ceiling which materially and technically blocks their efforts. In the latter case, when one day the ceiling can resist the pressure no longer, the technical breakthrough becomes the point of departure for a rapid acceleration. However, the force that overcomes the obstacle is never a simple internal development of technology or science, or at any rate not before the nineteenth century.

Energy was the key problem. Coal had been used in Europe since the 11thC and in China perhaps as early as 4000 BC, but it took very long to realize how much potential it had. Instead the main sources of energy were humans, animals, wind and water, and wood.

Particularly outside Europe, human power was used to an extreme degree. And cheap labor was a problem for the development of machinery.

The precondition for progress was probably a reasonable balance between human labour and other sources of power. The advantage was illusory when man competed with machines inordinately, as in the ancient world and China, where mechanization was ultimately blocked by cheap labour.

In the Old World, camels and mules were indispensable for transportation. Oxen were everywhere, mostly for working the land but also for transportation. Later farming practices replaced them with horses, but that required horse technology improvements such as better harnesses (and it would take very long for these advancements to spread - "The Chinese were still using wooden saddles and ordinary ropes instead of reins in the eighteenth century.") Lavoisier estimated 1.8 million horses and 3 million oxen in France.

The West experienced its first mechanical revolution in the eleventh, twelfth and thirteenth centuries. Not so much a revolution, perhaps, as a whole series of slow changes brought about by the increased numbers of wind- and watermills. The power from these 'primary engines' was probably not very great, from two to five horse-power from a water-wheel, sometimes five, at most ten, from the sails of a windmill. But they represented a considerable increase of power in an economy where power supplies were poor. And they undoubtedly played a part in Europe's first age of growth.


The uses of the water-wheel had become manifold; it worked pounding devices for crushing minerals, heavy tilt hammers used in iron-forging, enormous beaters used by cloth fullers, bellows at iron-works; also pumps, grindstones, tanning mills and paper mills, which were the last to appear. We should also mention the mechanical saws that appeared in the thirteenth century.

Watermills provided power for mines, which saw a rise in the 15thC: they raised ore, ventilated galleries, pumped water, etc. On the eve of the industrial revolution there were perhaps 500,000 watermills in Europe.
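It is worth putting those two figures together. Using the ~500,000 mills and the 2-5 hp per wheel quoted above, a rough aggregate of Europe's water power can be sketched; the horsepower-to-watt conversion and the sustained output of a human labourer are my assumed figures, not Braudel's:

```python
# Rough aggregate of Europe's water power on the eve of the industrial
# revolution. Conversion constants are my assumptions.
HP_IN_WATTS = 746            # one mechanical horsepower
HUMAN_SUSTAINED_WATTS = 75   # rough sustained output of one labourer

mills = 500_000
for hp_per_mill in (2, 5):
    total_hp = mills * hp_per_mill
    labourer_equiv = total_hp * HP_IN_WATTS / HUMAN_SUSTAINED_WATTS
    print(f"{hp_per_mill} hp/mill: {total_hp:,} hp, "
          f"~{labourer_equiv / 1e6:.0f} million labourer-equivalents")
```

Even at the low end, that is power on the order of ten million extra labourers, which helps explain why Braudel credits these "primary engines" with a role in Europe's first age of growth.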

Windmills were a later invention, and the key development was to fit the wheel vertically (as opposed to horizontally, as they had been used in China for centuries), which greatly increased their power. Their uses were not limited to milling; in the Netherlands they drove bucket chains that drained water, a key instrument for land reclamation.

Wood was important both directly as a source of energy when burned, and as a building material for machines, ships, etc. Huge transportation costs unless it could be floated down a waterway. By the 18thC demand and prices had skyrocketed. "In France in the eighteenth century, it was said that a single forge used as much wood as a town the size of Chalons-sur-Marne. Enraged villagers complained of the forges and foundries which devoured the trees of the forests, not even leaving enough for the bakers' ovens."

As for coal, there were two key locations in Europe: Liege and Newcastle. Newcastle's coal production increased 15x between the mid-16th and mid-17th century.

It was an integral part of the coal revolution that modernized England after 1600, enabling fuel to be used in a series of industries with large outputs: the manufacture of salt by evaporating sea water; the production of sheets of glass, bricks, and tiles; sugar refining; the treatment of alum, previously imported from the Mediterranean but now developed on the Yorkshire coast; not to mention the bakers' ovens, breweries and the enormous amount of domestic heating that was to pollute London for centuries.


There was thus an often imperceptible or unrecognized industrial pre-revolution in an accumulation of discoveries and technical advances, some of them spectacular, others almost invisible: various types of gear-wheels, jacks, articulated transmission belts, the 'ingenious system of reciprocating movement', the fly-wheel that regularized any momentum, rolling mills, more and more complicated machinery for the mines. [...] It is revealing to see how European travellers unfailingly comment on the contrast between the primitive machinery in use in India and China, and the quality and refinement of its products.


With the coming of steam, the pace of the West increased as if by magic. But the magic can be explained: it had been prepared and made possible in advance.


Today production is calculated in thousands of tons; 200 years ago they talked about 'hundredweights', which were quintals, the equivalent of fifty present-day kilograms. That is the difference in scale. It divides two civilizations. As Morgan wrote in 1877: 'When iron succeeded in becoming the most important production material, it was the event of events in the evolution of humanity.'

In 1800 metallurgy was still mostly traditional, the economy was dominated by textiles. Metallurgical products other than luxury items did not travel.

We are speaking of the period before the first smelting of steel, before the discovery of puddling, before the general use of coke for smelting, before the long sequence of famous names and processes: Bessemer, Siemens, Martin, Thomas. We are speaking of what was still another planet.

There were two major advances: an early one in China which stagnated by the 13thC, and the later one in Europe leading up to the industrial revolution.

After two smeltings in the crucible, the product obtained enabled the Chinese to cast ploughshares or cooking pots in series - an art that the West discovered only some eighteen or twenty centuries later. [...] Another triumph of Asiatic smelting by crucible was the manufacture - thought by some to be of Indian origin, by others Chinese - of a special kind of steel, 'high quality carbonized steel', as good as the best hypereutectoid steels made today. The nature of this steel and the secrets of its manufacture remained a mystery to Europeans until the nineteenth century. [...] What is so extraordinary is that after this incredibly early start, Chinese metallurgy progressed no further after the thirteenth century. Chinese foundries and forges made no more discoveries, but simply repeated their old processes. Coke-smelting - if it was known at all - was not developed. It is difficult to ascertain this, let alone explain it. But Chinese development as a whole poses the same problem time after time: veiled in mystery, it has not yet been resolved.

In Europe, the water-wheel was crucial in the development of iron-smelting, starting with blast furnaces in the 14thC. Water powered enormous bellows and pounding devices - ironworks had to move from forests to riversides. Generally everything was made in small workshops with a master and 3 or 4 workers, but these tended to be concentrated: Brescia had perhaps 200 arms factories.

The Spread of Technology: Revolution and Delays

Innovations penetrated only slowly and with difficulty. The great technological 'revolutions' between the fifteenth and eighteenth centuries were artillery, printing and ocean navigation. But to speak of revolution here is to use a figure of speech. None of these was accomplished at breakneck speed, and only the third - ocean navigation - eventually led to an imbalance, or 'asymmetry' between different parts of the globe.


Artillery

Gunpowder was produced in China from the 9thC. In Europe, it took until the 14-15thC for pieces to become larger and gunpowder cheaper. Mobility was an issue; large teams of horses were needed to move them. Early cannons fired on walls almost at point-blank range. Defense design changed from stone ramparts to earthworks. Artillery was installed on ships very early on; by the late 14thC all English ships had some. But it was a bit of a mess, and cannon-ports were not a regular feature until the 16thC. Arquebuses appeared in the 15thC, slow and cumbersome. Muskets came a bit later, with similar issues. Only with the rifle at the start of the 18thC do we start seeing large changes.

The new warfare had huge costs, favoring centralization and rich states: independent cities (which had preserved their autonomy in the middle ages) were eliminated as their walls were easily knocked over by huge cannons.

But the cost of artillery did not end when it had been built and supplied with ammunition. It had also to be maintained and moved. The monthly bill for maintenance of the fifty pieces the Spaniards had in the Netherlands in 1554 (cannon, demi-cannon, culverins and serpentines) was over forty thousand ducats. To set such a mass in motion required a 'small train' of 473 horses for the mounted troops and a 'large train' of 1014 horses and 575 wagons (with 4 horses each), or 4777 horses in all, which meant almost 90 horses per piece. At the same period a galley cost about 500 ducats a month to maintain.

In the late 16thC, Venice had gunpowder in store that cost more than the entire annual receipts of the city.

Paper and Printing

A similar story to gunpowder. Originally developed in the East. Industry took off by the application of water-wheel power to manufacture. "The invention travelled round the world. Like gunners looking for hire, printing workers with makeshift equipment wandered at random, settled down when the opportunity offered and moved on again to accept the welcome of a new patron." Spread fairly quickly around Europe at the end of the 15thC. Perhaps 20 million books printed before 1500 (for a population of 70 million). A key ingredient in 16thC humanism (spreading Greek/Latin thought and mathematics), and later the reformation and counter-reformation.

Ocean Navigation

"The conquest of the high seas gave Europe a world supremacy that lasted for centuries." It also presents a problem: why was this technology not diffused into other cultures?

The Chinese junks, despite their many advantages (sails, rudders, hulls with watertight compartments, compasses after the eleventh century, and a large displacement volume from the fourteenth), went as far as Japan but did not venture beyond the Gulf of Tonkin to the south.

Shipbuilding technology in Europe drew from diverse traditions. The 15thC Portuguese caravel was a marriage of north and south. There was a fairly long history of exploration: the Faroes and Greenland were found multiple times in the first millennium. The Vivaldi brothers attempted to reach the Indies at the end of the 13thC, but were lost at sea. In the 15thC the Chinese started making voyages of exploration under the Muslim eunuch admiral Zheng He. The seventh and last voyage reached Hormuz. Then everything just stopped.

The Atlantic consists of three large wind and sea circuits, shown on a map as three great ellipses. The currents and winds will take a boat in either direction with no effort on its part, as both the Vikings' circuit of the North Atlantic and the voyage of Columbus demonstrate.

For this to be achieved, "Europe had to be aroused to a more active material life, combine techniques from north and south, learn about the compass and navigational charts and above all conquer its instinctive fear." Perhaps the growth of capitalist forces was what made these voyages possible. But it was not entirely a matter of money: both China and Islam were rich societies at the time.

What historians have called the hunger for gold, the hunger to conquer the world or the hunger for spices was accompanied in the technological sphere by a constant search for new inventions and utilitarian applications - utilitarian in the sense that they would actually serve mankind, making human labour both less wearisome and more efficient. The accumulation of practical discoveries showing a conscious will to master the world and a growing interest in every source of energy was already shaping the true face of Europe and hinting at things to come, well before that success was actually achieved.


Up to the eighteenth century, sea journeys were interminable and overland transport went at snail's pace. [...] The 'defeat of distance', as Ernst Wagemann calls it, was only to be achieved after 1875, with the laying of the first intercontinental cable. True mass communication on a world scale did not appear until the age of the railway, the steamship, telegraph and telephone. Very little changed in the means of transportation across this time. Paul Valery pointed out that 'Napoleon moved no faster than Julius Caesar'. Stone/paved roads increased speeds a bit, but these long remained exceptions. The 18thC saw improvements with paved roads and stagecoaches, prefiguring the railway. These were the result of large-scale investment: economic growth made possible in practice what had long been possible technically.

Roadside inns and staging houses were important, typically these had to be reached by evening. "A Neapolitan traveller described these inns more simply in 1693: 'They are nothing but... long stables where the horses occupy the central part; the sides are left for the Masters.' [...] Amenities and speed were the privileges of populated and firmly maintained, 'policed', lands: China, Japan, Europe, Islam." In the rest of the world travel was even more difficult.

Sea routes were fixed, being dependent on winds. Water was more efficient of course (perhaps by a factor of 100!), so waterways brought activity to the areas around them.


The same process can be observed everywhere: any society based on an ancient structure which opens its doors to money sooner or later loses its acquired equilibria and liberates forces that can never afterwards be adequately controlled.

Barter remained the general rule over most of the globe up to the 18th century. Depending on local conditions barter could be partially replaced by primitive currencies such as cowrie shells. Often a highly valued/circulated commodity played the role of money: salt in Senegal, dried fish in Iceland, furs in Alaska and Russia. Other places used cloth, gold dust, copper bracelets, animals, sugar, or cocoa. In some places these lasted for a very long time: Corsica "was not annexed by a really efficient monetary economy until after the First World War."

Early metallic money faced problems with speculation, existed only in large denominations, and was often scarce. These limitations meant that the coins barely touched the masses. Japan, India, Islam, and China were familiar with coinage from early on. China even experimented with paper money from the 9th to the 14thC, but hyperinflation ruined the system. Afterwards China used cumbersome copper and lead coins, with silver for higher-level transactions.

In Europe the metals used were typically gold, silver, and copper. When and where these were used depended on the economy, the relative values of the metals, etc.

Their production was irregular and never very flexible, so that depending on circumstances, one of the two metals would be relatively more plentiful than the other; then, with varying degrees of slowness, the situation would reverse, and so on. This resulted in upsets and disasters on the exchanges, and led above all to those slow but powerful fluctuations which were a feature of the monetary ancien regime. It is a well-known truth that 'silver and gold are hostile brothers'.

In general, after the age of exploration, specie flowed from the New World and Europe into the Indies and China, as that is what the Europeans exchanged for commodities from the East.

The 'jingle of coin' thus found its way into everyday life by many different paths. The modern state was the great provider (taxes, mercenaries' pay in money, office-holders' salaries) and recipient of these transfers; but not the only one. Many people were well placed to benefit: the tax-collector, the salt-tax farmer, the pawnbroker, the landowner, the large merchant entrepreneur and the 'financier'. Their net stretched everywhere. And naturally this new wealthy class, like their equivalent today, did not arouse sympathy.

Paper Money and Credit

To be found in circulation alongside metallic money were both fiduciary money (bank notes) and scriptural money (created by the process of book-keeping, by transferring money from one bank account to another: a practice known to the Germans as Buchgeld, book money).

The use of notes in trade is ancient (at least from 2000 BC), and was also well-known outside of Europe. The Europeans rediscovered bills of exchange in the 13thC: "When the West rediscovered the old instruments, it was not like discovering America. In fact every economy that found itself restricted by metallic currency fairly quickly opened up instruments of credit of its own accord, as though in a logical and natural development. They sprang from its commitments, and no less from its shortcomings."

What began to happen very soon was the artificial manufacture of money, of ersatz or perhaps one might say 'manipulated and manipulable' money. All those bank promoters and eventually the Scot, John Law, gradually realized 'the business potentialities of the discovery that money - and hence capital in the monetary sense of the term - can be manufactured or created'. This was both a sensational discovery (a lot better than the alchemists!) and a huge temptation. And what a revelation it is for us: it was the slow pace of the heavy metal money, its failure so to speak to keep the engine running, that created the necessary profession of banker, at the very dawn of economic life. He was the man who repaired or tried to repair the mechanical breakdown.

Towns and Cities

Towns, cities, are turning-points, watersheds of human history. When they first appeared, bringing with them the written word, they opened the door to what we now call history. Their revival in Europe in the eleventh century marked the beginning of the continent's rise to eminence. When they flourished in Italy, they brought the age of the Renaissance. So it has been from the city-states, the poleis of ancient Greece, and the medinas of the Muslim conquest, down to our own times. All major bursts of growth are expressed by an urban explosion.


If towns are considered to be settlements of over 400 inhabitants, then 10% of the English population was living in towns in 1500, and 25% in 1700. But if 5000 is taken as the minimum definition, the figure would only be 13% in 1700, 16% in 1750, 25% in 1801.

The fundamental aspects of towns: power, markets, division of labor. Cities were population sinks, drawing in immigrants from the countryside. Lots of poverty, lots of death, lots of abandoned children, lots of old/sick/dying in horrible poor-houses like the Hotel-Dieu. The squares would fill up every morning with peasants selling fresh produce. Except for a few places like England, towns all had fortifications. Growth was "organic": city planning had died with the Roman Empire. However, outside of Europe and Islam, the grid pattern was a universal standard.

The West had long ensured security at a low cost by a moat and a perpendicular wall. This did little to interfere with urban expansion - much less than is usually thought. When the town needed more space the walls were moved like theatre sets - in Ghent, Florence, and Strasbourg, for example - and as many times as was required. Walls were made-to-measure corsets. Towns grew and made themselves new ones.


Western towns faced severe problems from the fifteenth century onwards. Their populations had increased and artillery made their ancient walls useless. They had to be replaced, whatever the cost, by wide ramparts half sunk in the ground, extended by bastions, terrepleins, 'cavaliers', where loose soil reduced possible damage from bullets. These ramparts were wider horizontally and could no longer be moved without enormous expense. And an empty space in front of these fortified lines was essential to defence operations; buildings, gardens and trees were therefore forbidden there.

The consequence was vertical growth and higher land prices inside the towns. Carriages from the 16thC onwards created huge problems as the streets were generally not equipped to deal with them.

Islamic towns were very large as a rule, and distant from each other. [...] The Great Mosque stood in the centre, with shopping streets (souqs) and warehouses (khans or caravanserai) all around; then a series of craftsmen ranged in concentric circles in a traditional order which always reflected notions concerning what was clean and what was unclean.

The Originality of Western Towns

The West quite soon became a kind of luxury of the world. The towns there had been brought to a pitch hardly found anywhere else.

Its towns were marked by an unparalleled freedom. They had developed as autonomous worlds and according to their own propensities. They had outwitted the territorial state, which was established slowly and then only grew with their interested cooperation - and was moreover only an enlarged and often insipid copy of their development. They ruled their countrysides autocratically, regarding them exactly as later powers regarded their colonies, and treating them as such. They pursued an economic policy of their own via their satellites and the nervous system of urban relay points; they were capable of breaking down obstacles and creating or recreating protective privileges.

But the main, the unpredictable thing was that certain towns made themselves into autonomous worlds, city-states, buttressed with privileges (acquired or extorted) like so many juridical ramparts.

The town was able to try the experiment of leading a completely separate life for quite a long time. This was a colossal event. Its genesis cannot be pinpointed with certainty, but its enormous consequences are visible.


They invented public loans: the first issues of the Monte Vecchio in Venice could be said to go back to 1167. [...] One after another, they reinvented gold money. [...] They organized industry and the guilds; they invented long-distance trade, bills of exchange, the first forms of trading companies and accountancy. They also quickly became the scene of class struggles.


Capitalism and towns were basically the same thing in the West. Lewis Mumford humorously claimed that capitalism was the cuckoo's egg laid in the confined nests of the medieval towns. By this he meant to convey that the bird was destined to grow inordinately and burst its tight framework (which was true), and then link up with the state, the conqueror of towns but heir to their institutions and way of thinking and completely incapable of dispensing with them.


Only the West swung completely over in favour of its towns. The towns caused the West to advance. It was, let us repeat, an enormous event, but the deep-seated reasons behind it are still inadequately explained. What would the Chinese towns have become if the junks had discovered the Cape of Good Hope at the beginning of the fifteenth century, and had made full use of such a chance of world conquest?

The Big Cities

For a long time the only big cities in the world had been in the East and Far East. Marco Polo's amazement makes it clear that the East was the site of empires and enormous cities. With the sixteenth century, and more still during the following two centuries, large towns grew up in the West, assumed positions of prime importance and retained them brilliantly thereafter.

Braudel examines some of the most important cities: Naples, Paris, St. Petersburg, Peking.

In London, the agglomeration of huge masses of poor people was seen as a threat.

In Elizabeth's reign observers already regarded London as an exceptional world. For Thomas Dekker it was 'the Queene of Cities', made incomparably more beautiful by its winding river than Venice itself judged by the marvellous view of the Grand Canal (a very paltry sight compared with what London could offer). Samuel Johnson (20 September 1777) was even more lyrical: 'when a man is tired of London, he is tired of life; for there is in London all that life can afford.' [...] The royal government shared these illusions, but it was none the less in constant fear of the enormous capital. In its eyes London was a monster whose unhealthy growth had to be limited at all costs. [...] The first prohibition on new building (with exceptions in favour of the rich) appeared in 1580. Others followed in 1593, 1607 and 1625. The result was to encourage the dividing-up of existing houses and secret construction-work in poor brick in the courtyards of old houses, away from the street and even from minor alleys.

Regardless, it grew from about 93,000 inhabitants in 1563 to over 700,000 in 1700.

In the seventeenth and eighteenth centuries fresh expansion pushed the town in all directions at once. Appalling districts grew up on the outskirts - shanty towns with filthy huts, unsightly industries (notably innumerable brickworks), pig farms using household refuse for feed, accumulations of rubbish, and sordid streets.

What can we conclude? That London, alongside Paris, was a good example of what a capital of the ancien régime could be. A luxury that others had to pay for, a gathering of a few chosen souls, numerous servants and poor wretches, all linked however, by some collective destiny of the great agglomeration.


The truth is that these densely populated cities, in part parasites, do not arise of their own volition. [...] The world of the ancien régime, very largely a rural one, was slowly but surely collapsing and being wiped out. And great cities were not alone in bringing about the painful birth of the new order. It was often as spectators rather than participants that the capital cities watched the coming Industrial Revolution. Not London, but Manchester, Birmingham, Leeds, Glasgow and countless small mill-towns launched the new age. It was not even the capital accumulated by eighteenth-century patricians that was first invested in the new ventures. London did not take advantage of the industrial movement through her financial assets until about 1830. Paris for a moment looked as if she might welcome industry, but was quickly displaced by the establishment of the real industrial centres near the coalmines of the north, the waterpower of Alsace, and the iron of Lorraine.


Books, even history books, run away with their authors. This one has run on ahead of me. But what can one say about its waywardness, its whims, even its own logic, that will be serious and valid? Our children do as they please. And yet we are responsible for their actions.


Material life, of course, presents itself to us in the anecdotal form of thousands and thousands of assorted facts. Can we call these events? No: to do so would be to inflate their importance, to grant them a significance they never had. That the Holy Roman Emperor Maximilian ate with his fingers from the dishes at a banquet (as we can see from a drawing) is an everyday detail, not an event. So is the story about the bandit Cartouche, on the point of execution, preferring a glass of wine to the coffee he was offered. This is the dust of history, microhistory in the same sense that Georges Gurvitch talks about micro-sociology: little facts which do, it is true, by indefinite repetition, add up to form linked chains. Each of them represents the thousands of others that have crossed the silent depths of time and endured.


It is a fact that 'every great centre of population has worked out a set of elementary answers and has an unfortunate tendency to stick to them out of that force of inertia which is one of the great artisans of history'. What is a civilization then, if not the ancient settlement of a certain section of mankind in a certain place? It is a category of history, a necessary classification. Mankind has only shown any tendency to become united (and has certainly not yet succeeded) since the end of the fifteenth century. Until then, and the further we go back in time the more obvious it becomes, humanity was divided between different planets, each the home of an individual civilization or culture, with its own distinctive features and age-old choices. Even when they were close together, these solutions never combined.


In a context where other structures were inflexible (those of material life and, no less, those of ordinary economic life) capitalism could choose the areas where it wished and was able to intervene, and the areas it would leave to their fate, rebuilding as it went its own structures from these components, and gradually in the process transforming the structures of others.


I did not think it was possible to achieve an understanding of economic life as a whole if the foundations of the house were not first surveyed.

Get the book on Amazon.