Links & What I've Been Reading Q3 2020

High Replicability of Newly-Discovered Social-behavioral Findings is Achievable: a replication of 16 papers that followed "optimal practices" finds a high rate of replicability and effect sizes virtually identical to those of the original studies.

How do you decide what to replicate? This paper attempts to build a model that can be used to pick studies to maximize utility gained from replications.

Guzey on that deworming study: he tracks which variables are reported across 5 different drafts of the paper, starting in 2011. "But then you find that these variables didn’t move in the right direction. What do you do? Do you have to show these variables? Or can you drop them?"

I've been enjoying the NunoSempre forecasting newsletter, a monthly collection of links on forecasting.

COVID-19 made weather forecasts worse by limiting the meteorological data coming from airplanes.

The 16th paragraph in this piece on the long-term effects of coronavirus mentions that 2 out of 3 people with "long-lasting" COVID-19 symptoms never had COVID to begin with.

An experiment with working 120 hours in a week goes surprisingly well.

Gwern's giant GPT-3 page. The Zizek Navy Seal Copypasta is incredible, as are the poetic imitations.

Ethereum is a Dark Forest. "In the Ethereum mempool, these apex predators take the form of “arbitrage bots.” Arbitrage bots monitor pending transactions and attempt to exploit profitable opportunities created by them."

Tyler Cowen in conversation with Nicholas Bloom, lots of fascinating stuff on innovation and progress. "Just in economics — when I first started in economics, it was standard to do a four-year PhD. It’s now a six-year PhD, plus many of the PhD students have done a pre-doc, so they’ve done an extra two years. We’re taking three or four years longer just to get to the research frontier." Immediately made me think of Scott Alexander's Ars Longa, Vita Brevis.

The Progress Studies for Young Scholars youtube channel has a bunch of interesting interviews, including Cowen, Collison, McCloskey, and Mokyr.

From the promising new Works in Progress magazine, Progress studies: the hard question.

I've written a parser for your Kindle's My Clippings.txt file. It removes duplicates, splits them up by book, and outputs them in convenient formats. Works cross-platform.

Generative bad handwriting in 280 characters. You can find a lot more of that sort of thing by searching for #つぶやきProcessing on twitter.

A new ZeroHPLovecraft short story, Key Performance Indicators. Black Mirror-esque.

A great skit about Ecclesiastes from Israeli sketch show The Jews Are Coming. Turn on the subs.

And here's some sweet Dutch prog-rock/jazz funk from the 70s.

What I've Been Reading

  • Piranesi by Susanna Clarke. 16 years after Jonathan Strange & Mr Norrell, a new novel from Susanna Clarke! It's short and not particularly ambitious, but I enjoyed it a lot. A tight fantastical mystery that starts out similar to The Library of Babel but then goes off in a different direction.

  • The Poems of T. S. Eliot: the great ones are great, and there's a lot of mediocre stuff in between. Ultimately a bit too grey and resigned and pessimistic for my taste. I got the Faber & Faber hardcover edition and would not recommend it, it's unwieldy and the notes are mostly useless.

  • Antkind by Charlie Kaufman. A typically Kaufmanesque work about a neurotic film critic and his discovery of an astonishing piece of outsider art. Memory, consciousness, time, doubles, etc. Extremely good and laugh-out-loud funny for the first half, but the final 300-400 pages were a boring, incoherent psychedelic smudge.

  • Under the Volcano by Malcolm Lowry. Very similar to another book I read recently, Lawrence Durrell's Alexandria Quartet. I prefer Durrell. Lowry doesn't have the stylistic ability to make the endless internal monologues interesting (as eg Gass does in The Tunnel), and I find the central allegory deeply misguided. Also, it's the kind of book that has a "central allegory".

  • Less than One by Joseph Brodsky. A collection of essays, mostly on Russian poetry. If I knew more about that subject I think I would have enjoyed the book more. The essays on his life in Soviet Russia are good.

  • Science Fictions: Exposing Fraud, Bias, Negligence and Hype in Science by Stuart Ritchie. Very good, esp. if you are not familiar with the replication crisis. Some quibbles about the timing and causes of the problems. Full review here.

  • The Idiot by "Dostoyevsky". Review forthcoming.

  • Borges and His Successors: The Borgesian Impact on Literature and the Arts: a collection of fairly dull essays with little to no insight.

  • Samuel Johnson: Literature, Religion and English Cultural Politics from the Restoration to Romanticism by J.C.D. Clark: a dry but well-researched study on an extraordinarily narrow slice of cultural politics. Not really aimed at a general audience.

  • Dhalgren by Samuel R. Delany. A wild semi-autobiographical semi-post-apocalyptic semi-science fiction monster. It's a 900 page slog, it's puerile, the endless sex scenes (including with minors) are pointless at best, the characters are uninteresting, there's barely any plot, the 70s counterculture stuff is just comical, and stylistically it can't reach the works it's aping. So I can see why some people hate it. But I actually enjoyed it, it has a compelling strangeness to it that is difficult to put into words (or perhaps I was just taken in by all the unresolved plot points?). Its sheer size is a quality in itself, too. Was it worth the effort? Could I recommend it? Probably not.

  • Novum Organum by Francis Bacon. While he did not actually invent the scientific method, his discussion of empiricism, experiments, and induction was clearly a step in that direction. The first part deals with science and empiricism and induction from an abstract perspective and it feels almost contemporary, like it was written by a time traveling 19th century scientist or something like that. The quarrel between the ancients and the moderns is already in full swing here, Bacon dunks on the Greeks constantly and upbraids people for blindly listening to Aristotle. Question received dogma and popular opinions, he says. He points to inventions like gunpowder and the compass and printing and paper and says that surely these indicate that there's a ton of undiscovered ideas out there, we should go looking for them. He talks about cognitive biases and scientific progress:

    we are laying the foundations not of a sect or of a dogma, but of human progress and empowerment.

    Then you get to the second part and the middle ages hit you like a freight train, you suddenly realize this is no contemporary man at all and his conception of how the world works is completely alien. Ideas that to us seem bizarre and just intuitively nonsensical (about gravity, heat, light, biology, etc.) are only common sense to him. He repeats absurdities about worms and flies arising spontaneously out of putrefaction, that light objects are pulled to the heavens while heavy objects are pulled to the earth, and so on. Not just surface-level opinions, but fundamental things that you wouldn't even think someone else could possibly perceive differently.

    You won't learn anything new from Bacon, but it's a fascinating historical document.

  • The Book of Marvels and Travels by John Mandeville. This medieval bestseller (published around 1360) combines elements of travelogue, ethnography, and fantasy. It's unclear how much of it people believed, but there was huge demand for information about far-off lands and marvelous stories. Mostly compiled from other works, it was incredibly popular for centuries. In the age of exploration (Columbus took it with him on his trip) people were shocked when some of the fantastical stories (eg about cannibals) actually turned out to be true. The tricks the author uses to generate verisimilitude are fascinating: he adds small personal touches about people he met, sometimes says that he doesn't know anything about a particular region because he hasn't been there, etc.




What's Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers

I've seen things you people wouldn't believe.

Over the past year, I have skimmed through 2578 social science papers, spending about 2.5 minutes on each one. This was due to my participation in Replication Markets, a part of DARPA's SCORE program, whose goal is to evaluate the reliability of social science research. 3000 studies were split up into 10 rounds of ~300 studies each. Starting in August 2019, each round consisted of one week of surveys followed by two weeks of market trading. I finished in first place in 3 out of 10 survey rounds and 6 out of 10 market rounds. In total, about $200,000 in prize money will be awarded.

The studies were sourced from all social science disciplines (economics, psychology, sociology, management, etc.) and were published between 2009 and 2018 (in other words, most of the sample came from the post-replication crisis era).

The average replication probability in the market was 54%; while the replication results are not out yet (250 of the 3000 papers will be replicated), previous experiments have shown that prediction markets work well.1

This is what the distribution of my own predictions looks like:2

My average forecast was in line with the market. A quarter of the claims were above 76%. And a quarter of them were below 33%: we're talking hundreds upon hundreds of terrible papers, and this is just a tiny sample of the annual academic production.

Criticizing bad science from an abstract, 10000-foot view is pleasant: you hear about some stuff that doesn't replicate, some methodologies that seem a bit silly. "They should improve their methods", "p-hacking is bad", "we must change the incentives", you declare Zeuslike from your throne in the clouds, and then go on with your day.

But actually diving into the sea of trash that is social science gives you a more tangible perspective, a more visceral revulsion, and perhaps even a sense of Lovecraftian awe at the sheer magnitude of it all: a vast landfill—a great agglomeration of garbage extending as far as the eye can see, effluvious waves crashing and throwing up a foul foam of p=0.049 papers. As you walk up to the diving platform, the deformed attendant hands you a pair of flippers. Noticing your reticence, he gives a subtle nod as if to say: "come on then, jump in".

They Know What They're Doing

Prediction markets work well because predicting replication is easy.3 There's no need for a deep dive into the statistical methodology or a rigorous examination of the data, no need to scrutinize esoteric theories for subtle errors—these papers have obvious, surface-level problems.

There's a popular belief that weak studies are the result of unconscious biases leading researchers down a "garden of forking paths". Given enough "researcher degrees of freedom" even the most punctilious investigator can be misled.

I find this belief impossible to accept. The brain is a credulous piece of meat4 but there are limits to self-delusion. Most of them have to know. It's understandable to be led down the garden of forking paths while producing the research, but when the paper is done and you give it a final read-over you will surely notice that all you have is a n=23, p=0.049 three-way interaction effect (one of dozens you tested, and with no multiple testing adjustments of course). At that point it takes more than a subtle unconscious bias to believe you have found something real. And even if the authors really are misled by the forking paths, what are the editors and reviewers doing? Are we supposed to believe they are all gullible rubes?

People within the academy don't want to rock the boat. They still have to attend the conferences, secure the grants, publish in the journals, show up at the faculty meetings: all these things depend on their peers. When criticising bad research it's easier for everyone to blame the forking paths rather than the person walking them. No need for uncomfortable unpleasantries. The fraudster can admit, without much of a hit to their reputation, that indeed they were misled by that dastardly garden, really through no fault of their own whatsoever, at which point their colleagues on twitter will applaud and say "ah, good on you, you handled this tough situation with such exquisite virtue, this is how progress happens! hip, hip, hurrah!" What a ridiculous charade.

Even when they do accuse someone of wrongdoing they use terms like "Questionable Research Practices" (QRP). How about Questionable Euphemism Practices?

  • When they measure a dozen things and only pick their outcome variable at the end, that's not the garden of forking paths but the greenhouse of fraud.
  • When they do a correlational analysis but give "policy implications" as if they were doing a causal one, they're not walking around the garden, they're doing the landscaping of forking paths.
  • When they take a continuous variable and arbitrarily bin it to do subgroup analysis or when they add an ad hoc quadratic term to their regression, they're...fertilizing the garden of forking paths? (Look, there's only so many horticultural metaphors, ok?)

The bottom line is this: if a random schmuck with zero domain expertise like me can predict what will replicate, then so can scientists who have spent half their lives studying this stuff. But they sure don't act like it.

...or Maybe They Don't?

The horror! The horror!

Check out this crazy chart from Yang et al. (2020):

Yes, you're reading that right: studies that replicate are cited at the same rate as studies that do not. Publishing your own weak papers is one thing, but citing other people's weak papers? This seemed implausible, so I decided to do my own analysis with a sample of 250 articles from the Replication Markets project. The correlation between citations per year and (market-estimated) probability of replication was -0.05!

You might hypothesize that the citations of non-replicating papers are negative, but negative citations are extremely rare.5 One study puts the rate at 2.4%. Astonishingly, even after retraction the vast majority of citations are positive, and those positive citations continue for decades after retraction.6

As in all affairs of man, it once again comes down to Hanlon's Razor. Either:

  1. Malice: they know which results are likely false but cite them anyway.
  2. or, Stupidity: they can't tell which papers will replicate even though it's quite easy.

Accepting the first option would require a level of cynicism that even I struggle to muster. But the alternative doesn't seem much better: how can they not know? I, an idiot with no relevant credentials or knowledge, can fairly accurately tell good research from bad, but all the tenured experts cannot? How can they not tell which papers are retracted?

I think the most plausible explanation is that scientists don't read the papers they cite, which I suppose involves both malice and stupidity.7 Gwern has a nice write-up on this question citing some ingenious analyses based on the proliferation of misprints: "Simkin & Roychowdhury venture a guess that as many as 80% of authors citing a paper have not actually read the original". Once a paper is out there nobody bothers to check it, even though they know there's a 50-50 chance it's false!

Whatever the explanation might be, the fact is that the academic system does not allocate citations to true claims.8 This is bad not only for the direct effect of basing further research on false results, but also because it distorts the incentives scientists face. If nobody cited weak studies, we wouldn't have so many of them. Rewarding impact without regard for the truth inevitably leads to disaster.

There Are No Journals With Strict Quality Standards

Naïvely you might expect that the top-ranking journals would be full of studies that are highly likely to replicate, and the low-ranking journals would be full of p<0.1 studies based on five undergraduates. Not so! Like citations, journal status and quality are not very well correlated: there is no association between statistical power and impact factor, and journals with higher impact factor have more papers with erroneous p-values.

This pattern is repeated in the Replication Markets data. As you can see in the chart below, there's no relationship between h-index (a measure of impact) and average expected replication rates. There's also no relationship between h-index and expected replication within fields.

Even the crème de la crème of economics journals barely manage a ⅔ expected replication rate. 1 in 5 articles in QJE scores below 50%, and this is a journal that accepts just 1 out of every 30 submissions. Perhaps this (partially) explains why scientists are undiscerning: journal reputation acts as a cloak for bad research. It would be fun to test this idea empirically.

Here you can see the distribution of replication estimates for every journal in the RM sample:

As far as I can tell, for most journals the question of whether the results in a paper are true is a matter of secondary importance. If we model journals as wanting to maximize "impact", then this is hardly surprising: as we saw above, citation counts are unrelated to truth. If scientists were more careful about what they cited, then journals would in turn be more careful about what they publish.

Things Are Not Getting Better

Before we got to see any of the actual Replication Markets studies, we voted on the expected replication rates by year. Gordon et al. (2020) has that data: replication rates were expected to steadily increase from 43% in 2009/2010 to 55% in 2017/2018.

This is what the average predictions looked like after seeing the papers: from 53.4% in 2009 to 55.8% in 2018 (difference not statistically significant; black dots are means).

I frequently encounter the notion that after the replication crisis hit there was some sort of great improvement in the social sciences, that people wouldn't even dream of publishing studies based on 23 undergraduates any more (I actually saw plenty of those), etc. Stuart Ritchie's new book praises psychologists for developing "systematic ways to address" the flaws in their discipline. In reality there has been no discernible improvement.

The results aren't out yet, so it's possible that the studies have improved in subtle ways which the forecasters have not been able to detect. Perhaps the actual replication rates will be higher. But I doubt it. Looking at the distribution of p-values over time, there's a small increase in the proportion of p<.001 results, but nothing like the huge improvement that was expected.

Everyone is Complicit

Authors are just one small cog in the vast machine of scientific production. For this stuff to be financed, generated, published, and eventually rewarded requires the complicity of funding agencies, journal editors, peer reviewers, and hiring/tenure committees. Given the current structure of the machine, ultimately the funding agencies are to blame.9 But "I was just following the incentives" only goes so far. Editors and reviewers don't actually need to accept these blatantly bad papers.

Journals and universities certainly can't blame the incentives when they stand behind fraudsters to the bitter end. Paolo Macchiarini "left a trail of dead patients" but was protected for years by his university. Andrew Wakefield's famously fraudulent autism-MMR study took 12 years to retract. Even when the author of a paper admits the results were entirely based on an error, journals still won't retract.

Elisabeth Bik documents her attempts to report fraud to journals. It looks like this:

The Editor in Chief of Neuroscience Letters [Yale's Stephen G. Waxman] never replied to my email. The APJTM journal had a new publisher, so I wrote to both current Editors in Chief, but they never replied to my email.

Two papers from this set had been published in Wiley journals, Gerodontology and J Periodontology. The EiC of the Journal of Periodontology never replied to my email. None of the four Associate Editors of that journal replied to my email either. The EiC of Gerodontology never replied to my email.

Even when they do take action, journals will often let scientists "correct" faked figures instead of retracting the paper! The rate of retraction is about 0.04%; it ought to be much higher.

And even after being caught for outright fraud, about half of the offenders are allowed to keep working: they "have received over $123 million in federal funding for their post-misconduct research efforts".

Just Because a Paper Replicates Doesn't Mean it's Good

First: a replication of a badly designed study is still badly designed. Suppose you are a social scientist, and you notice that wet pavements tend to be related to umbrella usage. You do a little study and find the correlation is bulletproof. You publish the paper and try to sneak in some causal language when the editors/reviewers aren't paying attention. Rain is never even mentioned. Of course if someone repeats your study, they will get a significant result every time. This may sound absurd, but it describes a large proportion of the papers that successfully replicate.

Economists and education researchers tend to be relatively good with this stuff, but as far as I can tell most social scientists go through 4 years of undergrad and 4-6 years of PhD studies without ever encountering ideas like "identification strategy", "model misspecification", "omitted variable", "reverse causality", or "third-cause". Or maybe they know and deliberately publish crap. Fields like nutrition and epidemiology are in an even worse state, but let's not get into that right now.

"But Alvaro, correlational studies can be usef-" Spare me.

Second: the choice of claim for replication. For some papers it's clear (eg math educational intervention → math scores), but other papers make dozens of different claims which are all equally important. Sometimes the Replication Markets organisers picked an uncontroversial claim from a paper whose central experiment was actually highly questionable. In this way a study can get the "successfully replicates" label without its most contentious claim being tested.

Third: effect size. Should we interpret claims in social science as being about the magnitude of an effect, or only about its direction? If the original study says an intervention raises math scores by .5 standard deviations and the replication finds that the effect is .2 standard deviations (though still significant), that is considered a success that vindicates the original study! This is one area in which we absolutely have to abandon the binary replicates/doesn't replicate approach and start thinking more like Bayesians.

Fourth: external validity. A replicated lab experiment is still a lab experiment. While some replications try to address aspects of external validity (such as generalizability across different cultures), the question of whether these effects are relevant in the real world is generally not addressed.

Fifth: triviality. A lot of the papers in the 85%+ chance-to-replicate range are just really obvious. "Homeless students have lower test scores", "parent wealth predicts their children's wealth", that sort of thing. These are not worthless, but they're also not really expanding the frontiers of science.

So: while about half the papers will replicate, I would estimate that only half of those are actually worthwhile.

Lack of Theory

The majority of journal articles are almost completely atheoretical. Even if all the statistical, p-hacking, publication bias, etc. issues were fixed, we'd still be left with a ton of ad-hoc hypotheses based, at best, on (WEIRD) folk intuitions. But how can science advance if there's no theoretical grounding, nothing that can be refuted or refined? A pile of "facts" does not a progressive scientific field make.

Michael Muthukrishna and the superhuman Joe Henrich have written a paper called A Problem in Theory which covers the issue better than I ever could. I highly recommend checking it out.

Rather than building up principles that flow from overarching theoretical frameworks, psychology textbooks are largely a potpourri of disconnected empirical findings.

There's Probably a Ton of Uncaught Frauds

This is a fairly lengthy topic, so I made a separate post for it. tl;dr: I believe about 1% of falsified/fabricated papers are retracted, but overall they represent a very small portion of non-replicating research.

Power: Not That Bad

[Warning: technical section. Skip ahead if bored.]

A quick refresher on hypothesis testing:

  • α, the significance level, is the probability of a false positive.
  • β, the type II error rate, is the probability of a false negative.
  • Power is (1-β): if a study has 90% power, there's a 90% chance of successfully detecting the effect being studied. Power increases with sample size and effect size.
  • The probability that a significant p-value indicates a true effect is not 1-α. It is called the positive predictive value (PPV), and is calculated as follows: $PPV = \frac{prior \cdot power}{prior \cdot power + (1-prior) \cdot \alpha}$

This great diagram by Felix Schönbrodt gives the intuition behind PPV:

This model makes the assumption that effects can be neatly split into two categories: those that are "real" and those that are not. But is this accurate? In the opposite extreme you have the "crud factor": everything is correlated so if your sample is big enough you will always find a real effect.10 As Bakan puts it: "there is really no good reason to expect the null hypothesis to be true in any population". If you look at the universe of educational interventions, for example, are they going to be neatly split into two groups of "real" and "fake" or is it going to be one continuous distribution? What does "false positive" even mean if there are no "fake" effects, unless it refers purely to the direction of the effect? Perhaps the crud factor is wrong, at least when it comes to causal effects? Perhaps the pragmatic solution is to declare that all effects with, say, d<.1 are fake and the rest are real? Or maybe we should just go full Bayesian?

Anyway, let's pretend the previous paragraph never happened. Where do we find the prior? There are a few different approaches, and they're all problematic.11

The exact number doesn't really matter that much (there's nothing we can do about it), so I'm going to go ahead and use a prior of 25% for the calculations below. The main takeaways don't change with a different prior value.

Now the only thing we're missing is the power of the typical social science study. To determine that we need to know 1) sample sizes (easy), and 2) the effect size of true effects (not so easy).14 I'm going to use the results of extremely high-powered, large-scale replication efforts:

Surprisingly large, right? We can then use the power estimates in Szucs & Ioannidis (2017): they give an average power of .49 for "medium effects" (d=.5) and .71 for "large effects" (d=.8). Let's be conservative and split the difference.

With a prior of 25%, power of 60%, and α=5%, PPV is equal to 80%. Assuming no fraud and no QRPs, 20% of positive findings will be false.
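For anyone who wants to check the arithmetic, here is the PPV formula from the refresher with those three numbers plugged in (nothing new, just the substitution written out):

```latex
\[
PPV = \frac{0.25 \cdot 0.60}{0.25 \cdot 0.60 + (1 - 0.25) \cdot 0.05}
    = \frac{0.15}{0.15 + 0.0375}
    = \frac{0.15}{0.1875}
    = 0.80
\]
```

The false share among positive findings is then 1 - 0.80 = 0.20.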

These averages hide a lot of heterogeneity: it's well-established that studies of large effects are adequately powered whereas studies of small effects are underpowered, so the PPV is going to be smaller for small effects. There are also large differences depending on the field you're looking at. The lower the power the bigger the gains to be had from increasing sample sizes.

This is what PPV looks like for the full range of prior/power values, with α=5%:

At the current prior/power levels, PPV is more sensitive to the prior: we can only squeeze small gains out of increasing power. That's a bit of a problem given the fact that increasing power is relatively easy, whereas increasing the chance that the effect you're investigating actually exists is tricky, if not impossible. Ultimately scientists want to discover surprising results—in other words, results with a low prior.
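One rough way to check the sensitivity claim is to estimate numerically how fast PPV moves when you nudge the prior versus when you nudge power. This is a minimal sketch, assuming "current levels" means the 25% prior and 60% power used above; the function names `ppv` and `sensitivity` are mine, purely for illustration.

```python
def ppv(prior, power, alpha=0.05):
    """Positive predictive value: share of significant results that are true effects."""
    return prior * power / (prior * power + (1 - prior) * alpha)


def sensitivity(prior, power, alpha=0.05, step=1e-6):
    """Central-difference estimates of the change in PPV per unit of prior and per unit of power."""
    d_prior = (ppv(prior + step, power, alpha) - ppv(prior - step, power, alpha)) / (2 * step)
    d_power = (ppv(prior, power + step, alpha) - ppv(prior, power - step, alpha)) / (2 * step)
    return d_prior, d_power


# Assumed "current" levels: prior = 25%, power = 60%, alpha = 5%.
d_prior, d_power = sensitivity(prior=0.25, power=0.60)
print(f"dPPV/dprior ~ {d_prior:.2f}, dPPV/dpower ~ {d_power:.2f}")
```

Under these assumptions the slope with respect to the prior comes out roughly three times the slope with respect to power, which is the asymmetry described above.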

I made a little widget so you can play around with the values:

[Interactive widget: inputs Alpha (0.05), Power (0.5), Prior (0.25); outputs false positives, true positives, false negatives, true negatives, and PPV.]
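For a static version of the same calculation, here is a minimal Python sketch. It is not the widget's actual code: the function name `confusion_shares` and the output keys are my own, and it simply splits all tested hypotheses into the four outcome cells as shares of the total.

```python
def confusion_shares(alpha: float, power: float, prior: float) -> dict:
    """Split all tested hypotheses into four outcome cells (as shares of the total) plus PPV."""
    true_pos = prior * power               # real effect, detected
    false_neg = prior * (1 - power)        # real effect, missed
    false_pos = (1 - prior) * alpha        # no effect, significant anyway
    true_neg = (1 - prior) * (1 - alpha)   # no effect, correctly not significant
    return {
        "false_positives": false_pos,
        "true_positives": true_pos,
        "false_negatives": false_neg,
        "true_negatives": true_neg,
        "ppv": true_pos / (true_pos + false_pos),
    }


# Same starting values as the widget: alpha = 0.05, power = 0.5, prior = 0.25.
result = confusion_shares(alpha=0.05, power=0.5, prior=0.25)
for name, value in result.items():
    print(f"{name:>16}: {value:.4f}")
```

Change the three arguments to explore other scenarios; the four shares always sum to 1.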

Assuming a 25% prior, increasing power from 60% to 90% would require more than twice the sample size and would only increase PPV by 5.7 percentage points. It's something, but it's no panacea. However, there is something else we could do: sample size is a budget, and we can allocate that budget either to higher power or to a lower significance cutoff. Lowering alpha is far more effective at reducing the false discovery rate.15
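A back-of-the-envelope check of those two numbers, as a sketch rather than an exact power analysis. It assumes a normal approximation in which the required sample size for a fixed effect size is proportional to (z_alpha + z_power)^2, and a one-sided test at α=5%; `relative_sample_size` is a name I made up for illustration.

```python
from statistics import NormalDist


def ppv(prior, power, alpha):
    """Positive predictive value: share of significant results that are true effects."""
    return prior * power / (prior * power + (1 - prior) * alpha)


def relative_sample_size(power_new, power_old, alpha, one_sided=True):
    """Ratio of required sample sizes under a normal approximation,
    where n is proportional to (z_alpha + z_power)^2 for a fixed effect size."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha) if one_sided else z(1 - alpha / 2)
    return ((z_alpha + z(power_new)) / (z_alpha + z(power_old))) ** 2


gain = ppv(0.25, 0.9, 0.05) - ppv(0.25, 0.6, 0.05)
ratio = relative_sample_size(0.9, 0.6, alpha=0.05)
print(f"PPV gain: {100 * gain:.1f} percentage points")
print(f"Sample size multiplier: {ratio:.2f}")
```

The multiplier stays above 2 whether you pass `one_sided=True` or `one_sided=False`, so the conclusion does not hinge on that assumption.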

Let's take a look at 4 different power/alpha scenarios, assuming a 25% prior and d=0.5 effect size.16 The required sample sizes are for a one-sided t-test.

False Discovery Rate

                 α = 0.05    α = 0.005
  Power = 0.5     23.1%        2.9%
  Power = 0.8     15.8%        1.8%

Required Sample Size

                 α = 0.05    α = 0.005
  Power = 0.5       45         110
  Power = 0.8      100         190
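The numbers in both tables can be roughly reproduced in a few lines. This is a sketch under stated assumptions, not the calculation that generated the tables: it uses a normal approximation instead of an exact t-test power analysis (so it comes out a few participants lower), and it assumes the sample sizes are totals across two equal groups. The function names are hypothetical.

```python
from statistics import NormalDist


def false_discovery_rate(prior, power, alpha):
    """Share of significant results that are false positives (1 - PPV)."""
    false_pos = (1 - prior) * alpha
    true_pos = prior * power
    return false_pos / (false_pos + true_pos)


def total_sample_size(d, power, alpha):
    """Approximate total N (both groups) for a one-sided two-sample comparison,
    using the normal approximation n_per_group = 2 * ((z_alpha + z_power) / d)^2."""
    z = NormalDist().inv_cdf
    n_per_group = 2 * ((z(1 - alpha) + z(power)) / d) ** 2
    return 2 * n_per_group


# Assumed inputs from the text: prior = 25%, effect size d = 0.5.
for power in (0.5, 0.8):
    for alpha in (0.05, 0.005):
        fdr = false_discovery_rate(0.25, power, alpha)
        n = total_sample_size(0.5, power, alpha)
        print(f"power={power}, alpha={alpha}: FDR={fdr:.1%}, N~{n:.0f}")
```

The false discovery rates match the table to one decimal place; the approximate sample sizes land within about five participants of the tabulated values.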

To sum things up: power levels are decent on average and improving them wouldn't do much. Power increases should be focused on studies of small effects. Lowering the significance cutoff achieves much more for the same increase in sample size.

Field of Dreams

Before we got to see any of the actual Replication Markets studies, we voted on the expected replication rates by field. Gordon et al. (2020) has that data:

This is what the predictions looked like after seeing the papers:

Economics is Predictably Good

Economics topped the charts in terms of expectations, and it was by far the strongest field. There are certainly large improvements to be made—a 2/3 replication rate is not something to be proud of. But reading their papers you get the sense that at least they're trying, which is more than can be said of some other fields. 6 of the top 10 economics journals participated, and they did quite well: QJE is the behemoth of the field and it managed to finish very close to the top. A unique weakness of economics is the frequent use of absurd instrumental variables. I doubt there's anyone (including the authors) who is convinced by that stuff, so let's cut it out.

EvoPsych is Surprisingly Bad

You were supposed to destroy the Sith, not join them!

Going into this, my view of evolutionary psychology was shaped by people like Cosmides, Tooby, DeVore, Boehm, and so on. You know, evolutionary psychology! But the studies I skimmed from evopsych journals were mostly just weak social psychology papers with an infinitesimally thin layer of evolutionary paint on top. Few people seem to take the "evolutionary" aspect really seriously.

Also, underdetermination problems are particularly difficult in this field, and nobody seems to care.

Education is Surprisingly Good

Education was expected to be the worst field, but it ended up being almost as strong as economics. When it came to interventions there were lots of RCTs with fairly large samples, which made their claims believable. I also got the sense that p-hacking is more difficult in education: there's usually only one math score which measures the impact of a math intervention, there's no early stopping, etc.

However, many of the top-scoring papers were trivial (eg "there are race differences in science scores"), and the field has a unique problem which is not addressed by replication: educational intervention effects are notorious for fading out after a few years. If the replications waited 5 years to follow up on the students, things would look much, much worse.

Demography is Good

Who even knew these people existed? Yet it seems they do (relatively) competent work. *googles some of the authors* Ah, they're economists. Well.

Criminology Should Just Be Scrapped

If you thought social psychology was bad, you ain't seen nothin' yet. Other fields have a mix of good and bad papers, but criminology is a shocking outlier. Almost every single paper I read was awful. Even among the papers that are highly likely to replicate, it's de rigueur to confuse correlation for causation.

If we compare criminology to, say, education, the headline replication rates look similar-ish. But the designs used in education (typically RCT, diff-in-diff, or regression discontinuity) are at least in principle capable of detecting the effects they're looking for. That's not really the case for criminology. Perhaps this is an effect of the (small number of) specific journals selected for RM, and there is more rigorous work published elsewhere.

There's no doubt in my mind that the net effect of criminology as a discipline is negative: to the extent that public policy is guided by these people, it is worse. Just shameful.

Marketing/Management

In their current state these are a bit of a joke, but I don't think there's anything fundamentally wrong with them. Sure, some of the variables they use are a bit fluffy, and of course there's a lack of theory. But the things they study are a good fit for RCTs, and if they just quintupled their sample sizes they would see massive improvements.

Cognitive Psychology

Much worse than expected; generally has a reputation as being one of the more solid subdisciplines of psychology, and has done well in previous replication projects. Not sure what went wrong here. It's only 50 papers and they're all from the same journal, so perhaps it's simply an unrepresentative sample.

Social Psychology

More or less as expected. All the silly stuff you've heard about is still going on.

Limited Political Hackery

Some of the most highly publicized social science controversies of the last decade happened at the intersection between political activism and low scientific standards: the implicit association test,[17] stereotype threat, racial resentment, etc. I thought these were representative of a wider phenomenon, but in reality they are exceptions. The vast majority of work is done in good faith.

While blatant activism is rare, there is a more subtle background ideological influence which affects the assumptions scientists make, the types of questions they ask, and how they go about testing them. It's difficult to say how things would be different under the counterfactual of a more politically balanced professoriate, though.

Interaction Effects Bad

A paper whose main finding is an interaction effect is about 10 percentage points less likely to replicate. Their usage is not inherently wrong; sometimes it's theoretically justified. But all too often you'll see blatant fishing expeditions with a dozen double and triple ad hoc interactions thrown into the regression. Interactions make it easy to do naughty things, and they tend to be underpowered.

Nothing New Under the Sun

All is mere breath, and herding the wind.

The replication crisis did not begin in 2010, it began in the 1950s. All the things I've written above have been written before, by respected and influential scientists. They made no difference whatsoever. Let's take a stroll through the museum of metascience.

Sterling (1959) analyzed psychology articles published in 1955-56 and noted that 97% of them rejected their null hypothesis. He found evidence of a huge publication bias, and a serious problem with false positives which was compounded by the fact that results are "seldom verified by independent replication".

Nunnally (1960) noted various problems with null hypothesis testing, underpowered studies, over-reliance on student samples (it doesn't take Joe Henrich to notice that using Western undergrads for every experiment might be a bad idea), and much more. The problem (or excuse) of publish-or-perish, which some portray as a recent development, was already in place by this time.[18]

The "reprint race" in our universities induces us to publish hastily-done, small studies and to be content with inexact estimates of relationships.

Jacob Cohen (of Cohen's d fame) in a 1962 study analyzed the statistical power of 70 psychology papers: he found that underpowered studies were a huge problem, especially for those investigating small effects. Successive studies by Sedlmeier & Gigerenzer in 1989 and Szucs & Ioannidis in 2017 found no improvement in power.

If we then accept the diagnosis of general weakness of the studies, what treatment can be prescribed? Formally, at least, the answer is simple: increase sample sizes.
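
To get a feel for why small effects are the worst hit, here is a minimal sketch of the standard normal-approximation sample-size formula for comparing two equal-sized groups with a two-sided test. The function name `n_per_group` and the choice of 80% power are my own assumptions for illustration, not anything from Cohen's paper, and a proper t-based calculation would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized mean difference (Cohen's d).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    # Conventional "large", "medium", and "small" values of Cohen's d.
    for d in (0.8, 0.5, 0.2):
        print(f"d={d}: n per group = {n_per_group(d)} at alpha=.05, "
              f"{n_per_group(d, alpha=0.005)} at alpha=.005")
```

Under these assumptions a medium effect (d = 0.5) needs about 63 participants per group, while a small one (d = 0.2) needs about 393: the required n scales with 1/d², so halving the effect size quadruples the sample.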

Paul Meehl (1967) is highly insightful on problems with null hypothesis testing in the social sciences, the "crud factor", lack of theory, etc. Meehl (1970) brilliantly skewers the erroneous (and still common) tactic of automatically controlling for "confounders" in observational designs without understanding the causal relations between the variables. Meehl (1990) is downright brutal: he highlights a series of issues which, he argues, make psychological theories "uninterpretable". He covers low standards, pressure to publish, low power, low prior probabilities, and so on.

I am prepared to argue that a tremendous amount of taxpayer money goes down the drain in research that pseudotests theories in soft psychology and that it would be a material social advance as well as a reduction in what Lakatos has called “intellectual pollution” if we would quit engaging in this feckless enterprise.

Rosenthal (1979) covers publication bias and the problems it poses for meta-analyses: "only a few studies filed away could change the combined significant result to a nonsignificant one". Cole, Cole & Simon (1981) present experimental evidence on the evaluation of NSF grant proposals: they find that luck plays a huge role, as there is little agreement between reviewers.

I could keep going to the present day with the work of Goodman, Gelman, Nosek, and many others. There are many within the academy who are actively working on these issues: the CASBS Group on Best Practices in Science, the Meta-Research Innovation Center at Stanford, the Peer Review Congress, the Center for Open Science. If you click those links you will find a ton of papers on metascientific issues. But there seems to be a gap between awareness of the problem and implementing policy to fix it. You've got tons of people doing all this research and trying to repair the broken scientific process, while at the same time journal editors won't even retract blatantly fraudulent research.

There is even a history of government involvement. In the 70s there were battles in Congress over questionable NSF grants, and in the 80s Congress (led by Al Gore) was concerned about scientific integrity, which eventually led to the establishment of the Office of Scientific Integrity. (It then took the federal government another 11 years to come up with a decent definition of scientific misconduct.) After a couple of embarrassing high-profile prosecutorial failures they more or less gave up, but they still exist today and prosecute about a dozen people per year.

Generations of psychologists have come and gone and nothing has been done. The only difference is that today we have a better sense of the scale of the problem. The one ray of hope is that at least we have started doing a few replications, but I don't see that fundamentally changing things: replications reveal false positives, but they do nothing to prevent those false positives from being published in the first place.

What To Do

The reason nothing has been done since the 50s, despite everyone knowing about the problems, is simple: bad incentives. The best cases for government intervention are collective action problems: situations where the incentives for each actor cause suboptimal outcomes for the group as a whole, and it's difficult to coordinate bottom-up solutions. In this case the negative effects are not confined to academia, but overflow to society as a whole when these false results are used to inform business and policy.

Nobody actually benefits from the present state of affairs, but you can't ask isolated individuals to sacrifice their careers for the "greater good": the only viable solutions are top-down, which means either the granting agencies or Congress (or, as Scott Alexander has suggested, a Science Czar). You need a power that sits above the system and has its own incentives in order: this approach has already had success with requirements for pre-registration and publication of clinical trials. Right now I believe the most valuable activity in metascience is not replication or open science initiatives but political lobbying.[19]

  • Earmark 60% of funding for registered reports (ie accepted for publication based on the preregistered design only, not results). For some types of work this isn't feasible, but for ¾ of the papers I skimmed it's possible. In one fell swoop, p-hacking and publication bias would be virtually eliminated.[20]
  • Earmark 10% of funding for replications. When the majority of publications are registered reports, replications will be far less valuable than they are today. However, intelligently targeted replications still need to happen.
  • Earmark 1% of funding for progress studies, including metascientific research that can be used to develop a serious science policy in the future.
  • Increase sample sizes and lower the significance threshold to .005. This one needs to be targeted: studies of small effects probably need to quadruple their sample sizes in order to get their power to reasonable levels. The median study would only need 2x or so. Lowering alpha is generally preferable to increasing power. "But Alvaro, doesn't that mean that fewer grants would be funded?" Yes.
  • Ignore citation counts. Given that citations are unrelated to (easily-predictable) replicability, let alone any subtler quality aspects, their use as an evaluative tool should stop immediately.
  • Open data, enforced by the NSF/NIH. There are problems with privacy but I would be tempted to go as far as possible with this. Open data helps detect fraud. And let's have everyone share their code, too—anything that makes replication/reproduction easier is a step in the right direction.
  • Financial incentives for universities and journals to police fraud. It's not easy to structure this well because on the one hand you want to incentivize them to minimize the frauds published, but on the other hand you want to maximize the frauds being caught. Beware Goodhart's law!
  • Why not do away with the journal system altogether? The NSF could run its own centralized, open website; grants would require publication there. Journals are objectively not doing their job as gatekeepers of quality or truth, so what even is a journal? A combination of taxonomy and reputation. The former is better solved by a simple tag system, and the latter is actually misleading. Peer review is unpaid work anyway, it could continue as is. Attach a replication prediction market (with the estimated probability displayed in gargantuan neon-red font right next to the paper title) and you're golden. Without the crutch of "high ranked journals" maybe we could move to better ways of evaluating scientific output. No more editors refusing to publish replications. You can't shift the incentives: academics want to publish in "high-impact" journals, and journals want to selectively publish "high-impact" research. So just make it impossible. Plus as a bonus side-effect this would finally sink Elsevier.
  • Have authors bet on replication of their research. Give them fixed odds, say 1:4; if it's good work, it's +EV for them. This sounds a bit distasteful, so we could structure the same cashflows as a "bonus grant" from the NSF when a paper you wrote replicates successfully.[22]
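
The betting idea in the last bullet comes down to a break-even probability. The sketch below is only an illustration: the text doesn't say which side of the 1:4 odds the author is on, so I'm assuming the author stakes 1 unit to win 4 if the paper replicates, and the function names are made up for the example.

```python
def expected_value(p_replicate: float, stake: float, payout: float) -> float:
    """Expected profit of a bet that wins `payout` with probability
    `p_replicate` and loses `stake` otherwise."""
    return p_replicate * payout - (1 - p_replicate) * stake


def break_even_probability(stake: float, payout: float) -> float:
    """Replication probability at which the bet has zero expected value."""
    return stake / (stake + payout)


if __name__ == "__main__":
    # Assumed reading of "1:4": risk 1 unit to win 4.
    print(break_even_probability(stake=1, payout=4))
    for p in (0.1, 0.5, 0.9):
        print(p, expected_value(p, stake=1, payout=4))
```

On this reading the bet is +EV for any paper with more than a 20% chance of replicating. If the odds run the other way (stake 4 to win 1), `break_even_probability(4, 1)` puts the bar at 80%, which is a much harsher filter.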

And a couple of points that individuals can implement today:

  • Just stop citing bad research, I shouldn't need to tell you this, jesus christ what the fuck is wrong with you people.
  • Read the papers you cite. Or at least make your grad students do it for you. It doesn't need to be exhaustive: the abstract, a quick look at the descriptive stats, a good look at the table with the main regression results, and then a skim of the conclusions. Maybe a glance at the methodology if they're doing something unusual. It won't take more than a couple of minutes. And you owe it not only to SCIENCE!, but also to yourself: the ability to discriminate between what is real and what is not is rather useful if you want to produce good research.[23]
  • When doing peer review, reject claims that are likely to be false. The base replication rate for studies with p>.001 is below 50%. When reviewing a paper whose central claim has a p-value above that, you should recommend against publication unless the paper is exceptional (good methodology, high prior likelihood, etc.).[24] If we're going to have publication bias, at least let that be a bias for true positives. Remember to subtract another 10 percentage points for interaction effects. You don't need to be complicit in the publication of false claims.
  • Stop assuming good faith. I'm not saying every academic interaction should be hostile and adversarial, but the good guys are behaving like dodos right now and the predators are running wild.

...My Only Friend, The End

The first draft of this post had a section titled "Some of My Favorites", where I listed the silliest studies in the sample. But I removed it because I don't want to give the impression that the problem lies with a few comically bad papers in the far left tail of the distribution. The real problem is the median.

It is difficult to convey just how low the standards are. The marginal researcher is a hack and the marginal paper should not exist. There's a general lack of seriousness hanging over everything—if an undergrad cites a retracted paper in an essay, whatever; but if this is your life's work, surely you ought to treat the matter with some care and respect.

Why is the Replication Markets project funded by the Department of Defense? If you look at the NSF's 2019 Performance Highlights, you'll find items such as "Foster a culture of inclusion through change management efforts" (Status: "Achieved") and "Inform applicants whether their proposals have been declined or recommended for funding in a timely manner" (Status: "Not Achieved"). Pusillanimous reports repeat tired clichés about "training", "transparency", and a "culture of openness" while downplaying the scale of the problem and ignoring the incentives. No serious actions have followed from their recommendations.

It's not that they're trying and failing—they appear to be completely oblivious. We're talking about an organization with an 8 billion dollar budget that is responsible for a huge part of social science funding, and they can't manage to inform people that their grant was declined! These are the people we must depend on to fix everything.

When it comes to giant bureaucracies it can be difficult to know where (if anywhere) the actual power lies. But a good start would be at the top: NSF director Sethuraman Panchanathan, SES division director Daniel L. Goroff, NIH director Francis S. Collins, and the members of the National Science Board. The broken incentives of the academy did not appear out of nowhere, they are the result of grant agency policies. Scientists and the organizations that represent them (like the AEA and APA) should be putting pressure on them to fix this ridiculous situation.

The importance of metascience is inversely proportional to how well normal science is working, and right now it could use some improvement. The federal government spends about $100b per year on research, but we lack a systematic understanding of scientific progress, we lack insight into the forces that underlie the upward trajectory of our civilization. Let's take 1% of that money and invest it wisely so that the other 99% will not be pointlessly wasted. Let's invest it in a robust understanding of science, let's invest it in progress studies, let's invest it in—the future.


Thanks to Alexey Guzey and Dormin for their feedback. And thanks to the people at SCORE and the Replication Markets team for letting me use their data and for running this unparalleled program.


  1. Dreber et al. (2015), Using prediction markets to estimate the reproducibility of scientific research.
    Camerer et al. (2018), Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015.
  2. The distribution is bimodal because of the way p-values are typically reported: there's a huge difference between p<.01 and p<.001. If actual p-values were reported instead of cutoffs, the distribution would be unimodal.
  3. Even laypeople are half-decent at it.
  4. Ludwik Fleck has an amusing bit on the development of anatomy: "Simple lack of 'direct contact with nature' during experimental dissection cannot explain the frequency of the phrase "which becomes visible during autopsy" often accompanying what to us seem the most absurd assertions."
  5. Another possible explanation is that importance is inversely related to replication probability. In my experience that is not the case, however. If anything it's the opposite: important effects tend to be large effects, and large effects tend to replicate. In general, any "conditioning on a collider"-type explanation doesn't work here because these citations also continue post-retraction.
  6. Some more:
  7. Though I must admit that after reading the papers myself I understand why they would shy away from the task.
  8. I can tell you what is rewarded with citations though: papers in which the authors find support for their hypothesis.
  9. Perhaps I don't understand the situation at places like the NSF or the ESRC but the problem seems to be incompetence (or a broken bureaucracy?) rather than misaligned incentives.
  10. Theoretically there's the possibility of overpowered studies being a problem. Meehl (1967) argues that 1) everything in psychology is correlated (the "crud factor"), and 2) theories only make directional predictions (as opposed to point predictions in eg physics). So as power increases the probability of finding a significant result for a directional prediction approaches 50% regardless of what you're studying.
  11. In medicine there are plenty of cohort-based publication bias analyses, but I don't think we can generalize from those to the social sciences.
  12. But RRs are probably not representative of the literature, so this is an overestimate. And who knows how many unpublished pilot studies are behind every RR?
  13. Dreber et al. (2015) use prediction market probabilities and work backward to get a prior of 9%, but this number is based on unreasonable assumptions about false positives: they don't take into account fraud and QRPs. If priors were really that low, the entire replication crisis would be explained purely by normal sampling error: no QRPs!
  14. Part of the issue is that the literature is polluted with a ton of false results, which actually pushes estimates of true effect sizes downwards. There's an unfortunate tendency to lump together effect sizes of real and non-existent effects (eg Many Labs 2: "ds were 0.60 for the original findings and 0.15 for the replications"), but that's a meaningless number.
  15. False negatives are bad too, but they're not as harmful as false positives. Especially since they're almost never published. Also, there's been a ton of stuff written on lowering alpha, a good starting point is Redefine Statistical Significance.
  16. These figures actually understate the benefit of a lower alpha, because it would also change the calculus around p-hacking. With an alpha of 5%, getting a false positive is quite easy. Simply stopping data collection once you have a significant result has a hit rate of over 20%! Add some dredging and HARKing to that and you can squeeze a result out of anything. With a lower alpha, the chances of p-hacking success will be vastly lower and some researchers won't even bother trying.
  17. The original IAT paper is worth revisiting. You only really need to read page 1475. The construct validity evidence is laughable. The whole thing is based on N=26 and they find no significant correlation between the IAT and explicit measures of racism. But that's OK, Greenwald says, because the IAT is meant to find secret racists ("reveal explicitly disavowed prejudice")! The question of why a null correlation between implicit and explicit racial attitudes is to be expected is left as an exercise to the reader. The correlation between two racial IATs (male and female names) is .46 and they conveniently forget to mention the comically low test-retest reliability. That's all you need for 13k citations and a consulting industry selling implicit bias to the government for millions of dollars.
  18. I suspect psychologists today would laugh at the idea of the 1960s being an over-competitive environment. Personally I highly doubt that this situation can be blamed on high (or increasing) productivity.
  19. You might ask: well, why haven't the independent grant agencies already fixed the problem then? I'm not sure if it's a lack of competence, or caring, or power, or something else. But I find Garrett Jones' arguments on the efficacy of independent government agencies convincing: this model works well in other areas.
  20. "But Alvaro, what if I make an unexpected discovery during my investigation?" Well, you start writing a new registered report, and perhaps publish it as an exploratory result. You may not like it, but that's how we protect against false positives. In cases where only one dataset is available (eg historical data) we must rely on even stricter standards of evidence, to protect against multiple testing.
  21. Another idea to steal from the SEC: whistleblower rewards.
  22. This would be immediately exploited by publishing a bunch of trivial results. But that's a solvable problem. In any case, it's much better to have systematic, automatic mechanisms instead of relying on subjective factors and prosecution of individual cases.
  23. I believe the SCORE program intends to use the data from Replication Markets to train a ML model that predicts replicability. If scientists had the ability to just run that on every reference in their papers, perhaps they could go back to not reading what they cite.
  24. Looking at Replication Markets data, about 1 in 4 studies with p>.001 had more than a 50% chance to replicate. Of course I'd consider 50-50 odds far too low a threshold, but you have to start somewhere. "But Alvaro, science is not done paper by paper, it is a cumulative enterprise. We should publish marginal results, even if they're probably not true. They are pieces of evidence that, brick by brick, raise the vast edifice that we call scientific knowledge". In principle this is a good argument: publish everything and let the meta-analyses sort it out. But given the reality of publication bias we must be selective. If registered reports became the standard, this problem would not exist.



How Many Undetected Frauds in Science?

0.04% of papers are retracted. At least 1.9% of papers have duplicate images "suggestive of deliberate manipulation". About 2.5% of scientists admit to fraud, and they estimate that 10% of other scientists have committed fraud. 27% of postdocs said they were willing to select or omit data to improve their results. More than 50% of published findings in psychology are false. The ORI, which makes about 13 misconduct findings per year, gives a conservative estimate of over 2000 misconduct incidents per year.

That's a wide range of figures, and all of them suffer from problems if we try to use them as estimates of the real rate of fraud. While the vast majority of false published claims are not due to fabrication, it's clear that there is a huge iceberg of undiscovered fraud hiding underneath the surface.

Part of the issue is that the limits of fraud are unclear. While fabrication/falsification are easy to adjudicate, there's a wide range of quasi-fraudulent but quasi-acceptable "Questionable Research Practices" (QRPs) such as HARKing which result in false claims being presented as true. Publishing a claim that has a ~0%[1] chance of being true is the worst thing in the world, but publishing a claim that has a 15% chance of being true is a totally normal thing that perfectly upstanding scientists do. Thus the literature is inundated by false results that are nonetheless not "fraudulent". Personally I don't think there's much of a difference.

There are two main issues with QRPs: first, there's no clear line in the sand, which makes it difficult to single out individuals for punishment. Second, the majority of scientists engage in QRPs. In fact they have been steeped in an environment full of bad practices for so long that they are no longer capable of understanding that they are behaving badly:

Let him who is without QRPs cast the first stone.

The case of Brian Wansink (who committed both clear fraud and QRPs) is revealing: in the infamous post that set off his fall from grace, he brazenly admitted to extreme p-hacking. The notion that any of this was wrong had clearly never crossed his mind: he genuinely believed he was giving useful advice to grad students. When commenters pushed back, he justified himself by writing that "P-hacking shouldn’t be confused with deep data dives".

Anyway, here are some questions that might help us determine the size of the iceberg:

  • Are uncovered frauds high-quality, or do we only have the ability to find low-hanging fruit?
  • Are frauds caught quickly, or do they have long careers before anyone finds out?
  • Are scientists capable of detecting fraud or false results in general (regardless of whether they are produced by fraud, QRPs, or just bad luck)?
  • How much can we rely on whistleblowers?

Quality

Here's an interesting case recently uncovered by Elisabeth Bik: 8 different published, peer-reviewed papers, by different authors, on different subjects, with literally identical graphs. The laziness is astonishing! It would take just a few minutes to write an R script that generates random data so that each fake paper could at least have unique charts. But the paper mill that wrote these articles won't even do that. This kind of extreme sloppiness is a recurring theme when it comes to frauds that have been caught.

In general the image duplication that Bik uncovers tends to be rather lazy: people just copy paste to their heart's content and hope nobody will notice (and peer reviewers and editors almost certainly won't notice).

The Bell Labs physicist Jan Hendrik Schön was found out because he used identical graphs for multiple, completely different experiments.

This guy not only copy-pasted a ton of observations, he forgot to delete the Excel sheet he used to fake the data! Managed to get three publications out of it.

Back to Wansink again: he was smart enough not to copy-paste charts, but he made other stupid mistakes. For example in one paper (The office candy dish) he reported impossible means and test statistics (detected through granularity testing). If he had just bothered to create a plausible sample instead of directly fiddling with summary statistics, there's a good chance he would not have been detected. (By the way, the paper has not been retracted, and continues to be cited. I Fucking Love Science!)
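
For the curious, one well-known granularity check is the GRIM test: if the underlying data are integers (say, counts or Likert responses), then a mean reported for n observations must equal some integer total divided by n. Here is a minimal sketch of that idea. The function name and the example numbers are hypothetical, chosen for illustration, and are not taken from the candy dish paper or from any particular GRIM implementation.

```python
import math


def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """Return True if `reported_mean` could arise from n integer-valued
    observations, given that it was rounded to `decimals` places."""
    half_unit = 0.5 * 10 ** (-decimals)  # largest error rounding can introduce
    approx_total = reported_mean * n
    # Only the integers adjacent to mean * n can possibly reproduce the mean.
    for total in (math.floor(approx_total), math.ceil(approx_total)):
        if abs(total / n - reported_mean) <= half_unit + 1e-9:
            return True
    return False


if __name__ == "__main__":
    print(grim_consistent(3.48, 25))  # 87 / 25 = 3.48, so this is possible
    print(grim_consistent(3.47, 25))  # no integer total over 25 gives 3.47
```

The check only bites when n is small relative to the reported precision: with n = 100 and two decimals every mean passes, which is one more reason a careful faker working from a plausible raw sample would sail through.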

In general Wansink comes across as a moron, yet he managed to amass hundreds of publications, 30k+ citations, and half a dozen books. What percentile of fraud competence do you think Wansink represents?

The point is this: generating plausible random numbers is not that difficult! Especially considering the fact that these are intelligent people with extensive training in science and statistics. It seems highly likely that there are more sophisticated frauds out there.

Speed

Do frauds manage to have long careers before they get caught? I don't think there's any hard data on this (though someone could probably compile it with the Retraction Watch database). Obviously the highest-profile frauds are going to be those with a long history, so we have to be careful not to be misled. Perhaps there's a vast number of fraudsters who are caught immediately.

Overall the evidence is mixed. On the one hand, a relatively small number of researchers account for a fairly large proportion of all retractions. So while these individuals managed to evade detection for a long time (Yoshitaka Fujii published close to 200 papers over a 25-year career), most frauds do not have such vast track records.

On the other hand just because we haven't detected fraudulent papers doesn't necessarily mean they don't exist. And repeat fraud seems fairly common: simple image duplication checks reveal that "in nearly 40% of the instances in which a problematic paper was identified, screening of other papers from the same authors revealed additional problematic papers in the literature."

Even when fraud is clearly present, it can take ages for the relevant authorities to take action. The infamous Andrew Wakefield vaccine autism paper, for example, took 12 years to retract.

Detection Ability

I've been reading a lot of social science papers lately and a thought keeps coming up: "this paper seems unlikely to replicate, but how can I tell if it's due to fraud or just bad methods?" And the answer is that in general we can't tell. In fact things are even worse, as scientists seem to be incapable of detecting even really obviously weak papers (more on this in the next post).

In cases such as Wansink's, people went over his work with a fine-tooth comb after the infamous blogpost and discovered all sorts of irregularities. But nobody caught those signs earlier. Part of the issue is that nobody's really looking for fraud when they casually read a paper. Science tends to work on a kind of honor system where everyone just assumes the best. Even if you are looking for fraud, it's time-consuming, difficult, and in many cases unclear. The evidence tends to be indirect: noticing that two subgroups are a bit too similar, or that the effects of an intervention are a bit too consistent. But these can be explained away fairly easily. So unless you have a whistleblower it's often difficult to make an accusation.

The case of the 5-HTTLPR gene is instructive: as Scott Alexander explains in his fantastic literature review, a huge academic industry was built up around what should have been a null result. There are literally hundreds of non-replicating papers on 5-HTTLPR—suppose there was one fraudulent article in this haystack, how would you go about finding it?

Some frauds (or are they simply errors?) are detected using statistical methods such as the granularity testing mentioned above, or with statcheck. But any sophisticated fraud would simply check their own numbers using statcheck before submitting, and correct any irregularities.
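
The basic idea behind this kind of check is simple: recompute the p-value from the reported test statistic and see whether it matches the reported p-value up to rounding. The sketch below does this for z statistics only, using the standard library. It is not statcheck's actual code (statcheck is an R package and handles t, F, chi-square and more), and the function names are my own.

```python
from statistics import NormalDist


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value implied by a z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def p_value_consistent(z: float, reported_p: float, decimals: int = 3) -> bool:
    """True if `reported_p` matches the recomputed p-value once rounding
    to `decimals` places is allowed for."""
    half_unit = 0.5 * 10 ** (-decimals)
    return abs(two_sided_p_from_z(z) - reported_p) <= half_unit + 1e-12


if __name__ == "__main__":
    print(p_value_consistent(2.20, 0.028, decimals=3))  # matches z = 2.20
    print(p_value_consistent(2.20, 0.01, decimals=2))   # does not match
```

Anything this mechanical is equally easy to run on your own manuscript before submission, which is exactly the problem.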

Detecting weak research is easy. Detecting fraud and then prosecuting it is extremely difficult.

Whistleblowers

Some cases are brought to light by whistleblowers, but we can't rely on them for a variety of reasons. A survey of scientists finds that potential whistleblowers, especially those without job security, tend not to report fraud due to the potential career consequences. They understand that institutions will go to great lengths to protect frauds—do you want a career, or do you want to do the right thing?

Often there simply is no whistleblower available. Scientists are trusted to collect data on their own, and they often collaborate with people in other countries or continents who never have any contact with the data-gathering process. Under such circumstances we must rely on indirect means of detection.

South Korean celebrity scientist Hwang Woo-suk was uncovered as a fraud by a television program which used two whistleblower sources. But things only got rolling when image duplication was detected in one of his papers. Both whistleblowers lost their jobs and were unable to find other employment.

In some cases people blow the whistle and nothing happens. The report from the investigation into Diederik Stapel, for example, notes that "on three occasions in 2010 and 2011, the attention of members of the academic staff in psychology was drawn to this matter. The first two signals were not followed up in the first or second instance." By the way, these people simply noticed statistical irregularities; they never had direct evidence.

And let's turn back to Wansink once again: in the blog post that sank him, he recounted tales of instructing students to p-hack data until they found a result. Did those grad students ever blow the whistle on him? Of course not.

This is the End...

Let's say that about half of all published research findings are false. How many of those are due to fraud? As a very rough guess I'd say that for every 100 papers that don't replicate, 2.5 are due to fabrication/falsification, and 85 are due to lighter forms of methodological fraud. This would imply that about 1% of fraudulent papers are retracted.

This is both good and bad news. On the one hand, while most fraud goes unpunished, it only represents a small portion of published research. On the other hand, it means that we can't fix reproducibility problems by going after fabrication/falsification: if outright fraud completely disappeared tomorrow, it would be no more than an imperceptible blip in the replication crisis. A real solution needs to address the "questionable" methods used by the median scientist, not the fabrication used by the very worst of them.
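The arithmetic behind the "imperceptible blip" can be spelled out in a few lines. This is only a back-of-the-envelope sketch that plugs in the rough guesses from above and, like the text, treats "false" and "does not replicate" as interchangeable. The variable names are mine.

```python
# Rough guesses taken from the text above, not measured quantities.
false_share = 0.50               # share of published findings that are false
fabricated_per_100_false = 2.5   # per 100 false findings: fabrication/falsification
questionable_per_100_false = 85  # per 100 false findings: lighter methodological fraud

fabricated_share = false_share * fabricated_per_100_false / 100
questionable_share = false_share * questionable_per_100_false / 100
false_share_without_fabrication = false_share - fabricated_share

print(f"fabricated or falsified: {fabricated_share:.2%} of all findings")
print(f"questionable methods: {questionable_share:.2%} of all findings")
print(f"false findings if fabrication vanished: {false_share_without_fabrication:.2%}")
```

Under these guesses, eliminating fabrication entirely would move the share of false findings from 50% to 48.75%, while the lighter methods account for 42.5% of everything published.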




Book Review: Science Fictions by Stuart Ritchie

In 1945, Robert Merton wrote:

There is only this to be said: the sociology of knowledge is fast outgrowing a prior tendency to confuse provisional hypothesis with unimpeachable dogma; the plenitude of speculative insights which marked its early stages are now being subjected to increasingly rigorous test.

Then, 16 years later:

After enjoying more than two generations of scholarly interest, the sociology of knowledge remains largely a subject for meditation rather than a field of sustained and methodical investigation. [...] these authors tell us that they have been forced to resort to loose generalities rather than being in a position to report firmly grounded generalizations.

In 2020, the sociology of science is stuck more or less in the same place. I am being unfair to Ritchie (who is a Merton fanboy), because he has not set out to write a systematic account of scientific production—he has set out to present a series of captivating anecdotes, and in those terms he has succeeded admirably. And yet, in the age of progress studies surely one is allowed to hope for more.

If you've never heard of Daryl Bem, Brian Wansink, Andrew Wakefield, John Ioannidis, or Elisabeth Bik, then this book is an excellent introduction to the scientific misconduct that is plaguing our universities. The stories will blow your mind. For example you'll learn about Paolo Macchiarini, who left a trail of dead patients, published fake research saying he healed them, and was then protected by his university and the journal Nature for years. However, if you have been following the replication crisis, you will find nothing new here. The incidents are well-known, and the analysis Ritchie adds on top of them is limited in ambition.

The book begins with a quick summary of how science funding and research work, and a short chapter on the replication crisis. After that we get to the juicy bits as Ritchie describes exactly how all this bad research is produced. He starts with outright fraud, and then moves on to the gray areas of bias, negligence, and hype: it's an engaging and often funny catalogue of misdeeds and misaligned incentives. The final two chapters address the causes behind these problems, and how to fix them.

The biggest weakness is that the vast majority of the incidents presented (with the notable exception of the Stanford prison experiment) occurred in the last 20 years or so. And Ritchie's analysis of the causes behind these failures also depends on recent developments: his main argument is that intense competition and pressure to publish large quantities of papers is harming their quality.

Not only has there been a huge increase in the rate of publication, there’s evidence that the selection for productivity among scientists is getting stronger. A French study found that young evolutionary biologists hired in 2013 had nearly twice as many publications as those hired in 2005, implying that the hiring criteria had crept upwards year-on-year. [...] as the number of PhDs awarded has increased (another consequence, we should note, of universities looking to their bottom line, since PhD and other students also bring in vast amounts of money), the increase in university jobs for those newly minted PhD scientists to fill hasn’t kept pace.

By only focusing on recent examples, Ritchie gives the impression that the problem is new. But that's not really the case. One can go back to the 60s and 70s and find people railing against low standards, underpowered studies, lack of theory, publication bias, and so on. Imre Lakatos, in an amusing series of lectures at the London School of Economics in 1973, said that "the social sciences are on a par with astrology, it is no use beating about the bush."

Let's play a little game. Go to the Journal of Personality and Social Psychology (one of the top social psych journals) and look up a few random papers from the 60s. Are you going to find rigorous, replicable science from a mythical era when valiant scientists followed Mertonian norms and were not incentivized to spew out dozens of mediocre papers every year? No, you're going to find exactly the same p<.05, tiny N, interaction effect, atheoretical bullshit. The only difference being the questionable virtue of low productivity.

If the problem isn't new, then we can't look for the causes in recent developments. If Ritchie had moved beyond "loose generalities" to a more systematic analysis of scientific production I think he would have presented a very different picture. The proposals at the end mostly consist of solutions that are supposed to originate from within the academy. But they've had more than half a century to do that—it feels a bit naive to think that this time it's different.

Finally, is there light at the end of the tunnel?

...after the Bem and Stapel affairs (among many others), psychologists have begun to engage in some intense soul-searching. More than perhaps any other field, we’ve begun to recognise our deep-seated flaws and to develop systematic ways to address them – ways that are beginning to be adopted across many different disciplines of science.

Again, the book is missing hard data and analysis. I used to share his view (surely after all the publicity of the replication crisis, all the open science initiatives, all the "intense soul searching", surely things must change!) but I have now seen some data which makes me lean in the opposite direction. Check back toward the end of August for a post on this issue.

Ritchie's view of science is almost romantic: he goes on about the "nobility" of research and the virtues of Mertonian norms. But the question of how conditions, incentives, competition, and even the Mertonian norms themselves actually affect scientific production is an empirical matter that can and should be investigated systematically. It is time to move beyond "speculative insights" and on to "rigorous testing", exactly in the way that Merton failed to do.




Links Q2 2020

Tyler Cowen reviews Status and Beauty in the Global Party Circuit. "In this world, girls function as a form of capital." The podcast is good too.

Lots of good info on education: Why Conventional Wisdom on Education Reform is Wrong (a primer)

Scott Alexander on the life of Herbert Hoover.

Longer-Run Economic Consequences of Pandemics [speculative]:

Measured by deviations in a benchmark economic statistic, the real natural rate of interest, these responses indicate that pandemics are followed by sustained periods—over multiple decades—with depressed investment opportunities, possibly due to excess capital per unit of surviving labor, and/or heightened desires to save, possibly due to an increase in precautionary saving or a rebuilding of depleted wealth.

Do cognitive biases go away when the stakes are high? A large pre-registered study with very high stakes finds that effort increases significantly but performance does not.

Disco Elysium painting turned into video using AI.

Long-run consequences of the pirate attacks on the coasts of Italy: "in 1951 Rome would have been 15% more populous without piracy."

“A” Business by Any Other Name: Firm Name Choice as a Signal of Firm Quality (2014): "The average plumbing firm whose name begins with A or a number receives five times more service complaints than other firms and also charges higher prices."

Yarkoni: The Generalizability Crisis [in psychology].

Lakens: Review of "The Generalizability Crisis" by Tal Yarkoni.

Yarkoni: Induction is not optional (if you’re using inferential statistics): reply to Lakens.

Estimating the deep replicability of scientific findings using human and artificial intelligence - ML model does about as well as prediction markets when it comes to predicting replication success. "the model’s accuracy is higher when trained on a paper’s text rather than its reported statistics and that n-grams, higher order word combinations that humans have difficulty processing, correlate with replication." Also check out the horrific Fig 1.

Wearing a weight vest leads to weight loss, fairly huge (suspiciously huge?) effect size. The hypothesized mechanism is the "gravitostat": your body senses how heavy you are and adjusts accordingly.

Tyler Cowen on uni- vs multi-disciplinary policy advice in the time of Corona

...and here's Señor Coconut, "A Latin Tribute to Kraftwerk". Who knew "Autobahn" needed a marimba?




Memetic Defoundation

The bunny ears sign used to be a way of calling someone a cuck. In fact they're not bunny ears at all, they're cuckold horns. The original meaning has been lost, and today clueless children across the world use it as nothing more than a vaguely teasing gesture. This is an amusing case of a wider phenomenon I like to call memetic defoundation.

A general formulation would look something like this:

  • Start with a couple of ideas of the form "[foundation] therefore [meme]"1
  • [foundation] is forgotten, disproved, or rendered obsolete
  • [meme] persists regardless

Dead beliefs

Organizational decay is a hotspot for memetic defoundation. Luttwak tells us of a unit in the Rhine legions led by a Praefectus Militum Balistariorum long after the Roman army had lost the ability to construct and use ballistae. Gene Wolfe uses this effect in The Book of the New Sun to evoke the image of an ancient, ossified, slowly crumbling civilization: my favorite example is a prison called the "antechamber" where the inmates are still served coffee and pastries every morning.

E. R. Dodds offers another example in The Greeks and the Irrational, where he describes the decline of religion in Hellenistic times:

Gods withdraw, but their rituals live on, and no one except a few intellectuals notices that they have ceased to mean anything.

Scott Alexander comments on the relation between science and policy: "The science did a 180, but the political implications stayed exactly the same."

John Stuart Mill writes that memetic defoundation "is illustrated in the experience of almost all ethical doctrines and religious creeds" and argues that free speech is necessary to prevent it, as open debate preserves the arguments behind ideas:2

If, however, the mischievous operation of the absence of free discussion, when the received opinions are true, were confined to leaving men ignorant of the grounds of those opinions, it might be thought that this, if an intellectual, is no moral evil, and does not affect the worth of the opinions, regarded in their influence on the character. The fact, however, is, that not only the grounds of the opinion are forgotten in the absence of discussion, but too often the meaning of the opinion itself. The words which convey it, cease to suggest ideas, or suggest only a small portion of those they were originally employed to communicate. Instead of a vivid conception and a living belief, there remain only a few phrases retained by rote; or, if any part, the shell and husk only of the meaning is retained, the finer essence being lost. [...] Truth, thus held, is but one superstition the more, accidentally clinging to the words which enunciate a truth.

Sometimes a meme will spread because it captures a true relation, but will use an unrelated foundation to do so. Greg Cochran suggests that Christian Science (a sect that avoids all medical care) developed as a response to the high fatality rates of pre-modern medicine. But the meme only spread when the foundation was put in theological rather than medical terms. What really matters for defoundation is the implicit relation that is captured (pseudoscientific medicine → avoid medical care) rather than the explicit one (sickness results from spiritual error → avoid medical care). When medicine improved, the true basis of the meme was gone, but of course that did nothing to change people's religious beliefs.

Finally, many (including Schumpeter,3 Santayana,4 and Saint Max5) have identified an instance of memetic defoundation in the relation between Protestantism and political liberalism (in the most general sense of the word). In broad strokes, the argument is that liberalism dropped God but kept the Protestant morality. Moldbug6 erroneously places this transition after WWII, while Barzun argues it happened 300 years earlier7. Tom Holland thinks this is an awesome development,8 while others are more skeptical. My old buddy Freddie makes the same diagnosis in Twilight of the Idols:

In England, in response to every little emancipation from theology one has to reassert one’s position in a fear-inspiring manner as a moral fanatic. That is the penance one pays there. – With us it is different. When one gives up Christian belief one thereby deprives oneself of the right to Christian morality. For the latter is absolutely not self-evident: one must make this point clear again and again, in spite of English shallowpates. Christianity is a system, a consistently thought out and complete view of things. If one breaks out of it a fundamental idea, the belief in God, one thereby breaks the whole thing to pieces: one has nothing of any consequence left in one’s hands. Christianity presupposes that man does not know, cannot know what is good for him and what evil: he believes in God, who alone knows. Christian morality is a command: its origin is transcendental; it is beyond all criticism, all right to criticize; it possesses truth only if God is truth – it stands or falls with the belief in God. – If the English really do believe they will know, of their own accord, ‘intuitively’, what is good and evil; if they consequently think they no longer have need of Christianity as a guarantee of morality; that is merely the consequence of the ascendancy of Christian evaluation and an expression of the strength and depth of this ascendancy: so that the origin of English morality has been forgotten, so that the highly conditional nature of its right to exist is no longer felt.

Things are in the saddle

Which brings us to the question of how memetic defoundation happens. In Nietzsche's model you start with the foundation and the meme is derived from it, but once the ideas have been entrenched deeply enough, the foundation can evaporate without affecting the meme. Just as a fish doesn't notice water, people no longer notice the assumptions behind their beliefs. I call this the foundation-first model.

But I think he's wrong: in some cases, including the question of Christianity, the correct approach is a meme-first model. In this view, the foundation is simply a post-hoc justification (or a spandrel) glued onto a preëxisting meme. That is not to say the foundation is irrelevant, just that its role in supporting the meme is viral rather than logical.

Where did the meme come from? In his brilliant essay The Three Sources of Human Values, Hayek argues that ideas come from three sources:

  1. Consciously directed rational thought
  2. Biology
  3. Cultural evolution

We can use this classification to look at memetic defoundation. The first case is the easiest: the Roman army uses siege weapons, so someone in charge creates a siege unit and a Praefectus to lead it (a clear foundation-first instance). Eventually it loses those capabilities, but the structure remains.

Biologically instilled tendencies and values are more challenging to analyze: their aims tend to be inaccessible to introspection or hidden through self-deception. And they are not necessarily moral judgements: it could be something as simple as folkbiological classifications predisposed to certain patterns, which then influence values.9

Behaviors and social structures generated by cultural evolution also tend to be opaque: they were created by a process of random variation and selection, then sustained by a distributed system of knowledge accumulation and replication—no individual understands how they work (and they generally don't even try to, simply attributing them to custom or one's ancestors). Henrich details how the tendency of modern westerners to search for causal, explicable reasons is an anomaly.

Even when we try, we don't always succeed: the age of reason didn't necessarily make culturally evolved behaviors transparent. For example, traditional societies in the New World had various processes for nixtamalizing corn before eating it, which makes the niacin nutritionally available and prevents the disease of pellagra. It took until the 1940s(!) and hundreds of thousands of deaths for scientists to finally understand the problem. And that's a simple nutritional issue rather than a question of complex social organization. As Scott Alexander puts it:

Our ancestors lived in Epistemic Hell, where they had to constantly rely on causally opaque processes with justifications that couldn’t possibly be true, and if they ever questioned them then they might die.

In a world filled with vital customs and weak explanations it's important to make sure nobody ever questions tradition—thus it is safeguarded by indoctrination, preference falsification,10 ostracism, or the promise of divine punishment. And now we have a second level of selective forces which are shaped by the needs of the memes: they mould their biological and social substrate to maximize their spread. And what are the traits they select for? Conformity, homogeneity, mimesis, self-ignorance, lack of critical thought: the herd-instinct. An overbearing society for a myopic, servile species domesticated under the yoke of ideas. That is the price we pay for the "secret of our success".11

Now consider what happens after a rapid shift in our environment (such as the introduction of agriculture, large-scale hierarchical societies, or the industrial revolution): both biological and cultural evolution are slow processes, and the latter has built-in safeguards to prevent modification. That is how we end up with a lag of ideas: baseless memes designed for a different habitat. Like a saltwater fish thrown in a lake, modern man depends on ideas he thinks are universal when they are really made for a different time and place. Hayek:

The gravest deficiency of the older prophets was their belief that the intuitively perceived ethical values, divined out of the depth of man's breast, were immutable and eternal.

What kind of ideas are most likely to take hold? "Doctrines intrinsically fitted to make the deepest impression upon the mind"12 that also increase fitness. Successful cultural adaptations tend to capture true relations, in false yet convincing ways. This is why religious memes are particularly susceptible to defoundation, and why most defoundation is meme-first. While many of these ideas may appear altruistic, they are really "subtly selfish" as George Williams put it—otherwise they would not have survived.

For example, G. E. Pugh in The Biological Origin of Human Values talks about the ubiquitous sharing norms in primitive human societies. Christopher Boehm in Hierarchy in the Forest (a work that blatantly plagiarizes Nietzsche) discusses the "egalitarian ethos" of primitive societies and its evolutionary origin. This ethos expresses itself as a "drive to parity", which became possible to enforce with the evolution of tool use and greater coordination abilities:

Because the united subordinates are constantly putting down the more assertive alpha types in their midst, egalitarianism is in effect a bizarre type of political hierarchy.

The collective power of resentful subordinates is at the base of human egalitarian society, and we can see important traces of this group approach in chimpanzee behavior. [...] It is obvious that symbolic communication and possession of an ethos make a very large difference for humans. Yet it would appear that the underlying emotions and behavioral orientations are similar to those of chimpanzees, as are group intimidation strategies that have the effect of terminating resented behaviors of aggressors.

To re-work Nietzsche's argument into a more plausible form: the drive to parity came first. Christian morality is simply a post-hoc justification of this innate tendency, in a highly contagious and highly effective prosocial package. God is now dead, but that does nothing to change our evolved moral intuitions, so this drive simply finds new outlets: humanism, democracy, liberalism, socialism, etc. As this shift of ideas happens, we inevitably bring along some old baggage.

The sentiments necessary to thrive in a band or a tribe are not those that we need today, but they are largely those we are stuck with. Modern civilization and its markets are inhuman and unintuitive (if not actively repulsive) and exist largely because we are able to suspend, disregard, and master our innate impulses. Seemingly new ideologies directed against the market are nothing but an atavism: the incompatibility between our innate tendencies and the external environment explains their peculiar combination of perpetual failure and perpetual popularity.

Clean sweep

Counterintuitively, the memes can be strengthened by abandoning the thing they're (supposedly) based on. You can attack Christianity-the-religion-and-ethical-system by attacking God: if morality comes from God, when you take down God you also take down his morality. But it didn't work out that way in practice: people dropped the God but kept his system; where do you attack now? In theory "that which can be asserted without evidence can be dismissed without evidence." In reality, that which is asserted without evidence is difficult to refute regardless of the evidence.13

Another issue, as I argued above, is that we don't comprehend these memes, either because of self-deception, limited introspection, or the blind forces of cultural evolution. The solution to both of these problems is the genealogical method. The ultimate aims of our values and customs lie in their (genetic or cultural) evolutionary history; by understanding their development we can understand their purpose and the selective forces that shaped them. Through genealogy we can reach truths we have been designed not to see.14

Which brings us back to Nietzsche. How should one argue against God? Forget the old debate tactics, he says in Daybreak 95, and just treat it as an anthropological problem:

In former times, one sought to prove that there is no God – today one indicates how the belief that there is a God arose and how this belief acquired its weight and importance: a counter-proof that there is no God thereby becomes superfluous. – When in former times one had refuted the 'proofs of the existence of God' put forward, there always remained the doubt whether better proofs might not be adduced than those just refuted: in those days atheists did not know how to make a clean sweep.

It is this approach that we should deploy against foundationless memes. Don't bother with arguments attacking the foundation or the meme itself; go for a "clean sweep" instead. The case of Christian Science mentioned above is a perfect example: providing theological arguments against it is futile (and fundamentally aiming at the wrong target). But understanding how it came to be makes the situation crystal clear.

The Hansonian approach of noticing a disconnect between stated and revealed preferences is also useful for spotting these memes in the first place. Hanson combines both techniques in his analysis of The Evolution of Health Altruism.

What if some lies are useful and life-preserving? What if such lies are fundamentally necessary for societies to work well? Isn't this just a naïve overexpression of the drive to truth? That may well be the case, but just because some lies are useful does not mean that the particular lies we live by right now are the best ones. In fact the tyranny of mediocrity that flourished in our recent evolutionary past appears to be fundamentally incompatible with the modern world (not to mention the world of tomorrow). Understanding is a precondition for designing superior replacements, or as Nietzsche put it "we must become physicists in order to be able to be creators".

Genealogy allows us to understand the selective forces at play, and once we understand that, we (and by we I refer to a tiny minority) have the power to overcome our self-ignorance and ingrained limitations in order to choose from a higher point of view. Not a position of "transcendent leverage", but at least an informed valuing of values, consistent with the world as it is.


  1. I deliberately avoid the use of "assumptions" and "conclusion" because they're not always assumptions and/or conclusions.
  2. He also supports an early version of steelmanning for the same purpose: "So essential is this discipline to a real understanding of moral and human subjects, that if opponents of all important truths do not exist, it is indispensable to imagine them, and supply them with the strongest arguments which the most skilful devil’s advocate can conjure up."
  3. "Though the classical doctrine of collective action may not be supported by the results of empirical analysis, it is powerfully supported by that association with religious belief to which I have adverted already. This may not be obvious at first sight. The utilitarian leaders were anything but religious in the ordinary sense of the term. In fact they believed themselves to be anti-religious and they were so considered almost universally. They took pride in what they thought was precisely an unmetaphysical attitude and they were quite out of sympathy with the religious institutions and the religious movements of their time. But we need only cast another glance at the picture they drew of the social process in order to discover that it embodied essential features of the faith of protestant Christianity and was in fact derived from that faith. For the intellectual who had cast off his religion the utilitarian creed provided a substitute for it.", Capitalism, Socialism, and Democracy
  4. "The chief fountains of this [genteel] tradition were Calvinism and transcendentalism. Both were living fountains; but to keep them alive they required, one an agonised conscience, and the other a radical subjective criticism of knowledge. When these rare metaphysical preoccupations disappeared—and the American atmosphere is not favourable to either of them—the two systems ceased to be inwardly understood; they subsisted as sacred mysteries only; and the combination of the two in some transcendental system of the universe (a contradiction in principle) was doubly artificial.", The Genteel Tradition in American Philosophy
  5. "Take notice how a “moral man” behaves, who today often thinks he is through with God and throws off Christianity as a bygone thing. [...] Much as he rages against the pious Christians, he himself has nevertheless as thoroughly remained a Christian — to wit, a moral Christian.", The Ego and His Own
  6. "Progressive Christianity, through secular theologians such as Harvey Cox, abandoned the last shreds of Biblical theology and completed the long transformation into mere socialism. [...] Creedal declarations of Universalism are not hard to find. I am fond of the Humanist Manifestos, which pretty much say it all. The UN Declaration of Human Rights is good as well. No mainline Protestant will find anything morally objectionable in any of these documents."
  7. "The outcome of what has been reviewed here—late 17C critical thought, the events of 1688, and the writings of Locke, Voltaire, and Montesquieu— may be summed up in a few points [...] the political ideas of the English Puritans aiming at equality and democracy were now in the main stream of thought, minus the religious component.", From Dawn to Decadence
  8. His book Dominion: How the Christian Revolution Remade the World is all about this topic. "If secular humanism derives not from reason or from science, but from the distinctive course of Christianity’s evolution – a course that, in the opinion of growing numbers in Europe and America, has left God dead – then how are its values anything more than the shadow of a corpse? What are the foundations of its morality, if not a myth?" Holland also likes to quote the Indian historian S. N. Balagangadhara: "Christianity spreads in two ways: through conversion and through secularisation."
  9. Henrich has a very interesting paper with Scott Atran: The Evolution of Religion: How Cognitive By-Products, Adaptive Learning Heuristics, Ritual Displays, and Group Competition Generate Deep Commitments to Prosocial Religions. "Most religious beliefs minimally violate the expectations created by our intuitive ontology and these modes of construal, thus creating cognitively manageable and memorable supernatural worlds."
  10. I highly recommend Timur Kuran's Private Truths, Public Lies, his analysis of how social pressures cause people to display and sustain false beliefs is brilliant.
  11. Nietzsche also brings up another related issue: the incompatibility between the older animalistic values and the new ones imposed by selective forces downstream of cultural accumulation, turning man into a "sick animal". But that's a story for another day.
  12. Mill, On Liberty.
  13. It might be interesting to approach this from the POV of Zizekian "ideology". Perhaps the issue is a kind of a-priori faith (because belief by conviction isn't really—it has already been mediated through our subjectivity) which disintegrates once you instrumentalize the idea. Of course people are resistant to instrumentalizing sacred values. From The Sublime Object of Ideology: "Pascal's final answer, then, is: leave rational argumentation and submit yourself simply to ideological ritual".
  14. There's a Chesterton's Fence aspect to all of this: you need to understand the lie before you try to tear it down.



Links Jan-Feb 2020

Word2vec: fish + music = bass

fish + music = bass
fish + friend = chum
fish + hair = mullet
fish + struggle = flounder
oink - pig + bro = wassup
yeti – snow + economics = homo economicus
music – soul = downloadable ringtones
good : bad :: rock : nü metal

Related, The (Too Many) Problems of Analogical Reasoning with Word Vectors.
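
For anyone who hasn't seen how these "equations" are computed: the words are vectors, the sum is an element-wise addition, and the answer is whichever remaining word has the highest cosine similarity to the result. A minimal sketch in Python, assuming a hand-made 3-dimensional vocabulary (the numbers below are invented for illustration; real word2vec vectors are learned from text and have hundreds of dimensions):

```python
import math

# Hypothetical toy "embeddings", hand-picked for illustration only.
VECTORS = {
    "fish":  [1.0, 0.0, 0.0],
    "music": [0.0, 1.0, 0.0],
    "bass":  [0.9, 0.9, 0.0],
    "trout": [1.0, 0.0, 0.2],
    "piano": [0.0, 1.0, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def nearest(query, exclude):
    """Word whose vector is most similar to the query, skipping excluded words."""
    candidates = {w: v for w, v in VECTORS.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(query, candidates[w]))


query = add(VECTORS["fish"], VECTORS["music"])
result = nearest(query, exclude={"fish", "music"})
print(result)
```

Note that the input words are excluded from the candidates, which is a common convention in analogy tasks. With real embeddings the unfiltered nearest neighbour can easily be one of the inputs, which is among the issues the critique linked above is concerned with.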

We always knew meta-analyses are somewhat flawed because of publication bias and the "file drawer problem", but exactly how bad is it? A new paper compares meta-analyses to pre-registered replications and finds that meta-analyses overstate effect sizes by 3x.

In related news, registered reports in psychology have 44% positive results vs 96% in the standard literature.

Female orgasm frequency by male income quartile. Obviously confounded in all sorts of ways, but still.

Effective Altruists tackle the problem of tfw no gf. h/t @SilverVVulpes

Mark Koyama reviews Scheidel's Escape from Rome, with some very interesting comments on the use of counterfactuals by historians vs economists doing history. "There is no control group for Europe had Archduke Ferdinand not been assassinated."

A review of Dietz Vollrath's new book, Fully Grown:

Vollrath’s preferred decomposition of the causes of the 1.25% annual slowdown in real GDP per capita growth is:

  • 0.80pp - Declining growth in human capital
  • 0.20pp - The shift of spending from goods to services
  • 0.15pp - Declining reallocation of workers and firms
  • 0.10pp - Declining geographic mobility
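
The four components do add up to the headline figure. A quick check in Python, using only the numbers listed above (the percentage shares are derived here, not taken from the review):

```python
# Percentage-point contributions to the growth slowdown, as listed above.
contributions = {
    "Declining growth in human capital": 0.80,
    "The shift of spending from goods to services": 0.20,
    "Declining reallocation of workers and firms": 0.15,
    "Declining geographic mobility": 0.10,
}

# Rounded to two decimals to sidestep floating-point noise.
total = round(sum(contributions.values()), 2)

# Share of the total slowdown attributable to each component, in percent.
shares = {name: round(100 * pp / total) for name, pp in contributions.items()}

print(total)
for name, share in shares.items():
    print(f"{share}% - {name}")
```

In this decomposition human capital alone accounts for roughly two thirds of the slowdown.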

Pay-as-you-go pension systems are going to have serious trouble in countries with rapidly aging populations. Just how bad is it going to be? If you're a <40 yo worker today, it's probably safe to assume you won't be getting much out of the money you're paying into the pension system.

RCA summarizes his views on US healthcare costs with a ton of great charts: Why conventional wisdom on health care is wrong (a primer).

Should we be worrying about automation in the near future? Scholl and Hanson argue no.

Disco Elysium (which I highly recommend) lead designer and writer Robert Kurvitz talks about the development process and how twitter inspired their dialogue engine: The Feature That Almost Sank Disco Elysium.

It has long been established that asking the same question twice in the same questionnaire will often result in the same person giving two different responses. But what happens if you place the repeated questions right next to each other?

Human-cat co-evolution: "We found that the population density of free-ranging cats is linearly related to the proportion of female students in the university. [...] suggests that the cats may have the ability to distinguish human sex and adopt a sociable skill to human females."

The dril Turing test.

And here's some sweet Afro-Cuban jazz fusion.




Reading Notes: Civilization & Capitalism, 15th-18th Century, Vol. I: The Structures of Everyday Life

I first discovered Fernand Braudel when Tyler Cowen answered the question: "whose entire body of work is worth reading?", placing him next to people like Nietzsche and Hume. It was good advice.

Braudel starts working on his doctoral dissertation in 1923, at age 21, intending to concentrate on the policies of Philip II of Spain in the form of a conventional history. To support himself, he teaches at an Algerian high school for a decade, then at the University of São Paulo until 1937. During this period he keeps up with developments in France, especially Marc Bloch and Lucien Febvre's Annales School, which focuses on long-term history and statistical data.

In 1934, 11 years after he began, Braudel starts to find quantitative data. Population figures, ship cargoes, prices, arrivals and departures. These will form the basis of his novel, data-driven approach. Five years later, in 1939, he finally has an outline ready.

Then the Nazis capture him. He spends the next 5 years in a POW camp where he writes the first draft of La Méditerranée without access to any materials, mailing notebooks back to Paris. When the war ends, he becomes the de facto leader of the second generation of the Annales School. An additional four years after that, 26 years after he started working on it, The Mediterranean and the Mediterranean World in the Age of Philip II is published.

The general argument of this work is that history moves at different speeds, and one must distinguish them: the short term (daily events as perceived by contemporaries), the medium term ('economic systems, states, societies, civilisations'), and la longue durée – a perspective of centuries or millennia without which the shorter timeframes cannot be understood.

In the preface, Braudel declares: "I have always believed that history cannot be really understood unless it is extended to cover the entire human past." Civilization and Capitalism is built on similar principles.

The initial seeds for C&C were planted in 1950, when Febvre asked Braudel to contribute to a volume for a series on world history. Braudel would simply provide a summary of existing work on the development of capitalism. But Febvre died before the volume could be completed, and Braudel took responsibility for what turned out to be a three-volume series on capitalism. The first volume came out 17 years after work began, in 1967. The final volume would not be published until 1979.

Reading Braudel one gets the impression of an infinite curiosity at work for decades, mining every source for the tiniest piece of data, and then magisterially combining everything together. Despite fairly brutal editing these notes are still way too long, and yet they struggle to capture even a tiny part of the detail and depth that the book contains.


Vol. I: The Structures of Everyday Life

A good starting point might be what is left out: politics, wars, dynasties, religion, ideology, peoples. The index of maps & graphs gives the reader a taste of what is to come: "Budget of a mason's family in Berlin about 1800"; "Bread weights and grain prices in Venice at the end of the sixteenth century"; "French Merchants registered as living in Antwerp, 1450-1585".

The first volume aims to illuminate every aspect of material life: agriculture, food, dress, housing, towns, cities, energy, metals, machines, animals, transportation, money. Braudel's goal is not simply to examine each of these in isolation, but to show how all the elements of material life interact to form cultures, economies, systems of governance, power structures, long-term cycles or trends. He comes remarkably close to achieving this absurdly ambitious task. For people into worldbuilding this tome is pure gold. The first volume also has the greatest general appeal: unlike the other two which are somewhat esoteric, I think this is a book everyone will love.

In short, at the very deepest levels of material life, there is at work a complex order, to which the assumptions, tendencies and unconscious pressures of economies, societies and civilizations all contribute.

It is here that Braudel shows off his greatest skill, which is the combination of the microscopic with the panoramic. At the top level: Geography. Climate. Land. Crops. ZOOM IN. Trading routes. Piracy. Economy. Cities. Technology. And then zoom into details like the price of wheat relative to oats in 1351 Paris. He shifts effortlessly between the global, long-term perspective and minute, specific data and anecdotes, combining the two to form a coherent understanding.

The Weight of Numbers

Everything, both in the short and long term, and at the level of local events as well as on the grand scale of world affairs, is bound up with the numbers and fluctuations of the mass of people.

The predominant feature of the ancien régime is Malthusianism. From the 16th century on, Europe was constantly on the brink of overpopulation. Epidemics and famines established balance, and occasional recessions in population created great wealth for the survivors. "Thus in Languedoc between 1350 and 1450, the peasant and his patriarchal family were masters of an abandoned countryside. Trees and wild animals overran fields that once had flourished." France had 26 general famines just in the 11th century; 16 in the 18th.

Famine recurred so insistently for centuries on end that it became incorporated into man's biological regime and built into his daily life. Dearth and penury were continual, and familiar even in Europe, despite its privileged position. [...] Things were far worse in Asia, China and India. Famines there seemed like the end of the world. In China everything depended on rice from the southern provinces; in India, on providential rice from Bengal, and on wheat and millet from the northern provinces, but vast distances had to be crossed and this contribution only covered a fraction of the requirements.

Slowly, expansion and improvements in agricultural productivity doubled the global population, which Braudel calls "indubitably the basic fact in world history from the fifteenth to the eighteenth century".

Almost all of these people lived in the countryside. "The towns the historian discovers in his journeys back into pre-nineteenth-century times are small; and the armies miniature." The towns were also great population sinks, drawing in men from the countryside and killing them. Wild animals were everywhere and often a real threat, even in Europe, which was full of wolves and bears.

A lapse in vigilance, an economic setback, a rough winter, and they multiplied. In 1420, packs entered Paris through a breach in the ramparts or unguarded gates. They were there again in September 1438, attacking people this time outside the town, between Montmartre and the Saint-Antoine gate. In 1640, wolves entered Besançon by crossing the Doubs near the mills of the town and 'ate children along the roads'.

Braudel writes about the global ebb and flow of epidemics over the course of centuries, and how they were aided by global trade. And to illustrate their effect, he brings up statistics like the annual number of plague victims in the town of Strauling between 1623 and 1635 (702 people). He tells us of Montaigne, who as mayor of Bordeaux fled the town (like all rich people would) and abandoned his post during the 1585 plague. He quotes the diaries of Samuel Pepys ("the plague making us cruel, as doggs, one to another"). He quotes Francois Dragonet of Fogasses, a rich Avignon citizen of Italian origin, whose leases provided for a time when he would be obliged to leave the town (which he did in 1588, during a fresh plague) and lodge with his farmers: 'In case of contagion (God forbid), they will give me a room at the house... and I will be able to put my horses in the stable on my way there and back, and they will give me a bed for myself.' The dead pile up in the streets (Defoe: "for the most part on to a cart like common dung"), the palaces of the rich are looted.

Montaigne tells how he wandered in search of a roof when the epidemic reached his estate, 'serving six months miserably as a guide' to his 'distracted family, frightening their friends and themselves and causing horror wherever they tried to settle'.

Daily bread

Wheat

Diets in this period were almost universally vegetable-based, especially outside Europe, for the simple reason that land devoted to cultivation is much more efficient. Braudel focuses on three major crops: wheat, rice, and maize. These crops sit at the basis of everything: they determine population size, and their required inputs determine labor relations, animal usage (which in turn need their own crops), and so on.

Thus there became established in Europe, with certain regional variations, 'a complicated system of relationships and habits', based on wheat and other grains, which was 'so firmly cemented together that no fissure was possible' according to Ferdinand Lot. Plants, animals and people each had their place in it. In fact the whole system was inconceivable without the peasants, the harnessed teams of animals, and the seasonal labourers at harvest and threshing time, since reaping and threshing was all done by hand. The fertile lowlands called on labour from poor land, inevitably wild highland regions. Innumerable examples (the southern Jura and Dombes, the Massif Central and Languedoc) demonstrate that the partnership was a basic rule of life, repeated on many occasions. An immense crowd of harvesters arrived every summer in the Tuscan Maremma, where fever was so prevalent, in search of high wages (up to five paoli a day in 1796). Malaria regularly claimed innumerable victims there.

 

Wheat's unpardonable fault was its low yield: it did not provide for its people adequately. All recent studies establish the fact with an overwhelming abundance of detail and figures. Wherever one looks, from the fifteenth to the eighteenth century, the results were disappointing. For every grain sown, the harvest was usually no more than five and sometimes less.

Until very late, agricultural production was fertilizer-limited. In southern Europe half the field would lie fallow every year, and this only really changed after the industrial revolution. Trade happened on local exchanges, which combined with laws against "hoarding" made local shortages problematic. In the 16thC total maritime trade was perhaps 1% of total consumption. White bread was a luxury until the latter half of the 18thC. Flour doesn't keep well, so every town had a mill that worked daily (about 1 mill per 400 people); any interruption (eg because of the river freezing) immediately created supply problems.

Rice and maize

Rice is an even more tyrannical and enslaving crop than wheat.

The key difference between rice and wheat is that the former can produce ~7.3 million kcals per hectare, whereas wheat can only reach 1.5 million. Unlike wheat, there was no need for fallow land, and by the 13thC in China a system of double (or sometimes triple) crop was established. "And thus the great demographic expansion of southern China began."

The high population density created by rice, combined with the necessity for elaborate top-down irrigation systems, resulted in strong state authority that constantly pursued large-scale works.

The problem then is that on one hand we have a series of striking achievements, on the other, human misery. As usual we must ask: who is to blame? Man of course. But maize as well.

While wheat yielded maybe 5 grains for every one planted, maize would yield 150 or more. It grows easily and requires little effort on the part of the farmer (perhaps 50 days per year). "The maize-growing societies on the irrigated terraces of the Andes or on the lakesides of the Mexican plateaux resulted in theocratic totalitarian systems and all the leisure of the peasants was used for gigantic public works of the Egyptian type."
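
The gap between these crops is easier to appreciate side by side. A back-of-the-envelope sketch in Python using only the figures quoted in these notes (7.3 vs 1.5 million kcal per hectare; 5 vs 150 grains harvested per grain sown). The seed_share helper is an illustration added here, not something from the book, and assumes the same area is resown from the previous harvest:

```python
# Figures as quoted in the notes above.
KCAL_PER_HECTARE = {"rice": 7.3e6, "wheat": 1.5e6}
GRAINS_PER_GRAIN_SOWN = {"wheat": 5, "maize": 150}

# How many times more calories a hectare of rice yields than a hectare of wheat.
rice_vs_wheat = KCAL_PER_HECTARE["rice"] / KCAL_PER_HECTARE["wheat"]

# How many times more grain maize returns per seed than wheat.
maize_vs_wheat = GRAINS_PER_GRAIN_SOWN["maize"] / GRAINS_PER_GRAIN_SOWN["wheat"]


def seed_share(yield_ratio):
    """Fraction of a harvest that must be held back to sow the same area again."""
    return 1 / yield_ratio


print(f"rice vs wheat, kcal per hectare: {rice_vs_wheat:.1f}x")
print(f"maize vs wheat, grains per grain sown: {maize_vs_wheat:.0f}x")
for crop, ratio in GRAINS_PER_GRAIN_SOWN.items():
    print(f"{crop}: {seed_share(ratio):.1%} of the harvest goes back into the ground")
```

At a yield of 5 to 1, a fifth of every wheat harvest has to be kept as seed; at 150 to 1 the seed requirement for maize is under 1%.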

After the discovery of the New World, potatoes and maize flowed back toward Eurasia, but very slowly. It took until the 18thC for maize to see widespread cultivation in Europe. The potato was strongly resisted everywhere, as people thought it caused leprosy or flatulence; it only spread rapidly in the face of famine or war.

Hoe cultivation belt

There is also an enormous region that spans the globe where work was done with a digging stick or hoe, and animals are generally not used. These societies are surprisingly homogeneous:

The world of men with hoes was characterized - and this is the most striking fact about it - by a fairly marked homogeneity of goods, plants, animals, tools and customs. We can say that the house of the peasant with a hoe, wherever it may be, is almost invariably rectangular and has only one storey. He is able to make coarse pottery, uses a rudimentary hand loom for weaving, prepares and consumes fermented drinks (but not alcohol), and raises small domestic animals - goats, sheep, pigs, dogs, chickens and sometimes bees (but not cattle). He lives off the vegetable world round about him: bananas, bread-fruit trees, oil palms, calabashes, taros and yams.

Superfluity and Sufficiency: Food and Drink

Eating Habits

Prices and therefore diets followed population numbers. Large-scale death from war or plague made meat accessible; overpopulation meant the peasants didn't even eat the wheat they produced.

Things had begun to change in the West by the middle of the sixteenth century. Heinrich Muller wrote in 1550 that in Swabia 'in the past they ate differently at the peasant's house. Then, there was meat and food in profusion every day; tables at village fairs and feasts sank under their load. Today, everything has truly changed. Indeed, for some years now, what a calamitous time, what high prices! And the food of the most comfortably-off peasants is almost worse than that of day-labourers and valets in the old days.'

 

The peasant often sold more than his 'surpluses', and above all, he never ate his best produce: he ate millet and maize and sold his wheat; he ate salt pork once a week and took his poultry, eggs, kids, calves and lambs to market.

Spoons and knives were old customs, but the fork dates to the 16thC and spread from Venice.

Anne of Austria ate her meat with her fingers all her life. And so did the Court of Vienna until at least 1651. Who used a fork at the Court of Louis XIV? The Duke of Montausier, whom Saint-Simon describes as being 'of formidable cleanliness'. Not the king, whose skill at eating chicken stew with his fingers without spilling it is praised by the same Saint-Simon! When the Duke of Burgundy and his brothers were admitted to sup with the king and took up the forks they had been taught to use, the king forbade them to use them. This anecdote is told by the Princess Palatine, with great satisfaction: she has 'always used her knife and fingers to eat with'.

The Baron de Tott has left a humorous description of a reception in the country house near Istanbul of 'Madame the wife of the First Dragoman', in 1760. This class of rich Greeks in the service of the Grand Turk adopted local customs, but liked to make some difference felt. 'A circular table, with chairs all round it, spoons, forks - nothing was missing except the habit of using them. But they did not wish to omit any of our manners which were just becoming as fashionable among the Greeks as English manners are among ourselves, and I saw one woman throughout the dinner taking olives with her fingers and then impaling them on her fork in order to eat them in the French manner'.

In the West, eggs were accessible to most people, as were cheese and milk. Butter remained limited to Northern Europe. Fish were generally an important source of nourishment, but with large regional variation. The Atlantic coast was particularly advanced in its exploitation of the ocean.

Fish was all the more important here as religious rulings multiplied the number of fast days: 166 days, including Lent, observed extremely strictly until the reign of Louis XIV. Meat, eggs and poultry could not be sold during those forty days except to invalids and with a double certificate from doctor and priest. To facilitate control, the 'Lent butcher' was the only person authorized to sell prohibited foods at that time in Paris, and only inside the area of the Hotel Dieu.

Sugar was brought from the East, with a lot of regional variation in consumption. "In 1800 England consumed 150,000 tons of sugar annually, almost fifteen times more than in 1700." But in other parts of Europe it was virtually unknown. Cultivation of sugar was a labor- and capital-intensive enterprise, and often in sugar colonies there was no space left for any other crops: food had to be imported.

Drinks, stimulants and drugs

Water was generally hard to find. It couldn't be stored on ships, and many cities (like Venice) lacked a real supply and instead relied on filtered rain water and water brought from the mainland. Few aqueducts remained in use, though some were restored in the 15thC (Rome, Paris). Some places used hydraulic wheels to pump water from rivers. The late 18thC saw steam pumps in London and Paris, replacing water-carrying laborers. Snow water was reserved for the wealthy; there was a trade in it, with ships filled with snow moving around the Mediterranean.

Everyone drank wine, and alcoholism was increasingly a problem. The production was generally in the south of Europe, and trade brought it to the north. But it was all new wine, as it did not keep well: regular use of corks would take until the 17thC. The non-wine growing regions had beer, which the south "vigorously opposed". In some areas consumption reached 3 liters per day. "Beer of superior quality was being exported as far as the East Indies from Brunswick and Bremen by the end of the seventeenth century." Cider only started making headway in the 16thC, among the poor. Other civilizations fermented maple juice, agave, or maize.

The great innovation, the revolution in Europe was the appearance of brandy and spirits made from grain - in a word: alcohol. The sixteenth century created it; the seventeenth consolidated it; the eighteenth popularized it.

Stills existed in the West before the 12thC, but things took a while to get going. And the stills would remain primitive until 1773. The drinks started out as medicine. Various guilds fought hard for the privilege of producing brandy in France. Further north where they had no vines for brandy, grain spirits were most popular. "By the early eighteenth century, the whole of London society, from top to bottom, was determinedly getting drunk on gin."

At nearly the same time as the discovery of alcohol, Europe, at the centre of the innovations of the world, discovered three new drinks, stimulants and tonics: coffee, tea and chocolate. All three came from abroad: coffee was Arab (originally Ethiopian); tea, Chinese; chocolate, Mexican.

Samuel Pepys drank his first cup of tea on September 25, 1660. A century later the English were consuming it by the boatload.

Superfluity and Sufficiency: Houses, Clothes and Fashion

Houses and interiors

The basic constraint on housing was local materials, and as such houses only changed very slowly. Stone mostly for the upper classes; wood (which was gradually replaced by brick) and thatched roof for most people. Earthen dwellings in places where neither stone nor wood existed. In rural areas homes were extremely simple.

Villages were often mobile: "they grew up, expanded, contracted, and also shifted their sites. Sometimes these 'desertions' were total and final - the Wüstungen mentioned by German historians and geographers. More often the centre of gravity within a given cultivated area shifted, and everything - furniture, people, animals, stones - was moved out of the abandoned village to a site a few kilometres away."

On 3 February 1695 the Princess Palatine wrote: 'At the king's table the wine and water froze in the glasses.' [...] When the severity of the weather increased, as in Paris in 1709, 'the people died of cold like flies' (2 March). In the absence of heating since January (again according to the Princess Palatine) 'all entertainments have ceased as well as law suits'.

There were no fireplaces set in the wall before the 12thC. They spread fast, but the design was deficient and they were not very useful for warming homes. It took until the early 18thC for new chimney designs to come along: by utilizing the draught they vastly improved the fireplace's ability to warm the home.

People had almost no furnishings or other possessions. "Official reports for Burgundy between the sixteenth and the eighteenth centuries are full of 'references to people [sleeping] on straw... with no bed or furniture' who were only separated 'from the pigs by a screen'." Outside Europe even chairs were a rarity. In general there was very limited production of such items, and renovations were a large expense even for the rich.

Costume and fashion

Subject to incessant change, costume everywhere is a persistent reminder of social position. The sumptuary laws were therefore an expression of the wisdom of governments but even more of the resentment of the upper classes when they saw the nouveaux riches imitate them.

In societies that remained stable over time, so did dress. China, Japan, even Algiers. "The Indian women in New Spain in Cortes' day wore long tunics, sometimes embroidered, made of cotton and later of wool: and so they did still in the eighteenth century. Male costume, on the other hand, changed - but only to the extent that the conquerors and missionaries demanded clothing decently concealing the nudity of the past." Even in Western Europe in the early 19thC, peasants were still wearing simple coarse cloth that had not changed much for centuries. "In fact, the further back in time one goes, even in Europe, one is more likely to find the still waters of ancient situations like those we have described in India, China and Islam. The general rule was changelessness." The long robes which had persisted from Roman times were only abandoned around 1350.

Tradition was both a strength and a straitjacket. Perhaps if the door is to be opened to innovation, the source of all progress, there must be first some restlessness which may express itself in such trifles as dress, the shape of shoes and hairstyles? Perhaps too, a degree of prosperity is needed to foster any innovating movement?

People in Europe were dirty. Late 18thC Parisians might bathe once or twice per year.

The West even experienced a significant regression from the point of view of body baths and bodily cleanliness from the fifteenth to the seventeenth centuries. [...] After the sixteenth century, public baths became less frequent and almost disappeared, it was said because of the risk of infection and in particular the terrible disease of syphilis. Another reason was no doubt the influence of preachers, both Catholic and Calvinist, who fulminated against the moral dangers and ignominy of the baths. Although rooms for bathing survived in private homes for a long time, the bath became a means of medication rather than a habit of cleanliness.

The Spread of Technology: Sources of Energy, Metallurgy

There are times when technology represents the possible, which for various reasons - economic, social or psychological - men are not yet capable of achieving or fully utilizing; and other times when it is the ceiling which materially and technically blocks their efforts. In the latter case, when one day the ceiling can resist the pressure no longer, the technical breakthrough becomes the point of departure for a rapid acceleration. However, the force that overcomes the obstacle is never a simple internal development of technology or science, or at any rate not before the nineteenth century.

Energy was the key problem. Coal had been used in Europe since the 11thC and in China perhaps as early as 4000 BC, but it took very long to realize how much potential it had. Instead the main sources of energy were humans, animals, wind and water, and wood.

Particularly outside Europe, human power was used to an extreme degree. And cheap labor was a problem for the development of machinery.

The precondition for progress was probably a reasonable balance between human labour and other sources of power. The advantage was illusory when man competed with machines inordinately, as in the ancient world and China, where mechanization was ultimately blocked by cheap labour.

In the Old World, camels and mules were indispensable for transportation. Oxen were everywhere, mostly for working the land but also for transportation. Later farming practices replaced them with horses, but that required horse technology improvements such as better harnesses (and it would take very long for these advancements to spread - "The Chinese were still using wooden saddles and ordinary ropes instead of reins in the eighteenth century.") Lavoisier estimated 1.8 million horses and 3 million oxen in France.

The West experienced its first mechanical revolution in the eleventh, twelfth and thirteenth centuries. Not so much a revolution, perhaps, as a whole series of slow changes brought about by the increased numbers of wind- and watermills. The power from these 'primary engines' was probably not very great, from two to five horse-power from a water-wheel, sometimes five, at most ten, from the sails of a windmill. But they represented a considerable increase of power in an economy where power supplies were poor. And they undoubtedly played a part in Europe's first age of growth.

 

The uses of the water-wheel had become manifold; it worked pounding devices for crushing minerals, heavy tilt hammers used in iron-forging, enormous beaters used by cloth fullers, bellows at iron-works; also pumps, grindstones, tanning mills and paper mills, which were the last to appear. We should also mention the mechanical saws that appeared in the thirteenth century.

Watermills provided power for mines, which saw a rise in the 15thC: they raised ore, ventilated galleries, pumped water, etc. On the eve of the industrial revolution there were perhaps 500,000 watermills in Europe.

Windmills were a later invention, and the key development was to fit the wheel vertically (as opposed to horizontally, as they had been used in China for centuries), which greatly increased their power. Their uses were not limited to milling; in the Netherlands they drove bucket chains that drained water, a key instrument for land reclamation.

Wood was important both directly as a source of energy when burned, and as a building material for machines, ships, etc. Transportation costs were huge unless it could be floated down a waterway. By the 18thC demand and prices had skyrocketed. "In France in the eighteenth century, it was said that a single forge used as much wood as a town the size of Chalons-sur-Marne. Enraged villagers complained of the forges and foundries which devoured the trees of the forests, not even leaving enough for the bakers' ovens."

As for coal, there were two key locations in Europe: Liege and Newcastle. Newcastle's coal production increased 15x between the mid-16th and mid-17th century.

It was an integral part of the coal revolution that modernized England after 1600, enabling fuel to be used in a series of industries with large outputs: the manufacture of salt by evaporating sea water; the production of sheets of glass, bricks, and tiles; sugar refining; the treatment of alum, previously imported from the Mediterranean but now developed on the Yorkshire coast; not to mention the bakers' ovens, breweries and the enormous amount of domestic heating that was to pollute London for centuries.

 

There was thus an often imperceptible or unrecognized industrial pre-revolution in an accumulation of discoveries and technical advances, some of them spectacular, others almost invisible: various types of gear-wheels, jacks, articulated transmission belts, the 'ingenious system of reciprocating movement', the fly-wheel that regularized any momentum, rolling mills, more and more complicated machinery for the mines. [...] It is revealing to see how European travellers unfailingly comment on the contrast between the primitive machinery in use in India and China, and the quality and refinement of its products.

 

With the coming of steam, the pace of the West increased as if by magic. But the magic can be explained: it had been prepared and made possible in advance.

Iron

Today production is calculated in thousands of tons; 200 years ago they talked about 'hundredweights', which were quintals, the equivalent of fifty present-day kilograms. That is the difference in scale. It divides two civilizations. As Morgan wrote in 1877: 'When iron succeeded in becoming the most important production material, it was the event of events in the evolution of humanity.'

In 1800 metallurgy was still mostly traditional, and the economy was dominated by textiles. Metallurgical products other than luxury items did not travel.

We are speaking of the period before the first smelting of steel, before the discovery of puddling, before the general use of coke for smelting, before the long sequence of famous names and processes: Bessemer, Siemens, Martin, Thomas. We are speaking of what was still another planet.

There were two major advances: an early one in China which stagnated by the 13thC, and the later one in Europe leading up to the industrial revolution.

After two smeltings in the crucible, the product obtained enabled the Chinese to cast ploughshares or cooking pots in series - an art that the West discovered only some eighteen or twenty centuries later. [...] Another triumph of Asiatic smelting by crucible was the manufacture - thought by some to be of Indian origin, by others Chinese - of a special kind of steel, 'high quality carbonized steel', as good as the best hypereutectoid steels made today. The nature of this steel and the secrets of its manufacture remained a mystery to Europeans until the nineteenth century. [...] What is so extraordinary is that after this incredibly early start, Chinese metallurgy progressed no further after the thirteenth century. Chinese foundries and forges made no more discoveries, but simply repeated their old processes. Coke-smelting - if it was known at all - was not developed. It is difficult to ascertain this, let alone explain it. But Chinese development as a whole poses the same problem time after time: veiled in mystery, it has not yet been resolved.

In Europe, the water-wheel was crucial in the development of iron-smelting, starting with blast furnaces in the 14thC. Water powered enormous bellows and pounding devices - ironworks had to move from forests to riversides. Generally everything was made in small workshops with a master and 3 or 4 workers, but these tended to be concentrated: Brescia had perhaps 200 arms factories.

The Spread of Technology: Revolution and Delays

Innovations penetrated only slowly and with difficulty. The great technological 'revolutions' between the fifteenth and eighteenth centuries were artillery, printing and ocean navigation. But to speak of revolution here is to use a figure of speech. None of these was accomplished at breakneck speed, and only the third - ocean navigation - eventually led to an imbalance, or 'asymmetry' between different parts of the globe.

Gunpowder

Produced in China from the 9thC. In Europe, it took until the 14-15thC for pieces to become larger and gunpowder cheaper. Mobility was an issue: large teams of horses were needed to move them. Early cannons fired on walls almost at point-blank range. Defense design changed from stone ramparts to earthworks. Installed on ships very early on; by the late 14thC all English ships had some artillery. But it was a bit of a mess, and cannon-ports were not a regular feature up to the 16thC. Arquebuses appear in the 15thC, slow and cumbersome. Muskets a bit later, similar issues. Only with the rifle at the start of the 18thC do we start seeing large changes.

The new warfare had huge costs, favoring centralization and rich states: independent cities (which had preserved their autonomy in the Middle Ages) were eliminated as their walls were easily knocked over by huge cannons.

But the cost of artillery did not end when it had been built and supplied with ammunition. It had also to be maintained and moved. The monthly bill for maintenance of the fifty pieces the Spaniards had in the Netherlands in 1554 (cannon, demi-cannon, culverins and serpentines) was over forty thousand ducats. To set such a mass in motion required a 'small train' of 473 horses for the mounted troops and a 'large train' of 1014 horses and 575 wagons (with 4 horses each), or 4777 horses in all, which meant almost 90 horses per piece. At the same period a galley cost about 500 ducats a month to maintain.

In the late 16thC, Venice had gunpowder in store that cost more than the entire annual receipts of the city.

Paper and Printing

A similar story to gunpowder. Originally developed in the East. The industry took off through the application of water-wheel power to manufacture. "The invention travelled round the world. Like gunners looking for hire, printing workers with makeshift equipment wandered at random, settled down when the opportunity offered and moved on again to accept the welcome of a new patron." Spread fairly quickly around Europe at the end of the 15thC. Perhaps 20 million books printed before 1500 (for a population of 70 million). A key ingredient in 16thC humanism (spreading Greek/Latin thought and mathematics), and later the Reformation and Counter-Reformation.

Ocean Navigation

"The conquest of the high seas gave Europe a world supremacy that lasted for centuries." It also presents a problem: why was this technology not diffused into other cultures?

The Chinese junks, despite their many advantages (sails, rudders, hulls with watertight compartments, compasses after the eleventh century, and a large displacement volume from the fourteenth), went as far as Japan but did not venture beyond the Gulf of Tonkin to the south.

Shipbuilding technology in Europe drew from diverse traditions. The 15thC Portuguese caravel was a marriage of north and south. There was a fairly long history of exploration: the Faroes and Greenland were found multiple times in the first millennium. The Vivaldi brothers attempted to reach the Indies at the end of the 13thC, but were lost at sea. In the 15thC the Chinese started making some voyages of exploration under the Muslim eunuch admiral Cheng Huo. The seventh and last voyage reached Hormuz. Then everything just stopped.

The Atlantic consists of three large wind and sea circuits, shown on a map as three great ellipses. The currents and winds will take a boat in either direction with no effort on its part, as both the Vikings' circuit of the North Atlantic and the voyage of Columbus demonstrate.

For this to be achieved, "Europe had to be aroused to a more active material life, combine techniques from north and south, learn about the compass and navigational charts and above all conquer its instinctive fear." Perhaps the growth of capitalist forces was what made these voyages possible. But it was not entirely a matter of money: both China and Islam were rich societies at the time.

What historians have called the hunger for gold, the hunger to conquer the world or the hunger for spices was accompanied in the technological sphere by a constant search for new inventions and utilitarian applications - utilitarian in the sense that they would actually serve mankind, making human labour both less wearisome and more efficient. The accumulation of practical discoveries showing a conscious will to master the world and a growing interest in every source of energy was already shaping the true face of Europe and hinting at things to come, well before that success was actually achieved.

Transport

Up to the eighteenth century, sea journeys were interminable and overland transport went at snail's pace. [...] The 'defeat of distance', as Ernst Wagemann calls it, was only to be achieved after 1875, with the laying of the first intercontinental cable. True mass communication on a world scale did not appear until the age of the railway, the steamship, telegraph and telephone. Very little changed in terms of the means of transportation across this time. Paul Valery pointed out that 'Napoleon moved no faster than Julius Caesar'. Stone/paved roads increased speeds a bit, but these long remained exceptions. The 18thC saw improvements with paved roads + stagecoaches, prefiguring the railway. These were the result of large-scale investment: economic growth made possible in practice what had been technically possible much earlier.

Roadside inns and staging houses were important; typically these had to be reached by evening. "A Neapolitan traveller described these inns more simply in 1693: 'They are nothing but... long stables where the horses occupy the central part; the sides are left for the Masters.' [...] Amenities and speed were the privileges of populated and firmly maintained, 'policed', lands: China, Japan, Europe, Islam." In the rest of the world travel was even more difficult.

Sea routes were fixed, being dependent on winds. Water was more efficient of course (perhaps by a factor of 100!), so waterways brought activity to the areas around them.

Money

The same process can be observed everywhere: any society based on an ancient structure which opens its doors to money sooner or later loses its acquired equilibria and liberates forces that can never afterwards be adequately controlled.

Barter remained the general rule over most of the globe up to the 18th century. Depending on local conditions barter could be partially replaced by primitive currencies such as cowrie shells. Often a highly valued/circulated commodity played the role of money: salt in Senegal, dried fish in Iceland, furs in Alaska and Russia. Other places used cloth, gold dust, copper bracelets, animals, sugar, or cocoa. In some places these lasted for a very long time: Corsica "was not annexed by a really efficient monetary economy until after the First World War."

Early metallic money faced problems with speculation, only existed in large denominations, and was often scarce. The limitations meant that the coins barely touched the masses. Japan, India, Islam, and China were familiar with coinage from early on. China even experimented with paper money from the 9th to the 14thC, but hyperinflation ruined the system. Afterwards China used cumbersome copper and lead coins, with silver for higher level transactions.

In Europe the metals used were typically gold, silver, and copper. When and where these were used depended on the economy, the relative values of the metals, etc.

Their production was irregular and never very flexible, so that depending on circumstances, one of the two metals would be relatively more plentiful than the other; then, with varying degrees of slowness, the situation would reverse, and so on. This resulted in upsets and disasters on the exchanges, and led above all to those slow but powerful fluctuations which were a feature of the monetary ancien regime. It is a well-known truth that 'silver and gold are hostile brothers'.

In general, after the age of exploration, specie flowed from the New World and Europe into the Indies and China, as that is what the Europeans exchanged for commodities from the East.

The 'jingle of coin' thus found its way into everyday life by many different paths. The modern state was the great provider (taxes, mercenaries' pay in money, office-holders' salaries) and recipient of these transfers; but not the only one. Many people were well placed to benefit: the tax-collector, the salt-tax farmer, the pawnbroker, the landowner, the large merchant entrepreneur and the 'financier'. Their net stretched everywhere. And naturally this new wealthy class, like their equivalent today, did not arouse sympathy.

Paper Money and Credit

To be found in circulation alongside metallic money were both fiduciary money (bank notes) and scriptural money (created by the process of book-keeping, by transferring money from one bank account to another: a practice known to the Germans as Buchgeld, book money).

The use of notes in trade is ancient (at least from 2000 BC), and was also well-known outside of Europe. The Europeans rediscovered bills of exchange in the 13thC: "When the West rediscovered the old instruments, it was not like discovering America. In fact every economy that found itself restricted by metallic currency fairly quickly opened up instruments of credit of its own accord, as though in a logical and natural development. They sprang from its commitments, and no less from its shortcomings."

What began to happen very soon was the artificial manufacture of money, of ersatz or perhaps one might say 'manipulated and manipulable' money. All those bank promoters and eventually the Scot, John Law, gradually realized 'the business potentialities of the discovery that money - and hence capital in the monetary sense of the term - can be manufactured or created'. This was both a sensational discovery (a lot better than the alchemists!) and a huge temptation. And what a revelation it is for us: it was the slow pace of the heavy metal money, its failure so to speak to keep the engine running, that created the necessary profession of banker, at the very dawn of economic life. He was the man who repaired or tried to repair the mechanical breakdown.

Towns and Cities

Towns, cities, are turning-points, watersheds of human history. When they first appeared, bringing with them the written word, they opened the door to what we now call history. Their revival in Europe in the eleventh century marked the beginning of the continent's rise to eminence. When they flourished in Italy, they brought the age of the Renaissance. So it has been since the city-states, the poleis of ancient Greece, the medinas of the Muslim conquest, to our own times. All major bursts of growth are expressed by an urban explosion.

 

If towns are considered to be settlements of over 400 inhabitants, then 10% of the English population was living in towns in 1500, and 25% in 1700. But if 5000 is taken as the minimum definition, the figure would only be 13% in 1700, 16% in 1750, 25% in 1801.

The fundamental aspects of towns: power, markets, division of labor. Cities were population sinks, drawing in immigrants from the countryside. Lots of poverty, lots of death, lots of abandoned children, lots of old/sick/dying in horrible poor-houses like the Hotel-Dieu. The squares would fill up every morning with peasants selling fresh produce. Except for a few places like England, towns all had fortifications. Growth was "organic": city planning had died with the Roman Empire. However, outside of Europe and Islam, the grid pattern was a universal standard.

The West had long ensured security at a low cost by a moat and a perpendicular wall. This did little to interfere with urban expansion - much less than is usually thought. When the town needed more space the walls were moved like theatre sets - in Ghent, Florence, and Strasbourg, for example - and as many times as was required. Walls were made-to-measure corsets. Towns grew and made themselves new ones.

 

Western towns faced severe problems from the fifteenth century onwards. Their populations had increased and artillery made their ancient walls useless. They had to be replaced whatever the cost, by wide ramparts half sunk in the ground, extended by bastions, terrepleins, 'cavaliers', where loose soil reduced possible damage from bullets. These ramparts were wider horizontally and could no longer be moved without enormous expense. And an empty space in front of these fortified lines was essential to defence operations; buildings, gardens and trees were therefore forbidden there.

The consequence was vertical growth and higher land prices inside the towns. Carriages from the 16thC onwards created huge problems as the streets were generally not equipped to deal with them.

Islamic towns were very large as a rule, and distant from each other. [...] The Great Mosque stood in the centre, with shopping streets (souqs) and warehouses (khans or caravanserai) all around; then a series of craftsmen ranged in concentric circles in a traditional order which always reflected notions concerning what was clean and what was unclean.

The Originality of Western Towns

The West quite soon became a kind of luxury of the world. The towns there had been brought to a pitch hardly found anywhere else.

Its towns were marked by an unparalleled freedom. They had developed as autonomous worlds and according to their own propensities. They had outwitted the territorial state, which was established slowly and then only grew with their interested cooperation - and was moreover only an enlarged and often insipid copy of their development. They ruled their countrysides autocratically, regarding them exactly as later powers regarded their colonies, and treating them as such. They pursued an economic policy of their own via their satellites and the nervous system of urban relay points; they were capable of breaking down obstacles and creating or recreating protective privileges.

But the main, the unpredictable thing was that certain towns made themselves into autonomous worlds, city-states, buttressed with privileges (acquired or extorted) like so many juridical ramparts.

The town was able to try the experiment of leading a completely separate life for quite a long time. This was a colossal event. Its genesis cannot be pinpointed with certainty, but its enormous consequences are visible.

 

They invented public loans: the first issues of the Monte Vecchio in Venice could be said to go back to 1167. [...] One after another, they reinvented gold money. [...] They organized industry and the guilds; they invented long-distance trade, bills of exchange, the first forms of trading companies and accountancy. They also quickly became the scene of class struggles.

 

Capitalism and towns were basically the same thing in the West. Lewis Mumford humorously claimed that capitalism was the cuckoo's egg laid in the confined nests of the medieval towns. By this he meant to convey that the bird was destined to grow inordinately and burst its tight framework (which was true), and then link up with the state, the conqueror of towns but heir to their institutions and way of thinking and completely incapable of dispensing with them.

 

Only the West swung completely over in favour of its towns. The towns caused the West to advance. It was, let us repeat, an enormous event, but the deep-seated reasons behind it are still inadequately explained. What would the Chinese towns have become if the junks had discovered the Cape of Good Hope at the beginning of the fifteenth century, and had made full use of such a chance of world conquest?

The Big Cities

For a long time the only big cities in the world had been in the East and Far East. Marco Polo's amazement makes it clear that the East was the site of empires and enormous cities. With the sixteenth century, and more still during the following two centuries, large towns grew up in the West, assumed positions of prime importance and retained them brilliantly thereafter.

Braudel examines some of the most important cities: Naples, Paris, St. Petersburg, Peking.

In London, the agglomeration of huge masses of poor people was seen as a threat.

In Elizabeth's reign observers already regarded London as an exceptional world. For Thomas Dekker it was 'the Queene of Cities', made incomparably more beautiful by its winding river than Venice itself judged by the marvellous view of the Grand Canal (a very paltry sight compared with what London could offer). Samuel Johnson (20 September 1777) was even more lyrical: 'when a man is tired of London, he is tired of life; for there is in London all that life can afford.' [...] The royal government shared these illusions, but it was none the less in constant fear of the enormous capital. In its eyes London was a monster whose unhealthy growth had to be limited at all costs. [...] The first prohibition on new building (with exceptions in favour of the rich) appeared in 1580. Others followed in 1593, 1607 and 1625. The result was to encourage the dividing-up of existing houses and secret construction-work in poor brick in the courtyards of old houses, away from the street and even from minor alleys.

Regardless, it grew from about 93,000 inhabitants in 1563 to over 700,000 in 1700.

In the seventeenth and eighteenth centuries fresh expansion pushed the town in all directions at once. Appalling districts grew up on the outskirts - shanty towns with filthy huts, unsightly industries (notably innumerable brickworks), pig farms using household refuse for feed, accumulations of rubbish, and sordid streets.

What can we conclude? That London, alongside Paris, was a good example of what a capital of the ancien régime could be. A luxury that others had to pay for, a gathering of a few chosen souls, numerous servants and poor wretches, all linked however, by some collective destiny of the great agglomeration.

 

The truth is that these densely populated cities, in part parasites, do not arise of their own volition. [...] The world of the ancien régime, very largely a rural one, was slowly but surely collapsing and being wiped out. And great cities were not alone in bringing about the painful birth of the new order. It was often as spectators rather than participants that the capital cities watched the coming Industrial Revolution. Not London, but Manchester, Birmingham, Leeds, Glasgow and countless small mill-towns launched the new age. It was not even the capital accumulated by eighteenth-century patricians that was first invested in the new ventures. London did not take advantage of the industrial movement through her financial assets until about 1830. Paris for a moment looked as if she might welcome the new industry, but was quickly displaced by the establishment of the real industrial centres near the coalmines of the north, the waterpower of Alsace, and the iron of Lorraine.

Conclusion

Books, even history books, run away with their authors. This one has run on ahead of me. But what can one say about its waywardness, its whims, even its own logic, that will be serious and valid? Our children do as they please. And yet we are responsible for their actions.

 

Material life, of course, presents itself to us in the anecdotal form of thousands and thousands of assorted facts. Can we call these events? No: to do so would be to inflate their importance, to grant them a significance they never had. That the Holy Roman Emperor Maximilian ate with his fingers from the dishes at a banquet (as we can see from a drawing) is an everyday detail, not an event. So is the story about the bandit Cartouche, on the point of execution, preferring a glass of wine to the coffee he was offered. This is the dust of history, microhistory in the same sense that Georges Gurvitch talks about micro-sociology: little facts which do, it is true, by indefinite repetition, add up to form linked chains. Each of them represents the thousands of others that have crossed the silent depths of time and endured.

 

It is a fact that every great centre of population has worked out a set of elementary answers and has an unfortunate tendency to stick to them out of that force of inertia which is one of the great artisans of history. What is a civilization then, if not the ancient settlement of a certain section of mankind in a certain place? It is a category of history, a necessary classification. Mankind has only shown any tendency to become united (and has certainly not yet succeeded) since the end of the fifteenth century. Until then, and the further we go back in time the more obvious it becomes, humanity was divided between different planets, each the home of an individual civilization or culture, with its own distinctive features and age-old choices. Even when they were close together, these solutions never combined.

 

In a context where other structures were inflexible (those of material life and, no less, those of ordinary economic life) capitalism could choose the areas where it wished and was able to intervene, and the areas it would leave to their fate, rebuilding as it went its own structures from these components, and gradually in the process transforming the structures of others.

 

I did not think it was possible to achieve an understanding of economic life as a whole if the foundations of the house were not first surveyed.


Get the book on Amazon.




The Best and Worst Books I Read in 2019

The Best

Stefan Zweig, The World of Yesterday

From the late 19th century to World War II, through the eyes of Stefan Zweig. From the stable order, prosperity, and peace of the Austro-Hungarian Empire to fratricidal wars, Weimar Germany, and Nazism. Mechanization, electrification, the breakdown of ossified social and political structures. Beautifully written and just incredibly sad and wistful. Zweig finished the manuscript in February 1942, posted it to his publisher, and killed himself the next day.

For I have indeed been torn from all my roots, even from the earth that nourished them, more entirely than most in our times. I was born in 1881 in the great and mighty empire of the Habsburg Monarchy, but you would look for it in vain on the map today; it has vanished without trace. I grew up in Vienna, an international metropolis for two thousand years, and had to steal away from it like a thief in the night before it was demoted to the status of a provincial German town. My literary work, in the language in which I wrote it, has been burnt to ashes in the country where my books made millions of readers their friends. So I belong nowhere now, I am a stranger or at the most a guest everywhere.

 

Leo Tolstoy, The Kreutzer Sonata

Chesterton wrote that Tolstoy pitied humanity not only for its pains, but also for its pleasures: "He weeps at the thought of hatred; but in The Kreutzer Sonata he weeps almost as much at the thought of love." Delightfully dark and cynical to an extreme degree—like a presentiment of Cioran in the form of a Russian moralist. Short, dense, and endlessly quotable.

And when one looks at our golden youth, at our officers, at those Parisians! And when all those gentlemen and myself, debauchees in our thirties with hundreds of the most varied and abominable crimes against women on our consciences, go into a drawing-room or a ballroom, well scrubbed, clean-shaven, perfumed, wearing immaculate linen, in evening dress or uniform, the very emblems of purity – aren’t we a charming sight?

 

Timur Kuran, Private Truths, Public Lies

This book is drier than the Sahara, but its insights on preference falsification are essential for understanding politics, public opinion, public discourse, and how governments control their subjects. How and why people present false opinions in public, how people try to influence this expression, epistemic and political effects. It's tough, but worth it: you will see the world through new eyes after reading it.

In Havel's own words, the crucial "line of conflict" thus ran not between the Party and the people but "through each person," for everyone was "both a victim and a supporter of the system."

 

Julien Gracq, Château d'Argol

An extraordinarily strange work that I initially didn't love, but grew to appreciate as I kept thinking about it for months and months. If you like Poe, Huysmans, Gustave Moreau, or surrealism, this is the book for you. A creepy castle, a dark forest, and lots of impenetrable symbols. Everything is somehow...suspended and quasi-magical, detached from the rules of everyday life. Gracq's elaborate style makes Lovecraft look like Hemingway and the translation is excellent.

A curious feature is the complete lack of dialogue. Instead, there are descriptions of conversations, which never touch on the subject but focus on the atmosphere and effects.

Herminien possessed the gift of penetrating the secrets of literature and art with subtle and perfect taste, revealing, however, their mechanism rather than all the power of the grace they contained.

 

Sir Thomas Browne, Hydriotaphia, Urne-Buriall, or, a Discourse of the Sepulchrall Urnes lately found in Norfolk

This classic essay from 1658 counts among its fans Borges, De Quincey, Robert Louis Stevenson, and Poe. Not so much for its contents, but rather for its style: endless sentences, over-the-top abuses of Latinate words, striking metaphors and imagery. It "smells in every word of the sepulchre", wrote Emerson. Browne begins by discussing an archaeological find in a dry, almost anthropological mode and then moves on to death, religion, burial rites, fame and infamy—all the while the style escalates to a wonderful Baroque crescendo in the final chapter.

There is no antidote against the Opium of time, which temporally considereth all things; Our Fathers finde their graves in our short memories, and sadly tell us how we may be buried in our Survivors. Grave-stones tell truth scarce fourty years: Generations passe while some trees stand, and old Families last not three Oaks. To be read by bare Inscriptions like many in Gruter, to hope for Eternity by Ænigmaticall Epithetes, or first letters of our names, to be studied by Antiquaries, who we were, and have new Names given us like many of the Mummies, are cold consolations unto the Students of perpetuity, even by everlasting Languages.

 

Max Frisch, Man in the Holocene

An experimental novella about knowledge, memory, and aging. It reads like a literary collage; I've never read anything like it. Frisch manages to evoke an overwhelming atmosphere of internal and external decay and collapse. Chaos and nature dismantling order. Man dies, the natural cycle continues.

While Geiser is wondering why he wanted a candle in the middle of the afternoon, he remembers having intended to seal a document, his final instructions in case anything happened. His resolve, as he searches for a pan, is to clean out his closet one of these days. But the pan, the little one, is already standing on the hot plate, the water in it bubbling, though the hot plate is no longer glowing. He forgot, while thinking about the untidiness of his closet and about his heirs, that he had already drunk his tea; the empty cup is warm, the tea bag dark and wet.

 

Michael Benson, Space Odyssey: Stanley Kubrick, Arthur C. Clarke, and the Making of a Masterpiece

Imagine the perfect book about the making of 2001: A Space Odyssey. Well, this is it. Exhaustive but never dull, it really manages to capture the Herculean nature of the accomplishment. The pictures are incredible.

Clarke even predicted to Kubrick, “This is the last big space film that won’t be made on location.”

 

Raymond Queneau, Exercises in Style

A very fun little book which I read based on Calvino's recommendation. A banal anecdote is repeated 99 times in 99 different styles ("botanical", "abusive", "dream", "animism", "apheresis", etc). It seems like it would get boring, but he really pulls it off. It's got a playful energy to it, and the concept sustains itself all the way through. Impressive translation, too.

Olfactory

In that meridian S, apart from the habitual smell, there was a smell of a beastly seedy ego, of effrontery, of jeering, of H-bombs, of a high jakes, of cakes and ale, of emanations, of opium, of curious ardent esquimos, of tumescent venal double-usurers, of extraordinary white zoosperms, there was a certain scent of long juvenile neck, a certain perspiration of plaited cord, a certain pungency of anger, a certain loose and constipated stench, which were so unmistakeable that when I passed the gare Saint-Lazare two hours later I recognised them and identified them in the cosmetic, modish and tailoresque perfume which emanated from a badly placed button.

 

Rutilius Claudius Namatianus, De Reditu Suo

A fascinating poem from the late Roman Empire. Rutilius was a high-ranking administrator from Gaul who still clung to paganism in an increasingly Christian empire. Rome had been sacked by Alaric in 410, and the poem was written 6 years after that. It describes Rutilius' journey from Rome back to his homeland. There's a Wolfeian dying world feel to it: they go by ship because the land route has been devastated by the invasion. Rutilius decries Christian ascetics hiding out from the world, though never losing hope that Rome will bounce back. Filled with all sorts of casual, mundane details that are usually left out of ancient accounts: annoying innkeepers, makeshift tents on the beach, meeting old friends.

Unfortunately the second half has been lost.

Each seventh day to shameful sloth's condemned,
Effeminate picture of a wearied god!
Their other fancies from the mart of lies
Methinks not even all boys could believe.
Would that Judea ne'er had been subdued
By Pompey's wars and under Titus' sway!
The plague's contagion all the wider spreads;
The conquered presses on the conquering race.

 

Herman Melville, Moby-Dick; or, The Whale

Strange, experimental, at times Shakespearean, at times postmodern. Not even remotely flawless, but what book of this magnitude could be? Layers upon layers! Genuinely one of those "great, imperfect, torrential works, books that blaze a path into the unknown." America, God, capitalism, race, whiteness, literature, man vs nature, revenge, obsession, madness.

"Vengeance on a dumb brute!" cried Starbuck, "that simply smote thee from blindest instinct! Madness! To be enraged with a dumb thing, Captain Ahab, seems blasphemous."

"Hark ye yet again—the little lower layer. All visible objects, man, are but as pasteboard masks. But in each event—in the living act, the undoubted deed—there, some unknown but still reasoning thing puts forth the mouldings of its features from behind the unreasoning mask. If man will strike, strike through the mask! How can the prisoner reach outside except by thrusting through the wall? To me, the white whale is that wall, shoved near to me. Sometimes I think there's naught beyond. But 'tis enough. He tasks me; he heaps me; I see in him outrageous strength, with an inscrutable malice sinewing it. That inscrutable thing is chiefly what I hate; and be the white whale agent, or be the white whale principal, I will wreak that hate upon him. Talk not to me of blasphemy, man; I'd strike the sun if it insulted me. For could the sun do that, then could I do the other; since there is ever a sort of fair play herein, jealousy presiding over all creations. But not my master, man, is even that fair play.

 

W. Stanley Moss, Ill Met by Moonlight

Most World War II memoirs are filled with brutality, endless carnage, mud, ice, and/or burning tropical heat. This is a World War II memoir in the form of a bucolic comedy. A couple of charming and eccentric British officers (one was Moss, the other was famous travel writer Patrick Leigh Fermor) based in Egypt come up with the idea of abducting General Kreipe from Crete. They go to Crete, do a bit of spying, have a great time with the peasant guerrillas in the mountains drinking raki and wine, enjoying great food, and reading Baudelaire. Eventually they get to the general, which involves all sorts of daring deeds and close calls as the Nazis try to capture them and they try to escape back to Egypt. It's a fantastic (if somewhat juvenile) adventure all-round.

We are now hiding in a delightful spot which is about a quarter of a mile from Patsos. We sleep in a stone-walled hut which has been built against the base of a steep cliff, so with trees on three sides and the cliff behind us we could not have found a more sheltered position. Close at hand there is a waterfall, and all day long we hear the sound of water as it tumbles away, down and down into the valley. This sound seems to attract every bird in the neighbourhood, and from dawn till dusk we can hear nightingales singing in the trees around us. Nightingales seem mostly to sing in the day-time in Crete—but that’s Crete all over. “No sense of timing,” as Paddy has said.

 

Herman Melville, Bartleby, the Scrivener

The polar opposite of Moby-Dick in almost every way. This is one of those "perfect exercises of the great masters", short and flawless. Sloth, refusal, solitude, and negation.

I would prefer not to.

 

Michel Houellebecq, H. P. Lovecraft: Against the World, Against Life

A very amusing essay, in that dry Houellebecqian way. Covers a lot of ground in very little space: Lovecraft's life, his themes, his style, critical reactions. Avoids the cheap psychologizing that many Lovecraft critics fall into. And reveals a lot about Houellebecq himself, too.

Lovecraft, in fact, hasn’t got the attitude of a novelist. Hardly any novelist of any description imagines that it is within his capacities to give an exhaustive depiction of life. Their mission is rather to “shed new light” on it; but given the facts themselves there is absolutely no choice. Sex, money, religion, technology, ideology, redistribution of wealth…a good novelist can’t ignore anything. And all this must take place within a coherent vision, grosso modo, of the world. Obviously the task is scarcely humanly possible, and the result almost always disappointing. A nasty profession.

 

Cormac McCarthy, Blood Meridian, or The Evening Redness in the West

Desolation, deserts, blood, mud, savagery, endless death, and senseless brutality. A lean and muscular style that goes perfectly with its subject matter. A Landian novel if there ever was one. The Judge is an incredible character who cuts straight to the darkest aspects of life, nature, and the universe as a whole. And what an unforgettable ending.

Whatever exists, he said. Whatever in creation exists without my knowledge exists without my consent. He looked about at the dark forest in which they were bivouacked. He nodded toward the specimens he’d collected. These anonymous creatures, he said, may seem little or nothing in the world. Yet the smallest crumb can devour us. Any smallest thing beneath yon rock out of men’s knowing. Only nature can enslave man and only when the existence of each last entity is routed out and made to stand naked before him will he be properly suzerain of the earth.

 

H. G. Wells, The Time Machine

I went into this one expecting a comfy Victorian adventure; instead I got a Nietzschean/Lovecraftian evolution-horror story about degeneration and the end of all life. Kind of falls apart towards the end, but it's a good read.

The brown and charred rags that hung from the sides of it, I presently recognized as the decaying vestiges of books. They had long since dropped to pieces, and every semblance of print had left them. But here and there were warped boards and cracked metallic clasps that told the tale well enough. Had I been a literary man I might, perhaps, have moralized upon the futility of all ambition. But as it was, the thing that struck me with keenest force was the enormous waste of labour to which this sombre wilderness of rotting paper testified.

 

 


The Worst

Gregory Bateson, Steps to an Ecology of Mind

This collection of essays is filled with a series of baseless pronouncements on a bewildering array of topics: Bateson jumps from von Neumann's game theory to William Blake to learning and meta-learning to schizophrenia to anthropology to evolution to Freud to "the epistemology of cybernetics" to consciousness, all the time hinting that there is some sort of profound Grand Correspondence between them. But he never articulates it because no such correspondence actually exists.

Toward the end of the book he drops even the pretense of rigour and just devolves into hippie ramblings about environmentalism and LSD, which feels like a fitting conclusion.

I personally do not believe that the dolphins have anything that a human linguist would call a “language.” I do not think that any animal without hands would be stupid enough to arrive at so outlandish a mode of communication.

 

Paul Feyerabend, Against Method

Some of the arguments Feyerabend makes are just bad in a normal way. He uses a series of post hoc ergo propter hoc arguments to push the idea that anti-scientific methods were necessary for scientific progress (counterfactuals? what are those?). He argues that alternatives to a theory must always precede the detection of anomalies, despite a million different counterexamples in the history of science.

At other times his arguments are bad in a theatrical, spectacular, absurd way. For example when he says that the Catholic church was right to suppress Galileo because he threatened social harmony, or when he praises Mao's Cultural Revolution (when universities were closed and scientists were sent to do manual labor in fields and factories) for "improving the practice of medicine". Feyerabend somehow simultaneously argues in favor of totalitarian government control of science and "methodological anarchism". He never seems to notice the contradiction.

Of course, our well-conditioned materialistic contemporaries are liable to burst with excitement over events such as the moonshots, the double helix, non-equilibrium thermodynamics. But let us look at the matter from a different point of view, and it becomes a ridiculous exercise in futility. It needed billions of dollars, thousands of well-trained assistants, years of hard work to enable some inarticulate and rather limited contemporaries to perform a few graceless hops in a place nobody in his right mind would think of visiting - a dried out, airless, hot stone. But mystics, using only their minds, travelled across the celestial spheres to God himself, whom they viewed in all his splendour, receiving strength for continuing their lives and enlightenment for themselves and their fellow men. It is only the illiteracy of the general public and of their stern trainers, the intellectuals, and their amazing lack of imagination that makes them reject such comparisons without further ado.

 

Benjamin Noys, Malign Velocities: Accelerationism & Capitalism

This book is written at the intellectual level of a youtube comment. Noys ignores Land because he thinks a critique of his work would be "superfluous", which is a bit like me dismissing Roger Federer as a tennis player by saying that a game between us would be superfluous. Bad history, worse economics, and filled with endless whining.

The relative lack of commodities – at first glance anti-pleasure – would actually allow for a less extreme division of labor, freeing one from illusory ‘choices’ and the mental overload of advertising, as well as a greater (if not absolute) freedom from the tyranny of things.

 

Alfred North Whitehead, Science and the Modern World

Whitehead co-wrote Principia Mathematica with Russell. Whitehead was Quine's thesis advisor. Yet this book comes across as the ramblings of a hippie who took one too many hits of LSD. A bizarre attempt to counter materialism and replace it with a kooky pantheistic metaphysics where everything perceives everything else ("For Berkeley's mind I substitute a process of prehensive unification") and events happen multiple times in multiple locations because things perceive each other. God, religion, social progress, science, environmentalism, art...my notes for this book are just filled from top to bottom with question marks.

Cognition is the emergence, into some measure of individualised reality, of the general substratum of activity, poising before itself possibility, actuality, and purpose.

 

Eric Voegelin, Science, Politics and Gnosticism

Science, Politics and Gnosticism is basically this tweet in book form:

The shit he was on about at the time was gnosticism, on which he blames everything bad from Nietzsche and Hobbes to Hegel and Marx. Ultimately gnosticism has very little to do with the point being made, and that point is preposterous and badly argued. Anyone going into this not already believing in God is just wasting their time, as Voegelin is only interested in preaching to the choir.

A second complex of symbols that runs through modern gnostic mass movements was created in the speculation on history of Joachim of Flora at the end of the twelfth century.

 

Haruki Murakami, Hard-Boiled Wonderland and the End of the World

I loved the first chapter, thinking it was stand-alone. Imagine if it had no follow-up, just a guy in an elevator and then a long walk. Fantastically crazy. But then the rest of the novel happens. I'm not entirely sure what I was expecting with this one, but I had the impression that Murakami was...good? This is sub-pulp garbage. Murakami is a Kmart P K Dick, and Dick's not exactly highbrow to begin with.

The SF bits are not really SF, they're fantasy papered-over with a bit of technospeak. Despite this there are endless, meaningless, tedious infodumps. What could possibly be the point of an infodump about a nonsensical and arbitrary system of magic? And what's the deal with all the chicks being desperately thirsty for the protagonist's cock? It's like something out of an extremely bad harem anime, yet it doesn't appear to be ironic.

 




Having Had No Predecessor to Imitate, He Had No Successor Capable of Imitating Him

It is against nature that he made the most excellent creation that could ever be; for things are normally born imperfect, then grow and gather strength as they do so. He took poetry and several other sciences in their infancy and brought them to perfect, accomplished maturity. [Thus] one may call him the first and last of poets, in accordance with that fine tribute left to us by antiquity: that, having had no predecessor to imitate, he had no successor capable of imitating him.

That's my old pal Michel talking about Homer.1 He is almost completely wrong.

The Homeric Question

What he gets right is progress in the arts: early Greek sculpture was copied from the Egyptians and was so amateurish that it seems abstract, but the lack of detail simply reflects a lack of ability. The art was "born imperfect" and slowly gathered strength, eventually reaching an apex four or six centuries later with pieces like the Laocoön.

Homer appears to completely break any such notions of artistic development: the Iliad and the Odyssey emerge at the end of the 8th century with no visible tradition behind them, and until the 19th century this miraculous event was accepted at face value, elevating Homer from merely a great poet to a superhuman figure.2

But the Age of Reason rolled around and not even Homer remained unruffled. In 1795 Friedrich August Wolf published Prolegomena ad Homerum which established the basis for the Homeric Question debates over the next century (though, as the 11th edition of the Encyclopædia Britannica reminds us, his argument was so overwhelming it would not be challenged for 30 years3). Wolf was skeptical, likening the poems to an enormous ship constructed far inland, with no access to tools or water.

I find it impossible to accept the belief to which we have become accustomed: that these two works of a single genius burst forth suddenly from the darkness in all their brilliance, just as they are, with both the splendor of their parts and the many great virtues of the connected whole.

Wolf approached the problem by historicizing it, tracing the text and its historical context from Homer down to the first century.4 Like Sherlock Holmes, he systematically eliminated the impossible until only the truth remained:

  • There was no plausible occasion at which a poem of 15,000 lines would be recited.
  • There is no internal evidence in the Iliad for books or writing.5
  • If writing existed at all in Homer's time, it was primitive and utilized a limited alphabet.
  • The Greeks at that time had no access to papyrus or parchment, materials necessary to preserve a poem of this length: it must have been transmitted orally (and therefore lossily).
  • The earliest long-form writing came much later than Homer and was utilitarian/public in nature, for example Solon's laws written on wood for public display.
  • It took yet more time for writing to move from the public sphere to the artistic one.
  • The poem is full of "obvious and imperfectly fitted joints" which suggest alterations from a "later period".
  • The Alexandrian editors of the Hellenistic age felt free to edit the Homeric poems: "often removing many verses, and elsewhere adding polish where there was none".

Taken together, these arguments imply that 1) Homer did not compose anything like the Iliad as we know it, and 2) even if he somehow had done it, it was hopelessly corrupted in the process of transmission.6

Instead Wolf proposed that Homer was responsible for some small parts which were then combined and expanded by Peisistratus with the assistance of his poet friends in the middle of the 6th century, and finally polished by Aristarchus or Aristophanes 400 years later.7

Following Wolf, a cottage industry of Homeric analysis cropped up. Any line that did not suit their tastes was declared an interpolation, and repetitions were taken as proof of copying by later poets. They all agreed that the Iliad was the work of multiple poets, and that philologists could "scientifically prove"8 it. Yet no two Analysts could ever agree on which parts were Homer's and which were later additions.

Bethe thought it was the work of two poets. Theiler argued for 5 or 6. Hermann proposed that the Iliad consisted of a bunch of crap surrounding the pristine core of the original Homer, while Lachmann argued that it actually consisted of 18 different folk stories strung together, like the Finnish Kalevala. Ulrich von Wilamowitz-Moellendorff (I swear I'm not making up these names) claimed to detect three different "layers" in the Odyssey, one of which (the "old Odyssey"—books 5-14 and 17-19) had in turn been compiled from three different, even earlier poems, two of which had originally been part of other, longer poems. Wilamowitz went as far as to call the Iliad "a miserable piece of patchwork". Bryant argued the city of Troy never existed at all.

Arrayed against the Analysts were the Unitarians led by Gregor Wilhelm Nitzsch, who still believed the poems were mostly the work of a single author. Unable to counter Wolf's historical arguments, they relied on internal evidence, appealing to the poem's stylistic unity and intricate structure. How is it possible, they would ask, for a collection of different folk tales to have a coherent plot and characters, and where do all these elaborate structural correspondences come from?

The Unitarians had an anti-enlightenment aesthetic, fundamentally doubting the possibility of logical analysis of poetry. De Quincey got involved and declared all the external arguments irrelevant: "all arguments worth a straw in this matter must be derived from the internal structure of the 'Iliad.'" And in some cases the Unitarian cause was wrapped up with religious conservatism: if one accepted the Analyst arguments in the case of Homer, one could hardly deny them in the case of the Old Testament.

Perhaps the most famous Unitarian was Heinrich Schliemann, who was not a philologist at all but rather a globetrotting con artist and businessman. After amassing a vast fortune, Schliemann dedicated his life to finding the ruins of Troy. His trust in Homer was complete (even compared to other Unitarians, most of whom believed Homer contained no useful topographical details), and it was rewarded when he discovered Troy exactly where Homer said it would be. What he actually found was an earlier city on the same site as Troy, and unfortunately he had no training as an archaeologist, which meant that he completely destroyed the ruins, but Homer had led him to the right place.9

The battle raged on, and by the early 20th century the Analysts had the upper hand while Unitarianism was a heretical minority view. Thus in 1920 Georg Brandes could write that "save for a few uncritical people, of course, no one to-day believes that a single poet named Homer wrote either the Iliad or the Odyssey". Yet in the same year, John A. Scott, a respected philologist, published his passionately polemical The Unity of Homer, ridiculing the German Analysts and proclaiming the greatness of Homer.

And there were problems that neither side could solve, despite more than a century of intense scholarship, such as the peculiar mixture of dialects in Homer. The Iliad and the Odyssey are composed in a language that was never spoken by anyone, a blend of dialects from all across Greece. This was obviously a problem for the Unitarians, but the Analysts did not have any explanations either: the dialects were not cleanly separated (as you might expect from a combination of different poems), but jumbled together, sometimes even within a single word: a prefix from one dialect and a suffix from another. It was difficult to argue that 6th century Athenian poets wrote like this.

Abduction

In the early 19th century, scientists started noticing that the orbit of Uranus did not match the predictions of Newton's laws. In 1846, the French astronomer Urbain Le Verrier proposed a solution: he hypothesized the existence of an undiscovered planet and, based on the magnitude of the perturbation, worked backwards and deduced its mass and orbit. Le Verrier sent his results by post to Johann Gottfried Galle at the Berlin Observatory. They were received on September 23. At midnight, Neptune was discovered—exactly in the position predicted. It was an absolute triumph: a bold mathematical prediction and confirmation of Newton's laws, instantly validated by the discovery of an entirely new planet.

A decade later, Le Verrier reported that the precession of the perihelion of Mercury did not match the predictions of Newton's laws. But he knew what to do, and once again estimated the mass and orbit of a hypothetical new planet which would explain the anomaly. We may deduce his confidence from the fact that he quickly named this planet—"Vulcan". (Astute readers may recall that no planet with that name can be found in our solar system.)

Astronomers failed to detect Vulcan, though they kept trying with every new eclipse up to 1908. In the face of this failure, they started coming up with increasingly absurd solutions: Asaph Hall suggested changing Newton's law of gravitation from $G m_1 m_2 / r^2$ to $G m_1 m_2 / r^{2.00000016}$. Others thought there was an invisible band of matter inside Mercury's orbit.

Sometimes the bottleneck to scientific progress is not data, but hypotheses. And hypotheses tend to be trickier to acquire. The problem was finally solved in 1915 when Einstein introduced general relativity, which perfectly predicted Mercury's perihelion advance without any ad hoc hypotheses or fudge factors.

The pragmatist philosopher Charles Sanders Peirce coined the term "abduction" (as opposed to "deduction" and "induction") to describe inference from effect to cause (in other words, from data to novel and successful hypotheses). As Le Verrier's example shows, it is a fickle art. The fact that abduction can be done at all, wrote Peirce, is "the most surprising of all the wonders of the universe".10

Part of what makes abduction so surprising is that scientific discovery often involves eureka moments, when the unconscious suddenly provides the solution to a previously insoluble problem. Ludwik Fleck characterized the process of discovery as going "from false assumptions and irreproducible initial experiments", through "many errors and detours" to eventually arriving at a result whose development none of the principal actors can explain. Gauss once said after a eureka moment: "I have the result, only I do not yet know how to get to it".11

Philosophers of science have generally avoided the topic entirely,12 using phrases like "ineffable", "nonrational", "intractable", and "surrounded by dense mists of romanticism".13 Popper famously wrote a book called The Logic of Scientific Discovery in which he denies any role to logic in scientific discovery.14 Einstein himself thought that "there is no way from experience to the setting up of a theory". Philosophers argue the matter should be left to psychologists, but psychologists haven't exactly taken up the torch. Neuroscientists haven't done much better. And from a computational complexity perspective, abduction is more or less impossible.

But I think there are reasons to be optimistic. First, the abductive process can at least be steered to some degree. It's not like scientists working on wallaby testicles have sudden insights about neutron stars. Second, some people clearly have greater abductive abilities than others, which at least tells us that there is something that one can have more of which improves the process. Finally, the frequent phenomenon of multiple discovery shows that abduction maybe isn't so magical after all: once the requisite foundations are in place, making the next leap is easy enough that we often get two15 people doing it independently. Skeptics ask awkward questions like: if abduction is a rational process, why haven't we discovered the rules? But just because they are hard to find doesn't mean that they do not exist.

To sum things up: your brain is a black box that "you" feed hard problems to and occasionally get answers from which "you" could never consciously discover, based on some alchemical mixture of conscious work, intuition, heuristics, thought experiments, and other impenetrable subconscious processes. Nobody knows how or why this magical meat works, nobody has even proposed a set of rules that might approximate the abduction process, and the continued upward techonomic trajectory of our civilization fundamentally depends on its continued operation.

The Solution

By the early 1900s the Homeric Question war had been raging for over a century, with no end in sight. Homerists had all the data, but what they really needed was the right hypothesis. The man to provide it was the American Milman Parry, who proposed a third alternative: neither pastiche nor unitarian composition, but the culmination of a formulaic oral tradition. The first breakthrough came with his 1928 dissertation, L'Épithète Traditionnelle dans Homère: Essai sur un problème de style Homérique16 in which Parry demonstrated that Homer composed orally, using fixed phrases and combinations called formulae. Five years later he travelled to Bosnia with Albert Lord, seeking out oral poets in places still unspoiled by literacy.17 By studying their technique and comparing it to Homer, they confirmed the oral-formulaic hypothesis.

To understand this system we have to start with a short refresher on Homer's meter, the dactylic hexameter: five dactyls (a long syllable followed by two short ones, —UU) plus a spondee (— —) or trochee (—U) at the end:

— U U, — U U, — U U, — U U, — U U, — —

Imagine a rapper freestyling a song about a war. Now imagine him doing that for 15,693 lines (or about 24 hours). Now imagine not just freestyling, but freestyling to a specific meter. Extemporaneous composition was only possible because Homer had a vast repertoire of formulae which fit onto the dactylic hexameter like lego blocks. For example, Achilles is πόδας ὠκὺς Ἀχιλλεύς (swift-footed Achilles) when Homer needs to fill seven syllables, and δῖος Ἀχιλλεύς (godlike Achilles) when he needs to fill five (and both formulae appear only at the end of the line).
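The lego-block idea can be sketched as a simple lookup: given the hero and the number of syllables left at the end of the line, return the formula that fills the gap. This is only a toy illustration, not a model of Parry's actual analysis. The table `FORMULAE` and the function `pick_formula` are hypothetical names, and the only entries are the two Achilles formulae quoted above, with the syllable counts given in the text.

```python
# Toy sketch of formula selection by metrical slot.
# FORMULAE and pick_formula are hypothetical names for illustration only;
# the two entries and their syllable counts are the ones quoted in the text.
FORMULAE = {
    ("Achilles", 7): "πόδας ὠκὺς Ἀχιλλεύς",  # swift-footed Achilles
    ("Achilles", 5): "δῖος Ἀχιλλεύς",  # godlike Achilles
}


def pick_formula(character, syllables_left):
    """Return the line-ending formula that fills exactly `syllables_left`
    syllables for `character`, or None if the table has no such entry."""
    return FORMULAE.get((character, syllables_left))


# The singer has seven syllables left before the end of the line:
print(pick_formula("Achilles", 7))
# Only five syllables left:
print(pick_formula("Achilles", 5))
```

Keying the table on the pair (character, slot length) means each slot holds at most one formula, so the singer never has to choose between alternatives in the middle of a line.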

Parry found two essential qualities in this system: scope, and economy. Homer had "a noun-epithet formula to meet every regularly recurring need. And what is equally striking, there is usually only one such formula". The formulae are not limited to epithets, but also cover type-scenes such as battles and speeches—explaining the repetitions which the Analysts blamed on copying by later poets. Similarly, the various small inconsistencies are simply the result of the method of composition: it's hard to keep every detail in mind when you're busy composing and singing.18

We shall find then, I think, that this failure to see the difference between written and oral verse was the greatest single obstacle to our understanding of Homer, we shall cease to be puzzled by much, we shall no longer look for much that Homer would never have thought of saying, and above all, we shall find that many, if not most of the questions we were asking, were not the right ones to ask.

Thus by abducting the right hypothesis, Parry could explain everything that had been mysterious about Homer's style. The mixture of dialects was a natural result of an old and widespread tradition. Archaisms and neologisms could exist side by side because the formulae evolved at different points in time.

The formulaic system was vast and Parry emphasized that it was impossible for any one man to put it together: this is where the element of tradition comes in. It developed by a slow process of accumulation, a long lineage of bards across the centuries, each one adding a few phrases of his own to the great edifice of epic poetry.

We can use the development of the Greek language to trace the age of various formulae. Once a phrase was put in a particular metric place, it tended to stay there regardless of whether it continued to actually fit the meter. Early Greek featured a letter called the digamma (which represented the /w/ sound) and formulae were composed using that letter. As the language evolved the digamma disappeared and the formulae no longer fit the meter—but they stayed in the same place. This was a mortal blow to the Analysts: it would have been an extraordinary coincidence for a 6th century poet to write a metrically faulty line as if it had been caused by a letter that he had never heard of.

Just like the oral tradition preserved the ghost of a lost letter, it also preserved forgotten words: for "the great number of noun-epithet formulas in which the meaning of the epithet has been lost to us", the meaning had been lost to Homer also. And with the language it also preserved details from the world of Bronze Age Greece: wealthy Mycenae, war chariots, boar tusk helmets, etc. If the poems had continued being oral after Homer's time, the language would have continued evolving. Thus we can establish probable dates for the composition based on the neologisms.

Comically, the Germans simply pretended that Parry did not exist for a while. Harder wrote in 1942 that "no one any longer doubts that Homer could write, and wrote his poetry down".19 In the end, even though almost none of Homer's language was his own direct creation, you could say the Unitarians won: in his 2011 review of the Homeric Question, Martin West writes "we all agree these days that the Iliad and Odyssey are unified poems [...] And we all accept the implication, that this design is a design conceived by a single author, however much it may owe to earlier poems". If there were any doubt left, computer-based stylometric analysis finds "an astonishing homogeneity" in the Iliad (though not so much in the Odyssey).20

The final piece of the puzzle fell into place in the 1950s when Michael Ventris deciphered Linear B. Up to that point it was believed that the Mycenaeans (the stars of Homer's story) were simply not Greek. Ventris himself was absolutely certain they were Etruscan, writing that the theory that Linear B "could be Greek is based...on a deliberate disregard for historical plausibility".21 This raised all sorts of questions about the transmission of details that predate the 13th century collapse. But since they were Greek, an oral epic tradition reaching back to that time is entirely plausible. This also provides us with an elegant explanation for the Bellerophon story: epic poetry preserved the idea of writing even though the technology had been lost for over 500 years:

"Would you be killed, o Proitos? Then murder Bellerophontes who tried to lie with me in love, though I was unwilling."
So she spoke, and anger took hold of the king at her story.
He shrank from killing him, since his heart was awed by such action,
but sent him away to Lykia, and handed him murderous symbols,
which he inscribed in a folding tablet, enough to destroy life,
and told him to show it to his wife's father, that he might perish.

And what about the transmission issues? The most popular view today (though there is no consensus) is the amanuensis theory: Homer was illiterate, but someone wrote down his song and passed it along. The example of Hesiod, whose long written works survived, shows that it's not outside the realm of possibility. Some of the difficulties Wolf pointed out are still there, and Haslam (a supporter of the amanuensis theory) says that writing down the poems would have been "an enterprise so remarkable that it is hard to credit". But the alternatives seem yet more doubtful.

In the 18th and 19th centuries many scholars, in whose minds literacy and intelligence were inextricably linked, worried that illiteracy would degrade Homer's achievement.22 But it only makes it more impressive. A poem like the Iliad can only exist because of Homer's illiteracy, not despite it. Once literacy penetrated the Greek world, oral poetry disappeared. By the 6th century the bards had been replaced by rhapsodoi, no longer creative poets but merely parrots like Plato's Ion.

Some argue that Homer was a transitional figure, that he could write but still knew the ways of oral composition. Others think that's impossible. The idea of a literate Homer opens up all sorts of intriguing and unlikely possibilities: writing—the technology that killed oral poetry—also immortalized what used to be transient and perishable. Remarkably, what it preserved was a poem about the use of art and fame (klea andron, the famous deeds of men) to overcome mortality. Perhaps the Iliad is one big allegory for the written word. Would Homer have been conscious of the decline of oral poetry? Ultimately, the only reason we can read him today is that he happened to live in that tiny sliver of time when oral poetry was still alive and the technology existed to preserve it for the future. We will never know what great poets sang into the void across those letterless aeons.

Not only did Homer have predecessors to imitate, he was drawing on an ancient tradition that survived the Greek Dark Ages—an eternal flame passed from singer to singer across the centuries, sustaining a vague, mythic memory of a fallen civilization. A tradition that was quickly extinguished when, on a bright summer morning, a curious trader asked some Phoenicians about their scribbles.

Further Reading

  • G. S. Kirk, The Songs of Homer
  • E. R. Dodds, Homer, in M. Platnauer (ed.) Fifty Years of Classical Scholarship
  • John Wright (ed.), Essays on the "Iliad": Selected Modern Criticism
  • George Steiner & Robert Fagles (eds.), Homer: A Collection of Critical Essays

  1. The "tribute left to us by antiquity" is from Velleius Paterculus 1.5: in quo hoc maximum est, quod neque ante illum, quem ipse imitaretur, neque post illum, qui eum imitari posset, inventus est.
  2. Such a miraculous event really did occur a bit later, with Herodotus and Thucydides. Perhaps the moral is that history is easier than poetry.
  3. "The effect of Wolf’s Prolegomena was so overwhelming that, although a few protests were made at the time, the true Homeric controversy did not begin till after Wolf’s death (1824). His speculations were thoroughly in harmony with the ideas and sentiment of the time, and his historical arguments, especially his long array of testimonies to the work of Peisistratus, were hardly challenged."
  4. He was mainly inspired by Eichhorn and Heyne, who approached the Bible as a historical and anthropological document with multiple authors.
  5. Actually, there is: the Bellerophon story. Wolf interpreted it away.
  6. There was actually nothing original in Wolf's argument. Everything he wrote had been written before by other scholars: Blackwell, Wood, Bentley, Heyne, Eichhorn, even Rousseau. Casaubon had already written about the transmission problems in 1583. But Wolf's synthesis of all the arguments, combined with a masterful presentation, made those views persuasive.
  7. Wolf never really explained how these people could produce poems from a very different era, why they would insert episodes they considered immoral, or why they were content to credit Homer for their own efforts. A century after Wolf, Comparetti commented: "where and when are great poets known and proved to have been so humble?"
  8. Eduard Meyer, Geschichte des Altertums
  9. Schliemann also excavated in Mycenae, and his finds there (like the Mask of Agamemnon) confirmed the Homeric idea that it was an extremely wealthy place. Mycenae never recovered from the 13th century Bronze Age Collapse, so by Homer's time it was a backwater.
  10. This business is additionally complicated by the issue of underdetermination but let's leave that aside for now.
  11. Which of course raises the question: did his subconscious have the proof? Why not "send up" the proof as well as the answer? Or was it just a guess? In which case how does the brain generate such good guesses? Perhaps it was simply a display of superiority.
  12. With the exception of Carnap's abortive attempt in The Logical Structure of the World (1928).
  13. Simon, Models of Discovery
  14. "The initial stage, the act of conceiving or inventing a theory, seems to me neither to call for logical analysis nor to be susceptible of it."
  15. Or more: Merton found two cases of 9 independent co-discoverers.
  16. In case you have forgotten your high-school French, Parry outlined his argument in English in two essays which can be found in Harvard Studies in Classical Philology, vols. 41 (1930) and 43 (1932).
  17. An example of the importance of contingency in scientific progress: in 1819, the Austrian Empire’s renowned Slavic scholar, Jernej Kopitar, argued in a letter to Wolf that “today there is no better match for your Homeric ‘Homerids’ than in Serbia and Bosnia”. Wolf never bothered to follow up on the lead.
  18. This might seem a bit mechanical, and Parry believed that the epithets were "purely ornamental", but later work has shown that they are context-sensitive to some degree. Still, words were not Homer's "unit of composition" as they would be for a literate poet; he composed by combining phrases. See John Miles Foley on the context-sensitive usage of epithets.
  19. Das Neue Bild der Antike, ed. Berve, p. 102.
  20. Dietmar Najock, Letter-Distribution and Authorship in Early Greek Epics
  21. His belief was so powerful that even after successfully deciphering the script, he dismissed the result "as a mirage and cast the idea aside". It took him several months to figure out he was actually right. Eventually, with the right hypothesis, everything fell into place: "Once I made this assumption, most of the peculiarities of the language and spelling which had puzzled me seemed to find a logical explanation, and although many of the tablets remain as incomprehensible as before, many others are suddenly beginning to make sense." And speaking of inexplicable scientific leaps, Ventris was never able to produce a narrative of the method that allowed him to decipher Linear B.
  22. Dodds: "That the Homeric poems are oral compositions is of course no new idea. It was put forward by Robert Wood in 1767, and developed (as regards the original state of the poems) by Wolf in 1795. But before Milman Parry it was open to any one to deny the assertion, and many scholars dismissed it as out of the question."