Links & What I've Been Reading Q1 2022

Links

Machine Learning

1. Incredibly cool from DeepMind: ML applied to ancient Greek fragments can generate restoration hypotheses for the missing text and locate the fragment's origin in both time and place. Paper in Nature.

2. Incredibly uncool:

These researchers built an AI for discovering less toxic drug compounds. Then they retrained it to do the opposite. Within six hours it generated 40,000 toxic molecules, including VX nerve agent and "many other known chemical warfare agents."

Sufficiently advanced AI alignment is indistinguishable from AI risk?

3. Fantastic Gwern theory-fiction: It Looks Like You're Trying To Take Over The World.

4. Also on LW, Brain Efficiency: Much More than You Wanted to Know:

Eventually advances in software and neuromorphic computing should reduce the energy requirement down to brain levels of 10W or so, allowing for up to a trillion brain-scale agents at near future world power supply, with at least a concomitant 100x increase in GDP. All of this without any exotic computing.

5. Also on LW, New Scaling Laws for Large Language Models.

Forecasting

6. Karger, Atanasov & Tetlock, Improving Judgments of Existential Risk: Better Forecasts, Questions, Explanations, Policies.

7. How good are generalist forecasters vs experts, really? Gavin Leech revisits the literature and argues against the superforecasters. They still do as well as or slightly better than the experts, but not by much. I feel the way the results are presented is a bit misleading.

Metascience

8. Derek Thompson in the Atlantic on Silicon Valley science funding.

9. In what sense is the science of science a science?

What makes my spidey sense tingle is that the objects in any such theory are (in part) a hypothetical space of possible discoveries, of possible explanations of the world. I called it a theory of discovery just above, but it might equally well be called a theory of the unknown, or theory of exploration, or theory of theories. Of course, some of the objects of any such theory would also be amenable to more standard descriptions: things like exploration strategies, or group dynamics. But some would be a lot stranger: currently unknown types of explanation, currently unknown types of theoretical entity.

Economic History

10. WW2 Japanese internment camps? You guessed it, Good, Actually! Internment had a positive effect on long-run incomes on the order of 9-22%. And remember to burn the cities, too. h/t ADS

11. Some issues with Putterman & Weil (2010). Judging by the new results, it doesn't seem all that problematic for the deep roots lit?

Book Reviews

12. There's a new Landmark Edition out, Xenophon's Anabasis. Here's a short review.

13. ZHPL on TLP's Sadly, Porn.

14. Scott on the same (the reviews are complementary goods).

Covid

15. Vaccination Rates and COVID Outcomes across U.S. States finds that it takes about $5000 worth of vaccines to save a life. Would be interesting to see a comparison to molnupiravir in terms of dollars per life saved.

16. A report from a covid human challenge experiment. Hopefully this paves the path for a faster response against the next pandemic.

The Rest

17. Against the Naming of Fungi

The egotism and futility of these costly initiatives is quite mind-boggling as the human threat to biological diversity multiplies. Rather than competing with animal and plant taxonomists, mycologists should show pluck in asserting philosophical independence from the waning fields of zoology and botany. By turning our attention towards experimental questions and away from cataloguing, mycologists may escape the shackles of Linnean fundamentalism.

18. Related(?), SMTM on citrus taxonomy, "in which the Bene Gesserit attempt to breed the Kumquat Haderach".

19. Luttwak on China: The myth of Chinese supremacy

Always improbable, G-2 became impossible when Xi Jinping arrived. For him only G-1 is good enough. Not because he is a megalomaniac but the opposite: he thinks, accurately, that unless the Party establishes an unchallenged global hegemony, with its rule deemed superior to democratic governance, Communist China will collapse just as Soviet rule did. He is right.

20. Indian National Stock Exchange CEO scandal:

The drama intensified in February, when the Securities and Exchange Board of India released a 190-page regulatory order disclosing that Ramkrishna had sent sensitive information to an outsider described as a yogi in the Himalayas. [...] The yogi was non-corporeal, she said, but corresponded using the email address [email protected]

21. On the role of mathematics in the neolithic revolution. "The mathematical abilities of Neolithic humans advanced in concert with the new requirements of agricultural life. These needs can be summed up into three categories: Surplus, Trade, and Time." Here's Wikipedia on the Rhind Mathematical Papyrus, which dates to the 16thC BC.

22. From the new Institute for Progress, Progress is a Policy Choice.

23. Ed West on the coming demographic issues: 'Children of Men' is really happening (actually understates the problem imo).

24. Theses and counter-theses on sleep. Seems like one of those things where there's tons of variation and you're probably best off doing some rigorous self-experimentation?

25. Death Toll of Price Limits and Protectionism in the Russian Pharmaceutical Market. In 2012, Russia put price caps and protectionist regulations on various pharmaceuticals. The result was a decrease in supply, leading to a striking increase in mortality from diseases those drugs protect against.

26. Fluvoxamine-caffeine interaction:

Just learned that fluvoxamine, a common SSRI used to treat depression and other psychiatric conditions, increases the half-life of caffeine in the bloodstream. Like, to an absurd degree:

27. Modeling assortative mating and genetic similarities between partners, siblings, and in-laws

We found evidence of genetic similarity between partners for educational attainment (rg = 0.37), height (rg = 0.13), and depression (rg = 0.08). Common genetic variants associated with educational attainment correlated between siblings above 0.50 (rg = 0.68) and between siblings-in-law (rg = 0.25) and co-siblings-in-law (rg = 0.09). Comparisons between the genetic similarities of partners and siblings indicated that genetic variances were in intergenerational equilibrium. This study shows genetic similarities between extended family members and that assortative mating has taken place for several generations.

28. New EA GWAS with N=3 million, 12-16% variance explained.

29. "Las Pozas ("the Pools") is a surrealistic group of structures created by Edward James in a subtropical rainforest in the Sierra Gorda mountains of Mexico. It includes more than 80 acres (32 ha) of natural waterfalls and pools interlaced with towering surrealist sculptures in concrete."

30. The Senseless, Tragic Rape of Charles Bukowski’s Ghost by John Martin’s Black Sparrow Press

31. Stalin's amused notes on Lysenko.

32. A letter from Claude Shannon to Warren McCulloch, in behalf of L. Ron Hubbard.

33. No peeing towards Russia.

Audio-Visual

34. They found Shackleton's ship in the Antarctic, and it's perfectly preserved.

35. Kogonada's After Yang is one of my favorite new films in years. What if Roy Batty were a personal assistant, and what happens to his adoptive family after he dies? A poignant and wistful film about memory, death, and the legacy we leave behind us.

36. And here's DJ Shadow remixing King Gizzard & The Lizard Wizard.

What I've Been Reading

  • How to Think Like Shakespeare: Lessons from a Renaissance Education, by Scott L. Newstok. A Romantic old-man-yells-at-clouds tirade about modern education practices. It didn't change any of my views, but it didn't really attempt to do so in the first place: Newstok is a reformist, while I am strictly an abolitionist—and therefore far outside the target audience. I find it hard to separate mass education from the commoditization of knowledge, while Newstok believes we can have our cake and eat it too. In any case, if you want a passionate argument in favor of high-quality education interspersed with Shakespeare quotes, this is the book for you.
  • Dune, by Frank Herbert. Pretty great. Herbert constructs a deeply alluring world which pulls you in despite some rather hilariously implausible aspects. It's interesting how so much of the "plot" actually happens in the background. The audiobook is quite good.
  • Dune Messiah, by Frank Herbert. I was told the sequels get crazy, and this is a pretty good start in that direction! Can't wait to see where this nonsense ends up. This is basically a book of palace intrigue and scheming, with a rich religious/predestination/weird time loop sauce on top.
  • Star Maker, by Olaf Stapledon. I kept thinking that it felt like a really weird throwback to the 1920s-30s, then I looked it up and it was written in 1937. Whoops. It's a non-stop torrent of interesting science fiction ideas, but there's no continuity, no characters to latch on to, and the examination of the ideas stays at the surface level. It's just a series of "this happened, then this happened, then this happened" which I found rather boring.
  • The Island of Doctor Death and Other Stories and Other Stories, by Gene Wolfe. Some fantastic stories in this collection, in particular I loved Feather Tigers, Death of Dr. Island, Toy Theater, and Seven American Nights. Many of them are in that classic Wolfe style where you have to piece together what's going on from tiny hints left in the text, and it's all a bit ambiguous in the end and so on. There's a lot of focus on religion and death (with two stories, The Hero as Werwolf and The Doctor of Death Island, being fairly explicitly death-ist).
  • Orphans of the Sky, by Robert Heinlein. Fairly standard generation ship story. Juvenile and ham-fisted (there's a scene where the protagonist literally yells out "and yet it moves!"). Mutants and knife fights and all that. 12-year-old me would've loved it.
  • Wittgenstein's Nephew, by Thomas Bernhard. Bernhard documents his friendship with Paul Wittgenstein (not the pianist), a black sheep of the Wittgenstein family who suffered from various mental problems. They're both rejected by Austrian society, and they both reject it. Bernhard's attitude toward awards (he views them as a kind of insult and punishment) really sums up his relation to his country. A bitter book, sad and pathetic and miserly. Recommended if you're in the market for a feel-bad memoir.
  • The Status Game: On Social Position and How We Use It, by Will Storr. There's quite a bit of overlap with The Elephant in the Brain, but Storr's book is obviously more focused on status. Also reminiscent of Goffman's Presentation of Self in Everyday Life. Lots of references to Boehm, Henrich, Kuran, Wrangham, etc. (You're probably better off going straight to the source?) If I had to choose between this and Elephant I'd go for Elephant, but they're fairly complementary so it won't be a waste of your time to read both. Parts of the book are focused on contemporary culture war issues, which felt a bit shallow and tiresome. Overall it's not bad though.
  • The Biology of Moral Systems, by Richard Alexander. There's a great core here, but I wouldn't recommend it. The basic idea of approaching moral systems from an evopsych perspective is useful. However, huge swathes of text are wasted on dull and low-quality academic bickering, many of the specifics (eg the arguments on the development of religion) are completely off, and the last third of the book is dedicated to a mostly fruitless discussion of nuclear war and mutually assured destruction.



The Best and Worst Books I Read in 2021

The Best

Ibn Battutah, The Travels of Ibn Battutah

Also known as A Masterpiece to Those Who Contemplate the Wonders of Cities and the Marvels of Travelling, this is a wonderful travelogue from the 14th century (or, more appropriately, the 8th century of the Hegira). Battutah was born in Morocco; he was not wealthy, but he was well-educated and went into the family business of Islamic law. At age 21, he set out for the pilgrimage to Mecca. He would extend his journey for decades, however, following traders in ships and caravans, relying on generous Muslim institutions and his talent for befriending rulers. He eventually covered virtually the entire Islamic world and beyond, from North Africa to China.

Battutah gets into all sorts of adventures (luckily escaping death by disease, shipwreck, pirates, bandits, and so on) and provides us with some incredible ethnographic observations. In Constantinople, he meets the Emperor. In India, he becomes a prominent and wealthy administrator under the rule of an erratic Sultan. In the Maldives, he marries six local women and lives a life of leisure under the shade of the palm trees. Yet his wanderlust compels him to keep moving. Battutah himself as a person, however, remains tantalizingly obscure.

Having divorced my wives I set sail. We came to a little island in the archipelago in which there was but one house, occupied by a weaver. He had a wife and family, a few coco-palms and a small boat, with which he used to fish and to cross over to any of the islands he wished to visit. His island contained also banana bushes, but we saw no land birds on it except two crows, which came out to us on our arrival and circled above our vessel. And I swear I envied that man, and wished that the island had been mine, that I might have made it my retreat until the inevitable hour should befall me.

 

Don DeLillo, Libra

A semi-fictionalized biography of Lee Harvey Oswald in the Oliver Stone tradition, suffused with that great DeLillo style. There's also a kind of meta parallel story of an FBI agent trying to piece together all the evidence, meticulously going through even the tiniest element (much like DeLillo himself). It's quite Pynchonesque with all the criss-crossing conspiracies, the CIA, paranoia, axes of control and influence, a series of coincidences, taking liberty with history...and the ultimately mysterious "fate" that brought Oswald to the assassination. It lacks Pynchon's humor though.

"I don't know what they want me to do." "Of course you know." "Tell me where it happens." "Miami." "That means nothing to me." "You've known for weeks." "What happens in Miami?" Ferrie took a while to finish chewing his food. "Think of two parallel lines," he said. "One is the life of Lee H. Oswald. One is the conspiracy to kill the President. What bridges the space between them? What makes a connection inevitable? There is a third line. It comes out of dreams, visions, intuitions, prayers, out of the deepest levels of the self. It's not generated by cause and effect like the other two lines. It's a line that cuts across causality, cuts across time. It has no history that we can recognize or understand. But it forces a connection. It puts a man on the path of his destiny."

 

Christopher de Hamel, Meetings with Remarkable Manuscripts: Twelve Journeys into the Medieval World

Twelve chapters, each one dedicated to a different medieval manuscript, from the 6th century Gospels of St. Augustine to the 16th century Spinola Book of Hours. The book is filled with fantastic, gorgeous, high-quality prints from these manuscripts, interspersed with history and commentary in a pleasant conversational style. It's not just about the manuscripts themselves, but also who owned them, their condition, how they've been maintained or altered, where they're housed, and the people taking care of them. Cultural differences in library regulatory practices are a virtually infinite source of comedy. Just lovely all around. Make sure you get the hardcover as the paperback is apparently printed in black & white.

     

Confirmation that he was indeed both scribe and artist is found in the shape of the spaces left for the insertion of initials. Both scribes 2 and 3 (let us exclude 1 for the moment) left simple rectangular blank spaces where large initials were to be painted later, without thought to their shape or composition, and they added guidewords in the margins to indicate what letters were to be supplied. When Hugo came to fill them in, his flamboyantly fluid and multi-tentacled initials fitted uncomfortably into these big draughty square apertures. However, during the stint written by the last scribe from folio 185v onwards, the edges of the script are moulded line by line to fit around the curves and limbs of the painted initials, nestling together snugly like a newly married couple in bed. Text and decoration must have been executed simultaneously by the same person. In short, scribe 4 must be Hugo.

 

Ananyo Bhattacharya, The Man From the Future: The Visionary Life of John von Neumann

Short and dense, with a great balance between accessibility and depth. Bhattacharya approaches his subject by focusing on ideas. The first chapter takes care of JvN's early life, and the rest of the book is split up based on the subjects he worked on: mathematics, quantum mechanics, the nuclear bomb, computing, game theory, RAND, and artificial life. Large parts of the book (I'd say about a third) are dedicated not to von Neumann but rather to the work other people did based on his ideas. The game theory chapter, for example, covers Nash, Schelling, Aumann, etc. in economics, and John Maynard Smith, Price, Hamilton, etc. in evolutionary game theory. Bhattacharya is good at making all these technical subjects accessible without dumbing them down too much. The one failing point is that JvN's personality, personal life, and professional relationships don't get much attention.

From 1944, meetings instigated by Norbert Wiener helped to focus von Neumann’s thinking about brains and computers. In gatherings of the short-lived ‘Teleological Society’, and later in the ‘Conferences on Cybernetics’, von Neumann was at the heart of discussions on how the brain or computing machines generate ‘purposive behaviour’. Busy with so many other things, he would whizz in, lecture for an hour or two on the links between information and entropy or circuits for logical reasoning, then whizz off again – leaving the bewildered attendees to discuss the implications of whatever he had said for the rest of the afternoon. Listening to von Neumann talk about the logic of neuro-anatomy, one scientist declared, was like ‘hanging on to the tail of a kite’. Wiener, for his part, had the discomfiting habit of falling asleep during discussions and snoring loudly, only to wake with some pertinent comment demonstrating he had somehow been listening after all.

 

Giorgio Vasari, The Lives of the Most Excellent Painters, Sculptors, and Architects

History by way of biography—Vasari tells a tale of rebirth and artistic progress as Europe emerges from the dark ages, rediscovers the ancients, and then strives to surpass them. Tons of interesting observations on competition, collaboration, the spread of technology, and the psychology of (artistic) greatness. More than 180 lives in over 2000 pages, starting with Cimabue in the 13thC and reaching a climax with Michelangelo in the 16th. Somewhat gossipy and often inaccurate, it nonetheless remains our best source of information on the art and artists of Renaissance Italy. Vasari was a fairly successful painter himself, and his personal acquaintance with both the technique and the business of painting gives us an inside view of the craft. Full review.

It is clear that Leonardo, through his comprehension of art, began many things and never finished one of them, since it seemed to him that the hand was not able to attain to the perfection of art in carrying out the things which he imagined; for the reason that he conceived in idea difficulties so subtle and so marvellous, that they could never be expressed by the hands, be they ever so excellent. And so many were his caprices, that, philosophizing of natural things, he set himself to seek out the properties of herbs, going on even to observe the motions of the heavens, the path of the moon, and the courses of the sun.

 

Arthur Schopenhauer, Essays and Aphorisms

Excerpts from Parerga und Paralipomena. Unexpectedly hilarious; Arthur would've been one hell of a poaster. Surprisingly similar to the pragmatists in many respects. Spans a huge number of topics: ethics, the will, intelligence, animal welfare, religion, suicide, writing, and much more.

Thus we see, for example, the Catholic clergy totally convinced of the truth of all the doctrines of its Church, and the Protestant clergy likewise convinced of the truth of all the doctrines of its Church, and both defending the doctrines of their confession with equal zeal. Yet this conviction depends entirely on the country in which each was born: to the South German priest the truth of the Catholic dogma is perfectly apparent, but to the North German priest it is that of Protestant dogma which is perfectly apparent. If, then, these convictions, and others like them, rest on objective grounds, these grounds must be climatic; such convictions must be like flowers, the one flourishing only here, the other only there.

 

Thucydides, The History of the Peloponnesian War

I'm a Herodotus man through and through, but I can appreciate the Thucydidean perspective as well. Though I'm not entirely sure what that perspective entails: how much of his work is prescriptive and how much of it is descriptive? He's obviously a skeptic when it comes to the supernatural, and there's very little room for morality in his history; is this an artifact of the lack of morality in the way the Athenians went about their affairs, or is this something Thuc projects onto them? In any case, while reading this, one must always keep in mind that the Athenians lost!

It's interesting to read an ancient historian write about battles with 60 hoplites and 20 archers, and that kind of accounting accuracy perfectly captures Thuc's personality.

"... For Athens alone of her contemporaries is found when tested to be greater than her reputation, and alone gives no occasion to her assailants to blush at the antagonist by whom they have been worsted, or to her subjects to question her title to rule by merit. Rather, the admiration of the present and succeeding ages will be ours, since we have not left our power without witness, but have shown it by mighty proofs; and far from needing a Homer for our eulogist, or other of his craft whose verses might charm for the moment only for the impression which they gave to melt at the touch of fact, we have forced every sea and land to be the highway of our daring, and everywhere, whether for evil or for good, have left imperishable monuments behind us. Such is the Athens for which these men, in the assertion of their resolve not to lose her, nobly fought and died; and well may every one of their survivors be ready to suffer in her cause."

 

J. A. Baker, The Peregrine

10 years of obsessive, monomaniacal peregrine-watching in the East of England distilled to 200 pure, intense, astonishing pages. An incredibly rich dish that you can only eat so much of before needing to take a break. Reflects and contains nature both in its form and content. Somewhat reminiscent of Urne-Buriall in that it starts out in a dry, scientific tone and then reaches stylistic extremes later on.

Famously recommended by Werner Herzog (along with Virgil and The Short Happy Life of Francis Macomber), and it is indeed extremely Herzogian. There's no green idealism here, the endless cycle of killing which sustains the peregrine is presented unapologetically. "Beauty is vapour from the pit of death", Baker writes.

 

He hovered, and stayed still, striding on the crumbling columns of air, curved wings jerking and flexing. Five minutes he stayed there, fixed like a barb in the blue flesh of the sky. His body was still and rigid, his head turned from side to side, his tail fanned open and shut, his wings whipped and shuddered like canvas in the lash of the wind. He side-slipped to his left, paused, then glided round and down into what could only be the beginning of a tremendous stoop. There is no mistaking the menace of that first easy drifting fall. Smoothly, at an angle of fifty degrees, he descended; not slowly, but controlling his speed; gracefully, beautifully balanced. There was no abrupt change. The angle of his fall became gradually steeper till there was no angle left, but only a perfect arc. He curved over and slowly revolved, as though for delight, glorying in anticipation of the dive to come. His feet opened and gleamed golden, clutching up towards the sun. He rolled over, and they dulled, and turned towards the ground beneath, and closed again. For a thousand feet he fell, and curved, and slowly turned, and tilted upright. Then his speed increased, and he dropped vertically down. He had another thousand feet to fall, but now he fell sheer, shimmering down through dazzling sunlight, heart-shaped, like a heart in flames. He became smaller and darker, diving down from the sun. The partridge in the snow beneath looked up at the black heart dilating down upon him, and heard a hiss of wings rising to a roar. In ten seconds the hawk was down, and the whole splendid fabric, the arched reredos and immense fan-vaulting of his flight, was consumed and lost in the fiery maelstrom of the sky.

And for the partridge there was the sun suddenly shut out, the foul flailing blackness spreading wings above, the roar ceasing, the blazing knives driving in, the terrible white face descending, hooked and masked and horned and staring-eyed. And then the back-breaking agony beginning, and snow scattering from scuffling feet, and snow filling the bill’s wide silent scream, till the merciful needle of the hawk’s beak notched in the straining neck and jerked the shuddering life away.

And for the hawk, resting now on the soft flaccid bulk of his prey, there was the rip and tear of choking feathers, and hot blood dripping from the hook of his beak, and rage dying slowly to a small hard core within.

And for the watcher, sheltered for centuries from such hunger and such rage, such agony and such fear, there is the memory of that sabring fall from the sky, and the vicarious joy of the guiltless hunter who kills only through his familiar, and wills him to be fed.

The Worst

William Hazlitt, Selected Writings

I despise the style of his political writings. Puffed up, aiming to dazzle rather than illuminate. The cheap rhetoric of the ochlagogue. Actively offensive. The non-political writings are much better: they are merely unreadable and sophomoric. Hazlitt's entire aesthetic philosophy just boils down to "art should imitate nature" repeated over and over again, and I can't stand the way he expresses it.

It is not denied that the people are best acquainted with their own wants, and most attached to their own interests. But then a question is started, as if the persons asking it were at a great loss for the answer,—Where are we to find the intellect of the people? Why, all the intellect that ever was is theirs. The public opinion expresses not only the collective sense of the whole people, but of all ages and nations, of all those minds that have devoted themselves to the love of truth and the good of mankind,—who have bequeathed their instructions, their hopes, and their example to posterity,—who have thought, spoke, written, acted, and suffered in the name and on the behalf of our common nature. All the greatest poets, sages, heroes, are ours originally, and by right.

 

Carlos Ruiz Zafón, The Shadow of the Wind

Just a dull airport novel. The coincidences pile on top of each other as we are treated to interminable exposition dumps from improbable sources that conveniently know everything. Stylistically it tries too hard and achieves nothing.

Destiny is usually just around the corner. Like a thief, a hooker, or a lottery vendor: its three most common personifications. But what destiny does not do is home visits. You have to go for it.

 

Ada Palmer, Too Like the Lightning

Love Palmer's blog but this book just wasn't for me. Even though I read plenty of older books, I found the affected faux-18thC style absolutely grating. The plot mostly seems to be based on the Star Wars prequels, with endless scenes of characters talking about the taxation of trade routes or some other similarly boring nonsense. And there's a magical boy thrown in there for good measure, as well.

I could ask any contemporary here, ‘Are you a majority?’ and I know what he or she would answer: Of course not, Mycroft. I have a Hive, a race, a second language, a vocation and an avocation, hobbies of my own; add up my many strats and you will soon reduce me to a minority of one, and hence my happiness. I am unique, and proud of my uniqueness, and prouder still that, by being no majority, I ensure eternal peace. You lie, reader. There is one majority still entrenched in our commingled world, a great ‘us’ against a smaller ‘them.’ You will see it in time. I shall give only one hint—the deadliest majority is not something most of my contemporaries are, reader, it is something they are not.




Aspects of the Seeker

In Averroës's Search, Borges tells the story of the Islamic philosopher Averroës trying, and failing, to understand Aristotle's writings on theater. Borges sums it up in the afterword:

In the preceding tale, I have tried to narrate the process of failure, the process of defeat. I thought first of that archbishop of Canterbury who set himself the task of proving that God exists; then I thought of the alchemists who sought the philosopher’s stone; then, of the vain trisectors of the angle and squarers of the circle. Then I reflected that a more poetic case than these would be a man who sets himself a goal that is not forbidden to other men, but is forbidden to him. I recalled Averroës, who, bounded within the circle of Islam, could never know the meaning of the words tragedy and comedy.

History and literature offer many cases of ironically failed quests for knowledge.

Some phenomena disappear immediately once someone describes them. Douglas Adams wrote of a theory "which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear". The modern world offers many such anti-inductive cases, above all in the movements of the stock market: successful trading strategies tend to stop working after they become known. On a civilizational scale, Malthusianism became irrelevant right at the time someone was able to articulate the idea, and it seems that the moment we are able to improve ourselves through genetic engineering, we will be wiped out by our artificial creations.

A second type of ill-fated seeker is one who finds what he is looking for, but his goal is also a punishment. William Beckford, categorically rejecting Ulysses' actions at the land of the Sirens (perhaps inspired by his own life, and perhaps commenting on all attempts to comprehend the universe) created the apostate Caliph Vathek whose obsessive quest for knowledge results in his damnation, and for whom Hell is both the object of desire and the punishment for that desire. There are those who argue that the libertine Beckford only adopted this biblical attitude against the Faustian spirit as an ironic orientalist façade, but the Caliph resists all attempts at interpretation.

Some seekers reach their goal, only to have it slip out of their hands. Scientists will occasionally chance on the right idea but lack the ability to prove it: Aristarchus of Samos was doomed by the apparent size of the stars and the lack of parallax. The Royal Navy discovered that lemons prevent scurvy, and then through terrible epistemic luck managed to lose that knowledge over the course of the 19th century: lemons were replaced by limes low in vitamin C, but nobody noticed because the ships were faster. The problem only reappeared when polar explorers started suffering from scurvy despite bringing lime juice with them—and the answer was only discovered by the miraculously good luck of experimenting on guinea pigs, one of very few animals that don't produce vitamin C on their own.

Finally the most ironic case of them all, that of the Dalmatian archbishop and heretic Marco Antonio de Dominis: a seeker who is able to find the answer, but is condemned to believe it is false. De Dominis, a contemporary of Kepler (who wrote in favor of the lunar theory of tides) and Galileo (who mocked it), was also an amateur astronomer and wrote a book on the tides titled Euripus.

The archbishop begins by presenting both empirical and theoretical arguments in favor of the thesis that the earth is a sphere. He then describes the luni-solar theory of tides: he (correctly) writes that tides are caused by the combined gravitational action of the sun and the moon, (correctly) predicts that high tide occurs simultaneously at antipodal points, and (correctly) shows that the cycle of spring and neap tides can be explained by the combined action of the sun and moon. He also (correctly) deduces that the diurnal inequality between tides will be greatest when the moon is above the tropic of Cancer or Capricorn. Finally, de Dominis explains (incorrectly) that since the two daily tides are always equal to each other, the theory must be false. The heretical archbishop died behind the bars of the Castel Sant'Angelo before his book could be published.




Links & What I've Been Reading Q4 2021

Metascience

1. Investigating the replicability of preclinical cancer biology: "50 experiments from 23 papers were repeated, generating data about the replicability of a total of 158 effects [...] for positive effects, the median effect size in the replications was 85% smaller than the median effect size in the original experiments"

2. A catastrophic failure of peer review in obstetrics and gynaecology: "I estimate that across these 46 articles, 346 (64%) of the 542 parametric tests (unpaired t tests, or, occasionally, ANOVA) and 151 (61%) of the 247 contingency table test (Pearson's Χ² or Fisher's exact test) that I was able to check were incorrectly reported."

3. The Business of Extracting Knowledge from Academic Publications: "Close to nothing of what makes science actually work is published as text on the web."

4. A large replication project in marketing, with fairly catastrophic results. Amusingly the abstract doesn't mention the rate of successful replication.

5. Increasing Politicization and Homogeneity in Scientific Funding: An Analysis of NSF Grants, 1990-2020. The methodology is somewhat questionable, but interesting nonetheless.

Covid

6. Scott Alexander on the Ivermectin literature and the trouble with trying to wade through a bunch of questionable papers. Alexandros Marinos responds.

7. Zvi's latest.

You are probably going to get Omicron, if you haven’t had it already. The level of precaution necessary to change this assessment is very high, and you probably don’t want to pay that price.

8. ADS on the Zvi-Holden bet and taking ideas seriously.

Making a blockchain game might genuinely be the best use of Zvi’s time, and he might be acting both rationality and ethically in choosing to pursue it. And so this situation is Good, but only in a very limited and local sense. The tragedy isn’t Zvi’s decision, it’s that a scenario even exists where this is the decision he has to make.

9. Omicron spreading faster than delta because of immune evasion? SARS-CoV-2 Omicron VOC Transmission in Danish Households. Plus twitter thread.

Forecasting

10. Forecasting in the Field: academics and non-experts try to predict the effects of development interventions.

the average correlation between predicted and observed effects is 0.75. Recipient types are less accurate than academics on average, but are at least as accurate for interventions and outcomes that are likely to be more familiar to them. The mean forecast of each group outperforms more than 75% of the comprising individuals, and averaging just five forecasts substantially reduces error, indicating strong “wisdom-of-crowds” effects. Three measures of academic expertise (rank, citations, and conducting research in East Africa) and two measures of confidence do not correlate with accuracy. Among recipient-types, high-accuracy “superforecasters” can be identified using observables. Small groups of these superforecasters are as accurate as academic respondents.

Economic History

11. The United Fruit Company? Good, Actually.

Using administrative census data with census-block geo-references from 1973 to 2011, we implement a geographic regression discontinuity design that exploits a land assignment that is orthogonal to our outcomes of interest. We find that the firm had a positive and persistent effect on living standards. Company documents explain that a key concern at the time was to attract and maintain a sizable workforce, which induced the firm to invest heavily in local amenities that can account for our result.

Book Reviews

12. Reviews of Moby Dick from 1851. "This is an odd book, professing to be a novel; wantonly eccentric; outrageously bombastic; in places charmingly and vividly descriptive." I love it when modern editions of old books include their contemporary reviews, unfortunately it's not done very often.

13. ADS on Stubborn Attachments and Straussian writing.

Crypto

14. Bloomberg report on Tether, including the story of how a French screenwriter ended up owning a Bahamian bank.

15. Vitalik Buterin on Crypto Cities.

16. A Glimpse of the Deep: Finding a Creature in Ethereum's Dark Forest.

This monster was watching Ethereum for an obscure mistake deep in the process of creating a transaction: the reuse of a number while signing a transaction. I went searching for this creature, laid bait, saw it in the wild, and found unexplained tracks. To understand how this bot works, we need to begin by reviewing ECDSA and digital signatures.

The Rest

17. Some answers to my questions about Borges, Browne, and Quevedo: On Borges and Quevedo. "The (sad) irony in Tlon’s ending is, therefore, not in a contrast Quevedo vs Browne, then, but in the contrast (Borges + Quevedo + Browne) vs Tlon. Or, maybe, grecolatin tradition versus modernity. With a tinge of sad resignation for the slow but unstoppable victory of the second over the first."

18. And here's a very interesting essay (in Spanish) on Borges's "francophobia".

19. SMTM wrap up the Chemical Hunger series on the causes of obesity after 20 posts.

20. On the NIH and the challenges of funding alcohol consumption RCTs. The big alcohol study that didn't happen: My primal scream of rage.

21. RCT of health insurance in India finds few positive effects: Effect of Health Insurance in India: A Randomized Controlled Trial.

22. "Many young females report joining Draco Malfoy as his girlfriend."

23. An interesting ACX comment on reversals in artistic "progress".

it's a pattern that has repeated throughout history and around the world, one of naturalist art executed with great skill being deliberately replaced with highly abstract art not requiring as much skill.

The cave paintings of Chauvet Cave in France ca 30,000 BP (before present) are more natural and technically much more sophisticated than any cave or rock paintings found after 20,000 BP (some of which are quite abstract and stylized).

Reminds me of this paper on bursts of technological development 60-80kya that lasted for a few thousand years and then disappeared. Related, a great new article on the Antikythera mechanism.

24. The Browser interview with QNTM.

25. Nemets on the genetic history of the ancient Greeks and the identity of the Sea Peoples.

26. Razib Khan: Out of Africa's midlife crisis

two San from different groups both living in Namibia’s Northern Kalahari desert, and speaking click languages from the same family, are more genetically distinct from one another, by a solid 20%, than a person from Stockholm is from a person from Shanghai.

27. Don't take psychedelics. "Results revealed significant shifts away from ‘physicalist’ or ‘materialist’ views, and towards panpsychism and fatalism, post use."

28. Blind people have a pretty good understanding of color.

Audio-Visual

29. Interface | Part II, cool animation project.

30. A project that made 999 forgeries of a Warhol drawing, then randomly mixed in the original, and sold them.

31. How to Build a Supersonic Trebuchet.

32. And here's a cool remix of Hugh Masekela's Stimela.

What I've Been Reading

Non-Fiction

  • The Man from the Future: The Visionary Life of John von Neumann by Ananyo Bhattacharya. Bhattacharya approaches his subject by focusing on ideas. The first chapter takes care of JvN's early life, and the rest of the book is split up based on the subjects he worked on: mathematics, quantum mechanics, the nuclear bomb, computing, game theory, RAND, and artificial life. Large parts of the book (I'd say about a third) are dedicated not to von Neumann but rather the work other people did based on his ideas. The game theory chapter, for example, covers Nash, Schelling, Aumann, etc. in economics, and John Maynard Smith, Price, Hamilton, etc. in evolutionary game theory. Bhattacharya is good at making all these technical subjects accessible without dumbing them down too much. JvN's personality, personal life, professional relationships, etc. on the other hand are given scant attention.

    Overall it felt a bit too short. In less than 300 pages we get such a wide array of ideas, and the story of how they influenced so many people, that it often feels like we're just skimming the surface in a speedboat. I'd like to take a deeper, more ponderous ride in a submarine some day.

  • Meetings with Remarkable Manuscripts by Christopher de Hamel. Fantastically gorgeous book, filled with high-quality prints of medieval manuscripts. Pleasant conversational style. Just lovely all around. Not just about the manuscripts themselves, but also who owned them, their condition, where they're housed, the librarians taking care of them, etc.

  • The Rings of Saturn by W. G. Sebald. A book of digressions. The frame is a walking tour of England, and on it are bolted various musings on Sir Thomas Browne, Joseph Conrad, silk manufacture, the Taiping rebellion, and so on. The subjects flow into each other so you don't know where one digression begins and the other ends. However, Sebald kind of undersells how interesting his subjects are; comparing his notes on FitzGerald to the famous Borges essay, for example, makes me wonder how Sebald managed to turn such a fascinating subject into such a dull essay.

  • Conquistador: Hernán Cortés, King Montezuma, and the Last Stand of the Aztecs by Buddy Levy. I didn't love the book (it felt a bit sloppy, and the style isn't great), but Cortés is an incredible character. The determination, the ingenuity, the absolute ruthlessness. When he murders his wife at the end of the book, all you can think is "well of course he did". And self-aware too: "I and my companions suffer from a disease of the heart that can be cured only with gold"! Perhaps it is the contrast against the Aztecs that, in a way, softens his image? Going to try Prescott's History of the Conquest of Mexico next.

  • Over the Edge of the World: Magellan's Terrifying Circumnavigation of the Globe by Laurence Bergreen. Solid narrative pop history. Feels a bit rushed after the point of Magellan's death. Exciting, adventurous stuff as you'd expect from the age of exploration.

  • A Man on the Moon: The Voyages of the Apollo Astronauts by Andrew Chaikin. Covers the entire thing plus a ton of backstory, very thorough (within its scope). Focused on the astronauts, and much of it is the product of interviews with those astronauts, which is kind of obvious at many points as you're only getting one person's perspective on certain events. It would have been better with a broader, more objective view, in my opinion. The latter parts (after the first moon landing) include a surprising amount of geology! I read three books on the early space program this year and none of them was completely satisfying; I'm still trying to find the Richard Rhodes of Apollo...




How I Made $10k Predicting Which Studies Will Replicate

Starting in August 2019 I took part in the Replication Markets project, a part of DARPA's SCORE program whose goal is to predict which social science papers will successfully replicate. I have previously written about my views on the replication crisis after reading 2500+ papers; in this post I will explain the details of forecasting, trading, and optimizing my strategy within the rules of the game.

The Setup

3000 papers were split up into 10 rounds of ~300 papers each. Every round began with one week of surveys, followed by two weeks of market trading, and then a one week break. The studies were sourced from all social science disciplines (economics, psychology, sociology, management, etc.) and were published between 2009 and 2018 (in other words, most of the sample came from the post-replication crisis era).

Only a subset of the papers will be replicated: ~100 papers were selected for a full replication, and another ~150 for a "data replication" in which the same methodology is applied to a different (but pre-existing) dataset.1 Out of the target 250 replications, only about 100 were completed by the time the prizes were paid out.

Surveys

The surveys included a link to the paper, a brief summary of the claim selected for replication, the methodology, and a few statistical values (sample size, effect size, test statistic values, p-value). We then had to answer three questions:

  1. What is the probability of the paper replicating?
  2. What proportion of other forecasters do you think will answer >50% to the first question?
  3. How plausible is the claim in general?

The papers were split up into batches of 10, and the top 4 scorers in each batch won awards of $80, $40, $20, and $20 for a total of $4,800 per survey round.

The exact scoring method was not revealed in order to prevent gaming the system, but after the competition ended the organizers wrote a technical blog post explaining the "surrogate scoring rule" approach. Since the replications were not completed yet, scoring predictions had to be done without reference to the "ground truth"; instead they generated a "surrogate outcome" based on all the survey answers and used that to score the predictions.2

Markets

Every user started each round with 1 point per claim (so typically 300).3 These points were the currency used to buy "shares" for every claim. Long share positions pay out if the paper replicates successfully and short positions pay out if it does not. Like a normal stock market, if you bought shares at a low price and the price went up, you could sell those shares for a profit.

The starting price of each claim was based on its p-value:

  • p<.05: 30%
  • p<.01: 40%
  • p<.001: 80%

The market did not operate like a typical stock market (ie a continuous double auction); instead, they used Robin Hanson's Logarithmic Market Scoring Rule which allows users to trade without a counterparty.4 Effectively it works as an automated market maker, making it costlier to trade the more extreme the price: taking a claim from 50% to 51% was cheap, while taking it from 98% to 99% was very expensive. Without any order book depth, prices could be rather volatile as it didn't take much for a single person to significantly shift the price on a claim; this also created profitable trading opportunities.
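To make the cost asymmetry concrete, here is a minimal Python sketch of a two-outcome LMSR market maker. The liquidity parameter `b` and all function names are assumptions for illustration; the value Replication Markets actually used is not given in this post.

```python
import math

def lmsr_cost(q_yes: float, q_no: float, b: float) -> float:
    """LMSR cost function C(q) = b * ln(exp(q_yes/b) + exp(q_no/b))."""
    m = max(q_yes, q_no) / b  # subtract the max for numerical stability
    return b * (m + math.log(math.exp(q_yes / b - m) + math.exp(q_no / b - m)))

def lmsr_price(q_yes: float, q_no: float, b: float) -> float:
    """Current price (implied probability) of the 'yes' outcome."""
    return 1.0 / (1.0 + math.exp((q_no - q_yes) / b))

def cost_to_move(p_from: float, p_to: float, b: float) -> float:
    """Points needed to push the 'yes' price from p_from to p_to by buying 'yes' shares."""
    def logit(p: float) -> float:
        return math.log(p / (1.0 - p))
    # With q_no held at 0, a price p corresponds to q_yes = b * logit(p).
    q0, q1 = b * logit(p_from), b * logit(p_to)
    return lmsr_cost(q1, 0.0, b) - lmsr_cost(q0, 0.0, b)

b = 10.0  # hypothetical liquidity parameter, not the real RM value
cheap = cost_to_move(0.50, 0.51, b)      # one point of movement near the middle
expensive = cost_to_move(0.98, 0.99, b)  # one point of movement near the extreme
```

With this made-up `b`, the move from 50% to 51% costs roughly 0.2 points while the move from 98% to 99% costs roughly 6.9. The ratio between the two does not depend on `b`.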

The payout for the markets was about $14k per round, awarded in proportion to winning shares in the papers selected for replication. Given the target of 250 replications, that means about 8% of the claims would actually resolve. The small number of actually completed replications, however, caused some issues: round 9, for example, only had 2 (out of the target 25) replications actually pay out.

Early Steps - A Simple Model

I didn't take the first round very seriously, and I had a horrible flu during the second round, so I only really started playing in round 3. I remembered Tetlock writing that "it is impossible to find any domain in which humans clearly outperformed crude extrapolation algorithms, less still sophisticated statistical ones", so I decided to start with a statistical model to help me out.

This felt like a perfect occasion for a centaur approach (combining human judgment with a model), as there was plenty of quantitative data, but also lots of qualitative factors that are hard to model. For example, some papers with high p-values were nevertheless obviously going to replicate, due to how plausible the hypothesis was a priori.5

Luckily someone had already collected the relevant data and built a model.6 Altmejd et al. (2019) combine results from four different replication projects covering 131 replications (which they helpfully posted on OSF). Here are the features they used ranked by importance:

Their approach was fairly complex, however, and I wanted something simpler. On top of that I wanted to limit the number of variables I would have to collect for every paper, as I had to do 300 of them in a week—any factors that would be cumbersome to look up (eg the job title of each author) were discarded. I also transformed a bunch of the variables, for example replacing raw citation counts with log citations per year.

I ended up going with a logistic ridge regression (shrinkage tends to help with out-of-sample predictions). The Altmejd sample was limited in terms of the fields covered (they only had social/cognitive/econ), so I just pulled some parameter values out of my ass for the other fields—in retrospect they were not very good guesses.7

library(glmnet)

# Ridge (alpha = 0) logistic regression, with lambda chosen by cross-validation
cv.ridge <- cv.glmnet(as.matrix(mydata), y_class, alpha = 0, family = "binomial")

coef(cv.ridge, cv.ridge$lambda.min)
| Parameter | Value |
| --- | --- |
| intercept | 0.40 |
| log # of pages | -0.26 |
| p value | -25.07 |
| log # of authors | -0.67 |
| % male authors | 0.90 |
| dummy for interaction effects | -0.77 |
| log citations per year | 0.37 |
| discipline: economics | 0.27 |
| discipline: social psychology | -0.77 |
| discipline: education | -0.40 |
| discipline: political science | 0.10 |
| discipline: sociology | -0.40 |
| discipline: marketing | 0.10 |
| discipline: orgbeh | 0.1 |
| discipline: criminology | -0.2 |
| discipline: other psychology | -0.2 |

This model was then implemented in a spreadsheet, so all I had to do was enter the data, and the prediction popped up:

=exp(Intercept+
F18*pval+
IF(N18="interaction",1,0)*interaction+
male*P18+
logauth*ln(O18)+
loglen*ln(L18)+
if(D18="Social",1,0)*social+
if(D18="Economics",1,0)*econ+
if(D18="PoliSci",1,0)*polisci+
if(D18="Education",1,0)*educ+
if(D18="Sociology",1,0)*sociology+
if(D18="Marketing",1,0)*marketing+
if(D18="OrgBeh",1,0)*OrgBeh+
if(D18="Criminology",1,0)*criminology+
if(D18="Other Psychology",1,0)*otherpsych+
ln(M18/(2019-E18))*logcitesperyear)
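
For readers who prefer code to spreadsheets, here is a rough Python equivalent of the same calculation, using the coefficients listed earlier. A few things are assumptions on my part: the function and argument names are made up, "% male authors" is treated as a share between 0 and 1, and a discipline without a listed coefficient contributes zero. Note also that the formula as printed wraps the linear predictor in exp(), which under a logistic model gives odds rather than a probability; this sketch applies the logistic function instead. Whether the original sheet converted the odds in another cell is not shown.

```python
import math

# Coefficients copied from the parameter list earlier in the post.
COEF = {
    "intercept": 0.40,
    "log_pages": -0.26,
    "p_value": -25.07,
    "log_authors": -0.67,
    "share_male": 0.90,
    "interaction": -0.77,
    "log_cites_per_year": 0.37,
}
DISCIPLINE = {
    "Economics": 0.27, "Social": -0.77, "Education": -0.40,
    "PoliSci": 0.10, "Sociology": -0.40, "Marketing": 0.10,
    "OrgBeh": 0.1, "Criminology": -0.2, "Other Psychology": -0.2,
}

def replication_probability(p_value, pages, authors, share_male,
                            is_interaction, citations, pub_year,
                            discipline, current_year=2019):
    """Linear predictor of the logistic model, mapped to a probability.

    Like the spreadsheet, this fails for papers with zero citations
    or with pub_year equal to current_year.
    """
    cites_per_year = citations / (current_year - pub_year)
    eta = (COEF["intercept"]
           + COEF["p_value"] * p_value
           + COEF["interaction"] * (1 if is_interaction else 0)
           + COEF["share_male"] * share_male
           + COEF["log_authors"] * math.log(authors)
           + COEF["log_pages"] * math.log(pages)
           + DISCIPLINE.get(discipline, 0.0)  # assumption: unlisted field = 0
           + COEF["log_cites_per_year"] * math.log(cites_per_year))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical paper, not taken from the Replication Markets data.
example = replication_probability(
    p_value=0.001, pages=10, authors=3, share_male=0.5,
    is_interaction=False, citations=50, pub_year=2014,
    discipline="Economics")
```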

While my model had significant coefficients on # of authors, ratio male, and # of pages, these variables were not predictive of market prices in RM. Even the relation of citations to market prices was very weak. I think the market simply ignored any data it was not given directly, even if it was important. This gave me a bit of an edge, but also made evaluating the performance of the model more difficult as the market was systematically wrong in some ways.

Collecting the additional data needed for the model was fairly cumbersome: completing the surveys took ~140 seconds per paper when I was just doing it in my head, and ~210 seconds with the extra work of data entry. It also made the process significantly more boring.

Predictions

I will give a quick overview of the forecasting approach here; a full analysis will come in a future post, including a great new dataset I'm preparing that covers the methodology of replicated papers.

At the broadest level it comes down to: the prior, the probability of a false negative, and the probability of a false positive.8 One must consider these factors for both the original and the replication.9
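
One way to combine these three ingredients is plain Bayes' rule, sketched below in Python. This is an idealized illustration rather than the procedure I actually followed: the function names are made up, a replication is assumed to "succeed" whenever it comes out significant, and p-hacking (which raises the effective false positive rate of the original) is ignored.

```python
def prob_effect_is_real(prior: float, power: float, alpha: float) -> float:
    """P(effect is real | original study was significant), by Bayes' rule."""
    true_pos = prior * power            # real effect, detected
    false_pos = (1.0 - prior) * alpha   # no effect, significant by chance
    return true_pos / (true_pos + false_pos)

def prob_replication(prior: float, orig_power: float, orig_alpha: float,
                     rep_power: float, rep_alpha: float) -> float:
    """P(replication is significant | original study was significant)."""
    real = prob_effect_is_real(prior, orig_power, orig_alpha)
    # A real effect replicates with probability rep_power (1 minus the
    # false negative rate); a null effect "replicates" with probability rep_alpha.
    return real * rep_power + (1.0 - real) * rep_alpha

# Illustrative inputs only: a 30% prior, a 50%-power original,
# and an 80%-power replication, all tested at the 5% level.
posterior = prob_effect_is_real(prior=0.3, power=0.5, alpha=0.05)
replicates = prob_replication(0.3, 0.5, 0.05, 0.8, 0.05)
```

The sketch makes it easy to see how much the answer moves with the prior: holding everything else fixed, a more plausible claim has a higher chance of being real, and therefore of replicating.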

What does that look like in practice? I started by reading the summary of the study on the RM website (which included the abstract, a description of the selected claim, sample size, p-value, and effect size). After that I skimmed the paper itself. If I didn't understand the methodology I checked the methods and/or conclusions, but the vast majority of papers were just straight regressions, ANOVAs, or SEMs. The most important information was almost always in the table with the main statistical results.

The factors I took into account, in rough order of importance:

  • p-value. Try to find the actual p-value, they are often not reported. Many papers will just give stars for <.05 and <.01, but sometimes <.01 means 0.0000001! There's a shocking number of papers that only report coefficients and asterisks—no SEs, no CIs, no t-stats.
  • Power. Ideally you'll do a proper power analysis, but I just eyeballed it.
  • Plausibility. This is the most subjective part of the judgment and it can make an enormous difference. Some broad guidelines:
    • People respond to incentives.
    • Good things tend to be correlated with good things and negatively correlated with bad things.
    • Subtle interventions do not have huge effects.
  • Pre-registration. Huge plus. Ideally you want to check if the plan was actually followed.
  • Interaction effect. They tend to be especially underpowered.
  • Other research on the same/similar questions, tests, scales, methodologies—this can be difficult for non-specialists, but the track record of a theory or methodology is important. Beware publication bias.
  • Methodology - RCT/RDD/DID good. IV depends, many are crap. Various natural-/quasi-experiments: some good, some bad (often hard to replicate). Lab experiments, neutral. Approaches that don't deal with causal identification depend heavily on prior plausibility.
  • Robustness checks: how does the claim hold up across specifications, samples, experiments, etc.
  • Signs of a fishing expedition/researcher degrees of freedom. If you see a gazillion potential outcome variables and that they picked the one that happened to have p<0.05, that's what we in the business call a "red flag". Look out for stuff like ad hoc quadratic terms.
  • Suspiciously transformed variables. Continuous variables put into arbitrary bins are a classic p-hacking technique.
  • General propensity for error/inconsistency in measurements. Fluffy variables or experiments involving wrangling 9 month old babies, for example.
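
The first two items on this list can be roughed out in a few lines of Python. The sketch below uses a normal approximation (reasonable for large samples, less so for small ones), and the function names are mine. It recovers a two-sided p-value from a coefficient and its standard error when a paper reports those but only gives stars, and it gives a crude power figure for a replication whose expected z-statistic you are willing to guess.

```python
import math

Z_CRIT_05 = 1.959964  # two-sided 5% critical value of the standard normal

def z_from_estimate(coef: float, std_err: float) -> float:
    """z-statistic implied by a coefficient and its standard error."""
    return coef / std_err

def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a z-statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def approx_power(expected_z: float, z_crit: float = Z_CRIT_05) -> float:
    """Chance that a two-sided test is significant when the true effect
    corresponds to an expected z-statistic of expected_z."""
    return normal_cdf(expected_z - z_crit) + normal_cdf(-expected_z - z_crit)

# Two results that would both be starred as "<.01" but are very different.
p_barely = two_sided_p_from_z(2.6)
p_tiny = two_sided_p_from_z(5.5)
```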

Things that don't matter for replication but matter very much in the real world:

  • Causal identification! The plausibility of a paper's causal identification strategy is generally orthogonal to its chances of replicating.
  • Generalizability. Lab experiments are replicated in other labs.

Some papers were completely outside my understanding, and I didn't spend any time trying to understand them. Jargon-heavy cognitive science papers often fell into this category. I just gave a forecast close to the default and marked them as "low confidence" in my notes, then avoided trading them during the market round. On the other hand, sometimes I got the feeling that the jargon was just there to cover up bullshit (leadership studies, I'm looking at you) in which case I docked points for stuff I didn't understand. The epistemological problem of how to determine which jargon is legit and which is not, is left as an exercise to the reader.

For Example

The data from Replication Markets are still embargoed, so I can't give you any real examples. Instead, I have selected a couple of papers that were not part of the project but are similar enough.

Ex. 1: Criminology

My first example is a criminology paper which purports to investigate the effect of parenting styles on criminal offending. Despite using causal language throughout, the paper has no causal identification strategy whatsoever. If criminologists had better GRE scores this nonsense would never have been published. The most relevant bits of the abstract:

The present study used path analyses and prospective, longitudinal data from a sample of 318 African American men to examine the effects of eight parenting styles on adult crime. Furthermore, we investigated the extent to which significant parenting effects are mediated by criminogenic schemas, negative emotions, peer affiliations, adult transitions, and involvement with the criminal justice system. Consonant with the study hypotheses, the results indicated that [...] parenting styles low on demandingness but high on responsiveness or corporal punishment were associated with a robust increase in risk for adult crime.

The selected claim is the effect of abusive parenting (the "abusive" parenting style involves "high corporal punishment" but low "demandingness" and "responsiveness") on offending; I have highlighted the outcome in the main regression table below. While the asterisks only say p<.01, the text below indicates that the p-value is actually <.001.

Make your own guess about the probability of replication and then scroll down to mine below.

I'd give this claim 78%. The results are obviously confounded, but they're confounded in a way that is fairly intuitive, and we would expect the replication to be confounded in the exact same way. Abusive parents are clearly more likely to have kids who become criminals. Although they don't give us the exact t-stat, the p-value is very low. On the negative side the sample size (318 people spread over 8 different parenting styles) isn't that big, I'm a bit worried about variance in the classification of parenting styles, and there's a chance that the (non-causal) relation between abusive parenting and offending could be lost in the controls.

This is a classic example of "just because it replicates doesn't mean it's good", and also a prime example of why the entire field of criminology should be scrapped.

Ex. 2: Environmental Psychology

My second example is an "environmental psychology" paper about collective guilt and how people act in response to global warming.

The present research examines whether collective guilt for an ingroup’s collective greenhouse gas emissions mediates the effects of beliefs about the causes and effects of global warming on willingness to engage in mitigation behavior.

N=72 people responded to a survey after a manipulation, on a) the causes and b) the importance of the effects of climate change. The selected claim is that "participants in the human cause-minor effect condition reported more collective guilt than did participants in the other three conditions (b* = .50, p <.05)". Again, make your own guess before scrolling down.

I'd go with 23% on this one. Large p-value, interaction effect, relatively small sample, and a result that does not seem all that plausible a priori. The lack of significance on the Cause/Effect parameters alone is also suspicious, as is the lack of significance on mitigation intentions. Lots of opportunities to find some significant effect here!

Spreadsheets

The worst part of Replication Markets was the user interface: it did not offer any way to keep track of one's survey answers, so in order to effectively navigate the market rounds I had to manually keep track of all the predictions. There was also no way to track changes in the value of one's shares, so again that had to be done manually in order to exit successful trades and find new opportunities. The initial solution was giant spreadsheets:

Since the initial prices were set depending on the claim's p-value, I knew ahead of time which claims would be most mispriced at the start of trading (and that's where the greatest opportunities were). So a second spreadsheet was used to track the best initial trades.11 The final column tracks how those trades worked out by the end of the market round; as you can see not all of them were successful (including some significant "overshoots"), but in general I had a good hit rate. There were also far more "longs" than "shorts" at the start: these were mostly results that were highly plausible a priori but had failed to get a p-value below 0.001.

["Final" is my estimate, "default" is the starting price, "mkt" is the final market price]

Finally, a third spreadsheet was used to track live trading during the market rounds. There was no clean way of getting the prices from the RM website to my sheet, so I copy/pasted everything, parsed it, and then inserted the values into the sheet. I usually did that a few times per day (more often at the start, since that was where most trading activity was concentrated). The claims were then ranked by the difference between my own estimate and the market. My current share positions were listed next to them so I knew what I needed to trade. The "Change" column listed the change in price since the last update, so I could easily spot big changes (which usually meant new trading opportunities).

["Live" is the current market price, "My" is my estimate, "Shares" is the current position]
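
The logic of that live-trading sheet is simple enough to sketch in Python. Everything here (the `Claim` fields, the function name, the sample numbers) is hypothetical and only mirrors the columns described above; the real thing was a spreadsheet fed by copy/paste.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    claim_id: str
    my_estimate: float   # my forecast of replication, between 0 and 1
    live_price: float    # current market price, between 0 and 1
    prev_price: float    # market price at the previous update
    shares: float = 0.0  # current position (positive = long, negative = short)

def rank_by_gap(claims):
    """Sort claims by |my estimate - market price|, largest gap first."""
    rows = []
    for c in claims:
        gap = c.my_estimate - c.live_price
        rows.append({
            "claim": c.claim_id,
            "gap": gap,
            "side": "long" if gap > 0 else "short",
            "change": c.live_price - c.prev_price,  # the "Change" column
            "shares": c.shares,
        })
    return sorted(rows, key=lambda r: abs(r["gap"]), reverse=True)

# Made-up claims for illustration.
ranked = rank_by_gap([
    Claim("A", my_estimate=0.80, live_price=0.45, prev_price=0.40),
    Claim("B", my_estimate=0.30, live_price=0.35, prev_price=0.35, shares=5),
    Claim("C", my_estimate=0.20, live_price=0.60, prev_price=0.30),
])
```

Here claim C sorts first (largest gap, and a big price jump since the last update), followed by A and then B.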

Forget the Model!

After the third round I took a look at the data to evaluate the model and there were two main problems:

  • My own errors (prediction minus market price) were very similar to the errors of the model:
  • The model failed badly at high-probability claims, and failed to improve overall performance. Here's the root mean square error vs market prices, grouped by p-value:

Of course what the model was actually trying to predict was replication, not the market price. But market prices were the only guide I had to go by (we didn't even get feedback on survey performance), and I believed the market was right and the model was wrong when it came to low-p-value claims.

What would happen if everyone tried to optimize for predicting market prices? I imagine we could have gotten into weird feedback loops, causing serious disconnects between market prices and actual replication probability. In practice I don't think that was an issue though.

If I had kept going with the model, I had some improvements in mind:

  • Add some sort of non-linear p-value term (or go with z-scores instead).
  • Quantify my subjective judgment of "plausibility" and add it as another variable in the model.
  • Use the round 3 market data of 300 papers (possibly with extremized prices) to estimate a new model, which would more than triple my N from the original 131 papers. But I wasn't sure how to combine categorical data from the previous replications and probabilities from the prices in a single model.12

At this point it didn't seem worth the effort, especially given all the extra data collection work involved. So, from round 4 onward I abandoned the model completely and relied only on my own guesses.

Playing the Game

Two basic facts dictated the trading strategy:

  1. Only a small % of claims will actually be replicated and pay out.
  2. Most claims are approximately correctly priced.

It follows that smart traders make many trades, move the price by a small amount (the larger your trade the larger the price impact), and have a diversified portfolio. The inverse of this rule can be used to identify bad traders: anyone moving the price by a huge amount and concentrating their portfolio in a small number of bets is almost certainly a bad trader, and one can profitably fade their trades.

Another source of profitable trades was the start of the round. Many claims were highly mispriced, but making a profit depended on getting to them first, which was not always easy since everyone more or less wanted to make the same trades. Beyond that, I focused on simply allocating most of my points toward the most-mispriced claims.

I split the trading rounds into two phases:

  1. Trading based on the expected price movement.

  2. At the very end of the round, trading based on my actual estimate of replication probability.

Usually these two aspects would coincide, but there were certain types of claims that I believed were systematically mispriced by other market participants.13 Trading those in the hope of making profits during the market round didn't work out, so I only allocated points toward them at the end.

Another factor to take into consideration was that not all claims were equally likely to be selected for replication. In some cases it was pretty obvious that a paper would be difficult or impossible to replicate directly. I was happy to trade them, but by the end of the round I excluded them from the portfolio.14

Buying the most mispriced items also means you're stuck with a somewhat contrarian portfolio, which can be dangerous if you're wrong. Given the flat payout structure of the market, following the herd was not necessarily a bad idea. Sometimes if a claim traded strongly against my own forecast, I would lower the weight assigned to it or even avoid it completely. Suppose you think a study has a 30% chance of replicating, and a liquid market insists it has a 70% chance—how do you revise your forecast?

Reacting to Feedback

After every round I generated a bunch of graphs that were designed to help me understand the market and improve my own forecasts. This was complicated by the fact that there were no replication results—all I had to go by were the market prices, and they could be misleading.

Among other things, I compared means, standard deviations, and quartiles of my own predictions vs the market; looked at my means and RMSE grouped by p-value and discipline; plotted the distribution of forecasts, and error vs market price; etc.

One standard pattern of prediction markets is that extremizing the market prediction makes it better. Simplistically, you can think of the market price being determined by informed traders and uninformed/noise traders. The latter pull the price toward the middle, so the best prediction is going to be (on average) more extreme than the market's. This is made worse in the case of Replication Markets because of the LMSR algorithm which makes shares much more expensive the closer you get to 0 or 100%. So you can often improve on things by just extremizing the market forecast, and I always checked to see if my predictions were on the extremizing side vs the market.
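One common way to extremize a forecast is to scale its log-odds. This is a minimal sketch of that idea, not necessarily the transformation I used. The exponent alpha = 1.5 is an arbitrary illustrative choice, and "extremizing side" is taken here to mean further from 50% in the same direction as the market.

```python
import math

def extremize(p: float, alpha: float = 1.5) -> float:
    """Push probability p away from 0.5 by scaling its log-odds by alpha (> 1)."""
    p = min(max(p, 1e-9), 1 - 1e-9)  # avoid log(0) at the boundaries
    log_odds = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-alpha * log_odds))

def on_extremizing_side(mine: float, market: float) -> bool:
    """True if my forecast is on the same side of 0.5 as the market, but more extreme."""
    same_side = (mine - 0.5) * (market - 0.5) > 0
    return same_side and abs(mine - 0.5) > abs(market - 0.5)

if __name__ == "__main__":
    print(round(extremize(0.70), 3))        # market says 70%, extremized is higher
    print(on_extremizing_side(0.85, 0.70))  # my 85% is more extreme than the market
    print(on_extremizing_side(0.60, 0.70))  # my 60% is less extreme
```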

Here you can see the density plots of my own vs the market forecasts, split up by p-value category. (The vertical line is the default starting price for each group.)

And here's the same data in scatterplot form:

My predictions vs the market.

Difference between my forecasts and the market, by discipline. The market was more confident in results from economics, at least in round 3.

Over time my own predictions converged with the market. I'm not entirely sure how to interpret this trend. Perhaps I was influenced by the market and subtly changed my predictions based on what I saw. Did that make me more accurate or less? It's unclear, and based on the limited number of actual replication results it's impossible to tell. Another possibility is that the changing composition of forecasters over time made the market more similar to me?

Automated Trading

I think a lot of my success was due to putting in more effort than others were willing to. And by "putting in effort" I mean automating it so I don't have to put in any effort. In round 6 the trading API was introduced; at that point I dropped the spreadsheets and quickly threw together a desktop application (using C# & WPF) that utilized the API and included both automated and manual trading.15 Automating things also made more frequent data updates possible: instead of copy-pasting a giant webpage a few times a day, now everything updated automatically once every 15 minutes.

The main area on the left is the current state of the market and my portfolio, with papers sorted by how mispriced they are. Mkt is the current market price, My is my forecast, Position is the number of shares owned, Liq. Value is the number of points I could get by exiting this position, WF is a weight factor for the portfolio optimization, and Hist shows the price history of that claim.

On the right we have pending orders, a list of the latest orders executed on the market, plus logging on the bottom.

I used a simple weighting algorithm with a few heuristics sprinkled on top. Below you can see the settings for the weighting, plus a graph of the portfolio weights allocated by claim (the most-mispriced claims are on the left).

To start with I simply generated weights proportional to the square of the difference between the current market price and my target price (Exponent). Then,

  • multiplied that by a per-study weight factor (WF in the main screen),
  • multiplied that by ExtremeValueMultiplier for claims with extreme prices (<8% or >96%),
  • removed any claims with a difference smaller than the CutOff,
  • removed any claims with weight below MinThreshold,
  • limited the maximum weight to MaxPosition,
  • and disallowed any trading for claims that were already close to their target weight (NoWeightChangeBandwidth).
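The steps above can be sketched roughly as follows. This is a reconstruction in Python, not the original C# code. The setting names mirror the ones in the text, but every default value, the normalization step, and the choice to test the market price (rather than my target) for extremeness are assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Claim:
    market: float                # current market price, 0..1
    mine: float                  # my forecast / target price, 0..1
    weight_factor: float = 1.0   # per-study WF
    current_weight: float = 0.0  # share of the portfolio currently held

@dataclass
class Settings:                  # all defaults are illustrative assumptions
    exponent: float = 2.0
    extreme_value_multiplier: float = 1.5
    cut_off: float = 0.03
    min_threshold: float = 0.01
    max_position: float = 0.10
    no_weight_change_bandwidth: float = 0.01

def target_weights(claims: List[Claim], s: Settings) -> List[float]:
    raw = []
    for c in claims:
        diff = abs(c.mine - c.market)
        if diff < s.cut_off:                      # too close to be worth trading
            raw.append(0.0)
            continue
        w = diff ** s.exponent * c.weight_factor  # mispricing^Exponent times WF
        if c.market < 0.08 or c.market > 0.96:    # extreme prices
            w *= s.extreme_value_multiplier
        raw.append(w)
    total = sum(raw)
    if total == 0:
        return [0.0] * len(claims)
    weights = [w / total for w in raw]            # normalize (assumption)
    # Drop tiny weights, cap large ones; anything left over stays as free points.
    return [0.0 if w < s.min_threshold else min(w, s.max_position) for w in weights]

def tradable(claims: List[Claim], weights: List[float], s: Settings) -> List[bool]:
    """False for claims already within the no-trade band around their target weight."""
    return [abs(w - c.current_weight) > s.no_weight_change_bandwidth
            for c, w in zip(claims, weights)]
```

Because the cap is applied after normalizing, the weights need not sum to 1; in this sketch the remainder is simply left unallocated.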

There was also another factor to take into consideration: the RM organizers ran some bots of their own. One simply traded randomly, while the other systematically moved prices back toward their default values. This created a predictable price pressure which had to be taken into account and potentially exploited: the DefDiffPenalizationFactor lowered the weight of claims that were expected to have adverse movements due to the bots.
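A rough sketch of how such a penalty might work, assuming the simplest possible rule: if the mean-reverting bot is expected to push the price away from my target, scale the weight down by a constant. The binary adverse/not-adverse test and the factor of 0.5 are assumptions for illustration; the actual DefDiffPenalizationFactor logic may have differed.

```python
def bot_penalized_weight(weight: float, market: float, mine: float,
                         default: float, penalization_factor: float = 0.5) -> float:
    """Lower the weight when the mean-reverting bot is expected to move against me.

    market:  current price
    mine:    my target price
    default: the starting price the bot drags the market back toward
    """
    wants_higher = mine > market        # I would buy and profit if the price rises
    bot_pushes_down = default < market  # bot pulls the price toward the default
    bot_pushes_up = default > market
    adverse = (wants_higher and bot_pushes_down) or (not wants_higher and bot_pushes_up)
    return weight * penalization_factor if adverse else weight
```

For example, with a default of 50%, a market price of 60% and a target of 80%, the bot's selling works against a long position, so the weight is halved. With the market at 40% instead, the bot helps and the weight is unchanged.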

Finally, turning the weighting algorithm into trades was fairly simple. Most trades on the market did not warrant a reaction, so I had a semi-automated system for bringing the portfolio in line with the generated weights, which involved hitting a button to generate the orders and then firing them off. Fading large price movements, on the other hand, was automated, and I kept a certain amount of free points available so that I could take advantage of them quickly. If the free points fell below a threshold, the bot would automatically sell some shares.
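As an illustration of the "generate the orders" step, here is a minimal sketch that compares target weights with current holdings and emits buy/sell amounts in points. The dictionary-based interface, the 1% no-trade band, and the sells-before-buys ordering are assumptions, not a description of the actual app.

```python
from typing import Dict

def rebalance_orders(target_weights: Dict[str, float],
                     current_values: Dict[str, float],
                     total_points: float,
                     bandwidth: float = 0.01) -> Dict[str, float]:
    """Return {claim_id: points}; positive means buy, negative means sell."""
    orders = {}
    for claim_id in set(target_weights) | set(current_values):
        target_w = target_weights.get(claim_id, 0.0)
        current_w = current_values.get(claim_id, 0.0) / total_points
        delta_w = target_w - current_w
        if abs(delta_w) <= bandwidth:   # already close enough to target
            continue
        orders[claim_id] = round(delta_w * total_points, 2)
    # Sells first, so the points they free up can fund the buys.
    return dict(sorted(orders.items(), key=lambda kv: kv[1]))

def points_to_free(free_points: float, reserve: float) -> float:
    """How many points' worth of shares to sell to restore the free-point reserve."""
    return max(0.0, reserve - free_points)
```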

High Frequency Trading

When there are a) obviously profitable trades to be made and b) multiple people competing for them, it's very easy to get into a competitive spiral that pushes speeds down to the minimum allowed by the available technology. That's how a replication prediction market ended up being all about shaving milliseconds off of trading algos.

By round 9 another player (named CPM) had also automated his trades, and he was faster than me, so he took all my profits by reacting to profitable opportunities before I could get my orders in. We were now locked in an HFT latency race. There was only one round left so I didn't want to spend too much time on it, but I did a small rewrite of my trading app so it could run on Linux (thanks, .NET Core), which involved splitting it into a client (with the UI) and a server (with the trading logic), and patching in some networking so I could control it remotely.16 Then, I threw it up on my VPS which had lower ping to the RM servers.

When I first ran my autotrader, I polled the API for new trades once every 15 minutes.17 Now it was a fight for milliseconds. Unfortunately placing the autotrader on the VPS wasn't enough: the latency was still fairly high and CPM crushed me again, though by a smaller margin this time. Sometimes I got lucky and snagged an opportunity before he could get to it though.
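Since the API offered only plain request/response calls, reacting quickly meant polling in a tight loop. This is a generic sketch of such a loop, with the network call injected as a function. The `fetch_since` callback and the `id` field on each trade are hypothetical stand-ins; they are not the real RM API.

```python
import time
from typing import Callable, Iterable

def poll_trades(fetch_since: Callable[[int], Iterable[dict]],
                on_trade: Callable[[dict], None],
                interval_s: float,
                max_polls: int,
                sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll for trades newer than the last one seen and hand each to on_trade.

    Returns the id of the latest trade seen. The polling interval is the knob
    that went from 15 minutes down to "as fast as the server allows".
    """
    last_seen = 0
    for _ in range(max_polls):
        for trade in fetch_since(last_seen):
            on_trade(trade)
            last_seen = max(last_seen, trade["id"])
        sleep(interval_s)
    return last_seen
```

However short the interval, each poll still pays a full network round trip, which is why lower ping to the server mattered so much.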

The Results

In money terms, I made $6,640 from the surveys and $4,020 from the markets for a total of $10,660 (out of a total prize pool of about $190k).

In terms of the actual replication results, the detailed outcomes are still embargoed, so we'll have to wait until next summer (at least) to get a look at them. Some broad stats can be shared however: the market predicted a 54% chance of replication on average—and 54% of the replications succeeded (the market isn't that good, it got lucky).

Of 107 claims that resolved, I have data on 31 which I made money on. For the rest I either had no shares, or had shares in the incorrect direction. Since I only have data on the successes, there's no way to judge my performance right now.

Survey vs Market Payouts

The survey round payout scheme was top-heavy, and small variations in performance resulted in large differences in winnings. The market payout on the other hand was more or less communistic. Everyone gets the same number of points; and it was difficult to either gain or lose too many of them in the two weeks of trading. As a result, the final distribution of prizes is rather flat. At best a good forecaster might increase earnings by ~10% by exploiting mispricings, plus a bit more through intelligent trading. The Gini coefficient of the survey payouts was 0.76, while the Gini of the market payouts was 0.63 (this is confounded by different participation levels, but you get the point).
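For reference, a Gini coefficient can be computed from a list of payouts as below. This is a minimal sketch using the standard sorted-rank formula; the payout lists in the usage example are made up and are not RM data.

```python
from typing import Sequence

def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values: 0 means perfectly equal."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n over ascending x
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

if __name__ == "__main__":
    flat = [10, 10, 10, 10]      # everyone gets the same payout
    top_heavy = [0, 0, 0, 10]    # winner takes all
    print(gini(flat), gini(top_heavy))
```

A flat payout gives 0, and a winner-takes-all payout among four people gives 0.75.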

This was backwards. I think one of the most important aspects of "ideal" prediction markets is that informed traders can compound their winnings, while uninformed traders go broke. The market mechanism works well because the feedback loop weeds out those who are consistently wrong. This element was completely missing in the RM project. I think the market payout scheme should have been top-heavy, and should have allowed for compounding across rounds, while the survey round should have been flatter in order to incentivize broader participation.

Conclusion

If the market had kept going, my next step would have been to use other people's trades to update my estimates. The idea was to look at their past trades to determine how good they were (based on the price movement following their trade), then use the magnitude of their trades to weigh their confidence in each trade, and finally incorporate that info in my own forecast. Overall it's fascinating how even a relatively simple market like this has tons of little nuances, exploitable regularities, and huge potential for modeling and trading strategies of all sorts.
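I never built this, so what follows is only a sketch of how the idea might look. It scores each trader by whether prices kept moving in their direction after they traded, then nudges my forecast toward trades made by traders with a positive score. The `Trade` fields, the follow-up horizon behind `price_later`, and the `scale` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Trade:
    trader: str
    claim: str
    price_before: float  # price just before the trade
    price_after: float   # price just after the trade
    price_later: float   # price some time later (the horizon is an assumption)

def trader_skill(trades: List[Trade]) -> Dict[str, float]:
    """Mean follow-through per trader: positive if the price kept going their way."""
    sums: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for t in trades:
        direction = 1 if t.price_after > t.price_before else -1
        follow = direction * (t.price_later - t.price_after)
        sums[t.trader] = sums.get(t.trader, 0.0) + follow
        counts[t.trader] = counts.get(t.trader, 0) + 1
    return {name: sums[name] / counts[name] for name in sums}

def adjust_forecast(mine: float, trade: Trade, skill: float, scale: float = 10.0) -> float:
    """Shift my forecast toward a trade, in proportion to trader skill and trade size."""
    size = trade.price_after - trade.price_before  # signed magnitude of the trade
    shift = scale * max(skill, 0.0) * size         # ignore traders with negative skill
    return min(max(mine + shift, 0.0), 1.0)
```

A fuller version would presumably fade the trades of consistently bad traders rather than ignore them, in line with the fading strategy described earlier.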

In the end, are subsidized markets necessary for predicting replication? Probably not. The predictions will(?) be used to train our AI replacements, and I believe SCORE's other replication prediction project, repliCATS, successfully used (cheaper) discussion groups. It will be interesting to see how the two approaches compare. Tetlock's research shows that working as part of a team increases the accuracy of forecasters, so it wouldn't surprise me if repliCATS comes out ahead. A combination of teams (aided by ML) and markets would be the best, but at some point the marginal accuracy gains aren't really worth the extra effort and money.

I strongly believe that identifying reliable research is not the main problem in social science today. The real issue is making sure unreliable research is not produced in the first place, and if it is produced, to make sure it does not receive money and citations. And for that you have to change The Incentives.


PS. Shoot me an email if you're doing anything interesting and/or lucrative in forecasting.

PPS. CPM, rm_user, BradleyJBaker, or any other RM participant who wants to chat, hit me up!


  1. For example a paper based on US GDP data might be "replicated" on German GDP data.
  2. The Bayesian Truth Serum answers do not appear to be used in the scoring?
  3. There were also some bonus points for continuous participation over multiple rounds.
  4. There would be significant liquidity problems with a continuous double auction market.
  5. I can't provide any specific examples until the embargo is lifted, sometime next year.
  6. Cowen's Second Law!
  7. If page count/# authors/% male variables are actually predictive, I suspect it's mostly as a proxy for discipline and/or journal. I haven't quantified it, but subjectively I felt there were large and consistent differences between fields.
  8. The RM replications followed a somewhat complicated protocol: first, a replication with "90% power to detect 75% of the original effect size at the 5% level". If that fails, additional data will be collected to reach "90% power to detect 50% of the original effect size at the 5% level".
  9. Scroll down to "Reconstruction of the Prior and Posterior Probabilities p0, p1, and p2 from the Market Price" in Dreber et al. 2015 for some equations.
  10. In fact it's a lot lower than the .001 threshold they give.
  11. In order to trade quickly at the start, I opened a tab for each claim. When the market opened, I refreshed them all and quickly put in the orders.
  12. I still haven't looked into it, any suggestions? Could just estimate two different models and take a weighted average of the coefficients - caveman statistics.
  13. Behavioral genetics papers for example were undervalued by the market. Also claims where the displayed p-value was inaccurate - most people wouldn't delve into the paper and calculate the p-value, they just trusted the info given on the RM interface.
  14. Another factor to take into consideration was that claims with more shares outstanding had lower expected value, especially during the first five rounds when only ~10 claims per round would pay out. The more winning shares on a claim, the less $ per share would be paid out (assuming the claim is replicated). At the end of the round I traded out of busy claims and into ignored ones in order to maximize my returns. After round 5 the number of claims selected for replication per round increased a lot, making this mostly irrelevant. Or so I thought: this actually turned out to be quite important since only a handful of replications were actually completed for each round.
  15. The code is pretty ugly so I'm probably not going to release it.
  16. A basic familiarity with network programming is an invaluable tool for every forecaster's toolkit.
  17. The API had no websockets or long polling, so I had to poll the server for new trades all the time.



Links & What I've Been Reading Q3 2021

Economic History

1. Was the Industrial Revolution The Industrial Revolution? A fascinating look at the industrial revolution in the UK, including some explanations of slow/zero growth in various periods before WWII.

From 1760 to 1800, the contribution of the steam engine was .004 percent per year to capital deepening and .005 percent to TFP growth. Not until after 1850 had the high-pressure engine become widespread and efficient enough to be deployed in factories and on rail engines that these numbers each rose to .2 percent. A century passed between James Watt’s patent—the first revolutionary “general purpose technology”—and its maximum realization in TFP growth.

Soaring population growth in the late eighteenth and early nineteenth centuries threatened the island with a Malthusian demographic catastrophe. [...] Without an Industrial Revolution, Mokyr reasons that GDP per capita in Britain could have been twenty percent lower in 1830 than in 1760.

Britain became modern, and then it got rich.

2. Leo Aschenbrenner on his favorite Chad Jones papers. "Most of all, Chad’s papers showed me what beautiful economic theory looks like. Simple models that capture a few essential forces, guided by broad empirical trends. These can often reveal insights that totally non-obvious ex ante—but are strikingly intuitive and powerful once found."

3. The Ant and the Grasshopper: Seasonality and the Invention of Agriculture. A fascinating (and speculative) paper from 2019 which argues that agriculture was invented because changes in the earth's orbit caused an increase in seasonality!

Metascience

4. Evidence of Fraud in an Influential Field Experiment About Dishonesty. Fairly brazen data fabrication, though it's still not clear whether it was Ariely or the company that was in charge of collecting the data.

5. Some evidence suggesting that the Sputnik vaccine paper used fake data. I'd note that real-world data shows the vaccine working pretty well regardless of whether there was fraud in the trial.

6. Predicting and reasoning about replicability using structured groups: predicting replicability using the IDEA protocol (‘Investigate’, ‘Discuss’, ‘Estimate’ and ‘Aggregate’) for generating and combining predictions seems to work very well, achieving 84% classification accuracy in this sample. Still waiting on the SCORE results.

7. The Effect of Replications on Citation Patterns: Evidence From a Large-Scale Reproducibility Project

successful replications led to an increase in yearly citations of around 5% and that unsuccessful replications led to a decrease in yearly citations of around 4%. For the average article in my sample, which has roughly eight citations per year, this would imply a change of ±1 citation every 2 to 3 years.

As I was saying, replications don't really matter, so it's better to go for forward-looking reforms instead of trying to fix the past.

8. A survey on questionable research practices from the Netherlands. ~4% fabrication, ~50% frequently engage in QRPs.

9. A clever paper uses the shutdown of a journal (due to an "exogenous shock" in economese) to measure the prevalence of strategic citations. Citations drop by about 20% after discontinuation.

10. Is the Value Premium Smaller Than We Thought? A look at the various decisions that go into constructing a risk factor, and how they affect the end result. "The results suggest that the original value premium estimate is upward biased because of a chance result in the original research decisions."

11. Text-generating models are sometimes used to plagiarize papers by back-and-forth translation, or to generate new (nonsensical) papers. This study looks for "tortured phrases" like "profound neural organization" (ie deep neural network) and "haze figuring" (ie cloud computing), and finds many published papers that appear to have been computer-generated.

12. Arcadia Science is a for-profit research institute, with a biology lab opening in Berkeley next month. "No work produced or funded by Arcadia will be published in journals."

Covid

13. Simpson's paradox and Israeli vaccine data. On stratification by age and calculating vaccine effectiveness.

14. Tamiflu for covid? Looks pretty good, ~50% decrease in risk of hospitalization. Costs $700 though.

Forecasting

15. Alignment Problems With Current Forecasting Platforms. A look at some issues with GJO/CSET/Metaculus. It's not easy to incentivize people to provide their true forecasts at all times, share information, etc.

16. Facebook's new forecasting platform lasted about a month.

17. Hedgehog, blockchain prediction market from "Futarchy Research Limited".

Book Reviews

18. Razib Khan has a relatively positive review of Harden's The Genetic Lottery, but the Steve Sailer review is a lot more entertaining. It's amusing that the BBEG for these people is still Charles Murray rather than, say, David Reich who has said much worse things.

The Rest

19. George Church is bringing back the woolly mammoth.

20. ADS: Become a Billionaire.

Surveying the top Y Combinator companies, I find that around the top 50 are valued at over $1,000,000,000. They won’t all exit successfully, and the founders won’t all own enough equity to emerge with tres commas to their net worth, but this already gets us to a much more practical and optimistic heuristic to life:

  1. Try very hard to get into YC
  2. Conditional on acceptance, try very hard to become a billionaire

The odds really aren't that bad. Also from ADS, Does Moral Philosophy Drive Moral Progress?

21. You've probably already seen SMTM's fantastic series on the causes behind the rise in obesity. Some interesting pushback from RCA and a literal banana.

22. Felix Stocker on Will MacAskill's longtermist plans: Reflecting on the Long Reflection. "I'm struggling to see the Long Reflection as anything other than impossible and pointless: impossible in that we cannot solve all x-risks before any s-risks, or avoid race dynamics; pointless in that I don't believe that there is a great Answer for it to discover."

23. Alexey Guzey on Bloom et al's Are Ideas Getting Harder to Find? The paper has a bunch of problems, but the more general section on TFP is the most interesting:

France’s TFP in 2001 was higher than in 2019. Italy’s TFP in 1970 was higher than in 2019. Japan’s TFP in 1990 was higher than in 2009. Spain’s TFP in 1984 was higher than in 2019. Sweden’s TFP in 1973 was higher than in 1993. Switzerland’s TFP in 1974 was higher than in 1996. United Kingdom’s TFP in 2003 was higher than in 2019.

24. ACX on the FDA: Adumbrations Of Aducanumab. The Moldbuggian aspects of this are still underappreciated. Bureaucracy and bureaucrats are isolated from the consequences of their actions; the idea of equality before the law is a complete joke in the modern regulatory state, and the incentive vectors point in exactly the wrong direction. Scott ultimately blames it on the incentives of the politicians: the people seem to accept infinite costs to prevent certain bad things from happening. But if we take the people as a given, isn't the system of governance ultimately at fault? Plus ACX on missing school: Kids Can Recover From Missing Even Quite A Lot Of School.

25. Herding, Warfare, and a Culture of Honor: Global Evidence. "The culture of pre-industrial societies that relied on animal herding emphasizes violence, punishment, and revenge-taking". Highly speculative (the approach of extracting culture of honor from folklore seems doubtful for various reasons) and those scatter plots are not entirely convincing, but also intuitively appealing.

26. Exploiting an exogenous shock in birth control prices, The Children of the Missed Pill looks at the causal impact of the pill: "As children reached school age, we find lower school enrollment rates and higher participation in special education programs." The eugenic effect of abortion/contraception is both underrated and understudied.

27. A primer on olivine weathering as a cheap method of carbon capture; looks like it could sequester a tonne of CO2 for less than $20. Geoengineering is very cheap compared to most proposed "green" solutions. The OECD has 120 euros per tonne as its "central estimate" of carbon costs in 2030, implying an extremely high ROI for geoengineering.

28. DeepMind: Generally capable agents emerge from open-ended play. "We find the agent exhibits general, heuristic behaviours such as experimentation, behaviours that are widely applicable to many tasks rather than specialised to an individual task. This new approach marks an important step toward creating more general agents with the flexibility to adapt rapidly within constantly changing environments."

29. Unintentionally hilarious paper about AI spotting race in chest x-rays: "Our findings that AI can trivially predict self-reported race - even from corrupted, cropped, and noised medical images - in a setting where clinical experts cannot, creates an enormous risk for all model deployments in medical imaging: if an AI model secretly used its knowledge of self-reported race to misclassify all Black patients, radiologists would not be able to tell using the same data the model has access to."

30. "Pain Reprocessing Therapy" "centered on changing patients’ beliefs about the causes and threat value of pain" more effective than usual care for back pain, at least if you think you can trust people's responses in surveys.

31. Yet another piece of evidence against the efficacy of advertising: TV Advertising Effectiveness and Profitability: Generalizable Results From 288 Brands. "...negative ROIs at the margin for more than 80% of brands, implying over-investment in advertising by most firms. Further, the overall ROI of the observed advertising schedule is only positive for one third of all brands."

32. Brain surgery causes man to need 3 hours less sleep per day.

33. Matt Lakeman travels to Peru and Panama.

34. Poemage is a visualization system for exploring the sonic topology of a poem.

Audio-Visual

35. An animated explainer of Robin Hanson's grabby aliens model: Humanity was born way ahead of its time. The reason is grabby aliens.

36. Did you know that Milla Jovovich released an album in 1994 and it's...not bad at all? Sounds like Kate Bush and Peter Gabriel. Check out Clocks. [NSFW cover art]

37. Plus some great krautrock: Et Cetera - Kabul.

What I've Been Reading

Non-Fiction

  • The History of the Peloponnesian War, by Thucydides. Re-read. What was that Coleridge quip? "All men are born Herodotians or Thucydideans"? Something like that. Anyway, I was definitely born a Herodotian. Thucydides is a historian with the soul of an accountant. Still, there are things to appreciate in that attitude: while most ancient historians never saw an army smaller than 400,000, he's happy to tell you about engagements with 60 hoplites and 20 archers. And keeping track of a myriad engagements, covering Asia Minor, Greece, and Italy, over the span of multiple decades is extremely impressive.

    How prescriptive is Thuc's realpolitik? I'm not entirely sure, it certainly didn't do the Athenians any good. He's obviously a skeptic when it comes to the supernatural, and there's very little room for morality in his history; is this an artifact of the lack of morality in the way the Athenians went about their affairs, or is this something Thuc projects onto them? One interesting point is that his story draws on the structure of tragedy: the hubris of the Sicilian expedition is ultimately punished; the players seem to lack any ability to change course. Perhaps morality plays no role in this history because Thuc views the path taken by each polis as deterministic. (This applies both to the "Thucydides trap" specifically, and also more generally).

    On the question of direct democracy as a system of government things are a bit clearer as Thuc doesn't hide his views. He's short on alternatives though; the traditional polis obviously can't cope with the environment of the 4th century, but Thuc can't really see beyond it.

    There are apparently some people who think Thucydides influenced the Neoconservatives, and I find that utterly absurd. Thuc is extraordinarily cynical when it comes to "spreading freedom"-style justifications for war, and if there's any realpolitik involved in spending trillions so that Afghan women can get gender studies degrees for 20 years before the Taliban come back, I'm not seeing it.

    One of the things that stand out is how bad the Greeks are at war. Reading Thuc, you're constantly thinking "well of course these guys got rolled by the Romans". How did they beat the Persians so hard? Sieges seem to be a sticking point (something Philip II turned out to be quite good at), so perhaps the open battles against the Persians played into their hands, or perhaps it was simply a matter of mismatched unit compositions. On the other hand the Athenians were extraordinarily persistent; even after the plague and the Sicilian disaster they still kept going for years, possibly only losing due to the Persian money flowing into the Spartan coffers.

    If you haven't read any histories of the Peloponnesian War, this is highly recommended, just keep in mind it's very unfinished. Get the Landmark edition.

  • The Swerve: How the World Became Modern, by Stephen Greenblatt. This book has an incredibly ambitious thesis: it argues that the world became modern due to the rediscovery of Lucretius' De Rerum Natura. Unfortunately the evidence presented in favor of that thesis is pretty weak, and the book suddenly ends right as it starts to get into a groove. Still, it's fairly entertaining and has a ton of interesting anecdotes from the life of Poggio Bracciolini (the man who rediscovered Lucretius).

  • The Origin of Species, by Charles Darwin. A fairly dry read, its value today mainly lies in its documentation of the discovery of evolution, and in showing how Darwin could reason his way forward despite rather limited means (not even an inkling of DNA!). It was cool to see the role geology played in the development of evolutionary theory, and there's a very interesting passage (at the end of the chapter ON THE IMPERFECTION OF THE GEOLOGICAL RECORD) in which Darwin almost invents plate tectonics based on the geographical distribution of species. It's difficult to recommend: if you want to learn about evolution, pick up a modern textbook; if you're interested in the history of science you should probably read a historian; and if you just want to read something cool by Charles Darwin, pick up his Beagle adventure.

Fiction

  • A Fire Upon the Deep, by Vernor Vinge. Some pretty cool worldbuilding, with a universe divided into zones where different levels of technology are possible (the highest one is filled with Gods who quickly commit suicide). One of the main alien races is a sentient houseplant riding a roomba (seriously). But half the novel is wasted on a dull isekai story about some annoying kids stuck on a backwards planet with telepathic wolves, making the thing way overlong. And the resolution is not entirely satisfying.

  • Inhibitor Phase, by Alastair Reynolds. A new novel in the Revelation Space universe, unfortunately it's also the worst novel in the Revelation Space universe. It's a bit like a horror theme park, going from one ride to the next with little to no connective tissue between them. Even worse, many of the rides are completely nonsensical given the setting (humanity has almost been completely wiped out by the inhibitors). The two main characters are completely uninteresting, their dialogue is annoying, and the revelations of their backstory are completely predictable.

  • Crash, by J. G. Ballard. Holy mother of Christ, this is an experience. A blunt tool that beats you into submission through drone-like repetitiveness. Truly a novel that lives up to its reputation (one publisher's reader wrote: "This author is beyond psychiatric help. Do Not Publish!"). What images! A peerless examination of the intersection between sex and technology. The Cronenberg film gets the imagery right, but the languid, whispered tempo is completely wrong. Kermode, in a very positive review, described it as "glacial"! I feel the novel required a more in-your-face treatment.

    He dreamed of ambassadorial limousines crashing into jack-knifing butane tankers, of taxis filled with celebrating children colliding head-on below the bright display windows of deserted supermarkets. He dreamed of alienated brothers and sisters, by chance meeting each other on collision courses on the access roads of petrochemical plants, their unconscious incest made explicit in this colliding metal, in the haemorrhages of their brain tissue flowering beneath the aluminized compression chambers and reaction vessels.

  • Lord of the Flies, by William Golding. Somehow managed to evade this as a kid. It's compelling and effective but I can't get on board with its overwhelming cynicism.

  • Don't Make Me Think, by Zero HP Lovecraft. The emoji gimmick doesn't work, but I loved the world-building.

  • Flashman and the Dragon, by George MacDonald Fraser. Flashman's in China this time, right in the middle of the Taiping Rebellion. This is the 8th book in the series, and things are starting to get repetitive, but the humor, deep historical research, and memorable characters manage to overcome the familiar plotline.

  • The Shadow of the Wind, by Carlos Ruiz Zafón. Bad audiobook of a bad airport novel filled with interminable exposition dumps in an awful style. Dropped it halfway through.




Book Review: The Lives of the Most Excellent Painters, Sculptors, and Architects

I found Giorgio Vasari through Burckhardt1 and Barzun. The latter writes: "Vasari, impelled by the unexampled artistic outburst of his time, divided his energies between his profession of painter and builder in Florence and biographer of the modern masters in the three great arts of design. His huge collection of Lives, which is a delight to read as well as a unique source of cultural history, was an amazing performance in an age that lacked organized means of research. [...] Throughout, Vasari makes sure that his reader will appreciate the enhanced human powers shown in the works that he calls "good painting" in parallel with "good letters.""

Vasari was mainly a painter, but also worked as an architect. He was not the greatest artist in the world, but he had a knack for ingratiating himself with the rich and powerful, so his career was quite successful. Besides painting, he also cared a lot about conservation: both the physical preservation of works and the conceptual preservation of the fame and biographies of artists. He gave a kind of immortality to many lost paintings and sculptures by describing them to us in his book.

His Lives are a collection of more than 180 biographies of Italian artists, starting with Cimabue (1240-1302) and reaching a climax with Michelangelo Buonarroti (1475-1564). They're an invaluable resource, as there is very little information available about these people other than his book; his biography of Botticelli is 8 pages long, yet on Botticelli's Wikipedia page, Vasari is mentioned 36 times.

He was a straight-laced man surrounded on all sides by wild and eccentric artists. While Vasari was a sober businessman, always delivering his work on time, the people he was writing about were usually tempestuous madmen who would take commissions and leave the work unfinished, or go off on the slightest affront and start hacking apart their own works. Even of the great Leonardo he writes that "through his comprehension of art, [he] began many things and never finished one of them".

The greater part of the craftsmen who had lived up to that time had received from nature a certain element of savagery and madness, which, besides making them strange and eccentric, had brought it about that very often there was revealed in them rather the obscure darkness of vice than the brightness and splendour of those virtues that make men immortal.

Many of them were undone by their love of food, drink, and/or women:

...when his dear friend Agostino Chigi commissioned him to paint the first loggia in his palace, Raffaello was not able to give much attention to his work, on account of the love that he had for his mistress.

Gwern's review of the autobiography of Cellini (which includes the words "aside from the demonology and weather-controlling") should give you a taste of what these guys were like.

Arnold M. Ludwig, The Price of Greatness: Resolving the Creativity and Madness Controversy

Vasari's approach to the truth can be described as loose, if not gossipy. Many of the lives include fabricated elements, sometimes obviously so: I doubt anyone ever believed the story of Cimabue taking on Giotto as a pupil after seeing him scratch a painting on a stone. One of the most striking tales is the murder of Domenico Veneziano by Andrea del Castagno, but in reality Castagno died first. Vasari also damaged the reputation of some of his competitors, such as Jacopo da Pontormo, whom he portrayed as a paranoid recluse.

Vasari is also hilariously biased in favor of Florence: "in the practice of these rare exercises and arts—namely, in painting, in sculpture, and in architecture—the Tuscan intellects have always been exalted and raised high above all others". The story of his visit to Titian (a Venetian) is typical:

One day as Michelangelo and Vasari were going to see Titian in the Belvedere, they saw in a painting he had just completed a naked woman representing Danae with Jupiter transformed into a golden shower on her lap, and, as is done in the artisan's presence, they gave it high praise. After leaving Titian, and discussing his method, Buonarroti strongly commended him, declaring that he liked his colouring and style very much but that it was a pity artisans in Venice did not learn to draw well from the beginning and that Venetian painters did not have a better method of study.

Titian, Danae with Jupiter as a "golden shower"

I read (as usual) the Everyman edition, but would not recommend braving the entire work unless you're a Renaissance art fanatic. The collection spans over 2000 pages, and can get tiresome and repetitive when you go through the 100th similar biography of some minor painter you've never heard of. I would, however, recommend the best chapters which I have picked out below (and which are freely available online):

  • Giotto (1267-1337), an early stepping stone between the medieval "Greek style" and the modern one.
  • Uccello (1397-1475), the pioneer of perspective.
  • Piero di Cosimo (1462-1522), an eccentric who was influenced by the Dutch and eventually fell under the sway of Savonarola.
  • Raffaello (1483-1520), a brilliant talent who died young.
  • Il Rosso (1495-1540), who travelled to France and painted for Francis I.
  • Il Sodoma (1477-1549), the name says it all. Had a pet monkey.

The Golden Present

Giorgio Vasari was one of the earliest philosophers of progress. Petrarch (1304-1374) invented the idea of the dark ages in order to explain the deficiencies of his own time relative to the ancients, and dreamt of a better future:

My fate is to live among varied and confusing storms. But for you perhaps, if as I hope and wish you will live long after me, there will follow a better age. This sleep of forgetfulness will not last forever. When the darkness has been dispersed, our descendants can come again in the former pure radiance.

To this scheme of ancient glory and medieval darkness, Vasari added a third—modern—age and gave it a name: rinascita.2 And within his rinascita, Vasari described an upward trajectory starting with Cimabue, and ending in a golden age beginning with eccentric Leonardo and crazed sex maniac Raphael, only to give way to the perfect Michelangelo in the end. It is a trajectory driven by the modern conception of the artist as an individual auteur, rather than a faceless craftsman.

The most benign Ruler of Heaven in His clemency turned His eyes to the earth, and, having perceived the infinite vanity of all those labours, the ardent studies without any fruit, and the presumptuous self-sufficiency of men, which is even further removed from truth than is darkness from light, and desiring to deliver us from such great errors, became minded to send down to earth a spirit with universal ability in every art and every profession.

This golden age was certainly no utopia, as 16th century Italy was ravaged by political turbulence, frequent plague, and incessant war. Many of the artists mentioned were at some point taken hostage by invading armies; Vasari himself had to rescue a part of Michelangelo's David when it was broken off in the battle to expel the Medici from Florence.

And yet Vasari saw greatness in his time, and the entire book is structured around a narrative of artistic progress. He documented the spread of new technologies and techniques (such as oil painting, imported from the Low Countries), which—as an artist—he had an intimate understanding of.

This story of progress is paralleled with the rediscovery (and, ultimately, surpassing) of the ancients. It would take until the 17th century for the querelle des Anciens et des Modernes to really take off in France, but in Florence Vasari had already seen enough to decide the question in favor of his contemporaries—the essence of the Enlightenment is already present in 1550. He writes about Donatello (1386-1466), who produced the first nude male sculpture of the modern era:

The talent of Donato was such, and he was so admirable in all his actions, that he may be said to have been one of the first to give light, by his practice, judgment, and knowledge, to the art of sculpture and of good design among the moderns; and he deserves all the more commendation, because in his day, apart from the columns, sarcophagi, and triumphal arches, there were no antiquities revealed above the earth. And it was through him, chiefly, that there arose in Cosimo de' Medici the desire to introduce into Florence the antiquities that were and are in the house of the Medici; all of which he restored with his own hand.

Similarly, he explains that Mino da Fiesole's (1429-1484) work was "somewhat stiff", yet it was nonetheless admired because "few antiquities had been discovered up to that time". The ancients set a new, higher standard, which first created a thirst for imitation, then an impetus to outclass them.

...their successors were enabled to attain to it through seeing excavated out of the earth certain antiquities cited by Pliny as amongst the most famous, such as the Laocoon, the Hercules, the Great Torso of the Belvedere, and likewise the Venus, the Cleopatra, the Apollo, and an endless number of others, which, both with their sweetness and their severity, with their fleshy roundness copied from the greatest beauties of nature, and with certain attitudes which involve no distortion of the whole figure but only a movement of certain parts, and are revealed with a most perfect grace, brought about the disappearance of a certain dryness, hardness, and sharpness of manner...

It is curious that this competitive attitude seems to have disappeared in later eras. In the 18th century, for example, the English painter Joshua Reynolds said of the Belvedere Torso that it retained "the traces of superlative genius…on which succeeding ages can only gaze with inadequate admiration." The Italians of the Renaissance had such a civilizational confidence and such an individual lust for Glory that these fatalistic thoughts would never enter their minds. Take Raphael's School of Athens for example: imagine the self-confidence (if not presumption) necessary to paint Plato (portrayed by da Vinci) and Heraclitus (portrayed by Michelangelo) in the form of your friends and contemporaries! Imagine someone trying that today—Dennett as Plato, Gaspar Noé as...Diogenes?

What came first, the excellence or the confidence? Who can disentangle cause and effect? Braudel suggests an initial "restlessness" created the necessary conditions:

Perhaps if the door is to be opened to innovation, the source of all progress, there must be first some restlessness which may express itself in such trifles as dress, the shape of shoes and hairstyles?

Vasari certainly thought this ambition was a necessary ingredient for greatness. Commenting on Andrea del Sarto, he writes that he was excellent in all skills but "a certain timidity of spirit and a sort of humility and simplicity in his nature made it impossible that there should be seen in him that glowing ardour and that boldness which, added to his other qualities, would have made him truly divine in painting".

And one may ask: why is there no ancient Vasari? Pliny, describing the Laocoön, writes that it is "a work to be preferred to all that the arts of painting and sculpture have produced". Yet he is content to simply mention the names of the artists in passing: Agesander, Athenodorus, and Polydorus of Rhodes. There is not the slightest hint of curiosity about the lives of those most excellent men. Even worse, they were (highly-skilled) copyists, selling reproductions of Hellenistic works to wealthy Romans. The name of the original sculptor is lost to time.3

Aesthetic Value Over Time

You're probably familiar with the story of the Mona Lisa: it was unpopular until it was stolen in 1911, Apollinaire and Picasso were suspects in the case, and when it was finally returned two years later it had become the most famous painting in the world. I was surprised, then, to see that the Mona Lisa was singled out for effusive praise by Vasari. He even focuses on that famous smile:

For Francesco del Giocondo, Leonardo undertook the portrait of Mona Lisa, his wife, and after working on it for four years, he left the work unfinished, and it may be found at Fontainebleau today in the possession of King Francis. Anyone wishing to see the degree to which art can imitate Nature can easily understand this from the head, for here Leonardo reproduced all the details that can be painted with subtlety. The eyes have the lustre and moisture always seen in living people, while around them are the lashes and all the reddish tones which cannot be produced without the greatest care. The eyebrows could not be more natural, for they represent the way the hair grows in the skin—thicker in some places and thinner in others, following the pores of the skin. The nose seems lifelike with its beautiful pink and tender nostrils. The mouth, with its opening joining the red of the lips to the flesh of the face, seemed to be real flesh rather than paint. Anyone who looked very attentively at the hollow of her throat would see her pulse beating: to tell the truth, it can be said that portrait was painted in a way that would cause every brave artist to tremble and fear, whoever he might be. Since Mona Lisa was very beautiful, Leonardo employed this technique: while he was painting her portrait, he had musicians who played or sang and clowns who would always make her merry in order to drive away her melancholy, which painting often brings to portraits. And in this portrait by Leonardo, there is a smile so pleasing that it seems more divine than human, and it was considered a wondrous thing that it was as lively as the smile of the living original.

There are, however, one or two minor problems with his account. One of them is that Vasari never actually saw the Mona Lisa: he was about 6 years old when the painting was moved to France, and he never left Italy. Vasari also says that Leonardo left the painting unfinished, while the Mona Lisa is very much finished. So what's going on? Until the 20th century people simply thought he made it up based on sketches or second-hand accounts.

And then, in 1913, an art collector discovered a second Mona Lisa hanging in a house in Somerset. By all accounts it appears to be authentic, and it matches a sketch by Raphael. It's also a better match for Vasari's description, though he may not have seen this one either. If he thought this one was great, just imagine how he would have raved about the first Mona Lisa!

The Second Mona Lisa

This raises the question: do we venerate the same paintings as Vasari due to path dependency, or due to constancy in aesthetic judgment? Broadly, Vasari's taste is our own. He likes Raphael, da Vinci, and Michelangelo above all others. There are certainly those who argue that the influence and worship of Florentine artists is merely a historical accident, and if Vasari had been a Venetian the history of painting would have turned out rather differently.

There are a few interesting points of difference. For example, Botticelli only gets a very short biography, and his Birth of Venus merits no more than a passing comment: Vasari says "he expressed himself with grace". Another artist who was mostly ignored by Vasari and was later "reevaluated" is the highly erotic Antonio da Correggio.

Correggio, Jupiter and Io

Architecture?

YOU - Hold on, is architecture also art?

CONCEPTUALIZATION - Of course not, it's autism. Box-drawing. Masturbation with a ruler and a sextant or whatever they use.

Painters, naturally. Sculptors, of course. But...architects? Certainly no twenty-first century chronicler would collect the lives of painters, sculptors, and architects. In our own age architecture is little more than an exercise in applied misanthropy. It has gotten so bad even the commies can tell it sucks, and they're not exactly famous for their aesthetic discernment. And these Renaissance artists were not limited to constructing fancy villas or churches, they often got involved in military engineering as well!

Architects might try to defend themselves by appealing to specialization and saying that, as the science has progressed, an architect today requires far more specialized knowledge than they did in the 16th century. One cannot be both a painter and an architect at the same time. Yet I cannot help but notice that the Duomo and the Uffizi are still beautiful and still standing, while our contemporary concrete claptrap starts crumbling after a couple of years. Our segregation of these fields is both arbitrary and misguided. Perhaps education (and the way it commoditizes knowledge) is to blame.4

Human Capital, Power Laws, and Cluster Effects

Perhaps the reason these people were able to paint, sculpt, architect, and sometimes even do a bit of military engineering on the side, is that art was one of very few avenues available at the time for people to monetize their high human capital. A smart guy in 16th century Florence had limited options: he might go into law, try to be a scribe, or (if he had money) commerce. Science was not a profession, and there were no hedge funds or startups. Art offered a new avenue, open to all with talent.5

Art was also something of a winner-take-all market, with virtually unlimited upside for the select few who could make it to the top. Like modern athletes, the superstars were drowning in money while the average painter didn't make all that much. Time seems to have confirmed this power law in artistic excellence: nobody goes to a museum for the paintings of Bartolomeo Vivarini, while da Vinci draws millions every year.

Societies are broadly defined by how they allocate status and (by extension) how they allocate the scarce biological resources they have access to. Rome rewarded military leadership, so it got a lot of great generals (and civil wars). The kleptocrats of Renaissance Italy allocated talent to art; gold and fame attracted competence. At the time Vasari was writing, there were about thirty thousand men in Florence—roughly the same as the number of male citizens in Classical Athens, and also roughly the same as today's population of Dubuque, Iowa. Yet their achievements (to borrow a phrase from Gibbon) would excuse the computation of imaginary millions.

One might ask: where are all the Shakespeares? There are about 25x more literate men in England today than in 1564. How come we aren't producing 25x more Shakespeares? The answer is that our society does not allocate much of its human capital to playwriting. There are (potentially) great authors who spend 8 hours a day writing ads for cereal, or improving trading algorithms by 0.01%. Capitalism, for all its virtues, tends to instill a preference for mere optimization in its subjects; the shadow of utility blots out the impetus for Glory.

One of the starkest lessons from Vasari is the importance of clusters in artistic production. He highlights both the spur of competition and the virtues of learning through imitation. He documents how new techniques (oil, perspective, new approaches to color) spread like wildfire, and how the newly unearthed Roman statues provided both a lesson and a stimulus for improvement. The mentor-mentee relationship was extremely important; a young artist could destroy his entire career by choosing the wrong master.

There is an extensive literature in "economics"6 covering the influence of agglomeration in creative industries. Hollywood is an obvious example, but there also seem to be agglomeration gains in 18-19th century classical music, while a writer who moved to London in the 18th or 19th century ended up with 12% higher productivity. Renaissance Florence certainly seems to be another one of these (which suggests an element of path dependency).

A century ago a man like Ernest Hemingway could just travel to Paris, join a flourishing artistic community, and have lunch with the world's greatest author (James Joyce). Imagine some random guy flying to New York and trying to have a meal with Thomas Pynchon today. Global connectivity has made us more insular by removing the barriers that used to act as filters. The apprenticeship opportunities of the Renaissance do not exist any more, though the wealthy patrons are still around.

Yet new possibilities for cluster formation open up on the internet: group chats, forums, perhaps even twitter. But it is not easy to cultivate the right mix of competition and imitation, or the preconditions necessary for cultural confidence and a lust for Glory. Perhaps the closest analogy in our time would be Silicon Valley; a relatively small area which attracts talent in search of money and fame. It certainly has that culture of ambition.

I leave you with a final quotation from Vasari, on the motivations behind art and how they affect the ultimate product:

And, to tell the truth of the matter, those craftsmen who have as their ultimate and principal end gain and profit, and not honour and glory, rarely become very excellent, even although they may have good and beautiful genius; besides which, labouring for a livelihood, as very many do who are weighed down by poverty and their families, and working not by inclination, when the mind and the will are drawn to it, but by necessity from morning till night, is a life not for men who have honour and glory as their aim, but for hacks, as they are called, and manual labourers, for the reason that good works do not get done without first having been well considered for a long time. And it was on that account that Rustici used to say in his more mature years that you must first think, then make your sketches, and after that your designs; which done, you must put them aside for weeks and even months without looking at them, and then, choosing the best, put them into execution; but that method cannot be followed by everyone, nor do those use it who labour only for gain.

Giorgio Vasari, Self-portrait

  1. "And without Giorgio Vasari of Arezzo and his all-important work, we should perhaps to this day have no history of northern art, or of the art of modern Europe, at all."
  2. The word "renaissance" was only popularized in the 19th century by Michelet.
  3. There was a sculptor-biographer named Xenokrates of Sicyon but all his works are lost.
  4. McKenzie Wark: "Education “disciplines” knowledge, segregating it into homogenous “fields,” presided over by suitably “qualified” guardians charged with policing its representations. The production of abstraction both within these fields and across their borders is managed in the interests of preserving hierarchy and prestige. Desires that might give rise to a robust testing and challenging of new abstractions is channelled into the hankering for recognition."
  5. Michelangelo would undoubtedly have scored very well on the GRE.
  6. All will be trampled under the steady imperial advance of the SPQE—the Senatus Populusque Economicus!



Against Caring About Individual Bad Studies

The idea of a personal "carbon footprint" is an oil company psyop. About 20 years ago, British Petroleum launched an ad campaign popularizing the notion and put out a website letting you calculate your "carbon footprint". They're still at it.

It's an idea with remarkable memetic power, both for individuals and brands. Displaying your concern about your personal carbon footprint lets you show off your prosociality and marks you out as someone virtuous, someone who takes personal responsibility. The idea also feeds into people's narcissistic tendencies, reassuring them that they're actually important and that their actions matter in the world.

Marketers love the concept, and any company trying to appeal to the nature-loving demographic can use and abuse it: Outdoor Brands Get Serious About the Carbon Footprint of Adventure: The North Face and Protect Our Winters unveil an activism-oriented CO2 calculator.

The problem with all of this, and the reason BP pushed the idea in the first place, is that your personal carbon footprint doesn't matter. You're 1 of about 7.6 billion people on earth, so your effect is about 1/7,600,000,000 ≈ 0.000000013%. Your personal carbon footprint is completely irrelevant to climate change. Global warming is a collective issue that requires collective solutions; framing it as a problem that individuals can tackle (and that individuals are responsible for) distracts from the public policy changes that are necessary. Environmentalist signaling is complete nonsense but also deadly serious.
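
The arithmetic is easy to check for yourself. A minimal sketch, assuming the same round figure of 7.6 billion people and (unrealistically) that everyone emits the same amount; the function name is just for illustration:

```python
# One person's share of a global total, as a percentage.
# Assumptions: ~7.6 billion people, all contributing equally.
WORLD_POPULATION = 7_600_000_000


def individual_share_percent(population: int = WORLD_POPULATION) -> float:
    """Return a single person's share of the whole, in percent."""
    return 100 / population


share = individual_share_percent()
print(f"{share:.9f}%")  # prints 0.000000013%
```

Real footprints vary a lot from person to person, so the true figure for any given reader will differ, but not by enough orders of magnitude to change the point.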

The Parallel

Caring about individual bad studies is a bit like caring about your individual carbon footprint.

People occasionally send me shitty papers, and a year or two ago I would care a lot, enjoying the shameful thrill of getting Mad Online about some fraud, or having fun picking apart yet another terrible study. It's an attractive activity, and performing it in public shows how much you care about Good Science. Picking out a single paper to replicate operates at the same level. What's the impact of all this?

In the idealized version of science, a replication failure would raise serious doubts about the veracity of the original study and have all sorts of downstream effects. In the real version of social science, none of that matters. You have to wage active memetic warfare if you want to have any effect, and even then there's no guarantee you'll succeed. As Tal Yarkoni puts it, "the dominant response is some tut-tutting and mild expression of indignation, and then everything reverts to status quo until the next case". People keep citing retracted articles. Brian fucking Wansink has been getting over 7 citations per day in 2021. So what exactly do you think a replication is going to achieve?

Walker

A couple years ago Alexey Guzey wrote "Matthew Walker's "Why We Sleep" Is Riddled with Scientific and Factual Errors", finding not only errors but even egregious data manipulation in Walker's book. Guzey later collaborated with Andrew Gelman on Statistics as Squid Ink: How Prominent Researchers Can Get Away with Misrepresenting Data.

What was the effect of all this? Nothing.

Guzey explains on twitter:

my piece on the book has gotten >250k views by now and still not a single neuroscientist or sleep scientist commented meaningfully on the merits of my accusations. [...] According to UC Berkeley, "there were some minor errors in the book, which Walker intends to correct". The case is closed.

The feedback loops that are supposed to reward people who seek truth and to punish charlatans are just completely broken.

...but a prominent neuroscientist did write to him in private to express his agreement.

Implicit Bias

It is so much harder to get rid of bullshit than it is to prevent its publication in the first place. Let's take a look at some of the literature on implicit bias.

Oswald et al (2013) meta-analyze the relation between the IAT and discrimination: "IATs were poor predictors of every criterion category other than brain activity, and the IATs performed no better than simple explicit measures." Carlsson & Agerström (2016) refine the Oswald et al paper, and find that "the overall effect was close to zero and highly inconsistent across studies [...] little evidence that the IAT can meaningfully predict discrimination".

Meissner et al (2019) review the IAT literature and find that the "predictive value for behavioral criteria is weak and their incremental validity over and above self-report measures is negligible".

Forscher et al (2019) meta-analyze the effect of procedures to change implicit measures, and find that they "generally produced trivial changes in behavior [...] changes in implicit measures did not mediate changes in explicit measures or behavior". Figure 8 from their paper shows the effect of changing implicit measures on actual behavior:

What was the effect of all this? Nothing.

Just within the last few days, the New Jersey Supreme Court announced implicit bias training for all employees of state courts, "Dean Health Plan in Wisconsin implemented new strategies to address health equity in maternal health, including implicit bias training for employees", the Auburn Human Rights Commission has "offered implicit bias training to supervisory personnel in Auburn city government, the Cayuga County Sheriff's Office, public schools and other local organizations", and California's Attorney General is making sure that healthcare facilities are complying with a law requiring anti-implicit bias training.

You can debunk, and (fail to) replicate all you want, but it don't mean a thing. Mitchell & Tetlock (2017) write:

once employers, health care providers, police forces, and policy-makers seek to develop real solutions to real problems and then monitor the costs and benefits of these proposed solutions, the shortcomings of implicit prejudice research will likely become apparent

But it didn't turn out that way, did it? Just as with the personal carbon footprint, the ultimate outcome is a secondary consideration at best.

Fin

Yarkoni (the British Petroleum of social science) says "it's not the incentives, it's you" but, really, it's the incentives. Before you can run, you must walk. Before you replicate, you must have a scientific ecosystem with reliable self-correction mechanisms.1 And before you build that, it's a good idea to limit the publication of false positives and low-quality research in general.

One of the key insights of longtermism is that if humanity survives in the long term, the vast majority of humans will live in the future, so even a small improvement to their welfare can have a huge effect. We might make a similar argument about longtermism in social science: the vast majority of papers lie in the future. If we can do something today to improve them even by a little bit, the cumulative impact would be enormous. On the other hand, defeating one of the 10,000 bad papers that will be published this year is not going to do much at all. Effective scientific altruism is systematically improving the future by 0.01% rather than putting your energy into deboonking a single study. Every dollar wasted on replication is a dollar that could've been invested in fixing the underlying collective problems instead. The past is not going to change, but the future is still malleable.
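
To make the comparison concrete, here is a toy sketch. It is not an estimate: it assumes the 10,000 bad papers per year stays constant, reads "improving the future by 0.01%" as 0.01% fewer bad papers in every future year, and picks an arbitrary 50-year horizon. All names and the horizon are my own illustrative choices.

```python
# Toy comparison of a one-off debunk vs. a small systemic improvement.
# Assumptions (illustrative only):
# - 10,000 bad papers are published every year, held constant
# - a 0.01% improvement means 0.01% fewer bad papers in each future year
# - HORIZON_YEARS is an arbitrary horizon chosen for the example
BAD_PAPERS_PER_YEAR = 10_000
SYSTEMIC_IMPROVEMENT = 0.0001  # 0.01% expressed as a fraction
HORIZON_YEARS = 50


def bad_papers_avoided_by_debunking(papers_debunked: int = 1) -> float:
    """One-off effect: at best, each debunk neutralizes one bad paper."""
    return float(papers_debunked)


def bad_papers_avoided_systemically(years: int = HORIZON_YEARS) -> float:
    """Recurring effect: the same small fractional reduction, every year."""
    return BAD_PAPERS_PER_YEAR * SYSTEMIC_IMPROVEMENT * years


one_off = bad_papers_avoided_by_debunking()
systemic = bad_papers_avoided_systemically()
print(f"one debunk: {one_off:g} paper; systemic fix: {systemic:g} papers")
```

Under these assumptions the two break even after a single year (one paper is exactly 0.01% of 10,000), and every year after that goes to the systemic fix. The sketch also generously assumes a debunk actually works, which the examples above suggest it usually doesn't.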

Ideally we'd proclaim the beginning of a new era, ban citations to any pre-2022 works, and start from scratch (except actually do things properly this time). Realistically that won't happen, so the second-best approach is probably a Hirschmanian Exit into parallel institutions.


  1. I don't want to overstate the case here—some disciplines work pretty well, so it's not entirely hopeless. I would certainly hope that medical researchers still try to replicate the effects of drugs, and physicists replicate their particle experiments. But in the social sciences things are different.



Links & What I've Been Reading Q2 2021

Metascience

1. New Science is an attempt to construct a brand new, parallel research ecosystem without all the cruft of academia. Founded by Alexey Guzey and advised (among others) by Tyler Cowen and Andrew Gelman.

2. Observing Many Researchers Using the Same Data and Hypothesis Reveals a Hidden Universe of Uncertainty: 73 teams study the same hypothesis with the same data. Chaos ensues. "Each model deployed to test the hypothesis was unique".

3. Tal Yarkoni and Joe Hilgard have exited academia. Some notes on The Science Reform Brain Drain. "I didn’t believe then that scientific reform would just fizzle out, given the attention and passion it elicited. Now, seeing how tenure insulates older researchers and competition weeds out those who don’t play by their rules, I understand the cynicism better."

4. Please Commit More Blatant Academic Fraud: "The problem with this sort of low-key fraud is that it’s insidious, it’s subtle. In many ways, a fraudulent action is indistinguishable from a simple mistake. There is plausible deniability [...] Let’s make explicit academic fraud commonplace enough to cast doubt into the minds of every scientist reading an AI paper. Overall, science will benefit."

5. Nonreplicable publications are cited more than replicable ones.

6. Atoms: smart contracts for science funding. "Implicit researcher duties are now made explicit with incentives. As a result, scientific roles can become both more specialized and more diverse. PIs can focus less time on writing grants and more time on conducting research. Or the PIs who enjoy and excel at raising funds can do so and even re-deploy it to the right scientists, akin to founders who become angel investors and venture capitalists."

7. Understanding and Predicting Retractions of Published Work Based on metadata + full text. Performs surprisingly well. "Individually, SJR, abstract, country give the best performance out of all metadata features."

8. Collison, Cowen & Hsu, What We Learned Doing Fast Grants. "64% of respondents told us that the work in question wouldn’t have happened without receiving a Fast Grant."

9. Elisabeth Bik is facing legal threats. This Didier Raoult character apparently has more than 3500 publications.

10. A Retrospective on the 2014 NeurIPS Experiment: a giant post on the consistency of the review process, based on 170 papers submitted to NeurIPS. The consistency is actually fairly high, though there is "no correlation between reviewer quality scores and paper's eventual impact".

Covid

11. On the role of scientific journals in shaping the narrative around the origins of covid.

Yet this is the same prestigious journal that published a now infamous statement early last year attacking “conspiracy theories suggesting that Covid-19 does not have a natural origin“. Clearly, this was designed to stifle debate. It was signed by 27 experts but later turned out to have been covertly drafted by Peter Daszak, the British scientist with extensive ties to Wuhan Institute of Virology. To make matters worse, The Lancet then set up a commission on the origins — and incredibly, picked Daszak to chair its 12-person task force, joined by five others who signed that statement dismissing ideas the virus was not a natural occurrence.

12. Who killed the lab leak hypothesis? (twitter thread).

Forecasting

13. Avraham Eisenberg: Tales from Prediction Markets

There was a market on how many times Souljaboy would tweet during a given week. The way these markets are set up, they subtract the total number of tweets on the account at the beginning and end, so deletions can remove tweets. Someone went on his twitch stream, tipped a couple hundred dollars, and said he'd tip more if Soulja would delete a bunch of tweets. Soulja went on a deleting spree and the market went crazy.

14. The Market Consequences of Investment Advice on Reddit's Wallstreetbets: "We find average ‘buy’ recommendations result in two-day announcement returns of 1.1%.[...] 2% over the subsequent month and nearly 5% over the subsequent quarter. [...] our findings suggest that both WSB posters and users are skilled." Or as /r/wallstreetbets put it, "a group of scientists checked our sub out and came to the conclusion that we are not complete morons".

Book Reviews

15. On Sarah Ruden's translation of the gospels: Do you know how weird the gospels are?

Plenty of good reviews came out of the SSC book review contest. My favorites:

16. Double Fold, on librarians and preservation.

17. On The Natural Faculties, a defense of Galen.

18. Down And Out In Paris And London, on Orwell's experiences as a tramp and menial worker.

The Rest

19. There’s no such thing as a tree (phylogenetically): a fantastic post on convergent evolution and the classification of 'trees'. "The common ancestor of a maple and a mulberry tree was not a tree. The common ancestor of a stinging nettle and a strawberry plant was a tree."

20. From the great new blog SLIME MOLD TIME MOLD: Higher than the Shoulders of Giants; Or, a Scientist’s History of Drugs. What if the productivity growth slowdown is due to the 1970s Controlled Substances Act? Come for the history of stimulants, stay for Tesla's views on chewing gum. Too many good quotes! Not entirely sure if it's serious or tongue-in-cheek, but that's part of the charm.

21. Toby Ord: The Edges of Our Universe

  • Many galaxies that are currently outside the observable universe will become observable later.
  • Less than 5% of the galaxies we can currently observe could ever be affected by us, and this is shrinking all the time.
  • But we can affect some of the galaxies that are receding from us faster than the speed of light.

22. Scott Alexander Contra Smith On Jewish Selective Immigration. The final paragraph is absolutely spot on: if the Ashkenazi advantage is cultural, then studying it is by far the most important question in the social sciences.

23. Shocks to human capital persist, shocks to physical capital do not: BOMBS, BRAINS, AND SCIENCE: THE ROLE OF HUMAN AND PHYSICAL CAPITAL FOR THE CREATION OF SCIENTIFIC KNOWLEDGE. Also interesting for the data on Jewish contributions to German science before the war: "While 15.0% of physicists were dismissed, they published 23.8% of top journal papers before 1933, and received 64% of the citations"! h/t @cicatriz

24. Social Mobility and Political Regimes: Intergenerational Mobility in Hungary, 1949-2017. Social mobility rates ~the same during and after communism. Aristocrats still privileged after 1949. h/t @devarbol

25. No causal associations between childhood family income and subsequent psychiatric disorders, substance misuse and violent crime arrests: a nationwide Finnish study of >650 000 individuals and their siblings. A new study from Amir Sariaslan and colleagues, corroborating earlier results from Sweden. Perhaps the Nordic countries with their generous social spending are different from countries with greater inequality though?

26. The Lead-Crime Hypothesis: A Meta-Analysis. "When we restrict our analysis to only high-quality studies that address endogeneity the estimated mean effect size is close to zero." That's quite the funnel plot:

27. Better air is the easiest way not to die. On particles in the air, the harm they cause, and how to avoid them. "By all means, control your body-mass, eat well, and start running. Those are important, but they’re also kind of hard. You might fail to lose weight, but if you try to fix your air, you’ll succeed. You should put the stuff with the highest return on effort first, and that’s air."

28. On Fantastic Mr. Fox and Ted Kaczynski.

By the end of the movie Mr. Fox has pillaged and salted three of the country's largest industrial farms and set a small town on fire with acorn bombs. He got symbolically castrated, lost his home, almost lost his marriage, children, and destroyed the homes and businesses of 20 people who were lucky they didn't starve to death—but he's gotten people to read his column.

29. In 1989 there was an ecoterrorist attack on California, using an invasive species of fruit fly.

30. Viral Visualizations: How Coronavirus Skeptics Use Orthodox Data Practices to Promote Unorthodox Science Online. A seemingly-Straussian (but possibly not) paper on the social epistemology of covid skepticism. "Most fundamentally, the groups we studied believe that science is a process, and not an institution. [...] Moreover, this is a subculture shaped by mistrust of established authorities and orthodox scientific viewpoints. Its members value individual initiative and ingenuity, trusting scientific analysis only insofar as they can replicate it themselves by accessing and manipulating the data firsthand."

31. Robin Hanson: Managed Competition or Competing Managers? On how attitudes toward competition influence our judgments about things like evolution and alien civilizations. "This strong norm favoring management over competition helps explain the widespread and continuing dislike for the theory of natural selection, which explicitly declares a system of competition to be the largest encompassing system."

32. The Deep History of Human Inequality. Rousseau, Darwin, and Boehm on the question of evolution and inequality. "Going further, it could be that culture was essential for reversing polygyny. That’s because practising reverse dominance requires collective action. It’s only by working together that bachelors can depose the big boys."

33. Applied Divinity Studies on Stubborn Attachments, longtermism, progress studies, and effective altruism: The Moral Foundations of Progress. "If we stagnate now, we may be able to restart growth in the future. In comparison, an existential catastrophe is by definition unrecoverable. Given the choice, we ought to focus on stability."

34. A new essay from Houellebecq: The narcissistic fall of France.

No, we are not really dealing with a “French suicide” — to evoke the title of Eric Zemmour’s book — but a Western suicide or rather a suicide of modernity, since Asian countries are not spared. What is specifically, authentically French is the awareness of this suicide. [...] By refusing all forms of immigration, Asian countries have opted for a simple suicide, without complications or disturbances. The countries of Southern Europe are in the same situation, although one wonders if they have consciously chosen it. Migrants do land in Italy, in Spain and in Greece — but they only pass through, without helping to sort out the demographic balance, although the women of these countries are often highly desirable. No, the migrants are drawn irresistibly to the biggest and fattest cheeses, the countries of Northern Europe.

35. The Borderless Welfare State, a report from the Netherlands on the costs and benefits of immigration. Summary in English on p. 19. Scroll down for some great charts.

36. Learning to Hesitate: people tend to spend too much time gathering info on low-impact choices, and too little time gathering info on high-impact choices.

37. What if humans and chimpanzees diverged because of ticks? Hair loss as defense against ticks caused babies to be unable to cling to their mothers, which caused upright walking?! Obviously speculative, but I love this kind of speculation.

38. Wikipedia: Meteor burst communications "is a radio propagation mode that exploits the ionized trails of meteors during atmospheric entry to establish brief communications paths between radio stations up to 2,250 kilometres (1,400 mi) apart."

39. Niccolo Soldo interviews Marc Andreessen(?!?!) "I predict that we — the West — are going to WEIRDify the entire world, within the next 50 years, the next two generations. We will do this not by converting non-WEIRD people to WEIRD, but by getting their kids." His interview with "Unrepentant Baguette Merchant" PEG is also entertaining.

40. Everyone with an e-reader has run into public domain ebooks with horrible formatting/OCR errors on Amazon or Project Gutenberg. Standard Ebooks produces high-quality (and free) versions of public domain books.

41. AI-designed hardware. "We believe that more powerful AI-designed hardware will fuel advances in AI, creating a symbiotic relationship between the two fields."

42. ETH token fights back against frontrunning bots by trapping them in the position.

43. How I Taught The Iliad to Chinese Teenagers

44. On the virtues of frozen food.

45. Great non-fiction books under 250 pages.

Audio-Visual

46. DeepMind's AlphaGo documentary is quite good.

47. Doom on a holographic(?) display.

48. And here's Viagra Boys with Girls & Boys from Shrimp Sessions 2.

What I've Been Reading

Non-Fiction

  • The Lives of the Most Excellent Painters, Sculptors, and Architects by Giorgio Vasari. Vasari was a painter and architect who lived in the first half of the 16th century and personally knew many of the greats (including Michelangelo). In this gossipy collection of biographies he covers more than 180 artists, starting with Cimabue and Giotto in the 13thC and ending with Michelangelo and others who were still alive at the time of writing (like Titian and Jacopo Sansovino). The ideas of progress and renaissance are front and center: the great ancients, the decline in the middle ages, and finally the triumphant rebirth of art in his own era. Parts of it are excellent, but it can get a bit dry and repetitive when he describes various minor artists, so I probably wouldn't recommend the full 2000+ page unabridged version. There's a good two-part BBC documentary called Travels With Vasari. Full review forthcoming.

  • Not by Genes Alone: How Culture Transformed Human Evolution by Robert Boyd & Peter Richerson. I was curious to see if there was anything in B&R that Henrich failed to capture in his work, and the answer is broadly "no", but there are a few interesting differences: while Henrich is rather triumphalist, B&R take a much more skeptical view of cultural evolution (a Nietzschean perspective, though of course they don't cite him). Unfortunately most of the book is bogged down by a series of dull arguments against various opponents of cultural evolution. My recommendation would be to read The Secret of Our Success, then read just chapter 5 ("Culture is Maladaptive") in this one.

  • Great Mambo Chicken and the Transhuman Condition by Ed Regis. A fun pot-pourri of hubristic futurist ideas (cryonics, space habitats, interstellar travel, and so on), and the wild eccentrics who come up with them (Bob Truax, Hans Moravec, Freeman Dyson). The subjects are fascinating, but the book is a bit disorganized and repetitive.

  • Essays and Aphorisms by Arthur Schopenhauer. Selections from Parerga und Paralipomena. Very funny, Schopenhauer would have been one hell of a twitter poaster. Surprisingly similar to the pragmatists in some respects. And a pessimistic inverse of Nietzsche in others: "Between the spirit of Graeco-Roman paganism and the spirit of Christianity the real antithesis is that of affirmation and denial of the will to live – in which regard Christianity is in the last resort fundamentally in the right." Will be tackling World as Will and Representation soon-ish.

  • Selected Writings by William Hazlitt. How pathetic the petty political polemics of the past appear to the present... I despise his style, especially in the political pieces: cheap bluster that aims only to dazzle, never to illuminate. The puffed-up rhetoric of a third-rate ochlagogue. The non-political writings are much better—they are merely unreadable rather than actively offensive.

  • The Literary Art of Edward Gibbon by Harold L. Bond. A fine, short overview. Not aimed at a general audience.

  • Fiscal Regimes and the Political Economy of Premodern States, edited by Andrew Monson & Walter Scheidel. I read the three chapters on Rome and skimmed the rest. If an edited volume on the taxation regimes of pre-modern states sounds like an interesting topic to you, check it out. Revenue sources, coinage, debt, trade, principal-agent problems in collection, constraints to budget allocation, and so on. I should probably get to Scheidel's other works at some point.

  • The Viennese Students of Civilization: The Meaning and Context of Austrian Economics Reconsidered by Erwin Dekker. On the dry/academic side of things in terms of its style. Ultimately feels a bit superficial: this guy said this, the other guy said that...but we never get a critical examination of the substance of the arguments. The key take-away is that the "Austrian school" focused mostly on humbleness before evolved institutions, and emphasized the necessity of limits in order to have practical freedom.

  • Failure Is Not an Option: Mission Control From Mercury to Apollo 13 and Beyond by Gene Kranz. Fascinating subject, but written in a dry, militaristic, PR-conscious style. Even the story of Apollo 13 can become almost boring when told in this manner. Focused entirely on the mission control perspective. The most interesting aspect is how uncredentialed and inexperienced everyone was, and how quickly the space program moved. Reminiscent of Napoleon after the revolution. Feels like they really got lucky sometimes. Genius in hiring?

  • The 48 Laws of Power by Robert Greene. For some reason I read a bunch of "self-help"(-adjacent) books. This one is really anodyne compared to what I was expecting. It's mostly famous as a book read by ruthless rappers, but it's just a bunch of amusing historical anecdotes plus a boatload of confirmation bias. Greene likes the history of Japanese tea ceremonies, France during the Ancien Régime and the revolution, ancient Rome and Greece, and even takes several stories from Giorgio Vasari! Above all he likes Baltasar Gracián, whose The Pocket Oracle and Art of Prudence I can heartily recommend.

  • Influence: Science and Practice by Robert Cialdini. While 48 Laws of Power presents itself as a manual of manipulation and Influence presents itself more as a disinterested scientific study, the former is actually about airy stories of kings and courtiers while the latter is a cynical dark arts manual for manipulating your coworkers. Make of that what you will. Repetitive & overlong. Also, Cialdini loves to cite dubious social science papers: Milgram, Robber's Cave, etc. Still, the broad strokes are fairly convincing.

  • The Presentation of Self in Everyday Life by Erving Goffman. Class-signaling behaviors, profession-signaling behaviors, and so on, viewed through the lens of theatrical presentation. Rather one-sided, I feel it misses situations that can't be boiled down to actor-audience. Nothing really surprising, I think most people will have noticed most of this stuff. Also draws on many questionable historical examples (for example he repeatedly uses the Thugs to illustrate his points).

  • Impro: Improvisation and the Theatre by Keith Johnstone. The general observations on status, presentation, space, etc. are quite good, but when he gets into the specifics about theater and masks it's rather dull and fluffy. Would have preferred something a bit more solid.

Fiction

  • Uzumaki by Junji Ito. Horror manga. Starts with a simple idea: spirals are kinda creepy. From there it spins out in every direction, finally ending up in a bizarre post-apocalyptic Lovecraftian scenario. A virtuosic display of variations on a visual theme. Fantastic art, fantastically weird. Highly recommended. Lots of crazy body horror, not for the squeamish.

  • The Sailor Who Fell from Grace with the Sea by Yukio Mishima. A great short novel about the sea, glory, death, and wanting to have sex with your mother. Somewhat autobiographical, in a symbolic way. Nihilism, tradition vs westernization, youth vs age, all in a lyrical and nautical style.

  • Mao II by Don DeLillo. Cults, mass media, a reclusive author. Love the style, very impressionistic. Lots of great sentences and great paragraphs, unfortunately they do not combine to form a Great Novel, the ideas never coalesce into anything solid. DeLillo revisits many of his typical themes here: American foreign policy, terrorism, cults, etc. Rather presciently written in 1991, very pessimistic on the potentials of mass action. "The future belongs to crowds."

  • Libra by Don DeLillo. A semi-fictionalized biography of Lee Harvey Oswald, based on the CIA/Cuban exiles conspiracy theory of the JFK assassination. Somewhat conventional in its style, and Pynchonesque in its attitude: conspiracies, axes of control and influence, strange coincidences, overeager pattern-matching, taking liberties with history. It's lacking the humor though. There's also a kind of meta parallel story of an FBI agent trying to piece together all the evidence, meticulously going through even the tiniest element (much like DeLillo). Pretty good, but The Names remains my favorite DeLillo.

  • The Pussy by Delicious Tacos. A collection of autobiographical vignettes about sex and relationships. Starts out extremely vulgar and extremely funny, ends up in deep ugliness and despair. A tragedy disguised as a comedy. Pure blackpill fuel: a dystopian vision of work, love, aging, and human connection in our society. Slightly longer review.

  • The Unnamable by Samuel Beckett. If you're interested in the extremes of experimental literature, this is a book for you. The novel at its most abstract and formless. Virtually no characters, plot, movement, imagery, dialogue, paragraphs, or really anything else you might normally associate with a novel. I wouldn't say it's a pleasurable read, but it's an interesting one at least. Isolation, existential loneliness, death.

  • How It Is by Samuel Beckett. It can't possibly be sparser and more formless than The Unnamable, you think. But it is! Beckett does away even with coherent, full sentences in this one. Nothing but a series of roughly sketched impressions, in a halting and disjointed language. Not really my jam.

  • Wasteland of Flint by Thomas Harlan. A fun space opera in a unique setting (an Aztec-Japanese space empire), focused on xenoarchaeology. Ancient aliens, some cool Solaris-like ideas, some really out-there imagery. Unfortunately it's mostly sequelbait and the sequels don't seem to be very good.

  • Too Like the Lightning by Ada Palmer (dropped it half-way through). Wat. My reaction to this book is just pure bewilderment. I love Ada Palmer's blog, but wtf is going on here? Am I supposed to be laughing at the terrible narrator and his horrifically bad similes? Is it for children? The magical boy protagonist and philosophy 101 stuff certainly seems to indicate so. Or maybe "young adults"? What's with the nonsensical worldbuilding (an SF/fantasy future that worships 18th century philosophers, with absurd coincidences piled on top of each other)? And apparently none of the plot is resolved by the end of the book! The whole thing reminded me of the "taxation of trade routes" stuff from the prequels, and this image kept popping into my head:




On the Pension Apocalypse

Aging populations, archaic pay-as-you-go systems, and undercapitalized pension funds will create huge problems for future retirees. Just how bad is it, and what should you do about it?

The Situation

In the past there were many workers and few retirees, so it seemed like a good idea to have the workers pay for old people's pensions and promise them the same in return. Thus the pay-as-you-go pension system was born.1 But people stopped having children, started living longer, and the worker:retiree ratio has been falling and will continue to fall precipitously. These problems will be coming home to roost over the next few decades.

To put things into perspective: simply maintaining the current prime:aged ratio would require 383 million additional prime-aged people by 2050. The math is clear, and even if fertility tripled tomorrow morning there's a huge lag until that actually starts affecting the economy.

How much will it cost? It's hard to say exactly: the projections depend on fertility, longevity, immigration, growth, and the actual pensions. Plus there are non-pension expenses to take into account: government-funded healthcare spending on retirees is going to increase as well. On the low end some (including the EU) project an increase in spending of just ~3% of GDP, but I find that highly implausible. My own forecast would be around 10% of GDP for the average advanced economy by 2050.

For countries with relatively low government spending and good growth prospects like the US this might not be a problem. For European countries that already have government spending in the 55%+ of GDP range however, things look dire.2 Raising an additional 10% of GDP through taxation would result in a 20-25% cut in disposable income for the average worker for literally nothing in return. Combine that with low/zero growth and things start looking really bad.
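
A rough back-of-the-envelope check on that 20-25% figure. The sketch assumes (a deliberate simplification on my part, not a real tax model) that workers' disposable income is roughly the share of GDP the government does not already spend; the function name is just illustrative.

```python
def disposable_income_cut(gov_share, extra_tax):
    """Fractional fall in disposable income when extra_tax (as a share of GDP) is raised.

    Assumption: disposable income ~= (1 - gov_share) of GDP. This ignores who
    actually bears the tax, transfers paid back to workers, and behavioral effects.
    """
    disposable = 1.0 - gov_share
    return extra_tax / disposable


# Government spending in the 50-60% of GDP range, extra pension cost of 10% of GDP.
for gov_share in (0.50, 0.55, 0.60):
    cut = disposable_income_cut(gov_share, 0.10)
    print(f"government at {gov_share:.0%} of GDP -> disposable income falls by {cut:.1%}")
```

Under that assumption, the cut is exactly 20% when the state already spends half of GDP and 25% when it spends 60%, which is where the range comes from.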

Anyone under the age of 40 or so should expect to receive little in return for their pay-as-you-go pension system contributions. Is it unfair that today's workers slave away, are forced to give away all their money to the boomers, only to receive virtually nothing in return? Sure. Is there anything you can do about it? No. Welcome to democracy.

Regional Variation

There is enormous variation in pension systems both between and within countries. Places with relatively small pay-as-you-go systems and heavy reliance on private pensions are probably going to be fine. On the other hand there are municipalities in the US which have already started defaulting.

EU

By 2050, the German workforce is expected to shrink by about 10 million people while the number of retirees will increase by about 7 million people. Most European countries should expect little to no GDP growth in the coming decades, as workforce declines will offset productivity gains. And most of Europe isn't seeing any productivity gains anyway (though some countries, such as Germany, have been growing):


Even more terrifying is the fact that nobody really seems to care about growth in Europe. There's this idea that the EU is ruled by technocrats, but these "technocrats" seem more concerned with adding annoying popups to every website than the permanent collapse of economic growth in the European Union.

Japan has had zero GDP growth since 1995 (which was also when its workforce was at its highest point), and Europe should expect a similar future. Here's what the Nikkei 225 has looked like over the past 3 decades, by the way:

The pie is no longer growing; all that's left is the fight over who gets the biggest piece. Sam Altman is right when he argues that zero-sum economics create a toxic political environment.

In a system with economic growth, things can improve for everyone. In a system without growth, or even one with very little growth, that’s not the case—if things improve for me, it has to come at the expense of things getting worse for you. Without growth, we’re voting against someone else’s interest as much as we’re voting for our own. This ends with lots of fighting and everyone feeling screwed, broken into factions, and unmotivated. Democracy does not work well in a zero-sum world.

People seem either unaware of what is to come or incapable of preparing for it. Even in prosperous countries like Germany and France, median savings are below €100k. The wealthiest German cohort, those aged 55-64, has median net wealth of €180k, and the younger generations don't seem to be in a hurry to save for retirement. 42% of Europeans have less than three months’ take-home pay saved.

Japan

Despite being ahead of the curve on aging, Japan is actually in a pretty good position as it only spends ~10% of GDP on pensions. Compare that to 17% in Italy, 14.5% in France, and 10% in Germany even though those places have significantly smaller retired populations.3 How do they do it? It's a pay-as-you-go system that simply doesn't pay out very much: the average pension is only ~$2k per month for a married couple. Could you live on that budget? Despite this, they are cutting pensions, increasing the retirement age, and finding ways to get older people to keep working.

It's also worth mentioning, however, that they've been running deficits for 30 years and have a debt/GDP ratio of over 230%. Total government spending has been hovering around 40% of GDP lately, so it would seem that they have room to increase taxes if it becomes necessary.

China

China is in a nightmarish demographic position and needs to maintain rapid growth despite a declining workforce. Their age pyramid is a time bomb that's about to explode:

In 2011, every pensioner was supported by 3.1 workers. By the end of 2017, that ratio had fallen to 2.8-to-one, and the Ministry estimates that by 2050, it will be just 1.3-to-one.
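
To get a feel for what those ratios mean per worker, here is a small sketch. The 40% replacement rate (average pension as a share of the average wage) is purely an illustrative assumption of mine, not a figure from the source, and the model ignores central government subsidies and pension fund reserves.

```python
def contribution_rate(workers_per_pensioner, replacement_rate):
    """Share of the average wage each worker must hand over in a pure
    pay-as-you-go system where pensions equal replacement_rate * average wage."""
    return replacement_rate / workers_per_pensioner


REPLACEMENT_RATE = 0.40  # hypothetical, for illustration only

# Worker:pensioner ratios quoted above.
RATIOS = {"2011": 3.1, "2017": 2.8, "2050 (projected)": 1.3}

for year, ratio in RATIOS.items():
    rate = contribution_rate(ratio, REPLACEMENT_RATE)
    print(f"{year}: {ratio} workers per pensioner -> {rate:.1%} of each wage")
```

Whatever replacement rate you plug in, the burden per worker rises by a factor of 3.1/1.3, or roughly 2.4x, between 2011 and 2050.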

In 2016 the one-child policy became the two-child policy. In 2021, the two-child policy became the three-child policy. But it's too late.

How long can China keep up the "outgrow the debt" strategy with a declining workforce? And what happens when growth stalls? This seems like one of the likelier scenarios for the next global recession. Of course many have predicted this collapse before, and they were wrong. But the demographic problem is unavoidable.

The retirement age is quite low: 60 for men and 55 for women; we can probably expect this to change, which will give them a bit of breathing room. But any such changes are wildly unpopular. On top of that, pension funds are already heavily reliant on additional funding from the central government.

USA

Given its low average age and strong growth, the US is in a decent position compared to the EU and China.

But there is a large amount of variation within the country: some local governments are doing perfectly fine, while others have serious problems with defined-benefit pensions for public employees. Politicians have been promising generous pensions without bothering to fund them (with the assistance of absurd return assumptions from the funds): pensions give them the ability to offer huge payouts to special interest groups without impacting the budget immediately. The logic of public choice is so clear that there is only one really serious question left, and that is why states haven't collapsed already.

As these pensions start taking up a larger percentage of state/local revenues, things will come to a head. In Illinois, for example, pensions took up about 4% of the budget in the 90s. Today it's 25% and growing. There are three alternatives, all painful: cut pensions, cut other services, or start raising taxes. How much of that will people tolerate before they start moving out?

If this were simply a horrific problem that we were trying to deal with, it would be bad enough. But it's a horrific problem that we are ignoring, and will continue ignoring until it blows up in our faces. In the middle of the longest bull market in US stock market history, pension deficits have ballooned:

Just imagine what a decade of weak stock market returns would do.

At the federal level, Social Security has about 15 years until it has to start cutting benefits, but it won't be that expensive to shore it up. And most importantly, the US is growing, and has a lot more room left for tax increases.

What Governments Can Do

How will governments respond to the pension apocalypse? All the alternatives seem bad: pension cuts, big tax increases, vast borrowing, inflation, unprecedented immigration. Nobody wants to do any of these things, but the math must eventually balance out. In the end something's gotta give. This survey of Europeans captures the heart of the problem:

When it comes to the measures required, even those respondents who acknowledge the threat of demographic problems appear to be fairly reluctant to endorse them: most of the reform proposals are refused by the majority.

Everyone understands that governments either need to tax more or pay out less, but people aren't ready to accept either solution. Just 46% support a system that combines basic public pensions with private savings! Even conservatives in America hate the idea of cuts: just 15% of Republicans support Medicare spending cuts, while 10% support Social Security cuts. And when you spend $2T on "stimulus" at a time when there is no aggregate demand (AD) shortfall, how are you going to close the taps later? With such large political costs (old people are sympathetic, numerous, and politically influential),4 few politicians are willing to take the necessary steps. And the worse the worker:retiree ratio, the more political power the retirees have: this is not a self-balancing problem.

The example of Japan shows that these problems are not insurmountable, as long as politicians are willing to make difficult choices (and the people accept those choices). The earlier reforms are enacted, the easier things will go, but in most places I expect it will be impossible until a breaking point is reached. Maybe in the end we'll just get a little bit of everything and the math will balance out. But someone is going to have to make sacrifices.

The biggest danger comes not from the pension apocalypse itself, but rather from the stupid things politicians might do to avoid addressing the pension problem head-on. Some possible scenarios:

  • Huge tax increases → mass emigration → death spiral
  • Central banks monetize debt → hyperinflation → economic crash
  • Central banks don't monetize debt → debt crisis → Greece 2.0

Grow

You can think of pension liabilities like debt: you can keep growing it forever without a problem as long as you also grow your economy quickly enough. We can talk about progress studies as much as we want, but the practical reality on the ground is not encouraging when it comes to growth. Especially in Europe, it is more a distant dream than a real possibility. And things are slowing down even in the US.
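
The debt analogy can be made concrete with a toy calculation. All the numbers below are hypothetical and the function is my own illustration: it just tracks liabilities as a share of GDP when both grow at constant rates.

```python
def liability_ratio_path(initial_ratio, liability_growth, gdp_growth, years):
    """Liabilities/GDP over time when both compound at fixed annual rates.

    Each year the ratio is multiplied by (1 + liability_growth) / (1 + gdp_growth),
    so it is stable only if the economy grows at least as fast as the liabilities.
    """
    ratio = initial_ratio
    path = [ratio]
    for _ in range(years):
        ratio *= (1 + liability_growth) / (1 + gdp_growth)
        path.append(ratio)
    return path


# Hypothetical scenarios: liabilities grow 4% a year in both.
stagnant = liability_ratio_path(1.0, 0.04, 0.01, 30)  # 1% GDP growth
growing = liability_ratio_path(1.0, 0.04, 0.04, 30)   # 4% GDP growth

print(f"after 30 years with 1% growth: {stagnant[-1]:.2f}x GDP")
print(f"after 30 years with 4% growth: {growing[-1]:.2f}x GDP")
```

With matching growth rates the ratio never moves; with stagnation it compounds against you every single year.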

China has no alternative, and so far it seems to be succeeding against all expectations (though the data is fake to some extent, see this and this). We'll see how long they can keep it up.

Pay Out Less

One possibility is, of course, a straight cut to pensions. But you have to keep in mind that old people tend to vote at higher rates than young people, and that due to demographic collapse the old people will be the most powerful voting bloc in these countries. People get angry when you cut spending.5 They get especially angry when they have paid in quite a lot of money to the pension system and will not see much in return. Even the best-managed systems (like the Dutch) will be running into trouble though.

Raise the Retirement Age

Instead of paying out less, you can try to raise the retirement age. This not only decreases the total amount you need to pay, but also props up the worker:retiree ratio. It also has the benefit of not affecting current retirees: bypassing that powerful bloc makes changes easier to implement from a political perspective. But people in surveys say they expect to retire around 63, so I don't know how politically viable this plan is going to be in practice.

For example, Denmark plans to raise the retirement age in step with increases in life expectancy. Under this model, a Danish worker born in 1990 can expect "early retirement" at 70 and normal retirement at 73!

To which I say: fuck off and die.

Edit: after some conversations I have decided that raising the retirement age might not be that bad. Lots of people are still able and willing to work in their 60s and 70s. The best solution would probably be a flexible system in which people can choose when to retire, and the benefits adjust accordingly (the earlier you stop working, the less you get).

Tax

Raising taxes is another possibility, but how much slack is there in income taxation given a declining base? The US (which is currently at <40% government spending/GDP) has a lot of wiggle room, and you could even say the same about China. But for Europe you have to figure that at some point they'll be hitting the downward slope of the Laffer curve. Emigration is easier than ever and the people with the greatest ability to work remotely also tend to be those who are most desirable from a fiscal perspective.

Immigration

The sheer number of people needed makes immigration a partial solution at best, and only a few countries use immigration in a way that actually helps. Canada, New Zealand, Australia, and Switzerland for example have fairly reasonable immigration policies that select for high human capital: Canada has the smartest immigrants in the world (average PISA math scores of 527, higher than the natives' and corresponding to an IQ around 103). But despite high population growth and productive immigrants, Canada still faces a shortfall in the near future.

Needless to say, immigration policies that select for low human capital (US: average 1st gen immigrant PISA math score 437, corresponding to an IQ around 91) only make the problem worse. In Europe, non-EU migrants are less likely to be employed and earn much less than Europeans when they are employed. You can't fill a fiscal hole by adding more fiscal burdens to your society.6

There is astonishingly little international competition for productive people, but I think that is going to change in the future. This process has already started, with some countries offering digital nomad visas, sometimes with tax incentives on top. In Italy some cities will pay half your rent. I imagine there will be calls for coordination to prevent a "race to the bottom", but I doubt there will be any kind of global agreement on the matter.

Debt

Hell, if zero rates persist you could just fund the whole thing with debt. Forecasts of rising interest rates have been a complete meme for more than a decade now; maybe free money is the new normal. But how long can this last? At high levels of debt/GDP it only takes a small rise in rates to create serious problems (and possibly trigger debt crises). Perhaps the "solution" to rising rates will be inflation, just kicking the can even further down the road.
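A back-of-the-envelope sketch of why the debt level matters so much here: the annual interest bill as a share of GDP is roughly debt/GDP times the average interest rate on that debt. The function name and the rates below are hypothetical, and the sketch assumes the whole debt stock reprices at the new rate, which overstates how quickly the pain arrives in practice, since existing bonds only roll over gradually.

```python
def interest_bill_share_of_gdp(debt_to_gdp, avg_interest_rate):
    """Annual interest payments as a fraction of GDP (simplified)."""
    return debt_to_gdp * avg_interest_rate


# Hypothetical scenario: average rates move from 1% to 3%.
for debt_to_gdp in (0.5, 1.0, 2.0):
    before = interest_bill_share_of_gdp(debt_to_gdp, 0.01)
    after = interest_bill_share_of_gdp(debt_to_gdp, 0.03)
    print(
        f"debt/GDP {debt_to_gdp:.0%}: interest bill rises by "
        f"{after - before:.1%} of GDP"
    )
```

The same two-point rise in rates costs four times as much, relative to GDP, at 200% debt/GDP as it does at 50%.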

The Andrew Dobson Gambit

Inflate!

How much willingness for debt monetization is there among independent central banks? Probably not much (who knows though; remember "no bailouts"? lol). On the other hand, how long can CBs retain their independence against mounting political pressure? What's more unpopular, high inflation or pension cuts? This seems like a fairly unlikely scenario.

Transition to DC Plans

The countries that are best prepared have a well-funded basic public pension system that makes sure old people don't starve, combined with defined-contribution pensions. Governments with large defined-benefit plans will either need to take serious pain, or start transitioning to defined-contribution plans. The problem with making this transition is that it's expensive immediately, and extremely difficult politically. They tried to do it in Illinois and it was shot down by the courts:

Under the Illinois Supreme Court’s 2015 precedent, a government worker’s pension benefits cannot be changed in any way after their first day working for the state.

If you thought pensioners were a powerful lobby, wait till you see what public employees get away with.

What You Can Do

First of all, understand that you need to save for retirement.

After that, just follow the standard boring investing advice. Right now is not a great time (high valuations after a 12-year bull market that quintupled the S&P500, near-zero bond yields), but I'm sure there will be good opportunities in the decades to come. The safe withdrawal rate (how much you can withdraw from your investments every year without running out of money before you die) is generally held to be around 3-4%. If you can get by on $30k/year, you'll need about $1m in investments at a 3% withdrawal rate. That number has to be adjusted for inflation: assuming you retire in 2060 and inflation runs at 2%, that's $2.2m in 2060-dollars. Getting there isn't that difficult: saving $10k/year for 40 years with 7% annual returns will get you to $2m. The earlier you start the better.
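If you want to plug in your own numbers, the arithmetic above can be sketched in a few lines. The function names are mine, the 2021 starting year is an assumption (the text only gives the 2060 end date), and the savings formula assumes one deposit at the end of each year with a constant return, which real markets will not give you.

```python
def required_nest_egg(annual_spending, withdrawal_rate):
    """Portfolio size needed to fund annual_spending at a given withdrawal rate."""
    return annual_spending / withdrawal_rate


def inflate(amount, inflation, years):
    """Convert today's dollars into future dollars at constant inflation."""
    return amount * (1 + inflation) ** years


def future_value_of_savings(annual_saving, annual_return, years):
    """Value of equal end-of-year deposits compounding at a constant return."""
    return annual_saving * ((1 + annual_return) ** years - 1) / annual_return


START_YEAR = 2021  # assumption: roughly "now" in the text
target_today = required_nest_egg(30_000, 0.03)
target_2060 = inflate(target_today, 0.02, 2060 - START_YEAR)
saved = future_value_of_savings(10_000, 0.07, 40)

print(f"target in today's dollars: ${target_today:,.0f}")
print(f"target in 2060 dollars:    ${target_2060:,.0f}")
print(f"saved after 40 years:      ${saved:,.0f}")
```

The withdrawal rate is the most sensitive input: at 4% instead of 3%, the target in today's dollars drops from $1m to $750k.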

Where to put the money? I'd go with some sort of global equity ETF, perhaps with a tilt toward the US. Beware home equity bias, unless you're American.

What happens if the political demands generated by the collapse of pension systems end up causing a hyperinflationary scenario? As long as you have the money in real estate or equities, you'll probably be fine. German stocks actually did fine in the Weimar hyperinflation era (but only if you held them through an 80% drawdown).

It's an absolutely terrible time for bonds; don't be misled by the incredible bull market of the last 40 years. 60/40 is going to look much worse in the future. Inflation goes up, you're screwed; rates go up, you're screwed. The Greek 10y bond is currently yielding 0.824%, and this is a country with a debt/GDP ratio over 200%, a GDP 35% lower than it was 10 years ago, and a recent history of default. The bond market is absolutely nuts right now.

If things get bad enough you might want to protect against expropriation, which means international diversification. But I doubt things will get that bad.

Looking beyond investments, you could move to a cheaper country, which would allow you to get away with lower savings. There are nice places in SEA or South America that are both civilized and cheap. It's pretty easy for Norteamericanos and Europeans with some savings to get retiree visas. If you were retiring now, Argentina would be an interesting choice: very cheap due to the currency situation but still a safe & pleasant country. As long as your investments are in a stable currency, you can go wherever you want. On the other hand your home country might be unwilling to pay out even your meager pension if you don't actually live there, so plan accordingly.

You could also have more kids. The society-wide dependency ratio is going to be pretty bad, but if you have enough kids you could rely on your family dependency ratio instead. It's probably a bad idea to have kids as a retirement strategy, but if you're leaning in that direction already why not read Caplan's book and pop out another one?

What to Expect

Some countries have a political culture that allows for tough decisions to be made and accepted, but for everyone else I think the most likely course is to kick the can down the road and muddle along while the problems accumulate, until a crisis erupts.

Worst-Case Scenario

Weimar, then war? Probably not. Societies filled with old people don't do revolution or war. The age of bangs is over; we only have whimpers to look forward to. At worst we'll see a death spiral of stagnation, brain drain, expropriation, and perhaps devaluation/inflation. Think South America. They won't let you starve, but it won't be very nice either. If the catastrophe isn't global, and you manage to keep your portfolio out of their hands, you'll be fine.

Best-Case Scenario

Cheap fusion energy or friendly superhuman general artificial intelligence?7 If we get a significant increase in growth, the pension problems disappear.


  1. I believe the first such system was set up in Germany in 1889.
  2. They could theoretically cut spending on other things to compensate, but good luck with that.
  3. In Japan 28% of the population is >65 years old. That number is 23% in Italy and 21.5% in Germany.
  4. When 30%+ of the population is retirees, nobody's getting elected without their vote.
  5. "Expenditure cuts carry a significant risk of increasing the frequency of riots, anti-government demonstrations, general strikes, political assassinations, and attempts at revolutionary overthrow of the established order. [...] Once unrest erupts, governments quickly reverse course and increase spending in the following year".
  6. European immigration policy is such a mystery to me it might as well be a supernatural phenomenon. In the US at least you can explain it through the political motive. But what about Germany? Poor, unemployed migrants obviously don't vote CDU. If there is any intentionality at all behind European immigration policy (and there probably isn't) it must be based on a fundamental misunderstanding of how the welfare state works.
  7. Or, you know...the JvNs.