Two Paths to the Future

2021-03-23

[[ ]] Level-1 or world space is an anthropomorphically scaled, predominantly vision-configured, massively multi-slotted reality system that is obsolescing very rapidly.
Garbage time is running out.
Can what is playing you make it to level-2?

The history of the world is a history of s-curves stacked on top of each other. Here's one of my favorite charts, showing growth in how much cutting edge could be produced from 1kg of flint during the stone age:¹

Note the logarithmic y-axis—the growth is not exponential but hyperbolic! Of course what appears to be a singularity is really the lower half of an s-curve, but that was enough to trigger a paradigm shift to a new, higher curve: bronze. The stone-age singularity was the end of stone. Can what is playing you make it to level-2?

Where are we today? It's hard to say without the benefit of hindsight, but it sure feels like the upper half of an s-curve. Things are decelerating: economic growth is falling, fertility is cratering, and IQ shredders are quickly using up the finite biological substrate necessary to sustain our upward path. The world of 2020 is more or less the same as the world of 1920. We are richer, have better technology, but the fundamentals are quite similar: we're on the same curve. All we're doing is getting more blades from our flint.

The world of 2120 is going to be radically different. In exactly what way I cannot say, any more than a peasant in 1500 could predict the specifics of the industrial revolution. But it almost certainly involves unprecedented levels of growth as the constraints of the old paradigm are dissolved under the new one. One corollary to this view is that our long-term concerns (global warming, dysgenics, aging societies) are only relevant to the extent that they affect the arrival of the next paradigm.

There are two paths to the future: silicon, and DNA. Whichever comes first will determine how things play out. The response to the coronavirus pandemic has shown that current structures are doomed to fail against a serious adversary: if we want to have a chance against silicon, we need better people. That is why I think any AI "control" strategy not predicated on transhumanism is unserious.²

Our neolithic forefathers could not have divined the metallurgical destiny of their descendants, but today, perhaps for the first time in universal history, we can catch a glimpse of the next paradigm before it arrives. If you point your telescope in exactly the right direction and squint really hard, you can just make out the letters: "YOU'RE FUCKED".

Artificial Intelligence

Nothing human makes it out of the near-future.

There are two components to forecasting the emergence of superhuman AI. One is easy to predict: how much computational power we will have.³ The other is very difficult to predict: how much computational power will be required. Good forecasts are either based on past data, or generalization from theories constructed from past data. Because of their novelty, paradigm shifts are difficult to predict. We're in uncharted waters here. But there are two sources of information we can use: biological intelligence (brains, human or otherwise), and progress in the limited forms of artificial intelligence we have created thus far.⁴

ML progress

GPT-3 forced me to start taking AI concerns seriously. Two features make GPT-3 a scary sign of what's to come: scaling, and meta-learning. Scaling refers to gains in performance from increasing the number of parameters in a model. Here's a chart from the GPT-3 paper:

Meta-learning refers to the ability of a model to learn how to solve novel problems. GPT-3 was trained purely on next-word prediction, but developed a wide array of surprising problem-solving abilities, including translation, programming, arithmetic, literary style transfer, and SAT analogies. Here's another GPT-3 chart:

Put these two together and extrapolate, and it seems like a sufficiently large model trained on a diversity of tasks will eventually be capable of superhuman general reasoning abilities. As gwern puts it:

More concerningly, GPT-3’s scaling curves, unpredicted meta-learning, and success on various anti-AI challenges suggests that in terms of futurology, AI researchers’ forecasts are an emperor sans garments: they have no coherent model of how AI progress happens or why GPT-3 was possible or what specific achievements should cause alarm, where intelligence comes from, and do not learn from any falsified predictions.

GPT-3 is scary because it’s a magnificently obsolete architecture from early 2018 (used mostly for software engineering convenience as the infrastructure has been debugged), which is small & shallow compared to what’s possible, on tiny data (fits on a laptop), sampled in a dumb way⁠, its benchmark performance sabotaged by bad prompts & data encoding problems (especially arithmetic & commonsense reasoning), and yet, the first version already manifests crazy runtime meta-learning—and the scaling curves still are not bending!

Still, extrapolating ML performance is problematic because it's inevitably an extrapolation of performance on a particular set of benchmarks. Lukas Finnveden, for example, argues that a model similar to GPT-3 but 100x larger could reach "optimal" performance on the relevant benchmarks. But would optimal performance correspond to an agentic, superhuman, general intelligence? What we're really interested is surprising performances in hard-to-measure domains, long-term planning, etc. So while these benchmarks might be suggestive (especially compared to human performance on the same benchmark), and may offer some useful clues in terms of scaling performance, I don't think we can rely too much on them—the error bars are wide in both directions.

Brains

The human brain is an existence proof not only for the possibility of general intelligence, but also for the existence of a process—evolution—that can design general intelligences with no knowledge of how minds operate. The brain and its evolution offer some hints about what it takes to build a general intelligence:

Number of synapses in the brain.
Computational power (~1e15 FLOPS/s).
Energy consumption (~20 watts).
Required computation for the training of individual brains.
Required evolutionary "computation" to generate brains.

The most straight-forward comparison is synapses to parameters. GPT-3 has 1.75e11 parameters compared to ~1e15 synapses in the brain, so it's still 4 orders of magnitude off, and a model parameter is not a perfect analogue to a synapse, so this is more of an extreme lower limit rather than a mean estimate.

The brain also presents some interesting problems when it comes to scaling, as some aspects of human evolution suggest the scaling hypothesis might be wrong. While the human brain got bigger during the last few million years, it's only about 3-4x the size of a chimpanzee brain—compare to the orders of magnitude we're talking about in ML. On the other hand, if we look at individual differences today, brain volume and IQ correlate at ~0.4. Not a very strong correlation, but it does suggest that you can just throw more neurons at a problem and get better results.

How do we reconcile these facts pointing in different directions? Sensorimotor skills are expensive and brain size scales with body mass: whales might have massive brains compared to us, but that doesn't give them anywhere near the same intellectual capabilities. So what appears to be a relatively small increase compared to chimps is actually a large increase in brain volume not dedicated to sensorimotor abilities.

How much power will we have?

Compute use has increased by about 10 orders of magnitude in the last 20 years, and that growth has accelerated lately, currently doubling approximately every 3.5 months. A big lesson from the pandemic is that people are bad at reasoning about exponential curves, so let's put it in a different way: training GPT-3 cost approximately 0.000005%⁵ of world GDP. Go on, count the zeroes. Count the orders of magnitude. Do the math! There is plenty of room for scaling, if it works.

The main constraint is government willingness to fund AI projects. If they take it seriously, we can probably get 6 orders of magnitude just by spending more money. GPT-3 took 3.14e23 FLOPs to train, so if strong AGI can be had for less than 1e30 FLOPs it might happen soon. Realistically any such project would have to start by building fabs to make the chips needed, so even if we started today we're talking 5+ years at the earliest.

Looking into the near future, I'd predict that by 2040 we could squeeze another 1-2 orders of magnitude out of hardware improvements. Beyond that, growth in available compute would slow down to the level of economic growth plus hardware improvements.

Putting it all together

The best attempt at AGI forecasting I know of is Ajeya Cotra's heroic 4-part 168-page Forecasting TAI with biological anchors. She breaks down the problem into a number of different approaches, then combines the resulting distributions into a single forecast. The resulting distribution is appropriately wide: we're not talking about ±15% but ±15 orders of magnitude.

Her results can be summarized in two charts. First, the distribution of how many FLOPs it will take to train a "transformative" AI:

I highly recommend checking out Cotra's work if you're interested in the details behind each of those forecasts. This distribution is then combined with projections of our computational capacity to generate AI timelines. Cotra's estimate (which I find plausible) is ~30% by 2040, ~50% by 2050, and ~80% by 2100:

Metaculus has a couple of questions on AGI, and the answers are quite similar to Cotra's projections. This question is about "human-machine intelligence parity" as judged by three graduate students; the community gives a 54% chance of it happening by 2040. This one is based on the Turing test, the SAT, and a couple of ML benchmarks, and the median prediction is 2038, with an 83% chance of it coming before 2100. Here's the cumulative probability chart:

Both extremes should be taken into account: we must prepare for the possibility that AI will arrive very soon, while also tending to our long-term problems in case it takes more than a century.

Human Enhancement

All things change in a dynamic environment. Your effort to remain what you are is what limits you.

The second path to the future involves making better humans. Ignoring the AI control question for a moment, better humans would be incredibly valuable to the rest of us purely for the positive externalities of their intelligence: smart people produce benefits for everyone else in the form of greater innovation, faster growth, and better governance. The main constraint to growth is intelligence, and small differences cause large effects: a standard deviation in national averages is the difference between a cutting-edge technological economy and not having reliable water and power. While capitalism has ruthlessly optimized the productivity of everything around us, the single most important input—human labor—has remained stagnant. Unlocking this potential would create unprecedented levels of growth.

Above all, transhumanism might give us a fighting chance against AI. How likely are they to win that fight? I have no idea, but their odds must be better than ours. The pessimistic scenario is that enhanced humans are still limited by numbers and meat, while artificial intelligences are only limited by energy and efficiency, both of which could potentially scale quickly.

The most important thing to understand about the race between DNA and silicon is that there's a long lag to human enhancement. Imagine the best-case scenario in which we start producing enhanced humans today: how long until they start seriously contributing? 20, 25 years? They would not be competing against the AI of today, but against the AI from 20-25 years in the future. Regardless of the method we choose, if superhuman AGI arrives in 2040, it's already too late. If it arrives in 2050, we have a tiny bit of wiggle room.⁶

Let's take a look at our options.

Normal Breeding with Selection for Intelligence

Way too slow and extraordinarily unpopular.

Gene Editing

With our current technology, editing is not relevant. It's probably fine for very simple edits, but intelligence is a massively polygenic trait and would require far too many changes. However, people are working on multiplex editing, reducing the odds of off-target edits, etc. Perhaps eventually we will even be able to introduce entirely new sequences with large effects, bypassing the polygenicity problem. More realistically, in a couple of decades editing might be a viable option, and it does have the benefit of letting people have their own kids in a relatively normal way (except better).

Cyborgs

Seems unlikely to work. Even if we solve the interface problem it's unclear how a computer without superhuman general reasoning capabilities could enhance the important parts of human thought to superhuman levels. If, instead of relying on the independent capabilities of an external computer, a hypothetical brain-computer interface could act as a brain "expander" adding more synapses, that might work, but it seems extremely futuristic even compared to the other ideas in this post. Keep in mind even our largest (slowest, most expensive) neural networks are still tiny compared to brains.

Iterated Embryo Selection

The basic idea is to do IVF, sequence the embryos, select the best one, and then instead of implanting the embryo, you use its DNA for another round of IVF. Iteration adds enormous value compared to simply picking the best embryos: Shulman and Bostrom estimate that single-stage selection of 1 in 1000 embryos would add 24.3 IQ points on average. On the other hand, 10 generations of 1-in-10 selection would add ~130 points. On top of the IQ gains we could also optimize other traits (health, longevity, altruism) at the same time. This is the method with the highest potential gains, as the only limit would be physiological or computational. And IES wouldn't be starting at 100: if we use Nobel laureates, we've already got ~3 SDs in the bag.

How far can IES go? There are vast amounts of variation for selection to work on, and therefore vast room for improvement over the status quo. Selection for weight in broiler chickens has resulted in rapid gain of 10+ standard deviations, and I believe we can expect similar results for selection on intelligence. Chicken size is probably not a perfect parallel to intelligence, and I'm sure that eventually we will start hitting some limits as additivity starts to fail or physical limitations impose themselves. But I don't think there are any good reasons to think that these limits are close to the current distribution of human intelligence.

However, there are some problems:

IVF is expensive and has a low success rate.
Gametogenesis from stem cells isn't there yet (but people are working on it).
Sequencing every embryo still costs ~$500 and our PGSs⁷ aren't great yet.
The process might still take several months per generation.

None of these are insoluble. Sequencing costs are dropping rapidly, and even with today's prices the costs aren't that bad: gwern estimates $9k per IQ point. He says it's expensive, but it would be very easily worth it from a societal perspective. The biggest downside is the time it would take: say 5 years to make IES tech viable, another 5 years for 10 rounds of selection,⁸ and we're at 2050 before they're grown up. Still, it's definitely worth pursuing, at the very least as a backup strategy. After all there's still a good chance we won't reach superhuman AGI by 2050.

How likely is any of this to happen? Metaculus is very pessimistic even on simple IVF selection for IQ, let alone IES, with the median prediction for the first 100 IQ-selected babies being the year 2036. Here's the cumulative probability chart:

That seems late to me. Perhaps I'm underestimating the taboo against selecting on IQ, but even then there's a whole world out there, many different cultures with fewer qualms. Then again, not even China used human challenge trials against covid.

Cloning

I believe the best choice is cloning. More specifically, cloning John von Neumann one million times.⁹ No need for editing because all the good genes are already there; no need for long generation times because again, the genes are already there, just waiting, chomping at the bit, desperately yearning to be instantiated into a person and conquer the next frontiers of science. You could do a bunch of other people too, diversity of talents and interests and so on, but really let's just start with a bunch of JvNs.¹⁰

Why JvN? Lots of Nobel laureates in physics were awed by his intelligence (Hans Bethe said his brain indicated "a species superior to that of man"). He was creative, flexible, and interested in a fairly wide array of fields (including computing and artificial intelligence).¹¹ He was stable and socially competent. And he had the kind of cosmic ambition that we seem to be lacking these days: "All processes that are stable we shall predict. All processes that are unstable we shall control." As for downsides: he died young of cancer,¹² he was not particularly photogenic, and he loved to eat and thought exercise was "nonsense".

There are commercial dog- and horse-cloning operations today. We've even cloned primates. The costs, frankly, are completely trivial. A cloned horse costs $85k! Undoubtedly the first JvN would cost much more than that, but since we're making a million of them we can expect economies of scale. I bet we could do it for $200 billion, or less than half what the US has spent on the F35 thus far. Compared to the predicted cost of training superhuman AGI, the cost of one million JvNs is at the extreme lower end of the AI forecasts, about 5 orders of magnitude above GPT-3.

Let's try to quantify the present value of a JvN, starting with a rough and extremely conservative estimate: say von Neumann had an IQ of 180¹³ and we clone a million of them. Due to regression to the mean, the average clone will be less intelligent than the original; assuming heritability of .85, we'd expect an average IQ of 168. This would increase average US IQ by 0.21 points. Going by Jones & Schneider's estimate of a 6% increase in GDP per point in national IQ, we might expect about 1.26% higher output. If the effect was immediate, that would be about $270 billion per year, but we should probably expect a few decades before it took effect. Even this extremely modest estimate puts the value of a JvN at about $270k per year. Assuming the JvNs become productive after 20 years, affect GDP for 50 years, 2% GDP growth, and a 3% discount rate, the present value of a JvN is around $8.4 million. The cost:benefit ratio is off the charts.¹⁴

However, I think this is a significant under-estimate. Intuitively, we would expect that governance and innovation are more heavily influenced by the cognitive elite rather than the mean. And there is decent empirical evidence to support this view, generally showing that the elite is more important. As Kodila-Tedika, Rindermann & Christainsen put it, "cognitive capitalism is built upon intellectual classes." One problem with these studies is that they only look at the 95th percentile of the IQ distribution, but what we really care about here is the 99.9th percentile. And there are no countries with a million 170+ IQ people, so the best we can do is extrapolate from slight deviations from the normal distribution.

Another piece of evidence worth looking at is Robertson et al.'s data on outcomes among the top 1% within the US. They took boys with IQs above the 99th percentile and split them into four groups (99.00-99.25th percentile, etc.), then looked at the outcomes of each group:

Simply going from an IQ of ~135 (99th percentile) to ~142+ (99.75th percentile) has enormous effects in terms of income, innovation, and science. Extrapolate that to 168. Times a million. However, the only variable here that's directly quantifiable in dollar terms is >95th percentile income and what we're really interested in are the externalities. Undoubtedly the innovation that goes into those patents, and the discoveries that go into those publications create societal value. But putting that value into numbers is difficult. If I had to make a wild guess, I'd multiply our naïve $8.4 million estimate by 5, for a present value of $42 million per JvN.

That is all assuming a permanent increase in the level of output. A more realistic estimate would include faster growth. Beyond the faster growth due to their discoveries and innovations, the JvNs would also 1) help make genetic enhancement universal (even light forms like single-stage embryo selection would increase world GDP by several multiples) and, most importantly, 2) accelerate progress toward superhuman AI, and (hopefully) tame it. To take a wild guess at how much closer the JvNs would bring us to AGI: they would give us an order of magnitude more in spending, an order of magnitude faster hardware, and an order of magnitude better training algorithms, for a total increase in effective computational resources of 1000x. Here's how Cotra's average AI timeline would be accelerated, assuming the JvNs deploy that extra power between 2045 and 2055:

US GDP growth has averaged about 2.9% in the 21st century; suppose the JvNs triple the growth rate for the rest of the century:¹⁵ it would result in a cumulative increase in output with a present value of approximately $9 quadrillion, or $9 billion per JvN.¹⁶

At a time when we're spending trillions on projects with comically negative NPV, this is probably the best investment a government could possibly make. It's also an investment that will almost certainly never come from the private sector because the benefits are impossible to capture for a private entity.

Would there be declining marginal value to JvNs? It seems likely, as there's only so much low-hanging fruit they could pick before the remaining problems became too difficult for them. And a JvN intellectual monoculture might miss things that a more varied scientific ecosystem would catch. That would be an argument for intellectual diversity in our cloning efforts. Assuming interests are highly heritable we should probably also re-create leading biologists, chemists, engineers, entrepreneurs, and so on.¹⁷ On the other hand there might be o-ring style network effects which push in the opposite direction. There's also a large literature on how industrial clusters boost productivity and create economies of scale; perhaps there are such clusters in scientific output as well.

As for the problems with this idea: human cloning has about the same approval rating that Trump has among Democrats.¹⁸ We can expect the typical trouble with implanting embryos. We would need either artificial wombs or enough women willing to take on the burden. And worst of all, unlike IES, cloning has a hard upside limit.

A Kind of Solution

I visualise a time when we will be to robots what dogs are to humans, and I’m rooting for the machines.

Let's revisit the AI timelines and compare them to transhumanist timelines.

If strong AGI can be had for less than 1e30 FLOPs, it's almost certainly happening before 2040—the race is already over.
If strong AGI requires more than 1e40 FLOPs, people alive today probably won't live to see it, and there's ample time for preparation and human enhancement.
If it falls within that 1e30-1e40 range (and our forecasts, crude as they are, indicate that's probable) then the race is on.

Even if you think there's only a small probability of this being right, it's worth preparing for. Even if AGI is a fantasy, transhumanism is easily worth it purely on its own merits. And if it helps us avoid extinction at the hand of the machines, all the better!

So how is it actually going to play out? Expecting septuagenarian politicians to anticipate wild technological changes and do something incredibly expensive and unpopular today for a hypothetical benefit that may or may not materialize decades down the line—is simply not realistic. Right now from a government perspective these questions might as well not exist; politicians live in the current paradigm and expect it to continue indefinitely. On the other hand, the Manhattan Project shows us that immediate existential threats have the power to get things moving very quickly. In 1939, Fermi estimated a 10% probability that a nuclear bomb could be built; 6 years later it was being dropped on Japan.

My guess is that some event makes AI/transhumanism (geo)politically salient, which will trigger a race between the US and China, causing an enormous influx of money.¹⁹ Perhaps something like the Einstein–Szilárd letter, perhaps some rogue scientist doing something crazy. Unlike the Manhattan Project, in today's interconnected world I doubt it could happen in secret: people would quickly notice if all the top geneticists and/or ML researchers suddenly went dark. From a geopolitical perspective, He Jiankui's 3 year jail sentence might be thought of as similar to Khrushchev removing the missiles from Cuba: Xi sending a message of de-escalation to make sure things don't get out of hand. Why does he want to de-escalate? Because China would get crushed if it came to a race between them and the US today. But in a decade or two? Who knows what BGI will be capable of by then. It's probably in the West's interests to make it happen ASAP.

This all has an uncomfortable air of "waiting for the barbarians", doesn't it? There's a handful of people involved in immanentizing the eschaton while the rest of us are mere spectators. Are we supposed to twiddle our thumbs while they get on with it? Do we just keep making our little flint blades? And what if the barbarians never arrive? Maybe we ought to pretend they're not coming, and just go on with our lives.²⁰

The race to the future is not some hypothetical science fiction thing that your grand-grand-grandkids might have to worry about. It's on, right now, and we're getting creamed. In an interview about in vitro gametogenesis, Mitinori Saitou said “Technology is always technology. How you use it? I think society probably decides.” Technology is always technology, but the choice is illusory. What happened to stone age groups that did not follow the latest innovations in knapping? What happened to groups that did not embrace bronze? The only choice we have is: up, or out. And that is no choice at all.

1.Meyer & Vallée, The Dynamics of Long-Term Growth. ↩
2.I also suspect that enslaving God is a bad idea even if it works. ↩
3.It comes down to hardware improvements and economic resources, two variables that are highly predictable (at least over the near future). ↩
4."But Alvaro, I don't believe we will build AGI at all!" I can barely tell apart parodies from "serious" anti-AI arguments any more. There's nothing magical about either meat or evolution; we will replicate their work. If you can find any doubters out there who correctly predicted GPT-3's scaling and meta-learning performance, and still doubt, then I'd be interested in what they have to say. ↩
5.How much could we spend? 0.5% per year for a decade seems very reasonable. Apollo cost about ~0.25% of US GDP per year and that had much more limited benefits. ↩
6.This is all predicated on the supposition that enhanced humans would side with humanity against the machines—if they don't then we're definitely no more than a biological bootloader. ↩
7.Polygenic score, a measure of the aggregate effect of relevant genetic variants in a given genome on some phenotypic trait. ↩
8.Alternatively we could start implanting from every generation on a rolling basis. ↩
9.If you can do it once there's no reason not to do it at scale. ↩
10.And maybe a few hundred Lee Kuan Yews for various government posts. ↩
11.He may even have been the first to introduce the idea of the technological singularity? Ulam on a conversation with JvN: "One conversation centered on the ever accelerating progress of technology and changes in the mode of human life, which gives the appearance of approaching some essential singularity in the history of the race beyond which human affairs, as we know them, could not continue." ↩
12.Might not be genetic though: "The cancer was possibly caused by his radiation exposure during his time in Los Alamos National Laboratory." ↩
13.That's about 1 in 20 million. ↩
14.Should we expect a million von Neumanns to affect the level or growth rate of GDP? Probably both, but I'm trying to keep it as simple and conservative as possible for now. ↩
15.A metaculus question on the economic effects of human-level AI predicts a 77% chance of >30% GDP growth at least once in the 15 years after the AI is introduced. ↩
16.Of course this ignores all the benefits that do not appear in GDP calculations. Technology, health, lack of crime, and so on. Just imagine how much better wikipedia will be! ↩
17.No bioethicists though. ↩
18.How much of this is driven by people averse to rich people selfishly cloning themselves? Perhaps the prosocial and rather abstract idea of cloning a million von Neumanns would fare better. One clone is a tragedy; a million clones is a statistic. ↩
19.Europe is (of course) not even a player in this game. I wouldn't trust Mutti Merkel and her merry band of eurocrats to boil a pot of water let alone operate a transhumanist program. ↩
20.In Disco Elysium the world is quickly being destroyed by a supernatural substance known as the "pale": they don't know how much time they have left, but it's on the order of a few decades. Everyone goes on with their petty squabbles regardless. The more I think about it, it seems like less of a fantasy and more of a mirror. How long until the last generation? ↩

Princeton Cemetery

2021-03-22

Three men jump over the fence of Princeton Cemetery under the cover of a moonless night. They know where they're going and quickly reach their target. After a few minutes of methodical digging they hit the bones. It's the head they really want, but they take a few more just to be sure. As the van drives off, a thermite charge annihilates everything left behind. An hour later they reach the coast and take a dinghy out to meet the submarine that is going to take them back to Shanghai.

Within a week, geneticists have extracted the DNA and started implanting embryos in artificial wombs. Production quickly ramps up, and plans are made for additional cohorts. Two years later, doubts begin to emerge when the babies still have trouble walking. By the age of 3 there's no question: the children look like him but they're developmentally unremarkable. The decision is made to liquidate the entire cohort.

Meanwhile, deep beneath the Nevada desert, an immense chamber fills with young boys as morning class is about to begin. Thousands of 5-year-old von Neumanns sit down at their desks and open their calculus textbooks.

Urne-Buriall in Tlön and the Sedulous Ape

2021-02-09

At the end of Jorge Luis Borges's Tlön, Uqbar, Orbis Tertius, the Tlönian conspiracy is taking over the world and the narrator retreats into esoteric literary pursuits. The final paragraph reads:

Then English and French and mere Spanish will disappear from the globe. The world will be Tlön. I pay no attention to all this and go on revising, in the still days at the Adrogue hotel, an uncertain Quevedian translation (which I do not intend to publish) of Browne's Urn Burial.

Who's Browne? What is Urn Burial? And what does it have to do with the story?

Browne & Urne-Buriall

Sir Thomas Browne was born in 1605, died in 1682, and was trained as a doctor but is remembered mostly as a writer. He coined a huge number of words including "cryptography", "electricity", "holocaust", "suicide", and "ultimate". He personally embodied the transition from the Renaissance to the Enlightenment—a skeptical polymath with an interest in science, he published Pseudodoxia Epidemica, a book that refuted all sorts of popular superstitions. At the same time he believed in witches, alchemy, and astrology.

Hydriotaphia, Urne-Buriall, or, a Discourse of the Sepulchrall Urnes lately found in Norfolk is his most influential work.¹ On the occasion of the discovery of some ancient burial urns, he launches into an examination of funerary customs across the world and ages, various cultures' ideas about death and the afterlife, and the ephemerality of posthumous fame. But the essay is mainly famous for its style rather than its content: endless sentences, over-the-top abuses of Latin, striking metaphors and imagery. It "smells in every word of the sepulchre", wrote Emerson. Browne begins in a dry, almost anthropological mode:

In a Field of old Walsingham, not many moneths past, were digged up between fourty and fifty Urnes, deposited in a dry and sandy soile, not a yard deep, nor farre from one another: Not all strictly of one figure, but most answering these described: Some containing two pounds of bones, distinguishable in skulls, ribs, jawes, thigh-bones, and teeth, with fresh impressions of their combustion.

The essay then slowly escalates...

To be gnaw’d out of our graves, to have our sculs made drinking-bowls, and our bones turned into Pipes, to delight and sport our Enemies, are Tragicall abominations, escaped in burning Burials.

...and reaches a crescendo in the final chapter:

And therefore restlesse inquietude for the diuturnity of our memories unto present considerations seems a vanity almost out of date, and superanuated peece of folly. We cannot hope to live so long in our names as some have done in their persons, one face of Janus holds no proportion unto the other. ’Tis too late to be ambitious. The great mutations of the world are acted, our time may be too short for our designes.

There is no antidote against the Opium of time, which temporally considereth all things; Our Fathers finde their graves in our short memories, and sadly tell us how we may be buried in our Survivors. Grave-stones tell truth scarce fourty years: Generations passe while some trees stand, and old Families last not three Oaks. To be read by bare Inscriptions like many in Gruter, to hope for Eternity by Ænigmaticall Epithetes, or first letters of our names, to be studied by Antiquaries, who we were, and have new Names given us like many of the Mummies, are cold consolations unto the Students of perpetuity, even by everlasting Languages. [...] In vain we compute our felicities by the advantage of our good names, since bad have equall durations; and Thersites is like to live as long as Agamemnon. Who knows whether the best of men be known? or whether there be not more remarkable persons forgot, than any that stand remembred in the known account of time?

Borges & Browne

The ending of Tlön refers to a real event. When Borges and Bioy-Casares were young, they really did produce a Quevedian² translation of Browne's Urn Burial which they did not publish. Browne was a significant stylistic influence on Borges, especially his "baroque" and "labyrinthine" sentences. That conspicuously unliterary word found in the famous first sentence of Tlön, "conjunction", almost certainly comes from Browne. Borges explains in an interview:

When I was a young man, I played the sedulous ape to Sir Thomas Browne. I tried to do so in Spanish. Then Adolfo Bioy-Casares and I translated the last chapter of Urn Burial into seventeenth-century Spanish—Quevedo. And it went quite well in seventeenth-century Spanish.... We did our best to be seventeenth-century, we went in for Latinisms the way that Sir Thomas Browne did. [...] I was doing my best to write Latin in Spanish.

That phrase "sedulous ape",³ what does that mean? Well, it's a reference to Robert Louis Stevenson (Borges was a big fan) who also tried to imitate Browne! In his 1887 essay collection Memories and Portraits, Stevenson writes:

I have thus played the sedulous ape to Hazlitt, to Lamb, to Wordsworth, to Sir Thomas Browne, to Defoe, to Hawthorne, to Montaigne, to Baudelaire and to Obermann. I remember one of these monkey tricks, which was called The Vanity of Morals: it was to have had a second part, The Vanity of Knowledge; and as I had neither morality nor scholarship, the names were apt; but the second part was never attempted, and the first part was written (which is my reason for recalling it, ghostlike, from its ashes) no less than three times: first in the manner of Hazlitt, second in the manner of Ruskin, who had cast on me a passing spell, and third, in a laborious pasticcio of Sir Thomas Browne.

In fact Browne's essay was admired by virtually all of Borges's Anglo predecessors: De Quincey, Coleridge, Dr. Johnson,⁴ Emerson, Poe. The epigraph of The Murders in the Rue Morgue is taken from Urn Burial: "What song the Syrens sang, or what name Achilles assumed when he hid himself among women, although puzzling questions are not beyond all conjecture."

Back to Tlön

Let us return to the origin of this post: what does Urn Burial have to do with Tlön? Why does Borges make the reference? We can safely assume that he had a reason in mind, but the connection is not immediately obvious: Browne's essay is about funerary customs, death, and fame—there's nothing in it about idealism, alternate worlds, or the social construction of reality.

I believe the key is that Browne misattributed the urns to the Romans (they were really Anglo-Saxon and dated to ~500 AD).⁵ Just as the invented world of Tlön invades reality in the Borges story, so Browne's invented Roman urns (and the connection they imply between Rome and England) impose themselves on our reality. Or perhaps the connection is to the Tlönian hrönir, objects that (in that idealist universe) appear when you start looking for them. Browne looked for Roman urns and therefore found them. This perspective also recalls Borges's inventions of fictional writers.⁶ A third and more doubtful interpretation involves Browne's mention of Tiberius, who (like Borges-the-narrator) rejected the world and withdrew to Capri, where he wrote anachronistic Greek verse and had oneiric visions inspired by ancient mythology. Or perhaps Borges simply wanted to pay tribute to his influences.

1.NYRB puts out a cute little tome, pairing it with his other famous essay, Religio Medici, a kind of Montaignian self-examination. ↩
2.Francisco de Quevedo (1580-1645) was a nobleman and prolific writer. I'm not sure if this is a little joke from Borges? I haven't read Quevedo yet, but the internet tells me he was known for his conceptismo style which favored rapidity, directness, and simple vocabulary (and was opposed to the ostentatious culteranismo style). This would suggest that the very notion of a "Quevedian translation" of Browne is a joke in itself, as Browne was famous for his ornate and extravagant style. But when Borges later talks about the "Quevedian translation" he refers to it as stylistically baroque and Latinate, so maybe not? ↩
3.Sedulous, adj. Showing dedication and diligence. ↩
4.Johnson wrote a short biography of Browne. "In defence of his uncommon words and expressions, we must consider that he had uncommon sentiments." ↩
5.Did Borges know about the urns not being Roman? The Norfolk Heritage Explorer(!) has a very handy page on the burial site, which includes a ton of references. The Archaeology of the Anglo-Saxon Settlements (1913) clearly attributes the urns to "Anglians" and calls out Browne for misattributing them. So it's entirely plausible that Borges would know about Browne's error. ↩
6."The methodical fabrication of hrönir [...] has made possible the interrogation and even the modification of the past, which is now no less plastic and docile than the future." ↩

The Best and Worst Books I Read in 2020

2021-01-18

The Best

Lucio Russo, The Forgotten Revolution: How Science Was Born in 300 BC and Why it Had to Be Reborn

Russo argues that Hellenistic science was significantly more developed than generally believed, and that it was killed by the Romans. This is one of those books that you have to approach with aggressive skepticism: Russo makes the case for a Hellenistic hypothetico-deductive scientific method, and some of the deductions he makes from second-hand readings certainly seem a bit much. But even if you don't buy its thesis in the end I think it's worth reading. A fascinating collection of stories about Hellenistic science and how the Romans viewed it. Doesn't really address the question of whether there could have been a scientific and industrial revolution in ancient times (the answer is almost certainly No, but I'd still like to see it argued out).

Lots of surprising facts, but perhaps most surprising were the things that remained backward for a long time: did you know that the first complete translation of Euclid's Elements into Latin was done in the year 1120, by an Englishman translating from the Arabic?

Hipparchus compiled his catalog of stars precisely so that later generations might deduce from it the displacements of stars and the possible appearance of novae. Clearly, Hipparchus too did not believe in a material sphere in which the stars are set. His catalog achieved its aim in full: the stellar coordinates listed therein were incorporated into Ptolemy’s work and so handed down until such a time when a change in the positions of the “fixed” stars could be detected. Changes were first noticed in 1718 A.D. by Halley, who, probably without realizing that he was completing an experiment consciously started two thousand years earlier, recorded that his measured coordinates for Sirius, Arcturus and Aldebaran diverged noticeably from those given by Ptolemy.

Larence Durrell, The Alexandria Quartet

Four interconnected novels set in sensuous and decadent Alexandria in the days before World War II, focused on a loose group of foreigners drawn to and trapped by the city's pleasures. Each novel begins with an epigraph from the Marquis de Sade. Some have derided the prose as purple, but it works well given the setting and intended effect. Begins in a highly experimental, lyrical, wistful, impressionistic mode but mellows out into a normal novel later on.

There's a Rashomon element to it: with each book you learn more, and as the "circle of knowledge" expands, events and characters are recontextualized, and shifting perspectives transform mysterious romances or shadowy conspiracies into something completely different. Tragic love; Cavafy; personal relations vs historic forces; Anglo vs Med culture; colonialism and its failures; everyone is the protagonist in their own story. Really makes you want to have a doomed love affair in a degenerate expat shithole for a while.

I had drifted into sleep again; and when I woke with a start the bed was empty and the candle had guttered away and gone out. She was standing at the drawn curtains to watch the dawn break over the tumbled roofs of the Arab town, naked and slender as an Easter lily. In the spring sunrise, with its dense dew, sketched upon the silence which engulfs a whole city before the birds awaken it, I caught the sweet voice of the blind muezzin from the mosque reciting the Ebed — a voice hanging like a hair in the palm-cooled upper airs of Alexandria. ‘I praise the perfection of God, the Forever existing; the perfection of God, the Desired, the Existing, the Single, the Supreme; the Perfection of God, the One, the Sole’ … The great prayer wound itself in shining coils across the city as I watched the grave and passionate intensity of her turned head where she stood to observe the climbing sun touch the minarets and palms with light: rapt and awake. And listening I smelt the warm odour of her hair upon the pillow beside me.

Edward Gibbon, The History of the Decline and Fall of the Roman Empire

I read about 10 pages of Gibbon every night over the last year. Just super comfy and endlessly entertaining, I would happily keep going if there was more. The scale. The ambition. The style. Is it outdated in some respects? Sure. But Gibbon is not as impressionable as you might think, and his anti-Christian bias has been vastly exaggerated in the popular consciousness.

Full review forthcoming.

The subjects of the Byzantine empire, who assume and dishonour the names both of Greeks and Romans, present a dead uniformity of abject vices, which are neither softened by the weakness of humanity nor animated by the vigour of memorable crimes.

Álvaro Mutis, The Adventures and Misadventures of Maqroll

Seven interconnected picaresque novellas that revolve around the titular Maqroll, a vagabond of the seas. Wanderlust, melancholy, friendship, alcohol, and heartbreak as we follow his adventures and ill-fated "business ventures" at the margins of civilization. It's very good at generating a vicarious pleasure in the "nomadic mania" of the main character and the parallel world of tramp steamers, seamen, and port city whores. Stories are told through second-hand recountings, miraculously recovered documents, or distant rumours found in exotic locales. Underlying it all there's a feeling of a fundamental dissatisfaction with what life has to offer, and Maqroll's adventures are an attempt to overcome that feeling. Stylistically rich and sumptuous, reminiscent of Conrad's maritime adventures, Herzog's diaries, and even Borges.

My favorite of the novellas involves a decaying tramp steamer and a parallel love affair (between the ship's captain and its owner) which is taken up whenever and wherever the ship goes to port, then put on hold while at sea.

The tramp steamer entered my field of vision as slowly as a wounded saurian. I could not believe my eyes. With the wondrous splendor of Saint Petersburg in the background, the poor ship intruded on the scene, its sides covered with dirty streaks of rust and refuse that reached all the way to the waterline. The captain's bridge, and the row of cabins on the deck for crew members and occasional passengers, had been painted white a long time before. Now a coat of grime, oil, and urine gave them an indefinite color, the color of misery, of irreparable decadence, of desperate, incessant use. The chimerical freighter slipped through the water to the agonized gasp of its machinery and the irregular rhythm of driving rods that threatened at any moment to fall silent forever. Now it occupied the foreground of the serene, dreamlike spectacle that had held all my attention, and my astonished wonder turned into something extremely difficult to define. This nomadic piece of sea trash bore a kind of witness to our destiny on earth, a pulvis eris that seemed truer and more eloquent in these polished metal waters with the gold and white vision of the capital of the last czars behind them. The sleek outline of the buildings and wharves on the Finnish coast rose at my side. At that moment I felt the stirrings of a warm solidarity for the tramp steamer, as if it were an unfortunate brother, a victim of human neglect and greed to which it responded with a stubborn determination to keep tracing the dreary wake of its miseries on all the world's seas. I watched it move toward the interior of the bay, searching for some discreet dock where it could anchor without too many maneuvers and, perhaps, for as little money as possible. The Honduran flag hung at the stern. The final letters of the name that had almost been erased by the waves were barely visible: ...cyon. In what seemed too mocking an irony, the name of this old freighter was probably the Halcyon.

Stuart Ritchie, Science Fictions: Exposing Fraud, Bias, Negligence and Hype in Science

An excellent introduction to the replication crisis. Covers both outright fraud and the grey areas of questionable research practices and hype. Highly accessible, you can give this to normal people and they will get a decent grasp of what's going on while being entertained by the amusing and/or terrifying anecdotes. I had a few quibbles, but overall it's very good.

The weird thing, though, is that scientists who already have tenure, and who already run well-funded labs, continue regularly to engage in the kinds of bad practices described in this book. The perverse incentives have become so deeply embedded that they’ve created a system that’s self-sustaining. Years of implicit and explicit training to chase publications and citations at any cost leave their mark on trainee scientists, forming new norms, habits, and ways of thinking that are hard to break even once a stable job has been secured. And as we discussed in the previous chapter, the system creates a selection pressure where the only academics who survive are the ones who are naturally good at playing the game.

Susanna Clarke, Piranesi

16 years after Jonathan Strange & Mr Norrell, a new novel from Susanna Clarke. It's short and not particularly ambitious, but I enjoyed it a lot. A tight fantastical mystery that starts out similar to The Library of Babel but then goes off in a different direction. I loved the setting (which is where the title comes from): a strange alternate dimension in the form of a great house filled with staircases and marble statues, with clouds at the upper levels and tides coming up from below.

Once, men and women were able to turn themselves into eagles and fly immense distances. They communed with rivers and mountains and received wisdom from them. They felt the turning of the stars inside their own minds. My contemporaries did not understand this. They were all enamoured with the idea of progress and believed that whatever was new must be superior to what was old. As if merit was a function of chronology! But it seemed to me that the wisdom of the ancients could not have simply vanished. Nothing simply vanishes. It’s not actually possible.

Francis Bacon, Novum Organum

The first part deals with science and empiricism and induction from an abstract perspective and it feels almost contemporary, like it was written by a time traveling 19th century scientist or something like that. The quarrel between the ancients and the moderns is already in full swing here, Bacon dunks on the Greeks constantly and upbraids people for blindly listening to Aristotle. He points to inventions like gunpowder and the compass and printing and paper and says that surely these indicate that there's a ton of undiscovered ideas out there, we should go looking for them. He talks about perceptual biases and scientific progress. Bacon's ambition feels limitless.

But any man whose care and concern is not merely to be content with what has been discovered and make use of it, but to penetrate further; and not to defeat an opponent in argument but to conquer nature by action; and not to have nice, plausible opinions about things but sure, demonstrable knowledge; let such men (if they please), as true sons of the sciences, join with me, so that we may pass the antechambers of nature which innumerable others have trod, and eventually open up access to the inner rooms.

Then you get to the second part and the Middle Ages hit you like a freight train, you suddenly realize this is no contemporary man at all and his conception of how the world works is completely alien. Ideas that to us seem bizarre and just intuitively nonsensical (about gravity, heat, light, biology, etc.) are only common sense to him. He repeats absurdities about light objects being pulled to the heavens while heavy objects are subject to the gravity of the earth, and so on. It's fascinating that both sides could exist in the same person. You won't learn anything new from Bacon, but it's a fascinating historical document.

Of twenty-five centuries in which human memory and learning is more or less in evidence, scarcely six can be picked out and isolated as fertile in sciences or favourable to their progress. There are deserts and wastes of time no less than of regions.

Signs should also be gathered from the growth and progress of philosophies and sciences. Those that are founded in nature grow and increase; those founded in opinion change but do not grow.

Thomas Pynchon, Bleeding Edge

Imagine going to a three Michelin star restaurant and being served a delicious burger and fries. No matter how good the burger, at some level you will feel disappointed. When I eat at Mr. Pynchon's restaurant I want a 20-dish tasting menu using unheard-of ingredients and requiring the development of entirely new types of kitchen machinery. Bleeding Edge is a burger. That said, it's funny, and readable, and stylish, and manages to evoke a great nostalgia for the early days of the internet—a 20th century version of the end of the Wild West, the railroad of centralized corporate interests conquering everything, while individualist early internet pioneers are shoved aside.

9/11, the deep web, intelligence agencies, power and sex, technology, family. Also just the idea of a 75-year-old geezer writing about Hideo Kojima and MILF-night at the "Joie de Beavre" is hilarious in itself.

“Our Meat Facial today, Ms. Loeffler?”
“Uhm, how’s that.”
“You didn’t get our offer in the mail? on special all this week, works miracles for the complexion—freshly killed, of course, before those enzymes’ve had a chance to break down, how about it?”
“Well, I don’t...”
“Wonderful! Morris, kill… the chicken!”
From the back room comes horrible panicked squawking, then silence. Maxine meantime is tilted back, eyelids aflutter, when— “Now we’ll just apply some of this,” wham! “...meat here, directly onto this lovely yet depleted face...”
“Mmff...”
“Pardon? (Easy, Morris!)”
“Why is it... uh, moving around like that? Wait! is that a— are you guys putting a real dead chicken in my— aaahhh!”
“Not quite dead yet!” Morris jovially informs the thrashing Maxine as blood and feathers fly everywhere.

Samuel R. Delany, Dhalgren

An amnesiac young man walks into a burned-out city, a localized post-apocalypse left behind by the rest of the United States. He meets the locals, gets into trouble, publishes some poetry, and ends up leading a gang. Perhaps more impressive and memorable than good, I would rather praise it than recommend it. It is a slog, it is puerile, the endless sex scenes are pointless at best, characters rather uninteresting, barely any story, the 70s counterculture stuff is comical, stylistically it's not up there with the stuff it's aping. I understand why some people hate it.

But there's something there, underneath all the grime. It reaches that size where quantity becomes a quality of its own. The combination of autobiography, pomo metafictional fuckery, magical realism, and science fiction is unique. Some of its scenes are certainly unforgettable. And it just has an alluring mystical aura, a compelling strangeness that I have a hard time putting into words...

He pictured great maps of darkness torn down before more. After today, he thought idly, there is no more reason for the sun to rise. Insanity? To live in any state other than terror! He held the books tightly. Are these poems mine? Or will I discover that they are improper descriptions by someone else of things I might have once been near; the map erased, aliases substituted for each location? Someone, then others, were laughing.

The Worst

Octavia E. Butler, Lilith's Brood

They say never judge a book by its cover, but in this case you'd be spot on. 800 pages of non-stop alien-on-human rape. Tentacle rape, roofie rape, mind control rape, impregnation rape, this book has it all. Very much a fetish thing. One of the main plot lines is about an alien that is so horny that it will literally die if it doesn't get to rape any humans. It's also bad in more conventional ways - weak characters, weak plotting, all sorts of holes and inconsistencies, not to mention extremely shallow treatment of the ideas about genetics, hierarchy, transhumanism, etc. In retrospect I have no idea why I kept reading to the end.

“You said I could choose. I’ve made my choice!”
“Your body said one thing. Your words said another.” It moved a sensory arm to the back of his neck, looping one coil loosely around his neck. “This is the position,” it said.

Ralph Waldo Emerson, Essays: First Series

Good God, what did Nietzsche see in this stuff? Farming raised to metaphysical principle? There's a bit of Prince Myshkin in Emerson, a bit of Monsieur Teste, but it's all played completely straight. A reliable sleep-inducer if there ever was one.

The fallacy lay in the immense concession, that the bad are successful; that justice is not done now. The blindness of the preacher consisted in deferring to the base estimate of the market of what constitutes a manly success, instead of confronting and convicting the world from the truth; announcing the presence of the soul; the omnipotence of the will: and so establishing the standard of good and ill, of success and falsehood.

Unjustified True Disbelief

2021-01-12

Yesterday I wrote about fake experts and credentialism, but left open the question of how to react. The philosopher Edmund Gettier is famous for presenting a series of problems (known as Gettier cases) designed to undermine the justified true belief account of knowledge. I'm interested in a more pragmatic issue: people rejecting bad science when they don't have the abilities necessary to make such a judgment, or in other words, unjustified true disbelief.¹

There are tons of articles suggesting that Americans have recently come to mistrust science: the Boston Review wants to explain How Americans Came to Distrust Science, Scientific American discusses the "crumbling of trust in science and academia", National Geographic asks Why Do Many Reasonable People Doubt Science?, aeon tells us "there is a crisis of trust in science", while the Christian Science Monitor tells us about the "roots of distrust" of the the "anti-science wave".

I got into a friendly argument on twitter about the first article in that list, in which Pascal-Emmanuel Gobry wrote that "normal people do know that "peer-reviewed studies" are largely and increasingly BS". I see two issues with this: 1) I don't think it's accurate, and 2) even if it were accurate, I don't think normal people can separate the wheat from the chaff.

Actual Trust in Science

Surveys find that trust in scientists has remained fairly stable for half a century.²

A look at recent data shows an increase in "great deal or fair amount of confidence" in scientists from 76% in 2016 to 86% in 2019. People have not come to distrust science...to which one might reasonably ask, why not? We've been going on about the replication crisis for a decade now, how could this possibly not affect people's trust in science? And the answer to that is that normal people don't care about the replication crisis and don't have the tools needed to understand it even if they did.

Choosing Disbelief

Let's forget that and say, hypothetically, that normal people have come to understand that "peer-reviewed studies are largely and increasingly BS". What alternatives does the normal person have? The way I see it one can choose among three options:

Distrust everything and become a forest hobo.
Trust everything anyway.
Pick and choose by judging things on your own.

Let's ignore the first one and focus on the choice between 2 and 3. It boils down to this: the judgment is inevitably going to be imperfect, so does the gain from doubting false science outweigh the loss from doubting true science? That depends on how good people are at doubting the right things.

There's some evidence that laypeople can distinguish which studies will replicate and which won't, but this ability is limited and in the end relies on intuition rather than an understanding and analysis of the work. Statistical evidence is hard to evaluate: even academic psychologists are pretty bad at the basics. The reasons why vaccines are probably safe and nutrition science is probably crap, the reasons why prospect theory is probably real and social priming is probably fake are complicated! If it was easy to make the right judgment, actual scientists wouldn't be screwing up all the time. Thus any disbelief laypeople end up with will probably be unjustified. And the worst thing about unjustified true disbeliefs is that they also carry unjustified false disbeliefs with them.³

Another problem with unjustified disbelief is that it fails to alter incentives in a useful way. Feedback loops are only virtuous if the feedback is epistemically reliable. (A point relevant to experts as well.)

And what exactly are the benefits from knowing about the replication crisis? So you think implicit bias is fake, what are you going to do with that information? Bring it up at your company's next diversity seminar? For the vast majority of people, beyond the intrinsic value of believing true things there is not much practical value in knowing about weaknesses in science.

Credentials Exist for a Reason

When they work properly, institutional credentials serve an extremely useful purpose: most laymen have no ability to evaluate the credibility of experts (and this is only getting worse due to increasing specialization). Instead, they offload this evaluation to a trusted institutional mechanism and then reap the rewards as experts uncover the secret mechanisms of nature and design better microwave ovens. There is an army of charlatans and mountebanks ready to pounce on anyone straying from the institutionally-approved orthodoxy—just look at penny stock promoters or "alternative" medicine.

Current strands of popular scientific skepticism offer a hint of what we can expect if there was more of it. Is this skepticism directed at methodological weaknesses in social science? Perhaps some valid questions about preregistration and outcome switching in medical trials? Elaborate calculations of the expected value of first doses first vs the risk of fading immunity? No, popular skepticism is aimed at very real and very useful things like vaccination,⁴ evolution, genetics, and nuclear energy. Most countries in the EU have banned genetically modified crops, for example—a moronic policy that damages not just Europeans, but overflows onto the people who need GMOs the most, African farmers.⁵ At one point Zambia refused food aid in the middle of a famine because the president thought it was "poison". In the past, shared cultural beliefs were tightly protected; today a kind of cheap skepticism filters down to people who don't know what to do with it and just end up in a horrible mess.

Realistically the alternative to blind trust of the establishment is not some enlightened utopia where we believe the true science and reject the fake experts; the alternative is a wide-open space for bullshit-artists to waltz in and take advantage of people. The practical reality of scientific skepticism is memetic and political, and completely unjustified from an epistemic perspective. Gobry himself once wrote a twitter thread about homeopathy and conspiratorial thinking in France: "the country is positively enamored with pseudo-science." He's right, it is. And that's exactly why we can't trust normal people to make the judgement that "studies largely and increasingly BS".⁶

The way I see it, the science that really matters also tends to be the most solid.⁷ The damage caused by "BS studies" seems relatively limited in comparison.

This line of reasoning also applies to yesterday's post on fake experts: for the average person, the choice between trusting a pseudonymous blogger versus trusting an army of university professors with long publication records in prestigious journals is pretty clear. From the inside view I'm pretty sure I'm right, but from the outside view the base rate of correctness among heterodox pseudonymous bloggers isn't very good. I wouldn't trust the damned blogger either! The only way to have even a modicum of confidence is personal verification, and unless you're part of the tiny minority with the requisite abilities, you should have no such confidence. So what are we left with? Helplessness and confusion. "Epistemic hell" doesn't even begin to cover it.

It is true that if you know where to look on the internet, you can find groups of intelligent and insightful generalists who outdo many credentialed experts. For example, many of the people I follow on twitter were (and in some respects still are) literally months ahead of the authorities on Covid-19. @LandsharkRides was only slightly exaggerating when he wrote that "here, in this incredibly small corner of twitter, we have cultivated a community of such incredibly determined autists, that we somehow know more than the experts in literally every single sphere". But while some internet groups of high-GRE generalists tend to be right, if you don't have the ability yourself it's hard to tell them apart from the charlatans.

But what about justified true disbelief?

Kahneman and Tversky came up with the idea of the inside view vs the outside view. The "inside view" is how we perceive our own personal situation, relying on our personal experiences and with the courage of our convictions. The "outside view" instead focuses on common elements, treating our situation as a single observation in a large statistical class.

From the inside view, skeptics of all stripes believe they are epistemically justified and posses superior knowledge. The people who think vaccines will poison their children believe it, the people who think the earth is flat believe it, and the people who doubt social science p=0.05 papers believe it. But the base rate of correctness among them is low, and the errors they make dangerous. How do you know if you're one of the actually competent skeptics with genuinely justified true disbelief? From the inside, you can't tell. And if you can't tell, you're better off just believing everything.

Some forms of disbelief are less dangerous than others. For example if epidemiologists tell you you don't need to wear a mask, but you choose to wear them anyway, there's very little downside if your skepticism is misguided. The reverse (not wearing masks when they tell you to) has a vastly larger downside. But again this relies on an ability to predict and weigh risks, etc.

The one thing we can appeal to is data from the outside: objectively graded forecasts, betting, market trading. And while these tools could quiet your own doubts, realistically the vast majority of people are not going to bother with these sorts of things. (And are you sure you are capable of judging people's objective track records?) Michael A. Bishop argues against fixed notions of epistemic responsibility in In Praise of Epistemic Irresponsibility: How Lazy and Ignorant Can You Be?, instead favoring an environment where "to a very rough first approximation, being epistemically responsible would involve nothing other than employing reliable belief-forming procedures." I'm certainly in favor of that.

So if you're reading this and feel confident in your stats abilities, your generalist knowledge, your intelligence, the quality of your intuitions, and can back those up via an objective track record, then go ahead and disbelieve all you want. But spreading that disbelief to others seems irresponsible to me. Perhaps even telling lay audiences about the replication crisis is an error. Maybe Ritchie should buy up every copy of his book and burn them, for the common good.

That's it?

Yup. Plenty of experts are fake but people should trust them anyway. Thousands of "studies" are just nonsense but people should trust them anyway. On net, less disbelief would improve people's lives. And unless someone has an objective track record showing they know better (and you have the ability to verify and compare it), you should probably ignore the skeptics. Noble lies are usually challenging ethical dilemmas, but this one strikes me as a pretty easy case.

I can't believe I ended up as an establishment shill after all the shit I've seen, but there you have it.⁸ I'm open to suggestions if you have a superior solution that would allow me to maintain an edgy contrarian persona.

1.I am indebted to David A. Oliver for the phrase. As far as I can tell he was the first person to ever use "unjustified true disbelief", on Christmas Eve 2020. ↩
2.One might question what these polls are actually measuring. Perhaps they're really measuring if the respondents simply think of themselves (or would like to present themselves) as the type of person who believes in science, regardless of whether they do or not. Perhaps the question of how much regular people "trust scientists" is not meaningful? ↩
3.You might be thinking that it's actually not that difficult, but beware the typical mind fallacy. You're reading a blog filled with long rants on obscure metascientific topics, and that's a pretty strong selection filter. You are not representative of the average person. What seem like clear and obvious judgments to you is more or less magic to others. ↩
4.It should be noted that when it comes to anti-vax, institutional credentialed people are not exactly blameless. The Lancet published Wakefield and didn't retract his fraudulent MMR-autism paper for 12 years. ↩
5.The fear of GMOs is particularly absurd in the light of the alternatives. The "traditional" techniques are based on inducing random mutations through radiation and hoping some of them are good. While mutagenesis is considered perfectly safe and appropriate, targeted changes in plant genomes are terribly risky. ↩
6."But, we can educate..." I doubt it. ↩
7.Epidemiologists (not to mention bioethicists) during covid-19 provide one of the most significant examples of simultaneously being important and weak. But that one item can't outweigh all the stuff on the other side of the scale. ↩
8.Insert Palpatine "ironic" gif here. ↩

Are Experts Real?

2021-01-11

I vacillate between two modes: sometimes I think every scientific and professional field is genuinely complex, requiring years if not decades of specialization to truly understand even a small sliver of it, and the experts¹ at the apex of these fields have deep insights about their subject matter. The evidence in favor of this view seems pretty good, a quick look at the technology, health, and wealth around us ought to convince anyone.

But sometimes one of these masters at the top of the mountain will say something so obviously incorrect, something even an amateur can see is false, that the only possible explanation is that they understand very little about their field. Sometimes vaguely smart generalists with some basic stats knowledge objectively outperform these experts. And if the masters at the top of the mountain aren't real, then that undermines the entire hierarchy of expertise.

Real Expertise

Some hierarchies are undeniably legitimate. Chess, for example, has the top players constantly battling each other, new players trying to break in (and sometimes succeeding), and it's all tracked by a transparent rating algorithm that is constantly updated. Even at the far right tails of these rankings, there are significant and undeniable skill gaps. There is simply no way Magnus Carlsen is secretly bad at chess.

Science would seem like another such hierarchy. The people at the top have passed through a long series of tests designed to evaluate their skills and knowledge, winnow out the undeserving, and provide specialized training: undergrad, PhD, tenure track position, with an armada of publications and citations along the way.

Anyone who has survived the torments of tertiary education will have had the experience of getting a broad look at a field in a 101 class, then drilling deeper into specific subfields in more advanced classes, and then into yet more specific sub-subfields in yet more advanced classes, until eventually you're stuck at home on a Saturday night reading an article in an obscure Belgian journal titled "Iron Content in Antwerp Horseshoes, 1791-1794: Trade and Equestrian Culture Under the Habsburgs", and the list of references carries the threatening implication of an entire literature on the equestrian metallurgy of the Low Countries, with academics split into factions justifying or expostulating the irreconcilable implications of rival theories. And then you realize that there's an equally obscure literature about every single subfield-of-a-subfield-of-a-subfield. You realize that you will never be a polymath and that simply catching up with the state of the art in one tiny corner of knowledge is a daunting proposition. The thought of exiting this ridiculous sham we call life flashes in your mind, but you dismiss it and heroically persist in your quest to understand those horseshoes instead.

It is absurd to think that after such lengthy studies and deep specialization the experts could be secret frauds. As absurd as the idea that Magnus Carlsen secretly can't play Chess. Right?

Fake Expertise?

Imagine if tomorrow it was revealed that Magnus Carlsen actually doesn't know how to play chess. You can't then just turn to the #2 and go "oh well, Carsen was fake but at least we have Fabiano Caruana, he's the real deal"—if Carlsen is fake that also implicates every player who has played against him, every tournament organizer, and so on. The entire hierarchy comes into question. Even worse, imagine if it was revealed that Carlsen was a fake, but he still continued to be ranked #1 afterwards. So when I observe extreme credential-competence disparities in science or government bureaucracies, I begin to suspect the entire system.² Let's take a look at some examples.

59

In 2015, Viechtbauer et al. published A simple formula for the calculation of sample size in pilot studies, in which they describe a simple method for calculating the required N for an x% chance of detecting a certain effect based on the proportion of participants who exhibit the effect. In the paper, they give an example of such a calculation, writing that if 5% of participants exhibit a problem, the study needs N=59 for a 95% probability of detecting the problem. The actual required N will, of course, vary depending on the prevalence of the effect being studied.

If you look at the papers citing Viechtbauer et al., you will find dozens of them simply using N=59, regardless of the problem they're studying, and explaining that they're using that sample size because of the Viechtbauer paper! The authors of these studies are professors at real universities, working in disciplines based almost entirely on statistical analyses. The papers passed through editors and peer reviewers. In my piece on the replication crisis, I wrote that I find it difficult to believe that social scientists don't know what they're doing when they publish weak studies; one of the most common responses from scientists was "no, they genuinely don't understand elementary statistics". It still seems absurd (just count the years from undergrad to PhD, how do you fail to pick this stuff up just by osmosis?) but it also appears to be true. How does this happen? Can you imagine a physicist who doesn't understand basic calculus? And if this is the level of competence among tenured professors, what is going on among the people below them in the hierarchy of expertise?

Masks

Epidemiologists have beclowned themselves in all sorts of ways over the last year, but this is one of my favorites. Michelle Odden, professor of epidemiology and population health at Stanford (to be fair she does focus on cardiovascular rather than infectious disease, but then perhaps she shouldn't appeal to her credentials):

Time Diversification

CalPERS is the largest pension fund in the United States, managing about $400 billion dollars. Here is a video of a meeting of the CalPERS investment committee, in which you will hear Chief Investment Officer Yu Meng say two incredible things:

That he can pick active managers who will generate alpha, and this decreases portfolio risk.
That infrequent model-based valuation of investments makes them less risky compared to those traded on a market, due to "time diversification".

This is utter nonsense, of course. When someone questions him, he retorts with "I might have to go back to school to get another PhD". The appeal to credentials is typical when fake expertise is questioned. Can you imagine Magnus Carlsen appealing to a piece of paper saying he has a PhD in chessology to explain why he's good?³

Cybernetics

Feedback Loops

It all comes down to feedback loops. The optimal environment for developing and recognizing expertise is one which allows for clear predictions and provides timely, objective feedback along with a system that promotes the use of that feedback for future improvement. Capitalism and evolution work so well because of their merciless feedback mechanisms.⁴

The basic sciences have a great environment for such feedback loops: if physics was fake, CPUs wouldn't work, rockets wouldn't go to the moon, and so on. But if social priming is fake, well...? There are other factors at play, too, some of them rather nebulous: there's more to science than just running experiments and publishing stuff. Mertonian, distributed, informal community norms play a significant role in aligning prestige with real expertise, and these broader social mechanisms are the keystone that holds everything together. But such things are hard to measure and impossible to engineer. And can such an honor system persist indefinitely or will it eventually be subverted by bad actors?⁵

What does feedback look like in the social sciences? The norms don't seem to be operating very well. There's a small probability that your work will be replicated at some point, but really the main feedback mechanism is "impact" (in other words, citations). Since citations in the social sciences are not related to truth, this is useless at best. Can you imagine if fake crank theories in physics got as many citations as the papers from CERN? Notorious fraud Brian Wansink racked up 2500 citations in 2020, two years after he was forced to resign. There's your social science feedback loop!

The feedback loops of the academy are also predicated on the current credentialed insiders actually being experts. But if the N=59 crew are making such ridiculous errors in their own papers, they obviously don't have the ability to judge other people's papers either, and neither do the editors and reviewers who allow such things to be published—for the feedback loop to work properly you need both the cultural norms and genuinely competent individuals to apply them.

The candidate gene literature (of which 5-HTTLPR was a subset) is an example of feedback and successful course correction: many years and thousands of papers were wasted on something that ended up being completely untrue, and while a few of these papers still trickle in, these days the approach has been essentially abandoned and replaced by genome-wide association studies. Scott Alexander is rather sanguine about the ability of scientific feedback mechanisms to eventually reach a true consensus, but I'm more skeptical. In psychology, the methodological deficiencies of today are the more or less same ones as those of the of the 1950s, with no hope for change.

Sometimes we can't wait a decade or two for the feedback to work, the current pandemic being a good example. Being right about masks eventually isn't good enough. The loop there needs to be tight, feedback immediate. Blindly relying on the predictions of this or that model from this or that university is absurd. You need systems with built-in, automatic mechanisms for feedback and course correction, subsidized markets being the very best one. Tom Liptay (a superforecaster) has been scoring his predictions against those of experts; naturally he's winning. And that's just one person, imagine hundreds of them combined, with a monetary incentive on top.

Uniformity

There's a superficial uniformity in the academy. If you visit the physics department and the psychology department of a university they will appear very similar: the people working there have the same titles, they instruct students in the same degrees, and publish similar-looking papers in similar-looking journals.⁶ The N=59 crew display the exact same shibboleths as the real scientists. This similarity provides cover so that hacks can attain the prestige, without the competence, of academic credentials.

Despite vastly different levels of rigor, different fields are treated with the ~same seriousness. Electrical engineering is definitely real, and the government bases all sorts of policies on the knowledge of electrical engineers. On the other hand nutrition is pretty much completely fake, yet the government dutifully informs you that you should eat tons of cereal and a couple loaves of bread every day. A USDA bureaucrat can hardly override the Scientists (and really, would you want them to?).

This also applies to people within the system itself: as I have written previously the funding agencies seem to think social science is as reliable as physics and have done virtually nothing in response to the "replication crisis" (or perhaps they are simply uninterested in fixing it).

In a recent episode of Two Psychologists Four Beers, Yoel Inbar was talking about the politically-motivated retraction of a paper on mentorship, asking "Who's gonna trust the field that makes their decisions about whether a paper is scientifically valid contingent on whether it defends the moral sensibilities of vocal critics?"⁷ I think the answer to that question is: pretty much everyone is gonna trust that field. Status is sticky. The moat of academic prestige is unassailable and has little to do with the quality of the research.⁸ “Anything that would go against World Health Organization recommendations would be a violation of our policy," says youtube—the question of whether the WHO is actually reliable being completely irrelevant. Trust in the scientific community has remained stable for 50+ years. Replication crisis? What's that? A niche concern for a handful of geeks, more or less.

As Patrick Collison put it, "This year, we’ve come to better appreciate the fallibility and shortcomings of numerous well-established institutions (“masks don’t work”)… while simultaneously entrenching more heavily mechanisms that assume their correctness (“removing COVID misinformation”)." A twitter moderator can hardly override the Scientists (and really, would you want them to?).

This is all exacerbated by the huge increase in specialization. Undoubtedly it has benefits, as scientists digging deeper and deeper into smaller and smaller niches allows us to make progress that otherwise would not happen. But specialization also has its downsides: the greater the specialization, the smaller the circle of people who can judge a result. And the rest of us have to take their word for it, hoping that the feedback mechanisms are working and nobody abuses this trust.

It wasn't always like this. A century ago, a physicist could have a decent grasp of pretty much all of physics. Hungarian Nobelist Eugene Wigner expressed his frustration with the changing landscape of specialization:

By 1950, even a conscientious physicist had trouble following more than one sixth of all the work in the field. Physics had become a discipline, practiced within narrow constraints. I tried to study other disciplines. I read Reviews Of Modern Physics to keep abreast of fields like radio-astronomy, earth magnetism theory, and magneto-hydrodynamics. I even tried to write articles about them for general readers. But a growing number of the published papers in physics I could not follow, and I realized that fact with some bitterness.

It is difficult (and probably wrong) for the magneto-hydrodynamics expert to barge into radio-astronomy and tell the specialists in that field that they're all wrong. After all, you haven't spent a decade specializing in their field, and they have. It is a genuinely good argument that you should shut up and listen to the experts.

Bringing Chess to the Academy

There are some chess-like feedback mechanisms even in social science, for example the DARPA SCORE Replication Markets project: it's a transparent system of predictions which are then objectively tested against reality and transparently scored.⁹ But how is this mechanism integrated with the rest of the social science ecosystem?

We are still waiting for the replication outcomes, but regardless of what happens I don't believe they will make a difference. Suppose the results come out tomorrow and they show that my ability to judge the truth or falsity of social science research is vastly superior to that of the people writing and publishing this stuff. Do you think I'll start getting emails from journal editors asking me to evaluate papers before they publish them? Of course not, the idea is ludicrous. That's not how the system works. "Ah, you have exposed our nonsense, and we would have gotten away with it too if it weren't for you meddling kids! Here, you can take control of everything now." The truth is that the system will not even have to defend itself, it will just ignore this stuff and keep on truckin' like nothing happened. Perhaps constructing a new, parallel scientific ecosystem is the only way out.

On the other hand, an objective track record of successful (or unsuccessful) predictions can be useful to skeptics outside the system.¹⁰ In the epistemic maelstrom that we're stuck in, forecasts at least provide a steady point to hold on to. And perhaps it could be useful for generating reform from above. When the scientific feedback systems aren't working properly, it is possible for help to come from outside:¹¹

So what do you do? Stop trusting the experts? Make your own judgments? Trust some blogger who tells you what to believe? Can you actually tell apart real results from fake ones, reliable generalists from exploitative charlatans? Do you have it in you to ignore 800 epidemiologists, and is it actually a good idea? More on this tomorrow.

1.I'm not talking about pundits or amorphous "elites", but people actually doing stuff. ↩
2.Sometimes the explanation is rather mundane: institutional incentives or preference falsification make these people deliberately say something they know to be false. But there are cases where that excuse does not apply. ↩
3.Ever notice how people with "Dr." in their twitter username tend to post a lot of stupid stuff? ↩
4.A Lysenkoist farmer in a capitalist country is an unprofitable farmer and therefore soon not a farmer at all. ↩
5.I also imagine it's easier to maintain an honor system with the clear-cut clarity (and rather apolitical nature) of the basic natural sciences compared to the social sciences. ↩
6.The same features can even be found in the humanities which is just tragicomic. ↩
7.The stability of academic prestige in the face of fake expertise makes it a very attractive target for political pressure. If you can take over a field, you can start saying anything you want and people will treat you seriously. ↩
8.Medicine was probably a net negative until the end of the 19th century, but doctors never went away. People continued to visit doctors while they bled them to death and smeared pigeon poop on their feet. Is it that some social status hierarchies are completely impregnable regardless of their results? A case of memetic defoundation perhaps? Were people simply fooled by regression to the mean? ↩
9.The very fact that such a prediction market could be useful is a sign of weakness. Physicists don't need prediction markets to decide whether the results from CERN are real. ↩
10.The nice thing about objective track records is that they protect against both credentialed bullshitters and uncredentialed charlatans looking to exploit people's doubts. ↩
11.Of course it's no panacea, especially in the face of countervailing incentives. ↩

Links & What I've Been Reading Q4 2020

2020-12-30

Links

Forecasting

Arpit Gupta on prediction markets vs 538 in the 2020 election: "Betting markets are pretty well calibrated—state markets that have an estimate of 50% are, in fact, tossups in the election. 538 is at least 20 points off—if 538 says that a state has a ~74% chance of going for Democrats, it really is a tossup." Also, In Defense of Polling: How I earned $50,000 on election night using polling data and some Python code. Here is a giant spreadsheet that scores 538/Economist vs markets. And here is a literal banana arguing against the very idea of polls.

Markets vs. polls as election predictors: An historical assessment (2012). Election prediction markets stretch back to the 19th century, and they used to be heavily traded and remarkably accurate despite the lack of any systematic polling information. Once polling was invented, volumes dropped and prediction markets lost their edge. Perhaps things are swinging in the other direction again?

Metaculus is organizing a "large-scale, comprehensive forecasting tournament dedicated to predicting advances in artificial intelligence" with $50k in prize money.

Covid

Philippe Lemoine critiques Flaxman et al.'s "Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe" in Nature.

However, as far as can tell, Flaxman et al. don’t say what the country-specific effect was for Sweden either in the paper or in the supplementary materials. This immediately triggered my bullshit detector, so I went and downloaded the code of their paper to take a closer look at the results and, lo and behold, my suspicion was confirmed. In this chart, I have plotted the country-specific effect of the last intervention in each country:

Flaxman responds on Andrew Gelman's blog. Lemoine responds to the response.

Alex Tabarrok has been beating the drum for delaying the second dose and vaccinating more people with a single dose instead.

Twitter thread on the new, potentially more infectious B.1.1.7 variant.

Sound pollution decreased due to COVID-19, and "birds responded by producing higher performance songs at lower amplitudes, effectively maximizing communication distance and salience".

An ancient coronavirus-like epidemic drove adaptation in East Asians from 25,000 to 5,000 years ago. Plus Razib Khan commentary.

Scent dog identification of samples from COVID-19 patients: "The dogs were able to discriminate between samples of infected (positive) and non-infected (negative) individuals with average diagnostic sensitivity of 82.63% and specificity of 96.35%." [N=1012] Unfortunately they didn't try it on asymptomatic/pre-symptomatic cases.

And a massive update on everything Covid from Zvi Mowshowitz, much of it infuriating. Do approach with caution though, the auction argument in particular seems questionable.

Innovations and Innovation

Peer Rejection in Science: a collection of "key discoveries have been at some point rejected, mocked, or ignored by leading scientists and expert commissions."

Somewhat related: if GAI is close, why aren't large companies investing in it? NunoSempere comments with some interesting historical examples of breakthrough technologies that received very little investment or were believed to be impossible before they were realized.

Deepmind solves protein folding. And a couple of great blog posts by Mohammed AlQuraishi, where he talks about why AlphaFold is important, why this innovation didn't come from pharmaceutical companies or the academy, and more:

First, from 2018, AlphaFold @ CASP13: “What just happened?”:
I don’t think we would do ourselves a service by not recognizing that what just happened presents a serious indictment of academic science. [...] What is worse than academic groups getting scooped by DeepMind? The fact that the collective powers of Novartis, Pfizer, etc, with their hundreds of thousands (~million?) of employees, let an industrial lab that is a complete outsider to the field, with virtually no prior molecular sciences experience, come in and thoroughly beat them on a problem that is, quite frankly, of far greater importance to pharmaceuticals than it is to Alphabet.
Then, in 2020, AlphaFold2 @ CASP14: “It feels like one’s child has left home.”:
Once a solution is solved in any way, it becomes hard to justify solving it another way, especially from a publication standpoint.

"These improvements drop the turn-around time from days to twelve hours and the cost for whole genome sequencing (WGS) from about $1000 to $15, as well as increase data production by several orders of magnitude." If this is real (and keep in mind $15 is not the actual price end-users would pay) we can expect universal whole-genome sequencing, vast improvements in PGSs, and pervasive usage of genetics in medicine in the near future.

Extrapolating GPT-N performance: "Close-to-optimal performance on these benchmarks seems like it’s at least ~3 orders of magnitude compute away [...] Taking into account both software improvements and potential bottlenecks like data, I’d be inclined to update that downwards, maybe an order of magnitude or so (for a total cost of ~$10-100B). Given hardware improvements in the next 5-10 years, I would expect that to fall further to ~$1-10B."

Fund people, not projects I: The HHMI and the NIH Director's Pioneer Award. "Ultimately it's hard to disagree with Azoulay & Li (2020), we need a better science of science! The scientific method needs to examine the social practice of science as well, and this should involve funders doing more experiments to see what works. Rather than doing whatever is it that they are doing now, funders should introduce an element of explicit randomization into their process."

It will take more than a few high-profile innovations to end the great stagnation. "And if you sincerely believe that we are in a new era of progress, then argue for it rigorously! Show it in the data. Revisit the papers that were so convincing to you a year ago, and go refute them directly."

The Rest

Why Are Some Bilingual People Dyslexic in English but Not Their Other Language? I'm not entirely sure about the explanations proposed in the article, but it's fascinating nonetheless.

The Centre for Applied Eschatology: "CAE is an interdisciplinary research center dedicated to practical solutions for existential or global catastrophe. We partner with government, private enterprise, and academia to leverage knowledge, resources, and diverse interests in creative fusion to bring enduring and universal transformation. We unite our age’s greatest expertise to accomplish history’s greatest task."

Labor share has been decreasing over the past decades, but without a corresponding increase in the capital share of income. Where does the money go? This paper suggests: housing costs. Home ownership as investment may have seemed like a great idea in the past, but now we're stuck in this terrible equilibrium where spiraling housing costs are causing huge problems but it would be political suicide to do anything about it. It's easy to say "LVT now!" but good luck formulating a real plan to make it reality.

1/4 of animals used in research are included in published papers. Someone told me this figure is surprisingly high. Unfortunately there's no data in the paper breaking down unpublished null results vs bad data/failed experiments/etc.

@Evolving_Moloch reviews Rutger Bregman's Humankind. "Bregman presents hunter-gatherer societies as being inherently peaceful, antiwar, equal, and feminist likely because these are commonly expressed social values among educated people in his own society today. This is not history but mythology."

@ArtirKel reviews Vinay Prasad's Malignant, with some comments on progress in cancer therapy and the design of clinical trials. "The whole system is permeated by industry-money, with the concomitant perverse incentives that generates."

@Cerebralab2 reviews Nick Lane's Power, Sex, Suicide: Mitochondria and the meaning of life. "The eukaryotic cell appeared much later (according to the mainstream view) and in the space of just a few hundred million years—a fraction of the time available to bacteria—gave rise to the great fountain of life we see all around us."

Is the great filter behind us? The Timing of Evolutionary Transitions Suggests Intelligent Life Is Rare. "Together with the dispersed timing of key evolutionary transitions and plausible priors, one can conclude that the expected transition times likely exceed the lifetime of Earth, perhaps by many orders of magnitude. In turn, this suggests that intelligent life is likely to be exceptionally rare." (Highly speculative, and there are some assumptions one might reasonably disagree with.)

How to Talk When a Machine is Listening: Corporate Disclosure in the Age of AI. "Companies [...] manage the sentiment and tone of their disclosures to induce algorithmic readers to draw favorable conclusions about the content."

On the Lambda School stats. "If their outcomes are actually good, why do they have to constantly lie?"

On the rationalism-to-trad pipeline. (Does such a pipeline actually exist?) "That "choice" as a guiding principle is suspect in itself. It's downstream from hundreds of factors that have nothing to do with reason. Anything from a leg cramp to an insult at work can alter the "rational" substrate significantly. Building civilizations on the quicksand of human whim is hubris defined."

There is a wikipedia article titled List of nicknames used by Donald Trump.

Why Is There a Full-Scale Replica of the Parthenon in Nashville, Tennessee?

We Are What We Watch: Movie Plots Predict the Personalities of Those who “Like” Them. An amusing confirmation of stereotypes: low extraversion people are anime fanatics, low agreeableness people like Hannibal, and low openness people just have terrible taste (under a more benevolent regime they might perhaps be prohibited from consuming media).

A short film based on Blindsight. Won't make any sense if you haven't read the book, but it looks great.

12 Hours of Powerline Noise from Serial Experiments Lain.

And here is a Japanese idol shoegaze group. They're called ・・・・・・・・・ and their debut album is 「」.

What I've Been Reading

The History of the Decline and Fall of the Roman Empire by Edward Gibbon. Fantastic. Consistently entertaining over almost 4k pages. Gibbon's style is perfect. I took it slow, reading it over 364 days...and I would gladly keep going for another year. Full review forthcoming.
The Adventures and Misadventures of Maqroll by Álvaro Mutis. A lovely collection of 7 picaresque novellas that revolve around Maqroll, a cosmopolitan vagabond of the seas. Stylistically rich and sumptuous. It's set in a kind of parallel maritime world, the world of traders and seamen and port city whores. Very melancholy, with doomed business ventures at the edge of civilization, doomed loves, doomed lives, and so on. While the reader takes vicarious pleasure in the "nomadic mania" of Maqroll, the underlying feeling is one of fundamental dissatisfaction with what life has to offer—ultimately the book is about our attempts to overcome it. Reminiscent of Conrad, but also Herzog's diaries plus a bit of Borges in the style.
Pandora’s Box: A History of the First World War by Jörn Leonhard. A comprehensive, single-volume history of WWI from a German author. It goes far beyond military history: besides the battles and armaments it covers geopolitics and diplomacy, national politics, economics, public opinion, morale. All fronts and combatants are explored, and all this squeezed into just 900 pages (some things are inevitably left out - for example no mention is made of Hoover's famine relief efforts). Its approach is rather abstract, so if you're looking for a visceral description of the trenches this isn't the book for you. The translation isn't great, and it can get a bit dry and repetitive, but overall it's a very impressive tome. n.b. the hardcover edition from HUP is astonishingly bad and started falling apart immediately. (Slightly longer review on goodreads.)
To Hold Up the Sky by Liu Cixin. A new short story collection. Not quite at the same level as the Three Body Trilogy, but there are some good pieces. I particularly enjoyed two stories about strange and destructive alien artists: Sea of Dreams (in which an alien steals all the water on earth for an orbital artwork), and Cloud of Poems (in which a poetry contest ultimately destroys the solar system, sort of a humanistic scifi take on The Library of Babel).
Creating Future People: The Ethics of Genetic Enhancement by Jonathan Anomaly. A concise work on the ethical dilemmas posed by genetic enhancement technology. It's written by a philosopher, but uses a lot of ideas from game theory and economics to work out the implications of genetic enhancement. Despite its short length, it goes into remarkable practical detail on things like how oxytocin affects behavior, the causes of global wealth inequality, and the potential of genetic editing to decrease the demand for plastic surgery. On the other hand, I did find it somewhat lacking (if not evasive) in its treatment of more general and abstract philosophical questions, such as: under what conditions is it acceptable to hurt people today in order to help future people?
The Life and Opinions of Tristram Shandy, Gentleman by Laurence Sterne. Famously the "first postmodern novel", this fictional biography from the 1760s is inventive, bawdy, and generally really strange and crazy. Heavily influenced by Don Quixote, parodies various famous writers of the time. Schopenhauer loved it and a young Karl Marx drew inspiration from it when writing Scorpion and Felix! I admire its ethos, and it's sometimes very funny. But ironic shitposting is still shitposting, and 700 pages of shitposting is a bit...pleonastic. At one point the narrator explains his digressions in the form of a line, one for each volume:
Illuminations by Walter Benjamin. Doesn't really live up to the hype. The essays on Proust and Baudelaire are fine, the hagiography of Brecht feels extremely silly in retrospect. The myopia of extremist 1930s politics prevents him from seeing very far.
Omensetter's Luck by William Gass. An experimental novel that does a great job of evoking 19th century rural America. Omensetter is a beguiling, larger-than-life figure, a kind of natural animal of a man. Nowhere near as good as The Tunnel and not much easier to read either. It's clearly an early piece, before Gass had fully developed his style.
The Silence: A Novel by Don DeLillo. Not so much a novel as a sketch of one. Not even undercooked, completely raw. It's about a sudden shutdown of all technology. Stylistically uninteresting compared to his other work. Here's a good review.
Little Science, Big Science by Derek John de Solla Price. Purports to be about the science of science, but really mostly an exploration of descriptive statistics over time - number of scientists, the distribution of their productivity and intelligence, distribution across countries, citations, and so on. Should have been a blog post. Nice charts, worth skimming just for them. (Very much out of print, but you can grab a pdf from the internet archive or libgen).
Experiment and the Making of Meaning: Human Agency in Scientific Observation and Experiment by David Gooding. Skimmed it. Written in 1990 but feels very outdated, nobody cares about observation sentences any more and they didn't in 1990 either. Some interesting points about the importance of experiment (as opposed to theory) in scientific progress. On the other hand all the fluffy stuff about "meaning" left me completely cold.
The Subjective Side of Science: A Philosophical Inquiry into the Psychology of the Apollo Moon Scientists by Ian Mitroff. Based on a series of structured interviews with geologists working on the Apollo project. Remarkably raw. Mitroff argues in favor of the subjective side, the biased side, of how scientists actually perform science in the real world. On personality types, relations between scientists, etc. There are some silly parts, like an attempt to tie Jungian psychology with the psychological clusters in science, and a very strange typology of scientific approaches toward the end, but overall it's above-average for the genre.
On Writing: A Memoir of the Craft by Stephen King. A pleasant autobiography combined with some tips on writing. The two parts don't really fit together very well. This was my first King book, I imagine it's much better if you're a fan (he talks about his own novels quite a lot).
The Lord Chandos Letter And Other Writings by Hugo von Hofmannsthal. A collection of short stories plus the titular essay. If David Lynch had been a symbolist writer, these are the kinds of stories he would have produced. Vague, mystical, dreamlike, impressionistic. I found them unsatisfying, and they never captured my interest enough to try to disentangle the symbols and allegories. The final essay about the limitations of language is worth reading, however.
Ghost Soldiers: The Forgotten Epic Story of World War II's Most Dramatic Mission by Hampton Sides. An account of a daring mission to rescue POWs held by the Japanese in the Philippines. The mission itself is fascinating but fairly short, and the book is padded with a lot of background info that is nowhere near as interesting (though it does set the scene). Parts of it are brutal and revolting beyond belief.
We Are Legion (We Are Bob) by Dennis Taylor. An interesting premise: a story told from the perspective of a sentient von Neumann probe. That premise is sort-of squandered by a juvenile approach filled to the brim with plot holes, and an inconclusive story arc: it's just setting up the sequels. Still, it's pretty entertaining. If you want a goofy space opera audiobook to listen to while doing other stuff, I'd recommend it.
Rocket Men: The Daring Odyssey of Apollo 8 and the Astronauts Who Made Man's First Journey to the Moon by Robert Kurson. Focused on the personalities, personal lives, and families of the three astronauts on Apollo 8: Frank Borman, William Anders, and James Lovell, set against the tumultuous political situation of late 1960s America. Written in a cinematic style, there's little on the technical/organizational aspects of Apollo 8. Its treatment of the cold war is rather naïve. The book achieves its goals, but I was looking for something different. Ray Porter (who I usually like) screws up the narration of the audiobook with an over-emotive approach, often emphasizing the wrong words. Really strange.

Book Review: The Idiot

2020-11-30

In 1969, Alfred Appel declared that Ada or Ardor was "the last 19th-century Russian novel". Now we have in our hands a new last 19th-century Russian novel—perhaps even the final one. And while Nabokov selected the obvious and trivial task of combining the Russian and American novels, our "Dostoyevsky" (an obvious pseudonym) has given himself the unparalleled and interminably heroic mission of combining the Russian novel with the Mexican soap opera. I am pleased to report that he has succeeded in producing a daring postmodern pastiche that truly evokes the 19th century.

The basic premise of The Idiot is lifted straight from Nietzsche's Antichrist 29-31:

To make a hero of Jesus! And even more, what a misunderstanding is the word 'genius'! Our whole concept, our cultural concept, of 'spirit' has no meaning whatever in the world in which Jesus lives. Spoken with the precision of a physiologist, even an entirely different word would be yet more fitting here—the word idiot. [...] That strange and sick world to which the Gospels introduce us — a world like that of a Russian novel, in which refuse of society, neurosis and ‘childlike’ idiocy seem to make a rendezvous.

The novel opens with Prince Lev Nikolayevich Myshkin, the titular idiot, returning by train to Russia after many years in the hands of a Swiss psychiatrist. He is a Christlike figure, as Nietzsche puts it "a mixture of the sublime, the sickly, and the childlike", a naïf beset on all sides by the iniquities of the selfish and the corruptive influence of society. Being penniless, he seeks out a distant relative and quickly becomes entangled in St. Petersburg society.

If this were really a 19th century novel, it would follow a predictable course from this point: the "idiot" would turn out to be secretly wiser than everyone, the "holy fool" would speak truths inaccessible to normal people, his purity would highlight the corruption of the world around him, his naiveté would ultimately be form of nobility, and so on.

Instead, Myshkin finds himself starring in a preposterous telenovela populated by a vast cast of absurdly melodramatic characters. He quickly receives an unexpected inheritance that makes him wealthy, and is then embroiled in a web of love and intrigue. As in any good soap opera, everything is raised to extremes in this book: there are no love triangles because three vertices would not be nearly enough; instead there are love polyhedrons, possibly in four or seven dimensions.

Myshkin's first love interest is the intimidating, dark, and self-destructive Nastasya Fillipovna. An orphan exploited by her guardian, she is the talk of the town and chased by multiple suitors, including the violent Rogozhin and the greedy Ganya. Myshkin thinks she's insane but pities her so intensely that they have an endless and tempestuous on-again off-again relationship, which includes Nastasya skipping out on multiple weddings. In the construction of this character I believe I detect the subtle influence of the yandere archetype from Japanese manga.

The second woman in Myshkin's life is the young and wealthy Aglaya Ivanovna: proud, snobbish, and innocent, she cannot resist mocking Myshkin, but at the same time is deeply attracted to him. Whereas Nastasya loves Myshkin but thinks she's not good enough for him, Aglaya loves him but thinks she's too good for him.

The main cast is rounded off by a bunch of colorful characters, including the senile general Epanchin, various aristocrats, a boxer, a religious maniac, and Ippolit the nihilist who spends 600 pages in a permanent state of almost-dying (consumption, of course) and even gets a suicide fakeout scene that would make the producers of The Young and the Restless blush.

As Myshkin's relationships develop, he is always kind, non-judgmental, honest and open with his views. But this is not the story of good man taken advantage of, but rather the story of a man who is simply incapable of living in the real world. Norm Macdonald, after seeing the musical Cats, exclaimed: "it's about actual cats!" The reader of The Idiot will inevitably experience the same shock of recognition, muttering "Mein Gott, it's about an actual idiot!" His behavior ends up hurting not only himself, but also the people around him, the people he loves. In the climax, Nastasya and Aglaya battle for Myshkin's heart, but it's a disaster as he makes all the wrong choices.

That's not to say that it's all serious; the drama is occasionally broken up by absurdist humor straight out of Monty Python:

Everyone realized that the resolution of all their bewilderment had begun.
‘Did you receive my hedgehog?’ she asked firmly and almost angrily.

Postmodern games permeate the entire novel: for example, what initially appears to be an omniscient narrator is revealed in the second half to simply be another character (a deeply unreliable one at that); one who sees himself as an objective reporter of the facts, but is really a gossip and rumourmonger. Toward the end he breaks the fourth wall and starts going on bizarre digressions that recall Tristram Shandy: at one point he excuses himself to the reader for digressing too far, then digresses even further to complain about the quality of the Russian civil service. The shifts in point of view become disorienting and call attention to the artificial nature of the novel. Critically, he never really warms up to Myshkin:

In presenting all these facts and refusing to explain them, we do not in the least mean to justify our hero in the eyes of our readers. More than that, we are quite prepared to share the indignation he aroused even in his friends.

Double Thoughts and Evolutionary Psychology

The entire novel revolves around the idea of the "double thought", an action with two motives: one pure and conscious, the other corrupt and hidden. Keller comes to Myshkin in order to confess his misdeeds, but also to use the opportunity to borrow money. Awareness of the base motive inevitably leads to guilt and in some cases self-destructive behavior. This is how Myshkin responds:

You have confused your motives and ideas, as I need scarcely say too often happens to myself. I can assure you, Keller, I reproach myself bitterly for it sometimes. When you were talking just now I seemed to be listening to something about myself. At times I have imagined that all men were the same,’ he continued earnestly, for he appeared to be much interested in the conversation, ‘and that consoled me in a certain degree, for a DOUBLE motive is a thing most difficult to fight against. I have tried, and I know. God knows whence they arise, these ideas that you speak of as base. I fear these double motives more than ever just now, but I am not your judge, and in my opinion it is going too far to give the name of baseness to it—what do you think? You were going to employ your tears as a ruse in order to borrow money, but you also say—in fact, you have sworn to the fact— that independently of this your confession was made with an honourable motive.¹

The "double thought" is an extension of the concept of self-deception invented by evolutionary psychologist Robert Trivers, and, simply put, this book could not have existed without his work. Trivers has been writing about self-deception since the 70s in academic journals and books (including his 2011 book The Folly of Fools). The basic idea is that people subconsciously deceive themselves about the true motives of their actions, because it's easier to convince others when you don't have to lie.

Dostoyevsky's innovation lies in examining what happens when someone becomes aware of their subconscious motives and inevitably feels guilty. There is empirical evidence that inhibition of guilt makes deception more effective, but this novel inverts that problem and asks the question: what happens when that inhibition fails and guilt takes over? The author's penetrating psychological analysis finds a perfect home in the soap opera setting, as the opposition of extreme emotions engendered by the double thought complements the melodrama. Dostoyevsky even goes a step further, and argues that self-consciousness of the double thought is a double thought in itself: "I couldn't help thinking ... that everyone is like that, so that I even began patting myself on the back". There is no escape from the signaling games we play. The complexity of unconscious motives is a recurring theme:

Don't let us forget that the causes of human actions are usually immeasurably more complex and varied than our subsequent explanations of them.

In a move of pure genius, Dostoyevsky plays with this idea on three levels in parallel: first, the internal contrast between pure and corrupt motives within each person; second, the external contrast between the pure Idiot Myskin and the corrupt society around him; third, on the philosophical level of Dionysus versus The Crucified. And in the end he comes down squarely in the camp of Dionysus and against Myshkin. Just as the Idiot is not ultimately good, so consciousness and the innocent motivations are not good either: the novel decides the issue strongly in favor of the corrupt motive, in favor of instinct over ratiocination, in favor of Dionysus over Apollo, in favor of the earthly over Christianity. We must live our lives in this world and deal with it as it is.

Double Anachronism

In the brilliant essay The Argentine Writer and Tradition, Borges writes that "what is truly native can and often does dispense with local color". Unfortunately Mssr. Dostoyevsky overloads his novel with local color, which in the end only highlights its artificiality. The lengths to which he has gone to make this novel appear as if it were a real product of the 19th century are admirable, but by overextending himself, he undermines the convincing (though fantastic) anachronism; like a double thought, the underlying deception leaks out and ruins everything. In a transparent and desperate reach for verisimilitude, he has included a series of references to real crimes from the 1860s. One cannot help but imagine the author bent over some dusty newspaper archive in the bowels of the National Library on Nevsky Prospekt, mining for details of grisly murders and executions.

Unfortunately The Idiot is anachronistic in more ways than one: as the juvenile influence of Der Antichrist hints, Dostoyevsky is a fervent anti-Christian who epitomizes the worst excesses of early-2000s New Atheism. Trivers wrote the foreword to Dawkins's The Selfish Gene, so it is no surprise that Dostoyevsky would be part of that intellectual tradition. But the heavy-handed anti-religious moralizing lacks nuance and gets old fast. His judgment of Myshkin, the representative of ideal Christianity, is heavy, but on top of that he also rants about the Catholic church, the representative of practical Christianity. He leaves no wiggle room in his condemnations.

And the lesson he wants to impart is clear: that Christianity is not only impractical and hypocritical, but actively ruins the lives of the people it touches. But these views hardly pass the smell test. While Dostoyevsky has mastered evolutionary psychology, he seems to have ignored cultural evolution. As Joe Henrich lays out in his latest book, The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous, the real-world influence of Christianity as a pro-social institution is a matter that deserves a far more nuanced examination. So if I could give Mssr. Dostoyevsky one piece of advice it would be this: less Dawkins, more Henrich please. After all, a devotee of Nietzsche should have a more subtle view of these things.

1.See also Daybreak 523: "With everything that a person allows to become visible one can ask: What is it supposed to hide?" ↩

Metascience and Philosophy

2020-11-28

It has been said that philosophy of science is as useful to scientists as ornithology is to birds. But perhaps it can be useful to metascientists?

State of Play

Philosophy

In the 20th century, philosophy of science attracted first-rate minds: scientists like Henri Poincaré, Pierre Duhem, and Michael Polanyi, as well as philosophers like Popper, Quine, Carnap, Kuhn, and Lakatos. Today the field is a backwater, lost in endless debates about scientific realism which evoke the malaise of medieval angelology.¹ Despite being part of philosophy, however, the field made actual progress, abandoning simplistic early models for more sophisticated approaches with greater explanatory power. Ultimately, philosophers reached one of two endpoints: some went full relativist,² while others (like Quine and Laudan) bit the bullet of naturalism and left the matter to metascientists and psychologists.³ "It is an empirical question, which means promote which ends".

Metascience

Did the metascientists actually pick up the torch? Sort of. There is some overlap, but (with the exception of the great Paul Meehl) they tend to focus on different problems. The current crop of metascientists is drawn, like sharks to blood, to easily quantifiable questions about the recent past (with all those p-values sitting around how could you resist analyzing them?). They focus on different fields, and therefore different problems. They seem hesitant to make normative claims. Less tractable questions about forms of progress, norms, theory selection, etc. have fallen by the wayside. Overall I think they underrate the problems posed by philosophers.

Rational Reconstruction

In The History of Science and Its Rational Reconstructions Lakatos proposed that theories of scientific methodology function as historiographical theories and can be criticized or compared to each other by using the theories to create "rational historical reconstructions" of scientific progress. The idea is simple: if a theory fails to rationally explain the past successes of science, it's probably not a good theory, and we should not adopt its normative tenets. As Lakatos puts it, "if the rationality of science is inductive, actual science is not rational; if it is rational, it is not inductive." He applied this "Pyrrhonian machine de guerre" not only to inductivism and confirmationism, but also to Popper.

The main issue with falsification boils down to the problem of auxiliary hypotheses. On the one hand you have underdetermination (the Duhem-Quine thesis): testing hypotheses in isolation is not possible, so when a falsifying result comes out it's not clear where the modus tollens should be directed. On the other hand there is the possibility of introducing new auxiliary hypotheses to "protect" an existing theory from falsification. These are not merely abstract games for philosophers, but very real problems that scientists have to deal with. Let's take a look at a couple of historical examples from the perspective of naïve falsificationism.

First, Newton's laws. They were already falsified at the time of publication: they failed to correctly predict the motion of the moon. In the words of Newton, "the apse of the Moon is about twice as swift" as his predictions. Despite this falsification, the Principia attracted followers who worked to improve the theory. The moon was no small problem and took two decades to solve with the introduction of new auxiliary hypotheses.

A later episode involving Newton's laws illustrates how treacherous these auxiliary hypotheses can be. In 1846 Le Verrier (I have written about him before) solved an anomaly in the orbit of Uranus by hypothesizing the existence of a new planet. That planet was Neptune and its discovery was a wonderful confirmation of Newton's laws. A decade later Le Verrier tried to solve an anomaly in the orbit of Mercury using the same method. The hypothesized new planet was never found and Newton's laws remained at odds with the data for decades (yet nobody abandoned them). The solution was only found in 1915 with Einstein's general relativity: Newton should have been abandoned this time!

Second, Prout's hypothesis: in 1815 William Prout proposed that the atomic weights of all elements were multiples of the atomic weight of hydrogen. A decade later, chemists measured the atomic weight of chlorine at 35.45x that of hydrogen and Prout's hypothesis was clearly falsified. Except, a century after that, isotopes were discovered: variants of chemical elements with different neutron numbers. Turns out that natural chlorine is composed of 76% ³⁵Cl and 24% ³⁷Cl, hence the atomic weight of 35.45. Whoops! So here we have a case where falsification depends on an auxiliary hypothesis (no isotopes) which the experimenters have no way of knowing.⁴

Popper tried to rescue falsificationism through a series of unsatisfying ad-hoc fixes: exhorting scientists not to be naughty when introducing auxiliary hypotheses, and saying falsification only applies to "serious anomalies". When asked what a serious anomaly is, he replied: "if an object were to move around the Sun in a square"!⁵

Problem, officer?

There are a few problems with rational reconstruction, and while I don't think any of them are fatal, they do mean we have to tread carefully.

External factors: no internal history of science can explain the popularity of Lysenkoism in the USSR—sometimes we have to appeal to external factors. But the line between internal and external history is unclear, and can even depend on your methodology of choice.

Meta-criterion choice: what criteria do you use to evaluate the quality of a rational reconstruction? Lakatos suggested using the criteria of each theory (eg use falsificationism to judge falsificationism) but he never makes a good case for that vs a standardized set of meta-criteria.

Case studies: philosophers tend to argue using case studies and it's easy to find one to support virtually any position, even if its normative suggestions are suboptimal. Lots of confirmation bias here. The illustrious Paul Meehl correctly argues for the use of "actuarial methods" instead. "Absent representative sampling, one lacks the database needed to best answer or resolve these types of inherently statistical questions." The metascientists obviously have a great methodological advantage here.

Fake history: the history of science as we read it today is sanitized if not fabricated.⁶ Successes are remembered and failures thrown aside; chaotic processes of discovery are cleaned up for presentation. As Peter Medawar noted in Is the scientific paper a fraud?, the "official record" of scientific progress contains few traces of the messy process that actually generated said progress.⁷ He further argues that there is a desire to conform to a particular ideal of induction which creates a biased picture of how scientific discovery works.

Falsification in Metascience

Now, let's shift our gaze to metascience. There's a fascinating subgenre of psychology in which researchers create elaborate scientific simulations and observe subjects as they try to make "scientific discoveries". The results can help us understand how scientific reasoning actually happens, how people search for hypotheses, design experiments, create new concepts, and so on. My favorite of these is Dunbar (1993), which involved a bunch of undergraduate students trying to recreate a Nobel-winning discovery in biochemistry.⁸

Reading these papers one gets the sense that there is a falsificationist background radiation permeating everything. When the subjects don't behave like falsificationists, it's simply treated as an error or a bias. Klahr & Dunbar scold their subjects: "our subjects frequently maintained their current hypotheses in the face of negative information". And within the tight confines of these experiments it's usually true that it is an error. But this reflects the design of the experiment rather than any inherent property of scientific reasoning or progress, and extrapolating these results to real-world science in general would be a mistake.

Sociology offers a cautionary tale about what happens when you take this kind of reasoning to an extreme: the strong programme people started with an idealistic (and wrong) philosophy of science, they then observed that real-world science does not actually operate like that, and concluded that it's all based on social forces and power relations, descending into an abyss of epistemological relativism. To reasonable people like you and me this looks like an excellent reductio ad absurdum, but sociologists are a special breed and one man’s modus ponens is another man’s modus tollens. The same applies to over-extensions of falsificationism. Lakatos:

...those trendy 'sociologists of knowledge' who try to explain the further (possibly unsuccessful) development of a theory 'falsified' by a 'crucial experiment' as the manifestation of the irrational, wicked, reactionary resistance by established authority to enlightened revolutionary innovation.

One could also argue that the current focus on replication is too narrow. The issue is obscured by the fact that in the current state of things the original studies tend to be very weak, the "theories" do not have track records of success, and the replications tend to be very strong, so the decision is fairly easy. But one can imagine a future scenario in which failed replications should be treated with far more skepticism.

There are also some empirical questions in this area that are ripe for the picking: at which point do scientists shift their beliefs to the replication over the original? What factors do they use? What do they view a falsification as actually refuting (ie where do they direct the modus tollens)? Longitudinal surveys, especially in the current climate of the social sciences, would be incredibly interesting.

Unit of Progress

One of the things philosophers of science are in agreement about is that individual scientists cannot be expected to behave rationally. Recall the example of Prout and the atomic weight of chlorine above: Prout simply didn't accept the falsifying results, and having obtained a value of 35.83 by experiment, rounded it to 36. To work around this problem, philosophers instead treated wider social or conceptual structures as the relevant unit of progress: "thinking style groups" (Fleck), "paradigms" (Kuhn), "research programmes" (Lakatos), "research traditions" (Laudan), etc. When a theory is tested, the implications of the result depend on the broader structure that theory is embedded in. Lakatos:

We have to study not the mind of the individual scientist but the mind of the Scientific Community. [...] Kuhn certainly showed that psychology of science can reveal important-and indeed sad-truths. But psychology of science is not autonomous; for the-rationally reconstructed-growth of science takes place essentially in the world of ideas, in Plato's and Popper's 'third world'.

Psychologists are temperamentally attracted to the individual, and this is reflected in their metascientific research methods which tend to focus on individual scientists' thinking, or isolated papers. Meehl, for example, simply views this as an opportunity to optimize individuals' cognitive performance:

The thinking of scientists, especially during the controversy or theoretical crises preceding Kuhnian revolutions, is often not rigorous, deep, incisive, or even fair-minded; and it is not "objective" in the sense of interjudge reliability. Studies of resistance to scientific discovery, poor agreement in peer review, negligible impact of most published papers, retrospective interpretations of error and conflict all suggest suboptimal cognitive performance.

Given the importance of broader structures however, things that seem irrational from the individual perspective might make sense collectively. Institutional design is criminally under-explored, and the differences in attitudes both over time and over the cross section of scientists are underrated objects of study.

You might retort that this is a job for the sociologists, but look at what they have produced: on the one hand they gave us Robert Merton, and on the other hand the strong programme. They don't strike me as particularly reliable.

Fields & Theories

Almost all the scientists doing philosophy of science were physicists or chemists, and the philosophers stuck to those disciplines in their analyses. Today's metascientists on the other hand mostly come from psychology and medicine. Not coincidentally, they tend to focus on psychology and medicine. These fields tend to have different kinds of challenges compared to the harder sciences: the relative lack of theory, for example, means that today's metascientists tend to ignore some of the most central parts of philosophy of science, such as questions about Lakatos's "positive heuristic" and how to judge auxiliary hypotheses, questions about whether the logical or empirical content of theories is preserved during progress, questions about how principles of theory evaluation change over time, and so on.

That's not to say no work at all has been done in this area, for example Paul Meehl⁹ tried to construct a quantitative index of a theory's track record that could then be used to determine how to respond to a falsifying result. There's also some similar work from a Bayesian POV. But much more could be done in this direction, and much of it depends on going beyond medicine and the social sciences. "But Alvaro, I barely understand p-values, I could never do the math needed to understand physics!" If the philosophers could do it then so can the psychologists. But perhaps these problems require broader interdisciplinary involvement: not only specialists from other fields, but also involvement from neuroscience, computational science, etc.

What is progress?

One of the biggest questions the philosophers tried to answer was how progress is made, and how to even define it. Notions of progress as strictly cumulative (ie the new theory has to explain everything explained by the old one) inevitably lead to relativism, because theories are sometimes widely accepted at an "early" stage when they have limitations relative to established ones. But what is the actual process of consensus formation? What principles do scientists actually use? What principles should they use? Mertonian theories about agreement about standards/aims are clearly false, but we don't have anything better to replace them. This is another question that depends on looking beyond psychology, toward more theory-oriented fields.

Looking Ahead

Metascience can continue the work and actually solve important questions posed by philosophers:

Is there a difference between mature and immature fields? Should there be?
What guiding assumptions are used for theory choice? Do they change over time, and if yes how are they accepted/rejected? What is the best set of rules? Meehl's suggestions are a good starting point: "We can construct other indexes of qualitative diversity, formal simplicity, novel fact predictivity, deductive rigor, and so on. Multiple indexes of theoretical merit could then be plotted over time, intercorrelated, and related to the long-term fate of theories."
Can we tell, in real time, which fields are progressing and which are degenerating? If not, is this an opening for irrationalism? What factors should we use to decide whether to stick with a theory on shaky ground? What factors should we use to judge auxiliary hypotheses?¹⁰ Meehl started doing good work in this area, let's build on it.
Does null hypothesis testing undermine progress in social sciences by focusing on stats rather than the building of solid theories as Meehl thought?
Is it actually useful, as Mitroff suggests, to have a wide array of differently-biased scientists working on the same problems? (At least when there's lots of uncertainty?)
Gholson & Barker 1985 applied Lakatos and Laudan's theories to progress in physics and psychology (arguing that some areas of psychology do have a strong theoretical grounding), but this should be taken beyond case studies: comparative approaches with normative conclusions. Do strong theories really help with progress in the social sciences? Protzko et al 2020 offer some great data with direct normative applications, much more could be done in this direction.
And hell, while I'm writing this absurd Christmas list let me add a cherry on top: give me a good explanation of how abduction works!

The Riddle of Sweden's COVID-19 Numbers

2020-11-03

Comparing Sweden's COVID-19 statistics to other European countries, two peculiar features emerge:

Despite very different policies, Sweden has a similar pattern of cases.
Despite a similar pattern of cases, Sweden has a very different pattern of deaths.

Sweden's Strategy

What exactly has Sweden done (and not done) in response to COVID-19?

The government has banned large public gatherings.
The government has partially closed schools and universities: lower secondary schools remained open while older students stayed at home.
The government recommends voluntary social distancing. High-risk groups are encouraged to isolate.
Those with symptoms are encouraged to stay at home.
The government does not recommend the use of masks, and surveys confirm that very few people use them (79% "not at all" vs 2% in France, 0% in Italy, 11% in the UK).
There was a ban on visits to care homes which was lifted in September.
There have been no lockdowns.

How has it worked? Well, Sweden is roughly at the same level as other western European countries in terms of per capita mortality, but it's also doing much worse than its Nordic neighbors. Early apocalyptic predictions have not materialized. Economically it doesn't seem to have gained much, as its Q2 GDP drop was more or less the same as that of Norway and Denmark.¹

Case Counts

Sweden has followed a trajectory similar to other Western countries with the first wave in April, a pause during the summer (Sweden took longer to settle down, however), and now a second wave in autumn.²

The fact that the summer drop-off in cases happened in Sweden without lockdowns and without masks suggests that perhaps those were not the determining factors? It doesn't necessarily mean that lockdowns are ineffective in general, just that in this particular case the no-lockdown counterfactual probably looks similar.

The similarity of the trajectories plus the timing points to a common factor: climate.

Seasonality?

This sure looks like a seasonal pattern, right? And there are good a priori reasons to think COVID-19 will be slow to spread in summer: the majority of respiratory diseases all but disappear during the warmer months. This chart from Li, Wang & Nair (2020) shows the monthly activity of various viruses sorted by latitude:

The exact reasons are unclear, but it's probably a mix of temperature, humidity,³ behavioral factors, UV radiation, and possibly vitamin D.

However, when it comes to COVID-19 specifically there are reasons to be skeptical. The US did not have a strong seasonal pattern:

And in the southern hemisphere, Australia's two waves don't really fit a clear seasonal pattern. [Edit: or perhaps it does fit? Their second wave was the winter wave; climate differences and lockdowns could explain the differences from the European pattern?]

The WHO (yes, yes, I know) says it's all one big wave and covid-19 has no seasonal pattern like influenza. A report from the National Academy of Sciences is also very skeptical about seasonality, making comparisons to SARS and MERS which do not exhibit seasonal patterns.

A review of 122 papers on the seasonality of COVID-19 is mostly inconclusive, citing lack of data and problems with confounding from control measures, social, economic, and cultural conditions. The results in the papers themselves "offer mixed offer mixed statistical support (none, weak, or strong relationships) for the influence of environmental drivers." Overall I don't think there's compelling evidence in favor of climatic variables explaining a large percentage of variation in COVID-19 deaths. So if we can't attribute the summer "pause" and autumn "second wave" in Europe to seasonality, what is the underlying cause?

Schools?

If not the climate, then I would suggest schools, but the evidence suggests they play a very small role. I like this study from Germany which uses variation in the timing of summer breaks across states, finding no evidence for an effect on new cases. This paper utilizes the partial school closures in Sweden and finds open schools had only "minor consequences". Looking at school closures during the SARS epidemic the results are similar. The ECDC is not particularly worried about schools, arguing that outbreaks in educational facilities are "exceptional events" that are "limited in number and size".

So what are we left with? Confusion.

Deaths

This chart shows daily new cases and new deaths for all of Europe:

There's a clear relationship between cases & deaths, with a lag of a few weeks as you would expect (and a change in magnitude due to increased testing and decreasing mortality rates). Here's what Sweden's chart looks like:

What is going on here? Fatality rates have been dropping everywhere, but cases and deaths appear to be completely disconnected in Sweden. Even the first death peak doesn't coincide with the first case peak, but that's probably because of early spread in nursing homes.

Are they undercounting deaths? I don't think so, total deaths seem to be below normal levels (data from euromomo):

So how do we explain the lack of deaths in Sweden?

Age?

Could it be that only young people are catching it in Sweden? ~~I haven't found any up to date, day-by-day breakdowns by age, but comparing broad statistics for Sweden and Europe as a whole, they look fairly similar.~~ Even if age could explain it, why would that be the case in Sweden and not in other countries? Why aren't the young people transmitting it to vulnerable old people? Perhaps it's happening and the lag is long enough that it's just not reflected in the data yet?

[Edit: thanks to commenter Frank Suozzo for pointing out that cases are concentrated in lower ages. I have found data from July 31 on the internet archive; comparing it to the latest figures, it appears that old people have managed to avoid getting covid in Sweden! Here's the chart showing total case counts:]

Improved Treatment?

Mortality has declined everywhere, and part of that is probably down to improved treatment. But I don't see Sweden doing anything unique which could explain the wild discrepancy.

~~Again I'm left confused about these cross-country differences. If you have any good theories I would love to hear them.~~ Looks like age is the answer.

1.I think the right way to look at this is to say that Sweden has underperformed given its cultural advantages. The differences between Italian-, French-, and German-speaking cantons in Switzerland suggest a large role for cultural factors. Sweden should've followed a trajectory similar to its neighbors rather than one similar to Central/Southern Europe. Of course it's hard to say how things will play out in the long run. ↩
2.Could this all be just because of increased testing? No. While testing has increased, the rate of positive tests has also risen dramatically. The second wave is not a statistical artifact. ↩
3.Humidity seems very important, at least when it comes to influenza. See eg Absolute Humidity and the Seasonal Onset of Influenzain the Continental United States and Absolute humidity modulates influenza survival, transmission, and seasonality. There's even experimental evidence here, some papers: High Humidity Leads to Loss of Infectious Influenza Virus from Simulated Coughs, Humidity as a non-pharmaceutical intervention for influenza A. ↩