The book is still A Square Meal: A Culinary History of the Great Depression, recommended to us by reader Phil Wagner, and Part I of the review is here.
Condescension, Means Testing, and Infinite Busybodies
The other big thing we learned from A Square Meal, besides the fact that food in the 1920s was bonkers, is that the Great Depression brought out the absolute worst in the American political machine: the tendency for condescension to the unfortunate, constant means testing to make sure the needy are really as needy as they say, and infinite busybodies of every stripe.
Some of this was just standard government condescension. In WWI, the United States Food Administration tried to convince Americans that dried peas were a fine substitute for beefsteak, and that “Wheatless, Eggless, Butterless, Milkless, Sugarless Cake” was a food substance at all. Sure.
When the Great Depression hit, things got steadily worse. Homemakers were encouraged to turn anything and everything into casseroles, which had the benefit of making their contents indistinguishable under a layer of white sauce and bread crumbs. The housewife could serve unappetizing food or leftovers over and over again without her family catching on, or at least that was the idea. Among other things this gives us this unusual early example of the overuse of the term “epic”:
Whatever the answer, [the casserole] is bound to be an epic if mother is up to the minute on the art of turning homely foods into culinary triumphs
Some people in government seemed to be confused — they seemed to think that the food issues facing the nation were a matter of presentation, that food just didn’t look appetizing enough. It’s hard to interpret some of these suggestions in any other light, like the idea that good advice for Americans in the Depression could be, “to impart a touch of elegance to a bowl of split pea soup, why not float a thin slice of lemon on top and sprinkle it with some bright red paprika and finely chopped parsley.” Or the suggestion that beans could be fried in croquettes to make them more appealing. Authorities trotted out “reassuring” announcements like:
The fact that a really good cook can serve better meals on a small budget than a poor cook can serve on the fat of the land suggests that the fault may be not in the food material itself but in the manner in which the food is prepared and served, and therein lays a tale!
But what really grinds our gears isn’t the condescension, it’s the means testing. The second half of the book is mostly depressing stories about the government refusing to provide basic aid to starving families, or screwing up one relief program after another with means testing, or other things along these lines.
American society at the time was firmly anti-charity. People thought that everyone on the breadline could be divided into the “deserving poor” and the “undeserving poor”, and it was their job to search out which was which. They seemed to believe that even one whiff of assistance would immediately turn any hardworking, self-respecting American into a total layabout.
People in the 1930s are always saying things like, “if a man keeps beating a path to the welfare office to get a grocery order he will gradually learn the way and it will be pretty hard to get him off that path.” They really seemed to believe in the corrupting power of government support, to the point where they were often afraid to even use the word “charity”.
Before the Great Depression, there was very little history of big-picture welfare. The support that society did provide was administered locally, and not very well administered at all:
The poor laws combined guarded concern for needy Americans with suspicions that they were complicit in their own misfortune. Under the poor laws, the chronically jobless were removed from society and dispatched to county poorhouses, catchall institutions that were also home to the old, infirm, and mentally ill. Those who could ordinarily shift for themselves but were temporarily jobless applied to public officials, men with no special welfare training, for what was known as outdoor or home relief, assistance generally given in the form of food and coal. To discourage idlers, the welfare experience was made as unpleasant as possible. Before applying for help, the poor were made to wait until utterly penniless, and then declare it publicly. When granting relief, officers followed the old rule of thumb that families living “on the town” must never reach the comfort level of the poorest independent family. The weekly food allowance was a meager four dollars a week—and less in some areas—regardless of how many people it was supposed to feed. Finally, it was customary to give food and coal on alternate weeks, providing minimal nourishment and warmth, but never both at the same time.
Journalists called people who received government assistance “tax eaters”. When support from the town was forthcoming, it looked like this:
…the board made all relief applicants fill out a detailed form with questions like: Do you own a radio? Do you own a car? Do you spend any money for movies and entertainment? Did you plant a garden? How many bushels of potatoes do you have? The board gave aid in the form of scrip, which now could only be used to purchase the “necessities of life” at local stores: “flour, potatoes, navy beans, corn meal, oatmeal, coffee, tea, sugar, rice, yeast cakes, baking soda, pepper, matches, butter, lard, canned milk, laundry soap, prunes, syrup, tomatoes, canned peas, salmon, salt, vinegar, eggs, kerosene.”
The Manhattan breadline is emblematic of the Great Depression, so we were sort of surprised at just how much people at the time, even very mainstream sources, hated breadlines. You’d think that giving out bread to the starving would be one of the more defensible forms of charity, but people loathed it. And none more than the city’s social workers, described as the breadlines’ “harshest critics”:
Welfare professionals with a long-standing aversion to food charity, social workers condemned the breadlines as relief of the most haphazard and temporary variety, not much different from standing on a street corner and handing out nickels. The people who ran the breadlines, moreover, made no attempt to learn the first thing about the men they were trying to help, or to offer any form of “service” or counseling. The cause of more harm than good, the breadlines were humiliating and demoralizing and encouraged dependence, depriving able-bodied men of the impulse to fend for themselves. Social workers were adamant. Breadlines were the work of fumbling amateurs and “should be abolished entirely; if necessary by legal enactment.”
As the Depression dragged on and things became worse, more relief did come. But when it came, the relief was invasive. Housewives were told not only what to cook, but where to shop. Some women had to venture far outside their own neighborhoods to use food tickets. Social workers dropped in on schools to criticize the work of teachers, in particular the tendency of teachers to be overly “sentimental” or “solicitous”. They feared that schoolteachers lacked the “well-honed detective skills” required to distinguish between whining and genuine tales of woe.
To ascertain that applicants were truly destitute, officials subjected them to a round of interviews. Candor was not assumed. Rather, all claims were verified through interviews with relatives and former employers, which was not only embarrassing but could hurt a man’s chances for employment in the future. More demeaning, however, were the home visits by TERA investigators to make sure the family’s situation was sufficiently desperate. Investigators came once a month, unannounced, anxious to catch welfare abusers. Any sign that the family’s finances had improved—a suspiciously new-looking dress or fresh set of window curtains—was grounds for cross-examination. If the man of the house was not at home—a suggestion that he might be out earning money—investigators asked for his whereabouts, collecting names and addresses for later verification. Finally, though instructed otherwise, investigators were known to reprimand women for becoming pregnant while on relief, the ultimate intrusion.
Families lived in dread of these monthly visits, terrified they would be cut off if it was discovered that one of the kids had a paper route or some similar infraction.
In some areas, including New York City, “pantry snoopers” accompanied women to the market to confirm that all parties (both shopper and shopkeeper) were complying with TERA’s marketing guidelines. More prying took place in the kitchen itself, where investigators lifted pot covers and peered into iceboxes on the lookout for dietary violations.
Relief was often inadequate. Public officials were sometimes able to set relief levels at whatever amount they saw fit, regardless of state or federal guidance. Some of them assumed that poor families would be able to provide their own farm goods, but often this was not the case. In some places officials reasoned that poor workers would be easier to push around, and kept food allowances low to keep them in line. There was also just straight-up racism:
Six Eyetalians will live like kings for two weeks if you send in twenty pounds of spaghetti, six cans of tomato paste and a dozen loaves of three-foot-long bread. But give them a food order like this [$13.50, state minimum for six persons for half a month], and they will still live like kings and put five bucks in the bank. Now you ought to give a colored boy more. He likes his pork chops and half a fried chicken. Needs them, too, to keep up his strength. Let him have a chicken now and then and maybe he’ll go out and find himself a job. But a good meal of meat would kill an Eyetalian on account of he ain’t used to it.
Families on relief who asked for seasonings on their food, like vinegar or mustard, were refused on the grounds that they might “begin to feel too much like other families”. Officials who were afraid that cash handouts to the poor might encourage dependence instead used that money to hire a resident home economist to help the poor make better use of what little they had.
As with modern means testing, this seems heart-breakingly callous. All these “supervisory” jobs intended to keep poor people from getting too much relief look suspiciously like a method for taking money that’s meant to help starving families and using it to pay the middle class to snoop on their less fortunate neighbors. Everyone loves giving middle-class busybodies jobs in “charity” work; no one seems to worry all that much about getting food to malnourished children.
To be fair, no one expected that the relief would have to go on for years. Everyone thought that the panic was temporary, that it would all be over in a couple of months. This doesn’t make it much better, but it does explain some of the reluctance.
Another surprising villain in all this is, of all things, the American Red Cross. Over and over again, the Red Cross either refused to provide aid or gave only the smallest amounts, even when people were literally starving to death. They sent officials to areas stricken by drought and flood, who reported back that there was not “evidence of malnutrition more than exists in normal times” or brought back stories about an old man complaining that the Red Cross was feeding him too well. Meanwhile, actual Red Cross workers were reporting circumstances like this, from Kentucky:
We have filled up the poor farm. We have carted our children to orphanages for the sake of feeding them. There is no more room. Our people in the country are starving and freezing.
The Red Cross’s reasoning was the same as that of everyone in government: “If people get it into their heads that when they have made a little cotton crop and tried to make a corn crop and failed and then expected charity to feed them for five months, then the Red Cross had defeated the very thing that it should have promoted, self-reliance and initiative.” Actually this statement is on the friendlier side of things; another Red Cross official, after touring Kentucky, wrote: “There is a feeling among the better farmers in Boyd County that the drought is providential; that God intended the dumb ones should be wiped out; and that it is a mistake to feed them.”
Was Hoover the Villain?
This brings us to a major question — namely, was Herbert Hoover the real villain of the Great Depression?
(He also lived in incredible opulence during his time in the White House. Always black tie dinners, always a table awash in gold, always fancy gourmet foreign food, always a row of butlers all exactly the same height. As they say, “not a good look”.)
But when you start looking at things in more detail, it becomes more complicated. It may seem naïve, but Hoover really thought that Americans would come together and take care of each other without the need for government assistance. He seemed to oppose relief because he thought that the federal government stepping in would make things worse. In one speech, he promised that “no man, woman or child shall go hungry or unsheltered through the coming winter” and emphasized that voluntary relief organizations would make sure that everyone was taken care of. He might have been wrong, but this doesn’t look villainous. (He also said, “This is, I trust, the last winter of this great calamity.” It was 1932.)
In a pretty bizarre state of affairs, he also seems to have been thwarted by the Red Cross at every turn, especially its chairman John Barton Payne. It makes sense that Hoover approved of the Red Cross, since it was one of the voluntary generous organizations he liked so much. What’s strange is how frequently the Red Cross just didn’t do jack, even when asked.
For example: In 1930, Hoover pressured the Red Cross to help out with drought relief in the Mississippi delta. The Red Cross agreed to give out $5 million in aid, but by the end of the year they had spent less than $500,000, mostly on handing out seed to farmers.
Another time, Hoover went to the Red Cross to help provide relief to striking miners. This time the Red Cross refused, though again they offered seed to the miners (it’s unclear if there was even arable land near the mine). So Hoover went to a Quaker relief organization instead, and the Quakers agreed to help feed hungry children in the mining areas. Hoover struck a deal where the Red Cross would provide $200,000 to help the Quakers out. The Quakers waited two months before the Red Cross refused again. So the Quakers went ahead without them.
Somehow Hoover never turned his back on the Red Cross. Maybe he just liked the idea of aid organizations too much to realize that this one kept undermining him in times of crisis.
But the other thing to understand about Hoover is that, despite his gruff no-handouts exterior, inside he was a bit of a softie. He stuck to his guns on the subject of cash relief, but he usually found a way to help without breaking his own rules. When the Red Cross refused to help the Quakers, Hoover rooted around and found $225,000 in an idle account belonging to a World War I relief organization, and sent that instead. In the flood of 1927 (when he was secretary of commerce), he refused to allocate federal funds directly, but he did have the U.S. Army distribute rations, tents, and blankets, organized local governments to rebuild roads and bridges, and got the Red Cross to distribute seeds and farm implements (the only thing they seem comfortable with). This was a huge success, and a big part of what won him the 1928 presidential election!
The reason Hoover believed in the no-relief approach was simple — he had used it many times before, and it had always worked. He had a long track record of dealing with this kind of crisis. Before he was president, his nickname was “The Great Humanitarian” for the relief work he had done in Europe during World War I. People saw him as an omnicompetent engineering genius, and the reputation is at least partially deserved. It’s hard to overstate just how popular he was before the Depression: He won the election of 1928 with 444 electoral votes to 87, a total landslide.
Hoover thought that attitude was the key to fighting financial panics, because this is exactly what he saw in 1921. There was a big stock market panic, which Hoover recognized was at least partially psychological. So he put out a bunch of reassuring press releases, told everyone that the panic was over, and sure enough, the market recovered.
So when the same thing happens in 1929, he figures the same approach will work. He does everything he can to project confidence and a sense of business as usual, and tries not to do anything that will start a bigger panic. This includes no federal relief — because if the federal government starts handing out money, that must mean things are REALLY bad. It makes a certain amount of sense, it did work for him in the past, but for some reason, it doesn’t work this time around. Maybe blame the Red Cross.
FDR definitely jumped in to take advantage of the confusion. As then-governor of New York, he started implementing the kind of relief plans that Hoover refused to consider. He gave direct food relief. He used the word “charity”. And when he ran for president, he made it very clear that he thought the federal government should cover what the states could not, and make sure that no one would starve.
This did win him the election. But afterwards, he started looking a lot more like Hoover and some of his cronies. In his 1935 State of the Union address, he said, “continued dependence upon relief induces a spiritual and moral disintegration fundamentally destructive to the national fiber. To dole out relief in this way is to administer a narcotic, a subtle destroyer of the human spirit. It is inimical to the dictates of sound policy. It is in violation of the traditions of America.” We’re right back to where we started.
FDR rolled out the Works Progress Administration, another program that tied relief to a person’s work. But it wasn’t administered much better than the Red Cross’s efforts. Many workers couldn’t live on the low wages it offered. Even when they could, the jobs lasted only as long as the projects did, so workers often went months without jobs between different programs.
The Civilian Conservation Corps was for a long time the most popular of these programs, but in 1936, Roosevelt decided to focus on balancing the budget instead. He slashed the program from 300,000 people to about 1,400. Over time, most of the relief burden fell back on state and city governments, many of which descended back into cronyism.
Some of the programs, for migrants and “transients”, were worse, nearly Orwellian:
Before the New Deal, transients were the last group to receive relief under the old poor laws. Now the FTP funded a separate system of transient centers guided by federal regulations meant to guard against local governments’ ingrained cultural biases against drifters and migrant job seekers. In rural areas, transients would be gathered into federal “concentration camps” (a term that had not yet gained its ominous connotations) designed for long-term stays.
As waves of agricultural migrants spread across the United States, by 1940 the FSA had opened 56 camps around the country, 18 of them in California, each accommodating up to 350 families. Administrators nevertheless continued to keep costs as low as possible, following the “rehabilitation rather than relief” rule handed down by President Roosevelt. Rather than give migrants food, the camps taught home economics–style classes on nutrition and food budgeting.
By 1937 everything seems to have fallen apart again, and the authors suggest that the second half of the 1930s was as bad as or worse than the first half. In 1938, Roosevelt refused to give any further direct food relief from the WPA coffers. The stories from 1939 are kind of harrowing.
By 1939, the problems of unemployment and what to do with millions of jobless Americans seemed intractable. The economy continued to sputter along; real prosperity remained an elusive goal; and Americans were losing compassion for the destitute and hungry.
A Houston, Texas, reporter lived for a week on the city’s $1.20 weekly food handout, eating mostly oatmeal, potatoes, stewed tomatoes, and cabbage, and lost nearly ten pounds. In Chicago, a family of four received $36.50 a month, meant to cover food, clothing, fuel, rent, and everything else. But fuel in the cold Chicago winters was expensive; families had no choice but to cut back on food. In Ohio, the governor again refused to give aid to Cleveland, which ran out of money for nearly a month—called the “Hunger Weeks”—at the end of 1939. The city was reduced to feeding its poor with flour and apples as desperate families combed garbage bins for anything edible. Adults lost as much as fifteen pounds, while children had to stay home, too weak from hunger to attend school. Doctors saw a jump in cases of pneumonia, influenza, pleurisy, tuberculosis, heart disease, suicide attempts, and mental breakdowns.
So whatever Hoover did wrong, he doesn’t deserve all the blame, and the WPA certainly did not end the Great Depression.
They say that the past is a foreign country, and nowhere is this more true than with food.
The book is A Square Meal: A Culinary History of the Great Depression by Jane Ziegelman and Andrew Coe, recommended to us by reader Phil Wagner. This book is, no pun intended, just what it says on the tin, a history of food during the 1920s and 1930s. Both decades are covered because you need to understand what food was like in the 1920s to understand what changed when the Great Depression battered the world in the ’30s.
Home is where the lard-based diet is
We read this book and were like, “what are you eating? I would never eat this.”
The book picks up at the end of World War I, and the weird food anecdotes begin immediately:
Their greeting back in American waters—even before they landed—was rapturous. Local governments, newspapers, and anybody else who could chartered boats to race out to meet the arriving ships. When the Mauretania, carrying 3,999 troops, steamed into New York Harbor late in 1918, a police boat carrying the mayor’s welcoming committee pulled alongside. After city dignitaries shouted greetings to them through megaphones, the troops who crowded the deck and hung from every porthole bellowed en masse: “When do we eat?!” It became a custom for greeting parties to hire professional baseball pitchers to hurl California oranges at the troops—some soldiers sustained concussions from the barrage—to give them their first taste of fresh American produce in more than a year.
Not that the soldiers weren’t also well-fed at the front lines:
Despite the privations they had undergone, the Americans held one great advantage over both the German enemy and the soldiers of their French and British allies. They were by far the best-fed troops of World War I.
The U.S. Army field ration in France varied according to circumstances, but the core of the soldiers’ daily diet was twenty ounces of fresh beef (or sixteen ounces of canned meat or twelve ounces of bacon), twenty ounces of potatoes, and eighteen ounces of bread, hard or soft. American troops were always proud that they enjoyed white bread, while all the other armies had to subsist on dark breads of various sorts. This ration was supplemented with coffee, sugar, salt, pepper, dried fruit, and jam. If supply lines were running, a soldier could eat almost four pounds of food, or 5,000 calories, a day. American generals believed that this was the best diet for building bone, muscle, tissue, and endurance. British and French troops consumed closer to 4,000 calories, while in the last months of the war the Germans were barely receiving enough rations to sustain themselves.
The overall food landscape of the 1920s is almost unrecognizable. The term “salad” at the time referred to “assemblages made from canned fruit, cream cheese, gelatin, and mayonnaise,” which the authors note FDR especially hated. Any dish that contained tomatoes was called “Spanish” (a tradition that today survives only in the dish Spanish rice). And whatever the circumstances, there was ALWAYS dessert, even in the quasi-military CCC camps, even in the government-issued guides to balanced meals, even in school lunch programs that were barely scraping by.
This book also has some interesting reminders that constipation used to be the disease of civilization. In fact, they mention constipation being called “civilization’s curse”. This is why we have the stereotype of old people being obsessed with fiber and regularity, even though that stereotype is about a generation old now, and refers to a generation that has largely passed.
In the countryside, farm diets were enormous and overwhelmingly delicious:
In midwestern kitchens, the lard-based diet achieved its apotheosis in a dish called salt pork with milk gravy, here served with a typical side of boiled potatoes:
On a great platter lay two dozen or more pieces of fried salt pork, crisp in their shells of browned flour, and fit for a king. On one side of the platter was a heaping dish of steaming potatoes. A knife had been drawn once around each, just to give it a chance to expand and show mealy white between the gaping circles that covered its bulk. At the other side was a boat of milk gravy, which had followed the pork into the frying-pan and had come forth fit company for the boiled potatoes.
The first volume of their oral history, Feeding Our Families, describes the Indiana farmhouse diet from season to season and meal to meal. In the early decades of the century, the Hoosier breakfast was a proper sit-down feast featuring fried eggs and fried “meat,” which throughout much of rural America meant bacon, ham, or some other form of pork. In the nineteenth century, large tracts of Indiana had been settled by Germans, who left their mark on the local food culture. A common breakfast item among their descendants was pon haus, a relative of scrapple, made from pork scraps and cornmeal cooked into mush, molded into loaf pans and left to solidify. For breakfast, it was cut and fried. Toward fall, as the pork barrel emptied, the women replaced meat with slices of fried apples or potatoes. The required accompaniment was biscuits dressed with butter, jam, jelly, sorghum syrup, or fruit butter made from apples, peaches, or plums. A final possibility—country biscuits were never served naked—was milk gravy thickened with a flour roux.
Where farmhouse breakfasts were ample, lunch was more so, especially in summer when workdays were long and appetites pushed to their highest register. With the kitchen garden at full production, the midday meal often included stewed beets, stewed tomatoes, long-simmered green beans, boiled corn, and potatoes fried in salt pork, all cooked to maximum tenderness. At the center of the table often stood a pot of chicken and dumplings, with cushiony slices of white bread to sop up the cooking broth. The gaps between the plates were filled with jars of chow-chow; onion relish; and pickled peaches, cauliflower, and watermelon rinds. The midday meal concluded with a solid wedge of pie. Like bread, pies were baked in bulk, up to a dozen at a time, and could be consumed at breakfast, lunch, and dinner.
Ingredients were prepared in ways that sound pretty strange to a modern ear. Whole onions were baked in tomato sauce and then eaten for lunch. Whole tomatoes were scalloped on their own.
Organ meats were considered perfectly normal, if somewhat tricky to cook. The book mentions how food columnists had to teach urban housewives about how to remove the “transparent casing” that brains naturally come in, the membrane from kidneys, and the arteries and veins from hearts — not the sort of thing you would expect from a modern food columnist. On hog-killing day, an annual event all over the rural United States:
The most perishable parts of the animal were consumed by the assembled crowd, the brains scrambled with eggs, the heart and liver fried up and eaten with biscuits and gravy. Even bladders were put to good use—though it wasn’t culinary. Rather, they were given to the children, who inflated them, filled them with beans, and used them as rattles.
There are a lot of fascinating recipes in this book, but perhaps our favorite is this recipe that appears in a section on the many uses of pork lard:
Appalachian farm women prepared a springtime specialty called “killed lettuce,” made from pokeweed, dandelion, and other wild greens drizzled with hot bacon grease that “killed,” or wilted, the tender, new leaves. The final touch to this fat-slicked salad was a welcome dose of vinegar.
You might expect the urban food situation to be more modern, seeing as it involves less hog-killing. But if anything, it’s stranger.
To start with, ice cream delicacies were considered normal lunch fare:
The most typical soda fountain concoction was the ice cream soda, which was defined as “a measured quantity of ice cream added to the mixture of syrup and carbonated water.” From there, the imaginations of soda jerks were given free range. Trade manuals such as The Dispenser’s Formulary or Soda Water Guide contained more than three thousand soda fountain recipes for concoctions like the Garden Sass Sundae (made with rhubarb) and the Cherry Suey (topped with chopped fruit, nuts, and cherry syrup). … From relatively austere malted milks to the most elaborate sundaes, all of these sweet confections were considered perfectly acceptable as a main course for lunch, particularly by women. In fact, American sugar consumption spiked during the 1920s. This was in part thanks to Prohibition—deprived of alcohol, Americans turned to anything sweet for a quick, satisfying rush.
Delicatessens and cafeterias, which we take for granted today, were strange new forms of dining. The reaction to these new eateries can only be described as apocalyptic. Delicatessens were described as “emblems of a declining civilization, the source of all our ills, the promoter of equal suffrage, the permitter of business and professional women, the destroyer of the home.” The world of the 1920s demanded an entirely new vocabulary for many new social ills springing up — “cafeteria brides” and “delicatessen husbands” facing down the possibility of that new phenomenon, the “delicatessen divorce.” The fear was that your flapper wife, unable to make a meal in her tiny city kitchenette, or out all day with a self-supporting career, would feed you food that she got from the delicatessen, instead of a home-cooked and hearty meal.
In all of these cases, the idea was that new ways of eating would destroy the kitchen-centric American way of life — which, to be fair, it did. Calling a deli “the destroyer of the home” seems comical to us, but they were concerned that these new conveniences would destroy the social structures that they knew and loved, and they were right. We think our way of life is an improvement, of course, but you can hardly fault the accuracy of their forecasting.
Really, people found these new eateries equal parts wonderful and terrifying — like any major change, they had their songs of praise as well as their fiery condemnations (hot take: delicatessens were the TikTok of the 1920s). For a stirring example from the praise section, take a look at this lyrical excerpt from the June 18, 1922 edition of the New York Tribune:
Spices of the Orient render delectable the fruits of the Occident. Peach perches on peach and pineapple, slice on slice, within graceful glass jars. Candies are there and exhibits of the manifold things that can be pickled in one way or another. Chickens, hams and sausages are ready to slice, having already been taken through the preliminaries on the range. There are cheeses, fearful and wonderful, and all the pretty bottles are seen, as enticing looking as ever, although they are but the fraction of their former selves [i.e., under Prohibition].
Sandwiches were not only strange and new, but practically futuristic. “Before the 1920s, sandwiches were largely confined to picnics and free lunches in saloons,” they tell us, “and, with their crusts cut off, delicate accompaniments to afternoon tea.” The writer George Jean Nathan claimed that before the 1920s, there existed only eight basic sandwich types: Swiss cheese, ham, sardine, liverwurst, egg, corned beef, roast beef, and tongue (yes). But by 1926, he “claimed that he had counted 946 different sandwich varieties stuffed with fillings such as watermelon and pimento, peanut butter, fried oyster, Bermuda onion and parsley, fruit salad, aspic of foie gras, spaghetti, red snapper roe, salmi of duck, bacon and fried egg, lettuce and tomato, spiced beef, chow-chow, pickled herring, asparagus tips, deep sea scallops, and so on ad infinitum.”
As with the delicatessen, Americans were not going to take this sandwich thing lying down. Nor would they take it at all calmly! Boston writer Joseph Dinneen described sandwiches as “a natural by-product of modern machine civilization.”
Make your own “biggest thing since sliced bread” joke here, but this sandwich craze actually led directly first to the invention of special sandwich-shaped loaves with flattened tops, and then to sliced bread, which hit the market in 1928.
Many new foods didn’t fit squarely within existing categories. This is sort of like how squid ice cream seems normal in Japan. We have rules about what you can put in an ice cream — mint ice cream makes sense, but onion ice cream is right out — but the Japanese don’t care what we think the ice cream rules are. In the 1920s and 1930s many foods were unfamiliar or actually brand new, so no one had any expectations of what to do with them. For example, the banana, which you know as a fruit, was new enough to Americans that they were still figuring out how the thing should be served:
We’re sure bananas would be fine served as a vegetable, or with bacon, but this is certainly not the role we would assign to them today.
When the Depression hit, grapefruit somehow found its way into food relief boxes in huge quantities; “so much grapefruit that people didn’t know what to do with it.” Soon the newspapers were coming up with imaginative serving suggestions, like in this piece from the Atlanta Constitution:
It may open the meal, served as a fruit cocktail, in halves with a spoonful of mint jelly in the center or sprinkled with a snow of powdered sugar. It bobs up in a fruit cup, or in a delicious ice. It may be served broiled with meat, appear in a fruit salad or in a grapefruit soufflé pie. Broiled grapefruit slices, seasoned with chili sauce, make an unusual and delightful accompaniment for broiled fish, baked fish or chops.
Some of these sound pretty good; but still, unusual.
The other really strange and exciting thing about this period is that they had just discovered vitamins.
As we’ve covered previously, this was not as easy as you might think. It’s simple to think in terms of vitamins when you’re raised with the idea, but it took literally centuries for people to come up with the concept of a disease of deficiency, even with the totally obvious problem of scurvy staring everyone right in the face.
Scurvy isn’t just a problem for polar explorers and sailors in the Royal Navy. Farm families living through the winter on preserved foods from their cellar tended to develop “spring fever” just before the frost broke, which the authors of this book think was probably scurvy. Farmwives treated it with “blood tonics” like sassafras tea or sulfured molasses, or the first-sprouted dandelions and onions of spring.
But just around the turn of the century, and with the help of cosmic accidents involving guinea pigs, people finally started to get this vitamin thing right. So the 1920s and 30s paint an interesting picture of what cutting-edge nutrition research looks like when it’s so new that it’s still totally bumbling and incompetent.
In 1894, Wilbur Olin Atwater established America’s first dietary standards. Unfortunately, Atwater’s recommendations didn’t make much sense. For example, in this system men with more strenuous jobs were assigned more food than men with less strenuous jobs — a carpenter would get more calories than a clerk. This makes some sense, but Atwater then used each man’s food levels to calculate the amount of food required for his wife and kids. The children of men with desk jobs sometimes got half as much food as the children of manual laborers! The idea of treating each member of the family as their own person, nutritionally speaking, was radical in the early 1900s, but the observation that some children were “kept alive in a state of semi-starvation” had begun to attract attention.
People knew they could do better, so following Atwater’s death in 1907, the next generation got to work on coming up with a better system. Atwater had assumed that basically all fats were the same, as were all carbohydrates, all protein, etc. But Dr. Elmer V. McCollum, “a Kansas farm boy turned biochemist”, was on the case investigating fats.
We really want to emphasize that they had no system at this point, no idea what they were doing. Medical science was young, and nutritional science was barely a decade old. Back then they were still just making things up. These days “guinea pig” and “lab rat” are clichés, but these clichés hadn’t been invented back in 1907. Just like how Holst and Frolich seem to have picked guinea pigs more or less at random to study scurvy, and how Karl Koller’s lab used big frogs to test new anesthetics, McCollum was one of the first researchers to use rats as test subjects.
Anyways, McCollum tried feeding his rats different kinds of fats to see if, as Atwater claimed, all fats had the same nutritional value. He found that rats that ate lots of butterfat “grew strong and reproduced, while those that ate the olive oil did not”. He teamed up with a volunteer, Marguerite Davis, and they discovered a factor that was needed for growth and present not only in milk, but eggs, organ meat, and alfalfa leaves. This factor was later renamed vitamin A (as the first to be discovered), and the age of the vitamins had begun. Soon McCollum and Davis were on the trail of a second vitamin, which they naturally called vitamin B.
The public went absolutely bananas for vitamins. It’s not clear if this was a totally natural public reaction, or if it was in response to fears drummed up by… home economists. Yes, home economics, the most lackluster class of all of middle school, represents the last lingering influence of what was once a terrible force in American politics:
More than anything else, women were afraid of the “hidden hunger” caused by undetectable vitamin deficiencies that could well be injuring their children. … Home economists leveraged those fears. To ensure compliance, bureau food guides came with stark admonitions, warning mothers that poor nutrition in childhood could handicap a person for life. Women were left with the impression that one false move on their part meant their children would grow up with night blindness and bowed knees.
Whatever the cause, vitamins took America by storm. Any food found to be high in one vitamin or another quickly turned that finding to advertising purposes. Quaker Oats, found to be high in vitamin B, advertised to kids with a campaign that “teamed up with Little Orphan Annie and her new pal, a soldier named Captain Sparks, who could perform his daring rescues because he had eaten his vitamins.” For adults, they implied that vitamin B would help make you vigorous in bed:
…a snappy new advertising campaign: “I eat Quaker Oats for that wonderful extra energy ‘spark-plug.’ Jim thinks I have ‘Oomph!’ but I know it’s just that I have plenty of vitality and the kind of disposition a man likes to live with.” What she did with her extra “oomph” was unspecified, but the graphic showed a young couple nose to nose, smiling into each other’s eyes.
Vitamins continued to have this weird grip over the imagination for a long time. As late as the 1940s, American food experts worried that the Nazis had developed some kind of super-nutritional supplement, a “magical Buck Rogers pill,” to keep their army tireless and efficient (there probably was such a pill, but that pill was methamphetamine). In response, Roosevelt convened a 900-person National Nutrition Conference for Defense, a full quarter of them home economists, to tackle malnutrition as part of the war effort.
Maybe it’s not surprising that vitamins had such a hold on the popular imagination. It’s hard for us to imagine growing up in a world where scurvy, beriberi, and rickets were a real and even terrifying danger, not just funny-sounding words you might encounter in a Dickens novel. But for people living in the 1920s, they were no joke. Look at your local five-year-old and think how they will never understand the real importance of the internet, and what life was like before. You’re the same way about vitamins.
The final thing we learned is that people from the 1920s and 1930s had an intense, almost deranged love for milk.
Milk was always mentioned first and usually mentioned often. It was on every menu. Good Housekeeping’s 1926 article, Guide Posts to Balanced Meals, included “One pint of milk a day as either a beverage or partly in soups, sauces or desserts” as guidepost #1. Pamphlets from the USDA’s Bureau of Home Economics suggested that one fifth of a family’s food budget should be spent on milk. Milk was served at every meal in the schoolhouse, with milk and crackers at recess, the target being a quart of milk for every child, every day.
Milk was on every relief list. Food relief in NYC in 1930, a very strict beans-and-potatoes affair, still made sure to include a pound of evaporated milk for every family. Even for those on microscopic fifty-cent-a-day menus, milk was recommended at every meal, “one pint for breakfast, some for lunch, and then another pint for supper.” One father struggling to adjust to the Depression said, “We had trouble learning to live within the food allowance allotted us. We learned it meant oleomargarine instead of butter. It meant one quart of milk a day for the children instead of three.” Even the tightest-fisted relief lists included a pint of milk a day for adults, and a quart a day for children. The most restrictive diets of all were bread and — you guessed it — milk.
Milk was the measure of destitution. Descriptions of people eating “whatever they could get” sound like this: “inferior qualities of food and less of it; less milk; loose milk instead of bottled milk, coffee for children who previously drank milk.” When describing the plight of West Virginia mining families, a state union leader said, “Their diet is potatoes, bread, beans, oleomargarine, but not meat, except sow-belly two or three times a week. The company won’t let the miners keep cows or pigs and the children almost never have fresh milk. Only a few get even canned milk.”
There’s no question — milk was the best food. The government sent McCollum, the guy who discovered vitamins, around the country, where in his lectures he said:
Who are the peoples who have achieved, who have become large, strong, vigorous people, who have reduced their infant mortality, who have the best trades in the world, who have an appreciation for art and literature and music, who are progressive in science and every activity of the human intellect? They are the people who have patronized the dairy industry.
Normal milk wasn’t enough for these people, so in 1933 they developed a line of “wonder foods” around the idea of combining milk with different kinds of cereals. They called them: Milkorno, Milkwheato, and Milkoat. These products are about what you would expect, but the reception was feverish:
With great fanfare, Rose introduced Milkorno, the first of the cereals, at Cornell’s February 1933 Farm & Home Week, where the assembled dignitaries—including Eleanor Roosevelt, wife of the president-elect—were fed a budget meal that included a Milkorno polenta with tomato sauce. The price tag per person was 6½ cents. FERA chose Milkwheato (manufactured under the Cornell Research Foundation’s patent) to add to its shipments of surplus foods, contracting with the Grange League Federation and the Ralston Purina Company to manufacture it. … Milkwheato and its sister cereals represented the pinnacle of scientifically enlightened eating. Forerunners to our own protein bars and nutritional shakes, they were high in nutrients, inexpensive, and nonperishable. White in color and with no pronounced flavor of their own, they were versatile too. Easily adapted to a variety of culinary applications, they boosted the nutritional value of whatever dish they touched. They could be baked into muffins, cookies, biscuits, and breads; stirred into chowders and chili con carne; mixed into meat loaf; and even used in place of noodles in Chinese chop suey.
We had always assumed that the American obsession with milk was the result of the dairy lobby trying to push more calcium on us than we really need. And maybe this is partially true. But public opinion of dairy has fallen so far from the rabid heights of the 1930s that now we wonder if milk might actually be underestimated. Is the dairy lobby asleep at the wheel? Still resting on their laurels? Anyways, if you want to eat the way your ancestors ate back in the 1920s, the authentic way to start your day off right is by drinking a nice tall pint of milk.
 : There might be a class element here? The authors say, “FDR recoiled from the plebeian food foisted on him as president; perhaps no dish was more off-putting to him than what home economists referred to as ‘salads,’ assemblages made from canned fruit, cream cheese, gelatin, and mayonnaise.”
Ben Kuhn on lognormal distributions and outliers. In our experience, understanding lognormal distributions is pretty easy and opens up all kinds of low-hanging fruit. This is a good intro to the concept and why it comes in handy.
If you’re familiar with Dr. Bronner’s Magic Soaps, then you know about “the label”. (If not, a sample: “In all we do, let us be fair, generous, and loving to Spaceship Earth and all its inhabitants. For we’re All-One or None! All-One!”) It turns out that the story of how the label was born is even more interesting than you might think.
Article from 2007. On a phone call, the author offhandedly mentions that his wife is good at Game Boy Tetris — “She can get 500 or 600 lines, no problem.” — and learns that the current world record for Game Boy Tetris is 327 lines. They go to New Hampshire and she becomes the new world record holder with a total of 841 lines.
In Japan, crows have learned to attack solar power plants with stones. No one knows why: “It is unknown why crows bombard solar panels, possibly it is a game. The stones seldom directly crack panels, but the crows are experts at placing stones or other garbage just so that they stay on top of the panel, soon causing overheating and destruction or permanent damage.” The only way to keep crows away is to use falcons. “One trained falcon making 60 attack sorties a day can protect 100,000 solar panels from vengeful crows.”
Yet another example of a potato-only diet, complete with a book. Amazon reviews are anecdotal, of course, but they’re very positive. Not affiliated with us, in fact predates our work by a couple of years, looks like this got started in 2015 or 2016.
Al Hatfield is a wannabe rationalist (his words) from the UK who sent us some data about water sources in Scotland. We had an interesting exchange with him about these data and, with Al’s permission, wanted to share it with all of you! Here it is:
I know you’re not that keen on correlations and I actually stopped working on this a few months ago when you mentioned that in the last A Chemical Hunger post, but after reading your post today I wanted to share it anyway, just in case it does help you at all.
It’s a while since I read all of A Chemical Hunger but I think this data about Scottish water may support a few things you said:
– The amount of Lithium in Scottish water is in the top 4 correlations I found with obesity (out of about 40 substances measured in the water)
– I recall you predicted the top correlation would be about 0.5, the data I have implies it’s 0.55, so about right.
– I recall you said more than one substance in the water may contribute to obesity, my data suggested 4 substances/factors had correlations of more than 0.46 with obesity levels and 6 were more than 0.41.
Wow, thanks for this! We’ll take a look and do a little more analysis if that’s all right, and get back to you shortly.
Do you know the units for the different measurements here, especially for the lithium? We’d be interested in seeing the original PDFs as well if that’s not too much hassle.
You’re welcome! That’s great if you can analyse it as I am very much an amateur.
The units for the Lithium measurements are µgLi/l. I’ve attached the Lithium levels Scottish Water sent me. I think they cover every water source they test in Scotland (though my analysis only covered about 15 water sources).
Sorry I don’t have access to the original pdfs as they’re on my other computer and I’m away at the moment. But I have downloaded a couple of pdfs online. Unfortunately the online versions have been updated since I did my analysis in late November, but hopefully you can get the idea from them and see what measurements Scottish Water use.
So we’ve taken a closer look at the data and while everything is encouraging, we don’t feel that we’re able to draw any strong conclusions.
We also get a correlation of 0.47 between obesity and lithium levels in the water. The problem is, this relationship isn’t significant, p = 0.078. Basically this means that the data are consistent with a correlation anywhere between -0.06 and 0.79, and since that includes zero (no relationship), we say that it’s not significant.
This still looks relatively good for the hypothesis — most of the confidence interval is positive, and these data are in theory consistent with a correlation as high as 0.79. But on the whole it’s weak evidence, and doesn’t meet the accepted standards.
The main reason this isn’t significant is that there are only 15 towns in the dataset. As far as sample sizes go, this is very small. That’s just not much information to work with, which is why the correlation isn’t significant. For similar reasons, we haven’t done any more complicated analyses, because we won’t be able to find much with such a small sample to work with.
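For anyone who wants to check the interval quoted above, here is a minimal sketch using the Fisher z-transformation, a standard approximation for a confidence interval around a Pearson correlation. This is an assumption about method: the figures above may have come from different software, and the function name `fisher_ci` is ours. The only inputs taken from the text are r = 0.47 and n = 15; the 40-town case is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation, which assumes roughly bivariate
    normal data (an assumption that may not hold for every dataset).
    """
    z = atanh(r)                      # map r onto an unbounded scale
    se = 1 / sqrt(n - 3)              # standard error on that scale
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return tanh(z - crit * se), tanh(z + crit * se)


# r and n as reported in the text: r = 0.47 across 15 towns.
low_15, high_15 = fisher_ci(0.47, 15)
print(f"n = 15: {low_15:.2f} to {high_15:.2f}")  # n = 15: -0.06 to 0.79

# Hypothetical: the same r, if it held up across 40 towns.
low_40, high_40 = fisher_ci(0.47, 40)
print(f"n = 40: {low_40:.2f} to {high_40:.2f}")
```

With 15 towns the interval straddles zero. In this sketch, the same correlation across 40 towns would give a narrower interval that sits entirely above zero, which is the sense in which sample size is the main obstacle here.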
Another problem is that correlation is designed to work with bivariate normal distributions — two variables, both of them approximately normally distributed, like so:
Usually this doesn’t matter a ton. Even if you’re looking at a correlation where the two variables aren’t really normally distributed, it’s usually ok. And sometimes you can use transformations to make the data more normal before doing your analysis. But in this case, the distribution doesn’t look like a bivariate normal at all:
Only four towns in the dataset have seriously elevated lithium levels, and those are the four fattest towns in the dataset. So this is definitely consistent with the hypothesis.
But the distribution is very strange and very extreme. In our opinion, you can’t really interpret a correlation you get from data that looks like this, because while you can calculate a correlation coefficient, correlation was never intended to describe data that are distributed like this.
On the other hand, we asked a friend about this and he said that he thinks a correlation is fine as long as the residuals are normal (we won’t get into that here), and they pretty much are normal, so maybe a correlation is fine in this case?
A possible way around this problem is nonparametric correlation tests, which don’t assume a bivariate normal distribution in the first place. Theoretically these should be kosher to use in this scenario because none of their assumptions are violated, though we admit we don’t use nonparametric methods very often.
Anyways, both of the nonparametric correlation tests we tried were statistically significant — Kendall rank correlation was significant (tau = 0.53, p = .015), and so was the Spearman rank correlation (rho = 0.64, p = .011). Per these tests, obesity and lithium levels are positively correlated in this dataset. The friend we talked to said that in his opinion, nonparametric tests are the more conservative option, so the fact that these are significant does seem suggestive.
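As a rough illustration of what these rank-based statistics measure, here is a minimal sketch in plain Python. The data are made up for illustration and are not the Scottish data; the function names are ours. Note that `kendall_tau_a` ignores ties, whereas statistical packages usually report the tie-adjusted tau-b, and that no p-values are computed here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up data: y rises steadily with x, but one x value is extreme.
toy_x = [1, 2, 3, 4, 100]
toy_y = [1, 2, 3, 4, 5]

print(pearson(toy_x, toy_y))        # well below 1: the extreme point distorts it
print(spearman(toy_x, toy_y))       # ranks agree perfectly
print(kendall_tau_a(toy_x, toy_y))  # every pair is concordant
```

Because the rank-based measures only look at ordering, a handful of extreme values cannot dominate them the way they can dominate a Pearson correlation.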
We’re still hesitant to draw any strong conclusions here. Even if the correlations are significant, we’re working with only 15 observations. The lithium levels only go up to 7 ppb in these data, which is still pretty low, at least compared to lithium levels in many other areas. So overall, our conclusion is that this is certainly in line with the lithium hypothesis, but not terribly strong evidence either way.
A larger dataset of more than 15 towns would give us a bit more flexibility in terms of analysis. But we’re not sure it would be worth your time to put it together. It would be interesting if the correlation were still significant with 30 or 40 towns, and we could account for some of the other variables like Boron and Chloride. But, as we’ve mentioned before, in this case there are several reasons that a correlation might appear to be much smaller than it actually is. And in general, we think it can sometimes be misleading to use correlation outside the limited set of problems it was designed for (for example, in homeostatic systems).
That said, if you do decide to expand the dataset to more towns, we’d be happy to do more analysis. And above all else, thank you for sharing this with us!
[Addendum: In case anyone is interested in the distribution in the full lithium dataset, here’s a quick plot of lithium levels by Scottish Unitary Authority:
Thanks so much for looking at it. Sounds like I need to brush up on my statistics! Depending how bored I get I may extend it to 40 towns some time, but for now I’ll stick with experimenting with a water filter.
Early on in science there never could have been a replication crisis, because everyone was just trying all the stuff. They were writing letters to each other with directions, trying each other’s studies, and seeing what they could confirm for themselves.
I have a particular cookbook that I love, and even though I follow the recipes as closely as I can, the food somehow never quite looks as good as it does in the photos. Does this mean that the recipes are deficient, perhaps even that the authors have misrepresented the quality of their food? Or could it be that there is more to great cooking than just following what’s printed in a recipe? I do wish the authors would specify how many millimeters constitutes a “thinly” sliced onion, or the maximum torque allowed when “fluffing” rice, or even just the acceptable range in degrees Fahrenheit for “medium” heat. They don’t, because they assume that I share tacit knowledge of certain culinary conventions and techniques; they also do not tell me that the onion needs to be peeled and that the chicken should be plucked free of feathers before browning. … Likewise, there is more to being a successful experimenter than merely following what’s printed in a method section. Experimenters develop a sense, honed over many years, of how to use a method successfully. Much of this knowledge is implicit.
Mitchell believes in a world where findings are so fragile that only extreme insiders, close collaborators of the original team, could possibly hope to reproduce their findings. The implicit message here is something like, “don’t bother replicating ever; please take my word for my findings.”
The general understanding of replication is slightly less extreme. To most researchers, replication is when one group of scientists at a major university reproduces the work of another group of scientists at a different major university. There’s also a minority position that replications should be done by many labs, that replication is an internal process of double-checking: “take the community’s word”.
But this doesn’t seem quite right to us either. If a finding can’t be confirmed by outsiders like you — if you can’t see it for yourself — it doesn’t really “count” as replication. This used to be the standard of evidence (confirm it for yourself or don’t feel bound to take it seriously) and we think this is a better standard to hold ourselves to.
It’s not that Mitchell is wrong. He’s right: there is a lot of implicit knowledge involved in doing anything worth doing. Sometimes science is really subtle and hard to replicate at home; other times, it isn’t. But the question of whether a particular study is easy or hard to replicate is a dodge. This argument is a load of crap because the whole reason to do research in the first place is to fight against received wisdom.
The motto of the Royal Society, one of the first scientific societies, was and still is nullius in verba. Roughly translated, this means, “take no one’s word” or “don’t take anyone’s word for it”. We think this is a great motto. It’s a good summary of the kind of spirit you need to investigate the world. You have the right to see for yourself and make up your own mind; you shouldn’t have to take someone’s word. If you can take someone else’s word for it — a king, maybe — then why bother?
In the early 1670s, Antonie van Leeuwenhoek started writing to the Royal Society, talking about all the “little animals” he was seeing in drops of pond water when he examined them under his new microscopes. Long particles with green streaks, wound about like serpents, or the copper tubing in a distillery. Animals fashioned like tiny bells with long tails. Animals spinning like tops, or shooting through the water like pikes. “Little creatures,” he said, “above a thousand times smaller than the smallest ones I have ever yet seen upon the rind of cheese.”
Naturally, the Royal Society found these reports a little hard to believe. They had published some of van Leeuwenhoek’s letters before, so they had some sense of who the guy was, but this was almost too much:
Christiaan Huygens (son of Constanijn), then in Paris, who at that time remained sceptical, as was his wont: ‘I should greatly like to know how much credence our Mr Leeuwenhoek’s observations obtain among you. He resolves everything into little globules; but for my part, after vainly trying to see some of the things which he sees, I much misdoubt me whether they be not illusions of his sight’. The Royal Society tasked Nehemiah Grew, the botanist, to reproduce Leeuwenhoek’s work, but Grew failed; so in 1677, on succeeding Grew as Secretary, Hooke himself turned his mind back to microscopy. Hooke too initially failed, but on his third attempt to reproduce Leeuwenhoek’s findings with pepper-water (and other infusions), Hooke did succeed in seeing the animalcules—‘some of these so exceeding small that millions of millions might be contained in one drop of water’
People were skeptical and didn’t take van Leeuwenhoek at his word alone. They tried to get the same results, to see these little animals for themselves, and for a number of years they failed. They got no further help from van Leeuwenhoek, who refused to share his methods, or the secrets of how he made his superior microscopes. Yet even without a precise recipe, Hooke was eventually able to see the tiny, wonderful creatures for himself. And when he did, van Leeuwenhoek became a scientific celebrity almost overnight.
If something is the truth about how the world works, the truth will come out, even if it takes Robert Hooke a few years to confirm your crazy stories about the little animals you saw in your spit. Yes, research is very exacting, and can demand great care and precision. Yes, there is a lot of implicit knowledge involved. The people who want to see for themselves might have to work for it. But if you think what you found is the real McCoy, then you should expect that other people should be able to go out and see it for themselves. And assuming you are more helpful than van Leeuwenhoek, you should be happy to help them do it. If you don’t think people will be able to replicate it at their own bench, are you sure you think you’ve discovered something?
Fast forward to the early 1900s. Famous French physicist Prosper-René Blondlot is studying X-rays, which had been first described by Wilhelm Röntgen in 1895. This was an exciting time for rays of all stripes: several forms of invisible radiation had just been discovered, not only X-rays but ultraviolet light, gamma rays, and cathode rays.
So Blondlot was excited, but not all that surprised, when he discovered yet another new form of radiation. He was firing X-rays through a quartz prism and noticed that a detector was glowing when it shouldn’t be. He performed more experiments and in 1903 he announced the discovery of: N-rays!
Blondlot was a famous physicist at a big university in France, so everyone took this seriously and they were all very excited. Soon other scientists had replicated his work in their own labs and were publishing scores of papers on the subject. They began documenting the many strange properties of N-rays. The new rays would pass right through many substances that blocked light, like wood and aluminum, but were obstructed by water, clouds, and salt. They were emitted by the sun and by human bodies (especially flexed muscles and certain areas of the brain), as well as by rocks that had been left in the sun and allowed to “soak up” the N-rays from sunlight.
The procedure for detecting these rays wasn’t easy. You had to do everything just right — you had to use phosphorescent screens as detectors, you had to stay in perfect darkness for a half hour so your eyes could acclimate, etc. Fortunately Blondlot was extremely forthcoming and always went out of his way to help provide these implicit details he might not have been able to fit in his reports. And he was vindicated, because with his help, labs all over the place were able to reproduce and extend his findings.
Well, all over France. Some physicists outside France, including some very famous ones, weren’t able to reproduce Blondlot’s findings at all. But as before, Blondlot was very forthcoming and did his best to answer everyone’s questions.
Even so, over time some of the foreigners began to get a little suspicious. Eventually some of them convinced an American physicist, Robert W. Wood, to go visit Blondlot in France to see if he could figure out what was going on.
Blondlot took Wood in and gave him several demonstrations. To make a long story short (you can read Wood’s full account here; it’s pretty interesting), Wood found a number of problems with Blondlot’s experiments. The game was really up when Wood secretly removed a critical prism from one of the experiments, and Blondlot continued reporting the same results as if nothing had happened. Wood concluded that N-rays and all the reports had been the work of self-deception, calling them “purely imaginary”. Within a couple of years, no one believed in N-rays anymore, and today they’re seen as a cautionary tale.
So much for the subtlety and implicit knowledge needed to do cutting-edge work. Maybe your results are hard to get right, but maybe if other people can’t reproduce your findings, they shouldn’t take your word for it.
This is the point of all those chemistry sets your parents (or cool uncle) gave you when you were a kid. This is the point of all those tedious lab classes in high school. They were poorly executed and all, but this was the idea. If whatever Röntgen or Pasteur or Millikan or whoever found is for real, you should be able to reproduce the same thing for yourself in your high school with only the stoner kid for a lab assistant (joke’s on you, stoners make great chemists; they’re highly motivated).
Some people will scoff. After all, what kind of teenager can replicate the projects reported in a major scientific journal? Well, as just one example, take Dennis Gabor: “during his childhood in Budapest, Gabor showed an advanced aptitude for science; in their home laboratory, he and his brother would often duplicate the experiments they read about in scientific journals.”
Clearly some studies will be so complicated that Hungarian teenagers won’t be able to replicate them, and others will require equipment they don’t have access to. And of course the Gabor brothers were not your average teenagers. But it used to be possible, and it should be made possible wherever it can be. Because otherwise you are asking the majority of people to take your claims on faith. If a scientist is choosing between two lines of work of equal importance, one that requires a nuclear reactor and the other that her neighbor’s kids can do in their basement, she should go with the basement.
It’s good if one big lab can recreate what another big lab claims to have found. But YOU are under no obligation to believe it unless you can replicate it for yourself.
You can of course CHOOSE to trust the big lab, look at their report and decide for yourself. But that’s not really replication. It’s taking someone’s word for something.
There’s nothing wrong with taking someone’s word; you do it all the time. Some things you can’t look into for yourself; and even if you could, you don’t have enough time to look into everything. So we are all practical people and take the word of people we trust for lots of things. But that’s not replication.
Something that you personally can replicate is replication. Watching someone else do it is also pretty close, since you still get to see it for yourself. Something that a big lab would be able to replicate is not really replication. It’s nice to have confirmation from a second lab, but now you’re just taking two people’s word for it instead of one person’s. Something that can in principle be replicated, but isn’t practical for anyone to actually attempt, is not replication at all.
If it cannot be replicated even in principle, then what exactly do you think you’re doing? What exactly do you think you’ve discovered here?
We find it kind of concerning that “does replicate” or “doesn’t replicate” have come to be used as synonyms of “true” and “untrue”. It’s not enough to say that things replicate or not. Blondlot’s N-ray experiments were replicated hundreds of times around France, until all of a sudden they weren’t; van Leeuwenhoek’s observations of tiny critters in pond water weren’t replicated for years, until they were. The modern take on replication (lots of replications from big labs = good) would have gotten both of these wrong.
If knowing the truth about some result is important to you, don’t just take someone’s word for it. Don’t leave it up to the rest of the world to do this work; we’re all bunglers, you should know that. If you can, you should try it for yourself.
So let’s look at some examples of REAL replication. We’ll take our examples from psychology, since as we saw earlier, they’re in the thick of the modern fight over replication.
We also want to take a minute to defend the psychologists, at least on the topic of replication (psychology has other sins, but that’s a subject for another time). Psychology has gotten a lot of heat for being the epicenter of the replication crisis. Lots of psychology studies haven’t replicated under scrutiny. There have been many high-profile disputes and attacks. Lots of famous findings seem to be made out of straw.
Some people have taken this as a sign that psychology is all bunkum. They couldn’t be more wrong — it’s more like this. One family in town gets worried and hires someone to take a look at their house. The specialist shows up and sure enough, their house has termites. Some of the walls are unsafe; parts of the structure are compromised. The family is very worried but they start fumigating and replacing boards that the termites have damaged to keep their house standing. All the other families in town laugh at them and assume that their house is the most likely to fall down. But the opposite is true. No other family has even checked their home for termites; but if termites are in one house in town, they are in other houses for sure. The first family to check is embarrassed, yes, but they’re also the only family who is working to repair the damage.
The same thing is going on in psychology. It’s very embarrassing for the field to have its big mistakes aired in public; but psychology is also distinct for being the first field willing to take a long hard look at itself and make a serious effort to change for the better. Psychologists haven’t done a great job, but theirs is one of the only fields that is even trying. We won’t name names but you can bet that other fields have just as many problems with p-hacking; the only difference is that those fields are doing a worse job rooting it out.
The worst thing you can say about psychology is that it is still a very young field. But try looking at physics or chemistry when they were only 100 years old, and see how well they were doing. From this perspective, psychology is doing pretty ok.
Despite setbacks, there has been some real progress in psychology. So here are a few examples of psychological findings that can actually be replicated, by any independent researcher in an afternoon. You don’t have to take our word or anyone else’s word for these findings if you don’t want to. Try it for yourself! Please do try this at home, that’s the point.
Are these the most important psychology findings? Probably not — we picked them because they’re easy to replicate, and you should be able to confirm their results from your sofa (disclaimer: for some of them, you may have to leave your sofa). But all of them are things we didn’t know about 150 years ago, so they represent a real advance in what we know about the mind.
For most of these you will need a small group of people, because most of these are statistically true results, not guaranteed to work in every case. But as long as you have a dozen people or so, they should be pretty reliable.
Draw a Bicycle — Here’s a tricky one you can do all on your own. You’ve seen a bicycle before, right? You know what they look like? Ok, draw one.
Unless you’re a bicycle mechanic, chances are you’ll be really rubbish at this — most people are. While you can recognize a bicycle no problem, you don’t actually know what one looks like. Most people produce drawings that look something like this:
Needless to say, that’s not a good representation of the average bicycle.
Seriously, try this one yourself right now. Don’t look up what a bicycle looks like; draw it as best you can from memory and see what you get. We’ll put a picture of what a bicycle actually looks like at the end of this post.
(A similar example: which of the images below shows what a penny looks like?)
Wisdom of the Crowd — Wisdom of the crowd refers to the fact that people tend to make pretty good guesses on average even when their individual guesses aren’t that good.
You can do this by having a group of people guess how many jellybeans are in a jar of jellybeans, or how much an ox weighs. If you average all the guesses together, most of the time it will be pretty close to the right answer. But we’ve found it’s more fun to stand up there and ask everyone to guess your age.
We’ve had some fun doing this one ourselves. It’s a nice trick, though you need a group of people who don’t know you all that well. It works pretty well in a classroom.
This only works if everyone makes their judgments independently. To make sure they don’t influence each other’s guesses, have them all write down their guesses on a piece of paper before anyone says a number out loud.
Individual answers are often comically wrong, sometimes off by up to a decade in either direction, but we’ve been very impressed by the group as a whole. In our experience the average of all the guesses is very accurate, often to within a couple of months. But give it a try for yourself.
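If you want to see why averaging helps before you try it on real people, here is a minimal sketch in Python. It assumes every guess is the true value plus independent, unbiased noise, which is our simplification and not something a real classroom is guaranteed to give you. The true age, the noise level, and the function name `simulate_crowd` are all made up for illustration.

```python
import random
import statistics

def simulate_crowd(true_value, n_guessers, noise_sd, seed=0):
    """Return independent noisy guesses centered on the true value.

    Assumption: errors are unbiased and independent. A crowd that is
    biased in one direction will not be rescued by averaging.
    """
    rng = random.Random(seed)
    return [rng.gauss(true_value, noise_sd) for _ in range(n_guessers)]

true_age = 34.0  # hypothetical age, chosen only for the demo
guesses = simulate_crowd(true_age, n_guessers=30, noise_sd=6.0)

# Error of the averaged guess versus the average error of single guesses.
crowd_error = abs(statistics.mean(guesses) - true_age)
typical_individual_error = statistics.mean(abs(g - true_age) for g in guesses)

print(f"crowd average is off by {crowd_error:.2f} years")
print(f"typical individual is off by {typical_individual_error:.2f} years")
```

To use it on a real group, replace `guesses` with the numbers people wrote down and `true_age` with the actual answer. The simulation only shows what independent errors would do; whether your group behaves that way is what the replication is for.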
Emotion in the Face — You look at someone’s face to see how they’re feeling, right? Well, maybe. There’s a neat paper from a few years ago that has an interesting demonstration of how this isn’t always true.
They took photos of tennis players who had just won a point or who had just lost a point, and cut apart their faces and bodies (in the photos; no tennis pros were harmed, etc.). Then they showed people just the bodies or just the faces and asked them to rate how positively or negatively the person was feeling:
They found that people could usually tell that a winning body was someone who was feeling good, and a losing body was someone feeling bad. But with just the faces, they couldn’t tell at all. Just look above – for just the bodies, which guy just won a point? How about for the faces, who won there?
Then they pushed it a step further by putting winning faces on losing bodies, and losing faces on winning bodies, like so:
Again, the faces didn’t seem to matter. People thought chimeras with winning bodies felt better than chimeras with losing bodies, and seemed to ignore the faces.
This one should be pretty easy to test for yourself. Go find some tennis videos on the internet, and take screenshots of the players when they win or lose a point. Cut out the faces and bodies and show them to a couple friends, and ask them to rate how happy/sad each of the bodies and faces seems, or to guess which have just won a point and which have just lost. You could do this one in an afternoon.
Anchoring — This one is a little dicey, and you’ll need a decent-sized group to have a good chance of seeing it.
Ask a room of people to write down some number that will be different for each of them — like the last four digits of their cell phone number, or the last two digits of their student ID or something. Don’t ask for part of their social security number or something that should be kept private.
Let’s assume it’s a classroom. Everyone takes out their student ID and writes down the last two digits of their ID number. If your student ID number is 28568734, you write down “34”.
Now ask everyone to guess how old Mahatma Gandhi was when he died, and write that down too. If this question bores you, you can ask them something else — the average temperature in Antarctica, the average number of floors in buildings in Manhattan, whatever you like.
Then ask everyone to share their answers with you, and write them on the board. You should see that people who have higher numbers as the last two digits of their student ID number (e.g. 78 rather than 22) will guess higher numbers for the second question, even though the two numbers are unrelated. They call this anchoring. You can plot the student ID digits and the estimates of Gandhi’s age on a scatterplot if you like, or even calculate the correlation. It should come out positive.
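If you’d rather calculate the correlation than eyeball the scatterplot, a small helper like the one below would do it. This is just the textbook Pearson formula written out in plain Python; the function name `pearson_r` is ours, and the numbers in the sanity checks are not classroom data, only toy inputs with known answers.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Sanity checks on toy inputs with known answers.
print(pearson_r([1, 2, 3], [2, 4, 6]))   # perfectly aligned
print(pearson_r([1, 2, 3], [6, 4, 2]))   # perfectly opposed
```

For the anchoring demo, pass the ID digits as `xs` and the age guesses as `ys`, in the same order. With a small class the number will bounce around a lot from one room to the next, which is part of why we called this one dicey.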
Inattentional Blindness — If you’ve taken an intro psych class, then you’re familiar with the “Invisible Gorilla” (for everyone else, sorry for spoiling). In the biz they call this “inattentional blindness” — when you aren’t paying attention, or your attention is focused on one task, you miss a lot of stuff.
Turns out this is super easy to replicate, especially a variant called “change blindness”, where you change something but people don’t notice. You can swap out whole people and about half the time, no one picks up on it.
False Memory — For this task you need a small group of people. Have them put away their phones and writing tools; no notes. Tell them you’re doing a memory task — you’ll show them a list of words for 30 seconds, and you want them to remember as many words as possible.
Then, show them the following list of words for 30 seconds or so:
After 30 seconds, hide or take down the list.
Then, wait a while for the second half of the task. If you’re doing this in a classroom, do the first step at the beginning of class, and the second half near the end.
Anyways, after waiting at least 10 minutes, show them these words and ask them, which of the words was on the original list?
Most people will incorrectly remember “sleep” as being on the original list, even though, if you go back and check, it’s not. What’s going on here? Well, all of the words on the original list are related to sleep — sleep adjectives, sleep sounds, sleep paraphernalia — and this leads to a false memory that “sleep” was on the list as well.
You can do the same thing for other words if you want. Showing people a list of words like “sour”, “candy”, and “sugar” should lead to false memories of the word “sweet”. You can also read the list of words aloud instead of showing it on a screen for 30 seconds; you should get the same result either way.
Draw your own conclusions about what this tells us about memory, but the effect should be pretty easy to reproduce for yourself.
We don’t think all false memory findings in psychology hold up. We think some of them aren’t true, like the famous Loftus & Palmer (1974) study, which we think is probably bullshit. But we do think it’s clear that it’s easy to create false memories under the right circumstances, and you can do it in the classroom using the approach we describe above.
You can even use something like the inattentional blindness paradigms above to give people false memories about their political opinions. A little on the tricky side but you should also be able to replicate this one if you can get the magic trick right. And if this seems incredible, ridiculous, unbelievable — try it for yourself!
Let us call this past form aristocratic tutoring, to distinguish it from a tutor you meet in a coffeeshop to go over SAT math problems while the clock ticks down. It’s also different than “tiger parenting,” which is specifically focused around the resume padding that’s needed for kids to meet the impossible requirements for high-tier colleges. Aristocratic tutoring was not focused on measurables. Historically, it usually involved a paid adult tutor, who was an expert in the field, spending significant time with a young child or teenager, instructing them but also engaging them in discussions, often in a live-in capacity, fostering both knowledge but also engagement with intellectual subjects and fields.
“Aristocratic tutoring” is not how we would describe it, but otherwise this sounds about right. We think Erik is right that historical tutoring was better than education today. But we don’t think being aristocratic is what made it better. So here are three other angles on the same idea:
Our personal educational philosophy is that, for the most part, the most important thing you can do for your students is expose them to things they wouldn’t have encountered otherwise. Sort of in the spirit of, you can lead a horse to water, but you can’t make him drink. So K-12 education gums up the works by making bad recommendations, having students spend a lot of time on mediocre stuff, and keeping them so busy they can’t follow up on the better recommendations from friends and family.
From this perspective, mechanized schooling is actually a net negative — it is worse than nothing, and if we just let kids run around hitting each other with sticks or whatever, we would get more geniuses.
But another possibility is that mechanized schooling is net neutral, and the problem is that we’ve lost some active ingredient that makes tutoring effective.
Education no longer includes moral instruction. Back in the day, a proper education taught you more than “the mitochondria is the powerhouse of the cell” — it taught you to take your character as seriously as your scholarship, to lead and to serve, and to understand your moral responsibilities. Tutoring worked because tutors inspired their pupils. Modern education is a lot of things, but “inspiring” ain’t one of them.
Back when formal education could still be inspiring, it still produced brilliant individuals. People have pointed out that the Manhattan Project was led by a group of strangely brilliant Hungarian scientists. Not only did most of them come from Budapest, many of them went to the same high school, and some of them had the same math teacher, László Rátz. Eugene Wigner, a Nobel Laureate in physics and one of Rátz’s pupils, had this to say:
… there were many superb teachers at the Lutheran gymnasium. But the greatest was my mathematics teacher László Rátz. Rátz was known not only throughout our gymnasium but also by the church and government hierarchy and among many of the teachers in the country schools. I still keep a photograph of Rátz in my workroom because he had every quality of a miraculous teacher: He loved teaching. He knew the subject and how to kindle interest in it. He imparted the very deepest understanding. Many gymnasium teachers had great skill, but no one could evoke the beauty of the subject like Rátz.
Rátz may or may not have been responsible for Wigner’s success, and he didn’t teach everyone involved in the Manhattan Project; our point is just that these Hungarians lived in a time when high school math teachers could still inspire former students to describe them as “miraculous”. This seems to be an aspect of the educational system that we have lost.
If this is right, then we don’t need to worry about tutoring being aristocratic. You shouldn’t need tutors or even miraculous Hungarian math teachers. Other things that are also inspiring / socially encouraging would work just as well. See for example the amazing progress of the speedrunning community, a bunch of teenage nerds bootstrapping a scene by inspiring one another to insane degrees of precision.
Erik hints at this by mentioning the social element. “For humans,” he says, “engagement is a social phenomenon; particularly for children, this requires interactions with adults who can not just give them individual attention, but also model for them what serious intellectual engagement looks like.” Individual attention is good, but we also think kids are good at teaching themselves. The active ingredient to us is showing kids “what serious intellectual engagement looks like”, and most kids today don’t see that until college (if ever).
The real problem is segregating children. Tutoring worked because you exposed children to people practicing a real skill (even if it’s only speaking their native language), or working in an actual profession. Modern education exposes them only to teachers.
At the end of your German tutelage you can speak to people you wouldn’t have been able to speak to before, read books and poems you wouldn’t have been able to read. At the end of your taxidermy tutelage you can take samples and stuff birds, and could theoretically make a living at it. Meanwhile at the end of high school you can write a five-point essay, a “skill” that you will never use again as long as you live.
So the problem is not the lack of tutoring per se, as much as the lack of giving children any sense of the real world at all. Today, children have to be sent to guidance counselors to be advised on what is out there. Teenagers dream of being youtubers and influencers. This isn’t their fault — these are some of the only professions where they actually understand what is involved. It’s the fault of adults, for not letting children see any of the many ways they could actually go out and exercise their powers in the world.
But tutoring isn’t the only way to expose children to real skills. Working in the family business did the same thing, and so did apprenticeships. Writing about why nerds are unpopular, Paul Graham says:
I’m suspicious of this theory that thirteen-year-old kids are intrinsically messed up. If it’s physiological, it should be universal. Are Mongol nomads all nihilists at thirteen? I’ve read a lot of history, and I have not seen a single reference to this supposedly universal fact before the twentieth century. Teenage apprentices in the Renaissance seem to have been cheerful and eager. They got in fights and played tricks on one another of course (Michelangelo had his nose broken by a bully), but they weren’t crazy.
As far as I can tell, the concept of the hormone-crazed teenager is coeval with suburbia. I don’t think this is a coincidence. I think teenagers are driven crazy by the life they’re made to lead. Teenage apprentices in the Renaissance were working dogs. Teenagers now are neurotic lapdogs. Their craziness is the craziness of the idle everywhere.
Paul is right; in many parts of the world, useful apprenticeship was the historical norm. As anthropologist David Graeber writes:
Feudal society was a vast system of service… the form of service that had the most important and pervasive influence on most people’s lives was not feudal service but what historical sociologists have called “life-cycle” service. Essentially, almost everyone was expected to spend roughly the first seven to fifteen years of his or her working life as a servant in someone else’s household. Most of us are familiar with how this worked itself out within craft guilds, where teenagers would first be assigned to master craftsmen as apprentices, and then become journeymen… In fact, the system was in no sense limited to artisans. Even peasants normally expected to spend their teenage years onward as “servants in husbandry” in another farm household, typically, that of someone just slightly better off. Service was expected equally of girls and boys (that’s what milkmaids were: daughters of peasants during their years of service), and was usually expected even of the elite. The most familiar example here would be pages, who were apprentice knights, but even noblewomen, unless they were at the very top of the hierarchy, were expected to spend their adolescence as ladies-in-waiting—that is, servants who would “wait upon” a married noblewoman of slightly higher rank, attending to her privy chamber, toilette, meals, and so forth, even as they were also “waiting” for such time as they, too, were in a position to marry and become the lady of an aristocratic household themselves.
Service was especially pervasive in England. “Few are born who are exempted from this fate,” wrote a Venetian visitor around 1500, “for everyone, however rich he may be, sends away his children into the houses of others, whilst he, in return, receives those of strangers into his own.”
Even just having your children around adults and being a part of adult conversations will go a long way. For what it’s worth, this is how we were raised, i.e. mostly around adults.
This may be another element common to the cases Erik mentions — most of the geniuses he names seem to have had very little contact with children outside their immediate family. Whether or not this is good for children psychologically is a separate question, but it does seem to lead to very skilled adults.
In fact, the number of children in a family might also be a factor. There was a time when most families were pretty large, so a lot of children had several older siblings. If you have five older brothers, you get both benefits — other children to play with, and a more direct line to adulthood through your older siblings. Erik mentions the example of Bertrand Russell, and we wonder if this might be more representative than he realizes:
When Bertrand Russell’s older brother introduced him to geometry at the age of 11, Russell later wrote in his autobiography that it was: “… one of the great events of my life, as dazzling as first love.” Is that really solely his innate genetic facility, or was mathematics colored by the love of his older brother?
It’s easy to come up with other examples (though of course this is not universal). Charles Darwin was the fifth of six children. The Polgár sisters are all chess prodigies, and were intentionally raised to be geniuses, but the youngest daughter Judit is the best of the three. Jane Austen had five older brothers and an older sister. Her eldest brother James wrote prologues and epilogues for plays the family staged and it seems as though this moved Jane to try her hand at something similar.
So part of the success of tutoring might simply be exposing a child to subjects “before they are ready”, and one way to reliably do that is to have them overhear the lessons of their older siblings, who they are ready to imitate.
This ties neatly into the social/moral element we mention above. Children may be moved by a passionate tutor, or a beloved uncle, or a cousin, or a medical student who lives in the spare room. But they will always be influenced by older siblings, and the more older siblings there are, the more gates to adult influence will be opened. Maybe if we want more geniuses, people need to start having larger families.
A thermostat is a simple example of a control system. A basic model has only a few parts: some kind of sensor for detecting the temperature within the house, and some way of changing the temperature. Usually this means it has the ability to turn the furnace off and on, but it might also be able to control the air conditioning.
The thermostat uses these abilities to keep the house at whatever temperature a human sets it to — maybe 72 degrees. Assuming no major disturbances, the control system can keep a house at this temperature indefinitely.
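To make the parts concrete, here is a minimal sketch of a furnace-only thermostat in Python. Everything numeric in it is an assumption we made up for the demo: the outside temperature, how fast the furnace heats, how fast the house leaks heat, and the half-degree band around the setpoint that keeps the furnace from flickering on and off. The function name `run_thermostat` is ours.

```python
def run_thermostat(setpoint=72.0, outside=40.0, start=60.0, steps=600):
    """Simulate an on/off thermostat; return the indoor temperature history."""
    temp = start
    furnace_on = False
    band = 0.5        # switch on below setpoint - band, off above setpoint + band
    heat_rate = 0.3   # degrees gained per step while the furnace runs (made up)
    leak = 0.005      # fraction of the indoor/outdoor gap lost per step (made up)
    history = []
    for _ in range(steps):
        # Sensor and decision: compare the reading to the setpoint.
        if temp < setpoint - band:
            furnace_on = True
        elif temp > setpoint + band:
            furnace_on = False
        # House physics: heat leaks outside, the furnace adds heat when on.
        temp += -leak * (temp - outside) + (heat_rate if furnace_on else 0.0)
        history.append(temp)
    return history

history = run_thermostat()
settled = history[200:]  # skip the initial warm-up from 60 degrees
print(f"after warm-up: min {min(settled):.2f}, max {max(settled):.2f}")
```

In this toy model the furnace keeps cycling on and off, but the temperature stays in a narrow band around the setpoint. Try changing `outside` and running it again; as long as the furnace can outpace the leak, the band should not move much.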
In the real world, control systems are all over the place.
Imagine that a car is being driven across a hilly landscape.
A man is operating this car. Let’s call him Frank. Now, Frank is a real stickler about being a law-abiding citizen, and he always makes sure to go exactly the speed limit.
On this road, the speed limit is 35 mph. So Frank uses the gas pedal and the brake pedal to keep the car going the speed limit. He uses the gas to keep from slowing down when the road slopes up, and to keep the car going a constant speed on straightaways. He uses the brake to keep from speeding up when the road slopes down.
The road is hilly enough that frequent use of the gas and brake are necessary. But it’s well within Frank’s ability, and he successfully keeps the needle on 35 mph the whole time.
Together, Frank and the car form a control system, just like a thermostat, that keeps the car at a constant speed. You could also replace Frank’s brain with the car’s built-in cruise control function, if it has one, and that might provide an even more precise form of control. But whatever is doing the calculations, the entire system functions more or less the same way.
Surprisingly, if you graph all the variables at play here — the angle of the road, the gas, the brake, and the speed of the car at each time point — speed will not be correlated with any of the other variables. Despite the fact that the speed is almost entirely the result of the combination of gas, brake, and slope (plus small factors like wind and friction), there will be no apparent correlation, because the control system keeps the car at a constant 35 mph.
Similarly, if you took snapshots of many different Franks, driving on many different roads at different times, there would be no correlation between gas and speed in this dataset either.
We understand something about the causal system that is Frank and his car, and how this system responds to local traffic regulations, so we understand that gas and brake and angle of the road ARE causally responsible for that speed of 35 mph. But if an alien were looking at a readout of the data from a bunch of cars, their different speeds, and the use of various drivers’ implements as they rattle along, it would be hard pressed to figure out that the gas makes the car speed up and the brake makes it slow down.
We see that despite being causally related, gas and brake aren’t correlated with speed at all.
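You can check this with a toy simulation. The sketch below is our own simplified model of Frank, not a model of a real car: each step the hill takes away (or adds) some speed, Frank sees the hill and presses gas or brake to cancel it and to correct any leftover error, and a small random wobble stands in for wind, friction, and birds. All of the numbers and the names `drive` and `pearson_r` are invented for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def drive(steps=5000, target=35.0, seed=1):
    """Simulate Frank holding the speed limit over random hills."""
    rng = random.Random(seed)
    speed = target
    slopes, pedals, speeds = [], [], []
    for _ in range(steps):
        # Speed the hill would remove this step (negative means downhill).
        slope = rng.uniform(-3.0, 3.0)
        # Frank's response: positive is gas, negative is brake.
        # Assumption: he sees the hill and compensates for it fully.
        pedal = slope + (target - speed)
        wobble = rng.gauss(0.0, 0.05)
        # The pedal directly changes the speed, so it is causal by construction.
        speed = speed + pedal - slope + wobble
        slopes.append(slope)
        pedals.append(pedal)
        speeds.append(speed)
    return slopes, pedals, speeds

slopes, pedals, speeds = drive()
print(f"pedal vs speed: r = {pearson_r(pedals, speeds):+.3f}")
print(f"slope vs pedal: r = {pearson_r(slopes, pedals):+.3f}")
```

In this model the pedal is written straight into the line that updates the speed, so there is no question that it causes speed. Even so, the first correlation should come out close to zero, while the slope and the pedal track each other almost perfectly. If you remove the `(target - speed)` correction and let Frank press the pedal at random, the relationship between pedal and speed reappears.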
This is a well-understood, if somewhat understated, problem in causal inference. We’ve all heard that correlation does not imply causation, but most of us assume that when one thing causes another thing, those two things will be correlated. Hotter temperatures cause ice cream sales; and they’re correlated. Fertilizer use causes bigger plants; correlated. Parental height causes child height; you’d better believe it, they’re correlated.
Weirdly enough, sometimes there are causal relationships between two things and yet no observable correlation. Now that is definitely strange. How can one thing cause another thing without any discernible correlation between the two things? Consider this example, which is illustrated in Figure 1.1. A sailor is sailing her boat across the lake on a windy day. As the wind blows, she counters by turning the rudder in such a way so as to exactly offset the force of the wind. Back and forth she moves the rudder, yet the boat follows a straight line across the lake. A kindhearted yet naive person with no knowledge of wind or boats might look at this woman and say, “Someone get this sailor a new rudder! Hers is broken!” He thinks this because he cannot see any relationship between the movement of the rudder and the direction of the boat.
Let’s look at one more example, from the same textbook:
[The boat] sounds like a silly example, but in fact there are more serious versions of it. Consider a central bank reading tea leaves to discern when a recessionary wave is forming. Seeing evidence that a recession is emerging, the bank enters into open-market operations, buying bonds and pumping liquidity into the economy. Insofar as these actions are done optimally, these open-market operations will show no relationship whatsoever with actual output. In fact, in the ideal, banks may engage in aggressive trading in order to stop a recession, and we would be unable to see any evidence that it was working even though it was!
There’s something interesting that all of these examples — Frank driving the car, the sailor steering her boat, the central bank preventing a recession — have in common. They’re all examples of control systems.
Like we emphasized at the start, Frank and his car form a system for controlling the car’s speed. He goes up and down hills, but his speed stays at a constant 35 mph. If his control is good enough, there will be no detectable variation in the speed at all.
The sailor and her rudder are acting as a control system in the face of disturbances introduced by the wind. Just like Frank and his car, this control system is so good that to an external observer, there appears to be no change at all in the variable being controlled.
The central bank is doing something a little more complicated, but it is also acting as a control system. Trying to prevent a recession is controlling something like the growth of the economy. In this example, the economy keeps growing at about the same rate because of the central bank’s canny use of open-market operations, bonds, liquidity, etc. in response to some kind of external shock that would otherwise cause economic growth to stall or plummet, that is, cause a recession. And “insofar as these actions are done optimally, these open-market operations will show no relationship whatsoever with actual output.”
The same thing will happen with a good enough thermostat, especially if it has access to both heating and cooling / air conditioning. The thermostat will operate its different interventions in response to external disturbances in temperature (from the sun, wind, doors being left open, etc.), and the internal temperature of the house will remain at 72 degrees, or whatever you set it at.
If you looked at the data, there would be no correlation between the house’s temperature and the methods used to control that temperature (furnace, A/C, etc.), and if you didn’t know what was going on, it would be hard to tell what was causing what.
In fact, we think this is the case for any control system. If a control system is working right, the target — the speed of Frank’s car, the direction of the boat, the rate of growth in the economy, the temperature of the house — will remain about the same no matter what. Depending on how sensitive your instruments are, you may not be able to detect any change at all.
If control is perfect — if Frank’s car stays at exactly 35 mph — then the system is leaking literally no information to the outside world. You can’t learn anything about how the system works because any other variable plotted against MPH, even one like gas or brake, will look something like this:
This is true even though gas and brake have a direct causal influence on speed. In any control system that is functioning properly, the methods used to control a signal won’t be correlated with the signal they’re controlling.
Worse, there will be several variables that DO show relationships, and may give the wrong impression. You’re looking at variables A, B, C, and D. You see that when A goes up, so does B. When A goes down, C goes up. D never changes and isn’t related to anything else — must not be important, certainly not related to the rest of the system. But of course, A is the angle of the road, B is the gas pedal, C is the brake pedal, and D is the speed of the car.
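To make this concrete, here is a minimal simulation sketch in Python. It is a toy, not a model of a real car. It assumes the road angle is just a random number at each step, that under control the pedal cancels the road exactly, and that a little unrelated noise reaches the speedometer. The function names (`pearson`, `drive`) and every number in it are made up for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def drive(n_steps=10_000, target_mph=35.0, controlled=True, seed=0):
    """Toy drive; returns (road_angle, pedal, speed) as three lists.

    Assumed (hypothetical) model: speed = target + pedal - road_angle + noise.
    Positive pedal stands for gas, negative pedal for brake.
    """
    rng = random.Random(seed)
    road, pedal, speed = [], [], []
    for _ in range(n_steps):
        hill = rng.gauss(0.0, 1.0)       # disturbance from the road
        if controlled:
            push = hill                  # idealized control: cancel the hill exactly
        else:
            push = rng.gauss(0.0, 1.0)   # no control: pedal pressed at random
        noise = rng.gauss(0.0, 0.1)      # everything else (birds, friction, ...)
        road.append(hill)
        pedal.append(push)
        speed.append(target_mph + push - hill + noise)
    return road, pedal, speed


if __name__ == "__main__":
    for controlled in (True, False):
        road, pedal, speed = drive(controlled=controlled)
        print("controlled" if controlled else "uncontrolled")
        print("  corr(road, pedal): ", round(pearson(road, pedal), 3))
        print("  corr(pedal, speed):", round(pearson(pedal, speed), 3))
```

In the controlled run, the pedal tracks the road almost perfectly while its correlation with speed hovers near zero, even though the pedal causally moves the speed in every single step. In the uncontrolled run, the same causal link shows up as an obvious correlation.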
If control isn’t perfect, or your instruments are sensitive enough to detect when Frank speeds up or slows down by fractions of an mph, then some information will be let through. But this doesn’t mean that you’ll be able to get a correlation. You may be able to notice that the car speeds up a little on the approach to inclines and slows down when it goes downhill, and you may even be able to tie this to the gas and brake. But it shouldn’t show up as a correlation. You would have to use some other analysis technique, and we’re not sure such a technique exists.
And if you don’t understand the rest of the environment, you’ll be hard pressed to tell which variation in speed is leaked from the control system and which is just noise from other sources — from differences in friction across the surface of the road, from going around curves, from imperfections in the engine, from Frank being distracted by birds, etc.
This seems like it might be a big problem, because control systems are found all over biology, medicine, and psychology.
Biology is all about homeostasis — maintaining stability against constant outside disturbances. Lots of the systems inside living things are designed to maintain homeostatic control over some important variable, because if you don’t have enough salt or oxygen or whatever, you die. But figuring out what controls what can be kind of complicated.
(If you’re getting ready to lecture us on the difference between allostasis and homeostasis, go jump in a pond instead.)
Medicine is the applied study of one area of biology (i.e. human biology, for the most part), so it faces all the same problems biology does. The human body works to control all sorts of variables important to our survival, which is good. But if you look at a signal relevant to human health, and want to figure out what controls that signal, chances are it won’t be correlated with its causes. That’s… confusing.
Lots of people forget that psychology is biological, but it obviously is. The brain is an organ too; it is made up of cells; it works by homeostatic principles. This is an under-appreciated perspective within psychology itself but some people are coming around; see for example this recent paper.
If you were to ask us what field our book A Chemical Hunger falls under, we would say cognitive science. Hunger is pretty clearly regulated in the brain as a cognitive-computational process and it’s pretty clearly part of a number of complicated homeostatic systems, systems that are controlling things like body weight and energy. So in a way, this is psychology too.
It’s important to remember that statistics was largely developed in fields like astronomy, demography, population genetics, and agriculture, which almost never deal with control systems. Correlation as you know it was introduced by Karl Pearson (incidentally, also a big racist; and worse, a Sorrows of Young Werther fan), whose work was wide-ranging but largely focused on genetic inheritance. While correlation was developed to understand things like barley yields, and can do that pretty well, it just wasn’t designed with control systems in mind. It may be unhelpful, or even misleading, if you point it at the wrong problem.
For a mathematical concept, correlation is not even that old, barely 140 years. So while correlation has captured the modern imagination, it’s not surprising that it isn’t always suited to scientific problems outside the ones it was invented to tackle.
Lady Wonder “was a mare some claimed to have psychic abilities and be able to perform intellectually demanding tasks such as arithmetic and spelling. …Lady was said to have predicted the outcome of boxing fights and political elections, and was consulted by the police in criminal investigations.”
Did you ever spend time in… middle school? If so, you may recognize some of these urban legends about drugs. Who can forget such classics as “Bananadine” or “Man permanently thinks he is an orange and is terrified of being turned into a glass of orange juice.” We love that Wikipedia has an article on this.
Monte Testaccio is an artificial hill in Rome over 100 feet high, and 1 km in circumference, composed of fragments of broken ancient Roman pottery dating from the time of the Roman Empire. Gotta go back to Rome so I can look at this friggin’ bing.
Also per Wikipedia: Albert Einstein loved the children’s puppet show Time for Beany. “On one occasion, the physicist interrupted a high-level conference by announcing, ‘You will have to excuse me, gentlemen. It’s Time for Beany.’”
Alex Wellerstein writes a retrospective on 10 years of NUKEMAP. “Historians should not be surprised by the passing of time, but people are, and historians are people, so, well, here I am, continually surprised.” Relatedly, if you ever think nuclear war is about to occur, consider taking a 90-day trip to New Zealand.
Breastfeeding by humans of animals — much more common than you might think! “The reasons for this are varied: to feed young animals, to drain a woman’s breasts, to promote lactation, to harden the nipples before a baby is born, to prevent conception, and so on. … In far northern Japan, the Ainu people are noted for holding an annual bear festival at which a captured bear, raised and suckled by the women, is sacrificed.”
Best in Blogging this month:
Adam at Experimental History describes bureaucratic psychosis. “The best way I’ve found to keep it at bay is to simply excuse myself from other people’s Renaissance Fair realities and go play somewhere else. Let the obtuse administrators, sadistic gatekeepers, and conmen consultants rule their blob-land; I am happy sharing a little corner of the world with people who see me as a person.”
Applied Divinity Studies put out a two-part series on the purported shoplifting wave in San Francisco (Part 1, Part 2). We recommend reading it in full, but to summarize, ADS thinks that this supposed crime spree is a complete fantasy, driven by selective reporting and “an abject failure to do even the bare minimum of background research”. Seriously chilling implications about how much you can trust reporting and for our political landscape. “If you stick though this series, you’ll get to hear… how we ended up in this weird and wacky world where libertarian VCs somehow end up agreeing with liberals like Nancy Pelosi and London Breed, and where the stance they all agree on is that we should be tough on a crime, a stance historically antithetical to both parties’ platforms.”
The collapse of the Roman Empire in the West is a complex sequence of events and one that often resists easy answers, but it is a useful one to think about, particularly as we now sit atop our own fragile clockwork economic mechanism, suspended not a few feet but many miles above the grinding poverty of pre-industrial life and often with our own arsonists, who are convinced that the system is durable and stable because they cannot imagine it ever vanishing.
In the beginning, scientific articles were just letters. Scholars wrote to each other about whatever they were working on, celebrating their discoveries or arguing over minutiae, and ended up with great stacks of the things. People started bringing interesting letters to meetings of the Royal Society to read aloud, then scientists started addressing their letters to the Royal Society directly, and eventually Henry Oldenburg started pulling some of these letters together and printing them as the Philosophical Transactions of the Royal Society, the first scientific journal.
In continuance of this hallowed tradition, in this blog post we are publishing some philosophical transactions of our own: correspondence with JP Callaghan, an MD/PhD student at a large Northeast research university going into anesthesia. He has expertise in protein statistical mechanics and kinetic modeling, so he reached out to us with several ideas and enlightened criticisms.
With JP Callaghan’s help we have lightly edited the correspondence for clarity, turning the multi-threaded format of the email exchange into something more linear. We found the conversation very informative, and we hope you do as well! So without further ado:
I’m sure someone already suggested this but the Fulbright program executes the “move abroad” experiment every year. In fact, they do the reverse experiment as well, paying foreigners to move to the US. The Philippines Fulbright program seems especially active.
(The Peace Corps is already doing this experiment as well, but that’s probably more confounded since people are often living in pretty rustic locations.)
You could pretty easily imagine paying these folks a little extra money to send you their weight once a month or whatever.
SLIME MOLD TIME MOLD: Thank you! Yeah, we’ve been trying to figure out the best way to pursue this one, using existing data if possible. Fulbright is a good idea, especially US <–––> Philippines, and especially because we suspect young people will show weight changes faster. We’ve also thought about trying to collect a sample of expats, possibly on reddit, since there are a lot of anecdotes of weight loss in those communities.
The tricky thing is finding someone who has an in with one of these groups. We probably can’t just cold call Fulbright and ask how much all their scholars weigh, though we’ll start asking around.
JPC: Unfortunately my connection with the Fulbright was brief, superficial, and many years ago. I can ask around at my university, though. I’m not filled with unmitigated optimism, but the worst they can do is say no/ignore me.
Also, I wanted to mention that lithium levels are measured extremely often in clinical practice, to monitor therapeutic lithium (e.g. for bipolar folks). (Although I will concede usually they are measuring .5 – 1.5 mmol/L which would be way higher than serum levels due to contamination.) Also, it’s interesting that the early pharmacokinetic studies also measured urine lithium (see e.g. Barbara Ehrlich’s seminal 1980 paper) so there’s precedent for that as well. I’m led to understand from my lab medicine colleagues that it’s a relatively straightforward (aka cheap) electrochemical assay, at least in common clinical practice.
SMTM: We’ve looked into measurement a bit. We’re concerned that serum levels aren’t worth measuring, since lithium seems to accumulate in the brain and we suspect that would be the mechanism (a commenter suggested it might also be accumulation in bone). But if we were to do clinical measurements, we’d probably measure lithium in urine or maybe even in saliva, since there’s evidence they’re good proxies for one another and for the levels in serum, and they’re easier to collect. Urine might be especially important if lithium clearance rate ends up being a piece of the puzzle, which it seems like it might.
JPC: It is definitely true that lithium accumulates inside cells (definitely rat neurons and human RBCs, probably human neurons, but maybe not human muscle; see e.g. that Ehrlich paper I mentioned). The thing is, lithium kinetics seem to be pretty fast. Since it’s an ion, it doesn’t partition into fat the way other long-lasting medications and toxins do, and so it’s eliminated fairly quickly by the kidneys. (THC is a classic example of a hydrophobic “contaminant”; this same physical chemistry explains why a long-time pothead will test positive for THC for months, but you can stop using cocaine and, 72 hours later, screen negative.)
It might be worth your time to look at some of the lithium washout experiments that have been done over the years (e.g. Hunter, 1988 where they see lithium levels rapidly decline after stopping lithium therapy that had been going on for a month).
I suppose, though, that I’m not aware of any data that specifically excludes the possibility that there is a very slow “third compartment” where lithium can deposit (such as, as your commenter suggested, bone; although I don’t know much about whether or not lithium can incorporate into the hydroxyapatite matrix in bone. It’s mostly calcium phosphate and I’m not sure if lithium could “find a place” in that crystalline matrix).
Anyway, though, my understanding is that lithium kinetics in the brain are relatively fast. (For instance, see Ebadi, et al where they measure [Li] in rat brains over time.) So even if you have a highly accumulated slow bone compartment, the levels of lithium you’d get in the brain would still be super low, because it equilibrates with the blood quickly and therefore is subject to rapid elimination by the kidneys.
However, I don’t think you need to posit accumulation for your hypothesis. If you’re exposed to constant, low levels of lithium, you reach an equilibrium. There’s some super low serum concentration, some rather-higher intracellular concentration, and it’s all held in steady state by the constant intake via the GI tract (say, in the water) and constant elimination by the kidneys. Perhaps this is what you’re getting at when you say the rate of elimination might be very important?
Instead, consider some interesting pharmacodynamics: low-level (or maybe widely fluctuating, since lithium is also quickly cleared?) exposure to lithium messes with the lipostat. This process is probably really slow, maybe because weight change is slow or maybe because of some kind of brain adaptation process or whatever. We have good reason to suspect low-level lithium has neurological effects already anyway through some of the population-level suicide data I’m sure you’re aware of.
Urine and serum levels of lithium are only good proxies for one another at steady state. I really strongly suggest you guys look at that Ehrlich paper. She measures serum, intra-RBC, and urine [Li] after a dose of lithium carbonate (the most common delayed-release preparation of pharmaceutical lithium).
Another good one is Gaillot et al which demonstrates how important the form of lithium (lithium carbonate vs LiCl) is to the kinetics. (As an aside, this might be a reason for lithium grease to be so bad; lithium grease is apparently some kind of weird soap complex with fatty acids, maybe it gets trapped in the GI tract or something.)
SMTM: The rat studies are interesting but don’t rats seem like a bad comparison for determining something like rate of clearance? Besides just not being human, their metabolisms are something like 6-8x faster than ours and their lifespans are about 20 times shorter. Also human brains are huge. What do you think?
JPC: Certainly I agree that rats are not people and are bad models in many ways. I think that renal function is the key parameter you’d want to compare. The most basic measure of kidney function is the GFR (glomerular filtration rate), which basically measures how much fluid gets pushed through the “kidney filter” per unit time. Unfortunately in people we measure it in volume/time/body surface area and in rats volume/time/mass which makes a comparison less obvious than I was hoping. To be honest, I am not sure how comparable rat kidney function and human kidney function are. (Definitely more comparable than live and dead human kidney function, though.)
What do you mean by “their metabolisms are something like 6-8x faster than ours”? Like, calories/mass/time? Usually when I think about “metabolic rate” I am thinking of energy usage. When we think about drug elimination, the main things that matter are 1) liver function (for drugs that are hepatically metabolized) 2) various tissue enzyme function (e.g. plasma esterases for something like esmolol) and 3) renal function. I don’t generally think about basal metabolic rate as being a pertinent factor, really, except perhaps in cases where it’s a proxy for hepatic metabolism.
Lithium is eliminated (“cleared”) almost exclusively by the kidney and it undergoes no metabolic transformations, so I wouldn’t worry about anything but kidney function for its clearance.
You’re right, though, the 20x lifespan difference could be an issue. If we are worried about accumulation on the timescale of years, then obviously a shorter rat life is a problem. But (if I read your blog posts right) rats as experimental animals are also getting fatter so presumably the effect extends to them on the timescale of their life? (Did you have data in rats? I don’t remember.)
Indeed, if it’s actually just that there is a constant low-level “infusion” of lithium via tapwater, grease exposure at work, etc. giving rise to a low steady-state lithium (rather than actual bioaccumulation), this would explain why the effect does extend to these short-lived experimental animals.
SMTM: You make good points about laboratory animals. There are data on rats and they do seem to be getting heavier. Let’s stick a pin in this one for now; you may find this next bit is relevant to the same questions:
In your opinion, are the studies you cite consistent or inconsistent with the findings of Amdisen et al. 1974 and Schoepfer et al. 2021? Also potentially relevant is Amdisen 1977. We describe their findings near the end of this section; basically, they seem to suggest that Li accumulates preferentially in the bones, thyroid, and parts of the brain. The total sample size is small but it seems suggestive. We agree accumulation may not be essential to the theory but doesn’t this look like evidence of accumulation? We’ve attached copies of Amdisen et al. 1974 and Amdisen 1977 as PDFs in case you want to take a closer look. [SMTM’s Note: If anyone else wants to see these papers, you can email us.]
Especially interesting that Ebadi et al. say, “it has been shown that sodium intake exerts a significant influence on the renal elimination of lithium (Schou, 1958b)”, somewhat in line with our speculation here. We’ll have to look into that.
JPC: Thanks for the papers. As you predicted, I’m finding them super interesting.
Schoepfer et al, 2021 is a lovely, very interesting paper (complete with some adorable Deutsch-English). I was aware of it but had not taken the time to read it yet.
By my read, it is primarily seeking to establish this new, nuclear fission based approach to measuring lithium in pathology tissue. After spending some time with it, I don’t really know how to interpret their findings. The main reason I am not sure what to do with this paper is that the results are in dead people’s brains. Indeed, they specifically note in their ‘limitations’ section: “The lithium distribution patterns so far obtained with the NIK method, thus in no way contradicting given literature references, are based on post mortem tissue.” The reason this is pertinent is that there is a lot of active transport of other monovalent cations (K, Na) and so I would worry that this is true for lithium as well and (obviously) this is almost certainly disrupted in dead people.
The second thing is that the tissue was fixed in (presumably) formalin and stained with hematoxylin and eosin before measuring lithium, which then comes out in units of mass/mass. Obviously in living tissue there’s lots of water and whatnot, and the mass-density of water and formalin is going to be pretty different.
So, as the authors say, I would say it’s neither consistent nor inconsistent with other data.
SMTM: It’s true that all the brain samples we have in humans are in dead brain tissue, but this seems like an insurmountable issue, right? Looking at dead tissue is the only way to get even a rough estimate of how much lithium is in the brain, since as far as we know there’s no way to test the levels in a living human brain, or if there is, no one has taken those measurements and it’s outside our current budget.
In any case, the most relevant findings from these studies, at least in our opinion, are 1) that lithium definitely reaches brain tissue and sticks around for a while, and 2) regardless of absolute levels, there seems to be relatively more lithium in parts of the brain that regulate appetite and weight gain. These conclusions seem likely to hold even given all the reasonable concerns about dead tissue. What do you think?
JPC: I agree. In my mind, the main question is whether or not lithium persists in the brain after cessation of lithium therapy. Put more rigorously, what is the rate of exchange between the “brain compartment” and (probably) the “serum compartment.” (I guess it could also be eliminated by CSF too maybe? Or “glymphatics”? idk I guess nobody really understands the brain.)
The main issue I have is this: if you’re exposed, say, to 20 ppb lithium and your serum has 20 ppb lithium and so does the cytoplasm in your neurons, this is actually the null hypothesis (that lithium is an inert substance that just flows down its concentration gradient). It’s obviously false (we know lithium concentrates in RBCs of healthy subjects, for instance), but this paper doesn’t help me decide if lithium 1) passively diffuses throughout the body 2) is actively concentrated in neurons, or even 3) is actively cleared from cells, simply because I don’t really know what to do with the number.
The second issue is the preparation. Maybe formalin fixation washes lithium away, or when it fixes cell membranes maybe the lithium is allowed to diffuse out. Maybe it poorly penetrates myelin sheaths, and has a tendency to concentrate the lithium inside cells by making the extracellular environment more hydrophobic (nature abhors an unsolvated ion).
Another reason I am so skeptical of the “slow lithium kinetics” hypothesis is just the physical chemistry of lithium. It’s a tiny, charged particle. Keeping these sorts of ions from moving around and distributing evenly is actually really hard in most cases. There are a few cases of ionic solids in the human body (various types of kidney stones, bones, bile stones) but for the most part these involve much less soluble ions than lithium and everything is dissolved and flows around at its whim except where it’s actively pumped.
SMTM: This is a good point, and in addition, the fact that tourists and expats seem to lose weight quickly does seem to be a point in favor of fast lithium over slow lithium. If those anecdotes bear out in some kind of more systematic study, “slow lithium kinetics” starts looking really unlikely. Another possibility, though, is that young people are the only ones who lose weight quickly on foreign trips, and there’s something like a “weight gain in the brain, reservoir in the bone” system where people remain dosed for a long time once enough has built up in their bones (or some other reservoir).
JPC: Very possible. Also young people generally have better renal function. There are tons of people walking around with their kidneys at like 50% or worse who don’t even know it.
A third and distant issue is what I mentioned about the active transport of Na and K that happens in neurons (IIRC something like 1/3 of your calories are spent doing this) ceasing when you’re dead. This is also a fairly big deal, though, since there are various cation leak channels in cell membranes (for electrical excitability reasons, I think; ask an electrical engineer or a different kind of biophysicist) through which Li might also escape. (Since, after all, a reasonable hypothesis for the mechanism of action is that Li uses Na channels.)
Between these three difficulties, I do actually see this as borderline insurmountable for ascertaining how much lithium is in an alive brain based on these data. Basically, it comes down to “I don’t know how much lithium I should expect there to be in these experiments.”
However, “relatively more lithium in parts of the brain that regulate appetite and weight gain” is a good point. I think that this is something you actually can reasonably say: it seems like there is more lithium in these areas than other areas. The within-experiment comparisons definitely seem more sound. It would also be consistent with the onset of hunger/appetite symptoms below traditionally-accepted therapeutic ranges.
I do also want to clarify what I mean by “no accumulation.” There is of course a sort of accumulation for all things at all times. You take a dose of some enteral medication, it leaches into your bloodstream from your gut, accumulating first in the serum. It then is distributed throughout the body and accumulates in other compartments (brain, liver, kidney, bone, whatever). Assuming linear pharmacokinetics, there’s some rate that the drug goes in to and out of each of these compartments.
If you keep taking the drug and the influx rate (from the serum into a compartment) is higher than the efflux rate (back to the serum from the compartment), the steady state in the compartment will be higher than the serum at steady state. In some sense, this could be called “accumulation.” But in another sense, if both these rates are fast, your accumulation is transient and quickly relaxes to zero if you clear the serum compartment of drug (which we know happens in normal individuals in the case of lithium). Although the concentration in the third compartment is indeed higher than in the serum, if you stop taking the drug, it will wash out (first from the serum then, more slowly, from the accumulating compartment).
SMTM: Thanks, this clarification is helpful. To make sure we understand, “accumulation” to you means that a contaminant goes to a part of the body, stays there, and basically never leaves. But you’re open to “a sort of accumulation” where 50 units go into the brain every day and only 10 units are cleared, leading to a more-or-less perpetual increase in the levels. Is that right?
JPC: Yes. I would frame this in terms of rates, though. So 5 x brain concentration units go to the brain and 1 x brain concentration units go out of the brain per unit time, such that you get a steady state concentration difference between the serum and the brain of in_rate / out_rate (in this case).
You guys seem mathy so I’ll add: for an arbitrary number of compartments this is just a first-order ODE. You can represent this situation as rate matrix K where element i, j represents the rate (1/time) that material flows from compartment i to j (or maybe j to i, I can never remember). Anyway this usually just boils down to something looking like an eigenvector problem to get the stationary distribution of things. (Obviously things get more complicated when you have pulsatile influx.)
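A minimal sketch of this compartment picture, in Python. None of it comes from the correspondence beyond the 5-in, 1-out example rates: the two compartments (serum and brain), the constant intake into serum, the renal elimination rate, and the time units are all assumptions chosen for illustration. The convention here is that `K[i][j]` is the rate at which material flows from compartment `i` to compartment `j`, and the system is stepped forward with plain Euler updates instead of solving the eigenvector problem.

```python
def simulate(conc, K, intake, elim, dt, n_steps):
    """Advance a linear compartment model with forward-Euler steps.

    conc   : starting level in each compartment
    K      : K[i][j] = rate (1/time) of flow from compartment i to j
    intake : constant input into each compartment (amount/time)
    elim   : rate (1/time) of irreversible loss from each compartment
    Returns the final levels as a list.
    """
    n = len(conc)
    conc = list(conc)
    for _ in range(n_steps):
        new = []
        for i in range(n):
            inflow = sum(K[j][i] * conc[j] for j in range(n) if j != i)
            out_rate = sum(K[i][j] for j in range(n) if j != i) + elim[i]
            new.append(conc[i] + dt * (intake[i] + inflow - out_rate * conc[i]))
        conc = new
    return conc


# Compartment 0 = serum, compartment 1 = brain (hypothetical numbers).
K = [[0.0, 5.0],   # serum -> brain at rate 5
     [1.0, 0.0]]   # brain -> serum at rate 1
ELIM = [1.0, 0.0]  # only serum is cleared (standing in for the kidneys)

# Phase 1: constant low-level intake until things settle.
dosed = simulate([0.0, 0.0], K, intake=[1.0, 0.0], elim=ELIM, dt=0.01, n_steps=10_000)
# Phase 2: intake stops and the system washes out.
washed = simulate(dosed, K, intake=[0.0, 0.0], elim=ELIM, dt=0.01, n_steps=10_000)

if __name__ == "__main__":
    print("steady state  serum, brain:", dosed)
    print("brain / serum ratio:       ", dosed[1] / dosed[0])
    print("after washout serum, brain:", washed)
```

Under these assumptions the brain settles at in_rate / out_rate times the serum level while intake continues, and both compartments drain toward zero once intake stops, with the brain lagging behind the serum.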
The key question, though, is what effect does this high concentration in the accumulating compartment have on the actual physiology? If we have slowly-resolving, high concentration in the brain, then I think we could call this clinical (ie neuropharmacologically significant) accumulation. However, I think the case in the brain is that you have higher-than-serum concentrations, but that these concentrations quickly resolve after cessation of lithium therapy. My reasoning for this is that lithium pharmacokinetics are classically well-modeled with two- and three-compartment models, which mostly have pretty fast kinetics (rate parameters with half lives in the hours range).
SMTM: This is interesting because our sense is sort of the opposite! Specifically, our understanding is that most people who go off clinical doses of lithium do not lose much weight and tend to keep most of the weight they gained as a side effect (correct us if we’re wrong, we haven’t seen great documentation of this).
This seems at least suggestive that relatively high levels of lithium persist in the brain for a long time. On the other hand, clinical doses are really, really huge compared to trace doses, so maybe there is just so much in the brain compartment that it sometimes takes decades to clear. Ok we may not actually disagree, but it seemed like an interesting minor point of departure that might be worth considering.
JPC: I don’t know about this! I agree that slower (months to years) kinetics of lithium in the brain could explain this. An alternative (relatively parsimonious) explanation would be that, as Guyenet proposes, there simply is no mechanism for shedding excess adiposity. So if you gain weight as the result of any circumstance, if it stays on long enough for the lipostat to habituate to it, you just have a new, higher adiposity setpoint and have great difficulty eliminating that weight. That is, not being able to get the weight off after lithium-related weight gain might just be normal physiology.
The idea that clinical doses are just huge is sort of interesting. Normally, we think of the movement of ions in these kinetics models as having first-order kinetics (i.e. flux is proportional to concentration), but if you have truly shitboats of lithium in the brain, you could imagine that efflux might saturate (i.e. there are only so many transporters for the lithium to get out, since I imagine the cell membrane itself is impenetrable to Li+). This could be interesting. Not sure how you’d investigate it though. Probably patch-clamp type studies in ex vivo neurons? These are unfortunately expensive and extremely technical.
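One way to see what saturating efflux would do, sketched in Python. This is not JPC’s model: it swaps in a Michaelis-Menten-style rate, efflux = vmax * c / (km + c), as a stand-in for “only so many transporters,” and the parameter names `vmax` and `km` and all values are hypothetical. For that rate law the time for a concentration to fall from c0 to c0/2 has a closed form, (km * ln 2 + c0 / 2) / vmax, which can be compared with the constant half-life ln 2 / k of first-order kinetics.

```python
import math


def halving_time_first_order(k):
    """Time to halve under first-order efflux (flux = k * c); independent of c."""
    return math.log(2) / k


def halving_time_saturable(c0, vmax, km):
    """Time to fall from c0 to c0/2 when efflux = vmax * c / (km + c).

    Follows from integrating dc/dt = -vmax * c / (km + c):
    km * ln(c0 / c) + (c0 - c) = vmax * t, evaluated at c = c0 / 2.
    """
    return (km * math.log(2) + c0 / 2.0) / vmax


if __name__ == "__main__":
    vmax, km = 1.0, 1.0  # hypothetical units
    print("first-order half-life:", halving_time_first_order(vmax / km))
    for c0 in (0.001, 1.0, 100.0):
        print("c0 =", c0, "-> halving time", halving_time_saturable(c0, vmax, km))
```

At low concentrations the two agree, since the saturable rate reduces to first-order with k = vmax / km. At high concentrations the halving time grows roughly in proportion to the starting level, which is the sense in which a very large load could take far longer to clear.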
JPC: I see Amdisen et al. 1974 describes a fatal dose of lithium, which is very different pharmacokinetically from therapeutic doses. Above about 2.0 mmol/L (~2x therapeutic levels), lithium kinetics become nonlinear—that is, the pharmacokinetics are no longer fixed and the drug begins to influence its own clearance. In the case of lithium, high doses of lithium reduce clearance, leading to a vicious cycle of toxicity. This is a big deal clinically, often leading to the need for emergent hemodialysis.
So this is consistent with the papers I mentioned earlier (Ehrlich et al, Gaillot et al) in the sense that they cannot really conflict, because they are reporting on two very different pharmacokinetic regimes.
You can’t directly compare the lithium kinetics in this patient to those in healthy people. You can see in figure 1 that the patient’s “urea” (I assume what we’d call BUN today?) explodes, which is a result of renal failure. It sounds like the patient wasn’t making any urine, i.e. has zero lithium clearance.
SMTM: True, it’s hard to tell. But FWIW lithium also seems to be cleared through other sources like sweat, so even renal failure doesn’t mean zero lithium clearance, just severely reduced. (Though not sure the percent. 50% through urine? 80%? 99%?)
JPC: Yes this is true, of course. My intuition would be that it’s closer to 99% or even like 99.9%. The kidney’s “function” (I guess you have to be a bit careful not to anthropomorphize/be teleological about the kidney here, but you know what I mean) is to eliminate stuff from the blood via urine, which it does very well, whereas sweat and other excreta have other functions.
Let’s assume for a second that lithium and sodium are the same and that the body doesn’t distinguish (obviously false; all models are wrong but some are useful) and let’s do some math.
In the ICU we routinely track “ins and outs” very carefully. Generally normal urine output is 0.5 – 1.5 mL/kg body weight/hr. In a 70 kg adult call it >800 mL/day. But because we also know how much fluid is going in, we know how much we lose to evaporation (sweat, spitting, coughing up gunk, etc), which we call “insensible losses.” This is usually 40-800 mL/day.
A normal sweat chloride (which we use to check for cystic fibrosis) is <29 mM. Because sweat doesn’t have a static charge, we know there’s some positive counterion. Let’s assume it’s all sodium. So call it 30 mM NaCl, and calculate 800 mL x 30 mM = 24 mmol NaCl and 40 mL x 30 mM = 1.2 mmol. These are collected using (I think) topical pilocarpine to stimulate sweat production, so this would be an upper bound probably. It’s pretty close to what they find here which is in athletes during training (full disclosure I didn’t read the whole thing), which seems like it would be similar to the pilocarpine case (i.e. unlikely to be sustained throughout the day).
We also measure 24-hour sodium elimination when investigating disorders of the kidney. A first-reasonable-google-hit normal range is 40-220 mmol Na/24 hours. (Of course, this is usually done when fluid-restricting the patient, so this would be on the low end of normal. If you go to Shake Shack and eat a giant salty burger your urine urea and Na are going to skyrocket. If you’re in a desert, your urine will be WAY concentrated, but maybe lower volume. It’s hard to generalize so this is at best a Fermi estimation type of deal.)
Anyhow, we’re looking at somewhere between 2x and 250x more sodium eliminated in the urine. Again my guess is that we’d be closer to the 250x number and not the 2x number for some of the reasons I mention above. Also I worry you can’t just multiply insensible losses * sweat [Na] because as water evaporates it gets drawn out of the body as free water to re-hydrate the Na, or something.
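The back-of-envelope comparison above can be redone in a few lines. This is only a sketch of the same Fermi estimate: it uses the numbers quoted in the conversation, treats all insensible losses as sweat at 30 mM (which JPC flags as probably an upper bound), and the variable and function names are ours. With these inputs the urine-to-sweat ratio comes out at roughly 1.7x to 180x, the same ballpark as the 2x to 250x quoted above, and the urinary share of total sodium loss at roughly 62% to 99.5%.

```python
# Inputs quoted in the discussion above.
INSENSIBLE_LOSS_L = (0.040, 0.800)  # 40-800 mL/day of "insensible losses"
SWEAT_NA_MMOL_PER_L = 30.0          # assumed sweat [Na], rounded up from sweat chloride <29 mM
URINE_NA_MMOL = (40.0, 220.0)       # 24-hour urinary sodium, normal range

# Sodium lost through sweat per day, low and high ends (mmol).
sweat_low = INSENSIBLE_LOSS_L[0] * SWEAT_NA_MMOL_PER_L
sweat_high = INSENSIBLE_LOSS_L[1] * SWEAT_NA_MMOL_PER_L

# Extreme pairings: least urine vs. most sweat, and most urine vs. least sweat.
ratio_low = URINE_NA_MMOL[0] / sweat_high
ratio_high = URINE_NA_MMOL[1] / sweat_low


def urine_share(urine_mmol, sweat_mmol):
    """Fraction of total (urine + sweat) sodium loss that goes out in urine."""
    return urine_mmol / (urine_mmol + sweat_mmol)


share_low = urine_share(URINE_NA_MMOL[0], sweat_high)
share_high = urine_share(URINE_NA_MMOL[1], sweat_low)

print(f"sweat Na: {sweat_low:.1f} to {sweat_high:.1f} mmol/day")
print(f"urine/sweat ratio: {ratio_low:.1f}x to {ratio_high:.0f}x")
print(f"urine share of Na loss: {share_low:.1%} to {share_high:.1%}")
```

This treats lithium as if it were sodium, as stated above, so it says nothing about whether the body handles the two ions differently.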
In writing this up, I also found this paper which does some interesting quantification of sweat electrolytes (again we get a mean sweat [Na] of 37 and [Cl] of 34), but in some of the later plots (Figure 2) we can see that [Na] and [Cl] go way low and that the average seems to be pulled up by a long tail of high sweat electrolytes.
So not sure what to take away from that but I thought I’d share my work anyway. 🙂
JPC: In the case of bone, however, there might be something here! You could imagine the bone being a large but slowly-exchanging depot of lithium. I’d be interested to see if anyone has measured bone lithium levels in folks who were, say, on chronic therapeutic lithium. I’m not aware of anything like that.
SMTM: It seems to fit Amdisen et al. 1974. That case study is of a woman who was on clinical levels of lithium for three years, and had relatively high concentrations in her bones. Like you say, a fatal dose of lithium is very different pharmacokinetically from therapeutic doses, but the rate at which lithium deposits in bone is presumably (?) much slower than for other tissues, so this may be a reasonable estimate of how much had made it into her bones from three years of clinical treatment. Sample size of one, etc., but like you say there doesn’t seem to be any other data on lithium in bones.
JPC: I think it’s hard to say for sure if high concentration in her bones is due to the chronic therapy or the overdose. However, they note higher (0.77 vs 0.59 mmol/kg) in dense bone (iliac crest) than in spongey bone (vertebral body; there’s a better name than spongey… maybe cumulus? I don’t remember.). That’s interesting because it suggests to me (assuming that the error in the measurement is << 0.77-0.59) there is more concentrating effect in mineralized bone than all the cellular components (osteoclasts, osteoblasts, hematopoietic cells etc).
Anyway it’s suggestive that maybe there is deposition in bone. I wouldn’t hang my hat on it, but it is definitely consistent with it. I also agree that bone mineralization/incorporation seems like it ought to be on a longer timescale than cellular transport, so that is consistent as well. Obviously n=1, etc etc, but it’s kind of cute.
SMTM: Maybe we should see if we could do a study, there must be someone out there with a… skeleton bank? What do you call that?
JPC: A cadaver lab? I think most medical schools have them (ours does). In an academic medical setting, I would just get an IRB to collect bone samples from all the cadavers or maybe everyone who gets an autopsy that’s sufficiently extensive to make it easy to collect some bone. This would be a convenience sample, of course, but it would be interesting. Correlate age, zip code, renal function if known?
Because the patient is dead, there’s no risk of harm, and because they’re already doing the autopsy/dissection/whatever it should be relatively straightforward to collect in most cases (I mean, they remove organs and stuff to weigh and examine them so grabbing a bit of bone is easy). Unfortunately all these people got sick and died so you have a little bit of a problem there. For example, if someone had cancer and was cachectic, what can you learn from that? Idk.
In vivo bone biopsies are also a relatively common procedure done by interventional radiology under CT guidance (it’s SUPER COOL). You also have the problem that people are getting their biopsies for a reason, and usually the reason boils down to “we think that this bone looks weird,” so your samples would be almost by definition abnormal.
SMTM: Great! Maybe we can find someone with a cadaver lab and see if we can make it happen. This is a very cool idea.
SMTM: Earlier you mentioned the idea that the body’s set point can only be raised, but it seems really unlikely to us that there’s no mechanism for shedding excess adiposity.
JPC: Hmm. You guys are definitely better read on this subject than I am, but I do fear I have oversimplified the Guyenet hypothesis somewhat. My recollection is that it is more that there’s no driving force for the lipostat setpoint to return to a healthy level if it has habituated to a higher level of adiposity.
I like the analogy to iron. (I don’t think that Guyenet makes this connection, but I read The Hungry Brain years ago so I’m not sure.) It turns out that the body has no way of directly eliminating iron, so when iron levels get high, the body just turns off the “get more iron” system. Eventually, iron slowly makes its way out of the body because bleeding, entropy, etc etc and the iron-absorption system clicks back on. (This is relevant because patients who receive frequent transfusions, such as those with sickle cell, get iron overload due to their inability to eliminate the extra iron.)
I guess, by analogy, it would be that the mechanism for shedding adiposity would be “turn off the big hunger cues.” It’s not no mechanism, it’s just a crappy, passive, poorly-optimized mechanism. (Presumably because, like how nobody got transfusions prior to the 20th century, there was never an unending excess of trivially-accessible and highly palatable food in our evolutionary history.)
SMTM: Well, overfeeding studies raise people’s weights temporarily but they quickly go back to where they were before. Anecdotally, a lot of people who visit lean countries lose decent amounts of weight in just a few weeks. And occasionally people drop a couple hundred pounds for no apparent reason (if the contamination hypothesis is correct, this probably happens in rare cases where a person serendipitously eliminates most of their contamination load all at once). And people do have outlets like fidgeting that seem to be a mechanism beyond just “turn off the big hunger cues.” All this seems to suggest that weight is controlled in both directions.
JPC: Proponents of the above hypothesis would explain this by saying that the lipostat doesn’t have time to habituate to the new setpoint during the timescale of an overfeeding study, and so they lose the weight by having their “acute hunger cues” turned off. Whereas as weight creeps up year after year, the lipostat slowly follows the weight up. You do bring up a good point about fidgeting, though.
My thought was that bolus-dosed lithium (in food or elsewhere) might serve the function of repeated overfeeding episodes, each one pushing the lipostat up some small amount, leading to overall slow weight gain.
I think combining the idea that the brain concentrates lithium with an “up only” lipostat might give you this effect? Say that 1) lithium probably concentrates first in areas controlling hunger and thirst, leading to an effect on this at lower-than-therapeutic serum concentrations, so you might see weeks of weight-gain effect from a bolus, and 2) we know that weight gain can occur on this timescale and then not revert (see the observation, which I read about in Guyenet, that most weight is gained between Thanksgiving and NYE). What do you think?
SMTM: To get a little more into the weeds on this (because you may find it interesting), William Powers says in some of his writing (can’t recall where) that control systems built using neurons will have separate systems for “push up” and “push down” control. If he’s right, then there are separate “up lipostats” and “down lipostats”, and presumably they function or fail largely separately. This suggests that a contaminant that breaks one probably doesn’t break the other, and also suggests that the obesity epidemic would probably be the result of two or more contaminants.
JPC: Yes! Super interesting. There are lots of places in the brain where this kind of push-pull system is used. I remember very clearly a neuroscience professor saying, while aggressively waving his hands, that “engineers love this kind of thing and that’s probably why the brain does it too.” I wonder if he was thinking of Powers’ work when he said that.
SMTM: Let’s say that contaminant A raises the set point of the “down lipostat”, and contaminant B raises the set point of the “up lipostat”. Someone exposed to just A doesn’t necessarily get fatter, but they can drift up to the new set point if they overeat. At the same time, with exercise and calorie restriction, there’s nothing keeping them from pushing their weight down again.
Someone exposed to both A and B does necessarily get fatter, because they are being pushed up, and they have to fight the up lipostat to lose any weight, which is close to impossible. (This might explain why calorie restriction seems to work as a diet for some people but doesn’t work generally.)
Someone exposed to just B, or who has a paradoxical reaction to A, sees their up and down lipostats get in a fight, which looks like cycles of binging and purging and intense stress. This might possibly present as bulimia.
There isn’t enough evidence to tell to this level of detail, but a plausible read based on this theoretical perspective is that we might see something like, lithium raises the set point of the down lipostat and PFAS raise the set point of the up lipostat, and you only get really obese if you get exposed to high doses of both.
JPC: Very interesting! It’s definitely appealing on a theoretical level. (See: your recent post on beauty in science.) I just don’t know anything about the state of the evidence in the systems neuroscience of obesity to say if it’s consistent or inconsistent with the data. (Same is of course true of the lipostat-creep hypothesis above.)
I’m not sure about why you think the two systems would function separately? Certainly, for us to see a change, there would have to be a failure of one or the other population preferentially but I’m not sure why this would be less common than one effect or the other. They’d be likely anatomical neighbors, and perhaps even developmentally related. I guess it would all depend on the actual physiology. I’m thinking, for instance, of how the eye creates center-surround receptive fields using the same photoreceptors in combination with some (I think) inhibitory interneurons (neural NOT gates). The same photoreceptor, hooked up a different way, acts to activate or inhibit different retinal ganglion cells (the cells that make up the optic nerve… I think. It’s been a while.). Another example might be the basal ganglia, which (allegedly) functions to select between different actions, but mostly our drugs act to “do more actions” by being pro-dopaminergic (for instance to treat Parkinson’s) or “do fewer actions” by being antidopaminergic (as in antipsychotics like haloperidol).
SMTM: Yeah good points and good question! We have reasons to believe that these systems (and other paired systems) do function more or less separately, but it might be too long to get into here. Long story short we think they are computationally separate but probably share a lot of underlying hardware.
SMTM: What do you think of a model based on peak lithium exposure? Our concern is that most sources of exposure are going to be lognormally distributed. Most of the time you get small doses, but very rarely you get a really really large dose. Most food contains no lithium grease, but every so often some grease gets on your hamburger during transport and you eat a big glob of it by accident.
Or even more concerning: you live downriver from a coal power plant, and you get your drinking water from the river. Most of the time the river contains only 10-20 ppb Li+, nothing all that impressive. But every few months they dump a new load of coal ash in the ash pond, which leaches lithium into the river, and for the next couple of days you’re drinking 10,000 ppb of lithium in every glass. This leads to a huge influx, and your compartments are filled with lithium.
This will deplete over time as your drinking water goes back to 10 ppb, but if it happens frequently enough, influx will be net greater than efflux over the long term and the general lithium levels in your compartments will go up and up. But anyone who comes to town to test your drinking water or your serum will find that levels in both are pretty low, unless they happen to show up on one of the very rare peak exposure days. So unless you did exhaustive testing or happened to be there on the right day, everything would look normal.
JPC: I totally vibe with the prediction that intake would be lognormally distributed. From a classic pharmacokinetic perspective, I would expect lognormally-distributed lithium boluses to actually be buffered by the fact that renal clearance eliminates lithium in proportion to its serum concentration–that is, it gets faster as lithium concentrations go up.
But I’m a big believer that you should shut up and calculate, so I coded up a three compartment model (gut -> serum <-> tissue) and made up some parameters* that seemed reasonable and gave the qualitative behavior I expected. Then I gave the model either 300 mg lithium carbonate three times a day (a low-ish dose of the preparation given clinically), or three-times-a-day doses drawn from a lognormal distribution with two parameter sets (µ=1.5 and σ=1.5 or σ=2.5; this corresponds to a median dose of about 4.4 mg lithium carbonate in both cases, since the long tail doesn’t influence the median very much).
* k_gut->serum = 0.01 per minute
* k_serum->brain = 0.01 per minute
* k_brain->serum = 0.0025 per minute
* k_serum->urine = 0.001 per minute
* V_d,serum = 16 L
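For readers who want to poke at this themselves, here is a minimal sketch of a model of this shape. It is not JPC's actual code. It uses the rate constants and volume listed above (with the "brain" constants standing in for the tissue compartment), and everything else is our assumption: first-order transfer between compartments, a forward-Euler step of one minute, doses spaced evenly every eight hours, compartments that start empty, and a conversion from mg of lithium carbonate to mmol of lithium based on a molar mass of about 73.89 g/mol with two lithium ions per formula unit. The function names `simulate` and `lognormal_doses` are hypothetical.

```python
import random

# Rate constants (per minute) and serum volume of distribution, as listed above.
K_GUT_SERUM = 0.01
K_SERUM_TISSUE = 0.01
K_TISSUE_SERUM = 0.0025
K_SERUM_URINE = 0.001
VD_SERUM_L = 16.0

# Assumption: Li2CO3 is ~73.89 g/mol and carries two Li+ per formula unit.
MMOL_LI_PER_MG_LI2CO3 = 2.0 / 73.89
MINUTES_PER_DOSE = 8 * 60  # assumption: three evenly spaced doses per day


def simulate(doses_mg, dt=1.0):
    """Forward-Euler run of gut -> serum <-> tissue, with serum -> urine.

    Returns (serum concentration trace in mmol/L, tissue amount trace in mmol,
    final (gut, serum, tissue, urine) amounts in mmol).
    """
    gut = serum = tissue = urine = 0.0
    serum_conc, tissue_amt = [], []
    steps_per_dose = int(MINUTES_PER_DOSE / dt)
    for dose in doses_mg:
        gut += dose * MMOL_LI_PER_MG_LI2CO3  # each dose lands in the gut
        for _ in range(steps_per_dose):
            absorbed = K_GUT_SERUM * gut * dt
            to_tissue = K_SERUM_TISSUE * serum * dt
            to_serum = K_TISSUE_SERUM * tissue * dt
            excreted = K_SERUM_URINE * serum * dt
            gut -= absorbed
            serum += absorbed + to_serum - to_tissue - excreted
            tissue += to_tissue - to_serum
            urine += excreted
            serum_conc.append(serum / VD_SERUM_L)
            tissue_amt.append(tissue)
    return serum_conc, tissue_amt, (gut, serum, tissue, urine)


def lognormal_doses(n, mu, sigma, seed=0):
    """Draw n doses (mg lithium carbonate) from a lognormal distribution."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


days = 60
scenarios = {
    "300 mg three times a day": [300.0] * (3 * days),
    "lognormal, sigma=1.5": lognormal_doses(3 * days, mu=1.5, sigma=1.5),
    "lognormal, sigma=2.5": lognormal_doses(3 * days, mu=1.5, sigma=2.5),
}
for label, doses in scenarios.items():
    conc, tissue, _ = simulate(doses)
    print(f"{label}: peak serum {max(conc):.3f} mmol/L, peak tissue {max(tissue):.2f} mmol")
```

The lognormal results depend heavily on the random seed, since a handful of very large doses dominate, so it is worth rerunning with several seeds before reading anything into a single trace.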
In my opinion, this gives us the following hypothesis: lognormally distributed doses of lithium with sufficient variability should create transient excursions of serum lithium into the therapeutic range.
Because this model includes that slow third compartment, we can also ask what the amount of lithium in that compartment is:
My interpretation of this is that the third compartment smooths the very spiky nature of the serum levels, and you get nearly therapeutic levels of lithium in that third compartment for whole weeks (days ~35-40) after these spikes, especially if you get two spikes back to back. (Which it seems to me would be likely if you have, like, a coal ash spill or it’s wolfberry season or whatever.)
There clearly are a ton of limitations here: the parameters are made up by me, real kinetics are more like two slow compartments (this has one), lithium carbonate is a delayed preparation that almost certainly has different kinetics from food-based lithium, and I have no idea how realistic my lognormal parameters are, to name a few. However, I think the general principle holds: the slow compartment “smooths” the spikes, and in so doing seems to be able to sustain highish [Li] even when the kidney is clearing it, by feasting when Li is plentiful and retaining it during famine periods.
I’m not sure if this supports your hypothesis or not (do you need sustained brain [Li] above some threshold to get weight gain? I don’t think anyone knows…) but I thought the kinetics were interesting and better discussed with actual numbers and pictures than with words. What do you guys think? Is this what you expected?
SMTM: Yes! Obviously the specifics of the dynamics matter a lot, but this seems to be a pretty clear demonstration of what we expected — that it’s theoretically possible to get therapeutic levels in the second compartment (serum) and sometimes in the third compartment (brain?), even if the median dose is much much lower than a therapeutic dose.
And because of the lognormal distribution, most samples of food or serum would have low levels of lithium — you would have to do a pretty exhaustive search to have a good chance of finding any of the spikes. So if something like this is what’s happening, it would make sense that no one has noticed.
It would be interesting to make a version of this model that also includes low-level constant exposure from drinking water (closer to 0.1 mg per day) and looks at dynamics over multiple years, getting an impression of what lifetime accumulation might look like, but that sounds like a project for another time.
JPC: Another thought is that thyroid concentrations may also matter. If lithium induces a slightly hypothyroid effect, people will gain weight that way too, since common (even classic) symptoms of hypothyroidism are weight gain and decreased activity. (It also proposes an immediate hypothesis [look at T3 vs TSH] and intervention [give people just a whiff of levothyroxine and see if it helps].) There’s also some thought that lithium maybe impacts thirst (full disclosure have not read this article except the abstract)?
SMTM: Also a good note, and yes, we do see signs of thyroid concentration. Some sort of thyroid sample would also be less invasive than a brain sample, right?
JPC: Yes. We routinely biopsy thyroid under ultrasound guidance for the evaluation of thyroid nodules (i.e. malignant vs benign). These biopsies might be a source of tissue you could test for lithium, but I’m not sure. The pathologists may need all the tissue they get for the diagnosis, they may not. Doing it on healthy people might be hard because it’s expensive (you need a well-trained operator) and more importantly it’s not a risk free procedure: the thyroid is highly vascular and if you goof you can hit a blood vessel and “brisk bleeding into the neck” is a pretty bad problem (if rare).
That said, it is definitely less invasive than a brain biopsy, and actually safer than the very low bar of “less invasive than a brain biopsy” implies.
SMTM: Do you have clinical experience with lithium?
JPC: Minimal but non-zero. I had a couple of patients on lithium during my psychiatry rotation and I think one case of lithium toxicity on my toxicology rotation. I do know a lot of doctors, though, so I could ask around if they’re simple questions.
SMTM: Great! So, trace doses might be the whole story, but we’re also concerned about possible lithium accumulation in food (like we saw in the wolfberries in the Gila River Valley). We wonder if people are getting subclinical or even clinical doses from their food. We do plan to test for lithium in food, but it also occurred to us that a sign of this might be cases of undiagnosed lithium toxicity.
Let’s make up some rough numbers for example. Let’s say that a clinical dose is 600,000 µg and lithium toxicity happens at 800,000 µg. Let’s also say that corn is the only major crop that concentrates lithium, and that corn products can contain up to 200,000 µg, though most contain less. Most of the time you eat fewer than four of these products a day and get a subclinical dose of something like 50,000 – 300,000 µg. But one day you eat five corn products that all happen to be high in lithium, and you suddenly get 1,000,000 µg. You’ve just had an overdose. If common foods concentrate lithium to a high enough level, this should happen, at least on occasion.
If someone presents at the ER with vomiting, dizziness, and confusion, how many docs are going to suspect lithium toxicity, especially if the person isn’t on prescription lithium for bipolar? Same for tremor, ataxia, nystagmus, etc. We assume (?) no one is routinely checking these patients’ blood lithium levels, and that no one would think to order this blood test. Even if they did, there’s a pretty narrow time window in which blood levels would detect this spike, as far as we understand.
So our question is something like, if normal people are occasionally presenting with lithium toxicity, would the medical system even notice? Or would these cases be misdiagnosed as heavy metal exposure / dementia / ischemic stroke / etc.? If so, is there any way we can follow up with this? Ask some ER docs to start ordering lithium tests in any mystery cases they see? Curious to know what you think, if this seems at all plausible or useful.
JPC: I have a close friend who is an ED doc! She and I talked about it and here’s our vibe:
With a presentation as nonspecific as vomiting, dizziness, and confusion, my impression is that most ED docs would be unlikely to check a lithium level, especially if the patient is well enough to say convincingly “no I didn’t take any pills and no I don’t take lithium.” At some point, you might send off a lithium level as a hail-Mary, but there are so many things that cause this that a very plausible story would be: patient comes to ED with nausea/vomiting, dizziness, and altered mental status. The ED gives maybe fluids, checks some basic labs, does an initial workup, and doesn’t find anything. Admits the patient. The next day the admitting team does some more stuff, checks some other things, and comes up empty. The patient gets better after maybe 24-48h, nobody ever thinks to check a lithium level, and since the patient is feeling better they’re discharged without ever knowing why.
Another version would go: patient is super sick, maybe their vomiting and diarrhea get them super dehydrated and give them an AKI (basically temporary kidney failure). People think “wow maybe it’s really bad gastritis or some kind of primary GI problem or something?” The patient is admitted to the ICU with some kind of gross electrolyte imbalance because they’re in kidney failure and they pooped out all their potassium, someone decides they need hemodialysis, and this clears the lithium. Again the patient gets better, and everyone is none the wiser.
Tremor, ataxia, nystagmus, etc. are more focal signs, and in this case our impression is that people would be more likely to check a lithium level, even if someone doesn’t have a history of lithium use. We also think it wouldn’t always happen. Even in classic presentations of lithium toxicity, sometimes people miss the diagnosis. (Emergency medicine is hard; people aren’t like routers where they blink the link light red when the motherboard is fried or power light goes orange if the AC is under voltage. Things are often vague and complicated and mysterious.)
Something you’d have to explain is how this isn’t happening CONSTANTLY to people with really borderline kidney function. Perhaps one explanation might be that acute lithium intoxication (i.e. not against a background of existing lithium therapy) generally presents late with the neuro stuff (or so I hear).
We think that this is plausible if it is relatively uncommon or almost always pretty mild. If we were having an epidemic of this kind of thing (like on the scale of the obesity epidemic) I think it would be weird that nobody has noticed. Unless of course it’s a pretty mild, self-resolving thing. Then, who knows! AFAIK still nobody really knows why sideaches happen—figuring it out just isn’t a priority.
On occasion, the medical-scientific community also has big misses. There’s an old line that “half of what you learn in medical school is false, you just don’t know which half.” We were convinced until 1982 that ulcers were caused by lifestyle and “too much acid”; turns out that’s completely wrong and actually it’s bacteria. I saw a paper recently that argued that pretty much all MS might be due to EBV infection (no idea if it’s any good).
I think you could theoretically “add on” a lithium level to anybody that’s getting a head CT with the indication being “altered mental status.” “Add on” just means that the lab will take the blood they already have from the patient and run additional testing, if they have enough in the right kind of tube. The logic is that patients with new-onset, dramatic, and unexplained mental status changes often get head CTs to rule out a bleed or other intracranial badness, so a head CT ordered this way could be a sign that the ordering doc may be feeling stumped.
If you wanted to get fancy, you could try to come up with a lab signature of “nausea/vomiting/diarrhea of unclear origin” (maybe certain labs being ordered that look like a fishing expedition) and add on a lithium there as well.
SMTM: Good point, but, isn’t it possible that it IS happening constantly to people with really borderline kidney function? The symptoms of loss of kidney function have some overlap with the symptoms of lithium intoxication, maybe people with reduced kidney function really do have this happen to one degree or another whenever they draw the short straw on dietary lithium exposure for the day. Lots of people have mysterious ailments that lead to symptoms like nausea and dizziness, seemingly at random.
JPC: I guess it’s definitely possible. The “canonical” explanation to this would be that diabetes (which is obviously linked to obesity) destroys your kidneys. But, if it’s all correlated together as a vicious cycle (lithium → obesity → CKD → lithium) that’s kind of appealing too. I bet a lot is known about the obesity-diabetes-kidney disease link though and my bet without looking into it would be that there’s some problem with that hypothesis.
My thought here was that if people with marginal/no kidney function are getting mild cases, I would expect people with normal kidney function to be basically immune. Or, if people with normal kidney function get mild cases, people with marginal kidneys should get raging cases. This is because serum levels of stuff are related to the inverse of clearance. The classic example is creatinine, which is filtered by the kidney and used as a (rough) proxy for renal function.
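The "inverse of clearance" point can be made concrete with a tiny sketch. For a substance eliminated in proportion to its serum concentration, the steady-state level is the intake rate divided by the clearance, so halving clearance doubles the level at the same intake. The intake and clearance numbers below are hypothetical, chosen only to show the scaling, and the function name is ours.

```python
def steady_state_level(intake_mmol_per_day, clearance_l_per_day):
    """Steady-state serum level (mmol/L) under first-order elimination."""
    return intake_mmol_per_day / clearance_l_per_day


# Hypothetical values, for illustration only.
INTAKE_MMOL_PER_DAY = 1.0
NORMAL_CLEARANCE_L_PER_DAY = 30.0

for fraction in (1.0, 0.5, 0.1):
    clearance = NORMAL_CLEARANCE_L_PER_DAY * fraction
    level = steady_state_level(INTAKE_MMOL_PER_DAY, clearance)
    print(f"{fraction:.0%} of normal clearance -> {level:.3f} mmol/L")
```

On this simple picture, someone with a tenth of normal clearance would sit at ten times the serum level of someone with healthy kidneys eating the same food, which is why mild cases in one group would predict either immunity or raging cases in the other.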
SMTM: This is super fascinating/helpful. For a long time now we’ve been looking for a “silver bullet” on the lithium hypothesis: something which, if the hypothesis is correct, should be possible and would bring us from “plausible” to “pretty likely” or even “that’s probably what’s going on”. For a long time we thought the only silver bullet would be actually curing obesity in a sample population by making sure they weren’t consuming any lithium, but that’s a pretty tall order for a variety of reasons, not least because (as we’ve been discussing) the kinetics remain unclear! But recently we’ve realized there might be other silver bullets. One would be finding high levels of lithium in food products, but there are a lot of different kinds of foods out there, and since the levels are probably lognormally distributed you might need an exhaustive search.
But now we think that finding people admitted to the ER with vague symptoms and high serum lithium, despite not taking it clinically, could be a silver bullet too. Even a single case study would be pretty compelling, and we could use any cases we found to try to narrow down which foods we should look at more closely. Or if we can’t find any of these cases, a study of lithium levels in thyroid or in bone could potentially be another silver bullet, especially if levels were correlated with BMI or something.
JPC: I’m always hesitant to describe any single experiment as a silver bullet, but I agree that even a single case report, under the right conditions, of high serum lithium in someone not taking lithium would be pretty suspicious. You’d have to rule out foul play and primary/secondary gain (i.e. lying) but it would definitely be interesting. As far as finding lithium in bone or thyroid (of someone not taking lithium), I’d want to see some kind of evidence that it’s doing something, but again it’d definitely be supportive.
Big investments generate quite a lot of money — you can draw off about 4% of an investment every year without depleting the principal, because you get back that much or more in interest. Even if you did nothing but stick the money in an S&P 500 index fund, the historical average is about 10% per year. That’s not guaranteed, but it’s pretty damn good.
If we assume 4% annually, a $3 million endowment would generate $120,000 a year, or $10,000 a month indefinitely. A $2.5 million endowment would generate $100,000 a year, or $8,333.33 a month. Even a measly $2 million endowment would generate $80,000 a year, or $6,666.67 a month.
Any of these amounts would be enough to purchase a big 6-to-10 bedroom house in many areas, with some endowment left to generate interest each month. If you sink $1 million of a $3 million endowment into a house, you still have the remaining interest from $2 million every month.
Once you’ve bought a house, you could use that interest to support a houseful of people. Exact numbers vary by location, but the interest should be enough to keep the house in good repair, pay property taxes, pay for utilities and internet access, feed everyone, buy a junky car, and even give them all a small stipend.
The big gains are in rent — getting a decent room can easily cost you $1000 a month these days, so eight people seeking out individual lodgings would be in for $8000 a month collectively, or $96,000 a year! But if they all live in an 8-bedroom house with a mortgage of $3000, that’s only $36,000 a year, and you save $60,000 annually. (And if you purchase the house outright, then of course there’s no rent at all.)
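The arithmetic in the last few paragraphs is easy to reproduce. The sketch below is not the Houseulator, just the same sums under the assumptions stated in the text: a flat 4% annual draw, a $1 million house bought out of a $3 million endowment, $1,000 rooms, and a $3,000 monthly mortgage. The function name is ours.

```python
def annual_draw(endowment, rate=0.04):
    """Yearly income from an endowment at a fixed draw rate."""
    return endowment * rate


# Income at each endowment size mentioned above.
for endowment in (3_000_000, 2_500_000, 2_000_000):
    yearly = annual_draw(endowment)
    print(f"${endowment:,}: ${yearly:,.0f} a year, ${yearly / 12:,.2f} a month")

# Buying a $1M house outright from a $3M endowment leaves $2M to draw on.
income_after_house = annual_draw(3_000_000 - 1_000_000)
print(f"after buying the house: ${income_after_house:,.0f} a year")

# Eight people renting separately vs. sharing one mortgaged house.
separate_rent_per_year = 8 * 1000 * 12
shared_mortgage_per_year = 3000 * 12
savings_per_year = separate_rent_per_year - shared_mortgage_per_year
print(f"rent saved by sharing: ${savings_per_year:,} a year")
```

Changing `rate` or the house price is the quickest way to see how sensitive the monthly budget is to those two choices.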
You’ll notice a few things. It’s clear that a mortgage is the wrong choice here. You won’t come out ahead until 30 years down the road when the mortgage is finally paid off. If you have the money, buy the house outright.
Healthcare is the big stumbling block — in a lot of scenarios, you just won’t have enough to pay for everyone’s insurance. Residents might qualify for some kind of reduced rates depending on income, but this seems to vary a lot by state.
Even in the best-case scenarios, it’s hard to end up with enough to give your residents much of a stipend. This still isn’t such a bad deal — they get their rent, their food, and maybe their health insurance all covered. They even get access to a junky car. What more could you want?
The situation improves a lot if you start with an endowment of more than $3 million, of course, or if you assume you can get more than 4% interest per year. But even within these constraints, you can get pretty decent living conditions for 5-8 people if you choose a house in the right place and give them a shitty enough car. Go ahead and mess around with the values in the Houseulator and find out!
Charter houses could be used to fill all sorts of weird niches.
Maybe you think college is a waste of time (and really, who doesn’t these days). Or maybe you just think we should make it easy for young people to take big risks, and work on moonshot projects that will take years to pan out.
In that case, a charter house could be an accelerator for young people. Lots of high school or college graduates would love an opportunity to not think about paying rent and focus on their passion projects for the next several years.
In general, young people don’t mind a slightly marginal existence, so this setup fits them pretty well. The average 30-year-old would have a hard time accepting a tiny stipend, even if rent was covered and there were no strings attached. The average 30-year-old also probably has better options, where they can make a lot more money, even if they have to work for it. But the average 22-year-old would jump at the opportunity to [checks notes] get paid to not pay rent, and most 22-year-olds don’t have access to a better deal than this. This is even more true for the average 18-year-old, especially one that doesn’t want to bother with college.
You might be concerned that young people would like their charter house so much they would stay forever, but this is where the very small stipend becomes an advantage. From the ages of 18 to 24 or so, survival alone is pretty enticing. But as they grow up, most of your residents will begin to dream of more than a $500 a month stipend and free rent. Soon they will hunger for more space, or nicer equipment, or a car that doesn’t have holes in the floor. They’ll find a job or some other way of making money and graduate, moving out on their own. If some of them do decide to become long-haulers, that’s ok too, since it gives your house more institutional memory.
This level of security helps people figure out their comparative advantage, and lets them found more small businesses and startups, because they don’t need to make the same kind of money right out of school. Obviously that’s good for innovation.
Research and Scholarship
Here’s a question: what’s the minimum form of scholarly institution? Existing universities are huge, but every university is made up of schools and departments, and in many cases these function almost as independent entities. How small can you go and still call it an institution?
A charter house could be an interesting experiment in marginal scholarship. Charter houses could serve as a replacement for academic departments, possibly with a mentoring component (e.g. half of the residents are students, with mandatory turnover after a couple years). You buy a house and give it an endowment, and recruit a bunch of biologists or linguists or computer scientists, and see what kind of scholarship they produce. We don’t know if it will be good, but we’re sure it will be different.
The kind of biologists who would show up to live in an abandoned church in Oak Creek, Wisconsin or an old Victorian mansion in Normal, Indiana would be a very different kind of biologist than the kind who would take an academic job at your local university. But we think this is an advantage.
There’s an ongoing conversation about how we as a society can support people who have important but hard-to-compensate roles (see for example this twitter thread). There are lots of roles, especially in open source software but also in other areas, where the work is critical but no one is willing to pony up to pay for it.
These roles don’t fit within normal funding structures — they’re too small for a business to hire the person on, too small to form a nonprofit around them, and too big to be supported through individual donations. And beyond this, there are even more projects that someone should do, and which might attract support retrospectively, but no business or nonprofit would be willing to support prospectively.
Charter houses could solve this problem neatly. A charter house or two could easily be set up with positions offered to people who are filling these roles, providing them with a minimum of support — at the very least, free rent and free high-speed internet. These people are professionals, so this may not be enough for them — but there’s no reason they can’t get support from the charter house and make additional money in other ways. They can supplement that support by consulting, getting a real job, being a bounty hunter, etc.
In fact, since people in this position might also have a part-time consulting gig or something, a charter house targeted at them might be able to survive on a much smaller endowment, only paying for their rent, and not covering their food and healthcare, for example.
There are a lot of projects that would have no prospective support because they’re super high risk. But if we have an ecosystem for encouraging lots of high risk projects, we will eventually get a lot of crazy successful moonshots. Our society already does this a bit for open source software — we should do it for other important avenues of progress as well. Like apenwarr says: “The best part of free software is it sometimes produces stuff you never would have been willing to pay to develop (Linux), and sometimes at quality levels too high to be rational for the market to provide (sqlite).”
You could also allow a totally unprincipled combination of all of these approaches, and we think that would work pretty damn well. It’s fine if you have three engineers working on a startup on the ground floor, an essayist sharing a bunk bed with a painter in a room above the garage, and two biologists in the attic.
A mix of approaches is good and healthy. If you fill a house with biologists, they will all be competing with each other. They may even end up at each other’s throats, because they are too similar. But mix in a little diversity, a few chemists and physicists, some experts in East Asian literature, and a Turkish math whiz who speaks almost no English, and things will work very well indeed.
It’s tempting to make each charter house alike in scope and subject — one house for the college dropouts, one house for the physicists, one house for the painters, one house for the startup accelerator, one house for the mystics, etc. But siloing people in this way is going to be counterproductive. Young people will benefit from sitting across the dinner table from old people; old people from young people. Biologists will benefit from playing video games in the living room with art historians. Philosophers will benefit from going grocery shopping with blacksmiths. Electrical engineers will benefit from fixing windows with clowns. Bartenders will benefit from cooking dinner with astronomers.
As Paul Graham says in his essay Hackers and Painters, “I’ve found that the best sources of ideas are not the other fields that have the word ‘computer’ in their names, but the other fields inhabited by makers. Painting has been a much richer source of ideas than the theory of computation.”
So it’s ok, even ideal, to have a charter house where most of the residents are college dropouts, and there’s one 60-year-old living in the basement maintaining ‘runk’.
Charter houses capture a number of features of other successful programs.
They’re kind of like the Alaska Fellows Program. In this program, you stick a bunch of recent college grads in a house somewhere in Alaska, where they live together for about a year. Housing and utilities are covered, and everyone gets a monthly stipend of $1000 on top of that. We hear it works great. If young people sign up for this, you can bet they would also sign up for a program with more freedom and where they didn’t have to live through the polar night.
They’re also kind of like medieval guilds. A guild was an organization devoted to a specific practical skill, and it saw to the training and organization of the people who practiced it. Guilds pooled funds and sometimes shared tools or workshops. The first universities started out as guilds of students, who banded together to hire tutors (the first professors) and for mutual protection. Other medieval examples include various religious orders, like the Franciscans or the Poor Clares. In these particular examples you personally owned no property, but you still had a place to stay. Religious orders often owned buildings (monasteries, convents, abbeys, etc.) and conducted various forms of scholarly work together. Gregor Mendel, the father of genetics, was an Augustinian friar and abbot.
Something like charter houses already exists during college. Particular dorms will have a particular theme, or a subset of all the people in a club or frat will live together. When people graduate from college, it’s pretty common for them to share an apartment with friends for a couple years. It’s clear that people enjoy living together like this, as long as they get their own space.
This is pretty good evidence that, given the option, young people would try living in a charter house. And it seems like this is just straight-up competitive with college in almost every way. You have to pay for your housing in college, but in Soviet Russia, house pays you (er, we mean a charter house pays you). In most colleges you have to share a tiny, cramped room with other people, but most charter houses would be big enough for everyone to have their own room. In college you have to study some predetermined topic and take classes, but in a charter house you can spend your time on projects that actually teach you what you need to know. In college you have to hide your drugs, but in a charter house, the chemist who lives in the walk-in closet is synthesizing LSD in the bathtub.
An example of a similar successful model is Hampshire College in Amherst, Massachusetts. At Hampshire, upperclassmen don’t live in dorms, they live in mods (“modular housing”) of 6-10 students, which are like medium-size apartment buildings. The mods are big: almost everyone gets a single, and the few doubles are huge. And you can work on whatever you think is important because of the traditional Hampshire package of no majors, no tests, and no grades (yes, really!). The only downside is that you still have to pay, but charter houses fix this. And we know this crazy system works: Hampshire has produced alumni like Ken Burns, Elliott Smith, Lupita Nyong’o, and Eugene Mirman, the man who voices Gene on Bob’s Burgers and who delivered the best commencement speech of all time.
The benefit of college is of course the fact that it’s large — there are lots of people you already have something in common with, which makes it easier to build community and a strong social network. A single charter house can’t compete with that, but if you put a bunch of houses in the same town, they can support each other in various ways.
We’re not just talking about community — they can share skills and resources. The charter house full of musicians is the only house with a grand piano, but residents of the other charter houses can visit to use it. The chemists sprang for a projector or a giant TV, so everyone comes to their place for movie nights. The videographers living in the garret of the old B&B help record the experiments the electrical engineers are doing in the charter house down the street, and put it all on YouTube.
Replicating the benefits of college without the headaches isn’t just for college-age kids. Most people who went to college don’t miss the exams or the food, but a lot of them miss the sense of community and the ability to casually hang out with interesting people. Charter houses could be designed to be attractive to almost any age group.
We’re not financial advisors, so we can’t advise on how to set up the institution behind a charter house. But we can advise a little on how we think you should organize it.
In brief, we think a charter house should have very few rules.
Certainly you do want some rules. You probably want to have one resident who is on all the paperwork, who can collect the interest from the endowment every month, and who is responsible for paying all the bills. You want some legal firm or something to oversee the endowment. You want rules about what happens if the endowment grossly underperforms or overperforms — what happens to a house if their $2 million endowment shrinks to $1 million, or grows to $4 million? You want rules about what happens if the house ends up being abandoned.
(A growth rate of 4% per year does seem pretty conservative, so we support a rule that if a charter house’s endowment gets too big — if it ever reaches double the original endowment, if it breaks $5 million, something like that — it should be forced to split in half and spin off a sister house nearby.)
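For a feel of how often a split like this would actually happen, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names are invented, the house is assumed to pay out 4% of its current balance each year, and the 7% return in the example is a hypothetical number rather than a forecast.

```python
def years_until_split(endowment, annual_return, payout_rate=0.04,
                      split_multiple=2.0):
    """Hypothetical helper: years until the endowment hits the split threshold.

    Each year the balance grows by the return and shrinks by the payout.
    Returns None if the endowment never grows (return <= payout).
    """
    if annual_return <= payout_rate:
        return None
    target = endowment * split_multiple
    balance = endowment
    years = 0
    while balance < target:
        balance *= 1 + annual_return - payout_rate
        years += 1
    return years


def split_house(balance):
    """Split an endowment in half: one half stays put, one half funds the sister house."""
    half = balance / 2
    return half, half


# Assumed example: a $3 million endowment earning 7% and paying out 4% a year.
print(years_until_split(3_000_000, 0.07))  # 24 under these assumed numbers
print(split_house(6_000_000))              # (3000000.0, 3000000.0)
```

Under those made-up numbers a house would spin off a sister about once a generation. With returns at or below the payout rate it never splits at all, which is the case the other rules about underperforming endowments are for.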
Other than that, we don’t think you want many rules at all.
There are many rules that do seem enticing at first glance. If your charter house is intended for biologists, you might want to make a rule that only biologists can live there. If your charter house is meant to be an accelerator for young people, you might want a rule that no one over 26 can live there. You might want a rule that no one can live there for more than 4 years, to encourage turnover and give lots of people a chance to live in the house. If the house itself has only eight rooms, you might want to make a rule that no more than 10 people can live there at a time. You might want to make sure at least a few people are living in the house at all times. Maybe you want to make a rule, “no girlfriends/boyfriends”, or at least “no families/kids”. And you would probably want some rule about how people are chosen to join the house.
These seem like good ideas, but we are against them for a simple reason: they are really hard to enforce. Who is going to go check that everyone living in the house is a biologist? If the guy playing guitar in the living room says “no I’m a biologist”, what are you going to do? If you try to enforce a maximum number of residents, how will you tell who is living there and who is just visiting? How long can someone visit for, before they count as living in the house? A week? A month?
So our recommendation is, don’t make these rules and don’t waste time and effort on trying to enforce them. It’s fine to tell a house, “I set this up for chemists” or “this house is to support open software” or “I want to support young people, so try to graduate when you can.” But don’t try to enforce these rules — trying to enforce them will just lead to internal squabbles.
Let the residents have friends over. Let them stay as long as they need. Let them decide how they’re going to pick their housemates. And let them learn to govern themselves. This teaches them 1) that they are capable of self-governance and 2) specific tips and tricks for actually running a small organization/government. Pretty pro-democracy.
So we think charter houses should have as few rules as possible. On the other hand, they should definitely have traditions. Each house should have a name, house colors, maybe a crest. A motto if they can come up with one (maybe, “I am a beautiful animal! I am a destroyer of worlds!”). Perhaps an official song or chant. Traditions like a house movie (may we suggest WPDR) or a monthly poetry contest. And of course, a party every year on the day it was founded.
You can speculate and plan all you want, but you won’t know what works and what doesn’t until you give it a go. You really want someone to try it, to start some charter houses and see what they come up with, what problems they run into, and what solutions.
You want to invite the people who will live in the house to be your co-conspirators. If you make up a bunch of rules, even good ones, and try to enforce them, your residents will resent you. But if you bring them on board, and let them tinker with it, they will surprise you.
“Let yourself be second guessed,” says Paul Graham. “When you make any tool, people use it in ways you didn’t intend, and this is especially true of a highly articulated tool like a programming language. Many a hacker will want to tweak your semantic model in a way that you never imagined. I say, let them; give the programmer access to as much internal stuff as you can without endangering runtime systems like the garbage collector.”
We feel the same way — let them get at everything except the metaphorical runtime systems. In hacking they call this the “Hands-On Imperative”, and while actual code may or may not be involved, the charter house is more than a bit of a hacking project. “Hackers can do almost anything and be a hacker,” said Burrell Smith, the designer of the Macintosh computer, at the first Hacker Conference. “You can be a hacker carpenter. It’s not necessarily high tech. I think it has to do with craftsmanship and caring about what you’re doing.”
You want these houses to be very different, and you want to use the power of evolution. Lack of diversity is so bad that in biology, they call it genetic erosion.
You want “speciation” — you want to release ideas into the world and get feedback from their success and failure. We’re going to continue with the Paul Graham quotes for a second, because charter houses are more than a little like a combination of startups and startup accelerators. “If you release a crude version 1 then iterate,” he says, “your solution can benefit from the imagination of nature, which, as Feynman pointed out, is more powerful than your own.”
Most plans to change the world require a lot of coordination. You have to argue with senators and NGOs and the university PR department, on and on and on. But anyone who can spare a couple million dollars can set up a charter house unilaterally.
We’re not even necessarily talking about bringing in new money — you could do a lot just by redirecting donations that are already being made. If Bloomberg wants to give several hundred million dollars to Johns Hopkins (estimated endowment: $8.8 billion), who are we to judge? But in a world where we can’t seem to stop talking about stagnation and academic decline, doesn’t it seem worth it to try a different model? How about you spend $10 million to set up three charter houses with endowments of $3.3 million each, and give Johns Hopkins a mere $340 million? Or set up ten charter houses with endowments of $5 million each, and see if Johns Hopkins can survive on $300 million?
What’s gonna give you more bang for your buck, giving Stanford a shiny new engineering building and filling it with smartboards and swivel chairs and all of the engineering students who would have gone to Stanford whether it had a shiny new engineering building or not, or giving some of those engineers a house where they can work on stuff they think is cool, and enough food to keep them alive while they do it?
Another thing: those engineering students will take on like a hundred thousand dollars in debt if they go to Stanford! An advantage of charter houses is that nobody has to take out a loan.
So instead of giving $20 million to an institution that you are certain will muddle on in acceptable mediocrity, split that money up among several charter houses. Some will fizzle out; a couple may even explode. But others will become self-sustaining little critters that will spark and wriggle and lay plans of their own.