THIRD PLACE: Have Attention Spans Been Declining? by niplav, writing as Cennfaeladh, who also blogs at niplav.site. Also, honorable mention to niplav for getting by far the most (30,000+) pageviews.
We’ve been in contact with all three winners and will be sending them their awards shortly. Congratulations!
In addition, TripleTaco gave us this proposed explanation to the list of Amish mysteries. For your consideration:
After I formed my own theory of what caused Autism, I looked hard to see if anybody else had come to the same conclusion. After a long time of looking, I finally found one researcher, Max McDowell, who explains what he (and I) believe likely causes autism in this video: https://www.youtube.com/watch?v=BHhAnCTVLG4
If I were naming this, I’d call it “face starvation”. Babies have a crucial window in which they need a certain amount of eye contact and starving them of that causes autism for many children. He goes on in that video to describe many of the same mysteries I described in my mystery post. The theory lines up with the mysteries perfectly.
The video has been out for over 2 years and at this moment it has 114 views. His idea was first published all the way back in 2004, so it’s not new at all. It’s just being utterly ignored, and for very strong reasons:
First of all, he calls himself a “Jungian Psychoanalyst”, which frankly makes him sound kind of woo-woo and makes it harder to take him seriously. Secondly, he’s well outside of the academic circles that have earned the right to talk about autism and its causes. He’s an outsider. Thirdly, it’s a shocking, awful idea, and anybody championing it will be pilloried from all sides. The idea is threatening to the neurodiversity camp who don’t want autism treated like a disorder at all. The idea is threatening to parents who aren’t eager to hear that their own technology habits may have contributed to severe lifelong difficulty in their child. The idea is threatening to existing researchers who are far along other research paths and stand to gain little from such a simple tidy explanation, especially when championing it would get them in all kinds of hot water from every direction. Frankly, this idea is way too controversial to put my own name on, which is why this is an anonymous contribution.
Even if we’re right, I’m not hopeful that people are going to start taking this idea seriously. There are some ideas that are just too controversial to be taken seriously.
We will probably keep voting open until the end of the day on August 31 in case you want a chance to go back and re-read your favorites. In case you’ve forgotten, the finalists are:
[This is one of the finalists in the SMTM Mysteries Contest, by a reader writing under the pseudonym Lee S. Pubb. We’ll be posting about one of these a week until we have gotten through all the finalists. At the end, we’ll ask you to vote for a favorite, so remember which ones you liked.]
Background
Element 53, iodine, is a mineral essential to human health largely because it is utilized in the production of thyroid hormones.[1] For much of recorded human history, societies the world over suffered from the iodine deficiency disorder known as goiter–swelling in the neck resulting from an enlarged thyroid gland. It was common in regions where topsoil was regularly eroded (e.g., by flooding), as topsoil is rich in iodine, normally leading to its uptake in food crops. It’s also abundant in foods derived from the ocean, as ocean water contains iodine.
While Chinese doctors are said to have prescribed the consumption of animals' thyroid glands to treat goiter as early as the 7th century, it was only in the early 20th century that large-scale research was done on iodine supplementation. It was found that sufficient levels of iodine consumption eliminated goiter in the vast majority of cases, and we began iodizing salt soon after. (Which is why I had to give this lengthy background on goiter, instead of just saying "that giant lump in your friend's neck".) The FDA recommends that Americans consume 150 mcg of iodine a day, and expects that the vast majority of Americans will achieve this through salt consumption.
(Very likely the reason you don’t have a goiter.)
That said, while the level of iodine supplementation present in modern-day table salt is clearly sufficient to greatly reduce the incidence of goiter, that doesn’t necessarily mean that it’s the ideal level for overall thyroid health.[2] (The FDA’s upper limit is set nearly 8 times higher than the RDA, at 1,100 mcg, suggesting that considerably more than the recommended amount isn’t expected to be deleterious to human health.) The thyroid largely regulates metabolism, and you may be familiar with some mysteries raised about human metabolism over the past century.[3] There are also a number of diseases directly associated with thyroid function, and these are pretty common in the present day (especially among women), with around 20 million patients in the United States estimated to have some kind of thyroid disorder. Thyroid functioning is screened for by doctors via blood tests measuring thyroid hormone concentration. Common disorders include hypothyroidism, in which an abnormally low level of thyroid hormone is produced, and hyperthyroidism, in which too much is produced.
INTERESTING!:
The Blue Zones are regions around the world renowned for their high proportion of centenarians, people living to the age of 100 or later. These are Okinawa, Japan; Sardinia, Italy; Nicoya, Costa Rica; Ikaria, Greece; and Loma Linda, California. These zones are now getting the shit studied out of them by anthropologists, doctors, etc., so we can try and figure out why they live so long and hopefully use it to improve human healthspan elsewhere.
You will not be surprised, given the background I just provided, to find out that iodine appears to be related to the mystery of the blue zones.
In the study Association of endemic goitre and exceptional longevity in Sardinia: evidence from an ecological study, the authors note “The spatial analysis revealed that the goitre rate (p < 0.0001), the proportion of inhabitants involved in pastoralism (p = 0.016), the terrain inclination (p = 0.008), and the distance from the workplace as a proxy for physical activity (p = 0.023) were consistently associated with population longevity at an aggregated level in the 377 municipalities.” Which is to say, a higher goiter rate in a municipality was the measure they studied that was most clearly associated with greater longevity. This association had an extremely–to my mind, an almost outrageously–low p-value.
The study goes on to say that, “from a worldwide perspective, the finding of an epidemiological association between goitre prevalence and longevity does not seem to be limited to Sardinia, but partially shared also by other populations where long-lived subjects are numerous. Most of the Longevity Blue Zones in the past were niches of endemic goitre as well. In Costa Rica, where the Nicoya LBZ was identified in 2007 (Rosero-Bixby 2008), high prevalence of endemic goitre has been reported since the 1950s (Perez et al. 1956) possibly aggravated by a gross excess of calcium ingested with drinking water (Boyle et al. 1966). In another LBZ, Ikaria island, the iodine level in spring water is remarkably low (Pes and Poulain 2014).”
So great, restrict iodine and you’ll live forever, right? AGING SOLVED! WE DID IT, YOU’RE WELCOME, aging is just iodine damage.
…Except.
Another Blue Zone is Okinawa, where most of the population routinely consumes nutritionally large quantities of seaweed–one of the richest natural sources of iodine. How much iodine are they consuming? I don’t have great numbers on Okinawa itself, but mean consumption of iodine in Japan is estimated at around 1-3 milligrams a day–that’s 1,000-3,000 micrograms of iodine. Remember that the FDA puts the RDA of iodine at just 150 micrograms, and the UL at 1,100 micrograms! Apparently, many Japanese people are consuming considerably more than the “tolerable upper intake level” for iodine. Centenarianhood is apparently their reward. Japan has 86,000 centenarians (.06% of their population, the highest percentage in the world), and a life expectancy of 84.62 years, second only to Hong Kong (the population of which, as of 2011, had a high risk of iodine deficiency).
These studies, taken together, suggest that having a moderate iodine intake leads to early death. As the Buddha said, “seek extremes in all things and maybe you can live to be a hundred”.
It is difficult not to propose any explanations for this, but that is the task that has been assigned to me, so, there you have it. Low iodine is correlated with longevity. High iodine is correlated with longevity. The thyroid is truly the most mysterious organ.
[1] If that first sentence surprised you, you’re in good company; iodine is “the heaviest element commonly needed by living organisms.”
[2] Plus, there’s bound to be population variance–I, for one, use kosher salt in my cooking, because Alton Brown told me to, and it’s not iodized. I’m surely exposed to iodized salt via processed foods and restaurant meals, but I do consume those somewhat rarely, at least compared to the number of meals I prepare at home.
[3] Iodine was once used to treat flour, oxidizing it to allow gluten to more easily form bonds and create the texture all us non-celiacs love in bread and pizza. But around the same time we began to study iodine supplementation, we invented potassium bromate, and started brominating flour instead to achieve the same effect–potassium bromate produced a very white flour that rapidly oxidized. Iodine is a necessary mineral, though, and bromine isn’t. So there was likely a point in recent American history where iodine consumption in bread decreased. You’re also really not supposed to consume the potassium bromate, but the thought is that baking temperatures sublimate the bromine–ah fuck I guess you really *shouldn’t* eat raw cookie dough? fuck.
[This is one of the finalists in the SMTM Mysteries Contest, by a reader writing under the pseudonym Cennfaeladh. We’ll be posting about one of these a week until we have gotten through all the finalists. At the end, we’ll ask you to vote for a favorite, so remember which ones you liked.]
[UPDATE: Now that the contest is over, we reveal that the author of this post is niplav, who blogs at niplav.site]
I investigate whether the attention span of individual humans has been falling over the last two decades (prompted by curiosity about whether the introduction of the internet may be harmful to cognitive performance). I find little direct work on the topic, despite its wide appeal. Reviewing related research indicates that individual attention spans might indeed have been declining (65%).
In what might be just the age-old regular ephebiphobia, claims have been raised that individual attention spans have been declining—not just among adolescents, but among the general population. If so, this would be quite worrying: Much of the economy in industrialized societies is comprised of knowledge work, and knowledge work depends on attention to the task at hand: switching between tasks too often might prevent progress on complicated and difficult problems.
I became interested in the topic after seeing several claims that e.g. Generation Z has shorter attention spans, observing how I myself struggled to get any work done when connected to the internet, and hearing reports from others, online and in person, who have the same problem.
The exact question being asked is:
“Have the attention spans of individuals on neutral tasks (that is, tasks that are not specifically intended to be stimulating) declined from 2000 to the present?”
(One might also formulate it as "Is there an equivalent of the 'Reversed Flynn Effect' for attention span?") I am not particularly wedded to the specific timeframe, though the worries mentioned above assert that the decline has become most stark during the last decade or so, attributing the change to widespread social media/smartphone/internet usage. Data from before 2000 or just from the aughts would be less interesting. The near-global COVID-19 lockdowns could provide an especially enlightening natural experiment: Did social media usage increase (my guess: yes, 90%), and if so, did attention spans decrease at the same time or with a lag (my guess: also yes, 75%)? But I don't think anyone has the data on that and wants to share it.
Ideally, I would want experiments from ~2000 up to 2019: close enough to the present to see whether there is a downward trend (a bit more than a decade after the introduction of the iPhone in 2007), but before the COVID-19 pandemic, which might be a huge confounder, or might just have accelerated existing trends (which we can probably check in another 2 years).
I am mostly interested in the attention span of individual humans and not groups: Lorenz-Spreen et al. 2019 investigate the development of a construct they call “collective attention” (and indeed find a decline), but that seems less economically relevant than individual attention span. I am also far less interested in self-perception of attention span, give me data from a proper power- or speed-test, cowards!
So the question I am asking is not any of the following:
“Does more social media/internet usage cause decreased attention spans?”
“Does more social media/internet usage correlate with decreased attention spans?”
“Does more social media/internet usage correlate with people reporting having shorter attention spans?”
“Did collective attention spans decrease?”
“Are people on average spending less time on webpages than they used to?”
How Is Attention Span Defined?
Attention is generally divided into three distinct categories: sustained attention, which is the consistent focus on a specific task or piece of information over time (Wikipedia states that the span for sustained attention has a leprechaun figure of 10 minutes floating around, elaborated on in Wilson & Korn 2007); selective attention, which is the ability to resist distractions and focus on important information while performing a task (the thing trained during mindfulness meditation); and alternating or divided attention, also known as the ability to multitask.
When asking the question “have attention spans been declining”, we’d ideally want the same test measuring all those three aspects of attention (and not just asking people about their perception via surveys), performed annually on large random samples of humans over decades, ideally with additional information such as age, sex, intelligence (or alternatively educational attainment), occupation etc. I’m personally most interested in the development of sustained attention, and less so in the development of selective attention. But I have not been able to find such research, and in fact there is apparently no agreed upon test for measuring attention span in the first place:
She studies attention in drivers and witnesses to crime and says the idea of an “average attention span” is pretty meaningless. “It’s very much task-dependent. How much attention we apply to a task will vary depending on what the task demand is.”
Perhaps I should soften my tone, as this perspective appears elsewhere:
[…] Gould suggests the metaphor of a dense bush whose branches are periodically pruned by nature. This allows for parallel evolutionary sequences, some of which are adaptive and others not — at any moment in time only the tips of adaptive branches are in evidence, the pruned ones cannot be seen. Thus rather than being direct descendants of primitive hominids, for example, humankind would have evolved along a separate but parallel line from other primates.
Might the ontogeny of selective attention recapitulate this theme? That is, rather than selective attention comprising a single construct with a fixed ontogenic plan, might it be better conceptualized as a multidimensional construct with separate, parallel developmental trajectories for different components. To carry the analogy still further, might the specific developmental progression for a particular component of selective attention be determined by the adaptive fit of that component with the individual's 'environmental press'? Although such a conjecture rekindles the tenet of ontogeny recapitulates phylogeny long since abandoned in physiological development (e.g., Dixon and Lerner, 1985), we suggest that it may nonetheless provide an overarching framework within which to cast life-span research and theory on the development of selective attention.
One of my hopes was that there is a canonical and well-established (and therefore, ah, tested) test for attention span (or just attention) à la the IQ test for g: If so, I would be able to laboriously go through the literature on attention, extract the individual measurements (and maybe even acquire some datasets) and perform a meta-analysis.
Continuous Performance Tests
For measuring sustained and selective attention, I found the family of continuous performance tests, including the Visual and Auditory CPT (IVA-2), the Test of Variables of Attention (T.O.V.A.), Conners’ CPT-III, the gradCPT and the QbTest, some of which are described here. These tests usually contain two parts: a part with low stimulation and rare changes of stimuli, which tests for lack of attention, and a part with high stimulation and numerous changes of stimuli, which tests for impulsivity/self control.
Correct detection: This indicates the number of times the client responded to the target stimulus. Higher rates of correct detections indicate better attentional capacity.
Reaction times: This measures the amount of time between the presentation of the stimulus and the client’s response.
Omission errors: This indicates the number of times the target was presented, but the client did not respond/click the mouse. High omission rates indicate that the subject is either not paying attention (distractibility) to stimuli or has a sluggish response.
Commission errors: This score indicates the number of times the client responded but no target was presented. A fast reaction time and high commission error rate points to difficulties with impulsivity. A slow reaction time with high commission and omission errors indicates inattention in general.
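To make the four scores above concrete, here is a minimal sketch of how one might compute them from raw trial data; the data structure and function names are my own illustration, not taken from any particular CPT implementation.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Trial:
    is_target: bool       # was the target stimulus shown on this trial?
    responded: bool       # did the subject press the button?
    reaction_time: float  # seconds from stimulus onset to response (ignored if no response)

def score_cpt(trials: list[Trial]) -> dict:
    """Compute the four summary scores described above from a list of trials."""
    targets = [t for t in trials if t.is_target]
    non_targets = [t for t in trials if not t.is_target]
    hits = [t for t in targets if t.responded]
    return {
        # correct detections: responses to the target stimulus
        "correct_detections": len(hits),
        # mean reaction time over correct detections
        "mean_reaction_time": mean(t.reaction_time for t in hits) if hits else None,
        # omission errors: target shown but no response (distractibility / sluggishness)
        "omission_errors": sum(1 for t in targets if not t.responded),
        # commission errors: response although no target was shown (impulsivity)
        "commission_errors": sum(1 for t in non_targets if t.responded),
    }
```

As far as I understand, the commercial tests then convert such raw counts into age-normed scores, but the underlying quantities are these.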
I’m currently unsure about two crucial points:
How much does any CPT measure the concept we naively call attention span? The papers I’ve read don’t refer to attention span per se, but a general capability of sustained and selective attention.
Are there any time-series analyses or longitudinal studies using a CPT, or alternatively meta-analyses using data collected from existing studies? I have not been able to find any.
Other Heterogeneous Metrics
I also attempted to find a survey or review paper on attention span, but was unsuccessful in my quest, so I fell back to collecting metrics for attention span from different papers:
Three online tests (probably devised by the authors (?), since no source is given) (n≈2000 Canadians). Very little information about the exact nature of the tests.
Sustained attention span: "Counting the number of times respondents correctly identified an X occurring after an A."
Selective attention span: “Counting the number of times respondents correctly identified a change in the orientation of the rectangles”
Alternating attention span: “Calculating the difference in the time lapsed to perform a series of consecutive number or letter classification, compared to a mixture of number and letter classifications.”
Neurological research: The same games/tests as above with the participants being measured with an EEG (“Results were reported as ACE (Attention, Connectivity, Encoding) scores, as well as the number of attention bursts”) (n=112 Canadians)
Questionnaire developed by the authors based on Conners 2004 (reliability: α=0.786)
Wilson & Korn 2007 report several different measures of attention span during lectures: the amount of notes taken over time, observation of the students by an author of one study or two independent observers in another study, retention of material after the lecture, self-report in 5-minute intervals during the lecture, and heart rate. They also note that “Researchers use behaviors such as fidgeting, doodling, yawning, and looking around as indicators of inattentiveness (e.g., Frost, 1965; Johnstone & Percival, 1976).”
Plude et al. 1994 review how selective attention develops during a human life. For measuring attention, they mainly focus on studies using reaction time as a metric—the speed at which an action occurs as a result of a changing stimulus: eye movement patterns of infants, simple tests such as pressing a button on a changing (often visual) stimulus, the influence of irrelevant visual stimuli at the periphery on a task performed at the centre of the visual field, judging similarity of stimuli at various distances in the visual field, responding to a target stimulus surrounded by interfering distractor stimuli, and determining whether a visual target item is present or absent. They also mention skin conductance (measuring arousal).
They also mention studies investigating the time required for attentional switching in acoustic contexts: “Pearson and Lane (1991a) studied the time course of the attention-shifting process between lists and also found large age-related improvements between 8 and 11 years. Whereas 8-year-olds required more than 3.5 s to completely switch from monitoring one list to another, 11-year-olds and adults appeared to complete the switch in less than 2.5 seconds.”
This is not an adequate metric, I believe: it would also decline if people became better at prioritising which websites are worthy of their attention.
Time that specific pieces of information (hashtags/n-grams/Reddit submissions &c) were popular
As it stands, I think there’s a decent chance (60%) that one or several tests from the CPT family can be used as tests for attention span without much of a problem.
I don’t think a separate dedicated test for attention span exists (45%): The set of listed measures I found (apart from the CPT) appears to be too heterogeneous, idiosyncratic, mostly not quantitative enough, and measuring slightly different things to be robustly useful for a meta-analysis.
What Are the Existing Investigations?
A lack of long-term studies means we can’t tell whether attention spans have actually declined.
Do younger people perform worse on the sustained attention span test?, Yes (31% high sustained attention for group aged 18-34, 34% for group aged 35-54, and 35% for group aged 55+) (the methodology is wholly unclear here, though: how do we determine the group that has “high sustained attention span”? Did they perform any statistical tests? If yes, which?).
Do people who report more technology usage (web browsing/multi-screen usage while online/social media usage/tech adoption) perform worse on the sustained attention span test?, Yes. Light:medium:heavy usage for web browsing has 39%:33%:27% users with high sustained attention span, 36%:33%:27% for light:medium:heavy multi-screen usage, 36%:29%:23% for light:medium:heavy social media usage and 35%:31%:25% for light:medium:heavy tech adoption (though these numbers are basically not elaborated on).
Selective attention:
Do younger people perform worse on the selective attention span test? No (34% high selective attention for group aged 18-34, 30% for group aged 35-54, and 35% group aged 55+).
Do people with high selective attention use fewer devices at the same time? Yes (details p. 31).
Alternating attention:
Do younger people perform worse on the alternating attention span test? No (36% high alternating attention for group aged 18-34, 28% for group aged 35-54, and 36% for group aged 55+).
Do people who report more technology usage (tech adoption/web browsing/multi-screen usage while online) perform worse on the alternating attention span test? No, they seem to perform better: Light:medium:heavy tech adoption corresponds to 31%:39%:40% having high alternating attention spans, light:medium:heavy web browsing to 29%:34%:37% and multi-screening while online to 27%:32%:37%.
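The report gives only these percentages and, as complained above, no statistical tests. As an illustration of the kind of test one would want to see, here is a sketch of a chi-square test on the sustained-attention age breakdown, assuming, purely hypothetically, roughly equal group sizes of about 600 respondents each (the report only states n≈2000 overall).

```python
from scipy.stats import chi2_contingency

# Hypothetical: ~600 respondents per age group (the report only states n≈2000 overall).
group_sizes = {"18-34": 600, "35-54": 600, "55+": 600}
high_share = {"18-34": 0.31, "35-54": 0.34, "55+": 0.35}  # "high sustained attention" shares

# 3x2 contingency table: [high attention, not high attention] per age group.
table = [
    [round(n * high_share[g]), round(n * (1 - high_share[g]))]
    for g, n in group_sizes.items()
]

chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
# With these made-up group sizes, p comes out well above 0.05, i.e. a 31%/34%/35%
# split would not be distinguishable from noise on its own.
```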
Do people who use social media more have higher Attention/Connection/Encoding scores on EEG measurements?, Not quite: “Moderate users of social media are better at multi-tasking than lower users. But, when crossing into the top quartile of social media usage, scores plummet.”
This is a marketing statement wearing the skinsuit of a previously great paper, it would be awesome if they released their exact methodology (tests performed, data collected, exact calculations & code written). I can smell that they actually put effort into the research: Creating an actual test instead of just asking respondents about their attention spans, doing EEG measurements of over 100 people, for 3 different types of attention…come on! Just put out there what you did!
Is self-reported attention span related to the number of social media accounts?, No, not statistically significant (F(2, 206)=0.1223, p>0.05) (via a one-way ANOVA)
Is self-reported attention span related to whether a respondent mainly uses a mobile phone or a computer?, No, not statistically significant (F(2, 713)=0.923, p>0.05) (via a one-way ANOVA)
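For readers who want to see what the reported test looks like in practice, here is a minimal one-way ANOVA sketch on synthetic data; the group labels and scores are invented purely for illustration.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

# Invented self-reported attention-span scores (1-10) for three groups of respondents,
# split by number of social media accounts, mirroring the reported design (n = 209, so df = 206).
few_accounts = rng.integers(1, 11, size=70)
some_accounts = rng.integers(1, 11, size=70)
many_accounts = rng.integers(1, 11, size=69)

f_stat, p_value = f_oneway(few_accounts, some_accounts, many_accounts)
print(f"F(2, {209 - 3}) = {f_stat:.3f}, p = {p_value:.3f}")
# The groups are drawn from the same distribution, so the test will usually come out
# non-significant, which is the shape of the result reported above.
```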
I do not trust this paper: Calling (what I think is) Generation Z “Generation D” (without source for the term), being clearly written in Word, and confusing grammar (I think the authors are all Americans, so no excuse here):
Users that are older such as late adolescents and emerging adults average approximately 30-minutes daily for just Facebook that does not calculate the time spent on all social media networks
Bakardjieva and Gaden (2012) examined the field of social interaction in general to the everyday chatter of unstructured and spontaneous interactions among individuals to highly structured and regulated interaction consisting of the military or the stock exchange.
Question answered: How much time do people spend on a website, on average?, “if you look at the trend for mobile browsing between the years 2017 and 2019 you would see that there is a drop of about 11 seconds in the average time spent on a website.” and “The data suggests that the average amount of time spent on websites before navigating away for all devices has gone down by 49 seconds which is a pretty huge reduction all things considered.”
The data is from the right timeframe (up to but not including 2020), but the linked SimilarWeb report is behind a paywall, so I can’t confirm the numbers. Furthermore, the time spent on websites is a weak proxy: Perhaps people simply have become better at prioritising information sources.
How long does any particular hashtag stay in the group of the top 50 most used hashtags? Specifically, how has that number developed from 2013 to 2016?, “in 2013 a hashtag stayed within the top 50 for 17.5 hours on average, a number which gradually decreases to 11.9 hours in 2016”, and “The average maximum popularity ⟨L(t_peak)⟩ on one day t_peak stays relatively constant, while the average gradients ⟨ΔL⟩ in positive and negative direction become steeper over the years.”
Do things become more popular faster over time? That is, when e.g. a movie is gaining popularity, did it take longer to become popular in 1985 than it did in 2018?, Broadly yes (the trend holds for popularity of hashtags in tweets (2013-2016)/n-grams in books (1900-2004)/number of theaters that movies were screened in (1985-2018)/topics for search queries on Google (2010-2017)/Reddit comments on posts (2010-2015)/citations of publications (1990-2015)/daily traffic for Wikipedia articles (2012-2017)). Again, the length of time at the peak mostly didn’t change (except in the case of Wikipedia articles, where the time at the peak shrank).
While it investigates a question different from the one I have, this paper seems good and trustworthy to me, while supporting a suspicion I’ve had (observing that the lifecycle of e.g. memes has apparently sped up significantly). I’d be interested in seeing whether the same process holds for internet communities I’m part of (for example votes on LessWrong and the EA Forum, or forecasts on Metaculus).
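To make the quoted metrics a bit more concrete, here is a rough sketch of how one could compute, for a toy popularity time series, the time spent above a “top 50” cutoff and the average gradients before and after the peak; the variable names are mine, not the paper’s.

```python
import numpy as np

def popularity_metrics(series: np.ndarray, top50_threshold: float) -> dict:
    """Toy versions of the metrics discussed above for one item's hourly popularity curve."""
    peak_idx = int(np.argmax(series))
    diffs = np.diff(series)
    return {
        # hours the item spends above the "top 50" cutoff
        "hours_in_top50": int(np.sum(series >= top50_threshold)),
        # L(t_peak): the maximum popularity reached
        "peak_popularity": float(series[peak_idx]),
        # average gradient while rising / while falling (steeper == faster rise/decay)
        "mean_upward_gradient": float(diffs[:peak_idx].mean()) if peak_idx > 0 else 0.0,
        "mean_downward_gradient": float(diffs[peak_idx:].mean()) if peak_idx < len(series) - 1 else 0.0,
    }

# Example: a hashtag that rises for 6 hours, peaks, and decays over 12 hours.
curve = np.concatenate([np.linspace(0, 100, 7), np.linspace(100, 0, 13)[1:]])
print(popularity_metrics(curve, top50_threshold=40.0))
```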
Mark 2023 is a recent book about attention spans, which I was excited to read and find the important studies I’d missed. Unfortunately, it is quite thin on talking about the development of attention span over time. It states that
My own research, as well as those of others, has shown that over the last fifteen years, our attention spans have declined in duration when we use our devices. Our attention spans while on our computers and smartphones have become short—crazily short—as we now spend about forty-seven seconds on any screen on average.
—Gloria Mark, “Attention Span” p. 13/14, 2023
which is not quite strong enough a measurement for me.
In 2004, in our earliest study, we found that people averaged about one hundred fifty seconds (two and a half minutes) on a computer screen before switching their attention to another screen; in 2012, the average went down to seventy-five seconds before switching. In later years, from 2016 to 2021, the average amount of time on any screen before switching was found to be relatively consistent between forty-four and fifty seconds. Others replicated our results, also with computer logging. André Meyer and colleagues at Microsoft Research found the average attention span of twenty software developers over eleven workdays to be fifty seconds.⁹ For her dissertation, my student Fatema Akbar found the average attention span of fifty office workers in various jobs over a period of three to four weeks to be a mere forty-four seconds.¹⁰ In other words, in the last several years, every day and all day in the workplace, people switch their attention on computer screens about every forty-seven seconds on average. In fact, in 2016 we found the median (i.e., midpoint) for length of attention duration to be forty seconds.¹¹ This means that half the observations of attention length on any screen were shorter than forty seconds.
—Gloria Mark, “Attention Span” p. 74/75, 2023
She doesn’t mention the hypothesis that this could be the symptom of a higher ability to prioritize tasks, although she is adamant that multi-tasking is bad.
Furthermore, this behavior displays only a decrease in the propensity to attend, not necessarily in the capacity for attention: Perhaps people could concentrate more if they wanted to or were incentivized to, but they don’t, because there is no strong intent to or reward for doing so. Admittedly, this is less of an argument in the workplace where these studies were conducted, but perhaps people just don’t care as much about their jobs (or so I’ve heard).
when email was cut off, people’s attention spans were significantly longer while working on their computers—in other words, they switched their attention less frequently.
—Gloria Mark, “Attention Span” p. 97, 2023
She gives some useful statistics about time spent on screens:
Nielsen reports that Americans spend on average five hours and thirty minutes daily of screen time on their computers, tablets and phones.⁸ […] But what is really astonishing is that when we add in the time watching other media like TV and films to this, then we see that our attention is fixated on some form of screen, in some type of mediated environment, nearly ten hours a day.⁸
—Gloria Mark, “Attention Span” p. 180, 2023
She connects attention span to shot-length in movies:
The type of motion within shots has been changing. According to film scholar James Cutting and his colleagues at Cornell, shots containing the onset of motion (like a standing person who then runs) have increased because filmmakers believe that it will better attract viewers’ attention. […] The average film shot length in 1930 was twelve seconds, but then began to shorten, reaching an average of less than four seconds after the year 2010, as measured by James Cutting and colleagues.12 Interestingly, the shot length for film sequels also decreased. For example, the shot length of the first Iron Man film averaged about 3.7 seconds; for Iron Man 2, 3.0 seconds; and for Iron Man 3, about 2.4 seconds.13
—Gloria Mark, “Attention Span” p. 180/181, 2023
Like in TV and film, shot lengths in television commercials also shortened over time. The average shot length of commercials in 1978 was 3.8 seconds, dropping down to an average of 2.3 seconds in 1991. […] It’s not just the shot lengths, though, that are short—the overall length of advertisements on TV has also decreased. The majority of ads started out as sixty seconds in length in the 1950s,26 but that length comprised only 5 percent of ads shown in 2017. In the 1980s, advertisers started experimenting with showing fifteen-second ads instead of thirty-second ads. They discovered that fifteen seconds was even more persuasive than thirty seconds, especially when the ads used elements expressing cuteness and humor.27 In 2014, 61 percent of ads were thirty seconds in length, but three years later, that percentage decreased to 49 percent.28
—Gloria Mark, “Attention Span” p. 189, 2023
Do People Believe Attention Spans Have Declined?
Half of the public feel their attention span is shorter than it used to be, compared with around a quarter (23%) who believe they are just attentive [sic] as they’ve always been.
Again, this feeling is not just reported by the young — it’s also the dominant feeling among the middle-aged, with 56% of 35- to 54-year-olds thinking their attention spans have worsened.
Even more widespread is the belief that young people’s attention spans in particular are worse than they were in the past—two-thirds of people think this is the case (66%).
Perhaps unsurprisingly, this belief is most common among the oldest age group surveyed, those aged 55 or over — however, young people themselves also feel this way, with a majority of 18- to 34-year-olds holding this view.
Note that selective attention mostly improves with age, so the older age groups might be comparing themselves now to the younger age groups now (as opposed to thinking back to their own attention spans when they were young).
The absence of long-term research means it remains unknown whether technology has caused a deterioration in the country’s ability to concentrate — but comparisons with survey data from previous decades indicate that, on some measures, the public feel more pressured than they did in the past.
In response to the questions (n=2093 UK adults aged 18+ in 2021):
“To what extent do you agree or disagree with the following statement? The pace of life is too much for me these days” (1983: 30% agree, 2021: 41% agree)
“To what extent do you agree or disagree with the following statement? I wish I could slow down the pace of my life” (1997: 47% agree, 1999: 51% agree, 2008: 45% agree, 2021: 54% agree)
What About Rates of ADHD?
Data from the CDC shows a clear increase in the percentage of children with a parent-reported ADHD diagnosis:
There has been a similar increase in the diagnosis of ADHD among adults, “from 0.43 to 0.96 percent” between 2007 and 2016.
However, this does not necessarily mean that the rate of ADHD has increased, if e.g. awareness of ADHD has increased and therefore leads to more diagnoses.
What Could A Study Look Like?
Compared to other feats that psychology is accomplishing, finding out whether individual attention spans are declining appears to be of medium difficulty, so I’ll try to outline how this could be accomplished in three different ways:
Develop a good instrument for measuring attention span (optionally, just use a continuous performance test). Once one has a suitable instrument, one could, every year (or every second year) for a couple of years, pick a random sample from the population (not the same set of people, though, since attention span increases with age), e.g. via the internet if the test can be done online. One could then apply a linear trend estimation, or a fancier statistical technique I don’t know, to find out whether attention spans have declined between the measurements (see the sketch below, after the list of approaches).
This could be done relatively cheaply: Let’s say we collect 50 datapoints a year from Mechanical Turk workers at $10/hr. A conservative estimate is that the test takes ~30 minutes to complete, so for three years the cost of the data would be 50 · 3 · $10/h · 0.5h = $750. It looks like there are open-source implementations of the test available (Conners’ CPT 3 costs $1.5k), so the additional cost is for the researcher setting up the test and recruiting the participants, which could take ~30 hours, and another ~30 hours for analysing the data. So the total cost of the experiment would be, at an hourly wage of $15 for the researcher (come on, we can let a grad student do it), $750 + 60h · $15/h = $1,650. Fudging upwards by taking the planning fallacy into account gives $2k for the experiment.
Use the control groups from studies testing the effect of interventions on attention as data and then perform a meta-analysis. A lot of studies use some variant of the CPT, I started collecting such studies in Appendix B.
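As a sketch of the analysis step in the first approach above (the yearly-samples design), here is how a simple linear trend estimate might look; the scores, the true decline, and the sample sizes are all made up for illustration.

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(42)

# Made-up CPT-style attention scores for independent random samples of 50 people,
# collected once a year (higher == better sustained attention), with a small true decline.
years = np.arange(2024, 2029)
samples = {year: rng.normal(loc=100 - 0.5 * (year - 2024), scale=15, size=50) for year in years}

# Flatten into (year, score) pairs and fit a linear trend.
xs = np.concatenate([[y] * len(s) for y, s in samples.items()])
ys = np.concatenate(list(samples.values()))
trend = linregress(xs, ys)

print(f"slope = {trend.slope:.2f} points/year, p = {trend.pvalue:.3f}")
# A reliably negative slope over several yearly samples would be (weak) evidence of
# declining attention spans; with this much noise and a true slope of -0.5 points/year,
# five years of 50 people each will often fail to reach significance, which is one
# reason a real study would want larger samples or a longer time window.
```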
Conclusion
Given the amount of interest the question about shrinking attention spans has received, I was surprised to not find a knockdown study of the type I was looking for, and instead many different investigations that were either not quite answering the question I was asking or too shoddy (or murky) to be trusted. It seems likely to me that individual attention spans have declined (I’d give it ~70%), but I wouldn’t be surprised if the decline was relatively small, noisy & dependent on specific tests.
So—why hasn’t anyone investigated this question to satisfaction yet? After all, it doesn’t seem to me to be extremely difficult to do (compared to other things science has accomplished), there is pretty clearly a lot of media attention on the question (so much so that a likely incorrect number proliferates far & wide), it appears economically and strategically relevant to me (especially sustained attention is probably an important factor in knowledge work, I’d guess?) and it slots more or less into cognitive psychology.
I’m not sure why this hasn’t happened yet (and consider this text evidence for a partial violation of Cowen’s 2nd law—although, to be fair, the law doesn’t specify there needs to be a good literature on everything…). The reasons I can think of are that one would first need to develop a good test for determining attention span, which is some work in itself (or use the CPT); that one would need to be relatively patient (since the test would need to be re-run at least twice with a >1 year pause, for which the best grant structure might not exist); that there are many partial investigations into the topic, making it appear like the question is already solved; and that perhaps there just aren’t enough cognitive psychologists around to investigate all the interesting questions that come up.
So I want to end with a call to action: If you have the capacity to study this problem, there is room for improvement in the existing literature! Attention spans could be important, it’s probably not hard to measure them, and many people claim that they’re declining, but are way too confident about it given the state of the evidence. False numbers are widely circulated, meaning that correct numbers might be cited even more widely. And it’s probably not even (that) hard!
Consider your incentives :-).
Appendix A: Claims That Attention Spans Have Been Declining
Most of these are either unsourced or cite Gausby 2015 fallaciously (a number which Bradbury 2016 conjectures is actually the average number of seconds spent on websites).
Today, individuals are constantly on an information overload from both the quantity of information available and the speed of which information gets into the hands of individuals through advertising and multimedia. Attention deficits tend to be increasing as it is challenging to attract individuals and hold their attention long enough for people to read or watch messages such as work memos, advertisements, etc.
Big data plays an important role in the development of microlearning. In the age of big data, human’s attention span is decreasing. As per Hebert (1971), “what information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that might consume it” (p. 41). An example of short attention span in the age of big data can be found in the music industry, as per (Gauvin, 2017), the average time that passed before the audience would hear the vocals on any radio song was 23 s, today the average intro is just 5 s long. Wertz (2017) also suggested that 40% of users are likely to abandon a website if it does not load within three seconds or less. Furthermore, a survey (Gausby, 2015) conducted by Microsoft indicated that the average attention span of a human dropped from 12 to eight seconds, which means shorter than a goldfish. Given the average human attention span is decreasing, microlearning becomes more and more important because it emphasises short learning duration.
Unfortunately, all too many of us are having “squirrel” days, according to Dr. Gloria Mark, a professor of informatics at the University of California, Irvine, who studies how digital media impacts our lives. In her new book, “Attention Span: A Groundbreaking Way to Restore Balance, Happiness and Productivity,” Mark explained how decades of research has tracked the decline of the ability to focus.
“In 2004, we measured the average attention on a screen to be 2½ minutes,” Mark said. “Some years later, we found attention spans to be about 75 seconds. Now we find people can only pay attention to one screen for an average of 47 seconds.”
Not only do people concentrate for less than a minute on any one screen, Mark said, but when attention is diverted from an active work project, it also takes about 25 minutes to refocus on that task.
Tech-savvy users often say that the way the modern internet works has made it so that people’s attention spans are getting shorter every single day but the truth behind this story is rather tough to ascertain. However, recent data from SimilarWeb indicates that people definitely are suffering from shorter attention spans, and what’s more is that these attention spans are shortening at a pretty rapid pace when you take into account the numerous factors that are coming into play, all of which serve some kind of purpose in this trend.
If you look at the data for how long users spend on websites before navigating away, for the most part the trend has been that these times are remaining more or less stable on web based browsing, but if you look at the trend for mobile browsing between the years 2017 and 2019 you would see that there is a drop of about 11 seconds in the average time spent on a website. When you take into account the fact that mobile browsing is starting to become a lot more popular and in many ways has become the preferred form of browsing for people on the internet, the change is a lot more drastic.
However, as much as technology can be used as an effective learning tool inside and outside the classroom, there’s no denying that one of the biggest challenges faced by educators today is the distraction posed by social media. Students are distracted by their phones during class, and even without that distraction, the time they spend on social media outside the classroom has an impact on their attention spans.
In 2015, a study commissioned by Microsoft and discussed in Time magazine found that the average attention span was in fact only 8 s. If indeed this is the case, then even participating in a 15-min lecture would be positively heroic. To place this in perspective, it was reported in the same Time article, that goldfish, of the piscine rather than snack variety, have an attention span of 9 s, one whole second greater than humans! It is perhaps rather premature to opt for an 8-s lecture format, as there are many caveats to the Time article, not the least of which is that no one knows how to actually measure a goldfish’s attention span. What has been measured is goldfish memory, which, according to researchers in the School of Psychology at the University of Plymouth, is actually quite good (7). Similarly the 8-s attention span for humans actually reflects the average time a person will spend on a web page before looking somewhere else.
Appendix B: Existing Studies With Control Groups
I’ll list the closest thing those studies have to a control group, sorted by year.
Studies Using the CPT
What Determines Continuous Performance Task Performance? (Maria J. O. van Asma/René P. Rombouts/Robert J. van den Bosch, 1996): Patient controls n=19 (mean age 29.5±8.1 years), normal controls n=20 (mean age 32.2±8.3 years). Tested with an unspecified digit related CPT.
Measuring Several Aspects of Attention in One Test (Iwona Kovalik-Gran/Jens Egeland, 2010): n=376 Norwegian patients, aged 14-77 (mean 32.9 years, standard deviation 13.8), either referred to Egeland or to the Vestfold Mental Health Care Trust, with various psychological disorders (57 without such disorders). Tested using Conners’ CPT. Scores for normal group not reported independently.
Reliability and validity of the Conners’ Continuous Performance Test (Danielle Shaked/Lauren M. D. Faulkner/Kathryn Tolle/Carrington R. Wendell/Shari R. Waldstein/Robert J. Spencer, 2019): n=91 undergraduate psychology students (20.01±1.68 years), tested using Conners’ CPT II. Page 3 lists more retest studies for Conners’ CPT II.
Do concerns about COVID-19 impair sustained attention? (Jihyang Jun/Yi Ni Toh/Caitlin A. Sisk/Roger W. Remington/Vanessa G. Lee, 2021): n=161 participants (23±5.2 years), recruited online. No control group, but instead a correlational study. Tested using the scene CPT.
Appendix C: How I Believed One Might Measure Attention Span Before I Found Out About The CPT
Before I found out about the Continuous Performance Test, I speculated about how to measure attention span:
(Note that I’m not a psychometrician, but I like speculating about things, so the ideas below might contain subtle and glaring mistakes. Noting them down here anyway because I might want to implement them at some point.)
It seems relatively easy to measure attention span with a power- or speed-test, via one of three methods:
Present a stimulus, change the stimulus, and let the test subject report the change; this results in two numbers: the time between the stimulus being initially presented and the time it was changed (let’s call this value t_change), and the time between the change of the stimulus and the reporting of the change (calling this value t_report). Performing this test with different values of t_change should result in different values of t_report. There is a t_change for which t_report exceeds a threshold value; that t_change can be called the attention span. (A toy sketch of this procedure appears at the end of this appendix.)
This method has some disadvantages:
It needs a change in stimulus that requires selective attention to notice, but changing e.g. visual stimuli involves motion, which directs attention. (Idea: have a colored stimulus continuously changing color, and a reference color; once the stimulus reaches the reference color, the subject is supposed to report it. This avoids sudden changes in visual stimuli.)
The method would require many samples to find the t_change for which t_report exceeds the threshold value.
Performing the test multiple times in a row might induce mental fatigue, decreasing attention span.
Let the test subject engage in a mentally draining exercise like the Stroop test with some performance measure. I would expect the performance to decline over time, and one could define a threshold value at which the subject “is no longer paying attention”.
Let the subject observe a neutral stimulus while measuring some indicator of attention (such as arousal via skin conductivity or the default mode network being inactive), when the measured value falls under/over a threshold the subject has “lost attention”.
This method has the major disadvantage that it requires special equipment to perform.
Such an instrument would of course need to have different forms of reliability and validity, and I think it would probably work best as a power test or a speed test.
I’m not sure how such a test would relate to standard IQ tests: would it simply measure a subpart of g, be completely independent of it, or be only partially related to it?
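Purely as a toy, here is what the first measurement idea above (the t_change / t_report procedure) might look like as a console experiment; the threshold, timing choices, and the crude “smallest lapse” estimate are all arbitrary stand-ins for a proper psychometric design.

```python
import random
import time

def run_trial(t_change_range=(2.0, 20.0)):
    """One trial: wait a random t_change, then 'change the stimulus' and measure t_report."""
    t_change = random.uniform(*t_change_range)
    print("Watch this line and press Enter as soon as it changes.")
    time.sleep(t_change)
    start = time.monotonic()
    input(">>> THE STIMULUS HAS CHANGED <<<  (press Enter)")
    t_report = time.monotonic() - start
    return t_change, t_report

def estimate_attention_span(n_trials=10, report_threshold=1.0):
    """Smallest t_change for which t_report exceeded the threshold, or None if it never did."""
    lapses = [t_change for t_change, t_report in (run_trial() for _ in range(n_trials))
              if t_report > report_threshold]
    return min(lapses) if lapses else None

if __name__ == "__main__":
    print("Estimated attention span (seconds):", estimate_attention_span())
```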
[This is one of the finalists in the SMTM Mysteries Contest, by a reader writing under the pseudonym TripleTaco. We’ll be posting about one of these a week until we have gotten through all the finalists. At the end, we’ll ask you to vote for a favorite, so remember which ones you liked.]
I’ve always thought it was really weird that Amish kids mostly don’t get autism, and I have found other people’s explanations unsatisfying. For most of America, the autism rate is about 1 in 44, but among the Amish, it’s more like 1 in 271. Like obesity, autism is extremely common in developed countries, and largely did not exist (or was not recognized) before modernity (diagnoses have been rising dramatically). So, what if some environmental factor of our modern lives is poisoning our children’s very minds?
It’s not because of vaccines
It appears that the correlation is mostly used by anti-vaxxers as evidence that children shouldn’t be vaccinated, which makes it data-non-grata (we can’t talk about this data lest we reinforce their conspiracy theories).
It’s not because of vaccines, but what if there’s some other aspect of modern living that contributes heavily to autism? There are many other things that the Amish abstain from that could be contributing.
It’s not (just) kids watching TV
An easy culprit would be babies watching too much TV. Indeed, there’s strong evidence that needs to be considered here. An increase in cable TV subscriptions in a neighborhood is directly correlated with an increase in autism. The same article also points out that places with more precipitation get more autism (perhaps because people stay inside to watch TV?).
But wait, while research on this topic does support the theory that screen time is bad for babies’ social development, the effect size is not nearly large enough to explain the severe autism that many children get. No study I’ve seen directly correlates baby screen time with actual autism diagnosis (although few would argue that heavy screen usage is good for babies). The TV explanation is missing something.
It’s not (just) genetics
Some have pointed out that the Amish are genetically isolated, and perhaps they have good genes for avoiding autism. However, no genetic explanation can account for the dramatic rise in autism rates in other populations.
Are there any clues from treatment?
Often diseases are treated with therapies that reflect their causes. Bacterial infections are treated with antibiotics, for instance; kill the bacteria, and you solve the problem. Autism is generally treated with language and behavioral therapy. The child has extended one-on-one time with a therapist where they practice building the social skills they are lacking. If this is truly the best treatment (at least for now), does that tell us anything about possible causes?
Teaser
There’s more I’d like to say and link to, but I can’t really go much further without getting into my own theories on this topic. I do have a pet theory that I think better fits the observed data than other explanations. I have in fact found one other individual (a therapist who treats autism) espousing my exact theory, although it doesn’t appear to have gotten much traction. For now, I’ll just leave this mystery without suggesting a theory as requested.
Companions the creator seeks, not corpses, not herds and believers. Fellow creators the creator seeks—those who grave new values on new tablets. Companions the creator seeks, and fellow harvesters; for everything about him is ripe for the harvest.
— Friedrich Nietzsche, “Thus Spoke Zarathustra”
There’s a long tradition in the history of medicine where people figured out the cause of an industrial disease by noticing that one profession had a much higher rate of the disease than everyone else. For example, in Victorian and Edwardian England, chimney sweeps had a rate of scrotal cancer more than 200 times higher than workers who weren’t exposed to tar on the job. No, we are not making this up.
Now it’s your turn to do something similar. Your mission, should you choose to accept it, is to write a review of the mysteries on a topic and send it to us at slimemoldtimemold[at]gmail[dot]com by July 1st 2023.
Pick a topic, and write about the mysterious aspects of that topic, like we did for the mysteries of obesity in Part I of A Chemical Hunger. We mostly expect you to review topics from “hard science” areas like medicine, biology, chemistry, and neuroscience, but we are open to reviews of mysteries from social science, economics, political science, or the humanities. If you feel you can make a strong case for some mysteries and why they are mysterious, that’s good by us.
You can include Normal Mysteries, things that are unexplained but that most people know about and don’t seem all that confusing. For example, IBS and migraines are about 2-3x more common in women than in men. Everyone kind of knows this, so it’s not all that weird, but no one can really explain it, so it is still a mystery. The first three mysteries we reviewed about the obesity epidemic were all pretty normal.
You should also review Weird Mysteries, things that most people aren’t aware of and/or that seem like they totally don’t make sense, things that fly in the face of our understanding. The rest of the mysteries we reviewed about the obesity epidemic were pretty weird, like how lab animals and wild animals are also getting more obese. What’s up with that?
Our hot tip is that the simplest form of mystery is just unexplained or unexpected variation. A good example is how obesity rates vary by altitude — low-altitude counties in the United States have much higher obesity rates than high-altitude counties do. This is not predicted by most theories of obesity, and many people found this very surprising.
An unexpected LACK of variation can also be a mystery. For obesity, it feels intuitive that people who eat different kinds of diets should weigh different amounts, but diet consistently seems to make very little difference. From the perspective of the mainstream understanding of obesity, this is pretty mysterious.
How do you know that you’ve found a good mystery? It’s an emotion, a feeling that starts in your gut, not unlike IBS (which, hey now that we think about it, is pretty mysterious). Start with something that you just can’t wrap your stomach around. We’re looking for a confusion that started rumbling in your tummy when you were a student who kept asking the same basic questions and couldn’t get a straight answer, a confusion that has just kept grumbling away right there next to your esophagus ever since — now that’s a mystery. The best mysteries will be assumptions where everyone else thinks everything is fine, but you have a nagging suspicion that something is wrong.
Please focus on the mysteries of your chosen subject — DO NOT include a theory. If you feel you need to provide context, you can discuss popular theories and how your mysteries support or undermine them (like we did in Part II). But no arguing for a theory or introducing a theory of your own.
This is a mystery contest, not a theory contest. Your mystery review is the hook; if you do a great job reviewing some mysteries and win the contest, everyone will be excited to hear about your theory. Then you can put it on your own blog and get a lot of readers. If people think you have a promising direction, maybe you can get funding to study it further.
Software engineers who have just lost their jobs; grad students on strike; academics who are fed up with the paywall curtain; couples who have just retired at 35; founders whose last venture was recently acquired; billionaire playboys with too much time on their hands; anyone who is looking to make a pivot to research — this is the contest for you. You don’t need a lot of research chops to look at something and tell that it’s weird; anyone can pick out mysteries by noticing when things don’t add up, when things are unexplained, or when experts all disagree on the best explanation.
If anything, outsiders and newbies have an advantage. If your career doesn’t rely on pretending to understand, it’s easier to spot things that don’t make any sense.
Don’t do this though
Contest Format
We have recruited some judges to help us evaluate the mysteries: Adam Mastroianni, Lars Doucet, Applied Divinity Studies, Tony Kulesa, and possibly some other judges TBA. We will consult with these judges and will choose around 5-10 finalists, which will be published on the blog. Then readers will vote for the best. First place will get at least $2000, second place $1000, third place $500, though we might increase those numbers later on.
Use your expertise. The best entries will probably be about things YOU are already familiar with, things where you know about the mysteries the rest of us haven’t noticed yet.
All forms of media are welcome! We like to write really long stuff, and sometimes we just post our correspondence. But if you like to boil 'em instead of mash 'em (or stick them in a stew!), that's cool too. Podcasts, videos, slideshows, semaphore code, etc. are all welcome. All written finalists will be published on the blog. Finalists in other formats (e.g. videos, podcasts) will be linked to. The language shared by the judges is English, so we prefer submissions in English.
You must submit your entry under a pseudonym. This helps people discuss you and your work without having to say, "the guy or lady or person or team who wrote the SMTM mystery contest entry on pancreatic cancer". Instead they can say, "blorpShark's wonderful mysteries of pancreatic cancer review", which is much nicer.
Pseudonyms also keep famous people from having an advantage. For this reason, if you already go by a well-known pseudonym on the internet, please choose a new pseudonym for this contest.
Team submissions are strongly encouraged (friendship is the most powerful force in the universe), and we encourage you to pick a band name. Go to your nearest band name generator and pick the stupidest name it generates. For solo entries, we recommend a rap name generator, like Post Malone did.
After the contest is over, if you want to connect your pseudonym to your other name(s), please feel free to do so. If you do not provide a pseudonym, one will be provided for you.
If you submit a non-written entry, please send it to us in a form that is as anonymous as possible. For example, you might send a podcast entry as an audio file, or a video essay as a video file. Don’t mention your name in the recording, etc.
Please submit written entries by putting them in a Google doc and sharing that doc with us. We will try to preserve your formatting as best we can if we publish your entry as a finalist, but no promises. If you want to make sure your formatting appears as intended, use simple formatting (e.g. bold, italics, and images). The more complicated your formatting is, the more likely we are to make an error in copying it over.
Please don’t put your name or any hints about your identity in the Google doc itself. If you do, we may remove that information or disqualify your entry.
Please make sure that the Google doc is unlocked and that we can read it and share it with the other judges. Go to the "Share" button in the upper right, and at the bottom of the popup, click where it says "Restricted" and change it to "Anyone with the link". If you send us a document we can't read, we will probably disqualify you.
Frankly we reserve the right to disqualify entries for any reason, or no reason at all.
If you win, we will send you your prize money in the form of an envelope stuffed with cash, or something else if we agree that it’s more convenient.
Your due date is July 1st, 2023. If you have any questions, ask in the comments so other people who have the same questions can see. You can also email us or ask us questions on Twitter. Good luck!
Mystery 1: The Obesity Epidemic
The first mystery is the obesity epidemic itself. It's hard for a modern person to appreciate just how thin we all were for most of human history. A century ago, the average man in the US weighed around 155 lbs. Today, he weighs about 195 lbs. About 1% of the population was obese back then. Now it's about 36%.
Back in the 1890s, the federal government had a board of surgeons examine several thousand Union Army veterans who fought in the Civil War. This was several decades after the end of the war, so by this point the veterans were all in their 40s or older. This gives us a snapshot of what middle-aged white men looked like in the 1890s. When we look at their data, we find that they had an average BMI of about 23 (a BMI of 25 or more counts as overweight, and 30 or more as obese). Only about 3% of them were obese. In comparison, middle-aged white men in the year 2000 had an average BMI of around 28. About 24% were obese in early middle age, increasing to 41% by the time the men were in their 60s.
(Most experts consider measures like body fat percentage to be better measures of adiposity than BMI, and we agree. Unfortunately, nearly every source reports BMI, and most don’t report body fat percentage. Here, we use BMI so that we can compare different sources to one another.)
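If you want to sanity-check how the weight figures above line up with the BMI figures, BMI is just weight in kilograms divided by height in meters squared. Here's a minimal sketch of that arithmetic in Python; the 5 ft 9 in height is a hypothetical example (roughly average for American men), not a figure from any of the sources above.

```python
# Minimal sketch of the BMI arithmetic: weight (kg) divided by height (m) squared.
# The 5 ft 9 in height below is a hypothetical example, not from the sources above.

def bmi(weight_lbs: float, height_in: float) -> float:
    weight_kg = weight_lbs * 0.453592   # pounds to kilograms
    height_m = height_in * 0.0254       # inches to meters
    return weight_kg / height_m ** 2

def category(b: float) -> str:
    # Standard cutoffs: a BMI of 25 or more is overweight, 30 or more is obese.
    if b >= 30:
        return "obese"
    if b >= 25:
        return "overweight"
    return "not overweight"

for weight_lbs in (155, 195):           # the historical vs. modern average weights
    b = bmi(weight_lbs, 69)             # 69 inches = 5 ft 9 in
    print(f"{weight_lbs} lbs at 5 ft 9 in: BMI {b:.1f} ({category(b)})")
```

Run it and you get BMIs of roughly 23 and 29, which is why the 155-to-195 lb shift corresponds to the shift in average BMI described above.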
It’s not just that we’re a little fatter than our great-grandparents — the entire picture is different.
People in the 1800s did have diets that were very different from ours. But by conventional wisdom, their diets were worse, not better. They ate more bread and almost four times more butter than we do today. They also consumed more cream, milk, and lard. This seems closely related to observations like the French Paradox — the French eat a lot of fatty cheese and butter, so why aren’t they fatter and sicker?
Our great-grandparents (and the French) were able to maintain these weights effortlessly. They weren’t all on weird starvation diets or crazy fasting routines. And while they probably exercised more on average than we do, the minor difference in exercise isn’t enough to explain the enormous difference in weight. Many of them were farmers or laborers, of course, but plenty of people in 1900 had cushy desk jobs, and those people weren’t obese either.
Something seems to have changed. But surprisingly, we don’t seem to have any idea what that thing was.
Mystery 2: An Abrupt Shift
Another thing that many people are not aware of is just how abrupt this change was. Between 1890 and 1976, people got a little heavier. The average BMI went from about 23 to about 26. This corresponds with rates of obesity going from about 3% to about 10%. The rate of obesity in most developed countries was steady at around 10% until 1980, when it suddenly began to rise.
Trends in adult overweight, obesity, and severe obesity among men and women aged 20–74: United States, 1960–1962 through 2015–2016. SOURCES: NCHS, National Health Examination Survey and National Health and Nutrition Examination Surveys.
Today the rate of obesity in Italy, France, and Sweden is around 20%. In 1975, there was no country in the world that had an obesity rate higher than 15%.
This wasn’t a steady, gentle trend as food got better, or diets got worse. People had access to plenty of delicious, high-calorie foods back in 1965. Doritos were invented in 1966, Twinkies in 1930, Oreos in 1912, and Coca-Cola all the way back in 1886. So what changed in 1980?
Common wisdom today tells us that we get heavier as we get older. But historically, this wasn't true. In the past, most people got slightly leaner as they got older. Those Civil War veterans we mentioned above had an average BMI of 23.2 in their 40s and 22.9 in their 60s. In their 40s, 3.7% were obese, compared to 2.9% in their 60s. We see the same pattern in data from 1976-1980: people in their 60s had slightly lower BMIs and were slightly less likely to be obese than people in their 40s (see the table below). It isn't until the 1980s that we start to see this trend reverse. Something fundamental about the nature of obesity has changed.
Distribution of BMI and obesity prevalence, non-Hispanic white men in the US by time period and age group. Adapted from Helmchen & Henderson, 2003.
Mystery 3: The Ongoing Crisis
Things don't seem to be getting any better. A couple decades ago, rising obesity rates were a frequent topic of discussion, debate, and concern. But recently the topic has received much less attention; from the lack of press and popular coverage, you might reasonably assume that if we aren't winning the fight against obesity, we've at least fought it to a stalemate. Unfortunately, obesity rates in the United States have kept climbing.
Rates of obesity are also increasing worldwide. As The Lancet notes, “unlike other major causes of preventable death and disability, such as tobacco use, injuries, and infectious diseases, there are no exemplar populations in which the obesity epidemic has been reversed by public health measures.”
All of this is, to say the least, very mysterious.
Weird Mysteries
Then there are the weird mysteries.
Mystery 4: Hunter-Gatherers
A common assumption is that humans evolved eating a highly varied diet of wild plants and animals, that our bodies still crave variety, and that we would be better off with a more varied diet. But when we look at modern hunter-gatherers, we see this isn't true. The !Kung San of Botswana get about 40% of their calories from a single food source, the mongongo nut, with another 40% coming from meat. But the !Kung are extremely lean (about 110 lbs on average) and have excellent cardiovascular health.
Of course, variety isn't everything. You would also expect that people need to eat the right diet: a balanced one, with the right mix of macronutrients. But again, this doesn't seem to be the case. Hunter-gatherer societies around the world have incredibly different diets, some of them very extreme, and almost never suffer from obesity.
Historically, different cultures had wildly different diets — some hunter-gatherers ate diets very high in sugar, some very high in fat, some very high in starch, etc. Some had diets that were extremely varied, while others survived largely off of just two or three foods. Yet all of these different groups remained lean. This is strong evidence against the idea that a high-fat, high-sugar, high-starch, low-variety, high-variety, etc. diet could cause obesity.
The Hadza, a hunter-gatherer society in Tanzania, get about 15% of their calories from honey. Combined with all the sugar they get from eating fruit, they end up eating about the same amount of sugar as Americans do. Despite this, the Hadza do not exhibit obesity. Another group, the Mbuti of the Congo, eat almost nothing but honey during the rainy season, when it can provide up to 80% of the calories in their diet. These are all unrefined sugars, of course, but the Kuna of Panama, though mostly hunter-gatherers, also obtain white sugar and some sugar-containing foods from trade. Their diet is 65% carbohydrate and 17% sugar, which is more sugar than the average American currently consumes. Despite this, the Kuna are lean, with average BMIs around 22-23.
Kitava is a Melanesian island largely isolated from the outside world. In 1990, Staffan Lindeberg went to the island to study the diet, lifestyle, and health of its people. He found a diet based on starchy tubers and roots like yam, sweet potato, and taro, supplemented by fruit, vegetables, seafood, and coconut. Food was abundant and easy to come by, and the Kitavans ate as much as they wanted. “It is obvious from our investigations,” wrote Lindeberg, “that lack of food is an unknown concept, and that the surplus of fruits and vegetables regularly rots or is eaten by dogs.”
About 70% of the calories in the Kitavan diet came from carbohydrates. For comparison, the modern American diet is about 50% carbohydrates. Despite this, none of the Kitavans were obese. Instead they were in excellent health. Below, you’ll see a photo of a Kitavan man being examined by Lindeberg.
Kitavans didn’t even seem to gain weight in middle age. In fact, BMI was found to decrease with age. Many lived into their 80s or 90s, and Lindeberg even observed one man who he estimated to be 100 years old. None of the elderly Kitavans showed signs of dementia or memory loss. The Kitavans also had no incidence of diabetes, heart attacks, stroke, or cardiovascular disease, and were unfamiliar with the symptoms of these diseases. “The only cases of sudden death they could recall,” he reports, “were accidents such as drowning or falling from a coconut tree.”
Mystery 5: Lab Animals and Wild Animals
Humans aren’t the only ones who are growing more obese — lab animals and even wild animals are becoming more obese as well. Primates and rodents living in research colonies, feral rodents living in our cities, and domestic pets like dogs and cats are all steadily getting fatter and fatter. This can’t be attributed to changes in what they eat, because lab animals live in contained environments with highly controlled diets. They’re being fed the same foods as always, but for some reason, they’re getting fatter.
This seems to be true everywhere you look. Our pets may eat scraps from the table, but why would zoo animals, being fed by professionals, also be getting fatter? Even horses are becoming more obese. This is all very strange, and none of it fits with the normal explanations for the obesity epidemic.
Mystery 6: Palatable Human Food
It used to be that if researchers needed obese rats for a study, they would just add fat to normal rodent chow. But it turns out that it takes a long time for rats to become obese on this diet. A breakthrough occurred one day when a graduate student happened to put a rat onto a bench where another student had left a half-finished bowl of Froot Loops. Rats are usually cautious around new foods, but in this case the rat wandered over and began scarfing down the brightly-colored cereal. The graduate student was inspired to try putting the rats on a diet of "palatable supermarket food": not only Froot Loops, but foods like Doritos, pork rinds, and wedding cake. Today, researchers call these "cafeteria diets".
Sure enough, on this diet the rats gained weight at unprecedented speed. All this despite the fact that the high-fat and cafeteria diets have similar nutritional profiles, including very similar shares of calories from fat, around 45%. In both diets, rats were allowed to eat as much as they wanted. When you give a rat a high-fat diet, it eats the right amount and then stops eating, and maintains a healthy weight. But when you give a rat the cafeteria diet, it just keeps eating, and quickly becomes overweight. Something is making them eat more. "Palatable human food is the most effective way to cause a normal rat to spontaneously overeat and become obese," says neuroscientist Stephan Guyenet in The Hungry Brain, "and its fattening effect cannot be attributed solely to its fat or sugar content."
Rodents eating diets that are only high in fat or only high in carbohydrates don’t gain nearly as much weight as rodents eating the cafeteria diet. And this isn’t limited to lab rats. Raccoons and monkeys quickly grow fat on human food as well.
We see a similar pattern of results in humans. With access to lots of calorie-dense, tasty foods, people reliably overeat and rapidly gain weight. But again, it’s not just the contents. For some reason, eating more fat or sugar by itself isn’t as fattening as the cafeteria diet. Why is “palatable human food” so much worse for your waistline than its fat and sugar alone would suggest?
Mystery 7: Altitude
If you look at a map of county-level obesity data in the United States, the Rockies, the Sierra Mountains, and the Appalachians stand out quite clearly:
County-Level Estimates of Obesity among Adults aged 20 and over, 2009. Map from the CDC.
One paper, "Hypobaric Hypoxia Causes Body Weight Reduction in Obese Subjects" from Lippl et al. (2012), claims to show a reduction in weight at high altitude and suggests that this weight loss is attributable to differences in oxygen levels. However, there are a number of problems with this paper and its conclusions. To begin with, there isn't a control group, so this isn't an experiment. Without an appropriate control, it's hard to infer a causal relationship. What they actually show is that people brought to 2,650 meters lost a small amount of weight and had lower blood oxygen saturation, but this is unsurprising. Obviously if you bring people to 2,650 meters they will have lower blood oxygen, and there's no evidence linking that to the reported weight loss. They don't even report a correlation between blood oxygen saturation and weight loss, even though that would be the relevant test given the data they have. Presumably they don't report it because it's not significant. In addition, there are major issues with multiple comparisons, which make their few significant findings hard to interpret (for more detail, see our full analysis of the paper).
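To make that complaint concrete, here is a minimal sketch of the test we'd want to see reported: the correlation, across subjects, between the change in blood oxygen saturation and the change in weight. The numbers below are invented for illustration and are not data from Lippl et al.

```python
# Sketch of the analysis the paper doesn't report: does the drop in blood oxygen
# saturation track the weight loss across subjects? All numbers here are invented
# for illustration; they are NOT data from Lippl et al.
from scipy.stats import pearsonr

delta_spo2 = [-4.1, -2.3, -5.0, -3.2, -1.8, -4.4]        # change in SpO2 (percentage points)
delta_weight_kg = [-1.2, -0.4, -1.6, -0.9, -0.3, -1.1]   # change in body weight (kg)

r, p = pearsonr(delta_spo2, delta_weight_kg)
print(f"r = {r:.2f}, p = {p:.3f}")

# Even a significant correlation here wouldn't establish causation without a
# control group, but reporting it would at least test the proposed mechanism.
```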
Mystery 8: Diets Don’t Work
There's a lot of disagreement about which diet is best for weight loss. People spend a lot of time arguing over how to diet; I'm sure some have even come to blows over whether you lose more weight on keto or on the Mediterranean diet. But meta-analyses consistently find that there is little difference between different diets.
Some people do lose weight on diets. Some of them even lose a lot of weight. But the best research finds that diets just don’t work very well in general, and that no one diet seems to be better than any other. For example, a 2013 review of 4 meta-analyses said:
Numerous randomized trials comparing diets differing in macronutrient compositions (eg, low-carbohydrate, low-fat, Mediterranean) have demonstrated differences in weight loss and metabolic risk factors that are small (ie, a mean difference of <1 kg) and inconsistent.
Most diets lead to weight loss of around 5-20 lbs, with minimal differences between them. Now, 20 lbs isn’t nothing, but it’s also not much compared to the overall size of the obesity epidemic. And even if someone does lose 20 lbs, in general they will gain most of it back within a year.
Hello! If you're just joining us, check out achemicalhunger.com; the table of contents makes the series easier to navigate!