Because it is a barometer of a community’s psychological well-being, Twitter can also serve as an effective tool for predicting that community’s rate of heart disease, according to a study published online this week in the journal Psychological Science.
In fact, Twitter (or, more specifically, the language used in Twitter posts) was found to be a stronger predictor of county-level heart disease rates than a combination of the more traditional risk factors tracked by epidemiologists, such as income, education, smoking, diabetes and obesity.
That finding is particularly intriguing given that, as the authors of this study point out, the median age of Twitter users is 31, much younger than the typical person at risk for heart disease.
“It is not obvious why Twitter language should track heart-disease mortality,” the researchers write. “The people tweeting are not the people dying. However, the tweets of younger adults may disclose characteristics of their community, reflecting a shared economic, physical, and psychological environment.”
Millions of Tweets
For the study, the researchers data-mined 148 million randomly selected public tweets sent from 1,347 counties across the United States from June 2009 to March 2010. The residents in those counties represent about 88 percent of the U.S. population.
The researchers tagged language in the tweets — single words and word groupings (topics) — that conveyed both negative and positive emotions, such as anger, hostility, anxiety, fatigue, optimism and excitement. Next, they determined the relative frequencies of those expressions of emotions for each of the 1,347 counties. (The relative frequency of the word hate, for example, ranged from 0.009 percent to 0.139 percent across the counties.)
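For readers curious about what a "relative frequency" looks like in practice, here is a minimal Python sketch. It is not the researchers' code: the tokenizing rule, the function name relative_frequency and the two toy counties with their tweets are all invented for illustration. It simply counts how often one word appears as a share of all words tweeted in a county.

```python
import re
from collections import Counter


def relative_frequency(tweets, word):
    """Return the share of all word tokens in `tweets` that equal `word`.

    Hypothetical helper: the tokenizer (lowercase letters and apostrophes)
    is an assumption, not the study's actual text-processing pipeline.
    """
    tokens = []
    for tweet in tweets:
        tokens.extend(re.findall(r"[a-z']+", tweet.lower()))
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens)


# Invented toy data, for illustration only (not from the study).
tweets_by_county = {
    "County A": ["I hate Mondays", "hate this traffic so much"],
    "County B": ["wonderful day with friends", "what an opportunity"],
}

rates = {
    county: relative_frequency(tweets, "hate")
    for county, tweets in tweets_by_county.items()
}

for county, rate in rates.items():
    print(f"{county}: {rate:.1%} of words are 'hate'")
```

With millions of real tweets, the shares would be far smaller than in this toy example, in line with the fractions of a percent the study reports.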
That information was then compared with county-level data on death rates for heart disease. The researchers found that counties with high frequencies of negative tweets (messages that contained words like hate or expletives) tended to have higher rates of heart disease, while those with high frequencies of positive tweets (messages with words like friends, opportunity or wonderful) had lower rates. That correlation held even after adjusting for variables such as age, income and education.
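As a rough illustration of how such a comparison can be made, the Python sketch below computes a plain Pearson correlation between two lists of county-level numbers. It is a simplification under stated assumptions: the figures are invented, the function name pearson_r is hypothetical, and the calculation does not adjust for age, income or education as the study's analysis did.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented numbers for five hypothetical counties (not from the study):
# share of negative words in tweets (percent) and a heart-disease death rate.
negative_word_pct = [0.01, 0.03, 0.05, 0.08, 0.12]
death_rate = [40.0, 45.0, 47.0, 55.0, 60.0]

r = pearson_r(negative_word_pct, death_rate)
print(f"Pearson r = {r:.2f}")
```

A value near 1 means the two measures rise together, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship.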
In fact, when the researchers compared the Centers for Disease Control and Prevention’s (CDC) county-level data on heart-disease risk with their own estimates based on Twitter language, they found them to be remarkably similar.
A reflection of a psychological ‘atmosphere’
What might explain the correlation between a community’s Twitter-language trends and its heart-disease risk, especially given that the age group sending the tweets is not the same one being diagnosed with the illness?
Plenty of research has reported an association between negative psychological factors, such as stress, anger, anxiety and depression, and an increased risk of heart disease. By reflecting the overall psychological atmosphere of a community, Twitter language may be telling us how pervasive those negative factors are within the community, say the study’s authors.
“Certainly, hostility and anger is very likely to spread person to person,” the study’s lead author, Johannes Eichstaedt, a graduate student at the University of Pennsylvania, told the Washington Post. “So even if we both live in the most beautiful neighborhood in New York City, and I’m really, really angry and I’m on the road with you, you will get some of that anger.”
Of course, this isn’t the first study in which social media has been found useful for identifying health trends. As Eichstaedt and his colleagues point out, researchers have used Twitter to successfully track Lyme disease, H1N1 influenza, depression and other common illnesses. And a 2009 study reported that Google search queries provided an earlier indication of the spread of the flu than traditional tracking methods undertaken by the CDC.