Image by Olimpia Zagnoli, New York Times
So, for fun, I compared it to unemployment data and found a strong negative correlation. More marijuana use, less unemployment.
Mr. Wardle liked my assignment. Granted, his sense of humour was famous; on the weekly ten-point quizzes, we got a bonus mark for adding a caption to a Far Side cartoon, and he announced the funniest caption the following class.
When he handed back the assignment, he reminded me of one key rule of research:
Correlation does not imply causation.
Which brings us to an op-ed in yesterday's New York Times. Seth Stephens-Davidowitz, an economist interning at Google, presented evidence from Google searches for possible causes of depression in the United States. After unemployment, what was the best predictor of searches for depression?
I tested dozens of variables in many different categories. The strongest predictor by far: an area’s average temperature in January. Colder places have higher rates of depression, with the correlation concentrated in the colder months. The relationship between weather and mental health has been debated, but those debates have generally relied on “small” data. Google searches, the biggest data source we currently have, are unambiguous: when it comes to our happiness, climate matters a great deal.
Paging Mr. Wardle, wherever you are.
What else happens in January in cold places?
It is dark. You don't need to be a mental health expert to know about 'seasonal affective disorder'.
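Here is a minimal sketch of that point in Python, using made-up numbers rather than any real data. It assumes a toy world where daylight drives both temperature and searches for depression, and where temperature has no direct effect at all. The function names (`simulate`, `pearson`, `partial_corr`) and every coefficient are invented for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs (first-order partial correlation)."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=5000, seed=0):
    """Generate hypothetical areas where daylight alone drives both other variables."""
    rng = random.Random(seed)
    # Hours of January daylight per area (invented range).
    daylight = [rng.uniform(6, 11) for _ in range(n)]
    # Temperature depends on daylight plus noise.
    temperature = [4 * d - 30 + rng.gauss(0, 3) for d in daylight]
    # Searches depend on daylight plus noise, and NOT on temperature.
    searches = [100 - 5 * d + rng.gauss(0, 3) for d in daylight]
    return daylight, temperature, searches


daylight, temperature, searches = simulate()
print("temperature vs searches:", round(pearson(temperature, searches), 2))
print("same, controlling for daylight:",
      round(partial_corr(temperature, searches, daylight), 2))
```

In this toy world the raw correlation between temperature and searches comes out strongly negative, while the correlation after controlling for daylight is close to zero. That does not prove darkness explains the real Google data. It only shows that a strong correlation with temperature is exactly what you would see if it did.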
The availability of internet search data allows researchers to probe questions previously answered only with high-effort, small-sample opinion polls. There can be real value in analyses with Google Trends or other storehouses of search data. The Centers for Disease Control, for example, works with Google because the number of people in an area searching for information on the flu turned out to be the best available indicator of a flu outbreak. On a simpler note, want to know whether people are more likely to use the term "climate change" or the term "global warming"? Try Google Trends, and you'll see the answer is clearly "global warming".
So by all means, examine data with Google Trends. Just remember Mr. Wardle's lesson: correlation does not imply causation. Even Stephens-Davidowitz seemed to understand this, at least in the case of one variable:
More Hispanic-Americans meant fewer searches (though this might have been a result of language factors).
Might have. You think?