Using Search Query Logs to Predict Influenza Activity

    Seasonal influenza poses a significant health threat. Although considerable resources are devoted to influenza surveillance, surveillance information is limited and is at least one week old by the time it becomes available. There are also no good mechanisms for aggregating information for forecasting purposes. As people rely more and more on the Internet to satisfy their information needs, their online behavior may reveal their interests and concerns. An increase in influenza-related queries at a major search engine may reflect an increase in the number of people who have an influenza-like illness or are concerned about acquiring one. Search data may therefore contain rich and timely information for predicting influenza activity. Motivated by this conjecture, we study the temporal association between influenza-related search frequency and influenza disease activity. Using influenza-related search data obtained from Yahoo! Search and U.S. influenza mortality and positive culture data from the Centers for Disease Control and Prevention, we show that influenza-related search frequency is statistically related to both influenza mortality and influenza positive cultures. An increase in influenza-related search activity seems to precede an increase in influenza positive cultures by 4 weeks, and an increase in deaths from pneumonia and influenza by 7 weeks. Our results, although limited by the data that we use, suggest that search term surveillance may represent a novel, powerful, and inexpensive way of performing supplemental disease surveillance.
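One simple way to picture the lead-lag relationship described above is a lagged correlation between two weekly series. The sketch below is illustrative only: the abstract does not specify the statistical model used, so Pearson correlation is an assumed stand-in here, the function names (`pearson`, `lagged_correlations`, `best_lag`) are hypothetical, and the two series are synthetic toy data, not Yahoo! Search or CDC figures.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(search, disease, max_lag):
    """Correlate search[t] with disease[t + lag] for each lag in 0..max_lag (weeks)."""
    result = {}
    for lag in range(max_lag + 1):
        x = search[: len(search) - lag]  # search activity in week t
        y = disease[lag:]                # disease activity in week t + lag
        result[lag] = pearson(x, y)
    return result


def best_lag(search, disease, max_lag):
    """Return the lag (in weeks) with the highest correlation."""
    corrs = lagged_correlations(search, disease, max_lag)
    return max(corrs, key=corrs.get)


# Synthetic toy data: the "disease" series is the "search" series delayed by 3 weeks.
SEARCH = [1, 2, 4, 8, 5, 3, 2, 1, 1, 2, 5, 9, 6, 3, 2, 1, 1, 1, 1, 1]
DISEASE = [1, 1, 1] + SEARCH[:-3]

if __name__ == "__main__":
    print(best_lag(SEARCH, DISEASE, max_lag=6))
```

A real analysis would also have to account for seasonality and autocorrelation in both series, which a raw correlation like this ignores.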