Scientists in the United States have developed a machine learning algorithm that can predict the vast majority of COVID-19 surges at the county level across the US. Such a tool could have a powerful impact in protecting people, and could give healthcare systems up to six weeks to prepare before a major outbreak.
To reach this prediction level, the team fed the machine learning algorithm historical data on COVID-19 activity in 97 US counties, as well as search terms from Google, tweets, technical searches from medical professionals, and directions requested from Apple Maps. The algorithm was able to work out the relevant aspects that flagged the beginning of a surge in cases.
“We try to identify changes in human behavior that may be predictive of changes of transmission patterns in COVID-19,” corresponding author Professor Mauricio Santillana, from Harvard Medical School, told IFLScience. “When we are looking in real-time and we start seeing those patterns that historically have anticipated the emergence of an outbreak, then we can say, ‘okay, looks like an outbreak may happen in the coming one to six weeks’.”
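To make the idea concrete, here is a minimal Python sketch of one way an early warning could be raised from a single behavioral signal, such as weekly search volume. It is an illustration only and not the team's method: the function name `warning_weeks`, the four-week baseline, the z-score threshold, and the sample numbers are all assumptions.

```python
from statistics import mean, stdev


def warning_weeks(signal, baseline_weeks=4, z_threshold=2.0):
    """Return the indices of weeks where the signal jumps well above its recent baseline.

    Hypothetical rule: a week is flagged when its value sits at least
    `z_threshold` standard deviations above the mean of the preceding
    `baseline_weeks` weeks.
    """
    flagged = []
    for week in range(baseline_weeks, len(signal)):
        history = signal[week - baseline_weeks:week]
        spread = stdev(history)
        if spread == 0:
            # A perfectly flat baseline gives no scale to compare against.
            continue
        z_score = (signal[week] - mean(history)) / spread
        if z_score >= z_threshold:
            flagged.append(week)
    return flagged


# Made-up weekly counts of a symptom-related search term in one county.
searches = [10, 11, 10, 12, 11, 10, 25, 40, 42, 41]
print(warning_weeks(searches))
```

With these invented numbers, the rule flags the two weeks in which searches first climb away from the baseline, then goes quiet once the higher level becomes the new normal. A real system would combine many signals and learn which ones have historically preceded outbreaks.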
The Centers for Disease Control and Prevention (CDC) already uses models based on epidemiological indicators to predict how COVID-19 infections and deaths evolve over time. These models often struggle to capture rapid changes, such as surges, because they do not receive fast, consistent, and reliable data.
“If we want to be responsive in real-time to outbreaks that are about to emerge, we want those models to be very accurate during those times when things change from being normal to a lot of cases,” Santillana explained to IFLScience.
The new approach identified 87 percent of the outbreaks in data collected between January 2020 and January 2022. Compared to the standard CDC approach, the machine learning approach could predict outbreaks within a similar timeframe or up to six weeks in advance. The system has limitations when it comes to small and rural counties with low internet usage, but it is a step forward in predicting rapid changes in COVID-19 infections.
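A figure like the 87 percent above is a detection rate: the share of outbreaks that were preceded by a warning. The Python sketch below shows how such a rate could be computed, assuming a warning counts only if it comes one to six weeks before an outbreak begins. The article does not spell out the study's exact scoring rule, so the function `detection_rate`, its parameters, and the week numbers are hypothetical.

```python
def detection_rate(outbreak_weeks, warnings_issued, min_lead=1, max_lead=6):
    """Return the fraction of outbreaks preceded by a warning in the lead window.

    An outbreak counts as detected if at least one warning was issued
    between `min_lead` and `max_lead` weeks before its onset (assumed rule).
    """
    if not outbreak_weeks:
        return 0.0
    detected = sum(
        any(min_lead <= onset - warned <= max_lead for warned in warnings_issued)
        for onset in outbreak_weeks
    )
    return detected / len(outbreak_weeks)


# Made-up onset weeks of outbreaks and weeks in which warnings were issued.
outbreaks = [10, 30, 52, 80]
alerts = [6, 27, 60, 79]
rate = detection_rate(outbreaks, alerts)
print(f"{rate:.0%}")
```

In this toy example, three of the four outbreaks have a warning in the preceding six weeks, so the rate is 75 percent. The warning in week 60 precedes no outbreak and would count as a false alarm, which this simple measure does not penalize.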
Santillana explained that such an approach could be used to help protect vulnerable people and to better allocate medical resources. In an ideal world, we might end up with a scenario similar to the Y2K scare, where the system's prediction is taken so seriously by everyone that the problem is solved before it materializes.
The work is published in the journal Science Advances.