A “ridiculously powerful” supercomputer has identified over 100,000 new RNA viruses – including nine never-before-seen coronaviruses – by trawling through vast quantities of biological samples.
As reported in the journal Nature, an international team of researchers used the supercomputer to go through 20 million gigabytes of gene sequence data from 5.7 million biological samples gathered from all kinds of environments, from ice core samples to animal poop.
This search revealed 132,000 RNA viruses (just 15,000 of which were previously known to science) and nine new species of coronaviruses.
By pooling all of this information, the team hopes its work could be used to stamp out future disease outbreaks – and perhaps even head off the next big pandemic.
“We’re entering a new era of understanding the genetic and spatial diversity of viruses in nature, and how a wide variety of animals interface with these viruses. The hope is we’re not caught off guard if something like SARS-CoV-2 – the novel coronavirus that causes COVID-19 – emerges again,” Dr Artem Babaian, an independent researcher who worked on the project, said in a statement.
“These viruses can be recognized more easily and their natural reservoirs can be found faster. The real goal is these infections are recognized so early that they never become pandemics.”
“If a patient presents with a fever of unknown origin, once that blood is sequenced, you can now connect that unknown virus in the human to a way bigger database of existing viruses. If a patient, for example, presents with a viral infection of unknown origin in St. Louis, you can now search through the database in about two minutes, and connect that virus to, say, a camel in sub-Saharan Africa sampled in 2012,” added Dr Babaian.
The most infamous coronavirus right now is, of course, SARS-CoV-2 – but there is also SARS-CoV, which emerged in 2002, and MERS-CoV, which causes Middle East respiratory syndrome (MERS).
Along with those particularly dangerous coronaviruses, there are four that typically cause no more than a common cold in humans. The nine newly identified coronaviruses likely originated in pigs, birds, and bats, but it’s not known whether they have the power to infect humans.
The supercomputer used in this research has the computing capacity of 22,500 typical CPUs. It would take a traditional supercomputer over a year (not to mention hundreds of thousands of dollars) to perform this analysis, but this project completed the job in 11 days for $24,000.
The study could improve pathogen surveillance, helping to anticipate and mitigate future pandemics. Despite this bold promise, Dr Babaian says he started it as a “fun side project.”
As side projects go, this one went on to become pretty major.