The science community is still reeling from the loss of 20 million irreplaceable artifacts after a fire tore through Brazil’s oldest scientific institution earlier this month. In a “call to arms”, a team of scientists found just 3 to 4 percent of recorded fossil locations around the world are accounted for in published literature – a finding they hope will help inspire the scientific community to digitize collections around the world.
"In the wake of the fire, my reaction was one of heartbreak, dismay, and shock,” said lead author Dr Charles Marshall in a statement. “As scientists, seeing a fire like this is akin to learning your parent's house has just burnt to the ground. It's time for government and funding agencies to step up investment in the digitization of natural history collections and preserve our world heritage for decades to come."
Publishing their work in Biology Letters, the researchers call the specimens sitting on back shelves and in storage rooms that have not been published or documented digitally “dark data”. They say these records could be lost in future disasters if underfunded museums around the world don’t invest in the digital preservation of their collections.
"The fossil record offers invaluable insight into our planet's ecological and evolutionary past," said study co-author Dr Peter Roopnarine. "Yet published literature only documents a fraction of the fossils housed in museum collections. Digitizing specimens preserve valuable data and make it readily accessible to researchers everywhere."
Before computers, fossil datasets were compiled by hand, took years of effort, and were still difficult to tabulate. A first “digital revolution” started nearly 30 years ago with the launch of several online databases based on published literature, many of which are still growing, including the Paleobiology Database (PBDB). Today, the authors note, we are seeing a second digital revolution as almost a dozen institutions digitally catalog fossils from collections that haven’t been cited in published work, in a database called Eastern Pacific Invertebrate Communities of the Cenozoic (EPICC). This partnership of 10 natural history museums is digitizing marine invertebrate fossils from the last 66 million years found along the Americas' western coast, from Chile to Alaska.
The authors compared the number of locations represented in the PBDB with those tallied in EPICC for Washington, Oregon, and California. For every location where fossils were collected and recorded in the literature-based database, 23 others were represented only by specimens on museum shelves.
"What this means is that within most of the great museums of the world there are specimens that have not been fully utilized to understand the nature of our planet, how ecosystems responded to climate change in the past, and how they'll respond moving forward," said Marshall. "We need that perspective to forecast the future."
Over the course of the three-year study, digital technologies allowed the team to analyze hundreds of thousands of specimens. In doing so, they say they continue to make new discoveries simply by going deeper into this "untapped dark data".