CERN Just Released 300 Terabytes Worth Of Data To The Public


Dr. Alfredo Carpineti

Senior Staff Writer & Space Correspondent

Apr 25 2016, 20:07 UTC
The CMS experiment under construction. Julian Williams via Wikimedia Commons

If you’ve ever dreamed of working on the largest experiment in the world, you can now make that dream a reality from the comfort of your own home. CERN has just released more than 300 terabytes (TB) of high-quality open data from its CMS collaboration.


The release includes 100 TB collected at the Large Hadron Collider (LHC) by the CMS detector in 2011. It covers the raw datasets used by the scientists, simplified datasets that can be analyzed with very little computational power, and even simulated data, which helps researchers know what to expect from the experiment.

The release is part of an effort by CMS to make its data publicly available. CERN is funded by government money from its 21 member states, so releasing this information lets the general public see what their money is spent on. The collaboration previously released 27 TB of data in 2014.

All the data can be easily visualized in the CERN virtual machine. CMS Collaboration

“As scientists, we should take the release of data from publicly funded research very seriously,” says Salvatore Rappoccio, a CMS physicist, in a statement. “In addition to showing good stewardship of the funding we have received, it also provides a scientific benefit to our field as a whole. While it is a difficult and daunting task with much left to do, the release of CMS data is a giant step in the right direction.”


Beyond being a good public engagement tool, the release has scientific benefits. The data can now be accessed by other scientists, who can take a fresh look at the information and maybe discover something that has been missed so far. It will also serve as an educational tool, as the CERN virtual machine can be downloaded and used to help train undergraduates and high school students to be the next generation of particle physicists.

The CMS data is available on the CERN open data portal. CMS stands for Compact Muon Solenoid and is one of two general-purpose experiments looking for potential new particles produced through proton collisions. CMS and its counterpart ATLAS were responsible for the discovery of the Higgs boson in 2012. 

