Data is the raw material for the 21st century. The NOMAD (Novel Materials Discovery) Laboratory, a European Centre of Excellence (CoE) (https://NOMAD-CoE.eu), creates, collects, stores, and cleanses a large volume of computational materials science data, derived from the most important materials science codes available today. In addition, the NOMAD Laboratory CoE develops tools for mining this data in order to find structure, correlations, and novel information that could not be discovered from studying smaller data sets. Together, the large volume of data and innovative tools will enable researchers in basic science and engineering to advance materials science, identify new physical phenomena, and help industry to improve existing products and technologies and to develop novel ones (Fig. 1).

The data repository developed and maintained by NOMAD is now the largest repository for input and output files of computational materials science worldwide, containing the files from several million high-quality calculations. The volume of files made available through the NOMAD Repository is steadily increasing, as the computational materials-science community uses millions of CPU hours every day in high-performance-computing (HPC) centers worldwide. Importantly, the NOMAD Repository contains data from researchers all over the world, and from other databases, e.g. AFLOWlib and OQMD. Unlike other repositories, the NOMAD Repository is not restricted to selected computer codes but serves all important codes currently used in computational materials science. The Repository also helps the computational materials science community to host and organize its data, and to make it available to others in a highly efficient way (Fig. 2, http://repository.nomad-coe.eu/, and https://www.youtube.com/watch?v=L-nmRSH4NQM). NOMAD has also contributed to data organization by defining metadata to unambiguously label key quantities in the field.
As the NOMAD Repository data is generated by many different computer codes, it is heterogeneous and therefore hard to integrate and to use directly for data analytics and extensive comparisons. Peta- (toward exa-) scale high-throughput calculations are wasted without deeper analytics of the results. Nevertheless, most other exascale-computing initiatives currently focus on hardware and software challenges, while extreme-scale aspects of Big-Data remain under-explored, in particular for materials science and engineering. In the first 15 months of the NOMAD project, the consortium has developed ways to convert the existing open-access data of the NOMAD Repository into a common, code-independent format, developing numerous parsers and creating the NOMAD Archive. In this respect, NOMAD stands out from other materials-genomics initiatives. Our code-independent Archive enables a leap forward in computational materials science by providing a basis for deeper analytics.

In parallel with the creation of the Archive, we have developed tools to exploit the extensive Archive data, including the NOMAD Encyclopedia, Big-Data Analytics, and Advanced Graphics (Fig. 3). The NOMAD Encyclopedia represents a user-friendly, public access point to the extensive knowledge contained in the NOMAD Archive. For the first time, we will be able to see, compare, explore, and comprehend computations from international researchers that will help us to understand structural, mechanical, and thermal behaviors of a large variety of materials, their electronic properties, responses to external excitations, and more.

The NOMAD Big-Data Analytics Toolkit will help NOMAD users to identify correlations and structure in the Big-Data of the Archive. This will help scientists and engineers to select which materials will be most useful for specific applications, or to predict and identify promising new materials with specific sets of desirable properties that are worth further exploration.

Seeing helps understanding. Consequently, NOMAD is developing an infrastructure for remote visualization of the multi-dimensional NOMAD data. Our virtual-reality environment will enable interactive data exploration, as well as enhanced training and dissemination. The remote visualization system will allow users to access data and tools using standard devices (laptops, smartphones), independent of their location. NOMAD Laboratory CoE users will be able to use our virtual-reality software to collaboratively study complex n-dimensional systems in an intuitive way, paving the way for visual analytics.
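The conversion of heterogeneous, code-specific output into a common, code-independent format can be illustrated with a minimal sketch. It is not the NOMAD parser infrastructure: the two "codes", their key names and units, the `ArchiveEntry` record, and the `to_archive` function are all hypothetical and chosen only to show the idea of one parser per code feeding a shared schema with uniform units.

```python
from dataclasses import dataclass

# Conversion factors to electronvolts (rounded CODATA values).
EV_PER_HARTREE = 27.211386
EV_PER_RYDBERG = 13.605693


@dataclass(frozen=True)
class ArchiveEntry:
    """Hypothetical code-independent record; field names are illustrative."""
    program_name: str
    chemical_formula: str
    total_energy_ev: float


def parse_code_a(raw: dict) -> ArchiveEntry:
    # Assumption: "code A" reports the total energy in Hartree under 'E_tot'.
    return ArchiveEntry("code_a", raw["formula"], raw["E_tot"] * EV_PER_HARTREE)


def parse_code_b(raw: dict) -> ArchiveEntry:
    # Assumption: "code B" reports the total energy in Rydberg under 'energy_ry'.
    return ArchiveEntry("code_b", raw["system"], raw["energy_ry"] * EV_PER_RYDBERG)


# One parser per code; supporting a new code means registering one more function.
PARSERS = {"code_a": parse_code_a, "code_b": parse_code_b}


def to_archive(program: str, raw: dict) -> ArchiveEntry:
    """Dispatch raw output to the matching parser and return a common record."""
    try:
        parser = PARSERS[program]
    except KeyError:
        raise ValueError(f"no parser registered for {program!r}") from None
    return parser(raw)


if __name__ == "__main__":
    entries = [
        to_archive("code_a", {"formula": "Si2", "E_tot": -1.0}),
        to_archive("code_b", {"system": "Si2", "energy_ry": -2.0}),
    ]
    for entry in entries:
        print(entry)
```

Once both records share field names and units, their energies can be compared directly, which is the kind of cross-code comparison the raw files do not allow.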

The NOMAD (Novel Materials Discovery) Laboratory A European Centre of Excellence

The first 15 months

©NOMAD, 2017

Figure 1. The NOMAD Laboratory CoE

Figure 2. The NOMAD Repository

@NoMaDCoE @nomadCoE

High-performance computing expertise and hardware enable the NOMAD Laboratory CoE to meet the demands of the Encyclopedia, the Big-Data Analytics Toolkit, and the visualization tools. This covers the design and operation of the underlying computing platform for the NOMAD services, as well as application support for both HPC and Big-Data analytics and the corresponding workflows. Through the NOMAD Laboratory CoE, academic and industrial users alike will be able to leverage European HPC capabilities by gaining access to meaningful, useful presentations of computational materials science data already computed by HPC centers and by using the HPC resources that support delivery of NOMAD tools and services.

NOMAD also performs high-quality calculations for materials where important information is missing in the Archive. We are carefully listening to suggestions from our industrial colleagues about the most needed calculations. For example, as requested by Siemens, a novel thermal-conductivity calculation approach has been developed, which for the first time enables accurate calculations for materials from very low to very high thermal conductivity. Systematic calculations of heat-transport tensors for many materials will be started soon. I-deals, a company coordinating the Methanol fuel from CO2 (MefCO2) project, is interested in the catalytic activation of CO2, which is presently being studied by the NOMAD team to examine various potential catalyst materials, e.g. carbides and oxides. We will also perform studies to develop thin coating films to protect novel hybrid perovskite solar cells from degradation in moist environments, perform systematic high-throughput screening of potential transparent oxide semiconductors, and more.

The data and tools of the NOMAD Laboratory CoE will be made freely available to anyone wishing to use them. The Materials Encyclopedia web interface and API will soon be available through the project website, and a number of Big-Data Analytics tutorials are already available (https://analytics-toolkit.nomad-coe.eu/home/). Videos showcasing our virtual-reality environment are also available now on the website (https://nomad-coe.eu/index.php?page=graphics). In addition, we will make available APIs to facilitate data downloading.

To ensure that the NOMAD Laboratory CoE achieves maximum impact and benefit, we are conducting extensive outreach to industrial and academic end-users. We have hosted an Academic Workshop and two Industry Meetings, with a third Industry Meeting planned for 5-6 February 2018. We have also conducted numerous Industry Interviews and will continue to seek industry feedback on our tools and services. In 2017, we are organizing an Academic Workshop (http://th.fhi-berlin.mpg.de/meetings/BDMS2017/) and a Summer School (http://meetings.nomad-coe.eu/nomad-summer-2017/), open to both industry and academia, in collaboration with the Psi-k and CECAM networks.

Figure 3. Overall NOMAD Concept

TEAM

Matthias Scheffler (NOMAD Coordinator, Member of the Executive Team, Leader of Raw Data Conversion, Selection and Compression and Big-Data Analytics), Fawzi Mohamed (Deputy Leader Raw Data Conversion, Selection and Compression), Luca Ghiringhelli (Deputy Leader Big-Data Analytics), Christian Carbogno, Ankit Kariyaa, Bryan Goldsmith, Angelo Ziletti, Björn Bieniek, Danilo Brambila, Benjamin Regler, Daria Tomecka, Emre Ahmetcik, Xiangyue Liu, Sergey Levchenko, Christopher Sutton, Matthias Rupp, Aliaksei Mazheika, Yanggang Wang, Zhong-Kang Han

Fritz Haber Institute of the Max Planck Society

Claudia Draxl (Member of the Executive Team, Leader of the NOMAD Encyclopedia), Georg Huhs (Deputy Leader of the NOMAD Encyclopedia), Pasquale Pavone, Ioan Vancea, Junhgo Shin, Lorenzo Pardini, Santiago Rigamonti, Benedict Hoock, Axel Hübner, Maria Troppenz, Sven Lubeck, Andris Gulans

Humboldt-Universität zu Berlin

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 676580. The materials presented and views expressed here are the responsibility of the author(s) only. The EU Commission takes no responsibility for any use made of the information set out.

Angel Rubio (Leader Industry Networking), Alejandro Perez, Henning Glawe, Ask Hjorth Larsen, Micael José Tourdot de Oliveira, Adriel García Domínguez, Joaquim Jornet-Somoza, Carlos de Armas

Max Planck Institute for the Structure and Dynamics of Matter

Kristian Sommer Thygesen (Deputy Leader Raw Data Conversion, Selection and Compression), Mikkel Strange

Technical University of Denmark

Alessandro De Vita (Leader Industry Networking), Martina Stella, Ádám Fekete, Alessio Comisso

King’s College London

Daan Frenkel (Deputy Leader Big-Data Analytics), Gábor Csányi (Deputy Leader Big-Data Analytics), Carl Poelking

University of Cambridge

Francesc Illas (Leader Communication, Dissemination and Exploitation), Stefan Bromley (Leader Communication, Dissemination and Exploitation), Helena Muñoz Galan, Francesc Viñes, Ask Hjorth Larsen, Rosendo Valero

Universitat de Barcelona

Risto Nieminen (Member of the Executive Team, Deputy Leader Advanced Graphics), Patrick Rinke (Deputy Leader Advanced Graphics), Adam Foster, Martha bin Zaidan, Filippo Federici, Lauri Himanen, Marc Jäger, Milica Todorovic, Annika Stuke, Henri Paulamäki, Kunal Ghosh, Eiaki Morooka

Aalto University

Kimmo Koski (Member of the Executive Team, Leader HPC Services and Infrastructure), Janne Ignatius (Leader HPC Services and Infrastructure), Atte Sillanpää (Leader HPC Services and Infrastructure), Sri Harsha Vathsavayi, Aleksi Kallio, Ari Lukkarinen

CSC - IT Center for Science Ltd

Arndt Bode (Leader Advanced Graphics), Rubén García Hernández (Leader Advanced Graphics), Christoph Anthes

Leibniz Supercomputing Centre

Stefan Heinzel (Deputy Leader HPC Services and Infrastructure), Hermann Lederer (Deputy Leader HPC Services and Infrastructure), Markus Rampp, Thomas Zastrow, Michele Compostella, Giuseppe Di Bernardo

Max Planck Computing and Data Facility

José Maria Cela (Deputy Leader of the NOMAD Encyclopedia), Georg Huhs (Deputy Leader of the NOMAD Encyclopedia), Luz Calvo, Fernando Cucchietti, Guillermo Marin, Maria-Cristiana Marinescu, Sergi Madona Soria, Monica De Mier Torrecilla, Artur Garcia Saez, Diana Fernanda Velez Garcia

Barcelona Supercomputing Center

Ciaran Clissmann (Leader Communication, Dissemination and Exploitation), Kylie O’Brien

Pintail Ltd