Liu, Xiaotong (Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing Information Science and Technology University, P. R. China)
Author
De Breuck, Pierre-Paul (UCLouvain)
Author
Wang, Linghui (School of Computer, Beijing Information Science and Technology University, No. 35 Beisihuan Middle Road, Beijing 100101 Beijing, P. R. China)
Machine-learning models have recently achieved enormous success in predicting the properties of materials. They are often trained on data of varying accuracy, typically with much less high-fidelity than low-fidelity data. To extract as much information as possible from all available data, we introduce an approach that aims to improve the quality of the data through denoising. We investigate the possibilities it offers for predicting the band gap using both limited experimental data and density-functional theory calculations relying on different exchange-correlation functionals. After thoroughly analyzing the raw data, we explore different ways to combine the data into training sequences and analyze the effect of the chosen denoiser. We also study the effect of applying the denoising procedure several times until convergence. Finally, we compare our approach with various existing methods for exploiting multi-fidelity data and show that it provides an improvement.
Liu, X., De Breuck, P.-P., Wang, L., & Rignanese, G.-M. (2022). A simple denoising approach to exploit multi-fidelity data for machine learning materials properties. npj Computational Materials, 8(1), 233. https://doi.org/10.1038/s41524-022-00925-1