Liu, Xiaotong (State Key Laboratory of Coal Conversion, Institute of Coal Chemistry, Chinese Academy of Sciences, Taiyuan 030001, P. R. China)
Zhang, Tianfu (State Key Laboratory of Coal Conversion, Institute of Coal Chemistry, Chinese Academy of Sciences, Taiyuan 030001, P. R. China)
Yang, Tao (Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing Information Science and Technology University, Beijing 100101, P. R. China)
Liu, Xiulei (Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing Information Science and Technology University, Beijing 100101, P. R. China)
Traditionally, chemistry problems are solved by means of a deductive approach. The question to be addressed is typically related to the value of a property that is either measured experimentally, computed using quantum-chemistry software, or (more recently) predicted using a machine-learned model. In this paper, we demonstrate that an inductive approach can be adopted instead, using end-to-end (E2E) machine learning. We illustrate this approach on three chemistry problems: (i) determining the fully coordinated (FC) and undercoordinated (UC) atoms in a molecule with one missing atom, (ii) identifying the type of atom that is missing in such an incomplete molecule, and (iii) predicting the direction of a reaction between two molecules according to an existing dataset. The E2E approach leads to accuracies higher than 99%, 98%, and 93% for these three problems, respectively. To achieve such accuracies, we introduce a descriptor for molecules, called bag of clusters, and compare it with a series of previously proposed descriptors, highlighting several advantages.