Abstract
Recent computational materials discovery efforts have produced an enormous number of predictions of previously unknown, potentially stable inorganic crystalline compounds. In particular, both high-throughput screening and generative models have benefited tremendously from recent advances in computational resources and available data. However, these efforts are currently limited to predicting pristine crystalline materials. As a consequence, many of these predictions cannot be realized in experiments, where kinetic effects, defects, and crystallographic disorder can be crucial. To address this shortcoming, the current work aims to introduce disorder into computational materials discovery with machine learning (ML) based classification models. Trained on the Inorganic Crystal Structure Database (ICSD), these classifiers capture the chemical trends of crystallographic disorder and estimate the prevalence of disorder in computational databases produced by the Materials Project or Graph Networks for Materials Exploration (GNoME) initiatives. This opens the door towards disorder-aware computational materials discovery workflows, bridging the gap between prediction and experiment.