Abstract
In silico materials design has long faced a fundamental tradeoff among accuracy, universality, and efficiency. In 2022, we pioneered the concept of a universal machine learning interatomic potential (UMLIP) [Chen & Ong, Nat. Comput. Sci., 2022, 2, 718–728], a foundational materials model (FMM) with comprehensive coverage of the periodic table. FMMs enable accurate, large-scale simulations across a broad spectrum of materials, offering transformative potential for materials discovery and design. More recently, the field has seen a trend toward increasingly complex FMM architectures, often with over 10 million parameters, trained on datasets exceeding 100 million structures. This trend has been driven largely by major tech companies such as Google DeepMind and Microsoft, as well as various startups. In this talk, I challenge the prevailing “bigger is better” paradigm in FMM development. I will present MatPES, a foundational, community-curated potential energy surface (PES) dataset of ~400,000 structures. Leveraging MatPES, we demonstrate that gains in FMM performance are primarily driven by data quality, and that there is no “accuracy moat” in FMM architectures. Models trained on MatPES match or exceed the accuracy of previous FMMs across a diverse set of equilibrium, near-equilibrium, and dynamic benchmarks. Finally, I will argue that the key priorities in architectural and algorithmic development should be the parallelization and scaling of such FMMs on high-performance computing systems and their integration into high-throughput materials workflows.