Skip to content

Machine Learning Tutorials

This section covers Machine Learning (ML) workflows on the Mat3ra platform. The tutorials are organized by approach: universal force fields for atomistic simulation, custom potential training, and statistical property prediction.

Universal Machine-Learned Force Fields

Pre-trained interatomic potentials that predict energies, forces, and stresses across the periodic table without per-system training.

Tutorial Model Description
MatterSim (Python MLFF) MatterSim Run pre-trained MatterSim for total energy, relaxation, and phonons — covers bank workflows, custom workflows, GPU execution, and multi-threading

Running other Python-based models

Any Python-based MLFF that can be installed via pip (e.g. MACE, CHGNet, SevenNet) can be run using the general Python workflow template. See Section 3 of the MatterSim tutorial for the general approach.

Custom Potential Training

Training a neural network potential from first-principles data, then using it for large-scale molecular dynamics.

Tutorial Pipeline Description
DeePMD (QE → DeePMD → LAMMPS) QE CP + DeePMD-kit + LAMMPS End-to-end workflow: generate ab-initio MD data with Quantum ESPRESSO Car–Parrinello, train a DeePMD potential, and run production MD in LAMMPS

Statistical Property Prediction (Python ML)

Traditional ML models (regression, classification, clustering) using tabulated materials descriptors and scikit-learn. These workflows use a dataset (CSV) rather than a crystal structure as input.

Tutorial Task Description
Train a regression model Regression Train a neural network regressor on adsorption energies
Predict with regression Prediction Apply a trained regression model to new data
Unsupervised clustering Clustering K-means and hierarchical clustering of materials descriptors
Train a classifier Classification Train a model to classify materials by category
Predict with a classifier Prediction Apply a trained classifier to new data

Legacy Tutorials

Deprecated

The following tutorial uses the legacy ML engine, which has been superseded by the Python ML infrastructure above.