Semiconducting Materials by Analogy and Chemical Theory (SMACT) is a collection of rapid screening and informatics tools that uses data about chemical elements.
- Documentation: https://smact.readthedocs.io/en/latest/
- Examples: https://smact.readthedocs.io/en/latest/examples.html
If you torture the data enough, nature will always confess - Ronald Coase (from 'How should economists choose?')
There is a strong demand for functional materials across a wide range of technologies. The motivation can include reducing cost, enhancing performance, or enabling a new application. We have developed low-cost procedures for screening hypothetical materials. This framework can be used for simple calculations on your own computer. SMACT follows a top-down approach where a set of element combinations is generated and then screened using rapid chemical filters. It can be used as part of a multi-technique workflow or to feed artificial intelligence models for materials.
Features are accessed through Python scripts, importing classes and functions as needed. The best place to start is looking at the docs, which highlight some simple examples of how these classes and functions can be used. Use cases are available in our examples and tutorials folders.
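The generate-then-screen idea described above can be illustrated with a minimal, standard-library sketch. It does not use SMACT's own API: the function names (`neutral_stoichiometries`, `electronegativity_ordered`, `screen_binaries`) are hypothetical, and the small tables hold only a few common oxidation states and Pauling electronegativities chosen for illustration, whereas SMACT ships much fuller data.

```python
from functools import reduce
from itertools import combinations, product
from math import gcd

# Illustrative data only (assumption): a handful of common oxidation states
# and Pauling electronegativities, not SMACT's data tables.
OXIDATION_STATES = {"Na": [1], "Mg": [2], "Al": [3], "O": [-2], "Cl": [-1]}
PAULING_ENEG = {"Na": 0.93, "Mg": 1.31, "Al": 1.61, "O": 3.44, "Cl": 3.16}


def neutral_stoichiometries(charges, max_coeff=4):
    """Return reduced integer ratios n_i such that sum(n_i * q_i) == 0."""
    found = set()
    for coeffs in product(range(1, max_coeff + 1), repeat=len(charges)):
        if sum(n * q for n, q in zip(coeffs, charges)) == 0:
            divisor = reduce(gcd, coeffs)
            found.add(tuple(n // divisor for n in coeffs))
    return sorted(found)


def electronegativity_ordered(symbols, charges):
    """True if every cation is less electronegative than every anion."""
    cations = [PAULING_ENEG[s] for s, q in zip(symbols, charges) if q > 0]
    anions = [PAULING_ENEG[s] for s, q in zip(symbols, charges) if q < 0]
    if not cations or not anions:
        return False
    return max(cations) < min(anions)


def screen_binaries(elements, max_coeff=4):
    """Generate all element pairs, then keep those passing both filters."""
    hits = []
    for pair in combinations(elements, 2):
        for charges in product(*(OXIDATION_STATES[el] for el in pair)):
            if not electronegativity_ordered(pair, charges):
                continue
            for ratio in neutral_stoichiometries(charges, max_coeff):
                hits.append((pair, charges, ratio))
    return hits


results = screen_binaries(["Na", "Mg", "Al", "O", "Cl"])
for pair, charges, ratio in results:
    print(pair, charges, ratio)
```

Of the ten element pairs generated, only the six cation-anion pairs survive, for example Al and O in a 2:3 ratio. Pairs with no anion (Na-Mg) or no cation (O-Cl) are rejected by the electronegativity filter before any stoichiometry is enumerated.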
- At the core of SMACT are Element and Species (element in a given oxidation state) classes that have various properties associated with them.
- Oxidation states that are accessible to each element are included in their properties.
- Element compositions can be screened using the heuristic filters of charge neutrality and electronegativity order. This is handled by the screening module, and this publication describes the underlying theory. An example procedure is outlined in the docs.
- Further filters can be applied to generated lists of compositions in order to screen for particular properties. These properties are either intrinsic properties of elements or are calculated for compositions using the properties module. For example:
  - An application is shown in this publication, in which 160,000 chemical compositions are screened based on optical band gap calculated using the solid-state energy scale.
  - The oxidation_states module can be used to filter out compositions containing metals in unlikely oxidation states according to a data-driven model.
- Compositions can also be filtered based on sustainability via the abundance of elements in the Earth's crust or via the HHI scale.
- Compositions can be converted for use in Pymatgen or for representation to machine learning algorithms (see this example) and the related ElementEmbeddings package.
- Charge neutrality screening supports mixed-valence compounds via the `mixed_valence=True` flag in `smact_validity`, enabling correct handling of materials like Fe₃O₄ and Mn₃O₄.
- Oxidation state data is sourced from ICSD 2024, providing an updated and stricter set of experimentally observed oxidation states per element.
- The property prediction module enables composition-to-property prediction using pretrained deep learning models, including a ROOST-based band gap predictor trained on the Materials Project database.
- The code also has tools for manipulating common crystal lattice types:
  - Certain structure types can be built using the builder module.
  - Lattice parameters can be estimated using ionic radii of the elements for various common crystal structure types using the lattice_parameters module.
  - The lattice module and distorter module rely on the Atomic Simulation Environment and can be used to generate unique atomic substitutions on a given crystal structure.
  - The structure prediction module can be used to predict the structure of hypothetical compositions using species similarity measures.
  - The dopant prediction module can be used to facilitate high-throughput predictions of p-type and n-type dopants of multicomponent solids.
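As a rough illustration of estimating lattice parameters from ionic radii, the sketch below applies textbook hard-sphere contact geometry to three cubic structure types. It is not the lattice_parameters module: the function name `estimate_cubic_lattice_parameter` and the `GEOMETRY_FACTORS` table are hypothetical, and the radii in the usage line are approximate six-coordinate Shannon values for Na+ and Cl-.

```python
import math

# Hard-sphere contact: the nearest-neighbour distance d equals
# r_cation + r_anion. Each factor converts d into the cubic cell edge a.
GEOMETRY_FACTORS = {
    "rocksalt": 2.0,                   # d = a / 2
    "zincblende": 4.0 / math.sqrt(3),  # d = sqrt(3) * a / 4
    "cscl": 2.0 / math.sqrt(3),        # d = sqrt(3) * a / 2
}


def estimate_cubic_lattice_parameter(structure, r_cation, r_anion):
    """Estimate the cubic cell edge (same units as the radii)."""
    if structure not in GEOMETRY_FACTORS:
        raise ValueError(f"Unknown structure type: {structure!r}")
    return GEOMETRY_FACTORS[structure] * (r_cation + r_anion)


# Approximate Shannon radii in angstroms (assumption): Na+ 1.02, Cl- 1.81.
a_nacl = estimate_cubic_lattice_parameter("rocksalt", 1.02, 1.81)
print(f"Estimated NaCl cell edge: {a_nacl:.2f} angstrom")
```

An estimate like this is only as good as the hard-sphere assumption and the radii chosen, so it is best treated as a starting guess for a structure that will later be relaxed.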
Legend: 🟢 new in v4, 🟡 improved in v4
graph TD
classDef new fill:#c8e6c9,stroke:#388e3c,color:#000
classDef improved fill:#fff9c4,stroke:#f9a825,color:#000
SMACT(["smact"])
SMACT --> core["Core modules"]
SMACT --> SP["structure_prediction"]
SMACT --> DP["dopant_prediction"]
SMACT --> PP["🟢 property_prediction"]
SMACT --> IO["🟢 io"]
SMACT --> UT["utils"]
core --> init["__init__.py - Element, Species, neutral_ratios"]
core --> dl["data_loader.py - elemental and oxidation state data loading"]
core --> sc["🟡 screening.py - compositional screening"]
core --> pr["properties.py - band gap, electronegativity, valence electron count"]
core --> ox["oxidation_states.py - oxidation state combination likelihood"]
core --> mt["metallicity.py - metallic character scoring"]
core --> la["lattice.py - Site and Lattice representations"]
core --> bld["🟡 builder.py - perovskite and wurtzite structure builders"]
core --> lp["🟡 lattice_parameters.py - lattice parameter estimation from ionic radii"]
core --> di["distorter.py - inequivalent site enumeration and substitution"]
sc --> sc1["smact_validity - charge neutrality and Pauling electronegativity test"]
sc --> sc2["smact_filter - compositional search space generation"]
sc --> sc3["🟢 mixed_valence flag - correct handling of Fe3O4, Mn3O4"]
sc --> sc4["🟢 ICSD 2024 oxidation states - stricter, updated elemental data"]
bld --> bld1["cubic_perovskite - parameterized oxidation state tiling"]
bld --> bld2["wurtzite - corrected default cell parameters"]
lp --> lp1["corrected geometric formulae for all structure types"]
SP --> spst["structure.py - SmactStructure"]
SP --> spdb["database.py - StructureDB SQLite interface"]
SP --> spmu["mutation.py - CationMutator from lambda tables"]
SP --> sppd["🟡 prediction.py - StructurePredictor"]
spst --> spst1["from_file, from_mp, from_pymatgen constructors"]
sppd --> sppd1["ionic substitution-based crystal structure prediction"]
sppd --> sppd2["🟢 updated to mp_api.client.MPRester interface"]
DP --> doper["🟡 doper.py - Doper"]
doper --> doper1["get_dopants - p-type and n-type candidates"]
doper --> doper2["🟢 DopantCandidate dataclass - structured prediction output"]
doper --> doper3["plot_dopants - periodic table heatmap visualisation"]
PP --> base["base_predictor.py - BasePropertyPredictor, PredictionResult"]
PP --> roost["roost/ - RoostPropertyPredictor"]
PP --> conv["convenience.py - predict_band_gap"]
PP --> reg["registry.py - model discovery and resolution"]
roost --> roost1["pretrained ROOST model for band gap prediction"]
roost --> roost2["uncertainty estimates alongside predictions"]
roost --> roost3["trained on Materials Project 2024 database"]
base --> base1["PredictionResult - value, uncertainty, metadata"]
IO --> ee["🟢 elementembeddings.py - ElementEmbeddings interface"]
ee --> ee1["composition_featuriser - composition-level feature vectors"]
ee --> ee2["species_featuriser - species-level feature vectors"]
UT --> comp["composition.py - parse_formula, comp_maker, formula_maker"]
UT --> uox["🟢 oxidation.py - ICSD24OxStatesFilter"]
UT --> sp2["species.py - parse_spec, unparse_spec"]
UT --> cs["crystal_space/"]
uox --> uox1["consensus and commonality-based filtering of oxidation states"]
cs --> cs1["generate_composition_with_smact.py - SMACT-based composition generation"]
cs --> cs2["download_compounds_with_mp_api.py - Materials Project bulk download"]
cs --> cs3["plot_embedding.py - crystal space visualisation"]
class PP,IO new
class sc,bld,lp,sppd,doper,uox improved
class sc3,sc4,sppd2,doper2,ee,uox new
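The mixed-valence handling listed under screening can be pictured with a small standard-library sketch. This is an illustration of the idea under stated assumptions, not the code behind `smact_validity`: the function `neutral_assignments` is hypothetical, and only its `mixed_valence` keyword borrows a name from the text. When the flag is off, every atom of an element must share one oxidation state; when it is on, atoms of the same element may take different states.

```python
from itertools import combinations_with_replacement, product


def neutral_assignments(counts, allowed_states, mixed_valence=False):
    """List per-atom oxidation state assignments whose charges sum to zero.

    counts: atoms per formula unit, e.g. {"Fe": 3, "O": 4}
    allowed_states: candidate oxidation states per element
    """
    per_element = []
    for element, n_atoms in counts.items():
        states = allowed_states[element]
        if mixed_valence:
            # Atoms of one element may take different oxidation states.
            options = list(combinations_with_replacement(states, n_atoms))
        else:
            # Every atom of an element shares a single oxidation state.
            options = [(q,) * n_atoms for q in states]
        per_element.append(options)

    solutions = []
    for choice in product(*per_element):
        if sum(sum(states) for states in choice) == 0:
            solutions.append(dict(zip(counts, choice)))
    return solutions


magnetite = {"Fe": 3, "O": 4}
states = {"Fe": [2, 3], "O": [-2]}

print(neutral_assignments(magnetite, states))
print(neutral_assignments(magnetite, states, mixed_valence=True))
```

With a single oxidation state per element, Fe3O4 cannot be balanced (three Fe atoms would need a charge of 8/3 each), so the first call returns an empty list. Allowing mixed valence finds one Fe(II) and two Fe(III) per formula unit.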
- smact library containing:
  - __init__.py: Contains the core `Element` and `Species` classes.
  - data_loader.py: Handles the loading of external data used to initialise the core `smact.Element` and `smact.Species` classes.
  - screening.py: Used for generating and applying filters to compositional search spaces.
  - properties.py: A collection of tools for estimating useful properties based on composition.
  - lattice.py: Given the sites, multiplicities and possible oxidation states at those sites, this reads from the database and generates all possible stoichiometries.
  - builder.py: Builds some common lattice structures, given the chemical composition.
  - lattice_parameters.py: Estimation of lattice parameters for various lattice types using covalent/ionic radii.
  - distorter.py: A collection of functions for enumerating and then substituting on inequivalent sites of a sub-lattice.
  - oxidation_states.py: Used for predicting the likelihood of species coexisting in a compound based on a statistical model.
  - structure_prediction: A submodule which contains a collection of tools for facilitating crystal structure predictions via ionic substitutions.
  - dopant_prediction: A submodule which contains a collection of tools for predicting dopants.
  - property_prediction: A submodule for composition-to-property prediction using pretrained deep learning models (e.g. ROOST band gap predictor).
  - utils: A submodule containing utility functions for composition parsing, species handling, oxidation state filtering, and crystal space generation and download.
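To give a feel for what composition parsing involves, here is a minimal sketch that turns a simple formula string into element counts. It is not the utils implementation: `parse_simple_formula` is a hypothetical helper, and it deliberately handles only flat formulas with no brackets, charges, or fractional amounts.

```python
import re
from collections import Counter

# One element symbol (capital letter, optional lowercase) plus optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_simple_formula(formula):
    """Return {element: count} for a flat formula such as 'Fe3O4'."""
    counts = Counter()
    position = 0
    for match in _TOKEN.finditer(formula):
        # Any gap between tokens means an unsupported character was skipped.
        if match.start() != position:
            raise ValueError(f"Cannot parse formula: {formula!r}")
        symbol, number = match.groups()
        counts[symbol] += int(number) if number else 1
        position = match.end()
    if not formula or position != len(formula):
        raise ValueError(f"Cannot parse formula: {formula!r}")
    return dict(counts)


print(parse_simple_formula("Fe3O4"))
print(parse_simple_formula("CaTiO3"))
```

Rejecting anything the pattern does not fully consume keeps the helper from silently miscounting inputs such as bracketed formulas, which a fuller parser would need to expand.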
The main language is Python 3, and the code has been tested with Python 3.11 to 3.13.
Core dependencies include NumPy, SciPy, pandas, pymatgen, ASE, and spglib. A full list is in pyproject.toml.
The latest stable release can be installed via pip:
pip install smact
Optional dependencies (needed for full replication of examples and tutorials):
pip install "smact[optional]"
SMACT is also available via conda-forge:
conda install -c conda-forge smact
We use uv for dependency management. To set up a development environment:
git clone https://github.com/wmd-group/smact.git
cd smact
uv sync --extra optional --extra property_prediction --dev
pre-commit install
This installs SMACT in editable mode with all optional and development dependencies, and sets up pre-commit hooks. See CONTRIBUTING.md for the full workflow.
Python code and original data tables are licensed under the MIT License.
Please use the Issue Tracker to report bugs or request features in the first instance. While we hope that most questions can be answered by searching the docs, we welcome new questions on the issue tracker, especially if they help us improve the docs! For other queries about any aspect of the code, please contact Kinga Mastej (maintainer) by e-mail.
We are always looking for ways to make SMACT better and more useful to the wider community; contributions are welcome. As of v4.0.0, we use GitHub Flow: branch from master, open a pull request against master, and releases are tagged from master. Please use the "Fork and Pull" workflow to make contributions and stick as closely as possible to the following:
- Code style is enforced by ruff (linting and formatting) and pyright (type checking). Pre-commit hooks run these automatically on commit.
- Use Google-style docstrings.
- Add tests wherever possible, and use the test suite to check if you broke anything.
- Look at the contributing guide for more information.
We use GitHub Actions for CI. Tests should be added to smact/tests/test_core.py or another smact/tests/test_something.py file.
Run the tests locally:
make test
Or, to run the full CI pipeline (pre-commit hooks and tests):
make ci-local
H. Park et al., "Mapping inorganic crystal chemical space", Faraday Discuss. (2024)

