How to find the substrate for my own structure


I have a new 2D material. To guide future experiments, I want to propose a suitable substrate for growing it. How can I find a suitable substrate for my structure using pymatgen? The structure is not in the Materials Project database.


You can use pymatgen’s SubstrateAnalyzer to calculate the minimal coincident interface area (MCIA) between film and substrate structures. If you have an elastic tensor for your film, you can also calculate the elastic energy. We run substrate analysis for each of our materials across a list of common substrates, and we use an elastic tensor if such a calculation was also requested for the candidate film material. Thus, one option for guiding your experiment is to submit your structure to us via the Crystal Toolkit app. Here is how you might do such an analysis yourself, though I don’t include use of an elastic tensor here:

from operator import itemgetter
import itertools

import pandas as pd
from pymatgen.analysis.substrate_analyzer import SubstrateAnalyzer
from pymatgen.ext.matproj import MPRester
from tqdm import tqdm

mpr = MPRester()

# Get list of material IDs for common substrates
mids = mpr.get_all_substrates()

substrates = mpr.query({"material_id": {"$in": mids}}, ["structure", "material_id"])

# Pull an example film structure. For your own material, construct a
# `pymatgen.core.structure.Structure` and use it as `film` instead.
film = mpr.get_structure_by_material_id("mp-81")

all_matches = []
sa = SubstrateAnalyzer()

def groupby_itemkey(iterable, item):
    """groupby keyed on (and pre-sorted by) itemgetter(item)."""
    itemkey = itemgetter(item)
    return itertools.groupby(sorted(iterable, key=itemkey), itemkey)

for s in tqdm(substrates):
    substrate = s["structure"]

    # Calculate all matches and group by substrate orientation
    matches_by_orient = groupby_itemkey(
        sa.calculate(film, substrate, lowest=True),
        "sub_miller")

    # Find the lowest area match for each substrate orientation
    lowest_matches = [min(g, key=itemgetter("match_area"))
                      for k, g in matches_by_orient]

    for match in lowest_matches:
        db_entry = {
            "sub_id": s["material_id"],
            "orient": " ".join(map(str, match["sub_miller"])),
            "sub_form": substrate.composition.reduced_formula,
            "film_orient": " ".join(map(str, match["film_miller"])),
            "area": match["match_area"],
        }
        all_matches.append(db_entry)

df = pd.DataFrame(all_matches)
df.set_index("sub_id", inplace=True)
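
To make the grouping step in the loop easier to follow, here is a small standalone sketch that needs neither pymatgen nor an API key. It reuses the `groupby_itemkey` helper from above on a hand-written list of match records. The `toy_matches` list and its numbers are invented purely for illustration; real records would come from `sa.calculate(...)`.

```python
from operator import itemgetter
import itertools


def groupby_itemkey(iterable, item):
    """groupby keyed on (and pre-sorted by) itemgetter(item)."""
    itemkey = itemgetter(item)
    return itertools.groupby(sorted(iterable, key=itemkey), itemkey)


# Hypothetical match records shaped like the dicts used in the loop above.
# The areas are made-up values, not results of any calculation.
toy_matches = [
    {"sub_miller": (1, 0, 0), "film_miller": (0, 0, 1), "match_area": 120.0},
    {"sub_miller": (1, 1, 1), "film_miller": (0, 0, 1), "match_area": 45.5},
    {"sub_miller": (1, 0, 0), "film_miller": (0, 0, 1), "match_area": 60.0},
    {"sub_miller": (1, 1, 1), "film_miller": (1, 0, 0), "match_area": 90.0},
]

# Keep only the smallest-area match for each substrate orientation.
lowest_matches = [
    min(group, key=itemgetter("match_area"))
    for _, group in groupby_itemkey(toy_matches, "sub_miller")
]

# Rank the surviving matches from smallest to largest coincident area.
ranked = sorted(lowest_matches, key=itemgetter("match_area"))
for m in ranked:
    print(" ".join(map(str, m["sub_miller"])), m["match_area"])
```

The helper sorts before grouping because `itertools.groupby` only merges consecutive items with equal keys; without the sort, the two `(1, 0, 0)` records here would end up in separate groups.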

The above is based on the procedure we use for building this information across our database of structures. I also recommend a progress indicator such as the tqdm library, as used above, because the analysis can take several minutes for tens of substrate candidates.


Thank you very much, it works! :grinning: