How to find the substrate for my own structure

haidi-ustc · January 22, 2018, 5:03am

I have a new 2D material. To guide future experiments, I want to propose a suitable substrate for growing it. How can I find a suitable substrate for my structure using pymatgen? The structure is not in the Materials Project database.

dwinston · January 22, 2018, 7:32pm

You can use pymatgen’s SubstrateAnalyzer in order to calculate the minimal coincident interface area (MCIA) between film and substrate structures. If you have an elastic tensor for your film, you can also calculate elastic energy. We do substrate analysis for each of our materials across a list of common substrates, and we use an elastic tensor if such a calculation was also requested for the candidate film material. Thus, one option for guiding your experiment is to submit your structure to us via the crystal toolkit app. Here is how you might do such an analysis yourself, though I don’t include use of an elastic tensor here:

from operator import itemgetter
import itertools

import pandas as pd
from pymatgen.analysis.substrate_analyzer import SubstrateAnalyzer
from pymatgen import MPRester
from tqdm import tqdm

mpr = MPRester()

# Get list of material IDs for common substrates
mids = mpr.get_all_substrates()

substrates = mpr.query({"material_id": {"$in": mids}}, ["structure", "material_id"])

# Pull an example film structure. For your own material, construct a
# `pymatgen.core.structure.Structure` instead, e.g. with `Structure.from_file`.
film = mpr.get_structure_by_material_id("mp-81")

all_matches = []
sa = SubstrateAnalyzer()

def groupby_itemkey(iterable, item):
    """groupby keyed on (and pre-sorted by) itemgetter(item)."""
    itemkey = itemgetter(item)
    return itertools.groupby(sorted(iterable, key=itemkey), itemkey)

for s in tqdm(substrates):
    substrate = s["structure"]

    # Calculate all matches and group by substrate orientation
    matches_by_orient = groupby_itemkey(
        sa.calculate(film, substrate, lowest=True),
        "sub_miller")

    # Find the lowest area match for each substrate orientation
    lowest_matches = [min(g, key=itemgetter("match_area"))
                      for k, g in matches_by_orient]

    for match in lowest_matches:
        db_entry = {
            "sub_id": s["material_id"],
            "orient": " ".join(map(str, match["sub_miller"])),
            "sub_form": substrate.composition.reduced_formula,
            "film_orient": " ".join(map(str, match["film_miller"])),
            "area": match["match_area"],
        }
        all_matches.append(db_entry)

df = pd.DataFrame(all_matches)
df.set_index("sub_id", inplace=True)
# sort_values returns a new DataFrame, so keep the result before printing
df = df.sort_values("area")
print(df)

The above is based on the procedure we use for building this information across our database of structures (see emmet.vasp.builders.substrates). I also recommend a progress indicator such as the tqdm library, as used above, because the analysis can take several minutes for tens of substrate candidates.
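To make the grouping and per-orientation minimum in the loop easier to follow, here is a minimal sketch that runs the same two steps on made-up match dictionaries. The keys mirror those used in the script; the numeric values in `toy_matches` are invented for illustration and are not SubstrateAnalyzer output.

```python
from operator import itemgetter
import itertools


def groupby_itemkey(iterable, item):
    """groupby keyed on (and pre-sorted by) itemgetter(item)."""
    itemkey = itemgetter(item)
    return itertools.groupby(sorted(iterable, key=itemkey), itemkey)


# Hypothetical matches; the values are invented for illustration only.
toy_matches = [
    {"sub_miller": (1, 0, 0), "film_miller": (1, 1, 1), "match_area": 52.0},
    {"sub_miller": (1, 1, 0), "film_miller": (1, 0, 0), "match_area": 75.5},
    {"sub_miller": (1, 0, 0), "film_miller": (1, 0, 0), "match_area": 34.4},
    {"sub_miller": (1, 1, 0), "film_miller": (1, 1, 0), "match_area": 120.3},
]

# Keep only the smallest-area match for each substrate orientation.
lowest_matches = [
    min(group, key=itemgetter("match_area"))
    for _, group in groupby_itemkey(toy_matches, "sub_miller")
]

for match in lowest_matches:
    print(" ".join(map(str, match["sub_miller"])), match["match_area"])
```

The helper sorts before grouping because `itertools.groupby` only merges adjacent items with equal keys; without the sort, the two `(1, 0, 0)` entries would end up in separate groups.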

haidi-ustc · January 23, 2018, 2:04am

Thank you very much, it works!

pnieves2019 · August 16, 2019, 9:30am

Hi, I used the above-mentioned script, but I can’t reproduce the data shown on the Materials Project website for the selected example “mp-81”. I modified it as follows in order to get the same data (and to include the elastic energy):

from operator import itemgetter
import itertools

import numpy as np

import pandas as pd
from pymatgen.analysis.substrate_analyzer import SubstrateAnalyzer
from pymatgen import MPRester
from tqdm import tqdm

from pymatgen.symmetry.analyzer import SpacegroupAnalyzer

from pymatgen.analysis.elasticity.elastic import ElasticTensor

mpr = MPRester(" ")  # put your Materials Project API key between the quotes

# Get list of material IDs for common substrates
mids = mpr.get_all_substrates()

substrates = mpr.query({"material_id": {"$in": mids}}, ["structure", "material_id"])


# Pull an example film structure. You will need to construct a
# `pymatgen.core.structure.Structure`

film0 = mpr.get_structure_by_material_id("mp-81")
aa = SpacegroupAnalyzer(film0)
film = aa.get_conventional_standard_structure(international_monoclinic=True)

a1 = np.zeros((3,3,3,3))

c11 = 14400  
c12 = 13400
c13 = 13400
c33 = 14400
c44 = 2900
c55 = 2900
c66 = 2900

a1[0,0,0,0] = c11
a1[1,1,1,1] = c11
a1[2,2,2,2] = c33
a1[0,0,1,1] = c12
a1[1,1,0,0] = c12
a1[0,0,2,2] = c13
a1[1,1,2,2] = c13
a1[2,2,0,0] = c13
a1[2,2,1,1] = c13
a1[1,2,1,2] = c44
a1[2,1,1,2] = c44
a1[1,2,2,1] = c44
a1[2,1,2,1] = c44
a1[0,2,0,2] = c55
a1[2,0,2,0] = c55
a1[0,2,2,0] = c55
a1[2,0,0,2] = c55
a1[0,1,0,1] = c66
a1[1,0,0,1] = c66
a1[0,1,1,0] = c66
a1[1,0,1,0] = c66

a0 = ElasticTensor(a1)
a2 = a0.convert_to_ieee(film)

all_matches = []
sa = SubstrateAnalyzer()

def groupby_itemkey(iterable, item):
    """groupby keyed on (and pre-sorted by) itemgetter(item)."""
    itemkey = itemgetter(item)
    return itertools.groupby(sorted(iterable, key=itemkey), itemkey)

for s in tqdm(substrates):
    substrate0 = s["structure"]

    bb = SpacegroupAnalyzer(substrate0)

    substrate = bb.get_conventional_standard_structure(international_monoclinic=True)

    # Calculate all matches and group by substrate orientation
    matches_by_orient = groupby_itemkey(
        sa.calculate(film, substrate, elasticity_tensor=a2, lowest=True),
        "sub_miller")


    # Find the lowest area match for each substrate orientation
    lowest_matches = [min(g, key=itemgetter("match_area"))
                      for k, g in matches_by_orient]

    for match in lowest_matches:
        db_entry = {
            "sub_id": s["material_id"],
            "orient": " ".join(map(str, match["sub_miller"])),
            "sub_form": substrate.composition.reduced_formula,
            "film_orient": " ".join(map(str, match["film_miller"])),
            "area": match["match_area"],
        }
        if "elastic_energy" in match:
            db_entry["energy"] = match["elastic_energy"]
            db_entry["strain"] = match["strain"]

        all_matches.append(db_entry)

df = pd.DataFrame(all_matches)
df.set_index("sub_id", inplace=True)
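# sort_values returns a new DataFrame, so keep the result before writing it out
df = df.sort_values("area")

# "a" appends to the file, so repeated runs accumulate tables in test_mp81.txt
with open("test_mp81.txt", "a") as tfile:
    tfile.write(df.to_string())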
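The explicit index assignments in the script expand seven Voigt constants into a rank-4 tensor by hand, which is easy to get wrong. As a sketch of the same expansion done in a loop, the snippet below builds the 3x3x3x3 array from a 6x6 Voigt matrix using plain Python lists. `voigt_to_full` and `VOIGT_PAIRS` are hypothetical names introduced here, and the constants are copied from the script unchanged. pymatgen's `ElasticTensor.from_voigt` should accept the 6x6 matrix directly, if your installed version provides it.

```python
# Voigt index -> pair of 0-based Cartesian indices, in the usual order
# 11, 22, 33, 23, 13, 12.
VOIGT_PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]


def voigt_to_full(c_voigt):
    """Expand a 6x6 stiffness matrix into a 3x3x3x3 nested list."""
    full = [[[[0.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(3)]
    for p, (i, j) in enumerate(VOIGT_PAIRS):
        for q, (k, l) in enumerate(VOIGT_PAIRS):
            value = c_voigt[p][q]
            # Minor symmetries: C_ijkl = C_jikl = C_ijlk = C_jilk
            for a, b in ((i, j), (j, i)):
                for c, d in ((k, l), (l, k)):
                    full[a][b][c][d] = value
    return full


# Same constants as in the script above
c11, c12, c13, c33 = 14400, 13400, 13400, 14400
c44, c55, c66 = 2900, 2900, 2900

c_voigt = [
    [c11, c12, c13, 0, 0, 0],
    [c12, c11, c13, 0, 0, 0],
    [c13, c13, c33, 0, 0, 0],
    [0, 0, 0, c44, 0, 0],
    [0, 0, 0, 0, c55, 0],
    [0, 0, 0, 0, 0, c66],
]

full = voigt_to_full(c_voigt)
```

Wrapping the result as `np.array(full)` gives an array with the same nonzero entries as `a1` in the script, which can then be passed to `ElasticTensor` in the same way.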