Materials Project Database Release Log

[Edit by admin: Database version logs are now maintained in our public documentation available at Database Versions - Materials Project Documentation]

We’ve released a new revision of the Materials Project database as part of a major overhaul of our software infrastructure. This enables us to continue to build new properties and deliver them to you, the materials science community, at an accelerating pace. It has also caused some growing pains as we fix bugs we introduced while building our new infrastructure from scratch.

Many of our users have reported bugs, and we are very grateful for that. We’re making great strides in being as transparent as possible so everyone knows what is changing. As part of this, we’ve settled on a monthly release schedule including an announcement that informs everyone on what has or has not changed.

V2019.11

  • Introduced 3,971 new materials
  • Amorphous materials added with amorphous tag
  • Added a theoretical field, which is True when the material matches no known experimental structure from the ICSD
  • Fixed several inconsistency bugs for band_gap, piezo tensors, elastic warnings, and total magnetic moment.

V2019.05

  • Introduced a new deprecated field to materials. By default the website and API only search for materials that are not deprecated: {"deprecated": false}.
  • Deprecated 15,000 and added 3,600 new materials. We will be recomputing the deprecated materials to fill these spaces back up. Some of these new relaxations may end up matching current materials, so the total number of materials is not guaranteed to be the same as in V2019.02.
  • Fixed an issue with sandboxes not properly building the whole hull. Previously, only the sandboxed chemical systems were being recalculated for energy_above_hull searches

V2019.02

  • Added over 47,000 new materials from orderings of disordered ICSD as well as compounds from the Pauling File
  • Finalized enforcing symmetry on piezo tensors
  • Moved third order elastic data to elasticity_third_order so that people are not swamped by the mountain of information associated with it.

V2018.12

  • Adjusted the mp-id naming scheme to fix “mvc” ids taking over old mp-ids.
  • Fixed piezoelectric max_direction to be a Miller index rather than a unit vector.

V2018.11

  • Changed the grouping of magnetic materials to aggregate all magnetic orderings of a given material into a single material-id, and report the lowest energy ordering
  • Fixed incorrect calculation and display of polycrystalline dielectric constants
  • Fixed labeling of all materials as high-pressure. Note we’re parsing ICSD tags for this labeling so while some materials may not conventionally be considered high-pressure, a single matching ICSD entry can tag a material as such. We would love to hear comments on how we could better tag high-pressure materials
  • Began enforcing the symmetry of the structure on piezo tensors. In general, this reduces the expected piezo value.
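
The magnetic-ordering grouping from the V2018.11 list can be pictured with a small sketch. It assumes calculations have already been assigned to a material (in practice this relies on structure matching, which is not shown); the record layout, names, and energies below are hypothetical and purely illustrative.

```python
from collections import defaultdict

# Hypothetical records: (material label, magnetic ordering, energy per atom in eV).
# The energies are made-up illustration values, not Materials Project data.
calculations = [
    ("Fe2O3", "FM", -6.70),
    ("Fe2O3", "AFM", -6.78),
    ("NiO", "FM", -5.10),
    ("NiO", "AFM", -5.15),
    ("NiO", "NM", -4.90),
]


def group_magnetic_orderings(records):
    """Collapse all orderings of a material into one entry, keeping the lowest-energy one."""
    groups = defaultdict(list)
    for material, ordering, energy in records:
        groups[material].append((energy, ordering))

    grouped = {}
    for material, candidates in groups.items():
        energy, ordering = min(candidates)  # tuple comparison: lowest energy first
        grouped[material] = {
            "ordering": ordering,
            "energy_per_atom": energy,
            "n_orderings": len(candidates),
        }
    return grouped


grouped = group_magnetic_orderings(calculations)
print(grouped)
```

Each material ends up with a single record that reports only its lowest-energy ordering, while the count of aggregated orderings is kept for reference.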

We’re on to the second monthly release of our database. We continue to fix a lot of issues you’ve all flagged for us. Thank you for finding bugs. There are only a handful of developers in the Materials Project and we benefit immensely from the community. Please keep sending in bugs as you find them.

It’s also been pointed out that Materials Project has been running slowly for a number of people. We’ve had a sudden increase in load, likely due to the number of talks at MRS 2018 referencing Materials Project data for machine learning. We are working on scaling the website to cope with the load.

V2018.12

  • Adjusted the mp-id naming scheme to fix “mvc” ids taking over old mp-ids.
  • Fixed piezoelectric max_direction to be a Miller index rather than a unit vector.

We’re on to the third release of our database, which is coming a bit late, but we have a good excuse. Our dogs ate our database.

In reality, we’ve been preparing a major increase in the material count of a little over 47,000 new materials. These come primarily from finding orderings of many disordered structures in the ICSD as well as some structures originally from the Pauling File. We continue to use our structure matcher to ensure we’re not duplicating materials, so the current set should offer good coverage of both the ICSD and the Pauling File.

We’re onto the V2019.05 release of the MP Database. The last release (V2019.02) introduced a number of new compounds, but unfortunately also highlighted a validation issue for us that has actually been present for a while now in MP. The issue was this: Most of our structure optimizations show very small changes in the input volume, keeping the k-points consistent. However, for a number of new compounds, the volume change was significant – greater than 50% – which drastically changed the k-points necessary for proper coordinates and energies. These outlier compounds were not being caught before.
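
A check of the kind described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the 50% threshold is taken from the figure quoted in this post rather than from the actual validation code.

```python
def relative_volume_change(initial_volume, final_volume):
    """Fractional change in cell volume between the input and relaxed structures."""
    return abs(final_volume - initial_volume) / initial_volume


def needs_revalidation(initial_volume, final_volume, threshold=0.5):
    """Flag relaxations whose volume changed by more than the threshold.

    A large change means the k-point mesh chosen for the input cell may no
    longer be appropriate for the relaxed cell.
    """
    return relative_volume_change(initial_volume, final_volume) > threshold


# Volumes in cubic angstroms; values are illustrative only.
print(needs_revalidation(100.0, 102.0))  # small change, typical relaxation
print(needs_revalidation(100.0, 160.0))  # 60% change, an outlier
```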

Rather than just removing them completely, we’ve built a deprecation tag to softly “remove” such materials from searches and phase diagrams. Exploring MP via the website will not show these materials, but you can still find them by ID via the API and by direct links; if a material was referenced in a publication, its data is still accessible.
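
The behavior of the deprecation tag can be illustrated with a toy in-memory example. This is a sketch under assumptions: the document layout, the placeholder material IDs, and the search and get_by_id helpers are hypothetical and do not reflect the actual website or API code.

```python
# Placeholder documents; the IDs are invented for illustration.
materials = [
    {"material_id": "mp-example-1", "deprecated": False},
    {"material_id": "mp-example-2", "deprecated": True},
    {"material_id": "mp-example-3", "deprecated": False},
]


def search(docs, criteria=None):
    """Search that excludes deprecated materials unless the caller overrides it."""
    query = {"deprecated": False}  # default filter
    query.update(criteria or {})   # explicit criteria take precedence
    return [d for d in docs if all(d.get(k) == v for k, v in query.items())]


def get_by_id(docs, material_id):
    """Direct lookup by ID ignores the deprecation flag."""
    for doc in docs:
        if doc["material_id"] == material_id:
            return doc
    return None


print([d["material_id"] for d in search(materials)])
print(get_by_id(materials, "mp-example-2"))
```

A default search skips the deprecated document, while a lookup by ID still returns it, which is the "soft removal" described above.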

It’s been a long time since we released an updated database. We’ve been making a lot of changes in the backend to get ready for a whole new API and what will eventually be a completely new website. This will address several issues we’ve had with expanding the dataset and potentially moving some of our data to AWS to minimize downtime.

Major Changes
This latest release, V2019.11, introduces 3,971 new materials. Of particular note are the “amorphous” materials we’ve computed as part of our Synthesizability Skyline. You can search for these using the “amorphous” tag.

As previously announced, we’re tagging materials as “theoretical” if we don’t have an experimental ICSD ID. In the future, we’re adding the Pauling File IDs and Crystallography Open Database IDs which will all affect this tag.

Bug Fixes

  • Fixed elasticity warnings to properly show up in the materials doc
  • Fixed an issue with piezo tensors, which were parsing VASP’s Voigt notation while expecting conventional Voigt notation
  • Several bandgaps were being incorrectly reported from structure optimizations rather than static calculations; this inconsistency should be fixed.
  • Updated magnetism to be consistently from structure optimizations for now
  • Fixed volume and NSites to correspond to the same cell in search and on the materials detail page

2019-11-21 During deployment of the new V2019.11 database, there was a temporary issue with generating interactive phase diagrams, leading to incorrect formation enthalpies for a small number of chemical systems. This has now been fixed. Data presented on the materials detail pages was unaffected by this issue.

2019-12-04 We are aware of an on-going issue with the reported energies above hull on the materials detail pages. We will update this thread when a fix has been fully implemented and with further details.

Until this issue is fully resolved, correct energies above hull can be retrieved using pymatgen as follows:

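# import from pymatgen.ext.matproj; the top-level "from pymatgen import MPRester"
# shortcut was removed in later pymatgen releases
from pymatgen.ext.matproj import MPRester
from pymatgen.analysis.phase_diagram import PhaseDiagram

# replace with your own API key (passed as a string)
with MPRester("YOUR_API_KEY_HERE") as mpr:
    # replace with your elements and mp-id of interest
    entries = mpr.get_entries_in_chemsys(['Li', 'Co', 'O'])
    entry_of_interest = mpr.get_entry_by_material_id('mp-19128')

# build the convex hull for the whole chemical system,
# then measure the entry's energy against it
phase_diagram = PhaseDiagram(entries)
e_above_hull = phase_diagram.get_e_above_hull(entry_of_interest)

print("e_above_hull", e_above_hull)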

If you have not previously used pymatgen, it is a Python library that can be installed using pip install pymatgen or conda install --channel conda-forge pymatgen. Your API key can be found on your Dashboard when logged in.

2019-12-05 The issue mentioned on 2019-12-04 has now been addressed. However, approximately 7% of materials had errors greater than 0.05 eV/atom in their reported energies above hull. Values calculated via pymatgen or via the phase diagram app on the website during this time were correct, while values reported on the materials details page and via the e_above_hull API key were incorrect.

We encourage users who accessed convex hulls from the website between the latest database release and 2019-12-05 to re-check any values obtained from the website.

We apologize for the error, and will be incorporating additional checks into our automated testing to prevent similar errors in the future.

2020-03-05 We’re dealing with an issue involving excess load on our server. The website may be slow while we investigate, and the API could go offline.

2020-04-07 We had a brief outage today where some materials details pages would not load because the website was connecting to our development database rather than our production database. Any data retrieved between roughly 10am and 2pm PST should be re-checked.

2020-06-30 Database V2020.06 Released

In this database release, we have added several thousand materials and many magnetic ground states, improved the quality of our energetics, and fixed many bugs. This database release is part of on-going efforts in 2020 to improve database reliability and quality, following the introduction of our deprecation process last year. There are still known issues with this release which we are working to address. Please let us know in our forum if you encounter any.

2020-08-20 Database V2020.08.20 Released

In this release we have added thousands of new band structure and density of states calculations, improving our overall material coverage and data quality. Additionally, we have overhauled the plotting for these quantities on the material details page. This is a first step in improving the electronic structure data within the Materials Project as part of our new tool set for band structure calculations.

We are also working through an on-going issue affecting the energies of a small number of materials. In the previous release, we added a large batch of higher-quality calculations for our energetics as well as fixing numerous bugs. However, we discovered an error in our calculation parameters leading to larger energies than expected for a minority of materials and issues such as those discussed here. We are currently re-running these calculations and will be fixing this data in a supplemental update in the next few weeks. We advise anybody performing large screening studies to do so with caution or wait until this supplemental update has been released.

2020-09-08 Supplemental Database Release V2020.09.08

This release addresses issues with formation energies noticed in the previous release and updates the energies of approximately 6,000 materials where this error was greatest. We are planning a further supplemental release.

We’re also looking at ways to put in place a process to be more transparent with database changes and updates to share more specifically what has changed, as well as providing means to access historical versions of the database, since we know this is a common requirement.

Note that, wherever possible, we continue to keep individual historical calculation data available via its task_id even in cases where the aggregated information (such as that presented on the materials detail page) might change.

For anyone using pymatgen for direct database access, we have now added notifications of the current database version when using MPRester. Update to pymatgen version >= 2020.9.14 to use this feature.

2021-02-08 Supplemental Database Release V2021-02-08

We had a small new database release today, which introduces new higher-quality calculations for around 30,000 materials. It also deprecates 78 materials, since we currently do not have calculations for these materials that match our current quality standards; we hope to restore these 78 materials in a subsequent release. For an exact list, please see the attached file.

As a reminder, all historical calculation tasks remain available via our API and the task detail pages, and information on deprecated materials also remains available via the API. More information on our deprecation policy is in our documentation. We continue our work on better ways to communicate database diffs and to more easily provide access to historical information, so stay tuned for future announcements here.

db_v2020_09_08_to_v2021_02_08_diff.yaml (376.9 KB)

2021-03-22 Supplemental Database Release V2021.03.22

This release updates some older materials with new calculations, and adjusts our rules for deprecating older calculations. It does not contain any new materials. Thanks to the new calculations, many materials that were previously deprecated are now accessible again. This release is in preparation for a switch to our new compatibility scheme, which will improve our predictions of formation energy.

2021-05-13 Database V2021.05.13 Released

This release updates the energy correction scheme we use to generate phase diagrams and compute formation energies. As with any new database release, formation energies for many compounds have changed; however in this case the change is due only to our new energy correction scheme and not to any new data. We are proud to report that the new correction scheme has reduced the overall error in formation energy in our database by 7% compared to experiment.

You can see details of each correction that has been applied by inspecting the energy_adjustments attribute of a ComputedEntry retrieved via the API. In addition, the new correction scheme is available for manual use via the MaterialsProject2020Compatibility class in pymatgen.

We realize that this change may be disruptive to ongoing work, and want to assure you that the historical corrections are still available in pymatgen if needed. They may be recovered by manually reprocessing each ComputedEntry using the legacy MaterialsProjectCompatibility class. An example notebook demonstrating how to do this is available on matgenb.

Below we summarize the most significant changes associated with the new MaterialsProject2020Compatibility correction scheme. For complete details and documentation, please refer to this manuscript.

1. Refitted corrections for legacy species
Corrections applied to oxygen compounds, diatomic gases, and transition metal oxides and fluorides have been refit using more up to date DFT calculations and a larger compilation of computed and experimental formation enthalpy data.

2. Corrections for additional species
We have added corrections for Br, I, Se, Si, Sb, and Te, which did not previously have energy corrections. As a result, formation energies for materials containing these species will generally be lower than they were previously.

3. Diatomic gas corrections moved to compounds
Previously, corrections for H, F, Cl, and N were applied to the elements. One consequence of this was that polymorphs of H2, N2, Cl2 and F2 were always assigned a zero energy above hull, even if some polymorphs were higher in energy. This made interpretation of these values confusing. With this release, energy corrections are applied to the material (e.g., LiH) and not the element. This also means that unstable polymorphs of diatomic gases will now have a non-zero e_above_hull.
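
The effect on elemental polymorphs can be illustrated with a toy calculation. This is a sketch under an assumption: it models the legacy behavior as every polymorph of the gas being assigned the same fitted reference energy, which is one way the zero energies above hull could arise. All names and numbers are hypothetical illustration values in eV/atom, not Materials Project data.

```python
# Hypothetical uncorrected energies for two polymorphs of the same diatomic gas.
raw_energies = {"H2-polymorph-a": -3.39, "H2-polymorph-b": -3.30}


def e_above_hull_elemental(energies):
    """For a single-element system the hull is simply the lowest energy per atom."""
    lowest = min(energies.values())
    return {name: round(e - lowest, 6) for name, e in energies.items()}


def legacy_element_correction(energies, fitted_reference):
    """Assumed legacy behavior: every polymorph receives the same fitted energy."""
    return {name: fitted_reference for name in energies}


def compound_based_correction(energies):
    """New behavior: elements are left uncorrected (corrections go on compounds)."""
    return dict(energies)


legacy = e_above_hull_elemental(legacy_element_correction(raw_energies, -3.26))
current = e_above_hull_elemental(compound_based_correction(raw_energies))
print("legacy: ", legacy)
print("current:", current)
```

Under the legacy model both polymorphs sit on the hull, whereas leaving the elemental energies untouched lets the higher-energy polymorph show a non-zero value.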

4. Oxidation state based corrections
Our build process now estimates the likely oxidation states of each species in a material, and uses this information to intelligently apply corrections to anionic species only when their estimated oxidation state is negative. For example, in the compound MoCl3O, estimated oxidation states for both Cl and O are negative, so both anions receive corrections.

Our algorithms are not always successful in predicting the oxidation state. When this occurs, we apply anion corrections to only the most electronegative element in the material. As a result, some ternary or higher compounds in the database may be destabilized in this release because their oxidation states could not be determined. This is the case for MoCl5O (mp-1196724) for example, which does not receive a Cl correction because O is more electronegative.
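
The selection rule described above can be sketched as follows. This is a simplified illustration rather than the pymatgen implementation: the function name is hypothetical, the set of corrected anions is a small subset chosen for the example, and the electronegativities are standard Pauling values.

```python
# Pauling electronegativities for the elements used in this example.
PAULING_ELECTRONEGATIVITY = {"Mo": 2.16, "Cl": 3.16, "O": 3.44, "F": 3.98}

# Illustrative subset of species that can receive an anion correction.
CORRECTED_ANIONS = {"O", "Cl", "F"}


def anions_to_correct(elements, oxidation_states=None):
    """Return the elements that would receive an anion correction.

    If oxidation states are known, correct every eligible species whose
    oxidation state is negative. Otherwise fall back to the single most
    electronegative element in the material.
    """
    if oxidation_states:
        return sorted(
            el
            for el in elements
            if el in CORRECTED_ANIONS and oxidation_states.get(el, 0) < 0
        )
    most_electronegative = max(elements, key=PAULING_ELECTRONEGATIVITY.__getitem__)
    return [most_electronegative] if most_electronegative in CORRECTED_ANIONS else []


# MoCl3O with estimated oxidation states: both anions are corrected.
print(anions_to_correct(["Mo", "Cl", "O"], {"Mo": 5, "Cl": -1, "O": -2}))

# MoCl5O with no oxidation state guess: only O, the most electronegative, is corrected.
print(anions_to_correct(["Mo", "Cl", "O"]))
```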

If this affects your work, you can manually assign oxidation states by populating the oxidation_states key of the .data attribute of any ComputedEntry and then reprocessing the data using MaterialsProject2020Compatibility.

5. Uncertainty Quantification
We now compute the estimated uncertainty associated with the energy corrections on a material. Uncertainties reflect the measured uncertainty in the underlying experimental data that we use to determine the corrections, as well as uncertainty associated with the fitting procedure itself. This information enables new methods of assessing phase stability, as described in this manuscript.

A Note for API and MPRester Users

For API users, if you are retrieving formation energies directly via the API, you will get the correct, latest formation energies from the current database release. However, if you are using get_entries or get_pourbaix_entries, which apply the correction scheme on-the-fly, make sure to update to the latest version of pymatgen (v2022.0.8 or later) to get the correct values. With pymatgen v2021 or earlier, get_entries and get_pourbaix_entries use the old correction scheme by default.
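
A quick way to check which side of that version boundary an environment is on is to compare version strings. The sketch below assumes purely numeric, dot-separated version strings; the helper names are hypothetical.

```python
def parse_version(version):
    """Turn a string such as 'v2022.0.8' into a comparable tuple (2022, 0, 8)."""
    return tuple(int(part) for part in version.lstrip("v").split(".")[:3])


def uses_new_scheme_by_default(version, minimum="2022.0.8"):
    """True if this pymatgen version is at or above the minimum named in this post."""
    return parse_version(version) >= parse_version(minimum)


print(uses_new_scheme_by_default("2021.3.9"))
print(uses_new_scheme_by_default("2022.0.8"))
```

The installed version string can be obtained with importlib.metadata.version("pymatgen") from the Python standard library and passed to uses_new_scheme_by_default.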

So nice! Thank you very much!
