Merge of material id?


#1

Hi,

My colleague and I noticed that some material ids have recently gone missing, and that querying them on the Materials Project website returns the document of a different MVC entry.

For example, when we query mp-37405 on the website, we get the results for mvc-16102, Al5CuSe8. Python MPRester queries return nothing for mp-37405. We compared the two documents in our previous downloads and found that they are mostly the same except for some calculated properties. For instance, the band gap is 0.8742 for mvc-16102 and 0.5142 for mp-37405.

Could you confirm these changes? Also, is there a mapping from a previous mp id to the current mvc id?

Thanks

Regards
Chi


#2

Yes, materials may map to new “canonical” ids.

See previous discussion for doing robust queries / auto-mapping:




#3

To respond to this specific example, you can still see the individual tasks for both of these:

https://materialsproject.org/tasks/mvc-16102#mvc-16102
https://materialsproject.org/tasks/mvc-16102#mp-37405

When you query by the canonical mp-id via the API, you can request task_ids, which lists all task mp-ids grouped under that canonical mp-id.
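As a rough illustration of how that task_ids list can serve as a mapping from old ids to canonical ids, here is a minimal sketch in plain Python. It assumes you already have documents containing material_id and task_ids; the helper name build_canonical_lookup and the sample document are hypothetical, and the sample lists only the two task ids mentioned in this thread (the real list may be longer).

```python
from typing import Dict, Iterable, Mapping


def build_canonical_lookup(docs: Iterable[Mapping]) -> Dict[str, str]:
    """Map every task id (including retired mp-ids) to its canonical material id."""
    lookup: Dict[str, str] = {}
    for doc in docs:
        canonical = doc["material_id"]
        # The canonical id resolves to itself, even if task_ids omits it.
        lookup[canonical] = canonical
        for task_id in doc.get("task_ids", []):
            lookup[task_id] = canonical
    return lookup


# Hypothetical sample, shaped like a response that includes both fields.
docs = [{"material_id": "mvc-16102", "task_ids": ["mvc-16102", "mp-37405"]}]

lookup = build_canonical_lookup(docs)
print(lookup["mp-37405"])
```

With a table like this, ids saved from earlier downloads can be translated before querying, instead of failing on ids that are no longer canonical.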

However, I’m not seeing the band gap of 0.8742 eV that you mention. It looks like 0.514 eV for all band structure tasks for that material, which is also what the website reports for mvc-16102.


#4

Thanks Donny, really helped!


#5

Thanks for the reply, really helped.


#7

Hi @dwinston

Thanks for your reply.
However, after following your instructions, I am still having trouble querying from the API.


#8

Try:

mpr.get_entry_by_material_id({"task_ids":"mp-37405"})

To be very explicit about what is happening here: if you ask for a specific task_id or material_id, it has to be the canonical material_id (“mp-id”). Unfortunately, due to recent database changes, the canonical mp-id has changed, as you’ve noticed. With this query, we are instead asking for a material whose task_ids list (note the plural) contains the specific task_id you’re asking for. That’s why, if you make the query above, you will get this in response:

ComputedEntry mvc-16102 - Al5 Cu1 Se8
Energy = -61.4075
Correction = 0.0000
Parameters:
run_type = GGA
is_hubbard = False
pseudo_potential = {'functional': 'PBE', 'labels': ['Al', 'Cu_pv', 'Se'], 'pot_type': 'paw'}
hubbards = {}
potcar_symbols = ['PBE Al', 'PBE Cu_pv', 'PBE Se']
oxide_type = None
Data:
oxide_type = None

i.e. it returns mvc-16102.
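To make the difference between the two kinds of lookup concrete, here is a small sketch that emulates the matching locally. It assumes the criteria behave like a MongoDB-style equality test, where a list field matches if any element equals the value. The helper field_matches and the sample document are hypothetical and are not part of MPRester.

```python
from typing import Any, Mapping


def field_matches(doc: Mapping[str, Any], field: str, value: Any) -> bool:
    """Return True if doc[field] equals value, or is a list containing value."""
    stored = doc.get(field)
    if isinstance(stored, list):
        return value in stored
    return stored == value


# Hypothetical material document; only the ids from this thread are listed.
material = {"material_id": "mvc-16102", "task_ids": ["mvc-16102", "mp-37405"]}

# Asking for the old id as the canonical id finds nothing...
by_material_id = field_matches(material, "material_id", "mp-37405")
# ...but asking whether it appears among the grouped tasks succeeds.
by_task_ids = field_matches(material, "task_ids", "mp-37405")
print(by_material_id, by_task_ids)
```

This mirrors why a lookup by material_id comes back empty for mp-37405, while the task_ids criterion still finds mvc-16102.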

If you do want the data for that specific task, you would ask for:

mpr.get_task_data('mp-37405')

and this would return:

[{'energy': -61.40748582,
  'energy_per_atom': -4.386248987142857,
  'volume': 299.91559029019317,
  'nsites': 14,
  'composition_unit_cell': {'Al': 5.0, 'Cu': 1.0, 'Se': 8.0},
  'formula_pretty': 'Al5CuSe8',
  'is_hubbard': False,
  'hubbards': {},
  'elements': ['Al', 'Cu', 'Se'],
  'nelements': 3,
  'spacegroup': {'source': 'spglib',
   'symbol': 'F-43m',
   'number': 216,
   'point_group': '-43m',
   'crystal_system': 'cubic',
   'hall': 'F -4 2 3'},
  'band_gap': 0.8704000000000001,
  'density': 4.59619122937921,
  'oxide_type': None,
  'full_formula': 'Al5Cu1Se8',
  'material_id': 'mp-37405'}]

The reason for this distinction is that the Materials Project performs many tasks for any single material. We try to present the best aggregated data we can, while also giving people access to the underlying individual calculation tasks in case those are of interest.