Too much data requested in one query


#1

Hi, I am trying to download all compounds in the Materials Project (MP) database, including their pretty formula and CIF, but I am getting the following error. Is there a better way to do this?

reak down your query into sub-queries.", "version": {"db": "2018.11", "pymatgen": "2018.11.6", "rest": "2.0"}, "created_at": "2018-11-12T14:44:46.350284", "traceback": "Traceback (most recent call last):\n File \"/var/www/python/matgen_prod/materials_django/rest/rest.py\", line 94, in wrapped\n d = func(*args, **kwargs)\n File \"/var/www/python/matgen_prod/materials_django/rest/rest.py\", line 176, in query\n \"Too much data requested in one query.\"\nrest.rest.RESTError: Too much data requested in one query. Please break down your query into sub-queries.\n"}'

Any suggestions would be really appreciated.


#2

There is a good chance that the next release of pymatgen will auto-chunk your query for you. See this pull request, and in the meantime you can reference the proposed changes to MPRester.query and use that code to break down your current query.
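
As a rough illustration of what breaking a query into sub-queries by hand could look like, here is a minimal sketch. It is not the code from the pull request: the helper names chunked and query_in_chunks and the default chunk size of 500 are made up for this example. The idea is to fetch only the material ids first, then request the full properties for one batch of ids at a time using a "$in" filter on material_id.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def query_in_chunks(query, criteria, properties, chunk_size=500):
    """Run one large query as many small ones keyed on material_id.

    `query` is any callable with the signature query(criteria, properties),
    for example the bound method MPRester.query. Both helper names and the
    default chunk_size are illustrative, not part of pymatgen.
    """
    # Step 1: ask only for the ids, which keeps the response small.
    ids = [doc["material_id"] for doc in query(criteria, ["material_id"])]

    # Step 2: fetch the full properties for one batch of ids at a time.
    docs = []
    for batch in chunked(ids, chunk_size):
        docs.extend(query({"material_id": {"$in": batch}}, properties))
    return docs
```

With an MPRester instance mpr, you would call it as query_in_chunks(mpr.query, your_criteria, ["material_id", "pretty_formula", "cif"]). If a batch still triggers the error, try a smaller chunk_size.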

Best,
Donny


#3

Thank you. I will try it.


#4

Following up on this, the latest version (v2018.12.12) of pymatgen's MPRester.query method supports downloading large quantities of data in chunks. For example, say you want to download information on all structures with computed electronic band structures. You want the structure in CIF format, the formation energy, the material id, and the computed band gap. The following code will fetch all that in roughly one minute over a gigabit connection, with progress shown on your screen:

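from pymatgen import MPRester

# With no argument, MPRester reads your API key from the pymatgen
# configuration (PMG_MAPI_KEY); otherwise pass the key as a string.
mpr = MPRester()

# First argument: the search criteria.
# Second argument: the list of properties to return for each match.
docs = mpr.query({
    "has": "bandstructure",
}, [
    "material_id",
    "cif",
    "formation_energy_per_atom",
    "band_gap",
])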
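
Each entry in docs is a dict keyed by the property names you requested, with the CIF as a plain string. If you want the CIFs as files on disk, something along these lines should work. This is only a sketch: write_cifs and the "cifs" output directory are names chosen for this example, not part of pymatgen.

```python
from pathlib import Path


def write_cifs(docs, out_dir="cifs"):
    """Write each document's CIF string to <out_dir>/<material_id>.cif.

    Returns the list of paths written. Documents without a CIF are skipped.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for doc in docs:
        cif = doc.get("cif")
        if not cif:
            continue  # nothing to write for this entry
        path = out / (doc["material_id"] + ".cif")
        path.write_text(cif)
        written.append(path)
    return written
```

Calling write_cifs(docs) after the query above would then leave one .cif file per material in the cifs folder.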

#5

I really appreciate it. Thanks.