Converting API responses¶
Introduction¶
The NMDC Runtime API returns NMDC metadata in JSON format.
One thing we've learned from talking with our users is that some of them prefer working with metadata in CSV format.
In this how-to guide, we'll show you how you can convert JSON-formatted Runtime API responses into various formats, including CSV.
Converting JSON into CSV¶
Here's how you can convert JSON into CSV.
Note: In this section, we'll be demonstrating a process that produces a CSV string in which each column name is a JSONPath expression that indicates precisely where—in the original JSON object(s)—the value(s) in that column came from.
For example, the column name "
$.principal_investigator.has_raw_value
" indicates that the value(s) in that column came from the "has_raw_value
" field of the object in the "principal_investigator
" field of the root object (represented by "$
").
Prerequisites¶
- The jq CLI application is installed
You can check whether it's installed by running
$ jq --version
. You can install it by following the instructions on its website. - The
json-tabulate
Python package is installedYou can check whether it's installed by running
$ json-tabulate --version
. You can install it by running$ pip install json-tabulate
.
Procedure¶
Here's how you can convert the JSON-formatted response from an NMDC Runtime API endpoint into CSV.
curl --silent 'https://api.microbiomedata.org/studies?fields=id,name,principal_investigator.has_raw_value&per_page=3' \
| jq '.results' \
| json-tabulate
We'll break it down here.
The first command—curl ...
—sends an HTTP request to the NMDC Runtime API and outputs the JSON-formatted response, which looks like this:
{
"meta": {
"mongo_filter_dict": {},
"mongo_sort_list": null,
"count": 47,
"db_response_time_ms": 0,
"page": 1,
"per_page": 3,
"fields": "id,name,principal_investigator.has_raw_value"
},
"results": [
{
"id": "nmdc:sty-11-8fb6t785",
"name": "Deep subsurface shale carbon reservoir microbial communities from Ohio and West Virginia, USA",
"principal_investigator": {
"has_raw_value": "Kelly Wrighton"
}
},
{
"id": "nmdc:sty-11-33fbta56",
"name": "Peatland microbial communities from Minnesota, USA, analyzing carbon cycling and trace gas fluxes",
"principal_investigator": {
"has_raw_value": "Christopher Schadt"
}
},
{
"id": "nmdc:sty-11-aygzgv51",
"name": "Riverbed sediment microbial communities from the Columbia River, Washington, USA",
"principal_investigator": {
"has_raw_value": "James Stegen"
}
}
],
"group_by": []
}
We wanted a list of studies, but the API response included some metadata about pagination and we don't want to analyze that. The list of studies we wanted is nested within a property named results
.
We can pipe the output of the curl ...
command into jq
in order to isolate the results
list, like this:
curl --silent 'https://api.microbiomedata.org/studies?fields=id,name,principal_investigator.has_raw_value&per_page=3' \
| jq '.results'
The output is now:
[
{
"id": "nmdc:sty-11-8fb6t785",
"name": "Deep subsurface shale carbon reservoir microbial communities from Ohio and West Virginia, USA",
"principal_investigator": {
"has_raw_value": "Kelly Wrighton"
}
},
{
"id": "nmdc:sty-11-33fbta56",
"name": "Peatland microbial communities from Minnesota, USA, analyzing carbon cycling and trace gas fluxes",
"principal_investigator": {
"has_raw_value": "Christopher Schadt"
}
},
{
"id": "nmdc:sty-11-aygzgv51",
"name": "Riverbed sediment microbial communities from the Columbia River, Washington, USA",
"principal_investigator": {
"has_raw_value": "James Stegen"
}
}
]
Much better! Lastly, this how-to guide is about converting JSON-formatted metadata into other formats; in this case, into CSV.
We can pipe the output of the jq ...
command into json-tabulate
in order to convert the metadata into CSV, like this:
curl --silent 'https://api.microbiomedata.org/studies?fields=id,name,principal_investigator.has_raw_value&per_page=3' \
| jq '.results' \
| json-tabulate
The output is now:
$.id,$.name,$.principal_investigator.has_raw_value
nmdc:sty-11-8fb6t785,"Deep subsurface shale carbon reservoir microbial communities from Ohio and West Virginia, USA",Kelly Wrighton
nmdc:sty-11-33fbta56,"Peatland microbial communities from Minnesota, USA, analyzing carbon cycling and trace gas fluxes",Christopher Schadt
nmdc:sty-11-aygzgv51,"Riverbed sediment microbial communities from the Columbia River, Washington, USA",James Stegen
Or, as a table:
$.id | $.name | $.principal_investigator.has_raw_value |
---|---|---|
nmdc:sty-11-8fb6t785 | Deep subsurface shale carbon reservoir microbial communities from Ohio and West Virginia, USA | Kelly Wrighton |
nmdc:sty-11-33fbta56 | Peatland microbial communities from Minnesota, USA, analyzing carbon cycling and trace gas fluxes | Christopher Schadt |
nmdc:sty-11-aygzgv51 | Riverbed sediment microbial communities from the Columbia River, Washington, USA | James Stegen |
Wondering why all the column names begin with
$
?The column names are JSONPath expressions. They indicate precisely where—in the original JSON object(s)—the value(s) in that column came from.
Converting JSON into TSV¶
You can convert JSON into TSV by following the same procedure as converting JSON into CSV (see above); except that, for TSV, the json-tabulate
command would include the --output-format tsv
option, like this:
curl --silent 'https://api.microbiomedata.org/studies?fields=id,name,principal_investigator.has_raw_value&per_page=3' \
| jq '.results' \
| json-tabulate --output-format tsv
The output would be:
$.id $.name $.principal_investigator.has_raw_value
nmdc:sty-11-8fb6t785 Deep subsurface shale carbon reservoir microbial communities from Ohio and West Virginia, USA Kelly Wrighton
nmdc:sty-11-33fbta56 Peatland microbial communities from Minnesota, USA, analyzing carbon cycling and trace gas fluxes Christopher Schadt
nmdc:sty-11-aygzgv51 Riverbed sediment microbial communities from the Columbia River, Washington, USA James Stegen
Conclusion¶
In this how-to guide, we showed you how you could convert an NMDC Runtime API response from JSON into a couple different formats; specifically, CSV and TSV.