Using the NMDC API Graphical User Interface (GUI)
Note: This guide was written with respect to NMDC API version 1.2.0.
Retrieving Metadata using the Find and Metadata API Endpoints
Metadata describing NMDC data (e.g. studies, biosamples, data objects, etc.) may be retrieved with GET requests, using the NMDC API Graphical User Interface (GUI). The API GUI provides a guided user interface for direct access to the NMDC data portal. It allows for:
performing highly granular and targeted queries directly. This is especially helpful if a user has a query that may not be supported by the NMDC Data Portal yet.
interactive exploration of querying capabilities. It provides code snippets that can be used in scripts for programmatic access, i.e. the request curl commands and URLs provided in the responses (please see the examples below).
Please note that the endpoints discussed in this documentation were designed for use primarily by NMDC data consumers. For documentation describing other endpoints, primarily those designed for use by NMDC team members, please see the NMDC Runtime documentation.
API requests can include various parameters to filter, sort, and organize the requested information. The syntax of the parameters varies depending on whether the API endpoint is a find endpoint or a metadata endpoint. Find endpoints use a more compact syntax: for example, filtering biosamples for those having an “Ecosystem Category” of “Plants” involves submitting a request containing ecosystem_category:Plants to the GET /biosamples endpoint. In contrast, metadata endpoints use MongoDB-like query syntax: the same filter looks like {"ecosystem_category": "Plants"} when sent to the GET /nmdcschema/{collection_name} endpoint with collection_name set to biosample_set.
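To make the difference between the two filter syntaxes concrete, here is a minimal Python sketch that builds both request URLs without sending them. The base URL and the helper names find_url and metadata_url are assumptions made for illustration; confirm the actual request URL in the API GUI response.

```python
import json
from urllib.parse import urlencode

# Assumption: public base URL of the NMDC API. Confirm it against the
# "Request URL" that the API GUI shows after you execute a request.
BASE_URL = "https://api.microbiomedata.org"


def find_url(entity: str, compact_filter: str) -> str:
    """Build a find-endpoint URL using the compact attribute:value syntax."""
    return f"{BASE_URL}/{entity}?{urlencode({'filter': compact_filter})}"


def metadata_url(collection_name: str, mongo_filter: dict) -> str:
    """Build a metadata-endpoint URL whose filter is a MongoDB-like JSON document."""
    query = urlencode({"filter": json.dumps(mongo_filter)})
    return f"{BASE_URL}/nmdcschema/{collection_name}?{query}"


# The same "Ecosystem Category is Plants" filter, expressed both ways.
print(find_url("biosamples", "ecosystem_category:Plants"))
print(metadata_url("biosample_set", {"ecosystem_category": "Plants"}))
```

Both helpers only assemble strings, so they are safe to run offline; the printed URLs can then be pasted into a browser or passed to curl.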
The following sections are about the find and metadata endpoints.
Find Endpoints
The find endpoints are provided with NMDC metadata entities already specified. In other words, there are find endpoints specific to finding studies, other find endpoints specific to finding biosamples, ones for finding data objects, and ones for finding planned processes.
When preparing to submit an API request to a find endpoint, we recommend reviewing the parameter options in that endpoint’s section of the API GUI. They’re all listed in a section called “Find”.
Here are some of the find endpoints that exist today:
The GET /studies endpoint is a general purpose way to retrieve NMDC studies based on parameters provided by the user. Studies can be filtered and sorted based on the applicable Study attributes.
If the study identifier is known, a study can be retrieved directly using the GET /studies/{study_id} endpoint. Note that only one study can be retrieved at a time using this method.
The GET /biosamples endpoint is a general purpose way to retrieve biosample metadata using user-provided filter and sort criteria. Please see the applicable Biosample attributes.
If the biosample identifier is known, a biosample can be retrieved directly using the GET /biosamples/{sample_id} endpoint. Note that only one biosample metadata record can be retrieved at a time using this method.
To retrieve metadata about NMDC data objects (such as files, records, or omics data), the GET /data_objects endpoint may be used along with various parameters. Please see the applicable Data Object attributes.
If the data object identifier is known, the metadata can be retrieved using the GET /data_objects/{data_object_id} endpoint. Note that only one data object metadata record may be retrieved at a time using this method.
For the latest, complete list of find endpoints, consult the “Find” section of the API GUI.
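The find endpoints listed above follow one pattern: a collection-style path for filtered retrieval and the same path plus an identifier for a single record. The sketch below illustrates that pattern in Python. The base URL, the helper names, and the restriction to the three entities named in this guide are assumptions for illustration, not part of the NMDC API itself.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: public base URL of the NMDC API.
BASE_URL = "https://api.microbiomedata.org"

# Only the find entities named in this guide; the API GUI lists the full set.
FIND_ENTITIES = ("studies", "biosamples", "data_objects")


def find_endpoint_url(entity: str, record_id: Optional[str] = None) -> str:
    """Return the URL for a find endpoint, optionally for one known identifier."""
    if entity not in FIND_ENTITIES:
        raise ValueError(f"unsupported find entity: {entity!r}")
    url = f"{BASE_URL}/{entity}"
    if record_id is not None:
        # Keep the colon in identifiers such as nmdc:sty-... readable.
        url += "/" + quote(record_id, safe=":")
    return url


def get_json(url: str) -> dict:
    """Send a GET request and decode the JSON response (performs network I/O)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


print(find_endpoint_url("studies"))
print(find_endpoint_url("studies", "nmdc:sty-11-28tm5d36"))
```

Calling get_json(find_endpoint_url("studies", "nmdc:sty-11-28tm5d36")) would retrieve a single study record, assuming network access and that the identifier exists.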
Metadata Endpoints
The metadata endpoints can be used to retrieve and filter metadata from NMDC database collections (including studies, biosamples, activities, and data objects, as discussed in the find section).
Unlike the compact syntax used in the find endpoints, the syntax for the filter parameter of the metadata endpoints uses MongoDB-like query syntax.
When preparing to submit an API request to a metadata endpoint, we recommend reviewing the parameter options in that endpoint’s section of the API GUI. They’re all listed in a section called “Metadata”.
Here are some of the metadata endpoints that exist today:
To view the NMDC Schema version the database is currently using, try executing the GET /nmdcschema/version endpoint.
To get the NMDC Database collection statistics, like the total count of records in a collection or the size of the collection, try executing the GET /nmdcschema/collection_stats endpoint.
The GET /nmdcschema/{collection_name} endpoint is a general purpose way to retrieve metadata about a specified collection given user-provided filter and projection criteria. Please see the Collection Names that may be retrieved. Please note that metadata may only be retrieved about one collection at a time.
If the identifier of the record is known, the GET /nmdcschema/ids/{doc_id} endpoint can be used to retrieve the specified record. Note that only one identifier may be used at a time, and therefore only one record may be retrieved at a time using this method.
If both the identifier and the collection name of the desired record are known, the GET /nmdcschema/{collection_name}/{doc_id} endpoint can be used to retrieve the record. The projection parameter is optionally available for this endpoint to retrieve only desired attributes from a record. Please note that only one record can be retrieved at a time using this method.
For the latest, complete list of metadata endpoints, consult the “Metadata” section of the API GUI.
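As an illustration of the single-record metadata endpoint with an optional projection, here is a short Python sketch that builds the request URL. It assumes the projection parameter accepts a comma-separated list of attribute names and that the base URL is as shown; the helper name record_url and the choice of the id and name attributes are hypothetical. Check the parameter description in the API GUI before relying on this.

```python
from typing import Iterable, Optional
from urllib.parse import quote, urlencode

# Assumption: public base URL of the NMDC API.
BASE_URL = "https://api.microbiomedata.org"


def record_url(
    collection_name: str,
    doc_id: str,
    projection: Optional[Iterable[str]] = None,
) -> str:
    """Build a GET /nmdcschema/{collection_name}/{doc_id} URL.

    Assumption: `projection` is sent as a comma-separated list of attribute names.
    """
    url = f"{BASE_URL}/nmdcschema/{quote(collection_name)}/{quote(doc_id, safe=':')}"
    fields = list(projection or [])
    if fields:
        url += "?" + urlencode({"projection": ",".join(fields)})
    return url


# Request only two (illustrative) attributes of one study record.
print(record_url("study_set", "nmdc:sty-11-28tm5d36", ["id", "name"]))
```

Omitting the projection argument yields the plain record URL, which returns the full record.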
Metadata endpoint example: Get all the biosamples that are part of the “1000 Soils Research Campaign” study sampled from Colorado
Click on the dropdown arrow to the right side of the GET /nmdcschema/{collection_name} endpoint.
Click Try it out in the upper right of the expanded endpoint box.
In order to enter the parameters, get the identifier for this study by navigating to the 1000 Soils Research Campaign study page in the NMDC Data Portal and copying the ID.
Enter the parameters in the GET /nmdcschema/{collection_name} endpoint. For this example, we will input biosample_set into the collection_name parameter and {"part_of": "nmdc:sty-11-28tm5d36", "geo_loc_name.has_raw_value": {"$regex": "Colorado"}} into the filter parameter. See the Biosample Class in the NMDC Schema to view the applicable biosample attributes (slots); for this example, they are part_of and geo_loc_name.has_raw_value. Note that $regex performs a pattern match, so it selects records whose geo_loc_name.has_raw_value attribute contains the word “Colorado”.
Click Execute.
View the results in JSON format, available to download by clicking Download; or copy the results by clicking the clipboard icon in the bottom right corner of the response. In this case, two biosamples were retrieved. Note that the curl command and request URL are provided as well.
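The same request can be prepared outside the GUI. The following Python sketch builds the request URL for this example and uses Python's re module to approximate how the $regex condition behaves on a few sample strings. The base URL, the helper names, and the sample location strings are assumptions for illustration; the server evaluates the pattern itself, so the local check is only an approximation.

```python
import json
import re
from urllib.parse import urlencode

# Assumption: public base URL of the NMDC API.
BASE_URL = "https://api.microbiomedata.org"

# The filter from the example above: biosamples in the study whose
# location string contains "Colorado".
example_filter = {
    "part_of": "nmdc:sty-11-28tm5d36",
    "geo_loc_name.has_raw_value": {"$regex": "Colorado"},
}


def biosample_query_url(mongo_filter: dict) -> str:
    """Build the GET /nmdcschema/biosample_set URL for a MongoDB-like filter."""
    query = urlencode({"filter": json.dumps(mongo_filter)})
    return f"{BASE_URL}/nmdcschema/biosample_set?{query}"


def regex_matches(pattern: str, value: str) -> bool:
    """Approximate an unanchored, case-sensitive $regex match locally."""
    return re.search(pattern, value) is not None


print(biosample_query_url(example_filter))

# Hypothetical location strings, used only to show the matching behaviour.
for location in ("USA: Colorado, Boulder", "USA: Washington", "usa: colorado"):
    print(location, "->", regex_matches("Colorado", location))
```

Because the match is case-sensitive in this approximation, a lowercase “colorado” would not be selected; a pattern such as [Cc]olorado would be one way to cover both spellings.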
Retrieving Metadata using a Queries API Endpoint
“Public” versus “Private” API endpoints
The previous section was about some API endpoints that people could access without being logged into the NMDC API GUI. People sometimes refer to endpoints like that as “public” API endpoints. In contrast, this next section will be about API endpoints that people can only access when they are logged into the NMDC API GUI. People sometimes refer to API endpoints like these as “private” API endpoints. We’ll be using those terms—“public” and “private”—that way, in this section.
Logging into the NMDC API GUI
Here’s how you can log into the NMDC API GUI:
Visit the NMDC API GUI in your web browser if you aren’t already there.
Notice that the padlock icon on the “Authorize” button is open, which signifies that you aren’t currently logged into the NMDC API.
Editor’s note: Several people have reported that they find that choice of icon—which the NMDC API GUI inherits from a third-party API documentation library—to be counterintuitive.
Near the top of the page, click the link that says “Login with ORCID”.
The “Sign in to ORCID” page will appear.
On the “Sign in to ORCID” page, enter and submit your ORCID credentials.
The API GUI page will reappear, including a blue box that says “You are now authorized.” Also, the padlock icon on the “Authorize” button will be closed, signifying that you are logged into the NMDC API.
(Optional) In the blue box, click the “Show token” button to see your NMDC API access token.
You can use that access token when submitting API requests via the command line (e.g., via curl).
At this point, you are logged into the NMDC API GUI.
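If you plan to reuse the access token in scripts, one common approach is to keep it out of source code and read it from an environment variable. The sketch below assumes the token is sent as a Bearer token in the Authorization header, as the curl command shown by the API GUI would indicate; the environment variable name NMDC_API_TOKEN and the helper auth_headers are hypothetical.

```python
import os
from typing import Dict, Optional

# Hypothetical environment variable name; choose whatever suits your setup.
TOKEN_ENV_VAR = "NMDC_API_TOKEN"


def auth_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Return request headers carrying the access token.

    Assumption: the API expects "Authorization: Bearer <token>".
    """
    token = token or os.environ.get(TOKEN_ENV_VAR)
    if not token:
        raise RuntimeError(f"no access token given and {TOKEN_ENV_VAR} is not set")
    return {"Authorization": f"Bearer {token}"}


# Placeholder value; paste the token revealed by the "Show token" button instead.
print(auth_headers("example-token"))
```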
Accessing a “private” Queries API endpoint
Now that you are logged into the NMDC API GUI, you can use the NMDC API GUI to access “private” API endpoints.
Here’s how you can access a “private” Queries API endpoint:
Visit the NMDC API GUI in your web browser if you aren’t already there.
Confirm the “Authorize” button has a closed padlock icon on it, indicating that you are logged in.
Scroll down to the “Queries” group of API endpoints.
Click the POST /queries:run section (which has a padlock icon next to it) to expand it.
Click the “Try it out” button next to the “Parameters” heading.
Populate the “Request body” field with the following JSON snippet:
{ "find": "study_set", "filter": {"ecosystem_category": "Aquatic"} }
Click the “Execute” button.
The NMDC API GUI will send an HTTP request to the NMDC API and display the response from the NMDC API.
Notice that the “Curl” command includes an Authorization header that contains your access token. If you were making the API request via your command line instead of via the NMDC API GUI, you could include that same header in order to access “private” API endpoints.
View the API response body in the “Response body” section.
The API response body is a JSON object having several properties, including ok and cursor. The cursor property contains an object having a firstBatch property, which contains the array of results that met the filter criteria specified in the API request. In this case, it contains all studies having an ecosystem_category value of “Aquatic”.
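The steps above can be sketched in Python as follows. The example prepares the same POST /queries:run request and shows how the results could be pulled out of the response structure described above. The base URL, the Bearer scheme for the Authorization header, and the helper names are assumptions; the response dictionary at the end is a placeholder used only to show the shape, not real NMDC data.

```python
import json
from urllib.request import Request

# Assumption: public base URL of the NMDC API.
BASE_URL = "https://api.microbiomedata.org"


def build_query_request(token: str, collection: str, mongo_filter: dict) -> Request:
    """Prepare (but do not send) a POST /queries:run request."""
    body = json.dumps({"find": collection, "filter": mongo_filter}).encode("utf-8")
    return Request(
        f"{BASE_URL}/queries:run",
        data=body,
        method="POST",
        headers={
            # Assumption: the access token is sent as a Bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def first_batch(response_body: dict) -> list:
    """Return the array of results stored under cursor.firstBatch."""
    if not response_body.get("ok"):
        raise ValueError("the query did not report success")
    return response_body["cursor"]["firstBatch"]


request = build_query_request(
    "example-token", "study_set", {"ecosystem_category": "Aquatic"}
)
print(request.get_method(), request.full_url)

# Placeholder response with the documented shape (not real NMDC data).
placeholder_response = {"ok": 1, "cursor": {"firstBatch": [{"id": "placeholder"}]}}
print(first_batch(placeholder_response))
```

To actually send the request, pass it to urllib.request.urlopen and decode the JSON body before calling first_batch; this requires network access and a valid token.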