How do I find data using code?


Here are our recommended approaches for finding data with code, from the command line or a notebook.

In Python, we can use the earthaccess library (formerly named earthdata).

To install the package, run the following from the command line. Note: you can run shell commands directly from a Jupyter Notebook cell by prefixing them with !, so the command becomes !conda install -c conda-forge earthaccess.

[command line code]
# Install earthaccess
conda install -c conda-forge earthaccess
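
After installing, it can be useful to confirm that the package is visible to the Python environment you are working in. The snippet below is a minimal sketch using only the standard library; `package_status` is a hypothetical helper name, not part of earthaccess.

```python
import importlib.util
from importlib import metadata


def package_status(name):
    """Report whether a package can be imported, and its version if known."""
    # find_spec returns None when a top-level package cannot be found
    if importlib.util.find_spec(name) is None:
        return f"{name} is not installed"
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata is available
        version = "unknown"
    return f"{name} {version} is installed"


print(package_status("earthaccess"))
```

If the message says the package is not installed, check that the notebook kernel is using the same conda environment in which you ran the install command.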

This example searches for data from the Land Processes DAAC with a spatial bounding box and temporal range.

[Python code]
# Import packages (the library was renamed from earthdata to earthaccess)
from earthaccess import DataGranules, DataCollections
from pprint import pprint

# We'll get 4 cloud-hosted collections that match our keyword of interest
collections = DataCollections().keyword("REFLECTANCE").cloud_hosted(True).get(4)

# Let's print the summary and abstract of the first 2 collections
for collection in collections[0:2]:
    pprint(collection.summary())  # pprint prints directly and returns None
    print(collection.abstract(), "\n")

# Search for files from the second dataset result over a small plot in
# Nebraska, USA for two weeks in September 2022
granule_query = (
    DataGranules()
    .concept_id("C2021957657-LPCLOUD")
    .temporal("2022-09-10", "2022-09-24")
    .bounding_box(-101.67271, 41.04754, -101.65344, 41.06213)
)
granules = granule_query.get()  # run the query and return the matching granules
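
The same spatial and temporal search can also be written with the higher-level `earthaccess.search_data` function, which is the call the R example on this page uses. The sketch below assumes earthaccess is installed and that a network connection is available when the search is actually run; `build_search_params` and `run_search` are hypothetical helper names introduced here for illustration, and the bounding-box check does not handle boxes that cross the antimeridian.

```python
def build_search_params(concept_id, start, end, bbox, count=10):
    """Assemble keyword arguments for earthaccess.search_data (hypothetical helper)."""
    west, south, east, north = bbox
    # Sanity-check the bounding box: (west, south, east, north) in degrees
    if not (-180 <= west <= east <= 180):
        raise ValueError("longitudes must satisfy -180 <= west <= east <= 180")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    return {
        "concept_id": concept_id,
        "temporal": (start, end),
        "bounding_box": (west, south, east, north),
        "count": count,  # maximum number of granules to return
    }


def run_search(params):
    """Run the search; requires earthaccess and network access."""
    # Imported here so the parameter-building code works without earthaccess
    import earthaccess

    return earthaccess.search_data(**params)


params = build_search_params(
    "C2021957657-LPCLOUD",
    "2022-09-10",
    "2022-09-24",
    (-101.67271, 41.04754, -101.65344, 41.06213),
)
print(params)
```

Calling `run_search(params)` then performs the query. Keeping the parameters in a plain dictionary makes it easy to reuse the same search for a different collection or date range.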

To find data in R, we’ll also use the earthaccess Python package, which we can call from R using the reticulate package (cheatsheet). Note below that we import the Python library as an R object named earthaccess, and then use the earthaccess$ syntax to call functions from the library. The resulting granules object is a list of JSON-like dictionaries, one per granule, each containing some additional nested dictionaries.

[R code]
## load R libraries
library(tidyverse) # install.packages("tidyverse") 
library(reticulate) # install.packages("reticulate")

## load python library
earthaccess <- reticulate::import("earthaccess") 

# use earthaccess to access data #
granules <- earthaccess$search_data(
  concept_id = "C2036880672-POCLOUD",
  temporal = reticulate::tuple("2017-01", "2017-02") # with an earthaccess update, this can be simply c() or list()
)

## Granules found: 72

## exploring
granules # this is the result of the get request. 

class(granules) # "list"
## granules <- reticulate::py_to_r(granules) # Object to convert is not a Python object

Matlab code coming soon!

[Matlab code]
% Coming soon!

With wget and curl:

[command line code]
# Coming soon!