Find Data: Programmatic Search

Introduction

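We can find data programmatically, using code instead of a point-and-click search. The examples below use Python and R; Matlab and command-line examples are planned.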

Code

Here are our recommended approaches for finding data with code.

In Python we can use the earthaccess library (previously named earthdata).

To install the package, run the following from the command line. Note: you can run shell commands directly from a Jupyter Notebook cell by prefixing them with !, so here the command would start with !conda install.

[command line code]
# Install earthaccess
conda install -c conda-forge earthaccess
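
After installing, it can help to confirm that Python can find the package in the active environment. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name and is not part of earthaccess.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named top-level package."""
    # find_spec returns None when no importable package with that name exists
    return importlib.util.find_spec(package_name) is not None


if is_installed("earthaccess"):
    print("earthaccess is available")
else:
    print("earthaccess not found; try: conda install -c conda-forge earthaccess")
```

If the check fails inside a notebook, the notebook kernel may be using a different environment from the one where the package was installed.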

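This example searches one collection from the POCLOUD provider, identified by its concept ID, for granules between January 2017 and January 2018.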

[python code]
## Import earthaccess
import earthaccess

## Search for granules (returns metadata; nothing is downloaded yet)
granules = earthaccess.search_data(
    concept_id="C2036880672-POCLOUD",
    temporal=("2017-01", "2018-01"),  # the () syntax makes this a tuple
)
## Granules found: 72 

granules
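
Once a search returns, a common next step is to see how many granules matched and where their files live. The sketch below assumes each result exposes a `data_links()` method, as earthaccess granule objects do. `collect_data_links` and `FakeGranule` are hypothetical names, and the example.com URLs are placeholders so that the block runs without a network connection.

```python
def collect_data_links(granules):
    """Return a flat list of file URLs from a list of search results."""
    links = []
    for granule in granules:
        # Each granule may point at more than one file
        links.extend(granule.data_links())
    return links


class FakeGranule:
    """Stand-in for a search result; only mimics data_links()."""

    def __init__(self, urls):
        self._urls = list(urls)

    def data_links(self):
        return list(self._urls)


# Placeholder results; with real data, pass the list from earthaccess.search_data
demo = [
    FakeGranule(["https://example.com/a.nc"]),
    FakeGranule(["https://example.com/b.nc", "https://example.com/c.nc"]),
]
demo_links = collect_data_links(demo)
print(len(demo), "granules,", len(demo_links), "links")
```

With real search results, `collect_data_links(granules)` would be called on the list returned above, and `len(granules)` gives the number of matches.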

To find data in R, we’ll also use the earthaccess Python package, which we can call from R using the reticulate package (cheatsheet). Note below that we import the Python library as an R object named earthaccess, and that we use the earthaccess$ syntax to access functions from the library.

[R code]
## load R libraries
library(tidyverse) # install.packages("tidyverse") 
library(reticulate) # install.packages("reticulate")

## load python library
earthaccess <- reticulate::import("earthaccess") 

# Then we use earthaccess to build a Query with spatiotemporal parameters
# https://nsidc.github.io/earthaccess/tutorials/search-granules/
granules <- earthaccess$search_data(
  concept_id = "C2036880672-POCLOUD",
  temporal = reticulate::tuple("2017-01", "2018-01") # with an earthaccess update, this can be simply c() or list()
)

## Granules found: 72

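## exploring
granules
class(granules) # "list"
## reticulate has already converted the result to an R list, so this call is
## not needed; running it errors with "Object to convert is not a Python object"
## granules <- py_to_r(granules)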

## Next steps - 
## str(granules) %>% jsonlite::fromJSON() ## revisit, talk to Bri et al
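
One possible route for the JSON step noted above is to serialise the results on the Python side and parse that string in R. The sketch below assumes each search result behaves like a dictionary, which is an assumption about earthaccess results rather than something this page confirms. `granules_to_json` is a hypothetical helper, and the sample record, including its ID, is a made-up stand-in.

```python
import json


def granules_to_json(granules):
    """Serialise dict-like search results into one JSON array string."""
    # dict(g) copies each record into a plain dict; default=str stringifies
    # any value that json cannot encode natively (an assumption about the payload)
    return json.dumps([dict(g) for g in granules], default=str)


# Stand-in record; real results would come from earthaccess.search_data(...)
sample = [{"meta": {"concept-id": "G0000000000-EXAMPLE"}, "size": 1.5}]
as_json = granules_to_json(sample)
print(as_json)
```

A JSON string like this could then be handed to jsonlite::fromJSON() in R, though whether the resulting structure is convenient to work with would need checking against real results.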

Matlab code coming soon!

[Matlab code]
% Coming soon!

With wget and curl:

[command line code]
# Coming soon!