In this notebook, we will access data for the Harmonized Landsat Sentinel-2 (HLS) Operational Land Imager Surface Reflectance and TOA Brightness Daily Global 30m v2.0 (L30) ([10.5067/HLS/HLSL30.002](https://doi.org/10.5067/HLS/HLSL30.002)) data product. These data are archived and distributed as Cloud Optimized GeoTIFF (COG) files, one file for each spectral band.
We will access a single COG file, L30 red band (0.64 – 0.67 μm), from inside the AWS cloud (us-west-2 region, specifically) and load it into Python as an xarray dataarray. This approach leverages S3 native protocols for efficient access to the data.
NASA Earthdata Cloud data in S3 can be directly accessed via temporary credentials; this access is limited to requests made within the US West (Oregon) (code: us-west-2) AWS region.
An Earthdata Login account is required to access data, as well as to discover restricted data, from the NASA Earthdata system. Please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. The account is free to create and takes only a moment to set up.
You will need a netrc file containing your NASA Earthdata Login credentials in order to execute the notebooks. A netrc file can be created manually in a text editor and saved to your home directory. For additional information see: [Authentication for NASA Earthdata](https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/04_NASA_Earthdata_Authentication.html#authentication-via-netrc-file).
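If you would rather create the netrc file from Python than from a text editor, the following is a minimal sketch. The helper name `write_netrc_entry` and the placeholder credentials `my_username` / `my_password` are ours, not part of any NASA tooling; the sketch assumes the standard one-line netrc layout (`machine ... login ... password ...`) for the Earthdata Login host. To keep the demo harmless it writes into a temporary directory; in real use the path would point at the netrc file in your home directory.

```python
import netrc
import os
import tempfile

EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def write_netrc_entry(path, login, password, host=EARTHDATA_HOST):
    """Write a one-entry netrc file (hypothetical helper for illustration)."""
    entry = f"machine {host} login {login} password {password}\n"
    # Request owner-only permissions; the mode applies when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(entry)
    return path


# Demo with placeholder credentials, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = write_netrc_entry(os.path.join(tmp, ".netrc"),
                                  "my_username", "my_password")
    # Read the file back with the standard-library parser to check the format.
    parsed_login, _, parsed_password = netrc.netrc(demo_path).authenticators(EARTHDATA_HOST)
```

Reading the file back with the standard-library `netrc` parser is only a format check; it does not verify the credentials against Earthdata Login.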
how to configure your Python and R work environment to access Cloud Optimized GeoTIFF (COG) files
how to access HLS COG files
how to plot the data
Using Harmonized Landsat Sentinel-2 (HLS) version 2.0
[python code]
import os
from osgeo import gdal
import rasterio as rio
import rioxarray
import hvplot.xarray
import holoviews as hv
[R code]
library(rgdal)
library(raster)
library(terra)
For this exercise, we are going to open up a context manager for the notebook using the rasterio.env module to store the required GDAL configurations we need to access the data from Earthdata Cloud. While the context manager is open (rio_env.__enter__()) we will be able to run the open or get data commands that would typically be executed within a with statement, thus allowing us to more freely interact with the data. We’ll close the context (rio_env.__exit__()) at the end of the notebook.
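To make the manual `__enter__()` / `__exit__()` pattern concrete, here is a minimal sketch using a hypothetical `DemoEnv` class. It is a stand-in written for this illustration, not part of rasterio, and it only tracks whether the context is open. The point is that calling the two methods by hand brackets the same "active" period that a `with` block would.

```python
class DemoEnv:
    """Hypothetical stand-in for a context manager such as rasterio.Env."""

    def __init__(self, **options):
        self.options = options
        self.active = False

    def __enter__(self):
        # Called at the start of a `with` block, or explicitly by hand.
        self.active = True
        return self

    def __exit__(self, exc_type=None, exc_val=None, exc_tb=None):
        # Called at the end of a `with` block, or explicitly by hand.
        self.active = False
        return False  # do not suppress exceptions


# Usual form: the context is active only inside the `with` block.
with DemoEnv(GDAL_DISABLE_READDIR_ON_OPEN='TRUE') as env:
    inside_with = env.active
after_with = env.active

# Manual form: the context stays active across statements (or notebook cells)
# until __exit__() is called.
manual_env = DemoEnv(GDAL_DISABLE_READDIR_ON_OPEN='TRUE')
manual_env.__enter__()
inside_manual = manual_env.active
manual_env.__exit__()
after_manual = manual_env.active
```

The trade-off of the manual form is that nothing closes the context automatically if a cell fails, so the closing call has to be run explicitly.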
GDAL environment variables must be configured to access COGs from Earthdata Cloud. Geospatial data access Python packages like rasterio and rioxarray depend on GDAL, leveraging GDAL’s “Virtual File Systems” to read remote files. GDAL has many environment variables that control its behavior. Changing these settings can mean the difference between being able to access a file or not. They can also affect performance.
[python code]
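# Store the GDAL options needed for Earthdata Cloud access in a rasterio environment
rio_env = rio.Env(GDAL_DISABLE_READDIR_ON_OPEN='TRUE',
                  GDAL_HTTP_COOKIEFILE=os.path.expanduser('~/cookies.txt'),
                  GDAL_HTTP_COOKIEJAR=os.path.expanduser('~/cookies.txt'))
# Open the context manually so it stays active across notebook cells
rio_env.__enter__()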
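GDAL configuration options can generally also be supplied as ordinary process environment variables. The sketch below shows that alternative for the same three options; it is not what this notebook does, and it assumes the variables are set before any file is opened. The `gdal_options` and `cookie_path` names are ours.

```python
import os

# Same cookie file location as used in the notebook's rasterio environment.
cookie_path = os.path.expanduser('~/cookies.txt')

# Assumed alternative: expose the GDAL options as process environment variables.
gdal_options = {
    'GDAL_DISABLE_READDIR_ON_OPEN': 'TRUE',
    'GDAL_HTTP_COOKIEFILE': cookie_path,
    'GDAL_HTTP_COOKIEJAR': cookie_path,
}
os.environ.update(gdal_options)
```

Unlike the context manager, environment variables set this way stay in place for the rest of the Python session unless you remove them yourself.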
Set up rgdal configurations to access the cloud assets that we are interested in. You can learn more about these configuration options here.
[R code]
setGDALconfig(c("GDAL_HTTP_UNSAFESSL=YES",
"GDAL_HTTP_COOKIEFILE=.rcookies",
"GDAL_HTTP_COOKIEJAR=.rcookies",
"GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR",
"CPL_VSIL_CURL_ALLOWED_EXTENSIONS=TIF"))
In this example we’re interested in the HLS L30 data collection from NASA’s LP DAAC in Earthdata Cloud. Below we specify the HTTPS URL to the data asset in Earthdata Cloud. This URL can be found via Earthdata Search or programmatically through the CMR and CMR-STAC APIs.
[python code]
https_url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T11SQA.2021333T181532.v2.0/HLS.L30.T11SQA.2021333T181532.v2.0.B04.tif'
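Because HLS stores one COG per spectral band, the band code is the last dotted component of the file name before `.tif` (here `B04`, the red band). As a minimal sketch, a hypothetical helper `swap_band` can derive the URL for another band of the same granule. This assumes the other band's file follows the same naming pattern and exists at that location; the URLs returned by Earthdata Search or CMR remain the authoritative source.

```python
https_url = ('https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/'
             'HLSL30.020/HLS.L30.T11SQA.2021333T181532.v2.0/'
             'HLS.L30.T11SQA.2021333T181532.v2.0.B04.tif')


def swap_band(url, band):
    """Replace the band code in an HLS COG URL (hypothetical helper)."""
    stem, ext = url.rsplit('.', 1)             # split off the 'tif' extension
    prefix, _old_band = stem.rsplit('.', 1)    # split off the band code, e.g. 'B04'
    return f"{prefix}.{band}.{ext}"


# B05 is the near-infrared band for L30; its URL is assumed, not looked up.
nir_url = swap_band(https_url, 'B05')
```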
Please note that in R, we need to add /vsicurl/ manually to the COG file URL.
[R code]
https_url <- '/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T11SQA.2021333T181532.v2.0/HLS.L30.T11SQA.2021333T181532.v2.0.B04.tif'
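In the Python workflow of this notebook the plain HTTPS URL is passed to rioxarray directly, so no prefix is needed there. If you ever need the GDAL-style path in Python (for example, to hand it to a tool that expects one), the prefixing step can be sketched as below. The helper name `to_vsicurl` is hypothetical.

```python
VSICURL_PREFIX = '/vsicurl/'


def to_vsicurl(url):
    """Prefix an HTTP(S) URL with GDAL's /vsicurl/ handler (hypothetical helper)."""
    if url.startswith(VSICURL_PREFIX):
        return url  # already a /vsicurl/ path, leave unchanged
    if not url.startswith(('http://', 'https://')):
        raise ValueError(f"expected an HTTP(S) URL, got: {url!r}")
    return VSICURL_PREFIX + url


plain_url = ('https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/'
             'HLSL30.020/HLS.L30.T11SQA.2021333T181532.v2.0/'
             'HLS.L30.T11SQA.2021333T181532.v2.0.B04.tif')
vsicurl_path = to_vsicurl(plain_url)
```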
Read the HLS HTTPS URL for the L30 red band (0.64 – 0.67 μm) into our workspace. Note that accessing files in the cloud requires you to authenticate with your NASA Earthdata Login account, so a proper netrc file needs to be set up.
We are using rioxarray, an extension of xarray used to read geospatial data.
[python code]
da = rioxarray.open_rasterio(https_url)
da
The file is read into Python as an xarray dataarray with band, x, and y dimensions. In this example the band dimension is meaningless because the file holds a single band, so we’ll use the squeeze() method to remove band as a dimension.
[python code]
da_red = da.squeeze('band', drop=True)
da_red
Plot the dataarray, representing the L30 red band, using hvplot.
[python code]
da_red.hvplot.image(x='x', y='y', cmap='gray', aspect='equal')
Exit the context manager.
[python code]
rio_env.__exit__()
[R code]
da_red <- rast(https_url)
da_red
Convert the SpatRaster object to a Raster object using raster() to be able to use leaflet to plot our data.
[R code]
# Call raster() directly; the %>% pipe is not provided by the packages loaded above
red_raster <- raster(da_red)
red_raster
Then plot the red band using the plot function.
[R code]
plot(red_raster)