How do I subset data granules?
How do I subset a data granule using Harmony?
Install the harmony-py package:
# Install harmony-py
pip install -U harmony-py
Import packages:
import datetime as dt
from harmony import BBox, Client, Collection, Request, LinkType
import s3fs
import xarray as xr
Set up Harmony client and authentication
We will authenticate the following Harmony request using a netrc file. See the appendix for more information on Earthdata Login and netrc setup. The line below creates a Harmony Client and assumes that a .netrc file is available.
harmony_client = Client()
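Because the client relies on a .netrc entry, it can help to confirm that one exists before submitting anything. The following is a minimal sketch using only the Python standard library; the helper name `has_netrc_entry` is hypothetical (it is not part of harmony-py), and the demonstration uses a throwaway file with made-up credentials rather than your real `~/.netrc`.

```python
import netrc
import os
import tempfile

EDL_HOST = "urs.earthdata.nasa.gov"  # Earthdata Login host


def has_netrc_entry(path, host=EDL_HOST):
    """Return True if the netrc file at `path` holds a login and password for `host`."""
    try:
        auth = netrc.netrc(path).authenticators(host)
    except (FileNotFoundError, netrc.NetrcParseError):
        return False
    # authenticators() returns (login, account, password) or None
    return auth is not None and bool(auth[0]) and bool(auth[2])


# Demonstrate with a temporary file and made-up credentials
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, ".netrc")
    with open(demo_path, "w") as fh:
        fh.write(f"machine {EDL_HOST} login demo_user password demo_pass\n")
    demo_found = has_netrc_entry(demo_path)
    demo_missing = has_netrc_entry(os.path.join(tmp, "no_such_file"))

print(demo_found, demo_missing)
```

To check your own setup, pass the path to your netrc file, for example `has_netrc_entry(os.path.expanduser("~/.netrc"))`.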
Create and submit Harmony request
We are interested in the GHRSST Level 4 MUR Global Foundation Sea Surface Temperature Analysis dataset. We are subsetting over the Pacific Ocean to the west of Mexico between 1:00 and 2:00 on 10 March 2021. The dataset is organized into daily files, so although we specify a single hour in our request, it will return that full day’s worth of data.
dataset_short_name = 'MUR-JPL-L4-GLOB-v4.1'

request = Request(
    collection=Collection(id=dataset_short_name),
    spatial=BBox(-125.469, 15.820, -99.453, 35.859),
    temporal={
        'start': dt.datetime(2021, 3, 10, 1),
        'stop': dt.datetime(2021, 3, 10, 2)
    }
)

job_id = harmony_client.submit(request)

harmony_client.wait_for_processing(job_id)
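The bounding box above is given in west, south, east, north order, and the temporal range as a start and stop datetime. A swapped coordinate is an easy mistake to make, so the sketch below shows one way to sanity-check these values locally before building a request. The function `check_subset_parameters` is a hypothetical helper written for illustration, not part of harmony-py, and it deliberately does not flag west greater than east, since that can describe a box crossing the antimeridian.

```python
import datetime as dt


def check_subset_parameters(west, south, east, north, start, stop):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        problems.append("longitudes must be within [-180, 180]")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        problems.append("latitudes must be within [-90, 90]")
    if south >= north:
        problems.append("south must be less than north")
    if start >= stop:
        problems.append("start must be earlier than stop")
    return problems


# Values from the request above
problems = check_subset_parameters(
    -125.469, 15.820, -99.453, 35.859,
    dt.datetime(2021, 3, 10, 1),
    dt.datetime(2021, 3, 10, 2),
)
print(problems)
```

An empty list here only means the values passed these basic checks; Harmony itself still decides whether the request is acceptable.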
Open and read the subsetted file in xarray
Harmony data outputs can be accessed within the cloud using the s3 URLs and AWS credentials provided in the Harmony job response. Using aws_credentials, we can retrieve the credentials needed to access the Harmony s3 staging bucket and its contents. We then use the s3fs package to create a file system that can be read by xarray.
results = harmony_client.result_urls(job_id, link_type=LinkType.s3)
urls = list(results)
url = urls[0]

creds = harmony_client.aws_credentials()

s3_fs = s3fs.S3FileSystem(
    key=creds['aws_access_key_id'],
    secret=creds['aws_secret_access_key'],
    token=creds['aws_session_token'],
    client_kwargs={'region_name': 'us-west-2'},
)

f = s3_fs.open(url, mode='rb')
ds = xr.open_dataset(f)
ds
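If you need to rebuild the file system more than once (for example, after fetching fresh credentials), the mapping from the credentials dictionary to the s3fs arguments can be wrapped in a small function. This is a sketch only: `s3fs_kwargs_from_creds` is a hypothetical helper, the dictionary keys are the ones used in the code above, and the values shown are placeholders rather than real credentials.

```python
def s3fs_kwargs_from_creds(creds, region="us-west-2"):
    """Map a credentials dictionary onto keyword arguments for s3fs.S3FileSystem."""
    required = ("aws_access_key_id", "aws_secret_access_key", "aws_session_token")
    missing = [name for name in required if name not in creds]
    if missing:
        raise KeyError(f"credentials are missing: {', '.join(missing)}")
    return {
        "key": creds["aws_access_key_id"],
        "secret": creds["aws_secret_access_key"],
        "token": creds["aws_session_token"],
        "client_kwargs": {"region_name": region},
    }


# Placeholder values standing in for the output of harmony_client.aws_credentials()
fake_creds = {
    "aws_access_key_id": "EXAMPLE_KEY_ID",
    "aws_secret_access_key": "EXAMPLE_SECRET",
    "aws_session_token": "EXAMPLE_TOKEN",
}
kwargs = s3fs_kwargs_from_creds(fake_creds)
print(sorted(kwargs))
```

With real credentials, the result would be passed along as `s3fs.S3FileSystem(**kwargs)`.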
Plot data
Use the xarray built-in plotting function to create a simple plot along the x and y dimensions of the dataset:
ds.analysed_sst.plot();
R code coming soon!
# Coming soon!
Matlab code coming soon!
# Coming soon!
With wget and curl:
# Coming soon!
How do I subset an OPeNDAP granule in the cloud?
How do I subset a data granule using xarray?
How do I download a subset of NetCDF-4?
this might be a deprecated idea