Using icepyx to access ICESat-2 data

!pip install icepyx==0.8.1 -qq

What is icepyx?

icepyx is a community and software library for searching, downloading, and reading ICESat-2 data. While opening data should be straightforward, there are some oddities in navigating the highly nested organization and hundreds of variables of the ICESat-2 data. icepyx provides tools to help with those oddities.

Fitting icepyx into the data access package landscape

For ICESat-2 data, the icepyx package can: - search for available data granules (data files) - order and download data - order a subset of data: clipped in space, time, containing fewer variables, or a few other options provided by NSIDC - provides functionality to search through the available data variables - read ICESat-2 data into xarray DataArrays, including merging data from multiple files

What is ICESat-2?

IS2

ICESat-2 carries a satellite lidar instrument, ATLAS. Lidar is an active remote sensing technique in which pulses of light are emitted and the return time is used to measure distance. The available ICESat-2 data products range from sea ice freeboard to land elevation to cloud backscatter characteristics. A list of availble products can be found here. In this tutorial we will look at the ATL08 Land Water Vegetation Elevation product.

Data Collection

ICESat-2 measures data along 3 strong/weak beam pairs, resulting in 3 strong beams and 3 weak beams. The strong and weak beams are calibrated such that the weak beams have more sensitivity to viewing very bright surfaces (Ex. ice), while the strong beams are able to view surfaces with lower reflectances (Ex. water). The beams are designated in each data product as gt1l, gt1r, gt2l, gt2r, gt3l, and gt3r, where gt stands for “ground track”, the number refers to the photon emitter, and the l and r indicate “left” or “right” beam of the pair. Which of these designations is strong or weak depends on the orientation of the satellite (forwards, sc_orient==1; backwards, sc_orient==0). A helpful table of which beams are strong/weak can be found on p131 of the ATL03 Algorithm Theoretical Basis Document. The ATLAS spot number (values 1-6) is based on the ground track designation (gt1l etc.) and spacecraft orientation and, once determined can be used to consistently identify strong (Spots 1, 3, and 5) and weak (Spots 2, 4, and 6) beams.

Tracks

Photo: Neuenschwander et. al. 2019, Remote Sens. Env. DOI

Counting Photons

The ICESat-2 lidar collects at the single photon level, different from most commercial lidar systems. A lot of additional photons get returned as solar background noise, and removing these unwanted photons is a key part of the algorithms that produce the higher level data products.

Fig. 2. Results from signal finding methods for simulated ATLAS data. Black points show raw point cloud data as ingested from ATL03 product. Blue points overlaid in each plot show which photons each method identified as signal. Top panel reflects the signal photons as identified on the ATL03 data product (medium and high confidence signal photons). Bottom panel reflects the signal photons identified from the ATL08 DRAGANN method. (Neuenschwander & Pitts, 2019)

Photo: Neuenschwander et. al. 2019, Remote Sens. Env. DOI

To aggregate all these photons into more manegable chunks ATL08 consolidates the photons into 100m segments, each made up of five 20m segments.

Using icepyx to search for data

We won’t dive into using icepyx to search for and download data in this tutorial, since we already discussed how to do that with earthaccess. The code to search and download is still provided below for the curious reader. The icepyx documentation shows more detail about different search parameters and how to inspect the results of a query.

import icepyx as ipx
import json
import math

import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import shape, GeometryCollection
%matplotlib inline
# Open a geojson of our area of interest
with open("bosque_primavera.json") as f:
    features = json.load(f)["features"]

bosque = GeometryCollection([shape(feature["geometry"]).buffer(0) for feature in features])
# Use our search parameters to setup a search Query
short_name = 'ATL08'
spatial_extent = list(bosque.bounds)
date_range = ['2019-05-04','2019-05-04']
region = ipx.Query(short_name, spatial_extent, date_range)
# Display if any data files, or granules, matched our search
region.avail_granules(ids=True)
[['ATL08_20190504124152_05540301_006_02.h5']]
# Download the granules to a into a folder called 'bosque_primavera_ATL08'
region.download_granules('./bosque_primavera_ATL08')
EARTHDATA_USERNAME and EARTHDATA_PASSWORD are not set in the current environment, try setting them or use a different strategy (netrc, interactive)
You're now authenticated with NASA Earthdata Login
Using token with expiration date: 12/15/2023
Using .netrc file for EDL
Total number of data order requests is  1  for  1  granules.
Data request  1  of  1  is submitting to NSIDC
order ID:  5000004610385
Initial status of your order request at NSIDC is:  processing
Your order status is still  processing  at NSIDC. Please continue waiting... this may take a few moments.
Your order is: complete
Beginning download of zipped output...
Data request 5000004610385 of  1  order(s) is downloaded.
Download complete
Tip: If you don’t want to type your earthdata login information every time they are required you can setup more automatic methods of authentication. Two common methods are 1) Add your earthdata password and username to as environment variables as EARTHDATA_USERNAME and EARTHDATA_PASSWORD. 2) setup a .netrc file in your home directory. See the Openscapes tutorial

Reading a file with icepyx

To read a file with icepyx there are several steps: 1. Create a Read object. This sets up an initial connection to your file(s) and validates the metadata. 2. Tell the Read object what variables you would like to read 3. Load your data!

Create a Read object

pattern = "processed_ATL{product:2}_{datetime:%Y%m%d%H%M%S}_{rgt:4}{cycle:2}{orbitsegment:2}_{version:3}_{revision:2}.h5"
reader = ipx.Read('./bosque_primavera_ATL08', "ATL08", pattern)
You have 1 files matching the filename pattern to be read in.
reader
<icepyx.core.read.Read at 0x7f3d05e72e90>

Select your variables

To view the variables contained in your dataset you can call .vars on your data reader.

reader.vars.avail()
['ancillary_data/atlas_sdp_gps_epoch',
 'ancillary_data/control',
 'ancillary_data/data_end_utc',
 'ancillary_data/data_start_utc',
 'ancillary_data/end_cycle',
 'ancillary_data/end_delta_time',
 'ancillary_data/end_geoseg',
 'ancillary_data/end_gpssow',
 'ancillary_data/end_gpsweek',
 'ancillary_data/end_orbit',
 'ancillary_data/end_region',
 'ancillary_data/end_rgt',
 'ancillary_data/granule_end_utc',
 'ancillary_data/granule_start_utc',
 'ancillary_data/land/atl08_region',
 'ancillary_data/land/bin_size_h',
 'ancillary_data/land/bin_size_n',
 'ancillary_data/land/bright_thresh',
 'ancillary_data/land/ca_class',
 'ancillary_data/land/can_noise_thresh',
 'ancillary_data/land/can_stat_thresh',
 'ancillary_data/land/canopy20m_thresh',
 'ancillary_data/land/canopy_flag_switch',
 'ancillary_data/land/canopy_seg',
 'ancillary_data/land/class_thresh',
 'ancillary_data/land/cloud_filter_switch',
 'ancillary_data/land/del_amp',
 'ancillary_data/land/del_mu',
 'ancillary_data/land/del_sigma',
 'ancillary_data/land/dem_filter_switch',
 'ancillary_data/land/dem_removal_percent_limit',
 'ancillary_data/land/dragann_switch',
 'ancillary_data/land/dseg',
 'ancillary_data/land/dseg_buf',
 'ancillary_data/land/fnlgnd_filter_switch',
 'ancillary_data/land/gnd_stat_thresh',
 'ancillary_data/land/gthresh_factor',
 'ancillary_data/land/h_canopy_perc',
 'ancillary_data/land/iter_gnd',
 'ancillary_data/land/iter_max',
 'ancillary_data/land/lseg',
 'ancillary_data/land/lseg_buf',
 'ancillary_data/land/lw_filt_bnd',
 'ancillary_data/land/lw_gnd_bnd',
 'ancillary_data/land/lw_toc_bnd',
 'ancillary_data/land/lw_toc_cut',
 'ancillary_data/land/max_atl03files',
 'ancillary_data/land/max_atl09files',
 'ancillary_data/land/max_peaks',
 'ancillary_data/land/max_try',
 'ancillary_data/land/min_nphs',
 'ancillary_data/land/n_dec_mode',
 'ancillary_data/land/night_thresh',
 'ancillary_data/land/noise_class',
 'ancillary_data/land/outlier_filter_switch',
 'ancillary_data/land/p_static',
 'ancillary_data/land/ph_removal_percent_limit',
 'ancillary_data/land/proc_geoseg',
 'ancillary_data/land/psf',
 'ancillary_data/land/ref_dem_limit',
 'ancillary_data/land/ref_finalground_limit',
 'ancillary_data/land/relief_hbot',
 'ancillary_data/land/relief_htop',
 'ancillary_data/land/shp_param',
 'ancillary_data/land/sig_rsq_search',
 'ancillary_data/land/sseg',
 'ancillary_data/land/stat20m_thresh',
 'ancillary_data/land/stat_thresh',
 'ancillary_data/land/tc_thresh',
 'ancillary_data/land/te_class',
 'ancillary_data/land/terrain20m_thresh',
 'ancillary_data/land/toc_class',
 'ancillary_data/land/up_filt_bnd',
 'ancillary_data/land/up_gnd_bnd',
 'ancillary_data/land/up_toc_bnd',
 'ancillary_data/land/up_toc_cut',
 'ancillary_data/land/yapc_switch',
 'ancillary_data/qa_at_interval',
 'ancillary_data/release',
 'ancillary_data/start_cycle',
 'ancillary_data/start_delta_time',
 'ancillary_data/start_geoseg',
 'ancillary_data/start_gpssow',
 'ancillary_data/start_gpsweek',
 'ancillary_data/start_orbit',
 'ancillary_data/start_region',
 'ancillary_data/start_rgt',
 'ancillary_data/version',
 'ds_geosegments',
 'ds_metrics',
 'ds_surf_type',
 'gt3l/land_segments/asr',
 'gt3l/land_segments/atlas_pa',
 'gt3l/land_segments/beam_azimuth',
 'gt3l/land_segments/beam_coelev',
 'gt3l/land_segments/brightness_flag',
 'gt3l/land_segments/canopy/can_noise',
 'gt3l/land_segments/canopy/canopy_h_metrics',
 'gt3l/land_segments/canopy/canopy_h_metrics_abs',
 'gt3l/land_segments/canopy/canopy_openness',
 'gt3l/land_segments/canopy/canopy_rh_conf',
 'gt3l/land_segments/canopy/centroid_height',
 'gt3l/land_segments/canopy/h_canopy',
 'gt3l/land_segments/canopy/h_canopy_20m',
 'gt3l/land_segments/canopy/h_canopy_abs',
 'gt3l/land_segments/canopy/h_canopy_quad',
 'gt3l/land_segments/canopy/h_canopy_uncertainty',
 'gt3l/land_segments/canopy/h_dif_canopy',
 'gt3l/land_segments/canopy/h_max_canopy',
 'gt3l/land_segments/canopy/h_max_canopy_abs',
 'gt3l/land_segments/canopy/h_mean_canopy',
 'gt3l/land_segments/canopy/h_mean_canopy_abs',
 'gt3l/land_segments/canopy/h_median_canopy',
 'gt3l/land_segments/canopy/h_median_canopy_abs',
 'gt3l/land_segments/canopy/h_min_canopy',
 'gt3l/land_segments/canopy/h_min_canopy_abs',
 'gt3l/land_segments/canopy/n_ca_photons',
 'gt3l/land_segments/canopy/n_toc_photons',
 'gt3l/land_segments/canopy/photon_rate_can',
 'gt3l/land_segments/canopy/photon_rate_can_nr',
 'gt3l/land_segments/canopy/segment_cover',
 'gt3l/land_segments/canopy/subset_can_flag',
 'gt3l/land_segments/canopy/toc_roughness',
 'gt3l/land_segments/cloud_flag_atm',
 'gt3l/land_segments/cloud_fold_flag',
 'gt3l/land_segments/delta_time',
 'gt3l/land_segments/delta_time_beg',
 'gt3l/land_segments/delta_time_end',
 'gt3l/land_segments/dem_flag',
 'gt3l/land_segments/dem_h',
 'gt3l/land_segments/dem_removal_flag',
 'gt3l/land_segments/h_dif_ref',
 'gt3l/land_segments/last_seg_extend',
 'gt3l/land_segments/latitude',
 'gt3l/land_segments/latitude_20m',
 'gt3l/land_segments/layer_flag',
 'gt3l/land_segments/longitude',
 'gt3l/land_segments/longitude_20m',
 'gt3l/land_segments/msw_flag',
 'gt3l/land_segments/n_seg_ph',
 'gt3l/land_segments/night_flag',
 'gt3l/land_segments/ph_ndx_beg',
 'gt3l/land_segments/ph_removal_flag',
 'gt3l/land_segments/psf_flag',
 'gt3l/land_segments/rgt',
 'gt3l/land_segments/sat_flag',
 'gt3l/land_segments/segment_id_beg',
 'gt3l/land_segments/segment_id_end',
 'gt3l/land_segments/segment_landcover',
 'gt3l/land_segments/segment_snowcover',
 'gt3l/land_segments/segment_watermask',
 'gt3l/land_segments/sigma_across',
 'gt3l/land_segments/sigma_along',
 'gt3l/land_segments/sigma_atlas_land',
 'gt3l/land_segments/sigma_h',
 'gt3l/land_segments/sigma_topo',
 'gt3l/land_segments/snr',
 'gt3l/land_segments/solar_azimuth',
 'gt3l/land_segments/solar_elevation',
 'gt3l/land_segments/surf_type',
 'gt3l/land_segments/terrain/h_te_best_fit',
 'gt3l/land_segments/terrain/h_te_best_fit_20m',
 'gt3l/land_segments/terrain/h_te_interp',
 'gt3l/land_segments/terrain/h_te_max',
 'gt3l/land_segments/terrain/h_te_mean',
 'gt3l/land_segments/terrain/h_te_median',
 'gt3l/land_segments/terrain/h_te_min',
 'gt3l/land_segments/terrain/h_te_mode',
 'gt3l/land_segments/terrain/h_te_rh25',
 'gt3l/land_segments/terrain/h_te_skew',
 'gt3l/land_segments/terrain/h_te_std',
 'gt3l/land_segments/terrain/h_te_uncertainty',
 'gt3l/land_segments/terrain/n_te_photons',
 'gt3l/land_segments/terrain/photon_rate_te',
 'gt3l/land_segments/terrain/subset_te_flag',
 'gt3l/land_segments/terrain/terrain_slope',
 'gt3l/land_segments/terrain_flg',
 'gt3l/land_segments/urban_flag',
 'gt3l/signal_photons/classed_pc_flag',
 'gt3l/signal_photons/classed_pc_indx',
 'gt3l/signal_photons/d_flag',
 'gt3l/signal_photons/delta_time',
 'gt3l/signal_photons/ph_h',
 'gt3l/signal_photons/ph_segment_id',
 'gt3r/land_segments/asr',
 'gt3r/land_segments/atlas_pa',
 'gt3r/land_segments/beam_azimuth',
 'gt3r/land_segments/beam_coelev',
 'gt3r/land_segments/brightness_flag',
 'gt3r/land_segments/canopy/can_noise',
 'gt3r/land_segments/canopy/canopy_h_metrics',
 'gt3r/land_segments/canopy/canopy_h_metrics_abs',
 'gt3r/land_segments/canopy/canopy_openness',
 'gt3r/land_segments/canopy/canopy_rh_conf',
 'gt3r/land_segments/canopy/centroid_height',
 'gt3r/land_segments/canopy/h_canopy',
 'gt3r/land_segments/canopy/h_canopy_20m',
 'gt3r/land_segments/canopy/h_canopy_abs',
 'gt3r/land_segments/canopy/h_canopy_quad',
 'gt3r/land_segments/canopy/h_canopy_uncertainty',
 'gt3r/land_segments/canopy/h_dif_canopy',
 'gt3r/land_segments/canopy/h_max_canopy',
 'gt3r/land_segments/canopy/h_max_canopy_abs',
 'gt3r/land_segments/canopy/h_mean_canopy',
 'gt3r/land_segments/canopy/h_mean_canopy_abs',
 'gt3r/land_segments/canopy/h_median_canopy',
 'gt3r/land_segments/canopy/h_median_canopy_abs',
 'gt3r/land_segments/canopy/h_min_canopy',
 'gt3r/land_segments/canopy/h_min_canopy_abs',
 'gt3r/land_segments/canopy/n_ca_photons',
 'gt3r/land_segments/canopy/n_toc_photons',
 'gt3r/land_segments/canopy/photon_rate_can',
 'gt3r/land_segments/canopy/photon_rate_can_nr',
 'gt3r/land_segments/canopy/segment_cover',
 'gt3r/land_segments/canopy/subset_can_flag',
 'gt3r/land_segments/canopy/toc_roughness',
 'gt3r/land_segments/cloud_flag_atm',
 'gt3r/land_segments/cloud_fold_flag',
 'gt3r/land_segments/delta_time',
 'gt3r/land_segments/delta_time_beg',
 'gt3r/land_segments/delta_time_end',
 'gt3r/land_segments/dem_flag',
 'gt3r/land_segments/dem_h',
 'gt3r/land_segments/dem_removal_flag',
 'gt3r/land_segments/h_dif_ref',
 'gt3r/land_segments/last_seg_extend',
 'gt3r/land_segments/latitude',
 'gt3r/land_segments/latitude_20m',
 'gt3r/land_segments/layer_flag',
 'gt3r/land_segments/longitude',
 'gt3r/land_segments/longitude_20m',
 'gt3r/land_segments/msw_flag',
 'gt3r/land_segments/n_seg_ph',
 'gt3r/land_segments/night_flag',
 'gt3r/land_segments/ph_ndx_beg',
 'gt3r/land_segments/ph_removal_flag',
 'gt3r/land_segments/psf_flag',
 'gt3r/land_segments/rgt',
 'gt3r/land_segments/sat_flag',
 'gt3r/land_segments/segment_id_beg',
 'gt3r/land_segments/segment_id_end',
 'gt3r/land_segments/segment_landcover',
 'gt3r/land_segments/segment_snowcover',
 'gt3r/land_segments/segment_watermask',
 'gt3r/land_segments/sigma_across',
 'gt3r/land_segments/sigma_along',
 'gt3r/land_segments/sigma_atlas_land',
 'gt3r/land_segments/sigma_h',
 'gt3r/land_segments/sigma_topo',
 'gt3r/land_segments/snr',
 'gt3r/land_segments/solar_azimuth',
 'gt3r/land_segments/solar_elevation',
 'gt3r/land_segments/surf_type',
 'gt3r/land_segments/terrain/h_te_best_fit',
 'gt3r/land_segments/terrain/h_te_best_fit_20m',
 'gt3r/land_segments/terrain/h_te_interp',
 'gt3r/land_segments/terrain/h_te_max',
 'gt3r/land_segments/terrain/h_te_mean',
 'gt3r/land_segments/terrain/h_te_median',
 'gt3r/land_segments/terrain/h_te_min',
 'gt3r/land_segments/terrain/h_te_mode',
 'gt3r/land_segments/terrain/h_te_rh25',
 'gt3r/land_segments/terrain/h_te_skew',
 'gt3r/land_segments/terrain/h_te_std',
 'gt3r/land_segments/terrain/h_te_uncertainty',
 'gt3r/land_segments/terrain/n_te_photons',
 'gt3r/land_segments/terrain/photon_rate_te',
 'gt3r/land_segments/terrain/subset_te_flag',
 'gt3r/land_segments/terrain/terrain_slope',
 'gt3r/land_segments/terrain_flg',
 'gt3r/land_segments/urban_flag',
 'gt3r/signal_photons/classed_pc_flag',
 'gt3r/signal_photons/classed_pc_indx',
 'gt3r/signal_photons/d_flag',
 'gt3r/signal_photons/delta_time',
 'gt3r/signal_photons/ph_h',
 'gt3r/signal_photons/ph_segment_id',
 'orbit_info/bounding_polygon_lat1',
 'orbit_info/bounding_polygon_lon1',
 'orbit_info/crossing_time',
 'orbit_info/cycle_number',
 'orbit_info/lan',
 'orbit_info/orbit_number',
 'orbit_info/rgt',
 'orbit_info/sc_orient',
 'orbit_info/sc_orient_time',
 'quality_assessment/qa_granule_fail_reason',
 'quality_assessment/qa_granule_pass_fail']

Thats a lot of variables!

One key feature of icepyx is the ability to browse the variables available in the dataset. There are typically hundreds of variables in a single dataset, so that is a lot to sort through! Let’s take a moment to get oriented to the organization of ATL08 variables, by first a few important pieces of the algorithm.

To create higher level variables like canopy or terrain height, the ATL08 algorithms goes through a series of steps: 1. Identify signal photons from noise photons 2. Classify each of the signal photons as either terrain, canopy, or canopy top 3. Remove elevation, so the heights are with respect to the ground 3. Group the signal photons into 100m segments. If there are a sufficient number of photons in that group, calculate statistics for terrain and canopy (ex. mean height, max height, standard deviation, etc.)

Fig. 4. An example of the classified photons produced from the ATL08 algorithm. Ground photons (red dots) are labeled as all photons falling within a point spread function distance of the estimated ground surface. The top of canopy photons (green dots) are photons that fall within a buffer distance from the upper canopy surface, and the photons that lie between the top of canopy surface and ground surface are labeled as canopy photons (blue dots). (Neuenschwander & Pitts, 2019)

Providing all the potentially useful information from all these processing steps results in a data file that looks like:

Another way to visualize these structure is to download one file and open it using https://myhdf5.hdfgroup.org/.

Further information about each one of the variables is available in the Algorithm Theoretical Basis Document (ATBD) for ATL08.

There is lots to explore in these variables, but we will move forward using a common ATL08 variable: h_canopy, or the “98% height of all the individual relative canopy heights (height above terrain)” (ATBD definition).

reader.vars.append(var_list=['h_canopy', 'latitude', 'longitude'])

Note that adding variables is a required step before you can load the data.

Load the data!

ds = reader.load()
ds
/srv/conda/envs/notebook/lib/python3.10/site-packages/icepyx/core/read.py:520: UserWarning: rename 'delta_time' to 'photon_idx' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  .rename({"delta_time": "photon_idx"})
/srv/conda/envs/notebook/lib/python3.10/site-packages/icepyx/core/read.py:520: UserWarning: rename 'delta_time' to 'photon_idx' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  .rename({"delta_time": "photon_idx"})
<xarray.Dataset>
Dimensions:              (gran_idx: 1, photon_idx: 211, spot: 2)
Coordinates:
  * gran_idx             (gran_idx) float64 5.54e+04
  * photon_idx           (photon_idx) int64 0 1 2 3 4 5 ... 206 207 208 209 210
  * spot                 (spot) uint8 1 2
    source_file          (gran_idx) <U74 './bosque_primavera_ATL08/processed_...
    delta_time           (photon_idx) datetime64[ns] 2019-05-04T12:47:13.5766...
Data variables:
    sc_orient            (gran_idx) int8 0
    cycle_number         (gran_idx) int8 3
    rgt                  (gran_idx, spot, photon_idx) float32 554.0 ... 554.0
    atlas_sdp_gps_epoch  (gran_idx) datetime64[ns] 2018-01-01T00:00:18
    data_start_utc       (gran_idx) datetime64[ns] 2019-05-04T12:46:31.876322
    data_end_utc         (gran_idx) datetime64[ns] 2019-05-04T12:48:54.200826
    latitude             (spot, gran_idx, photon_idx) float32 20.59 ... 20.73
    longitude            (spot, gran_idx, photon_idx) float32 -103.7 ... -103.7
    gt                   (gran_idx, spot) object 'gt3l' 'gt3r'
    h_canopy             (photon_idx) float32 12.12 4.747 11.83 ... nan nan nan
Attributes:
    data_product:  ATL08
    Description:   Contains data categorized as land at 100 meter intervals.
    data_rate:     Data are stored as aggregates of 100 meters.

Here we have an xarray Dataset, a common Python data structure for analysis. To visualize the data we can plot it using:

ds.plot.scatter(x="longitude", y="latitude", hue="h_canopy")
<matplotlib.collections.PathCollection at 0x7f3d005d8430>

Notice also that the data is shown for just our area of interest! That is because of icepyx’s subsetting feature, which we will discuss more in the next section.

Using icepyx to download a subset of a granule

One feature which is not yet available in earthaccess is the ability to download just a subset of the file. This could mean a smaller spatial area or fewer variables. This feature is available in icepyx.

We saw above that icepyx by default will subset your data to the bounding box you provided when downloading. If you know in advance which variables you want icepyx can also subset variables.

Subset variables

# Create our Query
short_name = 'ATL08'
spatial_extent = list(bosque.bounds)
date_range = ['2019-05-04','2019-05-04']
region = ipx.Query(short_name, spatial_extent, date_range)
# Specify desired variables
region.order_vars.append(var_list=['h_canopy', 'latitude', 'longitude'])
We are already authenticated with NASA EDL
# Download the granules, using the Coverage kwarg to specify variables
region.download_granules(path='./ATL08_h_canopy', Coverage=region.order_vars.wanted)
Total number of data order requests is  1  for  1  granules.
Data request  1  of  1  is submitting to NSIDC
order ID:  5000004610386
Initial status of your order request at NSIDC is:  processing
Your order status is still  processing  at NSIDC. Please continue waiting... this may take a few moments.
Your order is: complete
Beginning download of zipped output...
Data request 5000004610386 of  1  order(s) is downloaded.
Download complete
# Read the new data
pattern = "processed_ATL{product:2}_{datetime:%Y%m%d%H%M%S}_{rgt:4}{cycle:2}{orbitsegment:2}_{version:3}_{revision:2}.h5"
reader = ipx.Read('./ATL08_h_canopy', "ATL08", pattern)
You have 1 files matching the filename pattern to be read in.

The available variables list on the subset dataset is a lot shorter!

reader.vars.avail()
['ancillary_data/atlas_sdp_gps_epoch',
 'ancillary_data/data_end_utc',
 'ancillary_data/data_start_utc',
 'ancillary_data/end_delta_time',
 'ancillary_data/granule_end_utc',
 'ancillary_data/granule_start_utc',
 'ancillary_data/start_delta_time',
 'gt3l/land_segments/canopy/h_canopy',
 'gt3l/land_segments/delta_time',
 'gt3l/land_segments/latitude',
 'gt3l/land_segments/longitude',
 'gt3r/land_segments/canopy/h_canopy',
 'gt3r/land_segments/delta_time',
 'gt3r/land_segments/latitude',
 'gt3r/land_segments/longitude',
 'orbit_info/sc_orient',
 'orbit_info/sc_orient_time']

Some example plots

To close, here are a few more examples of reading and visualizing ATL08 data.

Example 1: View the photon classifications

# Set up the data reader
pattern = "processed_ATL{product:2}_{datetime:%Y%m%d%H%M%S}_{rgt:4}{cycle:2}{orbitsegment:2}_{version:3}_{revision:2}.h5"
reader = ipx.Read('./bosque_primavera_ATL08', "ATL08", pattern)
You have 1 files matching the filename pattern to be read in.
# Add the photon height and classification variables
reader.vars.append(var_list=['ph_h', 'classed_pc_flag', 'latitude', 'longitude'])
# load the dataset
ds_photons = reader.load()
ds_photons
/srv/conda/envs/notebook/lib/python3.10/site-packages/icepyx/core/read.py:520: UserWarning: rename 'delta_time' to 'photon_idx' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  .rename({"delta_time": "photon_idx"})
/srv/conda/envs/notebook/lib/python3.10/site-packages/icepyx/core/read.py:520: UserWarning: rename 'delta_time' to 'photon_idx' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  .rename({"delta_time": "photon_idx"})
/srv/conda/envs/notebook/lib/python3.10/site-packages/icepyx/core/read.py:520: UserWarning: rename 'delta_time' to 'photon_idx' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  .rename({"delta_time": "photon_idx"})
/srv/conda/envs/notebook/lib/python3.10/site-packages/icepyx/core/read.py:520: UserWarning: rename 'delta_time' to 'photon_idx' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  .rename({"delta_time": "photon_idx"})
<xarray.Dataset>
Dimensions:              (gran_idx: 1, photon_idx: 25234, spot: 2)
Coordinates:
  * gran_idx             (gran_idx) float64 5.54e+04
  * photon_idx           (photon_idx) int64 0 1 2 3 ... 25230 25231 25232 25233
  * spot                 (spot) uint8 1 2
    source_file          (gran_idx) <U74 './bosque_primavera_ATL08/processed_...
    delta_time           (photon_idx) datetime64[ns] 2019-05-04T12:47:13.5766...
Data variables:
    sc_orient            (gran_idx) int8 0
    cycle_number         (gran_idx) int8 3
    rgt                  (gran_idx, spot, photon_idx) float32 554.0 ... nan
    atlas_sdp_gps_epoch  (gran_idx) datetime64[ns] 2018-01-01T00:00:18
    data_start_utc       (gran_idx) datetime64[ns] 2019-05-04T12:46:31.876322
    data_end_utc         (gran_idx) datetime64[ns] 2019-05-04T12:48:54.200826
    latitude             (spot, gran_idx, photon_idx) float32 20.59 ... nan
    longitude            (spot, gran_idx, photon_idx) float32 -103.7 ... nan
    gt                   (gran_idx, spot) <U4 'gt3l' 'gt3r'
    ph_h                 (spot, gran_idx, photon_idx) float32 nan ... 0.05542
    classed_pc_flag      (spot, gran_idx, photon_idx) float32 nan nan ... 1.0
Attributes:
    data_product:  ATL08
# Select just one beam
gt1l = ds_photons.sel(spot=1)
# A less complex plot
fig, ax = plt.subplots()
fig.set_size_inches(15, 4)
    
gt1l.plot.scatter(ax=ax, x='delta_time', y='ph_h', hue='classed_pc_flag')
<matplotlib.collections.PathCollection at 0x7f3cfe539e10>

# A plot with more customization
fig, ax = plt.subplots()
fig.set_size_inches(17, 6)
fig.suptitle('Classification of input photons', size=16)

labels={0: 'noise', 1: 'ground', 2: 'canopy', 3: 'top of canopy'}
colors={0: 'grey', 1: 'red', 2: 'green', 3: 'blue'}

for g in np.unique(gt1l.classed_pc_flag[0]):
    if not math.isnan(g):
        ds_group = gt1l.where(gt1l.classed_pc_flag == g, drop=True)
        ax.scatter(x=ds_group.delta_time, y=ds_group.ph_h, c=colors[g], 
                   label=labels[g], s=8)
ax.legend()

ax.set_ylabel('Height above the ground (m)', size=12)
Text(0, 0.5, 'Height above the ground (m)')

Plot the canopy compared to the ground height

# Remove our previous variables
reader.vars.remove(all=True)
# Add the next set of variables to the list
reader.vars.append(var_list=['h_te_best_fit', 'latitude', 'longitude'])
# load the data
ds_te = reader.load()
ds_te
/srv/conda/envs/notebook/lib/python3.10/site-packages/icepyx/core/read.py:520: UserWarning: rename 'delta_time' to 'photon_idx' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  .rename({"delta_time": "photon_idx"})
/srv/conda/envs/notebook/lib/python3.10/site-packages/icepyx/core/read.py:520: UserWarning: rename 'delta_time' to 'photon_idx' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  .rename({"delta_time": "photon_idx"})
<xarray.Dataset>
Dimensions:              (gran_idx: 1, photon_idx: 211, spot: 2)
Coordinates:
  * gran_idx             (gran_idx) float64 5.54e+04
  * photon_idx           (photon_idx) int64 0 1 2 3 4 5 ... 206 207 208 209 210
  * spot                 (spot) uint8 1 2
    source_file          (gran_idx) <U74 './bosque_primavera_ATL08/processed_...
    delta_time           (photon_idx) datetime64[ns] 2019-05-04T12:47:13.5766...
Data variables:
    sc_orient            (gran_idx) int8 0
    cycle_number         (gran_idx) int8 3
    rgt                  (gran_idx, spot, photon_idx) float32 554.0 ... 554.0
    atlas_sdp_gps_epoch  (gran_idx) datetime64[ns] 2018-01-01T00:00:18
    data_start_utc       (gran_idx) datetime64[ns] 2019-05-04T12:46:31.876322
    data_end_utc         (gran_idx) datetime64[ns] 2019-05-04T12:48:54.200826
    latitude             (spot, gran_idx, photon_idx) float32 20.59 ... 20.73
    longitude            (spot, gran_idx, photon_idx) float32 -103.7 ... -103.7
    gt                   (gran_idx, spot) object 'gt3l' 'gt3r'
    h_te_best_fit        (photon_idx) float32 1.342e+03 1.34e+03 ... 1.381e+03
Attributes:
    data_product:  ATL08
    Description:   Contains data categorized as land at 100 meter intervals.
    data_rate:     Data are stored as aggregates of 100 meters.
fig, ax = plt.subplots()
fig.set_size_inches(12, 3)

# plot the canopy height above ground level
(ds.h_canopy + ds_te.h_te_best_fit).plot.scatter(ax=ax, x="delta_time", y="h_canopy") # orange

# plot the terrain values
ds_te.plot.scatter(ax=ax, x="delta_time", y="h_te_best_fit") # blue
<matplotlib.collections.PathCollection at 0x7f3cfe4efaf0>

Summary

In this notebook we explored the opening and rendering ATL08 data with icepyx. We saw that icepyx will subset our downloaded data to our area of interest and also allows us to download only the variables we need. The ATL08 data has a folder-like structure with many variables to choose from. We focused on h_canopy and showed additional examples using the raw photons and h_te_best_fit for the ground height.

More information about ATL08 or icepyx can be found in: - The icepyx documentation - The Algorithm Theoretical Basis Document (ATBD) - Neuenschwander et. al. 2019, Remote Sens. Env. DOI