What we’re learning about cloud costs for Earth science workflows in our JupyterHub

Authors

Andy Teucher

Alex Lewandowski

Eli Holmes

Tasha Snow

Yuvi

Julie Lowndes

Published

May 1, 2025

On April 22, 2025, Openscapes hosted a Community Call to share what we’re learning about cloud costs for Earth science workflows in our JupyterHubs. We work closely with NASA Openscapes Mentors and NOAA Openscapes Mentors to use, teach, and develop with 2i2c, and we increasingly have cross-NASA collaborations with other hubs like the Alaska Satellite Facility (ASF) and CryoCloud. This was a first conversation: there is much more to share than we had time for, and there was more interest in questions and community examples than we could accommodate. Through a light interview structure, Andy Teucher (Openscapes), Alex Lewandowski (NASA Alaska Satellite Facility), Eli Holmes (NOAA Fisheries), Tasha Snow (CryoCloud), and Yuvi (2i2c) shared their experiences and what they are building.

Openscapes and our partners at NASA, NOAA, and 2i2c have been learning together about monitoring and managing the costs of cloud computing in a JupyterHub. Many organizations are starting to use cloud computing for computational analysis and teaching workshops, often with JupyterHubs as the platform for this work. Tracking usage and attributing costs to specific users and workflows can be tricky on these shared hubs, and we have been learning strategies and tools that help us understand them. The purpose of this Community Call was to share what we’ve learned so far: tools and processes to explore cloud costs, as well as figures like the basic costs for hosting a hub, cost per user, cost per science workflow, and what it costs to run a workshop in the hub.

One punchline: yes, it is possible to run real science workflows in different Hubs, and we can estimate the costs. An example workflow transported from Alaska Satellite Facility (ASF) OpenSARLab to the NASA Openscapes Hub cost $0.74. This figure does not include the cost of technical infrastructure and development, or of the training and upskilling the researcher needs. When comparing costs, it is important to keep in mind that different JupyterHubs have different focuses (e.g., data type(s) and usage patterns) and, consequently, different architectures that can greatly affect costs even for similar workflows. Still, it is an exciting step in understanding the cost of Earth science workflows in the cloud!

Figure: CPU and memory requests for the OSL (OpenSARLab) workflow in the NASA Openscapes Hub. Two plots: CPU cores and memory (GB). Data size: ~28 GB; instance type: r5.xlarge ($0.252 per hour); total cost: $0.74.
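As a back-of-envelope check on the figures above, the sketch below relates the quoted hourly rate to the quoted total. It assumes the cost is simply the on-demand hourly rate multiplied by the time a dedicated r5.xlarge node was running. The post does not state how the $0.74 was derived, so the implied runtime is an illustration, not a reported measurement, and the function names are hypothetical.

```python
# Values quoted in the post.
HOURLY_RATE_USD = 0.252  # r5.xlarge rate per hour
TOTAL_COST_USD = 0.74    # total cost reported for the workflow


def compute_cost(hours: float, hourly_rate: float) -> float:
    """Cost of one dedicated instance running for `hours` hours."""
    return hours * hourly_rate


def implied_hours(total_cost: float, hourly_rate: float) -> float:
    """Instance-hours implied by a total cost at a given hourly rate."""
    return total_cost / hourly_rate


# Assumption: the whole cost comes from a single node billed at the hourly rate.
hours = implied_hours(TOTAL_COST_USD, HOURLY_RATE_USD)
print(f"Implied runtime: {hours:.2f} instance-hours")  # Implied runtime: 2.94 instance-hours
print(f"Cost check: ${compute_cost(hours, HOURLY_RATE_USD):.2f}")  # Cost check: $0.74
```

Under that assumption the workflow corresponds to roughly three instance-hours. On a shared hub the real accounting may differ, for example if a node is shared between users or if storage and data transfer are billed separately.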

The format of the call was several stories and demos from this cutting-edge collaboration between open source infrastructure, government agencies, and science communities (slides). Here is the story arc and a few key notes.

Resources

  • openscapes.cloud for NASA and NOAA Fisheries JupyterHub policies and cost reporting.

  • 2i2c helps communities build their own interactive computing hub in the cloud with open infrastructure.

  • CryoCloud: Accelerating discovery and enhancing collaboration for NASA Cryosphere communities.

  • Grafana for monitoring and visualizing usage data.

  • AWS Cost Explorer and API docs for managing cloud costs on AWS.

  • jupycost: A work-in-progress R package from Openscapes for querying and summarizing JupyterHub cost and usage statistics.

  • sixtyfour: An R package for interfacing with AWS APIs, from the Fred Hutch Cancer Center Data Science Lab.

  • grafana-dashboards: Grafana Dashboards used in our JupyterHubs. Provides Grafana Dashboards as code – very useful for learning how to query Prometheus metrics.
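For readers who want to try the AWS Cost Explorer API listed above, here is a minimal Python sketch. It is not how jupycost or the Openscapes cost reports work. It assumes `boto3` is installed and that your AWS credentials have Cost Explorer access. The helper names `fetch_daily_costs` and `total_by_service` are hypothetical, and the example response is made-up data in the shape the API returns.

```python
from collections import defaultdict


def fetch_daily_costs(start: str, end: str) -> dict:
    """Query AWS Cost Explorer for daily unblended cost grouped by service.

    `start` is inclusive and `end` is exclusive, both as YYYY-MM-DD strings.
    Needs boto3 and AWS credentials; only the first page of results is returned.
    """
    import boto3  # imported here so the summarizer below runs without boto3

    client = boto3.client("ce")
    return client.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )


def total_by_service(response: dict) -> dict:
    """Sum the UnblendedCost amounts per service across all returned days."""
    totals = defaultdict(float)
    for day in response.get("ResultsByTime", []):
        for group in day.get("Groups", []):
            service = group["Keys"][0]
            amount = group["Metrics"]["UnblendedCost"]["Amount"]  # a string
            totals[service] += float(amount)
    return dict(totals)


# Made-up response for illustration only; these are not real hub costs.
example_response = {
    "ResultsByTime": [
        {
            "TimePeriod": {"Start": "2025-04-01", "End": "2025-04-02"},
            "Groups": [
                {"Keys": ["Amazon Elastic Compute Cloud - Compute"],
                 "Metrics": {"UnblendedCost": {"Amount": "1.50", "Unit": "USD"}}},
                {"Keys": ["Amazon Simple Storage Service"],
                 "Metrics": {"UnblendedCost": {"Amount": "0.25", "Unit": "USD"}}},
            ],
        },
        {
            "TimePeriod": {"Start": "2025-04-02", "End": "2025-04-03"},
            "Groups": [
                {"Keys": ["Amazon Elastic Compute Cloud - Compute"],
                 "Metrics": {"UnblendedCost": {"Amount": "2.00", "Unit": "USD"}}},
            ],
        },
    ]
}

service_totals = total_by_service(example_response)
for service, cost in sorted(service_totals.items()):
    print(f"{service}: ${cost:.2f}")
```

To run it against a real account, call `total_by_service(fetch_daily_costs("2025-04-01", "2025-05-01"))`. Account-level totals like these do not attribute costs to individual hub users. That step needs usage metrics such as those shown in the Grafana dashboards.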

Speakers

  • Tasha Snow is a co-founder of CryoCloud, a remote sensing glaciologist at ESSIC University of Maryland and NASA GSFC, and was the recipient of the 2023 AGU Open Science Recognition Prize. 

  • Andy Teucher is a core Openscapes team member and develops software and cloud infrastructure. He is a biologist turned data scientist, specializing in helping people and organizations build maintainable, reproducible data workflows. Andy is a strong open data and open code advocate, and believes in the value of using and contributing to open-source software.

  • Eli Holmes currently leads NOAA Fisheries Open Science, where she facilitates and runs trainings in computing, data access, and statistics for NOAA Fisheries. She co-leads the Inter-agency R User Group (federal agencies) and NMFS Openscapes, along with other trainings and HackWeeks.

  • Yuvi is a co-founder and tech lead of 2i2c.org, a core member of the JupyterHub team, and a former member of the Wikimedia DevOps team. Yuvi has been doing open source work for about 17 years in various communities.

  • Alex Lewandowski is a research software engineer at the Alaska Satellite Facility. He works on the Science Enabling Services Team. Much of his work focuses on providing support, educational materials, and tools to ASF’s users of SAR data. He has been an Openscapes Mentor since 2022.


Citation

BibTeX citation:
@online{teucher2025,
  author = {Teucher, Andy and Lewandowski, Alex and Holmes, Eli and
    Snow, Tasha and {Yuvi} and Lowndes, Julie},
  title = {What We’re Learning about Cloud Costs for {Earth} Science
    Workflows in Our {JupyterHub}},
  date = {2025-05-01},
  url = {https://nasa-openscapes.github.io/news/2025-05-01-community-call-hub-cloud-costs/},
  langid = {en}
}
For attribution, please cite this work as:
Teucher, Andy, Alex Lewandowski, Eli Holmes, Tasha Snow, Yuvi, and Julie Lowndes. 2025. “What We’re Learning about Cloud Costs for Earth Science Workflows in Our JupyterHub.” May 1, 2025. https://nasa-openscapes.github.io/news/2025-05-01-community-call-hub-cloud-costs/.