
2 posts tagged with "Cloud Computing"


The NextGen Research DataStream (NRDS): A Reproducible Numerical Prediction System for Accelerating Research to Operations in Hydrology

· 10 min read
Jordan Laser
Software Engineer at Lynker
Arpita Patel
Assistant Director of DevOps and IT
Harsha Vemula
DevOps Engineer at Alabama Water Institute

Technological advances are reshaping water prediction capabilities at a ludicrous pace. From revolutionary machine learning algorithms to dramatic advances in computational hardware, the potential for making accurate hydrologic predictions has never been higher. To meet this new potential, the hydrologic community continuously generates models and approaches based on cutting-edge research that could benefit operational systems. However, many of these innovations lack a path to operational deployment.

The NextGen Research DataStream (NRDS) provides a mechanism by which these ideas can be refined and make their way into operations.

Developed by Lynker and the Alabama Water Institute through a Cooperative Institute for Research to Operations in Hydrology (CIROH) partnership, the NRDS helps turn a research idea from the community into a scalable and deployable numerical prediction system. To evaluate each of these modeling concepts, NRDS deploys prototype models to generate a continuous “datastream”. These outputs can then be evaluated and the models made more accurate. This cycle of streamlined deployment and iterative design lets these prototypes mature into a product that can be picked up by an operational forecasting team.

To make this process rapid and smooth, the entire system is designed with reproducibility and iterative improvement as core principles. The NRDS is an automated numerical prediction system that generates regular streamflow forecasts, using the NextGen Water Resources Modeling Framework (NextGen) as the core modeling engine and NextGen In A Box (NGIAB) as the simulation environment. The system generates forecasts across the contiguous United States (CONUS) on CIROH's operational cyberinfrastructure backbone, the research-to-operations (R2O) Hybrid Cloud (R2OHC) platform, and is deployed on the AWS cloud. What makes the NRDS exciting is that the entire system is open source, reproducible, publicly browsable, and potentially editable by anyone in the hydrologic community.

Expanding Access to NextGen Research through the CIROH Community NextGen Hub (CCNH) in Cloud

· 5 min read
Ayman Nassar
Postdoctoral Researcher
David Tarboton
Professor at Utah Water Research Laboratory
Arpita Patel
Assistant Director of DevOps and IT
Furqan Baig
Research Programmer
Homa Salehabadi
Postdoctoral Researcher
Benjamin Lee
Development Operations Engineer
Josh Cunningham
Software Engineer

Opening New Doors for Research with the NextGen Framework

The NextGen framework holds great potential for hydrologic modeling, but it is often inaccessible because of its demanding setup and requirements. Embedding it within a cloud-based environment offers a natural solution: it removes some of the administrative and technical work of setting up compute resources and configuring computational libraries, opening the door for a wider audience to take advantage of the framework's strengths.

With the CIROH Community NextGen Hub (CCNH), we’ve created a cloud-based environment that addresses exactly those setup challenges, so users can focus on science instead of software.

A Preconfigured, Ready-to-Use Cloud Environment

CCNH is a containerized, cloud-based modeling environment hosted on the CIROH-2i2c JupyterHub. It packages everything a researcher needs to run end-to-end NextGen workflows — from input preprocessing through model execution, calibration, evaluation, and output visualization — into a single, ready-to-use JupyterHub image. Built on the same containerization patterns as NGIAB, CCNH leverages a Pangeo base image and includes:

  • Pre-compiled NextGen framework binaries from the NGIAB-based Docker image
  • NGIAB data preprocessing tools for automated retrieval and subsetting of hydrofabric and meteorological forcing datasets
  • T-Route routing components for streamflow simulation
  • SPOTPY (Statistical Parameter Optimization Tool for Python) for model calibration
  • TEEHR (Tools for Exploratory Evaluation in Hydrologic Research) for performance evaluation
  • PyNGIAB, a Python wrapper that lets you run NextGen simulations directly from Jupyter notebooks
  • HydroShare integration tools (nbfetch, hs_files-jupyter, hsclient) for seamless data exchange, so results can be saved in HydroShare for collaboration, reproducibility, and publishing
  • JupyterLab with distributed computing capabilities for interactive, scalable workflows
Diagram illustrating how HydroShare resources, 2i2c JupyterHub, and S3 Object Store interact to enable streamlined NextGen workflows in the cloud.

The result: researchers can go from zero to running a calibrated NextGen simulation in a fraction of the time previously required.
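
To give a feel for what the calibration and evaluation steps in such a workflow involve, here is a minimal, standard-library-only sketch. It does not use SPOTPY, TEEHR, PyNGIAB, or NextGen itself, and none of the names below come from those tools. The `linear_reservoir` toy model, the random-search `calibrate` helper, and the synthetic rainfall series are all assumptions made for illustration. The only established piece is the Nash-Sutcliffe efficiency (NSE), a widely used streamflow skill score.

```python
import random


def nse(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 is a perfect match; 0 means the
    simulation is no better than predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def linear_reservoir(rain, k):
    """Hypothetical toy model: each step, a fraction k of storage drains as flow."""
    storage, flow = 0.0, []
    for r in rain:
        storage += r
        q = k * storage
        storage -= q
        flow.append(q)
    return flow


def calibrate(rain, observed, n_trials=200, seed=0):
    """Random search over k, keeping the value with the highest NSE."""
    rng = random.Random(seed)
    best_k, best_score = None, float("-inf")
    for _ in range(n_trials):
        k = rng.uniform(0.01, 0.99)
        score = nse(linear_reservoir(rain, k), observed)
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


# Synthetic example: "observations" are generated with a known k = 0.3,
# so calibration should recover a value close to it.
rain = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0, 8.0, 1.0, 0.0, 0.0]
observed = linear_reservoir(rain, 0.3)
best_k, best_score = calibrate(rain, observed)
print(f"best k = {best_k:.3f}, NSE = {best_score:.4f}")
```

In a real CCNH session the model run, the parameter search, and the skill metrics would come from the preinstalled tools rather than hand-written helpers like these. The loop structure (propose parameters, simulate, score against observations, keep the best) is the part this sketch is meant to convey.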