📘 Overview

In this tutorial, we will explore how to use the CAMELS (Catchment Attributes and Meteorology for Large-sample Studies) dataset to build deep learning models for rainfall–runoff prediction. You will learn how to preprocess hydrologic data, construct and train LSTM and Transformer models, and evaluate their predictive performance.

To keep the tutorial lightweight and runnable on Colab (even without a GPU), we'll use a small subset of CAMELS (20 basins) that has been pre-processed and stored as a NetCDF file.

💡 The full CAMELS dataset can be downloaded and processed with the provided Python scripts (01.download_camels.py, 02.prepare_camels.py); running them takes about 20 minutes.
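The overview mentions evaluating predictive performance. As a minimal sketch, assuming the Nash-Sutcliffe efficiency (NSE) as the metric (a common choice in rainfall-runoff studies, though this overview does not say which metrics the tutorial uses), the function below compares observed and simulated streamflow. The name `nse` and the sample values are illustrative and not taken from the tutorial's code; the sketch uses only the standard library, whereas real code would more likely be vectorized with NumPy.

```python
import math
from typing import Sequence


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    Time steps where either value is NaN are skipped, since streamflow
    records often contain gaps (an assumption about the data, not a
    documented property of the tutorial's subset).
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")

    # Keep only time steps where both series have a valid value.
    pairs = [
        (o, s)
        for o, s in zip(observed, simulated)
        if not (math.isnan(o) or math.isnan(s))
    ]
    if not pairs:
        raise ValueError("no valid observation/simulation pairs")

    mean_obs = sum(o for o, _ in pairs) / len(pairs)
    residual = sum((o - s) ** 2 for o, s in pairs)
    variance = sum((o - mean_obs) ** 2 for o, _ in pairs)
    if variance == 0:
        raise ValueError("observed series is constant; NSE is undefined")

    return 1.0 - residual / variance


# Illustrative values only, not real CAMELS data.
obs = [1.0, 2.0, 3.0]
perfect = [1.0, 2.0, 3.0]
mean_only = [2.0, 2.0, 2.0]

print(nse(obs, perfect))    # a perfect simulation scores 1.0
print(nse(obs, mean_only))  # predicting the observed mean scores 0.0
```

An NSE of 1 indicates a perfect match, 0 means the model does no better than always predicting the observed mean, and negative values mean it does worse than that baseline.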


Resources

References

Liu, J., Bian, Y., Lawson, K., & Shen, C. (2024). Probing the limit of hydrologic predictability with the Transformer network. Journal of Hydrology, 637, 131389.

Liu, J., Shen, C., O’Donncha, F., Song, Y., et al. (2025). From RNNs to Transformers: Benchmarking deep learning architectures for hydrologic prediction. EGUsphere, 2025.


Written by

Jiangtao Liu

I am interested in using multiple satellite datasets, in-situ observation datasets, and reanalysis products to investigate how climate variability and human activities affect water resources. My approach integrates physics-based hydrological models with deep learning techniques, ensuring that model predictions remain both accurate and physically interpretable. In parallel, I develop BERT/GPT-based foundation models that can be fine-tuned for tasks such as streamflow forecasting, soil moisture prediction, and water quality assessment. Ultimately, I aim to deliver robust, scalable, and transparent modeling frameworks that guide decision-making from local watersheds to global scales, helping mitigate risks such as droughts, floods, and landslides in a changing climate.
