I recently searched for publicly available code to run statistical analyses that are commonly employed to evaluate (eco)hydrology datasets. My interests included data retrieval, data visualization, outlier detection, and regressions applied to streamflow, water quality, weather/climate, and land use data. This post serves as a snapshot in time of the available code resources.
Package Languages
I focused on packages in R and Python. R seems to have more data analysis and statistical modeling packages, whereas Python seems to have more physical hydrological modeling packages. As a result of my search interests, this post highlights the available R packages. Here are good package lists for both languages:
Hopefully, the time required to prepare your data and learn how to use these functions is less than the time needed to develop similar functions yourself. Most packages have vignettes (tutorials) that should help you learn the functions, and may also serve as teaching resources. In R, the function browseVignettes("packageName") will open a webpage containing links to vignette PDFs (if available) and source code resources for the specified package. The datasets.load package in R is useful for discovering whether a specified package has sample datasets available to use.
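A minimal sketch of that discovery step, using only base R. The `grid` and `datasets` packages appear here only because they ship with R, so substitute the package you are exploring. The dataset listing uses base R's `data(package = ...)`, not the datasets.load package, so treat it as an alternative route and not that package's interface.

```r
# List the vignettes of an installed package without opening a browser.
# The result may have zero rows if the package ships no vignettes.
v <- vignette(package = "grid")
print(v$results[, c("Item", "Title"), drop = FALSE])

# Open the vignette index in a web browser (only useful interactively).
if (interactive()) browseVignettes("grid")

# List the sample datasets bundled with a package.
d <- data(package = "datasets")
print(head(d$results[, c("Item", "Title"), drop = FALSE]))
```

Both calls return an object whose `results` element is a character matrix, so it can be filtered or searched like any other matrix.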
Example webpage output from browseVignettes("smwrQW"):
USGS R Packages
R is currently the primary development language for hydrological statistical analyses within the United States Geological Survey (USGS). There are nearly 100 USGS-R GitHub repositories, from which I've selected a subset that seem to be actively maintained and applicable to the interests listed above. I'm not sure whether functions similar to those in these repositories are available and tested in other programming languages. These USGS-R packages have test functions, which provide a baseline level of confidence in their development. Many packages are contributed or maintained by Laura DeCicco, and by David Lorenz for statistical analysis and water quality.
Selected USGS R Packages
- dataRetrieval for streamflow and water quality gauge/site data retrieval from the National Water Information System (NWIS) and the Water Quality Portal (WQP). This package has several great vignettes that demonstrate how to use the functions. The package is simple to use, but the data require processing to be useful for research (I’m currently developing a code repository for this purpose).
- Statistical Methods in Water Resources (smwr). All have vignettes.
  - smwrBase: Generic hydrology data processing tools
  - smwrStats: Statistical analysis and probability functions, and a possible companion book by Helsel and Hirsch (2002)
- WREG: Predictions in ungauged basins w/ OLS, GLS, WLS
- nsRFA: Regional Frequency Analysis, focusing on the index-value method
- DVstats: Daily flow statistical analysis and visualization
- HydroTools: Seems useful based on the function names, but no documentation outside of the functions themselves. Seems to be under development.
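As a rough sketch of the retrieve-then-process pattern mentioned for dataRetrieval: the two helper functions below are hypothetical names written for this illustration, not part of any package. The download helper assumes dataRetrieval is installed and that its `readNWISdv()` and `renameNWISColumns()` functions are available in your version; site 01491000 and the date range are arbitrary example inputs. The gap-filling step is plain base R and is shown on a small made-up data frame.

```r
# Hypothetical helper: download daily mean discharge (parameter 00060)
# for one NWIS site. Requires the dataRetrieval package and a network
# connection, so it is only defined here, not called automatically.
fetch_daily_flow <- function(site, start, end) {
  raw <- dataRetrieval::readNWISdv(siteNumbers = site,
                                   parameterCd = "00060",
                                   startDate = start,
                                   endDate = end)
  # Give the coded value columns readable names such as "Flow".
  dataRetrieval::renameNWISColumns(raw)
}

# Hypothetical helper: insert a row of NA values for every calendar
# day missing between the first and last observation.
fill_missing_days <- function(df, date_col = "Date") {
  all_days <- data.frame(seq(min(df[[date_col]]), max(df[[date_col]]),
                             by = "day"))
  names(all_days) <- date_col
  merge(all_days, df, by = date_col, all.x = TRUE)
}

# Made-up toy record with a two-day gap (3 and 4 January are missing).
toy <- data.frame(Date = as.Date(c("2020-01-01", "2020-01-02", "2020-01-05")),
                  Flow = c(10, 12, 9))
filled <- fill_missing_days(toy)
print(filled)

# Example download, run only in an interactive session with the package.
if (interactive() && requireNamespace("dataRetrieval", quietly = TRUE)) {
  flow <- fill_missing_days(fetch_daily_flow("01491000",
                                             "2010-01-01", "2010-12-31"))
}
```

Making gaps explicit as NA rows is one common first processing step, because many time series functions assume a regular daily index.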
Additional R Hydrology Packages
- hydroTSM: Excellent graphics package for time series data. Two examples below of one-line plot functions and summary statistics.
- fasstr: hydrological time series analysis with a quick-start Cheat Sheet
- FlowScreen: temporal trends and changepoint analysis
- hydrostats: Computation of daily streamflow metrics (e.g. low flows, high flows, seasonality)
- FAdist: Common probability distributions for frequency analysis. These distributions are not available in base R.
- lmom and lmomRFA: L-moments for common probability distributions, and regional frequency analysis.
- nhdR: National Hydrography Dataset downloads
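For the hydroTSM one-liners mentioned in the list above, a sketch along these lines should work, assuming hydroTSM is installed and that your version still ships the `OcaEnOnaQts` daily streamflow sample dataset along with the `smry()` and `hydroplot()` functions. The example is guarded so that it does nothing if the package is missing.

```r
# Flag recording whether the optional example actually ran.
ran_hydroTSM_example <- FALSE

if (requireNamespace("hydroTSM", quietly = TRUE)) {
  library(hydroTSM)

  # Daily streamflow sample dataset shipped with hydroTSM (a zoo object).
  data(OcaEnOnaQts)

  # One-line table of summary statistics for the series.
  print(smry(OcaEnOnaQts))

  # One-line multi-panel figure of the series at several time scales.
  hydroplot(OcaEnOnaQts, FUN = mean, var.type = "Flow", var.unit = "m3/s")

  ran_hydroTSM_example <- TRUE
}
```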
Water Quality
- EGRET: Excellent for plotting water quality data
  - User guide
  - Confidence intervals for WRTDS regression method
- smwrQW: USGS water quality data processing. Seems better than EGRET for quantitative water quality data analysis functions.
Streamflow & Weather Generators
- MATLAB Kirsch-Nowak Streamflow Generator and blog posts
- RMAWGEN: Multi-site Auto-regressive Weather GENerator in R
- weathergen: for synthetic climate timeseries
Weather Station Data
- rnoaa: NOAA station data downloads using R (tip: the meteo_tidy_ghcnd() function provides nicely formatted year-month-day datasets)
- GHCNpy: similar package in Python for GHCN-Daily records. Also has visualization functions.
- countyweather: US County-level weather data
- getMet: Seems like a SWAT model companion for weather data
- meteo: Data Analysis and Visualization
Climate Assessment
- musica: Climate model assessment tools
Hydrological Models in R:
- topmodel: Topmodel for flow modeling
- swmmr: SWMM model for stormwater
- swatmodel: SWAT model for ecohydrological studies
Land Use / Land Cover Change
- lulcc: Land Use and Land Cover Change modeling
If you know of other good packages, please add them to the comments or write to me so that I can add them to the lists.
A review article on hydrology and R was just published in HESS. You can find it here: https://www.hydrol-earth-syst-sci.net/23/2939/2019/