hddtools: Hydrological Data Discovery Tools

Claudia Vitolo

2021-02-14

Introduction

hddtools stands for Hydrological Data Discovery Tools. This R package is an open source project designed to facilitate access to a variety of online open data sources relevant for hydrologists and, in general, environmental scientists and practitioners.

Accessing such data typically involves downloading a metadata catalogue, selecting the information needed, formally requesting the dataset(s), de-compression, conversion, manual filtering and parsing. All these operations are made more efficient by re-usable functions.

Depending on the data license, functions can provide offline and/or online modes. When redistribution is allowed, for instance, a copy of the dataset is cached within the package and updated twice a year. This is the fastest option and also allows offline use of the package’s functions. When redistribution is not allowed, only online mode is provided.
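
The trade-off between a cached copy and an online request can be illustrated with a generic sketch. This is not how hddtools is implemented (the package ships its cached copies inside the package itself). The helper `cached_fetch()`, the stand-in `fake_download()` and the 182-day threshold are hypothetical names and values chosen only for illustration.

```r
# Hypothetical helper, not part of hddtools: return a cached object when it is
# recent enough, otherwise call fetch_fun() and refresh the cache file.
cached_fetch <- function(cache_file, fetch_fun, max_age_days = 182) {
  if (file.exists(cache_file)) {
    age_days <- as.numeric(difftime(Sys.time(), file.mtime(cache_file),
                                    units = "days"))
    if (age_days <= max_age_days) {
      return(readRDS(cache_file))  # offline path: no network access needed
    }
  }
  fresh <- fetch_fun()             # online path: retrieve a new copy
  saveRDS(fresh, cache_file)       # store it for later calls
  fresh
}

# Stand-in for a real download; it counts how often it is called
n_calls <- 0
fake_download <- function() {
  n_calls <<- n_calls + 1
  data.frame(id = 1:3, river = c("A", "B", "C"), stringsAsFactors = FALSE)
}

cache_file <- tempfile(fileext = ".rds")
first  <- cached_fetch(cache_file, fake_download)  # triggers fake_download()
second <- cached_fetch(cache_file, fake_download)  # served from the cache file
```

In this sketch the second call never reaches `fake_download()`, which mirrors why a cached dataset is faster and usable offline.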

Installation

Get the released version from CRAN:

install.packages("hddtools")

Or the development version from GitHub using devtools:

devtools::install_github("ropensci/hddtools")

Load the hddtools package:

library("hddtools")

Data sources and Functions

The functions provided can retrieve hydrological information from a variety of data providers. To filter the data, it is advisable to use the package dplyr.

library("dplyr")

The Koppen Climate Classification map

The Koppen Climate Classification is the most widely used system for classifying the world’s climates. Its categories are based on the annual and monthly averages of temperature and precipitation. It was first updated by Rudolf Geiger in 1961, then by Kottek et al. (2006), Peel et al. (2007) and then by Rubel et al. (2010).

The package hddtools contains a function to identify the updated Koppen-Geiger climate zone, given a bounding box.

# Define a bounding box
areaBox <- raster::extent(-10, 5, 48, 62)

# Extract climate zones from Peel's map:
KGClimateClass(areaBox = areaBox, updatedBy = "Peel")
#>   ID Class Frequency
#> 1  9   Csb         1
#> 2 15   Cfb      6199
#> 3 16   Cfc        10
#> 4 27   Dfc         3
#> 5 29    ET         1
# Extract climate zones from Kottek's map:
KGClimateClass(areaBox = areaBox, updatedBy = "Kottek")
#>   ID Class Frequency
#> 1 32   Cfb     12301
#> 2 33   Cfc       507
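
The frequency tables above can be summarised further with base R. The sketch below re-types the Peel output shown above as a data frame and computes the percentage share of each class, assuming that `Frequency` counts the map cells falling in each class within the bounding box. The names `peel`, `Share` and `dominant` are introduced here for illustration only.

```r
# Frequency table copied from the Peel output printed above
peel <- data.frame(
  ID = c(9, 15, 16, 27, 29),
  Class = c("Csb", "Cfb", "Cfc", "Dfc", "ET"),
  Frequency = c(1, 6199, 10, 3, 1),
  stringsAsFactors = FALSE
)

# Percentage share of each class (assumes Frequency is a cell count)
peel$Share <- round(100 * peel$Frequency / sum(peel$Frequency), 2)

# Classes sorted from most to least frequent
peel[order(-peel$Frequency), c("Class", "Frequency", "Share")]

# Most frequent class in the bounding box
dominant <- peel$Class[which.max(peel$Frequency)]
dominant
#> [1] "Cfb"
```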

The Global Runoff Data Centre

The Global Runoff Data Centre (GRDC) is an international archive hosted by the Federal Institute of Hydrology in Koblenz, Germany. The Centre operates under the auspices of the World Meteorological Organisation and retains services and datasets for all the major rivers in the world. The catalogue, KML files and the Long-Term Mean Monthly Discharges product are open data and accessible via hddtools.

Information on all the GRDC stations can be retrieved using the function catalogueGRDC with no input arguments, as in the example below:

# GRDC full catalogue
GRDC_catalogue <- catalogueGRDC()

It is advisable to use the package dplyr for convenient filtering. Some examples are provided below.

# Filter GRDC catalogue based on a country code
GRDC_catalogue %>%
  filter(country == "IT")

# Filter GRDC catalogue based on river name
GRDC_catalogue %>%
  filter(river == "PO, FIUME")

# Filter GRDC catalogue to stations whose daily records start in 2000 or later
GRDC_catalogue %>%
  filter(d_start >= 2000)

# Filter the catalogue based on a geographical bounding box
GRDC_catalogue %>%
  filter(between(x = long, left = -10, right = 5),
         between(x = lat, left = 48, right = 62))

# Combine filtering criteria
GRDC_catalogue %>%
  filter(between(x = long, left = -10, right = 5),
         between(x = lat, left = 48, right = 62),
         d_start >= 2000,
         area > 1000)
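
The combined filter can also be tried without downloading the catalogue or loading dplyr. The sketch below applies the same criteria with base R to a small made-up table. The rows of `toy_catalogue` are hypothetical and only borrow the column names used above (`long`, `lat`, `d_start`, `area`); they are not real GRDC records.

```r
# Hypothetical stations, invented for illustration (not GRDC data)
toy_catalogue <- data.frame(
  station = c("S1", "S2", "S3", "S4"),
  long    = c(-3.0, 2.5, 12.0, -1.2),
  lat     = c(52.0, 49.5, 45.0, 55.3),
  d_start = c(2003, 1985, 2010, 2001),
  area    = c(2500, 8000, 70000, 400),
  stringsAsFactors = FALSE
)

# Bounding box test; like dplyr::between(), both ends are inclusive
in_box <- toy_catalogue$long >= -10 & toy_catalogue$long <= 5 &
  toy_catalogue$lat >= 48 & toy_catalogue$lat <= 62

# Combine with the record start and catchment area criteria
keep <- in_box & toy_catalogue$d_start >= 2000 & toy_catalogue$area > 1000

# which() drops rows where the condition is NA, as dplyr::filter() does
selected <- toy_catalogue[which(keep), ]
selected$station
#> [1] "S1"
```

Only the first station passes all criteria: the second starts too early, the third lies outside the box, and the fourth has too small an area.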

The GRDC catalogue (or a subset) can be used to create a map.

# Visualise outlets on an interactive map
library(leaflet)
leaflet(data = GRDC_catalogue %>% filter(river == "PO, FIUME")) %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  addMarkers(~long, ~lat, popup = ~station)