dgevbhm vignette

In this vignette, we present an example workflow to apply our R code for fitting the duration-dependent Generalized Extreme Value distribution (dGEV) in a Bayesian Hierarchical framework (Rischmuller et al. 2026). We use sub-daily precipitation records from the Global Sub-Daily Rainfall dataset (Lewis et al. 2019) over the United Kingdom.

library(dgevbhm)
#> Package dgevbhm loaded successfully.
#> NOTE: Always update using pak::pkg_install('alexleer/dgevbhm') to avoid long C++ compile times.

The data_read() function reads in sub-daily precipitation data directly from a directory dir_path of text files. missing_data specifies the minimum percentage of missing data per year; otherwise, the year is disregarded from the data. durs are the durations of the precipitation maxima for IDF/dGEV fitting. cores allows the user to specify the number of CPU cores for parallel processing, significantly speeding up the data reading process.

To reproduce this workflow on your own machine, you will first need to obtain the raw sub-daily precipitation text files (e.g., from the Global Sub-Daily Rainfall dataset). Please download the data and place the files into a local directory on your computer.

Once your directory is set up, you can read the data into R by pointing the dir_path argument to your chosen folder:

# Replace "/path/to/directory/" with the actual path where you saved the data
UK_data_full <- data_read(dir_path = "/path/to/directory/",
                          missing_data = 5,
                          durs = c(1, 3, 6, 12, 24, 48),
                          cores = 4)

(Note: For this vignette, the processed UK_data_full object is already bundled and loaded with the package.)

length(UK_data_full$y)
#> [1] 47

Out of 1903 files, 47 stations were selected based on the requirements mentioned above.

Selecting the rain gauges

dgev_bhm() is the central function of our package. It is able to fit the Bayesian Hierarchical dGEV models for a chosen sample of rain gauges. There, the user can choose if the shape parameter of the dGEV is fixed over durations (resulting in the same shape parameter for all durations, \(\xi_d\)) or fixed over space (resulting in the same shape parameter for all rain gauges, \(\xi_j\)).

# mean and median hourly precipitation rate (mm)
mean(UK_data_full$amp)
#> [1] 0.1024251
median(UK_data_full$amp)
#> [1] 0.0882582

Below we show an example of selecting similar stations using annual mean precipitation amp:

med_amp <- median(UK_data_full$amp)
# similar stations: selection of 25 stations that are closest to the median precipitation rate
sim_stations <- which(rank(abs(UK_data_full$amp - med_amp)) <= 25)
# plotting a histogram of the 25 precipitation rates
hist(UK_data_full$amp[sim_stations], main = 'histogram', xlab = 'stations')

names(UK_data_full) #checking header of the GSDR data
#> [1] "y"         "amp"       "latlon"    "stationID" "durs"
dim(UK_data_full$latlon) #dimension of the latlon 
#> [1] 47  2
# clean the data: select the 25 rain gauges and store them in a new object 
UK_data <- list()
UK_data$y <- UK_data_full$y[sim_stations]
UK_data$amp <- UK_data_full$amp[sim_stations]
UK_data$latlon <- UK_data_full$latlon[sim_stations, ]
UK_data$stationID <- UK_data_full$stationID[sim_stations]
# don't forget the durations!
UK_data$durs <- c(1, 3, 6, 12, 24, 48)

Fit the Bayesian hierarchical models

# Fit the dGEV-BHM with the shape parameter fixed over durations (shp_d = 'd'). We opt for 2000 iterations of the MCMC sampler, where the first half is used as burn-in. 
# We assume that 2 cores are available for computation. 
# As you can only use one core per chain, we use 2 chains.

# WARNING: May take some time. You can also load the prepared fitted models.
fit_xi_d <- dgev_bhm(data = UK_data, chains = 2, iter = 2000, cores = 2, shp_d = 'd')
#> 
#> Computing initial parameters... Complete!
#> 
#> Loading pre-compiled Stan model (durational)... Complete!
#> 
#> SAMPLING FOR MODEL 'dgev_bhm_xi_d' NOW (CHAIN 1).
#> 
#> SAMPLING FOR MODEL 'dgev_bhm_xi_d' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 0.002303 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 23.03 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0.002291 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 22.91 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 100.215 seconds (Warm-up)
#> Chain 2:                70.525 seconds (Sampling)
#> Chain 2:                170.74 seconds (Total)
#> Chain 2: 
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 105.258 seconds (Warm-up)
#> Chain 1:                70.23 seconds (Sampling)
#> Chain 1:                175.488 seconds (Total)
#> Chain 1:
# Fit the dGEV-BHM with the shape parameter fixed over space (shp_d = 'j').

# WARNING: May take some time. You can also load the prepared fitted models.
fit_xi_j <- dgev_bhm(data = UK_data, chains = 2, iter = 2000, cores = 2, shp_d = 'j')
#> 
#> Computing initial parameters... Complete!
#> 
#> Loading pre-compiled Stan model (spatial)... Complete!
#> 
#> SAMPLING FOR MODEL 'dgev_bhm_xi_j' NOW (CHAIN 1).
#> 
#> SAMPLING FOR MODEL 'dgev_bhm_xi_j' NOW (CHAIN 2).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0.003353 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 33.53 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 2: 
#> Chain 2: Gradient evaluation took 0.003252 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 32.52 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 102.498 seconds (Warm-up)
#> Chain 1:                71.637 seconds (Sampling)
#> Chain 1:                174.135 seconds (Total)
#> Chain 1: 
#> Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 105.847 seconds (Warm-up)
#> Chain 2:                69.98 seconds (Sampling)
#> Chain 2:                175.827 seconds (Total)
#> Chain 2:
# structure of the fitted models
names(fit_xi_d) 
#>  [1] "pars"                 "data"                 "information_criteria"
#>  [4] "mcmc_diagnostics"     "durs"                 "j"                   
#>  [7] "d"                    "shp_d"                "mcsamp"              
#> [10] "stationID"
names(fit_xi_j)
#>  [1] "pars"                 "data"                 "information_criteria"
#>  [4] "mcmc_diagnostics"     "durs"                 "j"                   
#>  [7] "d"                    "shp_d"                "mcsamp"              
#> [10] "stationID"

There are 10 list items in the fitted model:

  • pars – posterior parameters samples
  • data – original input data
  • information_criteria
    1. widely applicable information criteria
    2. leave-one-out information criteria
  • mcmc_diagnostics
    1. R-hat
    2. Effective sample size
  • durs – duration vector of the model
  • j – number of localities
  • d – number of durations
  • shp_d – hierarchical dimension with respect to space "j" or durations "d"
  • mcsamp – MCMC sample size
  • stationID – vector of station IDs

Bayesian summary statistics

From the fitted dgev_bhm() object, we can obtain MCMC convergence diagnostics.

Here for the parameter \(\tilde{\mu}\):

# MCMC diagnostics for mut
fit_xi_d$mcmc_diagnostics$rhat$mut
#>    mut[1]    mut[2]    mut[3]    mut[4]    mut[5]    mut[6]    mut[7]    mut[8] 
#> 0.9991725 0.9995434 0.9991507 0.9990208 0.9994259 0.9997968 0.9993415 0.9994561 
#>    mut[9]   mut[10]   mut[11]   mut[12]   mut[13]   mut[14]   mut[15]   mut[16] 
#> 0.9997098 1.0009602 0.9996644 0.9997824 0.9997590 1.0002515 1.0000173 0.9990407 
#>   mut[17]   mut[18]   mut[19]   mut[20]   mut[21]   mut[22]   mut[23]   mut[24] 
#> 0.9991513 1.0001993 0.9993007 0.9991746 0.9994584 0.9994935 0.9997975 0.9994897 
#>   mut[25] 
#> 1.0011175
fit_xi_d$mcmc_diagnostics$ess$mut
#>   mut[1]   mut[2]   mut[3]   mut[4]   mut[5]   mut[6]   mut[7]   mut[8] 
#> 4199.119 3473.473 3383.287 4735.476 4818.267 3377.576 4617.195 5010.845 
#>   mut[9]  mut[10]  mut[11]  mut[12]  mut[13]  mut[14]  mut[15]  mut[16] 
#> 3843.079 3228.694 3111.379 4298.047 3761.694 3926.366 2829.858 6016.675 
#>  mut[17]  mut[18]  mut[19]  mut[20]  mut[21]  mut[22]  mut[23]  mut[24] 
#> 4281.848 3448.553 4152.583 3548.250 4428.796 3968.504 4968.388 5570.229 
#>  mut[25] 
#> 4387.957

The \(\hat{R}\) values for \(\tilde{\mu}\) are near 1.000, indicating perfect MCMC convergence and chain mixing. Furthermore, the large Effective Sample Sizes (5,700–8,700) confirm an abundance of independent draws. Together, these diagnostics guarantee highly stable and reliable parameter estimates for the dGEV model.

Information criteria

Examine out-of-sample predictive accuracy with information criteria (IC).

fit_xi_j$information_criteria
#> $waic
#>             Estimate         SE
#> elpd_waic -5763.1218  82.871994
#> p_waic      109.8157   4.095126
#> waic      11526.2435 165.743988
#> 
#> $looic
#>            Estimate         SE
#> elpd_loo -5764.5007  82.936025
#> p_loo      111.1946   4.264695
#> looic    11529.0014 165.872051
fit_xi_d$information_criteria
#> $waic
#>              Estimate         SE
#> elpd_waic -5787.05977  83.580923
#> p_waic       97.25975   4.429872
#> waic      11574.11954 167.161847
#> 
#> $looic
#>             Estimate         SE
#> elpd_loo -5787.92819  83.621697
#> p_loo       98.12817   4.584659
#> looic    11575.85638 167.243394

The dGEV-BHM-\(\xi_j\) is the better model, because the IC values are lower, indicating superior out-of-sample predictive accuracy.

Checking the model fits

Now we will proceed with goodness-of-fit metrics.

Probability-probability plots:

pp_plot(fit = fit_xi_d, j = 5)

pp_plot(fit = fit_xi_j, j = 5)

Ideally, the dots should line up close to the 1:1 diagonal. Furthermore, no single duration should deviate significantly, as this could indicate that the dGEV model is inappropriate for capturing duration dependence. While both models yield visually similar results in the PP-plots, the information criteria previously established that dGEV-BHM-\(\xi_j\) provides a better overall fit for this dataset.

Posterior predictive metrics

The figures below generated by the post_pred_plot function are posterior predictive histograms. The histogram generates the theoretical statistic (e.g., quantiles, median, max, min) of the posterior parameter samples. In our case, the statistics are computed using the GEV itself. We compare this with the empirical statistic from the data sample. With this, we get a sense of the goodness of fit as well as the uncertainty.

par(mfrow = c(2, 1))
post_pred_plot(fit = fit_xi_d, stat = 'median', j = 1, d = 6) # posterior predictive diagnostic plot for rain gauge #1 and duration 48h for the fit_xi_d model 
post_pred_plot(fit = fit_xi_j, stat = 'median', j = 1, d = 6) # posterior predictive diagnostic plot for rain gauge #1 and duration 48h for the fit_xi_j model

par(mfrow = c(2, 1))
post_pred_plot(fit = fit_xi_d, stat = 'min', j = 6, d = 1) # posterior predictive diagnostic plot for rain gauge #6 and duration 1h for the fit_xi_d model 
post_pred_plot(fit = fit_xi_j, stat = 'min', j = 6, d = 1) # posterior predictive diagnostic plot for rain gauge #6 and duration 1h for the fit_xi_j model

In the posterior predictive histograms, not all stations are going to fit perfectly. This is especially, and almost exclusively, true for 'min' and 'max' statistics. This is due to the asymptotic nature of the GEV, which models the limit of block extremes (the histogram) and therefore often fails to capture a single, rare absolute record (the red line) in a finite sample.

par(mfrow = c(2, 1))
post_pred_plot(fit = fit_xi_d, stat = 'max', j = 3, d = 1)
post_pred_plot(fit = fit_xi_j, stat = 'max', j = 3, d = 1)

Instead of using a diagram, we can calculate the posterior predictive p-value using the post_pred_pval() function. The inputs are identical to the post_pred_plot() function. An ideal fit for the model would yield a p-value of 0.5.

post_pred_pval(fit = fit_xi_d, stat = 'median', j = 2, d = 6)
#> [1] 0.2605
post_pred_pval(fit = fit_xi_j, stat = 'median', j = 2, d = 6)
#> [1] 0.3295

While p-values can fluctuate across different gauge samples, the dGEV-BHM-\(\xi_j\) model’s median aligns with the quantitative information criteria results, demonstrating a reliable and superior fit for this specific spatial sample.

Return level plots

The inputs are:

  • j - spatial locality
  • d - duration
  • max_rp - maximum plotted return period
  • alpha - significance level for credible intervals (default alpha=0.05)
# let's try a significance level of 0.1
return_level_plot(fit_xi_d, j = 5, d = 1, max_rp = 100, alpha = 0.1)

return_level_plot(fit_xi_j, j = 5, d = 1, max_rp = 100, alpha = 0.1)

Intensity-duration-frequency curves

The inputs are:

  • j - spatial locality
  • rp - return periods (i.e. frequency)
  • alpha - significance level for credible intervals (default alpha=0.05)

Here we have a single IDF curve on the plot:

idf_plot(fit = fit_xi_d,j = 5, rp = 100, alpha = 0.05)
#> Computing d-dimensional shape IDF curve...

Multiple IDF curves on the plot:

idf_plot(fit = fit_xi_d,j = 5, rp = c(2,10,100), alpha = 0.05)
#> Computing d-dimensional shape IDF curve...

The IDF curves are the final result, and they can be produced for all 25 rain gauges, where the user can select return period(s) and the confidence interval. Rischmuller et al. (2026) have shown the added value of the dGEV Bayesian Hierarchical estimation compared to standard station-wise GEV or dGEV-MLE estimations in terms of higher accuracy and robustness.

Lewis, Elizabeth, Hayley Fowler, Lisa Alexander, et al. 2019. “GSDR: A Global Sub-Daily Rainfall Dataset.” Journal of Climate 32 (15): 4715–29.
Rischmuller, Alexander Lee, Benjamin Poschlod, and Jana Sillmann. 2026. “Bayesian Hierarchical Modelling of Intensity-Duration-Frequency Curves Using a Climate Model Large Ensemble.” Advances in Statistical Climatology, Meteorology and Oceanography 12 (1): 1–19.