welcome to part 2 of our short introduction to extreme value analysis using the
extRemes package in R.
As some of you might know, there are two common approaches for practical extreme value analysis. Today, we will focus on the first of these two approaches, called the block maxima method.
This approach for modelling extremes of a (time) series of observations is based on the utilization of maximum or minimum values of these observations within a certain sequence of constant length. For a sufficiently large number n of established blocks, the resulting peak values of these n blocks of equal length can be used for fitting a suitable distribution to these data. While the block size is basically freely selectable, a trade-off has to be made between bias (small blocks) and variance (large blocks). Usually, the length of the sequences is often chosen to correspond to a certain familiar time period, in most cases a year. The resulting vector of annual maxima (or minima, respectively) is calles “Annual Maxima (Minima) Series” or simply AMS.
According to the Fisher–Tippett–Gnedenko theorem, the distribution of block maxima can be approximated by a generalized extreme value distribution.
The following code shows a short practical example of fitting a generalized extreme value distribution to a time series of precipitation data using the
extRemes package in R. The sample data set features precipitation data of Bärnkopf (Lower Austria) from 1971 through 2014 and is is provided by the hydrographic services of Austria via eHYD. The function
read_ehyd() for importing the data set can be found at Reading data from eHYD using R.
# load required packages library(extRemes) library(xts) # get data from eHYD ehyd_url <- "http://ehyd.gv.at/eHYD/MessstellenExtraData/nlv?id=107540&file=2" precipitation_xts <- read_ehyd(ehyd_url) # derive AMS for maximum precipitation ams <- apply.yearly(precipitation_xts, max) # maximum-likelihood fitting of the GEV distribution fit_mle <- fevd(as.vector(ams), method = "MLE", type="GEV") # diagnostic plots plot(fit_mle) # return levels: rl_mle <- return.level(fit_mle, conf = 0.05, return.period= c(2,5,10,20,50,100)) # fitting of GEV distribution based on L-moments estimation fit_lmom <- fevd(as.vector(ams), method = "Lmoments", type="GEV") # diagnostic plots plot(fit_lmom) # return levels: rl_lmom <- return.level(fit_lmom, conf = 0.05, return.period= c(2,5,10,20,50,100)) # return level plots par(mfcol=c(1,2)) # return level plot w/ MLE plot(fit_mle, type="rl", main="Return Level Plot for Bärnkopf w/ MLE", ylim=c(0,200), pch=16) loc <- as.numeric(return.level(fit_mle, conf = 0.05,return.period=100)) segments(100, 0, 100, loc, col= 'midnightblue',lty=6) segments(0.01,loc,100, loc, col='midnightblue', lty=6) # return level plot w/ LMOM plot(fit_lmom, type="rl", main="Return Level Plot for Bärnkopf w/ L-Moments", ylim=c(0,200)) loc <- as.numeric(return.level(fit_lmom, conf = 0.05,return.period=100)) segments(100, 0, 100, loc, col= 'midnightblue',lty=6) segments(0.01,loc,100, loc, col='midnightblue', lty=6) # comparison of return levels results <- t(data.frame(mle=as.numeric(rl_mle), lmom=as.numeric(rl_lmom))) colnames(results) <- c(2,5,10,20,50,100) round(results,1)
In this case, both results are quite similar. In most cases, L-moments estimation is more robust than maximum likelihood estimation. In addition to these classical estimation methods,
extRemes offers Generalized Maximum Likelihood Estimation (GMLE, Martins and Stedinger, 2000) and Bayesian estimation methods (Gilleland and Katz, 2016). Moreover, I have made the observation that maximum likelihood estimation works more reliable in other R packages in some cases (e.g.
Reference: Coles, Stuart (2001). An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. London: Springer.