Introduction to Extreme Value Analysis in R – Part 4: Dealing with trends

Hey there,

welcome to part 4 of our short introduction to extreme value analysis using the extRemes package in R.

In the recent posts about the block maxima method and the threshold excess method, we have simply assumed that all assumptions for extreme value analysis are fulfilled. However, this is most likely not the case when dealing with environmental variables. Especially the assumption of stationarity may be violated in many cases. Against the background of global climate change, it is likely that there is a a considerable trend in the time series of meteorological or other environmental variables. Of course, this trend has to be incorporated into the analysis, as the resulting return levels change over time.

The following code shows a short practical example of fitting a generalized pareto distribution to a time series of precipitation data using the extRemes package in R. The sample data set features precipitation data of Bärnkopf (Lower Austria) from 1971 through 2013 and is provided by the hydrographic services of Austria via via eHYD.

# load required packages
library(extRemes)
library(xts)
library(ggplot2)

# get data from eHYD
ehyd_url <- "http://ehyd.gv.at/eHYD/MessstellenExtraData/nlv?id=107540&file=2"
precipitation_xts <- read_ehyd(ehyd_url)

# derive AMS for maximum precipitation
ams <- apply.yearly(precipitation_xts, max)

# check stationarity within AMS:
# Mann-Kendall trend test
Kendall::MannKendall(ams)

# simple linear model
ams_df <- fortify(ams)
colnames(ams_df)[1] <- "date"
summary(lm(ams~date, data=ams_df))

p <- ggplot(ams_df, aes(x = date, y = ams)) + geom_point() + geom_line()
p + stat_smooth(method = "lm", formula = y ~ x, size = 1)

trend

Both the results of the fitted linear model and the graphical impression given by the plot indicate an upward trend of annual maximum precipitation amounts. The Mann-Kendall trend test returns a very small p-value, confirming this trend. Hence, trend correction has to be applied in order to account for changing return levels over time.

# maximum likelihood estimation
mle_trend <- fevd(x = ams, data = ams_df, location.fun=~date, method = "MLE", type="GEV")
rl_trend <- return.level(mle_trend, conf = 0.05, return.period= c(2,5,10,20,50,100))

# return level plot
plot(mle_trend, type="rl", main="Return Level Plot for Bärnkopf w/ MLE")

return_levels_trend
Compared to the return level plots of the previous posts (without trend), this return level plot looks different. It displays the change of the 5 and 100 year return level over time.

Reference: Coles, Stuart (2001). An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. London: Springer.

About This Author

Matthias studied Environmental Information Management at the University of Natural Resources and Life Sciences Vienna and holds a PhD in environmental statistics. The focus of his thesis was on the statistical modelling of rare (extreme) events as a basis for vulnerability assessment of critical infrastructure. He is working at the Austrian national weather and geophysical service (ZAMG) and at the Institute of Mountain Risk Engineering at BOKU University. He currently focuses the (statistical) assessment of adverse weather events and natural hazards, and disaster risk reduction. His main interests are statistical modelling of environmental phenomena as well as open source tools for data science, geoinformation and remote sensing.

9 Comments

You can post comments in this post.


  • Very nice and very helpful!

    How did you get the return level x-axis to be labeled “Year”? Even when I run your code I get “index.”
    Thanks in advance!

    Mike 7 years ago Reply


    • Try adding xlab="Year" inside the plot function.

      Martin 7 years ago Reply


      • That makes sense, but something in fevd doesn’t like that.

        plot(mle.trend, type="rl", ylim=c(0,200),
        +      main="Return Level Plot for Bärnkopf w/ MLE",xlab="Year")
        Error in plot.default(y, type = "l", xlab = "index", ylab = ylb, ...) : 
          formal argument "xlab" matched by multiple actual arguments
        

        Again your complete examples (data, code and graphics) are super helpful for beginners like me.

        Mike 7 years ago Reply


        • As a matter of fact, I am using customized plotting functions for fevd objects. In this particular case, it’s not that easy to modify the plots, as the fevd function is somehow bloated, and so are its plotting functions. So, even though the code for the plots is quite simple, you have to find the correct position where you have to make your modifications. In your code, the xlab-definition gets passed on to a plot()-function where the argument xlab is already defined as xlab = "index". If you want to customize your plot, you have to modify the function plot.fevd.mle. Around line 550 you will find the actual plot function that gets called in this specific case. You can change type = "b", xlab = "Year", pch = 16 to get a similar plot to the one I used in this post. In addition, you can dig into this plotting function if you want to change colors, return levels or plot multiple fitting methods in one plot (e.g. comparison of MLE and L-moments in one plot).

          Matthias 7 years ago Reply


  • I’ve got it to work, and understand why its important, however I’m a bit lost on how the trend was removed. In other words what exactly is occurring to remove the trend from the data, or is it just applying the trend to the return levels over time.

    Cameron 5 years ago Reply


  • The trend-corrected estimation $z$ at time $t$ is obtained as $\hat{z}_{t}=y_{t}-\hat{y}_{t}+\hat{y}_{l}$, where $y_{t}$ is the measurement at time $t$ and $\hat{y}_{t}$ is the trend at time $t$ obtained from the linear trend model $\hat{y}_{t}=\beta_{0}+\beta_{1}t$, with intercept $\beta_{0}$ and slope $\beta_{1}$, and $\hat{y}_{l}$ being the trend estimate for the latest year.

    This means that you (1) take each measurement, (2) subtract the estimate obtained via linear regression for said measurement, and (3) add the estimate obtained via linear regression for the last year in your time series.

    Matthias 5 years ago Reply


  • Hi there
    I’m having problem with the command “read_ehyd(ehyd_url)”.
    Can you please explain that command? 🙂
    Thanks in advance.

    Tomas Hansson 5 years ago Reply


    • Hi Tomas,

      this is a custom function to read data from the digital hydrological archive of Austria.
      It is explained in the post Reading data from eHYD using R.

      Regards,
      Matthias

      Matthias 5 years ago Reply


      • OK
        Thank you, Matthias.

        Regards,
        Tomas

        Tomas Hansson 5 years ago Reply


Post A Reply

*