Hey there,
welcome to part 4 of our short introduction to extreme value analysis using the extRemes
package in R.
In the recent posts about the block maxima method and the threshold excess method, we have simply assumed that all assumptions for extreme value analysis are fulfilled. However, this is most likely not the case when dealing with environmental variables. Especially the assumption of stationarity may be violated in many cases. Against the background of global climate change, it is likely that there is a a considerable trend in the time series of meteorological or other environmental variables. Of course, this trend has to be incorporated into the analysis, as the resulting return levels change over time.
The following code shows a short practical example of fitting a generalized pareto distribution to a time series of precipitation data using the extRemes
package in R. The sample data set features precipitation data of Bärnkopf (Lower Austria) from 1971 through 2013 and is provided by the hydrographic services of Austria via via eHYD.
# load required packages library(extRemes) library(xts) library(ggplot2) # get data from eHYD ehyd_url <- "http://ehyd.gv.at/eHYD/MessstellenExtraData/nlv?id=107540&file=2" precipitation_xts <- read_ehyd(ehyd_url) # derive AMS for maximum precipitation ams <- apply.yearly(precipitation_xts, max) # check stationarity within AMS: # Mann-Kendall trend test Kendall::MannKendall(ams) # simple linear model ams_df <- fortify(ams) colnames(ams_df)[1] <- "date" summary(lm(ams~date, data=ams_df)) p <- ggplot(ams_df, aes(x = date, y = ams)) + geom_point() + geom_line() p + stat_smooth(method = "lm", formula = y ~ x, size = 1)
Both the results of the fitted linear model and the graphical impression given by the plot indicate an upward trend of annual maximum precipitation amounts. The Mann-Kendall trend test returns a very small p-value, confirming this trend. Hence, trend correction has to be applied in order to account for changing return levels over time.
# maximum likelihood estimation mle_trend <- fevd(x = ams, data = ams_df, location.fun=~date, method = "MLE", type="GEV") rl_trend <- return.level(mle_trend, conf = 0.05, return.period= c(2,5,10,20,50,100)) # return level plot plot(mle_trend, type="rl", main="Return Level Plot for Bärnkopf w/ MLE")
Compared to the return level plots of the previous posts (without trend), this return level plot looks different. It displays the change of the 5 and 100 year return level over time.
Reference: Coles, Stuart (2001). An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. London: Springer.
9 Comments
You can post comments in this post.
Very nice and very helpful!
How did you get the return level x-axis to be labeled “Year”? Even when I run your code I get “index.”
Thanks in advance!
Mike 7 years ago
Try adding
xlab="Year"
inside the plot function.Martin 7 years ago
That makes sense, but something in
fevd
doesn’t like that.Again your complete examples (data, code and graphics) are super helpful for beginners like me.
Mike 7 years ago
As a matter of fact, I am using customized plotting functions for
fevd
objects. In this particular case, it’s not that easy to modify the plots, as thefevd
function is somehow bloated, and so are its plotting functions. So, even though the code for the plots is quite simple, you have to find the correct position where you have to make your modifications. In your code, thexlab
-definition gets passed on to aplot()
-function where the argumentxlab
is already defined asxlab = "index"
. If you want to customize your plot, you have to modify the functionplot.fevd.mle
. Around line 550 you will find the actual plot function that gets called in this specific case. You can changetype = "b", xlab = "Year", pch = 16
to get a similar plot to the one I used in this post. In addition, you can dig into this plotting function if you want to change colors, return levels or plot multiple fitting methods in one plot (e.g. comparison of MLE and L-moments in one plot).Matthias 7 years ago
I’ve got it to work, and understand why its important, however I’m a bit lost on how the trend was removed. In other words what exactly is occurring to remove the trend from the data, or is it just applying the trend to the return levels over time.
Cameron 5 years ago
The trend-corrected estimation $z$ at time $t$ is obtained as $\hat{z}_{t}=y_{t}-\hat{y}_{t}+\hat{y}_{l}$, where $y_{t}$ is the measurement at time $t$ and $\hat{y}_{t}$ is the trend at time $t$ obtained from the linear trend model $\hat{y}_{t}=\beta_{0}+\beta_{1}t$, with intercept $\beta_{0}$ and slope $\beta_{1}$, and $\hat{y}_{l}$ being the trend estimate for the latest year.
This means that you (1) take each measurement, (2) subtract the estimate obtained via linear regression for said measurement, and (3) add the estimate obtained via linear regression for the last year in your time series.
Matthias 5 years ago
Hi there
I’m having problem with the command “read_ehyd(ehyd_url)”.
Can you please explain that command? 🙂
Thanks in advance.
Tomas Hansson 5 years ago
Hi Tomas,
this is a custom function to read data from the digital hydrological archive of Austria.
It is explained in the post Reading data from eHYD using R.
Regards,
Matthias
Matthias 5 years ago
OK
Thank you, Matthias.
Regards,
Tomas
Tomas Hansson 5 years ago
Post A Reply