Increasing the speed of {raster} processing with R: Part 1/3

Hey there rasterR folks!
I love to code with R, and I spend most of my time processing raster data with the {raster} package because it’s quite intuitive and easy to use compared to other scripting languages. Sometimes, however, when I work on larger datasets or have to process a lot of files, the sloooooow processing times make me wonder if R is always the best choice… Of course it is! You just have to know how to improve the performance of R, either by writing more efficient code or by clustering and parallelising your R functions. Since this seems to be a big hurdle for many R beginners and even some advanced coders, I decided to write a tutorial on this topic. It will consist of three parts, in which I describe how to speed up processing with the {raster} package.

Today we will have a short introduction to the topic, and I will show you how you can speed up processing by setting some local rasterOptions() in your R session, which will give you a good initial speed boost.

rasterOptions()

The rasterOptions() function from the {raster} package allows you to customize your R session. If you type rasterOptions() with no arguments, you can see the current (default) settings:

> rasterOptions()
format        : raster
datatype      : FLT8S
overwrite     : FALSE
progress      : none
timer         : FALSE
chunksize     : 1e+07
maxmemory     : 1e+08
tmpdir        : /var/folders/lk/hg070f5n4md9hrchf__th6080000gn/T//RtmpQrCgZD/raster//
tmptime       : 168
setfileext    : TRUE
tolerance     : 0.1
standardnames : TRUE
warn depracat.: TRUE
header        : none

There are two parameters that have a lot of influence on the performance of the {raster} package: chunksize and maxmemory.

  • chunksize: integer. Maximum number of cells to read/write in a single chunk while processing (chunk by chunk) disk based Raster* objects
  • maxmemory: integer. Maximum number of cells to read into memory.
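
To get a feeling for these two limits, it can help to check how many cells a raster actually has and whether {raster} considers it small enough to process in memory. The following is only a minimal sketch: the empty raster is a stand-in with the dimensions of the benchmark image used later in this post, and n = 5 is an arbitrary choice for illustration. How canProcessInMemory() relates to maxmemory may differ between versions of {raster}, so check ?rasterOptions for your installation.

```r
library(raster)

# Empty raster with the dimensions of the benchmark image used later on;
# no cell values are allocated, so this is cheap to create
r <- raster(nrows = 3485, ncols = 4606)

# Number of cells in a single layer
n_cells <- ncell(r)
n_cells

# Ask {raster} whether roughly n = 5 copies of this raster fit within the
# current memory settings (n = 5 is an arbitrary value for illustration)
can_fit <- canProcessInMemory(r, n = 5)
can_fit
```

If can_fit is FALSE, {raster} will work through the data chunk by chunk, which is where chunksize comes into play.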

The default value of maxmemory is 1e+08. Let’s see whether, and by how much, the processing time of a simple raster calculation changes when we increase the maxmemory limit. To increase the limit, you can simply write:

rasterOptions(maxmemory = 1e+09)
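
Since rasterOptions() changes the settings for the running session, it can be handy to reset them once the heavy processing is done. A minimal sketch of that pattern, assuming you want to go back to the package defaults afterwards (the value 1e+09 is simply the one used in this post, not a general recommendation):

```r
library(raster)

# Raise the memory limit for the current R session
rasterOptions(maxmemory = 1e+09)

# ... your raster processing goes here ...

# Restore all raster options to their default values
rasterOptions(default = TRUE)
```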

Benchmark

To test the influence of the maxmemory setting on the processing time, I loaded a Landsat 8 subset into R and performed a simple calculation ten times with a maxmemory limit of 1e+08 and ten times with a limit of 1e+09.

Here is a summary of the raster image that was used for the benchmark:

class       : RasterStack
dimensions  : 3485, 4606, 16051910, 5  (nrow, ncol, ncell, nlayers)
resolution  : 30, 30  (x, y)

And here is the code for the simple (nonsense) raster calculation used to measure the time:

start <- Sys.time()
ras_calc <- ras ^ 2 + ras
end <- Sys.time()
difftime(end,start)
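
If you want to repeat such a benchmark yourself, the timing can be wrapped in a small helper that runs the calculation several times per setting. This is only a sketch under a few assumptions: the stack is synthetic and far smaller than the Landsat 8 subset, so it runs quickly and will probably not show a difference between the two settings, and time_calc() is a hypothetical helper written for this example, not part of {raster}.

```r
library(raster)

# Synthetic 5-layer stack standing in for the Landsat 8 subset
# (assumption: much smaller than the original, so the sketch runs quickly)
set.seed(42)
r <- raster(nrows = 300, ncols = 300)
values(r) <- runif(ncell(r))
ras <- stack(r, r, r, r, r)

# Hypothetical helper: run the calculation n_runs times and
# return the elapsed time of each run in seconds
time_calc <- function(x, n_runs = 10) {
  vapply(seq_len(n_runs), function(i) {
    system.time(x ^ 2 + x)[["elapsed"]]
  }, numeric(1))
}

# Ten runs with the lower limit
rasterOptions(maxmemory = 1e+08)
t_low <- time_calc(ras)

# Ten runs with the higher limit
rasterOptions(maxmemory = 1e+09)
t_high <- time_calc(ras)

# Reset the options to their defaults
rasterOptions(default = TRUE)

# Collect the timings, then compare the means and draw a boxplot
timings <- data.frame(
  maxmemory = rep(c("1e+08", "1e+09"), times = c(length(t_low), length(t_high))),
  seconds   = c(t_low, t_high)
)
aggregate(seconds ~ maxmemory, data = timings, FUN = mean)
boxplot(seconds ~ maxmemory, data = timings, ylab = "Processing time (s)")
```

To see an effect of the setting, replace the synthetic stack with a raster that is large enough to hit the memory limit on your machine.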

Result:

The time saved by increasing the maxmemory limit from 1e+08 to 1e+09 was significant. With the lower memory limit (1e+08), the simple calculation took 61 seconds on average. With the higher limit (1e+09), the same task took only about 35 seconds, which is roughly 43% less. You can see the final result in the boxplot below:

[Figure: Maxmemory Benchmark, a boxplot of the processing times for both maxmemory limits]

I hope this was useful to some of you. The second part of the tutorial on “Increasing the speed of {raster} processing with R” will cover how you can parallelise your R code using the {foreach} and {doParallel} packages.
See you soon!

Martin

Martin was born in the Czech Republic and studied at the University of Natural Resources and Life Sciences, Vienna. He currently works at GeoVille, an Earth Observation company based in Austria, specialised in Land Monitoring. His main interests are open-source applications like R, (geospatial) statistics and data management, web mapping and visualization. He loves travelling, geocaching, photography and sports.

4 Comments


  • thanks! that made my script not crash on a cluster

    fa, 1 year ago


    • Happy to hear! Cheers Martin

      Martin, 1 year ago


  • Hi Martin, firstly thanks for sharing useful hints of ‘raster’ R package processing.

    I’d like to know if there is a limit of maxmemory. For instance, could we estimate maxmemory limit as a function of the computer ram? What do you think?

    Thanks in advance

    Jose Lucas, 8 months ago

