Hey folks!
Today I am going to show you how to perform a very basic kMeans unsupervised classification of satellite imagery using R. We will do this on a small subset of a Sentinel-2 image.
Sentinel-2 is a satellite mission launched by the European Space Agency, and its data is freely accessible, for example here.
The image I am going to use shows the northern part of Lake Neusiedl (east of Vienna, Austria). The region is famous for sailing, good wine and its beautiful nature (reeds and wetlands protected under the Ramsar Convention). I will be using bands 2 – blue, 3 – green, 4 – red and 8 – near infrared for the classification. Here is what the image looks like – on the left you can see the true color composite (432) and on the right a false color near infrared composite (843):

Lake Neusiedl as seen by Sentinel-2. The lake is surrounded by a reed belt and intensive agricultural activity. In the north-west of the image a deciduous forest can be seen.
Please note that you can use any other image of your choice to perform the classification. The code will stay the same!
Load the image into R:
To perform the unsupervised image classification we first need to load the image into R. This is as simple as these two lines:
library(raster) # load the raster package
image <- stack("path/To/YourImage/stack.tif") # read all bands of the image
Voila! Our image is in R! If you don’t have the raster package installed, please execute install.packages("raster") first.
Classify!
kMeans unsupervised classification can sound very confusing and hard if you have never classified an image before or if you are new to machine learning. No worries! You will actually only need about 3-4 lines of code and we are done 🙂 All we need is the ‘kmeans’ function. We need to specify the number of classes we want to “detect” in the image and the function will take care of the rest. It iteratively goes through the image values and looks for so-called clusters (= groups of spectrally similar pixels, which ideally correspond to a land cover class). In my case I want to detect six classes, let’s see how the unsupervised classification will perform:
# execute the kmeans function on the image values (indicated by the square brackets)
# and search for 6 clusters (centers = 6)
kMeansResult <- kmeans(image[], centers=6)

# create a dummy raster using the first layer of our image
# and replace the values of the dummy raster with the clusters (classes) of the kMeans classification
result <- raster(image[[1]])
result <- setValues(result, kMeansResult$cluster)

# plot the result
plot(result)
This is what the result looks like:
The default visualisation is not the best I have ever seen, but okay… it’s a good start. You can see six different colors each corresponding to a spectrally similar area.
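One thing the snippet above glosses over: kmeans() stops with an error if its input contains NA values, which is common at scene edges or under masks. Here is a minimal sketch of one way around that, using a small made-up matrix in place of image[] (rows are pixels, columns are bands). The names pixels, valid, km and clusters are placeholders I chose for illustration, and the values are invented.

```r
# Toy stand-in for image[]: 200 pixels x 4 bands (made-up reflectance values)
set.seed(42)  # kmeans starts from random centers, so fix the seed for repeatable results
pixels <- rbind(
  matrix(rnorm(400, mean = 0.05, sd = 0.01), ncol = 4),  # dark, "water"-like pixels
  matrix(rnorm(400, mean = 0.30, sd = 0.02), ncol = 4)   # bright, "vegetation"-like pixels
)
colnames(pixels) <- c("B2", "B3", "B4", "B8")
pixels[c(5, 150), ] <- NA  # pretend two pixels are masked out

# Keep only pixels that have a value in every band
valid <- complete.cases(pixels)

# nstart = 10 runs kmeans from 10 random starts and keeps the best solution
km <- kmeans(pixels[valid, ], centers = 2, nstart = 10)

# Put the cluster IDs back in their original pixel positions, NA elsewhere
clusters <- rep(NA_integer_, nrow(pixels))
clusters[valid] <- km$cluster

table(clusters, useNA = "ifany")
```

With a real image you would hand the clusters vector to setValues() in the same way as the kmeans result in the code above, so masked pixels simply stay empty in the plot.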
The first and necessary step after an unsupervised kMeans classification is to assign class names to the clusters detected by the algorithm. Let’s have a look at our classified image and compare it to our satellite scene from above; this will help us to assign names to the detected clusters/classes:
- 1 and 2 indicate reeds or forest
- Water is classified as 3
- Agriculture is roughly depicted by 4, 5 and 6.
Let’s change the coloring of our plot and see how it looks:
plot(result, col=c("darkgreen", "darkgreen","blue", "orange", "orange","orange"))

Unsupervised kMeans classification with class coloring. Dark green indicates reeds and forest, orange depicts agriculture and water is shown in blue.
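The cluster-to-class assignment from the list above can also be written down as a small lookup table that merges the six clusters into three named classes. This is only a sketch on a plain vector of cluster IDs; cluster_ids, lookup, class_names and class_ids are names I made up. Keep in mind that kmeans numbers its clusters arbitrarily, so the IDs will most likely differ on your run and the table has to be adjusted after looking at your own plot.

```r
# Example cluster IDs, standing in for kMeansResult$cluster
cluster_ids <- c(1, 2, 3, 4, 5, 6, 3, 1)

# Position = cluster ID from kmeans, value = merged class ID
# (1 = reeds/forest, 2 = water, 3 = agriculture; mapping taken from the list above)
lookup <- c(1, 1, 2, 3, 3, 3)
class_names <- c("reeds/forest", "water", "agriculture")

# Translate every cluster ID into its merged class ID
class_ids <- lookup[cluster_ids]

# Show the class name for each pixel
class_names[class_ids]
```

Passing class_ids to setValues() instead of the raw cluster IDs should give a three-class raster that can be plotted with just three colors.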
Conclusion
We can see that water is captured very well by the classifier and agriculture is also detected quite well. Reeds (surrounding the lake) and forest (north-west of the lake), however, seem to be mixed, and there is a lot of confusion between those two classes. The unsupervised kMeans classifier is a fast and easy way to detect patterns inside an image and is usually used to make a first rough classification. It is popular due to its good performance and widely used because no sample points are needed for its application (as opposed to a supervised classification). My intention, however, was to detect six different classes, and the algorithm was only able to distinguish roughly three. By increasing the “centers” parameter it might be possible to detect more classes. A second option for getting a better classification would be adding more bands as input (for example the red-edge or short wave infrared Sentinel-2 bands). In an upcoming post, however, I will show you how to perform a supervised classification (with sample points) using R and then we will compare both results.
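If you want to play with the “centers” parameter, one common rule of thumb is to run kmeans for a range of values and look at how the total within-cluster sum of squares drops (the so-called elbow plot). The sketch below uses made-up data with three obvious groups instead of a real image, and the names pixels, ks and wss are my own. With a real scene you would probably use a random sample of the pixel values, and the elbow is usually far less clear.

```r
set.seed(1)  # fixed seed so the toy data and the kmeans starts are repeatable

# Made-up "pixels": three well separated groups in two bands
pixels <- rbind(
  matrix(rnorm(200, mean = 0.1, sd = 0.02), ncol = 2),
  matrix(rnorm(200, mean = 0.4, sd = 0.02), ncol = 2),
  matrix(rnorm(200, mean = 0.8, sd = 0.02), ncol = 2)
)

ks <- 1:8
# Total within-cluster sum of squares for each number of clusters
wss <- sapply(ks, function(k) kmeans(pixels, centers = k, nstart = 10)$tot.withinss)

# The curve should flatten out once the "real" number of groups is reached
plot(ks, wss, type = "b",
     xlab = "Number of clusters (centers)",
     ylab = "Total within-cluster sum of squares")
```

This does not tell you what the clusters mean, it only hints at how many spectrally distinct groups the data supports.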
Cheers
Martin
5 Comments
Thank you for this code. Could you please write the code for unsupervised classification using maximum likelihood? I want to work on land use / land cover and its change detection, and I want to do this using R. You will be the coauthor of this paper.
Muhammad Waqas 5 years ago
Hey! Maximum likelihood classification is, as far as I know, a supervised classification. You can find a lot of material online; for example, here is a function from the RStoolbox package that can perform your task: http://bleutner.github.io/RStoolbox/rstbx-docu/superClass.html
Martin 5 years ago
When I followed this exactly I got the following error, any idea what could be happening?
Error in .plotraster2(x, col = col, maxpixels = maxpixels, add = add, :
no values associated with this RasterLayer
Paula S 4 years ago
Dear Martin
I have three maps in asc format and I would like to apply kmeans clustering to them using RStudio, but I don't know anything about this method in RStudio. Can you help me? How can I write the kmeans commands for these maps in R, and how can I validate the number of clusters?
Best regards
bahar abdollahi 4 years ago