Hey folks!

# Create a grid

The first step is to create a dataframe with a custom grid that covers the whole earth. Every grid point represents one Twitter query that will be executed later. In this case we will create a grid with a cell size of 15°. At the equator this corresponds approximately to a distance of 1500 km:

```
#create a sequence for the latitude and longitude values
latseq = seq(-90, 90, 15)
lonseq = seq(-180, 180, 15)

#transform to a dataframe
latlon.df = data.frame()

for (i in 1:length(lonseq)) {
  for (ii in 1:length(latseq)) {
    latlon.df[ii, i] <- paste0(latseq[ii], ",", lonseq[i])
  }
}
```

This is what the grid looks like:
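If you want a quick sanity check before querying, you can inspect the dimensions of the grid and one of its corners (assuming `latlon.df` from the snippet above is in your workspace):

```r
#with a 15° cell size the grid has 13 rows (latitudes) and 25 columns (longitudes)
dim(latlon.df)

#the south-west corner holds the "lat,lon" pair "-90,-180"
latlon.df[1, 1]
```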

# Automatic query function for searchTwitter()

After creating the grid, we will now write a function that loops over every grid point and downloads the requested Twitter data for each point within a radius of 750 km:

```
downloadTweets = function(term, radius, unit) {
  tweetsTotal.df = data.frame()

  for (i in 1:ncol(latlon.df)) {
    for (ii in 1:nrow(latlon.df)) {

      #console output that shows which coordinate is being queried at the moment
      flush.console()
      print(paste(ii, ",", i))
      print(latlon.df[ii, i])

      #query statement
      tweets = searchTwitter(term,
                             geocode = paste0(latlon.df[ii, i], ",", radius, unit),
                             since = "2015-01-01",
                             retryOnRateLimit = 200)

      #create df out of search query
      tweets.df = do.call("rbind", lapply(tweets, as.data.frame))

      #merge all dataframes
      tweetsTotal.df = rbind(tweetsTotal.df, tweets.df)

      Sys.sleep(5)
    }
  }
  tweetsTotal.df
}
```

Here the variable term is the term you want to search for, radius is the query radius, and unit is the unit of the radius (either "mi" or "km"). Please note that I used the function Sys.sleep(5), which pauses the loop for 5 seconds after every iteration to make sure I am not downloading too much data at once: if you request too much data in a short period of time you can get blocked from the Twitter API. For more information have a look at the Developer Agreement & Policy.
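For reference, the geocode argument of searchTwitter() expects a single string of the form "latitude,longitude,radius", with the unit appended directly to the radius. The string built inside the loop therefore looks like this (a small sketch using one example grid point):

```r
#one grid point as stored in latlon.df, e.g. "45,15"
point = "45,15"
radius = 750
unit = "km"

#geocode string in the "lat,lon,radiuskm" format the Twitter API expects
geocode = paste0(point, ",", radius, unit)
geocode  #"45,15,750km"
```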

Specify your parameters and run the function:

```
#Set specifications
term = 'politics'
radius = 750
unit = "km"

#run function
tweetsTotal.df = downloadTweets(term, radius, unit)
```

The execution of this function might take 10 minutes or more. Just get a cup of coffee and sit back while it does the work for you 🙂
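Because the 750 km query circles overlap on a 15° grid, the same tweet can be returned for several neighbouring grid points. A quick way to check the result and drop those duplicates (a sketch, assuming the id column that the twitteR dataframes contain):

```r
#how many rows did we collect?
nrow(tweetsTotal.df)

#overlapping query circles return some tweets more than once,
#so keep only the first occurrence of each tweet id
tweetsTotal.df = tweetsTotal.df[!duplicated(tweetsTotal.df$id), ]
nrow(tweetsTotal.df)
```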

# Examples

Now I would also like to show you some example queries and visualizations I made that demonstrate how powerful this package is.

## Tweets about Ukraine in the year 2015

Here is a map that shows a subset of tweets that were published this year and contained the word “ukraine”.

The term “Ukraine” was quite popular in Central Europe and on the East Coast of the United States. Surprisingly, however, there was not a single tweet from Ukraine itself and quite few in Russia and Asia in general. Please note, however, that I only queried the English country name “Ukraine” and not the equivalent expressions in other languages. Here is the code I used to create the map:

```
library(ggplot2)
library(maptools)

#create a new dataframe with coordinates only, and do type conversion (necessary for ggplot)
coords = data.frame(longit = tweetsTotal.df$longitude, latit = tweetsTotal.df$latitude)
coords$latit = as.double(as.character(coords$latit))
coords$longit = as.double(as.character(coords$longit))

#get background map
data(wrld_simpl)
world = wrld_simpl
world = fortify(world)

#plot with ggplot2
p1 <- ggplot() +
  ggtitle(expression(atop("Worldwide Tweets",
                          atop(italic("Keyword 'Ukraine' | since 2015"), "")))) +
  geom_map(data = world, map = world, aes(x = long, y = lat, map_id = id)) +
  geom_point(colour = "steelblue3", alpha = 0.30, data = coords,
             mapping = aes(x = longit, y = latit), size = 4)
p1
```

## Solar eclipse

Last weekend there was a partial solar eclipse over Europe. The earth, the moon and the sun aligned in a way that the sun was covered by the moon by approximately 60%. This made me curious about how many tweets were sent this year (2015) containing the phrase “solar eclipse” and at what times they were sent. The result is quite interesting:

At first I was confused because I couldn’t really see a pattern in the dots. I was expecting far more dots in Europe (since the eclipse was only visible there) and not so many in South America. This made me even more curious… I decided to look at the date and time of day the tweets were published using CartoDB, a great web-mapping tool that allows you to make animated web maps:

If you start the animation you can see that there were some single tweets from 13/03/15 onwards in Asia, South America and the USA, while Europe stayed relatively quiet. The closer we get to the solar eclipse itself, which happened on 20/03/15, the more the activity in Europe and in the whole world increases, skyrocketing on 20/03/15, when all of Europe is suddenly covered by an orange-red heat map. I am planning to write a tutorial on CartoDB soon, so pay us a visit later if this made you curious. 🙂
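If you want to try an animated map yourself, CartoDB accepts plain CSV uploads, so the collected tweets can simply be written to disk first (a minimal sketch; the text, created, latitude and longitude columns come from the twitteR dataframes, and created holds the timestamp the animation runs on):

```r
#export the tweets as a CSV file for upload to CartoDB
write.csv(tweetsTotal.df[, c("text", "created", "latitude", "longitude")],
          file = "tweets_solar_eclipse.csv",
          row.names = FALSE)
```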

I hope you enjoyed the data download function and the visualizations. If you have any questions about the provided functions that enable you to download Twitter data for the whole earth, feel free to leave me a comment. Please also take a look at the previous Twitter tutorial, where I explained how to connect to the Twitter API via R.

Cheers

Martin