Some fun with {gganimate}

Your browser does not support the video tag.

In this short blog post I show you how you can use the {gganimate} package to create animationsfrom {ggplot2} graphs with data from UNU-WIDER.

WIID data

Just before Christmas, UNU-WIDER released a new edition of their World Income Inequality Database:

The data is available in Excel and STATA formats, and I thought it was a great opportunity torelease it as an R package. You can install it with:

1
devtools::install_github("b-rodrigues/wiid4")

Here a short description of the data, taken from UNU-WIDER’s website:

“The World Income Inequality Database (WIID) presents information on income inequality fordeveloped, developing, and transition countries. It provides the most comprehensive set of incomeinequality statistics available and can be downloaded for free.

WIID4, released in December 2018, covers 189 countries (including historical entities), with over11,000 data points in total. With the current version, the latest observations now reach the year 2017.”

It was also a good opportunity to play around with the {gganimate} package. This packagemakes it possible to create animations and is an extension to {ggplot2}. Read more about ithere.

Preparing the data

To create a smooth animation, I need to have a cylindrical panel data set; meaning that for eachcountry in the data set, there are no missing years. I also chose to focus on certain variablesonly; net income, all the population of the country (instead of just focusing on the economicallyactive for instance) as well as all the country itself (and not just the rural areas).On this link youcan find a codebook (pdf warning), so you can understand the filters I defined below better.

Let’s first load the packages, data and perform the necessary transformations:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
library(wiid4)
library(tidyverse)
library(ggrepel)
library(gganimate)
library(brotools)

small_wiid4 <- wiid4 %>%
 mutate(eu = as.character(eu)) %>%
 mutate(eu = case_when(eu == "1" ~ "EU member state",
 eu == "0" ~ "Non-EU member state")) %>%
 filter(resource == 1, popcovr == 1, areacovr == 1, scale == 2) %>%
 group_by(country) %>%
 group_by(country, year) %>%
 filter(quality_score == max(quality_score)) %>%
 filter(source == min(source)) %>%
 filter(!is.na(bottom5)) %>%
 group_by(country) %>%
 mutate(flag = ifelse(all(seq(2004, 2016) %in% year), 1, 0)) %>%
 filter(flag == 1, year > 2003) %>%
 mutate(year = lubridate::ymd(paste0(year, "-01-01")))

For some country and some years, there are several sources of data with varying quality. I onlykeep the highest quality sources with:

1
2
 group_by(country, year) %>%
 filter(quality_score == max(quality_score)) %>%

If there are different sources of equal quality, I give priority to the sources that are the mostcomparable across country (Luxembourg Income Study, LIS data) to less comparable sources with(at least that’s my understanding of the source variable):

1
 filter(source == min(source)) %>%

I then remove missing data with:

1
 filter(!is.na(bottom5)) %>%

bottom5 and top5 give the share of income that is controlled by the bottom 5% and top 5%respectively. These are the variables that I want to plot.

Finally I keep the years 2004 to 2016, without any interruption with the following line:

1
2
 mutate(flag = ifelse(all(seq(2004, 2016) %in% year), 1, 0)) %>%
 filter(flag == 1, year > 2003) %>%

ifelse(all(seq(2004, 2016) %in% year), 1, 0)) creates a flag that equals 1 only if the years2004 to 2016 are present in the data without any interruption. Then I only keep the data from 2004on and only where the flag variable equals 1.

In the end, I ended up only with European countries. It would have been interesting to have countriesfrom other continents, but apparently only European countries provide data in an annual basis.

Creating the animation

To create the animation I first started by creating a static ggplot showing what I wanted;a scatter plot of the income by bottom and top 5%. The size of the bubbles should be proportionalto the GDP of the country (another variable provided in the data). Once the plot looked how I wantedI added the lines that are specific to {gganimate}:

1
2
3
 labs(title = 'Year: {frame_time}', x = 'Top 5', y = 'Bottom 5') +
 transition_time(year) +
 ease_aes('linear')

I took this from {gganimate}’s README.

1
2
3
4
5
6
7
8
9
10
animation <- ggplot(small_wiid4) +
 geom_point(aes(y = bottom5, x = top5, colour = eu, size = log(gdp_ppp_pc_usd2011))) +
 xlim(c(10, 20)) +
 geom_label_repel(aes(y = bottom5, x = top5, label = country), hjust = 1, nudge_x = 20) +
 theme(legend.position = "bottom") +
 theme_blog() +
 scale_color_blog() +
 labs(title = 'Year: {frame_time}', x = 'Top 5', y = 'Bottom 5') +
 transition_time(year) +
 ease_aes('linear')

I use geom_label_repel to place the countries’ labels on the right of the plot. If I don’t dothis, the labels of the countries would be floating around and the animation would be unreadable.

I then spent some time trying to render a nice webm instead of a gif. It took some trial and errorand I am still not entirely satisfied with the result, but here is the code to render the animation:

1
2
3
4
5
animate(animation, renderer = ffmpeg_renderer(options = list(s = "864x480", 
 vcodec = "libvpx-vp9",
 crf = "15",
 b = "1600k", 
 vf = "setpts=5*PTS")))

The option vf = "setpts=5*PTS" is important because it slows the video down, so we can actuallysee something. crf = "15" is the quality of the video (lower is better), b = "1600k" is thebitrate, and vcodec = "libvpx-vp9" is the codec I use. The video you saw at the top of thispost is the result. You can also find the video here,and here’s a gif if all else fails:

I would have preferred if the video was smoother, which should be possible by creating more frames.I did not find such an option in {gganimate}, and perhaps there is none, at least for now.

In any case {gganimate} is pretty nice to play with, and I’ll definitely use it more!

Update

Silly me! It turns out thate the animate() function has arguments that can control the number of framesand the duration, without needing to pass options to the renderer. I was looking at options for therenderer only, without having read the documentation of the animate() function. It turns out thatyou can pass several arguments to the animate() function; for example, here is how youcan make a GIF that lasts for 20 seconds running and 20 frames per second, pausing for 5frames at the end and then restarting:

1
animate(animation, nframes = 400, duration = 20, fps = 20, end_pause = 5, rewind = TRUE)

I guess that you should only pass options to the renderer if you really need fine-grained control.

This took around 2 minutes to finish. You can use the same options with the ffmpeg renderer too.Here is what the gif looks like:

Much, much smoother!

Hope you enjoyed! If you found this blog post useful, you might want to followme on twitter for blog post updates andbuy me an espresso or paypal.me.