Voronoi diagram with ggvoronoi package with Train Station data

I’ve always been curious to make Voronoi diagram, I just think they are beautiful! When I came across data set with train stations in Japan. I instantly thought this would be great data sets to make Voronoi diagram! I’ve gotten data sets from (Ekidata)[http://www.ekidata.jp/] site. I’m amazed how many train stations we have in Japan, as well as coverage of train systems in Japan.

There are couple of packages I could’ve used to make Voronoi diagram, but I’ve utilized package ggvoronoi. I really like using “outline� inside of geom_voronoi function to mask out the shape! (Which I wasn’t sure how to do before using deldir package).

Voronoi Diagram with Train Station as a seed.

ggvoronoi makes it easy to plot voronoi diagram! All I really needed to produce voronoi diagram was longitude & latitude.

Initially I’ve plotted all the train station as a point (using geom_point), you can see that station will reveal shape of Japan, as JR (Japan Railway) really covers coast line of Japan. There are total of 10828 points, as there were 10828 stations listed in most recent data set downloaded today.

I also used treemap package to create treemap.I’ve colour coded rectangle inside of treemap with company types. 47% of 10K+ stations are JR Japan Railway stations in Japan.

Tokyo (area: 2,188 sq.km) has 943 stations all together, followed by Hokkaido 650 stations, but Hokkaido is the biggest prefecture in terms of area (83,456.87 sq.km) . It would be interesting to get area data for each prefecture, so we can calculate stations per area.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
jp <- ggplot2::map_data('world2', region='japan')
names(jp) <- c("lon","lat", "group","order","region","subregion")
## for train, I'm going to tidy up the map bit. (I've excluded Okinawa for now)
jp_outline <- jp %>% filter(subregion %in% c("Honshu","Hokkaido","Kyushu","Shikoku"))

## I also wanted prefecture level data, so I've used map data from mapdata package.
jp_outline_detailed <- map_data("japan")


## station_master lists all stations of all lines
plotPoints <-station_master %>%
 ggplot(aes(x=lon, y=lat)) +
 theme_void(base_family="Roboto Condensed") +
 geom_polygon(data=jp_outline, aes(group=group), fill="#ffffff", color="#33333380") +
 geom_point(aes(color=pref_cd),size=0.1, alpha=0.8) +
 scale_color_viridis_c(end=0.5, guide="none") +
 labs(title="Each Train Station as a point") +
 coord_quickmap()

## station_master2 is reduced version of station_master
plotVoronoi <-station_master2 %>%
 ggplot(aes(x=lon, y=lat)) +
 theme_void(base_family="Roboto Condensed") +
 geom_polygon(data=jp_outline, aes(group=group), fill="#ffffff00", color="#33333380") +
 geom_path(stat="voronoi", size=0.1, aes(color=pref_cd)) +
 coord_quickmap() +
 scale_color_viridis_c(end=0.5, guide="none") +
 labs(title="Voronoi Diagram with station as a seed")

## use patchwork package to plot 2 plots side by side
plotPoints + plotVoronoi

1
2
3
4
5
6
7
8
## All of Japan - Takes long time to draw on my machine.
station_master2 %>% 
 ggplot(aes(x=lon, y=lat)) +
 theme_void(base_family="Hiragino Sans W5") +
 geom_voronoi(aes(fill=station_cnt),size=0.05, color="#ffffff", 
 outline=jp_outline) + ## this outline feature is awesome!
 coord_quickmap() +
 scale_fill_viridis_c(end=0.8, option="magma", guide="none") 

Treemap with treemap package

With treemap, I can easily see which prefecture has more stations. Also I wanted to see which railway company are dominant in each prefecture.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
## Treemap to see which prefecture has more stations.
station_master %>% 
 count(pref_name,company_type_descr,company_name_r) %>%
 add_count(pref_name,wt=n) %>%
 mutate(pref_descr = paste(pref_name,":",nn,"駅")) %>%
 treemap(index=c("pref_descr","company_type_descr","company_name_r"),
 vSize="n", vColor="company_type_descr", type="categorical",
 fontfamily.labels="Hiragino Sans W3",
 align.labels=list(c("left","top"),c("center","center"),c("right","bottom")),
 fontsize.labels=c(13,0,11),
 palette=viridis_pal(end=0.6)(4),
 border.col="white",
 bg.labels=0,
 position.legend="bottom",
 title.legend="", 
 title="Number of Stations by Prefecture\ncoloured by operating company types")

Writing Function to Plot Prefecture Level Voronoi

There are 47 prefectures in Japan. So I’ve decided to write function to draw voronoi as below. I think below can be simplified…, but for now it does the job…

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
## function to draw voronoi map at prefecture level
draw_pref <- function(pref_no=1,zoom=T,save_file=F,folder_name="prefecture",...){
 region <- prefs %>% filter(pref_cd==pref_no) %>% pull(pref_name_en)
 region_jp <- prefs %>% filter(pref_cd==pref_no) %>% pull(pref_name) 
 pref_summary <- station_master %>% 
 filter(pref_cd==pref_no) %>% 
 summarise(station_cnt=n(),
 line_cnt =n_distinct(line_name), 
 company_count=n_distinct(company_name_r))
 
 tmp_df <- station_master2 %>% filter(pref_cd==pref_no)
 pref_outline <- map_data("japan", region=region)
 capital <-jpnprefs %>% mutate(pref_cd=row_number()) %>% filter(pref_cd==pref_no)
 
 ## calculate distance between capital city & each station so i can colour the cell of voronoi.
 tmp_df <- tmp_df %>%
 mutate(dist_from_capital = 
 sqrt((lon-capital$capital_longitude)^2 + (lat-capital$capital_latitude)^2))
 
 # finding bounding box from train station data... , so I can crop the map if I want to.
 bbox <-tmp_df %>% ungroup() %>% 
 summarise(xmax=max(lon), xmin=min(lon), ymax=max(lat), ymin=min(lat))
 
 
 base_map <-tmp_df %>% ggplot(aes(x=lon,y=lat)) +
 theme_void(base_family="Hiragino Sans W5") +
 #geom_voronoi(aes(fill=comp_cd_min) ,size=0.1, color="#ffffff", 
 # outline=pref_outline) +
 geom_voronoi(aes(fill=dist_from_capital) ,size=0.1, color="#ffffff", 
 outline=pref_outline) +
 #scale_fill_gradientn(colors = c("#440154FF","#000000FF","#31688EFF", "#1F9E89FF","#6DCD59FF"),
 #breaks=c(0,7,30,100,200), limits=c(1,250), guide="none") +
 scale_fill_viridis_c(end=0.9, guide="none", option="magma") +
 labs(title=paste0(region_jp," (",region,")"), 
 caption=paste0("Capital City of ",region," is ",capital$capital, " @ (",
 round(capital$capital_longitude,2),",", round(capital$capital_latitude,2),")"),
 subtitle=paste(pref_summary$station_cnt,"stations", 
 pref_summary$line_cnt," lines operated by", 
 pref_summary$company_count, "companies in", 
 str_to_title(region))) +
 geom_point(data=capital, aes(x=capital_longitude, y=capital_latitude),shape=4, color="#ffffff") 
 
 if (zoom) {
 print(base_map + 
 coord_quickmap(xlim=c(bbox$xmin-0.1,bbox$xmax+0.1), 
 ylim=c(bbox$ymin-0.1,bbox$ymax+0.1)))
 } else {
 print(base_map + coord_quickmap())
 }
 
 if(save_file){
 ggsave(paste0(folder_name,"/",
 formatC(pref_no, width=2,flag="0"),"-",str_to_lower(region),".png"),
 width=9,height=9,dpi=300)
 }
 
}


## function to draw treemap at prefecture level
draw_treemap <- function(pref_no=1,...){
 station_master$color <- 
 viridis_pal(end=0.6)(nlevels(station_master$company_type_descr))[station_master$company_type_descr]
 title_text <- prefs %>% filter(pref_cd==pref_no) %>% pull(pref_name_en)
 
 station_master %>% 
 filter(pref_cd==pref_no) %>%
 count(company_type_descr,company_name_r,line_name,color,station_name) %>%
 treemap(index=c("company_type_descr","company_name_r","line_name","station_name"),
 vSize="n", vColor="color", type="color",
 fontfamily.labels="Hiragino Sans W3",
 fontfamily.title="Roboto Condensed",
 align.labels=list(c("center","center"),c("left","top"),
 c("right","bottom"),c("center","center")),
 fontsize.labels=c(0,13,11,0),
 border.col=c("#ffffffff","#ffffff90","#ffffff30","#ffffff10"),
 border.lwds = c(3,2,1,0.2),
 bg.labels=0,
 title.legend="", title="",
 aspRatio = 16/9)
}

Tokyo!

While it’s interesting to see Voronoi map of Japan, I wanted to zoom into selected prefectures that I care maybe more about.

Firstly, Tokyo. I love looking at Tokyo’s train map such as this one. JR East Route Map PDF.

For below voronoi diagram, I’ve decided to colour the voronoi cell with distance from Shinjuku (capital city of Tokyo) to corresponding station cell. (I actually think it’s more interesting to get train usage data, and colour the cell with train usage data, but because there are so many different operating company, getting data about train usage seemed like pretty hard task to do…)

I like how dense train staions are packed around central tokyo (east side), but as you go towards the west, cell becomes bigger and bigger. In fact, far west side of Tokyo, there are NOT that many stations at all.

I’ve also created treemap for Tokyo below. Personally I was surprised that there are maybe more Tokyo metro stations than JR stations in Tokyo. I’ve also came to realize that there are so many companies…

1
draw_pref(13, zoom=T, save_file=F)

1
draw_treemap(13)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
station_master %>% 
 filter(pref_cd==13) %>%
 arrange(e_sort) %>%
 ggplot(aes(x=lon, y=lat)) +
 geom_sf(data=jpn_pref(13), inherit.aes=F, color="#33333320") +
 geom_point(aes(color=company_name_r, shape=company_type_descr), alpha=0.8) +
 theme(axis.text=element_blank(),
 axis.title=element_blank()) +
 theme_minimal(base_family="Hiragino Sans W5") +
 geom_text_repel(data=station_master2 %>% filter(pref_cd==13 & station_cnt>6), 
 aes(label=station_name),
 family="Osaka", min.segment.length=0, nudge_x=0.25, segment.color="#33333350") +
 scale_color_hue(l=45) +
 coord_sf(ylim=c(35.5,35.9), xlim=c(138.8,139.9)) ## to remove islands of Tokyo

Related