Halloween is coming..! Halloween is just around the corner, and I am still trying to decide which candies to buy this year for trick-or-treaters.
Initially I was looking for data sets comparing American and Canadian chocolate bars, possibly with sugar content or ingredient lists. I am really curious why candy differs so much between Canada and the US, but I couldn’t find such data for now. Instead, I came across the data set below.
The data itself is actually a bit outdated, since it covers CandyStore.com’s sales from 2007–2015, focusing on the three months leading up to Halloween. I’m not even sure whether the popular candies change every year or not. Now I’m curious to find data sets that are more recent…
Since the data set contains state-level data for the US, I decided to practice plotting the data on a grid map, which I’ve been wanting to try. To colour the map, instead of using the volume of candy purchased (in pounds), I gathered population by state so I could work out the ratio of candy per person.
Looks like Hawaiians eat lots of Hershey Kisses (or maybe there are just more sweet tooths in Hawaii in general!). Utah, Nevada and Arizona also seem to either consume more sweets, or possibly their top-ranked chocolates are purchased in volume!
It would’ve been fun to gather all the candy images and place them in the grid too…!
Below are the steps I took to visualize the data.
Importing Data & Tidying Up
I imported the data from the OpenDataSoft site into a data frame using the fromJSON function from the jsonlite package. There seem to be lots of other interesting data sets there!
I also harvested population by state from Wikipedia using the rvest package.
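To give an idea of the candy-per-person step, here is a minimal base R sketch. It assumes the candy table and the population table have already been imported; the column names (`top_pounds`, `population`) and every number are placeholders I made up for illustration, not values from the real data.

```r
# Toy stand-ins for the two imported tables. Column names and all numbers
# below are placeholders for illustration, not values from the real data.
candy <- data.frame(
  state      = c("Hawaii", "Utah", "Delaware"),
  top_pounds = c(300000, 500000, 20000),
  stringsAsFactors = FALSE
)
population <- data.frame(
  state      = c("Delaware", "Hawaii", "Utah"),
  population = c(1000000, 1500000, 3000000),
  stringsAsFactors = FALSE
)

# Left join on state name; all.x = TRUE keeps states with no population match
candy_pc <- merge(candy, population, by = "state", all.x = TRUE)

# Scraped tables often carry footnote marks or stray spaces in names,
# so fail early if any state did not find a population row
stopifnot(!anyNA(candy_pc$population))

# Pounds of candy per person
candy_pc$pounds_per_person <- candy_pc$top_pounds / candy_pc$population
print(candy_pc)
```

Checking for missing matches right after the join is worth the extra line, because a state name that differs slightly between the two sources would otherwise just show up as a blank tile on the map.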
Transforming Data from Wide to Long format
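A rough sketch of what the wide-to-long step can look like, in base R so it runs without extra packages (tidyr's pivot_longer would be the usual tidyverse route). The wide column names here (`top_candy`, `top_pounds`, and so on) are hypothetical, and the pounds and the 2nd and 3rd place candies are made up; only the two #1 candies come from the data.

```r
# Toy wide table: one row per state, one pair of columns per rank.
# Column names are hypothetical; pounds and 2nd/3rd places are made up.
wide <- data.frame(
  state         = c("Delaware", "Georgia"),
  top_candy     = c("Life Savers", "Swedish Fish"),
  top_pounds    = c(300, 500),
  second_candy  = c("Skittles", "M&M's"),
  second_pounds = c(200, 400),
  third_candy   = c("M&M's", "Skittles"),
  third_pounds  = c(100, 300),
  stringsAsFactors = FALSE
)

# Build one block of rows per rank, then stack the blocks
places <- c("top", "second", "third")
long <- do.call(rbind, lapply(seq_along(places), function(i) {
  data.frame(
    state  = wide$state,
    rank   = i,
    candy  = wide[[paste0(places[i], "_candy")]],
    pounds = wide[[paste0(places[i], "_pounds")]],
    stringsAsFactors = FALSE
  )
}))

# Sort so each state's three rows sit together, best rank first
long <- long[order(long$state, long$rank), ]
rownames(long) <- NULL
print(long)
```

The long table has one row per state and rank, which makes counting and colouring by rank much easier than working with three separate pairs of columns.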
Overall Popular Candies in United States
If a candy was ranked #1 in a state, I gave it a gold colour; if #2, silver; if #3, bronze. Sort of like the Olympics, I counted how many medals each candy got to decide the most popular candy in the US.
There were a total of 27 different candies in this data set, and the most popular candy is M&M’s, followed by Skittles! I thought it was also interesting that Life Savers was the top-ranked candy in Delaware but did not appear in any other state, and similarly for Swedish Fish in Georgia.
There’s Assorted Salt Water Taffy and Salt Water Taffy… I wasn’t sure if they are the same candy…!
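The medal counting can be sketched roughly like this in base R, assuming a long table with one row per state and rank. The three gold candies match what I saw in the data, but the silver and bronze entries below are made up to keep the example small, and the tie-break rule (total, then gold, then silver) is just one reasonable choice.

```r
# Toy long table: rank 1 = gold, 2 = silver, 3 = bronze.
# The 2nd and 3rd place entries are made up for illustration.
long <- data.frame(
  state = rep(c("Delaware", "Georgia", "Hawaii"), each = 3),
  rank  = rep(1:3, times = 3),
  candy = c("Life Savers",    "M&M's",    "Skittles",
            "Swedish Fish",   "Skittles", "M&M's",
            "Hershey Kisses", "M&M's",    "Skittles"),
  stringsAsFactors = FALSE
)

# Map each rank to a medal, keeping the medal order fixed
medal_levels <- c("gold", "silver", "bronze")
long$medal <- factor(medal_levels[long$rank], levels = medal_levels)

# Candy-by-medal count table
medals <- table(long$candy, long$medal)

# Rank candies by total medals, breaking ties by gold and then silver
totals <- rowSums(medals)
medals <- medals[order(-totals, -medals[, "gold"], -medals[, "silver"]), ]
print(medals)
```

Counting medals instead of summing pounds keeps big states from dominating the ranking, since every state hands out exactly one gold, one silver and one bronze.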
Grid State Map
I recently discovered the geofacet package. The example graphs you can create with this package look super fun! I thought it was really neat that I can create my own grid using “Geo Grid Designer”!!
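Here is a minimal sketch of how a custom grid can be fed to geofacet. A grid is just a data frame with row, col, code and name columns; the six rows below are the first rows of my grid. The `pounds_per_person` values are made up, and the plotting part only runs if geofacet and ggplot2 are installed, so treat it as an outline rather than the exact code behind my map.

```r
# A custom grid is a plain data frame with row, col, code and name columns
my_grid <- data.frame(
  row  = c(6, 7, 5, 5, 4, 4),
  col  = c(7, 2, 2, 5, 1, 3),
  code = c("AL", "AK", "AZ", "AR", "CA", "CO"),
  name = c("Alabama", "Alaska", "Arizona", "Arkansas",
           "California", "Colorado"),
  stringsAsFactors = FALSE
)

# Two states must never share the same grid cell
stopifnot(!any(duplicated(my_grid[, c("row", "col")])))

# Made-up per-person values, one per state, just to have something to colour
plot_data <- data.frame(
  code              = my_grid$code,
  pounds_per_person = c(0.05, 0.08, 0.12, 0.04, 0.06, 0.07),
  stringsAsFactors  = FALSE
)

# Draw one filled tile per state, laid out according to the custom grid
if (requireNamespace("geofacet", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plot_data, aes(x = 1, y = 1, fill = pounds_per_person)) +
    geom_tile() +
    geofacet::facet_geo(~ code, grid = my_grid) +
    theme_void()
  print(p)
}
```

The faceting variable has to match the grid's code (or name) column, so it helps to settle on state abbreviations or full names before joining everything together.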
row | col | code | name |
6 | 7 | AL | Alabama |
7 | 2 | AK | Alaska |
5 | 2 | AZ | Arizona |
5 | 5 | AR | Arkansas |
4 | 1 | CA | California |
4 | 3 | CO | Colorado |