“Anyone can look for fashion in a boutique or history in a museum. The creative explorer looks for history in a hardware store and fashion in an airport” — Robert Wieder
“Anyone can look for fashion in a boutique or history in a museum. The creative explorer looks for history in a hardware store and fashion in an airport” — Robert Wieder
As much as I like Medium for it’s ability to reach an audience I prefer the customization afforded by github.io. If you want to see a far-fancier version of this post pleasevisit this page instead.
Project Inspiration
You may, or may not, have heard of the term gamification but chances are you’ve experienced it.
Gamification is the application of game-design elements, and game principles, in non-game contexts. The idea is, if you use elements of games, like linking rules and rewards into a feedback system, you can make (almost) any activity motivating and fun.
Gamification is the concept behind eLearning. In elementary school I remember all the students wanted to play The Oregon Trail in computer class. I also remember another game where you had to solve math problems before something hit the floor. Okay, maybe it wasn’t the most thrilling introduction to gamification but I remember it nonetheless.
At some point in my career, I got tired of using nano and decided to I wanted to try to learn Vim.
It was then that I discovered two very enjoyable examples of gamification:
- shortcutFoo teaches you shortcuts for Vim, Emacs, Command Line, Regex etc. via interval training, which is essentially spaced repetition. This helps you memorize shortcuts more efficiently.
Today, I enjoy eLearning-gamification on platforms like DuoLingo, and DataCamp.
I’ve also recently started to participate in a Kaggle competition, “PUBG Finish Placement Prediction”. Kaggle is a Google owned hangout for data science enthusiasts where they can use machine learning to solve predictive analytics problems for cash and clout. Similar to chess there are so-called Kaggle Grandmasters.
The Quest
Our laboratory studies perinatal influences on the biological embedding of early adversity of mental health outcomes. We combine genetic, epigenetic and epidemiological approaches to identify pregnant women who’s offspring may potentially be at risk for adverse mental health outcomes.
My supervisor approached me with a challenge; how feasible would it be to access biometric data from 200 Fitbits?
So I bought myself a Fitbit Charge2 fitness tracker and hit the gym!
At some point I think we both realized that this project was going to be a big undertaking. Perhaps R isn’t really intended to do large scale real-time data management from API services. It’s great for static files, or static endpoints, but if you’re working with multiple participants a dedicated solution like Fitabase may work the best – or so they claim.
Nonetheless, I wanted to try out a bunch of cool new things in R like making a personal website using blogdown, using gganimate with Rokemon, accessing the fitbit API with httr as well as adding a background image with some custom CSS/HTML. Is there possibly a better way to possibly gamify my leaRning curve – I think not.
The following was my attempt at e-learninggamification forR.
I used the blogdown package to allow me to write blog posts as R Markdown documents, knitting everything to a nice neat static website that I can push online. It was a nice opportunity to learn a bit about pandoc, Hugo, CSS/HTML lurking beneath the server side code. I decided to go with the Academic theme for Hugo, pull in as much data as I could from the Fitbit API, clean it up, and then perform some exploratory data analysis. In the process, I generated some cool animated sprites and use video game inspired visualizations.
Setting up a Fitbit Developer Account
Fitbit uses OAuth 2.0Access Token for making HTTP request to the Fitbit API. You need to set up an account to use the API and include your token in R. Instead of reading the FITBIT DEV HELP section I would direct the reader to better more-concise instructions here.
Now that you have an account we’re ready to do stuff in R.
Set your token up:
Using the fitbitr package
I had never made an HTTP request before and although the process is officially documented here it can be a tad overwhelming. Therefore, I initially resorted to using an R package built to access the R API, under-the-hood, called fitbitr.
Unfortunately this would limit me to only accessing some basic user information, heart rate and step count data.
Getting Basic User Info
The first function in this package sends a GET request to the Get Profile resource URL.
My stride length is 68.5.
Stride length is measured from heel to heel and determines how far you walk with each step. On average, a man’s walking stride length is 2.5 feet, or 30 inches, while a woman’s average stride length is 2.2 feet, or 26.4 inches, according to this report.
My running stride length is 105.5.
The Fitbit uses your **sex and **height by default to gauge your stride length which could potentially be inaccurate.
My average daily steps is 14214.
Considering that the daily recommended steps is 10,000 I’d say that’s acceptable. That being said, there’s always room for improvement.
Accessing heart rate and footsteps with the fitbitr pacakge
I’m going to grab a week’s worth of data for a very preliminary EDA.
week$date <- as.Date(week$date)
Summary Statistics
Exploratory Data Analysis
Since this is a post about gamification I decided to do something fun with my exploratory data visualizations. I wanted to use the Rokemon package which allows me to set the theme of ggplot2 (and ggplot2 extensions) to Game Boy and Game Boy Advance themes! When convenient, I’ve combined plots with cowplot.
Let’s take a quick look at the relationship and distribution of heart rate and step count.
Alternatively, we could get a better look at the data by adding marginal density plots to the scatterplots with the ggMarginal() function from the ggExtra package.
Weekly trends
Let’s take a quick look at the distribution of the contiguous variables to get a better idea than the mean and median.
Heart rate is a little right-skewed, probably due to sleep and sedentary work. Similarly, for step count you see that only a small bump under brisk walking pace from when I skateboarded to work.
Exercise patterns
This week I didn’t work out so I thought I’d at least look at when I was on my way to work. The figure below shows blue for heart rate/min and orange is the number of steps/min.
My activity has been pretty much the same all week since I skateboard to work every morning.
Most active time
The distribution of steps per-minute was pretty constant because as I said I didn’t work-out; this likely reflects me shuffling to get tea.
It looks like Monday was the day I got my heart rate up the most, the bimodal peak is probably when I was running around looking for a rental property.
Accessing the Fitbit API directly with the httr package
Eventually, I found an excellent tutorial by obrl-soil which introduced me to the httr package and gave me the confidence I needed to peruse the Fitbit DEV web API reference. Now I was able to gain access to far more sources of data.
What data is available?
A brief overview of what data is available from the Fitbit API:
What are the units of measurement?
Define a function for turning a json list into a dataframe.
Investigating my 10Km run
GET request to retrieve minute-by-minute heart rate data for my 10km run.
Let’s take a look at the summary for my 10Km run:
obrl-soil used the MyZone Efforts Points (MEPS) which is calculated minute-by-minute as a percentage of max heart rate. It measures the effort put in – The more points the better. Another example of gamification in action.
Mine is 186.
Now is we create a tribble with 4 heart ranges showing the lower and higher bounds and use the mutate() function from above to calculate what my max heart rate is (with lower and upper bounds).
With the equation now defined let’s calculate my total MEPS:
Wow it’s 216!
I’m not sure what that exactly means but apparently the maximum possible MEPS in a 42-minute workout is 168 and since I ran this 10Km in 54:35 I guess that’s good?
I’d like to post sub 50 minutes on my next 10Km run but I’m not sure if I should be aiming to shoot for a greater percentage of peak heart rate minutes or not – guess I will need to look into this.
Minute-by-minute sleep data for one night
Let’s examine my sleep patterns last night.
Okay now that the data munging is complete, let’s look at my sleep pattern.
A pie chart is probably not the best way to show this data. Let’s visualize the distribution with a box plot.
An even better way to visualize the distribution would be to use a violin plot with the raw data points overlaid.
Get Daily Activity Patterns For 3 Months
You can do API requests for various periods from the Fitbit Activity and Exercise Logs but since I’ve only had mine a couple months I’ll use the 3m period.
I will also need to trim off any day’s which are in the future otherwise they’ll appear as 0 calories in the figures. It’s best to use the Sys.Date() function rather than hardcoding the date when doing EDA, making a Shiny app, or parameterizing a RMarkdown file. This way you can explore different time periods without anything breaking.
I cannot remember when I started wearing my Fitbit but we can figure that out with the following code:
I’ve had my Fitbit since 2018–08–20.
Let’s gather data from September 20th until November 6th 2018.
got_calories <- get_calories(baseDate = “2018-11-20”, period = “3m”)calories <- content(got_calories)# turn into dfcalories[[‘activities-calories’]] <- jsonlist_to_df(calories[[‘activities-calories’]])# assign easy object and renamecalories <- calories[[‘activities-calories’]]colnames(calories) <- c(“dateTime”, “calories”)
got_steps <- get_steps(baseDate = “2018-11-20”, period = “3m”)steps <- content(got_steps)# turn into dfsteps[[‘activities-steps’]] <- jsonlist_to_df(steps[[‘activities-steps’]])# assign easy object and renamesteps <- steps[[‘activities-steps’]]colnames(steps) <- c(“dateTime”, “steps”)
got_distance <- get_distance(baseDate = “2018-11-20”, period = “3m”)distance <- content(got_distance)# turn into dfdistance[[‘activities-distance’]] <- jsonlist_to_df(distance[[‘activities-distance’]])# assign easy object and renamedistance <- distance[[‘activities-distance’]]colnames(distance) <- c(“dateTime”, “distance”)
got_floors <- get_floors(baseDate = “2018-11-20”, period = “3m”)floors <- content(got_floors)# turn into dffloors[[‘activities-floors’]] <- jsonlist_to_df(floors[[‘activities-floors’]])# assign easy object and renamefloors <- floors[[‘activities-floors’]]colnames(floors) <- c(“dateTime”, “floors”)
got_elevation <- get_elevation(baseDate = “2018-11-20”, period = “3m”)elevation <- content(got_elevation)# turn into dfelevation[[‘activities-elevation’]] <- jsonlist_to_df(elevation[[‘activities-elevation’]])# assign easy object and renameelevation <- elevation[[‘activities-elevation’]]colnames(elevation) <- c(“dateTime”, “elevation”)
got_minutesSedentary <- get_minutesSedentary(baseDate = “2018-11-20”, period = “3m”)minutesSedentary <- content(got_minutesSedentary)# turn into dfminutesSedentary[[‘activities-minutesSedentary’]] <- jsonlist_to_df(minutesSedentary[[‘activities-minutesSedentary’]])# assign easy object and renameminutesSedentary <- minutesSedentary[[‘activities-minutesSedentary’]]colnames(minutesSedentary) <- c(“dateTime”, “minutesSedentary”)
got_minutesLightlyActive <- get_minutesLightlyActive(baseDate = “2018-11-20”, period = “3m”)minutesLightlyActive <- content(got_minutesLightlyActive)# turn into dfminutesLightlyActive[[‘activities-minutesLightlyActive’]] <- jsonlist_to_df(minutesLightlyActive[[‘activities-minutesLightlyActive’]])# assign easy object and renameminutesLightlyActive <- minutesLightlyActive[[‘activities-minutesLightlyActive’]]colnames(minutesLightlyActive) <- c(“dateTime”, “minutesLightlyActive”)
got_minutesFairlyActive <- get_minutesFairlyActive(baseDate = “2018-11-20”, period = “3m”)minutesFairlyActive <- content(got_minutesFairlyActive)# turn into dfminutesFairlyActive[[‘activities-minutesFairlyActive’]] <- jsonlist_to_df(minutesFairlyActive[[‘activities-minutesFairlyActive’]])# assign easy object and renameminutesFairlyActive <- minutesFairlyActive[[‘activities-minutesFairlyActive’]]colnames(minutesFairlyActive) <- c(“dateTime”, “minutesFairlyActive”)
got_minutesVeryActive <- get_minutesVeryActive(baseDate = “2018-11-20”, period = “3m”)minutesVeryActive <- content(got_minutesVeryActive)# turn into dfminutesVeryActive[[‘activities-minutesVeryActive’]] <- jsonlist_to_df(minutesVeryActive[[‘activities-minutesVeryActive’]])# assign easy object and renameminutesVeryActive <- minutesVeryActive[[‘activities-minutesVeryActive’]]colnames(minutesVeryActive) <- c(“dateTime”, “minutesVeryActive”)
got_activityCalories <- get_activityCalories(baseDate = “2018-11-20”, period = “3m”)activityCalories <- content(got_activityCalories)# turn into dfactivityCalories[[‘activities-activityCalories’]] <- jsonlist_to_df(activityCalories[[‘activities-activityCalories’]])# assign easy object and renameactivityCalories <- activityCalories[[‘activities-activityCalories’]]colnames(activityCalories) <- c(“dateTime”, “activityCalories”)
Get Recent Activity Types
get_frequentActivities <- function(baseDate = NULL, period = NULL, token = Sys.getenv(‘FITB_AUTH’)){GET(url = paste0(‘https://api.fitbit.com/1/user/-/activities/recent.json’),add_headers(Authorization = paste0(“Bearer “, token)))}
got_frequentActivities <- get_frequentActivities(baseDate = “2018-11-20”, period = “3m”)frequentActivities <- content(got_frequentActivities)
I would never have considered myself a Darwin or a Thoreau but apparently strolling very slowly is my favorite activity in terms of time spent.
You can see that my Fitbit has also logged times for Weights, Sports and Biking which is likely from when I’ve manually logged my activities. There’s a possibility that Fitbit is registering Biking for when I skateboard.
Correlogram of Activity
Previously I had always used the corrplot package to create a correlation plot; however, it doesn’t play nicely with ggplot meaning you cannot add Game Boy themes easily. Nonetheless, I was able to give it a retro-looking palette with some minor tweaking.
Since I had two colors in mind from the original gameboy, and knew their hex code, I was able to generate a palette from this website.
In a correlation plot the color of each circle indicates the magnitude of the correlation, and the size of the circle indicates its significance.
After a bit of searching for a ggplot2 extension I was able to use ggcorrplot which allowed me to use gameboy themes again!
ggcorrplot(corr, hc.order = TRUE, type = “lower”, lab = TRUE, lab_size = 2,tl.cex = 8,show.legend = FALSE,colors = c( “#3B7AAD”, “#56B1F7”, “#1D3E5D” ), title=”Correlogram”,ggtheme=theme_gba)
Exploring Activity
Staticg <- activity_df %>% ggplot(aes(x=dateTime, y=calories)) + geom_line(colour = “black”) +geom_point(shape = 21, colour = “black”, aes(fill = calories), size = 5, stroke = 1) +xlab(“”) +ylab(“Calorie Expenditure”)
g + theme_gameboy() + theme(legend.position = “none”)g + theme_gba() + theme(legend.position = “none”)
gganimateg <- activity_df %>% ggplot(aes(x=dateTime, y=calories)) + geom_line(colour = “black”) +geom_point(shape = 21, colour = “black”, aes(fill = calories), size = 5, stroke = 1) +transition_time(dateTime) +shadow_mark() +ease_aes(‘linear’) +xlab(“”) +ylab(“Calorie Expenditure”)
g + theme_gba() + theme(legend.position = “none”)
Distance is determined by using your steps and your estimated stride length (for the height you put in).
I’ve also made plots for Distance, Steps, Elevationand Floorsbut you’ll have to check out this page to see them.
Closing thoughts
Even though Fitbit offers a nice dashboard for a single user it’s not scale-able. By accessing the data directly one can ask the questions they want from 200 individuals — or more. If one was inclined, they could even build a fancy Shiny dashboard with bespoke visualizations.
If you have any questions or comments you can always reach me on LinkedIn. Till then, see you in the next post!
https://www.spriters-resource.com/game_boy_advance/kirbynim/sheet/15585/sprite_sheet <- png::readPNG(“kirby.png”)
Nframes <- 11 # number of frames to extractwidth <- 29 # width of a framesprite_frames <- list() # storage for the extracted frames
Not equal sized frames in the sprite sheet. Need to compensate for each frameoffset <- c(0, -4, -6, -7, -10, -16, -22, -26, -28, -29, -30)
Manually extract each framefor (i in seq(Nframes)) {sprite_frames[[i]] <- sprite_sheet[120:148, (width*(i-1)) + (1:width) + offset[i], 1:3]}
Function to convert a sprite frame to a data.frame# and remove any background pixels i.e. #00DBFFsprite_frame_to_df <- function(frame) {plot_df <- data_frame(fill = as.vector(as.raster(frame)),x = rep(1:width, width),y = rep(width:1, each=width)) %>%filter(fill != ‘#00DBFF’)}
sprite_dfs <- sprite_frames %>%map(sprite_frame_to_df) %>%imap(~mutate(.x, idx=.y))
fill_manual_values <- unique(sprite_dfs[[1]]$fill)fill_manual_values <- setNames(fill_manual_values, fill_manual_values)
mega_df <- dplyr::bind_rows(sprite_dfs)
p <- ggplot(mega_df, aes(x, y, fill=fill)) +geom_tile(width=0.9, height=0.9) +coord_equal(xlim=c(1, width), ylim=c(1, width)) +scale_fill_manual(values = fill_manual_values) +theme_gba() +xlab(“”) +ylab(“”) +theme(legend.position = ‘none’, axis.text=element_blank(), axis.ticks = element_blank())
panim <- p +transition_manual(idx, seq_along(sprite_frames)) +labs(title = “gganimate Kirby”)
gganimate::animate(panim, fps=30, width=400, height=400)
The Gamification Of Fitbit: How an API Provided the Next Level of tRaining was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
Related