Students Combat MS with Data Science

Machine learning can effectively be leveraged to support NGOs, but it often requires sponsorship and organizational support to generate the resources needed for such data explorations. The fourth annual Teradata University Network (TUN) Data Challenge Competition provided this support, and focused on combating Multiple Sclerosis (MS).

MS is a disease of the central nervous system that affects over 2 million people worldwide and causes diverse symptoms, ranging from numbness and memory loss to blindness and paralysis. TUN teamed up with the National MS Society (NMSS) to help improve its fundraising efficiency so that it can support more MS research. In this challenge, student teams were presented with identical copies of a dataset about NMSS’s Bike MS program, which organizes sponsored bike races.

A student team from ESCP Europe won the People’s Choice Award for the TUN competition, after presenting at the Teradata conference in Las Vegas this past October. Anke Joubert, Chiao-Ann Tsai, Fahd Lemhaider, and Marie Tourdes are all students in the MSc Big Data and Business Analytics program at ESCP.

The students all come from different countries and have distinct skill sets: engineering, economics, business, and political science, all of which played a role in their success. Marie said that they knew from working on school projects early in their MSc career that “the difference of our previous backgrounds was a real strength and a valuable asset for teamwork.”

How’d They Do It?

Their results centered on a multi-pronged approach to improving Bike MS. They analyzed ideal event timing, ways to improve corporate participation and women’s cycling events, and better marketing targeting techniques, and they offered concrete recommendations on how to advance and expand the program.

The team found that connecting to the human impact of data science was critical to their success at the competition, and it is a useful reminder for business cases as well.

“We were very proud to receive the People’s Choice Award for the Teradata Data Challenge,” stated Marie. “I think that we created a connection with the audience during our presentation, as we presented our analysis and results following a storyline based on personal experience. It helped to share the ‘why’ we had chosen to start and commit to this experience, and it helped to visualize the concrete impacts that exist behind the data.”

For their project, they used Dataiku to clean, combine, and analyze the data. The team “did not use only MS Bike data but also tried to analyze external data,” according to Fahd, which he believes contributed to their success.
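The clean, combine, and analyze steps described here can be sketched in a few lines of plain Python. This is only an illustration of the general workflow, not the team's actual Dataiku flow: the records, field names, external data (city population), and helper functions are all invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical event records; field names and values are invented for illustration.
events = [
    {"event_id": "E1", "city": " Austin ", "month": "April", "raised": "1200.50"},
    {"event_id": "E2", "city": "Denver", "month": "June", "raised": ""},
    {"event_id": "E3", "city": "austin", "month": "April", "raised": "800"},
    {"event_id": "E4", "city": "Denver", "month": "June", "raised": "1500"},
]

# Hypothetical external data keyed by normalized city name (values are made up).
external = {
    "austin": {"population": 950000},
    "denver": {"population": 700000},
}


def clean(rows):
    """Normalize city names, parse amounts, and drop rows with no amount."""
    cleaned = []
    for row in rows:
        raised = row["raised"].strip()
        if not raised:
            continue  # skip records with a missing fundraising amount
        cleaned.append({
            "event_id": row["event_id"],
            "city": row["city"].strip().lower(),
            "month": row["month"].strip(),
            "raised": float(raised),
        })
    return cleaned


def combine(rows, extra):
    """Attach external attributes to each row by matching on city."""
    return [{**row, **extra.get(row["city"], {})} for row in rows]


def average_raised_by(rows, key):
    """Average the amount raised for each distinct value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["raised"])
    return {value: mean(amounts) for value, amounts in groups.items()}


combined = combine(clean(events), external)
by_month = average_raised_by(combined, "month")
print(by_month)
```

Grouping by month is just one possible cut of the data; the same helper could group by city or any other field to explore questions such as event timing.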

Anke said, “[Dataiku] is easy to use and gives results quickly. The interface allowed us to make adjustments and rerun models to really get to know the dataset well. This was beneficial given the large data set we had to deal with.”

The team also found Dataiku helpful because it consolidated their workflow into one platform. “Being able to monitor our data flow is another big plus of using [Dataiku],” explained Chiao-Ann. “It’s an ideal tool for a project like this one because it is comprehensive, allowing flexibility from data prep and cleaning to boosting models.”

Fahd thought the project helped him grow personally and professionally: “This project was an amazing opportunity to combine our academic background and professional experiences to serve a human cause. And [it] was also a great occasion to get out of the academic comfort zone and expose our minds to different international experts in the industry.”

How Can I Do It Too?

If you’re interested in performing your own analysis with Dataiku, consider starting a free trial today, and check out our Machine Learning Guidebook for ways to improve your skills and analysis techniques.