Analyzing Pronto CycleShare Data with Python and Pandas

This week Pronto CycleShare, Seattle’s Bicycle Share system, turned one year old. To celebrate this, Pronto made available a large cache of data from the first year of operation and announced the Pronto Cycle Share’s Data Challenge, which offers prizes for different categories of analysis.

There are a lot of tools out there that you could use to analyze data like this, but my tool of choice is (obviously) Python. In this post, I want to show how you can get started analyzing this data and joining it with other available data sources using the PyData stack, namely NumPy, Pandas, Matplotlib, and Seaborn. Here I’ll take a look at some of the basic questions you can answer with this data. Later I hope to find the time to dig deeper and ask some more interesting and creative questions – stay tuned!

For those who aren’t familiar, this post is composed in the form of a Jupyter Notebook, which is an open document format that combines text, code, data, and graphics and is viewable through the web browser – if you have not used it before I encourage you to try it out! You can download the notebook containing this post here, open it with Jupyter, and start asking your own questions of the data.