2018 Data Sources for Cool Data Science Projects, provided by Thinknum

At The Data Incubator, we run a Data Science Fellowship program for Master’s and PhD graduates looking to transition to a career in industry. Our admissions team and our hiring partners love Fellows who don’t mind getting their hands dirty with data. That’s why our applicants submit ideas for capstone projects they’ll work on throughout the 8-week Fellowship to showcase their data science skills. One of the biggest obstacles to creating and completing successful projects has been getting access to interesting data.

Today, we’re excited to announce a partnership with Thinknum, a leading provider of web-crawled alternative data. Thinknum has been the principal provider of web-crawled data to the finance community for over 3 years. Its client list includes more than 150 elite hedge funds and a majority of investment banks, which use the data to experiment with ever more innovative and differentiated ways of producing investment ideas across all sectors and multiple asset classes. More recently, Thinknum’s data has been in high demand among some of the largest and most innovative corporate customers for internal strategic decision making. The data is also heavily used by journalists, especially those reporting on the financial sector, with media outlets such as CNN, Business Insider and CNBC all using Thinknum resources in their stories. This partnership will give Fellows and Fellowship applicants access to some of the data that finance industry experts and corporate leaders use on a daily basis.

Business, economic and social activity is continually moving online. This increasing digital activity leaves behind data trails that, with proper organization, can reveal otherwise invisible trends, shifts and movements. Thinknum clients, and now The Data Incubator Fellows and applicants, can use this data for investing, for gaining a deeper understanding of businesses, or for telling a story about an industry trend. Thinknum crawls the internet every day to collect data on over 400,000 public and private companies across the globe. Their intuitive web-based tool will allow Fellows to navigate these large volumes of data, gather insights, explore correlations, and generate visualizations in seconds to share with other Fellows.

Thinknum Data

Thinknum tracks thousands of websites, capturing vast amounts of public data, indexing it, and mapping it back to individual companies. The full Thinknum library holds over 20 datasets, each containing dozens of metrics updated daily.

3 Datasets

Thinknum is providing The Data Incubator with access to three real-world datasets for our Fellows to analyze and explore. The options for potential projects are virtually limitless for each dataset, and most have yet to be explored. A look at the number of columns in each will give you a sense of just how many questions one can ask. A few initial suggestions are included to get you started.
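A quick way to get that sense is to list the column names of each dataset before doing anything else. The snippet below is a minimal sketch, assuming the datasets arrive as CSV files with a header row (the delivery format isn't stated here). The sample text and its column names are made-up placeholders, not real Thinknum fields.

```python
import csv
import io


def column_names(csv_file):
    """Return the header row of a CSV file-like object as a list of names."""
    reader = csv.reader(csv_file)
    # next() with a default returns an empty list for an empty file
    # instead of raising StopIteration.
    return next(reader, [])


# Hypothetical stand-in for one dataset. With a real file you would pass
# open(path, newline="") instead of this in-memory sample.
sample = io.StringIO(
    "company,date,metric_a,metric_b\n"
    "Acme,2018-01-01,1,2\n"
)

names = column_names(sample)
print(len(names), "columns:", names)
```

Running the same function over each of the three files gives a side-by-side view of which fields are available to build questions around.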

Enter your email to receive the datasets and get started on your own data science projects: