New: Maintained Datasets

Can you trust the data you use on Kaggle? Is it licensed? Has it been updated recently?

Those sensible questions are the reason for the new “Maintained by Kaggle” badge you may have noticed while browsing select datasets. The badge signifies that Kaggle maintains the dataset, whether or not Kaggle collected the data itself (e.g. Kaggle - Meta Kaggle vs. SF Open Data - Police Calls). For data published by other organizations, Kaggle connects to the source through public APIs such as Socrata and FRED.
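For readers curious what pulling from a public API like Socrata can look like, here is a minimal sketch. It is not Kaggle's actual pipeline. It only builds a query URL for Socrata's SODA endpoint, and the domain, dataset id, and the `build_soda_url` helper are hypothetical names chosen for illustration.

```python
from urllib.parse import urlencode


def build_soda_url(domain, dataset_id, limit=1000, where=None):
    """Build a query URL for a Socrata (SODA) dataset's JSON endpoint.

    `domain` and `dataset_id` identify the open-data portal and the dataset;
    the values used below are placeholders, not real datasets.
    """
    params = {"$limit": limit}  # cap the number of rows returned
    if where:
        params["$where"] = where  # optional SoQL filter expression
    # Keep "$" unescaped so the SODA parameter names stay readable.
    query = urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


# Hypothetical portal and dataset id, for illustration only.
url = build_soda_url("data.example.org", "abcd-1234", limit=5)
filtered_url = build_soda_url(
    "data.example.org", "abcd-1234", where="date > '2017-01-01'"
)
print(url)
print(filtered_url)
```

Fetching the resulting URL with any HTTP client would return rows as JSON. A maintained dataset would presumably re-run a query like this on a schedule so the copy on Kaggle stays current.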

What does the badge mean?

The “Maintained by Kaggle” badge means that Kaggle actively maintains the dataset and will continue to do so. This includes regular updates to descriptions and metadata, faster responses in discussions, and accurate, current data from the source. Our goal is to create seamless workflows that allow everyone to do data science on Kaggle and be confident in the data they work with.

Which datasets are maintained?

Kaggle maintains data from various sources and in a variety of subject areas. Here are a few examples of open-source datasets we’re currently maintaining:

See even more datasets currently maintained by Kaggle!

Give us your thoughts!

Are there other datasets you’d like to see “Maintained by Kaggle”? Do you manage a data repository that you’d like to integrate with Kaggle? Visit our Product Feedback Forum to send us your comments and discuss your ideas with other Kagglers.