On May 19, 2016, Meta Brown, author of Data Mining for Dummies, presented CRISP-DM: The Dominant Process for Data Mining at the inaugural Chicago Women in Machine Learning & Data Science (WiMLDS) Meetup event. CRISP-DM is a popular data mining process; in fact, when a KDNuggets poll asked respondents what methodology they use for analytics, data mining, and data science projects, 43% responded that they use CRISP-DM, a six-phase process first developed in the 1990s. In her talk, Meta discussed why and how CRISP-DM was developed, the phases of the CRISP-DM process, and where CRISP-DM stands today.
Before Meta’s talk, Mollie Pettit, a data scientist at Datascope, asked Meta a few questions about herself and her talk, and for a little friendly advice.
Mollie: What is one (or some) of the main points you hope attendees will get from your WiMLDS Meetup talk?
Meta: Good process matters more than any other aspect of analytics. Sophisticated algorithms, the latest tools, and massive resources mean nothing if you don’t have a process that is thorough, methodical and, above all, relevant to a real business problem. What’s more, an excellent process enables you to produce valuable, actionable information using a minimum of resources.
Mollie: Can you succinctly describe what the CRISP-DM process is?
Meta: CRISP-DM is a step-by-step framework for the data analysis process. It was developed by a large and diverse group of analysts from over 200 organizations, and designed to be a flexible framework suitable for any industry and for all types of data and tools.
Mollie: For a company (or a data scientist within a company) who wants to implement CRISP-DM, what advice would you give them?
Meta: Commit to implementing an excellent data analysis process. You don’t need a lot of money, special tools, or special talents to use CRISP-DM, but you do need to be tenacious, consistent and insistent on the importance of process.
The only necessary resource that you may not already have on hand is literature to explain the framework. I recommend two resources for that – the CRISP-DM Step-by-step Data Mining Guide and my book, Data Mining for Dummies.
Mollie: Do you have any specific thoughts or advice to women in the data science field?
Meta: Be cautious of anything that separates the “data science” specialty from the wider analytics community. The computing industry wants to own analytics, and that’s not good for women, or for analytics.
Women are fully half the analytics community. The latest data from the Bureau of Labor Statistics indicates that 53% of statisticians are women, and there’s plenty of data to show that women are present in large numbers and doing important work in every aspect of analytics. I’ve personally written and shared profiles of more than 450 accomplished women in the field, and that’s just a minuscule fraction of the whole.
In my presentation, I’ll give some examples, drawn from my own work, of common male behavior that isn’t desirable for producing valuable analytics, and of how women (and men, too, if they’re willing) can approach analytics differently to produce more relevant and valuable information for their employers and clients.
Mollie: Tell us something fun about yourself that has nothing to do with data science.
Meta: Last summer, I took a geology walking tour of London, and now I look at construction materials with awe.
Mollie: What is some of the best advice you have been given?
Meta: Be visible. Launch a campaign, immediately, to make yourself better known for the work you do.
The event was held at the Datascope office in downtown Chicago. For info on upcoming Meetup events hosted by Datascope, check out the Chicago Women in Machine Learning & Data Science events, and Data Science Chicago events.