SunJackson Blog

Get a 2–6x Speed-up on Your Data Pre-processing with Python

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/yxvapa05ZsY/get-speed-up-data-pre-processing-python.html

Matt Mayo Editor

Posted on 2018-10-23

By George Seif, AI / Machine Learning Engineer

Read more »

Introducing gratia

Reposted from: http://feedproxy.google.com/~r/RBloggers/~3/cowPypUunVY/

Gavin L. Simpson

Posted on 2018-10-23

I use generalized additive models (GAMs) in my research work. I use them a lot! Simon Wood’s mgcv package is an excellent set of software for specifying, fitting, and visualizing GAMs for very large data sets. Despite recently dabbling with brms, mgcv is still my go-to GAM package. The only down-side to mgcv is that it is not very tidy-aware and the ggplot-verse may as well not exist as far as it is concerned. This in itself is no bad thing, though as someone who uses mgcv a lot but also prefers to do my plotting with ggplot2, this lack of awareness was starting to hurt. So, I started working on something to help bridge the gap between these two separate worlds that I inhabit. The fruit of that labour is gratia, and development has progressed to the stage where I am ready to talk a bit more about it.

Read more »

How Can Autonomous Drones Help the Energy and Utilities Industry?

Reposted from: https://blogs.technet.microsoft.com/machinelearning/2018/10/23/how-can-autonomous-drones-help-the-energy-and-utilities-industry/

ML Blog Team

Posted on 2018-10-23

Read more »

5 Steps to Prepare for a Data Science Job

Reposted from: https://www.codementor.io/divyacyclitics15/5-steps-to-prepare-for-a-data-science-job-olqni451r

Kartik Singh

Posted on 2018-10-23

A career in data science is hyped as the hottest job of the 21st century, but how do you become a data scientist? How should you, as an aspiring data scientist, or a student who aims at a data science job, prepare? What are the skills you need? What must you do? Fret not – this article will answer all your questions and give you links with which you can jump-start a new career in data science!

Read more »

What's new on arXiv

Reposted from: https://advanceddataanalytics.net/2018/10/23/whats-new-on-arxiv-794/

Michael Laux

Posted on 2018-10-23

Removing the influence of a group variable in high-dimensional predictive modelling

Read more »

Document worth reading: “Attribute-aware Collaborative Filtering: Survey and Classification”

Reposted from: https://advanceddataanalytics.net/2018/10/23/document-worth-reading-attribute-aware-collaborative-filtering-survey-and-classification/

Michael Laux

Posted on 2018-10-23

Attribute-aware CF models aim at rating prediction given not only the historical ratings from users to items, but also the information associated with users (e.g. age), items (e.g. price), or even ratings (e.g. rating time). This paper surveys work from the past decade on developing attribute-aware CF systems and finds that, mathematically, they can be classified into four different categories. We provide readers not only with a high-level mathematical interpretation of the existing work in this area but also with the mathematical insight behind each category of models. Finally, we provide in-depth experimental results comparing the effectiveness of the major works in each category. Attribute-aware Collaborative Filtering: Survey and Classification

Read more »

Computer Vision for Model Assessment

Reposted from: http://feedproxy.google.com/~r/RBloggers/~3/vlTx7s87nwU/

David Smith

Posted on 2018-10-23

One of the differences between statistical data scientists and machine learning engineers is that while the latter group are concerned primarily with the predictive performance of a model, the former group are also concerned with the fit of the model. A model that misses important structures in the data — for example, seasonal trends, or a poor fit to specific subgroups — is likely to be lacking important variables or features in the source data. You can try different machine learning techniques or adjust hyperparameters to your heart’s content, but you’re unlikely to discover problems like this without evaluating the model fit.

Read more »

What to think about this new study which says that you should limit your alcohol to 5 drinks a week?

Reposted from: https://andrewgelman.com/2018/10/23/think-new-study-says-limit-alcohol-5-drinks-week/

Andrew

Posted on 2018-10-23

Someone who wishes to remain anonymous points us to a recent article in the Lancet, “Risk thresholds for alcohol consumption: combined analysis of individual-participant data for 599 912 current drinkers in 83 prospective studies,” by Angela Wood et al., that’s received a lot of press coverage; for example:

Read more »

Introduction to Active Learning

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/U6deHrHRI3c/introduction-active-learning.html

Dan Clark

Posted on 2018-10-23

By Jennifer Prendki, VP of Machine Learning, Figure Eight

Read more »

High school statistics class builds election prediction model

Reposted from: https://flowingdata.com/2018/10/23/high-school-statistics-class-builds-election-prediction-model/

Nathan Yau

Posted on 2018-10-23

High school seniors in the Political Statistics class at Montgomery Blair High School in Silver Spring, Maryland, built a prediction model for the upcoming elections:

Read more »