As we continue to bring KDnuggets readers year-end roundups and predictions for 2019, we reached out to a number of influential industry companies for their takes, posing this question:
Highlights of 2018
We end 2018 with a round-up of some of the research, talks, sci-fi, visualizations/art, and a grab bag of other stuff we found particularly interesting, enjoyable, or influential this year (and we’re going to be a bit fuzzy about the definition of “this year”)!
Document worth reading: “Are screening methods useful in feature selection? An empirical study”
Filter or screening methods are often used as a preprocessing step to reduce the number of variables a learning algorithm uses when fitting a classification or regression model. While many such filter methods exist, an objective evaluation of them is needed, both to compare them with each other and to answer whether they are useful at all, or whether a learning algorithm would do a better job without them. For this purpose, many popular screening methods are partnered in this paper with three regression learners and five classification learners and evaluated on ten real datasets using accuracy criteria such as R-square and area under the ROC curve (AUC). The results are compared through curve plots and comparison tables to find out whether screening methods improve the performance of learning algorithms and how they fare against each other. Our findings revealed that the screening methods were only useful in one regression and three classification datasets out of the ten evaluated. Are screening methods useful in feature selection? An empirical study
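The setup is easy to sketch. Below is a minimal illustration (not the paper's code) of one common screening approach, a correlation-based filter, paired with a linear regression learner on synthetic data; the variable names and the 0.1 correlation cutoff are arbitrary choices for the example.

```r
# Synthetic data: 20 candidate predictors, only two of which matter.
set.seed(1)
n <- 200
d <- data.frame(matrix(rnorm(n * 20), n, 20))
d$y <- d$X1 + 0.5 * d$X2 + rnorm(n)

# Screening step: keep predictors whose absolute correlation with y
# passes the (illustrative) cutoff.
cors <- sapply(d[, 1:20], function(x) abs(cor(x, d$y)))
keep <- names(cors)[cors > 0.1]

# Learner with and without the screening step.
full_model     <- lm(y ~ ., data = d)
screened_model <- lm(y ~ ., data = d[, c(keep, "y")])

# Compare in-sample R-squared (the paper uses held-out accuracy criteria).
summary(full_model)$r.squared
summary(screened_model)$r.squared
```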
So you want to play a pRank in R…?
So… you want to play a pRank with R? This short post gives you a fun function to help you out!
Document worth reading: “A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions”
Deep neural networks (DNNs) achieve excellent performance on standard classification tasks. However, under image quality distortions such as blur and noise, classification accuracy becomes poor. In this work, we compare the performance of DNNs with human subjects on distorted images. We show that, although DNNs perform better than or on par with humans on good quality images, DNN performance is still much lower than human performance on distorted images. We additionally find that there is little correlation in errors between DNNs and human subjects. This could be an indication that the internal representations of images differ between DNNs and the human visual system. These comparisons with human performance could be used to guide future development of more robust DNNs. A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions
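To make the two distortion types concrete, here is a minimal sketch (not the paper's code) of additive Gaussian noise and blur (approximated with a simple 3x3 mean filter), applied to a grayscale image stored as a matrix of values in [0, 1].

```r
# Additive Gaussian noise, clamped back into [0, 1].
add_noise <- function(img, sd = 0.1) {
  pmin(pmax(img + rnorm(length(img), sd = sd), 0), 1)
}

# Crude blur: replace each interior pixel with the mean of its 3x3 neighborhood.
mean_blur <- function(img) {
  out <- img
  nr <- nrow(img); nc <- ncol(img)
  for (i in 2:(nr - 1)) {
    for (j in 2:(nc - 1)) {
      out[i, j] <- mean(img[(i - 1):(i + 1), (j - 1):(j + 1)])
    }
  }
  out
}

img <- matrix(runif(64 * 64), 64, 64)  # stand-in for a real image
distorted <- mean_blur(add_noise(img)) # feed this to a classifier to compare
```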
vtreat Variable Importance
vtreat's purpose is to produce pure numeric R data.frames that are ready for supervised predictive modeling (predicting a value from other values). By ready we mean: a purely numeric data frame with no missing values and a reasonable number of columns (missing values re-encoded with indicators, and high-degree categorical variables re-encoded with effects codes or impact codes).
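As a minimal sketch of that workflow on synthetic data: designTreatmentsN() and prepare() are vtreat's documented functions for numeric outcomes, and the per-variable significances in the plan's scoreFrame are where variable importance is read from. The toy variables below are made up for the example.

```r
library(vtreat)

# Toy data with a missing-value problem and a categorical variable.
set.seed(2)
d <- data.frame(
  x_num = rnorm(100),
  x_cat = sample(letters[1:5], 100, replace = TRUE),
  stringsAsFactors = FALSE
)
d$x_num[sample(100, 10)] <- NA
d$y <- ifelse(is.na(d$x_num), 0, d$x_num) + rnorm(100)

# Design a treatment plan for a numeric outcome, then apply it.
plan <- designTreatmentsN(d, c("x_num", "x_cat"), "y", verbose = FALSE)
d_treated <- prepare(plan, d)   # purely numeric, no NAs

# Per-variable quality estimates used for variable importance.
plan$scoreFrame[, c("varName", "rsq", "sig")]
```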
AzureStor: an R package for working with Azure storage
Storage endpoints
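A storage endpoint is AzureStor's entry point to an account; containers and file operations hang off it. A minimal sketch using the package's documented generics (the account URL, key, container name, and file paths below are placeholders):

```r
library(AzureStor)

# Connect to a storage account; storage_endpoint() infers the service
# (blob, file, ADLS) from the URL.
endp <- storage_endpoint("https://myaccount.blob.core.windows.net",
                         key = "myAccessKey")
cont <- storage_container(endp, "mycontainer")

list_storage_files(cont)                            # inspect contents
storage_upload(cont, "local/data.csv", "data.csv")  # local -> Azure
storage_download(cont, "data.csv", "copy.csv")      # Azure -> local
```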
Classifying yin and yang using MRI
Zad Chow writes:
University of Rhode Island: Data Scientist, DataSpark (2 Positions) [Kingston, RI]
At: University of Rhode Island
Location: Kingston, RI
Web: www.uri.edu
Position: Data Scientist, DataSpark (2 Positions)
How will automation tools change data science?
By Dr. Ryohei Fujimaki, CEO and Founder of dotData