It is no great secret: I like value-oriented interfaces that preserve referential transparency. That is the side of the public debate I take in R programming.
R Packages worth a look
L1-Penalized Censored Gaussian Graphical Models (cglasso): The l1-penalized censored Gaussian graphical model (cglasso) is an extension of the graphical lasso estimator developed to handle datasets with censore …
Python Vs R : The Eternal Question for Data Scientists
How to Optimise Ad CTR with Reinforcement Learning
In this blog we will try to get the basic idea behind reinforcement learning and understand what a multi-armed bandit problem is. We will also try to maximise the CTR (click-through rate) of advertisements for an advertising agency. The article includes: 1. Basics of reinforcement learning 2. Types of problems in reinforcement learning 3. Understanding the multi-armed bandit problem 4. Basics of conditional probability and Thompson sampling 5. Optimising ad CTR using Thompson sampling in R
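The recipe the blurb describes (treat each ad as an arm of a Bernoulli bandit and pick ads via Thompson sampling) can be sketched in a few lines of R. This is a minimal illustrative simulation, not the article's code; the ad names and true click rates are made-up assumptions:

```r
# Thompson sampling for ad CTR: a Bernoulli multi-armed bandit sketch.
set.seed(42)
true_ctr <- c(ad_A = 0.04, ad_B = 0.06, ad_C = 0.09)  # hidden in practice
n_ads  <- length(true_ctr)
clicks <- rep(0, n_ads)  # Beta-posterior successes per ad
misses <- rep(0, n_ads)  # Beta-posterior failures per ad

for (t in 1:10000) {
  # Draw one CTR estimate per ad from its Beta(1 + clicks, 1 + misses) posterior
  theta  <- rbeta(n_ads, 1 + clicks, 1 + misses)
  chosen <- which.max(theta)              # show the ad with the highest draw
  clicked <- runif(1) < true_ctr[chosen]  # simulate the user's response
  if (clicked) clicks[chosen] <- clicks[chosen] + 1
  else         misses[chosen] <- misses[chosen] + 1
}
clicks + misses  # impressions per ad; most should flow to the best ad
```

Because each arm's posterior tightens as it is played, exploration fades automatically and the highest-CTR ad ends up receiving the bulk of the impressions.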
Dataquest helped me get my dream job at Noodle.ai
Dataquest’s mission is to prepare real-world data scientists.
What's new on arXiv
Uncertainty Aware AI ML: Why and How
“Tweeking”: The big problem is not where you think it is.
In her recent article about pizzagate, Stephanie Lee included this hilarious email from Brian Wansink, the self-styled “world-renowned eating behavior expert for over 25 years”:
Document worth reading: “Graph-based Ontology Summarization: A Survey”
Ontologies have been widely used in numerous and varied applications, e.g., to support data modeling, information integration, and knowledge management. With the increasing size of ontologies, ontology understanding, which is playing an important role in different tasks, is becoming more difficult. Consequently, ontology summarization, as a way to distill key information from an ontology and generate an abridged version to facilitate a better understanding, is getting growing attention. In this survey paper, we review existing ontology summarization techniques and focus mainly on graph-based methods, which represent an ontology as a graph and apply centrality-based and other measures to identify the most important elements of an ontology as its summary. After analyzing their strengths and weaknesses, we highlight a few potential directions for future research.
Distilled News
The 2018 State of Data Management
Document worth reading: “On the Learning Dynamics of Deep Neural Networks”
While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood. In this work, we study the case of binary classification and prove various properties of learning in such networks under strong assumptions such as linear separability of the data. Extending existing results from the linear case, we confirm empirical observations by proving that the classification error also follows a sigmoidal shape in nonlinear architectures. We show that given proper initialization, learning expounds parallel independent modes and that certain regions of parameter space might lead to failed training. We also demonstrate that input norm and features’ frequency in the dataset lead to distinct convergence speeds which might shed some light on the generalization capabilities of deep neural networks. We provide a comparison between the dynamics of learning with cross-entropy and hinge losses, which could prove useful to understand recent progress in the training of generative adversarial networks. Finally, we identify a phenomenon that we baptize gradient starvation where the most frequent features in a dataset prevent the learning of other less frequent but equally informative features.