It is no great secret: I like value-oriented interfaces that preserve referential transparency. That is the side of the public debate I take in R programming.
R Packages worth a look
L1-Penalized Censored Gaussian Graphical Models (cglasso): The l1-penalized censored Gaussian graphical model (cglasso) is an extension of the graphical lasso estimator developed to handle datasets with censore …
Python Vs R : The Eternal Question for Data Scientists
How to Optimise Ad CTR with Reinforcement Learning
In this blog we will try to get the basic idea behind reinforcement learning and understand what a multi-armed bandit problem is. We will also try to maximise the CTR (click-through rate) of advertisements for an advertising agency. The article includes: 1. Basics of reinforcement learning 2. Types of problems in reinforcement learning 3. Understanding the multi-armed bandit problem 4. Basics of conditional probability and Thompson sampling 5. Optimising ad CTR using Thompson sampling in R
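The recipe the blurb describes (treat each ad as an arm of a Bernoulli bandit and pick ads via Thompson sampling) can be sketched in a few lines of R. This is a minimal illustrative simulation, not the article's code; the ad names and true click rates are made-up assumptions:

```r
# Thompson sampling for ad CTR: a Bernoulli multi-armed bandit sketch.
set.seed(42)
true_ctr <- c(ad_A = 0.04, ad_B = 0.06, ad_C = 0.09)  # hidden in practice
n_ads  <- length(true_ctr)
clicks <- rep(0, n_ads)  # Beta-posterior successes per ad
misses <- rep(0, n_ads)  # Beta-posterior failures per ad

for (t in 1:10000) {
  # Draw one CTR estimate per ad from its Beta(1 + clicks, 1 + misses) posterior
  theta  <- rbeta(n_ads, 1 + clicks, 1 + misses)
  chosen <- which.max(theta)              # show the ad with the highest draw
  clicked <- runif(1) < true_ctr[chosen]  # simulate the user's response
  if (clicked) clicks[chosen] <- clicks[chosen] + 1
  else         misses[chosen] <- misses[chosen] + 1
}
clicks + misses  # impressions per ad; most should flow to the best ad
```

Because each arm's posterior tightens as it is played, exploration fades automatically and the highest-CTR ad ends up receiving the bulk of the impressions.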
Dataquest helped me get my dream job at Noodle.ai
Dataquest’s mission is to prepare real-world data scientists.
What's new on arXiv
Uncertainty Aware AI ML: Why and How
“Tweeking”: The big problem is not where you think it is.
In her recent article about pizzagate, Stephanie Lee included this hilarious email from Brian Wansink, the self-styled “world-renowned eating behavior expert for over 25 years”:
Document worth reading: “Graph-based Ontology Summarization: A Survey”
Ontologies have been widely used in numerous and varied applications, e.g., to support data modeling, information integration, and knowledge management. With the increasing size of ontologies, ontology understanding, which is playing an important role in different tasks, is becoming more difficult. Consequently, ontology summarization, as a way to distill key information from an ontology and generate an abridged version to facilitate a better understanding, is getting growing attention. In this survey paper, we review existing ontology summarization techniques and focus mainly on graph-based methods, which represent an ontology as a graph and apply centrality-based and other measures to identify the most important elements of an ontology as its summary. After analyzing their strengths and weaknesses, we highlight a few potential directions for future research.
Distilled News
The 2018 State of Data Management
Document worth reading: “On the Learning Dynamics of Deep Neural Networks”
While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood. In this work, we study the case of binary classification and prove various properties of learning in such networks under strong assumptions such as linear separability of the data. Extending existing results from the linear case, we confirm empirical observations by proving that the classification error also follows a sigmoidal shape in nonlinear architectures. We show that given proper initialization, learning expounds parallel independent modes and that certain regions of parameter space might lead to failed training. We also demonstrate that input norm and features’ frequency in the dataset lead to distinct convergence speeds which might shed some light on the generalization capabilities of deep neural networks. We provide a comparison between the dynamics of learning with cross-entropy and hinge losses, which could prove useful to understand recent progress in the training of generative adversarial networks. Finally, we identify a phenomenon that we baptize gradient starvation where the most frequent features in a dataset prevent the learning of other less frequent but equally informative features.