Exploratory data analysis is important; everybody knows that. With R, it is also easy. Below are three lines of code that let you interactively explore the Preston Curve, the prominent association of country-level real income per capita with life expectancy.
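(The post's own three lines are not reproduced in this extract; a minimal sketch in that spirit, assuming the gapminder and plotly packages, could look like this.)

    # Preston Curve: GDP per capita vs. life expectancy, animated by year
    library(gapminder)
    library(plotly)
    plot_ly(gapminder, x = ~gdpPercap, y = ~lifeExp, color = ~continent,
            frame = ~year, type = "scatter", mode = "markers") %>%
      layout(xaxis = list(type = "log"))  # income is usually shown on a log axis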
Timing Grouped Mean Calculation in R
This note is a comment on some of the timings shared in the dplyr-0.8.0 pre-release announcement.
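The note's own benchmark code is not shown in this extract; a minimal sketch of that kind of timing, assuming the dplyr and microbenchmark packages, might be:

    library(dplyr)
    library(microbenchmark)

    # one million rows spread over one thousand groups
    d <- data.frame(g = sample(1e3, 1e6, replace = TRUE), x = runif(1e6))

    # time a grouped mean in dplyr against a base-R tapply()
    microbenchmark(
      dplyr = d %>% group_by(g) %>% summarise(m = mean(x)),
      base  = tapply(d$x, d$g, mean),
      times = 10
    )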
My footnote about global warming
At the beginning of my article, How to think scientifically about scientists’ proposals for fixing science, which we discussed yesterday, I wrote:
R Packages worth a look
Visualize R Data Structures with Trees (lobstr): A set of tools for inspecting and understanding R data structures inspired by str(). Includes ast() for visualizing abstract syntax trees, ref() for sh …
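A quick taste of the two functions the blurb names (a sketch of mine, assuming lobstr is installed from CRAN):

    library(lobstr)

    # draw the abstract syntax tree of an unevaluated call
    ast(f(x, "y", 1))

    # show that two bindings reference the same underlying object
    x <- 1:10
    y <- x
    ref(x, y)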
Document worth reading: “A Theory of Diagnostic Interpretation in Supervised Classification”
Interpretable deep learning is a fundamental building block towards safer AI, especially now that the deployment of deep learning-based computer-aided medical diagnostic systems is so imminent. However, without a computational formulation of black-box interpretation, general interpretability research relies heavily on subjective bias. The clear decision structure of medical diagnostics lets us approximate the decision process of a radiologist as a model, removed from subjective bias. We define the process of interpretation as a finite communication between a known model and a black-box model that optimally maps the black box's decision process onto the known model. Consequently, we define interpretability as the maximal information gain over the initial uncertainty about the black box's decision within finite communication. We relax this definition based on the observation that diagnostic interpretation is typically achieved by a process of minimal querying, and we derive an algorithm to calculate diagnostic interpretability. The usual question of the accuracy-interpretability tradeoff, i.e. whether a black-box model's prediction accuracy depends on its ability to be interpreted by a known source model, does not arise in this theory. With multiple simulation experiments of various complexity levels, we demonstrate how such a theoretical model works in synthetic supervised classification scenarios. A Theory of Diagnostic Interpretation in Supervised Classification
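One plausible formal reading of that definition (notation mine, not taken from the paper): with H denoting Shannon entropy, Y the black box's decision, and Q a set of at most k queries answered through the known model,

    \mathcal{I} = \max_{Q :\; |Q| \le k} \big[ H(Y) - H(Y \mid Q) \big]

that is, interpretability is the largest reduction in uncertainty about the black box's decision achievable within a finite communication budget.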
It was twenty years ago …
… this week that I made a first cameo in the debian/changelog for the Debian R package:
Dr. Data Show Video: Five Reasons Computers Predict When You’ll Die
Watch the latest episode of The Dr. Data Show, which answers the question, “Why do computers predict when you’ll die?” – with five example reasons!
What's new on arXiv
Interpretable Graph Convolutional Neural Networks for Inference on Noisy Knowledge Graphs
If you did not already know
Context-aware Sentiment Word Identification (sentiword2vec) Traditional sentiment analysis often uses a sentiment dictionary to extract sentiment information from text and classify documents. However, emerging informal words and phrases in user-generated content call for analysis that is aware of context, since such words often have special meanings in a particular context. Because of their strong performance in representing inter-word relations, we use sentiment word vectors to identify these special words. Building on the distributed language model word2vec, this paper presents a novel method for representing the sentiment of a word in a particular context; specifically, it identifies words with abnormal sentiment polarity in long answers. Results show the improved model performs better at representing words with special meanings, while still doing well on special idiomatic patterns. Finally, we discuss the meaning of vector representations in the field of sentiment, which may differ from general object-based conditions. …
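A toy illustration of the underlying idea (mine, not the paper's code; the vectors below are random stand-ins for trained word2vec embeddings): score a word's contextual polarity by its cosine similarity to a centroid of seed sentiment words, and flag words whose similarity disagrees with their dictionary polarity.

    # cosine similarity between two vectors
    cosine <- function(a, b) sum(a * b) / (sqrt(sum(a * a)) * sqrt(sum(b * b)))

    set.seed(1)
    # random 10-dimensional stand-ins for trained word embeddings
    vecs <- matrix(rnorm(5 * 10), nrow = 5,
                   dimnames = list(c("good", "great", "bad", "sick", "terrible"), NULL))
    pos_centroid <- colMeans(vecs[c("good", "great"), ])

    # a slang word like "sick" may sit closer to the positive centroid than
    # its dictionary polarity suggests; compare it against a clear negative
    cosine(vecs["sick", ], pos_centroid)
    cosine(vecs["terrible", ], pos_centroid)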