This paper reviews recent advances in missing data research using graphical models to represent multivariate dependencies. We first examine the limitations of traditional frameworks from three different perspectives: transparency, estimability and testability. We then show how procedures based on graphical models can overcome these limitations and provide meaningful performance guarantees even when data are Missing Not At Random (MNAR). In particular, we identify conditions that guarantee consistent estimation in broad categories of missing data problems, and derive procedures for implementing this estimation. Finally, we derive testable implications for missing data models in both MAR (Missing At Random) and MNAR categories.
Graphical Models for Processing Missing Data
RcppMsgPack 0.2.3
Another maintenance release of RcppMsgPack got onto CRAN today. Two new helper functions were added and, not unlike the previous 0.2.2 release, some additional changes are internal and should allow compilation on all CRAN systems.
Graphs and tables, tables and graphs
Jesse Wolfhagen writes:
Statistics Sunday: Reading and Creating a Data Frame with Multiple Text Files
First Statistics Sunday in far too long! It’s going to be a short one, but it describes a great trick I learned recently while completing a time study for our exams at work.
Growing List vs Growing Queue
GROWING LIST
base_lst1 <- function(df) {
  l <- list()
  for (i in seq(nrow(df))) l[[i]] <- as.list(df[i, ])
  return(l)
}
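For comparison, here is a minimal sketch of a preallocated counterpart to base_lst1. The name prealloc_lst and the small example data frame are hypothetical and not taken from the original post. It only illustrates allocating the list once instead of growing it on every iteration.

```r
# Hypothetical counterpart to base_lst1 (not from the original post):
# allocate the full list up front, then fill each slot in place.
prealloc_lst <- function(df) {
  n <- nrow(df)
  l <- vector("list", n)            # one allocation of length n
  for (i in seq_len(n)) l[[i]] <- as.list(df[i, ])
  l
}

# Small illustrative data frame (assumed for this sketch)
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)
res <- prealloc_lst(df)
```

Each element of res is a named list holding one row of df, the same shape that base_lst1 produces.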
RcppGetconf 0.0.3
A second and minor update for the RcppGetconf package for reading system configuration, not unlike getconf from the libc library, is now on CRAN.
If you did not already know
Non-convex Conditional Gradient Sliding (NCGS)
We investigate a projection-free method, namely conditional gradient sliding (CGS), on batched, stochastic and finite-sum non-convex problems. CGS is a smart combination of Nesterov’s accelerated gradient method and the Frank-Wolfe (FW) method, and outperforms FW in the convex setting by saving gradient computations. However, the study of CGS in the non-convex setting is limited. In this paper, we propose non-convex conditional gradient sliding (NCGS), which surpasses the non-convex Frank-Wolfe method in the batched, stochastic and finite-sum settings. …
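As a rough illustration of the projection-free idea, the following sketch runs the plain Frank-Wolfe method, not CGS or NCGS, on a small convex problem over the probability simplex. The function name frank_wolfe_simplex, the quadratic objective, the target vector b and the step size 2 / (k + 2) are all assumptions chosen for this example.

```r
# Plain Frank-Wolfe on the probability simplex (illustrative sketch only;
# this is NOT the CGS/NCGS method from the abstract).
frank_wolfe_simplex <- function(grad, x0, n_iter = 2000) {
  x <- x0
  for (k in seq_len(n_iter) - 1) {
    g <- grad(x)
    # Linear minimization oracle: over the simplex, the minimizer of <g, s>
    # is the vertex at the smallest gradient coordinate. No projection needed.
    s <- numeric(length(x))
    s[which.min(g)] <- 1
    gamma <- 2 / (k + 2)            # standard diminishing step size
    x <- x + gamma * (s - x)        # convex combination stays in the simplex
  }
  x
}

# Assumed toy objective: f(x) = sum((x - b)^2), with b inside the simplex
b <- c(0.2, 0.3, 0.5)
grad_f <- function(x) 2 * (x - b)
x_hat <- frank_wolfe_simplex(grad_f, x0 = c(1, 0, 0))
```

Every iterate is a convex combination of simplex vertices, so feasibility is kept without a projection step. That property is what methods in the Frank-Wolfe family exploit.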
Getting Started with Amazon Comprehend custom entities
We released an update to Amazon Comprehend enabling support for private, custom entity types. Customers can now train state-of-the-art entity recognition models to extract their specific terms, completely automatically. No machine learning experience is required. The service enables customers to create custom models with the data they already have, without learning the ins and outs of ML. Customers can use the feature to easily build models that extract custom entities, such as policy numbers, part codes, and serial numbers, tailored to an organization’s needs.
R Packages worth a look
Derive Polygenic Risk Score Based on Empirical Bayes Theory (EBPRS)
EB-PRS is a novel method that leverages information for effect sizes across all the markers to improve the prediction accuracy. No parameter tuning is …
What’s new on arXiv
A causal inference framework for cancer cluster investigations using publicly available data