Anomaly Detection: A Survey (September 2009)
If you did not already know
Computation Control Protocol (CCP)
Cooperative computation is a promising approach for localized data processing in the Internet of Things (IoT), where computationally intensive tasks on a device can be divided into sub-tasks and offloaded to other devices or servers in close proximity. However, exploiting the potential of cooperative computation is challenging, mainly because IoT devices are heterogeneous: they may have different and time-varying computing power and energy resources, and they may be mobile. Coded computation, which advocates mixing data in sub-tasks by employing erasure codes and offloading these sub-tasks to other devices for computation, has recently gained interest thanks to its higher reliability, smaller delay, and lower communication costs. In this paper, we develop a coded cooperative computation framework, which we name Computation Control Protocol (CCP), by taking into account the heterogeneous computing power and energy resources of IoT devices. CCP dynamically allocates sub-tasks to helpers and adapts to time-varying resources. We show that (i) CCP improves task completion delay significantly compared to baselines, (ii) the task completion delay of CCP is very close to its theoretical characterization, and (iii) the efficiency of CCP in terms of resource utilization is higher than 99%, which is significant. …
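To make the idea of coded computation concrete, here is a minimal sketch of a toy (3, 2) erasure-coded matrix-vector product. It is not the CCP algorithm from the paper. It only illustrates how mixing data with a simple parity block lets the result be recovered when one helper never replies. The function names, the block labels, and the assumption that the matrix has an even number of rows are all illustrative choices, not taken from the paper.

```python
# Toy (3, 2) coded matrix-vector multiplication.
# Illustrative only: names and the even-row assumption are hypothetical.

def matvec(rows, x):
    """Multiply a matrix (given as a list of rows) by a vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]

def encode(A):
    """Split A into two row blocks and add one parity block A1 + A2."""
    half = len(A) // 2  # assumes len(A) is even
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return {"A1": A1, "A2": A2, "A1+A2": parity}

def decode(results):
    """Recover A @ x from any two of the three helper results."""
    if "A1" in results and "A2" in results:
        y1, y2 = results["A1"], results["A2"]
    elif "A1" in results:
        y1 = results["A1"]
        y2 = [p - a for p, a in zip(results["A1+A2"], y1)]
    else:
        y2 = results["A2"]
        y1 = [p - b for p, b in zip(results["A1+A2"], y2)]
    return y1 + y2

A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, -1]
tasks = encode(A)

# Pretend the helper holding block "A1" is slow and never replies.
results = {name: matvec(block, x)
           for name, block in tasks.items() if name != "A1"}
recovered = decode(results)
```

Because any two of the three sub-task results suffice, the slowest helper no longer determines the completion time. That is the source of the delay and reliability benefits the abstract mentions.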
R Packages worth a look
Optimal Design Emulators via Point Processes (demu) Implements the determinantal point process (DPP) based optimal design emulator described in Pratola, Lin and Craigmile (2018).
Document worth reading: “Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications”
Nonnegative matrix factorization (NMF) has become a workhorse for signal and data analytics, owing to its model parsimony and interpretability. Perhaps a bit surprisingly, the understanding of its model identifiability (the major reason behind the interpretability in many applications such as topic mining and hyperspectral imaging) had been rather limited until recent years. Since the 2010s, identifiability research on NMF has progressed considerably: many interesting and important results have been discovered by the signal processing (SP) and machine learning (ML) communities. NMF identifiability has a great impact on many aspects of practice, such as avoiding ill-posed formulations and designing algorithms with performance guarantees. On the other hand, there is no tutorial paper that introduces NMF from an identifiability viewpoint. In this paper, we aim to fill this gap by offering a comprehensive and deep tutorial on model identifiability of NMF as well as its connections to algorithms and applications. This tutorial will help researchers and graduate students grasp the essence and insights of NMF, thereby avoiding typical “pitfalls” that are often due to unidentifiable NMF formulations. This paper will also help practitioners pick or design suitable factorization tools for their own problems. Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications
When anyone claims 80% power, I’m skeptical.
A policy analyst writes:
What's new on arXiv
Genie: An Open Box Counterfactual Policy Estimator for Optimizing Sponsored Search Marketplace
R Packages worth a look
Choosing the Sample Strategy (optimStrat) A package intended to assist in choosing the sample strategy to implement in a survey. It compares five strategies, taking into account the informa …
Weighing the risk of moderate alcohol consumption
A research study on mortality and alcohol consumption is making the rounds. Its main conclusion is that all alcohol consumption is bad for you, because of increased risk. David Spiegelhalter, the chair of the Winton Centre for Risk and Evidence Communication, offers a different interpretation of the data:
Because it's Friday: One Million Integers
This is a visualization of numbers from 1 to 1,000,000, added in order in blocks of 1,000, located in the 2-D plane according to their prime factors:
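The excerpt does not say how the 2-D layout is derived from the prime factors, so the sketch below covers only the two ingredients it names: computing each integer's prime factorization and walking the range 1 to 1,000,000 in blocks of 1,000. The function names `prime_factors` and `blocks` are hypothetical helpers, and trial division is simply one straightforward way to factor numbers of this size.

```python
def prime_factors(n):
    """Return {prime: exponent} for n >= 1 using trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors

def blocks(limit=1_000_000, size=1_000):
    """Yield consecutive ranges covering 1..limit, `size` integers at a time."""
    for start in range(1, limit + 1, size):
        yield range(start, min(start + size, limit + 1))

# Factor the first block only; a full run would loop over every block.
first_block = next(blocks())
features = {n: prime_factors(n) for n in first_block}
```

Each entry of `features` could then be turned into a sparse vector indexed by prime and passed to whatever dimensionality-reduction or layout method the visualization uses.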
Microsoft Weekly Data Science News for August 24, 2018
The latest articles from Microsoft regarding cloud data science products and updates.