Using Stacking to Average Bayesian Predictive Distributions (with Discussion)
I’ve posted on this paper (by Yuling Yao, Aki Vehtari, Daniel Simpson, and myself) before, but now the final version has been published, along with a bunch of interesting discussions and our rejoinder.
Advantages of Online Data Science Courses
R Packages worth a look
R Bindings for ‘Selenium WebDriver’ (RSelenium) Provides a set of R bindings for the ‘Selenium 2.0 WebDriver’ (see <https://… …
Document worth reading: “Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges”
The emerging paradigm of Human-Machine Inference Networks (HuMaINs) combines complementary cognitive strengths of humans and machines in an intelligent manner to tackle various inference tasks and achieves higher performance than either humans or machines by themselves. While inference performance optimization techniques for human-only or sensor-only networks are quite mature, HuMaINs require novel signal processing and machine learning solutions. In this paper, we present an overview of the HuMaINs architecture with a focus on three main issues that include architecture design, inference algorithms including security/privacy challenges, and application areas/use cases.
The Price of Transformation
3-D shadow maps in R: the rayshader package
Data scientists often work with geographic data that needs to be visualized on a map, and sometimes the maps themselves are the data. The data is often located in two-dimensional space (latitude and longitude), but for some applications we have a third dimension as well: elevation. We could represent the elevations using contours, color, or 3-D perspective, but with the new rayshader package for R by Tyler Morgan-Wall, it’s easy to visualize such maps as 3-D relief maps complete with shadows, perspective and depth of field.
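To give a feel for what shading an elevation map involves, here is a minimal sketch in plain Python. It is not rayshader's API and does no ray tracing or cast shadows. It only computes simple Lambertian hillshading, where each cell's brightness depends on the angle between the surface and the light. The function name `hillshade`, its parameters, and the convention that row index grows northward are all assumptions made for this illustration.

```python
import math


def hillshade(elev, cell_size=1.0, azimuth_deg=315.0, altitude_deg=45.0):
    """Return Lambertian shade values in [0, 1] for an elevation grid.

    elev is a list of equal-length rows. In this sketch x grows with the
    column index (east) and y grows with the row index (assumed north).
    azimuth_deg is the compass direction of the light, measured clockwise
    from north; altitude_deg is its angle above the horizon.
    """
    rows, cols = len(elev), len(elev[0])
    az = math.radians(azimuth_deg)
    alt = math.radians(altitude_deg)

    # Unit vector pointing from the surface toward the light source.
    lx = math.cos(alt) * math.sin(az)
    ly = math.cos(alt) * math.cos(az)
    lz = math.sin(alt)

    shade = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Central differences inside the grid, one-sided at the edges.
            j0, j1 = max(j - 1, 0), min(j + 1, cols - 1)
            i0, i1 = max(i - 1, 0), min(i + 1, rows - 1)
            dzdx = (elev[i][j1] - elev[i][j0]) / ((j1 - j0) * cell_size) if j1 > j0 else 0.0
            dzdy = (elev[i1][j] - elev[i0][j]) / ((i1 - i0) * cell_size) if i1 > i0 else 0.0

            # Unit surface normal for the local slope.
            norm = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            nx, ny, nz = -dzdx / norm, -dzdy / norm, 1.0 / norm

            # Lambert's cosine law; cells facing away from the light get 0.
            shade[i][j] = max(0.0, nx * lx + ny * ly + nz * lz)
    return shade


# A ramp that rises toward the east, so its surface faces west.
ramp = [[float(j) for j in range(4)] for _ in range(4)]
lit_from_west = hillshade(ramp, azimuth_deg=270.0)
lit_from_east = hillshade(ramp, azimuth_deg=90.0)
```

With the light in the west the ramp is fully lit, and with the light in the east it goes dark. That contrast is the basic cue that makes a flat elevation matrix read as relief.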
R Packages worth a look
Apache Thrift Client Server (<https://thrift…
Job opening at CDC: “The Statistician will play a central role in guiding the statistical methods of all major projects of the Epidemiology and Prevention Branch of the CDC Influenza Division, and aid in designing, analyzing, and interpreting research intended to understand the burden of influenza in the US and internationally and identify the best influenza vaccines and vaccine strategies.”
Vacancy Information: Mathematical Statistician, GS-1529-14
If you did not already know
Distributed Data Shuffling
Data shuffling of training data among different computing nodes (workers) has been identified as a core element to improve the statistical performance of modern large scale machine learning algorithms. Data shuffling is often considered one of the most significant bottlenecks in such systems due to the heavy communication load. Under a master-worker architecture (where a master has access to the entire dataset and only communication between the master and workers is allowed), coding has recently been proved to considerably reduce the communication load. In this work, we consider a different communication paradigm referred to as distributed data shuffling, where workers, connected by a shared link, are allowed to communicate with one another while no communication between the master and workers is allowed. Under the constraint of uncoded cache placement, we first propose a general coded distributed data shuffling scheme, which achieves the optimal communication load within a factor of two. Then, we propose an improved scheme achieving the exact optimality for either large memory size or at most four workers in the system. …
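As a toy illustration of why coding can cut the communication load on a shared link, consider the sketch below. It is not the scheme proposed in the paper. The three workers, their cache contents, and the helper `xor_bytes` are all hypothetical, chosen only to show one coded broadcast serving two workers at once.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("data units must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Two equal-sized data units (hypothetical contents).
unit_a = b"unit-A-payload"
unit_b = b"unit-B-payload"

# Hypothetical cache state before the shuffle:
# w1 holds A and needs B, w2 holds B and needs A, w3 holds both.
cache = {
    "w1": {"A": unit_a},
    "w2": {"B": unit_b},
    "w3": {"A": unit_a, "B": unit_b},
}

# Uncoded delivery: w3 broadcasts B for w1 and then A for w2.
uncoded_transmissions = [cache["w3"]["B"], cache["w3"]["A"]]

# Coded delivery: w3 broadcasts A XOR B once on the shared link.
coded_transmissions = [xor_bytes(cache["w3"]["A"], cache["w3"]["B"])]

# Each receiver cancels the unit it already caches to recover the other.
decoded_by_w1 = xor_bytes(coded_transmissions[0], cache["w1"]["A"])
decoded_by_w2 = xor_bytes(coded_transmissions[0], cache["w2"]["B"])
```

Here one coded transmission replaces two uncoded ones, and no master is involved: the sender is itself a worker, which is the setting the abstract calls distributed data shuffling.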