SunJackson Blog

A potential big problem with placebo tests in econometrics: they’re subject to the “difference between significant and non-significant is not itself statistically significant” issue

Reposted from: https://andrewgelman.com/2018/09/26/potential-big-problem-placebo-tests-econometrics-theyre-subject-difference-significant-non-significant-not-statistically-significant-issue/

Andrew

Posted on 2018-09-26

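As a rough illustration of the issue named in the title above, the sketch below compares a main estimate with a placebo estimate. The numbers (25 ± 10 and 10 ± 10) are made up for illustration and do not come from the post, and the two estimates are assumed to be independent so that their variances add.

```python
import math

# Hypothetical numbers, chosen only for illustration (not from the post).
main_est, main_se = 25.0, 10.0        # main effect estimate and its standard error
placebo_est, placebo_se = 10.0, 10.0  # placebo estimate and its standard error
CRITICAL = 1.96                       # two-sided 5% threshold for a normal z-score


def z_score(estimate, std_error):
    """Ratio of an estimate to its standard error."""
    return estimate / std_error


def difference_z(est_a, se_a, est_b, se_b):
    """z-score of (est_a - est_b), assuming the two estimates are independent."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (est_a - est_b) / se_diff


z_main = z_score(main_est, main_se)
z_placebo = z_score(placebo_est, placebo_se)
z_diff = difference_z(main_est, main_se, placebo_est, placebo_se)

print(f"main:       z = {z_main:.2f}, significant = {abs(z_main) > CRITICAL}")
print(f"placebo:    z = {z_placebo:.2f}, significant = {abs(z_placebo) > CRITICAL}")
print(f"difference: z = {z_diff:.2f}, significant = {abs(z_diff) > CRITICAL}")
```

With these inputs the main estimate clears the 5% threshold and the placebo does not, yet the difference between the two is itself well short of significance. A "passed" placebo test of this kind therefore says little on its own.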

Using Stacking to Average Bayesian Predictive Distributions (with Discussion)

Reposted from: https://andrewgelman.com/2018/09/26/using-stacking-to-average-bayesian-predictive-distributions-with-discussion/

Andrew

Posted on 2018-09-26

I’ve posted on this paper (by Yuling Yao, Aki Vehtari, Daniel Simpson, and myself) before, but now the final version has been published, along with a bunch of interesting discussions and our rejoinder.


Advantages of Online Data Science Courses

Reposted from: https://dimensionless.in/advantages-online-data-science-courses/

Kartik Singh

Posted on 2018-09-26


R Packages worth a look

Reposted from: https://advanceddataanalytics.net/2018/09/26/r-packages-worth-a-look-1284/

Michael Laux

Posted on 2018-09-26

R Bindings for ‘Selenium WebDriver’ (RSelenium) Provides a set of R bindings for the ‘Selenium 2.0 WebDriver’ (see <https://… …


Document worth reading: “Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges”

Reposted from: https://advanceddataanalytics.net/2018/09/26/document-worth-reading-human-machine-inference-networks-for-smart-decision-making-opportunities-and-challenges/

Michael Laux

Posted on 2018-09-26

The emerging paradigm of Human-Machine Inference Networks (HuMaINs) combines complementary cognitive strengths of humans and machines in an intelligent manner to tackle various inference tasks and achieves higher performance than either humans or machines by themselves. While inference performance optimization techniques for human-only or sensor-only networks are quite mature, HuMaINs require novel signal processing and machine learning solutions. In this paper, we present an overview of the HuMaINs architecture with a focus on three main issues that include architecture design, inference algorithms including security/privacy challenges, and application areas/use cases.

Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges


The Price of Transformation

Reposted from: http://blog.shakirm.com/2018/09/the-price-of-transformation/

shakirm

Posted on 2018-09-26

· Read in 6 minutes · 1200 words ·


3-D shadow maps in R: the rayshader package

Reposted from: http://blog.revolutionanalytics.com/2018/09/raytracer.html

David Smith

Posted on 2018-09-26

Data scientists often work with geographic data that needs to be visualized on a map, and sometimes the maps themselves are the data. The data is often located in two-dimensional space (latitude and longitude), but for some applications we have a third dimension as well: elevation. We could represent the elevations using contours, color, or 3-D perspective, but with the new rayshader package for R by Tyler Morgan-Wall, it’s easy to visualize such maps as 3-D relief maps complete with shadows, perspective and depth of field:


R Packages worth a look

Reposted from: https://advanceddataanalytics.net/2018/09/26/r-packages-worth-a-look-1285/

Michael Laux

Posted on 2018-09-26

Apache Thrift Client Server ([https://thrift](


Job opening at CDC: “The Statistician will play a central role in guiding the statistical methods of all major projects of the Epidemiology and Prevention Branch of the CDC Influenza Division, and aid in designing, analyzing, and interpreting research intended to understand the burden of influenza in the US and internationally and identify the best influenza vaccines and vaccine strategies.”

Reposted from: https://andrewgelman.com/2018/09/25/job-opening-at-cdc-the-statistician-will-play-a-central-role-in-guiding-the-statistical-methods-of-all-major-projects-of-the-epidemiology-and-prevention-branch-of-the-cdc-influenza-division-and-a/

Andrew

Posted on 2018-09-26

Vacancy Information: Mathematical Statistician, GS-1529-14


If you did not already know

Reposted from: https://advanceddataanalytics.net/2018/09/26/if-you-did-not-already-know-494/

Michael Laux

Posted on 2018-09-26

Distributed Data Shuffling: Data shuffling of training data among different computing nodes (workers) has been identified as a core element for improving the statistical performance of modern large-scale machine learning algorithms. Data shuffling is often considered one of the most significant bottlenecks in such systems due to the heavy communication load. Under a master-worker architecture (where a master has access to the entire dataset and only communication between the master and workers is allowed), coding has recently been proved to considerably reduce the communication load. In this work, we consider a different communication paradigm referred to as distributed data shuffling, where workers, connected by a shared link, are allowed to communicate with one another while no communication between the master and workers is allowed. Under the constraint of uncoded cache placement, we first propose a general coded distributed data shuffling scheme, which achieves the optimal communication load within a factor of two. Then, we propose an improved scheme that achieves exact optimality for either large memory size or at most four workers in the system. …
