Machine learning can effectively be leveraged to support NGOs, but it often requires sponsorship and organizational support to generate the resources needed for such data explorations. The fourth annual Teradata University Network (TUN) Data Challenge Competition provided this support, and focused on combating Multiple Sclerosis (MS).
Amazon SageMaker notebooks now support Git integration for increased persistence, collaboration, and reproducibility
It’s now possible to associate GitHub, AWS CodeCommit, and any self-hosted Git repository with Amazon SageMaker notebook instances to easily and securely collaborate and ensure version-control with Jupyter Notebooks. In this blog post, I’ll elaborate on the benefits of using Git-based version-control systems and how to set up your notebook instances to work with Git repositories.
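As a rough sketch of what that setup can look like from code, the helpers below build the request payloads for the SageMaker `create_code_repository` and `create_notebook_instance` API operations and pass them to a client. The helper names, the repository URL, the role ARN, and the instance type are illustrative assumptions, not values taken from the post.

```python
# Minimal sketch: register a Git repository with SageMaker and attach it
# to a new notebook instance. All names and ARNs used with these helpers
# are placeholders (assumptions), not values from the original post.

def code_repository_request(name, url, branch="master", secret_arn=None):
    """Build the payload for the CreateCodeRepository operation."""
    git_config = {"RepositoryUrl": url, "Branch": branch}
    if secret_arn is not None:
        # Only needed for private repositories whose credentials are
        # stored in AWS Secrets Manager.
        git_config["SecretArn"] = secret_arn
    return {"CodeRepositoryName": name, "GitConfig": git_config}


def notebook_instance_request(name, instance_type, role_arn, default_repo):
    """Build the payload for the CreateNotebookInstance operation."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        "DefaultCodeRepository": default_repo,
    }


def associate_repository(client, repo_request, notebook_request):
    """Send both requests; `client` is expected to be a SageMaker client."""
    client.create_code_repository(**repo_request)
    client.create_notebook_instance(**notebook_request)
```

With boto3 installed and AWS credentials configured, `client` would typically be `boto3.client("sagemaker")`. Keeping payload construction separate from the calls makes the requests easy to inspect before anything is created.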
Create 3D County Maps Using Density as Z-Axis
This is going to be a bit longer than some of my previous tutorials as it covers a walkthrough for sourcing data, scraping tables, cleaning, and generating the 3D view below which you can springboard from with the help of the rgl package. The heavy lifting is done with ggplot and rayshader.
If you did not already know
SoaAlloc
We propose SoaAlloc, a dynamic object allocator for Single-Method Multiple-Objects applications in CUDA. SoaAlloc is the first allocator for GPUs that (a) arranges allocations in a SIMD-friendly Structure of Arrays (SOA) data layout, (b) provides a do-all operation for maximizing the benefit of SOA, and (c) is on par with state-of-the-art memory allocators for raw (de)allocation time. Our benchmarks show that the SOA layout leads to significantly better memory bandwidth utilization, resulting in a 2x speedup of application code. …
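To make the layout difference concrete, here is a small CPU-side sketch in Python contrasting an Array of Structures with a Structure of Arrays. It is not SoaAlloc's API and says nothing about GPU performance; the `Body` and `BodiesSoA` names and the `step_all` method (a loose analogue of a do-all operation) are hypothetical.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Body:
    """Array-of-Structures element: each object keeps its own fields."""
    x: float
    v: float


def step_aos(bodies, dt):
    # Fields of one object sit together; the same field of different
    # objects is scattered across separate objects.
    for b in bodies:
        b.x += b.v * dt


class BodiesSoA:
    """Structure of Arrays: one contiguous array per field."""

    def __init__(self, xs, vs):
        self.x = array("d", xs)  # all positions, stored contiguously
        self.v = array("d", vs)  # all velocities, stored contiguously

    def step_all(self, dt):
        # Apply the same update to every object by walking the field
        # arrays with a shared index.
        for i in range(len(self.x)):
            self.x[i] += self.v[i] * dt
```

Both versions compute the same update. The SOA variant keeps equal fields adjacent in memory, which is the property the abstract describes as SIMD-friendly.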
October 2018: “Top 40” New Packages
One hundred eighty-five new packages made it to CRAN in October. Here are my picks for the “Top 40” in eight categories: Computational Methods, Data, Machine Learning, Medicine, Science, Statistics, Utilities, and Visualization.
NYC buses: simple Cubist regression
- Advanced Modeling
What’s new on arXiv
GaterNet: Dynamic Filter Selection in Convolutional Neural Network via a Dedicated Global Gating Network
Teaching kids data visualization
Jonathan Schwabish gave his fourth-grade son’s class a lesson on data visualization. He wrote about his experience:
Linking Data Science Activities to Business Initiatives Using the Hypothesis Development Canvas
In the blog “Data Science ‘Paint by the Numbers’ with the Hypothesis Development Canvas,” I introduced the Hypothesis Development Canvas as a tool for linking an organization’s data science activities with the organization’s strategic business initiatives (see Figure 1).
Community Call Summary – Code Review in the Lab
rOpenSci - open tools for open science
Although there are increasing incentives and pressures for researchers to share code (even for projects that are not essentially computational), practices vary widely and standards are mostly non-existent. The practice of reviewing code then falls to researchers and research groups before publication. With that in mind, rOpenSci hosted a discussion thread and a community call to bring together different researchers for a conversation about current practices and challenges in reviewing code in the lab.