Archive | SunJackson Blog

Great! 3974 posts in total so far. Keep it up.

2019

Whats new on arXiv

01-13

If you did not already know

01-13

Distilled News

01-13

Magister Dixit

01-13

Generating Synthetic Data Sets with ‘synthpop’ in R

01-13

Making sense of the METS and ALTO XML standards

01-13

R Packages worth a look

01-13

XmR Chart | Step-by-Step Guide by Hand and with R

01-13

Showing a difference in means between two groups

01-13

Practical Data Science with R, 2nd Edition discount!

01-12

How to Learn Python in 30 days

01-12

CES 2019

01-12

Whats new on arXiv

01-12

Document worth reading： “Deep learning in agriculture： A survey”

01-12

I walk the (train) line – part deux – the weight loss continues

01-12

How to combine Multiple ggplot Plots to make Publication-ready Plots

01-12

10 years of playback history on Last.FM： "Just sit back and listen"

01-12

Why Vegetarians Miss Fewer Flights – Five Bizarre Insights from Data

01-12

Practical Data Science with R, 2nd Edition discount!

01-12

Document worth reading： “Deep Neural Network Approximation Theory”

01-12

How simpleshow uses Amazon Polly to voice stories in their explainer videos

01-11

The SIAM Book Series on Data Science

01-11

Practical Apache Spark in 10 Minutes

01-11

The year in AI/Machine Learning advances： Xavier Amatriain 2018 Roundup

01-11

pinp 0.0.7： More small YAML options

01-11

If you did not already know

01-11

Document worth reading： “Machine Learning in Official Statistics”

01-11

How to Remove Unfair Bias From Your AI

01-11

R Tip： Use seqi() For Indexes

01-11

R Packages worth a look

01-11

Satellite imagery generation with Generative Adversarial Networks (GANs)

01-11

Parallelize a For-Loop by Rewriting it as an Lapply Call

01-11

epubr 0.6.0 CRAN release

01-11

Ensure consistency in data processing code between training and inference in Amazon SageMaker

01-11

Pear Therapeutics： Data Scientist [San Francisco, CA]

01-11

R Tip： Use seqi() For Indexes

01-11

Visualizing the Asian Cup with R!

01-11

R Packages worth a look

01-11

Add a static pdf vignette to an R package

01-11

Murmuration： Data Scientist [New York, NY]

01-10

Roll Your Own Federal Government Shutdown-caused SSL Certificate Expiration Monitor in R

01-10

vitae： Dynamic CVs with R Markdown

01-10

Whats new on arXiv

01-10

Linguistic Signals of Album Quality： A Predictive Analysis of Pitchfork Review Scores Using Quanteda

01-10

Explainable Artificial Intelligence

01-10

Hackathon Winner Interview： Friendship University of Russia | Kaggle University Club

01-10

My presentations on ‘Elements of Neural Networks & Deep Learning’ -Part1,2,3

01-10

Biggest Deep Learning Summit – Special KDnuggets Offer

01-10

Automated and continuous deployment of Amazon SageMaker models with AWS Step Functions

01-10

Babe Didrikson Zaharias (2) vs. Adam Schiff; Sid Caesar advances

01-10

Tutorial： Time Series Analysis with Pandas

01-10

Who is the greatest finisher in soccer?

01-10

Alibaba acquires Data Artisans?

01-10

Document worth reading： “Universality of Deep Convolutional Neural Networks”

01-10

Who is the greatest finisher in soccer?

01-10

AI in Healthcare (With a case study)

01-10

Top Skills Needed to Work as Data Scientist in iGaming

01-10

MRP (multilevel regression and poststratification; Mister P)： Clearing up misunderstandings about

01-10

Principles of Database Management： The Practical Guide to Storing, Managing and Analyzing Big and Small Data

01-10

If you did not already know

01-10

Considering sensitivity to unmeasured confounding： part 2

01-10

“discover feature relationships” – new EDA tool

01-10

10 Companies to Work with After a Data Science Course

01-10

✚ Repetitions, Data Analysis as Brainstorm

01-10

MS in Applied Data Science Online – which track is right for you?

01-10

The Role of the Data Engineer is Changing

01-10

Python Patterns： max Instead of if

01-10

Untitled

01-13

Core Principles of Sustainable Data Science, Machine Learning and AI Product Development： Research as a core driver

01-09

How Data Scientists Think - A Mini Case Study

01-09

R Packages worth a look

01-09

Top 10 Books on NLP and Text Analysis

01-09

4 Myths of Big Data and 4 Ways to Improve with Deep Data

01-09

Whats new on arXiv

01-09

KDnuggets™ News 19：n02, Jan 9： The cold start problem： how to build your machine learning portfolio; 5 Best Data Visualization Libraries

01-09

R Packages worth a look

01-09

On the Road to 0.8.0 — Some Additional New Features Coming in the sergeant Package

01-09

How do Convolutional Neural Nets (CNNs) learn? + Keras example

01-09

Updated Review： jamovi User Interface to R

01-09

A deep dive into glmnet： offset

01-09

Top 5 Data Science Courses in 2019

01-09

Top December Stories： Why You Shouldn’t be a Data Science Generalist

01-09

Magister Dixit

01-09

Distilled News

01-09

Understanding the maths of Computed Tomography (CT) scans

01-09

Top KDnuggets tweets, Jan 02-08： 10 Free Must-Read Books for Machine Learning and Data Science

01-09

An even better rOpenSci website with Hugo

01-09

ML and NLP Publications in 2018

01-09

Nemirovski’s acceleration

01-09

Industry leaders to speak at Mega-PAW, Las Vegas – June 16-20

01-09

Ed Sullivan (3) vs. Sid Caesar; DJ Jazzy Jeff advances

01-09

Learn Python for Data Science From Scratch

01-09

The Right Kind of Internal Motivation Can Improve Your Studies

01-08

“The Book of Why” by Pearl and Mackenzie

01-08

NYU Stern Fubon Center for Technology, Business and Innovation： Fubon Center Faculty Fellow [New York, NY]

01-08

Document worth reading： “Recent Advances in Deep Learning： An Overview”

01-08

AzureR packages now on CRAN

01-08

Apply to NYU Stern’s MS in Business Analytics

01-08

You did a sentiment analysis with tidytext but you forgot to do dependency parsing to answer WHY is something positive/negative

01-08

Whats new on arXiv

01-08

Philip Roth (4) vs. DJ Jazzy Jeff; Jim Thorpe advances

01-08

A Non-Compromising Approach to Privacy-Preserving Personalized Services

01-08

French Baccalaureate Results

01-08

Where does .Renviron live on Citrix?

01-08

AI Gotchas (& How to Avoid Them)

01-08

If you did not already know

01-08

A Beautiful 2 by 2 Matrix Identity

01-08

Analysis of South African Funds

01-08

Top Stories, Dec 24 – Jan 6： The Essence of Machine Learning; Papers with Code： A Fantastic GitHub Resource for Machine Learning

01-08

From a Night of Insomnia to Competition Winner | An Interview with Martin Barron

01-08

Did she really live 122 years?

01-08

NLP Overview： Modern Deep Learning Techniques Applied to Natural Language Processing

01-08

5 things that happened in Data Science in 2018

01-08

Don’t reinvent the wheel： making use of shiny extension packages. Join MünsteR for our next meetup!

01-08

Document worth reading： “I can see clearly now： reinterpreting statistical significance”

01-08

Dow Jones Stock Market Index (3/4)： Log Returns GARCH Model

01-08

Do something for yourself in 2019

01-08

AzureR packages now on CRAN

01-08

RcppStreams 0.1.2

01-07

Document worth reading： “Which Knowledge Graph Is Best for Me?”

01-07

The Ultrarich's dirty secret： not paying taxes

01-07

The Five Best Data Visualization Libraries

01-07

On deck for the first half of 2019

01-07

February 21st & 22nd： End-2-End from a Keras/TensorFlow model to production

01-07

Tutorial： An app in R shiny visualizing biopsy data — in a pharmaceutical company

01-07

Comparison of the Text Distance Metrics

01-07

Marketing analytics with greybox

01-07

The seminar speaker contest begins： Jim Thorpe (1) vs. John Oliver

01-07

7 Reasons for Policy Professionals to Get Pumped About R Programming in 2019

01-07

RTest： pretty testing of R packages

01-07

Role of Computer Science in Data Science World

01-07

Part 2, further comments on OfS grade-inflation report

01-07

Stock Price prediction using ML and DL

01-07

Hackers beware： Bootstrap sampling may be harmful

01-07

Dow Jones Stock Market Index (2/4)： Trade Volume Exploratory Analysis

01-07

The Data Science Event You Need in 2019

01-07

Auto-Keras and AutoML： A Getting Started Guide

01-07

Rev Summit for Data Science Leaders featuring Daniel Kahneman

01-07

BH 1.69.0-1 on CRAN

01-07

Distilled News

01-07

Looking back on 2018, looking to 2019

01-07

If you did not already know

01-07

Long-awaited updates to htmlTable

01-07

Whats new on arXiv

01-06

Announcing the ultimate seminar speaker contest： 2019 edition!

01-06

R Packages worth a look

01-06

Distilled News

01-06

Rotagrams

01-06

Timing the Same Algorithm in R, Python, and C++

01-06

R Packages worth a look

01-06

If you did not already know

01-06

Co-integration and Mean Reverting Portfolio

01-06

Timing the Same Algorithm in R, Python, and C++

01-06

Scaling H2O analytics with AWS and p(f)urrr (Part 1)

01-06

gganimation for the nation

01-06

2018 Volatility Recap

01-06

If you did not already know

01-06

2018 Winners and Losers

01-06

Magister Dixit

01-06

Distilled News

01-05

Distilled News

01-05

Here’s why 2019 is a great year to start with R： A story of 10 year old R code then and now

01-05

If you did not already know

01-05

Document worth reading： “Searching Toward Pareto-Optimal Device-Aware Neural Architectures”

01-05

Structural Analisys of Bayesian VARs with an example using the Brazilian Development Bank

01-05

“Dissolving the Fermi Paradox”

01-05

Document worth reading： “Recent Research Advances on Interactive Machine Learning”

01-05

Maryville University： Business Intelligence Analyst [St. Louis, MO]

01-04

My Activities in 2018 with R and ShinyApp

01-04

Displaying our “R – Quality Control Individual Range Chart Made Nice” inside a Java web App using AJAX – How To.

01-04

Back by popular demand . . . The Greatest Seminar Speaker contest!

01-04

If you did not already know

01-04

R Packages worth a look

01-04

In case you missed it： December 2018 roundup

01-04

What to do when your training and testing data come from different distributions

01-04

Strata Data SF 2019 KDnuggets Offer

01-04

R Packages worth a look

01-04

The cold start problem： how to build your machine learning portfolio

01-04

Looking into 19th century ads from a Luxembourguish newspaper with R

01-04

Robin Pemantle’s updated bag of tricks for math teaching!

01-04

Math for Machine Learning

01-04

What does it mean to write “vectorized” code in R?

01-04

Southern Illinois University Edwardsville： Director of the Center for Predictive Analytics/(Associate) Professor of Mathematics and Statistics [Edwardsville, IL]

01-04

Whats new on arXiv

01-04

Whats new on arXiv

01-04

My R Take in Advent of Code – Day 5

01-03

2018 Traffic Data

01-03

Improve your AI and Machine Learning skills at AI NEXTCon in Seattle, Jan 23-27

01-03

5 Ways in which Data Science is Revolutionizing Web Development

01-03

Whats new on arXiv

01-03

Check Machin-like formulae with arbitrary-precision arithmetic

01-03

Top 5 Data Visualization Tools for 2019

01-03

Published in 2018

01-03

Magister Dixit

01-03

How to Write a Great Data Science Resume

01-03

Adding Firebase Authentication to Shiny

01-03

Purr yourself into a math genius

01-03

Approaches to Text Summarization： An Overview

01-03

‘data：’ Scraping & Chart Reproduction ： Arrows of Environmental Destruction

01-03

If you did not already know

01-03

Ensemble Learning： 5 Main Approaches

01-03

Icon making with ggplot2 and magick

01-03

Notebooks from the Practical AI Workshop

01-03

KDnuggets™ News 19：n01, Jan 3： The Essence of Machine Learning; A Guide to Decision Trees for Machine Learning and Data Science

01-03

R Packages worth a look

01-03

✚ Avoiding D3, Using D3, and Why I Use D3

01-03

x-mas tRees with gganimate, ggplot, plotly and friends

01-03

Data Notes： Malaria Detection with FastAI

01-03

Applying for a PhD program in visualization

01-03

Whats new on arXiv

01-03

gganimate has transitioned to a state of release

01-03

Document worth reading： “Recommendation System based on Semantic Scholar Mining and Topic modeling： A behavioral analysis of researchers from six conferences”

01-03

Top KDnuggets tweets, Dec 19 – Jan 1： Deep Learning Cheat Sheets

01-02

Document worth reading： “A Review for Weighted MinHash Algorithms”

01-02

How to Learn Python in 30 days

01-02

3 More Google Colab Environment Management Tips

01-02

Considering sensitivity to unmeasured confounding： part 1

01-02

Music listener statistics： last.fm’s last.year as an R package

01-02

My book ‘Practical Machine Learning in R and Python： Third edition’ on Amazon

01-02

Why Learning Data Science Live is Better than Self-Paced Learning

01-02

Document worth reading： “Neural Style Transfer： A Review”

01-02

Apache Drill 1.15.0 + sergeant 0.8.0 = pcapng Support, Proper Column Types & Mounds of New Metadata

01-02

Magister Dixit

01-02

Entering and Exiting 2018

01-02

Office for Students report on “grade inflation”

01-02

What to do when you read a paper and it’s full of errors and the author won’t share the data or be open about the analysis?

01-02

Dataviz Course Packet Quickstart

01-02

The Backpropagation Algorithm Demystified

01-02

Advanced Jupyter Notebooks： A Tutorial

01-02

How to Meet Your New Years Resolutions in 2019 (Udemy Coupons $9.99)

01-01

“Principles of posterior visualization”

01-01

Your and my 2019 R goals

01-01

Seeing the wood for the trees

01-01

New Year's Resolutions 2019

01-01

If you did not already know

01-01

Whats new on arXiv

01-01

If you did not already know

01-01

Document worth reading： “Instance-Level Explanations for Fraud Detection： A Case Study”

01-01

Simulating Multi-state Models with R

01-01

Nimble tweak to use specific python version or virtual environment in RStudio

01-01

R Packages worth a look

01-01

R Packages worth a look

01-01

2018

Silent Duels and an Old Paper of Restrepo

12-31

Distilled News

12-31

Introducing RcppDynProg

12-31

R Packages worth a look

12-31

Introducing RcppDynProg

12-31

Authority figures in psychology spread more happy talk, still don’t get the point that much of the published, celebrated, and publicized work in their field is no good (Part 2)

12-31

Whats new on arXiv

12-31

New Year's Resolution： Help Data Scientists Help You

12-31

Leaf Plant Classification： Statistical Learning Model – Part 2

12-31

Document worth reading： “A Survey： Non-Orthogonal Multiple Access with Compressed Sensing Multiuser Detection for mMTC”

12-31

2018.

12-31

Keras Conv2D and Convolutional Layers

12-31

Exploring 2018 R-bloggers & R Weekly Posts with Feedly & the ‘seymour’ package

12-31

Center for Ultrasound Research and Translation, Massachusetts General Hospital： Post-Doctoral Scholar / Research Scientist [Boston, MA]

12-31

Papers with Code： A Fantastic GitHub Resource for Machine Learning

12-31

Import AI 127： Why language AI advancements may make Google more competitive; COCO image captioning systems don’t live up to the hype, and Amazon sees 3X growth in voice shopping via Alexa

12-31

Good Feature Building Techniques and Tricks for Kaggle

12-31

R or Python? Why not both? Using Anaconda Python within R with {reticulate}

12-30

This dance, it’s like a weapon： Radiohead’s and Beck’s danceability, valence, popularity, and more from the LastFM and Spotify APIs

12-30

Sudoku Solver

12-30

Whats new on arXiv

12-30

If you did not already know

12-30

Distilled News

12-30

If you did not already know

12-30

Combining apparently contradictory evidence

12-30

Document worth reading： “The importance of being dissimilar in Recommendation”

12-30

Distilled News

12-29

Leaf Plant Classification： An Exploratory Analysis – Part 1

12-29

“Check yourself before you wreck yourself： Assessing discrete choice models through predictive simulations”

12-29

If you did not already know

12-29

Whats new on arXiv

12-29

R Packages worth a look

12-29

Part 5： Code corrections to optimism corrected bootstrapping series

12-29

Document worth reading： “Learnable： Theory vs Applications”

12-28

Deep Learning for Media Content

12-28

Manning Countdown to 2019 – Big Deals on AI, Data Science, Machine Learning books and videos

12-28

Using emojis as scatterplot points

12-28

Part 4： Why does bias occur in optimism corrected bootstrapping?

12-28

Supervised Learning： Model Popularity from Past to Present

12-28

My R Take on Advent of Code – Day 3

12-28

R Packages worth a look

12-28

My

12-28

The business case for federated learning

12-28

Fine-tuning for Natural Language Processing

12-28

Whats new on arXiv

12-28

R Packages worth a look

12-28

Comparison of the Top Speech Processing APIs

12-28

Use AWS Machine Learning to Analyze Customer Calls from Contact Centers (Part 2)： Automate, Deploy, and Visualize Analytics using Amazon Transcribe, Amazon Comprehend, AWS CloudFormation, and Amazon QuickSight

12-28

The Essence of Machine Learning

12-28

Document worth reading： “Generalization in Machine Learning via Analytical Learning Theory”

12-28

Synthetic Data Generation： A must-have skill for new data scientists

12-27

World’s Biggest Deep Learning Summit 3 weeks away

12-27

Christmas elves puzzle

12-27

Using the Economics Value Curve to Drive Digital Transformation

12-27

Document worth reading： “The Gap of Semantic Parsing： A Survey on Automatic Math Word Problem Solvers”

12-27

Some fun with {gganimate}

12-27

Clustering the Bible

12-27

French Mortality Poster

12-27

A Case For Explainable AI & Machine Learning

12-27

If you did not already know

12-27

Best Data Visualization Projects of 2018

12-27

R Packages worth a look

12-27

The Christmas Eve Selloff was a Classic Capitulation

12-27

Document worth reading： “Computational Power and the Social Impact of Artificial Intelligence”

12-27

9 Reasons Excel Users Should Consider Learning Programming

12-27

Who is a Data Scientist?

12-27

Part 3： Two more implementations of optimism corrected bootstrapping show shocking bias

12-27

The Christmas Eve Selloff was a Classic Capitulation

12-27

How AI Will Change Brick-and-Mortar Retail in 2019

12-26

Part 2： Optimism corrected bootstrapping is definitely bias, further evidence

12-26

Following your gut, following the data

12-26

If you did not already know

12-26

Statistical Assessments of AUC

12-26

If you did not already know

12-26

Will Julia Replace Python and R for Data Science?

12-26

BERT： State of the Art NLP Model, Explained

12-26

Miami University： Assistant Provost for Institutional Research and Effectiveness [Oxford, OH]

12-26

Deep learning in Satellite imagery

12-26

Document worth reading： “A Survey of Knowledge Representation and Retrieval for Learning in Service Robotics”

12-26

Le Monde puzzle [#1076]

12-26

Finally, You Can Plot H2O Decision Trees in R

12-26

R Packages worth a look

12-26

Very shiny holidays!

12-26

Data Science & ML ： A Complete Interview Guide

12-26

If you did not already know

12-25

Optimism corrected bootstrapping： a problematic method

12-25

Whats new on arXiv

12-25

Distilled News

12-25

At Year's End： 2018

12-25

“Thus, a loss aversion principle is rendered superfluous to an account of the phenomena it was introduced to explain.”

12-25

The Need for Speed Part 2： C++ vs. Fortran vs. C

12-24

Magister Dixit

12-24

A Guide to Decision Trees for Machine Learning and Data Science

12-24

If you did not already know

12-24

4 Reasons Santa Needs Machine Learning & AI

12-24

Objects types and some useful R functions for beginners

12-24

Dreaming of a white Christmas – with ggmap in R

12-24

University of Virginia： Faculty, Open Rank Model and Simulation at the Human-Technology Frontier [Charlottesville, VA]

12-24

The most practical causal inference book I’ve read (is still a draft)

12-24

How to Land a Job As a Data Scientist in 2019

12-24

How to use Keras fit and fit_generator (a hands-on tutorial)

12-24

June is applied regression exam month!

12-24

Top Stories, Dec 17-23： Why You Shouldn’t be a Data Science Generalist; 10 More Must-See Free Courses for Machine Learning and Data Science

12-24

Twas the Night Before Analysis or A Visit from the Chief Data Scientist

12-24

Text classification with tidy data principles

12-24

R 101

12-24

Zak David expresses critical views of some published research in empirical quantitative finance

12-24

Pivot Billions and Deep Learning enhanced trading models achieve 100% net profit

12-24

Interspeech 2018： Highlights for Data Scientists

12-24

How Miguel Got 3 Data Science Job Offers Fast With Dataquest

12-24

R Packages worth a look

12-23

ShinyProxy Christmas Release

12-23

The Semantic Web： Where is it now?

12-23

Custom JavaScript, CSS and HTML in Shiny

12-23

R Packages worth a look

12-23

If you did not already know

12-23

Certifiably Gone Phishing

12-23

“When Both Men and Women Drop Out of the Labor Force, Why Do Economists Only Ask About Men?”

12-23

Simulating Persian Monarchs gameplay by @ellis2013nz

12-22

Document worth reading： “Learning to Reason”

12-22

5 amazing free tools that can help with publishing R results and blogging

12-22

Bubble Packed Chart with R using packcircles package

12-22

Carol Nickerson explains what those mysterious diagrams were saying

12-22

Whats new on arXiv

12-22

Day 22 – little helper get_files

12-22

The Bear is Here

12-22

R Packages worth a look

12-22

Re-creating a Voronoi-Style Map with R

12-22

The Bear is Here

12-22

The Riddler： Santa Needs Some Help With Math

12-22

If you did not already know

12-21

Using the tidyverse for more than data manipulation： estimating pi with Monte Carlo methods

12-21

Blogdown – shortcode for radix-like Bibtex

12-21

Does imputing model labels using the model predictions can improve it’s performance?

12-21

Gold-Mining Week 16 (2018)

12-21

If you did not already know

12-21

November 2018： “Top 40” New Packages

12-21

Top 10 Data Science Tools (other than SQL Python R)

12-21

R Packages worth a look

12-21

Distilled News

12-21

The causal hype ratchet

12-21

Six Steps to Master Machine Learning with Data Preparation

12-21

Transcribe speech in three new languages： French, Italian, and Brazilian Portuguese

12-21

Feature engineering, Explained

12-21

Machine Learning Explainability vs Interpretability： Two concepts that could help restore trust in AI

12-20

Spelling 2.0： Improved Markdown and RStudio Support

12-20

R 3.5.2 now available

12-20

How to Scrape Data from a JavaScript Website with R

12-20

R 3.5.2 now available

12-20

Power your website with on-demand translated reviews using Amazon Translate

12-20

10 More Must-See Free Courses for Machine Learning and Data Science

12-20

BH 1.69.0-0 pre-releases and three required changes

12-20

R Packages worth a look

12-20

LightOn： Forward We Go !

12-20

✚ Tufte Tweet Follow-up; Visualization Tools and Resources Roundup for December 2018

12-20

The Key to Getting a Data Science Job, According to Briana Brownell

12-20

Your AI journey… and Happy Holidays!

12-20

Examining the Tweeting Patterns of Prominent Crossfit Gyms

12-20

Miami University： Director of the Center for Analytics & Data Science (CADS) [Oxford, OH]

12-20

Distilled News

12-20

Exploring model fit by looking at a histogram of a posterior simulation draw of a set of parameters in a hierarchical model

12-20

Whats new on arXiv

12-20

Document worth reading： “A second-quantised Shannon theory”

12-20

Amazon SageMaker adds Scikit-Learn support

12-20

Whats new on arXiv

12-20

Easily train models using datasets labeled by Amazon SageMaker Ground Truth

12-20

Day 20 – little helper char_replace

12-20

If you did not already know

12-20

The importance of Data Analytics skills in today’s MBA roles

12-19

Hackathon Winner Interview： Penn State | Kaggle University Club

12-19

The brain as a neural network： this is why we can’t get along

12-19

Dataiku Series C： New Year, New Chapter

12-19

KDnuggets™ News 18：n48, Dec 19： Why You Shouldn’t be a Data Science Generalist; Industry Data Science & Machine Learning 2019 Predictions

12-19

2019

Untitled

01-13

2018

4 Strategies to Deal With Large Datasets Using Pandas

12-19

UnitedHealth Group： Director, Data Science [Minnetonka, MN]

12-19

Kent State University： Assistant/Associate Professor – Business Analytics/Information Systems [Kent, OH]

12-19

Top KDnuggets tweets, Dec 12-18： Deep Learning Cheat Sheets; The Nate Silver vs. Nassim Taleb Twitter War

12-19

Distilled News

12-19

Distilled News

12-19

Data Science & ML ： A Complete Interview Guide

12-19

Whats new on arXiv

12-19

Document worth reading： “Mobile big data analysis with machine learning”

12-19

Optimal Picture Viewing Distance

12-19

The Netflix Data War

12-19

FAQ on ICML 2019 Code Submission Policy

12-19

R Packages worth a look

12-19

Think Twice Before You Accept That Fancy Data Science Job

12-19

When “nudge” doesn’t work： Medication Reminders to Outcomes After Myocardial Infarction

12-19

AI, Machine Learning and Data Science Roundup： December 2018

12-19

Rotary

12-19

Top Python Libraries in 2018 in Data Science, Deep Learning, Machine Learning

12-19

Data, movies and ggplot2

12-19

Heavy Tailed Self Regularization in Deep Neural Nets： 1 year of research

12-18

Alternative approaches to scaling Shiny with RStudio Shiny Server, ShinyProxy or custom architecture.

12-18

Analyzing contact center calls—Part 1： Use Amazon Transcribe and Amazon Comprehend to analyze customer sentiment

12-18

AzureStor： an R package for working with Azure storage

12-18

R Packages worth a look

12-18

Industry Predictions： AI, Machine Learning, Analytics & Data Science Main Developments in 2018 and Key Trends for 2019

12-18

Highlights of 2018

12-18

Document worth reading： “Are screening methods useful in feature selection? An empirical study”

12-18

So you want to play a pRank in R…?

12-18

Document worth reading： “A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions”

12-18

vtreat Variable Importance

12-18

AzureStor： an R package for working with Azure storage

12-18

Classifying yin and yang using MRI

12-18

University of Rhode Island： Data Scientist, DataSpark (2 Positions) [Kingston, RI]

12-18

How will automation tools change data science?

12-18

Whats new on arXiv

12-18

Statistics in Glaucoma： Part III

12-18

Modern reproduction of 1847 geometry books

12-18

Comparing racism from different eras： If only Tucker Carlson had been around in the 1950s he could’ve been a New York Intellectual.

12-18

Exploring the Data Jungle Free eBook

12-18

If you did not already know

12-18

If you did not already know

12-18

vtreat Variable Importance

12-18

Magister Dixit

12-18

All the (NBA) box scores you ever wanted

12-18

New public course on Successfully Delivering Data Science Projects for Feb 1st

12-18

Distilled News

12-18

LoyaltyOne： Consultant Category Manager / Analyst, Client Services [Westborough, MA]

12-17

Vanguard： Senior AI Architect [Malvern, PA]

12-17

2018 Year-in-Review： Machine Learning Open Source Projects & Frameworks

12-17

If you did not already know

12-17

Scalable multi-node training with TensorFlow

12-17

My R take on Advent of Code – Day 1

12-17

If you did not already know

12-17

Introduction to Statistics for Data Science

12-17

eBook： An Introduction to Active Learning

12-17

Top 10 Advantages of a Data Science Certification

12-17

R Packages worth a look

12-17

Meta-Learning For Better Machine Learning

12-17

Image Stitching with OpenCV and Python

12-17

Day 17 – little helper to_na

12-17

The 2019 PAW Business Agenda is Live – Super Early Bird expires this Friday

12-17

Whats new on arXiv

12-17

Introduction to Pandas, NumPy and RegEx in Python

12-17

Vanguard： Senior AI Engineer [Malvern, PA]

12-17

Phillips-Ouliaris Test For Cointegration

12-17

An R Shiny app to recognize flower species

12-17

LoyaltyOne： Associate Director, Client Services [Westborough, MA]

12-17

Why do sociologists (and bloggers) focus on the negative? 5 possible explanations. (A post in the style of Fabio Rojas)

12-17

Top Stories, Dec 10-16： Why You Shouldn’t be a Data Science Generalist; Machine Learning & AI Main Developments in 2018 and Key Trends for 2019

12-17

LoyaltyOne： Associate Director, CPG [Westborough, MA]

12-17

Distilled News

12-17

Document worth reading： “Coupled Ensembles of Neural Networks”

12-16

2018-13 Rendering HTML Content in R Graphics

12-16

R Packages worth a look

12-16

Document worth reading： “Gaussian Processes and Kernel Methods： A Review on Connections and Equivalences”

12-16

If you did not already know

12-16

Quoting Concatenate

12-16

Word associations from the Small World of Words

12-16

Surprise-hacking： “the narrative of blindness and illusion sells, and therefore continues to be the central thesis of popular books written by psychologists and cognitive scientists”

12-16

Minimum CRPS vs. maximum likelihood

12-16

Quoting Concatenate

12-16

linl 0.0.3： Micro release

12-15

RStudio Pandoc – HTML To Markdown

12-15

Data Scientist’s Dilemma – The Cold Start Problem

12-15

Neural Ordinary Differential Equations

12-15

Manipulate dates easily with {lubridate}

12-15

Distilled News

12-15

If you did not already know

12-15

Advent of Code： Most Popular Languages

12-15

“My advisor and I disagree on how we should carry out repeated cross-validation. We would love to have a third expert opinion…”

12-15

Day 15 – little helper sci_palette

12-15

If you did not already know

12-15

Six Sigma DMAIC Series in R – Part4

12-15

Request for comments on planned features for futile.logger 1.5

12-15

Easy CI/CD of GPU applications on Google Cloud including bare-metal using Gitlab and Kubernetes

12-14

Top Insights from 50 Chief Data Officers

12-14

Gift ideas for the R lovers

12-14

Day 14 – little helper print_fs

12-14

Learning R： A gentle introduction to higher-order functions

12-14

CBH Group： Sr Data Engineer [Perth, Australia]

12-14

LoyaltyOne： Consultant Category Manager / Analyst, Client Services [Westborough, MA]

12-14

running plot [and simulated annealing]

12-14

R Packages worth a look

12-14

In case you missed it： November 2018 roundup

12-14

Soft Actor Critic—Deep Reinforcement Learning with Real-World Robots

12-14

Document worth reading： “Small Sample Learning in Big Data Era”

12-14

LoyaltyOne： Manager, CPG [Westborough, MA]

12-14

NLP Breakthrough Imagenet Moment has arrived

12-14

If you did not already know

12-14

Top Stories of 2018： 9 Must-have skills you need to become a Data Scientist, updated; Python eats away at R： Top Software for Analytics, Data Science, Machine Learning in 2018

12-14

Spark + AI Summit： learn best practices in ML and DL, latest frameworks, and more – special KDnuggets offer

12-14

Pdftools 2.0： powerful pdf text extraction tools

12-14

Whats new on arXiv

12-14

Implementing ResNet with MXNET Gluon and Comet.ml for Image Classification

12-14

My book ‘Deep Learning from first principles：Second Edition’ now on Amazon

12-14

Why You Shouldn’t be a Data Science Generalist

12-14

A couple of thoughts regarding the hot hand fallacy fallacy

12-14

Distilled News

12-14

Solve any Image Classification Problem Quickly and Easily

12-13

RTutor： Better Incentive Contracts For Road Construction

12-13

Four Real-Life Machine Learning Use Cases

12-13

Oh, I hate it when work is criticized (or, in this case, fails in attempted replications) and then the original researchers don’t even consider the possibility that maybe in their original work they were inadvertently just finding patterns in noise.

12-13

Gold-Mining Week 15 (2018)

12-13

R Packages worth a look

12-13

2018

R community update： announcing sessions for useR Delhi December meetup

12-13

State of Deep Learning and Major Advances： H2 2018 Review

12-13

Are you ready to tackle the data-driven revolution?

12-13

Document worth reading： “AI Reasoning Systems： PAC and Applied Methods”

12-13

If you did not already know

12-13

Whats new on arXiv

12-13

Recreating the NBA lead tracker graphic

12-13

Yet another visualization of the Bayesian Beta-Binomial model

12-13

Cummins： Data Engineering Technical Specialist [Columbus, IN]

12-13

R Packages worth a look

12-13

2018

Cummins： Reliability Analytics Leader [Columbus, IN]

12-13

Reusable Pipelines in R

12-13

Amazon SageMaker Automatic Model Tuning now supports early stopping of training jobs

12-13

Top KDnuggets tweets, Dec 5-11： How to build a data science project from scratch; NeurIPS 2018 video talk collection

12-13

Cummins： Data Engineering Apps and Solutions Architect [Columbus, IN]

12-13

WNS Hackathon Solutions by Top Finishers

12-13

Reusable Pipelines in R

12-13

von Neumann Poker Analysis

12-13

Apps gather your location and then sell the data

12-13

Day 13 – little helper read_files

12-13

2018

MINDBODY： Business Intelligence Analyst II [San Luis Obispo, CA]

12-13

Four Approaches to Explaining AI and Machine Learning

12-12

What's the future of the pandas library?

12-12

Keras Hyperparameter Tuning in Google Colab Using Hyperas

12-12

Cummins： Advanced Analytics Systems Architect Principle [Columbus, IN]

12-12

Time series of Democratic/Republican vote share in House elections

12-12

Document worth reading： “Computing the Unique Information”

12-12

Intuit： Staff Data Scientist [Mountain View, CA]

12-12

Intuit： Staff Experimentation Data Scientist [Mountain View, CA]

12-12

R Packages worth a look

12-12

2018

How to deploy a predictive service to Kubernetes with R and the AzureContainers package

12-12

Teaching and Learning Materials for Data Visualization

12-12

Intuit： Staff Data Scientist – Business Analytics [Mountain View, CA]

12-12

Code for case study – Customer Churn with Keras/TensorFlow and H2O

12-12

Day 12 – little helper dive

12-12

If you did not already know

12-12

Cummins： Advanced Analytics Platform Principle Engineer [Columbus, IN]

12-12

10 Data Science Skills to Land your Dream Job in 2019

12-12

Network Centrality in R： New ways of measuring Centrality

12-12

Using ggplot2 for functional time series

12-12

2018

Scaling Multi-Agent Reinforcement Learning

12-12

Scraping the Turkey Accordion

12-12

I Spy with my Graphing Eye 📊 👁️

12-12

Distilled News

12-12

Reading List Faster With parallel, doParallel, and pbapply

12-12

Exploring the Gender Pay Gap with Publicly Available Data

12-12

My introductory course on Bayesian statistics

12-12

How to deploy a predictive service to Kubernetes with R and the AzureContainers package

12-12

KDnuggets™ News 18：n47, Dec 12： Common mistakes when doing machine learning; Here are the most popular Python IDEs / Editors

12-12

Single-Income Occupations

12-12

2018

Automated Web Scraping in R

12-11

P&G： Data Scientist – Machine Learning/NLP [Cincinnati, OH]

12-11

InformationAge： Will 2019 See the Automation of Automation and Push Up Salaries of Data Scientists?

12-11

Historic Wildfire Data： Exploratory Visualization in R

12-11

If you did not already know

12-11

CBH Group： Data Scientist [Perth, Australia]

12-11

R Packages worth a look

12-11

Intuit： Staff Data Scientist [Woodland Hills, CA and Mountain View, CA]

12-11

Sharing Modeling Pipelines in R

12-11

When cycling is faster than driving

12-11

2018

“Do you have any recommendations for useful priors when datasets are small?”

12-11

CBH Group： Sr Data Scientist [Perth, Australia]

12-11

The Role of Theory in Data Analysis

12-11

How to give money to the R project

12-11

Advanced News API search： leveraging DBpedia entity types

12-11

Sharing Modeling Pipelines in R

12-11

A Machine Learning Deep Dive [Webinar, Dec 13]

12-11

Machine Learning (ML) Essentials

12-11

DB connected R application on open-source Shiny server, part 1

12-11

Distilled News

12-11

2018

8 Data Science Projects to Build your Portfolio

12-11

Document worth reading： “Taxonomy of Big Data： A Survey”

12-11

Top November Stories： The Most in Demand Skills for Data Scientists; What is the Best Python IDE for Data Science?

12-11

Intuit： Staff Data Scientist [Mountain View, CA]

12-11

Machine Learning & AI Main Developments in 2018 and Key Trends for 2019

12-11

Day 11 – little helper trim

12-11

Introduction to Named Entity Recognition

12-11

Learning Machine Learning vs Learning Data Science

12-11

Let Automation Carry You from BI to AI in 2019

12-11

Le Monde puzzle [#1075]

12-11

2018

Whats new on arXiv

12-11

Prior distributions for covariance matrices

12-10

Whats new on arXiv

12-10

Enter the

12-10

Great post Yash!

12-10

Whats new on arXiv

12-10

Whats new on arXiv

12-10

Document worth reading： “Can Machines Design An Artificial General Intelligence Approach”

12-10

2019

Untitled

01-13

2018

Personal Data Analytics

12-10

2018

Distilled News

12-10

Document worth reading： “A Short Introduction to Local Graph Clustering Methods and Software”

12-10

Failure Pressure Prediction Using Machine Learning

12-10

Reflections on the 10th anniversary of the Revolutions blog

12-10

Should you become a data scientist?

12-10

covrpage, more information on unit testing

12-10

Whats new on arXiv

12-10

5½ Reasons to Ditch Spreadsheets for Data Science： Code is Poetry

12-10

Keras – Save and Load Your Deep Learning Models

12-10

R Packages worth a look

12-10

2018

Day 10 – little helper %nin%

12-10

The ‘knight on an infinite chessboard’ puzzle： efficient simulation in R

12-10

The Need for Speed Part 1： Building an R Package with Fortran (or C)

12-10

Top Stories, Dec 3-9： Common mistakes when carrying out machine learning and data science; AI, Data Science, Analytics Main Developments in 2018 and Key Trends for 2019

12-10

Math for Machine Learning

12-10

How Different are Conventional Programming and Machine Learning?

12-10

Reflections on the 10th anniversary of the Revolutions blog

12-10

ggmap Tutorial Updated!

12-10

Should we be concerned about MRP estimates being used in later analyses? Maybe. I recommend checking using fake-data simulation.

12-09

Canada Map

12-09

2018

Day 09 – little helper object_size_in_env

12-09

If you did not already know

12-09

Whats new on arXiv

12-09

Smartly select and mutate data frame columns, using dict

12-09

Magister Dixit

12-09

An 8-hour course on R and Data Mining

12-09

An 8-hour course on R and Data Mining

12-09

Document worth reading： “What Do We Understand About Convolutional Networks”

12-09

Distilled News

12-09

Interesting packages taken from R/Pharma

12-09

2018

Interactive panel EDA with 3 lines of code

12-09

Timing Grouped Mean Calculation in R

12-08

Timing Grouped Mean Calculation in R

12-08

My footnote about global warming

12-08

R Packages worth a look

12-08

Document worth reading： “A Theory of Diagnostic Interpretation in Supervised Classification”

12-08

It was twenty years ago …

12-08

Dr. Data Show Video： Five Reasons Computers Predict When You’ll Die

12-08

Whats new on arXiv

12-08

If you did not already know

12-08

2018

R Packages worth a look

12-08

Self Avoiding Walks

12-08

confint3： 2-Sided Confidence Interval (Extended Moodle Version)

12-08

Day 08 – little helper intersect2

12-08

Document worth reading： “Marketing Analytics： Methods, Practice, Implementation, and Links to Other Fields”

12-07

R community update： announcing useR Delhi December meetup and CFP

12-07

Shinyfit： Advanced regression modelling in a shiny app

12-07

Cohort and age effects

12-07

A comprehensive list of Machine Learning Resources： Open Courses, Textbooks, Tutorials, Cheat Sheets and more

12-07

Whats new on arXiv

12-07

2018

XGBoost on GPUs： Unlocking Machine Learning Performance and Productivity

12-07

Latour Sokal NYT

12-07

“Increase sample size until statistical significance is reached” is not a valid adaptive trial design; but it’s fixable.

12-07

If you did not already know

12-07

Distilled News

12-07

Day 07 – little helper count_na

12-07

The Machine Learning Project Checklist

12-07

Here are the most popular Python IDEs / Editors

12-07

Bayesian Nonparametric Models in NIMBLE, Part 2： Nonparametric Random Effects

12-07

Take a Look at Looker, Demo/Webinar Dec 13

12-07

2018

R Packages worth a look

12-07

Automated Dashboard Visualizations with Ranking in R

12-07

One-stop-learning-shop for data pros – get exclusive access for less than a cup of coffee

12-06

DATAx Presents： AI AND MACHINE LEARNING TRENDS IN 2019

12-06

Einops — a new style of deep learning code

12-06

Reduced privacy risk in exchange for accuracy in the Census count

12-06

Running an R script on heroku

12-06

If you did not already know

12-06

Intuition for principal component analysis (PCA)

12-06

Designing Turbofan Tycoon

12-06

2018

An Intro to Deep Learning in Python

12-06

Build a serverless Twitter reader using AWS Fargate

12-06

Day 06 – little helper statusbar

12-06

Must-Have Resources to Become a Data Scientist

12-06

Four Techniques for Outlier Detection

12-06

Automated Dashboard visualizations with Deviation in R

12-06

If you did not already know

12-06

Common mistakes when carrying out machine learning and data science

12-06

Distilled News

12-06

A parable regarding changing standards on the presentation of statistical evidence

12-06

2018

Distilled News

12-06

R Packages worth a look

12-06

Explainable Artificial Intelligence (Part 2) – Model Interpretation Strategies

12-06

Bitcoin and Taxes： What You May Not Know

12-06

The JapanR Conference 2018 Round-Up!

12-06

Announcing Kaggle integration with Google Data Studio

12-05

Trust in ML models. Slides from TWiML & AI EMEA Meetup + iX Articles

12-05

Building Gene Expression Atlases with Deep Generative Models for Single-cell Transcriptomics

12-05

R Packages worth a look

12-05

Top KDnuggets tweets, Nov 28 – Dec 4： Deep Learning Cheat Sheets; Amazon opens its internal

12-05

2018

Community Call – Governance strategies for open source research software projects

12-05

6 Step Plan to Starting Your Data Science Career

12-05

Debiasing Approximate Inference

12-05

Gender Diversity in the R and Python Communities

12-05

ROCK 'n' ROLL TRAFFIC ROUTING, WITH NEO4J, PART 2

12-05

Distilled News

12-05

ggQC | ggplot Quality Control Charts – New Release

12-05

Day 05 – little helper get_network

12-05

ROCK 'n' ROLL TRAFFIC ROUTING, WITH NEO4J

12-05

KDnuggets™ News 18：n46, Dec 5： AI, Data Science, Analytics 2018 Main Developments, 2019 Key Trends; Deep Learning Cheat Sheets

12-05

2018

Automated Dashboard with various correlation visualizations in R

12-05

Extract data from a PNG/TIFF

12-05

The Quick Python Book

12-05

Creating Tables Using R and Pure HTML

12-05

My Self-Driving Presentation for TTS

12-05

Anomaly detection on Amazon DynamoDB Streams using the Amazon SageMaker Random Cut Forest algorithm

12-05

Magister Dixit

12-05

Kick Start Your Data Career! Tips From the Frontline

12-05

Learn to do Data Viz in R

12-05

Document worth reading： “A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition”

12-05

2018

Announcing the Winners of the 2018 AWS AI Hackathon

12-05

Gender Diversity in the R and Python Communities

12-05

How to build a data science project from scratch

12-05

Niall Ferguson and the perils of playing to your audience

12-05

“Statistical insights into public opinion and politics” (my talk for the Columbia Data Science Society this Wed 9pm)

12-04

If you did not already know

12-04

rnoaa： new data sources and NCDC units

12-04

Heatmaps of Mortality Rates

12-04

R Packages worth a look

12-04

Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning： December and Beyond

12-04

2018

Document worth reading： “The Dynamics of Learning： A Random Matrix Approach”

12-04

Data Science Projects Employers Want To See： How To Show A Business Impact

12-04

Deep learning in Satellite imagery

12-04

Document worth reading： “A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data”

12-04

Starspace for NLP

12-04

Day 04 – little helper evenstrings

12-04

Handling Imbalanced Datasets in Deep Learning

12-04

Why Machine Learning Interpretability Matters

12-04

Data Mining Book – Chapter Download

12-04

Detecting spatiotemporal groups in relocation data with spatsoc

12-04

2018

Bayesian Nonparametric Models in NIMBLE, Part 1： Density Estimation

12-04

Bayes, statistics, and reproducibility： “Many serious problems with statistics in practice arise from Bayesian inference that is not Bayesian enough, or frequentist evaluation that is not frequentist enough, in both cases using replication distributions that do not make scientific sense or do not reflect the actual procedures being performed on the data.”

12-04

Whats new on arXiv

12-04

Why Primary Research?

12-04

Deep Learning and Medical Image Analysis with Keras

12-03

GARCH and a rudimentary application to Vol Trading

12-03

Distilled News

12-03

AI, Data Science, Analytics Main Developments in 2018 and Key Trends for 2019

12-03

The State of Data in Astronomy

12-03

Whats new on arXiv

12-03

2018

Very Non-Standard Calling in R

12-03

In which I demonstrate my ignorance of world literature

12-03

AzureVM： managing virtual machines in Azure

12-03

If you did not already know

12-03

Graph-Powered Machine Learning

12-03

One Recipe Step to Rule Them All

12-03

My talk tomorrow (Tues) noon at the Princeton University Psychology Department

12-03

restez： Query GenBank locally

12-03

Day 03 – little helper multiplot

12-03

Statistics in Glaucoma： Part I

12-03

2018

Ronin： Sr Machine Learning and AI Data Scientist [San Mateo, CA]

12-03

An Utility Function For Monotonic Binning

12-03

8 Data Science Projects to Build your Portfolio

12-03

StanCon 2018 Helsinki talk slides, notebooks and code online

12-03

Import AI： 123： Facebook sees demands for deep learning services in its data centers grow by 3.5X; why advanced AI might require a global policeforce; and diagnosing natural disasters with deep learning

12-03

Best Machine Learning languages, Data Visualization Tools, DL Frameworks, and Big Data Tools

12-03

Document worth reading： “A Survey of Modern Object Detection Literature using Deep Learning”

12-03

R Packages worth a look

12-03

Multithreaded in the Wild

12-03

Ronin： Data Engineer [San Mateo, CA]

12-03

2018

Making a Profit with Henry Wan in Arkham Horror： The Card Game

12-03

Compare population age structures of Europe NUTS-3 regions and the US counties using ternary color-coding

12-03

If you did not already know

12-03

Monash University： Research Fellow (Bioinformatics) [Melbourne, Australia]

12-03

AzureVM： managing virtual machines in Azure

12-03

Very Non-Standard Calling in R

12-03

Leaving NYC for Nashville

12-03

Top Stories, Nov 26 – Dec 2： Deep Learning Cheat Sheets; A Complete Guide to Choosing the Best Machine Learning Course

12-03

How to get the homology of a antibody using R

12-02

R Packages worth a look

12-02

2018

The p-value is 4.76×10^−264

12-02

Day 02 – little helper na_omitlist

12-02

TSstudio 0.1.3

12-02

Site Redesign

12-02

If you did not already know

12-02

Document worth reading： “Comparative Study on Generative Adversarial Networks”

12-02

R plus Magento 2 REST API revisited： part 3 – more complex samples of use

12-02

Distilled News

12-02

December Reading for Econometricians

12-02

Why R for data science – and not Python?

12-02

2018

Interpretability is crucial for trusting AI and machine learning

12-01

Document worth reading： “A Tutorial on Bayesian Optimization”

12-01

NYC buses： C5.0 classification with R; more than 20 minute delay?

12-01

R Packages worth a look

12-01

Free Machine Learning Textbook

12-01

Tribes.ai： Sr Data Scientist [Remote, India / Eastern Europe]

12-01

Magister Dixit

12-01

A Programmer’s Introduction to Mathematics

12-01

Whats new on arXiv

12-01

If you did not already know

12-01

2018

Using R： the best thing I’ve changed about my code in years

12-01

Day 01 – little helper checkdir

12-01

If you did not already know

12-01

Defining visualization literacy

11-30

WPI： Research Scientist [Worcester, MA]

11-30

The Future of AI is the Enterprise

11-30

Simulating dinosaur populations, with R

11-30

Simulating dinosaur populations, with R

11-30

Introducing the First AI / Machine Learning Course With a Job Guarantee

11-30

Yeshiva University： Data Science Program Director [New York, NY]

11-30

2018

Variational Autoencoders Explained in Detail

11-30

A Complete Guide to Choosing the Best Machine Learning Course

11-30

Deep Learning for the Masses (… and The Semantic Layer)

11-30

Creating and saving multiple plots to Powerpoint

11-30

Whats new on arXiv

11-30

Math in Data Science

11-30

If you did not already know

11-30

Number of births in the twentieth century by @ellis2013nz

11-30

Faster garbage collection in pqR

11-30

2019

Untitled

01-13

2018

NYC buses： Cubist regression with more predictors

11-30

Visual Model-Based Reinforcement Learning as a Path towards Generalist Robots

11-30

R Packages worth a look

11-30

Distilled News

11-30

Java Object Tracking for Cars

11-30

Distilled News

11-30

University of Tennessee Knoxville： Assistant or Associate Professor in Data Science [Knoxville, TN]

11-30

ML Methods for Prediction and Personalization

11-30

Distilled News

11-29

Amazon SageMaker now comes with new capabilities for accelerating machine learning experimentation

11-29

2018

Whats new on arXiv

11-29

“And when you did you weren’t much use, you didn’t even know what a peptide was”

11-29

Whats new on arXiv

11-29

Document worth reading： “Big Data and Fog Computing”

11-29

Serve yourself. The Next-Generation of Data Analytics. Dec 6 Webinar

11-29

Free ebook： Exploring Data with python

11-29

Request for Proposal： Topical Projects for January 2019

11-29

Designing a Self-Learning Tic-Tac-Toe Player

11-29

Gold-Mining Week 13 (2018)

11-29

Combating Customer Churn with AI

11-29

2018

Students Combat MS with Data Science

11-29

Amazon SageMaker notebooks now support Git integration for increased persistence, collaboration, and reproducibility

11-29

Create 3D County Maps Using Density as Z-Axis

11-29

If you did not already know

11-29

October 2018： “Top 40” New Packages

11-29

NYC buses： simple Cubist regression

11-29

Whats new on arXiv

11-29

Teaching kids data visualization

11-29

Linking Data Science Activities to Business Initiatives Using the Hypothesis Development Canvas

11-29

Community Call Summary – Code Review in the Lab

11-29

2018

How to Find Mentors for Data Science?

11-29

RcppArmadillo 0.9.200.5.0

11-28

Horses for courses, or to each model its own (causal effect)

11-28

R now supported in Azure SQL Database

11-28

Plotting Scottish census data with some tidyverse magic

11-28

Top KDnuggets tweets, Nov 21-27： Intro to

11-28

8 Reasons to Take Data Analytics Certification Courses

11-28

How to Build a Machine Learning Team When You Are Not Google or Facebook

11-28

Document worth reading： “An Introductory Survey on Attention Mechanisms in NLP Problems”

11-28

KDnuggets™ News 18：n45, Nov 28： Your Favorite Python IDE/editor? Intro to Data Science for Managers

11-28

2018

Whats new on arXiv

11-28

Sales Forecasting Using Facebook’s Prophet

11-28

R Packages worth a look

11-28

Semantic Segmentation algorithm is now available in Amazon SageMaker

11-28

Marginal Effects for (mixed effects) regression models

11-28

Document worth reading： “Legible Normativity for AI Alignment： The Value of Silly Rules”

11-28

Deep Learning Cheat Sheets

11-28

R now supported in Azure SQL Database

11-28

Extracting data from news articles： Australian pollution by postcode

11-28

Whats new on arXiv

11-28

2018

The new pqR parser, and R’s “else” problem

11-28

ICML 2019： Some Changes and Call for Papers

11-28

Multilevel models for multiple comparisons! Varying treatment effects!

11-28

Introducing Amazon Translate Custom Terminology

11-28

NYC buses： company level predictors with R

11-28

R Packages worth a look

11-28

Filter Clickbait from News Content with our custom Natural Language Processing Model

11-28

Le Monde puzzle [#1078]

11-28

Lessons from posting a fake map about pies

11-28

What Python editors or IDEs you used the most in 2018?

11-27

2018

Whats new on arXiv

11-27

Visualization of NYC bus delays with R

11-27

If you did not already know

11-27

Magister Dixit

11-27

styler 1.1.0

11-27

Humana： Principal Data Scientist/Informatics Principal [Chicago, IL, Dallas, TX and Louisville, KY]

11-27

Distilled News

11-27

Making Machine Learning Accessible [Webinar Replay]

11-27

Introducing Dynamic Training for deep learning with Amazon EC2

11-27

Co-localization analysis of fluorescence microscopy images

11-27

2018

Document worth reading： “An exploration of algorithmic discrimination in data and classification”

11-27

Introducing medical language processing with Amazon Comprehend Medical

11-27

Bringing Machine Learning Research to Product Commercialization

11-27

How to Gather Your Own Data by Conducting a Great Survey

11-27

Drexel University： 2 Teaching Faculty Positions in Data Science [Philadelphia, PA]

11-27

How to Engineer Your Way Out of Slow Models

11-27

$ vs. votes

11-27

If you did not already know

11-27

Amazon Launches Machine Learning University

11-27

Peak Non-Creepy Dating Pool

11-27

2018

“Economic predictions with big data” using partial pooling

11-26

Plotting wind highways using rWind

11-26

Distilled News

11-26

Distilled News

11-26

R Packages worth a look

11-26

My secret sauce to be in top 2% of a Kaggle competition

11-26

AzureRMR： an R interface to Azure Resource Manager

11-26

AzureRMR： an R interface to Azure Resource Manager

11-26

Data Science Strategy Safari： Aligning Data Science Strategy to Org Strategy

11-26

Talking on “High Performance Python” at Linuxing In London last week

11-26

2018

Global Legal Entity Identifier Foundation (GLEIF)： Data Analyst [Frankfurt, Germany]

11-26

Instance segmentation with OpenCV

11-26

Whats new on arXiv

11-26

R Packages worth a look

11-26

Import AI： 122： Google obtains new ImageNet state-of-the-art with GPipe; drone learns to land more effectively than PD controller policy; and Facebook releases its ‘CherryPi’ StarCraft bot

11-26

You Can’t Do AI Without Augmented Analytics and AutoML

11-26

Cathy O’Neil discusses the current lack of fairness in artificial intelligence and much more.

11-26

Physics-Based Learned Design： Teaching a Microscope How to Image

11-26

3 Challenges for Companies Tackling Data Science

11-26

Building Blocks of Decision Tree

11-26

2018

Top Stories, Nov 19-25： What is the Best Python IDE for Data Science?; Intro to Data Science for Managers

11-26

Project planning with plotly

11-26

If you did not already know

11-26

Open Workshop： Data Visualization in R and ggplot2, January 25th in Munich

11-26

Data Pro Cyber Monday – Choose Your Savings

11-26

Stereograms

11-26

Whats new on arXiv

11-26

Amazon’s own ‘Machine Learning University’ now available to all developers

11-26

OneR – fascinating insights through simple rules

11-25

Document worth reading： “Internet of Things： An Overview”

11-25

2018

A tutorial on tidy cross-validation with R

11-25

Improving Binning by Bootstrap Bumping

11-25

New version of pqR, with major speed improvements

11-25

If you did not already know

11-25

Whats new on arXiv

11-25

Whats new on arXiv

11-25

RQuantLib 0.4.6： Updated upstream, and calls for help

11-25

Statistics Sunday： Introduction to Regular Expressions

11-25

These 3 problems destroy many clinical trials (in context of some papers on problems with non-inferiority trials, or problems with clinical trials in general)

11-25

Document worth reading： “Customised Structural Elicitation”

11-25

2018

Quidditch： is it all about the Snitch?

11-24

OneR – fascinating insights through simple rules

11-24

RcppEigen 0.3.3.5.0

11-24

How to work with strings in base R – An overview of 20+ methods for daily use.

11-24

Distilled News

11-24

Distilled News

11-24

Polished statistical analysis chapters in evidence-based software engineering

11-24

More Robust Monotonic Binning Based on Isotonic Regression

11-24

lmer vs INLA for variance components

11-24

R Packages worth a look

11-24

2018

R Packages worth a look

11-24

The evolution of pace in popular movies

11-24

EARL conference recap： Seattle 2018

11-24

Document worth reading： “Learning From Positive and Unlabeled Data： A Survey”

11-23

R Packages worth a look

11-23

Magister Dixit

11-23

Top 5 domains Big Data analytics helps to transform

11-23

If you did not already know

11-23

If you did not already know

11-23

Counting digits by @ellis2013nz

11-23

2018

Document worth reading： “To Cluster, or Not to Cluster： An Analysis of Clusterability Methods”

11-23

Whats new on arXiv

11-23

Intro to Data Science for Managers

11-23

Creating List with Iterator

11-23

Interactive Graphics with R Shiny

11-23

RFishBC CRAN Release

11-22

R Packages worth a look

11-22

High-performance mathematical paradigms in Python

11-22

Beautiful Chaos： The Double Pendulum

11-22

Cartoon： Thanksgiving, Big Data, and Turkey Data Science.

11-22

2018

If you did not already know

11-22

Gold-Mining Week 12 (2018)

11-22

Dealing with failed projects

11-22

“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.”

11-22

KNNs (K-Nearest-Neighbours) in Python

11-22

Monash University： Lecturer/Sr Lecturer – Digital Health [Melbourne, Australia]

11-22

6 Goals Every Wannabe Data Scientist Should Make for 2019

11-22

Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities

11-22

OpenCPU 2.1 Release： Scalable R Services

11-22

Monash University： Research Fellow (Digital Civics) [Melbourne, Australia]

11-22

2018

Document worth reading： “A Survey on Trust Modeling from a Bayesian Perspective”

11-22

Data Tools We're Thankful For

11-22

Data Science in Esports

11-21

Building a conversational business intelligence bot with Amazon Lex

11-21

Scrapping data about Australian politicians with RSelenium

11-21

Autonomy – Do we have the choice?

11-21

A short proof for Nesterov’s momentum

11-21

Machine Learning. In conversation with Jelena Ilic, Senior Data Scientist at Mango Solutions

11-21

KDnuggets™ News 18：n44, Nov 21： What is the Best Python IDE for Data Science?; Anticipating the next move in data science

11-21

New Features For Amazon SageMaker： Workflows, Algorithms, and Accreditation

11-21

2018

Join the World’s Biggest Deep Learning Summit – KDnuggets Early Cyber Monday

11-21

Le Monde puzzle [#1075]

11-21

R > Python： a Concrete Example

11-21

Top KDnuggets tweets, Nov 14-20： 10 Free Must-See Courses for Machine Learning and Data Science; Great list of

11-21

WPI： Post-Doctoral Fellow [Worcester, MA]

11-21

Document worth reading： “The Algorithm Selection Competition Series 2015-17”

11-21

A Bayesian take on ballot order effects

11-21

Using a Keras Long Short-Term Memory (LSTM) Model to Predict Stock Prices

11-21

Driving Success through Business Insight, One Customer at a Time

11-21

R Packages worth a look

11-21

2018

RTutor： Driving Electric or Gasoline Cars? Comparing the Pollution Damages

11-21

AI, Machine Learning and Data Science Roundup： November 2018

11-21

An Introduction to AI

11-21

The best way to visit Luxembourguish castles is doing data science + combinatorial optimization

11-21

An Overview of the Singapore Hiring Landscape

11-21

Slides from my talks about Demystifying Big Data and Deep Learning (and how to get started)

11-20

Whats new on arXiv

11-20

If you did not already know

11-20

Forget Motivation and Double Your Chances of Learning Success

11-20

Introducing pipe, The Automattic Machine Learning Pipeline

11-20

2018

Amazon Transcribe now supports real-time transcriptions

11-20

Machine Learning in Action： Going Beyond Decision Support Data Science

11-20

Mega-PAW Las Vegas Registration is Live & Super Early Bird Pricing is Now Available!

11-20

Checklist Recipe – How we created a template to standardize species data

11-20

Address Your Data Science Strategy at DSNY

11-20

Word Morphing – an original idea

11-20

R Packages worth a look

11-20

Distilled News

11-20

Data Shows No Increase In NYC Plowing as Storm Picked Up

11-20

Quantcast： Sr Applied Scientist, Audience Platform [Seattle, WA]

11-20

2018

R Packages worth a look

11-20

Generating data to explore the myriad causal effects that can be estimated in observational data analysis

11-20

Introducing Octoparse New Version 7.1 – web scraping for dummies is official

11-20

“The hype economy”

11-20

Understanding object detection in deep learning

11-19

Document worth reading： “A Learning Approach to Secure Learning”

11-19

UnitedHealth Group： Sr Manager, Data Engineering [Minnetonka, MN]

11-19

How Important is that Machine Learning Model be Understandable? We analyze poll results

11-19

Change over time is not “treatment response”

11-19

Build Your Own Natural Language Models on AWS (no ML experience required)

11-19

2018

Import AI 121： Sony researchers make ultra-fast ImageNet training breakthrough; Berkeley researchers tackle StarCraft II with modular RL system; and Germany adds €3bn for AI research

11-19

The Big Data Game Board™

11-19

The Distribution of Time Between Recessions： Revisited (with MCHT)

11-19

Insights on the role data can play in your organization

11-19

Detect suspicious IP addresses with the Amazon SageMaker IP Insights algorithm

11-19

ML Methods for Prediction and Personalization

11-19

Tom Wolfe

11-19

What I Learned About Machine Learning at ODSC West 2018

11-19

Distilled News

11-19

Analyze live video at scale in real time using Amazon Kinesis Video Streams and Amazon SageMaker

11-19

2019

Untitled

01-13

2018

Amazon SageMaker Automatic Model Tuning becomes more efficient with warm start of hyperparameter tuning jobs

11-19

Neural networks to generate music

11-19

Predictive Analytics in 2018： Salaries & Industry Shifts

11-19

UnitedHealth Group： Sr Manager, Data Science [Telecommute, Central or Eastern Time Zones]

11-19

Zero Counts in dplyr

11-19

Mask R-CNN with OpenCV

11-19

Hacking Bioconductor

11-19

UnitedHealth Group： Sr Director, Decision Analytics [Minnetonka, MN]

11-19

If you did not already know

11-19

2018

UnitedHealth Group： Director, Omni-Channel Analytics [Minnetonka, MN]

11-19

Cognitive Services in Containers

11-19

If you did not already know

11-19

Easily monitor and visualize metrics while training models on Amazon SageMaker

11-19

Top Stories, Nov 12-18： What is the Best Python IDE for Data Science?; To get hired as a data scientist, don’t follow the herd

11-19

Cognitive Services in Containers

11-19

Don’t Peek part 2： Predictions without Test Data

11-18

epubr 0.5.0 CRAN release

11-18

R Packages worth a look

11-18

Using OSX? Compiling an R package from source? Issues with ‘-fopenmp’? Try this.

11-18

2018

Document worth reading： “Graphical Models for Processing Missing Data”

11-18

RcppMsgPack 0.2.3

11-18

Graphs and tables, tables and graphs

11-18

Statistics Sunday： Reading and Creating a Data Frame with Multiple Text Files

11-18

Growing List vs Growing Queue

11-18

RcppGetconf 0.0.3

11-17

If you did not already know

11-17

Getting Started with Amazon Comprehend custom entities

11-17

R Packages worth a look

11-17

Whats new on arXiv

11-17

2018

A more systematic look at suppressed data by @ellis2013nz

11-17

Anticipating the next move in data science – my interview with Thomson Reuters

11-17

Congress Over Time

11-17

“Using numbers to replace judgment”

11-17

Benford’s Law for Fraud Detection with an Application to all Brazilian Presidential Elections from 2002 to 2018

11-17

Convert Data Frame to Dictionary List in R

11-17

Document worth reading： “Multi-Agent Reinforcement Learning： A Report on Challenges and Approaches”

11-17

Tis the Season to Check your SSL/TLS Cipher List Thrice (RCurl/curl/openssl)

11-17

If you did not already know

11-17

If you did not already know

11-16

2018

UnitedHealth Group： Clinical Data Statistical Analyst – SQL SAS (Clinician Required) [Telecommute]

11-16

Document worth reading： “Saliency Prediction in the Deep Learning Era： An Empirical Investigation”

11-16

Distilled News

11-16

“On the Diagramatic Diagnosis of Data” at BudapestBI 2018

11-16

Top 10 Python Data Science Libraries

11-16

Because it's Friday： The physics of The Expanse

11-16

Hey, check this out： Columbia’s Data Science Institute is hiring research scientists and postdocs!

11-16

Whats new on arXiv

11-16

R Packages worth a look

11-16

Example of Overfitting

11-16

2018

UnitedHealth Group： Data Analytics and Reporting Lead [Minnetonka, MN or Telecommute]

11-16

Using a genetic algorithm for the hyperparameter optimization of a SARIMA model

11-16

Sorry I didn’t get that! How to understand what your users want

11-16

UnitedHealth Group： Senior Principal Data Scientist [Telecommute, Central or Eastern Time Zones]

11-16

Distilled News

11-16

Using Uncertainty to Interpret your Model

11-16

The tidy caret interface in R

11-16

Report from the Enterprise Applications of the R Language conference

11-16

Report from the Enterprise Applications of the R Language conference

11-16

2018： How did people actually vote? (The real story, not the exit polls.)

11-16

2018

Mirrors

11-16

Magister Dixit

11-16

The State of the Art

11-15

Online Bayesian Deep Learning in Production at Tencent

11-15

Make Beautiful Tables with the Formattable Package

11-15

Searching for the optimal hyper-parameters of an ARIMA model in parallel： the tidy gridsearch approach

11-15

Gold-Mining Week 11 (2018)

11-15

Quoting in R

11-15

(Webinar) Farmers and Chubb on Humanizing Claims with AI

11-15

Scikit-learn Tutorial： Machine Learning in Python

11-15

2018

Best Deals in Deep Learning Cloud Providers： From CPU to GPU to TPU

11-15

A deep dive into glmnet： standardize

11-15

Introducing Drexel new online MS in Data Science

11-15

Quoting in R

11-15

R Packages worth a look

11-15

In case you missed it： October 2018 roundup

11-15

Whats new on arXiv

11-15

Magister Dixit

11-15

Rcpp now used by 1500 CRAN packages

11-15

If you did not already know

11-15

2018

Data Notes： Impact of Game of Thrones on US Baby Names

11-15

Mastering The New Generation of Gradient Boosting

11-15

URI： Director, Data Analytics/DataSpark [Kingston, RI]

11-15

The Crime Machine

11-15

Discourse Network Analysis： Undertaking Literature Reviews in R

11-15

Whats new on arXiv

11-15

What is Cloud Computing & Which is Better, AWS or GCP

11-15

More on Bias Corrected Standard Deviation Estimates

11-14

Rdew Valley： Optimizing Farming with R

11-14

Use GitHub Vulnerability Alerts to Keep Users of Your R Packages Safe

11-14

2018

Strategy： Customer Analytics： Are you Profiting from your Data?

11-14

Bright Lights, Bright Future. TDWI Is Back in Vegas

11-14

Free Reinforcement Learning Textbook

11-14

AdaSearch： A Successive Elimination Approach to Adaptive Search

11-14

Document worth reading： “Visions of a generalized probability theory”

11-14

Robustness checks are a joke

11-14

Paris ML E#2 S#6： Conscience, Code Analysis, Can a machine learn like a child?

11-14

More on Bias Corrected Standard Deviation Estimates

11-14

anytime 0.3.3

11-14

What is the Best Python IDE for Data Science?

11-14

2018

Finding a house to buy, using statistics

11-14

Gazing into the Abyss of P-Hacking： HARKing vs. Optional Stopping

11-14

NYU Stern： 2019-20 Asst. Professor of Information, Operations & Management Sciences – Information Systems, tenure-track [New York City, NY]

11-14

Document worth reading： “Deep Reinforcement Learning： An Overview”

11-14

R Packages worth a look

11-14

Distilled News

11-14

More Sandwiches, Anyone?

11-14

Metadata Enrichment is Essential to Realize the Value of Open Datasets

11-14

Easy time-series prediction with R： a tutorial with air traffic data from Lux Airport

11-14

Top KDnuggets tweets, Nov 07-13： 10 Free Must-See Courses for Machine Learning and Data Science

11-14

2018

Building a Repository of Alpine-based Docker Images for R, Part II

11-14

Federated learning： distributed machine learning with data locality and privacy

11-14

R Packages worth a look

11-14

Whats new on arXiv

11-14

KDnuggets™ News 18：n43, Nov 14： To get hired as a data scientist, don’t follow the herd; LinkedIn Top Voices in Data Science & Analytics

11-14

Distilled News

11-14

Windows Clipboard Access with R

11-14

Notes on the Frank-Wolfe Algorithm, Part II： A Primal-dual Analysis

11-14

Chocolate milk! Another stunning discovery from an experiment on 24 people!

11-13

NLP for Log Analysis – Tokenization

11-13

2019

Untitled

01-13

2018

The Evolution of Build Engineering in Managing Open Source [Webinar Replay]

11-13

All About Scikit-Learn, with Olivier Grisel

11-13

If you did not already know

11-13

R Packages worth a look

11-13

A deep dive into glmnet： penalty.factor

11-13

Help us understand your Data Science goals!

11-13

How to Find an Entry-Level Job in Data Science

11-13

LinkedIn Top Voices 2018： Data Science & Analytics

11-13

The ultimate guide to starting AI

11-13

2018

AI for Good： slides and notebooks from the ODSC workshop

11-13

Magister Dixit

11-13

TWIMLAI European Online Meetup about Trust in Predictions of ML Models

11-13

Those “other” apply functions…

11-13

Top Stories, Nov 5-11： The Most in Demand Skills for Data Scientists; 10 Free Must-See Courses for Machine Learning and Data Science

11-13

The Antarctic/Southern Ocean rOpenSci community

11-13

The 5 Basic Statistics Concepts Data Scientists Need to Know

11-13

Whats new on arXiv

11-13

Distilled News

11-12

YOLO object detection with OpenCV

11-12

2018

Preview my new book： Introduction to Reproducible Science in R

11-12

To get hired as a data scientist, don’t follow the herd

11-12

The Long Tail of Medical Data

11-12

“Law professor Alan Dershowitz’s new book claims that political differences have lately been criminalized in the United States. He has it wrong. Instead, the orderly enforcement of the law has, ludicrously, been framed as political.”

11-12

Machine Learning Toronto Summit, Nov 20-21 – Special KDnuggets discount

11-12

Amazon Polly adds Italian and Castilian Spanish voices, and Mexican Spanish language support

11-12

Time Series and MCHT

11-12

Healthcare Analytics Made Simple

11-12

Document worth reading： “An Introduction to Probabilistic Programming”

11-12

Whats new on arXiv

11-12

2018

Data Science With R Course Series – Week 9

11-12

Introducing a simple and intuitive Python API for UCI machine learning repository

11-12

Visualization research for non-researchers

11-12

Distilled News

11-12

How to de-Bias Standard Deviation Estimates

11-12

Which taxonomy should you use to classify news content, IAB-QAG or IPTC Subject Codes?

11-12

How to de-Bias Standard Deviation Estimates

11-12

If you did not already know

11-12

If you did not already know

11-12

Angela Bassa discusses managing data science teams and much more.

11-12

2018

Data Science in Esports

11-12

Hey! Here’s what to do when you have two or more surveys on the same population!

11-11

Whats new on arXiv

11-11

RATest. A Randomization Tests package is available on CRAN

11-11

Distilled News

11-11

R Packages worth a look

11-11

On receiving the Community Leadership Award at the NumFOCUS Summit 2018

11-11

On helping to open the inaugural PyDataPrague meetup

11-11

Deriving Expectation-Maximization

11-11

Characterizing Online Public Discussions through Patterns of Participant Interactions

11-11

2018

If you did not already know

11-10

Voronoi diagram with ggvoronoi package with Train Station data

11-10

4 ways to be more efficient using RStudio’s Code Snippets, with 11 ready to use examples

11-10

2018： Who actually voted? (The real story, not the exit polls.)

11-10

Gold-Mining Week 10 (2018)

11-10

If you did not already know

11-10

2018： What really happened?

11-10

One-arm Bayesian Adaptive Trial Simulation Code

11-10

Detailed introduction of “myprettyreport” R package

11-10

Document worth reading： “Advice from the Oracle： Really Intelligent Information Retrieval”

11-10

2018

RcppArmadillo 0.9.200.4.0

11-10

Dr. Data Show Video： What the Hell Does “Data Science” Really Mean?

11-10

The Gamification Of Fitbit： How an API Provided the Next Level of tRaining

11-10

Model evaluation, model selection, and algorithm selection in machine learning

11-10

R Packages worth a look

11-10

R Packages worth a look

11-10

Matching (and discarding non-matches) to deal with lack of complete overlap, then regression to adjust for imbalance between treatment and control groups

11-10

Multi-Class Text Classification with Doc2Vec & Logistic Regression

11-09

Top October Stories： 9 Must-have skills you need to become a Data Scientist, updated; 10 Best Mobile Apps for Data Scientist / Data Analysts

11-09

T-mobile uses R for Customer Service AI

11-09

2018

R Packages worth a look

11-09

Image segmentation based on Superpixels and Clustering

11-09

T-mobile uses R for Customer Service AI

11-09

Escaping the macOS 10.14 (Mojave) Filesystem Sandbox with R / RStudio

11-09

“Recapping the recent plagiarism scandal”

11-09

What does a data scientist REALLY look like?

11-09

simmer 4.1.0

11-09

Top 5 Trends in Data Science

11-09

Coding Regression trees in 150 lines of R code

11-09

Why would I ever NEED Bayesian Statistics?

11-09

2018

Exploring Models with lime

11-09

Document worth reading： “A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis”

11-09

Document worth reading： “An Overview of Blockchain Integration with Robotics and Artificial Intelligence”

11-08

If you did not already know

11-08

Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.”

11-08

AzureR： R packages to control Azure services

11-08

Your Client Engagement Program Isn't Doing What You Think It Is.

11-08

K-means clustering with Amazon SageMaker

11-08

Deep Learning Performance Cheat Sheet

11-08

Practical statistics books for software engineers

11-08

2018

5 Critical Steps to Predictive Business Analytics

11-08

Watch out for naively (because implicitly based on flat-prior) Bayesian statements based on classical confidence intervals! (Comptroller of the Currency edition)

11-08

Egg-Not-Egg Deep Learning Model

11-08

R Packages worth a look

11-08

Rcpp 1.0.0： The Tenth Birthday Release

11-08

AWS expands HIPAA eligible machine learning services for healthcare customers

11-08

Introducing Webhooks — Fastest Way to Collect Data

11-08

Introduction to Amazon SageMaker Object2Vec

11-08

How to sync Fastmail's CardDAV to use with mutt + abook

11-08

Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.”

11-08

2018

anytime – dates in R

11-08

Best Practices for Using Notebooks for Data Science

11-08

Carlos： ‘Everything Dataquest showed me, I use in my new job’

11-08

AzureR： R packages to control Azure services

11-08

10 Free Must-See Courses for Machine Learning and Data Science

11-08

Distilled News

11-08

Hilary Mason and Gilad Lotan to Keynote at MADS 2019

11-08

Latest Trends in Computer Vision Technology and Applications

11-07

Working with US Census Data in R

11-07

anytime 0.3.2

11-07

2018

Top KDnuggets tweets, Oct 31 – Nov 6： 10 More Free Must-Read Books for Machine Learning and Data Science

11-07

Integrating R and Telegram

11-07

Introduction to PyTorch for Deep Learning

11-07

Now easily perform incremental learning on Amazon SageMaker

11-07

Why R? 2018 Conference – After Movie and Summary

11-07

Working with US Census Data in R

11-07

Whats new on arXiv

11-07

The “probability to win” is hard to estimate…

11-07

DePaul University： Professor of Practice position in Data Science [Chicago, IL]

11-07

Distilled News

11-07

2018

UI Update — Datazar

11-07

R Packages worth a look

11-07

If you did not already know

11-07

EMNLP 2018 Highlights： Inductive bias, cross-lingual learning, and more

11-07

KDnuggets™ News 18：n42, Nov 7： The Most in Demand Skills for Data Scientists; How Machines Understand Our Language： Intro to NLP

11-07

DePaul University： Two tenure-track/tenured positions in Data Science/Computer Science [Chicago, IL]

11-07

If you did not already know

11-07

Whats new on arXiv

11-07

Technoslavia 2.5： Open Source Topography

11-07

Magister Dixit

11-07

2018

“35. What differentiates solitary confinement, county jail and house arrest” and 70 others

11-07

7 Best Practices for Machine Learning on a Data Lake

11-07

Direct access to Amazon SageMaker notebooks from Amazon VPC by using an AWS PrivateLink endpoint

11-06

Causal mediation estimation measures the unobservable

11-06

New： Maintained Datasets

11-06

Turbocharge Tech Transformation： Integrate AI Across Insurance

11-06

Customize your notebook volume size, up to 16 TB, with Amazon SageMaker

11-06

R plus Magento 2 REST API revisited： part 1- authentication and universal search

11-06

Whats new on arXiv

11-06

Postdocs and Research fellows for combining probabilistic programming, simulators and interactive AI

11-06

2018

Document worth reading： “Toward a System Building Agenda for Data Integration”

11-06

If you did not already know

11-06

Document worth reading： “Lectures on Statistics in Theory： Prelude to Statistics in Practice”

11-06

More on sigr

11-06

Data Feminism

11-06

xts 0.11-2 on CRAN

11-06

Happy 10th Bday, Rcpp – and welcome release 1.0 !!

11-06

R Packages worth a look

11-06

“Statistical and Machine Learning forecasting methods： Concerns and ways forward”

11-06

Whats new on arXiv

11-06

2018

Turn data into revenue. Wharton can show you how.

11-06

Text Preprocessing in Python： Steps, Tools, and Examples

11-06

Source and List： Organizing R Shiny Apps

11-06

Data Science in 30 Minutes with Jake Porway of DataKind

11-06

Building Surveillance System Using USB Camera and Wireless-Connected Raspberry Pi

11-06

Can we predict the crawling of the Google-Bot?

11-06

Tesseract 4 is here! State of the art OCR in R!

11-06

Using httr to Detect HTTP(s) Redirects

11-06

Mastering the Learning Rate to Speed Up Deep Learning

11-06

Lifecycle configuration update for Amazon SageMaker notebook instances

11-06

2018

EARL Houston： Interview with Hadley Wickham

11-05

The purported CSI effect and the retroactive precision fallacy

11-05

Distilled News

11-05

Whats new on arXiv

11-05

Top Data Science Hacks

11-05

Quantum Machine Learning： A look at myths, realities, and future projections

11-05

India vs US – Kaggle Users & Data Scientists

11-05

Vanderbilt University： Lecturer in Data and Analytics [Nashville, TN]

11-05

Maps of the issues mentioned most in election advertising

11-05

Vanderbilt University： Sr Lecturer in Data and Analytics [Nashville, TN]

11-05

2018

“A Guide to Working With Census Data in R” is now Complete!

11-05

How a meme grew into a campaign slogan

11-05

Top Stories, Oct 29 – Nov 4： The Most in Demand Skills for Data Scientists; How Machines Understand Our Language

11-05

Creating GIFs with OpenCV

11-05

Vanderbilt University’s Peabody College： Lecturer in Data and Analytics [Online Teaching]

11-05

Machine Learning Classification： A Dataset-based Pictorial

11-05

R Packages worth a look

11-05

Import AI 119： How to benefit AI research in Africa; German politician calls for billions in spending to prevent country being left behind; and using deep learning to spot thefts

11-05

Maps, models, and analytic problem framing

11-05

Distilled News

11-05

2018

Don’t use AI when BI will suffice!

11-05

NG "roll returns" – inflection point?

11-05

The 3Ds of Machine Learning Systems Design

11-05

Top Data Science Hacks

11-05

Vanderbilt University’s Peabody College： Sr. Lecturer in Data and Analytics [Nashville, TN]

11-05

Coding Gradient boosted machines in 100 lines of code

11-05

Linear Regression in Real Life

11-05

Telling Truth from Hype When Hunting for Data Science Work

11-05

Peter Bull discusses the importance of human-centered design in data science.

11-05

Data Science With R Course Series – Week 8

11-05

2018

Document worth reading： “Artificial Intelligence for Long-Term Robot Autonomy： A Survey”

11-04

R tip： Make Your Results Clear with sigr

11-04

If you did not already know

11-04

Document worth reading： “Deep Learning for Image Denoising： A Survey”

11-04

If you did not already know

11-04

R tip： Make Your Results Clear with sigr

11-04

Cornell prof (but not the pizzagate guy!) has one quick trick to getting 1700 peer reviewed publications on your CV

11-04

Building a neighbour matrix with python

11-04

Distilled News

11-04

Building a Repository of Alpine-based Docker Images for R, Part I

11-04

2018

Whats new on arXiv

11-04

Whats new on arXiv

11-04

R Packages worth a look

11-04

New R Cheatsheet： Data Science Workflow with R

11-04

RProtoBuf 0.4.13 (and 0.4.12)

11-03

coalesce with wrapr

11-03

Reflections on remote data science work

11-03

R Packages worth a look

11-03

If you did not already know

11-03

Le Monde puzzle [#1073]

11-03

2018

A quick look at GHCN version 4

11-03

Visualize the Business Value of your Predictive Models with modelplotr

11-03

coalesce with wrapr

11-03

Book Review – Sound Analysis and Synthesis with R

11-03

If you did not already know

11-03

Whats new on arXiv

11-03

Document worth reading： “A User’s Guide to Support Vector Machines”

11-03

“We are reluctant to engage in post hoc speculation about this unexpected result, but it does not clearly support our hypothesis”

11-03

7 Awesome Things You Can Do in Dataiku Without Coding

11-02

RcppAnnoy 0.0.11

11-02

2018

Data Mining Book – Chapter Download

11-02

The Most in Demand Skills for Data Scientists

11-02

collateral

11-02

Top 13 Python Deep Learning Libraries

11-02

Learn how machine learning is transforming business

11-02

Pulse of the Competition： November Edition

11-02

Master R shiny： One trick to build maintainable and scalable event chains

11-02

Learn how machine learning is transforming business, Nov 12 Webinar

11-02

“What Happened Next Tuesday： A New Way To Understand Election Results”

11-02

Distilled News

11-02

2018

“Simulations are not scalable but theory is scalable”

11-02

Data Science “Paint by the Numbers” with the Hypothesis Development Canvas

11-02

Data Representation for Natural Language Processing Tasks

11-02

Quick overview on the new Bioconductor 3.8 release

11-02

Document worth reading： “Transfer Metric Learning： Algorithms, Applications and Outlooks”

11-02

The blocks and rows theory of data shaping

11-02

Whats new on arXiv

11-02

My two talks in Austria next week, on two of your favorite topics!

11-02

Data Notes： Chinese Tourism's Impact on Taiwan

11-01

How Data Science Is Improving Higher Education

11-01

2018

How Data Science (+ Friends) Helped Me Learn French

11-01

Search labels and IDs from IAB-QAG and IPTC Subject Codes taxonomies

11-01

Python vs R： Head to Head Data Analysis

11-01

Multi-Class Text Classification Model Comparison and Selection

11-01

Sharing the Recipe for rOpenSci’s Unconf Ice Breaker

11-01

Azure ML Studio now supports R 3.4

11-01

If you did not already know

11-01

Talk： How Do We Support Under-represented Groups To Put Themselves Forward?

11-01

R Packages worth a look

11-01

Raghuveer Parthasarathy’s big idea for fixing science

11-01

2018

Automated Email Reports with R

11-01

Join AI experts from Google Brain, Open AI & Uber AI Labs in San Francisco

11-01

RcppTOML 0.1.5： Small extensions

11-01

Multithreaded in the Wild

11-01

New Course： Analyzing Election and Polling Data in R

11-01

2018-11 Variable-Width Bezier Splines in R

11-01

Communicating results with R Markdown

11-01

Webinar： Transform Your Stagnant Data Swamp into a Pristine Data Lake, Nov 8

11-01

Facial feedback： “These findings suggest that minute differences in the experimental protocol might lead to theoretically meaningful changes in the outcomes.”

11-01

The role of academia in data science education

11-01

2018

What R version do you really need for a package?

11-01

If you did not already know

11-01

R Packages worth a look

11-01

Why AI will not replace radiologists

11-01

Spam Detection with Natural Language Processing – Part 3

11-01

Now use Pipe mode with CSV datasets for faster training on Amazon SageMaker built-in algorithms

11-01

Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning： November and Beyond

11-01

Tomorrow, Nov 8 Webinar： Transform Your Stagnant Data Swamp into a Pristine Data Lake

11-01

Distilled News

10-31

How to Highlight 3D Brain Regions

10-31

2018

How Machines Understand Our Language： An Introduction to Natural Language Processing

10-31

Le Monde puzzle [#1072]

10-31

Gold-Mining Week 9 (2018)

10-31

Top KDnuggets tweets, Oct 24-30： Building a Question-Answering System from Scratch

10-31

Model Server for Apache MXNet v1.0 released

10-31

If you did not already know

10-31

Document worth reading： “A Comprehensive Study of Deep Learning for Image Captioning”

10-31

namer, Automatic Labelling of R Markdown Chunks

10-31

If you did not already know

10-31

Multilevel Modeling Solves the Multiple Comparison Problem： An Example with R

10-31

2018

Spooky! Gravedigger in R

10-31

Webinar – Integrate AI Across Insurance Operations to Turbocharge Tech Transformation, Nov 14

10-31

“2010： What happened?” in light of 2018

10-31

RHL’19 St-Cergue, Switzerland, 25-27 January 2019

10-31

KDnuggets™ News 18：n41, Oct 31： Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn

10-31

Spooky! Gravedigger in R

10-31

How to Start Learning R for Data Science

10-31

Labeling Unstructured Text for Meaning to Achieve Predictive Lift

10-31

Improving model interpretability with LIME

10-31

Stop Installing Tensorflow Using pip for Performance Sake!

10-30

2018

Use Pseudo-Aggregators to Add Safety Checks to Your Data-Wrangling Workflow

10-30

Explainable ML versus Interpretable ML

10-30

Moody’s Analytics： Machine Learning / NLP – Research Scientist / Engineer [New York, NY]

10-30

In-Depth Training for the Future of Data, Orlando, Nov 11-16 – Save with code KD30

10-30

Growth of Subreddits

10-30

How to Mitigate Open Source License Risks

10-30

Our Favorite Spooky AI & Data Articles

10-30

Whats new on arXiv

10-30

Data + Art STEAM Project： Final Results

10-30

Whats new on arXiv

10-30

2018

R Packages worth a look

10-30

Key Takeaways from AI Conference SF, Day 2： AI and Security, Adversarial Examples, Innovation

10-30

How to create useful features for Machine Learning

10-30

New Poll： How Important is Understanding Machine Learning Models?

10-30

Additional Strategies for Confronting the Partition Function

10-30

Fringe FM conversation on AI Ethics

10-30

Machine Learning Basics – Random Forest

10-30

Are petrol prices in Australia fair?

10-30

Using deep learning on AWS to lower property damage losses from natural disasters

10-30

Use Pseudo-Aggregators to Add Safety Checks to Your Data-Wrangling Workflow

10-30

2018

Lehigh University： Tenure Track Positions in Foundations of Data Science [Bethlehem, PA]

10-30

Document worth reading： “Resource Management in Fog/Edge Computing： A Survey”

10-30

Site Migration

10-30

Distilled News

10-30

Introduction to Deep Learning with Keras

10-29

If you did not already know

10-29

How do I visualise the results of a Bayesian Model： Rugby models in Arviz

10-29

R Packages worth a look

10-29

Bootstrap Testing with MCHT

10-29

Amazing consistency： Largest Dataset Analyzed / Data Mined – Poll Results and Trends

10-29

2018

Is the answer to everything Gaussian?

10-29

Multi-object tracking with dlib

10-29

Federated Learning： Machine Learning with Privacy on the Edge

10-29

Bank of Canada： Data Scientist [Ottawa, Canada]

10-29

If you did not already know

10-29

What does it mean to talk about a “1 in 600 year drought”?

10-29

Key Takeaways from AI Conference SF, Day 1： Domain Specific Architectures, Emerging China, AI Risks

10-29

About a Curious Feature and Interpretation of Linear Regressions

10-29

The quest continues： a look at a new initiative to explore human and machine intelligence

10-29

The Future of Management： Human Resource Analytics

10-29

2018

NAIC： Analyst I (Capital Markets) [New York, NY]

10-29

Open Source Deep Dive with Olivier Grisel

10-29

The Decentralized Web

10-29

Please vote

10-29

Data Science With R Course Series – Week 7

10-29

crfsuite for natural language processing

10-29

Top Obstacles to Overcome when Implementing Predictive Maintenance

10-29

Document worth reading： “Neural Approaches to Conversational AI”

10-29

Learning to learn in a model-agnostic way

10-29

Arnaub Chatterjee discusses artificial intelligence (AI) and machine learning (ML) in healthcare.

10-29

2018

Top Stories, Oct 22-28： 9 Must-have skills you need to become a Data Scientist, updated; Named Entity Recognition and Classification with Scikit-Learn

10-29

Amazon Translate now offers 113 new language pairs

10-29

2019

Untitled

01-13

2018

How to be an Artificial Intelligence (AI) Expert?

10-29

American Association of Colleges of Osteopathic Medicine： Data Analyst [Bethesda, Maryland]

10-29

R Packages worth a look

10-28

Scatterplot matrices (pair plots) with cdata and ggplot2

10-28

Distilled News

10-28

Distilled News

10-28

MRP (or RPP) with non-census variables

10-28

2018

Simple Feed Ranking Algorithm

10-28

Scatterplot matrices (pair plots) with cdata and ggplot2

10-28

Document worth reading： “Machine Learning for Wireless Networks with Artificial Intelligence： A Tutorial on Neural Networks”

10-28

How quickly do stock market valuations revert back to their means?

10-28

If you did not already know

10-28

Conway’s Game of Life in R： Or On the Importance of Vectorizing Your R Code

10-28

Introducing cricpy：A python package to analyze performances of cricketers

10-28

R Packages worth a look

10-28

Data Science Interview Questions with Answers

10-28

Conway’s Game of Life in R： Or On the Importance of Vectorizing Your R Code

10-28

2018

Document worth reading： “Opening the black box of deep learning”

10-28

Visualizing The Catholic Lectionary – Part 1

10-27

If you did not already know

10-27

AI Masterpieces： But is it Art?

10-27

Document worth reading： “Restricted Boltzmann Machines： Introduction and Review”

10-27

Debate about genetics and school performance

10-27

Distilled News

10-27

Maps with pie charts on top of each administrative division： an example with Luxembourg’s elections data

10-27

Whats new on arXiv

10-27

RcppRedis 0.1.9

10-27

2018

How to perform merges (joins) on two or more data frames with base R, tidyverse and data.table

10-27

Can we do better than using averaged measurements?

10-26

RConsortium — Building an R Certification

10-26

Because it's Friday： Parable of the Polygons

10-26

R Packages worth a look

10-26

Whats new on arXiv

10-26

Document worth reading： “Causal inference and the data-fusion problem”

10-26

Whats new on arXiv

10-26

The Final Data Science Roadshow is Just the Beginning

10-26

CRAN’s New Missing Data Task View

10-26

2018

Gold-Mining Week 8 (2018)

10-26

Simulating simple dice games by @ellis2013nz

10-26

SQL, Python, & R in One Platform

10-26

Designing Transforms for Data Reshaping with cdata

10-26

Spotlight on Julia Silge, Keynote Speaker EARL Seattle 7th November

10-26

Designing Transforms for Data Reshaping with cdata

10-26

Marketing Analytics and Data Science

10-26

Notes on Feature Preprocessing： The What, the Why, and the How

10-26

Are you buying an apartment? How to hack competition in the real estate market

10-26

R Packages worth a look

10-26

2018

Drawing beautiful maps programmatically with R, sf and ggplot2 — Part 3： Layouts

10-25

Document worth reading： “Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks”

10-25

Whats new on arXiv

10-25

Implementing Automated Machine Learning Systems with Open Source Tools

10-25

The Axios Turing test and the heat death of the journalistic universe

10-25

Monash University： Academic Opportunities in Dialogue Research [Melbourne, Australia]

10-25

Popular Halloween Candy on US State Grid Map

10-25

How to be an Artificial Intelligence (AI) Expert?

10-25

Learn how to create data-driven marketing team

10-25

Distilled News

10-25

2018

How DataCamp Handles Course Quality

10-25

Naive Bayes from Scratch using Python only – No Fancy Frameworks

10-25

11 Design Tips for Data Visualization

10-25

A Data Scientist’s Guide to an Efficient Project Lifecycle

10-25

If you did not already know

10-25

AI, Machine Learning and Data Science Roundup： October 2018

10-25

Baltimore-Washington

10-25

Drawing beautiful maps programmatically with R, sf and ggplot2 — Part 1： Basics

10-25

Getting started Stamen maps with ggmap

10-25

Named Entity Recognition and Classification with Scikit-Learn

10-25

2018

How I Learned to Stop Worrying and Love Uncertainty

10-24

When the numbers don’t tell the whole story

10-24

SiliconANGLE： Machine learning automation startup DataRobot lands $100M round

10-24

A study fails to replicate, but it continues to get referenced as if it had no problems. Communication channels are blocked.

10-24

U. of Zurich： Professorship in Big Data Science (Open Rank) [Zurich, Switzerland]

10-24

Join us at the EARL US Roadshow – a conference dedicated to the real-world usage of R

10-24

When the numbers don't tell the whole story

10-24

Generative Adversarial Networks – Paper Reading Road Map

10-24

Understanding Amazon SageMaker notebook instance networking configurations and advanced routing options

10-24

If you did not already know

10-24

2018

U. of Zurich： Assistant Professorship in AI and Machine Learning (Non-tenure Track) [Zurich, Switzerland]

10-24

Distilled News

10-24

Community Call – Working with images in R

10-24

Whats new on arXiv

10-24

Building a Question-Answering System from Scratch

10-24

ITWire： VIDEO Interview with a DataRobot： Greg Michaelson talks AI, banking, machine learning and more

10-24

How AI Can Help Cope with Data Scientists’ Boredom

10-24

“Demystifying Data Science” remote notes

10-24

Vote suppression in corrupt NY State

10-24

R Packages worth a look

10-24

KDnuggets™ News 18：n40, Oct 24： Graphs Are The Next Frontier In Data Science; Apache Spark Intro for Beginners

10-24

M4 Forecasting Conference

10-24

Top KDnuggets tweets, Oct 17-23： Machine Learning Cheat Sheets

10-24

R Packages worth a look

10-24

Coding is hard

10-24

automl package： part 2/2 first steps how to

10-24

Drilling Down on Depth Sensing and Deep Learning

10-23

Google, Microsoft & Fraunhofer at the First European Edition of Deep Learning World – 12 Nov, 2018

10-23

Computer Vision for Model Assessment

10-23

RcppTOML 0.1.4： Now with TOML v0.5.0

10-23

Get a 2–6x Speed-up on Your Data Pre-processing with Python

10-23

Introducing gratia

10-23

How Can Autonomous Drones Help the Energy and Utilities Industry?

10-23

5 Steps to Prepare for a Data Science Job

10-23

Whats new on arXiv

10-23

Document worth reading： “Attribute-aware Collaborative Filtering： Survey and Classification”

10-23

Computer Vision for Model Assessment

10-23

What to think about this new study which says that you should limit your alcohol to 5 drinks a week?

10-23

Introduction to Active Learning

10-23

High school statistics class builds election prediction model

10-23

Whats new on arXiv

10-23

R Packages worth a look

10-23

If you did not already know

10-23

Cross-over study design with a major constraint

10-23

If you did not already know

10-23

Whats new on arXiv

10-22

Packages for Testing your R Package

10-22

Building statues of hope in augmented reality

10-22

Whats new on arXiv

10-22

Amazon SageMaker Batch Transform now supports Amazon VPC and AWS KMS-based encryption

10-22

Distilled News

10-22

Top Stories, Oct 15-21： Graphs Are The Next Frontier In Data Science; The Main Approaches to Natural Language Processing Tasks

10-22

Update on the R Consortium Census Working Group

10-22

Document worth reading： “Fractal AI： A fragile theory of intelligence”

10-22

Don’t miss Big Data LDN 2018

10-22

How to Define a Machine Learning Problem Like a Detective

10-22

Maximized Monte Carlo Testing with MCHT

10-22

Speak at Mega-PAW Vegas 2019 – on Machine Learning Deployment (Apply by Nov 15)

10-22

Distilled News

10-22

University of Rhode Island： Assistant Professor of Data Science [Kingston, RI]

10-22

MVP for Data Projects

10-22

Object tracking with dlib

10-22

“The dwarf galaxy NGC1052-DF2”

10-22

Import AI： 117： Surveillance search engines; harvesting real-world road data with hovering drones; and improving language with unsupervised pre-training

10-22

Document worth reading： “Declarative Statistics”

10-22

Summer Intern Projects

10-22

Whats new on arXiv

10-22

Does Sharing Goals Help or Hurt Your Chances of Success?

10-22

Data Science With R Course Series – Week 6

10-22

Cassie Kozyrkov discusses decision making and decision intelligence!

10-22

Beginner Data Visualization & Exploration Using Pandas

10-22

Residential Property Investment Visualization and Analysis Shiny App

10-22

R Packages worth a look

10-22

5 Steps to Prepare for a Data Science Job

10-22

Getting the data from the Luxembourguish elections out of Excel

10-21

Faceted Graphs with cdata and ggplot2

10-21

If you did not already know

10-21

If you did not already know

10-21

Whats new on arXiv

10-21

Distilled News

10-21

Document worth reading： “Machine Learning for Spatiotemporal Sequence Forecasting： A Survey”

10-21

RApiDatetime 0.0.4： Updates and Extensions

10-21

Multilevel models with group-level predictors

10-21

automl package： part 1/2 why and how

10-21

Faceted Graphs with cdata and ggplot2

10-21

Statistics Sunday： What Fast Food Can Tell Us About a Community and the World

10-21

R Packages worth a look

10-20

Dr. Data Show Video： How Can You Trust AI?

10-20

A Lazy Function

10-20

A Thorough Introduction to Boltzmann Machines

10-20

Table of Contents for PIM

10-20

He’s a history teacher and he has a statistics question

10-20

R Packages worth a look

10-20

New Course： Visualization Best Practices in R

10-19

New Course： Interactive Data Visualization with rbokeh

10-19

Start your journey into data science today

10-19

An actual quote from a paper published in a medical journal： “The data, analytic methods, and study materials will not be made available to other researchers for purposes of reproducing the results or replicating the procedure.”

10-19

Whats new on arXiv

10-19

McKinsey Datathon： The City Cup, 17 November, Amsterdam, Stockholm and Zurich. Apply Now

10-19

If you did not already know

10-19

The Intuitions Behind Bayesian Optimization with Gaussian Processes

10-19

Loops and Pizzas

10-19

I’m an Analyst and the software engineers made fun of my code!

10-19

Young Investigator Special Competition for Time-Sharing Experiment for the Social Sciences

10-19

Statistics Challenge Invites Students to Tackle Opioid Crisis Using Real-World Data

10-19

Document worth reading： “Review of Deep Learning”

10-19

Holy Grail of AI for Enterprise — Explainable AI

10-19

Gold-Mining Week 7 (2018)

10-19

If you did not already know

10-19

Will Models Rule the World? Data Science Salon Miami, Nov 6-7

10-19

Maryland's Bridge Safety, reported using R

10-19

Solving the Chinese postman problem

10-19

survHE new release

10-19

New Jobs Sure to Emerge Alongside Artificial Intelligence

10-18

Distilled News

10-18

Analyzing English Team of the Year Data Since 1973

10-18

shinytest – Automated testing for Shiny apps

10-18

From Project Manager to Data Champion — Conquer Your Data Projects

10-18

Apache Spark Introduction for Beginners

10-18

R Packages worth a look

10-18

Graphs Are The Next Frontier In Data Science

10-18

Examining Inter-Rater Reliability in a Reality Baking Show

10-18

Spam Detection with Natural Language Processing-Part 2

10-18

How to Solve the ModelOps Challenge

10-18

Adam： “It would have been much harder without Dataquest”

10-18

BI to AI： Getting Intelligent Insights to Everyone

10-18

Serial and Parallel bulb puzzle

10-18

R Packages worth a look

10-18

Ethics in statistical practice and communication： Five recommendations.

10-18

Distilled News

10-18

Ask the Question, Visualize the Answer

10-17

Blockchain applications in the Federal Government sector

10-17

Stan development in RStudio

10-17

University of San Francisco： Assistant Professor, Tenure Track, Mathematics and Statistics [San Francisco, CA]

10-17

Use AWS DeepLens to give Amazon Alexa the power to detect objects via Alexa skills

10-17

Document worth reading： “Deep Facial Expression Recognition： A Survey”

10-17

Cannibus Curve with ggplot2

10-17

Building a data warehouse

10-17

R Packages worth a look

10-17

Top KDnuggets tweets, Oct 10-16： 7 Books to Grasp Mathematical Foundations of Data Science and Machine Learning; 6 Books Every Data Scientist Should Keep Nearby

10-17

The Main Approaches to Natural Language Processing Tasks

10-17

If you did not already know

10-17

KDnuggets™ News 18：n39, Oct 17： 10 Best Mobile Apps for Data Scientist; Vote in new poll： Largest dataset you analyzed?

10-17

Four machine learning strategies for solving real-world problems

10-17

Music for Data Scientists? Music by Data Scientists? …What…?!

10-17

Citizen Data Scientists | Why Not DIY AI?

10-17

Fitting the Besag, York, and Mollie spatial autoregression model with discrete data

10-17

SatRday talks recordings

10-17

Slides from my talk at the R-Ladies Meetup about Interpretable Deep Learning with R, Keras and LIME

10-17

Estimating Control Chart Constants with R

10-17

Machine Reading at Scale – Transfer Learning for Large Text Corpuses

10-17

The Definitive Guide to AI’s “Black Box” Problem

10-17

Use R with Excel： Importing and Exporting Data

10-17

If you did not already know

10-17

Announcing RStudio Package Manager

10-17

5 Alternatives to the Default R Outputs for GLMs and Linear Models

10-17

Data Notes： The Secret of Academic Success

10-17

Mindstrong Health： Sr Data Scientist / Machine Learning, Statistics, Coding [Palo Alto, CA]

10-17

A small logical change with big impact

10-16

The AAA tranche of subprime science, revisited

10-16

Estimating Pi

10-16

Distilled News

10-16

Why you need GPUs for your deep learning platform

10-16

Accelerating Your Algorithms in Production [Webinar Replay]

10-16

WoRkshop in ToRonto

10-16

Will Compression Be Machine Learning’s Killer App?

10-16

Quasiquotation in R via bquote()

10-16

RStudio 1.2 Preview： Stan

10-16

Using pandas and pymapd for ETL into OmniSci

10-16

Whats new on arXiv

10-16

GitHub Python Data Science Spotlight： High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy

10-16

Adversarial Examples, Explained

10-16

Even more images as x-axis labels

10-16

Self-Service Analytics or Operationalization： Which Should I Implement?

10-16

New paper – Inside or outside： quantifying extrapolation across river networks

10-16

Modifying Excel Files using openxlsx

10-16

A small logical change with big impact

10-16

David Brooks discovers Red State Blue State Rich State Poor State!

10-16

University of San Francisco： Postdoctoral Fellowship, Data Institute [San Francisco, CA]

10-16

Optimized bubble tea consumption

10-16

Applied Data Science： Solving a Predictive Maintenance Business Problem Part 3

10-16

2019

Untitled

01-13

2018

Exploring college major and income： a live data analysis in R

10-16

A small logical change with big impact

10-16

R Packages worth a look

10-16

Toward better measurement in K-12 education research

10-15

Voice Control your Shiny Apps

10-15

Slot Machines

10-15

Package support offer

10-15

Machine Learning Trick of the Day (8)： Instrumental Thinking

10-15

Making Art in R

10-15

R Packages worth a look

10-15

Obtaining the number of components from cross validation of principal components regression

10-15

Document worth reading： “A Survey of Inverse Reinforcement Learning： Challenges, Methods and Progress”

10-15

Modularize your Shiny Apps： Exercises

10-15

How we use emojis

10-15

Distilled News

10-15

Spam Detection with Natural Language Processing (NLP) – Part 1

10-15

Choose Your Own Adventure – Analytics On-boarding

10-15

I fell out with tapply and in love with dplyr

10-15

Most liked R-bloggers’ posts from last week (2018-10-07 till 2018-10-13 – based on twitter)

10-15

5 “Clean Code” Tips That Will Dramatically Improve Your Productivity

10-15

How AI Will Change Healthcare

10-15

In Memoriam： Manfred te Grotenhuis

10-15

Deep learning, hydroponics, and medical marijuana

10-15

If you did not already know

10-15

ABC intro for Astrophysics

10-15

Distilled News

10-14

Running R scripts within in-database SQL Server Machine Learning

10-14

Random Walk of Pi – Another ggplot2 Experiment

10-14

Visualising Networks in ASOIAF – Part II

10-14

If you did not already know

10-14

Monotonic Binning with Equal-Sized Bads for Scorecard Development

10-14

Introducing the New Zealand Trade Intelligence Dashboard

10-14

Statistics Sunday： Some Psychometric Tricks in R

10-14

If you did not already know

10-14

Document worth reading： “Vector and Matrix Optimal Mass Transport： Theory, Algorithm, and Applications”

10-14

R Packages worth a look

10-14

He had a sudden cardiac arrest. How does this change the probability that he has a particular genetic condition?

10-14

2019

Untitled

01-13

2018

R Packages worth a look

10-13

Document worth reading： “Data Curation with Deep Learning [Vision]： Towards Self Driving Data Curation”

10-13

Understanding Chicago’s homicide spike; comparisons to other cities

10-13

Whats new on arXiv

10-13

How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest?

10-13

GitHub Streak： Round Five

10-13

Open Workshop： Deep Learning in R and Keras, November 14th in Frankfurt

10-13

Piping into ggplot2

10-13

RcppNLoptExample 0.0.1： Use NLopt from C/C++

10-13

Prophets of gloom： Using NLP to analyze Radiohead lyrics

10-13

New package in CRAN： PkgsFromFiles

10-13

Piping into ggplot2

10-13

Learn the top things to look for in an AI Vendor

10-12

Review： Excel TV’s Data Science with Power BI and R

10-12

Temple University： Faculty Positions (Assistant/Associate/Full Professor) [Philadelphia, PA]

10-12

binb 0.0.3： Now with Monash

10-12

Animated River Flow Revisited

10-12

Writing Code to Read Quotes About Writing Code

10-12

New Poll： What was the largest dataset you analyzed / data mined?

10-12

Limitations of “Limitations of Bayesian Leave-One-Out Cross-Validation for Model Selection”

10-12

Modeling Airbnb prices

10-12

Because it's Friday： Hey, it's Enrico Pallazzo!

10-12

Hitchhiker's guide to Exploratory Data Analysis

10-12

Is it time to stop using sentinel values for null / "NA" values?

10-12

Sketchnotes from TWiML&AI： Evaluating Model Explainability Methods with Sara Hooker

10-12

The Economist's Big Mac Index is calculated with R

10-12

Distilled News

10-12

Stan on the web! (thanks to RStudio)

10-12

R Packages worth a look

10-12

If you did not already know

10-12

Top Blockchain Applications Making Waves in Commercial Real Estate

10-12

Gold-Mining Week 6 (2018)

10-12

Join us for DataTech19, Scotland’s first technical data science conference as part of DataFest

10-12

The Economist's Big Mac Index is calculated with R

10-12

Because it's Friday： Hey, it's Enrico Pallazzo!

10-12

Machine learning — Is the emperor wearing clothes?

10-12

The Economist’s Big Mac Index is calculated with R

10-12

We Sized Washington’s Edible Marijuana Market Using AI

10-12

Whats new on arXiv

10-11

Why are functional programming languages so popular in the programming languages community?

10-11

Document worth reading： “Automatic Rumor Detection on Microblogs： A Survey”

10-11

DataCamp： Part-time Contract Instructors [Remote]

10-11

How R gets built on Windows

10-11

If you did not already know

10-11

How R gets built on Windows

10-11

SQL, Python, & R： All in One Platform

10-11

cransays - Follow your R Package Journey to CRANterbury with our Dashboard!

10-11

Power Bat – How Spektacom is Powering the Game of Cricket with Microsoft AI

10-11

Business Analysis (BA) Career Path

10-11

A/B Testing： The Definitive Guide to Improving Your Product

10-11

The One reason you should learn Python

10-11

Evaluating the Business Value of Predictive Models in Python and R

10-11

Decolonising Artificial Intelligence

10-11

Machine Reading Comprehension： Learning to Ask & Answer

10-11

Using Confusion Matrices to Quantify the Cost of Being Wrong

10-11

Guest Post： Galin Jones on criteria for promotion and tenure in (bio)statistics departments

10-11

Document worth reading： “The Risk of Machine Learning”

10-11

Distilled News

10-11

Top KDnuggets tweets, Oct 3–9： 5 Reasons Logistic Regression should be the first thing you learn when becoming a Data Scientist

10-10

Top 10 Mistakes to Avoid to Master Data Science

10-10

Whats new on arXiv

10-10

TDWI In-Person and Virtual Data and Analytics Training

10-10

If you did not already know

10-10

Amazon Comprehend introduces new Region availability and language support for French, German, Italian, and Portuguese

10-10

If you did not already know

10-10

Paris Machine Learning

10-10

a4 Media： Manager, Machine Learning Data Engineer [Long Island City, NY]

10-10

Announcing Ursa Labs's partnership with NVIDIA

10-10

Preprocessing for Deep Learning： From covariance matrix to image whitening

10-10

How to get a Data Science Job in 6 Months

10-10

KDnuggets™ News 18：n38, Oct 10： Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild

10-10

Life in Madrid seen through BiciMAD

10-10

Top September Stories： Essential Math for Data Science： Why and How; Machine Learning Cheat Sheets

10-10

R Packages worth a look

10-10

The Golden Rule of Nudge

10-10

10 Best Mobile Apps for Data Scientist / Data Analysts

10-10

Data Mining Book： Chapter Download.

10-10

Document worth reading： “Deep Learning for Generic Object Detection： A Survey”

10-10

All About Open Source

10-09

If you did not already know

10-09

Document worth reading： “An Introduction to Inductive Statistical Inference — from Parameter Estimation to Decision-Making”

10-09

Shopper Sentiment： Analyzing in-store customer experience

10-09

R Consortium grant applications due October 31

10-09

TEXATA Data Analytics Summit 2018 – Exclusive 30% KDnuggets Discount.

10-09

Building an Image Classifier Running on Raspberry Pi

10-09

R Consortium grant applications due October 31

10-09

Distilled News

10-09

Leading the Charge 🔌 🚘： 10 Charts on Electric Vehicles in Plotly

10-09

How To Learn Data Science If You’re Broke

10-09

Whats new on arXiv

10-09

Whats new on arXiv

10-09

Whats new on arXiv

10-09

Top 8 Python Machine Learning Libraries

10-09

Semantic Interoperability： Are you training your AI by mixing data sources that look the same but aren’t?

10-09

Learning Acrobatics by Watching YouTube

10-09

R Packages worth a look

10-09

Processing complicated package outputs

10-09

Simulating the iSight Camera in the iOS Simulator

10-09

Track the number of coffees consumed using AWS DeepLens

10-09

Distilled News

10-08

Running the Same Task in Python and R

10-08

Remembering Michael

10-08

The economic consequences of MOOCs

10-08

Things you should know when traveling via the Big Data Engineering hype-train

10-08

Don’t Peek： Deep Learning without looking … at test data

10-08

Keras vs. TensorFlow – Which one is better and which one should I learn?

10-08

R Packages worth a look

10-08

Big career opportunities in big data

10-08

Rising test scores . . . reported as stagnant test scores

10-08

Fewer Headaches (Thanks to Data Science)

10-08

BIG, small or Right Data： Which is the proper focus?

10-08

A Neural Architecture for Bayesian Compressive Sensing over the Simplex via Laplace Techniques

10-08

If you did not already know

10-08

Job： Postdoctoral Researcher in Small Data Deep Learning and Explainable Machine Learning, Livermore, CA

10-08

Tidyverse 'Starts_with' in M/Power Query

10-08

Document worth reading： “Big Data Systems Meet Machine Learning Challenges： Towards Big Data Science as a Service”

10-07

Bayesian inference and religious belief

10-07

If you did not already know

10-07

R Packages worth a look

10-07

Sunday Morning Video (in French)： Les travaux de Grothendieck sur les espaces de Banach, Gilles Pisier (Lectures grothendieckiennes)

10-07

Document worth reading： “Learning Tree Distributions by Hidden Markov Models”

10-07

“Fudged statistics on the Iraq War death toll are still circulating today”

10-06

R Packages worth a look

10-06

R Packages worth a look

10-06

Document worth reading： “An Analysis of Hierarchical Text Classification Using Word Embeddings”

10-06

Present each others’ posters

10-06

Distilled News

10-06

Distilled News

10-06

Quick Significance Calculations for A/B Tests in R

10-06

A Concise Explanation of Learning Algorithms with the Mitchell Paradigm

10-05

Here is How You Can build a data science team from scratch in 2018 - The Definitive Guide

10-05

A few upcoming R conferences

10-05

University of Nebraska at Omaha： Faculty Position in Computer Science [Omaha, NE]

10-05

Proof that 1/7 is a repeated decimal

10-05

Challenges & Solutions for Production Recommendation Systems

10-05

Online Master’s in Applied Data Science From Syracuse

10-05

Basic Image Data Analysis Using Python – Part 4

10-05

If you did not already know

10-05

Document worth reading： “Detecting Dead Weights and Units in Neural Networks”

10-05

Multithreaded in the Wild

10-05

A few upcoming R conferences

10-05

Distilled News

10-05

Magister Dixit

10-05

If you did not already know

10-05

Journey from Non-Technical background to an expert in Data Science

10-05

Why do I Call Myself a Data Scientist?

10-05

Colorado State University： Assistant Professor in Industrial and Organizational (IO) Psychology [Fort Collins, CO]

10-05

“Ivy League Football Saw Large Reduction in Concussions After New Kickoff Rules”

10-05

Accelerate model training using faster Pipe mode on Amazon SageMaker

10-05

Whats new on arXiv

10-05

UnitedHealth Group： UHC Digital Director of Project Management [Minnetonka, MN]

10-04

Semantic Segmentation： Wiki, Applications and Resources

10-04

3 Stages of Creating Smart

10-04

Short Article Reveals the Undeniable Facts About College Essay Writing Service and How It Can Affect You

10-04

If you did not already know

10-04

✚ This is Misleading, This is Not Really Misleading

10-04

Amazon SageMaker Neural Topic Model now supports auxiliary vocabulary channel, new topic evaluation metrics, and training subsampling

10-04

Whats new on arXiv

10-04

Chromebook Data Science

10-04

Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 2

10-04

Speed Up With Microsoft

10-04

UnitedHealth Group： UHC Digital Project Manager [Minnetonka, MN]

10-04

Whats new on arXiv

10-04

What Does it Take to Train Deep Learning Models On-Device?

10-04

Beyond text： How Spokata uses Amazon Polly to make news and information universally accessible as real-time audio

10-04

Understand Why ODSC is the Most Recommended Conference for Applied Data Science

10-04

Society of Machines： The Complex Interaction of Agents

10-04

Why Almost Everything You’ve Learned About Cheap Custom Essay Is Wrong and What You Should Know

10-04

Data Notes： Are Those Honey Bees Healthy?

10-04

Distilled News

10-04

R Packages worth a look

10-04

UnitedHealth Group： Sr .Net Web Developer, UHC E&I [Indianapolis, IN or Green Bay, WI]

10-04

Top 10 Mistakes to Avoid to Master Data Science

10-04

“Six Signs of Scientism”： where I disagree with Haack

10-04

Big Data Day Camp： Big Data Tools & Techniques (October 25-26)

10-04

Deep Learning Without Labels

10-03

Linear Regression in the Wild

10-03

How to use common workflows on Amazon SageMaker notebook instances

10-03

Data Science at Northwestern

10-03

R Packages worth a look

10-03

Document worth reading： “An Introduction to Mathematical Optimal Control Theory Version 0.2”

10-03

“Snip Insights” – An Open Source Cross-Platform AI Tool for Intelligent Screen Capture

10-03

R Packages worth a look

10-03

Top 3 Trends in Deep Learning

10-03

Top KDnuggets tweets, Sep 26 – Oct 2： Why building your own Deep Learning Computer is 10x cheaper than AWS; 6 Steps To Write Any Machine Learning Algorithm

10-03

If you did not already know

10-03

KDnuggets™ News 18：n37, Oct 3： Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R

10-03

Document worth reading： “Bayesian model reduction”

10-03

PyTorch 1.0 preview now available in Amazon SageMaker and the AWS Deep Learning AMIs

10-03

Python Dictionary Tutorial

10-03

Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning： October and Beyond

10-03

Mapping opportunity for children, based on where they grew up

10-03

Build a Predictive Maintenance Engine with GIS Data

10-03

Whats new on arXiv

10-03

In case you missed it： September 2018 roundup

10-03

Sequence Modeling with Neural Networks – Part I

10-03

David Weakliem points out that both economic and cultural issues can be more or less “moralized.”

10-03

5 Reasons Why You Should Use Cross-Validation in Your Data Science Projects

10-02

How to Create a Simple Neural Network in Python

10-02

DevOps 2.0： Applying Machine Learning in the CI/CD Chain

10-02

If you did not already know

10-02

Cool postdoc position in Arizona on forestry forecasting using tree ring models!

10-02

Shifting Causes of Death

10-02

R Packages worth a look

10-02

Whats new on arXiv

10-02

AI, Machine Learning and Data Science Announcements from Microsoft Ignite

10-02

Unleash a Faster Python on Your Data

10-02

The Enterprise AI Lab： Not Your Average AI Lab

10-02

“Moral cowardice requires choice and action.”

10-02

Magister Dixit

10-02

TEXATA Data Analytics Summit 2018 – Exclusive 30% KDnuggets Discount

10-01

Modeling multi-category Outcomes With vtreat

10-01

Chromebook Data Science - a free online data science program for anyone with a web browser.

10-01

A Right to Reasonable Inferences

10-01

Import AI 114： Synthetic images take a big leap forward with BigGANs; US lawmakers call for national AI strategy; researchers probe language reasoning via HotspotQA

10-01

Up your open source game with Hacktoberfest at Locke Data!

10-01

A Review of the Neural History of Natural Language Processing

10-01

Reinforcement Learning： Super Mario, AlphaGo and beyond

10-01

Distilled News

10-01

Bob Erikson on the 2018 Midterms

10-01

Dr. Data Show Video： Why Machine Learning Is the Coolest Science

10-01

Document worth reading： “Generative Adversarial Nets for Information Retrieval： Fundamentals and Advances”

10-01

PyImageConf 2018 Recap

10-01

Robust Quality – Powerful Integration of Data Science and Process Engineering

10-01

Distilled News

10-01

My talk tomorrow (Tues) 4pm in the Biomedical Informatics department (at 168th St)

10-01

A Three Month Data Analysis in Excel Could Have Taken Me One Day

10-01

R Packages worth a look

09-30

What do you do when someone says, “The quote is, this is the exact quote”—and then misquotes you?

09-30

R Packages worth a look

09-30

Document worth reading： “Importance of the Mathematical Foundations of Machine Learning Methods for Scientific and Engineering Applications”

09-30

If you did not already know

09-30

If you did not already know

09-30

Distilled News

09-30

Overlapping Disks

09-30

Statistical Modeling, Causal Inference, and Social Science Regrets Its Decision to Hire Cannibal P-hacker as Writer-at-Large

09-29

Document worth reading： “Physically optimizing inference”

09-29

Document worth reading： “How to Maximize the Spread of Social Influence： A Survey”

09-29

R Packages worth a look

09-29

Python Vs R： The Eternal Question for Data Scientists

09-29

If you did not already know

09-29

Functions and Packages

09-29

Document worth reading： “A Survey on Expert Recommendation in Community Question Answering”

09-28

The complex process of obtaining Puerto Rico mortality data： a timeline

09-28

Distilled News

09-28

Whats new on arXiv

09-28

The Markup is a new journalism venture to examine technology through data

09-28

Machine Learning and Deep Learning： Differences

09-28

If you did not already know

09-28

Whats new on arXiv

09-28

Whats new on arXiv

09-27

If you did not already know

09-27

Magister Dixit

09-27

R Packages worth a look

09-27

(People are missing the point on Wansink, so) what’s the lesson we should be drawing from this story?

09-27

Deploy your own TensorFlow object detection model to AWS DeepLens

09-27

Implement Simple Convolution with Java

09-27

Your Guide to AI and Machine Learning at re：Invent 2018

09-27

Document worth reading： “Data Science vs. Statistics： Two Cultures”

09-27

Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 1

09-26

A potential big problem with placebo tests in econometrics： they’re subject to the “difference between significant and non-significant is not itself statistically significant” issue

09-26

Using Stacking to Average Bayesian Predictive Distributions (with Discussion)

09-26

Advantages of Online Data Science Courses

09-26

R Packages worth a look

09-26

Document worth reading： “Human-Machine Inference Networks For Smart Decision Making： Opportunities and Challenges”

09-26

The Price of Transformation

09-26

3-D shadow maps in R： the rayshader package

09-26

R Packages worth a look

09-26

Job opening at CDC： “The Statistician will play a central role in guiding the statistical methods of all major projects of the Epidemiology and Prevention Branch of the CDC Influenza Division, and aid in designing, analyzing, and interpreting research intended to understand the burden of influenza in the US and internationally and identify the best influenza vaccines and vaccine strategies.”

09-26

If you did not already know

09-26

Can AI Generate Programs to Help Automate Busy Work?

09-26

Understanding Regression Error Metrics

09-26

Inside Higher Ed： Pushing the Boundaries of Learning With AI

09-26

Distilled News

09-26

Distilled News

09-26

Morph, an open-source tool for data-driven art without code

09-26

You’ve got data on 35 countries, but it’s really just N=3 groups.

09-25

Help improve lives through Machine Learning by joining the AWS DeepLens Challenge!

09-25

A Better Example of the Confused By The Environment Issue

09-25

Whats new on arXiv

09-25

If you did not already know

09-25

One Drink Per Day, Your Chances of Developing an Alcohol-Related Condition

09-25

Document worth reading： “Data Innovation for International Development： An overview of natural language processing for qualitative data analysis”

09-25

Whats new on arXiv

09-25

Amazon SageMaker automatic model tuning produces better models, faster

09-25

Distilled News

09-25

Distilled News

09-24

If you did not already know

09-24

R Packages worth a look

09-24

Don’t calculate post-hoc power using observed estimate of effect size

09-24

A Subtle Flaw in Some Popular R NSE Interfaces

09-24

R Packages worth a look

09-24

Python Vs R： The Eternal Question for Data Scientists

09-24

How to Optimise Ad CTR with Reinforcement Learning

09-24

Dataquest helped me get my dream job at Noodle.ai

09-24

Whats new on arXiv

09-24

“Tweeking”： The big problem is not where you think it is.

09-23

Document worth reading： “Graph-based Ontology Summarization： A Survey”

09-23

Distilled News

09-23

Document worth reading： “On the Learning Dynamics of Deep Neural Networks”

09-23

R Packages worth a look

09-23

R Packages worth a look

09-22

Distilled News

09-22

If you did not already know

09-22

Document worth reading： “Do Deep Learning Models Have Too Many Parameters An Information Theory Viewpoint”

09-22

Multilevel data collection and analysis for weight training (with R code)

09-22

A psychology researcher uses Stan, multiverse, and open data exploration to explore human memory

09-21

Timing Column Indexing in R

09-21

Whats new on arXiv

09-21

Nextgov： Machine Learning Could Help Chip Away at the Security Clearance Backlog

09-21

This New [AI] Software Constantly Improves – and that Makes all the Difference

09-21

If you did not already know

09-21

Data Projects WILL Fail - Learn to Fail Quickly & Efficiently

09-21

R Packages worth a look

09-21

Whats new on arXiv

09-21

Using a Column as a Column Index

09-21

The rise and plummet of the name Heather

09-21

How to Implement AI-First Business Models at Scale

09-21

If you did not already know

09-21

How Pol Brigneti got a Data Analyst job using Dataquest at Belgrave Valley

09-21

Applications of R presented at EARL London 2018

09-21

AI-Based Virtual Tutors – The Future of Education?

09-21

Distilled News

09-20

Whats new on arXiv

09-20

✚ Chart Components and Working On Your Graphics Piece-wise

09-20

The Best Programming Languages for Data Science and Machine Learning in 2018

09-20

AI, Machine Learning and Data Science Roundup： September 2018

09-20

What is P-value?

09-20

Data Notes： How Do Autoencoders Work?

09-20

Post-publication peer review： who’s qualified?

09-20

If you did not already know

09-20

Magister Dixit

09-20

Learning Statistics Online for Data Science

09-20

Judging connectedness of American communities, based on Facebook friendships

09-20

Document worth reading： “Automatic Language Identification in Texts： A Survey”

09-20

Discovering and indexing podcast episodes using Amazon Transcribe and Amazon Comprehend

09-20

R Packages worth a look

09-20

How to graph a function of 4 variables using a grid

09-20

PyConUK 2018

09-19

Distilled News

09-19

Document worth reading： “Decision-Making with Belief Functions： a Review”

09-19

Help! I can’t reproduce a machine learning project!

09-19

Three Mighty Good Reasons to Learn R for Data Science

09-19

New Engen improves customer acquisition marketing campaigns using Amazon Rekognition

09-19

R Packages worth a look

09-19

Whats new on arXiv

09-19

A couple more papers on genetic diversity as an explanation for why Africa and remote Andean countries are so poor while Europe and North America are so wealthy

09-19

The hot hand—in darts!

09-18

R Packages worth a look

09-18

Training models with unequal economic error costs using Amazon SageMaker

09-18

Document worth reading： “The Three Pillars of Machine-Based Programming”

09-18

Understanding Different Components & Roles in Data Science

09-18

How to generalize (algorithmically)

09-18

If you did not already know

09-18

Variety is the Secret Sauce for Big Discoveries in Big Data

09-18

Save time and money by filtering faces during indexing with Amazon Rekognition

09-18

Dataiku： “Multimodal Force Majeure” Among Predictive Analytics & ML Platforms

09-18

Document worth reading： “On-Disk Data Processing： Issues and Future Directions”

09-18

Not Hotdog： A Shiny app using the Custom Vision API

09-18

Whats new on arXiv

09-18

Why, oh why, do so many people embrace the Pacific Garbage Cleanup nonsense? (I have a theory).

09-18

Cuisine Ingredients

09-18

Distilled News

09-18

Cuisine Ingredients

09-18

BRUNO： A Deep Recurrent Model for Exchangeable Data

09-17

R Packages worth a look

09-17

If you did not already know

09-17

Deep learning made easier with transfer learning

09-17

How to Optimise Ad CTR with Reinforcement Learning

09-17

What to do when your measured outcome doesn’t quite line up with what you’re interested in?

09-17

Distilled News

09-17

If you did not already know

09-17

Whats new on arXiv

09-17

Distilled News

09-17

Monotonicity constraints in machine learning

09-16

Don’t get fooled by observational correlations

09-16

Document worth reading： “Introduction to Nonnegative Matrix Factorization”

09-16

Document worth reading： “A practical tutorial on autoencoders for nonlinear feature fusion： Taxonomy, models, software and guidelines”

09-16

R Packages worth a look

09-16

Parameterizing with bquote

09-16

Columbia Data Science Institute art contest

09-16

Distilled News

09-15

High-profile statistical errors occur in the physical sciences too, it’s not just a problem in social science.

09-15

Better R Code with wrapr Dot Arrow

09-15

R Packages worth a look

09-15

If you did not already know

09-15

On “Competition” in the R Ecosystem

09-15

If you did not already know

09-15

Limit access to a Jupyter notebook instance by IP address

09-14

Because it's Friday： Hurricane Trackers

09-14

Whats new on arXiv

09-14

Document worth reading： “Closing the AI Knowledge Gap”

09-14

Waffle House index as a storm indicator

09-14

Distilled News

09-14

Whats new on arXiv

09-14

Echo Chamber Incites Online Mob to Attack Math Profs

09-14

Waffle House index as a storm indicator

09-14

How many deaths were caused by the hurricane in Puerto Rico?

09-14

Meet Zhiyu—the first Mandarin Chinese voice for Amazon Polly

09-14

Divergent and Convergent Phases of Data Analysis

09-14

Winner Interview | Particle Tracking Challenge first runner-up, Pei-Lien Chou

09-14

Whats new on arXiv

09-13

N=1 survey tells me Cynthia Nixon will lose by a lot (no joke)

09-13

✚ Google Dataset Search Impressions, the Challenges of Looking for Data, and Other Places to Find Data

09-13

Mapillary uses Amazon Rekognition to work towards building parking solutions for US cities

09-13

Classifying high-resolution chest x-ray medical images with Amazon SageMaker

09-13

Discussion of effects of growth mindset： Let’s not demand unrealistic effect sizes.

09-13

The Waiting Time Paradox, or, Why Is My Bus Always Late?

09-13

R Packages worth a look

09-13

Document worth reading： “A Taxonomy for Neural Memory Networks”

09-13

Announcing wrapr 1.6.2

09-13

Dataiku 5.0： Enterprise AI Within Reach

09-12

Hurricane Florence trackers

09-12

Against Arianism 2： Arianism Grande

09-12

The Benefits of Active Learning for Data Science Skills

09-12

R Packages worth a look

09-12

Distilled News

09-12

Practical Data Science with R2

09-12

If you did not already know

09-12

Java Home Made Face Recognition Application

09-12

Narcolepsy Could Be ‘Sleeper Effect’ in Trump and Brexit Campaigns

09-12

If not Notebooks, then what? Look to Literate Programming

09-12

Whats new on arXiv

09-12

Data Center Scale Computing and Artificial Intelligence with Matei Zaharia, Inventor of Apache Spark

09-12

Run SQL queries from your SageMaker notebooks using Amazon Athena

09-12

Data Science Glossary

09-12

Document worth reading： “Analytics for the Internet of Things： A Survey”

09-12

If you did not already know

09-11

If you did not already know

09-11

Distilled News

09-11

Distilled News

09-11

Whats new on arXiv

09-11

Mouse Among the Cats

09-11

R Packages worth a look

09-11

Whats new on arXiv

09-11

Google Dataset Search ： Google’s New Data Search Engine

09-10

Big Data ： Meaning, Components, Collection & Analysis

09-10

Import AI 111： Hacking computers with Generative Adversarial Networks, Facebook trains world-class speech translation in 85 minutes via 128 GPUs, and Europeans use AI to classify 1,000-year-old graffiti.

09-10

A Quick Appreciation of the R transform Function

09-10

Researchers.one： A souped-up Arxiv with pre- and post-publication review

09-10

R Packages worth a look

09-10

If you did not already know

09-10

Document worth reading： “Quantizing deep convolutional networks for efficient inference： A whitepaper”

09-10

What if a big study is done and nobody reports it?

09-10

Document worth reading： “Concept Tagging for Natural Language Understanding： Two Decadelong Algorithm Development”

09-10

Why Would Prosthetic Arms Need to See or Connect to Cloud AI?

09-10

Magister Dixit

09-09

“Check out table 4.”

09-09

If you did not already know

09-09

Document worth reading： “Accelerating CNN inference on FPGAs： A Survey”

09-08

Distilled News

09-08

R Tip： Give data.table a Try

09-08

If you did not already know

09-08

R Packages worth a look

09-08

Whats new on arXiv

09-08

“It’s Always Sunny in Correlationville： Stories in Science,” or, Science should not be a game of Botticelli

09-08

Naive Bayes Classifier： A Geometric Analysis of the Naivete. Part 1

09-07

Being at the Center

09-07

Whats new on arXiv

09-07

R Packages worth a look

09-07

Connected Arms – Can AI Revolutionize Prosthetic Devices & Make them More Affordable?

09-07

Whats new on arXiv

09-07

Cosmos DB for Data Science

09-07

Distilled News

09-07

Multithreaded in the Wild

09-07

Get started with automated metadata extraction using the AWS Media Analysis Solution

09-07

How to scrape data from a website using Python

09-07

Whats new on arXiv

09-07

Whats new on arXiv

09-07

Visualization in the 1980s, just before the rise of computers

09-07

Document worth reading： “Putting Data Science In Production”

09-07

Magister Dixit

09-07

Welcome to Dataiku University!

09-07

Bothered by non-monotonicity? Here’s ONE QUICK TRICK to make you happy.

09-07

The Blessings of Multiple Causes： Causal Inference when you Can't Measure Confounders

09-07

Who wrote that anonymous NYT op-ed? Text similarity analyses with R

09-07

Mirroring an FTP Using lftp and cron

09-06

If you did not already know

09-06

“Dynamically Rescaled Hamiltonian Monte Carlo for Bayesian Hierarchical Models”

09-06

Document worth reading： “Data learning from big data”

09-06

How Foundations Student Russell Martin got into The Data Incubator’s Fellowship

09-06

R Packages worth a look

09-06

In case you missed it： August 2018 roundup

09-06

Distilled News

09-06

The gaps between 1, 2, and 3 are just too large.

09-06

What is Neural Network?

09-06

Visual Reinforcement Learning with Imagined Goals

09-06

Data Notes： The Secret to Getting to a Second Date

09-06

Google Dataset Search now in public beta

09-06

R Packages worth a look

09-06

See How AI is Inspiring the Next Generation of Developers

09-05

Visual search on AWS—Part 2： Deployment with AWS DeepLens

09-05

Against Winner-Take-All Attribution

09-05

No code chatbots： TIBCO uses Amazon Lex to put chat interfaces into the hands of business users

09-05

If you did not already know

09-05

R Packages worth a look

09-05

British journalists not running corrections and talking about putting people in the freezer

09-05

Aosta Valley, Italy

09-05

Putting the Power of Kafka into the Hands of Data Scientists

09-05

If you did not already know

09-05

Data Science Portfolio Project： Where to Advertise an E-learning Product

09-05

If you did not already know

09-04

Distilled News

09-04

GovernmentCIO： How AI Can Stop Doctors Likely to Overprescribe Opioids — and Stem the Crisis

09-04

Book review： SQL Server 2017 Machine Learning Services with R

09-04

Robert Heinlein vs. Lawrence Summers

09-04

StanCon 2018 Helsinki tutorial videos online

09-04

Streamlining Production with Predictive Maintenance and Essilor

09-04

Three Operator Splitting

09-04

“We continuously increased the number of animals until statistical significance was reached to support our conclusions” . . . I think this is not so bad, actually!

09-04

Document worth reading： “Interpreting Deep Learning： The Machine Learning Rorschach Test”

09-04

Vulcan Post： This AI Startup Is Run By The World’s Top Data Scientists – Lets Anyone Build Predictive Models

09-04

Distilled News

09-04

Document worth reading： “Artificial Intelligence and Robotics”

09-04

The Data Science Roadshow is ON!

09-03

Logistic Regression： Concept & Application

09-03

A.I. parity with the West in 2020

09-03

R Packages worth a look

09-03

Document worth reading： “Psychological State in Text： A Limitation of Sentiment Analysis”

09-03

Whats new on arXiv

09-03

R Packages worth a look

09-03

Human Fuel Consumption

09-02

How to set up a voting system for a Hall of Fame?

09-02

If you did not already know

09-02

Document worth reading： “A Survey on Influence Maximization in a Social Network”

09-02

R Packages worth a look

09-02

Hey—take this psychological science replication quiz!

09-02

Unfolding Naïve Bayes From Scratch!

09-02

Magister Dixit

09-02

If you did not already know

09-01

John Hattie’s “Visible Learning”： How much should we trust this influential review of education research?

09-01

Magister Dixit

09-01

Whats new on arXiv

09-01

If you did not already know

09-01

R Tip： How to Pass a formula to lm

09-01

R Packages worth a look

09-01

Distilled News

08-31

Dexterous Manipulation with Reinforcement Learning： Efficient, General, and Low-Cost

08-31

“Identification of and correction for publication bias,” and another discussion of how forking paths is not the same thing as file drawer

08-31

Amazon SageMaker runtime now supports the CustomAttributes header

08-31

Because it's Friday： The Curiosity Show

08-31

Magister Dixit

08-31

Whats new on arXiv

08-31

A Deep (But Jargon and Math Free) Dive Into Deep Learning

08-31

Distilled News

08-31

Document worth reading： “PMLB： A Large Benchmark Suite for Machine Learning Evaluation and Comparison”

08-31

Counting baseball cliches

08-31

Guide to a high-performance, powerful R installation

08-31

If you did not already know

08-30

If you did not already know

08-30

3 recent movies from the 50s and the 70s

08-30

Distilled News

08-30

Visual search on AWS—Part 1： Engine implementation with Amazon SageMaker

08-30

R Packages worth a look

08-30

Turn Whiteboard UX Sketches into Working HTML in Seconds – Introducing Sketch2Code

08-30

R Packages worth a look

08-30

Understanding Different Components & Roles in Data Science

08-30

R Tip： Put Your Values in Columns

08-30

Tips for analyzing Excel data in R

08-30

Document worth reading： “Idealised Bayesian Neural Networks Cannot Have Adversarial Examples： Theoretical and Empirical Study”

08-29

Whats new on arXiv

08-29

Some clues that this study has big big problems

08-29

Distilled News

08-29

Synesthesia： The Sound of Style

08-29

Use Kaggle to start (and guide) your ML/ Data Science journey — Why and How

08-29

If you did not already know

08-29

Document worth reading： “A Comparative Study on using Principle Component Analysis with Different Text Classifiers”

08-29

R Packages worth a look

08-29

Access Amazon S3 data managed by AWS Glue Data Catalog from Amazon SageMaker notebooks

08-29

Whats new on arXiv

08-28

Whats new on arXiv

08-28

Distilled News

08-28

Whats new on arXiv

08-28

Document worth reading： “What am I searching for?”

08-28

Videos from NYC R Conference

08-28

Old school

08-28

How I got in the top 1 % on Kaggle.

08-28

Pixm takes on phishing attacks with deep learning using Apache MXNet on AWS

08-28

“To get started, I suggest coming up with a simple but reasonable model for missingness, then simulate fake complete data followed by a fake missingness pattern, and check that you can recover your missing-data model and your complete data model in that fake-data situation. You can then proceed from there. But if you can’t even do it with fake data, you’re sunk.”

08-27

Amazon Transcribe now supports multi-channel transcriptions

08-27

R Packages worth a look

08-27

R Packages worth a look

08-27

If you did not already know

08-27

If you did not already know

08-27

Document worth reading： “A Tutorial on Network Embeddings”

08-26

Forbes： 25 Machine Learning Startups to Watch in 2018

08-26

Bayesian model comparison in ecology

08-26

Document worth reading： “The State of the Art in Developing Fuzzy Ontologies： A Survey”

08-26

If you did not already know

08-26

In statistics, we talk about uncertainty without it being viewed as undesirable

08-25

Distilled News

08-25

If you did not already know

08-25

R Packages worth a look

08-25

Document worth reading： “Nonnegative Matrix Factorization for Signal and Data Analytics： Identifiability, Algorithms, and Applications”

08-25

When anyone claims 80% power, I’m skeptical.

08-24

Whats new on arXiv

08-24

R Packages worth a look

08-24

Weighing the risk of moderate alcohol consumption

08-24

Because it's Friday： One Million Integers

08-24

Microsoft Weekly Data Science News for August 24, 2018

08-24

Whats new on arXiv

08-24

The Chartmaker Directory： Data visualizations in every tool

08-24

R Objects

08-24

R Packages worth a look

08-24

World map shows aerosol billowing in the wind

08-24

If you did not already know

08-24

Constructing a Data Analysis

08-24

What is a Box Plot?

08-24

Problems in a published article on food security in the Lower Mekong Basin

08-23

Timings of a Grouped Rank Filter Task

08-23

Data Notes： Drought and the War in Syria

08-23

Document worth reading： “An Information-Theoretic Analysis of Deep Latent-Variable Models”

08-23

If you did not already know

08-23

Video： Azure Machine Learning in plain English

08-23

Distilled News

08-23

3-D-Printed Time Series Plates

08-23

R Packages worth a look

08-23

Getting Started with Competitions - A Peer to Peer Guide

08-22

Why you can't have privacy on the internet

08-22

Distilled News

08-22

Who spends how much, and on what?

08-22

Whats new on arXiv

08-22

Document worth reading： “Fog Computing： Survey of Trends, Architectures, Requirements, and Research Directions”

08-22

R Packages worth a look

08-22

Document worth reading： “Applications of Artificial Intelligence to Network Security”

08-22

Who spends how much, and on what?

08-22

Create a translator chatbot using Amazon Translate and Amazon Lex

08-22

New speed record set for training deep learning models on AWS

08-22

Using gganimate to illustrate the luminance illusion

08-22

If you did not already know

08-22

What to Consider When Choosing Colors for Data Visualization

08-22

What data scientists really do

08-21

Distilled News

08-21

Data concerns when interpreting comparisons of gender equality between countries

08-21

Data concerns when interpreting comparisons of gender equality between countries

08-21

Magister Dixit

08-21

R tip： Use Radix Sort

08-21

Creating a MapD ODBC Connection in RStudio Server

08-21

The scandal isn’t what’s retracted, the scandal is what’s not retracted.

08-21

Managing your expenses with Amazon Lex

08-21

Against Arianism

08-21

Distilled News

08-21

Whats new on arXiv

08-21

The scandal isn’t what’s retracted, the scandal is what’s not retracted.

08-21

Against Arianism

08-21

If you did not already know

08-21

Whats new on arXiv

08-21

Fake News and Filter Bubbles

08-21

R Packages worth a look

08-21

Whats new on arXiv

08-21

The competing narratives of scientific revolution

08-20

Import AI： 108： Learning language with fake sentences, Chinese researchers use RL to train prototype warehouse robots; and what the implications are of scaled-up Neural Architecture Search

08-20

Document worth reading： “A Survey on Visual Query Systems in the Web Era (extended version)”

08-20

What is Data Science?

08-20

Bad headlines distract from real AI problems

08-20

Magister Dixit

08-20

Nextgov： DHS Funds Machine Learning Tool to Boost Other Countries’ Airport Security

08-20

Document worth reading： “A rational analysis of curiosity”

08-20

The competing narratives of scientific revolution

08-20

Forecasting financial time series with dynamic deep learning on AWS

08-20

If you did not already know

08-20

Document worth reading： “A Survey on Resilient Machine Learning”

08-19

If you did not already know

08-19

Let’s get hysterical

08-19

R Packages worth a look

08-19

R Packages worth a look

08-19

More Practical Data Science with R Book News

08-19

Let’s get hysterical

08-19

Document worth reading： “Cogniculture： Towards a Better Human-Machine Co-evolution”

08-18

If you did not already know

08-18

R Packages worth a look

08-18

The fallacy of the excluded middle — statistical philosophy edition

08-18

The fallacy of the excluded middle — statistical philosophy edition

08-18

Whats new on arXiv

08-17

If you did not already know

08-17

AI, Machine Learning and Data Science Roundup： August 2018

08-17

Three informal case studies： (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs

08-17

Three informal case studies： (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs

08-17

Distilled News

08-17

Whats new on arXiv

08-17

Multiple Linear Regression & Assumptions of Linear Regression： A-Z

08-17

No, I don’t think it’s the file drawer effect

08-16

No, I don’t think it’s the file drawer effect

08-16

Monitoring the media reaction to Facebook’s disastrous earnings call – News API Monthly Media Review

08-16

Make R speak

08-16

Magister Dixit

08-16

Build a model to predict the impact of weather on urban air quality using Amazon SageMaker

08-16

A visual analysis of jean pockets and their lack of practicality

08-16

Distilled News

08-16

On the growth of our PyDataLondon community

08-16

Distilled News

08-16

Whats new on arXiv

08-16

Document worth reading： “Sequences, yet Functions： The Dual Nature of Data-Stream Processing”

08-16

R Packages worth a look

08-16

✚ Visualization Away from the Computer, Developing Ideas, Bring in the Constraints

08-16

The Law and Order of Data Science

08-15

Document worth reading： “How Important Is a Neuron”

08-15

It should be ok to just publish the data.

08-15

Build an automatic alert system to easily moderate content at scale with Amazon Rekognition Video

08-15

Document worth reading： “Radial Basis Function Approximations： Comparison and Applications”

08-15

Ethical AI for Data Scientists

08-15

R Packages worth a look

08-15

Whats new on arXiv

08-15

Cool tennis-tracking app

08-15

Announcing the Artificial Intelligence (AI) Hackathon： Build Intelligent Applications using machine learning APIs and serverless

08-15

Aella Credit empowers underbanked individuals by using Amazon Rekognition for identity verification

08-15

Announcing Practical Data Science with R, 2nd Edition

08-15

If you did not already know

08-15

If you did not already know

08-15

Data Science Portfolio Project： Is Fandango Still Inflating Ratings?

08-15

Deploy a TensorFlow trained image classification model to AWS DeepLens

08-15

It should be ok to just publish the data.

08-15

Cool tennis-tracking app

08-15

R Packages worth a look

08-15

TINT uses Amazon Comprehend to find and aggregate the best social media content for customers

08-15

It was the weeds that bothered him.

08-14

Building a Linear Regression Model for Real World Problems, in R

08-14

Distill Update 2018

08-14

Document worth reading： “Does putting your emotions into words make you feel better? Measuring the minute-scale dynamics of emotions from online data”

08-14

data.table is Really Good at Sorting

08-14

A transforming river seen from above

08-14

The Microsoft AI Idea Challenge – Breakthrough Ideas Wanted!

08-14

Securing all Amazon SageMaker API calls with AWS PrivateLink

08-14

Distilled News

08-14

It was the weeds that bothered him.

08-14

R Packages worth a look

08-14

Whats new on arXiv

08-14

If you did not already know

08-14

Microsoft R Open 3.5.1 now available

08-14

Amazon Translate now available in the Memsource translation management system

08-14

Curalate makes social sell with AI using Apache MXNet on AWS

08-13

Document worth reading： “Weighted Abstract Dialectical Frameworks： Extended and Revised Report”

08-13

Legal Tech： How Can Lawyers Benefit?

08-13

Probability and Tennis

08-13

How feminism has made me a better scientist

08-13

How feminism has made me a better scientist

08-13

Distilled News

08-13

R Packages worth a look

08-13

Hierarchical Bayesian Neural Networks with Informative Priors

08-13

Whats new on arXiv

08-13

How to Build a Data Science Portfolio

08-13

“Usefully skeptical science journalism”

08-12

“Usefully skeptical science journalism”

08-12

R Packages worth a look

08-12

If you did not already know

08-12

Linear compression in python： PCA vs unsupervised feature selection

08-11

Shared items

08-11

Magister Dixit

08-11

Discussion of the value of a mathematical model for the dissemination of propaganda

08-11

Document worth reading： “Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution”

08-11

Distilled News

08-11

R Packages worth a look

08-11

Discussion of the value of a mathematical model for the dissemination of propaganda

08-11

If you did not already know

08-11

The Magic of LSTM

08-10

Machine Learning Interviews

08-10

GBM

08-10

Jeremy Freese was ahead of the curve

08-10

Document worth reading： “Learning to Succeed while Teaching to Fail： Privacy in Closed Machine Learning Systems”

08-10

Create video subtitles with translation using machine learning

08-10

Document worth reading： “Model-free, Model-based, and General Intelligence”

08-10

Redmonk Language Rankings, June 2018

08-10

Distilled News

08-10

R Packages worth a look

08-10

Whats new on arXiv

08-10

Cryptocurrency： Your Current Options

08-10

If you did not already know

08-10

Amazon Rekognition is now available in the Asia Pacific (Seoul) and Asia Pacific (Mumbai) Regions

08-09

Marinus Analytics fights human trafficking using Amazon Rekognition

08-09

Data Notes： From Hate Speech to Russian Troll Tweets

08-09

What’s gonna happen in the 2018 midterm elections?

08-09

Can You Read My Mind? Analyzing The Killers’ Discography with NLP

08-09

Amazon Lex integration with Genesys PureCloud IVR now available

08-09

Magister Dixit

08-09

✚ Detailed Intentions of a Map, When Everything Leads to Nothing, Designing for Misinterpretations

08-09

DIFFERENCE BETWEEN DATA SCIENCE, DATA ANALYTICS AND MACHINE LEARNING

08-09

AI Meets Mail Processing (Automation for Admin Tasks)

08-09

Lana Del Rey’s Discography through the Lens of Text Analytics

08-09

What is a p-value

08-09

Distilled News

08-09

R Packages worth a look

08-09

Some thoughts after reading “Bad Blood： Secrets and Lies in a Silicon Valley Startup”

08-09

The Trillion Dollar Question

08-09

Whats new on arXiv

08-09

“Richard Jarecki, Doctor Who Conquered Roulette, Dies at 86”

08-09

In case you missed it： July 2018 roundup

08-08

How to Overcome That Awkward Silence in Interviews

08-08

Document worth reading： “Examining the Use of Neural Networks for Feature Extraction： A Comparative Analysis using Deep Learning, Support Vector Machines, and K-Nearest Neighbor Classifiers”

08-08

Distilled News

08-08

Document worth reading： “Mathematics of Deep Learning”

08-08

Whats new on arXiv

08-08

What is “party balancing” and how does it explain midterm elections?

08-08

If you did not already know

08-08

Document worth reading： “A Temporal Difference Reinforcement Learning Theory of Emotion： unifying emotion, cognition and adaptive behavior”

08-07

The Hidden Costs of Data Silos

08-07

xkcd： Disaster Movie

08-07

Distilled News

08-07

R Packages worth a look

08-07

Trapped in the spam folder? Here’s what to do.

08-07

Whats new on arXiv

08-07

IEEE Language Rankings 2018

08-07

“The most important aspect of a statistical analysis is not what you do with the data, it’s what data you use” (survey adjustment edition)

08-07

If you did not already know

08-07

Meta-packages, nails in CRAN’s coffin

08-07

“Optimized” floor plan with genetic algorithms

08-06

Azure Functions for Data Science

08-06

Let’s be open about the evidence for the benefits of open science

08-06

When Recurrent Models Don't Need to be Recurrent

08-06

2018 Data Sources for Cool Data Science Projects, provided by Thinknum

08-06

Quick and Dirty Serverless Integer Programming

08-06

Twilio offers greater voice selection to customers with Amazon Polly integration

08-06

Testing code with random output

08-06

Announcing the Amazon SageMaker MXNet 1.2 container

08-06

Essential Tips and Tricks for Starting Machine Learning with Python

08-05

Scale out your Pandas DataFrame operations using Dask

08-05

Response to Rafa： Why I don’t think ROC [receiver operating characteristic] works as a model for science

08-05

Collecting Expressions in R

08-05

R Packages worth a look

08-05

If you did not already know

08-05

Magister Dixit

08-04

Document worth reading： “Foundations of Complex Event Processing”

08-04

Thorn collaborates with Amazon Rekognition to help fight child sexual abuse and trafficking

08-04

Distilled News

08-04

Don’t call it a bandit

08-04

On Using Hyperopt： Advanced Machine Learning

08-04

Whats new on arXiv

08-04

Thorn partners with Amazon Rekognition to help fight child sexual abuse and trafficking

08-04

Bring your own pre-trained MXNet or TensorFlow models into Amazon SageMaker

08-03

Video： How to run R and Python in SQL Server from a Jupyter notebook

08-03

Artificial Intelligence in the Workplace

08-03

The replication crisis and the political process

08-03

Document worth reading： “Attend Before you Act： Leveraging human visual attention for continual learning”

08-03

R Packages worth a look

08-03

Handling Imbalanced Classes in the Dataset

08-03

On the "we have naughty videos of you" scam

08-03

Whats new on arXiv

08-03

When LOO and other cross-validation approaches are valid

08-03

Distributed Deep Learning on AZTK and HDInsight Spark Clusters

08-02

Use Amazon Mechanical Turk with Amazon SageMaker for supervised learning

08-02

China air pollution regression discontinuity update

08-02

Amazon Polly adds bilingual Indian English/Hindi language support

08-02

Three flavors of data scientist

08-02

Skills that Employers look in a Data Scientist

08-02

Trust The Process

08-02

Document worth reading： “A Reliability Theory of Truth”

08-02

Distilled News

08-02

✚ Wrong Tool, Right Tool, More Tools for Visualization

08-02

Whats new on arXiv

08-02

Download 3 million Russian troll tweets

08-02

If you did not already know

08-02

Continuous tempering through path sampling

08-02

How America uses its land

08-01

“Seeding trials”： medical marketing disguised as science

08-01

Data Makes Possible Many Things： Insights Discovery, Innovation, and Better Decisions

08-01

A glass shattering book draw with gganimate

08-01

TechTarget： Data science in healthcare demands dual focus, expert says

08-01

Tips & Tricks for Starting Your First Data Project

08-01

R Generation： 25 Years of R

08-01

R Packages worth a look

08-01

Build a document search bot using Amazon Lex and Amazon Elasticsearch Service

08-01

Whats new on arXiv

08-01

Thanks, NVIDIA

08-01

If you did not already know

08-01

Is it really true that babies should sleep on their backs?

07-31

What makes the Python Cool.

07-31

New Dynamics for Topic Models

07-31

Document worth reading： “Are Efficient Deep Representations Learnable”

07-31

Recent top-selling books in AI and Machine Learning

07-31

Magister Dixit

07-31

Neural reinterpretations of movie trailers

07-31

Progress in machine learning interpretability

07-31

Amelia, it was just a false alarm

07-31

Distilled News

07-31

R Packages worth a look

07-31

Whats new on arXiv

07-31

3 Steps to Build Your First Intelligent App – Conference Buddy

07-31

Using Entity-level Sentiment Analysis to understand News Content

07-30

Facilitate Proactive Cybersecurity Operations with Big Data Analytics and Machine Intelligence

07-30

Four Ways to Harness Big Data in the Energy Sector

07-30

The file drawer’s on fire!

07-30

Machine Learning Making Big Moves in Marketing

07-30

Quantum Computing： Cats, Crushes, and Chemistry

07-30

A Certification for R Package Quality

07-30

Seasonalities： Bad Period for Stocks?

07-29

Revisiting “Is the scientific paper a fraud?”

07-29

Of Tennys players and moral Hazards

07-28

aRt with code

07-27

What makes Robin Pemantle’s bag of tricks for teaching math so great?

07-27

Keynote at EuroPython 2018 on “Citizen Science”

07-27

Because it's Friday： Street Orientation

07-27

Transfer learning for custom labels using a TensorFlow container and “bring your own algorithm” in Amazon SageMaker

07-27

Why I Indent My Code 8 Spaces

07-27

Thoughts On Machine Learning Accuracy

07-27

First Data Project? Go Tandem! (AVISIA at Play)

07-27

ACL 2018 Highlights： Understanding Representations and Evaluation in More Challenging Settings

07-26

Data Notes： Winning Solutions of Kaggle Competitions

07-26

Awesome MCMC animation site by Chi Feng! On Github!

07-26

Parsimonious principle vs integration over all uncertainties

07-26

How to think about an accelerating string of research successes?

07-26

Grazing and Calculus Revisited

07-26

Whistler, British Columbia

07-26

AWS Deep Learning AMIs now include ONNX, enabling model portability across deep learning frameworks

07-26

Differentiable Image Parameterizations

07-25

Advice on soft skills for academics

07-25

Journals and refereeing： toward a new equilibrium

07-25

Recently in the sister blog

07-24

The AWS DeepLens Inclusivity Challenge submission period extended to 8/19

07-24

New Research on Multi-Task Learning

07-24

Most Common Jobs, By State

07-24

A quick tour of AI services in Azure

07-24

When wife earns more than husband, they report a lesser gap

07-23

Top 20 Python AI and Machine Learning Open Source Projects

07-23

Import AI：

07-23

Year 3 of Data, Beer, & Inspiration

07-23

AI, Machine Learning and Data Science Roundup： July 2018

07-23

AWS Deep Learning AMIs now with optimized TensorFlow 1.9 and Apache MXNet 1.2 with Keras 2 support to accelerate deep learning on Amazon EC2 instances

07-23

DeepLearning GitHub Rankings

07-22

Defining data science in 2018

07-22

Of statistics class and judo class： Beyond the paradigm of sequential education

07-22

The Real Problems with Neural Machine Translation

07-21

“A Headline That Will Make Global-Warming Activists Apoplectic”

07-21

Image Feature Extraction (Texture Features)

07-20

A hex sticker wall, created with R

07-20

Scalable multi-node deep learning training using GPUs in the AWS Cloud

07-20

Classify your own images using Amazon SageMaker

07-20

Where that title came from

07-20

Learn to R blog series - Operators and Objects

07-19

“The idea of replication is central not just to scientific practice but also to formal statistics . . . Frequentist statistics relies on the reference set of repeated experiments, and Bayesian statistics relies on the prior distribution which represents the population of effects.”

07-19

If you have a measure, it will be gamed (politics edition).

07-18

Basic Statistics in Python： Probability

07-18

Python Data Analysis with pandas

07-18

“For professional baseball players, faster hand-eye coordination linked to batting performance”

07-18

Highlights from the useR! 2018 conference in Brisbane

07-18

Data-based ways of getting a job

07-18

Why AI Isn’t A Black Box (And Its Business Value)

07-17

Model Updates： Entity-level Sentiment Analysis and Brand New Entity Extraction Models Now Live in the Text Analysis API

07-17

The statistical checklist： Could there be a list of guidelines to help analysts do better work?

07-17

Video： R for AI, and the Not Hotdog workshop

07-17

How To Remotely Send R and Python Execution to SQL Server from Jupyter Notebooks

07-17

Scanning Office 365 documents

07-16

Mister P wins again

07-16

Import AI：

07-16

The “Carl Sagan effect”

07-16

CRN： The 10 Coolest Machine-Learning And AI Startups Of 2018 (So Far)

07-16

Verlet Simulations

07-16

Mounting multiple data and outputs volumes

07-15

What happens to your career when you have to retract a paper?

07-14

Join "Data School Insiders" on Patreon

07-13

RAIN Project： evolution of the game development dream

07-13

“Bayesian Meta-Analysis with Weakly Informative Prior Distributions”

07-13

Teaching R to New Users - From tapply to the Tidyverse

07-12

The persistence of bad reporting and the reluctance of people to criticize it

07-12

FIFA WC 2018： Semi-Finals and the Final

07-12

Data Notes： How to Forecast the S&P 500 with Prophet

07-12

Where do I learn about log_sum_exp, log1p, lccdf, and other numerical analysis tricks?

07-12

Harmonizing and emojifying our GitHub issue trackers

07-12

Poor Customer Service

07-12

Data Science in 30 Minutes： Using Data Science to Predict the Future with Kirk Borne

07-11

Should the points in this scatterplot be binned?

07-11

I Can’t Afford to Hire a Data Scientist. Now What?

07-11

Preparing for the Data Science Job Hunt

07-11

John Mount speaking on rquery and rqdatatable

07-11

Do Bayesians Overfit?

07-11

BD reviews

07-11

Exercise and weight loss： long-term follow-up

07-10

He wants to model a proportion given some predictors that sum to 1

07-10

Top-Down vs. Bottom-Up Approaches to Data Science

07-10

Using Siamese Networks and Pre-Trained Convolutional Neural Networks (CNNs) for Fashion Similarity Matching

07-10

Joint inference or modular inference? Pierre Jacob, Lawrence Murray, Chris Holmes, Christian Robert discuss conditions on the strength and weaknesses of these choices

07-09

Import AI：

07-09

Feature-wise transformations

07-09

Data Science Project Style Guide

07-09

Design Patterns for Production NLP Systems

07-09

I think they use witchcraft

07-08

Divisibility in statistics： Where is it needed?

07-08

Lenny Dykstra, His Strike Zone, & Bayesian Stats

07-08

Speed up your R Work

07-08

He wants to know what to read and what software to learn, to increase his ability to think about quantitative methods in social science

07-07

From the Sidewalk to the Saddle： Data and the Tour de France

07-06

FIFA WC 2018： Quarter Final Stage Predictions

07-06

All of Life is 6 to 5 Against

07-06

What I’ve learned from competing in machine learning contests on Kaggle

07-06

A Real World Reinforcement Learning Research Program

07-06

On this 4th of July, let’s declare independence from “95%”

07-05

Data Science at Scale： Six Major Trends

07-05

Tutorial： The practical application of complicated statistical methods to fill up the scientific literature with confusing and irrelevant analyses

07-05

Build this media monitoring Slack bot in 20 minutes without writing code

07-04

How to update your scikit-learn code for 2018

07-04

PNAS forgets basic principles of game theory, thus dooming thousands of Bothans to the fate of Alderaan

07-04

Using WSL Linux on Windows 10 for Deep Learning Development.

07-04

Does batting order matter in Major League Baseball? A simulation approach

07-04

SatRdays Cardiff

07-04

Basic Statistics in Python： Descriptive Statistics

07-03

Here’s How to Survive the Rise of A.I. – Become a Data Facilitator

07-03

About that claim in the NYT that the immigration issue helped Hillary Clinton? The numbers don’t seem to add up.

07-03

Flaws in stupid horrible algorithm revealed because it made numerical predictions

07-03

Reply-all loop

07-03

The Ponzi threshold and the Armstrong principle

07-02

seplyr 0.5.8 Now Available on CRAN

07-02

Boost Computation Power and Speed with Snowflake

07-02

Cultural Differences in Map Data Visualization

06-30

Keras vs PyTorch： Which Is the “Number One” Deep Learning Framework?

06-30

Data science books - theory and practice

06-29

Deep Learning Vendor Update： Hyperparameter Tuning Systems

06-29

Computability, Complexity, & Algorithms Part 1

06-29

Sequence labeling with semi-supervised multi-task learning

06-29

One-Shot Imitation from Watching Videos

06-28

Understanding Latent Style

06-28

Announcement – The Data Incubator Partnership with MRI Network

06-28

Data Notes： Your smartphone knows what?

06-28

My Thoughts on Synthetic Data

06-27

Can Lessons from Data Science Help Journalism?

06-27

What Data Scientists should focus on in 2018?

06-27

DIY AI for the Future

06-27

Supercharging Classification - The Value of Multi-task Learning

06-26

Import AI：

06-25

Building a Diabetic Retinopathy Prediction Application using Azure Machine Learning

06-25

What Is Machine Learning and How Is It Making Our World a Better Place?

06-23

Last academic results

06-23

How I built a receipt chatbot over a weekend

06-23

Add Constrained Optimization To Your Toolbelt

06-21

Open Source Datasets with Kaggle

06-21

The Impact of Bitcoin on the Insurance Industry

06-21

Big News： vtreat 1.2.0 is Available on CRAN, and it is now Big Data Capable

06-20

Top 12 Essential Command Line Tools for Data Scientists

06-20

How to Do Distributed Deep Learning for Object Detection Using Horovod on Azure

06-20

Opinion mining on Dutch news articles

06-20

On Tensor Networks and the Nature of Non-Linearity

06-20

A Study Of Reddit Politics

06-20

Deep Reinforcement Learning in Action (Announcement)

06-20

Using Clustering Algorithms to Analyze Golf Shots from the U.S. Open

06-19

Profiling Top Kagglers： Martin Henze (AKA Heads or Tails), World's First Kernels Grandmaster

06-19

Sent2Vec： An unsupervised approach towards learning sentence embeddings

06-19

Is it Time to Regulate Bitcoin?

06-19

Pivoted document length normalisation

06-19

AI Lab： Learn to Code with the Cutting-Edge Microsoft AI Platform

06-19

Import AI

06-18

The Role of Resources in Data Analysis

06-18

BDD100K Blog Update

06-18

Docstrings in open source Python

06-18

5 Tips To Learn Machine Learning

06-17

U.S. Open Data — Gathering and Understanding the Data from 2018 Shinnecock

06-15

R Tip： Be Wary of “…”

06-15

Predicting World Cup dark horses from press coverage using the AYLIEN News API – Monthly Media Roundup

06-15

Data Notes： Predict the World Cup 2018 Winner

06-14

Data Science in 30 Minutes： Holden Karau – A Quick Introduction to PySpark

06-13

Python Generators Tutorial

06-13

wrapr 1.5.0 available on CRAN

06-13

From Gaussian Algebra to Gaussian Processes, Part 2

06-12

Overview and benchmark of traditional and deep learning models in text classification

06-12

Highlights of NAACL-HLT 2018： Generalization, Test-of-time, and Dialogue Systems

06-12

Multithreaded in the Wild

06-11

Take These 7 Small Steps To Make a Big Career Move

06-11

R Tip： use isTRUE()

06-11

Annihilation Review (2018) ： The Descent + Arrival

06-10

An ode to King James

06-10

World Models Experiments

06-09

The Dynamics of Philippine Senate Bills： Gensim, Topic Modeling and All That Good NLP Stuff

06-09

Philippine Senate Bills： NLP Word Cloud Analysis for the 13th to 17th Congress

06-09

Estimating mortality rates in Puerto Rico after hurricane María using newly released official death counts

06-08

Bitcoin and Cryptocurrency Litigation

06-08

Programming Best Practices For Data Science

06-08

“If you deprive the robot of your intuition about cause and effect, you’re never going to communicate meaningfully.” – Pearl ’18

06-08

Automatically Tag Trello Cards with Zapier and Natural Language Processing

06-07

Summer of Data Science Goal-Setting

06-06

Engineering a New Career in Data Science： Alumni Spotlight on Abhishek Mishra

06-06

Deep Learning for Emojis with VS Code Tools for AI – Part 2

06-05

Import AI

06-05

Forbes： DataRobot Puts the Power of Machine Learning in the Hands of Business Analysts

06-04

Free E-Book： A Developer’s Guide to Building AI Applications

06-04

When the bubble bursts…

06-04

Trustworthy Data Analysis

06-04

rqdatatable： rquery Powered by data.table

06-03

Data Links

06-03

Lucy's Secret Number puzzle

06-03

3368a9b98a073e7ba296e1f5f41f6c4f

06-02

Bulk Loading Shapefiles Into Postgres/Postgis

06-01

Python and Tidyverse

06-01

Parallel, Disk-Efficient .zip to .gz Conversion

06-01

A crystal clear book draw

06-01

Talking about clinical significance

06-01

Six Dice Betting Game

05-31

Rules to Learn By

05-31

Convolve all the things

05-31

Data Science in 30 Minutes： Deep Learning to Detect Fake News with Uber ATG Head of Data Science, Mike Tamir

05-30

BDD100K： A Large-scale Diverse Driving Video Database

05-30

The Data Incubator Unofficial Frequently Asked Questions

05-30

How to Overcome Imposter Syndrome For Good

05-30

Some updates

05-29

Import AI：

05-29

An Updated Review of The Data Incubator Data Science Bootcamp

05-29

How to Use FPGAs for Deep Learning Inference to Perform Land Cover Mapping on Terabytes of Aerial Images

05-29

Summer of Data Science 2018

05-28

Image Compression using K-means Clustering.

05-28

“Creating correct and capable classifiers” at PyDataAmsterdam 2018

05-26

What's New in Dataquest v1.85： Takeaways, Intermediate R, and More

05-25

How digital cameras work

05-25

Context Compatibility in Data Analysis

05-24

Why Lies Spread Faster than the Truth

05-24

Light FM Recommendation System Explained

05-24

ML beyond Curve Fitting： An Intro to Causal Inference and do-Calculus

05-24

Data Retrieval and Cleaning： Tracking Migratory Patterns

05-23

SQLite vs Pandas： Performance Benchmarks

05-23

Best practices with pandas (video series)

05-23

An Overview of Recommendation Systems

05-23

How to use an R interface with Airtable API

05-23

Generating Climate Temperature Spirals in Python

05-21

Enterprise Deployment Tips for Azure Data Science Virtual Machine (DSVM)

05-21

My steps into Data Science

05-21

My eRum 2018 biggest highlights

05-19

Life-cycle of a Data Science Project

05-18

3 Things We Can Do About Fake News

05-18

Microsoft Weekly Data Science News for May 18, 2018

05-18

Delayed Impact of Fair Machine Learning

05-17

Awesome postdoc opportunities in computational genomics at JHU

05-17

Using Linear Regression for Predictive Modeling in R

05-16

Learn D3.js in 5 minutes

05-16

Rethinking Academic Data Sharing

05-15

Differentiable Dynamic Programs and SparseMAP Inference

05-15

Two things about power

05-14

The Lottery Ticket Hypothesis - Paper Recommendation

05-10

Mother's Day Interview： How Nicole Finnie Became a Competitive Kaggler on Maternity Leave

05-10

Data types

05-08

Multithreaded in the Wild

05-07

Profiling Top Kagglers： Bestfitting, Currently

05-07

Ffa1ea00fdab31b3b44b87839c503629

05-06

Data Links

05-06

Kung Fury Review (2015) ： Don’t Hassle the Hoff

05-05

Quick DB result caching in R

05-05

Software as an academic publication

05-03

Saving, resuming, and restarting experiments with Polyaxon

05-03

Cambridge Analytica, Facebook, and user data – Monthly Media Review with the AYLIEN News API, April

05-03

An intuitive, visual guide to copulas

05-03

A particles-arly fun book draw

05-02

How analog TV worked

05-01

Technology and Information： Data Science and UX

05-01

Using 360° Stance Detection to Analyze coverage of Donald Trump by CNN

04-30

PyDataLondon 2018 and “Creating Correct and Capable Classifiers”

04-30

Gensim Survey 2018

04-30

TDM： From Model-Free to Model-Based Deep Reinforcement Learning

04-26

Simple Architectures Outperform Complex Ones in Language Modeling

04-25

Using Natural Language Processing to Combat Filter Bubbles and Fake News – 360° Stance Detection

04-24

AHL Python Data Hackathon

04-22

Some web API package development lessons from HIBPwned

04-19

Announcing Ursa Labs： an innovation lab for open source data science

04-19

Shared Autonomy via Deep Reinforcement Learning

04-18

How many CRAN package maintainers have been pwned?

04-18

Why Start a Data Science Project?

04-18

Can a Machine Be Racist or Sexist?

04-16

Seasonalities： The Near-Term Future for the Market

04-14

Traveling salesman portrait in Python

04-12

Goals and Principles of Representation Learning

04-12

Graph embeddings in Hyperbolic Space

04-10

Gradient optimisation on the Poincaré disc

04-10

Towards a Virtual Stuntman

04-10

Circle circumference in the hyperbolic plane is exponential in the radius： proof by computer game

04-10

Multithreaded in the Wild

04-09

Webcam based image processing in Jupyter notebooks

04-09

Synthetic Gradients with Tensorflow

04-08

R Spatial Resources

04-06

Lumpers and Splitters： Tensions in Taxonomies

04-05

Quarterly product update： Create your data science projects on Kaggle

04-04

Automated machine learning is coming... and it won't matter

04-04

From Gaussian Algebra to Gaussian Processes, Part 1

03-31

Package Paths in R

03-31

AlphaGo Zero Is Not A Sign of Imminent Human-Level AI

03-30

Learn to R blog series - R and RStudio

03-29

Data Science and Python

03-29

Mérida, Yucatán

03-25

The Bull Survived on Friday, but Barely

03-25

Introducing Python for data scientists - Pt2

03-23

Crossing Your Data Science Chasm

03-22

Time Series for scikit-learn People (Part II)： Autoregressive Forecasting Pipelines

03-22

How many college football teams can you watch in-person in one football season?

03-21

Notes on the Frank-Wolfe algorithm, Part I

03-20

Engineering Data Science at Automattic

03-20

Deterministic A/B tests via the hashing trick

03-20

Why you should start using .npy file more often…

03-20

Distributed Deep Learning with Polyaxon

03-18

Two cool features of Python NumPy： Mutating by slicing and Broadcasting

03-17

Introducing Python for data scientists - Pt1

03-15

Using RSiteCatalyst With Microsoft PowerBI Desktop

03-13

Turning Water into Wine

03-13

TSrepr use case - Clustering time series representations in R

03-13

Transfer Your Font Style with GANs

03-13

When Men and Women talk to Siri

03-09

Sock Puzzle Revisited

03-07

Understanding rolling calculations in R

03-07

The Building Blocks of Interpretability

03-06

Connect to Google Sheets in Power BI using R

03-06

How Americans make a living based on their age

03-06

Top 10 oldest and youngest industries in the U.S.

03-05

Compound interest and retirement

03-05

ICML Board and Reviewer profiles

03-05

Jupyter notebooks and tensorboard on Polyaxon

03-04

Integration method to map model scores to conversion rates from example data

03-04

The 2018 Best Picture Nominees Ranked, Reviewed, and Reflected Upon

03-03

Multithreaded in the Wild

03-02

Nine digits puzzle

03-02

Image Recognition and Object Detection

02-28

The Sickness That Is Depression

02-28

Mathematics of Tape Recorders

02-28

Cribbage Scores

02-25

Java Art Generation with Neural Style Transfer

02-24

My Approach to Natas Level 11 (a Web Security Game)

02-23

Getting Started With MapD, Part 2： Electricity Dataset

02-23

Google Calendar should prevent spam by default

02-22

What Do Data Scientists Need to Know about Containerization? As Little as Possible.

02-22

Reduce GPU costs with startup scripts on the Google Cloud Engine

02-21

Markdown based web analytics? Rectangle your blog

02-21

Text to Speech Deep Learning Architectures

02-20

Production Recommendation Systems with Cloudera

02-20

Fast Company's 2018 World's Most Innovative Companies List

02-20

It’s okay to not be a data scientist

02-20

Sutton’s Temporal-Difference Learning

02-19

Kolmogorov and randomness

02-18

PyData Conference & AHL Hackathon

02-16

RSiteCatalyst Version 1.4.14 Release Notes

02-16

Hands-on： Creating Neural Networks using Chainer

02-15

Setting up Jupyter for Deep Learning on EC2

02-15

Pervasive Simulator Misuse with Reinforcement Learning

02-14

How to maraaverickfy a blog post without even reading it

02-12

Pancake Numbers

02-12

Introduction to Learning to Trade with Reinforcement Learning

02-11

Performance metrics aren't everything

02-09

Machine learning mega-benchmark： GPU providers (part 2)

02-08

Pruning Neural Networks： Two Recent Papers

02-06

Natural and Artificial Intelligence

02-06

Linus Sequence

02-06

Learning Robot Objectives from Physical Human Interaction

02-06

iPhone addiction? Get a grip!

02-06

Superbowl Helmet Puzzle

02-04

Hiring Data Scientists

02-04

A Practical Guide to the "Open-Source Machine Learning Masters"

02-03

Getting Started With MapD, Part 1： Docker Install and Loading Data

02-01

Static Blog： Jekyll, Hyde and GitHub Pages

02-01

Moravec's Paradox

01-31

Counting Efficiently with Bounter pt. 2： CountMinSketch

01-31

k-server, part 3： entropy regularization for weighted k-paging

01-29

Neural Networks and the generalisation problem

01-28

Time Series for scikit-learn People (Part I)： Where's the X Matrix?

01-28

TSrepr - Time Series Representations in R

01-26

Habits and Tools, Old and New

01-26

My notes on (Liang et al., 2017)： Generalization and the Fisher-Rao norm

01-25

9 new pandas updates that will save you time

01-25

Lessons learned in my first year as a data scientist

01-25

Kernel Feature Selection via Conditional Covariance Minimization

01-23

Motivation in Academia vs Industry

01-21

Java Autonomous driving – Car detection

01-18

The Generalization Mystery： Sharp vs Flat Minima

01-18

57 Summaries of Machine Learning and NLP Research

01-17

Machine Learning Trick of the Day (7)： Density Ratio Trick

01-14

Shortest Crease Problem

01-14

CES 2018

01-12

New Year's Resolutions 2018

01-05

Interactive Broker’s SNAP Orders for Delayed Trading

01-03

Should I do a Data Science bootcamp?

01-03

Java Image Cat&Dog Recognition with Deep Neural Networks

01-03

ML/NLP Publications in 2017

01-02

Top gsutil command lines to get started on Google Cloud Storage

01-01

2017

AI and Deep Learning in 2017 – A Year in Review

12-31

Python Data Science jobs list into 2018

12-31

Linked Lists

12-28

2017 Winners and Losers

12-27

Weekly Review： 12/23/2017

12-23

Setting Up Selenium on RaspberryPi 2/3

12-22

Large-Scale Health Data Analytics with OHDSI

12-21

k-server, part 2： continuous time mirror descent

12-20

Simulating Chutes & Ladders in Python

12-18

Why mere Machine Learning cannot predict Bitcoin price

12-18

k-server, part 1： online learning and online algorithms

12-17

Weekly Review： 12/16/2017

12-16

How To Write, Deploy, and Interact with Ethereum Smart Contracts on a Private Blockchain

12-15

Data professional definitions： Data analyst vs data scientist vs data engineer

12-14

Java Handwritten Digit Recognition with Convolutional Neural Networks

12-13

Everything is a Model

12-13

NIPS 2017 Summary

12-11

Optimization of Scientific Code with Cython： Ising Model

12-11

Do AIs dream of pwning FF leagues?

12-10

Weekly Review： 12/10/2017

12-10

Implementing Poincaré Embeddings

12-09

Alchemy, Rigour and Engineering

12-07

Installing Python Packages from a Jupyter Notebook

12-05

Using Artificial Intelligence to Augment Human Intelligence

12-04

The Last 5 Years In Deep Learning

12-04

AutoML on AWS

12-04

Hitchhiker’s guide to Used Car Prices Estimation

12-04

At NIPS 2017

12-04

Weekly Review： 12/03/2017

12-03

Sleeping Giant Rural Postman Problem

12-01

A Neural Network for predicting Restaurant Reservations

11-30

House Price Prediction using a Random Forest Classifier

11-29

Java Handwritten Digit Recognition with Neural Networks

11-29

Grosse's challenge： duality and exponential families

11-29

Python Tutorial： Learn Python in one Day

11-28

Incremental means and variances

11-28

Sequence Modeling with CTC

11-27

New download API for pretrained NLP models and datasets in Gensim

11-27

Gaussian Processes

11-25

Markdown Language Reference

11-24

Python Pandas Tutorial： The Basics

11-23

Thanksgiving Special 🦃： GANs are Being Fixed in More than One Way

11-23

How to Build Your Own Blockchain Part 4.2 — Ethereum Proof of Work Difficulty Explained

11-21

Linear Feedback Shift Registers

11-19

50 states Rural Postman Problem

11-19

“Should I get a PhD to be a data scientist/analytics professional?”

11-19

Weekly Review： 11/18/2017

11-18

Recommender System With Implicit Feedback

11-18

8 Important Python Interview Questions and Answers

11-17

A Cookbook for Machine Learning： Vol 1

11-16

Python List Comprehension + Set + Dict Comprehension

11-16

Python Matplotlib (pyplot), a step-by-step Tutorial

11-15

Decision Making and Diversity

11-15

PyDataBudapest and “Machine Learning Libraries You’d Wish You’d Known About”

11-15

Data Science in Healthcare

11-14

How to Build Your Own Blockchain Part 4.1 — Bitcoin Proof of Work Difficulty Explained

11-13

Understanding deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras

11-13

Algorithms, Machine Learning, and Optimization： we are hiring!

11-12

Markets Performance after Election： One Year Update

11-12

Evolving Stable Strategies

11-12

Weekly Review： 11/11/2017

11-11

Exploring Line Lengths in Python Packages

11-09

Gaussian Distributions are Soap Bubbles

11-09

Why Indian companies should take on different projects than competing Valley companies - an application of Cobb-Douglas

11-07

Feature Visualization

11-07

Weekly Review： 11/04/2017

11-06

When Traditional Programming Meets Machine Learning

11-05

PyConUK 2017, PyDataCardiff and “Machine Learning Libraries You’d Wish You’d Known About”

11-05

On Pyro - Deep Probabilistic Programming on PyTorch

11-03

How to Build Your Own Blockchain Part 3 — Writing Nodes that Mine and Talk

11-02

mixup： Data-Dependent Data Augmentation

11-02

Gsutil cheatsheet

11-02

Self-Organizing Maps Tutorial

11-02

PokerBot： Create your poker AI bot in Python

11-01

Some comments to Daniel Abadi's blog about Apache Arrow

11-01

Two Recent Results in Transfer Learning for Music and Speech

11-01

New in Cloudera Data Science Workbench 1.2： Usage Monitoring for Administrators

10-31

Recommender System

10-30

A Visual Guide to Evolution Strategies

10-29

Weekly Review： 10/28/2017

10-28

How easy is it to moneyball a fantasy football league draft?

10-28

How to Build Your Own Blockchain Part 2 — Syncing Chains From Different Nodes

10-27

Multi Armed Bandit

10-26

AlphaGo Zero： Minimal Policy Improvement, Expectation Propagation and other Connections

10-26

COLT 2018 call for papers

10-24

AWS Machine Learning Big Data NYC

10-24

Online Hard Example Mining on PyTorch

10-22

Hard Examples Mining in Keras

10-22

JUnit, Integration, End to End Tests

10-22

Some Thoughts on Meditation

10-22

Weekly Review： 10/21/2017

10-21

Markets Performance after Election： Day 239

10-21

Martingales

10-20

Ensemble learning for time series forecasting in R

10-19

Deep learning with Apache MXNet on Cloudera Data Science Workbench

10-19

DeformNet, Or A Tale of Broken Chairs

10-18

How to Build Your Own Blockchain Part 1 — Creating, Storing, Syncing, Displaying, Mining, and Proving Work

10-17

Text Segmentation using Word Embeddings

10-16

How to use Tensorboard with PyTorch

10-16

Feather format update： Whence and Whither?

10-16

Markets Performance after Election

10-16

How I was screwing up testing my code

10-15

Advice for aspiring data scientists and other FAQs

10-15

Building a Visual Search Algorithm

10-13

Understanding how Deep Learning learns to play SET®

10-12

Data Science for Managers and Directors (DS4MAD)

10-10

Intro to graph optimization： solving the Chinese Postman Problem

10-07

GANs are Broken in More than One Way： The Numerics of GANs

10-05

NPR Sunday Puzzle Solving, And Other Baby Name Questions

10-02

Podcast Listens Analysis

10-02

Michael B. Cohen

09-28

Twitter bots for good, and information contagion!

09-27

Hi

09-27

Run some cool GitHubs on Azure (Python)

09-26

Paper review： EraseReLU

09-26

NIPS 2017 Workshop on Approximate Inference

09-25

Exploiting Daily Fantasy Football for Fun and Profit

09-22

Apache Arrow and the "10 Things I Hate About pandas"

09-21

Talk like a pirate day 2017

09-19

Face Similarity searching ~ landmark detecting

09-18

How To Predict ICU Mortality with Digital Health Data, DL4J, Apache Spark and Cloudera

09-18

Deep Learning Dead-End?

09-17

Customizing Docker Images in Cloudera Data Science Workbench

09-14

QuantConnect – the only Game in Town

09-10

Joining ASAPP

09-09

Semantic trees for training word embeddings with hierarchical softmax

09-07

Making Smart Phones Dumb Again

09-07

What Killed the Curse of Dimensionality?

09-06

Deep Learning with Intel’s BigDL and Apache Spark

09-06

Software patents are evil, but BSD+Patents is probably not the solution

09-05

When (not) to use Deep Learning for NLP

09-04

Inferring data loss (and correcting for it) from fundamental relationships

09-01

The Advent of Analytics Engineering

09-01

Crosslingual document comparison

08-31

How much compute do we need to train generative models?

08-31

A.I. 'Bias' Doesn't Mean What Journalists Say It Means

08-30

Poor Customer Support?

08-28

The jet plane that shot itself down

08-27

Python Deep Learning tutorial： Create a GRU (RNN) in TensorFlow

08-27

Designing a Deep Learning Project

08-23

Using regression trees for forecasting double-seasonal time series with trend in R

08-22

He grins like a Cheshire cat; said of anyone who shows his teeth and gums in laughing

08-22

Why Machine Learning Is A Metaphor For Life

08-16

Van der Waerden Numbers

08-15

Parallelizing Distance Calculations Using A GPU With CUDAnative.jl

08-14

Hype or Not? Some Perspective on OpenAI’s DotA 2 Bot

08-12

Superresolution with semantic guide

08-09

My Qualifying Exam (Oral)

08-07

Cinderella science

08-05

Hierarchical Softmax

08-01

From Decision Trees to Random Forests： Principles and Implementation of Tree-Based Algorithms

07-31

Logistic Regression

07-30

Random Dilation Networks for Action Recognition in Videos

07-29

More silliness

07-29

My 10-step path to becoming a remote data scientist with Automattic

07-29

Diffusion of ISIS propaganda on Twitter

07-28

Moving On, Looking Back

07-28

Web scraping the President's lies in 16 lines of Python

07-27

How I Used Deep Learning To Train A Chatbot To Talk Like Me (Sorta)

07-25

2 Quick Announcements

07-25

Prophecy Fulfilled： Keras and Cloudera Data Science Workbench

07-25

RSiteCatalyst Version 1.4.13 Release Notes

07-23

Retrospective review of my first deep learning competition

07-22

implyr： R Interface for Apache Impala

07-19

Layman’s Guide to A/B Testing

07-18

Introductory Machine Learning Terminology with Food

07-18

What is Machine Learning?

07-17

Introducing a tensorflow library for deep learning and reinforcement learning

07-17

Japanese Kids Shows, Movies, Games, and Videos for Immersion

07-14

Cloudera Enterprise 5.12 is Now Available

07-13

How to launch your data science career (with Python)

07-12

Guest Post – Learning R as an MBA Student

07-12

Clustering applied to showers in the OPERA

07-10

From Microservices to Service Blocks using Spring Cloud Function and AWS Lambda

07-07

6279e808ef0c35488ea3a81e9b6d302a

07-06

Smooth distributed convex optimization

07-06

What's new in PyMC3 3.1

07-05

Kaggle’s Mercedes-Benz Greener Manufacturing

07-01

From Python Hero to Java Rockstar

06-30

Deep learning on Apache Spark and Apache Hadoop with Deeplearning4j

06-27

Docker for AWS

06-27

Hexagon Geometry Puzzle

06-27

Machine learning applied to showers in the OPERA

06-24

Announcing Elemetric

06-23

Reddit science discussions as a dataset

06-22

Matrix Factorization in PyTorch

06-20

Is the Universe Random?

06-19

Wind Turbine Efficiency

06-19

Neurally Embedded Emojis

06-19

Random Effects Neural Networks in Edward and Keras

06-15

Machine Learning Fraud Detection： A Simple Machine Learning Approach

06-15

Machine Learning the Future Class

06-12

Make a Profitable Portfolio using Python

06-08

Minsky & Papert’s “Perceptrons”

06-08

Kaggle’s Quora Question Pairs Competition

06-07

My Video Game Playlists in Japanese for Immersion

06-07

Further Exploring Common Probabilistic Models

06-06

Work in progress： Portraits of Imaginary People

06-06

Safe Crime Detection

06-05

COLT 2017 accepted papers

06-03

Exploring and visualising reef life survey data

06-03

A Research to Engineering Workflow

06-03

ICML 2017 Workshop on Implicit Models

06-02

From Instance Noise to Gradient Regularisation

06-01

Voronoi Soccer

05-31

JMP Publishes Exercises to Accompany Data Mining Techniques (3rd Edition)

05-31

Review of The Data Incubator data science bootcamp

05-29

Summer of Data Science 2017

05-29

Exposing Python 3.6's Private Dict Version

05-26

Teaching Machines to Draw

05-19

Minimizing the Negative Log-Likelihood, in English

05-18

Parallel computation with two lines of code

05-18

Workshop sur le Topic Modeling

05-17

Python Deep Learning tutorial： Elman RNN implementation in Tensorflow

05-17

Create conda recipe to use C extended Python library on PySpark cluster with Cloudera Data Science Workbench

05-15

Normal Distributions

05-14

Voronoi Diagrams

05-12

Getting Started with Cloudera Data Science Workbench

05-08

Transfer Learning for Flight Delay Prediction via Variational Autoencoders

05-08

The Benefits of Migrating HPC Workloads To Apache Spark

05-04

Hail： Scalable Genomics Analysis with Apache Spark

05-02

Flipping a Coin on a Crazy Plane

05-01

Hacking A Hackaton

04-30

Tutorial： Sentiment Analysis of Airlines Using the syuzhet Package and Twitter

04-30

Announcement

04-27

Use your favorite Python library on PySpark cluster with Cloudera Data Science Workbench

04-26

XOR Revisited： Keras and TensorFlow

04-24

Re-parameterising for non-negativity yields multiplicative updates

04-24

How to make the transition from academia to data science

04-23

F beta score for Keras

04-23

Fact over Fiction

04-22

Alphabear Solver

04-21

Machine Learning in Science and Industry slides

04-20

Deep Learning Frameworks on CDH and Cloudera Data Science Workbench

04-20

Sentiment analysis on Twitter using word2vec and keras

04-20

Why I'm bullish on Uber - the customer acquisition trough

04-20

Deriving the Softmax from First Principles

04-19

Audio Signals in Python

04-17

Sentiment Analysis model deployed!

04-17

Sakura blossoms in Japan

04-11

Cake cutting part 3

04-10

Getting Started with Sonnet, Deep Mind’s Deep Learning Library

04-10

RSiteCatalyst Version 1.4.12 (and 1.4.11) Release Notes

04-10

Retrospective on leaving academia for industry data science

04-09

Approximating Implicit Matrix Factorization with Shallow Neural Networks

04-07

Building a Tic-Tac-Toe web-app in this Webpack tutorial and Babel tutorial

04-07

Covariate-Based Diagnostics for Randomized Experiments are Often Misleading

04-06

Why Momentum Really Works

04-04

Time Series Analysis with Generalized Additive Models

04-04

Cake cutting part 2

04-01

A Practical Guide to the Lomb-Scargle Periodogram

03-30

Cake cutting

03-28

R<-Slovakia meetup started to build community in Bratislava

03-26

Bias in Machine Learning Flipboard Magazine

03-25

Emojis Analysis in R

03-24

Three Bag Logic Puzzle

03-23

Docker y Kaggle con Enrique y Beto

03-22

Becoming a Data Scientist Podcast Episode 16： Randy Olson

03-22

Research Debt

03-22

Deep Learning without Backpropagation

03-21

From Analytical to Numerical to Universal Solutions

03-20

Cryptogram Puzzle

03-20

Model AUC depends on test set difficulty

03-19

Ordered Categorical GLMs for Product Feedback Scores

03-17

Building Safe A.I.

03-17

How to mine newsfeed data and extract interactive insights in Python

03-15

Millions of social bots invaded Twitter!

03-14

Random-Walk Bayesian Deep Networks： Dealing with Non-Stationary Data

03-14

Discarded Hard Drives： Data Science as Debugging

03-14

Intercausal Reasoning in Bayesian Networks

03-13

Applying Machine Learning To March Madness

03-12

Cognitive Machine Learning (2)： Uncertain Thoughts

03-12

Square to Hex

03-11

Does the Muslim ban make us safer?

03-10

An introduction to Bayesian Belief Networks

03-10

Vestigial trigonometry functions

03-08

Principle Component Analysis in Regression

03-08

Tic-Tac-AI： A Strong Tic-Tac-Toe AI Opponent using Forward Sampling

03-07

Topic Modeling Amazon Reviews

03-07

NSA Easter Egg Puzzle

03-05

Self-Service Adobe Analytics Data Feeds!

03-03

Data Engineer vs Data Scientist (Infographic)

03-02

MULTI-VARIATE ANALYSIS

03-01

Deepcolor： automatic coloring and shading of manga-style lineart

03-01

Introduction to Random forest

02-28

Deep and Hierarchical Implicit Models

02-28

Artificial Intelligence to replace staff at O2

02-28

Persistent Homology (Part 5)

02-26

What is an Interaction Effect?

02-25

Scrape Tweets from Twitter using Python and Tweepy

02-24

Movie Genre Ratings - Addendum

02-24

Persistent Homology (Part 4)

02-23

Persistent Homology (Part 3)

02-23

Genres Where Audiences and Critics Disagree Most

02-23

Recurrent Neural Networks for Churn Prediction

02-22

Getting Rich using Bitcoin stockprices and Twitter!

02-22

Topological Data Analysis - Persistent Homology

02-22

Persistent Homology (Part 2)

02-22

Introduction to Support Vector Machine

02-20

Writing Effective Amazon Machine Learning

02-19

Similarity in the Wild

02-19

The Price is Right

02-19

T-Shirts!!

02-18

Introduction to XGBoost

02-17

Deconstruction with Discrete Embeddings

02-15

10 famous TV shows related to Data science & AI (Artificial Intelligence)

02-14

Reasons I left academia

02-12

Facts and Fallacies of Software Engineering - Book Review

02-11

Extreme IO performance with parallel Apache Parquet in Python

02-10

Bayesian Linear Regression (in PyMC) - a different way to think about regression

02-09

Beyond Binary： Ternary and One-hot Neurons

02-08

Accelerating Apache Spark MLlib with Intel® Math Kernel Library (Intel® MKL)

02-08

Why hierarchical models are awesome, tricky, and Bayesian

02-08

Bayesian Inference via Simulated Annealing

02-07

Similarity via Jaccard Index

02-07

Analyzing US flight data on Amazon S3 with sparklyr and Apache Spark 2.0

02-06

Rec-a-Sketch： a Flask App for Interactive Sketchfab Recommendations

02-05

Cognitive Machine Learning (1)： Learning to Explain

02-05

Topic Modeling for Keyword Extraction

02-05

Simple Stock Ticker App

02-04

RescueTime Inference via the "Poor Man's Dirichlet"

02-03

Create Inverted Music using Python

02-02

Up and running with Apache Spark on Apache Kudu

02-01

Building Event-driven Microservices Using CQRS and Serverless

02-01

Crazy Progress Bars

01-31

Becoming a Data Scientist Podcast Episode 15： David Meza

01-30

Data Cleaning, Categorization and Normalization

01-30

Journal： PLXtrum - realtime machine learning for predicting note onset

01-28

Streaming Columnar Data with Apache Arrow

01-27

Where Predictive Modeling Goes Astray

01-27

Development update： High speed Apache Parquet in Python with Apache Arrow

01-25

Radiocarbon dating

01-24

Doing magic and analyzing seasonal time series with GAM (Generalized Additive Model) in R

01-24

Machine Learning Madden NFL： The best player position switches for Madden 17

01-20

Building a Data Science Workstation (2017)

01-18

This Website

01-18

Engineering is the bottleneck in (Deep Learning) Research

01-17

Wine dataset demonstrates importance of feature scaling

01-17

T-Shirt Contest Finalists

01-17

Hello, world!

01-16

Questions on Artificial Intelligence

01-16

Complex System Society 2016 Junior Scientific Award!

01-16

Centroids of semicircles and hemispheres

01-16

Tutorial： Deep Learning in PyTorch

01-15

Creating an Azure VHD from Ubuntu Cloud Images on Mac OS X

01-13

Data Readiness Levels： Turning Data from Palid to Vivid

01-12

Becoming a Data Scientist Podcast Episode 14： Jasmine Dumas

01-11

Deep Learning Research Review Week 3: Natural Language Processing

01-10

Self Driving Cars

01-10

Optimization inequalities cheatsheet

01-10

Machine Learning Madden NFL： How Madden player ratings are actually calculated

01-10

CES 2017

01-09

Customer lifetime value and the proliferation of misinformation on the internet

01-08

My Experience as a Freelance Data Scientist

01-07

Attending to characters in neural sequence labeling models

01-06

NLP and ML Publications – Looking Back at 2016

01-04

Native Hadoop file system (HDFS) connectivity in Python

01-03

Recurrent Neural Network Tutorial for Artists

01-01

2016

Our R package roundup

12-31

What is the natural gradient, and how does it work?

12-30

Podcast Special Episode 2 – The Future of AI with Dr. Ed Felten

12-29

Painted Cube Puzzle

12-28

From Arrow to pandas at 10 Gigabytes Per Second

12-27

2017 Outlook： pandas, Arrow, Feather, Parquet, Spark, Ibis

12-27

Chuck-a-Luck

12-26

Mathematically, what is the optimal pitch for a roof?

12-23

Assorted links

12-21

Avoiding overfitting in object detection problem

12-19

Hamiltonian Monte Carlo explained

12-19

How-to： Automate Your sparklyr Environment with Cloudera Director

12-15

Generating World Flags with Sparse Auto-Encoders

12-14

Post NIPS Reflections

12-13

On Model Mismatch and Bayesian Analysis

12-13

RSiteCatalyst Version 1.4.10 Release Notes

12-13

Type Safety and Statistical Computing

12-12

Think slow, think fast

12-12

3D printing glass and bronze： Lost-PLA casting

12-11

NIPS 2016 Generative Adversarial Training workshop talk

12-10

Speeding up TRPO through parallelization and parameter adaptation

12-09

Freudenstein’s Equation

12-07

Support Becoming a Data Scientist!

12-07

Properties of Interpretability

12-06

Experiments in Handwriting with a Neural Network

12-06

Colorizing the DRAW Model

12-06

Using Keras' Pretrained Neural Networks for Visual Similarity Recommendations

12-05

Kinesis Advantage2： Impressions

12-05

Ackerman Steering

12-03

Forecast double seasonal time series with multiple linear regression in R

12-03

Salon des Refusés

12-02

Don't Panic： Deep Learning will be Mostly Harmless

11-29

Respecting Boundaries with Inhomogeneous Kernels

11-29

Django and Elastic Beanstalk, a perfect combination

11-28

Google F1 Server Reading Summary

11-26

Book Review： Computer Age Statistical Inference

11-23

Grazing in a circular field

11-23

Docker and Kaggle with Ernie and Bert

11-22

Non-Zero Initial States for Recurrent Neural Networks

11-20

Data-Informed vs Data-Driven

11-20

The Two Tribes of Language Researchers

11-19

T-Shirt Design Contest!

11-19

Lies, Damned Lies and Big Data

11-19

Simple python to LaTeX parser

11-18

Goals of Interpretability

11-17

Lost Car Key Puzzle

11-16

Fontstellations

11-16

Deep Learning Research Review Week 2: Reinforcement Learning

11-16

Recurrent Neural Networks in Tensorflow III - Variable Length Sequences

11-15

Your First Job

11-15

Becoming a Data Scientist Podcast Special Episode

11-14

Tangent Length Puzzle

11-14

Enernoc smart meter data - forecast electricity consumption with similar day approach in R

11-12

While We Were Busy with Prosperity

11-10

Demystifying Data Science

11-10

GitHub's one-dimensional view of open source contributions

11-07

Twitter, Social Bots, and the US Presidential Elections!

11-07

Learning to Rank Sketchfab Models with LightFM

11-07

Aligned Clock Hands

11-04

Artificial Neural Networks Introduction (Part II)

11-03

Paper： A Differentiable Physics Engine for Deep Learning in Robotics

11-03

Analyzing Housing Prices in Berkeley

11-02

Ten Ways Your Data Project is Going to Fail

11-01

Two papers released on arXiv, "Operator Variational Inference" and "Model Criticism for Bayesian Causal Inference"

10-30

Once Again： Prefer Confidence Intervals to Point Estimates

10-30

Interacting with ML Models

10-26

Random forest interpretation – conditional feature contributions

10-24

AI ‘judge’ doesn’t explain why it reaches certain decisions

10-24

DynamoDB Learnings

10-23

Clustering Zeppelin on Zeppelin

10-23

Intro to Implicit Matrix Factorization： Classic ALS with Sketchfab Models

10-19

Recurrent Neural Network Gradients, and Lessons Learned Therein

10-18

Gradientes de Recurrent Neural Networks y Lo Que Aprendí Derivándolos

10-18

Deconvolution and Checkerboard Artifacts

10-17

Asynchronous Scraping with Python

10-16

Deep reinforcement learning, battleship

10-15

Gradient descent learns linear dynamical systems

10-13

How to Use t-SNE Effectively

10-13

Quick reference to Python in a single script (and notebook)

10-13

Simulación Estadística del Plebiscito Colombiano： ¿Realmente Ganaron Los del "No?"

10-12

Simulating the Colombian Peace Vote： Did the "No" Really Win?

10-12

Einstein's Spacetime

10-11

A fully asynchronous variant of the SAGA algorithm

10-11

PyData DC 2016 Talk

10-11

WordPress to Jekyll： A 30x Speedup

10-10

Multiple Raffle Strategy

10-09

Likes Out! Guerilla Dataset!

10-09

Cognitive Machine Learning： Prologue

10-08

Nonparametric Density Estimation Parzen Windows And Beyond

10-06

Sensor Fusion Tutorial

10-05

Champagne Bottles

10-04

How-to： Do Scalable Graph Analytics with Apache Spark

10-03

TensorFlow in a Nutshell — Part Three： All the Models

10-03

Claims and Evidence： A Joke

10-03

Learning Reinforcement Learning (with Code, Exercises and Solutions)

10-02

What is DRAW (Deep Recurrent Attentive Writer)?

10-02

Deep Learning Research Review Week 1: Generative Adversarial Nets

09-30

NIPS 2016 Workshop on Approximate Inference

09-30

Introducing sparklyr, an R Interface for Apache Spark

09-30

Hyper Networks

09-29

Attractive Mathematical Properties Of The Roc Curve

09-27

A fastText-based hybrid recommender

09-27

Poker odds with wild cards

09-27

Unevenly Spaced Data

09-26

Binary Stochastic Neurons in Tensorflow

09-24

A Billion Words and The Limits of Language Modeling

09-23

GPU-accelerated Theano & Keras with Windows 10

09-23

No juice for you, CSV format. It just makes you more awful.

09-23

Assorted links

09-22

Poker Odds

09-22

Sales Automation Through a Deep Learning Platform

09-22

Turning Distances into Distributions

09-19

Ask Why! Finding motives, causes, and purpose in data science

09-19

Discussion of "Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing"

09-19

Smart Cities at the Nexus of Emerging Data Technologies and You

09-18

Probability Calibration And Isotonic Regression

09-18

Collaborative Filtering using Alternating Least Squares

09-17

k-Nearest Neighbors & Anomaly Detection Tutorial

09-14

Solving Real-Life Mysteries with Big Data and Apache Spark

09-13

TensorFlow in a Nutshell — Part Two： Hybrid Learning

09-13

Outside a train rumbles by

09-09

Looking for exceptional postdoc candidates in Computational Social Sciences

09-09

Attention and Augmented Recurrent Neural Networks

09-08

Wire Gauges

09-07

A Survival Guide to a PhD

09-07

Approaching fairness in machine learning

09-06

The Probability Monad and Why it's Important for Data Science

09-05

Republican-leaning states tend to have more traffic deaths

09-04

Python 2.7 still reigns supreme in pip installs

09-03

Analyzing The Papers Behind Facebook's Computer Vision Approach

09-01

Building Spring Cloud Microservices That Strangle Legacy Systems

08-30

Towards optimal personalization： synthesisizing machine learning and operations research

08-30

How to Solve a Problem In 3 Steps -- Define It, Redefine It, Repeat

08-29

Basic Math on How Bloom Filter Works

08-27

Random Forest Tutorial： Predicting Crime in San Francisco

08-25

Conda： Myths and Misconceptions

08-25

The 9 Deep Learning Papers You Need To Know About (Understanding CNNs Part 3)

08-24

Is BackPropagation Necessary?

08-23

In Praise Of Reinventing The Wheel

08-23

“Becoming a Data Scientist” Survey Results 1： Jobs & Education

08-22

TensorFlow in a Nutshell — Part One： Basics

08-22

If you don’t pay attention, data can drive you off a cliff

08-21

RNNs in Tensorflow, a Practical Guide and Undocumented Features

08-21

Lagrange Points

08-21

Creating a Search Engine

08-19

How to scrape a website using Python + Scrapy in 5 simple steps

08-18

Grokking Deep Learning

08-17

Preliminary Note on the Complexity of a Neural Network

08-16

Evolution of active categorical image classification via saccadic eye movement

08-13

Recurrent Neural Networks for Beginners

08-13

Podcast Episodes 0 to 3

08-13

Assorted links

08-12

Measuring Bernoulli Probabilities in the Presence of Delayed Reactions

08-11

Blog has migrated from Ghost to Jekyll

08-11

How to score 0.8134 in Titanic Kaggle Challenge

08-10

IMDB Data Visualizations with D3 + Dimple.js

08-10

Playing with convolutions in TensorFlow

08-09

The Convexity of Improbability： How Rare are K-Sigma Effects?

08-08

Boosting (in Machine Learning) as a Metaphor for Diverse Teams

08-07

Moscow Math Olympiad Puzzle

08-07

A intuitive explanation of natural gradient descent

08-07

Variational Autoencoders Explained

08-06

Balanced Field Length

08-04

Is Data Scientist a useless job title?

08-04

Efficient Guttering

07-29

My Open-Source Machine Learning Masters (in Casablanca, Morocco)

07-29

A Beginner's Guide To Understanding Convolutional Neural Networks Part 2

07-29

Talk： Building Machines that Imagine and Reason

07-28

Decision Trees Tutorial

07-27

Written Memories： Understanding, Deriving and Extending the LSTM

07-26

Re-work Interview Questions

07-26

Recurrent Neural Networks in Tensorflow II

07-25

Summing the Fibonacci Sequence

07-24

Learning in Brains and Machines (4)： Episodic and Interactive Memory

07-24

Why I’m Not a Fan of R-Squared

07-24

Who's at the Center of the Star Trek Universe?

07-22

Simulating Twitch chat with a Recurrent Neural Network

07-21

Bulk Downloading Adobe Analytics Data

07-21

Linear regression can be understood in many ways (optimization, probabilistic, bayesian)

07-20

I'm all about ML, but let's talk about OR

07-20

Project Euler using Scala： Problem

07-19

Instagram’s Blind Spot

07-19

Styles of Truncated Backpropagation

07-19

Monitoring your cluster in just a few minutes using ISA

07-18

Why Scala?

07-17

Becoming a Data Scientist Podcast Episode 13： Debbie Berebichez

07-15

12 Ways To Cultivate A Data-Savvy Workforce

07-15

Building a Data Science Portfolio： Storytelling with Data (Part 2： Data Exploration)

07-14

MLHEP 2016 lectures slides

07-12

Occam razor vs. machine learning

07-12

Recurrent Neural Networks in Tensorflow I

07-11

Build your own offshore company

07-06

Gradient Boosting Interactive Playground

07-05

Bayesian Deep Learning Part II： Bridging PyMC3 and Lasagne to build a Hierarchical Neural Network

07-05

Deep Learning for Chatbots, Part 2 – Implementing a Retrieval-Based Model in Tensorflow

07-04

A tour of Factor： 4

07-04

Learning in Brains and Machines (3)： Synergistic and Modular Action

07-03

First Order Optimization Methods

07-02

Simple reinforcement learning methods to learn CartPole

07-01

Data Science Challenges

07-01

Analysing NLP publication patterns

06-30

Building a Data Science Portfolio： Storytelling with Data

06-30

Which is more dangerous, guns or gay sex?

06-29

Generative Adversarial Networks Explained

06-29

3 Reasons Counting is the Hardest Thing in Data Science

06-29

Lethal Autonomous Weapon Systems are on the way

06-28

The Rise of Social Bots!

06-28

Factorization Machines A Theoretical Introduction

06-26

Big Data Technology Trends in Banking

06-24

Gradient Boosting explained [demonstration]

06-24

The Real Story Behind Today's Referendum

06-23

Making use of the model

06-20

A tour of Factor： 3

06-20

How things float

06-20

Making Bayesian A/B testing more accessible

06-19

Making Deep Networks Probabilistic via Test-time Dropout

06-17

Announcing hs2client, A Fast New C++ / Python Thrift Client for Impala and Hive

06-16

The Policy Gradient

06-16

Why Can't Gay Men Donate Blood? A Bayesian Analysis

06-16

Visualizing Features from a Convolutional Neural Network

06-15

Becoming a Data Scientist Podcast Episode 12： Data Science Learning Club Members

06-15

Kinesis Savant Elite 2 Foot pedals

06-14

Principal Component Analysis Tutorial

06-14

Document Similarity With Word Movers Distance

06-13

The Power of IPython Notebook + Pandas + and Scikit-learn

06-11

Animate NBA shot events with Paper.js

06-08

Model-Free Prediction and Control

06-07

Translating W2v Embedding From One Space To Another

06-06

A Guide to Gradient Boosted Trees with XGBoost in Python

06-05

A Gentle Introduction to Bloom Filter

06-05

LSTMs

06-04

Matter and Neutron Stars

06-04

Quantitative Finance Resources

06-04

Data trusts could allay our privacy fears

06-03

Generating Large Images from Latent Vectors - Part Two

06-02

Deep Learning Trends @ ICLR 2016

06-01

Bayesian Deep Learning

06-01

TensorFlow Implementation of "A Neural Algorithm of Artistic Style"

05-31

Deep Reinforcement Learning： Pong from Pixels

05-31

Becoming a Data Scientist Podcast Episode 11： Stephanie Rivera

05-31

Assorted links

05-30

Concurrent bloom filters

05-30

A Gentle Introduction to Recommender Systems with Implicit Feedback

05-30

Data Trusts

05-29

A tour of Factor： 2

05-27

Finding Similar Sounding Names – Some Basics

05-26

Blending independent estimates

05-25

Using Xcode with Github

05-25

How to make a good data-driven web app

05-25

Maximum Likelihood estimates follow a normal distribution

05-24

Hyperparameter optimization with approximate gradient

05-24

Adobe Analytics Clickstream Data Feed： Calculations and Outlier Analysis

05-24

A tour of Factor： 1

05-23

Piano Keyboards

05-22

.new_item for python lists

05-22

Build your own Deep Learning Box

05-19

IP string to integer conversion with Rcpp

05-19

Q & A with Meta Brown

05-18

Top 8 resources for learning data analysis with pandas

05-16

Vanilla Neural Nets

05-16

Sequence prediction using recurrent neural networks(LSTM) with TensorFlow

05-14

Meanshift Algorithm for the Rest of Us (Python)

05-14

German Temperature Data

05-12

Bisecting an arbitrary triangular cake

05-11

Easier data analysis in Python with pandas (video series)

05-10

Adobe： Give Credit. You DID NOT Write RSiteCatalyst.

05-09

Bisecting a triangular cake

05-09

Optimizing Split Sizes for Hadoop’s CombineFileInputFormat

05-09

Future of AI 6. Discussion of 'Superintelligence： Paths, Dangers, Strategies'

05-09

Future of AI 5： The Singularians

05-09

Streaming Log-sum-exp Computation

05-08

Neural Network Evolution Playground with Backprop NEAT

05-07

Single Neuron Gradient Descent

05-06

Akka Stream

05-06

Google's NHS deal does not bode well for the future of data-sharing

05-05

The structure of Mafia syndacates

05-04

White House launches workshops to prepare for Artificial Intelligence

05-04

Similar pages for Wikipedia

05-03

A wild dataset has appeared! Now what?

05-02

Becoming a Data Scientist Podcast Episode 10： Trey Causey

05-01

Baseball Card Collecting

04-29

Rolling and Unrolling RNNs

04-28

Data Analysis, NHS and Industrial Partners

04-28

Useful External Resources

04-27

Feather： it's about metadata

04-26

Interactive Abstract Pattern Generation Javascript Demo

04-24

AI and ML Futures 4： The Future of AI Meeting

04-22

Rejoinder： the problem with conda-forge right now

04-21

Feather and Apache Arrow： Grokking file formats vs. in-memory representations

04-21

Predicting Churn

04-21

conda-forge and PyData's CentOS moment

04-20

Where will Artificial Intelligence come from?

04-20

How-to： Use Impala and Kudu Together for Analytic Workloads

04-20

Wesley Crushes Ratings

04-19

dotify： Recommending Spotify Music Through Country Arithmetic

04-15

First 3rd party notebook for Databricks Community Edition

04-14

Exploring convolutional neural networks with DL4J

04-14

Create a Chrome extension to modify a website’s HTML or CSS

04-14

“Redshift View Materializer” Now on Github

04-14

Becoming a Data Scientist Podcast Episode 09： Justin Kiggins

04-12

Where Will Your Country Stand in World War III?

04-12

Eiffel Tower

04-12

First Convergence Bias

04-11

Step by step Kaggle competition tutorial

04-10

Becoming More Efficient

04-07

Learning in Brains and Machines (2)： The Dogma of Sparsity

04-07

Deep Learning for Chatbots, Part 1 – Introduction

04-07

The Frog of CIFAR 10

04-06

Genome Analysis Toolkit： Now Using Apache Spark for Data Processing

04-06

On Software Demos and Potemkin Villages

04-06

Travis CI： "You Have Too Many Tests LOLZ!"

04-05

Inverting a Neural Net

04-05

Sheffield University Life

04-05

RSiteCatalyst Version 1.4.8 Release Notes

04-04

Solar Eclipses

04-03

Generating Large Images from Latent Vectors

04-01

Representational Power of Deeper Layers

03-30

Implementing Batch Normalization in Tensorflow

03-29

Becoming a Data Scientist Podcast Episode 08： Sebastian Raschka

03-29

Feather： A Fast On-Disk Format for Data Frames for R and Python, powered by Apache Arrow

03-29

Crowdsourcing Fantasy Baseball Leagues

03-25

Akka Stream

03-25

Generating Abstract Patterns with TensorFlow

03-25

Examining Your Presence on Twitter with Python

03-24

Top content from two years of Data School

03-24

Sense is now part of Cloudera!

03-22

How tall is that tree?

03-22

Dealing with Corrupt Files in Hadoop

03-21

How To Become A Machine Learning Expert In One Simple Step

03-20

Adobe Analytics Clickstream Data Feed： Loading To Relational Database

03-18

Two Bingo Ball Puzzle

03-18

Avoid unsigned integers in C++ if you can

03-17

Keras plays catch, a single file Reinforcement Learning example

03-17

Large Data with Scikit-learn - Boston Meetup

03-16

Compiling DataFrame code is harder than it looks

03-16

National Pi Day

03-15

Diagnosing Heart Diseases with Deep Neural Networks

03-15

David MacKay Symposium

03-15

Analyzing Golden State Warriors' passing network using GraphFrames in Spark

03-15

Stability as a foundation of machine learning

03-14

Do average consumers still need Dropbox?

03-13

Integrating D3.js into R Shiny

03-13

Quora Q&A Session Answers

03-09

Analyzing Customer Churn – Competing Risks

03-08

Second Annual Data Science Bowl – Part 3 – Automatically Finding the Heart Location in an MRI Image

03-08

Second Annual Data Science Bowl – Part 2

03-07

Second Annual Data Science Bowl – Part 1

03-06

scikit-learn-contrib, an umbrella for scikit-learn related projects.

03-05

Watch Tiny Neural Nets Learn

03-04

First Steps With Neural Nets in Keras

03-04

Deep Learning, Pachinko, and James Watt： Efficiency is the Driver of Uncertainty

03-04

Meet the Authors： “Data Analytics with Hadoop” from O’Reilly Media

03-01

Sheffield Advertises Posts in Machine Learning

03-01

Grazing and Calculus

02-29

Future Debates： This House Believes An Artificial Intelligence will Benefit Society

02-29

Discovering and understanding patterns in highly dimensional data

02-28

Histogram intersection for change detection

02-28

A Variant on “Statistically Controlling for Confounding Constructs is Harder than you Think”

02-25

How to Code and Understand DeepMind's Neural Stack Machine

02-25

Oil Changes, Gas Mileage, and my Unreliable Gut

02-24

Science Week Talk 2016

02-24

Guide to an in-depth understanding of logistic regression

02-22

Why pandas users should be excited about Apache Arrow

02-22

Calling RSiteCatalyst From Python

02-22

SAGA algorithm in the lightning library

02-21

Data Science Learning Club Update

02-21

Learning in Brains and Machines (1)： Temporal Differences

02-21

Two nugget problem

02-21

Introducing Apache Arrow： A Fast, Interoperable In-Memory Columnar Data Structure Standard

02-18

Why Blog?

02-18

Making Python on Apache Hadoop Easier with Anaconda and CDH

02-17

Confluent Platform

02-15

Becoming a Data Scientist Podcast Episode 05： Clare Corthell

02-15

Developing effective data scientists

02-11

d20 stopping puzzle

02-11

The Best of Unpublished Machine Learning and Statistics Books

02-09

Implement spelling correction using Language Models

02-08

Paris Meetup slides Topic Modeling of Twitter Followers

02-08

Russian Roulette

02-06

Amazon Redshift Performance – Bigger Clusters, or Bigger Nodes?

02-05

Class visualization with bilateral filters

02-05

Six Roll Dice Game

02-04

RSiteCatalyst Version 1.4.7 (and 1.4.6.) Release Notes

02-01

Hamming Codes

01-30

Thinking is not something that goes on entirely, or even mostly, inside people’s heads. Little...

01-30

How-to： Train Models in R and Python using Apache Spark MLlib and H2O

01-29

The Shuttle Challenger Disaster： Reflections and Connections to Data Science

01-28

A Million Text Files And A Single Laptop

01-28

Why Today’s Big Data is Not Yesterday’s Big Data — Exponential and Combinatorial Growth

01-26

Data Mining with Python on Medical Datasets for Data Mining

01-25

Theano Tutorial

01-25

The Definitive Q&A Guide for Aspiring Data Scientists

01-25

The Mathematics Behind： Rejection Sampling

01-24

Online Representation Learning in Recurrent Neural Language Models

01-24

Skill vs Strategy

01-23

Time Series for Spark： 0.2.0 Released

01-22

Otoro Blog Migration

01-22

Building a news search engine

01-21

The Mathematics Behind： Polynomial Curve Fitting (MATLAB)

01-20

Continuous Bayes’ Theorem

01-20

Learn How To Implement a Simple E-mail Spam Detector in Python

01-20

Be Like Water

01-19

Introduction to Semi-Supervised Learning with Ladder Networks

01-19

Becoming a Data Scientist Podcast Episode 03： Shlomo Argamon

01-18

AI and ML Futures 1： Background

01-17

AI and ML Futures 2： The Quiet Revolution

01-17

AI and ML Futures 3： The Trojan Wars of Machine Learning

01-17

My Top 10% Solution for Kaggle Rossman Store Sales Forecasting Competition

01-16

Mini AI app using TensorFlow and Shiny

01-15

Defective Circuit Board Puzzle

01-14

Understanding the Pseudo-Truth as an Optimal Approximation

01-11

CES 2016

01-11

The Fair Price to Pay a Spy： An Introduction to the Value of Information

01-09

Explicit Matrix Factorization： ALS, SGD, and All That Jazz

01-09

Machine Learning is not BS in Monitoring

01-09

How Data Science Fueled the Largest Outreach Effort in the History of New York City

01-08

Koch Snowflake

01-05

Generative King of Kowloon

01-05

Becoming a Data Scientist Podcast Episode 02： Safia Abdalla

01-04

Creating a PageRank Analytics Platform Using Spring Boot Microservices

01-03

Attention and Memory in Deep Learning and NLP

01-03

Top 8 Viz features in Excel 2016 !

01-02

2015

21st Century C： Error 64 on OSX When Using Make

12-31

Our R package roundup

12-30

Managing managed libraries with Scala and Eclipse

12-29

Agnez, analytics for deep learning research

12-24

March journal club

12-24

Set up Sublime Text for light-weight all-in-one data science IDE

12-23

Who are the best MMA fighters of all time. A Bayesian study

12-22

A Seasonal Test of AI

12-21

Podcast Available on Stitcher

12-21

Becoming A Data Scientist Podcast Episode 01： Will Kurt

12-21

OpenAI： A new non-profit AI company

12-20

A Year of Approximate Inference： Review of the NIPS 2015 Workshop

12-18

László Babai's New Proof

12-16

ICCV 2015, Day 3

12-16

Weird Number Bases

12-16

ICCV 2015, Day 4

12-16

ICCV 2015, Day 2

12-15

Data Science Learning Club

12-14

Becoming A Data Scientist Podcast Episode 0： Me!

12-14

Does AI stand for Alchemical Intelligence?

12-14

ICCV 2015, Day 1

12-14

A New Library for Analyzing Time-Series Data with Apache Spark

12-14

OpenAI won't benefit humanity without data-sharing

12-14

Adaptive data analysis

12-14

10 ways you might be able to tell when an area of research is undergoing rapid expansion and society's expectations may be somewhat unrealistic ...

12-13

Estimating known unknowns

12-11

Implementing a CNN for Text Classification in TensorFlow

12-11

Ten Tips for Writing CS Papers, Part 2

12-10

Hamiltonian Monte Carlo

12-10

ICCV 2015： Twenty one hottest research papers

12-09

Why is Keras Running So Slow?

12-05

System Zero： What Kind of AI have we Created?

12-04

Give me five

12-04

Some Observations on Winsorization and Trimming

12-03

Common Probability Distributions： The Data Scientist’s Crib Sheet

12-03

Conference on the Economics of Machine Intelligence-Dec 15

12-01

Interactive association rules exploration app

11-30

The TensorFlow perspective on neural networks

11-30

Mazes

11-29

Ten Tips for Writing CS Papers, Part 1

11-29

Why Julia’s DataFrames are Still Slow

11-28

How to Setup Theano to Run on GPU on Ubuntu 14.04 with Nvidia Geforce GTX 780

11-24

A Challenge to Data Scientists

11-22

Lending Club Data Analysis Revisited with Python

11-22

Datascope Promotes Brian Lange to Partner

11-19

Datascope Promotes Bo Peng to Partner

11-19

Visualizing the 2015 NL Cy Young Race

11-19

So You Want to Implement a Custom Loss Function?

11-18

It's not an Internet of Things, It's an Internet of People

11-17

Talking to Machines – The Rise of Conversational Interfaces and NLP

11-17

The Information Barons Threaten our Autonomy and Our Privacy

11-16

An Even Dozen – Denoising Dirty Documents： Part 12

11-15

Anyone Can Learn To Code an LSTM-RNN in Python (Part 1： RNN)

11-15

Emotional contagion in Twitter!

11-14

James Bond movies

11-14

Short Story on AI： A Cognitive Discontinuity.

11-14

History of Monte Carlo Methods - Part 3

11-13

Association rule analysis beyond transaction data

11-11

Golf Balls

11-11

MCMC sampling for dummies

11-10

Neural networks, linear transformations and word embeddings

11-09

Artificial Stupidity and the Mechanistic Fallacy

11-09

“Becoming a Data Scientist” Learning Club?

11-09

Denoising Dirty Documents： Part 11

11-08

Understanding Convolutional Neural Networks for NLP

11-07

The Deep Learning Gold Rush of 2015

11-07

A Torch autoencoder example

11-06

The problem with the data science language wars

11-02

Analyzing Interactive Brokers XML Flex Statements with pandas

11-02

Intro to Recommender Systems： Collaborative Filtering

11-02

Deep Learning for Visual Question Answering

11-02

Most Winning A/B Test Results are Illusory

11-01

Denoising Dirty Documents – Part 10

11-01

History of Monte Carlo Methods - Part 2

10-30

Q-learning with Neural Networks

10-30

Spying on instance methods with Python's mock module

10-29

Hogwild Stochastic Gradient Descent

10-27

The Evolution of Pop Lyrics and a tale of two LDA’s

10-27

Go easy on Volkswagen

10-26

Books for Data Science Beginners, and Data Sources

10-26

What a Deep Neural Network thinks about your

10-25

Reinforcement Learning - Monte Carlo Methods

10-25

Prototyping Long Term Time Series Storage with Kafka and Parquet

10-25

Data Science Tutorials Flipboard Magazine

10-21

Dropout Ensembling In Neural Nets

10-21

Recurrent Neural Networks

10-20

Reinforcement Learning - Part 1

10-19

Theoretical Motivations for Deep Learning

10-18

Analyzing Pronto CycleShare Data with Python and Pandas

10-18

Bayes Primer

10-17

History of Monte Carlo Methods - Part 1

10-16

Clustering debates from UK politicians

10-16

Generating Fibonacci Numbers

10-16

Emoticons decoder for social media sentiment analysis in R

10-16

7 tools in every data scientist’s toolbox

10-15

Denoising Dirty Documents： Part 9

10-15

Deep Learning Startups, Applications and Acquisitions – A Summary

10-13

LOCF and Linear Imputation with PostgreSQL

10-11

How-to： Build a Machine-Learning App Using Sparkling Water and Apache Spark

10-08

On the consistency of ordinal regression methods

10-08

Beer reviews with Recurrent Neural Networks (RNN)

10-08

Predicting Fantasy Football Points

10-07

Yet Another PhD to Data Science Post (Part III)

10-06

The Unbundling of AWS

10-06

Lychrel Numbers

10-05

Craft Software

10-05

Travel Recommendations with Jaccard Similarities

10-03

Experiments with style transfer

10-02

Denoising Dirty Documents： Part 8

10-02

The Julia language for Scientific Computing

10-02

A Brief Guide to the Docker Ecosystem

10-01

Long Short-Term Memory dramatically improves Google Voice etc – now available to a billion users

09-30

Rebuilding Map Example With Apply Functions

09-30

Yet Another PhD to Data Science Post (Part II)

09-29

Beginner Tutorial： Neural Nets in Theano

09-29

Gravity Variations

09-27

Yet Another PhD to Data Science Post (Part I)

09-23

Denoising Dirty Documents： Part 7

09-23

A Few Tips To Make Distributed Teams Work Well

09-23

Business Execution

09-22

MIMIC Data

09-22

Six lines to install and start SparkR on Mac OS X Yosemite

09-21

Twenty Peg Puzzle

09-19

How good are your beliefs? Part 2： The Quiz

09-18

SunJackson

© 2018 - 2019 SunJackson