Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data 01-10
Document worth reading: “Marketing Analytics: Methods, Practice, Implementation, and Links to Other Fields” 12-07
Aella Credit empowers underbanked individuals by using Amazon Rekognition for identity verification 08-15
Why Indian companies should take on different projects than competing Valley companies - an application of Cobb-Douglas 11-07
“Statistical insights into public opinion and politics” (my talk for the Columbia Data Science Society this Wed 9pm) 12-04
(People are missing the point on Wansink, so) what’s the lesson we should be drawing from this story? 09-27
Why, oh why, do so many people embrace the Pacific Garbage Cleanup nonsense? (I have a theory). 09-18
Predicting World Cup dark horses from press coverage using the AYLIEN News API – Monthly Media Roundup 06-15
Alternative approaches to scaling Shiny with RStudio Shiny Server, ShinyProxy or custom architecture. 12-18
Vulcan Post: This AI Startup Is Run By The World’s Top Data Scientists – Lets Anyone Build Predictive Models 09-04
You did a sentiment analysis with tidytext but you forgot to do dependency parsing to answer WHY is something positive/negative 01-08
Anomaly detection on Amazon DynamoDB Streams using the Amazon SageMaker Random Cut Forest algorithm 12-05
Amazon SageMaker now comes with new capabilities for accelerating machine learning experimentation 11-29
Slides from my talk at the R-Ladies Meetup about Interpretable Deep Learning with R, Keras and LIME 10-17
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 2 10-04
“To get started, I suggest coming up with a simple but reasonable model for missingness, then simulate fake complete data followed by a fake missingness pattern, and check that you can recover your missing-data model and your complete data model in that fake-data situation. You can then proceed from there. But if you can’t even do it with fake data, you’re sunk.” 08-27
Document worth reading: “Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution” 08-11
Transfer learning for custom labels using a TensorFlow container and “bring your own algorithm” in Amazon SageMaker 07-27
AWS Deep Learning AMIs now include ONNX, enabling model portability across deep learning frameworks 07-26
Model Updates: Entity-level Sentiment Analysis and Brand New Entity Extraction Models Now Live in the Text Analysis API 07-17
Using Siamese Networks and Pre-Trained Convolutional Neural Networks (CNNs) for Fashion Similarity Matching 07-10
Joint inference or modular inference? Pierre Jacob, Lawrence Murray, Chris Holmes, Christian Robert discuss conditions on the strength and weaknesses of these choices 07-09
How to Use FPGAs for Deep Learning Inference to Perform Land Cover Mapping on Terabytes of Aerial Images 05-29
Two papers released on arXiv, "Operator Variational Inference" and "Model Criticism for Bayesian Causal Inference" 10-30
Discussion of "Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing" 09-19
Top Stories of 2018: 9 Must-have skills you need to become a Data Scientist, updated; Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018 12-14
KDnuggets™ News 18:n44, Nov 21: What is the Best Python IDE for Data Science?; Anticipating the next move in data science 11-21
Top KDnuggets tweets, Oct 31 – Nov 6: 10 More Free Must-Read Books for Machine Learning and Data Science 11-07
Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data 01-10
Core Principles of Sustainable Data Science, Machine Learning and AI Product Development: Research as a core driver 01-09
Don’t reinvent the wheel: making use of shiny extension packages. Join MünsteR for our next meetup! 01-08
What to do when you read a paper and it’s full of errors and the author won’t share the data or be open about the analysis? 01-02
Zak David expresses critical views of some published research in empirical quantitative finance 12-24
Industry Predictions: AI, Machine Learning, Analytics & Data Science Main Developments in 2018 and Key Trends for 2019 12-18
Top Stories of 2018: 9 Must-have skills you need to become a Data Scientist, updated; Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018 12-14
Spark + AI Summit: learn best practices in ML and DL, latest frameworks, and more – special KDnuggets offer 12-14
Top November Stories: The Most in Demand Skills for Data Scientists; What is the Best Python IDE for Data Science? 12-11
KDnuggets™ News 18:n46, Dec 5: AI, Data Science, Analytics 2018 Main Developments, 2019 Key Trends; Deep Learning Cheat Sheets 12-05
Document worth reading: “A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition” 12-05
University of Tennessee Knoxville: Assistant or Associate Professor in Data Science [Knoxville, TN] 11-30
Linking Data Science Activities to Business Initiatives Using the Hypothesis Development Canvas 11-29
KDnuggets™ News 18:n45, Nov 28: Your Favorite Python IDE/editor? Intro to Data Science for Managers 11-28
Humana: Principal Data Scientist/Informatics Principal [Chicago, IL, Dallas, TX and Louisville, KY] 11-27
The best way to visit Luxembourguish castles is doing data science + combinatorial optimization 11-21
UnitedHealth Group: Clinical Data Statistical Analyst – SQL SAS (Clinician Required) [Telecommute] 11-16
UnitedHealth Group: Senior Principal Data Scientist [Telecommute, Central or Eastern Time Zones] 11-16
KDnuggets™ News 18:n43, Nov 14: To get hired as a data scientist, don’t follow the herd; LinkedIn Top Voices in Data Science & Analytics 11-14
Top October Stories: 9 Must-have skills you need to become a Data Scientist, updated; 10 Best Mobile Apps for Data Scientist / Data Analysts 11-09
Facial feedback: “These findings suggest that minute differences in the experimental protocol might lead to theoretically meaningful changes in the outcomes.” 11-01
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: November and Beyond 11-01
Key Takeaways from AI Conference SF, Day 1: Domain Specific Architectures, Emerging China, AI Risks 10-29
Google, Microsoft & Fraunhofer at the First European Edition of Deep Learning World – 12 Nov, 2018 10-23
Top KDnuggets tweets, Oct 3–9: 5 Reasons Logistic Regression should be the first thing you learn when becoming a Data Scientist 10-10
Job: Postdoctoral Researcher in Small Data Deep Learning and Explainable Machine Learning, Livermore, CA 10-08
Colorado State University: Assistant Professor in Industrial and Organizational (IO) Psychology [Fort Collins, CO] 10-05
Document worth reading: “Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances” 10-01
(People are missing the point on Wansink, so) what’s the lesson we should be drawing from this story? 09-27
A psychology researcher uses Stan, multiverse, and open data exploration to explore human memory 09-21
“We continuously increased the number of animals until statistical significance was reached to support our conclusions” . . . I think this is not so bad, actually! 09-04
Document worth reading: “PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison” 08-31
Document worth reading: “Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study” 08-29
Document worth reading: “A Comparative Study on using Principle Component Analysis with Different Text Classifiers” 08-29
“To get started, I suggest coming up with a simple but reasonable model for missingness, then simulate fake complete data followed by a fake missingness pattern, and check that you can recover your missing-data model and your complete data model in that fake-data situation. You can then proceed from there. But if you can’t even do it with fake data, you’re sunk.” 08-27
“The most important aspect of a statistical analysis is not what you do with the data, it’s what data you use” (survey adjustment edition) 08-07
Joint inference or modular inference? Pierre Jacob, Lawrence Murray, Chris Holmes, Christian Robert discuss conditions on the strength and weaknesses of these choices 07-09
Data Science in 30 Minutes: Deep Learning to Detect Fake News with Uber ATG Head of Data Science, Mike Tamir 05-30
Cambridge Analytica, Facebook, and user data – Monthly Media Review with the AYLIEN News API, April 05-03
Two papers released on arXiv, "Operator Variational Inference" and "Model Criticism for Bayesian Causal Inference" 10-30
Bayesian Deep Learning Part II: Bridging PyMC3 and Lasagne to build a Hierarchical Neural Network 07-05
Kent State University: Assistant/Associate Professor – Business Analytics/Information Systems [Kent, OH] 12-19
UnitedHealth Group: Senior Principal Data Scientist [Telecommute, Central or Eastern Time Zones] 11-16
Maps with pie charts on top of each administrative division: an example with Luxembourg’s elections data 10-27
“If you deprive the robot of your intuition about cause and effect, you’re never going to communicate meaningfully.” – Pearl ’18 06-08
“The idea of replication is central not just to scientific practice but also to formal statistics . . . Frequentist statistics relies on the reference set of repeated experiments, and Bayesian statistics relies on the prior distribution which represents the population of effects.” 07-19
Import AI 127: Why language AI advancements may make Google more competitive; COCO image captioning systems don’t live up to the hype, and Amazon sees 3X growth in voice shopping via Alexa 12-31
Now use Pipe mode with CSV datasets for faster training on Amazon SageMaker built-in algorithms 11-01
Amazon SageMaker Neural Topic Model now supports auxiliary vocabulary channel, new topic evaluation metrics, and training subsampling 10-04
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 1 09-26
Import AI: 108: Learning language with fake sentences, Chinese researchers use RL to train prototype warehouse robots; and what the implications are of scaled-up Neural Architecture Search 08-20
Transfer learning for custom labels using a TensorFlow container and “bring your own algorithm” in Amazon SageMaker 07-27
AWS Deep Learning AMIs now with optimized TensorFlow 1.9 and Apache MXNet 1.2 with Keras 2 support to accelerate deep learning on Amazon EC2 instances 07-23
Using Siamese Networks and Pre-Trained Convolutional Neural Networks (CNNs) for Fashion Similarity Matching 07-10
Import AI 127: Why language AI advancements may make Google more competitive; COCO image captioning systems don’t live up to the hype, and Amazon sees 3X growth in voice shopping via Alexa 12-31
“My advisor and I disagree on how we should carry out repeated cross-validation. We would love to have a third expert opinion…” 12-15
Import AI: 122: Google obtains new ImageNet state-of-the-art with GPipe; drone learns to land more effectively than PD controller policy; and Facebook releases its ‘CherryPi’ StarCraft bot 11-26
Now use Pipe mode with CSV datasets for faster training on Amazon SageMaker built-in algorithms 11-01
Amazon SageMaker Neural Topic Model now supports auxiliary vocabulary channel, new topic evaluation metrics, and training subsampling 10-04
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 1 09-26
Import AI: 108: Learning language with fake sentences, Chinese researchers use RL to train prototype warehouse robots; and what the implications are of scaled-up Neural Architecture Search 08-20
Transfer learning for custom labels using a TensorFlow container and “bring your own algorithm” in Amazon SageMaker 07-27
University of San Francisco: Assistant Professor, Tenure Track, Mathematics and Statistics [San Francisco, CA] 10-17
University of Tennessee Knoxville: Assistant or Associate Professor in Data Science [Knoxville, TN] 11-30
Vulcan Post: This AI Startup Is Run By The World’s Top Data Scientists – Lets Anyone Build Predictive Models 09-04
How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest? 10-13
Long Short-Term Memory dramatically improves Google Voice etc – now available to a billion users 09-30
Document worth reading: “Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks” 10-25
Google, Microsoft & Fraunhofer at the First European Edition of Deep Learning World – 12 Nov, 2018 10-23
Document worth reading: “Quantizing deep convolutional networks for efficient inference: A whitepaper” 09-10
Bayesian Deep Learning Part II: Bridging PyMC3 and Lasagne to build a Hierarchical Neural Network 07-05
Long Short-Term Memory dramatically improves Google Voice etc – now available to a billion users 09-30
KDnuggets™ News 19:n02, Jan 9: The cold start problem: how to build your machine learning portfolio; 5 Best Data Visualization Libraries 01-09
KDnuggets™ News 19:n01, Jan 3: The Essence of Machine Learning; A Guide to Decision Trees for Machine Learning and Data Science 01-03
Top KDnuggets tweets, Dec 12-18: Deep Learning Cheat Sheets; The Nate Silver vs. Nassim Taleb Twitter War 12-19
Top Stories of 2018: 9 Must-have skills you need to become a Data Scientist, updated; Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018 12-14
Spark + AI Summit: learn best practices in ML and DL, latest frameworks, and more – special KDnuggets offer 12-14
KDnuggets™ News 18:n47, Dec 12: Common mistakes when doing machine learning; Here are the most popular Python IDEs / Editors 12-12
Top November Stories: The Most in Demand Skills for Data Scientists; What is the Best Python IDE for Data Science? 12-11
KDnuggets™ News 18:n46, Dec 5: AI, Data Science, Analytics 2018 Main Developments, 2019 Key Trends; Deep Learning Cheat Sheets 12-05
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: December and Beyond 12-04
Amazon SageMaker notebooks now support Git integration for increased persistence, collaboration, and reproducibility 11-29
Import AI 121: Sony researchers make ultra-fast ImageNet training breakthrough; Berkeley researchers tackle StarCraft II with modular RL system; and Germany adds €3bn for AI research 11-19
Document worth reading: “Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches” 11-17
Top October Stories: 9 Must-have skills you need to become a Data Scientist, updated; 10 Best Mobile Apps for Data Scientist / Data Analysts 11-09
KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP 11-07
Import AI 119: How to benefit AI research in Africa; German politician calls for billions in spending to prevent country being left behind; and using deep learning to spot thefts 11-05
U. of Zurich: Assistant Professorship in AI and Machine Learning (Non-tenure Track) [Zurich, Switzerland] 10-24
Google, Microsoft & Fraunhofer at the First European Edition of Deep Learning World – 12 Nov, 2018 10-23
Document worth reading: “A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress” 10-15
Top KDnuggets tweets, Oct 3–9: 5 Reasons Logistic Regression should be the first thing you learn when becoming a Data Scientist 10-10
Top September Stories: Essential Math for Data Science: Why and How; Machine Learning Cheat Sheets 10-10
Job: Postdoctoral Researcher in Small Data Deep Learning and Explainable Machine Learning, Livermore, CA 10-08
Top KDnuggets tweets, Sep 26 – Oct 2: Why building your own Deep Learning Computer is 10x cheaper than AWS; 6 Steps To Write Any Machine Learning Algorithm 10-03
KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R 10-03
Data Center Scale Computing and Artificial Intelligence with Matei Zaharia, Inventor of Apache Spark 09-12
Document worth reading: “Attend Before you Act: Leveraging human visual attention for continual learning” 08-03
Long Short-Term Memory dramatically improves Google Voice etc – now available to a billion users 09-30
Slides from my talk at the R-Ladies Meetup about Interpretable Deep Learning with R, Keras and LIME 10-17
Top September Stories: Essential Math for Data Science: Why and How; Machine Learning Cheat Sheets 10-10
Understanding deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras 11-13
Long Short-Term Memory dramatically improves Google Voice etc – now available to a billion users 09-30
Transfer learning for custom labels using a TensorFlow container and “bring your own algorithm” in Amazon SageMaker 07-27
Aella Credit empowers underbanked individuals by using Amazon Rekognition for identity verification 08-15
Amazon Rekognition is now available in the Asia Pacific (Seoul) and Asia Pacific (Mumbai) Regions 08-09
KDnuggets™ News 18:n44, Nov 21: What is the Best Python IDE for Data Science?; Anticipating the next move in data science 11-21
GitHub Python Data Science Spotlight: High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy 10-16
Import AI 127: Why language AI advancements may make Google more competitive; COCO image captioning systems don’t live up to the hype, and Amazon sees 3X growth in voice shopping via Alexa 12-31
Amazon Comprehend introduces new Region availability and language support for French, German, Italian, and Portuguese 10-10
Import AI 127: Why language AI advancements may make Google more competitive; COCO image captioning systems don’t live up to the hype, and Amazon sees 3X growth in voice shopping via Alexa 12-31
Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities 11-22
Slides from my talk at the R-Ladies Meetup about Interpretable Deep Learning with R, Keras and LIME 10-17
Document worth reading: “Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study” 08-29
Amazon Rekognition is now available in the Asia Pacific (Seoul) and Asia Pacific (Mumbai) Regions 08-09
Document worth reading: “Attend Before you Act: Leveraging human visual attention for continual learning” 08-03
Using Siamese Networks and Pre-Trained Convolutional Neural Networks (CNNs) for Fashion Similarity Matching 07-10
Understanding deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras 11-13
Second Annual Data Science Bowl – Part 3 – Automatically Finding the Heart Location in an MRI Image 03-08
Center for Ultrasound Research and Translation, Massachusetts General Hospital: Post-Doctoral Scholar / Research Scientist [Boston, MA] 12-31
Humana: Principal Data Scientist/Informatics Principal [Chicago, IL, Dallas, TX and Louisville, KY] 11-27
KDnuggets™ News 19:n01, Jan 3: The Essence of Machine Learning; A Guide to Decision Trees for Machine Learning and Data Science 01-03
KDnuggets™ News 18:n48, Dec 19: Why You Shouldn’t be a Data Science Generalist; Industry Data Science & Machine Learning 2019 Predictions 12-19
Top KDnuggets tweets, Dec 12-18: Deep Learning Cheat Sheets; The Nate Silver vs. Nassim Taleb Twitter War 12-19
Top KDnuggets tweets, Dec 5-11: How to build a data science project from scratch; NeurIPS 2018 video talk collection 12-13
KDnuggets™ News 18:n44, Nov 21: What is the Best Python IDE for Data Science?; Anticipating the next move in data science 11-21
Top KDnuggets tweets, Oct 31 – Nov 6: 10 More Free Must-Read Books for Machine Learning and Data Science 11-07
KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP 11-07
Top KDnuggets tweets, Oct 3–9: 5 Reasons Logistic Regression should be the first thing you learn when becoming a Data Scientist 10-10
Top KDnuggets tweets, Sep 26 – Oct 2: Why building your own Deep Learning Computer is 10x cheaper than AWS; 6 Steps To Write Any Machine Learning Algorithm 10-03
KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R 10-03
Build an automatic alert system to easily moderate content at scale with Amazon Rekognition Video 08-15
Use AWS Machine Learning to Analyze Customer Calls from Contact Centers (Part 2): Automate, Deploy, and Visualize Analytics using Amazon Transcribe, Amazon Comprehend, AWS CloudFormation, and Amazon QuickSight 12-28
Amazon SageMaker notebooks now support Git integration for increased persistence, collaboration, and reproducibility 11-29
Analyze live video at scale in real time using Amazon Kinesis Video Streams and Amazon SageMaker 11-19
Direct access to Amazon SageMaker notebooks from Amazon VPC by using an AWS PrivateLink endpoint 11-06
Amazon Comprehend introduces new Region availability and language support for French, German, Italian, and Portuguese 10-10
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 2 10-04
Build an automatic alert system to easily moderate content at scale with Amazon Rekognition Video 08-15
Announcing the Artificial Intelligence (AI) Hackathon: Build Intelligent Applications using machine learning APIs and serverless 08-15
Amazon Rekognition is now available in the Asia Pacific (Seoul) and Asia Pacific (Mumbai) Regions 08-09
Document worth reading: “Examining the Use of Neural Networks for Feature Extraction: A Comparative Analysis using Deep Learning, Support Vector Machines, and K-Nearest Neighbor Classifiers” 08-08
Using Siamese Networks and Pre-Trained Convolutional Neural Networks (CNNs) for Fashion Similarity Matching 07-10
Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data 01-10
A potential big problem with placebo tests in econometrics: they’re subject to the “difference between significant and non-significant is not itself statistically significant” issue 09-26
Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda 01-10
Using Siamese Networks and Pre-Trained Convolutional Neural Networks (CNNs) for Fashion Similarity Matching 07-10
Matching (and discarding non-matches) to deal with lack of complete overlap, then regression to adjust for imbalance between treatment and control groups 11-10
Top KDnuggets tweets, Dec 12-18: Deep Learning Cheat Sheets; The Nate Silver vs. Nassim Taleb Twitter War 12-19
Spark + AI Summit: learn best practices in ML and DL, latest frameworks, and more – special KDnuggets offer 12-14
Top November Stories: The Most in Demand Skills for Data Scientists; What is the Best Python IDE for Data Science? 12-11
Data Science in 30 Minutes: Deep Learning to Detect Fake News with Uber ATG Head of Data Science, Mike Tamir 05-30
Searching for the optimal hyper-parameters of an ARIMA model in parallel: the tidy gridsearch approach 11-15
He had a sudden cardiac arrest. How does this change the probability that he has a particular genetic condition? 10-14
Displaying our “R – Quality Control Individual Range Chart Made Nice” inside a Java web App using AJAX – How To. 01-04
Document worth reading: “Recommendation System based on Semantic Scholar Mining and Topic modeling: A behavioral analysis of researchers from six conferences” 01-03
“My advisor and I disagree on how we should carry out repeated cross-validation. We would love to have a third expert opinion…” 12-15
“Increase sample size until statistical significance is reached” is not a valid adaptive trial design; but it’s fixable. 12-07
Document worth reading: “Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches” 11-17
Arnaub Chatterjee discusses artificial intelligence (AI) and machine learning (ML) in healthcare. 10-29
Document worth reading: “Big Data Systems Meet Machine Learning Challenges: Towards Big Data Science as a Service” 10-07
Import AI 121: Sony researchers make ultra-fast ImageNet training breakthrough; Berkeley researchers tackle StarCraft II with modular RL system; and Germany adds €3bn for AI research 11-19
Import AI 119: How to benefit AI research in Africa; German politician calls for billions in spending to prevent country being left behind; and using deep learning to spot thefts 11-05
Import AI: 108: Learning language with fake sentences, Chinese researchers use RL to train prototype warehouse robots; and what the implications are of scaled-up Neural Architecture Search 08-20
Document worth reading: “Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study” 08-29
Document worth reading: “A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress” 10-15
Document worth reading: “Recommendation System based on Semantic Scholar Mining and Topic modeling: A behavioral analysis of researchers from six conferences” 01-03
Document worth reading: “A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data” 12-04
Amazon SageMaker Neural Topic Model now supports auxiliary vocabulary channel, new topic evaluation metrics, and training subsampling 10-04
Document worth reading: “Are screening methods useful in feature selection? An empirical study” 12-18
Document worth reading: “A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition” 12-05
Document worth reading: “A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress” 10-15
Document worth reading: “Vector and Matrix Optimal Mass Transport: Theory, Algorithm, and Applications” 10-14
Document worth reading: “Do Deep Learning Models Have Too Many Parameters An Information Theory Viewpoint” 09-22
“The most important aspect of a statistical analysis is not what you do with the data, it’s what data you use” (survey adjustment edition) 08-07
KDnuggets™ News 19:n01, Jan 3: The Essence of Machine Learning; A Guide to Decision Trees for Machine Learning and Data Science 01-03
KDnuggets™ News 18:n48, Dec 19: Why You Shouldn’t be a Data Science Generalist; Industry Data Science & Machine Learning 2019 Predictions 12-19
Industry Predictions: AI, Machine Learning, Analytics & Data Science Main Developments in 2018 and Key Trends for 2019 12-18
Surprise-hacking: “the narrative of blindness and illusion sells, and therefore continues to be the central thesis of popular books written by psychologists and cognitive scientists” 12-16
KDnuggets™ News 18:n44, Nov 21: What is the Best Python IDE for Data Science?; Anticipating the next move in data science 11-21
Hey, check this out: Columbia’s Data Science Institute is hiring research scientists and postdocs! 11-16
“Law professor Alan Dershowitz’s new book claims that political differences have lately been criminalized in the United States. He has it wrong. Instead, the orderly enforcement of the law has, ludicrously, been framed as political.” 11-12
University of San Francisco: Assistant Professor, Tenure Track, Mathematics and Statistics [San Francisco, CA] 10-17
Document worth reading: “An Analysis of Hierarchical Text Classification Using Word Embeddings” 10-06
How to Build Your Own Blockchain Part 1 — Creating, Storing, Syncing, Displaying, Mining, and Proving Work 10-17
Bayesian Deep Learning Part II: Bridging PyMC3 and Lasagne to build a Hierarchical Neural Network 07-05
Second Annual Data Science Bowl – Part 3 – Automatically Finding the Heart Location in an MRI Image 03-08
Amazon Comprehend introduces new Region availability and language support for French, German, Italian, and Portuguese 10-10
Document worth reading: “A Comparative Study on using Principle Component Analysis with Different Text Classifiers” 08-29
Bayesian Deep Learning Part II: Bridging PyMC3 and Lasagne to build a Hierarchical Neural Network 07-05
Second Annual Data Science Bowl – Part 3 – Automatically Finding the Heart Location in an MRI Image 03-08
“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.” 11-22
Benford’s Law for Fraud Detection with an Application to all Brazilian Presidential Elections from 2002 to 2018 11-17
Document worth reading: “The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers” 12-27
Document worth reading: “Are screening methods useful in feature selection? An empirical study” 12-18
Now use Pipe mode with CSV datasets for faster training on Amazon SageMaker built-in algorithms 11-01
Import AI: 117: Surveillance search engines; harvesting real-world road data with hovering drones; and improving language with unsupervised pre-training 10-22
Short Article Reveals the Undeniable Facts About College Essay Writing Service and How It Can Affect You 10-04
Document worth reading: “Do Deep Learning Models Have Too Many Parameters An Information Theory Viewpoint” 09-22
Don’t reinvent the wheel: making use of shiny extension packages. Join MünsteR for our next meetup! 01-08
What to think about this new study which says that you should limit your alcohol to 5 drinks a week? 10-23
A couple more papers on genetic diversity as an explanation for why Africa and remote Andean countries are so poor while Europe and North America are so wealthy 09-19
“My advisor and I disagree on how we should carry out repeated cross-validation. We would love to have a third expert opinion…” 12-15
Import AI: 123: Facebook sees demands for deep learning services in its data centers grow by 3.5X; why advanced AI might require a global policeforce; and diagnosing natural disasters with deep learning 12-03
Document worth reading: “A Temporal Difference Reinforcement Learning Theory of Emotion: unifying emotion, cognition and adaptive behavior” 08-07
“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.” 11-22
Facial feedback: “These findings suggest that minute differences in the experimental protocol might lead to theoretically meaningful changes in the outcomes.” 11-01
What to think about this new study which says that you should limit your alcohol to 5 drinks a week? 10-23
No code chatbots: TIBCO uses Amazon Lex to put chat interfaces into the hands of business users 09-05
Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda 01-10
Amazon SageMaker Neural Topic Model now supports auxiliary vocabulary channel, new topic evaluation metrics, and training subsampling 10-04
Document worth reading: “Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution” 08-11
Two papers released on arXiv, "Operator Variational Inference" and "Model Criticism for Bayesian Causal Inference" 10-30
Discussion of "Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing" 09-19
Easy CI/CD of GPU applications on Google Cloud including bare-metal using Gitlab and Kubernetes 12-14
Create conda recipe to use C extended Python library on PySpark cluster with Cloudera Data Science Workbench 05-15
“Check yourself before you wreck yourself: Assessing discrete choice models through predictive simulations” 12-29
Zak David expresses critical views of some published research in empirical quantitative finance 12-24
Discussion of "Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing" 09-19
KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP 11-07
GitHub Python Data Science Spotlight: High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy 10-16
Document worth reading: “Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution” 08-11
Document worth reading: “Examining the Use of Neural Networks for Feature Extraction: A Comparative Analysis using Deep Learning, Support Vector Machines, and K-Nearest Neighbor Classifiers” 08-08
How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest? 10-13
Document worth reading: “A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data” 12-04
Import AI: 123: Facebook sees demands for deep learning services in its data centers grow by 3.5X; why advanced AI might require a global police force; and diagnosing natural disasters with deep learning 12-03
Document worth reading: “Saliency Prediction in the Deep Learning Era: An Empirical Investigation” 11-16
Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.” 11-08
Document worth reading: “Attend Before you Act: Leveraging human visual attention for continual learning” 08-03
Top Stories, Dec 24 – Jan 6: The Essence of Machine Learning; Papers with Code: A Fantastic GitHub Resource for Machine Learning 01-08
Top Stories, Dec 17-23: Why You Shouldn’t be a Data Science Generalist; 10 More Must-See Free Courses for Machine Learning and Data Science 12-24
Top Stories, Dec 10-16: Why You Shouldn’t be a Data Science Generalist; Machine Learning & AI Main Developments in 2018 and Key Trends for 2019 12-17
Top KDnuggets tweets, Dec 5-11: How to build a data science project from scratch; NeurIPS 2018 video talk collection 12-13
Top Stories, Dec 3-9: Common mistakes when carrying out machine learning and data science; AI, Data Science, Analytics Main Developments in 2018 and Key Trends for 2019 12-10
Top Stories, Nov 26 – Dec 2: Deep Learning Cheat Sheets; A Complete Guide to Choosing the Best Machine Learning Course 12-03
Top Stories, Nov 19-25: What is the Best Python IDE for Data Science?; Intro to Data Science for Managers 11-26
Top KDnuggets tweets, Nov 14-20: 10 Free Must-See Courses for Machine Learning and Data Science; Great list of 11-21
Top Stories, Nov 12-18: What is the Best Python IDE for Data Science?; To get hired as a data scientist, don’t follow the herd 11-19
Top KDnuggets tweets, Nov 07-13: 10 Free Must-See Courses for Machine Learning and Data Science 11-14
Top Stories, Nov 5-11: The Most in Demand Skills for Data Scientists; 10 Free Must-See Courses for Machine Learning and Data Science 11-13
Top Stories, Oct 29 – Nov 4: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language 11-05
KDnuggets™ News 18:n41, Oct 31: Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn 10-31
Top Stories, Oct 22-28: 9 Must-have skills you need to become a Data Scientist, updated; Named Entity Recognition and Classification with Scikit-Learn 10-29
KDnuggets™ News 18:n40, Oct 24: Graphs Are The Next Frontier In Data Science; Apache Spark Intro for Beginners 10-24
Top Stories, Oct 15-21: Graphs Are The Next Frontier In Data Science; The Main Approaches to Natural Language Processing Tasks 10-22
Top KDnuggets tweets, Oct 10-16: 7 Books to Grasp Mathematical Foundations of Data Science and Machine Learning; 6 Books Every Data Scientist Should Keep Nearby 10-17
KDnuggets™ News 18:n39, Oct 17: 10 Best Mobile Apps for Data Scientist; Vote in new poll: Largest dataset you analyzed? 10-17
KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild 10-10
Top KDnuggets tweets, Sep 26 – Oct 2: Why building your own Deep Learning Computer is 10x cheaper than AWS; 6 Steps To Write Any Machine Learning Algorithm 10-03
Google, Microsoft & Fraunhofer at the First European Edition of Deep Learning World – 12 Nov, 2018 10-23
10 ways you might be able to tell when an area of research is undergoing rapid expansion and society's expectations may be somewhat unrealistic ... 12-13
Zak David expresses critical views of some published research in empirical quantitative finance 12-24
Short Article Reveals the Undeniable Facts About College Essay Writing Service and How It Can Affect You 10-04
“We are reluctant to engage in post hoc speculation about this unexpected result, but it does not clearly support our hypothesis” 11-03
“Thus, a loss aversion principle is rendered superfluous to an account of the phenomena it was introduced to explain.” 12-25
Document worth reading: “Quantizing deep convolutional networks for efficient inference: A whitepaper” 09-10
Watch out for naively (because implicitly based on flat-prior) Bayesian statements based on classical confidence intervals! (Comptroller of the Currency edition) 11-08
“We continuously increased the number of animals until statistical significance was reached to support our conclusions” . . . I think this is not so bad, actually! 09-04
KDnuggets™ News 18:n43, Nov 14: To get hired as a data scientist, don’t follow the herd; LinkedIn Top Voices in Data Science & Analytics 11-14
Document worth reading: “Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences” 12-16
Announcing the Artificial Intelligence (AI) Hackathon: Build Intelligent Applications using machine learning APIs and serverless 08-15
Compare population age structures of Europe NUTS-3 regions and the US counties using ternary color-coding 12-03
Industry Predictions: AI, Machine Learning, Analytics & Data Science Main Developments in 2018 and Key Trends for 2019 12-18
Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.” 11-08
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: November and Beyond 11-01
ITWire: VIDEO Interview with a DataRobot: Greg Michaelson talks AI, banking, machine learning and more 10-24
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: October and Beyond 10-03
Build an automatic alert system to easily moderate content at scale with Amazon Rekognition Video 08-15
Joint inference or modular inference? Pierre Jacob, Lawrence Murray, Chris Holmes, Christian Robert discuss conditions on the strength and weaknesses of these choices 07-09
Discussion of "Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing" 09-19
Three informal case studies: (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs 08-17
Document worth reading: “Weighted Abstract Dialectical Frameworks: Extended and Revised Report” 08-13
These 3 problems destroy many clinical trials (in context of some papers on problems with non-inferiority trials, or problems with clinical trials in general) 11-25
Watch out for naively (because implicitly based on flat-prior) Bayesian statements based on classical confidence intervals! (Comptroller of the Currency edition) 11-08
“We continuously increased the number of animals until statistical significance was reached to support our conclusions” . . . I think this is not so bad, actually! 09-04
These 3 problems destroy many clinical trials (in context of some papers on problems with non-inferiority trials, or problems with clinical trials in general) 11-25
Alternative approaches to scaling Shiny with RStudio Shiny Server, ShinyProxy or custom architecture. 12-18
Document worth reading: “Data Curation with Deep Learning [Vision]: Towards Self Driving Data Curation” 10-13
Aella Credit empowers underbanked individuals by using Amazon Rekognition for identity verification 08-15
Document worth reading: “Attend Before you Act: Leveraging human visual attention for continual learning” 08-03
Document worth reading: “Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems” 08-10
Document worth reading: “A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis” 11-09
“If you deprive the robot of your intuition about cause and effect, you’re never going to communicate meaningfully.” – Pearl ’18 06-08
Matching (and discarding non-matches) to deal with lack of complete overlap, then regression to adjust for imbalance between treatment and control groups 11-10
Industry Predictions: AI, Machine Learning, Analytics & Data Science Main Developments in 2018 and Key Trends for 2019 12-18
Job: Postdoctoral Researcher in Small Data Deep Learning and Explainable Machine Learning, Livermore, CA 10-08
Response to Rafa: Why I don’t think ROC [receiver operating characteristic] works as a model for science 08-05
Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.” 11-08
A psychology researcher uses Stan, multiverse, and open data exploration to explore human memory 09-21
Data Center Scale Computing and Artificial Intelligence with Matei Zaharia, Inventor of Apache Spark 09-12
Document worth reading: “Fog Computing: Survey of Trends, Architectures, Requirements, and Research Directions” 08-22
Document worth reading: “Saliency Prediction in the Deep Learning Era: An Empirical Investigation” 11-16
Announcing the Artificial Intelligence (AI) Hackathon: Build Intelligent Applications using machine learning APIs and serverless 08-15
“We continuously increased the number of animals until statistical significance was reached to support our conclusions” . . . I think this is not so bad, actually! 09-04
How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest? 10-13
Document worth reading: “Machine Learning for Wireless Networks with Artificial Intelligence: A Tutorial on Neural Networks” 10-28
GitHub Python Data Science Spotlight: High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy 10-16
Document worth reading: “Importance of the Mathematical Foundations of Machine Learning Methods for Scientific and Engineering Applications” 09-30
Document worth reading: “PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison” 08-31
Document worth reading: “Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution” 08-11
Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.” 11-08
Document worth reading: “Are screening methods useful in feature selection? An empirical study” 12-18
“Increase sample size until statistical significance is reached” is not a valid adaptive trial design; but it’s fixable. 12-07
A comprehensive list of Machine Learning Resources: Open Courses, Textbooks, Tutorials, Cheat Sheets and more 12-07
Document worth reading: “Vector and Matrix Optimal Mass Transport: Theory, Algorithm, and Applications” 10-14
Why Indian companies should take on different projects than competing Valley companies - an application of Cobb-Douglas 11-07
KDnuggets™ News 19:n02, Jan 9: The cold start problem: how to build your machine learning portfolio; 5 Best Data Visualization Libraries 01-09
University of Tennessee Knoxville: Assistant or Associate Professor in Data Science [Knoxville, TN] 11-30
A comprehensive list of Machine Learning Resources: Open Courses, Textbooks, Tutorials, Cheat Sheets and more 12-07
Top KDnuggets tweets, Nov 07-13: 10 Free Must-See Courses for Machine Learning and Data Science 11-14
Document worth reading: “An Introduction to Inductive Statistical Inference — from Parameter Estimation to Decision-Making” 10-09
(People are missing the point on Wansink, so) what’s the lesson we should be drawing from this story? 09-27
John Hattie’s “Visible Learning”: How much should we trust this influential review of education research? 09-01
Joint inference or modular inference? Pierre Jacob, Lawrence Murray, Chris Holmes, Christian Robert discuss conditions on the strength and weaknesses of these choices 07-09
Alternative approaches to scaling Shiny with RStudio Shiny Server, ShinyProxy or custom architecture. 12-18
NYU Stern: 2019-20 Asst. Professor of Information, Operations & Management Sciences – Information Systems, tenure-track [New York City, NY] 11-14
DePaul University: Two tenure-track/tenured positions in Data Science/Computer Science [Chicago, IL] 11-07
Response to Rafa: Why I don’t think ROC [receiver operating characteristic] works as a model for science 08-05
Data Science in 30 Minutes: Deep Learning to Detect Fake News with Uber ATG Head of Data Science, Mike Tamir 05-30
Thinking is not something that goes on entirely, or even mostly, inside people’s heads. Little... 01-30
Kent State University: Assistant/Associate Professor – Business Analytics/Information Systems [Kent, OH] 12-19
Document worth reading: “Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches” 11-17
A study fails to replicate, but it continues to get referenced as if it had no problems. Communication channels are blocked. 10-24
David Weakliem points out that both economic and cultural issues can be more or less “moralized.” 10-03
Top KDnuggets tweets, Oct 31 – Nov 6: 10 More Free Must-Read Books for Machine Learning and Data Science 11-07
Why Indian companies should take on different projects than competing Valley companies - an application of Cobb-Douglas 11-07
A comprehensive list of Machine Learning Resources: Open Courses, Textbooks, Tutorials, Cheat Sheets and more 12-07
Document worth reading: “Vector and Matrix Optimal Mass Transport: Theory, Algorithm, and Applications” 10-14
Two papers released on arXiv, "Operator Variational Inference" and "Model Criticism for Bayesian Causal Inference" 10-30
Import AI 114: Synthetic images take a big leap forward with BigGANs; US lawmakers call for national AI strategy; researchers probe language reasoning via HotspotQA 10-01
Document worth reading: “A Temporal Difference Reinforcement Learning Theory of Emotion: unifying emotion, cognition and adaptive behavior” 08-07
University of Virginia: Faculty, Open Rank Model and Simulation at the Human-Technology Frontier [Charlottesville, VA] 12-24
Document worth reading: “An Overview of Blockchain Integration with Robotics and Artificial Intelligence” 11-08
Second Annual Data Science Bowl – Part 3 – Automatically Finding the Heart Location in an MRI Image 03-08
“The most important aspect of a statistical analysis is not what you do with the data, it’s what data you use” (survey adjustment edition) 08-07
Don’t reinvent the wheel: making use of shiny extension packages. Join MünsteR for our next meetup! 01-08
Alternative approaches to scaling Shiny with RStudio Shiny Server, ShinyProxy or custom architecture. 12-18
Amazon SageMaker now comes with new capabilities for accelerating machine learning experimentation 11-29
Amazon Rekognition is now available in the Asia Pacific (Seoul) and Asia Pacific (Mumbai) Regions 08-09
Document worth reading: “Machine Learning for Wireless Networks with Artificial Intelligence: A Tutorial on Neural Networks” 10-28
NYU Stern Fubon Center for Technology, Business and Innovation: Fubon Center Faculty Fellow [New York, NY] 01-08
Southern Illinois University Edwardsville: Director of the Center for Predictive Analytics/(Associate) Professor of Mathematics and Statistics [Edwardsville, IL] 01-04
DePaul University: Two tenure-track/tenured positions in Data Science/Computer Science [Chicago, IL] 11-07
Document worth reading: “Fog Computing: Survey of Trends, Architectures, Requirements, and Research Directions” 08-22
Document worth reading: “Examining the Use of Neural Networks for Feature Extraction: A Comparative Analysis using Deep Learning, Support Vector Machines, and K-Nearest Neighbor Classifiers” 08-08
KDnuggets™ News 18:n43, Nov 14: To get hired as a data scientist, don’t follow the herd; LinkedIn Top Voices in Data Science & Analytics 11-14
KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R 10-03
Colorado State University: Assistant Professor in Industrial and Organizational (IO) Psychology [Fort Collins, CO] 10-05
Generating data to explore the myriad causal effects that can be estimated in observational data analysis 11-20
Document worth reading: “Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances” 10-01
Analyze live video at scale in real time using Amazon Kinesis Video Streams and Amazon SageMaker 11-19
Zak David expresses critical views of some published research in empirical quantitative finance 12-24
Document worth reading: “A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis” 11-09
Data Center Scale Computing and Artificial Intelligence with Matei Zaharia, Inventor of Apache Spark 09-12
Top October Stories: 9 Must-have skills you need to become a Data Scientist, updated; 10 Best Mobile Apps for Data Scientist / Data Analysts 11-09
AWS Deep Learning AMIs now with optimized TensorFlow 1.9 and Apache MXNet 1.2 with Keras 2 support to accelerate deep learning on Amazon EC2 instances 07-23
University of Virginia: Faculty, Open Rank Model and Simulation at the Human-Technology Frontier [Charlottesville, VA] 12-24
Generating data to explore the myriad causal effects that can be estimated in observational data analysis 11-20
Document worth reading: “Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances” 10-01
Document worth reading: “A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions” 12-18
UnitedHealth Group: Clinical Data Statistical Analyst – SQL SAS (Clinician Required) [Telecommute] 11-16
Import AI 119: How to benefit AI research in Africa; German politician calls for billions in spending to prevent country being left behind; and using deep learning to spot thefts 11-05
Now use Pipe mode with CSV datasets for faster training on Amazon SageMaker built-in algorithms 11-01
Document worth reading: “A Comparative Study on using Principle Component Analysis with Different Text Classifiers” 08-29
“If you deprive the robot of your intuition about cause and effect, you’re never going to communicate meaningfully.” – Pearl ’18 06-08
Semantic Interoperability: Are you training your AI by mixing data sources that look the same but aren’t? 10-09
Google, Microsoft & Fraunhofer at the First European Edition of Deep Learning World – 12 Nov, 2018 10-23
Vulcan Post: This AI Startup Is Run By The World’s Top Data Scientists – Lets Anyone Build Predictive Models 09-04
Model Updates: Entity-level Sentiment Analysis and Brand New Entity Extraction Models Now Live in the Text Analysis API 07-17
What to think about this new study which says that you should limit your alcohol to 5 drinks a week? 10-23
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: October and Beyond 10-03
Top Stories, Dec 3-9: Common mistakes when carrying out machine learning and data science; AI, Data Science, Analytics Main Developments in 2018 and Key Trends for 2019 12-10
Cornell prof (but not the pizzagate guy!) has one quick trick to getting 1700 peer reviewed publications on your CV 11-04
What to think about this new study which says that you should limit your alcohol to 5 drinks a week? 10-23
Beyond text: How Spokata uses Amazon Polly to make news and information universally accessible as real-time audio 10-04
Import AI 114: Synthetic images take a big leap forward with BigGANs; US lawmakers call for national AI strategy; researchers probe language reasoning via HotspotQA 10-01
Anomaly detection on Amazon DynamoDB Streams using the Amazon SageMaker Random Cut Forest algorithm 12-05
Easy CI/CD of GPU applications on Google Cloud including bare-metal using Gitlab and Kubernetes 12-14
Amazon SageMaker Automatic Model Tuning becomes more efficient with warm start of hyperparameter tuning jobs 11-19
Should we be concerned about MRP estimates being used in later analyses? Maybe. I recommend checking using fake-data simulation. 12-09
Response to Rafa: Why I don’t think ROC [receiver operating characteristic] works as a model for science 08-05
Matching (and discarding non-matches) to deal with lack of complete overlap, then regression to adjust for imbalance between treatment and control groups 11-10
ITWire: VIDEO Interview with a DataRobot: Greg Michaelson talks AI, banking, machine learning and more 10-24
InformationAge: Will 2019 See the Automation of Automation and Push Up Salaries of Data Scientists? 12-11
Data Science in 30 Minutes: Deep Learning to Detect Fake News with Uber ATG Head of Data Science, Mike Tamir 05-30
Document worth reading: “Are screening methods useful in feature selection? An empirical study” 12-18
“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.” 11-22
Facial feedback: “These findings suggest that minute differences in the experimental protocol might lead to theoretically meaningful changes in the outcomes.” 11-01
What to think about this new study which says that you should limit your alcohol to 5 drinks a week? 10-23
Document worth reading: “PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison” 08-31
NYU Stern Fubon Center for Technology, Business and Innovation: Fubon Center Faculty Fellow [New York, NY] 01-08
Document worth reading: “A Survey: Non-Orthogonal Multiple Access with Compressed Sensing Multiuser Detection for mMTC” 12-31
Center for Ultrasound Research and Translation, Massachusetts General Hospital: Post-Doctoral Scholar / Research Scientist [Boston, MA] 12-31
Import AI 127: Why language AI advancements may make Google more competitive; COCO image captioning systems don’t live up to the hype, and Amazon sees 3X growth in voice shopping via Alexa 12-31
University of Virginia: Faculty, Open Rank Model and Simulation at the Human-Technology Frontier [Charlottesville, VA] 12-24
Zak David expresses critical views of some published research in empirical quantitative finance 12-24
Document worth reading: “Marketing Analytics: Methods, Practice, Implementation, and Links to Other Fields” 12-07
Import AI: 123: Facebook sees demands for deep learning services in its data centers grow by 3.5X; why advanced AI might require a global police force; and diagnosing natural disasters with deep learning 12-03
Import AI: 122: Google obtains new ImageNet state-of-the-art with GPipe; drone learns to land more effectively than PD controller policy; and Facebook releases its ‘CherryPi’ StarCraft bot 11-26
Import AI 121: Sony researchers make ultra-fast ImageNet training breakthrough; Berkeley researchers tackle StarCraft II with modular RL system; and Germany adds €3bn for AI research 11-19
U. of Zurich: Assistant Professorship in AI and Machine Learning (Non-tenure Track) [Zurich, Switzerland] 10-24
Document worth reading: “A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress” 10-15
Import AI 114: Synthetic images take a big leap forward with BigGANs; US lawmakers call for national AI strategy; researchers probe language reasoning via HotspotQA 10-01
(People are missing the point on Wansink, so) what’s the lesson we should be drawing from this story? 09-27
A couple more papers on genetic diversity as an explanation for why Africa and remote Andean countries are so poor while Europe and North America are so wealthy 09-19
Top KDnuggets tweets, Nov 14-20: 10 Free Must-See Courses for Machine Learning and Data Science; Great list of 11-21
Document worth reading: “An Analysis of Hierarchical Text Classification Using Word Embeddings” 10-06
About that claim in the NYT that the immigration issue helped Hillary Clinton? The numbers don’t seem to add up. 07-03
Humana: Principal Data Scientist/Informatics Principal [Chicago, IL, Dallas, TX and Louisville, KY] 11-27
Arnaub Chatterjee discusses artificial intelligence (AI) and machine learning (ML) in healthcare. 10-29
Document worth reading: “Importance of the Mathematical Foundations of Machine Learning Methods for Scientific and Engineering Applications” 09-30
“Law professor Alan Dershowitz’s new book claims that political differences have lately been criminalized in the United States. He has it wrong. Instead, the orderly enforcement of the law has, ludicrously, been framed as political.” 11-12
David Weakliem points out that both economic and cultural issues can be more or less “moralized.” 10-03
Why do sociologists (and bloggers) focus on the negative? 5 possible explanations. (A post in the style of Fabio Rojas) 12-17
“Statistical insights into public opinion and politics” (my talk for the Columbia Data Science Society this Wed 9pm) 12-04
Linking Data Science Activities to Business Initiatives Using the Hypothesis Development Canvas 11-29
Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities 11-22
Industry Predictions: AI, Machine Learning, Analytics & Data Science Main Developments in 2018 and Key Trends for 2019 12-18
InformationAge: Will 2019 See the Automation of Automation and Push Up Salaries of Data Scientists? 12-11
What to do when you read a paper and it’s full of errors and the author won’t share the data or be open about the analysis? 01-02
Document worth reading: “An Introduction to Inductive Statistical Inference — from Parameter Estimation to Decision-Making” 10-09
(People are missing the point on Wansink, so) what’s the lesson we should be drawing from this story? 09-27
Job opening at CDC: “The Statistician will play a central role in guiding the statistical methods of all major projects of the Epidemiology and Prevention Branch of the CDC Influenza Division, and aid in designing, analyzing, and interpreting research intended to understand the burden of influenza in the US and internationally and identify the best influenza vaccines and vaccine strategies.” 09-26
John Hattie’s “Visible Learning”: How much should we trust this influential review of education research? 09-01
“The idea of replication is central not just to scientific practice but also to formal statistics . . . Frequentist statistics relies on the reference set of repeated experiments, and Bayesian statistics relies on the prior distribution which represents the population of effects.” 07-19
Document worth reading: “A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis” 11-09
John Hattie’s “Visible Learning”: How much should we trust this influential review of education research? 09-01
What to do when you read a paper and it’s full of errors and the author won’t share the data or be open about the analysis? 01-02
Why do sociologists (and bloggers) focus on the negative? 5 possible explanations. (A post in the style of Fabio Rojas) 12-17
Import AI 119: How to benefit AI research in Africa; German politician calls for billions in spending to prevent country being left behind; and using deep learning to spot thefts 11-05
Bayesian Deep Learning Part II: Bridging PyMC3 and Lasagne to build a Hierarchical Neural Network 07-05
Model Updates: Entity-level Sentiment Analysis and Brand New Entity Extraction Models Now Live in the Text Analysis API 07-17
He had a sudden cardiac arrest. How does this change the probability that he has a particular genetic condition? 10-14
Document worth reading: “A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions” 12-18
KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP 11-07
Document worth reading: “An Analysis of Hierarchical Text Classification Using Word Embeddings” 10-06
Document worth reading: “Importance of the Mathematical Foundations of Machine Learning Methods for Scientific and Engineering Applications” 09-30
Import AI 111: Hacking computers with Generative Adversarial Networks, Facebook trains world-class speech translation in 85 minutes via 128 GPUs, and Europeans use AI to classify 1,000-year-old graffiti. 09-10
Understanding deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras 11-13
John Hattie’s “Visible Learning”: How much should we trust this influential review of education research? 09-01
“It’s Always Sunny in Correlationville: Stories in Science,” or, Science should not be a game of Botticelli 09-08
He had a sudden cardiac arrest. How does this change the probability that he has a particular genetic condition? 10-14
The statistical checklist: Could there be a list of guidelines to help analysts do better work? 07-17
He wants to know what to read and what software to learn, to increase his ability to think about quantitative methods in social science 07-07
Why Almost Everything You’ve Learned About Cheap Custom Essay Is Wrong and What You Should Know 10-04
University of San Francisco: Assistant Professor, Tenure Track, Mathematics and Statistics [San Francisco, CA] 10-17
John Hattie’s “Visible Learning”: How much should we trust this influential review of education research? 09-01
The statistical checklist: Could there be a list of guidelines to help analysts do better work? 07-17
High-profile statistical errors occur in the physical sciences too, it’s not just a problem in social science. 09-15
Document worth reading: “An Introduction to Inductive Statistical Inference — from Parameter Estimation to Decision-Making” 10-09
Document worth reading: “Data Innovation for International Development: An overview of natural language processing for qualitative data analysis” 09-25
Core Principles of Sustainable Data Science, Machine Learning and AI Product Development: Research as a core driver 01-09
Document worth reading: “Examining the Use of Neural Networks for Feature Extraction: A Comparative Analysis using Deep Learning, Support Vector Machines, and K-Nearest Neighbor Classifiers” 08-08
Kent State University: Assistant/Associate Professor – Business Analytics/Information Systems [Kent, OH] 12-19
University of Tennessee Knoxville: Assistant or Associate Professor in Data Science [Knoxville, TN] 11-30
UnitedHealth Group: Clinical Data Statistical Analyst – SQL SAS (Clinician Required) [Telecommute] 11-16
Document worth reading: “Fog Computing: Survey of Trends, Architectures, Requirements, and Research Directions” 08-22
Compare population age structures of Europe NUTS-3 regions and the US counties using ternary color-coding 12-03
Generating data to explore the myriad causal effects that can be estimated in observational data analysis 11-20
Document worth reading: “Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges” 09-26
Discussion of "Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing" 09-19
Why do sociologists (and bloggers) focus on the negative? 5 possible explanations. (A post in the style of Fabio Rojas) 12-17
MRP (multilevel regression and poststratification; Mister P): Clearing up misunderstandings about 01-10
“Identification of and correction for publication bias,” and another discussion of how forking paths is not the same thing as file drawer 08-31
Response to Rafa: Why I don’t think ROC [receiver operating characteristic] works as a model for science 08-05
Southern Illinois University Edwardsville: Director of the Center for Predictive Analytics/(Associate) Professor of Mathematics and Statistics [Edwardsville, IL] 01-04
Hey, check this out: Columbia’s Data Science Institute is hiring research scientists and postdocs! 11-16
KDnuggets™ News 18:n40, Oct 24: Graphs Are The Next Frontier In Data Science; Apache Spark Intro for Beginners 10-24
Searching for the optimal hyper-parameters of an ARIMA model in parallel: the tidy gridsearch approach 11-15
Document worth reading: “A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition” 12-05
Top KDnuggets tweets, Oct 3–9: 5 Reasons Logistic Regression should be the first thing you learn when becoming a Data Scientist 10-10
Data Center Scale Computing and Artificial Intelligence with Matei Zaharia, Inventor of Apache Spark 09-12
Document worth reading: “Recommendation System based on Semantic Scholar Mining and Topic modeling: A behavioral analysis of researchers from six conferences” 01-03
Estimating mortality rates in Puerto Rico after hurricane María using newly released official death counts 06-08
Benford’s Law for Fraud Detection with an Application to all Brazilian Presidential Elections from 2002 to 2018 11-17
Document worth reading: “A practical tutorial on autoencoders for nonlinear feature fusion: Taxonomy, models, software and guidelines” 09-16
Document worth reading: “A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions” 12-18
A psychology researcher uses Stan, multiverse, and open data exploration to explore human memory 09-21
Facial feedback: “These findings suggest that minute differences in the experimental protocol might lead to theoretically meaningful changes in the outcomes.” 11-01
MRP (multilevel regression and poststratification; Mister P): Clearing up misunderstandings about 01-10
Two papers released on arXiv, "Operator Variational Inference" and "Model Criticism for Bayesian Causal Inference" 10-30
Core Principles of Sustainable Data Science, Machine Learning and AI Product Development: Research as a core driver 01-09
Benford’s Law for Fraud Detection with an Application to all Brazilian Presidential Elections from 2002 to 2018 11-17
InformationAge: Will 2019 See the Automation of Automation and Push Up Salaries of Data Scientists? 12-11
Using Natural Language Processing to Combat Filter Bubbles and Fake News – 360° Stance Detection 04-24
Searching for the optimal hyper-parameters of an ARIMA model in parallel: the tidy gridsearch approach 11-15
Analyzing contact center calls—Part 1: Use Amazon Transcribe and Amazon Comprehend to analyze customer sentiment 12-18
Anomaly detection on Amazon DynamoDB Streams using the Amazon SageMaker Random Cut Forest algorithm 12-05
Amazon SageMaker Automatic Model Tuning becomes more efficient with warm start of hyperparameter tuning jobs 11-19
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: December and Beyond 12-04
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: November and Beyond 11-01
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: October and Beyond 10-03
Job opening at CDC: “The Statistician will play a central role in guiding the statistical methods of all major projects of the Epidemiology and Prevention Branch of the CDC Influenza Division, and aid in designing, analyzing, and interpreting research intended to understand the burden of influenza in the US and internationally and identify the best influenza vaccines and vaccine strategies.” 09-26
Response to Rafa: Why I don’t think ROC [receiver operating characteristic] works as a model for science 08-05
Document worth reading: “Fog Computing: Survey of Trends, Architectures, Requirements, and Research Directions” 08-22
✚ Google Dataset Search Impressions, the Challenges of Looking for Data, and Other Places to Find Data 09-13
Document worth reading: “Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances” 10-01
“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.” 11-22
Import AI 114: Synthetic images take a big leap forward with BigGANs; US lawmakers call for national AI strategy; researchers probe language reasoning via HotspotQA 10-01
Core Principles of Sustainable Data Science, Machine Learning and AI Product Development: Research as a core driver 01-09
Cornell prof (but not the pizzagate guy!) has one quick trick to getting 1700 peer reviewed publications on your CV 11-04
Southern Illinois University Edwardsville: Director of the Center for Predictive Analytics/(Associate) Professor of Mathematics and Statistics [Edwardsville, IL] 01-04
Document worth reading: “An Introduction to Inductive Statistical Inference — from Parameter Estimation to Decision-Making” 10-09
Analyze live video at scale in real time using Amazon Kinesis Video Streams and Amazon SageMaker 11-19
Anomaly detection on Amazon DynamoDB Streams using the Amazon SageMaker Random Cut Forest algorithm 12-05
Analyzing contact center calls—Part 1: Use Amazon Transcribe and Amazon Comprehend to analyze customer sentiment 12-18
NYU Stern Fubon Center for Technology, Business and Innovation: Fubon Center Faculty Fellow [New York, NY] 01-08
Spark + AI Summit: learn best practices in ML and DL, latest frameworks, and more – special KDnuggets offer 12-14
Job: Postdoctoral Researcher in Small Data Deep Learning and Explainable Machine Learning, Livermore, CA 10-08
Document worth reading: “Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences” 12-16
Don’t reinvent the wheel: making use of shiny extension packages. Join MünsteR for our next meetup! 01-08
Document worth reading: “Saliency Prediction in the Deep Learning Era: An Empirical Investigation” 11-16
Document worth reading: “An Introduction to Inductive Statistical Inference — from Parameter Estimation to Decision-Making” 10-09
Statistical Modeling, Causal Inference, and Social Science Regrets Its Decision to Hire Cannibal P-hacker as Writer-at-Large 09-29
Document worth reading: “Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges” 09-26
Three informal case studies: (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs 08-17
Use AWS Machine Learning to Analyze Customer Calls from Contact Centers (Part 2): Automate, Deploy, and Visualize Analytics using Amazon Transcribe, Amazon Comprehend, AWS CloudFormation, and Amazon QuickSight 12-28
Analyzing contact center calls—Part 1: Use Amazon Transcribe and Amazon Comprehend to analyze customer sentiment 12-18
Analyze live video at scale in real time using Amazon Kinesis Video Streams and Amazon SageMaker 11-19
Beyond text: How Spokata uses Amazon Polly to make news and information universally accessible as real-time audio 10-04
Build an automatic alert system to easily moderate content at scale with Amazon Rekognition Video 08-15
Announcing the Artificial Intelligence (AI) Hackathon: Build Intelligent Applications using machine learning APIs and serverless 08-15
Document worth reading: “Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study” 08-29
Document worth reading: “An Overview of Blockchain Integration with Robotics and Artificial Intelligence” 11-08
Top KDnuggets tweets, Dec 12-18: Deep Learning Cheat Sheets; The Nate Silver vs. Nassim Taleb Twitter War 12-19
What to do when you read a paper and it’s full of errors and the author won’t share the data or be open about the analysis? 01-02
You did a sentiment analysis with tidytext but you forgot to do dependency parsing to answer WHY is something positive/negative 01-08
KDnuggets™ News 18:n46, Dec 5: AI, Data Science, Analytics 2018 Main Developments, 2019 Key Trends; Deep Learning Cheat Sheets 12-05
Estimating mortality rates in Puerto Rico after hurricane María using newly released official death counts 06-08
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 1 09-26
AWS Deep Learning AMIs now include ONNX, enabling model portability across deep learning frameworks 07-26
Top KDnuggets tweets, Nov 07-13: 10 Free Must-See Courses for Machine Learning and Data Science 11-14
Use AWS Machine Learning to Analyze Customer Calls from Contact Centers (Part 2): Automate, Deploy, and Visualize Analytics using Amazon Transcribe, Amazon Comprehend, AWS CloudFormation, and Amazon QuickSight 12-28
Document worth reading: “Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems” 08-10
Create conda recipe to use C extended Python library on PySpark cluster with Cloudera Data Science Workbench 05-15
NYU Stern Fubon Center for Technology, Business and Innovation: Fubon Center Faculty Fellow [New York, NY] 01-08
Southern Illinois University Edwardsville: Director of the Center for Predictive Analytics/(Associate) Professor of Mathematics and Statistics [Edwardsville, IL] 01-04
DePaul University: Two tenure-track/tenured positions in Data Science/Computer Science [Chicago, IL] 11-07
Humana: Principal Data Scientist/Informatics Principal [Chicago, IL, Dallas, TX and Louisville, KY] 11-27
UnitedHealth Group: Clinical Data Statistical Analyst – SQL SAS (Clinician Required) [Telecommute] 11-16
UnitedHealth Group: Senior Principal Data Scientist [Telecommute, Central or Eastern Time Zones] 11-16
Job opening at CDC: “The Statistician will play a central role in guiding the statistical methods of all major projects of the Epidemiology and Prevention Branch of the CDC Influenza Division, and aid in designing, analyzing, and interpreting research intended to understand the burden of influenza in the US and internationally and identify the best influenza vaccines and vaccine strategies.” 09-26
“The most important aspect of a statistical analysis is not what you do with the data, it’s what data you use” (survey adjustment edition) 08-07
Document worth reading: “Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches” 11-17
Generating data to explore the myriad causal effects that can be estimated in observational data analysis 11-20
Direct access to Amazon SageMaker notebooks from Amazon VPC by using an AWS PrivateLink endpoint 11-06
“Thus, a loss aversion principle is rendered superfluous to an account of the phenomena it was introduced to explain.” 12-25
ITWire: VIDEO Interview with a DataRobot: Greg Michaelson talks AI, banking, machine learning and more 10-24
Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities 11-22
Document worth reading: “Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks” 10-25
Searching for the optimal hyper-parameters of an ARIMA model in parallel: the tidy gridsearch approach 11-15
Benford’s Law for Fraud Detection with an Application to all Brazilian Presidential Elections from 2002 to 2018 11-17
About that claim in the NYT that the immigration issue helped Hillary Clinton? The numbers don’t seem to add up. 07-03
Benford’s Law for Fraud Detection with an Application to all Brazilian Presidential Elections from 2002 to 2018 11-17
Understanding Amazon SageMaker notebook instance networking configurations and advanced routing options 10-24
Document worth reading: “A practical tutorial on autoencoders for nonlinear feature fusion: Taxonomy, models, software and guidelines” 09-16
Document worth reading: “Data Curation with Deep Learning [Vision]: Towards Self Driving Data Curation” 10-13
Document worth reading: “Do Deep Learning Models Have Too Many Parameters An Information Theory Viewpoint” 09-22
Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data 01-10
Top KDnuggets tweets, Oct 10-16: 7 Books to Grasp Mathematical Foundations of Data Science and Machine Learning; 6 Books Every Data Scientist Should Keep Nearby 10-17
Data Science in 30 Minutes: Deep Learning to Detect Fake News with Uber ATG Head of Data Science, Mike Tamir 05-30
Amazon SageMaker notebooks now support Git integration for increased persistence, collaboration, and reproducibility 11-29
No code chatbots: TIBCO uses Amazon Lex to put chat interfaces into the hands of business users 09-05
KDnuggets™ News 18:n45, Nov 28: Your Favorite Python IDE/editor? Intro to Data Science for Managers 11-28
Core Principles of Sustainable Data Science, Machine Learning and AI Product Development: Research as a core driver 01-09
Vulcan Post: This AI Startup Is Run By The World’s Top Data Scientists – Lets Anyone Build Predictive Models 09-04
Aella Credit empowers underbanked individuals by using Amazon Rekognition for identity verification 08-15
How to Build Your Own Blockchain Part 1 — Creating, Storing, Syncing, Displaying, Mining, and Proving Work 10-17
Document worth reading: “An Overview of Blockchain Integration with Robotics and Artificial Intelligence” 11-08
How to Build Your Own Blockchain Part 1 — Creating, Storing, Syncing, Displaying, Mining, and Proving Work 10-17
Top November Stories: The Most in Demand Skills for Data Scientists; What is the Best Python IDE for Data Science? 12-11
Top September Stories: Essential Math for Data Science: Why and How; Machine Learning Cheat Sheets 10-10
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 2 10-04
AWS Deep Learning AMIs now include ONNX, enabling model portability across deep learning frameworks 07-26
AWS Deep Learning AMIs now with optimized TensorFlow 1.9 and Apache MXNet 1.2 with Keras 2 support to accelerate deep learning on Amazon EC2 instances 07-23
Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda 01-10
✚ Google Dataset Search Impressions, the Challenges of Looking for Data, and Other Places to Find Data 09-13
Document worth reading: “Marketing Analytics: Methods, Practice, Implementation, and Links to Other Fields” 12-07
Document worth reading: “Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks” 10-25
Document worth reading: “A Survey of Knowledge Representation and Retrieval for Learning in Service Robotics” 12-26
Import AI: 108: Learning language with fake sentences, Chinese researchers use RL to train prototype warehouse robots; and what the implications are of scaled-up Neural Architecture Search 08-20
Document worth reading: “The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers” 12-27
Why Indian companies should take on different projects than competing Valley companies - an application of Cobb-Douglas 11-07
Understanding deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras 11-13
“Thus, a loss aversion principle is rendered superfluous to an account of the phenomena it was introduced to explain.” 12-25
Top KDnuggets tweets, Oct 10-16: 7 Books to Grasp Mathematical Foundations of Data Science and Machine Learning; 6 Books Every Data Scientist Should Keep Nearby 10-17
Document worth reading: “Vector and Matrix Optimal Mass Transport: Theory, Algorithm, and Applications” 10-14
Document worth reading: “PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison” 08-31
Top KDnuggets tweets, Oct 3–9: 5 Reasons Logistic Regression should be the first thing you learn when becoming a Data Scientist 10-10
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 2 10-04
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 1 09-26
Document worth reading: “Marketing Analytics: Methods, Practice, Implementation, and Links to Other Fields” 12-07
Alternative approaches to scaling Shiny with RStudio Shiny Server, ShinyProxy or custom architecture. 12-18
Document worth reading: “Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks” 10-25
“Identification of and correction for publication bias,” and another discussion of how forking paths is not the same thing as file drawer 08-31
Document worth reading: “A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data” 12-04
Document worth reading: “Data Curation with Deep Learning [Vision]: Towards Self Driving Data Curation” 10-13
Why, oh why, do so many people embrace the Pacific Garbage Cleanup nonsense? (I have a theory). 09-18
“My advisor and I disagree on how we should carry out repeated cross-validation. We would love to have a third expert opinion…” 12-15
✚ Detailed Intentions of a Map, When Everything Leads to Nothing, Designing for Misinterpretations 08-09
Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda 01-10
NYU Stern Fubon Center for Technology, Business and Innovation: Fubon Center Faculty Fellow [New York, NY] 01-08
No code chatbots: TIBCO uses Amazon Lex to put chat interfaces into the hands of business users 09-05
Circle circumference in the hyperbolic plane is exponential in the radius: proof by computer game 04-10
Document worth reading: “A Survey of Knowledge Representation and Retrieval for Learning in Service Robotics” 12-26
Import AI: 108: Learning language with fake sentences, Chinese researchers use RL to train prototype warehouse robots; and what the implications are of scaled-up Neural Architecture Search 08-20
KDnuggets™ News 19:n02, Jan 9: The cold start problem: how to build your machine learning portfolio; 5 Best Data Visualization Libraries 01-09
KDnuggets™ News 19:n01, Jan 3: The Essence of Machine Learning; A Guide to Decision Trees for Machine Learning and Data Science 01-03
KDnuggets™ News 18:n48, Dec 19: Why You Shouldn’t be a Data Science Generalist; Industry Data Science & Machine Learning 2019 Predictions 12-19
KDnuggets™ News 18:n47, Dec 12: Common mistakes when doing machine learning; Here are the most popular Python IDEs / Editors 12-12
KDnuggets™ News 18:n46, Dec 5: AI, Data Science, Analytics 2018 Main Developments, 2019 Key Trends; Deep Learning Cheat Sheets 12-05
KDnuggets™ News 18:n45, Nov 28: Your Favorite Python IDE/editor? Intro to Data Science for Managers 11-28
KDnuggets™ News 18:n44, Nov 21: What is the Best Python IDE for Data Science?; Anticipating the next move in data science 11-21
KDnuggets™ News 18:n43, Nov 14: To get hired as a data scientist, don’t follow the herd; LinkedIn Top Voices in Data Science & Analytics 11-14
KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP 11-07
KDnuggets™ News 18:n41, Oct 31: Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn 10-31
KDnuggets™ News 18:n39, Oct 17: 10 Best Mobile Apps for Data Scientist; Vote in new poll: Largest dataset you analyzed? 10-17
KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild 10-10
KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R 10-03
Cambridge Analytica, Facebook, and user data – Monthly Media Review with the AYLIEN News API, April 05-03
Using Natural Language Processing to Combat Filter Bubbles and Fake News – 360° Stance Detection 04-24
Cambridge Analytica, Facebook, and user data – Monthly Media Review with the AYLIEN News API, April 05-03
Using Natural Language Processing to Combat Filter Bubbles and Fake News – 360° Stance Detection 04-24
Monitoring the media reaction to Facebook’s disastrous earnings call – News API Monthly Media Review 08-16
Cambridge Analytica, Facebook, and user data – Monthly Media Review with the AYLIEN News API, April 05-03
Using Natural Language Processing to Combat Filter Bubbles and Fake News – 360° Stance Detection 04-24
Monitoring the media reaction to Facebook’s disastrous earnings call – News API Monthly Media Review 08-16
Cambridge Analytica, Facebook, and user data – Monthly Media Review with the AYLIEN News API, April 05-03
“Identification of and correction for publication bias,” and another discussion of how forking paths is not the same thing as file drawer 08-31
How to Use FPGAs for Deep Learning Inference to Perform Land Cover Mapping on Terabytes of Aerial Images 05-29
Document worth reading: “Does putting your emotions into words make you feel better? Measuring the minute-scale dynamics of emotions from online data” 08-14
Document worth reading: “A Survey of Knowledge Representation and Retrieval for Learning in Service Robotics” 12-26
ITWire: VIDEO Interview with a DataRobot: Greg Michaelson talks AI, banking, machine learning and more 10-24
Vulcan Post: This AI Startup Is Run By The World’s Top Data Scientists – Lets Anyone Build Predictive Models 09-04
✚ Google Dataset Search Impressions, the Challenges of Looking for Data, and Other Places to Find Data 09-13
Document worth reading: “Does putting your emotions into words make you feel better? Measuring the minute-scale dynamics of emotions from online data” 08-14
AWS Deep Learning AMIs now with optimized TensorFlow 1.9 and Apache MXNet 1.2 with Keras 2 support to accelerate deep learning on Amazon EC2 instances 07-23
“If you deprive the robot of your intuition about cause and effect, you’re never going to communicate meaningfully.” – Pearl ’18 06-08
Monitoring the media reaction to Facebook’s disastrous earnings call – News API Monthly Media Review 08-16
Estimating mortality rates in Puerto Rico after hurricane María using newly released official death counts 06-08
Top KDnuggets tweets, Oct 10-16: 7 Books to Grasp Mathematical Foundations of Data Science and Machine Learning; 6 Books Every Data Scientist Should Keep Nearby 10-17
A psychology researcher uses Stan, multiverse, and open data exploration to explore human memory 09-21
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: December and Beyond 12-04
✚ Detailed Intentions of a Map, When Everything Leads to Nothing, Designing for Misinterpretations 08-09
Predicting World Cup dark horses from press coverage using the AYLIEN News API – Monthly Media Roundup 06-15
PNAS forgets basic principles of game theory, thus dooming thousands of Bothans to the fate of Alderaan 07-04
Predicting World Cup dark horses from press coverage using the AYLIEN News API – Monthly Media Roundup 06-15
Top KDnuggets tweets, Nov 14-20: 10 Free Must-See Courses for Machine Learning and Data Science; Great list of 11-21
“Statistical insights into public opinion and politics” (my talk for the Columbia Data Science Society this Wed 9pm) 12-04
“Law professor Alan Dershowitz’s new book claims that political differences have lately been criminalized in the United States. He has it wrong. Instead, the orderly enforcement of the law has, ludicrously, been framed as political.” 11-12
David Weakliem points out that both economic and cultural issues can be more or less “moralized.” 10-03
What do you do when someone says, “The quote is, this is the exact quote”—and then misquotes you? 09-30
Machine Learning Explainability vs Interpretability: Two concepts that could help restore trust in AI 12-20
Slides from my talk at the R-Ladies Meetup about Interpretable Deep Learning with R, Keras and LIME 10-17
About that claim in the NYT that the immigration issue helped Hillary Clinton? The numbers don’t seem to add up. 07-03
InformationAge: Will 2019 See the Automation of Automation and Push Up Salaries of Data Scientists? 12-11
High-profile statistical errors occur in the physical sciences too, it’s not just a problem in social science. 09-15
PNAS forgets basic principles of game theory, thus dooming thousands of Bothans to the fate of Alderaan 07-04
Tutorial: The practical application of complicated statistical methods to fill up the scientific literature with confusing and irrelevant analyses 07-05
High-profile statistical errors occur in the physical sciences too, it’s not just a problem in social science. 09-15
He wants to know what to read and what software to learn, to increase his ability to think about quantitative methods in social science 07-07
Joint inference or modular inference? Pierre Jacob, Lawrence Murray, Chris Holmes, Christian Robert discuss conditions on the strength and weaknesses of these choices 07-09
U. of Zurich: Assistant Professorship in AI and Machine Learning (Non-tenure Track) [Zurich, Switzerland] 10-24
Colorado State University: Assistant Professor in Industrial and Organizational (IO) Psychology [Fort Collins, CO] 10-05
The statistical checklist: Could there be a list of guidelines to help analysts do better work? 07-17
Model Updates: Entity-level Sentiment Analysis and Brand New Entity Extraction Models Now Live in the Text Analysis API 07-17
“For professional baseball players, faster hand-eye coordination linked to batting performance” 07-18
Authority figures in psychology spread more happy talk, still don’t get the point that much of the published, celebrated, and publicized work in their field is no good (Part 2) 12-31
“The idea of replication is central not just to scientific practice but also to formal statistics . . . Frequentist statistics relies on the reference set of repeated experiments, and Bayesian statistics relies on the prior distribution which represents the population of effects.” 07-19
AWS Deep Learning AMIs now with optimized TensorFlow 1.9 and Apache MXNet 1.2 with Keras 2 support to accelerate deep learning on Amazon EC2 instances 07-23
Import AI 121: Sony researchers make ultra-fast ImageNet training breakthrough; Berkeley researchers tackle StarCraft II with modular RL system; and Germany adds €3bn for AI research 11-19
Surprise-hacking: “the narrative of blindness and illusion sells, and therefore continues to be the central thesis of popular books written by psychologists and cognitive scientists” 12-16
Document worth reading: “A Temporal Difference Reinforcement Learning Theory of Emotion: unifying emotion, cognition and adaptive behavior” 08-07
AWS Deep Learning AMIs now include ONNX, enabling model portability across deep learning frameworks 07-26
Aella Credit empowers underbanked individuals by using Amazon Rekognition for identity verification 08-15
Amazon Rekognition is now available in the Asia Pacific (Seoul) and Asia Pacific (Mumbai) Regions 08-09
Amazon SageMaker Automatic Model Tuning becomes more efficient with warm start of hyperparameter tuning jobs 11-19
Direct access to Amazon SageMaker notebooks from Amazon VPC by using an AWS PrivateLink endpoint 11-06
Transfer learning for custom labels using a TensorFlow container and “bring your own algorithm” in Amazon SageMaker 07-27
Slides from my talk at the R-Ladies Meetup about Interpretable Deep Learning with R, Keras and LIME 10-17
Document worth reading: “Recommendation System based on Semantic Scholar Mining and Topic modeling: A behavioral analysis of researchers from six conferences” 01-03
Document worth reading: “Machine Learning for Wireless Networks with Artificial Intelligence: A Tutorial on Neural Networks” 10-28
These 3 problems destroy many clinical trials (in context of some papers on problems with non-inferiority trials, or problems with clinical trials in general) 11-25
KDnuggets™ News 18:n48, Dec 19: Why You Shouldn’t be a Data Science Generalist; Industry Data Science & Machine Learning 2019 Predictions 12-19
Document worth reading: “Attend Before you Act: Leveraging human visual attention for continual learning” 08-03
Bayes, statistics, and reproducibility: “Many serious problems with statistics in practice arise from Bayesian inference that is not Bayesian enough, or frequentist evaluation that is not frequentist enough, in both cases using replication distributions that do not make scientific sense or do not reflect the actual procedures being performed on the data.” 12-04
Facial feedback: “These findings suggest that minute differences in the experimental protocol might lead to theoretically meaningful changes in the outcomes.” 11-01
A study fails to replicate, but it continues to get referenced as if it had no problems. Communication channels are blocked. 10-24
“The most important aspect of a statistical analysis is not what you do with the data, it’s what data you use” (survey adjustment edition) 08-07
Apache Drill 1.15.0 + sergeant 0.8.0 = pcapng Support, Proper Column Types & Mounds of New Metadata 01-02
An actual quote from a paper published in a medical journal: “The data, analytic methods, and study materials will not be made available to other researchers for purposes of reproducing the results or replicating the procedure.” 10-19
Three informal case studies: (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs 08-17
Document worth reading: “A Temporal Difference Reinforcement Learning Theory of Emotion: unifying emotion, cognition and adaptive behavior” 08-07
Document worth reading: “A Survey: Non-Orthogonal Multiple Access with Compressed Sensing Multiuser Detection for mMTC” 12-31
Document worth reading: “Examining the Use of Neural Networks for Feature Extraction: A Comparative Analysis using Deep Learning, Support Vector Machines, and K-Nearest Neighbor Classifiers” 08-08
✚ Detailed Intentions of a Map, When Everything Leads to Nothing, Designing for Misinterpretations 08-09
Beyond text: How Spokata uses Amazon Polly to make news and information universally accessible as real-time audio 10-04
Build an automatic alert system to easily moderate content at scale with Amazon Rekognition Video 08-15
Document worth reading: “Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems” 08-10
Surprise-hacking: “the narrative of blindness and illusion sells, and therefore continues to be the central thesis of popular books written by psychologists and cognitive scientists” 12-16
Document worth reading: “Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution” 08-11
Document worth reading: “Weighted Abstract Dialectical Frameworks: Extended and Revised Report” 08-13
High-profile statistical errors occur in the physical sciences too, it’s not just a problem in social science. 09-15
Amazon SageMaker now comes with new capabilities for accelerating machine learning experimentation 11-29
Direct access to Amazon SageMaker notebooks from Amazon VPC by using an AWS PrivateLink endpoint 11-06
Document worth reading: “Does putting your emotions into words make you feel better? Measuring the minute-scale dynamics of emotions from online data” 08-14
Announcing the Artificial Intelligence (AI) Hackathon: Build Intelligent Applications using machine learning APIs and serverless 08-15
“My advisor and I disagree on how we should carry out repeated cross-validation. We would love to have a third expert opinion…” 12-15
Monitoring the media reaction to Facebook’s disastrous earnings call – News API Monthly Media Review 08-16
Three informal case studies: (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs 08-17
Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda 01-10
Maps with pie charts on top of each administrative division: an example with Luxembourg’s elections data 10-27
Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities 11-22
Document worth reading: “Fog Computing: Survey of Trends, Architectures, Requirements, and Research Directions” 08-22
Matching (and discarding non-matches) to deal with lack of complete overlap, then regression to adjust for imbalance between treatment and control groups 11-10
Document worth reading: “Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications” 08-25
Analyzing contact center calls—Part 1: Use Amazon Transcribe and Amazon Comprehend to analyze customer sentiment 12-18
“To get started, I suggest coming up with a simple but reasonable model for missingness, then simulate fake complete data followed by a fake missingness pattern, and check that you can recover your missing-data model and your complete data model in that fake-data situation. You can then proceed from there. But if you can’t even do it with fake data, you’re sunk.” 08-27
Short Article Reveals the Undeniable Facts About College Essay Writing Service and How It Can Affect You 10-04
Why Almost Everything You’ve Learned About Cheap Custom Essay Is Wrong and What You Should Know 10-04
Oh, I hate it when work is criticized (or, in this case, fails in attempted replications) and then the original researchers don’t even consider the possibility that maybe in their original work they were inadvertently just finding patterns in noise. 12-13
Document worth reading: “Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks” 10-25
Document worth reading: “A Comparative Study on using Principle Component Analysis with Different Text Classifiers” 08-29
Document worth reading: “Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study” 08-29
Document worth reading: “PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison” 08-31
“Identification of and correction for publication bias,” and another discussion of how forking paths is not the same thing as file drawer 08-31
Authority figures in psychology spread more happy talk, still don’t get the point that much of the published, celebrated, and publicized work in their field is no good (Part 2) 12-31
“We continuously increased the number of animals until statistical significance was reached to support our conclusions” . . . I think this is not so bad, actually! 09-04
You did a sentiment analysis with tidytext but you forgot to do dependency parsing to answer WHY is something positive/negative 01-08
No code chatbots: TIBCO uses Amazon Lex to put chat interfaces into the hands of business users 09-05
These 3 problems destroy many clinical trials (in context of some papers on problems with non-inferiority trials, or problems with clinical trials in general) 11-25
“It’s Always Sunny in Correlationville: Stories in Science,” or, Science should not be a game of Botticelli 09-08
Document worth reading: “Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development” 09-10
Document worth reading: “Quantizing deep convolutional networks for efficient inference: A whitepaper” 09-10
Import AI 111: Hacking computers with Generative Adversarial Networks, Facebook trains world-class speech translation in 85 minutes via 128 GPUs, and Europeans use AI to classify 1,000-year-old graffiti. 09-10
Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities 11-22
✚ Google Dataset Search Impressions, the Challenges of Looking for Data, and Other Places to Find Data 09-13
NYU Stern: 2019-20 Asst. Professor of Information, Operations & Management Sciences – Information Systems, tenure-track [New York City, NY] 11-14
High-profile statistical errors occur in the physical sciences too, it’s not just a problem in social science. 09-15
Document worth reading: “A practical tutorial on autoencoders for nonlinear feature fusion: Taxonomy, models, software and guidelines” 09-16
KDnuggets™ News 18:n41, Oct 31: Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn 10-31
How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest? 10-13
Why, oh why, do so many people embrace the Pacific Garbage Cleanup nonsense? (I have a theory). 09-18
A couple more papers on genetic diversity as an explanation for why Africa and remote Andean countries are so poor while Europe and North America are so wealthy 09-19
Should we be concerned about MRP estimates being used in later analyses? Maybe. I recommend checking using fake-data simulation. 12-09
Linking Data Science Activities to Business Initiatives Using the Hypothesis Development Canvas 11-29
Import AI: 123: Facebook sees demands for deep learning services in its data centers grow by 3.5X; why advanced AI might require a global policeforce; and diagnosing natural disasters with deep learning 12-03
A psychology researcher uses Stan, multiverse, and open data exploration to explore human memory 09-21
Document worth reading: “Do Deep Learning Models Have Too Many Parameters An Information Theory Viewpoint” 09-22
Statistical Modeling, Causal Inference, and Social Science Regrets Its Decision to Hire Cannibal P-hacker as Writer-at-Large 09-29
Document worth reading: “Data Innovation for International Development: An overview of natural language processing for qualitative data analysis” 09-25
Job opening at CDC: “The Statistician will play a central role in guiding the statistical methods of all major projects of the Epidemiology and Prevention Branch of the CDC Influenza Division, and aid in designing, analyzing, and interpreting research intended to understand the burden of influenza in the US and internationally and identify the best influenza vaccines and vaccine strategies.” 09-26
Document worth reading: “Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges” 09-26
A potential big problem with placebo tests in econometrics: they’re subject to the “difference between significant and non-significant is not itself statistically significant” issue 09-26
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 1 09-26
Statistical Modeling, Causal Inference, and Social Science Regrets Its Decision to Hire Cannibal P-hacker as Writer-at-Large 09-29
Document worth reading: “Importance of the Mathematical Foundations of Machine Learning Methods for Scientific and Engineering Applications” 09-30
What do you do when someone says, “The quote is, this is the exact quote”—and then misquotes you? 09-30
Center for Ultrasound Research and Translation, Massachusetts General Hospital: Post-Doctoral Scholar / Research Scientist [Boston, MA] 12-31
Import AI 114: Synthetic images take a big leap forward with BigGANs; US lawmakers call for national AI strategy; researchers probe language reasoning via HotspotQA 10-01
David Weakliem points out that both economic and cultural issues can be more or less “moralized.” 10-03
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: October and Beyond 10-03
KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R 10-03
Top KDnuggets tweets, Sep 26 – Oct 2: Why building your own Deep Learning Computer is 10x cheaper than AWS; 6 Steps To Write Any Machine Learning Algorithm 10-03
GitHub Python Data Science Spotlight: High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy 10-16
Easy CI/CD of GPU applications on Google Cloud including bare-metal using Gitlab and Kubernetes 12-14
Why Almost Everything You’ve Learned About Cheap Custom Essay Is Wrong and What You Should Know 10-04
Beyond text: How Spokata uses Amazon Polly to make news and information universally accessible as real-time audio 10-04
UnitedHealth Group: Senior Principal Data Scientist [Telecommute, Central or Eastern Time Zones] 11-16
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 2 10-04
Document worth reading: “A Survey: Non-Orthogonal Multiple Access with Compressed Sensing Multiuser Detection for mMTC” 12-31
Amazon SageMaker Neural Topic Model now supports auxiliary vocabulary channel, new topic evaluation metrics, and training subsampling 10-04
Short Article Reveals the Undeniable Facts About College Essay Writing Service and How It Can Affect You 10-04
Linking Data Science Activities to Business Initiatives Using the Hypothesis Development Canvas 11-29
Semantic Interoperability: Are you training your AI by mixing data sources that look the same but aren’t? 10-09
Now use Pipe mode with CSV datasets for faster training on Amazon SageMaker built-in algorithms 11-01
Colorado State University: Assistant Professor in Industrial and Organizational (IO) Psychology [Fort Collins, CO] 10-05
Document worth reading: “An Analysis of Hierarchical Text Classification Using Word Embeddings” 10-06
Sunday Morning Video (in French): Les travaux de Grothendieck sur les espaces de Banach, Gilles Pisier (Lectures grothendieckiennes) 10-07
Job: Postdoctoral Researcher in Small Data Deep Learning and Explainable Machine Learning, Livermore, CA 10-08
Sunday Morning Video (in French): Les travaux de Grothendieck sur les espaces de Banach, Gilles Pisier (Lectures grothendieckiennes) 10-07
Document worth reading: “Big Data Systems Meet Machine Learning Challenges: Towards Big Data Science as a Service” 10-07
Document worth reading: “An Overview of Blockchain Integration with Robotics and Artificial Intelligence” 11-08
Semantic Interoperability: Are you training your AI by mixing data sources that look the same but aren’t? 10-09
Top November Stories: The Most in Demand Skills for Data Scientists; What is the Best Python IDE for Data Science? 12-11
Top September Stories: Essential Math for Data Science: Why and How; Machine Learning Cheat Sheets 10-10
KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild 10-10
KDnuggets™ News 18:n40, Oct 24: Graphs Are The Next Frontier In Data Science; Apache Spark Intro for Beginners 10-24
KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild 10-10
Top Stories of 2018: 9 Must-have skills you need to become a Data Scientist, updated; Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018 12-14
KDnuggets™ News 18:n45, Nov 28: Your Favorite Python IDE/editor? Intro to Data Science for Managers 11-28
KDnuggets™ News 18:n41, Oct 31: Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn 10-31
KDnuggets™ News 18:n40, Oct 24: Graphs Are The Next Frontier In Data Science; Apache Spark Intro for Beginners 10-24
KDnuggets™ News 18:n39, Oct 17: 10 Best Mobile Apps for Data Scientist; Vote in new poll: Largest dataset you analyzed? 10-17
KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild 10-10
UnitedHealth Group: Clinical Data Statistical Analyst – SQL SAS (Clinician Required) [Telecommute] 11-16
Amazon Comprehend introduces new Region availability and language support for French, German, Italian, and Portuguese 10-10
University of Virginia: Faculty, Open Rank Model and Simulation at the Human-Technology Frontier [Charlottesville, VA] 12-24
NYU Stern: 2019-20 Asst. Professor of Information, Operations & Management Sciences – Information Systems, tenure-track [New York City, NY] 11-14
Document worth reading: “A Survey of Knowledge Representation and Retrieval for Learning in Service Robotics” 12-26
How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest? 10-13
Document worth reading: “Data Curation with Deep Learning [Vision]: Towards Self Driving Data Curation” 10-13
He had a sudden cardiac arrest. How does this change the probability that he has a particular genetic condition? 10-14
Document worth reading: “Vector and Matrix Optimal Mass Transport: Theory, Algorithm, and Applications” 10-14
Document worth reading: “A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress” 10-15
Amazon SageMaker notebooks now support Git integration for increased persistence, collaboration, and reproducibility 11-29
GitHub Python Data Science Spotlight: High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy 10-16
University of San Francisco: Assistant Professor, Tenure Track, Mathematics and Statistics [San Francisco, CA] 10-17
Key Takeaways from AI Conference SF, Day 1: Domain Specific Architectures, Emerging China, AI Risks 10-29
An actual quote from a paper published in a medical journal: “The data, analytic methods, and study materials will not be made available to other researchers for purposes of reproducing the results or replicating the procedure.” 10-19
Import AI: 117: Surveillance search engines; harvesting real-world road data with hovering drones; and improving language with unsupervised pre-training 10-22
Understanding Amazon SageMaker notebook instance networking configurations and advanced routing options 10-24
U. of Zurich: Assistant Professorship in AI and Machine Learning (Non-tenure Track) [Zurich, Switzerland] 10-24
Center for Ultrasound Research and Translation, Massachusetts General Hospital: Post-Doctoral Scholar / Research Scientist [Boston, MA] 12-31
A study fails to replicate, but it continues to get referenced as if it had no problems. Communication channels are blocked. 10-24
Kent State University: Assistant/Associate Professor – Business Analytics/Information Systems [Kent, OH] 12-19
Maps with pie charts on top of each administrative division: an example with Luxembourg’s elections data 10-27
Document worth reading: “Machine Learning for Wireless Networks with Artificial Intelligence: A Tutorial on Neural Networks” 10-28
Arnaub Chatterjee discusses artificial intelligence (AI) and machine learning (ML) in healthcare. 10-29
Key Takeaways from AI Conference SF, Day 1: Domain Specific Architectures, Emerging China, AI Risks 10-29
Top KDnuggets tweets, Nov 14-20: 10 Free Must-See Courses for Machine Learning and Data Science; Great list of 11-21
Machine Learning Explainability vs Interpretability: Two concepts that could help restore trust in AI 12-20
KDnuggets™ News 18:n41, Oct 31: Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn 10-31
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: November and Beyond 11-01
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: December and Beyond 12-04
“We are reluctant to engage in post hoc speculation about this unexpected result, but it does not clearly support our hypothesis” 11-03
Cornell prof (but not the pizzagate guy!) has one quick trick to getting 1700 peer reviewed publications on your CV 11-04
Import AI 119: How to benefit AI research in Africa; German politician calls for billions in spending to prevent country being left behind; and using deep learning to spot thefts 11-05
KDnuggets™ News 19:n02, Jan 9: The cold start problem: how to build your machine learning portfolio; 5 Best Data Visualization Libraries 01-09
Postdocs and Research fellows for combining probabilistic programming, simulators and interactive AI 11-06
DePaul University: Two tenure-track/tenured positions in Data Science/Computer Science [Chicago, IL] 11-07
Top KDnuggets tweets, Oct 31 – Nov 6: 10 More Free Must-Read Books for Machine Learning and Data Science 11-07
Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.” 11-08
Watch out for naively (because implicitly based on flat-prior) Bayesian statements based on classical confidence intervals! (Comptroller of the Currency edition) 11-08
Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.” 11-08
Document worth reading: “An Overview of Blockchain Integration with Robotics and Artificial Intelligence” 11-08
Document worth reading: “A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis” 11-09
Top October Stories: 9 Must-have skills you need to become a Data Scientist, updated; 10 Best Mobile Apps for Data Scientist / Data Analysts 11-09
Matching (and discarding non-matches) to deal with lack of complete overlap, then regression to adjust for imbalance between treatment and control groups 11-10
“Law professor Alan Dershowitz’s new book claims that political differences have lately been criminalized in the United States. He has it wrong. Instead, the orderly enforcement of the law has, ludicrously, been framed as political.” 11-12
KDnuggets™ News 18:n47, Dec 12: Common mistakes when doing machine learning; Here are the most popular Python IDEs / Editors 12-12
KDnuggets™ News 18:n43, Nov 14: To get hired as a data scientist, don’t follow the herd; LinkedIn Top Voices in Data Science & Analytics 11-14
Top KDnuggets tweets, Nov 07-13: 10 Free Must-See Courses for Machine Learning and Data Science 11-14
NYU Stern: 2019-20 Asst. Professor of Information, Operations & Management Sciences – Information Systems, tenure-track [New York City, NY] 11-14
Searching for the optimal hyper-parameters of an ARIMA model in parallel: the tidy gridsearch approach 11-15
UnitedHealth Group: Senior Principal Data Scientist [Telecommute, Central or Eastern Time Zones] 11-16
Hey, check this out: Columbia’s Data Science Institute is hiring research scientists and postdocs! 11-16
Document worth reading: “Saliency Prediction in the Deep Learning Era: An Empirical Investigation” 11-16
Document worth reading: “Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches” 11-17
Document worth reading: “A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions” 12-18
Amazon SageMaker Automatic Model Tuning becomes more efficient with warm start of hyperparameter tuning jobs 11-19
Analyze live video at scale in real time using Amazon Kinesis Video Streams and Amazon SageMaker 11-19
Import AI 121: Sony researchers make ultra-fast ImageNet training breakthrough; Berkeley researchers tackle StarCraft II with modular RL system; and Germany adds €3bn for AI research 11-19
Generating data to explore the myriad causal effects that can be estimated in observational data analysis 11-20
The best way to visit Luxembourguish castles is doing data science + combinatorial optimization 11-21
“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.” 11-22
These 3 problems destroy many clinical trials (in context of some papers on problems with non-inferiority trials, or problems with clinical trials in general) 11-25
Import AI: 122: Google obtains new ImageNet state-of-the-art with GPipe; drone learns to land more effectively than PD controller policy; and Facebook releases its ‘CherryPi’ StarCraft bot 11-26
Document worth reading: “An exploration of algorithmic discrimination in data and classification” 11-27
Humana: Principal Data Scientist/Informatics Principal [Chicago, IL, Dallas, TX and Louisville, KY] 11-27
KDnuggets™ News 18:n47, Dec 12: Common mistakes when doing machine learning; Here are the most popular Python IDEs / Editors 12-12
KDnuggets™ News 18:n45, Nov 28: Your Favorite Python IDE/editor? Intro to Data Science for Managers 11-28
Linking Data Science Activities to Business Initiatives Using the Hypothesis Development Canvas 11-29
Amazon SageMaker notebooks now support Git integration for increased persistence, collaboration, and reproducibility 11-29
Anomaly detection on Amazon DynamoDB Streams using the Amazon SageMaker Random Cut Forest algorithm 12-05
Amazon SageMaker now comes with new capabilities for accelerating machine learning experimentation 11-29
University of Tennessee Knoxville: Assistant or Associate Professor in Data Science [Knoxville, TN] 11-30
Spark + AI Summit: learn best practices in ML and DL, latest frameworks, and more – special KDnuggets offer 12-14
Compare population age structures of Europe NUTS-3 regions and the US counties using ternary color-coding 12-03
Import AI: 123: Facebook sees demands for deep learning services in its data centers grow by 3.5X; why advanced AI might require a global policeforce; and diagnosing natural disasters with deep learning 12-03
Authority figures in psychology spread more happy talk, still don’t get the point that much of the published, celebrated, and publicized work in their field is no good (Part 2) 12-31
Bayes, statistics, and reproducibility: “Many serious problems with statistics in practice arise from Bayesian inference that is not Bayesian enough, or frequentist evaluation that is not frequentist enough, in both cases using replication distributions that do not make scientific sense or do not reflect the actual procedures being performed on the data.” 12-04
Document worth reading: “A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data” 12-04
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: December and Beyond 12-04
“Statistical insights into public opinion and politics” (my talk for the Columbia Data Science Society this Wed 9pm) 12-04
Document worth reading: “A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition” 12-05
KDnuggets™ News 18:n48, Dec 19: Why You Shouldn’t be a Data Science Generalist; Industry Data Science & Machine Learning 2019 Predictions 12-19
KDnuggets™ News 18:n46, Dec 5: AI, Data Science, Analytics 2018 Main Developments, 2019 Key Trends; Deep Learning Cheat Sheets 12-05
“Increase sample size until statistical significance is reached” is not a valid adaptive trial design; but it’s fixable. 12-07
A comprehensive list of Machine Learning Resources: Open Courses, Textbooks, Tutorials, Cheat Sheets and more 12-07
Document worth reading: “Marketing Analytics: Methods, Practice, Implementation, and Links to Other Fields” 12-07
Apache Drill 1.15.0 + sergeant 0.8.0 = pcapng Support, Proper Column Types & Mounds of New Metadata 01-02
Should we be concerned about MRP estimates being used in later analyses? Maybe. I recommend checking using fake-data simulation. 12-09
Document worth reading: “A Survey: Non-Orthogonal Multiple Access with Compressed Sensing Multiuser Detection for mMTC” 12-31
InformationAge: Will 2019 See the Automation of Automation and Push Up Salaries of Data Scientists? 12-11
KDnuggets™ News 18:n47, Dec 12: Common mistakes when doing machine learning; Here are the most popular Python IDEs / Editors 12-12
Machine Learning Explainability vs Interpretability: Two concepts that could help restore trust in AI 12-20
Top KDnuggets tweets, Dec 5-11: How to build a data science project from scratch; NeurIPS 2018 video talk collection 12-13
Oh, I hate it when work is criticized (or, in this case, fails in attempted replications) and then the original researchers don’t even consider the possibility that maybe in their original work they were inadvertently just finding patterns in noise. 12-13
Top Stories of 2018: 9 Must-have skills you need to become a Data Scientist, updated; Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018 12-14
Easy CI/CD of GPU applications on Google Cloud including bare-metal using Gitlab and Kubernetes 12-14
Surprise-hacking: “the narrative of blindness and illusion sells, and therefore continues to be the central thesis of popular books written by psychologists and cognitive scientists” 12-16
Document worth reading: “Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences” 12-16
Why do sociologists (and bloggers) focus on the negative? 5 possible explanations. (A post in the style of Fabio Rojas) 12-17
KDnuggets™ News 19:n02, Jan 9: The cold start problem: how to build your machine learning portfolio; 5 Best Data Visualization Libraries 01-09
Comparing racism from different eras: If only Tucker Carlson had been around in the 1950s he could’ve been a New York Intellectual. 12-18
Document worth reading: “A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions” 12-18
Document worth reading: “Are screening methods useful in feature selection? An empirical study” 12-18
Analyzing contact center calls—Part 1: Use Amazon Transcribe and Amazon Comprehend to analyze customer sentiment 12-18
Top KDnuggets tweets, Dec 12-18: Deep Learning Cheat Sheets; The Nate Silver vs. Nassim Taleb Twitter War 12-19
Kent State University: Assistant/Associate Professor – Business Analytics/Information Systems [Kent, OH] 12-19
Exploring model fit by looking at a histogram of a posterior simulation draw of a set of parameters in a hierarchical model 12-20
Machine Learning Explainability vs Interpretability: Two concepts that could help restore trust in AI 12-20
University of Virginia: Faculty, Open Rank Model and Simulation at the Human-Technology Frontier [Charlottesville, VA] 12-24
“Thus, a loss aversion principle is rendered superfluous to an account of the phenomena it was introduced to explain.” 12-25
Document worth reading: “A Survey of Knowledge Representation and Retrieval for Learning in Service Robotics” 12-26
Document worth reading: “The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers” 12-27
Use AWS Machine Learning to Analyze Customer Calls from Contact Centers (Part 2): Automate, Deploy, and Visualize Analytics using Amazon Transcribe, Amazon Comprehend, AWS CloudFormation, and Amazon QuickSight 12-28
“Check yourself before you wreck yourself: Assessing discrete choice models through predictive simulations” 12-29
This dance, it’s like a weapon: Radiohead’s and Beck’s danceability, valence, popularity, and more from the LastFM and Spotify APIs 12-30
Center for Ultrasound Research and Translation, Massachusetts General Hospital: Post-Doctoral Scholar / Research Scientist [Boston, MA] 12-31
Document worth reading: “A Survey: Non-Orthogonal Multiple Access with Compressed Sensing Multiuser Detection for mMTC” 12-31
Authority figures in psychology spread more happy talk, still don’t get the point that much of the published, celebrated, and publicized work in their field is no good (Part 2) 12-31
What to do when you read a paper and it’s full of errors and the author won’t share the data or be open about the analysis? 01-02
Apache Drill 1.15.0 + sergeant 0.8.0 = pcapng Support, Proper Column Types & Mounds of New Metadata 01-02
Document worth reading: “Recommendation System based on Semantic Scholar Mining and Topic modeling: A behavioral analysis of researchers from six conferences” 01-03
Southern Illinois University Edwardsville: Director of the Center for Predictive Analytics/(Associate) Professor of Mathematics and Statistics [Edwardsville, IL] 01-04
Displaying our “R – Quality Control Individual Range Chart Made Nice” inside a Java web App using AJAX – How To. 01-04
Don’t reinvent the wheel: making use of shiny extension packages. Join MünsteR for our next meetup! 01-08
You did a sentiment analysis with tidytext but you forgot to do dependency parsing to answer WHY is something positive/negative 01-08
Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data 01-10
MRP (multilevel regression and poststratification; Mister P): Clearing up misunderstandings about 01-10