Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda 01-10
MRP (multilevel regression and poststratification; Mister P): Clearing up misunderstandings about 01-10
Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data 01-10
Core Principles of Sustainable Data Science, Machine Learning and AI Product Development: Research as a core driver 01-09
KDnuggets™ News 19:n02, Jan 9: The cold start problem: how to build your machine learning portfolio; 5 Best Data Visualization Libraries 01-09
NYU Stern Fubon Center for Technology, Business and Innovation: Fubon Center Faculty Fellow [New York, NY] 01-08
You did a sentiment analysis with tidytext but you forgot to do dependency parsing to answer WHY is something positive/negative 01-08
Top Stories, Dec 24 – Jan 6: The Essence of Machine Learning; Papers with Code: A Fantastic GitHub Resource for Machine Learning 01-08
Don’t reinvent the wheel: making use of shiny extension packages. Join MünsteR for our next meetup! 01-08
Displaying our “R – Quality Control Individual Range Chart Made Nice” inside a Java web App using AJAX – How To. 01-04
Southern Illinois University Edwardsville: Director of the Center for Predictive Analytics/(Associate) Professor of Mathematics and Statistics [Edwardsville, IL] 01-04
KDnuggets™ News 19:n01, Jan 3: The Essence of Machine Learning; A Guide to Decision Trees for Machine Learning and Data Science 01-03
Document worth reading: “Recommendation System based on Semantic Scholar Mining and Topic modeling: A behavioral analysis of researchers from six conferences” 01-03
Apache Drill 1.15.0 + sergeant 0.8.0 = pcapng Support, Proper Column Types & Mounds of New Metadata 01-02
What to do when you read a paper and it’s full of errors and the author won’t share the data or be open about the analysis? 01-02
Authority figures in psychology spread more happy talk, still don’t get the point that much of the published, celebrated, and publicized work in their field is no good (Part 2) 12-31
Document worth reading: “A Survey: Non-Orthogonal Multiple Access with Compressed Sensing Multiuser Detection for mMTC” 12-31
Center for Ultrasound Research and Translation, Massachusetts General Hospital: Post-Doctoral Scholar / Research Scientist [Boston, MA] 12-31
Import AI 127: Why language AI advancements may make Google more competitive; COCO image captioning systems don’t live up to the hype, and Amazon sees 3X growth in voice shopping via Alexa 12-31
This dance, it’s like a weapon: Radiohead’s and Beck’s danceability, valence, popularity, and more from the LastFM and Spotify APIs 12-30
“Check yourself before you wreck yourself: Assessing discrete choice models through predictive simulations” 12-29
Use AWS Machine Learning to Analyze Customer Calls from Contact Centers (Part 2): Automate, Deploy, and Visualize Analytics using Amazon Transcribe, Amazon Comprehend, AWS CloudFormation, and Amazon QuickSight 12-28
Document worth reading: “The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers” 12-27
Document worth reading: “A Survey of Knowledge Representation and Retrieval for Learning in Service Robotics” 12-26
“Thus, a loss aversion principle is rendered superfluous to an account of the phenomena it was introduced to explain.” 12-25
University of Virginia: Faculty, Open Rank Model and Simulation at the Human-Technology Frontier [Charlottesville, VA] 12-24
Top Stories, Dec 17-23: Why You Shouldn’t be a Data Science Generalist; 10 More Must-See Free Courses for Machine Learning and Data Science 12-24
Zak David expresses critical views of some published research in empirical quantitative finance 12-24
Machine Learning Explainability vs Interpretability: Two concepts that could help restore trust in AI 12-20
Exploring model fit by looking at a histogram of a posterior simulation draw of a set of parameters in a hierarchical model 12-20
KDnuggets™ News 18:n48, Dec 19: Why You Shouldn’t be a Data Science Generalist; Industry Data Science & Machine Learning 2019 Predictions 12-19
Kent State University: Assistant/Associate Professor – Business Analytics/Information Systems [Kent, OH] 12-19
Top KDnuggets tweets, Dec 12-18: Deep Learning Cheat Sheets; The Nate Silver vs. Nassim Taleb Twitter War 12-19
Alternative approaches to scaling Shiny with RStudio Shiny Server, ShinyProxy or custom architecture. 12-18
Analyzing contact center calls—Part 1: Use Amazon Transcribe and Amazon Comprehend to analyze customer sentiment 12-18
Industry Predictions: AI, Machine Learning, Analytics & Data Science Main Developments in 2018 and Key Trends for 2019 12-18
Document worth reading: “Are screening methods useful in feature selection? An empirical study” 12-18
Document worth reading: “A Study and Comparison of Human and Deep Learning Recognition Performance Under Visual Distortions” 12-18
Comparing racism from different eras: If only Tucker Carlson had been around in the 1950s he could’ve been a New York Intellectual. 12-18
Why do sociologists (and bloggers) focus on the negative? 5 possible explanations. (A post in the style of Fabio Rojas) 12-17
Top Stories, Dec 10-16: Why You Shouldn’t be a Data Science Generalist; Machine Learning & AI Main Developments in 2018 and Key Trends for 2019 12-17
Document worth reading: “Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences” 12-16
Surprise-hacking: “the narrative of blindness and illusion sells, and therefore continues to be the central thesis of popular books written by psychologists and cognitive scientists” 12-16
“My advisor and I disagree on how we should carry out repeated cross-validation. We would love to have a third expert opinion…” 12-15
Easy CI/CD of GPU applications on Google Cloud including bare-metal using Gitlab and Kubernetes 12-14
Top Stories of 2018: 9 Must-have skills you need to become a Data Scientist, updated; Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018 12-14
Spark + AI Summit: learn best practices in ML and DL, latest frameworks, and more – special KDnuggets offer 12-14
Oh, I hate it when work is criticized (or, in this case, fails in attempted replications) and then the original researchers don’t even consider the possibility that maybe in their original work they were inadvertently just finding patterns in noise. 12-13
Top KDnuggets tweets, Dec 5-11: How to build a data science project from scratch; NeurIPS 2018 video talk collection 12-13
KDnuggets™ News 18:n47, Dec 12: Common mistakes when doing machine learning; Here are the most popular Python IDEs / Editors 12-12
InformationAge: Will 2019 See the Automation of Automation and Push Up Salaries of Data Scientists? 12-11
Top November Stories: The Most in Demand Skills for Data Scientists; What is the Best Python IDE for Data Science? 12-11
Top Stories, Dec 3-9: Common mistakes when carrying out machine learning and data science; AI, Data Science, Analytics Main Developments in 2018 and Key Trends for 2019 12-10
Should we be concerned about MRP estimates being used in later analyses? Maybe. I recommend checking using fake-data simulation. 12-09
Document worth reading: “Marketing Analytics: Methods, Practice, Implementation, and Links to Other Fields” 12-07
A comprehensive list of Machine Learning Resources: Open Courses, Textbooks, Tutorials, Cheat Sheets and more 12-07
“Increase sample size until statistical significance is reached” is not a valid adaptive trial design; but it’s fixable. 12-07
KDnuggets™ News 18:n46, Dec 5: AI, Data Science, Analytics 2018 Main Developments, 2019 Key Trends; Deep Learning Cheat Sheets 12-05
Anomaly detection on Amazon DynamoDB Streams using the Amazon SageMaker Random Cut Forest algorithm 12-05
Document worth reading: “A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition” 12-05
“Statistical insights into public opinion and politics” (my talk for the Columbia Data Science Society this Wed 9pm) 12-04
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: December and Beyond 12-04
Document worth reading: “A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data” 12-04
Bayes, statistics, and reproducibility: “Many serious problems with statistics in practice arise from Bayesian inference that is not Bayesian enough, or frequentist evaluation that is not frequentist enough, in both cases using replication distributions that do not make scientific sense or do not reflect the actual procedures being performed on the data.” 12-04
Import AI: 123: Facebook sees demands for deep learning services in its data centers grow by 3.5X; why advanced AI might require a global police force; and diagnosing natural disasters with deep learning 12-03
Compare population age structures of Europe NUTS-3 regions and the US counties using ternary color-coding 12-03
Top Stories, Nov 26 – Dec 2: Deep Learning Cheat Sheets; A Complete Guide to Choosing the Best Machine Learning Course 12-03
University of Tennessee Knoxville: Assistant or Associate Professor in Data Science [Knoxville, TN] 11-30
Amazon SageMaker now comes with new capabilities for accelerating machine learning experimentation 11-29
Amazon SageMaker notebooks now support Git integration for increased persistence, collaboration, and reproducibility 11-29
Linking Data Science Activities to Business Initiatives Using the Hypothesis Development Canvas 11-29
KDnuggets™ News 18:n45, Nov 28: Your Favorite Python IDE/editor? Intro to Data Science for Managers 11-28
Humana: Principal Data Scientist/Informatics Principal [Chicago, IL, Dallas, TX and Louisville, KY] 11-27
Document worth reading: “An exploration of algorithmic discrimination in data and classification” 11-27
Import AI: 122: Google obtains new ImageNet state-of-the-art with GPipe; drone learns to land more effectively than PD controller policy; and Facebook releases its ‘CherryPi’ StarCraft bot 11-26
Top Stories, Nov 19-25: What is the Best Python IDE for Data Science?; Intro to Data Science for Managers 11-26
These 3 problems destroy many clinical trials (in context of some papers on problems with non-inferiority trials, or problems with clinical trials in general) 11-25
“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.” 11-22
Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities 11-22
KDnuggets™ News 18:n44, Nov 21: What is the Best Python IDE for Data Science?; Anticipating the next move in data science 11-21
Top KDnuggets tweets, Nov 14-20: 10 Free Must-See Courses for Machine Learning and Data Science; Great list of 11-21
The best way to visit Luxembourguish castles is doing data science + combinatorial optimization 11-21
Generating data to explore the myriad causal effects that can be estimated in observational data analysis 11-20
Import AI 121: Sony researchers make ultra-fast ImageNet training breakthrough; Berkeley researchers tackle StarCraft II with modular RL system; and Germany adds €3bn for AI research 11-19
Analyze live video at scale in real time using Amazon Kinesis Video Streams and Amazon SageMaker 11-19
Amazon SageMaker Automatic Model Tuning becomes more efficient with warm start of hyperparameter tuning jobs 11-19
Top Stories, Nov 12-18: What is the Best Python IDE for Data Science?; To get hired as a data scientist, don’t follow the herd 11-19
Benford’s Law for Fraud Detection with an Application to all Brazilian Presidential Elections from 2002 to 2018 11-17
Document worth reading: “Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches” 11-17
UnitedHealth Group: Clinical Data Statistical Analyst – SQL SAS (Clinician Required) [Telecommute] 11-16
Document worth reading: “Saliency Prediction in the Deep Learning Era: An Empirical Investigation” 11-16
Hey, check this out: Columbia’s Data Science Institute is hiring research scientists and postdocs! 11-16
UnitedHealth Group: Senior Principal Data Scientist [Telecommute, Central or Eastern Time Zones] 11-16
Searching for the optimal hyper-parameters of an ARIMA model in parallel: the tidy gridsearch approach 11-15
NYU Stern: 2019-20 Asst. Professor of Information, Operations & Management Sciences – Information Systems, tenure-track [New York City, NY] 11-14
Top KDnuggets tweets, Nov 07-13: 10 Free Must-See Courses for Machine Learning and Data Science 11-14
KDnuggets™ News 18:n43, Nov 14: To get hired as a data scientist, don’t follow the herd; LinkedIn Top Voices in Data Science & Analytics 11-14
Top Stories, Nov 5-11: The Most in Demand Skills for Data Scientists; 10 Free Must-See Courses for Machine Learning and Data Science 11-13
“Law professor Alan Dershowitz’s new book claims that political differences have lately been criminalized in the United States. He has it wrong. Instead, the orderly enforcement of the law has, ludicrously, been framed as political.” 11-12
Matching (and discarding non-matches) to deal with lack of complete overlap, then regression to adjust for imbalance between treatment and control groups 11-10
Top October Stories: 9 Must-have skills you need to become a Data Scientist, updated; 10 Best Mobile Apps for Data Scientist / Data Analysts 11-09
Document worth reading: “A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis” 11-09
Document worth reading: “An Overview of Blockchain Integration with Robotics and Artificial Intelligence” 11-08
Watch out for naively (because implicitly based on flat-prior) Bayesian statements based on classical confidence intervals! (Comptroller of the Currency edition) 11-08
Melanie Mitchell says, “As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well.” 11-08
Top KDnuggets tweets, Oct 31 – Nov 6: 10 More Free Must-Read Books for Machine Learning and Data Science 11-07
KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP 11-07
DePaul University: Two tenure-track/tenured positions in Data Science/Computer Science [Chicago, IL] 11-07
Direct access to Amazon SageMaker notebooks from Amazon VPC by using an AWS PrivateLink endpoint 11-06
Postdocs and Research fellows for combining probabilistic programming, simulators and interactive AI 11-06
Top Stories, Oct 29 – Nov 4: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language 11-05
Import AI 119: How to benefit AI research in Africa; German politician calls for billions in spending to prevent country being left behind; and using deep learning to spot thefts 11-05
Cornell prof (but not the pizzagate guy!) has one quick trick to getting 1700 peer reviewed publications on your CV 11-04
“We are reluctant to engage in post hoc speculation about this unexpected result, but it does not clearly support our hypothesis” 11-03
Facial feedback: “These findings suggest that minute differences in the experimental protocol might lead to theoretically meaningful changes in the outcomes.” 11-01
Now use Pipe mode with CSV datasets for faster training on Amazon SageMaker built-in algorithms 11-01
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: November and Beyond 11-01
KDnuggets™ News 18:n41, Oct 31: Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn 10-31
Key Takeaways from AI Conference SF, Day 1: Domain Specific Architectures, Emerging China, AI Risks 10-29
Arnaub Chatterjee discusses artificial intelligence (AI) and machine learning (ML) in healthcare. 10-29
Top Stories, Oct 22-28: 9 Must-have skills you need to become a Data Scientist, updated; Named Entity Recognition and Classification with Scikit-Learn 10-29
Document worth reading: “Machine Learning for Wireless Networks with Artificial Intelligence: A Tutorial on Neural Networks” 10-28
Maps with pie charts on top of each administrative division: an example with Luxembourg’s elections data 10-27
Document worth reading: “Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks” 10-25
A study fails to replicate, but it continues to get referenced as if it had no problems. Communication channels are blocked. 10-24
Understanding Amazon SageMaker notebook instance networking configurations and advanced routing options 10-24
U. of Zurich: Assistant Professorship in AI and Machine Learning (Non-tenure Track) [Zurich, Switzerland] 10-24
ITWire: VIDEO Interview with a DataRobot: Greg Michaelson talks AI, banking, machine learning and more 10-24
KDnuggets™ News 18:n40, Oct 24: Graphs Are The Next Frontier In Data Science; Apache Spark Intro for Beginners 10-24
Google, Microsoft & Fraunhofer at the First European Edition of Deep Learning World – 12 Nov, 2018 10-23
What to think about this new study which says that you should limit your alcohol to 5 drinks a week? 10-23
Top Stories, Oct 15-21: Graphs Are The Next Frontier In Data Science; The Main Approaches to Natural Language Processing Tasks 10-22
Import AI: 117: Surveillance search engines; harvesting real-world road data with hovering drones; and improving language with unsupervised pre-training 10-22
An actual quote from a paper published in a medical journal: “The data, analytic methods, and study materials will not be made available to other researchers for purposes of reproducing the results or replicating the procedure.” 10-19
University of San Francisco: Assistant Professor, Tenure Track, Mathematics and Statistics [San Francisco, CA] 10-17
Top KDnuggets tweets, Oct 10-16: 7 Books to Grasp Mathematical Foundations of Data Science and Machine Learning; 6 Books Every Data Scientist Should Keep Nearby 10-17
KDnuggets™ News 18:n39, Oct 17: 10 Best Mobile Apps for Data Scientist; Vote in new poll: Largest dataset you analyzed? 10-17
Slides from my talk at the R-Ladies Meetup about Interpretable Deep Learning with R, Keras and LIME 10-17
GitHub Python Data Science Spotlight: High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy 10-16
Document worth reading: “A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress” 10-15
Document worth reading: “Vector and Matrix Optimal Mass Transport: Theory, Algorithm, and Applications” 10-14
He had a sudden cardiac arrest. How does this change the probability that he has a particular genetic condition? 10-14
Document worth reading: “Data Curation with Deep Learning [Vision]: Towards Self Driving Data Curation” 10-13
How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest? 10-13
Top KDnuggets tweets, Oct 3–9: 5 Reasons Logistic Regression should be the first thing you learn when becoming a Data Scientist 10-10
Amazon Comprehend introduces new Region availability and language support for French, German, Italian, and Portuguese 10-10
KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild 10-10
Top September Stories: Essential Math for Data Science: Why and How; Machine Learning Cheat Sheets 10-10
Document worth reading: “An Introduction to Inductive Statistical Inference — from Parameter Estimation to Decision-Making” 10-09
Semantic Interoperability: Are you training your AI by mixing data sources that look the same but aren’t? 10-09
Job: Postdoctoral Researcher in Small Data Deep Learning and Explainable Machine Learning, Livermore, CA 10-08
Document worth reading: “Big Data Systems Meet Machine Learning Challenges: Towards Big Data Science as a Service” 10-07
Sunday Morning Video (in French): Les travaux de Grothendieck sur les espaces de Banach, Gilles Pisier (Lectures grothendieckiennes) 10-07
Document worth reading: “An Analysis of Hierarchical Text Classification Using Word Embeddings” 10-06
Colorado State University: Assistant Professor in Industrial and Organizational (IO) Psychology [Fort Collins, CO] 10-05
Short Article Reveals the Undeniable Facts About College Essay Writing Service and How It Can Affect You 10-04
Amazon SageMaker Neural Topic Model now supports auxiliary vocabulary channel, new topic evaluation metrics, and training subsampling 10-04
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 2 10-04
Beyond text: How Spokata uses Amazon Polly to make news and information universally accessible as real-time audio 10-04
Why Almost Everything You’ve Learned About Cheap Custom Essay Is Wrong and What You Should Know 10-04
Top KDnuggets tweets, Sep 26 – Oct 2: Why building your own Deep Learning Computer is 10x cheaper than AWS; 6 Steps To Write Any Machine Learning Algorithm 10-03
KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R 10-03
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: October and Beyond 10-03
David Weakliem points out that both economic and cultural issues can be more or less “moralized.” 10-03
Import AI 114: Synthetic images take a big leap forward with BigGANs; US lawmakers call for national AI strategy; researchers probe language reasoning via HotpotQA 10-01
Document worth reading: “Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances” 10-01
What do you do when someone says, “The quote is, this is the exact quote”—and then misquotes you? 09-30
Document worth reading: “Importance of the Mathematical Foundations of Machine Learning Methods for Scientific and Engineering Applications” 09-30
Statistical Modeling, Causal Inference, and Social Science Regrets Its Decision to Hire Cannibal P-hacker as Writer-at-Large 09-29
(People are missing the point on Wansink, so) what’s the lesson we should be drawing from this story? 09-27
Segmenting brain tissue using Apache MXNet with Amazon SageMaker and AWS Greengrass ML Inference – Part 1 09-26
A potential big problem with placebo tests in econometrics: they’re subject to the “difference between significant and non-significant is not itself statistically significant” issue 09-26
Document worth reading: “Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges” 09-26
Job opening at CDC: “The Statistician will play a central role in guiding the statistical methods of all major projects of the Epidemiology and Prevention Branch of the CDC Influenza Division, and aid in designing, analyzing, and interpreting research intended to understand the burden of influenza in the US and internationally and identify the best influenza vaccines and vaccine strategies.” 09-26
Document worth reading: “Data Innovation for International Development: An overview of natural language processing for qualitative data analysis” 09-25
Document worth reading: “Do Deep Learning Models Have Too Many Parameters? An Information Theory Viewpoint” 09-22
A psychology researcher uses Stan, multiverse, and open data exploration to explore human memory 09-21
A couple more papers on genetic diversity as an explanation for why Africa and remote Andean countries are so poor while Europe and North America are so wealthy 09-19
Why, oh why, do so many people embrace the Pacific Garbage Cleanup nonsense? (I have a theory). 09-18
Document worth reading: “A practical tutorial on autoencoders for nonlinear feature fusion: Taxonomy, models, software and guidelines” 09-16
High-profile statistical errors occur in the physical sciences too, it’s not just a problem in social science. 09-15
✚ Google Dataset Search Impressions, the Challenges of Looking for Data, and Other Places to Find Data 09-13
Data Center Scale Computing and Artificial Intelligence with Matei Zaharia, Inventor of Apache Spark 09-12
Import AI 111: Hacking computers with Generative Adversarial Networks, Facebook trains world-class speech translation in 85 minutes via 128 GPUs, and Europeans use AI to classify 1,000-year-old graffiti. 09-10
Document worth reading: “Quantizing deep convolutional networks for efficient inference: A whitepaper” 09-10
Document worth reading: “Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development” 09-10
“It’s Always Sunny in Correlationville: Stories in Science,” or, Science should not be a game of Botticelli 09-08
No code chatbots: TIBCO uses Amazon Lex to put chat interfaces into the hands of business users 09-05
“We continuously increased the number of animals until statistical significance was reached to support our conclusions” . . . I think this is not so bad, actually! 09-04
Vulcan Post: This AI Startup Is Run By The World’s Top Data Scientists – Lets Anyone Build Predictive Models 09-04
John Hattie’s “Visible Learning”: How much should we trust this influential review of education research? 09-01
“Identification of and correction for publication bias,” and another discussion of how forking paths is not the same thing as file drawer 08-31
Document worth reading: “PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison” 08-31
Document worth reading: “Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study” 08-29
Document worth reading: “A Comparative Study on using Principle Component Analysis with Different Text Classifiers” 08-29
“To get started, I suggest coming up with a simple but reasonable model for missingness, then simulate fake complete data followed by a fake missingness pattern, and check that you can recover your missing-data model and your complete data model in that fake-data situation. You can then proceed from there. But if you can’t even do it with fake data, you’re sunk.” 08-27
Document worth reading: “Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications” 08-25
Document worth reading: “Fog Computing: Survey of Trends, Architectures, Requirements, and Research Directions” 08-22
Import AI: 108: Learning language with fake sentences, Chinese researchers use RL to train prototype warehouse robots; and what the implications are of scaled-up Neural Architecture Search 08-20
Three informal case studies: (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs 08-17
Monitoring the media reaction to Facebook’s disastrous earnings call – News API Monthly Media Review 08-16
Build an automatic alert system to easily moderate content at scale with Amazon Rekognition Video 08-15
Announcing the Artificial Intelligence (AI) Hackathon: Build Intelligent Applications using machine learning APIs and serverless 08-15
Aella Credit empowers underbanked individuals by using Amazon Rekognition for identity verification 08-15
Document worth reading: “Does putting your emotions into words make you feel better? Measuring the minute-scale dynamics of emotions from online data” 08-14
Document worth reading: “Weighted Abstract Dialectical Frameworks: Extended and Revised Report” 08-13
Document worth reading: “Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution” 08-11
Document worth reading: “Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems” 08-10
Amazon Rekognition is now available in the Asia Pacific (Seoul) and Asia Pacific (Mumbai) Regions 08-09
✚ Detailed Intentions of a Map, When Everything Leads to Nothing, Designing for Misinterpretations 08-09
Document worth reading: “Examining the Use of Neural Networks for Feature Extraction: A Comparative Analysis using Deep Learning, Support Vector Machines, and K-Nearest Neighbor Classifiers” 08-08
Document worth reading: “A Temporal Difference Reinforcement Learning Theory of Emotion: unifying emotion, cognition and adaptive behavior” 08-07
“The most important aspect of a statistical analysis is not what you do with the data, it’s what data you use” (survey adjustment edition) 08-07
Response to Rafa: Why I don’t think ROC [receiver operating characteristic] works as a model for science 08-05
Document worth reading: “Attend Before you Act: Leveraging human visual attention for continual learning” 08-03
Transfer learning for custom labels using a TensorFlow container and “bring your own algorithm” in Amazon SageMaker 07-27
AWS Deep Learning AMIs now include ONNX, enabling model portability across deep learning frameworks 07-26
AWS Deep Learning AMIs now with optimized TensorFlow 1.9 and Apache MXNet 1.2 with Keras 2 support to accelerate deep learning on Amazon EC2 instances 07-23
“The idea of replication is central not just to scientific practice but also to formal statistics . . . Frequentist statistics relies on the reference set of repeated experiments, and Bayesian statistics relies on the prior distribution which represents the population of effects.” 07-19
“For professional baseball players, faster hand-eye coordination linked to batting performance” 07-18
Model Updates: Entity-level Sentiment Analysis and Brand New Entity Extraction Models Now Live in the Text Analysis API 07-17
The statistical checklist: Could there be a list of guidelines to help analysts do better work? 07-17
Using Siamese Networks and Pre-Trained Convolutional Neural Networks (CNNs) for Fashion Similarity Matching 07-10
Joint inference or modular inference? Pierre Jacob, Lawrence Murray, Chris Holmes, Christian Robert discuss conditions on the strength and weaknesses of these choices 07-09
He wants to know what to read and what software to learn, to increase his ability to think about quantitative methods in social science 07-07
Tutorial: The practical application of complicated statistical methods to fill up the scientific literature with confusing and irrelevant analyses 07-05
PNAS forgets basic principles of game theory, thus dooming thousands of Bothans to the fate of Alderaan 07-04
About that claim in the NYT that the immigration issue helped Hillary Clinton? The numbers don’t seem to add up. 07-03
Predicting World Cup dark horses from press coverage using the AYLIEN News API – Monthly Media Roundup 06-15
Estimating mortality rates in Puerto Rico after hurricane María using newly released official death counts 06-08
“If you deprive the robot of your intuition about cause and effect, you’re never going to communicate meaningfully.” – Pearl ’18 06-08
Data Science in 30 Minutes: Deep Learning to Detect Fake News with Uber ATG Head of Data Science, Mike Tamir 05-30
How to Use FPGAs for Deep Learning Inference to Perform Land Cover Mapping on Terabytes of Aerial Images 05-29
Cambridge Analytica, Facebook, and user data – Monthly Media Review with the AYLIEN News API, April 05-03
Using Natural Language Processing to Combat Filter Bubbles and Fake News – 360° Stance Detection 04-24
Circle circumference in the hyperbolic plane is exponential in the radius: proof by computer game 04-10
Understanding deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras 11-13
Why Indian companies should take on different projects than competing Valley companies - an application of Cobb-Douglas 11-07
How to Build Your Own Blockchain Part 1 — Creating, Storing, Syncing, Displaying, Mining, and Proving Work 10-17
Create conda recipe to use C extended Python library on PySpark cluster with Cloudera Data Science Workbench 05-15
Two papers released on arXiv, "Operator Variational Inference" and "Model Criticism for Bayesian Causal Inference" 10-30
Discussion of "Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing" 09-19
Bayesian Deep Learning Part II: Bridging PyMC3 and Lasagne to build a Hierarchical Neural Network 07-05
Second Annual Data Science Bowl – Part 3 – Automatically Finding the Heart Location in an MRI Image 03-08
Thinking is not something that goes on entirely, or even mostly, inside people’s heads. Little... 01-30
10 ways you might be able to tell when an area of research is undergoing rapid expansion and society's expectations may be somewhat unrealistic ... 12-13
Long Short-Term Memory dramatically improves Google Voice etc – now available to a billion users 09-30