AI, Machine Learning and Data Science Roundup: September 2018

A monthly roundup of news about Artificial Intelligence, Machine Learning and Data Science. This is an eclectic collection of interesting blog posts, software announcements and data applications from Microsoft and elsewhere that I’ve noted over the past month or so.

Open Source AI, ML & Data Science News

ONNX 1.3 released. The standard for representing predictive models adds a cross-platform API for executing graphs on optimized backends and hardware.

A new foundation supporting the development of scikit-learn, the Python machine learning library.

Google open sources “Dopamine”, a TensorFlow-based framework for reinforcement learning research.

Google adds a “What-If Tool” to TensorBoard, to interactively explore the robustness and algorithmic fairness of machine learning models.

Industry News

Google introduces Dataset Search, an index of public datasets from the environmental and social sciences, government, and journalism.

The 2018.3 update to the Alteryx analytics platform brings interactive graphics, and Spark and Jupyter Notebook integration.

AWS Deep Learning AMIs now provide TensorFlow 1.10 and PyTorch with CUDA 9.2 support.

Rigetti Computing launches Quantum Cloud Services, with a $1M prize for the first application demonstrating “quantum advantage”.

Thomas Dinsmore’s commentary on Forrester Wave rankings for “Multimodal Predictive Analytics and Machine Learning Platforms” and “Notebook-Based Predictive Analytics and Machine Learning Solutions”.

Microsoft News

Microsoft acquires Lobe, a Bay Area startup that produced a drag-and-drop interface for building machine learning models.

NVIDIA GPU Cloud now provides ready-to-run containers with GPU-enabled deep learning frameworks for use on Azure.

Microsoft introduces Azure CycleCloud, a tool for creating, managing, operating, and optimizing burst, hybrid, and cloud-only HPC clusters. Schedulers including Slurm, Grid Engine and Condor are supported.

New features in Azure HDInsight: Spark 2.3.0 and Kafka 1.1 support; ML Services 9.3 integration, with updated R engine and new statistical and machine learning algorithms; Apache Phoenix, for SQL-like queries for data in HBase; Apache Zeppelin, web-based notebooks for querying Phoenix tables; and more.

Learning resources

A review of the algorithms behind AutoML systems for model selection and hyperparameter optimization, from the H2O blog.

Joel Grus’s criticisms of Jupyter Notebooks as a platform for reproducible and production-ready computing. Yihui Xie offers an alternative: R Markdown.

A comprehensive introduction to mixed-effects models, and fitting them in Python.

A collection of videos, presentations and essays by Brandon Rohrer with approachable explanations of the inner workings of deep learning and machine learning algorithms.

A tutorial on using Azure Batch AI to parallelize forecasts of energy demand.

A meticulously researched accounting of the resources (natural, technological, and human) that enable an Alexa voice query, presented as a highly detailed map.

A review of the book SQL Server 2017 Machine Learning Services with R.

Applications

Data scientists use text similarity analyses in R to try to identify the author of that anonymous NYT op-ed.

Sketch2Code, an application to translate hand drawings into HTML forms.

Measuring building footprints from satellite images, with semantic segmentation.

Near real-time fraud detection for mobile banking, with a classification model implemented in Azure Machine Learning.

Credit card fraud detection with an autoencoder neural network, using the Azure Data Science VM.

Shell deploys machine learning and AI systems to avert equipment failures, autonomously direct drill-bits underground, and improve safety.

Find previous editions of the monthly AI roundup here