AI, Machine Learning and Data Science Roundup: October 2018

A monthly roundup of news about Artificial Intelligence, Machine Learning and Data Science. This is an eclectic collection of interesting blog posts, software announcements and data applications from Microsoft and elsewhere that I’ve noted over the past month or so.

Open Source AI, ML & Data Science News

PyTorch 1.0 is now in preview, and brings a streamlined workflow from model development to production. Relatedly, Sam Charrington compares the growing PyTorch ecosystem with that of TensorFlow.

TensorFlow 1.11 has been released, with binaries for cuDNN 7.2 and TensorRT 4, and new functions for querying kernels.

fastai, a new open library for deep learning built on PyTorch, has been released by fast.ai. It provides a single, consistent API for many deep learning applications and data types.

The R packages keras and tensorflow now allow definition of models that take advantage of eager execution.

Anaconda can now install and manage TensorFlow as a conda package.

Industry News

PyXLL, a commercial Excel add-in for calling Python functions from Excel.

IBM releases AI Fairness 360, an open source toolkit for investigating and mitigating bias in machine learning models.

Google open sources a TensorFlow package for Active Question Answering, a reinforcement learning based method to train artificial agents to answer natural-language questions.

Google Cloud Platform adds PyTorch support to several services, including TensorBoard, Kubeflow and Deep Learning VM images.

RStudio Package Manager, a commercial product to help organizations manage R packages, has been released.

RStudio adds support for Stan, the Bayesian modeling language.

Databricks releases MLflow 0.7.0, featuring a new R client API that allows you to log parameters, code versions, metrics, and output files when running R code and visualize the results in MLflow.

Cloudera and Hortonworks are merging. Thomas Dinsmore assesses the impact on the machine learning ecosystem.

Dataiku 5 has been released. The enterprise data platform adds support for containerized R and Python recipes, and integration with TensorFlow and Keras.

Microsoft News

Microsoft has opened a new AI research facility in Shanghai in conjunction with INESA, a Chinese organization developing smart city solutions.

SQL Server 2019 will provide integration with “big data clusters”: external SQL Server, Spark or HDFS containers managed in Kubernetes. For AI and ML workloads, you can run Spark or SQL Server’s R, Python and Java extensions against the cluster data.

Azure Machine Learning Services now provides Python developer libraries for data preparation, experiment tracking and model management, training on GPU clusters, automated model search and hyperparameter optimization, deploying trained models as containers, and many other new capabilities.

Visual Studio Code Tools for AI has been updated to provide a convenient interface to Azure Machine Learning for users of the popular open-source editor.

Speech Service in Azure Cognitive Services is now generally available, and includes a new neural text-to-speech capability for humanlike synthesized speech.

Azure Databricks is now supported in more regions, offers GPU support for deep learning, and Databricks Delta is now available (in preview) for transactional data capabilities.

Microsoft Bot Framework SDK v4 is now available. The Ignite presentation “Creating Enterprise-Scale Intelligent Agents and Bots” provides an overview and several examples.

Cortana Skills Kit for Enterprise, a development platform based on Azure Bot Service and Language Understanding, is now in private preview.

ONNX Runtime, a high-performance engine for executing trained models represented in the open ONNX format, is now in preview.

PyTorch is now supported in many Azure services, including Azure Machine Learning service, Data Science Virtual Machine, Azure Notebooks, and Visual Studio Code Tools for AI.

The Microsoft Infer.NET machine learning framework has been released as open source. An online book, Model-Based Machine Learning, describes its probabilistic approach with several in-depth examples.

Learning Resources

Microsoft Learn, which provides free interactive training for the Azure platform, is now available. Modules include Deep Learning with PyTorch and Computer Vision with TensorFlow.

A tutorial on converting a PyTorch model to ONNX, the cross-platform model-sharing format.

Chromebook Data Science, a free course in Data Science and R from the Johns Hopkins Data Science lab that requires only a web browser.

Applications

The Ethics Certification Program for Autonomous and Intelligent Systems, a new IEEE standard with the goal of advancing transparency and reducing algorithmic bias in AI systems.

The fast.ai research datasets collection, which includes MNIST, CIFAR-10 and ImageNet, is now available on AWS Open Data.

A comprehensive list of AI Ethics resources, published by fast.ai.

Snip Insights, a Microsoft AI Lab project for image analysis and text extraction from screenshots.

Azure Healthcare AI blueprint, a process to deploy a HIPAA- and HITRUST-compliant environment in Azure for managing and running healthcare AI experiments.

Find previous editions of the monthly AI roundup here.