Distilled News

Causal inference and Bayesian network structure learning from nominal data

This study investigates a discrete causal method for nominal data (DCMND), an important problem in causal inference. In our paper, the method is used to learn a causal Bayesian network that reflects the interconnections between variables. This article also proposes a Bayesian network construction algorithm based on discrete causal inference (BDCI) and an extended BDCI Bayesian network construction algorithm based on DCMND. Furthermore, the paper studies alarm data from a mobile communication system in practice. The results suggest that the decision criterion based on our method is effective in causal inference and that the Bayesian network constructed by our method has better classification accuracy than other methods.

LeNet-5 – A Classic CNN Architecture

Yann LeCun, Leon Bottou, Yoshua Bengio and Patrick Haffner proposed a neural network architecture for handwritten and machine-printed character recognition in the 1990s, which they called LeNet-5. The architecture is straightforward and simple to understand, which is why it is often used as a first step in teaching convolutional neural networks.
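
To make the architecture concrete, the sketch below traces the feature-map sizes through the convolution and subsampling layers of LeNet-5 as it is usually described (32x32 input, 5x5 convolutions, 2x2 subsampling). It is only shape arithmetic in plain Python, not a trainable model, and the helper names `conv_out` and `lenet5_shapes` are made up for this illustration.

```python
# Shape arithmetic for the classic LeNet-5 layer stack (no framework needed).
# Helper names are hypothetical; layer sizes follow the usual description
# of LeNet-5: 32x32 input, 5x5 convolutions, 2x2 subsampling with stride 2.

def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet5_shapes(input_size=32):
    """Return (layer, channels, spatial size) for each feature-map layer."""
    layers = [
        ("C1", 6, 5, 1),    # convolution, 6 maps, 5x5 kernel
        ("S2", 6, 2, 2),    # subsampling, 2x2 window, stride 2
        ("C3", 16, 5, 1),   # convolution, 16 maps, 5x5 kernel
        ("S4", 16, 2, 2),   # subsampling, 2x2 window, stride 2
        ("C5", 120, 5, 1),  # convolution that collapses each map to 1x1
    ]
    shapes = []
    size = input_size
    for name, channels, kernel, stride in layers:
        size = conv_out(size, kernel, stride)
        shapes.append((name, channels, size))
    return shapes


if __name__ == "__main__":
    for name, channels, size in lenet5_shapes():
        print(f"{name}: {channels} maps of {size}x{size}")
    # C5 is followed by a fully connected layer of 84 units (F6)
    # and a 10-way output layer, one unit per digit class.
```

Running the script prints one line per layer, which is a quick way to check a framework implementation against the expected sizes.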

Agile Framework for Analytics: a Decalogue

You are standing on a moss-covered rock a short distance from a high waterfall that pours into a deep pond before flowing further downstream. The creek is surrounded by tall trees that provide a high canopy and shade. The air is cool and a gentle breeze blows through the trunks of the trees around you. Your body is agile. Your weight is shifted to one leg. The entire sole of your foot remains in contact with the mossy rock. Your right knee is bent and your right foot placed on your left inner thigh. Your hips are open, with your right knee pointing towards the right. With the toes of your right foot pointing down, your left foot, the centre of your pelvis, your shoulders and your head are all vertically aligned. Your hands are held above your head pointing directly upwards and clasped together, completing the vriksasana pose. You feel a faint mist of water from the waterfall hitting your face and arms and you start meditating about the ten commandments of the Agile Framework for Analytics Teams.

Human-competitive Patches in Automatic Program Repair with Repairnator

Repairnator is a bot. It constantly monitors software bugs discovered during continuous integration of open-source software and tries to fix them automatically. If it succeeds in synthesizing a valid patch, Repairnator proposes the patch to the human developers, disguised under a fake human identity. To date, Repairnator has been able to produce 5 patches that were accepted by the human developers and permanently merged into the code base. This is a milestone for human-competitiveness in software engineering research on automatic program repair. In this post, we tell the story of this research done at KTH Royal Institute of Technology, Inria, the University of Lille and the University of Valenciennes.

Camelot: PDF Table Extraction for Humans

Camelot is a Python library that makes it easy for anyone to extract tables from PDF files!

Probability Distributions in Python

In this tutorial, you’ll learn about commonly used probability distributions in machine learning literature.
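
As a small taste of the topic, here is a minimal sketch of two commonly used distributions, the binomial and the normal. It uses only the Python standard library, which is an assumption made for this example; the tutorial itself may rely on other packages. The function name `binomial_pmf` is made up for the illustration.

```python
from math import comb
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

if __name__ == "__main__":
    # Chance of exactly 2 heads in 4 fair coin flips.
    print(binomial_pmf(2, 4, 0.5))
    # Probability that a standard normal value falls below 0.
    print(standard_normal.cdf(0.0))
    # Density of the standard normal at its mean.
    print(standard_normal.pdf(0.0))
```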

Autocorrelation in R

Practice autocorrelation in R by using course material from DataCamp’s Introduction to Time Series Analysis course.
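
For readers who want to see what is being computed, the following is a minimal sketch of the usual sample autocorrelation estimator at a single lag. It is written in plain Python rather than R, and the function name `autocorrelation` is chosen for this example only; it is not taken from the course.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence of numbers at a given lag.

    Uses the common estimator: the sum of products of mean-centred values
    `lag` steps apart, divided by the total sum of squared deviations.
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator


if __name__ == "__main__":
    trend = [1, 2, 3, 4, 5]
    for lag in range(3):
        print(lag, autocorrelation(trend, lag))
```

A steadily trending series gives positive autocorrelation at short lags, while a series that alternates in sign gives a negative value at lag 1.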

Basic Programming Skills in R

Practice basic programming skills in R by using course material from DataCamp’s free Model a Quantitative Trading Strategy in R course.

What to Do With RPA

RPA (Robotic Process Automation) is hot! This term – which refers to small ‘bots’ that automate manual tasks – has ranked between #5 and #7 in searches on Gartner.com over the summer. To satisfy the demand for information on RPA, Gartner has published many works on the subject, including a market guide and several deployment best practice notes. Our published work, however, cautions clients not to use RPA as an alternative when more robust, configured enterprise applications are available. The notion behind this advice is that RPA is inferior when compared with configured, enterprise-class solutions purpose-built for a particular process. Let me give you an example. One could create an RPA ‘bot’ to transcribe invoice data into an accounts payable system. Or the accounts payable department could deploy a modern e-invoicing system or B2B network solution that does the same thing. Neither approach is ‘correct.’ But I think all of us at Gartner for technology leaders could agree that the enterprise application is the more architecturally clean and flexible way of doing things.

Visualizations for credit modeling in R

Visualization is a great way to get an overview of credit modeling. Typically you will start with data management and data cleaning, and only then begin the credit modeling analysis with visualizations. This article is therefore the first part of a credit machine learning analysis, covering visualizations. The second part of the analysis will typically use logistic regression and ROC curves.

Machine Learning Confronts the Elephant in the Room

Score one for the human brain. In a new study, computer scientists found that artificial intelligence systems fail a vision test a child could accomplish with ease. ‘It’s a clever and important study that reminds us that ‘deep learning’ isn’t really that deep,’ said Gary Marcus, a neuroscientist at New York University who was not affiliated with the work. The result takes place in the field of computer vision, where artificial intelligence systems attempt to detect and categorize objects. They might try to find all the pedestrians in a street scene, or just distinguish a bird from a bicycle (which is a notoriously difficult task). The stakes are high: As computers take over critical tasks like automated surveillance and autonomous driving, we’ll want their visual processing to be at least as good as the human eyes they’re replacing.

Building Machine Learning Model From Unstructured Data

You might be familiar with structured data; it is everywhere. Here I would like to focus on how we transform unstructured data into something a machine can process and then use for inference.

Mastering The New Generation of Gradient Boosting

Gradient Boosted Decision Trees and Random Forest are my favorite ML models for tabular heterogeneous datasets. These models are the top performers in Kaggle competitions and are in widespread use in industry. CatBoost, the new kid on the block, has been around for a little more than a year now, and it is already threatening XGBoost, LightGBM and H2O.

From Exploration to Production – Bridging the Deployment Gap for Deep Learning (Part 2)

This is the second part of a series of two blog posts on deep learning model exploration, translation, and deployment. Both parts involve many technologies, such as PyTorch, TensorFlow, TensorFlow Serving, Docker, ONNX, NNEF, GraphPipe, and Flask. We will orchestrate these technologies to solve the task of image classification using the more challenging and less popular EMNIST dataset. In the first part, we introduced EMNIST, developed and trained models with PyTorch, translated them using the Open Neural Network eXchange format (ONNX) and served them through GraphPipe. This part concludes the series by adding two further approaches to model deployment: TensorFlow Serving and Docker, as well as a rather hobbyist approach in which we build a simple web application that serves our model. Both deployments will offer a REST API to call for predictions. You will find all the related source code on GitHub. If you would like to start from the very beginning, find the first part here on Towards Data Science.

From Exploration to Production – Bridging the Deployment Gap for Deep Learning (Part 1)

This is the first part of a series of two blog posts on deep learning model exploration, translation, and deployment. Both parts involve many technologies, such as PyTorch, TensorFlow, TensorFlow Serving, Docker, ONNX, NNEF, GraphPipe, and Flask. We will orchestrate these technologies to solve the task of image classification using the more challenging and less popular EMNIST dataset. The first part introduces EMNIST; we develop and train models with PyTorch, translate them with the Open Neural Network eXchange format (ONNX) and serve them through GraphPipe. Part two covers TensorFlow Serving and Docker, as well as a rather hobbyist approach in which we build a simple web application that serves our model. You can find all the related source code on GitHub.

Bayesian Analysis & The Replication Crisis (A Layperson’s Perspective)

The Bayesian approach to statistical analysis has been gaining popularity in recent years, in the wake of the Replication Crisis and with the help of greater computational power. While many of us have heard that it is an alternative to the Frequentist approach that most people are familiar with, not many truly understand what it does and how to use it. This post hopes to simplify the core concepts of Bayesian analysis and briefly explain why it was proposed as a solution to the Replication Crisis.

Machine Learning Types and Algorithms

The different types of machine learning and their algorithms, along with when and where they are used.

Why Machine Learning for Achieving Artificial Intelligence? “The Need for Machine Learning”

Even though AI can be achieved in many ways, why does machine learning have an edge over the others? (A must-read for beginners.) Why is it called machine learning?

A Framework for Approaching Textual Data Science Tasks

Although NLP and text mining are not the same thing, they are closely related, deal with the same raw data type, and have some crossover in their uses. Let’s discuss the steps in approaching these types of tasks.
