Distilled News

Most AI Explainability Is Snake Oil. Ours Isn’t And Here’s Why.

Advanced machine learning (ML) is a subset of AI that uses more data and sophisticated math to make better predictions and decisions. Banks and lenders could make a lot more money using ML-powered credit scoring instead of the legacy methods in use today. But adoption of ML has been held back by the technology’s ‘black-box’ nature: you can see the model’s results but not how it arrived at them. You can’t run a credit model safely or accurately if you can’t explain its decisions, especially for a regulated use case such as credit underwriting.

How to deploy a predictive service to Kubernetes with R and the AzureContainers package

It’s easy to create a function in R, but what if you want to call that function from a different application, at a scale that supports a large number of simultaneous requests? This article shows how you can deploy a fitted R model as a Plumber web service on Kubernetes, using Azure Container Registry (ACR) and Azure Kubernetes Service (AKS). We use the AzureContainers package to create the necessary resources and deploy the service.
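The article itself is R end to end (Plumber for the web service, AzureContainers for ACR and AKS). Purely as an illustration of the underlying pattern of putting a fitted model behind an HTTP scoring endpoint, here is a minimal Python sketch; Flask, the /predict route, and the toy model are illustrative stand-ins, not part of the article’s workflow.

```python
# Minimal sketch of the "fitted model behind a web endpoint" pattern that the
# article implements with R + Plumber. Flask and the /predict route are
# illustrative stand-ins, not part of the AzureContainers workflow.
from flask import Flask, request, jsonify
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

app = Flask(__name__)

# Fit a toy model at startup; in practice you would load a serialized model.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [[5.1, 3.5, 1.4, 0.2]]}
    features = request.get_json()["features"]
    return jsonify(predictions=model.predict(features).tolist())

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```

Containerizing a service like this and pushing the image to a registry (ACR in the article’s case) is what makes it deployable to a Kubernetes cluster at scale.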

Visualizing Hurricane Data with Shiny

Around the time that I was selecting a topic for this project, my parents and my hometown found themselves in the path of a Category 1 hurricane. Thankfully, everyone was OK, and there was only minor damage to their property. But this event made me think about how long it had been since my hometown was last in the path of a Category 1 hurricane. I also wanted to study trends in hurricane intensity over time to see whether they support the popular impression that storms have grown stronger over the past few years.

Network Centrality in R: New ways of measuring Centrality

This is the third post in a series on the concept of ‘network centrality’ with applications in R and the package netrankr. The previous part introduced the concept of neighborhood-inclusion and its implications for centrality. In this post, we extend the concept to a broader class of dominance relations by deconstructing indices into a series of building blocks and introducing new ways of evaluating centrality.
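The series itself is built around netrankr in R. Purely as an illustration of why such a framework is worth having, the Python sketch below (networkx is an assumption, not something the post uses) computes a few standard indices on the same small graph and shows that they need not agree on the most central node, which is the kind of ambiguity that motivates looking at centrality more systematically.

```python
# Illustrative only: standard centrality indices often disagree on the
# "most central" node, which motivates frameworks (like netrankr in R)
# that treat indices as combinations of common building blocks.
import networkx as nx

G = nx.krackhardt_kite_graph()  # a classic example where indices disagree

indices = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
}

for name, scores in indices.items():
    top = max(scores, key=scores.get)
    print(f"{name:12s} top node: {top}")
```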

Named Entity Recognition (NER) With Keras and Tensorflow – Meeting Industry’s Requirement by Applying State-of-the-art Deep Learning Methods

A few years ago, when I was working as a software engineering intern at a startup, I saw a new feature in a job-posting web app. The app was able to recognize and parse important information from resumes, such as email addresses, phone numbers, and degree titles. I started discussing possible approaches with our team, and we decided to build a rule-based parser in Python to parse the different sections of a resume. After spending some time developing the parser, we realized that the answer might not be a rule-based tool. We started googling how it’s done and came across the term Natural Language Processing (NLP) and, more specifically, Named Entity Recognition (NER) in the context of Machine Learning.
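The article builds a full pipeline in Keras/TensorFlow. As a minimal sketch of the core idea, a sequence-labeling model that predicts an entity tag for every token might look like the following; the vocabulary size, tag count, and sequence length are placeholder values, not the article’s.

```python
# Minimal sketch of a sequence-labeling (NER-style) model in Keras.
# vocab_size, num_tags and max_len are illustrative placeholders; a real
# pipeline also needs tokenization, tag encoding and padded training data.
import tensorflow as tf

vocab_size, num_tags, max_len = 10000, 9, 75

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(max_len,)),
    tf.keras.layers.Embedding(vocab_size, 64, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(num_tags, activation="softmax")),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```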

The Importance of Being Recurrent for Modeling Hierarchical Structure

Recurrent Neural Networks (RNNs), such as Long Short-Term Memory networks (LSTMs), currently have performance limitations, while newer methods such as Fully Attentional Networks (FANs) show potential for replacing LSTMs without those same limitations. The authors therefore set out to compare the two approaches using standardized methods, and found that LSTMs universally surpass FANs in prediction accuracy when applied to the hierarchical structure of language.

Supervised Machine Learning: Classification

In supervised learning, algorithms learn from labeled data. After learning the patterns in that data, the algorithm determines which label should be assigned to new, unlabeled data by matching it against the patterns it has learned.
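A toy illustration of that workflow in Python with scikit-learn (the dataset and classifier are arbitrary choices for the example):

```python
# Toy supervised classification: learn from labeled data, then assign
# labels to unseen data. Dataset and classifier are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # learn from labels
y_pred = clf.predict(X_test)                                        # label new data
print("accuracy:", accuracy_score(y_test, y_pred))
```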

Customer Analysis with Network Science

Over the past decade or two, Americans have continued to prefer payment methods that are traceable, providing retailers and vendors with a rich source of data on their customers. This data is used by data scientists to help businesses make more informed decisions with respect to inventory, marketing, and supply chain, to name a few. There are several tools and techniques for performing customer segmentation, and network analysis can be a powerful one.
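As a tiny, made-up illustration of the network-analysis angle: one common approach is to link customers who purchase the same products and then look for communities, i.e. candidate segments, in the resulting graph. Everything in this sketch (names, purchases, the networkx calls) is illustrative rather than taken from the article.

```python
# Hypothetical sketch: build a bipartite customer-product graph from
# transaction pairs, project it onto customers, and detect communities
# (candidate customer segments). All names and purchases are made up.
import networkx as nx
from networkx.algorithms import bipartite, community

purchases = [("alice", "coffee"), ("alice", "milk"),
             ("bob", "coffee"), ("bob", "beer"),
             ("carol", "milk"), ("carol", "bread"),
             ("dave", "beer"), ("dave", "chips")]

customers = {c for c, _ in purchases}
B = nx.Graph(purchases)  # bipartite customer-product graph

# Connect customers who share at least one purchased product.
C = bipartite.weighted_projected_graph(B, customers)

segments = community.greedy_modularity_communities(C)
for i, seg in enumerate(segments):
    print(f"segment {i}: {sorted(seg)}")
```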

AWS Architecture For Your Machine Learning Solutions

One of the regular challenges I face while designing enterprise-grade solutions for our client companies is the lack of online references showing real-world architectural use cases. You will find tons of tutorials on how to get started with individual technologies, and these are great when your focus is limited to that particular framework or service. But in order to evaluate the broad spectrum of what is available out there, and to anticipate the implications of bundling several of these together, you either have to hunt down someone who has been down that road before, or venture into independent experimentation yourself. That’s why I decided to start a series sharing some of my own insights gathered while designing and developing technical solutions for multiple Fortune 200 companies and emerging startups. And hopefully, today’s use case will help you plan the AWS architecture for your Machine Learning solutions.

AI: the silver bullet to stop Technical Debt from sucking you dry

It’s Friday evening in the Bahamas. You’re relaxing under a striped red umbrella with a succulent glass of wine and your favorite book; it’s a great read, and you love the way the ocean breeze moves the pages like leaves on a tree. As the sun descends, your eyes follow, your consciousness drifting with the waves, closer to the horizon, closer to a soft, lulling sleep, closer to a perfect evening in a perfect world.

5 Machine Learning Resolutions for 2019

More organizations are using machine learning for competitive reasons, but their results are mixed. It turns out there are better — and worse — ways of approaching it. If you want to improve the outcome of your efforts in 2019, consider these points:
• Start with an appropriate scope
• Approach machine learning holistically
• Make the connection between data and machine learning
• Don’t expect too much ‘out of the box’
• Don’t forget infrastructural requirements

XGBoost is not black magic

Nowadays it is quite easy to get decent results on data science tasks: a general understanding of the process, a basic knowledge of Python, and ten minutes of your time are enough to instantiate XGBoost and fit the model. OK, if it’s your first time, you’ll probably spend a couple of minutes collecting the required packages via pip, but that’s it. The only problem with this approach is that it works pretty well: a couple of years ago I placed in the top 5 of a university competition just by feeding the dataset to XGBoost with some basic feature engineering, outperforming groups presenting very complex architectures and data pipelines. One of the coolest characteristics of XGBoost is how it deals with missing values, deciding for each sample the best way to handle them. This feature has been super useful for a lot of the projects and datasets I have run into over the last few months; so, to be more deserving of the Data Scientist title written under my name, I decided to dig a little deeper, taking a couple of hours to read the original paper and trying to understand what XGBoost is actually about and how it handles missing values in the seemingly magical way it does.
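For the curious, the mechanism the paper describes (sparsity-aware split finding) means you can hand XGBoost data containing NaNs directly: at each split it learns a default direction for rows whose value is missing, so no up-front imputation is required. A minimal illustration, with made-up toy data:

```python
# XGBoost accepts NaNs directly: its sparsity-aware split finding learns,
# at each split, a default direction for rows with missing values.
# The toy data below is made up purely for illustration.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Knock out ~20% of the entries to simulate missing values.
mask = rng.random(X.shape) < 0.2
X[mask] = np.nan

model = XGBClassifier(n_estimators=50, max_depth=3)
model.fit(X, y)                      # no imputation step needed
print(model.predict(X[:5]))
```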

Which Model and How Much Data?

Building deep learning applications in the real world is a never-ending process of selecting and refining the right elements of a specific solution. Among those elements, the selection of the correct model and the right structure of the training dataset are arguably the two most important decisions that data scientists need to make when architecting deep learning solutions. How do we decide which deep learning model to use for a specific problem? How do we know whether we are using the right training dataset or whether we should gather more data? Those questions are the common denominator across all stages of the lifecycle of a deep learning application. Even though there is no magic answer to them, there are several ideas that can guide your decision-making process. Let’s start with the selection of the correct deep learning model.
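One common, though partial, diagnostic for the ‘should we gather more data?’ question is a learning curve: if validation performance is still improving as the training set grows, more data is likely to help. A sketch with scikit-learn (the model and dataset are placeholders; the same idea carries over to deep learning frameworks):

```python
# Learning-curve sketch: score vs. training-set size helps judge whether
# gathering more data is likely to help. Model and dataset are placeholders.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_digits(return_X_y=True)
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=2000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:4d}  train={tr:.3f}  validation={va:.3f}")
# If the validation score is still rising at the largest size, more data may help.
```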
