What's new on arXiv

Dataset: Rare Event Classification in Multivariate Time Series

A real-world dataset is provided from a pulp-and-paper manufacturing industry. The dataset comes from a multivariate time series process. The data contains a rare event of paper break that commonly occurs in the industry. The data contains sensor readings at regular time-intervals (x’s) and the event label (y). The primary purpose of the data is thought to be building a classification model for early prediction of the rare event. However, it can also be used for multivariate time series data exploration and building other supervised and unsupervised models.

A Short Survey of Topological Data Analysis in Time Series and Systems Analysis

Topological Data Analysis (TDA) is a collection of mathematical tools that capture the structure of shapes in data. Despite existing work in computational topology and computational geometry, the use of TDA in time series and signal processing is relatively new. In some recent contributions, TDA has been used as an alternative to conventional signal processing methods. Specifically, TDA has been considered for dealing with noisy signals and time series. In these applications, TDA is used to find the shapes in data as the main properties, while the other properties are assumed to be much less informative. In this paper, we review recent developments and contributions where topological data analysis, especially persistent homology, has been applied to time series analysis, dynamical systems and signal processing. We cover problem statements such as stability determination, risk analysis, systems behaviour, and predicting critical transitions in financial markets.

Estimation of Personalized Effects Associated With Causal Pathways

The goal of personalized decision making is to map a unit’s characteristics to an action tailored to maximize the expected outcome for that unit. Obtaining high-quality mappings of this type is the goal of the dynamic regime literature. In healthcare settings, optimizing policies with respect to a particular causal pathway may be of interest as well. For example, we may wish to maximize the chemical effect of a drug given data from an observational study where the chemical effect of the drug on the outcome is entangled with the indirect effect mediated by differential adherence. In such cases, we may wish to optimize the direct effect of a drug, while keeping the indirect effect to that of some reference treatment. [16] shows how to combine mediation analysis and dynamic treatment regime ideas to define policies associated with causal pathways and counterfactual responses to these policies. In this paper, we derive a variety of methods for learning high-quality policies of this type from data, in a causal model corresponding to a longitudinal setting of practical importance. We illustrate our methods via a dataset of HIV patients undergoing therapy, gathered in the Nigerian PEPFAR program.

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Outlier detection is an important topic in machine learning and has been used in a wide range of applications. In this paper, we approach outlier detection as a binary-classification issue by sampling potential outliers from a uniform reference distribution. However, due to the sparsity of data in high-dimensional space, a limited number of potential outliers may fail to provide sufficient information to assist the classifier in describing a boundary that can separate outliers from normal data effectively. To address this, we propose a novel Single-Objective Generative Adversarial Active Learning (SO-GAAL) method for outlier detection, which can directly generate informative potential outliers based on the mini-max game between a generator and a discriminator. Moreover, to prevent the generator from falling into the mode collapsing problem, the stop node of training should be determined when SO-GAAL is able to provide sufficient information. But without any prior information, it is extremely difficult for SO-GAAL. Therefore, we expand the network structure of SO-GAAL from a single generator to multiple generators with different objectives (MO-GAAL), which can generate a reasonable reference distribution for the whole dataset. We empirically compare the proposed approach with several state-of-the-art outlier detection methods on both synthetic and real-world datasets. The results show that MO-GAAL outperforms its competitors in the majority of cases, especially for datasets with various cluster types or high irrelevant variable ratio.

Cost-Sensitive Learning for Predictive Maintenance

In predictive maintenance, model performance is usually assessed by means of precision, recall, and F1-score. However, employing the model with the best performance, e.g. the highest F1-score, does not necessarily result in minimum maintenance cost, but can instead lead to additional expenses. Thus, we propose to perform model selection based on the economic costs associated with the particular maintenance application. We show that cost-sensitive learning for predictive maintenance can result in significant cost reduction and fault-tolerant policies, since it allows various business constraints and requirements to be incorporated.
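
The difference between selecting by F1-score and selecting by cost can be illustrated with a minimal sketch. The confusion counts, the per-event costs, and the helper names below are hypothetical values chosen for illustration, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts for one candidate model (hypothetical numbers)."""
    tp: int  # failures correctly predicted
    fp: int  # false alarms
    fn: int  # missed failures
    tn: int  # correct "no failure" predictions


def f1_score(c):
    """F1 = 2*TP / (2*TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def maintenance_cost(c, cost_fp, cost_fn, cost_tp=0.0):
    """Total cost under an assumed linear cost model per outcome type."""
    return c.fp * cost_fp + c.fn * cost_fn + c.tp * cost_tp


# Two hypothetical candidate models evaluated on the same 1000 cases.
models = {
    "A": Confusion(tp=80, fp=10, fn=20, tn=890),
    "B": Confusion(tp=95, fp=60, fn=5, tn=840),
}

# Assumed costs: an unnecessary inspection, an unplanned breakdown,
# and a planned repair, respectively.
COST_FP = 100.0
COST_FN = 5000.0
COST_TP = 500.0

best_by_f1 = max(models, key=lambda m: f1_score(models[m]))
best_by_cost = min(
    models,
    key=lambda m: maintenance_cost(models[m], COST_FP, COST_FN, COST_TP),
)

print("best by F1:", best_by_f1)
print("best by cost:", best_by_cost)
```

With these assumed numbers the two criteria disagree: model A has the higher F1-score, while model B is cheaper because missed failures are priced far above false alarms.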

A Unified Approach to Construct Correlation Coefficient Between Random Variables

Measuring the correlation (association) between two random variables is one of the important goals in statistical applications. In the literature, the covariance between two random variables is a widely used criterion in measuring the linear association between two random variables. In this paper, first we propose a covariance-based unified measure of variability for a continuous random variable X and we show that several measures of variability and uncertainty, such as variance, Gini mean difference, cumulative residual entropy, etc., can be considered as special cases. Then, we propose a unified measure of correlation between two continuous random variables X and Y, with distribution functions (DFs) F and G, based on the covariance between X and H^{-1}G(Y) (known as the Q-transformation of H on G) where H is a continuous DF. We show that our proposed measure of association subsumes some of the existing measures of correlation. It is shown that the suggested index takes values in [-1,1], where the extremes of the range, i.e., -1 and 1, are attainable by the Fréchet bivariate minimal and maximal DFs, respectively. A special case of the proposed correlation measure leads to a variant of the Pearson correlation coefficient which, as a measure of strength and direction of the linear relationship between X and Y, has absolute values greater than or equal to the Pearson correlation. The results are examined numerically for some well-known bivariate DFs.

FanStore: Enabling Efficient and Scalable I/O for Distributed Deep Learning

Emerging Deep Learning (DL) applications introduce heavy I/O workloads on computer clusters. The inherent long lasting, repeated, and random file access pattern can easily saturate the metadata and data service and negatively impact other users. In this paper, we present FanStore, a transient runtime file system that optimizes DL I/O on existing hardware/software stacks. FanStore distributes datasets to the local storage of compute nodes, and maintains a global namespace. With the techniques of system call interception, distributed metadata management, and generic data compression, FanStore provides a POSIX-compliant interface with native hardware throughput in an efficient and scalable manner. Users do not have to make intrusive code changes to use FanStore and take advantage of the optimized I/O. Our experiments with benchmarks and real applications show that FanStore can scale DL training to 512 compute nodes with over 90% scaling efficiency.

GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration

Despite advances in scalable models, the inference tools used for Gaussian processes (GPs) have yet to fully capitalize on recent developments in machine learning hardware. We present an efficient and general approach to GP inference based on Blackbox Matrix-Matrix multiplication (BBMM). BBMM inference uses a modified batched version of the conjugate gradients algorithm to derive all terms required for training and inference in a single call. Adapting this algorithm to complex models simply requires a routine for efficient matrix-matrix multiplication with the kernel and its derivative. In addition, BBMM utilizes a specialized preconditioner that substantially speeds up convergence. In experiments, we show that BBMM efficiently utilizes GPU hardware, speeding up GP inference by an order of magnitude on a variety of popular GP models. Additionally, we provide GPyTorch, a new software platform for scalable Gaussian process inference via BBMM, built on PyTorch.

Incorporating GAN for Negative Sampling in Knowledge Representation Learning

Knowledge representation learning aims at modeling knowledge graphs by encoding entities and relations into a low-dimensional space. Most of the traditional works for knowledge embedding need negative sampling to minimize a margin-based ranking loss. However, those works construct negative samples through a random mode, by which the samples are often too trivial to fit the model efficiently. In this paper, we propose a novel knowledge representation learning framework based on Generative Adversarial Networks (GAN). In this GAN-based framework, we take advantage of a generator to obtain high-quality negative samples. Meanwhile, the discriminator in the GAN learns the embeddings of the entities and relations in the knowledge graph. Thus, we can incorporate the proposed GAN-based framework into various traditional models to improve the ability of knowledge representation learning. Experimental results show that our proposed GAN-based framework outperforms baselines on triplet classification and link prediction tasks.

A Note on Spectral Clustering and SVD of Graph Data

Spectral clustering and Singular Value Decomposition (SVD) are both widely used techniques for analyzing graph data. In this note, I will present their connections using simple linear algebra, aiming to provide some in-depth understanding for future research.

Introducing Noise in Decentralized Training of Neural Networks

It has been shown that injecting noise into the neural network weights during the training process leads to better generalization of the resulting model. Noise injection in the distributed setup is a straightforward technique and represents a promising approach to improving locally trained models. We investigate the effects of noise injection into neural networks during a decentralized training process. We show both theoretically and empirically that noise injection has no positive effect in expectation on linear models. For non-linear neural networks, however, we show empirically that noise injection substantially improves model quality, helping the generalization ability of a local model come close to that of the serial baseline.

Controllable Neural Story Generation via Reinforcement Learning

Open story generation is the problem of automatically creating a story for any domain without retraining. Neural language models can be trained on large corpora across many domains and then used to generate stories. However, stories generated via language models tend to lack direction and coherence. We introduce a policy gradient reinforcement learning approach to open story generation that learns to achieve a given narrative goal state. In this work, the goal is for a story to end with a specific type of event, given in advance. However, a reward based on achieving the given goal is too sparse for effective learning. We use reward shaping to provide the reinforcement learner with a partial reward at every step. We show that our technique can train a model that generates a story that reaches the goal 94% of the time and reduces model perplexity. A human subject evaluation shows that stories generated by our technique are perceived to have significantly higher plausible event ordering and plot coherence over a baseline language modeling technique without perceived degradation of overall quality, enjoyability, or local causality.
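
To make the idea of a partial reward at every step concrete, here is a minimal sketch. The abstract does not specify the shaping function, so this uses potential-based shaping, a standard technique, as a stand-in; the distance-to-goal values, the function name, and the parameters are all hypothetical.

```python
def shaped_rewards(distances, goal_reward=1.0, gamma=1.0):
    """Turn a sparse goal reward into per-step rewards.

    distances[t] is an assumed estimate of how many events separate the
    story state at step t from the goal event (0 means the goal is reached).
    The potential of a state is the negative of that distance, and the
    shaping term for a transition is gamma * potential(next) - potential(current).
    """
    rewards = []
    for t in range(len(distances) - 1):
        # Sparse part: paid only when the transition lands on the goal.
        sparse = goal_reward if distances[t + 1] == 0 else 0.0
        # Dense part: positive when the story moves closer to the goal.
        shaping = gamma * (-distances[t + 1]) - (-distances[t])
        rewards.append(sparse + shaping)
    return rewards


# A hypothetical five-state story that stalls once, then reaches the goal.
print(shaped_rewards([3, 2, 2, 1, 0]))
```

Without the shaping term, only the final transition would carry any reward; with it, every step that moves toward the goal event is rewarded.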

An Introduction to Probabilistic Programming

This document is designed to be a first-year graduate-level introduction to probabilistic programming. It not only provides a thorough background for anyone wishing to use a probabilistic programming system, but also introduces the techniques needed to design and build these systems. It is aimed at people who have an undergraduate-level understanding of either or, ideally, both probabilistic machine learning and programming languages. We start with a discussion of model-based reasoning and explain why conditioning as a foundational computation is central to the fields of probabilistic machine learning and artificial intelligence. We then introduce a simple first-order probabilistic programming language (PPL) whose programs define static-computation-graph, finite-variable-cardinality models. In the context of this restricted PPL we introduce fundamental inference algorithms and describe how they can be implemented in the context of models denoted by probabilistic programs. In the second part of this document, we introduce a higher-order probabilistic programming language, with a functionality analogous to that of established programming languages. This affords the opportunity to define models with dynamic computation graphs, at the cost of requiring inference methods that generate samples by repeatedly executing the program. Foundational inference algorithms for this kind of probabilistic programming language are explained in the context of an interface between program executions and an inference controller. This document closes with a chapter on advanced topics which we believe to be, at the time of writing, interesting directions for probabilistic programming research; directions that point towards a tight integration with deep neural network research and the development of systems for next-generation artificial intelligence applications.

Duality for Nonlinear Filtering

Nearly 60 years ago, in a celebrated paper of Kalman and Bucy, it was established that optimal estimation for linear Gaussian systems is dual to a linear-quadratic optimal control problem. In this paper, for the first time, a duality result is established for a general nonlinear filtering problem, mirroring closely the original Kalman-Bucy duality of control and estimation for linear systems. The main result is presented for a finite state space Markov process in continuous time. It is used to derive the classical Wonham filter. The form of the result suggests a natural generalization which is presented as a conjecture for the continuous state space case.

Visual Analytics for Automated Model Discovery

A recent advancement in the machine learning community is the development of automated machine learning (autoML) systems, such as autoWeka or Google’s Cloud AutoML, which automate the model selection and tuning process. However, while autoML tools give users access to arbitrarily complex models, they typically return those models with little context or explanation. Visual analytics can be helpful in giving a user of autoML insight into their data, and a more complete understanding of the models discovered by autoML, including differences between multiple models. In this work, we describe how visual analytics for automated model discovery differs from traditional visual analytics for machine learning. First, we propose an architecture based on an extension of existing visual analytics frameworks. Then we describe a prototype system Snowcat, developed according to the presented framework and architecture, that aids users in generating models for a diverse set of data and modeling tasks.

Complexity of Training ReLU Neural Network

In this paper, we explore some basic questions on the complexity of training neural networks with the ReLU activation function. We show that it is NP-hard to train a two-hidden-layer feedforward ReLU neural network. If the dimension d of the data is fixed, then we show that there exists a polynomial-time algorithm for the same training problem. We also show that if sufficient over-parameterization is provided in the first hidden layer of the ReLU neural network, then there is a polynomial-time algorithm which finds weights such that the output of the over-parameterized ReLU neural network matches the output of the given data.

Inverse Transport Networks

We introduce inverse transport networks as a learning architecture for inverse rendering problems where, given input image measurements, we seek to infer physical scene parameters such as shape, material, and illumination. During training, these networks are evaluated not only in terms of how closely they can predict ground-truth parameters, but also in terms of whether the parameters they produce can be used, together with physically-accurate graphics renderers, to reproduce the input image measurements. To enable training of inverse transport networks using stochastic gradient descent, we additionally create a general-purpose, physically-accurate differentiable renderer, which can be used to estimate derivatives of images with respect to arbitrary physical scene parameters. Our experiments demonstrate that inverse transport networks can be trained efficiently using differentiable rendering, and that they generalize to scenes with completely unseen geometry and illumination better than networks trained without appearance-matching regularization.

A theoretical framework for deep locally connected ReLU network

Understanding the theoretical properties of deep and locally connected nonlinear networks, such as deep convolutional neural networks (DCNN), is still a hard problem despite their empirical success. In this paper, we propose a novel theoretical framework for such networks with ReLU nonlinearity. The framework explicitly formulates the data distribution, favors disentangled representations and is compatible with common regularization techniques such as Batch Norm. The framework is built upon the teacher-student setting, by expanding the student forward/backward propagation onto the teacher’s computational graph. The resulting model does not impose unrealistic assumptions (e.g., Gaussian inputs, independence of activation, etc.). Our framework could help facilitate theoretical analysis of many practical issues, e.g. overfitting, generalization, disentangled representations in deep networks.

Learning and Planning with a Semantic Model

Building deep reinforcement learning agents that can generalize and adapt to unseen environments remains a fundamental challenge for AI. This paper describes progress on this challenge in the context of man-made environments, which are visually diverse but contain intrinsic semantic regularities. We propose a hybrid model-based and model-free approach, LEArning and Planning with Semantics (LEAPS), consisting of a multi-target sub-policy that acts on visual inputs, and a Bayesian model over semantic structures. When placed in an unseen environment, the agent plans with the semantic model to make high-level decisions, proposes the next sub-target for the sub-policy to execute, and updates the semantic model based on new observations. We perform experiments in visual navigation tasks using House3D, a 3D environment that contains diverse human-designed indoor scenes with real-world objects. LEAPS outperforms strong baselines that do not explicitly plan using the semantic content.

Graph Generation via Scattering

Generative networks have made it possible to generate meaningful signals such as images and texts from simple noise. Recently, generative methods based on GAN and VAE were developed for graphs and graph signals. However, some of these methods are complex as well as difficult to train and fine-tune. This work proposes a graph generation model that uses a recent adaptation of Mallat’s scattering transform to graphs. The proposed model is naturally composed of an encoder and a decoder. The encoder is a Gaussianized graph scattering transform. The decoder is a simple fully connected network that is adapted to specific tasks, such as link prediction, signal generation on graphs and full graph and signal generation. The training of our proposed system is efficient since it is only applied to the decoder and the hardware requirement is moderate. Numerical results demonstrate state-of-the-art performance of the proposed system for both link prediction and graph and signal generation. These results are in contrast to experience with Euclidean data, where it is difficult to form a generative scattering network that performs as well as state-of-the-art methods. We believe that this is because of the discrete and simpler nature of graph applications, unlike the more complex and high-frequency nature of Euclidean data, in particular, of some natural images.

Adaptive Input Representations for Neural Language Modeling

We introduce adaptive input representations for neural language modeling which extend the adaptive softmax of Grave et al. (2017) to input representations of variable capacity. There are several choices on how to factorize the input and output layers, and whether to model words, characters or sub-word units. We perform a systematic comparison of popular choices for a self-attentional architecture. Our experiments show that models equipped with adaptive embeddings are more than twice as fast to train than the popular character input CNN while having a lower number of parameters. We achieve a new state of the art on the WikiText-103 benchmark of 20.51 perplexity, improving the next best known result by 8.7 perplexity. On the Billion word benchmark, we achieve a state of the art of 24.14 perplexity.

Answering Analytical Queries on Text Data with Temporal Term Histograms

Temporal text, i.e., time-stamped text data are found abundantly in a variety of data sources like newspapers, blogs and social media posts. While today’s data management systems provide facilities for searching full-text data, they do not provide any simple primitives for performing analytical operations with text. This paper proposes the temporal term histograms (TTH) as an intermediate primitive that can be used for analytical tasks. We propose an algebra, with operators and equivalence rules for TTH and present a reference implementation on a relational database system.
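
As a rough illustration of what a temporal term histogram could look like, the sketch below maps each term to per-month counts. The abstract does not define the TTH algebra, so the monthly bucketing, the `build_tth` and `restrict` helpers, and the sample documents are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from datetime import date


def build_tth(docs):
    """Build term -> Counter{(year, month): count} from (date, text) pairs."""
    tth = defaultdict(Counter)
    for day, text in docs:
        bucket = (day.year, day.month)  # assumed granularity: one bucket per month
        for term in text.lower().split():
            tth[term][bucket] += 1
    return tth


def restrict(hist, start, end):
    """Hypothetical operator: keep only buckets with start <= bucket <= end."""
    return Counter({b: n for b, n in hist.items() if start <= b <= end})


# Hypothetical time-stamped documents.
docs = [
    (date(2018, 9, 3), "paper mill reports paper break"),
    (date(2018, 9, 20), "new paper on outlier detection"),
    (date(2018, 10, 1), "outlier detection with GANs"),
]

tth = build_tth(docs)

# Counter addition serves as a simple "merge" of two term histograms.
merged = tth["paper"] + tth["outlier"]
october_only = restrict(tth["outlier"], (2018, 10), (2018, 12))

print(dict(tth["paper"]))
print(dict(merged))
print(dict(october_only))
```

Keeping the histogram as a plain mapping from time bucket to count means selection and merging stay cheap, which is the kind of property an intermediate primitive for analytical queries would need.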

The Rule of Three: Abstractive Text Summarization in Three Bullet Points

Neural network-based approaches have become widespread for abstractive text summarization. Though previously proposed models for abstractive text summarization addressed the problem of repetition of the same contents in the summary, they did not explicitly consider its information structure. One of the reasons these previous models failed to account for information structure in the generated summary is that standard datasets include summaries of variable lengths, resulting in problems in analyzing information flow, specifically, the manner in which the first sentence is related to the following sentences. Therefore, we use a dataset containing summaries with only three bullet points, and propose a neural network-based abstractive summarization model that considers the information structures of the generated summaries. Our experimental results show that the information structure of a summary can be controlled, thus improving the performance of the overall summarization.

Confidence Calibration in Deep Neural Networks through Stochastic Inferences

We propose a generic framework to calibrate accuracy and confidence (score) of a prediction through stochastic inferences in deep neural networks. We first analyze the relation between the variation of multiple model parameters for a single example inference and the variance of the corresponding prediction scores by Bayesian modeling of stochastic regularization. Our empirical observation shows that accuracy and score of a prediction are highly correlated with the variance of multiple stochastic inferences given by stochastic depth or dropout. Motivated by these facts, we design a novel variance-weighted confidence-integrated loss function that is composed of two cross-entropy loss terms with respect to the ground truth and the uniform distribution, which are balanced by the variance of stochastic prediction scores. The proposed loss function enables us to learn deep neural networks that predict confidence-calibrated scores using a single inference. Our algorithm presents outstanding confidence calibration performance and improves classification accuracy with two popular stochastic regularization techniques (stochastic depth and dropout) in multiple models and datasets; it alleviates the overconfidence issue in deep neural networks significantly by training networks to achieve prediction accuracy proportional to confidence of prediction.
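
The structure of such a loss, two cross-entropy terms balanced by a variance-derived weight, can be sketched as follows. This is not the paper's exact formulation: the normalization of the variance (dividing by 0.25, the largest possible variance of a value in [0, 1]), the use of the mean prediction, and all names are assumptions.

```python
import math
from statistics import pvariance

EPS = 1e-12  # guards against log(0)


def cross_entropy(target, predicted):
    """Cross-entropy H(target, predicted) for discrete distributions."""
    return -sum(t * math.log(max(p, EPS)) for t, p in zip(target, predicted) if t > 0)


def variance_weighted_loss(stochastic_scores, label):
    """Blend ground-truth and uniform cross-entropy by prediction variance.

    stochastic_scores: probability vectors from several stochastic forward
    passes (e.g. with dropout active) for a single example.
    Returns (loss, alpha), where alpha in [0, 1] is the assumed weight.
    """
    num_passes = len(stochastic_scores)
    num_classes = len(stochastic_scores[0])
    mean_scores = [
        sum(s[k] for s in stochastic_scores) / num_passes for k in range(num_classes)
    ]
    # Average per-class variance across passes, scaled into [0, 1].
    mean_var = sum(
        pvariance([s[k] for s in stochastic_scores]) for k in range(num_classes)
    ) / num_classes
    alpha = min(1.0, mean_var / 0.25)

    one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]
    uniform = [1.0 / num_classes] * num_classes
    loss = (1.0 - alpha) * cross_entropy(one_hot, mean_scores) \
        + alpha * cross_entropy(uniform, mean_scores)
    return loss, alpha


stable = [[0.7, 0.2, 0.1]] * 3               # passes agree: weight on ground truth
unstable = [[0.9, 0.1], [0.1, 0.9]]          # passes disagree: weight shifts to uniform
print(variance_weighted_loss(stable, label=0))
print(variance_weighted_loss(unstable, label=0))
```

When the stochastic passes agree, the weight is zero and the loss reduces to ordinary cross-entropy; when they disagree, the uniform term pulls the prediction toward lower confidence.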

Deep learning systems as complex networks

Thanks to the availability of large scale digital datasets and massive amounts of computational power, deep learning algorithms can learn representations of data by exploiting multiple levels of abstraction. These machine learning methods have greatly improved the state-of-the-art in many challenging cognitive tasks, such as visual object recognition, speech processing, natural language understanding and automatic translation. In particular, one class of deep learning models, known as deep belief networks, can discover intricate statistical structure in large data sets in a completely unsupervised fashion, by learning a generative model of the data using Hebbian-like learning mechanisms. Although these self-organizing systems can be conveniently formalized within the framework of statistical mechanics, their internal functioning remains opaque, because their emergent dynamics cannot be solved analytically. In this article we propose to study deep belief networks using techniques commonly employed in the study of complex networks, in order to gain some insights into the structural and functional properties of the computational graph resulting from the learning process.

Domain Generalization with Domain-Specific Aggregation Modules

Visual recognition systems are meant to work in the real world. For this to happen, they must work robustly in any visual domain, and not only on the data used during training. Within this context, a very realistic scenario deals with domain generalization, i.e. the ability to build visual recognition algorithms able to work robustly in several visual domains, without having access to any information about target data statistics. This paper contributes to this research thread, proposing a deep architecture that keeps the information about the available source domains separate while at the same time leveraging generic perceptual information. We achieve this by introducing domain-specific aggregation modules that, through an aggregation layer strategy, are able to merge generic and specific information in an effective manner. Experiments on two different benchmark databases show the power of our approach, reaching a new state of the art in domain generalization.

Pumpout: A Meta Approach for Robustly Training Deep Neural Networks with Noisy Labels

It is challenging to train deep neural networks robustly on industrial-level data, since the labels of such data are heavily noisy and their label generation processes are normally agnostic. To handle these issues, we may exploit the memorization effects of deep neural networks and train them on the whole dataset for only the first few iterations. Then, we may employ early stopping or the small-loss trick to train them on selected instances. However, in such training procedures, deep neural networks inevitably memorize some noisy labels, which will degrade their generalization. In this paper, we propose a meta algorithm called Pumpout to overcome the problem of memorizing noisy labels. By using scaled stochastic gradient ascent, Pumpout actively squeezes out the negative effects of noisy labels from the training model, instead of passively forgetting these effects. We leverage Pumpout to upgrade two representative methods: MentorNet and Backward Correction. Empirical results on benchmark datasets demonstrate that Pumpout can significantly improve the robustness of representative methods.
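
The contrast between passively skipping suspicious instances and actively applying scaled gradient ascent to them can be sketched on a toy model. This follows the abstract only loosely: the one-parameter linear model, the small-loss threshold used to flag likely noisy labels, and the values of `lr` and `gamma` are illustrative assumptions.

```python
def pumpout_style_step(w, batch, lr=0.1, gamma=0.5, loss_threshold=1.0):
    """One pass over a batch for the toy model y = w * x with squared loss.

    Instances with a small loss are treated as probably clean and get an
    ordinary gradient-descent update. Instances with a large loss are
    treated as probably noisy and get a gradient-ASCENT update scaled by
    gamma, pushing the model away from fitting them.
    """
    for x, y in batch:
        err = w * x - y
        loss = 0.5 * err ** 2
        grad = err * x  # d(loss)/dw
        if loss <= loss_threshold:
            w -= lr * grad           # descent on the selected instance
        else:
            w += gamma * lr * grad   # scaled ascent on the rejected instance
    return w


w_clean = pumpout_style_step(1.0, [(1.0, 1.5)])    # small loss: moves toward y
w_noisy = pumpout_style_step(1.0, [(1.0, -4.0)])   # large loss: moves away from y
print(w_clean, w_noisy)
```

Setting `gamma` to zero in this sketch recovers the plain small-loss trick, where rejected instances are simply ignored.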

Relational Forward Models for Multi-Agent Learning

The behavioral dynamics of multi-agent systems have a rich and orderly structure, which can be leveraged to understand these systems, and to improve how artificial agents learn to operate in them. Here we introduce Relational Forward Models (RFM) for multi-agent learning, networks that can learn to make accurate predictions of agents’ future behavior in multi-agent environments. Because these models operate on the discrete entities and relations present in the environment, they produce interpretable intermediate representations which offer insights into what drives agents’ behavior, and what events mediate the intensity and valence of social interactions. Furthermore, we show that embedding RFM modules inside agents results in faster learning systems compared to non-augmented baselines. As more and more of the autonomous systems we develop and interact with become multi-agent in nature, developing richer analysis tools for characterizing how and why agents make decisions is increasingly necessary. Moreover, developing artificial agents that quickly and safely learn to coordinate with one another, and with humans in shared environments, is crucial.

Jensen-Shannon Divergence as a Goodness-of-Fit Measure for Maximum Likelihood Estimation and Curve Fitting

$R^2$

$JSD$
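
For readers unfamiliar with the quantity named in the title, the Jensen-Shannon divergence between two discrete distributions can be computed as in the sketch below. It shows the standard definition only; how the paper applies it as a goodness-of-fit measure is not reproduced here, and the `observed` and `fitted` values are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2.

    With base-2 logarithms the value lies in [0, 1]; it is 0 when the
    distributions are identical and 1 when their supports are disjoint.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


observed = [0.5, 0.3, 0.2]   # hypothetical empirical frequencies
fitted = [0.4, 0.4, 0.2]     # hypothetical model probabilities
jsd = jensen_shannon_divergence(observed, fitted)
print(jsd)
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric and always finite, which makes it convenient for comparing an empirical distribution with a fitted one.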

Robot Representing and Reasoning with Knowledge from Reinforcement Learning

Reinforcement learning (RL) agents aim at learning by interacting with an environment, and are not designed for representing or reasoning with declarative knowledge. Knowledge representation and reasoning (KRR) paradigms are strong in declarative KRR tasks, but are ill-equipped to learn from such experiences. In this work, we integrate logical-probabilistic KRR with model-based RL, enabling agents to simultaneously reason with declarative knowledge and learn from interaction experiences. The knowledge from humans and RL is unified and used for dynamically computing task-specific planning models under potentially new environments. Experiments were conducted using a mobile robot working on dialog, navigation, and delivery tasks. Results show significant improvements, in comparison to existing model-based RL methods.

Reuse and Adaptation for Entity Resolution through Transfer Learning

Entity resolution (ER) is one of the fundamental problems in data integration, where machine learning (ML) based classifiers often provide the state-of-the-art results. Considerable human effort goes into feature engineering and training data creation. In this paper, we investigate a new problem: Given a dataset D_T for ER with limited or no training data, is it possible to train a good ML classifier on D_T by reusing and adapting the training data of a dataset D_S from the same or a related domain? Our major contributions include (1) a distributed representation based approach to encode each tuple from diverse datasets into a standard feature space; (2) identification of common scenarios where the reuse of training data can be beneficial; and (3) five algorithms for handling each of the aforementioned scenarios. We have performed comprehensive experiments on 12 datasets from 5 different domains (publications, movies, songs, restaurants, and books). Our experiments show that our algorithms provide significant benefits such as providing superior performance for a fixed training data size.

Which Knowledge Graph Is Best for Me?

In recent years, DBpedia, Freebase, OpenCyc, Wikidata, and YAGO have been published as noteworthy large, cross-domain, and freely available knowledge graphs (KGs). Although extensively in use, these knowledge graphs are hard to compare against each other in a given setting. Thus, it is a challenge for researchers and developers to pick the best knowledge graph for their individual needs. In our recent survey, we devised and applied data quality criteria to the above-mentioned knowledge graphs. Furthermore, we proposed a framework for finding the most suitable knowledge graph for a given setting. With this paper we intend to ease access to our in-depth survey by presenting simplified rules that map individual data quality requirements to specific knowledge graphs. However, this paper does not intend to replace our previously introduced decision-support framework. For an informed decision on which KG is best for you, we still refer to our in-depth survey.

Propagation Networks for Model-Based Control Under Partial Observation

There has been increasing interest in learning dynamics simulators for model-based control. Compared with off-the-shelf physics engines, a learnable simulator can quickly adapt to unseen objects, scenes, and tasks. However, existing models like interaction networks only work for fully observable systems; they also only consider pairwise interactions within a single time step. Both limitations restrict their use in practical systems. We introduce Propagation Networks (PropNet), a differentiable, learnable dynamics model that handles partially observable scenarios and enables instantaneous propagation of signals beyond pairwise interactions. With these innovations, our propagation networks not only outperform current learnable physics engines in forward simulation, but also achieve superior performance on various control tasks. Compared with existing deep reinforcement learning algorithms, model-based control with propagation networks is more accurate, efficient, and generalizable to novel, partially observable scenes and tasks.

• Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks
• Two Phase Transitions in Two-way Bootstrap Percolation
• A Coarse-To-Fine Framework For Video Object Segmentation
• Optimal Multicast of Tiled 360 VR Video in OFDMA Systems
• Definition and evaluation of model-free coordination of electrical vehicle charging with reinforcement learning
• Supervised Nonnegative Matrix Factorization to Predict ICU Mortality Risk
• Weakly-Supervised Localization and Classification of Proximal Femur Fractures
• Learning a High-Precision Robotic Assembly Task Using Pose Estimation from Simulated Depth Images
• Semantic Topic Analysis of Traffic Camera Images
• Multi-Scale Recursive and Perception-Distortion Controllable Image Super-Resolution
• Eshelby description of highly viscous flow III
• A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC
• Plane and Planarity Thresholds for Random Geometric Graphs
• Constraining neutrino mass with tomographic weak lensing one-point probability distribution function and power spectrum
• On the loss landscape of a class of deep neural networks with no bad local valleys
• The 1-2-3 Conjecture almost holds for regular graphs
• Building a Lemmatizer and a Spell-checker for Sorani Kurdish
• Auto-Encoding Knockoff Generator for FDR Controlled Variable Selection
• Comparative Efficiency of Altruism and Egoism as Voting Strategies in Stochastic Environment
• Wiener index and Steiner 3-Wiener index of a graph
• Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence
• Double-layered distributed transient frequency control with regional coordination for power networks
• Performance of MPI sends of non-contiguous data
• On the Reliability Roots of Simplicial Complexes and Matroids
• Morpho-MNIST: Quantitative Assessment and Diagnostics for Representation Learning
• Methods and Concepts in Economic Complexity
• Adaptive Gaussian process surrogates for Bayesian inference
• Learning and Acting in Peripersonal Space: Moving, Reaching, and Grasping
• An Empirical Comparison of Syllabuses for Curriculum Learning
• Cursive Scene Text Analysis by Deep Convolutional Linear Pyramids
• Model-Preserving Sensitivity Analysis for Families of Gaussian Distributions
• A Hybrid Neural Network Framework and Application to Radar Automatic Target Recognition
• Effective Cloud Detection and Segmentation using a Gradient-Based Algorithm for Satellite Imagery; Application to improve PERSIANN-CCS
• Patient Risk Assessment and Warning Symptom Detection Using Deep Attention-Based Neural Networks
• Learning Confidence Sets using Support Vector Machines
• Novel Multi Agent Models for Chemical Self-assembly
• Boundary-guided Feature Aggregation Network for Salient Object Detection
• Weak detection of signal in the spiked Wigner model
• Throughput Optimization in FDD MU-MISO Wireless Powered Communication Networks
• Spatial-Temporal Inference of Urban Traffic Emissions Based on Taxi Trajectories and Multi-Source Urban Data
• Embedded-State Latent Conditional Random Fields for Sequence Labeling
• Model confidence sets and forecast combination: An application to age-specific mortality
• Cocktail BPSK: Cross Power Utilization for High Data Rates
• Proximal Recursion for Solving the Fokker-Planck Equation
• Using Multi-task and Transfer Learning to Solve Working Memory Tasks
• Exploiting Sparsity in SOS Programming and Sparse Polynomial Optimization
• Minimax Lower Bounds for $\mathcal{H}_\infty$-Norm Estimation
• Efficiently testing local optimality and escaping saddles for ReLU networks
• semantic segmentation for urban planning maps based on u-net
• Lindeberg’s method for $α$-stable central limit theorems
• Effect of information asymmetry in Cournot duopoly game with bounded rationality
• Understanding the Temporal Fading in Wireless Industrial Networks: Measurements and Analyses
• Characterizing Audio Adversarial Examples Using Temporal Dependency
• Application of the novel fractional grey model FAGMO(1,1,k) to predict China’s nuclear energy consumption
• Structure of the set of quantum correlators via semidefinite programming
• HyperST-Net: Hypernetworks for Spatio-Temporal Forecasting
• A hierarchical cellular automaton model of distributed traffic signal control
• An open source massively parallel solver for Richards equation: Mechanistic modelling of water fluxes at the watershed scale
• Large sample properties of the Midzuno sampling scheme
• Evidential community detection based on density peaks
• Protograph-Based LDPC Code Design for Ternary Message Passing Decoding
• A SwarmESB Based Architecture for an European Healthcare Insurance System in Compliance with GDPR
• Permutations With Equal Orders
• Blockchain and Smart-contracts Modeled in a SwarmESB Ecosystem
• Depth Reconstruction of Translucent Objects from a Single Time-of-Flight Camera using Deep Residual Networks
• Large deviations for conditional guesswork
• Data depth and floating body
• Low analytic rank implies low partition rank for tensors
• SeqSleepNet: End-to-End Hierarchical Recurrent Neural Network for Sequence-to-Sequence Automatic Sleep Staging
• Multi operator-stable random measures and fields
• Strong Coordination over Noisy Channels with Strictly Causal Encoding
• New Thread Migration Strategies for NUMA Systems
• Subsets of Cayley graphs that induce many edges
• On Rational Entailment for Propositional Typicality Logic
• Elementary coupling approach for non-linear perturbation of Markov processes with mean-field jump mechanims and related problems
• Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
• Overview of PicTropes, a film trope dataset
• Variational Bayesian Inference for Audio-Visual Tracking of Multiple Speakers
• Target-Independent Active Learning via Distribution-Splitting
• CNNs Fusion for Building Detection in Aerial Images for the Building Detection Challenge
• Some properties of a new partial order on Dyck paths
• The Generalized Sinusoidal Frequency Modulated Waveform for Active Sonar Systems
• Dynamical Gibbs-non-Gibbs transitions in Curie-Weiss Widom-Rowlinson models
• Growing in time IDLA cluster is recurrent
• An exploration-exploitation tradeoff dictates the optimal distribution of phenotypes for populations in presence of fitness fluctuations
• Efficient Linear Bandits through Matrix Sketching
• Peer-to-Peer Energy Trading with Sustainable User Participation: A Game Theoretic Approach
• Real-time Dynamic Object Detection for Autonomous Driving using Prior 3D-Maps
• The edge-Erdős-Pósa property
• Interest point detectors stability evaluation on ApolloScape dataset
• A Symmetric Keyring Encryption Scheme for Biometric Cryptosystems
• Cross-situational learning of large lexicons with finite memory
• SConE: Siamese Constellation Embedding Descriptor for Image Matching
• On the Hardness of the Strongly Dependent Decision Problem
• Aggregation of binary feature descriptors for compact scene model representation in large scale structure-from-motion applications
• On wavelets to select the parametric form of a regression model
• Camera Pose Estimation from Sequence of Calibrated Images
• Spoken Pass-Phrase Verification in the i-vector Space
• Face Recognition Based on Sequence of Images
• Extrinsic camera calibration method and its performance evaluation
• On the Landscape of Synchronization Networks: A Perspective from Nonconvex Optimization
• Learning Recurrent Binary/Ternary Weights
• Learning to Remember, Forget and Ignore using Attention Control in Memory
• A Systems Approach to Achieving the Benefits of Artificial Intelligence in UK Defence
• Resonant Beam Communications
• A kernel-based approach to molecular conformation analysis
• Large Scale GAN Training for High Fidelity Natural Image Synthesis
• A Differential Degree Test for Comparing Brain Networks
• Rethinking Self-driving: Multi-task Knowledge for Better Generalization and Accident Explanation Ability
• Perturbed Bayesian Inference for Online Parameter Estimation
• No percolation at criticality on certain groups of intermediate growth
• Weighted Spectral Embedding of Graphs
• Channel-wise and Spatial Feature Modulation Network for Single Image Super-Resolution
• EDDI: Efficient Dynamic Discovery of High-Value Information with Partial VAE
• Duality between source coding with quantum side information and c-q channel coding
• Large deviations of subgraph counts for sparse Erdős–Rényi graphs
• Hölder Continuity of Cumulative Distribution Functions for Noncommutative Polynomials under Finite Free Fisher Information
• SALSA-TEXT : self attentive latent space based adversarial text generation
• Universal and Dynamic Locally Repairable Codes with Maximal Recoverability via Sum-Rank Codes
• Formal Context Generation using Dirichlet Distributions
• Fast state tomography with optimal error bounds
• Partial words with a unique position starting a power
• Non-equilibrium statistical mechanics of continuous attractors
