Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint
The classic objective in a reinforcement learning (RL) problem is to find a policy that minimizes, in expectation, a long-run objective such as the infinite-horizon discounted or long-run average cost. In many practical applications, optimizing the expected value alone is not sufficient, and it may be necessary to include a risk measure in the optimization process, either as the objective or as a constraint. Various risk measures have been proposed in the literature, e.g., mean-variance tradeoff, exponential utility, percentile performance, value at risk, conditional value at risk, prospect theory and its later enhancement, cumulative prospect theory. In this article, we focus on the combination of risk criteria and reinforcement learning in a constrained optimization framework, i.e., a setting where the goal is to find a policy that optimizes the usual objective of infinite-horizon discounted/average cost, while ensuring that an explicit risk constraint is satisfied. We introduce the risk-constrained RL framework, cover popular risk measures based on variance, conditional value-at-risk and cumulative prospect theory, and present a template for a risk-sensitive RL algorithm. We survey some of our recent work on this topic, covering problems encompassing discounted cost, average cost, and stochastic shortest path settings, together with the aforementioned risk measures in a constrained framework. This non-exhaustive survey is aimed at giving a flavor of the challenges involved in solving a risk-sensitive RL problem, and outlining some potential future research directions.
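The constrained setting described in this abstract can be written compactly. The following is a generic sketch, not the article's own notation. All symbols are assumptions introduced for illustration: $\theta$ denotes the policy parameters, $J(\theta)$ the usual discounted or average cost, $G(\theta)$ a risk measure (for instance variance or conditional value-at-risk), $\alpha$ a risk threshold, and $\lambda$ a Lagrange multiplier. One common way to handle such a constraint is Lagrangian relaxation, which turns the problem into a saddle-point search:

```latex
\begin{aligned}
&\min_{\theta}\; J(\theta) \quad \text{subject to} \quad G(\theta) \le \alpha, \\
&L(\theta, \lambda) = J(\theta) + \lambda \bigl( G(\theta) - \alpha \bigr), \qquad
\max_{\lambda \ge 0}\, \min_{\theta}\, L(\theta, \lambda).
\end{aligned}
```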
A Simple Baseline Algorithm for Graph Classification
Graph classification has recently received a lot of attention from various fields of machine learning, e.g., kernel methods, sequential modeling, or graph embedding. All these approaches offer promising results with different respective strengths and weaknesses. However, most of them rely on complex mathematics and require heavy computational power to achieve their best performance. We propose a simple and fast algorithm based on the spectral decomposition of the graph Laplacian to perform graph classification and get a first reference score for a dataset. We show that this method obtains competitive results compared to state-of-the-art algorithms.
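To make the idea of spectral features concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say which eigenvalues or which Laplacian normalization are used, so this example assumes the `k` smallest eigenvalues of the symmetric normalized Laplacian, zero-padded for graphs with fewer than `k` nodes. The function name `spectral_features` is hypothetical.

```python
import numpy as np


def spectral_features(adj, k):
    """Return a length-k feature vector built from Laplacian eigenvalues.

    Assumption for illustration: features are the k smallest eigenvalues
    of the symmetric normalized Laplacian, padded with zeros when the
    graph has fewer than k nodes.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)

    # D^{-1/2}, leaving isolated nodes (degree 0) at zero.
    has_edges = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[has_edges] = 1.0 / np.sqrt(deg[has_edges])

    # L = I - D^{-1/2} A D^{-1/2}; rows of isolated nodes are all zero.
    lap = np.diag(has_edges.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(lap)

    features = np.zeros(k)
    m = min(k, eigenvalues.size)
    features[:m] = eigenvalues[:m]
    return features
```

Each graph in a dataset is mapped to a fixed-length vector this way, and the vectors can then be passed to any standard classifier.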
What is an Ontology?
In the knowledge engineering community ‘ontology’ is usually defined in the tradition of Gruber as an ‘explicit specification of a conceptualization’. Several variations of this definition exist. In the paper we argue that (with one notable exception) these definitions are of no explanatory value, because they violate one of the basic rules for good definitions: The defining statement (the definiens) should be clearer than the term that is defined (the definiendum). In the paper we propose a different definition of ‘ontology’ and discuss how it helps to explain various phenomena: the ability of ontologies to change, the role of the choice of vocabulary, the significance of annotations, the possibility of collaborative ontology development, and the relationship between ontological conceptualism and ontological realism.
Node Representation Learning for Directed Graphs
We propose a novel approach for learning node representations in directed graphs, which maintains separate views or embedding spaces for the two distinct node roles induced by the directionality of the edges. To achieve this, we propose a novel alternating random walk strategy to generate training samples from the directed graph while preserving the role information. These samples are then trained using Skip-Gram with Negative Sampling (SGNS) with nodes retaining their source/target semantics. We conduct extensive experimental evaluation to showcase the effectiveness of our approach on several real-world datasets on link prediction, multi-label classification and graph reconstruction tasks. We show that the embeddings from our approach are indeed robust, generalizable and well performing across multiple kinds of tasks and networks. We show that we consistently outperform all random-walk based neural embedding methods for link prediction and graph reconstruction tasks. In addition to providing a theoretical interpretation of our method, we also show that we are considerably more robust than the other directed graph approaches.
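One possible reading of an alternating random walk that preserves node roles is sketched below. This is an assumption made for illustration, not the paper's exact sampling procedure: the walk follows an outgoing edge when the current node acts as a source, then follows an incoming edge backwards when the current node acts as a target, so consecutive nodes always alternate between the two roles. The function name `alternating_walk` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def alternating_walk(edges, start, length, seed=0):
    """Sample a walk of (node, role) pairs with alternating roles.

    Assumption for illustration: from a 'source' node we step forward
    along an out-edge; from a 'target' node we step backward along an
    in-edge. The walk stops early if no such edge exists.
    """
    rng = random.Random(seed)
    out_nbrs = defaultdict(list)
    in_nbrs = defaultdict(list)
    for u, v in edges:
        out_nbrs[u].append(v)
        in_nbrs[v].append(u)

    node, role = start, "source"
    walk = [(node, role)]
    for _ in range(length - 1):
        candidates = out_nbrs[node] if role == "source" else in_nbrs[node]
        if not candidates:
            break
        node = rng.choice(candidates)
        role = "target" if role == "source" else "source"
        walk.append((node, role))
    return walk
```

Every consecutive pair in such a walk corresponds to one directed edge, with the source-role node at its tail and the target-role node at its head, so the pairs can serve as role-aware training samples for an SGNS-style objective.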
From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference
Alternating Linear Bandits for Online Matrix-Factorization Recommendation
We consider the problem of collaborative filtering in the online setting, where items are recommended to the users over time. At each time step, the user (selected by the environment) consumes an item (selected by the agent) and provides a rating of the selected item. In this paper, we propose a novel algorithm for online matrix factorization recommendation that combines linear bandits and alternating least squares. In this formulation, the bandit feedback is equal to the difference between the ratings of the best and selected items. We evaluate the performance of the proposed algorithm over time using both cumulative regret and average cumulative NDCG. Simulation results over three synthetic datasets as well as three real-world datasets for online collaborative filtering indicate the superior performance of the proposed algorithm over two state-of-the-art online algorithms.
An Exploration of Dropout with RNNs for Natural Language Inference
Dropout is a crucial regularization technique for the Recurrent Neural Network (RNN) models of Natural Language Inference (NLI). However, dropout has not been evaluated for its effectiveness at different layers and dropout rates in NLI models. In this paper, we propose a novel RNN model for NLI and empirically evaluate the effect of applying dropout at different layers in the model. We also investigate the impact of varying dropout rates at these layers. Our empirical evaluation on a large (Stanford Natural Language Inference (SNLI)) and a small (SciTail) dataset suggests that dropout at each feed-forward connection severely affects the model accuracy at increasing dropout rates. We also show that regularizing the embedding layer is efficient for SNLI whereas regularizing the recurrent layer improves the accuracy for SciTail. Our model achieved an accuracy of 86.14% on the SNLI dataset and 77.05% on SciTail.
• On $s$-distance-transitive graphs
• Constituent Parsing as Sequence Labeling
• Transition-based Parsing with Lighter Feed-Forward Networks
• Visualization Framework for Colonoscopy Videos
• Safe Adaptive Cruise Control with Road Grade Preview and V2V Communication
• Robust Receiver Design for Non-orthogonal Multiple Access
• Signal Adaptive Variable Selector for the Horseshoe Prior
• Theoretical and Practical Aspects of the Linear Tape Scheduling Problem
• A Non-asymptotic, Sharp, and User-friendly Reverse Chernoff-Cramèr Bound
• Spatial Co-location Pattern Mining – A new perspective using Graph Database
• 3D shape retrieval basing on representatives of classes
• On unconstrained optimization problems solved using CDT and triality theory
• Combinatorics of $k$-Farey graphs
• C2A: Crowd Consensus Analytics for Virtual Colonoscopy
• On a linear functional for infinitely divisible moving average random fields
• eXogenous Kalman Filter for Lithium-Ion Batteries State-of-Charge Estimation in Electric Vehicles
• Local Properties via Color Energy Graphs and Forbidden Configurations
• Correcting an estimator of a multivariate monotone function with isotonic regression
• Hierarchical ResNeXt Models for Breast Cancer Histology Image Classification
• Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
• Distributed Approximate Distance Oracles
• Soft Concept Analysis
• Depth with Nonlinearity Creates No Bad Local Minima in ResNets
• Patient Subtyping with Disease Progression and Irregular Observation Trajectories
• VIENA2: A Driving Anticipation Dataset
• On DC based Methods for Phase Retrieval
• A convex integer programming approach for optimal sparse PCA
• Optimal electricity demand response contracting with responsiveness incentives
• Where is this? Video geolocation based on neural network features
• On the Conditional Smooth Renyi Entropy and its Applications in Guessing and Source Coding
• Learning from the Kernel and the Range Space
• Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators
• Atrial fibrosis quantification based on maximum likelihood estimator of multivariate images
• Our Practice Of Using Machine Learning To Recognize Species By Voice
• Sparsemax and Relaxed Wasserstein for Topic Sparsity
• ComNet: Combination of Deep Learning and Expert Knowledge in OFDM Receivers
• A general learning system based on neuron bursting and tonic firing
• SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation
• Interpretability is Harder in the Multiclass Setting: Axiomatic Interpretability for Multiclass Additive Models
• Degree growth for tame automorphisms of an affine quadric threefold
• A Variable Reduction Method for Large-Scale Security Constrained Unit Commitment
• Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?
• Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces
• Norm-Range Partition: A Univiseral Catalyst for LSH based Maximum Inner Product Search (MIPS)
• Evolution of holonic control architectures towards Industry 4.0: A short overview
• Learning to Measure Change: Fully Convolutional Siamese Metric Networks for Scene Change Detection
• The Bregman chord divergence
• Surrogate modeling based on resampled polynomial chaos expansions
• Uniform and $L^q$-Ensemble Reachability of Parameter-dependent Linear Systems
• Atrial scars segmentation via potential learning in the graph-cuts framework
• Distributed Mixed Voltage Angle and Frequency Droop Control of Microgrid Interconnections with Loss of Distribution-PMU Measurements
• Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma
• Do Deep Generative Models Know What They Don’t Know?
• DNN-based Source Enhancement to Increase Objective Sound Quality Assessment Score
• Bayesian Modelling of Lexis Mortality Data
• Mining useful Macro-actions in Planning
• Beyond ROUGE Scores in Algorithmic Summarization: Creating Fairness-Preserving Textual Summaries
• Mean-based Heuristic Search for Real-Time Planning
• PriSTE: From Location Privacy to Spatiotemporal Event Privacy
• A Dual-Attention Hierarchical Recurrent Neural Network for Dialogue Act Classification
• Temporal inactivation enhances robustness in an evolving system
• Threat or Opportunity? – Examining Social Bots in Social Media Crisis Communication
• Exploring Correlations in Multiple Facial Attributes through Graph Attention Network
• Named Entity Disambiguation using Deep Learning on Graphs
• A Maximum Likelihood-Based Minimum Mean Square Error Separation and Estimation of Stationary Gaussian Sources from Noisy Mixtures
• Ensemble Method for Censored Demand Prediction
• Optimal arrangements of hyperplanes for multiclass classification
• Dating Ancient Paintings of Mogao Grottoes Using Deeply Learnt Visual Codes
• The Hessenberg matrices and Catalan and its generalized numbers
• Compositional coding capsule network with k-means routing for text classification
• Learning sparse transformations through backpropagation
• Computation via Interacting Magnetic Memory Bites: Integration of Boolean Gates
• Subtleties in the interpretation of hazard ratios
• Field Of Interest Proposal for Augmented Mitotic Cell Count: Comparison of two Convolutional Networks
• Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation
• Multi-Agent Actor-Critic with Generative Cooperative Policy Network
• Halfspace depth does not characterize probability distributions
• Weighted asymmetric least squares regression for longitudinal data using GEE
• Chance-Constrained AC Optimal Power Flow Integrating HVDC Lines and Controllability
• On Number Rigidity for Pfaffian Point Processes
• Cost-Sensitive Robustness against Adversarial Examples
• Knowledge Graph Completion to Predict Polypharmacy Side Effects
• Approximations of the boundary crossing probabilities for the maximum of moving sums
• A Constraint-Reduced MPC Algorithm for Convex Quadratic Programming, with a Modified Active Set Identification Scheme
• A Review on Learning Planning Action Models for Socio-Communicative HRI
• Optimal terminal dimensionality reduction in Euclidean space
• The Price equation program: simple invariances unify population dynamics, thermodynamics, probability, information and inference
• Coalition Resilient Outcomes in Max k-Cut Games
• Data-driven optimization of processes with degrading equipment
• A Bayesian Nonparametrics based Robust Particle Filter Algorithm
• Optimal distributed control of a stochastic Cahn-Hilliard equation
• The Multi-Scale Impact of the Alzheimer’s Disease in the Topology Diversity of Astrocytes Molecular Communications Nanonetworks
• Sparse constrained projection approximation subspace tracking
• RCanopus: Making Canopus Resilient to Failures and Byzantine Faults
• BioSentVec: creating sentence embeddings for biomedical texts
• On the k-Boundedness for Existential Rules
• Topological and metric recurrence for general Markov chains
• Assessing the Impact of Gamification on Self-Directed Learning in Medical Students
• Circuits through prescribed edges
• On the number of limit cycles in asymmetric neural networks
• Double-precision FPUs in High-Performance Computing: an Embarrassment of Riches?
• Recovering Robustness in Model-Free Reinforcement learning
• Baseline Detection in Historical Documents using Convolutional U-Nets
• Adversarial Online Learning with noise
• Generation of Virtual Dual Energy Images from Standard Single-Shot Radiographs using Multi-scale and Conditional Adversarial Network
• Fast Dual Simulation Processing of Graph Database Queries (Supplement)
• Weighted Super Poincare Inequalities for Infinite-Dimensional Extension of the Dirichlet Distribution
• Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions
• ensmallen: a flexible C++ library for efficient function optimization
• Coupled Longitudinal and Lateral Control of a Vehicle using Deep Learning
• Description of Incomplete Financial Markets for the Discrete Time Evolution of Risk Assets
• Brain Tumor Image Retrieval via Multitask Learning
• On the Atkin and Swinnerton-Dyer type congruences for some truncated hypergeometric ${}_1F_0$ series
• Predictive Linguistic Features of Schizophrenia
• Linguistic Legal Concept Extraction in Portuguese
• Unsupervised Learning of Shape and Pose with Differentiable Point Clouds
• A minimax near-optimal algorithm for adaptive rejection sampling
• A neuro-inspired architecture for unsupervised continual learning based on online clustering and hierarchical predictive coding
• Nonhomogeneous Euclidean first-passage percolation and distance learning
• Towards a context-dependent numerical data quality evaluation framework
• Subcritical approximations to stochastic defocusing mass-critical nonlinear Schrödinger equation on $\mathbb{R}$
• Event-triggered Natural Hazard Monitoring with Convolutional Neural Networks on the Edge
• Properties of an N Time-Slice Dynamic Chain Event Graph
• Human-Competitive Awards 2018
• Optimality of the final model found via Stochastic Gradient Descent
• Scaling up Deep Learning for PDE-based Models
• New Bounds for the Dichromatic Number of a Digraph
• Proactive Security: Embedded AI Solution for Violent and Abusive Speech Recognition
• Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data