What's new on arXiv

An Improvement of Data Classification Using Random Multimodel Deep Learning (RMDL)

The exponential growth in the number of complex datasets every year requires further enhancement of machine learning methods to provide robust and accurate data classification. Lately, deep learning approaches have achieved better results than previous machine learning algorithms. However, finding a suitable structure for these models has been a challenge for researchers. This paper introduces Random Multimodel Deep Learning (RMDL): a new ensemble deep learning approach for classification. RMDL solves the problem of finding the best deep learning structure and architecture while simultaneously improving robustness and accuracy through ensembles of deep learning architectures. In short, RMDL trains multiple randomly generated models of Deep Neural Network (DNN), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN) in parallel and combines their results to produce a better result than any of those models individually. In this paper, we describe the RMDL model and compare results for image and text classification as well as face recognition. We used the MNIST and CIFAR-10 datasets as ground-truth datasets for image classification and the WOS, Reuters, IMDB, and 20newsgroup datasets for text classification. Lastly, we used the ORL dataset to compare model performance on the face recognition task.
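
To make the ensemble idea concrete, here is a minimal sketch (not the authors' code) that trains a few randomly configured scikit-learn MLPs and combines their predictions by majority vote; the small digits dataset stands in for MNIST, and the MLPs stand in for RMDL's randomly generated DNN/CNN/RNN models.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = []
for _ in range(5):
    # Randomly sample the architecture: depth and width differ per model.
    depth = int(rng.integers(1, 4))
    layers = tuple(int(rng.integers(32, 257)) for _ in range(depth))
    clf = MLPClassifier(hidden_layer_sizes=layers, max_iter=500,
                        random_state=int(rng.integers(1_000_000)))
    models.append(clf.fit(X_train, y_train))

# Combine the individual predictions by majority vote.
preds = np.stack([m.predict(X_test) for m in models])      # shape: (n_models, n_samples)
vote = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
print("ensemble accuracy:", (vote == y_test).mean())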

Multiclass Universum SVM

We introduce Universum learning for multiclass problems and propose a novel formulation for multiclass Universum SVM (MU-SVM). We also propose an analytic span bound for model selection with almost 2-4x faster computation times than standard resampling techniques. We empirically demonstrate the efficacy of the proposed MU-SVM formulation on several real-world datasets, achieving over 20% improvement in test accuracy compared to multiclass SVM.

Style Transfer as Unsupervised Machine Translation

Language style transfer rephrases text with specific stylistic attributes while preserving the original attribute-independent content. One main challenge in learning a style transfer system is the lack of parallel data in which the source sentence is in one style and the target sentence is in another. Given this constraint, we adapt unsupervised machine translation methods to the task of automatic style transfer. We first take advantage of style-preference information and word embedding similarity to produce pseudo-parallel data with a statistical machine translation (SMT) framework. Then the iterative back-translation approach is employed to jointly train two neural machine translation (NMT) based transfer systems. To control the noise generated during joint training, a style classifier is introduced to guarantee the accuracy of style transfer and penalize bad candidates in the generated pseudo data. Experiments on benchmark datasets show that our proposed method outperforms previous state-of-the-art models in terms of both accuracy of style transfer and quality of input-output correspondence.
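
The training procedure can be summarized schematically as below; the helper callables (train_smt, train_nmt, translate, style_score) are hypothetical placeholders, not an API from the paper or any library.

def iterative_back_translation(mono_a, mono_b, train_smt, train_nmt,
                               translate, style_score, rounds=3, threshold=0.8):
    """Schematic only: the helper callables are hypothetical placeholders."""
    # 1) Pseudo-parallel data from an SMT system seeded with style-preference
    #    and word-embedding-similarity information.
    smt_a2b = train_smt(mono_a, mono_b)
    pseudo = [(s, translate(smt_a2b, s)) for s in mono_a]             # (style A, style B)

    nmt_a2b = train_nmt(pseudo)
    nmt_b2a = train_nmt([(b, a) for a, b in pseudo])

    for _ in range(rounds):
        # 2) Back-translate monolingual data to create synthetic training pairs.
        synth_for_a2b = [(translate(nmt_b2a, b), b) for b in mono_b]  # (A', B)
        synth_for_b2a = [(translate(nmt_a2b, a), a) for a in mono_a]  # (B', A)

        # 3) A style classifier filters noisy generated sentences.
        synth_for_a2b = [(a, b) for a, b in synth_for_a2b
                         if style_score(a, "A") >= threshold]
        synth_for_b2a = [(b, a) for b, a in synth_for_b2a
                         if style_score(b, "B") >= threshold]

        # 4) Jointly retrain the two transfer directions.
        nmt_a2b = train_nmt(pseudo + synth_for_a2b)
        nmt_b2a = train_nmt([(b, a) for a, b in pseudo] + synth_for_b2a)

    return nmt_a2b, nmt_b2a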

LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations

Reinforcement learning approaches have long appealed to the data management community due to their ability to learn to control dynamic behavior from raw system performance. Recent successes in combining deep neural networks with reinforcement learning have sparked significant new interest in this domain. However, practical solutions remain elusive due to large training data requirements, algorithmic instability, and lack of standard tools. In this work, we introduce LIFT, an end-to-end software stack for applying deep reinforcement learning to data management tasks. While prior work has frequently explored applications in simulations, LIFT centers on utilizing human expertise to learn from demonstrations, thus lowering online training times. We further introduce TensorForce, a TensorFlow library for applied deep reinforcement learning exposing a unified declarative interface to common RL algorithms, thus providing a backend to LIFT. We demonstrate the utility of LIFT in two case studies in database compound indexing and resource management in stream processing. Results show LIFT controllers initialized from demonstrations can outperform human baselines and heuristics across latency metrics and space usage by up to 70%.
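
As a rough illustration of the learning-from-demonstrations idea (this is not the LIFT or TensorForce API), a controller can be pretrained by behavior cloning on logged expert decisions before any online reinforcement learning:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical demonstration log: system states and the action an expert or
# heuristic chose (e.g. which column to index, how much memory to allocate).
demo_states = rng.normal(size=(500, 8))
demo_actions = (demo_states[:, 0] + demo_states[:, 1] > 0).astype(int)

# Behavior cloning = supervised learning on the demonstrations.
policy = LogisticRegression().fit(demo_states, demo_actions)

# The cloned policy then serves as the initialization for online RL, which
# would keep updating it from observed rewards (latency, space usage).
print("agreement with demonstrations:", policy.score(demo_states, demo_actions))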

Structural-Factor Modeling of High-Dimensional Time Series: Another Look at Approximate Factor Models with Diverging Eigenvalues

Ontology Reasoning with Deep Neural Networks

The ability to conduct logical reasoning is a fundamental aspect of intelligent behavior, and thus an important problem on the way to human-level artificial intelligence. Traditionally, symbolic methods from the field of knowledge representation and reasoning have been used to equip agents with capabilities that resemble human reasoning qualities. More recently, however, there has been an increasing interest in applying alternative approaches based on machine learning rather than logic-based formalisms to tackle these kinds of tasks. Here, we make use of state-of-the-art methods for training deep neural networks to devise a novel model that is closely coupled to symbolic reasoning methods, and thus able to learn how to effectively perform basic ontology reasoning. This term describes an important and at the same time very natural kind of problem setting in which the rules for conducting reasoning are specified alongside the actual information. Many problems in practice may be viewed as such reasoning tasks, which is why the presented approach is applicable to a plethora of important real-world problems. To demonstrate the effectiveness of the suggested method, we present the outcomes of several experiments conducted on both toy and real-world datasets, which show that our model learned to perform precise reasoning on a number of diverse inference tasks requiring comprehensive deductive proficiency. Furthermore, the suggested model turned out to suffer much less from the various obstacles that prohibit symbolic reasoning.

Features of word similarity

In this theoretical note, we compare different types of computational models of word similarity and association in their ability to predict a set of about 900 similarity ratings. Using regression and predictive modeling tools (neural network, decision tree), we evaluate the performance of a total of 28 models that use different combinations of surface and semantic word features. The results present evidence for the hypothesis that word similarity ratings are based on more than semantic relatedness alone. The limited cross-validated performance of the models calls for the development of psychological process models of the word similarity rating task.
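
A schematic of this modeling setup, with random placeholder arrays standing in for the roughly 900 rated word pairs and their surface/semantic features (so the scores printed here carry no meaning; real rating data is required):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n_pairs = 900
surface = rng.random((n_pairs, 2))        # e.g. normalized edit distance, length difference
semantic = rng.random((n_pairs, 1))       # e.g. cosine similarity of word embeddings
X = np.hstack([surface, semantic])
ratings = rng.random(n_pairs)             # placeholder human similarity ratings

for name, model in [("regression", LinearRegression()),
                    ("decision tree", DecisionTreeRegressor(max_depth=4))]:
    r2 = cross_val_score(model, X, ratings, cv=5, scoring="r2").mean()
    print(name, round(float(r2), 3))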

Reinforcement Learning for Relation Classification from Noisy Data

Existing relation classification methods that rely on distant supervision assume that all sentences in a bag mentioning an entity pair describe a relation for that pair. Such methods, which perform classification at the bag level, cannot identify the mapping between a relation and a sentence, and largely suffer from the noisy-labeling problem. In this paper, we propose a novel model for relation classification at the sentence level from noisy data. The model has two modules: an instance selector and a relation classifier. The instance selector chooses high-quality sentences with reinforcement learning and feeds the selected sentences into the relation classifier, while the relation classifier makes sentence-level predictions and provides rewards to the instance selector. The two modules are trained jointly to optimize the instance selection and relation classification processes. Experimental results show that our model deals with noisy data effectively and achieves better performance for relation classification at the sentence level.
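
A toy sketch of the selector/classifier interplay: the instance selector is an independent-Bernoulli policy over sentences updated with REINFORCE, and a stand-in reward function plays the role of the relation classifier's feedback.

import numpy as np

rng = np.random.default_rng(0)
n_sent, dim = 200, 10
X = rng.normal(size=(n_sent, dim))      # sentence features for one entity pair's bag
clean = X[:, 0] > 0                     # hidden flag marking sentences that are not noisy

def reward(selected):
    # Stand-in for the classifier's feedback (e.g. log-likelihood of the true
    # relation given the selected sentences): higher when clean sentences are kept.
    return clean[selected].mean() if selected.any() else -1.0

w = np.zeros(dim)                       # instance-selector parameters
baseline, lr = 0.0, 0.5
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-X @ w))    # keep-probability of each sentence
    keep = rng.random(n_sent) < p       # sample a selection action
    r = reward(keep)
    baseline = 0.9 * baseline + 0.1 * r
    # REINFORCE for an independent-Bernoulli policy: grad log pi = sum_i (a_i - p_i) x_i
    w += lr * (r - baseline) * ((keep - p) @ X) / n_sent

p = 1.0 / (1.0 + np.exp(-X @ w))
print("avg keep-probability, clean vs noisy:", p[clean].mean(), p[~clean].mean())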

Self-Paced Multi-Task Clustering

Multi-task clustering (MTC) has attracted a lot of research attention in machine learning due to its ability to exploit the relationship among different tasks. Despite the success of traditional MTC models, they are either prone to getting stuck in local optima or sensitive to outliers and noisy data. To alleviate these problems, we propose a novel self-paced multi-task clustering (SPMTC) paradigm. In detail, SPMTC progressively selects data examples to train a series of MTC models with increasing complexity, thus greatly decreasing the risk of getting trapped in poor local optima. Furthermore, to reduce the negative influence of outliers and noisy data, we design a soft version of SPMTC to further improve the clustering performance. The corresponding SPMTC framework can be solved easily by an alternating optimization method. The proposed model is guaranteed to converge, and experiments on real data sets demonstrate promising results compared with state-of-the-art multi-task clustering methods.
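
A minimal sketch of the self-paced ingredient alone (the multi-task coupling of SPMTC is omitted): examples whose loss under the current model falls below a growing threshold are admitted first, so training starts from easy points and gradually includes harder, possibly noisy ones.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

rng = np.random.default_rng(0)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)
X = np.vstack([X, rng.uniform(-15, 15, size=(20, 2))])   # a few outliers

centers = X[rng.choice(len(X), 3, replace=False)]
pace = 0.5                                    # start with the easiest 50% of points
for _ in range(10):
    # Per-example loss under the current model: distance to the closest center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2).min(axis=1)
    easy = d <= np.quantile(d, pace)          # self-paced weights v_i in {0, 1}
    centers = KMeans(n_clusters=3, n_init=10).fit(X[easy]).cluster_centers_
    pace = min(1.0, pace + 0.05)              # gradually admit harder examples
print(centers)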

From Random to Supervised: A Novel Dropout Mechanism Integrated with Global Information

Dropout is used to avoid overfitting by randomly dropping units from neural networks during training. Inspired by dropout, this paper presents GI-Dropout, a novel dropout method that integrates global information to improve neural networks for text classification. Unlike traditional dropout, in which units are dropped randomly with the same probability, we aim to use explicit instructions based on global information about the dataset to guide the training process. With GI-Dropout, the model is encouraged to pay more attention to less apparent features or patterns. Experiments demonstrate the effectiveness of dropout with global information on seven text classification tasks, including sentiment analysis and topic classification.
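
A small sketch of the contrast with standard dropout; the per-unit "importance" scores below are an illustrative stand-in, not the paper's actual global-information statistics.

import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(size=(32, 128))                 # a batch of hidden activations

# Standard dropout: every unit shares the same drop probability.
p_uniform = np.full(h.shape[1], 0.5)

# Guided dropout: units tied to globally very salient features are dropped more
# often, pushing the model to rely on less apparent features as well.
importance = rng.random(h.shape[1])            # stand-in global statistics in [0, 1]
p_guided = 0.2 + 0.6 * importance              # per-unit drop probability in [0.2, 0.8]

def dropout(x, p_drop, rng):
    mask = rng.random(x.shape[1]) >= p_drop    # keep each unit with probability 1 - p_drop
    return x * mask / (1.0 - p_drop)           # inverted-dropout rescaling

out_uniform = dropout(h, p_uniform, rng)
out_guided = dropout(h, p_guided, rng)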

Different but Equal: Comparing User Collaboration with Digital Personal Assistants vs. Teams of Expert Agents

This work compares user collaboration with conversational personal assistants vs. teams of expert chatbots. Two studies were performed to investigate how each approach affects task accomplishment and collaboration costs. Participants interacted with two equivalent financial advice chatbot systems, one composed of a single conversational adviser and the other based on a team of four expert chatbots. Results indicated that users had different forms of experience but were equally able to achieve their goals. Contrary to expectations, there was evidence that, in the teamwork situation, users were better able to predict agent behavior and did not incur an overhead to maintain common ground, indicating similar collaboration costs. The results point towards the feasibility of either approach for user collaboration with conversational agents.

An Empirical Study of Rich Subgroup Fairness for Machine Learning

Kearns et al. [2018] recently proposed a notion of rich subgroup fairness intended to bridge the gap between statistical and individual notions of fairness. Rich subgroup fairness picks a statistical fairness constraint (say, equalizing false positive rates across protected groups), but then asks that this constraint hold over an exponentially or infinitely large collection of subgroups defined by a class of functions with bounded VC dimension. They give an algorithm guaranteed to learn subject to this constraint, under the condition that it has access to oracles for perfectly learning absent a fairness constraint. In this paper, we undertake an extensive empirical evaluation of the algorithm of Kearns et al. On four real datasets for which fairness is a concern, we investigate the basic convergence of the algorithm when instantiated with fast heuristics in place of learning oracles, measure the tradeoffs between fairness and accuracy, and compare this approach with the recent algorithm of Agarwal et al. [2018], which implements weaker and more traditional marginal fairness constraints defined by individual protected attributes. We find that in general, the Kearns et al. algorithm converges quickly, large gains in fairness can be obtained with mild costs to accuracy, and that optimizing accuracy subject only to marginal fairness leads to classifiers with substantial subgroup unfairness. We also provide a number of analyses and visualizations of the dynamics and behavior of the Kearns et al. algorithm. Overall we find this algorithm to be effective on real data, and rich subgroup fairness to be a viable notion in practice.
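
A rough auditing heuristic in the spirit of rich subgroup fairness (not the Kearns et al. algorithm itself): search a simple function class, here shallow decision trees over the protected attributes, for a subgroup whose false positive rate deviates most from the overall rate.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def audit_fpr(protected, y_true, y_pred, max_depth=2):
    """Return a disparity score and a candidate subgroup indicator g(x)."""
    neg = y_true == 0                               # FPR is defined on true negatives
    fp = (y_pred == 1) & neg
    overall_fpr = fp[neg].mean()
    # Train a shallow tree to predict where false positives concentrate;
    # its positive predictions define a candidate subgroup over protected features.
    tree = DecisionTreeClassifier(max_depth=max_depth).fit(protected[neg], fp[neg])
    group = tree.predict(protected) == 1
    in_group = group & neg
    if in_group.sum() == 0:
        return 0.0, group
    gap = abs(fp[in_group].mean() - overall_fpr) * in_group.mean()
    return gap, group

The Kearns et al. algorithm instead plays a zero-sum game between a learner and such an auditor; the sketch above only captures (a simplified version of) the auditing step.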

STDP Learning of Image Patches with Convolutional Spiking Neural Networks

Spiking neural networks are motivated by principles of neural systems and may possess unexplored advantages in the context of machine learning. A class of convolutional spiking neural networks is introduced, trained to detect image features with an unsupervised, competitive learning mechanism. Image features can be shared within subpopulations of neurons, or each may evolve independently to capture different features in different regions of input space. We analyze the time and memory requirements of learning with and operating such networks. The MNIST dataset is used as an experimental testbed, and comparisons are made with the performance and convergence speed of a baseline spiking neural network.
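
For intuition, a minimal pair-based STDP update of the kind such networks rely on: a synapse is potentiated when the presynaptic spike precedes the postsynaptic one and depressed otherwise, with an exponential time window (the parameter values here are arbitrary, not taken from the paper).

import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Update weight w given one pre/post spike-time pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:                          # pre before post -> potentiation
        w += a_plus * np.exp(-dt / tau)
    else:                               # post before pre -> depression
        w -= a_minus * np.exp(dt / tau)
    return float(np.clip(w, 0.0, 1.0))  # keep the weight in a bounded range

w = 0.5
for t_pre, t_post in [(10.0, 15.0), (40.0, 38.0), (60.0, 61.0)]:
    w = stdp_update(w, t_pre, t_post)
    print(round(w, 4))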

Future Automation Engineering using Structural Graph Convolutional Neural Networks

The digitalization of automation engineering generates large quantities of engineering data that is interlinked in knowledge graphs. Classifying and clustering subgraphs according to their functionality is useful to discover functionally equivalent engineering artifacts that exhibit different graph structures. This paper presents a new graph learning algorithm designed to classify engineering data artifacts — represented in the form of graphs — according to their structure and neighborhood features. Our Structural Graph Convolutional Neural Network (SGCNN) is capable of learning graphs and subgraphs with a novel graph invariant convolution kernel and downsampling/pooling algorithm. On a realistic engineering-related dataset, we show that SGCNN is capable of achieving ~91% classification accuracy.
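
For intuition, a generic graph-convolution layer in NumPy; the SGCNN of the paper uses its own graph-invariant kernel and downsampling/pooling, which are not reproduced here.

import numpy as np

def gcn_layer(A, X, W):
    """One propagation step: H = ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

# Tiny example: 4 engineering artifacts (nodes) with 3 features each.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 3))
W = np.random.default_rng(1).normal(size=(3, 8))
H = gcn_layer(A, X, W)                            # node embeddings, shape (4, 8)
print(H.shape)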

• Dynamical attractors of memristors and their networks
• Dual approach for object tracking based on optical flow and swarm intelligence
• New Classes of Infinite Image Partition Regular Matrices Near Zero
• Attention Gated Networks: Learning to Leverage Salient Regions in Medical Images
• Insect cyborgs: Biological feature generators improve machine learning accuracy on limited data
• Recalibrating Fully Convolutional Networks with Spatial and Channel ‘Squeeze & Excitation’ Blocks
• The double-edge sword of disorder in multichannel topological superconductors
• Persistence and extinction for stochastic ecological difference equations with feedbacks
• TAP free energy, spin glasses, and variational inference
• A Century Long Commitment to Assessing Artificial Intelligence and its Impact on Society
• Optimal Energy-Efficient Policies for Data Centers through Sensitivity-Based Optimization
• Spread of an infection on the zero range process
• The Importance of Generation Order in Language Modeling
• Multivariate Extension of Matrix-based Renyi’s α-order Entropy Functional
• Improving Abstraction in Text Summarization
• Estimation of Integrated Functionals of a Monotone Density
• Conditional expectation of the duration of the classical gambler problem with defects
• SOTER: Programming Safe Robotics System using Runtime Assurance
• Financial Aspect-Based Sentiment Analysis using Deep Representations
• Deconvolutional Networks for Point-Cloud Vehicle Detection and Tracking in Driving Scenarios
• A Closed-Form Approximation of the Gaussian Noise Model in the Presence of Inter-Channel Stimulated Raman Scattering
• Solving Quadratic Multi-Leader-Follower Games by Smoothing the Follower’s Best Response
• Finite-State Contract Theory with a Principal and a Field of Agents
• Maximal Jacobian-based Saliency Map Attack
• Fractional Risk Process in Insurance
• Thermal conductivity and local thermodynamic equilibrium of stochastic energy exchange models
• From Hand-Crafted to Deep Learning-based Cancer Radiomics: Challenges and Opportunities
• One (more) line on the most Ancient Algorithm in History
• Learning Human-Object Interactions by Graph Parsing Neural Networks
• The Optimal Memory-Rate Trade-off for the Non-uniform Centralized Caching Problem with Two Files under Uncoded Placement
• Left ventricle quantification through spatio-temporal CNNs
• A Communication Protocol for Man-Machine Networks
• Proximal Policy Optimization and its Dynamic Version for Sequence Generation
• Analysis of Noise Contrastive Estimation from the Perspective of Asymptotic Variance
• Introducing the Perception-Distortion Tradeoff into the Rate-Distortion Theory of General Information Sources
• A Semi-Markov Chain Approach to Modeling Respiratory Patterns Prior to Extubation in Preterm Infants
• Predicting Extubation Readiness in Extreme Preterm Infants based on Patterns of Breathing
• Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants
• Deep Feature Pyramid Reconfiguration for Object Detection
• Non-asymptotic bounds for percentiles of independent non-identical random variables
• Online Static Security Assessment of Power Systems Based on Lasso Algorithm
• Polynomial Chaos-Based Adaptive Control for Nonlinear Systems
• Approximate Distribution Matching for Sequence-to-Sequence Learning
• Energy-Efficient Massive IoT Shared Spectrum Access over UAV-enabled Cellular Networks
• An Enhanced SCMA Detector Enabled by Deep Neural Network
• Linear complexity of generalized cyclotomic sequences of period $2p^{m}$
• A Jointly Learned Context-Aware Place of Interest Embedding for Trip Recommendations
• Decision fusion with multiple spatial supports by conditional random fields
• Hybrid Job-driven Scheduling for Virtual MapReduce Clusters
• Role Semantics for Better Models of Implicit Discourse Relations
• Integration with an Adaptive Harmonic Mean Algorithm
• Explicit rates of convergence in the multivariate CLT for nonlinear statistics
• Towards Machine Learning-Based Optimal HAS
• Minimal covers of hypergraphs with applications to topological spaces
• One-sided scaling limit of multicolor box-ball system
• Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information
• On the Finite Horizon Optimal Switching Problem with Random Lag
• The Forward-Backward-Forward Method from discrete and continuous perspective for pseudo-monotone variational inequalities in Hilbert Spaces
• A dynamical approach to privacy preserving average consensus
• Bayesian Multi–Dipole Modeling in the Frequency Domain
• Atherosclerotic carotid plaques on panoramic imaging: an automatic detection using deep learning with small dataset
• Multi-scenario deep learning for multi-speaker source separation
• Memory Time Span in LSTMs for Multi-Speaker Source Separation
• Measuring LDA Topic Stability from Clusters of Replicated Runs
• A stochastic SIR network epidemic model with preventive dropping of edges
• A Bayesian nonparametric approach for generalized Bradley-Terry models in random environment
• The Shift from Processor Power Consumption to Performance Variations: Fundamental Implications at Scale
• Performance Analysis of Ultra-Reliable Short Message Decode and Forward Relaying Protocols
• Performance Limits of Single-Anchor mm-Wave Positioning
• Set-partition tableaux and representations of diagram algebras
• Green kernel asymptotics for two-dimensional random walks under random conductances
• Networks of coupled oscillators: from phase to amplitude chimeras
• Partial-fraction Expansion of Lossless Negative Imaginary Property and A Generalized Lossless Negative Imaginary Lemma
• Simply Generated Unrooted Plane Trees
• A hierarchical modelling approach to assess multi pollutant effects in time-series studies
• Overcoming unambiguous state discrimination attack with the help of Schrödinger Cat decoy states
• Spectral thresholding for the estimation of Markov chain transition operators
• A heterogeneous spatial model in which savanna and forest coexist in a stable equilibrium
• Continuous time Gaussian process dynamical models in gene regulatory network inference
• Asynchronous One-Level and Two-Level Domain Decomposition Solvers
• MVOR: A Multi-view RGB-D Operating Room Dataset for 2D and 3D Human Pose Estimation
• Truth Inference on Sparse Crowdsourcing Data with Local Differential Privacy
• On solutions of equations with measurable coefficients driven by $α$-stable processes
• GoT-WAVE: Temporal network alignment using graphlet-orbit transitions
• Is Machine Learning in Power Systems Vulnerable?
• Applications of the Fractional-Random-Weight Bootstrap
• On a class of norms generated by nonnegative integrable distributions
• Automatic Foreground Extraction using Multi-Agent Consensus Equilibrium
