What's new on arXiv

Syntree2Vec – An algorithm to augment syntactic hierarchy into word embeddings

Word embeddings aim to map the sense of words into a lower-dimensional vector space in order to reason over them. Training embeddings on domain-specific data helps express concepts more relevant to their use case, but comes at a cost in accuracy when data is scarce. Our effort is to minimise this cost by infusing syntactic knowledge into the embeddings. We propose a graph-based embedding algorithm inspired by node2vec. Experimental results show that our algorithm improves syntactic strength and gives robust performance on meagre data.
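
The abstract stays at a high level, so here is a minimal Python sketch of a node2vec-style procedure over syntactic structure: uniform random walks over a graph built from dependency edges, emitted as pseudo-sentences for a downstream skip-gram trainer. It is only an illustration under assumed inputs (the edge list, walk length, and number of walks are made up), not the authors' Syntree2Vec algorithm.

import random
from collections import defaultdict

def build_syntax_graph(dependency_edges):
    """Adjacency list from (head, dependent) word pairs of parsed sentences."""
    graph = defaultdict(set)
    for head, dep in dependency_edges:
        graph[head].add(dep)
        graph[dep].add(head)
    return {w: sorted(nbrs) for w, nbrs in graph.items()}

def random_walks(graph, walks_per_node=10, walk_length=8, seed=0):
    """Uniform random walks; node2vec would bias these with p/q parameters."""
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                nbrs = graph[walk[-1]]
                if not nbrs:
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

# Toy dependency edges for "the cat sat on the mat".
edges = [("sat", "cat"), ("cat", "the"), ("sat", "on"), ("on", "mat"), ("mat", "the")]
corpus = random_walks(build_syntax_graph(edges))
# `corpus` can be fed to any skip-gram implementation (e.g. gensim Word2Vec).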

Improved Language Modeling by Decoding the Past

Highly regularized LSTMs that model the auto-regressive conditional factorization of the joint probability distribution of words achieve state-of-the-art results in language modeling. These models have an implicit bias towards predicting the next word from a given context. We propose a new regularization term based on decoding words in the context from the predicted distribution of the next word. With relatively few additional parameters, our model achieves absolute improvements of 1.7% and 2.3% over the current state-of-the-art results on the Penn Treebank and WikiText-2 datasets.
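
As a toy illustration of adding a "decode the past" term to the usual next-word cross-entropy (the shapes, the tied embedding matrix, and the weight lam below are assumptions, and the exact formulation in the paper may differ), a single step's losses could be computed like this:

import numpy as np

rng = np.random.default_rng(0)
V, d = 1000, 64                     # vocabulary size, embedding size (toy values)
E = rng.normal(size=(V, d)) * 0.1   # tied input/output embedding matrix (assumed)
h = rng.normal(size=d)              # hidden state of the language model at this position

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

next_word, last_ctx_word = 42, 7           # gold next word and last context word
p_next = softmax(E @ h)                    # standard next-word distribution
main_loss = -np.log(p_next[next_word])

# "Decode the past": summarize the predicted distribution as an expected
# embedding, then ask it to identify the last word of the context.
expected_emb = p_next @ E                  # shape (d,)
p_past = softmax(E @ expected_emb)
past_loss = -np.log(p_past[last_ctx_word])

lam = 0.1                                  # regularization weight (assumed)
total_loss = main_loss + lam * past_loss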

Efficient sampling of spreading processes on complex networks using a composition and rejection algorithm

PUG: A Framework and Practical Implementation for Why & Why-Not Provenance (extended version)

Explaining why an answer is (or is not) returned by a query is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Specifically, we introduce a graph-based provenance model that, while syntactic in nature, supports reverse reasoning and is proven to encode a wide range of provenance models from the literature. The implementation of this model in our PUG (Provenance Unification through Graphs) system takes a provenance question and a Datalog query as input and generates a Datalog program that computes an explanation, i.e., the part of the provenance that is relevant to answering the question. Furthermore, we demonstrate how a desirable factorization of provenance can be achieved by rewriting an input query. We experimentally evaluate our approach, demonstrating its efficiency.
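
The system itself operates on Datalog with a graph-based provenance model; purely as a toy illustration of the why / why-not questions it answers, the Python sketch below enumerates witnesses for one join query q(X, Z) :- a(X, Y), b(Y, Z) and, for a missing answer, the premises whose absence blocks every derivation. The relation contents and the Y domain are made up for illustration.

def why(a_tuples, b_tuples, answer):
    """Witnesses (why-provenance) for q(X, Z) :- a(X, Y), b(Y, Z)."""
    x, z = answer
    return [((x, y), (y2, z)) for (xa, y) in a_tuples if xa == x
            for (y2, zb) in b_tuples if y2 == y and zb == z]

def why_not(a_tuples, b_tuples, missing, y_domain):
    """For a missing answer, list per join value Y the absent premises that
    make that derivation fail (why-not explanation)."""
    x, z = missing
    explanations = []
    for y in y_domain:
        absent = []
        if (x, y) not in a_tuples:
            absent.append(("a", x, y))
        if (y, z) not in b_tuples:
            absent.append(("b", y, z))
        explanations.append((y, absent))
    return explanations

a = {("alice", "math"), ("bob", "cs")}
b = {("math", "monday"), ("cs", "friday")}
print(why(a, b, ("alice", "monday")))
print(why_not(a, b, ("alice", "friday"), y_domain=["math", "cs"]))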

Story Disambiguation: Tracking Evolving News Stories across News and Social Streams

Following a particular news story online is an important but difficult task, as the relevant information is often scattered across different domains/sources (e.g., news articles, blogs, comments, tweets), presented in various formats and language styles, and may overlap with thousands of other stories. In this work we join the areas of topic tracking and entity disambiguation, and propose a framework named Story Disambiguation – a cross-domain story tracking approach that builds on real-time entity disambiguation and a learning-to-rank framework to represent and update the rich semantic structure of news stories. Given a target news story, specified by a seed set of documents, the goal is to effectively select new story-relevant documents from an incoming document stream. We represent stories as entity graphs and we model the story tracking problem as a learning-to-rank task. This enables us to track content with high accuracy, from multiple domains, in real time. We study a range of text, entity, and graph-based features to understand which types of features are most effective for representing stories. We further propose new semi-supervised learning techniques to automatically update the story representation over time. Our empirical study shows that our approach outperforms state-of-the-art methods in accuracy when tracking mixed-domain document streams, while requiring less labeled data to seed the tracked stories. This is particularly the case for local news stories that are easily overshadowed by other trending stories, and for complex news stories with ambiguous content in noisy stream environments.
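
As a much-simplified stand-in for the learned ranking function described above, the sketch below represents a story by the union of its documents' disambiguated entity IDs, scores incoming documents by entity overlap, and performs a self-training style update of the story representation. The Jaccard score and both thresholds are assumptions made only for illustration.

def entity_set(doc_entities):
    """A document is reduced here to its set of disambiguated entity IDs."""
    return set(doc_entities)

def story_score(story_entities, doc_entities):
    """Jaccard overlap as a stand-in for the learned ranking function."""
    doc = entity_set(doc_entities)
    if not story_entities or not doc:
        return 0.0
    return len(story_entities & doc) / len(story_entities | doc)

def track(seed_docs, stream, accept=0.35, update=0.6):
    """Score streaming documents against the story; confidently relevant
    documents also update the story representation (self-training)."""
    story = set().union(*(entity_set(d) for d in seed_docs))
    selected = []
    for doc in stream:
        s = story_score(story, doc)
        if s >= accept:
            selected.append(doc)
        if s >= update:          # semi-supervised update of the entity graph
            story |= entity_set(doc)
    return selected

seed = [["Q1", "Q2", "Q3"], ["Q2", "Q4"]]
stream = [["Q2", "Q3", "Q9"], ["Q7", "Q8"], ["Q1", "Q4", "Q2"]]
print(track(seed, stream))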

IceBreaker: Solving Cold Start Problem for Video Recommendation Engines

The Internet has brought about a tremendous increase in content of all forms, and video constitutes the backbone of what is published as well as watched. It therefore becomes imperative for video recommendation engines such as Hulu to find novel and innovative ways to recommend newly added videos to their users. However, new videos lack the metadata and user-interaction history needed to rate them for consumers. To this end, this paper introduces several techniques we developed for the Content Based Video Relevance Prediction (CBVRP) Challenge hosted by Hulu at the ACM Multimedia Conference 2018. We employ different architectures on the CBVRP dataset to make use of the provided frame-level and video-level features and to predict which existing videos a new video is similar to. We also implement several ensemble strategies to exploit the complementarity between the two types of provided features. The obtained results are encouraging and should push the boundaries of research on multimedia-based video recommendation systems.
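
A hedged sketch of one simple way to use the two provided feature types for cold-start relevance prediction, late fusion of cosine similarities, is given below; the feature names, dimensions, and fusion weights are assumptions, not the challenge's actual setup.

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def relevance_ranking(new_video, catalog, w_frame=0.5, w_video=0.5, top_k=5):
    """Late fusion: blend similarities computed from the two provided
    feature types, then rank existing videos for the cold-start item."""
    scores = []
    for vid, feats in catalog.items():
        s = (w_frame * cosine(new_video["frame"], feats["frame"]) +
             w_video * cosine(new_video["video"], feats["video"]))
        scores.append((vid, s))
    return sorted(scores, key=lambda t: t[1], reverse=True)[:top_k]

rng = np.random.default_rng(1)
catalog = {f"v{i}": {"frame": rng.normal(size=128), "video": rng.normal(size=64)}
           for i in range(20)}
new_video = {"frame": rng.normal(size=128), "video": rng.normal(size=64)}
print(relevance_ranking(new_video, catalog))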

Adaptive Detection of Structured Signals in Low-Rank Interference

In this paper, we consider the problem of detecting the presence (or absence) of an unknown but structured signal from the space-time outputs of an array under strong, non-white interference. Our motivation is the detection of a communication signal in jamming, where often the training portion is known but the data portion is not. We assume that the measurements are corrupted by additive white Gaussian noise of unknown variance and a few strong interferers, whose number, powers, and array responses are unknown. We also assume the desired signal's array response is unknown. To address the detection problem, we propose several GLRT-based detection schemes that employ a probabilistic signal model and use the EM algorithm for likelihood maximization. Numerical experiments are presented to assess the performance of the proposed schemes.

Graph Edit Distance Computation via Graph Neural Networks

Graph similarity search is among the most important graph-based applications, e.g. finding the chemical compounds that are most similar to a query compound. Graph similarity/distance computation, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS), is the core operation of graph similarity search and many other applications, but is usually very costly to compute. Inspired by the recent success of neural network approaches to several graph applications, such as node classification and graph classification, we propose a novel neural network-based approach to address this challenging yet classical graph problem, with the hope of alleviating the computational burden while preserving good performance. Our model generalizes to unseen graphs and, in the worst case, runs in time linear in the number of nodes of the two graphs. Taking GED computation as an example, experimental results on three real graph datasets demonstrate the effectiveness and efficiency of our approach. Specifically, our model achieves smaller error and a large reduction in running time compared with several approximate algorithms for GED computation. To the best of our knowledge, we are among the first to adopt neural networks to model the similarity between two graphs, and we provide a new direction for future research on graph similarity computation and graph similarity search.
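
Purely as an illustration of the general recipe (node embeddings from message passing, a pooled graph embedding, and a learned score regressed onto a GED-derived similarity), here is an untrained numpy forward pass; the layer sizes, mean pooling, and sigmoid scoring are assumptions rather than the proposed architecture.

import numpy as np

def mp_layer(X, A, W):
    """One round of mean-neighbor message passing followed by ReLU."""
    deg = A.sum(axis=1, keepdims=True) + 1e-9
    return np.maximum((A @ X) / deg @ W, 0.0)

def graph_embedding(A, X, weights):
    H = X
    for W in weights:
        H = mp_layer(H, A, W)
    return H.mean(axis=0)                  # simple average pooling

def predicted_similarity(emb1, emb2):
    """Squash an inner product into (0, 1); a trained model would learn this."""
    return 1.0 / (1.0 + np.exp(-(emb1 @ emb2)))

rng = np.random.default_rng(0)
d_in, d_hid = 8, 16
weights = [rng.normal(scale=0.3, size=(d_in, d_hid)),
           rng.normal(scale=0.3, size=(d_hid, d_hid))]

A1 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)   # toy graph 1
A2 = np.array([[0, 1], [1, 0]], dtype=float)                     # toy graph 2
X1, X2 = rng.normal(size=(3, d_in)), rng.normal(size=(2, d_in))

s = predicted_similarity(graph_embedding(A1, X1, weights),
                         graph_embedding(A2, X2, weights))
# A trained model would regress s onto a similarity derived from the
# normalized GED of training pairs.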

Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study

Several recent papers investigate Active Learning (AL) for mitigating the data dependence of deep learning for natural language processing. However, the applicability of AL to real-world problems remains an open question. While in supervised learning, practitioners can try many different methods, evaluating each against a validation set before selecting a model, AL affords no such luxury. Over the course of one AL run, an agent annotates its dataset, exhausting its labeling budget. Thus, given a new task, an active learner has no opportunity to compare models and acquisition functions. This paper provides a large-scale empirical study of deep active learning, addressing multiple tasks and, for each, multiple datasets, multiple models, and a full suite of acquisition functions. We find that across all settings, Bayesian active learning by disagreement, using uncertainty estimates provided either by Dropout or Bayes-by-Backprop, significantly improves over i.i.d. baselines and usually outperforms classic uncertainty sampling.
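
Bayesian active learning by disagreement (BALD) scores an unlabeled example by the mutual information between its label and the model parameters, estimated from stochastic forward passes such as MC dropout. A minimal numpy sketch of the acquisition step, with made-up probabilities standing in for a real model, looks like this:

import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=axis)

def bald_scores(mc_probs):
    """mc_probs: array of shape (T, N, C) with class probabilities from T
    stochastic forward passes (e.g. MC dropout) over N unlabeled examples."""
    mean_p = mc_probs.mean(axis=0)                     # (N, C)
    predictive_entropy = entropy(mean_p)               # H[E p]
    expected_entropy = entropy(mc_probs).mean(axis=0)  # E H[p]
    return predictive_entropy - expected_entropy       # mutual information

def acquire(mc_probs, k):
    """Indices of the k unlabeled examples with highest BALD score."""
    return np.argsort(bald_scores(mc_probs))[::-1][:k]

rng = np.random.default_rng(0)
logits = rng.normal(size=(20, 100, 5))                 # T=20 passes, N=100, C=5
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
print(acquire(probs, k=10))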

Read + Verify: Machine Reading Comprehension with Unanswerable Questions

Machine reading comprehension with unanswerable questions aims to abstain from answering when no answer can be inferred. Previous works using an additional no-answer option attempt to extract answers and detect unanswerable questions simultaneously, but they have struggled to produce high-quality answers and often misclassify questions. In this paper, we propose a novel read-then-verify system that combines a base neural reader with a sentence-level answer verifier trained to further validate whether the predicted answer is entailed by the input snippets. Moreover, we augment the base reader with two auxiliary losses to better handle answer extraction and no-answer detection, respectively, and investigate three different architectures for the answer verifier. Our experiments on the SQuAD 2.0 dataset show that our system achieves a score of 74.8 F1 on the development set, outperforming the previous best published model by more than 7 points.
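
The exact way the reader and verifier scores are combined is specific to the paper; as a hedged sketch of the general read-then-verify decision, one could blend the reader's answerability margin with the verifier's entailment probability and abstain when the blend is low (the weight alpha and the threshold below are assumptions).

def final_decision(answer_span, answer_score, noanswer_score,
                   verifier_entail_prob, alpha=0.5, threshold=0.0):
    """Hedged combination rule: mix the reader's answerability margin with
    the verifier's entailment probability; abstain when the blend is low."""
    margin = answer_score - noanswer_score               # from the reader
    blended = alpha * margin + (1 - alpha) * (2 * verifier_entail_prob - 1)
    return answer_span if blended > threshold else ""    # "" = no answer

print(final_decision("in 1998", 2.1, 1.4, verifier_entail_prob=0.8))  # answers
print(final_decision("in 1998", 0.9, 1.6, verifier_entail_prob=0.2))  # abstains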

Data Consistency Approach to Model Validation

In scientific inference problems, the underlying statistical modeling assumptions have a crucial impact on the end results. There exist, however, only a few automatic means for validating these fundamental modeling assumptions. The contribution in this paper is a general criterion to evaluate the consistency of a set of statistical models with respect to observed data. This is achieved by automatically gauging the models’ ability to generate data that is similar to the observed data. Importantly, the criterion follows from the model class itself and is therefore directly applicable to a broad range of inference problems with varying data types. The proposed data consistency criterion is illustrated and evaluated using three synthetic and two real data sets.
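
One concrete instance of "gauging a model's ability to generate data similar to the observed data" is a parametric predictive check: fit the model, simulate replicated data sets from it, and see how typical the observed value of a statistic is among the replicates. The sketch below is this generic check, not the paper's specific criterion; the Gaussian model, kurtosis statistic, and replicate count are assumptions.

import numpy as np

def consistency_check(y, fit, simulate, statistic, n_rep=2000, seed=0):
    """Generic sketch: fit the model, generate replicated data sets from it,
    and see how typical the observed statistic is among the replicates."""
    rng = np.random.default_rng(seed)
    theta = fit(y)
    t_obs = statistic(y)
    t_rep = np.array([statistic(simulate(theta, len(y), rng))
                      for _ in range(n_rep)])
    # Two-sided tail frequency: small values flag inconsistency.
    p = min((t_rep >= t_obs).mean(), (t_rep <= t_obs).mean()) * 2
    return min(p, 1.0)

# Example: a Gaussian model checked against heavy-tailed data via kurtosis.
fit = lambda y: (y.mean(), y.std())
simulate = lambda th, n, rng: rng.normal(th[0], th[1], size=n)
kurtosis = lambda y: np.mean(((y - y.mean()) / y.std()) ** 4)

rng = np.random.default_rng(1)
y_obs = rng.standard_t(df=3, size=500)       # data the Gaussian model misfits
print(consistency_check(y_obs, fit, simulate, kurtosis))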

Learning Supervised Topic Models for Classification and Regression from Crowds

The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, which are prone to ambiguity and noise and often involve high volumes of documents, renders learning under a single-annotator assumption unrealistic or impractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state-of-the-art approaches.

A bagging and importance sampling approach to Support Vector Machines

An importance sampling and bagging approach to solving the support vector machine (SVM) problem in the context of large databases is presented and evaluated. Our algorithm builds on the nearest-neighbor ideas presented in Camelo et al. (2015). As in that reference, the goal of the present proposal is to achieve a faster solution of the SVM problem without a significant loss in prediction error. The performance of the methodology is evaluated on benchmark examples, and theoretical aspects of subsampling methods are discussed.
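
A hedged sketch of the bagging side of the idea, using scikit-learn's SVC: train SVMs on small subsamples and aggregate by majority vote. Uniform subsampling is used below for simplicity, whereas the paper draws subsamples with nearest-neighbor-guided importance sampling; subsample size, number of models, and kernel settings are assumptions.

import numpy as np
from sklearn.svm import SVC

def bagged_svm(X, y, n_models=11, subsample=0.1, seed=0, **svm_kwargs):
    """Fit SVMs on small random subsamples (uniform here; the paper weights
    the draw toward informative points) and aggregate by majority vote."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.choice(n, size=max(2, int(subsample * n)), replace=False)
        models.append(SVC(**svm_kwargs).fit(X[idx], y[idx]))
    def predict(X_new):
        votes = np.stack([m.predict(X_new) for m in models])
        return np.sign(votes.sum(axis=0))          # assumes labels in {-1, +1}
    return predict

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
predict = bagged_svm(X, y, kernel="rbf", C=1.0)
print((predict(X[:500]) == y[:500]).mean())        # training-sample accuracy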

Randomized Least Squares Regression: Combining Model- and Algorithm-Induced Uncertainties

We analyze the uncertainties in the minimum norm solution of full-rank regression problems, arising from Gaussian linear models, computed by randomized (row-wise sampling and, more generally, sketching) algorithms. From a deterministic perspective, our structural perturbation bounds imply that least squares problems are less sensitive to multiplicative perturbations than to additive perturbations. From a probabilistic perspective, our expressions for the total expectation and variance with regard to both model- and algorithm-induced uncertainties are exact, hold for general sketching matrices, and make no assumptions on the rank of the sketched matrix. The relative differences between the total bias and variance on the one hand, and the model bias and variance on the other hand, are governed by two factors: (i) the expected rank deficiency of the sketched matrix, and (ii) the expected difference between projectors associated with the original and the sketched problems. A simple example, based on uniform sampling with replacement, illustrates the statistical quantities.
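
The "uniform sampling with replacement" example mentioned above can be sketched directly in numpy: sample and rescale rows, solve the smaller least squares problem, and compare against the full minimum norm solution (the problem sizes, noise level, and sketch size below are made up).

import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 20
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.1 * rng.normal(size=n)      # Gaussian linear model

x_full = np.linalg.lstsq(A, b, rcond=None)[0]  # exact minimum-norm solution

def sampled_solution(A, b, m, rng):
    """Uniform row sampling with replacement, rescaled so that the sketched
    normal equations are unbiased; then solve the smaller problem."""
    n = A.shape[0]
    idx = rng.integers(0, n, size=m)
    scale = np.sqrt(n / m)                     # weight for uniform sampling
    SA, Sb = scale * A[idx], scale * b[idx]
    return np.linalg.lstsq(SA, Sb, rcond=None)[0]

x_sketch = sampled_solution(A, b, m=200, rng=rng)
print(np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full))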

• Limit Laws of Planar Maps with Prescribed Vertex Degrees
• Transfer Learning Enhanced Common Spatial Pattern Filtering for Brain Computer Interfaces (BCIs): Overview and a New Approach
• Context-Aware Visual Policy Network for Sequence-Level Image Captioning
• Auto-Classification of Retinal Diseases in the Limit of Sparse Data Using a Two-Streams Machine Learning Model
• Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV
• Using path signatures to predict a diagnosis of Alzheimer’s disease
• Wild bootstrap logrank tests with broader power functions for testing superiority
• On the Strong Feller Property and Well-Posedness for SDEs with Functional, Locally Unbounded Drift
• The Computational Wiretap Channel
• Modelling Persistence Diagrams with Planar Point Processes, and Revealing Topology with Bagplots
• Classification of symmetry-protected topological many-body localized phases in one dimension
• Constant Arboricity Spectral Sparsifiers
• Predicting Human Trustfulness from Facebook Language
• Lattices from graph associahedra and subalgebras of the Malvenuto-Reutenauer algebra
• On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
• Steady states of lattice population models with immigration
• Steady state for the subcritical contact branching random walk on the lattice with the arbitrary number of offspring and with immigration
• QoE-Aware Resource Allocation for Small Cells
• How did the shape of your network change? (On detecting anomalies in static and dynamic networks via change of non-local curvatures)
• Probabilistic approach to a cell growth model
• Coordinated Scheduling and Spectrum Sharing via Matrix Fractional Programming
• Reduced basis methods – an application to variational discretization of parametrized elliptic optimal control problems
• Optimal Scheduling of an Isolated Microgrid with Battery Storage Considering Load and Renewable Generation Uncertainties
• Bus transport network analysis in Rio de Janeiro based on topological models using Social Networks
• Jitter-compensated VHT and its application to WSN clock synchronization
• Session Guarantees with Raft and Hybrid Logical Clocks
• Augmenting Statistical Machine Translation with Subword Translation of Out-of-Vocabulary Words
• A Two-Stage Approach for Combined Heat and Power Economic Emission Dispatch: Combining Multi-Objective Optimization with Integrated Decision Making
• Mitigation of Adversarial Attacks through Embedded Feature Selection
• Two-stage multi-objective OPF for AC/DC grids with VSC-HVDC: Incorporating decisions analysis into optimization process
• Optimal distributed generation planning in active distribution networks considering integration of energy storage
• An N Time-Slice Dynamic Chain Event Graph
• Ensemble-based Adaptive Single-shot Multi-box Detector
• Efficient Single-Shot Multibox Detector for Construction Site Monitoring
• Efficiently Learning Mixtures of Mallows Models
• Medical Image Imputation from Image Collections
• Convolutional Neural Networks based Intra Prediction for HEVC
• Combinatorial identities related to $2\times 2$ submatrices of recursive matrices
• Multicast With Prioritized Delivery: How Fresh is Your Data?
• Extremality, Stationarity and Generalized Separation of Collections of Sets
• Dynamic Routing on Deep Neural Network for Thoracic Disease Classification and Sensitive Area Localization
• Data Poisoning Attacks in Contextual Bandits
• Deep Learning Architecture for Voltage Stability Evaluation in Smart Grid based on Variational Autoencoders
• LP Relaxation and Tree Packing for Minimum $k$-cuts
• The Function Transformation Omics – Funomics
• Reinforcement Learning for Autonomous Defence in Software-Defined Networking
• Non-Asymptotic Behavior of the Maximum Likelihood Estimate of a Discrete Distribution
• Optimum Experimental Design for Interface Identification Problems
• Unsupervised adversarial domain adaptation for acoustic scene classification
• Joint Training of Low-Precision Neural Network with Quantization Interval Parameters
• Semi-Supervised Cluster Extraction via a Compressive Sensing Approach
• Inconsistency of diagonal scaling under high-dimensional limit: a replica approach
• Multiview Boosting by Controlling the Diversity and the Accuracy of View-specific Voters
• Extending finite-memory determinacy by Boolean combination of winning conditions
• Estimation in a Generalization of Bivariate Probit Models with Dummy Endogenous Regressors
• Single-Server Multi-Message Private Information Retrieval with Side Information
• On Bias and Rank
• Blind Ptychographic Phase Retrieval via Convergent Alternating Direction Method of Multipliers
• Towards Robotic Eye Surgery: Marker-free, Online Hand-eye Calibration using Optical Coherence Tomography Images
• The phase diagram for a multispecies left-permeable asymmetric exclusion process
• Optimal Distributed Weighted Set Cover Approximation
• Nonlinear predictable representation and $L^1$-solutions of second-order backward SDEs
• Motion Prediction of Traffic Actors for Autonomous Driving using Deep Convolutional Networks
• Importance mixing: Improving sample reuse in evolutionary policy search methods
• The Manickam-Miklós-Singhi Parameter of Graphs and Degree Sequences
• Topological Percolation on Hyperbolic Simplicial Complexes
• Neuromorphic Architecture for the Hierarchical Temporal Memory
• Popular Products and Continued Fractions
• Performance Analysis and Robustification of Single-query 6-DoF Camera Pose Estimation
• Towards a Theory-Guided Benchmarking Suite for Discrete Black-Box Optimization Heuristics: Profiling $(1+λ)$ EA Variants on OneMax and LeadingOnes
• Robust Compressive Phase Retrieval via Deep Generative Priors
• Fitting Probabilistic Index Models on Large Datasets
• Dual-mode Dynamic Window Approach to Robot Navigation with Convergence Guarantees
• Lifted Wasserstein Matcher for Fast and Robust Topology Tracking
• Co-evolution of nodes and links: diversity driven coexistence in cyclic competition of three species
• Statistical modeling for adaptive trait evolution in randomly evolving environment
• Cardinality Estimators do not Preserve Privacy
• Epithelium segmentation using deep learning in H&E-stained prostate specimens with immunohistochemistry as reference standard
• An Empirical Evaluation of the Approximation of Subjective Logic Operators Using Monte Carlo Simulations
• A High Order Method for Pricing of Financial Derivatives using Radial Basis Function generated Finite Differences
• Exploring how innovation strategies at time of crisis influence performance: a cluster analysis perspective
• Estimating and accounting for unobserved covariates in high dimensional correlated data
• Whole-Slide Mitosis Detection in H&E Breast Histology Using PHH3 as a Reference to Train Distilled Stain-Invariant Convolutional Networks
• Membership criteria and containments of powers of monomial ideals
• Optimal scheduling of energy storage resources
• Correlated Multi-armed Bandits with a Latent Random Source
• On Representations of Graphs as Two-Distance Sets
• On the Separability of Ergodic Fading MIMO Channels: A Lattice Coding Approach
• Mixed-Level Column Augmented Uniform Designs
• Weak convergences of marked empirical processes with applications to goodness-of-fit tests for Markovian processes
• Characterizing the public perception of WhatsApp through the lens of media
• Decentralized Dictionary Learning Over Time-Varying Digraphs
• Recuperation of Regenerative Braking Energy in Electric Rail Transit Systems
• All minor-minimal apex obstructions with connectivity two
• First Steps Toward CNN based Source Classification of Document Images Shared Over Messaging App
• Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation
