What's new on arXiv

Reconfigurable Inverted Index

Existing approximate nearest neighbor search systems suffer from two fundamental problems that are of practical importance but have not received sufficient attention from the research community. First, although existing systems perform well over the whole database, it is difficult to run a search over a subset of the database. Second, there has been little discussion of the performance degradation that occurs after many new items are added to a system. We develop a reconfigurable inverted index (Rii) to resolve these two issues. Based on the standard IVFADC system, we design a data layout in which items are stored linearly. This enables us to run a subset search efficiently by switching to a linear PQ scan when the subset is small. Owing to the linear layout, the data structure can be dynamically adjusted after new items are added, maintaining the speed of the system. Extensive comparisons show that Rii achieves performance comparable to state-of-the-art systems such as Faiss.
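
The switch to a linear PQ scan is easy to picture on top of standard product quantization. Below is a minimal numpy sketch of an asymmetric-distance (ADC) linear scan over PQ codes restricted to a subset of item identifiers; the data layout, sizes and names are illustrative assumptions, not Rii's actual implementation.

import numpy as np

def adc_linear_scan(query, codes, codebooks, subset_ids, topk=5):
    """Asymmetric-distance (ADC) linear scan of PQ codes over a subset of ids.

    query      : (D,) float query vector
    codes      : (N, M) uint8 PQ codes, one code per sub-quantizer per item
    codebooks  : (M, 256, D // M) sub-quantizer centroids
    subset_ids : 1-D integer array of the item ids to search
    """
    M, _, Ds = codebooks.shape
    # Look-up table: squared distance from each query sub-vector to every
    # centroid of the corresponding sub-quantizer.
    q_sub = query.reshape(M, Ds)
    table = ((codebooks - q_sub[:, None, :]) ** 2).sum(axis=2)      # (M, 256)
    # Approximate distances for the requested subset only.
    sub_codes = codes[subset_ids]                                   # (S, M)
    dists = table[np.arange(M), sub_codes].sum(axis=1)              # (S,)
    order = np.argsort(dists)[:topk]
    return subset_ids[order], dists[order]

# Toy usage with random data: D=32, M=4 sub-quantizers, 256 centroids each.
rng = np.random.default_rng(0)
D, M, N = 32, 4, 1000
codebooks = rng.normal(size=(M, 256, D // M))
codes = rng.integers(0, 256, size=(N, M), dtype=np.uint8)
ids, dists = adc_linear_scan(rng.normal(size=D), codes, codebooks,
                             subset_ids=np.arange(100))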

Text classification using capsules

This paper presents an empirical exploration of the use of capsule networks for text classification. While capsule networks have been shown to be effective for image classification, their validity in the text domain has not been explored. In this paper, we show that capsule networks indeed have potential for text classification and offer several advantages over convolutional neural networks. We further suggest a simple routing method that effectively reduces the computational complexity of dynamic routing. We use seven benchmark datasets to demonstrate that capsule networks, together with the proposed routing method, provide comparable results.
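
The abstract does not spell out the simplified routing, but the baseline it aims to cheapen is routing-by-agreement. A minimal numpy sketch of that standard dynamic routing step (Sabour et al., 2017) follows; capsule counts, dimensions and the iteration number are illustrative, and this is not the paper's method.

import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Non-linearity that keeps vector orientation and scales the norm into [0, 1)."""
    sq_norm = (s ** 2).sum(axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, n_iters=3):
    """Standard routing-by-agreement.

    u_hat : (n_in, n_out, d_out) prediction vectors from the lower capsules.
    Returns (n_out, d_out) output capsules.
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                                # routing logits
    for _ in range(n_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)                 # (n_out, d_out)
        v = squash(s)
        b = b + (u_hat * v[None]).sum(axis=-1)                 # agreement update
    return v

# Toy usage: 6 lower capsules routed to 3 upper capsules of dimension 8.
v = dynamic_routing(np.random.randn(6, 3, 8))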

Interpretable Time Series Classification using All-Subsequence Learning and Symbolic Representations in Time and Frequency Domains

The time series classification literature has expanded rapidly over the last decade, with many new classification approaches published each year. The research focus has mostly been on improving the accuracy and efficiency of classifiers, while their interpretability has been somewhat neglected. Classifier interpretability has become a critical constraint for many application domains, and the introduction of the EU GDPR ‘right to explanation’ legislation in May 2018 is likely to further emphasize the importance of explainable learning algorithms. In this work we analyse the state-of-the-art for time series classification and propose new algorithms that aim to maintain classifier accuracy and efficiency while keeping interpretability as a key design constraint. We present new time series classification algorithms that advance the state-of-the-art by implementing the following three key ideas: (1) Multiple resolutions of symbolic approximations: we combine symbolic representations obtained using different parameters; (2) Multiple domain representations: we combine symbolic approximations in the time (e.g., SAX) and frequency (e.g., SFA) domains; (3) Efficient navigation of a huge symbolic-word space: we adapt a symbolic sequence classifier named SEQL to work with multiple domain representations (e.g., SAX-SEQL, SFA-SEQL), and use its greedy feature selection strategy to effectively filter the best features for each representation. We show that a multi-resolution multi-domain linear classifier, SAX-SFA-SEQL, achieves accuracy similar to the state-of-the-art COTE ensemble and to a recent deep learning method (FCN), but uses a fraction of the time required by either. We discuss the accuracy, efficiency and interpretability of our proposed algorithms. To further analyse the interpretability of our classifiers, we present a case study on an ecology benchmark.
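
To make the symbolic-approximation idea concrete, here is a minimal numpy sketch of a SAX transform: z-normalise the series, reduce it by piecewise aggregate approximation (PAA), and discretise each segment mean against Gaussian breakpoints. The segment count and alphabet size are illustrative choices, not the authors' settings.

import numpy as np

# Breakpoints that split the standard normal into 4 equiprobable regions (alphabet size 4).
BREAKPOINTS_4 = np.array([-0.6745, 0.0, 0.6745])

def sax(series, n_segments=8, breakpoints=BREAKPOINTS_4):
    """Return a SAX word (one integer symbol per segment) for a 1-D series."""
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-12)          # z-normalise
    # Piecewise Aggregate Approximation: mean of each of n_segments chunks.
    paa = np.array([chunk.mean() for chunk in np.array_split(x, n_segments)])
    # Discretise each PAA value against the breakpoints.
    return np.digitize(paa, breakpoints)            # symbols in {0, 1, 2, 3}

word = sax(np.sin(np.linspace(0, 6.28, 128)), n_segments=8)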

Review of Different Privacy Preserving Techniques in PPDP

Big data is a term used for very large data sets that are difficult to store and process. Analysing such large amounts of data can lead to information loss. The main goal of this paper is to share data in a way that preserves privacy while keeping information loss to a minimum. Data held by government agencies, universities and medical institutions are necessary for analysing and predicting trends and patterns, but privacy regulations may prevent the data owner from sharing them [1]. By analysing several anonymization algorithms, such as k-anonymity, l-diversity and t-closeness, one can achieve privacy with minimal loss. However, these techniques have limitations, and a trade-off between privacy and information loss must be maintained. We then introduce the approach of Differential Privacy.
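
As a concrete reference point for one of the reviewed notions, a minimal pandas sketch of checking k-anonymity over a set of quasi-identifier columns follows; the column names and the toy table are made up for illustration.

import pandas as pd

def is_k_anonymous(df, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    group_sizes = df.groupby(quasi_identifiers).size()
    return bool((group_sizes >= k).all())

# Illustrative table: ZIP code and age are quasi-identifiers, diagnosis is sensitive.
records = pd.DataFrame({
    "zip":       ["130**", "130**", "148**", "148**"],
    "age":       ["<30",   "<30",   "30-40", "30-40"],
    "diagnosis": ["flu",   "cold",  "flu",   "asthma"],
})
print(is_k_anonymous(records, ["zip", "age"], k=2))   # True for this toy table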

Detecting deviations from second-order stationarity in locally stationary functional time series

Parikh Matrices for Powers of Words

Certain upper triangular matrices, termed Parikh matrices, are often used in the combinatorial study of words. Given a word, its Parikh matrix elegantly computes the number of occurrences of certain predefined subwords in that word. In this paper, we compute the Parikh matrix of any word raised to an arbitrary power. Furthermore, we propose canonical decompositions of both Parikh matrices and words into normal forms. Finally, given a Parikh matrix, we establish the relation between its normal form and the normal forms of the words in the corresponding M-equivalence class.
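
For readers new to the object: over an ordered alphabet a1 < a2 < ... < ak, the Parikh matrix of a word is the product of elementary upper-triangular matrices, its superdiagonal counts individual letters, and higher entries count scattered subwords such as a1a2. A short numpy sketch under that standard definition (the two-letter alphabet is just an example):

import numpy as np

def parikh_matrix(word, alphabet="ab"):
    """(k+1)x(k+1) Parikh matrix of `word` over the given ordered alphabet."""
    k = len(alphabet)
    pos = {ch: i for i, ch in enumerate(alphabet)}
    M = np.eye(k + 1, dtype=int)
    for ch in word:
        E = np.eye(k + 1, dtype=int)
        E[pos[ch], pos[ch] + 1] = 1        # elementary matrix for letter ch
        M = M @ E
    return M

# M[0, 1] counts 'a', M[1, 2] counts 'b', M[0, 2] counts the scattered subword 'ab'.
print(parikh_matrix("abab"))
# [[1 2 3]
#  [0 1 2]
#  [0 0 1]]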

A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization

In this paper, we introduce an embedding model, named CapsE, that explores a capsule network to model relationship triples \textit{(subject, relation, object)}. Our CapsE represents each triple as a 3-column matrix in which each column vector is the embedding of one element of the triple. This 3-column matrix is then fed to a convolution layer where multiple filters operate to generate different feature maps. These feature maps are used to construct capsules in the first capsule layer. Capsule layers are connected via a dynamic routing mechanism. The last capsule layer consists of a single capsule that produces a vector output, whose length is used to measure the plausibility of the triple. Our proposed CapsE obtains state-of-the-art link prediction results for knowledge graph completion on two benchmark datasets, WN18RR and FB15k-237, and outperforms strong search personalization baselines on the SEARCH17 dataset.
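
To visualise the first stage only, here is a hedged PyTorch sketch of representing a triple as a k x 3 matrix of embeddings and convolving it with 1x3 filters to obtain feature maps; the capsule layers and routing are omitted, and the embedding size, filter count and vocabularies are illustrative rather than the paper's settings.

import torch
import torch.nn as nn

k, n_filters, n_entities, n_relations = 50, 8, 1000, 20   # illustrative sizes
ent_emb = nn.Embedding(n_entities, k)
rel_emb = nn.Embedding(n_relations, k)
conv = nn.Conv2d(1, n_filters, kernel_size=(1, 3))        # slides over the k rows

def feature_maps(s, r, o):
    """Stack (subject, relation, object) embeddings into a k x 3 matrix,
    then convolve each row with 1x3 filters -> (n_filters, k) feature maps."""
    triple = torch.stack([ent_emb(s), rel_emb(r), ent_emb(o)], dim=-1)  # (B, k, 3)
    fmap = conv(triple.unsqueeze(1))        # (B, n_filters, k, 1)
    return fmap.squeeze(-1)                 # (B, n_filters, k)

out = feature_maps(torch.tensor([1]), torch.tensor([2]), torch.tensor([3]))
print(out.shape)   # torch.Size([1, 8, 50])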

Modeling Semantics with Gated Graph Neural Networks for Knowledge Base Question Answering

Most approaches to Knowledge Base Question Answering are based on semantic parsing. In this paper, we address the problem of learning vector representations for complex semantic parses that consist of multiple entities and relations. Previous work largely focused on selecting the correct semantic relations for a question and disregarded the structure of the semantic parse: the connections between entities and the directions of the relations. We propose to use Gated Graph Neural Networks to encode the graph structure of the semantic parse. We show on two data sets that the graph networks outperform all baseline models that do not explicitly model the structure. The error analysis confirms that our approach can successfully process complex semantic parses.
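
A minimal PyTorch sketch of one gated graph neural network propagation step, in which messages are aggregated over an adjacency matrix and a GRU cell updates the node states; the graph, the sizes and the single edge type are illustrative simplifications, not the authors' model.

import torch
import torch.nn as nn

class GGNNStep(nn.Module):
    """One propagation step of a gated graph neural network."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.msg = nn.Linear(hidden_dim, hidden_dim)   # edge transformation
        self.gru = nn.GRUCell(hidden_dim, hidden_dim)  # gated state update

    def forward(self, h, adj):
        # h: (n_nodes, hidden_dim), adj: (n_nodes, n_nodes) adjacency matrix.
        a = adj @ self.msg(h)            # aggregate transformed neighbour states
        return self.gru(a, h)            # gated update of every node state

n_nodes, d = 4, 16
adj = torch.tensor([[0, 1, 0, 0],
                    [0, 0, 1, 1],
                    [0, 0, 0, 0],
                    [0, 0, 0, 0]], dtype=torch.float)
h = torch.randn(n_nodes, d)
step = GGNNStep(d)
for _ in range(3):                       # a few propagation steps
    h = step(h, adj)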

Learning Explanations from Language Data

PatternAttribution is a recent method, introduced in the vision domain, that explains classifications of deep neural networks. We demonstrate that it also generates meaningful interpretations in the language domain.

A Matching Based Theoretical Framework for Estimating Probability of Causation

The concept of Probability of Causation (PC) is critically important in legal contexts and can help in many other domains. While it has been around since 1986, current operationalizations can obtain only the minimum and maximum values of PC, and do not apply to purely observational data. We present a theoretical framework to estimate the distribution of PC from experimental and from purely observational data. We illustrate additional problems of the existing operationalizations and show how our method can be used to address them. We also provide two illustrative examples of how our method is used and how factors like sample size or rarity of events can influence the distribution of PC. We hope this will make the concept of PC more widely usable in practice.
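
For orientation only (this is the classical operationalization the paper improves on, not its new estimator): under exogeneity, the probability of causation in its probability-of-necessity reading is, in the usual Tian-Pearl-style formulation, only bounded, and it becomes point-identified when monotonicity is additionally assumed:

\max\left\{0,\ \frac{P(y \mid x) - P(y \mid x')}{P(y \mid x)}\right\}
\ \le\ \mathrm{PC}\ \le\
\min\left\{1,\ \frac{P(y' \mid x')}{P(y \mid x)}\right\},
\qquad
\mathrm{PC} \stackrel{\text{monotonicity}}{=} \frac{P(y \mid x) - P(y \mid x')}{P(y \mid x)} .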

Multi-Task Learning for Sequence Tagging: An Empirical Study

We study three general multi-task learning (MTL) approaches on 11 sequence tagging tasks. Our extensive empirical results show that in about 50% of the cases, jointly learning all 11 tasks improves upon either independent or pairwise learning of the tasks. We also show that pairwise MTL can tell us which tasks can benefit others and which tasks benefit from being learned jointly with others. In particular, we identify tasks that can always benefit others as well as tasks that are always harmed by others. Interestingly, one of our MTL approaches yields task embeddings that reveal a natural clustering into semantic and syntactic tasks. Our inquiries open the door to further use of MTL in NLP.

Semiparametric Bayesian causal inference using Gaussian process priors

We develop a semiparametric Bayesian approach for estimating the mean response in a missing data model with binary outcomes and a nonparametrically modelled propensity score. Equivalently we estimate the causal effect of a treatment, correcting nonparametrically for confounding. We show that standard Gaussian process priors satisfy a semiparametric Bernstein-von Mises theorem under smoothness conditions. We further propose a novel propensity score-dependent prior that provides efficient inference under strictly weaker conditions. We also show that it is theoretically preferable to model the covariate distribution with a Dirichlet process or Bayesian bootstrap, rather than modelling the covariate density using a Gaussian process prior.
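
As background notation (mine, not the paper's): with outcome Y observed only when R = 1, covariates X, and a propensity score \pi(x) = P(R = 1 \mid X = x), the mean-response functional and the efficient influence function that typically drive such semiparametric Bernstein-von Mises results are

\psi = \mathbb{E}[Y] = \mathbb{E}\big[\,b(X)\,\big],
\qquad b(x) = \mathbb{E}(Y \mid X = x, R = 1)
\quad \text{(missing at random)},

\tilde{\psi}(X, R, Y) = \frac{R\,\{Y - b(X)\}}{\pi(X)} + b(X) - \psi .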

iNNvestigate neural networks!

In recent years, deep neural networks have revolutionized many application domains of machine learning and are key components of many critical decision or predictive processes. Therefore, it is crucial that domain specialists can understand and analyze the actions and predictions even of the most complex neural network architectures. Despite these arguments, neural networks are often treated as black boxes. In an attempt to alleviate this shortcoming, many analysis methods have been proposed, yet the lack of reference implementations often makes a systematic comparison between methods a major effort. The presented library, iNNvestigate, addresses this by providing a common interface and out-of-the-box implementations for many analysis methods, including the reference implementation for PatternNet and PatternAttribution as well as for LRP methods. To demonstrate the versatility of iNNvestigate, we provide an analysis of image classifications for a variety of state-of-the-art neural network architectures.
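
A sketch of the intended workflow, based on my recollection of the library's documented create-analyzer interface; the tiny stand-in model, the analyzer name and the random input are placeholders, and the exact API may differ between versions.

import numpy as np
import keras
import innvestigate
import innvestigate.utils as iutils

# Tiny stand-in classifier; in practice this would be a pre-trained network.
model = keras.models.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    keras.layers.Dense(3, activation="softmax"),
])

model_wo_softmax = iutils.model_wo_softmax(model)                  # strip the softmax
analyzer = innvestigate.create_analyzer("lrp.epsilon", model_wo_softmax)
analysis = analyzer.analyze(np.random.randn(1, 8))   # attribution map, same shape as the input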

Estimating Heterogeneous Causal Effects in the Presence of Irregular Assignment Mechanisms

This paper provides a link between causal inference and machine learning techniques – specifically, Classification and Regression Trees (CART) – in observational studies where the receipt of the treatment is not randomized, but the assignment to the treatment can be assumed to be randomized (an irregular assignment mechanism). The paper contributes to the growing applied machine learning literature on causal inference by proposing a modified version of the Causal Tree (CT) algorithm to draw causal inference from an irregular assignment mechanism. The proposed method is developed by merging the CT approach with the instrumental variable framework for causal inference, hence the name Causal Tree with Instrumental Variable (CT-IV). Compared to CT, the main strength of CT-IV is that it deals more efficiently with heterogeneous causal effects, as demonstrated by a series of numerical results on synthetic data. The proposed algorithm is then used to evaluate a public policy implemented by the Tuscan Regional Administration (Italy), which aimed at easing access to credit for small firms. In this context, CT-IV breaks fresh ground for target-based policies, identifying interesting heterogeneous causal effects.

Understanding training and generalization in deep learning by Fourier analysis

Background: It remains an open research question to theoretically understand why Deep Neural Networks (DNNs), equipped with many more parameters than training data and trained by (stochastic) gradient-based methods, often achieve remarkably low generalization error. Contribution: We study DNN training through Fourier analysis. Our theoretical framework explains: i) why DNNs trained with (stochastic) gradient-based methods give low-frequency components of the target function a higher priority during training; ii) why small initialization leads to good generalization while preserving the DNN's ability to fit any function. These results are further confirmed by experiments in which DNNs fit natural images, one-dimensional functions, and the MNIST dataset.
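
An illustrative experiment (not the authors' code): fit a 1-D target containing a low and a high frequency with a small network and watch which Fourier component of the fit converges first; the architecture, learning rate and step counts are arbitrary choices.

import numpy as np
import torch
import torch.nn as nn

x = torch.linspace(0, 1, 256).unsqueeze(1)
y = torch.sin(2 * np.pi * x) + 0.5 * torch.sin(2 * np.pi * 10 * x)   # low + high frequency

net = nn.Sequential(nn.Linear(1, 128), nn.Tanh(), nn.Linear(128, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2001):
    loss = ((net(x) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        # Relative residual of the two target frequencies in the current fit;
        # typically the low-frequency (bin 1) error shrinks before the bin-10 error.
        err = np.fft.rfft((net(x) - y).detach().numpy().ravel())
        tgt = np.fft.rfft(y.numpy().ravel())
        print(step, abs(err[1]) / abs(tgt[1]), abs(err[10]) / abs(tgt[10]))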

Simple Root Cause Analysis by Separable Likelihoods

Root Cause Analysis (RCA) for anomalies is challenging because of the trade-off between accuracy and the explanatory friendliness required for industrial applications. In this paper we propose a framework for simple and friendly RCA within the Bayesian regime under certain restrictions imposed on the predictive posterior (namely that the Hessian at the mode is diagonal, referred to here as \emph{separability}). We show that this assumption is satisfied for important base models, including the Multinomial, Dirichlet-Multinomial and Naive Bayes models. To demonstrate the usefulness of the framework, we embed it into a Bayesian network and validate it on web server error logs (a real-world data set).

Rank-1 Convolutional Neural Network

In this paper, we propose a convolutional neural network (CNN) with 3-D rank-1 filters, each composed as the outer product of 1-D filters. After training, the 3-D rank-1 filters can be decomposed into 1-D filters at test time for fast inference. We train 3-D rank-1 filters instead of consecutive 1-D filters because this setting yields a better gradient flow, which makes training possible even in cases where a network with consecutive 1-D filters cannot be trained. The 3-D rank-1 filters are updated in every epoch both by the gradient flow and by the outer product of the 1-D filters, where the gradient flow seeks a solution that minimizes the loss function while the outer product operation keeps the filter parameters on a rank-1 subspace. Furthermore, we show that convolution with the rank-1 filters results in low-rank outputs, constraining the final output of the CNN to lie on a low-dimensional subspace as well.
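
A small numpy check of the underlying identity (not the authors' training scheme): a 3-D rank-1 kernel built as an outer product of three 1-D filters gives the same correlation as applying the three 1-D filters sequentially along each axis, which is what makes the fast test-time decomposition possible.

import numpy as np
from scipy.ndimage import correlate, correlate1d

rng = np.random.default_rng(0)
a, b, c = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)
kernel = np.einsum('i,j,k->ijk', a, b, c)        # 3-D rank-1 filter
volume = rng.normal(size=(16, 16, 16))           # toy input volume

full = correlate(volume, kernel, mode='constant')           # one 3-D pass
separable = correlate1d(volume, a, axis=0, mode='constant') # three 1-D passes
separable = correlate1d(separable, b, axis=1, mode='constant')
separable = correlate1d(separable, c, axis=2, mode='constant')

print(np.allclose(full, separable))   # True: the rank-1 filter decomposes into 1-D filters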

• Augmenting word2vec with latent Dirichlet allocation within a clinical application
• Existential monadic second order convergence law fails on sparse random graphs
• Globally Convergent Type-I Anderson Acceleration for Non-Smooth Fixed-Point Iterations
• Local Decodability of the Burrows-Wheeler Transform
• Robot Safe Interaction System for Intelligent Industrial Co-Robots
• Multimodal Differential Network for Visual Question Generation
• A simulation study to distinguish prompt photon from $π^0$ and beam halo in a granular calorimeter using deep networks
• Convex Union Representability and Convex Codes
• Various Optimality Criteria for the Prediction of Individual Response Curves
• Time-Varying Semidefinite Programs
• Language Guided Fashion Image Manipulation with Feature-wise Transformations
• Multi-Cell Massive MIMO in LoS
• On rigidity of unit-bar frameworks
• On configuration spaces and Whitehouse’s lifts of the Eulerian representations
• PAC-Battling Bandits with Plackett-Luce: Tradeoff between Sample Complexity and Subset Size
• Scene-LSTM: A Model for Human Trajectory Prediction
• Saturation numbers for Ramsey-minimal graphs
• Off-diagonal ordered Ramsey numbers of matchings
• 3D Geometry-Aware Semantic Labeling of Outdoor Street Scenes
• Confidence penalty, annealing Gaussian noise and zoneout for biLSTM-CRF networks for named entity recognition
• Modeling and Simulation of Regenerative Braking Energy in DC Electric Rail Systems
• Fooling Polytopes
• Dynamic Pricing for Revenue Maximization in Mobile Social Data Market with Network Effects
• Optimization of a perturbed sweeping process by constrained discontinuous controls
• Faster and More Robust Mesh-based Algorithms for Obstacle k-Nearest Neighbour
• A Nonparametric Bayesian Method for Clustering of High-Dimensional Mixed Dataset
• A short note on checkerboard colorable twisted duals
• Relax, and Accelerate: A Continuous Perspective on ADMM
• Optimal control of Markov-modulated multiclass many-server queues
• Constructing Non-isomorphic Signless Laplacian Cospectral Graphs
• Privacy Preserving and Cost Optimal Mobile Crowdsensing using Smart Contracts on Blockchain
• Estimating the Distribution of Random Parameters in a Diffusion Equation Forward Model for a Transdermal Alcohol Biosensor
• Rigid colourings of hypergraphs and contiguity
• Speeding Up Constrained $k$-Means Through 2-Means
• Time Perception Machine: Temporal Point Processes for the When, Where and What of Activity Prediction
• Regularizing Neural Machine Translation by Target-bidirectional Agreement
• Game Theoretic Analysis for Joint Sponsored and Edge Caching Content Service Market
• A Transfer Learning based Feature-Weak-Relevant Method for Image Clustering
• Language Style Transfer from Sentences with Arbitrary Unknown Styles
• Stretched and compressed exponentials in the relaxation dynamics of a metallic glass-forming melt
• Long properly coloured cycles in edge-coloured graphs
• Live Video Comment Generation Based on Surrounding Frames and Live Comments
• Directed Policy Gradient for Safe Reinforcement Learning with Human Advice
• Eigenvectors of Deformed Wigner Random Matrices
• Regularity and Sensitivity for McKean-Vlasov Type SPDEs Generated by Stable-like Processes
• Improved Recovery of Analysis Sparse Vectors in Presence of Prior Information
• Towards Audio to Scene Image Synthesis using Generative Adversarial Network
• Enumerating five families of pattern-avoiding inversion sequences; and introducing the powered Catalan numbers
• AsySPA: An Exact Asynchronous Algorithm for Convex Optimization Over Digraphs
• Methodology for identifying study sites in scientific corpus
• DenseRAN for Offline Handwritten Chinese Character Recognition
• Parsimonious HMMs for Offline Handwritten Chinese Text Recognition
• Network Flows that Solve Least Squares for Linear Equations
• Symmetric decompositions and real-rootedness
• A Preliminary Study On Emerging Cloud Computing Security Challenges
• On grounded L-graphs and their relatives
• Relaxed Schedulers Can Efficiently Parallelize Iterative Algorithms
• Quantization effects and convergence properties of rigid formation control systems with quantized distance measurements
• A Forward-Backward Splitting Method for Monotone Inclusions Without Cocoercivity
• Automatic Reference-Based Evaluation of Pronoun Translation Misses the Point
• On the Shannon entropy of the number of vertices with zero in-degree in randomly oriented hypergraphs
• Exponential loss of memory for the 2-dimensional Allen-Cahn equation with small noise
• Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length
• Stealth Attacks on the Smart Grid
• Faster deterministic parameterized algorithm for k-Path
• Automatic Plaque Detection in IVOCT Pullbacks Using Convolutional Neural Networks
• Rapid Adaptation of Neural Machine Translation to New Languages
• Equidistributed statistics on Fishburn matrices and permutations
• Passing through a stack $k$ times with reversals
• New optimal control problems in density functional theory motivated by photovoltaics
• Neural Semi-Markov Conditional Random Fields for Robust Character-Based Part-of-Speech Tagging
• Separation-type combinatorial invariants for triangulations of manifolds
• On criteria for rook equivalence of Ferrers boards
• A Reference Architecture for Datacenter Scheduling: Extended Technical Report
• Fast Video Shot Transition Localization with Deep Structured Models
• Rook and Wilf equivalence of integer partitions
• Applications of molecular communications to medicine: a survey
• On the Distribution of Range for Tree-Indexed Random Walks
• Fully commutative elements of the complex reflection groups
• A molecular communications model for drug delivery
• Symmetric Dellac configurations
• Graph-Based Controller Synthesis for Safety-Constrained, Resilient Systems
• BACH: Grand Challenge on Breast Cancer Histology Images
• Clustering genomic words in human DNA using peaks and trends of distributions
• Generalized Multivariate Extreme Value Models for Explicit Route Choice Sets
• The monotonicity properties for the rank of overpartitions
• Unsupervised Hard Example Mining from Videos for Improved Object Detection
• Visual Sensor Network Reconfiguration with Deep Reinforcement Learning
• Automatic Playlist Continuation through a Composition of Collaborative Filters
• Global Complexity Analysis of Inexact Successive Quadratic Approximation methods for Regularized Optimization under Mild Assumptions
• Fast, Better Training Trick — Random Gradient
• Randomized Hamiltonian Monte Carlo as Scaling Limit of the Bouncy Particle Sampler and Dimension-Free Convergence Rates
• Evaluation of estimation approaches on the quality and robustness of collision warning system
• What is Unique in Individual Gait Patterns? Understanding and Interpreting Deep Learning in Gait Analysis
• Precise Performance Analysis of the LASSO under Matrix Uncertainties
• Analysing Multiple Epidemic Data Sources
• Comparing morphological complexity of Spanish, Otomi and Nahuatl
• Noncoherent Multiantenna Receivers for Cognitive Backscatter System with Multiple RF Sources
• Generating Paths with WFC
• A simple counterexample to the Monge ansatz in multi-marginal optimal transport, convex geometry of the set of Kantorovich plans, and the Frenkel-Kontorova model
• Improving Shape Deformation in Unsupervised Image-to-Image Translation
• Hidden Fluid Mechanics: A Navier-Stokes Informed Deep Learning Framework for Assimilating Flow Visualization Data
• Stable limits for Markov chains via the Principle of Conditioning
• Angular-Based Word Meta-Embedding Learning
• Vision-Based Preharvest Yield Mapping for Apple Orchards
• Disentangled Representation Learning for Text Style Transfer
• Estimating the Density of States of Frustrated Spin Systems
• REGMAPR – A Recipe for Textual Matching
• Interactive Launch of 16,000 Microsoft Windows Instances on a Supercomputer
• Functional Large Deviations for Cox Processes and $Cox/G/\infty$ Queues, with a Biological Application
• Multivariate Geometric Anisotropic Cox Processes
• Patrolling on Dynamic Ring Networks
• Small-time fluctuations for the bridge in a model class of hypoelliptic diffusions of weak Hörmander type
• Moments of the SHE under delta initial measure
• Large-Scale Study of Curiosity-Driven Learning
