What's new on arXiv

Inverse Conditional Probability Weighting with Clustered Data in Causal Inference

Estimating the average treatment causal effect in clustered data often involves dealing with unmeasured cluster-specific confounding variables. Such variables may be correlated with the measured unit covariates and outcome. When the correlations are ignored, the causal effect estimation can be biased. By utilizing sufficient statistics, we propose an inverse conditional probability weighting (ICPW) method, which is robust to both (i) the correlation between the unmeasured cluster-specific confounding variable and the covariates and (ii) the correlation between the unmeasured cluster-specific confounding variable and the outcome. Assumptions and conditions for the ICPW method are presented. We establish the asymptotic properties of the proposed estimators. Simulation studies and a case study are presented for illustration.
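For readers unfamiliar with the weighting idea that ICPW builds on, here is a minimal sketch of the standard normalized (Hajek) inverse probability weighting estimator of the average treatment effect. It is not the paper's ICPW estimator: it assumes the propensity scores are already known and ignores clustering entirely. The function name `ipw_ate` and its arguments are hypothetical, chosen only for illustration.

```python
def ipw_ate(outcomes, treatments, propensities):
    """Normalized (Hajek) inverse-probability-weighted estimate of the ATE.

    outcomes:      observed outcome per unit
    treatments:    1 for treated units, 0 for control units
    propensities:  assumed-known probability of treatment per unit
    """
    treated_num = treated_den = 0.0
    control_num = control_den = 0.0
    for y, t, p in zip(outcomes, treatments, propensities):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if t == 1:
            w = 1.0 / p            # treated units are weighted by 1 / p
            treated_num += w * y
            treated_den += w
        else:
            w = 1.0 / (1.0 - p)    # control units are weighted by 1 / (1 - p)
            control_num += w * y
            control_den += w
    if treated_den == 0.0 or control_den == 0.0:
        raise ValueError("need at least one treated and one control unit")
    # Difference of the two weighted outcome means.
    return treated_num / treated_den - control_num / control_den
```

With constant propensities the estimate reduces to the plain difference in group means. The bias discussed in the abstract arises when the true propensity depends on an unmeasured cluster-level variable that this sketch does not model.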

Combining Graph-based Dependency Features with Convolutional Neural Network for Answer Triggering

Answer triggering is the task of selecting the best-suited answer for a given question from a set of candidate answers, if one exists. In this paper, we present a hybrid deep learning model for answer triggering, which combines several dependency graph based alignment features, namely graph edit distance, graph-based similarity and dependency graph coverage, with dense vector embeddings from a Convolutional Neural Network (CNN). Our experiments on the WikiQA dataset show that such a combination can more accurately trigger a candidate answer compared to the previous state-of-the-art models. A comparative study on the WikiQA dataset shows a 5.86% absolute F-score improvement at the question level.

Hybrid Subspace Learning for High-Dimensional Data

The high-dimensional data setting, in which $p \gg n$, is a challenging statistical paradigm that appears in many real-world problems. In this setting, learning a compact, low-dimensional representation of the data can substantially help distinguish signal from noise. One way to achieve this goal is to perform subspace learning to estimate a small set of latent features that capture the majority of the variance in the original data. Most existing subspace learning models, such as PCA, assume that the data can be fully represented by its embedding in one or more latent subspaces. However, in this work, we argue that this assumption is not suitable for many high-dimensional datasets; often only some variables can easily be projected to a low-dimensional space. We propose a hybrid dimensionality reduction technique in which some features are mapped to a low-dimensional subspace while others remain in the original space. Our model leads to more accurate estimation of the latent space and lower reconstruction error. We present a simple optimization procedure for the resulting biconvex problem and show synthetic data results that demonstrate the advantages of our approach over existing methods. Finally, we demonstrate the effectiveness of this method for extracting meaningful features from both gene expression and video background subtraction datasets.

Is Robustness the Cost of Accuracy? -- A Comprehensive Study on the Robustness of 18 Deep Image Classification Models

Mathematical Foundations of Probability Theory

In the footsteps of the book "Measure Theory and Integration By and For the Learner" of our series in Probability Theory and Statistics, we intended to devote a special volume to the probabilistic aspects of that theory. The book might have been titled: From Measure Theory and Integration to Probability Theory. The fundamental aspects of Probability Theory, as described by the keywords and phrases below, are presented, not from experience as in the book "A Course on Elementary Probability Theory", but from a purely mathematical view based on Measure Theory. Such an approach places Probability Theory in its natural frame of Functional Analysis and constitutes a firm preparation for the study of Random Analysis and Stochastic processes. At the same time, it offers a solid basis for Mathematical Statistics Theory. The book will be continuously updated and improved on a yearly basis.

Logical Semantics and Commonsense Knowledge: Where Did we Go Wrong, and How to Go Forward, Again

We argue that logical semantics might have faltered due to its failure in distinguishing between two fundamentally very different types of concepts: ontological concepts, that should be types in a strongly-typed ontology, and logical concepts, that are predicates corresponding to properties of and relations between objects of various ontological types. We will then show that accounting for these differences amounts to the integration of lexical and compositional semantics in one coherent framework, and to an embedding in our logical semantics of a strongly-typed ontology that reflects our commonsense view of the world and the way we talk about it in ordinary language. We will show that in such a framework a number of challenges in natural language semantics can be adequately and systematically treated.

NIMFA: A Python Library for Nonnegative Matrix Factorization

NIMFA is an open-source Python library that provides a unified interface to nonnegative matrix factorization algorithms. It includes implementations of state-of-the-art factorization methods, initialization approaches, and quality scoring. It supports both dense and sparse matrix representation. NIMFA’s component-based implementation and hierarchical design should help the users to employ already implemented techniques or design and code new strategies for matrix factorization tasks.
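To make concrete what a nonnegative matrix factorization computes, here is a minimal NumPy sketch of the classical Lee-Seung multiplicative updates for the Frobenius objective. It does not use NIMFA's API; the function name `nmf_multiplicative` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix V (n x m) as W @ H.

    W is n x rank and H is rank x m, both nonnegative. Uses the
    Lee-Seung multiplicative updates for the Frobenius norm objective.
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random start, so that no entry is stuck at zero.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplying by ratios of nonnegative terms keeps W and H nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

A typical check is to compare `np.linalg.norm(V - W @ H)` across different ranks or iteration counts. A library such as NIMFA adds alternative update rules, initialization strategies and quality measures on top of this basic scheme.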

Regularized matrix data clustering and its application to image analysis

In this paper, we propose a regularized mixture probabilistic model to cluster matrix data and apply it to brain signals. The approach is able to capture the sparsity (low rank, small/zero values) of the original signals by introducing regularization terms into the likelihood function. Through a modified EM algorithm, our method achieves the optimal solution with low computational cost. Theoretical results are also provided to establish the consistency of the proposed estimators. Simulations show the advantages of the proposed method over other existing methods. We also apply the approach to two real datasets from different experiments. Promising results imply that the proposed method successfully characterizes signals with different patterns while yielding insightful scientific interpretation.

Nuisance Parameters Free Changepoint Detection in Non-stationary Series

Detecting abrupt changes in the mean of a time series, so-called changepoints, is important for many applications. However, many procedures rely on the estimation of nuisance parameters (like long-run variance). Under the alternative (a change in mean), estimators might be biased and data-adaptive rules for the choice of tuning parameters might not work as expected. If the data is not stationary, but heteroscedastic, this becomes more challenging. The aim of this paper is to present and investigate two changepoint tests, which involve neither nuisance nor tuning parameters. This is achieved by combining self-normalization and wild bootstrap. We study the asymptotic behavior and show the consistency of the bootstrap under the hypothesis as well as under the alternative, assuming mild conditions on the weak dependence of the time series and allowing the variance to change over time. As a by-product of the proposed tests, a changepoint estimator is introduced and its consistency is proved. The results are illustrated through a simulation study, which demonstrates computational efficiency of the developed methods. The new tests will also be applied to real data examples from finance and hydrology.
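As background for the changepoint estimator mentioned above, here is a minimal sketch of the classical CUSUM estimator for a single change in mean. This is the textbook construction, not the self-normalized or bootstrap procedure of the paper; the function name `cusum_changepoint` is hypothetical.

```python
def cusum_changepoint(series):
    """Estimate the location of a single change in mean via CUSUM.

    Returns (k, stat): k is the number of observations before the
    estimated change, and stat is the maximum absolute centered
    partial sum divided by sqrt(n).
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    best_k, best_stat = 1, -1.0
    partial = 0.0
    for k in range(1, n):
        # Centered partial sum of the first k observations.
        partial += series[k - 1] - mean
        if abs(partial) > best_stat:
            best_k, best_stat = k, abs(partial)
    return best_k, best_stat / n ** 0.5
```

The returned statistic is deliberately left unscaled. A classical test would divide it by an estimate of the long-run standard deviation, which is exactly the kind of nuisance parameter the abstract says the proposed tests avoid.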

Residual Memory Networks: Feed-forward approach to learn long temporal dependencies

Training deep recurrent neural network (RNN) architectures is complicated due to the increased network complexity. This disrupts the learning of higher order abstracts using deep RNN. In the case of feed-forward networks, training deep structures is simple and faster, while learning long-term temporal information is not possible. In this paper we propose a residual memory neural network (RMN) architecture to model short-time dependencies using deep feed-forward layers having residual and time delayed connections. The residual connection paves the way to construct deeper networks by enabling unhindered flow of gradients and the time delay units capture temporal information with shared weights. The number of layers in RMN signifies both the hierarchical processing depth and temporal depth. The computational complexity in training RMN is significantly less when compared to deep recurrent networks. RMN is further extended as bi-directional RMN (BRMN) to capture both past and future information. Experimental analysis is done on the AMI corpus to substantiate the capability of RMN in learning long-term information and hierarchical information. Recognition performance of RMN trained with 300 hours of the Switchboard corpus is compared with various state-of-the-art LVCSR systems. The results indicate that RMN and BRMN gain 6% and 3.8% relative improvement over LSTM and BLSTM networks.

Differential Private Stream Processing of Energy Consumption

A number of applications benefit from continuously releasing streams of personal data statistics. The process, however, poses significant privacy risks. Motivated by an application in energy systems, this paper presents OptStream, a novel algorithm for releasing differentially private data streams. OptStream is a 4-step procedure consisting of sampling, perturbation, reconstruction, and post-processing modules. The sampling module selects a small set of points to access privately in each period of interest, the perturbation module adds noise to the sampled data points to guarantee privacy, the reconstruction module re-assembles the non-sampled data points from the perturbed sampled points, and the post-processing module uses convex optimization over the private output of the previous modules, as well as the private answers of additional queries on the data stream, to ensure consistency of the data's salient features. OptStream is used to release a real data stream from the largest transmission operator in Europe. Experimental results show that OptStream not only improves the accuracy of the state-of-the-art by at least one order of magnitude on this application domain, but it is also able to ensure accurate load forecasting based on the private data.
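The perturbation step can be illustrated with the Laplace mechanism, the textbook way to add noise for differential privacy. The abstract does not say which noise distribution OptStream uses, so the Laplace choice here is an assumption, and the names `laplace_noise` and `perturb` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    mean `scale` follows Laplace(0, scale).
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(values, sensitivity, epsilon, rng=None):
    """Add Laplace noise with scale sensitivity / epsilon to each value.

    sensitivity: assumed bound on how much one individual can change
                 the released vector (in L1 norm)
    epsilon:     privacy budget spent on this release
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]
```

Smaller `epsilon` means a larger noise scale and stronger privacy. Sampling only a few points per period, as the abstract describes, lets each sampled point be released with less noise for the same overall budget, which is what the reconstruction step then exploits.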

A Survey on Deep Transfer Learning

As a new classification platform, deep learning has recently received increasing attention from researchers and has been successfully applied to many domains. In some domains, like bioinformatics and robotics, it is very difficult to construct a large-scale well-annotated dataset due to the expense of data acquisition and costly annotation, which limits its development. Transfer learning relaxes the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates us to use transfer learning to solve the problem of insufficient training data. This survey focuses on reviewing current research on transfer learning using deep neural networks and its applications. We define deep transfer learning, then categorize and review recent research works based on the techniques used in deep transfer learning.

Automated Extraction of Personal Knowledge from Smartphone Push Notifications

Personalized services are in need of a rich and powerful personal knowledge base, i.e. a knowledge base containing information about the user. This paper proposes an approach to extracting personal knowledge from smartphone push notifications, which are used by mobile systems and apps to inform users of a rich range of information. Our solution is based on the insight that most notifications are formatted using templates, while knowledge entities can usually be found within the parameters to the templates. As defining all the notification templates and their semantic rules is impractical due to the huge number of notification templates used by potentially millions of apps, we propose an automated approach for personal knowledge extraction from push notifications. We first discover notification templates through pattern mining, then use machine learning to understand the template semantics. Based on the templates and their semantics, we are able to translate notification text into knowledge facts automatically. Users' privacy is preserved as we only need to upload the templates to the server for model training, and the templates do not contain any personal information. According to our experiments with about 120 million push notifications from 100,000 smartphone users, our system is able to extract personal knowledge accurately and efficiently.

• Towards Closing the Gap in Weakly Supervised Semantic Segmentation with DCNNs: Combining Local and Global Models
• A Review of Learning with Deep Generative Models from perspective of graphical modeling
• Notes On Group Distance Magicness of Product Graphs
• Towards Efficient Maximum Likelihood Estimation of LPV-SS Models
• Self-Attention Recurrent Network for Saliency Detection
• Note: Effect of localization on mean-field density of state near jamming
• Degree Growth Rates and Index Estimation in a Directed Preferential Attachment Model
• Prediction in Riemannian metrics derived from divergence functions
• Multi-Objective Cognitive Model: a supervised approach for multi-subject fMRI analysis
• Dynamical multiple regression in function spaces, under kernel regressors, with ARH(1) errors
• Graph Based Imaging for Synthetic Aperture Radar
• Strongly consistent autoregressive predictors in abstract Banach spaces
• Instantiation
• Structured Adversarial Attack: Towards General Implementation and Better Interpretability
• Diffusion approximations and control variates for MCMC
• An inversion metric for reduced words
• Homogenization of Symmetric Lévy Processes on $\mathbb{R}^d$
• Model-Aided Wireless Artificial Intelligence: Embedding Expert Knowledge in Deep Neural Networks Towards Wireless Systems Optimization
• Dilated Convolutions in Neural Networks for Left Atrial Segmentation in 3D Gadolinium Enhanced-MRI
• 3D Conceptual Design Using Deep Learning
• A Multi-task Framework for Skin Lesion Detection and Segmentation
• Kid on The Phone! Toward Automatic Detection of Children on Mobile Devices
• Missing Value Imputation Based on Deep Generative Models
• Too many secants: a hierarchical approach to secant-based dimensionality reduction on large data sets
• Error Detection in a Large-Scale Lexical Taxonomy
• Sampling-based randomized designs for causal inference under the potential outcomes framework
• Projectively unique polytopes and toric slack ideals
• Computationally efficient model selection for joint spikes and waveforms decoding
• Skin Lesion Diagnosis using Ensembles, Unscaled Multi-Crop Evaluation and Loss Weighting
• Effective Resource Sharing in Mobile-Cell Environments
• Revisiting the simulation of quantum Turing machines by quantum circuits
• The Bases of Association Rules of High Confidence
• New Viewpoint and Algorithms for Water-Filling Solutions in Wireless Communications
• A formula for the cohomology and $K$-class of a regular Hessenberg variety
• Energy-Age Tradeoff in Status Update Communication Systems with Retransmission
• A Study of Deep Feature Fusion based Methods for Classifying Multi-lead ECG
• Signal Jamming Attacks Against Communication-Based Train Control: Attack Impact and Countermeasure
• Liquid Pouring Monitoring via Rich Sensory Inputs
• Incorporating Scalability in Unsupervised Spatio-Temporal Feature Learning
• Machine Learning Phase Transition: An Iterative Methodology
• Concentration bounds for empirical conditional value-at-risk: The unbounded case
• Using Linguistic Cues for Analyzing Social Movements
• Beyond the Central Limit Theorem: Universal and Non-universal Simulations of Random Variables by General Mappings
• Deep Transfer Learning for EEG-based Brain Computer Interface
• Gray-box Adversarial Training
• A Flip-Syndrome-List Polar Decoder Architecture for Ultra-Low-Latency Communications
• Scalability Analysis of a LoRa Network under Imperfect Orthogonality
• On Optimizing Deep Convolutional Neural Networks by Evolutionary Computing
• DP-Degree Colorable Hypergraphs
• Spline Regression with Automatic Knot Selection
• Phase Transition in Matched Formulas and a Heuristic for Biclique Satisfiability
• About the Stein equation for the generalized inverse Gaussian and Kummer distributions
• Solution Paths of Variational Regularization Methods for Inverse Problems
• Defense Against Adversarial Attacks with Saak Transform
• Inner approximation algorithm for solving linear multiobjective optimization problems
• Thresholds of mixed fractional Brownian motion
• Blockchain Queueing Theory
• Compactness of semigroups of explosive symmetric Markov processes
• The k-cube is k-representable
• Linearly Precoded Rate Splitting: Optimality and Non-Optimality for MIMO Broadcast Channels
• Field theory for amorphous solids
• Regret Bounds for Reinforcement Learning via Markov Chain Concentration
• Visual Question Generation for Class Acquisition of Unknown Objects
• Girsanov formula for $G$-Brownian motion: the degenerate case
• Efficient domination in regular graphs
• Detailed Dense Inference with Convolutional Neural Networks via Discrete Wavelet Transform
• Fourth moment theorems on the Poisson space: analytic statements via product formulae
• Improving Temporal Interpolation of Head and Body Pose using Gaussian Process Regression in a Matrix Completion Setting
• Occlusions, Motion and Depth Boundaries with a Generic Network for Disparity, Optical Flow or Scene Flow Estimation
• Beyond $1/2$-Approximation for Submodular Maximization on Massive Data Streams
• An Efficient Approach to Learning Chinese Judgment Document Similarity Based on Knowledge Summarization
• Generalized Port-Hamiltonian DAE Systems
• Correlated time-changed Lévy Processes
• Metal Artifact Reduction in Cone-Beam X-Ray CT via Ray Profile Correction
• Statistical Windows in Testing for the Initial Distribution of a Reversible Markov Chain
• The Contact Process on Periodic Trees
• Coloured stochastic vertex models and their spectral theory
• Reasoning with Justifiable Exceptions in Contextual Hierarchies (Appendix)
• An Efficient Deep Reinforcement Learning Model for Urban Traffic Control
• GLSE Precoders for Massive MIMO Systems: Analysis and Applications
• One-Shot Coherence Distillation: The Full Story
• Two Practical Random-Subcarrier-Selection Methods for Secure Precise Wireless Transmission
• DeepTAM: Deep Tracking and Mapping
• The Fluid Mechanics of Liquid Democracy
• A bijection between ternary trees and a subclass of Motzkin paths
• On the Duality and File Size Hierarchy of Fractional Repetition Codes
• Audio Tagging With Connectionist Temporal Classification Model Using Sequential Labelled Data
• Error Correction Maximization for Deep Image Hashing
• V-FCNN: Volumetric Fully Convolution Neural Network For Automatic Atrial Segmentation
• Assessing and countering reaction attacks against post-quantum public-key cryptosystems based on QC-LDPC codes
• Deep Shape Analysis on Abdominal Organs for Diabetes Prediction
• A Review on Image- and Network-based Brain Data Analysis Techniques for Alzheimer’s Disease Diagnosis Reveals a Gap in Developing Predictive Methods for Prognosis
• A non-linear parabolic PDE with a distributional coefficient and its applications to stochastic analysis
• Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN
• Super Resolution Phase Retrieval for Sparse Signals
• Semi-discrete unbalanced optimal transport and quantization
• A Survey on Surrogate Approaches to Non-negative Matrix Factorization
• Adversarial Vision Challenge
• Time-Dependent Shortest Path Queries Among Growing Discs
• Stability and Throughput Analysis of Multiple Access Networks with Finite Blocklength Constraints
• Idempotent Analysis, Tropical Convexity and Reduced Divisors
• Hashing with Binary Matrix Pursuit
• Simultaneous Edge Alignment and Learning
• Mass-spring-damper Network for Distributed Averaging and Optimization
• Bionic Reflex Control Strategy for Robotic Finger with Kinematic Constraints
• Robust Secrecy Energy Efficient Beamforming in MISOME-SWIPT Systems With Proportional Fairness
• Distributionally Robust Co-Optimization of Power Dispatch and Do-Not-Exceed Limits
