MALTS: Matching After Learning to Stretch
We introduce a flexible framework for matching in causal inference that produces high-quality, almost-exact matches. Most prior work in matching uses ad hoc distance metrics, often leading to poor-quality matches, particularly when there are irrelevant covariates that degrade the distance metric. In this work, we learn an interpretable distance metric for matching, which leads to substantially higher-quality matches. The distance metric can stretch continuous covariates and matches exactly on categorical covariates. The framework is flexible in that the user can choose the form of the distance metric, the type of optimization algorithm, and the type of relaxation for matching. Our ability to learn flexible distance metrics leads to matches that are interpretable and useful for estimating conditional average treatment effects.
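As a concrete illustration of the learned-metric idea, here is a minimal Python sketch of a distance of the form the abstract describes: a learned linear stretch on continuous covariates combined with exact matching on categorical ones. The function and the toy stretch matrix are hypothetical; MALTS learns the stretch from data rather than taking it as given.

```python
import numpy as np

def malts_distance(x, y, M, cont_idx, cat_idx):
    """Distance of the general form described above: a learned linear
    stretch on continuous covariates plus exact matching (an infinite
    penalty on mismatch) for categorical covariates. Illustrative only:
    MALTS learns M from data; here it is supplied by hand."""
    x, y = np.asarray(x, dtype=object), np.asarray(y, dtype=object)
    dx = x[cont_idx].astype(float) - y[cont_idx].astype(float)
    d_cont = np.linalg.norm(M @ dx)          # stretched Euclidean part
    d_cat = 0.0 if all(x[i] == y[i] for i in cat_idx) else np.inf
    return d_cont + d_cat

# Toy example: the second continuous covariate is deemed irrelevant,
# so the stretch matrix zeroes it out.
x, y = [1.0, 2.0, "male"], [1.5, 1.0, "male"]
M = np.diag([2.0, 0.0])
print(malts_distance(x, y, M, cont_idx=[0, 1], cat_idx=[2]))  # 1.0
```

With M set to the identity this reduces to plain Euclidean distance; learning M is what lets the method downweight irrelevant covariates.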
Stochastic Deep Networks
Machine learning is increasingly targeting areas where input data cannot be accurately described by a single vector, but can be modeled instead using the more flexible concept of random vectors, namely probability measures or, more simply, point clouds of varying cardinality. Using deep architectures on measures poses, however, many challenging issues. Indeed, deep architectures were originally designed to handle fixed-length vectors, or, using recursive mechanisms, ordered sequences thereof. In sharp contrast, measures describe a varying number of weighted observations with no particular order. We propose in this work a deep framework designed to handle crucial aspects of measures, namely permutation invariance and variations in weights and cardinality. Architectures derived from this pipeline can (i) map measures to measures, using the concept of push-forward operators; (ii) bridge the gap between measures and Euclidean spaces, through integration steps. This allows us to design discriminative networks (to classify or reduce the dimensionality of input measures), generative architectures (to synthesize measures) and recurrent pipelines (to predict measure dynamics). We provide a theoretical analysis of these building blocks, review our architectures’ approximation abilities and robustness w.r.t. perturbation, and try them on various discriminative and generative tasks.
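A minimal PyTorch sketch of the "integration step" concept: mapping a weighted point cloud (a discrete measure) to a fixed-length vector in a permutation-invariant way. This is a generic DeepSets-style pooling written for illustration, not the paper's exact operator.

```python
import torch
import torch.nn as nn

class IntegrationLayer(nn.Module):
    """Maps a discrete measure {(w_i, x_i)} to a fixed-length vector by
    integrating a learned feature map against the measure:
    sum_i w_i * phi(x_i). Permutation-invariant and indifferent to
    cardinality by construction."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(),
                                 nn.Linear(out_dim, out_dim))

    def forward(self, points, weights):
        # points: (n, in_dim); weights: (n,), summing to 1.
        return (weights.unsqueeze(-1) * self.phi(points)).sum(dim=0)

# A point cloud of 5 weighted observations in R^3 -> a vector in R^16.
layer = IntegrationLayer(3, 16)
x = torch.randn(5, 3)
w = torch.full((5,), 0.2)
print(layer(x, w).shape)  # torch.Size([16])
```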
The PyTorch-Kaldi Speech Recognition Toolkit
The availability of open-source software is playing a remarkable role in the popularization of speech recognition and deep learning. Kaldi, for instance, is nowadays an established framework used to develop state-of-the-art speech recognizers. PyTorch is used to build neural networks with the Python language and has recently spawned tremendous interest within the machine learning community thanks to its simplicity and flexibility. The PyTorch-Kaldi project aims to bridge the gap between these popular toolkits, trying to inherit the efficiency of Kaldi and the flexibility of PyTorch. PyTorch-Kaldi is not only a simple interface between these toolkits, but it embeds several useful features for developing modern speech recognizers. For instance, the code is specifically designed to naturally plug in user-defined acoustic models. As an alternative, users can exploit several pre-implemented neural networks that can be customized using intuitive configuration files. PyTorch-Kaldi supports multiple feature and label streams as well as combinations of neural networks, enabling the use of complex neural architectures. The toolkit is publicly released along with rich documentation and is designed to work properly both locally and on HPC clusters. Experiments conducted on several datasets and tasks show that PyTorch-Kaldi can effectively be used to develop modern state-of-the-art speech recognizers.
Unsupervised Domain Adaptation: An Adaptive Feature Norm Approach
An Efficient Transfer Learning Technique by Using Final Fully-Connected Layer Output Features of Deep Networks
In this paper, we propose a computationally efficient transfer learning approach that uses the output vector of the final fully-connected layer of deep convolutional neural networks for classification. Our proposed technique uses a single-layer perceptron classifier whose hyper-parameters are designed to improve computational efficiency without adversely affecting classification performance compared to the baseline technique. Our investigations show that our technique converges much faster than the baseline while yielding very competitive classification results. We conduct thorough experiments to understand the impact of similarity between pre-trained and new classes, similarity among new classes, and the number of training samples on the performance of classification using transfer learning of the final fully-connected layer’s output features.
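A minimal sketch of this style of transfer learning in PyTorch: a frozen pretrained torchvision ResNet-18 whose final fully-connected layer's 1000-dimensional output serves as the feature vector for a single linear classifier. The backbone choice, class count, and optimizer settings are illustrative assumptions, not the paper's exact configuration, and a recent torchvision (with the weights API) is assumed.

```python
import torch
import torch.nn as nn
from torchvision import models

# Frozen pretrained backbone; its final fully-connected layer's
# 1000-dimensional output is reused directly as the feature vector.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False

num_new_classes = 10                       # hypothetical target task
classifier = nn.Linear(1000, num_new_classes)
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():
        feats = backbone(images)           # (batch, 1000), no backprop
    loss = criterion(classifier(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Only the single linear layer is trained, which is where the computational savings over fine-tuning the whole network come from.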
Memory In Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity from Spatiotemporal Dynamics
Natural spatiotemporal processes can be highly non-stationary in many ways, e.g. low-level non-stationarity such as spatial correlations or temporal dependencies of local pixel values, and high-level variations such as the accumulation, deformation or dissipation of radar echoes in precipitation forecasting. By Cramér’s decomposition, any non-stationary process can be decomposed into deterministic, time-variant polynomials plus a zero-mean stochastic term. By applying differencing operations appropriately, we may turn time-variant polynomials into a constant, making the deterministic component predictable. However, most previous recurrent neural networks for spatiotemporal prediction do not use differential signals effectively, and their relatively simple state transition functions prevent them from learning complex variations in spacetime. We propose the Memory In Memory (MIM) networks and corresponding recurrent blocks for this purpose. The MIM blocks exploit the differential signals between adjacent recurrent states to model the non-stationary and approximately stationary properties in spatiotemporal dynamics with two cascaded, self-renewed memory modules. By stacking multiple MIM blocks, we could potentially handle higher-order non-stationarity. The MIM networks achieve state-of-the-art results on three spatiotemporal prediction tasks across both synthetic and real-world datasets. We believe that the general idea of this work can potentially be applied to other time-series forecasting tasks.
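The differencing observation the abstract leans on is easy to verify numerically: differencing a deterministic, time-variant polynomial trend enough times yields a constant, which is what makes the deterministic component predictable. A tiny numpy demonstration:

```python
import numpy as np

t = np.arange(10, dtype=float)
series = 3 * t**2 + 2 * t + 5    # deterministic quadratic trend
d1 = np.diff(series)              # first difference: still linear in t
d2 = np.diff(series, n=2)         # second difference: constant
print(d2)                         # [6. 6. 6. 6. 6. 6. 6. 6.]
```

MIM applies the analogous idea to differential signals between adjacent recurrent states rather than to raw pixel series.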
DEXON: A Highly Scalable, Decentralized DAG-Based Consensus Algorithm
A blockchain system is a replicated state machine that must be fault tolerant. When designing a blockchain system, there is usually a trade-off between decentralization, scalability, and security. In this paper, we propose a novel blockchain system, DEXON, which achieves high scalability while remaining decentralized and robust in the real-world environment. We have two main contributions. First, we present a highly scalable sharding framework for blockchain. This framework takes an arbitrary number of single chains and transforms them into the blocklattice data structure, enabling high scalability and low transaction confirmation latency with asymptotically optimal communication overhead. Second, we propose a single-chain protocol based on our novel verifiable random function and a new Byzantine agreement that achieves high decentralization and low latency.
Variational Bayesian Dropout
Variational dropout (VD) is a generalization of Gaussian dropout, which aims at inferring the posterior of network weights based on a log-uniform prior on them, so as to learn these weights as well as the dropout rate simultaneously. The log-uniform prior not only explains the regularization capacity of Gaussian dropout in network training, but also underpins the inference of such a posterior. However, the log-uniform prior is an improper prior (i.e., its integral is infinite), which causes the inference of the posterior to be ill-posed, thus restricting the regularization performance of VD. To address this problem, we present a new generalization of Gaussian dropout, termed variational Bayesian dropout (VBD), which instead exploits a hierarchical prior on the network weights and infers a new joint posterior. Specifically, we implement the hierarchical prior as a zero-mean Gaussian distribution with variance sampled from a uniform hyper-prior. Then, we incorporate such a prior into inferring the joint posterior over the network weights and the variance in the hierarchical prior, with which both the network training and the dropout rate estimation can be cast into a joint optimization problem. More importantly, the hierarchical prior is a proper prior, which enables the inference of the posterior to be well-posed. In addition, we further show that the proposed VBD can be seamlessly applied to network compression. Experiments on both classification and network compression tasks demonstrate the superior performance of the proposed VBD in terms of regularizing network training.
Contributors profile modelization in crowdsourcing platforms
Crowdsourcing consists of outsourcing tasks to a crowd of people who are remunerated to execute them. The crowd, which is usually diverse, can include users without qualification and/or motivation for the tasks. In this paper we introduce a new method for modelling user expertise on crowdsourcing platforms, based on the theory of belief functions, in order to identify serious and qualified users.
Fine-grained Classification using Heterogeneous Web Data and Auxiliary Categories
Fine-grained classification remains a very challenging problem, because of the absence of well-labeled training data caused by the high cost of annotating a large number of fine-grained categories. In the extreme case, given a set of test categories without any well-labeled training data, the majority of existing works can be grouped into the following two research directions: 1) crawl noisily labeled web data for the test categories as training data, which is dubbed webly supervised learning; 2) transfer the knowledge from auxiliary categories with well-labeled training data to the test categories, which corresponds to the zero-shot learning setting. Nevertheless, both research directions still have critical issues to be addressed. For the first direction, web data have noisy labels and a considerably different data distribution from the test data. For the second direction, zero-shot learning struggles to achieve compelling results compared with conventional supervised learning. These issues motivate us to develop a novel approach that can jointly exploit both noisy web training data from test categories and well-labeled training data from auxiliary categories. In particular, on the one hand, we crawl web data for test categories as noisy training data. On the other hand, we transfer the knowledge from auxiliary categories with well-labeled training data to test categories by virtue of free semantic information (e.g., word vectors) of all categories. Moreover, given the fact that web data are generally associated with additional textual information (e.g., title and tag), we extend our method by using the surrounding textual information of web data as privileged information. Extensive experiments show the effectiveness of our proposed methods.
Deep Active Learning with a Neural Architecture Search
We consider active learning of deep neural networks. Most active learning works in this context have focused on studying effective querying mechanisms and assumed that an appropriate network architecture is a priori known for the problem at hand. We challenge this assumption and propose a novel active strategy whereby the learning algorithm searches for effective architectures on the fly, while actively learning. We apply our strategy using three known querying techniques (softmax response, MC-dropout, and coresets) and show that the proposed approach overwhelmingly outperforms active learning using fixed architectures.
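Of the three querying techniques mentioned, softmax response is the simplest to state: query the unlabeled points on which the current model is least confident. A minimal numpy sketch of that criterion (independent of the architecture-search component):

```python
import numpy as np

def softmax_response_query(probs, budget):
    """probs: (n_unlabeled, n_classes) softmax outputs of the current
    model. Returns indices of the `budget` least-confident points,
    i.e. those with the smallest maximum class probability."""
    confidence = probs.max(axis=1)
    return np.argsort(confidence)[:budget]

probs = np.array([[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]])
print(softmax_response_query(probs, budget=1))  # [1]
```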
Self-Referenced Deep Learning
Knowledge distillation is an effective approach for transferring knowledge from a teacher neural network to a student target network, satisfying the low-memory and fast-running requirements of practical use. Whilst able to create stronger target networks than the vanilla non-teacher-based learning strategy, this scheme additionally requires training a large teacher model at considerable computational cost. In this work, we present a Self-Referenced Deep Learning (SRDL) strategy. Unlike both vanilla optimisation and existing knowledge distillation, SRDL distils the knowledge discovered by the in-training target model back into the model itself to regularise the subsequent learning procedure, thereby eliminating the need to train a large teacher model. SRDL improves model generalisation compared to vanilla learning and conventional knowledge distillation approaches, with negligible extra computational cost. Extensive evaluations show that a variety of deep networks benefit from SRDL, resulting in enhanced deployment performance on both coarse-grained object categorisation tasks (CIFAR10, CIFAR100, Tiny ImageNet, and ImageNet) and fine-grained person instance identification tasks (Market-1501).
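A hedged sketch of the self-distillation idea in PyTorch: a frozen snapshot of the in-training model itself plays the teacher role via a standard distillation loss. The temperature, weighting, and snapshot schedule below are illustrative assumptions rather than SRDL's exact recipe.

```python
import copy
import torch
import torch.nn.functional as F

def self_distillation_loss(model, snapshot, x, y, T=4.0, alpha=0.5):
    """Cross-entropy on the labels plus a distillation term towards the
    model's own frozen earlier snapshot; no separate teacher network is
    ever trained. T and alpha are illustrative hyper-parameters."""
    logits = model(x)
    with torch.no_grad():
        ref_logits = snapshot(x)
    ce = F.cross_entropy(logits, y)
    kd = F.kl_div(F.log_softmax(logits / T, dim=1),
                  F.softmax(ref_logits / T, dim=1),
                  reduction="batchmean") * T * T
    return (1 - alpha) * ce + alpha * kd

# Mid-training, freeze a copy of the current model to act as reference:
# snapshot = copy.deepcopy(model).eval()
```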
A Trustworthy, Responsible and Interpretable System to Handle Chit-Chat in Conversational Bots
Most often, chat-bots are built to serve the purpose of a search engine or a human assistant: their primary goal is to provide information to the user or help them complete a task. However, these chat-bots are incapable of responding to unscripted queries like ‘Hi, what’s up’ or ‘What’s your favourite food’. Human evaluation judgments show that four humans reach consensus on the intent of a given chat-domain query only 77% of the time, making it evident how non-trivial this task is. In our work, we show why it is difficult to break the chitchat space into clearly defined intents. We propose a system to handle this task in chat-bots, keeping in mind scalability, interpretability, appropriateness, trustworthiness, relevance and coverage. Our work introduces a pipeline for query understanding in chitchat using hierarchical intents, as well as a way to use seq2seq auto-generation models in professional bots. We explore an interpretable model for chat domain detection and also show how various components such as adult/offensive classification, grammars/regex patterns, curated personality-based responses, generic guided evasive responses and response generation models can be combined in a scalable way to solve this problem.
Outlier Aware Network Embedding for Attributed Networks
An efficient density-based clustering algorithm using reverse nearest neighbour
Density-based clustering is the task of discovering high-density regions of entities (clusters) that are separated from each other by contiguous regions of low density. DBSCAN is, arguably, the most popular density-based clustering algorithm. However, its cluster recovery capabilities depend on the combination of its two parameters. In this paper we present a new density-based clustering algorithm which uses reverse nearest neighbour (RNN) queries and has a single parameter. We also show that it is possible to estimate a good value for this parameter using a clustering validity index. The RNN queries enable our algorithm to estimate densities taking more than a single entity into account, and to recover clusters that are not well separated or have different densities. Our experiments on synthetic and real-world data sets show that our proposed algorithm outperforms DBSCAN and its recent variant ISDBSCAN.
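The density signal the algorithm builds on is easy to illustrate: a point's reverse-k-nearest-neighbour count is the number of other points that include it among their own k nearest neighbours, so high counts mark dense regions using more than one entity per estimate. A short scikit-learn sketch of that count (the full clustering procedure in the paper is more involved):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def rnn_counts(X, k=5):
    """Reverse k-nearest-neighbour count of each point: how many other
    points have it among their own k nearest neighbours."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1 to skip self
    _, idx = nn.kneighbors(X)
    counts = np.zeros(len(X), dtype=int)
    for neighbours in idx[:, 1:]:                    # drop self column
        counts[neighbours] += 1
    return counts

X = np.random.rand(200, 2)
print(rnn_counts(X, k=5)[:10])
```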
Chat More If You Like: Dynamic Cue Words Planning to Flow Longer Conversations
Building an open-domain multi-turn conversation system is one of the most interesting and challenging tasks in Artificial Intelligence. Many research efforts have been dedicated to building such dialogue systems, yet few shed light on modeling the conversation flow in an ongoing dialogue. Moreover, it is common for people to talk about highly relevant aspects during a conversation, with topics that are coherent and drift naturally, which demonstrates the necessity of dialogue flow modeling. To this end, we present a multi-turn cue-word-driven conversation system with a reinforcement learning method (RLCw), which strives to select an adaptive cue word with the greatest future credit and thereby improve the quality of generated responses. We introduce a new reward to measure the quality of cue words in terms of effectiveness and relevance. To further optimize the model for long-term conversations, a reinforcement approach is adopted in this paper. Experiments on a real-life dataset demonstrate that our model consistently outperforms a set of competitive baselines in terms of simulated turns, diversity and human evaluation.
An Influence-based Clustering Model on Twitter
This paper introduces a temporal framework for detecting and clustering emergent and viral topics on social networks. Endogenous and exogenous influences on developing viral content are explored using a clustering method based on a user’s behavior on the social network, applied to a dataset from the Twitter API. Results are discussed by introducing metrics such as popularity, burstiness, and relevance score, and show a clear distinction in the characteristics of content developed by the two classes of users.
When Conventional machine learning meets neuromorphic engineering: Deep Temporal Networks (DTNets) a machine learning framework allowing to operate on Events and Frames and implantable on Tensor Flow Like Hardware
We introduce in this paper the principle of Deep Temporal Networks, which add time to convolutional networks by applying deep integration principles not only to spatial information but also to increasingly large temporal windows. The concept can be used for conventional image inputs as well as for event-based data. Although inspired by the architecture of the brain, which integrates information over increasingly large spatial and temporal scales, it can operate on conventional hardware using existing architectures. We present preliminary results to show the efficiency of the method. More in-depth results and analysis will be reported soon!
Complexity Analysis of a Sampling-Based Interior Point Method for Convex Optimization
We develop a short-step interior point method to optimize a linear function over a convex body assuming that one only knows a membership oracle for this body. The approach is based on Abernethy and Hazan’s sketch of a universal interior point method using the so-called entropic barrier [arXiv 1507.02528v2, 2015]. It is well-known that the gradient and Hessian of the entropic barrier can be approximated by sampling from Boltzmann-Gibbs distributions, and the entropic barrier was shown to be self-concordant by Bubeck and Eldan [arXiv 1412.1587v3, 2015]. The analysis of our algorithm uses properties of the entropic barrier, mixing times for hit-and-run random walks by Lovász and Vempala [Foundations of Computer Science, 2006], approximation quality guarantees for the mean and covariance of a log-concave distribution, and results from De Klerk, Glineur and Taylor on inexact Newton-type methods [arXiv 1709.0519, 2017].
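For context, the sampling approximation mentioned here rests on a standard identity for log-partition functions (stated for orientation, with notation chosen for this note rather than taken from the paper): if $f^*(\theta) = \log \int_K e^{\langle \theta, x \rangle}\,dx$ over the convex body $K$, and $p_\theta(x) \propto e^{\langle \theta, x \rangle}$ is the corresponding Boltzmann-Gibbs distribution on $K$, then $\nabla f^*(\theta) = \mathbb{E}_{X \sim p_\theta}[X]$ and $\nabla^2 f^*(\theta) = \mathrm{Cov}_{X \sim p_\theta}(X)$. Hit-and-run samples from $p_\theta$ therefore yield empirical estimates of these derivatives, and the entropic barrier itself is obtained from $f^*$ by Fenchel conjugation.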
Efficient keyword spotting using dilated convolutions and gating
We explore the application of end-to-end stateless temporal modeling to small-footprint keyword spotting, as opposed to recurrent networks that model long-term temporal dependencies using internal states. We propose a model inspired by the recent success of dilated convolutions in sequence modeling applications, allowing us to train deeper architectures in resource-constrained configurations. Gated activations and residual connections are also added, following a configuration similar to WaveNet. In addition, we apply a custom target labeling that back-propagates loss only from specific frames of interest, thereby yielding higher accuracy and requiring only detection of the end of the keyword. Our experimental results show that our model outperforms a max-pooling-loss-trained recurrent neural network using LSTM cells, with a significant decrease in false rejection rate. The underlying dataset – ‘Hey Snips’ utterances recorded by over 2.2K different speakers – has been made publicly available to establish an open reference for wake-word detection.
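A minimal PyTorch sketch of a WaveNet-style gated, dilated, residual 1-D convolution block of the kind such models stack; channel counts, kernel size, and dilation are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GatedDilatedBlock(nn.Module):
    """Dilated causal conv with tanh/sigmoid gating and a residual
    connection. Stacking blocks with growing dilation gives a large
    receptive field with few parameters and no recurrent state."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation   # left-pad => causal
        self.filter = nn.Conv1d(channels, channels, kernel_size,
                                dilation=dilation)
        self.gate = nn.Conv1d(channels, channels, kernel_size,
                              dilation=dilation)
        self.res = nn.Conv1d(channels, channels, 1)

    def forward(self, x):                 # x: (batch, channels, time)
        h = nn.functional.pad(x, (self.pad, 0))
        h = torch.tanh(self.filter(h)) * torch.sigmoid(self.gate(h))
        return x + self.res(h)

x = torch.randn(2, 16, 100)
print(GatedDilatedBlock(16, dilation=4)(x).shape)  # (2, 16, 100)
```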
Do Normalization Layers in a Deep ConvNet Really Need to Be Distinct?
Yes, they do. This work investigates a perspective for deep learning: whether different normalization layers in a ConvNet require different normalizers. This is the first step towards understanding this phenomenon. We allow each convolutional layer to be stacked before a switchable normalization (SN) that learns to choose a normalizer from a pool of normalization methods. Through systematic experiments in ImageNet, COCO, Cityscapes, and ADE20K, we answer three questions: (a) Is it useful to allow each normalization layer to select its own normalizer? (b) What impacts the choices of normalizers? (c) Do different tasks and datasets prefer different normalizers? Our results suggest that (1) using distinct normalizers improves both learning and generalization of a ConvNet; (2) the choices of normalizers are more related to depth and batch size, but less relevant to parameter initialization, learning rate decay, and solver; (3) different tasks and datasets have different behaviors when learning to select normalizers.
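A simplified sketch of a switchable normalization layer in PyTorch, blending instance-, layer- and batch-wise statistics with softmax-learned importance weights. This compresses the published SN formulation (which also maintains running statistics for inference, among other details) into an illustrative module.

```python
import torch
import torch.nn as nn

class SwitchableNorm2d(nn.Module):
    """Simplified switchable normalization: each layer learns its own
    softmax weights over {IN, LN, BN} statistics, so different layers
    can effectively select different normalizers."""
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(channels))
        self.bias = nn.Parameter(torch.zeros(channels))
        self.mean_w = nn.Parameter(torch.zeros(3))  # logits over IN/LN/BN
        self.var_w = nn.Parameter(torch.zeros(3))

    def forward(self, x):                            # x: (N, C, H, W)
        m_in = x.mean((2, 3), keepdim=True)
        v_in = x.var((2, 3), keepdim=True, unbiased=False)
        m_ln = x.mean((1, 2, 3), keepdim=True)
        v_ln = x.var((1, 2, 3), keepdim=True, unbiased=False)
        m_bn = x.mean((0, 2, 3), keepdim=True)
        v_bn = x.var((0, 2, 3), keepdim=True, unbiased=False)
        wm = torch.softmax(self.mean_w, 0)
        wv = torch.softmax(self.var_w, 0)
        mean = wm[0] * m_in + wm[1] * m_ln + wm[2] * m_bn
        var = wv[0] * v_in + wv[1] * v_ln + wv[2] * v_bn
        x = (x - mean) / torch.sqrt(var + self.eps)
        return x * self.weight.view(1, -1, 1, 1) + self.bias.view(1, -1, 1, 1)
```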
Reinforcement Learning with A* and a Deep Heuristic
A* is a popular path-finding algorithm, but it can only be applied to domains where a good heuristic function is known. Inspired by recent methods combining Deep Neural Networks (DNNs) and trees, this study demonstrates how to train a heuristic represented by a DNN and combine it with A*. This new algorithm, which we call aleph-star, can be used efficiently in domains where the input to the heuristic can be processed by a neural network. We compare aleph-star to N-Step Deep Q-Learning (DQN; Mnih et al. 2013) in a driving simulation with pixel-based input, and demonstrate significantly better performance in this scenario.
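For reference, here is textbook A* with the heuristic abstracted into a callable, which is exactly where a trained DNN's cost-to-go estimate would plug in; the aleph-star extensions themselves are not reproduced here.

```python
import heapq

def a_star(start, goal, neighbours, heuristic):
    """Standard A*: `neighbours(s)` yields (next_state, step_cost);
    `heuristic(s)` may be any callable, e.g. a trained network's
    cost-to-go estimate."""
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for nxt, cost in neighbours(state):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier,
                               (g2 + heuristic(nxt), g2, nxt, path + [nxt]))
    return None

# 1-D toy world; heuristic(s) = |goal - s| stands in for a learned model.
print(a_star(0, 5, lambda s: [(s - 1, 1), (s + 1, 1)],
             lambda s: abs(5 - s)))  # [0, 1, 2, 3, 4, 5]
```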
How far from automatically interpreting deep learning
In recent years, deep learning researchers have focused on how to find the interpretability behind deep learning models. However, human cognitive competence does not yet fully cover deep learning models; in other words, there is a gap between deep learning models and human cognition. How to evaluate and shrink this cognitive gap is a very important issue. In this paper, we are concerned with interpretability evaluation, the relationship between the generalization performance and the interpretability of a model, and methods for improving interpretability. A universal learning framework is put forward to solve the equilibrium problem between the two performances. The uniqueness of the solution of this problem is proved, and a condition for the unique solution is obtained. A probabilistic upper bound on the sum of the two performances is analyzed.
Building Efficient Deep Neural Networks with Unitary Group Convolutions
We propose unitary group convolutions (UGConvs), a building block for CNNs which composes a group convolution with unitary transforms in feature space to learn a richer set of representations than group convolution alone. UGConvs generalize two disparate ideas in CNN architecture, channel shuffling (i.e. ShuffleNet) and block-circulant networks (i.e. CirCNN), and provide unifying insights that lead to a deeper understanding of each technique. We experimentally demonstrate that dense unitary transforms can outperform channel shuffling in DNN accuracy; on the other hand, different dense transforms exhibit comparable accuracy. Based on these observations we propose HadaNet, a UGConv network using Hadamard transforms. HadaNets achieve accuracy similar to circulant networks with lower computational complexity, and better accuracy than ShuffleNets with the same number of parameters and floating-point multiplies.
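A small sketch of the channel-mixing ingredient: applying a (scaled) Hadamard transform across the channel axis, the kind of unitary transform HadaNet pairs with group convolution. The surrounding group-convolution layers are omitted; this only illustrates why the mixing is cheap (the transform's entries are all ±1, so apart from the scale factor it needs no floating-point multiplies).

```python
import torch
from scipy.linalg import hadamard

def hadamard_channel_mix(x):
    """Apply a scaled Hadamard transform across the channel axis of
    x: (N, C, H, W), with C a power of two. Mixes information across
    all channel groups, playing the role channel shuffling plays in
    ShuffleNet but with a dense +/-1 transform."""
    C = x.shape[1]
    H = torch.tensor(hadamard(C), dtype=x.dtype) / C ** 0.5
    return torch.einsum("dc,nchw->ndhw", H, x)

x = torch.randn(2, 8, 4, 4)
print(hadamard_channel_mix(x).shape)  # torch.Size([2, 8, 4, 4])
```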
Weighted Ensemble of Statistical Models
We present a detailed description of our submission to the M4 forecasting competition, in which it ranked 3rd overall. Our solution utilizes several commonly used statistical models, which are weighted according to their performance on historical data. We cluster series within each frequency type with respect to the existence of trend and seasonality, and every class of series is assigned a different set of algorithms to combine. We conduct experiments with a holdout set to manually pick pools of models that perform best for a given series type, as well as to choose the combination approaches.
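A minimal sketch of performance-based weighting of the kind described: score each model on a holdout window and weight its forecasts by inverse error. The MAE criterion and inverse-error scheme here are illustrative stand-ins; the submission's exact weighting is not reproduced.

```python
import numpy as np

def inverse_error_weights(holdout_forecasts, holdout_actual):
    """Weight each statistical model by the inverse of its error on a
    holdout window; future forecasts are combined with these weights."""
    errors = np.array([np.mean(np.abs(f - holdout_actual))  # MAE per model
                       for f in holdout_forecasts])
    w = 1.0 / (errors + 1e-9)
    return w / w.sum()

# Two hypothetical models evaluated on a 4-step holdout.
actual = np.array([10., 11., 12., 13.])
f1 = np.array([10.5, 11.5, 12.5, 13.5])   # MAE 0.5
f2 = np.array([12., 13., 14., 15.])       # MAE 2.0
print(inverse_error_weights([f1, f2], actual).round(2))  # [0.8 0.2]
```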
How to Use Heuristics for Differential Privacy
We develop theory for using heuristics to solve computationally hard problems in differential privacy. Heuristic approaches have enjoyed tremendous success in machine learning, where performance can be empirically evaluated. However, privacy guarantees cannot be evaluated empirically, and must be proven without making heuristic assumptions. We show that learning problems over broad classes of functions can be solved privately and efficiently, assuming the existence of a non-private oracle for solving the same problem. Our first algorithm yields a privacy guarantee that is contingent on the correctness of the oracle. We then give a reduction that applies to a class of heuristics we call certifiable, which allows us to convert oracle-dependent privacy guarantees into worst-case privacy guarantees that hold even when the heuristic standing in for the oracle fails in adversarial ways. Finally, we consider a broad class of functions that includes most classes of simple boolean functions studied in the PAC learning literature, including conjunctions, disjunctions, parities, and discrete halfspaces. We show that there is an efficient algorithm for privately constructing synthetic data for any such class, given a non-private learning oracle. This in particular gives the first oracle-efficient algorithm for privately generating synthetic data for contingency tables. The most intriguing question left open by our work is whether every problem that can be solved differentially privately can be privately solved with an oracle-efficient algorithm. While we do not resolve this, we give a barrier result suggesting that any generic oracle-efficient reduction must fall outside a natural class of algorithms (which includes the algorithms given in this paper).
Explicit Bias Discovery in Visual Question Answering Models
Researchers have observed that Visual Question Answering (VQA) models tend to answer questions by learning statistical biases in the data. For example, their answer to the question ‘What is the color of the grass?’ is usually ‘Green’, whereas a question like ‘What is the title of the book?’ cannot be answered by exploiting statistical biases. It is of interest to the community to explicitly discover such biases, both for understanding the behavior of such models and for debugging them. Our work addresses this problem. In a database, we store the words of the question and answer, along with the visual words corresponding to regions of interest in attention maps. By running simple rule mining algorithms on this database, we discover human-interpretable rules which give us unique insight into the behavior of such models. Our results also show examples of unusual behaviors learned by models in attempting VQA tasks.
Deeper Interpretability of Deep Networks
Deep Convolutional Neural Networks (CNNs) have been one of the most influential recent developments in computer vision, particularly for categorization. There is an increasing demand for explainable AI as these systems are deployed in the real world. However, understanding the information represented and processed in CNNs remains challenging in most cases. Within this paper, we explore the use of new information-theoretic techniques developed in the field of neuroscience to enable novel understanding of how a CNN represents information. We trained a 10-layer ResNet architecture to identify 2,000 face identities from 26M images generated using a rigorously controlled 3D face rendering model that produced variations of intrinsic factors (i.e. face morphology, gender, age, expression and ethnicity) and extrinsic factors (i.e. 3D pose, illumination, scale and 2D translation). With our methodology, we demonstrate that, unlike humans, the network overgeneralizes face identities even under extreme changes of face shape, but is more sensitive to changes of texture. To understand the processing underlying these counterintuitive properties, we visualize the features of shape and texture that the network uses to identify faces. We then shed light on the inner workings of the black box and reveal how hidden layers represent these features and whether the representations are invariant to pose. We hope that our methodology will provide an additional valuable tool for the interpretability of CNNs.
Sampling on Social Networks from a Decision Theory Perspective
Some of the most used sampling mechanisms that propagate through a social network are defined in terms of tuning parameters; for instance, Respondent-Driven Sampling (RDS) is specified by the number of seeds and the maximum number of referrals. We are interested in the problem of optimising these tuning parameters so as to improve the inference of a population quantity, where this quantity is a function of the network and of measurements taken at the nodes. This is done by formulating the problem in terms of Decision Theory. The optimisation procedure for different sampling mechanisms is illustrated via simulations, in the fashion of those used for Bayesian clinical trials.
On the Network Visibility Problem
Prediction-Based Decisions and Fairness: A Catalogue of Choices, Assumptions, and Definitions
A recent flurry of research activity has attempted to quantitatively define ‘fairness’ for decisions based on statistical and machine learning (ML) predictions. The rapid growth of this new field has led to wildly inconsistent terminology and notation, presenting a serious challenge for cataloguing and comparing definitions. This paper attempts to bring much-needed order. First, we explicate the various choices and assumptions made—often implicitly—to justify the use of prediction-based decisions. Next, we show how such choices and assumptions can raise concerns about fairness and we present a notationally consistent catalogue of fairness definitions from the ML literature. In doing so, we offer a concise reference for thinking through the choices, assumptions, and fairness considerations of prediction-based decision systems.
Scalable agent alignment via reward modeling: a research direction
One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions. Designing such reward functions is difficult in part because the user only has an implicit understanding of the task objective. This gives rise to the agent alignment problem: how do we create agents that behave in accordance with the user’s intentions? We outline a high-level research direction to solve the agent alignment problem centered around reward modeling: learning a reward function from interaction with the user and optimizing the learned reward function with reinforcement learning. We discuss the key challenges we expect to face when scaling reward modeling to complex and general domains, concrete approaches to mitigate these challenges, and ways to establish trust in the resulting agents.
Guiding Policies with Language via Meta-Learning
Behavioral skills or policies for autonomous agents are conventionally learned from reward functions, via reinforcement learning, or from demonstrations, via imitation learning. However, both modes of task specification have their disadvantages: reward functions require manual engineering, while demonstrations require a human expert to be able to actually perform the task in order to generate the demonstration. Instruction following from natural language instructions provides an appealing alternative: in the same way that we can specify goals to other humans simply by speaking or writing, we would like to be able to specify tasks for our machines. However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task. In this work, we propose an interactive formulation of the task specification problem, where iterative language corrections are provided to an autonomous agent, guiding it in acquiring the desired skill. Our proposed language-guided policy learning algorithm can integrate an instruction and a sequence of corrections to acquire new skills very quickly. In our experiments, we show that this method can enable a policy to follow instructions and corrections for simulated navigation and manipulation tasks, substantially outperforming direct, non-interactive instruction following.
Patterns in Random Permutations
Every k entries in a permutation can have one of k! different relative orders, called patterns. How many times does each pattern occur in a large random permutation of size n? The distribution of this k!-dimensional vector of pattern densities was studied by Janson, Nakamura, and Zeilberger (2015). Their analysis showed that some component of this vector is asymptotically multinormal of order 1/sqrt(n), while the orthogonal component is smaller. Using representations of the symmetric group, and the theory of U-statistics, we refine the analysis of this distribution. We show that it decomposes into k asymptotically uncorrelated components of different orders in n, that correspond to representations of Sk. Some combinations of pattern densities that arise in this decomposition have interpretations as practical nonparametric statistical tests.
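For concreteness, the pattern densities in question are straightforward to compute by brute force for small k; a short Python sketch:

```python
from itertools import combinations
from collections import Counter
import random

def pattern_counts(perm, k):
    """Count each size-k pattern: the relative order of the values of
    `perm` at every increasing k-subset of positions (0-indexed ranks)."""
    counts = Counter()
    for pos in combinations(range(len(perm)), k):
        vals = [perm[p] for p in pos]
        pattern = tuple(sorted(vals).index(v) for v in vals)
        counts[pattern] += 1
    return counts

perm = random.sample(range(20), 20)   # a random permutation of size 20
print(pattern_counts(perm, 2))        # non-inversions (0,1) vs. inversions (1,0)
```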
• The problematic nature of potentially polynomial-time algorithms solving the subset-sum problem
• Compact localized states of open scattering media
• Analyticity results in Bernoulli Percolation
• Multimodal Densenet
• Optimal H2 moment matching-based model reduction for linear systems by (non)convex optimization
• Domain expansion and transient scaling regimes in population networks with in-domain cyclic selection
• The Preemptive Resource Allocation Problem
• Realtime Scheduling and Power Allocation Using Deep Neural Networks
• PerSIM: Multi-resolution Image Quality Assessment in the Perceptually Uniform Color Domain
• Periodic switching strategies for an isoperimetric control problem with application to nonlinear chemical reactions
• Harmonic Recomposition using Conditional Autoregressive Modeling
• The core consistency of a compressed tensor
• Understanding and Measuring Psychological Stress using Social Media
• Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks
• Non-Hermitian Quasi-Localization and Ring Attractor Neural Networks
• Limitations of Source-Filter Coupling In Phonation
• On buildings that compute. A proposal
• Learning to Generate the ‘Unseen’ via Part Synthesis and Composition
• Predictive and Semantic Layout Estimation for Robotic Applications in Manhattan Worlds
• Sorting permutations with a transposition tree
• High-precision timing and frequency synchronization method for MIMO-OFDM systems in double-selective channels
• Testing local properties of arrays
• Regular and biregular planar cages
• An Investigation on Partitions with Equal Products
• Product of sumsets over arbitrary finite fields
• On Geometric Alignment in Low Doubling Dimension
• Generalizable Adversarial Training via Spectral Normalization
• Segregated Temporal Assembly Recurrent Networks for Weakly Supervised Multiple Action Detection
• Indoor GeoNet: Weakly Supervised Hybrid Learning for Depth and Pose Estimation
• Towards Nearly-linear Time Algorithms for Submodular Maximization with a Matroid Constraint
• Bayesian CycleGAN via Marginalizing Latent Sampling
• Multi-scale 3D Convolution Network for Video Based Person Re-Identification
• Exploring Small-World Network with an Elite-Clique: Bringing Embeddedness Theory into the Dynamic Evolution of a Venture Capital Network
• Denoising and Completion of Structured Low-Rank Matrices via Iteratively Reweighted Least Squares
• On the Sweep Map for $\vec{k}$-Dyck Paths
• Best-arm identification with cascading bandits
• Global and Local Sensitivity Guided Key Salient Object Re-augmentation for Video Saliency Detection
• Intersection theorems for families of matchings of complete $k$-partite $k$-graphs
• Minimum degree condition for a graph to be knitted
• Show, Attend and Translate: Unpaired Multi-Domain Image-to-Image Translation with Visual Attention
• Reducing Visual Confusion with Discriminative Attention
• Visual-Texual Emotion Analysis with Deep Coupled Video and Danmu Neural Networks
• Re-Identification with Consistent Attentive Siamese Networks
• Quantifying Human Behavior on the Block Design Test Through Automated Multi-Level Analysis of Overhead Video
• A Self-Adaptive Network For Multiple Sclerosis Lesion Segmentation From Multi-Contrast MRI With Various Imaging Protocols
• DeepSeeNet: A deep learning model for automated classification of patient-based age-related macular degeneration severity from color fundus photographs
• FotonNet: A HW-Efficient Object Detection System Using 3D-Depth Segmentation and 2D-DNN Classifier
• A Comparative Analysis of Content-based Geolocation in Blogs and Tweets
• Robust Visual Tracking using Multi-Frame Multi-Feature Joint Modeling
• Optimal Iterative Threshold-Kernel Estimation of Jump Diffusion Processes
• Fast Efficient Object Detection Using Selective Attention
• Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition
• Low Complexity Iterative Detection for a Large-scale Distributed MIMO Prototyping System
• Modularity in biological evolution and evolutionary computation
• NSEEN: Neural Semantic Embedding for Entity Normalization
• Classical Algorithms from Quantum and Arthur-Merlin Communication Protocols
• Unsupervised Learning in Reservoir Computing for EEG-based Emotion Recognition
• Multiuser Computation Offloading and Downloading for Edge Computing with Virtualization
• Corrected pair correlation functions for environments with obstacles
• High Order Neural Networks for Video Classification
• A Note on Two Constructions of Zero-Difference Balanced Functions
• Practical Deep Reinforcement Learning Approach for Stock Trading
• Feature selection as Monte-Carlo Search in Growing Single Rooted Directed Acyclic Graph by Best Leaf Identification
• Note on the exact delay stability margin computation of hybrid dynamical systems
• MIMO Channel Information Feedback Using Deep Recurrent Network
• A Pretrained DenseNet Encoder for Brain Tumor Segmentation
• CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification
• Representation based and Attention augmented Meta learning
• Multilevel Monte Carlo estimation of expected information gains
• iQIYI-VID: A Large Dataset for Multi-modal Person Identification
• Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning
• Upper Tails for Edge Eigenvalues of Random Graphs
• Three Dimensional Convolutional Neural Network Pruning with Regularization-Based Method
• NECST: Neural Joint Source-Channel Coding
• Random walk in a stratified independent random environment
• Understanding the combined effect of $k$-space undersampling and transient states excitation in MR Fingerprinting reconstructions
• Nash equilibrium seeking in potential games with double-integrator agents
• Algebraic structures on typed decorated rooted trees
• Restricting Schubert classes to symplectic Grassmannians using self-dual puzzles
• Localisation via Deep Imagination: learn the features not the map
• Deep Frank-Wolfe For Neural Network Optimization
• Measurement-based adaptation protocol with quantum reinforcement learning in a Rigetti quantum computer
• Quantum Inspired High Dimensional Conceptual Space as KID Model for Elderly Assistance
• Adversarial Autoencoders for Generating 3D Point Clouds
• On graceful and harmonious labelings of trees
• Weakly Supervised Soft-detection-based Aggregation Method for Image Retrieval
• Approximate Eigenvalue Decompositions of Linear Transformations with a Few Householder Reflectors
• Reconstruction and prediction of random dynamical systems under borrowing of strength
• Beyond Attributes: Adversarial Erasing Embedding Network for Zero-shot Learning
• Mixed Likelihood Gaussian Process Latent Variable Model
• ATOM: Accurate Tracking by Overlap Maximization
• SEIGAN: Towards Compositional Image Generation by Simultaneously Learning to Segment, Enhance, and Inpaint
• Collaborative Dense SLAM
• Mismatch error correction for time interleaved analog-to-digital converter over a wide frequency range
• Watermark Retrieval from 3D Printed Objects via Convolutional Neural Networks
• What’s in a GitHub Star? Understanding Repository Starring Practices in a Social Coding Platform
• Visibility Extension via Reflective Edges to an Exact Quantity
• External branch lengths of $Λ$-coalescents without a dust component
• Synthesis of Spatial Charging/Discharging Patterns of In-Vehicle Batteries for Provision of Ancillary Service and Mitigation of Voltage Impact
• Intention Oriented Image Captions with Guiding Objects
• FD-GAN: Face-demorphing generative adversarial network for restoring accomplice’s facial image
• Fast submodular maximization subject to k-extendible system constraints
• An Adaptive Oversampling Learning Method for Class-Imbalanced Fault Diagnostics and Prognostics
• Distributions of mesh patterns of short lengths
• Representations of mock theta functions
• Towards Global Explanations for Credit Risk Scoring
• Asymptotic enumeration of Cayley digraphs
• Ehrhart polynomials of polytopes and spectrum at infinity of Laurent polynomials
• Cyclic bent functions and their applications in codes, codebooks, designs, MUBs and sequences
• M2U-Net: Effective and Efficient Retinal Vessel Segmentation for Resource-Constrained Environments
• Social interaction networks and depressive symptoms
• The infinite dimensional manifold of Hölder equilibrium probabilities has non-negative curvature
• Past, Present, and Future Approaches Using Computer Vision for Animal Re-Identification from Camera Trap Data
• Contextual Face Recognition with a Nested-Hierarchical Nonparametric Identity Model
• Decentralized Exploration in Multi-Armed Bandits
• Injecting and removing malignant features in mammography with CycleGAN: Investigation of an automated adversarial attack using neural networks
• Multi-dimensional BSDEs with diagonal generators driven by $G$-Brownian motion
• Experimental Evaluation of Parameterized Algorithms for Graph Separation Problems: Half-Integral Relaxations and Matroid-based Kernelization
• A Simple Sublinear-Time Algorithm for Counting Arbitrary Subgraphs via Edge Sampling
• Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN
• Optimal medication for tumors modeled by a Cahn-Hilliard-Brinkman equation
• Lifted and geometric differentiability of the squared quadratic Wasserstein distance
• Deep Shape-from-Template: Wide-Baseline, Dense and Fast Registration and Deformable Reconstruction from a Single Image
• DeepIR: A Deep Semantics Driven Framework for Image Retargeting
• On mean field limit for Brownian particles with Coulomb interaction in 3D
• Semantic Security and the Second-Largest Eigenvalue of Biregular Graphs
• Distributed Learning of Average Belief Over Networks Using Sequential Observations
• Event-based Gesture Recognition with Dynamic Background Suppression using Smartphone Computational Capabilities
• Explicitly Sample-Equivalent Dynamic Models for Gaussian Markov, Reciprocal, and Conditionally Markov Sequences
• Paracontrolled approach to the three-dimensional stochastic nonlinear wave equation with quadratic nonlinearity
• Learning Actionable Representations with Goal-Conditioned Policies
• Efficient random graph matching via degree profiles
• Forman-Ricci Curvature for Hypergraphs
• A priori positivity of solutions to a non-conservative stochastic thin-film equation
• Domain of Inverse Double Arcsine Transformation
• Experimental evaluation of kernelization algorithms to Dominating Set
• Edgeworth expansion for Euler approximation of continuous diffusion processes
• Safe and Complete Real-Time Planning and Exploration in Unknown Environments
• Event-Based Features Selection and Tracking from Intertwined Estimation of Velocity and Generative Contours
• Behavioral Malware Classification using Convolutional Recurrent Neural Networks
• Slit-slide-sew bijections for bipartite and quasibipartite plane maps
• The Mafiascum Dataset: A Large Text Corpus for Deception Detection
• Discrete-time port-Hamiltonian systems: A definition based on symplectic integration
• Characterizing the spread of exaggerated news content over social media
• On Well-posedness of Stochastic Anisotropic $p$-Laplace Equation Driven by Lévy noise
• Equitable Partitions into Matchings and Coverings in Mixed Graphs
• OrthoSeg: A Deep Multimodal Convolutional Neural Network for Semantic Segmentation of Orthoimagery
• A Faster DiSH: Hardware Implementation of a Discrete Cell Signaling Network Simulator
• Polynomial partitioning over varieties
• Simulated Autonomous Driving in a Realistic Driving Environment using Deep Reinforcement Learning and a Deterministic Finite State Machine
• The orientation morphism: from graph cocycles to deformations of Poisson structures