What's new on arXiv

Analyzing and provably improving fixed budget ranking and selection algorithms

This paper studies the fixed budget formulation of the Ranking and Selection (R&S) problem with independent normal samples, where the goal is to investigate different algorithms’ convergence rate in terms of their resulting probability of false selection (PFS). First, we reveal that for the well-known Optimal Computing Budget Allocation (OCBA) algorithm and its two variants, a constant initial sample size (independent of the total budget) only amounts to a sub-exponential (or even polynomial) convergence rate. After that, a modification is proposed to achieve an exponential convergence rate, where the improvement is shown by a finite-sample bound on the PFS as well as numerical results. Finally, we focus on a more tractable two-design case and explicitly characterize the large deviations rate of PFS for some simplified algorithms. Our analysis not only develops insights into the algorithms’ properties, but also highlights several useful techniques for analyzing the convergence rate of fixed budget R&S algorithms.
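
For orientation, the classical asymptotic OCBA rule splits the budget so that, for non-best designs i and j, N_i/N_j = (σ_i/δ_i)² / (σ_j/δ_j)², where δ_i is the gap between design i and the best mean, and the best design receives N_b = σ_b · sqrt(Σ (N_i/σ_i)²). The sketch below computes these fractions for known means and standard deviations. It is the textbook rule, not the modified algorithm proposed in the paper; the function name `ocba_fractions` and the larger-is-better convention are assumptions made for illustration.

```python
import math

def ocba_fractions(means, stds):
    """Asymptotic OCBA budget fractions when the best design has the largest mean."""
    best = max(range(len(means)), key=lambda i: means[i])
    ratios = [0.0] * len(means)
    for i, (m, s) in enumerate(zip(means, stds)):
        if i != best:
            gap = means[best] - m  # assumed strictly positive (no ties)
            ratios[i] = (s / gap) ** 2
    # The best design's share balances the shares of all competitors.
    ratios[best] = stds[best] * math.sqrt(
        sum((ratios[i] / stds[i]) ** 2 for i in range(len(means)) if i != best)
    )
    total = sum(ratios)
    return [r / total for r in ratios]

fractions = ocba_fractions([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

In a sequential implementation the true means and standard deviations are unknown and are replaced by estimates from an initial sample, which is why the initial sample size matters for the convergence rate.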

Efficient Measuring of Congruence on High Dimensional Time Series

Bayesian Neural Network Ensembles

Ensembles of neural networks (NNs) have long been used to estimate predictive uncertainty; a small number of NNs are trained from different initialisations and sometimes on differing versions of the dataset. The variance of the ensemble’s predictions is interpreted as its epistemic uncertainty. The appeal of ensembling stems from being a collection of regular NNs – this makes them both scalable and easily implementable. They have achieved strong empirical results in recent years, often presented as a practical alternative to more costly Bayesian NNs (BNNs). The departure from Bayesian methodology is of concern since the Bayesian framework provides a principled, widely-accepted approach to handling uncertainty. In this extended abstract we derive and implement a modified NN ensembling scheme, which provides a consistent estimator of the Bayesian posterior in wide NNs – regularising parameters about values drawn from a prior distribution.
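
The idea of regularising each ensemble member about its own draw from the prior can be sketched on a toy problem. The example below uses a one-parameter linear model y = w·x instead of a neural network, so the regularised fit has a closed form. The function names, the noise variance, and the prior settings are all assumptions for illustration and are not taken from the paper.

```python
import random

def anchored_fit(xs, ys, anchor, noise_var, prior_var):
    """Minimise sum((y - w*x)^2)/noise_var + (w - anchor)^2/prior_var over w."""
    precision = sum(x * x for x in xs) / noise_var + 1.0 / prior_var
    target = sum(x * y for x, y in zip(xs, ys)) / noise_var + anchor / prior_var
    return target / precision

def anchored_ensemble(xs, ys, n_members, noise_var=1.0,
                      prior_mean=0.0, prior_var=1.0, seed=0):
    """Fit one member per anchor, each anchor drawn from the prior N(prior_mean, prior_var)."""
    rng = random.Random(seed)
    anchors = [rng.gauss(prior_mean, prior_var ** 0.5) for _ in range(n_members)]
    return [anchored_fit(xs, ys, a, noise_var, prior_var) for a in anchors]

xs = [0.0, 1.0, 2.0]
ys = [0.1, 0.9, 2.1]
weights = anchored_ensemble(xs, ys, n_members=5)
```

The spread of `weights` across members plays the role of the ensemble's epistemic uncertainty; with more data the data term dominates and the members agree more closely.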

A Visual Interaction Framework for Dimensionality Reduction Based Data Exploration

Dimensionality reduction is a common method for analyzing and visualizing high-dimensional data. However, reasoning dynamically about the results of a dimensionality reduction is difficult. Dimensionality-reduction algorithms use complex optimizations to reduce the number of dimensions of a dataset, but these new dimensions often lack a clear relation to the initial data dimensions, thus making them difficult to interpret. Here we propose a visual interaction framework to improve dimensionality-reduction based exploratory data analysis. We introduce two interaction techniques, forward projection and backward projection, for dynamically reasoning about dimensionally reduced data. We also contribute two visualization techniques, prolines and feasibility maps, to facilitate the effective use of the proposed interactions. We apply our framework to PCA and autoencoder-based dimensionality reductions. Through data-exploration examples, we demonstrate how our visual interactions can improve the use of dimensionality reduction in exploratory data analysis.
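
For a linear reduction such as PCA, the two interactions can be sketched as follows, assuming the mean vector and an orthonormal set of component rows have already been fitted. Forward projection changes one original feature and re-projects the point; backward projection here is the plain unconstrained reconstruction x = mean + Wᵀz, which is only one possible choice. All names and numeric values are hypothetical.

```python
def project(x, mean, components):
    """Map a data point to reduced coordinates: z = W (x - mean)."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(w * c for w, c in zip(row, centered)) for row in components]

def forward_projection(x, feature, new_value, mean, components):
    """Reduced-space position of the point after one original feature is changed."""
    modified = list(x)
    modified[feature] = new_value
    return project(modified, mean, components)

def backward_projection(z, mean, components):
    """Unconstrained reconstruction x = mean + W^T z of a reduced-space position."""
    return [m + sum(row[j] * zi for row, zi in zip(components, z))
            for j, m in enumerate(mean)]

mean = [1.0, 2.0, 3.0]
components = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # hypothetical orthonormal loadings
point = [2.0, 2.0, 5.0]
moved = forward_projection(point, feature=1, new_value=4.0,
                           mean=mean, components=components)
restored = backward_projection(moved, mean, components)
```

Note that `restored` recovers the two retained directions but falls back to the mean along the discarded one, which illustrates why backward projection is not unique without extra constraints.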

A comparison of cluster algorithms as applied to unsupervised surveys

When answering important questions with data, unsupervised data offers extensive opportunities for insight as well as unique challenges. This study considers student survey data with the specific goal of clustering students into like groups, with the underlying aim of identifying different poverty levels. Fuzzy logic is considered during the data cleaning and organizing phase, helping to create a logical dependent variable for comparing analyses. The survey was reduced and cleaned using multiple data reduction techniques. Finally, multiple clustering techniques (k-means, k-modes, and hierarchical clustering) are applied and compared. Though each method has strengths, the goal was to identify which was most viable when applied to survey data, specifically when trying to identify the most impoverished students.

The SWAG Algorithm; a Mathematical Approach that Outperforms Traditional Deep Learning. Theory and Implementation

The performance of artificial neural networks (ANNs) is influenced by weight initialization, the nature of activation functions, and their architecture. There is a wide range of activation functions that are traditionally used to train a neural network, e.g. sigmoid, tanh, and Rectified Linear Unit (ReLU). A widespread practice is to use the same type of activation function in all neurons in a given layer. In this manuscript, we present a type of neural network in which the activation functions in every layer form a polynomial basis; we name this method SWAG after the initials of the last names of the authors. We tested SWAG on three complex, highly non-linear functions as well as the MNIST handwriting data set. SWAG outperforms, and converges faster than, state-of-the-art fully connected neural networks. Given the low computational complexity of SWAG, and the fact that it was capable of solving problems current architectures cannot, it has the potential to change the way that we approach deep learning.
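
A layer whose activations form a polynomial basis can be sketched as below: the k-th neuron raises its pre-activation to the k-th power, and a linear readout combines the results. This is a minimal illustration of the idea described above, not the authors' implementation; the function names, weights, and layer sizes are assumptions.

```python
def polynomial_layer(inputs, weights, biases):
    """Dense layer whose k-th neuron uses the activation f_k(a) = a**k (k = 1, 2, ...)."""
    outputs = []
    for k, (row, bias) in enumerate(zip(weights, biases), start=1):
        pre_activation = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(pre_activation ** k)
    return outputs

def linear_readout(features, weights, bias):
    """Linear combination of the polynomial features."""
    return sum(w * f for w, f in zip(weights, features)) + bias

# Three neurons produce x, x^2 and x^3 for the scalar input x = 2.
hidden = polynomial_layer([2.0], weights=[[1.0], [1.0], [1.0]], biases=[0.0, 0.0, 0.0])
output = linear_readout(hidden, weights=[1.0, 0.5, 0.25], bias=0.0)
```

With these weights the network evaluates the polynomial x + 0.5x² + 0.25x³, so fitting the readout weights amounts to fitting polynomial coefficients.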

Semantic Part Detection via Matching: Learning to Generalize to Novel Viewpoints from Limited Training Data

Detecting semantic parts of an object is a challenging task in computer vision, particularly because it is hard to construct large annotated datasets due to the difficulty of annotating semantic parts. In this paper we present an approach which learns from a small training dataset of annotated semantic parts, where the object is seen from a limited range of viewpoints, but generalizes to detect semantic parts from a much larger range of viewpoints. Our approach is based on a matching algorithm for finding accurate spatial correspondence between two images, which enables semantic parts annotated on one image to be transplanted to another. In particular, this enables images in the training dataset to be matched to a virtual 3D model of the object (for simplicity, we assume that the object viewpoint can be estimated by standard techniques). Then a clustering algorithm is used to annotate the semantic parts of the 3D virtual model. This virtual 3D model can be used to synthesize annotated images from a large range of viewpoints. These can be matched to images in the test set, using the same matching algorithm, to detect semantic parts in novel viewpoints of the object. Our algorithm is very simple, intuitive, and contains very few parameters. We evaluate our approach on the car subclass of the VehicleSemanticPart dataset. We show it outperforms standard deep network approaches and, in particular, performs much better on novel viewpoints.

A Review on Recommendation Systems: Context-aware to Social-based

Predicting the Computational Cost of Deep Learning Models

Deep learning is rapidly becoming a go-to tool for many artificial intelligence problems due to its ability to outperform other approaches, and even humans, at many problems. Despite its popularity, we are still unable to accurately predict the time it will take to train a deep learning network to solve a given problem. This training time can be seen as the product of the training time per epoch and the number of epochs which need to be performed to reach the desired level of accuracy. Some work has been carried out to predict the training time for an epoch; most of it has been based on the assumption that the training time is linearly related to the number of floating point operations required. However, this relationship does not hold, and the discrepancy grows in cases where other activities start to dominate the execution time, such as the time to load data from memory or the loss of performance due to non-optimal parallel execution. In this work we propose an alternative approach in which we train a deep learning network to predict the execution time for parts of a deep learning network. Timings for these individual parts can then be combined to provide a prediction for the whole execution time. This has advantages over linear approaches, as it can model more complex scenarios and can also predict execution times for scenarios unseen in the training data. Therefore, our approach can be used not only to infer the execution time for a batch, or an entire epoch, but also to support a well-informed choice of hardware and model.
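
The decomposition described above (total time = time per epoch × number of epochs, with the epoch time assembled from per-part predictions) can be written out as a small sketch. The per-layer numbers stand in for the output of a learned predictor; they, along with the function and layer names, are hypothetical.

```python
def predict_epoch_time(layer_batch_times, batches_per_epoch):
    """Sum the predicted per-layer batch times, then scale to one epoch."""
    return sum(layer_batch_times.values()) * batches_per_epoch

def predict_training_time(layer_batch_times, batches_per_epoch, epochs):
    """Total training time as (time per epoch) x (number of epochs)."""
    return predict_epoch_time(layer_batch_times, batches_per_epoch) * epochs

# Hypothetical per-layer predictions in seconds per batch (forward plus backward pass).
layer_batch_times = {"conv1": 0.5, "conv2": 0.25, "dense": 0.25}
total_seconds = predict_training_time(layer_batch_times, batches_per_epoch=100, epochs=10)
```

Swapping in per-layer predictions for a different device would give a like-for-like estimate, which is the kind of hardware comparison the abstract has in mind.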

Distributed Inference for Linear Support Vector Machine

The growing size of modern data brings many new challenges to existing statistical inference methodologies and theories, and calls for the development of distributed inferential approaches. This paper studies distributed inference for linear support vector machine (SVM) for the binary classification task. Despite a vast literature on SVM, much less is known about the inferential properties of SVM, especially in a distributed setting. In this paper, we propose a multi-round distributed linear-type (MDL) estimator for conducting inference for linear SVM. The proposed estimator is computationally efficient. In particular, it only requires an initial SVM estimator and then successively refines the estimator by solving simple weighted least squares problems. Theoretically, we establish the Bahadur representation of the estimator. Based on the representation, the asymptotic normality is further derived, which shows that the MDL estimator achieves the optimal statistical efficiency, i.e., the same efficiency as the classical linear SVM applied to the entire dataset in a single machine setup. Moreover, our asymptotic result avoids the condition on the number of machines or data batches, which is commonly assumed in the distributed estimation literature, and allows the case of diverging dimension. We provide simulation studies to demonstrate the performance of the proposed MDL estimator.

Visual SLAM with Network Uncertainty Informed Feature Selection

In order to facilitate long-term localization using a visual simultaneous localization and mapping (SLAM) algorithm, careful feature selection is required such that reference points persist over long durations and the runtime and storage complexity of the algorithm remain consistent. We present SIVO (Semantically Informed Visual Odometry and Mapping), a novel information-theoretic feature selection method for visual SLAM which incorporates machine learning and neural network uncertainty into the feature selection pipeline. Our algorithm selects points which provide the highest reduction in Shannon entropy between the entropy of the current state, and the joint entropy of the state given the addition of the new feature with the classification entropy of the feature from a Bayesian neural network. This feature selection strategy generates a sparse map suitable for long-term localization, as each selected feature significantly reduces the uncertainty of the vehicle state and has been detected to be a static object (building, traffic sign, etc.) repeatedly with a high confidence. The KITTI odometry dataset is used to evaluate our method, and we also compare our results against ORB_SLAM2. Overall, SIVO performs comparably to ORB_SLAM2 (average of 0.17% translation error difference, 6.2 x 10^(-5) deg/m rotation error difference) while reducing the map size by 69%.

Accounting for model uncertainty in multiple imputation under complex sampling

Multiple imputation provides an effective way to handle missing data. When several possible models are under consideration for the data, the multiple imputation is typically performed under a single best model selected from the candidate models. This single model selection approach ignores the uncertainty associated with the model selection and so leads to underestimation of the variance of the multiple imputation estimator. In this paper, we propose a new multiple imputation procedure incorporating model uncertainty in the final inference. The proposed method incorporates possible candidate models for the data into the imputation procedure using the idea of Bayesian Model Averaging (BMA). The proposed method is directly applicable to handling item nonresponse in survey sampling. Asymptotic properties of the proposed method are investigated. A limited simulation study confirms that our model averaging approach provides better estimation performance than the single model selection approach.
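
To see where the variance underestimation enters, recall the standard Rubin combining rules for m imputed datasets: the total variance is T = W + (1 + 1/m)·B, with W the average within-imputation variance and B the between-imputation variance of the point estimates. The sketch below implements only these standard rules, not the BMA-based procedure proposed in the paper; the function name and the input numbers are illustrative.

```python
from statistics import mean, variance

def rubin_combine(estimates, within_variances):
    """Combine m imputed-data estimates with Rubin's rules."""
    m = len(estimates)
    point = mean(estimates)
    within = mean(within_variances)   # W: average within-imputation variance
    between = variance(estimates)     # B: sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return point, total

point, total = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

If every imputation is drawn under one selected model, B reflects only the variability within that model, so any spread that would come from alternative candidate models is missing from T.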

Prediction Factory: automated development and collaborative evaluation of predictive models

In this paper, we present a data science automation system called Prediction Factory. The system uses several key automation algorithms to enable data scientists to rapidly develop predictive models and share them with domain experts. To assess the system’s impact, we implemented 3 different interfaces for creating predictive modeling projects: baseline automation, full automation, and optional automation. With a dataset of online grocery shopper behaviors, we divided data scientists among the interfaces to specify prediction problems, learn and evaluate models, and write a report for domain experts, who judged whether or not to fund continued work on the project. In total, 22 data scientists created 94 reports that were judged 296 times by 26 experts. In a head-to-head trial, reports generated using the full automation interface were funded 57.5% of the time, while the ones that used baseline automation were only funded 42.5% of the time. Reports generated with the intermediate interface, which supports optional automation, were funded 58.6% more often compared to the baseline. Full automation and optional automation reports were funded about equally when put head-to-head. These results demonstrate that Prediction Factory has implemented a critical amount of automation to augment the role of data scientists and improve business outcomes.

ADCrowdNet: An Attention-injective Deformable Convolutional Network for Crowd Understanding

We propose an attention-injective deformable convolutional network called ADCrowdNet for crowd understanding that can address the accuracy degradation problem of highly congested noisy scenes. ADCrowdNet contains two concatenated networks. An attention-aware network called Attention Map Generator (AMG) first detects crowd regions in images and computes the congestion degree of these regions. Based on detected crowd regions and congestion priors, a multi-scale deformable network called Density Map Estimator (DME) then generates high-quality density maps. With the attention-aware training scheme and the multi-scale deformable convolutional scheme, the proposed ADCrowdNet captures crowd features more effectively and is more resistant to various kinds of noise. We have evaluated our method on four popular crowd counting datasets (ShanghaiTech, UCF_CC_50, WorldEXPO’10, and UCSD) and an extra vehicle counting dataset, TRANCOS; our approach overwhelmingly beats existing approaches on all of these datasets.

Simple stopping criteria for information theoretic feature selection

Information theoretic feature selection aims to select the smallest feature subset such that the mutual information between the selected features and the class labels is maximized. Despite the simplicity of this objective, there still remain several open problems in optimizing it. These include, for example, the automatic determination of the optimal subset size (i.e., the number of features) or a stopping criterion if the greedy searching strategy is adopted. In this letter, we suggest two stopping criteria by just monitoring the conditional mutual information (CMI) among groups of variables. Using the recently developed multivariate matrix-based Renyi’s $\alpha$-entropy functional, we show that the CMI among groups of variables can be easily estimated without any decomposition or approximation, hence making our criteria easy to implement and to integrate seamlessly into any existing information theoretic feature selection method with a greedy search strategy.
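
A greedy search with a CMI-based stopping rule can be sketched as follows. Two substitutions are made and should be read as assumptions: a plug-in Shannon entropy estimator for discrete features replaces the matrix-based Renyi’s $\alpha$-entropy functional, and the rule "stop when the best remaining candidate has CMI at or below a threshold" is a simple stand-in rather than either of the two criteria proposed in the letter. Function and feature names are hypothetical.

```python
import math
from collections import Counter

def entropy(columns):
    """Plug-in joint Shannon entropy (bits) of zero or more discrete columns."""
    if not columns:
        return 0.0
    rows = list(zip(*columns))
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in Counter(rows).values())

def conditional_mi(x, y, given):
    """I(x; y | S) = H(x, S) + H(y, S) - H(S) - H(x, y, S)."""
    return (entropy([x] + given) + entropy([y] + given)
            - entropy(given) - entropy([x, y] + given))

def greedy_select(features, labels, threshold=1e-9):
    """Add the feature with the largest CMI given the selected set until none helps."""
    selected = []
    remaining = dict(features)
    while remaining:
        given = [features[name] for name in selected]
        scores = {name: conditional_mi(col, labels, given)
                  for name, col in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] <= threshold:  # stopping rule: no candidate adds information
            break
        selected.append(best)
        del remaining[best]
    return selected

features = {
    "a": [0, 0, 1, 1],
    "b": [0, 1, 0, 1],
    "noise": [0, 0, 0, 0],
}
labels = [0, 1, 2, 3]  # determined jointly by a and b
chosen = greedy_select(features, labels)
```

On real data the plug-in estimate of CMI is biased upward for small samples, so the threshold would need to be larger than the near-zero default used here.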

Weakly Supervised Silhouette-based Semantic Change Detection

This paper presents a novel semantic change detection scheme with only weak supervision. A straightforward approach for this task is to train a semantic change detection network directly from a large-scale dataset in an end-to-end manner. However, this requires a dedicated dataset for the new task, which is usually labor-intensive and time-consuming to build. To avoid this problem, we propose to train this kind of network from existing datasets by dividing this task into change detection and semantic extraction. On the other hand, the difference in camera viewpoints, for example in images of the same scene captured from a vehicle-mounted camera at different time points, usually brings a challenge to the change detection task. To address this challenge, we propose a new siamese network structure with the introduction of a correlation layer. In addition, we create a publicly available dataset for semantic change detection to evaluate the proposed method. Both the robustness to viewpoint difference in the change detection task and the effectiveness for semantic change detection of the proposed networks are verified by the experimental results.

Global Second-order Pooling Neural Networks

Deep Convolutional Networks (ConvNets) are fundamental to many vision tasks besides large-scale visual recognition. As the primary goal of the ConvNets is to characterize complex boundaries of thousands of classes in a high-dimensional space, it is critical to learn higher-order representations for enhancing non-linear modeling capability. Recently, Global Second-order Pooling (GSoP), plugged at the end of networks, has attracted increasing attention, achieving much better performance than classical, first-order networks in a variety of vision tasks. However, how to effectively introduce higher-order representation in earlier layers for improving non-linear capability of ConvNets is still an open problem. In this paper, we propose a novel network model that introduces GSoP from lower to higher layers to exploit holistic image information throughout a network. Given an input 3D tensor output by some previous convolutional layer, we perform GSoP to obtain a covariance matrix which, after nonlinear transformation, is used for tensor scaling along the channel dimension. Similarly, we can perform GSoP along the spatial dimension for tensor scaling as well. In this way, we can make full use of the second-order statistics of the holistic image throughout a network. The proposed networks are thoroughly evaluated on large-scale ImageNet-1K, and experiments have shown that they outperform their counterparts by a non-trivial margin while achieving state-of-the-art results.
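
The channel-wise scaling step can be sketched in plain Python: compute the covariance matrix between channels, turn each row into a gate, and rescale the corresponding channel. The gate used here (a sigmoid of the row mean) is only a fixed stand-in for the learned nonlinear transformation mentioned in the abstract, and the function names and the tiny feature map are assumptions.

```python
import math

def channel_covariance(channels):
    """Covariance between channels; each channel is a flat list of spatial activations."""
    n = len(channels[0])
    centered = []
    for ch in channels:
        mu = sum(ch) / n
        centered.append([v - mu for v in ch])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in centered]
            for ci in centered]

def gsop_channel_scaling(channels):
    """Scale every channel by a gate derived from its row of the covariance matrix."""
    cov = channel_covariance(channels)
    # Stand-in for the learned transform: sigmoid of each row's mean covariance.
    gates = [1.0 / (1.0 + math.exp(-sum(row) / len(row))) for row in cov]
    return [[g * v for v in ch] for g, ch in zip(gates, channels)]

feature_map = [[1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]]  # 2 channels, 4 positions
scaled = gsop_channel_scaling(feature_map)
```

The spatial variant described in the abstract would apply the same pattern with the roles of channels and positions exchanged.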

• Hardness results for rainbow disconnection of graphs
• Estimating of the inertial manifold dimension for a chaotic attractor of complex Ginzburg-Landau equation using a neural network
• What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning
• MRAM Co-designed Processing-in-Memory CNN Accelerator for Mobile and IoT Applications
• The system operating time with two different unreliable servicing devices
• DeepPos: Deep Supervised Autoencoder Network for CSI Based Indoor Localization
• Learning to Synthesize Motion Blur
• Particle Probability Hypothesis Density Filter based on Pairwise Markov Chains
• Stability of Disturbance Based Unified Control
• Automatic Diagnosis of Short-Duration 12-Lead ECG using a Deep Convolutional Network
• AI based Safety System for Employees of Manufacturing Industries in Developing Countries
• Optimizing running a race on a curved track
• A scalable estimator of sets of integral operators
• A Scoring Method for Driving Safety Credit Using Trajectory Data
• Gridless Line Spectral Estimation with Multiple Measurement Vector via Variational Bayesian Inference
• Dimensional reduction and scattering formulation for even topological invariants
• SVD-PHAT: A Fast Sound Source Localization Method
• A Study of the Complexity and Accuracy of Direction of Arrival Estimation Methods Based on GCC-PHAT for a Pair of Close Microphones
• Meta-Learning for Few-shot Camera-Adaptive Color Constancy
• Reconstructing probabilistic trees of cellular differentiation from single-cell RNA-seq data
• Algorithms for Joint Sensor and Control Nodes Selection in Dynamic Networks
• Cartoon-to-real: An Approach to Translate Cartoon to Realistic Images using GAN
• 19 dubious ways to compute the marginal likelihood of a phylogenetic tree topology
• The Grand Canonical ensemble of weighted networks
• Fractional coloring with local demands
• The center of the wreath product of symmetric groups algebra
• Tighter enumeration of matroids of fixed rank
• An Application of Storage-Optimal MatDot Codes for Coded Matrix Multiplication: Fast k-Nearest Neighbors Estimation
• Phase Collaborative Network for Multi-Phase Medical Imaging Segmentation
• Unrepresentative video data: A review and evaluation
• 2D/3D Megavoltage Image Registration Using Convolutional Neural Networks
• Disease phenotyping using deep learning: A diabetes case study
• Unsupervised Meta-Learning For Few-Shot Image and Video Classification
• Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence
• Towards Task Understanding in Visual Settings
• Asymptotic Analysis of Model Selection Criteria for General Hidden Markov Models
• Deep learning based automatic segmentation of lumbosacral nerves on non-contrast CT for radiographic evaluation: a pilot study
• Non-Volume Preserving-based Feature Fusion Approach to Group-Level Expression Recognition on Crowd Videos
• How to design a tournament: lessons from the EHF men’s handball Champions League
• Joint Correction of Attenuation and Scatter Using Deep Convolutional Neural Networks (DCNN) for Time-of-Flight PET
• Guided patch-wise nonlocal SAR despeckling
• Triangles in $C_5$-free graphs and Hypergraphs of Girth Six
• RetinaMatch: Efficient Template Matching of Retina Images for Teleophthalmology
• Adversarial Attacks for Optical Flow-Based Action Recognition Classifiers
• Towards Neural Co-Processors: Combining Neural Decoding and Encoding in Brain-Computer Interfaces
• Deformations of the Weyl Character Formula for $SO(2n+1,\mathbb{C})$ via Ice Models
• Adversarial Bandits with Knapsacks
• Optimizing Throughput in a MIMO System with a Self-sustained Relay and Non-uniform Power Splitting
• Nonlinear Decomposition Principle and Fundamental Matrix Solutions for Dynamic Compartmental Systems
• A regression approach for explaining manifold embedding coordinates
• Variational Autoencoding the Lagrangian Trajectories of Particles in a Combustion System
• Forward Investment Performance Processes in Semimartingale Financial Markets
• Visual Question Answering as Reading Comprehension
• Using permutations to assess confounding in machine learning applications for digital health
• Optimizable Object Reconstruction from a Single View
• Soft-Output Detection Methods for Sparse Millimeter Wave MIMO Systems with Low-Precision ADCs
• Pak-Stanley labeling for Central Graphical Arrangements
• Regret Bounds for Stochastic Combinatorial Multi-Armed Bandits with Linear Space Complexity
• Autoconj: Recognizing and Exploiting Conjugacy Without a Domain-Specific Language
• Adaptive Sparse Estimation with Side Information
• Joint Design of Convolutional Code and CRC under Serial List Viterbi Decoding
• Privacy-Preserving Aggregation of Controllable Loads to Compensate Fluctuations in Solar Power
• Multi-granularity hierarchical attention fusion networks for reading comprehension and question answering
• Quantifying CDS Sortability of Permutations by Strategic Pile Size
• Automatic Rendering of Building Floor Plan Images from Textual Descriptions in English
• HYPE: A High Performing NLP System for Automatically Detecting Hypoglycemia Events from Electronic Health Record Notes
• Low-Complexity Adaptive Beam and Channel Tracking for Mobile mmWave Communications
• Distributed Augmented Reality with 3D Lung Dynamics — A Planning Tool Concept
• Large-scale Generative Modeling to Improve Automated Veterinary Disease Coding
• Traffic Danger Recognition With Surveillance Cameras Without Training Data
• Charging station optimization for balanced electric car sharing
• Efficient Online Multi-Person 2D Pose Tracking with Recurrent Spatio-Temporal Affinity Fields
• DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama
• FogBus: A Blockchain-based Lightweight Framework for Edge and Fog Computing
• Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound
• Sums of Standard Uniform Random Variables
• Analysing Emergent Users’ Text Messages Data and Exploring its Benefits
• Joint Uplink-Downlink Cooperative Interference Management with Flexible Cell Associations
• Deep learning for pedestrians: backpropagation in CNNs
• Sample Efficient Stochastic Variance-Reduced Cubic Regularization Method
• Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data
• Effective, Fast, and Memory-Efficient Compressed Multi-function Convolutional Neural Networks for More Accurate Medical Image Classification
• Hand Gesture Detection and Conversion to Speech and Text
• The basins of attraction of the global minimizers of the non-convex sparse spikes estimation problem
• Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose
• Random polytopes obtained by matrices with heavy tailed entries
• Efficient Semantic Segmentation for Visual Bird’s-eye View Interpretation
• On the inducibility of small trees
• The Alon-Tarsi number of a planar graph minus a matching
• Generalized Graph Convolutional Networks for Skeleton-based Action Recognition
• 3D Shape Reconstruction from a Single 2D Image via 2D-3D Self-Consistency
• An Equivalence Class for Orthogonal Vectors
• Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs
