What's new on arXiv

Crossbar-aware neural network pruning

Crossbar-architecture-based devices have been widely adopted in neural network accelerators because of their high efficiency on vector-matrix multiplication (VMM) operations. However, in the case of convolutional neural networks (CNNs), the efficiency is compromised dramatically due to the large amounts of data reuse. Although some mapping methods have been designed to achieve a balance between the execution throughput and resource overhead, the resource consumption cost is still huge while maintaining the throughput. Network pruning is a promising and widely studied approach to shrinking the model size. However, previous work did not consider the crossbar architecture and the corresponding mapping method, so it cannot be directly utilized by crossbar-based neural network accelerators. Tightly combining the crossbar structure and its mapping, this paper proposes a crossbar-aware pruning framework based on a formulated L0-norm constrained optimization problem. Specifically, we design an L0-norm constrained gradient descent (LGD) with relaxant probabilistic projection (RPP) to solve this problem. Two grains of sparsity are successfully achieved: i) intuitive crossbar-grain sparsity and ii) column-grain sparsity with output recombination, based on which we further propose an input feature maps (FMs) reorder method to improve the model accuracy. We evaluate our crossbar-aware pruning framework on the medium-scale CIFAR10 dataset and the large-scale ImageNet dataset with VGG and ResNet models. Our method is able to reduce the crossbar overhead by 44%-72% with little accuracy degradation. This work greatly reduces the resource and related energy cost, and provides a new co-design solution for mapping CNNs onto various crossbar devices with significantly higher efficiency.
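To make the idea of crossbar-grain sparsity concrete, here is a minimal sketch of a gradient step followed by a projection onto an L0 constraint counted in whole tiles. It is not the paper's LGD/RPP algorithm: the projection is a plain deterministic "keep the largest tiles" rule rather than the relaxant probabilistic projection, and the names `project_l0_blocks`, `tile_rows`, `tile_cols` and `keep` are hypothetical.

```python
def block_norms(weights, tile_rows, tile_cols):
    """Squared Frobenius norm of each tile, keyed by (tile_row, tile_col)."""
    norms = {}
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            key = (r // tile_rows, c // tile_cols)
            norms[key] = norms.get(key, 0.0) + w * w
    return norms


def project_l0_blocks(weights, tile_rows, tile_cols, keep):
    """Keep the `keep` tiles with the largest norm and zero out all others.

    Assumption: a deterministic hard projection, used here only to
    illustrate an L0 constraint at the granularity of a crossbar tile.
    """
    norms = block_norms(weights, tile_rows, tile_cols)
    kept = set(sorted(norms, key=norms.get, reverse=True)[:keep])
    return [
        [w if (r // tile_rows, c // tile_cols) in kept else 0.0
         for c, w in enumerate(row)]
        for r, row in enumerate(weights)
    ]


def gradient_step(weights, grads, lr):
    """Plain gradient descent update on a weight matrix."""
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grads)]


# Toy 4x4 weight matrix mapped onto 2x2 tiles (stand-ins for crossbars).
weights = [
    [0.90, 0.80, 0.01, 0.02],
    [0.70, 0.60, 0.03, 0.01],
    [0.02, 0.01, 0.50, 0.40],
    [0.01, 0.03, 0.30, 0.60],
]
grads = [[0.0] * 4 for _ in range(4)]  # placeholder gradients
updated = gradient_step(weights, grads, lr=0.1)
pruned = project_l0_blocks(updated, tile_rows=2, tile_cols=2, keep=2)
```

In this toy case the two diagonal tiles survive and the two off-diagonal tiles are zeroed, so only two of the four hypothetical crossbars would need to be allocated.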

One-Shot Optimal Topology Generation through Theory-Driven Machine Learning

We introduce a theory-driven mechanism for learning a neural network model that performs generative topology design in one shot given a problem setting, circumventing the conventional iterative procedure that computational design tasks usually entail. The proposed mechanism can lead to machines that quickly respond to new design requirements based on knowledge accumulated through past experiences of design generation. Achieving such a mechanism through supervised learning would require an impractically large amount of problem-solution pairs for training, due to the known limitation of deep neural networks in knowledge generalization. To this end, we introduce an interaction between a student (the neural network) and a teacher (the optimality conditions underlying topology optimization): the student learns from existing data and is tested on unseen problems. Deviation of the student’s solutions from the optimality conditions is quantified and used to choose new data points for the student to learn from. We show through a compliance minimization problem that the proposed learning mechanism is significantly more data efficient than using a static dataset under the same computational budget.

Estimating a change point in a sequence of very high-dimensional covariance matrices

This paper considers the problem of estimating a change point in the covariance matrix in a sequence of high-dimensional vectors, where the dimension is substantially larger than the sample size. A two-stage approach is proposed to efficiently estimate the location of the change point. The first step consists of a reduction of the dimension to identify elements of the covariance matrices corresponding to significant changes. In a second step we use the components after dimension reduction to determine the position of the change point. Theoretical properties are developed for both steps and numerical studies are conducted to support the new methodology.
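The two-stage idea can be sketched with a simple CUSUM statistic. This is only an illustration under stated assumptions, not the estimator from the paper: observations are assumed to have mean zero, covariance entries are tracked through the products x_i * x_j, and the function names and the `threshold` parameter are hypothetical.

```python
from itertools import combinations_with_replacement
from math import sqrt


def cusum_path(values):
    """|S_k - (k/n) * S_n| for k = 1..n-1, where S_k sums the first k values."""
    n = len(values)
    total = sum(values)
    path, running = [], 0.0
    for k in range(1, n):
        running += values[k - 1]
        path.append(abs(running - k / n * total))
    return path


def two_stage_change_point(xs, threshold):
    """Return (estimated split k, selected covariance entries)."""
    n, p = len(xs), len(xs[0])
    # Stage 1 (dimension reduction): keep only the covariance entries
    # whose scaled CUSUM statistic exceeds the threshold.
    paths = {}
    for i, j in combinations_with_replacement(range(p), 2):
        path = cusum_path([x[i] * x[j] for x in xs])
        if max(path) / sqrt(n) > threshold:
            paths[(i, j)] = path
    if not paths:
        return None, []
    # Stage 2 (localization): aggregate the selected entries and take
    # the split point where the combined statistic is largest.
    combined = [sum(path[k] for path in paths.values()) for k in range(n - 1)]
    k_hat = max(range(n - 1), key=combined.__getitem__) + 1
    return k_hat, sorted(paths)


# Deterministic toy data: coordinates 0 and 1 are uncorrelated for the
# first 32 observations and identical afterwards; coordinate 2 never changes.
a = [1, 1, -1, -1]
b = [1, -1, -1, 1]
c = [0.5] * 4 + [-0.5] * 4
xs = []
for t in range(48):
    first = a[t % 4]
    second = b[t % 4] if t < 32 else first
    xs.append((first, second, c[t % 8]))

k_hat, selected = two_stage_change_point(xs, threshold=1.0)
```

On this toy sequence only the (0, 1) entry passes the first stage, and the second stage places the change after the first 32 observations.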

Markets for Public Decision-making

A public decision-making problem consists of a set of issues, each with multiple possible alternatives, and a set of competing agents, each with a preferred alternative for each issue. We study adaptations of market economies to this setting, focusing on binary issues. Issues have prices, and each agent is endowed with artificial currency that she can use to purchase probability for her preferred alternatives (we allow randomized outcomes). We first show that when each issue has a single price that is common to all agents, market equilibria can be arbitrarily bad. This negative result motivates a different approach. We present a novel technique called ‘pairwise issue expansion’, which transforms any public decision-making instance into an equivalent Fisher market, the simplest type of private goods market. This is done by expanding each issue into many goods: one for each pair of agents who disagree on that issue. We show that the equilibrium prices in the constructed Fisher market yield a ‘pairwise pricing equilibrium’ in the original public decision-making problem which maximizes Nash welfare. More broadly, pairwise issue expansion uncovers a powerful connection between the public decision-making and private goods settings; this immediately yields several interesting results about public decisions markets, and furthers the hope that we will be able to find a simple iterative voting protocol that leads to near-optimum decisions.
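The expansion step described above (one good per issue and per pair of agents who disagree on it) can be sketched as follows. The sketch only enumerates the goods of the constructed market; it does not compute prices or an equilibrium, and the function name and data layout are hypothetical.

```python
from itertools import combinations


def pairwise_issue_expansion(preferences):
    """Enumerate the goods created by expanding each binary issue.

    preferences maps an agent name to a list with that agent's preferred
    alternative (0 or 1) for each issue. One good (issue, agent_a, agent_b)
    is created for every pair of agents who disagree on that issue.
    """
    agents = sorted(preferences)
    n_issues = len(next(iter(preferences.values())))
    goods = []
    for issue in range(n_issues):
        for agent_a, agent_b in combinations(agents, 2):
            if preferences[agent_a][issue] != preferences[agent_b][issue]:
                goods.append((issue, agent_a, agent_b))
    return goods


prefs = {
    "alice": [1, 0, 1],
    "bob":   [0, 0, 1],
    "carol": [1, 1, 1],
}
goods = pairwise_issue_expansion(prefs)
```

Here the third issue, on which everyone agrees, contributes no goods, while the first two issues contribute two goods each.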

A New Bivariate Point Process Model with Application to Social Media User Content Generation

In this paper, we propose a new bivariate point process model to study the activity patterns of social media users. The proposed model is flexible enough to accommodate the complex behaviors of modern social media users and can also provide meaningful insight into them. A composite likelihood approach and a composite EM estimation procedure are developed to overcome the challenges that arise in parameter estimation. Furthermore, we show consistency and asymptotic normality of the resulting estimator. We apply our proposed method to Donald Trump’s Twitter data and study if and how his tweeting behavior evolved before, during and after the presidential campaign. Moreover, we apply our method to a large-scale social media dataset and find interesting subgroups of users with distinct behaviors. Additionally, we discuss the effect of social ties on a user’s online content generating behavior.

A Survey of the Usages of Deep Learning in Natural Language Processing

Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to a number of applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.

Transportation Modes Classification Using Feature Engineering

Predicting transportation modes from GPS (Global Positioning System) records is a hot topic in the trajectory mining domain. Each GPS record is called a trajectory point and a trajectory is a sequence of these points. Trajectory mining has applications including, but not limited to, transportation mode detection, tourism, traffic congestion, smart cities management, animal behaviour analysis, environmental preservation, and traffic dynamics. Transportation mode prediction, as one of the tasks in human mobility and vehicle mobility applications, plays an important role in resource allocation, traffic management systems, tourism planning and accident detection. In this work, the framework proposed in Etemad et al. is extended to consider other aspects of the transportation mode prediction task. Wrapper search and information retrieval methods were investigated to find the best subset of trajectory features. After finding the best classifier and the best feature subset, the framework is compared against two related papers that applied deep learning methods. The results show that our framework achieved better performance. Moreover, ground truth noise removal improved the accuracy of the transportation mode prediction task; however, the assumption of having access to test set labels in the pre-processing task is invalid. Furthermore, cross validation approaches were investigated, and the performance results show that the random cross validation method provides optimistic results.
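As an illustration of feature engineering on trajectory points, the sketch below derives speed features from consecutive GPS records using the haversine distance. The choice of features (mean and maximum speed), the `(timestamp_s, lat, lon)` layout and the function names are assumptions made for this example, not the feature set used in the paper.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def speed_features(points):
    """Summarise a trajectory given as a list of (timestamp_s, lat, lon)."""
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt > 0:  # skip duplicate or out-of-order timestamps
            speeds.append(haversine_m(lat0, lon0, lat1, lon1) / dt)
    if not speeds:
        return {"mean_speed": 0.0, "max_speed": 0.0}
    return {"mean_speed": sum(speeds) / len(speeds), "max_speed": max(speeds)}


# Three points along the equator, 10 seconds apart.
trajectory = [(0, 0.0, 0.000), (10, 0.0, 0.001), (20, 0.0, 0.003)]
features = speed_features(trajectory)
```

Features like these would then be passed to a feature-selection step and a classifier; both are outside the scope of this sketch.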

PROPEL: Probabilistic Parametric Regression Loss for Convolutional Neural Networks

Recently, Convolutional Neural Networks (CNNs) have dominated the field of computer vision. Their widespread success has been attributed to their representation learning capabilities. For classification tasks, CNNs have widely employed probabilistic output and have shown the significance of providing additional confidence for predictions. However, such probabilistic methodologies are not widely applicable for addressing regression problems using CNNs, as regression involves learning unconstrained continuous and, in many cases, multi-variate target variables. We propose a PRObabilistic Parametric rEgression Loss (PROPEL) that enables probabilistic regression using CNNs. PROPEL is fully differentiable and, hence, can be easily incorporated for end-to-end training of existing regressive CNN architectures. The proposed method is flexible as it learns complex unconstrained probabilities while being generalizable to higher dimensional multi-variate regression problems. We utilize a PROPEL-based CNN to address the problem of learning hand and head orientation from uncalibrated color images. Comprehensive experimental validation and comparisons with existing CNN regression loss functions are provided. Our experimental results indicate that PROPEL significantly improves the performance of a CNN, while reducing model parameters by 10x as compared to the existing state-of-the-art.

Prior-Proposal Recursive Bayesian Inference

Bayesian models are naturally equipped to provide recursive inference because they can formally reconcile new data and existing scientific information. However, popular use of Bayesian methods often avoids priors that are based on exact posterior distributions resulting from former studies. Recursive Bayesian methods include two main approaches that we refer to as Prior- and Proposal-Recursive Bayes. Prior-Recursive Bayes uses Bayesian updating, fitting models to partitions of data sequentially, and provides a convenient way to accommodate new data as they become available. Prior-Recursive Bayes uses the posterior from the previous stage as the prior in the new stage based on the latest data. By contrast, Proposal-Recursive Bayes is intended for use with hierarchical Bayesian models and uses a set of transient priors in first stage independent analyses of the data partitions. The second stage of Proposal-Recursive Bayes uses the posterior distributions from the first stage as proposals in an MCMC algorithm to fit the full model. The second-stage recursive proposals simplify the Metropolis-Hastings ratio substantially and can lead to computational advantages for the Proposal-Recursive Bayes method. We combine Prior- and Proposal-Recursive concepts in a framework that can be used to fit any Bayesian model exactly, and often with computational improvements. We demonstrate our new method by fitting a geostatistical model to spatially-explicit data in a sequence of stages, leading to computational improvements by a factor of three in our example. While the method we propose provides exact inference, it can also be coupled with modern approximation methods leading to additional computational efficiency. Overall, our new approach has implications for big data, streaming data, and optimal adaptive design situations and can be modified to fit a broad class of Bayesian models to data.
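The Prior-Recursive idea (the posterior from one stage becomes the prior for the next) is easiest to see in a conjugate model. The sketch below uses a Beta prior with Bernoulli data, chosen here purely for illustration; it does not cover Proposal-Recursive Bayes or any MCMC step, and `update_beta` is a hypothetical helper.

```python
def update_beta(prior, batch):
    """Beta-Bernoulli update: prior is (alpha, beta), batch is a list of 0/1."""
    alpha, beta = prior
    successes = sum(batch)
    return alpha + successes, beta + len(batch) - successes


# Data arriving in three partitions.
partitions = [[1, 0, 1], [1, 1], [0, 0, 1, 1]]

# Recursive fit: each stage uses the previous posterior as its prior.
posterior = (1, 1)  # uniform Beta(1, 1) prior
for batch in partitions:
    posterior = update_beta(posterior, batch)

# Single fit to all the data at once, for comparison.
all_at_once = update_beta((1, 1), [y for batch in partitions for y in batch])
```

Both routes give the same Beta(7, 4) posterior, which is the sense in which recursive updating is exact in this simple case.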

A Survey of Machine and Deep Learning Methods for Internet of Things (IoT) Security

The Internet of Things (IoT) integrates billions of smart devices that can communicate with one another with minimal human intervention. It is one of the fastest developing fields in the history of computing, with an estimated 50 billion devices by the end of 2020. On the one hand, IoT plays a crucial role in enhancing several real-life smart applications that can improve life quality. On the other hand, the crosscutting nature of IoT systems and the multidisciplinary components involved in the deployment of such systems have introduced new security challenges. Implementing security measures, such as encryption, authentication, access control, network security and application security, for IoT devices and their inherent vulnerabilities is ineffective. Therefore, existing security methods should be enhanced to secure the IoT system effectively. Machine learning and deep learning (ML/DL) have advanced considerably over the last few years, and machine intelligence has transitioned from laboratory curiosity to practical machinery in several important applications. Consequently, ML/DL methods are important in transforming the security of IoT systems from merely facilitating secure communication between devices to security-based intelligence systems. The goal of this work is to provide a comprehensive survey of ML/DL methods that can be used to develop enhanced security methods for IoT systems. IoT security threats that are related to inherent or newly introduced threats are presented, and various potential IoT system attack surfaces and the possible threats related to each surface are discussed. We then thoroughly review ML/DL methods for IoT security and present the opportunities, advantages and shortcomings of each method. We discuss the opportunities and challenges involved in applying ML/DL to IoT security. These opportunities and challenges can serve as potential future research directions.

Estimating Causal Effects Under Interference Using Bayesian Generalized Propensity Scores

In most real-world systems, units are interconnected and can be represented as networks consisting of nodes and edges. For instance, in social systems individuals can have social ties, family or financial relationships. In settings where some units are exposed to a treatment and its effect spills over to connected units, estimating both the direct effect of the treatment and spillover effects presents several challenges. First, assumptions on the way and the extent to which spillover effects occur along the observed network are required. Second, in observational studies, where the treatment assignment is not under the control of the investigator, confounding and homophily are potential threats to the identification and estimation of causal effects on networks. Here, we make two structural assumptions: i) neighborhood interference, which assumes interference operates only through a function of the immediate neighbors’ treatments, and ii) unconfoundedness of the individual and neighborhood treatment, which rules out the presence of unmeasured confounding variables, including those driving homophily. Under these assumptions we develop a new covariate-adjustment estimator for treatment and spillover effects in observational studies on networks. Estimation is based on a generalized propensity score that balances individual and neighborhood covariates across units under different levels of individual treatment and of exposure to neighbors’ treatment. Adjustment for the propensity score is performed using a penalized spline regression. Inference capitalizes on a three-step Bayesian procedure which takes into account the uncertainty in the propensity score estimation and avoids model feedback. Finally, correlation of interacting units is taken into account using a community detection algorithm and incorporating random effects in the outcome model.
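Under the neighborhood interference assumption, each unit's exposure is a function of its immediate neighbors' treatments. A minimal sketch of one such function, the fraction of treated neighbors, is given below. The abstract does not specify which function is used, so this choice, like the function name and the data layout, is an assumption.

```python
def neighborhood_exposure(edges, treatment):
    """Fraction of each unit's immediate neighbors that are treated.

    edges is a list of undirected (u, v) pairs; treatment maps each unit
    to 0 or 1. Units without neighbors get an exposure of 0.0.
    """
    neighbors = {unit: set() for unit in treatment}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {
        unit: (sum(treatment[n] for n in nbrs) / len(nbrs) if nbrs else 0.0)
        for unit, nbrs in neighbors.items()
    }


edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
treatment = {"a": 1, "b": 0, "c": 1, "d": 0, "e": 0}
exposure = neighborhood_exposure(edges, treatment)
```

Each unit is then described by the pair (own treatment, neighborhood exposure), which is the joint treatment that a generalized propensity score would be modeled on.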

MISS: Finding Optimal Sample Sizes for Approximate Analytics

$L^2$

DataJoint: A Simpler Relational Data Model

The relational data model offers unrivaled rigor and precision in defining data structure and querying complex data. Yet the use of relational databases in scientific data pipelines is limited due to their perceived unwieldiness. We propose a simplified and conceptually refined relational data model named DataJoint. The model includes a language for schema definition, a language for data queries, and diagramming notation for visualizing entities and relationships among them. The model adheres to the principle of entity normalization, which requires that all data — both stored and derived — must be represented by well-formed entity sets. DataJoint’s data query language is an algebra on entity sets with five operators that provide matching capabilities to those of other relational query languages with greater clarity due to entity normalization. Practical implementations of DataJoint have been adopted in neuroscience labs for fluent interaction with scientific data pipelines.

• Weakly-Supervised Deep Learning of Heat Transport via Physics Informed Loss
• Robust Stabilization of Fractional-order Interval Systems via Dynamic Output Feedback: An LMI Approach
• Automatic Processing and Solar Cell Detection in Photovoltaic Electroluminescence Images
• A writer-independent approach for offline signature verification using deep convolutional neural networks features
• False Positive Reduction by Actively Mining Negative Samples for Pulmonary Nodule Detection in Chest Radiographs
• A multi-contrast MRI approach to thalamus segmentation
• On the Inability of Markov Models to Capture Criticality in Human Mobility
• The Generalized Power Generalized Weibull Distribution: Properties and Applications
• Backflow Transformations via Neural Networks for Quantum Many-Body Wave-Functions
• Theta and eta polynomials in geometry, Lie theory, and combinatorics
• The Sparse Variance Contamination Model
• Solving Target Set Selection with Bounded Thresholds Faster than $2^n$
• NDBench: Benchmarking Microservices at Scale
• Combined Mutiplicative-Heston Model for Stochastic Volatility
• Clustering Prominent People and Organizations in Topic-Specific Text Corpora
• On the expected runtime of multiple testing algorithms with bounded error
• Improving Neural Sequence Labelling using Additional Linguistic Information
• Gated Fusion Network for Joint Image Deblurring and Super-Resolution
• CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance
• A cornucopia of quasi-Yamanouchi tableaux
• Space-Time Block Coded Spatial Modulation Aided mmWave MIMO with Hybrid Precoding
• ACQUIRE: an inexact iteratively reweighted norm approach for TV-based Poisson image restoration
• TBI Contusion Segmentation from MRI using Convolutional Neural Networks
• Nonparametric estimation of utility functions
• Agent cognition through micro-simulations: Adaptive and tunable intelligence with NetLogo LevelSpace
• Deep nested level sets: Fully automated segmentation of cardiac MR images in patients with pulmonary hypertension
• On Disjoint Holes in Point Sets
• Schur Ring, Run Structure and Periodic Compatible Binary Sequences
• Synthesizing CT from Ultrashort Echo-Time MR Images via Convolutional Neural Networks
• A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition
• Inference of stochastic parameterizations for model error treatment using nested ensemble Kalman filters
• Communication-efficient Distributed Multi-resource Allocation
• Residual Balancing Weights for Marginal Structural Models: with Application to Analyses of Time-varying Treatments and Causal Mediation
• Bayesian Sparse Propensity Score Estimation for Unit Nonresponse
• TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing
• Revealed Preference Dimension via Matrix Sign Rank
• On the Capacity of Symmetric $M$-user Gaussian Interference Channels with Feedback
• Pairwise Body-Part Attention for Recognizing Human-Object Interactions
• Eigenvalues of the Laplacian on the Goldberg-Coxeter constructions for $3$- and $4$-valent graphs
• Back-Translation-Style Data Augmentation for End-to-End ASR
• Holographic Sensing
• Equilibrium Problems and Proximal Algorithms in Hadamard Spaces
• Spectrahedral representations of plane hyperbolic curves
• Logistic regression and Ising networks: prediction and estimation when violating lasso assumptions
• Maximum Margin Metric Learning Over Discriminative Nullspace for Person Re-identification
• Sparsity Learning Based Multiuser Detection in Grant-Free Massive-Device Multiple Access
• Unsupervised Adversarial Depth Estimation using Cycled Generative Networks
• Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data
• Multiple Access for Transmissions Over Independent Fading Channels
• A Law of Large Numbers and Large Deviations for interacting diffusions on Erdös-Rényi graphs
• Stability of the overdamped Langevin equation in double-well potential
• Poisson and normal approximations for the measurable functions of independent random variables
• Decimation analysis in the signal processing of current to detect broken bars in induction machine
• Deep Leaf Segmentation Using Synthetic Data
• Bike Flow Prediction with Multi-Graph Convolutional Networks
• Towards Explainable Inference about Object Motion using Qualitative Reasoning
• Unsupervised Learning of a Hierarchical Spiking Neural Network for Optical Flow Estimation: From Events to Global Motion Perception
• Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech
• Articulatory Features for ASR of Pathological Speech
• Building a Unified Code-Switching ASR System for South African Languages
• Domination Mappings into the Hamming Ball: Existence, Constructions, and Algorithms
• Group-sparse SVD Models and Their Applications in Biological Data
• Improving Sequential Determinantal Point Processes for Supervised Video Summarization
• Uniqueness of Minimizers of Some Variational Problems Arising in Image Processing
• Resilience of airborne networks
• Modulation Mode Detection & Classification for in-Vivo Nano-Scale Communication Systems Operating in Terahertz Band
• Ontology-Grounded Topic Modeling for Climate Science Research
• On roots of Wiener polynomials of trees
• A multi-material transport problem with arbitrary marginals
• High Precision Numerical Computation of Principal Points For Univariate Distributions
• RS-Net: Regression-Segmentation 3D CNN for Synthesis of Full Resolution Missing Brain MRI in the Presence of Tumours
• Polynomial Identities Implying Capparelli’s Partition Theorems
• Point Process Models for Distribution of Cell Phone Antennas
• A combinatorial proof of the extension property for partial isometries
• Energy Contract Settlements through Automated Negotiation in Residential Cooperatives
• Actor-Centric Relation Network
• Team Diagonalization
• Domain Robust Feature Extraction for Rapid Low Resource ASR Development
• A new mixture-based fixed-effect model for a biometrical case-study related to immunogenecity with highly censored data
• Naji’s characterization of circle graphs
• Bridge the Gap Between VQA and Human Behavior on Omnidirectional Video: A Large-Scale Dataset and a Deep Learning Model
• U-Finger: Multi-Scale Dilated Convolutional Network for Fingerprint Image Denoising and Inpainting
• Optimal Tap Setting of Voltage Regulation Transformers Using Batch Reinforcement Learning
• A new sum-product estimate in prime fields
• A Two-Phase Quasi-Newton Method for Optimization Problem
• Convex Hull Formulations for Mixed-Integer Multilinear Functions
• Sidekick Policy Learning for Active Visual Exploration
• Decomposable clutters and a generalization of Simon’s conjectutre
• Tiny-DSOD: Lightweight Object Detection for Resource-Restricted Usages
• A Margin-based MLE for Crowdsourced Partial Ranking
• Betti Numbers of Gaussian Excursions in the Sparse Regime
• Optimum Depth of the Bounded Pipeline
• Opinion Spam Recognition Method for Online Reviews using Ontological Features
• The Linking-Unlinking Game
• MoCoNet: Motion Correction in 3D MPRAGE images using a Convolutional Neural Network approach
• Consistent polynomial-time unseeded graph matching for Lipschitz graphons
• Dynamical analysis of a chaos generator
• PSDF Fusion: Probabilistic Signed Distance Function for On-the-fly 3D Data Fusion and Scene Reconstruction
• Texture Mixing by Interpolating Deep Statistics via Gaussian Models
• Efficient Uncertainty Estimation for Semantic Segmentation in Videos
• Fast Trajectory Planning for Automated Vehicles using Gradient-based Nonlinear Model Predictive Control
• Towards Good Practices on Building Effective CNN Baseline Model for Person Re-identification
• On L-shaped Point Set Embeddings of Trees: First Non-embeddable Examples
• Fast derivation of neural network based document vectors with distance constraint and negative sampling
• Clause Vivification by Unit Propagation in CDCL SAT Solvers
• A Note on Bayesian Nonparametric Inference for Spherically Symmetric Distribution
• Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking
• An Open Framework Enabling Electromagnetic Tracking in Image-Guided Interventions
• Visual Analogies between Atari Games for Studying Transfer Learning in RL
• Semi-supervised CNN for Single Image Rain Removal
• ReenactGAN: Learning to Reenact Faces via Boundary Transfer
• Towards ultra-high resolution 3D reconstruction of a whole rat brain from 3D-PLI data
• Convolutional Gated Recurrent Units for Medical Relation Classification
• Information Distance Revisited
• Towards Automatic Speech Identification from Vocal Tract Shape Dynamics in Real-time MRI
• ADAM-ADMM: A Unified, Systematic Framework of Structured Weight Pruning for DNNs
• Modestly Weighted Logrank Tests
