What's new on arXiv

Overarching Computation Model (OCM)

Existing models of computation, such as the Turing machine (hereafter, TM), do not consider the agent involved in interpreting the outcome of the computation. We argue that a TM, or any other model of computation, has no significance if its output is not interpreted by some agent. Furthermore, we argue that including the interpreter in the model definition sheds light on some of the difficult problems faced in computation and mathematics. We provide an analytic process framework to address this limitation. The framework can be overlaid on existing concepts of computation to address many practical and philosophical concerns, such as the P vs NP problem. In addition, we provide a constructive proof for the P vs NP problem under the assumption that the class NP comprises problems solvable by non-deterministic algorithms. We utilize the observation that deterministic computational procedures lack the fundamental capacity to fully simulate their non-deterministic variants.

Maximum Weight Online Matching with Deadlines

We study the problem of matching agents who arrive at a marketplace over time and leave after d time periods. Agents can only be matched while they are present in the marketplace. Each pair of agents can yield a different match value, and the planner’s goal is to maximize the total value over a finite time horizon. First, we study the case in which vertices arrive in an adversarial order. We provide a randomized 0.25-competitive algorithm building on results by Feldman et al. (2009) and Lehman et al. (2006). We extend the model to the case in which departure times are drawn independently from a distribution with non-decreasing hazard rate, for which we establish a 1/8-competitive algorithm. When the arrival order is chosen uniformly at random, we show that a batching algorithm, which computes a maximum-weight matching every (d+1) periods, is 0.279-competitive.
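To make the batching idea concrete, here is a minimal sketch, not the paper's implementation: agents are pooled for d+1 periods, then a maximum-weight matching is computed over the pool. For brevity the matching itself is found by brute-force search, which is only feasible for small batches; a real implementation would use a polynomial-time matching algorithm.

```python
def max_weight_matching(agents, value):
    """Brute-force maximum-weight matching over a small set of agents.
    `value` maps an unordered pair of agents to its match value."""
    best_value, best_pairs = 0.0, []
    agents = list(agents)

    def search(remaining, total, pairs):
        nonlocal best_value, best_pairs
        if total > best_value:
            best_value, best_pairs = total, list(pairs)
        if len(remaining) < 2:
            return
        a = remaining[0]
        # Option 1: leave agent `a` unmatched.
        search(remaining[1:], total, pairs)
        # Option 2: match `a` with each later agent.
        for i, b in enumerate(remaining[1:], start=1):
            rest = remaining[1:i] + remaining[i + 1:]
            search(rest, total + value(a, b), pairs + [(a, b)])

    search(agents, 0.0, [])
    return best_value, best_pairs


def batching(arrivals, d, value):
    """Batching algorithm: every d+1 periods, compute a max-weight
    matching over the agents that arrived in that window.
    `arrivals` is a list of periods, each a list of arriving agents."""
    total, matched = 0.0, []
    for start in range(0, len(arrivals), d + 1):
        batch = [a for period in arrivals[start:start + d + 1] for a in period]
        v, pairs = max_weight_matching(batch, value)
        total += v
        matched.extend(pairs)
    return total, matched
```

For example, with d=1 and the match value of a pair taken as the sum of the two agents' labels, `batching([[1, 2], [3], [4, 5], [6]], 1, lambda a, b: a + b)` pools {1, 2, 3} and {4, 5, 6} and matches (2, 3) and (5, 6).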

NL4Py: Agent-Based Modeling in Python with Parallelizable NetLogo Workspaces

NL4Py is NetLogo controller software for Python, for the rapid, parallel execution of NetLogo models. NL4Py provides both headless (no graphical user interface) and GUI NetLogo workspace control through Python. Spurred on by the increasing availability of open-source computation and machine learning libraries on the Python package index, there is an increasing demand for such rapid, parallel execution of agent-based models through Python. NetLogo, being the language of choice for a majority of agent-based modeling driven research projects, requires integration with Python for researchers looking to perform statistical analyses of agent-based model output using these libraries. Unfortunately, until the recent introduction of PyNetLogo, and now NL4Py, such a controller was unavailable. This article provides a detailed introduction to the usage of NL4Py and explains its client-server software architecture, highlighting architectural differences to PyNetLogo. A step-by-step demonstration of global sensitivity analysis and parameter calibration of the Wolf Sheep Predation model is then performed through NL4Py. Finally, NL4Py’s performance is benchmarked against PyNetLogo and its combination with IPyParallel, and shown to provide significant savings in execution time over both configurations.

Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network

Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of ‘unrolling’ an RNN is routinely presented without justification throughout the literature. The goal of this paper is to explain the essential RNN and LSTM fundamentals in a single document. Drawing from concepts in signal processing, we formally derive the canonical RNN formulation from differential equations. We then propose and prove a precise statement, which yields the RNN unrolling technique. We also review the difficulties with training the standard RNN and address them by transforming the RNN into the ‘Vanilla LSTM’ network through a series of logical arguments. We provide all equations pertaining to the LSTM system together with detailed descriptions of its constituent entities. Albeit unconventional, our choice of notation and the method for presenting the LSTM system emphasize ease of understanding. As part of the analysis, we identify new opportunities to enrich the LSTM system and incorporate these extensions into the Vanilla LSTM network, producing the most general LSTM variant to date. The target reader has already been exposed to RNNs and LSTM networks through numerous available resources and is open to an alternative pedagogical approach. A Machine Learning practitioner seeking guidance for implementing our new augmented LSTM model in software for experimentation and research will find the insights and derivations in this tutorial valuable as well.
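For reference, the standard Vanilla LSTM inference equations the paper derives can be written down in a few lines of numpy. This is the textbook cell (gate order and weight stacking are our conventions, not necessarily the paper's notation):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One forward step of a standard (Vanilla) LSTM cell.
    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) biases,
    stacked in the order [input gate, forget gate, candidate, output gate]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b       # pre-activations for all four blocks
    i = sigmoid(z[0:H])              # input gate
    f = sigmoid(z[H:2*H])            # forget gate
    g = np.tanh(z[2*H:3*H])          # candidate cell update
    o = sigmoid(z[3*H:4*H])          # output gate
    c = f * c_prev + i * g           # new cell state
    h = o * np.tanh(c)               # new hidden state
    return h, c
```

Unrolling over a sequence is then just calling `lstm_step` once per time step, threading `(h, c)` through, which is the justification-worthy step the paper makes precise.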

Fuzzy Clustering to Identify Clusters at Different Levels of Fuzziness: An Evolutionary Multi-Objective Optimization Approach

Linked Causal Variational Autoencoder for Inferring Paired Spillover Effects

Exploiting Structure for Fast Kernel Learning

We propose two methods for exact Gaussian process (GP) inference and learning on massive image, video, spatial-temporal, or multi-output datasets with missing values (or ‘gaps’) in the observed responses. The first method ignores the gaps using sparse selection matrices and a highly effective low-rank preconditioner is introduced to accelerate computations. The second method introduces a novel approach to GP training whereby response values are inferred on the gaps before explicitly training the model. We find this second approach to be greatly advantageous for the class of problems considered. Both of these novel approaches make extensive use of Kronecker matrix algebra to design massively scalable algorithms which have low memory requirements. We demonstrate exact GP inference for a spatial-temporal climate modelling problem with 3.7 million training points as well as a video reconstruction problem with 1 billion points.
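The scalability here rests on a standard piece of Kronecker matrix algebra: the product (A ⊗ B) vec(X) equals vec(B X Aᵀ), so the Kronecker matrix never has to be formed. A minimal numpy check of the identity (with column-stacking vectorization):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((4, 4))
X = rng.standard_normal((4, 3))   # rows match B, columns match A

vec = lambda M: M.flatten(order="F")      # column-stacking vectorization

# Naive: materialize the 12x12 Kronecker product, O((nm)^2) memory.
lhs = np.kron(A, B) @ vec(X)

# Kronecker trick: (A kron B) vec(X) = vec(B X A^T), never forming A kron B.
rhs = vec(B @ X @ A.T)

assert np.allclose(lhs, rhs)
```

For an n x n spatial grid times m time steps this turns an O((nm)²) matrix-vector product into two small dense multiplications, which is what makes exact GP algebra feasible at millions of points.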

Fast Flexible Function Dispatch in Julia

Technical computing is a challenging application area for programming languages to address. This is evinced by the unusually large number of specialized languages in the area (e.g. MATLAB, R), and the complexity of common software stacks, often involving multiple languages and custom code generators. We believe this is ultimately due to key characteristics of the domain: highly complex operators, a need for extensive code specialization for performance, and a desire for permissive high-level programming styles allowing productive experimentation. The Julia language attempts to provide a more effective structure for this kind of programming by allowing programmers to express complex polymorphic behaviors using dynamic multiple dispatch over parametric types. The forms of extension and reuse permitted by this paradigm have proven valuable for technical computing. We report on how this approach has allowed domain experts to express useful abstractions while simultaneously providing a natural path to better performance for high-level technical code.
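To illustrate what dynamic multiple dispatch means, here is a deliberately tiny Python analogy (not Julia, and far from Julia's type-specialized compilation): the method chosen depends on the runtime types of all arguments, not just the first. The registry names (`defmethod`, `combine`) are ours, purely illustrative.

```python
# Methods are looked up by the tuple of argument types; more specific
# registrations should come first, since lookup scans in insertion order.
_methods = {}

def defmethod(*types):
    def register(fn):
        _methods[types] = fn
        return fn
    return register

def combine(a, b):
    """Dispatch on the runtime types of BOTH arguments."""
    for (ta, tb), fn in _methods.items():
        if isinstance(a, ta) and isinstance(b, tb):
            return fn(a, b)
    raise TypeError("no method matches")

@defmethod(int, int)
def _(a, b):
    return a + b

@defmethod(str, str)
def _(a, b):
    return a + " " + b

@defmethod(list, object)
def _(a, b):
    return a + [b]
```

In Julia this mechanism is built into the language and drives code specialization, which is precisely why it serves both the polymorphism and the performance goals the abstract describes.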

Fast computation of the principal components of genotype matrices in Julia

Finding the largest few principal components of a matrix of genetic data is a common task in genome-wide association studies (GWASs), both for dimensionality reduction and for identifying unwanted factors of variation. We describe a simple random matrix model for matrices that arise in GWASs, showing that the singular values exhibit a bulk that follows a Marchenko-Pastur distribution, with a handful of large outliers. We also implement Golub-Kahan-Lanczos (GKL) bidiagonalization in the Julia programming language, providing thick restarting and a choice between full and partial reorthogonalization strategies to control numerical roundoff. Our implementation of GKL bidiagonalization is up to 36 times faster than software tools used commonly in genomics data analysis for computing principal components, such as EIGENSOFT and FlashPCA, which use dense LAPACK routines and randomized subspace iteration respectively.
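As a small reference for the target computation (not the paper's GKL Lanczos implementation), here is a dense numpy sketch of extracting the top few principal components from a low-rank-plus-noise matrix of the kind the random matrix model describes, with a few large outlier singular values over a Marchenko-Pastur bulk:

```python
import numpy as np

def top_pcs(G, k):
    """Top-k principal components of an n-samples x p-markers matrix G.
    A small dense reference; at GWAS scale one would use Lanczos
    bidiagonalization rather than a full SVD."""
    Gc = G - G.mean(axis=0)              # center each marker column
    U, s, Vt = np.linalg.svd(Gc, full_matrices=False)
    scores = U[:, :k] * s[:k]            # sample projections (PC scores)
    loadings = Vt[:k]                    # marker loadings
    explained = s[:k] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained

# Synthetic low-rank structure plus noise, mimicking the model's
# "bulk plus a handful of outliers" singular value behavior.
rng = np.random.default_rng(1)
n, p, k = 200, 500, 2
structure = rng.standard_normal((n, k)) @ rng.standard_normal((k, p)) * 3.0
G = structure + rng.standard_normal((n, p))
scores, loadings, explained = top_pcs(G, k)
```

On data like this, the top two components capture the planted structure, while the remaining singular values form the noise bulk.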

Hierarchical Block Sparse Neural Networks

Sparse deep neural networks (DNNs) are more efficient in both memory and compute than dense DNNs. However, due to the irregularity of sparse computation, their efficiency on general-purpose hardware is much lower than that of dense DNNs, which results in little or no practical performance benefit. The performance of sparse DNNs can be improved by imposing structure on the sparsity and leveraging it for runtime efficiency, but such structural constraints often lead to sparse models with suboptimal accuracy. In this work, we jointly address both the accuracy and the performance of sparse DNNs using our proposed class of neural networks called HBsNN (Hierarchical Block Sparse Neural Networks).
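To see why block structure helps, here is a minimal sketch (not the HBsNN implementation) of a block-sparse matrix-vector product: only nonzero tiles are stored, and each stored tile contributes a dense, regular inner product rather than scattered element-wise accesses.

```python
import numpy as np

BLOCK = 4

def to_block_sparse(W, block=BLOCK):
    """Keep only nonzero tiles; return a dict {(row_off, col_off): tile}."""
    blocks = {}
    for bi in range(0, W.shape[0], block):
        for bj in range(0, W.shape[1], block):
            tile = W[bi:bi + block, bj:bj + block]
            if np.any(tile != 0):
                blocks[(bi, bj)] = tile
    return blocks

def block_sparse_matvec(blocks, x, out_dim, block=BLOCK):
    """y = W @ x using only the stored tiles; every operation is a
    dense, cache-friendly tile product, unlike unstructured sparsity."""
    y = np.zeros(out_dim)
    for (bi, bj), tile in blocks.items():
        y[bi:bi + block] += tile @ x[bj:bj + block]
    return y

rng = np.random.default_rng(2)
W = rng.standard_normal((16, 16))
# Impose block sparsity: zero out roughly 75% of the 4x4 tiles.
mask = rng.random((4, 4)) < 0.25
W *= np.kron(mask, np.ones((BLOCK, BLOCK)))
x = rng.standard_normal(16)
blocks = to_block_sparse(W)
```

The block-sparse product matches the dense one exactly while touching only the retained tiles; the hierarchical part of HBsNN extends this idea across multiple block granularities.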

Lingke: A Fine-grained Multi-turn Chatbot for Customer Service

Traditional chatbots usually require large amounts of human dialogue data, especially when built with supervised machine learning methods. Though they can easily handle single-turn question answering, their performance on multi-turn conversations is usually unsatisfactory. In this paper, we present Lingke, an information-retrieval-augmented chatbot that can answer questions based on a given product introduction document and handle multi-turn conversations. We introduce a fine-grained pipeline for distilling responses from unstructured documents, and attentive sequential context-response matching for multi-turn conversations.
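The retrieval step of such a pipeline can be sketched with plain TF-IDF over the document's sentences; this is only an illustration of the idea, not Lingke's actual (attention-based) matching, and the example product sentences are invented.

```python
import math
from collections import Counter

def tfidf_vectors(sentences):
    """Bag-of-words TF-IDF vectors for a list of sentences."""
    docs = [Counter(s.lower().split()) for s in sentences]
    n = len(docs)
    df = Counter(w for d in docs for w in d)
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}
    return [{w: tf * idf[w] for w, tf in d.items()} for d in docs], idf

def cosine(u, v):
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def answer(question, sentences):
    """Return the document sentence most similar to the question."""
    vecs, idf = tfidf_vectors(sentences)
    q = Counter(question.lower().split())
    qv = {w: tf * idf.get(w, 0.0) for w, tf in q.items()}
    scores = [cosine(qv, v) for v in vecs]
    return sentences[scores.index(max(scores))]

# Hypothetical product introduction document, split into sentences.
sentences = [
    "The battery lasts ten hours",
    "The phone ships with a charger",
    "Returns are accepted within 30 days",
]
reply = answer("how long does the battery last", sentences)
```

A multi-turn system would additionally score candidates against the conversation history, which is where the sequential context-response matching comes in.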

Bagging of Density Estimators

In this work we give new density estimators by averaging classical density estimators, such as the histogram, the frequency polygon, and the kernel density estimator, over different bootstrap samples of the original data. We prove the L2-consistency of these new estimators and compare them to several similar approaches through extensive simulations. Based on these estimators, we also give a way to construct nonparametric pointwise confidence intervals for the target density.
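The bagging scheme can be sketched with the simplest of the three base estimators, the histogram: refit it on bootstrap resamples and average the bin heights. This is a minimal numpy illustration of the idea, with our own choices of bin count and resample number, not the paper's setup.

```python
import numpy as np

def bagged_histogram(data, bins, n_boot=50, seed=0):
    """Average histogram density estimators over bootstrap resamples.
    Returns (density function, bin edges, averaged bin heights)."""
    rng = np.random.default_rng(seed)
    edges = np.histogram_bin_edges(data, bins=bins)
    heights = np.zeros(len(edges) - 1)
    for _ in range(n_boot):
        resample = rng.choice(data, size=len(data), replace=True)
        h, _ = np.histogram(resample, bins=edges, density=True)
        heights += h
    heights /= n_boot

    def density(x):
        i = np.clip(np.searchsorted(edges, x, side="right") - 1,
                    0, len(heights) - 1)
        return heights[i]

    return density, edges, heights

rng = np.random.default_rng(4)
data = rng.normal(size=300)
density, edges, heights = bagged_histogram(data, bins=20)
```

Since each bootstrapped histogram integrates to one over the common bin grid, so does their average; the pointwise spread of the bootstrapped heights is what the confidence-interval construction exploits.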

AIQ: Measuring Intelligence of Business AI Software

Focusing on Business AI, this article introduces the AIQ quadrant, which enables us to measure AI for business applications in a relative, comparative manner, i.e., to judge that software A has more or less intelligence than software B. Recognizing that the goal of business software is to maximize value in terms of business results, the dimensions of the quadrant are the key factors that determine the business value of AI software: Level of Output Quality (Smartness) and Level of Automation. The use of the quadrant is illustrated by several software solutions that support the real-life business challenge of field service scheduling. The roles of machine learning and conversational digital assistants in increasing business value are also discussed and illustrated with a recent integration of existing intelligent digital assistants for factory-floor decision making with the new version of Google Glass. Such hands-free AI solutions elevate the AIQ level to its ultimate position.

TwoWingOS: A Two-Wing Optimization Strategy for Evidential Claim Verification

Greedy Algorithms for Approximating the Diameter of Machine Learning Datasets in Multidimensional Euclidean Space

Finding the diameter of a dataset in multidimensional Euclidean space is a well-established problem with well-known algorithms. However, most of the algorithms found in the literature do not scale well with high data dimension: their time complexity grows exponentially in most cases, which makes them impractical. We therefore implemented four simple greedy algorithms for approximating the diameter of a multidimensional dataset, based respectively on minimum/maximum l2 norms, hill-climbing search, Tabu search, and Beam search. The implemented algorithms run in near-linear time, scaling near-linearly with both the data size and its dimension. Experiments conducted on different machine learning datasets demonstrate the efficiency of the implemented algorithms, which can therefore be recommended for diameter computation in machine learning applications.
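A minimal sketch of the hill-climbing variant (our own rendering of the idea, not the authors' code): from a start point, repeatedly jump to the farthest point until the farthest-pair distance stops improving. Each sweep costs O(nd), so the whole procedure stays near-linear, and the result is always a lower bound on the true diameter.

```python
import numpy as np

def diameter_hill_climbing(X, restarts=3, seed=0):
    """Approximate the diameter of the point set X (n x d) by repeated
    farthest-point sweeps from random start points."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    best = 0.0
    for _ in range(restarts):
        a = rng.integers(n)
        dist_prev = -1.0
        while True:
            d = np.linalg.norm(X - X[a], axis=1)   # one O(n*d) sweep
            b = int(np.argmax(d))
            if d[b] <= dist_prev:                  # no improvement: stop
                break
            dist_prev, a = d[b], b                 # climb to the far point
        best = max(best, dist_prev)
    return best
```

On easy geometries (points on a line, corners of a square) a single sweep already reaches the exact diameter; restarts guard against bad start points in higher dimensions.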

Dropout is a special case of the stochastic delta rule: faster and more accurate deep learning

How Complex is your classification problem? A survey on measuring classification complexity

Extracting characteristics from the training datasets of classification problems has proven effective in a number of meta-analyses. Among them, measures of classification complexity can estimate the difficulty in separating the data points into their expected classes. Descriptors of the spatial distribution of the data and estimates of the shape and size of the decision boundary are among the existing measures for this characterization. This information can support the formulation of new data-driven pre-processing and pattern recognition techniques, which can in turn be focused on challenging characteristics of the problems. This paper surveys and analyzes measures which can be extracted from the training datasets in order to characterize the complexity of the respective classification problems. Their use in the recent literature is also reviewed and discussed, allowing us to identify opportunities for future work in the area. Finally, we describe an R package named Extended Complexity Library (ECoL), which implements a set of complexity measures and is publicly available.
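One of the simplest measures in this family is Fisher's maximum discriminant ratio (often called F1): for each feature, compare the squared distance between class means to the sum of class variances, and take the best feature. A small numpy sketch, using our own synthetic data to contrast an easy problem with a hard one (note that some libraries report the inverted form, where higher means more complex):

```python
import numpy as np

def fisher_ratio(X, y):
    """Maximum Fisher's discriminant ratio for a two-class problem.
    High values mean at least one feature separates the classes well,
    i.e. a LESS complex classification problem."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0)
    return float(np.max(num / den))

# Easy problem: class means are five standard deviations apart.
rng = np.random.default_rng(5)
X_easy = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(5, 1, (50, 3))])
y_easy = np.array([0] * 50 + [1] * 50)
# Hard problem: the features carry no class signal at all.
X_hard = rng.normal(0, 1, (100, 3))
y_hard = y_easy
```

Measures like this are cheap to compute from the training set alone, which is what makes them usable as meta-features.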

Ensemble Kalman Inversion: A Derivative-Free Technique For Machine Learning Tasks

The standard probabilistic perspective on machine learning gives rise to empirical risk-minimization tasks that are frequently solved by stochastic gradient descent (SGD) and variants thereof. We present a formulation of these tasks as classical inverse or filtering problems and, furthermore, we propose an efficient, gradient-free algorithm for finding a solution to these problems using ensemble Kalman inversion (EKI). Applications of our approach include offline and online supervised learning with deep neural networks, as well as graph-based semi-supervised learning. The essence of the EKI procedure is an ensemble based approximate gradient descent in which derivatives are replaced by differences from within the ensemble. We suggest several modifications to the basic method, derived from empirically successful heuristics developed in the context of SGD. Numerical results demonstrate wide applicability and robustness of the proposed algorithm.
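The core EKI update replaces derivatives with ensemble covariances between parameters and forward-model outputs. The sketch below shows one such update on a hypothetical linear inverse problem; the forward map `A`, ensemble size, regularization `gamma`, and the deterministic (unperturbed-observation) variant are our illustrative choices, not the paper's exact algorithm.

```python
import numpy as np

def eki_step(thetas, G, y, gamma):
    """One ensemble Kalman inversion update: a gradient-free step in
    which derivatives are replaced by differences within the ensemble."""
    gs = np.array([G(t) for t in thetas])    # (J, m) forward-model outputs
    t_mean, g_mean = thetas.mean(axis=0), gs.mean(axis=0)
    dt, dg = thetas - t_mean, gs - g_mean
    J = len(thetas)
    C_tg = dt.T @ dg / J                     # parameter-output covariance
    C_gg = dg.T @ dg / J                     # output covariance
    K = C_tg @ np.linalg.inv(C_gg + gamma * np.eye(len(y)))
    return thetas + (y - gs) @ K.T           # move every ensemble member

# Hypothetical linear inverse problem: recover theta_star from y = A theta_star.
rng = np.random.default_rng(3)
A = rng.standard_normal((6, 2))
theta_star = np.array([1.5, -0.5])
y = A @ theta_star
thetas = rng.standard_normal((20, 2))        # ensemble of 20 initial guesses
for _ in range(30):
    thetas = eki_step(thetas, lambda t: A @ t, y, gamma=1e-4)
```

Note that no gradient of the forward map is ever evaluated: the same loop works unchanged when `G` is a deep network whose derivatives are unavailable or unreliable.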

• ‘Quantum Equilibrium-Disequilibrium’: Asset Price Dynamics, Symmetry Breaking, and Defaults as Dissipative Instantons
• Effective Caching for the Secure Content Distribution in Information-Centric Networking
• Self-Adaptive Systems in Organic Computing: Strategies for Self-Improvement
• A Hybrid Recommender System for Patient-Doctor Matchmaking in Primary Care
• Evolutionary optimisation of neural network models for fish collective behaviours in mixed groups of robots and zebrafish
• Random forest prediction of Alzheimer’s disease using pairwise selection from time series data
• VerIDeep: Verifying Integrity of Deep Neural Networks through Sensitive-Sample Fingerprinting
• Temporal starvation in multi-channel CSMA networks: an analytical framework
• Who Falls for Online Political Manipulation?
• The frog model on trees with drift
• Code-Mixed Sentiment Analysis Using Machine Learning and Neural Network Approaches
• $α$-Approximation Density-based Clustering of Multi-valued Objects
• On-Chip Optical Convolutional Neural Networks
• The Elephant in the Room
• Longest Increasing Subsequence under Persistent Comparison Errors
• The Effectiveness of Multitask Learning for Phenotyping with Electronic Health Records Data
• DeepMag: Source Specific Motion Magnification Using Gradient Ascent
• Transport coefficients in multi-component fluids from equilibrium molecular dynamics
• On Physical Layer Security over Fox’s $H$-Function Wiretap Fading Channels
• Deep Learning for Single Image Super-Resolution: A Brief Review
• Uncovering the Spread of Chagas Disease in Argentina and Mexico
• Efficient human-like semantic representations via the Information Bottleneck principle
• Sequence-Based OOK for Orthogonal Multiplexing of Wake-up Radio Signals and OFDM Waveforms
• Error Forward-Propagation: Reusing Feedforward Connections to Propagate Errors in Deep Learning
• Collective irregular dynamics in balanced networks of leaky integrate-and-fire neurons
• A Panel Quantile Approach to Attrition Bias in Big Data: Evidence from a Randomized Experiment
• A note on partial rejection sampling for the hard disks model in the plane
• Blue Phase: Optimal Network Traffic Control for Legacy and Autonomous Vehicles
• A survey of data transfer and storage techniques in prevalent cryptocurrencies and suggested improvements
• Code-division multiplexed resistive pulse sensor networks for spatio-temporal detection of particles in microfluidic devices
• Classes of graphs with e-positive chromatic symmetric function
• Distinctiveness, complexity, and repeatability of online signature templates
• End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning
• A note on the critical barrier for the survival of $α$-stable branching random walk with absorption
• On the Convergence of AdaGrad with Momentum for Training Deep Neural Networks
• Linear time algorithm to check the singularity of block graphs
• Efficient Measurement on Programmable Switches Using Probabilistic Recirculation
• DeepWrinkles: Accurate and Realistic Clothing Modeling
• (De)localization of Fermions in Correlated-Spin Background
• Learning and Inference on Generative Adversarial Quantum Circuits
• WonDerM: Skin Lesion Classification with Fine-tuned Neural Networks
• Power Minimization Based Joint Task Scheduling and Resource Allocation in Downlink C-RAN
• Stochastic $R_0$ Tensors to Stochastic Tensor Complementarity Problems
• Hybrid approach for transliteration of Algerian arabizi: a primary study
• Spin systems on Bethe lattices
• On the optimal designs for the prediction of complex Ornstein-Uhlenbeck processes
• The Moment-SOS hierarchy
• Stability for Intersecting Families of Perfect Matchings
• Weakly-Supervised Attention and Relation Learning for Facial Action Unit Detection
• The Evolution of Sex Chromosomes through the Baldwin Effect
• Cross-location wind speed forecasting for wind energy applications using machine learning based models
• Deep Learning Based Speed Estimation for Constraining Strapdown Inertial Navigation on Smartphones
• Pulse-laser Based Long-range Non-line-of-sight Ultraviolet Communication with Pulse Response Position Estimation
• Construction of cospectral graphs
• On the Complexity of Solving Subtraction Games
• The Power of Cut-Based Parameters for Computing Edge Disjoint Paths
• Extremal process of the zero-average Gaussian Free Field for $d\ge 3$
• Model Approximation Using Cascade of Tree Decompositions
• ChipNet: Real-Time LiDAR Processing for Drivable Region Segmentation on an FPGA
• Making effective use of healthcare data using data-to-text technology
• Band selection with Higher Order Multivariate Cumulants for small target detection in hyperspectral images
• Optimizing error of high-dimensional statistical queries under differential privacy
• On testing for high-dimensional white noise
• Evaluation of the Spatial Consistency Feature in the 3GPP GSCM Channel Model
• Atmospheric turbulence mitigation for sequences with moving objects using recursive image fusion
• Dynamic all scores matrices for LCS score
• Ektelo: A Framework for Defining Differentially-Private Computations
• Existence of symmetric maximal noncrossing collections of $k$-element sets
• Finding a Small Number of Colourful Components
• Choosing the optimal multi-point iterative method for the Colebrook flow friction equation — Numerical validation
• Densely Connected Convolutional Networks for Speech Recognition
• Physics-based Learned Design: Optimized Coded-Illumination for Quantitative Phase Imaging
• Enumerating Anchored Permutations with Bounded Gaps
• Weakly- and Semi-Supervised Panoptic Segmentation
• Johnson type bounds for mixed dimension subspace codes
• Shape differentiability of Lagrangians and application to Stokes problem
• One-dimensional quasicrystals with power-law hopping
• On the moments of the (2+1)-dimensional directed polymer and stochastic heat equation in the critical window
• A simplified convolutional sparse filter for impulsive signature enhancement and its application to the prognostic of rotating machinery
• New global optimality conditions for nonsmooth DC optimization problems
• On the structure of the set of active sets in constrained linear quadratic regulation
• Overlapping community detection using superior seed set selection in social networks
• Rigidity of proper colorings of $\mathbb{Z}^d$
• Using Randomness to Improve Robustness of Machine-Learning Models Against Evasion Attacks
• Disease Progression Timeline Estimation for Alzheimer’s Disease using Discriminative Event Based Modeling
• Weakly supervised learning of indoor geometry by dual warping
• An Iterative Path-Breaking Approach with Mutation and Restart Strategies for the MAX-SAT Problem
• Reliability of Relay Networks under Random Linear Network Coding
• Automorphism groups of Steiner triple systems
• Improved Bit-Flipping Algorithm for Successive Cancellation Decoding of Polar Codes
• A Novel Method for Epileptic Seizure Detection Using Coupled Hidden Markov Models
• A New Algorithm for the Robust Semi-random Independent Set Problem
