COLA: Communication-Efficient Decentralized Linear Learning
Decentralized machine learning is a promising emerging paradigm in view of global challenges of data ownership and privacy. We consider learning of linear classification and regression models, in the setting where the training data is decentralized over many user devices, and the learning algorithm must run on-device, on an arbitrary communication network, without a central coordinator. We propose COLA, a new decentralized training algorithm with strong theoretical guarantees and superior practical performance. Our framework overcomes many limitations of existing methods, and achieves communication efficiency, scalability, elasticity as well as resilience to changes in data and participating devices.
VizML: A Machine Learning Approach to Visualization Recommendation
Data visualization should be accessible for all analysts with data, not just the few with technical expertise. Visualization recommender systems aim to lower the barrier to exploring basic visualizations by automatically generating results for analysts to search and select, rather than manually specify. Here, we demonstrate a novel machine learning-based approach to visualization recommendation that learns visualization design choices from a large corpus of datasets and associated visualizations. First, we identify five key design choices made by analysts while creating visualizations, such as selecting a visualization type and choosing to encode a column along the X- or Y-axis. We train models to predict these design choices using one million dataset-visualization pairs collected from a popular online visualization platform. Neural networks predict these design choices with high accuracy compared to baseline models. We report and interpret feature importances from one of these baseline models. To evaluate the generalizability and uncertainty of our approach, we benchmark with a crowdsourced test set, and show that the performance of our model is comparable to human performance when predicting consensus visualization type, and exceeds that of other ML-based systems.
A Scalable Data Science Platform for Healthcare and Precision Medicine Research
Objective: To (1) demonstrate the implementation of a data science platform built on open-source technology within a large, academic healthcare system and (2) describe two computational healthcare applications built on such a platform. Materials and Methods: A data science platform based on several open-source technologies was deployed to support real-time, big data workloads. Data acquisition workflows for Apache Storm and NiFi were developed in Java and Python to capture patient monitoring and laboratory data for downstream analytics. Results: Emerging data management approaches, along with open-source technologies such as Hadoop, can be used to create integrated data lakes that store large, real-time data sets. This infrastructure also provides a robust analytics platform where healthcare and biomedical research data can be analyzed in near real-time for precision medicine and computational healthcare use cases. Discussion: The implementation and use of integrated data science platforms offer organizations the opportunity to combine traditional data sets, including data from the electronic health record, with emerging big data sources, such as continuous patient monitoring and real-time laboratory results. These platforms can enable cost-effective and scalable analytics for the information that will be key to the delivery of precision medicine initiatives. Conclusion: Organizations that can take advantage of the technical advances found in data science platforms will have the opportunity to provide comprehensive access to healthcare data for computational healthcare and precision medicine research.
Latent Agents in Networks: Estimation and Pricing
We focus on a setting where agents in a social network consume a product that exhibits positive local network externalities. A seller has access to data on past consumption decisions/prices for a subset of observable agents, and can target these agents with appropriate discounts to exploit network effects and increase her revenues. A novel feature of the model is that the observable agents potentially interact with additional latent agents. These latent agents can purchase the same product from a different channel, and are not observed by the seller. Observable agents influence each other both directly and indirectly through the influence they exert on the latent part. The seller knows the connection structure of neither the observable nor the latent part of the network. We investigate how the seller can use the available data to estimate the matrix that captures the dependence of observable agents’ consumption decisions on the prices offered to them. We provide an algorithm for estimating this matrix under an approximate sparsity condition, and obtain convergence rates for the proposed estimator despite the high-dimensionality that allows more agents than observations. Importantly, we then show that this approximate sparsity condition holds under standard conditions present in the literature and hence our algorithms are applicable to a large class of networks. We establish that by using the estimated matrix the seller can construct prices that lead to a small revenue loss relative to revenue-maximizing prices under complete information, and the optimality gap vanishes relative to the size of the network. We also illustrate that the presence of latent agents leads to significant changes in the structure of the revenue-maximizing prices.
Multiply Robust Causal Inference With Double Negative Control Adjustment for Unmeasured Confounding
Unmeasured confounding is a threat to causal inference in observational studies. In recent years, use of negative controls to address unmeasured confounding has gained increasing recognition and popularity. Negative controls have a longstanding tradition in laboratory sciences and epidemiology to rule out non-causal explanations, although they have been used primarily for bias detection. Recently, Miao et al. (2017) have described sufficient conditions under which a pair of negative control exposure-outcome variables can be used to nonparametrically identify average treatment effect from observational data subject to uncontrolled confounding. In this paper, building on their results, we provide a general semiparametric framework for obtaining inferences about the average treatment effect with double negative control adjustment for unmeasured confounding, while accounting for a large number of observed confounding variables. In particular, we derive the semiparametric efficiency bound under a nonparametric model for the observed data distribution, and we propose multiply robust locally efficient estimators when nonparametric estimation may not be feasible. We assess the finite sample performance of our methods under potential model misspecification in extensive simulation studies. We illustrate our methods with an application to the evaluation of the effect of higher education on wage among married working women.
A Confounding Bridge Approach for Double Negative Control Inference on Causal Effects
Unmeasured confounding is a key challenge for causal inference. Negative control variables are widely available in observational studies. A negative control outcome is associated with the confounder but not causally affected by the exposure in view, and a negative control exposure is correlated with the primary exposure or the confounder but does not causally affect the outcome of interest. In this paper, we establish a framework to use them for unmeasured confounding adjustment. We introduce a confounding bridge function that links the potential outcome mean and the negative control outcome distribution, and we incorporate a negative control exposure to identify the bridge function and the average causal effect. Our approach can be used to repair an invalid instrumental variable in case it is correlated with the unmeasured confounder. We also extend our approach by allowing for a causal association between the primary exposure and the control outcome. We illustrate our approach with simulations and apply it to a study about the short-term effect of air pollution. Although a standard analysis shows a significant acute effect of PM2.5 on mortality, our analysis indicates that this effect may be confounded, and after double negative control adjustment, the effect is attenuated toward zero.
Automatic Derivation Of Formulas Using Reinforcement Learning
This paper presents an artificial intelligence algorithm, called the automatic derivation machine, that can be used to derive formulas from various scientific disciplines. First, a formula is expressed abstractly as a multiway tree, and each step of a formula derivation is abstracted as a mapping between multiway trees. Similar derivation steps can be expressed as a reusable formula template by a multiway tree map. Next, the formula multiway tree is eigen-encoded into feature vectors that construct the feature space of formulas; a Q-learning model operating in this feature space can learn to perform derivations from training data built from derivation processes. Finally, an automatic formula derivation machine is built that chooses the next derivation step based on the current state and objective. We also work through an example from nuclear reactor physics to show how the automatic derivation machine works.
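The Q-learning step mentioned in this abstract can be illustrated with a minimal tabular sketch. Everything here is an assumption made for illustration and is not taken from the paper: the integer states stand in for encoded formulas, the two actions stand in for derivation templates, and the reward and hyperparameters are hypothetical. The update rule itself is the standard Q-learning rule, applied by replaying a fixed set of transitions.

```python
GOAL = 4            # hypothetical index of the fully derived formula
ACTIONS = (0, 1)    # 0: apply a useful template, 1: apply a no-op template

def transition(state, action):
    """Toy environment: return (next_state, reward)."""
    next_state = state + 1 if action == 0 else state
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def q_learning(sweeps=200, alpha=0.5, gamma=0.9):
    """Standard Q-learning update, replayed over every (state, action) pair."""
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(GOAL):           # the goal is terminal and never updated
            for a in ACTIONS:
                s2, r = transition(s, a)
                best_next = max(q[(s2, b)] for b in ACTIONS)
                q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q

def greedy_action(q, state):
    """Pick the derivation step with the highest learned value."""
    return max(ACTIONS, key=lambda a: q[(state, a)])

q = q_learning()
policy = [greedy_action(q, s) for s in range(GOAL)]
print(policy)  # prints [0, 0, 0, 0]
```

In this toy setting the greedy policy always picks the template that advances the derivation, because the no-op template only delays the discounted reward.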
Collapse of Deep and Narrow Neural Nets
Recent theoretical work has demonstrated that deep neural networks have superior performance over shallow networks, but their training is more difficult, e.g., they suffer from the vanishing gradient problem. This problem can be typically resolved by the rectified linear unit (ReLU) activation. However, here we show that even for such activation, deep and narrow neural networks will converge to erroneous mean or median states of the target function depending on the loss with high probability. We demonstrate this collapse of deep and narrow neural networks both numerically and theoretically, and provide estimates of the probability of collapse. We also construct a diagram of a safe region of designing neural networks that avoid the collapse to erroneous states. Finally, we examine different ways of initialization and normalization that may avoid the collapse problem.
Neural Collaborative Ranking
Recommender systems aim to generate a personalized ranked list of items that an end user might be interested in. With the unprecedented success of deep learning in computer vision and speech recognition, bridging the gap between recommender systems and deep neural networks has recently become a hot topic, and deep learning methods have been shown to achieve state-of-the-art results on many recommendation tasks. For example, a recent model, NeuMF, first projects users and items into a shared low-dimensional latent feature space, and then employs neural nets to model the interaction between the user and item latent features to obtain state-of-the-art performance on recommendation tasks. NeuMF assumes that the non-interacted items are inherently negative and uses negative sampling to relax this assumption. In this paper, we examine an alternative approach which does not assume that the non-interacted items are necessarily negative, just that they are less preferred than interacted items. Specifically, we develop a new classification strategy based on the widely used pairwise ranking assumption. We combine our classification strategy with the recently proposed neural collaborative filtering framework, and propose a general collaborative ranking framework called Neural Network based Collaborative Ranking (NCR). We resort to a neural network architecture to model a user’s pairwise preference between items, with the belief that a neural network will effectively capture the latent structure of latent factors. The experimental results on two real-world datasets show the superior performance of our models in comparison with several state-of-the-art approaches.
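The pairwise ranking assumption can be made concrete with a small sketch. This shows one common formulation, a logistic loss on the score difference between an interacted and a non-interacted item, and is not necessarily the loss used in NCR. The function name and the scalar scores are hypothetical stand-ins for the outputs of a scoring model.

```python
import math

def pairwise_logistic_loss(score_pos, score_neg):
    """Return -log(sigmoid(score_pos - score_neg)), computed stably."""
    diff = score_pos - score_neg
    # log(1 + exp(-diff)), rearranged to avoid overflow when diff is very negative
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

# The loss is small when the interacted item is scored higher than the
# non-interacted one, and large when the order is reversed.
good_order = pairwise_logistic_loss(2.0, 0.0)
bad_order = pairwise_logistic_loss(0.0, 2.0)
print(good_order < bad_order)  # prints True
```

Note that the loss only asks for the interacted item to be ranked above the other one. It never labels the non-interacted item as negative, which matches the assumption described in the abstract.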
A framework for automatic question generation from text using deep reinforcement learning
Automatic question generation (QG) is a useful yet challenging task in NLP. Recent neural network-based approaches represent the state-of-the-art in this task, but they are not without shortcomings. Firstly, these models lack the ability to handle rare words and the word repetition problem. Moreover, all previous works optimize the cross-entropy loss, which can induce inconsistencies between training (objective) and testing (evaluation measure). In this paper, we present a novel deep reinforcement learning based framework for automatic question generation. The generator of the framework is a sequence-to-sequence model, enhanced with the copy mechanism to handle the rare-words problem and the coverage mechanism to solve the word repetition problem. The evaluator model of the framework evaluates and assigns a reward to each predicted question. The overall model is trained by learning the parameters of the generator network which maximizes the reward. Our framework allows us to directly optimize any task-specific score including evaluation measures such as BLEU, GLEU, ROUGE-L, etc., suitable for sequence-to-sequence tasks such as QG. Our comprehensive evaluation shows that our approach significantly outperforms state-of-the-art systems on the widely-used SQuAD benchmark in both automatic and human evaluation.
Recent Advances in Physical Reservoir Computing: A Review
Reservoir computing is a computational framework suited for temporal/sequential data processing. It is derived from several recurrent neural network models, including echo state networks and liquid state machines. A reservoir computing system consists of a reservoir for mapping inputs into a high-dimensional space and a readout for extracting features of the inputs. Further, training is carried out only in the readout. Thus, the major advantage of reservoir computing is fast and simple learning compared to other recurrent neural networks. Another advantage is that the reservoir can be realized using physical systems, substrates, and devices, instead of recurrent neural networks. In fact, such physical reservoir computing has attracted increasing attention in various fields of research. The purpose of this review is to provide an overview of recent advances in physical reservoir computing by classifying them according to the type of the reservoir. We discuss the current issues and perspectives related to physical reservoir computing, in order to further expand its practical applications and develop next-generation machine learning systems.
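The division of labor described above, a fixed reservoir plus a trained readout, can be sketched with a tiny software echo state network. This is an illustration under stated assumptions and not a method from the review: the reservoir size, the weight scaling, the sine-prediction task, and the normalized LMS rule used to fit the readout are all hypothetical choices (ridge regression is a more common choice for the readout).

```python
import math
import random

def make_reservoir(n_units, seed=0, scale=0.9):
    """Build fixed random input and recurrent weights (never trained)."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
    # Dividing by n_units keeps each row's absolute sum at most `scale` < 1,
    # a simple (assumed) way to keep the recurrent dynamics stable.
    w_rec = [[rng.uniform(-1.0, 1.0) * scale / n_units for _ in range(n_units)]
             for _ in range(n_units)]
    return w_in, w_rec

def step(state, u, w_in, w_rec):
    """One reservoir update: x(t+1) = tanh(W_rec x(t) + w_in u(t))."""
    return [math.tanh(sum(w * s for w, s in zip(row, state)) + wi * u)
            for row, wi in zip(w_rec, w_in)]

def train_readout(inputs, targets, w_in, w_rec, epochs=20, mu=0.5):
    """Fit only the linear readout with normalized LMS updates."""
    n = len(w_in)
    w_out = [0.0] * n
    for _ in range(epochs):
        state = [0.0] * n
        for u, y in zip(inputs, targets):
            state = step(state, u, w_in, w_rec)
            err = y - sum(w * s for w, s in zip(w_out, state))
            norm = sum(s * s for s in state) + 1e-8
            w_out = [w + mu * err * s / norm for w, s in zip(w_out, state)]
    return w_out

def predict(inputs, w_in, w_rec, w_out):
    """Run the fixed reservoir and apply the trained readout."""
    state = [0.0] * len(w_in)
    outputs = []
    for u in inputs:
        state = step(state, u, w_in, w_rec)
        outputs.append(sum(w * s for w, s in zip(w_out, state)))
    return outputs

# Toy task: predict the next sample of a sine wave.
signal = [math.sin(0.2 * t) for t in range(301)]
inputs, targets = signal[:-1], signal[1:]
w_in, w_rec = make_reservoir(20)
w_out = train_readout(inputs, targets, w_in, w_rec)
preds = predict(inputs, w_in, w_rec, w_out)
mse = sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)
print(f"training MSE with a trained readout: {mse:.4f}")
```

Only `w_out` changes during training. The functions `make_reservoir` and `step` could in principle be swapped for measurements of a physical system, which is the idea behind physical reservoir computing.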
Shedding Light on Black Box Machine Learning Algorithms: Development of an Axiomatic Framework to Assess the Quality of Methods that Explain Individual Predictions
From self-driving vehicles and back-flipping robots to virtual assistants who book our next appointment at the hair salon or at that restaurant for dinner – machine learning systems are becoming increasingly ubiquitous. The main reason for this is that these methods boast remarkable predictive capabilities. However, most of these models remain black boxes, meaning that it is very challenging for humans to follow and understand their intricate inner workings. Consequently, interpretability has suffered under this ever-increasing complexity of machine learning models. Especially with regards to new regulations, such as the General Data Protection Regulation (GDPR), the necessity for plausibility and verifiability of predictions made by these black boxes is indispensable. Driven by the needs of industry and practice, the research community has recognised this interpretability problem and focussed on developing a growing number of so-called explanation methods over the past few years. These methods explain individual predictions made by black box machine learning models and help to recover some of the lost interpretability. With the proliferation of these explanation methods, it is, however, often unclear, which explanation method offers a higher explanation quality, or is generally better-suited for the situation at hand. In this thesis, we thus propose an axiomatic framework, which allows comparing the quality of different explanation methods amongst each other. Through experimental validation, we find that the developed framework is useful to assess the explanation quality of different explanation methods and reach conclusions that are consistent with independent research.
The Steady-State Behavior of Multivariate Exponentially Weighted Moving Average Control Charts
Multivariate Exponentially Weighted Moving Average (MEWMA) charts are popular, handy and effective procedures to detect distributional changes in a stream of multivariate data. For an appropriate performance analysis, dealing with the steady-state behavior of the MEWMA statistic is essential. Going beyond early papers, we derive quite accurate approximations of the respective steady-state densities of the MEWMA statistic. It turns out that these densities can be rewritten as the product of two functions, each depending on one argument only, which allows feasible calculation. To prove the related statements, we apply the representation of the non-central chi-square density in terms of the confluent hypergeometric limit function. Using the new methods, we find that for large dimensions the steady-state behavior differs from what one might expect from the univariate monitoring field. Based on the integral-equation-driven methods, steady-state and worst-case average run lengths are calculated with higher accuracy than before. Finally, optimal MEWMA smoothing constants are derived for all considered measures.
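For readers unfamiliar with the chart, the MEWMA statistic itself can be sketched in a few lines. This follows the standard textbook recursion Z_t = λ X_t + (1 − λ) Z_{t−1} with Z_0 = 0, and it assumes, purely for simplicity, an identity in-control covariance and the asymptotic covariance factor λ / (2 − λ). The function name, the data, and the value of λ are hypothetical, and the steady-state analysis of the paper is not reproduced here.

```python
def mewma_statistics(observations, lam=0.1):
    """Return the T^2 sequence for observations with (assumed) identity covariance."""
    dim = len(observations[0])
    z = [0.0] * dim
    scale = (2.0 - lam) / lam   # inverse of the asymptotic factor lam / (2 - lam)
    stats = []
    for x in observations:
        # Exponentially weighted moving average of the observation vectors
        z = [lam * xi + (1.0 - lam) * zi for xi, zi in zip(x, z)]
        stats.append(scale * sum(zi * zi for zi in z))
    return stats

# Five in-control observations followed by a sustained mean shift.
in_control = [[0.0, 0.0]] * 5
shifted = [[1.0, 1.0]] * 30
t2 = mewma_statistics(in_control + shifted, lam=0.1)
print(t2[4], t2[-1] > t2[5])  # prints 0.0 True
```

A chart would signal once the statistic exceeds a control limit, which is not computed in this sketch. Smaller values of λ give more weight to past observations and react more slowly to a shift.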
Multi-feature Fusion for Image Retrieval Using Constrained Dominant Sets
Aggregating different image features for image retrieval has recently shown its effectiveness. Although such aggregation is highly effective, how to boost the impact of the best features for a specific query image remains an open computer vision problem. In this paper, we propose a computationally efficient approach to fuse several hand-crafted and deep features, based on the probabilistic distribution of a given membership score of a constrained cluster in an unsupervised manner. First, we introduce an incremental nearest neighbor (NN) selection method, whereby we dynamically select k-NN to the query. We then build several graphs from the obtained NN sets and employ constrained dominant sets (CDS) on each graph G to assign edge weights which consider the intrinsic manifold structure of the graph, and detect false matches to the query. Finally, we elaborate the computation of feature positive-impact weight (PIW) based on the dispersive degree of the characteristic vector. To this end, we exploit the entropy of a cluster membership-score distribution. In addition, the final NN set bypasses a heuristic voting scheme. Experiments on several retrieval benchmark datasets show that our method can improve the state-of-the-art result.
Joint & Progressive Learning from High-Dimensional Data for Multi-Label Classification
Despite the fact that nonlinear subspace learning techniques (e.g. manifold learning) have been successfully applied to data representation, there is still room for improvement in explainability (explicit mapping), generalization (out-of-samples), and cost-effectiveness (linearization). To this end, a novel linearized subspace learning technique is developed in a joint and progressive way, called joint and progressive learning strategy (J-Play), with its application to multi-label classification. J-Play learns high-level and semantically meaningful feature representations from high-dimensional data by 1) jointly performing multiple subspace learning and classification to find a latent subspace where samples are expected to be better classified; 2) progressively learning multi-coupled projections to linearly approach the optimal mapping bridging the original space with the most discriminative subspace; 3) locally embedding manifold structure in each learnable latent subspace. Extensive experiments are performed to demonstrate the superiority and effectiveness of the proposed method in comparison with previous state-of-the-art methods.
LogCanvas: Visualizing Search History Using Knowledge Graphs
In this demo paper, we introduce LogCanvas, a platform for user search history visualisation. Different from the existing visualisation tools, LogCanvas focuses on helping users re-construct the semantic relationship among their search activities. LogCanvas segments a user’s search history into different sessions and generates a knowledge graph to represent the information exploration process in each session. A knowledge graph is composed of the most important concepts or entities discovered by each search query as well as their relationships. It thus captures the semantic relationship among the queries. LogCanvas offers a session timeline viewer and a snippets viewer to enable users to re-find their previous search results efficiently. LogCanvas also provides a collaborative perspective to support a group of users in sharing search results and experience.
A Blockchain Database Application Platform
A blockchain is a decentralised linked data structure that is characterised by its inherent resistance to data modification, but it is deficient in search queries, primarily due to its inferior data formatting. A distributed database is also a decentralised data structure, which features quick query processing and well-designed data formatting but suffers from poor data reliability. In this demonstration, we showcase a blockchain database application platform developed by integrating the blockchain with the database, i.e. we demonstrate a system that has the decentralised, distributed and auditability features of the blockchain and the quick query processing and well-designed data structure of distributed databases. The system features a tamper-resistant, consistent and cost-effective multi-active database and an effective and reliable data-level disaster recovery backup. The system is demonstrated in practice as a multi-active database along with the data-level disaster recovery backup feature.
• Non-Interfering Concurrent Exchange (NICE) Networks
• Analysis of luminosity measurements of the pre-white dwarf PG 1159-035
• On the Number of Rumer Diagrams
• Ghost imaging with the human eye
• Motifs, Coherent Configurations and Second Order Network Generation
• Database Operations in D4M.jl
• A Framework for Automated Cellular Network Tuning with Reinforcement Learning
• Age of Information Minimization for an Energy Harvesting Source with Updating Erasures: With and Without Feedback
• ACVAE-VC: Non-parallel many-to-many voice conversion with auxiliary classifier variational autoencoder
• Biological Optical-to-Chemical Signal Conversion Interface: A Small-scale Modulator for Molecular Communications
• Magnetic Nanoparticle Based Molecular Communication in Microfluidic Environments
• Audience-Retention-Rate-Aware Caching and Coded Video Delivery with Asynchronous Demands
• Discrete gradient descent differs qualitatively from gradient flow
• Aspirational pursuit of mates in online dating markets
• A Relax-and-Round Approach to Complex Lattice Basis Reduction
• URSA: A Neural Network for Unordered Point Clouds Using Constellations
• Two Local Models for Neural Constituent Parsing
• On spectral embedding performance and elucidating network structure in stochastic block model graphs
• GestureGAN for Hand Gesture-to-Gesture Translation in the Wild
• Sanov-type large deviations in Schatten classes
• An Experimental Study of Algorithms for Online Bipartite Matching
• Top-Down Tree Structured Text Generation
• Mitigating Sybils in Federated Learning Poisoning
• Rao-Blackwellizing Field Goal Percentage
• Generalization of Equilibrium Propagation to Vector Field Dynamics
• Multi-user Communication Networks: A Coordinated Multi-armed Bandit Approach
• Plato: Approximate Analytics over Compressed Time Series with Tight Deterministic Error Guarantees
• Infection Analysis on Irregular Networks through Graph Signal Processing
• A Precision Environment-Wide Association Study of Hypertension via Supervised Cadre Models
• Estimating the input of a Lévy-driven queue by Poisson sampling of the workload process
• Timed Network Games with Clocks
• Skill Rating for Generative Models
• Cycles in the burnt pancake graphs
• Embedding Grammars
• False Discovery Rate Controlled Heterogeneous Treatment Effect Detection for Online Controlled Experiments
• Multi-Sector and Multi-Panel Performance in 5G mmWave Cellular Networks
• Vendor-independent soft tissue lesion detection using weakly supervised and unsupervised adversarial domain adaptation
• Cross-Lingual Cross-Platform Rumor Verification Pivoting on Multimedia Content
• Cyclic Descents for General Skew Tableaux
• Counting primitive subsets and other statistics of the divisor graph of $\{1,2, \ldots n\}$
• Complexity of Shift Spaces on Semigroups
• How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks
• Deep EHR: Chronic Disease Prediction Using Medical Notes
• Holographic Visualisation of Radiology Data and Automated Machine Learning-based Medical Image Segmentation
• Data-driven discretization: a method for systematic coarse graining of partial differential equations
• A Brief Survey on Lattice Zonotopes
• Tempered fractional Brownian motion: wavelet estimation, modeling and testing
• A Unified Framework for Efficient Estimation of General Treatment Models
• Sparse Transmit Array Design for Dual-Function Radar Communications by Antenna Selection
• Dual-Function MIMO Radar Communications System Design Via Sparse Array Optimization
• The norm and the evaluation of the Macdonald polynomials in superspace
• Folksonomication: Predicting Tags for Movies from Plot Synopses Using Emotion Flow Encoded Neural Network
• Subtrees of a random tree
• Convolutional Neural Networks on 3D Surfaces Using Parallel Frames
• Rainbow matchings in properly-colored hypergraphs
• Physical Layer Security Enhancement for Satellite Communication among Similar Channels: Relay Selection and Power Allocation
• On Local Antimagic Vertex Coloring for Corona Products of Graphs
• Multiple Character Embeddings for Chinese Word Segmentation
• A Probabilistic Proof of the Perron-Frobenius Theorem
• A bilinear Bogolyubov-Ruzsa lemma with poly-logarithmic bounds
• Can GDP measurement be further improved? Data revision and reconciliation
• SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection
• A note on matchings in abelian groups
• Pairwise Relational Networks for Face Recognition
• Characterizations of Super-regularity and its Variants
• Large Deviations of the Exit Measure through a Characteristic Boundary for a Poisson driven SDE
• The Sketching Complexity of Graph and Hypergraph Counting
• Scene Coordinate Regression with Angle-Based Reprojection Loss for Camera Relocalization
• C-RAN with Hybrid RF/FSO Fronthaul Links: Joint Optimization of RF Time Allocation and Fronthaul Compression
• Statistical Piano Reduction Controlling Performance Difficulty
• Stationarity of entrance Markov chains and overshoots of random walks
• On Shadowing the $κ$-$μ$ Fading Model
• A Conversation with Jon Wellner
• Counting Minimal Transversals of $β$-Acyclic Hypergraphs
• A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval
• Deep RTS: A Game Environment for Deep Reinforcement Learning in Real-Time Strategy Games
• Structural transition in social networks: The role of homophily
• A note on strong-consistency of componentwise ARH(1) predictors
• Utilization of Water Supply Networks for Harvesting Renewable Energy
• libhclooc: Software Library Facilitating Out-of-core Implementations of Accelerator Kernels on Hybrid Computing Platforms
• On optimal transport of matrix-valued measures
• Towards the Kohayakawa–Kreuter conjecture on asymmetric Ramsey properties
• Ensemble of Convolutional Neural Networks for Dermoscopic Images Classification
• On ‘two important theorems’ in canonical duality theory
• Exploiting Deep Learning for Persian Sentiment Analysis
• SentiALG: Automated Corpus Annotation for Algerian Sentiment Analysis
• Hurwitz transitivity in elliptic Weyl groups and weighted projective lines
• Temporal Sequence Distillation: Towards Few-Frame Action Recognition in Videos
• On Optimizing VLC Networks for Downlink Multi-User Transmission: A Survey
• The action of a Coxeter element on an affine root system
• Inequalities for the overpartition function
• A Spectrum Sharing Solution for the Efficient Use of mmWave Bands in 5G Cellular Scenarios
• Energy-Efficient Multi-View Video Transmission with View Synthesis-Enabled Multicast
• 7-Connected Graphs are 4-Ordered
• Optimal allocation of subjects in a cluster randomized trial with fixed number of clusters when the ICCs or costs are heterogeneous over clusters
• Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures
• Deep Learning using K-space Based Data Augmentation for Automated Cardiac MR Motion Artefact Detection
• An Analysis of Asynchronous Stochastic Accelerated Coordinate Descent
• Asymptotic majorization of finite probability distributions
• Backtracking gradient descent method for general $C^1$ functions
• A Simple but Hard-to-Beat Baseline for Session-based Recommendations
• Generating Graphs with Symmetry
• Recycle-GAN: Unsupervised Video Retargeting
• Forbidden cycles in metrically homogeneous graphs
• Extremum Seeking Optimal Controls of Unknown Systems
• Control of an Architectural Cable Net Geometry
• Model-based clustering for random hypergraphs
• A Proximal Operator for Multispectral Phase Retrieval Problems
• Unified characterizations of minuscule Kac-Moody representations built from colored posets
• On the localization of the stochastic heat equation in strong disorder
• Building medical image classifiers with very limited data using segmentation networks
• Magnificent Four with Colors