A Schur Complement Cheeger Inequality
A Frequency Scaling based Performance Indicator Framework for Big Data Systems
It is important for big data systems to identify their performance bottleneck. However, the popular indicators such as resource utilizations, are often misleading and incomparable with each other. In this paper, a novel indicator framework which can directly compare the impact of different indicators with each other is proposed to identify and analyze the performance bottleneck efficiently. A methodology which can construct the indicator from the performance change with the CPU frequency scaling is described. Spark is used as an example of a big data system and two typical SQL benchmarks are used as the workloads to evaluate the proposed method. Experimental results show that the proposed method is accurate compared with the resource utilization method and easy to implement compared with some white-box method. Meanwhile, the analysis with our indicators lead to some interesting findings and valuable performance optimization suggestions for big data systems.
Cloud based Real-Time and Low Latency Scientific Event Analysis
Astronomy is well recognized as big data driven science. As the novel observation infrastructures are developed, the sky survey cycles have been shortened from a few days to a few seconds, causing data processing pressure to shift from offline to online. However, existing scientific databases focus on offline analysis of long-term historical data, not real-time and low latency analysis of large-scale newly arriving data. In this paper, a cloud based method is proposed to efficiently analyze scientific events on large-scale newly arriving data. The solution is implemented as a highly efficient system, namely Aserv. A set of compact data store and index structures are proposed to describe the proposed scientific events and a typical analysis pattern is formulized as a set of query operations. Domain aware filter, accuracy aware data partition, highly efficient index and frequently used statistical data designs are four key methods to optimize the performance of Aserv. Experimental results under the typical cloud environment show that the presented optimization mechanism can meet the low latency demand for both large data insertion and scientific event analysis. Aserv can insert 3.5 million rows of data within 3 seconds and perform the heaviest query on 6.7 billion rows of data also within 3 seconds. Furthermore, a performance model is given to help Aserv choose the right cloud resource setup to meet the guaranteed real-time performance requirement.
Beta Distribution Drift Detection for Adaptive Classifiers
With today’s abundant streams of data, the only constant we can rely on is change. For stream classification algorithms, it is necessary to adapt to concept drift. This can be achieved by monitoring the model error, and triggering counter measures as changes occur. In this paper, we propose a drift detection mechanism that fits a beta distribution to the model error, and treats abnormal behavior as drift. It works with any given model, leverages prior knowledge about this model, and allows to set application-specific confidence thresholds. Experiments confirm that it performs well, in particular when drift occurs abruptly.
Efficiently Charting RDF
We propose a visual query language for interactively exploring large-scale knowledge graphs. Starting from an overview, the user explores bar charts through three interactions: class expansion, property expansion, and subject/object expansion. A major challenge faced is performance: a state-of-the-art SPARQL engine may require tens of minutes to compute the multiway join, grouping and counting required to render a bar chart. A promising alternative is to apply approximation through online aggregation, trading precision for performance. However, state-of-the-art online aggregation algorithms such as Wander Join have two limitations for our exploration scenario: (1) a high number of rejected paths slows the convergence of the count estimations, and (2) no unbiased estimator exists for counts under the distinct operator. We thus devise a specialized algorithm for online aggregation that augments Wander Join with exact partial computations to reduce the number of rejected paths encountered, as well as a novel estimator that we prove to be unbiased in the case of the distinct operator. In an experimental study with random interactions exploring two large-scale knowledge graphs, our algorithm shows a clear reduction in error with respect to computation time versus Wander Join.
Dataset Distillation
Neural Non-Stationary Spectral Kernel
Standard kernels such as Mat\’ern or RBF kernels only encode simple monotonic dependencies within the input space. Spectral mixture kernels have been proposed as general-purpose, flexible kernels for learning and discovering more complicated patterns in the data. Spectral mixture kernels have recently been generalized into non-stationary kernels by replacing the mixture weights, frequency means and variances by input-dependent functions. These functions have also been modelled as Gaussian processes on their own. In this paper we propose modelling the hyperparameter functions with neural networks, and provide an experimental comparison between the stationary spectral mixture and the two non-stationary spectral mixtures. Scalable Gaussian process inference is implemented within the sparse variational framework for all the kernels considered. We show that the neural variant of the kernel is able to achieve the best performance, among alternatives, on several benchmark datasets.
A deconvolution method for reconstruction of data time series from intermittent systems
In this manuscript, we will investigate the deconvolution method for recovering pulse arrival times and amplitudes using synthetic data. For the deconvolution procedure to have hope of recovering amplitudes and arrivals, the average waiting time between events must be at least 10 times the time step.
An Analytical Approach to Improving Time Warping on Multidimensional Time Series
Generative Adversarial Network Training is a Continual Learning Problem
Generative Adversarial Networks (GANs) have proven to be a powerful framework for learning to draw samples from complex distributions. However, GANs are also notoriously difficult to train, with mode collapse and oscillations a common problem. We hypothesize that this is at least in part due to the evolution of the generator distribution and the catastrophic forgetting tendency of neural networks, which leads to the discriminator losing the ability to remember synthesized samples from previous instantiations of the generator. Recognizing this, our contributions are twofold. First, we show that GAN training makes for a more interesting and realistic benchmark for continual learning methods evaluation than some of the more canonical datasets. Second, we propose leveraging continual learning techniques to augment the discriminator, preserving its ability to recognize previous generator samples. We show that the resulting methods add only a light amount of computation, involve minimal changes to the model, and result in better overall performance on the examined image and text generation tasks.
Bayesian graph convolutional neural networks for semi-supervised classification
Recently, techniques for applying convolutional neural networks to graph-structured data have emerged. Graph convolutional neural networks (GCNNs) have been used to address node and graph classification and matrix completion. Although the performance has been impressive, the current implementations have limited capability to incorporate uncertainty in the graph structure. Almost all GCNNs process a graph as though it is a ground-truth depiction of the relationship between nodes, but often the graphs employed in applications are themselves derived from noisy data or modelling assumptions. Spurious edges may be included; other edges may be missing between nodes that have very strong relationships. In this paper we adopt a Bayesian approach, viewing the observed graph as a realization from a parametric family of random graphs. We then target inference of the joint posterior of the random graph parameters and the node (or graph) labels. We present the Bayesian GCNN framework and develop an iterative learning procedure for the case of assortative mixed-membership stochastic block models. We present the results of experiments that demonstrate that the Bayesian formulation can provide better performance when there are very few labels available during the training process.
LEASGD: an Efficient and Privacy-Preserving Decentralized Algorithm for Distributed Learning
Distributed learning systems have enabled training large-scale models over large amount of data in significantly shorter time. In this paper, we focus on decentralized distributed deep learning systems and aim to achieve differential privacy with good convergence rate and low communication cost. To achieve this goal, we propose a new learning algorithm LEASGD (Leader-Follower Elastic Averaging Stochastic Gradient Descent), which is driven by a novel Leader-Follower topology and a differential privacy model.We provide a theoretical analysis of the convergence rate and the trade-off between the performance and privacy in the private setting.The experimental results show that LEASGD outperforms state-of-the-art decentralized learning algorithm DPSGD by achieving steadily lower loss within the same iterations and by reducing the communication cost by 30%. In addition, LEASGD spends less differential privacy budget and has higher final accuracy result than DPSGD under private setting.
A Concept-Centered Hypertext Approach to Case-Based Retrieval
The goal of case-based retrieval is to assist physicians in the clinical decision making process, by finding relevant medical literature in large archives. We propose a research that aims at improving the effectiveness of case-based retrieval systems through the use of automatically created document-level semantic networks. The proposed research tackles different aspects of information systems and leverages the recent advancements in information extraction and relational learning to revisit and advance the core ideas of concept-centered hypertext models. We propose a two-step methodology that in the first step addresses the automatic creation of document-level semantic networks, then in the second step it designs methods that exploit such document representations to retrieve relevant cases from medical literature. For the automatic creation of documents’ semantic networks, we design a combination of information extraction techniques and relational learning models. Mining concepts and relations from text, information extraction techniques represent the core of the document-level semantic networks’ building process. On the other hand, relational learning models have the task of enriching the graph with additional connections that have not been detected by information extraction algorithms and strengthening the confidence score of extracted relations. For the retrieval of relevant medical literature, we investigate methods that are capable of comparing the documents’ semantic networks in terms of structure and semantics. The automatic extraction of semantic relations from documents, and their centrality in the creation of the documents’ semantic networks, represent our attempt to go one step further than previous graph-based approaches.
MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Distributed synchronous stochastic gradient descent has been widely used to train deep neural networks on computer clusters. With the increase of computational power, network communications have become one limiting factor on system scalability. In this paper, we observe that many deep neural networks have a large number of layers with only a small amount of data to be communicated. Based on the fact that merging some short communication tasks into a single one may reduce the overall communication time, we formulate an optimization problem to minimize the training iteration time. We develop an optimal solution named merged-gradient WFBP (MG-WFBP) and implement it in our open-source deep learning platform B-Caffe. Our experimental results on an 8-node GPU cluster with 10GbE interconnect and trace-based simulation results on a 64-node cluster both show that the MG-WFBP algorithm can achieve much better scaling efficiency than existing methods WFBP and SyncEASGD.
• A Frank-Wolfe Framework for Efficient and Effective Adversarial Attacks• Optimal Learning for Dynamic Coding in Deadline-Constrained Multi-Channel Networks• From Recognition to Cognition: Visual Commonsense Reasoning• sCAKE: Semantic Connectivity Aware Keyword Extraction• Adaptive Control By Regulation-Triggered Batch Least-Squares Estimation of Non-Observable Parameters• Part-level Car Parsing and Reconstruction from Single Street View• Maximizing Multivariate Information with Error-Correcting Codes• Robust Artificial Intelligence and Robust Human Organizations• CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation• Algae Detection Using Computer Vision and Deep Learning• Entropy and drift for word metrics on relatively hyperbolic groups• Data Management in Time-Domain Astronomy: Requirements and Challenges• Research on CRO’s Dilemma In Sapiens Chain: A Game Theory Method• Fair Division with a Secretive Agent• AstroServ: Distributed Database for Serving Large-Scale Full Life-Cycle Astronomical Data• Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects• Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters• Exploiting Numerical Sparsity for Efficient Learning : Faster Eigenvector Computation and Regression• Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware• Affinity Derivation and Graph Merge for Instance Segmentation• Automatic Image Stylization Using Deep Fully Convolutional Networks• An Optimal Space Lower Bound for Approximating MAX-CUT• The synthesis of data from instrumented structures and physics-based models via Gaussian processes• UnDEMoN 2.0: Improved Depth and Ego Motion Estimation through Deep Image Sampling• Optimal switching problems with an infinite set of modes: an approach by randomization and constrained backward SDEs• Chasing the Echo State Property• DSBI: Double-Sided Braille Image Dataset and Algorithm Evaluation for Braille Dots Detection• Are 2D-LSTM really dead for offline text recognition?• Making Agents’ Abilities Explicit• Kernel-based Multi-Task Contextual Bandits in Cellular Network Configuration• Efficient Image Retrieval via Decoupling Diffusion into Online and Offline Processing• Hermitian Laplacians and a Cheeger inequality for the Max-2-Lin problem• Beyond One Glance: Gated Recurrent Architecture for Hand Segmentation• Robust active attacks on social graphs• Non-overlapping Dyck matrices• DeepVisInterests: CNN-Ontology Prediction of Users Interests from Social Images• Large-scale analysis of user exposure to online advertising in Facebook• Single-Agent Policy Tree Search With Guarantees• On the martingale property in the rough Bergomi model• Near Gaussian Multiple Access Channel Capacity Detection and Decoding• Adaptive Edge Process Migration for IoT in Heterogeneous Cloud-Fog-Edge Computing Environment• Deep Geometric Prior for Surface Reconstruction• A Real-Time Remote IDS Testbed for Connected Vehicles• Deep Learned Frame Prediction for Video Compression• Robust Semi-Supervised Learning when Labels are Missing at Random• Predicting the Flu from Instagram• Subsampling (weighted smooth) empirical copula processes• A Bayesian model of acquisition and clearance of bacterial colonization• Lévy noise induced escape in the Morris-Lecar model• MagicVO: End-to-End Monocular Visual Odometry through Deep Bi-directional Recurrent Convolutional Neural Network• Extracting conditionally heteroscedastic components using ICA• One-Shot Item Search with Multimodal Data• The Fact Extraction and VERification (FEVER) Shared Task• Optimal Designs for Second-Order Interactions in Paired Comparison Experiments with Binary Attributes• Noise2Void – Learning Denoising from Single Noisy Images• Modelling Agents Endowed with Social Practices: Static Aspects• GarNet: A Two-stream Network for Fast and Accurate 3D Cloth Draping• Eliminating Exposure Bias and Loss-Evaluation Mismatch in Multiple Object Tracking• An explicit representation and enumeration for negacyclic codes of length $2^kn$ over $\mathbb{Z}4+u\mathbb{Z}_4$• Systematic Exploration of the High Likelihood Set of Phylogenetic Tree Topologies• Besov class via heat semigroup on Dirichlet spaces II: BV functions and Gaussian heat kernel estimates• An explicit representation and enumeration for self-dual cyclic codes over $\mathbb{F}{2^m}+u\mathbb{F}_{2^m}$ of length $2^s$• Convexity in scientific collaboration networks• Higher-order approximate confidence intervals• Finding perfect matchings in random regular graphs in linear time• A spatially varying change points model for monitoring glaucoma progression using visual field data• Rotting bandits are no harder than stochastic ones• Performance Analysis of a Low-Interference N-Continuous OFDM Scheme• Dense xUnit Networks• Understanding the Importance of Single Directions via Representative Substitution• Fast Object Detection in Compressed Video• Combining Deep Learning and Qualitative Spatial Reasoning to Learn Complex Structures from Sparse Examples with Noise• Learning State Representations in Complex Systems with Multimodal Data• Measuring Effects of Medication Adherence on Time-Varying Health Outcomes using Bayesian Dynamic Linear Models• Recycling cardiogenic artifacts in impedance pneumography• Ultra-dense Radio Access Networks for Smart Cities: Cloud-RAN, Fog-RAN and ‘cell-free’ Massive MIMO• Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion• Robust Classification of Financial Risk• MobiFace: A Lightweight Deep Learning Face Recognition on Mobile Devices• Automatic Face Aging in Videos via Deep Reinforcement Learning• Bilinear Parameterization For Differentiable Rank-Regularization• Learning to detect dysarthria from raw speech• Classifications of quasitrivial semigroups• Counterexamples to a conjecture of Harris on Hall ratio• Unprocessing Images for Learned Raw Denoising• See before you see: Real-time high speed motion prediction using fast aperture-robust event-driven visual flow• SOC: hunting the underground inside story of the ethereum Social-network Opinion and Comment• Spatial log-Gaussian Cox processes in Hilbert spaces• Reliable uncertainty estimate for antibiotic resistance classification with Stochastic Gradient Langevin Dynamics• Understanding and Improving Kernel Local Descriptors• The Structure of Optimal Private Tests for Simple Hypotheses• Finite-Time Stability of Adaptive Parameter Estimation and Control• Knots in random neural networks• Fairness Under Unawareness: Assessing Disparity When Protected Class Is Unobserved• FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery• The Capacity of Private Information Retrieval from Decentralized Uncoded Caching Databases• Cross-Lingual Approaches to Reference Resolution in Dialogue Systems• Class-Distinct and Class-Mutual Image Generation with GANs• Label-Noise Robust Generative Adversarial Networks• Integrated Object Detection and Tracking with Tracklet-Conditioned Detection• Deformable ConvNets v2: More Deformable, Better Results• Central limit theorems with a rate of convergence for time-dependent intermittent maps
Like this:
Like Loading…
Related