What’s new on arXiv

Feature selection for transient stability assessment based on kernelized fuzzy rough sets and memetic algorithm

A new feature selection method based on kernelized fuzzy rough sets (KFRS) and the memetic algorithm (MA) is proposed for transient stability assessment of power systems. Considering the possible real-time information provided by wide-area measurement systems, a group of system-level classification features are extracted from the power system operation parameters to build the original feature set. By defining a KFRS-based generalized classification function as the separability criterion, the memetic algorithm based on binary differential evolution (BDE) and Tabu search (TS) is employed to obtain the optimal feature subsets with the maximized classification capability. The proposed method may avoid the information loss caused by the feature discretization process of the rough-set based attribute selection, and comprehensively utilize the advantages of BDE and TS to improve the solution quality and search efficiency. The effectiveness of the proposed method is validated by the application results on the New England 39-bus power system and the southern power system of Hebei province.

Building a Robust Text Classifier on a Test-Time Budget

We propose a generic and interpretable learning framework for building a robust text classification model that achieves accuracy comparable to full models under test-time budget constraints. Our approach learns a selector to identify words that are relevant to the prediction task and passes them to the classifier for processing. The selector is trained jointly with the classifier and learns directly to work with it. We further propose a data aggregation scheme to improve the robustness of the classifier. Our learning framework is general and can be incorporated with any type of text classification model. On real-world data, we show that the proposed approach improves the performance of a given classifier and speeds up the model with only a marginal loss in accuracy.

Performance evaluation of job schedulers on Hadoop YARN

To solve the limitation of Hadoop on scalability, resource sharing, and application support, the open-source community proposes the next generation of Hadoop’s compute platform called Yet Another Resource Negotiator (YARN) by separating resource management functions from the programming model. This separation enables various application types to run on YARN in parallel. To achieve fair resource sharing and high resource utilization, YARN provides the capacity scheduler and the fair scheduler. However, the performance impacts of the two schedulers are not clear when mixed applications run on a YARN cluster. Therefore, in this paper, we study four scheduling-policy combinations (SPCs for short) derived from the two schedulers and then evaluate the four SPCs in extensive scenarios, which consider not only four application types, but also three different queue structures for organizing applications. The experimental results enable YARN managers to comprehend the influences of different SPCs and different queue structures on mixed applications. The results also help them to select a proper SPC and an appropriate queue structure to achieve better application execution performance.

Unknown Examples & Machine Learning Model Generalization

Over the past decades, researchers and ML practitioners have come up with better and better ways to build, understand and improve the quality of ML models, but mostly under the key assumption that the training data is distributed identically to the testing data. In many real-world applications, however, some potential training examples are unknown to the modeler, due to sample selection bias or, more generally, covariate shift, i.e., a distribution shift between the training and deployment stage. The resulting discrepancy between training and testing distributions leads to poor generalization performance of the ML model and hence biased predictions. We provide novel algorithms that estimate the number and properties of these unknown training examples—unknown unknowns. This information can then be used to correct the training set, prior to seeing any test data. The key idea is to combine species-estimation techniques with data-driven methods for estimating the feature values for the unknown unknowns. Experiments on a variety of ML models and datasets indicate that taking the unknown examples into account can yield a more robust ML model that generalizes better.
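
As a rough illustration of what a species-estimation technique does, the sketch below applies the classical Chao1 estimator to a list of observed category labels. This is an assumption chosen for illustration: the abstract does not say which estimator the authors use, and the function name `chao1_estimate` is hypothetical.

```python
from collections import Counter


def chao1_estimate(observations):
    """Estimate the total number of distinct classes, seen and unseen.

    Chao1 uses the number of classes seen exactly once (f1) and exactly
    twice (f2) to extrapolate how many classes were never observed.
    """
    counts = Counter(observations)
    s_obs = len(counts)  # distinct classes actually observed
    f1 = sum(1 for c in counts.values() if c == 1)  # singletons
    f2 = sum(1 for c in counts.values() if c == 2)  # doubletons
    if f2 > 0:
        return s_obs + (f1 * f1) / (2.0 * f2)
    # Bias-corrected form used when no class was seen exactly twice.
    return s_obs + f1 * (f1 - 1) / 2.0


sample = ["a", "a", "a", "b", "b", "c", "d"]
estimated_total = chao1_estimate(sample)
estimated_unseen = estimated_total - len(set(sample))
```

Here many singletons relative to doubletons suggest that a large part of the population was never sampled, which is the intuition behind estimating how many training examples are missing.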

ParaNet – Using Dense Blocks for Early Inference

DenseNets have been shown to be a competitive model among recent convolutional network architectures. These networks utilize Dense Blocks, which are groups of densely connected layers where the output of a hidden layer is fed in as the input of every other layer following it. In this paper, we aim to improve certain aspects of DenseNet, especially when it comes to practicality. We introduce ParaNet, a new architecture that constructs three pipelines which allow for early inference. We additionally introduce a cascading mechanism such that different pipelines are able to share parameters, as well as logit matching between the outputs of the pipelines. We separately evaluate each of the newly introduced mechanisms of ParaNet, then evaluate our proposed architecture on CIFAR-100.

To Cluster, or Not to Cluster: An Analysis of Clusterability Methods

Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. For most applications, applying clustering is only appropriate when cluster structure is present. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis. However, methods for evaluating clusterability vary radically, making it challenging to select a suitable measure. In this paper, we perform an extensive comparison of measures of clusterability and provide guidelines that clustering users can reference to select suitable measures for their applications.

Dr. Tux: A Question Answering System for Ubuntu users

Various forums and question answering (Q&A) sites are available online that allow Ubuntu users to find results similar to their queries. However, searching for a result is often time-consuming, as it requires the user to find a specific problem instance relevant to their query from a large set of questions. In this paper, we present an automated question answering system for Ubuntu users called Dr. Tux that is designed to answer users’ queries by selecting the most similar question from an online database. The prototype was implemented in Python and uses NLTK and CoreNLP tools for Natural Language Processing. The data for the prototype was taken from the AskUbuntu website, which contains about 150k questions. The results obtained from the manual evaluation of the prototype were promising while also presenting some interesting opportunities for improvement.

Data-dependent Learning of Symmetric/Antisymmetric Relations for Knowledge Base Completion

Embedding-based methods for knowledge base completion (KBC) learn representations of entities and relations in a vector space, along with the scoring function to estimate the likelihood of relations between entities. The learnable class of scoring functions is designed to be expressive enough to cover a variety of real-world relations, but this expressiveness comes at the cost of an increased number of parameters. In particular, parameters in these methods are superfluous for relations that are either symmetric or antisymmetric. To mitigate this problem, we propose a new L1 regularizer for Complex Embeddings, which is one of the state-of-the-art embedding-based methods for KBC. This regularizer promotes symmetry or antisymmetry of the scoring function on a relation-by-relation basis, in accordance with the observed data. Our empirical evaluation shows that the proposed method outperforms the original Complex Embeddings and other baseline methods on the FB15k dataset.
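
To see why some parameters are redundant for symmetric or antisymmetric relations, the sketch below implements the standard Complex Embeddings score, the real part of the trilinear product of the relation vector, the subject vector, and the conjugated object vector. The vectors are made-up toy values and `complex_score` is a hypothetical helper name; the regularizer proposed in the paper is not reproduced here.

```python
def complex_score(w_r, e_s, e_o):
    """Complex Embeddings score: Re(sum_k w_r[k] * e_s[k] * conj(e_o[k]))."""
    return sum((w * s * o.conjugate()).real for w, s, o in zip(w_r, e_s, e_o))


# Toy complex-valued entity embeddings (illustrative values only).
e_s = [1 + 2j, 0.5 - 1j]
e_o = [-1 + 1j, 2 + 0.5j]

# A purely real relation vector gives a symmetric score ...
w_real = [2 + 0j, -1 + 0j]
sym_forward = complex_score(w_real, e_s, e_o)
sym_backward = complex_score(w_real, e_o, e_s)

# ... and a purely imaginary one gives an antisymmetric score.
w_imag = [2j, -1j]
anti_forward = complex_score(w_imag, e_s, e_o)
anti_backward = complex_score(w_imag, e_o, e_s)
```

With a purely real relation vector the imaginary parts are unused, and with a purely imaginary one the real parts are unused, which is one way to read the claim that parameters are superfluous for such relations.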

How many labeled license plates are needed?

Training a good deep learning model often requires a lot of annotated data. As a large amount of labeled data is typically difficult to collect and even more difficult to annotate, data augmentation and data generation are widely used in the process of training deep neural networks. However, there is no clear common understanding on how much labeled data is needed to get satisfactory performance. In this paper, we try to address such a question using vehicle license plate character recognition as an example application. We apply computer graphic scripts and Generative Adversarial Networks to generate and augment a large number of annotated, synthesized license plate images with realistic colors, fonts, and character composition from a small number of real, manually labeled license plate images. Generated and augmented data are mixed and used as training data for the license plate recognition network modified from DenseNet. The experimental results show that the model trained from the generated mixed training data has good generalization ability, and the proposed approach achieves a new state-of-the-art accuracy on Dataset-1 and AOLP, even with a very limited number of original real license plates. In addition, the accuracy improvement caused by data generation becomes more significant when the number of labeled images is reduced. Data augmentation also plays a more significant role when the number of labeled images is increased.

Unsupervised Hypergraph Feature Selection via a Novel Point-Weighting Framework and Low-Rank Representation

Feature selection methods are widely used to address the ‘curse of dimensionality’ problem. Many proposed feature selection frameworks treat all data points equally, neglecting their different representation power and importance. In this paper, we propose an unsupervised hypergraph feature selection method via a novel point-weighting framework and low-rank representation that captures the importance of different data points. We introduce a novel soft hypergraph with low complexity to model data. Then, we formulate feature selection as an optimization problem that preserves both the local relationships and the global structure of the data. Our approach to global structure preservation helps the framework overcome the unavailability of data labels in unsupervised learning. The proposed feature selection method treats data points differently based on their importance in defining data structure and their representation power. Moreover, since the robustness of feature selection methods against noise and outliers is of great importance, we adopt low-rank representation in our model. We also provide an efficient algorithm to solve the proposed optimization problem. The computational cost of the proposed algorithm is lower than that of many state-of-the-art methods, which is of high importance in feature selection tasks. We conducted comprehensive experiments with various evaluation methods on different benchmark data sets. These experiments indicate significant improvement compared with state-of-the-art feature selection methods.

Efficiently Processing Workflow Provenance Queries on SPARK

In this paper, we investigate how we can leverage the Spark platform for efficiently processing provenance queries on large volumes of workflow provenance data. We focus on processing provenance queries at the attribute-value level, which is the finest granularity available. We propose a novel weakly connected component based framework which is carefully engineered to quickly determine a minimal volume of data containing the entire lineage of the queried attribute-value. This minimal volume of data is then processed to determine the provenance of the queried attribute-value. The proposed framework computes weakly connected components on the workflow provenance graph and further partitions the large components into a collection of weakly connected sets. The framework exploits the workflow dependency graph to partition the large components effectively. We study the effectiveness of the proposed framework through experiments on a provenance trace obtained from a real-life unstructured text curation workflow. On provenance graphs containing up to 500M nodes and edges, we show that the proposed framework answers provenance queries in real time and easily outperforms the naive approaches.
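
The idea of narrowing a lineage query to one weakly connected component can be sketched on a single machine with a union-find structure. This is only an in-memory illustration under assumed names (`weakly_connected_components`, `lineage_component`); it is not the Spark implementation described in the abstract and omits the further partitioning of large components.

```python
def weakly_connected_components(edges):
    """Group nodes of a directed graph into weakly connected components.

    Edge direction is ignored, so each component contains every node
    reachable from any member by following edges in either direction.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    components = {}
    for node in list(parent):
        components.setdefault(find(node), set()).add(node)
    return list(components.values())


def lineage_component(edges, item):
    """Return the component that must contain the full lineage of `item`."""
    for component in weakly_connected_components(edges):
        if item in component:
            return component
    return {item}


# Hypothetical provenance edges: (input value, derived value).
provenance_edges = [("a", "b"), ("b", "c"), ("x", "y")]
candidates = lineage_component(provenance_edges, "c")
```

Any ancestor of the queried value lies in the same weakly connected component, so a query only has to scan `candidates` rather than the whole graph.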

Churn Intent Detection in Multilingual Chatbot Conversations and Social Media

We propose a new method to detect when users express the intent to leave a service, also known as churn. While previous work focuses solely on social media, we show that this intent can be detected in chatbot conversations. As companies increasingly rely on chatbots they need an overview of potentially churny users. To this end, we crowdsource and publish a dataset of churn intent expressions in chatbot interactions in German and English. We show that classifiers trained on social media data can detect the same intent in the context of chatbots. We introduce a classification architecture that outperforms existing work on churn intent detection in social media. Moreover, we show that, using bilingual word embeddings, a system trained on combined English and German data outperforms monolingual approaches. As the only existing dataset is in English, we crowdsource and publish a novel dataset of German tweets. We thus underline the universal aspect of the problem, as examples of churn intent in English help us identify churn in German tweets and chatbot conversations.

A Tutorial on Modular Ontology Modeling with Ontology Design Patterns: The Cooking Recipes Ontology

We provide a detailed example for modular ontology modeling based on ontology design patterns.

Causes of Effects via a Bayesian Model Selection Procedure

In causal inference, and specifically in the Causes of Effects problem, one is interested in how to use statistical evidence to understand causation in an individual case, and so how to assess the so-called probability of causation (PC). The answer relies on the potential responses, which can incorporate information about what would have happened to the outcome had we observed a different value of the exposure. However, even given the best possible statistical evidence for the association between exposure and outcome, we can typically only provide bounds for the PC. Dawid et al. (2016) highlighted some fundamental conditions, namely, exogeneity, comparability, and sufficiency, required to obtain such bounds, based on experimental data. The aim of the present paper is to provide methods to find, in specific cases, the best subsample of the reference dataset to satisfy such requirements. To this end, we introduce a new variable, expressing the desire to be exposed or not, and we set the question up as a model selection problem. The best model will be selected using the marginal probability of the responses and a suitable prior proposal over the model space. An application in the educational field is presented.

Taxonomy of Big Data: A Survey

Big Data is the most popular paradigm nowadays, and it has left almost no area untouched, including science, engineering, economics, business, social science, and government. Big Data is used to boost organizational performance using massive datasets. Data are assets of the organization, and these data generate revenue for organizations. Therefore, Big Data is spreading everywhere to enhance organizations’ revenue. Thus, many new technologies based on Big Data are emerging. In this paper, we present the taxonomy of Big Data. In addition, we present in-depth insight into the Big Data paradigm.

Bayesian Hypothesis Testing: Redux

Bayesian hypothesis testing is re-examined from the perspective of an a priori assessment of the test statistic distribution under the alternative. By assessing the distribution of an observable test statistic, rather than prior parameter values, we provide a practical default Bayes factor which is straightforward to interpret. To illustrate our methodology, we provide examples where evidence for a Bayesian strikingly supports the null, but leads to rejection under a classical test. Finally, we conclude with directions for future research.

FinBrain: When Finance Meets AI 2.0

Artificial intelligence (AI) is the core technology of technological revolution and industrial transformation. As one of the new intelligent needs in the AI 2.0 era, financial intelligence has elicited much attention from the academia and industry. In our current dynamic capital market, financial intelligence demonstrates a fast and accurate machine learning capability to handle complex data and has gradually acquired the potential to become a ‘financial brain’. In this work, we survey existing studies on financial intelligence. First, we describe the concept of financial intelligence and elaborate on its position in the financial technology field. Second, we introduce the development of financial intelligence and review state-of-the-art techniques in wealth management, risk management, financial security, financial consulting, and blockchain. Finally, we propose a research framework called FinBrain and summarize four open issues, namely, explainable financial agents and causality, perception and prediction under uncertainty, risk-sensitive and robust decision making, and multi-agent game and mechanism design. We believe that these research directions can lay the foundation for the development of AI 2.0 in the finance field.

Alignment Strength and Correlation for Graphs

Event Detection with Neural Networks: A Rigorous Empirical Evaluation

Detecting events and classifying them into predefined types is an important step in knowledge extraction from natural language texts. While neural network models have generally led the state of the art, the differences in performance between different architectures have not been rigorously studied. In this paper, we present a novel GRU-based model that combines syntactic information with temporal structure through an attention mechanism. We show that it is competitive with other neural network architectures through empirical evaluations under different random initializations and training-validation-test splits of the ACE2005 dataset.

Data Motifs: A Lens Towards Fully Understanding Big Data and AI Workloads

An Incremental Construction of Deep Neuro Fuzzy System for Continual Learning of Non-stationary Data Streams

Existing fuzzy neural networks (FNNs) are mostly developed under a shallow network configuration, which has lower generalization power than deep structures. This paper proposes a novel self-organizing deep fuzzy neural network, namely deep evolving fuzzy neural networks (DEVFNN). Fuzzy rules can be automatically extracted from data streams or removed if they play little role during their lifespan. The structure of the network can be deepened on demand by stacking additional layers using a drift detection method which not only detects the covariate drift, variations of input space, but also accurately identifies the real drift, dynamic changes of both feature space and target space. DEVFNN is developed under the stacked generalization principle via the feature augmentation concept, where a recently developed algorithm, namely Generic Classifier (gClass), drives the hidden layer. It is equipped with an automatic feature selection method which controls activation and deactivation of input attributes to induce varying subsets of input features. A deep network simplification procedure is put forward using the concept of hidden layer merging to prevent uncontrollable growth of the input space dimension due to the nature of the feature augmentation approach in building a deep network structure. DEVFNN works in a sample-wise fashion and is compatible with data stream applications. The efficacy of DEVFNN has been thoroughly evaluated using six datasets with non-stationary properties under the prequential test-then-train protocol. It has been compared with four state-of-the-art data stream methods and its shallow counterpart, where DEVFNN demonstrates improved classification accuracy.
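
The prequential test-then-train protocol mentioned above can be sketched as a loop in which each sample is first used for prediction and only then for learning. The `MajorityClassLearner` below is a deliberately trivial stand-in introduced for illustration; it is not DEVFNN or gClass, and all names are hypothetical.

```python
class MajorityClassLearner:
    """Toy incremental learner that predicts the most frequent label so far."""

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return max(self.counts, key=self.counts.get)

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def prequential_accuracy(stream, learner):
    """Test on each sample before training on it, one sample at a time."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:  # test first ...
            correct += 1
        learner.learn(x, y)  # ... then train on the same sample
        total += 1
    return correct / total if total else 0.0


toy_stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a")]
accuracy = prequential_accuracy(toy_stream, MajorityClassLearner())
```

Because every sample is scored before the model has seen its label, the running accuracy reflects performance on unseen data without needing a separate hold-out set.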

DIFET: Distributed Feature Extraction Tool For High Spatial Resolution Remote Sensing Images

In this paper, we propose a distributed feature extraction tool for high spatial resolution remote sensing images. The tool is based on the Apache Hadoop framework and the Hadoop Image Processing Interface. Two corner detection algorithms (Harris and Shi-Tomasi) and five feature descriptors (SIFT, SURF, FAST, BRIEF, and ORB) are considered. The robustness of the tool in the task of feature extraction from LandSat-8 imagery is evaluated in terms of horizontal scalability.

• 2DR: Towards Fine-Grained 2-D RFID Touch Sensing
• Probabilistic Model of Object Detection Based on Convolutional Neural Network
• An elementary introduction to information geometry
• Improving Breast Cancer Detection using Symmetry Information with Deep Learning
• Deep Mask For X-ray Based Heart Disease Classification
• Controlling Over-generalization and its Effect on Adversarial Examples Generation and Detection
• Nuclei Detection Using Mixture Density Networks
• Deep multiscale convolutional feature learning for weakly supervised localization of chest pathologies in X-ray images
• Brain Biomarker Interpretation in ASD Using Deep Learning and fMRI
• Compact Linearization for Binary Quadratic Problems Comprising Linear Constraints
• The XDEM Multi-physics and Multi-scale Simulation Technology: Review on DEM-CFD Coupling, Methodology and Engineering Applications
• Cox Model with Covariate Measurement Error and Unknown Changepoint
• Using Apple Machine Learning Algorithms to Detect and Subclassify Non-Small Cell Lung Cancer
• The transport phenomenon of inertia Brownian particle in periodic systems with non-Gaussian noise
• Detecting strong cliques
• Delay-induced chimeras in neural networks with fractal topology
• The Entropy Power Inequality with quantum conditioning
• LMI-Based Reset Unknown Input Observer for State Estimation of Linear Uncertain Systems
• Inverse Kinematics for Control of Tensegrity Soft Robots: Existence and Optimality of Solutions
• Adaptive Grey-Box Fuzz-Testing with Thompson Sampling
• Resource Allocation Game on Social Networks: Best Response Dynamics and Convergence
• Stability of Metabolic Networks via Linear-In-Flux-Expressions
• A Visual Attention Grounding Neural Model for Multimodal Machine Translation
• Learning Models for Shared Control of Human-Machine Systems with Unknown Dynamics
• Can we leverage rating patterns from traditional users to enhance recommendations for children?
• Probabilistic Graphical Modeling approach to dynamic PET direct parametric map estimation and image reconstruction
• Convergence of the Augmented Decomposition Algorithm
• Harnessing Infant Cry for swift, cost-effective Diagnosis of Perinatal Asphyxia in low-resource settings
• GlymphVIS: Visualizing Glymphatic Transport Pathways Using Regularized Optimal Transport
• Interpretable Spiculation Quantification for Lung Cancer Screening
• Trajectory Tracking Control of a Flexible Spine Robot, With and Without a Reference Input
• Voice Conversion with Conditional SampleRNN
• Quantification of Local Metabolic Tumor Volume Changes by Registering Blended PET-CT Images for Prediction of Pathologic Tumor Response
• Shape of the zeroth Landau level in graphene with non-diagonal disorder
• A Deterministic Self-Organizing Map Approach and its Application on Satellite Data based Cloud Type Classification
• A Trio Neural Model for Dynamic Entity Relatedness Ranking
• BOP: Benchmark for 6D Object Pose Estimation
• $L_p$ and almost sure convergence of estimation on heavy tail index under random censoring
• Aperiodic Array Synthesis for Multi-User MIMO Applications
• Rate-Splitting for Multi-Antenna Non-Orthogonal Unicast and Multicast Transmission: Spectral and Energy Efficiency Analysis
• A Bayesian Approach to Restricted Latent Class Models for Scientifically-Structured Clustering of Multivariate Binary Outcomes
• Composable block solvers for the four-field double porosity/permeability model
• Theory-Driven Automated Content Analysis of Suicidal Tweets : Using Typicality-Based Classification for LDA Dataset
• Reducing model bias in a deep learning classifier using domain adversarial neural networks in the MINERvA experiment
• Backward Stochastic Riccati Equation with Jumps associated with Stochastic Linear Quadratic Optimal Control with Jumps and Random Coefficients
• Uniformly Bounded Sets in Quasiperiodically Forced Dynamical Systems
• Multiobjective Optimization Training of PLDA for Speaker Verification
• A Comparison of the Taguchi Method and Evolutionary Optimization in Multivariate Testing
• Multi-scale CNN stereo and pattern removal technique for underwater active stereo system
• Detection and Mitigation of Attacks on Transportation Networks as a Multi-Stage Security Game
• Random Matrix Theory Model for Mean Notch Depth of the Diagonally Loaded MVDR Beamformer for a Single Interferer Case
• Hyperscaling Internet Graph Analysis with D4M on the MIT SuperCloud
• Database-Agnostic Workload Management
• Consensus-Before-Talk: Distributed Dynamic Spectrum Access via Distributed Spectrum Ledger Technology
• Embedded Pilot-Aided Channel Estimation for OTFS in Delay-Doppler Channels
• A short exposition of S. Parsa’s theorem on intrinsic linking and non-realizability
• Relaxing the Identically Distributed Assumption in Gaussian Co-Clustering for High Dimensional Data
• Inequalities of Riesz-Sobolev type for compact connected Abelian groups
• Optimal uniform approximation of Lévy processes on Banach spaces with finite variation processes
• NavigationNet: A Large-scale Interactive Indoor Navigation Dataset
• Ranked Schröder Trees
• Fusion++: Volumetric Object-Level SLAM
• Stochastic Collocation with Non-Gaussian Correlated Parameters via a New Quadrature Rule
• Antenna Array Based Positional Modulation with a Two-Ray Multi-Path Model
• Circulant matrices and Galois-Togliatti systems
• MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction
• Saliency Detection via Bidirectional Absorbing Markov Chain
• A Novel Deep Neural Network Architecture for Mars Visual Navigation
• Noiseprint: a CNN-based camera model fingerprint
• Tree-based Particle Smoothing Algorithms in a Hidden Markov Model
• How do Convolutional Neural Networks Learn Design?
• StreamChain: Do Blockchains Need Blocks?
• Longest increasing path within the critical strip
• Improving the results of string kernels in sentiment analysis and Arabic dialect identification by adapting them to your test set
• Multiplayer bandits without observing collision information
• Parameter estimation for Gaussian processes with application to the model with two independent fractional Brownian motions
• Simple Graph Coloring Algorithms for Congested Clique and Massively Parallel Computation
• Spectral gap property for random dynamics on the real line and multifractal analysis of generalised Takagi functions
• Analysis of adversarial attacks against CNN-based image forgery detectors
• Driven tabu search: a quantum inherent optimisation
• An Experimental Comparison of SONC and SOS Certificates for Unconstrained Optimization
• Meta-Learning for Low-Resource Neural Machine Translation
• Paraphrases as Foreign Languages in Multilingual Neural Machine Translation
• Inductive Learning of Answer Set Programs from Noisy Examples
• Efficient improvement of frequency-domain Kalman filter
• Deep Emotion: A Computational Model of Emotion Using Deep Neural Networks
• What is an answer? – remarks, results and problems on PIO formulas in combinatorial enumeration, part I
• Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition
• Additive Volume of Sets Contained in Few Arithmetic Progressions
• The Social Cost of Strategic Classification
• An Upper Bound on the Number of $(132,213)$-Avoiding Cyclic Permutations
• Stronger sum-product inequalities for small sets
• DNN: A Two-Scale Distributional Tale of Heterogeneous Treatment Effect Inference
• Representing Social Media Users for Sarcasm Detection
• Human-centric Indoor Scene Synthesis Using Stochastic Grammar
• Discrete Decreasing Minimization, Part II: Views from Discrete Convex Analysis
• Network Inference from Temporal-Dependent Grouped Observations
• Deep-Learning Ensembles for Skin-Lesion Segmentation, Analysis, Classification: RECOD Titans at ISIC Challenge 2018
• The Eulerian distribution on involutions is indeed $γ$-positive
• Exploring Recombination for Efficient Decoding of Neural Machine Translation
• Painting Outside the Box: Image Outpainting with GANs
• Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision
• Modified Erdös–Ginzburg–Ziv Constants for $\mathbb Z/n\mathbb Z$ and $(\mathbb Z/n\mathbb Z)^2$
• Bent Vectorial Functions, Codes and Designs
• Contextual Parameter Generation for Universal Neural Machine Translation
• Towards Tight Approximation Bounds for Graph Diameter and Eccentricities
• Finite and infinite Mallows ranking models, maximum likelihood estimator, and regeneration
• Energy Efficient and Fair Resource Allocation for LTE-Unlicensed Uplink Networks: A Two-sided Matching Approach with Partial Information
• Efficient Single Image Super Resolution using Enhanced Learned Group Convolutions
• Word Sense Induction with Neural biLM and Symmetric Patterns
• Spectral Efficiency Analysis of Multi-Cell Massive MIMO Systems with Ricean Fading
• There is no lattice tiling of $\mathbb{Z}^n$ by Lee spheres of radius $2$ for $n\geq 3$
• An Approach For Stitching Satellite Images In A Bigdata Mapreduce Framework
• When facts fail: Bias, polarisation and truth in social networks
• A Framework on Hybrid MIMO Transceiver Design based on Matrix-Monotonic Optimization
• A MapReduce based Big-data Framework for Object Extraction from Mosaic Satellite Images
