What's new on arXiv

Interpretable Convolutional Filter Pruning

The sophisticated structure of Convolutional Neural Networks (CNNs) allows for outstanding performance, but at the cost of intensive computation. As significant redundancies are inevitably present in such a structure, many works have been proposed to prune the convolutional filters for computation cost reduction. Although extremely effective, most works are based only on quantitative characteristics of the convolutional filters, and largely overlook the qualitative interpretation of each filter’s specific functionality. In this work, we interpreted the functionality and redundancy of convolutional filters from different perspectives, and proposed a functionality-oriented filter pruning method. With extensive experimental results, we established the qualitative significance of convolutional filters regardless of magnitude, demonstrated significant neural network redundancy due to repetitive filter functions, and analyzed how filter functionality degrades under an inappropriate retraining process. Such an interpretable pruning approach not only offers outstanding computation cost optimization over previous filter pruning methods, but also makes the pruning process itself interpretable.
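
The abstract does not spell out the pruning criterion, but the core idea of functionality-oriented pruning can be pictured with a toy sketch: treat filters whose responses on a probe batch are near-duplicates as functionally redundant and keep one representative per group, regardless of weight magnitude. Everything below (the correlation threshold, the probe-activation matrix) is an illustrative assumption, not the paper's method.

```python
import numpy as np

def redundant_filter_groups(activations, threshold=0.95):
    """Group filters whose pairwise response correlation exceeds `threshold`.

    `activations` has shape (num_filters, num_samples): each row is one
    filter's flattened response on a probe set.
    """
    corr = np.corrcoef(activations)           # (F, F) correlation matrix
    keep, pruned = [], set()
    for i in range(corr.shape[0]):
        if i in pruned:
            continue
        keep.append(i)                        # representative of its group
        # every later filter that is nearly a duplicate of filter i is pruned
        for j in range(i + 1, corr.shape[0]):
            if corr[i, j] > threshold:
                pruned.add(j)
    return keep, sorted(pruned)

rng = np.random.default_rng(0)
base = rng.normal(size=(4, 256))
acts = np.vstack([base, base + 0.01 * rng.normal(size=(4, 256))])  # 4 near-copies
kept, dropped = redundant_filter_groups(acts)
print(f"keep {kept}, prune {dropped}")
```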

Super Characters: A Conversion from Sentiment Classification to Image Classification

We propose a method named Super Characters for sentiment classification. This method converts the sentiment classification problem into an image classification problem by projecting texts into images and then applying CNN models for classification. Text features are extracted automatically from the generated Super Characters images, hence there is no need for an explicit step of embedding the words or characters into numerical vector representations. Experimental results show that the Super Characters method consistently outperforms other methods on sentiment classification and topic classification tasks across ten large social media datasets, each containing millions of entries, in four languages: Chinese, Japanese, Korean and English.
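
The conversion step is straightforward to prototype. Below is a minimal sketch, assuming a simple fixed grid layout and Pillow's default font (the paper's exact glyph sizing and layout rules are not given in the abstract); the resulting image would then be fed to an ordinary CNN classifier.

```python
from PIL import Image, ImageDraw

def text_to_super_characters(text, image_size=224, grid=8):
    """Draw characters left-to-right, top-to-bottom on a blank square image."""
    img = Image.new("L", (image_size, image_size), color=255)
    draw = ImageDraw.Draw(img)
    cell = image_size // grid
    for idx, ch in enumerate(text[: grid * grid]):
        row, col = divmod(idx, grid)
        # default bitmap font; the paper presumably scales glyphs to the cell
        draw.text((col * cell, row * cell), ch, fill=0)
    return img

img = text_to_super_characters("This movie was absolutely wonderful!")
img.save("super_characters.png")  # input for any standard CNN classifier
```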

The Newton Scheme for Deep Learning

We introduce a neural network (NN) strictly governed by Newton’s laws, with the required basis functions derived from fundamental classical mechanics. By recasting training as a quick procedure of ‘force pattern’ recognition, we developed the physics-based Newton Scheme (NS). Once the force pattern is confirmed, the network simply checks ‘pattern stability’ instead of continuously fitting via computationally expensive, big-data-driven processing. Within a given system of physical laws, once the field is confirmed, the mathematical bases describing the force field are not divergent but denumerable, which confines the function representation to a countable set of available mathematical bases. In this work, we embedded Newton’s laws into deep learning technology through the proposed NS. Under NS, the user first identifies the path pattern, such as constant-acceleration movement. The object recognition component first loads mass information; then NS finds the matching physical pattern and describes and predicts the trajectory of the movement with nearly zero error. We compare the major contribution of NS with TCN, GRU and other physics-inspired ‘FIND-PDE’ methods to demonstrate fundamental and extended applications of how NS works for free-falling objects, pendulums and curving soccer balls. The NS methodology provides more opportunities for future advances in deep learning.
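
As a rough illustration of the ‘force pattern’ idea (the paper's own machinery is not specified in the abstract), the sketch below hypothesizes a constant-acceleration pattern, checks its stability against observed data, and then predicts with closed-form kinematics instead of a large data-driven model. The tolerance and the synthetic free-fall data are assumptions.

```python
import numpy as np

# Hypothesize x(t) = x0 + v0*t + 0.5*a*t**2 and test it on observations.
t = np.linspace(0.0, 2.0, 50)
x = 1.0 + 0.5 * t - 0.5 * 9.81 * t**2            # synthetic free-fall data

coeffs = np.polyfit(t, x, deg=2)                  # fit the quadratic pattern
residual = np.max(np.abs(np.polyval(coeffs, t) - x))
if residual < 1e-6:                               # "pattern stability" check
    a, v0, x0 = 2 * coeffs[0], coeffs[1], coeffs[2]
    print(f"matched constant-acceleration pattern: a={a:.2f}, v0={v0:.2f}")
    # closed-form prediction, no further fitting needed
    print("predicted x(3.0) =", x0 + v0 * 3.0 + 0.5 * a * 3.0**2)
```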

Multi-Task Deep Learning for Legal Document Translation, Summarization and Multi-Label Classification

The digitalization of the legal domain has been ongoing for several years. In that process, the application of different machine learning (ML) techniques is crucial. Tasks such as the classification of legal documents or contract clauses, as well as their translation, are highly relevant. On the other hand, digitized documents are barely accessible in this field, particularly in Germany. Today, deep learning (DL) is one of the hot topics, with many publications and various applications. Sometimes it provides results that outperform the human level. Hence, this technique may be feasible for the legal domain as well. However, DL requires thousands of samples to provide decent results. A potential solution to this problem is multi-task DL, which enables transfer learning. This approach may be able to overcome the data scarcity problem in the legal domain, specifically for the German language. We applied the state-of-the-art multi-task model to three tasks: translation, summarization, and multi-label classification. The experiments were conducted on legal document corpora utilizing several task combinations as well as various model parameters. The goal was to find the optimal configuration for the tasks at hand within the legal domain. The multi-task DL approach outperformed the state-of-the-art results in all three tasks. This opens a new direction for integrating DL technology more efficiently in the legal domain.

Security Matters: A Survey on Adversarial Machine Learning

Adversarial machine learning is a fast-growing research area that considers scenarios in which machine learning systems may face adversarial attackers, who intentionally synthesize input data to cause a well-trained model to make mistakes. It always involves a defending side, usually a classifier, and an attacking side that aims to cause incorrect output. The earliest studies of adversarial learning arose in the information security area, which considers a variety of possible attacks. But the recent research focus popularized by the deep learning community places strong emphasis on how ‘imperceptible’ perturbations of normal inputs can cause dramatic mistakes in deep learning models of supposedly super-human accuracy. This paper gives a comprehensive introduction to a wide range of aspects of the adversarial deep learning topic, including its foundations, typical attacking and defending strategies, and some extended studies. We also share our points of view on the root cause of its existence and possible future directions of this research field.
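
A concrete example of the ‘imperceptible perturbation’ attacks such surveys cover is the well-known Fast Gradient Sign Method (FGSM); a minimal PyTorch sketch follows. The model and loss function are assumed to be supplied by the caller, and inputs are assumed to lie in [0, 1].

```python
import torch

def fgsm_perturb(model, loss_fn, x, y, epsilon=0.03):
    """One FGSM step: nudge every input dimension by +/-epsilon in the
    direction that increases the loss, often enough to flip the model's
    prediction while staying visually indistinguishable from the original."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()
```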

Incremental Few-Shot Learning with Attention Attractor Networks

Machine learning classifiers are often trained to recognize a set of pre-defined classes. However, in many real applications, it is often desirable to have the flexibility of learning additional concepts, without re-training on the full training set. This paper addresses this problem, incremental few-shot learning, where a regular classification network has already been trained to recognize a set of base classes; and several extra novel classes are being considered, each with only a few labeled examples. After learning the novel classes, the model is then evaluated on the overall performance of both base and novel classes. To this end, we propose a meta-learning model, the Attention Attractor Network, which regularizes the learning of novel classes. In each episode, we train a set of new weights to recognize novel classes until they converge, and we show that the technique of recurrent back-propagation can back-propagate through the optimization process and facilitate the learning of the attractor network regularizer. We demonstrate that the learned attractor network can recognize novel classes while remembering old classes without the need to review the original training set, outperforming baselines that do not rely on an iterative optimization process.

The Concept of Criticality in Reinforcement Learning

Reinforcement learning methods carry a well-known bias-variance trade-off in n-step algorithms for optimal control. Unfortunately, this has rarely been addressed in current research. This trade-off principle holds independent of the choice of algorithm, such as n-step SARSA, n-step Expected SARSA or n-step Tree Backup. A small n results in a large bias, while a large n leads to large variance. The literature offers no straightforward recipe for the best choice of this value. While currently all n-step algorithms use a fixed value of n over the state space, we extend the framework of n-step updates by allowing each state to have its own specific n. We propose a solution to this problem within the context of human-aided reinforcement learning. Our approach is based on the observation that a human can learn more efficiently if she receives input regarding the criticality of a given state, and thus the amount of attention she needs to invest into learning in that state. This observation is related to the idea that each state of the MDP has a certain measure of criticality which indicates how much the choice of action in that state influences the return. In our algorithm, the RL agent utilizes the criticality measure, a function provided by a human trainer, to locally choose the best step number n for the update of the Q function.
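
A minimal sketch of how a state-specific n could enter an n-step SARSA update is given below; the criticality function, learning rate, and data layout are illustrative assumptions, not the paper's implementation.

```python
# Q is a table, e.g. collections.defaultdict(float), keyed by (state, action).
# trajectory[t] = (s_t, a_t, r_{t+1}): the reward stored with a step is the
# one received after taking that step's action.
def nstep_sarsa_update(Q, trajectory, t, criticality, alpha=0.1, gamma=0.99):
    s_t, a_t, _ = trajectory[t]
    n = min(criticality(s_t),             # human-provided, state-specific n >= 1
            len(trajectory) - 1 - t)      # clipped so indices stay in-episode
    # n-step return: discounted rewards plus a bootstrapped tail value
    G = sum(gamma**k * trajectory[t + k][2] for k in range(n))
    s_n, a_n, _ = trajectory[t + n]
    G += gamma**n * Q[(s_n, a_n)]
    Q[(s_t, a_t)] += alpha * (G - Q[(s_t, a_t)])
```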

At Human Speed: Deep Reinforcement Learning with Action Delay

There has been a recent explosion in the capabilities of game-playing artificial intelligence. Many classes of tasks, from video games to motor control to board games, are now solvable by fairly generic algorithms, based on deep learning and reinforcement learning, that learn to play from experience with minimal prior knowledge. However, these machines often do not win through intelligence alone — they possess vastly superior speed and precision, allowing them to act in ways a human never could. To level the playing field, we restrict the machine’s reaction time to a human level, and find that standard deep reinforcement learning methods quickly drop in performance. We propose a solution to the action delay problem inspired by human perception — to endow agents with a neural predictive model of the environment which ‘undoes’ the delay inherent in their environment — and demonstrate its efficacy against professional players in Super Smash Bros. Melee, a popular console fighting game.

Refining interaction search through signed iterative Random Forests

Advances in supervised learning have enabled accurate prediction in biological systems governed by complex interactions among biomolecules. However, state-of-the-art predictive algorithms are typically black boxes, learning statistical interactions that are difficult to translate into testable hypotheses. The iterative Random Forest (iRF) algorithm took a step towards bridging this gap by providing a computationally tractable procedure to identify the stable, high-order feature interactions that drive the predictive accuracy of Random Forests (RF). Here we refine the interactions identified by iRF to explicitly map responses as a function of interacting features. Our method, signed iRF (s-iRF), describes subsets of rules that frequently occur on RF decision paths. We refer to these rule subsets as signed interactions. Signed interactions share not only the same set of interacting features but also exhibit similar thresholding behavior, and thus describe a consistent functional relationship between interacting features and responses. We describe stable and predictive importance metrics (SPIMs) to rank signed interactions. For each SPIM, we define null importance metrics that characterize its expected behavior under known structure. We evaluate our proposed approach in biologically inspired simulations and two case studies: predicting enhancer activity and spatial gene expression patterns. In the case of enhancer activity, s-iRF recovers one of the few experimentally validated high-order interactions and suggests novel enhancer elements where this interaction may be active. In the case of spatial gene expression patterns, s-iRF recovers all 11 reported links in the gap gene network. By refining the process of interaction recovery, our approach has the potential to guide mechanistic inquiry into systems whose scale and complexity are beyond human comprehension.

Deep Neural Maps

We introduce Deep Neural Maps (DNM), a new unsupervised representation learning and visualization method that combines deep convolutional networks with self-organizing maps. DNM jointly learns an embedding of the input data and a mapping from the embedding space to a two-dimensional lattice. We compare visualizations of DNM with those of t-SNE and LLE on the MNIST and COIL-20 data sets. Our experiments show that DNM can learn efficient representations of the input data, which reflect the characteristics of each class. This is shown by back-projecting the map’s neurons onto the data space.

ReDMark: Framework for Residual Diffusion Watermarking on Deep Networks

Due to the rapid growth of machine learning tools, and specifically deep networks, in various computer vision and image processing areas, applications of Convolutional Neural Networks to watermarking have recently emerged. In this paper, we propose a deep end-to-end diffusion watermarking framework (ReDMark) which can be adapted to any desired transform space. The framework is composed of two Fully Convolutional Neural Networks with residual structure, for embedding and extraction. The whole deep network is trained end-to-end to conduct blind, secure watermarking. The framework is customizable for the level of robustness vs. imperceptibility. It is also adjustable for the trade-off between capacity and robustness. The proposed framework simulates various attacks as a differentiable network layer to facilitate end-to-end training. For the JPEG attack, a differentiable approximation is utilized, which drastically improves the watermarking robustness to this attack. Another important characteristic of the proposed framework, which leads to improved security and robustness, is its capability to diffuse watermark information over a relatively wide area of the image. Comparative results versus recent state-of-the-art methods highlight the superiority of the proposed framework in terms of imperceptibility and robustness.

A Short Introduction to Local Graph Clustering Methods and Software

Graph clustering has many important applications in computing, but due to the increasing sizes of graphs, even traditionally fast clustering methods can be computationally expensive for real-world graphs of interest. Scalability problems led to the development of local graph clustering algorithms that come with a variety of theoretical guarantees. Rather than return a global clustering of the entire graph, local clustering algorithms return a single cluster around a given seed node or set of seed nodes. These algorithms improve scalability because they use time and memory resources that depend only on the size of the cluster returned, instead of the size of the input graph. Indeed, for many of them, their running time grows linearly with the size of the output. In addition to scalability arguments, local graph clustering algorithms have proven to be very useful for identifying and interpreting small-scale and meso-scale structure in large-scale graphs. As opposed to heuristic operational procedures, this class of algorithms comes with strong algorithmic and statistical theory. These include statistical guarantees that prove they have implicit regularization properties. One of the challenges with the existing literature on these approaches is that they are published in a wide variety of areas, including theoretical computer science, statistics, data science, and mathematics. This has made it difficult to relate the various algorithms and ideas together into a cohesive whole. We have recently been working on unifying these diverse perspectives through the lens of optimization as well as providing software to perform these computations in a cohesive fashion. In this note, we provide a brief introduction to local graph clustering, we provide some representative examples of our perspective, and we introduce our software named Local Graph Clustering (LGC).
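
One representative algorithm in this family is the classic Andersen-Chung-Lang ‘push’ procedure for approximate personalized PageRank, whose running time depends on the output rather than the graph size. A compact sketch follows; it is not taken from the LGC package itself, and it assumes every node has at least one neighbor.

```python
from collections import defaultdict

def approximate_ppr(neighbors, seed, alpha=0.15, eps=1e-4):
    """ACL push: touches only nodes near the seed, so the cost scales with
    the size of the output cluster rather than the whole graph.
    neighbors: dict mapping node -> list of adjacent nodes."""
    p, r = defaultdict(float), defaultdict(float)
    r[seed] = 1.0
    active = {seed}
    while active:
        u = active.pop()
        deg = len(neighbors[u])
        if r[u] < eps * deg:            # residual too small: nothing to push
            continue
        p[u] += alpha * r[u]
        push = (1.0 - alpha) * r[u] / (2.0 * deg)
        r[u] *= (1.0 - alpha) / 2.0
        if r[u] >= eps * deg:
            active.add(u)
        for v in neighbors[u]:
            r[v] += push
            if r[v] >= eps * len(neighbors[v]):
                active.add(v)
    return p  # a sweep cut over p[v]/deg(v) then extracts the local cluster

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
print(sorted(approximate_ppr(graph, seed=0).items()))
```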

Autonomous Deep Learning: Continual Learning Approach for Dynamic Environments

The feasibility of deep neural networks (DNNs) for addressing data stream problems still requires intensive study because of the static and offline nature of conventional deep learning approaches. A deep continual learning algorithm, namely autonomous deep learning (ADL), is proposed in this paper. Unlike traditional deep learning methods, ADL features a flexible network structure that can be constructed from scratch, in the absence of an initial network structure, via a self-constructing mechanism. ADL specifically addresses catastrophic forgetting through a different-depth structure capable of achieving a trade-off between plasticity and stability. A network significance (NS) formula is proposed to drive the growing and pruning of hidden nodes. A drift detection scenario (DDS) is put forward to signal distributional changes in data streams, which induce the creation of a new hidden layer. The maximum information compression index (MICI) method plays an important role as a complexity reduction module, eliminating redundant layers. The efficacy of ADL is numerically validated under the prequential test-then-train procedure in lifelong environments using nine popular data stream problems. The numerical results demonstrate that ADL consistently outperforms recent continual learning methods while automatically constructing its network structure.

Fault Tolerance in Iterative-Convergent Machine Learning

Machine learning (ML) training algorithms often possess an inherent self-correcting behavior due to their iterative-convergent nature. Recent systems exploit this property to achieve adaptability and efficiency in unreliable computing environments by relaxing the consistency of execution and allowing calculation errors to be self-corrected during training. However, the behavior of such systems is only well understood for specific types of calculation errors, such as those caused by staleness, reduced precision, or asynchronicity, and for specific types of training algorithms, such as stochastic gradient descent. In this paper, we develop a general framework to quantify the effects of calculation errors on iterative-convergent algorithms, and use this framework to design new strategies for checkpoint-based fault tolerance. Our framework yields a worst-case upper bound on the iteration cost of arbitrary perturbations to model parameters during training. Our system, SCAR, employs strategies which reduce this iteration-cost upper bound for the perturbations incurred when recovering from checkpoints. We show that SCAR can reduce the iteration cost of partial failures by 78% to 95% compared with traditional checkpoint-based fault tolerance, across a variety of ML models and training algorithms.

Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data

Searching high-dimensional vector data with high accuracy is an indispensable technology for many types of data. Graph-based indexes are known to reduce the query time for high-dimensional data. To further improve the query time using graphs, we focused on the indegrees and outdegrees of graphs. While a sufficient number of incoming edges (indegrees) is indispensable for increasing search accuracy, an excessive number of outgoing edges (outdegrees) should be suppressed so as not to increase the query time. Therefore, we propose three degree-adjustment methods: static degree adjustment of not only outdegrees but also indegrees; dynamic degree adjustment, in which outdegrees are determined by the search accuracy users require; and path adjustment, which removes edges that have alternative search paths in order to reduce outdegrees. We also show how to obtain optimal degree-adjustment parameters, and that our methods outperformed previous methods on image and textual data.
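
The path adjustment step can be pictured with a small sketch: drop an outgoing edge when another kept neighbor already reaches the same target, so outdegrees shrink without losing reachability. This simplified reading ignores edge lengths and the two other adjustment methods.

```python
def path_adjustment(out_edges):
    """Remove an outgoing edge u->w when an alternative two-hop search path
    u->v->w already exists. `out_edges` maps node -> list of out-neighbors."""
    pruned = {u: set(vs) for u, vs in out_edges.items()}
    for u in out_edges:
        for w in list(pruned[u]):
            # is w still reachable from u through some other kept neighbor v?
            if any(w in pruned.get(v, ()) for v in pruned[u] if v != w):
                pruned[u].discard(w)
    return pruned

graph = {"a": ["b", "c"], "b": ["c"], "c": []}
print(path_adjustment(graph))   # edge a->c dropped: the path a->b->c remains
```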

Learning to Separate Domains in Generalized Zero-Shot and Open Set Learning: a probabilistic perspective

This paper studies the domain division problem, which aims to segment instances drawn from different probability distributions. Such a problem exists in many previous recognition tasks, such as Open Set Learning (OSL) and Generalized Zero-Shot Learning (G-ZSL), where the testing instances come from either seen or novel/unseen classes with different probability distributions. Previous works either only calibrate the confident predictions of classifiers for seen classes (W-SVM), or treat unseen classes as outliers. In contrast, this paper proposes a probabilistic way of directly estimating and fine-tuning the decision boundary between seen and novel/unseen classes. In particular, we propose a domain division algorithm that learns to split the testing instances into known, unknown and uncertain domains, and then conducts recognition tasks in each domain. Two statistical tools, bootstrapping and the Kolmogorov-Smirnov (K-S) test, are introduced for the first time to discover and fine-tune the decision boundary of each domain. Critically, the uncertain domain is newly introduced in our framework to accommodate those instances whose domain cannot be predicted confidently. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on OSL and G-ZSL benchmarks.

Progressive Weight Pruning of Deep Neural Networks using ADMM

Deep neural networks (DNNs), although achieving human-level performance in many domains, have very large model sizes that hinder their broader application on edge computing devices. Extensive research has been conducted on DNN model compression and pruning. However, most previous work took heuristic approaches. This work proposes a progressive weight pruning approach based on ADMM (Alternating Direction Method of Multipliers), a powerful technique for dealing with non-convex optimization problems with potentially combinatorial constraints. Motivated by dynamic programming, the proposed method reaches extremely high pruning rates by using partial prunings with moderate pruning rates. It thereby resolves the accuracy degradation and long convergence times encountered when pursuing extremely high pruning ratios. It achieves up to a 34 times pruning rate on the ImageNet dataset and a 167 times pruning rate on the MNIST dataset, significantly higher than those reported in prior work. Under the same number of epochs, the proposed method also achieves faster convergence and higher compression rates. The code and pruned DNN models are released at bit.ly/2zxdlss
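
The ADMM machinery behind such pruning alternates a gradient step on the loss, a cheap projection onto the sparsity constraint, and a dual update. The sketch below shows one round for a single layer; `grad_fn`, the learning rate, and `rho` are placeholders, and the paper's progressive aspect would correspond to repeating the whole procedure with successively smaller k.

```python
import numpy as np

def project_topk(M, k):
    """Euclidean projection onto {matrices with at most k nonzeros}:
    keep the k largest-magnitude entries, zero out the rest."""
    out = np.zeros_like(M)
    idx = np.argsort(np.abs(M), axis=None)[-k:]
    out.flat[idx] = M.flat[idx]
    return out

def admm_pruning_round(W, Z, U, grad_fn, k, rho=1e-3, lr=1e-2):
    W = W - lr * (grad_fn(W) + rho * (W - Z + U))  # primal: loss + coupling
    Z = project_topk(W + U, k)                     # combinatorial projection
    U = U + W - Z                                  # dual ascent
    return W, Z, U
```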

Analysis of Railway Accidents’ Narratives Using Deep Learning

Automatic understanding of domain-specific texts in order to extract useful relationships for later use is a non-trivial task. One such relationship would be between railroad accidents’ causes and their corresponding descriptions in reports. From 2001 to 2016, rail accidents in the U.S. cost more than …

Adversarial Balancing for Causal Inference

Biases in observational data pose a major challenge to estimation methods for the effect of treatments. An important technique that accounts for these biases is reweighting samples to minimize the discrepancy between treatment groups. Inverse probability weighting, a popular weighting technique, models the conditional treatment probability given covariates. However, it is overly sensitive to model misspecification and suffers from large estimation variance. Recent methods attempt to alleviate these limitations by finding weights that minimize a selected discrepancy measure between the reweighted populations. We present a new reweighting approach that uses classification error as a measure of similarity between datasets. Our proposed framework uses bi-level optimization to alternately train a discriminator to minimize classification error, and a balancing weights generator to maximize this error. This approach borrows principles from generative adversarial networks (GANs) that aim to exploit the power of classifiers for discrepancy measure estimation. We tested our approach on several benchmarks. The results of our experiments demonstrate the effectiveness and robustness of this approach in estimating causal effects under different data generating settings.
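
A toy version of the bi-level game can be written directly in PyTorch: a linear discriminator is trained to separate treated from control samples, while log-weights on the controls are trained to maximize its error. The dimensions, learning rates, and the linear discriminator are all illustrative assumptions, not the paper's architecture.

```python
import torch

torch.manual_seed(0)
x_t = torch.randn(100, 5) + 0.5               # treated covariates (toy data)
x_c = torch.randn(200, 5)                     # control covariates
disc = torch.nn.Linear(5, 1)                  # discriminator: treated vs control
log_w = torch.zeros(200, requires_grad=True)  # log-weights on controls
opt_d = torch.optim.Adam(disc.parameters(), lr=0.01)
opt_w = torch.optim.Adam([log_w], lr=0.01)
bce = torch.nn.functional.binary_cross_entropy_with_logits

for step in range(500):
    w = torch.softmax(log_w, dim=0) * len(x_c)    # positive weights, mean 1
    ctrl_loss = bce(disc(x_c).squeeze(), torch.zeros(200), reduction="none")
    # the discriminator minimizes classification error on the reweighted data
    loss_d = bce(disc(x_t).squeeze(), torch.ones(100)) + \
             (w.detach() * ctrl_loss).mean()
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # the balancing weights are trained to *maximize* the discriminator's error
    ctrl_loss = bce(disc(x_c).squeeze(), torch.zeros(200), reduction="none")
    loss_w = -(w * ctrl_loss).mean()
    opt_w.zero_grad(); loss_w.backward(); opt_w.step()
```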

Online Learning of Recurrent Neural Architectures by Locally Aligning Distributed Representations

Temporal models based on recurrent neural networks have proven to be quite powerful in a wide variety of applications, including language modeling and speech processing. However, to train these models, one relies on back-propagation through time, which entails unfolding the network over many time steps, making the process of conducting credit assignment considerably more challenging. Furthermore, the nature of back-propagation itself does not permit the use of non-differentiable activation functions and is inherently sequential, making parallelization of the underlying training process very difficult. In this work, we propose the Parallel Temporal Neural Coding Network, a biologically inspired model trained by the local learning algorithm known as Local Representation Alignment, that aims to resolve the difficulties and problems that plague recurrent networks trained by back-propagation through time. Most notably, this architecture requires neither unrolling nor the derivatives of its internal activation functions. We compare our model and learning procedure to other online back-propagation-through-time alternatives (which also tend to be computationally expensive), including real-time recurrent learning, echo state networks, and unbiased online recurrent optimization, and show that it outperforms them on sequence modeling benchmarks such as Bouncing MNIST, a new benchmark we call Bouncing NotMNIST, and Penn Treebank. Notably, our approach can, in some instances, even outperform full back-propagation through time itself as well as variants such as sparse attentive back-tracking. Furthermore, we present promising experimental results that demonstrate our model’s ability to conduct zero-shot adaptation.

Cloud Service Provider Evaluation System using Fuzzy Rough Set Technique

Cloud Service Providers (CSPs) offer a wide variety of scalable, flexible, and cost-efficient services to cloud customers on demand, on a pay-per-use basis. However, the vast diversity of available cloud services makes it challenging for users to determine and select the most suitable service. Users also sometimes need to hire services from multiple CSPs, which introduces difficulties in managing interfaces, accounts, security, support, and Service Level Agreements (SLAs). To circumvent such problems, a Cloud Service Broker (CSB) that is aware of service offerings and users’ Quality of Service (QoS) requirements benefits both the CSPs and the users. In this work, we propose a Fuzzy Rough Set based Cloud Service Brokerage Architecture, which is responsible for ranking and selecting services based on users’ QoS requirements, and finally monitoring service execution. We use the fuzzy rough set technique for dimension reduction, and weighted Euclidean distance to rank the CSPs. To prioritize user QoS requests, we use user-assigned weights, and we also incorporate system-assigned weights to capture the relative importance of QoS attributes. We compared the proposed ranking technique with an existing method based on system response time. The case study results show that the proposed approach is scalable, resilient, and produces better results with shorter search times.
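
The ranking step, as described, reduces to a weighted Euclidean distance between each provider's QoS vector and the user's request. A toy illustration follows; the attribute names, values, and weights are made up for the example.

```python
import numpy as np

# Each row is one CSP's normalized QoS attribute vector; providers closest
# (in weighted Euclidean distance) to the requested QoS rank highest.
qos = np.array([[0.9, 0.7, 0.8],     # CSP A: availability, throughput, support
                [0.6, 0.9, 0.5],     # CSP B
                [0.8, 0.8, 0.9]])    # CSP C
request = np.array([0.85, 0.75, 0.9])   # user's desired QoS levels
weights = np.array([0.5, 0.2, 0.3])     # user-assigned plus system-assigned

dist = np.sqrt(((qos - request) ** 2 * weights).sum(axis=1))
ranking = np.argsort(dist)              # best provider first
print("provider ranking:", ranking, "distances:", np.round(dist, 3))
```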

Hierarchical Methods of Moments

Spectral methods of moments provide a powerful tool for learning the parameters of latent variable models. Despite their theoretical appeal, the applicability of these methods to real data is still limited due to a lack of robustness to model misspecification. In this paper we present a hierarchical approach to methods of moments to circumvent such limitations. Our method is based on replacing the tensor decomposition step used in previous algorithms with approximate joint diagonalization. Experiments on topic modeling show that our method outperforms previous tensor decomposition methods in terms of speed and model quality.

One-Shot Observation Learning

Observation learning is the process of learning a task by observing an expert demonstrator. We present a robust observation learning method for robotic systems. Our principal contributions are a one-shot learning method, in which only a single demonstration is needed for learning, and a novel feature extraction method for extracting unique activity features from the demonstration. Reward values are then generated from these demonstrations. We use a learning algorithm with these rewards to learn the controls for a robotic manipulator to perform the demonstrated task. With simulation and real-robot experiments, we show that the proposed method can learn tasks from a single demonstration under varying conditions of viewpoint, object properties, manipulator morphology and scene background.

Combine Statistical Thinking With Scientific Practice: A Protocol of a Bayesian Thesis Project For Undergraduate Students

Current developments in the statistics community suggest that modern statistics education should be structured holistically, i.e., by allowing students to work with real data and answer concrete statistical questions, but also by educating them about alternative statistical frameworks, such as Bayesian statistics. In this article, we describe how we incorporated such a holistic structure in a Bayesian thesis project on ordered binomial probabilities. The project was targeted at undergraduate students in psychology with basic knowledge in Bayesian statistics and programming, but no formal mathematical training. The thesis project aimed to (1) convey the basic mathematical concepts of Bayesian inference, (2) let students experience the entire empirical cycle including the collection, analysis, and interpretation of data, and (3) teach students open science practices.

Machine Common Sense Concept Paper

This paper summarizes some of the technical background, research ideas, and possible development strategies for achieving machine common sense. Machine common sense has long been a critical-but-missing component of Artificial Intelligence (AI). Recent advances in machine learning have resulted in new AI capabilities, but in all of these applications, machine reasoning is narrow and highly specialized. Developers must carefully train or program systems for every situation. General commonsense reasoning remains elusive. The absence of common sense prevents intelligent systems from understanding their world, behaving reasonably in unforeseen situations, communicating naturally with people, and learning from new experiences. Its absence is perhaps the most significant barrier between the narrowly focused AI applications we have today and the more general, human-like AI systems we would like to build in the future. Machine common sense remains a broad, potentially unbounded problem in AI. A wide range of strategies could be employed to make progress on this difficult challenge. This paper discusses two diverse strategies for focusing development on two different machine commonsense services: (1) a service that learns from experience, like a child, to construct computational models that mimic the core domains of child cognition for objects (intuitive physics), agents (intentional actors), and places (spatial navigation); and (2) a service that learns from reading the Web, like a research librarian, to construct a commonsense knowledge repository capable of answering natural language and image-based questions about commonsense phenomena.

On Evaluating Embedding Models for Knowledge Base Completion

Knowledge bases contribute to many artificial intelligence tasks, yet they are often incomplete. To add missing facts to a given knowledge base, various embedding models have been proposed in the recent literature. Perhaps surprisingly, relatively simple models with limited expressiveness often perform remarkably well under today’s most commonly used evaluation protocols. In this paper, we explore whether recent embedding models work well for knowledge base completion tasks and argue that the current evaluation protocols are more suited to question answering than to knowledge base completion. We show that under an alternative evaluation protocol, more suitable for knowledge base completion, the performance of all models is unsatisfactory. This indicates the need for more research into embedding models and evaluation protocols for knowledge base completion.
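
For context, today's most common protocol scores one missing argument of a test triple against all entities and reports the ‘filtered’ rank of the true answer, which is exactly the answer-ranking setup the authors liken to question answering. A minimal sketch, with `score_fn` and `known_triples` as placeholders for a trained model and the known facts:

```python
def filtered_rank(score_fn, head, relation, true_tail,
                  all_entities, known_triples):
    """Standard 'filtered' entity-ranking protocol: score every candidate
    tail, skip competing answers known to be true, and report the rank of
    the target tail (rank 1 is best)."""
    target = score_fn(head, relation, true_tail)
    rank = 1
    for e in all_entities:
        if e == true_tail or (head, relation, e) in known_triples:
            continue  # filter out other correct answers
        if score_fn(head, relation, e) > target:
            rank += 1
    return rank
```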

Pruning Deep Neural Networks using Partial Least Squares

A Self-adaptive Agent-based System for Cloud Platforms

Cloud computing is a model for enabling on-demand network access to a shared pool of computing resources that can be dynamically allocated and released with minimal effort. However, this task can be complex in highly dynamic environments, with various resources to allocate for an increasing number of different user requirements. In this work, we propose a Cloud architecture based on a multi-agent system exhibiting self-adaptive behavior to address dynamic resource allocation. This self-adaptive system follows a MAPE-K approach to reason and act, according to QoS, Cloud service information, and propagated run-time information, to detect QoS degradation and make better resource allocation decisions. We validate our proposed Cloud architecture by simulation. Results show that it can properly allocate resources to reduce energy consumption while satisfying the users’ demanded QoS.

Shrinkage estimation of rate statistics

This paper presents a simple shrinkage estimator of rates based on Bayesian methods. Our focus is on crime rates as a motivating example. The estimator shrinks each town’s observed crime rate toward the country-wide average crime rate according to town size. Through realistic simulations, we confirm that the proposed estimator outperforms the maximum likelihood estimator in terms of global risk. We also show that it has better coverage properties.
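
A minimal empirical-Bayes sketch conveys the idea, though the paper's exact prior and fitting procedure may differ: with a Poisson likelihood and a moment-matched Gamma prior, each town's posterior-mean rate is pulled toward the country-wide rate, and small towns are shrunk the most. The counts and populations below are made-up toy data.

```python
import numpy as np

counts = np.array([2, 15, 80, 300])          # crimes observed per town
pop = np.array([1e3, 8e3, 5e4, 2e5])         # town populations
rate = counts / pop
global_rate = counts.sum() / pop.sum()       # country-wide average rate

# Gamma(a, b) prior on rates fitted by moments (Poisson likelihood):
# posterior mean = (a + count) / (b + population), with a / b = global_rate.
b = global_rate / max(np.var(rate), 1e-12)   # moment-matched prior strength
a = global_rate * b
shrunk = (a + counts) / (b + pop)            # small towns move most
print(np.round(rate * 1e3, 3), "->", np.round(shrunk * 1e3, 3))  # per 1,000
```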

Structural Equation Modeling and simultaneous clustering through the Partial Least Squares algorithm

The identification of different homogeneous groups of observations and their appropriate analysis in PLS-SEM has become a critical issue in many application fields. Usually, both SEM and PLS-SEM assume homogeneity across all units on which the model is estimated, and the segmentation approaches present in the literature consist of estimating separate models for segments of statistical units that have been assigned to a priori defined groups. However, these approaches are not fully satisfactory because no causal structure among the variables is postulated. In other words, a modeling approach should be used in which the obtained clusters are homogeneous with respect to the structural causal relationships. In this paper, a new methodology for simultaneous non-hierarchical clustering and PLS-SEM is proposed. This methodology is motivated by the fact that the sequential approach of first applying SEM or PLS-SEM and then a clustering algorithm such as K-means on the latent scores may fail to find the correct clustering structure in the data. A simulation study and an application on real data are included to evaluate the performance of the proposed methodology.

• Conceptual Collectives
• All-Optical FSO Relaying Under Mixture-Gamma Fading Channels and Pointing Errors
• Stochastic homogenization for a diffusion-reaction model
• Compressed Randomized UTV Decompositions for Low-Rank Approximations and Big Data Applications
• Constructing sparse Davenport-Schinzel sequences by hypergraph edge coloring
• Convex Analysis for LQG Systems with Applications to Major Minor LQG Mean Field Game Systems
• Deep Learning Based Power Control for Quality-Driven Wireless Video Transmissions
• A Mobile Ad hoc Cloud Computing and Networking Infrastructure for Automated Video Surveillance System
• Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018
• Malware triage for early identification of Advanced Persistent Threat activities
• Modeling and Analysis of Wildfire Detection using Wireless Sensor Network with Poisson Deployment
• Study of Sparsity-Aware Subband Adaptive Filtering Algorithms with Adjustable Penalties
• Reinforcement Learning Decoders for Fault-Tolerant Quantum Computation
• A Subsampling Line-Search Method with Second-Order Results
• Cross-Modal and Hierarchical Modeling of Video and Text
• Strategic heuristics underlie animal dominance hierarchies and provide evidence of group-level social knowledge
• Accounting for Unobservable Heterogeneity in Cross Section Using Spatial First Differences
• Hierarchical Generative Modeling for Controllable Speech Synthesis
• Generalized Derangements and Anagrams Without Fixed Letters
• Minimizing Inputs for Strong Structural Controllability
• Optimal Network Topology Design in Composite Systems with Constrained Neighbors for Structural Controllability
• Integrating kinematics and environment context into deep inverse reinforcement learning for predicting off-road vehicle trajectories
• Optimal Cache Allocation for Named Data Caching under Network-Wide Capacity Constraint
• Hybrid Feature Based SLAM Prototype
• Conceptual Analysis of Hypertext
• A Retrieval Framework and Implementation for Electronic Documents with Similar Layouts
• A probabilistic analysis of a continuous-time evolution in recombination
• Statistical classification for partially observed functional data via filtering
• Assessing the distribution of discrete survival time in presence of recall error
• Reduced-Gate Convolutional LSTM Using Predictive Coding for Spatiotemporal Prediction
• Ensemble Inhibition and Excitation in the Human Cortex: an Ising Model Analysis with Uncertainties
• Nearly Optimal Space Efficient Algorithm for Depth First Search
• Deep-Waveform: A Learned OFDM Receiver Based on Deep Complex Convolutional Networks
• Operationalizing Conflict and Cooperation between Automated Software Agents in Wikipedia: A Replication and Expansion of ‘Even Good Bots Fight’
• A New Characterization of $\mathcal{V}$-Posets
• The structure of low-complexity Gibbs measures on product spaces
• Gender Bias in Nobel Prizes
• List Decoding of Deletions Using Guess & Check Codes
• Optimal locally private estimation under $\ell_p$ loss for $1\le p\le 2$
• Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron
• The Strassen Invariance Principle for Certain Non-stationary Markov-Feller Chains
• Peek Search: Near-Optimal Online Markov Decoding
• A Cohomology Theory for Planar Trivalent Graphs with Perfect Matchings
• Solving Tree Problems with Category Theory
• Deep neural network based i-vector mapping for speaker verification using short utterances
• Prediction of Atomization Energy Using Graph Kernel and Active Learning
• Finding Options that Minimize Planning Time
• Tetrahedra and Relative Directions in Space, Using 2 and 3-Space Simplexes for 3-Space Localization
• Improved return level estimation via a weighted likelihood, latent spatial extremes model
• Exploring Sentence Vector Spaces through Automatic Summarization
• A note on the asymptotics of the number of O-sequences of given length
• Structure and enumeration results of matchable Lucas cubes
• A class of non-matchable distributive lattices
• Small-Deviation Inequalities for Sums of Random Matrices
• On the computational complexity of MSTD sets
• Learning in Non-convex Games with an Optimization Oracle
• A Span-Extraction Dataset for Chinese Machine Reading Comprehension
• Simple Regret Minimization for Contextual Bandits
• XJTLUIndoorLoc: A New Fingerprinting Database for Indoor Localization and Trajectory Estimation Based on Wi-Fi RSS and Geomagnetic Field
• The rank of random matrices over finite fields
• Sequence to Sequence Mixture Model for Diverse Machine Translation
• Optimization over time-varying directed graphs with row and column-stochastic matrices
• Recognizing Partial Biometric Patterns
• Data-driven identification of a thermal network in multi-zone building
• Optimal Covariance Estimation for Condition Number Loss in the Spiked Model
• Embarrassingly Simple Model for Early Action Proposal
• Fast and Longest Rollercoasters
• Learning an MR acquisition-invariant representation using Siamese neural networks
• Learning to quantify emphysema extent: What labels do we need?
• EMHMM Simulation Study
• A graph theoretic characterization of the classical generalized hexagon on $364$ vertices
• Multi-Stage Robust Transmission Constrained Unit Commitment: A Decomposition Framework with Implicit Decision Rules
• Generalized Earthquake Frequency-Magnitude Distribution Described by Asymmetric Laplace Mixture Modelling
• Reverse engineering of CAD models via clustering and approximate implicitization
• Another construction of edge-regular graphs with regular cliques
• Exploring Textual and Speech information in Dialogue Act Classification with Speaker Domain Adaptation
• Does the public discuss other topics on climate change than researchers? A comparison of networks based on author keywords and hashtags
• A Study of Efficient Energy Management Techniques for Cloud Computing Environment
• What might matter in autonomous cars adoption: first person versus third person scenarios
• Halfway to Rota’s basis conjecture
• On Kahn’s basis conjecture
• Maneuver-Based Generation of Motion Primitives for Differentially Constrained Motion Planning in State Lattices
• Provable Robustness of ReLU networks via Maximization of Linear Regions
• Offline Signature Verification by Combining Graph Edit Distance and Triplet Networks
• When does Bone Suppression and Lung Field Segmentation Improve Chest X-Ray Disease Classification?
• Domain-Informed Spline Interpolation
• Mixed-Timescale Online PHY Caching for Dual-Mode MIMO Cooperative Networks
• Existence of densities for stochastic differential equations driven by Lévy processes with anisotropic jumps
• $k$-Servers with a Smile: Online Algorithms via Projections
• Mean survival by ordered fractions of population with censored data
• An EPTAS for machine scheduling with bag-constraints
• Cyber Threat Impact Analysis to Air Traffic Flows Through Dynamic Queue Networks
• Convergence of blanket times for sequences of random walks on critical random graphs
• Optimizing Beams and Bits: A Novel Approach for Massive MIMO Base-Station Design
• Virtual Wave Optics for Non-Line-of-Sight Imaging
• Learning and Tracking the 3D Body Shape of Freely Moving Infants from RGB-D sequences
• Delocalization and ergodicity of the Anderson model on Bethe lattices
• Generating Event Triggers Based on Hilbert-Huang Transform and Its Application to Gravitational-Wave Data
• Modelling project failure and its mitigation in a time-stamped network of interrelated tasks
• Algorithms and Fundamental Limits for Unlabeled Detection using Types
• Path-based measures of expansion rates and Lagrangian transport in stochastic flows
• Efficient Proximal Mapping Computation for Unitarily Invariant Low-Rank Inducing Norms
• Elastic Scattering Time of Matter-Waves in Disordered Potentials
• The Finite Embedding Property for IP Loops and Local Embeddability of Groups into Finite IP Loops
• Mode Division Multiplexing (MDM) Weight Bank Design for Use in Photonic Neural Networks
• Payment Network Design with Fees
• Trajectories in random minimal transposition factorizations
• Uniform Graphical Convergence of Subgradients in Nonconvex Optimization and Learning
• An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation
• Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition
• Algorithmic Blockchain Channel Design
• Adversarial Contract Design for Private Data Commercialization
• A Consistent LM Type Specification Test for Semiparametric Models
• Searching for collective behavior in a small brain
• Enhanced Power Graphs of Finite Groups
• Properties of Constacyclic Codes Under the Schur Product
• Bayesian wavelet de-noising with the caravan prior
• Security Attacks on Smart Grid Scheduling and Their Defences: A Game-Theoretic Approach
• Dynkin games with incomplete and asymmetric information
• Bayesian Estimation Based Load Modeling Report
• Covert Capacity of Non-Coherent Rayleigh-Fading Channels
