GritNet 2: Real-Time Student Performance Prediction with Domain Adaptation
The increasingly fast development and update cycles of online course content, together with the diverse demographics of students in each online classroom, make real-time student performance prediction (before a course finishes) an interesting topic for both industrial research and practical needs. We therefore tackle real-time student performance prediction for ongoing courses in a domain adaptation framework, in which a system trained on students' labeled outcomes from one previous course is deployed on another. In particular, we first review the recently developed GritNet architecture, the current state of the art for student performance prediction, and then introduce a new unsupervised domain adaptation method to transfer a GritNet trained on a past course to a new course without any (student outcome) labels. Our results on real Udacity students' graduation predictions show that GritNet not only generalizes well from one course to another across different Nanodegree programs, but also improves real-time predictions, particularly in the first few weeks when accurate predictions are most challenging.
Visual Diagnostics for Deep Reinforcement Learning Policy Development
An Inexact First-order Method for Constrained Nonlinear Optimization
The primary focus of this paper is on designing inexact first-order methods for solving large-scale constrained nonlinear optimization problems. By controlling the inexactness of the subproblem solution, we can significantly reduce the computational cost of each iteration. A penalty parameter updating strategy during the subproblem solve enables the algorithm to automatically detect infeasibility. Global convergence is proved for both the feasible and infeasible cases. A complexity analysis for the KKT residual is also derived under loose assumptions. Numerical experiments demonstrate the ability of the proposed algorithm to rapidly find inexact optimal solutions at low computational cost.
Document Informed Neural Autoregressive Topic Models with Distributional Prior
We address two challenges in topic models: (1) Context information around words helps in determining their actual meaning, e.g., 'networks' used in the contexts of artificial neural networks vs. biological neuron networks. Generative topic models infer topic-word distributions, taking little or no context into account. Here, we extend a neural autoregressive topic model to exploit the full context information around words in a document in a language modeling fashion. The proposed model is named iDocNADE. (2) Due to the small number of word occurrences (i.e., lack of context) in short texts, and data sparsity in a corpus of few documents, applying topic models to such texts is challenging. Therefore, we propose a simple and efficient way of incorporating external knowledge into neural autoregressive topic models: we use embeddings as a distributional prior. The proposed variants are named DocNADE2 and iDocNADE2. We present novel neural autoregressive topic model variants that consistently outperform state-of-the-art generative topic models in terms of generalization, interpretability (topic coherence), and applicability (retrieval and classification) over 6 long-text and 8 short-text datasets from diverse domains.
Improvements on Hindsight Learning
Sparse reward problems are one of the biggest challenges in reinforcement learning. Goal-directed tasks are one such class of sparse reward problems, where a reward signal is received only when the goal is reached. One promising way to train an agent to perform goal-directed tasks is to use hindsight learning approaches. In these approaches, even when an agent fails to reach the desired goal, it learns to reach the goal it achieved instead. By doing this over multiple trajectories and generalizing the policy learned from the achieved goals, the agent learns a goal-conditioned policy that can reach any goal. One such approach is Hindsight Experience Replay, which uses an off-policy reinforcement learning algorithm to learn a goal-conditioned policy; in this approach, past transitions are replayed uniformly at random. Another approach is to use a hindsight version of the policy gradient to directly learn a policy. In this work, we discuss different ways to replay past transitions to improve learning in hindsight experience replay, focusing on prioritized variants in particular. We also apply hindsight policy gradient methods to robotic tasks.
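For concreteness, here is a minimal sketch of the hindsight relabeling step at the heart of Hindsight Experience Replay: transitions are stored again with their goal replaced by a goal actually achieved later in the same episode. The transition layout, the 'future' sampling strategy, and the sparse reward function are illustrative assumptions, not the specific design studied in this work.

```python
import random

def her_relabel(episode, reward_fn, k=2):
    """Hindsight relabeling: for each transition, also store copies whose
    goal is replaced by a goal actually achieved later in the episode.

    `episode` is a list of dicts with keys: obs, action, next_obs,
    achieved_goal, desired_goal (an illustrative layout)."""
    relabeled = []
    for t, tr in enumerate(episode):
        # Keep the original transition with its original (sparse) reward.
        r = reward_fn(tr["achieved_goal"], tr["desired_goal"])
        relabeled.append({**tr, "reward": r})
        # 'future' strategy: sample k achieved goals from later steps.
        future = episode[t:]
        for _ in range(k):
            g = random.choice(future)["achieved_goal"]
            r = reward_fn(tr["achieved_goal"], g)  # often a success now
            relabeled.append({**tr, "desired_goal": g, "reward": r})
    return relabeled

# Example sparse reward: success iff the achieved goal matches the goal.
reward = lambda ag, g: 0.0 if ag == g else -1.0
episode = [{"obs": i, "action": 0, "next_obs": i + 1,
            "achieved_goal": i + 1, "desired_goal": 9} for i in range(5)]
buffer = her_relabel(episode, reward_fn=reward)
```

Prioritized variants, as discussed in the paper, would then sample from this buffer non-uniformly instead of uniformly at random.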
Decision-support for the Masses by Enabling Conversations with Open Data
Open data refers to data that is freely available for reuse. Although the availability of open data to the public has increased rapidly in the last decade, this has not translated into better decision-support tools for them. We propose intelligent conversation generators as a grand challenge: systems that would automatically create data-driven conversation interfaces (CIs), also known as chatbots or dialog systems, from open data and deliver personalized analytical insights to users based on their contextual needs. Such generators will not only help bring Artificial Intelligence (AI)-based solutions for important societal problems to the masses but also advance AI itself by providing an integrative testbed for human-centric AI and filling gaps in the state of the art towards this aim.
Déjà Vu: an empirical evaluation of the memorization properties of ConvNets
Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and dropout are employed to mitigate overfitting. This paper considers the related question of 'membership inference', where the goal is to determine if an image was used during training. We consider it from three complementary angles. We show how to detect which dataset was used to train a model, and in particular whether some validation images were used at train time. We then analyze explicit memorization and extend classical random-label experiments to the problem of learning a model that predicts if an image belongs to an arbitrary set. Finally, we propose a new approach to infer membership when a few of the top layers are not available or have been fine-tuned, and show that lower layers still carry information about the training samples. To support our findings, we conduct large-scale experiments on ImageNet and subsets of YFCC-100M with modern architectures such as VGG and ResNet.
Fast embedding of multilayer networks: An algorithm and application to group fMRI
Learning interpretable features from complex multilayer networks is a challenging and important problem. The need for such representations is particularly evident in multilayer networks of the brain, where nodal characteristics may help model and differentiate regions of the brain according to individual, cognitive task, or disease. Motivated by this problem, we introduce the multi-node2vec algorithm, an efficient and scalable feature engineering method that automatically learns continuous node feature representations from multilayer networks. Multi-node2vec relies upon a second-order random walk sampling procedure that efficiently explores the intra- and inter-layer ties of the observed multilayer network to identify multilayer neighborhoods. Maximum likelihood estimators of the nodal features are then obtained by applying the Skip-gram neural network model to the collection of sampled neighborhoods. We investigate the conditions under which multi-node2vec approximates a closed-form matrix factorization problem. We demonstrate the efficacy of multi-node2vec on a multilayer functional brain network built from resting-state fMRI scans of 74 healthy individuals. We find that multi-node2vec outperforms contemporary methods on complex networks, and that it identifies nodal characteristics that closely associate with the functional organization of the brain.
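To illustrate the neighborhood sampling idea, here is a deliberately simplified first-order multilayer walk in Python; multi-node2vec itself uses a second-order, node2vec-style walk, and the layer-switch parameter r below is a hypothetical stand-in for its actual walk parameters.

```python
import random

def multilayer_walk(layers, start, walk_len=10, r=0.5):
    """Simplified multilayer random walk. `layers` maps layer id ->
    adjacency dict (node -> list of neighbors). With probability r the
    walk switches to a uniformly chosen layer containing the current
    node before taking the next step, so sampled neighborhoods mix
    intra- and inter-layer ties."""
    node, layer = start
    walk = [node]
    for _ in range(walk_len - 1):
        if random.random() < r:
            candidates = [l for l, adj in layers.items() if node in adj]
            layer = random.choice(candidates)
        nbrs = layers[layer].get(node, [])
        if not nbrs:
            break
        node = random.choice(nbrs)
        walk.append(node)
    return walk  # walks are then fed to a Skip-gram model (e.g., word2vec)

layers = {0: {"a": ["b"], "b": ["a", "c"], "c": ["b"]},
          1: {"b": ["c"], "c": ["b"]}}
print(multilayer_walk(layers, start=("a", 0)))
```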
Self Configuration in Machine Learning
In this paper we first present a class of algorithms for training multi-layer neural networks with a quadratic cost function, one layer at a time, starting from the input layer. The algorithm is based on the fact that, for any layer to be trained, the effect of a direct connection to an optimized linear output layer can be computed without the connection being made. Thus, starting from the input layer, we can train each layer in succession, in isolation from the other layers. Once trained, a layer's weights are kept fixed, and its outputs serve as the inputs to the next layer to be trained. The result is a very fast algorithm. The simplicity of this training arrangement allows the activation function and the step size in weight adjustment to be adaptive and self-adjusting. Furthermore, the stability of the training process allows relatively large steps to be taken, achieving even greater speed. Finally, in our context, configuring the network means determining the number of outputs for each layer. By decomposing the overall cost function into separate components related to approximation and estimation, we obtain an optimization formula for determining the number of outputs for each layer. With the ability to self-configure and set parameters, we have not just a fast training algorithm, but the ability to automatically build a fully trained deep neural network starting with nothing more than data.
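The following toy sketch conveys the greedy layer-by-layer scheme: each candidate layer is scored through a closed-form least-squares linear readout, then frozen, and its outputs become the next layer's inputs. The random-perturbation weight update is a placeholder for the paper's actual training rule, which the abstract does not spell out.

```python
import numpy as np

def fit_linear_readout(H, Y):
    """Closed-form least-squares output layer: scores how well the hidden
    representation H predicts Y (the 'direct connection to an optimized
    linear output layer' can be evaluated without permanently making it)."""
    W, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return W, float(np.mean((H @ W - Y) ** 2))

def train_layer(X, Y, width, steps=200, lr=0.1, seed=0):
    """Toy greedy layer training: perturb one layer's weights, keep the
    perturbation if the quadratic cost through the readout drops."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(X.shape[1], width))
    for _ in range(steps):
        Wp = W + rng.normal(scale=lr, size=W.shape)
        if fit_linear_readout(np.tanh(X @ Wp), Y)[1] < \
           fit_linear_readout(np.tanh(X @ W), Y)[1]:
            W = Wp
    return W

# Build a network greedily: each trained, frozen layer feeds the next.
X = np.random.default_rng(1).normal(size=(256, 8))
Y = np.sin(X.sum(axis=1, keepdims=True))
H = X
for width in (16, 16):
    W = train_layer(H, Y, width)
    H = np.tanh(H @ W)          # frozen layer output -> next layer input
readout, mse = fit_linear_readout(H, Y)
print("final training MSE:", mse)
```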
Active Anomaly Detection via Ensembles
In critical applications of anomaly detection, including computer security and fraud prevention, the anomaly detector must be configurable by the analyst to minimize wasted effort on false positives. One important way to configure the anomaly detector is by providing true labels for a few instances. We study the problem of label-efficient active learning to automatically tune anomaly detection ensembles, and make four main contributions. First, we present an important insight into how anomaly detector ensembles are naturally suited for active learning. This insight allows us to relate the greedy querying strategy to uncertainty sampling, with implications for label efficiency. Second, we present a novel formalism called compact description to describe the discovered anomalies, and show that it can also be employed to improve the diversity of the instances presented to the analyst without loss in the anomaly discovery rate. Third, we present a novel data drift detection algorithm that not only detects drift robustly but also allows us to take corrective actions to adapt the detector in a principled manner. Fourth, we present extensive experiments evaluating our insights and algorithms in both batch and streaming settings. Our results show that, in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms in the streaming-data setup are competitive with the batch setup.
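A toy sketch of the greedy querying loop with a score ensemble may help fix ideas: the analyst is always shown the highest-scoring unlabeled instance, and detector weights are nudged according to the returned label. The weight-update rule below is an illustrative stand-in; the paper's precise update is not given in the abstract.

```python
import numpy as np

def greedy_active_loop(S, oracle, budget=10, lr=0.5):
    """S: (n_instances, n_detectors) anomaly scores from an ensemble.
    Greedy querying: show the analyst the instance with the highest
    combined score, then nudge detector weights up or down depending on
    whether the label confirmed the anomaly. A toy stand-in for the
    paper's update rule."""
    n, m = S.shape
    w = np.ones(m) / m                    # start from the uniform ensemble
    labeled = {}
    for _ in range(budget):
        combined = S @ w
        combined[list(labeled)] = -np.inf      # never re-query
        i = int(np.argmax(combined))           # greedy query
        y = labeled[i] = oracle(i)             # analyst: 1 anomaly, 0 not
        w += lr * (1 if y else -1) * S[i]      # reward agreeing detectors
        w = np.clip(w, 1e-6, None); w /= w.sum()
    return w, labeled

rng = np.random.default_rng(0)
S = rng.random((100, 5))
w, labeled = greedy_active_loop(S, oracle=lambda i: int(i < 10))
```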
Transfer Entropy in MDPs with Temporal Logic Specifications
Emerging applications in autonomy require control techniques that take into account uncertain environments and communication and sensing constraints while satisfying high-level mission specifications. Motivated by this need, we consider a class of Markov decision processes (MDPs) together with a transfer entropy cost function. In this context, we express high-level mission specifications as co-safe linear temporal logic (LTL) formulae. We provide a method to synthesize a policy that minimizes the weighted sum of the transfer entropy and the probability of failing to satisfy the specification. We derive a set of coupled nonlinear equations that an optimal policy must satisfy, and then use a modified Arimoto-Blahut algorithm to solve them. Finally, we demonstrate the proposed method on a navigation and path-planning scenario of a Mars rover.
Least Inferable Policies for Markov Decision Processes
In a variety of applications, an agent's success depends on the knowledge that an adversarial observer has, or can gather, about the agent's decisions. It is therefore desirable for the agent to achieve a task while reducing the observer's ability to infer the agent's policy. We consider the agent's task as a reachability problem in a Markov decision process and study the synthesis of policies that minimize the observer's ability to infer the agent's transition probabilities between the states of the Markov decision process. We introduce a metric based on the Fisher information as a proxy for the information leaked to the observer and, using this metric, formulate a problem that minimizes the expected total information subject to the reachability constraint. We then solve the problem using convex optimization methods. To verify the proposed method, we analyze the relationship between the expected total information and the observer's estimation error, and show that, for a particular class of Markov decision processes, these two values are inversely proportional.
On Misinformation Containment in Online Social Networks
HashTran-DNN: A Framework for Enhancing Robustness of Deep Neural Networks against Adversarial Malware Samples
Adversarial machine learning in the context of image processing and related applications has received a large amount of attention. However, adversarial machine learning, and especially adversarial deep learning, in the context of malware detection has received much less attention despite its apparent importance. In this paper, we present a framework for enhancing the robustness of Deep Neural Networks (DNNs) against adversarial malware samples, dubbed Hashing Transformation Deep Neural Networks (HashTran-DNN). The core idea is to use hash functions with a certain locality-preserving property to transform samples, enhancing the robustness of DNNs in malware classification. The framework further uses a Denoising Auto-Encoder (DAE) regularizer to reconstruct the hash representations of samples, making the resulting DNN classifiers capable of retaining locality information in the latent space. We experiment with two concrete instantiations of the HashTran-DNN framework for classifying Android malware. Experimental results show that four known attacks can render standard DNNs useless for classifying Android malware, that known defenses can defend against at most three of the four attacks, and that HashTran-DNN can effectively defend against all four attacks.
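As one concrete example of a locality-preserving hash of the kind the framework relies on, the sketch below uses sign random projections, under the assumption that inputs are real-valued feature vectors; the paper's own instantiations may use different hash families.

```python
import numpy as np

def signed_random_projection(X, n_bits=64, seed=0):
    """Locality-preserving hash via sign random projections: nearby
    feature vectors receive hash codes with small Hamming distance, so
    small adversarial perturbations of a malware feature vector tend to
    leave most bits unchanged. (One possible hash family; HashTran-DNN's
    exact instantiations may differ.)"""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], n_bits))  # fixed random hyperplanes
    return (X @ R > 0).astype(np.float32)      # bit codes fed to the DNN

X = np.random.default_rng(1).random((4, 1000))   # e.g. app feature vectors
H = signed_random_projection(X)                  # (4, 64) hash representation
```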
Range entropy: A bridge between signal complexity and self-similarity
Analysis of Bag-of-n-grams Representation’s Properties Based on Textual Reconstruction
Despite its simplicity, the bag-of-n-grams sentence representation has been found to excel in some NLP tasks. However, it has not received much attention in recent years, and further analysis of its properties is needed. We propose a framework to investigate the amount and type of information captured in a general-purpose bag-of-n-grams sentence representation. We first use sentence reconstruction as a tool to obtain a bag-of-n-grams representation that contains general information about the sentence. We then run prediction tasks (sentence length, word content, phrase content, and word order) on the obtained representation to examine the specific types of information it captures. Our analysis demonstrates that the bag-of-n-grams representation does contain sentence-structure-level information. However, incorporating n-grams of higher order n empirically helps little with encoding more information in general, except for phrase content information.
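For reference, constructing the underlying bag-of-n-grams representation is a one-liner with standard tooling; the sketch below (using scikit-learn's CountVectorizer) only builds the representation, with the paper's reconstruction and probing models sitting on top of it.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Bag-of-n-grams with n up to 3: each sentence becomes a sparse count
# vector over unigrams, bigrams, and trigrams.
sentences = ["the cat sat on the mat", "the mat sat on the cat"]
vec = CountVectorizer(ngram_range=(1, 3))
X = vec.fit_transform(sentences)

# Higher-order n-grams are what let the representation encode word order:
# these two sentences share all unigrams but differ in bi-/trigrams.
print(X.shape, (X[0] != X[1]).nnz > 0)
```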
Actionable Recourse in Linear Classification
Classification models are often used to make decisions that affect humans: whether to approve a loan application, extend a job offer, or provide insurance. In such applications, individuals should have the ability to change the decision of the model. When a person is denied a loan by a credit scoring model, for example, they should be able to change the input variables of the model in a way that will guarantee approval. Otherwise, this person will be denied the loan so long as the model is deployed, and — more importantly — will lack agency over a decision that affects their livelihood. In this paper, we propose to audit a linear classification model in terms of recourse, which we define as the ability of a person to change the decision of the model through actionable input variables (e.g., income vs. gender, age, or marital status). We present an integer programming toolkit to: (i) measure the feasibility and difficulty of recourse in a target population; and (ii) generate a list of actionable changes for an individual to obtain a desired outcome. We demonstrate how our tools can inform practitioners, policymakers, and consumers by auditing credit scoring models built using real-world datasets. Our results illustrate how recourse can be significantly impacted by common modeling practices, and motivate the need to guarantee recourse as a policy objective for regulation in algorithmic decision-making.
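To make the recourse audit concrete, here is a toy version for a linear classifier: it searches over small integer changes to the actionable features only and returns the cheapest action that flips the decision. The paper formulates this as an integer program; the brute-force enumeration and L1 cost below are illustrative simplifications.

```python
import itertools
import numpy as np

def cheapest_recourse(x, w, b, actionable, deltas=range(-3, 4)):
    """Toy recourse search for a linear classifier sign(w.x + b):
    enumerate small integer changes to the actionable features only
    (immutable ones such as gender or age stay fixed) and return the
    lowest-cost action that flips the decision. None means no recourse
    exists within this action set."""
    best = None
    for combo in itertools.product(deltas, repeat=len(actionable)):
        a = np.zeros_like(x, dtype=float)
        a[actionable] = combo
        if w @ (x + a) + b >= 0:                  # decision flipped
            cost = float(np.abs(combo).sum())     # simple L1 action cost
            if best is None or cost < best[0]:
                best = (cost, a)
    return best

w = np.array([0.8, 0.5, -0.3]); b = -4.0
x = np.array([2.0, 1.0, 5.0])                     # currently denied
print(cheapest_recourse(x, w, b, actionable=[0, 1]))
```

Auditing a population then amounts to running this search for every denied individual and reporting how often recourse exists and how costly it is.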
Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network
Black-box discrete optimization (BBDO) appears in a wide range of engineering tasks. Evolutionary and other BBDO approaches have been applied with the aim of automating the tuning of system parameters, such as the hyperparameters of machine learning based systems being installed for a specific task. However, this automation is often jeopardized by the BBDO algorithms' own strategy parameters, which an expert with domain knowledge must tune in a time-consuming process. This paper proposes a parameterless BBDO algorithm based on information geometric optimization, a recent framework for black-box optimization using the stochastic natural gradient. Inspired by theoretical implications, we develop an adaptation mechanism for the strategy parameters of the stochastic natural gradient method on discrete search domains. The proposed algorithm is evaluated on commonly used test problems. It is further applied to two examples of simultaneously optimizing the hyperparameters and the connection weights of deep learning models, yielding faster optimization than existing approaches without any parameter-tuning effort.
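A minimal stochastic natural gradient step for a Bernoulli model over bit strings looks as follows; for this family the natural gradient of the log-likelihood reduces to (x - theta), which keeps the update cheap. The fixed learning rate and sample size below are exactly the strategy parameters the paper proposes to adapt automatically; this sketch leaves them fixed.

```python
import numpy as np

def sng_bernoulli(f, dim, iters=300, lam=20, eta=0.1, seed=0):
    """Stochastic natural gradient ascent over Bernoulli parameters
    theta (one per bit), with rank-based utilities as in information
    geometric optimization. eta and lam are held fixed here; adapting
    them online is the paper's contribution."""
    rng = np.random.default_rng(seed)
    theta = np.full(dim, 0.5)
    for _ in range(iters):
        X = (rng.random((lam, dim)) < theta).astype(float)  # sample bits
        fx = np.array([f(x) for x in X])
        ranks = fx.argsort().argsort()                       # 0 = worst
        u = (ranks - ranks.mean()) / len(ranks)              # utilities
        # Natural gradient for Bernoulli: sum of u_i * (x_i - theta).
        theta += eta * (u[:, None] * (X - theta)).sum(axis=0)
        theta = np.clip(theta, 1 / dim, 1 - 1 / dim)         # stay interior
    return theta

# OneMax: maximize the number of ones in the bit string.
theta = sng_bernoulli(lambda x: x.sum(), dim=20)
print((theta > 0.5).sum(), "of 20 bits pushed toward 1")
```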
Random problems with R
Model-Protected Multi-Task Learning
Multi-task learning (MTL) refers to the paradigm of learning multiple related tasks together. By contrast, single-task learning (STL) learns each individual task independently. MTL often leads to better-trained models because they can leverage the commonalities among related tasks. However, because MTL algorithms 'transmit' information about the different models across tasks, MTL poses a potential security risk: an adversary may participate in the MTL process through a participating task and thereby acquire the model information for another task. Previously proposed privacy-preserving MTL methods protect data instances rather than models, and some of them may underperform in comparison with STL methods. In this paper, we propose a privacy-preserving MTL framework that prevents information about each model from leaking to other models, based on a perturbation of the covariance matrix of the model matrix, and we instantiate it for two popular MTL approaches, namely those learning the low-rank and group-sparse patterns of the model matrix. Our methods are built upon tools from differential privacy. Privacy guarantees and utility bounds are provided, and heterogeneous privacy budgets are considered. Our algorithms are guaranteed not to underperform compared with STL methods. Experiments demonstrate that our algorithms outperform existing privacy-preserving MTL methods on the proposed model-protection problem.
MBS: Macroblock Scaling for CNN Model Reduction
We estimate the proper channel (width) scaling of Convolutional Neural Networks (CNNs) for model reduction. Unlike the traditional scaling method, which reduces every CNN channel width by the same factor, we scale each CNN macroblock adaptively depending on its information redundancy, measured by our proposed effective flops. Our macroblock scaling (MBS) algorithm can be applied to various CNN architectures to reduce their model size, ranging from compact CNN models such as MobileNet (25.53% reduction, ImageNet) and ShuffleNet (20.74% reduction, ImageNet) to ultra-deep ones such as ResNet-101 (51.67% reduction, ImageNet) and ResNet-1202 (72.71% reduction, CIFAR-10), with negligible accuracy degradation. MBS also achieves better reduction at a much lower cost than the state-of-the-art optimization-based method. MBS's simplicity and efficiency, its flexibility to work with any CNN model, and its scalability to models of any depth make it an attractive choice for CNN model size reduction.
Runtime Monitoring Neural Activation Patterns
For neural networks to be used in safety-critical domains, it is important to know whether a decision made by a neural network is supported by prior similarities in training. We propose runtime neuron activation pattern monitoring: after the standard training process, one creates a monitor by feeding the training data to the network again and storing the neuron activation patterns in abstract form. In operation, a classification decision over an input is supplemented by examining whether a pattern similar (measured by Hamming distance) to the generated pattern is contained in the monitor. If the monitor does not contain any similar pattern, it raises a warning that the decision is not based on the training data. Our experiments show that, by adjusting the similarity threshold for activation patterns, the monitors can report a significant portion of misclassifications as unsupported by training, with a small false-positive rate, when evaluated on a test set.
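A minimal version of such a monitor can be written in a few lines: abstract a layer's output into an on/off pattern, store the patterns seen on the training data, and warn when a runtime pattern falls outside a Hamming ball around every stored pattern. Storing patterns in an explicit Python set is a simplification; the paper keeps them in a more compact abstract form.

```python
import numpy as np

def binarize(acts):
    """Abstract a layer's activations into a hashable on/off pattern."""
    return tuple((acts > 0).astype(np.uint8))

class ActivationMonitor:
    """Stores neuron on/off patterns seen on the training data; at run
    time, warns when a new input's pattern is farther than `tau` (in
    Hamming distance) from every stored pattern."""
    def __init__(self, tau=2):
        self.patterns, self.tau = set(), tau

    def record(self, acts):            # second pass over the training set
        self.patterns.add(binarize(acts))

    def supported(self, acts):
        p = np.array(binarize(acts))
        return any(int(np.sum(p != np.array(q))) <= self.tau
                   for q in self.patterns)   # inside some Hamming ball?

mon = ActivationMonitor(tau=1)
mon.record(np.array([0.3, -0.1, 2.0, 0.0]))
print(mon.supported(np.array([0.5, 2.0, 1.0, -0.2])))  # True: distance 1
```

Raising tau makes the monitor more permissive (fewer warnings, more false negatives); lowering it does the opposite, which is the trade-off the experiments in the paper explore.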
Talking to myself: self-dialogues as data for conversational agents
Conversational agents are gaining popularity with the increasing ubiquity of smart devices. However, training agents in a data-driven manner is challenging due to a lack of suitable corpora. This paper presents a novel method for gathering topical, unstructured conversational data in an efficient way: self-dialogues collected through crowd-sourcing. Alongside this paper, we include a corpus of 3.6 million words across 23 topics. We argue for the utility of the corpus by comparing self-dialogues with standard two-party conversations as well as data from other corpora.
Is rotation forest the best classifier for problems with continuous features?
Rotation forest is a tree-based ensemble that performs transforms on subsets of attributes prior to constructing each tree. We present an empirical comparison of classifiers for problems with only real-valued features, evaluating classifiers from three families of algorithms: support vector machines; tree-based ensembles; and neural networks. We compare classifiers on unseen data based on the quality of the decision rule (classification error), the ability to rank cases (area under the receiver operating characteristic curve), and the probability estimates (negative log-likelihood). We conclude that, in answer to the question posed in the title, yes: rotation forest is significantly more accurate on average than competing techniques when compared on three distinct sets of datasets. The same pattern of results is observed when tuning classifiers on the training data using a grid search. We investigate why rotation forest does so well by testing whether the characteristics of the data can be used to differentiate classifier performance, and we assess the impact of the design features of rotation forest through an ablative study that transforms random forest into rotation forest. We identify the major limitation of rotation forest as its scalability, particularly in the number of attributes. To overcome this problem, we develop a model to predict the training time of the algorithm and hence propose a contract version of rotation forest, where a run-time cap is set a priori. We demonstrate that on large problems rotation forest can be made an order of magnitude faster without significant loss of accuracy, and that there is no real benefit (on average) from tuning the ensemble. We conclude that, without any domain knowledge to indicate an algorithm preference, rotation forest should be the default algorithm of choice for problems with continuous attributes.
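For readers unfamiliar with the algorithm, the following minimal sketch shows the core rotation forest construction: per tree, the features are partitioned into random groups, a PCA is fitted on each group, and a tree is grown on the rotated data. It omits refinements of the full algorithm (e.g., per-group class and instance subsampling), so treat it as a sketch rather than a faithful implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

def fit_rotation_forest(X, y, n_trees=10, n_groups=3, seed=0):
    """Minimal rotation forest: per tree, partition features into random
    groups, fit a PCA per group (the 'rotation'), and grow an unpruned
    tree on the concatenated rotated features."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        groups = np.array_split(rng.permutation(X.shape[1]), n_groups)
        pcas = [PCA().fit(X[:, g]) for g in groups]
        Xr = np.hstack([p.transform(X[:, g]) for p, g in zip(pcas, groups)])
        forest.append((groups, pcas, DecisionTreeClassifier().fit(Xr, y)))
    return forest

def predict_rotation_forest(forest, X):
    votes = []
    for groups, pcas, tree in forest:
        Xr = np.hstack([p.transform(X[:, g]) for p, g in zip(pcas, groups)])
        votes.append(tree.predict(Xr))
    votes = np.array(votes, dtype=int)     # majority vote over the trees
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

X = np.random.default_rng(1).normal(size=(200, 9))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
forest = fit_rotation_forest(X, y)
print((predict_rotation_forest(forest, X) == y).mean())
```

The per-tree PCA step is also where the scalability limitation noted above comes from: its cost grows quickly with the number of attributes.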
From BOP to BOSS and Beyond: Time Series Classification with Dictionary Based Classifiers
A family of algorithms for time series classification (TSC) involves running a sliding window across each series, discretising each window to form a word, forming a histogram of word counts over the dictionary, and then constructing a classifier on the histograms. A recent evaluation of two algorithms of this type, Bag of Patterns (BOP) and Bag of Symbolic Fourier Approximation Symbols (BOSS), found a significant difference in accuracy between these seemingly similar algorithms. We investigate this phenomenon by deconstructing the classifiers and measuring the relative importance of the four key components that differ between BOP and BOSS. We find that whilst ensembling is a key component for both algorithms, the effect of the other components is mixed and more complex. We conclude that BOSS represents the state of the art for dictionary-based TSC. Both BOP and BOSS can be classed as bag-of-words approaches, which are particularly popular in computer vision for tasks such as image classification. Converting approaches from vision requires careful engineering, and we adapt three techniques used in computer vision for TSC: Scale Invariant Feature Transform; Spatial Pyramids; and Histogram Intersection. We find that using Spatial Pyramids in conjunction with BOSS (SP) produces a significantly more accurate classifier. SP is significantly more accurate than standard benchmarks and the original BOSS algorithm. It is not significantly worse than the best shapelet-based approach, and is only outperformed by HIVE-COTE, an ensemble that includes BOSS as a constituent module.
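The shared skeleton of these dictionary-based classifiers is easy to sketch: slide a window, discretise it into a word, and count words. The version below uses SAX-style discretisation of piecewise means, as in BOP; BOSS instead discretises truncated Fourier coefficients (SFA), but the surrounding pipeline is the same.

```python
import numpy as np
from collections import Counter

def series_to_histogram(ts, window=16, word_len=4, alphabet=4):
    """Dictionary-based TSC skeleton: slide a window along the series,
    z-normalise it, compress it to `word_len` segment means (PAA), bin
    each mean into one of `alphabet` symbols, and count the words.
    (window must be divisible by word_len in this simple version.)"""
    # Approximate equiprobable Gaussian breakpoints for the alphabet.
    breakpoints = np.quantile(np.random.standard_normal(10000),
                              np.linspace(0, 1, alphabet + 1)[1:-1])
    words = []
    for s in range(len(ts) - window + 1):
        w = ts[s:s + window]
        w = (w - w.mean()) / (w.std() + 1e-8)
        paa = w.reshape(word_len, -1).mean(axis=1)   # piecewise means
        words.append(tuple(np.digitize(paa, breakpoints)))
    return Counter(words)   # a classifier (e.g., 1-NN) runs on histograms

ts = np.sin(np.linspace(0, 8 * np.pi, 128))
hist = series_to_histogram(ts)
print(len(hist), "distinct words")
```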
A generalized financial time series forecasting model based on automatic feature engineering using genetic algorithms and support vector machine
We propose a genetic algorithm for time window optimization: an embedded genetic algorithm (GA) that optimizes the time window (TW) of each attribute using feature selection and a support vector machine. The GA is evolved using the results of a trading simulation, and it determines the best TW for each technical indicator. Evaluation was conducted using a walk-forward trading simulation, and the trained model was verified to generalize to forecasting other stock data. The results show that using the GA to determine the TW can improve the rate of return, leading to better prediction models than those obtained using the default TW.
Labyrinth: Compiling Imperative Control Flow to Parallel Dataflows
Parallel dataflow systems have become a standard technology for large-scale data analytics. Complex data analysis programs in areas such as machine learning and graph analytics often involve control flow, i.e., iterations and branching. Therefore, systems for advanced analytics should include control flow constructs that are efficient and easy to use. A natural approach is to provide imperative control flow constructs similar to those of mainstream programming languages: while-loops, if-statements, and mutable variables, whose values can change between iteration steps. However, current parallel dataflow systems execute programs written using imperative control flow constructs by launching a separate dataflow job after every control flow decision (e.g., for every step of a loop). The performance of this approach is suboptimal, because (a) launching a dataflow job incurs scheduling overhead; and (b) it prevents certain optimizations across iteration steps. In this paper, we introduce Labyrinth, a method to compile programs written using imperative control flow constructs to a single dataflow job, which executes the whole program, including all iteration steps. This way, we achieve both efficiency and ease of use. We also conduct an experimental evaluation, which shows that Labyrinth has orders of magnitude smaller per-iteration-step overhead than launching new dataflow jobs, and also allows for significant optimizations across iteration steps.
MNIST Dataset Classification Utilizing k-NN Classifier with Modified Sliding Window Metric
This paper evaluates the performance of the k-nearest neighbor classification algorithm on the MNIST dataset of handwritten digits. The standard L2 Euclidean distance metric is compared to a modified distance metric that uses a sliding-window technique to avoid performance degradation due to slight spatial misalignments. Accuracy and the confusion matrix are used as performance indicators to compare the baseline algorithm against the enhanced sliding-window method, and the results show a significant improvement from this simple modification.
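The modified metric is straightforward to implement: take the minimum L2 distance between one image and small spatial shifts of the other. In the sketch below, the shift range and zero-padding at the borders are illustrative choices, and random arrays stand in for MNIST images.

```python
import numpy as np

def sliding_window_l2(a, b, max_shift=1):
    """Modified metric: minimum L2 distance between image `a` and all
    versions of `b` shifted by up to `max_shift` pixels in x and y, so
    slightly misaligned digits are no longer judged far apart."""
    best = np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(b, dy, axis=0), dx, axis=1)
            # Zero out pixels that np.roll wrapped around the border.
            if dy > 0: shifted[:dy, :] = 0
            elif dy < 0: shifted[dy:, :] = 0
            if dx > 0: shifted[:, :dx] = 0
            elif dx < 0: shifted[:, dx:] = 0
            best = min(best, float(np.sum((a - shifted) ** 2)))
    return best

def knn_predict(x, X_train, y_train, k=3, metric=sliding_window_l2):
    d = np.array([metric(x, xt) for xt in X_train])
    top = y_train[np.argsort(d)[:k]]
    return np.bincount(top).argmax()   # majority vote among k neighbors

# Toy usage with random 28x28 'digits' standing in for MNIST images.
rng = np.random.default_rng(0)
X_train = rng.random((20, 28, 28)); y_train = rng.integers(0, 10, 20)
print(knn_predict(rng.random((28, 28)), X_train, y_train))
```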
On the Learning Dynamics of Deep Neural Networks
While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day poorly understood. In this work, we study the case of binary classification and prove various properties of learning in such networks under strong assumptions such as linear separability of the data. Extending existing results from the linear case, we confirm empirical observations by proving that the classification error also follows a sigmoidal shape in nonlinear architectures. We show that, given proper initialization, learning proceeds along parallel independent modes and that certain regions of parameter space might lead to failed training. We also demonstrate that input norm and feature frequency in the dataset lead to distinct convergence speeds, which might shed some light on the generalization capabilities of deep neural networks. We provide a comparison between the dynamics of learning with cross-entropy and hinge losses, which could prove useful for understanding recent progress in the training of generative adversarial networks. Finally, we identify a phenomenon that we baptize gradient starvation, where the most frequent features in a dataset prevent the learning of other, less frequent but equally informative, features.
• SECS: Efficient Deep Stream Processing via Class Skew Dichotomy
• Capsule Deep Neural Network for Recognition of Historical Graffiti Handwriting
• Reflection identities of harmonic sums of weight four
• EEG-based Subjects Identification based on Biometrics of Imagined Speech using EMD
• Leakage Mitigation in Heterodyne FMCW Radar For Small Drone Detection with Stationary Point Concentration Technique
• Adversarial Reinforcement Learning for Observer Design in Autonomous Systems under Cyber Attacks
• A class of non-linear fractional-order system stabilisation via fixed-order dynamic output feedback controller
• AUEB at BioASQ 6: Document and Snippet Retrieval
• Conditional Joint Probability Distributions of First Exit Times to Overlapping Absorbing Sets of the Mixture of Markov Jump Processes
• Controller Synthesis for Discrete-time Hybrid Polynomial Systems via Occupation Measures
• A Fog Robotic System for Dynamic Visual Servoing
• Underlay Drone Cell for Temporary Events: Impact of Drone Height and Aerial Channel Environments
• Generative x-vectors for text-independent speaker verification
• Scattering Networks for Hybrid Representation Learning
• Surface Wave-Based Underwater Radio Communication
• Strange Attractor in Density Evolution
• The Best-or-Worst and the Postdoc problems with random number of candidates
• A Rainbow Dirac’s Theorem
• Segmenting root systems in X-ray computed tomography images using level sets
• Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process
• Crowdsourcing Lung Nodules Detection and Annotation
• Adversarial Imitation via Variational Inverse Reinforcement Learning
• Bayesian analysis of absolute continuous Marshall-Olkin bivariate Pareto distribution with location and scale parameters
• The Double Star Sequences and the General Second Zagreb Index
• Crowd-Assisted Polyp Annotation of Virtual Colonoscopy Videos
• DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning
• Radiative Transport Based Flame Volume Reconstruction from Videos
• Spatial Variable Selection and An Application to Virginia Lyme Disease Emergence
• A class of parabolic systems associated with optimal control of grain boundary motions
• Bridging the Simulated-to-Real Gap: Benchmarking Super-Resolution on Real Data
• Metric Registration of Curves and Surfaces using Optimal Control
• A General Framework for Temporal Fair User Scheduling in NOMA Systems
• The Hrushovski property for hypertournaments and profinite topologies
• The Effective Geometry Monte Carlo Algorithm: Applications to Molecular Communication
• Limited Rate Distributed Weight-Balancing and Average Consensus Over Digraphs
• LMap: Shape-Preserving Local Mappings for Biomedical Visualization
• Robust Spoken Language Understanding via Paraphrasing
• Homogeneity testing under finite location-scale mixtures
• A new lower bound on Hadwiger-Debrunner numbers in the plane
• Robustness Guarantees for Bayesian Inference with Gaussian Processes
• Non-Uniform Stability, Detectability, and, Sliding Mode Observer Design for Time Varying Systems with Unknown Inputs
• Mask Editor: an Image Annotation Tool for Image Segmentation Tasks
• Recovering the Underlying Trajectory from Sparse and Irregular Longitudinal Data
• Functional Measurement Error in Functional Regression
• Ground vehicle odometry using a non-intrusive inertial speed sensor
• Towards Deep and Representation Learning for Talent Search at LinkedIn
• Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates
• Talent Search and Recommendation Systems at LinkedIn: Practical Challenges and Lessons Learned
• Lagrangian chaos and scalar advection in stochastic fluid mechanics
• Correlations in the shear flow of athermal amorphous solids: A principal component analysis
• In-Session Personalization for Talent Search
• Triad-based Neural Network for Coreference Resolution
• An inverse problem formulation of the immersed boundary method
• Estimating grouped data models with a binary dependent variable and fixed effects: What are the issues
• On the Partition Set Cover Problem
• Scene Text Recognition from Two-Dimensional Perspective
• Concentration Inequalities for the Empirical Distribution
• Negative type diversities, a multi-dimensional analogue of negative type metrics
• Formal Barriers to Longest-Chain Proof-of-Stake Protocols
• Performance Analysis and Modeling of Video Transcoding Using Heterogeneous Cloud Services
• Multi-channel EEG recordings during a sustained-attention driving task
• Leveraging Computational Reuse for Cost- and QoS-Efficient Task Scheduling in Clouds
• Automatic Judgment Prediction via Legal Reading Comprehension
• Robust Model Predictive Control with Adjustable Uncertainty Sets
• Deep Textured 3D Reconstruction of Human Bodies
• On generalized Erdős-Ginzburg-Ziv constants of $C_n^r$
• On the abelian complexity of generalized Thue-Morse sequences
• Image Super-Resolution via Deterministic-Stochastic Synthesis and Local Statistical Rectification
• User Information Augmented Semantic Frame Parsing using Coarse-to-Fine Neural Networks
• Low-Latency Short-Packet Transmissions: Fixed Length or HARQ?
• A Simple Approximation for a Hard Routing Problem
• Utilizing Network Structure to Bound the Convergence Rate in Markov Chain Monte Carlo Algorithms
• Evolution of vacancy pores in bounded particles
• Connectivity and Structure in Large Networks
• Switching Isotropic and Directional Exploration with Parameter Space Noise in Deep Reinforcement Learning
• Convergence to a Lévy process in the Skorohod $M_1$ and $M_2$ topologies for nonuniformly hyperbolic systems, including billiards with cusps
• U-Net for MAV-based Penstock Inspection: an Investigation of Focal Loss in Multi-class Segmentation for Corrosion Identification
• How does bond percolation happen in coloured networks?
• Towards a symbolic summation theory for unspecified sequences
• A probabilistic framework for approximating functions in active subspaces
• Symbolic Tensor Neural Networks for Digital Media – from Tensor Processing via BNF Graph Rules to CREAMS Applications
• Learning Universal Sentence Representations with Mean-Max Attention Autoencoder
• Enhanced 3DTV Regularization and Its Applications on Hyper-spectral Image Denoising and Compressed Sensing
• The distortion principle for insurance pricing: properties, identification and robustness
• Rare tail approximation using asymptotics and $L^1$ polar coordinates
• Asymptotic expansion for some local volatility models arising in finance
• SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions
• Tilings of polygons composed of equal rectangles by similar rectangles
• Comparison between Suitable Priors for Additive Bayesian Networks
• Towards Abstraction in ASP with an Application on Reasoning about Agent Policies
• Lung Cancer Concept Annotation from Spanish Clinical Narratives
• Model-Free Adaptive Optimal Control of Sequential Manufacturing Processes using Reinforcement Learning
• Attribute Enhanced Face Aging with Wavelet-based Generative Adversarial Networks
• Local Reconstruction Codes: A Class of MDS-PIR Capacity-Achieving Codes
• Toward Unobtrusive In-home Gait Analysis Based on Radar Micro-Doppler Signatures
• Quantum communication in a superposition of causal orders
• Probing Limits of Information Spread with Sequential Seeding
• Low-Voltage Distribution Network Impedances Identification Based on Smart Meter Data
• Bidirectional Attentional Encoder-Decoder Model and Bidirectional Beam Search for Abstractive Summarization
• Support Vector Machine (SVM) Recognition Approach adapted to Individual and Touching Moths Counting in Trap Images
• A Simple Approach to Intrinsic Correspondence Learning on Unstructured 3D Meshes
• Compressed Sensing Parallel MRI with Adaptive Shrinkage TV Regularization
• Gram Charlier and Edgeworth expansion for sample variance
• Effects of Repetitive SSVEPs on EEG Complexity using Multiscale Inherent Fuzzy Entropy
• Benchmarking five global optimization approaches for nano-optical shape optimization and parameter reconstruction
• Reconfiguration of Brain Network between Resting-state and Oddball Paradigm
• Estimating Bayesian Optimal Treatment Regimes for Dichotomous Outcomes using Observational Data
• RumourEval 2019: Determining Rumour Veracity and Support for Rumours
• Average performance of Orthogonal Matching Pursuit (OMP) for sparse approximation
• Adding Cues to Binary Feature Descriptors for Visual Place Recognition
• Multiple Combined Constraints for Image Stitching
• Optimal strategies for patrolling fences
• Dynamical variety of shapes in financial multifractality
• Stable processes conditioned to hit an interval continuously from the outside
• Bridging the Gap Between Safety and Real-Time Performance in Receding-Horizon Trajectory Design for Mobile Robots
• Transfer and Multi-Task Learning for Noun-Noun Compound Interpretation
• Multiobjective Reinforcement Learning for Reconfigurable Adaptive Optimal Control of Manufacturing Processes
• 3D segmentation of mandible from multisectional CT scans by convolutional neural networks
• A Variance Reduction Method for Non-Convex Optimization with Improved Convergence under Large Condition Number
• State-Dependent Kernel Selection for Conditional Sampling of Graphs
• Structural Target Controllability of Undirected Networks
• Structured Sparsity Promoting Functions
• Generalized Content-Preserving Warps for Image Stitching
• Phase transition in random tensors with multiple spikes
• On the combinatorics of last passage percolation in a quarter square and $\mathrm{GOE}^2$ fluctuations
• Analysis of Convergence for the Newton Method in DC Microgrids
• Nonconvex Demixing From Bilinear Measurements
• Discrete Derivative Asymptotics of the $β$-Hermite Eigenvalues
• Competing paths over fitness valleys in growing populations
• Branch-and-bound for bi-objective integer programming
• A Bayesian Approach for Inferring Local Causal Structure in Gene Regulatory Networks
• Finding k-Dissimilar Paths with Minimum Collective Length
• Face enumeration on flag complexes and flag spheres
• Bias behaviour and antithetic sampling in mean-field particle approximations of SDEs nonlinear in the sense of McKean
• Albumentations: fast and flexible image augmentations
• Device-to-Device Secure Coded Caching
• $L^{p}$-solutions of the Navier-Stokes equation with fractional Brownian noise