What's new on arXiv

Topic representation: finding more representative words in topic models

The top word list, i.e., the top-M words with the highest marginal probability in a given topic, is the standard topic representation in topic models. Most recent automatic topic labeling algorithms and popular topic quality metrics are based on it. However, we find empirically that the words in this type of top word list are not always representative. The objective of this paper is to find more representative top word lists for topics. To achieve this, we rerank the words in a given topic by further considering the marginal probability of each word over every other topic. The reranked list of top-M words then serves as a novel topic representation for topic models. We investigate three reranking methodologies, using (1) standard deviation weight, (2) standard deviation weight with topic size, and (3) chi-square ($\chi^2$) statistic selection. Experimental results on real-world collections indicate that our representations can extract more representative words for topics, agreeing with human judgements.
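
As a toy illustration of the first strategy, here is a minimal sketch, assuming the standard-deviation weight simply multiplies a word's in-topic probability by the spread of that word's probability across topics (the function name and toy data are ours, not the paper's):

```python
import numpy as np

def rerank_topic_words(phi, topic, top_m=10):
    """Rerank one topic's words by weighting each word's in-topic
    probability with the standard deviation of that word's probability
    across all topics: a distinctive word varies strongly across topics.

    phi : (K, V) array, phi[k, v] = p(word v | topic k), rows sum to 1.
    """
    std_weight = phi.std(axis=0)       # (V,) spread of each word across topics
    scores = phi[topic] * std_weight   # reward probable AND distinctive words
    return np.argsort(scores)[::-1][:top_m]

# toy example: 3 topics over a 6-word vocabulary; word 1 is equally
# probable everywhere (like a stopword), so it drops out of the top list
phi = np.array([[.4, .3, .1, .1, .05, .05],
                [.1, .3, .4, .1, .05, .05],
                [.1, .3, .1, .1, .2,  .2 ]])
print(rerank_topic_words(phi, topic=0, top_m=3))  # [0 2 4], word 1 excluded
```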

The Faults in Our Pi Stars: Security Issues and Open Challenges in Deep Reinforcement Learning

Since the inception of Deep Reinforcement Learning (DRL) algorithms, there has been growing interest in both the research and industrial communities in the promising potential of this paradigm. The list of current and envisioned applications of deep RL ranges from autonomous navigation and robotics to control applications in critical infrastructure, air traffic control, defense technologies, and cybersecurity. While the landscape of opportunities and the advantages of deep RL algorithms are justifiably vast, the security risks and issues in such algorithms remain largely unexplored. To facilitate and motivate further research on these critical challenges, this paper presents a foundational treatment of the security problem in DRL. We formulate the security requirements of DRL and provide a high-level threat model through the classification and identification of vulnerabilities, attack vectors, and adversarial capabilities. Furthermore, we present a review of the current literature on the security of deep RL from both offensive and defensive perspectives. Lastly, we enumerate critical research avenues and open problems in the mitigation and prevention of intentional attacks against deep RL as a roadmap for further research in this area.

Time-Aware and Corpus-Specific Entity Relatedness

Entity relatedness has emerged as an important feature in a plethora of applications such as information retrieval, entity recommendation and entity linking. Given an entity, for instance a person or an organization, entity relatedness measures can be exploited for generating a list of highly-related entities. However, the relation of an entity to some other entity depends on several factors, with time and context being two of the most important ones (where, in our case, context is determined by a particular corpus). For example, the entities related to the International Monetary Fund are different now compared to some years ago, while these entities may also differ substantially in the context of a USA news portal compared to a Greek news portal. In this paper, we propose a simple but flexible model for entity relatedness which considers time- and entity-aware word embeddings built by exploiting the underlying corpus. The proposed model does not require external knowledge and is language independent, which makes it widely useful in a variety of applications.
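
To make the corpus-specific idea concrete, here is a hedged sketch using gensim's Word2Vec (assuming gensim >= 4): train embeddings only on a given corpus slice and read relatedness off the embedding neighborhood. Real entity-aware embeddings would first link entity mentions; here entities are naively treated as single tokens, and the toy corpus is fabricated.

```python
from gensim.models import Word2Vec

def related_entities(corpus_sentences, entity, topn=5):
    # Train embeddings on the given corpus slice only, so relatedness
    # reflects that corpus (and, if sliced by date, that time period).
    model = Word2Vec(corpus_sentences, vector_size=100, window=5,
                     min_count=1, epochs=20, seed=0)
    return model.wv.most_similar(entity, topn=topn)

corpus = [["imf", "approves", "loan", "to", "greece"],
          ["greece", "debt", "talks", "with", "imf"],
          ["imf", "warns", "on", "global", "growth"]]
print(related_entities(corpus, "imf", topn=3))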

Stochastic Substitute Training: A Gray-box Approach to Craft Adversarial Examples Against Gradient Obfuscation Defenses

It has been shown that adversaries can craft inputs to neural networks which are similar to legitimate inputs but are purposely designed to cause the network to misclassify them. These adversarial examples are crafted, for example, by calculating gradients of a carefully defined loss function with respect to the input. As a countermeasure, some researchers have tried to design robust models by blocking or obfuscating gradients, even in white-box settings. Another line of research proposes introducing a separate detector to attempt to detect adversarial examples. This approach also makes use of gradient obfuscation techniques, for example, to prevent the adversary from trying to fool the detector. In this paper, we introduce stochastic substitute training, a gray-box approach that can craft adversarial examples for defenses which obfuscate gradients. For defenses that have tried to make models more robust, our technique lets an adversary craft adversarial examples with no knowledge of the defense. For defenses that attempt to detect adversarial examples, our technique requires only very limited information about the defense. We demonstrate our technique by applying it against two defenses which make models more robust and two defenses which detect adversarial examples.
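
For orientation, a minimal sketch of the transfer step such attacks rely on: craft gradient-based (FGSM) examples on a substitute model and replay them against the target. The paper's contribution, training the substitute stochastically from the defended model's input-output behavior, is not shown here; the substitute below is an untrained stand-in with shapes of our choosing.

```python
import torch

def fgsm(substitute, x, y, eps=0.1):
    # One-step gradient-sign attack computed on the substitute model;
    # the resulting x_adv would then be sent to the defended target.
    x = x.clone().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(substitute(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()

substitute = torch.nn.Sequential(torch.nn.Flatten(),
                                 torch.nn.Linear(784, 10))
x = torch.rand(4, 1, 28, 28)              # e.g. MNIST-shaped inputs
y = torch.randint(0, 10, (4,))
x_adv = fgsm(substitute, x, y)
print((x_adv - x).abs().max())            # max perturbation ~= eps
```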

Some negative results for Neural Networks

We demonstrate some negative results for approximation of functions with neural networks.

A new approach to contextual recommendation based on the Analytic Hierarchy Process (AHP) method

Recommender systems estimate a user's interest in a resource from information about similar users and from properties of the resource itself. In this thesis, we introduce a new contextual recommendation approach based on the Analytic Hierarchy Process (AHP). This work consists of a bibliographic study of prior work proposing recommender systems based on user context in the movie domain. The goal is to design and develop a new approach to recommending movies based on user context. We rely on multi-criteria decision making (MCDM) methods, and more precisely on the Analytic Hierarchy Process (AHP), to integrate context into the recommendation process.
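
The core AHP step is deriving criterion weights from a pairwise-comparison matrix via its principal eigenvector (Saaty's method). A minimal sketch, with toy context criteria of our choosing rather than the thesis's actual criteria:

```python
import numpy as np

def ahp_weights(pairwise):
    """Derive criterion weights from an AHP pairwise-comparison matrix
    via its principal eigenvector.

    pairwise[i, j] = how much more important criterion i is than j
    (reciprocal matrix: pairwise[j, i] == 1 / pairwise[i, j]).
    """
    vals, vecs = np.linalg.eig(pairwise)
    principal = np.real(vecs[:, np.argmax(np.real(vals))])
    return principal / principal.sum()     # normalize to sum to 1

# toy example: genre vs mood vs time-of-day for a movie recommendation
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
print(ahp_weights(A))   # context-criteria weights, largest for genre
```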

Graph Laplacian mixture model

Graph learning methods have recently been receiving increasing interest as a means to infer structure in datasets. Most of the recent approaches focus on different relationships between a graph and data sample distributions, mostly in settings where all available data relate to the same graph. This is, however, not always the case, as data is often available in mixed form, yielding the need for methods that are able to cope with mixture data and learn multiple graphs. We propose a novel generative model that explains a collection of distinct data naturally living on different graphs. We assume the mapping of data to graphs is not known and investigate the problem of jointly clustering a set of data and learning a graph for each of the clusters. Experiments on both synthetic and real-world datasets demonstrate promising performance, both in terms of data clustering and in terms of multiple graph inference from mixture data.

NestDNN: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision

Mobile vision systems such as smartphones, drones, and augmented-reality headsets are revolutionizing our lives. These systems usually run multiple applications concurrently, and their available resources at runtime are dynamic due to events such as starting new applications, closing existing applications, and application priority changes. In this paper, we present NestDNN, a framework that takes the dynamics of runtime resources into account to enable resource-aware multi-tenant on-device deep learning for mobile vision systems. NestDNN enables each deep learning model to offer flexible resource-accuracy trade-offs. At runtime, it dynamically selects the optimal resource-accuracy trade-off for each deep learning model to fit the model's resource demand to the system's available runtime resources. In doing so, NestDNN efficiently utilizes the limited resources in mobile vision systems to jointly maximize the performance of all the concurrently running applications. Our experiments show that compared to the resource-agnostic status quo approach, NestDNN achieves as much as a 4.2% increase in inference accuracy, a 2.0x increase in video frame processing rate, and a 1.7x reduction in energy consumption.

Learning Representations in Model-Free Hierarchical Reinforcement Learning

Common approaches to Reinforcement Learning (RL) are seriously challenged by large-scale applications involving huge state spaces and sparse delayed reward feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address this scalability issue by learning action selection policies at multiple levels of temporal abstraction. Abstraction can be achieved by identifying a relatively small set of states that are likely to be useful as subgoals, in concert with the learning of corresponding skill policies to achieve those subgoals. Many approaches to subgoal discovery in HRL depend on the analysis of a model of the environment, but the need to learn such a model introduces its own problems of scale. Once subgoals are identified, skills may be learned through intrinsic motivation, introducing an internal reward signal marking subgoal attainment. In this paper, we present a novel model-free method for subgoal discovery using incremental unsupervised learning over a small memory of the most recent experiences of the agent. When combined with an intrinsic motivation learning mechanism, this method learns subgoals and skills together, based on experiences in the environment. Thus, we offer an original approach to HRL that does not require the acquisition of a model of the environment, suitable for large-scale applications. We demonstrate the efficiency of our method on two RL problems with sparse delayed feedback: a variant of the rooms environment and the ATARI 2600 game Montezuma's Revenge.
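
A hedged sketch of the subgoal-discovery idea: cluster a small memory of recent states and treat the centroids as candidate subgoals. The paper's incremental unsupervised learner is stood in for here by scikit-learn's MiniBatchKMeans with partial_fit, and the environment states are fabricated 2-D points.

```python
from collections import deque
import numpy as np
from sklearn.cluster import MiniBatchKMeans

memory = deque(maxlen=500)                   # small memory of recent states
kmeans = MiniBatchKMeans(n_clusters=4, random_state=0)

def observe(state):
    memory.append(state)
    if len(memory) >= kmeans.n_clusters:     # enough samples to (re)fit
        kmeans.partial_fit(np.asarray(memory))   # incremental update

def candidate_subgoals():
    return kmeans.cluster_centers_           # centroids ~ subgoal states

for _ in range(50):                          # fake states from a rollout
    observe(np.random.rand(2))
print(candidate_subgoals())
```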

Autowarp: Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders

Measuring similarities between unlabeled time series trajectories is an important problem in domains as diverse as medicine, astronomy, finance, and computer vision. It is often unclear which metric is appropriate because of the complex nature of noise in the trajectories (e.g., different sampling rates or outliers). Domain experts typically hand-craft or manually select a specific metric, such as dynamic time warping (DTW), to apply to their data. In this paper, we propose Autowarp, an end-to-end algorithm that optimizes and learns a good metric given unlabeled trajectories. We define a flexible and differentiable family of warping metrics, which encompasses common metrics such as DTW, Euclidean, and edit distance. Autowarp then leverages the representation power of sequence autoencoders to optimize for a member of this warping distance family. The output is a metric which is easy to interpret and can be robustly learned from relatively few trajectories. In systematic experiments across different domains, we show that Autowarp often outperforms hand-crafted trajectory similarity metrics.
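
For reference, plain dynamic time warping, one member of the warping-distance family that Autowarp searches over (this is the textbook algorithm, not Autowarp itself):

```python
import numpy as np

def dtw(a, b):
    """Classic dynamic time warping between 1-D sequences a and b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]

print(dtw([0, 1, 2, 3], [0, 0, 1, 2, 3]))  # 0.0: warps over the repeated 0
```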

PoPPy: A Point Process Toolbox Based on PyTorch

PoPPy is a point process toolbox based on PyTorch, which supports flexible design and efficient learning of point process models. It can be used for interpretable sequential data modeling and analysis, e.g., Granger causality analysis of multivariate point processes, and point process-based simulation and prediction of event sequences. In practice, the key questions in point process-based sequential data modeling are: 1) How do we design intensity functions to describe the mechanism behind observed data? 2) How do we learn the proposed intensity functions from observed data? The goal of PoPPy is to provide a user-friendly solution to these questions and to achieve large-scale point process-based sequential data analysis, simulation and prediction.
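
This is not PoPPy's actual API; as a sketch of the first design question, here is a PyTorch module for the classic exponential-kernel Hawkes intensity $\lambda(t) = \mu + \alpha \sum_{t_i < t} e^{-\beta (t - t_i)}$. Learning would then maximize the event sequence's log-likelihood (and constrain the parameters to stay positive).

```python
import torch

class ExpHawkesIntensity(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.mu = torch.nn.Parameter(torch.tensor(0.1))     # base rate
        self.alpha = torch.nn.Parameter(torch.tensor(0.5))  # excitation
        self.beta = torch.nn.Parameter(torch.tensor(1.0))   # decay

    def forward(self, t, history):
        # history: 1-D tensor of past event times t_i < t
        decay = torch.exp(-self.beta * (t - history))
        return self.mu + self.alpha * decay.sum()

model = ExpHawkesIntensity()
events = torch.tensor([0.5, 1.2, 2.0])
print(model(torch.tensor(2.5), events))  # intensity after three events
```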

Area Attention

Existing attention mechanisms are mostly item-based in that a model is designed to attend to a single item in a collection of items (the memory). Intuitively, an area in the memory that may contain multiple items can be worth attending to as a whole. We propose area attention: a way to attend to an area of the memory, where each area contains a group of items that are either spatially adjacent when the memory has a 2-dimensional structure, such as images, or temporally adjacent for 1-dimensional memory, such as natural language sentences. Importantly, the size of an area, i.e., the number of items in an area, can vary depending on the learned coherence of the adjacent items. By giving the model the option to attend to an area of items, instead of only a single item, we hope attention mechanisms can better capture the nature of the task. Area attention can work alongside multi-head attention for attending to multiple areas in the memory. We evaluate area attention on two tasks: neural machine translation and image captioning, and improve upon strong (state-of-the-art) baselines in both cases. These improvements are obtainable with a basic form of area attention that is parameter free. In addition to proposing the novel concept of area attention, we contribute an efficient way for computing it by leveraging the technique of summed area tables.
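
The summed-area-table trick makes the sums (and hence means) over all rectangular areas cheap: four table lookups per area instead of a fresh summation. A minimal sketch of the table itself (the toy "keys" matrix is ours):

```python
import numpy as np

def summed_area_table(x):
    """Integral image: S[i, j] = sum of x[:i, :j]."""
    S = np.zeros((x.shape[0] + 1, x.shape[1] + 1))
    S[1:, 1:] = x.cumsum(0).cumsum(1)
    return S

def area_sum(S, r0, c0, r1, c1):
    """Sum of x[r0:r1, c0:c1] in O(1) via four table lookups."""
    return S[r1, c1] - S[r0, c1] - S[r1, c0] + S[r0, c0]

keys = np.arange(12.0).reshape(3, 4)   # e.g. a grid of key scores
S = summed_area_table(keys)
print(area_sum(S, 0, 0, 2, 2), keys[:2, :2].sum())  # both 10.0
```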

The Unit-B Method — Refinement Guided by Progress Concerns

We present Unit-B, a formal method inspired by Event-B and UNITY. Unit-B aims at the stepwise design of software systems satisfying safety and liveness properties. The method features the novel notion of coarse and fine schedules, a generalisation of weak and strong fairness for specifying events' scheduling assumptions. Based on event schedules, we propose proof rules to reason about progress properties and a refinement order preserving both liveness and safety properties. We illustrate our approach with an example to show that systems development can be driven not only by safety but also by liveness requirements.

FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation

Randomized Gradient Boosting Machine

Gradient Boosting Machine (GBM) introduced by Friedman is an extremely powerful supervised learning algorithm that is widely used in practice — it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In spite of the usefulness of GBM in practice, there is a big gap between its theoretical understanding and its success in practice. In this work, we propose Randomized Gradient Boosting Machine (RGBM) which leads to significant computational gains compared to GBM, by using a randomization scheme to reduce the search in the space of weak learners. Our analysis provides a formal justification of commonly used ad hoc heuristics employed by GBM implementations such as XGBoost, and suggests alternatives. In particular, we also provide a principled guideline towards better step-size selection in RGBM that does not require a line search. The analysis of RGBM is inspired by a special variant of coordinate descent that combines the benefits of randomized coordinate descent and greedy coordinate descent; and may be of independent interest as an optimization algorithm. As a special case, our results for RGBM lead to superior computational guarantees for GBM. Our computational guarantees depend upon a curious geometric quantity that we call Minimal Cosine Angle, which relates to the density of weak learners in the prediction space. We demonstrate the effectiveness of RGBM over GBM in terms of obtaining a model with good training/test data fidelity with a fraction of the computational cost, via numerical experiments on several real datasets.
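
A hedged sketch of the randomization idea: at each boosting round, search only a random subset of the weak learners, here decision stumps restricted to a random subset of features, for the best fit to the residuals (squared loss; all names and toy data are ours, not the paper's algorithm verbatim):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_rgbm(X, y, rounds=50, subset=3, lr=0.1):
    pred = np.zeros_like(y)
    ensemble = []
    for _ in range(rounds):
        residual = y - pred                  # negative gradient, squared loss
        feats = rng.choice(X.shape[1], size=subset, replace=False)
        best = None
        for f in feats:                      # search only the random subset
            thr = np.median(X[:, f])
            left = X[:, f] <= thr
            fit = np.where(left, residual[left].mean(), residual[~left].mean())
            sse = ((residual - fit) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, f, thr, residual[left].mean(), residual[~left].mean())
        _, f, thr, lval, rval = best
        pred += lr * np.where(X[:, f] <= thr, lval, rval)
        ensemble.append((f, thr, lval, rval))
    return ensemble, pred

X = rng.normal(size=(200, 10))
y = X[:, 0] * 2 + np.sin(X[:, 1])
_, pred = fit_rgbm(X, y)
print(np.mean((y - pred) ** 2))              # training MSE after boosting
```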

Deep Learning with Long Short-Term Memory for Time Series Prediction

Time series prediction can be generalized as a process that extracts useful information from historical records and then determines future values. Learning long-range dependencies embedded in time series is often an obstacle for most algorithms, whereas Long Short-Term Memory (LSTM) solutions, as a specific kind of scheme in deep learning, promise to effectively overcome the problem. In this article, we first give a brief introduction to the structure and forward propagation mechanism of the LSTM model. Then, aiming at reducing the considerable computing cost of LSTM, we put forward the Random Connectivity LSTM (RCLSTM) model and test it by predicting traffic and user mobility in telecommunication networks. In contrast to LSTM, RCLSTM is formed via stochastic connectivity between neurons, which achieves a significant breakthrough in the architecture formation of neural networks. In this way, the RCLSTM model exhibits a certain level of sparsity, which leads to an appealing decrease in computational complexity and makes the RCLSTM model more applicable in latency-stringent application scenarios. In the field of telecommunication networks, the prediction of traffic series and mobility traces could directly benefit from this improvement, as we further demonstrate that the prediction accuracy of RCLSTM is comparable to that of the conventional LSTM no matter how we change the number of training samples or the length of input sequences.
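
A minimal sketch of random connectivity in PyTorch (our construction, not the paper's code): fix a sparse binary mask over the LSTM's weight matrices so only a random fraction of connections exists, and re-apply the mask after each optimizer step so pruned weights stay at zero.

```python
import torch

torch.manual_seed(0)
lstm = torch.nn.LSTM(input_size=8, hidden_size=16)
density = 0.3                                   # keep ~30% of connections
masks = {}
for name, p in lstm.named_parameters():
    if "weight" in name:                        # input-hidden and hidden-hidden
        masks[name] = (torch.rand_like(p) < density).float()
        p.data *= masks[name]

def apply_masks():                              # call after each optimizer step
    with torch.no_grad():
        for name, p in lstm.named_parameters():
            if name in masks:
                p.mul_(masks[name])

x = torch.randn(5, 1, 8)                        # (seq_len, batch, features)
out, _ = lstm(x)
print(out.shape)                                # torch.Size([5, 1, 16])
```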

Outcome-wide longitudinal designs for causal inference: a new template for empirical studies

In this paper we propose a new template for empirical studies intended to assess causal effects: the outcome-wide longitudinal design. The approach is an extension of what is often done to assess the causal effects of a treatment or exposure using confounding control, but now over numerous outcomes. We discuss the temporal and confounding control principles for such outcome-wide studies, metrics to evaluate robustness or sensitivity to potential unmeasured confounding for each outcome, and approaches to handle multiple testing. We argue that the outcome-wide longitudinal design has numerous advantages over more traditional studies of single exposure-outcome relationships, including results that are less subject to investigator bias, greater potential to report null effects, greater capacity to compare effect sizes, a tremendous gain in efficiency for the research community, greater policy relevance, and a more rapid advancement of knowledge. We discuss both the practical and theoretical justification for the outcome-wide longitudinal design and also the pragmatic details of its implementation.
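
One widely used robustness metric in this spirit (hedged: the paper discusses a family of such metrics, not necessarily this one) is the E-value for a risk ratio, $E = RR + \sqrt{RR(RR - 1)}$: the minimum strength of unmeasured confounding, on the risk-ratio scale, needed to explain an estimate away.

```python
import math

def e_value(rr):
    # E-value for an observed risk ratio; handles protective effects by
    # reflecting the ratio across the null before applying the formula.
    rr = max(rr, 1 / rr)
    return rr + math.sqrt(rr * (rr - 1))

print(round(e_value(1.8), 2))  # 3.0: a confounder would need RR >= 3
                               # with both exposure and outcome
```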

Exploiting Partial Correlations in Distributionally Robust Optimization

Modified Multidimensional Scaling and High Dimensional Clustering

The Hellinger Correlation

In this paper, the defining properties of a valid measure of the dependence between two random variables are reviewed and complemented with two original ones, shown to be more fundamental than other usual postulates. While other popular choices are proved to violate some of these requirements, a class of dependence measures satisfying all of them is identified. One particular measure, which we call the Hellinger correlation, appears as a natural choice within that class due to both its theoretical and intuitive appeal. A simple and efficient nonparametric estimator for that quantity is proposed. Synthetic and real-data examples finally illustrate the descriptive ability of the measure, which can also be used as a test statistic for exact independence testing.
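
The paper's estimator is specific; as a naive stand-in for the quantity it builds on, here is a histogram plug-in for the Hellinger (Bhattacharyya) affinity between the empirical joint distribution and the product of its marginals, which equals 1 exactly under independence.

```python
import numpy as np

def hellinger_affinity(x, y, bins=10):
    # Discretize, then compute sum of sqrt(joint * product-of-marginals).
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    joint = joint / joint.sum()
    prod = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    return np.sqrt(joint * prod).sum()   # in (0, 1], 1 iff independent

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
print(hellinger_affinity(x, rng.normal(size=5000)))        # ~1: independent
print(hellinger_affinity(x, x + 0.1 * rng.normal(5000)))   # much smaller
```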

Label Propagation for Learning with Label Proportions

Learning with Label Proportions (LLP) is the problem of recovering the underlying true labels given a dataset when the data is presented in the form of bags. This paradigm is particularly suitable in contexts where providing individual labels is expensive and label aggregates are more easily obtained. In the healthcare domain, it is a burden for a patient to keep a detailed diary of their daily routines, but often they will be amenable to provide higher level summaries of daily behavior. We present a novel and efficient graph-based algorithm that encourages local smoothness and exploits the global structure of the data, while preserving the 'mass' of each bag.
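
This is not the paper's algorithm; a hedged sketch of the two ingredients the abstract names: propagate soft labels over a similarity graph (local smoothness), then rescale within each bag so its total label 'mass' matches the given proportion.

```python
import numpy as np

def llp_propagate(X, bags, proportions, iters=50, sigma=1.0):
    d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))              # similarity graph
    P = W / W.sum(1, keepdims=True)                 # random-walk normalization
    f = np.array([proportions[b] for b in bags])    # init from bag proportions
    for _ in range(iters):
        f = P @ f                                   # smooth over the graph
        for b in set(bags):                         # restore each bag's mass
            idx = [i for i, bi in enumerate(bags) if bi == b]
            f[idx] *= proportions[b] * len(idx) / f[idx].sum()
        f = f.clip(0, 1)
    return f

X = np.array([[0.], [0.1], [0.2], [2.], [2.1], [2.2]])
bags = [0, 0, 0, 1, 1, 1]
print(llp_propagate(X, bags, {0: 1/3, 1: 2/3}).round(2))
```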

Why every GBDT speed benchmark is wrong

This article provides a comprehensive study of different ways to benchmark the speed of gradient boosted decision tree algorithms. We show the main problems with several straightforward ways to make benchmarks, explain why speed benchmarking is a challenging task, and provide a set of reasonable requirements for a benchmark to be fair and useful.

HAKD: Hardware Aware Knowledge Distillation

Despite recent developments, deploying deep neural networks on resource-constrained general-purpose hardware remains a significant challenge. There has been much work in developing methods for reshaping neural networks, usually with a focus on minimising total parameter count. These methods are typically developed in a hardware-agnostic manner and do not exploit hardware behaviour. In this paper we propose a new approach, Hardware Aware Knowledge Distillation (HAKD), which uses empirical observations of hardware behaviour to design efficient student networks which are then trained with knowledge distillation. This allows the trade-off between accuracy and performance to be managed explicitly. We have applied this approach across three platforms and evaluated it on two networks, MobileNet and DenseNet, on CIFAR-10. We show that HAKD outperforms Deep Compression and Fisher pruning in terms of size, accuracy and performance.
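
HAKD's student training uses knowledge distillation; as a reference point, here is the generic distillation loss (Hinton-style softened teacher targets plus true labels), not HAKD's full hardware-aware pipeline.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft term: match the teacher's temperature-softened distribution
    # (scaled by T^2 to keep gradient magnitudes comparable).
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    # Hard term: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s = torch.randn(8, 10, requires_grad=True)   # student outputs (batch, classes)
t = torch.randn(8, 10)                        # teacher outputs
y = torch.randint(0, 10, (8,))
print(distillation_loss(s, t, y))
```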

Multi-Multi-View Learning: Multilingual and Multi-Representation Entity Typing

Knowledge bases (KBs) are paramount in NLP. We employ multiview learning for increasing accuracy and coverage of entity type information in KBs. We rely on two metaviews: language and representation. For language, we consider high-resource and low-resource languages from Wikipedia. For representation, we consider representations based on the context distribution of the entity (i.e., on its embedding), on the entity’s name (i.e., on its surface form) and on its description in Wikipedia. The two metaviews language and representation can be freely combined: each pair of language and representation (e.g., German embedding, English description, Spanish name) is a distinct view. Our experiments on entity typing with fine-grained classes demonstrate the effectiveness of multiview learning. We release MVET, a large multiview – and, in particular, multilingual – entity typing dataset we created. Mono- and multilingual fine-grained entity typing systems can be evaluated on this dataset.

• Multi-scale Geometric Summaries for Similarity-based Sensor Fusion
• Instance Segmentation and Object Detection with Bounding Shape Masks
• Vehicle classification using ResNets, localisation and spatially-weighted pooling
• Multi-Stage Reinforcement Learning For Object Detection
• Hyper-Process Model: A Zero-Shot Learning algorithm for Regression Problems based on Shape Analysis
• Bottleneck Supervised U-Net for Pixel-wise Liver and Tumor Segmentation
• Finite-time Guarantees for Byzantine-Resilient Distributed State Estimation with Noisy Measurements
• Downsampling leads to Image Memorization in Convolutional Autoencoders
• Projecting Trouble: Light Based Adversarial Attacks on Deep Learning Classifiers
• Coherence Constraints in Facial Expression Recognition
• Strategies for Training Stain Invariant CNNs
• Characterization of Brain Cortical Morphology Using Localized Topology-Encoding Graphs
• A Proximal Zeroth-Order Algorithm for Nonconvex Nonsmooth Problems
• A Case for Object Compositionality in Deep Generative Models of Images
• Visions of a generalized probability theory
• Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning
• Stochastic temporal data upscaling using the generalized k-nearest neighbor algorithm
• From Machine to Machine: An OCT-trained Deep Learning Algorithm for Objective Quantification of Glaucomatous Damage in Fundus Photographs
• Differentiable Fine-grained Quantization for Deep Neural Network Compression
• Machine Learning Methods for Track Classification in the AT-TPC
• Dermatologist Level Dermoscopy Skin Cancer Classification Using Different Deep Learning Convolutional Neural Networks Algorithms
• Block Matching Frame based Material Reconstruction for Spectral CT
• Boosted Convolutional Neural Networks for Motor Imagery EEG Decoding with Multiwavelet-based Time-Frequency Conditional Granger Causality Analysis
• Implicit Modeling with Uncertainty Estimation for Intravoxel Incoherent Motion Imaging
• SOT-MRAM 300mm integration for low power and ultrafast embedded memories
• Experimental Investigation of Programmed State Stability in OxRAM Resistive Memories
• Fantom: A scalable framework for asynchronous distributed systems
• Chord Recognition in Symbolic Music: A Segmental CRF Model, Segment-Level Features, and Comparative Evaluations on Classical and Popular Music
• Effective Filtering for Multiscale Stochastic Dynamical Systems driven by Lévy processes
• Computing control invariant sets in high dimension is easy
• LoGAN: Generating Logos with a Generative Adversarial Neural Network Conditioned on color
• TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets
• Active Ranking with Subset-wise Preferences
• Analysis of Strategy and Spread of Russia-sponsored Content in the US in 2017
• Uncovering Complex Overlapping Pattern of Communities in Large-scale Social Networks
• The cumulative mass profile of the Milky Way as determined by globular cluster kinematics from Gaia DR2
• DeepLSR: Deep learning approach for laser speckle reduction
• Semiparametric Analysis of Competing Risks Data Under Missing Cause of Failure
• Efficiently measuring a quantum device using machine learning
• A Ramsey-type Theorem on the Max-Cut Value of $d$-Regular Graphs
• Language Modeling at Scale
• Approximating the Quadratic Transportation Metric in Near-Linear Time
• A Method to Construct $1$-Rotational Factorizations of Complete Graphs and Solutions to the Oberwolfach Problem
• On the Root solution to the Skorokhod embedding problem given full marginals
• Cyclic structure induced by load fluctuations in adaptive transportation networks
• Implosion of a pure death process
• Perturbation techniques for convergence analysis of proximal gradient method and other first-order algorithms via variational analysis
• Ramsey subsets of the space of infinite block sequences of vectors
• Explicit Boij-Soderberg theory of ideals from a graph isomorphism reduction
• Recognition of basic hand movements using Electromyography
• Change of variable formula for local time of continuous semimartingale
• Statistical mechanics of low-rank tensor decomposition
• A Fusion Approach for Multi-Frame Optical Flow Estimation
• A Statistical Approach to Adult Census Income Level Prediction
• A belief propagation algorithm based on domain decomposition
• Arithmetic progressions in the trace of Brownian motion in space
• Model Selection for Nonnegative Matrix Factorization by Support Union Recovery
• A Continuous-Time View of Early Stopping for Least Squares Regression
• Structured Domain Randomization: Bridging the Reality Gap by Context-Aware Synthetic Data
• Novel Adaptive Algorithms for Estimating Betweenness, Coverage and k-path Centralities
• Deblending galaxy superpositions with branched generative adversarial networks
• Classical pattern distributions in $\mathcal{S}_n(132)$ and $\mathcal{S}_n(123)$
• Comparative Evaluation of Tree-Based Ensemble Algorithms for Short-Term Travel Time Prediction
• Fast Computation of Steady-State Response for Nonlinear Vibrations of High-Degree-of-Freedom Systems
• Reproducing AmbientGAN: Generative models from lossy measurements
• Resource-Constrained Simultaneous Detection and Labeling of Objects in High-Resolution Satellite Images
• On the log-normality of the degree distribution in large homogeneous binary multiplicative attribute graph models
• End-to-End Diagnosis and Segmentation Learning from Cardiac Magnetic Resonance Imaging
• Interpreting Black Box Predictions using Fisher Kernels
• Delocalization of uniform graph homomorphisms from $\mathbb{Z}^2$ to $\mathbb{Z}$
• A Remark on the Arcsine Distribution and the Hilbert Transform
• Smoothed Online Optimization for Regression and Control
• Voltage Collapse Stabilization: A Game Theory Viewpoint
• A Binary Optimization Approach for Constrained K-Means Clustering
• Approximation of nonnegative systems by moving averages of fixed order
• Local Homology of Word Embeddings
• Bayesian Modeling of Nonlinear Poisson Regression with Artificial Neural Networks
• Sojourn times of Gaussian processes with trend
• A Deep-Learning-Based Fashion Attributes Detection Model
• Quadratic Backward Stochastic Volterra Integral Equations
• AUNet: Breast Mass Segmentation of Whole Mammograms
• Background Subtraction using Compressed Low-resolution Images
• Automatic Identification of Indicators of Compromise using Neural-Based Sequence Labelling
• Size Ramsey numbers of paths
• Resolving Referring Expressions in Images With Labeled Elements
• Nonconvex and Nonsmooth Sparse Optimization via Adaptively Iterative Reweighted Methods
• Data-driven Blockbuster Planning on Online Movie Knowledge Library
• Text Embeddings for Retrieval From a Large Knowledge Base
• Learned optimizers that outperform SGD on wall-clock and validation loss
• Exploiting Deep Representations for Neural Machine Translation
• Modeling Localness for Self-Attention Networks
• Multi-Head Attention with Disagreement Regularization
• Conjugate coupling induced symmetry breaking and quenched oscillations
• Fault Area Detection in Leaf Diseases using k-means Clustering
• Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
• Solving Poisson's Equation using Deep Learning in Particle Simulation of PN Junction
• Automated Evaluation of Semantic Segmentation Robustness for Autonomous Driving
• Niji: Bitcoin Bridge Utilizing Payment Channels
• Exact distance graphs of product graphs
• On the well-posedness of a class of McKean Feynman-Kac equations
• Solving Weakly-Convex-Weakly-Concave Saddle-Point Problems as Weakly-Monotone Variational Inequality
• The Langevin diffusion as a continuous-time model of animal movement and habitat selection
• Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models
• Numerical methods for piecewise deterministic Markov processes with boundary
• Incompatible double posets and double order polytopes
• DSFD: Dual Shot Face Detector
• Cross-Resolution Person Re-identification with Deep Antithetical Learning
• Universal Language Model Fine-Tuning with Subword Tokenization for Polish
• Volume Of Sub-level Sets Of Homogeneous Polynomials
• Textually Guided Ranking Network for Attentional Image Retweet Modeling
• Faster approximation algorithms for computing shortest cycles on weighted graphs
• Multistep Speed Prediction on Traffic Networks: A Graph Convolutional Sequence-to-Sequence Learning Approach with Attention Mechanism
• Extension of the Gradient Boosting Algorithm for Joint Modeling of Longitudinal and Time-to-Event data
• History by Diversity: Helping Historians search News Archives
• Discovering Entities with Just a Little Help from You
• Designing Search Tasks for Archive Search
• Learn to Code-Switch: Data Augmentation using Copy Mechanism on Language Modeling
• Algebraic solution of weighted minimax single-facility constrained location problems
• Hierarchical landscape of hard disk glasses
• A Maximum Edge-Weight Clique Extraction Algorithm Based on Branch-and-Bound
• Production facilities location computing the environmental pollution
• Publish-and-Flourish: decentralized co-creation and curation of scholarly content
• Uniform Exponential Stabilisation of Serially Connected Inhomogeneous Euler-Bernoulli Beams
• Accurate and efficient explicit approximations of the Colebrook flow friction equation based on the Wright-Omega function
• A Bag-of-Tasks Scheduler Tolerant to Temporal Failures in Clouds
• Learning color space adaptation from synthetic to real images of cirrus clouds
• Mask Propagation Network for Video Object Segmentation
• Estimating abundance from multiple sampling capture-recapture data via a multi-state multi-period stopover model
• Coarse-to-fine volumetric segmentation of teeth in Cone-Beam CT
• A Proof-Theoretic Approach to Scope Ambiguity in Compositional Vector Space Models
• Optimal Algorithm for Bayesian Incentive-Compatible
• Dental pathology detection in 3D cone-beam CT
• First and Second Order Shape Optimization based on Restricted Mesh Deformations
• Learning to Discriminate Noises for Incorporating External Information in Neural Machine Translation
• The MeMAD Submission to the IWSLT 2018 Speech Translation Task
• Generative adversarial networks and adversarial methods in biomedical image analysis
• Complexity, combinatorial positivity, and Newton polytopes
• G-SMOTE: A GMM-based synthetic minority oversampling technique for imbalanced learning
• Scalable Gaussian Processes on Discrete Domains
• Functional Inequalities for Weighted Gamma Distributions on the Space of Finite (Signed) Measures
• Multi-condition of stability for nonlinear stochastic non-autonomous delay differential equation
• Image-based Natural Language Understanding Using 2D Convolutional Neural Networks
• Simultaneous transmission of classical and quantum information under channel uncertainty and jamming attacks
• Multi-Agent Reinforcement Learning Based Resource Allocation for UAV Networks
• Effective extractive summarization using frequency-filtered entity relationship graphs
• Faithful orthogonal representations of graphs from partition logics
• A Deep Learning Mechanism for Efficient Information Dissemination in Vehicular Floating Content
• Notes on asymptotics of sample eigenstructure for spiked covariance models with non-Gaussian data
• Enumeration of $S$-omino towers and row-convex $k$-omino towers
• A localization method in Hamiltonian graph theory
• A Map Equation with Metadata: Varying the Role of Attributes in Community Detection
• Entropy in Quantum Information Theory — Communication and Cryptography
• Semi-supervised Target-level Sentiment Analysis via Variational Autoencoder
• The UAVid Dataset for Video Semantic Segmentation
• A recursively feasible and convergent Sequential Convex Programming procedure to solve non-convex problems with linear equality constraints
• Semantic Neutral Drift
• Boundary of the Range of a random walk and the Følner property
• Building and Querying Semantic Layers for Web Archives (Extended Version)
• The coset and stability rings
• Software Rejuvenation for Secure Tracking Control
• Learning Negotiating Behavior Between Cars in Intersections using Deep Q-Learning
• Multi-type branching processes with time dependent branching rates
• Noisy Blackbox Optimization with Multi-Fidelity Queries: A Tree Search Approach
• Design of Software Rejuvenation for CPS Security Using Invariant Sets
• Precipitation Nowcasting: Leveraging bidirectional LSTM and 1D CNN
• Statistical modeling of rates and trends in Holocene relative sea level
• Forecasting Individualized Disease Trajectories using Interpretable Deep Learning
• Posterior Convergence of Gaussian and General Stochastic Process Regression Under Possible Misspecifications
• Communities as Well Separated Subgraphs With Cohesive Cores: Identification of Core-Periphery Structures in Link Communities
• Sleep-like slow oscillations induce hierarchical memory association and synaptic homeostasis in thalamo-cortical simulations
• A stochastic sewing lemma and applications
• Neighbourhood Consensus Networks
• Between-Ride Routing for Private Transportation Services
• Spatiotemporal CNNs for Pornography Detection in Videos
• Toward an AI Physicist for Unsupervised Learning
