Hidden tree Markov models allow learning distributions for tree-structured data while being interpretable as nondeterministic automata. We provide a concise summary of the main approaches in the literature, focusing in particular on the causality assumptions introduced by the choice of a specific tree visit direction. We will then sketch a novel non-parametric generalization of the bottom-up hidden tree Markov model, with its interpretation as a nondeterministic tree automaton with infinite states.
Learning Tree Distributions by Hidden Markov Models
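For readers who have not met this model class, the bottom-up variant mentioned above conditions each node's hidden state on the hidden states of its children. In my own notation (a sketch of the standard factorization, not taken from the paper), with $x_u$ the observed label of node $u$, $h_u$ its hidden state and $ch_l(u)$ its $l$-th child,

$$P(x, h) \;=\; \prod_{u \in \mathrm{leaves}} P(h_u) \prod_{u \notin \mathrm{leaves}} P\big(h_u \mid h_{ch_1(u)}, \dots, h_{ch_{L_u}(u)}\big) \prod_{u} P(x_u \mid h_u),$$

and the distribution over observed trees follows by summing out the hidden states, $P(x) = \sum_h P(x, h)$. The non-parametric generalization presumably amounts to replacing the finite hidden-state alphabet with an unbounded one, which is where the "infinite states" interpretation comes from.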
“Fudged statistics on the Iraq War death toll are still circulating today”
Mike Spagat shares this story entitled, “Fudged statistics on the Iraq War death toll are still circulating today,” which discusses problems with a paper published in a scientific journal in 2006, and errors that a reporter inadvertently included in a recent news article. Spagat writes:
R Packages worth a look
Conditional Random Fields for Labelling Sequential Data in Natural Language Processing (crfsuite)
Wraps the ‘CRFsuite’ library https://…/crfsuite allowing users to fit a C …
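As a quick reminder of what the wrapped model does (this is the textbook linear-chain CRF, in my own notation rather than anything from the package docs): given an input sequence $x$ and a label sequence $y$,

$$p(y \mid x) \;=\; \frac{1}{Z(x)} \exp\Big( \sum_{t} \sum_{k} w_k \, f_k(y_{t-1}, y_t, x, t) \Big),$$

where the $f_k$ are feature functions on adjacent labels and the observations, the $w_k$ are learned weights, and $Z(x)$ normalizes over all possible label sequences. Fitting such a model to labelled token sequences (e.g. part-of-speech or entity tags) is what the package exposes from CRFsuite.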
R Packages worth a look
Covariate Assisted Principal (CAP) Regression for Covariance Matrix Outcomes (cap)
Performs Covariate Assisted Principal (CAP) Regression for covariance matrix outcomes. The method identifies the optimal projection direction which max …
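Roughly (my own paraphrase, not the package documentation): for subject $i$ with covariance-matrix outcome $\Sigma_i$ and covariates $x_i$, CAP regression seeks a projection direction $\gamma$ whose projected variance follows a log-linear model in the covariates,

$$\log\big(\gamma^{\top} \Sigma_i \gamma\big) = x_i^{\top} \beta,$$

estimating $\gamma$ and $\beta$ jointly and returning the direction most strongly associated with the covariates.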
Document worth reading: “An Analysis of Hierarchical Text Classification Using Word Embeddings”
Efficient distributed numerical word representation models (word embeddings) combined with modern machine learning algorithms have recently yielded considerable improvement on automatic document classification tasks. However, the effectiveness of such techniques has not yet been assessed for hierarchical text classification (HTC). This study investigates the application of those models and algorithms to this specific problem by means of experimentation and analysis. We trained classification models with prominent machine learning algorithm implementations—fastText, XGBoost, SVM, and Keras’ CNN—and notable word embedding generation methods—GloVe, word2vec, and fastText—on publicly available data, and evaluated them with measures specifically appropriate for the hierarchical context. FastText achieved an ${}_{LCA}F_1$ of 0.893 on a single-labeled version of the RCV1 dataset. An analysis indicates that using word embeddings and their flavors is a very promising approach for HTC.
An Analysis of Hierarchical Text Classification Using Word Embeddings
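For readers unfamiliar with the measure quoted above: the LCA-based F1 is typically computed by augmenting both the predicted and the true label sets with their ancestors up to the lowest common ancestor in the class hierarchy, and then taking ordinary precision, recall and F1 on the augmented sets (a sketch, in my notation):

$$P_{LCA} = \frac{|\hat{Y}_{aug} \cap Y_{aug}|}{|\hat{Y}_{aug}|}, \qquad R_{LCA} = \frac{|\hat{Y}_{aug} \cap Y_{aug}|}{|Y_{aug}|}, \qquad {}_{LCA}F_1 = \frac{2\, P_{LCA}\, R_{LCA}}{P_{LCA} + R_{LCA}}.$$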
Present each other’s posters
It seems that I’ll be judging a poster session next week. So this seems like a good time to repost this from 2009:
Distilled News
Adding strings in R
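The post itself is not excerpted here; as a minimal base-R sketch of the kind of trick such posts discuss (my own code, not necessarily the linked post's approach), strings are concatenated with paste0(), and one can even define a binary "+" that concatenates character vectors:

  # Base R: concatenate strings with paste0() / paste()
  paste0("Hello", ", ", "world")   # "Hello, world"

  # Optional: make binary `+` concatenate character vectors
  # (masks the default operator; numeric addition is passed through)
  "+" <- function(e1, e2) {
    if (is.character(e1) || is.character(e2)) paste0(e1, e2) else .Primitive("+")(e1, e2)
  }
  "Hello" + ", " + "world"         # "Hello, world"
  1 + 2                            # still 3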
Distilled News
Under the hood: Facebook Marketplace powered by artificial intelligence
Quick Significance Calculations for A/B Tests in R
Introduction
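The body of the post is not reproduced here; as a minimal sketch of the kind of calculation involved, a two-sample proportion test in base R already gives a quick significance check for an A/B test (the counts below are made up for illustration):

  # Hypothetical A/B test counts (illustrative only)
  conversions <- c(A = 120, B = 152)    # successes in each arm
  visitors    <- c(A = 2400, B = 2380)  # trials in each arm

  # Two-sample test of equal conversion rates (base R)
  prop.test(x = conversions, n = visitors)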
A Concise Explanation of Learning Algorithms with the Mitchell Paradigm
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
- Tom Mitchell, “Machine Learning” [1]
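Concretely, for a spam filter: T is the task of classifying incoming email as spam or not, P is the fraction of messages classified correctly, and E is a corpus of emails already labelled by users; the program has learned if exposure to more labelled email raises that fraction.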