Much has been said about how the game of Quidditch is ruined by its scoring system – specifically, that it makes no sense for the snitch to be worth 150 points and for catching it to end the game [1, 2, 3]. Most of these arguments revolve around the claim that it is nearly impossible to win a match of Quidditch without catching the snitch. Is this true? Let’s try to answer this question formally, using statistics and R simulations:
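A minimal sketch of the kind of simulation one might run (this is not the post’s code, and every rate below is an assumption): draw goal counts for both teams, hand the 150-point snitch to one team at random, and see how often the team without the snitch still wins.

# All lambdas and probabilities here are assumed purely for illustration.
set.seed(42)
n_matches <- 100000
goals_a  <- rpois(n_matches, lambda = 15)   # assumed mean number of goals, team A
goals_b  <- rpois(n_matches, lambda = 15)   # assumed mean number of goals, team B
snitch_a <- rbinom(n_matches, 1, 0.5)       # does team A catch the snitch? (50/50 assumed)
score_a  <- 10 * goals_a + 150 * snitch_a
score_b  <- 10 * goals_b + 150 * (1 - snitch_a)
# Proportion of matches won by the team that did NOT catch the snitch
mean(ifelse(snitch_a == 1, score_b > score_a, score_a > score_b))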
OneR – fascinating insights through simple rules
We already saw the power of the OneR package in the preceding post. Here we give some more examples to gain fascinating, often counter-intuitive, insights.
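For readers who skipped the earlier post, the basic workflow looks roughly like this (a sketch following the package’s documented interface; the choice of iris as the example dataset is mine, not the post’s):

library(OneR)
data <- optbin(iris)                  # optimal binning of the numeric predictors
model <- OneR(data, verbose = TRUE)   # one-rule model for the last column (Species)
summary(model)                        # the single rule that was learned
prediction <- predict(model, data)
eval_model(prediction, data)          # confusion matrix and accuracy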
RcppEigen 0.3.3.5.0
Another minor release 0.3.3.5.0 of RcppEigen arrived on CRAN today (and just went to Debian too), bringing support for Eigen 3.3.5 to R.
How to work with strings in base R – An overview of 20+ methods for daily use.
In this post in the R:case4base series, we look at string manipulation with base R and provide an overview of a wide range of functions for everyday string-processing needs.
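To give a flavour of what the overview covers, here are a few of the base R string functions that come up in daily work (a small sample, not the post’s full list):

s <- "  The quick brown Fox  "
nchar(s)                          # number of characters
trimws(s)                         # strip leading/trailing whitespace
toupper(s)                        # case conversion (see also tolower)
substr(s, 3, 7)                   # extract a substring by position
gsub(" +", " ", s)                # replace all regex matches
grepl("quick", s)                 # pattern detection
strsplit(trimws(s), " ")[[1]]     # split into words
paste("base", "R", sep = " ")     # concatenation
sprintf("pi is %.3f", pi)         # formatted output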
Distilled News
Case Study Example 1: An eCommerce Company Evaluation
Distilled News
RStudio 1.2 Preview: The Little Things
Polished statistical analysis chapters in evidence-based software engineering
I have completed the polishing/correcting/fiddling of the eight statistical-analysis chapters of my evidence-based software engineering book, and an updated draft pdf is now available (download here).
More Robust Monotonic Binning Based on Isotonic Regression
Since publishing the monotonic binning function based on isotonic regression (https://statcompute.wordpress.com/2017/06/15/finer-monotonic-binning-based-on-isotonic-regression), I’ve received some feedback from peers. A potential concern is that, although it improves granularity and predictability, the binning is too fine and might not generalize well to new data.
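For readers new to the idea, here is a bare-bones illustration of binning driven by isotonic regression, using base R’s isoreg() on simulated data (a sketch of the general idea, not the author’s function):

set.seed(1)
x <- rnorm(1000)                                # simulated predictor
y <- rbinom(1000, 1, plogis(-1 + 0.8 * x))      # simulated binary outcome
o <- order(x)
iso <- isoreg(x[o], y[o])                       # isotonic (monotone) fit of y on x
fit <- iso$yf                                   # monotone step function over sorted x
breaks <- c(-Inf, unique(x[o][which(diff(fit) > 0)]), Inf)
bins <- cut(x, breaks = breaks)                 # one bin per constant step of the fit
tapply(y, bins, mean)                           # event rates, monotone across bins

With many steps this produces exactly the kind of over-fine binning the feedback is concerned about, which is what motivates coarsening the result.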
lmer vs INLA for variance components
Just for fun, I decided to compare the estimates from lmer and INLA for the variance components of an LMM (this isn’t really something you would ordinarily do – comparing frequentist and Bayesian approaches). The code is below. A couple of plots are drawn showing the distribution of the hyperparameters (in this case, variances) from INLA, which is difficult to get from the frequentist framework (there’s a link in the code to a presentation by Douglas Bates detailing why you might not want to do it [the distribution is not symmetrical] and how you could do it – but it’s a lot of work).
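As a rough sketch of what such a comparison can look like (my illustration, not the author’s code, assuming lme4’s sleepstudy data and a working INLA installation – INLA is distributed outside CRAN):

library(lme4)
fm <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
VarCorr(fm)                                    # point estimates of the variance components

library(INLA)                                  # install from https://www.r-inla.org
im <- inla(Reaction ~ Days + f(Subject, model = "iid"), data = sleepstudy)
im$summary.hyperpar                            # hyperparameters reported as precisions
1 / im$summary.hyperpar[, "mean"]              # crude point comparison on the variance scale
# For the full posterior of a variance, transform the precision marginal,
# e.g. with inla.tmarginal(function(x) 1 / x, marginal).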
R Packages worth a look
Simulating Spreading Activation in a Network (spreadr): The notion of spreading activation is a prevalent metaphor in the cognitive sciences. This package provides the tools for cognitive scientists and psyc …