A long time ago (5 years) I wrote a blog post on tapply. Back then I was just getting into programming and I thought the possibilities of tapply were amazing. So it seems, do many others as it’s become one of my most viewed articles.
However, I never use tapply these days because the output is either a named vector or a matrix. Both of these require munging if I’m going to use the output. Three months after I wrote my tapply post a little package called dplyr was released. It took a while before it became integral to my workflow (I like to use as few packages as possible), but now I use it almost daily. The two biggest reasons are:
-
A data frame as output
-
Readable code.
Now we’re all living in the tidyverse, I’m a bit confused that so many folk still land on my blog looking for tapply. So this post updates/supersedes what I wrote previously. I’ve repeated the toy example I made before: