The Beta-Binomial model is the "hello world" of Bayesian statistics. That is, it's the first model you get to run, often before you even know what you are doing. There are many reasons for this:
- It only has one parameter, the underlying proportion of success, so it's easy to visualize and reason about.
- It's easy to come up with a scenario where it can be used, for example: "What is the proportion of patients that will be cured by this drug?"
- The model can be computed analytically (no need for any messy MCMC).
- It's relatively easy to come up with an informative prior for the underlying proportion.
- Most importantly: It's fun to see some results before diving into the theory!
That's why I also introduced the Beta-Binomial model as the first model in my DataCamp course Fundamentals of Bayesian Data Analysis in R, and quite a lot of people have asked me for the code I used to visualize the Beta-Binomial. Scroll to the bottom of this post if that's what you want; otherwise, here is how I visualized the Beta-Binomial in my course given two successes and four failures:
The function that produces these plots is called `prop_model` (`prop` as in proportion) and takes a vector of `TRUE`s and `FALSE`s representing successes and failures. The visualization is created using the excellent `ggridges` package (this type of plot used to be called a joyplot). Here's how you would use `prop_model` to produce the last plot in the animation above:
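Assuming the six data points arrived in this order (the exact ordering is a guess; only the counts, two successes and four failures, are given above), the call would look something like this:

```r
# Two successes and four failures, in the order they arrived (assumed ordering)
data <- c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE)
prop_model(data)
```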
The result is, I think, a quite nice visualization of how the model's knowledge about the parameter changes as data arrives. At `n = 0` the model doesn't know anything and, as the default prior states that it's equally likely the proportion of success is anything from 0.0 to 1.0, the result is a big, blue, and uniform square. As more data arrives the probability distribution becomes more concentrated, with the final posterior distribution at `n = 6`.
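Since the Beta-Binomial model is conjugate, each of those curves is just a Beta density: with the default uniform Beta(1, 1) prior, s successes and f failures give a Beta(1 + s, 1 + f) posterior. So the final `n = 6` curve is a Beta(3, 5), which you can sketch directly in base R (this snippet is only an illustration, not part of `prop_model`):

```r
# Posterior after 2 successes and 4 failures under a uniform Beta(1, 1) prior
theta <- seq(0, 1, by = 0.01)
posterior_density <- dbeta(theta, shape1 = 1 + 2, shape2 = 1 + 4)  # Beta(3, 5)
plot(theta, posterior_density, type = "l",
     xlab = "Underlying proportion of success", ylab = "Density")
```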
Two added features of `prop_model` are that it plots larger datasets somewhat gracefully and that it returns a random sample from the posterior which can be explored further. For example:
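Something along these lines (the exact simulation call is an assumption; all the text below tells us is that the true proportion used to simulate `big_data` was 0.75):

```r
# Simulate a larger dataset with a true success proportion of 0.75,
# then visualize it and keep the returned posterior sample
big_data <- sample(c(TRUE, FALSE), size = 100, replace = TRUE, prob = c(0.75, 0.25))
posterior <- prop_model(big_data)
```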
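The returned posterior sample can then be summarized with quantiles, presumably with something like:

```r
# Median and 95% credible interval from the posterior sample
quantile(posterior, c(0.025, 0.5, 0.975))
```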
```
 2.5%   50% 97.5%
 0.68  0.77  0.84
```
So here we calculated that the underlying proportion of success is most likely 0.77, with a 95% CI of [0.68, 0.84] (which nicely includes the correct value of 0.75 that we used to simulate `big_data`).
To be clear, `prop_model` is not intended as anything serious; it's just meant as a nice way of exploring the Beta-Binomial model when learning Bayesian statistics, maybe as part of a workshop exercise.
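What follows is a minimal sketch of a `prop_model`-style function rather than the exact code used for the animation above. It assumes the `ggplot2` and `ggridges` packages, a uniform Beta(1, 1) prior by default, and it invisibly returns a sample from the final posterior:

```r
library(ggplot2)
library(ggridges)

# A prop_model-style function: visualizes how a Beta-Binomial model updates its
# estimate of a proportion as TRUE/FALSE data points arrive, and invisibly
# returns a sample from the final posterior.
prop_model <- function(data, prior_prop = c(1, 1), n_draws = 10000) {
  data <- as.logical(data)
  proportion <- seq(0, 1, length.out = 100)

  # For larger datasets, only plot (at most) ~20 of the intermediate posteriors
  steps <- unique(round(seq(0, length(data), length.out = min(length(data), 20) + 1)))

  # One Beta density per amount of data seen: n = 0 (the prior) up to n = length(data)
  curves <- do.call(rbind, lapply(steps, function(i) {
    successes <- sum(data[seq_len(i)])
    failures  <- i - successes
    density <- dbeta(proportion,
                     prior_prop[1] + successes,
                     prior_prop[2] + failures)
    data.frame(label = paste0("n=", i),
               proportion = proportion,
               density = density / max(density))  # scale each curve to height 1
  }))
  curves$label <- factor(curves$label, levels = rev(paste0("n=", steps)))

  # Ridgeline plot: one ridge per distribution, with the prior at the top
  p <- ggplot(curves, aes(x = proportion, y = label, height = density)) +
    geom_density_ridges(stat = "identity", fill = "steelblue", color = "white") +
    labs(x = "Underlying proportion of success", y = "") +
    theme_minimal()
  print(p)

  # Sample from the final posterior so it can be explored further
  posterior <- rbeta(n_draws,
                     prior_prop[1] + sum(data),
                     prior_prop[2] + sum(!data))
  invisible(posterior)
}
```

Calling `prop_model(c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE))` on this sketch should give a static version of the plot shown earlier.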