[edited Dec 23, 2018]

## tl;dr

You too can make sideways Gaussian density curves within the tidyverse. Here’s how.

## Here’s the deal: I like making pictures.

Over the past several months, I’ve been slowly chipping away at John Kruschke’s *Doing Bayesian data analysis, Second Edition: A tutorial with R, JAGS, and Stan*. Kruschke has a unique plotting style. One of the quirks is once in a while he likes to express the results of his analyses in plots where he shows the data alongside density curves of the model-implied data-generating distributions. Here’s an example from chapter 19 (p. 563).

In this example, he has lifespan data (i.e., `Longevity`

) for fruit flies from five experimental conditions (i.e., `CompanionNumber`

). Those are the black circles. In this section of the chapter, he used a Gaussian multilevel model in which the mean value for `Longevity`

had a grand mean in addition to random effects for the five experimental conditions. Those sideways-turned blue Gaussians are his attempt to express the model-implied data generating distributions for each group.

If you haven’t gone through Kruschke’s text, you should know he relies on base R and all its loopy glory. If you carefully go through his code, you can reproduce his plots in that fashion. I’m a tidyverse man and prefer to avoid writing a `for()`

loop at all costs. At first, I tried to work with convenience functions within ggplot2 and friends, but only had limited success. After staring long and hard at Kruschke’s base code, I came up with a robust solution, which I’d like to share here.

In this post, we’ll practice making sideways Gaussians in the Kruschke style. We’ll do so with a simple intercept-only single-level model and then expand our approach to an intercept-only multilevel model like the one in the picture, above.

## My assumptions

For the sake of this post, I’m presuming you’re familiar with R, aware of the tidyverse, and have fit a Bayesian model or two. Yes. I admit that’s a narrow crowd. Sometimes the target’s a small one.

## We need data.

First, we need data. Here we’ll borrow code from Matthew Kay’s nice tutorial on how to use his great tidybayes package.

```
library(tidyverse)
set.seed(5)
n <- 10
n_condition <- 5
abc <-
tibble(condition = rep(letters[1:5], times = n),
response = rnorm(n * 5, mean = c(0, 1, 2, 1, -1), sd = 0.5))
```

The data structure looks like so.

`str(abc)`

```
## Classes 'tbl_df', 'tbl' and 'data.frame': 50 obs. of 2 variables:
## $ condition: chr "a" "b" "c" "d" ...
## $ response : num -0.42 1.692 1.372 1.035 -0.144 ...
```

With Kay’s code, we have `response`

values for five `condition`

s. All follow the normal distribution and share a common standard deviation. However, they differ in their group means.

```
abc %>%
group_by(condition) %>%
summarise(mean = mean(response) %>% round(digits = 2))
```

```
## # A tibble: 5 x 2
## condition mean
## <chr> <dbl>
## 1 a 0.18
## 2 b 1.01
## 3 c 1.87
## 4 d 1.03
## 5 e -0.94
```

Altogether, the data look like this.

```
theme_set(theme_grey() +
theme(panel.grid = element_blank()))
abc %>%
ggplot(aes(y = condition, x = response)) +
geom_point(shape = 1)
```

Let’s get ready to model.

## Just one intercept

If you’ve read this far, you know we’re going Bayesian. Let’s open up our favorite Bayesian modeling package, Bürkner’s brms.

`library(brms)`

For our first model, we’ll ignore the groups and just estimate a grand mean and a standard deviation. Relative to the scale of the `abc`

data, our priors are modestly regularizing.

```
fit1 <-
brm(data = abc,
response ~ 1,
prior = c(prior(normal(0, 1), class = Intercept),
prior(student_t(3, 0, 1), class = sigma)))
```

Extract the posterior draws and save them as a data frame we’ll call `post`

.

```
post <- posterior_samples(fit1)
glimpse(post)
```

```
## Observations: 4,000
## Variables: 3
## $ b_Intercept <dbl> 0.5987480, 0.5987480, 0.7314609, 0.6445826, 0.5576265, 0.7100258, 0.5000373, …
## $ sigma <dbl> 1.1177155, 1.1177155, 1.0092032, 1.0192299, 1.1700033, 1.0615338, 1.0849874, …
## $ lp__ <dbl> -76.96158, -76.96158, -77.46269, -77.09759, -77.26516, -77.10602, -77.20185, …
```

If all you want is a quick and dirty way to plot a few of the model-implied Gaussians from the simple model, you can just nest `stat_function()`

within `mapply()`

and tack on the original data in a `geom_jitter()`

.

```
# How many Gaussians would you like?
n_iter <- 20
tibble(response = c(-4, 4)) %>%
ggplot(aes(x = response)) +
mapply(function(mean, sd) {
stat_function(fun = dnorm,
args = list(mean = mean, sd = sd),
alpha = 1/2,
color = "steelblue")
},
# Enter means and standard deviations here
mean = post[1:n_iter, "b_Intercept"],
sd = post[1:n_iter, "sigma"]
) +
geom_jitter(data = abc, aes(y = -0.02),
height = .025, shape = 1, alpha = 2/3) +
scale_y_continuous(NULL, breaks = NULL)
```

This works pretty okay. But notice the orientation is the usual horizontal. Kruschke’s Gaussians were on their sides. If we switch out our `scale_y_continuous()`

line with `scale_y_reverse()`

and add in `coord_flip()`

, we’ll have it.

```
tibble(response = c(-4, 4)) %>%
ggplot(aes(x = response)) +
mapply(function(mean, sd) {
stat_function(fun = dnorm,
args = list(mean = mean, sd = sd),
alpha = 1/2,
color = "steelblue")
},
mean = post[1:n_iter, "b_Intercept"],
sd = post[1:n_iter, "sigma"]
) +
geom_jitter(data = abc, aes(y = -0.02),
height = .025, shape = 1, alpha = 2/3) +
scale_y_reverse(NULL, breaks = NULL) +
coord_flip()
```

Boom. It won’t always be this easy, though.

## Multiple intercepts

Since the `response`

values are from a combination of five `condition`

groups, we can fit a multilevel model to compute both the grand mean and the group-level deviations from the grand mean.

```
fit2 <-
brm(data = abc,
response ~ 1 + (1 | condition),
prior = c(prior(normal(0, 1), class = Intercept),
prior(student_t(3, 0, 1), class = sigma),
prior(student_t(3, 0, 1), class = sd)),
cores = 4)
```

“Wait. Whoa. I’m so confused”—you say. “What’s a multilevel model, again?” Read this book, or this book; start here on this lecture series; or even check out my project, starting with chapter 12.

Once again, extract the posterior draws and save them as a data frame, `post`

.

```
post <- posterior_samples(fit2)
str(post)
```

```
## 'data.frame': 4000 obs. of 9 variables:
## $ b_Intercept : num 0.0379 0.102 -0.048 -1.0037 0.0231 ...
## $ sd_condition__Intercept : num 2.08 2.13 1.94 1.99 2.21 ...
## $ sigma : num 0.473 0.468 0.475 0.648 0.527 ...
## $ r_condition[a,Intercept]: num -0.01062 0.00219 0.23764 1.40189 0.08825 ...
## $ r_condition[b,Intercept]: num 0.877 0.911 0.781 1.944 1.233 ...
## $ r_condition[c,Intercept]: num 1.46 1.62 2.03 3.15 1.74 ...
## $ r_condition[d,Intercept]: num 0.995 1.096 1.014 2.217 0.876 ...
## $ r_condition[e,Intercept]: num -1.111 -1.144 -1.005 0.234 -0.906 ...
## $ lp__ : num -51.8 -49.2 -50 -53.5 -48.7 ...
```

This is where our task becomes difficult. Now each level of `condition`

has its own mean estimate, which is a combination of the grand mean `b_Intercept`

and the group-specific deviation, `r_condition[a,Intercept]`

through `r_condition[e,Intercept]`

. If all we wanted to do was show the model-implied Gaussians for, say, `condition == a`

, that’d be a small extension of our last approach.

```
tibble(response = c(-4, 4)) %>%
ggplot(aes(x = response)) +
mapply(function(mean, sd) {
stat_function(fun = dnorm,
args = list(mean = mean, sd = sd),
alpha = 1/2,
color = "steelblue")
},
# Here's the small extension, part a
mean = post[1:n_iter, "b_Intercept"] + post[1:n_iter, "r_condition[a,Intercept]"],
sd = post[1:n_iter, "sigma"]
) +
# The small extension, part b:
geom_jitter(data = abc %>% filter(condition == "a"), aes(y = 0),
height = .025, shape = 1, alpha = 2/3) +
scale_y_reverse(NULL, breaks = NULL) +
coord_flip() +
labs(subtitle = "This is just for condition a")
```

The main thing we did was add to the definition of the `mean`

within `mapply()`

: `mean = post[1:n_iter, "b_Intercept"] + post[1:n_iter, "r_condition[a,Intercept]"]`

. Within `geom_jitter()`

, we also isolated the `condition == "a"`

cases with `filter()`

. Simple. However, it’s more of a pickle if we want multiple densities stacked atop/next to one another within the same plot.

Unfortunately, we can’t extend our `mapply(stat_function())`

method to the group-level estimates–at least not that I’m aware. But there are other ways. We’ll need a little help from `tidybayes::spread_draws()`

, about which you can learn more here.

```
library(tidybayes)
sd <-
fit2 %>%
spread_draws(b_Intercept, sigma, r_condition[condition,])
head(sd)
```

```
## # A tibble: 6 x 7
## # Groups: condition [5]
## .chain .iteration .draw b_Intercept sigma condition r_condition
## <int> <int> <int> <dbl> <dbl> <chr> <dbl>
## 1 1 1 1 0.0379 0.473 a -0.0106
## 2 1 1 1 0.0379 0.473 b 0.877
## 3 1 1 1 0.0379 0.473 c 1.46
## 4 1 1 1 0.0379 0.473 d 0.995
## 5 1 1 1 0.0379 0.473 e -1.11
## 6 1 2 2 0.102 0.468 a 0.00219
```

In our `sp`

tibble, we have much of the same information we’d get from `brms::posterior_samples()`

, but in the long format with respect to the random effects for `condition`

. Also notice that each row is indexed by the chain, iteration, and draw number. Among those, `.draw`

is the column that corresponds to a unique row like what we’d get from `brms::posterior_samples()`

. This is the index that ranges from 1 to the number of chains multiplied by the number of post-warmup iterations (i.e., default 4000 in our case).

But we need to wrangle a bit. Within the `expand()`

function, we’ll select the columns we’d like to keep within the `nesting()`

function and then expand the tibble by adding a sequence of `response`

values ranging from -4 to 4, for each. This sets us up to use the `dnorm()`

function in the next line to compute the density for each of those `response`

values based on 20 unique normal distributions for each of the five `condition`

groups. “Why 20?” Because we need some reasonably small number and 20’s the one Kruschke tended to use in his text and because, well, we set `filter(.draw < 21)`

. But choose whatever number you like.

The difficulty, however, is that all of these densities will have a minimum value of around 0 and all will be on the same basic scale. So we need a way to serially shift the density values up the y-axis in such a way that they’ll be sensibly separated by group. As far as I can figure, this’ll take us a couple steps. For the first step, we’ll create an intermediary variable, `g`

, with which we’ll arbitrarily assign each of our five groups an integer index ranging from 0 to 4.

The second step is tricky. There we use our `g`

integers to sequentially shift the density values up. Since our `g`

value for `a == 0`

, those we’ll keep 0 as their baseline. As our `g`

value for `b == 1`

, the baseline for those will now increase by 1. And so on for the other groups. But we still need to do a little more fiddling. What we want is for the maximum values of the density estimates to be a little lower than the baselines of the ones one grouping variable up. That is, we want the maximum values for the `a`

densities to fall a little bit below 1 on the y-axis. It’s with the `* .75 / max(density)`

part of the code that we accomplish that task. If you want to experiment with more or less room between the top and bottom of each density, play around with increasing/decreasing that .75 value.

```
sd <-
sd %>%
filter(.draw < 21) %>%
expand(nesting(.draw, b_Intercept, sigma, condition, r_condition),
response = seq(from = -4, to = 4, length.out = 200)) %>%
mutate(density = dnorm(response, mean = b_Intercept + r_condition, sd = sigma),
g = recode(condition,
a = 0,
b = 1,
c = 2,
d = 3,
e = 4)) %>%
mutate(density = g + density * .75 / max(density))
glimpse(sd)
```

```
## Observations: 20,000
## Variables: 8
## Groups: condition [5]
## $ .draw <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ b_Intercept <dbl> 0.03787325, 0.03787325, 0.03787325, 0.03787325, 0.03787325, 0.03787325, 0.037…
## $ sigma <dbl> 0.4734374, 0.4734374, 0.4734374, 0.4734374, 0.4734374, 0.4734374, 0.4734374, …
## $ condition <chr> "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a…
## $ r_condition <dbl> -0.0106207, -0.0106207, -0.0106207, -0.0106207, -0.0106207, -0.0106207, -0.01…
## $ response <dbl> -4.000000, -3.959799, -3.919598, -3.879397, -3.839196, -3.798995, -3.758794, …
## $ density <dbl> 1.435677e-16, 2.945670e-16, 6.000399e-16, 1.213514e-15, 2.436566e-15, 4.85713…
## $ g <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
```

Since we’ll now be using the same axis for both the densities and the five `condition`

groups, we’ll need to add a `density`

column to our `abc`

data.

```
abc <-
abc %>%
mutate(density = recode(condition,
a = 0,
b = 1,
c = 2,
d = 3,
e = 4))
```

Time to plot.

```
sd %>%
ggplot(aes(x = response, y = density)) +
# here we make our density lines
geom_line(aes(group = interaction(.draw, g)),
alpha = 1/2, size = 1/3, color = "steelblue") +
# use the original data for the jittered points
geom_jitter(data = abc,
height = .05, shape = 1, alpha = 2/3) +
scale_y_continuous("condition",
breaks = 0:4,
labels = letters[1:5])
```

Now we’re rolling. Let’s make a cosmetic adjustment. Recall that the full range of the normal distribution spans from \(-\infty\) to \(\infty\). At a certain point, it’s just not informative to show the left and right tails. If you look back up at our motivating example, you’ll note Kruschke’s densities stopped well before trailing off into the tails. If you look closely to the code from his text, you’ll see he’s just showing the inner 95-percentile range for each. To follow suit, we can compute those ranges with `qnorm()`

.

```
sd <-
sd %>%
mutate(ll = qnorm(.025, mean = b_Intercept + r_condition, sd = sigma),
ul = qnorm(.975, mean = b_Intercept + r_condition, sd = sigma))
```

Now we have our lower- and upper-level points for each iteration, we can limit the ranges of our Gaussians with `filter()`

.

```
sd %>%
filter(response > ll,
response < ul) %>%
ggplot(aes(x = response, y = density)) +
geom_line(aes(group = interaction(.draw, g)),
alpha = 1/2, size = 1/3, color = "steelblue") +
geom_jitter(data = abc,
height = .05, shape = 1, alpha = 2/3) +
scale_y_continuous("condition",
breaks = 0:4,
labels = letters[1:5])
```

Oh man, just look how sweet that is. Although I prefer our current method, another difference between it and Kruschke’s example is all of his densities are the same relative height. In all our plots so far, though, the densities differ by their heights. We’ll need a slight adjustment in our `sd`

workflow for that. All we need to do is insert a `group_by()`

statement between the two `mutate()`

lines.

```
sd <-
sd %>%
mutate(density = dnorm(response, mean = b_Intercept + r_condition, sd = sigma),
g = recode(condition,
a = 0,
b = 1,
c = 2,
d = 3,
e = 4)) %>%
# here's the new line
group_by(.draw) %>%
mutate(density = g + density * .75 / max(density))
# now plot
sd %>%
filter(response > ll,
response < ul) %>%
ggplot(aes(x = response, y = density)) +
geom_line(aes(group = interaction(.draw, g)),
alpha = 1/2, size = 1/3, color = "steelblue") +
geom_jitter(data = abc,
height = .05, shape = 1, alpha = 2/3) +
scale_y_continuous("condition",
breaks = 0:4,
labels = letters[1:5])
```

Nice. “But wait!”, you say. “We wanted our Gaussians to be on their sides.” We can do that in at least two ways. At this point, the quickest way is to use our `scale_y_reverse() + coord_flip()`

combo from before.

```
sd %>%
filter(response > ll,
response < ul) %>%
ggplot(aes(x = response, y = density)) +
geom_line(aes(group = interaction(.draw, g)),
alpha = 1/2, size = 1/3, color = "steelblue") +
geom_jitter(data = abc,
height = .05, shape = 1, alpha = 2/3) +
scale_y_reverse("condition",
breaks = 0:4,
labels = letters[1:5]) +
coord_flip()
```

Another way to get those sideways Gaussians is to alter our `sd`

data workflow. The main differene is this time we change the original `mutate(density = g + density * .75 / max(density))`

line to `mutate(density = g - density * .75 / max(density))`

. In case you missed it, the only difference is we changed the `+`

to a `-`

.

```
sd <-
sd %>%
# step one: starting fresh
mutate(density = dnorm(response, mean = b_Intercept + r_condition, sd = sigma)) %>%
group_by(.draw) %>%
# step two: now SUBTRACTING density from g within the equation
mutate(density = g - density * .75 / max(density))
```

Now in our global `aes()`

statement in the plot, we put `density`

on the x and `response`

on the y. We need to take a few other subtle steps:

- Switch out
`geom_line()`

for`geom_path()`

(see here). - Drop the
`height`

argument within`geom_jitter()`

for`width`

. - Switch out
`scale_y_continuous()`

for`scale_x_continuous()`

.

Though totally not necessary, we’ll add a little something extra by coloring the Gaussians by their means.

```
sd %>%
filter(response > ll,
response < ul) %>%
ggplot(aes(x = density, y = response)) +
geom_path(aes(group = interaction(.draw, g),
color = b_Intercept + r_condition),
alpha = 1/2, size = 1/3, show.legend = F) +
geom_jitter(data = abc,
width = .05, shape = 1, alpha = 2/3) +
scale_x_continuous("condition",
breaks = 0:4,
labels = letters[1:5]) +
scale_color_viridis_c(option = "A", end = .92)
```

There you have it–Kruschke-style sideways Gaussians for your model plots.

## Afterward

After releasing the initial version of this post, some of us had a lively twitter discussion on how to improve the code.

Ah, hrm. Took some digging, but it looks like negative density + setting `min_height = NA` (otherwise negative values are cut off) might work pic.twitter.com/gmF9kpo2T7

— Matthew Kay (@mjskay) December 22, 2018

Part of that discussion had to do with the possibility of using functions from Claus Wilke’s great ggridges package. After some great efforts, especially from Matthew Kay, we came up with solutions. In this section, we’ll cover them in some detail.

First, here’s a more compact way to prepare the data for the plot.

```
abc %>%
distinct(condition) %>%
add_fitted_draws(fit2, n = 20, dpar = c("mu", "sigma")) %>%
mutate(lower = qnorm(.025, mean = mu, sd = sigma),
upper = qnorm(.975, mean = mu, sd = sigma)) %>%
mutate(response = map2(lower, upper, seq, length.out = 200)) %>%
mutate(density = pmap(list(response, mu, sigma), dnorm)) %>%
unnest() %>%
group_by(.draw) %>%
mutate(density = density * .75 / max(density)) %>%
glimpse()
```

```
## Warning: `cols` is now required.
## Please use `cols = c(response, density)`
```

```
## Observations: 20,000
## Variables: 12
## Groups: .draw [20]
## $ condition <chr> "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a"…
## $ .row <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ .chain <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ .iteration <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ .draw <int> 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24…
## $ .value <dbl> 0.2041854, 0.2041854, 0.2041854, 0.2041854, 0.2041854, 0.2041854, 0.2041854, 0…
## $ mu <dbl> 0.2041854, 0.2041854, 0.2041854, 0.2041854, 0.2041854, 0.2041854, 0.2041854, 0…
## $ sigma <dbl> 0.5729852, 0.5729852, 0.5729852, 0.5729852, 0.5729852, 0.5729852, 0.5729852, 0…
## $ lower <dbl> -0.918845, -0.918845, -0.918845, -0.918845, -0.918845, -0.918845, -0.918845, -…
## $ upper <dbl> 1.327216, 1.327216, 1.327216, 1.327216, 1.327216, 1.327216, 1.327216, 1.327216…
## $ response <dbl> -0.9188450, -0.9075582, -0.8962715, -0.8849847, -0.8736980, -0.8624113, -0.851…
## $ density <dbl> 0.1098804, 0.1141834, 0.1186089, 0.1231581, 0.1278322, 0.1326322, 0.1375591, 0…
```

This could use some walking out. With the first two lines, we made a \(5 \times 1\) tibble containing the five levels of `condition`

, `a`

through `f`

. The `add_fitted_draws()`

function comes from tidybayes. The first argument took our brms model fit, `fit2`

. With the `n`

argument, we indicated we just wanted `20`

draws. With `dpar`

, we requested distributional regression parameters in the output. In our case, those were the \(\mu\) and \(\sigma\) values for each level of `condition`

. Here’s what that looks like.

```
abc %>%
distinct(condition) %>%
add_fitted_draws(fit2, n = 20, dpar = c("mu", "sigma")) %>%
head()
```

```
## # A tibble: 6 x 8
## # Groups: condition, .row [1]
## condition .row .chain .iteration .draw .value mu sigma
## <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl>
## 1 a 1 NA NA 55 0.163 0.163 0.458
## 2 a 1 NA NA 126 0.509 0.509 0.513
## 3 a 1 NA NA 404 0.472 0.472 0.717
## 4 a 1 NA NA 813 0.387 0.387 0.537
## 5 a 1 NA NA 1111 0.154 0.154 0.515
## 6 a 1 NA NA 1218 0.204 0.204 0.495
```

Next, we established the lower- and upper-bounds bounds for the density lines, which were 95% intervals in this example. Within the second `mutate()`

function, we used the `purrr::map2()`

function to feed those two values into the first two arguments of the `seq()`

function. Those arguments, recall, are `from`

and `to`

. We then hard coded `200`

into the `length.out`

argument. As a result, we turned our regular old tibble into a nested tibble. In each row of our new `response`

column, we now have a \(200 \times 1\) data frame containing the `seq()`

output. If you’re new to nested data structures, I recommend checking out Hadley Wickham’s *Managing many models with R*.

```
abc %>%
distinct(condition) %>%
add_fitted_draws(fit2, n = 20, dpar = c("mu", "sigma")) %>%
mutate(lower = qnorm(.025, mean = mu, sd = sigma),
upper = qnorm(.975, mean = mu, sd = sigma)) %>%
mutate(response = map2(lower, upper, seq, length.out = 200)) %>%
head()
```

```
## # A tibble: 6 x 11
## # Groups: condition, .row [1]
## condition .row .chain .iteration .draw .value mu sigma lower upper response
## <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <list>
## 1 a 1 NA NA 346 0.278 0.278 0.521 -0.743 1.30 <dbl [200]>
## 2 a 1 NA NA 696 0.193 0.193 0.498 -0.783 1.17 <dbl [200]>
## 3 a 1 NA NA 734 0.320 0.320 0.709 -1.07 1.71 <dbl [200]>
## 4 a 1 NA NA 1214 0.0686 0.0686 0.559 -1.03 1.16 <dbl [200]>
## 5 a 1 NA NA 1313 0.229 0.229 0.539 -0.828 1.29 <dbl [200]>
## 6 a 1 NA NA 1348 0.329 0.329 0.589 -0.826 1.48 <dbl [200]>
```

Much as the `purrr::map2()`

function allowed us to iterate over two arguments, the `purrr::pmap()`

function will allow us to iterate over an arbitrary number of arguments. In the case of our third `mutate()`

function, we’ll iterate over the first three arguments of the `dnorm()`

function. In case you forgot, those arguments are `x`

, `mean`

, and `sd`

, respectively. Within our `list()`

, we indicated we wanted to insert into them the `response`

, `mu`

, and `sigma`

values. This returns the desired `density`

values. Since our `map2()`

and `pmap()`

operations returned a nested tibble, we then followed them up with the `unnest()`

function to make it easier to access the results.

Before `unnest`

ing, our nested tibble had 100 observations. After `unnest()`

, we converted it to the long format, resulting in \(100 \times 200 = 20,000\) observations.

```
abc %>%
distinct(condition) %>%
add_fitted_draws(fit2, n = 20, dpar = c("mu", "sigma")) %>%
mutate(lower = qnorm(.025, mean = mu, sd = sigma),
upper = qnorm(.975, mean = mu, sd = sigma)) %>%
mutate(response = map2(lower, upper, seq, length.out = 200)) %>%
mutate(density = pmap(list(response, mu, sigma), dnorm)) %>%
unnest() %>%
glimpse()
```

```
## Warning: `cols` is now required.
## Please use `cols = c(response, density)`
```

```
## Observations: 20,000
## Variables: 12
## Groups: condition, .row [5]
## $ condition <chr> "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a"…
## $ .row <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ .chain <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ .iteration <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ .draw <int> 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79…
## $ .value <dbl> -0.05142892, -0.05142892, -0.05142892, -0.05142892, -0.05142892, -0.05142892, …
## $ mu <dbl> -0.05142892, -0.05142892, -0.05142892, -0.05142892, -0.05142892, -0.05142892, …
## $ sigma <dbl> 0.5490717, 0.5490717, 0.5490717, 0.5490717, 0.5490717, 0.5490717, 0.5490717, 0…
## $ lower <dbl> -1.12759, -1.12759, -1.12759, -1.12759, -1.12759, -1.12759, -1.12759, -1.12759…
## $ upper <dbl> 1.024732, 1.024732, 1.024732, 1.024732, 1.024732, 1.024732, 1.024732, 1.024732…
## $ response <dbl> -1.1275896, -1.1167739, -1.1059582, -1.0951425, -1.0843269, -1.0735112, -1.062…
## $ density <dbl> 0.1064434, 0.1106119, 0.1148989, 0.1193059, 0.1238338, 0.1284836, 0.1332564, 0…
```

Hopefully, our last two lines look familiar. We `group_by(.draw)`

just like in previous examples. However, our final `mutate()`

line is a little simpler than in previous versions. Before we had to make that intermediary variable, `g`

. Because we intend to plot these data with help from ggridges, we no longer have need for `g`

. You’ll see. But the upshot is the only reason we’re adding this last `mutate()`

line is to scale all the Gaussians to have the same maximum height the way Kruschke did.

```
afd <-
abc %>%
distinct(condition) %>%
add_fitted_draws(fit2, n = 20, dpar = c("mu", "sigma")) %>%
mutate(lower = qnorm(.025, mean = mu, sd = sigma),
upper = qnorm(.975, mean = mu, sd = sigma)) %>%
mutate(response = map2(lower, upper, seq, length.out = 200)) %>%
mutate(density = pmap(list(response, mu, sigma), dnorm)) %>%
unnest() %>%
group_by(.draw) %>%
mutate(density = density * .75 / max(density))
```

```
## Warning: `cols` is now required.
## Please use `cols = c(response, density)`
```

`glimpse(afd)`

```
## Observations: 20,000
## Variables: 12
## Groups: .draw [20]
## $ condition <chr> "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a"…
## $ .row <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ .chain <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ .iteration <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ .draw <int> 292, 292, 292, 292, 292, 292, 292, 292, 292, 292, 292, 292, 292, 292, 292, 292…
## $ .value <dbl> -0.302783, -0.302783, -0.302783, -0.302783, -0.302783, -0.302783, -0.302783, -…
## $ mu <dbl> -0.302783, -0.302783, -0.302783, -0.302783, -0.302783, -0.302783, -0.302783, -…
## $ sigma <dbl> 0.5187913, 0.5187913, 0.5187913, 0.5187913, 0.5187913, 0.5187913, 0.5187913, 0…
## $ lower <dbl> -1.319595, -1.319595, -1.319595, -1.319595, -1.319595, -1.319595, -1.319595, -…
## $ upper <dbl> 0.7140294, 0.7140294, 0.7140294, 0.7140294, 0.7140294, 0.7140294, 0.7140294, 0…
## $ response <dbl> -1.3195953, -1.3093761, -1.2991569, -1.2889377, -1.2787185, -1.2684993, -1.258…
## $ density <dbl> 0.1098804, 0.1141834, 0.1186089, 0.1231581, 0.1278322, 0.1326322, 0.1375591, 0…
```

Let’s open ggridges

`library(ggridges)`

Note how contrary to before, we set the global y axis to our `condition`

grouping variable. It’s within the `geom_ridgeline()`

function that we now specify `height = density`

. Other than that, the main thing to point out is you might want to adjust the `ylim`

parameters. Otherwise the margins aren’t the best.

```
afd %>%
ggplot(aes(x = response, y = condition)) +
geom_ridgeline(aes(height = density, group = interaction(condition, .draw)),
fill = NA, size = 1/3, color = adjustcolor("steelblue", alpha.f = 1/2)) +
geom_jitter(data = abc,
height = .05, shape = 1, alpha = 2/3) +
coord_cartesian(ylim = c(1.25, 5.5))
```

“But I wanted my Gaussians tipped to the left!”, you say. Yep, we can do that, too. Three things: First, we’ll want to adjust the `height`

parameter to `-density`

. We want our Gaussians to extend under their baselines. Along with that, we need to include `min_height = NA`

. Finally, we’ll switch out `coord_cartesian()`

for good old `coord_flip()`

. And you can adjust your `ylim`

parameters as desired.

```
afd %>%
ggplot(aes(x = response, y = condition)) +
geom_ridgeline(aes(height = -density, group = interaction(condition, .draw)),
fill = NA, size = 1/3, color = adjustcolor("steelblue", alpha.f = 1/2),
min_height = NA) +
geom_jitter(data = abc,
height = .05, shape = 1, alpha = 2/3) +
coord_flip(ylim = c(0.5, 4.75))
```

I think it’s important to note that I’ve never met any of the people who helped me with this project. Academic twitter, man–it’s a good place to be.

## Session info

`sessionInfo()`

```
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
##
## Random number generation:
## RNG: Mersenne-Twister
## Normal: Inversion
## Sample: Rounding
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggridges_0.5.2 tidybayes_2.0.1.9000 brms_2.12.0 Rcpp_1.0.3
## [5] forcats_0.4.0 stringr_1.4.0 dplyr_0.8.4 purrr_0.3.3
## [9] readr_1.3.1 tidyr_1.0.2 tibble_2.1.3 ggplot2_3.2.1
## [13] tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] colorspace_1.4-1 ellipsis_0.3.0 rsconnect_0.8.16
## [4] markdown_1.1 base64enc_0.1-3 fs_1.3.1
## [7] rstudioapi_0.10 farver_2.0.3 rstan_2.19.2
## [10] svUnit_0.7-12 DT_0.11 fansi_0.4.1
## [13] mvtnorm_1.0-12 lubridate_1.7.4 xml2_1.2.2
## [16] bridgesampling_0.8-1 knitr_1.26 shinythemes_1.1.2
## [19] bayesplot_1.7.1 jsonlite_1.6.1 broom_0.5.3
## [22] dbplyr_1.4.2 shiny_1.4.0 compiler_3.6.2
## [25] httr_1.4.1 backports_1.1.5 assertthat_0.2.1
## [28] Matrix_1.2-18 fastmap_1.0.1 lazyeval_0.2.2
## [31] cli_2.0.1 later_1.0.0 htmltools_0.4.0
## [34] prettyunits_1.1.1 tools_3.6.2 igraph_1.2.4.2
## [37] coda_0.19-3 gtable_0.3.0 glue_1.3.1
## [40] reshape2_1.4.3 cellranger_1.1.0 vctrs_0.2.2
## [43] nlme_3.1-142 blogdown_0.17 crosstalk_1.0.0
## [46] xfun_0.12 ps_1.3.0 rvest_0.3.5
## [49] mime_0.8 miniUI_0.1.1.1 lifecycle_0.1.0
## [52] gtools_3.8.1 zoo_1.8-7 scales_1.1.0
## [55] colourpicker_1.0 hms_0.5.3 promises_1.1.0
## [58] Brobdingnag_1.2-6 parallel_3.6.2 inline_0.3.15
## [61] shinystan_2.5.0 yaml_2.2.1 gridExtra_2.3
## [64] StanHeaders_2.19.0 loo_2.2.0 stringi_1.4.6
## [67] dygraphs_1.1.1.6 pkgbuild_1.0.6 rlang_0.4.5
## [70] pkgconfig_2.0.3 matrixStats_0.55.0 evaluate_0.14
## [73] lattice_0.20-38 rstantools_2.0.0 htmlwidgets_1.5.1
## [76] labeling_0.3 tidyselect_1.0.0 processx_3.4.1
## [79] plyr_1.8.5 magrittr_1.5 bookdown_0.17
## [82] R6_2.4.1 generics_0.0.2 DBI_1.1.0
## [85] pillar_1.4.3 haven_2.2.0 withr_2.1.2
## [88] xts_0.12-0 abind_1.4-5 modelr_0.1.5
## [91] crayon_1.3.4 arrayhelpers_1.0-20160527 utf8_1.1.4
## [94] rmarkdown_2.0 grid_3.6.2 readxl_1.3.1
## [97] callr_3.4.1 threejs_0.3.3 reprex_0.3.0
## [100] digest_0.6.23 xtable_1.8-4 httpuv_1.5.2
## [103] stats4_3.6.2 munsell_0.5.0 viridisLite_0.3.0
## [106] shinyjs_1.1
```