Do you continually substitute “%>%” for “+” when switching between data wrangling and data visualization? I’ve got just the solution for you!


Count me among the people who continually type a pipe instead of a plus, and vice versa, when writing a lot of code. Sir Hadley has basically shut the door on ever switching ggplot to magrittr pipes, and I don’t blame him. But he can’t stop me from doing whatever the heck I want.

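In the following code, I took the sample code from the ggplot help and modified it to use magrittr. The trick is a pair of small wrapper functions that take the plot as their first argument and add a layer to it, which is exactly the shape “%>%” expects.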

library(ggplot2)
library(magrittr)

# Pipe-friendly wrappers: take the plot as the first argument and add a layer
geom_point_p = function(p, ...) {
    p + geom_point(...)
}

geom_errorbar_p = function(p, ...) {
    p + geom_errorbar(...)
}

df = data.frame(
  gp = factor(rep(letters[1:3], each = 10)),
  y = rnorm(30)
)
ds = plyr::ddply(df, "gp", plyr::summarise, mean = mean(y), sd = sd(y))

# The summary data frame ds is used to plot larger red points on top
# of the raw data. Note that we don't need to supply `data` or `mapping`
# in each layer because the defaults from ggplot() are used.
ggplot(df, aes(gp, y)) %>%
  geom_point_p() %>%
  geom_point_p(data = ds, aes(y = mean), colour = 'red', size = 3)


# Same plot as above, declaring only the data frame in ggplot().
# Note how the x and y aesthetics must now be declared in
# each geom_point() layer.
ggplot(df) %>%
  geom_point_p(aes(gp, y)) %>%
  geom_point_p(data = ds, aes(gp, mean), colour = 'red', size = 3)


# Alternatively we can fully specify the plot in each layer. This
# is not useful here, but can be clearer when working with complex
# multi-dataset graphics
ggplot() %>%
  geom_point_p(data = df, aes(gp, y)) %>%
  geom_point_p(data = ds, aes(gp, mean), colour = 'red', size = 3) %>%
  geom_errorbar_p(
    data = ds,
    aes(gp, mean, ymin = mean - sd, ymax = mean + sd),
    colour = 'red',
    width = 0.4
  )


I may have to roll this into a package at some point.
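
Writing one wrapper per geom by hand gets old fast, so the core of such a package would probably be a small function factory. Here is a minimal sketch of that idea. The name `pipeable()` is purely hypothetical (it is not part of ggplot2 or magrittr), and the example uses the built-in `mtcars` data rather than the data frames above so it runs on its own.

```r
library(ggplot2)
library(magrittr)

# Hypothetical helper: turn any function whose result can be added to a
# plot with `+` (geoms, labs, themes, ...) into a pipe-friendly version
# that takes the plot as its first argument.
pipeable = function(layer_fun) {
  force(layer_fun)  # evaluate now so the closure captures the right function
  function(p, ...) {
    p + layer_fun(...)
  }
}

# Generate wrappers instead of writing each one by hand
geom_point_p    = pipeable(geom_point)
geom_errorbar_p = pipeable(geom_errorbar)
labs_p          = pipeable(labs)

# Build a plot with pipes only
p = ggplot(mtcars, aes(wt, mpg)) %>%
  geom_point_p(colour = 'red') %>%
  labs_p(title = 'Piped ggplot')

print(p)
```

Because the wrapper just forwards `...` to the original function, every argument the underlying geom accepts keeps working unchanged. The trade-off is that the generated functions lose the original argument names in their signatures, so autocomplete and help lookups would need extra work in a real package.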