Code Club S02E08: Combining Plots


Learning objectives

  • Continue to practice creating plots with ggplot
  • Use faceting to divide a plot into multiple panels according to some variable.
  • Arrange multiple plots of different types on a single figure.


1 – Intro

We’ll continue with our theme on plotting by exploring some options for arranging multiple plots on a single figure. A couple scenarios where you might want to do this…

1.) You create a plot that needs to be subdivided according to some variable, possibly because accounting for that variable is important for the interpretation, or maybe there’s just too much on one plot and it helps to split the data up according to some factor.

2.) You have a series of different plots that all address some related question, maybe each in a slightly different way, and you want to present them all in one figure.

We’ll take a couple approaches during this and next week’s sessions to deal with these two scenarios. Today we’ll look at some ggplot functions like facet_wrap() and facet_grid() that allow us to easily deal with scenario 1. Then in the next session we’ll try a separate package, patchwork, that offers one good option for scenario 2.

Like in previous sessions, we’ll use some packages from the tidyverse and also the palmerpenguins dataset. If you haven’t installed either of those yet, you can do so with the following commands. If you installed them previously, you can just run the latter of the commands (library()) to load them for the current session.

install.packages("tidyverse")
install.packages("palmerpenguins")

And now let’s preview/explore the penguins dataset just to remind ourselves of what’s in there…

head(penguins)

#> # A tibble: 6 x 8
#>   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex  
#>   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
#> 1 Adelie  Torge…           39.1          18.7              181        3750 male 
#> 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
#> 3 Adelie  Torge…           40.3          18                195        3250 fema…
#> 4 Adelie  Torge…           NA            NA                 NA          NA NA   
#> 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
#> 6 Adelie  Torge…           39.3          20.6              190        3650 male 
#> # … with 1 more variable: year <int>

summary(penguins)

#>       species          island    bill_length_mm  bill_depth_mm  
#>  Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10  
#>  Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60  
#>  Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30  
#>                                  Mean   :43.92   Mean   :17.15  
#>                                  3rd Qu.:48.50   3rd Qu.:18.70  
#>                                  Max.   :59.60   Max.   :21.50  
#>                                  NA's   :2       NA's   :2      
#>  flipper_length_mm  body_mass_g       sex           year     
#>  Min.   :172.0     Min.   :2700   female:165   Min.   :2007  
#>  1st Qu.:190.0     1st Qu.:3550   male  :168   1st Qu.:2007  
#>  Median :197.0     Median :4050   NA's  : 11   Median :2008  
#>  Mean   :200.9     Mean   :4202                Mean   :2008  
#>  3rd Qu.:213.0     3rd Qu.:4750                3rd Qu.:2009  
#>  Max.   :231.0     Max.   :6300                Max.   :2009  
#>  NA's   :2         NA's   :2

2 – Faceting

Let’s start by revisiting some plots Michael Broe created in his intro to ggplot a couple sessions ago. He was using the plots to investigate whether a relationship exists between the variables bill length and bill depth in these penguins. A scatterplot with a line of best fit from ggplot

penguins %>% 
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm")

#> `geom_smooth()` using formula 'y ~ x'

#> Warning: Removed 2 rows containing non-finite values (stat_smooth).

#> Warning: Removed 2 rows containing missing values (geom_point).

As Michael pointed out previously, mapping an additional aesthetic (color) to the variable species helps us see a relationship a little more clearly…

penguins %>% 
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
  geom_point() +
  geom_smooth(method = "lm")

#> `geom_smooth()` using formula 'y ~ x'

#> Warning: Removed 2 rows containing non-finite values (stat_smooth).

#> Warning: Removed 2 rows containing missing values (geom_point).

The color aesthetic partitions the data according to some variable (in this case, species), and here helps add important information to the visualization. An alternative might be to plot the data in separate panels, with each corresponding to a different species. We can do that with either of two functions from ggplot, facet_wrap() or facet_grid(). Let’s start with facet_wrap(). This is added as an additional layer to the plot, and indicates one or more variables that will be used to split the data into separate panels. I’ll facet here by species.

penguins %>% 
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap("species")

#> `geom_smooth()` using formula 'y ~ x'

#> Warning: Removed 2 rows containing non-finite values (stat_smooth).

#> Warning: Removed 2 rows containing missing values (geom_point).

The effect here is similar to what we did with adding a color aesthetic to the species variable earlier - it allows us to evaluate the relationship between bill length and bill depth for each species separately.


Breakout Rooms: Faceting

Exercise 1: Analyze Adelie Penguins By Island

Try analyzing the relationship between bill length and bill depth for just the Adelie penguins (the only species with observations from each of the three islands). For this species, try faceting by island. Does the relationship seem to be consistent across all islands?

Hint (click here)


Use filter() to select out Adelie penguins, then create a plot similar to the one in the example, but facet on island instead of species

Solution (click here)
penguins %>%
  filter(species == "Adelie") %>%
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap("island")

#> `geom_smooth()` using formula 'y ~ x'

#> Warning: Removed 1 rows containing non-finite values (stat_smooth).

#> Warning: Removed 1 rows containing missing values (geom_point).

Exercise 2a: Multiple Facets

Now building on the plot you just created for Adelie Penguins, what if you wanted to facet on not just island, but a combination of island and sex? Give it a try.

Hint (click here)


facet_wrap() accepts a character vector of column names. Use the c() function to provide a vector with two column names.

Solution (click here)
penguins %>% 
  filter(species == "Adelie") %>%
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap(c("island", "sex"))

#> `geom_smooth()` using formula 'y ~ x'

#> Warning: Removed 1 rows containing non-finite values (stat_smooth).

#> Warning: Removed 1 rows containing missing values (geom_point).

Exercise 2b: Multiple Facets

There are some facets coming through in that last plot that are based on NA’s. Try getting rid of all observations that include missing data before creating the plot.

Hint (click here)


Use the drop_na() function to remove observations with NA before calling ggplot.

Solution (click here)
penguins %>% 
  drop_na() %>%
  filter(species == "Adelie") %>%
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap(c("island", "sex"))

#> `geom_smooth()` using formula 'y ~ x'

Exercise 3: Axis Scales

Now let’s go back to the full dataset where we faceted by species. The code we used (with the drop_na function added), along with its associated plot, are below…

penguins %>% 
  drop_na() %>%
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap("species")

#> `geom_smooth()` using formula 'y ~ x'

Use the help page for facet_wrap to look in to the scales option. Try changing the value of this option to see what effect it has on the plot.

Hint 1 (click here)


Use ?facet_wrap to get the help page for the function, and find information about the scales option.

Hint 2 (click here)


Within the facet_wrap() function, set scales = “free_y”.

Solution (click here)
penguins %>% 
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap("species", scales = "free")

#> `geom_smooth()` using formula 'y ~ x'

#> Warning: Removed 2 rows containing non-finite values (stat_smooth).

#> Warning: Removed 2 rows containing missing values (geom_point).



In next week’s session, we’ll use facet_grid(), which has some similarities to facet_wrap(), and then check out the patchwork package, which gives you more control over how multiple plots are combined in a single figure.


Mike Sovic
Mike Sovic
Bioinformatician at CAPS