GGPLOT2

library(tidyverse)
## -- Attaching packages -------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.0.0       v purrr   0.2.5  
## v tibble  2.1.1       v dplyr   0.8.0.1
## v tidyr   0.8.1       v stringr 1.3.1  
## v readr   1.1.1       v forcats 0.3.0
## Warning: package 'tibble' was built under R version 3.5.3
## Warning: package 'dplyr' was built under R version 3.5.3
## -- Conflicts ----------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Recap from Yesterday

Wrangling Data

  • “verbs” from the tidyverse
      • select()
      • filter()
      • group_by()
      • summarise()
outreach <- read_csv("../data/outreach.csv")
## Parsed with column specification:
## cols(
##   .default = col_integer(),
##   wday = col_character(),
##   temperature = col_double(),
##   lactate = col_double()
## )
## See spec(...) for full column specifications.
head(outreach[, 1:5])
## # A tibble: 6 x 5
##   hospital patient dead28 icu_accept icu_admit
##      <int>   <int>  <int>      <int>     <int>
## 1        1    2750      1          0         0
## 2        1    2297      1          1         1
## 3        1    3782      0          1         1
## 4        1    2337      0          0         1
## 5        1    1020      0          0         0
## 6        1    4852      0          0         0

Are patients sicker at weekends?

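# The comparison already returns TRUE/FALSE, so no ifelse() is needed
outreach$weekend <- outreach$wday == "Sat" | outreach$wday == "Sun"

# Keep only patients accepted to ICU
outreach_f <- filter(outreach, icu_accept == 1)
# Keep only the columns of interest
outreach_s <- select(outreach_f, age, male, weekend, sofa_score)
# Summarise the SOFA score separately for weekdays and weekends
outreach_g <- group_by(outreach_s, weekend)
outreach_x <- summarise(outreach_g, mean.sofa = mean(sofa_score),
                        sd.sofa = sd(sofa_score))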

outreach_x
## # A tibble: 2 x 3
##   weekend mean.sofa sd.sofa
##   <lgl>       <dbl>   <dbl>
## 1 FALSE        4.02    2.41
## 2 TRUE         4.13    2.30

A better way?

outreach %>%
  filter(icu_accept == 1) %>%
  select(age, male, weekend, sofa_score) %>%
  group_by(weekend) %>%
  summarise(mean.sofa = mean(sofa_score),
            sd.sofa = sd(sofa_score))
## # A tibble: 2 x 3
##   weekend mean.sofa sd.sofa
##   <lgl>       <dbl>   <dbl>
## 1 FALSE        4.02    2.41
## 2 TRUE         4.13    2.30
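
The pipe passes whatever is on its left as the first argument of the function on its right, so the piped version is the same computation without the intermediate objects. A minimal sketch of that equivalence follows. The `toy` table and its values are made up for illustration and are not taken from outreach.csv.

```r
library(dplyr)

# Hypothetical toy data standing in for the outreach table
toy <- tibble(
  icu_accept = c(1, 1, 0, 1, 1),
  weekend    = c(TRUE, FALSE, FALSE, TRUE, FALSE),
  sofa_score = c(4, 2, 7, 6, 3)
)

# Step by step, one intermediate object per verb
toy_f  <- filter(toy, icu_accept == 1)
toy_g  <- group_by(toy_f, weekend)
nested <- summarise(toy_g, mean.sofa = mean(sofa_score))

# The same steps chained with the pipe
piped <- toy %>%
  filter(icu_accept == 1) %>%
  group_by(weekend) %>%
  summarise(mean.sofa = mean(sofa_score))

# Both routes produce the same group means
all.equal(nested$mean.sofa, piped$mean.sofa)
```

The piped form reads top to bottom in the order the steps happen, which is the main reason to prefer it.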

The Grammar of Graphics

Building Layers

  • The Data
      • The thing we are plotting
  • Mapping Aesthetics
      • How we are mapping the data to a visual dimension
  • Geometric Objects
      • How the mapping is presented to us

The Data

  • Should be “tidy”
  • Think carefully about unit of observation
outreach %>%
  ggplot()

# Equivalent call without the pipe:
# ggplot(data = outreach)
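
Why the unit of observation matters: the same verbs from yesterday can change what one row means before the data ever reaches ggplot(). A rough sketch, assuming a made-up `patients` table (the names and values here are hypothetical, not from outreach.csv):

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: one row per patient
patients <- tibble(
  hospital   = c(1, 1, 1, 2, 2, 3),
  sofa_score = c(4, 2, 6, 3, 5, 7)
)

# Summarising changes the unit of observation to one row per hospital
hospitals <- patients %>%
  group_by(hospital) %>%
  summarise(mean.sofa = mean(sofa_score))

nrow(patients)   # one row per patient
nrow(hospitals)  # one row per hospital

# Either table can start a plot; what gets drawn depends on what a row represents
p_patients  <- patients  %>% ggplot()
p_hospitals <- hospitals %>% ggplot()
```

A plot built on `patients` describes individual patients; a plot built on `hospitals` describes hospitals. Decide which one you mean before adding any other layer.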

Mapping Aesthetics

  • This is how the data is “mapped” to a visual dimension
  • We are most familiar with x and y mappings
  • Others might include:
      • Size
      • Shape
      • Colour/Fill
outreach %>%
  ggplot(mapping = aes(x = sofa_score, fill = weekend))

# Equivalent call without the pipe:
# ggplot(data = outreach,
#        mapping = aes(x = sofa_score, fill = weekend))
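
At this stage the plot knows which columns go where, but nothing is drawn inside the axes because no geom has been added yet. A small sketch on made-up data (`toy` is hypothetical) that peeks at the plot object to confirm this. Reading `p$mapping` and `p$layers` relies on how ggplot2 stores plots internally, so treat it as an inspection aid rather than a documented interface.

```r
library(ggplot2)

# Hypothetical toy data; values are made up
toy <- data.frame(
  sofa_score = c(2, 3, 4, 4, 5, 6, 7, 3),
  weekend    = c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Data plus aesthetic mapping, but no geom yet
p <- ggplot(data = toy, mapping = aes(x = sofa_score, fill = weekend))

names(p$mapping)   # the aesthetics that have been mapped
length(p$layers)   # the number of geom layers added so far
```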

Geometric Objects

  • Geoms determine how the mapped data is drawn
      • Points, lines, bars etc.
outreach %>%
  ggplot(mapping = aes(x = sofa_score, fill = weekend)) +
  geom_density()

# Equivalent call without the pipe:
# ggplot(data = outreach,
#        mapping = aes(x = sofa_score, fill = weekend)) +
#   geom_density()
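
Because the geom is a separate layer, it can be swapped while the data and mapping stay the same. A minimal sketch on made-up data (`toy` is hypothetical). The `alpha` and `binwidth` values are illustrative choices, not settings from the original slides.

```r
library(ggplot2)

# Hypothetical toy data; values are made up
toy <- data.frame(
  sofa_score = c(2, 3, 4, 4, 5, 6, 7, 3, 5, 4),
  weekend    = rep(c(TRUE, FALSE), times = 5)
)

# Shared first two layers: data and aesthetic mapping
base <- ggplot(data = toy, mapping = aes(x = sofa_score, fill = weekend))

# Same data, same mapping, two different geoms
# alpha makes the fill semi-transparent so overlapping groups stay visible
p_density   <- base + geom_density(alpha = 0.5)
p_histogram <- base + geom_histogram(binwidth = 1, position = "identity",
                                     alpha = 0.5)

p_density
p_histogram
```

Storing the shared part in `base` is only a convenience; writing the full ggplot() call twice gives the same plots.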

Review of layers

  1. data
  2. aesthetic mapping
  3. geoms

Let’s Play

outreach %>%
  ggplot(mapping = aes(x = news_score,
                       y = sofa_score)) +
  geom_point()

outreach %>%
  ggplot(mapping = aes(x = news_score,
                       y = sofa_score)) +
  geom_jitter()
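
Why jitter? Both scores are whole numbers, so many patients land on exactly the same spot and geom_point() draws them on top of each other. Jittering adds a little random noise so the stacked points become visible. A sketch on made-up scores (`toy` is hypothetical; the `width`, `height` and `seed` values are illustrative):

```r
library(ggplot2)

# Hypothetical integer-valued scores; several patients share the same pair
toy <- data.frame(
  news_score = c(3, 3, 3, 5, 5, 7, 7, 7, 7, 9),
  sofa_score = c(2, 2, 2, 4, 4, 6, 6, 6, 6, 8)
)

base <- ggplot(toy, aes(x = news_score, y = sofa_score))

p_point  <- base + geom_point()
# A fixed seed makes the random noise, and so the plot, reproducible
p_jitter <- base +
  geom_jitter(position = position_jitter(width = 0.2, height = 0.2, seed = 1))

# Count the distinct positions each geom actually draws
nrow(unique(layer_data(p_point)[, c("x", "y")]))
nrow(unique(layer_data(p_jitter)[, c("x", "y")]))
```

The first count is smaller than the number of rows because identical points overlap; the jittered count is higher because the noise separates them. Keep the jitter small enough that a point is not mistaken for a neighbouring score.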

Exercise

  • There are many different aesthetics (x, y, col, size, shape etc.)
  • There are many different geoms
  • Check out the cheat sheet and see if you can construct your own plot
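
One possible starting point, sketched on made-up data (`toy` is hypothetical): map a third variable to colour and shape, and set a fixed point size outside aes().

```r
library(ggplot2)

# Hypothetical toy data; values are made up
toy <- data.frame(
  news_score = c(3, 5, 7, 9, 4, 6),
  sofa_score = c(2, 4, 6, 8, 3, 5),
  weekend    = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# colour and shape are mapped to a column, so they vary with the data;
# size is set outside aes(), so every point gets the same size
p <- ggplot(toy, aes(x = news_score, y = sofa_score,
                     colour = weekend, shape = weekend)) +
  geom_point(size = 3)

p
```

Try swapping geom_point() for another geom, or moving `size` inside aes() and mapping it to a column.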