Just Enough Statistics

Disclaimer

We can’t teach all of statistics in one hour.
Many of the examples here are not strictly correct, but are used to demonstrate a point.

Before we start… We need some fake ages

library(tidyverse)
RCT <- read.csv("http://bit.ly/2XzvW47")

The big picture

Statistics

We aspire to discern some truth from the universe.
We take measurements, but cannot measure all patients in a population.
Therefore we have to take a sample, and take measurements from this sample.
We then apply mathematical functions on these raw measurements to try and discern truth.
f(raw measurements) -> Statistics

Our RCT using this analogy

Our RCT attempts to answer a research question by collecting data from a sample of a population.
Data collection consists in measuring the values of several variables for each member of the sample.

Statistics broadly classified

Descriptive statistics

Describe the central tendencies
Describe the spread
Describe the proportions
Summarise the data

Inferential statistics

Deduce conclusions about the population from the sample
Test the reliability of those conclusions

Variables

Variables may be continuous or discrete
Consider these:
- age
- random, satisfaction

Descriptive Statistics

Distributions

hist(RCT$age)

hist(RCT$ps12)

Do our variables looks like this?

Means, medians and modes

Mean: sum of all datapoints / number of datapoints
Median: middle datapoint
Mode: most common datapoint

Have a go with these variables: age, ps12, move24h

Have a go

You can use the functions mean() and median(), for example:

mean(RCT$ps12, na.rm = TRUE)

## [1] 1.40625

median(RCT$ps12, na.rm = TRUE)

## [1] 1

Measuring ‘spread’

For normally distibuted data, use standard deviation
For non-normally distributed data, use interquartile range

IQR(RCT$age, na.rm = TRUE)

## [1] 22

quantile(RCT$age, na.rm = TRUE, probs = c(0.25, 0.75))

## 25% 75% 
##  55  77

sd(RCT$age, na.rm = TRUE)

## [1] 14.55191

Calculate it all at once

summary(RCT)

##        X               id             recruit        age        gender
##  Min.   : 0.00   Min.   : 1.00   1/1/2014 : 1   Min.   :29.00   f:25  
##  1st Qu.:15.75   1st Qu.:16.75   1/10/2014: 1   1st Qu.:55.00   F:38  
##  Median :31.50   Median :32.50   1/11/2014: 1   Median :70.50   M: 1  
##  Mean   :31.50   Mean   :32.50   1/12/2014: 1   Mean   :66.27         
##  3rd Qu.:47.25   3rd Qu.:48.25   1/13/2014: 1   3rd Qu.:77.00         
##  Max.   :63.00   Max.   :64.00   1/14/2014: 1   Max.   :95.00         
##                                  (Other)  :58   NA's   :4             
##    random        ps0h             ps3h            ps12      
##  drain:35   Min.   :0.0000   Min.   :0.000   Min.   :0.000  
##  skin :29   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.000  
##             Median :0.0000   Median :1.000   Median :1.000  
##             Mean   :0.1905   Mean   :1.125   Mean   :1.406  
##             3rd Qu.:0.0000   3rd Qu.:2.000   3rd Qu.:2.000  
##             Max.   :3.0000   Max.   :5.000   Max.   :5.000  
##             NA's   :1                                       
##      ps24h          ps168h          move12h         move24h      
##  Min.   :0.00   Min.   :0.0000   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:0.00   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000  
##  Median :1.00   Median :0.0000   Median :1.000   Median :1.0000  
##  Mean   :1.25   Mean   :0.6275   Mean   :1.078   Mean   :0.9375  
##  3rd Qu.:2.00   3rd Qu.:1.0000   3rd Qu.:2.000   3rd Qu.:1.2500  
##  Max.   :5.00   Max.   :3.0000   Max.   :5.000   Max.   :5.0000  
##                 NA's   :13                                       
##     move168h         tylenol    codeine      oramorph 
##  Min.   :0.0000   8g     :31   0    :56   0      :47  
##  1st Qu.:0.0000   4g     :15   10mg : 1   10mg   : 7  
##  Median :0.0000   0      : 5   120mg: 1   20mg   : 4  
##  Mean   :0.6078   2g     : 4   210mg: 1   50mg   : 2  
##  3rd Qu.:1.0000   3g     : 3   30mg : 1   30mg   : 1  
##  Max.   :4.0000   1g     : 2   60mg : 4   5mg    : 1  
##  NA's   :13       (Other): 4              (Other): 2  
##               other         los        
##  0               :34   Min.   : 0.000  
##  Diclofenac 225mg:10   1st Qu.: 2.000  
##  Diclofenac 300mg: 2   Median : 3.000  
##  Diclofenac 75mg : 2   Mean   : 3.431  
##  Ibuprofen 800mg : 2   3rd Qu.: 4.000  
##  Morphine 20mg   : 2   Max.   :10.000  
##  (Other)         :12   NA's   :13      
##                                           los_reason       satisfaction
##                                                :12               :12   
##  Comorbidities- recurrent PE                   : 2   Excellent   :13   
##  Continued fluid through drain                 :47   Good        :29   
##  Continued fluid through drain and bowel stasis: 1   Poor        : 3   
##  Pain due to spinal mets                       : 1   satisfactory: 6   
##  Urinary incontinence                          : 1   Satisfactory: 1   
##                                                                        
##     fake_age     binary_satisfaction
##  Min.   :43.98   Min.   :0.0000     
##  1st Qu.:63.13   1st Qu.:0.0000     
##  Median :76.99   Median :1.0000     
##  Mean   :73.49   Mean   :0.6562     
##  3rd Qu.:83.01   3rd Qu.:1.0000     
##  Max.   :92.50   Max.   :1.0000     
##

Inferential statistics

Testing (here be dragons)

best_option <- “Ask a statistician”
pragmatic_option <- “Do it yourself, carefully and simply”

Think about

The hypothesis you want to test
Where your observations come from
The type of variables you are dealing with
The assumptions that different tests use
Interpreting the test results

Hypothesis 1

‘Mean age differs between the two treatment arms’

What sort of variables are these?
What is the independent variable?
What is the dependent variable?
What is the null hypothesis?
Which test should we use?

Unpaired t-test: Assumptions

Values within each group are independent
The two groups are independent of one another
(Values in each group should be approximately normally-distributed)

Let’s look at our data…

hist(RCT[RCT$random == "drain", ]$age, , xlim=c(10, 100), ylim=c(0, 20), col = "#FF000080")
hist(RCT[RCT$random == "skin", ]$age, col = "#0000FF80", add = TRUE)

Because we don’t really have differences in age, we needed to create some fake data.

hist(RCT[RCT$random == "drain", ]$fake_age, xlim=c(30, 100), ylim=c(0, 20), col = "#FF000080")
hist(RCT[RCT$random == "skin", ]$fake_age, col = "#0000FF80", add = TRUE)

Let’s run the test!

Note that the ‘spread’ of the groups is different - a problem for the standard t test, but R takes care of this for us behind the scenes (it gives us Welch’s instead of Student’s t-test).

t.test(RCT$fake_age ~ RCT$random)

## 
##  Welch Two Sample t-test
## 
## data:  RCT$fake_age by RCT$random
## t = -9.6394, df = 48.021, p-value = 8.275e-13
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -21.19970 -13.88222
## sample estimates:
## mean in group drain  mean in group skin 
##            65.54566            83.08662

Hypothesis 2

‘Patient satisfaction after axillary node dissection depends on treatment arm.’

Hypothesis 2

‘Patient satisfaction after axillary node dissection depends on treatment arm.’

What sort of variables are these?
What is the independent variable?
What is the dependent variable?
What is the null hypothesis?
Which test should we use?

Chi-squared test for independence: Assumptions

The two groups are independent of one another
Count in each cell of the table should be 5 or more

Let’s look at our data

tbl <- table(RCT$random, RCT$binary_satisfaction)

tbl

##        
##          0  1
##   drain  8 27
##   skin  14 15

Let’s run the test!

chisq.test(tbl)

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  tbl
## X-squared = 3.4854, df = 1, p-value = 0.06191

Hypothesis 3

‘Postoperative pain scores at 12hrs depends on treatment arm.’

Hypothesis 3

‘Postoperative pain scores at 12hrs depends on treatment arm.’

What sort of variables are these?
What is the independent variable?
What is the dependent variable?
What is the null hypothesis?
Which test should we use?

Mann-Whitney U test: Assumptions

AKA Wilcoxon Rank Sum test
Dependent variable is either continuous or ordinal
Independent variable is categorical
Observations in each group are independent
Distributions of dependent variable in each group are the same

Let’s run the test!

wilcox.test(RCT$ps12 ~ RCT$random, conf.int = TRUE)

## Warning in wilcox.test.default(x = c(0L, 2L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, :
## cannot compute exact p-value with ties

## Warning in wilcox.test.default(x = c(0L, 2L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, :
## cannot compute exact confidence intervals with ties

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  RCT$ps12 by RCT$random
## W = 259, p-value = 0.0005341
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  -1.9999351 -0.9999614
## sample estimates:
## difference in location 
##              -1.000061

Extending this further

Table One

Most research studies have Table 1 describing patient baseline characteristics

install.packages(tableone)

library(tableone)
CreateTableOne(data = RCT, 
               vars = c("age", "ps0h", "los", "satisfaction"),
               strata = "random")

##                   Stratified by random
##                    drain         skin          p      test
##   n                   35            29                    
##   age (mean (sd))  67.50 (14.04) 64.65 (15.32)  0.458     
##   ps0h (mean (sd))  0.09 (0.29)   0.31 (0.71)   0.101     
##   los (mean (sd))   3.23 (1.55)   3.71 (2.15)   0.357     
##   satisfaction (%)                              0.266     
##                        4 (11.4)      8 (27.6)             
##      Excellent         9 (25.7)      4 (13.8)             
##      Good             18 (51.4)     11 (37.9)             
##      Poor              1 ( 2.9)      2 ( 6.9)             
##      satisfactory      2 ( 5.7)      4 (13.8)             
##      Satisfactory      1 ( 2.9)      0 ( 0.0)

Modelling

Beyond hypothesis testing, modelling tries to explain the data generation process.
R can be used for any number of regression models, e.g. linear regression

Linear regression

Do patients who have a higher pain score at 3hrs have a longer Length of Stay?

plot(RCT$los ~ RCT$ps3h)

Jittering for overplotting

plot(jitter(RCT$los) ~ jitter(RCT$ps3h))

linear_model <- lm(los ~ ps3h, data = RCT)
summary(linear_model)

## 
## Call:
## lm(formula = los ~ ps3h, data = RCT)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.7703 -1.0648 -0.0648  0.7297  6.2297 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.0648     0.3473   8.825 1.08e-11 ***
## ps3h          0.3527     0.2313   1.525    0.134    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.79 on 49 degrees of freedom
##   (13 observations deleted due to missingness)
## Multiple R-squared:  0.04532,    Adjusted R-squared:  0.02584 
## F-statistic: 2.326 on 1 and 49 DF,  p-value: 0.1336

Logistic regression

You might think that patients have different satisfaction levels by age

plot(RCT$binary_satisfaction ~ RCT$age)

logit_model <- glm(binary_satisfaction ~ age, 
                   data = RCT, 
                   family = binomial)
summary(logit_model)

## 
## Call:
## glm(formula = binary_satisfaction ~ age, family = binomial, data = RCT)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7452  -1.3640   0.8001   0.9520   1.0715  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  2.25316    1.41651   1.591    0.112
## age         -0.02324    0.02043  -1.138    0.255
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 76.382  on 59  degrees of freedom
## Residual deviance: 75.006  on 58  degrees of freedom
##   (4 observations deleted due to missingness)
## AIC: 79.006
## 
## Number of Fisher Scoring iterations: 4

Picking the right statistical tests

UCLA Which test?

A gentle (and pretty!) intro to probability/stats

Seeing theory

Statistical tests as Linear models

Common statistical tests are linear models

Just Enough Statistics

Disclaimer

Before we start… We need some fake ages

The big picture

Statistics

Our RCT using this analogy

Statistics broadly classified

Descriptive statistics

Inferential statistics

Variables

Descriptive Statistics

Distributions

Do our variables looks like this?

Means, medians and modes

Have a go

Measuring ‘spread’

Calculate it all at once

Inferential statistics

Testing (here be dragons)

Think about

Hypothesis 1

Unpaired t-test: Assumptions

Let’s look at our data…

Let’s run the test!

Hypothesis 2

Hypothesis 2

Chi-squared test for independence: Assumptions

Let’s look at our data

Let’s run the test!

Hypothesis 3

Hypothesis 3

Mann-Whitney U test: Assumptions

Let’s run the test!

Extending this further

Table One

Modelling

Linear regression

Jittering for overplotting

Logistic regression

Picking the right statistical tests

A gentle (and pretty!) intro to probability/stats

Statistical tests as Linear models

Questions