Unpaired Two-Samples T-test in R

This article has been updated; you are now consulting an old release of it.

Introduction

The independent t-test (also called the unpaired t-test) is used to compare the means of two unrelated groups of samples. The aim of this article is to show you how to compute an independent-samples t-test with R software. The t-test formula is described here.

A simplified format of the R function to use is:

t.test(x, y)

x and y are two numeric vectors of data values to compare.

t.test function is described in detail here.
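For orientation, the sketch below spells out the arguments that the simplified call leaves at their default values. The vectors a and b are made-up illustrative values, not data from this article.

```r
# Hypothetical illustrative vectors (not the article's data)
a <- c(5.1, 4.9, 6.2, 5.8, 5.5)
b <- c(6.8, 7.1, 6.5, 7.4, 6.9)

# Short form: every optional argument keeps its default
short <- t.test(a, b)

# Same test with the defaults written out explicitly
full <- t.test(a, b,
               alternative = "two.sided",  # two-tailed test
               mu = 0,                     # null hypothesis: difference in means is 0
               paired = FALSE,             # unpaired (independent) samples
               var.equal = FALSE,          # Welch test, no equal-variance assumption
               conf.level = 0.95)          # 95% confidence interval

# Both calls describe the same test
all.equal(short$p.value, full$p.value)
```

Changing alternative, var.equal or conf.level is how the same function covers one-sided tests, the classic Student t-test and other confidence levels.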

Example of data

As an example, we have a cohort of 20 individuals (10 women and 10 men). The question is whether women’s average weight is significantly different from men’s average weight. The number of individuals considered here is obviously low; the example is only meant to illustrate the usage of the two-sample t-test.

The data are shown below:

     Group   Weight (kg)
 1   Woman         38.90
 2   Woman         61.20
 3   Woman         73.30
 4   Woman         21.80
 5   Woman         63.40
 6   Woman         64.60
 7   Woman         48.40
 8   Woman         48.80
 9   Woman         48.50
10   Woman         43.60
11   Man           67.80
12   Man           60.00
13   Man           63.40
14   Man           76.00
15   Man           89.40
16   Man           73.30
17   Man           67.30
18   Man           61.30
19   Man           62.40
20   Man          111.20

Question: Is the women’s average weight significantly different from that of men?

To answer this question, an independent t-test can be used.

From the data table above, two different methods can be used to perform the t-test, depending on the structure of your input data.

Calculate independent t-test using R

1) Method 1 - The data are saved in two different numeric vectors (x and y):

# Women's weights (kg)
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
# Men's weights (kg)
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)
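
Before running the test, it can help to look at the size, mean and standard deviation of each group. The following is a minimal sketch; the name group_summary is only illustrative.

```r
# Women's and men's weights (kg), as defined in the article
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)

# One row per group: sample size, mean and standard deviation
group_summary <- data.frame(
  group = c("Woman", "Man"),
  n     = c(length(x), length(y)),
  mean  = c(mean(x), mean(y)),
  sd    = c(sd(x), sd(y))
)
group_summary
```

The two means in this table (51.25 and 73.21) are the same values that t.test() reports under "sample estimates".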

In this case, the unpaired t-test can be performed as follows:

res <- t.test(x, y)
res

    Welch Two Sample t-test
data:  x and y
t = -3.17, df = 17.92, p-value = 0.005319
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -36.517  -7.403
sample estimates:
mean of x mean of y 
    51.25     73.21 

2) Method 2 - The data are saved in a data.frame:

d <- data.frame(
  group  = c(rep("Woman", 10), rep("Man", 10)),
  weight = c(x, y)
)
head(d)
  group weight
1 Woman   38.9
2 Woman   61.2
3 Woman   73.3
4 Woman   21.8
5 Woman   63.4
6 Woman   64.6

In this case, the unpaired t-test can be calculated using the following R code:

# Equivalent alternative: res <- t.test(d$weight ~ d$group)
res <- t.test(weight ~ group, data = d)
res

    Welch Two Sample t-test
data:  weight by group
t = 3.17, df = 17.92, p-value = 0.005319
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  7.403 36.517
sample estimates:
  mean in group Man mean in group Woman 
              73.21               51.25 

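As you can see, the two methods give the same results. Only the sign of t and of the confidence interval is reversed: t.test(x, y) computes women minus men, whereas the formula interface orders the groups alphabetically and computes Man minus Woman.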
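
The two outputs differ only in sign, which comes from the order in which the groups are compared. The sketch below checks this and shows one possible way to make the formula interface use the same order as t.test(x, y), by setting the factor levels explicitly. The object names res_vec, res_formula and res_releveled are illustrative.

```r
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)
d <- data.frame(group  = rep(c("Woman", "Man"), each = 10),
                weight = c(x, y))

res_vec     <- t.test(x, y)                       # women minus men
res_formula <- t.test(weight ~ group, data = d)   # Man minus Woman (alphabetical)

# Same p-value, statistics of opposite sign
all.equal(res_vec$p.value, res_formula$p.value)
all.equal(unname(res_vec$statistic), -unname(res_formula$statistic))

# Put "Woman" first so that the formula interface also computes women minus men
d$group <- factor(d$group, levels = c("Woman", "Man"))
res_releveled <- t.test(weight ~ group, data = d)
all.equal(unname(res_releveled$statistic), unname(res_vec$statistic))
```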


In the result above: t is the t-test statistic value (t = 3.17), df is the degrees of freedom (df = 17.92), and p-value is the probability, under the null hypothesis, of observing a statistic at least as extreme as this one (p-value = 0.0053). The 95% confidence interval (conf.int) of the difference in means is also shown (conf.int = [7.40, 36.52]); and finally, we have the means of the two groups of samples (average weight of women = 51.25, average weight of men = 73.21).
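
To see where these numbers come from, the Welch statistic, its degrees of freedom (Welch-Satterthwaite approximation), the p-value and the confidence interval can be recomputed by hand. This is a sketch of the standard formulas using x for women and y for men, so the sign corresponds to t.test(x, y); the intermediate names (v_x, v_y, se and so on) are illustrative.

```r
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)

nx <- length(x)
ny <- length(y)

# Squared standard error of each group mean
v_x <- var(x) / nx
v_y <- var(y) / ny

# Standard error of the difference in means
se <- sqrt(v_x + v_y)

# Welch t statistic (women minus men)
t_stat <- (mean(x) - mean(y)) / se

# Welch-Satterthwaite degrees of freedom
df_welch <- (v_x + v_y)^2 / (v_x^2 / (nx - 1) + v_y^2 / (ny - 1))

# Two-sided p-value and 95% confidence interval of the difference
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, df = df_welch) * se

# Compare with the built-in function
res <- t.test(x, y)
all.equal(t_stat, unname(res$statistic))
all.equal(df_welch, unname(res$parameter))
all.equal(p_value, res$p.value)
all.equal(ci, as.numeric(res$conf.int))
```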


The p-value of the test is 0.0053, which is less than the significance level alpha = 0.05. We can then reject the null hypothesis and conclude that women’s average weight is significantly different from men’s average weight with a p-value = 0.0053.

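Remember that the independent t-test assumes that the two samples are independent of each other and that the data in each group are approximately normally distributed. The classic Student t-test additionally assumes that the two groups have equal variances.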
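
One common way to check the normality of each group, though not the only one, is the Shapiro-Wilk test, available in base R as shapiro.test(). The sketch below applies it to each group separately. With only 10 observations per group the test has little power, so its result should be read as a rough indication rather than a proof.

```r
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)

# Shapiro-Wilk normality test, one group at a time
sw_women <- shapiro.test(x)
sw_men   <- shapiro.test(y)

# Null hypothesis of each test: the data come from a normal distribution
c(women = sw_women$p.value, men = sw_men$p.value)
```

A p-value below the chosen significance level suggests a departure from normality in that group; a normal Q-Q plot (qqnorm() followed by qqline()) is a useful visual complement.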

By default, the R t.test() function does not assume that the variances of the two groups being compared are equal. Therefore, the Welch t-test is performed by default. The Welch t-test is an adaptation of the Student t-test that is used when the two samples have possibly unequal variances.

The argument “var.equal=TRUE” can be used to tell the t.test() function that the two samples have equal variances. However, you have to check this assumption before using it.

Thus, we’ll use an F-test to test for a difference in variances.

The following R code can be used:

var.test(x, y)

    F test to compare two variances
data:  x and y
F = 0.8718, num df = 9, denom df = 9, p-value = 0.8414
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.2165 3.5099
sample estimates:
ratio of variances 
            0.8718 

The p-value of the F-test is 0.8414, which is greater than the significance level alpha = 0.05. In conclusion, there is no significant difference between the variances of the two sets of data. Therefore, we can use the classic t-test, which assumes equality of the two variances.
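
The F statistic reported by var.test() is simply the ratio of the two sample variances, with n - 1 degrees of freedom for each group. A minimal sketch of the computation, with illustrative names f_stat and p_value:

```r
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)

# Ratio of the sample variances (women over men)
f_stat <- var(x) / var(y)

# Degrees of freedom of numerator and denominator
df_x <- length(x) - 1
df_y <- length(y) - 1

# Two-sided p-value from the F distribution
p_lower <- pf(f_stat, df_x, df_y)
p_value <- 2 * min(p_lower, 1 - p_lower)

# Compare with the built-in function
res_var <- var.test(x, y)
all.equal(f_stat, unname(res_var$statistic))
all.equal(p_value, res_var$p.value)
```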

The t-test can be performed as follows:

res <- t.test(x, y, var.equal = TRUE)
res

    Two Sample t-test
data:  x and y
t = -3.17, df = 18, p-value = 0.005296
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -36.512  -7.408
sample estimates:
mean of x mean of y 
    51.25     73.21 
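
With var.equal=TRUE the two sample variances are combined into a single pooled variance and the degrees of freedom become nx + ny - 2, which is why df = 18 here. The sketch below reproduces the classic Student statistic by hand; names such as pooled_var are illustrative.

```r
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)

nx <- length(x)
ny <- length(y)

# Pooled variance: weighted average of the two sample variances
pooled_var <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)

# Student t statistic and its degrees of freedom
t_stat    <- (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))
df_pooled <- nx + ny - 2

# Two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df = df_pooled)

# Compare with the built-in function
res_student <- t.test(x, y, var.equal = TRUE)
all.equal(t_stat, unname(res_student$statistic))
all.equal(p_value, res_student$p.value)
```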

Note that the formula of the Welch t-test is described here and the formula of the Student t-test here.

Get the objects returned by t.test function

As indicated here, we can easily get each of the objects returned by t.test() function:

# printing the p-value
res$p.value
[1] 0.005296
# printing the means of the two groups
res$estimate
mean of x mean of y 
    51.25     73.21 
# printing the confidence interval
res$conf.int
[1] -36.512  -7.408
attr(,"conf.level")
[1] 0.95
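
The same approach works for the other components of the result, such as the t statistic (statistic) and the degrees of freedom (parameter). As a sketch, the main values can be gathered into a one-row data frame for reporting; the name summary_row and its column names are illustrative choices.

```r
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)
res <- t.test(x, y, var.equal = TRUE)

# List all components stored in the result
names(res)

# Collect the main values in a single row
summary_row <- data.frame(
  t         = unname(res$statistic),   # t statistic
  df        = unname(res$parameter),   # degrees of freedom
  p.value   = res$p.value,
  conf.low  = res$conf.int[1],         # lower bound of the 95% CI
  conf.high = res$conf.int[2]          # upper bound of the 95% CI
)
summary_row
```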

Online independent t-test calculator


Note that an online t-test calculator is available here to perform unpaired Student’s t-test without any installation.


Infos

This analysis has been done using R (ver. 3.1.0).

