Unpaired Two-Samples T-test in R
Introduction
The independent t-test (or unpaired t-test) is used to compare the means of two unrelated groups of samples. The aim of this article is to show you how to compute an independent samples t-test with R software. The t-test formula is described here.
A simplified format of the R function to use is :
t.test(x, y)
x and y are two numeric vectors of data values to compare.
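For reference, the default method of t.test() also accepts a number of optional arguments. This is the standard signature from the R stats package, shown here for convenience:

# default method of t.test(), with its main optional arguments
t.test(x, y = NULL,
       alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE,
       conf.level = 0.95, ...)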
Example of data
As an example, we have a cohort of 20 individuals (10 women and 10 men). We want to test whether women's average weight is significantly different from men's average weight. The number of individuals considered here is obviously low; this is just to illustrate the usage of the two-sample t-test.
The data are shown below:
ID | Group | Weight (kg) |
---|---|---|
1 | Woman | 38.90 |
2 | Woman | 61.20 |
3 | Woman | 73.30 |
4 | Woman | 21.80 |
5 | Woman | 63.40 |
6 | Woman | 64.60 |
7 | Woman | 48.40 |
8 | Woman | 48.80 |
9 | Woman | 48.50 |
10 | Woman | 43.60 |
11 | Man | 67.80 |
12 | Man | 60.00 |
13 | Man | 63.40 |
14 | Man | 76.00 |
15 | Man | 89.40 |
16 | Man | 73.30 |
17 | Man | 67.30 |
18 | Man | 61.30 |
19 | Man | 62.40 |
20 | Man | 111.20 |
Question: Is women's average weight significantly different from that of men?
To answer this question, an independent t-test can be used:
From the data table above, two different methods can be used to perform the t-test, depending on the structure of your input data.
Calculate independent t-test using R
1) Method 1 - The data are saved in two different numeric vectors (x and y):
set.seed(1234)
# Women's weights
x <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5, 43.6)
# Men's weights
y <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4, 111.2)
In this case, the unpaired t-test can be performed as follows:
res <- t.test(x, y)
res
Welch Two Sample t-test
data: x and y
t = -3.17, df = 17.92, p-value = 0.005319
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-36.517 -7.403
sample estimates:
mean of x mean of y
51.25 73.21
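Note that t.test() performs a two-sided test by default. If you were instead interested in a one-sided alternative (for example, that women's mean weight is lower than men's), a possible call, not used in the rest of this article, would be:

# one-sided test: is the mean of x (women) lower than the mean of y (men)?
t.test(x, y, alternative = "less")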
2) Method 2 - The data are saved in a data.frame:
d <- data.frame(
  group = c(rep("Woman", 10), rep("Man", 10)),
  weight = c(x, y)
)
head(d)
group weight
1 Woman 38.9
2 Woman 61.2
3 Woman 73.3
4 Woman 21.8
5 Woman 63.4
6 Woman 64.6
In this case, the unpaired t-test can be computed using the following R code:
# res <- t.test(d$weight ~ d$group)
res <- t.test(weight ~ group, data = d)
res
Welch Two Sample t-test
data: weight by group
t = 3.17, df = 17.92, p-value = 0.005319
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
7.403 36.517
sample estimates:
mean in group Man mean in group Woman
73.21 51.25
As you can see, the two methods give the same results (the sign of the t statistic and of the confidence interval is simply reversed, because the formula interface compares the Man group to the Woman group).
The p-value of the test is 0.005319, which is less than the significance level alpha = 0.05. We can therefore reject the null hypothesis and conclude that women's average weight is significantly different from men's average weight.
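If you want to double-check the group means reported under "sample estimates", they can also be computed directly from the data frame, for example with tapply():

# mean weight by group (should give Man = 73.21, Woman = 51.25)
tapply(d$weight, d$group, mean)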
Remember that the classic independent t-test can be used only when the data in each of the two groups are normally distributed and the two groups have equal variances.
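The normality assumption can be checked, for example, with the Shapiro-Wilk test from base R (this check is not part of the original example):

# Shapiro-Wilk normality test for each group
shapiro.test(x) # women's weights
shapiro.test(y) # men's weights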
By default, the R t.test() function does not assume that the variances of the two groups being compared are equal; therefore, a Welch t-test is performed by default. The Welch t-test is an adaptation of the Student t-test that is used when the two samples have possibly unequal variances.
The argument var.equal = TRUE can be used to tell the t.test() function that the two samples have equal variances. However, you have to check this assumption before using it.
Here, we'll use an F-test to test for differences in variances.
The following R code can be used:
var.test(x, y)
F test to compare two variances
data: x and y
F = 0.8718, num df = 9, denom df = 9, p-value = 0.8414
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.2165 3.5099
sample estimates:
ratio of variances
0.8718
The p-value of the F-test is 0.8414, which is greater than the significance level alpha = 0.05. In conclusion, there is no significant difference between the variances of the two sets of data. Therefore, we can use the classic t-test, which assumes equality of the two variances.
The t-test can then be performed as follows:
res <- t.test(x, y, var.equal = TRUE)
res
Two Sample t-test
data: x and y
t = -3.17, df = 18, p-value = 0.005296
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-36.512 -7.408
sample estimates:
mean of x mean of y
51.25 73.21
Note that the formula of the Welch t-test is described here and the formula of the Student t-test here.
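For reference, these are the standard textbook formulas (not reproduced from the linked pages). The classic Student t-test statistic, with a pooled variance estimate and n_A + n_B - 2 degrees of freedom, is:

$$t = \frac{m_A - m_B}{\sqrt{S^2\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}}, \qquad S^2 = \frac{(n_A - 1)S_A^2 + (n_B - 1)S_B^2}{n_A + n_B - 2}$$

The Welch t-test statistic and its approximate degrees of freedom are:

$$t = \frac{m_A - m_B}{\sqrt{\frac{S_A^2}{n_A} + \frac{S_B^2}{n_B}}}, \qquad df = \frac{\left(\frac{S_A^2}{n_A} + \frac{S_B^2}{n_B}\right)^2}{\frac{(S_A^2/n_A)^2}{n_A - 1} + \frac{(S_B^2/n_B)^2}{n_B - 1}}$$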
Get the objects returned by the t.test() function
As indicated here, we can easily get each of the objects returned by the t.test() function:
# printing the p-value
res$p.value
[1] 0.005296
# printing the means
res$estimate
mean of x mean of y
51.25 73.21
# printing the confidence interval
res$conf.int
[1] -36.512 -7.408
attr(,"conf.level")
[1] 0.95
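The result of t.test() is a list of class "htest". Besides p.value, estimate, and conf.int, it also contains, among others, the t statistic and the degrees of freedom:

# printing the t statistic
res$statistic
# printing the degrees of freedom
res$parameter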
Infos
This analysis has been performed using R (ver. 3.1.0).