The goal of this script is to assess the extent of regulatory variation in our cell types.

Evaluating global changes in variation

No global change in mean expression between day 0 and day 1 in either species

Check whether the means (of all gene expression values) differ between days within humans and within chimps

########## Within chimps ##########

# Test whether at least one of the means is different
group <- rbind(labels1, labels2, labels3, labels4)

mean_day_species <- rbind(chimps_day0_mean,chimps_day1_mean, chimps_day2_mean, chimps_day3_mean)

aov_groups <- cbind(mean_day_species, group)

dim(aov_groups)
[1] 41216     2
aov_bet <- oneway.test(aov_groups$Mean ~ aov_groups$group)
aov_bet

    One-way analysis of means (not assuming equal variances)

data:  aov_groups$Mean and aov_groups$group
F = 3.8146, num df = 3, denom df = 22892, p-value = 0.009564
# F = 4.4683, num df = 3, denom df = 22893, p-value = 0.003845

# Test groups individually

# Day 0 and Day 1
t.test(chimps_day0_mean, chimps_day1_mean) # p-value = 0.1716

    Welch Two Sample t-test

data:  chimps_day0_mean and chimps_day1_mean
t = -1.3672, df = 20583, p-value = 0.1716
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.08994855  0.01602921
sample estimates:
mean of x mean of y 
 5.191453  5.228413 
# Day 1 and Day 2
t.test(chimps_day1_mean, chimps_day2_mean) # p-value = 0.2033

    Welch Two Sample t-test

data:  chimps_day1_mean and chimps_day2_mean
t = -1.2721, df = 20606, p-value = 0.2033
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.08578381  0.01825800
sample estimates:
mean of x mean of y 
 5.228413  5.262176 
# Day 2 and Day 3
t.test(chimps_day2_mean, chimps_day3_mean) # p-value = 0.6032

    Welch Two Sample t-test

data:  chimps_day2_mean and chimps_day3_mean
t = -0.51978, df = 20565, p-value = 0.6032
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.06724364  0.03905490
sample estimates:
mean of x mean of y 
 5.262176  5.276270 
########## Within humans ##########

# Test whether at least one of the means is different
group <- rbind(labels5, labels6, labels7, labels8)

mean_day_species <- rbind(humans_day0_mean,humans_day1_mean, humans_day2_mean, humans_day3_mean)

aov_groups <- cbind(mean_day_species, group)

dim(aov_groups)
[1] 41216     2
aov_bet <- oneway.test(aov_groups$Mean ~ aov_groups$group)
aov_bet

    One-way analysis of means (not assuming equal variances)

data:  aov_groups$Mean and aov_groups$group
F = 6.3119, num df = 3, denom df = 22893, p-value = 0.0002829
# F = 5.5523, num df = 3, denom df = 22893, p-value = 0.0008336

# Test groups individually

# Day 0 and Day 1
t.test(humans_day0_mean, humans_day1_mean) # p-value = 0.2702

    Welch Two Sample t-test

data:  humans_day0_mean and humans_day1_mean
t = -1.1027, df = 20596, p-value = 0.2702
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.08260299  0.02312380
sample estimates:
mean of x mean of y 
 5.211648  5.241388 
# Day 1 and Day 2
t.test(humans_day1_mean, humans_day2_mean) # p-value = 0.02824

    Welch Two Sample t-test

data:  humans_day1_mean and humans_day2_mean
t = -2.1941, df = 20598, p-value = 0.02824
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.109737567 -0.006181887
sample estimates:
mean of x mean of y 
 5.241388  5.299347 
# Day 2 and Day 3
t.test(humans_day2_mean, humans_day3_mean) # p-value = 0.628

    Welch Two Sample t-test

data:  humans_day2_mean and humans_day3_mean
t = -0.48453, df = 20602, p-value = 0.628
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.06437644  0.03885704
sample estimates:
mean of x mean of y 
 5.299347  5.312107 

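Conclusion: For both species, at least one of the daily means differs (Welch one-way test: p = 0.009564 in chimps, p = 0.0002829 in humans). However, when comparing day 0 with day 1 within humans and within chimps, there is not enough statistical evidence to suggest a difference in mean expression (p = 0.2702 and p = 0.1716, respectively). Of the consecutive-day comparisons, only humans day 1 versus day 2 reaches p < 0.05 (p = 0.02824).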
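
The omnibus-then-pairwise workflow used above can be sketched on simulated data. This is a minimal illustration, not the script's actual pipeline: the object names (`day_means`, `long`, `consecutive`) and the simulated values are assumptions made for the example.

```r
# Simulated stand-in data: per-gene mean expression for four days.
# All object names and values here are hypothetical, not the script's own.
set.seed(1)
n_genes <- 1000
day_means <- list(
  day0 = rnorm(n_genes, mean = 5.20, sd = 2),
  day1 = rnorm(n_genes, mean = 5.23, sd = 2),
  day2 = rnorm(n_genes, mean = 5.26, sd = 2),
  day3 = rnorm(n_genes, mean = 5.28, sd = 2)
)

# Long format: one row per gene-day combination, with a factor for the day.
long <- data.frame(
  Mean  = unlist(day_means, use.names = FALSE),
  group = factor(rep(names(day_means), each = n_genes))
)

# Omnibus test: Welch one-way analysis of means (equal variances not assumed).
omnibus <- oneway.test(Mean ~ group, data = long)

# Follow-up: Welch two-sample t-tests for consecutive days only.
consecutive <- list(c("day0", "day1"), c("day1", "day2"), c("day2", "day3"))
p_values <- sapply(consecutive, function(pair) {
  t.test(day_means[[pair[1]]], day_means[[pair[2]]])$p.value
})
names(p_values) <- sapply(consecutive, paste, collapse = "_vs_")

print(omnibus)
print(p_values)
```

Passing a formula together with `data = long` keeps the response and the grouping factor in one data frame, which avoids relying on `$` access to the result of `cbind()`.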

The global reduction in variance between day 0 and day 1 is small in chimpanzees and larger in humans.

Check whether the variances (of all genes) differ between days within humans and within chimps

# Untransformed variances range from 0 to infinity, so we log2-transform them to obtain values that range from negative infinity to positive infinity. We can then use a t-test.


########## Within chimps ##########

t.test(log2((unlist(chimps_day0_var))), log2((unlist(chimps_day1_var))), alternative = c("greater"))

    Welch Two Sample t-test

data:  log2((unlist(chimps_day0_var))) and log2((unlist(chimps_day1_var)))
t = 10.235, df = 20439, p-value < 2.2e-16
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 0.2388908       Inf
sample estimates:
mean of x mean of y 
-4.751923 -5.036558 
2^(5.036558 - 4.751923)
[1] 1.218102
# t = 10.235, df = 20439, p-value < 2.2e-16

# Does the variance also decrease from day 1 to day 2 or from day 2 to day 3?

# Day 1 and Day 2
t.test(log2((unlist(chimps_day1_var))), log2((unlist(chimps_day2_var))), alternative = c("greater")) # p-value = 1

    Welch Two Sample t-test

data:  log2((unlist(chimps_day1_var))) and log2((unlist(chimps_day2_var)))
t = -15.864, df = 20550, p-value = 1
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -0.4955952        Inf
sample estimates:
mean of x mean of y 
-5.036558 -4.587522 
# Day 2 and Day 3
t.test(log2((unlist(chimps_day2_var))), log2((unlist(chimps_day3_var))), alternative = c("greater")) # p-value = 1

    Welch Two Sample t-test

data:  log2((unlist(chimps_day2_var))) and log2((unlist(chimps_day3_var)))
t = -12.813, df = 20595, p-value = 1
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -0.4030692        Inf
sample estimates:
mean of x mean of y 
-4.587522 -4.230311 
########## Within humans ##########


# Day 0 and Day 1
t.test(log2((unlist(humans_day0_var))), log2((unlist(humans_day1_var))), alternative = c("greater"))

    Welch Two Sample t-test

data:  log2((unlist(humans_day0_var))) and log2((unlist(humans_day1_var)))
t = 49.053, df = 20465, p-value < 2.2e-16
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 1.189854      Inf
sample estimates:
mean of x mean of y 
-3.636568 -4.867707 
2^(4.867707-3.636568)
[1] 2.347523
# Day 1 and Day 2
t.test(log2((unlist(humans_day1_var))), log2((unlist(humans_day2_var))), alternative = c("greater")) # p-value = 1

    Welch Two Sample t-test

data:  log2((unlist(humans_day1_var))) and log2((unlist(humans_day2_var)))
t = -45.735, df = 19997, p-value = 1
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -1.141847       Inf
sample estimates:
mean of x mean of y 
-4.867707 -3.765502 
# Day 2 and Day 3
t.test(log2((unlist(humans_day2_var))), log2((unlist(humans_day3_var))), alternative = c("greater")) # p-value = 0.9986

    Welch Two Sample t-test

data:  log2((unlist(humans_day2_var))) and log2((unlist(humans_day3_var)))
t = -2.9879, df = 20393, p-value = 0.9986
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -0.107056       Inf
sample estimates:
mean of x mean of y 
-3.765502 -3.696457 

Conclusion: In both species, the mean log2 variance is significantly lower at day 1 than at day 0 (one-sided Welch t-test, p < 2.2e-16 in each species), which is the direction we would expect biologically (due to canalization, for example). The reduction is small in chimpanzees (about 1.22-fold) and larger in humans (about 2.35-fold). There is no evidence of a further reduction from day 1 to day 2 or from day 2 to day 3 in either species (one-sided p-values of 0.9986 or higher); the mean log2 variance increases over those intervals.
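
The log2-transform-then-test procedure used in this section can be sketched on simulated data. This is a minimal illustration under stated assumptions: the variances are drawn from log-normal distributions, the true day 0 to day 1 ratio is set to 2 by construction, and every object name (`var_day0`, `fold_reduction`, and so on) is hypothetical.

```r
# Simulated stand-in data: per-gene variances at two days.
# Names and parameters are hypothetical; the true day 0 / day 1 ratio is 2.
set.seed(42)
n_genes <- 5000
var_day0 <- rlnorm(n_genes, meanlog = log(0.08), sdlog = 1.5)
var_day1 <- rlnorm(n_genes, meanlog = log(0.04), sdlog = 1.5)

# Variances are bounded below by 0 and right-skewed;
# log2 maps them onto the whole real line.
log_var_day0 <- log2(var_day0)
log_var_day1 <- log2(var_day1)

# One-sided Welch t-test: is mean log2 variance greater at day 0 than at day 1?
res <- t.test(log_var_day0, log_var_day1, alternative = "greater")

# Back-transform the difference in mean log2 variance into a fold change
# (a ratio of geometric means, day 0 over day 1).
fold_reduction <- 2^(mean(log_var_day0) - mean(log_var_day1))

print(res)
print(fold_reduction)
```

Applied to the sample estimates reported above, the same back-transformation gives 2^(-4.751923 - (-5.036558)), roughly 1.22, for chimpanzees and 2^(-3.636568 - (-4.867707)), roughly 2.35, for humans.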