R: Interpreting results from ANOVA and TukeyHSD analyses


I have run an ANOVA and TukeyHSD on a data frame containing anatomical regions in column 1 (region) and gene expression values in column 2 (s1). I expect the p-value in the aov summary to be the one expressed as Pr(>F), but I'm a little fuzzy on the results I've obtained. Also, can someone help me understand the Tukey multiple comparisons of means results? I'm not totally clear on what the diff and p adj results indicate. The results shown here are an abridged version of what I'm working with, FYI.

    > aov.result = aov(s1 ~ region, data=raw.data)
    > summary(aov.result)
                 Df  Sum Sq Mean Sq F value    Pr(>F)
    region       60  61.713 1.02856  5.9246 < 2.2e-16 ***
    Residuals   655 113.712 0.17361
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    > TukeyHSD(aov.result)
      Tukey multiple comparisons of means
        95% family-wise confidence level

    Fit: aov(formula = s1 ~ region, data = raw.data)

    $region
                         diff           lwr          upr     p adj
    ab-aa        0.4118651583 -2.864195e-01  1.110149848 0.9847745
    aha-aa      -0.0468785098 -7.608569e-01  0.667099930 1.0000000
    apir-aa      0.4419135565 -2.563711e-01  1.140198246 0.9502924
    b-aa         0.5379787168 -1.603060e-01  1.236263406 0.5846356

Let's start with reproducible data, with one factor and one continuous variable:

    set.seed(1)
    df1 <- data.frame(
        f1 = as.factor(rep(seq(1:3), 4)),
        c1 = abs(rnorm(12)))
    s1 <- stats::aov(df1$c1 ~ df1$f1)
    summary(s1)

This gives output similar to yours.

The p-value for your data appears correct, and can be confirmed with, e.g.:

    1-stats::pf(q=5.92, df1=60, df2=655)
    [1] 0
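As a rough sketch of where those numbers come from, the snippet below rebuilds the F value from the sums of squares and degrees of freedom printed in your summary (the figures are copied from the question, not recomputed from your data). It also uses `lower.tail = FALSE`, which asks `pf()` for the upper tail directly; the `1 - pf(...)` form rounds to zero in floating point, whereas the upper-tail call returns a very small positive number. The variable names are just for illustration.

```r
# Values copied from the aov summary in the question
ss_region <- 61.713;  df_region <- 60
ss_resid  <- 113.712; df_resid  <- 655

# Mean squares are sums of squares divided by their degrees of freedom
ms_region <- ss_region / df_region
ms_resid  <- ss_resid / df_resid

# The F value is the ratio of the two mean squares
f_value <- ms_region / ms_resid

# Pr(>F): upper-tail probability of the F distribution
p_naive <- 1 - stats::pf(f_value, df_region, df_resid)
p_upper <- stats::pf(f_value, df_region, df_resid, lower.tail = FALSE)

print(c(F = f_value, naive = p_naive, upper = p_upper))
```

The upper-tail value is below 2.2e-16, which is why `summary()` prints it as `< 2.2e-16`.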

Now, looking at the output from:

    s2 <- stats::TukeyHSD(s1)

i.e.

    $`df1$f1`
               diff       lwr       upr     p adj
    2-1 -0.06282377 -1.038236 0.9125887 0.9823655
    3-1 -0.09820762 -1.073620 0.8772048 0.9575774
    3-2 -0.03538385 -1.010796 0.9400286 0.9943641

The first column, diff, is the difference in means between the two levels named in the row label. In this example:

    m1 <- mean( df1$c1[df1$f1==1] )
    m2 <- mean( df1$c1[df1$f1==2] )

Now m2-m1 is approximately equal to s2$"df1$f1"[1,1], here about -0.063.
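The same check can be done for every row at once. This is a minimal sketch that rebuilds the example data, takes the per-level means with `tapply()`, and compares each pairwise difference against the diff column; `manual_diff` is just a name chosen for illustration.

```r
set.seed(1)
df1 <- data.frame(f1 = as.factor(rep(1:3, 4)), c1 = abs(rnorm(12)))
s1 <- stats::aov(df1$c1 ~ df1$f1)
s2 <- stats::TukeyHSD(s1)

# Mean of c1 within each level of f1
means <- as.vector(tapply(df1$c1, df1$f1, mean))

# All pairs (i, j) with i < j, in the same order as the rows: 2-1, 3-1, 3-2
pairs <- combn(length(means), 2)
manual_diff <- means[pairs[2, ]] - means[pairs[1, ]]

# Compare against the first column of the TukeyHSD table
print(cbind(manual = manual_diff, tukey = s2[[1]][, "diff"]))
```

Each row label reads "second level minus first level", so a negative diff for 2-1 means level 2 has the lower mean.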

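This 'difference of means' has a confidence interval (the lwr and upr columns) calculated from the studentized range (q) distribution. The p adj column is the p-value for each pairwise difference after adjusting for multiple comparisons, taken from the same distribution. The mechanics can be found in the source code of the aov method, which you can view with stats:::TukeyHSD.aov. See ?ptukey. Note that the rationale for 'correction for multiple comparisons' is controversial in some contexts. This sort of question might be better suited to CrossValidated.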
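To make the mechanics concrete, here is a sketch that reproduces lwr, upr and p adj by hand for the toy example with `qtukey()` and `ptukey()`. It follows my reading of the aov method's source and assumes a one-way design at the default 95% confidence level; names such as `mse`, `se` and `manual` are only illustrative.

```r
set.seed(1)
df1 <- data.frame(f1 = as.factor(rep(1:3, 4)), c1 = abs(rnorm(12)))
s1 <- stats::aov(df1$c1 ~ df1$f1)
tuk <- stats::TukeyHSD(s1)[[1]]

means <- as.vector(tapply(df1$c1, df1$f1, mean))    # per-level means
n     <- as.vector(tapply(df1$c1, df1$f1, length))  # per-level sample sizes
k     <- length(means)                              # number of levels
dfr   <- df.residual(s1)                            # residual degrees of freedom
mse   <- sum(residuals(s1)^2) / dfr                 # residual mean square

# All pairs (i, j) with i < j: rows 2-1, 3-1, 3-2
pairs <- combn(k, 2)
i <- pairs[1, ]
j <- pairs[2, ]

d  <- means[j] - means[i]                       # diff
se <- sqrt((mse / 2) * (1 / n[i] + 1 / n[j]))   # scale for the studentized range
half_width <- qtukey(0.95, k, dfr) * se         # half-width of the 95% interval

manual <- cbind(
  diff    = d,
  lwr     = d - half_width,
  upr     = d + half_width,
  `p adj` = ptukey(abs(d) / se, k, dfr, lower.tail = FALSE)
)
print(manual)
print(all.equal(unname(manual), unname(tuk)))
```

An interval that contains 0, together with a large p adj, means that pair of levels cannot be distinguished at the family-wise 95% level, which is the case for all three pairs here.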

