Thursday, February 10, 2011

Extracting values from R summary objects

This builds on a previous post from Stephen.

I was recently running a series of ANOVA analyses, and I used the aov() function because it had a few options that I preferred. Much like lm(), the function returns an object that you typically pass to summary() to view and interpret the output.

It took me a bit of playing to figure out how to extract the information I needed. My aov object is called "fullmod", and here is the summary() output:

> summary(fullmod)

Error: sample
Df Sum Sq Mean Sq
Type 1 1.4535 1.4535

Error: sample:treatment
Df Sum Sq Mean Sq
treatment 1 0.0086021 0.0086021

Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
treatment 1 0.0077 0.00769 0.0325 0.857505
Type 3 0.3159 0.10529 0.4443 0.722043
age 1 0.0351 0.03512 0.1482 0.701367
sex 1 2.2166 2.21661 9.3547 0.003122 **
hybridization 1 0.1125 0.11255 0.4750 0.492923
Residuals 72 17.0605 0.23695
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


I assigned the summary output to an object s so I could examine it a bit further.

> s <- summary(fullmod)

Then I peeked under the hood using the str() function.


> str(s)
List of 3
$ Error: sample :List of 1
..$ :Classes ‘anova’ and 'data.frame': 1 obs. of 3 variables:
.. ..$ Df : num 1
.. ..$ Sum Sq : num 1.45
.. ..$ Mean Sq: num 1.45
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
$ Error: sample:treatment:List of 1
..$ :Classes ‘anova’ and 'data.frame': 1 obs. of 3 variables:
.. ..$ Df : num 1
.. ..$ Sum Sq : num 0.0086
.. ..$ Mean Sq: num 0.0086
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
$ Error: Within :List of 1
..$ :Classes ‘anova’ and 'data.frame': 6 obs. of 5 variables:
.. ..$ Df : num [1:6] 1 3 1 1 1 72
.. ..$ Sum Sq : num [1:6] 0.0077 0.3159 0.0351 2.2166 0.1125 ...
.. ..$ Mean Sq: num [1:6] 0.0077 0.1053 0.0351 2.2166 0.1125 ...
.. ..$ F value: num [1:6] 0.0325 0.4443 0.1482 9.3547 0.475 ...
.. ..$ Pr(>F) : num [1:6] 0.8575 0.72204 0.70137 0.00312 0.49292 ...
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
- attr(*, "class")= chr "summary.aovlist"




This tells me that s is a list of 3 objects ("Error: sample", "Error: sample:treatment:", and "Error: within"). Each of these objects is a list of 1 element for (some reason!?) with no name or identifier. For "Error: sample" and "Error: sample:treatment:", there are three single numbers. For "Error: Within", each element contains a list of 6 numbers (one for each coefficient in my model).

So, to access the degrees of freedom ("Df") for the "Error: sample", I do the following:


> s[[1]][[1]][[1]]
[1] 1


Or if I prefer to use labels (for the elements that have them)

> s[["Error: sample"]][[1]][["Df"]]
[1] 1


Just to clarify:

s [["Error: sample"]] [[1]] [["Df"]]
^ ^ ^
1st layer 2nd unnamed layer 3rd layer


It is VERY important to use [[ instead of [ here. In the R syntax, [[ retrieves a SINGLE element of the object/array and [ retrieves the full object as a list.

> s["Error: sample"][1]["Df"]
$
NULL


See: that didn't work.

To retrieve the p-values I'm interested in, I'll have to dig one more layer deeper to pull the p-value for the coefficient I want (in this case, the 4th element).

> s[[3]][[1]][[5]][[4]]
[1] 0.003122076



OR

> s[["Error: Within"]][[1]][["Pr(>F)"]][[4]]
[1] 0.003122076



Using this basic strategy, you should be able to pull any value from a summary object, but remember that using offsets as I have can be dangerous. For this to work consistently across many models, they must all have the same number of coefficients in the model.

4 comments:

  1. Fantastic! Thank you very much!

    ReplyDelete
  2. Hello

    I was able to extract the proportion of variance from a PCA, so thank you very much.

    I would like to use these instead of "PC1" and "PC2" on a plot, but I couldn't figure out how to do it. Could you help me?

    ReplyDelete
  3. Great! Thank U from Schiedam!

    ReplyDelete
  4. Exactly what I needed, and clear instructions. Thanks!

    ReplyDelete

Note: Only a member of this blog may post a comment.

Creative Commons License
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.