Tuesday, January 17, 2012

Annotating limma Results with Gene Names for Affy Microarrays

Lately I've been using the limma package often for analyzing microarray data. When I read in Affy CEL files using ReadAffy() and normalize them (e.g. with rma()), the resulting ExpressionSet won't contain any featureData annotation. Consequently, when I run topTable to get a list of differentially expressed genes, there's no annotation information other than the Affymetrix probeset IDs or transcript cluster IDs. There are other ways of annotating these results (INNER JOIN to a MySQL database, biomaRt, etc.), but I would like to have the output from topTable already annotated with gene information. Ideally, I could annotate each probeset ID with a gene symbol, gene name, and Ensembl ID, and have that Ensembl ID hyperlink out to the Ensembl genome browser. With some help from Gordon Smyth on the Bioconductor mailing list, I found that annotating the ExpressionSet object results in the output from topTable also being annotated.

The results from topTable are pretty uninformative without annotation:


After annotation:


You can generate an HTML file with clickable links to the Ensembl Genome Browser for each gene:
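The link column behind a table like this needs nothing more than string handling. Here is a minimal base-R sketch, not necessarily how the original script did it. The helper names `ensembl_link` and `write_html_table`, the toy data frame, and the Ensembl URL pattern are all assumptions made for illustration.

```r
# Turn Ensembl gene IDs into HTML anchors; missing IDs stay NA.
# ASSUMPTION: this URL pattern points at the human gene summary page.
ensembl_link <- function(ensembl_id,
                         base_url = "https://www.ensembl.org/Homo_sapiens/Gene/Summary?g=") {
  ifelse(is.na(ensembl_id),
         NA_character_,
         paste0("<a href=\"", base_url, ensembl_id, "\">", ensembl_id, "</a>"))
}

# Write a data frame as a bare-bones HTML table, one line per row.
# Cell contents are NOT escaped, so the anchors stay clickable.
write_html_table <- function(df, file) {
  header <- paste0("<tr>", paste0("<th>", names(df), "</th>", collapse = ""), "</tr>")
  cells <- lapply(df, function(col) {
    paste0("<td>", ifelse(is.na(col), "", as.character(col)), "</td>")
  })
  rows <- paste0("<tr>", do.call(paste0, unname(cells)), "</tr>")
  writeLines(c("<table>", header, rows, "</table>"), file)
}

# Toy stand-in for an annotated topTable() result (illustrative values only).
toy <- data.frame(
  ID      = c("probe_1", "probe_2"),
  Symbol  = c("TP53", NA),
  Ensembl = ensembl_link(c("ENSG00000141510", NA)),
  stringsAsFactors = FALSE
)
out_file <- file.path(tempdir(), "toptable.html")
write_html_table(toy, out_file)
```

Opening `out_file` in a browser should show a two-row table in which the first Ensembl cell is a clickable link and the second is empty.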


Here's the R code to do it:
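What follows is a minimal sketch of the approach described above, not necessarily the original script. It fills in the ExpressionSet's featureData so that limma carries the annotation through `lmFit()` into `topTable()`. The helper name `annotate_eset` is hypothetical, and the chip annotation package in the usage comments (`hgu133plus2.db`) and the two-group design are assumptions; substitute the package and design that match your own arrays.

```r
# Attach gene symbol, gene name, and Ensembl ID to each probeset in an
# ExpressionSet. Requires the Bioconductor packages Biobase and AnnotationDbi.
# `annotation_db` is a chip-specific annotation object such as hgu133plus2.db.
annotate_eset <- function(eset, annotation_db) {
  ids <- Biobase::featureNames(eset)

  # Look up one value per probeset; when a probeset maps to several
  # genes, keep only the first match. Unmapped probesets come back as NA.
  lookup <- function(column) {
    unname(AnnotationDbi::mapIds(annotation_db, keys = ids, column = column,
                                 keytype = "PROBEID", multiVals = "first"))
  }

  Biobase::fData(eset) <- data.frame(
    ID      = ids,
    Symbol  = lookup("SYMBOL"),
    Name    = lookup("GENENAME"),
    Ensembl = lookup("ENSEMBL"),
    row.names = ids,
    stringsAsFactors = FALSE
  )
  eset
}

# Hypothetical usage (needs CEL files in the working directory):
#   library(affy); library(limma); library(hgu133plus2.db)
#   eset   <- annotate_eset(rma(ReadAffy()), hgu133plus2.db)
#   group  <- factor(rep(c("control", "treated"), each = 3))  # assumed design
#   fit    <- eBayes(lmFit(eset, model.matrix(~ group)))
#   topTable(fit, coef = 2)  # now carries ID, Symbol, Name, Ensembl columns
```

Because `lmFit()` copies the featureData into the fitted model object, no further joining step is needed after `topTable()`.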

4 comments:

  1. How about the affycoretools package? It gives detailed HTML results with the common annotations you want (up and down genes, fold changes, stats, etc.) corresponding to Venn diagrams from limma. Quite straightforward.

  2. affycoretools definitely sounds useful - will give it a go! Thanks.

  3. Thanks for the script! I modified it to work with the Drosophila genome with FlyBase links instead of Ensembl and it worked perfectly.


Creative Commons License
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.