I've started putting together video screencasts for things like this, especially when several of the core's clients ask the same question. In this example, I'll show you how to quickly convert from the Affymetrix Mouse Gene 1.0 ST microarray probeset IDs to an Ensembl gene ID and gene symbol.
You can also do this programmatically in R using the biomaRt package in Bioconductor.
# Fit your model on your ExpressionSet and design matrix
library(limma)
fit <- lmFit(eset, design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
results <- topTable(fit2)

# Get the BioMart annotation
library(biomaRt)
mart <- useMart("ensembl", dataset="mmusculus_gene_ensembl")
# Note: newer Ensembl releases name "external_gene_id" "external_gene_name"
attributes <- c("affy_mogene_1_0_st_v1", "chromosome_name",
                "start_position", "end_position", "ensembl_gene_id",
                "external_gene_id", "description")
# Look up the probeset IDs from the topTable() results, not the topTable function
genes <- getBM(attributes=attributes, filters="affy_mogene_1_0_st_v1",
               values=results$ID, mart=mart, uniqueRows=TRUE)
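Once you have the annotation, the usual next step is joining it back onto the results table. Here is a minimal sketch using base R's merge(), assuming the probeset IDs sit in an ID column of the results. The two small data frames are made-up stand-ins for the real `results` and `genes` objects, and the gene IDs in them are placeholders, not real Ensembl identifiers.

```r
# Toy stand-ins for the limma results and the biomaRt annotation.
# All values here are invented for illustration only.
results <- data.frame(
  ID    = c("10344614", "10344624", "10344633"),
  logFC = c(1.8, -2.1, 0.9),
  stringsAsFactors = FALSE
)
genes <- data.frame(
  affy_mogene_1_0_st_v1 = c(10344614, 10344633, 10344633),
  ensembl_gene_id       = c("GENE_A", "GENE_B", "GENE_C"),  # placeholder IDs
  stringsAsFactors = FALSE
)

# Make sure both key columns have the same type before joining.
results$ID <- as.character(results$ID)
genes$affy_mogene_1_0_st_v1 <- as.character(genes$affy_mogene_1_0_st_v1)

# Left join: keep every probeset in the results, annotated or not.
annotated <- merge(results, genes,
                   by.x = "ID", by.y = "affy_mogene_1_0_st_v1",
                   all.x = TRUE)
print(annotated)
```

With all.x = TRUE, probesets that have no annotation are kept with NA in the annotation columns, and a probeset that maps to more than one gene appears once per gene, so the joined table can have more rows than the original results.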
Nice job. I do not know how experienced your clients are with Excel, but it might be difficult for them to join the two tables. The MATCH and INDEX functions (or VLOOKUP/HLOOKUP if you prefer) will do the work.
Also, you probably know about this already, but there is a nice collection of Ensembl-related videos from Giulietta Spudich (EnsemblHelpDesk), including one covering BioMart:
http://www.youtube.com/watch?v=DXPaBdPM2vs
This is also the way I convert gene IDs. To join the two tables I use the join function in the plyr library, which is much easier than moving to Excel, especially as many of my tables are too large for Excel.
Yes, I would also recommend avoiding Excel at all costs. The results page allows you to export in other formats (tab, csv, etc). These are much more database-INNER JOIN-friendly.
This very much depends on who your clients/collaborators are.
Personally, I use (Ensembl) MySQL both for converting IDs and for joining the tables. But I am the only biostat postdoc in the department, so either I teach my colleagues how to do the job (with BioMart and Excel) or I have to do it for them every time they need it.
I prefer the first way and I am grateful for every video that explains how to do it.
BioDBnet is a good tool for ID mapping, and I use it frequently: http://biodbnet.abcc.ncifcrf.gov/db/db2db.php