Wednesday, May 20, 2009

Would a gene by any other name be just as significant?

So you have found significant SNPs from a study, and you are investigating the region. Browsing through Ensembl or Entrez-Gene, you find a coding region nearby. Atop this coding region, you see a collection of letters commonly used to refer to this gene, let's say "MYLK". So you begin a PubMed search for publications that describe the function of this gene, searching with "MYLK". Seems reasonable, right?

Beware! Unfortunately, gene names or acronyms are NOT a standardized way of identifying coding regions. According to GeneCards, the coding region with the symbol "MYLK" has 14 different symbol aliases and four unique descriptions! To be complete, conduct a PubMed search using all of these terms. For example, searching PubMed for MYLK retrieves only 30 articles, mostly involving muscle contraction. Searching for MLCK, on the other hand, retrieves 847 articles! These references put much more emphasis on the neural activities of the gene, so perhaps different groups of investigators use different symbols.
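
Typing out every alias by hand gets tedious, so here is a minimal Python sketch of one way to combine them into a single PubMed query. The function names and the choice of the Title/Abstract field are my own assumptions for illustration, and the alias list holds only the two symbols mentioned above; you would fill in the rest from GeneCards. The script only builds the query string and an NCBI E-utilities ESearch URL; it does not contact PubMed.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(symbols, field="Title/Abstract"):
    """OR together every alias, each quoted and restricted to one search field.

    Hypothetical helper: blank entries and case-insensitive duplicates are dropped.
    """
    seen = set()
    terms = []
    for symbol in symbols:
        cleaned = symbol.strip()
        key = cleaned.upper()
        if not cleaned or key in seen:
            continue
        seen.add(key)
        terms.append(f'"{cleaned}"[{field}]')
    return " OR ".join(terms)


def build_esearch_url(query):
    """Return an ESearch URL for the query (hypothetical helper, no request is sent)."""
    return f"{ESEARCH_URL}?{urlencode({'db': 'pubmed', 'term': query})}"


# Only the two symbols named in this post; add the remaining aliases yourself.
aliases = ["MYLK", "MLCK"]
query = build_pubmed_query(aliases)
print(query)
print(build_esearch_url(query))
```

You can paste the printed query straight into the PubMed search box, or open the printed URL to get the matching article IDs back from ESearch.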

To make matters worse, according to Entrez-Gene, MYLK is the "official" gene symbol, yet fewer than 5% of the PubMed articles use that designation! If possible, use the Entrez-Gene or Ensembl gene ID when referencing a gene in the literature to help avoid this confusion.

Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.