Researchers in the ENGAGE consortium used a clever technique to leverage genome-wide expression data to select or prioritize genes for GWAS analysis. The investigators published the novel candidate genes for obesity in this month's PLoS Genetics, but I think the method they used here is more interesting.
If you're studying obesity and you find that expression of some gene correlates with BMI, you have a problem: you don't know whether the correlation reflects a causal relationship or whether the changes in gene expression are simply reactive to changes in body composition. That's the situation when you look at unrelated individuals - some correlations will be reactive, others potentially causal. However, if you look only within identical twin pairs, any correlation between expression differences and BMI differences has to be reactive (or at least non-genetic), because MZ twins are genetically identical. The authors took the interesting approach of prioritizing for GWAS analysis those genes whose expression correlated with BMI in the unrelated individuals only, and not within the MZ twin pairs.
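To make the logic concrete, here is a minimal sketch of the filtering idea in Python. It is not the authors' pipeline: the function names, the labels, and the correlation cutoff (`r_cutoff`) are all assumptions made for illustration, and a real analysis would use proper significance tests and covariate adjustment. The key step is that the twin analysis works on within-pair differences, which cancels out the shared genotype.

```python
import math


def pearson_r(x, y):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def classify_transcript(expr_unrelated, bmi_unrelated,
                        expr_twin_pairs, bmi_twin_pairs,
                        r_cutoff=0.3):
    """Label one transcript (hypothetical helper, illustrative cutoff).

    expr_unrelated / bmi_unrelated: one value per unrelated individual.
    expr_twin_pairs / bmi_twin_pairs: (twin_a, twin_b) tuples per MZ pair.
    """
    r_unrelated = pearson_r(expr_unrelated, bmi_unrelated)

    # Within-pair differences: genotype is identical inside each pair,
    # so any correlation that survives here cannot be genetic.
    d_expr = [a - b for a, b in expr_twin_pairs]
    d_bmi = [a - b for a, b in bmi_twin_pairs]
    r_twins = pearson_r(d_expr, d_bmi)

    if abs(r_twins) >= r_cutoff:
        return "reactive"
    if abs(r_unrelated) >= r_cutoff:
        return "candidate causal"
    return "uncorrelated"
```

Running something like this over every transcript on the array would give you the two gene lists: "candidate causal" genes go forward to the GWAS lookup, and "reactive" genes serve as the comparison set.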
Following up these "causal" genes in a GWAS analysis, the authors found that the p-value distribution was strongly skewed away from the null - in other words, more variants in these genes were associated with the phenotype than you'd expect by chance. The genes dubbed reactive were skewed toward the null, i.e. fewer variants in these genes were associated with the phenotype.
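One simple way to check for this kind of skew, sketched below, is to count how many p-values in a gene set fall under a threshold and compare that to the number expected under a uniform null with a one-sided binomial test. This is an illustration only and not necessarily the test the authors used; `excess_small_pvalues` and `alpha` are hypothetical names, and the test assumes independent variants, which linkage disequilibrium will violate in real data.

```python
import math


def excess_small_pvalues(pvals, alpha=0.05):
    """Return (observed, expected, binomial_p) for p-values below alpha.

    Under the null, each p-value falls below alpha with probability alpha,
    so the count is Binomial(n, alpha) if the tests are independent.
    """
    n = len(pvals)
    observed = sum(1 for p in pvals if p < alpha)
    expected = n * alpha

    # One-sided upper tail: P(X >= observed) for X ~ Binomial(n, alpha).
    tail = sum(
        math.comb(n, k) * alpha ** k * (1 - alpha) ** (n - k)
        for k in range(observed, n + 1)
    )
    return observed, expected, tail
```

A small tail probability for the "causal" set, and nothing of the sort for the "reactive" set, would match the pattern described above.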
While not everyone has easy access to whole-genome expression data on MZ twins before doing a GWAS, I wonder if the idea can be extended to siblings or even more distant relatives, perhaps using the kinship coefficient as a measure of relatedness between two individuals to "nudge" the transcript in question more towards causal versus reactive. Anyhow, check out the paper linked below; it's a very clever idea.
(On a slightly related note, check out this interesting discussion about open access publishing a la PLoS versus traditional scientific publishing)
PLoS Genetics: Use of Genome-Wide Expression Data to Mine the “Gray Zone” of GWA Studies Leads to Novel Candidate Obesity Genes
Great post, I like the idea. Mining the gray zone is always a little challenging, especially if you don't have anything *but* gray...
I'm looking into using functional information for this, but obviously you can find nice functional candidates everywhere in the genome.