A new paper in Nature by the Wellcome Trust Case Control Consortium, examining 16,000 cases of eight common diseases and 3,000 shared controls, finds that common CNVs probed on existing arrays are well tagged by SNPs and are unlikely to contribute much to common human disease. As for where the missing heritability may lie, Peter Donnelly was quoted in the Times Online as saying "my position now is to be very skeptical about the role of common CNVs...we have shown it wasn't Colonel Mustard in the ballroom with the candlestick. It narrows down the search for what is responsible."
Nature: Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls
Abstract: Copy number variants (CNVs) account for a major proportion of human genetic polymorphism and have been predicted to have an important role in genetic susceptibility to common disease. To address this we undertook a large, direct genome-wide study of association between CNVs and eight common human diseases. Using a purpose-designed array we typed ~19,000 individuals into distinct copy-number classes at 3,432 polymorphic CNVs, including an estimated ~50% of all common CNVs larger than 500 base pairs. We identified several biological artefacts that lead to false-positive associations, including systematic CNV differences between DNAs derived from blood and cell lines. Association testing and follow-up replication analyses confirmed three loci where CNVs were associated with disease—IRGM for Crohn’s disease, HLA for Crohn’s disease, rheumatoid arthritis and type 1 diabetes, and TSPAN8 for type 2 diabetes—although in each case the locus had previously been identified in single nucleotide polymorphism (SNP)-based studies, reflecting our observation that most common CNVs that are well-typed on our array are well tagged by SNPs and so have been indirectly explored through SNP studies. We conclude that common CNVs that can be typed on existing platforms are unlikely to contribute greatly to the genetic basis of common human diseases.
Wednesday, March 31, 2010
Tuesday, March 30, 2010
Federal Courts Invalidate Myriad's Breast Cancer Gene Patents
A District Court handed down a summary judgment invalidating most of Myriad's claims to both the BRCA1 DNA sequence and the method of testing for early-onset familial breast and ovarian cancer. See Genetic Future and Genomics Law Report for analysis.
Tags:
News,
Noteworthy blogs,
Policy
Wednesday, March 24, 2010
Vanderbilt-Ingram Cancer Center Retreat
VICC's retreat this year is Tuesday, May 11, 2010. The theme of the retreat looks interesting: "Genomic Approaches for Personalized Medicine." You can register free here.
VICC retreat: Genomic Approaches for Personalized Medicine
Tags:
Announcements
Tuesday, March 23, 2010
Video: ggplot2 Creator Hadley Wickham's Short Course on Data Visualization Using R
Hadley Wickham, creator of ggplot2, has posted a two-hour video on data visualization using R. You can find links to the videos and slides over at the Revolutions blog.
Check back here soon. I am working with Hadley to arrange a day-long ggplot2 short course here at Vanderbilt this summer. I'll post the date and registration info once everything is set up.
Video: Hadley Wickham gives a short course on graphics with R (via Revolutions)
Tags:
ggplot2,
R,
Visualization
Thursday, March 18, 2010
Create annotated GWAS manhattan plots using ggplot2 in R
*** Update April 25, 2011: This code has gone through a major revision. Please see the updated code and tutorial here. ***
**** UPDATE, May 15 2014 *****
The functions described here have now been wrapped into an R package. View the updated blog post or see the online package vignette for how to install and use. If you'd still like to use the old code described here, you can access this at version 0.0.0 on GitHub. The code below likely won't work.
*****************************
A few months ago I showed you in this post how to use some code I wrote to produce manhattan plots in R using ggplot2. The qqman() function I described in the previous post actually calls another function, manhattan(), which has a few options you can set. I recently had to update this function to allow me to color code SNPs of interest, similar to the plots shown in figure 1 of Cristen Willer's 2008 Nature Genetics paper on lipids. I'll try to explain how to use that feature here.
The only extra thing you'll need here is a list of SNPs that you want to highlight. One caveat: the rs numbers in that list can't have the "rs" prefix. They must be integers. E.g. if you want to highlight rs1234 and rs5678, you would create a vector containing the integers 1234 and 5678. If you already have a list of SNPs, use the substr() command to extract only the digits from the rs numbers.
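Here is a minimal sketch of that prefix-stripping step. The rs IDs are made up for illustration, and sub() is shown only as an alternative to the substr() approach described above; neither line comes from the qqman.r code itself.

```r
# Hypothetical rs IDs, for illustration only
rsids <- c("rs1234", "rs5678", "rs166479")

# Option 1: substr() keeps everything from the 3rd character on,
# i.e. it drops the two-character "rs" prefix
ImportantSNPs <- as.integer(substr(rsids, 3, nchar(rsids)))

# Option 2: sub() removes a leading "rs" only where one is present
ImportantSNPs2 <- as.integer(sub("^rs", "", rsids))

ImportantSNPs
```

Both options give the same integer vector here. The sub() version is a little safer if some of your IDs have already had the prefix removed, since substr() would chop two digits off those.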
Once you load in your PLINK results and your vector containing the rs numbers you want to highlight, simply call the manhattan() function with the options annotate=TRUE and SNPlist=x, where x is the name of the vector containing rs numbers.
Here's some example code:
# This requires ggplot2
library(ggplot2)
# First, load these functions from source:
source("http://dl.dropbox.com/u/66281/0_Permanent/qqman.r")
# Next, load your PLINK results file into a data frame:
mydata <- read.table("plink.qassoc", header=TRUE)
# Assuming you already have a vector of rs numbers to highlight:
head(ImportantSNPs)
# [1] 3821815 1851665 1621816 1403694 1656922  166479
# Call the manhattan function, with annotate=TRUE.
# The SNPlist argument takes the vector of SNPs to highlight.
# Save the plot to an object:
myplot <- manhattan(mydata, annotate=TRUE, SNPlist=ImportantSNPs)
# Finally, save the plot in the current directory using ggsave():
ggsave("manhattan.png", myplot, width=12, height=9, dpi=100)
If all goes well, you should have a manhattan plot with SNPs of interest highlighted. It might look something like this:
A few tips: You can use the UCSC genome browser to look up coordinates for genes, then select rs numbers based on that range, if you want to highlight certain genes. The default color is green but you can change this on line 118 of the code at the URL above.
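To make the first tip concrete, here is a rough sketch of picking rs numbers by position. It assumes your results data frame has the CHR, SNP and BP columns that PLINK writes to its association output; the toy data and the gene coordinates below are invented for illustration.

```r
# Toy stand-in for a PLINK results data frame (values are made up)
mydata <- data.frame(
  CHR = c(1, 1, 2, 2),
  SNP = c("rs1234", "rs5678", "rs91011", "rs1213"),
  BP  = c(1000, 5000, 2500, 9000),
  P   = c(0.5, 0.01, 0.2, 0.03),
  stringsAsFactors = FALSE
)

# Hypothetical gene coordinates, as you might copy them from the UCSC genome browser
gene_chr   <- 2
gene_start <- 2000
gene_end   <- 3000

# Flag SNPs that fall inside the gene's range on the right chromosome
in_gene <- mydata$CHR == gene_chr &
  mydata$BP >= gene_start &
  mydata$BP <= gene_end

# Strip the "rs" prefix so the result can be passed as SNPlist
ImportantSNPs <- as.integer(sub("^rs", "", mydata$SNP[in_gene]))
```

Make sure the coordinates you look up are on the same genome build as the positions in your results file, otherwise the range will not line up.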
Tags:
ggplot2,
GWAS,
R,
Visualization
Francis Collins: Computational biologists are "breakthrough artists"
Just caught this on the OpenHelix Blog. In an interview with Charlie Rose, NIH director Francis Collins said computational biologists will be the "breakthrough artists" of the future.
CHARLIE ROSE: You have said if you were starting over you would be a computational biologist.
FRANCIS COLLINS: I did say that. I still say that. Computational biologists are having a really good time and it’s going to get better.
CHARLIE ROSE: Their day is coming?
FRANCIS COLLINS: Their day is here, but it’s going to be even more here in a few years. So what do they do? They are people who are jointly trained in studying biology in all of its complexes, but they’re also very capable at computational analysis of huge data sets, because — in part because of NIH and the ethic that was adopted by the genome project, huge amounts of data are being made publicly accessible every day about all kinds of disease questions.
CHARLIE ROSE: So they’re going to be the breakthrough artists of the future?
FRANCIS COLLINS: They’re going to be the breakthrough artists...
Tags:
News
Tuesday, March 16, 2010
$25 Plate Centrifuge
While reading through an article on job hunting success on Bitesize Bio I stumbled upon another piece there that's definitely in the spirit of "getting things done" in genetics research.
I had always halfway considered going into business manufacturing lab supplies. Take a $10 Easy-Bake Oven, add slightly tighter temperature regulation, call it a hybridization oven, and sell it for thousands. It wasn't long ago, before GWAS was all the rage, that I was doing TaqMan genotyping, and I remember how awful the results would be when I forgot to spin down the plates before starting the PCR. I haven't a clue how many tens of thousands of dollars a real plate centrifuge and rotors would set you back, but check out the post on Bitesize Bio below, where a few resourceful folks show you how to make a plate centrifuge from a salad spinner in 5 minutes for $25.
Bitesize Bio: How to Build a Plate Centrifuge for $25
Monday, March 15, 2010
Seminar: Pathway-based analysis for genome-wide association studies
Vanderbilt Epidemiology Center, Institute for Medicine and Public Health presents:
"Pathway-based analysis for genome-wide association studies"
Steven Chen, Ph.D.
Assistant Professor of Biostatistics
Tuesday, March 16, 2010
9:00 AM - 10:00 AM
2525 West End Avenue 6th Floor Boardroom
Tags:
Announcements,
GWAS,
Pathways