Friday, February 24, 2012

I'm Hiring!

I direct the Bioinformatics Core at the University of Virginia, and I'm hiring. Visit this link on the UVA Jobs website for more information. Here's the description:
The University of Virginia Bioinformatics Core is seeking a full-time bioinformatics analyst. The analyst will work with other core staff on grant-funded and chargeback-based projects to manage and analyze large-scale datasets produced by next-generation sequencing. The analyst will identify opportunities and implement solutions for managing, visualizing, analyzing, and interpreting genomic data, including studies of gene expression (RNA-seq and microarrays), pathway analysis, protein-DNA binding (e.g. ChIP-seq), DNA methylation, and DNA variation, using Affymetrix, Illumina, NimbleGen, Agilent, Roche 454, Ion Torrent, and other high-throughput platforms in both human and model organisms. The analyst will work closely with the core director to assist in experimental design, provide expert consultation and technical and scientific support for UVA investigators, and assist in outreach and training activities. The analyst will organize large-scale sequence data sets, manipulate and format data with Perl, Python, or other scripting languages, use established software to assess quality and analyze data, schedule and run jobs on a high-performance computing cluster, use Unix or a scripting language to extract meaningful results from output, use software or genome browsers for visualization, and use established databases and techniques for annotating genetic variants and results from expression/DNA-binding experiments. The successful candidate will have a demonstrated ability to translate biological questions into technical designs, and to identify, prioritize, and execute bioinformatics tasks to meet project goals and deadlines. An M.S. in Bioinformatics, Genomics, Biostatistics, or a related field is required for this position.
I'm Hiring - Bioinformatics Analyst in the UVA Bioinformatics Core

Friday, February 17, 2012

Your Publications (with PMCID) as a PubMed Query

I'm updating my CV and biosketch for a few grant applications, and for some time now, NIH has required you to include the PubMed Central ID for each article you publish that arose from NIH support. I only have a dozen or so papers indexed in PubMed, but I still wanted a way to do this automatically. If you have scores of publications, looking up all the PMCIDs could easily become a hassle.

First, create an account at My NCBI. Under your bibliography, click "Manage My Bibliography." Then click "Add citation," then in the new window that comes up, select "Citation from PubMed" and hit the "Go To PubMed" button.

Now the trick here is constructing a PubMed query that will get your publications only. There are lots of Stephen D. Turners out there, so I had to get creative. This query construction tip comes to me by way of my colleague here at UVA, Aaron Mackey:

For many people, simple PubMed author searches suffice, e.g. "Pearson WR[Author]". For some, such name-based searches get it mostly right, but may include a few spurious hits. For these cases, it's easy enough to exclude those false hits explicitly (e.g. "Mackey AJ"[Author] NOT 9850730[PMID] NOT 10730495[PMID] gets rid of the two AJ Mackey publications that are not, in fact, mine). For others, simple author searches do not suffice at all, but usually adding an institution and/or departmental affiliation does narrow the results sufficiently (e.g. for Jeff Smith, Biochemistry: "Smith JS"[au] AND "University of Virginia"[Affiliation] AND "Biochemistry"[Affiliation] identifies the 16 articles for which Jeff Smith is the senior author; Jeff could also add a few collaborative publications by adding those PubMed IDs to the search, i.e. adding "OR 17482543[PMID]" to the end of his query).

When I did this for myself, I searched by author, AND (any of my institutional affiliations separated by ORs), but NOT (any of the PMIDs that were not mine, separated by ORs). Apparently there was once another Stephen D. Turner at UVA in the Department of Urology. Here is the query I ended up with:

"Turner SD"[Au] AND ("James Madison"[Affiliation] OR Vanderbilt[Affiliation] OR Hawaii[Affiliation] OR "University of Virginia"[Affiliation]) NOT (11514333[PMID] OR 11058553[PMID])

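If you need to build this kind of query for several people, the pattern above (author, AND any affiliation, NOT any excluded PMID, optionally OR a few extra PMIDs) can be assembled with a few lines of Python. This is only a sketch: the function names `build_pubmed_query` and `esearch_url` are my own, not part of any NCBI tool, and the second function just builds a URL for NCBI's E-utilities `esearch` endpoint without actually sending a request.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (no request is sent in this sketch)
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(author, affiliations=(), exclude_pmids=(), include_pmids=()):
    """Hypothetical helper: author AND (affil OR ...) NOT (pmid OR ...) OR pmid ..."""

    def quote(term):
        # Quote multi-word terms so PubMed treats them as a phrase
        return f'"{term}"' if " " in term else term

    query = f'"{author}"[Au]'
    if affiliations:
        affil = " OR ".join(f"{quote(a)}[Affiliation]" for a in affiliations)
        query += f" AND ({affil})"
    if exclude_pmids:
        excluded = " OR ".join(f"{p}[PMID]" for p in exclude_pmids)
        query += f" NOT ({excluded})"
    # Extra papers the name/affiliation search misses are appended at the end
    for p in include_pmids:
        query += f" OR {p}[PMID]"
    return query


def esearch_url(query, retmax=200):
    """Build (but do not fetch) an esearch URL for the query."""
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": query, "retmax": retmax})


query = build_pubmed_query(
    "Turner SD",
    affiliations=["James Madison", "Vanderbilt", "Hawaii", "University of Virginia"],
    exclude_pmids=[11514333, 11058553],
)
print(query)
print(esearch_url(query))
```

The first line printed is the same query shown above, so you can paste it straight into the PubMed search box. Extra PMIDs are appended last because PubMed evaluates Boolean operators left to right, which matches the "add OR ...[PMID] to the end" advice in Aaron's tip.
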
The final step is clicking the "Send to" link at the top right, and sending the results of your query to My Bibliography.


Now, when you are back at My NCBI, you should see a list of all your publications, complete with both the PMID and PMCID, ready to go in your biosketch.


You can then export this bibliography as text, or simply copy/paste. Finally, you have the option of making your bibliography public (example).



Wednesday, February 8, 2012

Webinar: Genomic Networks - Resolving Biomarkers from a Cloud of Data

Kevin White from the University of Chicago will be giving a special guest lecture at NCI next week on systems biology approaches to mine genomics data for biomarkers and therapeutic targets. The lecture will be available online as a videocast.

Title: Genomic Networks in Development and Cancer: Resolving Biomarkers and Therapeutic Targets from a Cloud of Data

Speaker: Kevin White, University of Chicago

When: Tuesday, February 14, 2012, 1:00pm EST

Summary

Systems level approaches to construct abstract molecular networks can lead to predictions about genetic and biochemical functions in cells, organisms and in disease states. I will show examples of this approach from work in my laboratory. In one example we used an integrated experimental and computational approach to construct a large scale functional network in Drosophila melanogaster built around key transcription factors involved in the process of embryonic segmentation. Our network model is based on a combination of gene expression, transcription factor DNA binding site mapping, automated literature mining and protein-protein interaction mapping. We provide a strategy for reducing the dimensionality of the massive networks that result from such integrated whole genome analyses. 

Using results from one factor in particular, we demonstrated that our approach can rapidly translate a finding in a model organism to the development of a therapeutic target in kidney cancer. In another example, we built a large scale network based on gene expression and genome-wide ChIP results for 40 transcription factors, including two dozen Nuclear Receptor (NR) class proteins. Using this NR network we identified novel prognostic signatures for breast cancer survival and recurrence, as well as new therapeutic leads. 

Finally, if time permits I will talk about how we are mining The Cancer Genome Atlas along with data from the Chicago Cancer Genomes Project using the Bionimbus Cloud in order to identify new tumor suppressors and panels of genetic markers capable of classifying cancer subtypes that correspond to patient outcome.

Hadley Wickham: ggplot2 Webinar (Today!)


Title: A Backstage Tour of ggplot2 with Hadley Wickham
Date: Wednesday, February 8, 2012
Time: 11:00AM - 12:00PM Pacific
Presenter: Hadley Wickham, Professor of Statistics, Rice University

Register here.

I used ggplot2 extensively a few years ago, but went back to base graphics when ggplot2 proved too slow for a project I was working on. Both ggplot2 and plyr have improved a lot since then, and I'm starting to pick ggplot2 back up. This webinar will give an overview of ggplot2, preview some of its forthcoming features, and discuss its internals, how development has gone over the last few years, and how that development is becoming easier.

I received an email yesterday saying that the registration list has more than 1,000 names on it, so it's a good idea to sign in to the webinar early to make sure you get a spot. Hit the link below to register and you'll get a link to the webinar.

A Backstage Tour of ggplot2 with Hadley Wickham
Creative Commons License
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.