Getting Genetics Done: Noteworthy blogs


Tuesday, January 20, 2015

Microbiome Digest Blog

I have a noteworthy blogs tag on this blog that I sort of forgot about and haven't used in years. But I recently started reading one that definitely qualifies for the distinction.

The Microbiome Digest is written by Elisabeth Bik, a scientist studying the microbiome at Stanford. It's a near-daily compilation of papers and popular press articles mostly relating to microbiome research, split up into categories like the human microbiome, the non-human microbiome (soil, animal, plants, other environments), metagenomics and bioinformatics methods, reviews, news articles, and other general science or career advice articles.

I imagine Elisabeth spends hours each week culling the huge onslaught of literature into these highly relevant digests. I wish someone else would do the same for other areas I care about so I don't have to. I subscribe to the RSS feed and the email list so I never miss a post. If you're at all interested in metagenomics or microbiome research, I suggest you do the same!

Microbiome Digest

Tuesday, May 29, 2012

How to Stay Current in Bioinformatics/Genomics

A few folks have asked me how I get my news and stay on top of what's going on in my field, so I thought I'd share my strategy. With so many sources of information begging for your attention, the difficulty is not necessarily finding what's interesting, but filtering out what isn't. What you don't read is just as important as what you do, so when it comes to things like RSS, Twitter, and especially e-mail, it's essential to filter out sources where the content consistently fails to be relevant or capture your interest. I run a bioinformatics core, so I'm more broadly interested in applied methodology and study design rather than any particular phenotype, model system, disease, or method. With that in mind, here's how I stay current with things that are relevant to me. Please leave comments with what you're reading and what you find useful that I omitted here.

RSS

I get the majority of my news from RSS feeds from blogs and journals in my field. I spend about 15 minutes per day going through headlines from the following sources:

Journals. Most journals have separate RSS feeds for their current table of contents as well as their advance online ahead-of-print articles.

Blogs. Some of these blogs are very relevant to what I do on the job. Others are more personal interest.

The OpenHelix Blog
Ensembl blog
Galaxy News
Blue Collar Bioinformatics
Homologus
Golden Helix - our 2 SNPs
Genomics Law Report
R-bloggers (aggregates feeds from >350 blogs about R)
Genomes Unzipped
Jason Moore's Epistasis Blog
23andMe - The Spittoon
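
Skimming headlines and keeping only the relevant ones can be partly automated. Here is a minimal Python sketch of that idea, not the setup described in this post: the sample feed, the keyword list, and the helper name relevant_items are all made up for illustration. A real reader would download the feed XML from a blog or journal first.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for a downloaded RSS document.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>Fast RNA-seq alignment on a laptop</title><link>http://example.org/1</link></item>
    <item><title>Conference travel photos</title><link>http://example.org/2</link></item>
    <item><title>A new ChIP-seq peak caller</title><link>http://example.org/3</link></item>
  </channel>
</rss>"""

# Assumed topics of interest, lowercase for case-insensitive matching.
KEYWORDS = ("rna-seq", "chip-seq")


def relevant_items(feed_xml, keywords):
    """Return (title, link) pairs whose title mentions any keyword."""
    root = ET.fromstring(feed_xml)
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if any(k in title.lower() for k in keywords):
            hits.append((title, link))
    return hits


for title, link in relevant_items(SAMPLE_FEED, KEYWORDS):
    print(title, link)
```

Matching on titles alone is crude, but it mirrors the 15-minute headline scan: anything that fails the keyword test never reaches you.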

Forums.

Mailing lists

I prefer to keep work and personal email separate, but I have all my mailing list email sent to my Gmail because Gmail's search is better than any alternative. I have a filter set up to automatically filter and tag mailing list digests under a "Work" label so I can get to them (or filter them from my inbox) easily.

Bioconductor (daily digest)
Galaxy mailing lists. I subscribe to the -announce, -user, and -dev mailing lists, but I have a Gmail filter set up so that messages from the -user and -dev lists automatically skip the inbox and are marked as read. I don't care to look at these every day, but again, it's handy to be able to use Gmail's search functionality to look through old mailing list responses.

Email Alerts & Subscriptions

Again, email can get out of hand sometimes, so I prefer to have only the things I really don't want to miss sent to my email. For the rest I use RSS.

SeqAnswers subscriptions. When I ask a question or find a question that's relevant to something I'm working on, I subscribe to that thread for email alerts whenever a new response is posted.
Google Scholar alerts. I have alerts set up to send me emails based on certain topics (e.g. [ rna-seq | transcriptome sequencing | RNA-sequencing ] or [ intitle:"chip-seq" ]), or when certain people publish (e.g. ["ritchie md" & vanderbilt]). I also use this to alert me when certain consortia publish (e.g. ["Population Architecture using Genomics and Epidemiology"]).
PubMed Saved Searches using MyNCBI, because Google Scholar doesn't catch everything. I have alerts set up for RNA-seq, ChIP-Seq, bioinformatics methods, etc.
GenomeWeb subscriptions. Most of these are once per week, except Daily Scan. I subscribe to Daily Scan, Genome Technology, BioInform, Clinical Sequencing News, In Sequence, and Pharmacogenomics Reporter. BioInform has a "Bioinformatics Papers of Note" column and In Sequence has a "Sequencing Papers of Note" column in every issue. These are good for catching things I might have missed with the Scholar and PubMed alerts.
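
For anyone who would rather script a topic search than set up a saved search through the MyNCBI website, here is a minimal Python sketch. It only builds a query URL for NCBI's E-utilities esearch endpoint and does not contact the server. The helper names pubmed_term and esearch_url, the choice of the Title/Abstract field, and the 7-day window are assumptions made for illustration, not part of the alerts described above.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_term(phrases, field="Title/Abstract"):
    """OR together quoted phrases, each restricted to one PubMed field."""
    return " OR ".join(f'"{p}"[{field}]' for p in phrases)


def esearch_url(term, days=7, retmax=20):
    """Build a URL asking for records added to PubMed in the last `days` days."""
    params = {
        "db": "pubmed",
        "term": term,
        "reldate": days,      # look back this many days
        "datetype": "edat",   # filter on the date the record entered PubMed
        "retmax": retmax,     # maximum number of IDs to return
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)


term = pubmed_term(["rna-seq", "transcriptome sequencing", "RNA-sequencing"])
print(term)
print(esearch_url(term))
```

Fetching the printed URL once a week would give roughly what a saved search email does, minus the convenience of having it land in your inbox.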

Twitter

99.9% of Twitter users have way too much time on their hands, but when used effectively, Twitter can be incredibly powerful for both consuming and contributing to the dialogue in your field. Twitter can be an excellent real-time source of new publications, fresh developments, and current opinion, but it can also quickly become a time sink. I can tolerate an occasional Friday afternoon humorous digression, but as soon as off-topic tweets become regular it's time to unfollow. The same is true with groups/companies - some deliver interesting and broadly applicable content (e.g. 23andMe), while others are purely a failed attempt at marketing while not offering any substantive value to their followers. A good place to start is by (shameless plug) following me or the people I follow (note: this isn't an endorsement of anyone on this list, and there are a few off-topic people I follow for my non-work interests). I can't possibly list everyone, but a few folks who tweet consistently on-topic and interesting content are: Daniel MacArthur, Jason Moore, Dan Vorhaus, 23andMe, OpenHelix, Larry Parnell, Francis Ouellette, Leonid Kruglyak, Sean Davis, Joe Pickrell, The Galaxy Project, J. Chris Pires, Nick Loman, and Andrew Severin. Also, a hashtag (a word prefixed by #) is used to mark keywords or topics on Twitter. I occasionally browse through the #bioinformatics and #Rstats hashtags.

Monday, May 2, 2011

Golden Helix: A Hitchhiker's Guide to Next Generation Sequencing

This is a few months old but I just got around to reading this series of blog posts on next-generation sequencing (NGS) by Gabe Rudy, Golden Helix's VP of product development. This series gives a seriously useful overview of NGS technology, then delves into the analysis of NGS data at each step, right down to a description of the most commonly used file formats and tools for the job. Check it out now if you haven't already.

Part One gives an overview of NGS trends and technologies. Part Two describes the steps and programs used in the bioinformatics of NGS data, broken down into three distinct analysis phases. Part Three gives more details on the needs and workflows of the final stage of the analysis of NGS data, or the "sense-making" phase. Finally, a fourth part is a primer on the common formats (FASTQ, SAM/BAM, VCF) and tools (BWA, Bowtie, VCFtools, SAMtools, etc.) used in NGS bioinformatics and analysis.

Wednesday, July 21, 2010

How to Read a Genome-Wide Association Study (@GenomesUnzipped)

Jeff Barrett (@jcbarret on Twitter) over at Genomes Unzipped (@GenomesUnzipped) has posted a nice guide for the uninitiated on how to read a GWAS paper. Barrett outlines five critical areas that readers should pay attention to: sample size, quality control, confounding (including population substructure), the replication requirement, and biological significance. It would be nice to see a follow-up post like this on things to look out for in studies that investigate other forms of human genetic variation such as copy number polymorphism, rare variation, or gene-environment interaction.

And this is also a convenient point for me to mention Genomes Unzipped - a collaborative blog covering topics relevant to the personal genomics industry, featuring posts by several of my favorite bloggers including Daniel MacArthur (of Genetic Future), Luke Jostins (of Genetic Inference), Dan Vorhaus (of Genomics Law Report), Jan Aerts (of Saaien Tist), Jeff Barrett, Caroline Wright, Katherine Morley, and Vincent Plagnol. GNZ, as it's called, has only been live for about two weeks, but looks like a good one to follow as the personal genomics industry begins to mature over the next few years.

Genomes Unzipped: How to Read a Genome-Wide Association Study

Wednesday, June 16, 2010

Ten Reasons Why Grad Students Should Blog

NYU PhD student Drew Conway has compiled a very nice list of 10 reasons why grad students should blog. I've been writing GGD for a little over a year now and it's been a great way to extend my own network past the Vanderbilt walls, participate in lively discussions with other scientists oceans away, and to write stuff that people actually read and find useful. Especially for grad students (and postdocs as well), one of the most important points in Drew's post is on using a blog to establish an identity:

If you are in graduate school to be the “best kept secret in academia,” you are making a fatal mistake. As with any other job market, getting the proverbial foot in the door for a job talk at a university is a critical first step. As a graduate student it can be incredibly difficult to navigate the sea of senior faculty, their research agendas, and how that fits into your career goals. Having a blog provides you an independent beacon upon which you can broadcast your own ideas.

Another reason I can add to the list is related to #4 - extending your network. It's very gratifying to go to a meeting or conference and meet someone who regularly reads your blog. In a way they already know who you are, and you immediately have a starting point to launch a conversation. Check out the full list at Drew's blog, Zero Intelligence Agents, at the link below.

Ten Reasons Why Grad Students Should Blog

Tuesday, March 30, 2010

Federal Courts Invalidate Myriad's Breast Cancer Gene Patents

A District Court handed down a summary judgment invalidating most of Myriad's claims to both the BRCA1 DNA sequence and the method of testing for early-onset familial breast and ovarian cancer. See Genetic Future and Genomics Law Report for analysis.

Monday, January 18, 2010

Coming to R from SQL, Python, SAS, Matlab, or Lisp

Head over to Revolutions Blog for a list of PDF and PowerPoint resources for making the transition to R from other programming or stats languages. All of these notes come from the New York R meetup. I enjoyed browsing the meetup's files - lots of PowerPoints, PDFs, and example R data files for various topics, including several slideshows on ggplot2. Don't forget the ggplot2 tutorial I posted here earlier this week if you're completely new to ggplot2.

If you're coming from SPSS or SAS, I've read good things about Robert Muenchen's book R for SAS and SPSS Users, which can be found on Amazon (~$50), or downloaded for free from http://rforsasandspssusers.com/.

Revolutions: Coming to R from SQL, Python, SAS, Matlab, or Lisp

Wednesday, January 6, 2010

New Features in ggplot2 0.8.5

Learning R blog details some of the new features in the latest update to ggplot2. The latest version includes functions to make it easier to change axis and legend labels, as well as a function to easily set the limits of the plot display outside the range of the data.

Be sure to check back next week - I'm putting together a short introductory ggplot2 tutorial.

Learning R: New features in ggplot2 version 0.8.5

Friday, September 25, 2009

What happens when a consumer genetics company goes bankrupt?

Dan Vorhaus and Lawrence Moore recently put together this excellent three-part series on Genomics Law Report. Headlines about deCODE Genetics on the brink of insolvency and major shifts in the upper management of 23andMe inspired this series of posts on what would happen when a direct-to-consumer (DTC) genomics company declares bankruptcy.

Bankruptcy law authorizes the sale of the assets of a business in bankruptcy, and genomic data is likely the most valuable asset of any DTC genomics company. First the authors dissect the privacy policy and terms of service for three major DTC companies: 23andMe, deCODE Genetics, and TruGenetics. Next there's a discussion of how the legal system would treat a DTC genomics company's bankruptcy. The series wraps up with a brief discussion of how this ultimately affects the average DTC genomics customer.

Genomics Law Report: What happens if a DTC Genomics Company Goes Belly-Up?

Thursday, September 10, 2009

Machine Learning in R

Revolutions blog recently posted a link to R code by Joshua Reich with self-contained examples of using machine learning techniques in R, including various clustering methods (k-means, nearest neighbor, and kernel), recursive partitioning (CART), principal components analysis, linear discriminant analysis, and support vector machines. This post also links to some slides that go over the basics of machine learning. Looks like a good place to start learning about ML before hand-rolling your own code.

Be sure to check out one of Will's previous posts on hierarchical clustering in R.

Revolutions: Machine learning in R, in a nutshell

Thursday, May 28, 2009

Statistics and sex appeal

Google's chief economist was recently quoted as saying "The sexy job in the next ten years will be statisticians… The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that's going to be a hugely important skill." I'll leave you for the weekend with this ego-boosting article relating how our skill set as statisticians is a hot commodity in the real world.

Dataspora Blog: The three sexy skills of data geeks

Thursday, May 7, 2009

100 publications every grad student should read

Jason Moore at the previously mentioned Epistasis Blog has begun compiling a list of 100 papers every grad student should read, broken down by discipline. Right now the list is in its infancy, but it's a good start. I'll post here when the list is updated again.

100 Publications Every Graduate Student Should Read

UPDATE 2009-05-08: The list has grown substantially since yesterday. Check the link again!

Thursday, April 9, 2009

Epistasis Blog

If you're interested in gene-gene and gene-environment interaction (and who wouldn't be?), then you should check out the Epistasis Blog. Our friend and colleague Jason Moore at Dartmouth Medical School has maintained compgen.blogspot.com since 2005 writing about epistasis, computational genetics, and related topics. I've personally stumbled upon several interesting papers mentioned here that I may have otherwise missed.

Also, be sure to check out other posts in GGD's Noteworthy Blogs series!

Jason Moore's Epistasis Blog (compgen.blogspot.com)

Tuesday, March 31, 2009

Noteworthy Blog: Chun Li's Course Blog

For the second installment in our Noteworthy Blog series, take a look at Chun Li's biostatistics course blog. Several years ago, CHGR faculty member Chun Li taught a class in biostatistics, maintaining this blog over the duration of the course. Although it hasn't been updated since the course ended, it's filled with masterful guidance on data analysis issues and insightful commentary on important topics and common misconceptions about statistics.

Chun's Course Blog

Monday, March 23, 2009

Noteworthy Blog: The Spittoon

For those of you who attend our computational genetics journal club every other week, you've all heard about this. Say what you will about the "consumer genetics" enterprise, 23andMe maintains an excellent blog. In their "SNPwatch" category, The Spittoon surveys and summarizes the latest findings in human genetics research before they hit the press. About 50% of their content comes from Nature Genetics advance online, and the rest from a smattering of other journals. They usually offer a one-page summary of the research findings detailing the associated SNP's rs-number, risk allele, odds ratio estimate, and the sample size used.

The Spittoon
