Friday, February 11, 2011

Shellfish for Parallel PCA on GWAS data (Alternative to Eigenstrat)

Recently I tried compiling Eigensoft on my Ubuntu 10.10 Linux system running in VirtualBox and had no success. From comments on this blog post, it looks like the newer Ubuntu distros don't ship libg2c0 and related libraries (which were part of gcc3), because gcc4 uses gfortran instead. So it looks like Eigensoft won't be compatible with any of the newer Linux distros, at least not without some major tweaking that I'm not prepared to bother with.

Ross Lazarus suggested in a comment to try Shellfish as an alternative. I was able to compile Shellfish without a problem on Ubuntu 10.10 but I haven't had a chance to try it out yet, nor make any comparisons with Eigensoft. The documentation on the website shows that it can directly utilize PLINK ped and map files, so this eliminates the burden of using a tool like PLATO to convert between formats.

Has anyone ever used Shellfish (or anything else besides Eigensoft) for PCA on GWAS or AIMs data?

EDIT 2011-02-14: A tip of the hat to Mike Baldwin for pointing out to me that Eigensoft version 4 is now available on Alkes Price's website. A Google search always puts you at Eigensoft version 3 from the Reich lab software page, which is the old version that doesn't play well with newer Linux distros. I had no problem using Eigensoft 4 on my Ubuntu 10.10 system.


  1. Unfortunately I can't help on this one. But welcome back. Hope the weather there (in Hawaii) is better than here.

  2. Has anyone written the person in charge of maintaining EIGENSOFT? I can't think of their name off hand, but I have it in an email and I think it is on the website. I've written them with questions before (specifically the number of chromosomes allowed) and they were very helpful. Perhaps you could alter the source code?
    I do not have experience with Shellfish, but I did implement the Price et al method using Matlab, since I was using inferred genotypes which EIGENSOFT does not allow.
    Let us know what you think of Shellfish.

  3. I just compiled Eigensoft in an Ubuntu 10.10 VirtualBox VM last week. It worked fine. Are you using the new version 4? He changed/updated the version of Fortran called by the makefile, so it works without all of the drama that was required before (the solution before was to either change the makefile to reference a newer version of Fortran or temporarily use an Ubuntu repository from several versions back to get the gc77 (?) installed).

    You're calling "clobber" before you compile, right?


  4. Mike,

    Never called clobber before compiling. Can you tell me exactly how you did this?

    Thanks in advance!

  5. You are using version 4.0 beta, correct?

    Once you've downloaded/extracted the files, take a look at the makefile in the src directory with a text editor. There should be enough in there to figure out the exact syntax in case my memory isn't what it should be.

    IIRC . . .
    $ make clobber
    (this will nuke/clean house and you can see exactly what it does from looking at the make file)

    Then call the makefile to recompile everything as you normally would (I can't remember the syntax for some reason; it's either "$ make filename" or "$ makefile", isn't it?)
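
The clean-rebuild sequence described in the comment above can be sketched as a small shell function. This is only a sketch under stated assumptions: `rebuild_clean` and the `MAKE` override are hypothetical names introduced here, the `clobber` target is the one named in the comment, and a plain `make` with no arguments builds whatever default target the makefile defines. Check the makefile in the src directory for the targets your version actually provides.

```shell
# Sketch: clean rebuild of a make-based source tree (e.g. the Eigensoft src directory).
# rebuild_clean and the MAKE override are illustrative names, not part of Eigensoft.
rebuild_clean() {
    (
        src_dir="${1:-src}"        # directory that contains the makefile
        make_cmd="${MAKE:-make}"   # allows substituting another make binary

        cd "$src_dir" || exit 1
        "$make_cmd" clobber || exit 1   # wipe earlier build products (target named in the comment)
        "$make_cmd"                     # no arguments: build the makefile's default target
    )
}

# Usage (assumes the archive was extracted so that ./src holds the makefile):
#   rebuild_clean src
```

Running both steps in a subshell keeps the `cd` from changing the caller's working directory, and the function stops early if the clean step fails.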


  6. I am running into the same problem now. I am using Ubuntu 10.04. It seems that I should upgrade to 10.10.

    thanks for your comments.

  7. Hi, everyone! I could compile EIGENSOFT 4.2 just fine in Ubuntu 12.04, but I had to move the $LAPCK string to the end of the line wherever it appeared in the makefile.

    Now I'm having some problems with format conversion. I tried to use the PLATO tool, but I can't quite understand how to install it. Where are the executables? Do I have to create a symbolic link to some other package? Could anyone please help me get PLATO working?

    Thanks a lot in advance!

    1. Pedro - I would recommend contacting the PLATO developers. They're pretty responsive.



Creative Commons License
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.