Monday, December 28, 2009

Capture system commands as R objects with system(..., intern=T)

Just discovered this very handy R command to capture the output from a system command as an R object.  I wanted to use R to read in the output from another program (PLINK) and do some processing on each output file. Of course if the files are named sequentially (plink1.out, plink2.out, plink3.out, etc.) this would be simple with a for loop.  This wasn't the case for me, but I could still list all the files I wanted to process with some pattern matching in Unix using wildcards.  All the files had "_hdl" in the name, and ended with ".qassoc".  Here's where the system() function is helpful.

This command issues "ls *_hdl*.qassoc" to the system, just as if you were typing it in at the terminal.  The intern=TRUE tells the system command to treat the output as an R object.  I can store this in myfiles, then do some processing on myfiles, which is a vector containing all the filenames of files I want to process.

myfiles = system("ls *_hdl*.qassoc", intern=TRUE)

 [1] "0-all.01_hdl_modeled.qassoc"
 [2] "0-all.02_hdl_modeled_polyresid.qassoc"
 [3] "0-all.03_hdl_med.qassoc"
 [4] "0-all.04_hdl_med_polyresid.qassoc"
 [5] "0-all.05_hdl_med_smokage_polyresid.qassoc"
 [6] "1-male.01_hdl_modeled.qassoc"
 [7] "1-male.02_hdl_modeled_polyresid.qassoc"
 [8] "1-male.03_hdl_med.qassoc"
 [9] "1-male.04_hdl_med_polyresid.qassoc"
[10] "1-male.05_hdl_med_smokage_polyresid.qassoc"

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Creative Commons License
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.