#8 on the list was "Don't distribute bare JAR files." Bare JAR files are particularly annoying because the user has to invoke the software with something like:
java -Xmx1000m -jar /path/on/my/system/to/software.jar
A very simple solution to the bare JAR file problem is to distribute your Java tool with a shell script wrapper that makes it easier for your users to invoke. E.g., if I have GATK installed in ~/bin/ngs/gatk/GenomeAnalysisTK.jar, I can create this shell script at ~/bin/ngs/gatk/gatk (for another tool, replace GenomeAnalysisTK.jar with someOtherBioTool.jar):
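#!/bin/bash
# Directory containing this script (the JAR sits next to it)
PREFIX=$(dirname "$0")
# Forward all arguments unchanged to the JAR
java -Xmx500m -jar "$PREFIX/GenomeAnalysisTK.jar" "$@"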
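One detail worth getting right in a wrapper like this is how arguments are forwarded: an unquoted `$*` splits any argument that contains spaces (a file name, for instance), whereas `"$@"` passes each argument through unchanged. A small demonstration, using hypothetical helper functions that are not part of GATK:

```shell
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

# Forward arguments the way an unquoted $* does (word splitting applies).
forward_unquoted() { count_args $*; }

# Forward arguments with "$@" (each argument is preserved as-is).
forward_quoted() { count_args "$@"; }

forward_unquoted -I "my sample.bam"   # prints 3
forward_quoted   -I "my sample.bam"   # prints 2
```

With the unquoted form, Java would see `my` and `sample.bam` as two separate arguments.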
Once I make that script executable and include that directory in my path, calling GATK is much simpler:
# Make the utility script executable (do this once)
chmod +x ~/bin/ngs/gatk/gatk
# Put the GATK directory in your path (add to .bashrc)
export PATH=$HOME/bin/ngs/gatk/:$PATH
# The "old way" to invoke GATK's help:
java -Xmx100m -jar ~/bin/ngs/gatk/GenomeAnalysisTK.jar -h
# The easy way:
gatk -h
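The manual steps above (write the wrapper, make it executable) can be bundled into a small helper that generates a launcher next to any JAR. This is only a sketch: the function name `make_jar_wrapper`, its arguments, and the `JAVA_MEM` environment variable are assumptions made for illustration, not something GATK or any other tool provides.

```shell
# Sketch: generate a launcher script in the same directory as a JAR.
# Usage: make_jar_wrapper /path/to/tool.jar wrapper_name [default_heap]
make_jar_wrapper() {
    local jar_path="$1"             # e.g. ~/bin/ngs/gatk/GenomeAnalysisTK.jar
    local wrapper_name="$2"         # e.g. gatk
    local default_mem="${3:-500m}"  # default value passed to -Xmx

    if [ ! -f "$jar_path" ]; then
        echo "make_jar_wrapper: no such JAR: $jar_path" >&2
        return 1
    fi

    local jar_dir jar_file wrapper
    jar_dir=$(cd "$(dirname "$jar_path")" && pwd) || return 1
    jar_file=$(basename "$jar_path")
    wrapper="$jar_dir/$wrapper_name"

    # Unquoted heredoc: $jar_file and $default_mem are filled in now,
    # while the \$-escaped parts are left for the wrapper to expand at run time.
    cat > "$wrapper" <<EOF
#!/bin/bash
# Generated launcher for $jar_file
PREFIX=\$(dirname "\$0")
# JAVA_MEM (hypothetical) overrides the default heap size, e.g. JAVA_MEM=4g
exec java -Xmx"\${JAVA_MEM:-$default_mem}" -jar "\$PREFIX/$jar_file" "\$@"
EOF
    chmod +x "$wrapper"
    echo "$wrapper"   # print the path of the generated launcher
}
```

For the GATK layout used in this post, `make_jar_wrapper ~/bin/ngs/gatk/GenomeAnalysisTK.jar gatk` would write an executable `~/bin/ngs/gatk/gatk`. The sketch assumes the JAR file name contains no double quotes or dollar signs.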
Yes, I'm fully aware that making my own JAR launcher utility scripts for existing software will make my code less reproducible, but for quick testing and development I don't think it matters. The tip works best when developers distribute their JAR files together with utility scripts for invoking them.
See the post below for more standards that should be followed in bioinformatics software development.
Torsten Seemann: Minimum standards for bioinformatics command line tools