BinAC/Software/Bowtie
Description | Content |
---|---|
module load | bio/bowtie |
License | Free Software |
Citing | Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25. |
Links | Homepage | Manual |
Graphical Interface | No |
Description
Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
Versions and Availability
A list of versions currently available on all bwHPC-C5 clusters can be obtained from the Cluster Information System (CIS): https://cis-hpc.uni-konstanz.de/prod.cis/bwUniCluster/bio/bowtie
On the command line of any bwHPC cluster, you can list the available versions using
$ module avail bio/bowtie
License
Copyright 2014, Ben Langmead
Bowtie is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Bowtie is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with Bowtie. If not, see <http://www.gnu.org/licenses/>.
Usage
Loading the module
You can load the default version of Bowtie with the command
$ module load bio/bowtie
The module will try to load the modules it depends on (e.g. compiler/intel). If loading the module fails, check whether you have already loaded one of those modules in a version other than the one Bowtie needs. If you wish to load a specific (older) version, you can do so using e.g.
$ module load bio/bowtie/1.0.1
to load the version 1.0.1.
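After loading the module, it can be worth confirming that the three Bowtie binaries are actually on your PATH. The following is a minimal sketch, not part of the module itself; the function name check_bowtie_tools is a hypothetical helper introduced here for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not provided by the module): report which Bowtie
# binaries are visible on PATH. Returns the number of missing tools.
check_bowtie_tools() {
    local missing=0 tool
    for tool in bowtie bowtie-build bowtie-inspect; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool (did 'module load bio/bowtie' succeed?)"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

check_bowtie_tools || echo "Some tools are missing; 'module list' shows what is currently loaded."
```

If a tool is reported missing, the output of module list usually shows whether a conflicting module version is loaded.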
Program Binaries
$ bowtie
Bowtie is an ultrafast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. bowtie takes an index and a set of reads as input and outputs a list of alignments.
$ bowtie-build
bowtie-build builds a Bowtie index from a set of DNA sequences. bowtie-build outputs a set of 6 files with suffixes .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt, and .rev.2.ebwt. (If the total length of all the input sequences is greater than about 4 billion, then the index files will end in ebwtl instead of ebwt.) These files together constitute the index: they are all that is needed to align reads to that reference. The original sequence files are no longer used by Bowtie once the index is built.
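Because an index consists of six files, a job can fail late if one of them is missing or was truncated. The sketch below checks for all six suffixes listed above and accepts either the ebwt or the ebwtl variant. The function name bowtie_index_complete and the basename hg19.bowtie are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical helper: succeed if all six index files exist and are non-empty
# for the given basename. Accepts the small-index suffix (ebwt) or the
# large-index suffix (ebwtl), but not a mixture of both.
bowtie_index_complete() {
    local base=$1 suffix part ok
    for suffix in ebwt ebwtl; do
        ok=1
        for part in 1 2 3 4 rev.1 rev.2; do
            [ -s "${base}.${part}.${suffix}" ] || { ok=0; break; }
        done
        if [ "$ok" -eq 1 ]; then
            return 0
        fi
    done
    return 1
}

# Example call; "hg19.bowtie" is an assumed index basename.
if bowtie_index_complete "hg19.bowtie"; then
    echo "index is complete"
else
    echo "index is missing or incomplete"
fi
```

Running such a check at the start of a job script lets the job stop early instead of failing inside bowtie.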
$ bowtie-inspect
bowtie-inspect extracts information from a Bowtie index about what kind of index it is and what reference sequences were used to build it. When run without any options, the tool will output a FASTA file containing the sequences of the original references (with all non-A/C/G/T characters converted to Ns). It can also be used to extract just the reference sequence names using the -n/--names option or a more verbose summary using the -s/--summary option.
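The two options mentioned above can be combined in a small wrapper that stores the names and the summary of an index in text files. This is only a sketch: the function name inspect_index, the output file names, and the index basename hg19.bowtie are assumptions, and only the -n and -s options described above are used.

```shell
#!/bin/bash
# Hypothetical wrapper around bowtie-inspect.
# $1: index basename, $2: prefix for the output files (defaults to the
#     index name without its directory, so shared indices stay untouched).
inspect_index() {
    local index=$1
    local out=${2:-$(basename "$index")}
    if ! command -v bowtie-inspect >/dev/null 2>&1; then
        echo "bowtie-inspect not found; load the bio/bowtie module first" >&2
        return 1
    fi
    # -n prints only the reference sequence names.
    bowtie-inspect -n "$index" > "${out}.names.txt" || return 1
    # -s prints a more verbose summary of the index.
    bowtie-inspect -s "$index" > "${out}.summary.txt" || return 1
    echo "wrote ${out}.names.txt and ${out}.summary.txt"
}

# Example call with an assumed index basename.
inspect_index "hg19.bowtie" || echo "inspection skipped"
```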
Disk Usage
Scratch files are written to the current directory by default. Please change to a local directory before starting your calculations. For example
$ TMP_DIR=$TMP/$USER/job_sub_dir
$ mkdir -p $TMP_DIR
$ cd $TMP/$USER/job_sub_dir
Alternatively, you can run your calculations in a workspace on the parallel file system. This is especially useful when the input and output data for aligning sequences are large, or when you want to use your results for subsequent analysis.
$ WS_PATH=`ws_allocate bowtie_test 20`
$ cd ${WS_PATH}/
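For the local scratch variant, the setup can be made a little more robust by creating a unique directory and removing it automatically when the shell or job script exits. The following is a sketch under assumptions: the fallback to /tmp when $TMP is unset, the job_sub_dir.XXXXXX naming, and the cleanup function are choices made for this example, not cluster conventions.

```shell
#!/bin/bash
# Assumption: fall back to /tmp and the current user name when $TMP or
# $USER are not set by the environment.
SCRATCH_BASE="${TMP:-/tmp}/${USER:-$(id -un)}"
mkdir -p "$SCRATCH_BASE"

# Create a unique scratch directory so parallel jobs do not collide.
TMP_DIR=$(mktemp -d "$SCRATCH_BASE/job_sub_dir.XXXXXX")
START_DIR=$PWD

# Remove the scratch directory again; registered to run on exit.
cleanup() {
    cd "$START_DIR" || return
    rm -rf "$TMP_DIR"
}
trap cleanup EXIT

cd "$TMP_DIR" || exit 1
echo "working in $PWD"
```

Copy any results you want to keep out of the scratch directory before the script ends, since the trap deletes it.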
Bowtie-Indices
Please contact the HPC-Competence Center for Bioinformatics and Astrophysics via the bwSupport Portal if you need a Bowtie index permanently. The indices usually need a lot of disk space. Therefore it is better to make them available to users in a common location like ${DBDATA_BOWTIE_INDEX_DNA}.
Examples
Aligning
The following example shows you how to align simulated short reads against the human genome HG19:
$ msub -I -lnodes=1:ppn=2,walltime=00:00:30:00
$ HOME=`pwd`
$ TMP_DIR=$TMP/$USER/job_sub_dir
$ mkdir -p $TMP_DIR
$ cd $TMP/$USER/job_sub_dir
$ module load bio/bowtie/1.0.1
$ module load dbdata/homo_sapiens/hg19_ncbi
$ time bowtie -S -p ${MOAB_PROCCOUNT} \
    ${DBDATA_BOWTIE_INDEX_DNA} \
    ${BOWTIE_EXA_DIR}/hg19_sim.read1.fastq \
    bowtie.sam \
    &>statistics.txt &
$ wait    # let the background bowtie run finish before moving the files
$ mkdir -p $HOME/bowtie_test_results/
$ mv * $HOME/bowtie_test_results/
$ cd $HOME/bowtie_test_results/
$ rm -rfv $TMP_DIR/
Explanation of the parameters:
-S | Output will be written in SAM format |
-p | Number of cores used for the calculation; the value is taken from the MOAB_PROCCOUNT environment variable. This calculation runs on two cores since we requested them with -lnodes=1:ppn=2 |
${DBDATA_BOWTIE_INDEX_DNA} | Location of the bowtie index, in this case hg19 is used |
${BOWTIE_EXA_DIR}/hg19_sim.read1.fastq | Input file containing the short reads. In this example simulated short reads created with dwgsim 0.1.11 are used. |
bowtie.sam | Output file in SAM format named bowtie.sam |
&>statistics.txt | Statistics (standard output and standard error) are redirected into the file statistics.txt |
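Once the run has finished, a quick look at the SAM file can complement statistics.txt. The sketch below counts records by the unmapped bit (0x4) of the FLAG field in column 2, as defined in the SAM specification. The function name sam_alignment_counts is a hypothetical helper, and the counts equal read counts only if each read appears once in the file.

```shell
#!/bin/bash
# Hypothetical helper: count aligned and unaligned records in a SAM file.
# Header lines start with "@"; column 2 is the FLAG field, and bit 0x4
# marks a record as unmapped.
sam_alignment_counts() {
    awk -F '\t' '
        /^@/ { next }                                # skip header lines
        { if (int($2 / 4) % 2 == 1) unmapped++; else mapped++ }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    ' "$1"
}

# Example call on the output file from the alignment example, if present.
if [ -f bowtie.sam ]; then
    sam_alignment_counts bowtie.sam
fi
```

The integer division avoids bitwise functions, which are not available in every awk implementation.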
Indexing
The following script can be used to create a Bowtie index. However, please contact the HPC-Competence Center for Bioinformatics and Astrophysics (bwSupport Portal) if you need additional Bowtie indices that are not already located in $DBDATA_BOWTIE_INDEX_DNA/.
Content of the batch script create_bowtie_indices.moab
#!/bin/bash
#MSUB -l nodes=1:ppn=1
#MSUB -l walltime=01:00:00:00
#MSUB -m abe
##MSUB -M PUT_YOUR_EMAIL
#MSUB -l mem=20gb
module load bio/bowtie/1.0.1
cd $MOAB_SUBMITDIR/
# Build the index from hg19.fa; the index files are written as hg19.bowtie.*
time bowtie-build hg19.fa hg19.bowtie
More examples can be found in the $BOWTIE_EXA_DIR.
Version-Specific Information
For information specific to a single version, see the information available via the module system with the command
$ module help bio/bowtie