Fasterq-dump
fasterq-dump is a tool for downloading sequencing reads from NCBI’s Sequence Read Archive (SRA). These sequence reads will be downloaded as FASTQ files. fasterq-dump is a newer, streamlined alternative to fastq-dump; both of these programs are a part of sra-tools.
fasterq-dump vs fastq-dump
Here are a few of the differences between fastq-dump and fasterq-dump:
- In fastq-dump, the flag --split-3 is required to separate paired reads into left and right ends. This is the default setting in fasterq-dump.
- The fastq-dump flag --skip-technical is no longer required to skip technical reads in fasterq-dump. Instead, the flag --include-technical is required to include technical reads when using fasterq-dump.
- There is no --gzip or --bzip2 flag in fasterq-dump for downloading compressed reads. However, FASTQ files downloaded using fasterq-dump can still be compressed afterwards.
The following commands are equivalent, but will be executed faster using fasterq-dump:
$ fastq-dump SRR_ID --split-3 --skip-technical
$ fasterq-dump SRR_ID
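Because fasterq-dump has no compression flag, compression has to be a separate step after the download. The sketch below shows one way to do it with gzip; the helper name compress_fastq is hypothetical, and it assumes the downloaded files end in .fastq and sit in a single directory.

```shell
# Hypothetical helper: gzip every .fastq file in a directory
# (defaults to the current directory).
compress_fastq() {
    dir="${1:-.}"
    for fq in "$dir"/*.fastq; do
        # Skip the unexpanded pattern when no FASTQ files are present.
        [ -e "$fq" ] || continue
        # -f overwrites an existing .gz file of the same name.
        gzip -f "$fq"
    done
}

# Typical use, assuming sra-tools is installed and SRR_ID is replaced
# with a real run accession:
#   fasterq-dump SRR_ID
#   compress_fastq .
```

Each FASTQ file is replaced by a .fastq.gz file of the same name. Where a parallel compressor such as pigz is available, it could be substituted for gzip on large files.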
Downloading reads from the SRA using fasterq-dump
In this example, we want to download FASTQ reads for a mate-pair library.
$ fasterq-dump --threads n --progress SRR_ID
In this command…
- --threads specifies the number (n) of processors/threads to be used.
- --progress is an optional argument that displays a progress bar while the reads are being downloaded.
- SRR_ID is the ID of the run from the SRA to be downloaded. This ID begins with “SRR” and is followed by around seven digits (e.g. SRR1234567).
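As a rough illustration, the command can be wrapped in a small shell function that checks the run ID before starting the download. The function names is_srr_accession and download_run and the default of 4 threads are assumptions made for this sketch; the check only covers IDs of the form described above (SRR followed by digits).

```shell
# Hypothetical check: succeed only for "SRR" followed by one or more digits.
is_srr_accession() {
    case "$1" in
        SRR*[!0-9]*|SRR) return 1 ;;  # non-digit after the prefix, or no digits at all
        SRR[0-9]*)       return 0 ;;
        *)               return 1 ;;  # does not start with SRR
    esac
}

# Hypothetical wrapper: download one run with a progress bar.
# Usage: download_run SRR_ID [threads]
download_run() {
    run_id="$1"
    threads="${2:-4}"   # assumed default; adjust to the machine
    if ! is_srr_accession "$run_id"; then
        echo "not an SRR run accession: $run_id" >&2
        return 1
    fi
    fasterq-dump --threads "$threads" --progress "$run_id"
}
```

Calling download_run with a mistyped ID such as SRA1234567 returns an error immediately instead of sending the request to the SRA.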
Demonstration
In this video, fasterq-dump is used to download Saccharomyces cerevisiae RNAseq reads from the SRA.