SPAdes
SPAdes is an assembly toolkit containing various assembly pipelines.
Assembling a genome from Illumina paired-end reads using SPAdes
SPAdes can be used to assemble paired-end reads as follows:
$ spades -1 reads_1.fq.gz -2 reads_2.fq.gz -t 5 -m 200 -o results/directory/
In this command…
- -1 is the file with forward reads.
- -2 is the file with reverse reads.
- -t or --threads sets the number of processors/threads to use. The default is 16.
- -m or --memory is the memory limit in Gb. SPAdes terminates if it reaches this limit. The default value is 250Gb.
- -o or --outdir is the output directory to use. SPAdes requires this option.
SPAdes supports uncompressed (.fastq or .fq) or compressed (.fastq.gz or .fq.gz) sequencing read inputs. In the output directory, the assembled genome will be available as contigs (contigs.fasta) and scaffolds (scaffolds.fasta), both of which are FASTA nucleotide files.
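As a quick sanity check on the output files described above, the number of assembled sequences can be counted from the FASTA header lines. This is a minimal sketch, not part of SPAdes itself: the function name count_sequences and the outdir variable are illustrative, and the path assumes the output directory used in the example command.

```shell
# Count the sequences in a FASTA file by counting header lines,
# i.e. lines that start with ">".
count_sequences() {
    grep -c '^>' "$1"
}

# Assumed output directory: the value passed to -o in the example command.
outdir="results/directory"

# Report the number of sequences in each assembly file, if it exists.
for fasta in "$outdir/contigs.fasta" "$outdir/scaffolds.fasta"; do
    if [ -f "$fasta" ]; then
        echo "$fasta: $(count_sequences "$fasta") sequences"
    else
        echo "$fasta: not found"
    fi
done
```

A scaffolds file normally contains no more sequences than the contigs file, because scaffolding joins contigs together, so comparing the two counts gives a rough idea of how much joining took place.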
See also
- conda: The bioinfo-notebook conda environment includes SPAdes
- File formats used in bioinformatics