TABLE 2

Performance analysis of assembly tools^a

| Analysis tool (reference[s]) | Concept | Computational requirement | Speed | Assembly quality | Preferred sequencing technology(ies) | Web address(es) | Input format | Output format(s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Web based | | | | | | | | |
| Velvet (103, 126) | de Bruijn graph-based assembly that resolves repeat-rich regions; can be used for de novo or reference-guided assembly; requires paired reads with 20- to 25-fold coverage | Mid* | Medium* | Low* | Illumina | https://cge.cbs.dtu.dk/services/Assembler/ | FASTA, FASTQ, SAM, or BAM | AMOS, modified FASTA |
| SPAdes/hybridSPAdes (112) | de Bruijn graph-based assembler for de novo assembly of short and long reads | Low** | Low** | Mid*/** | Mixed input (Illumina, Ion Torrent, PacBio CLR, Oxford Nanopore) | https://cge.cbs.dtu.dk/services/SPAdes/ | FASTA, FASTQ, or BAM | FASTA, FASTQ, FASTG |
| Command line | | | | | | | | |
| IDBA-UD (108) | de Bruijn graph-based assembly designed for assembly of repeat-rich reads of various sequencing depths | Low* | Medium* | Mid* | Illumina | http://i.cs.hku.hk/~alse/hkubrg/projects/idba_ud/ | FASTA | FASTA |
| RAY (96) | de Bruijn graph-based assembly that uses seeds instead of Eulerian walks; used for de novo assembly; designed for short reads | Low*** | Fast*** | Low*** | Mixed input (454, Illumina, Ion Torrent) | http://denovoassembler.sourceforge.net/ | FASTA, FASTQ, or SFF | FASTA, TXT |
| Minimap/miniasm (116) | OLC framework that computes overlaps and performs read trims and unitig construction; can be used for de novo or reference-guided assembly | Low** | High** | High*/** | PacBio, Oxford Nanopore | https://github.com/lh3/minimap, https://github.com/lh3/miniasm | FASTA | GFA, PAF |
| Canu (118) | OLC framework that computes overlaps and performs read correction, read trims, and unitig construction; used for de novo assembly | Mid** | Low** | High*/** | PacBio, Oxford Nanopore | https://github.com/marbl/canu | FASTA or FASTQ | FASTA |
^a All quantitative performance measures were taken from data reported previously, as indicated. CLR, continuous long reads; GFA, graphical fragment assembly; PAF, pairwise mapping format; SFF, standard flowgram format (454 data format); *, E. coli K-12 MG1655 data set (110); **, Enterobacter kobei data set (233); ***, Illumina data from E. coli (SRA accession number SRX000429) (234). Note that for SPAdes, only the nonhybrid tool is accessible as a Web-based tool.