TABLE 3

Overview of genome characterization tools^a

| Category | Interface | Analysis tool (reference[s]) | Concept(s) | Input type(s) | Input format(s) | Output format(s) | Web address |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Identification | Web based | KmerFinder (121, 122) | Uses k-mers to identify strain using WGS data | Raw sequences, contigs | FASTQ, FASTA | Tab delimited, online | https://cge.cbs.dtu.dk/services/KmerFinder/ |
| Identification | Web based | NCBI BLAST^b (123) | NCBI Web-based interface for performing BLAST searches; searches for hits in the database that match the given sequence | Contigs | FASTA | Online, tab delimited | https://blast.ncbi.nlm.nih.gov/Blast.cgi |
| Identification | Web based | MLST Web server (125) | Web-based database that identifies STs from short sequencing reads or draft genomes | Raw sequences, contigs | FASTQ, FASTA | Online | https://cge.cbs.dtu.dk/services/MLST/ |
| Identification | Command line | PathoScope 2.0 (127) | Complete framework, based on a Bayesian missing-data approach, for direct strain identification | Raw sequences | FASTQ, FASTA | Tab delimited | https://sourceforge.net/p/pathoscope/wiki/Home/ |
| Annotation | Web based | RAST (129) | Web-based server for localization and identification of tRNA, rRNA, and coding sequences; includes a browser for screening the output | Contigs | FASTA | GenBank, EMBL, GFF3, GTF, Excel, and tab delimited | http://rast.nmpdr.org/ |
| Annotation | Command line | PROKKA (132) | Rapid annotation tool for localization and identification of rRNA, tRNA, tmRNA, signal peptides, noncoding RNA, and coding sequences | Contigs | FASTA | FASTA, tab delimited, SQN, GenBank file, GFF3 | http://www.vicbioinformatics.com/software.prokka.shtml |
| Virulence | Web based | VirulenceFinder | Detects virulence genes in WGS data using the BLAST algorithm | Raw sequences, contigs | FASTQ, FASTA | Tab-delimited summary, FASTA | https://cge.cbs.dtu.dk/services/VirulenceFinder/ |
| Virulence | Web based | VFDB (138) | Source of virulence information, including a Web-based service that uses BLAST to detect virulence genes | Contigs | FASTA | Online, tab delimited | http://www.mgc.ac.cn/VFs/ |
| Antimicrobial resistance | Web based | ResFinder | Detects resistance genes in WGS data | Raw sequences, contigs | FASTQ, FASTA | Tab-delimited summary, FASTA | https://cge.cbs.dtu.dk/services/ResFinder/ |
| Antimicrobial resistance | Web based | RGI/CARD (144–146) | Web-based and command line versions are available to perform resistance gene detection using the CARD database | Contigs, GenBank accession no. | FASTA, GenBank accession no. (nucleotide or protein) | JSON, tab-delimited summary, FASTA, heat map PDF | https://card.mcmaster.ca/analyze/rgi |
| Antimicrobial resistance | Web based | PlasmidFinder | Tool to detect plasmids in WGS data | Raw sequences, contigs | FASTQ, FASTA | Tab-delimited summary, FASTA | https://cge.cbs.dtu.dk/services/PlasmidFinder/ |
| Antimicrobial resistance | Web based | CGE BAP (107) | Web-based suite for automated genomic characterization; if raw sequence reads are provided, performs assembly; a set of tools is applied to the contigs: ResFinder, VirulenceFinder, and PlasmidFinder | Raw sequences, contigs | FASTQ, FASTA | Tab-delimited summaries, FASTA | https://cge.cbs.dtu.dk/services/cge/ |
^a ND, no data; NA, not applicable; EMBL, sequence file format; JSON, JavaScript Object Notation; SQN, GenBank submission file; GFF3, General Feature Format 3.

^b Also available as a command line tool and as a GUI via prfectBLAST (124).
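
Several of the identification tools listed in Table 3 rest on k-mer matching, i.e., comparing the short fixed-length substrings of a query sequence with those of reference genomes. The following is a minimal sketch of that general idea only; it is not the scoring scheme of KmerFinder or any other listed tool, and the function names (`kmers`, `best_match`), the toy sequences, and the choice of k = 4 are assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-mers (length-k substrings) in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(query, references, k=4):
    """Score each reference by the number of distinct k-mers it shares
    with the query and return (best reference name, all scores).

    Hypothetical helper: real tools use much longer k-mers, indexed
    databases, and more elaborate scoring.
    """
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(ref, k))
        for name, ref in references.items()
    }
    return max(scores, key=scores.get), scores


# Toy reference "genomes" (made-up sequences, for illustration only).
references = {
    "strain_A": "ACGTACGTGGCC",
    "strain_B": "TTTTGGGGCCCC",
}
best, scores = best_match("ACGTACGTGG", references)
print(best, scores)
```

In this toy case the query shares all of its distinct 4-mers with `strain_A` and none with `strain_B`, so `strain_A` is reported as the closest reference.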