Tools


Choosing the right tools for next-generation sequencing (NGS) analysis involves several key considerations:

  1. Define Your Goals: Clearly outline your objectives (e.g., variant calling, transcriptome analysis, metagenomics) to narrow down suitable tools.

  2. Data Type: Consider the type of NGS data you have (e.g., DNA-seq, RNA-seq, ChIP-seq) as tools are often optimized for specific data types.

  3. Scalability: Ensure the tools can handle your data size. Some tools are better suited for large datasets or high-throughput analyses.

  4. Accuracy and Reliability: Look for tools with proven performance. Check published benchmarks or validation studies before adopting a tool.

  5. User-Friendly Interface: If you’re not very experienced, opt for tools with graphical user interfaces (GUIs) or comprehensive documentation.

  6. Integration with Other Tools: Consider tools that can integrate well with existing pipelines or workflows, particularly in a bioinformatics framework.

  7. Community and Support: Active user communities and available support can be invaluable for troubleshooting and learning.

  8. Cost and Licensing: Assess whether the tools are open-source or require licensing fees, as budget constraints can influence your choice.

  9. Computational Requirements: Evaluate the hardware requirements and whether you have the necessary computational resources available.

  10. Reproducibility: Opt for tools that support reproducible research practices, allowing for easier sharing and validation of results.

Single Nucleotide Variant Calling

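| Name | Published | Control Needed / Indel Detection / Contamination Correction | Ref |
| --- | --- | --- | --- |
| Varscan2 | 2012 | ++ | 1 |
| MuTect2 * | 2013 | ++ | 2 |
| FreeBayes | 2012 | + | 3 |
| Strelka * | 2012 | ++ | 4 |
| Platypus * | 2014 | + | 5 |
| SomaticSniper * | 2012 | + | 6 |
| LoFreq * | 2012 | ++ | 7 |
| VarDict * | 2016 | + | 8 |
| JointSNVMix * | 2012 | + | 9 |
| MutationSeq * | 2012 | + | 10 |
| EBCall * | 2013 | ++ | 11 |
| MuSE * | 2016 | ++ | 12 |
| RADIA | 2014 | ++ | 13 |
| Virmid | 2013 | ++ | 14 |
| deepSNV * | 2014 | + | 15 |
| Shimmer * | 2013 | ++ | 16 |
| qSNP * | 2013 | ++ | 17 |
| BAYSIC | 2014 | + | 18 |
| SomaticSeq * | 2015 | ++ | 19 |
| CaVEMan * | 2016 | ++ | 20 |
| SNooPer * | 2016 | ++ | 21 |
| SNVSniffer * | 2016 | + | 22 |
| HapMuC | 2014 | + | 23 |
| FaSD-somatic | 2014 | | 24 |
| LocHap * | 2016 | +++ | 25 |
| LoLoPicker * | 2017 | ++ | 26 |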
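
To illustrate what "Control Needed" means in practice, the sketch below compares alternate-allele read counts in a tumor sample against a matched normal control using a one-sided Fisher's exact test. It is a deliberately naive illustration and not the algorithm of any caller listed above; the function names, the significance level `alpha`, and the `max_normal_vaf` threshold are all assumptions chosen for the example.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability of seeing `a` or more in the top-left cell
    when the row and column totals are held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)
    upper = min(row1, col1)
    # Sum hypergeometric terms for every table at least as extreme as observed.
    numerator = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return numerator / denominator


def classify_site(tumor_alt, tumor_ref, normal_alt, normal_ref,
                  alpha=0.05, max_normal_vaf=0.02):
    """Label one site from alt/ref read counts (thresholds are illustrative)."""
    if tumor_alt == 0:
        return "reference"
    p_value = fisher_exact_greater(tumor_alt, tumor_ref, normal_alt, normal_ref)
    normal_depth = normal_alt + normal_ref
    normal_vaf = normal_alt / normal_depth if normal_depth else 0.0
    # Somatic: alt reads enriched in the tumor and (nearly) absent from the control.
    if p_value < alpha and normal_vaf <= max_normal_vaf:
        return "somatic"
    # Alt reads also present in the control suggest a germline or shared variant.
    if normal_alt > 0:
        return "germline_or_shared"
    return "uncertain"
```

For example, `classify_site(12, 28, 0, 40)` labels a site with 12 of 40 alternate reads in the tumor and none in the control as `"somatic"`. Production callers additionally model sequencing error, base and mapping quality, and sample contamination.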

Copy Number Variations

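| Name | Published | Control Needed / Contamination Correction / GC-Content Correction | Ref |
| --- | --- | --- | --- |
| Varscan2 | 2012 | + | 1 |
| CNVnator | 2011 | ++ | 2 |
| CNV-Seq | 2009 | + | 3 |
| CoNIFER | 2012 | + | 4 |
| Control-FREEC | 2012 | ++ | 5 |
| ExomeCNV | 2011 | ++ | 6 |
| XHMM | 2012 | ++ | 7 |
| ExomeDepth | 2012 | ++ | 8 |
| cn.MOPS | 2012 | ++ | 9 |
| CNVkit | 2016 | +++ | 10 |
| CONTRA | 2012 | + | 12 |
| Sequenza | 2015 | ++ | 13 |
| EXCAVATOR | 2013 | +++ | 14 |
| CODEX | 2015 | ++ | 16 |
| ADTEx | 2014 | ++ | 17 |
| Seqgene | 2011 | + | 18 |
| FishingCNV | 2013 | | 19 |
| HMZDelFinder | 2017 | | 20 |
| ExoCNVTest | 2012 | + | 21 |
| CLAMMS | 2016 | + | 22 |
| falcon | 2015 | ++ | 23 |
| saasCNV | 2015 | ++ | 24 |
| WISExome | 2017 | | 25 |
| GATK | | | |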
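
As a rough illustration of two columns in this table, the sketch below shows one generic way GC-content correction and a control sample can enter a read-depth analysis: window depths are rescaled so that every GC bin shares the same median, and corrected test depths are then compared with control depths as log2 ratios. This is a simplified sketch under stated assumptions, not the method of any tool listed here; the function names, the `bin_width` of 0.05, and the `pseudocount` are hypothetical choices.

```python
from collections import defaultdict
from math import log2
from statistics import median


def gc_fraction(seq):
    """Fraction of G/C bases among the unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def gc_correct(depths, gc_values, bin_width=0.05):
    """Rescale window depths so every GC bin has the same median depth."""
    overall = median(depths)
    by_bin = defaultdict(list)
    for depth, gc in zip(depths, gc_values):
        by_bin[int(gc / bin_width)].append(depth)
    bin_median = {b: median(values) for b, values in by_bin.items()}
    corrected = []
    for depth, gc in zip(depths, gc_values):
        m = bin_median[int(gc / bin_width)]
        # Leave the depth untouched if the bin median is zero.
        corrected.append(depth * overall / m if m > 0 else depth)
    return corrected


def log2_ratios(test_depths, control_depths, pseudocount=1.0):
    """Per-window log2(test / control); the pseudocount avoids division by zero."""
    return [
        log2((t + pseudocount) / (c + pseudocount))
        for t, c in zip(test_depths, control_depths)
    ]
```

A log2 ratio near 0 suggests no copy-number change in that window, while a value near -1 is what a single-copy loss in a diploid region would ideally produce. Real tools add segmentation, purity and ploidy estimation, and noise modeling on top of this.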

Structural Variations

| Name | Description | Ref |
| --- | --- | --- |
| Manta | Manta is a structural variant caller developed by Illumina. It detects various types of SVs, including deletions, duplications, inversions, and translocations. Works with whole-genome and exome sequencing data. | 1 |
| Delly | Delly is a versatile SV detection tool that can identify deletions, duplications, inversions, translocations, and more. It works with various sequencing data types, including exome data. | 2 |
| Lumpy | Lumpy focuses on identifying interchromosomal translocations and intrachromosomal rearrangements. It can be adapted for use with exome data. | 3 |
| BreakDancer | Pindel detects breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications, and other structural variants. Suitable for both whole-genome and exome data. | 4 |
| GRIDSS | GRIDSS is a versatile SV caller that can detect complex structural variants by combining multiple evidence types. Suitable for various sequencing data types, including exome data. | 5 |
| CNVkit | CNVkit is primarily for copy number variation detection but can also identify large-scale structural variations from exome data by analyzing read depth. | 6 |
| TIDDIT | TIDDIT is designed for identifying tandem duplications and can be used with exome sequencing data to detect this specific type of structural variation. | 7 |

Splice Site Detection

| Name | Description | Ref |
| --- | --- | --- |
| SpliceAI | SpliceAI is a deep learning-based tool that predicts the effect of variants on splicing, providing information about splice site alterations. | 1 |
| Human Splicing Finder (HSF) | HSF is a web-based tool that predicts the potential impact of variants on splicing, analyzing consensus splice site sequences for potential disruptions. | 2 |
| MMSplice | MMSplice is a machine learning-based tool that predicts the impact of variants on alternative splicing events, providing a score for splicing disruption likelihood. | 3 |
| SplicePort | SplicePort is a web-based tool that predicts the potential impact of variants on splice site creation or disruption, considering both donor and acceptor splice sites. | 4 |

Annotation

| Name | Description | Ref |
| --- | --- | --- |
| ANNOVAR | ANNOVAR is a versatile tool for annotating genetic variants, providing information on variant function, population allele frequencies, and predicted functional consequences. | 1 |
| SnpEff | SnpEff annotates genetic variants, categorizing them based on functional impact and providing annotations on genes and transcripts. | 2 |
| dbNSFP | The Database for Non-Synonymous SNPs' Functional Predictions (dbNSFP) provides functional predictions for non-synonymous variants in exomes, including predictions from multiple tools. | 3 |
| Exomiser | Exomiser prioritizes and annotates variants in exome data for rare disease research, integrating variant data with various databases and gene-phenotype information. | 4 |
| VCFanno | VCFanno is a flexible tool for annotating VCF files, allowing customization of annotation sources and rules. | 5 |
| VEP | VEP is a powerful tool from Ensembl for annotating genetic variants, offering insights into functional effects, gene impacts, regulatory regions, and population allele frequencies. | 6 |
| VariantDB | VariantDB is a platform for variant annotation and interpretation, providing various annotation sources and custom annotation tracks. | 7 |
| gnomAD | The Genome Aggregation Database (gnomAD) offers access to allele frequencies of genetic variants in large populations, useful for annotating exome variants. | 8 |
| ClinVar | ClinVar is a database of clinically relevant variants, providing annotations related to clinical significance and associations with diseases. | 9 |
| OMIM | OMIM is a comprehensive database of genetic disorders and associated genes, useful for annotating exome variants with disease-related information. | 10 |
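
Population allele frequencies are commonly used to drop common variants once a VCF has been annotated. The sketch below shows the idea on plain VCF text lines using only the Python standard library. The INFO key `gnomAD_AF` and the 1% cutoff are assumptions made for illustration; the actual key name depends on the annotation tool and its configuration, so check the header of your own annotated file.

```python
def parse_info(info):
    """Parse a VCF INFO column into a dict; flag-style keys map to True."""
    fields = {}
    if info == ".":
        return fields
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def max_allele_frequency(info, af_key):
    """Highest frequency across alternate alleles, or None if not annotated."""
    raw = parse_info(info).get(af_key)
    if raw is None or raw is True:
        return None
    values = [float(v) for v in raw.split(",") if v not in (".", "")]
    return max(values) if values else None


def filter_rare(vcf_lines, af_key="gnomAD_AF", max_af=0.01):
    """Keep header lines and records whose frequency is missing or <= max_af."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        columns = line.rstrip("\n").split("\t")
        af = max_allele_frequency(columns[7], af_key)  # INFO is the 8th column
        # Variants absent from the frequency annotation are treated as rare.
        if af is None or af <= max_af:
            kept.append(line)
    return kept
```

Treating unannotated variants as rare is a design choice that favors sensitivity; a stricter pipeline might route them to a separate review list instead.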