nf-core pipelines - NGS Analysis ToolKit

nf-core Pipelines for Variant Calling

Introduction

nf-core is a community-driven initiative that provides a curated set of Nextflow pipelines built for reproducibility and portability across a wide range of bioinformatics applications. For variant calling, it offers pipelines that are extensively tested, configurable, and straightforward to run. Variant calling is a critical step in genomic analysis: it identifies variations such as SNPs (single nucleotide polymorphisms) and indels (insertions and deletions) in DNA or RNA sequences.

Key Features of nf-core Pipelines for Variant Calling

Standardization: Pipelines adhere to best-practice guidelines and community standards.
Portability: Compatibility with multiple environments (local systems, HPC, cloud).
Reproducibility: Version-controlled workflows and containers ensure consistent results.
Customization: Configurable parameters to adapt workflows to specific datasets and research questions.

Commonly Used nf-core Pipelines for Variant Calling

1. `nf-core/sarek`

Purpose: Comprehensive pipeline for germline and somatic variant calling.
Supported Analysis:
- Preprocessing: alignment, base quality recalibration, and quality control.
- Variant calling with tools such as GATK HaplotypeCaller, Mutect2, Strelka2, and FreeBayes.
- Annotation and filtering of variants.
Highlights:
- Multimodal: Supports both WGS and WES data.
- Scalability: Handles both small-scale and large-scale datasets.

2. `nf-core/somaticseq`

Purpose: Focused on somatic mutation detection.
Key Features:
- Combines multiple variant callers for enhanced accuracy.
- Utilizes machine learning to improve sensitivity and specificity.

3. `nf-core/rnaseq`

Purpose: Primarily for RNA-seq data analysis (alignment, quantification, and QC); variant calling on transcriptome data is handled by the related `nf-core/rnavar` pipeline.
Highlights:
- High-quality alignment with STAR or HISAT2.
- RNA-seq variant calling with GATK HaplotypeCaller, available through `nf-core/rnavar`.

Getting Started

Installation:

Install Nextflow:
```
curl -s https://get.nextflow.io | bash
```
Pull the desired pipeline:
```
nextflow pull nf-core/<pipeline_name>
```
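
Before running a pipeline, it can help to confirm that the prerequisites are on your `PATH`. The following is a minimal sketch, not part of the nf-core tooling: `check_tool` is a hypothetical helper name, and the script assumes you intend to use Docker or Singularity as the container engine. Nextflow itself requires a Java runtime, and the installer above places a `nextflow` launcher in the current directory.

```shell
# Report whether a command is available on PATH (hypothetical helper).
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Java is required by Nextflow.
check_tool java || echo "  -> install a Java runtime before running Nextflow"

# The installer leaves a 'nextflow' launcher in the current directory.
check_tool nextflow || echo "  -> make 'nextflow' executable and move it to a directory on PATH"

# Assumption: a container engine is used with -profile docker or -profile singularity.
check_tool docker || check_tool singularity || echo "  -> no container engine detected"
```

If every tool is reported as found, `nextflow -version` should print the installed version.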

Execution:

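Run the pipeline with CLI arguments, optionally adding a configuration file. Recent nf-core pipelines also expect an output directory via `--outdir`:
```
nextflow run nf-core/sarek -profile <docker/singularity/conda> --input samplesheet.csv --outdir <output_dir> --genome GRCh38
```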
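
The `samplesheet.csv` passed to `--input` describes the samples to process. As a rough sketch, assuming the paired-end FASTQ layout used by recent `nf-core/sarek` releases (columns `patient`, `sample`, `lane`, `fastq_1`, `fastq_2`), a minimal samplesheet could be created as shown here. The IDs and file paths are hypothetical placeholders; check the pipeline's usage documentation for the columns your version expects.

```shell
# Write a minimal samplesheet for paired-end FASTQ input.
# Patient/sample IDs and file paths are hypothetical placeholders.
cat > samplesheet.csv <<'EOF'
patient,sample,lane,fastq_1,fastq_2
patient1,sample1,lane_1,/path/to/sample1_L001_R1.fastq.gz,/path/to/sample1_L001_R2.fastq.gz
patient1,sample1,lane_2,/path/to/sample1_L002_R1.fastq.gz,/path/to/sample1_L002_R2.fastq.gz
EOF

# Sanity check: every row must have as many comma-separated fields as the header.
awk -F',' '
    NR == 1 { n = NF }
    NF != n { bad = 1; print "line " NR ": expected " n " fields, got " NF }
    END     { exit bad + 0 }
' samplesheet.csv && echo "samplesheet.csv looks consistent"
```

The field-count check only catches structural mistakes such as a missing column; it does not verify that the FASTQ files exist.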

Configuration:
- Customize the workflow by overriding parameters on the command line (`--<param>`) or by supplying a custom configuration file with `-c`.
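
As an illustration of the custom configuration route, the sketch below writes a small Nextflow config file and assembles a matching run command. The file name `custom.config` and the parameter values are assumptions chosen for the example; `-c` is the Nextflow option for adding a configuration file on top of the pipeline defaults.

```shell
# Write a small custom configuration file (Nextflow config syntax).
# The values are illustrative assumptions; adjust them to your data.
cat > custom.config <<'EOF'
params {
    genome = 'GRCh38'
    outdir = 'results'
}
EOF

# Assemble the run command; -c layers custom.config over the pipeline defaults.
run_cmd="nextflow run nf-core/sarek -profile docker -c custom.config --input samplesheet.csv"

# Print the command for review instead of launching it immediately.
echo "$run_cmd"
```

Parameters given on the command line take precedence over those set in a config file, so a one-off change does not require editing `custom.config`.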

Best Practices

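Use a reference genome and annotation that match your data, and keep the same build (e.g., GRCh38 or hg19) throughout the analysis.
Follow the nf-core documentation to ensure proper usage of profiles (docker, singularity, etc.).
Review the MultiQC report, which summarizes QC metrics across samples.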

Conclusion

nf-core pipelines streamline the complex workflows of variant calling while supporting reproducibility, scalability, and reliability. They cover germline variants, somatic mutations, and RNA-seq-derived variants.

Resource 1: nf-core Website.
Resource 2: Nextflow Documentation.