Running nf-core pipelines#

What are nf-core pipelines?#

nf-core is an organisation backing an international effort to create high-quality, reproducible pipelines written in Nextflow.

Some examples of nf-core pipelines include:

nf-core/fetchngs: to download raw datasets from public repositories (ENA, SRA...)
nf-core/rnaseq: to align, quantify and quality-check RNA-Seq datasets (the gene counts it produces are the usual input for a differential expression analysis)
nf-core/ampliseq: to analyse metabarcoding (16S, ITS...) experiments (mostly based on QIIME 2)
nf-core/taxprofiler: to run multiple taxonomy profiling tools on a metagenomics dataset
nf-core/mag: to assemble and bin whole metagenome sequencing runs
See the full list online.
💡 See also Using Nextflow

How to run an nf-core pipeline?#

Very good documentation is available on the nf-core website, along with a great set of video tutorials.

The first run of any pipeline should use its test profile. With this profile the pipeline analyses a small test dataset that is known to work; once that run finishes successfully, we can move on to our own data.

The general syntax is:

nextflow run nf-core/<pipeline_name> -r <version> -profile test --outdir /shared/team/<output-dir>

Where:

<pipeline_name> is the name of the pipeline you want to run
<version> is the revision you want to use (pinning it is important for reproducibility; check the pipeline website for the latest version)
<output-dir> is the directory where Nextflow will save the output files. NOTE: your home directory will not work!

For example, to test the rnaseq pipeline:

nextflow run nf-core/rnaseq -r 3.14.0 -profile test --outdir /shared/team/test-out-rnaseq
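The placeholders in the general syntax can also be filled in from shell variables, which makes it easier to review the command before launching it. The following is a minimal sketch, not an official nf-core helper: the variable names (pipeline, version, outdir, outdir_ok, cmd) are hypothetical, and the home-directory check simply encodes the note above.

```shell
# Values for the placeholders in the general syntax (taken from the rnaseq example)
pipeline="rnaseq"
version="3.14.0"
outdir="/shared/team/test-out-${pipeline}"

# The output directory must not be inside the home directory
outdir_ok="yes"
case "$outdir" in
  "${HOME:-/home}"/*) outdir_ok="no" ;;
esac

# Assemble the command as a string so it can be inspected first
cmd="nextflow run nf-core/${pipeline} -r ${version} -profile test --outdir ${outdir}"

if [ "$outdir_ok" = "yes" ]; then
  echo "$cmd"    # review the command, then paste it into the terminal to run it
else
  echo "Choose an output directory outside your home directory" >&2
fi
```

Only the three variables at the top need to change to test a different pipeline or revision.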

An example: fetchngs#

nf-core/fetchngs is a pipeline to download sets of NGS data from public repositories such as the NCBI Sequence Read Archive (SRA).

We can use it as a first example pipeline as its input is a simple text file with a list of accession codes.

Remembering that Nextflow pipelines will not have access to any file saved in your home directory, we can create an input file like:

mkdir -p /shared/team/download-lists/
echo -e "ERR12319563\nERR12319484\nERR12319547" > /shared/team/download-lists/test.csv


The echo command creates a list of three accession numbers directly from the command line, but you can also use the handy text editor built into the CLIMB notebook to create the file. Either way, it is important to use the csv extension.
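For longer lists, a here-document is easier to read than a single echo, and a quick check can catch typos before the pipeline starts. This is a sketch under stated assumptions: LIST_DIR is a hypothetical variable (on CLIMB you would set it to /shared/team/download-lists), and the pattern only recognises run accessions of the ERR/SRR/DRR form used in this example; fetchngs may accept other identifier types that this check would flag.

```shell
# Directory for the list. On CLIMB, run: export LIST_DIR=/shared/team/download-lists
# The fallback is a throwaway temporary directory so the sketch runs anywhere.
list_dir="${LIST_DIR:-$(mktemp -d)}"
mkdir -p "$list_dir"
list="${list_dir}/test.csv"

# One accession per line, same three accessions as in the echo example
cat > "$list" <<'EOF'
ERR12319563
ERR12319484
ERR12319547
EOF

# Count all lines, and lines that do not look like a run accession
n_total=$(grep -c '' "$list")
n_bad=$(grep -vcE '^[EDS]RR[0-9]+$' "$list" || true)

echo "Accessions listed: ${n_total}"
echo "Lines not matching the expected pattern: ${n_bad}"
```

If the second number is not zero, open the file and look for blank lines, stray spaces or mistyped accessions.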

# The \ at the end of a line lets you split a command across multiple lines
# If you type the command on a single line, do NOT type the "\"s

nextflow run nf-core/fetchngs -r 1.12.0 \
   --input /shared/team/download-lists/test.csv \
   --outdir /shared/team/fetchngs-out/

Example execution:

[Screenshot: nf-core fetchngs execution]

S3 buckets#

A very handy feature of Nextflow is that it can read from and write to S3 buckets.

If we want to save the output of the nf-core/fetchngs pipeline to a CLIMB S3 bucket (suppose you have a bucket called "ngs-files"), we can simply change the output path to something like:

# The \ at the end of a line lets you split a command across multiple lines
# If you type the command on a single line, do NOT type the "\"s

nextflow run nf-core/fetchngs -r 1.12.0 \
   --input /shared/team/download-lists/test.csv \
   --outdir s3://ngs-files/fetchngs-output/
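Because only the --outdir value changes between the local and the S3 run, the destination can be kept in a single variable. The sketch below is illustrative only: the variable names (bucket, prefix, outdir, target, cmd) are hypothetical, and the bucket name is the "ngs-files" example used above.

```shell
# Bucket and folder from the example above; change these to match your own bucket
bucket="ngs-files"
prefix="fetchngs-output"
outdir="s3://${bucket}/${prefix}/"

# Report what kind of destination the path points to
case "$outdir" in
  s3://*) target="S3 bucket" ;;
  /*)     target="local path" ;;
  *)      target="relative path" ;;
esac

cmd="nextflow run nf-core/fetchngs -r 1.12.0 --input /shared/team/download-lists/test.csv --outdir ${outdir}"

echo "Output target: ${target}"
echo "$cmd"    # review the command, then paste it into the terminal to run it
```

Setting outdir to a /shared/team/ path instead gives the local variant of the same command.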