To start using the h3achipimputation pipeline, follow the steps below:
Nextflow runs on most POSIX systems (Linux, macOS, etc.). It can be installed by running the following commands:
```bash
# Make sure that Java v8+ is installed:
java -version

# Install Nextflow
curl -fsSL get.nextflow.io | bash

# Add the Nextflow binary to your PATH:
mkdir -p ~/bin
mv nextflow ~/bin/
# OR system-wide installation:
# sudo mv nextflow /usr/local/bin
```
See nextflow.io for further instructions on how to install and configure Nextflow.
This pipeline itself needs no installation: Nextflow will automatically fetch it from GitHub if `h3abionet/chipimputation` is specified as the pipeline name.
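For example, you can pre-fetch the pipeline code and then launch it directly by name (the `-profile docker` choice here is illustrative; profiles are covered in the configuration notes below):

```bash
# Fetch (or update) the pipeline code from GitHub without running it
nextflow pull h3abionet/chipimputation

# Launch the pipeline by name; -profile docker is one of several options (see below)
nextflow run h3abionet/chipimputation -profile docker
```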
The above method requires an internet connection so that Nextflow can download the pipeline files. If you're running on a system that has no internet connection, you'll need to download and transfer the pipeline files manually:
```bash
wget https://github.com/h3abionet/chipimputation/archive/master.zip
mkdir -p ~/my-pipelines/h3abionet/
unzip master.zip -d ~/my-pipelines/h3abionet/
cd ~/my_data/
nextflow run ~/my-pipelines/h3abionet/chipimputation-master
```
To stop Nextflow from looking for updates online, you can tell it to run in offline mode by adding the following environment variable to your `~/.bashrc` file:

```bash
export NXF_OFFLINE='TRUE'
```
If you would like to make changes to the pipeline, it's best to make a fork on GitHub and then clone the files. Once cloned, you can run the pipeline directly as above.
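A minimal sketch of that workflow (replace `<your-username>` with your GitHub account; `main.nf` is assumed to be the entry script, as is conventional for Nextflow pipelines):

```bash
# Clone your fork (hypothetical username placeholder)
git clone https://github.com/<your-username>/chipimputation.git
cd chipimputation

# Run your local copy of the pipeline
nextflow run main.nf
```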
By default, the pipeline runs with the `standard` configuration profile. This uses a number of sensible defaults for process requirements and is suitable for running on a simple (if powerful!) basic server. You can see this configuration in `conf/base.config`.
Be warned of two important points about this default configuration:
- The default profile uses the `local` executor, so all jobs are run in the login session. If you're using a simple server, this may be fine. If you're using a compute cluster, this is bad as all jobs will run on the head node.
  - See the Nextflow docs for information about running with other hardware backends; most job scheduler systems are natively supported. One way to switch executor is sketched after this list.
- Nextflow will expect all software to be installed and available on the `PATH`, unless Docker or Singularity is used.
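As a sketch of switching to a cluster executor, you could write a small custom config file and pass it with `-c` (the file name `custom.config`, the SLURM executor, and the queue name are all assumptions; adapt them to your scheduler):

```bash
# Write a custom config (hypothetical file name) that sends jobs to SLURM
cat > custom.config <<'EOF'
process {
    executor = 'slurm'
    queue    = 'batch'   // assumed queue name; use your cluster's queue
}
EOF

# Pass the custom config at run time
nextflow run h3abionet/chipimputation -c custom.config
```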
First, install Docker on your system by following the Docker Installation Instructions.
Then, running the pipeline with the option `-profile docker` tells Nextflow to enable Docker for this run. An image containing all of the software requirements will be automatically fetched and used from quay.io (https://quay.io/h3abionet_org/imputation_tools).
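For example, to launch the pipeline with Docker enabled:

```bash
nextflow run h3abionet/chipimputation -profile docker
```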
If you're not able to use Docker, then Singularity is a great alternative. The process is very similar: running the pipeline with the option `-profile singularity` tells Nextflow to enable Singularity for this run. An image containing all of the software requirements will be automatically fetched and used from quay.io (`quay.io/h3abionet_org/imputation_tools`).
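For example, to launch the pipeline with Singularity enabled:

```bash
nextflow run h3abionet/chipimputation -profile singularity
```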
If running offline with Singularity, you'll need to download and transfer the Singularity image first:
```bash
singularity pull --name h3abionet-chipimputation-tools.simg docker://quay.io/h3abionet_org/imputation_tools
```
Once transferred, use `-with-singularity` and specify the path to the image file:

```bash
nextflow run h3abionet/chipimputation -with-singularity h3abionet-chipimputation-tools.simg
```
Remember to pull updated versions of the Singularity image if you update the pipeline.
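A sketch of doing both together (the image name mirrors the pull command above; `nextflow pull` is the standard way to update a remote pipeline):

```bash
# Update the pipeline code to the latest version
nextflow pull h3abionet/chipimputation

# Remove the old image so the pull doesn't refuse to overwrite it, then re-pull
rm -f h3abionet-chipimputation-tools.simg
singularity pull --name h3abionet-chipimputation-tools.simg docker://quay.io/h3abionet_org/imputation_tools
```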