Computational Workflows
Updated: April 14, 2022
Understanding Workflows
When performing bioinformatic analysis, scientists often need to run a series of interconnected computational transformations on raw input data. While it is possible to coordinate multiple tasks using Bash scripts or batch submission to a SLURM cluster, it is often far more convenient to use software specifically designed to coordinate these types of workflows. Examples of such workflow managers include Snakemake, Cromwell, Galaxy, and Nextflow.
At the moment, the documentation we provide focuses on Nextflow, a published workflow manager which has been adopted by many core facilities around the world and which underpins a catalog of community-supported bioinformatics workflows. As we continue to expand the documentation available for running workflows, we hope to add content supporting users of Snakemake and Cromwell.
We welcome the contribution of any additional workflow documentation, which can be added following these guidelines for contribution to the SciWiki.
For further details on the concepts of workflow managers, see Background on Workflows.
Getting Started
If you’re new to workflows, follow this guide to run your first workflow.
Workflow Configuration
A key advantage of using a workflow manager is that you can change how a workflow is run (on which computers, with which resources, on which files) without changing the workflow itself. This includes:
- Running a workflow on SLURM (gizmo)
- Running a workflow on AWS
- Setting up a run script
- Specifying computational resources
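As a rough illustration of keeping run configuration separate from the workflow, the minimal Python sketch below assembles two `nextflow run` commands that differ only in their configuration. The workflow name, config file names, parameter file, and S3 path are hypothetical placeholders, not values from this documentation; only the `-c`, `-params-file`, `-work-dir`, and `-resume` options of `nextflow run` are assumed.

```python
import shlex
from typing import List, Optional


def build_nextflow_command(
    workflow: str,
    config_file: str,
    params_file: Optional[str] = None,
    work_dir: Optional[str] = None,
    resume: bool = True,
) -> List[str]:
    """Assemble a `nextflow run` command as a list of arguments.

    The workflow argument stays the same across environments; only the
    configuration passed alongside it changes.
    """
    cmd = ["nextflow", "run", workflow, "-c", config_file]
    if params_file is not None:
        # Input files and workflow parameters live in a separate file
        cmd += ["-params-file", params_file]
    if work_dir is not None:
        # Location for intermediate files
        cmd += ["-work-dir", work_dir]
    if resume:
        # Reuse cached results from a previous run where possible
        cmd.append("-resume")
    return cmd


# All names below are hypothetical placeholders for illustration only.
WORKFLOW = "my-org/my-workflow"

# Same workflow, configured for a SLURM cluster
slurm_cmd = build_nextflow_command(
    WORKFLOW, "slurm.config", params_file="params.json"
)

# Same workflow, configured for AWS with a work directory in S3
aws_cmd = build_nextflow_command(
    WORKFLOW, "aws.config", params_file="params.json",
    work_dir="s3://my-bucket/work",
)

if __name__ == "__main__":
    print(shlex.join(slurm_cmd))
    print(shlex.join(aws_cmd))
```

In this sketch, switching between execution environments means swapping the config file and work directory, while the workflow argument is untouched. The actual configuration contents for SLURM or AWS are covered in the pages linked above and are not reproduced here.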
Workflow Catalog
While there are many workflows developed by researchers around the world, the Data Core is also working to maintain a catalog of workflows for Fred Hutch researchers. If you have any questions about using these workflows, or if you run into any issues, the Data Core can help with troubleshooting and enhancements as needed.
See the Workflow Catalog for a list of existing workflows you can run.
For a longer list of workflows developed by the worldwide community of Nextflow developers, visit the nf-core workflow catalog.
Many of the nf-core workflows use a common set of reference genomes, such as the iGenomes resource. For more convenient use of the nf-core workflows at Fred Hutch, a set of commonly used reference genomes, including both the iGenomes and the human STAR-Fusion alignment index, has been made available in a public volume on the shared filesystem (more details here).