Workflow Catalog

Hutch Data Core

Updated: September 9, 2022


The Hutch Data Core supports researchers by providing access to a set of curated bioinformatics workflows that automate commonly needed tasks.

We are in the process of expanding this catalog of workflows, and we are actively soliciting input from researchers who have an interest in adding workflow automation to their ongoing projects.

Because of our extensive work with the Microbiome Research Initiative, many of the workflows listed below are designed for the analysis of microbial sequence data. However, the goal of this catalog is to be comprehensive and not specific to any single scientific domain.

If you would like to see a workflow that meets your needs added to this catalog, please don’t hesitate to get in touch.

Workflow Support

By including a workflow in this catalog, the Data Core commits to helping individual researchers as they implement the workflow on their own datasets. This may include consulting on the proper use of a workflow, helping to set up an initial test run, and troubleshooting errors that come up. By creating a catalog of well-supported workflows, we hope to streamline the data analysis process for future researchers, who will be able to take advantage of the enhanced documentation and debugging that result from our efforts.

Please reach out if you have questions about the use of any workflow listed below, or would like to see an additional use case supported.

Workflows

Each of the workflows below has a short description and a link to its repository, which should contain documentation on how to run the workflow. If any of these materials are insufficient, we would appreciate your feedback and are eager to improve them.

RNAseq - Gene Expression Analysis

Purpose: Quantify the degree of gene expression from RNA sequencing datasets

Homepage: https://github.com/nf-core/rnaseq
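As an illustration of what an initial test run might look like for this workflow, the sketch below assembles a Nextflow command line for nf-core/rnaseq using the small `test` profile that nf-core pipelines ship with. It assumes Nextflow and Docker are already installed; the helper name `build_test_run_command` and the `results` output directory are hypothetical choices made for this example. The script only prints the command and does not execute it. Other workflows in this catalog may expect different parameters, so the documentation in each repository remains the reference.

```python
import shlex


def build_test_run_command(pipeline="nf-core/rnaseq",
                           outdir="results",
                           container="docker"):
    """Return the argument list for a small Nextflow test run.

    Assumptions (not taken from the catalog itself):
      - the pipeline provides a `test` profile, as nf-core pipelines do
      - `container` names a container profile available on your system
      - `outdir` is an arbitrary output directory chosen for this example
    """
    return [
        "nextflow", "run", pipeline,
        "-profile", f"test,{container}",
        "--outdir", outdir,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run by hand.
    print(shlex.join(build_test_run_command()))
```

Building the command as a list rather than a single string keeps each argument separate, which makes it straightforward to pass to `subprocess.run` later if you decide to launch the run from Python.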

Differential Expression Analysis

Purpose: Identify the genes which are differentially expressed between treatment and control groups

Homepage: https://github.com/FredHutch/pw-differential-expression

Gene Set Enrichment Analysis

Purpose: Identify the gene sets which are enriched in a collection of differentially expressed genes

Homepage: https://github.com/FredHutch/pw-gene-set-enrichment

Bulk RNA Immune Clonotypes

Purpose: Identify the immune clonotypes present in bulk RNA sequencing data

Homepage: https://github.com/FredHutch/pw-bulk-rna-immune-clonotypes

Bulk RNA Lymphocyte Abundance Estimation

Purpose: Estimate the proportion of lymphocyte cells from bulk RNA gene expression data

Homepage: https://github.com/FredHutch/pw-bulk-rna-lymphocytes

Microbial Taxonomic Classification (metaphlan2)

Purpose: Estimate the abundance of microbial taxa from whole-genome shotgun (WGS) sequencing data

Homepage: https://github.com/FredHutch/metaphlan-nf

CRISPR Screen Analysis

Purpose: Identify genes which are enriched or depleted in CRISPR screen experiments

Homepage: https://github.com/FredHutch/crispr-screen-nf

PacBio CCS Analysis

Purpose: Extract circular consensus sequences (CCS) from continuous long read (CLR) data generated by the PacBio Sequel instrument

Homepage: https://github.com/FredHutch/ccs-nf

CellProfiler Batch Analysis

Purpose: Process images in parallel through a CellProfiler pipeline and combine the results

Homepage: https://github.com/FredHutch/cellprofiler-batch-nf

Gene-Level Metagenomic Analysis (geneshot)

Purpose: Perform gene-level metagenomic analysis of WGS datasets for strain-level microbiome analysis

Homepage: https://github.com/Golob-Minot/geneshot

FASTQ Quality Assessment

Purpose: Generate a QC summary over a collection of FASTQ input files

Homepage: https://github.com/FredHutch/multi-fastqc-nf

Microbial Operon Identification

Purpose: Identify genomes which contain a set of genes in close proximity

Homepage: https://github.com/FredHutch/octapus

Microbial Pan-Genome Analysis (anvi’o)

Purpose: Analyze a set of microbial genomes with the anvi’o pangenome pipeline

Homepage: https://github.com/FredHutch/nf-anvio-pangenome

Microbial Genome Annotation (PGAP)

Purpose: Annotate a set of bacterial genomes using the PGAP annotation software suite

Homepage: https://github.com/FredHutch/PGAP-nf

Microbial Genome Annotation (Prokka)

Purpose: Annotate a set of bacterial genomes using the Prokka annotation software suite

Homepage: https://github.com/FredHutch/prokka-nf

Microbial Genome Assembly (UniCycler)

Purpose: Perform de novo assembly of bacterial genomes using the UniCycler hybrid assembler

Homepage: https://github.com/FredHutch/unicycler-nf
