The Hutch Data Core supports researchers by providing access to a set of curated bioinformatics workflows that automate commonly needed tasks.
We are in the process of expanding this catalog of workflows, and we are actively soliciting input from researchers who have an interest in adding workflow automation to their ongoing projects.
Because of our extensive work with the Microbiome Research Initiative, many of the workflows listed below are designed for the analysis of microbial sequence data. However, the goal of this catalog is to be comprehensive and not specific to any single scientific domain.
If you are interested in adding a workflow that meets your needs to this catalog, please don’t hesitate to get in touch.
By including a workflow in this catalog, the Data Core commits to helping individual researchers as they run the workflow on their own datasets. This help may include consulting on the proper use of a workflow, setting up an initial test run, and troubleshooting any errors that come up. By creating a catalog of well-supported workflows, we hope to streamline data analysis for future researchers, who will benefit from the improved documentation and debugging that result from these efforts.
Please reach out if you have questions about the use of any workflows listed below, or would like to see an additional use-case supported.
Each of the workflows below has a short description and a link to its repository, which should contain documentation on how to run the workflow. If any of these materials are insufficient, we would appreciate your feedback and are eager to improve them.
RNAseq - Gene Expression Analysis
Purpose: Quantify the degree of gene expression from RNA sequencing datasets
Homepage: https://github.com/nf-core/rnaseq
Differential Expression Analysis
Purpose: Identify the genes which are differentially expressed between treatment and control groups
Homepage: https://github.com/FredHutch/pw-differential-expression
Gene Set Enrichment Analysis
Purpose: Identify the gene sets which are enriched in a collection of differentially expressed genes
Homepage: https://github.com/FredHutch/pw-gene-set-enrichment
Bulk RNA Immune Clonotypes
Purpose: Identify the immune clonotypes present from bulk RNA sequencing data
Homepage: https://github.com/FredHutch/pw-bulk-rna-immune-clonotypes
Bulk RNA Lymphocyte Abundance Estimation
Purpose: Estimate the proportion of lymphocyte cells from bulk RNA gene expression data
Homepage: https://github.com/FredHutch/pw-bulk-rna-lymphocytes
Microbial Taxonomic Classification (metaphlan2)
Purpose: Estimate the abundance of microbial taxa from WGS data
Homepage: https://github.com/FredHutch/metaphlan-nf
CRISPR Screen Analysis
Purpose: Identify genes which are enriched/depleted from CRISPR screen experiments
Homepage: https://github.com/FredHutch/crispr-screen-nf
PacBio CCS Analysis
Purpose: Extract CCS from CLR data generated by the PacBio Sequel instrument
Homepage: https://github.com/FredHutch/ccs-nf
CellProfiler Batch Analysis
Purpose: Process images in parallel through a CellProfiler pipeline and combine the results
Homepage: https://github.com/FredHutch/cellprofiler-batch-nf
Gene-Level Metagenomic Analysis (geneshot)
Purpose: Perform gene-level metagenomic analysis of WGS datasets for strain-level microbiome analysis
Homepage: https://github.com/Golob-Minot/geneshot
FASTQ Quality Assessment
Purpose: Generate a QC summary over a collection of FASTQ input files
Homepage: https://github.com/FredHutch/multi-fastqc-nf
Microbial Operon Identification
Purpose: Identify genomes which contain a set of genes in close proximity
Homepage: https://github.com/FredHutch/octapus
Microbial Pan-Genome Analysis (anvi’o)
Microbial Genome Annotation (PGAP)
Microbial Genome Annotation (Prokka)
Microbial Genome Assembly (UniCycler)
Updated: September 9, 2022