Code Examples and Templates

Updated: July 21, 2022

Our Resource Library includes tutorials on how to perform common computational tasks using software available at Fred Hutch. If you’re interested in applying these methods to your own data while following best practices for reproducibility, the resources below include templates for setting up your own analyses, as well as additional examples of executable code that can be tailored to suit your needs.

These templates and examples are generally published as GitHub repositories. If you are unfamiliar with GitHub, please see our section on [Managing and Sharing Code]. If you have other templates or examples you would like to see posted here, please file an issue in our GitHub repository or read our contributing guidelines to learn how you can add the content yourself.

Coding Practices

In addition to building your project within the file structure of the templates below, you should apply these coding practices to all your software development work:

  1. Raw and processed data should be stored separately.
  2. Source code and results should be organized and clearly labeled.
  3. All projects should contain a license and a README file, with a project overview and details about each component in the README.
  4. Code should be fully documented. Additional documentation, instructions and examples can be included in a separate folder.
  5. Reuse existing code (or packages) when available.
  6. Analysis steps should be automated in code, rather than performed by hand, to reduce transcription errors.
  7. Use inline comments and meaningful variable names to help make your code readable to reviewers, researchers, and your future self.
  8. For Python packages and modules, include help documentation (for example, docstrings).
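
As a minimal sketch of how several of these practices might look together in Python (separate raw and processed data, meaningful names, inline comments, and docstrings that serve as help documentation), consider the small module below. The function names, file names, and folder layout are hypothetical and chosen only for illustration; they do not come from any Fred Hutch template.

```python
"""Hypothetical example module: normalize raw count measurements.

Illustrates separate raw/processed data folders, meaningful names,
inline comments, and docstrings that Python's help() can display.
"""
import csv
from pathlib import Path


def normalize_counts(counts):
    """Scale a list of non-negative counts so that they sum to 1.

    Parameters
    ----------
    counts : list of float
        Raw counts; their total must be positive.

    Returns
    -------
    list of float
        Each count divided by the total of all counts.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [count / total for count in counts]


def process_file(raw_path, processed_dir):
    """Normalize a one-column CSV of counts and save the result.

    The raw file is only read, never modified. The output is written
    to processed_dir (created if missing), and its path is returned.
    """
    raw_path = Path(raw_path)
    processed_dir = Path(processed_dir)
    processed_dir.mkdir(parents=True, exist_ok=True)

    # Read the first column of every non-empty row as a number.
    with raw_path.open(newline="") as raw_file:
        counts = [float(row[0]) for row in csv.reader(raw_file) if row]

    # Name the output after the input so results are clearly labeled.
    output_path = processed_dir / f"{raw_path.stem}_normalized.csv"
    with output_path.open("w", newline="") as output_file:
        writer = csv.writer(output_file)
        writer.writerows([value] for value in normalize_counts(counts))
    return output_path
```

With a module like this, calling help(normalize_counts) in a Python session displays the docstring, and a call such as process_file("data/raw/sample.csv", "data/processed") (paths assumed for illustration) leaves the raw file untouched while writing results to a separate folder.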

Templates for data analysis and coding

The following repositories were created by researchers at Fred Hutch to assist in software development and data analysis following best practices for reproducibility.

Other groups have developed templates for more general use:

  • Cookiecutter is a command-line utility that creates projects from project templates, with a wide variety of functionality for different languages.
  • Shablona is a template developed by UW’s eScience that is specifically designed for small scientific Python projects.

Fred Hutch Code Examples

These repositories were created by Fred Hutch researchers and staff, and contain a variety of curated example code with documentation.

  • Data Science Example Code: Example code for a variety of common data science analysis tasks, longer than what appears in the Resource Library, but not so long as to warrant their own repositories.
  • Slurm examples: Examples of using Slurm (the job management system used on our cluster) for life sciences research tasks.
  • Batch pipeline: Example workflows for a multi-step array job on AWS Batch (cloud computing).
  • Python examples: discussed and aggregated by the Python User Group at Fred Hutch.
  • Nextflow examples: this link shows all repositories containing Nextflow (nf) examples in the Fred Hutch GitHub organization. Each repository represents a different kind of analysis.
  • WDL examples: this link shows all public repositories containing WDL examples in the Fred Hutch GitHub organization.
  • Older repositories involving workflows that are useful for reference:
    • Single Cell RNAseq workflows: basic outline of different approaches to workflows used in scRNAseq. Includes some perspective on why some approaches may be more effective.
    • Reproducible Workflows: examples of various workflows (Cromwell, WDL, AWS), with guidance for use of each.
