Training and Documentation
Updated: October 6, 2023
Here you can find opportunities for self-learning and training in bioinformatics and data analysis methods. These may be useful for researchers working with large-scale datasets, particularly those produced by the instrumentation provided by Shared Resources.
Genomics
Library prep and quality assessment
- Fred Hutch Guidance - Library Preparation for Sequencing
- Guide to library prep: This guide is an overview of library preparation applications and kits that are in common use for next-generation sequencing.
- NGS library preparation resources from Illumina: Illumina library prep resource page.
- Illumina sequencing method explorer: Use this tool to explore cutting-edge experimental next-generation sequencing (NGS) library preparation methods compiled from scientific literature. To find a method to suit your project, along with compatible kits, select a starting material or search for a method by name.
- Library construction for next-generation sequencing: Overviews and challenges: Factors such as the quantity and physical characteristics of the RNA or DNA source material as well as the desired application are addressed in the context of preparing high quality sequencing libraries.
- Fred Hutch Guidance - Assay Preparation
- Labome overview of RNA extraction kits and applications: This article summarizes commonly used methods and kits for RNA extraction.
- Labome overview of DNA extraction kits and applications: A comprehensive review of DNA extraction and purification kits cited in the literature.
Illumina
- Fred Hutch Guidance - Illumina Sequencers
- Illumina Sequencing by Synthesis: A 5 min video covering Illumina's sequencing-by-synthesis technology.
- Illumina sequencing resources: Sequencing resources page on the Illumina website.
PacBio
- Fred Hutch Guidance - PacBio Instrumentation
- Fred Hutch Guidance - DNA Preparation for PacBio
- SMRT Sequencing: Resources about the SMRT sequencing technology from PacBio.
10X genomics
- Fred Hutch Guidance - Library Preparation for 10X Genomics
- 10X Genomics: Link to the 10X Genomics website.
RNAseq
- RNA sequencing: the teenage years: An overview of RNA seq technologies, methods, and where the tech is headed.
- A survey of best practices for RNA seq data analysis: A review of best practices for RNA sequencing analysis covering experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping.
- University of Oregon RNA-seqlopedia
- Introduction to RNA-seq for researchers: A 30 min video reviewing RNA-sequencing concepts.
- Fred Hutch Guidance - RNA Sequencing
- Illumina gene expression resource: Illumina resources for gene expression analysis.
Small RNAs (e.g., miRNA)
- Fred Hutch Guidance - miRNA
- Genohub guide on small RNA (miRNA): Use this guide to help search for and get accurate pricing and turnaround times for small RNA, microRNA (miRNA) sequencing services. The guide includes considerations you should make before starting your small RNA sequencing project.
Cleavage Under Targets and Release Using Nuclease (CUT&RUN)
- An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites: The original CUT&RUN paper.
Whole exome sequencing
- Fred Hutch Guidance - Whole Exome Sequencing
- Illumina whole-exome sequencing resources: Whole exome sequencing resources from Illumina.
- Roche: whole exome sequencing: A brief overview of whole exome sequencing from Roche.
- The promise of whole-exome sequencing in medical genetics: A summary of the impacts of whole-exome sequencing in medical genetics.
Targeted sequencing
- Fred Hutch Guidance - Targeted DNA Sequencing
- Illumina targeted sequencing resource: Resources on targeted sequencing.
- Disease-targeted sequencing: a cornerstone in the clinic: A review of disease-targeted sequencing and its clinical applications.
- Applications and analysis of targeted genome sequencing in cancer studies: A review of how targeted sequencing is used in cancer research. This paper also presents a generalized workflow, explaining important parameters and best practices.
- Optimizing coverage for targeted sequencing: This tech note from Illumina describes how targeted DNA sequencing panels are designed and sequenced to maximize data quality and coverage.
CRISPR screens
- CRISPR: Gene editing and beyond: A 4 min video covering how the CRISPR technology works.
- Application of CRISPR technologies in research and beyond: A paper discussing the application of CRISPR in a wide range of contexts.
ChIPseq
- Fred Hutch Guidance - ChIPseq
- Epigenie guide to ChIPseq: An overview of ChIP sequencing. A good starting point with references and resources linking to more information.
ATACseq (DNA accessibility)
- Complete guide to understanding and using ATACseq: A blog post from Active Motif covering what ATAC-seq is, its history, how it works, and some discoveries enabled by ATAC-seq.
- ATAC-seq: a method for assaying chromatin accessibility genome-wide: The original ATAC-seq paper.
- A brief overview of ATAC-seq: A short 7 min video explaining how ATAC-seq works and how it can be applied to epigenetics research.
Nanostring
- Fred Hutch Guidance - SNP Arrays
- Fred Hutch Guidance - Nanostring
- Nanostring: Link to the Nanostring website.
- Illumina microarray: Illumina resources on their microarray technology.
- Illumina gene expression & transcriptome analysis: Resources on gene expression methods and analysis from Illumina.
Self-directed learning through Fred Hutch
- The Scientific Computing Resource Library includes tutorials of how to perform common computational tasks using software available at Fred Hutch. Below we’ve highlighted a few popular links, but be sure to check out the page to see all the tutorials available.
- Code templates and examples have been developed for those who are interested in implementing methods from the Scientific Computing Resource Library on their own data following best practices for reproducibility. Code templates are provided for setting up your own analyses, as well as additional examples of executable code that can be tailored to suit your own needs.
- The Fred Hutch Data Science Lab curates training materials on a wide array of data science and computing topics that you can find on their training page.
- Fredhutch.io was an initiative to facilitate education about and promote access to computational resources at Fred Hutch. Those interested in learning core computational skills like R and Python can work through this archived content at their own pace.
Resources In Seattle
- UW Biostatistics Summer Institutes offer intensive courses each summer on a wide variety of topics.
- Meetup hosts various coding groups that meet regularly to share skills and provide networking opportunities. RLadies Seattle and Seattle UseR both include leaders from Fred Hutch.
External Resources On the Web
Classroom-Style Courses
These resources are organized in a lecture-style format as slides, screencasts, and video. Most are work-at-your-own-pace, but some may be linked to a course calendar.
- Hutch Learning playlist for Coding & Programming
- MCB517A: Tools for Computational Biology: A graduate-level course taught for UW by Fred Hutch CompBio faculty. This links to a GitHub repository that includes all lectures and homework.
- How to install necessary software for this course
- Ask questions about this course
- Course materials all available for free
- edX: Offers a collection of courses for Data Analysis and Statistics and Bioinformatics
- Rafael Irizarry of Dana Farber has online programs available through edX:
- Generally speaking, edX courses are free to audit for a limited period of time. Unlimited access and the ability to earn a course certificate require payment.
- Coursera: Offers a collection of courses for Data Science and Bioinformatics
- R Programming: A beginner-level program with five mini-courses. It takes about 4 months to complete.
- Statistics with R: A beginner-level program with five mini-courses. It takes about 7 months to complete.
- Genomic Data Science: An intermediate-level program for those who are already acquainted with R. It has eight mini-courses and takes about 6 months to complete.
- Python Programming: An intermediate-level program that takes about 4 months to complete.
- Coursera offers a 7-day free trial, after which it is a paid subscription service.
- Udacity: Offers a collection of courses for Data Science
- Udacity is a paid subscription service.
- Currently offering one month free for their Nanodegree programs.
- Udemy: Offers a collection of courses for Data Science
- Udemy offers courses at various price points.
- Keep an eye out for sales, which happen regularly and can drastically reduce the cost.
- CognitiveClass.ai: Offers a collection of courses for data science, AI, and cloud computing.
- All courses are free.
- The Open Source Data Science Masters: An open-source curriculum for learning data science. This is a mixed media course made up of videos, books, and slides.
- Some content is free, some is paid.
- CalTech Learning from Data
- A free YouTube series
Video Tutorials
Command Line
- Jesse Showalter - Command Line Basics (<15min video)
- Corey Schafer - Git Tutorial for Beginners: Command-Line Fundamentals (30 min video)
R
- Webinars from RStudio (30 min - 1 hr / video)
- Getting Started With R Markdown
- Easy Ways to Collect Different Types of Data from the Web with R - Part 1
- Easy Ways to Collect Different Types of Data from the Web with R - Part 2
- Debugging Techniques in RStudio
- A Gentle Introduction to Tidy Statistics in R
- Managing Packages for Open Source Data Science
- Tidyverse Visualization and Manipulation Basics
- Introduction to Shiny
For more RStudio video tutorials check out the following links: RStudio Webinars; RStudio Essentials Tutorials; RStudio Data Science Essentials Tutorials
Python
- Corey Schafer - Python beginners series (15 - 30 min / video)
- Install and Setup for Mac and Windows
- Strings - Working with Textual Data
- Integers and Floats - Working with Numeric Data
- Lists, Tuples, and Sets
- Dictionaries - Working with Key-Value Pairs
- Conditionals and Booleans
- Loops and Iterations
- Functions
- Import Modules and Exploring The Standard Library
See Corey’s YouTube channel for more tutorials, including intermediate and advanced Python topics.
- Dataschool.io - Best Practices with Pandas (2 hr)
Core Coding Concepts
- Data Structures and Algorithms Series - Coding Dojo (15 - 30 min / video)
Interactive Coding Platforms
These resources offer work-at-your-own-pace classes with a major focus on hands-on problem sets and projects.
- DataQuest: A subscription service that offers programs and courses focused on data analysis and engineering in Python and R.
- Tiered payment system with basic and premium plans
- Codecademy: A subscription service that offers coding programs and courses in many different languages.
- Tiered payment system with limited content available for free
Free ebooks
- O’Reilly books available through Seattle Public Library
Misc
- The Carpentries, with lessons from Data Carpentry and Software Carpentry
Please reach out to the Hutch Data Core by sending an email to hutchdatacore
with questions, comments, or suggestions related to training!