Genomics Platforms and Data Types

Updated: May 17, 2022

This guide highlights some of the genomics platforms available through the Genomics Shared Resource at Fred Hutch and gives general context for each platform. Many of the submission processes for the Genomics Shared Resource are accessed via Hutchbase.

Ask Questions Early and Often

As technologies and reagents change, the most cost-effective and efficient approaches for performing these experiments do as well. It is important to discuss your particular experiment and needs with the Genomics and Bioinformatics Shared Resource during the planning stages of your project. They may also have important insight into technological factors which may influence your experimental design, and the most appropriate control samples to collect. You can contact them by emailing genomics.

Sequencing Based Platforms

Sequencing-based platforms currently available via the Genomics Core at Fred Hutch include the Illumina sequencers (NovaSeq/NextSeq/MiSeq) and the Pacific Biosciences Long Range sequencer (Sequel IIe). These platforms can sequence a variety of assay material types via different library preparation processes. The RNA or DNA Approaches pages discuss options for creating sequencing libraries from either nucleic acid type and for different research questions. Choosing the appropriate assay material QC and library preparation reagents depends in part on how the libraries will be sequenced. Thus it is important to verify that all phases of the process use compatible techniques.

Illumina Sequencers

The Illumina sequencers are short-read sequencers that produce very large numbers of reads and support many different upstream library types. They sequence, in high numbers, the individual DNA fragments generated by library preparation; the sequences in the mixture are then reconstructed by, for example, aligning the short reads to a reference or by another bioinformatic approach. The basic details for each NovaSeq and NextSeq flow cell type, along with the corresponding information for the MiSeq, are included below. The exact cost of a sequencing run depends on the read lengths, whether the sequencing is paired end or single end, and your affiliation (Fred Hutch vs external), so please contact the Genomics Core by emailing genomics to get current cost estimates.

Sequencer | Mode       | Read Lengths | Approx. Reads per Lane | Lanes per Run | User Prepared Library Submission Requirements*
--------- | ---------- | ------------ | ---------------------- | ------------- | ----------------------------------------------
MiSeq     | Nano       | Up to 300bp  | 1M                     | 1             | >= 30uL at >= 4nM
MiSeq     | V2 Reagent | Up to 300bp  | 12 - 15M               | 1             | >= 30uL at >= 4nM
MiSeq     | V3 Reagent | Up to 300bp  | 22 - 25M               | 1             | >= 30uL at >= 4nM
NextSeq   | P1         | Up to 300bp  | 100M                   | 1             | >= 30uL at >= 3nM
NextSeq   | P2         | Up to 300bp  | 400M                   | 1             | >= 30uL at >= 3nM
NextSeq   | P3         | Up to 300bp  | 1100M                  | 1             | >= 30uL at >= 3nM
NovaSeq   | SP         | Up to 300bp  | 650 - 800M             | 1             | >= 100uL at >= 4nM
NovaSeq   | S1         | Up to 300bp  | 1300 - 1600M           | 1             | >= 100uL at >= 4nM
NovaSeq   | S2         | Up to 300bp  | 3300 - 4100M           | 1             | >= 150uL at >= 4nM
NovaSeq   | S4         | Up to 300bp  | 8000 - 10000M          | 1             | >= 300uL at >= 4nM
* Check with the Genomics Core to verify current submission requirements. You will be asked for your quantification method; qPCR is the most accurate.
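
The molarity thresholds in the table can be compared against a mass concentration (for example from a fluorometric assay) using the standard conversion for double-stranded DNA, which assumes an average of roughly 660 g/mol per base pair. The sketch below is only illustrative: the function names, the example concentration, the fragment length, and the volume are assumptions, not values from the Genomics Core.

```python
# Average molar mass of one double-stranded DNA base pair (g/mol).
# This is a widely used approximation, not an exact constant.
AVG_BP_MASS_G_PER_MOL = 660.0


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/uL to nM."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment length > 0")
    # ng/uL -> g/L is a factor of 1e-3; mol/L -> nmol/L is a factor of 1e9.
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp)


def meets_requirement(conc_ng_per_ul: float, mean_fragment_bp: float,
                      volume_ul: float, min_volume_ul: float,
                      min_molarity_nm: float) -> bool:
    """Check a library against a '>= volume at >= molarity' style requirement."""
    molarity = library_molarity_nm(conc_ng_per_ul, mean_fragment_bp)
    return volume_ul >= min_volume_ul and molarity >= min_molarity_nm


# Hypothetical library: 2 ng/uL, 400 bp mean fragment size, 35 uL available.
molarity = library_molarity_nm(2.0, 400)
print(f"{molarity:.2f} nM")  # 7.58 nM

# Compared against the NextSeq row of the table (>= 30uL at >= 3nM).
print(meets_requirement(2.0, 400, 35, min_volume_ul=30, min_molarity_nm=3))  # True
```

Because the result depends on the mean fragment length, an inaccurate size estimate shifts the computed molarity proportionally.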

When deciding how much sequencing a set of libraries needs to reach sufficient read depth (the number of reads covering each genomic location represented in the library), consider the intended data type, sample type and quality, library preparation type, total number of samples, and whether multiplexing is applicable. Consulting with the Genomics Core can help provide more clarity for individual projects. You can request more information by emailing genomics.
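
As a rough planning aid, average read depth can be estimated as (number of reads × read length) / (size of the targeted region), and the number of samples that fit on a lane as (reads per lane) / (reads wanted per sample). The sketch below assumes evenly distributed reads and perfectly balanced multiplexing, which real runs do not achieve; the function names, the 3.1 Gb genome size, and the per-sample targets are illustrative assumptions.

```python
import math


def mean_coverage(n_reads: int, read_length_bp: int, target_size_bp: int) -> float:
    """Average depth, assuming reads are spread evenly over the target."""
    return n_reads * read_length_bp / target_size_bp


def reads_for_coverage(target_coverage: float, read_length_bp: int,
                       target_size_bp: int) -> int:
    """Number of reads needed to reach a desired average depth."""
    return math.ceil(target_coverage * target_size_bp / read_length_bp)


def samples_per_lane(reads_per_lane: int, reads_per_sample: int) -> int:
    """How many equally weighted samples fit on one lane."""
    return reads_per_lane // reads_per_sample


# Example: 400M reads (the NextSeq P2 row of the table), 150 bp reads,
# and an assumed ~3.1 Gb genome.
print(round(mean_coverage(400_000_000, 150, 3_100_000_000), 2))  # 19.35

# Reads needed for 30x average depth on the same assumed genome.
print(reads_for_coverage(30, 150, 3_100_000_000))  # 620000000

# Multiplexing with a hypothetical target of 25M reads per sample.
print(samples_per_lane(400_000_000, 25_000_000))  # 16
```

These figures are upper bounds for planning; duplicate reads, low-quality reads, and uneven pooling all reduce the usable depth per sample.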

More from Illumina about Illumina Sequencing can be found here.

Pacific Biosciences (PacBio) Long Range Sequencer

The PacBio Sequel IIe sequencer works differently from the Illumina sequencers in that the read length is not specified by the platform but is limited by the library itself, with confidence in the sequence decreasing as reads get longer. However, instead of being limited to sequencing only short fragments of DNA, PacBio sequencing can provide long stretches of sequence that come from the same fragment. This allows for analyses such as full-length isoform discovery, de novo small genome sequencing, assessment of structural variants/translocations, and allele phasing. On average, the PacBio sequencer aims to provide up to 15kb of read length.

  • 4-5M reads per SMRT cell
  • Insert size for sequencing: 200bp minimum, up to 40kb fragments
  • Library prep
    • Reagents purchased by lab from PacBio and prepped by lab, brought to Genomics as completed library (see PacBio website for more info), OR
    • Genomics Shared Resource can help with library prep, for a service fee, by processing a QC’d sample into a PacBio library, depending on library type:
      • For amplicon library prep up to 5kb
      • For large insert library prep up to 20kb
  • Multiple multiplexing schemas (in-line or ligated) - discuss details with Genomics Shared Resource to plan the approach (email genomics).

Array Based Platforms

Microarrays are sometimes a less costly option and can in some cases be substituted for a wide variety of sequencing types; for example, there are SNP, gene expression, and whole exome arrays. Microarrays are not useful for discovering novel targets, but for well-established targets the assay chemistries and data analysis pipelines are well vetted. A discussion with the Genomics Core can be useful in helping you decide the best technologies for your work.

Nanostring Hybridization Arrays for Gene Expression

Reagents for Nanostring arrays can be purchased from Nanostring and total RNA ready to be run can be brought to the Genomics Shared Resource for processing.

Library Preparation Reagents and Methods

The choice of genomics platform will dictate the needs of the library preparation method. It is important to understand how library preparation can impact the final data, for example, whether it introduces biases toward detecting specific types of nucleic acids. Meeting with an NGS specialist at the Genomics Core will help to guide your path.

Library Preparation for Sequencing

The four main steps in preparing RNA or DNA for NGS analysis are:

  1. fragmenting and/or sizing the target sequences to a desired length (via physical, enzymatic, or chemical methods)
  2. converting target to double-stranded DNA (if RNA)
  3. attaching oligonucleotide adapters to the ends of target fragments. These adapters are multi-purpose: they are used to index/barcode the fragments and allow the fragments to attach to the flow cell of the sequencer (where the sequencing of the fragments occurs)
  4. quantitating the final library product for sequencing

10x Genomics Single Cell Library Preparation System

To obtain single cell gene expression data from RNA-seq, the Genomics lab uses the 10x Genomics Single Cell Expression platform. Starting with a cell suspension, this process partitions cells into droplets for cDNA library preparation. After library prep, the droplets are pooled, then sequenced on an Illumina sequencer. Cell barcodes added during library prep allow the sequencing results to be computationally traced back to individual cells, while unique molecular identifiers (UMIs) distinguish individual transcript molecules from amplification duplicates.
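
To illustrate how pooled reads can be traced back computationally, the toy sketch below assumes each read has already been reduced to a (cell barcode, UMI, gene) tuple. Reads are grouped by cell barcode, and distinct UMIs are counted per gene so that amplification duplicates are counted once. The function name and the short example sequences are hypothetical; production pipelines also perform alignment, barcode error correction, and filtering, none of which is shown here.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per gene within each cell.

    reads: iterable of (cell_barcode, umi, gene) tuples.
    Returns a dict of the form {cell_barcode: {gene: n_unique_umis}}.
    """
    # Collect the set of UMIs seen for every (cell, gene) pair.
    umis_seen = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        umis_seen[(cell_barcode, gene)].add(umi)

    # Collapse each set to a count; duplicates of the same UMI count once.
    counts = defaultdict(dict)
    for (cell_barcode, gene), umis in umis_seen.items():
        counts[cell_barcode][gene] = len(umis)
    return dict(counts)


# Toy input with deliberately short, made-up barcodes and UMIs.
reads = [
    ("AAAC", "TTG", "GAPDH"),
    ("AAAC", "TTG", "GAPDH"),  # same cell, same UMI: an amplification duplicate
    ("AAAC", "CCA", "GAPDH"),
    ("GGGT", "TTG", "GAPDH"),  # same UMI but a different cell: counted separately
    ("GGGT", "ACG", "CD3E"),
]

counts = count_umis(reads)
print(counts["AAAC"])  # {'GAPDH': 2}
print(counts["GGGT"])  # {'GAPDH': 1, 'CD3E': 1}
```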

Overview of the Sequencing Process

  1. The adapter-ligated DNA library is loaded onto a flow cell.
  2. The fragments are hybridized to the flow cell surface.
  3. Each bound fragment is amplified into a cluster. This step is known as bridge amplification.
  4. Fluorescently labeled nucleotides and sequencing reagents are added; the flow cell is fluorescently imaged after the incorporation of each nucleotide. The color of the fluorescent dye identifies which base was incorporated. This cycle of nucleotide incorporation and imaging is repeated n times to produce a read n bases long.
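
The cycle-by-cycle imaging in the steps above can be pictured with a highly simplified sketch: for one cluster, each cycle yields one signal per dye, and the brightest signal is taken as the incorporated base. This assumes a four-color chemistry with one dye per base and ignores quality scoring, signal correction, and two-color encodings used on some instruments; the function name and intensity values are invented for illustration.

```python
# Order of the dye channels in each per-cycle measurement (an assumption).
CHANNELS = ("A", "C", "G", "T")


def call_bases(intensities):
    """Turn per-cycle channel intensities for one cluster into a read.

    intensities: sequence of 4-tuples, one tuple per sequencing cycle,
    giving the signal in the A, C, G and T channels respectively.
    """
    read = []
    for cycle in intensities:
        if len(cycle) != len(CHANNELS):
            raise ValueError("each cycle needs one intensity per channel")
        # The channel with the strongest signal is called as the base.
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Three imaging cycles for a single cluster (made-up values).
signal = [
    (0.9, 0.1, 0.0, 0.1),
    (0.1, 0.0, 0.8, 0.2),
    (0.0, 0.1, 0.1, 0.7),
]

read = call_bases(signal)
print(read)       # AGT
print(len(read))  # 3
```

The read length equals the number of cycles, which is why longer reads require proportionally more cycles of incorporation and imaging.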

