Consenting and Large Scale Data

Updated: November 11, 2022


Before beginning a study and during the proposal preparation process, an important issue to consider is whether the proposed research qualifies as human subjects research as defined by the Department of Health and Human Services Office for Human Research Protections (OHRP) and/or the National Institutes of Health. Take a look at the OHRP decision charts and/or the NIH questionnaire to find out what aspects of human subjects research may apply.

IRB Approval in Human Subjects Research

Institutional Review Boards (IRBs) exist to protect the rights and welfare of human research subjects. IRB oversight supports compliance with the current standards of human subjects research and with current regulations. When conducting its review for genomic research, the IRB will look at whether the research involves human subjects.

When starting a research project that may involve the generation of large-scale genetic or other large-scale molecular datasets, a first step is to confirm that the consent forms from the study or studies in which the specimens were collected include language that specifically addresses the possibility of generating these types of data and how those data can be used and shared.

In some cases, the use of human specimens to generate large-scale molecular datasets is not considered human subjects research, and is thus not subject to the specific requirements of human subjects research, even though the dataset uses human data. This distinction allows studies that use human specimens or data, but are deemed not to be human subjects research, to avoid the relatively heavy documentation and reporting requirements of human subjects research.

For NIH grant applications submitted after January 25, 2018, a new required form provides more clarity when a study does not qualify as human subjects research but does use human specimens and/or data. More about this form, the PHS Human Subjects and Clinical Trials Information form, can be found here. This form requires the investigator to have reviewed the NIH questionnaire; if the study does involve human specimens and/or data, but is not deemed human subjects research, additional documentation or justification is required. An example of the Research Involving Private Information or Biological Specimens flowchart is here.

Available Resources

  • If you are affiliated with the Fred Hutchinson Cancer Center or Seattle Cancer Care Alliance, a good place to start is the Fred Hutch IRB.

  • Similarly, if you work for the University of Washington, a good place to start is the UW IRB.

  • If you currently have an NIH grant, are considering applying for one, or are in the process of writing one, consider taking a look at the NIH’s human subjects research site.

  • For more information about using human specimens, cell lines, or data in the context of a non-human subjects study, here is a PDF of a decision tree provided by the NIH.

Retrospectively Banked Specimens and IRB Review

It is important to be aware that the timeframe in which the specimens were banked may affect the IRB review of the foundational collection consents. On January 25, 2015, NIH policy for viewing and sharing genomic data changed. Consent documents associated with human specimens banked before this date will have different (and fewer) IRB review criteria than consent documents associated with human specimens banked after this date. For data generated from human specimens banked after this date, the patient consent documents must address broad sharing in order for broad data sharing to occur. For specimens banked earlier, sharing may still be possible even when the consent documents are ambiguous with respect to genomic datasets. It is important to consult with the relevant IRB to determine whether a consent with sharing requirements limits the types of data that can be generated and whether it restricts secondary use or sharing of the generated data.

If you are working under an NIH grant and sharing data that may fall under the NIH Genomic Data Sharing (GDS) policy, you should be aware of whether the data you are receiving were collected under appropriate consent. The GDS Policy expects subjects who are asked to enroll in a study in which genomic data are obtained to also be asked for their informed consent to the future research use and broad sharing of their data. Only if potential subjects provide such consent would broad sharing of the data be permissible. If a subject does not consent, they may still be enrolled in the study, but their data may not be shared, or may be shared in a limited manner consistent with the specifics of the consent form. The IRB of the entity sharing the data will make a determination about what can be shared and any limitations.
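
The banking-date and consent considerations described above can be sketched as a rough pre-screening helper for organizing specimen records before talking to the IRB. This is only an illustration under stated assumptions: the function name, the category labels, and the choice to treat specimens banked exactly on January 25, 2015 as "after" are hypothetical, and the actual determination always rests with the relevant IRB.

```python
from datetime import date
from enum import Enum

# Date on which NIH policy for sharing genomic data changed (from the text above).
GDS_EFFECTIVE_DATE = date(2015, 1, 25)


class Triage(Enum):
    """Hypothetical labels for a first-pass sort; not IRB determinations."""
    BROAD_SHARING_MAY_BE_PERMISSIBLE = "consent addresses broad sharing"
    ASK_IRB_ABOUT_AMBIGUOUS_CONSENT = "banked earlier; sharing may be possible"
    NO_OR_LIMITED_SHARING = "banked later without broad-sharing consent"


def triage_sharing(banked_on: date, consent_addresses_broad_sharing: bool) -> Triage:
    """Sort a specimen into a rough category to bring to the IRB.

    Assumption: a specimen banked on the effective date itself is treated
    as banked "after" it; the text does not say how that day is handled.
    """
    if consent_addresses_broad_sharing:
        return Triage.BROAD_SHARING_MAY_BE_PERMISSIBLE
    if banked_on < GDS_EFFECTIVE_DATE:
        return Triage.ASK_IRB_ABOUT_AMBIGUOUS_CONSENT
    return Triage.NO_OR_LIMITED_SHARING
```

For example, `triage_sharing(date(2010, 6, 1), False)` lands in the "ask the IRB" category, reflecting that sharing may still be possible for earlier specimens with ambiguous consents. Every category still requires IRB confirmation.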

To meet NIH expectations under the GDS Policy, investigators will still need to seek or document consent for future use and broad sharing of genomic and phenotypic data, even for research projects for which the IRB has granted a waiver of some or all of the required elements of informed consent under 45 CFR 46.116(d), or for which consent is not required because the activity is not subject to 45 CFR 46. At minimum, the information described below should be provided to prospective participants.

In order to meet the expectations for future research use and broad sharing under the GDS Policy, the consent should capture and convey in language understandable to prospective participants information along the following lines:

  • Genomic and phenotypic data and any other data relevant for the study (such as exposure or disease status) will be generated and may be used for future research on any topic and shared broadly in a manner consistent with the consent and all applicable federal and state laws and regulations.

  • Prior to submitting the data to an NIH-designated data repository, data will be stripped of identifiers such as name, address, account and other identification numbers and will be de-identified by standards consistent with the Common Rule. Safeguards to protect the data according to Federal standards for information protection will be implemented.

  • Access to de-identified participant data will be controlled, unless participants explicitly consent to allow unrestricted access to and use of their data for any purpose. Because it may be possible to re-identify de-identified genomic data, even if access to data is controlled and data security standards are met, confidentiality cannot be guaranteed, and re-identified data could potentially be used to discriminate against or stigmatize participants, their families, or groups. In addition, there may be unknown risks.

  • No direct benefits to participants are expected from any secondary research that may be conducted.

  • Participants may withdraw consent for research use of genomic or phenotypic data at any time without penalty or loss of benefits to which the participant is otherwise entitled. In this event, data will be withdrawn from any repository, if possible, but data already distributed for research use will not be retrieved.

  • The name and contact information of an individual who is affiliated with the institution, is familiar with the research, and will be available to address participant questions.

  • Studies that include whole genome sequencing (WGS), whole exome sequencing (WES), epigenetic profiles, microbiotic profiles, and related forms of in-depth extra-genomic data generate immense amounts of personal information about participants. It is important to draw a distinction between targeted genetic research and broader sequencing protocols, so that participants understand the scope of data generation.
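
The identifier-stripping step mentioned in the list above could look roughly like the following. This is a minimal sketch, assuming participant records are held as flat dictionaries; the field names in `DIRECT_IDENTIFIER_FIELDS` are hypothetical and would need to match your own data dictionary. Dropping these fields alone does not establish that data are de-identified by standards consistent with the Common Rule. That judgment belongs to your institution and IRB.

```python
# Hypothetical field names for direct identifiers; adjust to your own schema.
DIRECT_IDENTIFIER_FIELDS = frozenset(
    {"name", "address", "account_number", "medical_record_number"}
)


def strip_direct_identifiers(record: dict) -> dict:
    """Return a copy of the record without the listed direct-identifier fields.

    The input record is left unchanged. This only removes named fields;
    it does not assess re-identification risk in the remaining data.
    """
    return {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIER_FIELDS
    }
```

Returning a new dictionary rather than editing in place keeps the identified source record intact, which matters if the original must be retained under separate access controls.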

Available Resources

  • See also National Institutes of Health Points to Consider in Developing Effective Data Use Limitation Statements (prepared by the Office of Science Policy, July 13, 2015).

  • The National Human Genome Research Institute has a page for Informed Consent for Genomic Research.

  • The NIH Genomic Data Sharing Policy is here.

  • Large-scale genomic data include genome-wide association studies (GWAS), single nucleotide polymorphism (SNP) arrays, and genome sequence, transcriptomic, metagenomic, epigenomic, and gene expression data. Examples of research subject to the GDS Policy include, but are not limited to, projects that involve generating sequence data for more than one gene from more than 1,000 individuals, analyzing 300,000 or more genetic variants in more than 1,000 individuals, or sequencing more than 100 isolates of infectious organisms such as bacteria. The Supplemental Information to the NIH Genomic Data Sharing Policy includes a detailed description of the research within the scope of the policy and of data submission expectations.

  • The Fred Hutch IRB policy information can be found here (sign-in required if off campus), including the Genomic Data Sharing Supplement here.

  • NIH Information on Institutional Certifications is here. NOTE: Fred Hutch cannot issue a provisional or final “Institutional Certificate” unless Fred Hutch is uploading the final genomic data.
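
The example thresholds quoted in the list above can be expressed as a quick self-check. This is a sketch only: the `StudyPlan` class and its field names are hypothetical, and because the examples are explicitly "not limited to" these cases, a `False` result does not mean a project is outside the GDS Policy. The Supplemental Information and your IRB remain the authority.

```python
from dataclasses import dataclass


@dataclass
class StudyPlan:
    """Hypothetical summary of a planned study; field names are illustrative."""
    human_participants: int = 0
    genes_sequenced: int = 0
    variants_analyzed: int = 0
    infectious_isolates: int = 0


def matches_gds_example(plan: StudyPlan) -> bool:
    """Return True if the plan matches one of the example thresholds above.

    False does NOT mean the project is out of scope; the examples in the
    text are not exhaustive.
    """
    more_than_1000_people = plan.human_participants > 1000
    return (
        (more_than_1000_people and plan.genes_sequenced > 1)
        or (more_than_1000_people and plan.variants_analyzed >= 300_000)
        or plan.infectious_isolates > 100
    )
```

For instance, a plan with 1,500 participants and 500,000 variants analyzed matches the second example, while a plan with exactly 100 bacterial isolates does not match the third, since the text says "more than 100".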
