Real-World Data in Oncology Research
Real-world data (RWD) are “data relating to patient health status and/or the delivery of health care routinely collected provided outside of a research setting,” for example, data from electronic health records or cancer registries. On this page we outline types of RWD used in oncology research and resources to learn more about real-world evidence (RWE) generation at Fred Hutch and beyond.
Sources of RWD
There are four major sources of RWD:
- Administrative claims
- Registries
- Large-scale harmonized databases
- Electronic health records (EHRs)
Administrative Claims
Administrative claims data are data generated for the purposes of health care billing. Claims data originate from the federal government (Medicare), state-level governments (e.g., Medicaid programs), commercial insurance providers, and other aggregators of health care claims.
One state-level example is the Washington State All-Payer Claims Database (WA-APCD).
Registries
Patient registries are databases that collect information about people “diagnosed with a specific disease, genetic disorder, or medical condition.” There are several registries that are important for understanding the distribution and burden of cancer in the United States.
The CDC’s National Program of Cancer Registries (NPCR) supports state and local data collection, and the North American Association of Central Cancer Registries (NAACCR) standardizes the data from NPCR and other registries in the US and Canada. The NPCR covers 96% of the US population.
The National Cancer Institute’s (NCI) Surveillance, Epidemiology, and End Results (SEER) program covers 48% of the US population, but contains more detailed surveillance of cancer type and population-specific trends. SEER also links to administrative claims data, such as Medicare. The NCI also runs the National Childhood Cancer Registry (NCCR).
The National Cancer Database (NCDB) is the combined effort of the Commission on Cancer (CoC), the American Cancer Society, and the American College of Surgeons. This hospital-based registry contains data from CoC-accredited hospitals and captures around 70% of newly diagnosed cancer cases in the US.
The CDC maintains the US Cancer Statistics database, which includes national-level reporting on cancer burden. It combines data from NPCR and SEER.
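As a rough illustration of what linking registry data to administrative claims involves, the sketch below joins a few made-up registry cases to made-up claims on a shared identifier. Every field name (`patient_id`, `cancer_site`, `paid_amount`) and every value is hypothetical and not drawn from SEER, Medicare, or any real data set; real linkages are considerably more involved than this toy exact-match join.

```python
from collections import defaultdict

# Hypothetical, made-up records for illustration only.
registry_cases = [
    {"patient_id": "P1", "cancer_site": "breast", "dx_year": 2018},
    {"patient_id": "P2", "cancer_site": "lung", "dx_year": 2019},
    {"patient_id": "P3", "cancer_site": "colon", "dx_year": 2019},
]
claims = [
    {"patient_id": "P1", "service_year": 2018, "paid_amount": 1200.0},
    {"patient_id": "P1", "service_year": 2019, "paid_amount": 300.0},
    {"patient_id": "P2", "service_year": 2019, "paid_amount": 950.0},
]


def link_registry_to_claims(cases, claim_rows):
    """Attach a claims summary to each registry case via a shared patient_id."""
    # Index claims by patient so each case lookup is a dictionary access.
    claims_by_patient = defaultdict(list)
    for row in claim_rows:
        claims_by_patient[row["patient_id"]].append(row)

    linked = []
    for case in cases:
        patient_claims = claims_by_patient.get(case["patient_id"], [])
        linked.append({
            **case,
            "n_claims": len(patient_claims),
            "total_paid": sum(c["paid_amount"] for c in patient_claims),
        })
    return linked


linked_cases = link_registry_to_claims(registry_cases, claims)
```

Cases with no matching claims (here the hypothetical `P3`) are kept with a claim count of zero rather than dropped, so the analyst can see how many registry cases lack claims coverage.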
Large-scale harmonized databases
Both commercial vendors and non-profit organizations provide data products offering harmonized medical record and claims data. Some are specific to cancer care, and others are more general healthcare databases. Some examples are ASCO’s CancerLinQ, Truveta, and Flatiron Health.
Institutional electronic health record data
Institution-specific EHR data (both raw and processed/transformed) are often available in institutional data warehouses. The barriers to using these data in clinical research include lack of data standardization, proprietary data models, data quality issues, selection biases inherent in using data generated from healthcare utilization, and more. Efforts such as the Observational Health Data Sciences and Informatics (OHDSI) program aim to address these issues and provide open-source solutions.
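To make the standardization barrier concrete, here is a minimal sketch of mapping site-specific diagnosis codes to one shared vocabulary. The site names, local codes, and `STD:` labels are all invented for illustration; this is not the OHDSI tooling or any real vocabulary, only the general idea of a lookup-based harmonization step.

```python
# Hypothetical mapping from (site, local code) to a shared, made-up standard code.
LOCAL_TO_STANDARD = {
    ("site_a", "DX-BRST-01"): "STD:BREAST_CANCER",
    ("site_b", "BC-174"): "STD:BREAST_CANCER",
    ("site_a", "DX-LUNG-07"): "STD:LUNG_CANCER",
}

# Made-up source records from two hypothetical institutions.
raw_records = [
    {"patient_id": "A-001", "site": "site_a", "local_code": "DX-BRST-01"},
    {"patient_id": "B-042", "site": "site_b", "local_code": "BC-174"},
    {"patient_id": "B-077", "site": "site_b", "local_code": "UNKNOWN-9"},
]


def harmonize(records, code_map):
    """Split records into harmonized rows and rows with no known mapping."""
    harmonized, unmapped = [], []
    for rec in records:
        standard = code_map.get((rec["site"], rec["local_code"]))
        if standard is None:
            # Keep unmapped rows for review instead of silently dropping them.
            unmapped.append(rec)
        else:
            harmonized.append({
                "patient_id": rec["patient_id"],
                "site": rec["site"],
                "standard_code": standard,
            })
    return harmonized, unmapped


harmonized_rows, unmapped_rows = harmonize(raw_records, LOCAL_TO_STANDARD)
```

Returning the unmapped rows separately is a deliberate choice in this sketch: the share of records that fail to map is itself a data quality signal worth reporting.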
Real-World Data Stewardship at Fred Hutch
There are several groups and departments at Fred Hutch that play a critical role in stewardship of RWD and in real-world evidence generation.
The Cancer Registry (internal link) at Fred Hutch collects data on analytic cases and some non-analytic cases among Fred Hutch patients, and reports these data to regional, state, and national registries.
The Cancer Surveillance System (CSS) at Fred Hutch is part of SEER and collects data from 13 counties in western Washington. These data are then reported to the Washington State Department of Health, merged with data from across Washington State, and then reported to the NCI’s SEER program.
The Hutch Institute for Cancer Outcomes Research (HICOR) conducts essential research to “improve cancer prevention, detection and treatment in ways that will reduce the economic and human burden of cancer.” HICOR maintains access to the following data resources for Fred Hutch researchers:
- SEER Patterns of Care
- SEER-Medicare Health Outcomes Survey (SEER-MHOS)
- SEER-Consumer Assessment of Healthcare Providers and Systems (CAHPS) surveys
- SEER-Medicare linked data resource (includes Medicare Advantage plans)
- SEER-Medicaid linked data resource
- Health Economics Research On Cancer (HEROiC)
The Office of the Chief Data Officer at Fred Hutch manages data access for Fred Hutch patient data for research.
Resources to Learn More
- Oncologists Must Consider Participant Data When Using Large-Scale Cancer Data Sets
- Real-World Database Studies in Oncology: A Call for Standards
- Assessing Real-World Data From Electronic Health Records for Health Technology Assessment: The SUITABILITY Checklist: A Good Practices Report of an ISPOR Task Force
- An overview of real-world data sources for oncology and considerations for research