The National Cancer Institute (NCI) seeks to engage the research community in an ongoing discussion about the future of cancer epidemiology. To date, this discussion has included an NCI workshop, multiple blog posts, and a series of commentaries published in Cancer Epidemiology, Biomarkers & Prevention. NCI incorporated the insights from these discussions to develop eight recommendations that can be deliberated, debated, and adapted by cancer epidemiologists in order to accelerate the translation of scientific discoveries into individual and population health benefits. As part of this ongoing discussion, we would like to explore the concept of Integrative Epidemiology as a framework to transition cancer epidemiology into the 21st century.
Key Features of Integrative Epidemiology
Spitz et al. originated the concept of “integrative epidemiology” to describe a cohesive approach combining the rigor of epidemiologic study design with rapid advances in analytic systems and biostatistical and bioinformatic tools in order to extend the boundaries of molecular cancer epidemiology. This concept can be expanded beyond molecular epidemiology to include an even broader network of data integration and knowledge from multiple sources to inform, direct, and prioritize scientific research.
Integrative epidemiology research employs three key complementary practices (see figure):
- Leveraging existing infrastructures, including complex data sets and biobanks, by integrating information collected across diverse study designs, methodological approaches, and technology platforms.
- Forging collaborations across scientific domains (e.g., systems biology, epidemiology, genetics, bioinformatics, clinical sciences, health behavior) by establishing an integrative and transdisciplinary scientific endeavor buttressed in part by the principles of team science.
- Applying the practices of data science (an emerging discipline representing a nexus of statistics, computer science, machine learning, data visualization, informatics, bioinformatics, and computational biology) and knowledge integration to aggregate, synthesize, and interpret the diverse and high-dimensional data accrued through these collaborations.
Ideally, integrative epidemiology research fosters scientific innovation and creativity. Rapid and efficient integration of the emerging wealth of data–including “omics” data (e.g., genomic, epigenomic, metabolomic, and transcriptomic), epidemiologic and clinical data, and behavioral and geographical data–will lead to the construction of “system” models. These models will enhance our understanding of the mechanisms underlying cancer risk and the macro-level factors (e.g., social determinants and health care policy) that may influence outcomes, generate hypotheses for future research, and identify possible targets for prevention or clinical intervention. Further, by interconnecting these networks of information and data resources, the long-term promise of integrative epidemiology is to accelerate the translation of scientific discoveries to population health impact.
An Illustrative Example of Integrative Epidemiology Research
These principles could be broadly applied to address an emerging problem, such as the elevated risk of lung cancer in HIV-infected individuals. Investigators can leverage and develop robust HIV databases and biospecimen repositories to characterize the natural history, delineate underlying mechanisms, and define the clinical course of HIV-associated lung cancer. Intervention trials can integrate somatic, genetic, epigenetic, metabolomic, and transcriptomic changes with smoking data, highly active antiretroviral therapy (HAART), and markers of HIV infection. Electronic medical records could provide information on drug interactions between HAART and cancer chemotherapy regimens and data on the efficacy of standard and novel targeted lung cancer therapeutic approaches. Data science methods can be used to integrate these disparate data sources through modeling of individual and macro-level data (social, organizational, and policy factors). Finally, synthesized knowledge can generate evidence-based guidelines and policies for screening, quality of care, and early diagnosis of lung cancer in HIV-infected populations worldwide.
Extending the reach of cancer epidemiology will require integrative epidemiology endeavors that harness the potential of data science and knowledge integration, the strengths of team science, and the practice of bricolage (a creative use of existing resources in order to create a whole that is more than the sum of its parts). This is not to say that every study must adopt an integrative approach; there are still important unanswered questions that can be addressed in an elegantly designed study with a few key and precisely measured variables. However, cancer epidemiologic research is changing, in part due to the deluge of “big data” generated by technological advancements in the digital age. The discipline must evolve to address the challenge of amalgamating complex and high-dimensional data. These data include the rich “omics”-derived information at the individual level, “big data” collected on exposures (either individual or environmental), and information gathered through other sources (e.g., survey information, electronic health records, or social media). These integrative studies are expensive but hold the promise of great scientific payoff.
In order to facilitate this integrative approach, it is necessary to:
- Promote a model of research that is cross-disciplinary and collaborative. Easy access to shared data and related biospecimens is essential to support this approach.
- Develop systematic approaches to manage and display complex datasets. Maximizing access for investigators in related disciplines and using crowdsourcing to parse complex datasets will lead to innovative and insightful data analysis and interpretation.
- Coalesce and mine data from disparate sources, using the tools developed by the field of data science.
- Combine the infrastructure and population resources assembled by the increasing number of cross-disciplinary consortia; develop approaches for synthesis and translation of multi-level information within the framework of cancer epidemiological research.
Knowledge integration is critical for ensuring that these disparate sources of information can be combined into a cohesive body of scientific evidence that can then guide further research and translation into practice.
What Do You Think?
Although we are introducing integrative epidemiology as a concept to transition cancer epidemiology into the 21st century, epidemiologists have already been employing these types of strategies in practice. EGRP invites you to tell us about your own experiences with integrating these new biotechnologies, methodologies, and data into your study designs, as well as how we can assist. We are also interested in your views on the promise of integrative epidemiology and on the challenges that need to be addressed.
Tram Kim Lam, Ph.D., M.P.H., is a member of the Knowledge Integration Team and a Program Director in the Epidemiology and Genomics Research Program (EGRP)’s Office of the Associate Director and Modifiable Risk Factors Branch, respectively. She engages in a wide range of knowledge integration projects across EGRP and manages a cross-programmatic research grant portfolio that focuses on genetic factors, infectious agents, and lifestyle factors that influence susceptibility to cancer.
Dr. Lam earned her master’s and doctoral degrees in Epidemiology at the Johns Hopkins Bloomberg School of Public Health and has a Bachelor of Arts in Biology from Yale University. During her graduate and post-graduate career, her research interests crossed disciplines to include epidemiologic studies on genetic and lifestyle factors associated with cancers, international collaboration on HIV/AIDS intervention, and community-participatory research among underserved populations.
Margaret Spitz, M.D., M.P.H., is a professor at the Dan L. Duncan Cancer Center of Baylor College of Medicine, where she provides strategic direction for its Cancer Prevention and Population Sciences program. Dr. Spitz also is a consultant to EGRP through the Federal government’s Intergovernmental Personnel Act (IPA).
Dr. Spitz has a long-standing interest in genetic susceptibility to lung cancer. She has developed a lung cancer risk prediction model, has participated in lung cancer genome-wide association studies (GWAS), and is a founding member of the International Lung Cancer Consortium. Dr. Spitz has also outlined an integrative approach to extend the boundaries of molecular cancer epidemiology by integrating modern and rapidly evolving “omics” technologies into state-of-the-art molecular epidemiology in order to comprehensively explore the mechanistic underpinnings of epidemiologic observations into cancer risk and outcomes.