“Big data” researchers have received a $5 million, three-year state Commonwealth Universal Research Enhancement, or CURE, grant to develop better methods for integrating, analyzing and modeling large volumes of diverse data on cancer patients. The goal is to produce more accurate predictions of patient outcomes and to enable clinicians to tailor care for each patient.
Gregory Cooper, M.D., Ph.D., professor and vice chair of biomedical informatics at the University of Pittsburgh, and Ziv Bar-Joseph, Ph.D., professor of computational biology at Carnegie Mellon University, will lead the Big Data For Better Health (BD4BH) project, which also includes UPMC and the Pittsburgh Supercomputing Center.
“We will investigate breast and lung cancer as clinical domains to develop the methods and software tools; however, the methods will be generalizable to other diseases,” Dr. Cooper said. “The basic approach will be to process raw data, such as gene sequence and expression data, to derive highly informative biological patterns in the data that are then used to predict patient outcomes. We believe these biological patterns will predict outcomes significantly better than would using the raw data directly, and we plan to test this hypothesis.”
For example, rather than using the set of mutated genes in a cancerous tumor (a type of raw data) as predictors of metastasis, the researchers will infer the cell signaling pathways (a type of biological pattern) that are likely to have a significant influence on tumor growth. Those aberrant pathways will then be used to predict clinical outcomes, such as tumor spread, or metastasis. The ultimate goal is for such predictions to help inform clinical care.
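The difference between raw data and a derived pattern can be illustrated with a toy sketch. It is not the BD4BH team's method: the gene names, pathway names and gene-to-pathway table below are hypothetical placeholders, and a simple "fraction of pathway genes mutated" score stands in for the much richer pathway inference the project describes.

```python
from typing import Dict, Set

# Hypothetical gene-to-pathway membership table (placeholder names, not real biology).
PATHWAY_GENES: Dict[str, Set[str]] = {
    "PATHWAY_X": {"GENE_A", "GENE_B", "GENE_C"},
    "PATHWAY_Y": {"GENE_D", "GENE_E"},
}


def pathway_features(
    mutated_genes: Set[str],
    pathway_genes: Dict[str, Set[str]] = PATHWAY_GENES,
) -> Dict[str, float]:
    """Summarize a tumor's mutated genes as one score per pathway.

    The score is the fraction of the pathway's genes that are mutated,
    a deliberately simple stand-in for real pathway inference.
    """
    return {
        name: len(mutated_genes & genes) / len(genes)
        for name, genes in pathway_genes.items()
    }


# Two hypothetical tumors with no mutated gene in common.
tumor_1 = {"GENE_A"}
tumor_2 = {"GENE_B"}

# At the raw-data level the tumors look unrelated...
shared_raw = tumor_1 & tumor_2

# ...but at the pathway level both show the same disruption of PATHWAY_X.
features_1 = pathway_features(tumor_1)
features_2 = pathway_features(tumor_2)
```

In this sketch the two tumors share no raw features, yet they receive identical pathway scores. That is the intuition behind expecting pathway-level patterns to predict outcomes better than the raw mutation list, a hypothesis the researchers say they plan to test.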
“Carnegie Mellon’s unique expertise in analyzing and modeling large-scale data, combined with the cutting-edge clinical and biomedical work of UPMC and Pitt, can leverage the large amounts of data being collected on cancer,” Dr. Bar-Joseph said. “This will enable patients and clinicians to take full advantage of this data in ways not previously possible.”
Machine learning methods, for instance, can automatically analyze large datasets to discover patterns that people cannot discern. These automated discoveries can then help researchers identify relationships between individuals’ DNA and the way they respond to treatment, allowing treatments to be tailored more closely to each patient.
“We hope that more accurate predictions of clinical outcomes will assist physicians in devising treatment plans and help patients in making health care decisions,” Dr. Cooper said. “The patterns contained in the models also may spur biological insights into the diseases being modeled.”
In collaboration with Lincoln University, the BD4BH program also will train underrepresented minority students to work with big data in both the biomedical and data science realms.
The project is funded by the Pennsylvania Department of Health. The department specifically disclaims responsibility for any analyses, interpretations or conclusions.
In addition to the BD4BH project, Pitt, CMU and UPMC are partners in the Pittsburgh Health Data Alliance. Announced in March, the alliance is focused on creating data-intensive software and services with the potential to revolutionize health care and wellness.