Computers can be trained to be more accurate than pathologists in assessing slides of lung cancer tissues, according to a study by researchers at the Stanford University School of Medicine. The researchers found that a machine-learning approach to identifying critical disease-related features accurately differentiated between two types of lung cancers and predicted patient survival times better than the standard approach of pathologists classifying tumors by grade and stage.
"Pathology as it is practiced now is very subjective," said Michael Snyder, PhD, professor and chair of genetics. "Two highly skilled pathologists assessing the same slide will agree only about 60% of the time. This approach replaces this subjectivity with sophisticated, quantitative measurements that we feel are likely to improve patient outcomes."
The research will be published in Nature Communications. Snyder, who directs the Stanford Center for Genomics and Personalized Medicine, shares senior authorship of the study with Daniel Rubin, MD, assistant professor of radiology and of medicine. Graduate student Kun-Hsing Yu, MD, is the lead author of the study.
Although the current study focused on lung cancer, the researchers believe that a similar approach could be used for many other types of cancer.
"Ultimately this technique will give us insight into the molecular mechanisms of cancer by connecting important pathological features with outcome data," said Snyder.
Assessing grade, severity of cancer
For decades, pathologists have assessed the severity, or "grade," of cancer by using a light microscope to examine thin cross-sections of tumor tissue mounted on glass slides. The more abnormal the tumor tissue appeared—in terms of cell size and shape, among other indicators—the higher the grade.
A stage is also assigned based on whether and where the cancer has spread throughout the body.
Often a cancer's grade and stage can be used to predict how the patient will fare. They also can help clinicians decide how, and how aggressively, to treat the disease. This classification system doesn't always work well for lung cancer, however.
In particular, the lung cancer subtypes of adenocarcinoma and squamous cell carcinoma can be difficult to tell apart when examining tissue culture slides. Furthermore, the stage and grade of a patient's cancer don't always correlate with their prognosis, which can vary widely.
Fifty percent of stage-1 adenocarcinoma patients, for example, die within five years of their diagnosis, while about 15% survive more than 10 years.
The researchers used 2,186 images from a national database called the Cancer Genome Atlas. The images were obtained from patients with either adenocarcinoma or squamous cell carcinoma, and the database also contained information about the grade and stage assigned to each cancer and how long each patient lived after diagnosis.
The researchers then used the images to "train" a computer software program to identify many more cancer-specific characteristics than can be detected by the human eye—nearly 10,000 individual traits, versus the several hundred usually assessed by pathologists.
These characteristics included not just cell size and shape, but also the shape and texture of the cells' nuclei and the spatial relations among neighboring tumor cells.
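To make the idea of a quantitative feature concrete, here is a minimal Python sketch of two measurements of the kind described above: a shape score for a single cell outline and a spacing measure between neighboring cells. It is an illustration only, not the software used in the study. The outlines, the centroid coordinates and all function names are hypothetical.

```python
import math

def polygon_area_perimeter(points):
    """Shoelace area and perimeter of a closed outline given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter

def circularity(points):
    """4*pi*area / perimeter^2: close to 1.0 for a circle, lower for irregular shapes."""
    area, perimeter = polygon_area_perimeter(points)
    return 4.0 * math.pi * area / perimeter ** 2

def nearest_neighbor_distances(centroids):
    """Distance from each cell centroid to its closest neighboring centroid."""
    return [
        min(math.dist(c, other) for j, other in enumerate(centroids) if j != i)
        for i, c in enumerate(centroids)
    ]

# Made-up example data: a nearly round outline, an elongated one, three cell centers.
round_outline = [
    (math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64)) for k in range(64)
]
elongated_outline = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (0.0, 1.0)]
cell_centroids = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]

print(circularity(round_outline))
print(circularity(elongated_outline))
print(nearest_neighbor_distances(cell_centroids))
```

A real pipeline would first have to segment cells and nuclei from the slide image. That step is omitted here, and it is where most of the practical difficulty lies.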
"We began the study without any preconceived ideas, and we let the software determine which characteristics are important," said Snyder, who is the Stanford W. Ascherman, MD, FACS, Professor in Genetics.
"In hindsight, everything makes sense. And the computers can assess even tiny differences across thousands of samples many times more accurately and rapidly than a human."
The researchers homed in on a subset of cellular characteristics identified by the software that could best be used to differentiate tumor cells from the surrounding noncancerous tissue, identify the cancer subtype, and predict how long each patient would survive after diagnosis.
They then validated the software on a separate dataset of 294 lung cancer patients from the Stanford Tissue Microarray Database, confirming that it could accurately distinguish short-term survivors from those who lived significantly longer.
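The general workflow described here, narrowing many features down to an informative subset and then checking the result on patients the software has never seen, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the study's method: the data are synthetic, the feature ranking (standardized difference in group means) and the nearest-centroid classifier are simple stand-ins chosen for brevity, and every name in the code is hypothetical.

```python
import random
import statistics

def make_cohort(n_patients, n_features, informative, rng):
    """Synthetic stand-in for image-derived features: (feature_vector, label) pairs."""
    cohort = []
    for _ in range(n_patients):
        label = rng.randint(0, 1)  # 0 and 1 stand for two hypothetical patient groups
        features = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            features[j] += 2.0 * label  # only these features carry any signal
        cohort.append((features, label))
    return cohort

def rank_features(cohort):
    """Order feature indices by standardized difference in group means, best first."""
    scores = []
    for j in range(len(cohort[0][0])):
        group0 = [f[j] for f, y in cohort if y == 0]
        group1 = [f[j] for f, y in cohort if y == 1]
        spread = statistics.pstdev(group0 + group1) or 1.0
        gap = abs(statistics.mean(group0) - statistics.mean(group1))
        scores.append((gap / spread, j))
    return [j for _, j in sorted(scores, reverse=True)]

def fit_centroids(cohort, selected):
    """Mean of the selected features within each group."""
    centroids = {}
    for label in (0, 1):
        rows = [[f[j] for j in selected] for f, y in cohort if y == label]
        centroids[label] = [statistics.mean(col) for col in zip(*rows)]
    return centroids

def predict(features, selected, centroids):
    """Assign the group whose centroid is closest in the selected-feature space."""
    point = [features[j] for j in selected]
    def squared_distance(label):
        return sum((p - c) ** 2 for p, c in zip(point, centroids[label]))
    return min(centroids, key=squared_distance)

rng = random.Random(0)
informative = [3, 17, 42]                          # assumed, for the synthetic data only
train = make_cohort(200, 50, informative, rng)     # cohort used to pick features
held_out = make_cohort(100, 50, informative, rng)  # independent validation cohort

selected = rank_features(train)[:3]
centroids = fit_centroids(train, selected)
correct = sum(predict(f, selected, centroids) == y for f, y in held_out)
accuracy = correct / len(held_out)
print(sorted(selected), accuracy)
```

The point of the held-out cohort is that features are chosen using the training patients only, so the accuracy measured on the second group is not inflated by the selection step.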
Identifying previously unknown physical characteristics that can predict cancer severity and survival times is also likely to lead to greater understanding of the molecular processes of cancer initiation and progression.
In particular, Snyder anticipates that the machine-learning system described in this study will be able to complement the emerging fields of cancer genomics, transcriptomics and proteomics. Cancer researchers in these fields study the DNA mutations and the gene and protein expression patterns that lead to disease.