Autocoding Cancer

An algorithm developed by the US Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL) in partnership with the National Cancer Institute (NCI) is speeding up classification, or coding, of cancer pathology reports. Early results from the Cancer Moonshot program show that the algorithm dramatically reduces the time it takes for national cancer registries to process reports and, in turn, the time it takes for data to become available for public health monitoring and policymaking.

“The partnership between DOE and NCI is critical because both entities bring skills and resources to the table that, when combined, allow for innovative solutions that bring us closer to near-real-time population cancer surveillance. The compute and security resources we have at ORNL, paired with the domain knowledge and population-level data they have at NCI, result in a unique and valuable system for measuring population health,” said Heidi Hanson, group leader for the lab’s Biostatistics and Multiscale Systems Group, who co-leads the research with principal investigator Georgia Tourassi. Tourassi is director of the National Center for Computational Sciences and the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science user facility at ORNL.

Collaborating with Hanson are Noah Schaefferkoetter, API developer and data manager for MOSSAIC; research scientists Adam Spannaus, Mayanka Chandra Shekar, Zachary Fox, Alina Peluso, and Shamimul Hasan; statistician Kirsen Sullivan; postdoctoral research associates Shalini Priya and Jordan Miller; computational scientist John Gounley; doctoral student Christoph Metzner; and health data engineer Dakota Murdock. Significant algorithm development was done by Shang Gao, now a senior machine learning researcher at Casetext.

Read the full article here: https://www.olcf.ornl.gov/2023/02/10/autocoding-cancer/