A comparison of methods for assessing completeness of case ascertainment in data from the National Program of Cancer Registries

A. Blythe Ryerson, PhD, MPH
Cancer Surveillance Branch
National Center for Chronic Disease Prevention and Health Promotion
Division of Cancer Prevention and Control

2014 NAACCR Annual Conference
Ottawa, Ontario, Canada
June 21-26, 2014

Visit: www.cdc.gov | Contact CDC at: 1-800-CDC-INFO or www.cdc.gov/info
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
CS248056

CONTACT INFORMATION
A. Blythe Ryerson, PhD, MPH
[email protected]
770.488.2426

BACKGROUND
Research and policies meant to reduce the cancer burden depend on cancer surveillance data.
– Data must meet high standards of quality and reliability.
– Completeness of incident case ascertainment is an essential component of quality and reliability.

Incomplete data result when registries are unable to collect accurate information on all incident cases in a defined geographic area within the given timeframe.
– Some cases are missed initially but collected later (delay).
– Some cases are missed completely.

Index of Completeness: quantifies the percentage of actual incident cancer cases reported over a specific geographic area and time period.

Estimating the truth (observed vs. expected)
– The actual (expected) number of incident cases is unobservable.
– The expected cases must be estimated from available data.

METHODS
What’s the best method for estimating the truth?

Completeness = Observed / Expected

I/M Ratio (SEER)
Assumes the ratio of age-adjusted incidence to mortality rates is constant across geographic areas for a given cancer site (19 sites), race (white and black only), and gender group.
Completeness indices are weighted by race (white and black only) and gender, then combined.

Expected Incidence Rate = Registry Mortality Rate × (‘National’ Incidence Rate (SEER) / National Mortality Rate)

Pros:
– Simple method
– Easy to implement
– What is in place currently (NAACCR)
Cons:
– Relies solely on mortality data (does not take into account screening rates or other factors known to influence cancer incidence rates)
– Uses national incidence rates from SEER only
– Makes use of only white and black race groups
– No variance estimates

Fulton JP, Howe HL (1995) Evaluating the use of incidence-mortality ratios in estimating the completeness of cancer registration. In: Howe HL (ed) Cancer incidence in North America, 1988-1990. North American Association of Central Cancer Registries. Springfield, IL, pp V1–V9
Roffers SDJ (1994) Case completeness and data quality assessments in central cancer registries and their relevance to cancer control. In: Howe HL (ed) Cancer incidence in North America, 1988-1990. North American Association of Central Cancer Registries. Springfield, IL, pp V1–V9

[Figure: I/M Ratio (SEER) Completeness, 2012 Submission, by Registry. Y-axis: % Complete; x-axis: Registry.]

I/M Ratio (NPCR)
Can we improve estimates of expected incidence for NPCR registries by utilizing national NPCR incidence rates?

Expected Incidence Rate = Registry Mortality Rate × (‘National’ Incidence Rate (NPCR) / National Mortality Rate)

Pros:
– Simple method
– Easy to implement
Cons:
– Relies solely on mortality data (does not take into account screening rates or other factors known to influence cancer incidence rates)
– Uses national incidence rates from NPCR only
– Makes use of only white and black race groups
– No variance estimates

[Figure: I/M Ratio (NPCR) Completeness, 2012 Submission, by Registry. Y-axis: % Complete; x-axis: Registry.]

Method of Least Squares (Simple Linear Regression)
Fitted line: Y = β0 + β1·X
Where Y = case counts or incidence rates, X = diagnosis year, β0 = intercept, and β1 = slope of the fitted line.

[Figure: Case counts by diagnosis year with fitted line. The expected case count for 2012 is estimated by extrapolating the fitted line and is compared with the observed case count for 2012. Y-axis: Case Counts; x-axis: Diagnosis Year.]

Pros:
– Simple method
– Easy to implement
– Logical
Cons:
– Assumes linear relationship
– Does not take into account correlation of data
– Would not identify “chronic under-reporters”
– No adjustments for confounding

[Figure: SLR (Incidence) Completeness, 2012 Submission, by Registry. Y-axis: % Complete; x-axis: Registry.]

[Figure: Method Comparison for NPCR, November 2013 Submission, All Registries. Y-axis: % Completeness; x-axis: Submission Year. Series: I/M (SEER), I/M (NPCR), SLR (Incidence).]

OTHER METHODS

Modifications of I/M ratio
– Stratification or restriction by other race and/or ethnicities
– Example: Restriction of I/M ratio calculations to Hispanics only in the Puerto Rico data

[Figure: Puerto Rico Completeness, Variations on I/M Method. Y-axis: % Completeness; x-axis: Submission Year (2006 to 2012). Series: SEER, NPCR, SEER-Hispanic Only, NPCR-Hispanic Only.]

Multiple Linear Regression and/or Linear Transformation
– Allow for adjustment for confounding
– Allow non-linear relationships (transformation)
– Weight data for more recent years
Limitations:
– Would not identify “chronic under-reporters”
– Have to identify confounders/residual confounding
– Doesn’t take into account correlated data

Time Series/Dynamic Panel Models
– Models fit to time-series data
– Can be used to make predictions
– Often applied when data show evidence of non-stationarity (trend-like behavior)
– Example: Autoregressive Integrated Moving Average (ARIMA)

Spatial Prediction Models
– Includes mortality rates; can be seen as an extension of the I/M ratio method
– Incorporates information on:
  – Geography
  – Socio-demographics
  – Health
  – Lifestyle factors
– Also includes spatial random effects to enable better predictions in sparse data areas

Das B, Clegg LX, Feuer EJ, Pickle LW. A new method to evaluate the completeness of case ascertainment by a cancer registry. Cancer Causes and Control, Vol. 19, No. 5 (Jun. 2008), pp. 515-525.

FUTURE DIRECTIONS
Find the best model!
– Accurate prediction
– Easy to implement at registry-level
Collaborate with partners.
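
As a worked illustration of the I/M ratio method, the sketch below computes an expected incidence rate and a completeness percentage for a single stratum (one cancer site, race, and gender group). All numbers are hypothetical rates per 100,000, and the function names are invented for this example. The weighting across race and gender strata mentioned on the poster is not reproduced, because the poster does not specify the weights.

```python
def expected_incidence_rate(registry_mortality_rate,
                            national_incidence_rate,
                            national_mortality_rate):
    """Expected rate = registry mortality rate x (national incidence / national mortality)."""
    if national_mortality_rate <= 0:
        raise ValueError("national mortality rate must be positive")
    im_ratio = national_incidence_rate / national_mortality_rate
    return registry_mortality_rate * im_ratio


def completeness_percent(observed, expected):
    """Completeness = observed / expected, expressed as a percentage."""
    if expected <= 0:
        raise ValueError("expected value must be positive")
    return 100.0 * observed / expected


# Hypothetical age-adjusted rates per 100,000 for one stratum (not real data).
registry_mortality = 200.0
national_incidence = 500.0   # 'national' rate, e.g. from SEER or NPCR
national_mortality = 250.0
observed_incidence = 380.0   # rate actually reported by the registry

expected_incidence = expected_incidence_rate(
    registry_mortality, national_incidence, national_mortality
)
im_completeness = completeness_percent(observed_incidence, expected_incidence)

print(f"Expected incidence rate: {expected_incidence:.1f}")
print(f"Completeness: {im_completeness:.1f}%")
```

Switching between the SEER and NPCR variants only changes which 'national' incidence rate is supplied; the arithmetic is the same.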
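
The least-squares approach can be sketched in the same spirit: fit a straight line to case counts from earlier diagnosis years, extrapolate it to 2012, and compare the observed 2012 count with that extrapolated value. The counts and the choice of five fitting years are assumptions made for this example only; the poster does not state which years were used, and `fit_line` is a hypothetical helper, not a function from any registry software.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = b0 + b1 * X; returns (b0, b1)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical case counts by diagnosis year (not real registry data).
years = [2007, 2008, 2009, 2010, 2011]
counts = [1000, 1020, 1040, 1060, 1080]

b0, b1 = fit_line(years, counts)

# Expected 2012 count: extrapolate the fitted line one year forward.
expected_2012 = b0 + b1 * 2012

# Observed 2012 count as submitted (hypothetical).
observed_2012 = 1045

slr_completeness = 100.0 * observed_2012 / expected_2012

print(f"Slope: {b1:.1f} cases per year")
print(f"Expected 2012 count: {expected_2012:.0f}")
print(f"Completeness: {slr_completeness:.1f}%")
```

Because the expected value comes from the registry's own history, a registry that under-reports every year would still look complete under this method, which is the "chronic under-reporters" limitation listed among the cons.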