Data Quality: Data Cleaning
Beverly Musick, M.S.
May 20, 2010

This module was recorded at the health informatics training course (data management series) offered by the Regional East African Centre for Health Informatics (REACH-Informatics) in Eldoret, Kenya. Funding was made possible by NIH’s Fogarty Center. The training was held at the Academic Model Providing Access to Healthcare (AMPATH), a USAID-funded program, supported by the Regenstrief Institute at Indiana University. The modules were created in collaboration with the School of Informatics at IUPUI.

Creative Commons Attribution-ShareAlike 3.0 Unported License
• Quality Control is the process of monitoring and maintaining the reliability, accuracy, and completeness of the data during the conduct of the project.
• Requires a multidisciplinary team which includes clinicians, data entry staff, statisticians, systems administrators, and data managers.
• Requires sharing knowledge about disease progression, clinical practice patterns, effects of medical treatments, relationships between variables and expected timing of events.
Ensuring Data Quality
• Point of Assessment
  – Collection: review form before patient leaves the clinic
  – Entry: range restrictions, logical checks
  – Post-entry clean-up queries
  – Statistical Analysis: data trends
Ensuring Data Quality (cont.)
• To ensure data quality the data manager needs to understand:
  – Goals of program
  – Standards of operation
  – Impact of intervention or program
  – Relationships between variables
  – Expected timing of events
Clean-up Queries
Missing Data
• Generate reports regarding the percent of missing data for each item on the data collection forms
• Highlight differences between programs or specific groups of patients in order to identify methods to minimize missing data
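A missing-data report of this kind could be sketched as follows. This is a minimal illustration, assuming each form is stored as a dictionary and that `None` or an empty string means "missing"; the function name `percent_missing` and the field names (`clinic`, `weight`, `cd4`) are hypothetical and not taken from the course materials.

```python
from collections import defaultdict


def percent_missing(records, items, group_key=None):
    """Return {group: {item: percent missing}}.

    If group_key is None, all records are reported under the group "all".
    A value counts as missing when it is None or an empty string (assumption).
    """
    totals = defaultdict(int)
    missing = defaultdict(lambda: defaultdict(int))
    for rec in records:
        group = rec.get(group_key, "unknown") if group_key else "all"
        totals[group] += 1
        for item in items:
            value = rec.get(item)
            if value is None or value == "":
                missing[group][item] += 1
    return {
        group: {item: 100.0 * missing[group][item] / n for item in items}
        for group, n in totals.items()
    }


# Hypothetical example data: two clinics, two form items.
records = [
    {"clinic": "A", "weight": 61.0, "cd4": None},
    {"clinic": "A", "weight": None, "cd4": 350},
    {"clinic": "B", "weight": 70.5, "cd4": 410},
    {"clinic": "B", "weight": 58.2, "cd4": ""},
]
report = percent_missing(records, ["weight", "cd4"], group_key="clinic")
for clinic, items in sorted(report.items()):
    print(clinic, items)
```

Grouping by clinic (or by program, or by patient group) puts the percentages side by side, which is what makes differences between sites visible.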
Date Comparison
• Ensure that the date of birth precedes all other dates.
• Calculate age and verify that the date of birth makes sense.
• For patients who have died, ensure that the date of death follows all other dates.
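The three date checks above might be expressed as a single function per patient. The sketch below assumes a hypothetical record layout (`date_of_birth`, optional `date_of_death`, and a dictionary of `event_dates`) and an assumed plausibility limit of 110 years; both would need adapting to the actual database.

```python
from datetime import date

MAX_PLAUSIBLE_AGE = 110  # assumed upper limit; adjust for your population


def age_in_years(dob, on_date):
    """Completed years between date of birth and a reference date."""
    years = on_date.year - dob.year
    if (on_date.month, on_date.day) < (dob.month, dob.day):
        years -= 1
    return years


def check_patient_dates(patient, reference_date):
    """Return a list of date problems found in one patient record."""
    problems = []
    dob = patient["date_of_birth"]
    dod = patient.get("date_of_death")
    for label, d in patient["event_dates"].items():
        # An event on the date of birth itself is allowed here (assumption).
        if d < dob:
            problems.append(f"{label} precedes date of birth")
        if dod is not None and d > dod:
            problems.append(f"{label} follows date of death")
    if dod is not None and dod < dob:
        problems.append("date of death precedes date of birth")
    age = age_in_years(dob, reference_date)
    if not 0 <= age <= MAX_PLAUSIBLE_AGE:
        problems.append(f"implausible age: {age}")
    return problems


# Hypothetical patient whose last visit is recorded after the date of death.
patient = {
    "date_of_birth": date(1980, 3, 15),
    "date_of_death": date(2010, 1, 5),
    "event_dates": {
        "enrollment": date(2008, 6, 1),
        "last_visit": date(2010, 2, 10),
    },
}
problems = check_patient_dates(patient, reference_date=date(2010, 5, 20))
print(problems)
```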
Clean-up Queries
Date Comparison (cont.)
• Generate a clean-up list for observation dates that are after today’s date or, preferably, the date of data entry.
• Generate a similar list for observation dates that precede the date of inception of your program.
• Examine the interval between observation/visit dates to ensure that the expected time frame is reflected.
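One way these observation-date lists could be generated is sketched below. The program start date and the expected visit interval (7 to 120 days) are placeholders chosen for illustration, not values from the course; the field names `patient_id`, `visit_date`, and `entry_date` are likewise hypothetical.

```python
from datetime import date

PROGRAM_START = date(2005, 1, 1)      # hypothetical inception date
MIN_GAP_DAYS, MAX_GAP_DAYS = 7, 120   # hypothetical expected visit interval


def date_cleanup_list(visits, today=None):
    """Flag visits dated after data entry (or today) or before program start."""
    today = today or date.today()
    flagged = []
    for v in visits:
        # Prefer the date of data entry; fall back to today if it is absent.
        cutoff = v.get("entry_date") or today
        if v["visit_date"] > cutoff:
            flagged.append((v["patient_id"], v["visit_date"], "after entry date"))
        if v["visit_date"] < PROGRAM_START:
            flagged.append((v["patient_id"], v["visit_date"], "before program start"))
    return flagged


def interval_cleanup_list(visits):
    """Flag consecutive visits whose gap falls outside the expected range."""
    by_patient = {}
    for v in visits:
        by_patient.setdefault(v["patient_id"], []).append(v["visit_date"])
    flagged = []
    for pid, dates in by_patient.items():
        dates.sort()
        for earlier, later in zip(dates, dates[1:]):
            gap = (later - earlier).days
            if not MIN_GAP_DAYS <= gap <= MAX_GAP_DAYS:
                flagged.append((pid, earlier, later, gap))
    return flagged


# Hypothetical visit records.
visits = [
    {"patient_id": 1, "visit_date": date(2009, 1, 10), "entry_date": date(2009, 1, 12)},
    {"patient_id": 1, "visit_date": date(2009, 2, 9), "entry_date": date(2009, 2, 10)},
    {"patient_id": 1, "visit_date": date(2009, 10, 1), "entry_date": date(2009, 10, 2)},
    {"patient_id": 2, "visit_date": date(2011, 5, 1), "entry_date": date(2010, 5, 3)},
    {"patient_id": 3, "visit_date": date(1999, 3, 3), "entry_date": date(2010, 5, 3)},
]
bad_dates = date_cleanup_list(visits)
bad_gaps = interval_cleanup_list(visits)
```

Comparing against the entry date rather than today's date catches errors that would otherwise become invisible once the calendar catches up with the mistyped date.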
Clean-up Queries
Checks on Numeric Data
• Confirm all values are within the expected range.
• Investigate possible outliers by verifying against the source document, comparing with other values for the same subject, or cross-referencing with other variables such as current illnesses in the case of an elevated lab result.
• Confirm that values make sense with respect to patient’s age, gender, disease status, etc.
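A range check and a simple cross-variable check could look like the sketch below. The limits in `EXPECTED_RANGES` are illustrative placeholders only, not clinical reference ranges, and the field names are hypothetical; real limits should come from the clinicians on the team.

```python
# Illustrative limits only (assumption); not clinical reference ranges.
EXPECTED_RANGES = {
    "systolic_bp": (60, 260),
    "hemoglobin": (3.0, 20.0),
}


def out_of_range(records, ranges=EXPECTED_RANGES):
    """List (patient_id, field, value) for values outside the expected range."""
    flagged = []
    for rec in records:
        for field, (low, high) in ranges.items():
            value = rec.get(field)
            if value is not None and not low <= value <= high:
                flagged.append((rec["patient_id"], field, value))
    return flagged


def demographic_conflicts(records):
    """List patient IDs where a value conflicts with recorded sex.

    Only one example rule is shown: pregnancy recorded for a male patient.
    """
    return [
        rec["patient_id"]
        for rec in records
        if rec.get("sex") == "M" and rec.get("pregnant")
    ]


# Hypothetical records with two likely keying errors and one logical conflict.
records = [
    {"patient_id": 1, "sex": "F", "pregnant": True, "systolic_bp": 118, "hemoglobin": 11.2},
    {"patient_id": 2, "sex": "M", "pregnant": True, "systolic_bp": 1200, "hemoglobin": 13.5},
    {"patient_id": 3, "sex": "M", "pregnant": False, "systolic_bp": 130, "hemoglobin": 135},
]
range_flags = out_of_range(records)
conflicts = demographic_conflicts(records)
```

Flagged values are candidates for review against the source document, not automatic corrections.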
Clean-up Queries
Checks on Adult Heights/Weights
• Calculate BMI from height and weight (BMI = weight (kg) / [height (m)]²)
• Most values should be between 10 and 40
• Flag unexpected weight fluctuations
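The BMI check can be sketched in a few lines. BMI is weight in kilograms divided by the square of height in metres; the 10 to 40 range comes from the slide, while the 10% threshold for a weight "fluctuation" between consecutive visits is an assumption made here for illustration.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def bmi_out_of_range(weight_kg, height_m, low=10, high=40):
    """True when the BMI falls outside the expected adult range."""
    return not low <= bmi(weight_kg, height_m) <= high


def weight_jumps(weights_kg, max_fraction=0.10):
    """Flag consecutive weights that differ by more than max_fraction.

    Returns (index, previous, current) for each flagged pair.
    The 10% default is an assumed threshold, not a clinical standard.
    """
    jumps = []
    for i in range(1, len(weights_kg)):
        prev, curr = weights_kg[i - 1], weights_kg[i]
        if abs(curr - prev) / prev > max_fraction:
            jumps.append((i, prev, curr))
    return jumps


print(round(bmi(70, 1.75), 1))      # plausible adult value
print(bmi_out_of_range(70, 175))    # height keyed in cm instead of m
print(weight_jumps([60.0, 61.0, 75.0, 74.0]))
```

A BMI far outside the range often points to a unit error (height in centimetres, weight in pounds) or to height and weight entered in each other's fields.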
Clean-up Queries
Checks on Pediatric Heights/Weights
• Calculate weight-for-age Z-scores using Epi Info NutStat software (http://www.cdc.gov/epiinfo/) or SAS software (http://www.cdc.gov/nccdphp/dnpao/growthcharts/resources/sas.htm)
• Review date of birth, visit date, age and weight for Z-scores less than -5 or greater than 5.
• Similar checks can be made with height-for-age and weight-for-height Z-scores.
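Once Z-scores have been computed by one of the tools above, the review list could be pulled with a filter like the one below. This sketch does not calculate Z-scores itself; it assumes they are already stored on each record under hypothetical keys such as `waz` (weight-for-age), and the example values are invented.

```python
def extreme_z_scores(records, z_key="waz", limit=5.0):
    """Return records whose Z-score is below -limit or above +limit.

    Records with no Z-score (None) are skipped rather than flagged.
    """
    return [
        rec for rec in records
        if rec.get(z_key) is not None and abs(rec[z_key]) > limit
    ]


# Hypothetical records with precomputed weight-for-age Z-scores.
children = [
    {"patient_id": 10, "age_months": 18, "weight_kg": 10.4, "waz": -0.6},
    {"patient_id": 11, "age_months": 24, "weight_kg": 31.0, "waz": 6.8},
    {"patient_id": 12, "age_months": 30, "weight_kg": 4.1, "waz": -7.2},
    {"patient_id": 13, "age_months": 12, "weight_kg": None, "waz": None},
]
to_review = extreme_z_scores(children)
for rec in to_review:
    print(rec["patient_id"], rec["age_months"], rec["weight_kg"], rec["waz"])
```

The same function can be reused for height-for-age or weight-for-height by passing a different `z_key`.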