Epi 202: Designing Clinical Research

Data Management for Clinical Research

Thomas B. Newman, MD, MPH

Professor of Epidemiology & Biostatistics and Pediatrics, UCSF

September 4, 2012

1

Outline

• Data management steps
• Advantages of database vs spreadsheet entry
• REDCap demonstration
• Take-home message: Pretest should include data entry and analysis

2

Data Management Steps

• Design data collection form
• Capture data
• Enter data
• Clean data

Then can do data analysis

3

Traditional Paper method

• Data collection form design -- Word
• Data capture -- pen
• Data entry -- keyboard transcription into Excel
• Data cleaning -- painful

4

Questionnaire from TN’s DCR section 2009

5

Oophorectomy

ID   oophorectomy
204  no
205  yes
207  no
208  no
209  no
211  no
212  yes
214  no
215  no
216  yes (one)
217  no
218  no
219  no

• Advantage of paper form: ability to write in answers you had not anticipated

• Subject might leave it blank or guess if forced to choose

6

Questionnaire from DCR 2009

7

Race coding: Problems

ID   race
204  black
205  hispanic
207  Asian
208  white
209  latina
211  white
212  asian
214  white
215  white
216  black
217  black
218  hispanic
219  white

Free text for “other”: hispanic, latina

“Asian” and “asian” are different values for a string variable
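To make the string-variable problem concrete, here is a minimal Python sketch that counts the distinct race values in the table above before and after cleaning. The recoding table is hypothetical: the category names, and in particular the decision to group "latina" with "hispanic", are assumptions made for illustration and not the study's actual coding scheme.

```python
from collections import Counter

# Raw race entries as transcribed from the paper forms (IDs 204-219).
raw_race = [
    "black", "hispanic", "Asian", "white", "latina", "white", "asian",
    "white", "white", "black", "black", "hispanic", "white",
]

# Hypothetical recoding table (an assumption, not the study's scheme).
RECODE = {
    "black": "Black",
    "white": "White",
    "asian": "Asian",
    "hispanic": "Hispanic/Latina",
    "latina": "Hispanic/Latina",
}


def clean_race(value: str) -> str:
    """Trim and lowercase the entry, then map it to a canonical category."""
    key = value.strip().lower()
    return RECODE.get(key, "Other/unknown")


before = Counter(raw_race)
after = Counter(clean_race(v) for v in raw_race)

print(len(before), "distinct raw values:", dict(before))
print(len(after), "distinct cleaned values:", dict(after))
```

Restricting the answer options on the form itself would avoid the need for this step.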

8

Questionnaire from DCR 2009

9

Weight change

ID   race      weight change  gain/lose
204  black     40             loose
205  hispanic  35             gain
207  Asian     2              blank (+/-)
208  white     10             gain
209  latina    5              gain
211  white     0              lose
212  asian     0
214  white     15             gain
215  white     10             loose
216  black     25             loose
217  black     0
218  hispanic  15             loose
219  white     5-10 pounds    loose

10

Data cleaning before transcription: study staff

Different color ink

Person making changes identified

11

Data cleaning (Stata example)

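replace race = "Asian" if race == "asian"

* weightchange is a string variable here, so assign a string and then convert to numeric
replace weightchange = "7.5" if weightchange == "5-10 pounds"

destring weightchange, replace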
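The same kind of cleaning can be sketched in Python for the weight-change columns shown earlier. This is only an illustration under stated assumptions: the function names are hypothetical, a range such as "5-10 pounds" is replaced by the mean of its endpoints (as in the Stata example), the misspelling "loose" is read as "lose", and a nonzero change with no usable direction is treated as missing.

```python
import re
from typing import Optional


def parse_amount(text: str) -> Optional[float]:
    """Return pounds as a float; a range like '5-10 pounds' becomes the mean of its endpoints."""
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers:
        return None
    return sum(numbers) / len(numbers)


def parse_direction(text: str) -> Optional[int]:
    """Return +1 for gain, -1 for lose (accepting the misspelling 'loose'), else None."""
    word = text.strip().lower()
    if word == "gain":
        return 1
    if word in ("lose", "loose"):
        return -1
    return None


def signed_change(amount_text: str, direction_text: str) -> Optional[float]:
    """Combine amount and direction into one signed number (negative = weight loss)."""
    amount = parse_amount(amount_text)
    if amount is None:
        return None
    if amount == 0:
        return 0.0  # no change, so the direction does not matter
    direction = parse_direction(direction_text)
    if direction is None:
        return None  # nonzero change of unknown direction is left missing
    return direction * amount


print(signed_change("40", "loose"))
print(signed_change("5-10 pounds", "loose"))
print(signed_change("2", "blank (+/-)"))
```

A single signed numeric variable is much easier to analyze than two free-text columns, which is the argument for deciding on the coding before data collection starts.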

12

Questionnaire from DCR 2009

13

Exercise

ID   exercise type      exercise frequency
204  walking            2-4times/week
205  stretch/walk       2-3 days/week
207  walking            3x
208  Curves             3-5 x/week
209  biking             every day
211  walking
212  walking            2x/week
214
215  aerobic-resistant  5-6days/week
216  walking            2x/week
217
218
219  blank              blank

These variables will be hard to analyze. This is what we are trying to avoid.

14

Data cleaning before transcription: study staff

15

Simple coding

Advantages of paper

• Rapid data entry anywhere
• Readily understood
• Permanent record
• Allows ready annotation

16

Disadvantages of paper

• No immediate quality control
• Branching logic harder
• Data entry required
• Allows you to postpone thinking about data analysis when you should be thinking about it now!

17

• Consider data analysis early
• Restrict options
• Provide range and logic checks
• Include coding on the paper form

PRETEST data entry and analysis!
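As a rough illustration of restricted options, range checks, and logic checks, the sketch below validates one weight-change record at entry time. Everything named here is hypothetical: the field names, the permitted direction values, and the 100-pound plausibility limit are assumptions chosen for the example, not values from the course.

```python
from typing import Dict, List

# Hypothetical permitted values and limit; a real study would take these
# from its own data dictionary.
PERMITTED_DIRECTIONS = {"gain", "lose", ""}
MAX_PLAUSIBLE_CHANGE_LB = 100


def check_record(record: Dict[str, object]) -> List[str]:
    """Return a list of problems found in one record (empty list means clean)."""
    problems = []
    change = record.get("weight_change")
    direction = record.get("direction", "")

    # Restricted options: only the listed answers are accepted.
    if direction not in PERMITTED_DIRECTIONS:
        problems.append(f"direction {direction!r} is not a permitted value")

    # Range check: the amount must be a number within plausible limits.
    if not isinstance(change, (int, float)):
        problems.append(f"weight_change {change!r} is not numeric")
    elif not 0 <= change <= MAX_PLAUSIBLE_CHANGE_LB:
        problems.append(f"weight_change {change} is out of range")
    # Logic checks: amount and direction must agree with each other.
    elif change > 0 and direction == "":
        problems.append("nonzero weight_change needs a direction")
    elif change == 0 and direction != "":
        problems.append("zero weight_change should not have a direction")

    return problems


for rec in (
    {"weight_change": 35, "direction": "gain"},
    {"weight_change": 40, "direction": "loose"},
    {"weight_change": "5-10 pounds", "direction": "lose"},
    {"weight_change": 0, "direction": "lose"},
):
    print(rec, "->", check_record(rec) or "OK")
```

Running checks like these on pretest data is one way to find out, before the study starts, which questions produce answers that cannot be analyzed.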

18

Data Dictionary

• Variable name
• Type of variable (binary, integer, real, string, etc.)
• Variable label (longer name)
• Value labels (e.g., 0 = No, 1 = Yes)
• Permitted values
• Notes
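One way to picture a data dictionary is as a small structure with one entry per variable, holding the items listed above. The Python sketch below is an assumption-laden illustration: the class name `Variable`, its field names, the two example entries, and their labels and limits are all hypothetical and are not taken from REDCap or from the course materials.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Variable:
    name: str                     # variable name
    kind: type                    # type of variable (int, float, str, ...)
    label: str                    # variable label (longer name)
    value_labels: Dict[int, str] = field(default_factory=dict)
    minimum: Optional[float] = None   # permitted values: lower bound
    maximum: Optional[float] = None   # permitted values: upper bound
    notes: str = ""

    def permits(self, value: object) -> bool:
        """Check type, then coded values (if any), then the permitted range."""
        if not isinstance(value, self.kind):
            return False
        if self.value_labels and value not in self.value_labels:
            return False
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Hypothetical entries for two of the questionnaire items discussed earlier.
DICTIONARY = {
    "oophorectomy": Variable(
        name="oophorectomy",
        kind=int,
        label="Oophorectomy reported by subject",
        value_labels={0: "No", 1: "Yes"},
    ),
    "weight_change": Variable(
        name="weight_change",
        kind=float,
        label="Weight change in pounds (negative = loss)",
        minimum=-100.0,
        maximum=100.0,
        notes="Limits are illustrative only.",
    ),
}

print(DICTIONARY["oophorectomy"].permits(1))
print(DICTIONARY["oophorectomy"].permits("yes (one)"))
print(DICTIONARY["weight_change"].permits(-7.5))
```

Writing the dictionary first forces a decision about how answers such as "yes (one)" will be coded before any data are collected.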

19

Research Electronic Data Capture (REDCap)

• Design survey or data collection form
• Creates data dictionary
• Can track subjects and responses
• Exports to statistical packages
• Available with MyResearch account
• Other options: Access (PC), Epi-Info (PC), FileMaker Pro

20

REDCap demo

21

Home Page

22

My Projects

23

Project Setup

24

Online Survey Designer

25

Add New Field

26

New Question added

27

REDCap Creates a Stata do file

clear

insheet participant_id redcap_survey_timestamp redcap_survey_identifier mas_or_ticr want_attend_review dates_available___1 dates_available___2 dates_available___3 dates_available___4 field comments survey_complete using "DATA_DCR_FINAL_REVIEW_SESSION_SURVEY_COPY_2_TNEWMAN_2011-08-10-22-39-34.CSV", nonames

label data "DATA_DCR_FINAL_REVIEW_SESSION_SURVEY_COPY_2_TNEWMAN_2011-08-10-22-39-34.CSV"

label define mas_or_ticr_ 1 "No" 2 "Yes ===> Exit this survey"

label define want_attend_review_ 1 "No ====> Exit this survey" 2 "Yes"

label define dates_available___1_ 0 "Unchecked" 1 "Checked"

label define field_ 1 "Clinical pharmacology" 2 "Community medicine" 3 "Dentistry" 4 "Dermatology" 5 "Emergency medicine" 6 "Endocrinology" 7 "Epidemiology/environmental health" 8 "Family medicine" 9 "Global health" 10 "Hospital medicine" 11 "Infectious disease" 12 …

label variable mas_or_ticr "Are you in either the Masters Degree in Clinical Research program or the ATCR (Advanced Training in Clinical Research) program?"

28

Most Important Message:

29

Pretest!

Questions and comments

30

Extra slides

31

Main decisions

• Electronic capture vs paper
• Optical form reading vs keyboard transcription
• Enter data into database, spreadsheet or statistical package

Highly recommended!

32

Advantages of database vs spreadsheet

• Restricts choices
• Error checking
• Can track study progress, produce reports, export to statistical package
• Safer: harder to accidentally alter data
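To show what "restricts choices" and "error checking" can look like in practice, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. The table name, column names, category list, and limits are hypothetical choices for the example. The point is only that a database can refuse a bad value at entry time, whereas a spreadsheet cell accepts anything typed into it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE subject (
        id            INTEGER PRIMARY KEY,
        race          TEXT NOT NULL
                      CHECK (race IN ('Asian', 'Black', 'Hispanic', 'White', 'Other')),
        oophorectomy  INTEGER CHECK (oophorectomy IN (0, 1)),
        weight_change REAL CHECK (weight_change BETWEEN -100 AND 100)
    )
    """
)


def try_insert(row):
    """Insert one row; return None on success or the database's error message."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO subject VALUES (?, ?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


print(try_insert((204, "Black", 0, -40)))   # accepted
print(try_insert((212, "asian", 1, 0)))     # rejected: not one of the listed categories
print(try_insert((204, "White", 0, 10)))    # rejected: duplicate subject ID
```

The exact wording of the error messages depends on the SQLite version, so the sketch prints them without assuming their text.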

33
