Page 1: R Programming for Data Management - Emanate Life Sciences

CONFIDENTIAL 1 February 13, 2019

Eric Morrie, Associate Director of Data Management

February 13, 2019

R Programming for Data Management

Page 2

Emanate Life Sciences

We’re a data management consulting firm with extensive knowledge in the Life Sciences industry. Our mission is to help our clients bring products to market that improve people’s lives.

• We specialize in:

• Data Management Advisory

• Data Management Service Provider

• Contracting / Permanent Placement

[email protected]

Page 3

Introduction

Eric Morrie

Associate Director of Data Management
⎼ Oversee all clinical trial and genetic specimen data used in diagnostic development.

Veracyte, Inc.
⎼ Provide genomic diagnostics to minimize procedure complexity and reduce disease uncertainty for patients with cancer.

The views expressed in this presentation are those of the author/presenter and should not be construed to represent the views of Veracyte, Inc.

Page 4

R Programming for Data Management

Page 5

Why is R attractive?

• Scientific and Statistical Language

• Powerful Regression Capabilities

• Designed for Reproducible Research

• Focused on Ad-hoc Analysis and Data Exploration

• Earlier Adoption of Latest Techniques

• Over 13,000 Extension Packages

• Free and Open Source, with Paid Support Available

Page 6

R Packages

Input/Output

• openxlsx – Read and write Excel

• haven – Read and write BDAT

• SASxport – Read and write XPT

• odbc – Access database sources

Utilities

• Hmisc – SAS utilities and statistics

• keyring – Password encryption

• mailR – Send email

• htmltools – HTML generation

Data Transformation

• data.table – Filter, sort, group, and merge

• dplyr – Summarise and convert

• stringr – Text parsing and conversion

• lubridate – Date parsing and conversion

• reshape2 – Aggregation, split, pivot

Graphing/Layout

• rmarkdown – Page layouts

• ggplot2 – Graphs and figures

• scales – Axis customization

• shiny – Interactive exploration

• shinydashboard – Metrics dashboards

• highcharter – Interactive graphs
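
Package names in R are case-sensitive, so it can help to check what is already installed before a project starts. The following is a minimal sketch, not part of the original slides. It assumes the packages are spelled as their maintainers spell them, and that "ODBC" on the slide refers to the odbc package. Availability on CRAN may have changed since the talk.

```r
# Package names spelled as the packages themselves spell them
# (assumption: "ODBC" on the slide means the odbc package).
pkgs <- c("openxlsx", "haven", "SASxport", "odbc",
          "Hmisc", "keyring", "mailR", "htmltools",
          "data.table", "dplyr", "stringr", "lubridate", "reshape2",
          "rmarkdown", "ggplot2", "scales", "shiny", "shinydashboard",
          "highcharter")

# Which of them are already present in the current library paths?
installed <- pkgs %in% rownames(installed.packages())
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install from the configured repository:
  # install.packages(missing_pkgs)
} else {
  message("All listed packages are installed.")
}
```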

Page 7

Can I use R in clinical trials?

• Second Most Preferred Language for Clinical Research

• FDA does not endorse or require any specific software [1]

• Used by FDA on a daily basis

• FDA contributes R packages

• R is validated for Clinical Trials [3]

Page 8

Comparison To SAS

• Language
⎼ SAS: High-level data manipulation language
⎼ R: Lower-level object-oriented language

• Data Handling
⎼ SAS and R: File, SQL, or Web Services

• Data Processing
⎼ SAS: File-based processing
⎼ R: In-memory processing

• Statistical Capabilities
⎼ SAS: Only vetted methods
⎼ R: Leading edge techniques

• Interface
⎼ SAS and R: Command-line, local GUI, server

• Outputs
⎼ SAS: Formatted Reports
⎼ R: Manuscript Ready Layouts

Page 9

Programming Interface

• Jupyter Notebook
⎼ Supports R, Python, and Scala

• R Tools for Microsoft Visual Studio
⎼ Use R Alongside Microsoft Languages

• RStudio
⎼ Dedicated to R
⎼ Package management is built in
⎼ Supports Markdown and Shiny

Page 10

Data Transformations

• Read data from any source
⎼ Including XPT and BDAT

• Derive and Impute new variables

• Assign categories to continuous data

• Pivot data, including text

• Apply conditionals (If-Then-Else)

• Summary Statistics

• Utilize custom functions
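
The transformations listed above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the presenter's demo code: the `subjects` data, its column names, the `bmi` helper, and the age cut points are all hypothetical. Base R is used to keep the sketch dependency-free; data.table or dplyr would offer equivalent operations. In practice the input might come from `haven::read_xpt()` or `haven::read_sas()` instead of an inline data frame.

```r
# Hypothetical subject-level data used only for illustration.
subjects <- data.frame(
  subject   = c("S001", "S002", "S003", "S004", "S005"),
  cohort    = c(1, 1, 2, 2, 3),
  age       = c(34, 61, 47, 29, 73),
  weight_kg = c(70.5, NA, 82.0, 58.3, 91.2),
  height_cm = c(172, 165, 180, 160, 175),
  stringsAsFactors = FALSE
)

# Custom function: body mass index from weight (kg) and height (cm).
bmi <- function(weight_kg, height_cm) weight_kg / (height_cm / 100)^2

# Impute: flag missing weights, then fill them with the observed median.
subjects$weight_imputed <- is.na(subjects$weight_kg)
subjects$weight_kg[subjects$weight_imputed] <-
  median(subjects$weight_kg, na.rm = TRUE)

# Derive: a new variable computed from existing ones.
subjects$bmi <- bmi(subjects$weight_kg, subjects$height_cm)

# Categorize: bin continuous age into labelled groups.
subjects$age_group <- cut(subjects$age,
                          breaks = c(0, 40, 65, Inf),
                          labels = c("<=40", "41-65", ">65"))

# Conditional (if-then-else), vectorized over rows.
subjects$elderly <- ifelse(subjects$age >= 65, "Y", "N")

# Summary statistics: mean BMI per cohort.
summary_by_cohort <- aggregate(bmi ~ cohort, data = subjects, FUN = mean)

# Pivot text: subject IDs by cohort (rows) and age group (columns).
pivot <- tapply(subjects$subject,
                list(subjects$cohort, subjects$age_group),
                paste, collapse = ", ")

print(summary_by_cohort)
print(pivot)
```

Keeping an explicit `weight_imputed` flag alongside the filled value is one way to keep imputed records traceable in later listings.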

Page 11

Graphs

• Graphing in R ensures reproducibility

• Output format must match rendering
⎼ Use PDF for print outputs
⎼ Use PNG for web outputs

• Extensions for any visualization
⎼ ggplot2
⎼ plotly
⎼ sunburstR
⎼ highcharter (not for printed graphs)
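
One way to match output format to rendering is to keep a single plotting function and send it to a different graphics device per target. The sketch below uses base R graphics so it runs without extra packages; the `enrollment` data, file names, and dimensions are hypothetical. PNG support depends on how R was built, so that step is wrapped in `tryCatch()`.

```r
# Hypothetical enrollment counts used only for illustration.
enrollment <- data.frame(
  month    = 1:6,
  enrolled = c(4, 9, 15, 22, 26, 31)
)

# One plotting function, reused for every output format, so each
# rendering is produced by the same code.
draw_enrollment <- function(data) {
  plot(data$month, data$enrolled, type = "b",
       xlab = "Study month", ylab = "Subjects enrolled",
       main = "Cumulative enrollment")
}

out_dir <- tempdir()

# Vector PDF for print outputs (width and height in inches).
pdf_file <- file.path(out_dir, "enrollment.pdf")
pdf(pdf_file, width = 7, height = 5)
draw_enrollment(enrollment)
dev.off()

# Raster PNG for web outputs (width and height in pixels).
# png_ok is FALSE if this R build cannot open a PNG device.
png_file <- file.path(out_dir, "enrollment.png")
png_ok <- tryCatch({
  png(png_file, width = 800, height = 600, res = 120)
  draw_enrollment(enrollment)
  dev.off()
  TRUE
}, error = function(e) FALSE)
```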

Page 12

Formatted Layouts

• R Markdown for Layouts

• Includes Cover Page and Table of Contents

• Change Page Layout Mid-document

• Custom headers and footers

• Include text, tables, and charts

• Output to Word, PDF, RTF, and more

• Perform Burst Distribution

[Figure: sample report pages for COHORT 1, COHORT 2, and COHORT 3, with the footer "TRANSCEND Study Metrics Page 3 of 7"]
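
Burst distribution can be approximated with a parameterized R Markdown template rendered once per cohort. The following is a minimal sketch, assuming a hypothetical template with a single `cohort` parameter and hypothetical output file names; it is not the report shown on the slide. Rendering is attempted only when the rmarkdown package and pandoc are both available.

```r
# A small parameterized R Markdown template, written to a temp file.
template <- c(
  "---",
  "title: \"Study Metrics\"",
  "output:",
  "  word_document:",
  "    toc: true",
  "params:",
  "  cohort: 1",
  "---",
  "",
  "# Cohort `r params$cohort`",
  "",
  "Metrics for cohort `r params$cohort` would go here."
)
rmd_file <- file.path(tempdir(), "study_metrics.Rmd")
writeLines(template, rmd_file)

# Burst distribution: one output file per cohort from the same template.
cohorts <- 1:3
output_files <- sprintf("study_metrics_cohort_%d.docx", cohorts)

# Render only if rmarkdown and pandoc are both present.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  for (i in seq_along(cohorts)) {
    rmarkdown::render(rmd_file,
                      output_file = output_files[i],
                      output_dir  = tempdir(),
                      params      = list(cohort = cohorts[i]),
                      quiet       = TRUE)
  }
}
```

Word output is used here because it needs only pandoc; a PDF target would additionally require a LaTeX installation.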

Page 13

Analytics

• Shiny provides
⎼ Dashboards
⎼ Data exploration
⎼ Data file upload
⎼ Report download

• Web browser access

• Graphs refresh automatically

• Supports data selection and filtering
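
A small Shiny app illustrates how selection, filtering, and automatic graph refresh fit together. This is a sketch under assumptions, not the app from the demo: the `subjects` data, the `filter_subjects` helper, and the input names are hypothetical. The filtering logic lives in a plain function so it can be checked without starting a web server, and the app is built inside `make_app()` so the shiny package is needed only when the app is launched.

```r
# Hypothetical subject listing used only for illustration.
subjects <- data.frame(
  subject = c("S001", "S002", "S003", "S004", "S005"),
  cohort  = c(1, 1, 2, 2, 3),
  age     = c(34, 61, 47, 29, 73)
)

# Plain function holding the filtering logic.
filter_subjects <- function(data, cohort, min_age) {
  data[data$cohort == cohort & data$age >= min_age, , drop = FALSE]
}

# Build the app lazily; shiny is only required when this is called.
make_app <- function() {
  ui <- shiny::fluidPage(
    shiny::titlePanel("Subject explorer"),
    shiny::selectInput("cohort", "Cohort",
                       choices = sort(unique(subjects$cohort))),
    shiny::sliderInput("min_age", "Minimum age",
                       min = 18, max = 90, value = 18),
    shiny::plotOutput("age_plot"),
    shiny::tableOutput("listing")
  )
  server <- function(input, output, session) {
    # Reactive subset: recomputed whenever either input changes,
    # which is what makes the plot and table refresh automatically.
    selected <- shiny::reactive({
      filter_subjects(subjects, as.numeric(input$cohort), input$min_age)
    })
    output$age_plot <- shiny::renderPlot({
      shiny::req(nrow(selected()) > 0)  # skip plotting an empty subset
      hist(selected()$age, main = "Age distribution", xlab = "Age")
    })
    output$listing <- shiny::renderTable(selected())
  }
  shiny::shinyApp(ui, server)
}

# To launch in a web browser (requires the shiny package):
# shiny::runApp(make_app())
```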

Page 14

References

1. https://blog.revolutionanalytics.com/downloads/FDA-Janice-Brodsky-UseR-2012.pdf

2. https://blog.revolutionanalytics.com/2012/06/fda-r-ok.html

3. https://www.r-project.org/doc/R-FDA.pdf

4. https://channel9.msdn.com/Events/useR-international-R-User-conference/useR2016/Using-R-in-a-regulatory-environment-FDA-experiences

5. https://www.burtchworks.com/2018/07/16/2018-sas-r-or-python-survey-results-which-do-data-scientists-analytics-pros-prefer/

6. https://www.aridhia.com/blog/r-for-researchers-8-essential-cheatsheets-for-research-data-analysis/

7. https://www.rstudio.com/resources/cheatsheets/

Page 15

R Demo

Page 16

Questions & Answers

Eric Morrie

Associate Director of Data Management

[email protected]

Page 17

Upcoming Event

Centralized Monitoring: The Role of Process and Technology!

Wednesday, March 20, 2019

Veeva Systems | Pleasanton CA

5:30-7:30 PM Event Time

www.emanatelifesciences.com

Page 18

Speakers and Volunteers Needed

Speaking Opportunities

• Sensors, Wearables & Biomarkers

• Artificial Intelligence in Clinical Trials

• Improving Site-Study Activation & Performance

• Managing Outsourced Clinical Trials

• Implementing CDISC: CDASH, SDTM, ADaM

• Endpoint Adjudication

• Data Management

Contact Information

Hajime Arnold

[email protected]

Terri Buller

[email protected]