Safety Data Mining: Background and Current Issues. Ramin Arani, PhD, Safety Data Mining, Global Biometric Science, Bristol-Myers Squibb Company. SAMSI: July, 2006. Outline: Rationale for Pharmacovigilance; AERS Data Base; Data base issues; Methodologies; BCPNN (WHO); MGPS (FDA); Summary.
Transcript
Safety Data Mining: Background and Current Issues
Ramin Arani, PhD
Safety Data Mining
Global Biometric Science
Bristol-Myers Squibb Company
SAMSI: July, 2006
Outline
Rationale for Pharmacovigilance
AERS Data Base
Data base issues
Methodologies
BCPNN (WHO)
MGPS (FDA)
Summary
Challenges and Opportunities
Information obtained prior to first marketing is inadequate to cover all aspects of drug safety:
tests in animals are insufficiently predictive of human safety,
in clinical trials patients are selected and limited in number,
conditions of use in trials differ from those in clinical practice,
duration of trials is limited,
information about rare but serious adverse reactions, chronic toxicity, use in special groups or drug interactions is often not available.
Pre-Approval Data: controlled; limited number of patients; safety data not mature
Post-Approval Data: real life; uncontrolled; off-label use; generics
Pharmacovigilance: Set of methods that aim to identify and quantitatively assess the risks related to the use of drugs in the entire population, or in specific population subgroups.
Adverse Drug Reaction: A response to a drug which is harmful and unintended, and which occurs at doses normally used.
Report volume for a drug is affected by volume of use, publicity, type and severity of the event, and other factors; therefore the reporting rate is not a true measure of the rate or the risk.
An observed event may be due to the indication for therapy rather than the therapy itself; therefore observed associations should be viewed as signals, and causal conclusions drawn with caution.
Examples
Claritin and arrhythmias (channeling and need for detailed data not in data base): Increased number of reports due to preexisting condition. Selection of high risk patients for the drug deemed safest for them.
Prozac and suicide (confounding by indication): Large increase in reports following publicity and stimulated reporting.
The Pharmacovigilance Process
Detect Signals: Traditional Methods; Data Mining
Generate Hypotheses
Refute/Verify
Type A (Mechanism-based)
Type B (Idiosyncratic)
Insight from Outliers
Estimate Incidence
Public Health Impact, Benefit/Risk
Act: Inform; Change Label; Restrict use/withdraw
Methodologies
Finding “Interestingly Large” Cell Counts in a Massive Frequency Table
Rows and Columns May Have Thousands of Categories
Most Cells Are Empty, even though N++ Is Very Large
Only 386K out of 1331K Cells Have Nij > 0
174 Drug-Event Combinations Have Nij > 1000
No. Reports   AE 1   …     AE n   Total
Drug 1        N11    …     N1n    N1+
:             :      Nij   :      :
Drug m        Nm1    …     Nmn    Nm+
Total         N+1    …     N+n    N++
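Because most cells are empty, a table like this is naturally held in sparse form. The following is only a minimal sketch, assuming each report mention arrives as a (drug, event) pair; the report list and the drug and event names are made up for illustration and do not come from AERS.

```python
from collections import Counter

# Hypothetical spontaneous reports: one (drug, event) pair per mention.
reports = [
    ("drug_A", "nausea"), ("drug_A", "nausea"), ("drug_A", "rash"),
    ("drug_B", "nausea"), ("drug_B", "headache"), ("drug_C", "rash"),
]

# Sparse table: only cells with N_ij > 0 are stored.
n_ij = Counter(reports)
# Margins N_i+ (per drug) and N_+j (per event), plus the grand total N_++.
n_i = Counter(drug for drug, _ in reports)
n_j = Counter(event for _, event in reports)
n_total = len(reports)

possible_cells = len(n_i) * len(n_j)
print(f"{len(n_ij)} of {possible_cells} cells are non-empty; N++ = {n_total}")
```

A Counter returns 0 for any pair that was never reported, so empty cells cost no storage.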
Method - Basics
Endpoint: No. of AEs
Most use variations of 2-way table statistics
No. Reports   Target AE   Other AE   Total
Target Drug   a           b          a+b
Other Drug    c           d          c+d
Total         a+c         b+d        n
Some possibilities:
Reporting Ratio: E(a) = (a+b)(a+c)/n
Proportional Reporting Ratio: E(a) = (a+b) c / (c+d)
Odds Ratio: E(a) = b c / d
OR > PRR > RR when a > E(a)
Basic idea: Flag when R = a/E(a) is “large”
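The three ratios can be computed directly from the 2x2 cells. This is a small sketch of the formulas above, not a production screening tool; the function name and the counts are invented for illustration, and zero cells (which would cause a division by zero) are not handled.

```python
def disproportionality(a, b, c, d):
    """Return (RR, PRR, OR) as a / E(a) for the 2x2 table with cells a, b, c, d."""
    n = a + b + c + d
    rr = a / ((a + b) * (a + c) / n)     # reporting ratio
    prr = a / ((a + b) * c / (c + d))    # proportional reporting ratio
    odds = a / (b * c / d)               # odds ratio
    return rr, prr, odds

# Made-up counts: a = 20 (target drug, target AE), b = 80, c = 100, d = 9800.
rr, prr, odds = disproportionality(a=20, b=80, c=100, d=9800)
print(f"RR={rr:.2f}  PRR={prr:.2f}  OR={odds:.2f}")  # RR=16.67  PRR=19.80  OR=24.50
```

With these counts a exceeds E(a) under all three definitions, and the ordering OR > PRR > RR shows up as stated.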
Bayesian Approaches
Two current approaches: DuMouchel & WHO
Both use ratio nij / Eij where
nij = no. of reports mentioning both drug i & event j
Eij = expected no. of reports of drug i & event j
Both report features of posterior dist’n of ‘information criterion’
ICij = log2(nij / Eij) = log2(RRij)
Eij usually computed assuming drug i & event j are mentioned independently
Ratio > 1 (IC > 0): combination mentioned more often than expected if independent
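As a rough illustration of the raw quantity that both approaches start from, the sketch below computes Eij under independence from the margins and takes the base-2 log of the observed-to-expected ratio. It gives only the point value, not any feature of a posterior distribution; the function name and counts are hypothetical, and nij = 0 is not handled.

```python
import math

def information_component(n_ij, n_i, n_j, n_total):
    """log2 of observed over expected count, with E_ij taken from independence."""
    e_ij = n_i * n_j / n_total   # expected count if drug and event are independent
    return math.log2(n_ij / e_ij)

# Made-up counts: 20 joint reports, 100 for the drug, 120 for the event, 10000 total.
ic = information_component(n_ij=20, n_i=100, n_j=120, n_total=10000)
print(f"IC = {ic:.2f}")
```

A value above 0 corresponds to a ratio above 1, i.e. more joint mentions than independence would predict.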
DuMouchel, cont’d
Estimate the mixing proportion, a1, b1, a2, b2 using Empirical Bayes -- marginal dist’n of nij is mixture of negative binomials
Posterior density of λij also is mixture of gammas
log2 λij = ICij
Easy to get 5% lower bound (i.e. E(ICij) - 2 SD(ICij))
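The shrinkage this produces can be sketched as follows, assuming the usual gamma-Poisson setup: nij is Poisson with mean λij Eij, and λij has a prior that mixes Gamma(a1, b1) and Gamma(a2, b2) (shape, rate). The symbol p for the mixing proportion, the function names, and the hyperparameter values are all assumptions chosen for illustration; in practice the hyperparameters would be estimated from the data by Empirical Bayes, which this sketch does not do.

```python
import math

def log_neg_binomial(n, a, b, e):
    """Log marginal probability of count n under a Gamma(a, b) prior on lambda."""
    return (math.lgamma(a + n) - math.lgamma(a) - math.lgamma(n + 1)
            + a * math.log(b / (b + e)) + n * math.log(e / (b + e)))

def posterior_mean_lambda(n, e, p, a1, b1, a2, b2):
    """Posterior mean of lambda under a two-component gamma mixture prior."""
    l1 = math.log(p) + log_neg_binomial(n, a1, b1, e)
    l2 = math.log(1 - p) + log_neg_binomial(n, a2, b2, e)
    m = max(l1, l2)                          # subtract the max for numerical stability
    w1, w2 = math.exp(l1 - m), math.exp(l2 - m)
    q = w1 / (w1 + w2)                       # posterior weight of component 1
    # Each posterior component is Gamma(a + n, b + e), with mean (a + n) / (b + e).
    return q * (a1 + n) / (b1 + e) + (1 - q) * (a2 + n) / (b2 + e)

# Hypothetical hyperparameters, not fitted to any data base.
hyper = dict(p=1 / 3, a1=0.2, b1=0.1, a2=2.0, b2=4.0)
small = posterior_mean_lambda(n=1, e=0.01, **hyper)      # raw ratio n/E = 100
large = posterior_mean_lambda(n=1000, e=100.0, **hyper)  # raw ratio n/E = 10
print(small, large)
```

A single report with a tiny expected count has a raw ratio of 100 but is pulled far down toward the prior, while a cell backed by 1000 reports stays close to its raw ratio of 10.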
The control group and the issue of ‘compared to what?’
Signal strategies compare
a drug with itself from prior time periods
with other drugs and events
with external data sources of relative drug usage and exposure
Total frequency count for a drug is used as a relative surrogate for an external denominator of exposure, because it is easy to use, quick and efficient.
Analogy to case-control design where cases are specific AE term, controls are other terms, and outcomes are presence or absence of exposure to a specific drug.
Other useful metrics and methods
Chi-square statistics
P-value type metric: overly influenced by sample size
Modeling association directly through a multivariate Poisson distribution
Incorporation of a prior distribution on some drugs and/or events for which previous information is available, e.g. liver events or pre-market signals
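The sample-size sensitivity of chi-square type metrics is easy to see numerically. The sketch below uses the standard Pearson chi-square for a 2x2 table without continuity correction; the function name and counts are illustrative only.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

small = chi_square_2x2(20, 80, 100, 9800)
large = chi_square_2x2(200, 800, 1000, 98000)  # same proportions, 10x the reports
print(f"{small:.1f} vs {large:.1f}")
```

Multiplying every cell by 10 leaves the reporting ratios unchanged but multiplies the statistic by 10, so a ranking by chi-square or p-value favors heavily reported drugs and events.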
Interpreting the Signal Through the Role of Visual Graphics
Four examples of spatial maps that reduce the scores to patterns and user friendly graphs and help to interpret many signals collectively
Example 1
A spatial map showing the “signal scores” for the most frequently reported events (rows) and drugs (columns) in the database by the intensity of the empirical Bayes signal score (blue color is a stronger signal than purple)
Example 2
Spatial map showing ‘fingerprints’ of signal scores allowing one to visually compare the complexity of patterns for different drugs and events and to identify positive or negative co-occurrences
Example 3
Cumulative scores and numbers of reports according to the year when the signal was first detected for selected drugs
Example 4
Differences in paired male-female signal scores for a specific adverse event across drugs with events reported (red means females greater, green means males greater)
Summary
1. There is NO gold standard method for signal detection.
2. Signals become more stable over time; however, there is a limited time window of opportunity for signal detection.
3. Use time-slice evolution of the signal. Fluctuation might reveal external risk factors. Robustness can be assessed.
4. Consider other endpoints such as time to onset, duration of event, etc.
5. For spontaneous case reports, the means to improve content is to standardize and improve intake.
6. Data mining will likely generate many false positives and affirmations of what was previously known.
7. Causality assessments should largely be reserved for refining important signals.
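The time-slice idea in point 3 can be sketched by recomputing a signal score on cumulative counts year by year. The yearly counts, the helper name, and the choice of the raw log2 ratio as the score are all assumptions made for illustration; any of the scores discussed earlier could be tracked the same way.

```python
import math
from itertools import accumulate

# Hypothetical counts added in each year: (year, n_ij, N_i+, N_+j, N_++).
yearly = [
    (2001, 1, 40, 300, 20000),
    (2002, 4, 60, 320, 21000),
    (2003, 9, 80, 310, 22000),
]

def add_slice(total, new):
    """Add a year's counts to the running totals, keeping the latest year label."""
    return (new[0],) + tuple(p + q for p, q in zip(total[1:], new[1:]))

# Score each cumulative time slice with log2(observed / expected under independence).
trend = [
    (year, math.log2(n / (n_i * n_j / n_tot)))
    for year, n, n_i, n_j, n_tot in accumulate(yearly, add_slice)
]
for year, score in trend:
    print(year, round(score, 2))
```

Plotting such a trend makes it easier to see whether a score is stabilizing or reacting to an external event such as publicity.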
Challenges in the future
More real time data analysis
More interactivity (visual data mining, e.g. ggobi)
Linkage with other data bases to control the bias inherent in the data base
Quality control strategies (e.g. identifying duplicates)
Methods to reduce false positives and false negatives