1 Identifying Differentially Regulated Genes Nirmalya Bandyopadhyay, Manas Somaiya, Sanjay Ranka, and Tamer Kahveci Bioinformatics Lab., CISE Department, University of Florida

Jan 06, 2018


Gregory Banks

Transcript
Page 1:

Identifying Differentially Regulated Genes

Nirmalya Bandyopadhyay, Manas Somaiya, Sanjay Ranka, and Tamer Kahveci

Bioinformatics Lab., CISE Department, University of Florida

Page 2:

Gene interaction through regulatory networks

• Gene networks: The genes are nodes and the interactions are directed edges.

• Neighbors: each gene has incoming neighbors and outgoing neighbors.

• A gene can change the state of other genes:
– Activation
– Inhibition

[Figure: example gene network with nodes K-Ras, Raf, MEK, ERK, JNK, RalGDS, Ral, RalBP1, PLD1, and Cdc42/Rac.]
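A minimal sketch of how such a network could be stored as directed edges, with incoming and outgoing neighbors looked up per gene. The edge list and interaction types below are illustrative assumptions (they are not read from the figure), and `build_neighbors` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical edge list: (source gene, target gene, interaction type).
# The edges are illustrative only; the figure's actual edges are not
# recoverable from the transcript.
EDGES = [
    ("K-Ras", "Raf", "activation"),
    ("Raf", "MEK", "activation"),
    ("MEK", "ERK", "activation"),
    ("K-Ras", "RalGDS", "activation"),
]


def build_neighbors(edges):
    """Return (incoming, outgoing) neighbor maps for a directed gene network."""
    incoming = defaultdict(set)
    outgoing = defaultdict(set)
    for src, dst, _kind in edges:
        outgoing[src].add(dst)  # dst is an outgoing neighbor of src
        incoming[dst].add(src)  # src is an incoming neighbor of dst
    return incoming, outgoing


incoming, outgoing = build_neighbors(EDGES)
print(sorted(outgoing["K-Ras"]))
print(sorted(incoming["MEK"]))
```

Keeping two maps makes both neighbor lookups constant-time, which is convenient because the later slides reason about incoming and outgoing neighbors separately.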

Page 3:

Perturbation experiments

• In a perturbation experiment, a stimulant (radiation, toxic element, medication), also known as a perturbation, is applied to tissues.

• Gene expression is measured before and after the perturbation.

• A gene can change its expression as a result of the perturbation.

• Differentially expressed (DE) gene: its expression changes as a result of the perturbation.

• Equally expressed (EE) gene: its expression does not change.

[Figure: the gene network with a perturbation applied; legend: differentially expressed genes.]

Page 4:

Perturbation experiment: single dataset

• Primarily affected genes: directly affected by the perturbation.

• Secondarily affected genes: affected indirectly, when primarily affected genes in turn affect other genes.

[Figure: the gene network with a perturbation applied; legend: primarily affected genes, secondarily affected genes.]

Page 5:

Differentially and equally regulated

• Some datasets inherently have two groups.
– Fasting vs. non-fasting, Caucasian American vs. African American

• For these datasets, a gene is
– Differentially regulated (DR): DE in one group and EE in the other.
– Equally regulated: DE in both groups or EE in both groups.
– Here, gene g1 is DE in data $D_A$ and EE in $D_B$. Hence, it is DR.

[Figure: the network over g1 to g5 shown for data groups $D_A$ and $D_B$; legend: differentially expressed, equally expressed.]
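The definition above reduces to a comparison of a gene's two per-group states. A small sketch, assuming states are given as the strings "DE" and "EE" (the function name is hypothetical):

```python
def is_differentially_regulated(state_a, state_b):
    """True if the gene is DE in exactly one of the two data groups."""
    for state in (state_a, state_b):
        if state not in ("DE", "EE"):
            raise ValueError(f"unknown state: {state!r}")
    # DR means the two groups disagree; equally regulated means they agree.
    return state_a != state_b


# Gene g1 from the slide: DE in D_A and EE in D_B, hence DR.
print(is_differentially_regulated("DE", "EE"))
```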

Page 6:

Two datasets: Primary and secondary effects

• Primarily differentially regulated genes (PDR): directly affected by the perturbation.

• Secondarily differentially regulated genes (SDR): affected indirectly, when primarily affected genes in turn affect other genes.

[Figure: the network over g1 to g5 with an added node g0, shown for data groups $D_A$ and $D_B$; legend: primarily differentially expressed, secondarily differentially expressed, equally expressed.]

Page 7:

Problem & method

• Input: gene expression (control and non-control) of two data groups $D_A$ and $D_B$.

• Problem: analyze the primary and secondary effects of the perturbation.
– Estimate the probability that a gene is differentially regulated because of the perturbation or because of other genes (its incoming neighbors).
– Which genes are primarily differentially regulated?

• Method
– A probabilistic Bayesian method that employs a Markov random field to leverage domain knowledge.

Page 8:

Notation

• Observed variables
– Microarray datasets:
  • Two data groups: $D_A$, $D_B$
  • A single gene $g_i$ in group $C$ ($C \in \{A, B\}$): $Y_{Ci} = \{y_{Ci}, y'_{Ci}\}$
  • All genes in group $C$: $Y_C = \bigcup_{i=1}^{M} Y_{Ci}$
– Neighborhood variables: $W_{ij} = 1$ if there is an edge from $g_i$ to $g_j$, and $0$ otherwise.

• Hidden variables
– State variables: $S_i = \mathrm{DE}$ if $g_i$ is DE, and $S_i = \mathrm{EE}$ if $g_i$ is EE (written $S_{Ai}$ and $S_{Bi}$ for the two data groups).
– Regulation variables: $Z_i$
– Interaction variables: $X_{ij}$

S_Ai  S_Bi  S_Aj  S_Bj  Z_i  Z_j  X_ij
DE    DE    DE    DE    1    1    1
DE    DE    DE    EE    1    2    2
DE    DE    EE    DE    1    3    3
DE    DE    EE    EE    1    4    4
DE    EE    DE    DE    2    1    5
DE    EE    DE    EE    2    2    6
DE    EE    EE    DE    2    3    7
DE    EE    EE    EE    2    4    8
EE    DE    DE    DE    3    1    9
EE    DE    DE    EE    3    2    10
EE    DE    EE    DE    3    3    11
EE    DE    EE    EE    3    4    12
EE    EE    DE    DE    4    1    13
EE    EE    DE    EE    4    2    14
EE    EE    EE    DE    4    3    15
EE    EE    EE    EE    4    4    16
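The table follows a regular pattern: $Z$ enumerates the four (state in A, state in B) pairs, and $X_{ij} = 4(Z_i - 1) + Z_j$. The sketch below encodes that pattern as read off the table; the function names are hypothetical and not taken from the slides.

```python
STATES = ("DE", "EE")  # order matters: DE maps to 0, EE maps to 1


def regulation_variable(s_a, s_b):
    """Z: (DE,DE) -> 1, (DE,EE) -> 2, (EE,DE) -> 3, (EE,EE) -> 4."""
    return 2 * STATES.index(s_a) + STATES.index(s_b) + 1


def interaction_variable(z_i, z_j):
    """X_ij in 1..16, combining the regulation variables of g_i and g_j."""
    return 4 * (z_i - 1) + z_j


def decode_interaction(x_ij):
    """Inverse of interaction_variable: return (Z_i, Z_j)."""
    z_i, z_j = divmod(x_ij - 1, 4)
    return z_i + 1, z_j + 1


# Table row "DE EE EE DE": Z_i = 2, Z_j = 3, X_ij = 7.
z_i = regulation_variable("DE", "EE")
z_j = regulation_variable("EE", "DE")
print(z_i, z_j, interaction_variable(z_i, z_j))
```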

Page 9:

Problem formulation

• Input to the problem:
– Microarray expression: $Y$
– Gene network $V = \{G, W\}$
  • $G = \{g_0, g_1, g_2, \dots, g_M\}$, where $g_0$ is the metagene.

• Goal:
– Estimate the density $p(X_{ij} \mid X \setminus X_{ij}, Y, V, W_{ij} = 1)$ for every pair with $W_{ij} = 1$. This density estimates the probability that a gene is DR due to the perturbation or due to an incoming neighbor gene.
– Note: a higher value of $p(X_{ij} \in \{2, 3\} \mid X \setminus X_{ij}, Y, V, W_{ij} = 1)$ indicates a higher chance that $g_j$ is affected by $g_i$.

Page 10:

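Bayesian distribution

• We propose a Bayesian model because it allows us to incorporate our beliefs into the model.
– The joint probability distribution over $X$:

$$p(X \mid Y, V, \theta_X, \theta_Y) = \frac{p(Y \mid X, V, \theta_Y)\, p(X \mid V, \theta_X)}{\sum_{X} p(Y \mid X, V, \theta_Y)\, p(X \mid V, \theta_X)}$$

  Here $p(X \mid Y, V, \theta_X, \theta_Y)$ is the posterior density, $p(Y \mid X, V, \theta_Y)$ is the likelihood density, and $p(X \mid V, \theta_X)$ is the prior density.
– We can derive the density of $X_{ij}$, $p(X_{ij} \mid X \setminus X_{ij}, Y, V, W_{ij} = 1)$, from the joint density function.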
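As a hedged sketch of that last step (standard conditioning, not taken from the slides): the conditional density of a single interaction variable follows by normalizing the joint posterior over the 16 values that $X_{ij}$ can take, with all other variables held fixed.

```latex
p(X_{ij} = x \mid X \setminus X_{ij}, Y, V, W_{ij} = 1)
  = \frac{p(X_{ij} = x,\; X \setminus X_{ij} \mid Y, V, \theta_X, \theta_Y)}
         {\sum_{x'=1}^{16} p(X_{ij} = x',\; X \setminus X_{ij} \mid Y, V, \theta_X, \theta_Y)}
```

The normalizing sum of the joint posterior appears in both numerator and denominator and cancels, so only the product of likelihood and prior needs to be evaluated for each of the 16 candidate values.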

Page 11:

Prior density function: Markov random field

• The MRF is an undirected graph $\Psi = (X, E)$.
– $X = \{X_{ij}\}$, where each $X_{ij}$ represents an edge in the gene network.
– $E = \{(X_{ij}, X_{pj}) \mid W_{pi} = W_{ij} = 1\} \cup \{(X_{ij}, X_{ik}) \mid W_{jk} = W_{ij} = 1\}$

• An edge in the MRF corresponds to two edges in the gene network.
– $(X_{23}, X_{25})$ corresponds to $(g_2, g_3)$ and $(g_3, g_5)$.

[Figure: (a) gene network over g0 to g5 for data groups $D_A$ and $D_B$; (b) Markov random field with nodes X01 (2), X02 (1), X03 (1), X04 (4), X05 (3), X12 (5), X13 (5), X14 (8), X23 (1), X25 (7), X35 (3).]
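The edge set $E$ can be enumerated directly from the gene-network edges. The sketch below follows the definition as stated on the slide; it assumes a network without self-loops and skips degenerate pairs from two-gene cycles, and `mrf_edges` is a hypothetical name.

```python
def mrf_edges(gene_edges):
    """Build MRF edges from directed gene-network edges (i, j), i.e. W_ij = 1.

    Each MRF node X_ij is represented by the tuple (i, j); each MRF edge is
    a frozenset holding two such tuples.
    Assumes no self-loops (i, i) in gene_edges.
    """
    w = set(gene_edges)
    edges = set()
    for (i, j) in w:
        for (p, q) in w:
            # W_pi = 1: p is an incoming neighbor of i -> link X_ij with X_pj.
            if q == i and p != j:
                edges.add(frozenset({(i, j), (p, j)}))
            # W_jk = 1 (k = q): k is an outgoing neighbor of j -> link X_ij with X_ik.
            if p == j and q != i:
                edges.add(frozenset({(i, j), (i, q)}))
    return edges


# Slide example: gene edges (g2, g3) and (g3, g5) give the MRF edge (X23, X25).
print(mrf_edges([(2, 3), (3, 5)]))
```

On a chain g1 → g2 → g3 → g4 this yields four MRF edges over the nodes X12, X13, X23, X24, and X34.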

Page 12:

Prior density function: Feature functions

• Three beliefs relevant to our model:
– In a data group, the metagene $g_0$ can affect the states of all other genes (modeled by adding directed edges from $g_0$ to all other genes).
– In a data group, a gene can affect the state of its outgoing neighbors.
– A gene has a high probability of being equally regulated.

• We incorporate these beliefs into the MRF graph using seven feature functions.

• Feature function: a unary or binary function over the nodes of the MRF. A feature function allows us to introduce our belief on the graph.

Page 13:

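Feature Functions

• Unary: capture the frequency of $X_{ij}$.
– $F_1(X_{ij}) = 1$ if $X_{ij} = 2$, and $0$ otherwise.
– $F_2(X_{ij}) = 1$ if $X_{ij} = 3$, and $0$ otherwise.
– $F_3(X_{ij})$: defined in terms of $F_1(X_{ij})$ and $F_2(X_{ij})$ (the operators combining them are not recoverable from the transcript).

• Binary: encapsulate the second belief that, in a data group, a gene can affect the state of its outgoing neighbors.
– Left External Equality: $F_4(X_{ij}) = \sum_{p:\, W_{pi} = 1,\, W_{ij} = 1} f_4(X_{ij}, X_{pj})$
– Right External Equality: $F_5(X_{ij}) = \sum_{k:\, W_{ij} = 1,\, W_{jk} = 1} f_5(X_{ij}, X_{ik})$

• Unary: capture the third belief that a gene has a high probability of being equally regulated.
– Left Internal Equality: $F_6(X_{ij}) = \sum_{t \in \{1, \dots, 4, 13, \dots, 16\},\, W_{ij} = 1} f_6(X_{ij}, t)$
– Right Internal Equality: $F_7(X_{ij}) = \sum_{t \in \{1, 4, 5, 8, 9, 12, 13, 16\},\, W_{ij} = 1} f_7(X_{ij}, t)$

• Prior density function:

$$p(X \mid \theta_X) = \frac{1}{\Delta} \exp\Big( \sum_{i, j:\, W_{ij} = 1} \; \sum_{k \in \{1, 2, \dots, 7\}} \gamma_k F_k(X_{ij}) \Big)$$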

Page 14:

Binary: External feature functions

• The external feature functions encapsulate the belief that, in a data group, a gene can affect the state of its outgoing neighbors.

• Left Equality: $X_{ij} = X_{pj} \Rightarrow Z_i = Z_p$

• Right Equality: $X_{ij} = X_{ik} \Rightarrow Z_j = Z_k$

[Figure: (a) gene network over g1, g2, g3, g4; (b) MRF network with nodes X12, X13, X23, X24, X34, marking the left equality and the right equality for X23.]

Page 15:

Unary: Internal feature functions

• The internal feature functions represent the belief that a gene has a high probability of being equally regulated.

• $g_i$ is equally regulated:
– $X_{ij} \in \{1, 2, 3, 4\}$: $Z_i = 1$ (DE in both groups)
– $X_{ij} \in \{13, 14, 15, 16\}$: $Z_i = 4$ (EE in both groups)

• $g_j$ is equally regulated:
– $X_{ij} \in \{1, 5, 9, 13\}$: $Z_j = 1$ (DE in both groups)
– $X_{ij} \in \{4, 8, 12, 16\}$: $Z_j = 4$ (EE in both groups)
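These value sets can be checked mechanically. The sketch below assumes the encoding $X_{ij} = 4(Z_i - 1) + Z_j$ read off the notation table, and enumerates the values of $X_{ij}$ for which $g_i$ (respectively $g_j$) is equally regulated; the names are hypothetical.

```python
def decode(x_ij):
    """Return (Z_i, Z_j) for X_ij in 1..16, assuming X_ij = 4 * (Z_i - 1) + Z_j."""
    z_i, z_j = divmod(x_ij - 1, 4)
    return z_i + 1, z_j + 1


# Z = 1 means DE in both groups, Z = 4 means EE in both groups.
EQUALLY_REGULATED = {1, 4}

# Values of X_ij for which g_i is equally regulated (left internal equality).
left_internal = {x for x in range(1, 17) if decode(x)[0] in EQUALLY_REGULATED}
# Values of X_ij for which g_j is equally regulated (right internal equality).
right_internal = {x for x in range(1, 17) if decode(x)[1] in EQUALLY_REGULATED}

print(sorted(left_internal))
print(sorted(right_internal))
```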

Page 16:

Objective function optimization

• Obtain an initial estimate of the state variables.

• Estimate parameters for the likelihood density.

• Estimate parameters that maximize the prior density.

• Estimate parameters that maximize the pseudo-likelihood density.

• Rank the DE genes based on the likelihood with respect to the metagene.

[Flowchart labels: ICM, differential evolution, Student's t.]
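The flowchart names Student's t without showing how it is applied. Assuming it provides the initial DE/EE estimate for each gene, a pooled-variance two-sample t statistic could be computed as sketched below. The threshold of 2.0 and both function names are illustrative assumptions, not values from the slides.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(control, treated):
    """Pooled-variance two-sample t statistic (needs nonzero pooled variance)."""
    n1, n2 = len(control), len(treated)
    pooled = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / (n1 + n2 - 2)
    return (mean(control) - mean(treated)) / sqrt(pooled * (1 / n1 + 1 / n2))


def initial_state(control, treated, threshold=2.0):
    """Label a gene 'DE' if |t| exceeds a (hypothetical) threshold, else 'EE'."""
    return "DE" if abs(two_sample_t(control, treated)) > threshold else "EE"


# Hypothetical expression values for one gene, before and after perturbation.
print(initial_state([1.0, 1.1, 0.9, 1.0], [3.0, 3.1, 2.9, 3.0]))
print(initial_state([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```

In practice the cutoff would come from a significance level and the degrees of freedom rather than a fixed constant.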

Page 17:

Dataset and experimental setup

• Dataset
– Real: adapted from Smirnov et al., generated using 10 Gy ionizing radiation over immortalized B cells obtained from 155 donors.
– Real/Synthetic: we created synthetic data to simulate the perturbation experiment based on the real dataset. The simulation model is taken from "Modeling of Multiple Valued Gene Regulatory Networks" by Garg et al.
– Gene regulatory network: 24,663 genetic interactions over 2,335 genes collected from the KEGG database.

• Experimental setup
– Implemented our method in MATLAB and Java.
– Ran our code on a quad-core AMD Opteron 2 GHz workstation with 32 GB memory.

Page 18:

Comparison with other methods

• We compared our method with three other methods:
– SMRF: our earlier method, developed to analyze the effect of external perturbation on a single data group.
– SSEM: a method to differentiate between primary and secondary effects of perturbation on a gene expression dataset.
– Two-sample t-test (Student's t-test)

Page 19:

Comparison with other methods


Page 20:

Conclusions

• Our method could find primarily affected genes with high accuracy.

• It achieved significantly better accuracy than SMRF, SSEM, and the Student's t-test.

• Our method produces a probability distribution rather than a fixed binary decision.

Page 21:

Acknowledgement

This work was supported partially by NSF under grants CCF-0829867 and IIS-0845439.

Page 22:

Thank you!