Gao Song, 2010/02/03. Outline: Background Knowledge, Problem Description, Framework of Solution, Own Methods, Results.

Post on 04-Jan-2016


Comparative Assembly for Cancer Human Genome

Gao Song

2010/02/03

Background Knowledge
Problem Description
Framework of Solution
Own Methods
Results

Content

Paired-End Tag (PET)

Background Knowledge

Concordant PET (CPET)

Discordant PET (DPET)
◦ Distance or orientation is incorrect
◦ The two tags map to different chromosomes

DPET Cluster

Background Knowledge
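
The CPET/DPET distinction above can be sketched as a small classifier. This is only an illustration: the `MappedTag` type, the `classify_pet` function, the span limits and the expected orientation (left tag on "+", right tag on "-") are all assumptions, since the slides do not state the library protocol.

```python
from dataclasses import dataclass


@dataclass
class MappedTag:
    chrom: str   # chromosome the tag maps to
    pos: int     # mapped position on that chromosome
    strand: str  # "+" or "-"


def classify_pet(tag1, tag2, min_span, max_span):
    """Return "CPET" or "DPET" for one mapped tag pair."""
    # Tags on different chromosomes are always discordant.
    if tag1.chrom != tag2.chrom:
        return "DPET"
    left, right = sorted((tag1, tag2), key=lambda t: t.pos)
    # Assumed expected orientation; a real library may differ.
    if (left.strand, right.strand) != ("+", "-"):
        return "DPET"
    # The distance must fall inside the expected library span.
    span = right.pos - left.pos
    if not (min_span <= span <= max_span):
        return "DPET"
    return "CPET"
```

For a hypothetical 10k library one might call `classify_pet(a, b, 8000, 12000)`; the limits here are placeholders, not values from the slides.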

Given:
◦ Frequency of DPET and CPET along the reference genome
◦ DPET clusters

Requirement:
◦ Find rearrangements of the cancer genome compared to the normal human genome
◦ Current focus: amplicons

Problem Description

The reference genome is cut wherever the CPET frequency is 0 => some big contigs
According to DPET, find the breakpoints
Use CPET to check whether there is a connection between breakpoints
Convert DPET clusters into edges in the graph
Use high-copy edges to form subgraphs of amplicons

Framework of Solution
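
The first steps of the framework (cut the reference where CPET is 0, then split the resulting contigs at breakpoints) might look roughly like the sketch below. The function names and the per-position coverage list are hypothetical; the slides do not say how coverage is stored or binned.

```python
def cut_into_contigs(cpet_coverage):
    """Return half-open (start, end) intervals where CPET coverage is > 0."""
    contigs = []
    start = None
    for pos, cov in enumerate(cpet_coverage):
        if cov > 0 and start is None:
            start = pos                    # a covered run begins
        elif cov == 0 and start is not None:
            contigs.append((start, pos))   # coverage dropped to 0: cut here
            start = None
    if start is not None:
        contigs.append((start, len(cpet_coverage)))
    return contigs


def split_at_breakpoints(contigs, breakpoints):
    """Split each contig at the breakpoints that fall strictly inside it."""
    small = []
    for start, end in contigs:
        cuts = sorted(b for b in set(breakpoints) if start < b < end)
        bounds = [start] + cuts + [end]
        small.extend(zip(bounds[:-1], bounds[1:]))
    return small
```

The small contigs returned by `split_at_breakpoints` would then serve as the nodes of the graph.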

Framework of Solution

[Flow diagram. Box labels: DPET Start and End, Breakpoint, CPET, Filtered Breakpoints, Reference Genome, Original Contigs, Small Contigs, DPET, CPET, Edges, Nodes, Graph]
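
The last two framework steps (DPET clusters become edges, high-copy edges form amplicon subgraphs) can be illustrated with a connected-components pass. This is a minimal sketch under assumptions: the `(node_a, node_b, copy_count)` edge format, the `min_copy` cutoff and the use of union-find are illustrative choices, not details given in the slides.

```python
from collections import defaultdict


def amplicon_subgraphs(edges, min_copy):
    """Group high-copy edges into connected subgraphs.

    edges: iterable of (node_a, node_b, copy_count).
    Returns a list of (nodes, kept_edges) pairs, one per subgraph.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Keep only edges whose copy count reaches the cutoff.
    kept = [(a, b, c) for a, b, c in edges if c >= min_copy]
    for a, b, _ in kept:
        parent[find(a)] = find(b)

    groups = defaultdict(lambda: (set(), []))
    for a, b, c in kept:
        nodes, group_edges = groups[find(a)]
        nodes.update((a, b))
        group_edges.append((a, b, c))
    return list(groups.values())
```

Each returned pair holds the contigs (nodes) and DPET-cluster edges of one candidate amplicon subgraph.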

DPET frequency curve
Use the DPET frequency directly: choose a threshold to select the breakpoints

Problem:
◦ How to choose the threshold
◦ Within an amplicon region it is hard to find the breakpoint, because the baseline frequency is already too high

Own Methods - Naive

[Plot: Chromosome 9]
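
A minimal sketch of the naive approach, assuming the DPET frequency is available as a list of per-bin counts (the function name and the binning are hypothetical):

```python
def breakpoints_by_threshold(dpet_freq, threshold):
    """Return bin indices where the DPET frequency rises to or above threshold."""
    hits = []
    above = False
    for i, freq in enumerate(dpet_freq):
        if freq >= threshold and not above:
            hits.append(i)  # first bin of a run above the threshold
        above = freq >= threshold
    return hits
```

On a toy curve such as `[20, 21, 20, 30, 21]` with threshold 5, only bin 0 is reported and the jump at bin 3 is missed, which is the amplicon-region problem the slide describes.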

Use the slope (derivative) of the DPET frequency curve

Problem:
◦ How to define the threshold
◦ Too many false positives
◦ Also misses some DPET clusters

Own Methods - Slope

[Plot: Chromosome 9]
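
The slope idea can be sketched with a first difference between neighbouring bins. The function name and the `min_slope` parameter are assumptions for illustration; the slides do not say how the derivative was computed.

```python
def breakpoints_by_slope(dpet_freq, min_slope):
    """Return bin indices where the frequency rises by at least min_slope."""
    return [
        i
        for i in range(1, len(dpet_freq))
        if dpet_freq[i] - dpet_freq[i - 1] >= min_slope
    ]
```

Unlike a plain threshold, this reacts to the jump inside a high-baseline region (for example bin 3 of `[20, 21, 20, 30, 21]`), but `min_slope` is still a parameter that has to be chosen.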

At a breakpoint, DPET increases and CPET decreases
This can be used as another criterion

Problem:
◦ Another parameter!

Own Method – Consider Ratio
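
One way to express "DPET increases, CPET decreases" as a single number is the DPET fraction per bin. This is a sketch under that assumption; the slides do not define the ratio, and `min_ratio` is exactly the extra parameter they warn about.

```python
def dpet_ratio(dpet_freq, cpet_freq):
    """Fraction of PETs in each bin that are discordant (0.0 for empty bins)."""
    ratios = []
    for d, c in zip(dpet_freq, cpet_freq):
        total = d + c
        ratios.append(d / total if total > 0 else 0.0)
    return ratios


def breakpoints_by_ratio(dpet_freq, cpet_freq, min_ratio):
    """Return bin indices whose DPET fraction is at least min_ratio."""
    ratios = dpet_ratio(dpet_freq, cpet_freq)
    return [i for i, r in enumerate(ratios) if r >= min_ratio]
```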

Use the slope to find the threshold
The previously missed point can now be found

New methods of finding breakpoints

Localized checking
Using two consecutive windows
◦ Each window has: μ, σ
◦ Null hypothesis: σ2 is not significantly larger than σ1
◦ Using binomial testing, significance level: 0.05

Own Method – Hypothesis Testing

[Figure: two consecutive windows, window1 and window2]
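
The slides do not spell out how the binomial test is applied to σ1 and σ2, so the sketch below swaps in a simpler, related test: it compares DPET counts in two equal-width consecutive windows. Under the null hypothesis each DPET is equally likely to fall in either window, and the one-sided tail probability is checked against the 0.05 level. The function names and the count-based formulation are assumptions, not the author's method.

```python
from math import comb


def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def window2_is_enriched(count1, count2, alpha=0.05):
    """One-sided test: is window2's count significantly larger than window1's?"""
    n = count1 + count2
    if n == 0:
        return False  # no observations, nothing to test
    return binomial_tail(count2, n) < alpha
```

Sliding the window pair along the reference and reporting positions where `window2_is_enriched` is true would give candidate breakpoints without a hand-picked frequency threshold.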

Some details:
◦ Check whether the cluster region is included in the window

Not finished yet
Calculating σ is time-consuming: it has to be recalculated after each step

Own Method – Hypothesis Testing

Results (slope)

                                       10k lib   20k lib
# of subgraphs                         72        35
Max chromosomes in one subgraph        4         4
Average chromosomes in one subgraph    1.18      1.23
Max edges in one subgraph              42        44
Average edges in one subgraph          5.47      5.77
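
Statistics of this kind could be computed from the subgraphs roughly as follows. The input format (each subgraph as a list of edges between `(chromosome, position)` nodes) and the function name are hypothetical, and the sketch is only meant to be run on toy data; it does not reproduce the numbers in the table.

```python
def summarize_subgraphs(subgraphs):
    """Summarize a list of subgraphs.

    subgraphs: list of edge lists; each edge is
    ((chrom_a, pos_a), (chrom_b, pos_b)).
    """
    # Number of distinct chromosomes touched by each subgraph.
    chrom_counts = [
        len({node[0] for edge in sg for node in edge}) for sg in subgraphs
    ]
    edge_counts = [len(sg) for sg in subgraphs]
    n = len(subgraphs)
    return {
        "subgraphs": n,
        "max_chromosomes": max(chrom_counts, default=0),
        "avg_chromosomes": sum(chrom_counts) / n if n else 0.0,
        "max_edges": max(edge_counts, default=0),
        "avg_edges": sum(edge_counts) / n if n else 0.0,
    }
```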

One Special Case

[Figure panels: 10k lib, 20k lib]

[Figure panels: 10k lib, 20k lib]

Another example
