SW-ARRAY: a dynamic programming solution for the identification of copy-number changes in genomic DNA using array comparative genome hybridization data


Dec 18, 2015

Transcript
Page 1: SW-ARRAY: a dynamic programming solution for the identification of copy-number changes in genomic DNA using array comparative genome hybridization data.

SW-ARRAY: a dynamic programming solution for the identification of copy-number changes in genomic DNA using array comparative genome hybridization data

Page 2

Motivation

• Chromosomal changes cause genetic diseases
  – Aneusomies
    • Easy to detect
  – Copy number changes of genes
    • Not so easy

Page 3

Array CGH

• Comparative Genome Hybridization (CGH) to DNA microarrays
• Method for detecting copy number changes
  – Data analyzed using thresholds
  – Not reliable for detecting single-copy gains or losses when using large insert clones as probes
  – High false-positive and false-negative rates
  – Inconsistent for probes from different chromosomal regions

• Cannot be used for clinical diagnostic applications!

Page 4

Data Adjustment

• Normalization and Correction
  – Reason: variations between probes
  – Control vs. control data ratio
    • Find mean and SD
  – Divide control vs. test ratios by that mean
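The adjustment above can be sketched as follows. This is a minimal illustration, assuming the control vs. control ratios are stored as one row per experiment and one column per probe; the function name `normalize_ratios` and the array layout are assumptions, not taken from the slides.

```python
import numpy as np

def normalize_ratios(control_ratios, test_ratios):
    """Divide control-vs-test ratios by the per-probe control-vs-control mean.

    control_ratios: shape (n_experiments, n_probes), control vs. control ratios.
    test_ratios:    shape (n_probes,), control vs. test ratios.
    (Both layouts are illustrative assumptions.)
    """
    control_ratios = np.asarray(control_ratios, dtype=float)
    test_ratios = np.asarray(test_ratios, dtype=float)
    probe_mean = control_ratios.mean(axis=0)       # per-probe mean
    probe_sd = control_ratios.std(axis=0, ddof=1)  # per-probe sample SD
    return test_ratios / probe_mean, probe_mean, probe_sd
```

The SD is returned alongside the mean because the slide asks for both, although only the mean is used in the division.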

Page 5

Threshold method

• Compare each data point from the control vs. test experiment to threshold values
  – Below 0.8 = deletion
  – Above 1.2 = polysomy
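A minimal sketch of the threshold rule, using the 0.8 and 1.2 cutoffs from the slide. The function name and the "normal" label for ratios between the two cutoffs are assumptions added for illustration.

```python
def classify_by_threshold(ratio, low=0.8, high=1.2):
    """Classify one normalized control-vs-test ratio with fixed cutoffs."""
    if ratio < low:
        return "deletion"
    if ratio > high:
        return "polysomy"
    return "normal"  # label assumed; the slide names only the two extremes

calls = [classify_by_threshold(r) for r in (0.7, 1.0, 1.3)]
```

Each probe is judged in isolation, which is why a single noisy probe can produce a false call.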

Page 6

SW-ARRAY

• Smith-Waterman algorithm adapted for Array CGH

• New way to analyze Array CGH data

• Reason:
  – Log ratio data form a contiguous one-dimensional series, where runs of high values may indicate polysomic regions and runs of low values may indicate deletions

Page 7

SW-ARRAY

• Step 1:
  – Remove outlying probes
    • Log intensity ratio more than 2.5 MAD away from the median of the other probes in the array
    • MAD = Mean Absolute Deviation
      – Robust measure of standard deviation

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert$$
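Step 1 can be sketched as below, using the mean absolute deviation as written on the slide. Two simplifications are assumed here: the median and MAD are computed over all probes rather than leaving out the probe under test, and the function names are illustrative.

```python
import numpy as np

def mean_abs_dev(x):
    """Mean absolute deviation about the mean: (1/n) * sum(|x_i - mean(x)|)."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x - x.mean()))

def remove_outliers(log_ratios, k=2.5):
    """Drop probes whose log ratio lies more than k MAD from the median.

    Assumption: median and MAD are taken over all probes in the array.
    Returns the kept values and the boolean mask of kept probes.
    """
    x = np.asarray(log_ratios, dtype=float)
    keep = np.abs(x - np.median(x)) <= k * mean_abs_dev(x)
    return x[keep], keep
```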

Page 8

SW-ARRAY

• Step 2:
  – Subtract t0 from the log ratio data
  – Ensures that the mean of adjusted data is negative
    • t0 = median + 0.2 × MAD
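A short sketch of Step 2, assuming the same mean-absolute-deviation definition of MAD as in Step 1. The function name and the parameter `c` (set to the slide's 0.2) are illustrative.

```python
import numpy as np

def adjust_log_ratios(log_ratios, c=0.2):
    """Subtract t0 = median + c * MAD from every log ratio."""
    x = np.asarray(log_ratios, dtype=float)
    mad = np.mean(np.abs(x - x.mean()))  # MAD as defined in Step 1
    t0 = np.median(x) + c * mad
    return x - t0, t0

adjusted, t0 = adjust_log_ratios([-0.1, 0.0, 0.1])
```

For roughly symmetric data the shift pushes the mean of the adjusted series below zero, so only sustained runs of elevated values can accumulate a positive score.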

Page 9

SW-ARRAY

• Step 3:
  – Search for high-scoring islands
    • Definition
      – Locally high-scoring segment: a positive-scoring segment whose score cannot be increased by shrinking or expanding the segment boundaries

Page 10

SW-ARRAY

$$T(p, q) = \sum_{i=p}^{q} X(i)$$

T(p,q) = score of the segment from probe p to probe q
X(i) = score for the ith probe, ordered along the genome
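The segment score can be evaluated in constant time per query once prefix sums are available. The sketch below assumes 1-based, inclusive probe indices to match T(p,q); the helper names are illustrative.

```python
from itertools import accumulate

def prefix_sums(x):
    """prefix[k] = X(1) + ... + X(k), with prefix[0] = 0."""
    return [0.0] + list(accumulate(x))

def segment_score(prefix, p, q):
    """T(p, q) = sum of X(i) for i = p..q (1-based, inclusive)."""
    return prefix[q] - prefix[p - 1]

prefix = prefix_sums([1.0, -2.0, 3.0, 4.0])
t_3_4 = segment_score(prefix, 3, 4)  # 3.0 + 4.0
```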

Page 11

SW-ARRAY

S(p) = score of island ending at p
B(p) = beginning point of the island
S(0) = 0
p > 0
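The recurrence itself did not survive in this transcript. A common Smith-Waterman-style form that is consistent with the definitions above is S(p) = max(S(p-1) + X(p), 0), with B(p) = B(p-1) when S(p-1) > 0 and B(p) = p otherwise. The sketch below assumes that form; it is not confirmed by the slide.

```python
def island_scores(x):
    """Compute S(p) and B(p) for p = 1..n under the assumed recurrence.

    S[0] = 0 as on the slide; index 0 of B is unused.
    """
    n = len(x)
    S = [0.0] * (n + 1)
    B = [0] * (n + 1)
    for p in range(1, n + 1):
        if S[p - 1] > 0:
            B[p] = B[p - 1]  # extend the island that is still open
            S[p] = max(S[p - 1] + x[p - 1], 0.0)
        else:
            B[p] = p         # start a new candidate island at p
            S[p] = max(x[p - 1], 0.0)
    return S, B

S, B = island_scores([-1.0, 2.0, 3.0, -6.0, 1.0])
```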

Page 12

SW-ARRAY

• Iterate through locations along gene probes
• Search where scores > 0
  – Find max-scoring island
  – Record data
  – Set island = 0
  – Find next max-scoring island
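The loop above can be sketched as follows. "Set island = 0" is read here as zeroing the scores of the probes inside the recorded island before searching again; that reading, the `max_islands` cap, and the function name are assumptions.

```python
def find_islands(x, max_islands=10):
    """Repeatedly extract the max-scoring island as (start, end, score).

    Indices are 1-based and inclusive. The input is expected to be the
    adjusted (t0-subtracted) log ratios.
    """
    scores = list(x)
    islands = []
    for _ in range(max_islands):
        best, best_start, best_end = 0.0, 0, 0
        s_prev, b_prev = 0.0, 0
        for p in range(1, len(scores) + 1):
            b = b_prev if s_prev > 0 else p       # island start
            s = max(s_prev + scores[p - 1], 0.0)  # island score ending at p
            if s > best:
                best, best_start, best_end = s, b, p
            s_prev, b_prev = s, b
        if best <= 0:
            break  # no positive-scoring island remains
        islands.append((best_start, best_end, best))
        # Assumed reading of "Set island = 0": zero the island's probe scores.
        scores[best_start - 1:best_end] = [0.0] * (best_end - best_start + 1)
    return islands

islands = find_islands([-1.0, 2.0, 3.0, -6.0, 1.0, -2.0])
```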

Page 13

SW-ARRAY

• Statistical Significance
  – In 1000 runs with permuted log ratios for each probe
    • Find frequency of highest scoring island in each run
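One way to read this step is as a permutation test: shuffle the adjusted log ratios, record the highest island score in each run, and report how often it reaches the observed score. The sketch below follows that reading, which is an interpretation rather than something the slide spells out; the function names and the fixed `seed` are illustrative.

```python
import random

def max_island_score(x):
    """Highest island score in a series (0.0 if no positive island)."""
    best = running = 0.0
    for value in x:
        running = max(running + value, 0.0)
        best = max(best, running)
    return best

def permutation_p_value(x, observed_score, n_runs=1000, seed=0):
    """Fraction of permuted runs whose top island reaches observed_score."""
    rng = random.Random(seed)
    shuffled = list(x)
    hits = 0
    for _ in range(n_runs):
        rng.shuffle(shuffled)  # permute log ratios across probe positions
        if max_island_score(shuffled) >= observed_score:
            hits += 1
    return hits / n_runs

data = [-1.0, 2.0, 3.0, -6.0, 1.0]
p_value = permutation_p_value(data, max_island_score(data), n_runs=200)
```

A small fraction suggests that the observed island is unlikely to arise from the same values in random genomic order.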

Page 14

Experiment

• Test Group
  – DNA from subjects with well-characterized monosomies
• Control groups
• Data analyzed using 2 methods
  – Threshold
  – SW-ARRAY

Page 15

Experiment Results

• Threshold Method
  – 78.1% correct identification of copy-number changes
• SW-ARRAY
  – Identified 13/14 of the monosomic regions with high significance levels in the 14 blind tests

Page 16

Ideal Conditions for SW-ARRAY

• Numerous probes border the region of copy number change
• Long sequences, for which edge effects are minimized

Page 17

Output