Neural Network Analysis of Flow Cytometry Immunophenotype Data Mehrshad Mokhtaran M.D. IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 43, NO. 8, AUGUST.

Neural Network Analysis of Flow Cytometry Immunophenotype Data

Mehrshad Mokhtaran M.D.

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 43, NO. 8, AUGUST 1996

Ravi Kothari,* Member, IEEE, Hernani Cualing, and Thiagarajan Balachander

Acute Leukemia

• Definition
  – Malignant event
  – Blasts replace the bone marrow
  – Clinical complications: anemia, infection, bleeding
  – Rapidly fatal
  – With appropriate therapy, the natural history can be markedly altered, and many patients can be cured.

Acute Leukemia

• Etiology:
  – Radiation
  – Oncogenic viruses
  – Genetic and congenital factors
  – Chemicals and drugs

Acute Leukemia

• Incidence:
  – Annual new cases (all leukemias): 8 to 10 per 100,000
  – Has remained static over the past three decades
  – ALL: 11%, CLL: 29%, AML: 46%, CML: 14%
  – 3% of all cancers in the United States
  – ALL is the most common cancer in children (<15 y)
  – ALL is the second leading cause of death in children (<15 y)
  – ALL has two incidence peaks by age
  – AML incidence gradually increases with age
  – Half of AML cases occur in patients younger than 50 y

Acute Leukemia

• Pathophysiology:

Acute Leukemia

• Classification
  – Morphology
  – Cytochemistry
  – Cell-surface markers
  – Cytoplasmic markers
  – Cytogenetics
  – Oncogene expression

Acute Leukemia

• The most important distinction is between AML and ALL
  – Clinical behavior, prognosis, response to therapy

• AML (FAB)
  – M0, M1, M2, M3: increasing degree of differentiation
  – M4, M5: monocytic lineage
  – M6: erythroid cell lineage
  – M7: acute megakaryocytic leukemia

• ALL (FAB)
  – L1
  – L2
  – L3

Acute Leukemia

• Cell-surface markers:
  – AML
    • Normal immature myeloid cells and blast cells from most patients with AML: CD13, CD14, CD33, CD34
    • M6, M7: antigens restricted to red cell and platelet lineage
    • AML may express HLA-DR antigen
    • 10-20%: B- or T-cell lineage
  – ALL
    • 60% of ALL: CALLA (CD10) (early pre-B-cell differentiation state)
    • Pre-B-cell ALL (20%): CALLA-positive cells that have intracytoplasmic immunoglobulin
    • B-cell ALL (5%): immunoglobulin on the cell surface
    • T-cell ALL (20%): CD5, CD3, or CD2 (normal early T-cell)
    • Null cell ALL (15%): fails to express CALLA, B-cell, or T-cell markers
    • 25% of ALL: myeloid antigens

Acute Leukemia

• Cytogenetics and Molecular biology:

Acute Leukemia

• Clinical manifestations:
  – Decreased normal marrow function:
    • Anemia: fatigue, pallor, headache, angina or heart failure
    • Thrombocytopenia: bleeding (petechiae, ecchymoses, bleeding gums, epistaxis)
    • Granulocytopenia (AML>ALL): infections (bacterial)
  – Invasion of normal organs by leukemic blasts (ALL>AML):
    • Enlargement of lymph nodes, liver, spleen
    • Bone pain
    • Skin (leukemia cutis)
    • Leukemic meningitis: headache, nausea
    • CNS (particularly in relapse): palsies and seizures
    • Testicular involvement (particularly in relapse)
    • Any soft tissue (AML>ALL): chloroma, myeloblastoma
  – Specific subtypes of leukemia:
    • M3: DIC (disseminated intravascular coagulation)

Acute Leukemia

• Laboratory manifestations:
  – CBC
  – Bone marrow aspiration and biopsy
  – PT (prothrombin time) and PTT (partial thromboplastin time)
  – LDH (lactate dehydrogenase)
  – …

Acute Leukemia

• Treatment:
  – Combination chemotherapy
  – Bone marrow transplantation
  – Stabilization:
    • Hematological
    • Metabolic
    • Psychological

• Introduction

• Data Collection

• Classifier Design

• Results

• Discussion

• Conclusion

Introduction

• Immunophenotype data
• Flow cytometry
• Lineage & differentiation
• ALL: Immature (CALLA+), Pre-B, Mature-B, T-Lymphoblastic
• Response to chemotherapy
• AML: M1, M2, …, M8
• No relevant prognosis

Data Collection

• Flow cytometry immunophenotype data of cases with leukemia or reactive bone marrow were collected retrospectively from a computerized archival database.

• Selection criteria:
  – Confirmed diagnosis
  – Complete flow cytometry antibody panel result

• Total cases: 170
  – 151 leukemia and 19 nonleukemia
  – 62 children and 89 adults
  – 81 males and 70 females

First Phase

• Lineage categories
• Categorize into:
  – Reactive
  – ALL
  – Remission
  – Mixed AML-ALL
  – AML

Second Phase

• Categorize the ALL cases into subcategories based on differentiation

• Categorize into:
  – Pre-B
  – CALLA+
  – T Phenotype

• Not included: Mature-B (difficulty in obtaining sufficient data for meaningful interpretation)

Data

• Validation / training set size = 33-50%
• Only bone marrow phenotypes (most sensitive and specific)
• Excluded: peripheral blood and cerebrospinal fluid immunophenotypes
• Flow cytometry immunophenotype data:
  – Mean fluorescence intensity of a minimum of 10,000 cells, analyzed using either a red or a green fluorescence-tagged antibody

Data

• 27 standardized and most commonly used monoclonal antibodies with defined specificities.
• Not all of these are utilized for each case.
• Average of 15 antibodies for each case.
• At least ten antibodies are commonly used for acute leukemia as a standard practice.
• An input takes a zero value if the antibody was not used.
• These values, together with an additional binary input denoting a past diagnosis of leukemia, were used as input to a neural network classifier.
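The input encoding described above can be sketched as follows. This is only an illustration: the function name `encode_case` and the idea of addressing antibodies by panel position 0..26 are assumptions, since the slides do not say how the panel is ordered.

```python
from typing import Dict, List

N_ANTIBODIES = 27  # panel size stated on the slide


def encode_case(intensities: Dict[int, float], prior_leukemia: bool) -> List[float]:
    """Build the 28-element input vector for one case.

    `intensities` maps a panel position (0..26, a hypothetical indexing)
    to the mean fluorescence intensity measured for that antibody.
    Antibodies that were not used keep the value zero.
    """
    vector = [0.0] * (N_ANTIBODIES + 1)
    for index, value in intensities.items():
        if not 0 <= index < N_ANTIBODIES:
            raise ValueError(f"antibody index {index} is outside the panel")
        vector[index] = float(value)
    # Last element: binary flag for a past diagnosis of leukemia.
    vector[N_ANTIBODIES] = 1.0 if prior_leukemia else 0.0
    return vector


# Made-up intensities for three antibodies; the other 24 stay at zero.
example = encode_case({0: 12.5, 3: 80.0, 26: 4.2}, prior_leukemia=True)
```

One consequence of this encoding is that a zero can mean either "not measured" or "measured as zero"; the slides do not say how, or whether, the two are distinguished.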

Classifier Design

• A feed-forward neural network

• Trained using the backpropagation algorithm

Classifier

• How many hidden layer neurons are needed for a particular task?

– Having a large number of redundant weights leads to overfitting

Classifier

• Given a network with a certain number of inputs, hidden layer neurons, and outputs, how many training samples are needed to achieve good generalization?

• For accuracy of (1-ε):

p ≥ O(W/ε)

p: Number of training samples.

W: Total number of weights in the network.
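To get a feel for the scale of this bound, the sketch below evaluates W/ε for the weight totals reported on the results slide (1650 and 1550). The value ε = 0.1 and the constant of 1 hidden inside the O(·) are assumptions made only for illustration.

```python
def samples_needed(num_weights: int, epsilon: float, constant: float = 1.0) -> float:
    """Order-of-magnitude estimate of p from p >= O(W / epsilon).

    `constant` stands in for the factor hidden by the O(.) notation;
    using 1.0 is an assumption, not a value from the paper.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return constant * num_weights / epsilon


first_phase = samples_needed(1650, 0.1)   # roughly 16,500 samples
second_phase = samples_needed(1550, 0.1)  # roughly 15,500 samples
```

Both estimates are far larger than the 170 cases collected, which is why the number of effective weights has to be reduced or the number of cases increased.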

Classifier

• Perturbation: Generate a large number of cases by introducing small variations in actual cases.

• Optimal Brain Damage: The weights whose removal least increases the error can be eliminated.

• Optimal Brain Surgeon: The sensitivity of an interconnection is expressed as the cumulative sum of the changes experienced by a weight during training.

• Weight Decay: Each weight has a tendency to decay to zero at a rate proportional to the magnitude of the weight.
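Two of the ideas above, perturbation and weight decay, can be sketched in a few lines. The Gaussian noise model, leaving zero (unused) inputs untouched, and the standard form of the decay term are all assumptions; the slides do not give the exact formulas. The default rates 0.1 and 0.05 are the learning rate and weight decay coefficient listed on the later parameter slide.

```python
import random
from typing import List


def perturb(case: List[float], scale: float, rng: random.Random) -> List[float]:
    """Return a noisy copy of one case.

    Nonzero inputs receive small Gaussian noise; zero inputs (taken here
    to mean "antibody not used") are left at zero. Both choices are assumptions.
    """
    return [x + rng.gauss(0.0, scale) if x != 0.0 else 0.0 for x in case]


def augment(cases: List[List[float]], copies: int, scale: float, seed: int = 0) -> List[List[float]]:
    """Generate `copies` perturbed variants of every actual case."""
    rng = random.Random(seed)
    return [perturb(case, scale, rng) for case in cases for _ in range(copies)]


def decay_step(weights: List[float], gradients: List[float],
               learning_rate: float = 0.1, decay: float = 0.05) -> List[float]:
    """One gradient step with weight decay: w <- w - eta * (dE/dw + lambda * w).

    The extra term pulls each weight toward zero at a rate proportional
    to its own magnitude.
    """
    return [w - learning_rate * (g + decay * w) for w, g in zip(weights, gradients)]
```

With a zero error gradient, `decay_step` shrinks every weight by the factor 1 − ηλ, so weights that the error term does not actively support drift toward zero over training.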

Classifier

• Inputs: 27 + 1

• Hidden: 50, found by progressively increasing the number of hidden neurons until acceptable performance was achieved on the training data.

• Outputs:
  – First phase (based on lineage): 5
  – Second phase (based on differentiation): 3

• Learning rate (η): 0.1

• Weight decay coefficient (λ): 0.05
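A minimal sketch of a network with these layer sizes is given below. The sigmoid activation, the small random initial weights, and the absence of bias terms are assumptions; leaving out biases is chosen because 28×50 + 50×5 = 1650 and 28×50 + 50×3 = 1550 then match the weight totals reported on the results slide.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_units: int, n_inputs: int, rng: random.Random) -> Matrix:
    """One row of incoming weights per unit, drawn from a small uniform range."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_units)]


def layer(inputs: List[float], weights: Matrix) -> List[float]:
    """Weighted sum of the inputs for each unit, passed through a sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def forward(inputs: List[float], w_hidden: Matrix, w_out: Matrix) -> List[float]:
    """Feed-forward pass: inputs -> hidden layer -> output layer."""
    return layer(layer(inputs, w_hidden), w_out)


def count_weights(*matrices: Matrix) -> int:
    return sum(len(row) for matrix in matrices for row in matrix)


rng = random.Random(0)
w_hidden = init_weights(50, 28, rng)  # 28 inputs -> 50 hidden units
w_out = init_weights(5, 50, rng)      # 50 hidden units -> 5 lineage outputs
outputs = forward([0.0] * 28, w_hidden, w_out)
total = count_weights(w_hidden, w_out)
```

Replacing the 5 output units with 3 gives the second-phase network for the ALL subcategories.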

Results

• Mean error was acceptably low (0.0001) in both cases.

• First phase weights:
  – Total: 1650
  – Nonzero: 1106
  – Very small value (<0.1): 544

• Second phase weights:
  – Total: 1550
  – Nonzero: 446
  – Very small value (<0.1): 1104
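The counts above can be read as a magnitude threshold applied to the trained weights. The sketch below assumes that "nonzero" means a magnitude of at least 0.1 and "very small" means a magnitude below 0.1; the slides do not define the split more precisely. It also works out what fraction of each network was effectively switched off.

```python
from typing import List, Tuple


def split_by_magnitude(weights: List[float], threshold: float = 0.1) -> Tuple[List[float], List[float]]:
    """Separate weights into (retained, very small) by absolute value."""
    retained = [w for w in weights if abs(w) >= threshold]
    very_small = [w for w in weights if abs(w) < threshold]
    return retained, very_small


def small_fraction(n_small: int, n_total: int) -> float:
    """Share of weights that ended up below the threshold."""
    return n_small / n_total


# Counts taken from the slide.
first_phase = small_fraction(544, 1650)    # about one third of the weights
second_phase = small_fraction(1104, 1550)  # about 71% of the weights

# Toy weights, only to show how the split behaves.
retained, very_small = split_by_magnitude([0.5, -0.02, 0.09, -0.3, 0.1])
```

On these numbers, weight decay left roughly a third of the first-phase weights and more than two thirds of the second-phase weights with negligible magnitude.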

Fig. 2. Performance of the network for categorization into reactive and the lineage categories of leukemia (ALL, Remission, Mixed AML-ALL, and AML).

Fig. 3. Performance of the network for categorization of ALL cases into subcategories based on differentiation (Pre-B, CALLA+, and T Phenotype).

Results

• Generalization error:
  – First phase: 10.3%
  – Second phase: 10.0%

• Backpropagation without the complexity regularization term (weight decay):
  – Generalization performance was poor

Discussion

• Clustering-based methods fall into one of two categories:

  – Partitioning
  – Hierarchical

Discussion

• Partitioning:
  – e.g., k-means, fuzzy c-means clustering
  – Divide the inputs so that members of a cluster are close to each other and far away from other clusters
  – The shared specificity of some monoclonal antibodies makes this extremely difficult.
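For reference, the partitioning idea can be illustrated with a bare-bones k-means loop. This is a generic textbook sketch with caller-supplied starting centroids, not a method used in the paper, and the four two-dimensional points are made up.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: List[Point], centroids: List[Point],
            iterations: int = 10) -> Tuple[List[int], List[List[float]]]:
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its members."""
    centers = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: centroid becomes the mean of its members.
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # an empty cluster keeps its previous centroid
                centers[k] = [sum(coord) / len(members) for coord in zip(*members)]
    return labels, centers


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The toy clusters here are well separated. When several antibodies respond to more than one cell type, cases from different categories overlap in this space and the nearest-centroid rule becomes unreliable, which is the difficulty the slide points to.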

Discussion

• Hierarchical:
  – e.g., centroid sorting, linkage methods
  – Merge the two closest data points at each step, and repeat the process until there is only one cluster.
  – Have a better chance of succeeding due to the variability in immunophenotype data
  – An error in a merge made early on is propagated throughout.
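The merge-until-one-cluster procedure can be sketched as a simple single-linkage agglomeration. Single linkage (distance between the closest pair of members) is only one of the linkage methods mentioned and is chosen here for brevity; the three one-dimensional points are made up.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]
Merge = Tuple[List[int], List[int], float]


def single_linkage(points: List[Point]) -> List[Merge]:
    """Greedy agglomerative clustering.

    Returns the merges in the order they were made, each as
    (members of cluster A, members of cluster B, linkage distance).
    """
    clusters: List[List[int]] = [[i] for i in range(len(points))]
    merges: List[Merge] = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return merges


merges = single_linkage([(0.0,), (1.0,), (5.0,)])
```

Each merge is final: the loop never splits a cluster again, so a wrong early merge stays in every later level of the hierarchy.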

Conclusion

• Offline retraining

• Extract rules from trained networks
