DSL for Pedigree Rearrangements CSI5112 Software Engineering Team: Andrei Anisenia Margi Fumtiwala
Page 1: CSI5112 Software Engineering Team: Andrei Anisenia Margi Fumtiwala.

DSL for Pedigree Rearrangements

CSI5112 Software Engineering

Team: Andrei Anisenia Margi Fumtiwala

Page 2

Agenda

DSL overview

Goals of DSL

Tool support for the DSL

DSL creation technology

Sample usage

Foreseen impact of language evolution

Potential for analysis

Conclusions

Page 3

DSL Overview

A Domain Specific Language (DSL) is a language designed to solve problems in a particular domain.

It is not intended to work outside that domain.

It has very specific goals in design and implementation.

Page 4

DSL for Pedigree Rearrangements

Genetic analysis of biological data is one of the most important research directions in modern bioinformatics.

The data to be analyzed is supplied in different textual formats by hospitals and researchers, and may be analyzed by different bioinformatics tools.

Our focus is on the type of genetic data called pedigree data, which is stored in so-called PEDFILEs.

Typical pedigree data describes a family structure.

It includes persons and the child-parent relations between them.

Page 5

A simple pedigree example

Graphical representation of the pedigree data

Textual representation of the pedigree data: PEDFILEs

Each line holds the pedigree data and the biological data for one specific person.
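To make the line layout concrete, here is a minimal Python sketch of reading one PEDFILE line. The column order (pedigree ID, person name, father, mother, sex, then biological data) is inferred from the example PEDFILEs on the later slides, and the reading of `0` as "unknown parent" and of sex codes `1`/`2` as male/female is an assumption based on those examples. The names `PedRecord` and `parse_ped_line` are hypothetical and are not part of the authors' tool.

```python
from dataclasses import dataclass
from typing import List, Optional

UNKNOWN_PARENT = "0"  # assumption: "0" marks a missing father or mother


@dataclass
class PedRecord:
    pedigree_id: str
    name: str
    father: Optional[str]  # None when the father is unknown
    mother: Optional[str]  # None when the mother is unknown
    sex: int               # assumption: 1 = male, 2 = female
    data: List[str]        # remaining columns, kept as raw tokens


def parse_ped_line(line: str) -> PedRecord:
    """Split one whitespace-separated PEDFILE line into its fields."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(fields)}")
    ped_id, name, father, mother, sex = fields[:5]
    return PedRecord(
        pedigree_id=ped_id,
        name=name,
        father=None if father == UNKNOWN_PARENT else father,
        mother=None if mother == UNKNOWN_PARENT else mother,
        sex=int(sex),
        data=fields[5:],
    )
```

The biological data columns are left as uninterpreted tokens, because the slides do not say how they should be read.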

Page 6

Why do we need a graphical tool?

The usual workflow of a bioinformatician involves extensive editing of PEDFILEs.

Initial PEDFILEs received from hospitals may contain pedigrees that are too large, or too much biological data, to be analyzed by existing software.

This requires splitting PEDFILEs into smaller sub-pedigrees and/or removing persons from a pedigree in order to make it analyzable.

Rearranging a pedigree requires multiple point changes within the pedigree data. Doing this by hand in a text file is not feasible.

Bioinformaticians thus need a visual, graphical environment for presenting and rearranging pedigrees.

Page 7

Scope of tool support for the DSL

Provides a visual graphical environment for the presentation and rearrangement of pedigrees.

Provides the possibility to save the constructed and rearranged pedigrees to textual PEDFILEs.

Page 8

DSL creation technology

Page 9

Overview of the approach

Meta-model: the construction and development of objects, relations, constraints and actions for modelling a predefined class of problems.

For the Pedigree Rearrangements problem, the meta-model is defined as the following set.

Objects:

Person – represents a single person in a pedigree. Holds the following data: name (unique within a pedigree), the person's sex, and biological data.

Pedigree – a set of persons and the relations among them. Each pedigree has a unique ID (Pedigree ID).

Relations:

Person has a father

Person has a mother

Pedigree consists of persons
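The objects and relations above can be sketched as plain Python classes. This is only an illustration of the meta-model's shape, not the EMF model the authors built; the class and attribute names (`Person`, `Pedigree`, `biological_data`, `add_person`) are assumptions chosen to mirror the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Person:
    name: str                       # unique within a pedigree
    sex: str                        # e.g. "male" or "female" (encoding is an assumption)
    biological_data: List[str] = field(default_factory=list)
    father: Optional["Person"] = None   # relation: Person has a father
    mother: Optional["Person"] = None   # relation: Person has a mother


@dataclass
class Pedigree:
    pedigree_id: str                # unique Pedigree ID
    persons: Dict[str, Person] = field(default_factory=dict)

    def add_person(self, person: Person) -> Person:
        """Relation: Pedigree consists of persons. Names must be unique."""
        if person.name in self.persons:
            raise ValueError(f"duplicate person name: {person.name}")
        self.persons[person.name] = person
        return person
```

Storing persons in a dictionary keyed by name is one simple way to enforce the "unique within a pedigree" rule at insertion time.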

Page 10

Constraints

A person can have at most one father and one mother.

A person is associated with exactly one pedigree.

A person cannot be connected to persons in another pedigree.

Adding a new child-parent relation must not create a directed cycle in a pedigree.

In simple words, circular relationships cannot exist between persons; for example, a mother cannot be a child....
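The no-cycle constraint can be checked before a new child-parent link is added. The sketch below is one possible way to do it, assuming the pedigree is available as a dictionary that maps each person's name to a `(father, mother)` pair; the function names `is_ancestor` and `can_add_parent` are hypothetical and do not come from the authors' editor.

```python
from typing import Dict, Optional, Tuple

# name -> (father, mother); None means the parent is not set
Parents = Dict[str, Tuple[Optional[str], Optional[str]]]


def is_ancestor(parents: Parents, candidate: str, person: str) -> bool:
    """Return True if `candidate` is `person` or one of its ancestors."""
    stack = [person]
    seen = set()
    while stack:
        current = stack.pop()
        if current == candidate:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Walk upwards through the known parents of the current person.
        stack.extend(p for p in parents.get(current, (None, None)) if p is not None)
    return False


def can_add_parent(parents: Parents, child: str, new_parent: str) -> bool:
    """A link child -> new_parent closes a cycle exactly when `child`
    is already `new_parent` or an ancestor of `new_parent`."""
    return not is_ancestor(parents, child, new_parent)
```

For instance, with `father` recorded as a child of `grandfather1` and `child1` as a child of `father`, the check rejects making `child1` a parent of `grandfather1`.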

Page 11

Pedigree Rearrangements Model

Page 12

Example – input PEDFILE

Ped1 grandfather1 0 0 1 1 2 1 1 2 2 1 2 3
Ped1 grandmother1 0 0 2 1 1 1 2 3 4 2 12 3 5
Ped1 grandfather2 0 0 1 1 2 3 2 2 2 2 2 2 2 2
Ped1 grandmother2 0 0 2 4 4 4 4 423 2 3 2 4
Ped1 grandfather3 0 0 1 1 2 1 2 4 5 6 3 2
Ped1 father grandfather1 grandmother1 1 12 1 3 4 56 4 3 56 7 8
Ped1 mother grandfather2 grandmother2 2 1 2 4 5 6 3 4 2 45 32
Ped1 stranger grandfather3 grandmother2 1 2 3 4 5 1 3 4 6 7 8 56
Ped1 mother2 grandfather3 grandmother2 2 1 2 3 4 6 4 5 6 7 8
Ped1 child1 father mother 1 1 2 1 2 1 2 1 2 1 2
Ped1 child2 stranger mother2 2 1 2 34 54 6 7 8

Page 13

Example - editor view of the input pedigree

Page 14

Example – editor view of the output pedigree

Page 15

Example – output PEDFILE

Ped1 grandfather1 0 0 1 1 2 1 1 2 2 1 2 3
Ped1 grandmother1 0 0 2 1 1 1 2 3 4 2 12 3 5
Ped1 grandfather2 0 0 1 1 2 3 2 2 2 2 2 2 2 2
Ped1 grandmother2 0 0 2 4 4 4 4 423 2 3 2 4
Ped1 father grandfather1 grandmother1 1 12 1 3 4 56 4 3 56 7 8
Ped1 mother grandfather2 grandmother2 2 1 2 4 5 6 3 4 2 45 32
Ped1 child1 father mother 1 1 2 1 2 1 2 1 2 1 2
Ped2 grandfather3 0 0 1 1 2 1 2 4 5 6 3 2
Ped2 grandmother2 0 0 2 4 4 4 4 423 2 3 2 4
Ped2 stranger grandfather3 grandmother2 1 2 3 4 5 1 3 4 6 7 8 56
Ped2 mother2 grandfather3 grandmother2 2 1 2 3 4 6 4 5 6 7 8
Ped2 child2 stranger mother2 2 1 2 34 54 6 7 8
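The rearrangement in this example moves some persons into a new pedigree and copies one shared founder (`grandmother2`) into both. A rough Python sketch of that kind of split on raw PEDFILE lines follows. It assumes the column order seen in the example (pedigree ID, name, father, mother, ...) and `0` for an unknown parent; `split_pedigree` and its parameters are hypothetical names, and the authors' tool performs the rearrangement graphically rather than with a function like this.

```python
from typing import Iterable, List, Set


def split_pedigree(lines: Iterable[str], moved: Set[str],
                   shared: Set[str], new_id: str) -> List[str]:
    """Move `moved` persons into pedigree `new_id` and copy `shared`
    persons into it, keeping every parent link inside one pedigree."""
    in_new = moved | shared
    kept: List[str] = []
    new: List[str] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name = fields[1]
        parents = {p for p in fields[2:4] if p != "0"}
        if name in in_new:
            # Everyone in the new pedigree needs their parents there too.
            if not parents <= in_new:
                raise ValueError(f"{name}: parent missing from {new_id}")
            new.append(" ".join([new_id] + fields[1:]))
        if name not in moved:
            # Persons who stay must not point at someone who left.
            if parents & moved:
                raise ValueError(f"{name}: parent was moved to {new_id}")
            kept.append(" ".join(fields))
    return kept + new
```

Calling it on the input PEDFILE with `moved={"grandfather3", "stranger", "mother2", "child2"}`, `shared={"grandmother2"}` and `new_id="Ped2"` yields the same set of lines as the output PEDFILE, although the order of the `Ped2` lines follows the input order.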

Page 16

Model-to-text (M2T) transformation technique used

Model-to-text (M2T) transformations focus on generating textual artifacts from models.

New PEDFILEs are generated using Xpand.

Xpand is a language specialized in code (text) generation based on EMF models.

It provides OCL-like expressions (semantics) with Java-like syntax.
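The idea of a model-to-text transformation can be illustrated without Xpand. The following Python sketch is not the authors' Xpand template; it only shows the same principle of walking a pedigree model and emitting one PEDFILE line per person. The `Person` tuple, the `render_pedfile` function, and the use of `0` for a missing parent are assumptions made for this illustration.

```python
from typing import Iterable, List, NamedTuple, Optional


class Person(NamedTuple):
    name: str
    sex: int                 # assumption: 1 = male, 2 = female
    father: Optional[str]    # None when unknown
    mother: Optional[str]    # None when unknown
    data: List[str]          # biological data columns as raw tokens


def render_pedfile(pedigree_id: str, persons: Iterable[Person]) -> str:
    """Emit one whitespace-separated line per person in the model."""
    lines = []
    for p in persons:
        fields = [
            pedigree_id,
            p.name,
            p.father or "0",   # "0" stands for an unknown parent
            p.mother or "0",
            str(p.sex),
            *p.data,
        ]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"
```

A template language such as Xpand expresses the same loop declaratively over the EMF model, which keeps the output format separate from the editor code.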

Page 17

Language evolution – GUI adjustments

Different shapes and/or colors for person items according to the person's sex.

Persons dropped onto the pedigree “canvas” instead of being linked to it by arrows.

These require minor changes in GMF components.

Considered for the upcoming release on the 18th of April.

Page 18

Language evolution – functionality extensions

Pedigree layout is a crucial parameter in the visual analysis of pedigree structure

Different auto-layout algorithms should provide different views of a pedigree structure

It takes a lot of time to build a pedigree from a given PEDFILE manually

Automatic construction of a pedigree by only browsing a PEDFILE would be helpful

These features are hard to implement in the existing DSL toolkit and require extensions to the framework

Page 19

Potential for analysis – Bayesian networks

Bayesian networks (BNs) are statistical networks used to analyze the biological data stored in pedigrees

They are provided as input to different statistical bioinformatics software (Superlink, Genehunter, Allegro, etc.)

BNs are graphs with specific properties, constructed using pedigree structure and biological data

BNs are much more complex to construct than pedigrees (more nodes, more links)

Page 20

Potential for analysis – Bayesian networks, example pedigree

Page 21

Potential for analysis – Bayesian networks, example network

Page 22

Potential for analysis – Bayesian networks, transformation

BNs are presented in structured textual files, just like PEDFILEs

Define a meta-model for the Bayesian Network

Define a model-to-model (M2M) transformation from the Pedigree model to the Bayesian Network model

Define a model-to-text (M2T) transformation for the Bayesian Network

Run BN analysis software and find disease gene locations!

Page 23

Conclusions

Our tool provides support for the graphical representation of biological instances.

Existing DSL creation tools do not provide full coverage of all DSL editor needs.

The data to be manipulated is very complex, so automating the transformations between different models becomes essential.

Software engineering should support ongoing research in biology; otherwise that research is not feasible, given the exponentially growing amount of biological data.