Cluster Computer For Bioinformatics Applications Nile University, Bioinformatics Group. Hisham Adel 2008
Dec 21, 2015

Cluster Computer For Bioinformatics Applications

Nile University, Bioinformatics Group.

Hisham Adel

2008

Done by: Hisham Adel Hassan.

Supervised by: Dr. Mohamed Aboualhouda

Points

• Introduction.

• Cluster and Supercomputers.

• Cluster Types and Advantages.

• Our Cluster.

• Cluster Performance.

• Cluster Computer for Basic Problems.

• General Idea about Sequence Alignment.

• BLAST and Parallel BLAST Algorithm.

• Sequence Alignment and Parallel Sequence Alignment.

• Learned Skills.

Introduction

Cluster Definition

•A group of computers and servers, connected together, that acts like a single system.

•Each computer in the cluster is called a node.

•A node contains one or more processors, RAM, a hard disk, and a LAN card.

•Nodes work in parallel.

•Performance can be increased by adding more nodes.

Cluster types

•Load-balancing cluster (Parallel BLAST).

•Computing cluster (Parallel sequence alignment).

•High-availability (HA) cluster.

Cluster types: Load Balancing Cluster

[Diagram label: Task]

Cluster types: Computing Cluster

[Diagram label: Task]

Cluster types: High-availability Clusters

Cluster advantages

•Performance.

•Scalability.

•Maintenance.

•Cost.

Our Cluster

[Diagram: Node 1, Node 2, Node 3 and Node 4 connected to a switch; the label "Internet" appears four times.]

Our Cluster specification

Communication: 5-port 10/100 Mbps switch.

Processor and RAM:

-Master node: dual-core processor, 1.86 GHz, 1 GB RAM.

-Node 1: Pentium 4, 1 GB RAM.

-Node 2: Pentium 4, 1 GB RAM.

-Node 3: Pentium 4, 512 MB RAM.

Our Cluster specification (cont'd)

Operating system: openSUSE 10.3

http://software.opensuse.org/

Message passing: MPICH2

http://www.mcs.anl.gov/research/projects/mpich2/

Performance of the cluster is affected by:

1-Node speed.

2-The program being run (sequential or parallel).

[Figure sequence: Running Program (sequential). One node shows "Working…" until the program finishes.]

[Figure sequence: Running Program (parallel). Data is sent to three nodes ("Data sent"), four nodes show "Working…", then three nodes show "Finished…" and return "Results" while the remaining node shows "Get results…".]
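In the parallel run, each node receives a share of the work before all nodes compute at the same time. The following is a minimal sketch in C of one way such a share could be computed. The function name block_range and the even block distribution are assumptions made for illustration; the slides do not state how the actual program divides its work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes the half-open range [*start, *end) of work
 * items assigned to node `rank` when `total` items are split across
 * `nodes` nodes. The first (total % nodes) nodes get one extra item. */
void block_range(size_t total, size_t nodes, size_t rank,
                 size_t *start, size_t *end)
{
    assert(nodes > 0 && rank < nodes);
    assert(start != NULL && end != NULL);

    size_t base  = total / nodes;   /* items every node gets */
    size_t extra = total % nodes;   /* leftover items, one each to low ranks */

    *start = rank * base + (rank < extra ? rank : extra);
    *end   = *start + base + (rank < extra ? 1 : 0);
}
```

With 1000 items and 3 nodes, ranks 0, 1 and 2 would receive 334, 333 and 333 items respectively.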

Sequence Alignment

Used to:

1-Compare sequences.

2-Search databases.

How to Align Two Sequences

Suppose we have two sequences:

A A A C G A
A A T G A

Let match = 1, gap = -1, mismatch = 0.

They can be aligned as:

1- A A A C G A
   | | | | | |      Score = 3
   A A T _ G A

2- A A A C _ G A
   | | | | | | |    Score = 1
   A A _ _ T G A
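The two scores above can be checked column by column. A minimal sketch in C, assuming the alignments are written as equal-length strings with '_' marking a gap; the function name alignment_score is hypothetical and is not taken from the author's program.

```c
#include <assert.h>
#include <string.h>

/* Scores two already-aligned strings of equal length, column by column.
 * '_' marks a gap. Scoring follows the slide: match = 1, gap = -1,
 * mismatch = 0. */
int alignment_score(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t n = strlen(a);
    assert(strlen(b) == n);

    int score = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] == '_' || b[i] == '_')
            score += -1;          /* gap in either sequence */
        else if (a[i] == b[i])
            score += 1;           /* match */
        /* a mismatch contributes 0 */
    }
    return score;
}
```

Calling alignment_score("AAACGA", "AAT_GA") gives 3 (four matches, one mismatch, one gap), and alignment_score("AAAC_GA", "AA__TGA") gives 1 (four matches, three gaps).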

BLAST (Basic Local Alignment Search Tool)

Searching Databases

BLAST Algorithm

(High scoring pairs)

BLAST search types

BLASTN - Compares a nucleotide query sequence against a nucleotide sequence database.

BLASTP - Compares an amino acid query sequence against a protein sequence database.

TBLASTN - Compares a protein query sequence against a nucleotide sequence database.

BLASTX - Compares a nucleotide query sequence against a protein sequence database.
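The four search types differ only in the kind of query and the kind of database they expect, so they fit in a small lookup table. A minimal sketch in C; the type and function names (seq_type, blast_program, find_blast_program) are hypothetical, and the lowercase program names are the values that blastall accepts for its -p option.

```c
#include <assert.h>
#include <string.h>

typedef enum { NUCLEOTIDE, PROTEIN } seq_type;

typedef struct {
    const char *program;   /* value passed to blastall -p */
    seq_type    query;     /* kind of sequence in the query file */
    seq_type    database;  /* kind of sequence in the database */
} blast_program;

static const blast_program BLAST_PROGRAMS[] = {
    { "blastn",  NUCLEOTIDE, NUCLEOTIDE },
    { "blastp",  PROTEIN,    PROTEIN    },
    { "tblastn", PROTEIN,    NUCLEOTIDE },
    { "blastx",  NUCLEOTIDE, PROTEIN    },
};

/* Returns the table entry for `name`, or NULL if it is not one of the
 * four search types listed on the slide. */
const blast_program *find_blast_program(const char *name)
{
    assert(name != NULL);
    size_t n = sizeof BLAST_PROGRAMS / sizeof BLAST_PROGRAMS[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(BLAST_PROGRAMS[i].program, name) == 0)
            return &BLAST_PROGRAMS[i];
    return NULL;
}
```

A caller could use the database field to confirm that a database of the right kind has been prepared before starting a search.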

Why do we need BLAST to be parallelized?

Our Program: Parallel BLAST

Parallel BLAST (cont'd)

Formatdb.c

Nucleotide sequence database: “formatdb -i DATABASE -p F”.

Protein sequence database: “formatdb -i DATABASE -p T”.

Parallel BLAST (cont'd)

Linux_Cluster_BLASTALL.c

“blastall -p BLAST_SEARCH_TYPE -d DATABASE -i QUERY_FILE -o out.txt”
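The formatdb and blastall command lines shown on these slides can be assembled from their parts before being executed. A minimal sketch in C, assuming the commands are built as strings; the helper names build_formatdb_cmd and build_blastall_cmd are hypothetical and are not taken from Formatdb.c or Linux_Cluster_BLASTALL.c, whose contents the slides do not show.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "formatdb -i <database> -p T" (protein) or
 * "formatdb -i <database> -p F" (nucleotide) into buf. Returns the
 * snprintf result; a value >= size means the command was truncated. */
int build_formatdb_cmd(char *buf, size_t size,
                       const char *database, int is_protein)
{
    assert(buf != NULL && database != NULL && strlen(database) > 0);
    return snprintf(buf, size, "formatdb -i %s -p %c",
                    database, is_protein ? 'T' : 'F');
}

/* Hypothetical helper: writes
 *   blastall -p <program> -d <database> -i <query> -o <out>
 * into buf, with the same return convention as above. */
int build_blastall_cmd(char *buf, size_t size, const char *program,
                       const char *database, const char *query,
                       const char *out)
{
    assert(buf != NULL && program != NULL && database != NULL);
    assert(query != NULL && out != NULL && strlen(program) > 0);
    return snprintf(buf, size, "blastall -p %s -d %s -i %s -o %s",
                    program, database, query, out);
}
```

For example, build_blastall_cmd(cmd, sizeof cmd, "blastn", "yeast.nt", "query.fa", "out.txt") would produce "blastall -p blastn -d yeast.nt -i query.fa -o out.txt"; the file names here are placeholders. File names containing spaces or shell metacharacters would need quoting before the string is handed to a shell.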

Results: average of running 1000 queries, 1000 times.

[Chart: Nucleotide-Nucleotide. X axis: Database (size): month.htgs (573 MB), drosoph.nt (118.6 MB), igseqnt (67.5 MB), Yeastnt (3.2 MB), mito.nt (3.2 MB), Pdbnt (1.7 MB). Y axis: Time (s), from 0 to 1.8. Series: 1 Node; 3 Nodes, query time; 3 Nodes, query and communication time.]

Results (cont'd): average of running 1000 queries, 1000 times.

[Chart: Amino acid-Amino acid. X axis: Database (size): env_nr (1.6 GB), nr (573 MB), SwissProt (160 MB), Pdbaa (20 MB), Yeast.aa (3.2 MB). Y axis: Time (s), from 0 to 90. Series: 1 Node, query time; 3 Nodes, query time; 3 Nodes, query and communication time.]

Results (cont'd): average of running 1000 queries, 1000 times.

[Chart: Amino acid-Nucleotide. X axis: Database (size): env_nr (1.6 GB), Swissprot (160 MB), nr (84.7 MB), Pdbaa (20.4 MB), yeast.aa (3.2 MB). Y axis: Time (s), from 0 to 90. Series: 1 Node, query time; 3 Nodes, query time only; 3 Nodes, query and communication time.]

Conclusion about Parallel BLAST

•Performance: better when using the cluster.

•Scalability: with more nodes, the time decreases.
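The result charts separate the cluster's query time from its communication time, so the gain over a single node can be expressed as a speedup ratio. A minimal sketch in C, assuming the cluster's total time is the sum of the two; the function name cluster_speedup is hypothetical, and any numbers passed to it in the note that follows are made-up inputs, not measurements from the charts.

```c
#include <assert.h>

/* Speedup of the cluster run over the single-node run.
 * t_one_node: time on 1 node; t_query: query time on the cluster;
 * t_comm: communication time on the cluster. All in seconds. */
double cluster_speedup(double t_one_node, double t_query, double t_comm)
{
    assert(t_one_node >= 0.0 && t_query >= 0.0 && t_comm >= 0.0);
    assert(t_query + t_comm > 0.0);
    return t_one_node / (t_query + t_comm);
}
```

A value above 1 means the cluster was faster. With illustrative inputs of 9 s on one node, 2 s of query time and 1 s of communication on the cluster, the ratio is 3; a ratio below 1 would mean communication cost outweighed the parallel gain.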

Sequence Alignment

Comparing sequences

Sequence Alignment

•Introduction.

•Sequence Alignment Benefits.

•Sequence Alignment Types.

Needleman-Wunsch Algorithm

Why do we need Sequence Alignment to be parallelized?

Parallel Sequence Alignment algorithm

Our Sequence Alignment Program

•Pairwise Alignment.

•Built using the Needleman-Wunsch algorithm.
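For reference, the Needleman-Wunsch recurrence for a pairwise global alignment score can be sketched as follows. This is a sequential, score-only sketch in C with a linear gap penalty; it is not the author's program and shows neither the traceback nor the parallel version. The function name nw_score and its parameter order are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int max3(int x, int y, int z)
{
    int m = x > y ? x : y;
    return m > z ? m : z;
}

/* Returns the optimal global alignment score of a and b using the
 * Needleman-Wunsch recurrence with a linear gap penalty.
 * Only two rows of the DP table are kept, so no traceback is produced. */
int nw_score(const char *a, const char *b, int match, int mismatch, int gap)
{
    assert(a != NULL && b != NULL);
    size_t n = strlen(a), m = strlen(b);

    int *prev = malloc((m + 1) * sizeof *prev);
    int *curr = malloc((m + 1) * sizeof *curr);
    assert(prev != NULL && curr != NULL);  /* sketch: allocation failure is fatal */

    /* Row 0: the empty prefix of a against b[0..j) costs j gaps. */
    for (size_t j = 0; j <= m; j++)
        prev[j] = (int)j * gap;

    for (size_t i = 1; i <= n; i++) {
        curr[0] = (int)i * gap;                /* a[0..i) against empty b */
        for (size_t j = 1; j <= m; j++) {
            int sub = a[i - 1] == b[j - 1] ? match : mismatch;
            curr[j] = max3(prev[j - 1] + sub,  /* align a[i-1] with b[j-1] */
                           prev[j] + gap,      /* gap in b */
                           curr[j - 1] + gap); /* gap in a */
        }
        int *tmp = prev; prev = curr; curr = tmp;  /* reuse the two rows */
    }

    int result = prev[m];  /* after the final swap, prev holds the last row */
    free(prev);
    free(curr);
    return result;
}
```

With match = 1, mismatch = 0 and gap = -1, nw_score("AAACGA", "AATGA", 1, 0, -1) returns 3: at most four letters can match because the second sequence contains a T that the first lacks, and the length difference forces at least one gap.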

Learned Skills.

• Using the Linux operating system (openSUSE 10.3).

• Programming in the C language.

• Cluster computers and how to build one.

• MPICH2 for message passing between nodes.

• LaTeX.

• Teamwork and helping each other.

• Presentation skills.

Thank you for your time.

Hisham Adel