Optimization and evaluation of parallel I/O in BIPS3D parallel irregular application Performance Modelling, Evaluation, and optimization of Parallel and Distributed Systems (PMEO-PDS 2007) Rosa Filgueira, David E. Singh, Florin Isaila, Jesús Carretero, Antonio G. Loureiro

Dec 19, 2015

Optimization and evaluation of parallel I/O in BIPS3D parallel irregular application

Performance Modelling, Evaluation, and optimization of Parallel and Distributed Systems (PMEO-PDS 2007)

Rosa Filgueira, David E. Singh, Florin Isaila, Jesús Carretero, Antonio G. Loureiro

Summary

I. Description of the problem

II. Main objectives

III. Parallel I/O storage

IV. Evaluation

V. Optimizing the I/O

VI. Conclusions

I. Description of the problem (I)

BIPS3D is a semiconductor device simulator based on finite element methods.

This work optimizes and evaluates parallel I/O for BIPS3D.

I. Description of the problem (II)

The mesh is divided into several sub-domains (METIS).

Each processor computes only on its local data.

The results are stored sequentially.

The sequential storage is an important bottleneck.

II. Main objectives (I)

Objectives:

Evaluation of the sequential I/O cost.

Implementing parallel I/O techniques.

Developing a method for selecting the most appropriate I/O technique based on the network type, mesh size and data set size.

Introducing a new data clustering technique called Interval Data Grouping (IDG).

II. Main objectives (II)

Several I/O configurations have been implemented and evaluated:

Sequential I/O over NFS.

Sequential I/O over PVFS.

Parallel I/O over PVFS (unoptimized).

Parallel I/O over PVFS with two-phase I/O.

Parallel I/O over PVFS with List I/O.

III. Parallel I/O

All processors write their local data to disk.

Each processor constructs a view over the file using the distribution provided by METIS.

[Figure: a file of 20 elements (1 to 20) split according to the METIS distributions for partitions 0 and 1. View over the file for processor 1: elements 1, 2, 4, 8, 9, 11, 12, 13, 14, 15. View over the file for processor 2: elements 3, 5, 6, 7, 10, 16, 17, 19, 20.]
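
As an illustration of what such a view amounts to, the sketch below collapses each processor's element list into contiguous (start, length) runs, which is roughly the information a noncontiguous file view or a List I/O request has to carry. The helper name `contiguous_runs` is hypothetical, and the two element lists are a reading of the figure on this slide, not data taken from the BIPS3D code.

```python
def contiguous_runs(indices):
    """Collapse element indices into sorted (start, length) runs."""
    runs = []
    for i in sorted(indices):
        if runs and runs[-1][0] + runs[-1][1] == i:
            # Element i directly follows the last run: extend it.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            # Gap in the file: start a new run.
            runs.append((i, 1))
    return runs


# Element lists as read from the figure (an assumption of this sketch).
view_p1 = [1, 2, 4, 8, 9, 11, 12, 13, 14, 15]
view_p2 = [3, 5, 6, 7, 10, 16, 17, 19, 20]

runs_p1 = contiguous_runs(view_p1)
runs_p2 = contiguous_runs(view_p2)
print(runs_p1)
print(runs_p2)
```

The more runs a view contains, the more small, scattered accesses the file system has to serve, which is the cost that the later slides try to reduce.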

IV. Evaluation (I)

We have run tests with:

Different networks (Myrinet and Fast Ethernet).

Different meshes.

            Mesh 1    Mesh 2    Mesh 3    Mesh 4
Nodes        47219     32888     73260    289650
Vertices    305120    210437    416950   2027885

IV. Evaluation (II)

[Figure: a mesh with nodes 1 to 5. Mesh with load 2: 1 1' 2 2' 3 3' 4 4' 5 5'. Mesh with load 3: 1 1' 1'' 2 2' 2'' 3 3' 3'' 4 4' 4'' 5 5' 5''.]

Using a parameter (Load) we increase the size of the mesh.

Note that this parameter also changes the grain size of the accesses.
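
A minimal sketch of how the Load parameter can be read from the figure, assuming each node is simply replicated Load times in consecutive positions. The function name `expand_with_load` and the prime-mark labels are illustrative only.

```python
def expand_with_load(nodes, load):
    """Replicate every node `load` times, keeping replicas adjacent.

    Replica k of node n is labelled with k prime marks, as in the figure.
    """
    return [f"{n}{chr(39) * k}" for n in nodes for k in range(load)]


mesh = [1, 2, 3, 4, 5]
load2 = expand_with_load(mesh, 2)
load3 = expand_with_load(mesh, 3)
print(load2)
print(load3)
```

Under this reading, each node contributes Load consecutive entries to the file, so a larger Load means larger contiguous accesses per node.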

IV. Evaluation: Myrinet

[Figure: results on Myrinet, with panels for two-phase I/O and List I/O.]

IV. Evaluation: Fast Ethernet

[Figure: results on Fast Ethernet for List I/O.]

IV. Evaluation: Decision tree

[Figure: decision tree for selecting the I/O technique. Visible thresholds: Nx = 70,000, Nld = 50, Nld = 90.]

V. Optimizing the I/O

We introduce a novel data grouping technique: Interval Data Grouping (IDG).

The goal of IDG is to group data for I/O so as to increase locality and reduce the disk write time.

V. Optimizing the I/O : Distribution of example mesh

[Figure: example mesh with nodes 0 to 7, each marked as local or shared. BIPS3D data distribution: processor 0 holds nodes 0 1 2 3 4 5 7, processor 1 holds nodes 1 2 4 5 6 7. METIS assignation: processor 0 gets nodes 0 1 3 5 7, processor 1 gets nodes 2 4 6.]

V. Optimizing the I/O : Distribution of example mesh (II)

The IDG algorithm has two stages:

Node classification: analyze the mesh structure and the METIS distribution to classify each mesh node as shared or local.

Disk access scheduler: local nodes are written by the processor they belong to. For each shared node, the most appropriate processor is chosen by looking at its previous and subsequent nodes.
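
The two stages can be sketched as follows. This is not the authors' implementation: the slides do not state the exact rule for shared nodes, so the tie-break used here (prefer the owner of an adjacent local node, otherwise follow the writer of the previous node, otherwise take the lowest-numbered holder) is an assumption. The `holders` map is a reading of the example mesh, and the names `classify_nodes` and `schedule_writes` are hypothetical.

```python
def classify_nodes(holders):
    """Stage 1: a node held by one processor is local, otherwise shared."""
    local = {n for n, procs in holders.items() if len(procs) == 1}
    shared = set(holders) - local
    return local, shared


def schedule_writes(holders):
    """Stage 2: pick one writer per node (assumed rule, see lead-in)."""
    local, shared = classify_nodes(holders)
    # Local nodes are written by the only processor that holds them.
    writer = {n: next(iter(holders[n])) for n in local}
    for n in sorted(shared):
        choice = None
        # Prefer the owner of an adjacent local node (previous, then next).
        for neighbour in (n - 1, n + 1):
            if neighbour in local and writer[neighbour] in holders[n]:
                choice = writer[neighbour]
                break
        # Otherwise follow the writer already chosen for the previous node.
        if choice is None and writer.get(n - 1) in holders[n]:
            choice = writer[n - 1]
        # Last resort: the lowest-numbered processor that holds the node.
        if choice is None:
            choice = min(holders[n])
        writer[n] = choice
    return writer


# Which processors hold each node of the example mesh (read from the slides).
holders = {0: {0}, 1: {0, 1}, 2: {0, 1}, 3: {0},
           4: {0, 1}, 5: {0, 1}, 6: {1}, 7: {0, 1}}

local_nodes, shared_nodes = classify_nodes(holders)
idg = schedule_writes(holders)
print(sorted(n for n, p in idg.items() if p == 0))
print(sorted(n for n, p in idg.items() if p == 1))
```

With this rule, processor 0 ends up writing one contiguous interval and processor 1 the remaining one, which is the kind of grouping IDG aims for.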

V. Optimizing the I/O : Distribution of example mesh (III)

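[Figure: the example mesh with nodes 0 to 7, each marked as local or shared. IDG distribution: processor 0 writes nodes 0 1 2 3 4, processor 1 writes nodes 5 6 7. For comparison, BIPS3D data distribution: processor 0 holds nodes 0 1 2 3 4 5 7, processor 1 holds nodes 1 2 4 5 6 7. METIS assignation: processor 0 gets nodes 0 1 3 5 7, processor 1 gets nodes 2 4 6.]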

V. Optimizing the I/O : evaluation (I)

We have combined IDG with List I/O for different meshes and different loads.

We have compared the performance of IDG with other strategies:

METIS: the original node distribution.

Random: each shared node is assigned to a partition randomly.

First Position: each shared node is assigned to the first partition among all those it belongs to.
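
To make the baseline strategies concrete, the sketch below applies them to the small example mesh and counts how many contiguous runs each assignment produces. The run count is only a rough locality proxy chosen for this illustration; the slides compare measured write times. Reading "first partition" as the lowest-numbered processor is an assumption, and all function names are hypothetical.

```python
import random

# Which processors hold each node of the example mesh (read from the slides).
holders = {0: {0}, 1: {0, 1}, 2: {0, 1}, 3: {0},
           4: {0, 1}, 5: {0, 1}, 6: {1}, 7: {0, 1}}

# METIS assignation of the example mesh, as shown on the slides.
metis = {0: 0, 1: 0, 3: 0, 5: 0, 7: 0, 2: 1, 4: 1, 6: 1}


def first_position(holders):
    """Assign each node to the lowest-numbered processor holding it."""
    return {n: min(procs) for n, procs in holders.items()}


def random_assignment(holders, seed=0):
    """Assign each node to a randomly chosen processor among its holders."""
    rng = random.Random(seed)
    return {n: rng.choice(sorted(procs)) for n, procs in holders.items()}


def count_runs(writer):
    """Total number of contiguous node runs over all processors."""
    # A node starts a new run when the previous node has another writer.
    return sum(1 for n in writer if writer.get(n - 1) != writer[n])


print("METIS:", count_runs(metis))
print("First Position:", count_runs(first_position(holders)))
print("Random:", count_runs(random_assignment(holders)))
```

Fewer runs means each processor issues fewer, larger requests, which is the effect a grouping strategy is expected to have on List I/O.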

V. Optimizing the I/O : evaluation (II)

VI. Conclusions

We have optimized and evaluated the parallel I/O operations of the BIPS3D simulator.

We have built a decision tree for choosing the best I/O configuration.

We have introduced a novel technique that exploits the replication of mesh nodes to schedule disk accesses. With this proposal, the performance of the parallel I/O operations is improved.