Multi-threaded Algorithm 3. Michael Tsai, 2011/6/17

Jan 07, 2016


yeriel

Page 1: Multi-threaded Algorithm 3

Multi-threaded Algorithm 3

Michael Tsai, 2011/6/17

Page 2: Multi-threaded Algorithm 3

Multithreaded matrix multiplication

P-SQUARE-MATRIX-MULTIPLY(A, B)
    n = A.rows
    let C be a new n x n matrix
    parallel for i = 1 to n
        parallel for j = 1 to n
            c_ij = 0
            for k = 1 to n
                c_ij = c_ij + a_ik * b_kj
    return C

Work: $T_1(n) = \Theta(n^3)$

Span: $T_\infty(n) = \Theta(\log n) + \Theta(\log n) + \Theta(n) = \Theta(n)$

Parallelism: $\frac{T_1(n)}{T_\infty(n)} = \frac{\Theta(n^3)}{\Theta(n)} = \Theta(n^2)$
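The two nested parallel loops can be sketched in Python as one independent task per output cell. This is only an illustration of the structure, not a tuned implementation: the function name `p_square_matrix_multiply`, the helper `compute_cell`, and the use of `ThreadPoolExecutor` are assumptions of this sketch, and CPython threads will not actually speed up this CPU-bound loop.

```python
from concurrent.futures import ThreadPoolExecutor


def p_square_matrix_multiply(A, B):
    """Multiply two n x n matrices given as lists of lists."""
    n = len(A)
    C = [[0] * n for _ in range(n)]

    def compute_cell(i, j):
        # Serial loop over k: this is the Theta(n) term in the span.
        total = 0
        for k in range(n):
            total += A[i][k] * B[k][j]
        C[i][j] = total  # each task writes a distinct cell, so no races

    with ThreadPoolExecutor() as pool:
        # The two nested "parallel for" loops become n * n independent tasks.
        futures = [pool.submit(compute_cell, i, j)
                   for i in range(n) for j in range(n)]
        for f in futures:
            f.result()  # wait for completion and re-raise worker errors
    return C
```

The serial loop over k is kept inside each task on purpose: parallelizing it naively would make several tasks update the same cell.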

Page 3: Multi-threaded Algorithm 3

Divide-and-conquer Multithreaded Algorithm for Matrix Multiplication (rather long-winded)

[Figure: result matrix C and temporary matrix T]

Page 4: Multi-threaded Algorithm 3

P-MATRIX-MULTIPLY-RECURSIVE(C, A, B)
    n = A.rows
    if n == 1
        c_11 = a_11 * b_11
    else let T be a new n x n matrix
        partition A, B, C, and T into n/2 x n/2 submatrices
            (A_11, A_12, A_21, A_22, and likewise for B, C, and T)
        spawn P-MATRIX-MULTIPLY-RECURSIVE(C_11, A_11, B_11)
        spawn P-MATRIX-MULTIPLY-RECURSIVE(C_12, A_11, B_12)
        spawn P-MATRIX-MULTIPLY-RECURSIVE(C_21, A_21, B_11)
        spawn P-MATRIX-MULTIPLY-RECURSIVE(C_22, A_21, B_12)
        spawn P-MATRIX-MULTIPLY-RECURSIVE(T_11, A_12, B_21)
        spawn P-MATRIX-MULTIPLY-RECURSIVE(T_12, A_12, B_22)
        spawn P-MATRIX-MULTIPLY-RECURSIVE(T_21, A_22, B_21)
        P-MATRIX-MULTIPLY-RECURSIVE(T_22, A_22, B_22)
        sync
        parallel for i = 1 to n
            parallel for j = 1 to n
                c_ij = c_ij + t_ij

Work: $M_1(n) = 8 M_1(n/2) + \Theta(n^2) = \Theta(n^3)$

Span: $M_\infty(n) = M_\infty(n/2) + \Theta(\log n) + \Theta(\log n) = \Theta(\log^2 n)$

Parallelism: $\frac{M_1(n)}{M_\infty(n)} = \frac{\Theta(n^3)}{\Theta(\log^2 n)} = \Theta\left(\frac{n^3}{\log^2 n}\right)$
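The recursive scheme (eight half-size products, four written into C and four into a temporary T, followed by C = C + T) can be sketched serially in Python. The names `multiply_into`, `sub`, and `matrix_multiply`, and the `(matrix, row_offset, col_offset)` view tuples, are assumptions of this sketch; n is assumed to be a power of two, and the calls that the pseudocode spawns simply run one after another here.

```python
def multiply_into(C, A, B, n):
    """Set the n x n view C to A * B. Each view is (matrix, row_off, col_off)."""
    (Cm, cr, cc), (Am, ar, ac), (Bm, br, bc) = C, A, B
    if n == 1:
        Cm[cr][cc] = Am[ar][ac] * Bm[br][bc]
        return
    h = n // 2
    T = [[0] * n for _ in range(n)]  # temporary matrix for the second products
    Tv = (T, 0, 0)

    def sub(view, i, j):
        # Quadrant (i, j) of a view, with i, j in {0, 1}.
        m, r, c = view
        return (m, r + i * h, c + j * h)

    # Eight recursive products; the pseudocode spawns the first seven.
    for i in range(2):
        for j in range(2):
            multiply_into(sub(C, i, j), sub(A, i, 0), sub(B, 0, j), h)
            multiply_into(sub(Tv, i, j), sub(A, i, 1), sub(B, 1, j), h)
    # After the sync: C = C + T (the nested parallel for loops).
    for i in range(n):
        for j in range(n):
            Cm[cr + i][cc + j] += T[i][j]


def matrix_multiply(A, B):
    """Convenience wrapper: allocate C and multiply whole matrices."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    multiply_into((C, 0, 0), (A, 0, 0), (B, 0, 0), n)
    return C
```

Passing offsets instead of copying submatrices keeps the partition step at constant cost, which is what the recurrence for the work assumes.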

Page 5: Multi-threaded Algorithm 3

How about Strassen's method? Reading assignment: p. 795-796.

Parallelism: $\Theta(n^{\log_2 7} / \log^2 n)$, slightly less than the original recursive version!

Page 6: Multi-threaded Algorithm 3

Multithreaded Merge Sort

MERGE-SORT'(A, p, r)
    if p < r
        q = floor((p + r) / 2)
        spawn MERGE-SORT'(A, p, q)
        MERGE-SORT'(A, q + 1, r)
        sync
        MERGE(A, p, q, r)        // serial merge, Θ(n)

A: Array to be sorted
p and r: Start and end index of the range to be sorted

Work: $MS'_1(n) = 2 MS'_1(n/2) + \Theta(n) = \Theta(n \log n)$

Span: $MS'_\infty(n) = MS'_\infty(n/2) + \Theta(n) = \Theta(n)$

Parallelism: $\frac{MS'_1(n)}{MS'_\infty(n)} = \frac{\Theta(n \log n)}{\Theta(n)} = \Theta(\log n)$

MERGE is the bottleneck!
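As a rough illustration of spawn and sync, the merge sort above can be sketched in Python with one thread per spawned call. Starting a thread stands in for spawn and `join` for sync. The names `merge_sort_prime` and `merge`, the 0-based inclusive indices, and the thread-per-spawn strategy are assumptions of this sketch; a real runtime would schedule spawned work on a fixed pool of workers instead.

```python
import threading


def merge(A, p, q, r):
    """Serial merge of the sorted runs A[p..q] and A[q+1..r] (inclusive)."""
    left, right = A[p:q + 1], A[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            A[k] = left[i]
            i += 1
        else:
            A[k] = right[j]
            j += 1


def merge_sort_prime(A, p, r):
    """Sort A[p..r] in place; the two halves are sorted concurrently."""
    if p < r:
        q = (p + r) // 2
        child = threading.Thread(target=merge_sort_prime, args=(A, p, q))
        child.start()                   # spawn
        merge_sort_prime(A, q + 1, r)   # continue in the current thread
        child.join()                    # sync
        merge(A, p, q, r)               # serial Theta(n) step on the critical path
```

Every level of the recursion still waits for a serial merge, which is why the span stays linear.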

Page 7: Multi-threaded Algorithm 3

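Solve Merge with divide and conquer too! (This makes it easy to hand the pieces to different threads.)

1. Pick the median x of T[p1..r1], at index q1.
2. Find the position q2 in T[p2..r2] such that every element of T[p2..q2-1] is < x and every element of T[q2..r2] is >= x.
3. Copy x to its new place A[q3] (determined by the positions of x and q2: q3 = p3 + (q1 - p1) + (q2 - p2)).
4. Spawn a thread to merge T[p1..q1-1] with T[p2..q2-1], and also merge the two segments T[q1+1..r1] and T[q2..r2].

Base case: T[p1..r1] and T[p2..r2] are both empty.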

Page 8: Multi-threaded Algorithm 3

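Multithreaded Merge

P-MERGE(T, p1, r1, p2, r2, A, p3)
    n1 = r1 - p1 + 1
    n2 = r2 - p2 + 1
    if n1 < n2                    // ensure that n1 >= n2
        exchange p1 with p2
        exchange r1 with r2
        exchange n1 with n2
    if n1 == 0                    // both ranges empty?
        return
    else
        q1 = floor((p1 + r1) / 2)
        q2 = BINARY-SEARCH(T[q1], T, p2, r2)
        q3 = p3 + (q1 - p1) + (q2 - p2)
        A[q3] = T[q1]
        spawn P-MERGE(T, p1, q1 - 1, p2, q2 - 1, A, p3)
        P-MERGE(T, q1 + 1, r1, q2, r2, A, q3 + 1)
        sync

T: Array to be merged
A: Array to save the merged result
p1, r1, p2, r2: Start and end index of the ranges to be merged
q3: The index of the median in A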
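A minimal serial Python sketch of the divide-and-conquer merge, assuming the textbook form of P-MERGE with 0-based inclusive indices. The function name `p_merge` and the use of `bisect_left` as the binary search are choices made for this sketch; the call marked as spawned runs serially here.

```python
from bisect import bisect_left


def p_merge(T, p1, r1, p2, r2, A, p3):
    """Merge sorted T[p1..r1] and T[p2..r2] (inclusive) into A starting at p3."""
    n1 = r1 - p1 + 1
    n2 = r2 - p2 + 1
    if n1 < n2:
        # Make the first range the larger one.
        p1, p2 = p2, p1
        r1, r2 = r2, r1
        n1, n2 = n2, n1
    if n1 == 0:
        return  # both ranges are empty
    q1 = (p1 + r1) // 2  # median of the larger range
    # First index in T[p2..r2] whose element is >= T[q1] (r2 + 1 if none).
    q2 = bisect_left(T, T[q1], p2, r2 + 1)
    q3 = p3 + (q1 - p1) + (q2 - p2)  # final position of the median in A
    A[q3] = T[q1]
    p_merge(T, p1, q1 - 1, p2, q2 - 1, A, p3)    # spawned in the pseudocode
    p_merge(T, q1 + 1, r1, q2, r2, A, q3 + 1)
    # sync would go here in a parallel version
```

For example, merging the two sorted halves of `[1, 3, 5, 7, 2, 4, 6, 8]` with `p_merge(T, 0, 3, 4, 7, A, 0)` fills an 8-element `A` in sorted order.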

Page 9: Multi-threaded Algorithm 3

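In the worst case, x is larger than every element of T[p2..r2], so one recursive call has to merge $\lfloor n_1/2 \rfloor + n_2 \le 3n/4$ elements, where $n_1 \ge n_2$ and $n = n_1 + n_2$.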
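The worst-case size of a recursive merge can be bounded in a few steps. The symbols $n_1$ and $n_2$ (the lengths of the two ranges, with $n_1 \ge n_2$ after the exchange step) and $n = n_1 + n_2$ follow the textbook's convention and are assumptions here. Since $n_1 \ge n_2$, we have $n_2 \le n/2$, and therefore:

```latex
\begin{align*}
\lfloor n_1/2 \rfloor + n_2
  &\le \frac{n_1}{2} + \frac{n_2}{2} + \frac{n_2}{2} \\
  &= \frac{n_1 + n_2}{2} + \frac{n_2}{2} \\
  &\le \frac{n}{2} + \frac{n}{4} = \frac{3n}{4}
\end{align*}
```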

Page 10: Multi-threaded Algorithm 3

Multithreaded Merge: analysis of P-MERGE(T, p1, r1, p2, r2, A, p3)

Span: the two recursive calls run in parallel, the larger one handles at most 3n/4 elements, and BINARY-SEARCH takes $\Theta(\log n)$:

$PM_\infty(n) = PM_\infty(3n/4) + \Theta(\log n) = \Theta(\log^2 n)$

Work:

$PM_1(n) = \Omega(n)$, because at least n elements must be copied

$PM_1(n) = O(n)$, see textbook p. 802

$\Rightarrow PM_1(n) = \Theta(n)$
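One way to see why the span recurrence solves to $\Theta(\log^2 n)$ is to unroll it. This is a sketch: the constants $c$ and $c'$ bounding the binary-search cost from above and below, and $k = \log_{4/3} n$ for the recursion depth, are names introduced only for this derivation.

```latex
\begin{align*}
PM_\infty(n) &= PM_\infty(3n/4) + \Theta(\log n) \\
PM_\infty(n) &\le \sum_{i=0}^{k} c \log\!\left((3/4)^i n\right)
   \le (k + 1)\, c \log n = O(\log^2 n) \\
PM_\infty(n) &\ge \sum_{i=0}^{\lfloor k/2 \rfloor} c' \log\!\left((3/4)^i n\right)
   \ge \left(\lfloor k/2 \rfloor + 1\right) \frac{c'}{2} \log n = \Omega(\log^2 n)
\end{align*}
```

The lower bound uses the fact that for $i \le k/2$ the subproblem size $(3/4)^i n$ is still at least $\sqrt{n}$, so each of those levels costs at least $\frac{c'}{2} \log n$.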

Page 11: Multi-threaded Algorithm 3

P-MERGE-SORT(A, p, r, B, s)
    n = r - p + 1
    if n == 1
        B[s] = A[p]
    else let T[1..n] be a new array
        q = floor((p + r) / 2)
        q' = q - p + 1
        spawn P-MERGE-SORT(A, p, q, T, 1)
        P-MERGE-SORT(A, q + 1, r, T, q' + 1)
        sync
        P-MERGE(T, 1, q', q' + 1, n, B, s)

A: Array to be sorted
p, r: Start and end index of the range to be sorted
B: Array to save the result
s: Start index of the result in B

Work: $PMS_1(n) = 2 PMS_1(n/2) + PM_1(n) = 2 PMS_1(n/2) + \Theta(n) = \Theta(n \log n)$

Span: $PMS_\infty(n) = PMS_\infty(n/2) + PM_\infty(n) = PMS_\infty(n/2) + \Theta(\log^2 n) = \Theta(\log^3 n)$

Parallelism: $\frac{PMS_1(n)}{PMS_\infty(n)} = \frac{\Theta(n \log n)}{\Theta(\log^3 n)} = \Theta\left(\frac{n}{\log^2 n}\right)$. Much better now!
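Putting the pieces together, a serial Python sketch of the full sort might look as follows. It assumes the textbook form of P-MERGE and uses 0-based inclusive indices, so the temporary array is `T[0..n-1]` rather than `T[1..n]`. The names `p_merge_sort`, `p_merge`, and `q_prime`, and the guard for an empty range, are additions of this sketch; calls the pseudocode spawns run serially here.

```python
from bisect import bisect_left


def p_merge(T, p1, r1, p2, r2, A, p3):
    """Merge sorted T[p1..r1] and T[p2..r2] (inclusive) into A starting at p3."""
    n1, n2 = r1 - p1 + 1, r2 - p2 + 1
    if n1 < n2:
        p1, p2, r1, r2, n1, n2 = p2, p1, r2, r1, n2, n1
    if n1 == 0:
        return
    q1 = (p1 + r1) // 2
    q2 = bisect_left(T, T[q1], p2, r2 + 1)
    q3 = p3 + (q1 - p1) + (q2 - p2)
    A[q3] = T[q1]
    p_merge(T, p1, q1 - 1, p2, q2 - 1, A, p3)    # spawned in the pseudocode
    p_merge(T, q1 + 1, r1, q2, r2, A, q3 + 1)


def p_merge_sort(A, p, r, B, s):
    """Sort A[p..r] (inclusive) into B[s..s + (r - p)]."""
    n = r - p + 1
    if n <= 0:
        return  # guard added in this sketch; the pseudocode assumes n >= 1
    if n == 1:
        B[s] = A[p]
    else:
        T = [None] * n
        q = (p + r) // 2
        q_prime = q - p + 1                      # size of the left half
        p_merge_sort(A, p, q, T, 0)              # spawned in the pseudocode
        p_merge_sort(A, q + 1, r, T, q_prime)
        # sync, then merge the two sorted halves of T into B
        p_merge(T, 0, q_prime - 1, q_prime, n - 1, B, s)
```

Unlike the in-place version, the result is written to a separate array B, which is what lets the merge step itself be divided among threads.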

Page 12: Multi-threaded Algorithm 3

How to learn algorithms on your own

Very important.

This course cannot cover every "important" algorithm, but I hope that along the way you have built up the ability to learn algorithms by yourself.

You may need it when preparing for graduate school entrance exams.

Page 13: Multi-threaded Algorithm 3

My experience

Over the past year I have also been learning data structures and algorithms along with everyone. Some good methods (or so I believe) to share with you:

1. Draw pictures; illustrate the steps with simple examples.
2. Understand the concept (the big picture) first; do not rush into the complicated math. (As I often say: if you have not understood anything else, understand this first.)
3. Understand one small part at a time (modular learning); do not rush to understand every part.
4. Believe that you can figure it out (very important).
5. Not sure whether you really understand? Try some odd examples (boundary cases).
6. Practice reading the textbook on your own (very important).

Page 14: Multi-threaded Algorithm 3

Final exam: content & format

Closed book, 2 x A4 cheat sheets (double-sided)
Counts for 30% of the semester grade
Covers the lecture content of the whole semester, with emphasis on material after the midterm
180 minutes
Question types similar to the midterm (true/false + multiple choice + explanation)
Extra office hour??

Page 15: Multi-threaded Algorithm 3

Farewell, the first class of students in my teaching career~

I hope to see you again in elective courses / project courses~

I'll miss you

Wishing you all a not-too-exhausting finals week + a pleasant summer vacation