Integrating Hadoop with Lustre
1. MapReduce over Lustre: report by David Luan, Simon Huang, GaoShengGong, 2008.10~2009.6
2. Outline
- Platform design & improvement
- Test cases, test process design
- Related work (GFS-like redundancy)
3. Early research, analysis
- HDFS and Lustre overall benchmark tests
- WebDAV (an indirect way to mount HDFS)
- Three kinds of Hadoop I/O
- Shortcomings & bottlenecks
4. Early research, analysis
- Overall benchmark tests
5. Early research, analysis
- MapReduce flow: split the input into key-value pairs and call Map on each pair; every Map produces a new set of K-V pairs. After the sort, Reduce(K, V[]) is called once for each distinct key, producing one K-V pair per key; the output is again a set of key-value pairs.
6. Early research, analysis
- Hadoop I/O phases: Map Read → Local Read/Write → HTTP → Reduce Write
7. Early research, analysis
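The map → sort → reduce flow above can be sketched as a toy in-memory word count in plain Java (illustrative only; this is not the Hadoop API):

```java
import java.util.*;
import java.util.stream.*;

public class MiniMapReduce {
    public static void main(String[] args) {
        List<String> input = List.of("the quick brown fox", "the lazy dog");

        // Map phase: split the input into (word, 1) key-value pairs.
        List<Map.Entry<String, Integer>> mapped = input.stream()
            .flatMap(line -> Arrays.stream(line.split("\\s+")))
            .map(w -> Map.entry(w, 1))
            .collect(Collectors.toList());

        // Sort phase: group values under each distinct key (TreeMap keeps keys sorted).
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (var e : mapped)
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());

        // Reduce phase: one output K-V pair per distinct key.
        grouped.forEach((word, counts) ->
            System.out.println(word + "\t" + counts.stream().mapToInt(Integer::intValue).sum()));
    }
}
```

In real Hadoop the three phases run on different nodes, which is exactly why the I/O phases listed above (local spills, HTTP shuffle) exist at all.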
- Compute/storage tightly coupled
- Applications limited (jobs are difficult to split)
- Compute/storage loosely coupled
Platform comparison
8. Early research, analysis
- Not general-purpose (designed for MapReduce)
Shortcomings comparison
9. Outline
- Platform design & improvement
- Test cases, test process design
- Related work (GFS-like redundancy)
10. Platform design & improvement
- Java wrapper for liblustre (without the Lustre client)
- Design a method to merge the two systems: implement Hadoop's FileSystem interface with the Java wrapper, so MapReduce can run without the Lustre client.
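A minimal sketch of the idea. The stand-in `SimpleFileSystem` interface below is hypothetical (the real `org.apache.hadoop.fs.FileSystem` abstract class has many more methods), and the implementation here is backed by an ordinary mounted path rather than the JNI liblustre wrapper the deck proposes:

```java
import java.io.*;
import java.nio.file.*;

// Hypothetical, much-reduced stand-in for the Hadoop FileSystem contract.
interface SimpleFileSystem {
    InputStream open(String path) throws IOException;
    OutputStream create(String path) throws IOException;
    boolean delete(String path) throws IOException;
}

// Backed here by a POSIX mount point; the deck's design would instead
// route these calls through a JNI wrapper over liblustre.
class LustreMountFileSystem implements SimpleFileSystem {
    private final Path root;
    LustreMountFileSystem(String mountPoint) { this.root = Paths.get(mountPoint); }
    public InputStream open(String path) throws IOException {
        return Files.newInputStream(root.resolve(path));
    }
    public OutputStream create(String path) throws IOException {
        Path p = root.resolve(path);
        Files.createDirectories(p.getParent());
        return Files.newOutputStream(p);
    }
    public boolean delete(String path) throws IOException {
        return Files.deleteIfExists(root.resolve(path));
    }
}

public class FsDemo {
    public static void main(String[] args) throws IOException {
        SimpleFileSystem fs = new LustreMountFileSystem(
            Files.createTempDirectory("lustre-demo").toString());
        try (OutputStream out = fs.create("dir/hello.txt")) {
            out.write("hello".getBytes());
        }
        try (InputStream in = fs.open("dir/hello.txt")) {
            System.out.println(new String(in.readAllBytes()));
        }
    }
}
```

Once such an implementation is registered, MapReduce tasks read and write through the interface without caring which filesystem sits underneath.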
11. Platform design & improvement
- The Java wrapper hit an impasse -_-
- JNI calls into liblustre.so fail:
- Java JNI mis-links functions whose names collide with system calls (such as mount, read, write, etc.)
- If we use C to call the static library (liblustre.a) and compile to an executable, it works fine.
- liblustre's other problems
- The Lustre wiki recommends against using liblustre
- When it must be used, use liblustre.a instead of liblustre.so
- liblustre depends on the gcc version
12. Platform design & improvement
- Advantages for each task (with Lustre)
- Great for non-splittable jobs
Platform design (1): advantages
13. Platform design & improvement
- Platform design (2): modules
14. Platform design & improvement
- Platform design (3): read/write
15. Platform design & improvement
- Use hard links instead of the HTTP shuffle before the ReduceTask starts [1]
- decentralizes network bandwidth usage
- delays the ReduceTask's actual read/write
- Use Lustre block location info to distribute tasks [2]
- move the computation to its data
- Use a Java child thread to run a shell command that fetches the location info (details in the white paper)
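The hard-link trick works because map output and reduce input live on the same shared Lustre filesystem. A sketch with `java.nio` (file names here are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.*;

public class HardlinkShuffle {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("shuffle");

        // Stand-in for a MapTask's intermediate output on shared storage.
        Path mapOutput = dir.resolve("map_0.out");
        Files.writeString(mapOutput, "key1\t3\nkey2\t5\n");

        // Instead of copying the data over HTTP, the ReduceTask just links it:
        // same inode, no data movement, and the actual read is deferred
        // until the reduce really runs.
        Path reduceInput = dir.resolve("reduce_input_0");
        Files.createLink(reduceInput, mapOutput);

        System.out.println(Files.readString(reduceInput));
    }
}
```

On a local filesystem like HDFS's spill directories this would not help, because the two files would sit on different nodes; the shared mount is what makes the link legal.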
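Fetching the location info from a child thread might look like the sketch below. The real command would be Lustre's `lfs getstripe <file>`; here a placeholder `echo` stands in (with made-up output) so the sketch runs anywhere:

```java
import java.io.*;
import java.util.concurrent.*;

public class LocationFetcher {
    // Runs an external command on a worker thread and returns its stdout.
    static Future<String> fetch(ExecutorService pool, String... cmd) {
        return pool.submit(() -> {
            Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                StringBuilder sb = new StringBuilder();
                for (String line; (line = r.readLine()) != null; )
                    sb.append(line).append('\n');
                p.waitFor();
                return sb.toString();
            }
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // A real deployment would run something like:
        //   lfs getstripe /mnt/lustre/input/part-0
        // and parse the OST indices out of the result.
        Future<String> out = fetch(pool, "echo", "obdidx: 0  objid: 1234");
        System.out.print(out.get());
        pool.shutdown();
    }
}
```

The scheduler can then prefer TaskTrackers on (or near) the OSS nodes that the stripe info names.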
Platform improvement (1)
16. Platform design & improvement
- Add location info as a scheduling parameter
- Use hard links to delay the shuffle phase
17. Outline
- Platform design & improvement
- Test cases, test process design
18.
- Test case design (two kinds of apps)
- Statistics apps (search, log processing, etc.)
- Fine-grained tasks (job → tasks)
- MapTask intermediate result is small
- Apps that are hard to split & highly complex
- Coarse-grained tasks (job → tasks)
- MapTask intermediate result is big
- Each task is compute-intensive
Test cases, test process design
19. Platform design & improvement
- Apps that are highly complex and hard to split
- intermediate result is big; each task is compute-intensive
20. Test cases, test process design
- Statistics app: WordCount
- This test reads text files and counts each word. Each output line contains a word and its count, separated by a tab.
- Hard-to-split app: BigMapOutput
- A map/reduce program that works on a very big non-splittable file; the map and reduce tasks simply read the input and output the same file, doing nothing else.
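Since BigMapOutput's "map" does no computation, the benchmark reduces to a streaming copy, sketched here without the Hadoop API (file contents are made up):

```java
import java.io.*;
import java.nio.file.*;

public class IdentityMap {
    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("big", ".in");
        Files.writeString(in, "record-1\nrecord-2\n");
        Path out = Files.createTempFile("big", ".out");

        // The "map" streams the input straight to the output, so the
        // benchmark measures pure I/O throughput, not computation.
        try (InputStream src = Files.newInputStream(in);
             OutputStream dst = Files.newOutputStream(out)) {
            src.transferTo(dst);
        }
        System.out.println(Files.readString(out));
    }
}
```

That is exactly why this workload stresses the Map Read and intermediate-write phases that the later tests measure.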
21. Test cases, test process design
- Map Read phase (the most time-consuming for Lustre)
- Local read/write and HTTP phase
22. Test cases, test process design
- Use hardlink and location info
23. Outline
- Platform design & improvement
- Test cases, test process design
- Related work (GFS-like redundancy)
24. Result analysis
25. Result analysis
- Test 1: WordCount with a big file
- process one big text file (6 GB)
- Reduce tasks = 0.95 (or 1.75) × 2 × 7 = 13
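The reduce count follows Hadoop's guideline: number of reduces ≈ 0.95 (one wave) or 1.75 (two waves) × nodes × reduce slots per node. With the cluster assumed here (7 nodes, 2 slots each):

```java
public class ReduceCount {
    public static void main(String[] args) {
        int nodes = 7, slotsPerNode = 2;
        // Hadoop guideline: 0.95 for one wave of reduces, 1.75 for two waves.
        int reduces = (int) (0.95 * nodes * slotsPerNode);
        System.out.println(reduces);  // 13
    }
}
```

With 0.95, all reduces launch at once and finish in a single wave as the maps complete.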
26. Result analysis
- Test 2: WordCount with many small files
- process a large number of small files (10,000)
27. Result analysis
- Test 3: BigMapOutput with one big file
- Result 3 (mapred.local.dir set to its default value)
28. Result analysis
- Test 4: BigMapOutput with hardlink
- Test 5: BigMapOutput with hardlink & location information
29. Result analysis
- Test 6: BigMapOutput Map Read phase
- Map Read is the most time-consuming part:
30. Result analysis
- Conclusion 1: Hadoop + HDFS: Map Read → Local Read/Write → HTTP → Reduce Write
31. Result analysis Conclus