An In-Depth Examination of Java I/O Performance and Possible Tuning Strategies Kai Xu [email protected] Hongfei Guo [email protected]
Jan 02, 2016
Outline
• Why bother? (Problems)
• Our goals
• Java I/O overview
• Test design
• Test results and analysis
• Conclusions
Why bother?
• Growing interest in using Java
• Much work has been done on Java performance evaluation
• But NOT on Java I/O
Our Goals
• Is Java I/O really bad (compared with C/C++)?
• How bad?
• Possible tuning strategies
• How well do they work?
An Overview of Java I/O Classes
• RandomAccessFile
• OutputStream
FileOutputStream
FilterOutputStream
– BufferedOutputStream
– DataOutputStream
ByteArrayOutputStream
• InputStream
FileInputStream
FilterInputStream
– BufferedInputStream
– DataInputStream
ByteArrayInputStream
Test Design
• Access patterns
Sequential write/read
Random write/read
• Data of interest
Elapsed time, CPU breakdown
• Comparison group: C/C++
Test Design (continued)
• Tests on basic Java I/O strategies
Test 1: The lowest level I/O
Test 2: Buffered I/O
Test 3: Direct buffering
Test 4: Operation size
Test 5: Java JNI
Test Setup
• Hardware configuration
CPU: Pentium III 667 MHz
Memory: 128 MB
Disk: 10 GB IDE
• Software configuration
OS: Red Hat 6.2
JVM: JDK 1.2.2
• Profiling tools: PerfAnal profiler, gprof profiler 2.9.5, time
Test 1: The lowest level Java I/O
• Test parameters:
buffer size : 0 Byte
operation size : 1 Byte
• Sequential Write/Read
• Random Write/Read
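The sequential case of Test 1 can be sketched as follows. This is a minimal illustration, not the original test program: the class and method names are hypothetical, the file size is reduced, and the code uses try-with-resources, which is newer than the JDK 1.2.2 used in the tests.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical micro-benchmark; names are illustrative, not from the original tests.
class RawByteIo {

    // Writes `size` bytes one at a time. FileOutputStream does no Java-level
    // buffering, so every write(int) is handed to the operating system.
    static void writeSequential(File file, long size) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            for (long i = 0; i < size; i++) {
                out.write((int) (i & 0xFF));
            }
        }
    }

    // Reads the file back one byte at a time and returns the number of bytes read.
    static long readSequential(File file) throws IOException {
        long count = 0;
        try (FileInputStream in = new FileInputStream(file)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("rawio", ".dat");
        file.deleteOnExit();
        long size = 1 << 20; // 1 MB keeps the demo short; the slides go up to 100 MB
        long start = System.currentTimeMillis();
        writeSequential(file, size);
        long read = readSequential(file);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(read + " bytes written and read in " + elapsed + " ms");
    }
}
```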
Sequential Write/Read
[Figure: elapsed time (s) vs. file size (0 to 100 MB); series: C, Java]
Breakdown -- File size: 100M
[Figure: time (s) by component (read, write, etc. CPU, other); series: C, Java]
Random Write/Read
[Figure: elapsed time (s) vs. file size (1 to 10 MB); series: C, Java]
Breakdown -- File size: 10M
[Figure: time (s) by component (read, write, seek, other CPU, waiting); series: C, Java]
Test 1 Analysis
• Java raw I/O: 200% slower
• Java system calls cost more
read: 224%, write: 158%
• Random access is similar
Test 2: Buffered I/O in Java
• Test parameters:
buffer size : 1024 Bytes
file size : 100 MB
• Sequential Write/Read
• Buffering strategies:
No buffering (FileInputStream/FileOutputStream)
BufferedInputStream/BufferedOutputStream
Direct buffering
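The three strategies can be sketched side by side for the write path. This is an illustration under assumptions: "direct buffering" is taken here to mean a byte array managed by the application and flushed with a bulk write (the slides do not define the term), and the class and method names are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical comparison of the three strategies; not the original test code.
class BufferingStrategies {
    static final int BUFFER_SIZE = 1024; // buffer size from the test parameters

    // Strategy 1: no buffering. Each write(int) goes to the file stream directly.
    static void writeUnbuffered(File file, int size) throws IOException {
        try (OutputStream out = new FileOutputStream(file)) {
            for (int i = 0; i < size; i++) {
                out.write(i & 0xFF);
            }
        }
    }

    // Strategy 2: the library's general-purpose buffered stream.
    static void writeBufferedStream(File file, int size) throws IOException {
        try (OutputStream out =
                 new BufferedOutputStream(new FileOutputStream(file), BUFFER_SIZE)) {
            for (int i = 0; i < size; i++) {
                out.write(i & 0xFF);
            }
        }
    }

    // Strategy 3: "direct buffering", assumed to mean a hand-managed byte array
    // that is written out in one bulk call whenever it fills up.
    static void writeDirectBuffer(File file, int size) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        int used = 0;
        try (FileOutputStream out = new FileOutputStream(file)) {
            for (int i = 0; i < size; i++) {
                buffer[used++] = (byte) i;
                if (used == buffer.length) {
                    out.write(buffer, 0, used);
                    used = 0;
                }
            }
            if (used > 0) {
                out.write(buffer, 0, used); // flush the partially filled buffer
            }
        }
    }
}
```

All three methods produce the same file contents; only the number of calls that reach the underlying stream differs.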
Buffering Strategies in Java
[Figure: Total, Seq Write, and Seq Read for each strategy; series: No Buffer, Buffered Stream, Direct Buffer]
CPU Breakdown
[Figure: time (s) by component (buffer.read, buffer.write, read, write, array copy); series: BufferedStream, Direct Buffer]
Test 2 Analysis
• Buffering improves I/O by reducing system calls
• Buffered stream: ~25%
• Direct buffering: ~40% (special purpose vs. general purpose)
• No buffering for random access
Test 3: Direct Buffering
• Test parameters:
file size : 100 MB
operation size : 1 Byte
• Sequential Write/Read
• Random Write/Read
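A sketch of the read side of Test 3: a reader that serves 1-byte operations from an application-managed buffer of configurable size and counts how often it has to refill from the file. The class `DirectBufferReader` and its `refills()` counter are hypothetical helpers for illustration, and "direct buffering" is again assumed to mean a hand-managed byte array.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper; not part of java.io and not the original test code.
class DirectBufferReader implements AutoCloseable {
    private final FileInputStream in;
    private final byte[] buffer;
    private int pos = 0;      // next unread position in the buffer
    private int limit = 0;    // number of valid bytes in the buffer
    private long refills = 0; // bulk reads issued against the file

    DirectBufferReader(File file, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.in = new FileInputStream(file);
        this.buffer = new byte[bufferSize];
    }

    // Returns the next byte (0..255), or -1 at end of file.
    int read() throws IOException {
        if (pos == limit) {
            limit = in.read(buffer, 0, buffer.length);
            refills++;
            pos = 0;
            if (limit <= 0) {
                limit = 0;
                return -1;
            }
        }
        return buffer[pos++] & 0xFF;
    }

    long refills() {
        return refills;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("directbuf", ".dat");
        file.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(new byte[1 << 20]); // 1 MB of zeros as sample data
        }
        for (int exp = 4; exp <= 12; exp += 4) { // buffer sizes 2^4, 2^8, 2^12 bytes
            try (DirectBufferReader r = new DirectBufferReader(file, 1 << exp)) {
                long bytes = 0;
                while (r.read() != -1) {
                    bytes++;
                }
                System.out.println((1 << exp) + " B buffer: " + bytes
                        + " bytes, " + r.refills() + " refills");
            }
        }
    }
}
```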
Sequential Write/Read
[Figure: elapsed time (s) vs. buffer size (2^x bytes, x = 0 to 12); series: C, Java]
Breakdown – Java
[Figure: time (s) vs. buffer size (16 B to 10 MB); components: write, read, memcpy, etc. CPU, other]
Breakdown – C
[Figure: time (s) vs. buffer size (16 B to 10 MB); components: write, read, memcpy, etc. CPU, wait]
Random Write/Read
[Figure: elapsed time (s) vs. buffer size (2^x bytes, x = 0 to 12); series: C, Java]
Breakdown – Java
[Figure: time (s) vs. buffer size (16 B to 10 MB); components: read, write, seek, etc. CPU, other]
Breakdown – C
[Figure: time (s) vs. buffer size (16 B to 10 MB); components: write, read, seek, etc. CPU, idle]
Test 3 Analysis
• Direct buffering improves I/O: ~50%
reducing system calls
slower than C/C++: ~300%
• Larger buffer? No big gain: Amdahl’s law!
• Does not help in random access
low hit ratio: less than 1%
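The low hit ratio can be checked with a back-of-the-envelope calculation. The sketch below is a simplified model, not a measurement: it assumes access offsets are uniformly random and that a single buffer holds one contiguous window of the file, so the expected hit ratio is simply buffer size divided by file size. The class name is hypothetical.

```java
// Hypothetical model of buffer hit ratio under uniformly random access.
class BufferHitRatio {

    // Expected fraction of accesses that fall inside the buffered window,
    // assuming uniform random offsets and one contiguous buffer.
    static double expectedHitRatio(long bufferBytes, long fileBytes) {
        return Math.min(1.0, (double) bufferBytes / (double) fileBytes);
    }

    public static void main(String[] args) {
        long fileBytes = 100L * 1024 * 1024; // 100 MB file from the test parameters
        long[] bufferSizes = {16, 1024, 4096, 1024 * 1024};
        for (long size : bufferSizes) {
            double percent = 100.0 * expectedHitRatio(size, fileBytes);
            System.out.println(size + " B buffer: " + percent + "% expected hits");
        }
    }
}
```

Under this model even a 1 MB buffer over a 100 MB file catches only about 1% of accesses, so nearly every operation still pays for a seek and a refill.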
Test 4: Operation Size
• Test parameters:
buffer size : 0 Byte
• Sequential Write/Read
• Random Write/Read
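Operation size can be illustrated with unbuffered streams that move a configurable number of bytes per call. This is a minimal sketch with hypothetical names, not the original test program; it returns the number of read calls so the effect of a larger operation size is visible.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical illustration of operation size with unbuffered streams.
class OperationSize {

    // Copies src to dst, transferring up to `opSize` bytes per read/write call.
    // Returns the number of read calls that delivered data.
    static long copy(File src, File dst, int opSize) throws IOException {
        if (opSize <= 0) {
            throw new IllegalArgumentException("opSize must be positive");
        }
        byte[] chunk = new byte[opSize];
        long calls = 0;
        try (FileInputStream in = new FileInputStream(src);
             FileOutputStream out = new FileOutputStream(dst)) {
            int n;
            while ((n = in.read(chunk, 0, chunk.length)) != -1) {
                out.write(chunk, 0, n);
                calls++;
            }
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        File src = File.createTempFile("opsize_src", ".dat");
        File dst = File.createTempFile("opsize_dst", ".dat");
        src.deleteOnExit();
        dst.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(src)) {
            out.write(new byte[1 << 20]); // 1 MB of sample data
        }
        for (int exp = 0; exp <= 12; exp += 4) { // operation sizes 2^0, 2^4, 2^8, 2^12
            long calls = copy(src, dst, 1 << exp);
            System.out.println((1 << exp) + " B per operation: " + calls + " read calls");
        }
    }
}
```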
Sequential Write/Read: 100M
[Figure: elapsed time (s) vs. operation size (2^x bytes, x = 0 to 12); series: C, Java]
Random Write/Read: 10M
[Figure: elapsed time (s) vs. operation size (2^x bytes, x = 0 to 10); series: C, Java]
Test 4 Analysis
• Increasing operation size helps: ~85%
reducing I/O system calls
comparable to C/C++
• Large operation size
– no big gain
• Random access is similar
Test 5: Java JNI
• Test parameters:
file size : 100 MB
buffer size : 4 KB
• Sequential Write/Read
Java JNI Buffering
[Figure: elapsed time (s) vs. operation size (2^x bytes, x = 0 to 10); series: JNI, Direct Buffer, C]
Breakdown – JNI buffering
[Figure: elapsed time (s) vs. operation size (1 B to 1 KB); components: jniread, jniwrite, etc. CPU, idle]
Test 5 Analysis
• I/O system calls are cheap (C/C++ level)
• But the cost of calling a native method is high
• Small operation size:
more native calls, comparable to Direct Buffering
• Large operation size:
fewer native calls, comparable to C/C++
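One way to read this analysis is as a call-count argument. The sketch below is a simplified model, not the original JNI code: it assumes each Java-level operation crosses into native code exactly once, and that the native side fills a 4 KB buffer and issues one system call per fill. The class and method names are hypothetical.

```java
// Hypothetical call-count model for JNI buffering; not a measurement.
class JniCallModel {

    // Integer division rounded up, for positive operands.
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    // Assumed: one native method call per Java-level operation.
    static long nativeCalls(long fileBytes, long opBytes) {
        return ceilDiv(fileBytes, opBytes);
    }

    // Assumed: one system call each time the native buffer fills.
    static long systemCalls(long fileBytes, long bufferBytes) {
        return ceilDiv(fileBytes, bufferBytes);
    }

    public static void main(String[] args) {
        long fileBytes = 100L * 1024 * 1024; // 100 MB file from the test parameters
        long bufferBytes = 4 * 1024;         // 4 KB buffer from the test parameters
        long[] opSizes = {1, 4, 16, 64, 256, 1024};
        for (long op : opSizes) {
            System.out.println(op + " B per operation: "
                    + nativeCalls(fileBytes, op) + " native calls, "
                    + systemCalls(fileBytes, bufferBytes) + " system calls");
        }
    }
}
```

In this model the system-call count stays fixed while the native-call count shrinks with operation size, which matches the trend described in the analysis.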
Conclusions
• Java raw I/O: 200% slower than C
• Buffering improves I/O
Reducing system calls
220% improvement vs. no buffer
But still 364% slower than C
• Random I/O
– no help with buffering?
– locality of access
Conclusions (continued)
• Increasing operation size helps
Comparable to C/C++
• JNI
System calls are cheap (C/C++ level)
Cost of calling a native method is high
Reducing the number of native calls: comparable to C/C++
Thank You…