Transcript
1. Mythbusting: Understanding How We Measure the Performance of
MongoDB. Alvin Richards, Senior Director of Performance Engineering,
MongoDB. #MongoDBWorld
2. Before we start: we are going to look at a lot of C++ kernel
code, Java benchmarks, JavaScript tests, and lots of charts
4. Benchmarking: Some common traps; Performance measurement &
diagnosis; What's next
5. Part One: Some Common Traps
6. "We all live in a house on fire, no fire department to call; no way
out, just the upstairs window to look out of while the fire burns
the house down with us trapped, locked in it." Tennessee Williams,
The Milk Train Doesn't Stop Here Anymore
7. long startTime = System.currentTimeMillis(); for (int
roundNum = 0; roundNum < numRounds; roundNum++) { for (int i =
0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new
BasicDBObject(); doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "";
doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal);
aDocs[i]=doc; } coll.insert(aDocs); numInserts +=
documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); }
long endTime = System.currentTimeMillis(); #1 Time taken to Insert
x Documents
12. long startTime = System.currentTimeMillis(); for (int
roundNum = 0; roundNum < numRounds; roundNum++) { for (int i =
0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new
BasicDBObject(); doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "";
doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal);
aDocs[i]=doc; } coll.insert(aDocs); numInserts +=
documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); }
long endTime = System.currentTimeMillis(); So that looks ok,
right?
13. long startTime = System.currentTimeMillis(); for (int
roundNum = 0; roundNum < numRounds; roundNum++) { for (int i =
0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new
BasicDBObject(); doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "";
doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal);
aDocs[i]=doc; } coll.insert(aDocs); numInserts +=
documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); }
long endTime = System.currentTimeMillis(); What else are you
measuring? Object creation and GC management?
14. long startTime = System.currentTimeMillis(); for (int
roundNum = 0; roundNum < numRounds; roundNum++) { for (int i =
0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new
BasicDBObject(); doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "";
doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal);
aDocs[i]=doc; } coll.insert(aDocs); numInserts +=
documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); }
long endTime = System.currentTimeMillis(); What else are you
measuring? Thread contention on nextInt()? Object creation and GC
management?
15. long startTime = System.currentTimeMillis(); for (int
roundNum = 0; roundNum < numRounds; roundNum++) { for (int i =
0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new
BasicDBObject(); doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "";
doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal);
aDocs[i]=doc; } coll.insert(aDocs); numInserts +=
documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); }
long endTime = System.currentTimeMillis(); What else are you
measuring? Time to synthesize data? Object creation and GC
management? Thread contention on nextInt()?
16. long startTime = System.currentTimeMillis(); for (int
roundNum = 0; roundNum < numRounds; roundNum++) { for (int i =
0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new
BasicDBObject(); doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "";
doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal);
aDocs[i]=doc; } coll.insert(aDocs); numInserts +=
documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); }
long endTime = System.currentTimeMillis(); What else are you
measuring? Object creation and GC management? Thread contention on
addAndGet()? Thread contention on nextInt()? Time to synthesize
data?
17. long startTime = System.currentTimeMillis(); for (int
roundNum = 0; roundNum < numRounds; roundNum++) { for (int i =
0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new
BasicDBObject(); doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "";
doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal);
aDocs[i]=doc; } coll.insert(aDocs); numInserts +=
documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); }
long endTime = System.currentTimeMillis(); What else are you
measuring? Object creation and GC management? Clock resolution?
Thread contention on nextInt()? Time to synthesize data? Thread
contention on addAndGet()?
18. // Pre Create the Object outside the Loop BasicDBObject[]
aDocs = new BasicDBObject[documentsPerInsert]; for (int i=0; i <
documentsPerInsert; i++) { BasicDBObject doc = new BasicDBObject();
String cVal = ""; doc.put("c",cVal); String padVal = "";
doc.put("pad",padVal); aDocs[i] = doc; } Solution: Pre-create the
objects. Pre-create non-varying data outside the timing loop.
Alternative: pre-create the data in a file; load from file.
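The idea of building the non-varying fields before the clock starts can be sketched with the JDK alone. This is a minimal illustration, not the benchmark from the talk: the class name PreCreateSketch, the Map-based documents, and the insertBatch callback standing in for coll.insert() are all assumptions made so the example runs without a MongoDB driver.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Sketch: keep object creation out of the timed region (names are hypothetical). */
final class PreCreateSketch {

    /** Builds the documents and their constant fields once, before any timing starts. */
    static List<Map<String, Object>> buildTemplates(int documentsPerInsert) {
        List<Map<String, Object>> docs = new ArrayList<>(documentsPerInsert);
        for (int i = 0; i < documentsPerInsert; i++) {
            Map<String, Object> doc = new HashMap<>();
            doc.put("c", "");
            doc.put("pad", "");
            docs.add(doc);
        }
        return docs;
    }

    /** Times only the per-round work and returns the elapsed nanoseconds. */
    static long timeRounds(List<Map<String, Object>> docs, int numRounds,
                           Consumer<List<Map<String, Object>>> insertBatch) {
        long id = 0;
        long start = System.nanoTime();
        for (int round = 0; round < numRounds; round++) {
            for (Map<String, Object> doc : docs) {
                doc.put("_id", ++id); // only the varying field changes inside the loop
            }
            insertBatch.accept(docs); // stand-in for coll.insert(aDocs)
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = buildTemplates(100);
        long[] inserted = {0};
        long elapsed = timeRounds(docs, 10, batch -> inserted[0] += batch.size());
        System.out.println(inserted[0] + " documents in " + elapsed + " ns");
    }
}
```

Reusing the same document objects across rounds also means the timed loop allocates far less, so garbage collection is less likely to land inside the measurement.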
19. // Use ThreadLocalRandom generator or an instance of
java.util.Random per thread ThreadLocalRandom rand =
ThreadLocalRandom.current(); for (long roundNum = 0; roundNum <
numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert;
i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k",
rand.nextInt(numMaxInserts)+1); } coll.insert(aDocs); numInserts +=
documentsPerInsert; } // Maintain count outside the loop
globalInserts.addAndGet(documentsPerInsert * numRounds); Solution:
Remove contention. Remove contention on nextInt() by making the
generator thread-local.
20. // Use ThreadLocalRandom generator or an instance of
java.util.Random per thread ThreadLocalRandom rand =
ThreadLocalRandom.current(); for (long roundNum = 0; roundNum <
numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert;
i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k",
rand.nextInt(numMaxInserts)+1); } coll.insert(aDocs); numInserts +=
documentsPerInsert; } // Maintain count outside the loop
globalInserts.addAndGet(documentsPerInsert * numRounds); Solution:
Remove contention. Remove contention on addAndGet(). Remove
contention on nextInt() by making the generator thread-local.
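Both contention fixes can be shown together in a runnable form. The sketch below is an illustration under assumptions: the class ContentionSketch, the int[] batch standing in for the documents, and the fixed bound of 1000 are invented for the example. Each thread uses its own ThreadLocalRandom and a plain local counter, and touches the shared AtomicLong exactly once at the end.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: per-thread random numbers and one shared-counter update per thread. */
final class ContentionSketch {

    /** Work done by one benchmark thread; returns how many documents it prepared. */
    static long runWorker(int numRounds, int documentsPerInsert, int numMaxInserts,
                          AtomicLong globalInserts) {
        // current() returns the calling thread's own generator, so no state is shared.
        ThreadLocalRandom rand = ThreadLocalRandom.current();
        int[] keys = new int[documentsPerInsert]; // stand-in for the "k" field of each doc
        long localInserts = 0;
        for (int round = 0; round < numRounds; round++) {
            for (int i = 0; i < documentsPerInsert; i++) {
                keys[i] = rand.nextInt(numMaxInserts) + 1;
            }
            localInserts += documentsPerInsert; // plain local counter, no contention
        }
        globalInserts.addAndGet(localInserts); // single shared update, after the loop
        return localInserts;
    }

    /** Runs the workers in parallel and returns the combined count. */
    static long runThreads(int threads, int numRounds, int documentsPerInsert) {
        AtomicLong globalInserts = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(
                () -> runWorker(numRounds, documentsPerInsert, 1000, globalInserts));
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return globalInserts.get();
    }

    public static void main(String[] args) {
        System.out.println("total inserts counted: " + runThreads(4, 1000, 100));
    }
}
```

Note that ThreadLocalRandom.current() is called inside the worker, on the thread that will use it; obtaining it on one thread and handing it to another would defeat the purpose.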
21. Before: long startTime = System.currentTimeMillis(); long
endTime = System.currentTimeMillis(); After: long startTime =
System.nanoTime(); long elapsedNanos = System.nanoTime() -
startTime; Solution: Timer resolution. nanoTime(): "resolution is
at least as good as that of currentTimeMillis()". currentTimeMillis():
"granularity of the value depends on the underlying operating system
and may be larger". Source:
http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
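A small helper makes the nanoTime() pattern harder to get wrong. This is a sketch only; the class ElapsedTimeSketch and the method timeNanos are hypothetical names. The point it illustrates is that nanoTime() values are meaningful only as a difference between two readings in the same JVM, never as a wall-clock timestamp.

```java
import java.util.concurrent.TimeUnit;

/** Sketch: measure elapsed time as a difference of two System.nanoTime() readings. */
final class ElapsedTimeSketch {

    /** Runs the given work once and returns the elapsed time in nanoseconds. */
    static long timeNanos(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return System.nanoTime() - start; // only the difference is meaningful
    }

    public static void main(String[] args) {
        long elapsed = timeNanos(() -> {
            try {
                Thread.sleep(5); // placeholder for the operation under test
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("elapsed: " + TimeUnit.NANOSECONDS.toMicros(elapsed) + " us");
    }
}
```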
22. General Principle #1: Know what you are measuring
23. BasicDBObject doc = new BasicDBObject(); doc.put("v", str);
// str is a 2k string for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new
BasicDBObject(); long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate); DBObject foundObj; while
(cur.hasNext()) { foundObj = cur.next(); } long endTime =
System.currentTimeMillis(); #2 Response time to return all
results
27. BasicDBObject doc = new BasicDBObject(); doc.put("v", str);
// str is a 2k string for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new
BasicDBObject(); long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate); DBObject foundObj; while
(cur.hasNext()) { foundObj = cur.next(); } long endTime =
System.currentTimeMillis(); So that looks ok, right?
28. BasicDBObject doc = new BasicDBObject(); doc.put("v", str);
// str is a 2k string for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new
BasicDBObject(); long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate); DBObject foundObj; while
(cur.hasNext()) { foundObj = cur.next(); } long endTime =
System.currentTimeMillis(); What else are you measuring? Each doc
is 4080 bytes on disk with powerOf2Sizes.
29. BasicDBObject doc = new BasicDBObject(); doc.put("v", str);
// str is a 2k string for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new
BasicDBObject(); long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate); DBObject foundObj; while
(cur.hasNext()) { foundObj = cur.next(); } long endTime =
System.currentTimeMillis(); What else are you measuring? Each doc
is 4080 bytes on disk with powerOf2Sizes. Unrestricted
predicate?
30. BasicDBObject doc = new BasicDBObject(); doc.put("v", str);
// str is a 2k string for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new
BasicDBObject(); long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate); DBObject foundObj; while
(cur.hasNext()) { foundObj = cur.next(); } long endTime =
System.currentTimeMillis(); What else are you measuring? Each doc
is 4080 bytes on disk with powerOf2Sizes. Unrestricted predicate?
Measuring: time to parse & execute query; time to retrieve all
documents. But also: cost of shipping ~4MB data through network
stack.
31. BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte",
20)); BasicDBObject projection = new BasicDBObject();
projection.put("_id", 1); long startTime =
System.currentTimeMillis(); DBCursor cur = coll.find(predicate,
projection ); DBObject foundObj; while (cur.hasNext()) { foundObj =
cur.next(); } long endTime = System.currentTimeMillis(); Solution:
Limit the projection. Return fixed range.
32. BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte",
20)); BasicDBObject projection = new BasicDBObject();
projection.put("_id", 1); long startTime =
System.currentTimeMillis(); DBCursor cur = coll.find(predicate,
projection ); DBObject foundObj; while (cur.hasNext()) { foundObj =
cur.next(); } long endTime = System.currentTimeMillis(); Solution:
Limit the projection. Only project _id. Return fixed range.
33. BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte",
20)); BasicDBObject projection = new BasicDBObject();
projection.put("_id", 1); long startTime =
System.currentTimeMillis(); DBCursor cur = coll.find(predicate,
projection ); DBObject foundObj; while (cur.hasNext()) { foundObj =
cur.next(); } long endTime = System.currentTimeMillis(); Solution:
Limit the projection. Only project _id. Only 46k transferred through
network stack. Return fixed range.
34. General Principle #2: Measure only what you need to
measure
35. Part Two: Performance measurement & diagnosis
36. "Every experiment destroys some of the knowledge of the
system which was obtained by previous experiments." Werner
Heisenberg, The Physical Principles of the Quantum Theory (1930)
37. Broad categories: Micro benchmarks; Workloads
38. Micro benchmarks: mongo-perf
39. mongo-perf: goals. Measure: commands. Configure: single mongod,
ReplSet size (1 -> n), sharding; single vs. multiple DB; O/S.
Characterize: throughput by thread count. Compare.
40. What do you get? Better
41. What do you get? Measured improvement between rc0 and rc2
Better
49. Ouch, where's the tree in the woods? 2.4.10 -> 2.6.0: 4495
git commits
50. git-bisect: Bisect between good/bad hashes. git-bisect
nominates a new githash. Build against githash. Re-run test. Confirm
if this githash is good/bad. Rinse and repeat.
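The loop above is a binary search over the commit range, which is why a few thousand commits need only around a dozen build-and-test cycles: log2(4495) is roughly 12. The sketch below illustrates that search on a simplified linear history where commits are integer indices. The class BisectSketch, the firstBad method, and the isBad predicate are hypothetical stand-ins for "build against githash and re-run the test". Real git history is a graph, so the step count git-bisect reports can differ.

```java
import java.util.function.IntPredicate;

/** Sketch: bisection over a linear range of commits identified by index. */
final class BisectSketch {

    /**
     * Returns the index of the first bad commit, given that knownGood is good,
     * knownBad is bad, and every commit after the first bad one is also bad.
     */
    static int firstBad(int knownGood, int knownBad, IntPredicate isBad) {
        int good = knownGood;
        int bad = knownBad;
        while (bad - good > 1) {
            int candidate = good + (bad - good) / 2; // the commit nominated for testing
            if (isBad.test(candidate)) {
                bad = candidate;  // regression is at or before the candidate
            } else {
                good = candidate; // regression is after the candidate
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        int[] builds = {0};
        // Pretend the regression landed at commit index 3000 out of 4495.
        int culprit = firstBad(0, 4495, commit -> {
            builds[0]++; // each predicate call models one build plus one test run
            return commit >= 3000;
        });
        System.out.println("first bad commit index: " + culprit
            + ", builds tested: " + builds[0]);
    }
}
```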
54. The problem with measurement. Observability: What can you
observe on the system? Effect: What effects can an observation
cause?
55. mtools
56. mtools: MongoDB log file analysis. Filter logs for
operations, events. Response time, lock durations. Plot.
https://github.com/rueckstiess/mtools
57. Response Times > 100ms, Bulk Insert, 2.6.0-rc0 (plot: Ops/Sec
vs. Time)
58. Response Times > 100ms, Bulk Insert, 2.6.0-rc0 vs.
2.6.0-rc2: Floor raised
59. Code Change Yielding Policy
60. Code Change
61. Response Times, Bulk Insert, 2.6.0 vs 2.6.1: Ceiling similar,
lower floor resulting in 40% improvement in throughput
62. Secondary effects of Yield policy change: Write lock time
reduced. Order of magnitude reduction of write lock duration.
63. Unexpected side effects of measurement? > db.serverStatus():
Yes, will cause a read lock to be acquired. >
db.serverStatus({recordStats:0}): No, lock is not acquired. >
mongostat: Yes, until SERVER-14008 is resolved; uses db.serverStatus().
64. CPU sampling. Get an impression of: call graphs; CPU time
spent on node and called nodes
66. Start the profiling: > mongodb dbpath (Note: Do not use fork)
> mongo > use admin > db.runCommand({_cpuProfilerStart:
{profileFilename: 'foo.prof'}}) Execute some commands that you want
to profile > db.runCommand({_cpuProfilerStop: 1})
67. Sample start vs. end of workload
69. Code change
70. Public Benchmarks. Not all forks are the same. YCSB:
https://github.com/achille/YCSB sysbench-mongodb:
https://github.com/mdcallag/sysbench-mongodb
71. Part Three: And next?
72. Beavis & Butthead "The future sucks. Change it." "I'm
way cool Beavis, but I cannot change the future."
73. What we are working on. mongo-perf: UI refactor; adding more
micro benchmarks. Workloads: adding external benchmarks; creating
benchmarks for common use cases (inbox fan in/out, analytical
dashboards, stream / feeds). Customers, Partners & Community.
74. Here's how you can help change the future! Got a great
workload? Great benchmark? Want to donate it?
[email protected]
75. Don't be that benchmark. #1 Know what you are measuring. #2
Measure only what you need to measure.
76. [email protected] Senior Director of Performance
Engineering, MongoDB Alvin Richards #MongoDBWorld Thank You