Building a high performance directory server in Java
Lessons learned and tips from the OpenDS project.
Ludovic Poitou
Matthew Swift
Sun Microsystems
#118
AGENDA
> Introduction to the OpenDS project
> Architecture, Design Patterns and Tips
> Experiences with Sun JVM
> Conclusion
The OpenDS Project
> Released in Open Source in July 2006
– CDDL
– Source code at https://opends.dev.java.net/
> Sponsored by Sun Microsystems
> Written in Java by LDAP experts
What is it ?
> OpenDS is effectively a Java based Server supporting the LDAPv3 protocol and services
– Object-oriented, hierarchical data model
– CRUD operations
> It comes with its own embedded database
– Based on Berkeley DB Java Edition
– Not accessible from outside
> It has all the security, access control and password management features needed to safely store information about users
What is it for ?
> Generic object oriented data store
> White pages and Email Address Book
> Mostly the data store for Identities
– For Authentication and Authorization
– For profiles and personalization
> The underlying infrastructure in all Enterprises
– Leveraged by Web and Mail infrastructure products
– Cornerstone of Identity Management products: Access Management and Federation; Provisioning and De-provisioning tools
Who for ?
> Telecom service providers, financial institutions use LDAP directories for customer related services (Portal)
– Storing customers identities, phones, services associated
– Building highly available services for 10 million, up to 200 million users
> But OpenDS can also be used as an OS naming service, or for SMB
– OpenSolaris, Solaris, Linux...
– Coupled with SAMBA, as a domain controller
– Integrated with Kerberos
– White Pages...
> And being 100% pure Java, OpenDS can be embedded in other Java applications or Web applications
– OpenSSO
OpenDS 2.2
> Released in December 2009
> LDAPv3 directory server fully standard compliant
– Supports many LDAP standard and experimental extensions
– Supports Multi Master Replication with 3 different levels of data consistency
– Extensive security features
> Improved performance and reliability over OpenDS 1.0
> Installs in 6 clicks and less than 3 minutes
> Several GUI and CLI tools to manage and monitor the OpenDS server
> Extensive documentation
> Localized in 6 different languages
Performance characteristics
> As for most servers, scalability is extremely important
– Up to hundreds of millions of entries
– Up to thousands of connections
– Maximize use of CPUs
> What is the operations throughput ?
> What is the average response time ? The maximum response time ?
> Our basic test
– 10 M entries, with an average size of 2.6K
– 2 servers, with Multi-Master replication between them
Searchrate on Sun x4170 box
Modrate on a Sun x4170 box
How to reach those results ?
> 2 main aspects
– Architecture and code
– Run-time : JVM and Garbage Collector optimization
> There is a strong relationship between code design and memory optimization
Architecture Overview
Patterns
> Use of Asynchronous I/O
– Except for disk write transactions
> Use of Immutable Objects
– Intrinsic thread safety
– Avoid need for defensive copies
> Use of “Factories” over Constructors
– Avoid creating an object
– Ease optimization for common cases
– Example: Most AttributeDescriptions have 0 options
– Example: Attributes generally have 1 value
– For immutable objects.
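As a minimal sketch of the two ideas above (an immutable value class created through a static factory that can skip allocation in the common case), consider the following. The class name `AttrDescription`, its fields and the unbounded cache are assumptions made for illustration; this is not the actual OpenDS AttributeDescription code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical immutable value class, for illustration only.
final class AttrDescription {
    // Shared instances for the common case: a description with 0 options.
    // Assumption: the set of attribute types is small enough to cache unbounded.
    private static final ConcurrentMap<String, AttrDescription> NO_OPTIONS =
            new ConcurrentHashMap<String, AttrDescription>();

    private final String attributeType;
    private final Set<String> options; // unmodifiable, never null

    // Private constructor: callers must go through the factory.
    private AttrDescription(String attributeType, Set<String> options) {
        this.attributeType = attributeType;
        this.options = options;
    }

    // Static factory: may return an existing instance instead of creating one.
    static AttrDescription valueOf(String attributeType, String... options) {
        String type = attributeType.toLowerCase(Locale.ROOT);
        if (options.length == 0) {
            AttrDescription cached = NO_OPTIONS.get(type);
            if (cached == null) {
                AttrDescription created =
                        new AttrDescription(type, Collections.<String>emptySet());
                cached = NO_OPTIONS.putIfAbsent(type, created);
                if (cached == null) {
                    cached = created;
                }
            }
            return cached;
        }
        // Less common case: copy the options once, then never mutate them.
        Set<String> copy = new LinkedHashSet<String>(Arrays.asList(options));
        return new AttrDescription(type, Collections.unmodifiableSet(copy));
    }

    String attributeType() {
        return attributeType;
    }

    // Safe to hand out without a defensive copy: the set is unmodifiable.
    Set<String> options() {
        return options;
    }
}
```

Because instances never change after construction, they can be shared between threads without locking, and `valueOf("cn")` can hand back the same object on every call.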
Patterns
> Producers / Consumers
– Queues
– Thread Pool
– Monitors
> Strategies
– Queue Strategies : ConcurrentLinkedQueue vs LinkedBlockingQueue
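The producer/consumer pattern above can be sketched with a blocking queue and a small pool of worker threads. The class name `WorkQueueSketch`, the shutdown marker and the counting task are assumptions for illustration; the real OpenDS work queue is more elaborate.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical work queue: producers enqueue operations, workers consume them.
final class WorkQueueSketch {
    // Marker object telling a worker to stop (one is queued per worker).
    private static final Runnable SHUTDOWN = () -> { };

    // Runs `tasks` trivial operations on `workers` threads and returns
    // how many operations completed.
    static int runAll(int workers, int tasks) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        AtomicInteger completed = new AtomicInteger();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    // take() blocks while the queue is empty.
                    for (Runnable op = queue.take(); op != SHUTDOWN; op = queue.take()) {
                        op.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        try {
            // Producer side: enqueue the operations, then one marker per worker.
            for (int i = 0; i < tasks; i++) {
                queue.put(() -> completed.incrementAndGet());
            }
            for (int i = 0; i < workers; i++) {
                queue.put(SHUTDOWN);
            }
            for (Thread t : pool) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }
}
```

Swapping `LinkedBlockingQueue` for another queue implementation changes only the line that creates the queue, which is one way to read the "Strategies" bullet: the queue type is a pluggable choice.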
Anti-Patterns
> String concatenation
– Make sure to use a StringBuilder
– The compiler now optimizes simple “Aaa” + “Bbb” concatenation
> Avoid very long methods (thousands of lines of code)
> Avoid exposing the concrete representation of an object
– Set vs LinkedHashSet.
– Not a performance issue, but will require more work when optimizing code for performance later
> Try to define only the methods you need.
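To illustrate the string concatenation point, here is a small sketch comparing `+=` in a loop with a single StringBuilder. The class and method names are hypothetical.

```java
// Hypothetical helper contrasting two ways of building a string in a loop.
final class ConcatSketch {
    // Each += creates a new String (and a temporary builder) per iteration.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part + ",";
        }
        return result;
    }

    // One StringBuilder is reused for the whole loop, producing less garbage.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part).append(',');
        }
        return sb.toString();
    }
}
```

Both methods return the same text; the difference is the number of intermediate objects the garbage collector has to deal with.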
Java Collections
> Vector and Hashtable are synchronized for all methods
– Pay the price even if not necessary
> Some Java collection classes are not synchronized by default
– ArrayList, LinkedList replace Vector
– HashSet, HashMap replace Hashtable
> To synchronize, wrap in a class
> With a “static factory”
– Collections.synchronizedList(new ArrayList())
> ConcurrentHashMap, for concurrency
– But watch when using the iterator
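A short sketch of the two cautions above, with hypothetical class and method names: a list wrapped by `Collections.synchronizedList` still needs an explicit lock while iterating, and a ConcurrentHashMap iterator tolerates concurrent changes but gives only a weakly consistent view.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helpers showing iteration rules for the two collection styles.
final class CollectionsSketch {
    // For a list created with Collections.synchronizedList(...), individual
    // calls are synchronized, but iteration must be guarded by the caller.
    static int sum(List<Integer> synchronizedList) {
        int total = 0;
        synchronized (synchronizedList) {
            for (int value : synchronizedList) {
                total += value;
            }
        }
        return total;
    }

    // With a ConcurrentHashMap, removing entries during iteration does not
    // throw ConcurrentModificationException. The iterator is weakly
    // consistent: it may or may not reflect changes made by other threads.
    static int drain(Map<String, Integer> concurrentMap) {
        int visited = 0;
        for (String key : concurrentMap.keySet()) {
            concurrentMap.remove(key);
            visited++;
        }
        return visited;
    }
}
```

Passing a plain HashMap to `drain` would normally fail with ConcurrentModificationException, which is the behavior difference to keep in mind when switching map implementations.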
Critical Sections
> Try to minimise the code and time spent in the critical sections
> But the throughput is limited by the time spent in the largest critical section
– Example: LinkedBlockingQueue: 200 000 operations on x64 processor, 20 000 operations on the T2000 processor
– We use it for the WorkQueue and Access Logs
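One way to shrink a critical section is to do the expensive work before taking the lock. The sketch below uses a hypothetical access logger (not the OpenDS implementation) and moves message formatting outside the synchronized block.

```java
// Hypothetical access logger used to compare two critical-section sizes.
final class AccessLogSketch {
    private final Object lock = new Object();
    private final StringBuilder buffer = new StringBuilder();

    // Wide critical section: formatting runs while the lock is held,
    // so every other logging thread waits for it.
    void logWide(String operation, long etimeMillis) {
        synchronized (lock) {
            buffer.append(String.format("op=%s etime=%d\n", operation, etimeMillis));
        }
    }

    // Narrow critical section: format first, lock only for the append.
    void logNarrow(String operation, long etimeMillis) {
        String line = String.format("op=%s etime=%d\n", operation, etimeMillis);
        synchronized (lock) {
            buffer.append(line);
        }
    }

    String contents() {
        synchronized (lock) {
            return buffer.toString();
        }
    }
}
```

Both methods write the same line; only the amount of work done under the lock differs.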
Caching data
> Using caches reduces disk access and thus should provide better performance
> But cache eviction adds pressure to the GC
– When modifying entries
– When the cache is too small to hold all data
> A cache is also a contention point
> If you want to cache objects, make sure you cache those that will be reused.
> Alternate possibility, use “thread local” cache.
– But watch out for the cost (with 1000 threads?)
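A thread-local cache can be sketched as a small bounded map per thread. Everything here is an assumption for illustration: the class name, the 128-entry bound and the "normalization" being cached.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical per-thread LRU cache: no locking, but one copy per thread.
final class ThreadLocalCacheSketch {
    static final int MAX_ENTRIES = 128; // assumed per-thread bound

    private static final ThreadLocal<Map<String, String>> CACHE =
            ThreadLocal.withInitial(() ->
                    // Access-ordered LinkedHashMap evicting the eldest entry.
                    new LinkedHashMap<String, String>(16, 0.75f, true) {
                        @Override
                        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                            return size() > MAX_ENTRIES;
                        }
                    });

    // Returns a normalized form of the name, cached for the calling thread.
    static String normalize(String name) {
        Map<String, String> cache = CACHE.get();
        String cached = cache.get(name);
        if (cached == null) {
            cached = name.trim().toLowerCase(Locale.ROOT);
            cache.put(name, cached);
        }
        return cached;
    }

    // Number of entries cached for the calling thread.
    static int cachedEntries() {
        return CACHE.get().size();
    }
}
```

The cost warning in the slide is visible here: with 1000 threads, the worst case is 1000 maps of up to 128 entries each, all holding potentially duplicated data.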
Server monitoring
> Getting statistics for a server is mandatory
> Beware of contention
– Stats are updated frequently
– But seldom read
> A strategy could be to keep per thread statistics and collect them on demand
– Not yet implemented in OpenDS !
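The per-thread statistics idea could look like the sketch below. As the slide says, this was not implemented in OpenDS, so the class `PerThreadCounter` and its design are purely hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter: each thread updates its own cell, readers sum them.
final class PerThreadCounter {
    // All cells ever created; written rarely (once per thread), read on demand.
    private final List<AtomicLong> cells = new CopyOnWriteArrayList<>();

    // Lazily creates and registers the calling thread's cell.
    private final ThreadLocal<AtomicLong> local = ThreadLocal.withInitial(() -> {
        AtomicLong cell = new AtomicLong();
        cells.add(cell);
        return cell;
    });

    // Hot path: only the owning thread writes this cell, so no contention.
    void increment() {
        local.get().incrementAndGet();
    }

    // Cold path: collect the per-thread values on demand.
    long sum() {
        long total = 0;
        for (AtomicLong cell : cells) {
            total += cell.get();
        }
        return total;
    }

    // Small demo: `threads` threads each increment `perThread` times.
    static long demo(int threads, int perThread) {
        PerThreadCounter counter = new PerThreadCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.sum();
    }
}
```

Later Java releases (Java 8 onward) ship `java.util.concurrent.atomic.LongAdder`, which applies a similar striping idea for counters that are updated often and read seldom.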
Performance Tuning
> When dealing with performance, you should consider the whole system
– Java VM
– OS
– Hardware : CPU, Memory, Disks, Network...
> In our case, we try to avoid disk I/Os
– And try to cache as much of the database
> We also want deterministic response times
– Avoid any Full GC (Stop The World)
– Make sure minor GC pauses are as small as possible
JVM Tuning for OpenDS
> Super Size The Heap !
– We use 32GB heaps, sometimes up to 96GB
– 2GB for the New Generation (or ¼ of heap if < 8GB)
– -Xms32768M -Xmx32768M -Xmn2048M
> Use CMS
– -XX:+UseConcMarkSweepGC
– -XX:+UseParNewGC
> -XX:MaxTenuringThreshold=1
– Avoid copy of objects in New Gen
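After setting options like the ones above, it can help to confirm from inside the JVM what actually took effect. The sketch below uses only the standard `java.lang.management` API; the class name is hypothetical, and the values printed depend entirely on the JVM version and the flags passed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that reports the heap limit, collectors and JVM flags.
final class JvmSettingsSketch {
    // Maximum heap the JVM will attempt to use, in megabytes (reflects -Xmx).
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Names of the garbage collectors active in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Options passed to the JVM on the command line (for example -Xmn, -XX: flags).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
        System.out.println("Collectors:    " + collectorNames());
        System.out.println("JVM arguments: " + jvmArguments());
    }
}
```

Note that the CMS flags in this slide apply to the JVMs of the time; CMS was removed from HotSpot in JDK 14, so on a recent JVM the collector list will show something else.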
Some interesting JVM Options
> -XX:CMSInitiatingOccupancyFraction=70
– Define the amount of occupancy in Old Gen before starting to collect
– Larger = better throughput but higher full GC risk
> -XX:+UseCompressedOops
– For 64-bit JVMs, with less than 32GB of heap
– Will be the default in coming Java 6 updates
> If running on processor with NUMA architecture
– -XX:+UseNUMA
> -XX:+AggressiveOpts
– Enables aggressive JIT optimizations, not related to GC
The Garbage First (G1) GC
> Introduced in the Java HotSpot VM in JDK 7.
> An experimental version of G1 has also been available since Java SE 6 Update 14.
> G1 is the long-term replacement for HotSpot's low-latency Concurrent Mark-Sweep (CMS) GC
> Should be officially supported with Java SE 6 Update 21
G1 Characteristics
> Future CMS Replacement
– Server “Style” Garbage Collector
– Parallel, Concurrent
– Generational
– Good Throughput
– Compacting
– Improved ease-of-use
– Predictable (though not hard real-time)
> The main stages consist of remembered set (RS) maintenance, concurrent marking, and evacuation pauses.
JVM Options With G1
> -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC
> Pause time (hints: a goal with no promise; otherwise use Java Real Time)
– -XX:MaxGCPauseMillis=50 (target of 50 milliseconds)
– -XX:GCPauseIntervalMillis=1000 (target of 1000 msecs)
> Generation Size
– -XX:G1YoungGenSize=512m (for a 512 MB young gen)
> Parallelism
– -XX:+G1ParallelRSetUpdatingEnabled
– -XX:+G1ParallelRSetScanningEnabled
OpenDS and G1
> Goal: Avoid any Full GC, best control of pauses
> Collaboration between the HotSpot and the OpenDS teams
– OpenDS is used as a “Large” reference application
– Between 10 and 20 enhancements integrated in G1 following the tests
– Performance with large heaps improved by a factor of 10
> We're still discovering it
– When doing read operations, we see pauses between 10 and 20 ms with 32GB JVM
– But we're still seeing Full GC when doing write operations (more garbage, which stresses the Old Gen more)
– Hopefully this will be resolved in next builds
OpenDS G1 and Searches
Summary
> OpenDS
– An open source LDAP directory server, 100% pure Java
– Easy to install and use
– Designed for high performance and high scalability
> We saw some patterns and tips used in the OpenDS project
> Knowing and understanding the JVM and GC is required to build high performance servers
– Tuning JVM and GC is an art
– Performance engineering is a profession
> Who said that Java is slow ?!
The Art of GC Tuning
http://developers.sun.com/learning/javaoneonline/j1sessn.jsp?sessn=TS-4887&yr=2009&track=javase
JavaOne Presentation: GC Tuning in HotSpot JVM
Now...
> Give OpenDS a try
– http://www.opends.org
> Join our community:
– Join/Login on Java.net
– http://opends.dev.java.net
– Request a Role
– Subscribe to the mailing lists
– IRC: #opends on freenode.net
> OpenDS is localized in several languages. It's community based, through online tools. An easy way to participate.
Ludovic Poitou blogs.sun.com/Ludo
Sun Microsystems Ludovic.Poitou@sun.com
Matthew Swift
Sun Microsystems Matthew.Swift@sun.com