Sergio Bossa ([email protected] ) - Pronetics/Sourcesense
Ugo Landini ([email protected] ) - Pronetics/Sourcesense
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License
Clustering in the wild
● Ugo Landini
– CTO, Sourcesense
● Sergio Bossa
– Software Architect, Sourcesense
Agenda
● Why Clustering?
● Clustering J(2)EE
● Terracotta in a nutshell.
● Jira clustering issues.
– Files and indexes.
– Stateful applications and home grown caches.
– Threads and services.
– HTTP Session.
● Summary.
Why clustering?
● Horizontal scalability:
– Scale out.
– More computers, to improve throughput when a single one is not enough or costs too much.
● High availability:
– More computers to improve uptime.
– If you unplug a network cable, the system should remain up and running.
– 24/7, or close to it.
– Usually more important than scalability.
Clustering J(2)EE
● In an ideal world
– <distributable /> tag in your web.xml
– Serializable objects in your HTTP session.
● True if and only if your application is J(2)EE compliant
– Basically, no arbitrary use of resources and state:
    ● Files.
    ● Threads.
    ● Sockets.
    ● ... ?
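A minimal sketch of what "serializable objects in your HTTP session" means in practice. The `UserPreferences` class and the `roundTrip` helper are hypothetical names used only for illustration. A container that replicates sessions has to turn each attribute into bytes and back, which is simulated here with plain Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical session attribute: every field is itself serializable.
class UserPreferences implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String userName;
    private final int pageSize;

    UserPreferences(String userName, int pageSize) {
        this.userName = userName;
        this.pageSize = pageSize;
    }

    String getUserName() { return userName; }
    int getPageSize() { return pageSize; }
}

class SessionReplicationSketch {
    // Simulates copying a session attribute to another node: object -> bytes -> object.
    static Object roundTrip(Object attribute) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(attribute);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // A non-serializable attribute ends up here.
            throw new IllegalStateException("Attribute cannot be replicated", e);
        }
    }

    public static void main(String[] args) {
        UserPreferences copy = (UserPreferences) roundTrip(new UserPreferences("sergio", 50));
        System.out.println(copy.getUserName() + " " + copy.getPageSize());
    }
}
```

An attribute that holds a file handle, a thread, or a socket would fail this round trip, which is why those resources appear in the list above.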
Clustering J(2)EE
● What do I do with my files?
– java.io.tmpdir
– JNDI lookup
● What do I do with the state of my application (caches, conversational state, etc.)?
– Stateful Enterprise Java Beans
– Well established caching frameworks
    ● EHCache, OSCache, JBossCache
    ● JSR 107
Clustering J(2)EE
● What do I do with my thread/services?
– JMS (MDBs and topics, mostly)
– CommonJ (a BEA and IBM effort)
● What do I do with my HTTP Session?
– Serializable objects.
– Use a good Load Balancer.
Wake up!
● Almost all successful J(2)EE applications around won't pass the Sun AVK (Application Verification Kit).
● Most people go straight for the simple solution
– and that one could be a cluster antipattern
– home grown caches, lucene indexes, quartz jobs, singletons... add your favourite quickie here.
Enter Terracotta
● Transparent (Translucid? ...) Clustering.
– Very few changes to existing code.
– Low development effort.
● Open Source, free for any use.
● Emerging (and cool!) technology.
● Did I mention that we are a Terracotta partner? :)
The quest for antipatterns
● Jira is NOT easily clusterable, so it is a nice testbed.
● Jira is a bug tracking, issue tracking, and project management application developed to make this process easier.
● Jira is the leading issue tracker in the open source world (though it is not strictly open source).
● People are asking for a clustered Jira!
– http://jira.atlassian.com/browse/JRA-7330
● Did I mention that we are an Atlassian partner?
Terracotta magic
● Terracotta moves around the bytes changed in shared objects
– No serialization.
– Superstatic objects!
– Same semantics, only new() behaves differently.
● Demarcation of transactions with guarded blocks
– Essentially moves multithreaded application semantics to the cluster level.
● For performance reasons, for certain objects it moves behaviour and not data (logically managed vs. physically managed objects)
– You can do the same thing if you need to (distributed methods).
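The guarded block idea above can be sketched in plain Java. `SharedIssueLog` and `GuardedBlockSketch` are hypothetical names, and the Terracotta configuration that would declare the `issues` field as shared is assumed and not shown. The code runs in a single JVM; the point is only that a `synchronized` block marks the boundary of a unit of change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared structure. With Terracotta, the `issues` field would be
// declared as shared in the configuration (assumption, not shown here).
class SharedIssueLog {
    private final List<String> issues = new ArrayList<>();

    // The synchronized block is the guarded block: both additions are
    // published together when the lock is released.
    void record(String key, String summary) {
        synchronized (issues) {
            issues.add(key);
            issues.add(summary);
        }
    }

    int size() {
        synchronized (issues) {
            return issues.size();
        }
    }
}

class GuardedBlockSketch {
    // Starts the given number of threads, each recording entries, and returns the final size.
    static int runWorkers(int threads, int recordsPerThread) {
        SharedIssueLog log = new SharedIssueLog();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < recordsPerThread; j++) {
                    log.record("JRA-" + id, "summary " + j);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return log.size();
    }

    public static void main(String[] args) {
        // 4 threads x 1000 records x 2 entries per record
        System.out.println(runWorkers(4, 1000)); // prints 8000
    }
}
```

Without the `synchronized` blocks the same program could lose updates even on one JVM, so code that is already correct for multiple threads is the natural starting point for clustering.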
Terracotta in a nutshell
● Features, part one:
– Transparent JVM-level clustering.
    ● Transparently works inside your JVM as an infrastructure service.
    ● Plugs into your code thanks to bytecode injection.
    ● No API, no code changes!
– Hub-and-Spoke architecture.
    ● Central server based architecture.
    ● All nodes talk only to the central server.
    ● Linear scalability.
    ● No split-brain problem.
Terracotta in a nutshell
● Features, part two:
– Active/Passive mode.
    ● One central active server, n passive servers.
– Network Attached Memory.
    ● Shares your object graph with the central server.
    ● Virtual Heap (on disk, with Berkeley DB).
    ● Maintains your object graph in the memory heap.
– Preserved Java semantics.
    ● Object equality (equals, hashCode).
    ● Concurrency (synchronized, java.util.concurrent).
    ● Thread communication (wait, notify).
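As an illustration of the thread communication primitives listed above, here is a small sketch of a one-slot mailbox built only on `synchronized`, `wait` and `notifyAll`. The `Mailbox` and `WaitNotifySketch` names are hypothetical, and the example runs in a single JVM; whether a given object can be shared across nodes depends on Terracotta configuration that is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical one-slot mailbox: a producer waits while the slot is full,
// a consumer waits while it is empty.
class Mailbox {
    private String message; // null means the slot is empty

    synchronized void put(String newMessage) throws InterruptedException {
        while (message != null) {
            wait(); // slot is full: wait for a consumer to take the message
        }
        message = newMessage;
        notifyAll();
    }

    synchronized String take() throws InterruptedException {
        while (message == null) {
            wait(); // slot is empty: wait for a producer
        }
        String result = message;
        message = null;
        notifyAll();
        return result;
    }
}

class WaitNotifySketch {
    // Sends the messages from a producer thread and collects them in the calling thread.
    static List<String> exchange(List<String> messages) {
        Mailbox mailbox = new Mailbox();
        Thread producer = new Thread(() -> {
            try {
                for (String m : messages) {
                    mailbox.put(m);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        List<String> received = new ArrayList<>();
        try {
            for (int i = 0; i < messages.size(); i++) {
                received.add(mailbox.take());
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(exchange(Arrays.asList("JRA-1", "JRA-2", "JRA-3")));
    }
}
```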
Terracotta in a nutshell
● Main concepts:
– Roots.
    ● Define where your shared object graph starts.
– Locks.
    ● Ensure data consistency.
    ● Enable Terracotta inter-node communication.
    ● All code changing parts of the shared object graph must be guarded by locks.
– Distributed methods.
    ● Enable plain old Java methods to be called simultaneously on all cluster nodes.
Out in the wild
How did we actually cluster the beast?
Clustering Lucene indexes : Problems
● Lucene indexes are typically stored in files.
– Do you remember? Clustering antipattern.
● Used to improve data access speed.
● How to cluster them?
– Network based solution : SAN or NFS.
    ● Not a viable solution due to locks.
– Messaging based solution : JMS.
    ● Complicated!
    ● Indexes should improve performance, rather than make it worse!
Clustering Lucene indexes : Solution
● Let's store indexes in memory!
● Lucene:
– Provides support for memory-based indexes.
– Just use org.apache.lucene.store.RAMDirectory.
● Terracotta:
– Just a matter of configuration.
– And you can share your Lucene indexes.
Clustering Jira caches : Problems
● Guess what ... Jira uses home grown caches!
– Do you remember? Clustering antipattern.
– From bad to worse:
    ● No unified API!
        – Just a lot of HashMaps and HashSets.
    ● Very poor locking policies.
        – Makes configuration-only Terracotta clustering impossible!
        – Unfeasible to use an existing caching framework.
Clustering Jira caches : Solution
● Write a new, ad-hoc, unified caching API.
● Goals:
– Simplicity.
    ● As simple as using a HashMap.
– Thread safety.
    ● Cache consistency.
    ● Terracotta ready.
– Efficiency.
    ● No bottlenecks.
    ● No liveness failures.
Caching API : Striving for simplicity.
● No strange methods. No cluster related configuration.
– Just the usual GET/PUT methods, and the like.
– Terracotta makes the clustering work!
● When choosing how to cluster the cache:
– Distribute behaviour, rather than data.
    ● Jira puts heavyweight objects in cache.
– Distribute cache invalidation, rather than cache updates.
    ● Lower hit ratio but ...
        – Lower network traffic!
        – Higher simplicity!
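The trade-off above, distributing invalidation rather than updates, can be sketched as follows. `NodeCache` is a hypothetical class, not the Scarlet API, and the direct calls between peers stand in for whatever mechanism actually carries the invalidation across nodes (assumed here to be something like a Terracotta distributed method).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical per-node cache: values stay local, only invalidations travel.
class NodeCache<K, V> {
    private final Map<K, V> local = new HashMap<>();
    private final List<NodeCache<K, V>> peers = new CopyOnWriteArrayList<>();

    void addPeer(NodeCache<K, V> peer) {
        peers.add(peer);
    }

    synchronized V get(K key) {
        return local.get(key);
    }

    void put(K key, V value) {
        synchronized (this) {
            local.put(key, value);
        }
        // Only the key is sent to the other nodes, never the heavyweight value.
        // Peers are notified outside our own lock to avoid lock-ordering deadlocks.
        for (NodeCache<K, V> peer : peers) {
            peer.invalidate(key);
        }
    }

    synchronized void invalidate(K key) {
        local.remove(key);
    }
}

class InvalidationSketch {
    public static void main(String[] args) {
        NodeCache<String, String> nodeA = new NodeCache<>();
        NodeCache<String, String> nodeB = new NodeCache<>();
        nodeA.addPeer(nodeB);
        nodeB.addPeer(nodeA);

        nodeB.put("JRA-1", "stale value");
        nodeA.put("JRA-1", "fresh value");
        // Node B no longer holds the stale entry and must reload it on the next read.
        System.out.println(nodeA.get("JRA-1") + " / " + nodeB.get("JRA-1")); // prints fresh value / null
    }
}
```

The cost is visible in the last line: node B misses on its next read (lower hit ratio), but the heavyweight value never crossed the network.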
Caching API : Striving for thread safety.
● Carefully use Java locks (ok, this was obvious ...).
● Due to how Jira works:
– The caching API must be able to group more than one cache under the same lock.
– The caching API must be able to execute a code block atomically under the same lock.
– Not so obvious ...
● Use what we call “owner based locking.”
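The slides do not spell out how owner based locking works, so the following is only one possible reading of the two requirements above: several caches share a single owner object, every cache operation synchronizes on that owner, and the owner can run a whole code block under the same lock. All names (`CacheOwner`, `OwnedCache`, `atomically`, `move`) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical owner: its monitor is the single lock for all caches it owns.
class CacheOwner {
    // Runs a code block atomically with respect to every cache of this owner.
    <T> T atomically(Supplier<T> block) {
        synchronized (this) {
            return block.get();
        }
    }
}

// Hypothetical cache whose operations all lock on the shared owner.
class OwnedCache<K, V> {
    private final CacheOwner owner;
    private final Map<K, V> entries = new HashMap<>();

    OwnedCache(CacheOwner owner) {
        this.owner = owner;
    }

    V get(K key) {
        synchronized (owner) {
            return entries.get(key);
        }
    }

    void put(K key, V value) {
        synchronized (owner) {
            entries.put(key, value);
        }
    }

    V remove(K key) {
        synchronized (owner) {
            return entries.remove(key);
        }
    }
}

class OwnerLockingSketch {
    // Moves an entry between two caches of the same owner as one atomic step.
    // Java monitors are reentrant, so the nested cache calls do not deadlock.
    static <K, V> boolean move(CacheOwner owner, OwnedCache<K, V> from, OwnedCache<K, V> to, K key) {
        return owner.<Boolean>atomically(() -> {
            V value = from.remove(key);
            if (value == null) {
                return false;
            }
            to.put(key, value);
            return true;
        });
    }

    public static void main(String[] args) {
        CacheOwner owner = new CacheOwner();
        OwnedCache<String, String> open = new OwnedCache<>(owner);
        OwnedCache<String, String> closed = new OwnedCache<>(owner);
        open.put("JRA-1", "a bug");
        System.out.println(move(owner, open, closed, "JRA-1")); // prints true
    }
}
```

No other thread can observe the entry missing from both caches, because readers of either cache need the same owner lock.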
Caching API : Striving for efficiency.
● Choose the right balance between too fine grained and too coarse grained locks.
– Do not use complex lock constructs.
    ● Use plain synchronized blocks.
– Use lock striping techniques.
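Lock striping with plain synchronized blocks can be sketched like this. `StripedCache` is a hypothetical class and not the Scarlet implementation: the key's hash code selects one of a fixed number of stripes, and each stripe is guarded by its own monitor, so threads working on different stripes do not block each other.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical striped cache: one HashMap and one lock per stripe.
class StripedCache<K, V> {
    private final List<Map<K, V>> stripes = new ArrayList<>();

    StripedCache(int stripeCount) {
        if (stripeCount <= 0) {
            throw new IllegalArgumentException("stripeCount must be positive");
        }
        for (int i = 0; i < stripeCount; i++) {
            stripes.add(new HashMap<>());
        }
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    private Map<K, V> stripeFor(K key) {
        return stripes.get(Math.floorMod(key.hashCode(), stripes.size()));
    }

    V get(K key) {
        Map<K, V> stripe = stripeFor(key);
        synchronized (stripe) {
            return stripe.get(key);
        }
    }

    void put(K key, V value) {
        Map<K, V> stripe = stripeFor(key);
        synchronized (stripe) {
            stripe.put(key, value);
        }
    }

    // Locks one stripe at a time, so the result is only a snapshot under concurrent writes.
    int size() {
        int total = 0;
        for (Map<K, V> stripe : stripes) {
            synchronized (stripe) {
                total += stripe.size();
            }
        }
        return total;
    }
}

class LockStripingSketch {
    // Fills the cache from several threads with distinct keys and returns the final size.
    static int fillConcurrently(StripedCache<String, Integer> cache, int threads, int keysPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int k = 0; k < keysPerThread; k++) {
                    cache.put("issue-" + id + "-" + k, k);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(new StripedCache<>(8), 4, 500)); // prints 2000
    }
}
```

The number of stripes is the tuning knob between too coarse (one stripe, a single global lock) and too fine (one lock per entry).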
Threads and services
● Jira periodically triggers threads:
– Do you remember? Clustering antipattern.
● Threaded Jira services:
– Mail sending.
– Backup export.
– Index optimization
Clustering threads and services : Problems
● Threads cannot be clustered.
● We have to cluster the launched services.
– Some services must be shared among cluster nodes.
– Other services must be distributed.
– How to distinguish them?
Clustering threads and services : Solution
● Shared services.
– Clustered through Terracotta XML configuration.
– A shared service is executed only on a single node.
– The default.
● Distributed services.
– Distributed through Terracotta XML configuration.
– A distributed service is executed on every node.
– Just implement com.atlassian.jira.service.JiraDistributedService
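The difference between shared and distributed services can be sketched as below. The slides do not show the methods of `JiraDistributedService` or how a node is chosen to run a shared service, so everything here is an assumption: `DistributedClusterService` is a hypothetical marker interface standing in for the real one, and a shared set of claimed service names stands in for the clustered state that Terracotta would provide.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical service contract (not the Jira API).
interface ClusterService {
    String name();
    void run();
}

// Hypothetical marker interface: implementing it opts in to running on every node.
interface DistributedClusterService extends ClusterService {
}

// Counts its executions so the effect of each policy is visible.
class CountingService implements ClusterService {
    private final String name;
    final AtomicInteger runs = new AtomicInteger();

    CountingService(String name) {
        this.name = name;
    }

    public String name() {
        return name;
    }

    public void run() {
        runs.incrementAndGet();
    }
}

class CountingDistributedService extends CountingService implements DistributedClusterService {
    CountingDistributedService(String name) {
        super(name);
    }
}

class ClusterNode {
    // Stand-in for state shared by all nodes of the cluster.
    private final Set<String> claimedSharedServices;

    ClusterNode(Set<String> claimedSharedServices) {
        this.claimedSharedServices = claimedSharedServices;
    }

    // Distributed services always run; a shared service runs only on the
    // first node that manages to claim it (Set.add returns false afterwards).
    void trigger(ClusterService service) {
        if (service instanceof DistributedClusterService
                || claimedSharedServices.add(service.name())) {
            service.run();
        }
    }
}

class ServicePolicySketch {
    public static void main(String[] args) {
        Set<String> claimed = ConcurrentHashMap.newKeySet();
        CountingService shared = new CountingService("shared-example");
        CountingService distributed = new CountingDistributedService("distributed-example");
        for (int i = 0; i < 3; i++) {
            ClusterNode node = new ClusterNode(claimed);
            node.trigger(shared);
            node.trigger(distributed);
        }
        System.out.println(shared.runs.get() + " " + distributed.runs.get()); // prints 1 3
    }
}
```

Making shared the default means existing services keep running once per cluster without any change, and only services that must run everywhere need to opt in.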
HTTP Session
● Two choices:
– Cluster it through Terracotta.
    ● Very hard.
        – Again, Jira puts a lot of heavyweight objects into session.
– Leave it unclustered.
    ● Use a load balancer with sticky sessions enabled.
        – Jira is not a mission critical application.
        – More simplicity, less complexity.
● Guess what we chose ...
– Please give me that shiny new load balancer ...
Dealing with external code
● Applications are often pluggable.
● Jira has a rich plugin architecture.
● External plugins must fit and work in the cluster
– It is necessary to provide simple APIs or configuration options for making cluster-ready plugins.
● Practical example : com.atlassian.jira.service.JiraDistributedService
Toward an end
Conclusions
Summary
● Terracotta is a transparent clustering solution but ...
– You have to make a lot of decisions and trade-offs.
● If you have to access files in a clustered environment:
– Slow access: network filesystem, database system.
– Fast access: use Terracotta network attached memory.
● If you have to cluster your application state:
– Carefully make it thread safe.
– Choose between distributing data or behaviour.
Summary
● If you have application services:
– Choose services to share.
    ● A shared service runs once per cluster.
– Choose services to distribute.
    ● A distributed service runs once per node.
● If you have to cluster the HTTP session state:
– Consider not clustering it!
● If you have to deal with application plugins:
– Provide API hooks or configuration options.
Terracotta + Jira = Scarlet
● Scarlet.
– Clusters Jira through Terracotta.
– Published as a Jira extension.
    ● http://confluence.atlassian.com/x/woQuBg
– Open Source.
    ● We want you!
– Actively developed:
    ● November 06, 2007 : 1.0 Beta 1.
    ● Very soon : 1.0 Beta 2.
The end
Q&A