Advanced Queries on the Infinispan Data Grid
Navin Surtani
13th May 2015
GeeCon, Krakow
Who is Navin?
• Worked on Red Hat projects since 2008
• Infinispan
• Hibernate Search
• WildFly/JBoss EAP
Tweet your questions
@navssurtani
#advancedqueries
What are we talking about?
• What is Infinispan?
• The Query module
• Backend tech: Hibernate Search & Apache Lucene
• Setup and configuration
• Demo and code walkthrough
What is Infinispan?
• Distributed in-memory key/value data store
• Extension of java.util.Map
• Modes
• Library: embed into an EE/SE application
• Server: connect remotely
Some features
• Fully transactional (JTA, XA)
• Hibernate 2nd level caching
• Full-text querying
• Non-JVM clients for server mode
How do I use it?
• Cache: sits in front of your NoSQL data store
• In-memory DB: primary data store is in memory
• Clusterability: manage state that is distributed
… but we have a problem here
• How do I find my data?
• I don’t want to give out keys
• I might not know what I need to find
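The problem can be sketched with a plain `java.util.Map` standing in for the cache. This is only an illustration, not Infinispan code: the class name, the sample entries and the `keysWhereValueContains` helper are all hypothetical. A lookup by key is direct, but finding entries by their value means visiting every entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeylessLookup {
    // Hypothetical data: key -> author name, standing in for cache entries.
    static Map<Integer, String> sampleCache() {
        Map<Integer, String> cache = new LinkedHashMap<>();
        cache.put(1, "Navin Surtani");
        cache.put(2, "Jane Doe");
        cache.put(3, "John Surtani");
        return cache;
    }

    // Without an index, searching by value means scanning every entry.
    static List<Integer> keysWhereValueContains(Map<Integer, String> cache, String term) {
        List<Integer> keys = new ArrayList<>();
        String needle = term.toLowerCase();
        for (Map.Entry<Integer, String> entry : cache.entrySet()) {
            if (entry.getValue().toLowerCase().contains(needle)) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = sampleCache();
        System.out.println(cache.get(1));                             // direct: the key is known
        System.out.println(keysWhereValueContains(cache, "surtani")); // full scan over the values
    }
}
```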
Query module to the rescue
• Allows searching of values in the cache
• Original project: JBoss Cache Searchable in 2008
• Integration between Infinispan and Hibernate Search
• Became Query module in 2009
Full-text search
• Library example:
• Is author name: Surname, Name?
• Name, Surname?
• How do I deal with …
• Special characters?
• Typos?
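To make the library example concrete, here is a rough JDK-only sketch of what a search layer has to cope with: name order, special characters and typos. It is not how Lucene is implemented; the class and method names are hypothetical, and the approach (normalising tokens, then comparing by Levenshtein edit distance) is simply one assumed way to show the idea.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.Locale;

class NameMatching {
    // Lower-case, strip accents and punctuation, and sort the tokens so that
    // "Surtani, Navin" and "Navin Surtani" normalise to the same string.
    static String normalise(String name) {
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        String withoutAccents = decomposed.replaceAll("\\p{M}", ""); // drop combining marks
        String[] tokens = withoutAccents.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
        Arrays.sort(tokens);
        return String.join(" ", tokens).trim();
    }

    // Classic Levenshtein distance: the number of single-character
    // insertions, deletions or substitutions needed to turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True when the typed name is within maxEdits of the stored one after normalisation.
    static boolean matches(String stored, String typed, int maxEdits) {
        return editDistance(normalise(stored), normalise(typed)) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(matches("Surtani, Navin", "Navin Surtni", 1)); // name order and a typo
        System.out.println(matches("Surtani, Navin", "Jane Doe", 1));
    }
}
```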
Lucene
• Scalable high-performance indexing
• Small RAM requirement ~ 1MB heap
• Index size ~ 20-30% size of data
• 100% open source and written in Java
• Apache Licensing
• Ports to other languages exist
Lucene
• Optimised for searching and querying
• Rich feature-set for query types
• Typo-tolerant searches
• Similar keywords
• Document structure
• Unstructured data
• Documents stored in-memory or on disk
Two features we will look at
Facets
• Obtain counts, or frequencies of a result
• O(1) to obtain counts
• Example: the result counts shown on eBay
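As an illustration of what a facet count is, the sketch below groups a list of values and counts each distinct one. It uses only the JDK, the candidate names are made up, and it recomputes the counts with a linear pass, so it makes no attempt to reproduce the O(1) behaviour mentioned above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class FacetCounts {
    // One "facet" per distinct value, mapped to the number of times it occurs.
    // A TreeMap keeps the facets in a predictable (sorted) order.
    static Map<String, Long> countByValue(List<String> values) {
        return values.stream()
                .collect(Collectors.groupingBy(v -> v, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical votes for governor, one entry per ballot.
        List<String> votes = Arrays.asList("Alice", "Bob", "Alice", "Carol", "Alice");
        System.out.println(countByValue(votes)); // {Alice=3, Bob=1, Carol=1}
    }
}
```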
Filters
• Filters are:
• Declarative
• Stackable
• Reusable
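The three properties can be sketched with `java.util.function.Predicate`. This is not the Hibernate Search filter API; the `Vote` class, its fields and the sample data are hypothetical. Each filter is declared once under a name, filters stack with `and`, and the same filter can be reused across queries.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class StackedFilters {
    // Hypothetical value type: one vote cast for one office in one region.
    static final class Vote {
        final String office;
        final String region;
        final String candidate;

        Vote(String office, String region, String candidate) {
            this.office = office;
            this.region = region;
            this.candidate = candidate;
        }
    }

    // Declarative, reusable filters: each one says what to keep, not how to loop.
    static Predicate<Vote> forOffice(String office) {
        return vote -> vote.office.equals(office);
    }

    static Predicate<Vote> inRegion(String region) {
        return vote -> vote.region.equals(region);
    }

    // The "query" stays the same; only the filter passed in changes.
    static List<String> candidates(List<Vote> votes, Predicate<Vote> filter) {
        return votes.stream().filter(filter).map(vote -> vote.candidate).collect(Collectors.toList());
    }

    static List<Vote> sampleVotes() {
        return Arrays.asList(
                new Vote("governor", "north", "Alice"),
                new Vote("governor", "south", "Bob"),
                new Vote("senator", "north", "Carol"),
                new Vote("governor", "north", "Bob"));
    }

    public static void main(String[] args) {
        // Stacking: two filters combined into one.
        Predicate<Vote> governorInNorth = forOffice("governor").and(inRegion("north"));
        System.out.println(candidates(sampleVotes(), governorInNorth)); // [Alice, Bob]
        // Reuse: the region filter on its own.
        System.out.println(candidates(sampleVotes(), inRegion("north"))); // [Alice, Carol, Bob]
    }
}
```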
How it all fits together
XML Configuration
<local-cache name="Votes">
  <transaction mode="NONE"/>
  <indexing index="ALL">
    <property name="default.directory_provider">ram</property>
  </indexing>
</local-cache>
Programmatic Configuration
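// Index settings; here the same in-memory directory as in the XML example
Properties properties = new Properties();
properties.put("default.directory_provider", "ram");
ConfigurationBuilder cb = new ConfigurationBuilder();
cb.indexing()
  .enable()
  .indexLocalOnly(true) // Will only index local node
  .withProperties(properties);
EmbeddedCacheManager cm = new DefaultCacheManager(cb.build());
// My key is an Integer (generics cannot use the primitive int) and value is of type Person
Cache<Integer, Person> cache = cm.getCache();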
Annotations required
• @Indexed
• @Field
• @IndexedEmbedded
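A minimal sketch of how the three annotations might be applied to the `Person` value type used in this talk. It assumes the Hibernate Search annotations (`org.hibernate.search.annotations`) are on the classpath; the `region` field and the `Address` class are hypothetical and only there to show `@IndexedEmbedded`.

```java
import java.io.Serializable;

import org.hibernate.search.annotations.Analyze;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.IndexedEmbedded;

// @Indexed marks the class as something the Query module should index.
@Indexed
class Person implements Serializable {

    // @Field makes the property searchable, e.g. onField("name").
    @Field
    private String name;

    // Hypothetical field: stored as a single un-analysed token,
    // which suits exact matches such as a region code.
    @Field(analyze = Analyze.NO)
    private String region;

    // @IndexedEmbedded pulls the fields of the nested object into this
    // entity's index entry, so they can be queried as "address.city".
    @IndexedEmbedded
    private Address address;

    Person(String name, String region, Address address) {
        this.name = name;
        this.region = region;
        this.address = address;
    }
}

// Hypothetical embedded type; only its @Field properties are indexed.
class Address implements Serializable {

    @Field
    private String city;

    Address(String city) {
        this.city = city;
    }
}
```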
Running queries
// I have a cache instance which is not empty
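SearchManager sm = Search.getSearchManager(cache);
// Ask the SearchManager for a Hibernate Search QueryBuilder for Person
QueryBuilder qb = sm.buildQueryBuilderForClass(Person.class)
    .get();
Query q = qb.keyword().onField("name").matching("Surtani")
    .createQuery();
// Wrap the Lucene query so that it runs against the cache
CacheQuery cq = sm.getQuery(q, Person.class);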
How it all ties together …
• Web-application using Infinispan running on WildFly 9 CR
• App-server ships with Query module
• Use a web-form to vote in an ‘election’
• One vote for governor
• One vote for senator
Flow I: Query ‘warm-up’
• Story: ‘We don’t know who is running in the election’
• WebSocket endpoint to delegate to Worker object
• Worker object executes on CandidateCacheDao
• Returns results through WebSocket endpoint
Flow II: Voting form
• Story: ‘This is our ballot paper’
• Front-end creates JSON to go to WebSocket endpoint
• JSON gets parsed by BallotWorker object
• BallotWorker puts parsed JSON into Cache through VotingCacheDao
Flow III: Faceted search
• Story: ‘We want to know who has won the election’
• Front-end asks for the result of an election (governor or senator)
• ElectionResultWorker object runs a query through the VotingCacheDao
• Result passed back to web-page as JSON
Flow III: Faceted search with Filter
• Story: ‘We would like to know who has received the most votes in a particular region’
• Essentially the same workflow as III but we also pass a Filter to our query
• The query code is the same; the Filter narrows the results before they are counted
Demo time
Roadmap
• API:
• JDK 8 integration
• FunctionalCache interface
• Query:
• Query on Non-Indexed fields
• Continuous querying
Summary
• Query module 101
• Configuration
• Demo
• Basic query on multiple fields
• Faceted search with and without filter
Get in touch
Twitter:
• @navssurtani
• @infinispan
• @c2b2consulting
IRC:
• #infinispan on FreeNode
Blogs:
• navssurtani.blogspot.com
• blog.infinispan.org
• blog.c2b2.co.uk
Demo:
• github.com/navssurtani/query-demo
Q&A
#thankyougeecon