Thursday, May 26, 2011
Nov 01, 2014
Hands on with the App Engine Datastore
Ikai Lan
May 9th, 2011
About the speaker
• Ikai Lan - Developer Programs Engineer, Developer Relations
• Twitter: @ikai
• Google Profile: http://profiles.google.com/ikai.lan
Lab prerequisites
• JDK 1.5+
• Apache Ant
• Codelab package: http://code.google.com/p/2011-datastore-bootcamp-codelab/downloads/detail?name=2011-datastore-bootcamp-codelab.zip
Shortlink: http://tinyurl.com/datastore-bootcamp
Goals of this talk
• Understand a bit of how the datastore works under the hood
• Have a conceptual background for the persistence codelab
Understanding the datastore
• The underlying Bigtable
• Indexing and queries
• Complex queries
• Entity groups
• Underlying infrastructure
Datastore layers
                            Datastore   Megastore   Bigtable
Complex queries                 ✓
Entity group transactions       ✓           ✓
Queries on properties           ✓           ✓
Key range scan                  ✓           ✓           ✓
Get and set by key              ✓           ✓           ✓
What does a Bigtable row look like?
Source: http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf
Bigtable API
• “Give me the column ‘name’ at key 123”
• “Set the column ‘name’ at key 123 to ‘ikai’”
• “Give me all columns where the key is greater than 100 and less than 200”
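The three requests above map naturally onto a sorted map. What follows is a minimal sketch, assuming an in-memory TreeMap stands in for a Bigtable. The class TinyBigtable and its methods are hypothetical names chosen for illustration and are not the Bigtable API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a Bigtable: rows are kept sorted by key
// and each row holds named columns. This is not the real Bigtable API.
class TinyBigtable {
    private final TreeMap<Integer, Map<String, String>> rows = new TreeMap<>();

    // "Set the column 'name' at key 123 to 'ikai'"
    void set(int key, String column, String value) {
        Map<String, String> row = rows.get(key);
        if (row == null) {
            row = new HashMap<>();
            rows.put(key, row);
        }
        row.put(column, value);
    }

    // "Give me the column 'name' at key 123" (returns null if absent)
    String get(int key, String column) {
        Map<String, String> row = rows.get(key);
        return row == null ? null : row.get(column);
    }

    // "Give me all columns where the key is greater than 100 and less than 200"
    SortedMap<Integer, Map<String, String>> scan(int lowExclusive, int highExclusive) {
        return rows.subMap(lowExclusive, false, highExclusive, false);
    }

    public static void main(String[] args) {
        TinyBigtable table = new TinyBigtable();
        table.set(123, "name", "ikai");
        table.set(150, "name", "alfred");
        table.set(250, "name", "zed");
        System.out.println(table.get(123, "name"));
        System.out.println(table.scan(100, 200).keySet());
    }
}
```

Because the rows are sorted by key, the range scan only touches the rows inside the requested range. The later slides rely on this property for index scans.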
Megastore API
• “Give me all rows where the column ‘name’ equals ‘ikai’”
• “Transactionally write an update to this group of entities”
• “Do a cross datacenter write of this data such that reads will be strongly consistent” (High Replication Datastore)
• Megastore paper: http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
App Engine Datastore API
• “Give me all Users for my app where the name equals ‘ikai’, company equals ‘Google’, and sort them by the ‘awesome’ column, descending”
Queries
Let’s save an Entity with the low-level Java API
    import com.google.appengine.api.datastore.DatastoreService;
    import com.google.appengine.api.datastore.DatastoreServiceFactory;
    import com.google.appengine.api.datastore.Entity;

    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

    Entity ikai = new Entity("User", "[email protected]");
    ikai.setProperty("firstName", "ikai");
    ikai.setProperty("company", "google");
    ikai.setUnindexedProperty("biography",
        "Ikai is a great man, a great, great man.");
    datastore.put(ikai);
Step by step:
• Get an instance of the DatastoreService - fetch a client instance
• Instantiate a new Entity - set the Entity Kind ("User") and a unique key
• Set indexed properties - the first argument is the property name, the second argument is the property value
• Set unindexed properties - this property will be saved, but we will not run queries against it
• Commit the entity to the datastore - save the thing!
What happens when we save?
• Make the write RPC
• Write the entity
• Write the indexes
• Success!
What actually gets written?
Entities table
Bigtable key                   Value
AppId:User:[email protected]   ( Protobuf serialized entity - includes firstName, company and biography values )

Indexes table
Bigtable key                                  Value
AppId:User:firstName:ikai:[email protected]   ( Empty )
AppId:User:company:google:[email protected]   ( Empty )

Read more: http://code.google.com/appengine/articles/storage_breakdown.html
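To make the row layout concrete, here is a small sketch that builds the row keys for one entity. It is illustrative only: the real datastore encodes keys in a binary format, and the class IndexRows, its method names, and the placeholder key name "user-key" are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: builds readable row keys in the "AppId:Kind:..." layout
// shown on the slide. Names here are hypothetical, not a datastore API.
class IndexRows {
    // Row key for the entities table.
    static String entityRowKey(String appId, String kind, String keyName) {
        return appId + ":" + kind + ":" + keyName;
    }

    // One index row per indexed property. Unindexed properties get no row.
    static List<String> indexRowKeys(String appId, String kind, String keyName,
                                     Map<String, String> indexedProperties) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> p : indexedProperties.entrySet()) {
            rows.add(appId + ":" + kind + ":" + p.getKey() + ":" + p.getValue() + ":" + keyName);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> indexed = new LinkedHashMap<>();
        indexed.put("firstName", "ikai");
        indexed.put("company", "google");
        // "biography" was set with setUnindexedProperty, so it is left out.
        System.out.println(entityRowKey("AppId", "User", "user-key"));
        for (String row : indexRowKeys("AppId", "User", "user-key", indexed)) {
            System.out.println(row);
        }
    }
}
```

Saving one entity with two indexed properties therefore produces three rows: one in the entities table and two in the indexes table.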
Now let’s run a query
• If we have the key, we can fetch it right away by key
• What if we don’t? We need indexes.
Let’s run a query
    // Uses Query, Query.FilterOperator and FetchOptions from com.google.appengine.api.datastore
    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

    Query queryByName = new Query("User");
    queryByName.addFilter("firstName", FilterOperator.EQUAL, "ikai");

    List<Entity> results = datastore.prepare(queryByName)
        .asList(FetchOptions.Builder.withDefaults());

    // Roughly equivalent to:
    // SELECT * FROM User WHERE firstName = 'ikai';
Step 1: Query the indexes table
Scan the indexes table for values >= AppId:User:firstName:
Step 2: Start extracting keys
That gets us the row AppId:User:firstName:ikai:[email protected] - extract the key [email protected]
Step 3: Batch get the entities themselves
Now let’s go back to the entities table and fetch that key. Success!
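The three steps can be sketched with ordinary sorted collections. This is a simplified model, assuming readable string row keys, property values that contain no ':' character, and short hypothetical key names such as "u1". The class TwoStepQuery is not part of the datastore API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Simplified model of a property query: scan the index, extract keys, batch get.
class TwoStepQuery {
    // Index rows have empty values, so a sorted set of row keys is enough.
    final TreeSet<String> indexes = new TreeSet<>();
    final Map<String, String> entities = new HashMap<>();

    // Writes the entity row plus one index row per indexed property.
    void save(String key, String firstName, String company) {
        entities.put("AppId:User:" + key, "{firstName=" + firstName + ", company=" + company + "}");
        indexes.add("AppId:User:firstName:" + firstName + ":" + key);
        indexes.add("AppId:User:company:" + company + ":" + key);
    }

    List<String> queryByProperty(String kind, String property, String value) {
        String prefix = "AppId:" + kind + ":" + property + ":" + value + ":";
        // Steps 1 and 2: scan index rows >= prefix and extract each entity key.
        List<String> keys = new ArrayList<>();
        for (String row : indexes.tailSet(prefix, true)) {
            if (!row.startsWith(prefix)) {
                break; // past the matching range, stop scanning
            }
            keys.add(row.substring(prefix.length()));
        }
        // Step 3: batch get the entities by key.
        List<String> results = new ArrayList<>();
        for (String key : keys) {
            results.add(entities.get("AppId:" + kind + ":" + key));
        }
        return results;
    }

    public static void main(String[] args) {
        TwoStepQuery db = new TwoStepQuery();
        db.save("u1", "ikai", "google");
        db.save("u2", "alfred", "acme");
        System.out.println(db.queryByProperty("User", "firstName", "ikai"));
    }
}
```

Note that the scan never looks at rows outside the matching prefix, which is why there is no full table scan.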
Key takeaways
• This isn’t a relational database
  – There are no full table scans
  – Indexes MUST exist for every property we want to query
  – Natively, we can only query on matches or startsWith queries
  – Don’t index what we never need to query on
• Get by key = one step. Query on property value = 2 steps
Let’s run a more complex query!
    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

    Query queryByName = new Query("User");
    queryByName.addFilter("firstName", FilterOperator.EQUAL, "ikai");
    queryByName.addFilter("company", FilterOperator.EQUAL, "google");

    List<Entity> results = datastore.prepare(queryByName)
        .asList(FetchOptions.Builder.withDefaults());

    // Roughly equivalent to:
    // SELECT * FROM User WHERE firstName = 'ikai'
    // AND company = 'google';
Query resolution strategies
• This query can be resolved using built in indexes
  – Zig zag merge join - we’ll cover this example
• Can be optimized using composite indexes
Zig zag across multiple indexes
firstName index
Bigtable key
AppId:User:firstName:alfred:[email protected]
AppId:User:firstName:ikai:[email protected]
AppId:User:firstName:ikai:[email protected]
AppId:User:firstName:ikai:[email protected]
AppId:User:firstName:zed:[email protected]

company index
Bigtable key
AppId:User:company:acme:[email protected]
AppId:User:company:google:[email protected]
AppId:User:company:google:[email protected]
AppId:User:company:google:[email protected]
AppId:User:company:megacorp:[email protected]

Read more: http://code.google.com/appengine/articles/storage_breakdown.html
Begin by scanning indexes >= AppId:User:company:google
• There’s at least a partial match, so we “jump” to the next index
• Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:[email protected]
• So that’s a twist. The first value that matches has key [email protected]! Does this value exist in the first index?
• Let’s advance the original cursor to >= AppId:User:company:google:[email protected]
• Alright! We found a match. Let’s add the key to our in-memory list and go back to the first index
• Let’s move on to see if there are any more matches. Let’s start at [email protected]
• Are there any keys >= AppId:User:firstName:ikai:[email protected]?
• No. We’re at the end of our index scans. Let’s do a batch get of our list of keys: [ ‘[email protected]’ ]
Batch get the entities themselves
Now let’s go back to the entities table and fetch that key. Success!
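The walkthrough above can be sketched as a merge join over two sorted sets. This is a simplified model and not the datastore’s actual implementation: each set is assumed to hold only the entity keys found under one index prefix (for example company=google and firstName=ikai), and the class name ZigZagJoin and the sample keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Simplified zig zag merge join over two sorted sets of entity keys.
class ZigZagJoin {
    static List<String> join(TreeSet<String> a, TreeSet<String> b) {
        List<String> matches = new ArrayList<>();
        if (a.isEmpty()) {
            return matches;
        }
        // Invariant: 'candidate' is always a key taken from 'other'.
        String candidate = a.first();
        TreeSet<String> current = b; // the index we probe next
        TreeSet<String> other = a;
        while (candidate != null) {
            // Jump to the first key >= candidate in the index being probed.
            String found = current.ceiling(candidate);
            if (found == null) {
                break; // ran off the end of an index: no more matches
            }
            if (found.equals(candidate)) {
                matches.add(candidate);              // present in both indexes
                candidate = other.higher(candidate); // advance past the match
            } else {
                candidate = found;                   // larger candidate from 'current'
                TreeSet<String> tmp = current;       // zig zag: swap the two roles
                current = other;
                other = tmp;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeSet<String> google = new TreeSet<>(Arrays.asList("alice", "bob", "dave", "erin"));
        TreeSet<String> ikai = new TreeSet<>(Arrays.asList("bob", "carol", "erin", "zed"));
        System.out.println(join(google, ikai));
    }
}
```

Each probe skips ahead to the next possible candidate, so the number of probes depends on how the matching keys are distributed across the two indexes.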
Let’s change the shape of the data
• Zig zag performance is HIGHLY dependent on the shape of the data
• Let’s go ahead and muck with the data a bit
Same query, sparsely distributed matches
firstName index
Bigtable key
AppId:User:firstName:alfred:[email protected]
AppId:User:firstName:ikai:[email protected]
AppId:User:firstName:igor:[email protected]
AppId:User:firstName:ikai:[email protected]
AppId:User:firstName:zed:[email protected]

company index
Bigtable key
AppId:User:company:acme:[email protected]
AppId:User:company:google:[email protected]
AppId:User:company:google:[email protected]
AppId:User:company:google:[email protected]
AppId:User:company:megacorp:[email protected]

Read more: http://code.google.com/appengine/articles/storage_breakdown.html
• Begin by scanning indexes >= AppId:User:company:google
• Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:[email protected]
• Oh ... no matches. Let’s move back to the first index and move the cursor down
• Okay, we’ve got another Googler
• Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:[email protected]
• Oh ... no matches here either. Let’s go back to the first index.
• ... if these indexes were huge, we could be here for a while!
What happens in this case?
• If we traverse too many indexes, the datastore throws a NeedIndexException (DatastoreNeedIndexException in the Java API)
• We’ll want to build a composite index
Composite index
Bigtable key
AppId:User:company:acme:firstName:alfred:[email protected]
AppId:User:company:google:firstName:david:[email protected]
AppId:User:company:google:firstName:ikai:[email protected]
AppId:User:company:google:firstName:max:[email protected]
AppId:User:company:megacorp:firstName:zed:[email protected]

Read more: http://code.google.com/appengine/articles/storage_breakdown.html
• Search for all keys >= AppId:User:company:google:firstName:ikai
• Well, that was much faster, wasn’t it?
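With a composite index, the same two-filter query becomes one contiguous prefix scan. The sketch below assumes readable string row keys and hypothetical entity key names (k1 to k5). CompositeIndexScan is an illustrative class, not a datastore API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Simplified model: both filter values are part of one sorted row key,
// so all matches sit next to each other in the index.
class CompositeIndexScan {
    static List<String> scan(TreeSet<String> compositeIndex, String company, String firstName) {
        String prefix = "AppId:User:company:" + company + ":firstName:" + firstName + ":";
        List<String> keys = new ArrayList<>();
        for (String row : compositeIndex.tailSet(prefix, true)) {
            if (!row.startsWith(prefix)) {
                break; // first non-matching row ends the scan
            }
            keys.add(row.substring(prefix.length()));
        }
        return keys;
    }

    public static void main(String[] args) {
        TreeSet<String> index = new TreeSet<>(Arrays.asList(
            "AppId:User:company:acme:firstName:alfred:k1",
            "AppId:User:company:google:firstName:david:k2",
            "AppId:User:company:google:firstName:ikai:k3",
            "AppId:User:company:google:firstName:max:k4",
            "AppId:User:company:megacorp:firstName:zed:k5"));
        System.out.println(scan(index, "google", "ikai"));
    }
}
```

There is no jumping between indexes here: the scan starts at the prefix and stops at the first row that does not match.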
Composite index tradeoffs
• Created at entity save time - incurs additional datastore CPU and storage quota
• You can only create 200 composite indexes
• You need to know the possible queries ahead of time!
Complex Queries takeaways
• This isn’t a relational database
  – There are no full table scans
  – Indexes MUST exist for every property we want to query
• Performance depends on the shape of the data
• Worst-case scenario: your query matches are highly sparse
• Build composite indexes when you need them
Entity Groups
Why entity groups?
• We can perform transactions within this group - but not outside
• Data locality - data are stored “near” each other
• Strongly consistent queries when using High Replication datastore within this entity group
Entity groups and transactions
• A hierarchical structuring of your data into Megastore’s unit of atomicity
• Allows for transactional behavior - but only within a single entity group
• Key unit of consistency when using High Replication datastore
Example: Data for a blog hosting service
A User has many Blogs. A Blog has many Entries. An Entry has many Comments.
This can be structured as an entity group (tree structure)!
Structure this data as an entity group
[Diagram: the same data drawn as a tree. The User is the entity group root, with Blogs beneath it, Entries beneath each Blog, and Comments beneath each Entry.]
How are entity groups stored?
Entities table
Bigtable key                                                 Value
AppId:User:[email protected]                                 ( Protobuf serialized User )
AppId:User:[email protected]/Blog:123                        ( Protobuf serialized Blog )
AppId:User:[email protected]/Blog:123/Entry:456              ( Protobuf serialized Entry )
AppId:User:[email protected]/Blog:123/Entry:789              ( Protobuf serialized Entry )
AppId:User:[email protected]/Blog:123/Entry:456/Comment:111  ( Protobuf serialized Comment )
AppId:User:[email protected]/Blog:123/Entry:456/Comment:222  ( Protobuf serialized Comment )
AppId:User:[email protected]/Blog:123/Entry:789/Comment:333  ( Protobuf serialized Comment )

Read more: http://code.google.com/appengine/docs/python/datastore/entities.html
• Entity groups have a single root entity
• Child entities embed the entire ancestry in their keys
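Because every child key starts with its ancestors’ path, a whole entity group can be read with one scan from the root key. The sketch below models this with readable string keys. The class AncestorKeys, its methods, and the short root names ("ikai", "zed") are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Simplified model of ancestor paths embedded in row keys.
class AncestorKeys {
    // A child key is the parent's full key path plus "/Kind:id".
    static String child(String parentKey, String kind, String id) {
        return parentKey + "/" + kind + ":" + id;
    }

    // Returns the root row and every descendant row, in sorted order.
    static List<String> group(TreeSet<String> table, String rootKey) {
        List<String> rows = new ArrayList<>();
        for (String key : table.tailSet(rootKey, true)) {
            if (key.equals(rootKey) || key.startsWith(rootKey + "/")) {
                rows.add(key);
            } else if (!key.startsWith(rootKey)) {
                break; // left the key range that could contain this group
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String user = "AppId:User:ikai";
        String blog = child(user, "Blog", "123");
        String entry = child(blog, "Entry", "456");
        String comment = child(entry, "Comment", "111");
        TreeSet<String> table = new TreeSet<>();
        table.add(user);
        table.add(blog);
        table.add(entry);
        table.add(comment);
        table.add("AppId:User:zed"); // a different entity group
        for (String row : group(table, user)) {
            System.out.println(row);
        }
    }
}
```

This is the data locality mentioned earlier: rows of one entity group sort next to each other.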
Let’s write an entity group transactionally
    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

    Entity ikai = new Entity("User", "[email protected]");
    Entity blog = new Entity("Blog", "ikaisays.com", ikai.getKey());
    Entity entry = new Entity("Entry", "datastore-intro", blog.getKey());
    // Auto assign an ID
    Entity comment = new Entity("Comment", entry.getKey());

    Transaction tx = datastore.beginTransaction();
    try {
        // Arrays.asList is a helper function for clarity
        datastore.put(tx, Arrays.asList(ikai, blog, entry, comment));
        tx.commit();
    } finally {
        // Roll back if the commit did not go through
        if (tx.isActive()) {
            tx.rollback();
        }
    }
Step by step:
• Create the root entity
• This is the first child entity - notice the third argument, which specifies the parent entity key
• The next deeper entity sets the blog as the parent
• We can also opt to not provide a key name and just use a parent key for a new entity
• Start a new transaction
• Put the entities in parallel
• Actually commit the changes
Step 1: Commit
Timeline: Commit → Changes to entities visible → Changes to entities and indexes visible
• Roll the timestamp forward on the root entity
Step 2: Entity visible
• On read, check for the most recent timestamp on the root entity
• This is the version we want since it represents a complete write
Step 3: Indexes updated
• Indexes are written - now we can query for this entity with the new properties
Entity group and transactions takeaways
• Structure data into hierarchical trees
  – Large enough to be useful, small enough to maximize transactional throughput
• Transactions need an entity group root - roughly 1 transaction/second per entity group
  – If you write N entities that are all part of 1 entity group, it counts as 1 write
• Optimistic locking used - can be expensive with a lot of contention
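To see why contention is expensive under optimistic locking, consider this conceptual sketch. It is not Megastore’s actual commit protocol: a single counter stands in for the timestamp on the entity group root, and the class OptimisticGroup and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Conceptual model of optimistic locking on an entity group root.
class OptimisticGroup {
    // Stand-in for the timestamp stored on the entity group root.
    private final AtomicLong rootTimestamp = new AtomicLong(0);

    long read() {
        return rootTimestamp.get();
    }

    // The commit succeeds only if nobody else committed since we read.
    boolean tryCommit(long timestampSeenAtStart) {
        return rootTimestamp.compareAndSet(timestampSeenAtStart, timestampSeenAtStart + 1);
    }

    // Retry loop: every failed attempt repeats the transactional work.
    // Returns the number of attempts that were needed.
    int commitWithRetries(int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long seen = read();
            // ... transactional work based on the state at 'seen' goes here ...
            if (tryCommit(seen)) {
                return attempt;
            }
        }
        throw new IllegalStateException("too much contention on this entity group");
    }

    public static void main(String[] args) {
        OptimisticGroup group = new OptimisticGroup();
        long seen = group.read();
        System.out.println(group.tryCommit(seen)); // first writer wins
        System.out.println(group.tryCommit(seen)); // stale writer must retry
    }
}
```

No lock is held while the work runs, so a writer only finds out at commit time that it lost the race and must redo its work. With many writers on one entity group, most attempts are wasted.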
General datastore tips
• Denormalize as much as possible
  – As much as possible, treat datastore as a key-value store (Dictionary or Map like structure)
  – Move large reporting to offline processing. This lets you avoid unnecessary indexes
• Use entity groups for your data
• Build composite indexes where you need them - “need” depends on shape of your data
Questions?