Lecture 11: Data Synchronization Techniques for Mobile Devices © Dimitre Trendafilov 2003 Modified by T. Suel 2004 CS623, 4/20/2004

Lecture 11: Data Synchronization Techniques for Mobile Devices

© Dimitre Trendafilov 2003

Modified by T. Suel 2004

CS623, 4/20/2004

Problem Definition

Given two versions of a data set on different machines, say an outdated and a current one, how can we update the outdated one with minimum communication cost?

Related Problem: What if data has been changed on several machines? (How to reconcile the data: difficult, application dependent)

Obvious Solutions

Send all of the current data.
Compress the current data and then send it.
Send only the compressed difference between the two data sets.
  If the sender has both versions, use a suitable delta compression tool.
  What if the sender has no access to the outdated version?

Two Aspects of the Problem

File Synchronization (rsync)
  Update an outdated file so that it becomes identical to a current one
Set Reconciliation (today)
  Assume you have many small data records, but you only want to send modified records
  E.g., database with a set of 100-byte records
  Unordered: order of records not important
  Find which records need to be transmitted, then send the entire record
  Record identified by number (hash, record ID)

Applications for Data Synchronization

Synchronizing data between PDA and PC
Microsoft Briefcase etc.
Synchronizing databases over a network
Synchronizing a file system in two stages:
  find which files have changed (MD5 of files)
  use rsync on those that have changed
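The first stage of the file-system approach can be sketched as follows. This is only an illustration under stated assumptions: each side computes a map from relative path to MD5 digest, and the function names (`md5_of_file`, `digest_tree`, `changed_files`) are hypothetical, not taken from any particular tool. The second stage (running rsync on the changed files) is not shown.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def digest_tree(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def changed_files(current, outdated):
    """Paths whose digest differs, or that are missing on the outdated side."""
    return sorted(p for p, d in current.items() if outdated.get(p) != d)
```

Only the digest map has to cross the network in this stage, which is small compared with the files themselves; files deleted on the current side would need a separate check that this sketch omits.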

Palm Hot Sync

Relies on metadata maintained on both machines.
The metadata is stored in Palm DB
There is one Palm DB for each application (Date Book, To Do, Address Book, etc.)
A record in Palm DB consists of a unique ID, a pointer to the object, and a status flag.

Palm Hot Sync

Preferred mode of operation: Fast Sync
Exchange only the modified records.
Works only if synchronization is always done between the same two machines.
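A minimal sketch of the flag-based idea behind Fast Sync, assuming each record carries a unique ID and a single "modified" status flag. The `Record` class, its field names, and `fast_sync` are hypothetical illustrations, not the actual Palm data structures, and conflicts and deletions are ignored.

```python
from dataclasses import dataclass


@dataclass
class Record:
    uid: int                 # unique record ID
    payload: str             # stands in for the pointer to the object
    modified: bool = False   # status flag, set when the record changes


def fast_sync(source, target):
    """Copy only flagged records from source to target, then clear the flags.

    Both arguments are dicts mapping uid -> Record.
    Returns the number of records transferred.
    """
    sent = 0
    for uid, rec in source.items():
        if rec.modified:
            target[uid] = Record(uid, rec.payload)
            rec.modified = False
            sent += 1
    return sent


handheld = {
    1: Record(1, "Dentist 3pm"),
    2: Record(2, "Call Bob", modified=True),
}
desktop = {1: Record(1, "Dentist 3pm")}
sent = fast_sync(handheld, desktop)
```

Because the single flag is cleared after the exchange, a second desktop would have no way to learn that record 2 changed, which is one way to see why this mode assumes a fixed pair of machines.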

Palm Hot Sync

“Backup” mode of operation: Slow Sync
Copy all of the data.
Used when the last synchronization was done with a different machine.

Timestamps

Maintain a timestamp for each record.
Send only the records with a timestamp greater than the timestamp of the last synchronization
Good for synchronization between two machines but inefficient for more
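The timestamp rule can be sketched in a few lines. The record layout (a dict from record ID to a `(timestamp, payload)` pair) and the function name are assumptions made for illustration only.

```python
def records_to_send(records, last_sync_time):
    """Select records modified after the last synchronization.

    records maps record ID -> (timestamp, payload); timestamps are plain
    numbers here (for example seconds since some epoch).
    """
    return {
        uid: rec
        for uid, rec in records.items()
        if rec[0] > last_sync_time
    }


records = {
    "r1": (100, "unchanged since last sync"),
    "r2": (250, "edited after last sync"),
    "r3": (300, "created after last sync"),
}
to_send = records_to_send(records, last_sync_time=200)
```

With more than two machines, a single `last_sync_time` is no longer enough: each partner needs its own value, which is where the approach starts to become inefficient.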

SyncML (www.syncml.org, now part of Open Mobile Alliance)

Fairly large initiative funded by Ericsson, IBM, Lotus, Matsushita, Motorola, Nokia
Seeks to provide an open standard for synchronization between different platforms and devices
Uses XML
Based on timestamps
A device stores a timestamp for each record and each device it communicates with.
N records and M devices result in N*M timestamps
Not scalable!

Intellisync Anywhere

Developed by Puma Technologies.
Relies on a central server
Similar to Fast Sync, but each device synchronizes only with the central server.
It has a single point of failure
The central server can get congested

Intellisync Anywhere Puma technologies

Characteristic Polynomial Interpolation Synchronization (CPISync)

Time/bandwidth complexity depends on the number of differences.
Computationally expensive: cubic in the number of differences
  But can be improved
Computations could be done on only one of the two devices (the faster one)
Works in a general peer-to-peer environment

CPISync Preliminaries

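Each data set can be represented as a set of numbers [using hash functions].

The characteristic polynomial of a set $S = \{x_1, x_2, \dots, x_n\}$ is:

$\chi_S(Z) = (Z - x_1)(Z - x_2) \cdots (Z - x_n)$

Note that for two sets $S_A$ and $S_B$, the factors belonging to common elements cancel:

$\frac{\chi_{S_A}(Z)}{\chi_{S_B}(Z)} = \frac{\chi_{S_A \setminus S_B}(Z)}{\chi_{S_B \setminus S_A}(Z)}$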
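As a small worked example of this cancellation (the two sets are made up for illustration, and the characteristic polynomial is taken to be the product of $(Z - x)$ over all elements $x$ of the set):

```latex
S_A = \{1, 2, 9\}, \qquad S_B = \{1, 2, 7\}

\frac{\chi_{S_A}(Z)}{\chi_{S_B}(Z)}
  = \frac{(Z-1)(Z-2)(Z-9)}{(Z-1)(Z-2)(Z-7)}
  = \frac{Z-9}{Z-7}
```

The roots of the surviving numerator and denominator (9 and 7) are exactly the elements held by only one of the two sides.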

CPISync

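Host A and B evaluate their characteristic polynomials $\chi_{S_A}$ and $\chi_{S_B}$ at the same sample points $z_i$, $i = 1, \dots, m$.
Host B sends to host A its evaluations $\chi_{S_B}(z_i)$.
The evaluations are combined at host A to compute the ratios $\chi_{S_A}(z_i) / \chi_{S_B}(z_i)$, from which the reduced rational function $\chi_{S_A \setminus S_B}(Z) / \chi_{S_B \setminus S_A}(Z)$ is interpolated.
The zeroes of its numerator and denominator are determined.
Those are the differences!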
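The steps on this slide can be sketched end to end for a toy case. This is an illustrative sketch under several assumptions that are not from the slides: a small prime field (`PRIME = 257`), elements drawn from `range(200)`, sample points chosen outside that range so no evaluation is zero, host B also sending its set size, the exact number of differences `m` being known in advance, and roots found by brute force over the small universe. All names are hypothetical.

```python
PRIME = 257             # assumed small prime field for the demo
UNIVERSE = range(200)   # assumed range of element values (hashes)


def char_poly_eval(s, z):
    """Evaluate the product of (z - x) over x in s, modulo PRIME."""
    r = 1
    for x in s:
        r = r * (z - x) % PRIME
    return r


def solve_mod(a, b):
    """Solve a*x = b over GF(PRIME) by Gauss-Jordan (a square, nonsingular)."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col])
        m[col], m[piv] = m[piv], m[col]
        inv = pow(m[col][col], -1, PRIME)
        m[col] = [v * inv % PRIME for v in m[col]]
        for r in range(n):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [(v - f * w) % PRIME for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]


def poly_eval(coeffs, x):
    """Evaluate a polynomial given by low-to-high coefficients, mod PRIME."""
    return sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME


def cpisync(set_a, evals_b, size_b, m):
    """Host A's side: recover (A minus B, B minus A) from B's evaluations."""
    points = [len(UNIVERSE) + i for i in range(m)]
    d = len(set_a) - size_b
    da, db = (m + d) // 2, (m - d) // 2   # degrees of numerator, denominator
    f = [char_poly_eval(set_a, z) * pow(eb, -1, PRIME) % PRIME
         for z, eb in zip(points, evals_b)]
    # Monic P (degree da) and Q (degree db) with P(z_i) = f_i * Q(z_i):
    # sum p_j z^j - f_i * sum q_j z^j = f_i * z^db - z^da
    rows = [[pow(z, j, PRIME) for j in range(da)]
            + [-fi * pow(z, j, PRIME) % PRIME for j in range(db)]
            for z, fi in zip(points, f)]
    rhs = [(fi * pow(z, db, PRIME) - pow(z, da, PRIME)) % PRIME
           for z, fi in zip(points, f)]
    sol = solve_mod(rows, rhs)
    p_coef, q_coef = sol[:da] + [1], sol[da:] + [1]
    only_a = {x for x in UNIVERSE if poly_eval(p_coef, x) == 0}
    only_b = {x for x in UNIVERSE if poly_eval(q_coef, x) == 0}
    return only_a, only_b


set_a = {1, 5, 9, 42, 100}
set_b = {1, 5, 9, 77}
m = 3   # three elements differ in total
evals_b = [char_poly_eval(set_b, len(UNIVERSE) + i) for i in range(m)]
only_a, only_b = cpisync(set_a, evals_b, len(set_b), m)
```

Host B transmits only `m` field elements plus its set size, regardless of how large the sets are. The Gaussian elimination step is the cubic cost in the number of differences.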

CPISync

IPSync: Finding the Number of Differences

Guess a bound.
Send evaluations at k random points
Verify at k points
Repeat with another bound if needed.
The probability of error is: [formula missing from transcript]
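The "verify at k points" step can be sketched as follows, assuming host A already holds a candidate pair of difference sets and host B has sent the evaluations of its characteristic polynomial at k extra random points. The field size, the function names, and the choice to draw points above 1000 (so they cannot coincide with the small demo elements) are assumptions for illustration.

```python
import random

PRIME = 2_147_483_647   # assumed prime field (2^31 - 1)


def char_poly_eval(s, z):
    """Evaluate the product of (z - x) over x in s, modulo PRIME."""
    r = 1
    for x in s:
        r = r * (z - x) % PRIME
    return r


def verify(set_a, only_a, only_b, points, evals_b):
    """Check chi_A(r) * chi_onlyB(r) == chi_B(r) * chi_onlyA(r) at each point.

    If the candidate difference sets are correct, the identity holds at
    every point; a wrong candidate is expected to fail at a random point
    with high probability.
    """
    return all(
        char_poly_eval(set_a, r) * char_poly_eval(only_b, r) % PRIME
        == eb * char_poly_eval(only_a, r) % PRIME
        for r, eb in zip(points, evals_b)
    )


set_a = {1, 5, 9, 42, 100}
set_b = {1, 5, 9, 77}
rng = random.Random(0)
points = [rng.randrange(1000, PRIME) for _ in range(4)]    # k = 4
evals_b = [char_poly_eval(set_b, r) for r in points]       # sent by host B

good = verify(set_a, {42, 100}, {77}, points, evals_b)     # correct guess
bad = verify(set_a, {42}, set(), points, evals_b)          # bound too small
```

When `verify` fails, the guessed bound was too small and the exchange is repeated with a larger bound; how the next bound is chosen (for example by doubling) is a design choice not fixed by the slide.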

IPSync vs. Slow Sync

Taxonomy of Synchronization Techniques

More Techniques: Bloom Filters

Get a Bloom filter for the receiver's data set
Send only elements that are not found in the Bloom filter.
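A minimal sketch of this idea, assuming string records and a simple Bloom filter built on SHA-256 with salted inputs. The filter size, the number of hash functions, and all names are illustrative choices, not parameters from the lecture.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter; num_bits and num_hashes are demo defaults."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def elements_to_send(sender_set, receiver_filter):
    """Elements the filter reports as absent from the receiver's set."""
    return {x for x in sender_set if x not in receiver_filter}


receiver_set = {"alice", "bob", "carol"}
sender_set = {"alice", "bob", "carol", "dave"}

bf = BloomFilter()
for item in receiver_set:
    bf.add(item)                      # receiver builds and ships this filter

to_send = elements_to_send(sender_set, bf)
```

A Bloom filter has no false negatives, so nothing the receiver already holds is sent again. It can produce false positives, so an element the receiver lacks may occasionally be skipped; the result is approximate unless combined with another check.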

More Techniques: Using Error Correction Codes

Send an error correction code for the data set
The receiver “corrects the errors” in its outdated data set.
Reed-Solomon Codes
Decoding time depends only on the number of differences between the sets (almost linear, not cubic)
But an extra factor of 2 in transmission