P2PR-tree: An R-tree-based Spatial Index for P2P Environments
ANIRBAN MONDAL, YI LIFU, MASARU KITSUREGAWA
University of Tokyo. E-mail: [email protected]
PRESENTATION OUTLINE
Motivating Spatial Applications on P2P systems
Existing Spatial Indexes
Our proposal: The P2PR-tree
Performance Analysis
Conclusion and Future Work
Spatial Applications on P2P systems
Spatial data occurs in several important and diverse applications:
Geographic Information Systems (GIS)
Computer-aided design (CAD)
Resource management
Development planning, emergency planning and scientific research.
Unprecedented growth of available spatial data at geographically distributed locations.
Trend of increased globalization. Popularity of P2P data sharing
Efficient global sharing of distributively owned spatial data in P2P systems
[Figure: a Master node connected to four clients. Each client has its own R-tree for managing its own data.]
Centralization
Designed for clusters.
Optimize disk I/Os.
Why can't we use existing R-tree-based approaches?
They use centralized mechanisms → not scalable.
All updates must pass through the Master Node.
All searches need to be routed by the Master Node.
→ Performance bottleneck at the Master Node.
They do not optimize communication time.
GRID-Related Projects
GRID Physics Network and European DataGrid: improve scientific research that requires efficient distributed handling of data in the petabyte range.
Earth Systems GRID (ESG): aims to facilitate detailed analysis of huge amounts of climate data by a geographically distributed community via high-bandwidth networks.
NASA Information Power GRID (IPG): improves existing systems at NASA for solving complex scientific problems efficiently.
How does our proposal differ from GRID-related spatial works?
GRID
Restricts data sharing to scientific and research organizations.
Individual nodes are usually dedicated and expected to be available most of the time.
Some amount of centralized control is possible through collaboration between organizations.
Our proposal
Allows ordinary users to share/upload data.
Individual nodes may join/leave at any time.
Peers are distributively owned, hence centralized control is practically challenging.
Existing Search mechanisms in P2P systems
Broadcast (Gnutella)
Centralized (Napster)
Routing indices (RIs)
Distributed hash tables (Chord, CAN, Tapestry)
Existing works on P2P systems mostly address file-sharing.
P2PR-tree (Peer-to-Peer R-tree)
A distributed R-tree-based indexing scheme designed for P2P systems
Parts of the distributed indexes are built autonomously by each peer.
Hierarchical and performs efficient pruning.
Completely decentralized
Highly Scalable
Dividing the Universe
[Figure: the universe is divided into four blocks (Block 1 to Block 4), each containing peers (P).]
[Figure: the P2PR-tree hierarchy, drawn with Levels 0 to 3, showing blocks B1 to B4, groups G1 to G4, subgroups SG1 and SG2, and peers such as P1 to P6 and P20.]
Definitions
Unit: A Block, Group, or Subgroup at any level, or a peer.
UnitMBR: Minimum Bounding Rectangle of a Unit.
Router: In order to route messages to a Unit X, a peer A needs to know at least one peer (say peer B) which belongs to Unit X. We define peer B as peer A's Router to Unit X.
UnitRouterInfo: The addresses of routers to a Unit.
UnitInfo: UnitMBR and UnitRouterInfo of a Unit.
ChildInfo (Level i): UnitInfo of Child Units at Level i+1 in the P2PR-tree.
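The definitions above can be sketched as plain data types. This is a minimal illustration, not the authors' implementation: the class names `MBR` and `UnitInfo`, the field names, the 2-D rectangle representation, and the helper `candidate_units` are all assumptions introduced here. The helper shows how a UnitMBR could be used to prune Units that cannot overlap a query rectangle.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class MBR:
    """Axis-aligned minimum bounding rectangle (2-D is an assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "MBR") -> bool:
        # Closed rectangles overlap when they overlap on both axes.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


@dataclass
class UnitInfo:
    """UnitInfo = UnitMBR + UnitRouterInfo, as defined above."""
    unit_id: str                      # hypothetical label, e.g. "B1"
    unit_mbr: MBR                     # UnitMBR
    unit_router_info: List[str] = field(default_factory=list)  # router addresses


def candidate_units(child_info: List[UnitInfo], query: MBR) -> List[UnitInfo]:
    """Keep only the Units whose MBR overlaps the query (hypothetical helper)."""
    return [u for u in child_info if u.unit_mbr.intersects(query)]


# Usage: two made-up blocks and a query that lies inside the first one.
blocks = [
    UnitInfo("B1", MBR(0, 0, 50, 50), ["peer-a:4000"]),
    UnitInfo("B2", MBR(60, 0, 100, 50), ["peer-b:4000"]),
]
hits = candidate_units(blocks, MBR(10, 10, 20, 20))
print([u.unit_id for u in hits])
```

A peer holding such a list would forward the query only to routers of the Units returned, which is the sense in which the hierarchy allows pruning.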
Data Structure at a peer
A peer at Level $L$ can be specified as $\mathrm{Peer}(i_0.i_1.i_2 \ldots i_L)$, where $i_j$ is the Unit ID at level $j$.
$\mathrm{Peer}(i_0.i_1.i_2 \ldots i_L)$ maintains the following information:
Local R-tree data structure
$\mathrm{ChildInfo}(i_0.i_1.i_2 \ldots i_{L-1})$
...
$\mathrm{ChildInfo}(i_0.i_1)$
$\mathrm{ChildInfo}(i_0)$
All BlockInfo
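The per-peer state listed on this slide could be laid out roughly as follows. This is a sketch under stated assumptions: `PeerState`, its field names, the tuple encoding of the Unit ID path, and the list used as a stand-in for the local R-tree are all hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax), assumed 2-D
UnitPath = Tuple[int, ...]                 # (i_0, i_1, ..., i_j)


@dataclass
class UnitInfo:
    mbr: Rect                # UnitMBR
    routers: List[str]       # UnitRouterInfo


@dataclass
class PeerState:
    path: UnitPath                                   # (i_0, ..., i_L) identifying the peer
    # Stand-in for the local R-tree: (rectangle, object id) pairs.
    local_rtree: List[Tuple[Rect, str]] = field(default_factory=list)
    # "All BlockInfo": UnitInfo for every block, keyed by block ID.
    all_block_info: Dict[int, UnitInfo] = field(default_factory=dict)
    # ChildInfo(prefix): UnitInfo of the child Units under each ancestor prefix.
    child_info: Dict[UnitPath, Dict[int, UnitInfo]] = field(default_factory=dict)

    def dotted(self) -> str:
        """Render the path in the slide's dotted form, e.g. '1.2.0.3'."""
        return ".".join(str(i) for i in self.path)

    def ancestor_paths(self) -> List[UnitPath]:
        """Prefixes (i_0), (i_0.i_1), ..., (i_0...i_{L-1}) that need ChildInfo."""
        return [self.path[:k] for k in range(1, len(self.path))]


# Usage: a peer at Level 3 with a made-up path.
peer = PeerState(path=(1, 2, 0, 3))
print(peer.dotted())
print(peer.ancestor_paths())
```

Keying `child_info` by path prefix mirrors the list above: one ChildInfo entry per ancestor Unit, from the block down to the peer's immediate parent.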
Example of Data Structure
Peer Join operation in P2PR-tree
[Figure: Block 1]
Routing Issues
Assumption: A peer initially knows at least N routers for a Unit.
Piggybacking is used to refresh routers for each peer.
During piggybacking, a peer sends the addresses and reliability information of other peers in its own Unit.
Each peer maintains the R most reliable routers for each Unit.
What if all routers that a peer knows in a specific Unit are unavailable?
The peer contacts peers in other blocks to find new routers for that block.
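The router maintenance described above can be illustrated with a small table that merges piggybacked reports and keeps only the R most reliable routers per Unit. This is a sketch, not the presented system: the class `RouterTable`, its method names, the numeric reliability score (higher is better), and the `is_alive` callback are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional


class RouterTable:
    """Keeps the R most reliable known routers for each Unit (hypothetical)."""

    def __init__(self, max_routers: int) -> None:
        self.max_routers = max_routers                   # R on the slide
        self._routers: Dict[str, Dict[str, float]] = {}  # unit id -> {address: reliability}

    def merge_piggyback(self, unit_id: str, reports: Dict[str, float]) -> None:
        """Merge piggybacked (address, reliability) reports, then trim to R entries."""
        known = self._routers.setdefault(unit_id, {})
        known.update(reports)
        best = sorted(known.items(), key=lambda kv: kv[1], reverse=True)
        self._routers[unit_id] = dict(best[: self.max_routers])

    def routers_for(self, unit_id: str) -> List[str]:
        """Known routers for a Unit, most reliable first."""
        known = self._routers.get(unit_id, {})
        return sorted(known, key=lambda addr: known[addr], reverse=True)

    def pick_router(self, unit_id: str,
                    is_alive: Callable[[str], bool]) -> Optional[str]:
        """Return the most reliable reachable router, or None if all are down.

        None corresponds to the fallback case on the slide, where the peer
        would ask peers in other blocks for new routers.
        """
        for addr in self.routers_for(unit_id):
            if is_alive(addr):
                return addr
        return None


# Usage with made-up addresses and scores.
table = RouterTable(max_routers=2)
table.merge_piggyback("B2", {"peer-a": 0.9, "peer-b": 0.5, "peer-c": 0.7})
print(table.routers_for("B2"))
print(table.pick_router("B2", is_alive=lambda addr: addr != "peer-a"))
```

Returning `None` rather than raising keeps the fallback decision with the caller, which can then contact peers in other blocks as the slide describes.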