A Multi-Relational Approach to Spatial Classification

Richard Frank, School of Computing Science, Simon Fraser University, Burnaby BC, Canada
Martin Ester, School of Computing Science, Simon Fraser University, Burnaby BC, Canada
Arno Knobbe, LIACS, Leiden University, Leiden, the Netherlands
• Why are some houses burgled?
  • Good location?
  • Expensive neighbourhood?
  • Close to major roads?
• Learn classifiers given
  • Location
  • Feature values
  • Neighbouring locations
  • Features of neighbours
• Use the classifier to predict the label of unknown entities
[Figure: map of Burnaby, British Columbia, Canada, showing burgled houses]
INTRODUCTION
• Spatial data seems to have multi-relational (MR) aspects
• MR classification techniques cannot be applied directly to spatial data
  • With MR data the relationships between the entities are explicitly given
  • Spatial relationships are only implied via the entity’s spatial location
  • Non-spatial aggregation cannot deal with spatial dependencies
  • Many relationships → large search space
STEPS
Steps to apply multi-relational techniques
1. Select multi-relational framework
2. Determine neighbour relationships
3. Establish relationships and spatial features/literals that can be extracted
4. Apply spatial classifier
  • Incorporate relationships and spatial features/literals
  • Perform the classification in parallel
5. Analyze results
MULTI-RELATIONAL CLASSIFICATION
• Classification with Inductive Logic Programming (ILP)
  • Find rules that can predict the labels of instances of a target entity
  • Example: “If a mall has a neighbouring house with income > 50,000 then the
RULE LEARNING – OVERVIEW
• Unified Multi-relational Aggregation-based Spatial Classifier (UnMASC)
  • Multi-relational based spatial classification algorithm
  • Two-class problem
  • Based on the idea of the sequential covering algorithm
• Sequential covering algorithm
  • Generate one rule at a time
  • Refine rule by adding literals
  • Start new rule when rule-termination condition applies
  • Once a rule is finalised
    • Entities covered are removed
    • Another rule is started
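The sequential covering loop from the overview can be sketched in Python as follows. This is a minimal illustration, not UnMASC itself: the `Entity` dictionaries, the precision-based scoring, and the `min_precision` and `max_literals` termination parameters are all assumptions made for the example, since the slides do not specify the scoring function or the rule-termination condition.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Entity = Dict[str, float]           # hypothetical: feature values plus "label" (1 or 0)
Literal = Callable[[Entity], bool]  # a literal is a boolean test on one entity


@dataclass
class Rule:
    literals: List[Literal] = field(default_factory=list)

    def covers(self, entity: Entity) -> bool:
        # An empty rule covers every entity.
        return all(lit(entity) for lit in self.literals)


def precision(rule: Rule, entities: List[Entity]) -> float:
    """Fraction of covered entities that are positive (assumed scoring function)."""
    covered = [e for e in entities if rule.covers(e)]
    if not covered:
        return 0.0
    return sum(e["label"] for e in covered) / len(covered)


def learn_rules(entities, candidates, min_precision=0.9, max_literals=3):
    remaining = list(entities)
    rules = []
    # Keep generating rules while positive entities are still uncovered.
    while any(e["label"] == 1 for e in remaining):
        rule = Rule()
        # Refine the rule by adding the best literal until it is precise enough.
        while (precision(rule, remaining) < min_precision
               and len(rule.literals) < max_literals):
            best = max(candidates,
                       key=lambda lit: precision(Rule(rule.literals + [lit]), remaining))
            if precision(Rule(rule.literals + [best]), remaining) <= precision(rule, remaining):
                break  # no candidate improves the rule
            rule.literals.append(best)
        if not rule.literals:
            break  # nothing useful could be learned
        rules.append(rule)
        # Entities covered by the finalised rule are removed; a new rule is started.
        remaining = [e for e in remaining if not rule.covers(e)]
    return rules
```

Each finalised rule removes at least one entity from `remaining`, so the outer loop always terminates.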
RULE LEARNING – LITERAL SEARCH
• Entity types need to be searched
• Each search needs to identify the best
  • Feature(s), e.g. value and distance
  • Aggregation (possibly), e.g. trend
  • Threshold value, e.g. 0.1
  • Comparison operator, e.g. >
• This creates the best candidate literal for that search
  • Example: trend({distance(M,H),value(H)}, {house(H),neighbour(H,M)},S), S>0.1
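To make the example literal concrete, the sketch below evaluates a trend aggregation over the houses neighbouring one mall. The slides do not define how trend is computed; this example assumes it is the least-squares slope of house value against distance to the mall. The function names and the sample numbers are hypothetical.

```python
from typing import List, Tuple


def trend(pairs: List[Tuple[float, float]]) -> float:
    """Least-squares slope of value against distance (assumed definition of trend).

    pairs holds (distance(M,H), value(H)) for each house H neighbouring mall M.
    Returns 0.0 when the slope is undefined.
    """
    n = len(pairs)
    if n < 2:
        return 0.0
    mean_d = sum(d for d, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in pairs)
    if sxx == 0:
        return 0.0  # all neighbours are at the same distance
    sxy = sum((d - mean_d) * (v - mean_v) for d, v in pairs)
    return sxy / sxx


def literal_holds(pairs: List[Tuple[float, float]], threshold: float = 0.1) -> bool:
    """Evaluate the literal trend(...) = S, S > threshold for one mall."""
    return trend(pairs) > threshold


# Hypothetical neighbouring houses: value rises with distance from the mall.
rising = [(1.0, 100.0), (2.0, 102.0), (3.0, 104.0)]
falling = [(1.0, 104.0), (2.0, 102.0), (3.0, 100.0)]
```

With these sample values the literal holds for `rising` and fails for `falling`.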
Initialize:
• Pick entity type for classification (target entity type)
• Select class label
• Start a blank rule

R1: profitable(M,’yes’) ← mall(M)
(Class label: profitable(M,’yes’). Target entity type: mall(M).)
Rule 1 – Iteration 1:
• Search the entity types referenced in the rule for the best feature
• Search neighbours of the entity types referenced in the rule for the best feature
• Add the best feature to the rule

Rule 1 – Iteration 2:
• Search the entity types referenced in the rule for the best feature
• Search neighbours of the entity types referenced in the rule for the best feature
• Add the best feature to the rule
• UnMASC is split into two components
  • RuleLearner, the server component
    • Collects the searches that require evaluation
    • Maintains the rules and matching entities
    • Calls LiteralEvaluator
  • LiteralEvaluator, which performs the searches
    • Extracts all spatial features
    • Performs all applicable aggregations
    • Finds the best feature and threshold
    • Returns the best literal
• RuleLearner picks the best literal
• The number of simultaneous searches possible is set a priori
  • If the number of searches possible < searches required, the remaining searches are queued
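The division of work between RuleLearner and LiteralEvaluator can be sketched with a fixed-size thread pool. This is an illustration under assumptions, not the UnMASC implementation: the function names, the `Search` tuple layout, and the precomputed scores are hypothetical stand-ins for the real feature extraction and aggregation. Python's `ThreadPoolExecutor` queues submitted work beyond `max_workers`, which mirrors the queuing described above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Hypothetical search description: (entity-type path, [(literal text, score), ...])
Search = Tuple[str, List[Tuple[str, float]]]


def literal_evaluator(search: Search) -> Tuple[str, str, float]:
    """Stand-in for LiteralEvaluator: return the best literal of one search."""
    path, candidates = search
    literal, score = max(candidates, key=lambda c: c[1])
    return path, literal, score


def rule_learner_step(searches: List[Search], max_parallel: int = 2) -> Tuple[str, str, float]:
    """Stand-in for one RuleLearner iteration.

    max_parallel is fixed in advance; searches beyond that number wait in the
    executor's internal queue until a worker becomes free.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(literal_evaluator, searches))
    # RuleLearner picks the best literal across all searches.
    return max(results, key=lambda r: r[2])


searches = [
    ("{Malls}", [("size(M) > 100", 0.40), ("age(M) < 10", 0.35)]),
    ("{Malls, Roads}", [("count(roads) > 2", 0.55)]),
    ("{Malls, Houses}", [("trend(distance, value) > 0.1", 0.70), ("avg(income) > 50000", 0.60)]),
]
best = rule_learner_step(searches, max_parallel=2)
```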
• {Malls}: expected to be small
  • Only a few malls in a city
  • No aggregations are involved
• {Malls} < {Malls, Houses}
  • Many houses in a city
  • Houses must be aggregated over their neighbouring malls
• {Malls} < {Malls, Roads} < {Malls, Houses}
  • Aggregation has to occur
  • |roads| < |houses|
• A very costly search can end up executing last
• Estimate the cost of each search based on
  • Number of entities in target table
  • Features of entities
  • Number of relationships between entities used in rule
• Reprioritize queue → execute costly search first
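The reprioritization can be sketched as a priority queue ordered by estimated cost. The slides list the inputs to the estimate but not the formula, so the product used here is an assumption, as are the `Search` fields and the sample counts.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass
class Search:
    name: str
    n_entities: int        # entities in the target table
    n_features: int        # features per entity
    n_relationships: int   # relationships between entity types used in the rule


def estimated_cost(search: Search) -> int:
    """Assumed cost model: product of the three quantities named in the slides."""
    return search.n_entities * search.n_features * max(search.n_relationships, 1)


def prioritise(searches: List[Search]) -> List[Search]:
    """Return the searches ordered so that the costliest one executes first."""
    # heapq is a min-heap, so costs are negated; the index breaks ties.
    heap = [(-estimated_cost(s), i, s) for i, s in enumerate(searches)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


queue = prioritise([
    Search("{Malls}", n_entities=10, n_features=3, n_relationships=0),
    Search("{Malls, Roads}", n_entities=10, n_features=3, n_relationships=200),
    Search("{Malls, Houses}", n_entities=10, n_features=3, n_relationships=5000),
])
```

Starting the costliest search first keeps it from becoming the lone straggler at the end of a parallel batch, which is the usual motivation for longest-job-first scheduling.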
EXPERIMENTS
• Dataset
  • Real-world crime data
    • Collaboration with the Criminology Department at Simon Fraser University
    • For the Royal Canadian Mounted Police (RCMP) in British Columbia (BC)
    • Between August 1, 2001 and August 1, 2006
    • Location of crime
    • Type of crime
  • British Columbia Assessment Authority (BCAA) dataset
    • Containing the property values of all plots of land within BC
• The city of Burnaby, BC was selected
  • 66,000 entities
  • Types of entities & counters
• Each property was labelled
  • Burglary exists or not
• UnMASC was evaluated using three experiments
  • Neighbour using buffer zones
    • 2.8 million spatial relationships between entities
  • Neighbour using Voronoi neighbourhoods
    • 3.8 million spatial relationships
  • Use only the target entity type
    • No neighbouring entity types are evaluated
• Effectiveness of the parallelization of UnMASC
  • Parallel (6 threads)
  • Serial (1 thread)
• 5-fold cross-validation was performed
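The buffer-zone neighbourhood named in the first experiment can be illustrated with a small sketch. It assumes every entity is reduced to a single point and that two entities are neighbours when they lie within a fixed radius of each other; the real data contains properties, roads, and parks with extent, and the slides do not state the buffer size, so the radius and coordinates here are hypothetical.

```python
from itertools import combinations
from math import hypot
from typing import Dict, FrozenSet, Set, Tuple

Point = Tuple[float, float]


def buffer_neighbours(points: Dict[str, Point], radius: float) -> Set[FrozenSet[str]]:
    """Return unordered pairs of entity ids whose points lie within radius.

    Simplification: entities are points, so the buffer zone is a circle.
    """
    pairs = set()
    for (a, pa), (b, pb) in combinations(points.items(), 2):
        if hypot(pa[0] - pb[0], pa[1] - pb[1]) <= radius:
            pairs.add(frozenset((a, b)))
    return pairs


# Hypothetical coordinates in metres.
entities = {
    "mall_1": (0.0, 0.0),
    "house_1": (30.0, 40.0),    # 50 m from mall_1
    "house_2": (300.0, 400.0),  # 500 m from mall_1
}
neighbours = buffer_neighbours(entities, radius=100.0)
```

This pairwise check is quadratic in the number of entities; at the scale of millions of relationships a spatial index would normally be used instead.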
• Burglaries of Commercial Properties
  • 2812 commercial properties were selected as the target entities
  • Target: 33% were burglarized
• Commercial property is
  • In a relatively inexpensive industrial neighbourhood
  • Neighbour to parks which could be a source of people inclined to commit crimes