Maximizing Submodular Set Function with Connectivity Constraint: Theory and Application to Networks
Tung-Wei Kuo, Kate Ching-Ju Lin, and Ming-Jer Tsai
Academia Sinica, Taiwan
National Tsing Hua University, Taiwan
Jan 29, 2016
Motivation
• Mesh network deployment
• How should we deploy the network?
• The budget is limited!
[Figure: candidate locations]
Connectivity
• Only one router can access the Internet
• Mesh networks exploit multi-hop relays
• The network must be connected!
[Figure: candidate locations]
Various Performance Metrics
• A variety of performance metrics
– The number of covered users, total throughput, the size of the coverage area, …
Given limited resources (routers or budget), deploy a connected mesh that optimizes the performance metric
Mesh Deployment Problem
• Given:
1. $k$ routers, where one of them is a gateway
2. The set of candidate locations, $V$
3. The set of connection edges, $E$
4. The optimization goal (e.g., the number of covered users)
• Items 2 and 3 form a graph $G = (V, E)$
• GOAL: Construct a connected network such that the optimization goal is achieved
[Figure: example with $k = 3$, showing the optimal solution]
Design an algorithm for each of the various optimization goals?
Many optimization goals can be modeled as submodular set functions
Our goal: A universal algorithm for a family of problems whose objective can be modeled as a submodular set function
Submodular Set Function
A function $f$ is a submodular set function if
$f(S) + f(T) \ge f(S \cap T) + f(S \cup T), \quad \forall S, T \subseteq V$
Example: Number of covered users
[Figure: routers $a$, $b$, $c$, $d$ and the users they cover]
$S = \{a, b, c\}$, $T = \{c, d\}$
$f(S) = 6$
$f(T) = 4$
$f(S \cap T) = f(\{c\}) = 3$
$f(S \cup T) = f(\{a, b, c, d\}) = 6$
$6 + 4 > 3 + 6$
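The inequality can be checked by brute force on a small instance. The sketch below is illustrative only: the coverage sets in `COVERAGE` are hypothetical values chosen to reproduce the numbers in the example (6, 4, 3, 6), and the helper names are not from the slides.

```python
from itertools import combinations

# Hypothetical coverage sets: the users each candidate location covers.
# Chosen so that f(S)=6, f(T)=4, f({c})=3, f(S|T)=6 as in the example.
COVERAGE = {
    "a": {1, 2},
    "b": {2, 3},
    "c": {4, 5, 6},
    "d": {3, 4},
}

def covered_users(routers):
    """f(S): number of distinct users covered by the routers in S."""
    users = set()
    for r in routers:
        users |= COVERAGE[r]
    return len(users)

def all_subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = sorted(ground)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Brute-force check of f(S) + f(T) >= f(S & T) + f(S | T)."""
    subsets = list(all_subsets(ground))
    return all(f(s) + f(t) >= f(s & t) + f(s | t)
               for s in subsets for t in subsets)

S = frozenset("abc")   # {a, b, c}
T = frozenset("cd")    # {c, d}
lhs = covered_users(S) + covered_users(T)          # 6 + 4
rhs = covered_users(S & T) + covered_users(S | T)  # 3 + 6
```

The exhaustive check is exponential in the number of locations, so it is only meant for toy instances like this one.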
Example: Total Data Rate
[Figure: routers $a$, $b$, $c$, $d$ with data-rate labels of 1 and 100]
$f(\{a, b, c\}) = 303$
Formal Problem Definition
• Given:
1. A graph $G = (V, E)$
2. A positive integer $k$
3. A nondecreasing submodular set function $f$ on the set of subsets of $V$ with …
• Goal: Find a subset $S \subseteq V$ such that
1. Connectivity: $S$ is connected with respect to $G$
2. Limited resources: $|S| \le k$
3. Optimization goal: $f(S)$ is maximized
The problem is NP-hard. An approximation algorithm will be given.
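The two constraints of the problem (connectivity and limited resources) can be checked directly for any candidate subset. The following is a minimal sketch, assuming the graph is given as an adjacency-list dictionary; `is_connected`, `is_feasible`, and the path graph `ADJ` are hypothetical names and data introduced for illustration.

```python
from collections import deque

def is_connected(adj, chosen):
    """True if the subgraph induced by `chosen` is connected (BFS).

    The empty set is treated as connected by convention in this sketch.
    """
    chosen = set(chosen)
    if not chosen:
        return True
    start = next(iter(chosen))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            # Only walk through vertices that belong to the chosen set.
            if v in chosen and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == chosen

def is_feasible(adj, chosen, k):
    """Check both constraints: at most k vertices, and connected."""
    return len(set(chosen)) <= k and is_connected(adj, chosen)

# Hypothetical path graph a - b - c - d.
ADJ = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
```

With `k = 3`, the set `{a, b, c}` is feasible, `{a, c}` is not (it is disconnected without `b`), and `{a, b, c, d}` is not (it uses four routers).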
Our Algorithm: The Idea
For every candidate location, generate a solution in the following way:
Step 1. Find an area centered at that location
Step 2. Deploy some routers in the area
Step 3. Use the remaining routers to make the solution connected
The best solution is then the final output
The Solution-Step 1
1. Find an area centered at the candidate location, with its radius measured in hops
[Figure: graph $G = (V, E)$ with $k = 16$, showing the area around the center]
The Solution-Step 2
2. Deploy routers in the area, where one of them is at the center
[Figure: graph $G = (V, E)$ with $k = 16$, showing users, candidate locations, and the number of covered users]
The Solution-Step 3
3. Use shortest paths to connect routers to the center
This is a feasible solution
[Figure: graph $G = (V, E)$ with $k = 16$, showing the deployed routers connected to the center]
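Step 3 can be sketched with a breadth-first search from the center: in an unweighted graph, following BFS parents back from a router gives a shortest path in hops. The function names and the small graph below are hypothetical, and the sketch assumes every router is reachable from the center.

```python
from collections import deque

def bfs_parents(adj, center):
    """Map each reachable vertex to its BFS parent (center maps to None)."""
    parent = {center: None}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def connect_to_center(adj, center, routers):
    """Return the routers plus all vertices on their shortest paths to center."""
    parent = bfs_parents(adj, center)
    solution = {center}
    for r in routers:
        node = r
        # Walk toward the center until we hit a vertex already in the solution.
        while node is not None and node not in solution:
            solution.add(node)
            node = parent[node]
    return solution

# Hypothetical graph: a - x - c - y - b, with c as the center.
ADJ = {
    "c": ["x", "y"],
    "x": ["c", "a"],
    "a": ["x"],
    "y": ["c", "b"],
    "b": ["y"],
}
connected = connect_to_center(ADJ, "c", ["a", "b"])
```

Here the relay locations `x` and `y` are added so that `a` and `b` both reach the center.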
The Algorithm
For every candidate location, generate a solution in the following way:
Step 1. Find an area centered at that location, with a radius measured in hops
Step 2. Deploy routers in the area
Step 3. Use the remaining routers to make the solution connected
The best solution is then the final output.
How, exactly, should we deploy the routers?
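The three steps can be put together in a short sketch. This is not the authors' exact procedure: the radius and the number of routers placed in Step 2 are not recoverable from the slides, so they appear here as the hypothetical parameters `radius` and `num_seed`, and Step 2 is approximated by a simple greedy choice that always keeps the center. Candidates that would need more than `k` routers after Step 3 are simply discarded.

```python
from collections import deque

def bfs(adj, center):
    """Return (hop distance, BFS parent) maps from `center`."""
    dist, parent = {center: 0}, {center: None}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], parent[v] = dist[u] + 1, u
                queue.append(v)
    return dist, parent

def greedy_in_area(f, area, center, num_seed):
    """Step 2 (sketch): keep the center, then add the best vertex each round."""
    chosen = {center}
    while len(chosen) < num_seed:
        rest = sorted(area - chosen)
        if not rest:
            break
        chosen.add(max(rest, key=lambda v: f(chosen | {v})))
    return chosen

def deploy(adj, f, k, radius, num_seed):
    """Try every center, build one solution per center, return the best."""
    best_set, best_val = None, float("-inf")
    for center in sorted(adj):
        dist, parent = bfs(adj, center)
        area = {v for v, d in dist.items() if d <= radius}   # Step 1
        seeds = greedy_in_area(f, area, center, num_seed)    # Step 2
        solution = set()
        for seed in seeds:                                   # Step 3
            node = seed
            while node is not None and node not in solution:
                solution.add(node)
                node = parent[node]
        if len(solution) <= k and f(solution) > best_val:
            best_set, best_val = solution, f(solution)
    return best_set, best_val

# Hypothetical instance: path graph 0-1-2-3-4 with made-up coverage sets.
ADJ = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
COVERAGE = {0: {1, 2}, 1: {2}, 2: {3}, 3: {4}, 4: {5, 6, 7}}

def covered(routers):
    return len(set().union(*(COVERAGE[r] for r in routers)))

best_set, best_val = deploy(ADJ, covered, k=3, radius=1, num_seed=2)
```

Each non-center seed lies within `radius` hops of the center, so a solution uses at most `1 + (num_seed - 1) * radius` routers; choosing the two parameters so that this bound stays within `k` keeps every candidate feasible.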
How to Deploy the Routers?
• Solve a subproblem that is similar to the main problem, except that
① The solution can be disconnected
② The center of the given area must be chosen
• It is still NP-hard
– When ② is dropped, Nemhauser et al. propose a $(1 - 1/e)$-approximation algorithm [9]
– We modify Nemhauser’s algorithm to satisfy ②
[9] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher, “An analysis of approximations for maximizing submodular set functions-I,” Mathematical Programming, vol. 14, pp. 265–294, 1978.
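The subproblem can be illustrated with the classic greedy rule (repeatedly add the element with the largest marginal gain) plus a forced starting set. How the authors actually modify the algorithm is not shown on the slide, so pre-seeding the center, as done below, is only one plausible reading; `greedy_max`, `forced`, and the coverage data are hypothetical.

```python
def greedy_max(f, ground, budget, forced=()):
    """Greedy selection of up to `budget` elements, starting from `forced`.

    With forced=() this is the plain marginal-gain greedy rule; passing
    the center in `forced` guarantees that it is part of the result.
    """
    chosen = set(forced)
    while len(chosen) < budget:
        rest = sorted(set(ground) - chosen)
        if not rest:
            break
        # Pick the element with the largest marginal gain; ties go to
        # the first element in sorted order.
        best = max(rest, key=lambda v: f(chosen | {v}) - f(chosen))
        chosen.add(best)
    return chosen

# Hypothetical coverage sets for four candidate locations.
COVERAGE = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2}}

def covered(routers):
    return len(set().union(*(COVERAGE[r] for r in routers)))

plain = greedy_max(covered, COVERAGE, budget=2)
with_center = greedy_max(covered, COVERAGE, budget=2, forced=["c"])
```

In this toy instance the plain rule picks `a` and then `b`, while forcing `c` leaves one slot, which goes to `a`.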
Approximation Ratio
– is the optimal solution when only routers can be used
Our algorithm is an -approximation algorithm
The Problem with Heterogeneous Deployment Costs
Different locations might have different deployment costs
Formal Problem Definition
• Given:
1. A vertex-weighted graph $G = (V, E)$
2. A nondecreasing submodular set function $f$ on the set of subsets of $V$ with …
3. A positive integer $B$
• Find a subset $S \subseteq V$ such that:
1. Connectivity: $S$ is connected with respect to $G$
2. Limited budget: The total weight of $S$ is at most $B$
3. Optimization goal: $f(S)$ is maximized
Approximation Ratio
• , where is the maximum degree of
• A special case: Unit disk graph ⇒
Simulation Results-Use Synthetic Data
Simulation Setting
• Field size: 1200 m × 1200 m
• User:
– # of users: 200
– Zipf’s law
– 802.11b
• Candidate locations:
– Grid network
– Grid size: 100 m × 100 m
• Communication range: 150 m
• Channel error model: 802.11b PHY Simulink Model
Another Common Scenario
• In some applications, a specific location may need to be included in the solution
• We modify our algorithm accordingly. How to find the center?
– Our algorithm: Try all the possible centers and choose the best one
– Our algorithm w/ specific center: Let the specific location be the desired center
Comparison Schemes
• Two greedy heuristics:
– Try all the possible starting locations
– Add one neighboring vertex at a time
– Minimum deployment cost or maximum performance gain
• When goal = maximum number of covered users and costs are homogeneous, we compare with Vandin’s algorithm [17]
[17] F. Vandin, E. Upfal, and B. J. Raphael, “Algorithms for detecting significantly mutated pathways in cancer,” Journal of Computational Biology, vol. 18, pp. 507–522, 2011.
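The maximum-performance-gain heuristic can be sketched as follows for homogeneous costs. The slides give only the three bullet points, so details such as tie-breaking and the stopping rule are assumptions, and `greedy_grow` and the toy instance are hypothetical.

```python
def greedy_grow(adj, f, k):
    """Baseline sketch: grow a connected set from every starting location.

    From each start, repeatedly add the neighboring vertex that yields
    the largest objective value, until k vertices are chosen.
    """
    best_set, best_val = set(), float("-inf")
    for start in sorted(adj):
        chosen = {start}
        while len(chosen) < k:
            # Vertices adjacent to the current set but not yet in it.
            frontier = sorted({v for u in chosen for v in adj[u]} - chosen)
            if not frontier:
                break
            chosen.add(max(frontier, key=lambda v: f(chosen | {v})))
        if f(chosen) > best_val:
            best_set, best_val = chosen, f(chosen)
    return best_set, best_val

# Hypothetical instance: path graph 0-1-2-3-4 with made-up coverage sets.
ADJ = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
COVERAGE = {0: {1, 2}, 1: {2}, 2: {3}, 3: {4}, 4: {5, 6, 7}}

def covered(routers):
    return len(set().union(*(COVERAGE[r] for r in routers)))

heuristic_set, heuristic_val = greedy_grow(ADJ, covered, k=3)
```

Because only neighbors of the current set are ever added, every intermediate set stays connected, which is what makes the heuristic a natural baseline for the connectivity-constrained problem.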
Simulation Scenarios
• Two types of deployment costs:
1. Homogeneous costs
2. Heterogeneous costs
• Two performance metrics:
1. Total data rate
2. The number of covered users
Maximum Total Data Rate
• Homogeneous costs
[Figure: total data rate of covered users (Mb/sec) vs. number of routers, k. Curves: Upper bound; Arbitrary solution; Greedy: max data rate; Greedy: max data rate w/ specific center; Our algorithm; Our algorithm w/ specific center]
• Heterogeneous costs
[Figure: total data rate of covered users (Mb/sec) vs. total budget for deployment, B. Curves: Upper bound; Arbitrary solution; Greedy: min cost; Greedy: min cost w/ specific center; Greedy: max data rate; Greedy: max data rate w/ specific center; Our algorithm; Our algorithm w/ specific center]
Maximum Number of Covered Users
• Homogeneous costs
[Figure: number of covered users vs. number of routers, k. Curves: Upper bound; Arbitrary solution; Vandin’s algorithm; Vandin’s algorithm w/ specific center; Our algorithm; Our algorithm w/ specific center]
• Heterogeneous costs
[Figure: number of covered users vs. total budget for deployment, B. Curves: Upper bound; Arbitrary solution; Greedy: min cost; Greedy: min cost w/ specific center; Greedy: max coverage; Greedy: max coverage w/ specific center; Our algorithm; Our algorithm w/ specific center]
Summary of the simulation results
1. Our algorithm can be applied to different optimization goals
2. The ratio between the upper bound and our algorithm matches the approximation ratio
3. Our algorithms perform better than the greedy heuristics
Simulation Results-Use the Census of Taipei
Use the Census of Taipei
• Use the census to locate the users
• Heterogeneous deployment costs:
– Higher costs are assigned to locations with higher population density
• Goal: Maximize the number of covered users
Input
[Figure: map of the 12 km × 8 km input area]
Total cost of all locations: 60053
Number of users: 7126
Output
[Figure: map of the 12 km × 8 km area showing the output when the available budget = 15000]
Number of covered users: 6600 (≈93% of the total users)
The Results
[Figure: number of covered users vs. total budget for deployment, B. Curves: Upper bound; Arbitrary solution; Greedy: min cost; Greedy: min cost w/ specific center; Greedy: max coverage; Greedy: max coverage w/ specific center; Our algorithm; Our algorithm w/ specific center]
Conclusion
• We study the problem of finding a connected set that maximizes some submodular set function under a limited budget
• We propose a universal algorithm for the mesh deployment problem
• We prove that the approximation ratio of the universal algorithm is
Thank you