Efficient and Effective Practical Algorithms for the Set-Covering Problem Qi Yang, Jamie McPeek, Adam Nofsinger Department of Computer Science and Software Engineering University of Wisconsin at Platteville

Dec 26, 2015


Efficient and Effective Practical Algorithms for the Set-Covering Problem

Qi Yang, Jamie McPeek, Adam Nofsinger

Department of Computer Science and Software Engineering

University of Wisconsin at Platteville


The Set-Covering Problem

Given N sets, let X be the union of all the sets. A cover of X is a group of sets from the N sets such that every element of X belongs to a set in the group.

The set-covering problem is to find a cover of X of the minimum size.


Matrix Representation of the Set-covering Problem

      a  b  c  d  e  f
S1    0  1  1  0  1  0
S2    0  0  1  1  0  0
S3    1  1  0  1  0  1
S4    0  1  0  0  1  1

Number of sets: N = 4

Number of elements: M = 6

One cover: S1, S3, S4

One minimal cover: S1, S3

Not a cover: S1, S2, S4 (a is not covered)
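
As a small illustration of the matrix representation, the sketch below stores the example matrix as rows of 0/1 values and checks the three groups listed on the slide. The names `MATRIX`, `uncovered_elements`, and `is_cover` are made up for this example and do not come from the authors' implementation.

```python
# Rows are sets, columns are the elements a..f (the example matrix above).
ELEMENTS = "abcdef"
MATRIX = {
    "S1": [0, 1, 1, 0, 1, 0],
    "S2": [0, 0, 1, 1, 0, 0],
    "S3": [1, 1, 0, 1, 0, 1],
    "S4": [0, 1, 0, 0, 1, 1],
}


def uncovered_elements(chosen):
    """Return the elements that none of the chosen sets covers."""
    return [
        element
        for j, element in enumerate(ELEMENTS)
        if not any(MATRIX[name][j] for name in chosen)
    ]


def is_cover(chosen):
    """A group of sets is a cover when no element is left uncovered."""
    return not uncovered_elements(chosen)


print(is_cover(["S1", "S3", "S4"]))            # True
print(is_cover(["S1", "S3"]))                  # True
print(uncovered_elements(["S1", "S2", "S4"]))  # ['a']
```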


NP-Hard Problem

Introduction to Algorithms by T. H. Cormen, C. E. Leiserson, R. L. Rivest

The set-covering problem has been proved to be NP-hard

A Greedy Algorithm


Algorithm Greedy

ResultCover: The minimum cover to be found.
Uncovered: The set of elements not covered yet.

1. Set ResultCover to the empty set
2. Set Uncovered to the union of all sets
3. While Uncovered is not empty
   a. select a set S that is not in ResultCover and covers the most elements of Uncovered
   b. add S to ResultCover
   c. remove all elements of S from Uncovered
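
The steps above can be sketched in Python roughly as follows. This is a minimal illustration that uses Python's built-in `set` type, not the BitMap structures described later in the slides. The function name `greedy_cover` and the tie-breaking rule (first set in dictionary order) are assumptions of this sketch.

```python
def greedy_cover(sets):
    """sets: dict mapping a set name to a Python set of its elements."""
    result_cover = []
    uncovered = set().union(*sets.values())
    while uncovered:
        # Select the unused set that covers the most uncovered elements.
        # max() keeps the first maximum, so ties go to dictionary order.
        best = max(
            (name for name in sets if name not in result_cover),
            key=lambda name: len(sets[name] & uncovered),
        )
        result_cover.append(best)
        uncovered -= sets[best]
    return result_cover


# The four example sets from the matrix slide.
EXAMPLE = {
    "S1": {"b", "c", "e"},
    "S2": {"c", "d"},
    "S3": {"a", "b", "d", "f"},
    "S4": {"b", "e", "f"},
}
print(greedy_cover(EXAMPLE))  # ['S3', 'S1']
```

Each pass of the loop rescans every unused set, which is where the extra min(M, N) factor in the Greedy time bound comes from.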


Algorithm Check And Remove (CAR)

Identifying Redundant Search Engines in a Very Large Scale Metasearch Engine Context

8th ACM International Workshop on Web Information and Data Management

The set-covering problem is equivalent to the problem of identifying redundant search engines on the Web

Algorithm CAR is much faster than Algorithm Greedy


Algorithm CAR (Check And Remove)

1. Set ResultCover to the empty set

2. For each set S
   a. determine if S has an element that is not covered by ResultCover
   b. add S to ResultCover if S has such an element
   c. exit the for loop if ResultCover is a cover of X

3. For each set S in ResultCover
   a. determine if S has an element that is not covered by any other set of ResultCover
   b. remove S from ResultCover if S has no such element
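
A minimal Python sketch of the check phase (step 2) and the remove phase (step 3) is shown below. It is written for readability with Python sets, so its remove phase rebuilds the union of the other sets each time and is not meant to match the O(M * N) bound. The name `car_cover` is an assumption of this example.

```python
def car_cover(sets):
    """sets: dict mapping a set name to a Python set of its elements."""
    universe = set().union(*sets.values())

    # Check phase: keep a set only if it adds an element not yet covered.
    result_cover = []
    covered = set()
    for name, members in sets.items():
        if not members <= covered:
            result_cover.append(name)
            covered |= members
        if covered == universe:
            break

    # Remove phase: drop a set if the other chosen sets cover all of it.
    for name in list(result_cover):
        others = set()
        for other in result_cover:
            if other != name:
                others |= sets[other]
        if sets[name] <= others:
            result_cover.remove(name)
    return result_cover


EXAMPLE = {
    "S1": {"b", "c", "e"},
    "S2": {"c", "d"},
    "S3": {"a", "b", "d", "f"},
    "S4": {"b", "e", "f"},
}
print(car_cover(EXAMPLE))  # ['S1', 'S3']
```

On the four example sets the check phase selects S1, S2, S3 and the remove phase then drops S2.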


Example

      a  b  c  d  e  f
S1    0  1  1  0  1  0
S2    0  0  1  1  0  0
S3    1  1  0  1  0  1
S4    0  1  0  0  1  1

Set   ResultCover     Uncovered
      {}              {a, b, c, d, e, f}
S1    {S1}            {a, d, f}
S2    {S1, S2}        {a, f}
S3    {S1, S2, S3}    {}

Removing S2
      {S1, S3}        {}


Time Complexity

Algorithm Greedy O(M * N * min(M, N))

Algorithm CAR O(M * N)

N: number of sets
M: number of elements of the union X


CPU Time

[Figure: CPU Time chart. x-axis: Actual Cover Size (100 to 1000); y-axis: CPU Time (Sec) (0 to 40000); series: Greedy, CAR]


Cover Size

Cover Sizes of the Two Algorithms

Actual   100   300   500   700   900
Greedy   105   300   501   700   900
CAR      485   300   500   700   900


Implementation Details

Read data
   Binary search tree
   BitMap indicating which sets cover an element

Convert the tree to an array of BitMaps
   Matrix representation of the set-cover problem

Find a cover


Binary Search Tree and BitMap

[Figure: binary search tree whose nodes are elements]

Number of sets (N) is known
Number of elements of each set is known
The total number of elements is unknown

Reading elements of one set at a time

BitMap
   size N
   which sets cover the element
   a column of the matrix
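
To make the column BitMap idea concrete, here is a rough Python sketch that reads one set at a time and keeps, for each element, an integer whose bit i is set when set i contains the element. It swaps the slides' binary search tree for a plain `dict` and sorts the keys at the end to get the same ordered array. That substitution, and the name `build_column_bitmaps`, are assumptions of this example.

```python
def build_column_bitmaps(sets_in_order):
    """sets_in_order: list of iterables; set i owns bit i of every bitmap."""
    columns = {}  # element -> integer bitmap of the sets that contain it
    for i, members in enumerate(sets_in_order):
        for element in members:
            columns[element] = columns.get(element, 0) | (1 << i)
    # Sorted keys stand in for an in-order traversal of the search tree.
    elements = sorted(columns)
    return elements, [columns[element] for element in elements]


# S1..S4 from the matrix slide, read one set at a time.
elements, bitmaps = build_column_bitmaps(["bce", "cd", "abdf", "bef"])
print(elements)  # ['a', 'b', 'c', 'd', 'e', 'f']
print(bitmaps)   # [4, 13, 3, 6, 9, 12]
```

For instance, element b belongs to S1, S3, and S4, so its bitmap has bits 0, 2, and 3 set, which is 13.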


Array of Column BitMaps

Row Operations
• Find the number of elements in a set that are not covered by the result cover
• Determine if a set contains an element that is not covered by the result cover
• Determine if a set in the result cover has an element that is not covered by any other sets in result cover
• …

[Figure: array of column BitMaps, one per element: e1, e2, e3, e4, ..., em-1, em]


Array of Row BitMaps

It takes some time to convert column BitMaps to row BitMaps.

But all row operations are performed within a row BitMap.
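
The conversion and one row operation might look like the sketch below, which again uses Python integers as BitMaps. The functions `columns_to_rows` and `count_uncovered` are hypothetical names for this example. The column values are the ones that correspond to the four-set example matrix, with bit i of a column standing for set S(i+1).

```python
def columns_to_rows(column_bitmaps, num_sets):
    """Transpose: bit i of column j becomes bit j of row i."""
    rows = [0] * num_sets
    for j, column in enumerate(column_bitmaps):
        for i in range(num_sets):
            if (column >> i) & 1:
                rows[i] |= 1 << j
    return rows


def count_uncovered(row, covered):
    """Number of elements in a set (row bitmap) that are not yet covered."""
    return bin(row & ~covered).count("1")


# Column bitmaps for elements a..f of the example matrix (an assumption
# of this sketch: bit 0 is S1, bit 1 is S2, and so on).
columns = [4, 13, 3, 6, 9, 12]
rows = columns_to_rows(columns, 4)
print(rows)                              # [22, 12, 43, 50]
print(count_uncovered(rows[3], rows[0]))  # 1
```

The last line counts the elements of S4 that S1 leaves uncovered. Only f qualifies, and the whole operation is a single AND on one row.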


CPU Time

Running Times (seconds) of the Greedy Algorithm

Col   0.63   53.9   300   1220   2130   3457   5056
Row   0.28    7.6    41    161    274    437    629

Running Times (seconds) of the CAR Algorithm

Col   0.01   0.31   1.63   6.36   11.15   16.92   20.70
Row   0.09   0.39   0.96   2.12    3.27    4.34    5.46

The CPU time includes the time to convert column BitMaps to row BitMaps, but not the time to build the tree.


CPU Time (Row BitMap)

Running Times (seconds) of the Two Algorithms

Greedy   0.28   7.6    41     161    274    437    629
CAR      0.09   0.39   0.96   2.12   3.27   4.34   5.46


Algorithm Greedy

1. Set ResultCover to the empty set
2. Set Uncovered to the union of all sets
3. While Uncovered is not empty
   a. select a set S that is not in ResultCover and covers the most elements of Uncovered
   b. add S to ResultCover
   c. remove all elements of S from Uncovered


Algorithm Greedy Update

UncoveredCount: number of elements of a set not covered by ResultCover

1. Set ResultCover to the empty set

2. Set Uncovered to the union of all sets

3. For each set, set the UncoveredCount to the size of the set

4. While Uncovered is not empty

a. select a set that has the largest value of UncoveredCount among all sets not in ResultCover

b. add the set to ResultCover

c. remove all elements of the set from Uncovered

d. update the value of UncoveredCount for each set not in ResultCover


Update Uncovered Count

For each element in the set to be added to ResultCover
    If the result cover does not cover it
        For each set not in the result cover
            If the set contains the element
                its UncoveredCount is decremented by one
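
Putting Algorithm Greedy Update and the count-update step together gives roughly the following Python sketch. It uses dictionaries and sets rather than BitMaps, and the names `greedy_update_cover` and `containing` are assumptions of this example.

```python
def greedy_update_cover(sets):
    """sets: dict mapping a set name to a Python set of its elements."""
    # Column view: element -> names of the sets that contain it.
    containing = {}
    for name, members in sets.items():
        for element in members:
            containing.setdefault(element, []).append(name)

    uncovered = set(containing)
    # UncoveredCount, kept only for sets that are not in ResultCover.
    uncovered_count = {name: len(members) for name, members in sets.items()}
    result_cover = []
    while uncovered:
        best = max(uncovered_count, key=uncovered_count.get)
        del uncovered_count[best]  # best now belongs to ResultCover
        result_cover.append(best)
        for element in sets[best]:
            if element in uncovered:  # newly covered by best
                uncovered.discard(element)
                for other in containing[element]:
                    if other in uncovered_count:
                        uncovered_count[other] -= 1
    return result_cover


EXAMPLE = {
    "S1": {"b", "c", "e"},
    "S2": {"c", "d"},
    "S3": {"a", "b", "d", "f"},
    "S4": {"b", "e", "f"},
}
print(greedy_update_cover(EXAMPLE))  # ['S3', 'S1']
```

Each element is removed from Uncovered only once, so each column is walked at most once over the whole run. The counts are adjusted incrementally rather than recomputed on every iteration.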


Time Complexity

Algorithm Greedy O(M * N * min(M, N))

Algorithm CAR O(M * N)

Algorithm Greedy Update O(M * N)


CPU Time

Running Times (seconds) of the Two Algorithms

Update   0.15   0.92   2.26   5.13   7.31   10.1   13.1
CAR      0.09   0.39   0.96   2.12   3.27   4.34   5.46


Algorithm List And Remove (LAR)

Implements the matrix using linked lists instead of an array of BitMaps

Algorithm Greedy Update plus the remove phase from Algorithm CAR


Linked List for Matrix

[Figure: linked-list representation of the matrix, with rows S1 to S5 and columns e1 to e7]
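
The slides do not give the details of Algorithm LAR, so the following is only a guess at its overall shape: the Greedy Update phase over list-based rows and columns, followed by a CAR-style remove phase. Python lists stand in for the linked lists, and the name `lar_cover`, the `cover_count` bookkeeping, and the sample data are all assumptions of this sketch.

```python
def lar_cover(sets):
    """sets: dict mapping a set name to a list of its distinct elements."""
    # Column lists: element -> names of the sets that contain it.
    containing = {}
    for name, members in sets.items():
        for element in members:
            containing.setdefault(element, []).append(name)

    # Phase 1: Greedy Update, storing only the nonzero matrix entries.
    uncovered = set(containing)
    uncovered_count = {name: len(members) for name, members in sets.items()}
    result_cover = []
    while uncovered:
        best = max(uncovered_count, key=uncovered_count.get)
        del uncovered_count[best]
        result_cover.append(best)
        for element in sets[best]:
            if element in uncovered:
                uncovered.discard(element)
                for other in containing[element]:
                    if other in uncovered_count:
                        uncovered_count[other] -= 1

    # Phase 2: remove phase. Count how many chosen sets cover each element
    # and drop any set whose elements are all covered more than once.
    cover_count = {element: 0 for element in containing}
    for name in result_cover:
        for element in sets[name]:
            cover_count[element] += 1
    for name in list(result_cover):
        if all(cover_count[element] > 1 for element in sets[name]):
            result_cover.remove(name)
            for element in sets[name]:
                cover_count[element] -= 1
    return result_cover


# Made-up data where the greedy phase picks A first, then B and C,
# after which A is redundant.
SAMPLE = {"A": [1, 2, 3, 4], "B": [1, 2, 5], "C": [3, 4, 6]}
print(lar_cover(SAMPLE))  # ['B', 'C']
```

Storing only the entries that are present means the work depends on the number of 1s in the matrix rather than on M * N, which is one plausible reason a list-based layout could help on sparse data.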


CPU Time

Running Times (seconds) of the Two Algorithms

LAR   0.21   0.35   0.51   0.86   1.11   1.40   1.66
CAR   0.26   0.49   0.65   1.01   1.24   1.46   1.67


Cover Size

Cover Sizes of the Two Algorithms

LAR   10    87   191   422   607   815   971
CAR   16   120   235   467   648   824   975


Cover Size (Different Data Sets)

Cover Sizes of the Two Algorithms

Actual    50    70    90   110   200   500   900
LAR       50    70    90   110   200   500   900
CAR      291   391   496   528   200   500   900


Summary

Algorithm LAR runs faster than Algorithm CAR

Algorithm LAR generates smaller cover sets than Algorithm CAR

Algorithm: Updating vs. searching every time

Data Structure: Linked list vs. array of BitMaps


Questions?