Small-size ε-nets for Axis-Parallel Rectangles and Boxes

Boris Aronov (Polytechnic Institute of NYU), Esther Ezra (Duke University), Micha Sharir (Tel-Aviv University)


Range Spaces

Range space (X, R) :

X – Ground set (the “universe”).

R – Ranges: Subsets of X .

|R| ≤ 2^|X|

Abstract form: Hypergraphs.

X – vertices.

R – hyperedges.

Geometric Range Spaces

Specification: X ⊆ ℝ^d, R = set of simply-shaped regions in ℝ^d.

X – Points on the real line.

R – Intervals.

X – Points on the plane.

R – halfplanes, disks,…

For simplicity, assume X is finite:

|R| is polynomial in |X|.

ε-nets for range spaces

Given:

• A range space (X, R) , assume X is finite, |X| = n .

• A parameter 0 < ε < 1.

An ε-net for (X, R) is a subset N ⊆ X that hits every range Q ∈ R with |Q ∩ X| ≥ εn.

N is a hitting set for all the "heavy" ranges, i.e., those that capture at least an ε-fraction of the universe.

Example:

Points and intervals on the real line: |N| = 1/ε.

The bound does not depend on n.
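The interval example can be made concrete with a minimal sketch: sort the points and keep every k-th one, where k = ⌈εn⌉. Any interval containing at least εn points covers k consecutive points in sorted order and therefore contains a kept point. The function name `interval_eps_net` and the small floating-point tolerance are assumptions made for this illustration.

```python
import math

def interval_eps_net(points, eps):
    """Return an eps-net for points on the real line with respect to intervals.

    Keeps every k-th point in sorted order, with k = ceil(eps * n).
    The 1e-9 tolerance guards against floating-point error in eps * n.
    """
    xs = sorted(points)
    n = len(xs)
    k = max(1, math.ceil(eps * n - 1e-9))
    # Sorted positions k-1, 2k-1, 3k-1, ... : every window of k
    # consecutive positions contains exactly one of them.
    return xs[k - 1::k]
```

The net has ⌊n/k⌋ ≤ 1/ε points, independent of n.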

The hitting-set problem

A hitting set for (X, R) is a subset H ⊆ X, s.t., for any Q ∈ R, Q ∩ H ≠ ∅.

Goal: find smallest hitting set.

Useful applications: art-gallery, sensor networking, and more.

Hardness of hitting sets

Finding a hitting set of smallest size is NP-hard, even for geometric range spaces!

Use an approximation algorithm instead.

Abstract range spaces [Chvatal 79]: Greedy algorithm. Approximation factor: O(log |X|).

Geometric range spaces [Bronimann-Goodrich 95], [Clarkson 93]: Achieve improved approximation factor! Approximation factor: O(log OPT), or smaller!

OPT = size of the smallest hitting set.

This is achieved via ε-nets: Small-size ε-nets imply small approximation factors!
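The greedy algorithm for abstract range spaces mentioned above can be sketched as follows: repeatedly pick the element contained in the largest number of ranges not yet hit. This is a minimal illustration, not the slides' own code. The function name `greedy_hitting_set`, the representation of ranges as Python sets, and the tie-breaking rule are assumptions.

```python
def greedy_hitting_set(ranges):
    """Greedy hitting set: ranges is an iterable of sets of comparable elements.

    Empty ranges cannot be hit and are ignored.
    """
    unhit = [set(r) for r in ranges if r]
    chosen = []
    while unhit:
        # Count, for each element, how many unhit ranges contain it.
        counts = {}
        for r in unhit:
            for x in r:
                counts[x] = counts.get(x, 0) + 1
        # Pick an element hitting the most ranges (smallest element on ties).
        best = max(sorted(counts), key=counts.get)
        chosen.append(best)
        unhit = [r for r in unhit if best not in r]
    return chosen
```

For example, on the ranges {1,2}, {2,3}, {3,4}, {4,5} the sketch first picks 2, then 4.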

An upper bound for the ε-net size

The ε-net theorem [Haussler-Welzl 87]: If the ranges are simply-shaped regions, then, for any ε > 0, a random sample of size O((1/ε) log (1/ε)) is an ε-net, with constant probability.

The bound does not depend on n.

Remark: In fact, it is sufficient to assume that the number of ranges is only polynomial in n.

Is it optimal?

The lower bound

Theorem [Komlos, Pach, Woeginger 92]: The bound is tight!

The construction: Artificial on abstract hypergraphs (non-geometric!). No lower bound better than Ω(1/ε) is known in geometry.

What is the actual bound? O(1/ε)?

Goal: Obtain smaller bounds for geometric range spaces. Ideally O(1/ε) (achieved by points and intervals on the real line), but anything better than O((1/ε) log (1/ε)) is "exciting"!

Previous results

Points and halfspaces in 2D, 3D: O(1/ε) [Matousek 92], [Pyrga, Ray 08], [Har-Peled et al. 08].

Points and disks, or pseudo-disks, in 2D: O(1/ε) [Matousek, Seidel, Welzl 90], [Pyrga, Ray 08].

(Figure: pseudo-disks.)

Our results

Points and axis-parallel rectangles in the plane. ε-net size is O((1/ε) log log (1/ε)).

Points and axis-parallel boxes in 3-space. ε-net size is O((1/ε) log log (1/ε)).

Points and α-fat triangles in the plane (each of the angles is at least α). ε-net size is O((1/ε) log log (1/ε)).

Points uniformly distributed over the unit-cube, and axis-parallel boxes in d-space. ε-net size is O((1/ε) log log (1/ε)).

Improved approximation factors for geometric hitting sets

Ranges                      Previous bound   New bound
Axis-parallel rectangles    log OPT          log log OPT
Axis-parallel 3-boxes       log OPT          log log OPT
α-fat triangles             log OPT          log log OPT
Axis-parallel d-boxes (*)   log OPT          log log OPT

(*) Uniformly distributed points in [0,1]^d.

Main idea: Use two-level sampling

Primary sampling step: Obtain an initial sample S of ~ 1/ε points of X. On average, each heavy rectangle Q (one that contains at least εn points) must satisfy Q ∩ S ≠ ∅.

Second sampling step (repair step): In each heavy rectangle Q ∈ R with Q ∩ S = ∅, sample additional points to guarantee that Q is stabbed by the net.

The ε-net construction

Input: X - a set of n points. Parameters: r := 1/ε.

Primary sample: Produce a random sample S ⊆ X of size |S| = r. Make S part of the output.

Apply the second sampling step in each empty rectangle…

Instead of processing all input rectangles, we consider a smaller set of representative rectangles.

The set of maximal S-empty rectangles

A maximal S-empty rectangle M satisfies int(M) ∩ S = ∅, and for each rectangle M' ⊋ M, int(M') ∩ S ≠ ∅.

M is defined by at most 4 points of S.

ℳ – set of all maximal S-empty rectangles.

Apply the repair step on ℳ instead of on the input rectangles.

Why is it sufficient to consider the maximal S-empty rectangles?

For each input heavy rectangle Q with Q ∩ S = ∅, expand Q until each of its sides touches a point of S or continues to ±∞. The result is a maximal S-empty rectangle M ⊇ Q.

Since Q is heavy, a sufficiently large sample in M will hit Q, with high probability.

Otherwise (Q ∩ S ≠ ∅), Q is already hit and we are done!
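The expansion of Q described above can be sketched in a few lines: push the left and right sides outward first, then the bottom and top. This is a minimal illustration that assumes points in general position (no point of S on the boundary lines of Q), and the name `expand_to_maximal` and the tuple layout `(x1, x2, y1, y2)` are choices made for this sketch.

```python
import math

def expand_to_maximal(Q, S):
    """Expand an S-empty rectangle Q = (x1, x2, y1, y2) to a maximal one.

    S is a list of (x, y) points; the interior of Q must contain none of them.
    Sides that meet no point of S continue to +/- infinity.
    """
    x1, x2, y1, y2 = Q
    # Left/right sides are blocked by points whose y lies strictly inside (y1, y2).
    left = max((px for px, py in S if y1 < py < y2 and px <= x1), default=-math.inf)
    right = min((px for px, py in S if y1 < py < y2 and px >= x2), default=math.inf)
    # Bottom/top sides are blocked by points whose x lies strictly inside (left, right).
    bottom = max((py for px, py in S if left < px < right and py <= y1), default=-math.inf)
    top = min((py for px, py in S if left < px < right and py >= y2), default=math.inf)
    return (left, right, bottom, top)
```

Expanding horizontally before vertically is one of several valid orders; a different order may yield a different maximal S-empty rectangle that also contains Q.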

The repair step [CF-90, CV-07]

Consider a heavy rectangle M with |M ∩ X| = t·n/r, where 1 ≤ t ≤ log r and r = 1/ε. The parameter t is the excess of M.

Second sampling step:

Construct a (1/t)-net N_M inside M by sampling O(t log t) points in M. The "universe" size is now t·n/r.

According to the ε-net theorem, each input (empty) rectangle Q ∈ R, Q ⊆ M, with |Q ∩ X| ≥ n/r must be stabbed by N_M: a (1/t)-net of a universe of size t·n/r hits every range that contains at least (1/t)·(t·n/r) = n/r = εn of its points.

The final ε-net

Output:

The union of S and ⋃_M N_M.

What is the expected size of the ε-net?

r + E{ Σ_{t ≥ 1} t log t · |ℳ_t| }

ℳ_t = set of rectangles in ℳ with excess ≥ t, where ℳ is the set of all maximal S-empty rectangles.

Exponential Decay Lemma: [Chazelle, Friedman 90], [Agarwal, Matousek, Schwarzkopf 98]

E{ |ℳ_t| } = O( 2^(-t) E{ |ℳ| } )

The number of heavy rectangles decreases exponentially!

Expected number of maximal empty rectangles

Theorem: E{ |ℳ| } = O(r log r)

⇒ E{ |ℳ_t| } = O(2^(-t) r log r), which is O(r log r) for every t ≥ 1.

The expected ε-net size is O((1/ε) log (1/ε)).
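The bound above can be checked with a short calculation. The sketch below writes N for the final net and ℳ_t for the set of maximal S-empty rectangles of excess at least t. As an assumption of this sketch, it charges (t+1) log(t+1) per rectangle of excess in [t, t+1), so that the term for t = 1 does not vanish, and takes logarithms to base 2.

```latex
\begin{align*}
\mathbf{E}\bigl[|N|\bigr]
  &\le r + \sum_{t \ge 1} O\bigl((t+1)\log(t+1)\bigr)\cdot \mathbf{E}\bigl[|\mathcal{M}_t|\bigr] \\
  &= r + O\Bigl(\mathbf{E}\bigl[|\mathcal{M}|\bigr]\Bigr)\sum_{t \ge 1} (t+1)\log(t+1)\,2^{-t} \\
  &= r + O(r\log r)\cdot O(1) \;=\; O(r\log r) \;=\; O\Bigl(\tfrac{1}{\varepsilon}\log\tfrac{1}{\varepsilon}\Bigr).
\end{align*}
```

The series converges to a constant, so the whole cost is dominated by the O(r log r) bound on the number of maximal S-empty rectangles.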

An improved ε-net

With |S| = r and a repair of every rectangle of excess t ≥ 1 (that is, |M ∩ X| ≥ n/r), there is no improvement yet…

Key observation: Use oversampling. Choose a slightly larger primary sample, |S| = c r log log r with c > 1, and repair only rectangles M with excess t ≥ c log log r.

What have we gained?

With s = |S| = c r log log r, an S-empty rectangle now contains "on average" at most O(n/(r log log r)) << n/r points.

So a rectangle M with |M ∩ X| ≥ n/r cannot be an "average" S-empty rectangle. It is much heavier: its excess is t ≥ c log log r.

Exponential Decay Lemma: E{ |ℳ_t| } = O( 2^(-t) E{ |ℳ| } ), where ℳ is the set of all maximal S-empty rectangles and ℳ_t is the subset of those with excess ≥ t. The number of maximal heavy S-empty rectangles is much smaller!

E{ |ℳ_t| } = O(s log s / polylog r) = o(r) for t = c log log r.

The number of heavy (empty) rectangles is only sublinear in r! The expected ε-net size is O(r log log r).
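A worked version of this estimate, as a sketch only: N denotes the final net, ℳ is the set of maximal S-empty rectangles with E|ℳ| = O(s log s), and t_0 = c log log r is the repair threshold. The per-rectangle cost O(t log t) from the repair step is used as an upper bound (written as (t+1) log(t+1), an assumption of this sketch), and logarithms are to base 2.

```latex
\begin{align*}
s &= c\,r\log\log r, \qquad t_0 = c\log\log r, \qquad 2^{-t_0} = (\log r)^{-c}, \\
\mathbf{E}[\text{repair}]
  &\le \sum_{t \ge t_0} O\bigl((t+1)\log(t+1)\bigr)\, O\bigl(2^{-t}\, s\log s\bigr)
   = O\bigl(t_0\log t_0 \cdot 2^{-t_0}\cdot s\log s\bigr) \\
  &= O\!\left(\frac{r\,(\log\log r)^2\,\log\log\log r}{(\log r)^{c-1}}\right) = o(r)
   \quad \text{for } c > 1, \\
\mathbf{E}\bigl[|N|\bigr] &\le s + o(r) = O(r\log\log r)
   = O\Bigl(\tfrac{1}{\varepsilon}\log\log\tfrac{1}{\varepsilon}\Bigr).
\end{align*}
```

The primary sample now dominates the size of the net; the repair step contributes only a lower-order term.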

Oversampling: A trick or a technique?

By oversampling at the preliminary step,

we significantly decrease the size of the secondary sample.

Note: The number of maximal S-empty rectangles is O(s log s) ,

however, we do not traverse all of them,

but only the heavy ones!

New Concept:

The sample points and the maximal S-empty rectangles

are two different entities.

Thank you very much!

Bounding the number of maximal S-empty rectangles

Upper bound: O(s^2).

Each rectangle is determined by its two opposite corners.

Problem:

The bound O(s^2) is bad for the analysis, and yields an ε-net of size O(1/ε^2)!

Recall s = c (1/ε) log log (1/ε).

Quadratic lower bound construction

A staircase construction: Each point in the upper staircase is matched with each point in the lower staircase.

Ω(s^2) empty rectangles.

We can prune away most of these rectangles and remain with only O(s log s) rectangles.
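The staircase construction can be checked by brute force. The sketch below places m points on a lower staircase and m points on an upper staircase (so s = 2m) and counts the pairs whose spanned rectangle has no other point in its interior. The coordinates and function names are assumptions chosen for this illustration.

```python
def staircase_points(m):
    """Two descending staircases; every upper point dominates every lower point."""
    lower = [(i, m - i) for i in range(m)]
    upper = [(m + j, 2 * m - j) for j in range(m)]
    return lower, upper

def count_empty_corner_rectangles(m):
    """Count pairs (p in lower, q in upper) whose rectangle, with p and q as
    opposite corners, contains no other point in its interior."""
    lower, upper = staircase_points(m)
    points = lower + upper
    count = 0
    for (x1, y1) in lower:
        for (x2, y2) in upper:
            if not any(x1 < x < x2 and y1 < y < y2 for x, y in points):
                count += 1
    return count
```

Every one of the m^2 = s^2/4 pairs yields an empty rectangle, which matches the quadratic bound.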

An O(s log s) bound for |ℳ|

(ℳ is the set of all maximal S-empty rectangles.)

Key observation: Consider a vertical line ℓ, and all points to its left.

Claim: The number of maximal S-empty rectangles anchored at ℓ is only linear.

Next step: Use a tree decomposition built on top of X in order to obtain the O(s log s) bound.

Dual (Geometric) Range Spaces

Flip roles of X and R, and obtain (R, X*). R = set of regions in ℝ^d,

X* = {R_p | p ∈ X}, R_p = {r | r ∈ R, r contains p}.

R – Intervals. X* – Subsets of intervals containing a common point in ℝ^1.

R – Disks. X* – Subsets of disks containing a common point in ℝ^2.

ε-nets for dual range spaces

An ε-net for (R, X*) is a subset N ⊆ R that covers all points at depth ≥ ε|R|.

depth(p) = #ranges that cover p ∈ X.

An ε-net is a set cover for all the "deep" points.

Upper bound [Haussler-Welzl, 87]:

O((δ/ε) log (δ/ε)), where δ is the VC-dimension of (R, X*).
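The dual definition can be illustrated with a small checker for intervals on the real line: compute the depth of each point and verify that every deep point is covered by the candidate net. This is only a sketch of the definition, not a construction, and the names `depth` and `is_dual_eps_net` are assumptions.

```python
def depth(p, intervals):
    """Number of closed intervals (a, b) that cover the point p."""
    return sum(1 for a, b in intervals if a <= p <= b)

def is_dual_eps_net(net, intervals, points, eps):
    """True if every point of depth >= eps * |intervals| is covered by `net`.

    `net` is expected to be a subset of `intervals`.
    """
    threshold = eps * len(intervals)
    return all(depth(p, net) >= 1
               for p in points
               if depth(p, intervals) >= threshold)
```

With intervals [0,4], [1,5], [2,6], [7,9], points 0, 3, 8 and ε = 1/2, only the point 3 is deep (depth 3), so the single interval [1,5] is already a valid net.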

Previous results

Range space (R, X*), s.t., for each T ⊆ R, |T| = m, the union ⋃T has (a small) complexity o(m log m): ε-net size is o((1/ε) log (1/ε)). [Clarkson, Varadarajan 07]

Theorem: [Clarkson, Varadarajan 07] The complexity of the union is O(m φ(m)) ⇒ ε-net size is O((1/ε) φ(1/ε)).

φ(·) is a slowly growing function.

In fact, this should be the complexity of the vertical decomposition of the complement of the union.

More about the Clarkson-Varadarajan technique

Example: disks (or pseudo-disks) and points

Input: A set T of m (pseudo) disks.

Union complexity: O(m). [Kedem et al. 86]

ε-net size is O(1/ε).

Example: fat triangles and points

Input: A set T of m α-fat triangles (each of the angles is at least α).

Union complexity: O(m log log m). [Matousek et al. 1994]

ε-net size is O((1/ε) log log (1/ε)).

Our results: Dual

Theorem: [Clarkson, Varadarajan 07] The complexity of the union is O(m φ(m)) ⇒ ε-net size is O((1/ε) φ(1/ε)).

Using the oversampling concept:

Theorem (improvement!): The complexity of the union is O(m φ(m)) ⇒ ε-net size is O((1/ε) log φ(1/ε)).

Proof sketch

Draw a random sample S of s = (c/ε) log φ(1/ε) regions.

Construct the union of S: Decompose its complement into O(s φ(s)) "trapezoidal cells". Each cell is defined by at most 4 regions.

Claim: With high probability, each cell meets at most O((n/s) log s) regions of the input.

Apply a repair step on the heavy cells:

Sample O(t log t) regions in each cell that meets t·n/s regions, for t ≥ c log φ(1/ε).

Each point at depth ≥ εn is covered by at least one region.

Use the Exponential Decay Lemma to show: # regions sampled at the repair step = o(1/ε).

Overall ε-net size: O((1/ε) log φ(1/ε)).

New ε-net bounds

Fat triangles: Union complexity: O(m log log m) ⇒ ε-net size is O((1/ε) log log log (1/ε)).

Locally γ-fat objects: Union complexity: O(m polylog m) ⇒ ε-net size is O((1/ε) log log (1/ε)).

And several other improved bounds.

(Figure: an object O and a disk D, with area(D ∩ O) ≥ γ · area(D), 0 < γ ≤ 1.)

Open problems

• Improve our upper bound O((1/ε) log log (1/ε)) for points and axis-parallel rectangles. Conservative goal: Obtain a weak ε-net of size o((1/ε) log log (1/ε)). (In a weak ε-net, the points of the net are not necessarily chosen from X.)

• Extend our bound to points and axis-parallel boxes in dimension d ≥ 4. Best known upper bound: O((1/ε) log (1/ε)).

• Dual range spaces for rectangles and points. Best known upper bound: O((1/ε) log (1/ε)). Can it be improved to O((1/ε) log log (1/ε))?

Motivation: Approximation for geometric hitting sets

The Bronimann-Goodrich technique / LP-relaxation: If (X, R) admits an ε-net of size f(1/ε), then there exists a polynomial-time approximation algorithm that reports a hitting set of size O(f(OPT)).

Idea:

Assign weights on X s.t. each range Q ∈ R becomes heavy. Construct an ε-net for the weighted range space. Each range is hit by the ε-net.

Small-size ε-nets imply small approximation factors!

The repair step

Repair step: On average, each heavy rectangle Q must satisfy Q ∩ S ≠ ∅. The number of "bad" rectangles is small.

It is sufficient to consider a set ℳ of maximal S-empty rectangles, instead of R. ℳ is defined over the points of S. |ℳ| = f(1/ε) (does not depend on n), and so does the number of points sampled at the repair step.

An O(s log s) bound for |ℳ|

Key observation: Consider a vertical line ℓ, and all points to its left.

Claim: The number of maximal S-empty rectangles anchored at ℓ is only linear.

Handling a query rectangle Q: One of the halves Q' of Q contains at least n/(2r) points. Q' is anchored at ℓ. Expand Q' on the "heavier" side of ℓ.

Tree decomposition

• Build a balanced binary tree 𝒯 on X, sorted by x-coordinate. Each node v is a vertical strip.

• Stop the expansion of 𝒯 when nodes have ≤ n/r points.

𝒯 has O(log r) = O(log s) levels.

At each level:

#maximal S-empty anchored rectangles: O(s)

Overall (over all levels): O(s log s).

Query rectangle Q

For an input rectangle Q with ≥ n/r points:

Find the first (highest) node of 𝒯 whose bounding line ℓ meets Q.

Expand Q within the "heavier" strip v bounded by ℓ.

The maximal S-empty anchored rectangles comprise the representative set for R.
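The search for the highest node whose bounding line meets Q can be sketched as a descent over the sorted x-coordinates. This is a minimal illustration under stated assumptions: the tree is implicit (each node is a contiguous index range split at its median), the splitting line is placed midway between the two middle points, and the name `highest_splitting_line` and the parameter `leaf_size` (playing the role of n/r) are hypothetical.

```python
def highest_splitting_line(xs, q_left, q_right, leaf_size):
    """Find the highest tree node whose splitting line crosses [q_left, q_right].

    xs: x-coordinates sorted in increasing order.
    Returns (depth, split_x), or None if Q lies inside a single leaf strip.
    """
    lo, hi, depth = 0, len(xs), 0
    while hi - lo > leaf_size:
        mid = (lo + hi) // 2
        split = (xs[mid - 1] + xs[mid]) / 2  # vertical line between the two halves
        if q_left <= split <= q_right:
            return depth, split
        if q_right < split:
            hi = mid   # Q lies entirely in the left child strip
        else:
            lo = mid   # Q lies entirely in the right child strip
        depth += 1
    return None
```

A rectangle that is not split at any level lies inside a leaf strip, which holds at most `leaf_size` points, so a rectangle with more points than that is always cut by some line.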


Bounding the ε-net size

Exponential Decay Lemma: [Chazelle, Friedman 90], [Agarwal, Matousek, Schwarzkopf 98]

E{ |ℳ_t| } = O( 2^(-t) E{ |ℳ'| } ), where:

• S' is a smaller random sample, each point chosen with probability s/(t n).

• ℳ_t - all maximal S-empty rectangles M with excess t_M ≥ t.

• ℳ' - all maximal S'-empty rectangles.

Bounding the final ε-net size

A very useful tool:

Exponential Decay Lemma: [Chazelle, Friedman 90], [Agarwal, Matousek, Schwarzkopf 98]

E{ |ℳ_t| } = O( 2^(-t) E{ |ℳ| } ),

where ℳ is the set of all maximal S-empty rectangles and ℳ_t is the set of those rectangles M with excess t_M ≥ t.

The number of heavy rectangles decreases exponentially!

A nearly-linear bound for |ℳ|

Fix a node v of the tree 𝒯 and its strip σ_v:

X_v = X ∩ σ_v, S_v = S ∩ σ_v

Lemma:

The number of maximal S_v-empty anchored rectangles in σ_v is O(|S_v|).

At a fixed level i of 𝒯, the overall number is O(s).

Overall: O(s log r).

(Figure: the strip of v and its entry side.)

The set-cover problem

Primal: A hitting set for (X, R) is a subset H ⊆ X, s.t., for any Q ∈ R, Q ∩ H ≠ ∅.

Dual: A set cover for (X, R) is a subset S ⊆ R, s.t., any x ∈ X is covered by S.

A set cover for (X, R) is a hitting set for (R, X*).

Finding a set cover of smallest size is NP-hard (even for geometric range spaces)!

Achieve improved approximation factors via ε-nets (using the Bronimann-Goodrich technique / LP-relaxation): an ε-net of size f(1/ε) ⇒ a set cover of size O(f(OPT)).

ε-nets for dual range spaces

An ε-net for (R, X*) is a subset N ⊆ R that covers all points at depth ≥ ε|R|.

depth(p) = #ranges that cover p ∈ X.

An ε-net is a set cover for all the deep points.

Example:

Intervals and points on the real line: |N| = 1/ε.

Extensions to axis-parallel boxes in 3-space

Use similar machinery, with s = c r log log r, and a 3-level range tree decomposition.

At each fixed triple-level of the tree, we have a subdivision of space into (clipped) orthants.

(Figure: the three levels of the range tree, in x-order, y-order, and z-order.)

Axis-parallel boxes in 3-space

Fix an orthant σ.

Consider the points in σ, and the set ℳ_σ of all maximal S-empty boxes anchored at the apex of σ.

Claim: |ℳ_σ| = O(s).

All these boxes grow from a common point. They behave as maximal S-empty orthants!

E{ |ℳ| } = E{ Σ_σ |ℳ_σ| } = O(s log^3 s)

The expected size of the ε-net is O((1/ε) log log (1/ε)).
