Jan 14, 2016
Multimedia Database Systems
Indexing Part B: Metric-based Indexing Techniques
Department of Informatics, Aristotle University of Thessaloniki, Fall 2008
Outline: M-trees, Slim-trees, approximate search in M-trees
Tree-based index structures: R-tree, R*-tree, M-tree, Slim-tree
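Metric space
M = (D, d), where D is the domain of the objects and d is a distance function with 3 properties:
symmetry: d(Ox, Oy) = d(Oy, Ox)
positivity: d(Ox, Oy) > 0 (Ox ≠ Oy) and d(Ox, Ox) = 0
triangle inequality: d(Ox, Oy) ≤ d(Ox, Oz) + d(Oz, Oy)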
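The three properties can be checked numerically on a finite sample of objects. The sketch below is only an illustration: the helper name is_metric_on and the tolerance are assumptions, and passing the check on a sample does not prove that d is a metric on the whole domain D.

```python
from itertools import product
from math import dist, isclose


def is_metric_on(sample, d, tol=1e-9):
    """Check symmetry, positivity and the triangle inequality on a sample."""
    for x, y in product(sample, repeat=2):
        if not isclose(d(x, y), d(y, x), abs_tol=tol):
            return False                      # symmetry violated
        if x == y and d(x, y) > tol:
            return False                      # d(Ox, Ox) must be 0
        if x != y and d(x, y) <= 0:
            return False                      # distinct objects need d > 0
    for x, y, z in product(sample, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False                      # triangle inequality violated
    return True


def squared(x, y):
    """Squared Euclidean distance: symmetric and positive, but not a metric."""
    return dist(x, y) ** 2


euclid_ok = is_metric_on([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)], dist)
squared_ok = is_metric_on([(0.0,), (1.0,), (2.0,)], squared)
```

The squared Euclidean distance fails because d(0, 2) = 4 exceeds d(0, 1) + d(1, 2) = 2, which is why it cannot be used directly with metric trees.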
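Example: pruning with the triangle inequality
Assume 3 objects a, b, c and a query Q. Q is 2 units from a (best so far), and d(Q,b) = 7.81.
A lower bound on d(Q,c) can be derived without computing it:
d(Q,b) ≤ d(Q,c) + d(b,c)
d(Q,b) - d(b,c) ≤ d(Q,c)
7.81 - 2.30 ≤ d(Q,c)
5.51 ≤ d(Q,c)
So c is at least 5.51 units from Q. Since the best so far is 2, c can be discarded.
Precomputed distances between the objects:
     a     b     c
a          6.70  7.07
b                2.30
c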
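The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch using the numbers from the slide; the function name lower_bound is an assumption made for illustration.

```python
def lower_bound(d_q_b: float, d_b_c: float) -> float:
    """Lower bound on d(Q, c) implied by the triangle inequality."""
    return abs(d_q_b - d_b_c)


best_so_far = 2.0               # d(Q, a), the current nearest neighbor distance
bound_c = lower_bound(7.81, 2.30)

# c cannot beat the current best, so d(Q, c) never has to be computed.
can_discard_c = bound_c > best_so_far
```

The absolute value covers the symmetric case in which d(b,c) is larger than d(Q,b).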
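Range query: given a query object Q ∈ D and a maximum distance r(Q), range(Q, r(Q)) returns every object Oj with d(Oj, Q) ≤ r(Q).
k-nearest-neighbor query: given a query object Q ∈ D and an integer k ≥ 1, NN(Q, k) returns the k objects nearest to Q.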
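As a reference point, both query types can be answered by a sequential scan that computes d for every object. The sketch below assumes Euclidean points and hypothetical function names; an index such as the M-tree returns the same answers with fewer distance computations.

```python
import heapq
from math import dist


def range_query(objects, q, r_q, d=dist):
    """range(Q, r(Q)): all objects Oj with d(Oj, Q) <= r(Q)."""
    return [o for o in objects if d(o, q) <= r_q]


def nn_query(objects, q, k, d=dist):
    """NN(Q, k): the k objects nearest to Q, closest first."""
    return heapq.nsmallest(k, objects, key=lambda o: d(o, q))


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
in_range = range_query(points, (0.0, 0.0), 5.0)
nearest_two = nn_query(points, (0.0, 0.0), 2)
```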
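The distance function d is treated as a black box. Search cost is measured in CPU time (distance computations) and I/O time (page accesses).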
R- - *Euclidean L2M-tree
routing objects
*M-tree
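Leaf entry: [Oj, oid(Oj), d(Oj, P(Oj))]
Oj: feature value of the object
oid(Oj): object identifier
d(Oj, P(Oj)): distance of Oj from its parent
Routing entry: [Or, ptr(T(Or)), r(Or), d(Or, P(Or))]
Or: feature value of the routing object
ptr(T(Or)): pointer to the root of the covering tree T(Or)
r(Or): covering radius of Or
d(Or, P(Or)): distance of Or from its parent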
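Range search, range(Q, r(Q)):
If d(Or, Q) > r(Q) + r(Or), then for every object Oj in T(Or): d(Oj, Q) > r(Q), so the subtree T(Or) can be pruned.
If |d(Op, Q) - d(Or, Op)| > r(Q) + r(Or), where Op is the parent of Or, then d(Or, Q) > r(Q) + r(Or), so Or can be pruned without computing d(Or, Q). This saves up to 40% of the distance computations.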
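The two pruning tests can be sketched on a small hand-built tree. Everything here is an assumption made for illustration (the Entry and Node classes, the field names, the stats counter); leaf entries use radius 0, so the same tests apply at both levels.

```python
from dataclasses import dataclass, field
from math import dist
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Entry:
    obj: Point                       # Oj (leaf) or Or (routing object)
    d_parent: float                  # stored d(O, P(O))
    radius: float = 0.0              # r(Or); 0 for leaf entries
    child: Optional["Node"] = None   # ptr(T(Or)); None for leaf entries


@dataclass
class Node:
    entries: List[Entry] = field(default_factory=list)
    is_leaf: bool = True


def range_search(node, q, r_q, d=dist, d_parent_q=None, stats=None):
    """Return all objects within r_q of q, counting distance computations."""
    if stats is None:
        stats = {"computed": 0}
    hits = []
    for e in node.entries:
        slack = r_q + e.radius
        # Test 2: prune using only stored distances, d(O, Q) is not computed.
        if d_parent_q is not None and abs(d_parent_q - e.d_parent) > slack:
            continue
        d_eq = d(e.obj, q)
        stats["computed"] += 1
        # Test 1: prune when the query ball and the covering ball are disjoint.
        if d_eq > slack:
            continue
        if node.is_leaf:
            hits.append(e.obj)
        else:
            hits.extend(range_search(e.child, q, r_q, d, d_eq, stats))
    return hits


leaf_a = Node([Entry((0.0, 0.0), 0.0), Entry((1.0, 0.0), 1.0), Entry((0.0, 1.0), 1.0)])
leaf_b = Node([Entry((10.0, 10.0), 0.0), Entry((11.0, 10.0), 1.0)])
root = Node([Entry((0.0, 0.0), 0.0, 1.0, leaf_a), Entry((10.0, 10.0), 0.0, 1.0, leaf_b)], is_leaf=False)

stats = {"computed": 0}
hits = range_search(root, (0.1, 0.0), 0.5, stats=stats)
```

In this run the subtree around (10, 10) is discarded by the first test, and two of the three objects in the other leaf are discarded by the second test, so only 3 of the 7 possible distances are computed.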
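kNN search uses a branch-and-bound technique with 2 global structures: a priority queue PR of active subtrees and an array of k elements holding the current result.
Lower bound on the distance of any object in T(Or) from Q: dmin(T(Or)) = max{d(Or, Q) - r(Or), 0}
Subtrees are visited in increasing order of dmin and are pruned once dmin exceeds the distance of the current k-th nearest neighbor.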
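A best-first search driven by dmin can be sketched as follows. This is a simplified illustration, not the full algorithm: the class names and the tie-breaking counter are assumptions, and the optimizations based on stored parent distances and on an upper bound dmax are left out.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from math import dist, inf
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Entry:
    obj: Point
    radius: float = 0.0              # r(Or); 0 for leaf entries
    child: Optional["Node"] = None   # ptr(T(Or)); None for leaf entries


@dataclass
class Node:
    entries: List[Entry] = field(default_factory=list)
    is_leaf: bool = True


def knn_search(root, q, k, d=dist):
    """Return the k nearest objects as (distance, object), closest first."""
    tie = count()                        # keeps heap tuples comparable
    pr = [(0.0, next(tie), root)]        # PR: subtrees ordered by dmin
    nn = []                              # max-heap of (-distance, object)
    d_k = inf                            # distance of the current k-th neighbor
    while pr:
        d_min, _, node = heapq.heappop(pr)
        if d_min > d_k:
            break                        # no remaining subtree can improve the result
        for e in node.entries:
            d_eq = d(e.obj, q)
            if node.is_leaf:
                if len(nn) < k:
                    heapq.heappush(nn, (-d_eq, e.obj))
                elif d_eq < -nn[0][0]:
                    heapq.heapreplace(nn, (-d_eq, e.obj))
                if len(nn) == k:
                    d_k = -nn[0][0]
            else:
                lower = max(d_eq - e.radius, 0.0)   # dmin(T(Or))
                if lower <= d_k:
                    heapq.heappush(pr, (lower, next(tie), e.child))
    return sorted((-neg, o) for neg, o in nn)


leaf_a = Node([Entry((0.0, 0.0)), Entry((1.0, 0.0)), Entry((0.0, 1.0))])
leaf_b = Node([Entry((10.0, 10.0)), Entry((11.0, 10.0))])
root = Node([Entry((0.0, 0.0), 1.0, leaf_a), Entry((10.0, 10.0), 1.0, leaf_b)], is_leaf=False)

result = knn_search(root, (0.5, 0.0), 2)
```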
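Insertion: descend the tree, choosing at each level a routing object Or whose covering radius does not need to grow for the new object On, i.e. d(Or, On) ≤ r(Or). If several routing objects qualify, choose the one closest to On; if none qualifies, choose the one whose radius increases the least.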
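Split: Promotion and Partition
A split has two phases: promoting (choosing the two new routing objects) and partitioning (distributing the entries between the two nodes).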
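Promotion: policies for choosing the two promoted objects
m_RAD: among all pairs of objects, promote the pair with the minimum sum of covering radii r(Op1) + r(Op2) (after partitioning)
mM_RAD: like m_RAD, but minimizes the maximum of the two radii
M_LB_DIST: uses only the precomputed (stored) distances
RANDOM: chooses the 2 objects at random
SAMPLING: applies RANDOM to a sample of objects and keeps the best choice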
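The m_RAD policy can be sketched by exhaustively trying every pair. The code below is an illustration under stated assumptions: the function names are hypothetical, and the entries are partitioned with the generalized hyperplane strategy before the radii are measured.

```python
from itertools import combinations
from math import dist


def hyperplane_partition(objs, p1, p2, d=dist):
    """Assign every object to the closer of the two candidate routing objects."""
    n1, n2 = [], []
    for o in objs:
        (n1 if d(o, p1) <= d(o, p2) else n2).append(o)
    return n1, n2


def covering_radius(objs, p, d=dist):
    """Smallest radius around p that covers all objects of the subset."""
    return max((d(o, p) for o in objs), default=0.0)


def promote_m_rad(objs, d=dist):
    """Return (r(Op1) + r(Op2), Op1, Op2) for the pair with the minimum sum."""
    best = None
    for p1, p2 in combinations(objs, 2):
        n1, n2 = hyperplane_partition(objs, p1, p2, d)
        cost = covering_radius(n1, p1, d) + covering_radius(n2, p2, d)
        if best is None or cost < best[0]:
            best = (cost, p1, p2)
    return best


overflowing = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,)]
best_cost, op1, op2 = promote_m_rad(overflowing)
```

The exhaustive loop needs a quadratic number of partitions, which is why cheaper policies such as RANDOM and SAMPLING exist.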
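Partition
Given the two promoted routing objects Op1 and Op2, the entries of the set N are distributed into two subsets N1 and N2.
Generalized Hyperplane: assign each object Oj to the nearest routing object. If d(Oj, Op1) ≤ d(Oj, Op2), assign Oj to N1, otherwise to N2.
Balanced: compute d(Oj, Op1) and d(Oj, Op2) for all Oj in N. Repeat until N is empty: assign to N1 the nearest neighbor of Op1 in N and remove it from N; assign to N2 the nearest neighbor of Op2 in N and remove it from N.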
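The two partition strategies differ only in how the objects are handed out. A minimal sketch, assuming Euclidean points and hypothetical function names:

```python
from math import dist


def hyperplane_partition(objs, p1, p2, d=dist):
    """Generalized Hyperplane: each object goes to its nearest routing object."""
    n1, n2 = [], []
    for o in objs:
        (n1 if d(o, p1) <= d(o, p2) else n2).append(o)
    return n1, n2


def balanced_partition(objs, p1, p2, d=dist):
    """Balanced: alternately hand the nearest remaining object to Op1 and Op2."""
    remaining = list(objs)
    n1, n2 = [], []
    while remaining:
        nearest = min(remaining, key=lambda o: d(o, p1))
        remaining.remove(nearest)
        n1.append(nearest)
        if remaining:
            nearest = min(remaining, key=lambda o: d(o, p2))
            remaining.remove(nearest)
            n2.append(nearest)
    return n1, n2


entries = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
gh_n1, gh_n2 = hyperplane_partition(entries, (0.0,), (10.0,))
bal_n1, bal_n2 = balanced_partition(entries, (0.0,), (10.0,))
```

On this input the hyperplane strategy yields a 4/1 split with small radii, while the balanced strategy yields a 3/2 split in which the second node must also cover the distant object at 3.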
- . / CPU .*
M-tree summary: a paged, balanced tree that indexes objects through their features and a distance function.
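Slim-tree
A metric tree (like the M-tree) with the following novelties:
a split algorithm based on the minimum spanning tree (MST)
the Slim-down post-processing algorithm, which makes the tree tighter
measures of the overlap between nodes (fat-factor, bloat-factor)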
*Slim-tree
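Leaf entry: [Oid_i, D(Oi, Rep(Oi)), Oi]
Oid_i: identifier of object Oi
D(Oi, Rep(Oi)): distance between Oi and its representative Rep(Oi)
Oi: the object itself
Index entry: [Oi, Radius_i, D(Oi, Rep(Oi)), Ptr(TOi), NEntries(Ptr(TOi))]
Oi: representative object of the subtree
Radius_i: covering radius of the subtree
D(Oi, Rep(Oi)): distance between Oi and its representative Rep(Oi)
Ptr(TOi): pointer to the root of the subtree
NEntries(Ptr(TOi)): number of entries in the node pointed to by Ptr(TOi)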
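Slim-tree construction
ChooseSubtree policies:
random: choose one of the qualifying nodes at random
mindist: choose the node whose representative is closest to the new object
minoccup: choose the qualifying node with the minimum occupancy
Split policies:
random: choose the two new representatives at random
minMax: try every pair of objects as representatives and keep the pair that minimizes the maximum covering radius
MST: build the minimum spanning tree of the objects and remove its longest edge; the two resulting groups become the new nodes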
Fat-factor and Bloat-factor: measures of the overlap between the nodes of a metric tree. The fat-factor takes values from 0 (no overlap) to 1.
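The fat-factor can be computed from four counts. The sketch below assumes the absolute fat-factor as usually defined for the Slim-tree, fat(T) = (Ic - H*N) / N * 1 / (M - H), where Ic is the total number of node accesses needed to answer a point query for each of the N indexed objects, H is the height of the tree and M is its number of nodes. The formula is not on the slide, and the parameter names are assumptions.

```python
def fat_factor(total_accesses: int, height: int, n_objects: int, n_nodes: int) -> float:
    """Absolute fat-factor: 0 for an ideal tree, 1 when every query visits every node."""
    if n_nodes <= height:
        raise ValueError("the tree needs more nodes than levels")
    return (total_accesses - height * n_objects) / n_objects / (n_nodes - height)


# Ideal tree: each of the 8 point queries visits exactly one node per level.
ideal = fat_factor(total_accesses=2 * 8, height=2, n_objects=8, n_nodes=5)

# Worst case: each of the 8 point queries visits all 5 nodes.
worst = fat_factor(total_accesses=5 * 8, height=2, n_objects=8, n_nodes=5)
```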
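Slim-down algorithm (makes the tree tighter)
1. For each node i of a given level, find the object c that is farthest from the representative b of i.
2. Find a sibling node j of i that also covers c. If such a node j exists and is not full, remove c from i and insert it into j, then correct the radius of i.
3. Apply steps 1 and 2 to all nodes of the level. If an object moved during a full round, repeat steps 1 and 2.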
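The three steps can be sketched on a single level of leaf nodes. The LeafNode class, the capacity parameter and the max_rounds safety limit are assumptions made for illustration; the representative itself is never moved.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple

Point = Tuple[float, ...]


@dataclass
class LeafNode:
    rep: Point            # representative b of the node
    objs: List[Point]     # objects stored in the node (including rep)

    def radius(self, d=dist) -> float:
        return max((d(o, self.rep) for o in self.objs), default=0.0)


def slim_down(nodes, capacity, d=dist, max_rounds=100):
    """Move farthest objects into sibling nodes that already cover them."""
    for _ in range(max_rounds):
        moved = False
        for i in nodes:
            candidates = [o for o in i.objs if o != i.rep]
            if not candidates:
                continue
            c = max(candidates, key=lambda o: d(o, i.rep))     # step 1
            for j in nodes:                                    # step 2
                if j is i or len(j.objs) >= capacity:
                    continue
                if d(c, j.rep) <= j.radius(d):
                    i.objs.remove(c)       # radius of i shrinks automatically
                    j.objs.append(c)
                    moved = True
                    break
        if not moved:                                          # step 3
            break
    return nodes


node_i = LeafNode((0.0,), [(0.0,), (1.0,), (4.0,)])
node_j = LeafNode((5.0,), [(5.0,), (6.0,)])
radius_before = node_i.radius()
slim_down([node_i, node_j], capacity=4)
radius_after = node_i.radius()
```

Here the object at 4 moves from node_i to node_j, which already covers it, so the radius of node_i drops from 4 to 1 while the radius of node_j stays at 1.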
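Slim-down example: object c is moved from node i to node j in step 2, which reduces the radius of i.
Figure: (a) Slim-tree on the Sierpinsky dataset before Slim-down (bloat-factor = 0.03); (b) after Slim-down (bloat-factor = 0.01).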
Slim- - *Slim-tree
Slim-tree summary: new ChooseSubtree policies, the MST-based split, the Slim-down algorithm, and the fat-factor and bloat-factor overlap measures.
* Slim-tree
; exact / *
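Approximate k-NN search in M-trees (k = 10)
Datasets: CHV (10,000 vectors with 45 dimensions), UV (uniformly distributed vectors), CV (clustered vectors)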
*
Improvement in efficiency (IE)
Precision of approximation (P)
Relative distance error (ε)
= 0 / *
Approximation through relative distance errors* . .
Approximate search through distance distributions* . , ( =0,01).
Approximation through the slowdown of distance improvements* (precision) .
*Approximation through the slowdown of distance improvements .
precision. 10 100 100 10 . 3 . . .*
CV *
CHV *
UV
*
10*
: *
MS (metric spaces)
VS (vector spaces)
VSLp (vector spaces, Lp distance)
CS (changing space)
RC (reducing comparisons)
NG (no guarantees)
DG (deterministic guarantees)
PG (probabilistic guarantees)
PGpar (probabilistic guarantees, parametric)
PGnpar (probabilistic guarantees, non-parametric)
SA (static approach)
IA (interactive approach)