Database Shootout: Benchmarking spatial DBMSs
Wim de Haas, Wilko Quak
FOSS4G2007, 26SEP2007

Page 1: Database Shootout: Benchmarking spatial DBMSs

26SEP2007

FOSS4G2007

Database Shootout: Benchmarking spatial DBMSs

Wim de Haas, Wilko Quak

Page 2: Database Shootout: Benchmarking spatial DBMSs

Delft University of Technology

You’re in the eye of the storm!

Early morning: Brock Anderson about WMS/PostGIS/Shapefile performance

This afternoon: Kevin Neufeld about tips for the PostGIS power user

Now: Reflect on the factor 10 and the framework for testing

Page 3: Database Shootout: Benchmarking spatial DBMSs

Overview

• Introduction
• What are the problems?
• A classification of Spatial DBMS users
• How can we help them
• Benchmark proposal
• First test results
• Next steps

Page 4: Database Shootout: Benchmarking spatial DBMSs

Introducing the Ministry of Transport, Public Works and Water Management

Our core tasks are:
• to offer protection against floods
• to guarantee safe and reliable connections over land, water and through the air
• to ensure clean and sufficient water

• Rijkswaterstaat (RWS) is the executive branch of the Ministry of Transport

Page 5: Database Shootout: Benchmarking spatial DBMSs

Business drivers

• How to keep track of all these assets?
• How to ensure consistency & coherence in operations and change of Rijkswaterstaat?
• How to facilitate decision-making and communication
• Enter the Digitaal Topografisch Bestand (DTB)
  – 3D
  – 1:1000
  – EUR 60M

Page 6: Database Shootout: Benchmarking spatial DBMSs

DTB waterway and highway

Page 7: Database Shootout: Benchmarking spatial DBMSs

DTB bird’s-eye view

Page 8: Database Shootout: Benchmarking spatial DBMSs

DTB Amsterdam Airport

Page 9: Database Shootout: Benchmarking spatial DBMSs

Enter IVRI

• The new system for data acquisition and maintenance for the DTB

• Oracle 10g
• ArcGIS 9.2
• Summit Evolution
• Very complex project

Page 10: Database Shootout: Benchmarking spatial DBMSs

First comment on Murphy’s Law

• Murphy was an optimist
• Oracle and ESRI were pushed to the limits
• Took extra time in the project
• Triggered us to be less dependent on Oracle
• Oracle Spatial is not cheap, so can we use PostGIS as the main datastore for Spatial data?

Page 11: Database Shootout: Benchmarking spatial DBMSs

Why bother …

Stonebraker2007:

• Where to find dramatic differences in Spatial DBMSs?

We define “dramatically outperform” to mean at least a factor 10 advantage […then] customers will be inclined to try the new architecture

Page 12: Database Shootout: Benchmarking spatial DBMSs

Where to expect Dramatic differences?

• Operating System (No)
• MySQL Spatial Extension vs PostGIS (Yes)
• Choice of FileSystem (Maybe)
• Functionality Difference (Yes)
• Choice of Parameters (Maybe)

Page 13: Database Shootout: Benchmarking spatial DBMSs

Problems with testing

• DBMS vendors do not want published results
  – Oracle explicitly forbids publishing benchmark results
• Hardware
  – Moore’s Law
  – I/O
• Release Frequency of Software
• Spatial testing cannot be done on synthetic data
• Too many parameters

Benchmark results are outdated before they are published

Page 14: Database Shootout: Benchmarking spatial DBMSs

Benchmark consideration: Weird Cases department

[Figure labels: diagonal, query, geometry; flat, query, geometry]

Page 15: Database Shootout: Benchmarking spatial DBMSs

Benchmark consideration: Hot vs Cold

Page 16: Database Shootout: Benchmarking spatial DBMSs

Solution

• Do not publish the result of the benchmark

• Publish a framework that lets people do their own benchmarking

• No “One size fits all”: Buyer’s guide
• Help different users to find the best DBMS

Page 17: Database Shootout: Benchmarking spatial DBMSs

Classification of spatial DBMS users

Four classes:
1. Server Builders: publish spatial data via web services
2. GIS User: Load various datasets and perform complex analyses
3. Data Maintainer: Maintain one core dataset
4. Power Users: All of the above and more

Page 18: Database Shootout: Benchmarking spatial DBMSs

Class 1: Web Server Builders

• You do not really need a DBMS for this (You use a fraction of DBMS functionality)

– This may be oversimplified, but is used here for the purpose of clarity

• Only one query counts: Find everything within BBOX
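
The "find everything within BBOX" query can be sketched as follows. This is a minimal illustration, not the script used for the talk: the table name `roads`, the column name `geom`, and the SRID 28992 are assumptions made for the example. The PostGIS form relies on the `&&` bounding-box overlap operator and `ST_GeomFromText`; the MySQL form uses `MBRIntersects` and `GeomFromText` as they existed in the MySQL 5.0 era (later MySQL releases renamed these functions).

```python
def bbox_wkt(xmin, ymin, xmax, ymax):
    """Return the bounding box as a closed WKT polygon ring."""
    if xmin > xmax or ymin > ymax:
        raise ValueError("bounding box corners are out of order")
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)]
    ring = ", ".join(f"{x} {y}" for x, y in corners)
    return f"POLYGON(({ring}))"


def bbox_query(dbms, table, geom_column, box, srid=28992):
    """Build the SQL text of a bounding-box select for one of two DBMSs.

    `table`, `geom_column` and the default `srid` are hypothetical names
    chosen for this sketch; they are pasted into the SQL unescaped, so
    they must come from trusted configuration, never from user input.
    """
    wkt = bbox_wkt(*box)
    if dbms == "postgis":
        # && compares bounding boxes and can use the spatial index.
        return (f"SELECT * FROM {table} "
                f"WHERE {geom_column} && ST_GeomFromText('{wkt}', {srid})")
    if dbms == "mysql":
        # MBRIntersects compares minimum bounding rectangles.
        return (f"SELECT * FROM {table} "
                f"WHERE MBRIntersects({geom_column}, GeomFromText('{wkt}'))")
    raise ValueError(f"unsupported DBMS: {dbms}")


if __name__ == "__main__":
    print(bbox_query("postgis", "roads", "geom", (0, 0, 10, 5)))
    print(bbox_query("mysql", "roads", "geom", (0, 0, 10, 5)))
```

Both forms compare bounding rectangles only, which is all a map server needs before it hands the features to the renderer.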

Page 19: Database Shootout: Benchmarking spatial DBMSs

Class 2: GIS users

• Main interest is functionality
• Spend more time on loading data
• Need a good query optimiser
• Analysis

Page 20: Database Shootout: Benchmarking spatial DBMSs

Class 3: Dataset Maintainers

• Limited number of queries
• Transactions are an issue
• Clustering of data after updates is interesting
• More time to tweak
  – And after all, there are a lot of buttons to push

Page 21: Database Shootout: Benchmarking spatial DBMSs

Class 4: Power users

• Do their own testing
• Need a platform to discuss their findings

Page 22: Database Shootout: Benchmarking spatial DBMSs

Benchmark components

1. Functionality test
  • Literature review
  • Factual testing
2. Very simple performance test script with few parameters
  • BBOX Query
  • Fixed Dataset (Proposal: OpenStreetMap dataset)
3. Configurable test suite
  • Full Suite that tests every corner of DBMS
  • For specialists only

Page 23: Database Shootout: Benchmarking spatial DBMSs

Configuration

• HW
  – Compaq DL380
• OS
  – Linux RH
• SW
  – MySQL 5.0
  – PostgreSQL 8.2.4
  – PostGIS 1.3.1
• Dataset is National Road Map

Page 24: Database Shootout: Benchmarking spatial DBMSs

Test 1 – Functionality: MySQL vs PostGIS

MySQL
• Almost all operations in MySQL return the same result as the corresponding MBR-based functions
  – However, MySQL is making an effort to comply with full OGC support

PostGIS
• Full OGC support

Functionality of MySQL is only suited for simple WMS support; no spatial operations are done on the geometry itself
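
Why an MBR-based answer can differ from an answer computed on the geometry itself is easiest to see on a small case. The sketch below is plain Python and uses no DBMS at all; the triangle and the probe point are made up for the illustration. An MBR (minimum bounding rectangle) test accepts every point inside the rectangle around the shape, while an exact test checks the shape's actual outline.

```python
def mbr(points):
    """Minimum bounding rectangle of a point list as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def mbr_contains(polygon, point):
    """Containment test on the bounding rectangle only."""
    xmin, ymin, xmax, ymax = mbr(polygon)
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax


def polygon_contains(polygon, point):
    """Exact containment by ray casting (even-odd rule).

    Points lying exactly on the boundary are not treated specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


if __name__ == "__main__":
    triangle = [(0, 0), (10, 0), (0, 10)]  # hypothetical geometry
    probe = (8, 8)                         # inside the MBR, outside the triangle
    print("MBR test:  ", mbr_contains(triangle, probe))
    print("Exact test:", polygon_contains(triangle, probe))
```

For a map server that only fetches features for a viewport the two answers are interchangeable; for spatial analysis they are not.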

Page 25: Database Shootout: Benchmarking spatial DBMSs

Test 2: simple BBOX select

Write a simple script that generates a lot of rectangle queries.

Parameters:
• DBMS size
• query box size
• experiment length
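
A script of this kind could look roughly like the sketch below. It is an assumption-laden outline, not the authors' script: the function names, the uniform placement of query boxes, and the `run_query` callback (which would wrap the actual database call) are all invented for the example. Query box size and experiment length are parameters here; DBMS size is varied outside the script by loading more data.

```python
import random
import time


def random_boxes(extent, box_size, seed=0):
    """Yield an endless stream of square query boxes inside `extent`.

    Boxes are placed uniformly at random, which is an assumption of this
    sketch; a fixed seed makes a run repeatable.
    """
    xmin, ymin, xmax, ymax = extent
    if box_size > xmax - xmin or box_size > ymax - ymin:
        raise ValueError("query box does not fit inside the extent")
    rng = random.Random(seed)
    while True:
        x = rng.uniform(xmin, xmax - box_size)
        y = rng.uniform(ymin, ymax - box_size)
        yield (x, y, x + box_size, y + box_size)


def run_experiment(run_query, extent, box_size, duration_s, seed=0):
    """Issue box queries for `duration_s` seconds; return per-query timings.

    `run_query` is a caller-supplied function that takes one box
    (xmin, ymin, xmax, ymax) and executes the query against the DBMS.
    """
    timings = []
    deadline = time.perf_counter() + duration_s
    for box in random_boxes(extent, box_size, seed):
        if time.perf_counter() >= deadline:
            break
        start = time.perf_counter()
        run_query(box)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in for a real database call, so the sketch runs on its own.
    def fake_query(box):
        time.sleep(0.001)

    result = run_experiment(fake_query, (0, 0, 1000, 1000), 50, duration_s=0.1)
    print(len(result), "queries, mean", sum(result) / len(result), "s")
```

Keeping the database call behind a callback means the same loop can drive MySQL and PostGIS without changes to the timing logic.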

Page 26: Database Shootout: Benchmarking spatial DBMSs

Test 2: grow DBMS size

• Question: Does query response time depend on DBMS size or on core memory?

• Experiment: Run the same test on more than one copy of the same database

Page 27: Database Shootout: Benchmarking spatial DBMSs

Test 2 – result: PostGIS vs MySQL

[Chart: PostGIS vs MySQL. Vertical axis 0 to 0.06, horizontal axis 0 to 3,500,000; axis titles are not preserved in the transcript.]

Page 28: Database Shootout: Benchmarking spatial DBMSs

Test 2 – result: Conclusions

• As long as the dataset fits in core memory, differences are small
• MySQL can do more with less memory
• MySQL degrades faster if you run out of memory
• Out of the box installation is bad PR for PostGIS
  – Maybe because MySQL leaves caching of disk blocks to the OS, while PostgreSQL handles it differently

Page 29: Database Shootout: Benchmarking spatial DBMSs

Test 3: Comprehensive Test Suite

• Create set of killer polygons so that every line of source code will be touched by running operations

• Test Query optimizer
• Test Join Operator
  – Must be done with Skewed Data
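
Whether a given dataset is skewed enough for a join test can be checked before loading it. The sketch below is one possible measure, not something taken from the talk: it counts points per grid cell and compares the busiest cell with the average. The function names and the max-to-mean ratio are assumptions chosen for the example.

```python
from collections import Counter


def grid_counts(points, extent, cells_per_side):
    """Count points per cell of a square grid laid over `extent`."""
    xmin, ymin, xmax, ymax = extent
    if cells_per_side < 1 or xmax <= xmin or ymax <= ymin:
        raise ValueError("invalid grid or extent")
    width = (xmax - xmin) / cells_per_side
    height = (ymax - ymin) / cells_per_side
    counts = Counter()
    for x, y in points:
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # points outside the extent are ignored
        # min() keeps points on the upper edge in the last cell.
        col = min(int((x - xmin) / width), cells_per_side - 1)
        row = min(int((y - ymin) / height), cells_per_side - 1)
        counts[(col, row)] += 1
    return counts


def skew_ratio(counts, cells_per_side):
    """Busiest cell count divided by the mean count over all cells.

    A value of 1.0 means perfectly even; larger values mean more skew.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no points inside the extent")
    mean = total / (cells_per_side ** 2)
    return max(counts.values()) / mean


if __name__ == "__main__":
    even = [(25, 25), (75, 25), (25, 75), (75, 75)]
    clustered = [(10, 10), (10, 10), (10, 10), (90, 90)]
    extent = (0, 0, 100, 100)
    print(skew_ratio(grid_counts(even, extent, 2), 2))
    print(skew_ratio(grid_counts(clustered, extent, 2), 2))
```

Real road or topographic data is typically far from even, which fits the earlier point that spatial testing cannot be done on synthetic data.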

Page 30: Database Shootout: Benchmarking spatial DBMSs

Conclusions (overall)

• This is a work in progress
  – We still miss polygons and spatial queries
• The factor 10 is not within reach, yet
  – No dramatic differences

Page 31: Database Shootout: Benchmarking spatial DBMSs

How to proceed

• Finish the work and publish this
  – Timeline OCT-NOV2007
• TU Delft wiki or osgeo.org wiki?
• Start a Special Interest Group a.k.a. Committee?

Page 32: Database Shootout: Benchmarking spatial DBMSs

Questions

[email protected]@tudelft.nl