Accelerating access to data archives with the new version of pgSphere
Markus Nullmeier
Zentrum für Astronomie der Universität Heidelberg, Astronomisches Rechen-Institut
[email protected]

Dec 03, 2021

● About pgSphere

● New pgSphere features since 2014

● Extending pgSphere with sky coverage data types

About pgSphere

● pgSphere?

● PostgreSQL extension: new SQL data types, functions, indexes

● PostgreSQL: “The world's most advanced open source database”

● SQL data types: spherical points (RA, DEC), spherical lines, polygons, ellipses, paths, spherical transformations (rotations)

VO usage of pgSphere

[Figure labels: RA, DEC, SR, X-match]

pgSphere internals

Database indexes on spherical coordinates, used for example for:

● Cone search

● Cross-match

● Images (e.g., digitised astronomical plates)
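
To make the two query types above concrete, here is a minimal brute-force sketch in Python. It is not pgSphere code: the function names, the tuple layout `(name, ra, dec)`, and the haversine formula for the angular separation are assumptions chosen for illustration. An index exists precisely to avoid the full scans shown here.

```python
import math


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp against rounding so asin never sees a value slightly above 1.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(catalog, ra, dec, sr):
    """Return all rows (name, ra, dec) within search radius sr (degrees) of (ra, dec)."""
    return [row for row in catalog
            if angular_separation(row[1], row[2], ra, dec) <= sr]


def cross_match(cat_a, cat_b, sr):
    """Return name pairs from two catalogs closer than sr degrees (O(N*M) scan)."""
    return [(a[0], b[0]) for a in cat_a for b in cat_b
            if angular_separation(a[1], a[2], b[1], b[2]) <= sr]


# Hypothetical toy catalogs, invented for this example.
catalog_a = [("a", 10.0, 20.0), ("b", 10.5, 20.0), ("c", 200.0, -45.0)]
catalog_b = [("x", 10.0001, 20.0), ("y", 100.0, 0.0)]

cone_hits = [row[0] for row in cone_search(catalog_a, 10.0, 20.0, 1.0)]
matches = cross_match(catalog_a, catalog_b, 1.0 / 3600.0)  # 1 arcsecond
```

Both functions inspect every row, so the cross-match cost grows with the product of the catalog sizes; a spatial index lets the database skip most candidate pairs.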

pgSphere internals

[Figure: R-tree example with three levels. First level: R1, R2. Second level: R3 to R7. Third level: R8 to R19.]
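
The idea behind the R-tree figure can be sketched in a few lines of Python: every inner node stores the bounding rectangle of its children, so a search skips whole subtrees whose rectangle misses the query. This is an illustration only. It uses planar rectangles, a hand-built tree, and hypothetical names (`Node`, `search`); it does not reproduce pgSphere's key format or the tree in the figure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def intersects(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(rects: List[Rect]) -> Rect:
    """Smallest rectangle enclosing all given rectangles."""
    xmins, ymins, xmaxs, ymaxs = zip(*rects)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))


@dataclass
class Node:
    box: Rect
    label: str = ""
    children: List["Node"] = field(default_factory=list)


def leaf(label: str, box: Rect) -> Node:
    return Node(box=box, label=label)


def inner(children: List[Node]) -> Node:
    # An inner node's box is the bounding box of all its children.
    return Node(box=union([c.box for c in children]), children=children)


def search(node: Node, query: Rect, visited: List[str]) -> List[str]:
    """Collect labels of leaves intersecting query; record every node inspected."""
    visited.append(node.label or "inner")
    if not intersects(node.box, query):
        return []  # prune: nothing below this node can match
    if not node.children:
        return [node.label]
    hits: List[str] = []
    for child in node.children:
        hits.extend(search(child, query, visited))
    return hits


# Hand-built two-level tree over four hypothetical objects.
left = inner([leaf("a", (0, 0, 1, 1)), leaf("b", (2, 2, 3, 3))])
right = inner([leaf("c", (10, 10, 11, 11)), leaf("d", (12, 12, 13, 13))])
root = inner([left, right])

visited: List[str] = []
result = search(root, (2.5, 2.5, 4, 4), visited)
```

In this run the right subtree is rejected at its bounding box, so leaves "c" and "d" are never inspected.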

pgSphere development history

Janko Richter

Teodor Sigaev, Oleg Bartunov

Igor Chilingarian

pgSphere development nowadays

Dmitry Ivanov

Alexander Korotkov

Markus Nullmeier

Contributors: Pat Dowler, Serge Monkewitz

New pgSphere features since 2014

● Greatly improved R-tree indexing, 1 to 2 orders of magnitude faster: A. Korotkov, “A new double sorting-based node splitting algorithm for R-tree”, Programming and Computer Software 38(3), 2012, DOI: 10.1134/S0361768812030024

● All known open bugs fixed

● Addition of new-style SQL “contains” operators

● Improved numerical stability

● Custom PostgreSQL optimisation for spatial joins (= cross-match)

New R-tree indexing

[Publication of benchmarks planned for ADASS XXVI, Trieste 2016]

Extending pgSphere with sky coverage data types

MOC = Multi-Order Coverage (HEALPix Multi-Order Coverage map)

● Concise mapping of a catalog's coverage of the sphere

● Coverage made up from discrete elements

● Making MOC and sky maps a first-class SQL data type...

Go to the MOC tutorial tomorrow!
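
As a rough illustration of how a MOC's discrete elements can be handled numerically, the Python sketch below converts HEALPix cells, given as `(order, ipix)` pairs in NESTED numbering, into half-open ranges of cell numbers at a common maximum order, and merges overlapping or adjacent ranges. It relies on the NESTED property that cell `p` at order `k` has children `4p` to `4p+3` at order `k+1`. The function names are hypothetical and this is not pgSphere's internal representation.

```python
MAX_ORDER = 29  # deepest order allowed by the MOC standard


def cell_to_range(order, ipix, max_order=MAX_ORDER):
    """Half-open range [lo, hi) of max_order cell numbers covered by one NESTED cell."""
    if not 0 <= order <= max_order:
        raise ValueError("order out of range")
    if not 0 <= ipix < 12 * 4 ** order:
        raise ValueError("ipix out of range for this order")
    shift = 2 * (max_order - order)  # each deeper order multiplies the count by 4
    return (ipix << shift, (ipix + 1) << shift)


def moc_to_ranges(cells, max_order=MAX_ORDER):
    """Convert (order, ipix) cells to sorted, merged, half-open ranges."""
    ranges = sorted(cell_to_range(o, p, max_order) for o, p in cells)
    merged = []
    for lo, hi in ranges:
        if merged and lo <= merged[-1][1]:
            # Overlapping or adjacent: extend the previous range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


# Small example with max_order=2 so the numbers stay readable.
example = moc_to_ranges([(1, 10), (0, 0), (1, 4), (2, 20)], max_order=2)
```

With `max_order=2`, the order-0 cell 0 covers numbers 0 to 15, the order-1 cell 4 covers 16 to 19, and the order-2 cell 20 covers just 20, so the three merge into one range.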

WIP: sky coverage data types for pgSphere

MOC as an indexable SQL data type

● I/O to / from files

● Create one MOC from a table column or query

● Specify your own MOC and search over all catalogs of a data center:

SELECT name FROM catalogs WHERE my_moc <@ catalogs.moc;

Sky map data type: analogous to MOC
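
The containment test expressed by `my_moc <@ catalogs.moc` can be sketched outside the database as follows. The sketch assumes that each MOC is stored as a sorted list of merged (disjoint, non-adjacent) half-open ranges of HEALPix numbers; the names `is_contained` and `catalogs_containing` and the toy data are hypothetical, and the real data type is work in progress.

```python
def is_contained(inner, outer):
    """True if every range of `inner` lies inside a single range of `outer`.

    Both arguments are sorted lists of half-open (lo, hi) ranges that are
    disjoint and non-adjacent (assumption: already merged).
    """
    j = 0
    for lo, hi in inner:
        # Skip outer ranges that end before this inner range ends.
        while j < len(outer) and outer[j][1] < hi:
            j += 1
        if j == len(outer) or outer[j][0] > lo:
            return False
    return True


def catalogs_containing(my_moc, catalogs):
    """Names of all catalogs whose coverage contains my_moc (linear scan)."""
    return sorted(name for name, moc in catalogs.items()
                  if is_contained(my_moc, moc))


# Hypothetical coverage data, invented for this example.
catalogs = {
    "cat_a": [(0, 100)],
    "cat_b": [(0, 10), (50, 60)],
    "cat_c": [(200, 300)],
}
my_moc = [(2, 8), (52, 55)]
names = catalogs_containing(my_moc, catalogs)
```

This scan touches every catalog, which is the cost an index on the MOC column is meant to remove.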

MOC: indexing

● R-trees will not work for MOCs representing catalogs

● PostgreSQL custom indexing will be in Release 9.6: https://github.com/postgrespro/rum

● Core of new index structure:

Ranges of numbers of HEALPix elements    Sets of MOC IDs
range0                                   { id7, id11 }
range1                                   { id2, id108, id109 }
range2                                   { id108, id732, id11030 }
...                                      ...
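
The mapping from ranges to sets of MOC IDs can be illustrated with a small in-memory inverted index in Python. This is only a sketch of the idea in the table, not the planned PostgreSQL index: the slides do not say how ranges are chosen or queried, so cutting the number line at every range boundary, the function names, and the containment query are all assumptions.

```python
import bisect


def build_index(mocs):
    """Build (bounds, id_sets) from {moc_id: [(lo, hi), ...]} with half-open ranges.

    id_sets[i] holds the IDs of all MOCs covering [bounds[i], bounds[i + 1]).
    """
    bounds = sorted({b for ranges in mocs.values() for r in ranges for b in r})
    id_sets = [set() for _ in range(max(len(bounds) - 1, 0))]
    for moc_id, ranges in mocs.items():
        for lo, hi in ranges:
            start = bisect.bisect_left(bounds, lo)
            stop = bisect.bisect_left(bounds, hi)
            for i in range(start, stop):
                id_sets[i].add(moc_id)
    return bounds, id_sets


def ids_covering(index, pixel):
    """IDs of all MOCs that cover a single HEALPix number."""
    bounds, id_sets = index
    i = bisect.bisect_right(bounds, pixel) - 1
    if i < 0 or i >= len(id_sets):
        return set()
    return id_sets[i]


def ids_containing(index, query):
    """IDs of all MOCs containing every (lo, hi) range of a non-empty query."""
    bounds, id_sets = index
    result = None
    for lo, hi in query:
        if not bounds or lo < bounds[0] or hi > bounds[-1]:
            return set()  # query reaches outside everything indexed
        start = bisect.bisect_right(bounds, lo) - 1
        stop = bisect.bisect_left(bounds, hi)
        for i in range(start, stop):
            result = set(id_sets[i]) if result is None else result & id_sets[i]
    return result if result is not None else set()


# Hypothetical MOCs, reusing ID names from the table for readability.
index = build_index({
    "id7": [(0, 10)],
    "id11": [(0, 10), (20, 30)],
    "id2": [(10, 20)],
})
```

A lookup then needs only a binary search over the boundaries plus set intersections, instead of a scan over every stored MOC.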

Your involvement

● Download, use, test, and join the community at the pgSphere home page: http://pgsphere.github.io

● Send in bug reports

● Send in test cases

● Send in patches

● Send in feature requests :-)