Pan-STARRS Seminar IfA 2012.09.21 Pan-STARRS Seminar 5 Parallel Computing Concepts and the Pan-STARRS Image Processing Pipeline
Page 1:

Pan-STARRS Seminar 5

Parallel Computing Concepts and the Pan-STARRS Image Processing Pipeline

Page 2:

IPP in PS1

[Diagram: data flow between the IPP and the other PS1 subsystems.]

● Components: Telescope, Camera, OTIS, IPP, PSPS, MOPS Science Client, Preferred Science Clients, PS1 Community, Solar System Community
● Flow labels: photons; commands; raw images; static sky images; metadata; metadata, detections; filtered detections & metadata; orbits, identifications; queries
● Legend: PS Subsystem; External System; pixel data; meta & object data
● Preferred Sci Client box: DVO; cmf / smf files; distribution system; postage stamp server

Page 3:

Analysis Strategies

[Figure: warp & stack. Images are combined (+) into a cleaned stacked image; a reference image (static sky or other warp image) is subtracted (-) to give a difference image.]

Panel titles: The Static Sky Image Combine & Image Difference

Page 4:

IPP Flowchart (simplified)

[Flowchart: stage names as labelled on the slide.]

● image registration: summitcopy, register
● single-image science analysis: chip, camera, fake, warp
● image combinations: stack, diff, magic
● detrend creation: process, stack, normalize, residuals, reject, detrend image
● DVO: photometry calibration, astrometry calibration, data release
● destreak, distribution, publish
● recipients: PS1SC clients, PSPS

Page 5:

IPP Architecture

[Diagram: IPP components and the data passed between them.]

● Components: Camera, OTIS, Image Server (Nebulous), Metadata DB (ippdb), Object DB (DVO), IPP Scheduler (pantasks), IPP Controller (pcontrol), Analysis Tasks, publishing process, PSPS, Client Science Pipelines
● Flow labels: images; static sky images; metadata; metadata, Q/A; detections; filtered detections; commands
● Legend: IPP Process; IPP Data; External System; pixel data; meta & object data; commands & messages
● A dividing line marks the observing operations / processing boundary

Page 6:

A short digression on parallel processing

● The scale of the problem for IPP:
  ● Full 3pi Survey = 300,000 exposures x 60 chips = 18M images
  ● or: 300,000 exposures x 1.4 Gpix = 840 Terabytes (raw)
● Goal: full reprocess in ~6 months
● Single-threaded thought experiment:
  ● Some minimal analysis might take ~1 minute per chip
  ● 1 core would take 34 years for 3pi!
● In reality, I want to do more work than that (full IPP processing requires a total of ~900 sec per chip per core)
● Need to parallelize to make the problem tractable
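The arithmetic behind these numbers can be checked with a few lines of Python. This is only a back-of-envelope sketch: the 2 bytes per pixel is an assumption (it is the value that reproduces the quoted 840 TB), and the core count assumes perfect scaling with no I/O limits.

```python
# Survey numbers quoted on the slide.
EXPOSURES = 300_000
CHIPS_PER_EXPOSURE = 60
PIXELS_PER_EXPOSURE = 1.4e9
BYTES_PER_PIXEL = 2  # assumption: 16-bit raw pixels
SECONDS_PER_YEAR = 365.25 * 24 * 3600

chip_images = EXPOSURES * CHIPS_PER_EXPOSURE
raw_terabytes = EXPOSURES * PIXELS_PER_EXPOSURE * BYTES_PER_PIXEL / 1e12


def core_years(seconds_per_chip):
    """Time for a single core to process every chip image once."""
    return chip_images * seconds_per_chip / SECONDS_PER_YEAR


def cores_needed(seconds_per_chip, months):
    """Cores required to finish within the given wall-clock time."""
    wall_seconds = months / 12 * SECONDS_PER_YEAR
    return chip_images * seconds_per_chip / wall_seconds


print(chip_images, raw_terabytes)
print(core_years(60), core_years(900))
print(cores_needed(900, 6))
```

At 60 s per chip this gives roughly 34 core-years; at 900 s per chip it is over 500 core-years, so a 6-month reprocessing needs on the order of a thousand cores.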

Page 7:

basic parallel processing

● a simple concept...

[Figure: jobs 1-4 run one after another as serial jobs on one computer / core; the same four jobs run side by side as parallel jobs on N computers = N x as fast.]

some jargon:
● job : some specific thing to be done
● thread : a job or part of a job running in serial
● lock : a tool to allow one thread to block other threads
● message : information passed between threads or jobs
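A minimal Python sketch of the same idea: four independent jobs run first in series, then through a pool of workers. The function `run_job` and its workload are placeholders, not IPP code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_id):
    """Stand-in for one independent job (hypothetical workload)."""
    return job_id, sum(i * i for i in range(10_000))


job_ids = [1, 2, 3, 4]

# Serial: one job after another on one core.
serial_results = [run_job(j) for j in job_ids]

# Parallel: the same jobs handed to a pool of 4 workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_results = list(pool.map(run_job, job_ids))
```

Both runs return the same results in the same order. Note that in standard CPython, threads do not speed up CPU-bound work like this; the "N x as fast" case needs separate processes or separate machines.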

Page 8:

some caveats...

● job dependency and sequencing

[Figure: job 1 runs first; jobs 2a, 2b and 2c then run in parallel; job 3 runs last.]

if some jobs depend on results of other jobs
● we need to manage the sequence
● total speed-up is < N (here the run takes 3/5, not 1/5, of the serial time)

(Amdahl's law, 1967)
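The 3/5 figure follows from Amdahl's law. The sketch below assumes the five jobs (1, 2a, 2b, 2c, 3) all take the same time, which the slide implies but does not state.

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Amdahl's law: speed-up when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


# Five equal jobs, of which three (2a, 2b, 2c) can run side by side.
speedup = amdahl_speedup(parallel_fraction=3 / 5, n_workers=3)
time_fraction = 1.0 / speedup
print(speedup, time_fraction)
```

The run time drops to about 0.6 of the serial time, a speed-up of about 1.67 rather than 3 or 5.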

Page 9:

some caveats...

● what is the limiting resource?

[Figure: jobs 1-4 drawn in parallel and again in sequence.]

if the limiting resource is not computation, there may be no gain from more computers.

Page 10:

some caveats...

● locks

[Figure: jobs 1-3 sharing a common resource, drawn twice.]

if some jobs need to modify a common resource, they need to set a lock to avoid conflicts.

careful: locks block processing and can kill your throughput!

fine-grained locking allows higher throughput (but can be harder to program)
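A small Python sketch of a lock protecting a shared resource. The shared counter is a made-up stand-in for any common resource; three threads update it, and the lock ensures that only one update happens at a time.

```python
import threading

shared = {"count": 0}            # the common resource
count_lock = threading.Lock()


def add_many(n):
    for _ in range(n):
        with count_lock:         # other threads block here until released
            shared["count"] += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["count"])
```

Holding the lock only around the single update, rather than around the whole loop, is a small example of keeping the locked region fine-grained.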

Page 11:

other caveats

● beware of deadlocks...

[Figure: job 1 and job 2 each need object A and object B. One case is labelled "deadlock", the other "safe implementation".]
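One common way to avoid this kind of deadlock is to make every thread acquire the locks in the same order. The sketch below shows that pattern; it is not necessarily the scheme drawn on the slide, and the objects and amounts are invented for illustration.

```python
import threading

lock_a = threading.Lock()        # guards object A
lock_b = threading.Lock()        # guards object B
values = {"A": 100, "B": 100}


def transfer(src, dst, amount):
    # Every thread takes lock_a first, then lock_b, whatever the direction.
    # If job 2 took lock_b first, each job could end up holding one lock
    # while waiting forever for the other.
    with lock_a:
        with lock_b:
            values[src] -= amount
            values[dst] += amount


def job1():
    for _ in range(1000):
        transfer("A", "B", 1)


def job2():
    for _ in range(1000):
        transfer("B", "A", 1)


workers = [threading.Thread(target=job1), threading.Thread(target=job2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Both jobs finish and the two values end where they started.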

Page 12:

computers vs cpus vs cores

● 'computer' : probably a single motherboard, I/O via ethernet
● 'cpu' : probably a single chunk of silicon, I/O via motherboard
● 'core' : subdivision of a cpu, possible core-to-core I/O
● multi-core cpus common since ~2005

(image credit: ixbtlabs.com)

Page 13:

parallel vs multi-threaded

● parallel : spread work across machines
  ● data exchange via network (eg, ethernet)
● multi-threaded : spread work across cores
  ● dual- or multi-processor computers share memory / disk

(image credit: ixbtlabs.com)

Page 14:

parallel processing strategies

● fine-grained : lots of coordination and synchronization needed
  ● use multi-threaded programming
● coarse-grained : parallel operations on larger chunks, easier to code
  ● multi-thread or cluster computing?
● embarrassingly parallel : many equivalent, large-scale tasks
  ● use a cluster...

Page 15:

some parallel processing technologies

● running parallel tasks:
  ● PVM (Parallel Virtual Machine)
  ● Condor
  ● pcontrol (IPP integrated tool)
  ● eg, ssh machine1 job1; ssh machine2 job2
● parallel program with message passing
  ● MPI (Message Passing Interface)
    ● standard for libraries
    ● communication between cluster nodes
● multi-threaded programming : pthreads
  ● standard UNIX / Linux library
  ● provides locks and thread message functions

Page 16:

Parallel Processing in the IPP

● pantasks + pcontrol : high-level (embarrassing) parallelism
  ● pantasks : task management (beyond scope)
● data and jobs are pre-assigned and co-located
  ● eg, chip XY03 -> ipp021
    ● all processing on XY03 -> ipp021
  ● or, skycell.1204.02 -> ipp053
    ● all processing on skycell.1204.02 -> ipp053
● targeted machines are 'desired' but not usually 'required'

[Figure: cluster nodes 004-010 connected through a "Big Switch".]
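The "desired but not required" rule can be sketched as a lookup with a fallback. This is a hypothetical illustration only: the function `choose_host`, the table and the fallback policy are invented here and do not describe how pantasks or pcontrol actually pick hosts.

```python
# Pre-assignment of data units to hosts (examples taken from the slide).
PREFERRED_HOST = {
    "XY03": "ipp021",
    "skycell.1204.02": "ipp053",
}


def choose_host(data_unit, available_hosts):
    """Return the pre-assigned host if it is available, else any other host."""
    preferred = PREFERRED_HOST.get(data_unit)
    if preferred in available_hosts:
        return preferred         # job runs where its data already lives
    # Fallback (assumed policy): the data must then be read over the network.
    return sorted(available_hosts)[0]


print(choose_host("XY03", {"ipp021", "ipp053"}))
print(choose_host("XY03", {"ipp053", "ipp004"}))
```

The first call returns the co-located host; the second falls back to another node because ipp021 is not available.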

Page 17:

Parallel Processing in the IPP

● most programs use pthreads for multithreading
  ● eg, parallel analysis of object moments
  ● eg, parallel fitting of star and/or galaxy models
● some programming care is needed to avoid collisions

Page 18:

Multithreaded programs : avoiding collisions

● threaded analysis of models : cannot process the same pixels in 2 threads (because we add in and subtract the models)
● how to lock?

Page 19:

Multithreaded programs : avoiding collisions

● lay down a virtual chessboard (does not need to be 8x8)
● do the analysis in 4 passes
  ● 1: red cells to threads
  ● 2: blue cells to threads
  ● 3: yellow cells to threads
  ● 4: green cells to threads
● limited by slowest thread
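The four-pass scheme can be sketched in Python as below. Two things are assumptions, not taken from the slide: that the four colours repeat in a 2x2 pattern (so cells of one colour never touch), and that a source's model pixels never reach beyond the cells adjacent to its own. The function `analyse_cell` is a placeholder for the real model fitting.

```python
from concurrent.futures import ThreadPoolExecutor

N_ROWS, N_COLS = 6, 6            # board size is arbitrary (need not be 8x8)
COLOURS = ["red", "blue", "yellow", "green"]


def colour_of(row, col):
    """Four colours in a repeating 2x2 pattern (assumed layout)."""
    return COLOURS[2 * (row % 2) + (col % 2)]


def analyse_cell(cell):
    """Placeholder for fitting and subtracting models in one cell."""
    return cell


processed = []
with ThreadPoolExecutor(max_workers=4) as pool:
    for colour in COLOURS:       # 4 passes, one per colour
        cells = [(r, c) for r in range(N_ROWS) for c in range(N_COLS)
                 if colour_of(r, c) == colour]
        # extend() consumes every result, so the next pass starts only after
        # all cells of this colour are done: each pass waits for its slowest thread.
        processed.extend(pool.map(analyse_cell, cells))
```

Within a pass, any two cells being processed are separated by at least one full cell, so no lock on the pixel data is needed; the cost is the wait at the end of each pass.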

Page 20:

GPU programming

● GPUs are like multiple cores taken to an extreme...
● Advantages
  ● many simultaneous operations (1000s)
  ● massive aggregate floating-point-op/sec
  ● relatively cheap
● Disadvantages
  ● limited language support
  ● more rigid programming model
  ● heterogeneous hardware / incompatibilities
● FFTW has easy GPU library support

Page 21:

Parallel data

● The Pan-STARRS data volume is huge
  ● already > 800 TB of raw data (compressed, 2 copies)
  ● output data volume potentially huge (10x - 20x raw volume)
● Storage mandates a distributed solution
  ● Current largest single machines ~120 TB
  ● PS1 currently has 3.5 PB of total storage
● Data management strategies are critical
  ● Keep track of ~1 billion files on the cluster
  ● RAIDs are fallible: keep duplicate copies for safety
  ● name abstraction is needed (easy to move real files)
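Name abstraction means that users refer to a file by a logical name while the system tracks where the physical copies live. The class below is a hypothetical illustration of that idea only; it is not the Nebulous API, and all names and paths are made up.

```python
class NameIndex:
    """Maps a logical file name to the physical paths of its copies."""

    def __init__(self):
        self._copies = {}

    def register(self, logical_name, physical_path):
        self._copies.setdefault(logical_name, []).append(physical_path)

    def move(self, logical_name, old_path, new_path):
        """Relocate one copy; users of the logical name are unaffected."""
        paths = self._copies[logical_name]
        paths[paths.index(old_path)] = new_path

    def locate(self, logical_name):
        return list(self._copies[logical_name])


index = NameIndex()
# Two copies on different (made-up) hosts, for safety against a RAID failure.
index.register("example/XY03.fits", "/hostA/vol0/XY03.fits")
index.register("example/XY03.fits", "/hostB/vol1/XY03.fits")
# Moving a copy changes only the index entry, not the name users see.
index.move("example/XY03.fits", "/hostA/vol0/XY03.fits", "/hostC/vol2/XY03.fits")
print(index.locate("example/XY03.fits"))
```

Because programs only ever ask for the logical name, files can be rebalanced across machines without touching the code that reads them.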

Page 22:

PS1 / IPP Data Products

● FITS Images
  ● Chip vs Warp vs Diff vs Stack
  ● Access via Postage Stamp Server
  ● User Interface is still being improved...
● FITS Tables
  ● Chip vs Camera vs Diff vs Stack Photometry
  ● Detections (properties of things in an image)
● DVO Database(s)
  ● simple (simplistic), organized access to detections & objects
  ● feeds to PSPS
  ● requires some hefty hardware (currently, ~10 TB)

http://svn.pan-starrs.ifa.hawaii.edu/trac/ipp/browser/trunk/psModules/src/objects/pmSourceIO_CMF.txt

Page 23:

IPP Output Source Tables ('CMF' or 'SMF' File)

● FITS Table format
  ● multiple tables per chip
  ● PHU carries global data for the exposure
  ● One table group for each image chip
● Header for each chip
  ● derived from original chip header
  ● NAXIS = 0 : no pixel data
  ● carries metadata, astrometric & photometric transformation
● PSF table : PSF fits for all objects
● XSRC table : Aperture-like measurements (Petrosian, etc)
● XFIT table : Extended source fit measurements
● Segments are identified by EXTNAME = CHIP.psf, xsrc, etc
● Multiple data schemas available:
  ● PS1_V3 (most complete single image)
  ● PS1_DV2 (most complete diff image)
  ● PS1_SV1 (most complete 'static sky' images)

Page 24:

Detections in IPP and PSPS

[Diagram: smf files from exp 1, exp 2, exp 3 and exp 4, plus the cmf file from the stack, feed DVO (detection to object association) and PSPS.]

note: PSPS has complete smf data (DVO only has PSF & Kron data)

Page 25:

IPP Output Source Tables : PSF Table Contents

● Format version defined by header keyword EXTNAME
● Parameters based on PSF fits:
  ● detection ID
  ● X, Y, error
  ● Instrumental Mag, flux, error, aperture mag (+ raw ap mag)
  ● Kron parameters
  ● Peak flux as Mag
  ● sky, error
  ● fit chi-square
  ● CR, EXT Nsigma deviation
  ● PSF shape (major, minor, theta) & moments (+ high order)
  ● psf weight factor (Sum(psf * (1 - mask))) (+ 'suspect' version)
  ● nFrames (for stack & diff)
  ● 32 bit analysis flags (+ flags2)
  ● additional special fields for diff and static sky
● RA, DEC, Calibrated Mags also available
● Table of matched reference stars (photometry & astrometry cal.)

Page 26:

IPP Output Source Tables : XFIT Table Contents

● Format version defined by header keyword EXTNAME
● One row for each object and model
● Only a subset of objects in PSF table
● Parameters based on extended model fits:
  ● detection ID (matched to PSF table)
  ● X, Y, error
  ● Instrumental Mag, error
  ● Model and Nparams
  ● ellipse shape (major, minor, theta)
  ● additional parameters (model-dependent)
  ● full covariance matrix
  ● fit chi-square
● Export version of these files with RA, DEC, Calibrated Mags

Page 27:

IPP Output Source Tables : XSRC Table Contents

● Format version defined by header keyword EXTNAME
● one row for each object, subset of PSF sources
● Parameters:
  ● detection ID
  ● X, Y from PSF fit
  ● Petrosian Radius, Flux, errors
  ● Elliptical Surface Brightness profile
  ● note: not all parameters are measured for all objects
● Export version of these files with RA, DEC, Calibrated Mags

Page 28:

Descriptions of SMF / CMF fields

● IPP_IDET : IPP detection identifier index
● X_PSF : PSF x coordinate
● Y_PSF : PSF y coordinate
● X_PSF_SIG : Sigma in PSF x coordinate
● Y_PSF_SIG : Sigma in PSF y coordinate
● RA_PSF : PSF RA coordinate (degrees)
● DEC_PSF : PSF DEC coordinate (degrees)
● POSANGLE : position angle at source (degrees)
● PLTSCALE : plate scale at source (arcsec/pixel)
● FLAGS : psphot analysis flags
● FLAGS2 : psphot analysis flags
● N_FRAMES : Number of frames overlapping source center
● PADDING : padding

Page 29:

Descriptions of SMF / CMF fields

● PSF_INST_MAG : PSF fit instrumental magnitude
● PSF_INST_MAG_SIG : Sigma of PSF instrumental magnitude
● PSF_INST_FLUX : PSF fit instrumental flux (counts)
● PSF_INST_FLUX_SIG : Sigma of PSF instrumental flux
● AP_MAG : magnitude in standard aperture
● AP_MAG_RAW : magnitude in reported aperture
● AP_MAG_RADIUS : radius used for aperture mags
● PEAK_FLUX_AS_MAG : Peak flux expressed as magnitude
● CAL_PSF_MAG : PSF Magnitude using supplied calibration
● CAL_PSF_MAG_SIG : measured scatter of zero point calibration
● SKY : Sky level
● SKY_SIGMA : Sigma of sky level

Page 30:

Descriptions of SMF / CMF fields

● PSF_CHISQ : Chisq of PSF-fit
● CR_NSIGMA : Nsigma deviations from PSF to CR
● EXT_NSIGMA : Nsigma deviations from PSF to EXT
● PSF_MAJOR : PSF width (major axis)
● PSF_MINOR : PSF width (minor axis)
● PSF_THETA : PSF orientation angle
● PSF_QF : PSF coverage/quality factor (bad)
● PSF_QF_PERFECT : PSF coverage/quality factor (poor)
● PSF_NDOF : degrees of freedom
● PSF_NPIX : number of pixels in fit

Page 31:

Descriptions of SMF / CMF fields

● MOMENTS_XX : second moments (X^2)
● MOMENTS_XY : second moments (X*Y)
● MOMENTS_YY : second moments (Y*Y)
● MOMENTS_M3C : third moment cos theta
● MOMENTS_M3S : third moment sin theta
● MOMENTS_M4C : fourth moment cos theta
● MOMENTS_M4S : fourth moment sin theta
● MOMENTS_R1 : first radial moment
● MOMENTS_RH : half radial moment
● KRON_FLUX : Kron Flux (in 2.5 R1)
● KRON_FLUX_ERR : Kron Flux Error
● KRON_FLUX_INNER : Kron Flux (in 2.5 R1)
● KRON_FLUX_OUTER : Kron Flux (in 2.5 R1)

Page 32:

Descriptions of SMF / CMF fields (diff image version)

● DIFF_NPOS : nPos (n pix > 3 sigma)
● DIFF_FRATIO : fPos / (fPos + fNeg)
● DIFF_NRATIO_BAD : nPos / (nPos + nNeg)
● DIFF_NRATIO_MASK : nPos / (nPos + nMask)
● DIFF_NRATIO_ALL : nPos / (nPos + nMask + nNeg)
● DIFF_R_P : distance to positive match source
● DIFF_SN_P : signal-to-noise of pos match src
● DIFF_R_M : distance to negative match source
● DIFF_SN_M : signal-to-noise of neg match src
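The ratio fields are simple functions of pixel counts and fluxes. The sketch below evaluates them exactly as written on the slide. It assumes that fNeg is given as a positive number (the absolute value of the negative flux) and that the denominators are non-zero; neither point is stated in the source, and the input numbers are invented.

```python
def diff_ratios(n_pos, n_neg, n_mask, f_pos, f_neg):
    """Diff-image ratios; inputs are pixel counts and summed fluxes."""
    return {
        "DIFF_NPOS": n_pos,
        "DIFF_FRATIO": f_pos / (f_pos + f_neg),
        "DIFF_NRATIO_BAD": n_pos / (n_pos + n_neg),
        "DIFF_NRATIO_MASK": n_pos / (n_pos + n_mask),
        "DIFF_NRATIO_ALL": n_pos / (n_pos + n_mask + n_neg),
    }


# Invented inputs: a purely positive detection and a half-positive, half-negative one.
clean = diff_ratios(n_pos=40, n_neg=0, n_mask=0, f_pos=900.0, f_neg=0.0)
dipole = diff_ratios(n_pos=20, n_neg=20, n_mask=0, f_pos=500.0, f_neg=500.0)
print(clean)
print(dipole)
```

For the purely positive case every ratio is 1.0. When half of the pixels and flux are negative, DIFF_FRATIO, DIFF_NRATIO_BAD and DIFF_NRATIO_ALL fall to 0.5, which is presumably how these fields help separate real sources from subtraction residuals.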

Page 33:

Descriptions of SMF / CMF fields (Static Sky version)

● APER_FLUX : flux within annuli
● APER_FLUX_ERR : flux error in annuli
● APER_FILL : fill factor of annuli

Page 34:

DVO Database

● mini dvo databases are updated nightly
● rsync of DVO databases is possible
  ● nightly: download mini db and run dvomerge yourself
  ● monthly (or longer): download master dvo db
● total DVO data volume at end of survey ~30 TB? (probably less)
● DVO dbs split by survey

http://svn.pan-starrs.ifa.hawaii.edu/trac/ipp/wiki/DVO_TopLevel

Page 35:

DVO : What is it?

● DVO = Desktop Virtual Observatory
● Database to track astronomy outputs
  ● detections + objects
  ● image parameters (astrometry & photometry)
  ● photometry zero points
● Used by IPP for quality assurance & calibration (astrom + photom)
● High-throughput detection + object correlations
● Note: DVO is not a fully-featured relational database
● But: may be interesting to end users, complements PSPS
● DVO databases (or subsets) can be copied and used locally
● DVO has a built-in visualization language

Page 36:

DVO : Data Tables

● Data stored as FITS Tables
● Tables are autocode-defined
  ● Versioning is simple
  ● Current Schema = PS1_V4

[Diagram: DVO tables.]

● Group headings: Main Observational Data; Object Data distributed on sky; Other Data
● Tables: Images; SkyRegions; Photcodes; Transparency; Cameras; Filters; Static Objects (Average); Static Objects (SecFilt); Detections (Measure); Non-Detections (Missing)

Page 37:

DVO : data organization

● some tables are partitioned by sky region
● sky regions are RA,DEC bounded
● sky regions completely defined by definition table
● hierarchical table grouping (eg: fullsky, dec bands, ra segments...)
● table associated with a host (using an abstracted name)

[Diagram: a skyregions table above a hierarchy of images, objects and detections tables, shown again as numbered partitions (objects 1-4, images 1-2).]

Page 38:

DVO : Sky Partitioning

● Sky regions bounded by lines of constant RA, DEC
● partitioning increases input / output speed for most queries
● region size scales with stellar density, size is adjustable
● any subset of the region files may be copied elsewhere (local DVO)
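Because regions are bounded by lines of constant RA and DEC, finding the region for a position reduces to two one-dimensional lookups. The sketch below is hypothetical: the band edges and segment counts are invented for illustration and are not the real DVO region table.

```python
import bisect

# Invented region grid: Dec band edges and the number of RA segments per band.
DEC_EDGES = [-90.0, -30.0, 0.0, 30.0, 90.0]
RA_SEGMENTS = [4, 8, 8, 4]       # a denser area would get more, smaller segments


def region_for(ra_deg, dec_deg):
    """Return (band, segment) indices for a sky position."""
    band = min(bisect.bisect_right(DEC_EDGES, dec_deg) - 1, len(RA_SEGMENTS) - 1)
    width = 360.0 / RA_SEGMENTS[band]
    segment = min(int((ra_deg % 360.0) // width), RA_SEGMENTS[band] - 1)
    return band, segment


print(region_for(130.0, 19.3))
```

A query that covers a small patch of sky only needs to open the files for the regions it touches, which is where the input / output gain comes from.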

Page 39:

DVO Sky Partitioning vs Projection Cells / Sky Cells

● Projection Cells:
  ● Overlapping tangent planes
  ● Define images of the sky
● DVO Sky Partition
  ● bounded by RA,DEC
  ● Define db catalogs

[Figure: The RINGS.V3 Tessellation]

Page 40:

DVO Sky Partitioning vs the Three Pi Survey Tessellations

● Three Pi Tessellation defines telescope pointings for the 3pi Survey.
● 6 related tessellations for successive epochs : 0.5-1 deg rotations
● 5466 fields for 3pi Survey

[Figure legend: blue = 3π survey tessellation; green = ecliptic; black = galactic]

Page 41:

DVO shell : visualization

● rich data language for data visualization
● SQL-like queries
  – avextract ra, dec, i:ave, 2MASS_J where (i:rel - 2MASS_J < 2.0)
  – mextract ra, dec, time, i:ave, i:rel

Example plot commands:

    region 0.0 25.0 90.0 sin
    style -c red; cgrid
    style -c black; images
    plot-landolt
    plot-sdss

Example plot commands:

    region -25.5 -12.8 6.0
    style -c blue; pcat -all
    style -c black; images

Page 42:

DVO shell : Praesepe Example

    region 130.0 19.3 1.6
    images; pcat -c red -lw 2
    pmeasure -all -m 8 12 -pt 7 -c blue -photcode 2MASS_J

Page 43:

DVO shell : Praesepe Example

[Figure: proper-motion plot, uRA vs uDEC, both in milli-arcsec / year; axis ticks at -50.0 and +50.0.]

Page 44:

Personal DVO

● 3pi Survey at end of mission:
  – 5x10^9 objects = 800 GB
  – 10^11 detections = 12 TB
  – 90% of detections / objects at |b| < 10 degrees
  – outside of Galactic Plane : 50 MB / square degree
  – inside Galactic Plane : 1.5 GB / square degree
● End users may have a local working copy of a region of interest
  – carry your fields on your laptop!
● Transition to PSPS
  – PSPS will carry IPP DVO object IDs
  – DVO shell will be able to query PSPS if desired
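These densities make it easy to estimate the size of a local copy. The sketch below uses only the figures quoted on the slide; the helper name is made up, and 1 GB is taken as 1000 MB.

```python
MB_PER_SQ_DEG_OUTSIDE_PLANE = 50.0     # from the slide
MB_PER_SQ_DEG_INSIDE_PLANE = 1500.0    # 1.5 GB, from the slide


def local_dvo_size_gb(area_sq_deg, in_galactic_plane=False):
    """Rough size of a local DVO copy covering the given area."""
    if in_galactic_plane:
        per_sq_deg = MB_PER_SQ_DEG_INSIDE_PLANE
    else:
        per_sq_deg = MB_PER_SQ_DEG_OUTSIDE_PLANE
    return area_sq_deg * per_sq_deg / 1000.0


# Implied storage per entry, from the end-of-mission totals.
bytes_per_object = 800e9 / 5e9
bytes_per_detection = 12e12 / 1e11

print(local_dvo_size_gb(100))
print(local_dvo_size_gb(100, in_galactic_plane=True))
print(bytes_per_object, bytes_per_detection)
```

A 100 square degree field away from the Galactic Plane comes to about 5 GB, which fits on a laptop; the same area inside the plane is about 150 GB.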

Page 45:

Photcodes

● Defines the photometric system of a magnitude
● Three classes
  ● 'average' photcodes (SEC)
  ● 'measure' photcodes (DEP)
  ● 'reference' photcodes (REF)
● a photcode defines:
  ● numerical code (for db)
  ● name (eg, g_PS1, GPC1.r.XY33, or 2MASS_J)
  ● type (SEC, DEP, REF)
  ● equivalent photcode (transformation target)
  ● photometry transformation coefficients:
    $M_{target} = M_{source} + ZP + K_z (\sec z - 1.0) + \sum_i A_{c,i}\,\mathrm{color}^i$
  ● systematic error, flags
● 'average' photcodes have ZP ~ 0.0
● 'measure' photcodes have ZP of telescope + camera + filter
● 'reference' photcodes have ZP == 0.0
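The transformation can be written as a short function. This sketch assumes the colour term is a polynomial in a single colour with powers starting at 1; the slide's notation is ambiguous on that point. The function name, argument names and example values are all made up.

```python
def transform_magnitude(m_source, zero_point, k_z, sec_z, colour, colour_coeffs):
    """M_target = M_source + ZP + K_z (sec z - 1) + sum_i A_i * colour**i."""
    # Assumed form of the colour term: A_1 * colour + A_2 * colour**2 + ...
    colour_term = sum(a * colour ** i for i, a in enumerate(colour_coeffs, start=1))
    return m_source + zero_point + k_z * (sec_z - 1.0) + colour_term


# Invented example: instrumental magnitude -10.0, ZP 25.0, observed at zenith.
at_zenith = transform_magnitude(-10.0, 25.0, 0.15, 1.0, 0.0, [0.02])
with_colour = transform_magnitude(-10.0, 25.0, 0.15, 1.0, 0.5, [0.02])
print(at_zenith, with_colour)
```

At zenith (sec z = 1) with zero colour, only the zero point contributes; the airmass and colour terms add small corrections on top of that.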

Page 46:

DVO queries and photcodes

● mextract, avextract can return magnitudes:
● interpretation is somewhat context dependent:
  ● mextract mag -- all 'measure' magnitudes
  ● mextract g -- all g-equivalent magnitudes or NaN
  ● mextract GPC1.g.XY11 -- all g-equivalent magnitudes or NaN
  ● mextract mag:inst -- 'measure' mags as instrumental
  ● mextract g:inst -- g-equiv as instrumental
  ● mextract g:sys, g:cat, g:rel -- other magnitude versions
  ● mextract g:err -- error on 'measure' mags
  ● mextract g:ave, g:inst -- join 'average' to 'measure'
  ● avextract g, r, i -- 'average' magnitudes
  ● avextract g, 2MASS_J -- limited join to 'measure' (first match)
  ● avextract g, g:err, g:chisq -- magnitude, error on ave, chi-square
  ● avextract g:ncode -- number of measurements in photcode
  ● avextract g:nphot -- number in photcode used for photom.