SimDB and SimTAP Dealing with a complex data model Gerard Lemson, Nara, 2010-12-10
Page 1

SimDB and SimTAP

Dealing with a complex data model

Gerard Lemson, Nara, 2010-12-10

Page 2

SimDB and SimDAL

Protocols to support

• describing simulations
– Simulation Data Model: Model for N-body 3+1D any simulations http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/specification/uml/SimDB_DM.png
• publishing simulations
– Simulation Database (SimDB): protocol for accessing a database built according to SimDM
• finding simulations
– SimDB/TAP
– queryData in SimDAL
– SimTAP
• retrieving simulation data, whole, in parts, manipulated
– SimDAL getData services (not in this talk)
• Btw: “simulation” can be
– simulation run
– simulation result
– simulation data
– post-processing of simulation results

Page 3

SimDB/REST

• “simple” access to SimDB
• Uses XML representation of model
– XML schema http://code.google.com/p/volute/source/browse/#svn/trunk/projects/theory/snapdm/specification/xsd
• Examples http://code.google.com/p/volute/source/browse/#svn/trunk/projects/theory/snapdm/specification/examples
– PDR http://code.google.com/p/volute/source/browse/#svn/trunk/projects/theory/snapdm/specification/examples/external/PDR
– Gadget2 http://volute.googlecode.com/svn-history/r1382/trunk/projects/theory/snapdm/specification/examples/external/Gadget2/Gadget2.xml
– TODO more (SVO)
• VO-URP
– validator http://www.g-vo.org/SimDB-browser/Validate.do
– upload http://www.g-vo.org/SimDB-browser
– download http://www.g-vo.org/SimDB-browser

Page 4

SimDB/TAP

• Model complex
– Too(?) complex for trivial (parameter based) query language
– Need special navigation tools (vo-urp@gavo)
– Need powerful query language
• Implement TAP on database built according to SimDM
• Map UML to RDB model
– TAP_SCHEMA for SimDM (vo-urp@gavo old) http://code.google.com/p/volute/source/browse/#svn/trunk/projects/theory/snapdm/specification/tap
– create table + inserts
– VODataService
• VO-URP SQL query http://www.g-vo.org/SimDB-browser/Query.do
• Not always easy!
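The "create table + inserts" step for TAP_SCHEMA could look roughly like the following sketch, written with Python's standard sqlite3 module. Only a subset of the TAP_SCHEMA.tables and TAP_SCHEMA.columns metadata columns is used, and the schema name `simdb`, the table description and the column list are assumptions made for illustration, not the content of the actual vo-urp scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no schemas, so a second in-memory database is attached under the
# name tap_schema; its tables can then be addressed as tap_schema.<table>.
conn.execute("ATTACH DATABASE ':memory:' AS tap_schema")
conn.executescript("""
    -- Subset of the TAP_SCHEMA.tables and TAP_SCHEMA.columns metadata tables.
    CREATE TABLE tap_schema.tables (
        schema_name TEXT, table_name TEXT, table_type TEXT, description TEXT
    );
    CREATE TABLE tap_schema.columns (
        table_name TEXT, column_name TEXT, datatype TEXT, unit TEXT,
        description TEXT
    );
""")

# Hypothetical metadata for one table derived from SimDM.
conn.execute(
    "INSERT INTO tap_schema.tables VALUES (?, ?, ?, ?)",
    ("simdb", "simdb.ParameterSetting", "table",
     "Value assigned to an input parameter (assumed description)"),
)
conn.executemany(
    "INSERT INTO tap_schema.columns VALUES (?, ?, ?, ?, ?)",
    [
        ("simdb.ParameterSetting", "containerId", "BIGINT", None, "owning experiment"),
        ("simdb.ParameterSetting", "parameterId", "BIGINT", None, "input parameter"),
        ("simdb.ParameterSetting", "numericalValue_value", "DOUBLE", None, "value"),
        ("simdb.ParameterSetting", "numericalValue_unit", "VARCHAR", None, "unit"),
    ],
)

def describe(table_name):
    """Return (column_name, datatype) pairs, as a client reading the metadata would."""
    return conn.execute(
        "SELECT column_name, datatype FROM tap_schema.columns "
        "WHERE table_name = ? ORDER BY rowid",
        (table_name,),
    ).fetchall()

print(describe("simdb.ParameterSetting"))
```

A client can then discover the relational mapping of the model by querying these metadata tables before writing queries against the data tables.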

Page 5

Model complex

• Normalised (see image)
• General, abstract
– e.g. parameters must be fully defined, no assumptions
• Hard to deal with quantities with a priori unknown units
– ParameterSetting table has value AND unit attributes (Quantity datatype)
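To see why separate value and unit attributes make queries harder, consider the sketch below (Python with the standard sqlite3 module). The simplified ParameterSetting layout with a `label` column, the `box_size` parameter, the sample rows and the `UnitFactor` helper table are all assumptions introduced for illustration; SimDM itself does not prescribe such a conversion table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified ParameterSetting: the Quantity datatype is
    -- flattened into a value column and a unit column.
    CREATE TABLE ParameterSetting (
        containerId INTEGER,
        label TEXT,
        numericalValue_value REAL,
        numericalValue_unit TEXT
    );
    INSERT INTO ParameterSetting VALUES
        (1, 'box_size', 100.0,   'Mpc'),
        (2, 'box_size', 50000.0, 'kpc'),
        (3, 'box_size', 500.0,   'Mpc');

    -- Hypothetical helper table: factor converting each unit to Mpc.
    CREATE TABLE UnitFactor (unit TEXT PRIMARY KEY, to_mpc REAL);
    INSERT INTO UnitFactor VALUES ('Mpc', 1.0), ('kpc', 0.001);
""")

def experiments_with_box_size_below(limit_mpc):
    """Compare values only after converting them to a common unit (Mpc)."""
    rows = conn.execute(
        """
        SELECT ps.containerId
        FROM ParameterSetting ps
        JOIN UnitFactor u ON u.unit = ps.numericalValue_unit
        WHERE ps.label = 'box_size'
          AND ps.numericalValue_value * u.to_mpc < ?
        ORDER BY ps.containerId
        """,
        (limit_mpc,),
    ).fetchall()
    return [r[0] for r in rows]

# A naive comparison on the raw value ignores the unit and misses experiment 2.
naive = [r[0] for r in conn.execute(
    "SELECT containerId FROM ParameterSetting "
    "WHERE label = 'box_size' AND numericalValue_value < 200 "
    "ORDER BY containerId")]

print(naive, experiments_with_box_size_below(200.0))
```

The unit-aware query needs an extra join for every quantity it constrains, which adds to the join count that the normalised model already requires.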

Page 6

Example queries

• Find synthetic spectra of white dwarf stars
• Find cosmological simulations with Ω=0.9, ΩΛ=0.7 and Ωb=0.02
• Find all SPH simulations containing a galaxy cluster with mass around 10^14 Msun

Page 7

select e.*
from experiment e
   , targetObject t
   , result r
   , product p
where t.label = 'white_dwarf'
  and t.containerid = e.id
  and r.containerid = e.id
  and r.targetId = t.id
  and p.containerid = r.id
  and p.productType = 'spectrum'

Page 8

Example queries

• Find synthetic spectra of white dwarf stars
• Find (cosmological) simulations with Ω=0.9, ΩΛ=0.7 and Ωb=0.02
• Find all SPH simulations containing a galaxy cluster with mass around 10^14 Msun

Page 9

select e.*
from Experiment e
   , InputParameter ip1 , ParameterSetting ps1
   , InputParameter ip2 , ParameterSetting ps2
   , InputParameter ip3 , ParameterSetting ps3
where ps1.containerId = e.id
  and ps1.parameterId = ip1.id
  and ip1.label = 'omega_lambda'
  and ps1.numericalValue_value = 0.7
  and ps2.containerId = e.id
  and ps2.parameterId = ip2.id
  and ip2.label = 'omega_baryon'
  and ps2.numericalValue_value = 0.02
  and ps3.containerId = e.id
  and ps3.parameterId = ip3.id
  and ip3.label = 'omega'
  and ps3.numericalValue_value = 0.9

Page 10

Example queries

• Find synthetic spectra of white dwarf stars
• Find (cosmological) simulations with Ω=0.9, ΩΛ=0.7 and Ωb=0.02
• Find all SPH simulations containing a galaxy cluster with mass around 10^14 Msun

Page 11

select e.*
from Experiment e
   , ExperimentRepresentationObject ero
   , RepresentationObjectType rot
   , TargetObject tgt
   , Property p
   , StatisticalSummary s
where ero.containerId = e.id
  and ero.typeId = rot.id
  and rot.label = 'sph.particle'
  and tgt.containerId = e.id
  and tgt.label = 'galaxy.cluster'
  and p.containerId = tgt.id
  and p.label = 'mass'
  and s.propertyId = p.id
  and s.statistic = 'value'
  and s.numericalValue_value = 1e14
  and s.numericalValue_unit = 'M_sun'

Page 12

An example from Paris. Find typical values of mass, x, y, z properties in a given simulation result

SELECT r.id as id, r.publisherdid as publisherdid
     , s0.numericValue_value as mass
     , s1.numericValue_value as x
     , s2.numericValue_value as y
     , s3.numericValue_value as z
FROM result r , product o
   , statisticalsummary s0 , property p0
   , statisticalsummary s1 , property p1
   , statisticalsummary s2 , property p2
   , statisticalsummary s3 , property p3
WHERE r.containerid = 6
  AND o.containerid = r.id
  and s0.containerid = o.id
  and s1.containerid = o.id
  and s2.containerid = o.id
  and s3.containerid = o.id
  and p0.publisherdid = 'mass' and s0.propertyid = p0.id and s0.statistic = 'nominal'
  and p1.publisherdid = 'x' and s1.propertyid = p1.id and s1.statistic = 'nominal'
  and p2.publisherdid = 'y' and s2.propertyid = p2.id and s2.statistic = 'nominal'
  and p3.publisherdid = 'z' and s3.propertyid = p3.id and s3.statistic = 'nominal'

Page 13

SELECT r.id as id, r.publisherdid
     , max(case when p.publisherdid = 'mass' and s.statistic = 'nominal' then s.numericValue_value else null end) as mass
     , max(case when p.publisherdid = 'x' and s.statistic = 'nominal' then s.numericValue_value else null end) as x
     , max(case when p.publisherdid = 'y' and s.statistic = 'nominal' then s.numericValue_value else null end) as y
     , max(case when p.publisherdid = 'z' and s.statistic = 'nominal' then s.numericValue_value else null end) as z
FROM result r , product o , statisticalsummary s , property p
WHERE r.containerid = 6
  AND o.containerid = r.id
  and s.containerid = o.id
  and p.id = s.propertyid
group by r.id, r.publisherdid, o.id

Page 14

Conclusions

• Some queries can be phrased nicely
• Others can be written in standard SQL, but the level of normalisation and abstraction means MANY joins are required
• Can we simplify this a bit?

Page 15

zoom

Page 16

ParameterSetting

containerId  value  unit  parameterId
...          ...    ...   ...
123          0.02         456
123          0.7          457
123          0.9          458
345          .04          456
345          .7           457
345          1            458
...          ...    ...   ...

+

InputParameter

id   name     label         datatype  description
456  omega_b  omega.baryon  real      ...
457  omega_l  omega.lambda  real      ...
458  omega    omega         real      ...
...  ...      ...           ...       ...

simtap.Experiment

id   omega_b  omega_l  omega  ...
123  0.02     0.7      0.9
345  0.04     0.7      1
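The pivot from ParameterSetting plus InputParameter to one row per experiment can be sketched as follows, using Python's standard sqlite3 module and the rows shown on this slide. The column types and the use of an in-memory SQLite database are assumptions for illustration; a real SimDB would run an equivalent query on its own RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InputParameter (
        id INTEGER PRIMARY KEY, name TEXT, label TEXT, datatype TEXT
    );
    CREATE TABLE ParameterSetting (
        containerId INTEGER, value REAL, unit TEXT, parameterId INTEGER
    );
""")
conn.executemany(
    "INSERT INTO InputParameter VALUES (?, ?, ?, ?)",
    [
        (456, "omega_b", "omega.baryon", "real"),
        (457, "omega_l", "omega.lambda", "real"),
        (458, "omega", "omega", "real"),
    ],
)
conn.executemany(
    "INSERT INTO ParameterSetting VALUES (?, ?, ?, ?)",
    [
        (123, 0.02, None, 456), (123, 0.7, None, 457), (123, 0.9, None, 458),
        (345, 0.04, None, 456), (345, 0.7, None, 457), (345, 1.0, None, 458),
    ],
)

# One output column per input parameter; MAX() skips the NULLs that the CASE
# expression yields for rows belonging to other parameters.
PIVOT_SQL = """
    SELECT ps.containerId AS id,
           MAX(CASE WHEN ip.name = 'omega_b' THEN ps.value END) AS omega_b,
           MAX(CASE WHEN ip.name = 'omega_l' THEN ps.value END) AS omega_l,
           MAX(CASE WHEN ip.name = 'omega'   THEN ps.value END) AS omega
    FROM ParameterSetting ps
    JOIN InputParameter ip ON ip.id = ps.parameterId
    GROUP BY ps.containerId
    ORDER BY ps.containerId
"""
rows = conn.execute(PIVOT_SQL).fetchall()
for row in rows:
    print(row)
```

Each experiment ends up as a single row with one column per parameter, which is the layout shown for simtap.Experiment.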

Page 17

SimTAP

• When Protocol is fixed, TAP schema can be simplified
– parameters columns in simtap.Experiment table
– property characterisation columns in product specific characterisation table(s)
– ...

Page 18

Instead of this

select e.*
from Experiment e
   , InputParameter ip1 , ParameterSetting ps1
   , InputParameter ip2 , ParameterSetting ps2
   , InputParameter ip3 , ParameterSetting ps3
where ps1.containerId = e.id
  and ps1.parameterId = ip1.id
  and ip1.label = 'omega_lambda'
  and ps1.numericalValue_value = 0.7
  and ps2.containerId = e.id
  and ps2.parameterId = ip2.id
  and ip2.label = 'omega_baryon'
  and ps2.numericalValue_value = 0.02
  and ps3.containerId = e.id
  and ps3.parameterId = ip3.id
  and ip3.label = 'omega'
  and ps3.numericalValue_value = 0.9

Page 19

this

select e.*
from simtap.Experiment e
where omegaLambda = 0.7
  and omegaBaryon = 0.02
  and omega = 0.9
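As a rough check that the short query selects the same experiments as the many-join form, the sketch below materialises a pivoted simtap.Experiment table in SQLite (Python standard library) and runs both queries. The sample rows, the reduced table layouts and the use of ATTACH to emulate a `simtap` schema are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no schemas; an attached in-memory database stands in for "simtap".
conn.execute("ATTACH DATABASE ':memory:' AS simtap")
conn.executescript("""
    CREATE TABLE Experiment (id INTEGER PRIMARY KEY);
    CREATE TABLE InputParameter (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE ParameterSetting (
        containerId INTEGER, parameterId INTEGER, numericalValue_value REAL
    );
""")
conn.executemany("INSERT INTO Experiment VALUES (?)", [(123,), (345,)])
conn.executemany(
    "INSERT INTO InputParameter VALUES (?, ?)",
    [(456, "omega_baryon"), (457, "omega_lambda"), (458, "omega")],
)
conn.executemany(
    "INSERT INTO ParameterSetting VALUES (?, ?, ?)",
    [(123, 456, 0.02), (123, 457, 0.7), (123, 458, 0.9),
     (345, 456, 0.04), (345, 457, 0.7), (345, 458, 1.0)],
)

# Pivot the normalised tables into one row per experiment.
conn.execute("""
    CREATE TABLE simtap.Experiment AS
    SELECT e.id AS id,
           MAX(CASE WHEN ip.label = 'omega_lambda' THEN ps.numericalValue_value END) AS omegaLambda,
           MAX(CASE WHEN ip.label = 'omega_baryon' THEN ps.numericalValue_value END) AS omegaBaryon,
           MAX(CASE WHEN ip.label = 'omega'        THEN ps.numericalValue_value END) AS omega
    FROM main.Experiment e
    JOIN main.ParameterSetting ps ON ps.containerId = e.id
    JOIN main.InputParameter ip ON ip.id = ps.parameterId
    GROUP BY e.id
""")

LONG_SQL = """
    SELECT e.id
    FROM main.Experiment e,
         main.InputParameter ip1, main.ParameterSetting ps1,
         main.InputParameter ip2, main.ParameterSetting ps2,
         main.InputParameter ip3, main.ParameterSetting ps3
    WHERE ps1.containerId = e.id AND ps1.parameterId = ip1.id
      AND ip1.label = 'omega_lambda' AND ps1.numericalValue_value = :ol
      AND ps2.containerId = e.id AND ps2.parameterId = ip2.id
      AND ip2.label = 'omega_baryon' AND ps2.numericalValue_value = :ob
      AND ps3.containerId = e.id AND ps3.parameterId = ip3.id
      AND ip3.label = 'omega' AND ps3.numericalValue_value = :om
    ORDER BY e.id
"""
SHORT_SQL = """
    SELECT e.id
    FROM simtap.Experiment e
    WHERE omegaLambda = :ol AND omegaBaryon = :ob AND omega = :om
    ORDER BY e.id
"""

params = {"ol": 0.7, "ob": 0.02, "om": 0.9}
long_ids = [r[0] for r in conn.execute(LONG_SQL, params)]
short_ids = [r[0] for r in conn.execute(SHORT_SQL, params)]
print(long_ids, short_ids)
```

Values are passed as bound parameters so that both queries compare against exactly the same floating-point numbers. In practice an exact equality test on real-valued parameters is fragile, and a range condition may be the safer choice.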

Page 20

Table definitions can be derived

• From a Protocol definition
– input parameters
– for each Representation object type
  • a table with statistical summaries of properties
– target object type
  • à la SimDM (units in ADQL required)
  • pivoted per project?
– input data sets (urls)
• Pivoting queries can be generated
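A minimal sketch of such generation, in Python with the standard sqlite3 module: given a list of input parameters taken from a Protocol, it emits the CREATE TABLE statement for the Experiment table and the matching pivot query. The `PROTOCOL_PARAMS` list, the function names `experiment_ddl` and `pivot_query`, the REAL column type and the source table layout are hypothetical; a real generator would read them from the Protocol definition in SimDB.

```python
import sqlite3

# Hypothetical Protocol definition: (column name, InputParameter label) pairs.
PROTOCOL_PARAMS = [
    ("omegaBaryon", "omega.baryon"),
    ("omegaLambda", "omega.lambda"),
    ("omega", "omega"),
]

def experiment_ddl(schema, params):
    """CREATE TABLE statement with one REAL column per input parameter."""
    for name, _label in params:
        if not name.isidentifier():
            raise ValueError(f"not a usable column name: {name!r}")
    cols = "".join(f",\n    {name} REAL" for name, _label in params)
    return f"CREATE TABLE {schema}.Experiment (\n    id INTEGER PRIMARY KEY{cols}\n)"

def pivot_query(params):
    """SELECT that turns ParameterSetting rows into one row per experiment."""
    cases = "".join(
        ",\n       MAX(CASE WHEN ip.label = '{}' THEN ps.value END) AS {}".format(
            label.replace("'", "''"), name)  # escape quotes inside the label
        for name, label in params
    )
    return (
        f"SELECT ps.containerId AS id{cases}\n"
        "FROM ParameterSetting ps\n"
        "JOIN InputParameter ip ON ip.id = ps.parameterId\n"
        "GROUP BY ps.containerId"
    )

# Try the generated statements on the sample rows of the pivot slide.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS simtap")  # stands in for a schema
conn.executescript("""
    CREATE TABLE InputParameter (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE ParameterSetting (containerId INTEGER, value REAL, parameterId INTEGER);
""")
conn.executemany(
    "INSERT INTO InputParameter VALUES (?, ?)",
    [(456, "omega.baryon"), (457, "omega.lambda"), (458, "omega")],
)
conn.executemany(
    "INSERT INTO ParameterSetting VALUES (?, ?, ?)",
    [(123, 0.02, 456), (123, 0.7, 457), (123, 0.9, 458),
     (345, 0.04, 456), (345, 0.7, 457), (345, 1.0, 458)],
)
conn.execute(experiment_ddl("simtap", PROTOCOL_PARAMS))
conn.execute("INSERT INTO simtap.Experiment " + pivot_query(PROTOCOL_PARAMS))

rows = conn.execute(
    "SELECT id, omegaBaryon, omegaLambda, omega FROM simtap.Experiment ORDER BY id"
).fetchall()
print(experiment_ddl("simtap", PROTOCOL_PARAMS))
print(rows)
```

Because both statements come from the same parameter list, the table definition and the query that fills it stay in step when the Protocol changes.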

Page 21

Proposal

• SimDAL services MAY include a SimTAP service
• 1 SimTAP schema per Protocol
• Each such schema contains
– 1 Experiment table with columns for parameters
– >=1 Product tables with characterisation of properties
– Possibly other tables from SimDB/TAP