WP3 Werner Nutt (Heriot-Watt University) R-GMA – Architecture and Query Mediation 24/4/2003



WP3

Werner Nutt (Heriot-Watt University)


R-GMA – Architecture and Query Mediation

24/4/2003

Contributors

• Rob Byrom, RAL
• Andy Cooke, Heriot-Watt
• Roney Cordenonsi, QMUL
• Abdeslem Djaoui, RAL
• Laurence Field, PPARC
• Steve Fisher, RAL
• Alasdair Gray, Heriot-Watt
• Steve Hicks, RAL
• Jason Leake, RAL
• Lisha Ma, Heriot-Watt
• James Magowan, IBM-UK
• Werner Nutt, Heriot-Watt
• Norbert Podhorszki, SZTAKI
• Manish Soni, PPARC
• Paul Taylor, IBM-UK
• Antony Wilson, PPARC


Grid Monitoring: Where are the Concepts?

There are two styles of talking about the Grid:
– General metaphors (virtual organisations, services, …)
– Low-level technicalities and jargon (LDAP, XML, SOAP, OGSA, OGSI, ...)

What is missing:
– Clear definitions of the problems
– Intuitive concepts for solving them

Needed for communication with both users and developers

The Grid Monitoring Problem

In a Grid we have
– Computers
– Storage elements
– Network nodes and connections
– Application programmes, …

Monitoring:
– What is the current state of the system?
– How did the system behave in the past?


Monitoring Data Come in two Kinds

A Grid monitoring system makes available two kinds of data

• static data “pools”, e.g., databases on
– network topology, nodes connected
– applications available (versions, licences, ...)

• “streams” of data, e.g.,
– sensor data (cpu load, network traffic, ...)

Data streams may give rise to data pools if they are archived

Today: R-GMA is tailored towards streams, but not pools


Examples of Monitoring Queries

• “Show me the (average) cpu-load of computers at Heriot-Watt!”

• “Between which nodes was yesterday's average transportation time for 1 MB packets higher than 0.… seconds?”

• “For every node N, how many computers connected to N currently have a cpu-load of no more than 30%?”


Stream Queries can have Various Temporal Interpretations

Consider a query over the relation “Transport Time”

tt(src, dest, pcktSize, method, timestamp, time)

SELECT * FROM tt
WHERE src = ral AND dest = bologna

What is meant? Measurements
– from now on? (Continuous Query)
– up until now? (History Query)
– right now? (Latest Snapshot Query)

Today: Queries can be “flagged” with their type
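
The three readings can be made concrete with a small Python sketch over an in-memory list of tt tuples. It is only an illustration under stated assumptions: the names TT, history, continuous and latest_snapshot are hypothetical and not part of R-GMA, integer timestamps stand in for real ones, and "latest" is taken per combination of src, dest, pcktSize and method.

```python
from typing import NamedTuple


class TT(NamedTuple):
    """One tuple of the Transport Time relation tt."""
    src: str
    dest: str
    pcktSize: int
    method: str
    timestamp: int
    time: float


def matches(t):
    # The WHERE clause of the example query.
    return t.src == "ral" and t.dest == "bologna"


def history(stream, now):
    """History query: all matching measurements up until now."""
    return [t for t in stream if matches(t) and t.timestamp <= now]


def continuous(stream, now):
    """Continuous query: matching measurements from now on, yielded as they arrive."""
    for t in stream:
        if matches(t) and t.timestamp > now:
            yield t


def latest_snapshot(stream, now):
    """Latest snapshot query: the most recent matching measurement per key."""
    latest = {}
    for t in stream:
        if matches(t) and t.timestamp <= now:
            # Assumption: the snapshot key is the primary key without the timestamp.
            key = (t.src, t.dest, t.pcktSize, t.method)
            if key not in latest or t.timestamp > latest[key].timestamp:
                latest[key] = t
    return list(latest.values())
```

The same WHERE clause is used in all three functions; only the treatment of the timestamp differs.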


Advanced Queries: Mixing Temporal Query Types

• “Which connections currently have a transportation time that is higher than last week's average?”

(latest snapshot and history)

• “Show me the cpu load of those machines where it is lower than yesterday's load average!”

(continuous and history)

We do not intend to support such queries by R-GMA!


Architecture Approach 1: A Monitoring Data Warehouse

Idea:
– store all data about the Grid status into a huge database
– and query it

Not realistic:
• Loading takes time
• Data occupy space
• Connections to the warehouse may fail
• Often monitoring data flow as data streams, and queries ask for data streams as output


Approach 2: Monitoring with a “Multi-agent System”

The Grid Monitoring Architecture (GMA) of the Global Grid Forum distinguishes between:

• Consumers of information
• Producers of information
• Directory Service
– Producers register their supply
– Consumers register their demand

[Figure: GMA components Consumer, Producer and Directory Service; labels: Monitoring Application, Data Base, Sensor; arrows: find/register]

Directory Service mediates between producers and consumers
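
A minimal Python sketch of this register/find interplay follows. GMA itself leaves open how supply and demand are described, so the plain topic strings, the class name DirectoryService and all method and participant names here are assumptions made for illustration only.

```python
from collections import defaultdict


class DirectoryService:
    """Toy directory: maps a topic to the producers and consumers registered for it."""

    def __init__(self):
        self._supply = defaultdict(set)  # topic -> names of producers
        self._demand = defaultdict(set)  # topic -> names of consumers

    def register_producer(self, name, topic):
        self._supply[topic].add(name)

    def register_consumer(self, name, topic):
        self._demand[topic].add(name)

    def find_producers(self, topic):
        # Used by a consumer to locate suitable producers.
        return sorted(self._supply.get(topic, ()))

    def find_consumers(self, topic):
        # Used by a producer to locate interested consumers.
        return sorted(self._demand.get(topic, ()))


directory = DirectoryService()
directory.register_producer("sensor-hw", "cpu_load")
directory.register_producer("archive-ral", "cpu_load")
directory.register_consumer("monitor-app", "cpu_load")
```

After the lookup, producer and consumer would exchange data directly; the directory only brokers the contact.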

Questions about GMA:

• Which kinds of producers and consumers are there?
• In which language do producers register their supply and consumers their demand?
• What is the meaning of a registration?
• How does a consumer find suitable producers? And how does a producer find suitable consumers?
• Producers have different capabilities to answer queries (e.g. selections, joins, …). Which of them should they register?

R-GMA: A Virtual Monitoring Data Warehouse

• Language of producers and consumers: relational queries (SQL)
• Vocabulary: relations in a global schema
• Consumer: poses queries over global schema
• Producer:
– has a type (stream p., database p.)
– publishes relations R1, … , Rk
– for every R, registers a simple view V on the global schema

[Figure: Consumer, Query, Global Schema S, Registry, views V1, V2, ..., Vn on S, DB-Producer (DB), Stream Producer (Sensor)]

Primary Producers

Database producer
• supports queries over fixed set of tuples (static queries)
• can be used to publish a database

Stream producer
• supports queries over changing set of tuples (continuous queries)
• supports “latest snapshot queries”
– offers up-to-date values for each primary key

Today: DatabaseProducers and StreamProducers in R-GMA are different from the above!


Communication Modes of Stream Producers

Stream Producers may offer two communication modes for continuous queries:

– lossless (… but tuples could become stale)
– lossy (… but tuples are fresh)

[Figure: Producer, Producer Servlet, Consumer Servlet, Consumer; two queues]

Today: R-GMA’s StreamProducers are resilient and support lossless communication
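
The trade-off between the two modes can be sketched with a queue between a fast producer and a slow consumer. This is an illustration only, not how R-GMA's servlets are implemented: the function name buffer_for_slow_consumer and the capacity parameter are hypothetical, and a bounded queue that discards its oldest entries is just one possible lossy policy.

```python
from collections import deque


def buffer_for_slow_consumer(tuples, capacity=None):
    """Return what is waiting in the queue once the producer has sent all tuples.

    capacity=None models lossless mode: nothing is dropped, but old tuples
    may be stale by the time they are read.
    A finite capacity models lossy mode: the oldest tuples are discarded,
    so whatever is read is fresh.
    """
    queue = deque(maxlen=capacity)
    for t in tuples:
        queue.append(t)  # a full bounded deque drops its oldest element
    return list(queue)


lossless = buffer_for_slow_consumer(range(1, 6))
lossy = buffer_for_slow_consumer(range(1, 6), capacity=2)
```

Here lossless keeps all five tuples, while lossy keeps only the two most recent ones.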


Republishers Publish Query Answers

Archiver: shows the history of a stream.

Stream Republisher: enables merging, thinning, summarising of streams …

                    into database         into stream
Static Query        Materialised View     --
Continuous Query    Archiver              Stream Republisher
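
What merging, thinning and summarising might look like for a stream republisher is sketched below in Python. The function names, the (timestamp, value) tuple layout and the fixed-size averaging window are assumptions chosen for illustration; the slides do not prescribe any of them.

```python
import heapq
from itertools import islice


def merge(*streams):
    """Merge several timestamp-ordered streams into one ordered stream."""
    return heapq.merge(*streams, key=lambda t: t[0])


def thin(stream, every):
    """Thin a stream by passing on only every `every`-th tuple."""
    return islice(stream, 0, None, every)


def summarise(stream, window):
    """Summarise a stream: emit the average value of each full window of tuples."""
    batch = []
    for timestamp, value in stream:
        batch.append(value)
        if len(batch) == window:
            # Stamp the summary with the time of the last tuple in the window.
            yield timestamp, sum(batch) / window
            batch = []
    # An incomplete final window is dropped in this sketch.


a = [(1, 10.0), (3, 30.0)]
b = [(2, 20.0), (4, 40.0)]
merged = list(merge(a, b))
thinned = list(thin(merged, 2))
summary = list(summarise(merged, 2))
```

Each function consumes a stream and yields a stream, so the three can be chained freely.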

Republishers in R-GMA Today

Republishers are called “archivers” (although some of them don't archive anything)

An archiver (= republisher)

• is defined by a query
• consumes only from “stream producers”
• publishes the query result according to its type, using
– a “stream producer”, or
– a “latest snapshot producer”, or
– a “database producer” (which keeps an archive)


Which View should a Republisher Register?

Problem:

Republishers may compute complex queries

… but complex views would confuse the “mediator”!

Ideas:
– register a simplified view for a complex query
– register a new table


What is the Meaning of a Query in R-GMA?

Assumption: the views of (primary) producers are selections on a single relation, i.e., queries of the form

SELECT * FROM cpu_load WHERE machine_id = 'AB123' AND loc = 'hw'

(each producer contributes its parts of a relation)

• The virtual database contains the union of the data of all the primary producers

• Conceptually, a query is evaluated over the entire virtual db

In R-GMA Query Answering Needs Mediation

Suppose P1, P2 produce for tt (Transport Time)

P1: … WHERE src = hw
P2: … WHERE src = ral AND pcktSize > 20

A global consumer poses its query over global relations

SELECT * FROM tt WHERE pcktSize > 10

A mediator translates this into queries over local relations

SELECT * FROM P1.tt WHERE pcktSize > 10
UNION
SELECT * FROM P2.tt

Today: R-GMA’s mediator handles simple queries like the one above
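
That the mediated query returns the same answers as the global one can be checked on toy data with Python's sqlite3 module. This is a sketch under assumptions: the tables P1_tt and P2_tt stand in for the local relations P1.tt and P2.tt, the tuples are made up, and the global relation tt is modelled as a view over their union. No condition is needed on P2's relation because every tuple of P2 satisfies pcktSize > 20 and hence pcktSize > 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# P1_tt and P2_tt stand in for the local relations P1.tt and P2.tt.
schema = '(src TEXT, dest TEXT, pcktSize INTEGER, method TEXT, "timestamp" INTEGER, "time" REAL)'
cur.execute("CREATE TABLE P1_tt " + schema)
cur.execute("CREATE TABLE P2_tt " + schema)

# Made-up tuples; each producer only holds tuples that satisfy its registered view.
cur.executemany("INSERT INTO P1_tt VALUES (?, ?, ?, ?, ?, ?)", [
    ("hw", "ral", 5, "ping", 1, 0.2),        # P1: src = hw
    ("hw", "ral", 15, "ping", 2, 0.3),
])
cur.executemany("INSERT INTO P2_tt VALUES (?, ?, ?, ?, ?, ?)", [
    ("ral", "bologna", 25, "ping", 1, 0.4),  # P2: src = ral AND pcktSize > 20
    ("ral", "hw", 50, "ftp", 3, 0.9),
])

# The virtual global relation is the union of the producers' data.
cur.execute("CREATE VIEW tt AS SELECT * FROM P1_tt UNION ALL SELECT * FROM P2_tt")

# Query as posed by the global consumer.
global_rows = cur.execute("SELECT * FROM tt WHERE pcktSize > 10").fetchall()

# Query as rewritten by the mediator over the local relations.
mediated_rows = cur.execute(
    "SELECT * FROM P1_tt WHERE pcktSize > 10 UNION SELECT * FROM P2_tt"
).fetchall()

same_answer = sorted(global_rows) == sorted(mediated_rows)
```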

Global and Local Consumers

• Global consumers pose queries over global relations

SELECT * FROM tt WHERE pcktSize > 10,

which are translated into queries over local relations

SELECT * FROM P1.tt WHERE pcktSize > 10
UNION
SELECT * FROM P2.tt

• Local consumers pose queries over local relations directly

SELECT * FROM P1.tt WHERE method = ping

Today: a consumer can be global or local, but local relations cannot be referred to explicitly


How does the Mediator Find Suitable Producers?

P1, P2, P3 produce for tt (Transport Time)

P1: … src = hw
P2: … src = ral AND pcktSize > 20
P3: … src = ral AND method = ping

Q: SELECT * FROM tt WHERE src = ral AND method = ping

We see: P1 is not suitable for Q, but P2 and P3 are. Why?

src = hw AND src = ral AND method = ping is never true

src = ral AND pcktSize > 20 AND … is sometimes true

Satisfiability Test!

Today: implemented
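
A possible shape of such a test is sketched below in Python. It is not R-GMA's implementation. It assumes that views and queries are conjunctions of atoms (attribute, operator, constant) with the operators =, > and <, that the values of one attribute are mutually comparable, and that domains are dense, so two strict bounds leave room whenever the lower one is smaller than the upper one. The name satisfiable is hypothetical.

```python
from collections import defaultdict


def satisfiable(conjunction):
    """Return True if some tuple can satisfy all (attr, op, value) atoms at once."""
    by_attr = defaultdict(list)
    for attr, op, value in conjunction:
        by_attr[attr].append((op, value))

    for atoms in by_attr.values():
        equals = {v for op, v in atoms if op == "="}
        lowers = [v for op, v in atoms if op == ">"]
        uppers = [v for op, v in atoms if op == "<"]
        if len(equals) > 1:
            return False  # e.g. src = hw AND src = ral
        if equals:
            v = next(iter(equals))
            if any(v <= lo for lo in lowers) or any(v >= hi for hi in uppers):
                return False  # the required value violates a bound
        elif lowers and uppers and max(lowers) >= min(uppers):
            return False  # the bounds leave no room
    return True


# Views and query from the slide.
P1 = [("src", "=", "hw")]
P2 = [("src", "=", "ral"), ("pcktSize", ">", 20)]
P3 = [("src", "=", "ral"), ("method", "=", "ping")]
Q = [("src", "=", "ral"), ("method", "=", "ping")]

# A producer is suitable if its view and the query can be true together.
suitable = [name for name, view in (("P1", P1), ("P2", P2), ("P3", P3))
            if satisfiable(view + Q)]
```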


… So Which Producers Should the Mediator Ask?

P2: … src = ral AND pcktSize > 20
P3: … src = ral AND method = ping

Q: SELECT * FROM tt WHERE src = ral AND method = ping

All answers to Q returned by P2 are also returned by P3:

whenever
    src = ral AND pcktSize > 20 AND src = ral AND method = ping
is true, then
    src = ral AND method = ping AND src = ral AND method = ping
is true.

Hence, R-GMA only needs to ask P3

Entailment Test!

Today: not implemented
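
Since the test is not implemented, the following Python sketch only suggests what it could look like. It makes the same assumptions as a simple satisfiability check would: conjunctions of atoms (attribute, operator, constant) with =, > and <, and a premise that is itself satisfiable. The names implies_atom and entails are hypothetical.

```python
def implies_atom(premise, atom):
    """Return True if some atom of the premise guarantees the given atom."""
    attr, op, c = atom
    for p_attr, p_op, v in premise:
        if p_attr != attr:
            continue
        if op == "=" and p_op == "=" and v == c:
            return True
        if op == ">" and ((p_op == "=" and v > c) or (p_op == ">" and v >= c)):
            return True
        if op == "<" and ((p_op == "=" and v < c) or (p_op == "<" and v <= c)):
            return True
    return False


def entails(premise, conclusion):
    """Return True if every tuple satisfying the premise also satisfies the conclusion."""
    return all(implies_atom(premise, atom) for atom in conclusion)


# Views and query from the slide.
P2 = [("src", "=", "ral"), ("pcktSize", ">", 20)]
P3 = [("src", "=", "ral"), ("method", "=", "ping")]
Q = [("src", "=", "ral"), ("method", "=", "ping")]

p2_covered_by_p3 = entails(P2 + Q, P3 + Q)  # answers from P2 are also in P3
p3_covered_by_p2 = entails(P3 + Q, P2 + Q)  # but not the other way round
```

Skipping P2 on this basis is only safe if P3's registered view describes completely what P3 publishes.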


… But What Did the Producers Promise?

P registers view V

Does P promise
– some of V? (sound description)
– all of V? (sound and complete description)

• The Entailment Test only makes sense when the registered views are sound and complete descriptions

• Producers should register completeness flags


… Why May a Producer not be Complete?

• The language of views is more restricted than the language of queries

Hence: republishers may be unable to say exactly what they publish

• Archivers may archive in lossy mode

• Producers may lose tuples

• A producer may not know everything about the real world

Open to debate

Keys in the Global Schema

tt(src, dest, method, pcktSize, timestamp, time)

Intuitively, tt has the primary key

(src, dest, method, pcktSize, timestamp).

We need to know the primary keys
• to understand the global schema
• to answer latest snapshot queries

But can we enforce them?

Sometimes, they hold globally if they hold locally!

Today: global tables have keys, which are used to keep a latest snapshot cache

Summary (1)

Types of Stream Queries
• continuous vs. history vs. latest snapshot

Producers
• primary producers vs. republishers
• DB producers: publish database
• stream producers: lossless vs. lossy communication modes
• republishers: materialised views vs. archivers vs. stream republishers

Summary (2)

Global Schema
• primary keys

Consumers
• global vs. local consumers

Mediator
• translates global query into local queries
• applies Satisfiability Test to find suitable producers

Query Planning
• Entailment Test
• sound vs. sound and complete producers