Page 1: The Future of MOCHA Nick Roussopoulos October 5, 2001.

The Future of MOCHA

Nick Roussopoulos

October 5, 2001

Page 2: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Stanford Oct 5, 2001 Nick Roussopoulos 2

The Problem

• Data sources for an enterprise are:
  – Distributed
    • Internet, intranets, extranets
  – Heterogeneous
    • Web servers, relational databases, file systems
  – Mission-critical
    • Weather service, ocean temperature, stock status, …
  – Costly to replace or upgrade
    • Risk of breaking it and loss of investment

Distributed and heterogeneous data sources

Page 3: The Future of MOCHA Nick Roussopoulos October 5, 2001.

The Problem

[Figure: many clients reach Oracle 8i, Informix, XML data, and text data sources over the Internet.]

High volume access from everywhere

Page 4: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Client-Server 2-tier architecture

[Figure: complex FAT clients connect over the Internet directly to Oracle 8i, Informix, XML data, and text data sources.]

Bad Idea

Page 5: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Middleware 3-tier architecture

[Figure: thin & fit clients connect over the Internet to an Integration Server with a Catalog; Translators sit between the Integration Server and the Oracle 8i, Informix, XML data, and text data sources.]

Page 6: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Nice but…

• Most middleware solutions are static
  • Not flexible for dynamic environments
  • Not scalable to hundreds of client and server sites
• Development cost is high
  • One-site-at-a-time at a fixed cost
• Maintenance cost is high
  • Upgrades are practically redevelopments

Page 7: The Future of MOCHA Nick Roussopoulos October 5, 2001.

A dynamic world needs Code extensibility & auto-deployment

• Need for user-defined types and functions
  – Polygon
  – Composite() – image aggregation
• Porting and manual installation of code (C/C++)
  – Operating System
  – Hardware Platform
• High cost of code maintenance
  – Updates on all platforms
  – Version management
• Security in hostile platforms

Page 8: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Code Deployment Problem

[Figure: the 3-tier architecture again: Client, Internet, Integration Server with Catalog, Translators, and the Oracle 8i, Informix, XML data, and text data sources.]

Not Scalable

Page 9: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Query Processing

• Query execution options
  – Limited by site-dependent software
    • Composite() – must be ported before use
• Most processing done at the Integration Server
  – Powerful Data Servers are under-utilized
    • I/O Nodes
  – Excessive data movement over the network
    • Network bottleneck
    • Slow internet access

Page 10: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Query Processing Problem

[Figure: the 3-tier architecture (Client, Internet, Integration Server with Catalog, Translators, Oracle 8i, Informix, XML data, text data) with data transfers labeled 100MB and 200MB along the path from the sources.]

Inefficient & not scalable

Page 11: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Solution

MOCHA
Middleware Based On a Code SHipping Architecture

Page 12: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Solution: Ship Java Code Mochlets

Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

[Figure: Client, QPC with Catalog and Code Repository, and two DAPs in front of Oracle and Informix; the query Q is shown moving between the components. Location labels: Virginia; Maryland, Virginia, Texas.]

No code porting & no maintenance

Page 13: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Solution: Filter Data @ Source

Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

[Figure: Client, QPC with Catalog and Code Repository, and two DAPs in front of Oracle and Informix. Flow labels: tuples 200MB and 100MB at the sources; results 200KB and 150KB from the DAPs; results 350KB toward the Client. Location labels: Virginia; Maryland, Virginia, Texas.]

No bandwidth waste
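The idea of filtering at the source can be sketched in a few lines of Python. This is only an illustration, not MOCHA's Java implementation: `composite` is a hypothetical stand-in (a pixel-wise maximum) for the Composite() aggregate, and `run_at_dap`, `merge_at_qpc`, and the sample rows are invented for the example. Merging partial results at the QPC is an assumption that works here because a maximum can be re-applied to partial maxima.

```python
from collections import defaultdict


def composite(images):
    """Hypothetical Composite(): pixel-wise maximum over equally sized images."""
    return [max(pixels) for pixels in zip(*images)]


def run_at_dap(rows, t1, t2):
    """Evaluate the query next to the data; return one small row per location."""
    groups = defaultdict(list)
    for row in rows:
        if t1 <= row["week"] <= t2:  # Where week BETWEEN t1 and t2
            groups[row["location"]].append(row["image"])
    return {loc: composite(imgs) for loc, imgs in groups.items()}


def merge_at_qpc(partials):
    """Combine per-DAP results; a location seen at several DAPs is composited again."""
    merged = defaultdict(list)
    for partial in partials:
        for loc, img in partial.items():
            merged[loc].append(img)
    return {loc: composite(imgs) for loc, imgs in merged.items()}


# Made-up sample data standing in for the two data sources.
oracle_rows = [
    {"location": "Virginia", "week": 1, "image": [1, 5, 2]},
    {"location": "Virginia", "week": 2, "image": [4, 0, 3]},
    {"location": "Virginia", "week": 9, "image": [9, 9, 9]},  # outside [t1, t2]
]
informix_rows = [
    {"location": "Maryland", "week": 1, "image": [2, 2, 2]},
    {"location": "Virginia", "week": 2, "image": [0, 7, 0]},
    {"location": "Texas", "week": 3, "image": [6, 1, 1]},
]

partials = [run_at_dap(oracle_rows, 1, 3), run_at_dap(informix_rows, 1, 3)]
result = merge_at_qpc(partials)
```

Only the per-location aggregates cross the network; the raw tuples never leave the site that stores them.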

Page 14: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Software architecture

[Figure: Client; QPC with Catalog and Code Repository; two DAPs in front of data sources labeled DBMS and OS File.]

Page 15: The Future of MOCHA Nick Roussopoulos October 5, 2001.

QPC: The Query Processing Coordinator

[Figure: QPC components: Client API, Query Parser, Catalog Manager, Query Optimizer, Execution Engine, Code Loader, SQL & XML Proc. Interface, and DAP Access API; connected to the XML Catalog, the Code Repository, and the DAP.]

QPC Controls and Coordinates Query Execution

Page 16: The Future of MOCHA Nick Roussopoulos October 5, 2001.

DAP: The Data Access Provider

DAP Provides QPC with Remote Access to the Data

[Figure: DAP components: DAP Access API, Control Module, Execution Engine, Code Loader, SQL & XML Proc. Interface, and a Data Source Access Layer (JDBC, I/O API, DOM, JNI) on top of the Data Source.]

Page 17: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Data Server: Storage System

• Stores and manages the data sets
  – database, web server, file system, XML repository

[Figure: Data Server]

Page 18: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Processing a Query in MOCHA

Query Parsing
Resource Discovery
Query Optimization
Metadata and Control Exchange
Code Deployment Phase
Query Execution

Table Rasters: location, image, week, band

Query:

Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

Page 19: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Plan Generation

[Figure: two Clients, QPC with Catalog and Code Repository, and two DAPs in front of Informix and Oracle. Thread labels: one Coordination Thread and two Execution Threads.]

Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

Page 20: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Automatic Code Deployment

[Figure: two Clients, QPC with Catalog and Code Repository, and two DAPs in front of Informix and Oracle. Thread labels: one Coordination Thread and two Execution Threads.]

Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

Page 21: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Data Processing

[Figure: two Clients, QPC with Catalog and Code Repository, and two DAPs in front of Informix and Oracle. Thread labels: one Coordination Thread and two Execution Threads.]

Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

Page 22: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Features of MOCHA

• Automatic code deployment
  • “Plug-N-Play”
  • no system-wide installations
• Metadata and Schema Mapping framework
  • XML, RDF
  • easy to exchange and map schemas
  • semi-automatic mapping
• Query optimization based on code shipping
  – reduce data movement overhead
    • filters at the source
    • expands at the client
  • metrics for code (operator) placement
  • optimization for selection, union and join plans

Page 23: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Demo: Global Land Cover Facility

• Integrates the following DAP sites
  – University of New Hampshire (Webster), NASA GSFC, UMD-CS, UMD-Geography, UMD-UMIACS SP-2 HPSS
• GLCF hosts the QPC
• Operations supported:
  – Coverage queries
  – Visualization of preview images for
  – Data sets MODIS, TM, AVHRR
  – GIS Features
    • Dynamic Sub-setting of TM scenes
    • Composites of GIS Features and AVHRR images

Page 24: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Multi-Sensor Analysis of the Los Alamos Fire Event Using MOCHA

• Data Synergy and Multi-Resolution Instrument Analysis using MOCHA
  – Access data residing at various data sources
  – Utilize image processing tools
• Fire Analysis required a multi-resolution approach
  – MOCHA is independent of instrument or resolution specifics
    • High Resolution: IKONOS and TM data
    • Moderate Resolution: 250m MODIS
    • Coarse Resolution: AVHRR and DMSP

Page 25: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Search Utility

Page 26: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Search Utility (cont’d)

Page 27: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Search Utility (cont’d)

Page 28: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Query Results

Page 29: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA ETM+ Subsetting Utility

Page 30: The Future of MOCHA Nick Roussopoulos October 5, 2001.

May 9, 2000 Los Alamos (Bands 1,2,3)

Page 31: The Future of MOCHA Nick Roussopoulos October 5, 2001.

May 9, 2000 Los Alamos (Bands 7,5,4)

Page 32: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Multi-Sensor Query

Page 33: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Tabular Query Results

Page 34: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MODIS: May 11, 2000: During Fire

Page 35: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MODIS: May 24, 2000: After Fire

Page 36: The Future of MOCHA Nick Roussopoulos October 5, 2001.

DMSP: Night Visibility of Fire

Page 37: The Future of MOCHA Nick Roussopoulos October 5, 2001.

IKONOS 4m resolution

Page 38: The Future of MOCHA Nick Roussopoulos October 5, 2001.

IKONOS 4m Subset

Page 39: The Future of MOCHA Nick Roussopoulos October 5, 2001.

IKONOS 1m resolution

Page 40: The Future of MOCHA Nick Roussopoulos October 5, 2001.

IKONOS 1m Subset

Page 41: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Metadata Publishing Framework

• Provides information about system resources
  • Data sources
  • schemas and mappings
  • user-defined types and functions
• Automates operation of MOCHA
• Incremental system growth
  • neither fixed nor hardwired parameters
  • no extension by re-compilation
• Share metadata with others (Internet)
  • machine readable form

Page 42: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Catalog Organization

• Metadata about “resources”
  – Local and global tables
  – UDF data types and operators
  – Schema mapping rules
  – DAPs
• Each one has a Uniform Resource Identifier (URI) in a global namespace
  – e.g.: mocha://cs1.umd.edu/EarthSci/Polygon
• Modeled with RDF, serialized with XML
  – easy to understand, use and exchange
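As a rough illustration of how such a resource URI could be taken apart, the sketch below uses Python's standard `urllib.parse`. The slides do not define the meaning of the URI components, so reading them as host, collection, and resource name is an assumption, and `parse_mocha_uri` is a hypothetical helper, not part of MOCHA.

```python
from urllib.parse import urlsplit


def parse_mocha_uri(uri):
    """Split a mocha:// URI into host, collection path, and resource name.

    The field names are assumptions made for this example only.
    """
    parts = urlsplit(uri)
    if parts.scheme != "mocha":
        raise ValueError(f"not a mocha URI: {uri!r}")
    segments = [s for s in parts.path.split("/") if s]
    if not parts.netloc or not segments:
        raise ValueError(f"incomplete mocha URI: {uri!r}")
    return {
        "host": parts.netloc,
        "collection": "/".join(segments[:-1]),
        "name": segments[-1],
    }


polygon = parse_mocha_uri("mocha://cs1.umd.edu/EarthSci/Polygon")
```

Because every resource carries its own host and path, two sites can publish types with the same short name without colliding.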

Page 43: The Future of MOCHA Nick Roussopoulos October 5, 2001.

RDF Model: Data Types

[Figure: RDF graph for the resource mocha://cs1.umd.edu/EarthSci/Raster with properties mocha:Type = Raster, mocha:Class = Raster.class, mocha:Repository = cs1.umd.edu/EarthSci, mocha:Size = 1 megabyte, mocha:Creator = [email protected]]

Page 44: The Future of MOCHA Nick Roussopoulos October 5, 2001.

XML Serialization: Data Types

• W3C Standards
• Easy to specify using GUI tools
• Easy to exchange
• Crawlers can harvest it
• Stored in
  – DB
  – File System

<rdf:Description about="mocha://cs1.umd.edu/EarthSci/Raster">
  <mocha:Type>Raster</mocha:Type>
  <mocha:Class>Raster.class</mocha:Class>
  <mocha:Repository>cs1.umd.edu/EarthSci</mocha:Repository>
  <mocha:Size>1 MB</mocha:Size>
  <mocha:Creator>[email protected]</mocha:Creator>
</rdf:Description>
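A catalog entry like the one above can be read with any XML parser. The sketch below uses Python's `xml.etree.ElementTree`; it is not MOCHA code. The `rdf:RDF` wrapper element is added so the fragment parses on its own, the `mocha` namespace URI is a made-up placeholder (the slides do not give one), and the Creator property is left out because its value is redacted in the transcript.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MOCHA_NS = "http://example.org/mocha#"  # hypothetical namespace URI

CATALOG_XML = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:mocha="{MOCHA_NS}">
  <rdf:Description about="mocha://cs1.umd.edu/EarthSci/Raster">
    <mocha:Type>Raster</mocha:Type>
    <mocha:Class> Raster.class </mocha:Class>
    <mocha:Repository>cs1.umd.edu/EarthSci</mocha:Repository>
    <mocha:Size>1 MB</mocha:Size>
  </rdf:Description>
</rdf:RDF>
""".strip()


def load_catalog(xml_text):
    """Return {resource URI: {property name: value}} for each rdf:Description."""
    root = ET.fromstring(xml_text)
    catalog = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        props = {}
        for child in desc:
            # ElementTree tags look like "{namespace}Local"; keep the local part.
            local_name = child.tag.split("}", 1)[1]
            props[local_name] = (child.text or "").strip()
        catalog[desc.get("about")] = props
    return catalog


catalog = load_catalog(CATALOG_XML)
raster = catalog["mocha://cs1.umd.edu/EarthSci/Raster"]
```

With the class file name and repository in hand, a code loader would know what to fetch and from where, which is the link between the catalog and automatic code deployment.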

Page 45: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Other Resources in MOCHA

• Local and Global tables
  – data sources + columns + types
• UDF Functions
  – argument types + return type
  – code repository
• Schema mapping rules
• DAPs
  – URL
  – login information

Page 46: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Schema Mapping in MOCHA

• Direct column mappings
• Complex Expressions

[Figure: table Rasters (location, image, week, band) mapped from table RastersMD (point1, point2, photo, date, band); the mapping uses rect() and week().]

Page 47: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Schema Mapping Rules

• Use XML to encode mapping rules
• Schema mapping sub-plans
  – leaf nodes

<MapList>
  <mi mapped="direct">
    <mocha:Column>image</mocha:Column>
    <mocha:Expr>photo</mocha:Expr>
  </mi>
  <mi mapped="expression">
    <mocha:Column>location</mocha:Column>
    <mocha:Expr>rect(point1, point2)</mocha:Expr>
  </mi>
  …

[Figure: plan tree with schema mapping sub-plans (SMP) as leaf nodes.]
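To make the two kinds of rule concrete, here is a small Python sketch that applies direct and expression mappings to one source row. Only the `image` and `location` rules come from the XML shown; the `week` and `band` rules are guesses based on the column names in the schema-mapping figure, and `rect`, `week`, and `apply_rules` are hypothetical helpers written for this example.

```python
from datetime import date


def rect(p1, p2):
    """Hypothetical rect(): bounding box (xmin, ymin, xmax, ymax) of two points."""
    return (min(p1[0], p2[0]), min(p1[1], p2[1]),
            max(p1[0], p2[0]), max(p1[1], p2[1]))


def week(d):
    """Hypothetical week(): ISO week number of a date."""
    return d.isocalendar()[1]


# (target column, rule kind, source column name or expression over the source row)
RULES = [
    ("image", "direct", "photo"),
    ("location", "expression", lambda r: rect(r["point1"], r["point2"])),
    ("week", "expression", lambda r: week(r["date"])),  # assumed rule
    ("band", "direct", "band"),                         # assumed rule
]


def apply_rules(source_row, rules):
    """Build a row in the target schema from a row in the source schema."""
    target = {}
    for column, kind, expr in rules:
        if kind == "direct":
            target[column] = source_row[expr]
        elif kind == "expression":
            target[column] = expr(source_row)
        else:
            raise ValueError(f"unknown mapping kind: {kind}")
    return target


rasters_md_row = {
    "point1": (3.0, 8.0),
    "point2": (1.0, 9.0),
    "photo": b"raw-image-bytes",
    "date": date(2000, 5, 9),
    "band": 3,
}
rasters_row = apply_rules(rasters_md_row, RULES)
```

Keeping the rules as data rather than code is what lets them be published in the catalog and attached to a query plan as leaf nodes.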

Page 48: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Optimization Framework

• Query optimization based on heuristics
  • cost = network + CPU + I/O
• Network is the dominant factor (WAN)
  • optimize for it first
• CPU and I/O are cheaper
  • optimize for them later
• Operator placement: Enhanced Hybrid Shipping
  • Code
  • Data

Page 49: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Operator Placement in MOCHA

• Data-Reducing Operators
  – “Filter” the data
  – aggregates, predicates, projections, semi-joins
    • Composite(), Overlaps(), AvgEnergy()
• Push to the DAPs
  • Return distilled results
  • Less data movement

[Figure: Composite() example.]

Page 50: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Operator Placement in MOCHA

• Data-Inflating Operators
  • “Expand” the data
  • projections, image processing, some joins …
    • DoubleResolution(), RotateSolid()
• Pull to the QPC
  • Data Shipping policy [FJK96]
  • Only send back raw arguments
  • Less data movement

[Figure: DoubleRes() example.]

Page 51: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Placement Metric: VRF

Volume Reduction Factor: Given operator f and relation R, then

$$\mathrm{VRF}(f) = \frac{\mathrm{VDT}}{\mathrm{VDA}}$$

• VDT - volume of data transmitted after applying f to R
• VDA - volume of data originally present in R

f is Data-Reducing iff VRF < 1 (e.g. Composite())

f is Data-Inflating iff VRF ≥ 1 (e.g. DoubleRes())
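The metric and the placement rule it implies can be written down directly. The Python sketch below is illustrative only: `vrf` and `place_operator` are hypothetical names, the Composite() volumes reuse the 300MB-in and 350KB-out labels from the "Filter Data @ Source" figure, and the DoubleRes() volumes assume that doubling the resolution quadruples the pixel count.

```python
def vrf(vdt_bytes, vda_bytes):
    """Volume Reduction Factor: data transmitted after f divided by data in R."""
    if vda_bytes <= 0:
        raise ValueError("VDA must be positive")
    return vdt_bytes / vda_bytes


def place_operator(vrf_value):
    """Data-reducing operators (VRF < 1) go to the DAP, the rest to the QPC."""
    return "DAP" if vrf_value < 1 else "QPC"


MB = 1_000_000
KB = 1_000

composite_vrf = vrf(350 * KB, 300 * MB)  # aggregate shrinks the data
double_res_vrf = vrf(4 * MB, 1 * MB)     # assumed 4x growth in volume

composite_site = place_operator(composite_vrf)
double_res_site = place_operator(double_res_vrf)
```

Either way the smaller side of the operator is what travels over the network: the distilled result for a reducing operator, the raw arguments for an inflating one.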

Page 52: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Goal: Plans with small CVRF

Cumulative Volume Reduction Factor: Given a plan P to solve query Q over relations R1, …, Rn

$$\mathrm{CVRF}(P) = \frac{\mathrm{CVDT}}{\mathrm{CVDA}}$$

• CVDT - volume of data transmitted by applying all operators in P to R1, …, Rn
• CVDA - volume of data originally present in R1, …, Rn

Search Space: Optimizer searches for plans that move minimal amount of data.

CVRF(Plan) ∈ [0,1]
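A toy version of that search, in Python, is shown below. It is a sketch under simplifying assumptions, not the MOCHA optimizer: each candidate plan is reduced to the number of bytes it ships per relation, and the plan names, relation names, and volumes are invented. The two candidates are chosen so that their ratios match the DAP (0.01) and QPC (1) figures quoted later in the deck.

```python
def cvrf(transmitted, relation_sizes):
    """Cumulative VRF: total bytes a plan ships / total bytes in the relations."""
    cvdt = sum(transmitted.values())
    cvda = sum(relation_sizes.values())
    if cvda <= 0:
        raise ValueError("CVDA must be positive")
    return cvdt / cvda


# Hypothetical relation sizes in bytes.
RELATION_SIZES = {"R1": 200_000_000, "R2": 100_000_000}

# Hypothetical candidate plans: bytes shipped per relation under each policy.
CANDIDATE_PLANS = {
    "data shipping (operators at QPC)": {"R1": 200_000_000, "R2": 100_000_000},
    "code shipping (operators at DAP)": {"R1": 2_000_000, "R2": 1_000_000},
}

scores = {name: cvrf(shipped, RELATION_SIZES)
          for name, shipped in CANDIDATE_PLANS.items()}
best_plan = min(scores, key=scores.get)
```

A real optimizer would also weigh CPU and I/O, but since the network term dominates on a WAN, ranking by CVRF first is the heuristic the slides argue for.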

Page 53: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Query Optimizer

• System R style
  – Left-deep plans (joins at QPC)
  – cost: execution time (network + CPU + I/O)
  – operator placement: VRF and plan cost
  – selections, unions and joins
• Placement Policy: Enhanced Hybrid Shipping
  – Code Shipping: operators at DAPs
  – Data Shipping: operators at QPC
  – generalizes Hybrid Shipping [FJK96]

Page 54: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Sequoia 2000 Benchmark

• Goals of first experiment:
  – Measure how good code shipping can be
  – Validate heuristics being proposed
    • VRF
    • CVRF
• Configured MOCHA with plans that place operators
  – at DAP with code shipping
  – at QPC with data shipping

Page 55: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Reducing vs. Inflating

[Figure: stacked bar chart of running time (secs, 0 to 2000), broken down into DB, CPU, NET, and MISC, comparing QPC and DAP placement for query classes Q1, Q2, and Q3.]

• Query classes
  – Q1: Composite of all images
  – Q2: Clipping and sub-setting
  – Q3: Double resolution of images

Performance
  – composites
    • 99% data reduction
    • 4-1 better performance
  – clipping and expansion
    • 80% data reduction
    • 3-1 better performance

Validates heuristics

Page 56: The Future of MOCHA Nick Roussopoulos October 5, 2001.

VRF vs. Selectivity

• Selectivity and cardinality not enough for distributed predicate placement
• Consider 50% selectivity
  • DAP CVRF = 0.01
  • QPC CVRF = 1

[Figure: stacked bar chart of running time (secs, 0 to 800), broken down into DB, CPU, NET, and MISC, comparing QPC and DAP placement at selectivity 0, .25, .50, .75, and 1.]

VRF is a better metric

Page 57: The Future of MOCHA Nick Roussopoulos October 5, 2001.

WAN Experiment

• Sites used:
  – University of Maryland (QPC)
  – University of Puerto Rico
  – Oregon Graduate Institute
  – University of North Dakota
  – University of Alabama

Page 58: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Union with Data-Reducing

[Figure: two bar charts by execution policy (DS, QS, EHS): "Execution Time Q6" (execution time in secs, 0 to 700) and "Resource Usage for Q6" (usage time in secs, 0 to 1200).]

• EHS is the better option
  – Filters data
  – 2-1 better performance
  – Minimal resource usage

Q6:
Select landuse, location
From polygons
Where perimeter(location) > 2000.0

Sites: UPR and OGI

Page 59: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Union with Reducing and Inflating

[Figure: two bar charts by execution policy (DS, QS, EHS): "Execution of Q5" (execution time in secs, 0 to 2000) and "Resource Usage for Q5" (usage time in secs, 0 to 3500).]

Q5:
Select landuse, location, triangulate(location)
From Polygons
Where perimeter(location) > 2000.0

EHS is better than DS and QS
• 2-1 better than QS
• 6-1 better than DS
• Consumes least resources

Sites: UPR and OGI

Page 60: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Join with Data-Reducing

[Figure: two bar charts by execution policy (DS, QS, EHS): "Execution Time Q8" (execution time in secs, 0 to 700) and "Resource Usage for Q8" (usage time in secs, 0 to 1400).]

• EHS is the better option
  – 3-1 better performance
  – Minimal resource usage
• Same pattern as with unions
  – Data movement is the key

Q8:
Select P.landuse, R.location, R.week
From polygons P, rasters R
Where overlaps(P.location, R.location)
And perimeter(P.location) > 2000.0

Sites: UPR and OGI

Page 61: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA System Status

• Operational MOCHA prototype
  – It’s real!
  – over 40,000 lines of 100% Java code (JDK 1.3)
  – People involved:
    • Manuel Rodriguez-Martinez (lead)
    • Mike McGann
    • Steve Kelley
    • Vadim Katz
    • John Towshend, Frank Lindsay, Ben White (Geographers)
    • Joseph JaJa (Algorithms)
  – Tested with NASA ESIP Federation
    • Los Alamos fire
  – Supports: Oracle, Postgres, Informix, Sybase, HPSS

Page 62: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Features of MOCHA

• Automatic Code Deployment
• Scalable middleware architecture
• Query optimization based on data movement reduction
• Metadata publishing framework [RMR00a]
  • RDF and XML
  • Publish schemas, mappings, types and functions
  • Drives automatic code deployment
• Schema mapping rules expressed in XML
  • attach as leaf nodes in query plan
  • extensible

Page 63: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA Publications

• Research papers and talks
  – ACM SIGMOD 2000
  – EDBT 2000
• Demos
  – ACM SIGMOD 2000
  – SSDBM 2001
  – NASA ESIP meetings and workshops
  – U.S. National Academy of Sciences

Page 64: The Future of MOCHA Nick Roussopoulos October 5, 2001.

The Future of MOCHA

A Million Site MOCHA

Page 65: The Future of MOCHA Nick Roussopoulos October 5, 2001.

The Future of MOCHA

• The role of MOCHA in distributed software systems
  – sensors
  – satellites
  – network switches and routers
  – laptops, palm computers
  – custom-built devices
  – cars, planes, boats
  – people (fireman), animals (whales)

Page 66: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Network of MOCHA enabled sensors

• Sensors are deployed in an area using ad hoc network techniques
• Sensors run Java JDK 1.3
• Lighter Sensors run Java JDK 1.3 Micro Edition

[Figure: a field of sensors, each labeled DAP.]

Page 67: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Organization of sensors

[Figure: sensor groups, each made up of normal sensors and a leader.]

• Sensors are grouped together for a specific goal or service
  • data acquisition
  • data aggregation, analysis
  • data streaming
• Group leaders are responsible for
  • establishing themselves (broadcast, voting, …)
  • coordination among sensors
  • making decisions (agents)
  • participating in other higher level groups (hybrid P2P)

Page 68: The Future of MOCHA Nick Roussopoulos October 5, 2001.

Concrete Example (from NASA)

• Constellation of Satellites (with sensors)
• A group observes Gamma radiation
  – aggregates measurements
  – determines an important radiation event
• Group leader tells other peer group leaders to instruct their sensors to observe the Gamma radiation event (reaction).
• System adapts to changes in the environment
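The reaction pattern in this example can be mimicked with a very small Python model. Everything in it is an assumption made for illustration: the `SensorGroup` class, the use of a mean as the aggregate, the fixed threshold that defines an "important" event, and the sample readings. The slides do not say how the real system aggregates or decides.

```python
from statistics import mean


class SensorGroup:
    """A group of sensors represented by its leader (hypothetical model)."""

    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold
        self.peers = []          # leaders of other groups
        self.observing = False   # True once the group watches the event

    def report(self, readings):
        """Aggregate the group's measurements and react if they signal an event."""
        level = mean(readings)
        if level > self.threshold:
            self.observing = True
            for peer in self.peers:
                peer.react(self.name)
            return True
        return False

    def react(self, source_group):
        """Instruct this group's sensors to observe the event reported by a peer."""
        self.observing = True


alpha = SensorGroup("alpha", threshold=10.0)
beta = SensorGroup("beta", threshold=10.0)
gamma = SensorGroup("gamma", threshold=10.0)
alpha.peers = [beta, gamma]

quiet = alpha.report([1.0, 2.0, 3.0])      # mean 2.0, below the threshold
event = alpha.report([12.0, 15.0, 18.0])   # mean 15.0, above the threshold
```

Only leaders exchange messages, so the number of notifications grows with the number of groups rather than the number of sensors.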

Page 69: The Future of MOCHA Nick Roussopoulos October 5, 2001.

MOCHA's Code Shipping feature for

• upgrades to fix bugs
• fresh code to gather data
  – at different resolution
  – new aggregates or functions
• dynamically configured code
  – application-specific security protocol
  – location-dependent encryption