Large-scale (meta)Data Aggregators & Infrastructure Requirements: the case of agriculture. Nikos Manouselis, Agro-Know Technologies & ARIADNE Foundation, nikosm@ieee.org

Post on 27-Mar-2015


Large-scale (meta)Data Aggregators & Infrastructure Requirements: the case of agriculture

Nikos Manouselis, Agro-Know Technologies & ARIADNE Foundation

nikosm@ieee.org @eAGE 2012, Dubai, 13/12/12

• Publications, theses, reports, other grey literature

• Educational material and content, courseware

• Primary data:
– Structured, e.g. datasets as tables
– Digitized: images, videos, etc.

• Secondary data (elaborations, e.g. a dendrogram)

• Provenance information, incl. authors, their organizations and projects

• Experimental protocols & methods

• Social data, tags, ratings, etc.

(agricultural) research data

• stats

• gene banks

• gis data

• blogs,

• journals

• open archives

• raw data

• technologies

• learning objects

• ………..

educators’ view

• stats

• gene banks

• gis data

• blogs,

• journals

• open archives

• raw data

• technologies

• learning objects

• ………..

researchers’ view

• stats

• gene banks

• gis data

• blogs,

• journals

• open archives

• raw data

• technologies

• learning objects

• ………..

practitioners’ view

• aim is: promoting data sharing and consumption related to any research activity aimed at improving productivity and quality of crops

ICT for computing, connectivity, storage, instrumentation

data infrastructure for agriculture

metadata fields: Publisher, Date, Catalog, Subject, ID, Author, Title

we actually share metadata

e.g. an educational resource

…metadata reflect the context

…sometimes, data also included

metadata aggregations

• concerns viewing merged collections of metadata records from different sources

• useful when access to specific supersets or subsets of networked collections is needed
– records actually stored at aggregator
– or queries distributed to virtually aggregated collections
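As a rough illustration of the first option (records actually stored at the aggregator), the sketch below merges records from several sources into one collection and then selects a subset. The record fields, the source names, the sample data, and the helpers merge_collections and subset_by_subject are all assumptions made for illustration; they do not describe any particular aggregator.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]


def merge_collections(sources: Dict[str, Iterable[Record]]) -> List[Record]:
    """Merge metadata records from several sources into one collection.

    Records sharing an "id" are kept once (first source wins), and each
    kept record remembers which source it came from.
    """
    merged: Dict[str, Record] = {}
    for source_name, records in sources.items():
        for record in records:
            record_id = record["id"]
            if record_id not in merged:
                merged[record_id] = {**record, "source": source_name}
    return list(merged.values())


def subset_by_subject(records: Iterable[Record], subject: str) -> List[Record]:
    """Return the subset of an aggregated collection matching one subject."""
    return [r for r in records if r.get("subject") == subject]


# Hypothetical sample data: two repositories with one overlapping record.
SOURCES = {
    "repo_a": [
        {"id": "oai:a:1", "title": "Soil basics", "subject": "soil"},
        {"id": "oai:a:2", "title": "Crop rotation", "subject": "crops"},
    ],
    "repo_b": [
        {"id": "oai:a:2", "title": "Crop rotation", "subject": "crops"},
        {"id": "oai:b:7", "title": "Irrigation data", "subject": "water"},
    ],
}

aggregated = merge_collections(SOURCES)
crops_only = subset_by_subject(aggregated, "crops")
```

Keeping the first copy of a duplicated record is only one possible policy; a real aggregator might prefer the most recent version or keep every version.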

typically look like this

Ternier et al., 2010

typical problem: computing

typical problem: hosting

an ideal scenario

diagram labels (duplicates from the slide layout removed):

• Data provider in need of hosting & storage of small-scale CMS: sets up own CMS instance

• Data provider in need of large-scale hosting & replication CMS: requests space/accounts in large-scale CMS

• Data provider hosting CMS at own or external/commercial infrastructure: interested to expose (meta)data to e-infrastructure

• register as data source (shown three times)

• shares (meta)data e.g. through OAI-PMH (shown three times)

• hosted over cloud

• computed over grid

• computed over grid & hosted over cloud

• computed & hosted over agINFRA grid/cloud

• indexed & available through CIARD RING

• (META)DATA AGGREGATOR supported by scientific gateway

……
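The "shares (meta)data through OAI-PMH" step in the scenario can be sketched as follows. This is a minimal sketch, not the agINFRA implementation: it only builds a ListRecords request URL and parses a response, using the standard OAI-PMH and Dublin Core namespaces. The base URL, the sample response, and the helper names list_records_url and parse_list_records are assumptions for illustration, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Extract (identifier, title) pairs and the resumption token, if any."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(OAI_NS + "header/" + OAI_NS + "identifier")
        title = record.findtext(".//" + DC_NS + "title")
        records.append((identifier, title))
    token = root.findtext(".//" + OAI_NS + "resumptionToken")
    return records, (token or None)


# Hypothetical response from a data provider (trimmed to the essentials).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Crop rotation basics</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("http://example.org/oai")
records, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until the response carries no token, which is one reason harvesting large collections takes time.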

• it's all about efficient metadata management

• storage issues: where components are hosted, how metadata aggregations & their versions are handled/stored, scaling up

• computing issues: harvesting takes time/resources and needs to be invoked often; automatic tagging tasks are demanding

• often recurring, similar workflows are needed (validate, transform, harvest, auto-tag, index)

overall need
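The recurring workflow named above (validate, transform, harvest, auto-tag, index) can be sketched as a list of steps applied in sequence. Everything here is an assumption for illustration: the step order chosen for the run, the sample records, the keyword vocabulary, and the in-memory index stand in for real harvesting, tagging, and indexing services.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def harvest(_: List[Record]) -> List[Record]:
    # Stand-in for a real harvest (e.g. over OAI-PMH); returns fixed sample data.
    return [
        {"id": "1", "title": " Maize  yield trials "},
        {"id": "2", "title": ""},
        {"id": "3", "title": "Soil moisture maps"},
    ]


def validate(records: List[Record]) -> List[Record]:
    # Drop records that lack a non-empty title.
    return [r for r in records if str(r.get("title", "")).strip()]


def transform(records: List[Record]) -> List[Record]:
    # Normalise whitespace in titles.
    return [{**r, "title": " ".join(str(r["title"]).split())} for r in records]


KEYWORDS = {"maize": "crops", "soil": "soil"}  # hypothetical tagging vocabulary


def auto_tag(records: List[Record]) -> List[Record]:
    # Naive keyword matching as a placeholder for real automatic tagging.
    tagged = []
    for r in records:
        words = str(r["title"]).lower().split()
        tags = sorted({tag for kw, tag in KEYWORDS.items() if kw in words})
        tagged.append({**r, "tags": tags})
    return tagged


INDEX: Dict[str, List[str]] = {}


def index(records: List[Record]) -> List[Record]:
    # Build an in-memory mapping from tag to record ids.
    for r in records:
        for tag in r["tags"]:
            INDEX.setdefault(tag, []).append(str(r["id"]))
    return records


def run_workflow(steps: List[Step]) -> List[Record]:
    """Apply each step to the output of the previous one."""
    records: List[Record] = []
    for step in steps:
        records = step(records)
    return records


result = run_workflow([harvest, validate, transform, auto_tag, index])
```

Expressing the workflow as a list of interchangeable steps is one way to reuse the same runner for the similar workflows that recur across data sources.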

why should you care?

promoting course descriptions

• push course information to various syndication/aggregation sites to allow users to discover them
– OCW search engine (http://www.ocwsearch.com)
– Moodle Hub concept (hub.moodle.org)

including relevant content

• allow course creator/author to find relevant material and resources to enrich course
– Europeana ingestion widget (http://wiki.agroknow.gr/agroknow/index.php/Hack4Europe_2012)

• suggest to learners additional courses and material relevant to what they access
– Eummena’s Moodle Widget (http://www.eummena.org/index.php/labs)

developing more end-user services

• Web portals to support user communities (e.g. thematic, geographical, social, cultural)
– MACE portal (http://portal.mace-project.eu)
– Photodentro Greek school collections portal (http://photodentro.edu.gr)
– VOA3R social platform for researchers (http://voa3r.cc.uah.es)

wrap up

(META)DATA AGGREGATOR

considerations

• easily replicated cloud-hosted software applications (e.g. DSpace instances)

• portal/service owners and software developers to use the infrastructure as a basis

• power up existing data & service networks

interesting: TERENA OER pilot

• interconnecting open educational resource repositories of NRENs

https://confluence.terena.org/pages/viewpage.action?pageId=33751325

interesting: GLOBE

• Global Learning Objects Brokering Exchange Alliance

• http://globe-info.org

thank you!

nikosm@ieee.org

http://wiki.agroknow.gr

http://ariadne-eu.org
