SEASR Introduction. High Performance Computing in the Humanities, Arts, and Social Science Workshop, UIUC/NCSA, July 28, 2008. Loretta Auvil, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

ICHASS Workshop Seasr

Nov 28, 2014


Loretta Auvil

 
Transcript
Page 1: ICHASS Workshop Seasr

SEASR Introduction

High Performance Computing in the Humanities, Arts, and Social Science Workshop

UIUC/NCSA, July 28, 2008

Loretta Auvil

National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

Page 2: ICHASS Workshop Seasr

SEASR Goals

This project will focus on developing, integrating, deploying, and sustaining a set of reusable and expandable software components and a supporting framework that will benefit a broad set of data mining applications for scholars in humanities.

The key goals established for this effort are a set of software-centric directives:

–  Support the development of a state-of-the-art software environment for data management and analysis of digital libraries, repositories and archives, as well as educational platforms that are expected to contribute to many of the humanities breakthroughs of the 21st century.

–  Support the continued development, expansion, and maintenance of an end-to-end software system (user interfaces, workflow engines, data management, analysis and visualization tools, collaborative tools, and other software integrated into a complete environment) to bring the full power of data analytics to the scholars.

–  Support education and training for use of this software environment for analysis through workshops to promote its usage among scholars.

Page 3: ICHASS Workshop Seasr

Project Highlights

•  SEASR will employ a comprehensive environment that integrates two complementary and revolutionary technical advances, Service Oriented Architecture and Semantic Web, into a single computing architecture: Semantic Enabled Service Oriented Architecture

•  SEASR addresses the challenges of transforming information into knowledge by constructing the software bridges that are required to move from the unstructured and semi-structured data world to the structured data world

Page 4: ICHASS Workshop Seasr

What does this mean for the DH community?

SEASR will:

•  help scholars access existing large data stores more readily

•  provide scholars with enhanced data synthesis and query analysis
–  from focused data retrieval and data integration
–  to intelligent human-computer interactions for knowledge access
–  to semantic data enrichment
–  to entity and relationship discovery
–  to knowledge discovery and hypothesis generation

•  empower collaboration among scholars by enhancing and innovating virtual research environments

Page 5: ICHASS Workshop Seasr

Semantically Enabled SOA

Page 6: ICHASS Workshop Seasr

Semantically Enabled SOA 2

Page 7: ICHASS Workshop Seasr

Technical Components

•  High-Level Component Requirements
–  Hardware abstraction (virtualization)
–  Assets storage and curation
–  Task creation and definition (components)
–  Process description (flows)
–  Open services and standardized metadata exchange
–  Easy outreach to a non-technical community (visual programming and interaction UIs)
–  Social interaction platform for researchers
–  NLP, machine learning, and understandable visualizations

Page 8: ICHASS Workshop Seasr

Technical Components

Technical architecture that emphasizes flexibility, scalability, and modularity, provides a community hub to heterogeneous systems, and reduces path dependence

•  Semantic-web driven architecture to standardize interoperability

•  Design for community building and to encourage sharing and participation

•  Data-intensive flows to move from a simple desktop to a large cluster transparently

•  Movable computation. Computation can be transparently shipped to the assets (complying with privacy issues)

•  Quick re-configurability (flows can be adapted and reused in seconds)

•  Built for reuse and cross-fertilization across domains

Page 9: ICHASS Workshop Seasr

SEASR Components

[Architecture diagram; labels recovered from the slide: Virtualization Infrastructure; Hadoop FS; Shared Stores; SOA Gateways; Meandre Infrastructure; Visualization; Metadata Stores; Component Repository; Component Discovery; Meandre Data-Intensive Flows; SEASR Apps; SEASR Services; SEASR Plugins; SEASR Web Apps; Analytics; Data; Gateway Connections; Data Persistence; Data Transformation; Predictive Modeling; Discovery; Natural Lang Processing; Charting; Modeling Vis; Info Vis; Developer Tools]

Page 10: ICHASS Workshop Seasr

SEASR Apps: Community Hub

Page 11: ICHASS Workshop Seasr

More Community Hub

Page 12: ICHASS Workshop Seasr

Community Hub Implementation

Implementing Community Hub functionality as WordPress plugins

Page 13: ICHASS Workshop Seasr

More Community Hub

Page 14: ICHASS Workshop Seasr

Meandre Workbench Design

Page 15: ICHASS Workshop Seasr

Meandre Workbench

Page 16: ICHASS Workshop Seasr

SEASR Apps: Web App

•  Administration tool
–  Future: Add security levels

•  Job management control

•  User management/profile

•  Repository exploration

Page 17: ICHASS Workshop Seasr

Meandre Infrastructure

•  Component and Flow API

•  Repository
–  Future: Versioning of Components and Flows

•  Execution Engine
–  Future: Parallelism, checkpointing, fault tolerance, extend firing policy

•  Debugger/Monitor for flow execution

•  ZigZag
–  High level language for describing flows
–  Interpreter/compiler for executing the flows
–  Automatic parallelization at component level
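
To make the component and flow idea on this slide concrete, here is a minimal Python sketch of a dataflow in which components are connected by named ports and the engine fires a component once its inputs have arrived. It is not the Meandre API or ZigZag syntax: the class names `Component` and `Flow`, the port names, and the "fire when every input port holds a value" rule are all assumptions made for illustration.

```python
from collections import deque


class Component:
    """A processing step with named input ports and one output value (hypothetical model)."""

    def __init__(self, name, inputs, func):
        self.name = name
        self.inputs = list(inputs)   # names of the input ports
        self.func = func             # callable taking one argument per input port
        self.pending = {}            # port name -> value waiting to be consumed

    def ready(self):
        # Assumed firing policy: fire only when every input port holds a value.
        return all(port in self.pending for port in self.inputs)

    def fire(self):
        args = [self.pending.pop(port) for port in self.inputs]
        return self.func(*args)


class Flow:
    """A directed graph of components; edges carry an output to a target input port."""

    def __init__(self):
        self.components = {}
        self.edges = {}              # source name -> list of (target name, port)

    def add(self, component):
        self.components[component.name] = component
        self.edges.setdefault(component.name, [])
        return self

    def connect(self, source, target, port):
        self.edges[source].append((target, port))
        return self

    def run(self, initial):
        """Push initial values, given as {(component, port): value}, and fire until idle."""
        results = {}
        queue = deque()
        for (name, port), value in initial.items():
            self.components[name].pending[port] = value
            queue.append(name)
        while queue:
            comp = self.components[queue.popleft()]
            if not comp.ready():
                continue
            output = comp.fire()
            results[comp.name] = output
            for target, port in self.edges[comp.name]:
                self.components[target].pending[port] = output
                queue.append(target)
        return results


# Illustrative flow: one tokenizer feeding two downstream components.
flow = (
    Flow()
    .add(Component("tokenize", ["text"], lambda text: text.lower().split()))
    .add(Component("count", ["tokens"], lambda tokens: len(tokens)))
    .add(Component("unique", ["tokens"], lambda tokens: sorted(set(tokens))))
    .connect("tokenize", "count", "tokens")
    .connect("tokenize", "unique", "tokens")
)
results = flow.run({("tokenize", "text"): "To be or not to be"})
print(results["count"], results["unique"])
```

Because the flow is only a graph of names and ports, rewiring it means changing `connect` calls rather than component code, which is the kind of quick reconfiguration the deck describes.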

Page 18: ICHASS Workshop Seasr

Meandre Infrastructure

•  Web Service Operations
–  Calls to the repository for flows and components
–  Current: REST
–  Future: SOAP-enabled

•  Web UI
–  Current: Components use web UI fragments (which pass HTML)
–  Future: Enable more complex visual components for landscape construction

Page 19: ICHASS Workshop Seasr

Component Repository

•  Meandre Repository
–  RDF descriptions for components and flows
–  Support for RDF on local file; web accessible files; JDBC-enabled relational database (Derby) or a triple store
–  Support for RDF, TTL, NT formats
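
As a rough illustration of what an RDF description of a component could look like in the `nt` (N-Triples) format, the Python sketch below serializes a few triples by hand. The `http://example.org/components/` namespace, the `ExecutableComponent` class name, and the choice of properties are hypothetical and are not taken from the Meandre vocabulary; only the `rdf:type` and Dublin Core `title` IRIs are standard.

```python
class Literal(str):
    """Marks an object value as a plain string literal rather than an IRI."""


def escape_literal(value):
    """Escape a string for use inside an N-Triples quoted literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples):
    """Serialize (subject IRI, predicate IRI, object) tuples, one triple per line."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            rendered = f'"{escape_literal(obj)}"'
        else:
            rendered = f"<{obj}>"   # any non-Literal object is treated as an IRI
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines) + "\n"


RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
EX = "http://example.org/components/"   # hypothetical namespace for this sketch

description = [
    (EX + "tokenizer", RDF_TYPE, EX + "ExecutableComponent"),
    (EX + "tokenizer", DC_TITLE, Literal('Tokenizer "v1"')),
]
ntriples = to_ntriples(description)
print(ntriples, end="")
```

A line-per-triple format like this is easy to store in a local file or serve over the web, which matches the storage options listed on the slide; a real deployment would more likely use an RDF library than hand-written serialization.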

Page 20: ICHASS Workshop Seasr

SEASR Components: NLP

•  Syntactic analysis
–  Tokenization
–  POS tagging
–  Shallow parsing
–  Custom literary tagging

•  Semantic analysis
–  Named Entity tagging
–  Semantic Category (unnamed entity) tagging
–  Co-reference resolution
–  Ontological association (WordNet, VerbNet)
–  Semantic Role analysis
–  Concept-Relation extraction
–  Logical analysis
–  Event sequence inference
–  Event causal inference

•  Topic Filtering
–  by topic
–  by time period
–  by location
–  etc.

•  Semantic network
–  Extract predicate-argument & other triples
–  Convert to RDF triples
–  Add triples to RDF store
–  Pose structured queries
–  Graph-based inference

•  Exploration, Discovery and Knowledge Extraction
–  Query-based: question answering
–  Visual: navigation
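
The semantic network steps on this slide (extract predicate-argument triples, add them to a store, pose structured queries) can be sketched in a few lines of Python. This is a toy in-memory stand-in, not SEASR code or a real RDF store: `TripleStore`, `to_triple`, the `agent`/`patient` role labels, and the hand-written sample extractions are all assumptions for illustration.

```python
class TripleStore:
    """A tiny in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple for triple in self.triples
            if all(p is None or p == v for p, v in zip(pattern, triple))
        )


def to_triple(predicate, arguments):
    """Map a predicate-argument structure to one (agent, predicate, patient) triple.

    Assumes an upstream role-analysis step has already labeled the arguments.
    """
    return (arguments["agent"], predicate, arguments["patient"])


# Hand-written sample extractions standing in for the output of an NLP pipeline.
extractions = [
    ("wrote", {"agent": "Austen", "patient": "Emma"}),
    ("wrote", {"agent": "Austen", "patient": "Persuasion"}),
    ("wrote", {"agent": "Dickens", "patient": "Bleak House"}),
]

store = TripleStore()
for predicate, arguments in extractions:
    store.add(*to_triple(predicate, arguments))

# Structured query: everything Austen wrote.
austen_works = [obj for _, _, obj in store.match(subject="Austen", predicate="wrote")]
print(austen_works)
```

The wildcard pattern match plays the role that a structured query language would play against a real RDF store; graph-based inference would build on the same triples by following chains of matches.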

Page 21: ICHASS Workshop Seasr

SEASR Component: Machine Learning

•  Data Transformation
–  Feature extraction and construction
–  Boosting and Bagging

•  Unsupervised Learning
–  Clustering, SOMs
–  Hypothesis Generation

•  Supervised Learning
–  Traditional Statistical Linear Methods
–  Bayesian, Support Vector Machines, Decision Trees
–  Ensemble Models

•  Optimization Approaches
–  GAs
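
Of the techniques listed, bagging is simple enough to sketch briefly. The Python example below trains several one-feature threshold classifiers on bootstrap resamples and combines them by majority vote. It is a generic illustration, not a SEASR component: the stump learner, the toy data, the number of models, and the fixed seed are all assumptions.

```python
import random
from collections import Counter


def train_stump(sample):
    """Fit a one-feature threshold classifier to (x, label) pairs with labels 0 and 1."""
    by_label = {0: [], 1: []}
    for x, label in sample:
        by_label[label].append(x)
    if not by_label[0] or not by_label[1]:
        # Degenerate bootstrap sample: only one class was drawn, so always predict it.
        only = 0 if by_label[0] else 1
        return lambda x: only
    mean0 = sum(by_label[0]) / len(by_label[0])
    mean1 = sum(by_label[1]) / len(by_label[1])
    threshold = (mean0 + mean1) / 2
    high = 1 if mean1 > mean0 else 0   # class whose mean lies above the threshold
    return lambda x: high if x > threshold else 1 - high


def bagging(data, n_models=25, seed=0):
    """Train n_models stumps on bootstrap resamples and return a majority-vote predictor."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]   # bootstrap: draw with replacement
        models.append(train_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Toy data: small values belong to class 0, large values to class 1.
data = [(x, 0) for x in (0, 1, 2, 3, 4)] + [(x, 1) for x in (6, 7, 8, 9, 10)]
predict = bagging(data)
print(predict(1), predict(9))
```

Each stump sees a slightly different resample, so individual thresholds vary; the vote averages that variation out, which is the point of the ensemble.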

Page 22: ICHASS Workshop Seasr

Developers: Eclipse Plugin

Page 23: ICHASS Workshop Seasr

SEASR @ Work – MONK

Page 24: ICHASS Workshop Seasr

SEASR @ Work – NEMA