EIROforum IT Working Group 27 October 2015
This document is produced by Members of the EIROforum (http://www.eiroforum.org/) and is licensed under the Creative Commons CC‐BY 4.0 licence.

The European Open Science Cloud

Abstract
This document outlines the position of EIROforum on the European Open Science Cloud. It explores the essential characteristics of the European Open Science Cloud if it is to address the big data needs of the latest generation of Research Infrastructures. The high‐level architecture and key services as well as the role of standards are described. A governance and financial model is also described, together with the roles of the stakeholders, including commercial service providers and downstream business sectors, that will ensure the European Open Science Cloud can innovate, grow and be sustained beyond the current project cycles.

About the EIROforum
EIROforum partners are intergovernmental research organisations – CERN, ESA, EMBL, ESO, EuroFusion, European XFEL, ILL and ESRF – covering disciplines ranging from particle physics, space science and biology to fusion research, astronomy, and neutron and photon sciences. The partner organisations have a truly European governance, funding and remit, and in many cases share a global engagement. They are world leaders in basic research, as well as in managing and operating large research infrastructures and facilities. The EIROforum collaboration is helping European science reach its full potential through exploiting its unparalleled resources, facilities and expertise. By combining international facilities and human resources, EIROforum exceeds the research potential of the individual organisations, achieving world‐class scientific and technological excellence in interdisciplinary fields. EIROforum works closely with industry to foster innovation and to stimulate the transfer of technology.

Prepared by CERN IT department on behalf of the EIROforum IT Working Group.
Executive Summary
EIROforum members and other Research Infrastructure operators face unsustainable demand for computing and networking services to deliver the promise of Open Science. They need more cost‐effective approaches to collecting, processing, distributing and re‐using the rapidly growing amounts of data being produced by their instruments. This will require innovative ways of providing an integrated IT infrastructure and operations expertise needed to run applications.

Currently in‐house resources, public e‐infrastructure and commercial cloud services are not integrated to provide a seamless environment for data‐intensive science. Existing services do not cover the full lifecycle of research from proposal submissions requesting access to Research Infrastructures, through to data acquisition, sharing and publication. Researchers are by‐passing their in‐house IT departments and publicly funded e‐Infrastructures to make use of commercial cloud services that offer innovative, easy‐to‐use solutions and fill the service gaps. This shadow IT innovation represents an opportunity to introduce change but must be undertaken with full knowledge of the policy aspects including data protection, intellectual property rights and applicable legislation. The European Open Science Cloud has the potential to provide the means to link such services together and increase scientific output.

The Helix Nebula initiative (HNI) has brought together more than 40 service providers, research organisations, data providers and publicly funded e‐infrastructures. It has developed a hybrid cloud model with procurement and governance components suitable for the dynamic cloud market. A Pre‐Commercial Procurement (PCP) is being negotiated to build a new form of IT as a Service (IaaS) platform using open source solutions in a federated Science Cloud. Procuring cloud services from providers on a pay‐per‐usage model on the operations budget rather than the capital budget offers both flexibility and scalability.

E‐infrastructure costs will become an integral part of the cost of doing science and, consequently, must be cost‐justified in terms of benefits and impact. Moving to the cloud can enable more flexible pricing models such as per core/hour or per request/transaction or migration to Open Source Software (OSS) to control growing software licensing costs. Most publicly funded research organisations lack detailed cost models inhibiting financial comparisons between traditional and cloud‐based solutions. RIs need to understand the benefits as well as the full costs of ‘big data’ services and be able to manage their own procurements in a competitive marketplace, migrate use cases and existing infrastructures to the cloud paradigm, and adopt an appropriate collaborative governance model.

Services will be provisioned from commercial suppliers when they are not available in‐house or can be delivered externally on better terms (i.e. at shorter notice, lower cost or better performance etc.). Publicly funded data centres will continue to guarantee long‐term data preservation and service supplier independence. A market assessment of the public research sector and downstream business sectors that could build on the data produced by Research Infrastructures is needed to build confidence in the business model and justify investments in the European Open Science Cloud by the supply‐side.
A significant difference compared to the current model is that funding agencies and research organisations will no longer provision services exclusively from their own in‐house resources. Stakeholders in the public and commercial sectors must not only invest in the building blocks for the development of e‐Infrastructure listed in Table 1, but also in end‐user facing services and in training the next generation of IT‐savvy researchers. This will leverage the investments already made in the publicly funded e‐infrastructures and commercial cloud services. In summary, all stakeholder groups need to work together to ensure wide adoption of competitive, secure, reliable and integrated computing services.

Many research organisations that operate research infrastructures do not have the mandate to provide IT services to their users for the management and processing of their experimental data and will require assistance to bridge the gap from data to knowledge acquisition. The guiding principle is that funding from stakeholders like the EC and national funding agencies will be focused on innovation of services and uptake by new user communities and business actors while the operational costs will be borne by the operating organisations and the user communities. The funding model for the European Open Science Cloud must be designed so that the services can be sustained by their operating organisations.

The EC’s INFRASTRUCTURES 2016‐2017 work programme foresees new e‐Infrastructure for data and distributed computing and a pilot for the federation, networking and coordination of pan‐European research infrastructures and clouds in general. Looking further ahead, the EC has taken steps to ensure funding for GÉANT over the full duration of H2020 by introducing ‘Framework Partnership Agreements’ (FPA). The FPA model represents a more long‐term engagement that could encourage the integration of e‐infrastructures co‐funded via EC projects into the Research Infrastructures’ computing models. The application of the FPA approach to the European Open Science Cloud could establish the basis for the European Research Area’s digital commons and lead towards Science 2.0.

The European Open Science Cloud represents a strategic vision that can be a vector for introducing change in the service provisioning and computing models for the publicly funded research sector in the medium to long term.

The European Open Science Cloud has the potential to greatly improve the provisioning of IT services for Research Infrastructures to address their big data needs. It can encompass all the phases of the research lifecycle and offer a platform of joint innovation for the public and private sectors. It will significantly change the way IT services are procured, organised and funded. The key challenges are integrating frequently changing technologies, managing the complexity and identifying the optimal organisational and financial models. Researchers must be convinced that they will not lose control of their precious data. It is an ambitious undertaking requiring the active engagement of many stakeholders and careful planning of the technical, financial, legal and governance aspects. For it to succeed it must become a priority for all the actors involved with monitoring by the funding agencies and regular assessment by the user communities. This position paper is a rallying call for adoption of such a strategic approach – within the EC and other funding bodies as well as the operators of Research Infrastructures.
Table 1 – major stakeholder groups
National funding agencies
Policy makers
Third sector
Granting bodies
European Commission
DG CONNECT
DG RTD
Research communities
Thought leaders
Peers
Scholarly publishers
Research Infrastructures
Policy‐makers
Operational staff
Data users
Public e‐infrastructures
Service providers
Host organisations
Technology providers
Commercial cloud service providers
Independent Software Vendors
Open Source developer communities
Standards bodies
Table 2 ‐ relevant EC co‐funded projects
AARC https://aarc-project.eu
Cloud for Europe http://www.cloudforeurope.eu/downloads
EGI https://wiki.egi.eu/wiki/Main_Page/
EUDAT http://www.eudat.eu
GÉANT http://www.geant.net/
Helix Nebula http://www.helix-nebula.eu
Indigo Datacloud https://www.indigo-datacloud.eu/
OpenAIRE https://www.openaire.eu
PICSE http://www.picse.eu/
PRACE http://www.prace-ri.eu/
SLALOM http://www.slalom-project.eu/
Sustainability in a world experiencing the data tsunami
Traditional ways of meeting the growing demand for computing and networking services capable of addressing the ‘Data Tsunami [1]’ are seen to be unsustainable by funding agencies as well as the infrastructure operators such as GÉANT and EGI. The cost of collecting, processing, distributing and re‐using the rapidly growing amounts of data produced by their instruments is a major concern for Research Infrastructure operators including the EIROforum members. A collaborative shift towards more cost‐effective ways of generating and using scientific data and a greater role for the users of that data is required in order to develop a sustainable future for the evolution of Open Science.
Mind the Gap
Over the last decade, driven with sustained funding from the EC, the e‐Infrastructure landscape across Europe has grown from regional prototypes to a set of pan‐European production resources including EGI, GÉANT, PRACE etc. This has resulted in a number of services within the context of each project but there is no common, overarching goal and so user communities must invest significant effort to bring these services together.

Currently in‐house resources, public e‐infrastructure and commercial cloud services are not integrated to provide a seamless environment for data‐intensive science. Existing services do not cover the full lifecycle of research from proposal submissions requesting access to Research Infrastructures, through to data acquisition, sharing and publication. Researchers are by‐passing their in‐house IT departments and publicly funded e‐Infrastructures to make use of commercial cloud services that offer innovative, easy‐to‐use solutions to fill in the service gaps. This shadow IT innovation represents an opportunity to introduce change but must be undertaken with full knowledge of the policy aspects including data protection, intellectual property rights and applicable legislation. The European Open Science Cloud has the potential to provide the means to link such services together and increase scientific output.
Need for new way of procuring ICT services
Public research organisations have to find alternatives to the traditional route of purchasing and operating in‐house IT equipment which requires capital investment on the physical infrastructure (servers, network, storage) needed to run an application as well as operations expertise. Cloud computing has the potential to reduce IT expenditure while at the same time improving the scope for innovative and flexible high‐quality services. Procuring external cloud services from providers on a pay‐per‐usage model implies that infrastructure is no longer ‘institutionalised’ and the cost of cloud services can be found on the operations budget rather than the capital budget. There is ‘elasticity’ in cloud‐based services and cloud‐based infrastructure is inherently scalable.
Open Science requires an integrated approach
‘Open Science’ is still in its infancy ‐ driven predominantly by the availability of enabling technologies and the opportunities for new ways of working rather than by demand from society at large, according to a recent consultation [2]. Lack of integration of the existing infrastructures (and, by inference, access to the data they carry) was seen to be a barrier to adoption of those technologies and working practices by 86% of the individual scientists who responded to the survey.
Hybrid cloud‐based solutions
The Cloud for Europe project [3] has shown that uptake of cloud services by European Public Administrations is still very fragmented in terms of demand and procurement of IT services. The Helix Nebula [4] initiative, however, has demonstrated the potential of a hybrid model in which service providers, research organisations, data providers and publicly funded e‐infrastructures are brought together. Building on that potential will allow us to support and transform publicly funded research into data driven knowledge which is of value to the wider research community and downstream industries. Helix Nebula has already brought innovation to the relationship between suppliers and users and introduced a wider range of new players to the marketplace. This provides a platform onto which the European Open Science Cloud initiative [5] will add a further much‐needed dose of innovation and accountability in the way technology is procured and deployed.

The goal of this position paper is to allow the EIROforum members to articulate their own expectations of the initiative by helping them to understand the new European Open Science Cloud and the way that it addresses the needs of the Infrastructure operators and users.
Hybrid clouds combine private infrastructure and operations with shared infrastructure and operations. A typical hybrid cloud use case would be to relocate the presentation tier (user interface) and the logic tier, where the application knowledge is encapsulated, to an off‐site cloud and have them communicate with the database stored and managed within the organisation’s own IT infrastructure.

In order for the demand‐side users to be encouraged to purchase cloud computing services, the services offered must be economically advantageous compared to other means of procuring IT services.
Pre‐Commercial Procurement
Promotion of joint procurement has led to the creation of an expanding procurement network of publicly funded research organisations and establishment of a new Pre‐Commercial Procurement (PCP), the Helix Nebula Science Cloud (HNSciCloud).

HNSciCloud is designed to pull together publicly‐funded e‐Infrastructures using open source solutions, to build a hybrid Infrastructure as a Service (IaaS) platform. It will host a competitive marketplace of European cloud players where they can develop their own services for a wider range of users beyond research and science including downstream business sectors that can make use of publicly funded research data.

The goal is to establish a sustainable European Open Science Cloud serving Europe’s Research Infrastructures, communities and related business sectors and surpassing the capacity currently available via existing public e‐infrastructures and the in‐house facilities of research organisations.

It will be based on the migration of Infrastructure as a Service into the more general IT as a Service consisting of software tools and applications and the platforms on which they run. Services will be provisioned from commercial suppliers when they are not available in‐house or can be delivered externally on better terms (i.e. at shorter notice, lower cost or better performance etc.). Publicly funded data centres will continue to guarantee long‐term data preservation and commercial service supplier independence.
Challenges facing Research Infrastructure operators
HNSciCloud will enable the federation, networking and coordination of existing Research Infrastructures and scientific clouds in preparation for what the 2016 INFRASTRUCTURES Work Programme calls the “European Open Science Cloud for Research”. It brings Europe’s technical development, policy and procurement activities together to remove fragmentation and support Research Infrastructure operators facing three key challenges:
We expect the scale and range of services being provisioned from commercial suppliers to gradually increase over time as the cloud market matures and Open Science becomes embedded in the research lifecycle. A significant difference compared to the current model is that funding agencies and research organisations will no longer provision services exclusively from their own in‐house resources.

In an answer to a written question in the European Parliament about the current position regarding procurement of the European Science Cloud, Commissioner Oettinger stated that: “The Commission has supported pathfinding studies on the use of hybrid models, bringing together public research organisations and e‐infrastructures with commercial suppliers to build a common platform offering a range of services to research communities. This can be achieved by building on cloud technologies easily accessible to users and by promoting procurement of cloud services to encourage innovation on the supply side.” The role of Helix Nebula and the HNSciCloud in shaping that position is clear.
Benefits of a hybrid approach for scalability
If there is significant variation in demand, there may be an opportunity to reduce operating expenditure by matching the supply of resources to the level of demand. By employing a hybrid cloud model, an organisation can quickly and economically add resources as needed by bursting out of its private IT infrastructure to a commercial cloud processing and storage capacity. A cloud‐bursting scenario can provide the benefits of cost savings, maximum utilisation of on‐premises resources and rapid innovation, but also has its own set of challenges in ensuring the performance, agility, security and management aspects of a hybrid cloud infrastructure.

By intermixing private and public cloud infrastructures, organisations are able to use the hybrid model to leverage in‐house and off‐site resources. The hybrid model allows organisations to rely on the cost‐effective commercial cloud for non‐sensitive operations and on the private cloud for critical, particularly sensitive operations providing enhanced agility to move applications easily between the in‐house and off‐site resources taking into account aspects of policy, cost, security and availability.
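The cloud‐bursting scenario described above can be illustrated with a minimal sketch. It assumes that a fixed in‐house capacity is always filled first and that any excess demand is sent to a commercial provider. The function names, the capacity figure and the per‐core‐hour rates are hypothetical and chosen only for illustration; they do not come from any EIROforum member or provider.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    in_house: int  # cores served by the private infrastructure
    burst: int     # cores served by the commercial cloud


def place_demand(demand_cores: int, in_house_capacity: int) -> Placement:
    """Fill the in-house capacity first, then burst the remainder off-site."""
    if demand_cores < 0 or in_house_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    in_house = min(demand_cores, in_house_capacity)
    return Placement(in_house=in_house, burst=demand_cores - in_house)


def hourly_cost(p: Placement, in_house_rate: float, cloud_rate: float) -> float:
    """Cost of one hour of operation at hypothetical per-core-hour rates."""
    return p.in_house * in_house_rate + p.burst * cloud_rate


if __name__ == "__main__":
    # Hypothetical demand profile (cores needed in each hour) with one spike.
    for demand in [400, 900, 2500, 700]:
        placement = place_demand(demand, in_house_capacity=1000)
        print(demand, placement, hourly_cost(placement, 0.02, 0.05))
```

A fuller model would also route workloads by sensitivity, keeping critical operations in‐house regardless of cost, as the paragraph above notes.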
Supply‐side
One important consideration is that this approach must generate benefit for the providers who have the responsibility of ensuring that they have the physical infrastructure to meet their users’ demand and that their performance meets agreed service quality levels. Without an accurate view of future demand, planning for variable costs such as staff, replacement servers or coolers, and electricity supplies can all be very difficult, and optimising the distribution of virtual machines presents a major challenge. The more unpredictable and spiky the workloads, the greater the economic benefit of sharing the same services across diverse research communities in the public and private sectors. Analysis of the procurements made via Helix Nebula suggests there is insufficient installed capacity currently available in the European market to satisfy the exceptional demand that will be generated by the latest generation of research infrastructures. Significant investments by the supply‐side, based on accurate future predictions of usage, will be necessary. Consequently it is important that a market assessment of the public research sector and downstream business sectors that could build on the data produced by Research Infrastructures is performed (similar to that performed by the UK government for public sector information [9]) in order to build confidence in the business model and justify investments in the European Open Science Cloud by the supply‐side.

There are also licensing implications when transitioning from a scale‐up architecture to a scale‐out architecture: some applications are licensed per‐instance or per‐CPU, often over an annual term. In this instance, there can be significant cost implications of adding new instances to a pool of resources. In time, application vendors will follow infrastructure service providers in moving to more flexible pricing models such as per core/hour or per request/transaction. The alternative is to use Open Source Software (OSS) where the license cost issue is non‐existent.
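The licensing point can be made concrete with a small worked comparison. The sketch below assumes that annual per‐instance licences must be bought for the peak number of instances, whereas per core/hour pricing is charged only for the hours actually used. All fees and rates are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def per_instance_cost(peak_instances: int, annual_fee: float) -> float:
    """Annual licences sized for the peak number of instances (assumption)."""
    return peak_instances * annual_fee


def per_core_hour_cost(core_hours_used: float, rate: float) -> float:
    """Usage-based licensing: pay only for the core-hours consumed."""
    return core_hours_used * rate


def breakeven_utilisation(annual_fee: float, cores_per_instance: int,
                          rate: float) -> float:
    """Fraction of the year an instance must be busy before the annual
    per-instance licence becomes the cheaper option."""
    return annual_fee / (cores_per_instance * rate * HOURS_PER_YEAR)


if __name__ == "__main__":
    # Hypothetical figures: a 4-core instance, a 2000 annual licence fee,
    # and 0.10 per core-hour under usage-based pricing.
    print(per_instance_cost(peak_instances=10, annual_fee=2000.0))
    print(per_core_hour_cost(core_hours_used=10_000, rate=0.10))
    print(round(breakeven_utilisation(2000.0, 4, 0.10), 3))
```

Under these assumptions, instances added only to absorb short peaks sit far below the break‐even utilisation, which is why scale‐out deployments favour usage‐based pricing or OSS.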
Demand‐side
As identified in the GÉANT Expert Group report [10], the user communities will increasingly be called upon to pay for the services they receive if e‐infrastructures on which users can depend are to continue to survive. E‐infrastructure costs will be an integral part of the cost of doing science and, consequently, e‐infrastructure investments must make a substantial and sustainable impact in order to be justified in terms of costs and benefits.

A study of the cost effectiveness of European dedicated HTC and HPC computing e‐infrastructures for research compared to equivalent commercial leased or on‐demand offerings was performed by the eFISCAL project [11] in 2011. The conclusion was that the ratio of CAPEX (CAPital EXpenditure) to OPEX (OPerational EXpenditure) for e‐infrastructures was 30%‐70% and manpower accounted for approximately 50% of the costs (CAPEX+OPEX). A Total Cost of Ownership (TCO) study [12] was performed by SAP Research on specific CERN in‐house services within the context of the Helix Nebula FP7 project.

Both of these studies indicated that most publicly funded research organisations lack detailed cost models for individual services. Financial comparisons between traditional and cloud‐based solutions would need a set of guidelines for such organisations proposing which category of costs should be included or excluded. It is important to recognise that shifting the procurement of IT services to a pay‐per‐usage model will normally have a limited impact on TCO since the bulk of expenditure over the lifetime of an application is not related to the purchase of physical infrastructure. It is also the case that not all publicly funded research centres are in a position to make accurate estimations of the TCO of in‐house IT services since some contributing costs are borne by different departments.

The adoption of cloud computing services by public research organisations requires additional justification in terms of the benefits of the new ways of working that cloud‐based services enable. Research organisations justify their investments by the impact made in IT services on the end‐user communities in terms of scientific output. To gauge this impact it is necessary to understand the needs and activities of the end‐users. Factors such as pattern of demand and transitional costs need to be included in any financial analysis of a potential cloud computing solution.

The European Open Science Cloud will need to perform IT capacity planning for all engaged research communities on a regular basis. As an example, the WLCG project has a Computing Resources Scrutiny Group [13] which reviews the computing resources for the LHC experiments on an annual basis.
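The eFISCAL figures quoted in this section can be turned into a simple cost breakdown. The sketch below reads the 30%‐70% ratio as a 30:70 split of total cost between CAPEX and OPEX, and assumes that manpower is counted entirely within OPEX; both readings are assumptions made for illustration, and the total budget is hypothetical.

```python
import math


def cost_breakdown(total: float, capex_share: float = 0.30,
                   manpower_share: float = 0.50) -> dict:
    """Split a total cost using the shares quoted from eFISCAL.

    Assumes manpower is part of OPEX, so it cannot exceed the OPEX share.
    """
    opex_share = 1.0 - capex_share
    if not 0.0 <= capex_share <= 1.0:
        raise ValueError("capex_share must lie between 0 and 1")
    if not 0.0 <= manpower_share <= opex_share:
        raise ValueError("manpower_share must lie between 0 and the OPEX share")
    return {
        "capex": total * capex_share,
        "opex": total * opex_share,
        "manpower": total * manpower_share,
        "other_opex": total * (opex_share - manpower_share),
    }


if __name__ == "__main__":
    # Hypothetical annual budget of 1,000,000 currency units.
    parts = cost_breakdown(1_000_000)
    for name, value in parts.items():
        print(f"{name:>10}: {value:,.0f}")
    assert math.isclose(parts["capex"] + parts["opex"], 1_000_000)
```

On this reading, a move to pay‐per‐usage that only replaces hardware purchases touches at most the 30% CAPEX portion, which is consistent with the limited impact on TCO noted above.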
Procurement
The EC‐funded ‘Procurement Innovation for Cloud Services in Europe’ (PICSE [14]) project is identifying barriers to procurement of cloud services by public research organisations and is developing a new procurement model to overcome them. With the advent of cloud computing, the delivery of ICT services is going through a fundamental change. However, while cloud technology service options continue to evolve, procurement processes and policies of public research organisations have remained firmly rooted in historical practices that are no longer effective. In order for public research organisations of all sizes to take advantage of the best the cloud market has to offer, a more flexible and agile procurement model must be identified and implemented.

PICSE has contacted a number of public sector organisations and initiatives (including CERN [15], Cloud for Europe [16], DG DIGIT [17], ECMWF [18], EMBL [19], ESA [20], ESRF [21], Europeana [22], GRNET [23] and Umeå University [24]) to discuss their current practices. The main challenges identified that need to be addressed in the procurement of cloud services can be summarised as follows:
As with all purchases of new technologies, procuring innovative services requires new
These challenges have an impact on all the steps of the procurement process. There is a clear impact on skills and knowledge required. IT managers within public research organisations should have a clear understanding of the new technology being purchased.

Functionally similar to financial market brokers, cloud brokers match provider supply with consumer demand. This model benefits all parties: experiencing more predictable demand, cloud providers can better optimize their workflow to minimize costs; cloud users access cheaper rates offered by brokers; and cloud brokers generate profit from charging fees. Including such brokerage models in the European Open Science Cloud could reduce the risks that arise from market instability. The adoption of a hybrid cloud model will also help to reduce the impact of market instabilities on the European Open Science Cloud.
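The brokerage idea can be sketched as a simple matching procedure. The example below assumes a greedy strategy in which each user request is served from the cheapest offers with remaining capacity, and the broker adds a percentage fee to the provider price. The strategy, class and function names, and all figures are hypothetical; real brokers may match on many other criteria such as location, service level or data policy.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Offer:
    provider: str
    capacity: int  # core-hours available
    price: float   # provider price per core-hour


Allocation = Tuple[str, Optional[str], int, Optional[float]]


def match(requests: Dict[str, int], offers: List[Offer],
          fee: float = 0.05) -> List[Allocation]:
    """Greedy matching: cheapest offers first, broker fee added to the price.

    Returns (user, provider, core_hours, charged_price) tuples. Demand that
    cannot be met is reported with provider and price set to None.
    """
    remaining = {o.provider: o.capacity for o in offers}
    by_price = sorted(offers, key=lambda o: o.price)
    allocations: List[Allocation] = []
    for user, needed in requests.items():
        for offer in by_price:
            if needed == 0:
                break
            take = min(needed, remaining[offer.provider])
            if take > 0:
                remaining[offer.provider] -= take
                needed -= take
                allocations.append(
                    (user, offer.provider, take, offer.price * (1 + fee)))
        if needed > 0:
            allocations.append((user, None, needed, None))
    return allocations


if __name__ == "__main__":
    offers = [Offer("provider_b", 200, 0.05), Offer("provider_a", 100, 0.04)]
    for row in match({"lab_x": 150, "lab_y": 100}, offers):
        print(row)
```

Aggregating requests in this way is what gives providers the more predictable demand mentioned above, since the broker sees the combined load rather than each user's individual spikes.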
The role of standards
Standards improve transparency and comparability for service users. They open up new markets for suppliers and offer equal access conditions, particularly for small and medium‐sized companies. Standards also improve the quality, security and sustainability of products and services and adoption of suitably defined standards exposes the supplier’s unique selling propositions.

Open standards can be adopted to provide interoperability between parts of the infrastructure, portability from one cloud service provider to another and trust in the integrity (provenance, reliability, etc.) of the infrastructure that has been built.

Emerging cloud standards for application orchestration provide template‐driven descriptions of applications as a transparent way of abstracting the relationships between cloud applications and services and the underlying platform or infrastructure. One example of this is TOSCA (Topology and Orchestration Specification for Cloud Applications) from OASIS [26], selected by the Horizon 2020 EC co‐funded Indigo Dataclouds [27] project. This gives suppliers and users interoperable descriptions of cloud‐hosted services and applications, including their components, relationships, dependencies, requirements, and capabilities. TOSCA has the potential to expand customer choice, improve reliability, and reduce cost and time‐to‐value, facilitating the agile, continuous delivery of applications (DevOps) across their entire lifecycle.

Portability is another significant property since prospective users want to avoid vendor lock‐in when they choose to use cloud services. Users need to know that they can move their data and applications between multiple cloud service providers at low cost and with minimal disruption. Portability through the appropriate standardisation of APIs, data models, data formats and vocabularies will help automate business processes surrounding cloud computing procurement, enable straightforward technical integration between the client and provider, and allow for flexible and dynamic application deployments across multiple clouds.
Trust and confidence in cloud computing services relies on work such as ENISA’s CIIP [28] (Critical Information Infrastructure Protection) initiative which defines appropriate strategies, policies and specific measures for protecting information on the cloud. The underlying cause of many of the risks and challenges associated with cloud computing is that the user passes over responsibility for data and for applications to the cloud service provider and the provider has a multi‐tenant environment in which resources are shared. In addition to the many economic and technological advantages that cloud computing offers to research communities, there are also significant security benefits in migrating applications and usage to the cloud, as noted by ENISA. The shared resources available in clouds also potentially include rare expertise, shared best practices and advanced security technologies, beyond the means or abilities of the vast majority of SMEs, many larger companies and even many government bodies, to provide for their in‐house systems.

A truly interoperable cloud will encourage adoption by users, safe in the knowledge that they can change providers, or use multiple providers, without significant technical challenges or effort. This will expand the size of markets in which cloud providers operate.
Federated Approach
The European Open Science Cloud should offer an initial portfolio of services corresponding to the list of e‐Infrastructure services documented by eIRG in its blue paper of 2010 [29] with the technical characteristics identified by the High Level Expert Group on Scientific Data in their “Riding the Wave” report from the same year [30].

Implementations for the majority of the foreseen services already exist at varying levels of maturity. The key challenges are integrating frequently changing technologies, managing the complexity and identifying the optimal organisational and financial models. Researchers must be convinced that they will not lose control of their precious data. The data centres operated by public research organisations can provide such guarantees. They can rapidly expand the available capacity by making use of commercial service providers offering commodity compute and data services as part of the hybrid cloud model. By keeping a “safe copy” of the research data, the public research organisations can also insulate the researcher communities from changes in service provider and technology.

The European Open Science Cloud should take a bottom‐up approach to implementation, starting with IaaS. Integration should start with a common catalogue of services and a federated identity management system offering a single sign‐on facility to access services across all suppliers. Starting bottom‐up is essential to get the core technical, financial, and policy principles right. IaaS can be introduced without impacting higher‐level user‐facing services that will require a significant software investment. It also represents a strategy with lower risk because the IaaS market is more mature than the PaaS and SaaS markets.
The services of the European Open Science Cloud will need to be integrated with a range of resources currently operated by public organisations to form a hybrid cloud solution.

Realisation of the benefits of a hybrid cloud is inhibited by many barriers related to procurement, trustworthiness, technical standards and legal terms of reference, risk of vendor lock‐in and so on. The overall challenge is to overcome these barriers in order to boost productivity by stimulating all stakeholder groups to work together to ensure wide adoption of competitive, secure, reliable and integrated computing services.

In order for the European Open Science Cloud to be deployed rapidly, it is essential to build on the existing infrastructures. This requires an agreed overarching architecture and the commitment of the service operators to make the European Open Science Cloud a priority. There must also be agreement by all the stakeholders on the governance structure and financial model to ensure the European Open Science Cloud can grow, innovate and be sustained.

The EGI Federated Cloud [31] is an example of an inter‐disciplinary approach to infrastructure implementation allowing data sharing and collaboration between research communities. It is a grid of academic private clouds and virtualised resources, built around open standards and focusing on the requirements of the scientific community. Technical consistency in the service delivery between participating suppliers is ensured by use of recommended publicly defined interface specifications such as OCCI [32], CDMI [33] and OVF [34].

The experience gathered by EGI in managing its federated infrastructure [35] will be directly relevant and provide insight into making a larger portfolio of capacity style HPC services for data centric applications accessible to its existing user‐base. Working with commercial cloud service providers will inject the innovation potential created by the uptake of cloud computing in research and business sectors.

The complementary expertise developed by PRACE and related projects in efficient parallel programming paradigms and optimising software for a range of architectures is also directly relevant to the European Open Science Cloud and application/service developers. The HPC capability services offered by the PRACE centres should be integrated to form part of the overall ecosystem. This will require the PRACE HPC centres to participate in the federated identity management scheme and data sharing services described below.
Support services

Support services will also be required to ensure the operational staff in the public research organisations can resolve end-user support issues as quickly and as efficiently as possible. Similarly, security response services will be necessary to handle incidents that may affect the platform. The publicly operated infrastructures that are part of the hybrid cloud already have user-support and Computer Security Incident Response Teams (CSIRTs) in place, but they do not fully interoperate. All cloud services supported by the European Open Science Cloud, whether operated by commercial service providers or public organisations, will need to be integrated into these structures.
Transport of huge amounts of data and the lack of high-performance links

In order for the European Open Science Cloud to operate effectively, it is necessary to assure there is sufficient network capacity to permit data ingress from the Research Infrastructures.

GÉANT [36] is the high bandwidth pan-European research and education backbone that interconnects National Research and Education Networks (NRENs) across Europe and provides worldwide connectivity through links with other regional networks. The GÉANT network is the primary means of connecting the research organisations and universities to the commercial providers. The Helix Nebula initiative has already demonstrated that it is possible to make practical use of the data centres of commercial cloud service providers over the GÉANT network.

GÉANT Open [37] is a service allowing NRENs and approved commercial organisations to exchange connectivity for the public Research and Education sector with NOC support, SLA monitoring, a defined policy [38] and cost model [39]. Commercial cloud service providers are expected to add the cost of connection and usage of GÉANT Open to the price of the cloud services delivered to the research community. Commercial cloud service providers will also want to offer the same types of cloud services to customers from business sectors and will have to integrate alternative network providers, which will allow the stakeholders to compare the efficiency and cost effectiveness of all the network services provided by the different suppliers.

The opening up of the European Open Science Cloud to users beyond the publicly funded research sector is essential if it is to attract investment from the private sector and support a vibrant innovation cycle.

Looking further into the future, the European Open Science Cloud could be more closely linked to the data acquisition and real-time requirements of Research Infrastructures. For example, European XFEL, ILL and ESRF together with EuroFusion sites all require online or rapid feedback in order to prepare for the next experimental run. This implies important increases in network capacity. Similar real-time needs will also be important for the applications of new detector techniques addressed by the ATTRACT consortium [40].
Identity management

eduGAIN [41] is an international inter-federation service interconnecting research and education identity federations. It enables the secure exchange of information related to identity, authentication and authorisation between participating federations. eduGAIN provides an infrastructure for establishing trusted communications between Identity Providers (IdPs) and Service Providers (SPs) in different participating federations. End-users authenticate at IdPs and obtain access to services delivered by SPs.

Federated identity management is also gaining traction in business sectors, as shown by the rising popularity of Universal 2nd Factor (U2F), an authentication standard created by the FIDO (Fast IDentity Online) Alliance [42], an industry group established to standardise authentication technology and devices that can simplify and strengthen two-factor authentication for businesses and consumers. It will therefore be essential for eduGAIN to ensure it can engage with commercial IdPs and SPs to avoid isolating the research and education community. The recently started AARC (Authentication and Authorisation for Research and Collaboration) [43] H2020 project intends to further develop eduGAIN, and it is essential that a primary goal of this project should be to ensure eduGAIN can support the European Open Science Cloud in production usage.

[36] http://www.geant.net/
[37] http://www.geant.net/Services/ConnectivityServices/Documents/GEANT%20Open%20Service%20Brief.pdf
[38] http://www.geant.net/Services/ConnectivityServices/Documents/GN3PLUS13-1439-12_geant_open_exchange_production_policy_v4_3.pdf
[39] http://www.geant.net/Services/ConnectivityServices/Documents/GEANT%20Open%20Service%20Description.p
[40] http://www.attract-eu.org/
[41] http://services.geant.net/edugain/Pages/Home.aspx
The UK is ranked top of 86 countries by the Open Data Barometer [46], which measures a country’s readiness to secure benefits from open data, its publication of key datasets and evidence of emerging impacts from open government data. The 2015 Open Data Institute report “Open data means business: UK innovation across sectors and regions” [47] provides convincing arguments for learning from the private sector when it comes to managing the sharing of public sector data, highlighting the role of value-added providers. The UK’s central repository of public sector open data, data.gov.uk, contains nearly 15,000 datasets published with an Open Government Licence.
Examples include geospatial/mapping data (OpenStreetMap [48]), transport-related data (Traveline [49]), demographics/social data (Office for National Statistics [50]) and business data (Companies House [51]). Best practices include the adoption of Open Data Certificates [52] and the use of the Creative Commons [53] public domain licence (CC0) and attribution licence (CC-BY). The Creative Commons attribution and share-alike licence (CC-BY-SA) is also used, but may limit a company’s ability to use that data for commercial products and services by requiring them to also attach the same open licence to the data they derive.

Some data can never be “open” in the literal sense and specific authorisation may be required (e.g. for medical patient data). However, the “FAIR” principles of Findability, Accessibility, Interoperability and Reusability [54] should still be respected and form the basis for the European Open Science Cloud data policy.

OpenAIRE [55] is a network of Open Access repositories, archives and journals that support Open Access policies. It comprises more than 580 data providers, integrating more than 10 million Open Access publications related to about 25,000 organisations and 45,000 projects from 3 funders. OpenAIRE is contributing to the Linked Open Data movement, and has recently launched the DLI Service [56] for Data Literature Interlinking. The European Open Science Cloud will be interfaced as a content provider to this resource and as a consumer of service APIs, which will allow others to build integrated data discovery and analysis services.

The Zenodo digital repository, powered by Invenio and operated by CERN as part of OpenAIRE, has been extended with important features that greatly improve data sharing, and it has become very popular with researchers from many disciplines around the world. In particular, Zenodo now offers persistent identifiers for data objects, so that datasets and software from the popular GitHub code repository as well as publications can be cited, and it includes interfaces permitting metadata to be harvested.

EUDAT [57] is developing a collaborative data infrastructure (CDI) for European research communities. The B2 services suite currently consists of the B2SAFE service for implementing data management policies within the EUDAT CDI, the B2STAGE service which provides tools and APIs to interact with the EUDAT CDI, the B2SHARE data repository service to store and share research data, the B2FIND service for finding research data, and the B2DROP service, EUDAT’s Dropbox-like service to synchronise and exchange data within a trusted environment.

Metadata and indexing facilities across the set of services from OpenAIRE repositories and EUDAT data services, as well as engaged cloud service providers, are seen as being particularly relevant.
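The kind of metadata harvesting referred to in this section can be illustrated with a short sketch. It assumes that a repository exposes its records through the OAI-PMH protocol with Dublin Core (oai_dc) metadata, which is a common arrangement for Open Access repositories but is not specified in this paper. The sample response, its identifiers and the function name parse_records are hypothetical and serve only to show how a discovery service might extract titles and persistent identifiers from harvested records.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical harvested response: repository name, record and DOI are invented.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1001</identifier>
        <datestamp>2015-10-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example detector calibration dataset</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:identifier>https://doi.org/10.9999/example.1001</dc:identifier>
          <dc:rights>CC-BY-4.0</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per harvested record that carries Dublin Core metadata."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:
            # Records without a metadata element (e.g. deleted ones) are skipped.
            continue
        records.append({
            "oai_id": record.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            "creators": [e.text for e in dc.findall("dc:creator", NS)],
            "identifiers": [e.text for e in dc.findall("dc:identifier", NS)],
            "rights": [e.text for e in dc.findall("dc:rights", NS)],
        })
    return records


if __name__ == "__main__":
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec["title"], rec["identifiers"])
```

In a real deployment the XML would come from the repository's harvesting endpoint rather than an embedded string, and the extracted identifiers and licence fields would feed the indexing services described above.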
Data preservation

Data centres operated by the group of publicly funded research organisations and related third parties provide compute and storage services to the research community as well as access to scientific datasets and publications.

Next generation “data factories”, including the Research Infrastructures on the ESFRI roadmap, are characterised by data volumes that can extend from multiple petabytes to several exabytes and even beyond (such as the SKA), serving up to several thousand researchers around the world, as well as many more potential users via Open Access.

Data preservation, for current and future re-use and sharing, is a fundamental component of on-going data management plans and there is common agreement on the OAIS model (ISO 14721) together with closely related standards (ISO 16363 and 16919). This approach focuses almost exclusively on management of repository data, and additional capabilities are needed to satisfy the key use cases driving data (knowledge) preservation, sharing and re-use in a multi-disciplinary environment. These additional capabilities require a good understanding of who will re-use the data (“the consumers”) together with knowledge capture from the Open Scientists who are “the producers” (OAIS terms) of the data.

Preservation policies implemented in a measurable and certifiable manner across shared e-infrastructures together with domain and institutional repositories would stimulate much wider re-use of data through the captured and preserved knowledge, as well as the capability to preserve and re-use data and knowledge for significantly longer periods of time. This translates to a larger return on investment for the funding agencies, together with associated scientific, educational and cultural benefits.
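One simple way in which a preservation policy can be made measurable is a periodic fixity check: checksums are recorded when data is deposited and recomputed later to detect loss or corruption. The sketch below is an illustration under stated assumptions, not a method prescribed by this paper or by the OAIS standards. The function names (sha256_of, build_manifest, verify_manifest) and the choice of SHA-256 are hypothetical choices made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (as a relative POSIX path) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the sorted names of files that are missing, new or altered."""
    current = build_manifest(root)
    return sorted(
        name
        for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )


if __name__ == "__main__":
    # Demonstration on a temporary directory standing in for a repository.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "run1.dat").write_bytes(b"event data")
        manifest = build_manifest(tmp)
        print("after deposit:", verify_manifest(tmp, manifest))
        (Path(tmp) / "run1.dat").write_bytes(b"corrupted")
        print("after change:", verify_manifest(tmp, manifest))
```

The list returned by verify_manifest could be reported as a simple indicator (for example, the fraction of objects passing their check), which is one possible input to the certification processes mentioned above.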
Reproducibility of research

Federated cloud-based services will improve reproducibility and transparency (serving Responsible Research & Innovation principles, as envisaged by the OpenAIRE & FOSTER [58] report [59]), facilitating wider access for the knowledge-based industries, and letting the free flow of ideas and knowledge speed up innovation and delivery of added value to the marketplace.

The RDA Reproducibility Interest Group defined a set of high-priority services for reproducibility of Open Science, as follows [60]:

1) Persistent linking and availability of data and code (via repositories or other mechanisms) used in the generation of published research results, with the publication itself;
2) Development, encouragement, and adoption of meta-data standards for data and code, especially for those linked to publications;
3) Development, encouragement, and adoption of data and code publication, authorship, and citation practices, especially for those linked to publications;
4) Development and adoption of appropriate tools and computational infrastructure that enable: the sharing of research workflows and permit replication of computational scientific findings; the persistent linking of all digital scholarly objects used to generate research findings such as datasets in repositories; and versioning of digital scholarly objects to ensure persistent reproducibility.

To support reproducible science the European Open Science Cloud will need to integrate a network of Zenodo-like repository services and link them to the computing services to ensure that registering and storing research outputs becomes a simple and standard operation at the end of the compute cycle. In addition, this will enable users to analyse or re-analyse the registered
Disruptive technologies such as cloud offer a myriad of possibilities but come with new pressures for service provisioning. Cloud technology is more accessible to users, meaning they are more knowledgeable about what products and services they need and, due to the rapidly growing and easily accessible cloud services market, they have alternatives to their traditional supplier for acquiring them. Around the world, IT departments are being by-passed as users procure their own cloud services directly. This growing tendency of individuals and workgroups to sign up for commercially operated cloud services without any involvement from their IT departments creates serious risks for public organisations. The risks from such shadow cloud services include issues with data security, transaction integrity, business continuity and regulatory compliance.

Consequently the role of service provisioning for IT departments has to change to become more of a broker for technology and services. In this new role it is important for the IT department to know what is available on the market and how well it works, and to be able to assess providers, validate security, understand service levels and ensure policies and legislation are respected.

So there is an urgent need to organise the introduction of commercial cloud services in the public research sector in a consolidated and secure manner. Forming a network of public research organisations that can procure cloud services will attract the interest of service suppliers as well as funding agencies. The majority of this procurement funding will be directed to service providers and the approach has the advantage of permitting the procuring organisations to choose which services and providers receive these funds and thus represents a change to the established funding model for public sector IT services. Bringing together the public and private sector in the innovation cycle will strengthen Europe’s global competitiveness and encourage the creation of new and sustainable jobs and the promotion of growth.

The introduction of procurement of pay-per-use cloud services by funding agencies and research organisations on behalf of their end-users represents a significant change to e-Infrastructures and will impact the governance model. Currently publicly funded e-Infrastructures are supplier driven while the European Open Science Cloud puts procurers and users at the heart of the decision making process. It will be necessary to establish an inclusive governance structure where all the stakeholders are represented and avoid a monopoly of any procurer, supplier or research community. The governance principles have to ensure the interests of both public and private participants are met and that the European Open Science Cloud becomes sustainably attractive and beneficial for all stakeholders from both sectors.

The European Open Science Cloud will be a cornerstone of an open science commons and its governance model needs to take into account the realities of the public research sector with the following objectives:

1. Enable integration of existing e-Infrastructures with commercial cloud computing effectively and efficiently
2. Ensure alignment with the Digital Single Market, foster coherence, equitability and
innovation and growth.

In addition the European Open Science Cloud would become a critical ICT infrastructure for the European Research Area and would need to be protected by identifying vulnerabilities and ensuring an operational security plan is in place to minimise the detrimental effects of disruptions.

The governance structure is composed of several bodies as shown in Figure 1.
Each body in the governance structure has a specific role and composition:

Board of Procurers – this grouping of all procurers (research organisations, funding
that the European Open Science Cloud is compatible with European legislation. It would ensure the application of best practices for the contractual aspects of delivering cloud services including service level agreements implementing recognised policies for trust, security and privacy, notably for data protection; certification requirements; a code of conduct; and terms and conditions that respect European legislation.

Procurement and Assessment Agency – one or more organisations commissioned by the Board of Procurers to perform the joint procurement and centralised billing of services on behalf of all procurers as well as gather data necessary to measure a set of agreed Key Performance Indicators (KPIs). Having an organisation to oversee the procurement process, certify and enrol service providers as well as handle the contractual arrangements between suppliers and procurers with centralised billing would simplify the operation and expansion of the European Open Science Cloud.

End-User Board – grouping of end-users from engaged research communities including the long-tail of science to provide a consultative opinion on the relevance and added value of delivered services. End-users contribute applications software, data and publications. Responsibility for all data that is made available, linked or accessed via the services provided by the project remains with the data providers, and the data must have been obtained in accordance with the laws and regulations in operation in the country in which the data provider resides. This includes any requirement for approval from an appropriate ethics committee or other regulatory body.
Figure 1 Governance Structure
Technical Board – grouping of technical experts to assess the technical maturity and suitability of services, including security aspects.

External Advisory Committee – grouping of external experts from the public and commercial sectors that will provide advice to the Board of Procurers on the state and future directions of the European Open Science Cloud.
Proprietary solutions are not solutions

The cost of providing licences to popular proprietary software packages for the users of Research Infrastructures continues to increase. As an example, between 2008 and 2014, CERN’s spending on software doubled without any significant increase in the number of licences. Moving to a cloud model where software licences are rented on a pay-per-use basis may help stem this increase. But some proprietary software packages have an effective monopoly in the research domain and their market dominance can offset any potential savings. It is essential that there is appropriate investment in open source solutions in key domains so they can be supported by multiple providers. We must leverage the richness in the diversity of European suppliers and match it with the expertise available in production e-Infrastructures, demonstrating the technical feasibility of interoperability between these players.

The European Technology Platform for High Performance Computing project [61] published a Strategic Research Agenda for achieving HPC leadership in Europe [62] which specifically highlights the upcoming big-data challenges for leading research activities and the relevance of cloud services:
“Europe is in a unique position to excel in the area of HPC Usage and Big Data owing to the experience level of current and potential users (and the recognition of the importance of data by such users as CERN, ESA, and biological data banks) and the presence of leading ISVs for large‐scale business applications. Europe should exploit that knowledge to create competitive solutions for big‐data business applications, by providing easier access to data and to leading‐edge HPC platforms, by broaden the user base (e.g., through Cloud Computing and Software as a Service (SaaS), and by responding to new and challenging technologies.”
There is no clear business case for purely commercial HPC services at the scale of PRACE tier-0 installations, but smaller-scale commercial ‘HPC in the cloud’ offerings are starting to appear on the market. This will help address the shortfall between supply and demand for capability HPC services as seen at PRACE [63], where typically only one third of the requests can be satisfied. The use of capability HPC services by the commercial sector, in particular SMEs, is being investigated by the EC funded Fortissimo project [64]. This will make hardware, expertise, applications, visualisation and tools available on a pay-per-use basis. In parallel, the UberCloud Marketplace [65] is offering on-demand access to HPC services for individual engineers and scientists.
Public investment

The steps described above will need considerable public investment as well as investment from commercial service providers to bring the platform together. In order for the research community to be able to benefit fully from the existence of a European Open Science Cloud, it has to expand beyond the basic IaaS level and provide higher-level services that are closer to the needs of the daily work of a researcher. The HNSciCloud PCP project provides a vehicle for joint investment in IaaS services and a similar approach should be envisaged for higher-level software services. The natural follow-on step for successful PCP projects is to procure at a larger scale with PPI co-funded projects that could significantly increase the capacity and impact of the European Open Science Cloud.

This will take a sustained investment by all the stakeholders in both the public and commercial sectors, not only in cloud technology, supporting infrastructure and strategic software but also in end-user facing services which will simplify access to the European Open Science Cloud.

Significant investment in software capability will be absolutely essential to obtain the best performance from current and future computer and storage architectures. Many sciences today benefit from commodity CPU and disk storage but there are significant architectural changes in modern CPUs (memory layout, I/O paths, accelerators, vector units, etc.) which mean it will be necessary for science to invest heavily in software and training to be able to migrate application codes and programmers and fully exploit these new technologies. This investment in software is essential to maintain European competitiveness in this area, and should include coordination of existing expertise to the benefit of diverse communities.
Investment in skills

The design, creation and operation of e-infrastructure services are essential tools in the development of skills and competencies for the European market. The ability to fully exploit the potential for knowledge and job creation that is locked up in the datasets and algorithms at the centre of Open Science will require the nurturing of a new generation of data scientists with a core set of ICT skills. The EIROforum organisations have core competences in training and education which can contribute to this activity. The European Open Science Cloud can build on this and similar initiatives to help train the next generation of IT-savvy researchers, and also improve outreach to the general public.
and national funding agencies have recently confirmed their commitments to GÉANT, AARC, EGI, OpenAIRE, EUDAT and PRACE. In order to ensure full synergies, DG CONNECT foresees that e-infrastructure projects will be grouped into clusters of related projects. This new phase of funding for the clusters of e-infrastructure projects offers the EC a window of opportunity and a means to focus on establishing the European Open Science Cloud. In parallel DG RTD intends to fund a pilot action that will encourage the uptake of the European Open Science Cloud by the Research Infrastructures. Close coordination between DG RTD and DG CONNECT funded projects will facilitate the establishment of the European Open Science Cloud.

The financial plan for the European Open Science Cloud should be designed so that the services can be sustained by their operating organisations according to a continuum of funding models ranging from sponsored resources for peer-reviewed scientific cases to communities who would pay for the services they receive.

Additional resources will be required in order for these services to be expanded and to serve a wider range of users. The European Commission together with regional, national and thematic funding agencies will need to become stakeholders and contribute to the expansion of the European Open Science Cloud. The guiding principle is that funding from such stakeholders will be focused on innovation of services and uptake by new user communities and business actors while the operational costs will be borne by the operating organisations and the user communities.

Below is a non-exhaustive list of areas where funding agencies can contribute to the creation of the European Open Science Cloud:
Many research organisations that operate research infrastructures do not have the mandate to provide cloud services to their users for the management and processing of their experimental data. This represents a gap in the scientific lifecycle and a missed opportunity to highlight the results and impact of publicly funded research. These research organisations will require assistance to bridge this gap by supporting their users so they can make use of cloud services to manage and process their experimental data.

The European Commission’s INFRASTRUCTURES 2016-2017 work programme foresees a pilot action addressing the federation, networking and coordination of pan-European research infrastructures and clouds for the purpose of increasing research and science data availability and use. It also foresees Data and Distributed Computing e-infrastructures for Open Science which should cooperate with the pilot action. The combined focus of these funding calls should provide an incentive for the existing e-infrastructures and Research Infrastructures to work together to form the basis of the European Open Science Cloud. Looking further ahead, the EC has taken steps to ensure funding for GÉANT over the full duration of H2020 by introducing ‘Framework Partnership Agreements’ (FPA). The FPA model represents a more long-term engagement that could encourage the integration of e-infrastructures co-funded via EC projects into the Research Infrastructures’ computing models. The application of the FPA approach to the European Open Science Cloud could establish the basis for the European Research Area’s digital commons and lead towards Science 2.0 [66].
Conclusions

Cloud computing represents a paradigm shift in the way IT resources are provisioned for research communities. Traditionally the IT departments of research organisations have developed and operated in-house the services that their users required. But commercial cloud services are a disruptive technology with easy-to-use commodity services made available often on a ‘freemium’ basis to users at a global scale. Consequently the role of IT departments is changing as users by-pass their traditional service provision channels to get the on-demand services they want, thereby introducing shadow IT services that are outside the policy and security boundaries of research organisations. This is impacting data intensive science and how e-infrastructure services are used by researchers and judged by funding agencies.

This wave of change is taking place within the broader context of Open Science bringing ever-greater transparency, accessibility and accountability, wherein stakeholders in the research process increasingly expect to be able to access and reuse the outputs of taxpayer-funded research. From the grassroots, Open Access first emerged from the High Energy Physics scholarly research community [67], who saw benefit in no longer waiting for traditional publication schedules before sharing research findings (and, subsequently, data and software code). Top-down, governments and other funders see openness as a catalyst for increasing public and commercial engagement with research, bringing about both societal and commercial benefit.

This new reality represents a threat to the established service procurement and delivery models but also an opportunity. In an era of rationalisation and budget concentration, all means of optimising service delivery and reducing operational costs must be considered. The EIROforum members have extreme IT needs that increase with the progress of the research infrastructures they operate while the budget envelope for IT remains, at best, unchanged.

Cloud computing and the cloud services market did not exist when the computing models for many ESFRI research infrastructures were conceived. These computing models must evolve to become more agile and opportunistic, capable of using IT capacity in whatever form it is delivered, be it in a grid, cloud, HPC or even in a volunteer structure. We expect commercial cloud services to play an increasing role in these computing models.
[66] http://ec.europa.eu/research/consultations/science-2.0/background.pdf
[67] Open Access: Unlocking the Value of Scientific Research, Richard K. Johnson (SPARC), March 2004, http://www.sparc.arl.org/sites/default/files/media_files/OpenAccess_RKJ_preprint.pdf
Commercial sectors are investing heavily in cloud services, leading to a rapid expansion of the market and a breath-taking rate of innovation that the publicly funded research sector cannot match but can leverage, and so profit from such advances.

The European Open Science Cloud represents a strategic vision that can be a vector for introducing change in the service provisioning and computing models for the publicly funded research sector in the medium to long term.

The European Open Science Cloud has the potential to greatly improve the provisioning of IT services for Research Infrastructures to address their big data needs. It can encompass all the phases of the research lifecycle and offer a platform of joint innovation for the public and private sectors. It will significantly change the way IT services are procured, organised and funded.

The key challenges are integrating frequently changing technologies, managing the complexity and identifying the optimal organisational and financial models. Researchers must be convinced that they will not lose control of their precious data. It is an ambitious undertaking requiring the active engagement of many stakeholders and careful planning of the technical, financial, legal and governance aspects. For it to succeed it must become a priority for all the actors involved, with monitoring by the funding agencies and regular assessment by the user communities.

This position paper is a rallying call for adoption of such a strategic approach, within the EC and other funding bodies as well as the operators of Research Infrastructures.