Accelerating Science and Innovation - It's Good to Share (HPC & Big Data 2017)

Post on 10-Feb-2017

Category:

Science

01/05/2023

HPC & Big Data 2017: Accelerating Science and Innovation - It's Good to Share 1

Accelerating Science and Innovation – It’s Good to Share
Martin Hamilton, Jisc
Alison Davis, Francis Crick Institute
Tim Cutts, Wellcome Trust Sanger Institute

Accelerating Science & Innovation – It’s Good to Share

1. About Jisc

2. R&D on new services for researchers
› Research Data Shared Service
› Research Data Discovery Service
› What’s next?

3. Personal perspectives:
› Alison Davis, CIO, Francis Crick Institute
› Tim Cutts, Head of Scientific Computing, Wellcome Trust Sanger Institute

4. Panel discussion and Q&A

1. About Jisc

About Jisc

Jisc is the UK higher education, further education and skills sectors’ not-for-profit organisation for digital services and solutions. This is what we do:
› Operate shared digital infrastructure and services for universities and colleges
› Negotiate sector-wide deals, e.g. with IT vendors and commercial publishers
› Provide trusted advice and practical assistance

About Jisc

In the UK there are:
› 470 colleges providing further education
› 1,085 providers of further education and skills
› 160 higher education institutions
› 4.9m learners in FE
› 2.3m students in HE (23% postgraduate, 77% undergraduate)
› Funding for FE and skills: £7.7bn
› Income of HEIs: £30.7bn

About Jisc

Janet network

About Jisc

[Figure: Janet network external connectivity map. Links are drawn at 1 Gbit/s, 10 Gbit/s and 100 Gbit/s. Locations and exchanges labelled include Telecity, Telehouse North & West, Harbour Exchange, LINX, IXManchester, IXLeeds, Manchester, Leeds, and Glasgow & Edinburgh. Connected networks labelled include GÉANT, GÉANT+, HEAnet, NHS N3, SWAN, BBC, Google, Akamai, Amazon, Microsoft EU, Netflix and Limelight, plus global transit via Tata and Level3.]

Total external connectivity ≈ 1 Tbit/s

About Jisc

Support for researchers, including:
› Technology platform, building on the Janet network:
  – Shared data centres with Virtus & aql
  – Cloud deals and agreements, e.g. AWS & Azure
  – Archiving framework with Arkivum
  – Access and identity for higher assurance use cases (Assent, Safe Share)
› Open access and open data:
  – SHERPA services – tracking funder/journal Open Access policies
  – Monitor – track Open Access costs and compliance
  – CORE – Open Access publications search engine
› Agreements with publishers
  – e.g. Elsevier, Taylor & Francis, Wiley, Springer
  – Progress with subscription offsets for Article Processing Charges

About Jisc

Shared Data centre (North):
› Run in partnership with aql
› Tier 3 data centre, designed with HPC in mind
› Able to offer air and water cooling
› Will be connected to the core of Janet at 2x100G initially
› Racks available in 4/10/20/30kW configurations
› Anchor tenants are Universities of Liverpool, Leeds, Sheffield and Sheffield Hallam University
› Expected to be available for service April 2017

About Jisc

Shared Data centre (South):
› Tier 3 data centre, designed with HPC in mind
› Able to offer air and water cooling
› Connected to the core of Janet at 2x400G
› Racks available in 4/10/20/30kW configurations
› Anchor tenants are UCL, LSE, QMUL, King’s College, Sanger Institute, Francis Crick Institute
› Other tenants include Imperial College, Brunel University, Bristol Uni, Surrey University, University of the Arts, HEFCE, University of Sussex, Institute of Cancer Research
› Currently seeing a 60:40 split in favour of HPC
› The SDC (South) framework is now at ~220 racks and a committed power from the tenants of ~2.3MW – all in just over 2 years
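
As a rough worked example of what the SDC (South) figures imply, the sketch below divides the committed power by the rack count and compares the result with the rack configurations on offer. Both inputs are the approximate values quoted on the slide, so the result is indicative only; the variable names are illustrative.

```python
# Approximate figures quoted on the slide.
racks = 220
committed_power_mw = 2.3

# Average committed power per rack, in kW.
avg_kw_per_rack = committed_power_mw * 1000 / racks

# Rack configurations offered at the shared data centre, in kW.
rack_options_kw = [4, 10, 20, 30]

# Offered configuration closest to the average.
nearest_option_kw = min(rack_options_kw, key=lambda kw: abs(kw - avg_kw_per_rack))

print(f"Average committed power: {avg_kw_per_rack:.1f} kW per rack")
print(f"Nearest offered configuration: {nearest_option_kw} kW")
```

The average comes out at a little over 10 kW per rack, which sits closest to the 10kW configuration.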

2. R&D on new services for researchers

R&D on new services for researchers

Research Data Shared Service

› Procurement concluded and suppliers selected
› Now building the service to the community’s requirements
› 13 pilot institutions
› Research Data Network
› Find out more: researchdata.jiscinvolve.org

R&D on new services for researchers

Milestones 2015-18

Apr 2015 – Dec 2015
› Requirements; HEI pilots selected; procurement commences
› Institutional survey; HEI and supplier workshops; pilot HEI selection process

Jan 2016 – July 2016
› Support consultancy work begins; supplier framework selected
› Detailed HEI requirements and technical architecture; contracting commences

Aug 2016 – June 2017
› Alpha development; alpha service tested and reviewed
› Beta development; feedback on beta service
› Development phase; contact additional early adopter HEIs and promote beta service

Jul 2017 – Sept 2017
› Business case decision
› Business planning and begin business case; market research and consultation

Oct 2017 – Apr 2018
› If go then begin transition to production service
› Promote service to institutions; start on next phases (service enhancement/modular)

R&D on new services for researchers

Research Data Discovery Service

› Alpha - feedback sought!
› Uses CKAN to aggregate research data from institutions
› Test system has 14K datasets from 22 organisations so far
› Find out more: rdds.jiscinvolve.org
› Try it: staging.ckan.data.alpha.jisc.ac.uk
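
Because the discovery service is built on CKAN, its catalogue can in principle be queried through CKAN's standard Action API (`/api/3/action/package_search`). The sketch below shows one way to build such a query and summarise a response. It assumes the staging host from the slide is served over HTTPS and exposes the default API, neither of which the slide confirms, and the alpha system may no longer be online. The sample response is hand-written to match CKAN's response shape and is not real data from the service.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host taken from the slide; the https scheme is an assumption.
BASE_URL = "https://staging.ckan.data.alpha.jisc.ac.uk"


def build_search_url(query, rows=10, base_url=BASE_URL):
    """Build a CKAN Action API package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def summarise(response_text):
    """Return (total_count, [(title, organisation), ...]) from a package_search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    datasets = []
    for ds in result["results"]:
        org = (ds.get("organization") or {}).get("title", "unknown")
        datasets.append((ds.get("title", ds.get("name")), org))
    return result["count"], datasets


def search(query, rows=10):
    """Query the live service (not called here; the host may be unavailable)."""
    with urlopen(build_search_url(query, rows)) as resp:
        return summarise(resp.read().decode("utf-8"))


# Hand-written sample in the shape CKAN returns; illustrative values only.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "example-a", "title": "Example dataset A",
             "organization": {"title": "Example University"}},
            {"name": "example-b", "title": "Example dataset B",
             "organization": None},
        ],
    },
})

total, datasets = summarise(SAMPLE_RESPONSE)
print(build_search_url("genome", rows=5))
print(total, datasets)
```

Keeping URL construction and response parsing separate from the network call makes the example usable offline and easy to point at a different CKAN instance.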

3. Personal perspectives

Personal perspectives

› Alison Davis
› Chief Information Officer
› Francis Crick Institute

Big Data & HPC Collaboration Challenges

Alison Davis, CIO
The Francis Crick Institute

The Crick – key facts

• Funded by 6 founding partners
• Construction = £650M
• The building is 170 m long & 50 m high
• Total floor space of 93,000 m² (17.5 football fields)
• Capacity for ca. 1,250 scientists & 250 operational staff
• Migration August – December 2016

Our science questions and experimental approaches

Evolution

• Separate sites: to April 2015
• Multi-site operations: April 2015 – August 2016
• Migration: August – December 2016
• Stabilisation: 2017 – ?
• “Normal”

Target state scientific computing platform

[Figure: platform diagram. Labelled components:]
• Data sources: instruments, ext data sets, other
• High availability tier
• Middle working tier
• Near line archive
• Long term archive
• Backup
• Batch compute (GPU/CPU)
• Interactive compute (GPU/CPU)
• Cloud services
• Applications
• Operational support and processes

Key Challenges

crick.ac.uk

alison.davis@crick.ac.uk

@AlisonDavisIT

Personal perspectives

› Tim Cutts
› Head of Scientific Computing
› Wellcome Trust Sanger Institute

Shared Facilities, Collaborative Science

Dr Tim Cutts
Head of Scientific Computing

tjrc@sanger.ac.uk

Genome Campus

Research Institutes

Conferences, training and public engagement

Translation activities

Sanger Science

Scientific Programmes

Core Facilities

2013 Vision: A Meeting of Minds

[Figure labels: Francis Crick Institute; WTSI DR and Capacity; MRC Collaborative Bioinformatics; JISC; UCL; KCL; Many others; Scientific Collaboration]

2013 Vision: The genesis of eMedLab

WTSI Use of JSDC Slough

• WTSI share of eMedLab
• Business continuity and disaster recovery
• iRODS replica including all our primary sequence data (6.5 PB)
• Transferred all services from our previous DR site
• Replicas of critical RDBMS
• Replicas of enterprise NAS system (~2 PB)
• BCP for critical web services
• 10 Gbit dark fibre
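
To give a feel for the scale involved, the sketch below estimates how long a 6.5 PB replica would take to cross a 10 Gbit/s link. It is a back-of-envelope illustration, not a description of how the replica was actually built. It assumes decimal petabytes, and the 70% link efficiency used for the second figure is an arbitrary illustrative value.

```python
PB = 10**15  # bytes in a decimal petabyte (assumption; the slide does not say PB vs PiB)
SECONDS_PER_DAY = 86_400

data_bytes = 6.5 * PB            # primary sequence data held in iRODS
link_bits_per_s = 10 * 10**9     # 10 Gbit/s dark fibre


def transfer_days(n_bytes, bits_per_s, efficiency=1.0):
    """Days needed to move n_bytes over a link running at the given efficiency."""
    seconds = n_bytes * 8 / (bits_per_s * efficiency)
    return seconds / SECONDS_PER_DAY


ideal_days = transfer_days(data_bytes, link_bits_per_s)
# 0.7 is an illustrative efficiency, not a measured value.
derated_days = transfer_days(data_bytes, link_bits_per_s, efficiency=0.7)

print(f"At full line rate: {ideal_days:.0f} days")
print(f"At 70% efficiency: {derated_days:.0f} days")
```

Even at full line rate the copy takes roughly two months, which illustrates why bulk data movement at this scale is costly.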

Today’s Science Drives IT Strategy

• Large Scale Scientific Computing
• Collaboration
• Reproducible Science
• Reliability
• Rapid Delivery
• Performance
• Scalability
• Cost Effectiveness
• Data Security and Governance

• Scale
• Data security
• Reliability

Limitations

• Ever-increasing data acquisition rates
• Aggregating data is not sustainable
• Scale (== cost)
• Governance
• Duplication (== cost)
• Network bandwidth (== cost)

2017 Vision: Evolution and Federation

Open Standard APIs

Federated AAAI

MRC Bioinformatics Centres

Public clouds

WTSI Flexible Compute

EBI Embassy Cloud

WTSI Flexible Compute

• 5,996 cores
• 50 TB RAM
• 3 PB storage
• 100 Gbit software-defined network
• Red Hat OpenStack Platform with CloudForms and Ceph
• Automated deployments
• Continuous integration tests
• Reproducible
• Images deployable to any cloud environment
• Enables Service Desk to make safe changes

Ongoing challenges

• Open standard APIs
• Sufficient resources to develop and operate services
• Major change in application development strategies

Conclusion

• Data science needs federated analysis
• Keep the data at its source
• Move computational work to the data
• Standards and infrastructure software need investment
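
The idea of keeping data at its source and moving the computation to it can be illustrated with a minimal sketch. Here each site computes a small summary locally, and only those summaries are combined centrally to obtain a global mean and variance. The site names, values and function names are hypothetical. A real federation would also need the authentication, governance and standard APIs discussed in the talk, none of which is modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Aggregate statistics that are safe to send off-site."""
    n: int
    total: float
    total_sq: float


def local_summary(values):
    """Runs where the data lives; raw values never leave the site."""
    return Summary(len(values), sum(values), sum(v * v for v in values))


def combine(summaries):
    """Runs centrally; sees only per-site summaries, not the raw data."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = total_sq / n - mean ** 2  # population variance
    return mean, variance


# Hypothetical sites with made-up measurements, for illustration only.
site_data = {
    "site_a": [2.0, 4.0, 6.0],
    "site_b": [8.0],
    "site_c": [1.0, 3.0],
}

summaries = [local_summary(values) for values in site_data.values()]
mean, variance = combine(summaries)
print(f"Federated mean: {mean}, variance: {variance:.4f}")
```

The combined result matches what pooling all the values in one place would give, while the volume transferred is three numbers per site rather than the full dataset.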

4. Panel discussion and Q&A

Contact details

Martin Hamilton
Futurist, Jisc
@martin_hamilton
martin.hamilton@jisc.ac.uk

Except where otherwise noted, this work is licensed under CC-BY
