01/05/2023
HPC & Big Data 2017: Accelerating Science and Innovation - It's Good to Share 1
Accelerating Science and Innovation – It’s Good to Share
Martin Hamilton, Jisc
Alison Davis, Francis Crick Institute
Tim Cutts, Wellcome Trust Sanger Institute
Accelerating Science & Innovation – It’s Good to Share
1. About Jisc
2. R&D on new services for researchers
› Research Data Shared Service
› Research Data Discovery Service
› What’s next?
3. Personal perspectives:
› Alison Davis, CIO, Francis Crick Institute
› Tim Cutts, Head of Scientific Computing, Wellcome Trust Sanger Institute
4. Panel discussion and Q&A
1. About Jisc
About Jisc
Jisc is the UK higher education, further education and skills sectors’ not-for-profit organisation for digital services and solutions. This is what we do:
› Operate shared digital infrastructure and services for universities and colleges
› Negotiate sector-wide deals, e.g. with IT vendors and commercial publishers
› Provide trusted advice and practical assistance
About Jisc
In the UK there are:
› 470 colleges providing further education
› 160 higher education institutions
› 2.3m students in HE
› 4.9m learners in FE
› 23% postgraduate, 77% undergraduate
› Funding for FE and skills: £7.7bn
› Income of HEIs: £30.7bn
› 1,085 providers of further education and skills
About Jisc
Janet network
About Jisc
[Diagram: Janet external connectivity, with links drawn at 1 Gbit/s, 10 Gbit/s and 100 Gbit/s. Locations shown: Leeds, Manchester, Glasgow & Edinburgh, Telecity Harbour Exch., Telehouse North & West. Connections shown include GÉANT, GÉANT+, LINX, LINX multicast, IXManchester, IXLeeds, Global Transit (Level3, Tata), HEAnet, NHS N3, SWAN (Glas, Edin), BBC, Akamai, Limelight, Netflix, Amazon, Microsoft EU and others.]
Total external connectivity ≈ 1 Tbit/s
About Jisc
Support for researchers, including:
› Technology platform, building on the Janet network:
– Shared data centres with Virtus & aql
– Cloud deals and agreements, e.g. AWS & Azure
– Archiving framework with Arkivum
– Access and identity for higher assurance use cases (Assent, Safe Share)
› Open access and open data:
– SHERPA services: tracking funder/journal Open Access policies
– Monitor: track Open Access costs and compliance
– CORE: Open Access publications search engine
› Agreements with publishers
– e.g. Elsevier, Taylor & Francis, Wiley, Springer
– Progress with subscription offsets for Article Processing Charges
About Jisc
Shared Data centre (North):
› Run in partnership with aql
› Tier 3 data centre, designed with HPC in mind
› Able to offer air and water cooling
› Will be connected to the core of Janet at 2x100G initially
› Racks available in 4/10/20/30kW configurations
› Anchor tenants are the Universities of Liverpool, Leeds and Sheffield, and Sheffield Hallam University
› Expected to be available for service April 2017
About Jisc
Shared Data centre (South):
› Tier 3 data centre, designed with HPC in mind. Able to offer air and water cooling
› Connected to the core of Janet at 2x400G
› Racks available in 4/10/20/30kW configurations
› Anchor tenants are UCL, LSE, QMUL, King’s College, Sanger Institute, Francis Crick Institute
› Other tenants include Imperial College, Brunel University, Bristol Uni, Surrey University, University of the Arts, HEFCE, University of Sussex, Institute of Cancer Research
› Currently seeing a 60:40 split in favour of HPC
› The SDC (South) framework is now at ~220 racks and a committed power from the tenants of ~2.3MW, all in just over 2 years
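As a rough check on the figures above, the committed power can be divided by the rack count to give an average per rack. This is only a sketch: it takes the two approximate figures (~220 racks, ~2.3MW) at face value and assumes the committed power is spread evenly across racks, which the slide does not state.

```python
# Approximate figures quoted on the slide (both marked "~" in the source).
RACKS = 220
COMMITTED_POWER_MW = 2.3

# Rack power configurations offered, in kW, as listed on the slide.
RACK_OPTIONS_KW = [4, 10, 20, 30]


def average_kw_per_rack(power_mw: float, racks: int) -> float:
    """Average committed power per rack in kW, assuming an even spread."""
    return power_mw * 1000 / racks


avg_kw = average_kw_per_rack(COMMITTED_POWER_MW, RACKS)

# The offered configuration closest to that average.
closest_option = min(RACK_OPTIONS_KW, key=lambda kw: abs(kw - avg_kw))

print(f"Average committed power per rack: {avg_kw:.1f} kW")
print(f"Closest offered configuration: {closest_option} kW")
```

Under those assumptions the average sits close to the 10kW configuration, although real tenants will mix lower and higher density racks.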
2. R&D on new services for researchers
R&D on new services for researchers
Research Data Shared Service
› Procurement concluded and suppliers selected
› Now building the service to the community’s requirements
› 13 pilot institutions
› Research Data Network
› Find out more: researchdata.jiscinvolve.org
R&D on new services for researchers
Milestones 2015-18
› Apr 2015 – Dec 2015:
– Requirements
– HEI pilots selected
– Procurement commences
– Institutional survey
– HEI and supplier workshops
– Pilot HEI selection process
› Jan 2016 – July 2016:
– Support consultancy work begins
– Supplier framework selected
– Detailed HEI requirements and technical architecture
– Contracting commences
› Aug 2016 – June 2017:
– Alpha development
– Alpha service tested and reviewed
– Development phase
– Contact additional early adopter HEIs and promote Beta service
› Jul 2017 – Sept 2017:
– Beta development
– Feedback on Beta service
– Business planning and begin business case
– Market research and consultation
› Oct 2017 – Apr 2018:
– Business case decision
– If go then begin transition to production service
– Promote service to institutions
– Start on next phases (service enhancement/modular)
R&D on new services for researchers
Research Data Discovery Service
› Alpha - feedback sought!
› Uses CKAN to aggregate research data from institutions
› Test system has 14K datasets from 22 organisations so far
› Find out more: rdds.jiscinvolve.org
› Try it: staging.ckan.data.alpha.jisc.ac.uk
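Because the discovery service is built on CKAN, it could in principle be queried through CKAN's standard Action API. The sketch below builds a `package_search` request URL and reads dataset titles out of a response in the usual CKAN shape. The host is the one quoted on the slide and may no longer be live; the `https` scheme, the helper names and the sample response are assumptions for illustration, not details confirmed by the source.

```python
from urllib.parse import urlencode, urlunsplit

# Host quoted on the slide; an alpha/staging system that may no longer exist.
HOST = "staging.ckan.data.alpha.jisc.ac.uk"


def package_search_url(query, rows=10, start=0, host=HOST, scheme="https"):
    """Build a CKAN Action API package_search URL (scheme is an assumption)."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return urlunsplit((scheme, host, "/api/3/action/package_search", params, ""))


def dataset_titles(response):
    """Extract dataset titles from a decoded package_search JSON response."""
    if not response.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    # Fall back to the dataset's machine name when no title is present.
    return [pkg.get("title") or pkg.get("name", "")
            for pkg in response["result"]["results"]]


url = package_search_url("genomics", rows=5)

# Hypothetical response, shaped like CKAN's package_search output.
sample_response = {
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "example-dataset-a", "title": "Example dataset A"},
            {"name": "example-dataset-b"},
        ],
    },
}
titles = dataset_titles(sample_response)
```

The URL would be fetched with any HTTP client and the JSON body passed to `dataset_titles`; no request is made here so the example runs offline.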
3. Personal perspectives
Personal perspectives
› Alison Davis
› Chief Information Officer
› Francis Crick Institute
Big Data & HPC Collaboration Challenges
Alison Davis, CIO
The Francis Crick Institute
The Crick – key facts
• Funded by 6 founding partners
• Construction = £650M
• The building is 170 m long & 50 m high
• Total floor space of 93,000 m² (17.5 football fields)
• Capacity for ca. 1,250 scientists & 250 operational staff
• Migration August – December 2016
Our science questions and experimental approaches
Evolution
• Separate sites: to April 2015
• Multi-site operations: April 2015 – August 2016
• Migration: August – December 2016
• Stabilisation: 2017 – ?
• “Normal”
Target state scientific computing platform
[Diagram. Components shown: Data sources (Instruments, Ext data sets, Other); High availability tier; Middle working tier; Near line archive; Long term archive; Backup; Batch Compute (GPU/CPU); Interactive Compute (GPU/CPU); Cloud Services; Applications; Operational support and processes.]
Key Challenges
crick.ac.uk
@AlisonDavisIT
Personal perspectives
› Tim Cutts
› Head of Scientific Computing
› Wellcome Trust Sanger Institute
Genome Campus
Research Institutes
Conferences, training and public engagement
Translation activities
Sanger Science
Scientific Programmes
Core Facilities
2013 Vision: A Meeting of Minds
[Diagram labels: Francis Crick Institute; WTSI DR and Capacity; MRC Collaborative Bioinformatics; JISC; UCL; KCL; Many others; Scientific Collaboration]
2013 Vision: The genesis of eMedLab
WTSI Use of JSDC Slough
• WTSI share of eMedLab
• Business continuity and disaster recovery
• iRODS replica including all our primary sequence data (6.5 PB)
• Transferred all services from our previous DR site
• Replicas of critical RDBMS
• Replicas of enterprise NAS system (~2 PB)
• BCP for critical web services
• 10 Gbit dark fibre
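To give a feel for the scale involved, the sketch below estimates how long the 6.5 PB iRODS replica would take to copy over a single 10 Gbit/s link. It assumes decimal petabytes, a link running at full line rate with no protocol overhead, and no concurrent traffic; none of these are stated on the slide, so the result is a best case rather than a measured figure.

```python
PB = 10**15          # decimal petabyte, in bytes (an assumption)
GBIT = 10**9         # gigabit, in bits
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes: float, link_bits_per_s: float,
                  efficiency: float = 1.0) -> float:
    """Days needed to move data_bytes over a link at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / SECONDS_PER_DAY


# 6.5 PB replica over the 10 Gbit dark fibre, at ideal line rate.
best_case_days = transfer_days(6.5 * PB, 10 * GBIT)

# The same transfer at a hypothetical 70% sustained efficiency.
derated_days = transfer_days(6.5 * PB, 10 * GBIT, efficiency=0.7)

print(f"Best case: {best_case_days:.0f} days")
print(f"At 70% efficiency: {derated_days:.0f} days")
```

Even in the ideal case the initial copy is on the order of two months, which is one way to see why bulk data movement is treated as a cost later in the talk.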
Today’s Science Drives IT Strategy
Large Scale Scientific Computing
Collaboration
Reproducible Science
Reliability
Rapid Delivery
Performance
Scalability
Cost Effectiveness
Data Security and Governance
• Scale
• Data security
• Reliability
Limitations
• Ever-increasing data acquisition rates
• Aggregating data is not sustainable
• Scale (== cost)
• Governance
• Duplication (== cost)
• Network bandwidth (== cost)
2017 Vision: Evolution and Federation
Open Standard APIs
Federated AAAI
MRC Bioinformatics Centres
Public clouds
WTSI Flexible Compute
EBI Embassy Cloud
WTSI Flexible Compute
• 5,996 cores
• 50 TB RAM
• 3 PB storage
• 100 Gbit software-defined network
• Red Hat OpenStack Platform with CloudForms and Ceph
• Automated deployments
• Continuous integration tests
• Reproducible
• Images deployable to any cloud environment
• Enables Service Desk to make safe changes
Ongoing challenges
• Open standard APIs
• Sufficient resources to develop and operate services
• Major change in application development strategies
Conclusion
• Data science needs federated analysis
• Keep the data at its source
• Move computational work to the data
• Standards and infrastructure software need investment
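The idea of keeping data at its source and moving the computation can be illustrated with a toy federated mean: each site computes a small summary locally, and only those summaries travel to a coordinator. This is a minimal sketch, not the approach used by any of the services named in the talk; the site names, values and function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class SiteSummary:
    """The only information that leaves a site: a count and a sum."""
    count: int
    total: float


def local_summary(values: Sequence[float]) -> SiteSummary:
    """Runs at the data's source; the raw values never leave the site."""
    return SiteSummary(count=len(values), total=float(sum(values)))


def federated_mean(summaries: Iterable[SiteSummary]) -> float:
    """Runs at the coordinator, combining per-site summaries only."""
    summaries = list(summaries)
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no observations across sites")
    return sum(s.total for s in summaries) / n


# Hypothetical per-site data that stays where it was generated.
site_data = {
    "site_a": [1.0, 2.0, 3.0],
    "site_b": [10.0, 20.0],
}

summaries = [local_summary(values) for values in site_data.values()]
overall_mean = federated_mean(summaries)
print(f"Mean across sites: {overall_mean}")
```

Each site ships two numbers regardless of how much data it holds, which is the trade the conclusion argues for; real analyses need richer summaries and agreed interfaces, hence the emphasis on standards.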
4. Panel discussion and Q&A
Contact details
Martin Hamilton
Futurist, Jisc
Except where otherwise noted, this work is licensed under CC-BY