Page 1: Research computing at ILRI

Research Computing at ILRI

Alan Orth

ICT Managers Meeting, ILRI, Kenya, 5 March 2014

Page 2: Research computing at ILRI

Where we came from (2003)

- 32 dual-core compute nodes
- 32 * 2 != 64
- Writing MPI code is hard!
- Data storage over NFS to “master” node
- “Rocks” cluster distro
- Revolutionary at the time!

Page 3: Research computing at ILRI

Where we came from (2010)

- Most of the original cluster removed
- Replaced with a single Dell PowerEdge R910
- 64 cores, 8 TB storage, 128 GB RAM
- Threading is easier* than MPI!
- Data is local
- Easier to manage!

Page 4: Research computing at ILRI

To infinity and beyond (2013)

- A little bit back to the “old” model
- Mixture of “thin” and “thick” nodes
- Networked storage
- Pure CentOS
- Supermicro boxen
- Pretty exciting!

Page 5: Research computing at ILRI

Primary characteristics

Computational capacity

Data storage

Page 6: Research computing at ILRI

Platform

- 152 compute cores
- 32* TB storage
- 700 GB RAM
- 10 GbE interconnects
- LTO-4 tape backups (LOL?)

Page 7: Research computing at ILRI

Homogeneous computing environment

User IDs, applications, and data are available everywhere.

Page 8: Research computing at ILRI

Scaling out storage with GlusterFS

- Developed by Red Hat
- Abstracts backend storage (file systems, technology, etc.)
- Can do replicate, distribute, replicate+distribute, geo-replication (off site!), etc. (sketched below)
- Scales “out”, not “up”
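To make the replicated setup concrete, here is a minimal sketch of how such a volume could be created with the standard GlusterFS CLI. The server names match the ones in the df output on the next slide, but the brick paths and replica count are assumptions for illustration, not taken from the deck:

[root@wingu0 ~]# gluster volume create homes replica 2 \
    wingu0:/bricks/homes wingu1:/bricks/homes
[root@wingu0 ~]# gluster volume start homes
[root@wingu0 ~]# gluster volume info homes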

Page 9: Research computing at ILRI

How we use GlusterFS

[aorth@hpc: ~]$ df -h
Filesystem     Size  Used  Avail  Use%  Mounted on
...
wingu1:/homes   31T  9.5T   21T   32%   /home
wingu0:/apps    31T  9.5T   21T   32%   /export/apps
wingu1:/data    31T  9.5T   21T   32%   /export/data

- Persistent paths for homes, data, and applications across the cluster.
- These volumes are replicated, so essentially application-layer RAID1.
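For context, a hedged sketch of how a client node might mount one of the volumes shown above; the mount point matches the df output, but treat the exact mount invocation as an assumption rather than the cluster's actual configuration:

[root@compute0 ~]# mount -t glusterfs wingu1:/homes /home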

Page 10: Research computing at ILRI

GlusterFS <3 10GbE

Page 11: Research computing at ILRI

SLURM

- Project from Lawrence Livermore National Laboratory (LLNL)
- Manages resources
  - Users request CPU, memory, and node allocations
  - Queues / prioritizes jobs, logs usage, etc.
- More like an accountant than a bouncer
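To make the “accountant” role concrete, a hedged example of inspecting the queue and the accounting records with standard SLURM tools; the job ID, partition, and output values are illustrative, not real cluster data:

[aorth@hpc: ~]$ squeue -u aorth
  JOBID PARTITION     NAME     USER  ST   TIME  NODES NODELIST(REASON)
   1080     batch   blastn    aorth   R   1:23      1 compute0
[aorth@hpc: ~]$ sacct -j 1080 --format=JobID,JobName,AllocCPUS,Elapsed,State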

Page 12: Research computing at ILRI

Topology

Page 13: Research computing at ILRI

How we use SLURM

- Can submit “batch” jobs (long-running jobs, invoke a program many times with different variables, etc.) — see the sbatch sketch below
- Can run “interactively” (something that needs keyboard interaction)

Make it easy for users to do the “right thing”:

[aorth@hpc: ~]$ interactive -c 10
salloc: Granted job allocation 1080
[aorth@compute0: ~]$
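For the batch side, a minimal sketch of what a submission might look like; the script name, #SBATCH options, and BLAST command are assumptions for illustration, not taken from the slides:

[aorth@hpc: ~]$ cat blast.sbatch
#!/bin/bash
#SBATCH --job-name=blastn
#SBATCH --cpus-per-task=10
# Load the application and run it on the allocated CPUs
module load blast/2.2.28+
blastn -query sequences.fasta -db nt -num_threads 10 -out results.txt
[aorth@hpc: ~]$ sbatch blast.sbatch
Submitted batch job 1081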

Page 14: Research computing at ILRI

Managing applications

- Environment modules - http://modules.sourceforge.net
- Dynamically load support for packages in a user’s environment (example below)
- Makes it easy to support multiple versions, complicated packages with $PERL5LIB, package dependencies, etc.
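To show what “loading support” means in practice, a hedged sketch of inspecting a modulefile with module show; the modulefile location and the exact output format are assumptions, based on the application paths that appear later in the deck:

[aorth@hpc: ~]$ module show blast/2.2.28+
-------------------------------------------------------------------
/export/modules/blast/2.2.28+:

prepend-path    PATH /export/apps/blast/2.2.28+/bin
-------------------------------------------------------------------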

Page 15: Research computing at ILRI

Managing applications

Install once, use everywhere...

[aorth@hpc: ~]$ module avail blast
blast/2.2.25+  blast/2.2.26  blast/2.2.26+  blast/2.2.28+

[aorth@hpc: ~]$ module load blast/2.2.28+

[aorth@hpc: ~]$ which blastn
/export/apps/blast/2.2.28+/bin/blastn

Works anywhere on the cluster!

Page 16: Research computing at ILRI

Users and Groups

- Consistent UID/GIDs across systems (illustrated below)
- LDAP + SSSD (also from Red Hat) is a great match
- 389 LDAP works great with CentOS
- SSSD is simpler than pam_ldap and does caching
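As a hedged illustration of what consistent IDs look like in practice, the same lookup on two different nodes; the UID/GID numbers are made up for the example:

[aorth@hpc: ~]$ id aorth
uid=20001(aorth) gid=20001(aorth) groups=20001(aorth)
[aorth@compute0: ~]$ id aorth
uid=20001(aorth) gid=20001(aorth) groups=20001(aorth)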

Page 17: Research computing at ILRI

More information and contact

[email protected]
http://hpc.ilri.cgiar.org/