Page 1: Southgrid Status 2001-2011

Pete Gronbech: 30th August 2007

GridPP 19 Ambleside

Page 2: RAL PPD

• 2006: installed a large upgrade of 200 CPU cores (Opteron 270), equivalent to an extra 260 KSI2k, plus 86TB of storage.

• 50TB of this storage was loaned to the RAL Tier 1 and is now being returned.

• 10Gb/s connection to the RAL backbone

• 2007 upgrade of disk and CPU (see the consistency check after this list):
  – 13 x 6TB SATA disk servers, each with a 3Ware RAID controller and 14 x 500GB WD disks
  – 32 x dual Intel 5150 dual-core CPU nodes with 8GB RAM

• Will be installed in the Atlas Centre, due to power/cooling issues in R1

• 2005: 30 Xeon CPUs and 6.5TB of storage, supplemented by the mid-2006 upgrade
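
A rough consistency check of the 2007 disk-server figures above, assuming roughly two drives' worth of capacity per server is lost to RAID parity and/or a hot spare (the slide does not state the RAID level):

14 x 500GB = 7TB raw per server, giving the quoted ~6TB usable
13 servers x 6TB usable = 78TB of new storage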

Page 3: RAL PPD (2)

• Supports 22 different VOs, of which 18 have run jobs in the last year.

• RAL PPD has always supported a large number of VOs, which has helped ensure the cluster is fully utilised.

• Yearly upgrades are planned for the foreseeable future.

• The current computer room will have incremental upgrades to house the increased capacity.

• The RAL Tier 1 computer room can be used for overflow when needed.

Page 4: Status at Cambridge

• 2001: 20 x 1.3GHz P3 for EDG

• 2002: 3TB storage

• 2004: 32 x 2.8GHz Xeon CPUs

• 2005: DPM enabled

• 2006: local computer room upgraded

• Christmas 2006: 32 Intel 'Woodcrest' servers, giving 128 CPU cores, equivalent to 358 KSI2k

• June 2007: storage upgrade of 40TB, running DPM on SL4 64-bit

• Condor version 6.8.5 is being used

Page 5: Cambridge Futures

• CAMGRID
  – 430 CPUs across campus, mainly running Debian and Condor. Upgrades expected in 2008.

• Special Projects:
  – CAMONT VO supported at Cambridge, Oxford and Birmingham. Job submission by Karl Harrison and David Sinclair.
  – LHCb on Windows project (Ying Ying Li)
    • Code ported to Windows
      – HEP 4-node cluster
      – MS Research Lab 4-node cluster (Windows compute cluster)
    • Code running on a server at Oxford, with expansion on the OeRC Windows cluster

Page 6: Bristol History

• Early days with Owen Maroney on the UK EDG testbed.

• Bristol started in GridPP in July 2005 with 4 LCG service nodes and one 2-CPU WN.

• 2006: upgraded to 8 dual-CPU WNs; more may be added.

• 2007: 10TB storage upgrade

• When LCG is integrated with the Bristol HPC cluster (Blue Crystal), expected very soon, there will be a new CE and SE providing access to 2048 x 2.6GHz cores, using StoRM to make over 50TB of GPFS storage available to the Grid. This is a Clustervision / IBM SRIF-funded project.

• The HPC WN count should actually give closer to 3712 cores: 96 WNs with 2 x dual-core Opterons (4 cores/WN) plus 416 WNs with 2 x quad-core Opterons (8 cores/WN); see the cross-check after this list.

• Large water cooled computer room being built on the top floor of the physics building.

• Currently integrating the first phase of the HPC cluster (Baby Blue) with the LCG software.
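
As a cross-check of the 3712-core figure, using only the per-WN counts quoted above:

96 WNs x 4 cores/WN + 416 WNs x 8 cores/WN = 384 + 3328 = 3712 cores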

Page 7: Status at Bristol

Current GridPP cluster as at August 2007

Page 8: Status at Birmingham

• Currently SL3 with gLite 3

• CPUs: 28 x 2.0GHz Xeon (plus 98 x 800MHz)

• 10TB DPM storage service

• BaBar farm will be phased out as the new HPC cluster comes online.

• Runs the Pre-Production Service (PPS), which is used for testing new versions of the middleware.

• SouthGrid hardware support (Yves Coppens) is based here.

Page 9: Birmingham Futures

• Large SRIF-funded Clustervision / IBM University HPC cluster.

• The cluster is named Blue Bear. It has 256 x 64-bit AMD Opteron 2.6GHz dual-core sockets (1024 processing cores) with 8.0GB each; see the note after this list on how these figures fit together. GridPP should get at least 10% of the cluster usage.

• A second phase is planned for 2008.
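
Note on the Blue Bear figures above: 256 dual-core sockets taken literally would give 512 cores, so the quoted 1024 cores presumably corresponds to 256 nodes each holding two dual-core Opteron sockets (an assumed layout, not stated on the slide):

256 nodes x 2 sockets x 2 cores = 1024 cores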

Page 10: Early beginnings at Oxford

grid.physics.ox.ac.uk circa 2000-2003

• Following attendance at the RAL course given by Ian Foster's Globus team on 21st-23rd June 2000, initial installations on the grid test system used Globus 1.1.3, installed using the Globus method (Aug - Oct 2000).

• Later reinstalled from Andrew McNab's RPMs: 1.1.3-5 (July 2001) and 1.1.3-6 with a UK host certificate (Nov 2001)

• The grid machine was modified to be a front end for LHCb Monte Carlo; OpenAFS, Java, OpenPBS and OpenSSH installed (Nov 2001)

• First attempt used the kickstart (RH6.2) method; it crashed with anaconda errors.
  – Read on the TB-support mailing list that the kickstart method was no longer supported.

• Decided to try the manual EDG method.
  – Pulled all the CE RPMs to my NFS server. Tried a simple rpm -i *.rpm, which failed.

• Converted to using the LCFG method.

Page 11: Oxford goes to production status

2004-2007

• Early 2004 saw the arrival of two racks of Dell equipment providing 80 x 2.8GHz CPUs and 3.2TB of disk storage (a £60K investment of local Oxford money).

  – Compute Element
    • 37 worker nodes, 74 job slots, 67 KSI2k
      – 37 x dual 2.8GHz P4 Xeon, 2GB RAM
  – DPM SRM Storage Element
    • 2 disk servers, 3.2TB of disk space
    • 1.6TB DPM server, plus a second 1.6TB DPM disk pool node
  – Mon, LFC and UI nodes
  – GridMon network monitor

• 1Gb/s connectivity to the Oxford backbone
  – Oxford is currently connected at 1Gb/s to TVM

• Submission from the Oxford CampusGrid via the NGS VO is possible. Working towards NGS affiliation status.

• Planned upgrades for 2005 and 2006 were hampered by the lack of a decent computer room with sufficient power and cooling.

Page 12: Oxford Upgrade 2007

• 11 systems, 22 servers, 44 CPUs, 176 cores; the Intel 5345 Clovertown CPUs provide ~350 KSI2k (see the breakdown after this list).

• 11 servers, each providing 9TB of usable storage after RAID 6, ~99TB in total

• Two racks, 4 redundant management nodes, 4 PDUs, 4 UPSs
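
A rough breakdown of the figures above, assuming the 11 systems are twin chassis holding two dual-socket servers each (an assumption; the slide gives only the totals) and noting that the Intel 5345 Clovertown is a quad-core part:

11 systems x 2 servers = 22 servers
22 servers x 2 CPU sockets = 44 CPUs
44 CPUs x 4 cores = 176 cores
Storage: 11 servers x 9TB usable (after RAID 6) = ~99TB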

Page 13: Two New Computer Rooms will provide excellent infrastructure for the future 2007-2011

The new computer room being built at Begbroke Science Park, jointly for the Oxford supercomputer and the Physics department, will provide space for 55 (11kW) computer racks, 22 of which will be for Physics; up to a third of these can be used for the Tier 2 centre. This £1.5M project is funded by SRIF with a contribution of ~£200K from Oxford Physics.

All new Physics HPC clusters, including the Grid, will be housed here when it is ready in October / November 2007.

Page 14: Local Oxford DWB Computer room

Completely separate from the Begbroke Science Park, a computer room with 100kW of cooling and >200kW of power is being built, using ~£150K of Oxford Physics money.

A local Physics department infrastructure computer room (100kW) has been agreed.

It will be complete next week (Sept 2007).

This will relieve the local computer rooms and house T2 equipment until the Begbroke room is ready. Racks that are currently in unsuitable locations can be re-housed.

Page 15: Summary

• SouthGrid is set for substantial expansion following significant infrastructure investment at all sites.

• Birmingham: existing HEP and PPS clusters are running well; the new university cluster will be utilised shortly.

• Bristol: the small cluster is stable; the new university HPC cluster is starting to come online.

• Cambridge: cluster upgraded as part of the CamGrid SRIF3 bid.

• Oxford: resources will be upgraded in the coming weeks and installed into the new local computer room.

• RAL PPD: expanded last year and this year, well above what was originally promised in the MoU. Continued yearly expansion is planned.

SouthGrid Striding out into the future

To reach the summit of our ambitions for the Grid users of the future!!!

Page 16: Enjoy your walks, and recruit some new GridPP members