The University of Sunderland Cluster Computer IET Lecture by John Tindle Northumbria Network, ICT Group Monday 11 February 2008
Overview of talk
SRIF3 and Potential Vendors
General Requirements
Areas of Application
Development Team
Cluster Design
Cluster System Hardware + Software
Demonstrations
United Kingdom – Science Research Investment Fund (SRIF)
The Science Research Investment Fund (SRIF) is a joint initiative by the Office of Science and Technology (OST) and the Department for Education and Skills (DfES). The purpose of SRIF is to contribute to higher education institutions' (HEIs) long-term sustainable research strategies and address past under-investment in research infrastructure.
SRIF3
SRIF3 - 90% and UoS - 10%
Project duration about two years
Made operational by late December 2007
Heriot Watt University - coordinator
Potential Grid Computer Vendors
Dell – selected vendor
CompuSys – SE England
Streamline – Midlands
Fujitsu – Manchester
ClusterVision – Dutch
OCF – Sheffield
General requirements
High performance general purpose computer
Built using standard components
Commodity off the shelf (COTS)
Low cost PC technology
Reuse existing skills - Ethernet
Easy to maintain - hopefully
Designed for Networking Experiments
Require flexible networking infrastructure
Modifiable under program control
Managed switch required
Unmanaged switch often employed in standard cluster systems
Fully connected programmable intranet
System Supports
Rate limiting
Quality of service (QoS)
Multiprotocol Label Switching (MPLS)
VLANs and VPNs
IPv4 and IPv6 supported in hardware
Programmable queue structures
Special requirements 1
Operation at normal room temperature
Typical existing systems require a low air inlet temperature (< 5 degrees C) and a dedicated server room with air conditioning
Low acoustic noise output
Dual boot capability - Windows or Linux in any proportion
Special requirements 2 continued
Concurrent processing, for example:
Boxes with 75% of cores for Windows
Boxes with 25% of cores for Linux
CPU power control – 4 levels
High resolution displays for media and data visualisation
Advantages of design
Heat generated is not vented to the outside atmosphere
Air conditioning running costs are not incurred
Heat is used to heat the building
Compute nodes (height 2U) use relatively large diameter, low noise fans
Areas of application
1. Media systems – 3D rendering
2. Networking experiments (MSc Network Systems – large cohort)
3. Engineering computing
4. Numerical optimisation
5. Video streaming
6. IP Television
Application cont 1
7. Parallel distributed computing
8. Distributed databases
9. Remote teaching experiments
10. Semantic web
11. Search large image databases
12. Search engine development
13. Web based data analysis
Application cont 2
14. Computational fluid dynamics
15. Large scale data visualisation using high resolution colour computer graphics
UoS Cluster Development Team
From left to right: Kevin Ginty, Simon Stobart, John Tindle, Phil Irving, Matt Hinds
Note - all wearing Dell tee shirts
UoS Team
UoS Cluster
Work Area - at last, all up and running!
UoS Estates Department
Very good project work was completed by the UoS Estates Department:
Electrical network design
Building air flow analysis - Computing Terraces
Heat dissipation - finite element (FE) study and analysis
Work area refurbishment
Cluster Hardware
The system has been built using:
Dell compute nodes
Cisco networking components
Grid design contributions from both Dell and Cisco
Basic Building Block
Compute nodes - Dell PE2950 server
Height 2U
Two dual core processors, four cores per box
RAM 8G, 2G per core
http://157.228.27.155/website/CLUSTER-GRID/Dell-docs1/
Compute Nodes
Network interface cards - 3 off
Local disk drives - 250G SATA II
The large amount of RAM facilitates virtual computing experiments using VMware Server and MS Virtual PC
Cisco 6509 switch
Cisco 6509 (1 off)
Cisco 720 supervisor engines (2 off)
Central network switch for the cluster
RSM - router switch module
The 6509 provides:
720 Gbps full duplex switching (4 off port cards)
Virtual LANs - VLAN
Virtual private networks - VPN
Link bandwidth throttling
Traffic prioritisation, QoS
Network experimentation
Cluster Intranet
The network has three buses:
1. Data
2. IPC
3. IPMI
1. Data bus
User data bus
A normal data bus required for interprocessor communication between user applications
2. IPC Bus
Inter process communication (IPC)
"The Microsoft Windows operating system provides mechanisms for facilitating communications and data sharing between applications. Collectively, the activities enabled by these mechanisms are called interprocess communications (IPC). Some forms of IPC facilitate the division of labor among several specialized processes."
IPC Bus continued
“Other forms of IPC facilitate the division of labor among computers on a network”.
Ref: Microsoft website
IPC is controlled by the OS
For example, IPC is used to transfer and install new disk images on compute nodes
Disk imaging is a complex operation
3. IPMI Bus
The Intelligent Platform Management Interface (IPMI) specification defines a set of common interfaces to computer hardware and firmware which system administrators can use to monitor system health and manage the system.
Master Rack A
Linux and Microsoft
2 – PE2950 control nodes
5 – PE1950 web servers
Cisco Catalyst 6509
2 * 720 supervisor engines
4 * 48 port cards (192 ports)
Master Rack A cont
Compute nodes require 40 * 3 = 120 connections
Disk storage: 1 – MD1000
http://157.228.27.155/website/CLUSTER-GRID/Dell-docs1/
Master rack resilient to mains failure
Power supply 6 kVA APC (hard wired 24 Amp PSU)
Master Rack A KVM Switch
Ethernet KVM switch
Keyboard, Video display, Mouse - KVM
Provides user access to the head nodes
Windows head node, named "Paddy"
Linux head node, named "Max"
Movie USCC - MVI_6991.AVI
Rack B - Infiniband
InfiniBand is a switched fabric communications link primarily used in high-performance computing. Its features include quality of service and failover, and it is designed to be scalable. The InfiniBand architecture specification defines a connection between processor nodes and high performance I/O nodes.
Infiniband Rack B
6 – PE2950, each with two HCAs
1 – Cisco 7000P router
Host channel adapter (HCA) link:
http://157.228.27.155/website/CLUSTER-GRID/Cisco-docs1/HCA/
http://en.wikipedia.org/wiki/InfiniBand
Cisco Infiniband
Cisco 7000P
High speed bus - 10 Gbits/sec
Low latency - < 1 microsec
Infiniband connects 6 compute nodes (24 CPU cores)
High speed serial communication
Many parallel channels
PCI Express bus (serial DMA)
Direct memory access (DMA)
General compute - Rack C
11 – PE2950 compute nodes
Product details
Racks
A*1 - 2 control (+5 servers), GigE
B*1 - 6 Infiniband (overlay)
C*3 - 11 (33), GigE
N*1 - 1 (Cisco Netlab + VoIP)
Total compute nodes: 2 + 6 + 33 + 1 = 42
Rack Layout
- C C B A C N -
F C C B A C N F
F = future expansion
KVM video - MVI_6994.AVI
Summary - Dell Server 2950
Number of nodes: 40 + 1 (Lin) + 1 (Win)
Number of compute nodes: 40
Intel Xeon Woodcrest 2.66GHz
Two dual core processors
GigE NICs – 3 off per server
RAM 8G, 2G per core
Disks 250G SATA II
Summary - cluster speedup
Compare time taken to complete a task:
Time on cluster = 1 hour
Time using a single CPU = 160 hours, or 160/24 = 6.6 days (approx 1 week)
Facility available for use by companies and "Software City" startup companies
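A rough check on these figures: the 40 compute nodes provide 40 * 4 = 160 cores, so a task that parallelises well could in principle run about 160 times faster than on a single core, which is consistent with the 160 hour versus 1 hour comparison above.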
Data storage
Master nodes connect via PERC5e to MD1000 using 15 x 500G SATA drives
Disk storage 7.5T
Linux - 7 disks
MS 2003 Server HPC - 8 disks
MD1000 URL: http://157.228.27.155/website/CLUSTER-GRID/Dell-docs2/
Power
Total maximum load generated by Dell cluster cabinets
Total load = 20,742 W (approx 20.7 kW)
Values determined using Dell's integrated system design tool for power and noise
Web servers
PE1950, height 1U
Five servers
Web services
Domain controller, DNS, DHCP etc
http://157.228.27.155/website/CLUSTER-GRID/Dell-docs1/
Access Workstations
Dell workstations (10 off)
Operating system: WinXP Pro
HD displays: LCD (4 off), 32 inch, wall mounted
Graphics: NVS285 – 8 * 2 GPUs
Graphics: NVS440 – 2 * 4 GPUs
Graphics processor units - support for HDTV
Block Diagram
[Diagram: USCC system block diagram - labels summarised below]
Cisco 6509 switch: 2 * 720 supervisor engines, 720 Gb full duplex, 40 Gb per slot, 720 Gb aggregate bandwidth, 4 * 48 port line cards (192 ports), 1 Gb copper Ethernet links
Switch support for VPNs, QoS, MPLS, rate limiting, private VLANs, IPv4 and IPv6 routing in hardware
Connection to campus network
Cluster gateway / web servers: 5 * PE1950 (3 * Win2003 server, 2 * Linux), online support for users
Compute nodes (PE2950): 42 nodes in total - 2 head nodes (Lin/Win) + 40 compute nodes; 2 CPUs/node, 2 cores/CPU, 4 cores/node; RAM 8Gb; 250Gb SATA local disk
GigE intranets: 3 LANs (data, control/monitor, spare)
Infiniband overlay: 6 nodes, 2 * HCA/node, 10Gbps links, Infiniband switch - 7000P
NAS 7.5Tb: 8 * 500Gb Lin, 7 * 500Gb Win
Distributed storage: 40 * 250Gb = 10Tb
Local access workstations: 10 * Dell PCs
Visualisation of data, display area: 4 * 37 inch LCD flat screens
University of Sunderland Cluster Computer - USCC
Movie USCC - MVI_6992.AVI
Cluster Software
Compute Node Operating Systems:
Scientific Linux (based on Red Hat)
MS Windows Server 2003
High performance computing - HPC
Scali
Scali - management software for high performance cluster computers
Scali is used to control the cluster: start and stop processes, upload data/code and schedule tasks
Scali datasheet: http://www.scali.com/
http://157.228.27.155/website/CLUSTER-GRID/Scali/
Other software
Apache web services
Tomcat - Java server side programming
Compilers: C++, Java
Servers: FTPD
3D modelling and animation: Blender, Autodesk 3DS Max
Virtual Computing
Virtual Network Security Experiment - example
Virtual Network - VMware Appliances
Components:
(1) NAT router
(2) WinXP-sp2 - attacks FC5 across the network
(3) Network hub - interconnection
(4) Firewall - protection
(5) Fedora Core FC5 - target system
Network Security Experiment
[Diagram: network security experiment running entirely on one VMware host]
XPProSP2 attacker (Eth0) on the NAT network (VMnet8)
NAT firewall with Red and Green Ethernet interfaces (eth1, eth0) - forwards port 80 from Red to FC5's IP
HUB / SW2 on VMnet4 interconnecting the internal network
FC5 target (Eth0) - loads the Apache (httpd) web server
Security Experiment
A total of 5 virtual networking devices using just one compute box
Port scanning attack (Nessus)
Intrusion detection (Snort)
Tunnelling using SSH and PuTTY
RAM required: 500K+ for each network component
Cisco Netlab
Cisco Netlab provides remote access to network facilities for experimental purposes
Netlab is installed in the Network cabinet
Plus a VoIP demonstration system for teaching purposes
Network Research
Current Research - Network Planning
Network Planning Research
Network model using OOD
Hybrid parallel search algorithm based upon features of:
Parallel genetic algorithm (GA)
Particle swarm optimisation (PSO)
Ring of communicating processes (see the sketch below)
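The slides do not give the implementation of the hybrid search; the following is a minimal Java sketch of the ring idea only, assuming an island-model arrangement in which each worker improves a candidate solution locally (a stand-in for the GA/PSO hybrid) and periodically passes its best value to the next worker in the ring. The worker count, objective function and migration interval are illustrative assumptions.

    import java.util.Random;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Minimal island-model ring: each worker runs a local random search
    // (a placeholder for the GA/PSO hybrid) and periodically migrates its
    // best solution to the next worker in the ring.
    public class RingSearchSketch {
        static final int WORKERS = 4;       // e.g. one per core on a node (assumption)
        static final int ITERATIONS = 1000;

        // Toy objective: minimise f(x) = (x - 3)^2
        static double fitness(double x) { return (x - 3.0) * (x - 3.0); }

        public static void main(String[] args) throws InterruptedException {
            // One inbox per worker; worker i sends to worker (i + 1) % WORKERS.
            @SuppressWarnings("unchecked")
            BlockingQueue<Double>[] inbox = new BlockingQueue[WORKERS];
            for (int i = 0; i < WORKERS; i++) inbox[i] = new ArrayBlockingQueue<>(16);

            Thread[] threads = new Thread[WORKERS];
            for (int i = 0; i < WORKERS; i++) {
                final int id = i;
                threads[i] = new Thread(() -> {
                    Random rnd = new Random(id);
                    double best = rnd.nextDouble() * 10.0;
                    for (int it = 0; it < ITERATIONS; it++) {
                        // Local improvement step (placeholder for a GA/PSO update)
                        double candidate = best + rnd.nextGaussian();
                        if (fitness(candidate) < fitness(best)) best = candidate;

                        // Accept any migrant from the previous worker in the ring
                        Double migrant = inbox[id].poll();
                        if (migrant != null && fitness(migrant) < fitness(best)) best = migrant;

                        // Every 100 iterations, migrate our best to the next worker
                        if (it % 100 == 0) inbox[(id + 1) % WORKERS].offer(best);
                    }
                    System.out.printf("worker %d best x = %.4f%n", id, best);
                });
                threads[i].start();
            }
            for (Thread t : threads) t.join();
        }
    }

On the cluster the ring members would be processes on different compute nodes exchanging solutions over the data bus rather than threads sharing queues; the sketch only shows the ring migration pattern.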
Network Planning Research
Web services
Server side programs - JSP
FTP Daemon, URL objects, XML
Pan Reif solver based on Newton's Method
Steve Turner, PhD student - submitting May 2008, first to use the UoS Cluster Computer (USCC)
Hybrid GA
Telecom Network Planning - DSL for ISP
DSL Network Plan Schematic Diagram
Numerical output from GA optimiser – PON Equipment
Data visualisation - multidimensional data structure: location, time and service types
Demonstrations
1. IPTV
2. Java test program
Demonstration 1 - IPTV
IP television demonstration
IP - internet protocol
Video LAN Client – VLC
Number of servers and clients – 10
Video streams, standard definition: 4 to 5 Mbps
Multicasting - Class D addressing (see the receiver sketch after this list)
IPTV - IGMP
Internet Group Management Protocol
Video streams, HD: 16 Mbps - HD only uses 1.6% of 1 Gbps
Rudolf Nureyev dancing
Six Five Special 1957 - Don Lang and the Frantic Five
New dance demonstration - Bunny Hop
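To illustrate the Class D multicast addressing and IGMP membership used in the IPTV demonstration, here is a minimal Java sketch of a multicast receiver; the group address 239.1.1.1 and port 5004 are placeholder values, not the addresses used in the demonstration.

    import java.net.DatagramPacket;
    import java.net.InetAddress;
    import java.net.MulticastSocket;

    // Minimal multicast receiver: joining the group causes the host to send
    // an IGMP membership report so the switch forwards the video stream.
    public class MulticastReceiverSketch {
        public static void main(String[] args) throws Exception {
            InetAddress group = InetAddress.getByName("239.1.1.1"); // placeholder class D address
            int port = 5004;                                        // placeholder port
            MulticastSocket socket = new MulticastSocket(port);
            socket.joinGroup(group);                                // triggers IGMP join
            byte[] buffer = new byte[2048];
            for (int i = 0; i < 1000; i++) {                        // read a bounded number of packets
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet);
                System.out.println("received " + packet.getLength() + " bytes");
            }
            socket.leaveGroup(group);                               // IGMP leave
            socket.close();
        }
    }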
Demonstration 2
Java demonstration test program
Compute node processes: 40
Workstation server: 1
Communication via UDP
Graphical display on local server of data sent from compute nodes
Network configuration – star
Star network
[Diagram: star network - Node 1, Node 2, ... Node 39, Node 40, each connected to the Server Node]
Cluster Demonstration Program
Star
Ring
Cluster configuration file - description of file ipadd.txt:
1 - Node id
192.168.1.50 - Hub server address
192.168.1.5 - Previous compute node
192.168.1.7 - Next compute node
192.168.1.51 - Hub2 (spare)
Equation double val = 100 * ( 0.5 + Math.exp(-t/tau) * 0.5 * Math.sin(theta)) ;
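The demonstration program itself is not reproduced in the slides. The sketch below shows one way the compute-node side might evaluate this expression and send the result to the hub server over UDP for the bar graph display; the port, the decay constant tau, the phase step and the simple "id:value" text format are assumptions, while the hub address 192.168.1.50 comes from the ipadd.txt description above.

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;

    // Compute-node side of the star demo: evaluate the damped sine value and
    // send it to the hub server as a small UDP datagram. The port and the
    // "id:value" text format are assumptions, not taken from the slides.
    public class ComputeNodeSender {
        public static void main(String[] args) throws Exception {
            int nodeId = Integer.parseInt(args[0]);                  // node id, as in ipadd.txt
            InetAddress hub = InetAddress.getByName("192.168.1.50"); // hub server address (from the slide)
            int port = 9000;                                         // assumed port
            double tau = 50.0;                                       // assumed decay constant

            DatagramSocket socket = new DatagramSocket();
            for (int t = 0; t < 1000; t++) {
                double theta = 2 * Math.PI * t / 100.0;              // assumed phase step
                double val = 100 * (0.5 + Math.exp(-t / tau) * 0.5 * Math.sin(theta));
                byte[] payload = (nodeId + ":" + val).getBytes();
                socket.send(new DatagramPacket(payload, payload.length, hub, port));
                Thread.sleep(100);                                   // pace the updates
            }
            socket.close();
        }
    }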
Screenshot of hub server bar graph display
USCC configuration
Single demo in a compute node: dirs 1 + 4 = 5 (top level + one per core)
All compute nodes: 40 * 5 = 200
Workstations: 10, so 10 * 200 = 2000
Ten demos: 10 * 2000 = 20,000 directories to set up
Java program to configure cluster (a sketch follows)
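The configuration program itself is not shown in the slides; the following is a minimal Java sketch of how such a directory tree could be generated, using the counts from this slide (10 demos, 10 workstations, 40 nodes, 4 cores per node). The root path and directory naming scheme are illustrative assumptions.

    import java.io.File;

    // Sketch of a cluster set-up tool: create one top-level directory per node
    // plus one per core, repeated for every workstation and every demo.
    // Counts come from the slide; the root path and names are assumptions.
    public class ClusterDirSetup {
        public static void main(String[] args) {
            File root = new File("uscc-demos");   // assumed root directory
            int created = 0;
            for (int demo = 1; demo <= 10; demo++) {
                for (int ws = 1; ws <= 10; ws++) {
                    for (int node = 1; node <= 40; node++) {
                        File nodeDir = new File(root,
                            "demo" + demo + "/ws" + ws + "/node" + node);
                        if (nodeDir.mkdirs()) created++;              // top-level node directory
                        for (int core = 1; core <= 4; core++) {
                            if (new File(nodeDir, "core" + core).mkdirs()) created++;
                        }
                    }
                }
            }
            System.out.println(created + " directories created");
        }
    }

Running it creates 4,000 node directories plus 16,000 core directories, matching the 20,000 total above.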
UoS Cluster Computer Inaugural Event
Date: Thursday 24 April 2008
Time: 5.30pm
Venue: St Peter's Campus
Three speakers (each 20 minutes):
John MacIntyre - UoS
Robert Starmer - Cisco, San Jose
TBA - Dell Computers
USCC Inaugural Event
Attendance is free
Anyone wishing to attend is asked to register beforehand to facilitate catering
Contact via email: [email protected]
The End
Thank you for your attention. Any questions?
Slides and further information available at URL http://157.228.27.155/website/CLUSTER-GRID/