21 October 2003, CERN IT-PS-UI. Solaris status and plans, HEPIX Autumn 2003. Ignacio Reguero, Michel Manent, Carlos Ungil; presented by Sebastian Lopienski


Solaris status and plans

HEPIX Autumn 2003

Ignacio Reguero, Michel Manent, Carlos Ungil

presented by

Sebastian Lopienski


Executive Summary

• Current Status
  – Some figures
• SUNINST0 Network Installation Server
• CAE Server Upgrade
• SUNDEV Technology Refresh
  – 10 new Sun Fire V210s
• Implementation of EDG WP4 Quattor fabric management on Solaris
  – System administration view
• Solaris 9 Certification
• Sun Blade Server 1600 and N1 Management


Current Status of Solaris Usage at CERN

• Second platform for LHC physics
  – Mostly for validation purposes (numerical software)
• Total population of 663 active nodes
  – Figures from LanDB network database
• Around 300 on Solaris 8
• Rest: about half running Solaris 2.6 and half on Solaris 7
  – Problem: most of these machines cannot upgrade the OS without a hardware upgrade (disks and memory)
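
As a rough cross-check of the figures above, the breakdown can be worked out as follows. This is a back-of-the-envelope sketch: the inputs are the rounded numbers from the slide, so every result is approximate, and the variable names are only illustrative.

```python
# Rounded figures taken from the slide; everything derived from them is approximate.
total_nodes = 663        # active Solaris nodes in the network database
solaris8_nodes = 300     # "around 300" nodes on Solaris 8

# The rest run Solaris 2.6 or Solaris 7, split roughly half and half.
older_nodes = total_nodes - solaris8_nodes
per_old_release = older_nodes / 2

# Share of the population still on the two older releases.
older_share_percent = round(100 * older_nodes / total_nodes)

print(f"{older_nodes} nodes on older releases, about {per_old_release:.0f} per release")
print(f"roughly {older_share_percent}% of all active nodes")
```

So somewhat more than half of the population is on releases where an OS upgrade usually also means a hardware upgrade.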


SUNINST0 Network Installation Server

• Jumpstart server
• Network configurations and responsible persons now fully extracted from LANDB network database
  – With a single fetch procedure
• After the router fix, the Sun DHCP server is stable
  – At the request of the SM18 LHC Magnet Test, had to demonstrate booting exotic devices (such as data acquisition devices) from it
• However, still working with the CS group to replace the DHCP server with the one run by CS
  – SOAP interface to an Oracle DB will allow us to make updates
    • Similar to the one in place for the Print DNS hierarchy
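
To illustrate how host data fetched from a network database could feed a Jumpstart server, here is a minimal sketch that turns host records into entries for the Jumpstart `rules` file (format: keyword, value, begin script, profile, finish script, with `-` for "none"). The host names, profile names and record layout are hypothetical; the slides do not describe the actual fetch procedure or its output.

```python
# Hypothetical records, standing in for what a fetch from the network database returns.
hosts = [
    {"name": "sundev01", "profile": "sundev_prof"},
    {"name": "cae01", "profile": "cae_prof"},
]


def rules_line(host, begin="-", finish="-"):
    """Return one rules entry: keyword value begin-script profile finish-script."""
    return f"hostname {host['name']} {begin} {host['profile']} {finish}"


def build_rules(hosts, fallback_profile="default_prof"):
    """Build the text of a rules file, one entry per host plus a catch-all."""
    lines = [rules_line(h) for h in sorted(hosts, key=lambda h: h["name"])]
    # Catch-all rule for machines that are not listed explicitly.
    lines.append(f"any - - {fallback_profile} -")
    return "\n".join(lines) + "\n"


rules_text = build_rules(hosts)
print(rules_text, end="")
```

A generated rules file would still have to be validated on the install server before clients can use it.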


CAE Server Upgrade (1)

• CAE: electronic design cluster
• To serve the electronics design community
• New server: V480
  – 4 x 900 MHz CPU
  – 8 GB SDRAM
  – Gigabit Ethernet
• A1000 RAID disk box
  – 436 GB with RAID 5: space for users
• Had to coordinate IDPROM change with disk movement
  – Hit technical and sociological problems
  – IDPROM change: needed to keep old Cadence licenses on the new server, but Sun reluctant to do it for new HW models
  – In the end Sun provided an ad hoc solution that does not support OBP upgrade
    • Found out the hard way!
    • And an OBP upgrade is needed for A1000 support


CAE Server Upgrade (2)

• IDPROM problem not solved yet
  – Considering using another machine
    • Cannot be too new (works on V220 or 280R)
• Also lots of A1000 RAID box problems
  – RAID Manager software has to be coordinated with the firmware level in the controller and OBP
    • So lots of upgrades required before connecting old A1000s to the new server
  – After first installation, additional A1000s are not seen unless entries are added by hand to /kernel/drv/sd.conf with the relevant SCSI ID
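
The hand-added sd.conf entries mentioned above follow a simple one-line pattern per SCSI target and LUN. The sketch below only generates such lines; the target ID and LUN range are made-up example values, not the ones used on the CAE server.

```python
def sd_conf_entries(target, luns):
    """Return sd.conf lines for one SCSI target and the given LUNs."""
    return [f'name="sd" class="scsi" target={target} lun={lun};' for lun in luns]


# Hypothetical example: an additional A1000 at SCSI ID 4 exposing LUNs 0 to 2.
entries = sd_conf_entries(target=4, luns=range(3))
print("\n".join(entries))
```

After such lines are appended to /kernel/drv/sd.conf, a reconfiguration boot is normally needed before the new devices show up.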


[Figure: CAE Server Upgrade]


SUNDEV Technology Refresh

• A cluster for physics development
• 10 Sun Fire V210
  – State of the art SPARC machines
  – Thin rack-mountable servers
    • 1 unit high, for 19” racks
    • They all fit in a single CERN rack together with the Gigabit switch and the Sun blade server
  – Dual 1 GHz UltraSPARC-IIIi
  – 2 GB memory
  – 2 x 36 GB disk drives
  – 4 x Gigabit Ethernet on the motherboard
• They are being installed with Solaris 8.7.3 (the latest release, required for this hardware), later Solaris 9
• Performance improvement of at least 120% over the current SUNDEV machines


[Figure: SUNDEV Technology Refresh]


Implementation of EDG WP4 Quattor fabric management on Solaris (1)

• We plan to use Quattor to manage all Solaris systems
  – From Solaris 9 onwards
• What does it mean for us?
  – Central Configuration DataBase (CDB)
    • Configuration information
    • Software to be installed
      – Both applications and system
    • A cache manager provided for the client accessing the DB
      – To avoid dependency on the DB server or on the network
    • The configuration database is linked to the network installation server
      – The Jumpstart profile is generated from the database
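
Generating a Jumpstart profile from configuration data could look roughly like this. It is a minimal sketch, not the actual CDB-to-Jumpstart tool: the `node` dictionary is a hypothetical stand-in for data held in the configuration database, and the cluster and package names are example values. The profile keywords themselves (`install_type`, `system_type`, `partitioning`, `cluster`, `package`) are standard Jumpstart keywords.

```python
# Hypothetical node description, standing in for data held in the configuration database.
node = {
    "install_type": "initial_install",
    "system_type": "standalone",
    "partitioning": "default",
    "cluster": "SUNWCprog",
    "packages": {"SUNWman": "delete", "SUNWbash": "add"},
}


def jumpstart_profile(node):
    """Render a Jumpstart profile (one keyword per line) from a node description."""
    lines = [
        # install_type has to be the first keyword in a profile.
        f"install_type {node['install_type']}",
        f"system_type {node['system_type']}",
        f"partitioning {node['partitioning']}",
        f"cluster {node['cluster']}",
    ]
    # Individual packages added to or removed from the chosen software cluster.
    for name, action in sorted(node["packages"].items()):
        lines.append(f"package {name} {action}")
    return "\n".join(lines) + "\n"


profile_text = jumpstart_profile(node)
print(profile_text, end="")
```

Keeping the profile as a pure function of the node description means the install server never holds hand-edited state that could drift from the database.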


Implementation of EDG WP4 Quattor fabric management on Solaris (2)

– Node Configuration Manager (NCM)
  • A la SUE
  • For configuration “components”
    – NCM components are simplified SUE features
  • They have a single action: “configure”
  • They access the Configuration DB through the cache manager
– SPMA software distributor (package level)
  • Replaces ASIS software distribution (file level)
  • For Linux it uses RPMs; for Solaris it is implemented with Solaris PKG
  • Allows packages to be installed from various SW repositories
  • Several protocols supported: HTTP, file system (AFS), FTP, etc.
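
The idea of a component with a single "configure" action that reads its settings through a cache can be sketched as below. This is a toy model under stated assumptions, not the NCM or cache manager API: the class names, the configuration path and the NTP example are all hypothetical.

```python
class CachedConfig:
    """Toy stand-in for a cache manager: serves the last profile it fetched."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable returning the node profile as a dict
        self._cached = None

    def get(self, path, default=None):
        try:
            self._cached = self._fetch()
        except OSError:
            # Server or network unavailable: fall back to the cached copy, if any.
            if self._cached is None:
                raise
        return self._cached.get(path, default)


class NtpComponent:
    """Hypothetical component whose only action is configure()."""

    def __init__(self, config):
        self.config = config

    def configure(self):
        servers = self.config.get("/software/components/ntp/servers", [])
        # Return the file content instead of writing it, to keep the sketch side-effect free.
        return "".join(f"server {name}\n" for name in servers)


calls = {"n": 0}


def flaky_fetch():
    """Succeeds once, then simulates an unreachable configuration server."""
    calls["n"] += 1
    if calls["n"] > 1:
        raise OSError("configuration server unreachable")
    return {"/software/components/ntp/servers": ["ntp1.example.org", "ntp2.example.org"]}


component = NtpComponent(CachedConfig(flaky_fetch))
first = component.configure()
second = component.configure()   # served from the cache, the fetch failed
print(first, end="")
```

The second call still produces the same output, which is the point of the cache: a node can be reconfigured while the database server or the network is down.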


Implementation of EDG WP4 Quattor fabric management on Solaris (3)

• Still working on
  – Creation of Solaris NCM components from existing SUE features (Juan Pelegrin)
  – DB access control
    • For delegation
  – Behavior with “unmanaged” software
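
The open question about "unmanaged" software can be made concrete with a small sketch of a package-level planner: it compares the desired package list with what is installed and decides what to do with packages that nobody declared. This is an illustration only, not SPMA's actual algorithm or file format, and the package names and versions are invented.

```python
def plan_operations(desired, installed, keep_unmanaged=True):
    """Compare {package: version} maps and return a list of (action, name, version)."""
    ops = []
    for name, version in sorted(desired.items()):
        if name not in installed:
            ops.append(("install", name, version))
        elif installed[name] != version:
            ops.append(("replace", name, version))
    if not keep_unmanaged:
        # Packages present on the node but absent from the desired list.
        for name in sorted(set(installed) - set(desired)):
            ops.append(("remove", name, installed[name]))
    return ops


# Hypothetical package lists.
desired = {"SUNWbash": "2.05", "CERNperl": "5.8.0"}
installed = {"SUNWbash": "2.03", "LOCALtool": "1.0"}

tolerant = plan_operations(desired, installed, keep_unmanaged=True)
strict = plan_operations(desired, installed, keep_unmanaged=False)
print(tolerant)
print(strict)
```

The two policies differ only in whether locally installed packages survive, which is exactly the behaviour that still has to be decided.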

[Diagram labels: CDB, pan, xml, Host, NCM, SPMA, target.cf, PKG, REPOSITORY]


Solaris 9 Certification

• Validating Solaris 9: running all SW on the new system
• Timescale: end of 2003
  – Refsol9 reference machine now available
• No big changes in terms of Solaris, but new features:
  – “Web Start Flash Archives”: system images for installation
    • Nice for farms (but for same HW)
  – Resource pools
    • Guaranteed resources for an application on large shared systems
  – Gnome 2.0 is the standard desktop environment
  – We deliver Mozilla 1.4 (instead of Netscape recommended by Sun)
  – Sun ONE Studio 8 as default compiler
• Replacement of ASIS and SUE with Quattor
• More Open Source software packaged with the system
  – Perl, Bash, …
  – Some of these products supported on the same basis as Sun native ones
  – Probably an occasion to reduce the number of products maintained by us


[Figure: Solaris Reference machines + Installation Server]


Sun Blade Server 1600

• Sun Blade server 1600
  – Packaged farm
  – Fits in 3 units of a 19” rack
  – SSC Controller with gigabit switch that manages up to 16 CPUs
    • Several Gigabit Ethernet external connections
    • VLAN with 16 Gigabit Ethernet interfaces
    • Protection against attacks through packet filter configuration
    • Console through serial port for each blade
• Received 12 x 650 MHz UltraSPARC-IIe
• Waiting for 4 “Intel Compatible” CPUs
  – AMD Athlon XP-M 1.2 GHz
• Other specialized blades supported on hardware level
  – SSL Encryptor
  – Load Balancer


Sun Blade Server 1600 system chassis

[Diagram: SSC0 (active) and SSC1 (standby), each with a switch fabric serving slots 0 to 15; an external switch; blades 0 to 15, each with interfaces ce0 and ce1 on 137.138.x.x]


[Figure: Sun Blade Server 1600]


Sun Blade Server 1600 Installation and Configuration

• Fully automated network installation (DHCP) using Jumpstart from SUNINST0
  – Initial configuration, installation & application software
    • One private IP address for each System Controller
    • One IP address for each Blade
• Ongoing test of Web Start Flash Archives
  – Intended to quickly replicate one blade’s operating environment & application software on other blades
• Sun VTS (Validation Test Suite) online diagnostics tool
  – Verifies configuration and functionality of hardware controllers, devices and platforms
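
Replicating a blade from a Flash archive is driven by a Jumpstart profile whose install type is `flash_install`. The sketch below renders such a profile; the server name, archive path, disk name and swap size are assumptions made for the example, since the slides do not say where the archives are stored or how the blades are partitioned.

```python
def flash_profile(archive_server, archive_path, root_disk="c0t0d0", swap_mb=1024):
    """Render a Jumpstart profile that installs a node from a Flash archive over NFS."""
    lines = [
        "install_type flash_install",
        f"archive_location nfs {archive_server}:{archive_path}",
        "partitioning explicit",
        f"filesys {root_disk}s1 {swap_mb} swap",
        f"filesys {root_disk}s0 free /",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical archive location; one profile can serve every blade with identical hardware.
profile = flash_profile("installserver.example.org", "/export/flash/blade.flar")
print(profile, end="")
```

Because the archive is a full system image, the same profile is only suitable for blades with the same hardware as the one the archive was created on.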


Sun N1 System Management Framework

• Sun N1 Provisioning Server 3.0 Blade Edition being tested
  – Automates configuration and deployment of different kinds of servers
    • Including specialized servers
    • Assignment may vary according to a schedule or other input: dynamic management of clusters
• To compare N1 with Quattor functionality
• Question: could N1 manage heterogeneous farms outside the Blade server scope?
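
What "assignment according to a schedule" means can be shown with a toy model. This is not the N1 Provisioning Server interface; the roles, time windows and blade counts are invented purely to illustrate the idea of dynamic cluster management.

```python
from datetime import time

# Hypothetical schedule: (window start, window end, {role: number of blades}).
SCHEDULE = [
    (time(8, 0), time(20, 0), {"interactive": 8, "batch": 4}),
    (time(20, 0), time(8, 0), {"interactive": 2, "batch": 10}),
]


def in_window(now, start, end):
    """True if now falls in [start, end), also for windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def assign(blades, now):
    """Split the list of blades between roles according to the schedule entry for now."""
    for start, end, roles in SCHEDULE:
        if in_window(now, start, end):
            assignment, remaining = {}, list(blades)
            for role, count in roles.items():
                assignment[role], remaining = remaining[:count], remaining[count:]
            return assignment
    raise ValueError("no schedule entry covers this time")


blades = [f"blade{i:02d}" for i in range(12)]   # the 12 blades received so far
day = assign(blades, time(12, 0))
night = assign(blades, time(23, 0))
print(len(day["batch"]), len(night["batch"]))
```

The same set of blades serves mostly interactive work during the day and mostly batch work at night, without any blade being reinstalled by hand.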


Questions?

Unix Infrastructure section: http://cern.ch/product-support/UI
