New network architecture of the CC, Guillaume Cessieux, CC-IN2P3 network team, CC seminar, 2010-12-03
Page 1

New network architecture of the CC

Guillaume Cessieux

CC-IN2P3 network team

CC seminar

2010-12-03

Page 2

Latest major network upgrade: Mid 2009

Backbone 20G → 40G

– No topology change, only additional bandwidth

Linking abilities tripled

– Distribution layer added

[Figure: two diagrams, each showing the Computing, Storage FC+TAPE and Storage SATA areas]

Page 3

Previous network architecture: Mid 2009

Computing

– 36 computing racks, 34 to 42 servers per rack

– 1G per server

– 1 switch/rack (36 access switches), 48x1G/switch, 1x10G uplink

– 3 distributing switches, linked to backbone with 4x10G

Storage

– Data FC: 27 servers, 10G/server

– Tape: 10 servers, 10G/server

– Data SATA: 816 servers in 34 racks, 2x1G per server

– 24 servers per switch, 34 access switches with trunked uplink 2x10G

– 2 distributing switches, linked to backbone with 4x10G

Backbone 40G

WAN
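The figures on this slide give a rough oversubscription ratio for each access layer, that is, the aggregate server-facing bandwidth divided by the uplink bandwidth. The following is a minimal Python sketch under stated assumptions: the function name is illustrative, and the storage line assumes that the 24 servers per switch are the ones connected at 2x1G behind the trunked 2x10G uplink.

```python
def oversubscription(servers, gbps_per_server, uplink_gbps):
    """Ratio of aggregate server bandwidth to uplink bandwidth."""
    return servers * gbps_per_server / uplink_gbps

# Computing racks: 34 to 42 servers at 1G behind a 1x10G uplink.
computing_low = oversubscription(34, 1, 10)
computing_high = oversubscription(42, 1, 10)

# Storage access switches (assumed reading of the slide):
# 24 servers at 2x1G behind a trunked 2x10G uplink.
storage = oversubscription(24, 2, 20)

print(f"computing: {computing_low:.1f}:1 to {computing_high:.1f}:1")
print(f"storage:   {storage:.1f}:1")
```

A ratio above 1 only means the uplink can saturate when enough servers transmit at once; it says nothing about how often that happens in practice.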

Page 4

Reaching limits (1/2)

[Graph: 20G uplink of a distribution switch]

[Graph: same 40G path 1 year later]

[Graph labels: 20G, 40G]

Page 5

Reaching limits (2/2)

Clear traffic increase

– More hosts exchanging more

– More remote exchanges

Limits appearing

– Disturbing bottlenecks

– Long path before being routed

Upcoming challenges

– New computing room, massive data transfers, virtualization, heavy Grid computation

Usage of one 40G backbone etherchannel

10G direct link with CERN

Sample host average: 620M on 1G

Page 6

Complete network analysis performed

Inventory

– 20 network devices found

• Thanks to discovery protocols…

– Software and features not harmonised

Topology

– A map is worth a great deal

Usage

– Traffic patterns, bottlenecks

switch>show cdp neighbors
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone,
                  D - Remote, C - CVTA, M - Two-port Mac Relay

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
s1.in2p3.fr.
                 Ten 2/1           146          R S I     WS-XXXXX- Ten 1/50
s2.in2p3.fr.
                 Ten 3/3           130          R S I     WS-XXXXX- Ten 7/6
s3.in2p3.fr.
                 Ten 3/4           150          R S I     WS-XXXXX- Ten 6/6
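An inventory built from discovery protocols can start from output such as the neighbor table above. The sketch below is one possible way to turn it into records in Python; it is not the tooling used for the analysis. It assumes the wrapped layout shown on the slide (device name on one line, details on the next), and the function and field names are illustrative.

```python
import re

SAMPLE = """\
Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
s1.in2p3.fr.
                 Ten 2/1           146          R S I     WS-XXXXX- Ten 1/50
s2.in2p3.fr.
                 Ten 3/3           130          R S I     WS-XXXXX- Ten 7/6
s3.in2p3.fr.
                 Ten 3/4           150          R S I     WS-XXXXX- Ten 6/6
"""

# Detail line: local interface, holdtime, capability letters, platform, remote port.
DETAIL = re.compile(
    r"^\s+(?P<local>\S+ \S+)\s+(?P<hold>\d+)\s+"
    r"(?P<caps>(?:[A-Za-z] )*[A-Za-z])\s+"
    r"(?P<platform>\S+)\s+(?P<port>\S+ \S+)\s*$"
)

def parse_cdp_neighbors(text):
    """Return one dict per neighbor found after the 'Device ID' header."""
    neighbors = []
    device = None
    in_table = False
    for line in text.splitlines():
        if line.startswith("Device ID"):
            in_table = True  # everything before this is the legend
            continue
        if not in_table or not line.strip():
            continue
        match = DETAIL.match(line)
        if match and device:
            neighbors.append({
                "device": device,
                "local_port": match["local"],
                "capabilities": match["caps"].split(),
                "platform": match["platform"],
                "remote_port": match["port"],
            })
            device = None
        elif not line[0].isspace():
            # Unindented line: the neighbor name, with its trailing dot removed.
            device = line.strip().rstrip(".")
    return neighbors

neighbors = parse_cdp_neighbors(SAMPLE)
for n in neighbors:
    print(n["device"], n["local_port"], "->", n["remote_port"])
```

Running the same parser against every device and merging the results by device name is one way to obtain both the inventory and the link map mentioned on this slide.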

Page 7

Requirements for new network architecture

More bandwidth!

Able to scale over the coming years

– Allowing non-disruptive network upgrades

– Particularly with the new computing room

Ease exchanges between major functional areas

As usual: a good balance between risks, costs and requirements

Page 8

Main directions

Target non-blocking mode

No physical redundancy, meshes etc.

– “A single slot failure in 10 years on big Cisco devices”

– Too expensive and not worth it for us

– High availability handled at service level (DNS…)

– Big devices preferred to a meshed bunch of small ones

Keep it simple

– Ease configuration and troubleshooting

– Avoid closed, complex vendor solutions

• e.g. things branded “virtual”, “abstracted”, “dynamic”

Page 9

New network architecture

[Diagram labels: Services (db, Grid, monitoring…); AFS; GPFS; TSM; Remote workers; dcache; Xrootd; SRB; HPSS; Workers; WAN (generic, not LHCOPN). Link labels: 60G, x4 40G, 10G, 60G, x3, 10G, 20G, 60G, 20G]

Area                        Old bandwidth  New bandwidth
AFS                         30G shared     30G
GPFS, TSM                   40G shared     60G
dcache, Xrootd, SRB, HPSS   40G shared     120G
Workers                     40G shared     170G
WAN                         10G            20G

From 160G, often shared, to 400G wirespeed
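The two totals quoted above can be checked by summing the table columns. A minimal Python sketch, with the per-area figures copied from the table; the dictionary layout and variable names are illustrative.

```python
# (old, new) bandwidth per area in Gbit/s, copied from the table above.
AREAS = {
    "AFS": (30, 30),
    "GPFS, TSM": (40, 60),
    "dcache, Xrootd, SRB, HPSS": (40, 120),
    "Workers": (40, 170),
    "WAN": (10, 20),
}

old_total = sum(old for old, _ in AREAS.values())
new_total = sum(new for _, new in AREAS.values())
growth = new_total / old_total

print(f"old: {old_total}G, new: {new_total}G, factor: {growth}")
```

The columns add up to 160G and 400G, a factor of 2.5 overall, although the old figures were often shared between areas and so overstate what each area could actually use.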

Page 10

Current status - http://netstat.in2p3.fr/

[Diagram labels: Edge; Thumpers; Thumpers; AFS; Services; Thors; Workers; CINES; Thors & Dell; Thors; Thors. Link labels: 40G, 9x20G, 4x40G, 4x10G, 20G, 4x60G, 5x20G]

Page 11

WAN upgrade

Dual homing of hosts doing massive data transfers (dcache for WLCG)

[Diagram labels: RENATER; LHCOPN; Circuits to IN2P3 Laboratories (10G+100M+10M+4M+2M); GÉANT2; Internet; NRENs (three times); MCU VPN; LAN; ccpn-inter; ccpn-opn; Chicago; New 10G links. Link labels: 2x1G, 100M, 1G, x3, 2x1G/host, 2x1G/host]

Page 12

RENATER GÉANT2

Page 13

LHCOPN: LHC Optical Private Network

CC – CERN

CC – DE-KIT

Page 14

Current status

[Diagram labels: workers; CINES; Storage (twice); GPFS; Services; LHCOPN; Storage and LHCOPN services (dcache, fts, vobox); Offices & telecom services; Fermilab; dcache; Xrootd; SRB; HPSS; IN2P3 Laboratories; RENATER; Lyon; CC IN2P3; Outside. Link labels: 10G, 2x1G, 20G, 20G, 10G]

Page 15

Workers - 160G

INTER - 20G

Services - 80G

AFS - 30G

GPFS - 30G

dcache - 120G

[Chart labels: 120G, 60G, 30G, 80G, 160G, 20G; percentages shown: ~40%, ~25%, ~18%, ~2%, ~1%, ~16%]

Page 16

Main network devices and configurations used

New core device: Nexus 7018

– High density device, really scalable

– Very modular: slots & switching engine cards

– 80G backplane per slot (8x10G non-blocking)

• Initial configuration: 6 slots, 3 switching engines (3x48G)

– This device is vital

• 4 power supplies on 2 UPS, 2 management slots

Compatibility check is done:
Mod  boot  Impact          Install-type
---  ----  --------------  ------------
  1   yes  non-disruptive  rolling
  2   yes  non-disruptive  rolling
  3   yes  non-disruptive  rolling
  4   yes  non-disruptive  rolling
  5   yes  non-disruptive  rolling
  6   yes  non-disruptive  rolling
  9   yes  non-disruptive  reset
 10   yes  non-disruptive  reset
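A compatibility table like the one above lends itself to an automated pre-upgrade check. The following Python sketch is an illustration only: it assumes the four-column layout shown on the slide, and the function name and the rule for flagging a module are assumptions rather than part of any vendor tool.

```python
REPORT = """\
Mod  boot  Impact          Install-type
---  ----  --------------  ------------
  1   yes  non-disruptive  rolling
  2   yes  non-disruptive  rolling
  3   yes  non-disruptive  rolling
  4   yes  non-disruptive  rolling
  5   yes  non-disruptive  rolling
  6   yes  non-disruptive  rolling
  9   yes  non-disruptive  reset
 10   yes  non-disruptive  reset
"""

def parse_compatibility(text):
    """Return (module, bootable, impact, install_type) for each data row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields and start with a module number;
        # the header and the dashed separator do not.
        if len(fields) == 4 and fields[0].isdigit():
            module, bootable, impact, install_type = fields
            rows.append((int(module), bootable == "yes", impact, install_type))
    return rows

rows = parse_compatibility(REPORT)

# Hypothetical rule: flag any module that is not bootable or not non-disruptive.
flagged = [m for m, bootable, impact, _ in rows
           if not bootable or impact != "non-disruptive"]
reset_modules = [m for m, _, _, kind in rows if kind == "reset"]

print("flagged modules:", flagged)
print("modules upgraded by reset:", reset_modules)
```

With the values from the slide no module is flagged, and modules 9 and 10 are the only ones listed with a reset install type.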

Page 17

10 remaining slots!

2 extra switching engines possible

Page 18

32x10G (8x10G non-blocking)

Switching engine 48G

Page 19

Main network devices and configurations used

Layers: Core & Edge, Distribution, Access

Devices: 6509, 6513, 4948 (48x1G + 2x10G), 4900 (16x10G)

Configurations:

• 24x10G (12 blocking) + 96x1G + 336x1G blocking (1G/8 ports)

• 48x10G (24 blocking) + 96x1G

• 64x10G (32 blocking)

Page 20

Page 21

4900: 16x10G

Page 22

4948: 48x1G + 2x10G

Uplink 1x10G

Page 23

Timeline

Time axis: June, July, August, Sept., October 2010

– Testing new software and upgrade process

– Testbed with new core configuration

– Wiring, final preparation and organisation

– Nexus received

– Offline preparation and final design

Sept. 21st (time axis: 19h, 20h, 21h, 22h, 23h), 4 people for 5h

– Reload of border routers

– Capacity upgrade and reconfiguration

– Reload of core routers

– Reconfiguration of core network

– Reconfiguration of satellite devices

– Software upgrade on satellite devices

– Fixing remaining problems!

Devices

– Border: ~5 devices

– Core: ~15 devices

– Satellite: ~150 devices

Page 24

A 3-month preparation

Testing, preconfiguring, scripting, checklist

Optical wiring: fully done and tested beforehand

~60 new fibres

> 1km of fibre deployed!

Page 25

Feedback

No major surprise!

– Heavy testing phase was fruitful

– Main issue: some routes not correctly announced

• Neither detected nor understood, but a workaround was found

• Routing resilience hid the problem

• Snapshot routing tables with a traceroute to each route!

Keep monitoring, but deactivate alarms

– Spares each team member 800 SMS and 6k e-mails

Do not parallelize actions too much

– Hard to isolate faults or validate actions
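The advice to snapshot routing tables, with a traceroute to each route, amounts to a before/after comparison. The Python sketch below shows only the comparison step. It assumes each snapshot has already been collected into a mapping from prefix to the list of hops seen toward it; the collection itself is not shown, and the prefixes and hop names are made up for illustration.

```python
def diff_snapshots(before, after):
    """Compare two {prefix: [hops]} snapshots taken before and after a change."""
    missing = sorted(p for p in before if p not in after)
    added = sorted(p for p in after if p not in before)
    changed = sorted(p for p in before
                     if p in after and before[p] != after[p])
    return {"missing": missing, "added": added, "changed": changed}

# Hypothetical snapshots: documentation prefixes and invented hop names.
before = {
    "192.0.2.0/24": ["core1", "border1"],
    "198.51.100.0/24": ["core1", "border2"],
    "203.0.113.0/24": ["core1"],
}
after = {
    "192.0.2.0/24": ["core1", "border1"],
    "198.51.100.0/24": ["core2", "border2"],
}

report = diff_snapshots(before, after)
for kind, prefixes in report.items():
    print(kind, prefixes)
```

A route that is still reachable through a backup path shows up under "changed" rather than "missing", which is the kind of problem that routing resilience would otherwise hide.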

Page 26

Key benefits

Core capacity increased by a factor of 2.5

Isolated areas, delivering wire speed, removed bottlenecks, shortened paths

Seamless capacity upgrade now possible

Harmonised software and features on 170 devices

– From 37 different versions to 13

Page 27

100G test (1/3)

This slide was for eyes only.

Page 28

100G test (2/3)

[Diagram labels: Lyon, CCIN2P3; Geneva, CERN; 150 km; 1λ 100G; 10x10G; 10G (four times); ports numbered 1 to 10; b1 to b5; hosts ccteng01, ccteng02, ccteng02-2, ccperfsonar, cccata-test100g]

Page 29

100G test (3/3)

This slide was for eyes only.

Page 30

AOB

New machine room

– 2nd Nexus, switching, 160G

– Not standalone: an extension

SSL VPN server

– Being validated...

https://cctelecom.in2p3.fr/netacl/

[Diagram labels: LAN room A, LAN room B, Target: 160G, WAN, 20G]

Page 31

Conclusion

From a flat to a star network architecture

– Closely matching our needs

Average network usage down from 60% to ~15%

Ready to face traffic increase for some more time

– How long?