Better management of large-scale, heterogeneous networks toward a programmable management plane Joshua George, Anees Shaikh Google Network Operations www.openconfig.net
Page 1: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Better management of large-scale, heterogeneous networks toward a programmable management plane

Joshua George, Anees Shaikh

Google Network Operations

www.openconfig.net

Page 2: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Agenda

1. Management plane challenges

2. Rethinking telemetry -- efficient, large-scale monitoring

3. OpenConfig -- community-driven API development

Page 3: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Management Plane Challenges


Page 4: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Challenges of managing a large-scale network

● more than 8M OIDs collected every 5 minutes

● more than 20K CLI commands issued and scraped every 5 minutes

● many tools, and multiple generations of software

● 20+ network device roles

● more than half a dozen vendors, multiple platforms

● 4M lines of configuration files

● up to ~30K configuration changes per month

Opportunity for significant OPEX savings: reduced outage impact, simplification of management stack, better scaling
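To put the figures on this slide in per-second terms, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes collection is spread evenly across each 5-minute interval and that a month has 30 days, neither of which the slides state.

```python
# Figures quoted on the slide (treated here as exact for simplicity).
OIDS_PER_INTERVAL = 8_000_000
CLI_COMMANDS_PER_INTERVAL = 20_000
INTERVAL_SECONDS = 5 * 60
CONFIG_CHANGES_PER_MONTH = 30_000
DAYS_PER_MONTH = 30  # assumption, not from the slides


def per_second(count: int, interval_seconds: int) -> float:
    """Average rate, assuming work is spread evenly over the interval."""
    return count / interval_seconds


oids_per_second = per_second(OIDS_PER_INTERVAL, INTERVAL_SECONDS)
cli_per_second = per_second(CLI_COMMANDS_PER_INTERVAL, INTERVAL_SECONDS)
changes_per_day = CONFIG_CHANGES_PER_MONTH / DAYS_PER_MONTH

print(f"{oids_per_second:,.0f} OIDs/s")
print(f"{cli_per_second:,.1f} CLI commands/s")
print(f"{changes_per_day:,.0f} config changes/day")
```

Under those assumptions the averages come out to roughly 27K OIDs per second, about 67 CLI commands per second, and about 1,000 configuration changes per day.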

Page 5: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Management plane is way behind

● proprietary CLIs, lots of scripts

● imperative, incremental configuration

● lack of abstractions

● configuration scraping from devices

● SNMP monitoring -- not always “simple” and not often scalable


Page 6: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Model-driven network management

Configuration

• describes configuration data structure and content

• common modeling language: YANG

• multiple data encodings: protobuf, XML, JSON, ...

Topology

• describes structure of the network

• common modeling language: multiple

• data encoding: protobuf, ...

Telemetry

• describes monitoring data structure and attributes

• common modeling language: exploring YANG

• data delivery: RPC, protobuf inside UDP
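The point that one model can be carried in multiple data encodings can be illustrated with a small sketch. The interface fields below are hypothetical and not taken from any published model; only Python's standard `json` and `xml.etree.ElementTree` modules are used, and protobuf is left out to keep the example self-contained.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, vendor-neutral interface configuration (illustrative names).
interface_config = {
    "name": "eth0",
    "description": "peering link",
    "mtu": 9000,
    "enabled": True,
}


def to_json(config: dict) -> str:
    """Encode the configuration as a JSON document."""
    return json.dumps({"interface": config}, sort_keys=True)


def to_xml(config: dict) -> str:
    """Encode the same configuration as an XML document."""
    root = ET.Element("interface")
    for key in sorted(config):
        value = config[key]
        child = ET.SubElement(root, key)
        # XML has no native booleans, so render them as lowercase text.
        child.text = str(value).lower() if isinstance(value, bool) else str(value)
    return ET.tostring(root, encoding="unicode")


print(to_json(interface_config))
print(to_xml(interface_config))
```

The data structure is defined once; the encoding is a separate, swappable step.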

Page 7: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Rethinking Network Telemetry


Page 8: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Telemetry solutions today

What do we use? Often SNMP is the default choice.

● legacy implementations -- designed for limited processing and bandwidth

● expensive discoverability -- re-walk MIBs to discover new elements

● no capability advertisement -- test OIDs to determine support

● rigid structure -- limited extensibility to add new data

● proprietary data -- require vendor-specific mappings and multiple requests to reassemble data

● protocol stagnation -- no absorption of current data modeling and transmission techniques


Page 9: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Telemetry challenges

● SNMP object collection growing with each platform generation

○ e.g., 100K objects on current platforms, expected to grow 3x over next 2 generations

○ similar for object collection frequency

● Future devices continue to grow in density and drive this trend

○ scale limitations in data acquisition at high frequencies

● Near-real-time acquisition and access to monitoring data is a requirement for <insert buzzword here>

○ traffic management, tight control loops, fast recovery

Page 10: SDN in the Management Plane: OpenConfig and Streaming Telemetry

I get it, you really don’t like SNMP…


but do you have a better idea?

Page 11: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Rethinking telemetry...reverse the flow

○ stream data continuously -- with incremental updates based on subscriptions

○ observe network state through a time-series data stream

○ devices programmed with a data model describing desired structure and content

○ efficient, secure transport protocols


Page 12: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Telemetry framework requirements

● network elements stream data to collectors (push model)

● data populated based on vendor-neutral models whenever possible

● utilize a publish/subscribe API to select desired data

● scale for next 10 years of density growth with high data freshness

○ other protocols distribute load to hardware, so should telemetry

● utilize modern transport mechanisms with active development communities

○ gRPC (HTTP/2), Thrift, etc.

○ protocol buffer over UDP

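The push model with a publish/subscribe API can be sketched in a few lines. This is a minimal in-process illustration, not a real transport: the `TelemetryPublisher` class, the path strings, and the prefix-matching rule are all assumptions made for the example, and a real deployment would carry the updates over something like gRPC.

```python
from typing import Any, Callable

Callback = Callable[[str, Any], None]


class TelemetryPublisher:
    """Device-side publisher: collectors subscribe once, the device pushes."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callback]] = {}

    def subscribe(self, path_prefix: str, callback: Callback) -> None:
        """Register interest in every path that starts with path_prefix."""
        self._subscribers.setdefault(path_prefix, []).append(callback)

    def publish(self, path: str, value: Any) -> int:
        """Push one update to matching subscribers; return the delivery count."""
        delivered = 0
        for prefix, callbacks in self._subscribers.items():
            if path.startswith(prefix):
                for callback in callbacks:
                    callback(path, value)
                    delivered += 1
        return delivered


received: list[tuple[str, Any]] = []
publisher = TelemetryPublisher()
publisher.subscribe("/interfaces/", lambda path, value: received.append((path, value)))

publisher.publish("/interfaces/eth0/in-octets", 1024)
publisher.publish("/bgp/neighbors/192.0.2.1/state", "ESTABLISHED")  # nobody subscribed
print(received)
```

The collector never polls: it states what it wants once, and only matching data is sent to it.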

Page 13: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Example telemetry configuration flow

[Figure: telemetry configuration flow. Recoverable labels: Network Management System; structured inventory and set of telemetry capabilities pushed to the NMS; rules for typical monitoring; tactical overrides from operators; generate monitoring configuration; publish to network element; gRPC endpoint]
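One way to read this flow is as a merge: default monitoring rules, overridden by operator choices, then filtered by what the device says it can stream. The sketch below assumes a hypothetical representation (paths mapped to sample intervals in seconds) and leaves the publish step to the network element out entirely.

```python
def generate_monitoring_config(
    capabilities: set[str],
    default_rules: dict[str, int],
    overrides: dict[str, int],
) -> dict[str, int]:
    """Return {path: interval_seconds} for the paths the device supports.

    Operator overrides take precedence over the default rules.
    """
    merged = {**default_rules, **overrides}
    return {path: interval for path, interval in merged.items() if path in capabilities}


# Hypothetical inputs for one network element.
capabilities = {"/interfaces/counters", "/lsps/state"}
default_rules = {"/interfaces/counters": 30, "/lsps/state": 60, "/optics/power": 60}
overrides = {"/interfaces/counters": 10}  # tactical: sample faster during an incident

monitoring_config = generate_monitoring_config(capabilities, default_rules, overrides)
print(monitoring_config)
```

Here the unsupported `/optics/power` path is dropped and the interface counters are sampled at the overridden interval.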

Page 14: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Example telemetry data flow

[Figure: telemetry data flow between the gRPC endpoint and the Collection Infrastructure. Recoverable labels: send statistics every X seconds; asynchronous event reporting; operator requests for ad-hoc data]

3 types of telemetry events:

● Bulk time series data

○ All interface stats every 10 seconds.

● Event/edge driven updates

○ LSP A is now down.

● Operator request/response

○ Show me oper state for all interfaces.
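The three kinds of telemetry events differ mainly in what triggers them. The toy agent below is a sketch under stated assumptions: the class, method names, and paths are hypothetical, the timer that would drive periodic sampling is not shown, and no network transport is involved.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TelemetryAgent:
    """Toy device agent holding state as a flat {path: value} mapping."""

    state: dict[str, Any] = field(default_factory=dict)
    events: list[tuple[str, Any]] = field(default_factory=list)

    def record(self, path: str, value: Any) -> None:
        """Update a counter or statistic without raising an event."""
        self.state[path] = value

    def set_status(self, path: str, value: Any) -> None:
        """Event/edge driven: emit an update only when the value changes."""
        if self.state.get(path) != value:
            self.state[path] = value
            self.events.append((path, value))

    def sample(self, prefix: str) -> list[tuple[str, Any]]:
        """Bulk time series: meant to be called by a timer every N seconds."""
        return [(p, v) for p, v in sorted(self.state.items()) if p.startswith(prefix)]

    def query(self, path: str) -> Any:
        """Operator request/response: ad-hoc read of current state."""
        return self.state.get(path)


agent = TelemetryAgent()
agent.record("/interfaces/eth0/in-octets", 100)
agent.set_status("/lsps/A/oper-status", "UP")
agent.set_status("/lsps/A/oper-status", "UP")    # unchanged, so no event
agent.set_status("/lsps/A/oper-status", "DOWN")  # edge: LSP A is now down

print(agent.sample("/interfaces/"))
print(agent.events)
print(agent.query("/lsps/A/oper-status"))
```

`sample` and `query` read the same state; what separates them is who initiates the read (a schedule versus an operator).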

Page 15: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Practical realization

● Streaming telemetry is beyond an idea stage, but is far from a final product

● Multiple vendor implementations now available for experimentation

● Development is ongoing -- now is the time to share your requirements and make your voice heard!


Page 16: SDN in the Management Plane: OpenConfig and Streaming Telemetry

OpenConfig


Page 17: SDN in the Management Plane: OpenConfig and Streaming Telemetry

OpenConfig

● Informal industry collaboration of network operators

● Focus: define vendor-neutral configuration and operational state models based on real operations

○ Adopted YANG data modeling language (RFC 6020)

● Participants: Apple, AT&T, BT, Comcast, Cox, Facebook, Google, Level3, Microsoft, Verizon, Yahoo!

● Primary output is model code, published as open source via public GitHub repo

● Ongoing interactions with standards and open source communities (e.g., IETF, ONF, ODL, ONOS)


Page 18: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Example configuration pipeline

[Figure: configuration pipeline. Recoverable labels: operators; intent API (“drain peering link”); update topology model; OC YANG models; configuration generation; configuration data (vendor-neutral, validated); gRPC req; gRPC endpoint; multiple vendor devices]
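The pipeline's stages (intent, topology update, configuration generation) can be sketched as plain functions. Everything here is hypothetical: the topology layout, the `drain` flag, and the device names are invented for illustration, and how a drain is actually realized on a device (for example through routing policy) is outside the sketch. The final gRPC push is also not shown.

```python
import copy

# Hypothetical topology model: link name -> attributes.
topology = {
    "peer-link-1": {"device": "edge1", "interface": "eth0", "drained": False},
    "peer-link-2": {"device": "edge2", "interface": "eth3", "drained": False},
}


def apply_intent(topology: dict, intent: str, link: str) -> dict:
    """Intent API: return an updated copy of the topology model."""
    if intent != "drain":
        raise ValueError(f"unsupported intent: {intent}")
    if link not in topology:
        raise KeyError(link)
    updated = copy.deepcopy(topology)
    updated[link]["drained"] = True
    return updated


def generate_config(topology: dict) -> dict:
    """Derive vendor-neutral, per-device configuration from the topology."""
    config: dict = {}
    for link in topology.values():
        device = config.setdefault(link["device"], {"interfaces": {}})
        device["interfaces"][link["interface"]] = {"drain": link["drained"]}
    return config


# "drain peering link" expressed as an intent rather than device commands.
new_topology = apply_intent(topology, "drain", "peer-link-1")
config = generate_config(new_topology)
print(config)
```

The operator states what should happen to the link; the per-device configuration is derived from the updated model rather than typed by hand.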

Page 19: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Extending OpenConfig models

● base OpenConfig model as a starting point

● vendors can offer augmentations / deviations

● operators can add locally consumed extensions

[Figure labels: base model; X vendor modifications; local modifications; extended model]
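The layering of base model, vendor modifications, and local extensions can be sketched with plain dictionaries. This is a simplification under stated assumptions: a model is reduced to a mapping of leaf name to type, an augmentation only adds leaves, and a deviation only marks a leaf as unsupported. YANG's own `augment` and `deviation` statements are richer than this, and all leaf names below are hypothetical.

```python
def extend_model(base: dict[str, str], augmentations: dict[str, str], deviations: set[str]) -> dict[str, str]:
    """Return a new model: base leaves plus augmentations, minus deviations."""
    overlap = base.keys() & augmentations.keys()
    if overlap:
        raise ValueError(f"augmentation redefines base leaves: {sorted(overlap)}")
    extended = {**base, **augmentations}
    for leaf in deviations:
        extended.pop(leaf, None)
    return extended


# Hypothetical base model for a BGP neighbor.
base_model = {"peer-as": "uint32", "hold-time": "uint16", "description": "string"}

# A vendor adds one leaf and declares one base leaf unsupported.
vendor_model = extend_model(base_model, {"vendor-x:fast-failover": "boolean"}, {"description"})

# The operator adds a locally consumed leaf on top.
extended_model = extend_model(vendor_model, {"local:cost-center": "string"}, set())
print(extended_model)
```

Each layer leaves the one below it untouched, so the base model stays shared while the differences remain explicit.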

Page 20: SDN in the Management Plane: OpenConfig and Streaming Telemetry

OpenConfig releases and roadmap

Data models (configuration and operational state)

● BGP and routing policy

○ multiple vendor implementations in progress

● MPLS / TE consolidated model

○ RSVP / TE and segment routing model as initial focus

● design patterns for operational state and model composition

● tools for translating YANG models to usable code artifacts

○ e.g., pyangbind

Models in progress

● interfaces, system, optical transport, ...

Page 21: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Models must be composed to be useful

● model composition framework is critical missing piece from existing model-building efforts


Page 22: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Modeling operational state

Types of operational state data

● derived, negotiated, set by a protocol, etc. (negotiated BGP hold-time)

● operational state data for counters or statistics (interface counters)

● operational state data representing intended configuration (actual vs. configured)

Clear benefits from using YANG to model both configuration and operational state in the same data model

● but … YANG focus has primarily been config, NETCONF-centric, lack of common conventions
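Keeping configuration and operational state in one structure makes the three kinds of state easy to compare. The sketch below is an illustration only: the `BgpNeighbor` class and its field names are hypothetical and are not taken from an OpenConfig model. The negotiation rule it uses, that a BGP session runs with the smaller of the two advertised hold times, is standard BGP behavior.

```python
from dataclasses import dataclass


@dataclass
class BgpNeighbor:
    # Intended configuration: what the operator asked for.
    configured_hold_time: int
    # Applied configuration: what the device is actually running with.
    applied_hold_time: int
    # Learned from the peer's OPEN message.
    peer_hold_time: int
    # Counter / statistic.
    messages_received: int = 0

    @property
    def negotiated_hold_time(self) -> int:
        """Derived state: the smaller of the two advertised hold times."""
        return min(self.applied_hold_time, self.peer_hold_time)

    @property
    def config_in_sync(self) -> bool:
        """True when the applied value matches the intended one."""
        return self.configured_hold_time == self.applied_hold_time


neighbor = BgpNeighbor(configured_hold_time=90, applied_hold_time=180, peer_hold_time=30)
print(neighbor.negotiated_hold_time)
print(neighbor.config_in_sync)
```

With both views side by side, a tool can detect that a configuration change has not taken effect without scraping a second data source.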

Page 23: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Summary

● New networking paradigms like SDN focus mostly on control

○ it’s time for the management plane to join the age of SDN

● Core principles:

○ model-driven management

○ streaming telemetry to scale monitoring and improve freshness

○ vendor-neutral, extensible APIs for managing devices

● Architecture and emerging vendor implementations of multi-mode telemetry solutions

● OpenConfig is a focused effort by operators to develop vendor-neutral models to define management APIs


Operators: get involved and push your vendors for support on your gear!

Page 24: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Thank you!

Page 25: SDN in the Management Plane: OpenConfig and Streaming Telemetry

gRPC: multi-platform RPC framework

gRPC features

● load-balancing, app-level flow control, call-cancellation

● serialization with protobuf (efficient wire encoding)

● multi-platform, many supported languages

● open source, under active development

gRPC leverages HTTP/2 as its transport layer

● binary framing, header compression

● bidirectional streams, server push support

● connection multiplexing across requests and streams

http://grpc.io

Page 26: SDN in the Management Plane: OpenConfig and Streaming Telemetry

OpenConfig and standards (e.g., IETF)

● primary goal is native implementation of OpenConfig models

○ OpenConfig is not a standards group

● publish models and documentation in IETF to inject operator perspective into standards process

● adoption of OpenConfig models and ideas into standards can simplify development efforts for vendors


Page 27: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Additional “observations”

● YANG and NETCONF should be decoupled -- each is independently useful

● YANG needs to evolve more rapidly at this early phase, stabilize as real usage increases

● current YANG model versioning is not helpful -- treat models like software artifacts, not dated documents

● current standard models should be open for revisiting and revising

● should not rush to standardize more models until they are deployed and used in production

these are not necessarily OpenConfig consensus views

Page 28: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Current OpenConfig “process”

● initial models developed by OpenConfig

● extensive collaboration with vendors

● leverage existing work where possible

● publish models and docs


Page 29: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Intent-based configuration flow

[Figure: intent-based configuration flow. Recoverable labels: operators; configuration intent; declarative API; abstract configuration models (Config Model, Topology Model); config generation; authoritative config store; standard models; vendor-neutral configuration models; generated configuration instances; device-level configuration; configuration pusher (NETCONF, RESTCONF, JSON-RPC, ...). Shown alongside an analogous SDN stack: application; NB APIs; Network OS; SB protocols]

Page 30: SDN in the Management Plane: OpenConfig and Streaming Telemetry

Telemetry required for a full solution

● Popular network technology topics today focus on more granular control of network traffic decisions.

○ APIs, RPC frameworks, agents, controllers, overlays…

● Developing an awesome method to control network elements is only half of the solution.

● Data accuracy and freshness are limiting factors in a control solution’s optimality.
