Wide area network Traffic Engineering
Meeting the challenges of the distributed cloud

The evolution toward Network Functions Virtualization and the distributed cloud brings new challenges for wide area networks (WANs). By deploying software-defined networking-based Traffic Engineering with a centralized control plane, operators can improve elasticity while maximizing resource utilization in the WAN that connects the distributed cloud.

Ericsson White Paper Uen 284 23-3249 | December 2014
Network Functions Virtualization
Network Functions Virtualization (NFV) as developed by the
European Telecommunications Standards Institute (ETSI) NFV Industry
Specification Group (ISG) aims for a clear separation between
functional logic defined in software and the underlying
infrastructure. This implies an opportunity to redesign the way
network functions will be implemented in the future. Instead of being implemented in vertically integrated boxes (often called physical appliances), network functions will be provided as virtual appliances, that is, as software executed in a virtualized infrastructure environment. Figure 1 shows how NFV decouples applications (which could be network functions, service enablers or application servers) from the infrastructure.
In this paper, we use the term network functions in a broad sense. It can relate to functions within the control or forwarding plane of mobile-core networks (such as the Evolved Packet Core and IMS) and fixed-access networks (such as the Broadband Network Gateway [BNG]), but also to functions in the forwarding plane of IP networks, for example, switches, routers and firewalls. In
addition, the virtualization of customer premises equipment such as
home and enterprise gateways is covered by this concept.
Furthermore, it also covers application servers such as video
servers that are involved in the delivery of network services.
The NFV concept was first presented to a wider audience in a
white paper written by 13 Tier 1 operators that was published in
October 2012 [1]. It resulted in the creation of an ISG within
ETSI, which started work in January 2013. Within less than two
years, the number of active supporters of the ETSI NFV ISG grew to
235 companies, including 34 service provider organizations [2] [3].
It is important to note that an ISG in ETSI does not have a
standardization mandate. It is rather a pre-standardization
activity that focuses primarily on use cases, high-level architecture design and proofs of concept, and it typically has a
lifespan of two years.
In a later specification phase, it is expected that existing de
facto IT cloud standards will be reused to a large extent. This
would also address the need for operators to harmonize their NFV
infrastructure as much as possible with existing IT cloud
infrastructure already running in their data centers.
Figure 1: NFV removes vertical coupling between applications and infrastructure. (The figure contrasts vertically integrated stacks of application, middleware, OS, switch and processor with a transformed model in which applications and their middleware run on Linux over industry-standard virtualization, industry-standard processors and industry-standard switches.)
Virtual data centers
Figure 2: The VDC concept: virtual data centers VDC1-VDC3 span physical data centers DC1-DC3 under common UNICA infrastructure domain management, exposed through APIs to user portals and UNICA infra domain portals. Source: Telefónica S.A.
Today's IT data centers are typically geographically centralized at larger sites of several thousand square meters. However, there are several reasons for using more than one data center per geographical region. One reason is redundancy: if there is an outage in one location, the data center in another location can take over, thus guaranteeing service continuity. Another reason is user
experience and the cost of traffic. Companies like Google,
Facebook, and Netflix all started with a few data centers in the
US. However, with heavier traffic and rising numbers of users
worldwide, they were forced to add data centers in other regions in
order to maintain good user experience and to optimize traffic
costs.
Similar considerations apply to NFV. Several Tier 1 operators
have expressed demand for distributed cloud environments in which
some parts of the cloud infrastructure are provided by centralized,
national data centers and other parts of the cloud environment are
located closer to the access network. For operators, it would be a natural choice to place remote cloud-execution environments, for example, in existing central office sites where fixed access gateways, such as a broadband remote access server or BNG, or mobile packet gateways, such as a Gateway GPRS Support Node or Evolved Packet Gateway, are already installed.
Telefónica is an example of an operator that has fully embraced the concept of virtualized data centers across distributed cloud environments. Figure 2 shows UNICA, which is Telefónica's future data center architecture. An important requirement in UNICA is the ability to support several virtual data centers (VDCs). Each VDC can be managed by a different organizational entity, and a VDC can span several physical cloud environments. The various physical cloud environments can be located in centralized data centers or in smaller central office sites closer to the access network. All VDCs run on top of a common network infrastructure; however, each VDC only sees the slice of the network assigned to it.
New challenges for the WAN

The evolution toward NFV
requires a distributed cloud that allows the creation, deletion,
and free movement of virtual machines across different geographical
locations. As a consequence, a WAN that interconnects the
distributed cloud must be able to deal with more dynamically
changing traffic patterns.
To better understand the challenges that virtualized network functions and virtualized data centers place on the WAN, a brief introduction to Traffic Engineering (TE) is needed. In simple
terms, TE refers to the process of planning and steering data flows
through the network. Operators have several reasons to deploy TE in
their networks. The main reason is to minimize capex by increasing
utilization of network resources.
A common approach to TE is to create a traffic matrix that
describes the bandwidth requirements between sources and
destinations in a network. This traffic matrix is then used as
input for the actual path computation, taking into consideration
QoS and cost parameters, with the goal to maximize utilization
while fulfilling all traffic requirements.
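The idea of placing the demands of a traffic matrix onto the network can be sketched in a few lines of code. The topology, demand values and simple hop-count metric below are illustrative assumptions, not taken from this paper; a real TE system would use measured matrices and richer QoS and cost constraints.

```python
import heapq

# Minimal sketch of traffic-matrix-driven path computation.
# Each demand is routed over the fewest hops among links that
# still have enough residual bandwidth, and that bandwidth is
# then reserved so later demands see the updated utilization.

CAPACITY = {  # undirected link -> capacity in Gbit/s (assumed)
    frozenset({"PE1", "PE2"}): 10,
    frozenset({"PE2", "PE4"}): 10,
    frozenset({"PE2", "PE3"}): 10,
    frozenset({"PE3", "PE4"}): 10,
}

def compute_path(src, dst, demand, reserved):
    """Dijkstra (hop count) over links with residual capacity >= demand."""
    adj = {}
    for link in CAPACITY:
        a, b = tuple(link)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    queue, seen = [(0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in adj.get(node, ()):
            link = frozenset({node, nxt})
            free = CAPACITY[link] - reserved.get(link, 0)
            if nxt not in seen and free >= demand:
                heapq.heappush(queue, (cost + 1, nxt, path + [nxt]))
    return None  # no feasible path

def place_traffic_matrix(matrix):
    """Route each (src, dst) -> demand entry and reserve its bandwidth."""
    reserved, paths = {}, {}
    for (src, dst), demand in matrix.items():
        path = compute_path(src, dst, demand, reserved)
        if path is None:
            raise RuntimeError(f"demand {src}->{dst} cannot be placed")
        for a, b in zip(path, path[1:]):
            link = frozenset({a, b})
            reserved[link] = reserved.get(link, 0) + demand
        paths[(src, dst)] = path
    return paths, reserved

# 8G on the direct PE2-PE4 link leaves only 2G spare, so the second
# demand is forced onto the longer path via PE3.
paths, reserved = place_traffic_matrix({("PE2", "PE4"): 8, ("PE1", "PE4"): 4})
print(paths[("PE2", "PE4")])  # ['PE2', 'PE4']
print(paths[("PE1", "PE4")])  # ['PE1', 'PE2', 'PE3', 'PE4']
```

The greedy ordering here is only a toy; an operator-grade tool would optimize all demands jointly against cost and QoS objectives.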
The introduction of the VDC concept puts new requirements on the
underlying WAN since the traffic within each VDC needs to be
completely isolated from traffic of other VDCs. Concepts and
technologies for achieving this partitioning are commonly referred
to as network slicing.
In addition, NFV will make the traffic within each VDC more dynamic. One reason for this is the location independence of network functions, as exemplified in the following scenarios:
> When creating new service instances, resources can be chosen wherever they are available in the network. Resource pools can span various data centers in different geographical locations.
> Relocation of existing service instances becomes a pure software function: virtual machine (VM) migration. Since no hardware has to be moved, relocation of service instances is likely to happen more frequently than with vertically integrated nodes.

Reasons for relocating existing virtual network functions include the following:
> resource availability (compute, storage, network bandwidth)
> user mobility (change in service demands per location)
> reduced latency and transmission cost, achieved by bringing service nodes closer to the consumer and avoiding traffic tromboning
> failure cases, resiliency, disaster recovery
> planned maintenance
> policy and security
> energy and cooling capacity.
Whenever a virtual network function is instantiated or moved to
a new location, traffic flows will change as well. This applies to,
for example, traffic flows between a compute node and the related
storage, or between a compute function and the access network
connecting the user. The requirement on the WAN transport is to
react to changes in traffic demand at the same pace at which
virtual network functions are created or moved. Changes in traffic
flows must be analyzed in real time or near-real time for their
impact on allocated bandwidth in the transport.
As long as changes are small and can be supported by already
allocated resources, no change in the WAN transport may be
required. The WAN transport must, however, be able to monitor the
cumulative effect of all service modifications initiated by the
management and orchestration systems and take appropriate measures
when needed.
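The monitoring behavior described above can be sketched as follows. The class, link names and threshold model are invented for illustration: small demand changes are absorbed by the bandwidth already allocated per link, and re-optimization is flagged only when their cumulative effect exceeds it.

```python
# Hypothetical sketch: the WAN transport tracks the cumulative
# effect of service modifications per link and signals when TE
# must re-run because the allocation no longer covers demand.

class DemandMonitor:
    def __init__(self, allocated):
        # allocated: link name -> bandwidth reserved for it (Gbit/s)
        self.allocated = dict(allocated)
        self.demand = {link: 0.0 for link in allocated}

    def apply_change(self, link, delta_gbps):
        """Record one VM creation, deletion or move affecting a link.
        Returns True while the link still fits inside its allocation,
        False once re-optimization is needed for it."""
        self.demand[link] += delta_gbps
        return self.demand[link] <= self.allocated[link]

mon = DemandMonitor({"PE2-PE4": 10.0})
print(mon.apply_change("PE2-PE4", 6.0))  # True: 6.0 fits in 10G
print(mon.apply_change("PE2-PE4", 3.0))  # True: cumulative 9.0
print(mon.apply_change("PE2-PE4", 2.0))  # False: 11.0 exceeds 10G
```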
Below is an example scenario that illustrates the impact of NFV on the traffic matrix in the WAN. For any given traffic pattern caused by the existing distribution of compute and storage functions between the three data centers in Figure 3, we assume the network has been configured to accommodate the resulting traffic matrix. TE could have been applied to optimize QoS parameters and cost by achieving optimal network utilization.
In a next step, we assume that some virtual network functions are relocated. As shown in Figure 3, the compute resources in data center 3 are overutilized, while there are free resources in data center 2. In order to utilize compute resources more efficiently, some virtual network functions (running as virtual machines in data center 3) could be relocated to data center 2. Alternatively, when relocation is not possible or not desired, any new VMs could be created in data center 2 instead of the already overloaded data center 3. In either case, the creation of new VMs in data center 2 will impact the traffic matrix in the WAN, for example to connect to storage or bare metal servers in data center 3, or simply because traffic from the access now has to be routed to data center 2 instead of data center 3.

Figure 4 shows the situation after the change, with additional VMs located in data center 2. While computing resources are now used more evenly, the traffic matrix has changed in the WAN and requires re-optimization of traffic flows to optimize cost and performance.
As shown in the example above, the introduction of VDCs and NFV
creates new requirements for TE in the WAN. In the following
section, we will outline a solution architecture that can react to
changes more quickly by controlling the WAN TE from an overarching
orchestration layer.
Figure 3: Data center scenario before change. Network utilization may be optimal, but compute utilization is suboptimal. The figure shows traffic flows between an enterprise CE, a CSR, storage, bare metal servers and compute resources (VMs) across data centers 1, 2 and 3, connected through a PE router.

Figure 4: Data center scenario after change. With compute resources now also used in data center 2, compute utilization may be optimal, but the network needs optimization: the shortest path may now be congested.
Solution architecture for dynamic TE

In the following, we describe a solution architecture for dynamic TE based on two main concepts:
> abstracting data centers as virtual Provider Edge (vPE)
> TE with centralized path computation.
This solution architecture is based on software-defined networking (SDN) principles in the following way: control plane centralization is applied both within the data center and in the WAN. Logical centralization of control plane functions creates
a consolidated network view and a single control point for
cross-network domain orchestration systems through standardized
northbound application programming interfaces (APIs).
ABSTRACTING DATA CENTERS AS VPE

IP/multi-protocol label switching (MPLS) is the dominant technology used in service-provider networks today. The data center itself often uses different technologies
such as Virtual Extensible LAN and Ethernet, including Provider
Bridging and Shortest Path Bridging. Different data plane
technologies increase the complexity of end-to-end TE when applying
traditional concepts.
A PE router is a standard component in IP/MPLS provider
networks. A new way of integrating a data center into an IP/MPLS
network is by representing the data center as a vPE router. The
term virtual indicates that the data center appears like a PE to
the outside network. This can be achieved by a central logical
control node in the data center, fulfilling two purposes. Firstly,
it controls the switches in the data center. Secondly, it speaks
standard PE routing protocols to the IP/MPLS WAN. Very large data
centers can be modeled as several vPE routers.
The vPE approach is illustrated in Figure 5. Inside the data
center, the Cloud Network Controller (CNC) is introduced as the
central control plane instance. Being an SDN controller, it offers
a northbound API allowing data center orchestration to control
traffic flows centrally inside the data center. The CNC also
Figure 5: Data center architecture with vPE concept. Cross-domain orchestration sits above per-data-center orchestration and a PCE in the WAN. In each data center, a CNC implements the vPE control plane, controlling the vSwitches and gateway that form the vPE forwarding plane over the data center transport, and exchanging topology and control information with the IP/MPLS WAN.
controls data plane nodes within the data center through its southbound interface. OpenFlow, a protocol standardized by the Open Networking Foundation, is well suited as a southbound interface between the control and forwarding planes. The controller can use other southbound protocols, such as NETCONF or the Extensible Messaging and Presence Protocol, depending on the type of forwarding device to be controlled.
Towards the WAN, the CNC behaves like the control plane of a
conventional router. This allows for easy integration with existing
WAN networks, using standard routing protocols.
A second SDN controller is introduced to offer a central control
point from the cross-domain orchestration system toward the WAN
interconnecting cloud locations. This WAN controller acts as a Path
Computation Element (PCE) that computes and controls traffic paths
in the WAN by interacting with physical or vPE routers.
TE WITH CENTRALIZED PATH COMPUTATION

MPLS is the dominant data plane technology in the WAN today. With MPLS, packets are sent from one network node to the next based on simple labels rather than complex network addresses. Several methods have been defined to enable TE in MPLS networks. The underlying principle of all of these mechanisms is to pre-configure label switched paths (LSPs) through the network based on path computation.
Existing TE approaches can be divided into distributed and
centralized approaches according to the way paths are set up in the
transport network. In distributed approaches, topology attributes
are disseminated via extensions of existing routing protocols such
as Open Shortest Path First-TE or Intermediate System to
Intermediate System-TE. The head-end node of each individual LSP is
in charge of computing the path for its LSPs based on topology and
constraints learned through the routing protocol. The head-end node then signals the LSPs through the network using RSVP-TE (Resource Reservation Protocol with Traffic Engineering extensions).
The drawback of this approach is the required signaling through
the network. LSP signaling not only happens when new LSPs are set
up but also at regular time intervals to refresh LSP state on all
routers along each LSP. In large networks, this signaling, together
with the need to store LSP state on all routers, can represent a
problem for scalability.
In centralized approaches, the LSP calculation is often
performed offline in a management system. The LSP configuration is
then pushed into the nodes through Operation and Maintenance
procedures. The advantage of a centralized approach is that it can
consider application requirements more easily than a distributed
path computation in the network nodes. Central path calculation can also compute better paths, based on global knowledge, which results in better network resource utilization.
The drawback of the traditional approach of centralized offline calculation is a lack of agility and the need for complex and expensive management software. It also involves a high degree of manual work to prepare the necessary input for path calculation. This results in high cost and slow response times.

With the advent of SDN principles, new solutions that can overcome the limitations of existing TE solutions become feasible.
An SDN-based approach has the following characteristics:
> Path calculation is done in a centralized controller.
> The controller provides open northbound interfaces for integration with an orchestration layer.
> The controller supports various southbound interfaces toward the network elements for configuring the calculated path in the data plane.
Centralized path computation is performed by a dedicated PCE.
The PCE is a functional building block that can either be deployed
as a standalone node or as a component of a Multi-layer WAN
Controller (MLWC). In both cases, the central controller provides a
central touch point that allows higher layer orchestration systems
to interact with the network. This makes it possible for
orchestration systems to perform optimizations across different
types of resources, covering compute, storage and also networking,
which results in optimal utilization across all resource
categories. Deploying the PCE as part of an MLWC offers the
additional capability of performing path optimization across the
IP/MPLS and optical layers [4]. The MLWC can be implemented on an
open SDN controller platform such as OpenDaylight [5]. Another
example of an open SDN controller platform for WAN applications is
ONOS [6].
An appealing solution results from combining an SDN-based
control layer with Segment Routing (SR) in the data plane. SR is
currently under standardization by the Internet Engineering Task
Force, and is supported by several large network equipment vendors
and operators. In contrast to conventional TE solutions, SR removes
the need to signal and periodically refresh LSPs along the entire
path in the network. Information about the path of an LSP is not
stored in each router along the path, but in the header of payload
packets. This reduces the amount of
label information and the complexity of maintaining it in all
routers along an LSP. Together with the removal of label signaling,
this allows the network to scale better by offloading the control
and forwarding planes in routers. Besides scalable TE, SR also offers advantages such as 100 percent fast reroute coverage in arbitrary topologies, support for network-wide service chaining, and IPv6 as a data plane alternative to MPLS.
It should also be noted that centralized path computation also
works with conventional label signaling or even in heterogeneous
environments where parts of the network use SR and other parts use
a label distribution protocol. This allows for smooth migration of
existing networks. SDN-based SR can be gradually introduced via software updates. Only the edge nodes require an interface toward an SDN controller. Intermediate network nodes do not need an interface toward the SDN controller; they only need to support the SR label distribution and forwarding mechanisms, which in most cases requires just a software update.
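The header-encoded path that distinguishes SR from conventional LSP signaling can be illustrated with a small sketch. The node segment identifiers (SIDs) below are invented; in a real network they are advertised through routing protocol extensions.

```python
# Sketch of the segment routing idea: the ingress edge router
# encodes the PCE-computed explicit path once, as a stack of
# labels (segment IDs) in the packet header. Transit routers keep
# no per-LSP state; each simply pops the top label and forwards
# toward the node it names.

NODE_SID = {"PE2": 16002, "PE3": 16003, "PE4": 16004}  # assumed SIDs

def encode_path(explicit_path):
    """Turn an explicit path into the label stack pushed at ingress."""
    return [NODE_SID[node] for node in explicit_path]

def forward(stack):
    """Model one hop: pop the top segment, keep the remaining stack."""
    return stack[0], stack[1:]

# A PCE-computed detour via PE3 toward PE4, imposed at ingress PE2:
stack = encode_path(["PE3", "PE4"])
print(stack)  # [16003, 16004]
segment, stack = forward(stack)
print(segment, stack)  # 16003 [16004]
```

Because the path lives entirely in the packet header, no signaling or refresh messages traverse the network, which is the scalability gain described above.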
Use case example

Here is a practical use case where cross-domain orchestration works together with a central PCE to provide optimal forwarding for different types of applications. In the network
setup shown in Figure 6, assume there is a high service demand in
an area near data center DC2. There will be a mix of various
applications creating delay-sensitive traffic, such as voice and
VoLTE, and high-bandwidth traffic, such as video. Without
intervention, the higher than usual traffic demand could create
congestion on the shortest path link between PE2 and PE4. This can
cause packet loss and delay with service degradation for all
applications, while at the same time spare bandwidth is available
on non-shortest path links.

In reaction to the increased traffic demand, the cross-domain orchestration system can interact both with the data center and with the WAN. The orchestration system could, for example, instantiate virtual network functions such as caching servers or content filters close to consumers in data center DC2. In addition, the orchestration system can also
interact with the WAN to achieve more optimal load distribution.
This must also take into consideration the QoS requirements of the
different applications.
In the example shown here, the direct link between PE2 and PE4
has low delay but is highly utilized. There is an additional path
via PE3 that has spare bandwidth but higher delay. Without TE, all
traffic will use the shortest path and no load balancing will
occur.
When network analytics functions detect suboptimal resource
utilization, the orchestration system instructs the PCE to
establish two alternative MPLS LSPs for load balancing (step 1).
The PCE computes optimal routes for each LSP based on topology and
bandwidth information (step 2). Path computation can also take
other constraints into account like delay, path diversity,
protection type and time of day. The PCE then signals the path to
the edge routers PE2 and PE4 using the PCE Protocol (step 3). The
PE routers assign traffic flows to LSPs on a per-VPN or
per-class-of-service basis (step 4). For example, delay-sensitive
VoLTE traffic is assigned to the low delay path, while high-volume
video traffic, which is less delay-sensitive, is assigned to the
path with higher bandwidth and higher delay. As a result, network
resources are used efficiently and QoS requirements are fulfilled
for each type of application.
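Step 4 above, the per-class assignment of flows to LSPs, can be sketched as follows. The path properties and class names are illustrative assumptions, not values from the paper.

```python
# Sketch of per-class-of-service LSP assignment at the edge router:
# delay-sensitive classes get the lowest-delay path, bulk classes
# the path with the most spare bandwidth.

LSPS = {
    "direct":  {"delay_ms": 10, "spare_gbps": 1},  # congested PE2-PE4 link
    "via_PE3": {"delay_ms": 30, "spare_gbps": 8},  # longer detour
}

def assign(cos):
    """Pick one of the two PCE-computed LSPs for a class of service."""
    if cos in ("voice", "volte"):
        return min(LSPS, key=lambda lsp: LSPS[lsp]["delay_ms"])
    return max(LSPS, key=lambda lsp: LSPS[lsp]["spare_gbps"])

print(assign("volte"))  # direct
print(assign("video"))  # via_PE3
```

In practice the PE routers would apply such a policy per VPN or per class-of-service marking, as described in the text.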
Figure 6: Suboptimal resource utilization. Cross-domain orchestration and a PCE oversee a WAN connecting vPE1 (DC1), vPE2 (DC2), PE3, PE4, the access network and the internet. The links are 10G with delays of 10 ms and 20 ms, and traffic concentrates on the shortest path between PE2 and PE4.
In an SR scenario, the explicit paths computed by the PCE are encoded in the MPLS label stack and communicated to PE2 and PE4.
In contrast to a solution with RSVP-TE, an SR solution does not
require any signaling along the LSP path. No LSP state has to be
stored on internal backbone routers. The need to refresh LSP state
through periodic RSVP-TE messages is also removed, resulting in
increased scalability of the solution. SR even removes the need for
a dedicated label distribution protocol by carrying label
information in extensions of conventional routing protocols.
Figure 7: Network optimized with TE. The figure annotates the four steps described in the text: the orchestration system instructs the PCE (1), the PCE computes routes (2) and signals the labeled paths to the edge routers (3), and the edge routers assign delay-sensitive traffic to the low-delay path and high-volume traffic to the path via PE3 (4).
Conclusion

Virtualizing network functions in distributed cloud
environments leads to more dynamically changing traffic patterns in
the WAN. This creates challenges for traditional TE approaches that
are unable to react quickly enough to changing traffic patterns,
and leads to inefficient use of transport resources and QoS
degradations.
This paper outlines a network architecture that supports the
implementation of an automated, near real-time TE solution and
fulfills the needs of distributed cloud environments. The approach
is characterized by two main concepts:
> abstraction of the data center as a vPE
> TE with a central PCE.
Both concepts can be implemented with an SDN-based approach in
an open and standardized way. With this approach, the network
becomes accessible for the orchestration system and allows
cross-domain optimization of compute, storage and network
resources. This allows operators to maximize utilization for all
types of resources as a whole, with reduced effort and response
times.
The proposed SDN-based TE approach works with conventional label
signaling for path setup. However, an appealing solution results
from using SR in the data plane as a means to manage transport
paths in the WAN. SR can be gradually introduced via software
updates.
This white paper has been developed in collaboration with Telefónica I+D.
GLOSSARY

API application programming interface
BNG Broadband Network Gateway
CE Customer Edge
CSR cell site router
CNC Cloud Network Controller
ETSI European Telecommunications Standards Institute
ISG Industry Specification Group
LSP label switched path
MLWC Multi-layer WAN Controller
MPLS multi-protocol label switching
NFV Network Functions Virtualization
OS operating system
PCE Path Computation Element
PE Provider Edge
RSVP Resource Reservation Protocol
SDN software-defined networking
SR Segment Routing
TE Traffic Engineering
VDC virtual data center
VM virtual machine
vPE virtual Provider Edge
WAN wide area network
References
[1] ETSI NFV ISG, October 2012, Network Functions
Virtualisation: An Introduction, Benefits, Enablers, Challenges
& Call for Action, available at:
http://portal.etsi.org/NFV/NFV_White_Paper.pdf
[2] ETSI NFV ISG, October 2013, Network Functions Virtualisation
(NFV): Network Operator Perspectives on Industry Progress,
available at: http://portal.etsi.org/nfv/nfv_white_paper2.pdf
[3] ETSI NFV ISG, October 2014, Network Functions Virtualisation
(NFV): Network Operator Perspectives on Industry Progress,
available at: http://portal.etsi.org/nfv/nfv_white_paper3.pdf
[4] Paola Iovanna, Fabio Ubaldi, Francesco Di Michele, Juan Pedro Fernandez-Palacios Gimenez & Victor Lopez, E2E Traffic Engineering Routing for Transport SDN, Ericsson, Corital & Telefónica, OFC 2014 conference
[5] OpenDaylight, accessed December 2014, available at:
http://www.opendaylight.org/
[6] ON.Lab ONOS, accessed December 2014, available at:
http://onosproject.org/
Further Reading

Cloud computing in telecommunications, Ericsson Review, June 2010, Jan Gabrielsson, Ola Hubertsson, Ignacio Más & Robert Skog, available at: http://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2010/cloudcomputing.pdf

Software-defined networking: the service provider perspective, Ericsson Review, February 2013, Attila Takacs, Elisa Bellagamba & Joe Wilke, available at: http://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2013/er-software-defined-networking.pdf

Virtualizing network services: the telecom cloud, Ericsson Review, March 2014, Henrik Basilier, Marian Darula & Joe Wilke, available at: http://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2014/er-telecom-cloud.pdf