Wide area network Traffic Engineering
Meeting the challenges of the distributed cloud

The evolution toward Network Functions Virtualization and the distributed cloud brings new challenges for wide area networks (WANs). By deploying software-defined networking-based Traffic Engineering with a centralized control plane, operators can improve elasticity while maximizing resource utilization in the WAN that connects the distributed cloud.

Ericsson White Paper Uen 284 23-3249 | December 2014
Network Functions Virtualization
Network Functions Virtualization (NFV) as developed by the
European Telecommunications Standards Institute (ETSI) NFV Industry
Specification Group (ISG) aims for a clear separation between
functional logic defined in software and the underlying
infrastructure. This implies an opportunity to redesign the way
network functions will be implemented in the future. Instead of being implemented in vertically integrated boxes (often called physical appliances), network functions will be provided as virtual appliances, that is, as software executed in a virtualized infrastructure environment. Figure 1 shows how NFV decouples applications (which could be network functions, service enablers or application servers) from the infrastructure.
In this paper, we use the term network functions in a broad sense. It can relate to functions within the control or forwarding plane of mobile-core networks (such as the Evolved Packet Core and IMS) and fixed-access networks (such as the Broadband Network Gateway [BNG]), but also to functions in the forwarding plane of IP networks, for example, switches, routers and firewalls. In
addition, the virtualization of customer premises equipment such as
home and enterprise gateways is covered by this concept.
Furthermore, it also covers application servers such as video
servers that are involved in the delivery of network services.
The NFV concept was first presented to a wider audience in a
white paper written by 13 Tier 1 operators that was published in
October 2012 [1]. It resulted in the creation of an ISG within
ETSI, which started work in January 2013. Within less than two
years, the number of active supporters of the ETSI NFV ISG grew to
235 companies, including 34 service provider organizations [2] [3].
It is important to note that an ISG in ETSI does not have a
standardization mandate. It is rather a pre-standardization
activity that focuses primarily on use cases, high-level architecture design and proofs of concept, and it typically has a
lifespan of two years.
In a later specification phase, it is expected that existing de
facto IT cloud standards will be reused to a large extent. This
would also address the need for operators to harmonize their NFV
infrastructure as much as possible with existing IT cloud
infrastructure already running in their data centers.
Figure 1: NFV removes vertical coupling between applications and infrastructure. (The figure contrasts vertically integrated stacks of application, middleware, OS, switch and processor with a transformed model in which applications and their middleware run on Linux over industry-standard virtualization, industry-standard processors and industry-standard switches.)
Virtual data centers
Figure 2: The VDC concept: virtual data centers VDC1-VDC3 span physical data centers DC1-DC3 under common UNICA infrastructure domain management, exposed through APIs to user portals and UNICA infra domain portals. Source: Telefónica S.A.
Today's IT data centers are typically geographically centralized at larger sites of several thousand square meters. However, there are several reasons for using more than one data center per geographical region. One reason is redundancy: if there is an outage in one location, the data center in another location can take over, thus guaranteeing service continuity. Another reason is user
experience and the cost of traffic. Companies like Google,
Facebook, and Netflix all started with a few data centers in the
US. However, with heavier traffic and rising numbers of users
worldwide, they were forced to add data centers in other regions in
order to maintain good user experience and to optimize traffic
costs.
Similar considerations apply to NFV. Several Tier 1 operators
have expressed demand for distributed cloud environments in which
some parts of the cloud infrastructure are provided by centralized,
national data centers and other parts of the cloud environment are
located closer to the access network. For operators, it would be a natural choice to place remote cloud-execution environments, for example, in existing central office sites where fixed access gateways, such as a broadband remote access server or BNG, or mobile packet gateways, such as a Gateway GPRS Support Node or Evolved Packet Gateway, are already installed.
Telefónica is an example of an operator that has fully embraced the concept of virtualized data centers across distributed cloud environments. Figure 2 shows UNICA, which is Telefónica's future data center architecture. An important requirement in UNICA is the ability to support several virtual data centers (VDCs). Each VDC can be managed by a different organizational entity, and a VDC can span several physical cloud environments. The various physical cloud environments can be located in centralized data centers or in smaller central office sites closer to the access network. All VDCs run on top of a common network infrastructure; however, each VDC only sees the slice of the network assigned to it.
New challenges for the WAN

The evolution toward NFV
requires a distributed cloud that allows the creation, deletion,
and free movement of virtual machines across different geographical
locations. As a consequence, a WAN that interconnects the
distributed cloud must be able to deal with more dynamically
changing traffic patterns.
To better understand the challenges that virtualized network functions and virtualized data centers place on the WAN, a brief introduction to Traffic Engineering (TE) is needed. In simple
terms, TE refers to the process of planning and steering data flows
through the network. Operators have several reasons to deploy TE in
their networks. The main reason is to minimize capex by increasing
utilization of network resources.
A common approach to TE is to create a traffic matrix that
describes the bandwidth requirements between sources and
destinations in a network. This traffic matrix is then used as
input for the actual path computation, taking into consideration
QoS and cost parameters, with the goal to maximize utilization
while fulfilling all traffic requirements.
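The idea of placing the demands of a traffic matrix onto the network can be sketched in a few lines of code. The topology, demand values and simple hop-count metric below are illustrative assumptions, not taken from this paper; a real TE system would use measured matrices and richer QoS and cost constraints.

```python
import heapq

# Minimal sketch of traffic-matrix-driven path computation.
# Each demand is routed over the fewest hops among links that
# still have enough residual bandwidth, and that bandwidth is
# then reserved so later demands see the updated utilization.

CAPACITY = {  # undirected link -> capacity in Gbit/s (assumed)
    frozenset({"PE1", "PE2"}): 10,
    frozenset({"PE2", "PE4"}): 10,
    frozenset({"PE2", "PE3"}): 10,
    frozenset({"PE3", "PE4"}): 10,
}

def compute_path(src, dst, demand, reserved):
    """Dijkstra (hop count) over links with residual capacity >= demand."""
    adj = {}
    for link in CAPACITY:
        a, b = tuple(link)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    queue, seen = [(0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in adj.get(node, ()):
            link = frozenset({node, nxt})
            free = CAPACITY[link] - reserved.get(link, 0)
            if nxt not in seen and free >= demand:
                heapq.heappush(queue, (cost + 1, nxt, path + [nxt]))
    return None  # no feasible path

def place_traffic_matrix(matrix):
    """Route each (src, dst) -> demand entry and reserve its bandwidth."""
    reserved, paths = {}, {}
    for (src, dst), demand in matrix.items():
        path = compute_path(src, dst, demand, reserved)
        if path is None:
            raise RuntimeError(f"demand {src}->{dst} cannot be placed")
        for a, b in zip(path, path[1:]):
            link = frozenset({a, b})
            reserved[link] = reserved.get(link, 0) + demand
        paths[(src, dst)] = path
    return paths, reserved

# 8G on the direct PE2-PE4 link leaves only 2G spare, so the second
# demand is forced onto the longer path via PE3.
paths, reserved = place_traffic_matrix({("PE2", "PE4"): 8, ("PE1", "PE4"): 4})
print(paths[("PE2", "PE4")])  # ['PE2', 'PE4']
print(paths[("PE1", "PE4")])  # ['PE1', 'PE2', 'PE3', 'PE4']
```

The greedy ordering here is only a toy; an operator-grade tool would optimize all demands jointly against cost and QoS objectives.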
The introduction of the VDC concept puts new requirements on the
underlying WAN since the traffic within each VDC needs to be
completely isolated from traffic of other VDCs. Concepts and
technologies for achieving this partitioning are commonly referred
to as network slicing.
In addition, NFV will make the traffic within each VDC more dynamic. One reason for this is the location independence of network functions, as exemplified in the following scenarios:
> When creating new service instances, resources can be chosen wherever they are available in the network. Resource pools can span various data centers in different geographical locations.
> Relocation of existing service instances becomes a pure software function: virtual machine (VM) migration. Since no hardware has to be moved, relocation of service instances is likely to happen more frequently than with vertically integrated nodes.

Reasons for relocating existing virtual network functions include the following:
> resource availability (compute, storage, network bandwidth)
> user mobility (change in service demands per location)
> reduced latency and transmission cost, achieved by bringing service nodes closer to the consumer and avoiding traffic tromboning
> failure cases, resiliency, disaster recovery
> planned maintenance
> policy and security
> energy and cooling capacity.
Whenever a virtual network function is instantiated or moved to
a new location, traffic flows will change as well. This applies to,
for example, traffic flows between a compute node and the related
storage, or between a compute function and the access network
connecting the user. The requirement on the WAN transport is to
react to changes in traffic demand at the same pace at which
virtual network functions are created or moved. Changes in traffic
flows must be analyzed in real time or near-real time for their
impact on allocated bandwidth in the transport.
As long as changes are small and can be supported by already
allocated resources, no change in the WAN transport may be
required. The WAN transport must, however, be able to monitor the
cumulative effect of all service modifications initiated by the
management and orchestration systems and take appropriate measures
when needed.
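The monitoring behavior described above can be sketched as follows. The class, link names and threshold model are invented for illustration: small demand changes are absorbed by the bandwidth already allocated per link, and re-optimization is flagged only when their cumulative effect exceeds it.

```python
# Hypothetical sketch: the WAN transport tracks the cumulative
# effect of service modifications per link and signals when TE
# must re-run because the allocation no longer covers demand.

class DemandMonitor:
    def __init__(self, allocated):
        # allocated: link name -> bandwidth reserved for it (Gbit/s)
        self.allocated = dict(allocated)
        self.demand = {link: 0.0 for link in allocated}

    def apply_change(self, link, delta_gbps):
        """Record one VM creation, deletion or move affecting a link.
        Returns True while the link still fits inside its allocation,
        False once re-optimization is needed for it."""
        self.demand[link] += delta_gbps
        return self.demand[link] <= self.allocated[link]

mon = DemandMonitor({"PE2-PE4": 10.0})
print(mon.apply_change("PE2-PE4", 6.0))  # True: 6.0 fits in 10G
print(mon.apply_change("PE2-PE4", 3.0))  # True: cumulative 9.0
print(mon.apply_change("PE2-PE4", 2.0))  # False: 11.0 exceeds 10G
```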
Below is an example scenario that illustrates the impact of NFV on the traffic matrix in the WAN. For any given traffic pattern caused by the existing distribution of compute and storage functions between the three data centers in Figure 3, we assume the network has been configured to accommodate the resulting traffic matrix. TE could have been applied to optimize QoS parameters and cost by achieving optimal network utilization.
In a next step, we assume that some virtual network functions are relocated. As shown in Figure 3, the compute resources in data center 3 are overutilized, while there are free resources in data center 2. In order to utilize compute resources more efficiently, some virtual network functions (running as virtual machines in data center 3) could be relocated to data center 2. Alternatively, when relocation is not possible or not desired, any new VMs could be created in data center 2 instead of the already overloaded data center 3. In either case, the creation of new VMs in data center 2 will impact the traffic matrix in the WAN, for example to connect to storage or bare metal servers in data center 3, or simply because traffic from the access now has to be routed to data center 2 instead of data center 3.

Figure 4 shows the situation after the change, with additional VMs located in data center 2. While computing resources are now used more evenly, the traffic matrix has changed in the WAN and requires re-optimization of traffic flows to optimize cost and performance.
As shown in the example above, the introduction of VDCs and NFV
creates new requirements for TE in the WAN. In the following
section, we will outline a solution architecture that can react to
changes more quickly by controlling the WAN TE from an overarching
orchestration layer.
Figure 3: Data center scenario before change. Network utilization may be optimal, but compute utilization is suboptimal. The figure shows traffic flows between an enterprise CE, a CSR, storage, bare metal servers and compute resources (VMs) across data centers 1, 2 and 3, connected through a PE router.

Figure 4: Data center scenario after change. With compute resources now also used in data center 2, compute utilization may be optimal, but the network needs optimization: the shortest path may now be congested.
Solution architecture for dynamic TE

In the following, we describe a solution architecture for dynamic TE based on two main concepts:
> abstracting data centers as virtual Provider Edge (vPE)
> TE with centralized path computation.
This solution architecture is based on software-defined networking (SDN) principles in the following way: control plane centralization is applied both within the data center and in the WAN. Logical centralization of control plane functions creates
a consolidated network view and a single control point for
cross-network domain orchestration systems through standardized
northbound application programming interfaces (APIs).
ABSTRACTING DATA CENTERS AS VPE

IP/multi-protocol label switching (MPLS) is the dominant technology used in service-provider networks today. The data center itself often uses different technologies
such as Virtual Extensible LAN and Ethernet, including Provider
Bridging and Shortest Path Bridging. Different data plane
technologies increase the complexity of end-to-end TE when applying
traditional concepts.
A PE router is a standard component in IP/MPLS provider
networks. A new way of integrating a data center into an IP/MPLS
network is by representing the data center as a vPE router. The
term virtual indicates that the data center appears like a PE to
the outside network. This can be achieved by a central logical
control node in the data center, fulfilling two purposes. Firstly,
it controls the switches in the data center. Secondly, it speaks
standard PE routing protocols to the IP/MPLS WAN. Very large data
centers can be modeled as several vPE routers.
The vPE approach is illustrated in Figure 5. Inside the data
center, the Cloud Network Controller (CNC) is introduced as the
central control plane instance. Being an SDN controller, it offers
a northbound API allowing data center orchestration to control
traffic flows centrally inside the data center. The CNC also
Figure 5: Data center architecture with vPE concept. Cross-domain orchestration sits above per-data-center orchestration and a PCE in the WAN. In each data center, a CNC implements the vPE control plane, controlling the vSwitches and gateway that form the vPE forwarding plane over the data center transport, and exchanging topology and control information with the IP/MPLS WAN.
controls data plane nodes within the data center through its southbound interface. OpenFlow, a protocol standardized by the Open Networking Foundation, is well suited as a southbound interface between the control and forwarding planes. The controller can use other southbound protocols, such as NETCONF or the Extensible Messaging and Presence Protocol, depending on the type of forwarding device to be controlled.
Towards the WAN, the CNC behaves like the control plane of a
conventional router. This allows for easy integration with existing
WAN networks, using standard routing protocols.
A second SDN controller is introduced to offer a central control
point from the cross-domain orchestration system toward the WAN
interconnecting cloud locations. This WAN controller acts as a Path
Computation Element (PCE) that computes and controls traffic paths
in the WAN by interacting with physical or vPE routers.
TE WITH CENTRALIZED PATH COMPUTATION

MPLS is the dominant data plane technology in the WAN today. With MPLS, packets are sent from one network node to the next based on simple labels rather than complex network addresses. Several methods have been defined to enable TE in MPLS networks. The underlying principle of all of these mechanisms is to pre-configure label switched paths (LSPs) through the network based on path computation.
Existing TE approaches can be divided into distributed and
centralized approaches according to the way paths are set up in the
transport network. In distributed approaches, topology attributes
are disseminated via extensions of existing routing protocols such
as Open Shortest Path First-TE or Intermediate System to
Intermediate System-TE. The head-end node of each individual LSP is
in charge of computing the path for its LSPs based on topology and
constraints learned through the routing protocol. The head-end node then signals the LSPs through the network using RSVP-TE (Resource Reservation Protocol with Traffic Engineering extensions).
The drawback of this approach is the required signaling through
the network. LSP signaling not only happens when new LSPs are set
up but also at regular time intervals to refresh LSP state on all
routers along each LSP. In large networks, this signaling, together
with the need to store LSP state on all routers, can represent a
problem for scalability.
In centralized approaches, the LSP calculation is often
performed offline in a management system. The LSP configuration is
then pushed into the nodes through Operation and Maintenance
procedures. The advantage of a centralized approach is that it can
consider application requirements more easily than a distributed
path computation in the network nodes. Central path calculation can also compute better paths, based on global knowledge, which results in better network resource utilization.
The drawback of the traditional approach of centralized offline calculation is a lack of agility and the need for complex and expensive management software. It also involves a high degree of manual work to prepare the necessary input for path calculation. This results in high cost and slow response times.

With the advent of SDN principles, new solutions that can overcome the limitations of existing TE solutions become feasible.
An SDN-based approach has the following characteristics:
> Path calculation is done in a centralized controller.
> The controller provides open northbound interfaces for integration with an orchestration layer.
> The controller supports various southbound interfaces toward the network elements for configuring the calculated path in the data plane.
Centralized path computation is performed by a dedicated PCE.
The PCE is a functional building block that can either be deployed
as a standalone node or as a component of a Multi-layer WAN
Controller (MLWC). In both cases, the central controller provides a
central touch point that allows higher layer orchestration systems
to interact with the network. This makes it possible for
orchestration systems to perform optimizations across different
types of resources, covering compute, storage and also networking,
which results in optimal utilization across all resource
categories. Deploying the PCE as part of an MLWC offers the
additional capability of performing path optimization across the
IP/MPLS and optical layers [4]. The MLWC can be implemented on an
open SDN controller platform such as OpenDaylight [5]. Another
example of an open SDN controller platform for WAN applications is
ONOS [6].
An appealing solution results from combining an SDN-based
control layer with Segment Routing (SR) in the data plane. SR is
currently under standardization by the Internet Engineering Task
Force, and is supported by several large network equipment vendors
and operators. In contrast to conventional TE solutions, SR removes
the need to signal and periodically refresh LSPs along the entire
path in the network. Information about the path of an LSP is not
stored in each router along the path, but in the header of payload
packets. This reduces the amount of
label information and the complexity of maintaining it in all
routers along an LSP. Together with the removal of label signaling,
this allows the network to scale better by offloading the control
and forwarding planes in routers. Besides scalable TE, SR also offers advantages such as 100 percent fast reroute coverage in arbitrary topologies, support for network-wide service chaining, and IPv6 as a data plane alternative to MPLS.
It should also be noted that centralized path computation also
works with conventional label signaling or even in heterogeneous
environments where parts of the network use SR and other parts use
a label distribution protocol. This allows for smooth migration of
existing networks. SDN-based SR can be gradually introduced via software updates. Only the edge nodes require an interface toward an SDN controller. Intermediate network nodes do not need an interface toward the SDN controller; they only need to support the SR label distribution and forwarding mechanisms, which in most cases requires just a software update.
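The header-encoded path that distinguishes SR from conventional LSP signaling can be illustrated with a small sketch. The node segment identifiers (SIDs) below are invented; in a real network they are advertised through routing protocol extensions.

```python
# Sketch of the segment routing idea: the ingress edge router
# encodes the PCE-computed explicit path once, as a stack of
# labels (segment IDs) in the packet header. Transit routers keep
# no per-LSP state; each simply pops the top label and forwards
# toward the node it names.

NODE_SID = {"PE2": 16002, "PE3": 16003, "PE4": 16004}  # assumed SIDs

def encode_path(explicit_path):
    """Turn an explicit path into the label stack pushed at ingress."""
    return [NODE_SID[node] for node in explicit_path]

def forward(stack):
    """Model one hop: pop the top segment, keep the remaining stack."""
    return stack[0], stack[1:]

# A PCE-computed detour via PE3 toward PE4, imposed at ingress PE2:
stack = encode_path(["PE3", "PE4"])
print(stack)  # [16003, 16004]
segment, stack = forward(stack)
print(segment, stack)  # 16003 [16004]
```

Because the path lives entirely in the packet header, no signaling or refresh messages traverse the network, which is the scalability gain described above.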
Use case example

Here is a practical use case where cross-domain orchestration works together with a central PCE to provide optimal forwarding for different types of applications. In the network
setup shown in Figure 6, assume there is a high service demand in
an area near data center DC2. There will be a mix of various
applications creating delay-sensitive traffic, such as voice and
VoLTE, and high-bandwidth traffic, such as video. Without
intervention, the higher than usual traffic demand could create
congestion on the shortest path link between PE2 and PE4. This can
cause packet loss and delay with service degradation for all
applications, while at the same time spare bandwidth is available
on non-shortest path links.

In reaction to the increased traffic demand, the cross-domain orchestration system can interact both with the data center and with the WAN. The orchestration system could, for example, instantiate virtual network functions such as caching servers or content filters close to consumers in data center DC2. In addition, the orchestration system can also
interact with the WAN to achieve more optimal load distribution.
This must also take into consideration the QoS requirements of the
different applications.
In the example shown here, the direct link between PE2 and PE4
has low delay but is highly utilized. There is an additional path
via PE3 that has spare bandwidth but higher delay. Without TE, all
traffic will use the shortest path and no load balancing will
occur.
When network analytics functions detect suboptimal resource
utilization, the orchestration system instructs the PCE to
establish two alternative MPLS LSPs for load balancing (step 1).
The PCE computes optimal routes for each LSP based on topology and
bandwidth information (step 2). Path computation can also take
other constraints into account like delay, path diversity,
protection type and time of day. The PCE then signals the path to
the edge routers PE2 and PE4 using the PCE Protocol (step 3). The
PE routers assign traffic flows to LSPs on a per-VPN or
per-class-of-service basis (step 4). For example, delay-sensitive
VoLTE traffic is assigned to the low delay path, while high-volume
video traffic, which is less delay-sensitive, is assigned to the
path with higher bandwidth and higher delay. As a result, network
resources are used efficiently and QoS requirements are fulfilled
for each type of application.
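Step 4 above, the per-class assignment of flows to LSPs, can be sketched as follows. The path properties and class names are illustrative assumptions, not values from the paper.

```python
# Sketch of per-class-of-service LSP assignment at the edge router:
# delay-sensitive classes get the lowest-delay path, bulk classes
# the path with the most spare bandwidth.

LSPS = {
    "direct":  {"delay_ms": 10, "spare_gbps": 1},  # congested PE2-PE4 link
    "via_PE3": {"delay_ms": 30, "spare_gbps": 8},  # longer detour
}

def assign(cos):
    """Pick one of the two PCE-computed LSPs for a class of service."""
    if cos in ("voice", "volte"):
        return min(LSPS, key=lambda lsp: LSPS[lsp]["delay_ms"])
    return max(LSPS, key=lambda lsp: LSPS[lsp]["spare_gbps"])

print(assign("volte"))  # direct
print(assign("video"))  # via_PE3
```

In practice the PE routers would apply such a policy per VPN or per class-of-service marking, as described in the text.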
Figure 6: Suboptimal resource utilization. Cross-domain orchestration and a PCE oversee a WAN connecting vPE1 (DC1), vPE2 (DC2), PE3, PE4, the access network and the internet. The links are 10G with delays of 10 ms and 20 ms, and traffic concentrates on the shortest path between PE2 and PE4.
In an SR scenario, the explicit paths computed by the PCE are encoded in the MPLS label stack and communicated to PE2 and PE4.
In contrast to a solution with RSVP-TE, an SR solution does not
require any signaling along the LSP path. No LSP state has to be
stored on internal backbone routers. The need to refresh LSP state
through periodic RSVP-TE messages is also removed, resulting in
increased scalability of the solution. SR even removes the need for
a dedicated label distribution protocol by carrying label
information in extensions of conventional routing protocols.
Figure 7: Network optimized with TE. The figure annotates the four steps described in the text: the orchestration system instructs the PCE (1), the PCE computes routes (2) and signals the labeled paths to the edge routers (3), and the edge routers assign delay-sensitive traffic to the low-delay path and high-volume traffic to the path via PE3 (4).
Conclusion

Virtualizing network functions in distributed cloud
environments leads to more dynamically changing traffic patterns in
the WAN. This creates challenges for traditional TE approaches that
are unable to react quickly enough to changing traffic patterns,
and leads to inefficient use of transport resources and QoS
degradations.
This paper outlines a network architecture that supports the
implementation of an automated, near real-time TE solution and
fulfills the needs of distributed cloud environments. The approach
is characterized by two main concepts:
> abstraction of the data center as a vPE
> TE with a central PCE.
Both concepts can be implemented with an SDN-based approach in
an open and standardized way. With this approach, the network
becomes accessible for the orchestration system and allows
cross-domain optimization of compute, storage and network
resources. This allows operators to maximize utilization for all
types of resources as a whole, with reduced effort and response
times.
The proposed SDN-based TE approach works with conventional label
signaling for path setup. However, an appealing solution results
from using SR in the data plane as a means to manage transport
paths in the WAN. SR can be gradually introduced via software
updates.
This white paper has been developed in collaboration with Telefónica I+D.
GLOSSARY

API application programming interface
BNG Broadband Network Gateway
CE Customer Edge
CSR cell site router
CNC Cloud Network Controller
ETSI European Telecommunications Standards Institute
ISG Industry Specification Group
LSP label switched path
MLWC Multi-layer WAN Controller
MPLS multi-protocol label switching
NFV Network Functions Virtualization
OS operating system
PCE Path Computation Element
PE Provider Edge
RSVP Resource Reservation Protocol
SDN software-defined networking
SR Segment Routing
TE Traffic Engineering
VDC virtual data center
VM virtual machine
vPE virtual Provider Edge
WAN wide area network
References
[1] ETSI NFV ISG, October 2012, Network Functions
Virtualisation: An Introduction, Benefits, Enablers, Challenges
& Call for Action, available at:
http://portal.etsi.org/NFV/NFV_White_Paper.pdf
[2] ETSI NFV ISG, October 2013, Network Functions Virtualisation
(NFV): Network Operator Perspectives on Industry Progress,
available at: http://portal.etsi.org/nfv/nfv_white_paper2.pdf
[3] ETSI NFV ISG, October 2014, Network Functions Virtualisation
(NFV): Network Operator Perspectives on Industry Progress,
available at: http://portal.etsi.org/nfv/nfv_white_paper3.pdf
[4] Paola Iovanna, Fabio Ubaldi, Francesco Di Michele, Juan Pedro Fernandez-Palacios Gimenez & Victor Lopez, E2E Traffic Engineering Routing for Transport SDN, Ericsson, Corital & Telefónica, OFC 2014 conference
[5] OpenDaylight, accessed December 2014, available at:
http://www.opendaylight.org/
[6] ON.Lab ONOS, accessed December 2014, available at:
http://onosproject.org/
Further Reading

Cloud computing in telecommunications, Ericsson Review, June 2010, Jan Gabrielsson, Ola Hubertsson, Ignacio Más & Robert Skog, available at: http://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2010/cloudcomputing.pdf

Software-defined networking: the service provider perspective, Ericsson Review, February 2013, Attila Takacs, Elisa Bellagamba & Joe Wilke, available at: http://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2013/er-software-defined-networking.pdf

Virtualizing network services: the telecom cloud, Ericsson Review, March 2014, Henrik Basilier, Marian Darula & Joe Wilke, available at: http://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2014/er-telecom-cloud.pdf