Quality of Service in Lte Long Form

Oct 22, 2015

Quality of Service in LTE

Contents

Executive Summary
Overview
Networks Covered
HSPA & LTE Interworking
CDMA & LTE Interworking
MVNO Environments
The Fall of Erlang and the Rise of IP
Protocol Stacks: LTE is all IP
Mechanics of Per-Hop Differentiated Queuing
Key functions in 3GPP PCC standards
3GPP PCC Theory of Operation
TFT
PCC rule parameters
What is the “End” in End-to-End?
LTE QoS Use Cases: Fairshare Traffic Management
PCRF Signaling
In-Band Marking (TEID modification)
In-Band Marking SGi (DSCP Modification)
Comparison of Techniques
Automated QoS Control for Mobile Network Congestion Management
Conclusions and Recommendations

Executive Summary

As each communications service provider (CSP) transitions various network types to LTE, the efficient handling of subscriber Quality of Service (QoS), both inside and across different networks, is a pressing issue.

In this sweeping, in-depth look at various network technologies and available approaches to QoS-handling within and across networks, Sandvine draws out the key issues and presents recommendations for sound Network Policy Control:

• Affected network types and architectures
• Evolution and consequences of the all-IP network architecture
• Background and explanation of the 3GPP elements used in the delivery of services, and their key functions
• Explanation of the 3GPP PCC (policy and charging control) theory of operation
• Issues associated with the boundary interchange between network types for QoS
• Key questions and decisions CSPs face when defining and managing end-to-end QoS in LTE and between networks
• Explanation of the three possible mechanisms that exist for per-sector prioritization in networks that have deployed LTE
• Comparison and pros and cons of each of the possible QoS-handling techniques for LTE

Finally, the paper shows how the inherent flexibility of Sandvine technology allows our Fairshare Traffic Management to support all three QoS-handling methods for LTE networks, including unique support for the most effective and efficient approach.

Overview

Although this document focuses on LTE (3GPP R9 and later), much of the background for mobile network QoS comes from earlier 3GPP revisions. In particular, much of the baseline framework was defined in 3GPP R7 (shown below in Figure 1), so it is useful to highlight the differences. This document refers to H-PLMN as the home network, the operator to which the subscriber pays a fee, and the V-PLMN as the visited network, the one the subscriber is currently attached to. The normal case is that the H-PLMN and V-PLMN are the same, and the subscriber is not roaming.

Figure 1: Typical infrastructural roaming, 3GPP R7

Not covered are QoS issues that occur inside the handset. Older circuit-switched voice handsets guarantee quality directly in the baseband and real-time operating system. Newer smartphones have moved to non-real-time operating systems (BSD-based for Apple, Linux-based for Android, Windows-based for Microsoft), and there is a strong probability that ‘jitter’ and ‘lag’ are introduced inside the OS scheduler itself.

In addition, roaming between LTE and HSPA networks is possible, as is roaming between 3GPP and 3GPP2 networks, and this has a consequence on QoS and mapping between capabilities. Therefore, the boundary interchange between networks for QoS is thoroughly discussed in this paper.

Networks Covered

The issues and points discussed in this paper are applicable to the following network types and interactions:

HSPA & LTE Interworking

A mobile operator currently using HSPA technologies, and migrating towards LTE, will usually support soft hand-off (e.g., a dual-mode device that can switch mid-session depending on available coverage). This type of network is shown in Figure 2.

Figure 2: LTE and HSPA interworking

CDMA & LTE Interworking

Some operators currently use CDMA (3GPP2) technologies, and are migrating to LTE. As part of the migration they may support soft hand-off (e.g., user-equipment that can start a session on one network and move to another mid-session). Hard hand-off techniques are not covered here. Figure 3 shows a network diagram for a mixed CDMA and LTE operator.

Figure 3: LTE and CDMA interworking, single operator

MVNO Environments

Another case covered is that of an MVNO (Mobile Virtual Network Operator). There are two types of MVNO: a fully virtual MVNO (simply a marketing brand, with no network at all), and a partial infrastructural MVNO, which owns an HLR/AAA and a GGSN. The latter is shown in Figure 4. An MVNO with a partial infrastructure has the ability to create differentiated radio access bearers via its GGSN, and care must be taken to prevent an imbalance in end-consumer experience on the shared infrastructure.

Figure 4: MVNO environment

The Fall of Erlang and the Rise of IP

Since the beginning of the 20th century, voice networks have been engineered for capacity according to the Erlang model. In comparison to data networks, voice networks have some key simplifications that allowed the modeling to occur:

• Voice is treated as constant bitrate (i.e., one voice circuit uses constant network bandwidth)
• Voice is treated as a symmetric path (both directions follow the same links)
• Voice is treated as a single path (no multi-path networks are used)
• Voice sessions start at a predictable rate according to human behaviour
• Voice ‘packets’ are fixed size
• Sessions go from many-to-one (handsets to voice switch) and do not interact with each other

As a consequence, telephony network providers were able to build their network capacity according to simple and fixed design rules (e.g., 99.99% of calls would complete at the busiest hour). It is assumed the quality of a call is Boolean – if it connects, it has perfect quality; if there is insufficient capacity, it is blocked (connection admission control).
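Design rules of this kind are conventionally derived from the Erlang B formula, which gives the probability that a call is blocked when a given load is offered to a fixed number of circuits. The following is a minimal sketch, not taken from the paper; the function names and the 100-erlang example load are illustrative assumptions.

```python
def erlang_b(offered_erlangs: float, circuits: int) -> float:
    """Blocking probability for `offered_erlangs` of traffic on `circuits` trunks.

    Uses the standard recurrence B(0) = 1, B(n) = A*B(n-1) / (n + A*B(n-1)),
    which avoids the large factorials of the closed-form expression.
    """
    if offered_erlangs < 0 or circuits < 0:
        raise ValueError("load and circuit count must be non-negative")
    blocking = 1.0
    for n in range(1, circuits + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def circuits_needed(offered_erlangs: float, target_blocking: float) -> int:
    """Smallest circuit count whose blocking probability meets the target."""
    if not 0.0 < target_blocking <= 1.0:
        raise ValueError("target blocking must be in (0, 1]")
    circuits = 0
    while erlang_b(offered_erlangs, circuits) > target_blocking:
        circuits += 1
    return circuits


if __name__ == "__main__":
    # Hypothetical busy-hour load of 100 erlangs; a 99.99% completion rate
    # corresponds to a blocking target of 0.0001.
    print(circuits_needed(100.0, 1e-4))
```

Because call quality is treated as Boolean, blocking probability is the only metric such a model needs; nothing comparable exists for best-effort packet traffic.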

Early-generation mobile networks introduced some complexity to voice Erlang in that the hand-off between locations had to be handled as the user moved, but the overall rules and technologies stayed the same - an end-to-end circuit from the user handset to the mobile-switching center (and from there to the call recipient) started at a predictable, low rate and used a fixed capacity in a symmetric fashion on a single path.

As data needs grew, IP packet-switched networks became the de facto standard. Network engineering in data was performed based on peak observed load and forward-prediction models such as Holt-Winters forecasting. Capacity-based billing models emerged between carriers based on the 95th percentile of measured usage. QoS switched from deterministic to probabilistic. Circuit-based QoS was replaced by per-hop behaviour. Within a service provider, QoS management may be performed using Differentiated Services Code Point (DSCP) to modify the probability on a per-hop basis. QoS management between operators is rare. Operators typically provision managed services such as video and voice using traditional circuit-switched models, non-converged networks, or networks converged at the physical layer but partitioned using techniques such as Multiprotocol Label Switching (MPLS). Networks are normally treated as non-oversubscribed except for the ‘last-mile’ consumer access.

Applications requiring quality goals typically build them into the application (usually with buffering, or with complex codecs such as Skype’s SILK, which allow for packet loss). The oversubscription in the fixed networks is normally sufficient for QoS-sensitive applications such as Skype and Netflix to function in a best-effort environment most of the time.

Figure 5: MRTG utilised capacity chart, 95% line shown
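As a rough illustration of 95th-percentile billing, the sketch below applies the nearest-rank method to a list of periodic utilisation samples: the samples are sorted, the top 5% are discarded, and the highest remaining value is the billable rate. The function name and the sample values are assumptions made for illustration; real billing systems differ in sampling interval and rounding.

```python
def percentile_95(samples_mbps):
    """Nearest-rank 95th percentile of utilisation samples (any consistent unit)."""
    if not samples_mbps:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_mbps)
    # 1-based nearest rank, ceil(0.95 * n), computed with integers to avoid
    # floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical month: 95 quiet samples at 10 Mbit/s and 5 bursts at 1000 Mbit/s.
    samples = [10] * 95 + [1000] * 5
    print(percentile_95(samples))
```

The bursts in the example fall entirely within the discarded 5%, which is why this billing model tolerates short peaks.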

As mobile data emerged, network engineering based on observed trends became problematic. The high rate of adoption of new devices and new applications meant that capacity could not be added quickly enough. The disparity between ‘busy’ and ‘non-busy’ mobile sectors is now high in terms of volume, and the sectors that are busy vary due to mobility.

Since data applications use variable packet sizes, they tend to interact with each other poorly as the links approach 70-80% utilization. In particular, latency tends to go up exponentially as the link utilisation goes over 75%. Applications which use TCP and large packets tend to dominate the throughput, creating latency issues for smaller packet applications.
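The sharp rise in latency near saturation can be illustrated with a textbook M/M/1 queue. This model is an assumption introduced here, not the one behind the paper's figures, and it ignores packet-size mix and TCP dynamics. In it, mean time in the system is the service time divided by (1 - utilization).

```python
def mm1_mean_delay_ms(utilization: float, service_time_ms: float) -> float:
    """Mean time in system for an M/M/1 queue: T = S / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if service_time_ms <= 0:
        raise ValueError("service time must be positive")
    return service_time_ms / (1.0 - utilization)


if __name__ == "__main__":
    # Hypothetical 1 ms mean service time per packet.
    for rho in (0.25, 0.5, 0.75, 0.9, 0.95):
        print(f"{rho:.0%} utilised -> {mm1_mean_delay_ms(rho, 1.0):.1f} ms")
```

Even in this simplified model, delay doubles between 50% and 75% utilization and then grows without bound as the link approaches saturation.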

Figure 6: Ethernet utilization vs. loss/latency (chart of latency in ms and loss in % plotted against utilization in %)

Protocol Stacks: LTE is all IP

One of the design goals of LTE was to be entirely IP (including carrying voice over IP). As a consequence, QoS must be understood in both the context of the ‘inner-tunnel’, which interoperates with the packet core and radio network, and the ‘outer-tunnel’, which uses traditional IP traffic engineering techniques. The various tunneling and encapsulation protocols that are required are shown in Figure 7.

Figure 7: LTE major transport encapsulations

It is common practice for an operator to use MPLS or another tunneling technique (see Figure 8), and in practical terms it is impossible to convey the QoS markings from inner-tunnels to outer-tunnels. A best practice is to engineer the network such that there is only one point of congestion (the eNodeB). This may be difficult to achieve as the S1-U may be significantly over-subscribed. As a consequence, the usual requirement is to use outer marks (DSCP, MPLS EXP) that are driven in conjunction with the inner marks (TEID). If the P-GW is the label-edge router in the downstream direction, one can use a feature such as Cisco’s “ip user-datagram-tos copy”, which causes the inner packet encapsulated by GTP-U to also drive the outer DSCP. This can influence the transport network on S5 and S1-U to somewhat match the decisions which will be made by the eNodeB (which are based on the tunnel ID (TEID) and the PDP context parameters). The “active-charging service” feature can be used to achieve the inverse: taking un-encapsulated packets from the SGi interface and using their DSCP mark to map to a specific PDP context.
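To make the inner-to-outer relationship concrete, the sketch below wraps an IPv4 packet in a minimal GTP-U header (8 mandatory bytes: flags, message type, payload length, TEID) and reads the inner DSCP so that a sender could apply it to the outer IP header. It is an illustration of the idea only, not a model of any vendor feature; the function names are hypothetical and the outer IP/UDP headers are left out.

```python
import struct
from typing import Tuple

GTPU_FLAGS = 0x30     # version 1, protocol type GTP, no optional fields
GTPU_MSG_GPDU = 0xFF  # G-PDU: the message type that carries a user packet


def inner_dscp(inner_ipv4: bytes) -> int:
    """DSCP is the upper 6 bits of the second byte of an IPv4 header."""
    if len(inner_ipv4) < 20 or inner_ipv4[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    return inner_ipv4[1] >> 2


def gtpu_encapsulate(inner_ipv4: bytes, teid: int) -> Tuple[bytes, int]:
    """Return (GTP-U frame, DSCP to apply to the outer IP header)."""
    header = struct.pack("!BBHI", GTPU_FLAGS, GTPU_MSG_GPDU, len(inner_ipv4), teid)
    return header + inner_ipv4, inner_dscp(inner_ipv4)


if __name__ == "__main__":
    # Hypothetical inner packet: bare 20-byte IPv4 header marked DSCP 46 (EF).
    inner = bytes([0x45, 46 << 2]) + bytes(18)
    frame, outer_dscp = gtpu_encapsulate(inner, teid=0x1234)
    print(len(frame), outer_dscp)
```

The TEID selects the bearer treatment at the eNodeB, while the copied DSCP is the only mark that transport routers between the gateways and the eNodeB can see.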

Figure 8: Practical LTE deployment

Mechanics of Per-Hop Differentiated Queuing

In IP-based networks, differentiated service is performed on a per-hop basis. The most common techniques used are Differentiated Services (DiffServ, RFC 2474, RFC 2475, RFC 3260), which uses the 6-bit DSCP field in the IP header, and MPLS, which also uses the DiffServ architecture, but with different marking techniques (RFC 3270). In particular, MPLS supports 3 bits (8 levels) in the EXP field. In 3GPP, the QoS Class Identifier (QCI) maps directly to DSCP. The basic classes defined by DiffServ are ‘default’, ‘expedited forwarding’, and ‘assured forwarding’. Of these, ‘expedited forwarding’ is used for ‘strict’ priority (e.g., video and voice), and ‘assured forwarding’ is used for business differentiation (e.g., weighted-fair priority).
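The bit layout behind these marks can be sketched as follows. The assured forwarding codepoints follow RFC 2597 (class in the upper three bits, drop precedence in the next two), and the DSCP-to-EXP function uses the common convention of keeping the top three DSCP bits. The QCI-to-DSCP dictionary is purely a made-up example, since that mapping is configured per operator.

```python
DSCP_DEFAULT = 0  # best effort
DSCP_EF = 46      # expedited forwarding


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Assured forwarding codepoint AFxy: class x in 1..4, drop precedence y in 1..3."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF class must be 1..4 and drop precedence 1..3")
    return (af_class << 3) | (drop_precedence << 1)


def tos_byte(dscp: int) -> int:
    """DSCP occupies the upper 6 bits of the IP TOS / Traffic Class byte."""
    return dscp << 2


def dscp_to_exp(dscp: int) -> int:
    """Common default mapping onto the 3-bit MPLS EXP field: keep the top 3 bits."""
    return dscp >> 3


# Hypothetical operator mapping, for illustration only.
EXAMPLE_QCI_TO_DSCP = {1: DSCP_EF, 2: af_dscp(4, 1), 9: DSCP_DEFAULT}

if __name__ == "__main__":
    for qci, dscp in EXAMPLE_QCI_TO_DSCP.items():
        print(qci, dscp, tos_byte(dscp), dscp_to_exp(dscp))
```

Going from 6 bits to 3 bits necessarily collapses several DSCP values onto one EXP value, which is one reason distinct marks can end up with identical queuing behaviour.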

One of the long-standing complexities of DiffServ has always been its behaviour in tunnels, and 3GPP is no different. In a 3GPP environment, the outer marking is only used by backhaul networks, and inner marks are ignored. DiffServ may be used to manage QoS on external networks and be mapped into 3GPP bearers. Examples of interchange between the two are proprietary per vendor, but include Cisco’s “ip user-datagram-tos copy” feature, which copies the DSCP field from the inner IP packet to the outer IP header of the GTP tunnel, and Cisco’s “active-charging service” feature, which maps un-tunneled packet DSCP fields into specific radio bearers by mapping them to a specific PDP context. Note that, despite the number of levels supported in signaling, individual equipment types vary in the number of queues supported, and in the queuing behaviour (strict vs. fair). As a consequence, many distinct ‘marks’ map to the same behaviour and it is important to understand the internal queuing support provided by each piece of network equipment along each possible path.

Key functions in 3GPP PCC standards

Before moving on, it is important to review the six main functions of the 3GPP PCC standards that manage services and QoS in modern networks. The following core functions are shown in Figure 9:

• SPR (Subscription Profile Repository)
• OCS (Online Charging System): optional - may be dealt with using Gy/CCR
• AF (Application Function): one per operator-provided service
• PCRF (Policy and Charging Rules Function)
• PCEF (Policy and Charging Enforcement Function)
• TDF (Traffic Detection Function): may be merged into PCEF

Figure 9: 3GPP PCC block diagram

AF

The AF, if involved, may provide the following application session related information (i.e., based on SIP[1] and SDP[2]):

• Subscriber Identifier - typically the MSISDN (Mobile Subscriber Integrated Services Digital Network-Number) of the user, if known by the AF
• IP address of the UE
• Media Type, Media Format (e.g., media format sub-field of the media announcement and all other parameter information (a= lines) associated with the media format)
• Bandwidth
• Sponsored data connectivity information (e.g., allowing the flow to be zero-rated towards the consumer, and the charge in aggregate to be dealt with in some other fashion)
• Flow description (i.e., source and destination IP address, port numbers and the protocol)
• AF Application Identifier
• AF Communication Service Identifier (e.g., IMS Communication Service Identifier), UE provided via AF
• AF Application Event Identifier
• AF Record Information
• Flow status (for gating decision)
• Priority indicator, which may be used by the PCRF to guarantee service for an application session of a higher relative priority
• Emergency indicator
• Application service provider (i.e., the Diameter realm or business name)

SPR

The SPR may provide the following information for a subscriber connecting to a specific packet gateway:

• Subscriber's allowed services (i.e., list of Service IDs)
• For each allowed service, a pre-emption priority
• Information on subscriber's allowed QoS, including:
    • the Subscribed Guaranteed Bandwidth QoS;
    • a list of QoS class identifiers together with the MBR limit and, for real-time QoS class identifiers, GBR limit.
• Subscriber's charging related information
• Spending limits profile containing an indication that policy decisions depend on policy counters available at the OCS that has a spending limit associated with it and optionally the list of relevant policy counters
• Subscriber category
• Subscriber's usage monitoring related information
• Subscriber's profile configuration
• Sponsored data connectivity profiles
• Multimedia Priority Service (MPS) Priority (user priority)
• IMS Signaling Priority

[1] Session Initiation Protocol, RFC 3261
[2] Session Description Protocol, RFC 4566

PCRF

A PCRF has two key functions in the 3GPP PCC standards: provisioning charging rules to the PCEF (performed on session initiation), and creating/destroying dedicated bearer PDP contexts (and thus radio bearers) in response to a request from an Application Function (AF). It is important to note that the use of a PCRF is optional; it is not a required element of a 3GPP network.

The original framers of the 3GPP PCC specifications anticipated the PCRF installing dynamic rules (5-tuple based) on a per-flow basis. In 3GPP R8 this was deprecated in favour of application detection and control (ADC) rules (giving much greater scale), and this was formalized in R11 in the May 2011 3GPP TSG SA WG2 meeting.

Figure 10: PCRF system architecture

PCEF

The PCEF is the main component of PCC, and its use is non-optional. An operator can (and commonly does) have pre-provisioned PCC rules in the PCEF (other basic rules are also provisioned in the HLR/HSS). The PCEF typically gets subscription information from the AAA or S6a interface towards the HSS & AAA. The PCEF performs the following primary functions:

• Gate enforcement. The PCEF allows a service data flow, which is subject to policy control, to pass through the PCEF if and only if the corresponding gate is open. This provides a means of blocking unknown or unenforced traffic (and may be used to block, for example, users with no credit).
• Charging Trigger Function. Through Diameter Credit Control, the PCEF feeds information to an Online Charging System in order to track usage.
• Charging Data Function. The PCEF produces the offline charging records required for typical post-paid services and charging reconciliation.
• QoS enforcement:
    • QoS class identifier correspondence with IP session-specific QoS attributes. The PCEF converts a QoS class identifier value to IP-session specific QoS attribute values (typically DSCP) and determines the QoS class identifier value from a set of IP-session specific QoS attribute values.
    • PCC rule QoS enforcement. The PCEF enforces the authorized QoS of a service data flow according to the active PCC rule (e.g., to enforce uplink DSCP marking).
    • IP-session bearer QoS enforcement. The PCEF controls the QoS that is provided to a combined set of service data flows. The policy enforcement function ensures that the resources which can be used by an authorized set of service data flows are within the "authorized resources" specified via the Gx interface by "authorized QoS". The authorized QoS provides an upper bound on the resources that can be reserved (GBR) or allocated (MBR) for the IP session bearer. The authorized QoS information is mapped by the PCEF to IP CAN specific QoS attributes. During IP-CAN bearer QoS enforcement, if packet filters are provided to the UE, the PCEF shall provide packet filters with the same content as that in the service data flow template filters received over the Gx interface.
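Gate enforcement and the MBR upper bound can be pictured together as a gated token bucket. The sketch below is one common way to police a bitrate and is offered only as an illustration; the 3GPP specifications do not prescribe this algorithm, and the class name, parameters and timestamps are assumptions.

```python
class FlowPolicer:
    """Gate plus token-bucket policer for one service data flow (illustrative)."""

    def __init__(self, mbr_bps: float, burst_bytes: float, gate_open: bool = True):
        self.mbr_bps = mbr_bps          # maximum bitrate, bits per second
        self.burst_bytes = burst_bytes  # bucket depth, bytes
        self.gate_open = gate_open
        self._tokens = burst_bytes      # start with a full bucket
        self._last_s = 0.0

    def allow(self, size_bytes: int, now_s: float) -> bool:
        """Return True if a packet of `size_bytes` may pass at time `now_s`."""
        if not self.gate_open:
            return False  # closed gate: the flow is blocked regardless of rate
        elapsed = max(0.0, now_s - self._last_s)
        self._last_s = now_s
        # Refill at MBR (converted to bytes/s), capped at the bucket depth.
        self._tokens = min(self.burst_bytes, self._tokens + elapsed * self.mbr_bps / 8.0)
        if size_bytes <= self._tokens:
            self._tokens -= size_bytes
            return True
        return False


if __name__ == "__main__":
    # Hypothetical flow limited to 8 kbit/s (1000 bytes/s) with a 1000-byte burst.
    policer = FlowPolicer(mbr_bps=8000, burst_bytes=1000)
    print(policer.allow(1000, 0.0), policer.allow(500, 0.125), policer.allow(500, 0.625))
```

Closing the gate, for example when a subscriber has no credit, blocks the flow without touching its rate parameters.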

TDF

Introduced in 3GPP release 11, the TDF component is still in the process of being standardized and has not yet been widely adopted. Its existence demonstrates industry recognition of the rise and predominance of over-the-top applications, which do not use SIP and a traditional AF. The TDF may be deployed in two different ways: it may signal on a per flow basis after detection towards the PCRF, or it may act to perform the gating/redirection/bandwidth limitation without informing the PCRF. The latter use case is more common as over-the-top applications often operate at much greater scale than the PCRF is capable of handling.

3GPP technical specifications 23.203 and 29.212 version 12 describe the relationship between the TDF, PCRF, PCEF, various Diameter interfaces and other related elements such as an OCS. The key aspect that determines compliance as a 3GPP release 11 or higher TDF is support for the newly introduced Diameter Sd reference point described in TS 29.212. Diameter Sd is used for communication between the TDF and PCRF using application detection and control (ADC) rules fully detailed in TS 29.212.[3] The PCEF uses PCC rules and the Diameter Gx reference point to communicate with the PCRF (in place since release 7 and also described in TS 29.212). Decisions about which applications to detect can be installed locally to a TDF and/or to what the specifications refer to as a “PCEF enhanced with ADC”; that is, a PCEF with an embedded TDF, which has embedded ADC rules.[4]

The TDF may also provide usage monitoring towards the PCRF (so that the PCRF can provide an additional form of metering when an OCS is not present or capable). 3GPP release 12 introduced charging support to the TDF, effectively duplicating charging functions also described for the PCEF element.[5] Indeed, the charging sections of TS 23.203 often describe the PCEF and TDF elements as one entity; for example, the credit management section of TS 23.203 is addressed at the “PCEF/TDF” element having or receiving “PCC/ADC” rules.[6] Both the TDF and PCEF elements must interpret monitoring keys from the PCRF and charging keys from the OCS.

For those cases where service data flow description cannot be provided by the TDF to the PCRF, the TDF performs gating, redirection, bandwidth limitation and charging for detected applications. For those cases where service data flow description is provided by the TDF to the PCRF, actions resulting from application detection may be performed by the PCEF as part of the charging and policy enforcement per service data flow and by the Bearer Binding and Event Reporting Function (BBERF) for bearer binding, or actions may be performed by the PCEF/TDF using Application Detection and Control (ADC) rules.

Annex Q of TS 23.203 provides the following view of the logical relationship between these elements and their interfaces when online charging and an OCS are also in play:

Figure 11: Usage Monitoring via Online Charging System (3GPP TS 23.203 v. 12.2.0, Annex Q). The diagram shows the PCEF/TDF, PCRF and OCS, connected by the Gx/Sd, Gy/Gyn and Sy reference points.

[3] 3GPP TS 29.212 V12.2.0 (2013-09), section 4b, http://www.3gpp.org/ftp/Specs/html-info/29212.htm
[4] 3GPP TS 23.203 V12.2.0 (2013-09), section 4.5, http://www.3gpp.org/ftp/Specs/html-info/23203.htm
[5] Ibid.
[6] Ibid., section 6.1.3.

It is anticipated that the TDF will perform other triggers (e.g., location changes, congestion detection) so that the PCRF can become aware of the network. As of release 12, the standards remain vague regarding how topology/location information will be provided from the PCRF to the TDF. Some vendors offer location awareness using proprietary features. Theoretically, the standards provide an optional mechanism for every location change to be propagated from the RAN all the way to the PCRF. However, for practical reasons this has not been implemented, and therefore the TDF as purely described in the latest standards will face a similar problem.[7]

OCS

The OCS is out of the scope of PCC, but does have a recent (and not widely adopted) interface towards the PCRF for purposes of coordinating policy with credit. A common approach is to perform this function via the PCEF directly (as it too communicates with the OCS via Diameter Gy). The OCS, if involved, may provide policy counter status for each relevant policy counter towards the PCRF over the newly emerging Sy interface. In addition, the PCEF has a connection via Gy towards the OCS, and can make its own evaluation of rules based on credit responses.

3GPP PCC Theory of Operation

Figure 12 shows the overall PCC system and its interconnections to non-PCC components.

Figure 12: 3GPP block diagram, expanded, with core PCC components shaded

[7] For a complete overview of Sandvine’s approach to the 3GPP release 12 and TDF standards, see Technology Showcase – Traffic Detection Function.

The general signaling flow through the 3GPP PCC architecture is initiated by either the user (session initiation) or via the AF/TDF. The following five major signaling flows are important to describe:

1. Subscriber initiates bearer (creates PDP context): When the subscriber registers their device to the network, after authentication by the S-GW and the HSS, a default bearer is created on the P-GW. The PCEF initiates a message with Gx to load the rule-set for the user (which is ultimately stored in the SPR).

2. Application-function initiated change: When activated, the AF signals the PCRF via Rx to indicate a new service flow (matched using IP header bits), selecting the QoS and charging parameters. The PCRF provisions this rule into the PCEF with the appropriate TFT & QCI, which commences the QoS and charging as specified by the AF.

3. TDF-initiated change: Conceptually this is identical to the Application-function initiated change, except that it is based on detecting an application, rather than the user initiating the application.

4. Network-initiated change (RAT change, loss of bearer, QoS change, etc.): Based on a rule on the PCEF, a trigger can be sent towards the PCRF (using Diameter Gx). Examples include quota exceeded, start of use of an application, entrance to a specific location, etc.

5. PCRF-initiated change: The PCRF is free to run internal logic on conditions it is aware of, and change the provisioning of rules on the PCEF using a Gx RAR.

TFT

In 3GPP, a TFT (Traffic Flow Template) is a classifier that matches on fields on the inner-IP of a GTP-U tunnel. This in turn causes differentiated radio-bearer performance. It can match the following fields:

• Source address (with subnet mask)
• IP protocol number (TCP, UDP)
• Destination port range
• Source port range
• IPSec Security Parameter Index (SPI)
• Type of Service (TOS) (IPv4)
• Flow-Label (IPv6 only)

Whether using the static-TFT model or the dynamic Gx-signaled TFT model, the same sequence occurs: a dedicated bearer (secondary PDP context) is created, and traffic is forced to match it. Operators can either create PDP contexts dynamically using Gx, in which case traffic matching the TFT is filtered into the context based on rules in the PCEF, or have the packet gateway itself create PDP contexts dynamically when traffic matches pre-provisioned values. The latter can be useful if an upstream device on SGi marks packets matching certain conditions with DSCP.

Figure 13: TFT mapping to PDP context on 3G (dedicated bearer analogous to secondary)
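The way a TFT steers packets into a dedicated bearer can be sketched as an ordered list of packet filters: the first filter that matches selects the bearer, and anything unmatched stays on the default bearer. The class names, field names, bearer numbers and addresses below are hypothetical, and only a subset of the matchable fields listed above is modelled.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Packet:
    src_ip: str
    protocol: int  # IP protocol number: 6 = TCP, 17 = UDP
    dst_port: int
    tos: int = 0


@dataclass(frozen=True)
class PacketFilter:
    bearer_id: int
    precedence: int                             # lower value is evaluated first
    source_net: Optional[str] = None            # address with prefix length
    protocol: Optional[int] = None
    dst_ports: Optional[Tuple[int, int]] = None  # inclusive range
    tos: Optional[int] = None

    def matches(self, pkt: Packet) -> bool:
        """A field left as None acts as a wildcard."""
        if self.source_net is not None and (
            ipaddress.ip_address(pkt.src_ip) not in ipaddress.ip_network(self.source_net)
        ):
            return False
        if self.protocol is not None and pkt.protocol != self.protocol:
            return False
        if self.dst_ports is not None and not (
            self.dst_ports[0] <= pkt.dst_port <= self.dst_ports[1]
        ):
            return False
        if self.tos is not None and pkt.tos != self.tos:
            return False
        return True


def select_bearer(filters: Iterable[PacketFilter], pkt: Packet, default_bearer_id: int) -> int:
    """Return the bearer of the first matching filter, else the default bearer."""
    for flt in sorted(filters, key=lambda f: f.precedence):
        if flt.matches(pkt):
            return flt.bearer_id
    return default_bearer_id


if __name__ == "__main__":
    tft = [
        PacketFilter(bearer_id=6, precedence=10, protocol=17, dst_ports=(5060, 5061)),
        PacketFilter(bearer_id=7, precedence=20, source_net="203.0.113.0/24"),
    ]
    print(select_bearer(tft, Packet("203.0.113.9", 17, 5060), default_bearer_id=5))
```

A filter that matches on the TOS field alone corresponds to the case where an upstream device on SGi has already classified the traffic and marked it with DSCP.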

The LTE version of the standards allows up to nine TFTs to be used per bearer. In prior revisions, there is only one TFT allowed, which is important to note if QoS handoff between HSPA and LTE, or LTE and CDMA is needed.

The TFT selects which PDP context is used, and the QCI label is a short-hand for the QoS parameters within the context. Note the QCI is a short-hand label only. The standardized QCI characteristics are given as rough guidelines in Table 6.1.7 of 3GPP TS 23.203 (v11.3.0), reproduced here:

Table 1: Table 6.1.7 from 3GPP TS 23.203 V11.3.0 (GBR: guaranteed bitrate)

QCI          Resource Type   Priority   Packet Delay Budget   Packet Error Loss Rate   Example Service
1 (Note 3)   GBR             2          100 ms                10^-2                    Conversational Voice
2 (Note 3)   GBR             4          150 ms                10^-3                    Conversational Video (live)
3 (Note 3)   GBR             3          50 ms                 10^-3                    Real time gaming
4 (Note 3)   GBR             5          300 ms                10^-6                    Non-Conversational Video (buffered)
5 (Note 3)   Non-GBR         1          100 ms                10^-6                    IMS Signaling
6 (Note 4)   Non-GBR         6          300 ms                10^-6                    Video (buffered streaming), TCP
7 (Note 3)   Non-GBR         7          100 ms                10^-3                    Voice, Video (Live), Interactive Gaming
8 (Note 5)   Non-GBR         8          300 ms                10^-6                    Video (buffered streaming), TCP
9 (Note 6)   Non-GBR         9          300 ms                10^-6                    Video (buffered streaming), TCP

NOTE 1: A delay of 20 ms for the delay between a PCEF and a radio base station should be subtracted from a given PDB to derive the packet delay budget that applies to the radio interface. This delay is the average between the case where the PCEF is located "close" to the radio base station (roughly 10 ms) and the case where the PCEF is located "far" from the radio base station, e.g. in case of roaming with home routed traffic (the one-way packet delay between Europe and the US west coast is roughly 50 ms). The average takes into account that roaming is a less typical scenario. It is expected that subtracting this average delay of 20 ms from a given PDB will lead to desired end-to-end performance in most typical cases. Also, note that the PDB defines an upper bound. Actual packet delays - in particular for GBR traffic - should typically be lower than the PDB specified for a QCI as long as the UE has sufficient radio channel quality.

NOTE 2: The rate of non-congestion related packet losses that may occur between a radio base station and a PCEF should be regarded to be negligible. A PELR value specified for a standardized QCI therefore applies completely to the radio interface between a UE and radio base station.

NOTE 3: This QCI is typically associated with an operator controlled service, i.e., a service where the SDF aggregate's uplink / downlink packet filters are known at the point in time when the SDF aggregate is authorized. In case of E-UTRAN this is the point in time when a corresponding dedicated EPS bearer is established / modified.

NOTE 4: If the network supports Multimedia Priority Services (MPS) then this QCI could be used for the prioritization of non-real-time data (i.e. most typically TCP-based services/applications) of MPS subscribers.

NOTE 5: This QCI could be used for a dedicated "premium bearer" (e.g. associated with premium content) for any subscriber / subscriber group. Also in this case, the SDF aggregate's uplink / downlink packet filters are known at the point in time when the SDF aggregate is authorized. Alternatively, this QCI could be used for the default bearer of a UE/PDN for "premium subscribers".

NOTE 6: This QCI is typically used for the default bearer of a UE/PDN for non-privileged subscribers. Note that AMBR can be used as a "tool" to provide subscriber differentiation between subscriber groups connected to the same PDN with the same QCI on the default bearer.

PCC rule parameters

• QCI – QoS Class Identifier
• ARP – Allocation/Retention Priority: information about the priority level, pre-emption capability, and pre-emption vulnerability. ARP priority is 1…15, where 1 is the highest priority. 1-8 are used within the operator domain, 9-15 are used when roaming.
• GBR – guaranteed bitrate
• MBR – maximum bitrate
• SDF – service data flow

Packets matching the rule (the TFT) will be routed into a bearer that matches the settings (via the QCI) of ARP, GBR, and MBR.
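To make the matching step concrete, the following is a minimal sketch of how a packet might be compared against a set of rules and mapped to bearer settings. The class names, the `precedence` field, the filter fields, and the example addresses, ports, ARP values, and bitrates are all assumptions made for illustration; a real TFT packet filter supports more match criteria than are shown here.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional, Tuple


@dataclass(frozen=True)
class PacketFilter:
    """One simplified TFT packet filter (hypothetical); None means wildcard."""
    remote_net: str = "0.0.0.0/0"
    protocol: Optional[int] = None      # IP protocol number (6 = TCP, 17 = UDP)
    remote_port: Optional[int] = None

    def matches(self, remote_ip: str, protocol: int, remote_port: int) -> bool:
        return (ip_address(remote_ip) in ip_network(self.remote_net)
                and self.protocol in (None, protocol)
                and self.remote_port in (None, remote_port))


@dataclass(frozen=True)
class PccRule:
    """Simplified PCC rule: a TFT plus the QoS settings of the target bearer."""
    name: str
    precedence: int                     # lower values are evaluated first
    tft: Tuple[PacketFilter, ...]
    qci: int
    arp_priority: int                   # 1 (highest) to 15 (lowest)
    gbr_kbps: Optional[int] = None
    mbr_kbps: Optional[int] = None


def select_rule(rules, remote_ip, protocol, remote_port):
    """Return the first rule, in precedence order, whose TFT matches the packet."""
    for rule in sorted(rules, key=lambda r: r.precedence):
        if any(f.matches(remote_ip, protocol, remote_port) for f in rule.tft):
            return rule
    return None


# Illustrative values only: a voice rule on a dedicated bearer and a
# fully wildcarded default rule.
rules = [
    PccRule("voice", 10, (PacketFilter("10.0.0.0/24", 17, 5004),),
            qci=1, arp_priority=2, gbr_kbps=64, mbr_kbps=64),
    PccRule("default", 255, (PacketFilter(),), qci=9, arp_priority=9),
]
```

With these example rules, a UDP packet to 10.0.0.5 port 5004 selects the voice rule, while any other traffic falls through to the wildcard default rule.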

Key operator deployment & architecture decisions

The following are key questions a network operator needs to answer:

1. Will dynamic PCC (flow-based) rules be used? This dramatically impacts the scale of the PCRF deployment, the use of Diameter routing agents, and the signaling load on the evolved packet core.

2. Will Application Detection and Control (ADC) rules be used? The richness of capabilities of the PCEF will be the gating factor for services.

3. Will static PCC rules be used? An upstream marking device with application awareness may be needed.

4. Is QoS in the radio sufficient? If not, all rules need to apply both to the eNodeB (via the TEID and PDP context) and to other network technologies using their own proprietary methods (e.g., RSVP-TE, DSCP, MPLS EXP, …).

What is the “End” in End-to-End?

Internet architectures are generally built around per-hop behaviour, whereas traditional telecommunications voice infrastructures were built around circuit switching. As a consequence of these design choices there are trade-offs as we migrate to LTE. The most important set of trade-offs comes in defining the ‘ends’ in end-to-end.

Key questions include:

• Is upstream QoS important (user equipment towards Internet)?
• Are carrier-provided applications included?
• Are over-the-top applications included?
• Is a guarantee required, or is increased probability of quality sufficient?
• Does the QoS have to work in in-network hand-off scenarios between LTE and earlier technologies?
• Does the QoS have to work in off-network hand-off (roaming) scenarios between LTE and LTE technologies?
• Does the QoS have to work in off-network hand-off (roaming) scenarios between LTE and non-LTE technologies?
• If the quality cannot be guaranteed should the application be disallowed (connection admission control)?
• If a session is started in a region with sufficient capacity, but the user moves to one without, is the session terminated?
• Is it sufficient to perform the QoS only in the most-congested part of the network and assume the remainder is sufficiently non-oversubscribed to not matter?
• Is QoS being used as an ‘improvement’ or a ‘degradement’?
• Is mobile-to-mobile QoS needed (e.g., push-to-talk over cellular)?
• Is there an ability to control demand on some classes of application (e.g., video optimization, traffic management) to create additional capacity?
• Will local breakout be used in the home network? In the visited network? If so, packets may flow over a different path.

Answers to the above questions help narrow the focus onto which layers of the network are affected, and how, and thus which techniques are needed. In Figure 14, we can see a stereotypical LTE network. At each level there are aggregation routers, bringing together multiple sources.

Figure 14: Network layers

Usually there is multi-path routing (e.g., OSPF ECMP) and as a consequence of packet switching several problems may arise (see Figure 15):

1. The upstream and downstream paths may go over different links, or in the case of the over-the-top applications, entirely different transit service providers.

2. The paths that packets flow over may be unstable.

3. Latencies may be different in each direction.

4. Oversubscription (and thus latency, loss, jitter) can be dramatically different in each direction.

5. Networking vendors and even technologies can be different in each direction.

Figure 15: Typical packet path

In the case above, if ‘end-to-end’ is defined as subscriber-to-subscriber content, then QoS means some network engineering or active signaling across multiple operators and technologies. It would be highly wasteful to provide guarantees on all links for a possible service flow. In the example shown by Figure 15, if we assume the service is 1Mbps of peak bandwidth, the naïve model would be to create a 1Mbps ‘constant bitrate’ guarantee on links 1…25. But if there were some way of knowing the packet flow, and the packet flow never changed, we would only need that 1Mbps guarantee on links 1…6 in the downstream. The naïve approach therefore reserves roughly four times the capacity that is actually needed (25 links rather than 6). A more typical approach is to guarantee the 1Mbps solely on link 6 (or sometimes links 5…7), and engineer the ‘core’ to have statistical guarantees only.
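The arithmetic behind this comparison can be written out as a short worked example. The link counts come from the Figure 15 example in the text; the function and variable names are purely illustrative.

```python
SERVICE_MBPS = 1.0    # peak bandwidth of the example service

TOTAL_LINKS = 25      # naive model: guarantee on links 1..25
PATH_LINKS = 6        # known, stable path: guarantee on links 1..6
ACCESS_LINKS = 1      # typical approach: guarantee on link 6 only


def reserved_mbps(link_count: int, rate_mbps: float = SERVICE_MBPS) -> float:
    """Total capacity reserved when rate_mbps is guaranteed on every link."""
    return link_count * rate_mbps


naive = reserved_mbps(TOTAL_LINKS)
path_only = reserved_mbps(PATH_LINKS)
access_only = reserved_mbps(ACCESS_LINKS)

print(f"naive: {naive} Mbps, known path: {path_only} Mbps, "
      f"access only: {access_only} Mbps")
print(f"naive / known path = {naive / path_only:.2f}x")
```

The ratio of 25 to 6 is a little over four, which is the order of magnitude the text refers to.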

Another downside to using the ‘naïve’ approach is that signaling is required to a large number of routers (all the routers in transit-A, transit-B, all the core routers, all the access routers). These routers almost certainly have different capabilities and interfaces (it is likely the ‘core’ routers support signaling via BGP solely, the edge routers may support RADIUS CoA or COPS, and the routers in the transit would be under different administrative control, etc.).

Note that the 3GPP PCC standards are written to assume there is ‘negligible’ loss between the PCEF and the radio base station (see Note 2 of Table 1). In practice this is an aggressive assumption for LTE since the backhaul (S1-U) network can be congested.

LTE QoS Use Cases: Fairshare Traffic Management

Sandvine has always worked to ensure its QoS-handling capabilities function well for all network types and between network types, while easily adapting to ongoing architectural evolutions. A good example is the Traffic Management product, which includes an advanced feature-set called Fairshare.

Sandvine has widely deployed Fairshare Traffic Management in cable environments using PacketCable MultiMedia (PCMM) prioritization. This traffic management mechanism is described in RFC 6057 (footnote 8). PCMM is a direct analog of 3GPP PCC, being based on it (and with an explicit goal to harmonise together in Common-IMS, which brings together ETSI TISPAN, ETSI 3GPP, 3GPP2, and CableLabs). The general theory of operation of Fairshare Traffic Management is to do the following:

1. Identify links experiencing congestion

2. Identify the users on those links likely to cause disproportionate congestion in the next time interval

3. Reduce the scheduling priority of those users until either

a. Congestion disappears (with some hold-down time or hysteresis to prevent oscillation) or

b. The user is no longer causing disproportionate congestion
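The steps above can be sketched as a small per-link state machine. This is a minimal illustration and not Sandvine's implementation: the two utilization thresholds (providing the hysteresis), the "more than twice the mean usage" test for a disproportionate user, and all names are assumptions. The sketch also uses the last interval's usage as a stand-in for predicting the next interval.

```python
from dataclasses import dataclass, field

CONGESTION_ON = 0.90    # hypothetical: enter congestion at 90% utilization
CONGESTION_OFF = 0.75   # hypothetical: leave congestion only below 75%
HEAVY_FACTOR = 2.0      # hypothetical: "heavy" means > 2x the mean user usage


@dataclass
class LinkState:
    congested: bool = False
    deprioritized: set = field(default_factory=set)


def update_link(state: LinkState, utilization: float, usage_by_user: dict) -> set:
    """Run one measurement interval for a link; return users to deprioritize."""
    # Step 1: detect congestion, using two thresholds to avoid oscillation.
    if state.congested:
        state.congested = utilization > CONGESTION_OFF
    else:
        state.congested = utilization >= CONGESTION_ON

    # Step 3a: once congestion has cleared, restore everyone.
    if not state.congested:
        state.deprioritized = set()
        return set()

    # Step 2: find users consuming a disproportionate share of the link.
    mean_usage = sum(usage_by_user.values()) / max(len(usage_by_user), 1)
    heavy = {user for user, used in usage_by_user.items()
             if used > HEAVY_FACTOR * mean_usage}

    # Step 3 and 3b: deprioritize current heavy users and release the rest.
    state.deprioritized = heavy
    return set(heavy)
```

Calling `update_link` once per interval with the measured utilization and per-user byte counts yields the set of users whose scheduling priority should currently be reduced.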

The net effect is to shift congestion (and thus latency and loss) more towards the short-term heavy users. In DOCSIS cable networks this is achieved using the DOCSIS priority field (the equivalent of the ARP field in 3GPP). DOCSIS allows 8 levels of priority; the default service flow is given priority 1, and the ‘heavy users on congested links’ are overridden with a dynamic, fully-wildcard rule that gives them priority 0. As a consequence, the DOCSIS scheduler prefers the priority 1 users over the priority 0 users, and the congestion shifts. If we bring this use case into 3GPP we run into the problem that the user equipment does not support being signaled in the same fashion as DOCSIS, and thus the upstream cannot be prioritized (footnote 9). However, we can perform prioritization for congestion management in the downstream.

8 http://tools.ietf.org/html/rfc6057

The second problem is recognizing which link (which mobile sector) the user is on. In DOCSIS this is signaled with DHCP/SNMP/IPDR protocols. In 3GPP this is only signaled on bearer creation (enabling the user equipment) and possibly on interim updates (e.g., User Location Update). In versions prior to LTE, the fix requires deploying complex probes in the IuB, IuPS, IuCS links. In LTE, it is possible to deploy the Sandvine Policy Traffic Switch (PTS) in the S1-U and thus become eNodeB-aware in a very simple fashion (the outer-IP of the GTP tunnel is the eNodeB). Thus the following three mechanisms exist for per-sector prioritization in LTE:

1. Signal to a PCRF to signal to the P-GW to create a dedicated bearer with a wildcard service flow (TFT).

2. Modify the tunnel-ID (TEID) to match one that is statically created on the P-GW that has the requisite QoS parameters.

3. Deploy a marking mechanism on the SGi and have it hit a statically-provisioned, dynamic PCC rule using, for example, DSCP marking.

PCRF Signaling

In this model, shown by Figure 16, the PTS performs reporting and correlation based on the outer-IP of the GTP-U tunnel. The Fairshare Traffic Management policy measures top users on busy sectors and creates signaling via Rx (acting as an application function). The Rx message contains the following:

• MSISDN
• Subscriber IP
• Priority indicator (optionally bandwidth)
• Flow-identifier (wildcarded all-source IP, all ports, overriding the default TFT)

Figure 16: Radio prioritisation via PCRF

Matching traffic will be de-prioritised in the radio by the eNodeB scheduler.

In-Band Marking (TEID modification)

In this model, as shown by Figure 17, the Fairshare Traffic Management policy measures top users on busy sectors, and modifies the TEID of their traffic to match a pre-defined bearer that was statically created on the P-GW. In addition, DSCP marking or MPLS EXP marking can be performed on the outer tunnel to cause QoS prioritization in the radio backhaul itself.

9 In LTE, the UE can support multiple primary contexts, but this would mean it would have to understand in some proprietary fashion how to route traffic from one to the other. A primary context has a separate IP address. The intent is to standardise and use this for Voice over LTE, in which the UE knows how to select the right dedicated bearer. This may or may not work for generic over the top applications.

Figure 17: Radio prioritisation via TEID modification

Matching traffic will be de-prioritised in the eNodeB radio scheduler.
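As an illustration of what TEID modification involves at the byte level, the sketch below rewrites the TEID field of a GTPv1-U G-PDU. The mandatory GTPv1-U header starts with a flags byte (version in the top three bits), a message type byte (0xFF for a G-PDU), a 16-bit length, and the 32-bit TEID. The function names and the `teid_map` lookup table are hypothetical, and the sketch operates on the UDP payload only; it is not a description of how the PTS performs the rewrite.

```python
import struct

GTPU_VERSION = 1
GTPU_GPDU = 0xFF          # message type for encapsulated user data
GTPU_MIN_HEADER = 8       # flags, type, length, TEID


def parse_gtpu(payload: bytes):
    """Return (version, message_type, length, teid) from a GTP-U header."""
    flags, msg_type, length, teid = struct.unpack("!BBHI", payload[:GTPU_MIN_HEADER])
    return flags >> 5, msg_type, length, teid


def rewrite_teid(payload: bytes, teid_map: dict) -> bytes:
    """Swap the TEID of a G-PDU if it appears in teid_map (hypothetical table
    mapping a subscriber's current TEID to the pre-defined low-priority one).

    The caller is assumed to handle the outer UDP checksum, which covers
    the GTP-U header when it is non-zero.
    """
    if len(payload) < GTPU_MIN_HEADER:
        return payload
    version, msg_type, _length, teid = parse_gtpu(payload)
    if version != GTPU_VERSION or msg_type != GTPU_GPDU or teid not in teid_map:
        return payload
    return payload[:4] + struct.pack("!I", teid_map[teid]) + payload[GTPU_MIN_HEADER:]
```

Only the four TEID bytes change; the encapsulated subscriber packet is passed through untouched.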

In-Band Marking SGi (DSCP Modification)

In this model, there is either a signaling mechanism in place from the S1-MME/S11 interface (to teach the PTS about the user-to-sector mapping), or a set of PTS elements is deployed in the S1-U to do the measurement per sector. A PTS on SGi marks traffic with DSCP, and a static rule on the P-GW causes these packets to be mapped into a dynamically created dedicated bearer (for example, using Cisco’s NQoS feature).

Figure 18: Radio prioritization via SGi
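The marking step itself can be sketched as follows for IPv4: the DSCP occupies the upper six bits of the second header byte, the lower two bits carry ECN and are preserved, and the header checksum must be recomputed after the change. The function names are illustrative, and the DSCP value that the static P-GW rule matches on is an operator choice that the source does not specify.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, complemented."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def set_dscp(packet: bytes, dscp: int) -> bytes:
    """Return a copy of an IPv4 packet with its DSCP replaced."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    header_len = (packet[0] & 0x0F) * 4          # IHL is in 32-bit words
    header = bytearray(packet[:header_len])
    header[1] = (dscp << 2) | (header[1] & 0x03)  # keep the two ECN bits
    header[10:12] = b"\x00\x00"                   # zero checksum before summing
    header[10:12] = struct.pack("!H", ipv4_checksum(bytes(header)))
    return bytes(header) + packet[header_len:]
```

For example, `set_dscp(packet, 10)` marks a packet as AF11; a deployment would use whichever code point the P-GW rule has been provisioned to match.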

Comparison of Techniques

The three techniques (PCRF dynamic PCC rule, in-band TEID marking, in-band DSCP marking) have different strengths and weaknesses. None of the techniques reliably handles upstream prioritization due to limitations of current user equipment. As a consequence, the upstream is best handled via a capacity-control agent such as the Sandvine PTS (as a policer) or using the 3GPP PCC ‘GBR’. Providing a maximum rate rather than prioritization is not as efficient or effective, but will serve some purpose.

The PCRF with dynamic PCC rules model is the most complex, requiring the most moving parts and the highest signaling rate. The SGi in-band marking, other than requiring a proprietary configuration per P-GW, will be the most reliable and simplest to manage. All three techniques will be equally effective at over-the-air radio prioritization, and this effectiveness will be a function of the eNodeB scheduler solely.

S1-U prioritization of both the inner and outer tunnels offers the best overall performance as it handles both backhaul and radio congestion. Sandvine’s support for this use case is uniquely enabled by the SandScript policy language and the freeform policy creation environment it provides.

Automated QoS Control for Mobile Network Congestion Management

Sandvine’s Fairshare Traffic Management uses an advanced feature called the QualityGuard congestion response system. QualityGuard continuously measures subscriber QoE in real time to detect congestion in the access network, and then automatically manages QoS to remove congestion by shaping traffic classified as “low-value” (heavy short-term users who are contributing to congestion, or non-real-time applications such as email and bulk downloads, or a combination of both).

Earlier it was noted that as throughput increases on a node or router, latency increases due to the growing queue delay and the ‘bursty’ nature of TCP. The increase is rather marginal, but proportional to the increase in bandwidth. As the throughput approaches capacity, latency begins to increase exponentially until it reaches a final tipping point where the element experiences congestive collapse. When an access node is near or at capacity, subscribers experience the greatest deterioration of QoE.

Figure 19 shows the relationship between throughput and latency on the road to the congestive collapse of an access resource.

Figure 19 - Relationship between Throughput and Latency

QualityGuard uses access round trip time (aRTT) to measure real-time subscriber QoE, and this is used as the input for a closed-loop control system that continuously and automatically works to maintain the optimum shaped traffic output during times of congestion. This is the maximum throughput, or target goodput, that the access resource can maintain while still providing a good QoE to the 95-99% of subscribers that fall into the high-value traffic category. From a technical standpoint, QualityGuard’s goal is to maintain the optimal goodput for the access resource, which in mobile networks is a constantly moving target due to the variable nature of a cell’s maximum capacity.
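The source does not describe QualityGuard's control law, so the following is only a generic sketch of a closed loop of this kind, assuming a simple additive-increase, multiplicative-decrease rule. The function name, the aRTT target, the step size, the backoff factor, and the rate limits are all hypothetical.

```python
def next_shaper_rate(rate_mbps: float,
                     artt_ms: float,
                     target_ms: float = 100.0,
                     step_mbps: float = 1.0,
                     backoff: float = 0.5,
                     floor_mbps: float = 1.0,
                     ceiling_mbps: float = 100.0) -> float:
    """Compute the next shaping rate for low-value traffic on one resource.

    If the measured access RTT exceeds the target, the resource is treated
    as congested and the shaped rate is cut multiplicatively. Otherwise the
    rate is raised by a fixed step, probing for capacity that may have
    returned (for example, because cell conditions improved).
    """
    if artt_ms > target_ms:
        rate = rate_mbps * backoff
    else:
        rate = rate_mbps + step_mbps
    return max(floor_mbps, min(ceiling_mbps, rate))
```

Because the loop reacts to measured latency rather than to a fixed capacity figure, it follows a moving target, which is the property the text attributes to mobile cells with variable maximum capacity.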

Figure 20 – Maximizing subscriber QoE and infrastructure lifetime

Using reports generated by a set of congestion-related business intelligence called QualityWatch, and driven by Sandvine’s standard reporting interface, Network Demographics, the following three graphical reports demonstrate the positive effect of QualityGuard.

Figure 21 shows the net effect of QualityGuard on Layer-7 OTT bandwidth for a resource experiencing massive congestion problems. When web browsing traffic begins to increase and real-time subscriber QoE falls below a configured benchmark, QualityGuard shapes the bulk transfer traffic of subscribers currently contributing to the congestion condition while creating capacity for the other 95-99% of users also attempting to use the resource.

Figure 21 – Trial results – verifying the desired effect of QualityGuard on bandwidth

Looking at the same results from a different perspective, Figure 22 shows QualityGuard’s effect on latency in the form of aRTT measurements, and Figure 23 shows the effect on the calculated quality score.

Figure 22 – Trial results – verifying the desired effect of QualityGuard on high-value latency

Figure 23 – Trial results – high-value latency expressed as a quality score

Conclusions and Recommendations

1. An operator should define end-to-end to include the radio scheduler (eNodeB) and the backhaul, and overprovision the remainder of the network to provide probabilistic guarantees only.

2. It’s important to understand the limitations of both the backhaul and chosen eNodeB equipment, in particular the number of queues supported, and whether they are strict-priority (starving lower priority flows) or weighted-fair.

3. Avoid the use of guaranteed bitrate classes (media services are rarely constant bitrate, voice uses silence suppression, audio is adaptive bitrate, video is highly variable based on source content).

4. Concentrate on a single network technology first: hand-off conditions between HSPA & LTE or CDMA and LTE will lead to severe limitations in both the number and richness of service flows.

5. Focus on per-hop behaviour and marking rather than ‘circuit’-based techniques. In particular, avoid the use of signaling service flows for over-the-top services due to their short lifetime and high speed.

Copyright ©2013 Sandvine Incorporated ULC. Sandvine and the Sandvine logo are registered trademarks of Sandvine Incorporated ULC. All rights reserved.

European Offices Sandvine Limited Basingstoke, UK Phone: +44 0 1256 698021

Headquarters Sandvine Incorporated ULC Waterloo, Ontario Canada Phone: +1 519 880 2600

2013-11-22