
Module10 Troubleshooting Training

Apr 02, 2018


Tal Deri
  • 7/27/2019 Module10 Troubleshooting Training

    1/121

    Infrastructure for All-IP Broadband Mobile Wireless
    Accelerating Access Anywhere

    Module 10: Troubleshooting, Diagnostics Tools & Logging

    Jay Weitzen

    Airvana Performance Engineering


    Page 3Confidential & Proprietary

    References:

    Troubleshooting guide

    All your training notes


    Common 1xEV-DO Issues

    Cannot set up a DO call

    User authentication fails

    DO Connections are dropping

    Handoff doesn't work

    Poor user Internet browsing experience

    Poor application throughput

    Review of Call Processing Flow

    [Figure: message sequence over time between AT, DOM, RNC, PDSN, and Server. Stages, in order: 1xEV-DO Network Acquisition; 1xEV-DO Session Registration; 1xEV-DO Connection Setup, Configuration Negotiation, Connection Teardown; PDSN A11 Registration; 1xEV-DO Connection Setup; PPP Setup; User Data Flow.]


    Work to Localize the Problem

    If You Think the Problem Is Localized to the Laptop or AT

    DO-RNC Configuration and Operation Troubleshooting Process


    Debugging Common 1xEV-DO Problems

    Cannot set up a DO call

    User authentication fails

    DO Connections are dropping

    Handoff doesn't work

    Poor user Internet browsing experience

    Poor application throughput

    BSNE Configuration and Operation Troubleshooting Process

    Cannot Set Up 1xEV-DO Call - CLI

    Are the modules up and running? YES

    Nortel-03> show module
    Slot  Present  Power  Contains    Status  SW Version  Up Time
    03    YES      YES    sc1/3/1     Active  2.10.0.51   002d 15h 49m 57s
    04    YES      YES    modem1/4/1  Active  2.10.0.50   002d 15h 21m 54s
    04    YES      YES    modem1/4/2  Active  2.10.0.50   002d 15h 21m 44s

    Is DOM node up?

    NORTEL-03# show node
    Administrative Status : UP
    Operational Status    : DOWN for 0 days 17h:17m:48s   < Issue Indicator# 1 >
    EMS Status            : COUPLED
    [Current State]       : idle

    Are sector-elements up?

    Nortel-03# show sector-element   < Issue Indicator# 2 >
    Name      Carrier   Sector   CAI     Channel  PN   Power   Admin  Oper
    -------------------------------------------------------------------------------
    element1  carrier1  sector1  IS-856  350      104  37 dBm  UP     DOWN
    element2  carrier1  sector2  IS-856  350      208  35 dBm  UP     DOWN
    element3  carrier1  sector3  IS-856  350      312  35 dBm  UP     DOWN

    Cannot Set Up 1xEV-DO Call - CLI (Cont'd)

    Is Abis Peer up?

    NORTEL-03# show abis peer
    ----------------------------------------------------------------------------------------
    Peer IP  | Peer   | Peer UpTime |      Hellos       |     Messages
    Address  | Status | (in sec)    |  Sent  | Received |  Sent  | Received
    ----------------------------------------------------------------------------------------

    Is there IP connectivity to the RNC?   < Issue Indicator# 4 >

    NORTEL-03# ping 10.12.0.241   < RNC Node Address >
    Sending 5, 100-byte ICMP Echos to 10.12.0.241, timeout is 2 seconds:
    . . . . .
    Success rate is 0 percent (0/5), round-trip min/avg/max = 0/0/0 ms

    Check the IP routing table for a route to the RNC   < Issue Indicator# 5 >
    Missing route for the RNC's Node Address

    NORTEL-03# show ip route
    IP ROUTING TABLE
    ----------------------------------------------------------------------------
    Destination     Flags  Gateway  Owner  Interface
    ----------------------------------------------------------------------------
    1.1.1.0/24      L      1.1.1.1  L      ppp1/0/1
    2.2.2.0/24      L      2.2.2.2  L      ppp1/0/2
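    The routing check on this slide can also be reproduced offline. The Python sketch below is illustrative only and is not part of the CLI: the function name has_route and the idea of copying the prefixes by hand from the show ip route output are assumptions. It uses the standard ipaddress module to test whether any listed prefix covers the RNC Node Address.

```python
import ipaddress


def has_route(destination, prefixes):
    """Return True if at least one prefix in the table covers the destination."""
    address = ipaddress.ip_address(destination)
    return any(address in ipaddress.ip_network(prefix) for prefix in prefixes)


# Destinations copied by hand from the "show ip route" output on this slide.
routes_on_dom = ["1.1.1.0/24", "2.2.2.0/24"]

# RNC Node Address used in the ping test.
rnc_node_address = "10.12.0.241"

if has_route(rnc_node_address, routes_on_dom):
    print("route to RNC present")
else:
    print("no route to RNC: check the static routes on the DOM")
```

    With a /32 entry for the RNC Node Address added to the list, the same check returns True.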

    Cannot Set Up 1xEV-DO Call - CLI (Cont'd)

    Check the GPS timing source   < GOOD >

    NORTEL-03> show gps health
    GPS module is present
    GPS Lock Status: Locked
    Even Second: Valid for 001d 18h 26m 42s

    Check the SNTP time from GPS   < GOOD >

    NORTEL-03> show sntp time
    Sntp Time Details - UTC
    Timing Source = GPS
    Base Secs = 3276519473 (c34bb831)
    Base Nsecs = 0 (00000000)
    Base TBH = 587 (0000024b)
    Base TBL = 2975047453 (b1539f1d)
    Current Secs = 3276519473 (c34bb831)
    Current Nsecs = 764072390 (2d8ad1c6)
    Num Leap Secs (since 1980) = 13
    Date = 2003/10/30
    Time = 16:17:53.764
    Local Time Offset = -300 minutes

    Cannot Set Up 1xEV-DO Call - CLI (Cont'd)

    NORTEL-03> show parent-rnc   < GOOD >
    RNC          PRIORITY
    10.12.0.241  1
    10.12.0.249  2
    10.12.0.244  3

    NORTEL-03# show rn-connection summary   < Widespread Issue >
    Connection Summary:
    -------------------
    TotalNumConnections         : 0
    NumTrafficChannels[sector0] : 0
    NumTrafficChannels[sector1] : 0
    NumTrafficChannels[sector2] : 0
    BtsConnId BscConnId  UATI32     SoftHoCnt SofterHoCnt SectorBitmap
    =========-==========-==========-=========-===========-============

    Nortel-03# show topology-manager config
    TM-ID = 10.12.0.232
    RN-RNC Parameters
    -----------------
    HoldTime = 120
    KeepaliveTime = 40

    Cannot Set Up 1xEV-DO Call - CLI (Cont'd)

    NORTEL-03# show topology-manager peer
    BSC          STATE                TXMSGS  TXUPDATES  RXMSGS  RXUPDATES
    10.12.0.241  BtsBscStateAbisInit  1599    0          0       0

    NORTEL-03# show abis config
    Abis Configuration parameters
    Hello Interval (in seconds)   10
    Hello Retransmits             3
    Connect Timeout (in seconds)  40

    NORTEL-03# show traffic-channel   < Widespread Issue >
    BTS     BSC         Flow  Pri  Dn Pkts  Up Pkts
    -------------------------------------------------
    0x0010  0x00000c01  CCH   128  1188     359
    0x0011  0x00000c02  CCH   128  1099     245
    0x0012  0x00000c03  CCH   128  852      0

    Cannot Set Up 1xEV-DO Call - RNC CLI

    Nortel-07# show 1xEV-DO sess all 1001 40   < NO SESSION SETUP >
    UATI List
    Inst   UATI24  RATI   PSI    HW Id  IMSI   STATE  ConnState
    (Dec)  (Hex)   (Hex)  (Hex)  (Hex)  (BCD)
    ----------------------------------------------------------------------------------------

    Nortel-07# show 1xEV-DO conection all   < NO CONNECTION >
    --------------------------------------------------------------------------------
    S#  Slot  Session Instance  UATI24
    --------------------------------------------------------------------------------

    AIRVANA-07# show pcf pdsn all   < GOOD >
    PDSN Selection Table
    --------------------
    No.  IP           Status    No.  IP       Status    No.  IP       Status
    ------------------------------------------------------------------------
    0    99.99.99.99  Active    22   0.0.0.0  Inactive  44   0.0.0.0  Inactive
    1    0.0.0.0      Inactive  23   0.0.0.0  Inactive  45   0.0.0.0  Inactive

    The Problem

    We seem to have lost IP connectivity between DOM and RNC

    We do have backhaul connectivity

    The Fix

    On the DOM, add a static route for the RNC's Node Address, 10.12.0.241

    NORTEL-03# conf
    Enter configuration commands, one per line. End with CTRL-Z.
    NORTEL-03(config)# ip route 10.12.0.241/32 1.1.1.2 2.2.2.1
    NORTEL-03(config)# exit

    NORTEL-03# show ip route
    IP ROUTING TABLE
    ----------------------------------------------------------------------------
    Destination     Flags  Gateway  Owner  Interface
    ----------------------------------------------------------------------------
    10.12.0.241/32         1.1.1.2  S      ppp1/0/1
                           2.2.2.1  S      ppp1/0/2
    1.1.1.0/24      L      1.1.1.1  L      ppp1/0/1
    2.2.2.0/24      L      2.2.2.2  L      ppp1/0/2

    Cannot Set Up 1xEV-DO Call - CLI (Cont'd)

    How many active connections on a particular sector-element?

    Nortel-03# show sector-element e1
    element1:
    Carrier: carrier1
    Sector: sector1 (none)
    Admin status: UP
    Operational status: UP for 0 days 17h:56m:28s
    Waiting for: Admin status to be set DOWN
    [OperUpDownCount]: 12
    [Current State]: Operating
    Modem module: modem1
    Modem operational status: UP
    [ModemSectorAdminStatus]: UP
    [ModemSectorOperStatus]: UP
    [ModemTestModeEnabled]: false
    [ModemTestCase]: None
    Radio module: radio1
    Radio operational status: UP
    Power output level: 40 dBm
    Connection limit: 48
    Active connections: 0
    Bounce detection: 3 times in last 300 seconds
    Bounce timeout: 120 seconds
    [RNC config state]: COMPLETE
    [RNC config failed count]: 0
    Air interface: IS-856
    Timing advance: 54 chipx2
    Reverse delay dist.: 48 chipx2


    Continue

    [Sector ID]: 0x12000000340000005600000078000001

    PN offset: 4

    Cell ID: 0000

    Maximum revision: 244

    Minimum revision: 244

    [Color code]: 1

    Redirect: false

    Control channel rate: 76.8 kbps

    Sync capsule offset: 0 slots

    RAB length: 32 slots

    RAB offset: 16 slots

    Country code: 1

    [Subnet mask length]: 104 bits

    [Route update radius]: 0

    Rev link silence duration: 0 frames

    Rev link silence period: 3

    Channel count: 0

    Open loop adjust: 86 dB

    Probe initial adjust: 0 dB

    Probe steps: 15

    Probe power step: 3 dB

    A Persistence: [1,1,1,1]

    Search window size inc.: false

    Search window offset inc.: false

    [OverheadBtsConnectionId]: 0x0010

    [OverheadBscConnectionId]: 0x00000C01

    Continue..

    Neighbor count: 3

    Neighbor 1:
    Neighbor Index: 8
    Neighbor Address: 10.12.0.244
    Air interface: IS-856
    Pilot PN: 24
    Channel included: false
    System type: 0
    Band class: Class 1 (1900 MHz)
    Channel 925
    Search window size: 4 chips (0)
    Search window offset: 0 chips

    Neighbor 2:
    Neighbor Index: 9
    Neighbor Address: 10.12.0.244
    Air interface: IS-856
    Pilot PN: 196
    Channel included: false
    System type: 0
    Band class: Class 1 (1900 MHz)
    Channel 925
    Search window size: 4 chips (0)
    Search window offset: 0 chips

    Neighbor 3:
    Neighbor Index: 10
    Neighbor Address: 10.12.0.244
    Air interface: IS-856
    Pilot PN: 368
    Channel included: false
    System type: 0
    Band class: Class 1 (1900 MHz)
    Channel 925
    Search window size: 4 chips (0)
    Search window offset: 0 chips

    Active connections: 1

    Cannot Set Up 1xEV-DO Call - CLI (Cont'd)

    Nortel-03# show traffic-channel
    BTS     BSC         Flow  Pri  Dn Pkts  Up Pkts
    -------------------------------------------------
    0x0010  0x00000c01  CCH   128  1215     359
    0x0011  0x00000c02  CCH   128  1161     282
    0x0012  0x00000c03  CCH   128  877      0
    0x0229  0x000301ec  FTC   128  486      430

    Verifying the DO connection from RNC

    Nortel-07# show 1xEV-DO sess all 1001 40
    UATI List
       Inst   UATI24  RATI      PSI       HW Id     IMSI             STATE  ConnState
       (Dec)  (Hex)   (Hex)     (Hex)     (Hex)     (BCD)
    ----------------------------------------------------------------------------------------
    1  3      D003E9  FED4991C  60000006  600E0278  310012135135864  Open   Active
    2  4      D003EA  1944395C  60000001  600E0299  310012135135897  Open   Dormant

    Total Displayed Number of Current Active Sessions: 1
    Total Displayed Number of Current Dormant Sessions: 1
    Total Displayed Number of Current Sessions Awaiting Close from AT: 0
    Total Displayed Number of Current Active Sessions: 1
    Total Displayed Number of Current Dormant Sessions: 2
    Total Displayed Number of Current Sessions Awaiting Close from AT: 0

    Cannot Set Up 1xEV-DO Call - CLI (Cont'd)

    Nortel-07# show 1xEV-DO conection all
    --------------------------------------------------------------------------------
    S#  Slot  Session Instance  UATI24
    --------------------------------------------------------------------------------
    1   3     3                 0x00d003e9


    What If This Does Not Work?

    Look for problems external to BSNE:

    Radio malfunction

    Antenna malfunction

    Antenna cable malfunction

    Interference

    BTS chassis malfunction


    Common 1xEV-DO Issues

    Cannot set up a DO call

    User authentication fails

    DO Connections are dropping

    Handoff doesn't work on isolated sectors

    Poor user Internet browsing experience

    Poor application throughput


    Handoff Failure Indicators

    Wrong neighbor list IP address

    Incorrect neighbor channel number

    Multiple neighbors with same PN offset

    Time offset errors between BTS

    Coverage/ neighbor sector(s) down

    PN offset increment is incorrect
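    Two of these indicators (duplicate PN offsets and a wrong PN offset increment) are easy to screen for once a neighbor list has been copied out of the show sector-element output. The Python sketch below is a hypothetical helper, not an Airvana tool: the dictionary layout, the function names, the sample neighbor entries, and the PN increment of 4 are all assumptions made for illustration.

```python
from collections import Counter


def duplicate_pn_offsets(neighbors):
    """Return the PN offsets that appear more than once in a neighbor list."""
    counts = Counter(n["pilot_pn"] for n in neighbors)
    return sorted(pn for pn, count in counts.items() if count > 1)


def pn_offsets_off_increment(neighbors, pn_increment):
    """Return the PN offsets that are not a multiple of the planned PN increment."""
    return sorted(n["pilot_pn"] for n in neighbors if n["pilot_pn"] % pn_increment != 0)


# Hypothetical neighbor list: two entries clash on PN 196, and PN 350 is off-grid
# for an assumed PN increment of 4.
neighbors = [
    {"address": "10.10.117.8", "pilot_pn": 24, "channel": 925},
    {"address": "10.10.117.8", "pilot_pn": 196, "channel": 925},
    {"address": "10.12.0.242", "pilot_pn": 196, "channel": 925},
    {"address": "10.12.0.242", "pilot_pn": 350, "channel": 925},
]

print("duplicate PN offsets:", duplicate_pn_offsets(neighbors))
print("off-increment PN offsets:", pn_offsets_off_increment(neighbors, pn_increment=4))
```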


    Sector-Element Neighbor List

    DOM# show sector-element e1

    element1:

    Carrier: carrier1

    Sector: sector1 (none)

    Admin status: UP

    Operational status: UP for 1 days 03h:19m:20s

    Waiting for: Admin status to be set DOWN

    [OperUpDownCount]: 2

    [Current State]: Operating

    Modem module: modem1

    Neighbor count: 5

    Neighbor 1:

    Neighbor Index: 8

    Neighbor Address: 10.10.117.8

    Air interface: IS-856

    Pilot PN: 24

    Channel included: false

    System type: 0

    Band class: Class 1 (1900 MHz)

    Channel 925

    Search window size: 4 chips (0)

    Search window offset: 0 chips


    Sector-Element Neighbor List

    Neighbor 2:

    Neighbor Index: 9

    Neighbor Address: 10.10.117.8

    Air interface: IS-856

    Pilot PN: 196

    Channel included: false

    System type: 0

    Band class: Class 1 (1900 MHz)

    Channel 925

    Search window size: 4 chips (0)

    Search window offset: 0 chips

    Neighbor 3:

    Neighbor Index: 10

    Neighbor Address: 10.10.117.8

    Air interface: IS-856

    Pilot PN: 368

    Channel included: false

    System type: 0

    Band class: Class 1 (1900 MHz)

    Channel 925

    Search window size: 4 chips (0)

    Search window offset: 0 chips


    Sector-Element Neighbor List

    Neighbor 4:

    Neighbor Index: 18

    Neighbor Address: 10.12.0.242

    Air interface: IS-856

    Pilot PN: 176

    Channel included: false

    System type: 0

    Band class: Class 1 (1900 MHz)

    Channel 925

    Search window size: 4 chips (0)

    Search window offset: 0 chips

    Neighbor 5:

    Neighbor Index: 19

    Neighbor Address: 10.12.0.242

    Air interface: IS-856

    Pilot PN: 348

    Channel included: false

    System type: 0

    Band class: Class 1 (1900 MHz)

    Channel 925

    Search window size: 4 chips (0)

    Search window offset: 0 chips


    Common 1xEV-DO Issues

    Cannot set up a DO call

    User authentication fails

    DO Connections are dropping

    Handoff doesn't work on isolated sectors

    Poor user Internet browsing experience

    Poor application throughput

    Connections Are Dropping

    Is the AT at a cell boundary? Poor RF coverage.

    The PDSN has some security features (ingress filtering) that can tear down calls with Microsoft machines. This usually happens on a PC that has one local Ethernet connection and one DO connection.

    Is the Abis link disconnecting?

    The number of AT connections on the sector has reached the configured limit (default 48 ATs/sector).

    Where to Go

    Set call control logging to info and turn on debug call drop on the RNC

    This will give you the cause and the last 100 C/I samples from each dropped call

    Use RNC, slot, and sector-carrier OMs to form a picture of what is happening


    Call Drop Logging

    ISP_RNC(config)#1xevdo debug connection-drops


    Debugging Common 1xEV-DO Problems

    Cannot set up a DO call

    User authentication fails

    DO Connections are dropping

    Handoff doesn't work

    Poor user Internet browsing experience

    Poor application throughput

    Objectives

    Provide a proven methodology to measure, analyze, and debug throughput issues in the network.

    Reduce the time it takes to resolve throughput-related problems.

    Minimize duplicate testing effort at customer sites.

    Demonstrate rigor and coherence when faced with throughput issues.

    Avoid finger pointing within organizations.


    Outline

    Introduction

    Brief theoretical background of why packet losses are the primary driver of throughput in TCP-based networks

    Testing tools and instrumentation

    Before you spend too much time chasing phantoms

    Troubleshooting methodology

    Results analysis

    Typical Performance Problems Observed by Customers

    Poor web browsing (HTTP) performance

        Characterized by unexpected delays between requests and responses as well as by slow, interrupted web page loading

    Poor file transfer (FTP) performance on the forward link

        Characterized by substantial difference between requested data rates (DRC) and actual perceived speed

        Stalls of more than 10 seconds during downloads

    Limited file transfer (FTP) performance on the reverse link

    Long latency

        Minimum 110 ms, typically around 300-600 ms for most users

        Representing a potential obstacle for particular applications


    Understanding Causes of Poor Performance

    Signal quality and RF coverage issues

    RF holes

    Poor RF coverage or C/I with strong signals

    Self-interference from laptops

    Hybrid mode tune away issues

    Poor TCP performance due to packet losses

    Loss in backhaul network

    Loss in core network

    Loss in RAN

    Excessive latency

    Too many users

    Inadequate backhaul bandwidth


    Work to Localize the Problem

    Historical Look at Causes of Degraded Performance

    Performance issues have been due to (in decreasing order):

    RF issues.

    Hybrid mode problems.

    Self interference from laptops.

    RF holes when overlaying 1900 MHz on 800 MHz.

    Routing misconfigurations in the core network.

    External internet issues and bandwidth.

    Backhaul packet loss, including router configuration, aggregation, excessive latency, lost packets, and inadequate backhaul bandwidth.

    Configuration/misconfiguration issues in RAN.

    Incorrect configuration scripts.

    Laptop/AT issues.

    AT bugs, mostly fixed in newer AT software versions.

    Laptop settings.


    User Performance Troubleshooting Process


    Airlink Performance Troubleshooting Process

    Why Packet Loss Causes Poor TCP Performance in Wireless Networks

    Why RF engineers really do need to understand TCP

    Why network engineers really do need to understand what goes on in the physical layer

    TCP Vs. UDP

    UDP source does not react to loss and is not sensitive to delay

        UDP tests can be used to determine the max air-link bandwidth

    TCP is sensitive to both loss and delay

        TCP tests can be used to determine the max air-link bandwidth subject to the loss and delay constraints of the network

    Single user/sector FTP tests are often used to benchmark overall end-to-end packet loss and performance

    RLP Provides Airlink Reliability to TCP Layer

    RLP is a 1xEV-DO protocol layer that is responsible for retransmitting octets that are lost in the RAN backhaul or over the airlink

    RLP retransmits octets when it receives a Nak from the receiver indicating loss

    RLP retransmits only once

        If the Nak or the retransmitted data is lost, RLP will not be able to recover the lost data, and the errored data is passed to the higher layer

    The maximum time RLP waits to recover a hole in the data is 500 ms, also known as the RLP abort timeout

    [Figure: protocol encapsulation. The App-Layer Payload with its IP header is carried as the PPP Payload behind a PPP Hdr, which in turn is carried as the RLP Payload behind an RLP Hdr.]
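    Because RLP retransmits only once, a small fraction of losses still reaches TCP. The Python sketch below estimates that residual loss under a simplifying assumption that is not from the slides: the original frame, the Nak, and the retransmitted frame are each lost independently. The function name and the example loss rates are hypothetical.

```python
def rlp_residual_loss(p_data, p_nak):
    """Probability that data is still missing after RLP's single retransmission.

    The data stays lost if the first copy is lost AND either the Nak is lost
    or the Nak arrives but the retransmitted copy is lost as well.
    Assumes independent losses (an illustrative simplification).
    """
    recovery_fails = p_nak + (1.0 - p_nak) * p_data
    return p_data * recovery_fails


# Example loss rates, chosen only for illustration.
for loss in (0.01, 0.05, 0.10):
    residual = rlp_residual_loss(loss, loss)
    print(f"link loss {loss:.2f} -> residual loss seen by TCP {residual:.5f}")
```

    Under these assumptions the residual loss grows roughly with the square of the link loss rate, which is why a modest rise in airlink or backhaul loss can have a disproportionate effect on TCP.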

    TCP Congestion Control

    Sender may not only overrun receiver, but may also overrun intermediate routers:

        No way to explicitly know router buffer occupancy, so we need to infer it from packet losses.

        Assumption is that losses stem from congestion, namely, that intermediate routers have no available buffers.

    Sender maintains a congestion window (CW):

        Never have more than CW of unacknowledged data outstanding (or RWIN data; min of the two).

        Successive ACKs from receiver cause CW to grow.

    How CW grows depends on which of 2 phases the sender is in:

        Slow-start: initial state.

        Congestion avoidance: steady-state.

        Switch between the two when CW > slow-start threshold.

    TCP Congestion Control Principles

    Lack of congestion control would lead to congestion collapse. Idea is to be a good network citizen

    Would like to transmit as fast as possible without loss

    Probe network to find available bandwidth

        In steady-state: linear increase in CW per RTT

        After each loss event: CW is halved

    This is called additive increase / multiplicative decrease (AIMD)

        Various papers on why AIMD leads to network stability

    Slow Start

    Initial CW = 1

    After each ACK, CW += 1;

    Continue until:

        Loss occurs OR

        CW > slow start threshold

    Then switch to congestion avoidance

    If we detect loss, cut CW in half

    Exponential increase in window size per RTT

    [Figure: sender/receiver timeline. The sender transmits one segment in the first RTT, two segments in the second, and four segments in the third.]
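    The doubling per RTT follows from the per-ACK rule: every segment in the window is ACKed within one RTT, and each ACK adds one segment. The Python sketch below illustrates only this growth pattern, counted in whole segments with no loss; the function name, the threshold of 8 segments, and the RTT count are assumptions chosen for the example.

```python
def slow_start_windows(ssthresh, max_rtts):
    """Return the congestion window (in segments) at the start of each RTT.

    Growth stops once CW exceeds the slow-start threshold, which is the point
    where the sender would switch to congestion avoidance.
    """
    cw = 1
    history = [cw]
    for _ in range(max_rtts):
        if cw > ssthresh:
            break
        # Every segment of the window is ACKed during this RTT and each ACK
        # adds one segment, so the window doubles.
        cw += cw
        history.append(cw)
    return history


print(slow_start_windows(ssthresh=8, max_rtts=10))
```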

    Congestion Avoidance

    Until (loss) {
        after CW packets ACKed:
            CW += 1;
    }
    ssthresh = CW/2;

    Depending on loss type:
        SACK/Fast Retransmit:
            CW /= 2; continue;
        Coarse-grained timeout:
            CW = 1; go to slow start.

    (This is for TCP Reno/SACK: TCP Tahoe always sets CW=1 after a loss)
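    The pseudocode above can be turned into a small runnable sketch. The Python below is a simplified illustration of the Reno/SACK-style reaction described on the slide, counted in whole segments; it is not an implementation of any particular TCP stack, and the function names and the starting window of 16 segments are assumptions.

```python
def on_window_acked(cw):
    """Congestion avoidance: grow CW by one segment once a full window is ACKed."""
    return cw + 1


def on_loss(cw, loss_type):
    """Return (new_cw, ssthresh, next_phase) after a loss event."""
    ssthresh = max(cw // 2, 1)
    if loss_type == "fast_retransmit":
        # Loss detected through duplicate ACKs or SACK: halve CW and carry on.
        return ssthresh, ssthresh, "congestion_avoidance"
    if loss_type == "timeout":
        # Coarse-grained timeout: collapse to one segment and restart slow start.
        return 1, ssthresh, "slow_start"
    raise ValueError(f"unknown loss type: {loss_type}")


cw = 16
for _ in range(4):  # four full windows ACKed without loss
    cw = on_window_acked(cw)
print("CW before loss:", cw)
print("after fast retransmit:", on_loss(cw, "fast_retransmit"))
print("after timeout:", on_loss(cw, "timeout"))
```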

    How Are Losses Recovered?

    Say packet is lost (data or ACK!)

    Coarse-grained timeout:

        Sender does not receive ACK after some period of time

        Event is called a retransmission time-out (RTO)

        RTO value is based on estimated round-trip time (RTT)

        RTT is adjusted over time using exponential weighted moving average:

            RTT = (1-x)*RTT + (x)*sample

            (x is typically 0.1)

    First done in TCP Tahoe

    [Figure: lost ACK scenario on a sender/receiver timeline. The sender transmits Seq=92, 8 bytes data; the receiver's ACK=100 is lost (X); after the timeout the sender retransmits Seq=92, 8 bytes data and the receiver sends ACK=100 again.]
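    The moving-average formula on this slide is short enough to try directly. The Python sketch below applies RTT = (1-x)*RTT + (x)*sample with x = 0.1 to a series of samples; the function name and the sample values (a jump from 300 ms to 600 ms) are assumptions chosen for illustration.

```python
def update_rtt(estimate, sample, x=0.1):
    """Exponentially weighted moving average of the round-trip time."""
    return (1.0 - x) * estimate + x * sample


# Hypothetical samples in seconds: the path RTT jumps from 300 ms to 600 ms.
estimate = 0.300
for sample in (0.300, 0.600, 0.600, 0.600):
    estimate = update_rtt(estimate, sample)
    print(f"sample {sample:.3f} s -> estimate {estimate:.3f} s")
```

    With a small x the estimate follows a sudden RTT increase only slowly, so a timeout derived from it can expire before a delayed ACK arrives.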

    Fast Retransmit

    Receiver expects N, gets N+1:

        Immediately sends ACK(N)

        This is called a duplicate ACK

        Does NOT delay ACKs here!

        Continue sending dup ACKs for each subsequent packet (not N)

    Sender gets 3 duplicate ACKs:

        Infers N is lost and resends

        3 chosen so out-of-order packets don't trigger fast retransmit accidentally

    Called "fast" since we don't need to wait for a full retransmission timeout (RTO)

    Introduced in TCP Reno

    [Figure: sender/receiver timeline showing fast retransmit. Labels, in order: SEQ=3000, size=1000; ACK3000; X (loss); SEQ=4000; SEQ=5000; SEQ=6000; ACK3000 (three times); SEQ=3000, size=1000 (resent).]
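    The sender-side rule (resend after three duplicate ACKs) can be sketched as a simple counter. The Python below is a minimal illustration, not code from any TCP stack; the function name is hypothetical, and the ACK sequence mirrors the ACK3000 labels in the figure.

```python
def fast_retransmit_trigger(acks, threshold=3):
    """Return the ACK number that triggers fast retransmit, or None.

    An ACK counts as a duplicate when it repeats the previous ACK number.
    The trigger fires on the third duplicate (the fourth identical ACK).
    """
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                return ack
        else:
            last_ack = ack
            duplicates = 0
    return None


# One original ACK3000 followed by three duplicates, as in the figure.
print(fast_retransmit_trigger([3000, 3000, 3000, 3000]))
# Only two duplicates: not enough to infer a loss.
print(fast_retransmit_trigger([3000, 3000, 3000]))
```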

    TCP Configuration: Make Sure You Know the TCP Configuration of Your Network

    TCP configuration greatly affects the throughput performance

    TCP configuration parameters of interest:

        Maximum TCP window

        Maximum duplicate Acks

        Selective acknowledgments

        Maximum segment size

        Minimum round-trip timeout

    TCP Configuration - TCP Window

    The maximum TCP window (set at the receiver) determines the maximum number of unacknowledged octets that the server can transmit

    When TCP detects loss, it halves its window in reaction to congestion

    Relationship between TCP window, throughput, and delay:

        Max achievable TCP throughput = TCP window / round-trip delay

        TCP window must be at least the bandwidth-delay product

    Use DRTCP.EXE to set the max TCP (receive) window on the laptop
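    The window/delay relationship is easy to check with numbers. The Python sketch below is illustrative only: the 65535-byte window and the 2.4576 Mbps link rate (the 1xEV-DO Rev. 0 forward-link peak) are example inputs rather than values from the slides, and the function names are hypothetical. The RTT values are the latencies quoted earlier in this module.

```python
def max_tcp_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on TCP throughput: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(link_bps, rtt_seconds):
    """Window needed to keep a link of the given rate full for one RTT."""
    return link_bps * rtt_seconds / 8


WINDOW_BYTES = 65535   # example receive window (assumption)
LINK_BPS = 2_457_600   # example link rate in bit/s (assumption)

for rtt in (0.110, 0.300, 0.600):
    ceiling = max_tcp_throughput_bps(WINDOW_BYTES, rtt)
    needed = bandwidth_delay_product_bytes(LINK_BPS, rtt)
    print(f"RTT {rtt * 1000:.0f} ms: ceiling {ceiling / 1000:.0f} kbps, "
          f"window needed {needed:.0f} bytes")
```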

    TCP Configuration - Duplicate Acks

    When the number of Acks acknowledging the same TCP segment reaches the maximum duplicate Acks parameter, the server retransmits the segment

    Use DRTCP.EXE to set the maximum number of duplicate Acks on the laptop

    TCP Configuration - Selective Acks

    The selective Acks (SACK) feature is negotiated between the receiver and the sender

    It provides a way for the receiver to request the selective retransmission of one or more lost segments

    If disabled, the server can only retransmit lost segments one at a time

    In networks with relatively high delay and high loss (such as wireless networks), SACK must be enabled

    Use DRTCP.EXE to enable SACK on the laptop

    TCP Configuration - Max Segment Size

    The maximum segment size (MSS) configuration affects the percentage overhead associated with TCP/IP packets

    For optimum performance, the MSS should be set to be the maximum integral divisor of the TCP window size and must be less than roughly 1500 bytes

    Use DRTCP.EXE to set the MSS to a safe value of 1400 bytes
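    Both MSS rules can be worked through numerically. The Python sketch below assumes 40 bytes of TCP/IP headers per packet (20-byte IPv4 header plus 20-byte TCP header, no options) and ignores PPP and RLP overhead; the function names, the 64240-byte window, and the 1460-byte upper limit are assumptions chosen for illustration.

```python
TCP_IP_HEADER_BYTES = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def header_overhead_percent(mss):
    """Share of each full-sized packet that is TCP/IP header rather than payload."""
    return 100.0 * TCP_IP_HEADER_BYTES / (mss + TCP_IP_HEADER_BYTES)


def largest_mss_dividing_window(window_bytes, mss_limit):
    """Largest MSS not above mss_limit that divides the window into whole segments."""
    for mss in range(mss_limit, 0, -1):
        if window_bytes % mss == 0:
            return mss


for mss in (536, 1400, 1460):
    print(f"MSS {mss}: {header_overhead_percent(mss):.2f}% header overhead")

print(largest_mss_dividing_window(64240, 1460))
```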

    TCP Configuration - Minimum Round-trip Timeout

    The minimum round-trip timeout (min RTO) is the initial value to which TCP sets its round-trip timeout (after which it retransmits an unacknowledged segment)

    In 1xEV-DO/1x hybrid mode, the min RTO parameter may need to be increased on Sun-based FTP servers

    Tools and Instrumentation for Troubleshooting 1xEV-DO Networks

    Tools and Instrumentation For Debugging and Analyzing Performance

    RN and RNC logging and data collection

        RNC HDRFastPath stats

        RNC FastPath stats

        Forward traffic channel manager logs on the RN

        Power-rate-control stats on both the RN and the RNC

    RN CellDM

    CAIT logs and stats

    Tap (FTAP, RTAP)

    PPP errors shown on the laptop dialer box

    Windump logs collected on the laptop

    Snoop logs collected on the traffic server/host

    What Is CAIT and How Can It Help You Analyze Performance Issues?

    CAIT is the Qualcomm access terminal diagnostic monitor, and currently is the only tool which supports 1xEV-DO

    What to collect:

        Logs: Airvana recommended logging as follows

        F3 access terminal monitor screens

            Required to debug hybrid mode issues

    ! CAIT is both CPU and memory intensive and can be intrusive (using CAIT can degrade performance) in single-AT tests

        CAIT-instrumented laptops should have at least 256 MB of memory

    Key CAIT Screens

    F3 Mobile Messages

    F9 Status Window

    F8 Scripting

    1xEV-DO Windows

  • 7/27/2019 Module10 Troubleshooting Training

    62/121

    Page 64Confidential & Proprietary

    Airvana Recommended Logging Mask

    [Screenshot: CAIT logging mask selection. Callouts: "If you want GPS position data" and "Use these to collect 1xRTT and AT state messages".]

    TAP (FTAP and RTAP) and How Is It Used in the Debug Process

    FTAP and RTAP are test instruments built into 1xEV-DO access terminals and into the RN and RNC

    On the protocol stack, TAP does NOT employ RLP, but sends test frames to saturate the bandwidth achievable by the user

    TAP's forward and reverse bandwidth estimation is very accurate

    TAP's reverse and forward error rate estimation is inaccurate, but with CAIT stats and power-rate-control stats we can do without FTAP's error rate estimates

    TAP is used to prove or disprove that the path between the RNC (below the RLP layer) and the AT is problematic


    What to Look for in TAP Tests?

    FTAP and RTAP must report the data rates that one would expect the airlink to be able to deliver

    Example: if the user is the only active user on the sector, the reported FTAP data rate must be almost equal to the average DRC of the user, and the RTAP data rate must be almost equal to the average reverse rate limit on the sector

    Power-rate-control stats and CAIT stats must respectively show a reverse and a forward error rate both less than 1%
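    The TAP acceptance criteria above can be written down as a small check. This is only a sketch: the function name, the kbps units and the 10% tolerance used for "almost equal" are assumptions for illustration, not values from the training material. Only the 1% error-rate limit comes from the slide.

```python
def tap_results_look_normal(ftap_kbps, avg_drc_kbps,
                            rtap_kbps, avg_rev_limit_kbps,
                            fwd_error_rate, rev_error_rate,
                            tolerance=0.10, max_error_rate=0.01):
    """Return a dict of named checks; True means the check passes.

    tolerance is a hypothetical reading of "almost equal" (10%).
    max_error_rate is the 1% limit quoted on the slide.
    """
    def close(measured, expected):
        # Relative comparison against the expected airlink rate.
        return expected > 0 and abs(measured - expected) <= tolerance * expected

    return {
        "ftap_rate": close(ftap_kbps, avg_drc_kbps),
        "rtap_rate": close(rtap_kbps, avg_rev_limit_kbps),
        "fwd_error": fwd_error_rate < max_error_rate,
        "rev_error": rev_error_rate < max_error_rate,
    }


if __name__ == "__main__":
    checks = tap_results_look_normal(1150, 1200, 150, 153.6, 0.004, 0.02)
    for name, ok in checks.items():
        print(name, "OK" if ok else "INVESTIGATE")
```

    Any check that fails points back to the path between the RNC (below RLP) and the AT, which is what TAP is meant to isolate.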


    What Is Windump and How Is It Used

    Windump (tcpdump in a Unix environment) captures all activity across the IP stack on a laptop

    Programs like Ethereal are used to parse Windump data and to display and interpret the contents of the IP, ICMP, and TCP headers

    For throughput troubleshooting, the first item to look for is the "SACK permitted" option in the options field of the TCP header of the SYN and SYN-ACK packets that are used to establish a TCP connection. If "SACK permitted" does not appear, the SACK feature is not enabled

    The most important information that can be obtained from Windumps is the TCP trace, which shows how the TCP entity on the laptop is receiving its TCP data
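    For reference, "SACK permitted" is TCP option kind 4 (length 2) in the SYN and SYN-ACK. The sketch below walks the raw options bytes of a TCP header and reports whether that option is present. It assumes you have already extracted the options field from the capture; the function names are illustrative only.

```python
def tcp_option_kinds(options: bytes):
    """Return the list of TCP option kinds found in a raw options field."""
    kinds = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:          # End of Option List
            break
        if kind == 1:          # No-Operation (padding), single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break              # truncated option, stop parsing
        length = options[i + 1]
        if length < 2:
            break              # malformed length, stop parsing
        kinds.append(kind)
        i += length
    return kinds


def sack_permitted(options: bytes) -> bool:
    """True if the SACK-permitted option (kind 4) is in the options field."""
    return 4 in tcp_option_kinds(options)


if __name__ == "__main__":
    # MSS 1460, NOP, NOP, SACK-permitted
    syn_options = bytes.fromhex("020405b401010402")
    print(sack_permitted(syn_options))
```

    Both ends must offer the option: check the SYN from the laptop and the SYN-ACK from the server.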


    Installing WinPcap and Windump Software

    Before installing the Windump version 3.6.2 executable file, install WinPcap version 2.3 on the PC from the following URL: http://windump.polito.it/install/Default.htm

    Next, install the Windump executable from the same URL.

    The manual can be found at the following URL: http://windump.polito.it/docs/manual.htm


    Executing Windump

    1) c:\documents and settings\administrator>windump -help

    windump version 3.6.2, based on tcpdump version 3.6.2

    WinPcap version 2.3, based on libpcap version 0.6.2

    Usage: windump [-adDeflnNOpqStuvxX] [-B size] [-c count] [ -F file ]

    [ -i interface ] [ -r file ] [ -s snaplen ]

    [ -T type ] [ -w file ] [ expression ]

    2) c:\documents and settings\administrator>windump -D

    1.\Device\packet_NdisWanIp (NdisWan adapter)

    2.\Device\packet_{49854d65-f483-48d3-b8fe-da3c7d38cf4f} (3com 10/100 mini PCI Ethernet adapter)

    3.\Device\packet_{96382fcd-f4a9-416f-b1e2-ee54af25ae87} (NOC extranet access adapter)

    4.\Device\packet_NdisWanBh (NdisWan adapter)

    3) c:\documents and settings\administrator>windump -i 4 -w hugeng

    windump: listening on \device\packet_NdisWanBh

    windump: WARNING: the operation completed successfully.

    .

    38 packets received by filter.

    0 packets dropped by kernel.

    4) Open the file with Ethereal


    Historical Look at Causes of Degraded Performance

    The vast majority of performance issues have been due to (in decreasing order):

    RF

    RF issues in 1xRTT networks spill over into 1xEV-DO networks using hybrid access terminals

    Neighbor lists, PN plan, handoff boundaries, parameters

    Routing misconfigurations in the core network

    External internet issues and bandwidth

    Backhaul packet loss

    Including router configuration, aggregation, excessive latency, lost packets, inadequate backhaul bandwidth

    Configuration/misconfiguration issues in the RAN

    Incorrect configuration scripts

    Laptop/AT issues

    AT bugs, mostly fixed in newer AT software versions

    Laptop settings


    Analyzing With TCP Tool


    Using TCP Sequence Number Plot

    There is a beautiful way to plot and visualize the dynamics of TCP behavior, called a TCP sequence number plot

    Plot packet events (data and ACKs) as points in 2-D space, with time on the horizontal axis and sequence number on the vertical axis
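    As a rough illustration of how the points of such a plot can be derived from a capture, the sketch below turns a list of received data segments into (time, relative sequence number) points and flags segments that arrive below the highest sequence number already seen, which is how retransmissions and hole fills show up on the plot. The input tuple layout and the function name are assumptions made for this example; any plotting tool can then draw the points.

```python
def sequence_plot_points(segments):
    """Build sequence-number-plot points for one direction of a TCP flow.

    segments: list of (timestamp_s, seq, payload_len) in capture order.
    Returns a list of (elapsed_s, relative_seq, below_highest) tuples, where
    below_highest is True when the segment ends at or below data already
    seen (a retransmission or a late, out-of-order arrival).
    """
    segments = list(segments)
    if not segments:
        return []
    t0 = segments[0][0]
    first_seq = segments[0][1]
    highest_end = 0
    points = []
    for ts, seq, length in segments:
        rel_seq = (seq - first_seq) % 2**32   # handle 32-bit wraparound
        end = rel_seq + length
        below_highest = length > 0 and end <= highest_end
        highest_end = max(highest_end, end)
        points.append((round(ts - t0, 6), rel_seq, below_highest))
    return points


if __name__ == "__main__":
    capture = [(10.0, 1000, 100), (10.1, 1100, 100),
               (10.2, 1300, 100), (10.5, 1200, 100)]
    for point in sequence_plot_points(capture):
        print(point)
```

    In the example, the segment starting at 1200 arrives after the one at 1300, so it is flagged; on the plot it appears as a point below the leading edge.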


    How to Interpret a TCP Trace? The Basics

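    [Figure: TCP sequence number plot with SACK disabled. Axes: time elapsed (horizontal), TCP sequence numbers (vertical). Annotations: in-sequence segment arrivals, a loss, and several retransmissions (Retx).]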


    How to Interpret a TCP Trace? The Basics

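    [Figure: TCP sequence number plot with SACK enabled. Axes: time elapsed (horizontal), TCP sequence numbers (vertical). Annotations: in-sequence segment arrivals and two retransmissions (Retx).]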


    How to Interpret a TCP Trace? The Basics

    [Figure: TCP sequence number plot. Axes: time elapsed (horizontal), TCP sequence numbers (vertical). Annotations: no TCP segment losses; a 130 ms gap; data recovered through RLP.]


    How to Interpret a TCP Trace? The Basics

    [Figure: TCP sequence number plot. Axes: time elapsed (horizontal), TCP sequence numbers (vertical). Annotations: four hybrid mode switchovers.]

    Performance Troubleshooting Methodology

    Debugging Methodology

    Work backwards from the user

    Verify airlink quality, including hybrid mode tune-away

    Verify AT and PC setup

    Verify backhaul performance

    Verify RAN as a whole

    Look for dropped packets

    Verify core network

    Routing issues, dropped packets, latency


    Before Starting on a Wild Goose Chase

    Check system configuration against what it should be

    Understand what your benchmarks should be based on network type

    Hybrid/non-hybrid

    Mobile-IP/simple-IP

    If using hybrid mode, check that the underlying 1x network is working the way it should

    1xRTT network idle handoff problems are almost guaranteed to cause significant throughput degradation and stalls in 1xEV-DO performance tests

    Check for interference on PCMCIA cards

    If responding to a benchmark change, check that the instrumentation (servers and routes) is the same as when the benchmark was conducted

    Check that 2.4 Mbps of airlink traffic is not being carried over 2 Mbps of backhaul bandwidth!

    Historical Look at Causes of Degraded Performance

    The vast majority of performance issues have been due to (in decreasing order):

    RF

    RF issues in 1xRTT networks spill over into 1xEV-DO networks using hybrid access terminals

    Poor coverage

    Routing misconfigurations in the core network

    External internet issues and bandwidth

    Backhaul packet loss

    Including router configuration, aggregation, excessive latency, lost packets, backhaul bandwidth

    Configuration/misconfiguration issues in the RAN

    Incorrect configuration scripts

    Laptop/AT issues

    AT bugs, mostly fixed in newer AT software versions

    Laptop settings


    Check for Acquisition of 1xRTT System

    1xEV-DO hybrid mode operation requires the AT to:

    Acquire an IS-2000 system first:

    System selection is based on the PRL and the specified mode preference

    Failure to acquire an IS-95/IS-2000 system after repeated attempts will cause the terminal to go into a deep-sleep state for power conservation, EXCEPT for DMSS 3.3 and higher in which case ->


    Hybrid Mode Operation: Traffic State

    The AT will tune to the 1xRTT paging channel every 2.56 seconds.

    If the tune-away stalls, the IS-856 air interface will return to the initialization state if the control channel supervision timer expires (on the AT), or the connection drops at the RN/RNC due to RTCLost (RL fade timer)


    Hybrid Mode Operation: Traffic State (3)

    Key notes on 1xEV-DO connection close settings:

    RL fade timer, FTC time out, and AT supervision lost timers are all set at 5 seconds.

    Any combination of activities that causes the AT to spend more than 5 seconds on 1x during a tune-away will result in the 1xEV-DO connection closing.

    If the AT resets first, then it will become confused, and reset.

    See the example of a failed 1x idle handoff followed by a connection close.
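    The 5-second rule can be applied directly to tune-away intervals read off a CAIT log. The sketch below is a minimal illustration: the interval format and the function name are assumptions, and only the 5-second timer value is taken from the notes above.

```python
CONNECTION_CLOSE_TIMER_S = 5.0   # RL fade / FTC time out / AT supervision, per the notes above


def tune_aways_that_close_connection(tune_aways, timer_s=CONNECTION_CLOSE_TIMER_S):
    """Return the tune-away intervals long enough to close the 1xEV-DO connection.

    tune_aways: list of (start_s, end_s) times spent on 1x, in seconds.
    """
    return [(start, end) for start, end in tune_aways if end - start > timer_s]


if __name__ == "__main__":
    intervals = [(0.00, 0.12), (2.56, 2.70), (5.12, 11.40)]
    print(tune_aways_that_close_connection(intervals))
```

    In the example, the third tune-away lasts 6.28 seconds (for instance because of a failed 1x idle handoff), so it is the one expected to end in a connection close.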


    Hybrid Operation: Packet Data Origination

    Attempted on the currently acquired IS-856 system first, if available

    Attempted on the IS-95/IS-2000 system if origination on the currently acquired IS-856 system is not successful

    1xEV-DO Traffic State Hybrid CAIT Log File Examples

    See Module on Hybrid Mode operation


    So you still think there is a performance problem in the RAN?

    A proven methodology for isolating the problem

    Before You Start Out on a Wild Goose Chase


    Verify that hybrid mode operation is not the problem.

    Verify that the laptop is configured properly.

    Verify that the network you benchmarked is the same network you now have.

    Verify that someone has not gone in and changed some key system parameters.

    Now what? You are still not happy with the performance!

    Debugging Methodology


    [Flowchart: debugging decision tree, written out as steps]

    1) Run the TAP test. Is the result normal?

    NO: Are there backhaul packet drops? If YES, debug the backhaul. If NO, the airlink channel quality is poor.

    YES: Collect RNC, RN, CAIT, PPP and Windump stats, then go to 2).

    2) Are there too many PPP errors or too many Windump holes?

    NO: The network is clean end-to-end. Check the TCP window setting, ping delay, or any rate-limiters along the data path.

    YES: Go to 3).

    3) Are there pre-RLP packet drops?

    NO: Data is lost before reaching the RNC. Debug the PDSN-RNC-Server backhaul.

    YES: Check if:

    SACK is enabled;

    There are many other active connections on the sector;

    Window scaling is enabled;

    The laptop's PCMCIA bus is clean;

    The AT is handing off too much;

    The AT is in hybrid mode with high switchover or ping delays.
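    The decision tree on this slide can also be read as a short function. The sketch below is one interpretation of the flowchart, matching the step-by-step slides that follow; the function and argument names are made up for the example.

```python
def next_debug_action(tap_normal, backhaul_drops=False,
                      too_many_ppp_errors_or_holes=False,
                      pre_rlp_drops=False):
    """Map the flowchart's yes/no answers to the next debugging action."""
    if not tap_normal:
        # Path between RNC (below RLP) and AT is suspect.
        if backhaul_drops:
            return "debug backhaul"
        return "poor airlink channel quality"
    if not too_many_ppp_errors_or_holes:
        return ("network clean end-to-end: check TCP window setting, "
                "ping delay, rate limiters")
    if not pre_rlp_drops:
        return "data lost before reaching RNC: debug PDSN-RNC-server backhaul"
    return ("check SACK, other active connections, window scaling, "
            "PCMCIA bus, handoff rate, hybrid mode delays")


if __name__ == "__main__":
    print(next_debug_action(tap_normal=True,
                            too_many_ppp_errors_or_holes=True,
                            pre_rlp_drops=False))
```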

    Network Topology


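    [Figure: network topology. Elements shown: Traffic Server, IP Network, PDSN, RNC, T1 Router, IP Backhaul Network and DOM, with the Radio Access Network marked. Protocol stack labels along the path: App-Layer Pkt/IP; App-Layer Pkt/IP/PPP/IP; IP/PPP/RLP; App-Layer Pkt/IP/PPP/RLP/PHY.]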

    Step 1: Verify Physical Layer Using TAP


    1) Start with FTAP and RTAP tests and observe the reverse error rate using power-rate-control stats on the RN or the RNC (show power-control stats), as well as the forward error rate on the CAIT 1xEV-DO forward link stats window

    enable

    tap

    user

    at-tx-mode ftap-loobback

    start ftap 60

    ... (60 seconds later)

    show counters ftap

    (this shows the forward link throughput and the forward-link error rate)

    (and then for RTAP)

    at-tx-mode rtap

    start rtap 60

    ... (60 seconds later)

    show counters rtap

    (this shows the reverse link throughput and the reverse-link error rate)

    Note that before running the RTAP test, you should fix the reverse data rate of the AT to the maximum data rate that you deem achievable.

    Step 1 (Cont'd)

    If FTAP and RTAP tests are not normal, watch for the log messages of fwd-traffic-channel-manager (which should be enabled on the FLM from the SC at severity 32).

    If those log messages show packet drops, the IP backhaul between the RNC and the RN is losing packets. Debug the T1 backhaul and the routes on the RN, RNC and intermediate routers, and check that the RNC is connected to an Ethernet switch and configured to use full-duplex mode.

    If those messages do not show packet drops, the airlink channel quality is poor (note that packet drop messages at a rate of one per two minutes, or when a connection/soft-handoff leg is being added, are considered normal).

    If FTAP and RTAP tests are normal, you can conclude that the path between the RNC (below the RLP layer) and the AT is clean, and you can move to the next steps.

    Steps 2 and 3: Looking for Packet Loss That Limits TCP Performance

    2) Perform an FTP download and collect the following:

    HDRFastPath statistics on the RNSM card where the RN is homed, at 10 sec intervals: show hfpstats followed by show hfp-connstats

    Number of PPP errors shown on the laptop connection dialer box

    CAIT 1xEV-DO FL stats, RL stats and RLP stats

    Windumps on the laptop

    Snoop dumps on the traffic server (if it is possible to access the server)

    Call-Control-Agent logs on the RN at severity 20

    3) Perform a ping test between the AT and the traffic server and note the average round-trip delay as well as pronounced anomalies in delay variation and ping timeouts
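    For step 3, the ping results can be summarized from the text that the Windows ping command prints on the laptop. The sketch below assumes the usual "Reply from ... time=187ms" and "Request timed out." output lines; the function name and the sample addresses are illustrative.

```python
import re

# Matches the per-reply round-trip time, e.g. "time=187ms" or "time<1ms".
_REPLY_TIME = re.compile(r"time[=<](\d+)\s*ms", re.IGNORECASE)


def summarize_ping(output: str):
    """Summarize Windows ping output: replies, timeouts, average and max RTT."""
    rtts = [int(m.group(1)) for m in _REPLY_TIME.finditer(output)]
    timeouts = output.lower().count("request timed out")
    return {
        "replies": len(rtts),
        "timeouts": timeouts,
        "avg_ms": sum(rtts) / len(rtts) if rtts else None,
        "max_ms": max(rtts) if rtts else None,
    }


if __name__ == "__main__":
    sample = (
        "Reply from 10.1.1.1: bytes=32 time=180ms TTL=250\n"
        "Reply from 10.1.1.1: bytes=32 time=205ms TTL=250\n"
        "Request timed out.\n"
        "Reply from 10.1.1.1: bytes=32 time=950ms TTL=250\n"
    )
    print(summarize_ping(sample))
```

    A maximum far above the average, or any timeouts, is the kind of delay anomaly worth correlating with hybrid mode switchovers in the CAIT log.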


    What Do Pre-RLP Dropped Octets Mean?


    Octets are dropped at the pre-RLP layer when the airlink does not allow more data to be transmitted in the forward direction while more data arrives from the PDSN at the RNC for transmission over the airlink
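    A toy model can make the mechanism concrete. The sketch below is not a description of the RNC implementation: it simply assumes a fixed-size buffer in front of RLP, a fixed number of octets arriving from the PDSN per tick, and a fixed number of octets that the airlink can drain per tick. All names and numbers are hypothetical.

```python
def simulate_pre_rlp_drops(arrivals, drain_per_tick, buffer_limit):
    """Toy model of a bounded pre-RLP buffer.

    arrivals: octets arriving from the PDSN in each tick.
    drain_per_tick: octets the airlink can carry per tick.
    buffer_limit: assumed buffer size in octets.
    Returns (total_dropped_octets, octets_still_queued).
    """
    queued = 0
    dropped = 0
    for arriving in arrivals:
        space = buffer_limit - queued
        accepted = min(arriving, space)
        dropped += arriving - accepted      # overflow is dropped before RLP
        queued += accepted
        queued -= min(queued, drain_per_tick)
    return dropped, queued


if __name__ == "__main__":
    # Server sends faster than the airlink drains: drops appear once the buffer fills.
    print(simulate_pre_rlp_drops([1000] * 5, drain_per_tick=600, buffer_limit=1500))
    # Arrival rate matches the airlink: no drops.
    print(simulate_pre_rlp_drops([1000] * 5, drain_per_tick=1000, buffer_limit=1500))
```

    The same picture explains why anything that pauses or slows the forward link (other users, soft handoff, hybrid mode tune-away) while the sender keeps transmitting tends to produce pre-RLP drops.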

    Pre-RLP Drops (Cont'd)

    Pre-RLP drops can normally occur in the following cases:

    If an uncontrolled UDP application is sending packets to the AT

    If a TCP-based application has window scaling enabled

    If there are other connections on the sector (even if they are in soft handoff and are not passing any data)

    If the AT is in the process of performing a soft handoff

    If the AT is operating in hybrid mode (in this case, pre-RLP drops tend to happen when the round-trip delay between the AT and the traffic server is high, such as in the case of mobile IP)

    Pre-RLP dropped octets cause a gap in the TCP stream (not recovered by RLP) and thus cause the TCP server to retransmit the data

    If Pre-RLP Drop Rate is High


    Check whether:

    SACK is enabled

    There are many other active connections on the sector

    Window scaling is enabled

    The laptop's PCMCIA bus is clean

    The AT is handing off too much

    The AT is in hybrid mode with high switchover or ping delays

    What Do RLP Aborts Observed With CAIT Mean?

    Since RLP does not retransmit indefinitely, there is a non-zero chance that RLP will not be able to recover lost data. In that case, the RLP abort count on the RLP CAIT stats will go up. At a 1% error rate in both the forward and the reverse directions, this is expected to happen once per 3 Mbytes of FTP downloaded data.

    Each increase in the RLP abort count on CAIT should correspond to an increase (by 1 or 2) in the PPP errors shown in the connection dialer box.

    This will also correspond to a gap in the TCP trace obtained from the Windump data.
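    The "once per 3 Mbytes" rule of thumb gives a quick sanity check on an observed abort count. The sketch below only applies that figure, which holds at roughly 1% error rate in both directions; the function names and the factor of 3 used to call a count suspicious are assumptions for illustration.

```python
MBYTES_PER_ABORT = 3.0   # rule of thumb at ~1% forward and reverse error rate


def expected_rlp_aborts(download_mbytes, mbytes_per_abort=MBYTES_PER_ABORT):
    """Expected number of RLP aborts for an FTP download of the given size."""
    return download_mbytes / mbytes_per_abort


def abort_count_is_suspicious(observed_aborts, download_mbytes, factor=3.0):
    """True if observed aborts exceed the expectation by a hypothetical factor."""
    return observed_aborts > factor * expected_rlp_aborts(download_mbytes)


if __name__ == "__main__":
    print(expected_rlp_aborts(30))            # 30 Mbyte download
    print(abort_count_is_suspicious(45, 30))
```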


    Logging Subsystem


    Logging levels


    Levels 1-32

    Default level of 3 for all components

    6-32 are debug levels with increasing granularity

    Logging can be enabled per component

    Implementing Logging On EMS


    Node->Node Menu->Show Cards->Card (Example BIOSC card)->Card Menu (Example BIOSC Menu)->Config

    LogFacilityMgr->Click on LogFacility

    Implementing Logging CLI


    Nortel-07>en

    Nortel-07#config

    Enter configuration commands, one per line. End with CTRL-Z.

    Nortel-07(config)#logging trap severity 12 pcf-sig

    Nortel-07(config)#logging buffer trap severity 12 pcf-sig

    Nortel-07(config)#logging monitor trap severity 12 pcf-sig

    Nortel-07(config)#logging start pcf-sig

    pcf-sig logging enabled

    Nortel-07(config)#

    Logging Status - CLI


    DOM-03>show logging status

    Component Logging Status

    ID    Component               State  Console  Monitor  Buffer
    ----  ----------------------  -----  -------  -------  ------
    0003  redundancy              Off    Yes      No       Yes
    0019  net-task                Off    Yes      No       Yes
    0020  route-manager           Off    Yes      No       Yes
    0021  radio-proxy             Off    Yes      No       Yes
    0022  radio-manager           Off    Yes      No       Yes
    0029  bts-controller          Off    Yes      No       Yes
    0030  call-control-agent      Off    Yes      No       Yes
    0032  abis                    Off    Yes      No       Yes
    0033  tdb                     Off    Yes      No       Yes
    0034  topology-manager        Off    Yes      No       Yes
    0040  arp                     Off    Yes      No       Yes
    0041  ip                      Off    Yes      No       Yes
    0042  icmp                    Off    Yes      No       Yes
    0043  udp                     Off    Yes      No       Yes
    0044  tcp                     Off    Yes      No       Yes
    0045  rip                     Off    Yes      No       Yes
    0046  radio-resource-control  Off    Yes      No       Yes
    0049  modem-card-perfmon      Off    Yes      No       Yes

    Finding Log Files - EMS


    Finding Log Files CLI


    Nortel-07# shell

    Nortel-07(shell)(disk0:/)# cd logs

    Nortel-07(shell)(disk0:logs)# ls

    size date time name

    ---------- ------ ----- ---------------

    32768 Apr 25 05:14 ./

    32768 Apr 25 13:10 ../

    2398 Apr 10 18:31 swdnllog.txt

    2032 Apr 10 18:32 rnc041003183140.bin

    8108 Apr 23 17:19 rnc041003184048.bin

    1964 Apr 23 17:29 rnc042303212226.bin

    682 Apr 23 17:40 rnc042303213738.bin

    5768 Apr 23 18:21 rnc042303214127.bin

    28 Apr 23 18:21 rnc042303182159.bin

    2254 Apr 23 18:39 rnc042303222329.bin

    278 Apr 23 21:29 rnc042303212922.bin

    Free MBytes 38089

    Viewing Log File EMS


    Viewing Log File CLI


    DOM-03>show logging file all

    01-01-70 00:00:01.210 S=03 C=010401 F=0074 ID=0006 %%SntpClient: Tx Error

    01-01-70 00:00:01.140 S=03 C=010402 F=0074 ID=0006 %%SntpClient: Tx Error

    08-30-03 00:38:36.420 S=03 C=010401 F=0020 ID=0010 %%RTM: Received delete table message from CPU 10301

    08-31-03 07:59:17.620 S=03 C=010401 F=0020 ID=0010 %%RTM: Received delete table message from CPU 10301

    09-01-03 15:15:05.420 S=03 C=010401 F=0020 ID=0010 %%RTM: Received delete table message from CPU 10301

    09-02-03 22:27:36.820 S=03 C=010401 F=0020 ID=0010 %%RTM: Received delete table message from CPU 10301

    09-04-03 05:48:08.020 S=03 C=010401 F=0020 ID=0010 %%RTM: Received delete table message from CPU 10301

    09-05-03 13:27:10.420 S=03 C=010401 F=0020 ID=0010 %%RTM: Received delete table message from CPU 10301

    09-06-03 21:53:37.820 S=03 C=010401 F=0020 ID=0010 %%RTM: Received delete table message from CPU 10301

    09-08-03 06:24:45.620 S=03 C=010401 F=0020 ID=0010 %%RTM: Received delete table message from CPU 10301

    09-09-03 09:22:19.070 S=05 C=010301 F=0070 ID=0002 %%extracting images from archive disk0:/images/rn8000.2.0.3.4.tar

    09-09-03 09:22:19.070 S=05 C=010301 F=0070 ID=0034 %%extracting from disk0:/images/rn8000.2.0.3.4.tar

    09-09-03 09:22:19.080 S=05 C=010301 F=0070 ID=0034 %%extracting file version.txt, size 952 bytes, 2 blocks

    09-09-03 09:22:19.150 S=05 C=010301 F=0070 ID=0034 %%extracting file rn8000sc.gz, size 2404720 bytes, 4697 blocks

    09-09-03 09:22:28.210 S=05 C=010301 F=0070 ID=0034 %%extracting file rn8000fl.gz, size 1480465 bytes, 2892 blocks

    09-09-03 09:22:33.830 S=05 C=010301 F=0070 ID=0034 %%extracting file rn8000rl.gz, size 1727838 bytes, 3375 blocks
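    When log files are pulled off the node for offline analysis, each entry can be split into its fields with a short script. The sketch below is based only on the line layout visible above. The field meanings are assumptions: S is read as the severity level, F as the component ID listed by show logging status (for example 0032 for abis), C as a card or CPU identifier, and ID as a message identifier.

```python
import re
from datetime import datetime

# Layout inferred from the sample output: date, time, S=, C=, F=, ID=, text.
LOG_LINE = re.compile(
    r"^(?P<stamp>\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"S=(?P<s>\d+) C=(?P<c>\d+) F=(?P<f>\d+) ID=(?P<id>\d+) (?P<msg>.*)$"
)


def parse_log_line(line):
    """Parse one log entry into a dict, or return None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("stamp"), "%m-%d-%y %H:%M:%S.%f"),
        "severity": int(m.group("s")),      # assumed meaning of S=
        "card": m.group("c"),               # assumed meaning of C=
        "component": int(m.group("f")),     # assumed meaning of F=
        "message_id": int(m.group("id")),
        "text": m.group("msg").lstrip("%"),
    }


if __name__ == "__main__":
    entry = parse_log_line(
        "09-09-03 09:22:19.070 S=05 C=010301 F=0070 ID=0002 "
        "%%extracting images from archive"
    )
    print(entry)
```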

    Searching the Log Files


    Nortel-07>show logging file for abis

    04-25-03 14:07:34.800 S=03 C=010301 F=0032 ID=0016 ABIS:: SendHello: peer 10.12.0.248 timed out

    04-25-03 14:07:34.800 S=03 C=010301 F=0032 ID=0011 ABIS:: Close connection to 10.12.0.248, fd 53

    Nortel-07>show logging file match SendHello

    04-25-03 14:07:34.800 S=03 C=010301 F=0032 ID=0016 ABIS:: SendHello: peer 10.12.0.248 timed out

    Nortel-07>show logging file time from 04-25-03:14:05:00 to 04-25-03:14:08:00

    04-25-03 14:07:34.800 S=03 C=010301 F=0032 ID=0016 ABIS:: SendHello: peer 10.12.0.248 timed out

    04-25-03 14:07:34.800 S=03 C=010301 F=0032 ID=0011 ABIS:: Close connection to 10.12.0.248, fd 53
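    The same kind of search can be repeated offline on a saved copy of the log text. The sketch below is not the CLI's implementation: it is a small Python filter, with a made-up function name, that keeps entries containing a substring and falling inside a time window, relying on the timestamp at the start of each entry.

```python
from datetime import datetime


def filter_log(lines, match=None, time_from=None, time_to=None):
    """Keep log entries that contain `match` and fall within the time window."""
    kept = []
    for raw in lines:
        line = raw.strip()
        try:
            # The first 21 characters hold "MM-DD-YY HH:MM:SS.mmm".
            stamp = datetime.strptime(line[:21], "%m-%d-%y %H:%M:%S.%f")
        except ValueError:
            continue   # command echo, blank line or wrapped continuation
        if match is not None and match not in line:
            continue
        if time_from is not None and stamp < time_from:
            continue
        if time_to is not None and stamp > time_to:
            continue
        kept.append(line)
    return kept


if __name__ == "__main__":
    log = [
        "04-25-03 14:07:34.800 S=03 C=010301 F=0032 ID=0016 ABIS:: SendHello: peer 10.12.0.248 timed out",
        "04-25-03 14:07:34.800 S=03 C=010301 F=0032 ID=0011 ABIS:: Close connection to 10.12.0.248, fd 53",
    ]
    print(filter_log(log, match="SendHello"))
    print(filter_log(log,
                     time_from=datetime(2003, 4, 25, 14, 5, 0),
                     time_to=datetime(2003, 4, 25, 14, 8, 0)))
```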


    1xEV-DO Call Setup Logs


    11-11-03 14:28:41.054 S=09 C=010301 F=0009 ID=0282 [0xe80428 (uati) MOB CSMDormant] : Rx from AT : Route Update Message : Seq# 249

    11-11-03 14:28:41.054 S=09 C=010301 F=0009 ID=0215 [0xe80428 (uati) CSM CSMDormant] :Rx from AT : Connection Request Message : AT-initiated : ID 74

    11-11-03 14:28:41.054 S=16 C=010301 F=0009 ID=0254 [0xe80428 (uati) CSM CSMDormant] : Exiting State

    11-11-03 14:28:41.054 S=16 C=010301 F=0009 ID=0253 [0xe80428 (uati) CSM CSMAwaitRouteUpdate] : Entering State

    11-11-03 14:28:41.054 S=16 C=010301 F=0009 ID=0324 [0xe80428 (uati) SHO SHOSM_Idle] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : Exiting State

    11-11-03 14:28:41.054 S=16 C=010301 F=0009 ID=0323 [0xe80428 (uati) SHO SHOSM_Adding] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : Entering State

    11-11-03 14:28:41.054 S=09 C=010301 F=0009 ID=0326 [0xe80428 (uati) SHO SHOSM_Adding] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : Tx to RN : Add Traffic Channel Request Message : Number of pilots 1

    11-11-03 14:28:41.054 S=16 C=010301 F=0009 ID=0267 [0xe80428 (uati) CSM CSMAwaitOpenSHOL] : Add/Tweak for SHOSM : Responses outstanding 1

    11-11-03 14:28:41.064 S=09 C=010301 F=0009 ID=0329 [0xe80428 (uati) SHO SHOSM_Adding] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : Rx from RN : Add Traffic Channel Response Message : Success

    11-11-03 14:28:41.064 S=16 C=010301 F=0009 ID=0324 [0xe80428 (uati) SHO SHOSM_Adding] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : Exiting State

    11-11-03 14:28:41.064 S=16 C=010301 F=0009 ID=0323 [0xe80428 (uati) SHO SHOSM_Open] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : Entering State

    11-11-03 14:28:41.064 S=16 C=010301 F=0009 ID=0254 [0xe80428 (uati) CSM CSMAwaitOpenSHOL] : Exiting State

    11-11-03 14:28:41.064 S=16 C=010301 F=0009 ID=0253 [0xe80428 (uati) CSM CSMAwaitTCC] : Entering State

    11-11-03 14:28:41.064 S=09 C=010301 F=0009 ID=0269 [0xe80428 (uati) CSM CSMAwaitTCC] :Tx to AT : Traffic Channel Assignment Message : Sequence Number 0


    11-11-03 14:28:41.214 S=09 C=010301 F=0009 ID=0330 [0xe80428 (uati) SHO SHOSM_Open] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : Rx from RN : RTC Acquired Status Indication Message : DRC Mask:0x2

    11-11-03 14:28:41.234 S=09 C=010301 F=0009 ID=0329 [0xe80428 (uati) SHO SHOSM_Open] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : Rx from RN : FTC Desired Indication Message : Success

    11-11-03 14:28:41.234 S=16 C=010301 F=0009 ID=0315 [0xe80428 (uati) SDU CSMAwaitTCC] : Setting Active soft handoff leg [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x300b4] : DRC Mask 0x2

    11-11-03 14:28:41.234 S=09 C=010301 F=0009 ID=0192 [0xe80428 (uati) CSM CSMAwaitTCC] :Tx to AT : RTC Ack Message

    11-11-03 14:28:41.234 S=16 C=010301 F=0009 ID=0397 [0xe80428 (uati) SCSM SCSM_Open] :Received Connection Opened Indication event

    11-11-03 14:28:41.234 S=16 C=010301 F=0009 ID=0396 [0xe80428 (uati) SCSM] : Transitioning from SCSM SCSM_Open to SCSM SCSM_Ready

    11-11-03 14:28:41.234 S=16 C=010301 F=0009 ID=0081 [0xe80428 (uati) Session] : Transitioned from SSM NoA10Conn state to SSMWaitForPCFReg state

    11-11-03 14:28:41.234 S=05 C=010301 F=0009 ID=0179 [0xe80428 (uati) SSM SSMWaitForPCFReg] :A10 Registered for Session. PSI:0xc0000055

    11-11-03 14:28:41.234 S=16 C=010301 F=0009 ID=0081 [0xe80428 (uati) Session] : Transitioned from SSMWaitForPCFReg state to SSMOpen state

    11-11-03 14:28:41.344 S=16 C=010301 F=0009 ID=0104 [0xe80428 (uati) FlowControl 2 Close] : Rx : uplink RLP packet event

    11-11-03 14:28:41.344 S=16 C=010301 F=0009 ID=0099 [0xe80428 (uati) FlowControl 2] : Transitioned from Close state to Open state

    11-11-03 14:28:41.344 S=09 C=010301 F=0009 ID=0216 [0xe80428 (uati) CSM CSMAwaitTCC] :Rx from AT : Traffic Channel Complete Message : Seq# 0

    11-11-03 14:28:41.344 S=05 C=010301 F=0009 ID=0212 [0xe80428 (uati) CSM CSMAwaitTCC] :Connection opened

    11-11-03 14:28:41.344 S=16 C=010301 F=0009 ID=0254 [0xe80428 (uati) CSM CSMAwaitTCC] : Exiting State

    11-11-03 14:28:41.344 S=16 C=010301 F=0009 ID=0253 [0xe80428 (uati) CSM CSMOpen] : Entering State

    11-11-03 14:28:41.344 S=09 C=010301 F=0009 ID=0077 [0xe80428 (uati) FlowControl 2 Open] :Rx: XON Request message

    11-11-03 14:28:41.344 S=16 C=010301 F=0009 ID=0104 [0xe80428 (uati) FlowControl 2 Open] : Rx : Xon Request event

    11-11-03 14:28:41.344 S=09 C=010301 F=0009 ID=0107 [0xe80428 (uati) FlowControl 2 Open] :Tx to AT : Xon Response message


    11-11-03 15:12:13.201 S=09 C=010701 F=0014 ID=0200 [0xe80428 PCF 0xc000007a] Tx A11 registration request message to PDSN 99.99.99.99, PCF 10.12.0.1, socket 71, bytes 354

    11-11-03 15:12:13.201 S=16 C=010701 F=0014 ID=0178 [0xe80428 PCF 0xc000007a] Changed state from Idle to WaitRegReply

    11-11-03 15:12:13.201 S=09 C=010701 F=0014 ID=0042 [PCF] Received A11 registration reply message from PDSN 99.99.99.99

    11-11-03 15:12:13.201 S=16 C=010701 F=0014 ID=0178 [0xe80428 PCF 0xc000007a] Changed state from WaitRegReply to Registered

    RNC-07# show 1xEV-DO session all 1001 40

    UATI List

         Inst   UATI24  RATI      PSI       HW Id     IMSI             STATE      ConnState
         (Dec)  (Hex)   (Hex)     (Hex)     (Hex)     (BCD)
    ----------------------------------------------------------------------------------------
    1    1      E80428  F65922EA  C000007a  600E0299  310012135135897  Open       Active
    2    72     E80430  06D268CA  N/A       6B200484  310012153093252  NoA10Conn  Dormant

    Total Displayed Number of Current Active Sessions: 1

    Total Displayed Number of Current Dormant Sessions: 1

    Total Displayed Number of Current Sessions Awaiting Close from AT: 0

    1xEV-DO Connection Establishment and Termination Call Flow

    1xEV-DO Call Termination Log


    11-11-03 15:26:05.669 S=05 C=010301 F=0009 ID=0188 [0xe803e9 (uati) SSM SSMOpen] : A10 De-registered for Session.PSI:0xc0000079 - PCF Initiated

    11-11-03 15:26:05.669 S=16 C=010301 F=0009 ID=0220 [0xe803e9 (uati) CSM CSMOpen] :Rx : Local Connection Close Event :Reason 1

    11-11-03 15:26:05.669 S=09 C=010301 F=0009 ID=0266 [0xe803e9 (uati) CSM CSMOpen] :Tx to AT : Connection Close Message :Reason Code 0

    11-11-03 15:26:05.669 S=16 C=010301 F=0009 ID=0081 [0xe803e9 (uati) Session] : Transitioned from SSMOpen state to SSM

    NoA10Conn state11-11-03 15:26:05.749 S=09 C=010301 F=0009 ID=0221 [0xe803e9 (uati) CSM CSMAwaitATClose] :Rx from AT : Connection Close

    Message : Reason 1

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0206 [0xe803e9 (uati) CSM CSMAwaitATClose] : Set Suspend Deadline(0xaf:0x3b80f3f0)

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0397 [0xe803e9 (uati) SCSM SCSM_Ready] :Received Connection Closed Indication event

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0396 [0xe803e9 (uati) SCSM] : Transitioning from SCSM SCSM_Ready to SCSM SCSM_Open

    11-11-03 15:26:05.749 S=05 C=010301 F=0009 ID=0213 [0xe803e9 (uati) CSM CSMAwaitATClose] :Connection closed

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0254 [0xe803e9 (uati) CSM CSMAwaitATClose] : Exiting State

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0253 [0xe803e9 (uati) CSM CSMAwaitCloseSHOL] : Entering State

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0324 [0xe803e9 (uati) SHO SHOSM_Open] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x30125] : Exiting State

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0323 [0xe803e9 (uati) SHO SHOSM_Deleting] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x30125] : Entering State

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0313 [0xe803e9 (uati) SDU CSMAwaitCloseSHOL] : Currently active soft handoff leg [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x30125] is being deleted

    11-11-03 15:26:05.749 S=09 C=010301 F=0009 ID=0326 [0xe803e9 (uati) SHO SHOSM_Deleting] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x30125] : Tx to RN : Remove Traffic Channel Request Message : Number of pilots 2

    11-11-03 15:26:05.749 S=16 C=010301 F=0009 ID=0268 [0xe803e9 (uati) CSM CSMAwaitCloseSHOL] : Subtract/Shutdown for SHOSM : Responses outstanding 1

    11-11-03 15:26:05.674 S=09 C=010701 F=0014 ID=0043 [PCF] Received A11 registration update message from PDSN 99.99.99.99

    11-11-03 15:26:05.674 S=09 C=010701 F=0014 ID=0208 [0xe803e9 PCF 0xc0000079] Tx A11 registration ack message to PDSN 99.99.99.99, PCF 10.12.0.1, socket 71, bytes 65

    11-11-03 15:26:05.674 S=09 C=010701 F=0014 ID=0200 [0xe803e9 PCF 0xc0000079] Tx A11 registration request message to PDSN 99.99.99.99, PCF 10.12.0.1, socket 71, bytes 121

    11-11-03 15:26:05.674 S=16 C=010701 F=0014 ID=0178 [0xe803e9 PCF 0xc0000079] Changed state from Registered to WaitRegReply

    11-11-03 15:26:05.674 S=09 C=010701 F=0014 ID=0042 [PCF] Received A11 registration reply message from PDSN 99.99.99.99

    11-11-03 15:26:05.674 S=16 C=010701 F=0014 ID=0178 [0xe803e9 PCF 0xc0000079] Changed state from WaitRegReply to Idle

    11-11-03 15:26:05.759 S=09 C=010301 F=0009 ID=0329 [0xe803e9 (uati) SHO SHOSM_Deleting] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x30125] : Rx from RN : Remove Traffic Channel Response Message : Success

    11-11-03 15:26:05.759 S=16 C=010301 F=0009 ID=0323 [0xe803e9 (uati) SHO SHOSM_Closed] : [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x30125] : Entering State

    11-11-03 15:26:05.759 S=16 C=010301 F=0009 ID=0208 [0xe803e9 (uati) CSM CSMAwaitCloseSHOL] : Soft Handoff leg object deleted [IP:10.12.0.242, CR:0xb9d, RNC Connection ID:0x30125] : Total 0
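The timestamps in the termination log can be used to measure how long each request/response pair took, for example from the Connection Close sent to the AT until the AT's own Connection Close arrives. The following is a rough Python sketch under stated assumptions: only the time-of-day field is parsed, so both entries are assumed to fall on the same day, and the marker strings and the function names `entry_time` and `elapsed_ms` are chosen here for illustration.

```python
from datetime import datetime, timedelta


def entry_time(line):
    """Parse the time-of-day field (second token) of a log line."""
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


def elapsed_ms(lines, start_marker, end_marker):
    """Milliseconds between the first line containing each marker.

    Raises StopIteration if either marker is absent from the capture.
    """
    start = next(entry_time(line) for line in lines if start_marker in line)
    end = next(entry_time(line) for line in lines if end_marker in line)
    return (end - start) // timedelta(milliseconds=1)


SAMPLE = [
    "11-11-03 15:26:05.669 S=09 C=010301 F=0009 ID=0266 [0xe803e9 (uati) CSM CSMOpen] "
    ":Tx to AT : Connection Close Message :Reason Code 0",
    "11-11-03 15:26:05.749 S=09 C=010301 F=0009 ID=0221 [0xe803e9 (uati) CSM CSMAwaitATClose] "
    ":Rx from AT : Connection Close Message : Reason 1",
    "11-11-03 15:26:05.749 S=09 C=010301 F=0009 ID=0326 [0xe803e9 (uati) SHO SHOSM_Deleting] "
    ": Tx to RN : Remove Traffic Channel Request Message : Number of pilots 2",
    "11-11-03 15:26:05.759 S=09 C=010301 F=0009 ID=0329 [0xe803e9 (uati) SHO SHOSM_Deleting] "
    ": Rx from RN : Remove Traffic Channel Response Message : Success",
]

# Connection Close round trip with the AT
print(elapsed_ms(SAMPLE, "Tx to AT : Connection Close", "Rx from AT : Connection Close"))  # 80

# Remove Traffic Channel round trip with the RN
print(elapsed_ms(SAMPLE, "Remove Traffic Channel Request", "Remove Traffic Channel Response"))  # 10
```

Note that the capture interleaves entries from two `C=` values and the 15:26:05.674 entries are listed after the 15:26:05.749 ones, so matching on message text is safer than relying on line order.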

    RNC-07# show 1xEV-DO session all 1001 40

    UATI List

    Inst UATI24 RATI PSI HW Id IMSI STATE ConnState

    (Dec) (Hex) (Hex) (Hex) (Hex) (BCD)

    ----------------------------------------------------------------------------------------

    1 1 E803E9 F65922EA N/A 600E0299 310012135135897 NoA10Conn Dormant

    2 78 E80436 FEE7D5CE N/A 6B200484 310012153093252 NoA10Conn Dormant

    Total Displayed Number of Current Active Sessions: 0

    Total Displayed Number of Current Dormant Sessions: 2

    Total Displayed Number of Current Sessions Awaiting Close from AT: 0

    Fault Simulation Test Cases

    Failure in Abis Peer establishment due to IP Connectivity <Static Route>

    Abis Peer Link Status is Toggling

    DOM Node Status is DOWN

    DOM homing to RNC failed

    DOM cannot rehome to another RNSM card in RNC

    Unable to set up DO Forward Traffic-Channel

    Unable to set up PCF (A10) Link to PDSN

    PDSN Rejects PCF (A10) setup

    T1/E1 Physical Link is DOWN due to mismatched configuration

    T1/E1 Physical Link is DOWN due to remote Physical disconnection

    PPP interface is DOWN

    End of Module

    Thank You

    Accelerating Access Anywhere