Page 1: DO Trouble-Shooting and Case Analyse-1

Internal Use Only▲

DO Troubleshooting and Case Analysis

© 2007, ZTE Corporation. All rights reserved.

Page 2: DO Trouble-Shooting and Case Analyse-1

Contents

Common Types of EVDO Service Faults
Methods of EVDO Troubleshooting
Common Tools and Commands in EVDO Troubleshooting

Page 3: DO Trouble-Shooting and Case Analyse-1

Contents

Common Types of EVDO Service Faults
Methods of EVDO Troubleshooting
Common Tools and Commands in EVDO Troubleshooting

Page 4: DO Trouble-Shooting and Case Analyse-1

Common Types of EVDO Service Faults

The following are common types of EVDO service faults:
Access failure: the connection cannot be set up.
Data transmission performance problem: the upload or download rate fails to reach a reasonable level.
Call drop: the connection is interrupted. (omitted)

Page 5: DO Trouble-Shooting and Case Analyse-1

Contents

Common Types of EVDO Service Faults
Methods of EVDO Troubleshooting
  Collecting and Feeding Back Fault Information
  Access failure
  Data transmission performance fault
  Call drop
Common Tools and Commands in EVDO Troubleshooting

Page 6: DO Trouble-Shooting and Case Analyse-1

Collecting and Feeding Back Fault Information

Normally the following information should be collected and fed back to facilitate the troubleshooting process:

Fault phenomena (Customer/User complaint)
Alarms and notifications (OMC)
OMC operation logs (OMC)
Performance indexes (OMC/CNO2)
Affected scope (Customer/OMC/CNO2)
Time of occurrence (OMC/CNO2)
Fault cause statistics (CNO2)
Signaling trace (OMC/Test)
Service observation (OMC)
Running versions (OMC)
Networking topology of the packet domain (Customer)
Abis interface transmission information (Customer/OMC)

Page 7: DO Trouble-Shooting and Case Analyse-1

Contents

Common Types of EVDO Service Faults
Methods of EVDO Troubleshooting
  Collecting and Providing Fault Information
  Access failure
  Data transmission performance fault
  Call drop
Common Tools and Commands in EVDO Troubleshooting

Page 8: DO Trouble-Shooting and Case Analyse-1

Access failure

Access failure: a fault that occurs during the access phase and leads to a connection failure.

What are the possible causes of access failures?

Page 9: DO Trouble-Shooting and Case Analyse-1

Access failure

Common Causes of Access Failures
Link failure of the R-P interface (inter-connection parameters, route, physical links)
Link failure of the A12 interface (inter-connection parameters, route, physical links)
Incomplete configuration data on the BSC side (IPCF, UIM, resource)
Internal interference (PN, neighbor cell)
External radio interference
Configuration error of wireless parameters (cell radius, power, frequency)
Hardware fault of the BTS (RF, clock system)
Terminal problem (PRL, account)
Internal fault of the BSC (packet loss)
Wrong running version

Page 10: DO Trouble-Shooting and Case Analyse-1

Access failure

For an access failure, the following fault information should preferably also be reported:

Configuration of the R-P interface inter-connection parameters
Configuration of the A12 interface inter-connection parameters
Configuration of the wireless parameters of the BTS and cell
Printout of the terminal's PPP logs

Page 11: DO Trouble-Shooting and Case Analyse-1

Access failure

Troubleshooting flow (1)

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Start: an access fault is reported.

Actions:
Confirm the fault phenomena and the affected scope.
Check alarms and notifications.
Check the time at which the alarm occurred.
Turn to the faulty NE.
Roll back the change.
Confirm the motive for the change.
Solve the alarms.
Reply to the customer.
Analyse the failure causes.

Decisions:
Are there correlated alarms or notifications?
Is another correlated NE faulty?
Is the service recovered after the other NE recovered?
Was the system changed?
Did the change lead to the fault?
Is the fault recovered?
Are the users illegal ("lawless") users?
Is there a resource overload?
Is it a system fault?

Exits: connector A, or End.

Page 12: DO Trouble-Shooting and Case Analyse-1

Access failure

Troubleshooting flow (2)

[Flowchart: the hand-offs below follow the connector labels a to e.]

Entry: connector A. Trace the signaling of the complaining users or of a test user.

Are Um messages reported? If not, turn to the Um messages report flow (connector a).
Are A12 messages reported? If not, turn to the A12 messages report flow (connector b).
Is A12 authentication passed? If not, turn to the A12 authentication flow (connector c).
Is A11 registration passed? If not, turn to the A11 register flow (connector d).
Is the PPP connection set up successfully? If not, turn to the PPP connection flow (connector e).

Exit: End.
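The hand-offs in troubleshooting flow (2) can be sketched as a first-failure lookup. This is a minimal sketch, assuming the checks are evaluated in the order of the connector labels a to e; the dictionary keys and the function name `next_subflow` are hypothetical names chosen for illustration and do not come from any ZTE tool.

```python
from typing import Optional

# Each check observed in the signaling trace, paired with the sub-flow
# that a failed check hands off to. The ordering is an assumption
# inferred from the connector labels a-e on the slides.
STAGES = [
    ("um_messages_reported", "a: Um messages report flow"),
    ("a12_messages_reported", "b: A12 messages report flow"),
    ("a12_auth_passed", "c: A12 authentication flow"),
    ("a11_registered", "d: A11 register flow"),
    ("ppp_established", "e: PPP connection flow"),
]


def next_subflow(trace: dict) -> Optional[str]:
    """Return the sub-flow for the first check that failed, or None if all passed."""
    for key, subflow in STAGES:
        # A check that is missing from the trace is treated as failed.
        if not trace.get(key, False):
            return subflow
    return None


if __name__ == "__main__":
    observed = {
        "um_messages_reported": True,
        "a12_messages_reported": True,
        "a12_auth_passed": False,
    }
    print(next_subflow(observed))
```

Treating a missing observation as a failure keeps the lookup conservative: the engineer is sent to the earliest stage that has not been positively confirmed.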

Page 13: DO Trouble-Shooting and Case Analyse-1

Access failure

Um messages report flow

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Entry: connector a.

Decisions:
Are other users in the site/sector normal?
Is the faulty user normal at other sites?
Is it a global fault?
Is everything normal when checked on the OMC?
Are the faulty users normal in the other sectors of the same site?
Are the Um messages normal?

Actions:
Turn to the terminal manufacturer.
Check the BSC clock, media stream, running version, Abis boards and transmission status.
Confirm whether the 1X service is normal, and turn to the BSC fault handling flow.
Confirm what the affected sites have in common (Abis, geographical distribution, RSSI and radio parameters).
Check the radio parameters, radio power, running version, link between BDS and RFS, RSSI and neighbor list.
Clear the abnormal item.
Confirm whether the 1X service is normal, and turn to the BTS fault handling flow.
Test on site and collect the CNT log to confirm the radio environment.
Check the radio parameters, running version, RSSI and neighbor list; swap the RFS boards, feeders, antennas and the link between BDS and RFS.

Exit: Return.

Page 14: DO Trouble-Shooting and Case Analyse-1

Access failure

A12 messages report flow

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Entry: connector b.

Decisions:
Is an A12 request reported?
Is the faulty user normal at other sites/sectors?
Are other users in the site/sector normal?
Are other users in the BSC normal?
Are the configuration and the port status normal?
Is the fault recovered after the configuration is corrected?

Actions:
Check the A12 configuration (inter-connection parameters, IPCF, MDM) and the IPCF port status.
Clear the session of the user and try again.
Compare the signaling trace of the faulty user with that of normal users.
Turn to the BTS fault handling flow.
Turn to the BSC fault handling flow.

Exit: Return to the main flow.

Page 15: DO Trouble-Shooting and Case Analyse-1

Access failure

A12 authentication flow

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Entry: connector c.

Decisions:
Are there AbrCHAPAuthInd messages?
Is A12 authentication passed?
Has the AN-AAA received the A12AccessRequest messages?
After the error is corrected, has the AN-AAA received the A12AccessRequest?
Has the AN-AAA replied to the PCF?
Is there an AbrCHAPAuthInd?

Actions:
Check the route between the IPCF and the AN-AAA; check the inter-connection parameters of A12; ask the AN-AAA side to check its configuration.
Compare with the table of AbrCHAPAuthInd reason codes.
Check the inter-connection parameters of A12; check the route between the AN-AAA and the PCF.
Trace the A12 packets to determine whether the problem lies in the BSC, the AN-AAA or the transmission.
Turn to the AN-AAA side to handle it.

Exit: Return.

Page 16: DO Trouble-Shooting and Case Analyse-1

Access failure

A11 register flow

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Entry: connector d.

Decisions:
Has the A11RegRequest been sent?
Has the A11RegReply been received?
Is more than one A11RegReply message received?
Is there a FACN?
Has the A11RegRequest been received by the PDSN?
Has the FACN received the A11RegRequest?
Is A11 authentication passed?
Is the message correct? Is the route normal?
Can the A11RegRequest be demodulated successfully?
Has the PCF received the A11RegReply?
Can the A11RegResponse be demodulated successfully?
Is the fault recovered after the item is cleared?

Actions:
Check the configuration of the R-P interface; check whether the PDSN is normal.
Check the configuration of the R-P interface; check the route to the PDSN.
Check the configuration of the R-P interface; check the route to the PDSN; check whether the IP addresses in the messages are correct.
Ask the PDSN side to check the configuration of the R-P interface; check the route to the PCF.
Ask the FACN or PDSN side to check the configuration.
Check that the PDSN address in the A11RegReply is normal; check the route.
Check the SPI configuration.
According to the code description in the A11RegReply, ask the PDSN side to handle it.
Ask the FACN side to handle it.
Trace the packets on the R-P interface to determine whether the problem lies in the BSC, the PDSN or the transmission.

Exit: Return.

Page 17: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Compare with the following table of AbrCHAPAuthInd reason codes to find the possible reason.

Value: 0
Possible reason: Congratulations! Authentication succeeded.

Value: 601
Possible reason: Access reject. Check the Radius logs (\Zte.corp\ZXPDSS-A100\log\radius*.log) to find the Reply-Message of the Access-Reject message.
Solution:
1) User unavailable: the user is not allocated, or the domain is not configured in the AAA.
2) Password mismatched: the password in the AT differs from the one in the AAA.
3) Invalid IMSI: the running mode of the Radius service may be wrong.

Value: 602
Possible reason: Socket setup failure. Telnet to the RPU to check the port state of the IPCF.
Solution:
1) The A12 port in the IPCF is in the wrong state.
2) The A12 IP address is not configured in the RPU.
3) The A12 IP address configured in the RPU differs from the one in the AAA database.

Value: 603/505
Possible reason: AAAServer reply timer expired. Check the radius log file of the AAA.
Solution:
1) The physical connection between the AAAServer and the IPCF is broken: try to ping the A12 IP address of the IPCF from the AAAServer. Check the physical connection and the A12 IP configuration of the IPCF.
2) The Radius service of the AAAServer is not running: use the command "netstat -a -n" to check whether the IP address of the AAAServer is bound to ports 1812 and 1813, and check whether the Radius service is running.
3) Check the AAA log to confirm that the A12 message is received by the AAA. If the A12 request arrived at the AAA and the AAA sent the reply, but the AN did not receive it, check the configuration of the AAA. The A12 IP address may be wrong.
4) Check the A12 ExceptionProbe for "ERR_SPS_HRPD_A12_FunParaIncorrect_TransmitUserAuthRsp_ucRevDataLen" errors. If there are any, check the profile of the user in the AN-AAA and delete the unused attributes.

Value: 507
Possible reason: LCP reply timer expired. The BSC does not receive the reply to the LCP request.
Solution:
1) The AT is in a wrong state and cannot reply to the received message: try to reset the AT.
2) Poor radio signal: try to increase the power.
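For scripted log review, the reason-code table above can be kept as a small lookup. The following is a minimal sketch that only transcribes the table; the names `A12_AUTH_REASONS` and `describe_reason` are hypothetical, and mapping both 603 and 505 to the same row is an assumption based on the "603/505" entry.

```python
# AbrCHAPAuthInd reason codes: (meaning, first check), transcribed from the table.
# Assumption: "603/505" means that both values share the same row.
_TIMER = ("AAAServer reply timer expired", "Check the radius log file of the AAA")

A12_AUTH_REASONS = {
    0: ("Authentication succeeded", None),
    601: ("Access reject",
          "Check the Radius logs for the Reply-Message of the Access-Reject message"),
    602: ("Socket setup failure",
          "Telnet to the RPU and check the port state of the IPCF"),
    603: _TIMER,
    505: _TIMER,
    507: ("LCP reply timer expired",
          "Try resetting the AT; if the radio signal is poor, try increasing the power"),
}


def describe_reason(code: int) -> str:
    """Return a one-line description of a reason code from the table."""
    entry = A12_AUTH_REASONS.get(code)
    if entry is None:
        return f"{code}: not listed in the table"
    meaning, first_check = entry
    if first_check is None:
        return f"{code}: {meaning}"
    return f"{code}: {meaning}. First check: {first_check}"


if __name__ == "__main__":
    for value in (0, 601, 603, 999):
        print(describe_reason(value))
```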

Page 18: DO Trouble-Shooting and Case Analyse-1

Access failure

Reason codes of A11RegReply.

Page 19: DO Trouble-Shooting and Case Analyse-1

Access failure

PPP connection flow

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Entry: connector e.

Actions and decisions:
Trace the PPP log on the terminal side and the signaling of the PDSN, and ask the PDSN side to check whether messages are lost.
Are messages lost?
Use PPPsniffer to trace the PPP messages in the BSC and compare them with the logs of the PDSN and the terminal.
Are messages lost?

Outcomes:
The fault may be in the Um interface, Abis interface, media stream or a board: turn to the BSC fault handling flow.
Transmission or PDSN problem.
Turn to the PDSN and terminal manufacturers to check it.

Exit: Return.

Page 20: DO Trouble-Shooting and Case Analyse-1

Contents

Common Types of EVDO Service Faults
Methods of EVDO Troubleshooting
  Collecting and Feeding Back Fault Information
  Access failure
  Data transmission performance fault
  Call drop
Common Tools and Commands in EVDO Troubleshooting

Page 21: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Data transmission performance faults mainly refer to abnormal data transmission rates on the forward and reverse links in DO services, e.g. the rate is 0, very low or unstable.

Page 22: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Data transmission performance faults
Data transmission performance faults mainly refer to abnormal data transmission rates on the forward and reverse links in DO services, e.g. the rate is 0, very low or unstable.

Normal data transmission rates of the forward link in DO services:
The download rate of a single user at a near point should be about 340 KBps (DO Rev.A) or 250 KBps (DO Rls.0).

Normal data transmission rates of the reverse link in DO services:
The upload rate of a single user at a near point should be about 120 KBps (DO Rev.A) or 15 KBps (DO Rls.0).
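The reference rates above can be turned into a quick check of a measured rate. This is a minimal sketch under stated assumptions: "KBps" is read as kilobytes per second, the 50% threshold for "low" (`low_fraction`) is a hypothetical value that is not taken from the slides, and "unstable" is not covered because it needs several samples.

```python
# Near-point single-user reference rates from the slide, in kilobytes per second.
EXPECTED_KBYTES_PER_S = {
    ("Rev.A", "forward"): 340,
    ("Rls.0", "forward"): 250,
    ("Rev.A", "reverse"): 120,
    ("Rls.0", "reverse"): 15,
}


def to_kbit_per_s(kbytes_per_s: float) -> float:
    """Convert kilobytes per second to kilobits per second (8 bits per byte)."""
    return kbytes_per_s * 8


def classify_rate(measured: float, release: str, direction: str,
                  low_fraction: float = 0.5) -> str:
    """Classify a measured rate (KBps) against the reference rate.

    low_fraction is a hypothetical threshold: a rate below this fraction
    of the reference rate is reported as "low".
    """
    expected = EXPECTED_KBYTES_PER_S[(release, direction)]
    if measured <= 0:
        return "zero"
    if measured < low_fraction * expected:
        return "low"
    return "normal"


if __name__ == "__main__":
    print(to_kbit_per_s(340))                        # forward Rev.A reference in kbit/s
    print(classify_rate(100, "Rev.A", "forward"))
```

Converting to kilobits per second makes the figures comparable with rates reported by tools that count bits rather than bytes.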

What are the possible causes of data transmission performance faults?

Page 23: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Common Fault Causes (1)

Poor radio environment (e.g. interference, poor coverage, too many users)
Bad terminal performance (including computer, terminal, connection ports, TCP window size setting)
BTS software/hardware failures (including clock, RF performance, baseband-RFS link, media plane link, board and version)
Improper BTS radio configuration
Poor transmission quality of the Abis interface
BSC software/hardware failures (including clock, media plane link, version, board and resource overload)
Improper DO radio configuration at the BSC
Incorrect user group setting in the AAA
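The TCP window size appears among the terminal-side causes because a single TCP connection cannot deliver more than one receive window per round trip, so its throughput is bounded by window size divided by round-trip time. The sketch below illustrates that bound; the 0.2 s round-trip time and the 16384-byte window are hypothetical illustration values, not measurements from the slides, and 1 KB is taken as 1000 bytes.

```python
def max_throughput_kbytes_per_s(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on single-connection TCP throughput: one window per round trip."""
    return window_bytes / rtt_s / 1000.0


def required_window_bytes(target_kbytes_per_s: float, rtt_s: float) -> float:
    """Smallest receive window that does not cap the target rate."""
    return target_kbytes_per_s * 1000.0 * rtt_s


if __name__ == "__main__":
    rtt = 0.2        # hypothetical round-trip time in seconds
    window = 16384   # hypothetical receive window in bytes
    print(max_throughput_kbytes_per_s(window, rtt))
    print(required_window_bytes(340, rtt))
```

With these illustration values the bound is roughly 82 KBps, well below the 340 KBps forward reference rate for DO Rev.A, so a download test run with such a window would look like a network fault even on a healthy link.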

Page 24: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Common Fault Causes (2)

R-P link failure (including mode mismatch between network interfaces, incorrect route configuration, packet loss/delay on the transmission link and bandwidth limit)
PDSN failure (including packet loss, PPP link setup failure, incorrect office direction configuration, frame header compression problem and traffic limit)
P-I link failure (including mode mismatch between network interfaces, incorrect route configuration, packet loss/delay on the transmission link, improper firewall setting and bandwidth limit)
Poor FTP server performance and small sending buffer

Page 25: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Additional Information to Be Fed Back

The following information has to be fed back in case of a DO performance fault:

Packets captured at the terminal, at the two ends of the R-P interface, at the two ends of the P-I interface and at the FTP server (using Wireshark/Ethereal)
DO LOG data of the CHM
CDT data
Logs of the terminals used in the site test (using CNT or another test tool)
Performance configuration of the FTP server
Modes of all network interfaces (including R-P and P-I)

Page 26: DO Trouble-Shooting and Case Analyse-1

Access failure

Troubleshooting flow (1)

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Start: a transmission performance fault is reported in the DO data service.

Actions:
Collect information such as the phenomenon, occurrence time and affected range.
Check alarms and notifications.
Check whether any system change was made when the alarm was raised.
Forward the fault to the engineers of the related NE and follow up the progress.
Try to roll back the system change.
Make clear why the change was made, check whether it is reasonable, and implement it again.
Handle the alarm.
Go to the abnormal forward rate handling process (connectors f1 and f2).
Go to the abnormal reverse rate handling process (connector g).

Decisions:
Is there a related alarm?
Does another NE fail?
Is the fault solved after the related NE resumes work?
Is the system changed?
Is the fault caused by the change?
Is the alarm cleared?
Is the forward rate abnormal?
Is the reverse rate abnormal?
Is the fault a global one (check it with a network test)?

Exits: connectors f1, f2 and g, or End.

Page 27: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Troubleshooting flow (1)

Items to be checked first:
Fault phenomena
Occurrence time
Affected scope of the fault
Faults in other NEs
Recent system changes in the BSS or other NEs
Alarms and notifications
Whether the fault occurs on the forward rate or the reverse rate

Page 28: DO Trouble-Shooting and Case Analyse-1

Access failure

Global abnormal forward rate handling process (1)

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Entry: connector f1.

Decisions:
Are the BSC versions correct?
Is the fault solved after the versions are corrected?
Is A8GRELost found in the printout?
Do PING tests suffer delay or packet loss?
Is the system recovered after the test or the server replacement?
Is the fault solved?

Actions:
Check the DO radio parameters of the BSC, BTS and CHM.
Check the networking mode at the R-P interface: check whether the working modes of the network interfaces on the R-P transmission link match, and correct them if necessary.
Capture packets at multiple points and find the segment where the delay or packet loss takes place.
  Transmission system: check the transmission configuration and routes.
  PDSN: request the PDSN side to conduct the necessary checks.
  Air Interface / BSC: check the BSC, including media plane, boards and data configuration.
Collect the RDS information of the BSC to see whether packets are lost within the BSC.
Check or replace the FTP server, or use the Iperf tool to conduct tests. If the system recovers, it is a server setting or performance problem.

Exits: connector f2, connector g (reverse link), or Return.
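The "capture packets at multiple points" step of the flow above amounts to comparing packet counts at neighbouring capture points and reporting the first segment where the count drops. The sketch below shows that comparison; the capture-point names, the packet counts and the 1% threshold (`max_loss`) are all hypothetical illustration values, and `first_lossy_segment` is not a function of any capture tool.

```python
from typing import List, Optional, Tuple


def first_lossy_segment(captures: List[Tuple[str, int]],
                        max_loss: float = 0.01) -> Optional[Tuple[str, str, float]]:
    """Find the first segment whose packet loss exceeds max_loss.

    captures lists (capture point, packets seen) in the order the packets
    travel, e.g. from the FTP server towards the terminal for a download.
    Returns (upstream point, downstream point, loss ratio) or None.
    """
    for (up_name, up_count), (down_name, down_count) in zip(captures, captures[1:]):
        if up_count == 0:
            continue  # nothing arrived upstream, so the ratio is undefined
        loss = (up_count - down_count) / up_count
        if loss > max_loss:
            return up_name, down_name, loss
    return None


if __name__ == "__main__":
    # Hypothetical counts for one download test.
    download = [
        ("FTP server", 10000),
        ("P-I interface, PDSN end", 10000),
        ("R-P interface, PDSN end", 9990),
        ("R-P interface, BSC end", 9400),
        ("terminal", 9390),
    ]
    print(first_lossy_segment(download))
```

In this illustration the drop appears between the two ends of the R-P interface, which would point to the transmission system rather than to the PDSN or the BSC. Delay can be localized the same way by comparing timestamps instead of counts.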

Page 29: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Global abnormal forward rate handling process (1)

As it is a global fault, check the BSC side:
Fault phenomena
Occurrence time
Affected scope of the fault
Faults in other NEs
Recent system changes in the BSS or other NEs
Alarms and notifications
Whether the fault occurs on the forward rate or the reverse rate

Page 30: DO Trouble-Shooting and Case Analyse-1

Access failure

Sites abnormal forward rate handling process (2)

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Entry: connector f2.

Decisions:
Are the Abis interface resources limited?
Is the rate normal during the tests?
Is there any abnormal alarm or notification of the BTS?
Is the fault solved after the terminal and the PC are replaced?
Are the BTS versions correct?
Is the fault solved after the version correction?
Is it an Air Interface or an Abis Interface failure?
Is it a transmission, engineering or BSC problem?
Are the RSSI and the main-to-diversity locking ratio correct?
Is the system recovered after all of the above are done?
Is the fault solved?

Actions:
Conduct tests at night, when the traffic is low, to confirm.
Handle the abnormal alarm and see whether it affects DO performance.
Replace the terminal and the PC. If this solves the fault, the cause lies in the terminal, the PC or their connection.
Collect the BTS DOLOG and the terminal log at the site, capture packets at the terminal end, and feed them back for further analysis.
Check the DO radio parameters and the carrier status of the BTS.
Check the BTS RFS and the antenna feeder system to see whether there is any external interference.
  Abis Interface: check the Abis interface, and cut over the BTS to another board if necessary.
  Transmission: request the transmission side to assist.
  BSC: check the BSC further.
  Engineering: improve the engineering.
  Air Interface: try to find the fault at the BTS side; use the RF fault location method to determine whether it is a BTS problem or interference.
  Interference: check the RFS for forward or reverse interference and handle it.

Exits: connector f1, or Return.

Page 31: DO Trouble-Shooting and Case Analyse-1

Access failure

Abnormal reverse rate handling process (2)

[Flowchart: boxes listed without their Y/N branch arrows, which are not preserved in this transcript.]

Entry: connector g.

Decisions:
Is the fault a global one?
Are the parameters correct?
Is the system recovered after the correction?
Do PING tests suffer delay or packet loss?
Is the system recovered after the test or the server replacement?
Are the RSSI and the main-to-diversity locking ratio correct?
Is the system recovered?
Is the fault solved?

Actions:
Check whether the QoS parameter of the RTCMAC stream under QoSClass of QoS Parameter in DO Radio Parameter is the default value.
Check whether the non-QoS parameter of the RTCMAC stream under non-QoS Parameter in DO Radio Parameter is the default value.
Check the QRAB generation mode and the QRAB algorithm of the DO CHM of the BTS.
Capture packets at multiple points and find the segment where the delay or packet loss takes place.
  Transmission system: check the transmission configuration and routes.
  PDSN: request the PDSN side to conduct the necessary checks.
  Air Interface / BSC: check the BSC, including media plane, boards and data configuration.
Collect the RDS information of the BSC to see whether packets are lost within the BSC.
Check or replace the FTP server, or use the Iperf tool to conduct tests. If the system recovers, it is a server setting or performance problem.
Check the BTS RFS and the antenna feeder system to see whether there is any external interference.

Exits: connectors f1/f2, or Return.

Page 32: DO Trouble-Shooting and Case Analyse-1

Data transmission performance fault

Case Analysis

Case study (1): Abnormal Download Rate

Page 33: DO Trouble-Shooting and Case Analyse-1

Contents

Common Types of EVDO Service Faults
Methods of EVDO Troubleshooting
Common Tools and Commands in EVDO Troubleshooting

Page 34: DO Trouble-Shooting and Case Analyse-1

Common Tools and Commands in EVDO Troubleshooting

Common Tools and Commands in DO Troubles

Page 35: DO Trouble-Shooting and Case Analyse-1
