eRAN
V100R005C00
Troubleshooting Guide
Issue: 02
Date: 2012-07-30
HUAWEI TECHNOLOGIES CO., LTD.
Copyright © Huawei Technologies Co., Ltd. 2012. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written
consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or representations
of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China
Website: http://www.huawei.com
Email: support@huawei.com
Issue 02 (2012-07-30)
Huawei Proprietary and Confidential
Copyright © Huawei Technologies Co., Ltd.
About This Document
Purpose
This document describes how to diagnose and handle eRAN faults. Maintenance engineers can
troubleshoot the following faults by referring to this document:
• Faults reflected in user complaints
• Faults found during routine maintenance
• Sudden faults
• Faults indicated by alarms
Intended Audience
This document is intended for:
• System engineers
• Site maintenance engineers
Product Versions
The following table lists the product versions related to this document.
Product Name     Product Version
DBS3900 LTE      V100R005C00
DBS3900 LTE TDD  V100R005C00
BTS3900 LTE      V100R005C00
BTS3900A LTE     V100R005C00
BTS3900L LTE     V100R005C00
BTS3900AL LTE    V100R005C00
Change History
For details about the changes in this document, see 1 Changes in eRAN Troubleshooting
Guide.
Organization
1 Changes in eRAN Troubleshooting Guide
2 Troubleshooting Process and Methods
This chapter describes the general troubleshooting process and methods.
3 Common Maintenance Functions
This chapter describes common maintenance functions that are used to analyze and handle faults.
It also explains or provides references on how to use the functions.
4 Troubleshooting Access Faults
This chapter describes how to diagnose and handle access faults.
5 Troubleshooting Intra-RAT Handover Faults
This chapter describes how to diagnose and handle intra-RAT handover faults. RAT is short for
radio access technology.
6 Troubleshooting Service Drops
This chapter describes the method and procedure for troubleshooting service drops in the Long
Term Evolution (LTE) system. It also provides the definitions of service drops and related key
performance indicator (KPI) formulas.
7 Troubleshooting Inter-RAT Handover Faults
This chapter defines inter-RAT handover faults, describes handover principles, and provides the
fault handling method and procedure.
8 Troubleshooting Rate Faults
This chapter provides definitions of faults related to traffic rates and describes how to
troubleshoot low uplink/downlink UDP/TCP rates and rate fluctuations. UDP is short for User
Datagram Protocol, and TCP is short for Transmission Control Protocol.
9 Troubleshooting Cell Unavailability Faults
This chapter defines cell unavailability faults and provides a troubleshooting method.
10 Troubleshooting IP Transmission Faults
This chapter defines IP transmission faults and describes how to troubleshoot them.
11 Troubleshooting Application Layer Faults
This chapter describes the definitions of application layer faults and the troubleshooting method.
12 Troubleshooting Transmission Synchronization Faults
This chapter describes how to troubleshoot transmission synchronization faults. Such faults
include clock reference problems, IP clock link faults, system clock unlocked faults, base
station synchronization frame number errors, and time synchronization failures.
13 Troubleshooting Transmission Security Faults
This chapter describes how to troubleshoot transmission security faults.
14 Troubleshooting RF Unit Faults
This chapter describes the method and procedure for troubleshooting radio frequency (RF) unit
faults in the Long Term Evolution (LTE) system.
15 Troubleshooting License Faults
This chapter describes how to diagnose and handle license faults.
Conventions
Symbol Conventions
The symbols that may be found in this document are defined as follows.
Symbol
Description
Indicates a hazard with a high level of risk, which if not
avoided, will result in death or serious injury.
Indicates a hazard with a medium or low level of risk, which
if not avoided, could result in minor or moderate injury.
Indicates a potentially hazardous situation, which if not
avoided, could result in equipment damage, data loss,
performance degradation, or unexpected results.
Indicates a tip that may help you solve a problem or save
time.
Provides additional information to emphasize or supplement
important points of the main text.
General Conventions
The general conventions that may be found in this document are defined as follows.
Convention       Description
Times New Roman  Normal paragraphs are in Times New Roman.
Boldface         Names of files, directories, folders, and users are in boldface. For example, log in as user root.
Italic           Book titles are in italics.
Courier New      Examples of information displayed on the screen are in Courier New.
Command Conventions
The command conventions that may be found in this document are defined as follows.
Convention        Description
Boldface          The keywords of a command line are in boldface.
Italic            Command arguments are in italics.
[ ]               Items (keywords or arguments) in brackets [ ] are optional.
{ x | y | ... }   Optional items are grouped in braces and separated by vertical bars. One item is selected.
[ x | y | ... ]   Optional items are grouped in brackets and separated by vertical bars. One item is selected or no item is selected.
{ x | y | ... }*  Optional items are grouped in braces and separated by vertical bars. A minimum of one item or a maximum of all items can be selected.
[ x | y | ... ]*  Optional items are grouped in brackets and separated by vertical bars. Several items or no item can be selected.
GUI Conventions
The GUI conventions that may be found in this document are defined as follows.
Convention  Description
Boldface    Buttons, menus, parameters, tabs, window, and dialog titles are in boldface. For example, click OK.
>           Multi-level menus are in boldface and separated by the ">" signs. For example, choose File > Create > Folder.
Keyboard Operations
The keyboard operations that may be found in this document are defined as follows.
Format        Description
Key           Press the key. For example, press Enter and press Tab.
Key 1+Key 2   Press the keys concurrently. For example, pressing Ctrl+Alt+A means the three keys should be pressed concurrently.
Key 1, Key 2  Press the keys in turn. For example, pressing Alt, A means the two keys should be pressed in turn.
Mouse Operations
The mouse operations that may be found in this document are defined as follows.
Action        Description
Click         Select and release the primary mouse button without moving the pointer.
Double-click  Press the primary mouse button twice continuously and quickly without moving the pointer.
Drag          Press and hold the primary mouse button and move the pointer to a certain position.
Contents
About This Document
1 Changes in eRAN Troubleshooting Guide
2 Troubleshooting Process and Methods
2.1 General Troubleshooting Process
2.2 General Troubleshooting Steps
2.2.1 Backing Up Data
2.2.2 Collecting Fault Information
2.2.3 Determining the Fault Scope and Type
2.2.4 Identifying Fault Causes
2.2.5 Rectifying the Fault
2.2.6 Checking Whether Faults Have Been Rectified
2.2.7 Contacting Huawei Technical Support
3 Common Maintenance Functions
3.1 User Tracing
3.2 Interface Tracing
3.3 Comparison/Interchange
3.4 Switchover/Reset
4 Troubleshooting Access Faults
4.1 Definitions of Access Faults
4.2 Background Information
4.3 Troubleshooting Method
4.4 Troubleshooting Access Faults Due to Incorrect Parameter Configurations
4.5 Troubleshooting Access Faults Due to Radio Environment Abnormalities
5 Troubleshooting Intra-RAT Handover Faults
5.1 Definitions of Intra-RAT Handover Faults
5.2 Background Information
5.3 Troubleshooting Method
5.4 Troubleshooting Intra-RAT Handover Faults Due to Hardware Faults
5.5 Troubleshooting Intra-RAT Handover Faults Due to Incorrect Data Configurations
5.6 Troubleshooting Intra-RAT Handover Faults Due to Target Cell Congestion
5.7 Troubleshooting Intra-RAT Handover Faults Due to Poor Uu Quality
6 Troubleshooting Service Drops
6.1 Definitions of Service Drops
6.2 Background Information
6.3 Troubleshooting Method
6.4 Troubleshooting Service Drops Due to Radio Faults
6.5 Troubleshooting Service Drops Due to Transmission Faults
6.6 Troubleshooting Service Drops Due to Congestion
6.7 Troubleshooting Service Drops Due to Handover Failures
6.8 Troubleshooting Service Drops Due to MME Faults
7 Troubleshooting Inter-RAT Handover Faults
7.1 Definitions of Inter-RAT Handover Faults
7.2 Background Information
7.3 Troubleshooting Inter-RAT Handovers
8 Troubleshooting Rate Faults
8.1 Definitions of Rate Faults
8.2 Background Information
8.3 Troubleshooting Abnormal Single-UE Rates
8.4 Troubleshooting Abnormal Multi-UE Rates
9 Troubleshooting Cell Unavailability Faults
9.1 Definitions of Cell Unavailability Faults
9.2 Background Information
9.3 Troubleshooting Method
9.4 Troubleshooting Cell Unavailability Faults Due to Incorrect Data Configuration
9.5 Troubleshooting Cell Unavailability Faults Due to Abnormal Transport Resources
9.6 Troubleshooting Cell Unavailability Faults Due to Abnormal RF Resources
9.7 Troubleshooting Cell Unavailability Faults Due to Limited Capacity or Capability
9.8 Troubleshooting Cell Unavailability Faults Due to Faulty Hardware
10 Troubleshooting IP Transmission Faults
10.1 Definitions of IP Transmission Faults
10.2 Background Information
10.3 Troubleshooting Method
10.4 Troubleshooting IP Physical Layer Faults
10.5 Troubleshooting IP Link Layer Faults
10.6 Troubleshooting IP Layer Faults
11 Troubleshooting Application Layer Faults
11.1 Definitions of Application Layer Faults
11.2 Background Information
11.3 Troubleshooting Method
11.4 Troubleshooting SCTP Link Faults
11.5 Troubleshooting IP Path Faults
11.6 Troubleshooting OM Channel Faults
12 Troubleshooting Transmission Synchronization Faults
12.1 Definitions of Transmission Synchronization Faults
12.2 Background Information
12.3 Troubleshooting Specific Transmission Synchronization Faults
13 Troubleshooting Transmission Security Faults
13.1 Definitions of Transmission Security Faults
13.2 Background Information
13.3 Troubleshooting Specific Transmission Security Faults
14 Troubleshooting RF Unit Faults
14.1 Definitions of RF Unit Faults
14.2 Background Information
14.3 Troubleshooting Method
14.4 Troubleshooting VSWR Faults
14.5 Troubleshooting RTWP Faults
14.6 Troubleshooting ALD Link Faults
15 Troubleshooting License Faults
15.1 Definitions of License Faults
15.2 Background Information
15.3 Troubleshooting Method
15.4 Troubleshooting License Faults That Occur During License Installation
15.5 Troubleshooting License Faults That Occur During Network Running
15.6 Troubleshooting License Faults That Occur During Network Adjustment
1 Changes in eRAN Troubleshooting Guide
This chapter describes the changes in eRAN Troubleshooting Guide.
02 (2012-07-30)
This is the second official release.
Compared with issue 01 (2012-06-29), this issue does not include any new information.
Compared with issue 01 (2012-06-29), this issue includes the following changes.
Topic           Change Description
Whole document  Updated descriptions.
No information in issue 01 (2012-06-29) is deleted from this issue.
01 (2012-06-29)
This is the first official release.
Compared with draft A (2012-05-11), this issue does not include any new information.
Compared with draft A (2012-05-11), this issue includes the following changes.
Topic                             Change Description
14.5 Troubleshooting RTWP Faults  Added troubleshooting steps, including the step for diagnosing and handling cross-connected antennas.
No information in draft A (2012-05-11) is deleted from this issue.
Draft A (2012-05-11)
This is a draft.
2 Troubleshooting Process and Methods
About This Chapter
This chapter describes the general troubleshooting process and methods.
2.1 General Troubleshooting Process
This section describes the general troubleshooting process.
2.2 General Troubleshooting Steps
This section describes each step in the general troubleshooting process in detail.
2.1 General Troubleshooting Process
This section describes the general troubleshooting process.
Figure 2-1 shows the general troubleshooting process.
Figure 2-1 General troubleshooting process
Table 2-1 details each step of the general troubleshooting process.
Table 2-1 Steps in the general troubleshooting process
No.  Step                                               Remarks
1    2.2.1 Backing Up Data                              Data to be backed up includes the database, alarm information, and log files.
2    2.2.2 Collecting Fault Information                 Fault information is essential to troubleshooting. Therefore, maintenance personnel are advised to collect as much fault information as possible.
3    2.2.3 Determining the Fault Scope and Type         Determine the fault scope and type based on the symptoms.
4    2.2.4 Identifying Fault Causes                     Identify the fault causes based on the fault information and symptoms.
5    2.2.5 Rectifying the Fault                         Take appropriate measures or steps to rectify the fault.
6    2.2.6 Checking Whether Faults Have Been Rectified  Verify whether the fault is rectified. If the fault is rectified, the troubleshooting process ends. If the fault persists, check whether this fault falls in another fault scope or type.
7    2.2.7 Contacting Huawei Technical Support          If the fault scope or type cannot be determined, or the fault cannot be rectified, contact Huawei technical support.
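The flow in Table 2-1 can be sketched as a small state machine. The Python below is an illustrative sketch only, not part of any Huawei tool: the names Step and next_step and the three boolean flags are hypothetical, and the branch targets are inferred from the Remarks column because Figure 2-1 is not reproduced here.

```python
from enum import Enum


class Step(Enum):
    """Steps of Table 2-1, numbered as in the table."""
    BACK_UP_DATA = 1
    COLLECT_FAULT_INFO = 2
    DETERMINE_SCOPE_AND_TYPE = 3
    IDENTIFY_CAUSES = 4
    RECTIFY_FAULT = 5
    CHECK_RECTIFIED = 6
    CONTACT_SUPPORT = 7


def next_step(step, scope_known=True, can_rectify=True, rectified=True):
    """Return the step that follows `step`, or None when the process ends."""
    # Assumption: an undetermined scope or type escalates to technical support.
    if step is Step.DETERMINE_SCOPE_AND_TYPE and not scope_known:
        return Step.CONTACT_SUPPORT
    # Assumption: a fault that cannot be rectified also escalates.
    if step is Step.RECTIFY_FAULT and not can_rectify:
        return Step.CONTACT_SUPPORT
    if step is Step.CHECK_RECTIFIED:
        # A persisting fault is re-examined for another scope or type.
        return None if rectified else Step.DETERMINE_SCOPE_AND_TYPE
    if step is Step.CONTACT_SUPPORT:
        return None
    # Otherwise the steps run in table order.
    return Step(step.value + 1)
```

For example, next_step(Step.CHECK_RECTIFIED, rectified=False) returns Step.DETERMINE_SCOPE_AND_TYPE, which mirrors the remark for step 6.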
2.2 General Troubleshooting Steps
This section describes each step in the general troubleshooting process in detail.
2.2.1 Backing Up Data
To ensure data security, first save onsite data and back up related databases, alarm information,
and log files during troubleshooting.
For details about data to be backed up and how to back up data, see eNodeB Routine Maintenance
Guide.
2.2.2 Collecting Fault Information
Fault information is essential to troubleshooting. Therefore, maintenance personnel should
collect as much fault information as possible.
Fault Information to Be Collected
Before rectifying a fault, collect the following information:
• Fault symptom
• Time, location, and frequency
• Scope and impact
• Equipment running status before the fault occurs
• Operations performed on the equipment before the fault occurs, and the results of these operations
• Measures taken to deal with the fault, and the results
• Alarms and correlated alarms when the fault occurs
• Board indicator status when the fault occurs
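The checklist above can be kept as a structured record so that gaps are visible before anyone starts rectifying the fault. The Python below is a minimal sketch under stated assumptions: FaultReport, its field names, and missing_items are hypothetical and do not correspond to any M2000 or eNodeB data format.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class FaultReport:
    """One field per item in the checklist above (field names are illustrative)."""
    symptom: Optional[str] = None
    time_location_frequency: Optional[str] = None
    scope_and_impact: Optional[str] = None
    status_before_fault: Optional[str] = None
    operations_before_fault: Optional[str] = None
    measures_taken: Optional[str] = None
    alarms: List[str] = field(default_factory=list)
    board_indicator_status: Optional[str] = None

    def missing_items(self):
        """Return the names of checklist items that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

A report created with only a symptom filled in lists the other seven items as missing, which makes it clear what still has to be collected.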
Fault Information Collection Methods
The methods for collecting fault information are as follows:
• Consult the person who reports the fault about the symptom, time, location, and frequency of the fault.
• Consult maintenance personnel about the equipment running status, the fault symptom, the operations performed before the fault occurred, the measures taken after the fault occurred, and the effect of these measures.
• Observe the board indicators, operation and maintenance (OM) system, and alarm management system to obtain the software and hardware running status.
• Estimate the scope and impact of the fault by means of service demonstration, performance measurement, and interface or signaling tracing.
Fault Information Collection Skills
Keep the following points in mind when collecting fault information:
• Do not handle a fault hastily. Collect as much information as possible before rectifying the fault.
• Keep in close contact with maintenance personnel at other sites. Ask them for technical support if necessary.
Fault Information Classification
Table 2-2 Fault information types
Type                  Attribute   Description
Original information  Definition  Original information includes the fault information reflected in user complaints, fault notifications from other offices, exceptions detected in maintenance, and the information collected by maintenance personnel through different channels in the early period when the fault is found. Original information is important for fault locating and analysis.
                      Function    Original information is used to determine the fault scope and fault category. Original information helps narrow the fault scope and locate the faults in the initial stage of troubleshooting. Original information can also help troubleshoot other faults, especially trunk faults.
                      Reference   None
Alarm information     Definition  Alarm information is the output of the eNodeB alarm system. It relates to the hardware, links, trunk, and CPU load of the eNodeB, and includes the description of faults or exceptions, fault causes, and handling suggestions. Alarm information is a key element for fault locating and analysis.
                      Function    Alarm information is specific and complete; therefore, it is directly used to locate the faulty component or find the fault cause. In addition, alarm information can also be used with other methods to locate a fault.
                      Reference   For details about how to use the alarm system, see M2000 Online Help. For detailed information about each alarm, see eNodeB Alarm Reference.
Indicator status      Definition  Board indicators indicate the running status of boards, circuits, links, optical channels, and nodes. Indicator status information is also a key element for fault locating and analysis.
                      Function    By analyzing indicator status, you can roughly locate faulty components or fault causes that facilitate subsequent operations. Generally, indicator status information is combined with alarm information for locating faults.
                      Reference   For the description of indicator status, see associated hardware description manuals.
Performance counter   Definition  Performance counters are statistics about service performance, such as statistics about service drops and handovers. They help find out causes of service faults so that measures can be taken in a timely manner to prevent such faults.
                      Function    Performance counters are used with signaling tracing and signaling analysis to locate causes of service faults such as a high service drop rate, low handover success rate, and service exception. They are generally used for the key performance indicator (KPI) analysis and performance monitoring of the entire network.
                      Reference   For details about the usage of performance counters, see M2000 Online Help. For the definitions of each performance counter, see eNodeB Performance Counter Reference.
2.2.3 Determining the Fault Scope and Type
Based on the fault symptom, determine the fault scope and type.
In this document, faults are classified according to symptoms. eRAN faults are classified into
service faults and equipment faults.
Service Faults
Service faults are further classified into the following types:
• Access faults
  – User access fails.
  – The access success rate is low.
• Handover faults
  – The intra-frequency handover success rate is low.
  – The inter-frequency handover success rate is low.
• Service drop faults
  – Service drops occur during handovers.
  – Services are unexpectedly released.
• Inter-RAT interoperability faults
  – Inter-RAT handovers cannot be normally performed.
• Rate faults
  – Data rates are low.
  – There is no data rate.
  – Data rates fluctuate.
Equipment Faults
Equipment faults are further classified into the following types:
• Cell faults
  – Cell setup fails.
  – Cell activation fails.
• Operation and maintenance channel (OMCH) faults
  – The OMCH is interrupted or fails intermittently.
  – The CPRI link does not work properly.
  – The S1/X2/SCTP/IPPATH links do not work properly.
  – IP transport is abnormal.
• Clock faults
  – The clock source is faulty.
  – The IP clock link is faulty.
  – The system clock is out of lock.
• Security faults
  – The IPSec tunnel is abnormal.
  – SSL negotiation is abnormal.
  – Digital certificate processing is abnormal.
• Radio frequency faults
  – The standing wave is abnormal.
  – The received total wideband power (RTWP) on the RX channel is abnormal.
  – The antenna line device (ALD) link does not work properly.
• License faults
  – License installation fails.
  – License modification fails.
2.2.4 Identifying Fault Causes
Fault locating is a process of finding the fault causes from many possible causes. By analyzing
and comparing all possible causes and then excluding impossible factors, you can determine the
specific fault causes.
Locating Equipment Faults
Locating equipment faults is easier than locating service faults. Though there are many types of
equipment faults, the fault scope is relatively narrow. Equipment faults are generally indicated
by the indicator status, alarms, and error messages. Based on the indicator status information,
alarm handling suggestions, or error messages, users can rectify most equipment faults.
Locating Service Faults
The methods for locating different types of service faults are as follows:
• Access faults: Check the S1 interface and Uu interface. Locate transmission faults segment
  by segment. Then, determine whether faults occur in the eRAN based on the interface
  conditions. If so, proceed to locate specific faults.
• Rate faults: Check whether there are access faults. If there are access faults, locate specific
  faults by using the previous methods. Then, check the traffic on the IP path to determine
  fault points.
• Handover faults: Start signaling tracing and determine fault points according to the
  signaling flow.
For instructions on fault locating and analysis, see 3 Common Maintenance Functions.
2.2.5 Rectifying the Fault
To rectify a fault, take proper measures to eliminate the fault and restore the system, including
checking and repairing cables, replacing boards, modifying configuration data, switching over
the system, and resetting boards. Maintenance personnel need to rectify different faults using
proper methods.
After the fault is rectified, be sure to perform the following:
• Perform testing to confirm that the fault has been rectified.
• Record the troubleshooting process and key points.
• Summarize measures for preventing or reducing such faults. This helps to prevent similar
  faults from occurring in the future.
2.2.6 Checking Whether Faults Have Been Rectified
Check the equipment running status, observe the board indicator status, and query alarm
information to verify that the system is running properly. Perform testing to confirm that faults
have been rectified and that services return to normal.
2.2.7 Contacting Huawei Technical Support
If the fault scope or type cannot be determined, or the fault cannot be rectified, contact Huawei
technical support.
If you need to contact Huawei technical support during troubleshooting, collect necessary
information in advance.
Collecting General Fault Information
General fault information includes the following:
• Name of the office
• Name and phone number of the contact person
• Time when the fault occurs
• Detailed description of the fault symptoms
• Host software version of the equipment
• Measures taken after the fault occurs and the result
• Severity level of the fault and the time required for rectifying the fault
Collecting Fault Location Information
When a fault occurs, collect the following information:
• One-click logs of the main control board
• One-click logs of baseband boards
• One-click logs of RRUs
• Alarm information
• KPI data of the entire network
• Intelligent field test system (IFTS) tracing
• Cell drive test (DT) tracing
• SCTP link tracing
• Signaling tracing on interfaces
• eNodeB configuration information
• M2000 self-organizing network (SON) logs
• M2000 adaptation logs
• M2000 software module management logs
For details about how to collect fault information, see eNodeB LMT User Guide, eNodeB
Performance Monitoring Reference, eNodeB Routine Maintenance Guide, and M2000 Online
Help.
Contacting Huawei Technical Support
The following lists the contact information of Huawei technical support:
• If you are in mainland China, dial 4008302118.
• If you are outside mainland China, contact the technical support personnel in the local
  Huawei office.
• Email: support@huawei.com
• Website: http://support.huawei.com
3 Common Maintenance Functions
About This Chapter
This chapter describes common maintenance functions that are used to analyze and handle faults.
It also explains or provides references on how to use the functions.
3.1 User Tracing
User tracing is a function that traces all messages of a user in sequence over standard and internal
interfaces, traces internal status of the user equipment (UE), and displays the tracing results on
the screen.
3.2 Interface Tracing
Interface tracing is a function that traces all messages within a period in sequence on a standard
or internal interface and displays them on the screen.
3.3 Comparison/Interchange
Comparison and interchange are used to locate faults in a piece or pieces of equipment.
3.4 Switchover/Reset
Switchover helps identify whether the originally active equipment is faulty or whether the active/
standby relationship is normal. Reset is used to identify whether software running errors exist.
3.1 User Tracing
User tracing is a function that traces all messages of a user in sequence over standard and internal
interfaces, traces internal status of the user equipment (UE), and displays the tracing results on
the screen.
User tracing has the following advantages:
• Real-time
• Able to trace the user over all standard interfaces
• Usable when traffic is heavy
• Applicable in various scenarios, for example, call procedure analysis and VIP user tracing
User tracing is usually used to diagnose call faults that can be reproduced. For details about how
to perform user tracing, see the online help for the operation and maintenance system.
3.2 Interface Tracing
Interface tracing is a function that traces all messages within a period in sequence on a standard
or internal interface and displays them on the screen.
Interface tracing has the following advantages:
• Real-time
• Complete: All messages within a period on an interface can be traced.
• Able to trace link management messages
Interface tracing applies to scenarios in which the UEs involved are not known in advance. For
example, this function can be used to diagnose the cause for a low success rate of radio resource
control (RRC) connection setup at a site. For details about how to perform interface tracing, see
the online help for the operation and maintenance system.
3.3 Comparison/Interchange
Comparison and interchange are used to locate faults in a piece or pieces of equipment.
Comparison is a function used to locate a fault by comparing the faulty component or fault
symptom with a functional component or normal condition, respectively. Interchange is a
function used to locate a fault by interchanging a possibly faulty component with a functional
component and comparing the running status before and after the interchange.
Comparison usually applies in scenarios with a single fault. Interchange usually applies in
scenarios with complicated faults.
3.4 Switchover/Reset
Switchover helps identify whether the originally active equipment is faulty or whether the active/
standby relationship is normal. Reset is used to identify whether software running errors exist.
Switchover is the switching of the active and standby roles of equipment so that the standby
equipment takes over services. Comparing the running status before and after the switchover helps
identify whether the originally active equipment is faulty or whether the active/standby
relationship is normal. Reset is a means to manually restart part of or the entire equipment. It is
used to identify whether software running errors exist.
Use switchover and reset only as a last resort in emergencies. Exercise caution when using them,
because:
• Compared with other functions, switchover and reset are only auxiliary means of fault
  locating.
• Because software errors occur unpredictably, a fault is usually not reproduced within a short
  period after a switchover or reset. This hides the fault and puts the secure and stable running
  of the equipment at risk.
• Resets might interrupt services. Improper operations may even cause the system to collapse.
  Such interruptions and collapses severely affect the operation of the system.
4 Troubleshooting Access Faults
About This Chapter
This chapter describes how to diagnose and handle access faults.
4.1 Definitions of Access Faults
If an access fault occurs, UEs have difficulty accessing the network due to radio resource control
(RRC) connection setup failures or E-UTRAN radio access bearer (E-RAB) setup failures.
4.2 Background Information
This section provides counters and alarms related to access faults, and methods for analyzing
TopN cells.
4.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
4.4 Troubleshooting Access Faults Due to Incorrect Parameter Configurations
This section provides information required to troubleshoot access faults due to incorrect
parameter configurations. The information includes fault descriptions, background information,
possible causes, fault handling method and procedure, and typical cases.
4.5 Troubleshooting Access Faults Due to Radio Environment Abnormalities
This section provides information required to troubleshoot access faults due to radio environment
abnormalities. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
4.1 Definitions of Access Faults
If an access fault occurs, UEs have difficulty accessing the network due to radio resource control
(RRC) connection setup failures or E-UTRAN radio access bearer (E-RAB) setup failures.
4.2 Background Information
This section provides counters and alarms related to access faults, and methods for analyzing
TopN cells.
In Long Term Evolution (LTE) networks, access faults occur either during radio resource control
(RRC) connection setup or during E-UTRAN radio access bearer (E-RAB) setup. The access
success rate is a key performance indicator (KPI) that quantifies end user experience. An
excessively low access success rate indicates that end users have difficulty making
mobile-originated or mobile-terminated calls.
Related Counters
• RRC Connection Setup Measurement (Cell)(RRC.Setup.Cell)
• RRC Connection Setup Failure Measurement (Cell)(RRC.SetupFail.Cell)
• E-RAB Setup Measurement (Cell)(E-RAB.Est.Cell)
• E-RAB Setup Failure Measurement (Cell)(E-RAB.EstFail.Cell)
For details, see eNodeB Performance Counter Reference.
Related Alarms
• Hardware-related alarms
  – ALM-26104 Board Temperature Unacceptable
  – ALM-26106 Board Clock Input Unavailable
  – ALM-26107 Board Input Voltage Out of Range
  – ALM-26200 Board Hardware Fault
  – ALM-26202 Board Overload
  – ALM-26203 Board Software Program Error
  – ALM-26208 Board File System Damaged
• Temperature-related alarms
  – ALM-25650 Ambient Temperature Unacceptable
  – ALM-25651 Ambient Humidity Unacceptable
  – ALM-25652 Cabinet Temperature Unacceptable
  – ALM-25653 Cabinet Humidity Unacceptable
  – ALM-25655 Cabinet Air Outlet Temperature Unacceptable
  – ALM-25656 Cabinet Air Inlet Temperature Unacceptable
• Link-related alarms
  – ALM-25880 Ethernet Link Fault
  – ALM-25886 IP Path Fault
  – ALM-25888 SCTP Link Fault
  – ALM-25889 SCTP Link Congestion
  – ALM-26233 BBU CPRI Optical Interface Performance Degraded
  – ALM-26234 BBU CPRI Interface Error
  – ALM-29201 S1 Interface Fault
  – ALM-29211 Excessive Packet Loss Rate in the Transmission Network
  – ALM-29212 Excessive Delay in the Transmission Network
  – ALM-29213 Excessive Jitter in the Transmission Network
• RF-related alarms
  – ALM-26239 RX Channel RTWP/RSSI Unbalanced Between RF Units
  – ALM-26520 RF Unit TX Channel Gain Out of Range
  – ALM-26521 RF Unit RX Channel RTWP/RSSI Too Low
  – ALM-26522 RF Unit RX Channel RTWP/RSSI Unbalanced
• Configuration-related alarms
  – ALM-26245 Configuration Data Inconsistency
  – ALM-26243 Board Configuration Data Ineffective
  – ALM-26812 System Dynamic Traffic Exceeding Licensed Limit
  – ALM-26815 Licensed Feature Entering Keep-Alive Period
  – ALM-26818 No License Running in System
  – ALM-26819 Data Configuration Exceeding Licensed Limit
  – ALM-29243 Cell Capability Degraded
  – ALM-29247 Cell PCI Conflict
For details, see eNodeB Alarm Reference.
TopN Cell Selection
TopN cells can be selected by analyzing the daily KPI file exported by the M2000.
• Top 3 cells with the largest numbers of failed RRC connection setups
  (L.RRC.ConnReq.Att - L.RRC.ConnReq.Succ) and the lowest RRC connection setup
  success rates
• Top 3 cells with the largest numbers of failed E-RAB setups and the lowest E-RAB setup
  success rates
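The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CellKpi` record, the cell labels, and the ranking rule (most failures first, ties broken by the lower success rate) are hypothetical, and the real layout of the KPI file exported by the M2000 is not reproduced here. Only the two counter names come from this document.

```python
from typing import NamedTuple, Optional


class CellKpi(NamedTuple):
    """One row of a daily KPI export (hypothetical layout)."""
    cell: str        # hypothetical cell label
    attempts: int    # value of L.RRC.ConnReq.Att
    successes: int   # value of L.RRC.ConnReq.Succ


def failed_setups(row: CellKpi) -> int:
    # Failed RRC connection setups = L.RRC.ConnReq.Att - L.RRC.ConnReq.Succ
    return row.attempts - row.successes


def success_rate(row: CellKpi) -> Optional[float]:
    # No attempts means the rate is undefined, not 0% or 100%.
    if row.attempts == 0:
        return None
    return row.successes / row.attempts


def top_n_cells(rows, n=3):
    """Rank cells by failure count (descending), then success rate (ascending)."""
    candidates = [r for r in rows if r.attempts > 0]
    ranked = sorted(candidates, key=lambda r: (-failed_setups(r), success_rate(r)))
    return ranked[:n]


if __name__ == "__main__":
    sample = [
        CellKpi("A", 1000, 990),
        CellKpi("B", 500, 300),
        CellKpi("C", 200, 100),
    ]
    for row in top_n_cells(sample):
        print(row.cell, failed_setups(row), success_rate(row))
```

The same ranking can be reused for E-RAB setups by feeding it the corresponding attempt and success counters.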
Tracing TopN Cells
After identifying the TopN cells and the periods when their success rates are lowest, start Uu,
S1, and X2 interface tracing tasks and check the exact point where the RRC connection or
E-RAB setup fails.
In addition, after the Evolved Packet Core (EPC) obtains the international mobile subscriber
identity (IMSI) of the UE with the lowest success rate based on the UE's temporary mobile
subscriber identity (TMSI), you can start a task to trace the UE throughout the whole network.
Analyzing Environmental Interference to TopN Cells
Environmental interference to a cell consists of downlink (DL) interference and uplink (UL)
interference to the cell. The following methods can be used to check the environmental
interference:
• To check DL interference, use a spectral scanner. If both neighboring cells and external
  systems may cause DL interference to the cell, locate the exact source of the DL
  interference.
• To check UL interference, start a cell interference detection task and analyze the result.
4.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
Possible Causes
Scenario
Fault Description: The RRC connection fails to be set up.
Possible Causes:
• The UE cannot find any cell during cell search.
• Authentication fails.
• Parameters of the UE or eNodeB are incorrectly configured.
• A fault occurs in radio interface processing.
• The radio environment is abnormal.
• Parameters of the Evolved Packet Core (EPC) are incorrectly configured.
• The UE is abnormal.

Fault Description: The E-RAB fails to be set up.
Possible Causes:
• Resources are insufficient.
• Parameters of the UE or eNodeB are incorrectly configured.
• The radio environment is abnormal.
• Parameters of the Evolved Packet Core (EPC) are incorrectly configured.
• The UE is abnormal.
Troubleshooting Flowchart
Figure 4-1 and Figure 4-2 show the troubleshooting flowcharts for handling low RRC
connection setup success rates and low E-RAB setup success rates, respectively.
Figure 4-1 Troubleshooting flowchart for low RRC connection setup success rates
Figure 4-2 Troubleshooting flowchart for low E-RAB setup success rates
Troubleshooting Procedure
1. Select TopN cells.
2. Check whether parameters of the UE or eNodeB are incorrectly configured.
   • Yes: Correct the parameter configurations. Go to 3.
   • No: Go to 4.
3. Check whether the fault is rectified.
   • Yes: End.
   • No: Go to 4.
4. Check whether the radio environment is abnormal.
   • Yes: Handle abnormalities in the radio environment. Go to 5.
   • No: Go to 6.
5. Check whether the fault is rectified.
   • Yes: End.
   • No: Go to 6.
6. Check whether parameters of the EPC are incorrectly configured.
   • Yes: Correct the parameter configurations. Go to 7.
   • No: Go to 8.
7. Check whether the fault is rectified.
   • Yes: End.
   • No: Go to 8.
8. Contact Huawei technical support.
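The procedure above follows one repeating pattern: check a possible cause, handle it if present, verify whether the fault is rectified, and otherwise move on to the next cause. A minimal Python sketch of that pattern is shown below. The `Check` type, the `run_procedure` function, and the callables passed to them are hypothetical illustrations and do not correspond to any eNodeB or M2000 interface.

```python
from typing import Callable, NamedTuple

ESCALATE = "contact Huawei technical support"


class Check(NamedTuple):
    """One 'check, handle, verify' step of the procedure (hypothetical type)."""
    name: str
    is_abnormal: Callable[[], bool]   # e.g. "are parameters incorrectly configured?"
    handle: Callable[[], None]        # corrective action for this cause


def run_procedure(checks, fault_rectified: Callable[[], bool]) -> str:
    """Walk the checks in order and stop as soon as the fault is rectified."""
    for check in checks:
        if not check.is_abnormal():
            continue              # "No" branch: go to the next check
        check.handle()            # "Yes" branch: handle the abnormality
        if fault_rectified():     # then verify whether the fault is rectified
            return f"rectified after: {check.name}"
    return ESCALATE               # final step of the procedure


if __name__ == "__main__":
    state = {"fixed": False}
    steps = [
        Check("UE/eNodeB parameters", lambda: False, lambda: None),
        Check("radio environment", lambda: True, lambda: state.update(fixed=True)),
        Check("EPC parameters", lambda: True, lambda: None),
    ]
    print(run_procedure(steps, lambda: state["fixed"]))
```

Keeping the checks in an ordered list mirrors the flowchart: the cheapest or most likely causes come first, and escalation happens only after every check has been exhausted.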
4.4 Troubleshooting Access Faults Due to Incorrect Parameter Configurations
This section provides information required to troubleshoot access faults due to incorrect
parameter configurations. The information includes fault descriptions, background information,
possible causes, fault handling method and procedure, and typical cases.
Fault Description
• The UE cannot receive broadcast information from the cell.
• The UE cannot receive signals from the cell.
• The UE cannot camp on the cell.
• The end user complains about an access failure, and the value of the performance counter
  L.RRC.ConnReq.Att is 0.
• An RRC connection is successfully set up for the UE according to standard interface tracing
  results, but then the mobility management entity (MME) releases the UE because the
  authentication procedure fails.
• The end user complains that the UE can receive signals from the cell but is unable to access
  the cell.
• According to the values of the performance counters on the eNodeB side, the number of
  RRC connections that are successfully set up is much greater than the number of E-RABs
  that are successfully set up.
• According to the KPIs, the E-RAB setup success rate is relatively low, and among all cause
  values, the cause values indicated by L.E-RAB.FailEst.TNL and L.E-RAB.FailEst.RNL
  account for a large proportion.
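The last symptom in the list above depends on how large a share each failure cause has. The following Python sketch shows one way to compute those shares. It assumes the failure counters have already been read into a dictionary; the `other` key and the 25% cutoff are hypothetical, and only the two `L.E-RAB.FailEst.*` names come from this document.

```python
def cause_shares(fail_counters):
    """Return each failure cause's share of all counted E-RAB setup failures."""
    total = sum(fail_counters.values())
    if total == 0:
        return {}  # no failures recorded, so no shares to report
    return {name: count / total for name, count in fail_counters.items()}


def dominant_causes(fail_counters, min_share=0.25):
    """List causes whose share reaches min_share (hypothetical cutoff), largest first."""
    shares = cause_shares(fail_counters)
    selected = [name for name, share in shares.items() if share >= min_share]
    return sorted(selected, key=lambda name: -shares[name])


if __name__ == "__main__":
    counters = {
        "L.E-RAB.FailEst.TNL": 60,
        "L.E-RAB.FailEst.RNL": 30,
        "other": 10,  # hypothetical bucket for all remaining causes
    }
    print(dominant_causes(counters))
```

If the transport-layer (TNL) share dominates, the IPPATH and IPRT configuration is a natural first place to look; a dominant radio-layer (RNL) share points toward cell parameters or the radio environment.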
Background Information
None
Possible Causes
• Cell parameters are incorrectly configured, for example, the E-UTRA absolute radio
  frequency channel number (EARFCN), public land mobile network (PLMN) ID, threshold
  used in the evaluation of cell camping, pilot strength, or access class.
• The UE has special requirements for authentication and encryption.
• Parameters of the subscriber identity module (SIM) card or registration-related parameters
  on the home subscriber server (HSS) are incorrectly configured.
• The authentication and encryption algorithms are incorrectly configured on the Evolved
  Packet Core (EPC).
• The IPPATH or IPRT managed objects (MOs) are incorrectly configured.
Fault Handling Flowchart
Figure 4-3 Fault handling flowchart for access faults due to incorrect parameter configurations
Fault Handling Procedure
1. Check whether cell parameters are incorrectly configured. Pay special attention to the
   following parameter settings as they are often incorrectly configured: the EARFCN, PLMN
   ID, threshold used in the evaluation of cell camping, pilot strength, and access class.
   Yes: Correct the cell parameter configurations. Go to 2.
   No: Go to 3.
2. Check whether the fault is rectified.
   Yes: End.
   No: Go to 3.
3. Check the type and version of the UE and determine whether the authentication and
   encryption functions are required.
   Yes: Enable the authentication and encryption functions. Go to 4.
   No: Go to 5.
4. Check whether the fault is rectified.
   Yes: End.
   No: Go to 5.
5. Check whether parameters of the SIM card or registration-related parameters on the HSS
   are incorrectly configured. The parameters of the SIM card include the K value, the OPc
   value derived from the operator variant algorithm configuration field (OP), the
   international mobile subscriber identity (IMSI), and whether this SIM card is a UMTS SIM
   (USIM) card.
   Yes: Correct the parameter configurations. Go to 6.
   No: Go to 7.
6. Check whether the fault is rectified.
   Yes: End.
   No: Go to 7.
7. Check whether the authentication and encryption algorithms are incorrectly configured on
   the EPC. For example, check whether the switches for the algorithms are turned off.
   Yes: Modify the parameter configuration on the EPC. Go to 8.
   No: Go to 9.
8. Check whether the fault is rectified.
   Yes: End.
   No: Go to 9.
9. Check whether the IPPATH or IPRT MOs are incorrectly configured.
   Yes: Correct the MO configurations. Go to 10.
   No: Go to 11.
10. Check whether the fault is rectified.
   Yes: End.
   No: Go to 11.
11. Check whether the fault can be diagnosed by tracing the access signaling procedure.
   Yes: Handle the fault. Go to 12.
   No: Go to 13.
12. Check whether the fault is rectified.
   Yes: End.
   No: Go to 13.
13. Contact Huawei technical support.
Typical Cases
• Case 1: An E398 UE failed to access a network whose EPC had the authentication and
  encryption functions enabled.
  Fault Description
  During a site test, an E398 UE failed to access a network where the authentication and
  encryption functions were enabled on the EPC.
  Fault Diagnosis
  1. The S1 interface was traced. According to the tracing result shown in Figure 4-4, the
     access attempt was rejected due to no-Suitable-Cells-In-tracking-area(15).
     Figure 4-4 S1 tracing result
  2. The signaling at the EPC side was traced. According to the tracing result shown in
     Figure 4-5, the access attempt was rejected by the HSS in the
     diameter-authorization-rejected(5003) message.
     Figure 4-5 Tracing result of the signaling at the EPC side
  3. The UE was checked. Specifically, the configuration, registration information, and
     the category of the SIM card were checked. The cause of the fault was then located:
     the E398 UE used a SIM card. In response to the access request from a UE using a SIM
     card, the EPC replies with a diameter-authorization-rejected message. Figure 4-6 shows
     a snapshot of the related section in 3GPP TS 29.272.
     Figure 4-6 Related section in the protocol
     In conclusion, the E398 UE was unable to access the network because the UE used a
     SIM card. To access an LTE network, the UE must use a USIM card.
  Fault Handling
  The SIM card in the E398 UE was replaced with a USIM card. Then, the authentication
  procedure was successful and the UE successfully accessed the network.
• Case 2: The E-RAB setup success rate at a site deteriorated due to incorrect transport
  resource configurations.
  Fault Description
  According to the KPIs for a site, the E-RAB setup success rate deteriorated intermittently.
  Fault Diagnosis
  1. The cause value contained in the S1AP_INITIAL_CONTEXT_SETUP_FAIL
     message (that is, the initial context setup failure message) was checked and was found
     to be transport resource unavailable(0), as shown in Figure 4-7.
     Figure 4-7 Snapshot of the S1AP_INITIAL_CONTEXT_SETUP_FAIL message
     This cause value indicates that the E-RAB failed to be set up due to faults related to
     transport resources, rather than faults related to radio resources.
  2. The IP address contained in the S1AP_INITIAL_CONTEXT_SETUP_REQ message
     was checked and was found to be 8A:14:05:14 (138.20.5.20 in dotted decimal notation).
     However, this IP address was different from the peer IP address 8A:14:05:13
     (138.20.5.19) specified in the IPPATH MO. Figure 4-8 shows the details of the
     S1AP_INITIAL_CONTEXT_SETUP_REQ message.
     Figure 4-8 Snapshot of the S1AP_INITIAL_CONTEXT_SETUP_REQ message
  3. This inconsistency was investigated. As the EPC maintenance personnel confirmed,
     multiple logical IP addresses were configured on the interface of the unified gateway
     (UGW), but only one IPPATH MO was configured on the eNodeB. As a result, the
     E-RAB failed to be set up.
  Fault Handling
  New IPPATH MOs were configured on the eNodeB based on the network plan. The E-RAB
  setup success rate was then observed for a period of time and remained normal throughout.
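The address comparison in Case 2 can be reproduced with a short Python sketch. It converts the hexadecimal octets shown in the trace into dotted decimal notation and checks them against a list of configured peer addresses. The function names and the idea of holding the IPPATH peer addresses in a plain list are assumptions made for illustration; they are not part of any eNodeB tool.

```python
import ipaddress


def hex_octets_to_ip(text: str) -> ipaddress.IPv4Address:
    """Convert four hex octets such as '8A:14:05:14' or '8A 14 05 13' to an IPv4 address."""
    octets = text.replace(":", " ").split()
    if len(octets) != 4:
        raise ValueError(f"expected 4 octets, got {len(octets)}: {text!r}")
    return ipaddress.IPv4Address(bytes(int(octet, 16) for octet in octets))


def has_matching_ippath(requested_hex: str, configured_peers_hex) -> bool:
    """Return True if the address in the setup request matches a configured peer address."""
    requested = hex_octets_to_ip(requested_hex)
    return any(requested == hex_octets_to_ip(peer) for peer in configured_peers_hex)


if __name__ == "__main__":
    requested = "8A:14:05:14"     # address carried in the setup request (from the case)
    configured = ["8A 14 05 13"]  # peer address in the only configured IPPATH MO
    print(hex_octets_to_ip(requested), has_matching_ippath(requested, configured))
```

In the situation described in the case, the check returns `False` until an IPPATH entry for the second gateway address is added.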
4.5 Troubleshooting Access Faults Due to Radio Environment Abnormalities
This section provides information required to troubleshoot access faults due to radio environment
abnormalities. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
Fault Description
• During a random access procedure, the UE cannot receive any random access responses.
• During an RRC connection setup process, the eNodeB has not received any RRC
  connection setup complete messages within the related timeout duration.
• During an E-RAB setup process, the UE's response to the security mode command times out.
• The eNodeB has not received any RRC connection reconfiguration complete messages
  within the related timeout duration.
• At the eNodeB side, both the RRC connection setup success rate and the E-RAB setup
  success rate are low.
Background Information
Radio environment abnormalities include radio interference, imbalance between the uplink (UL)
and downlink (DL) quality, weak coverage, and eNodeB hardware faults (such as inconsistent
antenna configurations). The items to be investigated and the methods of investigating them
are described as follows:
• Investigating radio interference
  DL interference from neighboring cells, DL interference from external systems, and UL
  interference need to be investigated. To investigate the DL interference, use a spectral
  scanner. To investigate the UL interference, start a cell interference detection task.
• Investigating weak coverage
  The reference signal received power (RSRP) values reported by UEs during their access
  need to be investigated. If most of these values are relatively low, the access difficulties
  are very likely caused by the weak coverage provided by the cell.
  The actual radius of cell coverage as well as the signal quality variation need to be
  investigated so that users can determine whether wide coverage or cross-cell coverage
  occurs.
• Investigating the imbalance between UL and DL quality
  The transmit power of the remote radio unit (RRU) and of the UE needs to be investigated
  to check whether UL or DL limitations have occurred, because imbalance between UL and
  DL quality is caused by UL limitations or DL limitations.
  The UL and DL radii of cell coverage need to be investigated using drive tests.
• Investigating eNodeB hardware
  If two antennas are used, the tilt and azimuth of each antenna need to be investigated. If
  their tilts or azimuths are significantly different from each other, adjust them so that their
  tilts and azimuths are the same.
  The jumper connection needs to be investigated by analyzing drive test results. If the jumper
  is reversely connected, the UL signal level will be much lower than the DL signal level in
  the cell, in which case UEs far from the eNodeB will easily encounter access failures.
  Therefore, if the jumper is reversely connected, rectify the jumper connection.
  The physical conditions of feeders need to be investigated. If a feeder is damaged,
  water-soaked, bent, or not securely connected, a large number of call drops will occur. If
  a voltage standing wave ratio (VSWR) alarm is reported, such problems exist and you need
  to replace the faulty feeder.
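The weak-coverage check described above ("most of these values are relatively low") can be made concrete with a small Python sketch. The threshold of -110 dBm and the function name are hypothetical; this document does not define what counts as a low RSRP value, so choose the threshold according to the network plan.

```python
def weak_coverage_share(rsrp_dbm, threshold_dbm=-110.0):
    """Return the fraction of reported RSRP values below threshold_dbm.

    rsrp_dbm: RSRP values (in dBm) reported by UEs during access.
    threshold_dbm: hypothetical cutoff for 'relatively low'; not taken from this guide.
    """
    if not rsrp_dbm:
        return 0.0  # no reports, so nothing can be concluded
    low = sum(1 for value in rsrp_dbm if value < threshold_dbm)
    return low / len(rsrp_dbm)


if __name__ == "__main__":
    reports = [-95.0, -112.0, -115.0, -120.0]
    share = weak_coverage_share(reports)
    print(f"{share:.0%} of access attempts reported RSRP below the threshold")
```

A share well above one half would support the weak-coverage hypothesis, whereas a low share suggests looking at interference or UL/DL imbalance first.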
Figure 4-9 and Figure 4-10 show common causes of random access failures and E-RAB setup
failures, respectively.
Figure 4-9 Common causes of random access failures
Figure 4-10 Common causes of E-RAB setup failures
Possible Causes
• The cell provides weak coverage.
• The UE does not use the maximum transmit power.
• Inter-modulation interference exists.
• The UE is located at the cell edge.
Fault Diagnosis
To effectively diagnose access faults due to radio environment abnormalities, first determine
whether the fault is caused by radio interference or weak coverage. The following procedure
is recommended:
Fault Handling Procedure
1. Check whether related alarms are reported.
   Yes: Handle these alarms by referring to eNodeB Alarm Reference. Go to 2.
   No: Go to 3.
2. Check whether the fault is rectified.
   Yes: End.
   No: Go to 3.
3. Check whether interference exists. By using a spectral scanner, check whether there is DL
   interference from neighboring cells or external systems. By analyzing the cell interference
   detection result, check whether there is UL interference.
   Yes: Minimize the interference. Go to 4.
   No: Go to 5.
4. Check whether the fault is rectified.
   Yes: End.
   No: Go to 5.
5. Check whether the transmit power of the RRU and UE falls outside the link budget.
   Yes: Adjust the UL and DL transmit power. Go to 6.
   No: Go to 7.
6. Check whether the fault is rectified.
   Yes: End.
   No: Go to 7.
7. Check whether cell coverage is abnormal.
   Yes: Based on the RSRP distribution of the UEs attempting to access the cell, use drive
   tests to investigate and handle possible coverage problems, interference, and imbalance
   between UL and DL quality. Go to 8.
   No: Go to 9.
8. Check whether the fault is rectified.
   Yes: End.
   No: Go to 9.
9. Contact Huawei technical support.
Typical Cases
Fault Description
According to the KPIs for an eNodeB at a site, the RRC connection setup success rate fluctuated
significantly within a period.
Fault Diagnosis
1.
The KPIs were checked. For local cell 1, the daily RRC connection setup success rate was only
52%.
Figure 4-11 PRS KPI about RRC connection setups
2.
The signaling over the Uu interface was traced. The result indicated that all RRC connection
setup failures occurred because the UEs did not respond. The following figure shows a snapshot
of the signaling traced over the Uu interface.
Figure 4-12 Signaling traced over the Uu interface
3.
Simulated load was added to the LTE side, and the impact of the DL LTE signals on the DL
GSM signals was tested. During the test, the call drop rate on the GSM side rose significantly,
which indicated that inter-modulation interference was highly probable.
4.
Online spectral scan was applied to the LTE side. Interference with a magnitude of 10 dB
was found within the high-frequency resource blocks (RBs), which affected signaling
transmission.
Figure 4-13 Online precise spectral scan result
5.
The site was investigated and the cause of the fault was located. The LTE and GSM sides
shared the same antennas. The antennas had aged and were inducing inter-modulation interference.
Fault Handling
The antennas were replaced. Then, the access success rate was restored.
5 Troubleshooting Intra-RAT Handover Faults
About This Chapter
This chapter describes how to diagnose and handle intra-RAT handover faults. RAT is short for
radio access technology.
5.1 Definitions of Intra-RAT Handover Faults
If an intra-RAT handover fault occurs, UEs have difficulty performing intra-RAT handovers
due to system faults.
5.2 Background Information
This section describes counters and alarms related to intra-RAT handover faults. In addition,
this section provides intra-RAT handover procedures.
5.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
5.4 Troubleshooting Intra-RAT Handover Faults Due to Hardware Faults
This section provides information required to troubleshoot intra-RAT handover faults due to
hardware faults. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
5.5 Troubleshooting Intra-RAT Handover Faults Due to Incorrect Data Configurations
This section provides information required to troubleshoot intra-RAT handover faults due to
incorrect data configurations. The information includes fault descriptions, background
information, possible causes, fault handling method and procedure, and typical cases.
5.6 Troubleshooting Intra-RAT Handover Faults Due to Target Cell Congestion
This section provides information required to troubleshoot intra-RAT handover faults due to
target cell congestion. The information includes fault descriptions, background information,
possible causes, fault handling method and procedure, and typical cases.
5.7 Troubleshooting Intra-RAT Handover Faults Due to Poor Uu Quality
This section provides information required to troubleshoot intra-RAT handover faults due to
poor Uu quality. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
5.1 Definitions of Intra-RAT Handover Faults
If an intra-RAT handover fault occurs, UEs have difficulty performing intra-RAT handovers
due to system faults.
5.2 Background Information
This section describes counters and alarms related to intra-RAT handover faults. In addition,
this section provides intra-RAT handover procedures.
Related Counters
l
Outgoing Handover Measurement (Cell)(HO.eRAN.Out.Cell)
l
Incoming Handover Measurement (Cell)(HO.eRAN.In.Cell)
For details, see eNodeB Performance Counter Reference.
Related Alarms
l
Board overload alarm
– ALM-26202 Board Overload
l
Alarms related to RF modules
– ALM-26529 RF Unit VSWR Threshold Crossed
– ALM-26522 RF Unit RX Channel RTWP/RSSI Unbalanced
l
Cell capability degraded alarm
– ALM-29243 Cell Capability Degraded
l
Alarms related to CPRI links
– ALM-26235 RF Unit Maintenance Link Failure
– ALM-26234 BBU CPRI Interface Error
– ALM-26233 BBU CPRI Optical Interface Performance Degraded
– ALM-26506 RF Unit Optical Interface Performance Degraded
l
Alarms related to clock sources
– ALM-26263 IP Clock Link Failure
– ALM-26264 System Clock Unlocked
– ALM-26538 RF Unit Clock Problem
– ALM-26260 System Clock Failure
– ALM-26265 Base Station Frame Number Synchronization Error
Handover Procedures
Handovers are classified as coverage-based, load-based, frequency-priority-based,
service-based, and UL-quality-based. For details, see eRAN Mobility Management in Connected Mode
Feature Parameter Description.
5.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
Possible Causes
There are various causes of handover faults, such as incorrect data configuration, hardware faults,
interference, and poor Uu quality. Therefore, to effectively diagnose a handover fault, analyze it
according to the scenario in which it occurs.
Table 5-1 shows possible causes of handover faults.
Table 5-1 Possible causes of handover faults
Scenario: The whole network experiences abnormalities.
Fault Description:
l The performance counters throughout the whole network are abnormal.
l The signaling exchange procedure is incorrect.
l Related alarms are reported.
Possible Causes:
l Network parameters are incorrectly configured.
Scenario: A single eNodeB experiences abnormalities.
Fault Description:
l The performance counters for the serving cell are abnormal.
l Related alarms are reported.
l Handovers to neighboring cells are seldom initiated.
l Handovers to neighboring cells are frequently initiated.
l The UE cannot receive handover commands from the network.
Possible Causes:
l Hardware is faulty.
l The target cell is congested.
l The Uu quality is poor.
l Parameters are set to inappropriate values.
Fault Analysis
The following measures are effective in locating a handover fault:
l
Analyzing handover-related performance counters
l
Investigating TopN cells
l
Checking alarms related to devices or data transmission
l
Checking the configurations of neighboring cells
l
Checking handover algorithm configurations
l
Investigating interference and cell coverage
To locate an intra-RAT handover fault, you are advised to select TopN cells with handover faults
and then follow the troubleshooting procedure shown in Figure 5-1.
Figure 5-1 Troubleshooting flowchart for intra-RAT handover faults
Troubleshooting Procedure
1.
Check whether the hardware is faulty.
Hardware faults are the most likely cause if handovers suddenly become abnormal without
recent modifications to the configurations of the abnormal cell and its neighboring cells.
Yes: Hardware faults are often accompanied by alarms. You are advised to handle the fault
by following the instructions on how to troubleshoot handover faults due to hardware faults.
Go to 2.
No: Go to 3.
2.
Check whether the fault is rectified.
Yes: End.
No: Go to 3.
3.
Check whether handover parameters are incorrectly configured.
Specifically, check whether handover thresholds and neighboring cell configurations are
incorrect.
Yes: Follow the instructions on how to troubleshoot handover faults due to incorrect data
configurations. Go to 4.
No: Go to 5.
4.
Check whether the fault is rectified.
Yes: End.
No: Go to 5.
5.
Check whether the service channel of the target cell is severely congested.
Use the service satisfaction rates of the target cell to make this determination.
Yes: Follow the instructions on how to troubleshoot handover faults due to target cell
congestion. Go to 6.
No: Go to 7.
6.
Check whether the fault is rectified.
Yes: End.
No: Go to 7.
7.
Check whether the Uu quality is poor.
Poor Uu quality will cause abnormal signaling exchanges, leading to handover failures.
Yes: Follow the instructions on how to troubleshoot handover faults due to poor Uu quality.
Go to 8.
No: Go to 9.
8.
Check whether the fault is rectified.
Yes: End.
No: Go to 9.
9.
Contact Huawei technical support.
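The procedure above repeats one pattern: check for a cause, handle it if present, verify whether the fault is rectified, and otherwise move to the next check. The following is a minimal sketch of that pattern, assuming each check can be expressed as a pair of callables. The function and parameter names are hypothetical and are not part of any Huawei tool.

```python
# Hypothetical sketch of the check / handle / verify pattern used by the procedure.
ESCALATE = "Contact Huawei technical support"


def run_procedure(checks, is_rectified):
    """Walk the checks in order and stop as soon as the fault is rectified.

    checks: list of (name, is_present, handle) tuples, in procedure order.
    is_rectified: callable returning True once the fault no longer occurs.
    Returns the name of the check whose handling rectified the fault,
    or ESCALATE if no check did.
    """
    for name, is_present, handle in checks:
        if is_present():
            handle()
            if is_rectified():
                return name
    return ESCALATE
```

Keeping the checks in a list preserves the order recommended by the procedure (hardware, data configuration, target cell congestion, Uu quality) and makes the escalation step the explicit fallback.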
5.4 Troubleshooting Intra-RAT Handover Faults Due to Hardware Faults
This section provides information required to troubleshoot intra-RAT handover faults due to
hardware faults. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
Fault Description
Typical hardware faults include faulty or overloaded boards, as well as abnormal radio frequency
(RF) modules or clock sources. If a hardware fault occurs, the cell degrades in capability or
even goes out of service. The following symptoms also appear:
l
Abnormal cell-level performance counters
– Increased service drop rate
– Decreased handover success rate
– Decreased access success rate
l
Related alarms
Background Information
Related Alarms
l
Board overload alarm
– ALM-26202 Board Overload
l
Alarms related to RF modules
– ALM-26529 RF Unit VSWR Threshold Crossed
– ALM-26522 RF Unit RX Channel RTWP/RSSI Unbalanced
l
Cell capability degraded alarm
– ALM-29243 Cell Capability Degraded
l
Alarms related to CPRI links
– ALM-26235 RF Unit Maintenance Link Failure
– ALM-26234 BBU CPRI Interface Error
– ALM-26233 BBU CPRI Optical Interface Performance Degraded
– ALM-26506 RF Unit Optical Interface Performance Degraded
l
Alarms related to clock sources
– ALM-26263 IP Clock Link Failure
– ALM-26264 System Clock Unlocked
– ALM-26538 RF Unit Clock Problem
– ALM-26260 System Clock Failure
– ALM-26265 Base Station Frame Number Synchronization Error
Possible Causes
Possible hardware faults that will cause handover faults are listed as follows:
l
A board is overloaded.
l
An RF module is faulty.
l
A common public radio interface (CPRI) link is faulty.
l
A clock source is faulty.
Fault Handling Flowchart
Figure 5-2 shows the fault handling flowchart for intra-RAT handover faults due to hardware
faults.
Figure 5-2 Fault handling flowchart for intra-RAT handover faults due to hardware faults
Fault Handling Procedure
1.
Check whether a hardware fault alarm is reported.
Yes: Handle the hardware fault alarm. Go to 2.
No: Go to 3.
2.
Check whether the fault is rectified.
Yes: End.
No: Go to 3.
3.
Contact Huawei technical support.
Typical Cases
Fault Description
Handovers between cell 0 and cell 2 under an eNodeB were normal with a high success rate, but
the handovers from cell 1 under the eNodeB to its neighboring cells were abnormal with a
relatively low success rate (7%) during busy hours.
Fault Diagnosis
1.
Alarms about the eNodeB were checked. Cell 1 had reported ALM-26529 RF Unit VSWR
Threshold Crossed.
2.
The customer's engineers confirmed that the eNodeB had been reconstructed recently.
Therefore, it was highly probable that the RF connections had become abnormal during the site
reconstruction.
3.
At the site, it was found that the jumper was not securely connected to the feeder, which
had caused the cell malfunction.
Fault Handling
The jumper was securely connected to the feeder. According to the KPI log, the inter-cell
handover success rate was restored.
5.5 Troubleshooting Intra-RAT Handover Faults Due to Incorrect Data Configurations
This section provides information required to troubleshoot intra-RAT handover faults due to
incorrect data configurations. The information includes fault descriptions, background
information, possible causes, fault handling method and procedure, and typical cases.
Fault Description
l
Handovers to neighboring cells are seldom initiated.
According to drive test results or signaling tracing results, the UE experiences relatively
low signal quality in its serving cell. The signal level of neighboring cells meets the
threshold for a handover, but handovers occur with a low probability. This leads to a high
service drop rate.
l
Handovers to neighboring cells are frequently initiated.
The signal level and quality of neighboring cells are almost the same as those of the serving
cell, but handovers to the neighboring cells are frequently initiated. This leads to poor
quality of voice services and a high probability of service drops.
Background Information
None
Possible Causes
l
Configurations of neighboring cells are incorrect.
If neighboring cells are not configured or incorrectly configured, handovers cannot be
triggered even after the UE reports measurements of these neighboring cells.
l
The terrestrial link (X2 interface) is incorrectly configured.
If an X2 interface is incorrectly configured, handovers to some neighboring cells cannot
be successfully executed. For example, if the IP path for an X2 interface is incorrectly
configured, X2-based inter-eNodeB handovers cannot be executed; or, if the IP path from
the target eNodeB to the source serving gateway (S-GW) is not configured, X2-based
inter-S-GW handovers cannot be executed.
l
Parameters such as handover thresholds, hysteresis, and time-to-trigger are inappropriately
configured.
A handover is triggered only when the signal level of a neighboring cell
is higher than that of the serving cell by at least a certain amount. As a
result, if handover parameters (such as the threshold, cell individual offsets [CIOs],
hysteresis, and time-to-trigger) are inappropriately set, the probability of triggering
handovers is either significantly low or significantly high.
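The interaction of offsets, hysteresis, and time-to-trigger can be illustrated with the event A3 entering condition defined in 3GPP TS 36.331: Mn + Ofn + Ocn - Hys > Mp + Ofp + Ocp + Off. The following is a minimal sketch, assuming periodic measurement samples and ignoring layer 3 filtering and the leaving condition. The function names and sample values are illustrative, not eNodeB parameter names.

```python
def a3_entering(mn, mp, *, ofn=0.0, ocn=0.0, ofp=0.0, ocp=0.0, hys=0.0, off=0.0):
    """Return True if the event A3 entering condition holds for one sample.

    mn, mp: neighboring-cell and serving-cell measurements (RSRP in dBm).
    ofn/ofp: frequency-specific offsets, ocn/ocp: cell individual offsets,
    hys: hysteresis, off: A3 offset (all in dB).
    """
    return mn + ofn + ocn - hys > mp + ofp + ocp + off


def a3_triggered(samples, time_to_trigger_ms, period_ms, **offsets):
    """Return True once the entering condition has held for time-to-trigger.

    samples: sequence of (mn, mp) pairs taken every period_ms milliseconds.
    """
    start = None  # index of the first sample of the current satisfied run
    for i, (mn, mp) in enumerate(samples):
        if a3_entering(mn, mp, **offsets):
            if start is None:
                start = i
            if (i - start) * period_ms >= time_to_trigger_ms:
                return True
        else:
            start = None  # condition broken, restart the timer
    return False
```

In this sketch, a larger hysteresis, A3 offset, or time-to-trigger delays or suppresses the trigger, which corresponds to handovers that are seldom initiated. Values that are too small have the opposite effect and can lead to frequent handovers.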
Fault Handling Flowchart
Figure 5-3 shows the fault handling flowchart for intra-RAT handover faults due to incorrect
data configurations.
Figure 5-3 Fault handling flowchart for intra-RAT handover faults due to incorrect data
configurations
Fault Handling Procedure
1.
Check whether the terrestrial link is incorrectly configured.
Yes: Correct the terrestrial link configuration. Go to 2.
No: Go to 3.
2.
Check whether the fault is rectified.
Yes: End.
No: Go to 3.
3.
Check whether there are missing configurations of neighboring cells.
Yes: Complete neighboring cell configurations. Go to 4.
No: Go to 5.
4.
Check whether the fault is rectified.
Yes: End.
No: Go to 5.
5.
Check whether handover parameters are incorrectly configured.
Yes: Correct their configurations. Go to 6.
No: Go to 7.
6.
Check whether the fault is rectified.
Yes: End.
No: Go to 7.
7.
Contact Huawei technical support.
Typical Cases
Fault Description
During a drive test, a UE did not receive any handover commands after sending A3 measurement
reports to the eNodeB. Ultimately, the service was dropped.
Fault Diagnosis
1.
According to Huawei maintenance personnel, these A3 measurement reports were
successfully received by the source eNodeB. Later, the source eNodeB sent a Handover
Request message through the X2 interface to the target eNodeB, but the target eNodeB
responded with a Handover Failure message containing a cause value indicating
unavailable transport resources.
2.
The signaling over the X2 interface was traced and was found to be normal.
3.
The configuration of the IPPATH MO for the X2 interface was checked and an
inconsistency was found. The adjacent node ID specified in the IPPATH MO was different
from the X2 interface ID, which caused a resource request failure and ultimately a handover
failure.
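The inconsistency found in step 3 can be detected offline by comparing exported configuration records. The following is a minimal sketch, assuming the IP path and X2 interface configurations have been exported as lists of dictionaries. The keys adjacent_node_id and x2_interface_id are hypothetical and do not reflect the actual MO attribute names.

```python
def find_ippath_mismatches(ip_paths, x2_interfaces):
    """Return the IP paths whose adjacent node ID matches no X2 interface ID.

    ip_paths: list of dicts with a hypothetical "adjacent_node_id" key.
    x2_interfaces: list of dicts with a hypothetical "x2_interface_id" key.
    """
    x2_ids = {x2["x2_interface_id"] for x2 in x2_interfaces}
    return [path for path in ip_paths if path["adjacent_node_id"] not in x2_ids]
```

An empty result means that every exported IP path refers to a configured X2 interface. Any returned record is a candidate for the kind of mismatch described in this case.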
Fault Handling
The configuration of the IPPATH MO was corrected. Then, the test was conducted again and
the UE was successfully handed over to the target cell.
5.6 Troubleshooting Intra-RAT Handover Faults Due to Target Cell Congestion
This section provides information required to troubleshoot intra-RAT handover faults due to
target cell congestion. The information includes fault descriptions, background information,
possible causes, fault handling method and procedure, and typical cases.
Fault Description
The service satisfaction rate in the target cell is lower than the admission threshold for
handed-over services. As a result, the target eNodeB rejects requests for handovers to the target cell.
The service satisfaction rate in a cell can be viewed on the M2000.
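The rejection rule described above can be sketched as a per-QCI comparison between the service satisfaction rate and the admission threshold for handed-over services. This is an illustration only, assuming both values are available as fractions keyed by QCI. The function name and data layout are hypothetical, and the actual admission algorithm may consider more inputs.

```python
def rejected_qcis(satisfaction_by_qci, threshold_by_qci):
    """Return the QCIs for which handed-over services would be rejected.

    A QCI is rejected when its satisfaction rate in the target cell is lower
    than the admission threshold for handed-over services of that QCI.
    Both arguments map QCI -> value in the range 0.0 to 1.0.
    """
    return sorted(
        qci
        for qci, rate in satisfaction_by_qci.items()
        if rate < threshold_by_qci[qci]
    )
```

Comparing the rates exported from the M2000 against the configured thresholds in this way shows which service classes are affected before any parameters are changed.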
Background Information
None
Possible Causes
l
The number of UEs in the target cell surges due to gatherings or events.
l
A large number of UEs have been handed over to the target cell due to inappropriate
parameter configurations.
Fault Handling Flowchart
Figure 5-4 shows the fault handling flowchart for intra-RAT handover faults due to target cell
congestion.
Figure 5-4 Fault handling flowchart for intra-RAT handover faults due to target cell congestion
Fault Handling Procedure
1.
Check whether the handover fails due to target cell congestion.
Yes: Expand the capacity of the target cell or tune the network optimization parameters of
the target cell. Go to 2.
No: Go to 3.
2.
Check whether the fault is rectified.
Yes: End.
No: Go to 3.
3.
Contact Huawei technical support.
Typical Cases
Fault Description
During a period, all handovers to a cell failed.
Fault Diagnosis
1.
The cell coverage was checked. No coverage hole was found.
2.
The RF module serving the cell was checked. No fault was found.
3.
Signaling tracing for a single UE indicated that the service satisfaction rate in the cell was
always low (lower than the admission thresholds for handed-over services with QCIs
ranging from 1 to 4) when a handover failure message appeared. Therefore, these handovers
failed because the traffic channel in the cell was so congested that no resources were
available for new handed-over services.
Fault Handling
Engineers of the customer were advised to expand the cell capacity or reduce the number of UEs
in the cell by modifying handover parameter configurations. After the corresponding measure was
taken, the success rate of handovers to the cell became normal.
5.7 Troubleshooting Intra-RAT Handover Faults Due to Poor Uu Quality
This section provides information required to troubleshoot intra-RAT handover faults due to
poor Uu quality. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
Fault Description
Two symptoms may occur when the Uu quality is poor. One is that the UE cannot receive any
handover commands from the eNodeB. The other is that the UE cannot access the target cell and
cannot report the handover complete message.
Background Information
Checking interference
1.
Start a cell interference detection task and check the performance counter indicating the
uplink (UL) signal quality. If high UL modulation and coding scheme (MCS) orders seldom
appear, it is highly probable that interference to the cell exists.
2.
Start the UE spectral scanning function and further determine whether the interference
originates from neighboring cells or external systems.
Checking cell coverage
l
Check for weak coverage.
If the reference signal received power (RSRP) values reported by UEs during handovers
are mostly lower than -115 dBm, weak-coverage areas exist in the cell.
l
Check for wide coverage and cross-cell coverage.
Wide coverage and over-coverage can be checked by analyzing the actual radius of cell
coverage and signal quality variation in the cell.
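The weak-coverage check can be sketched as a simple share calculation over the RSRP values reported during handovers. The sketch below treats the -115 threshold as dBm, the usual unit for RSRP, and interprets "mostly" as more than half of the samples. Both the interpretation and the function names are assumptions.

```python
def weak_coverage_share(rsrp_dbm, threshold_dbm=-115.0):
    """Return the fraction of RSRP samples below the weak-coverage threshold."""
    if not rsrp_dbm:
        return 0.0
    below = sum(1 for value in rsrp_dbm if value < threshold_dbm)
    return below / len(rsrp_dbm)


def has_weak_coverage(rsrp_dbm, threshold_dbm=-115.0, majority=0.5):
    """Return True if more than `majority` of the samples are below the threshold."""
    return weak_coverage_share(rsrp_dbm, threshold_dbm) > majority
```

For example, has_weak_coverage([-120, -118, -110]) returns True because two of the three samples are below -115 dBm.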
Checking imbalance between UL and DL quality
Imbalance between UL and DL quality is classified into two situations: lower UL quality and
lower DL quality.
l
Check whether the transmit power of the RRU and UE falls within link budgets.
l
Check the actual UL and DL coverage by using drive tests.
Checking the antenna system
l
Check whether the jumper is reversely connected to the feeder.
Analyze the drive test data. If the UL signal level is different from the DL signal level in
the cell and UEs at cell edge easily encounter handover failures, the jumper is reversely
connected to the feeder and needs to be corrected.
l
Check whether the feeder is in poor physical condition.
If a feeder is damaged, water-immersed, bent, or not securely connected, the transmit
power and receive sensitivity are decreased and severe service drops occur. In this case,
the feeder needs to be replaced promptly. For details, see ALM-26529 RF Unit VSWR Threshold
Crossed.
l
Check whether the tilts and azimuths of two antennas are the same.
Possible Causes
The following Uu problems may cause handover faults:
l
Interference
l
Unsatisfactory coverage
l
Imbalance between UL and DL quality
l
Antenna system faults
Fault Handling Flowchart
To effectively diagnose handover faults due to poor Uu quality, first determine whether
the fault is caused by interference or unsatisfactory coverage. Figure 5-5 shows
the fault handling flowchart for intra-RAT handover faults due to poor Uu quality.
Figure 5-5 Fault handling flowchart for intra-RAT handover faults due to poor Uu quality
Fault Handling Procedure
1.
Check whether interference exists. By using a UE spectral scanner, check whether there is
DL interference from neighboring cells or external systems. By analyzing the cell
interference detection result, check whether there is UL interference.
Yes: Remove the interference. Go to 2.
No: Go to 3.
2.
Check whether the fault is rectified.
Yes: End.
No: Go to 3.
3.
Check whether cell coverage is abnormal.
Yes: Improve cell coverage. Go to 4.
No: Go to 5.
4.
Check whether the fault is rectified.
Yes: End.
No: Go to 5.
5.
Check whether there is imbalance between UL and DL quality. Specifically, check whether
the transmit power of the RRU or UE falls outside the link budget.
Yes: Remove the imbalance between UL and DL quality. Go to 6.
No: Go to 7.
6.
Check whether the fault is rectified.
Yes: End.
No: Go to 7.
7.
Check whether there is a fault in the antenna system.
Yes: Adjust the antenna system. Go to 8.
No: Go to 9.
8.
Check whether the fault is rectified.
Yes: End.
No: Go to 9.
9.
Contact Huawei technical support.
Typical Cases
None
6 Troubleshooting Service Drops
About This Chapter
This chapter describes the method and procedure for troubleshooting service drops in the Long
Term Evolution (LTE) system. It also provides the definitions of service drops and related key
performance indicator (KPI) formulas.
6.1 Definitions of Service Drops
The service drop rate is an important key performance indicator (KPI) for radio networks. It
indicates the ratio of the number of dropped services to the total number of services. A high
service drop rate cannot meet user requirements.
6.2 Background Information
This section provides background information for service drops. The background information
includes the formula used to calculate the service drop rate, counters and alarms related to service
drops, and drive tests and TopN cell analysis method for troubleshooting service drops.
6.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
6.4 Troubleshooting Service Drops Due to Radio Faults
This section provides information required to troubleshoot service drops due to radio faults. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
6.5 Troubleshooting Service Drops Due to Transmission Faults
This section provides information required to troubleshoot service drops due to transmission
faults. The information includes fault descriptions, background information, possible causes,
fault handling method and procedure, and typical cases.
6.6 Troubleshooting Service Drops Due to Congestion
This section provides information required to troubleshoot service drops due to congestion. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
6.7 Troubleshooting Service Drops Due to Handover Failures
This section provides information required to troubleshoot service drops due to handover faults.
The information includes fault descriptions, background information, possible causes, fault
handling method and procedure, and typical cases.
6.8 Troubleshooting Service Drops Due to MME Faults
This section provides information required to troubleshoot service drops due to MME faults.
The information includes fault descriptions, background information, possible causes, fault
handling method and procedure, and typical cases.
6.1 Definitions of Service Drops
The service drop rate is an important key performance indicator (KPI) for radio networks. It
indicates the ratio of the number of dropped services to the total number of services. A high
service drop rate cannot meet user requirements.
A service drop is counted each time the eNodeB sends an E-RAB RELEASE INDICATION or
UE CONTEXT RELEASE REQUEST message to the MME with a release cause other than
Normal Release, Detach, User Inactivity, cs fallback triggered, or Inter-RAT
redirection after an E-UTRAN radio access bearer (E-RAB) has been successfully set up for a
UE.
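The counting rule can be sketched as a filter on release causes. The following is a minimal illustration, assuming the causes of eNodeB-initiated releases have already been extracted from a signaling trace as strings spelled as in the text above. Real traces may encode causes differently, and the function names are hypothetical.

```python
# Release causes that are not counted as service drops, as listed in the text.
NORMAL_CAUSES = frozenset({
    "Normal Release",
    "Detach",
    "User Inactivity",
    "cs fallback triggered",
    "Inter-RAT redirection",
})


def is_service_drop(release_cause):
    """Return True if a release with this cause counts as a service drop."""
    return release_cause not in NORMAL_CAUSES


def count_service_drops(release_causes):
    """Count the releases whose cause is outside the normal-release set."""
    return sum(1 for cause in release_causes if is_service_drop(cause))
```

Any cause outside the listed set is counted, so a cause string that is new or misspelled in the trace is treated as a drop rather than silently ignored.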
6.2 Background Information
This section provides background information for service drops. The background information
includes the formula used to calculate the service drop rate, counters and alarms related to service
drops, and drive tests and TopN cell analysis method for troubleshooting service drops.
An E-UTRAN radio access bearer (E-RAB) is a bearer on the access stratum (AS) for carrying
service data of UEs. An E-RAB release is the process of releasing the bearer resources for a UE,
and it reflects the capability of a cell to release bearer resources for UEs. Each E-RAB release
is counted once.
Related Counters
E-RAB Release Measurement (Cell) (E-RAB.Rel.Cell)
Counters related to service drops are classified as follows:
l
Release types
– Normal releases
– Abnormal releases
– Normal releases for outgoing handovers
– Abnormal releases for outgoing handovers
l
QoS class identifier (QCI)
– QCIs of 1 to 9
l
Abnormal release causes
– Radio faults (L.E-RAB.AbnormRel.Radio)
If the percentage of abnormal E-RAB releases due to radio faults to all abnormal
E-RAB releases is greater than 30%, you need to check whether the network planning,
such as the physical cell identifier (PCI) and neighboring cell planning, is proper.
– Transmission faults (L.E-RAB.AbnormRel.TNL)
If the percentage of abnormal E-RAB releases due to transmission faults to all abnormal
E-RAB releases is greater than 30%, you need to check whether the transmission links
over the S1/X2 interface experience exceptions such as intermittent disconnections.
– Congestion (L.E-RAB.AbnormRel.Cong)
If the percentage of abnormal E-RAB releases due to congestion to all abnormal E-RAB
releases is greater than 30%, you need to check whether congestion occurs in the cell.
– Handover failures (L.E-RAB.AbnormRel.HOFailure)
If the percentage of abnormal E-RAB releases due to handover failures to all abnormal
E-RAB releases is greater than 30%, you need to check whether parameters are properly
set for the neighboring cells.
– MME faults (L.E-RAB.AbnormRel.MME)
If the percentage of abnormal E-RAB releases due to mobility management entity
(MME) faults to all abnormal E-RAB releases is greater than 30%, you need to check
whether parameters are properly set for the evolved packet core (EPC).
For details, see eNodeB Performance Counter Reference.
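The 30% rule for abnormal release causes can be applied to exported counter values as follows. This is a minimal sketch, assuming the counters for one cell and one period are available as a dictionary keyed by counter name and that L.E-RAB.AbnormRel holds the total number of abnormal releases. The function name and data layout are hypothetical.

```python
CAUSE_COUNTERS = (
    "L.E-RAB.AbnormRel.Radio",
    "L.E-RAB.AbnormRel.TNL",
    "L.E-RAB.AbnormRel.Cong",
    "L.E-RAB.AbnormRel.HOFailure",
    "L.E-RAB.AbnormRel.MME",
)


def dominant_causes(counters, threshold=0.30):
    """Return {counter name: share} for causes above the threshold share.

    counters: dict mapping counter names to values for one cell and period.
    The share is the cause counter divided by L.E-RAB.AbnormRel.
    """
    total = counters.get("L.E-RAB.AbnormRel", 0)
    if total <= 0:
        return {}
    shares = {name: counters.get(name, 0) / total for name in CAUSE_COUNTERS}
    return {name: share for name, share in shares.items() if share > threshold}
```

Each counter returned points to the corresponding check described above, for example network planning for radio faults or the S1/X2 transmission links for transmission faults.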
Formula
The service drop rate is calculated based on services, not on UEs. For example, if services are
set up on multiple data radio bearers (DRBs) for a UE and all these services experience
drops, multiple service drops are counted.
The formula for calculating the service drop rate is as follows:
Service drop rate = L.E-RAB.AbnormRel/(L.E-RAB.AbnormRel + L.E-RAB.NormRel)
Where,
l
The L.E-RAB.AbnormRel counter measures the total number of abnormal E-RAB releases
in a cell.
l
The L.E-RAB.NormRel counter measures the total number of normal E-RAB releases in
a cell.
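The formula can be written as a small helper for use on exported counter values. This is a minimal sketch. The function name is hypothetical, and returning None when no releases were recorded is a design choice of this sketch, not something the formula specifies.

```python
def service_drop_rate(abnorm_rel, norm_rel):
    """Compute L.E-RAB.AbnormRel / (L.E-RAB.AbnormRel + L.E-RAB.NormRel).

    Returns None when no E-RAB releases were recorded, because the rate is
    undefined in that case.
    """
    total = abnorm_rel + norm_rel
    if total == 0:
        return None
    return abnorm_rel / total
```

For example, 5 abnormal releases and 995 normal releases give a service drop rate of 0.005, that is, 0.5%.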
Drive Test
To identify service drops in drive tests, you need to check logs and signaling procedures on the
UE side.
For details, see the related UE user guide.
TopN Cell Selection
TopN cells must be selected according to the following rules:
l
The service drop rate of each topN cell must be higher than the average service drop
rate of the whole network.
l
Cells are sequenced in descending order based on the number of abnormal E-RAB releases.
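The two selection rules can be combined into one short routine. The following is a minimal sketch, assuming per-cell release counters are available as a dictionary and interpreting the average service drop rate of the whole network as the total abnormal releases divided by the total releases. The function name and data layout are hypothetical.

```python
def select_top_n(cells, n):
    """Select topN cells for service drop analysis.

    cells: dict mapping cell ID -> (abnormal releases, normal releases).
    Keeps only cells whose drop rate exceeds the network-wide rate, then
    orders them by the number of abnormal releases, highest first.
    """
    total_abnormal = sum(abnormal for abnormal, _ in cells.values())
    total = sum(abnormal + normal for abnormal, normal in cells.values())
    if total == 0:
        return []
    network_rate = total_abnormal / total
    candidates = [
        (cell_id, abnormal)
        for cell_id, (abnormal, normal) in cells.items()
        if abnormal + normal > 0 and abnormal / (abnormal + normal) > network_rate
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [cell_id for cell_id, _ in candidates[:n]]
```

Ordering by the number of abnormal releases rather than by rate keeps low-traffic cells with a few drops from crowding out busy cells where most drops actually occur.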
Related Alarms
None
6.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
Possible Causes
If the service drop rate increases or greatly fluctuates, you must first locate the faults and then
handle the faults accordingly. Table 6-1 describes possible causes of service drops.
Table 6-1 Possible causes of service drops
Type: The whole network experiences abnormalities.
Fault Description:
l The service drop rate of the whole network is abnormal.
l Related alarms are reported.
Possible Causes:
l Data transmission is abnormal.
l Network planning is improper.
l The evolved packet core (EPC) works abnormally.
Type: A single eNodeB experiences abnormalities.
Fault Description:
l The service drop rate of a cell is abnormal.
l Related alarms are reported.
Possible Causes:
l Data transmission is abnormal.
l Network planning is improper.
l Resources are insufficient.
l Weak coverage or interference exists.
l The EPC works abnormally.
Troubleshooting Flowchart
To troubleshoot service drops, you are advised to select topN cells with service drops and then
follow the troubleshooting procedure shown in Figure 6-1.
Figure 6-1 Troubleshooting flowchart for service drops
Troubleshooting Procedure
Troubleshooting service drops of the whole network
1. Check whether the whole network has experienced operations such as cutover, replacement,
   upgrade, or patch installation.
2. Check whether the eNodeB parameters, such as timers or algorithm switches, have been
   modified.
3. Check whether the traffic volume sharply increases.
   The traffic volume trend of the whole network can be determined based on the number of
   E-RAB setup attempts and successful E-RAB setups. Check whether there are activities
   such as number allocation or important holidays that may lead to a traffic volume increase.
4. Check whether the versions or parameters of the EPC network elements (NEs) have been
   modified.
Troubleshooting service drops of the topN cells
1. Check whether the topN cells have experienced operations such as cutover or relocation.
2. Check whether the topN cells have experienced operation and maintenance (OM)
   operations such as cell deactivation or board restart.
3. Check whether the traffic volume sharply increases.
   The traffic volume trend of a topN cell can be determined based on the number of E-RAB
   setup attempts and successful E-RAB setups. Check whether there are activities such as
   concerts or sports that may lead to a traffic volume increase.
4. Check whether the cell parameters have been modified, such as the maximum number of
   acknowledged mode (AM) protocol data unit (PDU) retransmissions by the UE or eNodeB,
   or the UE inactivity timer length.
5. Check whether the versions or parameters of the EPC NEs corresponding to the topN cells
   have been modified.
6.4 Troubleshooting Service Drops Due to Radio Faults
This section provides information required to troubleshoot service drops due to radio faults. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
Fault Description
According to the definitions of eNodeB performance counters, the L.E-RAB.AbnormRel.Radio
counter measures the number of abnormal E-RAB releases due to radio interface faults in non-handover scenarios.
Related Information
None
Possible Causes
Abnormal E-RAB releases due to radio faults occur when the number of Radio Link Control (RLC)
retransmissions reaches the maximum, the UE loses uplink synchronization, or a signaling
procedure fails. These events result from weak coverage, uplink interference, or UE
exceptions.
Fault Handling Flowchart
None
Fault Handling Procedure
1. Check whether UEs are mostly located in weak coverage areas.
   Check the values of the counters related to different channel quality indicator (CQI) levels
   and modulation and coding scheme (MCS) orders to determine whether low-level CQIs
   and low-order MCSs are mostly used.
   Yes: Confirm the cell coverage by using drive tests, and then adjust the weak coverage
   accordingly. Go to 2.
   No: Go to 3.
2. Check whether the fault is rectified.
   Yes: End.
   No: Go to 3.
3. Check whether uplink interference exists.
   Yes: Remove the interference source. Go to 4.
   No: Go to 5.
4. Check whether the fault is rectified.
   Yes: End.
   No: Go to 5.
5. Contact Huawei technical support.
Typical Cases
None
6.5 Troubleshooting Service Drops Due to Transmission Faults
This section provides information required to troubleshoot service drops due to transmission
faults. The information includes fault descriptions, background information, possible causes,
fault handling method and procedure, and typical cases.
Fault Description
According to the definitions of eNodeB performance counters, the L.E-RAB.AbnormRel.TNL
counter measures the number of abnormal E-RAB releases due to faults at the transport network
layer.
Related Information
None
Possible Causes
Abnormal E-RAB releases due to transmission faults are caused by transmission exceptions
between the eNodeB and the MME. For example, the transmission link over the S1 interface
experiences intermittent disconnections.
Fault Handling Flowchart
None
Fault Handling Procedure
Check whether transmission-related alarms are reported. If any, clear the reported alarms. Then,
check whether the corresponding counter has a proper value.
1. Check whether transmission-related alarms are reported on the M2000 client.
   Yes: Clear the alarms by referring to the instructions in the alarm reference. Go to 2.
   No: Go to 3.
2. Check whether the fault is rectified.
   Yes: End.
   No: Go to 3.
3. Contact Huawei technical support.
Typical Cases
None
6.6 Troubleshooting Service Drops Due to Congestion
This section provides information required to troubleshoot service drops due to congestion. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
Fault Description
According to the definitions of eNodeB performance counters, the L.E-RAB.AbnormRel.Cong
counter measures the number of abnormal E-RAB releases due to resource congestion.
Related Information
None
Possible Causes
Abnormal E-RAB releases due to congestion are caused by congestion of radio resources on the
eNodeB side. For example, the radio resources are insufficient if the number of UEs reaches the
upper limit.
Fault Handling Flowchart
None
Fault Handling Procedure
If service drops due to congestion occur in a topN cell for a long time, mobility load balancing
(MLB) can be enabled to temporarily reduce the cell load. In the long term, the cell requires
capacity expansion. After rectifying the congestion fault, check whether the corresponding
counter has a proper value.
1. Turn on the switch for the MLB algorithm, and then check whether the congestion fault is
   rectified.
   Yes: End.
   No: Go to 2.
2. Contact Huawei technical support.
Typical Cases
None
6.7 Troubleshooting Service Drops Due to Handover Failures
This section provides information required to troubleshoot service drops due to handover faults.
The information includes fault descriptions, background information, possible causes, fault
handling method and procedure, and typical cases.
Fault Description
According to the definitions of eNodeB performance counters, the L.E-RAB.AbnormRel.HOFailure
counter measures the number of abnormal E-RAB releases due to outgoing handover failures.
Related Information
Counters related to outgoing handovers to a specific cell:
• Number of Inter-Specific Cell Outgoing Handover Attempts (L.HHO.NCell.PrepAttOut)
• Number of Performed Inter-Specific Cell Outgoing Handovers (L.HHO.NCell.ExecAttOut)
• Number of Successful Outgoing Handovers Between Two Specific Cells (L.HHO.NCell.ExecSuccOut)
• Number of Ping-Pong Handovers Between Two Specific Cells (L.HHO.Ncell.PingPongHo)
Possible Causes
Abnormal E-RAB releases due to handover failures are caused by failures of handovers from
the local cell to another cell.
Fault Handling Flowchart
None
Fault Handling Procedure
If service drops due to outgoing handover failures increase in a topN cell, you can identify the
causes based on the counters related to outgoing handovers to specific cells.
1. Obtain the related counters.
   Calculate the number of handover failures from the topN cell to each specific target cell
   and find out the target cell that has the highest number of handover failures. Then, check
   the parameter settings related to the neighbor relationship with this target cell. If the
   parameter settings are improper, optimize the parameter settings as required.
2. Check whether the fault is rectified.
   Yes: End.
   No: Go to 3.
3. Contact Huawei technical support.
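Step 1 of this procedure can be sketched in Python as follows. The sketch assumes that the number of handover failures toward a target cell is the number of performed handovers (L.HHO.NCell.ExecAttOut) minus the number of successful handovers (L.HHO.NCell.ExecSuccOut). This guide does not state that formula, so treat it as an assumption. The function name, cell IDs, and counter values are hypothetical.

```python
from typing import Dict, Tuple


def worst_target_cell(per_target: Dict[str, Tuple[int, int]]) -> Tuple[str, int]:
    """per_target maps a target cell ID to (ExecAttOut, ExecSuccOut).

    The mapping must contain at least one target cell.
    """
    # Assumption: failures = performed handovers - successful handovers.
    failures = {cell: att - succ for cell, (att, succ) in per_target.items()}
    worst = max(failures, key=failures.get)
    return worst, failures[worst]


# Hypothetical counters from one topN cell toward three neighbors.
sample = {
    "cell_101": (200, 195),
    "cell_102": (150, 120),
    "cell_103": (80, 79),
}
print(worst_target_cell(sample))
```

The neighbor relationship with the returned target cell is then the first one whose parameter settings are worth reviewing.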
Typical Cases
None
6.8 Troubleshooting Service Drops Due to MME Faults
This section provides information required to troubleshoot service drops due to MME faults.
The information includes fault descriptions, background information, possible causes, fault
handling method and procedure, and typical cases.
Fault Description
According to the definitions of eNodeB performance counters, the L.E-RAB.AbnormRel.MME
counter measures the number of abnormal E-RAB releases that are initiated by the evolved
packet core (EPC). However, these abnormal releases are not included in the value of the
L.E-RAB.AbnormRel counter.
Related Information
None
Possible Causes
Abnormal E-RAB releases due to MME faults are initiated by the EPC when UEs are performing
services.
Fault Handling Flowchart
None
Fault Handling Procedure
MME faults must be identified on the EPC side.
1. Obtain the S1 tracing messages related to the topN cell and analyze specific release causes.
2. Collect the analysis result and information about the signaling procedure and then contact
   EPC engineers.
3. Check whether the fault is rectified.
   Yes: End.
   No: Go to 4.
4. Contact Huawei technical support.
Typical Cases
None
7 Troubleshooting Inter-RAT Handover Faults
About This Chapter
This chapter defines inter-RAT handover faults, describes handover principles, and provides the
fault handling method and procedure.
First office application (FOA) and commercial Long Term Evolution (LTE) networks are being
deployed on a large scale. GSM/EDGE radio access network (GERAN) and universal terrestrial
radio access network (UTRAN) will coexist with LTE networks for a long time. This requires
operators to use effective inter-RAT policies that protect GERAN and UTRAN resources
while providing rich services.
7.1 Definitions of Inter-RAT Handover Faults
Inter-RAT handover faults are system faults that cause handover initiation failure or handover
failure. RAT is short for radio access technology.
7.2 Background Information
This section provides background information about inter-RAT handover faults. The
background information includes counters, handover types, handover procedures, and related
formulas.
7.3 Troubleshooting Inter-RAT Handovers
This section provides information required to troubleshoot inter-RAT handover faults. The
information includes fault descriptions, background information, possible causes, and fault
handling method and procedure.
7.1 Definitions of Inter-RAT Handover Faults
Inter-RAT handover faults are system faults that cause handover initiation failure or handover
failure. RAT is short for radio access technology.
7.2 Background Information
This section provides background information about inter-RAT handover faults. The
background information includes counters, handover types, handover procedures, and related
formulas.
Related Counters
Inter-RAT Outgoing Handover Measurement (Cell) (HO.IRAT.Out.Cell)
For details, see eNodeB Performance Counter Reference.
Handover Types and Procedures
For details, see eRAN Mobility Management in Connected Mode Feature Parameter
Description and 3GPP TS 23.401.
Related Formulas
Handover Success Rate | Formula
Success rate of handovers from evolved universal terrestrial radio access network (E-UTRAN) to Wideband Code Division Multiple Access (WCDMA) networks | Number of Successful Outgoing Handovers from E-UTRAN to UTRAN / Number of Outgoing Handover Attempts from E-UTRAN to UTRAN
Success rate of handovers from E-UTRAN to GSM/EDGE radio access network (GERAN) | Number of Successful Outgoing Handovers from E-UTRAN to GERAN / Number of Outgoing Handover Attempts from E-UTRAN to GERAN
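Both success rates follow the same pattern, which can be sketched in Python. The function name and the sample values are hypothetical, and returning 0 when there are no attempts is an assumption made to avoid a division by zero.

```python
def irat_ho_success_rate(successes: int, attempts: int) -> float:
    """Return successful outgoing handovers divided by handover attempts."""
    if attempts == 0:
        # Assumption: with no attempts in the period, report a rate of 0.
        return 0.0
    return successes / attempts


# Hypothetical counter values, not measured data.
to_utran = irat_ho_success_rate(successes=470, attempts=500)
to_geran = irat_ho_success_rate(successes=90, attempts=100)
print(f"E-UTRAN to UTRAN: {to_utran:.1%}")
print(f"E-UTRAN to GERAN: {to_geran:.1%}")
```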
7.3 Troubleshooting Inter-RAT Handovers
This section provides information required to troubleshoot inter-RAT handover faults. The
information includes fault descriptions, background information, possible causes, and fault
handling method and procedure.
Fault Description
The following are symptoms of inter-RAT handover faults:
• Users file service drop complaints.
• The success rate of outgoing inter-RAT handovers is low.
• Signaling message tracing results indicate that handover procedures are incomplete or fail.
Related Information
• UE capability for inter-frequency handover
  Monitor the network access signaling procedure on the eNodeB to check whether the UE
  supports inter-frequency handovers. This information can be obtained from the Feature
  group indicators IE in the UECapabilityInformation message.
  According to 3GPP TS 36.331 B.1 Feature group indicators, the eighth and ninth indicators
  indicate whether a UE supports packet switched (PS) handovers to UTRAN and GERAN,
  respectively. Table 7-1 lists the eighth and ninth indicators.
Table 7-1 B.1 "Feature group indicators" in 3GPP TS 36.331
Indicator | Event | Description
8 | EUTRA RRC_CONNECTED to UTRA CELL_DCH PS handover | Can only be set to "true" if the UE has set bit number 22 to "true"
9 | EUTRA RRC_CONNECTED to GERAN GSM_Dedicated handover | Related to SR-VCC. Can only be set to "true" if the UE has set bit number 23 to "true"
If the value of the eighth and ninth indicators is 0, the UE does not support PS handovers.
If the value of the eighth and ninth indicators is 1, the UE supports PS handovers.
• Neighboring cell configuration check
  Inter-RAT neighboring cells must be configured before handovers can be performed from
  evolved universal terrestrial radio access network (E-UTRAN) to UTRAN/GERAN. Use
  the related commands provided by Huawei to configure inter-RAT neighboring cells.
  – Check for missing and redundant neighboring cell configurations.
    Check whether the routing area code (RAC) is configured when an external UTRAN
    cell is added by running the ADD UTRANEXTERNALCELL command, and whether
    NoHoFlag is set to Permit Ho when a neighboring relationship is added by running
    the ADD UTRANNCELL or ADD GERANNCELL command.
  – Check handover parameter settings.
    Check whether handover threshold parameters are properly set by comparing the
    settings with the default values or settings for a cell where handovers are normal.
    The threshold settings can be queried by using the LST
    INTERRATHOUTRANGROUP and LST INTERRATHOGERANGROUP
    commands.
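The UE capability check described under Related Information can be sketched in Python. The sketch assumes that the 32-bit feature group indicators field from the UECapabilityInformation message has already been rendered as a string of 0s and 1s, with indicator 1 as the leftmost bit. All function names and the sample bit string are hypothetical. The sketch reads indicators 8 and 9 only and does not enforce the dependencies on bits 22 and 23 noted in Table 7-1.

```python
from typing import Dict, Set


def fgi_bit(fgi_bits: str, index: int) -> bool:
    """Return True if indicator `index` (1-based, leftmost bit is 1) is set."""
    if len(fgi_bits) != 32 or set(fgi_bits) - {"0", "1"}:
        raise ValueError("expected a 32-character string of 0s and 1s")
    if not 1 <= index <= 32:
        raise ValueError("indicator index must be between 1 and 32")
    return fgi_bits[index - 1] == "1"


def supports_ps_handover(fgi_bits: str) -> Dict[str, bool]:
    """Indicator 8 covers PS handover to UTRAN, indicator 9 to GERAN."""
    return {"UTRAN": fgi_bit(fgi_bits, 8), "GERAN": fgi_bit(fgi_bits, 9)}


def make_fgi(set_indices: Set[int]) -> str:
    """Build a hypothetical 32-bit indicator string for illustration."""
    return "".join("1" if i in set_indices else "0" for i in range(1, 33))


# Hypothetical UE that sets indicators 8 and 22 only.
print(supports_ps_handover(make_fgi({8, 22})))
```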
Possible Causes
• The UE does not support inter-RAT handover.
• Inter-RAT handover parameters or evolved packet core (EPC) parameters are incorrectly
  set, or there are missing neighbor relationships.
• The signal quality is poor. For example, the coverage is poor or there is interference.
Fault Handling
Inter-RAT handover faults are complex and you need to determine whether an inter-RAT
handover fault occurs in the entire network or in a cell based on the fault scope and background.
If the fault occurs in the entire network, locate the fault by checking the signaling exchange and
parameter settings on the mobility management entity (MME) and serving GPRS support node
(SGSN). If the fault occurs in a cell, check the data configuration, frequency, and hardware of
the cell.
Figure 7-1 shows the troubleshooting flowchart for inter-RAT handover faults.
Figure 7-1 Troubleshooting flowchart for inter-RAT handover faults
Fault Handling Procedure
1. Check whether the UE does not support inter-frequency handover.
   Yes: Use a UE that supports packet switched (PS) handover. Go to 2.
   No: Go to 3.
2. Check whether the fault is rectified.
   Yes: End.
   No: Go to 3.
3. Check whether inter-RAT handover is disabled.
   Yes: Run the MOD ENODEBALGOSWITCH: HoModeSwitch=UtranPsHoSwitch-1; and
   MOD ENODEBALGOSWITCH: HoModeSwitch=GeranPsHoSwitch-1; commands to enable
   PS handover to UTRAN and GERAN, respectively. Go to 4.
   No: Go to 5.
4. Check whether the fault is rectified.
   Yes: End.
   No: Go to 5.
5. Check whether neighboring cells are incorrectly configured.
   Yes: Correct the neighboring cell configuration. Go to 6.
   No: Go to 7.
6. Check whether the fault is rectified.
   Yes: End.
   No: Go to 7.
7. Check whether EPC parameters are incorrectly configured.
   Yes: Ask related personnel to correct the EPC parameter configurations. Go to 8.
   No: Go to 9.
   If the fault persists, go to 6.
8. Check whether the fault is rectified.
   Yes: End.
   No: Go to 9.
9. Check whether interference exists and whether the coverage is poor.
   If the radio quality is poor, the UE cannot receive the handover command or cannot use
   the channel assigned by the target cell, causing handover failure. Network planning and
   optimization engineers can use drive tests to locate coverage problems and use a device or
   the interference tracing function on the eNodeB to locate interference problems.
   Yes: Remove the interference source and adjust the coverage scope. Go to 10.
   No: Go to 11.
10. Check whether the fault is rectified.
   Yes: End.
   No: Go to 9.
11. Contact Huawei technical support.
Typical Cases
• Case 1: In a PS handover from E-UTRAN to UTRAN, the eNodeB did not deliver a PS
  handover command but delivered a redirection command to the UE.
  Fault Description
  In a test of PS handover from E-UTRAN to UTRAN in a laboratory at a site, after the UE
  reported B1 measurement results to the eNodeB, the eNodeB did not deliver a PS handover
  command but delivered a redirection command.
  Fault Diagnosis
  Tracing the network access procedure showed that the UE did not support inter-RAT
  handover. If a UE does not support inter-RAT handover, the eNodeB redirects the
  UE to UTRAN.
  Fault Handling
  The problem was solved after a UE that supports inter-RAT handover was used.
• Case 2: In a test of PS handover from E-UTRAN to UTRAN, the eNodeB did not deliver
  a PS handover command.
  Fault Description
  In a test of PS handover from E-UTRAN to UTRAN in a laboratory at a site, after the UE
  reported B1 measurement results to the eNodeB, the eNodeB did not deliver a PS handover
  command.
  Fault Diagnosis
  Tracing the network access procedure showed that the UE supported inter-RAT
  handover. The PS handover switch was checked on the eNodeB and was found to be
  turned on. Then, the neighboring cell relationships were checked. The check showed
  that an RAC was not configured for the neighboring UTRAN cell.
  Fault Handling
  The problem was solved after an RAC was added to the neighboring UTRAN cell.
• Case 3: In a test of PS handover from E-UTRAN to UTRAN, the eNodeB sent the MME
  a PSHO Required message. After two seconds, the eNodeB sent the MME a PSHO Cancel
  message.
  Fault Description
  In a test of PS handover from E-UTRAN to UTRAN in a laboratory at a site, the 4G EPC
  and eNodeB were provided by vendor Y and the 3G core network and radio network
  controller (RNC) were provided by vendor Z. After handover conditions were met, the
  eNodeB sent the MME a PSHO Required message. After two seconds, the eNodeB sent
  the MME a PSHO Cancel message.
  Fault Diagnosis
  Uu and S1 signaling was traced. The tracing result showed that, after the UE reported B1
  measurement results to the eNodeB, the eNodeB sent the MME an HO Required message
  and then an HO Cancel message. The eNodeB sent the HO Cancel message because the
  MME did not respond to the HO Required message: WaitInterRATSysHoRspTimer on the
  eNodeB was set to 2 seconds, and the eNodeB had not received a response from the MME
  when the timer expired. The MME log was then checked. The log showed that the MME
  received the HO Required message but did not forward it to the SGSN. The Gn interface
  was not configured between the MME and the SGSN, so the MME could not find the
  SGSN.
  Fault Handling
  The problem was solved after the Gn interface was configured between the MME and
  SGSN.
8 Troubleshooting Rate Faults
About This Chapter
This chapter provides definitions of faults related to traffic rates and describes how to
troubleshoot low uplink/downlink UDP/TCP rates and rate fluctuations. UDP is short for User
Datagram Protocol, and TCP is short for Transmission Control Protocol.
8.1 Definitions of Rate Faults
This section defines rate faults.
8.2 Background Information
This section provides background information for rate faults. The background information
includes the user-plane protocol stack, restrictions that the protocol stipulates for UEs of different
categories, and method used to calculate the theoretical rates.
8.3 Troubleshooting Abnormal Single-UE Rates
This section provides information required to troubleshoot abnormal single-UE rates. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
8.4 Troubleshooting Abnormal Multi-UE Rates
This section provides information required to troubleshoot abnormal multi-UE rates. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
8.1 Definitions of Rate Faults
This section defines rate faults.
The following are rate faults and their definitions:
• No transmission
  User equipment (UE) that has accessed a network cannot perform data services.
• Low downlink rate on a single UE
  The observed rate of a downlink service, either a User Datagram Protocol (UDP) or
  Transmission Control Protocol (TCP) service, on a UE is at least 10% lower than the
  baseline value.
• Downlink rate fluctuation on a single UE
  The observed rate of a downlink service, either a UDP or TCP service, on a UE fluctuates
  by more than 50%.
• Low uplink rate on a single UE
  The observed rate of an uplink service, either a UDP or TCP service, on a UE is at least
  10% lower than the baseline value.
• Uplink rate fluctuation on a single UE
  The observed rate of an uplink service, either a UDP or TCP service, on a UE fluctuates
  by more than 50%.
• Abnormal rates on multiple UEs
  A key performance indicator (KPI) indicates an abnormal rate, or a large number of users
  complain about their traffic rates. This fault may be caused by a specific single-UE rate
  fault or a common rate fault on multiple UEs.
• User-recognized abnormal rate
  The rate of a data service on a UE is abnormal according to the user's definition. For
  example, the currently observed rate is noticeably lower than the rate of the previous day
  or of an earlier period, or the observed rate is considerably lower than the rate achieved
  by equivalent equipment.
These faults can be classified into the following types:
• No transmission
• Low single-UE rate, including uplink and downlink UDP/TCP rates
• Single-UE rate fluctuation, including uplink and downlink UDP/TCP rates
• Abnormal multi-UE rates
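The single-UE definitions can be sketched as a simple classifier in Python. The 10% and 50% thresholds come from the definitions in this section. Everything else is an assumption made for illustration: this guide does not define how fluctuation is measured, so the sketch uses the peak-to-peak spread divided by the mean rate, and it treats an all-zero series as no transmission. The function name and sample values are hypothetical.

```python
from typing import Sequence


def classify_single_ue_rate(samples: Sequence[float], baseline: float) -> str:
    """Classify observed rates (same unit as baseline) for one UE and direction."""
    if not samples:
        raise ValueError("at least one rate sample is required")
    mean = sum(samples) / len(samples)
    if mean == 0:
        return "no transmission"
    # Assumption: fluctuation = (max - min) / mean of the observed samples.
    fluctuation = (max(samples) - min(samples)) / mean
    if fluctuation > 0.5:
        return "rate fluctuation"
    # Low rate: at least 10% lower than the baseline value.
    if mean <= 0.9 * baseline:
        return "low rate"
    return "normal"


# Hypothetical rate samples in Mbit/s against a 100 Mbit/s baseline.
print(classify_single_ue_rate([80, 82, 79], baseline=100))
print(classify_single_ue_rate([100, 40, 95], baseline=100))
```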
8.2 Background Information
This section provides background information for rate faults. The background information
includes the user-plane protocol stack, restrictions that the protocol stipulates for UEs of different
categories, and method used to calculate the theoretical rates.
LTE User-Plane Protocol Stack
Figure 8-1 shows the LTE user-plane protocol stack. Rate statistics for different layers vary
because of headers. Note the header differences during analysis.
Figure 8-1 LTE user-plane protocol stack
The traffic rates of data services can be measured in the following ways:
• The Ethernet-layer rate can be measured by using DU Meter at the server and client.
• The rates at the RLC and MAC layers can be measured at the eNodeB.
• The rates at layers such as RLC and MAC for Huawei user equipment (UE) can be measured
  by using the Probe.
Protocol-Defined Rates for UE Categories
3GPP TS 36.306 specifies the rates for various UE categories, as listed in Table 8-1 and Table
8-2.
Table 8-1 Downlink physical layer parameter values for UE categories
UE Category | Maximum Number of DL-SCH Transport Block Bits Received Within a TTI | Maximum Number of Bits of a DL-SCH Transport Block Received Within a TTI | Total Number of Soft Channel Bits | Maximum Number of Supported Layers for Spatial Multiplexing in DL
Category 1 | 10296 | 10296 | 250368 | 1
Category 2 | 51024 | 51024 | 1237248 | 2
Category 3 | 102048 | 75376 | 1237248 | 2
Category 4 | 150752 | 75376 | 1827072 | 2
Category 5 | 302752 | 151376 | 3667200 | 4
Table 8-2 Uplink physical layer parameter values for UE categories
UE Category | Maximum Number of Bits of a UL-SCH Transport Block Transmitted Within a TTI | Support for 64QAM in UL
Category 1 | 5160 | No
Category 2 | 25456 | No
Category 3 | 51024 | No
Category 4 | 51024 | No
Category 5 | 75376 | Yes
Theoretical Rate Calculation
In LTE networks, the theoretical traffic rate relates to the system bandwidth, modulation scheme,
multiple-input multiple-output (MIMO) mode, and parameter settings. Theoretical rate
calculation for a cell considers the number of symbols occupied by the physical downlink control
channel (PDCCH) in each subframe and the amount of time-frequency resources occupied by
the synchronization channel, by reference signals, and by the broadcast channel.
The theoretical rate can be determined based on the number of RBs and modulation order. For
details, see 3GPP TS 36.213.
Take a 20 MHz cell as an example. The only UE in the cell can use 100 RBs and MCS index
28. Then, a transport block size (TBS) of 75376 bits can be selected at the MAC layer for the UE.
If MIMO is used, two transport blocks (150752 bits in total) are transmitted per transmission
time interval (TTI), which is 1 ms. Then, the throughput is 150.752 Mbit/s.
NOTE
The theoretical rate calculated is the protocol-stipulated MAC-layer rate, not the application-layer rate for
eNodeBs.
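The arithmetic in the 20 MHz example can be reproduced with a few lines of Python. The function and constant names are hypothetical. The per-block size of 75376 bits is the Category 3 and Category 4 value from Table 8-1, which is half of the 150752 bits transmitted per TTI in the example.

```python
TTIS_PER_SECOND = 1000  # One TTI lasts 1 ms.


def mac_peak_rate_mbps(tbs_bits: int, num_transport_blocks: int) -> float:
    """Return the MAC-layer rate in Mbit/s for a given transport block size."""
    bits_per_tti = tbs_bits * num_transport_blocks
    return bits_per_tti * TTIS_PER_SECOND / 1e6


# Two transport blocks of 75376 bits per TTI (MIMO case from the example).
print(mac_peak_rate_mbps(tbs_bits=75376, num_transport_blocks=2))
```

As the note above states, the result is a MAC-layer figure. Application-layer rates are lower because of the headers added at each layer of the protocol stack.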
8.3 Troubleshooting Abnormal Single-UE Rates
This section provides information required to troubleshoot abnormal single-UE rates. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
Fault Description
The observed rate is stable but at least 10% lower than the baseline value.
Figure 8-2 Rate fault 1 - stable but lower than the baseline value
The observed rate fluctuates by more than 50%, as shown in the following figures.
Figure 8-3 Rate fault 2 - fluctuation type 1
Figure 8-4 Rate fault 2 - fluctuation type 2
Related Information
The User Datagram Protocol (UDP) is a simple datagram-oriented transport-layer protocol. UDP
provides an unreliable service. It sends datagrams from the application to the IP layer but does
not ensure that the datagrams can arrive at their destinations. However, UDP features a high
transmission speed, because a connection does not need to be set up before UDP-based
transmission between a client and a server and retransmission upon timeout is not applied.
The Transmission Control Protocol (TCP) provides connection-oriented reliable delivery of a
stream of bytes. A client and a server can transmit data between each other only after a TCP
connection is set up between them. TCP provides functions such as retransmission upon timeout,
discarding of duplicate data, data checking, and flow control for data delivery from one end to
the other end.
TCP uses a more complicated control mechanism than UDP. In most cases, a link with a normal
TCP rate has a normal UDP rate, but a link with a normal UDP rate does not necessarily have
a normal TCP rate. When diagnosing rate faults, ensure normal UDP rates before handling TCP
services.
3GPP specifications impose uplink capability constraints on user equipment (UE) categories.
Only UEs of category 5 support 64 quadrature amplitude modulation (64QAM) in the uplink.
Possible Causes
A common way to find a cause is as follows: First, check whether the service involved is a UDP
service or a TCP service. If it is a TCP service, inject uplink and downlink UDP packets on a
single thread and check whether the uplink and downlink UDP rates can reach their peak values.
The purpose is to "clear the way" for TCP rate fault diagnosis. For example, eliminate rate
limiting at the network adapter and rectify radio parameter setting errors before handling TCP
rate faults. If the service involved is a UDP service, locate the fault by investigating the link from
the server to the UE end to end. Second, if the UDP rate can reach its peak value
but the TCP rate cannot, the fault lies in the TCP transmission mechanism.
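The decision logic in this paragraph can be sketched in Python. This is an illustrative reading of the text, not a Huawei tool: the function name, parameter names, and returned messages are all hypothetical.

```python
from typing import Optional


def next_diagnosis_step(
    service: str,
    udp_reaches_peak: bool,
    tcp_reaches_peak: Optional[bool] = None,
) -> str:
    """Suggest where to look next, based on the UDP-first approach."""
    service = service.upper()
    if service not in ("UDP", "TCP"):
        raise ValueError("service must be 'UDP' or 'TCP'")
    if not udp_reaches_peak:
        # UDP must be healthy before any TCP analysis is meaningful.
        return "Investigate the link from the server to the UE end to end"
    if service == "TCP" and tcp_reaches_peak is False:
        return "Investigate the TCP transmission mechanism"
    return "No rate fault indicated by this check"


# TCP service whose UDP test reaches the peak rate but whose TCP rate does not.
print(next_diagnosis_step("TCP", udp_reaches_peak=True, tcp_reaches_peak=False))
```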
Abnormal rates have the following possible causes:
• Fault in the data source at the server
• Insufficient traffic into the eNodeB due to transmission problems
• Radio interface faults, such as eNodeB alarms related to the radio interface, signal quality
  problems, parameter setting errors, problems caused by multiple UEs online, license issues,
  and uplink interference (required to be checked for abnormal uplink rates)
• Fault in the PC connected to the UE
• TCP parameter setting error, or fault in the TCP transmission mechanism
Fault Handling
None
Fault Handling Procedure
1.
Check whether data services run abnormally.
If a UE fails to access any data services, check whether the UE has been connected to or
disconnected from the network. Ensure that the UE is connected. Then, check the firewall
settings at the PC and the server. Ensure that the firewalls allow access of the data services.
In addition, check whether routes from the server to the evolved packet core (EPC) work
properly. On the server, ping the user-plane IP address of the unified gateway (UGW). If
the ping operation fails or the delay is excessively long, contact EPC or datacom technical
support.
2.
Check whether the server malfunctions.
a.
On the server, run the following command to set the UDP packet injection volume:
iperf -c x.x.x.x -u -i 1 -t 99999 -b yyym
NOTE
"x.x.x.x" denotes the service IP address of the UE.
"yyym" denotes the UDP packet injection volume, which depends on the UE in use and the cell
bandwidth. The value can be greater than the theoretical maximum value as long as the data volume
is sufficient.
b.
On the PC, run the following command to start receiving packets:
iperf -s -u -i 1
c.
(Optional) If the actual output traffic volume from the server does not reach the
specified "yyym", run the following command with "-l" added to adjust the UDP
packet size:
iperf -c x.x.x.x -u -i 1 -t 99999 -b yyym -l 1000
d.
(Optional) If the actual output traffic volume from the server still fails to reach the
specified "yyym", replace the server.
3.
Check whether the input traffic volume to the eNodeB is insufficient.
A common reason for the insufficient input traffic volume is a bottleneck transmission
bandwidth at an intermediate node. Check whether:
l The bandwidth is correctly set along the transmission link.
Ensure that all network elements and interfaces work at the gigabit level and in auto-negotiation
speed mode. The network elements include at least Ethernet ports on the
server and all switches and routers on the network.
l The transmission bandwidth on the transmission link is greater than the peak value.
If microwave is used for transmission, ensure that the transmission bandwidth is greater
than the peak value.
NOTE
The transmission link refers to the S1 interface from the server to the eNodeB.
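The bandwidth check in this step amounts to finding the slowest hop on the path and comparing it with the peak rate. A minimal sketch follows, assuming the per-hop bandwidths have already been collected by hand; the hop names and figures in the usage example are hypothetical.

```python
from typing import Dict, Tuple


def find_bottleneck(link_mbps: Dict[str, float],
                    peak_mbps: float) -> Tuple[str, float, bool]:
    """Return (slowest hop, its bandwidth, whether it exceeds the peak rate)."""
    name = min(link_mbps, key=link_mbps.get)
    bandwidth = link_mbps[name]
    return name, bandwidth, bandwidth > peak_mbps


if __name__ == "__main__":
    # Hypothetical path from the server to the eNodeB.
    path = {"server NIC": 1000.0, "core switch": 1000.0, "microwave hop": 100.0}
    print(find_bottleneck(path, peak_mbps=110.0))
```

If the returned flag is False, the named hop cannot carry the peak rate and must be checked before any radio-side analysis.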
4.
Check whether the radio channel quality is unsatisfactory.
l Check whether the downlink signal quality is poor.
Use the software matching the UE type to measure signal quality parameters, such as
the reference signal received power (RSRP) and signal to interference plus noise ratio
(SINR). The RSRP and SINR must fulfill certain conditions to meet rate requirements.
For example, to enable the actual maximum rate to approach the theoretical peak value,
ensure that the RSRP and SINR stay above -85 dBm and 26 dB, respectively.
l Check whether the block error rate (BLER) is excessively high on the radio interface.
Monitor the BLER on the M2000 client. If the BLER is higher than 10%, the channel
condition is poor. Improve the channel condition for better downlink signal quality.
l (Optional) Check whether uplink interference exists.
When a cell is unloaded in the uplink (all UEs are powered off and there is no service
in the cell), check the received signal strength indicator (RSSI) across the uplink band.
In a normal case, the RSSI on each resource block (RB) is about -120 dBm when the
cell is unloaded. If the RSSI is 3 dB to 5 dB higher than the normal value, uplink
interference exists. Locate the interference source, and mitigate the interference.
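The thresholds quoted in this step can be collected into one check. The sketch below uses only the values stated above (RSRP above -85 dBm, SINR above 26 dB, BLER not higher than 10%, unloaded RSSI about -120 dBm per RB with a rise of 3 dB or more indicating interference). The function name and the way measurements are passed in are assumptions; the values must still be read from the UE test software and the M2000 client.

```python
from typing import List, Optional

RSRP_MIN_DBM = -85.0          # RSRP should stay above this for near-peak rates
SINR_MIN_DB = 26.0            # SINR should stay above this for near-peak rates
BLER_MAX = 0.10               # BLER above 10% means poor channel condition
UNLOADED_RSSI_DBM = -120.0    # typical per-RB RSSI of an unloaded cell
INTERFERENCE_RISE_DB = 3.0    # rise over the unloaded value that signals interference


def radio_findings(rsrp_dbm: float, sinr_db: float, bler: float,
                   unloaded_rssi_dbm: Optional[float] = None) -> List[str]:
    """Return a list of radio-quality findings; an empty list means no finding."""
    findings = []
    if rsrp_dbm <= RSRP_MIN_DBM:
        findings.append("RSRP too low")
    if sinr_db <= SINR_MIN_DB:
        findings.append("SINR too low")
    if bler > BLER_MAX:
        findings.append("BLER too high")
    if (unloaded_rssi_dbm is not None
            and unloaded_rssi_dbm - UNLOADED_RSSI_DBM >= INTERFERENCE_RISE_DB):
        findings.append("uplink interference suspected")
    return findings


if __name__ == "__main__":
    print(radio_findings(rsrp_dbm=-95, sinr_db=20, bler=0.15, unloaded_rssi_dbm=-116))
```

The uplink RSSI argument is optional because that check applies only when the cell can be unloaded.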
5.
Check whether the basic information about the data services or the parameter settings are
incorrect.
This check is twofold:
l Check whether the basic information about the data services is incorrect.
In this step, check the user's subscription information and UE's capability. Specifically,
check whether the user is subscribed to the correct QCI, whether the MBR and AMBR
of the UE are set as expected, and whether the UE has the expected
capabilities.
l Check whether the basic information about the parameter settings is incorrect.
The parameter settings refer to the settings for the eNodeB. Algorithm setting changes
cause severe drops in the traffic rate. Export eNodeB parameter settings, and compare
them with the baseline values. If the values are inconsistent, confirm whether the settings
are customized for the operator or have been changed to incorrect values. If the settings
have been changed to incorrect values, inform the operator immediately.
6.
Check whether the number of users in the cell is excessively large.
Check the number of users in the cell and the downlink RB usage by performing Users
Statistics Monitoring and Usage of RB Monitoring tasks, respectively, under cell
performance monitoring. If an excessively large number of users have accessed the cell
and RBs are exhausted when a UE accesses the cell, the traffic rate on each UE will not be
high, and low-priority users will experience even lower traffic rates.
7.
Check whether license information is incorrect.
Run the LST LICENSE command to query license information, and observe whether:
l The license has expired, or limitation is imposed on functions related to the data services.
l The licensed throughput capability is correct.
8.
Check whether the client works abnormally.
Client faults may exist in the UE or in the PC connected to the UE.
l Check for faults in the UE.
If spare UEs are available, replace the UE and check whether the rate fault disappears.
If it disappears, the fault exists in the UE.
l Check for faults in the PC connected to the UE.
Investigate the software installed and running on the PC. You are advised to remove or
close all programs except those required by the test. In addition, close the Windows
firewall and firewalls of antivirus programs.
Check the central processing unit (CPU) usage. If the CPU usage exceeds 80%, the CPU
is heavily loaded. Close unused software or services, or replace the PC with a better one.
9.
Check for TCP errors.
TCP fault diagnosis varies depending on the symptom. If the throughput is maintained at
a level lower than the peak value, check parameter settings and the round trip time (RTT).
If the throughput can reach the peak value but is not stable, check for packet loss and severe
packet misordering.
l Check the TCP rate status.
Use a multi-thread download program (for example, FlashGet or FileZilla) or open
multiple Windows command line windows to download data. If the rate is higher than
the single-thread rate, perform further TCP checks. If the rate is equal to or even lower
than the single-thread rate, go back to the previous steps to recheck for possible faults.
l Check basic TCP parameter settings.
Ensure that the basic TCP parameters are correctly set. The parameters include the
receive window, send window, and maximum transmission unit (MTU).
l Check the RTT.
Ping the server by using 32-byte packets and MSS-byte packets (MSS is short for
maximum segment size), and take the average RTT value for the two types as the
calculated RTT. Typically, the RTT value is required to be less than or equal to 50 ms.
Link optimization is required if the RTT value is greater than 50 ms.
l Check for packet loss and severe packet misordering.
On the PC side, trace packet headers or use the TCP fault diagnosis module to check
for packet loss and severe packet misordering. If packet loss or severe packet
misordering occurs, contact datacom personnel for handling.
10. If the fault persists, contact Huawei technical support.
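The RTT and window checks in step 9 are linked by the bandwidth-delay product: a single TCP connection cannot exceed its window size divided by the RTT. The sketch below averages the two ping results as described in step 9 and computes the window needed for a target rate. The function names are assumptions, and the 100 Mbit/s target in the usage example is illustrative only.

```python
import math


def calculated_rtt_ms(rtt_32_byte_ms: float, rtt_mss_ms: float) -> float:
    """Average of the RTTs measured with 32-byte and MSS-byte pings."""
    return (rtt_32_byte_ms + rtt_mss_ms) / 2


def min_window_bytes(target_mbps: float, rtt_ms: float) -> int:
    """Smallest window (bytes) that can sustain target_mbps at the given RTT."""
    bits_in_flight = target_mbps * rtt_ms * 1000  # Mbit/s x ms = 1000 bits
    return math.ceil(bits_in_flight / 8)


def max_tcp_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-connection throughput for a given window and RTT."""
    return window_bytes * 8 / (rtt_ms * 1000)


if __name__ == "__main__":
    rtt = calculated_rtt_ms(48, 52)
    print(rtt, min_window_bytes(100, rtt), max_tcp_mbps(65535, rtt))
```

With an RTT of 50 ms, a 65,535-byte window limits one connection to roughly 10.5 Mbit/s. This is one reason why a multi-thread download can be faster than a single-thread download when the window is too small.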
Typical Cases
l
Case 1: The downlink rate was low with microwave transmission.
Fault Description
On network X in a country, the cell bandwidth was 15 MHz. In a downlink File Transfer
Protocol (FTP) throughput test using a single UE in a single cell, it was found that all
eNodeBs connected to a 100 Mbit/s microwave transport network had their downlink
throughput not exceeding 30 Mbit/s, but eNodeBs connected to a 1 Gbit/s optical transport
network had their downlink throughput as high as 80 Mbit/s.
Fault Diagnosis
A UDP test found that the UDP throughput was 100 Mbit/s at the sender but dropped to
only 80 Mbit/s at the receiver (eNodeB). Severe packet loss occurred. Due to TCP
congestion control, the throughput of 30 Mbit/s was normal, so the fault did not exist in the
eNodeB. The operator requested operation and maintenance (OM) personnel to locate the
packet loss point based on the following assumption: The throughput of 80 Mbit/s on the
optical transport network did not reach 100 Mbit/s, so congestion should not occur in the
microwave transport network.
The microwave transmission media were replaced with an Ethernet cable for a direct
connection between the eNodeB and the S-GW. The FTP transfer rate was maintained at
30 Mbit/s. The segment-by-segment check found that packet loss occurred at a position
between the input and output ports on a switch before packets entered the microwave
network. The operator traced the input and output ports and confirmed that packet loss
occurred. The operator further found that the fault was caused by a small buffer size that
was set for the port on the switch.
Fault Handling
The operator extended the buffer size and tested again. The test result indicated that the
downlink rate could reach the expected value. The extended buffer size helps enhance anti-burst
capability, reduce the tail drop probability, and increase the FTP transfer rate.
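The effect of the port buffer size on tail drop can be illustrated with a toy queue model. This is a simplified sketch, not a model of the switch in this case: time is divided into slots, a burst of packets arrives at the start of a slot, packets that do not fit into the buffer are dropped, and a fixed number of packets is forwarded per slot. All figures are hypothetical.

```python
from typing import Iterable


def simulate_tail_drop(arrivals: Iterable[int], buffer_size: int,
                       drain_per_slot: int) -> int:
    """Return the number of packets dropped by a FIFO port buffer."""
    queued = 0
    dropped = 0
    for burst in arrivals:
        accepted = min(burst, buffer_size - queued)  # only free space is usable
        dropped += burst - accepted                  # the rest is tail-dropped
        queued += accepted
        queued -= min(queued, drain_per_slot)        # forward packets this slot
    return dropped


if __name__ == "__main__":
    # Bursts of 10 packets every 4 slots: average load 2.5 < drain rate 3.
    traffic = [10, 0, 0, 0] * 5
    print(simulate_tail_drop(traffic, buffer_size=4, drain_per_slot=3))
    print(simulate_tail_drop(traffic, buffer_size=16, drain_per_slot=3))
```

In this example the average load is below the forwarding rate in both runs, yet the small buffer still drops packets from every burst, while the larger buffer absorbs the bursts completely.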
l
Case 2: UDP services were functional, but FTP services were unavailable.
Fault Description
Operator T in country D stated that no FTP service was available on eNodeBs operating in
the 1800 MHz band but all cells operated properly with UEs normally accessing the cells,
being released, and performing UDP services.
Fault Diagnosis
Based on the feedback from the operator, a check for TCP errors was performed directly,
only to find that the FTP transfer rate dropped to zero and the server could not be pinged.
Because UDP services ran normally in the downlink, it was almost ascertained that the fault
was down link disconnection.
The check on an 800 MHz eNodeB connected to the same transport network found that FTP
services ran normally. Therefore, it was highly possible that the eNodeBs had faults. Due
to the severe impact of the fault, data configurations were immediately restored for the
1800 MHz eNodeBs by using the backup data configuration files. The fault was rectified.
The faulty configuration files were compared with baseline data configurations. The
comparison result indicated that a key radio parameter for downlink and uplink
transmission was set to a value different from the baseline value. The fault was caused by
the incorrect parameter setting.
Fault Handling
Parameter settings were changed to baseline values for all faulty eNodeBs.
l
Case 3: The traffic rate occasionally reached the peak value using the E398 but never
reached the peak value using Samsung UEs.
Fault Description
In a single cell under an eNodeB on network Y in country P, a single Samsung UE could
reach only 80 Mbit/s unexpectedly in both single-thread and multi-thread (using FileZilla)
TCP download. Huawei E398 could occasionally reach 100 Mbit/s in both single-thread
and multi-thread TCP download. Both the Samsung UE and Huawei E398 experienced rate
drops.
Fault Diagnosis
A UDP packet injection test was performed, only to find that Huawei E398 and Samsung
UE could both reach the peak values. Therefore, the fault should exist in the TCP
transmission mechanism. In this fault case, rate drops occurred, which was evidence of
packet loss. The fault symptoms on Huawei E398 and Samsung UE were different, so there
must be causes other than packet loss.
The analysis of TCP/IP headers using a third-party tool indicated that packet loss occurred
on the radio interface. It was found from the configuration file for the eNodeB that the QoS
class identifier (QCI) was 7 and the unacknowledged mode (UM) was used. UM is
insensitive to packet loss, so the frontline personnel tried QCI 9 upon request in a further
test. In the test, rate drops disappeared, but Samsung UE still failed to reach the peak value
in either single-thread or multi-thread TCP download while Huawei E398 could reach
the peak value in both single-thread and multi-thread TCP download. A further test was
performed on RTT using Samsung UE and Huawei E398. The test result indicated that the
RTT value for Samsung UE was longer and less stable than the RTT value for Huawei
E398. A comparison between the configuration file for the eNodeB on network Y and the
baseline configuration file found a difference in the radio-interface encryption setting. The
Advanced Encryption Standard (AES) encryption algorithm was enabled for the radio
interface on network Y, but this algorithm was disabled in the lab. The frontline personnel
disabled the AES encryption algorithm as requested. Then, the traffic rate on Samsung UE
could reach 100 Mbit/s. The fault could be reproduced: The rate dropped to 80 Mbit/s after
this algorithm was enabled. The reason for Samsung UE's failure to reach the peak value
was the setting of the AES encryption algorithm on the radio interface.
Fault Handling
The problem in network Y was caused by more than one fault, which was further induced
by incorrect parameter settings. The problem was resolved after the parameter settings were
corrected.
8.4 Troubleshooting Abnormal Multi-UE Rates
This section provides information required to troubleshoot abnormal multi-UE rates. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
Fault Description
A key performance indicator (KPI) indicates an abnormal rate according to the routine KPI
monitoring result, or a large number of users complain about their traffic rates.
Related Information
Related Counters
l
DRB Measurement (Cell) (Traffic.DRB.Cell)
l
Throughput Measurement (Cell) (Traffic.Thruput.Cell)
l
PDCP Measurement (Cell) (Traffic.PDCP.Cell)
l
MAC Data Unit Measurement (Cell) (Traffic.MAC.Cell)
l
User Number Measurement (Cell) (Traffic.User.Cell)
l
Packet Processing Measurement (Cell) (Traffic.Packet.Cell)
l
L.Thrp.bits.DL
l
L.Thrp.Time.DL
l
L.Thrp.bits.UL
l
L.Thrp.Time.UL
l
L.Traffic.User.DLData.Avg
l
L.Traffic.User.ULData.Avg
Possible Causes
If a large number of users complain about their traffic rates, find the cause by following the
procedure for troubleshooting abnormal single-UE rates. Pay more attention to faults that may
cause large-scope failures, for example, eNodeB faults, transmission failures, large-size
reconfiguration, and radio frequency (RF) faults.
If a KPI indicates an abnormal rate, check whether the KPI calculation formula is correct,
investigate TopN cells, analyze the changes of the KPI with other KPIs, review recent key actions
on the network, and if necessary collect and provide KPI logs.
Fault Handling
None
Fault Handling Procedure
1.
Check whether the KPI calculation formula is correct.
Learn the definition of the KPI, determine whether its calculation formula and measurement
are correct, and check whether the observed KPI is correct according to the calculation
formula.
2.
Investigate TopN cells.
Select TopN cells and investigate them. If the fault exists in a single cell under a single
eNodeB, troubleshoot the fault by referring to 8.3 Troubleshooting Abnormal Single-UE
Rates.
3.
Analyze the changes of the KPI with other KPIs.
Analyze the changes to find the root cause or exceptions. For example, check whether the
traffic volume changes consistently with the number of users and whether the traffic volume
changes inversely with the channel quality indicator (CQI).
4.
Review recent key actions on the network.
Check whether the key actions affect the KPI.
5.
If the fault persists, contact Huawei technical support.
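Steps 1 and 2 can be supported by a short script that recomputes the rate from raw counters and ranks the cells. The guide does not state the KPI formula, so the sketch assumes that the downlink rate of a cell is L.Thrp.bits.DL divided by L.Thrp.Time.DL. The counter units, the cell names, and the function names are also assumptions; confirm the actual KPI definition before using such a calculation.

```python
from typing import Dict, List, Tuple


def cell_rate(thrp_bits: float, thrp_time: float) -> float:
    """Assumed KPI: transferred bits divided by transfer time (0 if no time)."""
    return thrp_bits / thrp_time if thrp_time > 0 else 0.0


def top_n_worst(cells: Dict[str, Tuple[float, float]],
                n: int) -> List[Tuple[str, float]]:
    """Return the n cells with the lowest rate, lowest first.

    cells maps a cell name to (L.Thrp.bits.DL, L.Thrp.Time.DL).
    """
    rates = {name: cell_rate(bits, time) for name, (bits, time) in cells.items()}
    return sorted(rates.items(), key=lambda item: item[1])[:n]


if __name__ == "__main__":
    # Hypothetical counter values exported for three cells.
    counters = {"Cell-A": (8000.0, 10.0), "Cell-B": (2000.0, 10.0),
                "Cell-C": (5000.0, 10.0)}
    print(top_n_worst(counters, 2))
```

If the TopN cells account for most of the degradation, the fault is local; if all cells show the same trend, continue with steps 3 and 4.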
Typical Cases
Fault Description
On network T in a country, the routine KPI monitoring result indicated that the average traffic
rate had been decreasing across the network since a certain day while the number of users remained
almost unchanged.
Fault Diagnosis
The check on the rate calculation formula, counter measurement, and statistics changes found
that network T never changed the formula or measurement method. Therefore, it was not the
formula that caused the fault. The investigation of TopN cells found that the entire network had
almost the same trend, so the fault was not caused by abnormal individual cells. The analysis of
other KPIs indicated that the number of users remained almost unchanged. In addition, network
reconfiguration should not cause a gradual decrease. Finally, the review on recent key actions
found two actions: rollback of the evolved packet core (EPC) version and provisioning of low-rate
subscription services. Further analysis was performed on the two actions.
The analysis found that the EPC version rollback did not affect the traffic rate. In an aggregate
maximum bit rate (AMBR) test in a lab, Transmission Control Protocol (TCP) services were
performed on UEs with AMBRs of 20 Mbit/s and 100 Mbit/s. The KPI monitoring result
indicated that the rate on a UE with an AMBR of 100 Mbit/s was about four times as high as
the rate on a UE with an AMBR of 20 Mbit/s. The investigation of AMBR distribution at more
than ten sites in recent days found that the number of UEs with a subscribed rate of 256 Mbit/s
had dropped by more than 70%. A majority of subscribers on the network were low-rate ones.
The confirmation with the operator proved that some UEs newly subscribed to low AMBRs,
and some with a subscribed rate of 256 Mbit/s switched to low AMBRs. That was the cause of
the rate decrease.
Fault Handling
No handling was required. The rate decrease was caused by the provisioning of low-rate
subscription services.
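The reasoning in this case can be checked with simple arithmetic: if subscribers move from a high AMBR to a low AMBR, the average rate falls even though the number of users is unchanged. The sketch below is a toy model that assumes every user is limited only by its AMBR. The user counts are hypothetical; only the 256 Mbit/s and 20 Mbit/s values and the 70% reduction echo figures mentioned in the case.

```python
from typing import Dict


def average_rate_mbps(users_by_ambr: Dict[float, int]) -> float:
    """Average per-user rate when each user is capped only by its AMBR."""
    total_users = sum(users_by_ambr.values())
    if total_users == 0:
        return 0.0
    return sum(ambr * count for ambr, count in users_by_ambr.items()) / total_users


if __name__ == "__main__":
    before = {256.0: 100, 20.0: 100}   # hypothetical subscription mix
    after = {256.0: 30, 20.0: 170}     # 70% of the 256 Mbit/s users moved down
    print(average_rate_mbps(before), average_rate_mbps(after))
```

Both mixes contain 200 users, but the average falls from 138 Mbit/s to about 55 Mbit/s in this model, which matches the pattern of a declining rate with a stable user count.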
9 Troubleshooting Cell Unavailability Faults
About This Chapter
This chapter defines cell unavailability faults and provides a troubleshooting method.
9.1 Definitions of Cell Unavailability Faults
When the eNodeB detects that a cell is unavailable due to a cell activation failure, the eNodeB
reports an ALM-29240 Cell Unavailable alarm.
9.2 Background Information
This section provides background information for cell unavailability faults.
9.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
9.4 Troubleshooting Cell Unavailability Faults Due to Incorrect Data Configuration
This section provides information required to troubleshoot cell unavailability faults due to
incorrect data configurations. The information includes fault descriptions, background
information, possible causes, fault handling method and procedure, and typical cases.
9.5 Troubleshooting Cell Unavailability Faults Due to Abnormal Transport Resources
This section provides information required to troubleshoot cell unavailability faults due to
abnormal transport resources. The information includes fault descriptions, background
information, possible causes, fault handling method and procedure, and typical cases.
9.6 Troubleshooting Cell Unavailability Faults Due to Abnormal RF Resources
This section provides information required to troubleshoot cell unavailability faults due to
abnormal RF resources. The information includes fault descriptions, background information,
possible causes, fault handling method and procedure, and typical cases.
9.7 Troubleshooting Cell Unavailability Faults Due to Limited Capacity or Capability
This section provides information required to troubleshoot cell unavailability faults due to
limited capacity or capability. The information includes fault descriptions, background
information, possible causes, fault handling method and procedure, and typical cases.
9.8 Troubleshooting Cell Unavailability Faults Due to Faulty Hardware
This section provides information required to troubleshoot cell unavailability faults due to faulty
hardware. The information includes fault descriptions, background information, possible causes,
fault handling method and procedure, and typical cases.
9.1 Definitions of Cell Unavailability Faults
When the eNodeB detects that a cell is unavailable due to a cell activation failure, the eNodeB
reports an ALM-29240 Cell Unavailable alarm.
Cell unavailability mentioned in this chapter means that all UEs in a cell cannot perform services.
If only some UEs cannot perform services, the problem is due to scenario-specific causes. These
causes can be found with the aid of signaling tracing, which is not described in this chapter.
9.2 Background Information
This section provides background information for cell unavailability faults.
Factors that may affect the running of a cell include transmission, hardware, configuration, and
RF. If any factor is abnormal, the cell may be unavailable. In this case, check these factors for
troubleshooting.
Related Alarms
l
Cell alarms
– ALM-29240 Cell Unavailable
l
Transmission alarms
– ALM-25880 Ethernet Link Fault
– ALM-25886 IP Path Fault
– ALM-25888 SCTP Link Fault
l
Hardware alarms
– ALM-26101 Inter-Board CANBUS Communication Failure
– ALM-26200 Board Hardware Fault
– ALM-26201 Board Memory Soft Failure
– ALM-26205 BBU Board Maintenance Link Failure
l
Optical module and CPRI alarms related to the faulty cell
– ALM-26230 BBU CPRI Optical Module Fault
– ALM-26246 BBU CPRI Line Rate Negotiation Abnormal
l
RF module alarms related to the faulty cell
– ALM-26238 RRU Network Topology Type and Configuration Mismatch
l
Configuration alarms related to the faulty cell
– ALM-26243 Board Configuration Data Ineffective
– ALM-26251 Board Type and Configuration Mismatch
l
License alarms
– ALM-26817 License on Trial
l
Other alarms
– ALM-29201 S1 Interface Fault
– ALM-26262 External Clock Reference Problem
9.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
Possible Causes
Cell unavailability may be caused by:
l
Incorrect data configuration
l
Abnormal transport resources
l
Abnormal RF resources
l
Limited capacity or capability
l
Faulty hardware
Troubleshooting Flowchart
Cell unavailability faults are generally indicated by alarms, MML command outputs, and logs.
Based on the information, you can determine which factor leads to a failure in the setup or running
of a cell. The fault handling method shown in Figure 9-1 is used before log analysis.
Figure 9-1 Troubleshooting flowchart for cell unavailability faults
Troubleshooting Procedure
1.
Check whether there are related alarms.
Yes: Handle the alarms. For details, see eNodeB Alarm Reference. Go to 2.
No: Go to 3.
2.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 3.
3.
Check cell fault information.
Perform the following operations in different scenarios:
l If the cell fault occurs during the deployment of an eNodeB or the setup of a cell, run
the ACT CELL command.
l If the cell fault occurs in another scenario, run the DSP CELL command.
4.
Rectify the fault based on cell fault information.
Possible fault causes and handling methods are provided as follows:
l If transport resources are abnormal, follow the instructions on how to troubleshoot cell
unavailability faults due to abnormal transport resources.
l If RF resources are abnormal, follow the instructions on how to troubleshoot cell
unavailability faults due to abnormal RF resources.
l If system capacity or capability is limited, follow the instructions on how to troubleshoot
cell unavailability faults due to limited capacity or capability.
l If data configurations are incorrect, follow the instructions on how to troubleshoot cell
unavailability faults due to incorrect data configurations.
5.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 6.
6.
Check whether there are hardware faults.
Yes: Handle the hardware faults. Go to 7.
No: Go to 8.
7.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 8.
8.
Contact Huawei technical support.
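The eight steps above form a fixed decision flow, which can be sketched as follows. The function and constant names are hypothetical, and the fixes_that_work argument is only a stand-in for the manual check "whether the cell fault is rectified".

```python
from typing import List, Set

ALARMS = "handle alarms"
CELL_INFO = "rectify based on ACT CELL / DSP CELL output"
HARDWARE = "handle hardware faults"
SUPPORT = "contact Huawei technical support"


def cell_unavailable_flow(related_alarms: bool, hardware_faults: bool,
                          fixes_that_work: Set[str]) -> List[str]:
    """Return the ordered actions taken for one pass through steps 1 to 8.

    The cell is assumed to recover after any action in fixes_that_work.
    """
    actions: List[str] = []
    if related_alarms:                    # step 1
        actions.append(ALARMS)
        if ALARMS in fixes_that_work:     # step 2
            return actions
    actions.append(CELL_INFO)             # steps 3 and 4
    if CELL_INFO in fixes_that_work:      # step 5
        return actions
    if hardware_faults:                   # step 6
        actions.append(HARDWARE)
        if HARDWARE in fixes_that_work:   # step 7
            return actions
    actions.append(SUPPORT)               # step 8
    return actions


if __name__ == "__main__":
    print(cell_unavailable_flow(True, False, set()))
```

The usage example models a cell with related alarms, no hardware fault, and no successful fix, so the flow ends with escalation to technical support.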
9.4 Troubleshooting Cell Unavailability Faults Due to
Incorrect Data Configuration
This section provides information required to troubleshoot cell unavailability faults due to
incorrect data configurations. The information includes fault descriptions, background
information, possible causes, fault handling method and procedure, and typical cases.
Fault Description
A cell fails to be set up after data configuration.
Background Information
A cell cannot be set up successfully if the cell parameter settings do not match the actual RF/
baseband processing capability or other parameters.
Incorrect data configuration usually leads to a failure in the setup of a cell, not in the running of
a cell.
Related Alarms
l
ALM-29240 Cell Unavailable
Possible Causes
A resource item is set to a value inconsistent with the hardware or software configuration, leading
to cell setup failures. Possible causes are listed as follows:
l
Incorrect UL/DL subframe ratio or incorrect special subframe ratio in TDD mode
l
Incorrect cell power configuration
l
Incorrect cell frequency configuration
l
Incorrect cell preamble format configuration
l
Incorrect cell UL/DL cyclic prefix configuration
l
Incorrect cell bandwidth configuration
l
Incorrect cell beamforming algorithm switch configuration
l
Incorrect cell operator information configuration
l
Incorrect cell antenna mode configuration
l
Incorrect CPRI line rate configuration
l
Incorrect cell network-related configuration
The common causes are:
l
Incorrect cell power configuration
l
Incorrect cell bandwidth configuration
l
Incorrect cell network-related configuration
Fault Handling Flowchart
None
Fault Handling Procedure
1.
Check whether there are related alarms.
Yes: Handle the alarms. For details, see eNodeB Alarm Reference. Go to 2.
No: Go to 3.
2.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 3.
3.
Rectify the cell fault based on the MML command outputs about cell activation failures.
For details, see eNodeB Alarm Reference.
4.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 5.
5.
Contact Huawei technical support.
Typical Cases
None
9.5 Troubleshooting Cell Unavailability Faults Due to
Abnormal Transport Resources
This section provides information required to troubleshoot cell unavailability faults due to
abnormal transport resources. The information includes fault descriptions, background
information, possible causes, fault handling method and procedure, and typical cases.
Fault Description
If the cell unavailability is caused by abnormal transport resources, a message will be displayed
after execution of the ACT CELL or DSP CELL command. The message indicates that the S1
interface used by the cell or an IP path on the S1 interface is abnormal.
Background Information
None
Possible Causes
The possible causes are:
l
An SCTP link is faulty or not configured.
l
An IP path is faulty or not configured.
l
Other transmission faults occur.
Fault Handling Flowchart
None
Fault Handling Procedure
1.
Check whether the S1 resources are unavailable.
Cell unavailability is due to S1 resource unavailability if any of the following conditions
is met:
l In the output of the DSP CELL command, the value of Cell latest avail state is
Unavailable S1 link.
l In the output of the ACT CELL command, the following information is provided: [0]
Configuration data activating failed: (1973485632) Cell S1 link (include S1 interface
and IP path) is abnormal.
Yes: Configure the S1 resources. Go to 2.
No: Go to 3.
2.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 3.
3.
Check whether there is an SCTP link alarm.
Yes: Handle the alarm according to the help information of ALM-25888 SCTP Link Fault.
Go to 4.
No: Go to 5.
4.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 5.
5.
Check whether there are IP path alarms.
Yes: Handle the alarms according to the help information of ALM-25886 IP Path Fault.
Go to 6.
No: Go to 7.
6.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 7.
7.
Contact Huawei technical support.
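The condition in step 1 can be checked automatically on saved command output. The sketch below searches the text for the two indications quoted in step 1. The exact layout of real DSP CELL and ACT CELL output is not shown in this guide, so the sample strings are assembled from the quoted phrases, and the function name is hypothetical.

```python
S1_STATE_TEXT = "Unavailable S1 link"   # value of "Cell latest avail state"
S1_FAULT_CODE = "1973485632"            # code reported by ACT CELL


def s1_resources_unavailable(mml_output: str) -> bool:
    """Return True if the command output points to abnormal S1 resources."""
    return S1_STATE_TEXT in mml_output or S1_FAULT_CODE in mml_output


if __name__ == "__main__":
    sample = ("[0] Configuration data activating failed: (1973485632) "
              "Cell S1 link (include S1 interface and IP path) is abnormal.")
    print(s1_resources_unavailable(sample))
```

A True result corresponds to the Yes branch of step 1. A False result does not prove that the S1 resources are normal, so steps 3 to 6 still apply.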
Typical Cases
Fault Description
A cell failed to be activated. In the command output, the value of Reason For Latest State
Change was
CCEM_CELLBASIC_ERR_CELL_SETUP_FAIL_S1LINK_DOWN~1973485632.
Fault Diagnosis
OM personnel checked the active alarms and found that there were no alarms related to the faulty
cell. OM personnel then checked the SCTP link status and found that the link was normal.
Finally, OM personnel found that IP paths were not configured.
Fault Handling
After OM personnel configured IP paths, the cell fault was rectified.
9.6 Troubleshooting Cell Unavailability Faults Due to
Abnormal RF Resources
This section provides information required to troubleshoot cell unavailability faults due to
abnormal RF resources. The information includes fault descriptions, background information,
possible causes, fault handling method and procedure, and typical cases.
Fault Description
RF-related alarms are reported.
Background Information
RF Resource Item
The RF resource items to be checked include:
l
Whether CPRI links between RF units and LBBPs work properly
l
Whether the working status of RF units is normal
l
Whether RF unit versions match the main control board version
l
Whether the line rates of CPRI links are successfully negotiated
l
Whether RF networking is consistent with data configuration
Possible Causes
A cell is unavailable if data configuration or hardware configuration of RF resources is incorrect.
The possible causes are abnormal CPRI links, abnormal RF units, version mismatch between
the main control board and RF units, unsuccessful negotiation of CPRI line rates, and mismatch
between RF networking and data configuration.
Fault Handling Flowchart
None
Fault Handling Procedure
1.
Check whether there are alarms related to RF units or RF unit maintenance links.
Yes: Handle the alarms. For details, see eNodeB Alarm Reference. Go to 2.
No: Go to 3.
2.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 3.
3.
Check whether RF resources are abnormal.
Run the DSP BRD, DSP RRU, or DSP BRDVER command to query the status of RF resources.
Yes: Handle the problem. Go to 4.
No: Go to 5.
4.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 5.
5.
If the fault persists, contact Huawei technical support.
Typical Cases
Fault Description
After a cell activation command was executed, Figure 9-2 was displayed. In another case, after
a cell query command was executed, Figure 9-3 was displayed.
Figure 9-2 RRU TX branch is not usable (1)
Figure 9-3 RRU TX branch is not usable (2)
Fault Diagnosis
OM personnel checked RF-channel-related alarms (including VSWR alarms and RF unit
maintenance link alarms) and found there were RF unit maintenance link alarms. OM personnel
then determined that fiber connections were incorrect according to alarm help information.
Fault Handling
After OM personnel reinstalled the fibers, the alarms were cleared and the cell was successfully
activated.
9.7 Troubleshooting Cell Unavailability Faults Due to
Limited Capacity or Capability
This section provides information required to troubleshoot cell unavailability faults due to
limited capacity or capability. The information includes fault descriptions, background
information, possible causes, fault handling method and procedure, and typical cases.
Fault Description
A cell fails to be set up if the required capacity or capability is limited on software or hardware.
Background Information
None
Possible Causes
The hardware or software specification is limited (for example, the licensed capacity or
capability is limited), leading to cell unavailability.
Fault Handling Flowchart
None
Fault Handling Procedure
1.
Obtain the command output after the cell fails to be activated.
2.
Rectify the cell fault according to the command output. For details about the command
output, check MML help information or related eNodeB documents.
3.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 4.
4.
If the fault persists, contact Huawei technical support.
Typical Cases
Fault Description
After the DSP CELL command was executed, Figure 9-4 was displayed.
Figure 9-4 Command output indicating a failure to obtain the licensed number of cells
Fault Diagnosis
According to the command output, the cell activation failure is caused by license limitation. The
DSP LICENSE command is run, and the result indicates that the licensed number of cells is 3.
However, four cells are actually configured according to the result of the LST CELL command.
The configured number of cells exceeds the licensed number, which leads to the cell activation
failure.
Fault Handling
After a new license is applied for, downloaded, and activated, the cell is successfully activated.
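The comparison made in this case can be written as a short check. The following Python sketch is illustrative only: the function name is hypothetical, and the two counts are assumed to have been read manually from the DSP LICENSE and LST CELL command outputs.

```python
def cells_over_license(configured_cells: int, licensed_cells: int) -> int:
    """Return how many configured cells exceed the licensed number (0 if none)."""
    return max(0, configured_cells - licensed_cells)


# Values from the case above: four configured cells, three licensed cells.
excess = cells_over_license(configured_cells=4, licensed_cells=3)
if excess > 0:
    print(f"{excess} cell(s) exceed the licensed number; activation is expected to fail.")
```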
9.8 Troubleshooting Cell Unavailability Faults Due to
Faulty Hardware
This section provides information required to troubleshoot cell unavailability faults due to faulty
hardware. The information includes fault descriptions, background information, possible causes,
fault handling method and procedure, and typical cases.
Fault Description
Board fault alarms are reported. Alternatively, cell unavailability faults cannot be rectified after
resetting, powering off, or reinstalling faulty boards.
Background Information
None
Possible Causes
A cell may fail to be set up if a fault occurs in the main control board, LBBP, RF unit, or other
hardware (for example, a subrack).
Fault Handling Flowchart
None
Fault Handling Procedure
1.
Check whether the board status is abnormal and whether the board versions are mismatched.
Run the DSP BRD or DSP BRDVER command to query the status. Pay particular attention to RF units.
Yes: Rectify the board faults. Go to 2.
No: Go to 3.
2.
Check whether the cell fault is rectified.
Yes: End.
No: Go to 3.
3.
Collect the logs of the faulty cell.
The logs to be collected include the logs of the main control board, LBBP, and RF unit.
4.
Determine whether restoration operations such as eNodeB or board resets can be performed.
Yes: Go to 5.
No: Go to 9.
5.
(Optional) Reset the RF unit, LBBP, or main control board.
Run the RST BRD or RST ENODEB command.
6.
(Optional) Check whether the cell fault is rectified.
Yes: End.
No: Go to 7.
7.
(Optional) Power off the RF unit and LBBP.
Run the OPR BRDPWR command.
8.
(Optional) Check whether the cell fault is rectified.
Yes: End.
No: Go to 9.
9.
If the fault persists, contact Huawei technical support.
Typical Cases
None
10 Troubleshooting IP Transmission Faults
About This Chapter
This chapter defines IP transmission faults and describes how to troubleshoot them.
10.1 Definitions of IP Transmission Faults
If an Internet Protocol (IP) transmission fault occurs, messages and service data cannot be
transmitted between communication devices, and a peer device cannot be pinged.
10.2 Background Information
This section provides alarms related to IP transmission faults.
10.3 Troubleshooting Method
This section describes the method and procedure for troubleshooting IP transmission faults.
10.4 Troubleshooting IP Physical Layer Faults
This section provides information required to troubleshoot IP physical layer faults. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
10.5 Troubleshooting IP Link Layer Faults
This section provides information required to troubleshoot IP link layer faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
10.6 Troubleshooting IP Layer Faults
This section provides information required to troubleshoot IP layer faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
10.1 Definitions of IP Transmission Faults
If an Internet Protocol (IP) transmission fault occurs, messages and service data cannot be
transmitted between communication devices, and a peer device cannot be pinged.
10.2 Background Information
This section provides alarms related to IP transmission faults.
Related Alarms
The following alarms may be reported to indicate Internet Protocol (IP) transmission faults:
l
ALM-25880 Ethernet Link Fault
l
ALM-25885 IP Address Conflict
l
ALM-25886 IP Path Fault
l
ALM-25888 SCTP Link Fault
l
ALM-29240 Cell Unavailable
For details, see eNodeB Alarm Reference.
10.3 Troubleshooting Method
This section describes the method and procedure for troubleshooting IP transmission faults.
Troubleshooting Flowchart
Figure 10-1 Troubleshooting flowchart for IP transmission faults
Troubleshooting Procedure
1.
Check whether an alarm indicating the Ethernet link fault is reported in the active alarms
on the eNodeB. If an alarm indicating the Ethernet link fault is reported, rectify the fault.
If no alarm indicating the Ethernet link fault is reported, go to 2.
2.
Ping the IP address nearest to the local end or the network segment IP address. If the
ping fails, there is an IP link layer fault. Rectify the fault. If the ping succeeds, go to 3.
3.
Ping an IP address that is in the same network segment as the local IP address and ping the
destination IP address. If the IP address in the same network segment can be pinged but
the destination IP address cannot be pinged, there is an IP layer fault. Rectify the fault.
If both IP addresses can be pinged, go to 4.
4.
If the fault persists, contact Huawei technical support.
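The layer-by-layer order of the procedure above can be sketched as a small decision function. This Python sketch is illustrative only: the function and parameter names are hypothetical, and the four inputs are assumed to come from manual checks (the active alarm list and ping results), not from any eNodeB interface.

```python
def classify_ip_fault(ethernet_alarm: bool,
                      nearest_hop_pingable: bool,
                      same_segment_pingable: bool,
                      destination_pingable: bool) -> str:
    """Map the check results of steps 1 to 3 to the layer to troubleshoot."""
    if ethernet_alarm:
        return "physical layer"          # step 1: Ethernet link alarm is active
    if not nearest_hop_pingable:
        return "link layer"              # step 2: nearest IP address does not answer
    if same_segment_pingable and not destination_pingable:
        return "IP layer"                # step 3: local segment works, destination does not
    # The procedure does not classify the remaining combinations (step 4).
    return "contact technical support"


print(classify_ip_fault(False, True, True, False))
```

Checking the layers from the bottom up avoids investigating routes while the physical link itself is down.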
10.4 Troubleshooting IP Physical Layer Faults
This section provides information required to troubleshoot IP physical layer faults. The
information includes fault descriptions, background information, possible causes, fault handling
method and procedure, and typical cases.
Fault Description
An alarm indicating an Ethernet link fault is reported among the active alarms on the eNodeB.
Related Information
None
Possible Causes
The Ethernet cable or optical module has faults.
Fault Handling
None
Fault Handling Procedure
1.
Monitor the Ethernet port indicator status.
There are two indicators for an Ethernet port. If the green indicator is on, the negotiation
succeeds between the Ethernet port and the peer port. If the green indicator is off, the
negotiation fails between the Ethernet port and the peer port. If the yellow indicator blinks
fast, data is being transmitted through the port. If the yellow indicator is off, no data is being
transmitted through the port.
Locate the fault based on the indicator status.

| Indicator Status | Possible Fault Cause |
|---|---|
| Both green indicators on the eNodeB and switch are on. | Port negotiation is successful and the ports are up. This indicates that the physical layer communication is normal. |
| The green indicator on the eNodeB is on and the green indicator on the switch is off. | The port on the eNodeB is up and the port on the switch is down. The possible cause is that the configuration is incorrect or the hardware is faulty. Perform the following steps to locate the fault. |
| The green indicator on the eNodeB is off and the green indicator on the switch is on. | The port on the eNodeB is down and the port on the switch is up. The possible cause is that the configuration is incorrect or the hardware is faulty. Perform the following steps to locate the fault. |
| Both green indicators on the eNodeB and the switch are off. | The negotiation has failed and the ports are down. Perform the following steps to locate the fault. |

2.
Check cables.
l Check the Ethernet cable.
Check whether the Ethernet cable is properly prepared and whether the cable is longer
than 100 m.
a.
Check and record the bandwidth (100 Mbit/s or 1000 Mbit/s) supported by the
personal computer (PC) used.
b.
Disconnect the Ethernet cable from the eNodeB and connect it to the PC and check
whether the ports used to connect the PC and the switch are up. If the ports are up,
check and record the bandwidth (100 Mbit/s or 1000 Mbit/s) negotiated between
the PC and the switch.
l Check the optical cable and optical modules.
a.
Check whether the optical modules are securely inserted. If they are not securely
inserted, reinsert them. Check information about the optical module manufacturer,
rate, mode (single-mode or multi-mode), wavelength, and communication
distance. It is recommended that the eNodeB and peer device use optical modules
provided by the same manufacturer and with the same rate.
b.
Check whether the optical cable is securely inserted. If it is not securely inserted,
reinsert it. Check whether the optical cable is broken due to excessive bending. If
it is broken, replace it.
c.
Check whether the optical module is damaged by inserting the two ends of one optical
cable into the optical module. Check whether an alarm indicating an optical module
fault is reported on the LMT. If no such alarm is reported, the optical module is
normal. If such an alarm is reported, replace the optical module.
3.
Check configurations.
Log in to the eNodeB and run the LST ETHPORT and DSP ETHPORT commands to
check the Ethernet port configuration, especially the Port Attribute, Speed, and Duplex.
The Port Attribute indicates whether an Ethernet port is an electrical port or optical port.
The port attribute can be set to AUTO. If the Port Attribute is set to Fiber, but an electrical
port is used, the port status should be down. Other parameters can be checked in a similar
way.
The rate and duplex mode must be configured the same on the eNodeB and the switch. If
they are not configured the same on the eNodeB and the switch, the port negotiation fails
or the port negotiation succeeds but packets are lost. The Gigabit Ethernet (GE) electrical
port on the eNodeB can be set to AUTO only. If the GE electrical port on the eNodeB is
used to connect to the switch, the port attribute must be set to AUTO on both the eNodeB
and the switch.
The following parameter settings are recommended.
| Port Type | Rate and Duplex Mode on the eNodeB | Rate and Duplex Mode on the Switch |
|---|---|---|
| Fast Ethernet (FE) electrical or optical port | 100M/FULL | 100M/FULL |
| FE electrical or optical port | AUTO/AUTO | AUTO/AUTO |
| GE electrical port | AUTO/AUTO | AUTO/AUTO |
| GE optical port | 100M/FULL | 100M/FULL |
| GE optical port | AUTO/AUTO | AUTO/AUTO |
Change the parameter settings on the eNodeB to check the configurations on the switch.
Change both the rate and duplex mode to AUTO. If port negotiation succeeds after the
change and the DSP ETHPORT command output is the same as expected, the rate and
duplex mode are both set to AUTO on the switch. If the port negotiation fails, the rate and
duplex mode are not set to AUTO on the switch. Analyze the possible configuration on the
switch based on the DSP ETHPORT command output and change the configuration on
the eNodeB accordingly.
4.
Isolate the fault.
a.
Connect a PC to the Ethernet port on the eNodeB and check whether the alarm is
cleared.
b.
Connect a PC to the Ethernet port on the switch and check whether the PC indicator
is on.
c.
Identify and isolate the fault.
l If the alarm is cleared and the PC indicator is off, the Ethernet port on the switch
is faulty. Go to 4.4.
l If the alarm persists and the PC indicator is on, the Ethernet port on the eNodeB
is faulty. Go to 4.5.
l If the alarm is cleared and the PC indicator is on, the Ethernet ports on the peer
device and the eNodeB are not fully electrically compatible. Go to 4.6.
d.
Replace the switch.
e.
Run the RST ETHPORT and RST BRD commands to reset the Ethernet port and
the board, respectively.
Check whether an alarm indicating a board chip fault is reported. If an alarm indicating
a board chip fault is reported, replace the board on which the Ethernet port is located.
f.
Check the parameters negotiated between the Ethernet ports on the switch and the
eNodeB.
5.
If the fault persists, contact Huawei technical support.
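Two of the checks in the procedure above lend themselves to a compact illustration: the consistency rule for rate and duplex mode, and the fault isolation decision based on the PC test. The Python sketch below is an assumption-laden illustration, not a product tool: all names (PortSetting, check_port_settings, isolate_ethernet_fault) are hypothetical, and the inputs are assumed to be entered by hand from the LST ETHPORT output, the switch configuration, and onsite observation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortSetting:
    rate: str    # for example "100M" or "AUTO"
    duplex: str  # for example "FULL" or "AUTO"


def check_port_settings(port_type: str, enodeb: PortSetting, switch: PortSetting) -> list:
    """Return the problems found in the rate/duplex settings (empty list if none)."""
    problems = []
    if enodeb != switch:
        problems.append("rate/duplex differ between eNodeB and switch")
    if port_type == "GE electrical":
        # The GE electrical port supports AUTO only, so both ends must use AUTO.
        for name, setting in (("eNodeB", enodeb), ("switch", switch)):
            if (setting.rate, setting.duplex) != ("AUTO", "AUTO"):
                problems.append(f"GE electrical port must be AUTO/AUTO on the {name}")
    return problems


def isolate_ethernet_fault(alarm_cleared: bool, pc_indicator_on: bool) -> str:
    """Apply the three outcomes listed in substep c of step 4."""
    if alarm_cleared and not pc_indicator_on:
        return "switch port faulty"
    if not alarm_cleared and pc_indicator_on:
        return "eNodeB port faulty"
    if alarm_cleared and pc_indicator_on:
        return "ports not fully electrically compatible"
    # Alarm persists and PC indicator is off: not covered by the procedure.
    return "undetermined"


print(check_port_settings("FE electrical", PortSetting("100M", "FULL"), PortSetting("AUTO", "AUTO")))
print(isolate_ethernet_fault(alarm_cleared=True, pc_indicator_on=False))
```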
Typical Cases
None
10.5 Troubleshooting IP Link Layer Faults
This section provides information required to troubleshoot IP link layer faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
Fault Description
Signaling messages and service data cannot be transmitted between communication devices.
The peer device cannot be pinged.
Related Information
None
Possible Causes
l
The Ethernet port negotiation mode is inconsistent between the eNodeB and the peer device.
l
The virtual local area network (VLAN) is incorrectly configured.
Fault Handling
Check whether the Address Resolution Protocol (ARP) and VLAN mechanisms work properly.
Before transmitting an Internet Control Message Protocol (ICMP), Stream Control Transmission
Protocol (SCTP), or User Datagram Protocol (UDP) packet, the eNodeB queries the next-hop
media access control (MAC) address in the ARP table based on the IP route. The eNodeB
transmits the packet only if the ARP table contains an entry for the next hop. If there is no
such entry, the eNodeB broadcasts an ARP request for the next-hop MAC address.
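The lookup described above can be pictured with a minimal Python sketch. It is a simplified model, not the eNodeB implementation: the ARP table is represented as a plain dictionary, the function name is hypothetical, and the addresses are documentation examples.

```python
def plan_transmission(arp_table: dict, next_hop_ip: str) -> tuple:
    """Decide what happens before a packet is sent to the given next hop."""
    mac = arp_table.get(next_hop_ip)
    if mac is None:
        # No MAC address is known for the next hop: resolve it first.
        return ("broadcast ARP request", None)
    return ("transmit", mac)


# Example table with one learned next hop (example addresses only).
arp_table = {"192.0.2.1": "00:00:5e:00:53:01"}
print(plan_transmission(arp_table, "192.0.2.1"))
print(plan_transmission(arp_table, "192.0.2.254"))
```

If the ARP request is never answered, for example because the VLAN tag is wrong, no packet can leave the eNodeB toward that next hop, which is why the procedure checks ARP and VLAN first.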
Fault Handling Procedure
1.
Check packet transmitting and receiving on the eNodeB.
Run the DSP ETHPORT command multiple times to check packet transmitting and
receiving on the eNodeB. If only the number of packets transmitted by the eNodeB
increases, the peer device does not respond. Check whether the eNodeB has transmitted
incorrect packets or the packets are correct but the peer device is faulty. Go to the next step.
2.
Query the ARP table.
Check whether the eNodeB has learned the ARP.
If the eNodeB has not learned the ARP, perform a ping test and check again. If the eNodeB
still has not learned the ARP, run the STR PORTREDIRECT command to start port
redirection to trace the packet header. Check whether the eNodeB has sent an ARP packet
and whether the packet is correct.
(Optional) Query the ARP information on the onsite switch.
The ARP aging period is 20 minutes on the eNodeB. If communication between the
eNodeB and the peer device lasts for only 20 minutes, the ARP entry failed to be updated
after aging. If the VLAN configuration was changed within the 20 minutes, the fault is caused
by an incorrect VLAN configuration. If the VLAN configuration was not changed within the
20 minutes, the peer device must also be checked.
3.
Check the VLAN configuration.
Run the LST VLANMAP and LST VLANCLASS commands to check whether the VLAN
configuration is correct.
Run the STR PORTREDIRECT command on the eNodeB to start port mirroring to trace
the packet header. Compare the VLAN configuration with the VLAN information in the
packet. If the VLAN information in the packet is incorrect, modify the VLAN configuration
and check again.
NOTE
If VLAN group mode is used, the ARP message type is OTHER.
If the VLAN information in the ARP message is correct, the eNodeB is normal. Confirm
with the customer the VLAN configuration and port type of the peer device and the reason
why the peer device does not respond.
4.
If the fault persists, contact Huawei technical support.
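Step 1 of the procedure above compares packet counters across repeated DSP ETHPORT runs. The Python sketch below shows that comparison under stated assumptions: the function name is hypothetical, and each sample is a (transmitted, received) packet count pair copied manually from successive command outputs.

```python
def peer_not_responding(samples: list) -> bool:
    """Return True if only the transmit counter grew between the first and last sample."""
    if len(samples) < 2:
        raise ValueError("at least two samples are needed")
    tx_first, rx_first = samples[0]
    tx_last, rx_last = samples[-1]
    return tx_last > tx_first and rx_last <= rx_first


# Three readings taken some seconds apart (example numbers only).
readings = [(1000, 500), (1040, 500), (1085, 500)]
print(peer_not_responding(readings))
```

A True result only says that nothing came back. Whether the eNodeB sent incorrect packets or the peer is faulty still has to be determined with the ARP and VLAN checks in steps 2 and 3.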
Typical Cases
None
10.6 Troubleshooting IP Layer Faults
This section provides information required to troubleshoot IP layer faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
Fault Description
The peer device cannot be pinged and an IP address in the same network segment as the eNodeB
can be pinged. Alarms indicating an SCTP link fault, cell unavailability, and a path fault are
reported by the upper layer.
Related Information
None
Possible Causes
l
The route configuration is incorrect or a related device is faulty.
l
The transmission network is disconnected.
Fault Handling
In most cases, the cause is that routes are unavailable. If the ARP table and VLAN are normal,
troubleshoot the fault as described in the following procedure.
Fault Handling Procedure
1.
Query the configured routes.
Run the LST IPRT and DSP IPRT commands to check whether routes are correctly
configured on the eNodeB.
2.
Use the traceroute function to locate the fault.
Run the TRACERT command on the eNodeB to query the nodes that the transmitted
packets pass and determine the gateway where the route becomes unavailable.
3.
Trace protocol data.
Run the STR PORTREDIRECT command on the eNodeB to start port mirroring to trace
the protocol and packet header.
4.
If the fault persists, contact Huawei technical support.
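The purpose of step 2 is to find the last node that still answers. The Python sketch below illustrates that reading of a traceroute result. It assumes the hop list has been transcribed by hand from the TRACERT output, with None standing for a hop that timed out. The function name and the addresses are hypothetical.

```python
from typing import List, Optional


def last_responding_hop(hops: List[Optional[str]]) -> Optional[str]:
    """Return the address of the last hop that answered, or None if no hop answered."""
    last = None
    for address in hops:
        if address is not None:
            last = address
    return last


# Two gateways answer, then the trace times out (example addresses only).
hops = ["192.0.2.1", "198.51.100.1", None, None]
print(last_responding_hop(hops))
```

The route toward the destination is likely to become unavailable at or immediately after the returned gateway, so its routing configuration is the first place to check.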
Typical Cases
None
11 Troubleshooting Application Layer Faults
About This Chapter
This chapter describes the definitions of application layer faults and the troubleshooting method.
11.1 Definitions of Application Layer Faults
Application layer faults include unavailability and intermittent disconnection of Stream Control
Transmission Protocol (SCTP) links, Internet Protocol (IP) paths, and operation and
maintenance (OM) channels.
11.2 Background Information
11.3 Troubleshooting Method
This section describes the method and procedure for troubleshooting IP transport and application
layer faults.
11.4 Troubleshooting SCTP Link Faults
This section provides information required to troubleshoot SCTP link faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
11.5 Troubleshooting IP Path Faults
This section provides information required to troubleshoot IP path faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
11.6 Troubleshooting OM Channel Faults
This section provides information required to troubleshoot OM channel faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
11.1 Definitions of Application Layer Faults
Application layer faults include unavailability and intermittent disconnection of Stream Control
Transmission Protocol (SCTP) links, Internet Protocol (IP) paths, and operation and
maintenance (OM) channels.
11.2 Background Information
The Stream Control Transmission Protocol (SCTP) is a transport protocol that runs on top of the
IP layer. The function of SCTP is similar to that of the Transmission Control Protocol (TCP)
and User Datagram Protocol (UDP), which work on the same layer as the SCTP. The latest standard
to which the SCTP conforms is Request for Comments (RFC) 2960 released in October 2000.
Compared with the TCP, the SCTP is improved for specific applications. In addition, multiple
features are added to the SCTP. The SCTP is now widely used in radio communications,
multimedia, and QoS.
The operation and maintenance (OM) channel is used for remote maintenance of eNodeBs. An
OM channel is set up using TCP handshakes.
11.3 Troubleshooting Method
This section describes the method and procedure for troubleshooting IP transport and application
layer faults.
Troubleshooting Flowchart
Figure 11-1 Troubleshooting flowchart for IP transport and application layer faults
Troubleshooting Procedure
1.
Check whether an alarm indicating a Stream Control Transmission Protocol (SCTP) link
fault is reported or whether the SCTP link status is abnormal.
Yes: Troubleshoot the SCTP link fault.
No: Go to 2.
2.
Check whether an alarm indicating an Internet Protocol (IP) path fault is reported or whether
the IP path status is abnormal.
Yes: Troubleshoot the IP path fault.
No: Go to 3.
3.
Check whether an alarm indicating an operation and maintenance (OM) channel fault is
reported or whether the OM channel status is abnormal.
Yes: Troubleshoot the OM channel fault.
No: Go to 4.
4.
Contact Huawei technical support.
11.4 Troubleshooting SCTP Link Faults
This section provides information required to troubleshoot SCTP link faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
Fault Description
l
Either of the following alarms is reported:
– ALM-25888 SCTP Link Fault
– ALM-25889 SCTP Link Congestion
l
The Stream Control Transmission Protocol (SCTP) link is unavailable or available only in
one direction.
After sending data to the peer device, the sender does not receive a response from the peer
device. In addition, the sender does not receive data from the peer device.
l
The SCTP link is abnormal.
The SCTP link is faulty or intermittently disconnected.
Related Information
To rectify SCTP link faults, you need to trace SCTP messages.
SCTP message blocks (chunks) are of 13 types, including INIT, INIT ACK, DATA, SACK,
ABORT, SHUTDOWN, ERROR, COOKIE ECHO, and HEARTBEAT.
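When reading a trace, it helps to have the full list of the 13 message block (chunk) types at hand. The Python enumeration below lists them with the type identifiers assigned in RFC 2960. The class name is hypothetical. Identifiers 12 and 13 are reserved in that RFC for ECNE and CWR and are therefore omitted here.

```python
from enum import IntEnum


class SctpChunkType(IntEnum):
    """SCTP chunk type identifiers as assigned in RFC 2960."""
    DATA = 0
    INIT = 1
    INIT_ACK = 2
    SACK = 3
    HEARTBEAT = 4
    HEARTBEAT_ACK = 5
    ABORT = 6
    SHUTDOWN = 7
    SHUTDOWN_ACK = 8
    ERROR = 9
    COOKIE_ECHO = 10
    COOKIE_ACK = 11
    SHUTDOWN_COMPLETE = 14


# Translate a type value seen in a trace into its name.
print(SctpChunkType(6).name)
```

A normal association setup shows INIT, INIT ACK, COOKIE ECHO, and COOKIE ACK in that order. An ABORT in place of INIT ACK usually points to a parameter mismatch on the peer.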
Parameters such as the first peer IP address, the second peer IP address (used in SCTP dual
homing), and peer port number configured on the eNodeB must be consistent with those
configured on the mobility management entity (MME). Run the LST SCTPLNK command. In
the command output, the parameters in red rectangles are eNodeB parameters and the parameters
in the blue rectangles are evolved packet core (EPC) parameters. Ensure that the MME
parameters configured on the eNodeB are consistent with the SCTP parameters of the MME and
that eNodeB parameters configured on the EPC are consistent with the SCTP parameters of the
eNodeB.
Figure 11-2 SCTP link configuration information
On the MME, check whether the peer port number configured on the MME is the same as the
local port number configured on the eNodeB and whether a correct network segment is
configured.
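The cross-check between the two ends can be sketched as follows. This Python example is a simplified illustration for a single-homed link: the class, field, and function names are hypothetical, the values are assumed to be copied by hand from the LST SCTPLNK output and the MME configuration, and the second peer IP address used in dual homing is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SctpEndpoint:
    local_ip: str
    local_port: int
    peer_ip: str
    peer_port: int


def sctp_mismatches(enodeb: SctpEndpoint, mme: SctpEndpoint) -> list:
    """List the parameters that are inconsistent between the eNodeB and the MME."""
    checks = [
        ("peer IP on eNodeB differs from local IP on MME", enodeb.peer_ip == mme.local_ip),
        ("peer port on eNodeB differs from local port on MME", enodeb.peer_port == mme.local_port),
        ("peer IP on MME differs from local IP on eNodeB", mme.peer_ip == enodeb.local_ip),
        ("peer port on MME differs from local port on eNodeB", mme.peer_port == enodeb.local_port),
    ]
    return [label for label, consistent in checks if not consistent]


# Example values only; the MME expects the eNodeB on a different port.
enodeb = SctpEndpoint("192.0.2.10", 36412, "198.51.100.20", 36412)
mme = SctpEndpoint("198.51.100.20", 36412, "192.0.2.10", 36422)
for problem in sctp_mismatches(enodeb, mme):
    print(problem)
```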
Possible Causes
l
The transmission network is faulty.
l
The SCTP parameters are incorrectly configured on the eNodeB or MME.
l
The NE has internal faults.
Fault Handling
None
Fault Handling Procedure
l
Typical Scenario
To find the cause for an SCTP fault, perform the following steps:
1.
Check configurations.
Check whether SCTP parameters are correctly configured on the MME and the
eNodeB.
2.
Check the transmission.
Ping the MME IP address. If the MME IP address cannot be pinged, check the route
and transmission network. If VLANs are configured for the eNodeB, set the
differentiated services code point (DSCP) value in the ping command to the one
configured for the VLAN for user data.
3.
Start SCTP message tracing.
Start SCTP message tracing and compare the tracing result with normal SCTP message
exchange.
4.
Start a tracing task using Wireshark.
Run the STR PORTREDIRECT command on the eNodeB to start port redirection.
If no desired data is traced, it is possible that the transmitting port did not send the
data. If desired data is traced, the transmission network and EPC are normal.
5.
If the fault persists, contact Huawei technical support.
l
Intermittent SCTP Link Disconnection
If an SCTP link is intermittently interrupted, the eNodeB cannot receive a response from
the peer device and then the SCTP link is down. After several seconds, the eNodeB initiates
SCTP link reestablishment and the SCTP link recovers.
1.
Check transmission alarms.
2.
Check the Quality of Service (QoS) of signaling data.
If VLANs are configured for the eNodeB, check whether the VLAN for signaling data
is correctly configured on the eNodeB. If VLANs are differentiated by next-hop IP
address, the check is not required. If VLANs are differentiated by service type, the
check is required.
If no VLAN is configured for the eNodeB, check whether the DSCP value for signaling
data is the same as that for the transmission network. Run the LST DIFPRI command
to query the DSCP value for signaling data. Check whether the DSCP value is 46 in
the QoS configuration for the transmission network. Ensure that data with a DSCP
value of 46 can be properly transmitted in the transmission network.
If the transport network bandwidth is limited and the DSCP value for SCTP services
is less than that for other types of services, the SCTP link will be intermittently
interrupted. Therefore, check with the customer whether SCTP services have a high
DSCP-indicated priority in the transmission network.
3.
Start SCTP message tracing.
Start SCTP message tracing and analyze the messages to find the cause for the link
failure.
4.
Check the network packet loss rate.
If the SCTP message tracing shows that packets are lost, check whether the port
attribute of the gigabit Ethernet (GE) or fast Ethernet (FE) port is consistent with that
on the peer device. If it is consistent, ping the peer device to check the packet loss rate
on the transmission network.
5.
Start a Wireshark tracing task.
Run the STR PORTREDIRECT command on the eNodeB to start port redirection.
If no desired data is traced, it is possible that the transmitting port did not send the
data. If desired data is traced, the transmission network and EPC are normal.
6.
Take preventive measures.
If configurations are correct and the peer device can be pinged, run the MOD
SCTPLNK command or remove the SCTP link information and reconfigure the SCTP
parameters so that the eNodeB and the peer device negotiate about the SCTP link
again.
7.
If the fault persists, contact Huawei technical support.
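When the DSCP value for signaling data is verified in a packet capture, the value often appears as the whole ToS (Traffic Class) byte, not as the DSCP itself. The DSCP occupies the upper six bits of that byte, so the conversion is a two-bit shift. The Python sketch below shows the arithmetic for the value 46 mentioned in step 2. The function name is hypothetical.

```python
def dscp_to_tos(dscp: int) -> int:
    """Convert a 6-bit DSCP value to the corresponding 8-bit ToS byte (ECN bits zero)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


tos = dscp_to_tos(46)
print(tos, hex(tos))  # 184 0xb8
```

Signaling packets marked with DSCP 46 therefore carry a ToS byte of 0xB8, provided that no device on the path has re-marked them.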
Typical Cases
None
11.5 Troubleshooting IP Path Faults
This section provides information required to troubleshoot IP path faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
Fault Description
l
The S1 interface is normal and cells are successfully activated, but UEs cannot attach to
the network.
l
UEs can attach to the network but cannot set up bearers of some QoS class identifiers
(QCIs). QoS is short for quality of service.
Related Information
The related alarm is as follows:
ALM-25886 IP Path Fault
Possible Causes
l
The Internet Protocol (IP) route is incorrectly configured.
l
The IP path parameters are incorrectly configured.
Fault Handling
None
Fault Handling Procedure
1.
Check whether ALM-25886 IP Path Fault is reported.
If yes, clear the alarm by referring to eNodeB Alarm Reference. If no, go to the next step.
2.
Check whether IP path parameters are correctly configured.
Run the LST IPPATH command. In the command output, if Path Type is QOS and
DSCP is 0, only default bearers can be set up. In this case, change Path Type to ANY.
3.
If the fault persists, contact Huawei technical support.
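The check in step 2 can be sketched in Python as follows. This is a minimal illustration, not a Huawei tool: it assumes the LST IPPATH output has already been parsed into dictionaries, and the `Path ID` key and the sample records are hypothetical. Only the Path Type and DSCP conditions come from the procedure above.

```python
from typing import Dict, List


def find_default_only_paths(ip_paths: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return IP path records on which only default bearers can be set up.

    Per step 2, a record is flagged when Path Type is QOS and DSCP is 0.
    """
    flagged = []
    for path in ip_paths:
        path_type = path.get("Path Type", "").strip().upper()
        dscp = str(path.get("DSCP", "")).strip()
        if path_type == "QOS" and dscp == "0":
            flagged.append(path)
    return flagged


# Hypothetical records standing in for parsed LST IPPATH output.
sample_paths = [
    {"Path ID": "0", "Path Type": "QOS", "DSCP": "0"},
    {"Path ID": "1", "Path Type": "ANY", "DSCP": "0"},
    {"Path ID": "2", "Path Type": "QOS", "DSCP": "46"},
]

for path in find_default_only_paths(sample_paths):
    print(f"Path {path['Path ID']}: change Path Type to ANY")
```

With the sample records, only path 0 is flagged, because it is the only one that combines Path Type QOS with DSCP 0.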
Typical Cases
None
11.6 Troubleshooting OM Channel Faults
This section provides information required to troubleshoot OM channel faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
Fault Description
The ALM-25901 Remote Maintenance Link Failure alarm is reported.
Operation and maintenance (OM) channel faults are classified into two categories:
l
OM channel unavailability: The OM channel is faulty.
l
OM channel interruption: The OM channel is intermittently interrupted.
Related Information
None
Possible Causes
l
The transmission network is faulty.
l
The OM channel parameters are incorrectly configured on the eNodeB or the M2000.
l
Some ports are disabled in the transport network.
Fault Handling
None
Fault Handling Procedure
This section describes how to handle an OM channel fault in various scenarios.
l
Typical Scenario
1.
Check configurations.
Check whether OM channel parameters are correctly configured on the M2000 client
and the eNodeB.
2.
Check the transmission.
Ping the IP address of the M2000. If the IP address of the M2000 cannot be pinged,
check the route and transport network.
NOTE
If ping operations are prohibited in the operator network, do not ping the M2000 client.
3.
(Optional) Trace protocol data.
If allowed, a protocol data tracing tool such as WireShark can be used to analyze
packet headers. Add a switch between the transmitting port and the transmission
network, configure transmitting port mirroring on the switch, and connect a personal
computer (PC) to the mirroring port on the switch to trace packet headers. If no desired
packet header is traced, the transmitting port is faulty. If desired packet headers are
traced, the transmission network is faulty.
4.
If the fault persists, contact Huawei technical support.
l
Intermittent OM Channel Interruption
1.
Check transmission alarms.
On the M2000 client, check whether a transmission alarm is reported by the eNodeB
during the intermittent transmission, for example, whether an Ethernet trunk fault
alarm is reported. If a transmission alarm is reported, adjust the transport network. If
no transmission alarm is reported, go to the next step. Check whether an alarm
indicating intermittent link (such as SCTP link) disconnections is also reported. If
such an alarm is reported, rectify the fault too.
2.
Check the VLAN configuration.
If VLANs are configured for the eNodeB, check whether the VLAN for OM data is
correctly configured on the eNodeB. If VLANs are differentiated by next-hop IP
address, the check is not required.
3.
Check whether network loopbacks exist.
Check whether loopbacks exist in the network based on the network topology. The
causes of loopbacks are twofold. Some loopbacks are caused by oversights in network
design, whereas others are temporary loopback links that were built during link tests
but were not removed promptly. As a result, loopbacks require careful investigation.
4.
(Optional) Trace protocol data.
If allowed, a protocol data tracing tool such as WireShark can be used to analyze
packet headers. Add a switch between the transmitting port and the transmission
network, configure transmitting port mirroring on the switch, and connect a personal
computer (PC) to the mirroring port on the switch to trace packet headers. If no desired
packet header is traced, the transmitting port is faulty. If desired packet headers are
traced, the transmission network is faulty.
5.
If the fault persists, contact Huawei technical support.
Typical Cases
None
12 Troubleshooting Transmission Synchronization Faults
About This Chapter
This chapter describes how to troubleshoot transmission synchronization faults. These faults
include the clock reference problem, IP clock link fault, system clock unlocked fault, base
station synchronization frame number error, and time synchronization failure.
12.1 Definitions of Transmission Synchronization Faults
This section describes the classification and definitions of transmission synchronization faults.
12.2 Background Information
For details about IP clock and non-IP clock, see eRAN Synchronization Feature Parameter
Description.
12.3 Troubleshooting Specific Transmission Synchronization Faults
This section provides information required to troubleshoot specific transmission synchronization
faults. The information includes fault descriptions, background information, possible causes,
fault handling method and procedure, and typical cases.
12.1 Definitions of Transmission Synchronization Faults
This section describes the classification and definitions of transmission synchronization faults.
The following defines common transmission synchronization faults:
l
Clock reference problem
This fault occurs in the case of external clock reference loss, external clock reference
unavailability due to unacceptable quality, or excessive phase (or frequency) deviation
between the local oscillator and external clock references.
l
IP clock link fault
This fault occurs when the IP clock link between the eNodeB and the clock server
malfunctions.
l
System clock unlocked fault
This fault occurs when a phase-locked loop in a board is unlocked.
l
Base station synchronization frame number error
This error occurs when a synchronization frame number provided to a board is incorrect.
For example, a frame number jump occurs when the pulse per second (PPS) signals provided
by the GPS are abnormal.
l
Time synchronization failure
This failure occurs when the eNodeB fails to synchronize with the time synchronization
server (for example, the NTP server).
12.2 Background Information
For details about IP clock and non-IP clock, see eRAN Synchronization Feature Parameter
Description.
12.3 Troubleshooting Specific Transmission Synchronization Faults
This section provides information required to troubleshoot specific transmission synchronization
faults. The information includes fault descriptions, background information, possible causes,
fault handling method and procedure, and typical cases.
Fault Description
External reference clocks for eNodeBs include GPS, synchronous Ethernet, clock over IP, BITS,
E1/T1, and TOD clocks. Any abnormality in a reference clock prevents the eNodeB from
locking the reference clock. To check the clock status, run the DSP CLKSTAT command.
The fault symptoms in the command output are as follows:
l
The value of Current Clock Source State indicates an unknown status.
l
The value of Current Clock Source State indicates that the reference clock is abnormal,
for example, the reference clock is lost.
l
The value of PLL Status indicates that the PLL status is abnormal, for example, the
reference clock is in free-run mode or there is excessive frequency deviation.
l
The value of Clock Synchronization Mode indicates that the clock synchronization mode
is not set to a specified mode.
If any of the preceding conditions is met, there is a transmission synchronization problem.
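The symptom checks above can be sketched as a small Python function. This is an illustration under stated assumptions: the DSP CLKSTAT output is assumed to be already parsed into a dictionary keyed by the field names used in this section; the healthy values Normal, Fast Tracking, and Locked are the ones named in the troubleshooting procedure of this section; the mode string "Frequency" and the faulty sample values are hypothetical.

```python
from typing import Dict, List


def clock_status_problems(status: Dict[str, str], required_mode: str) -> List[str]:
    """List the symptoms found in a parsed DSP CLKSTAT result.

    An empty list means that none of the fault symptoms is present.
    """
    problems = []
    if status.get("Current Clock Source State") != "Normal":
        problems.append("reference clock state is not Normal")
    if status.get("PLL Status") not in ("Fast Tracking", "Locked"):
        problems.append("PLL is neither fast tracking nor locked")
    if status.get("Clock Synchronization Mode") != required_mode:
        problems.append("clock synchronization mode differs from the required mode")
    return problems


# Hypothetical parsed results; the exact strings on a live eNodeB may differ.
healthy = {
    "Current Clock Source State": "Normal",
    "PLL Status": "Locked",
    "Clock Synchronization Mode": "Frequency",
}
faulty = {
    "Current Clock Source State": "Lost",
    "PLL Status": "Free running",
    "Clock Synchronization Mode": "Frequency",
}

print(clock_status_problems(healthy, "Frequency"))
print(clock_status_problems(faulty, "Frequency"))
```

A missing field is treated as a symptom, which matches the first bullet (an unknown status counts as abnormal).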
Background Information
l
The following describes how to perform a clock quality test:
1.
Start a clock quality test by running the STR CLKTST command.
2.
Several hours later, stop the clock quality test by running the STP CLKTST
command.
3.
Check the clock quality test result by running the DSP CLKTST command.
Possible Causes
l
The clock mode is incorrectly set.
l
The clock source is incorrectly added.
l
The clock working mode is incorrectly set for the eNodeB.
l
The external reference clock is abnormal, for example, there is excessive frequency
deviation.
l
The clock source is incorrectly selected, which leads to a clock lock failure.
Troubleshooting Flowchart
None
Troubleshooting Procedure
1.
Check the clock configuration for the eNodeB.
a.
Check whether the clock synchronization mode is set to a specified mode.
Check whether the mode is set to the required one, for example, frequency
synchronization or phase synchronization. If the configuration is incorrect, change the
mode to the required one.
b.
Check whether the clock sources are correctly added.
Use different query commands for different clock sources. For details, see eNodeB
MML Command Reference.
c.
Check whether the work mode of the clock is correctly set.
If the eNodeB needs to lock an external clock source, set the clock working mode to
AUTO or MANUAL. The two settings differ as follows:
l AUTO indicates that the eNodeB automatically selects a reference clock based on
the status, priorities, and link available status of reference clocks.
l MANUAL indicates that the eNodeB is forced to select a user-defined reference
clock.
Set the clock working mode based on actual requirements.
2.
Check whether the external clock sources of the eNodeB work properly.
To check the status of an external clock source, run the DSP CLKSRC command. Pay
attention to the following two parameters:
l License Authorized
Generally, the value of this parameter indicates that the clock source can be used. If the
value indicates that the clock source cannot be used, enable the eNodeB synchronization
function.
To check whether the eNodeB synchronization function is enabled, run the DSP
LICENSE command. If the Allocated, Config, and Actual Used fields of the Enhanced
Synchronization control item are all 1, the function is enabled.
l Clock Source State
The link available status (Link Available State) of a reference clock can be checked
by running a command such as DSP IPCLKLINK, DSP SYNCETH, or DSP TOD.
The value of Clock Source State is Available when the external reference clock of the
eNodeB meets either of the following conditions:
– Non-IP clock
The physical connection between the reference clock and the eNodeB is normal, and
the eNodeB can properly receive clock signals sent by the reference clock.
– IP clock
The route from the eNodeB to the IP clock server is correct, and the eNodeB can
properly receive clock signals sent by the IP clock server.
If the clock source state or the link available state is unavailable, investigate the reason.
– Check whether the physical connection and communication are normal between the
eNodeB and the clock source. For the GPS, the number of satellites must be greater
than or equal to 4; the related command is DSP GPS.
– Check whether the eNodeB can properly receive clock signals. For a non-IP clock,
clock signals are generated at the physical layer, and therefore you can check only
on the equipment that sends the clock signals whether they are correctly sent. For
an IP clock, you can check whether clock packets are correctly received by
performing a trace task on the M2000 or by analyzing packet headers on the nearest
transmission equipment. The clock source state and link available state of an IP clock
can be determined based on the characteristics of received clock packets. For details
about the analysis, see the PTP clock packet analysis procedure in the next step.
3.
Check whether the eNodeB correctly selects a clock source.
When multiple external clock sources are added and work properly, the output of the DSP
CLKSRC command indicates that the status of these clock sources is Available. In
addition, the output of the corresponding link query command (DSP IPCLKLINK, DSP
SYNCETH, DSP GPS, or DSP TOD) indicates that the status of the clock link is also
Available. Note that only the link activation status (Link Active State) of the clock source
selected as the reference clock is Activated. The link activation status of other clock sources
is Unactivated.
The reference clock is explained as follows:
l If the clock working mode is set to MANUAL using the SET CLKMODE command,
the reference clock is the manually selected clock source.
l If the clock working mode is set to AUTO using the SET CLKMODE command, the
reference clock is the one automatically selected. The query command is DSP
CLKSTAT.
l If the link availability status of the selected clock source is Available but the link
activation status is Unactivated, the reference clock is the one manually selected after
the clock working mode is set to MANUAL using the SET CLKMODE command.
4.
Check whether the eNodeB correctly locks an external clock source.
To check the lock status, run the DSP CLKSTAT command. The following describes the
parameters in this command:
l Current Clock Source: It indicates the clock source to be traced by the eNodeB.
l Current Clock Source State: The value should be Normal.
l PLL Status: The initial status should be Fast Tracking, and then Locked.
l Clock Synchronization Mode: It indicates the configured clock synchronization mode.
– Non-IP clock
For a non-IP clock source, if the link available state is available and the link active
state is activated in step 3, the states queried by running DSP CLKSTAT must be
normal.
The only risk is that the eNodeB enters free-run mode (instead of locked mode) after
a period of fast tracking. The eNodeB adjusts the local oscillator during fast tracking,
but the difference between the local oscillator and external clock sources is still
above the locking threshold. Therefore, the eNodeB cannot lock an external clock
source and enters free-run mode.
In this case, perform a clock quality test to check the frequency deviation values,
and report them to Huawei technical support.
– IP clock
For an IP clock, even if the clock link is available and activated, it cannot be
guaranteed that all check items are normal. The query command is DSP
CLKSTAT. The reason is that whether the eNodeB can lock an external clock
source depends on two packets (Sync and Delay_Resp) as well as the clock
information the packets carry.
In this situation, take two actions: (1) Collect clock packets received by the eNodeB
on the M2000 or collect headers of the packets on the nearest transmission
equipment; (2) Perform a clock quality test on the IP clock in the same way as that
for a non-IP clock. Then, send the packets (or packet headers) and quality test result
to Huawei technical support.
5.
If the transmission synchronization fault persists, contact Huawei technical support.
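The difference between the AUTO and MANUAL clock working modes described in steps 1 and 3 can be illustrated with a short Python sketch. It is a simplified model, not the eNodeB algorithm: the `ClockSource` type, the rule that a smaller priority number wins, and the sample sources are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClockSource:
    name: str
    priority: int           # assumption: a smaller number means a higher priority
    source_available: bool  # Clock Source State is Available
    link_available: bool    # Link Available State is Available


def select_reference(
    sources: List[ClockSource], mode: str, manual_choice: Optional[str] = None
) -> Optional[ClockSource]:
    """Model which clock source becomes the reference clock."""
    if mode == "MANUAL":
        # The eNodeB is forced to use the user-defined source, whatever its state.
        return next((s for s in sources if s.name == manual_choice), None)
    if mode == "AUTO":
        # Only usable sources compete; the best priority wins.
        usable = [s for s in sources if s.source_available and s.link_available]
        return min(usable, key=lambda s: s.priority, default=None)
    raise ValueError(f"unsupported clock working mode: {mode}")


# Hypothetical configuration: GPS has the best priority but is unavailable.
sources = [
    ClockSource("GPS", 1, source_available=False, link_available=False),
    ClockSource("IPCLK", 2, source_available=True, link_available=True),
    ClockSource("SYNCETH", 3, source_available=True, link_available=True),
]

print(select_reference(sources, "AUTO").name)
print(select_reference(sources, "MANUAL", "GPS").name)
```

In this model, AUTO skips the unavailable GPS source and falls back to the next usable one, whereas MANUAL keeps the forced source even when it cannot be locked. This is one reason why a manually selected source can lead to a clock lock failure.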
Typical Cases
None
13 Troubleshooting Transmission Security Faults
About This Chapter
This chapter describes how to troubleshoot transmission security faults.
13.1 Definitions of Transmission Security Faults
A transmission security fault occurs when an IPSec tunnel between an eNodeB and a security
gateway (SeGW) malfunctions. This fault leads to abnormal communication between the
eNodeB and the EPC.
13.2 Background Information
This section describes the data that requires encryption in transmission security networking
scenarios. In addition, this section provides the parameters related to transmission security.
13.3 Troubleshooting Specific Transmission Security Faults
This section provides information required to troubleshoot specific transmission security faults.
The information includes fault descriptions, background information, possible causes, fault
handling method and procedure, and typical cases.
13.1 Definitions of Transmission Security Faults
A transmission security fault occurs when an IPSec tunnel between an eNodeB and a security
gateway (SeGW) malfunctions. This fault leads to abnormal communication between the
eNodeB and the EPC.
Transmission security faults include:
l
Internet key exchange (IKE) negotiation failure: An IKE security association (SA) fails to
be set up between the eNodeB and the SeGW.
l
IPSec tunnel setup failure: The IKE SA between the eNodeB and the SeGW is normal, but
the IPSec SA carried by the IKE SA fails to be set up.
l
Certificate application failure: A digital certificate fails to be obtained due to an IKE
negotiation failure.
13.2 Background Information
This section describes the data that requires encryption in transmission security networking
scenarios. In addition, this section provides the parameters related to transmission security.
l
Encapsulation between two eNodeBs: Data streams between two eNodeBs are encapsulated
in transport mode.
l
Encapsulation between an eNodeB and an SeGW: Data streams (except those between the
SeGW and the EPC) are encapsulated in tunnel mode.
l
Encapsulation between an eNodeB and the EPC: Data streams over the S1 interface are
encapsulated in transport mode.
Figure 13-1 Transmission security networking
In most cases, transmission security faults occur when security link negotiation between the
eNodeB and the SeGW fails. Parameters affecting the negotiation include IKE
parameters and IPSec parameters. IKE parameters include the ciphering algorithm, verification
algorithm, IKE version, identity authentication mode, and shared key. IPSec parameters include
the ciphering mode, ciphering algorithm, authentication algorithm, and authorization mode. For
details, see eRAN Transmission Security Feature Parameter Description.
13.3 Troubleshooting Specific Transmission Security Faults
This section provides information required to troubleshoot specific transmission security faults.
The information includes fault descriptions, background information, possible causes, fault
handling method and procedure, and typical cases.
Fault Description
When a transmission security fault occurs:
l
The eNodeB is out of control, and no operation commands can be delivered from the
M2000 to the eNodeB.
l
The eNodeB is under control, but transmission-related alarms are displayed on the Web
LMT.
l
Transmission detection commands such as ping cannot be successfully executed.
Background Information
l
Related Alarms
– ALM-26841 Certificate Invalid
– ALM-25891 IKE Negotiation Failure
– ALM-25880 Ethernet Link Fault
– ALM-26223 Transmission Optical Interface Performance Degraded
– ALM-26222 Transmission Optical Interface Error
– ALM-26220 Transmission Optical Module Fault
– ALM-25901 Remote Maintenance Link Failure
– ALM-25888 SCTP Link Fault
Possible Causes
Possible causes are:
l
Transmission security parameters are mismatched between the local and peer ends, which
leads to IPSec tunnel negotiation failures.
l
Security tunnel update fails due to certificate update failures or certificate expiry.
Troubleshooting Flowchart
Transmission security faults are generally due to data configuration. Therefore, data consistency
check between the eNodeB and the SeGW is crucial to troubleshooting.
Figure 13-2 Troubleshooting flowchart for transmission security faults
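The data consistency check between the eNodeB and the SeGW can be sketched in Python as follows. This is a minimal illustration that assumes the parameters of both ends have been collected by hand into dictionaries. The parameter names follow the IKE parameters listed in the background information; the values are hypothetical.

```python
from typing import Dict, List, Tuple


def find_mismatches(
    local: Dict[str, str], peer: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (parameter, local value, peer value) for every inconsistent item.

    A parameter that is present on only one end is reported as "<missing>"
    on the other end.
    """
    mismatches = []
    for name in sorted(set(local) | set(peer)):
        local_value = local.get(name, "<missing>")
        peer_value = peer.get(name, "<missing>")
        if local_value != peer_value:
            mismatches.append((name, local_value, peer_value))
    return mismatches


# Hypothetical values; take the real ones from the network plan and both devices.
enodeb_ike = {
    "encryption algorithm": "AES128",
    "authentication algorithm": "SHA1",
    "IKE version": "IKE_V2",
}
segw_ike = {
    "encryption algorithm": "AES128",
    "authentication algorithm": "MD5",
    "IKE version": "IKE_V2",
}

for name, local_value, peer_value in find_mismatches(enodeb_ike, segw_ike):
    print(f"{name}: eNodeB={local_value}, SeGW={peer_value}")
```

The same comparison applies to the IPSec parameters. Any reported mismatch is a candidate cause of the negotiation failure.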
Troubleshooting Procedure
1.
Check whether an IPSec policy group is bound to the port involved.
Run the LST IPSECBIND command. The output is as follows:
Figure 13-3 List binding relationships
If no binding relationship is found, bind an IPSec policy group to the port. Run the ADD
IPSECBIND command, and specify values for the mandatory parameters such as the slot
No., subboard type, port type, port No., and IPSec policy group name. To learn about the
IPSec policy group name, run the LST IPSECPOLICY command.
2.
Check whether the IKE proposal is correctly configured.
Run the DSP IKEPROPOSAL command for query. If the values in the red frame are
inconsistent with the network plan, run the MOD IKEPROPOSAL command to change
them.
Figure 13-4 List IKE negotiation results
3.
Check whether the IKE peer is correctly configured.
Run the DSP IKEPEER command for query. If the values in the red frame are inconsistent
with the network plan, run the MOD IKEPEER command to change them.
Figure 13-5 List IKE peer information
4.
Check whether the IKE proposal configuration on the eNodeB is the same as that on the
SeGW.
Run the LST IKEPROPOSAL command to check whether the IKE proposal with the ID
indicated in step 3 is consistent with that used by the SeGW. Pay special attention to the
encryption algorithm, authentication algorithm, IKE version, and key. If the authentication
is based on digital certificates, go to step 5. If the authentication is based on shared keys, go to
step 6.
5.
Check whether the eNodeB's certificate chain is correct.
Run the DSP TRUSTCERT command to check the operator's root certificate. Pay more
attention to the information in the red frame. Check whether the name of the root certificate
is correct and whether the root certificate has expired. If the root certificate is incorrect,
apply for a new one. Then, run the DLD CERTFILE command to download the root
certificate, and run the ADD TRUSTCERT command to add the root certificate to the
eNodeB.
Figure 13-6 List operator's root certificate information
Run the DSP CERTMK command to check the operator's device certificate. Pay more
attention to the information in the red frame. Check whether the issuer of the device certificate
is correct and whether the device certificate has expired. If the device certificate is incorrect,
apply for a new one. Then, run the DLD CERTFILE command to download the device
certificate, and run the ADD CERTMK command to add the device certificate to the
eNodeB.
Figure 13-7 List operator's device certificate information
Run the DSP APPCERT command to check whether the certificates used for IKE and SSL
are correct. Pay more attention to the information in the red frame. If a used certificate is
incorrect, run the MOD APPCERT command to change it.
Figure 13-8 List certificates used for IKE and SSL
6.
Check whether the IPSec proposal is correctly configured.
Run the DSP IPSECPROPOSAL command for query. If the values in the red frame are
inconsistent with the network plan, run the MOD IPSECPROPOSAL command to change
them.
Figure 13-9 List IPSec proposal information
7.
Check whether the IPSec policy is correctly configured.
Run the DSP IPSECPOLICY command for query. If the values in the red frame are
inconsistent with the network plan, run the MOD IPSECPOLICY command to change
them.
Figure 13-10 List IPSec policy information
8.
Check whether the ACL rule is correctly configured.
Run the LST ACLRULE command for query. The following figure provides an example.
If the values in the red frame are inconsistent with the network plan, run the MOD
ACLRULE command to change them.
Figure 13-11 List ACL rule information
9.
If the transmission security fault persists, contact Huawei technical support.
Before contacting Huawei technical support, collect configuration files, certificate files
(including the root certificate, intermediate certificate, and device certificate files), and board
logs.
If possible, collect header information transmitted between the eNodeB and the SeGW
during negotiation.
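Step 5 asks whether the root certificate and the device certificate have expired. The comparison itself can be sketched in Python as follows. This is an illustration only: it assumes that the end of the validity period has been read from the certificate query output and converted to a timezone-aware datetime, and the function name and sample dates are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def certificate_expired(not_after: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if the certificate validity period has ended.

    Both datetimes must be timezone-aware so that the comparison is unambiguous.
    """
    now = now or datetime.now(timezone.utc)
    return now > not_after


# Hypothetical validity end dates, checked against a fixed reference time.
check_time = datetime(2012, 7, 30, tzinfo=timezone.utc)
root_not_after = datetime(2020, 1, 1, tzinfo=timezone.utc)
device_not_after = datetime(2012, 1, 1, tzinfo=timezone.utc)

print(certificate_expired(root_not_after, check_time))
print(certificate_expired(device_not_after, check_time))
```

When the check runs on a PC rather than on the eNodeB, the eNodeB system time is also worth verifying, because the eNodeB validates certificates against its own clock.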
Typical Cases
The following describes how to troubleshoot an IKE negotiation failure.
Fault Description
An IPSec policy group was bound to a port, but an IPSec tunnel failed to be set up between the
eNodeB and the SeGW.
Fault Diagnosis
1.
OM personnel checked whether the IPSec-related parameters were correctly configured.
The output of the DSP IKESA command indicated that the IKE SA status in phase 1 was
Ready or Ready|StayAlive, but the status in phase 2 was None. IPSec-related parameter
settings were checked and were found to be the same as those on the SeGW.
2.
OM personnel checked header information.
There were four IKE_AUTH exchanges between the eNodeB and the SeGW. After that,
the SeGW did not respond to the IKE_AUTH message from the eNodeB. When an eNodeB
has not received any responses from an SeGW for a long time, the eNodeB will continue
to send six IKE_AUTH messages before starting the next round of authentication
negotiation.
3.
OM personnel checked the IKE_AUTH messages sent from the SeGW to the eNodeB.
The notification payload in the messages was NO_PROPOSAL_CHOSEN. This indicated
that the SeGW failed to obtain the required IPSec proposal and therefore this round of IKE
authentication negotiation failed. The SeGW sent these messages to notify the eNodeB of
this failure.
NOTE
The eNodeB considered the encrypted notification messages invalid and therefore discarded these
messages.
Fault Handling
This fault was due to the configuration on the peer equipment. After the message transmission
rule on the peer equipment was modified, the fault was rectified.
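The first diagnosis step in this case maps the DSP IKESA result to the fault definitions in section 13.1. A minimal Python sketch of that mapping follows. The phase 1 values Ready and Ready|StayAlive and the phase 2 value None come from this case. Treating a phase 2 status that starts with Ready as healthy is an assumption made for illustration.

```python
def classify_sa_state(phase1: str, phase2: str) -> str:
    """Map the phase 1 and phase 2 SA statuses to a fault category."""
    # A status such as "Ready|StayAlive" is treated as Ready plus an extra flag.
    phase1_ready = phase1.split("|")[0] == "Ready"
    phase2_ready = phase2.split("|")[0] == "Ready"  # assumption for phase 2
    if not phase1_ready:
        return "IKE negotiation failure"
    if not phase2_ready:
        return "IPSec tunnel setup failure"
    return "no SA fault detected"


print(classify_sa_state("Ready|StayAlive", "None"))
print(classify_sa_state("None", "None"))
```

For the statuses observed in this case, the sketch reports an IPSec tunnel setup failure: the IKE SA was normal, but the IPSec SA carried by it was not set up.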
14 Troubleshooting RF Unit Faults
About This Chapter
This chapter describes the method and procedure for troubleshooting radio frequency (RF) unit
faults in the Long Term Evolution (LTE) system.
14.1 Definitions of RF Unit Faults
If a radio frequency (RF) unit is faulty, its sensitivity decreases, leading to deterioration of the
cell demodulation performance and reduction of the uplink coverage, or even service interruption
in the cell.
14.2 Background Information
This section defines the concepts related to RF unit fault troubleshooting. The concepts are
voltage standing wave ratio (VSWR) tests, passive intermodulation (PIM) interference, external
interference, and remote electrical tilt (RET) antennas.
14.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
14.4 Troubleshooting VSWR Faults
This section provides information required to troubleshoot VSWR faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
14.5 Troubleshooting RTWP Faults
This section provides information required to troubleshoot RTWP faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
14.6 Troubleshooting ALD Link Faults
This section provides information required to troubleshoot ALD link faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
14.1 Definitions of RF Unit Faults
If a radio frequency (RF) unit is faulty, its sensitivity decreases, leading to deterioration of the
cell demodulation performance and reduction of the uplink coverage, or even service interruption
in the cell.
Generally, RF unit faults are indicated by alarms. Therefore, this chapter describes how to
troubleshoot RF unit faults based on reported alarms.
14.2 Background Information
This section defines the concepts related to RF unit fault troubleshooting. The concepts are
voltage standing wave ratio (VSWR) tests, passive intermodulation (PIM) interference, external
interference, and remote electrical tilt (RET) antennas.
VSWR Test
During a VSWR test on a radio frequency (RF) unit, power of the RF unit is first coupled as
forward power and backward power by using directional couplers, and then they are measured
by using standing-wave detectors. The difference between the measured forward power and
backward power is the return loss, which can be converted to a VSWR value by using related
formulas. The VSWR value is used to determine whether a VSWR alarm is reported.
Figure 14-1 Principle of a VSWR test
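The conversion from return loss to VSWR mentioned above can be sketched in Python. The sketch uses the standard transmission-line relations: the reflection coefficient magnitude is 10^(-RL/20), and VSWR is (1 + coefficient)/(1 - coefficient). The function name and the sample power levels are illustrative, and the thresholds at which the RF unit reports a VSWR alarm are not covered here.

```python
def vswr_from_powers(forward_dbm: float, backward_dbm: float) -> float:
    """Convert measured forward and backward power (in dBm) to a VSWR value."""
    # Return loss in dB is the difference between forward and backward power.
    return_loss_db = forward_dbm - backward_dbm
    if return_loss_db <= 0:
        raise ValueError("backward power must be lower than forward power")
    # Magnitude of the voltage reflection coefficient.
    gamma = 10 ** (-return_loss_db / 20)
    return (1 + gamma) / (1 - gamma)


# Illustrative measurement: 43 dBm forward, 29 dBm backward (14 dB return loss).
vswr = vswr_from_powers(43.0, 29.0)
print(f"VSWR = {vswr:.2f}")  # prints "VSWR = 1.50"
```

A smaller return loss means that more power is reflected and therefore yields a larger VSWR value. As the return loss approaches 0 dB (total reflection), the VSWR value grows without bound.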
The VSWR test result indicates the connection condition between the RF unit and the antenna
system. A large VSWR value indicates that the antenna system is improperly connected to
the RF unit. In this case, the output power of the RF unit is not transmitted through the antenna but is
reflected back. High reflected power damages the RF unit, and total reflection may break down the
unit. To avoid such damage, the VSWR alarm post-processing switch must be turned on
when a remote radio unit (RRU) is added. In this way, if a major VSWR alarm is generated,
the RRU automatically shuts down the faulty transmit (TX) channels and stops providing
output power. In this scenario, the capacity of the cell served by the RRU decreases or the cell
becomes unavailable. The cell coverage and performance also deteriorate.
NOTE
If a major VSWR alarm is generated, the faulty TX channels are automatically shut down. If you have
rectified the related faults, you can run the STR VSWRTEST command or manually modify the TX
channel configuration to open the TX channels. However, the VSWR alarm still exists. It will be cleared
only after the RRU is reset.
PIM Interference
PIM interference is induced by non-linearity of the passive components in the TX system. The
antenna non-linearity is indicated by the intermodulation (IM) suppression degree. For a linear
system, if the input is two signals, the output is also two signals without any additional frequency
component. For a non-linear system, if the input is two signals, new frequency components are
generated in the system and added to the output, and then the output is more than two signals.
The added frequency components are known as the IM products. The process of generating
frequency components is called IM. If the IM products work on frequencies within the receive
(RX) frequency band and accordingly increase the uplink interference or received total wideband
power (RTWP), IM interference is generated. In a high-power and multi-channel system,
non-linearity of the passive components generates high-order IM products. These IM products and
the operating frequency are mixed to form a group of new frequencies, and accordingly a group
of useless spectra is generated and affects the normal communication.
In a non-linear system, assume that the two input signals work on the frequencies of f1 and f2. Then,
IM components are generated, such as two IM3 components operating on the frequencies of (2
x f1 - f2) and (2 x f2 - f1), and two IM5 components operating on the frequencies of (3 x f1 - 2
x f2) and (3 x f2 - 2 x f1). As shown in the following figure, the input signals and IM components
are marked in green and red, respectively. The IM order of an IM component (m x f2 - n x f1)
is the sum of m and n. These IM components are generated symmetrically on the left and right
of the wanted signals. Their intervals depend on the IM orders and the maximum frequency
spacing (or bandwidth) of the input signals. A higher IM order leads to a lower amplitude for
the IM components and a further distance from the wanted signals, and therefore a smaller
impact.
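The frequencies of the IM products described above can be computed with a short sketch. The function names are hypothetical, and the carrier frequencies and RX band in the example are illustrative assumptions (an 800 MHz band plan with an uplink range of 832 MHz to 862 MHz), not values taken from this guide.

```python
def im_products(f1: int, f2: int, max_order: int = 5) -> dict[int, tuple[int, int]]:
    """Return the odd-order two-tone IM frequencies up to max_order.

    For order m + n with m - n = 1, the products are
    (m x f1 - n x f2) and (m x f2 - n x f1).
    """
    products = {}
    for order in range(3, max_order + 1, 2):
        m = (order + 1) // 2
        n = m - 1
        products[order] = (m * f1 - n * f2, m * f2 - n * f1)
    return products


def falls_in_band(freq: int, band: tuple[int, int]) -> bool:
    """Check whether a frequency lies within a (low, high) band."""
    low, high = band
    return low <= freq <= high


# Illustrative assumption: two TX tones at 796 MHz and 816 MHz,
# RX (uplink) band from 832 MHz to 862 MHz.
tones = (796, 816)
rx_band = (832, 862)
products = im_products(*tones)
in_rx_band = {
    order: [f for f in freqs if falls_in_band(f, rx_band)]
    for order, freqs in products.items()
}
```

In this example, one IM3 product (836 MHz) and one IM5 product (856 MHz) fall within the assumed RX band, which is the condition under which IM interference raises the RTWP.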
The following figure shows an example of a PIM result.
Figure 14-2 Example of a PIM result
All passive components encounter intermodulation distortion that may be caused by unreliable
mechanical contacts, poor soldering, or oxidization.
Passive components such as combiners, duplexers, and filters require specific IM suppression
degrees. If the IM suppression degree of an IM order meets the requirements, the IM products
have no impact on the system performance. Generally, cables do not have requirements for PIM
suppression degrees. A cable requiring high PIM suppression degrees can reduce PIM
interference, but it is too expensive to be used widely.
Note that PIM interference does not necessarily result from an improper connection. Even if
an RF unit is properly connected to the antenna system, high-power IM components may still be
generated due to insufficient PIM suppression degrees of the cables.
If the IM components work on frequencies within the RX frequency band, they increase the
noise floor of the RX channels and decrease the sensitivity of the RF unit. For a frequency
division duplex (FDD) system, frequency bands such as 800 MHz and 700 MHz have small
duplex spacing (spacing between the DL frequency and the UL frequency), and the IM3
and IM5 products of the TX signals work on frequencies within the RX frequency band. In this
scenario, special attention must be paid to the impact of PIM interference.
To sum up, the generating conditions for PIM interference are as follows:
The input is TX signals of the eNodeB, or occasionally external interference signals transmitted
through the antenna. The medium is cables or passive components such as duplexers and antennas.
The output is IM products. The power of the IM components depends on the IM suppression
degree of the passive components or cables.
PIM interference has the following typical characteristics:
l
The RTWP rises as the TX power increases.
Add downlink simulated load to increase the TX power. If the RTWP rises noticeably,
PIM interference exists.
l
The RTWP is sensitive to the positions of cables and connectors.
Observe the RTWP while shaking the cable near a connector or hitting a connector. If the
RTWP changes greatly, PIM interference exists.
l
The impact of PIM interference increases with the bandwidth.
The impact of PIM interference must be taken into account for frequency bands with the
duplex spacing within 30 MHz.
l
The generating mechanism of PIM interference is complicated.
Generally, PIM interference exists when multiple frequency components are generated.
However, in a non-linear system, a single amplitude-modulated signal may generate
frequency components, and this leads to spectrum expansion. These frequency components
are also IM products. Moreover, in a scenario with improper connections, even continuous
wave (CW) signals generate frequency components.
External Interference
Electromagnetic waves are propagated through space in certain directions in the electric field.
Based on the directions (also known as polarization), the electromagnetic waves are classified
into linear polarized waves and circular polarized waves. Antennas with different polarization
obtain different gains from linear polarized waves.
eNodeBs use orthogonal 45° dual-polarized antennas. Therefore, linear polarized waves
received by these antennas have main and diversity gain differences.
Interference signals can also be classified based on the polarization:
l
Linear polarized interference signals
Interference signals are propagated through various transmission media, and are frequently
reflected and refracted in some places such as urban areas. As a result, the linear polarized
interference signals continuously change their propagation directions and also change their
polarization in the electric field.
When they arrive at the antennas of the eNodeB, their polarization differs little from
each other, and two antenna ports of each sector receive interference signals with similar
power.
l
Circular polarized interference signals
Circular polarized interference signals are propagated without directions. Therefore, when
they arrive at the dual-polarized antennas of the eNodeB, two antenna ports of each sector
can receive interference signals with similar power.
In some cases, external interference may also lead to RTWP imbalance alarms.
For example, linear polarized radio signals from a radar or navigation satellite high up in the air
are propagated without multiple reflections. When the eNodeB receives such interference
signals, the orthogonal dual-polarized antennas can obtain various gains based on the angle
between the interference signals and the antenna polarization. If the interference signals exist
for a long time, an RTWP imbalance alarm can be generated.
To determine whether external interference exists, perform the following steps:
1.
Check whether PIM interference exists.
Shut down downlink channels and then check whether the RTWP is excessively high.
Yes: Go to 2.
No: There is no PIM interference.
2.
Check whether external interference exists. Perform the following steps:
Disconnect an RRU or RFU from the jumper, and then connect the RRU or RFU to a
matched load or direct open-circuit to check whether the RTWP falls within the normal
range.
If the RTWP is normal, external interference exists.
Stable external interference has the following typical characteristics:
l
Two interference signals received by a receiver are correlated but with different power.
They have the same impact on the RTWP.
l
External interference occupies a certain bandwidth. Monophony interference does not carry
any useful information; however, it seldom exists.
l
External interference is received only by antennas, which simplifies the troubleshooting
procedure.
Remote Electrical Tilt Antenna
A remote electrical tilt (RET) antenna can be remotely controlled because it is equipped with a
drive called the remote control unit (RCU). The RCU is installed close to the RET antenna.
Each RCU consists of a driving motor, control circuit, and drive structure. The driving motor is
usually a digitally controlled step motor. The control circuit communicates with the controller
and controls the driving motor. The drive structure contains a gear that meshes with a pulling
bar. Under the control of the driving motor, the gear moves to transmit motion to the pulling
bar, and accordingly the tilt angle of the antenna can be adjusted.
The following figure shows the structure and working principles of an RET antenna equipped
with an RCU.
Figure 14-3 Structure and working principles of an RET antenna equipped with an RCU
14.3 Troubleshooting Method
This section describes how to identify and troubleshoot the possible cause.
Troubleshooting Flowchart
Figure 14-4 Troubleshooting flowchart for RF unit faults
Troubleshooting Procedure
1.
Check whether there is any alarm related to voltage standing wave ratio (VSWR) faults in
the active alarms on the eNodeB or there is any abnormal VSWR test result. If yes,
troubleshoot the VSWR faults. If no, go to 2.
2.
Check whether there is any alarm related to RTWP faults in the active alarms on the
eNodeB. If yes, troubleshoot the RTWP faults. If no, go to 3.
3.
Check whether there is any alarm related to ALD link faults in the active alarms on the
eNodeB or there are any abnormal ALD links. If yes, troubleshoot the ALD link faults. If
no, go to 4.
4.
If the fault persists, contact Huawei technical support.
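The order of checks in the preceding procedure can be sketched as a small triage function. This is an illustration only: the function and constant names are hypothetical, and the alarm IDs are those listed in sections 14.4 to 14.6 of this guide. How the active alarm IDs are collected (for example, from an alarm query) is outside the sketch.

```python
# Alarm IDs as listed in sections 14.4, 14.5, and 14.6 of this guide.
VSWR_ALARMS = {26529}
RTWP_ALARMS = {26522, 26521}
ALD_ALARMS = {26530, 26541, 26751, 26754, 26757, 26752, 26753, 26531}


def next_action(active_alarm_ids, vswr_test_abnormal=False, ald_link_abnormal=False):
    """Return the first matching troubleshooting section, checked in procedure order."""
    ids = set(active_alarm_ids)
    if ids & VSWR_ALARMS or vswr_test_abnormal:
        return "14.4 Troubleshooting VSWR Faults"
    if ids & RTWP_ALARMS:
        return "14.5 Troubleshooting RTWP Faults"
    if ids & ALD_ALARMS or ald_link_abnormal:
        return "14.6 Troubleshooting ALD Link Faults"
    return "Contact Huawei technical support"
```

Because the checks run in a fixed order, a site that reports both a VSWR alarm and an RTWP alarm is directed to the VSWR procedure first, which matches the step sequence above.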
14.4 Troubleshooting VSWR Faults
This section provides information required to troubleshoot VSWR faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
Fault Description
An alarm ALM-26529 RF Unit VSWR Threshold Crossed is reported if there are VSWR faults
in the radio frequency (RF) channels of an RF unit.
Related Information
None
Possible Causes
l
The VSWR alarm threshold is set to a low value.
l
Hardware installation is improper. For example, a jumper is improperly connected; a feeder
connector is insecurely installed or is immersed in water; the feeder connected to an antenna
port is bent, deformed, or damaged; a feeder is insecurely connected.
l
The frequency band supported by the RF unit is inconsistent with that supported by the
components of the antenna system.
l
A VSWR-related circuit fault occurs in the RF unit, or other hardware faults occur in the
RF unit.
Fault Handling Flowchart
None
Fault Handling Procedure
1.
Check the detected VSWR value when the alarm is reported.
If the VSWR value is greater than 10, it means that all output power is reflected back
because no feeder is connected to the related antenna port or the related feeder is bent or
damaged.
2.
Check the VSWR alarm threshold of the RF unit.
Run the LST RRU command to query the VSWR alarm threshold of the RF unit. Then,
check whether the threshold is properly set according to the network plan. If the threshold
is improper, change it by running the MOD RRU command.
3.
Check the current VSWR value.
a.
Run the DSP VSWR command to query the current VSWR value.
b.
Run the STR VSWRTEST command to query the offline VSWR value.
NOTE
The execution of the STR VSWRTEST command interrupts services carried by the RF unit.
TIP
It is recommended that multiple frequencies within the operating frequency range supported
by the cell be used as the test frequencies.
Figure 14-5 Command for starting a VSWR test
4.
Compare the VSWR values queried by running the STR VSWRTEST and DSP VSWR
commands.
If the two values are the same and are greater than the threshold for reporting VSWR alarms,
onsite investigation is required. Go to 5.
If the two values are significantly different, run the STR VSWRTEST command to
perform VSWR tests at frequency points spaced 1 MHz or less apart within the
bandwidth range, and compare the tested VSWR values.
l If the values are the same, the feeder between the RF unit and the antenna system may
be insecurely connected and accordingly the queried VSWR values are not stable. In
this case, check the feeder connection at the local end. Then, go to step 4.
l If some of the values are large, a hardware fault may occur in the RF unit. Save the test
results and submit the results together with one-click log files of the main control board
and RF unit to Huawei technical support for further analysis.
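The frequency sweep and the comparison of its results can be sketched as follows. The function names are hypothetical, and the 0.1 tolerance used to decide whether VSWR values count as "the same" is an assumption made for illustration; this guide does not specify a tolerance. The VSWR values themselves would come from repeated STR VSWRTEST runs, which the sketch does not perform.

```python
def sweep_points(start_mhz: float, stop_mhz: float, step_mhz: float = 1.0) -> list[float]:
    """List test frequencies from start to stop at a spacing of 1 MHz or less."""
    if step_mhz <= 0 or step_mhz > 1.0:
        raise ValueError("step must be greater than 0 and at most 1 MHz")
    count = int((stop_mhz - start_mhz) / step_mhz + 1e-9) + 1
    return [round(start_mhz + i * step_mhz, 3) for i in range(count)]


def classify_sweep(vswr_values: list[float], tolerance: float = 0.1) -> str:
    """Classify swept VSWR values according to the two cases in this step.

    tolerance is an assumed margin for treating values as the same.
    """
    spread = max(vswr_values) - min(vswr_values)
    if spread <= tolerance:
        return "check the feeder connection at the local end"
    return "possible RF unit hardware fault; save results and collect logs"
```

For example, sweep_points(2620, 2625) yields six test frequencies from 2620 MHz to 2625 MHz; the band edges here are arbitrary sample values.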
5.
Check the feeder connection at the local end.
Check whether the frequency band supported by the RF unit is consistent with that
supported by the components of the antenna system according to the network plan. The
antenna system consists of antennas, feeders, jumpers, combiner-dividers, filters, and
tower-mounted amplifiers (TMAs).
It is recommended that a Sitemaster be used to measure the distance between the point with
a large VSWR value and the test point during a VSWR test.
If no Sitemaster is available, locate the fault by using isolation methods. Add load to
different parts of the feeder at the local end. Then, run the STR VSWRTEST command to
start a VSWR test on each isolated part of the feeder to locate the fault.
6.
If the feeder connection is normal, contact Huawei technical support.
Typical Cases
None
14.5 Troubleshooting RTWP Faults
This section provides information required to troubleshoot RTWP faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
Fault Description
An RTWP-related alarm is reported if there are received total wideband power (RTWP) faults
in the radio frequency (RF) channels of an RF unit.
Related Information
Related alarms are as follows:
l
ALM-26522 RF Unit RX Channel RTWP/RSSI Unbalanced
l
ALM-26521 RF Unit RX Channel RTWP/RSSI Too Low
Possible Causes
l
The setting of attenuation on the RX channel of the RF unit is incorrect.
l
The feeder connected to the RF unit is faulty.
l
Passive intermodulation (PIM) exists.
l
External interference exists.
l
The feeder is improperly connected with the antenna.
l
The hardware in an RF module is faulty.
l
Faults may be caused by other uncertain factors.
Fault Handling Flowchart
Figure 14-6 Fault handling flowchart for RTWP faults
Fault Handling Procedure
1.
Rectify the faults and modify the improper settings.
a.
Run the LST ALMAF command to check whether alarms related to ALD or TDM
are reported. If such an alarm is reported, clear the alarm by referring to 14.6
Troubleshooting ALD Link Faults.
b.
Run the LST RXBRANCH command to check whether attenuation of the RX
channel of the RRU is configured as planned.
If it is not configured as planned, run the MOD RXBRANCH command to modify
the configuration. If it is configured as planned, go to 1.c.
The result is similar to the following:
List RxBranch Configure Information
----------------------------------
Cabinet No. = 0
Subrack No. = 62
Slot No. = 0
RX Channel No. = 0
Logical Switch of RX Channel = ON
Attenuation(0.5dB) = 0
(Number of results = 1)
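When many RX channels need to be compared against the plan, the report text can be checked with a short script. The following is a minimal sketch with hypothetical function names. It assumes that the report consists of "name = value" lines as in the sample above, and that the Attenuation(0.5dB) field counts in steps of 0.5 dB; both points are assumptions based on the sample output, not a documented format.

```python
def parse_report_fields(report: str) -> dict[str, str]:
    """Collect 'name = value' lines, skipping the '(Number of results = N)' trailer."""
    fields = {}
    for line in report.splitlines():
        stripped = line.strip()
        if "=" not in stripped or stripped.startswith("("):
            continue
        key, _, value = stripped.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def rx_attenuation_db(fields: dict[str, str]) -> float:
    """Convert the attenuation field to dB, assuming 0.5 dB per unit."""
    return int(fields["Attenuation(0.5dB)"]) * 0.5


def matches_plan(fields: dict[str, str], planned_db: float) -> bool:
    """Compare the configured attenuation with the planned value in dB."""
    return rx_attenuation_db(fields) == planned_db


sample = """List RxBranch Configure Information
----------------------------------
Cabinet No. = 0
Subrack No. = 62
Slot No. = 0
RX Channel No. = 0
Logical Switch of RX Channel = ON
Attenuation(0.5dB) = 0
(Number of results = 1)"""
sample_fields = parse_report_fields(sample)
```

If matches_plan returns False for a channel, the attenuation is not configured as planned and needs to be modified as described in this step.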
c.
Check whether the ALM-26522 RF Unit RX Channel RTWP/RSSI Unbalanced
or ALM-26521 RF Unit RX Channel RTWP/RSSI Too Low alarm is reported.
If either of the alarms is reported, clear the alarm by referring to ALM-26522 RF Unit
RX Channel RTWP/RSSI Unbalanced or ALM-26521 RF Unit RX Channel RTWP/
RSSI Too Low. If the ALM-26522 RF Unit RX Channel RTWP/RSSI
Unbalanced alarm cannot be cleared by referring to ALM-26522 RF Unit RX Channel
RTWP/RSSI Unbalanced, perform 2 to 6.
2.
Check whether PIM interference exists.
PIM has a typical characteristic: The level of the intermodulation products increases with
the transmit power. Using this typical characteristic, the existence of PIM interference can
be determined. If the uplink interference increases significantly with the transmit power,
PIM interference exists. Otherwise, PIM interference does not exist. You can increase the
transmit power by adding a downlink simulated load, and then compare the received signal
strength indicator (RSSI) values before and after the simulated load is added.
The procedure is as follows:
a.
Run the ADD CELLSIMULOAD command to add a simulated load. For example,
ADD CELLSIMULOAD: LocalCellId=x, SimLoadCfgIndex=9;
The simulated load and transmit power have a positive correlation with the value of
the SimLoadCfgIndex parameter.
NOTE
Note that load simulation is mainly used in interference tests. You are advised not to use load
simulation for a cell with more than six active UEs. Otherwise, the scheduling performance cannot
be ensured.
b.
Start RSSI tracing.
From the main menu on the M2000 client, choose Monitor > Signaling Trace >
Signaling Trace Management. In the left navigation tree, choose LTE > Cell
Performance Monitoring > Interference RSSI Statistic Detect Monitoring. Then,
click New in the right pane. An RSSI tracing task is created. Figure 14-7 shows an
example of RSSI tracing results.
If the values on one RSSI curve are significantly greater than the values on other RSSI
curves, PIM interference exists. If values on all RSSI curves are basically the same,
there is no PIM interference. Go to 3.
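The comparison of RSSI values before and after the simulated load is added can be sketched as follows. The function names are hypothetical, and the 3 dB rise used as the decision threshold is an assumption for illustration; this guide only states that the increase must be significant. For simplicity, the sketch averages the samples directly in dBm.

```python
from statistics import mean


def rssi_rise_db(before_dbm: list[float], after_dbm: list[float]) -> float:
    """Difference between the average RSSI after and before adding the simulated load."""
    return mean(after_dbm) - mean(before_dbm)


def pim_suspected(before_dbm: list[float], after_dbm: list[float],
                  rise_threshold_db: float = 3.0) -> bool:
    """Flag PIM when the RSSI rises by at least the assumed threshold."""
    return rssi_rise_db(before_dbm, after_dbm) >= rise_threshold_db


# Sample values invented for illustration only.
baseline = [-118.0, -117.0, -119.0]
loaded = [-105.0, -106.0, -104.0]
```

With these sample values the average RSSI rises by 13 dB after the load is added, so PIM interference would be suspected on that RX channel.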
Figure 14-7 RSSI tracing result
If PIM interference exists according to the preceding investigation, use either of the
following methods to determine the location or device where PIM is introduced:
l Add a simulated load and shake the cable segment by segment from the RF unit top
to the antenna port. If RSSI values change dramatically when a segment is shaken, PIM
interference is introduced by this segment.
l Breakpoint-based PIM detection:
Divide the cable connecting the RF unit top to the antenna port into several segments
by using breakpoints. Disconnect the cable at the breakpoints one by one along the
direction from the RF unit top to the antenna port. Each time the cable is disconnected
at a breakpoint, connect the breakpoint to a matched load or a low-intermodulation
attenuator, add a downlink simulated load, and check whether the RTWP values
increase. Ensure that the inserted attenuator has low intermodulation interference so
that it does not add PIM interference to the cable. If the RTWP values increase, PIM
interference is introduced by the device or cable before this breakpoint.
For example, set four breakpoints from the RF unit top to the antenna port, as shown in
Figure 14-8. First, disconnect the cable at breakpoint 1, connect breakpoint 1 to a
low-intermodulation attenuator, and add a downlink simulated load. If RTWP values
do not change, PIM interference is not caused by the RF unit. If RTWP values increase,
PIM interference is caused by the RF unit. Perform similar steps at the other
breakpoints.
Figure 14-8 Schematic diagram for breakpoint-based PIM detection
If the interference is caused by the RF unit, replace the RF unit. If the interference is caused
by the cable, replace the cable and then check whether the interference still exists. If the
interference is removed, no further action is required.
If the interference persists, check whether the interference exists in the antenna.
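The breakpoint-based detection described in this step amounts to a linear search along the feeder. The following sketch shows that logic only. The function name is hypothetical, and rtwp_rises is a placeholder for the manual check performed at each breakpoint (terminate the breakpoint, add a downlink simulated load, and observe whether the RTWP increases).

```python
from typing import Callable, Optional, Sequence, Tuple


def locate_pim_segment(
    breakpoints: Sequence[str],
    rtwp_rises: Callable[[str], bool],
) -> Optional[Tuple[str, str]]:
    """Return the (start, end) of the first section where PIM is introduced.

    Breakpoints are ordered from the RF unit top toward the antenna port.
    A rise at the first breakpoint points to the RF unit itself.
    None means no breakpoint showed a rise, so the antenna remains to be checked.
    """
    previous = "RF unit"
    for breakpoint in breakpoints:
        if rtwp_rises(breakpoint):
            return (previous, breakpoint)
        previous = breakpoint
    return None


# Observations invented for illustration: the RTWP first rises at breakpoint 3.
observations = {"breakpoint 1": False, "breakpoint 2": False,
                "breakpoint 3": True, "breakpoint 4": True}
suspect = locate_pim_segment(list(observations), observations.__getitem__)
```

In this example the result is the section between breakpoint 2 and breakpoint 3, because the two breakpoints nearer to the RF unit showed no RTWP increase.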
3.
Perform Broadband on-line frequency scan to check whether external interference exists.
Observe the scan result until the ALM-26239 RX Channel RTWP/RSSI Unbalanced
Between RF Units alarm is reported. Then, send the local tracing results, running logs of
RF units, and investigation results to Huawei technical support for fault diagnosis.
For the procedure for performing Broadband on-line frequency scan, see Monitoring
eNodeB Performance in Real Time > Spectrum Detection in eNodeB LMT User
Guide.
4.
Check whether a crossed pair connection exists.
Description
RF channels in an RF unit must be used by the same sector except in MIMO mutual-aid
scenarios. The purpose is to ensure the consistency between the direction and coverage of
an antenna so that balanced RTWP values are obtained. If the RF channels of an RF unit
are used by different sectors, the RF unit will have different RTWP values. Note that the
ALM-26522 RF Unit RX Channel RTWP/RSSI Unbalanced alarm is reported only
when the number of UEs is significantly different between two cells with a crossed pair
connection.
The ALM-26522 RF Unit RX Channel RTWP/RSSI Unbalanced alarm caused by a
crossed pair connection has the following characteristics:
l The alarm is reported in at least two sectors under the same eNodeB.
l RTWP variations of different RF channels are uncorrelated.
l RTWP variations are similar in different sectors.
Troubleshooting Method
The cells with a crossed pair connection can be determined by using either of the following
two methods:
l Perform drive tests and trace signaling without interrupting the services.
Make a phone call in a cell (for example, cell 1). Check whether the UE accesses cell
1, where the UE is located. If the UE accesses another cell (for example, cell 3), the
antennas of cells 1 and 3 are cross-connected.
l Run the STR CROSFEEDTST command to start a crossed pair connection test.
If the antenna system is not equipped with an external filter, the Start Test
Frequency and End Test Frequency parameters do not need to be specified. The test
will be performed in the test frequency band supported by the RF unit. If the antenna
system is equipped with an external filter, specify the Start Test Frequency and End
Test Frequency parameters to the start frequency and end frequency, respectively, for
the external filter.
NOTE
Note the following before starting a crossed pair connection test:
l This test is an offline test and the execution of this command interrupts services. If this command
is executed on a multi-mode RF unit or an RF unit connected to an antenna shared by the local
and peer modes, the services of the peer mode carried by this RF unit are also interrupted.
l This test cannot be performed simultaneously with the VSWR test or distance to fault (DTF) test.
l This command applies to the scenario where RF modules are in 2T2R mode. If this command is
executed in other scenarios, the result may be incorrect.
l The VSWR test has a great impact on the precision of this test because the VSWR will cause a
gain loss. You are advised to perform a high-precision VSWR test before running this command.
If the VSWR is greater than 2.5, you are not advised to run this command.
l This test cannot be applied to 1T2R RF units if RRU combination is not used. Otherwise, the result
may be incorrect.
l This command does not apply to multi-RRU cells, distributed cells, or cells under the eNodeB
with an omnidirectional antenna.
l This command does not apply to the scenario where all antennas of one sector are connected to
another sector.
l Do not start this test if the number of sectors that work in the same frequency band and support
the test is less than two.
l If the bandwidth between the start frequency and end frequency of the external filter is less than
10 MHz, the execution output is not reliable.
The Crossed value of RESULT appears in pairs. If RESULT is Crossed for two
sectors, a crossed pair connection exists between the two sectors. Detailed information
about the sectors with a crossed pair connection is displayed in the detection result.
The result is similar to the following:
To start a cross feeder test, run the following command:
STR CROSFEEDTST:;
The result is shown as follows:
+++
HUAWEI
2012-02-02 10:54:58
O&M
#453
%%STR CROSFEEDTST:;%%
RETCODE = 0 Operation succeeded.
Session ID = 65537
(Number of results = 1)
--END
+++
HUAWEI
2012-02-02 10:55:15
O&M
#452
%%STR CROSFEEDTST:;%%
RETCODE = 0 Progress report, Operation succeeded.
Report Type = Cross Feeder Test Progress
Status = Success
Session ID = 65537
Cross Feeder Test Result
------------------------
Sector No.    RESULT
0             Normal
1             Normal
(Number of results = 2)
--END
Handling Suggestion
After the sectors with a crossed pair connection are determined, adjust their antenna
connection. Since there are three types of crossed pair connections (main-main,
main-diversity, and diversity-diversity), several rounds of antenna adjustment may be required
before the test result verifies no crossed pair connection.
5.
Check whether random electromagnetic interference exists.
If the fault is not caused by the preceding factors, it may be caused by random
electromagnetic interference. Occasional electromagnetic interference has a small impact
on the network performance. Therefore, ignore it if the RTWP imbalance alarm is not
frequently triggered. If the RTWP imbalance alarm is frequently triggered, contact Huawei
technical support.
6.
If the fault persists, contact Huawei technical support.
Typical Cases
None
14.6 Troubleshooting ALD Link Faults
This section provides information required to troubleshoot ALD link faults. The information
includes fault descriptions, background information, possible causes, fault handling method and
procedure, and typical cases.
Fault Description
An ALD-related alarm is reported if there are antenna line device (ALD) link faults in the radio
frequency (RF) channels of an RF unit.
Related Information
Related alarms are as follows:
l
ALM-26530 RF Unit ALD Current Out of Range
l
ALM-26541 ALD Maintenance Link Failure
l
ALM-26751 RET Antenna Motor Fault
l
ALM-26754 RET Antenna Data Loss
l
ALM-26757 RET Antenna Running Data and Configuration Mismatch
l
ALM-26752 ALD Hardware Fault
l
ALM-26753 RET Antenna Not Calibrated
l
ALM-26531 RF Unit ALD Switch Configuration Mismatch
Possible Causes
Possible causes for ALD-related alarms are listed as follows:
l
The setting of the ALD power supply switch is improper.
l
The settings of the ALD current alarm thresholds are incorrect.
l
The ALD connections are abnormal.
l
The ALDs are faulty.
Fault Handling Flowchart
None
Fault Handling Procedure
1.
Rectify the faults and modify the improper settings.
2.
If the fault persists, contact Huawei technical support.
Typical Cases
None
15
Troubleshooting License Faults
About This Chapter
This chapter describes how to diagnose and handle license faults.
15.1 Definitions of License Faults
License faults are license-related alarms and faults that occur during eNodeB license installation.
15.2 Background Information
A license is an authorization agreement between the supplier and the operator on the use of
products. It defines the product features, versions, capacity, validity period, and application
scope.
15.3 Troubleshooting Method
To troubleshoot license faults, determine in which scenarios the license faults occur, for example,
during license installation or during network running, and then take different measures.
15.4 Troubleshooting License Faults That Occur During License Installation
This section provides information required to troubleshoot license faults that occur during license
installation. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
15.5 Troubleshooting License Faults That Occur During Network Running
This section provides information required to troubleshoot license faults that occur during
network running. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
15.6 Troubleshooting License Faults That Occur During Network Adjustment
This section provides information required to troubleshoot license faults that occur during
network adjustment. The information includes fault descriptions, background information,
possible causes, fault handling method and procedure, and typical cases.
15.1 Definitions of License Faults
License faults are license-related alarms and faults that occur during eNodeB license installation.
NOTE
Problems that may be encountered during license application are not described in this document. For details,
see eRAN License Management Feature Parameter Description.
15.2 Background Information
A license is an authorization agreement between the supplier and the operator on the use of
products. It defines the product features, versions, capacity, validity period, and application
scope.
Operators can purchase the license to determine the network functions and capacity at a specific
stage, maximizing the return on investment. For details, see eRAN License Management Feature
Parameter Description.
15.3 Troubleshooting Method
To troubleshoot license faults, determine in which scenarios the license faults occur, for example,
during license installation or during network running, and then take different measures.
Possible Causes
The possible causes of license faults are as follows:
l
Incorrect operations
l
Misunderstanding over the license mechanism
l
Errors in license files
l
Product defects
Troubleshooting Flowchart
The following figure shows the troubleshooting flowchart for license faults that occur in different
scenarios.
Figure 15-1 Troubleshooting flowchart for license faults
Troubleshooting Procedure
1.
Determine whether license faults occur during license installation. If so, perform the
procedure for troubleshooting license faults that occur during license installation. If not,
go to 2.
2.
Determine whether license faults occur during network running. If so, perform the
procedure for troubleshooting license faults that occur during network running. If not, go
to 3.
3.
Determine whether license faults occur during network adjustment. If so, perform the
procedure for troubleshooting license faults that occur during network adjustment. If not,
go to 4.
4.
If the faults persist, contact Huawei technical support.
15.4 Troubleshooting License Faults That Occur During License Installation
This section provides information required to troubleshoot license faults that occur during license
installation. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
Fault Description
If license installation fails, the following error messages will be displayed in the MML command
output:
l
License check failed; license serial number became invalid; the license file does not match
the product; the license versions do not match.
l
The license file has expired; the file type is DEMO.
l
The license control items do not match; the configured value exceeds the value in the license
file or the validity date of the control item is earlier than that in the license file.
Related Information
During license installation, the eNodeB checks the license. The check items are as follows:
l
Integrity check: Whether the product name in the license file matches the software name;
whether the checks on full-text signature, Service field signature, and feature signature are
successful.
l
Accuracy check: Whether the equipment serial number (ESN) in the Service field matches
the ESN of the equipment; whether the VR version number in the Service field matches
the VR version of the software.
l
Validity period check: Whether the license for the feature exceeds the validity date; whether
the license for the feature exceeds the validity date and protection period.
l
Difference check: Differences between new and old license files, including whether any
function items in the new license files are lost, whether any resource items are reduced or
lost, and whether the validity period for the feature becomes shorter.
If the license check fails, the subsequent processing is as follows:
l
If the integrity check fails, the license file installation fails.
l
If the accuracy check fails (the ESNs or the VR versions do not match), users need to
confirm whether to continue with the installation. If users choose to continue with the
installation, the feature defined in the license file can run in trial mode for 60 days. After
60 days, the feature enters the default mode. The license file with the same errors cannot
be installed repeatedly.
NOTE
l If the ESNs or VR versions do not match, the system runs based on the function items and resource
configuration defined in the license file. If the system does not read correct function items or resource
items from the license file, the system runs with the minimum configuration.
l If the ESNs or VR versions do not match and the license for the feature exceeds the validity date and
protection period, the feature runs in default mode. Otherwise, the feature runs in trial mode.
l If there is a license file in which the ESNs or VR versions do not match on the system, a license file
with the same error as the existing license file cannot be installed. If a correct license file exists, a
license file in which the ESNs or VR versions do not match can also be installed.
l If the license file to be installed expires, that is, the license for all features exceeds the validity date,
the license file installation fails. If only the license for some features exceeds the validity date, the
license file can be installed and a message prompting that the license for some features exceeds the
validity date is displayed.
l During license installation, if the function items, resource items, and validity period in the license file
are different from those in the previous license file, the installation result indicates the differences and
the user can choose to forcibly install the new license file.
l If the value of a license control item in the license file is smaller than the corresponding configured
value (for example, the number of cells), the license file fails to be installed.
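The check outcomes described above can be summarized as a decision function. The following Python code is a minimal sketch only. The names LicenseCheck, Outcome, and installation_outcome are hypothetical and are not eNodeB or M2000 interfaces. The order in which the failed checks are evaluated is an assumption, and the difference check and the protection period are not modeled.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    INSTALLED = "installed"
    INSTALLED_TRIAL = "installed, features run in trial mode for 60 days"
    NEEDS_CONFIRMATION = "user must confirm before the installation continues"
    FAILED = "installation fails"


@dataclass
class LicenseCheck:
    """Hypothetical summary of the checks performed on a license file."""
    integrity_ok: bool            # product name and signatures are verified
    esn_matches: bool             # ESN in the Service field matches the equipment
    vr_matches: bool              # VR version in the Service field matches the software
    all_features_expired: bool    # every feature exceeds its validity date
    config_within_license: bool   # no licensed value is smaller than the configured value


def installation_outcome(check: LicenseCheck, user_confirmed: bool = False) -> Outcome:
    # Integrity failures, fully expired files, and control item values that
    # are smaller than the configured values all make the installation fail.
    if not check.integrity_ok:
        return Outcome.FAILED
    if check.all_features_expired:
        return Outcome.FAILED
    if not check.config_within_license:
        return Outcome.FAILED
    # An ESN or VR version mismatch requires user confirmation and leads
    # to trial mode if the user continues.
    if not (check.esn_matches and check.vr_matches):
        if not user_confirmed:
            return Outcome.NEEDS_CONFIRMATION
        return Outcome.INSTALLED_TRIAL
    return Outcome.INSTALLED
```

For example, a file whose ESN does not match yields NEEDS_CONFIRMATION first and INSTALLED_TRIAL after the user confirms the installation.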
Possible Causes
l
The ESNs, VR versions, or product types do not match.
l
The license file has expired or the license file type is incorrect.
l
The system configuration items do not match the license control items.
Fault Diagnosis
If the license installation fails, an error message will be displayed in the MML command output.
You can diagnose the fault based on the error message. For details, see eRAN License
Management Feature Parameter Description.
Fault Handling
1.
Rectify the fault based on the error message by referring to eRAN License Management
Feature Parameter Description.
2.
If the fault persists, contact Huawei technical support.
Typical Cases
Fault Description
After eNodeBs at a site were upgraded from eRAN2.0 to eRAN2.1, the eNodeBs experienced
failures to install commercial licenses. The following error message was displayed:
The configured value of the control item is greater than the value in the license
file
Fault Diagnosis
During commercial license installation, the M2000 displayed the following message:
The configured value of the control item is greater than the value in the license
file
This message shows that the configured values on the current eNodeB exceeded the limits of
the license file. Compare the license control items in the license file with the configuration that
has taken effect on the eNodeB to find the configuration items that have been activated on the
eNodeB but were not authorized by the license file.
Fault Handling
1.
Compare the configured values on the eNodeB with the authorized values in the license file.
Run the DSP LICENSE command to query the configured values on the eNodeB, and
compare the configured values with the allocated values in the license file. The command
output is as follows:
Figure 15-2 Querying license information
As shown in the figure, Allocated, Config, and Actual Used are, respectively, the value allocated in
the license file, the value configured on the eNodeB, and the value actually in use.
When the configured value on the eNodeB exceeds the allocated value in the license file,
the following error message is displayed:
Data Configuration Exceeding Licensed Limit
2.
Check the functions not authorized by the license file.
Find the configuration items that are activated (the Config value is set to 1) on the eNodeB
but not included in the license file.
3.
Reinstall the license.
Modify the eNodeB configuration, disable the functions not authorized by the license file,
or apply for a new license file that includes these function items and in which the allocated
values are equal to or greater than the configured values on the eNodeB. Then, reinstall the
license.
4.
If the fault persists, contact Huawei technical support.
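The comparison in steps 1 and 2 of this case can be sketched in Python as follows. The sketch assumes that the Allocated and Config values have been transcribed manually from the DSP LICENSE output. It does not parse real command output, and the item names and values are invented for illustration. A function item that is missing from the license file is modeled as an allocated value of 0.

```python
from typing import NamedTuple, Optional


class ControlItem(NamedTuple):
    name: str
    allocated: int          # value granted by the license file
    config: Optional[int]   # value configured on the eNodeB; None models NULL


def unauthorized_items(items):
    """Return the items whose configured value exceeds the allocated value."""
    return [
        item for item in items
        if item.config is not None and item.config > item.allocated
    ]


# Illustrative values only, not taken from a real eNodeB.
items = [
    ControlItem("Hypothetical resource item A", allocated=3, config=6),
    ControlItem("Hypothetical function item B", allocated=0, config=1),
    ControlItem("Hypothetical function item C", allocated=1, config=1),
    ControlItem("Hypothetical dynamic item D", allocated=100, config=None),
]

for item in unauthorized_items(items):
    print(f"{item.name}: Config={item.config} > Allocated={item.allocated}")
```

With these sample values, items A and B are reported. Either the configuration must be reduced for these items, or a new license file must authorize them.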
15.5 Troubleshooting License Faults That Occur During
Network Running
This section provides information required to troubleshoot license faults that occur during
network running. The information includes fault descriptions, background information, possible
causes, fault handling method and procedure, and typical cases.
Fault Description
Related alarms and events are generated.
Related Information
l
Related alarms
– ALM-26815 Licensed Feature Entering Keep-Alive Period
– ALM-26816 Licensed Feature Unusable
– ALM-26817 License on Trial
– ALM-26818 No License Running in System
– ALM-26819 Data Configuration Exceeding Licensed Limit
l
Related events
– EVT-26820 License Emergency Status Activated
– EVT-26821 License Emergency Status Ceased
Possible Causes
l
Licensed Feature Entering Keep-Alive Period
This alarm is generated when the licensed feature exceeds the validity date. You can run
the DSP LICENSE command to check the license control items that exceed the validity
date. After the licensed feature exceeds the validity date, the feature enters the keep-alive
period of 60 days (jumps in the system time do not affect the keep-alive period). During
the keep-alive period, the expired feature operates with the current license settings. After the
keep-alive period, the expired feature operates with the default license settings.
l
Licensed Feature Unusable
This alarm is generated when the licensed feature exceeds the keep-alive period.
l
License on Trial
This alarm is generated when a license file enters the keep-alive period. The possible causes
are as follows:
– The license file exceeds the validity date.
– The license file is revoked.
– The ESN in the license file is inconsistent with the actual ESN of the eNodeB.
– The eNodeB version in the license file is inconsistent with the running version of the
eNodeB.
– The ESN and eNodeB version in the license file are inconsistent with the actual ESN
and the running version of the eNodeB.
l
No License Running in System
This alarm is generated when there is no valid license file on the system. The possible
causes are as follows:
– The license file exceeds the keep-alive period of 60 days.
– The license file is not found or errors occur during the license check when the system
is started.
l
Data Configuration Exceeding Licensed Limit
This alarm is generated when the eNodeB configuration exceeds the limits of the license
(including the default license). If this alarm is generated due to data modification during
the system running, the original data configuration is used. If this alarm is generated during
the system startup, the licensed specifications of the feature are used.
l
License Emergency Status Activated
This event is generated when the license emergency status is activated. In the license
emergency status, the eNodeB operates with the dynamic count-type resource items and
performance control items (such as traffic and number of users) set to their maximum
values, while the other control items remain unchanged.
l
License Emergency Status Ceased
This event is generated when the license emergency status ends, either automatically seven
days after the eNodeB enters the emergency status or when the license file is uploaded again to
the eNodeB while it is in the emergency status.
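The relationship between the validity date, the 60-day keep-alive period, and the first two alarms can be illustrated with the following Python sketch. It is a simplification under stated assumptions: the function name feature_state is hypothetical, the boundary days are treated as inclusive, and plain calendar dates are compared. The eNodeB itself does not rely on the system time in this way, because the keep-alive period is not affected by system time jumps.

```python
from datetime import date, timedelta

KEEP_ALIVE_DAYS = 60  # keep-alive period stated in this section


def feature_state(validity_date: date, today: date) -> str:
    """Classify a licensed feature by comparing calendar dates (simplified)."""
    if today <= validity_date:
        return "licensed"
    if today <= validity_date + timedelta(days=KEEP_ALIVE_DAYS):
        # ALM-26815 Licensed Feature Entering Keep-Alive Period
        return "keep-alive: current license settings (ALM-26815)"
    # ALM-26816 Licensed Feature Unusable
    return "expired: default license settings (ALM-26816)"


validity = date(2012, 7, 30)  # illustrative date only
for day in (date(2012, 7, 30), date(2012, 8, 15), date(2012, 12, 1)):
    print(day, feature_state(validity, day))
```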
Fault Diagnosis
Refer to the alarm reference documents to locate the alarm causes and clear the alarms.
Fault Handling
1.
Clear the alarms according to the alarm handling suggestions.
2.
If the fault persists, contact Huawei technical support.
Typical Cases
None
15.6 Troubleshooting License Faults That Occur During
Network Adjustment
This section provides information required to troubleshoot license faults that occur during
network adjustment. The information includes fault descriptions, background information,
possible causes, fault handling method and procedure, and typical cases.
Fault Description
After a command is run to enable a function, configuration activation fails because of a
license restriction.
Figure 15-3 Example of a configuration activation failure due to license restriction
Related Information
l
License control item classification
License control items are classified into resource items and function items. The DSP
LICENSE command can be used to list all the control items on the maintenance console.
– The allocated value for a resource item generally exceeds 1. Operators determine the
number of resource items in the commercial license they purchase based on the site
requirements. Typical resource items include the cell bandwidth, number of accessed
users, and number of cells.
– The function items are assigned values of 0 or 1 to indicate whether the functions are
purchased. The typical function items include enhanced synchronization (clock
synchronization), IPSec, and IEEE 802.1X-based access control.
License control items can be further classified into the following five categories based on
the configured value and usage:
– Dynamic counting items (resource items): These items are passive control items that do not
require manual configuration. The configured value is NULL. These items
dynamically occupy session resources. When a session starts, the occupied resources
are counted and are subtracted from the total number of resources. When the session
stops, the occupied resources are released.
– Performance items (resource items): These items are passive control items that do not
require manual configuration. The configured value is NULL. When the eNodeB
starts up, the eNodeB queries the allocated values of these items. During
eNodeB operation, the quantity of occupied resources is kept below the
allocated values.
– Static counting items (resource items): These items are active control items and require
manual configuration. The corresponding resources are statically configured resources.
When the eNodeB starts up, the eNodeB obtains the configured values of these items
from the configuration file and uses these configured values to apply for the
corresponding types of resource. When the eNodeB stops providing services, the
resources are released.
– Boolean counting items (resource items): These items are active control items and
require manual configuration. The corresponding resources are Boolean resources at
the NE's submodule level. When the eNodeB starts up, the eNodeB decides whether to
apply for the corresponding resources based on the configured values (0 or 1). When a
submodule stops providing services, its resources are released.
– Boolean items (function items): These items are passive control items that do not require
manual configuration. The configured value is NULL, and the corresponding resources
are NE-level Boolean resources. When the eNodeB starts up, the eNodeB checks the
values of these items to see whether the corresponding functions are enabled.
l
License control item description
– Power: RF Output Power (per 20W) (FDD)
The power license controls the total required power of radio frequency (RF) modules
in an eNodeB. Each RF module provides 20 W power by default. Extra power must be
purchased in units of 20 W. If the eNodeB operates in default license mode, the licensed
power is 0 W by default.
– Bandwidth: Carrier Bandwidth (per 5MHz) (FDD)
The bandwidth license controls the total required bandwidth of an eNodeB. In the Long
Term Evolution (LTE) system, bandwidth of each carrier is scalable. It can be 1.4 MHz,
3 MHz, 5 MHz, 10 MHz, 15 MHz, or 20 MHz. Bandwidth is purchased in units of 5
MHz.
– CSFB control item
The control items for UTRAN, GERAN, and CDMA control the CSFB function for
these three radio access technologies (RATs).
LLT1CFBU01: CSFB to UTRAN
LLT1CFBG01: CSFB to GERAN
LLT1CFBR01: CSFB (FDD) to CDMA2000 1xRTT
– eNodeB throughput: Throughput Capacity (per Mbps)
This control item specifies the total licensed throughput of the eNodeB, which includes
the uplink and downlink throughput. Users can run the MOD LICRATIO command
to specify the proportion of licensed uplink throughput to the total licensed throughput.
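The units used by the power, bandwidth, and throughput control items can be illustrated with a short calculation. The following Python code is a sketch under assumptions that this document does not confirm: quantities are summed across the eNodeB before being rounded up to whole units, and the uplink proportion is expressed as a percentage. The function names are hypothetical. Check the actual counting rules in eRAN License Management Feature Parameter Description.

```python
import math

POWER_UNIT_W = 20        # "per 20W" unit; also the default power of each RF module
BANDWIDTH_UNIT_MHZ = 5   # "per 5MHz" unit


def extra_power_units(module_powers_w):
    """20 W units needed beyond the default 20 W of each RF module (assumed rule)."""
    extra = sum(max(0, power - POWER_UNIT_W) for power in module_powers_w)
    return math.ceil(extra / POWER_UNIT_W)


def bandwidth_units(carrier_bandwidths_mhz):
    """5 MHz units needed for the total carrier bandwidth (assumed rule)."""
    return math.ceil(sum(carrier_bandwidths_mhz) / BANDWIDTH_UNIT_MHZ)


def split_throughput(total_mbps, uplink_percent):
    """Split the total licensed throughput into uplink and downlink parts."""
    uplink = total_mbps * uplink_percent / 100
    return uplink, total_mbps - uplink


print(extra_power_units([40, 60]))   # two RF modules configured for 40 W and 60 W
print(bandwidth_units([20, 10]))     # one 20 MHz carrier and one 10 MHz carrier
print(split_throughput(150, 40))     # 150 Mbps in total, 40% assigned to uplink
```

Under these assumptions, the two RF modules need three extra power units, the two carriers need six bandwidth units, and 60 Mbps of the 150 Mbps is assigned to the uplink.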
Possible Causes
l
No license is running on the eNodeB.
l
The license for the eNodeB has expired, and the keep-alive period has expired.
l
The license for the eNodeB does not have the permission to apply for license control items.
Fault Handling Flowchart
When this type of fault occurs, the message "Failed to activate the configuration because of
license control" is displayed on the maintenance console. The following figure shows the fault
handling flowchart.
Figure 15-4 Fault handling flowchart
Fault Handling Procedure
1.
Check whether any license-related alarms are generated on the eNodeB.
2.
If license-related alarms are generated, clear the alarms by referring to eNodeB Alarm
Reference.
3.
If there are no license-related alarms, run the DSP LICENSE command to view the
allocated values and configured values for the current control items.
4.
Check whether the functions to be enabled on the eNodeB are authorized by control items
or whether the configured values exceed the allocated values in the license file.
5.
If the configured values exceed the allocated values, apply for a new license that meets
requirements and reinstall the license.
6.
If the fault persists, contact Huawei technical support.
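The order of the checks in this procedure can be expressed as a small decision function. The following Python code is a sketch only: the function name next_step and its inputs are hypothetical, and the alarm IDs are those listed for license faults that occur during network running. The active alarm IDs and the result of the DSP LICENSE comparison are assumed to be collected manually.

```python
# License-related alarm IDs listed in section 15.5.
LICENSE_ALARM_IDS = {26815, 26816, 26817, 26818, 26819}


def next_step(active_alarm_ids, config_exceeds_allocated):
    """Return the next action of the fault handling procedure."""
    license_alarms = sorted(set(active_alarm_ids) & LICENSE_ALARM_IDS)
    if license_alarms:
        ids = ", ".join(f"ALM-{alarm_id}" for alarm_id in license_alarms)
        return f"Clear {ids} by referring to eNodeB Alarm Reference."
    if config_exceeds_allocated:
        return "Apply for a new license that meets the requirements and reinstall it."
    return "Contact Huawei technical support."


# 12345 stands for an unrelated alarm ID and is ignored.
print(next_step([26818, 12345], config_exceeds_allocated=False))
```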
Typical Cases
None