Uploaded by Paul Garuba

3G Cell Optimisation RevA(1)

3G Cell Optimisation
INTRODUCTION
Objective
Statistical RNC/RXI/RBS counters reflect the performance of a 3G cell within the
network. Specific counters are analysed to detect various faults. The objective of the
procedure is therefore to trace any network-related faults to the source of the problem
by analysing the fault symptoms evident in the counter values.
Scope
This procedure identifies and recommends solutions for network faults identified through
the analysis of RNC/RXI/RBS statistical counters.
References
Additional references to this procedure are as follows:
• ALEX Libraries:
  o Radio Network Controller (RNC) 3810 (CXP 901 2011 RXX)
  o RXI 820 ATM R4.1 (CXP 901 102/3 RXX)
  o Radio Base Station (RBS) 3202/3206/3402/3412 (CXP 901 0811/X RXX)
  o WCDMA RAN (CXS 101 06/4 RXX)
• TEMS User’s Manual
For RNC/RXI/RBS counter descriptions refer to the Performance Statistics
document within the relevant RNC/RXI/RBS ALEX Library.
PROCEDURE
ANALYSING COUNTERS
1. The FACTS tool interfaces to and presents the relevant counters on an RNC,
RXI and RBS basis. Counters are collected every quarter of an hour (15 minutes,
known as a reporting period) and are stored from the operational date of the
cell, thereby allowing for past analysis.
2. There are numerous counters available from an RNC/RXI/RBS. However, this
procedure concentrates on counters reflecting the critical performance of the
cells. These counters (and formulae derived therefrom) are best analysed
graphically through the use of FACTS. The formulae used for statistics such as
DCR and CSSR may be obtained within FACTS.
3. Both the NMC and the Planning & Optimization Engineer are responsible for
monitoring counters. The NMC has the responsibility of maintaining the active
status of all cells and therefore must act in accordance with all such related
counters. The Planning and Optimization Engineer monitors and acts on
counters reflecting the cell’s active performance.
4. It is possible to configure alarms to be generated for counters exceeding specific
values. These alarms would then be monitored by the NMC.
5. For the Radio Planning & Optimisation Engineer the focus is on maintaining
adequate cell performance in terms of Accessibility (call setup analysis),
Retainability (drop call analysis) and Integrity (speech quality/video quality/packet
throughput analysis).
ACCESSIBILITY
6. If a cell has poor accessibility it is typically due to either some form of congestion
or a hardware/software fault or a misconfiguration. It is also possible that there is
some external source of interference (such as a microwave link on the same
frequency) affecting the accessibility.
7. Accessibility should be monitored independently for the different RAB types (e.g.
Speech, CS Video, PS Interactive R99, PS Interactive HSDPA, etc.) as in certain
situations only one of the RAB types will be affected. For example, a disabled
HS-TXB will affect the accessibility of the PS Interactive HSDPA RAB, but if the
RBS also has a TXB (non-HS) installed then the other RABs may continue to
have an acceptable accessibility.
8. When a low CSSR is detected on a cell the first thing to check is if Admission
Control is rejecting the RRC/RAB setup attempt (pmNoReqDeniedAdm) or if it is
failing after admission (pmNoFailedAfterAdm). For high pmNoReqDeniedAdm
refer to the “Admission Control” sections below. For high pmNoFailedAfterAdm
refer to the “Failure After Admission” sections below.
Example: FACTS Report showing a low CSSR Speech caused by a high
pmNoReqDeniedAdm. Note that pmNoReqDeniedAdm is not RAB specific so other
RABs will most likely be affected in this case too.
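The first triage step in point 8 can be sketched as a small helper. This is only an illustration: the function name and the simple "larger counter wins" rule are assumptions, not part of FACTS or of the RNC software, and the counter values are expected to be read from the relevant FACTS report for the same reporting periods.

```python
def triage_low_cssr(pm_no_req_denied_adm: int, pm_no_failed_after_adm: int) -> str:
    """Suggest which group of sections to consult first for a low-CSSR cell.

    Hypothetical helper: it simply compares the two counters named in point 8
    and returns the title of the section group that matches the larger one.
    """
    if pm_no_req_denied_adm == 0 and pm_no_failed_after_adm == 0:
        # Neither counter explains the low CSSR.
        return "No admission-related failures counted"
    if pm_no_req_denied_adm >= pm_no_failed_after_adm:
        # Requests are mostly rejected by Admission Control itself.
        return "Admission Control"
    # Requests are admitted but fail afterwards.
    return "Failure After Admission"


if __name__ == "__main__":
    print(triage_low_cssr(pm_no_req_denied_adm=340, pm_no_failed_after_adm=12))
```

In practice both counters may be high at the same time, in which case both groups of sections should be worked through.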
Admission Control: DL Power
9. If Admission Control rejects a RAB establishment due to a lack of DL power then
the counter pmNoFailedRabEstAttemptLackDlPwr is incremented. Check that
the feeder losses are configured correctly in the RBS and that the parameter
maximumTransmissionPower is set correctly (typically to
maxDlPowerCapability minus 0.2dB). The value of parameter pwrAdm should
also be verified (typically set to 75%). Also, check for MCPA alarms: some
RBS 3202 sites with high feeder losses are configured with two MCPAs per
sector, and if one of the MCPAs fails the maxDlPowerCapability of the sector
is greatly reduced, causing a lack of DL power.
Long term solutions are to increase the power capability of the sector by adding
or upgrading an MCPA (RBS 3202) or RU (RBS 3206), re-engineering the site to
reduce feeder lengths, or perhaps to change the RBS type to one using RRUs
(RBS 3402 or RBS 3412) if this provides higher power at the reference point. The
short term solution is to reduce the traffic carried by the site (See the “Traffic
Offload” sections).
Example: FACTS Report showing a high number of RAB establishment failures due to
Admission Control rejections caused by a lack of DL power. In this situation the site had
only one of the two MCPAs in sector 1 functioning correctly causing the DL power
congestion. This is shown in the cabinet viewer snapshot below (Red LED on MCPA).
There was also an alarm in the RBS for the faulty MCPA.
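The two power checks described in point 9 can be illustrated with a short calculation. This is a sketch under stated assumptions: powers are handled as plain dBm values (the units in which the RBS stores these parameters are not considered here), the function names are hypothetical, and the MCPAs in a sector are assumed to be equally rated so that their powers add linearly.

```python
import math


def expected_max_transmission_power_dbm(max_dl_power_capability_dbm: float,
                                        margin_db: float = 0.2) -> float:
    """Typical setting from point 9: capability minus a 0.2 dB margin."""
    return max_dl_power_capability_dbm - margin_db


def capability_loss_db(working_mcpas: int, installed_mcpas: int) -> float:
    """Reduction in sector power capability when some MCPAs have failed.

    Assumes equally rated MCPAs whose output powers add linearly.
    """
    if working_mcpas <= 0 or working_mcpas > installed_mcpas:
        raise ValueError("working_mcpas must be between 1 and installed_mcpas")
    return 10 * math.log10(installed_mcpas / working_mcpas)


if __name__ == "__main__":
    # Hypothetical capability of 43.0 dBm at the reference point.
    print(round(expected_max_transmission_power_dbm(43.0), 1))
    # One of two MCPAs working: roughly a 3 dB loss of capability.
    print(round(capability_loss_db(working_mcpas=1, installed_mcpas=2), 2))
```

Under these assumptions a single failed MCPA in a two-MCPA sector halves the available power, which is consistent with the DL power congestion seen in the example.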
Admission Control: DL Channelisation Codes
10. If Admission Control rejects a RAB establishment due to a lack of DL
channelisation codes then the counter
pmNoFailedRabEstAttemptLackDlChnlCode is incremented. This will typically
affect the PS Interactive R99 (DCH/FACH) CSSR worse than the Speech CSSR
as the PS Interactive R99 RAB requires channelisation codes at a lower
spreading factor (using more of the code tree). In the P4 software release a cell
that supports R99 and HSDPA typically has 5 spreading factor 16 DL
channelisation codes reserved for HSDPA. This means that approximately 32%
of available codes are reserved for HSDPA. When this is the case it is common
for DL channelisation code congestion to occur. Check the setting of parameter
dlCodeAdm (typically set to 85% on MTN’s network). The long term solution is
to add another cell in the coverage area to take some of the traffic; this may be
achieved by introducing a second carrier, another sector, or another site. The
short term solution is to reduce the traffic carried by the site (See the “Traffic
Offload” sections).
Example: FACTS Reports showing a high number of RAB establishment failures due to
Admission Control rejections caused by a lack of DL channelisation codes; and the
corresponding decrease in CSSR for Packet Interactive. In this case a large portion of
the speech calls were already redirected to GSM so the R99 Packet Interactive RAB
was worst affected; the required solution is sectorisation of the inbuilding antenna
system or implementation of a second carrier frequency.
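The share of the DL code tree taken by a set of channelisation codes can be worked out as below. This is a minimal sketch: a code at spreading factor SF is taken to occupy 1/SF of the tree, and the helper name is hypothetical. The extra SF 128 code added in the second calculation is an assumption made only to show one way the "approximately 32%" figure could arise; the source does not state how that figure was derived. The SF 128 value used for a speech connection is likewise an assumption.

```python
from fractions import Fraction


def code_tree_share(codes_by_sf: dict) -> Fraction:
    """Fraction of the DL code tree used.

    codes_by_sf maps a spreading factor to the number of codes allocated
    at that spreading factor; each code occupies 1/SF of the tree.
    """
    return sum((Fraction(count, sf) for sf, count in codes_by_sf.items()),
               Fraction(0))


if __name__ == "__main__":
    # Five SF 16 codes reserved for HSDPA, as described in point 10.
    hs_data_only = code_tree_share({16: 5})
    print(float(hs_data_only))             # 0.3125

    # Assumption: one additional SF 128 code is also tied up by HSDPA.
    hs_with_extra = code_tree_share({16: 5, 128: 1})
    print(float(hs_with_extra))            # 0.3203125

    # One SF 8 code (PS64/384) blocks as much of the tree as sixteen
    # SF 128 codes (SF 128 for speech is an assumption).
    print(Fraction(1, 8) / Fraction(1, 128))  # 16
```

This illustrates why the PS Interactive R99 CSSR tends to suffer before the Speech CSSR: each low spreading factor code needs a large contiguous part of an already reduced tree.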
Admission Control: UL/DL ASE
11. If Admission Control rejects a RAB establishment due to a lack of UL or DL air
speech equivalent (ASE) then the counters
pmNoFailedRabEstAttemptLackUlAse or
pmNoFailedRabEstAttemptLackDlAse are incremented. The ASE monitor
accounts for the air interface resource usage in a cell (separately for UL and DL)
by means of an average static load estimation of each radio link in the cell; for
more information refer to the “Capacity Management” document in the “WCDMA
RAN” ALEX library. Because a RL’s ASE is an estimation it is possible that in
certain situations it is an over-estimation of the load in a cell e.g. for inbuilding
cells on a different carrier frequency to the surrounding macro cells. In such
situations it is possible to increase the UL/DL ASE admission control limit
(parameters aseUlAdm/aseDlAdm) in order to prevent unnecessary admission
control rejections of RAB establishments. So, a short term solution to relieve
such congestion may be to increase aseUlAdm/aseDlAdm, but the effect on
DCR/CSSR should be closely monitored (Note that the aseUlAdm default value
on MTN’s network is already less stringent than the Ericsson default). Another
short term solution is to reduce the traffic carried by the site (See the “Traffic
Offload” sections). The long term solution is to add another cell in the coverage
area to take some of the traffic; this may be achieved by introducing a second
carrier, another sector, or another site.
Example: FACTS Report showing RAB establishment failures due to Admission Control
rejections caused by a lack of UL ASE. In this case the UL ASE congestion was minor
and lasted only a few days, so no action was taken.
Admission Control: Connection Limits
12. If Admission Control rejects a RAB establishment due to exceeding the
configured connection limit for SF 8, SF 16, or SF 32 then the counter
pmNoFailedRabEstAttemptExceedConnLimit is incremented. These
spreading factors are used by the PS64/384, PS64/128, and PS64/64 RBs respectively,
so the connection limit blocking typically applies to channel switching between these
RBs for an R99 packet interactive RAB. The connection limits are configured by
parameters sf8Adm, sf16Adm and sf32Adm. The default settings allow the
maximum possible number of RL’s for each spreading factor in which case
Admission Control will not block for this reason. Lower settings have been tested
(in combination with adjusted Class B QoS settings on the Iub interface) in which
case some connection limit rejections were obtained. But this is a special
situation and for the purpose of this document such connection limit rejections
are not worth further consideration.
Admission Control: Hardware Usage (Channel Elements)
13. It is possible for Admission Control to reject a RAB establishment attempt due to
insufficient UL or DL RBS hardware capacity i.e. too few channel elements
available. The channel element capacity of an RBS may be software limited
(according to the software license configured for the RBS) or hardware limited
(according to the TXBs and RAXBs installed in the RBS). The two parameters
that control the RBS hardware admission policy are ulHwAdm and dlHwAdm.
By default these parameters should be set to 100% in which case no hardware is
reserved for handovers and Admission Control will not block RAB establishment
attempts for this reason (see “Failure After Admission: Hardware Usage”). In
software revision P4 there is no specific counter to indicate this type of
Admission Control rejection, so if pmNoReqDeniedAdm is triggered
without any of the other relevant counters indicating a reason then it is
likely that this is the cause and that ulHwAdm or dlHwAdm is incorrectly
configured to a value below 100%. In the P5 software release there are new
counters that indicate when lack of hardware capacity causes RAB establishment
failures in a cell: pmNoFailedRabEstAttemptLackDlHw,
pmNoFailedRabEstAttemptLackDlHwBest,
pmNoFailedRabEstAttemptLackUlHw,
pmNoFailedRabEstAttemptLackUlHwBest.
Example: FACTS Reports showing RAB establishment failures due to Admission
Control with no counter showing the reason (this is for P4). In this case the HW
admission limits were suspected and found to be ulHwAdm=70 and dlHwAdm=70
(instead of both being 100). After correcting these settings the Admission Control
rejections disappeared and, as can be seen in the second plot below, the Packet CSSR
improved. In the third plot below the UL CE Usage is seen to peak around 45 CEs. This
RBS had a capacity of 64 UL CEs; 70% of 64 CEs is 44.8 CEs. In other words, the UL
CE Usage and the Admission Control limit correlate with each other.
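The arithmetic in this example, and the P4 rule of thumb from point 13, can be sketched as follows. The function names are hypothetical, and the rule simply restates the text: rejections counted in pmNoReqDeniedAdm with none of the specific reason counters incrementing point towards ulHwAdm or dlHwAdm being set below 100%.

```python
def ce_admission_limit(ce_capacity: int, hw_adm_percent: float) -> float:
    """Channel elements usable before hardware admission starts to block."""
    return ce_capacity * hw_adm_percent / 100.0


def hw_admission_suspected(pm_no_req_denied_adm: int, reason_counters: dict) -> bool:
    """P4 heuristic: rejections occur but no reason counter has incremented.

    reason_counters maps counter names (for example
    pmNoFailedRabEstAttemptLackDlPwr) to their values for the same period.
    """
    return pm_no_req_denied_adm > 0 and not any(
        value > 0 for value in reason_counters.values())


if __name__ == "__main__":
    # Values from the example: 64 UL CEs with ulHwAdm=70.
    print(ce_admission_limit(64, 70))  # 44.8

    reasons = {
        "pmNoFailedRabEstAttemptLackDlPwr": 0,
        "pmNoFailedRabEstAttemptLackDlChnlCode": 0,
        "pmNoFailedRabEstAttemptLackUlAse": 0,
        "pmNoFailedRabEstAttemptLackDlAse": 0,
        "pmNoFailedRabEstAttemptExceedConnLimit": 0,
    }
    print(hw_admission_suspected(150, reasons))  # True
```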
Failure After Admission: Iub Congestion
14. If a UTRAN cell has a high number of RRC/RAB establishment request failures
after being admitted by Admission Control (pmNoFailedAfterAdm), then a
common reason would be due to Iub Congestion. When considering the Iub
interface it is important to remember that mainly RABs configured to use strict
AAL2 QoS settings will be blocked at call setup by AAL2 CAC. Typically the R99
RABs (i.e. all RABs excluding HSDPA and EUL RABs) are configured to use
AAL2 QoS class A or class B, with both classes configured to use a strict QoS.
HSDPA and EUL will typically use AAL2 QoS class C and class D, with both
classes configured to use a best effort QoS. Typically the R99 Packet Interactive
RAB will be the first RAB to show signs of AAL2 congestion with a poor Packet
Interactive CSSR and corresponding high pmNoFailedAfterAdm. The AAL2
Setup Success Rate statistics from the relevant RXI towards the RBS may then
be investigated. This should typically be 99% and above, but if not and the
counter pmUnSuccOutConnsLocal indicates that it is local rejections (on the
RXI) by CAC, then there is congestion on the Iub interface.
Example: FACTS Reports showing high pmNoFailedAfterAdm (1st plot), low CSSR
Packet Interactive (2nd plot), and low AAL2 Call Setup Success Rate with corresponding
high pmUnSuccOutConnsLocal (3rd plot). From 2006-11-24 the problem disappears. In
this case the solution was to activate Directed Retry to GSM and to change the AAL2
QoS class B traffic to use a best effort configuration thereby allowing more PS64/128
and PS64/384 users (as well as ordering a 2nd E1 to the site); note that this RBS did not
have HSDPA configured therefore there was no concern about affecting the experience
of HS users as described in section “Considerations For HSDPA: Iub Bandwidth”.
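The Iub check in point 14 can be expressed as a simple rule. This is a sketch only: the success rate formula below (successful setups divided by attempts) is an assumption, since the actual formula should be taken from FACTS, and the 50% share used to decide that the rejections are mainly local is a hypothetical threshold. Only the 99% target and the counter pmUnSuccOutConnsLocal come from the text.

```python
from typing import Optional


def aal2_setup_success_rate(attempts: int, unsuccessful: int) -> Optional[float]:
    """Assumed formula: percentage of AAL2 setup attempts that succeeded."""
    if attempts <= 0:
        return None
    return 100.0 * (attempts - unsuccessful) / attempts


def iub_congestion_suspected(attempts: int, unsuccessful: int,
                             pm_unsucc_out_conns_local: int,
                             target_percent: float = 99.0,
                             local_share: float = 0.5) -> bool:
    """True when the success rate is below target and the failures are
    mostly local (CAC) rejections on the RXI."""
    rate = aal2_setup_success_rate(attempts, unsuccessful)
    if rate is None or rate >= target_percent or unsuccessful == 0:
        return False
    return pm_unsucc_out_conns_local / unsuccessful >= local_share


if __name__ == "__main__":
    print(aal2_setup_success_rate(1000, 50))       # 95.0
    print(iub_congestion_suspected(1000, 50, 45))  # True
    print(iub_congestion_suspected(1000, 5, 5))    # False
```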
Failure After Admission: Core Transport Network Congestion
15. Related to the above point (“Failure After Admission: Iub Congestion”) is
transport network congestion in links other than the Iub e.g. RNC<->MGW (Iu-
cs), RNC<->SGSN (Iu-ps) and inter-MGW links. If this is the case then the CSSR
of an entire RNC(s) will deteriorate along with the AAL2 Setup Success Rate for
a major link to the RNC. It would then be necessary to look at the link utilisation
in order to confirm such link congestion, but that is beyond the scope of this
document.
Example: FACTS Reports showing poor CSSR Speech for CTRNC1 for two days and
then an improvement for the next two days (1st plot); and the corresponding AAL2 Setup
Success Rate for the CTMGW1->RBMGW1 (2nd plot) and RBMGW1->CTMGW1 (3rd
plot) links for the same days. The CTMGW1<->RBMGW1 link had a high utilisation
(>80%) so the peak cell rate (PCR) for the link was increased resulting in the noticeable
improvement.
Failure After Admission: Hardware Usage (Channel Elements)
16. A high number of RRC/RAB setup failures after admission
(pmNoFailedAfterAdm) could be due to insufficient UL or DL RBS hardware
capacity i.e. too few channel elements available. The channel element capacity
of an RBS may be software limited (according to the software license configured for
the RBS) or hardware limited (according to the TXBs and RAXBs installed in the
RBS). The two parameters that control the RBS hardware admission policy are
ulHwAdm and dlHwAdm. If these parameters are set to a value lower than 100%
then Admission Control should block any RRC/RAB setup attempts requiring
more than the available channel elements (see “Admission Control: Hardware
Usage”); however, by default these parameters should be set to 100% in which
case no hardware is reserved for handovers and Admission Control will not block
RAB establishment attempts for this reason so the setup attempt fails after
admission. The RBS counters pmSetupFailuresSfXX in the UplinkBasebandPool
(ULSETUPFAILURESSFXX) and pmSetupFailuresSfXX in the
DownlinkBasebandPool (DLSETUPFAILURESSFXX) indicate RL (at SF XX)
setup failures due to a lack of UL and DL hardware capacity. If this is the case
then a short term solution may be to reduce the traffic carried by the site (See the
“Traffic Offload” sections). The long term solution is to upgrade the UL (RAXB) or
DL (TXB) channel element capacity of the site. This may be achieved by
swapping the relevant board with that of another site that has more capacity than
it requires, or by sourcing a new board. Note that it is possible for these counters
to increment even when there should be sufficient channel element capacity (for
example due to a software bug in the software revision being used; see “Failure
After Admission: Other”) so it is important to compare the channel element usage
to the channel element capacity of the RBS to make sure that it makes sense for
this to be the root of the problem.
Example: FACTS Reports showing poor CSSR Packet Interactive (1st plot); high
pmNoFailedAfterAdm (2nd plot); and UL setup failures due to a lack of UL baseband
hardware capacity (RAXB). Note that this RBS had 64 UL channel element capacity
until 31st August when it was upgraded to 128 UL channel elements. The estimated UL
CE Usage peaks above 64 channel elements even before the 31st confirming that RAXB
congestion is the source of the problem, and then after the upgrade to 128 channel
elements the UL CE Usage starts peaking above 100 indicating how necessary the
upgrade was. The improvement to CSSR Packet Interactive and the decrease in
pmNoFailedAfterAdm after the RAXB upgrade is clearly noticeable.
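The sanity check recommended at the end of point 16 (comparing channel element usage with capacity before blaming the hardware) might look like this. The helper name and the 90% margin are assumptions chosen for illustration; the capacities in the usage lines are taken from the example where available and are otherwise hypothetical.

```python
def ce_congestion_plausible(peak_ce_usage: float, ce_capacity: int,
                            margin: float = 0.9) -> bool:
    """True when peak usage comes close enough to capacity for a lack of
    channel elements to be a credible cause of the setup failures."""
    if ce_capacity <= 0:
        raise ValueError("ce_capacity must be positive")
    return peak_ce_usage >= margin * ce_capacity


if __name__ == "__main__":
    # Estimated UL usage peaking at the 64 CE capacity: congestion is credible.
    print(ce_congestion_plausible(peak_ce_usage=64, ce_capacity=64))   # True
    # Usage of a handful of CEs against a hypothetical 128 CE capacity:
    # setup failure counters are more likely caused by something else.
    print(ce_congestion_plausible(peak_ce_usage=6, ce_capacity=128))   # False
```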
Failure After Admission: Other
17. If none of the above reasons for a poor CSSR are apparent, then it is likely to be
a more complicated problem to resolve; often relating to a software/hardware
fault, or perhaps an external source of interference in the area. At the time of
writing, the 3G technology is not as mature as the current 2G system (as would
be expected) and hence there are still numerous improvements being
implemented in every software release, along with the continued development of
new, more efficient and optimised hardware generations for the various 3G
nodes. The example below illustrates one such problem of this type encountered.
Example: FACTS Reports showing poor CSSR Speech with high pmNoFailedAfterAdm
(1st plot); and high pmSetupFailuresSfXX indicating TXB congestion. However, the DL
CE Usage is very low, seldom peaking above 6 channel elements so this doesn’t make
sense. After investigating numerous RBSs showing these “symptoms” it was
established that they all had a single HS-TXB as opposed to the other RBSs which all
had a TXB as well as an HS-TXB. Both configurations are valid and have more than
sufficient downlink channel element capacity. It was also noted that if the RBS is
restarted then the problem disappeared for a few days and then re-appeared; this is
clearly visible in the plots where the restart occurred on 2 January. This turned out to be
a software fault for the single TXB configuration (due to a failure to release some
resources on the TXB). The fix was delivered from software release P4.0.20 (whereas
the release installed on the nodes at the time was P4.0.12).
RETAINABILITY
18. If a cell has poor retainability it is typically due to either missing neighbour
definitions (WCDMA and/or GSM), overshooting cell(s), a misbehaving neighbour
site, a hardware/software fault or a misconfiguration. It is also possible that there
is some external source of interference (such as a microwave link on the same
frequency) affecting the retainability.
19. Retainability should be monitored independently for the different RAB types (e.g.
Speech, CS Video, PS Interactive R99, PS Interactive HSDPA, etc.) as in certain
situations only one of the RAB types will be affected. For example, a cell may be
configured with GSM as the preferred HO type in which case Speech calls will
perform IRAT handovers to GSM rather than performing IFHOs, but CS Video
calls will perform IFHOs. In such a situation, missing inter-frequency neighbour
cell relation definitions will impact the DCR of CS Video calls, but not Speech
calls.
20. However, in the majority of cases the factors that affect the Speech retainability
will also affect the retainability of the other RABs. When a high speech DCR is
detected on a cell the first thing to check is the type of drops occurring as
indicated by the counters pmNoSysRelSpeechSoHo,
pmNoSysRelSpeechNeighbr, pmNoSysRelSpeechUlSynch and
pmNoOfTermSpeechCong; and then to analyse the situation with the following
in mind…
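As a starting point for the analysis in point 20, the four drop counters can be expressed as shares of the total number of speech drops. This is a sketch under assumptions: the total is passed in separately because the source does not name the counter that holds it, the function names are hypothetical, and the shares are not guaranteed to be mutually exclusive (a drop may be reflected in more than one counter).

```python
def speech_drop_shares(total_speech_drops: int, cause_counters: dict) -> dict:
    """Percentage of all speech drops attributed to each cause counter."""
    if total_speech_drops <= 0:
        return {}
    return {name: 100.0 * count / total_speech_drops
            for name, count in cause_counters.items()}


def dominant_cause(shares: dict):
    """Name of the counter with the largest share, or None if no data."""
    return max(shares, key=shares.get) if shares else None


if __name__ == "__main__":
    counters = {
        "pmNoSysRelSpeechSoHo": 20,
        "pmNoSysRelSpeechNeighbr": 120,
        "pmNoSysRelSpeechUlSynch": 40,
        "pmNoOfTermSpeechCong": 0,
    }
    shares = speech_drop_shares(200, counters)
    print(shares)
    print(dominant_cause(shares))  # pmNoSysRelSpeechNeighbr
```

Drops that are not reflected in any of these counters point towards the more difficult cases described under "Other Reasons for Drops".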
Soft Handover Drops
21. Typically a cell that has a high number of dropped calls due to SOHO failures
(pmNoSysRelSpeechSoHo) will also have a high number of drops due to missing
neighbours (pmNoSysRelSpeechNeighbr) indicating that the SOHO failures are
due to missing neighbour relations; however, there are situations where SOHO
failures happen for other reasons. Two common reasons are a neighbouring cell
that is misbehaving (often due to faulty hardware/software) or a misconfiguration
resulting in a failure to perform an inter-RNC SOHO across the Iur interface.
These two situations are illustrated in the following examples…
Example: FACTS Reports showing two cells in the same area (1st plot) with a high
pmNoSysRelSpeechSoHo and a much lower pmNoSysRelSpeechNeighbr indicating
that the soft handover failures are not due to missing neighbours (2nd & 3rd plot). After
further investigation it was discovered that the cells on the neighbouring site U4554
were automatically locked (4th plot) and the Mub interface to the site was down. These
cells were transmitting CPICH yet there were multiple channels (RACH, FACH, etc) that
were disabled preventing the site from carrying any traffic. However, UEs in the
neighbouring cells were measuring the CPICH from these cells and attempting to
perform SOHO to them. Such SOHO attempts were failing leading to the SOHO drops.
As is clearly visible in the FACTS Reports, when the site U4554 came back on air on 22
Jan the SOHO drops on the neighbouring cells disappeared along with a huge reduction
in the DCR experienced by these cells.
Example: Referring to the three sites shown in the figure below (1st plot): U1393 and
U0547 are on CTRNC1 while U3970 is on TBRNC1. The three FACTS Reports below
(2nd, 3rd & 4th plots) show cells from these sites with a high pmNoSysRelSpeechSoHo.
Note that although there are some drops due to missing neighbours
(pmNoSysRelSpeechNeighbr), most of the SOHO drops are for another reason. In
this case the soft handover counters (pmRlAddAttemptBestCellSpeech and
pmRlAddSuccessBestCellSpeech) indicated SOHO success between 3970C1 and
1393B1/547B1; however a GPEH trace of event
INTERNAL_SOFT_HANDOVER_EXECUTION showed that these handovers actually
failed (see snapshot of slide in 5th plot). It was established that a misconfiguration of an
AAL2 routing case between the two RNCs caused all SOHO attempts across the Iur
interface to fail. This was corrected on 9 Jan and from the FACTS Reports the
improvement is obvious.
Missing Neighbour Drops
22. A cell that has a high number of dropped calls due to missing neighbour
relations will have a high pmNoSysRelSpeechNeighbr. A missing neighbour
relation will only cause a dropped call if the RNC receives an Event 1a, 1c or 1d
Measurement Report from the UE requesting the addition of a SC to the AS (or
an HS cell change) for a SC that is not defined as a neighbour relation to any of
the cells in the AS and if the Ec/No reported for that SC is releaseConnOffset
above the Ec/No of the best serving cell in the AS; where the RNC parameter
releaseConnOffset is typically set to 12dB. The reason for this system release
is to prevent excessive UL interference in the network. This type of dropped call
is relatively easy to solve using the General Performance Event Handler (GPEH)
tool in OSS-RC. With this tool all details on Event 1a, 1c or 1d Measurement
Reports containing a SC not in the AS neighbour list may be captured and
analysed using the INTERNAL_SOHO_DS_MISSING_NEIGHBOUR event
(including those Measurement Reports that do not cause a system release of the
call). In this way the missing neighbour or interfering cell may be established and
appropriate action taken e.g. addition of the neighbour relation and/or antenna
tilting, etc. For more information on the GPEH tool refer to the relevant
documentation in the ALEX RNC and OSS-RC libraries. Because missing
neighbour drops are relatively easy to solve, it is recommended to optimise the
neighbour relations and antenna configuration until the percentage of drops due
to missing neighbour relations is less than 10% of the total number of drops in
each RNC.
Example: FACTS Report showing a high DCR Speech on cell 1379C1 with the majority
of dropped calls due to missing neighbours as shown by the counter
pmNoSysRelSpeechNeighbr (1st plot). A GPEH trace with event
INTERNAL_SOHO_DS_MISSING_NEIGHBOUR was executed on 4 Jan where it was
found that SC 24 and SC 88 were the major cause of these missing neighbour drops
(2nd plot). The cells in the area with these scrambling codes were found to be 10C1 and
416B1 (3rd plot). With the addition of these two neighbours to 1379C1 on 4 Jan the
improvement in the DCR Speech from around 5% to around 2% is clearly visible in the
FACTS Report.
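The release condition and the optimisation target from point 22 can be sketched as two small checks. The function names are hypothetical, Ec/No values are in dB, and the use of "greater than or equal" at exactly releaseConnOffset is an assumption, since the text only says "above".

```python
def missing_neighbour_release(ecno_unknown_sc_db: float,
                              ecno_best_as_cell_db: float,
                              release_conn_offset_db: float = 12.0) -> bool:
    """True when a reported SC that is not a defined neighbour is strong
    enough, relative to the best cell in the AS, to trigger a release."""
    return ecno_unknown_sc_db >= ecno_best_as_cell_db + release_conn_offset_db


def missing_neighbour_target_met(neighbour_drops: int, total_drops: int,
                                 target_percent: float = 10.0) -> bool:
    """True when missing-neighbour drops are below the recommended share
    of all drops (10% per RNC in point 22)."""
    if total_drops <= 0:
        return True
    return 100.0 * neighbour_drops / total_drops < target_percent


if __name__ == "__main__":
    # Unknown SC at -4 dB while the best AS cell is at -17 dB: 13 dB stronger.
    print(missing_neighbour_release(-4.0, -17.0))  # True
    # Unknown SC only 9 dB stronger: reported, but no release.
    print(missing_neighbour_release(-8.0, -17.0))  # False
    print(missing_neighbour_target_met(neighbour_drops=30, total_drops=200))  # False
```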
UL Synchronisation Drops
23. A connection is considered lost by the Radio Connection Supervision (RCS)
function in the SRNC when the last radio link set (RLS) for the connection has
been out-of-sync for a time given by the parameter dchRcLostT (or
hsDschRcLostT for an HSDPA connection). When this occurs for a connection
containing a speech RAB (or a multi-RAB containing speech) the counter
pmNoSysRelSpeechUlSynch is incremented. A cell that has a high DCR with a
relatively high number of dropped calls due to loss of UL synchronisation may
have various contributing factors. These may include missing IRAT neighbour
relation definitions resulting in the connection “hanging on” to the 3G network
until the call is dropped when it would be better served handing the call over to
the 2G network. This may be especially true for cells on the border of the 3G
coverage area where the 3G signal may reach areas further away from the site
than expected. By identifying such areas any missing 2G neighbour relations
may be added; or perhaps a misconfiguration discovered, such as having an
IRAT neighbour relation defined in an RNC towards a 2G cell that is not defined
as an outer cell in the 3G MSC Server (see example below). Bear in mind that a
maximum of 10 to 12 GSM neighbours is recommended for each 3G cell in order
to avoid neighbour list truncation issues. Another means to improve the situation
may be to raise the thresholds used to trigger IRAT HOs so that handover starts
earlier (refer to the settings for parameters usedFreqThresh2dEcno,
utranRelThresh3aEcno, usedFreqThresh2dRscp and utranRelThresh3aRscp). For
example, triggering compressed mode at Ec/No=-11dB instead of -12dB may prevent
drops as calls are handed over earlier to the GSM network that typically has better coverage
than the 3G network. If the power of a RL is very low then that RL will be more
sensitive to sudden interference changes, so a cell that has a high number of UL
synchronisation drops may cover a radio environment with a relatively high
number of UEs experiencing sudden interference changes (generally caused by
bridges, buildings, tunnels, steep hilly terrain, etc). In order to prevent the Power
Control function from decreasing the power too much due to temporary good
radio conditions, a minimum DL code power may be configured using the
UtranCell parameter minPwrRl. Note that this parameter has an effect on the DL
power capacity of the cell so should only be used if really necessary. An external
interference source may also be the cause of UL sync drops; refer to the section
‘Other Reasons for Drops’ for an example of this. It is recommended to resolve
any SOHO and missing neighbour drop problems before attempting to resolve
UL sync drops, as resolving the cause of such drops will often resolve UL
synchronisation drops too.
Example: FACTS Report showing a high DCR Speech on cell 4412A1 with the majority
of dropped calls due to loss of UL synchronisation as shown by the counter
pmNoSysRelSpeechUlSynch (1st plot). It was discovered that the IRAT HOs towards
the 2G network stopped functioning as shown in the 2nd plot below. After further
investigation it was discovered that the definition of the 3G external cell on the 2G
BSC/MSC had somehow been deleted leading to no 3G->2G handovers. After the 3G
external cell definition was re-created in the 2G network the IRAT HOs were restored
and the corresponding improvement to the DCR is clearly visible in the FACTS Report.
Example: FACTS Report showing the reduction in the number of UL synch drops and
the corresponding improvement to the DCR Speech of cell 2440A1 when
usedFreqThresh2dRscp was changed from -106dBm to -104dBm, minPwrRl was
changed from -150 (-15dB) to -140 (-14dB) and two additional GSM neighbour
relations were added. This cell was on the border of the 3G coverage area.
Release due to Congestion Resolve Action
24. When Congestion Control detects downlink cell congestion, besides limiting
admission to the congested cell congestion resolve actions are also initiated in
the cell. Initially non-guaranteed services are targeted which typically results in
R99 packet interactive connections being channel switched to use the
RACH/FACH common channels with the counter pmNoOfSwDownNgCong being
incremented. If downlink cell congestion persists then RABs belonging to the
other service classes (guaranteed and guaranteed-hs) may also be released with
the following counters being incremented accordingly:
pmNoOfTermSpeechCong, pmNoOfTermCsCong and pmNoOfTermHsCong.
(Note that at the time of writing not all of these counters were available in
FACTS). When such system releases are detected then the solution is to solve
the congestion problems as described in the Accessibility section of this
document. Normally the congestion should be detected and resolved through
Accessibility monitoring before the DCR is significantly affected due to
congestion resolve actions. Refer to the ‘Capacity Management’ document in the
WCDMA RAN ALEX library for more information on the Congestion Control
functionality.
Example: At the time of writing only the following counters related to Congestion
Resolve Actions were available in FACTS. Note that in this example only the counter
pmNoOfSwDownNgAdm has been incrementing but this counter actually shows the
number of channel rate downswitches (PS64/384->PS64/128 or PS64/128->PS64/64)
triggered by Admission Control as opposed to the counter pmNoOfSwDownNgCong
that shows the number of channel type downswitches (PS64/XX->RACH/FACH)
triggered by Congestion Control.
Other Reasons for Drops
25. If none of the above reasons for a poor DCR may be established, then it is likely
to be a more complicated problem to resolve; often relating to a
software/hardware fault, or perhaps an external source of interference in the
area. At the time of writing, the 3G technology is not as mature as the current 2G
system (as would be expected) and hence there are still numerous
improvements being implemented in every software release, along with the
continued development of new, more efficient and optimised hardware
generations for the various 3G nodes. The examples below illustrate such
problems with two difficult scenarios encountered.
Example: FACTS Reports showing all three sectors of site U1619 suddenly
experiencing a high DCR of around 80%. The counters give no indication as to the
reason for the drop i.e. they are not SOHO, or missing neighbour or UL synch drops.
After restarting the RBS, the DCR returns to “normal” on all three sectors; as can be
seen in the plots below where the RBS was restarted at around 18:00 on 29 Aug. This
type of situation appeared randomly on sites throughout the network and turned out to
be a software fault in the software release implemented on the RBSs at the time.
Example: FACTS Reports showing poor DCR performance with almost all drops being
due to loss of UL synchronisation (1st plot). From the counters it is evident that the RRC
connection setup success rate is lower than it should be (pmTotNoRrcConnecReq
versus pmTotNoRrcConnecReqSuccess in 2nd plot); however, from drive tests it was
only possible to establish a call if the UE was less than 100m from the site and with line
of sight to the antenna. After checking all configurations and equipment no cause for the
problem could be established. It was noted from the drive tests that when a call was
established the UE TX power was constantly at maximum (or very close to it).
Also, from the RBS counter pmAverageRSSI (not available in FACTS at the time of
writing) it was established that the uplink RSSI was constantly greater than -80dBm.
These two facts indicate the presence of high UL interference. A spectrum analyser was
used to establish that there was another signal in the area being transmitted on the
same frequency and, as it turned out, Transtel had a microwave link using this
frequency that went straight through the site (3rd plot).
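The two symptoms used in this example (uplink RSSI constantly above -80dBm and UE TX power constantly at or near maximum) can be combined into a simple screening check. The following is a minimal Python sketch only: the function name, the 24dBm maximum UE power and the 1dB margin are illustrative assumptions and are not taken from the RBS documentation; only the -80dBm limit comes from the example above.

```python
def suspect_ul_interference(rssi_dbm, ue_tx_dbm, rssi_limit_dbm=-80.0,
                            ue_max_dbm=24.0, margin_db=1.0):
    """Return True when both symptoms from the example are constantly present.

    rssi_dbm  : uplink RSSI samples per reporting period (dBm)
    ue_tx_dbm : UE TX power samples from a drive test (dBm)
    ue_max_dbm and margin_db are assumed values for illustration only.
    """
    if not rssi_dbm or not ue_tx_dbm:
        return False  # no data, no conclusion
    rssi_high = all(r > rssi_limit_dbm for r in rssi_dbm)
    tx_at_max = all(p >= ue_max_dbm - margin_db for p in ue_tx_dbm)
    return rssi_high and tx_at_max
```

A positive result is only a prompt to investigate further, for instance with a spectrum analyser as was done here.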
INTEGRITY
26. Integrity is defined as the ability of a user to receive the requested service at the
desired quality. Typically the service quality is measured in terms of the transport
block error rate (BLER) for the RAB. The BLER measurements obtainable are
fairly limited but may be used as a benchmark of the service quality in the
network. The general approach to improving Integrity is through network
optimisation as described in the Retainability section of this document;
improvements in Retainability KPIs should lead to improvements in Integrity KPIs
too.
UL Block Error Rate
27. From RNC counters, it is possible to obtain the UL BLER, after macrodiversity
combining, per UeRc in each RNC. This effectively allows the BLER on each
RNC to be monitored for each RB type, providing an indication of the service
quality for each RB type. Refer to the ‘System Performance Statistics’ document
in the ALEX WCDMA RAN library for a mapping of UeRc number to RB.
However, at the time of writing these statistics were not available in FACTS.
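Until such statistics are available in FACTS, the BLER per UeRc may be derived by hand as the ratio of faulty transport blocks to total transport blocks received after macrodiversity combining. The Python sketch below assumes the two block counts have already been extracted per UeRc; the function names and input layout are hypothetical and the actual counter names must be taken from the Performance Statistics document.

```python
def ul_bler(faulty_blocks, total_blocks):
    """UL BLER as a fraction; None when no blocks were received."""
    if total_blocks == 0:
        return None
    return faulty_blocks / total_blocks


def bler_per_uerc(counts):
    """counts: dict of UeRc number -> (faulty_blocks, total_blocks)."""
    return {uerc: ul_bler(faulty, total)
            for uerc, (faulty, total) in counts.items()}
```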
Drive Test Based Service Quality Measurements
28. From TEMS Investigation 7.1 the ability to measure the speech and video
streaming service quality is introduced through two new KPIs: the WCDMA
Speech Quality Index (SQI) and the Video Streaming Quality Index (VSQI).
These two KPIs may be used to benchmark the service quality for speech and
video streaming from drive test data. In addition to the above the DL BLER may
also be obtained for
R99 & HSDPA PS Interactive Throughput (RNC Level)
29. The throughput obtained on the packet interactive RABs (HSDPA and R99) is
a good indication of the Integrity offered by these services. Counters are
available at an RNC level to obtain the PS Interactive Average Throughput for
R99 (DCH/FACH) and HSDPA, as well as the retransmission rate for these two
services.
Example: FACTS Report showing the weekly throughput and retransmission rate for
HSDPA and R99 (DCH/FACH) PS Interactive RABs on RBRNC1.
HSDPA (MAC-HS) Throughput (Cell Level)
30. The HSDPA service is marketed as a broadband mobile data service with
superior performance to standard 3G (or R99) mobile packet data. Of course this
means that it is important to ensure that an HSDPA user actually experiences
superior performance to an R99 3G user and while the HS Packet Interactive
RAB is capable of much higher throughputs than the R99 Packet Interactive
RAB, one has to bear in mind that HSDPA uses a shared channel (HS-DSCH)
whereas in R99 the channel is dedicated (DCH) to the user. The total MAC-HS
cell throughput before re-transmissions (HSCELLDATARATE) and after
re-transmissions (HSCELLTHROUGHPUT), as well as the average MAC-HS
throughput per user before re-transmissions (HSUSERDATARATE) and after
re-transmissions (HSUSERTHROUGHPUT) are available from RBS counters and
implemented in FACTS. Note that this reflects the throughput over the radio
interface and generally does not reflect problems in the transport network (See
“Considerations for HSDPA: Iub Bandwidth”). As would be expected, there are
relationships between the CQI value reported by the UE, the retransmission rate,
and the cell throughput. Low CQI values reflect a poor radio environment and
hence a lower throughput with higher retransmission rate. Bear in mind that this
may be due to a single user that is in a poor coverage area, whereas the cell
does provide good throughputs in other areas. In such a situation some antenna
adjustments may improve performance; otherwise an additional site may be
required to fill the coverage gap. Note too that for reliable throughput statistics a
reasonable data volume is required.
Example: FACTS Reports showing good MAC-HS Throughputs well above 1Mbps for
the cell although dropping below 500kbps when the data volume drops below 500KByte
per 15 minute interval (1st plot); good average reported CQI values mostly above 20
with the average used CQI value noticeably higher showing that the MAC-HS scheduler
is scheduling the users in the better instantaneous radio environment (2nd plot);
retransmission ratio and NACK ratio mostly below 0.1 (or 10%) correlating to the good
radio environment reflected by the CQI values (3rd plot); a noticeable split between the
cell throughput and user throughput as the average number of UEs with data buffered in
the RBS increases (4th plot).
Example: FACTS Reports showing poor MAC-HS Throughputs below 300kbps (1st
plot); corresponding poor average reported CQI values of around 8 with the average
used CQI below the average reported CQI possibly due to the poor ACK/NACK
detection success rate and the fact that there is only one user in the cell (2nd plot); very
poor retransmission ratio (above 0.5) and NACK ratio (above 0.4) reflecting the poor
radio environment reported by the CQI values (3rd plot).
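The indicators in the two examples above can be combined into a rough first-pass classification of a cell's HSDPA radio environment. This is a Python sketch under stated assumptions: the thresholds (CQI 20 and 8, retransmission ratio 0.1 and 0.5, 500KByte per 15 minute interval) are simply the values seen in these two examples and are not vendor-defined limits; the function name and labels are hypothetical.

```python
def assess_hs_cell(avg_reported_cqi, retx_ratio, volume_kbyte,
                   min_volume_kbyte=500.0):
    """Classify one 15 minute reporting period for an HSDPA cell.

    Thresholds are illustrative, taken from the two FACTS examples only.
    """
    if volume_kbyte < min_volume_kbyte:
        # Too little data for the throughput statistics to be reliable.
        return "insufficient data"
    if avg_reported_cqi >= 20 and retx_ratio <= 0.1:
        return "good"
    if avg_reported_cqi <= 8 or retx_ratio >= 0.5:
        return "poor"
    return "intermediate"
```

The data volume check is applied first because, as noted above, throughput statistics are only reliable with a reasonable data volume.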
OTHER
Considerations for HSDPA: DL Channelisation Codes
31. Typically when HSDPA is launched in a network it is configured to share the
same carrier as the R99 traffic. In software release P4 this normally means that
five HS-PDSCHs at spreading factor 16 are reserved for the HS-DSCH channel
and an HS-SCCH at spreading factor 128 is required too; i.e. 32% of the
available downlink channelisation codes are reserved for HSDPA. When R99
and HSDPA share a carrier in this configuration it is typical for downlink
channelisation code congestion to be the limiting factor of the cell's capacity.
The P5 software release
introduces a feature called HSDPA Dynamic Code Allocation making it possible
for the HSDPA Scheduler to only use the channelisation codes available after
R99 usage. This may help with downlink channelisation code congestion, but at
the expense of reduced throughput for HSDPA users. It is still possible to reserve
a limited set of codes for HSDPA. For more about downlink channelisation code
congestion refer to section ‘Admission Control: Downlink Channelisation Codes’.
To solve downlink channelisation code congestion in the long term the
introduction of a second carrier frequency on the site is required. Typically the
original carrier is configured to carry the R99 traffic with all HSDPA traffic being
directed towards the newly introduced second carrier.
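The 32% figure quoted above follows from the structure of the channelisation code tree, where a code at spreading factor N occupies 1/N of the tree. The Python sketch below reproduces the arithmetic for the P4 configuration; the function and variable names are illustrative only.

```python
from fractions import Fraction


def code_share(allocations):
    """Fraction of the DL code tree used.

    allocations: iterable of (spreading_factor, number_of_codes) pairs.
    A single code at spreading factor N occupies 1/N of the tree.
    """
    return sum(Fraction(count, sf) for sf, count in allocations)


# P4 configuration from the text: five HS-PDSCHs at SF 16, one HS-SCCH at SF 128.
p4_hsdpa = code_share([(16, 5), (128, 1)])
print(f"{float(p4_hsdpa):.1%}")  # 32.0%
```

The same function may be used to estimate the share taken by larger HS-PDSCH allocations, bearing in mind that common channels also consume part of the tree.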
Considerations for HSDPA: Iub Bandwidth
32. With the high throughputs that are achievable with HSDPA, the capacity of the
Iub interface between RBS and RNC becomes an important factor affecting
HSDPA performance and customer perception. The throughput on the radio
interface may be monitored as described in the Integrity section ‘HSDPA (MAC-HS)
Throughput’; however, even if the radio interface provides good throughputs
the application layer may still experience sub-standard throughputs if the Iub
interface is congested. Typically the R99 traffic is configured to use AAL2 QoS
class A and B, whereas the HSDPA traffic uses AAL2 QoS class C; i.e. across
the Iub interface the R99 traffic has priority over the HSDPA traffic. As the R99
traffic on the site increases, so the Iub bandwidth available for HSDPA
decreases. The counter pmCapAllocIubHsLimitingRatio (in 1/10th percent)
provides an indication of the percentage of time for which the Iub interface limits
the HSDPA throughput. When high pmCapAllocIubHsLimitingRatio is
experienced in conjunction with lost HSDPA data frames
(HSDATAFRAMESLOSTRATIO formula) then the Iub capacity is causing serious
performance degradation. To alleviate the problem some traffic may be
offloaded to GSM (see ‘Offloading Traffic’ sections), but ultimately additional Iub
bandwidth is required unless an additional site, perhaps an inbuilding site at a
shopping centre, may be commissioned to carry part of the load.
Example: FACTS Reports showing the improvement to pmCapAllocIubHsLimitingRatio
and the disappearance of lost HS data frames when the Iub interface was upgraded
from one E1 to two E1s. In the second figure, notice that there is no obvious
improvement to the HSDPA Throughput as these counters measure the throughput
across the radio interface (at the MAC-HS layer) whereas the Iub was congested.
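The rule described in this section (a high Iub limiting ratio together with lost HS data frames indicates serious degradation) can be sketched in Python as follows. The counter is reported in 1/10th percent, so it is divided by 10 to obtain a percentage. The 10% limiting threshold and the function name are assumptions for illustration; a suitable threshold should be chosen from the network's own statistics.

```python
def iub_limits_hsdpa(limiting_ratio_tenths, frames_lost_ratio,
                     limiting_threshold_pct=10.0):
    """Return (limiting_pct, serious).

    limiting_ratio_tenths : Iub HS limiting ratio counter value (1/10th percent)
    frames_lost_ratio     : lost HS data frame ratio (0.0 means no loss)
    limiting_threshold_pct: assumed threshold, not a vendor-defined value
    """
    limiting_pct = limiting_ratio_tenths / 10.0
    serious = limiting_pct > limiting_threshold_pct and frames_lost_ratio > 0.0
    return limiting_pct, serious
```

For example, a counter value of 250 with 2% lost frames gives (25.0, True), whereas the same counter value with no lost frames gives (25.0, False).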
Considerations for HSDPA: Throughputs Above 3.6Mbps (Cat 7-10 UEs)
33. To support HSDPA throughputs above 3.6Mbps (which is possible from the P5
software release through support of more than five HSDPA codes per cell) it is
likely that many cells will require the introduction of a second carrier frequency
in order to prevent downlink channelisation code congestion. Also, the Iub
bandwidth will need to be upgraded to make such throughputs possible and the
existing DL channel element capacity (TXB/HS-TXB combinations) will need to
be reviewed in order to cater for more than five HSDPA codes per cell.
Considerations for HSDPA: Channel Element Usage
34. Another consideration with the introduction of HSDPA is the channel element
capacity required in the RBS to support the HS downlink channels and the
A-DCH in the uplink. Because HSDPA is broadband in the downlink only the
tendency is to provide sufficient channel element capacity in the downlink without
properly considering the uplink channel element requirements for the A-DCH.
This is especially evident when the optional UL384 RAB (SF 4) is used as this
RAB requires 16 channel elements; as opposed to the standard UL64 RAB (SF
16) that requires 4 channel elements. The number of SF 4 radio links allowed in
a cell may be limited to reduce the channel element usage of this RAB; however,
values that are too low lead to poor packet interactive SOHO performance: a UL384
user who is in a SOHO area, with one cell that already has its maximum
allocation of SF 4 radio links, will continuously send Event 1a messages
requesting the addition of this cell to the active set, but this will continuously fail
resulting in poor SOHO statistics.
Example: FACTS Reports showing how the SF 4 UL setup failures
(ULSETUPFAILURESSF4) increased rapidly after the introduction of HSDPA, and then
disappeared when the UL 384 RAB (SF 4) was deactivated throughout most of the
network (by changing parameter sf4AdmUl from 2 to 0). The improvement was a
combination of the SOHO problem described above and the fact that many RBSs did
not have sufficient channel element capacity. In the second plot notice how the Packet
SOHO Success Rate was declining and then improved after the deactivation of the
UL384 RAB in 2006 Week 37.
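The impact of the UL384 RAB on uplink channel element capacity can be estimated with simple arithmetic using the per-RAB figures quoted above (16 channel elements for UL384, 4 for UL64). The Python sketch below is illustrative only: the function names and the installed capacity passed in are hypothetical, and other uplink users of channel elements are ignored.

```python
# Channel elements per uplink RAB, as quoted in the text.
CE_PER_UL_RAB = {"UL64": 4, "UL384": 16}


def ul_ce_demand(users_per_rab):
    """users_per_rab: dict of RAB name -> number of simultaneous users."""
    return sum(CE_PER_UL_RAB[rab] * n for rab, n in users_per_rab.items())


def fits(users_per_rab, installed_ul_ce):
    """True if the demand fits within the installed UL channel elements."""
    return ul_ce_demand(users_per_rab) <= installed_ul_ce
```

For instance, ten UL64 users need 40 channel elements, whereas six UL64 users plus four UL384 users need 88, which shows how quickly a few UL384 users can exhaust the uplink capacity of an RBS.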
Offloading Traffic: Directed Retry to GSM
35. When a UTRAN cell is congested, for whatever reason, one possibility to
provide temporary (and often only partial) relief is to enable Directed Retry to
GSM on that cell. In the P4 software release this is the best option; however for
the P5 software release refer to the section “Offloading Traffic: Service Based
Handover” below. Only speech calls may be re-directed to the GSM network and
only one directed retry target GSM cell may be specified (typically the co-sited
cell). It is a blind redirection so it is important that the target GSM cell has the
same (or better) coverage area as the UTRAN cell. It is also very important to
ensure that the GSM cell has the capacity to handle the additional speech traffic.
The redirection is triggered by a transmitted DL power threshold which is not
ideal for situations where DL power is not the cause of congestion; however a
suitable DL power threshold may be established for other forms of cell
congestion.
Example: Two slides showing an example calculation to estimate the power threshold
to use i.e. the setting for parameter loadSharingGsmThreshold.
Example: FACTS Reports showing how the number of directed retries to GSM
increased as the threshold specified by loadSharingGsmThreshold was changed to
15%, 14%, 13% and finally kept on 12% (1st plot); the corresponding improvement to
CSSR Speech and the decline in pmNoReqDeniedAdm (Admission Control was
rejecting due to a lack of DL channelisation codes) (2nd plot); the corresponding
improvement to CSSR HS (3rd plot).
Offloading Traffic: Service Based Handover (P5+)
36. With the introduction of the P5 software release comes an optional feature called
Service Based Handover. Similar to the Directed Retry to GSM feature, Service
Based Handover allows Speech calls to be redirected to GSM, but unlike
Directed Retry it is not a blind handover at Call Setup with only one GSM target
cell per UTRAN cell. Instead, the speech call is established in the 3G cell and the
RNC then instructs the UE to enter compressed mode and to start measuring the
GSM neighbours defined. An IRAT HO to the best GSM cell candidate that fulfils
the GSM threshold criteria (gsmThresh3a) is then performed (unless a timer
times out). This feature can also be activated per subscriber via a service
indicator.
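The target selection step of Service Based Handover can be pictured as choosing the strongest measured GSM neighbour that fulfils the threshold. The Python sketch below is a simplification under stated assumptions: measurements and the gsmThresh3a value are taken to be GSM carrier RSSI in dBm, the comparison is assumed to be strictly greater than, and the timer and compressed mode handling in the RNC are not modelled. The function name and cell identifiers are hypothetical.

```python
def pick_gsm_target(measurements, gsm_thresh_3a_dbm):
    """Return the best GSM candidate above the threshold, or None.

    measurements: dict of GSM cell id -> measured RSSI (dBm, assumed unit)
    """
    eligible = {cell: rssi for cell, rssi in measurements.items()
                if rssi > gsm_thresh_3a_dbm}
    if not eligible:
        return None  # no candidate; the call remains on the 3G cell
    return max(eligible, key=eligible.get)
```

Unlike Directed Retry, the result depends on what the UE actually measures, so the target need not be the co-sited GSM cell.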
DEFINITIONS
A-DCH       Associated DCH
AAL2        ATM Adaptation Layer type 2
ASE         Air Speech Equivalent
ATM         Asynchronous Transfer Mode
BLER        Block Error Rate
CAC         Call Admission Control
CE          Channel Element
CN          Core Network
CSSR        Call Setup Success Rate
DCH         Dedicated Channel (Transport Channel)
DCR         Drop Call Rate
DPCH        Dedicated Physical Channel (Physical Channel)
DL          Downlink
GPEH        General Performance Event Handler
HS-DSCH     High-Speed Downlink Shared Channel (Transport Channel)
HS-PDSCH    High-Speed Physical Downlink Shared Channel (Physical Channel)
HS-SCCH     High-Speed Shared Control Channel (Physical Channel)
HSDPA       High-Speed Downlink Packet Access
IFHO        Inter-Frequency Handover
IRAT        Inter-Radio Access Technology
Iu          Interface between an RNC and a CN
Iub         Interface between an RNC and an RBS
Iur         Interface between two RNCs
MCPA        Multi-Channel Power Amplifier (board in RBS 3202)
MGW         Media Gateway
Mub         Management (O&M) Interface towards RBS from OSS-RC
QoS         Quality of Service
RAB         Radio Access Bearer
RANAG       Radio Access Network Aggregator
RAXB        Random Access and Receiver Board
RBS         Radio Base Station
RL          Radio Link
RLS         Radio Link Set
RNC         Radio Network Controller
RRC         Radio Resource Control
RU          Radio Unit (board in RBS 3206)
RXI         Product instance of RANAG
SC          Scrambling Code
SF          Spreading Factor
SOHO        Soft(er) Handover
SRNC        Serving-RNC
TXB         Transmitter Board
UeRc        UE Radio Connection Configuration
UL          Uplink