
Ericsson SSR 8000 R15 System Troubleshooting - LZU1082262

Ericsson SSR 8000 R15 System
Troubleshooting
STUDENT BOOK
LZT1381712 R1A
DISCLAIMER
This book is a training document and contains simplifications.
Therefore, it must not be considered as a specification of the
system.
The contents of this document are subject to revision without
notice due to ongoing progress in methodology, design and
manufacturing.
Ericsson shall have no liability for any error or damage of any kind
resulting from the use of this document.
This document is not intended to replace the technical
documentation that was shipped with your system. Always refer to
that technical documentation during operation and maintenance.
© Ericsson AB 2015
This document was produced by Ericsson.
• The book is to be used for training purposes only and it is
strictly prohibited to copy, reproduce, disclose or distribute it in
any manner without the express written consent from Ericsson.
This Student Book, LZT1381712, R1A supports course number
LZU1082262.
Table of Contents
1 CLI TOOLS FOR TROUBLESHOOTING ........................................ 11
1 REVIEW FUNDAMENTAL CONCEPTS.......................................... 12
1.1 CONTEXT, INTERFACES, & BINDINGS ARCHITECTURE......... 13
1.2 TERMINOLOGY...........................................................................14
2 COMMAND LINE INTERFACE (CLI) STRUCTURE........................ 18
2.1 INTRODUCTION..........................................................................18
2.2 MANEUVERING THROUGH THE CLI ......................................... 20
3 MONITORING WITH CLI ................................................................23
3.1 CLI INTRODUCTION AND THE PROMPT STRUCTURE ............ 24
3.2 CONTEXT MONITORING ............................................................25
3.3 CLI HELP .....................................................................................25
3.4 CLI FOR THE FAST PEOPLE......................................................26
4 LAB ENVIRONMENT ......................................................................27
4.1 CONNECTING TO ERICSSON TRAINING LABS ........................ 28
5 CONFIGURE MANAGEMENT INTERFACE ................................... 29
5.1 REFERENCE FOR THIS MODULE ............................................. 29
5.2 CONFIGURE MANAGEMENT INTERFACE ................................ 30
5.3 VALIDATING THE CONFIGURATION ......................................... 33
5.4 BINDING INFORMATION ............................................................34
5.5 EXERCISE 1: MANAGEMENT CONFIGURATION ...................... 35
6 TROUBLESHOOTING PREPARATION COMMANDS & TOOLS.... 35
6.1 TROUBLESHOOTING PREPARATION ....................................... 35
6.2 REMOTE TERMINAL SESSION TIMEOUT ................................. 36
6.3 WHO IS LOGGED INTO THE SSR ..............................................36
6.4 COMMAND HISTORY .................................................................37
6.5 TROUBLESHOOTING BY SEARCHING ..................................... 38
6.6 COMMAND LINE INTERFACE & EMACS ................................... 39
6.7 COMMAND LINE INTERFACE & EMACS ................................... 40
6.8 GREP: GLOBAL REGULAR EXPRESSION PARSER ................. 40
6.9 EXTENDED GREP ......................................................................41
6.10 OTHER SEARCHING TOOLS ...................................................41
6.11 REGULAR EXPRESSIONS .......................................................42
6.12 REGULAR EXPRESSION: EXAMPLES WITH GREP ................ 44
7 ALIASES AND MACROS ................................................................45
7.1 INTRODUCTION TO ALIAS .........................................................45
7.2 INTRODUCTION TO MACRO......................................................46
7.3 VARIABLES IN MACROS ............................................................46
8 EXERCISE 2: INTRODUCTION, SEARCHING AND FILTERING ... 47
8.1 EXERCISE 2: SEARCHING AND FILTERING ............................. 47
8.2 EXERCISE 2, REVIEW (1-4)........................................................47
8.3 EXERCISE 2, REVIEW (2-4)........................................................48
8.4 EXERCISE 2, REVIEW (3-4)........................................................48
8.5 EXERCISE 2, REVIEW (4-4) (OPTIONAL) .................................. 49
9 CHAPTER SUMMARY ....................................................................50
2 OPERATIONAL HEALTH OF THE SSR SYSTEM .......................... 51
1 TROUBLESHOOTING PROCEDURE............................................. 52
1.1 SYSTEM HARDWARE HEALTH..................................................53
1.2 OVERVIEW: HARDWARE STATUS ............................................ 54
1.3 RETRIEVING HARDWARE DETAILS LINE CARDS .................... 56
1.4 RPSW HARDWARE INFORMATION ........................................... 57
1.5 ALSW HARDWARE INFORMATION ........................................... 58
1.6 FINDING HARDWARE ALARMS (1-2)......................................... 59
1.7 FINDING HARDWARE ALARMS (2-2)......................................... 59
1.8 SYSTEM HARDWARE CHECKS .................................................60
1.9 SYSTEM ALARMS .......................................................................61
1.10 SYSTEM ALARM WITH OPTIONS ............................................ 61
1.11 EXAMPLE: INITIATING MAJOR SYSTEM ALARM .................... 62
1.12 EXAMPLE: INITIATING CRITICAL SYSTEM ALARM ................ 63
1.13 SYSTEM HARDWARE LED .......................................................64
1.14 CARD POWERED DOWN .........................................................65
1.15 SYSTEM STORAGE VERIFICATION ........................................ 66
1.16 SYSTEM STORAGE ..................................................................67
1.17 SYSTEM STORAGE VERIFICATION ........................................ 68
1.18 SYSTEM STORAGE VERIFICATION: EXAMPLE ...................... 68
2 CHAPTER SUMMARY ....................................................................70
3 FUNDAMENTAL CONCEPT OF PROCESSES ARCHITECTURE ON THE SYSTEM........................................... 71
1 PROCESS ARCHITECTURE ..........................................................72
1.1 RPSW PROCESSES ...................................................................73
1.2 PROCESS SCHEDULING ...........................................................76
1.3 RPSW PROCESSES VERIFICATION ......................................... 77
1.4 FINDING CPU INTENSIVE PROCESSES ................................... 78
1.5 SINGLE PROCESS VERIFICATION ............................................ 79
1.6 SINGLE PROCESS IN DETAIL....................................................80
1.7 SINGLE PROCESS VERIFICATION - ISM .................................. 81
1.8 SINGLE PROCESS VERIFICATION - OSPF ............................... 82
1.9 MAXIMUM CRASHES ALLOWED ............................................... 83
1.10 PROCESS CRASH ....................................................................84
1.11 SOFTWARE PROCESS FAILURE SCENARIO ......................... 85
1.12 SYSTEM STOPPED PROCESSES ........................................... 86
1.13 OLD CORE FILE ON RP............................................................88
1.14 CORE FILES – COPIED BETWEEN RP .................................... 88
1.15 CORE DUMP FILES ON STANDBY RP..................................... 89
2 EXERCISE 3: INTRODUCTION ......................................................90
2.1 EXERCISE 3: SYSTEM PROCESSES ........................................ 90
2.1.1 EXERCISE 3: REVIEW .............................................................90
3 CHAPTER SUMMARY ....................................................................92
4 UNDERSTAND THE SSR SYSTEM REDUNDANCY ISSUES ........ 93
1 RP REDUNDANCY .........................................................................94
1.1 RP REDUNDANCY DETAILS ......................................................95
1.2 INVESTIGATING REDUNDANCY ISSUES .................................. 96
1.3 SHOW SYSTEM REDUNDANCY ................................................ 96
2 ANALYZING PROBLEMS OF STANDBY RP.................................. 98
2.1 ACTIVE OR STANDBY RP.............................................................99
2.2 CONNECTING TO STANDBY RP WITHOUT CONSOLE ............ 99
2.3 SEARCHING FOR RESTART REASON .................................... 100
2.4 REPEATING COMMANDS ON STANDBY RP .......................... 100
2.5 VERIFY PROCESSES ON STANDBY RP ................................. 101
2.6 COPY FILES FROM STANDBY RP ........................................... 101
3 RP FAILOVER MANAGEMENT .................................................... 103
3.1 MANAGING RELOADS AND RP SWITCH-OVER ..................... 103
3.2 MANUAL RP SWITCHOVER ..................................................... 104
4 CHAPTER SUMMARY .................................................................. 106
5 ISSUES RELATED WITH BOOT PROBLEM ................................ 107
1 BOOT PROBLEMS ....................................................................... 108
1.1 ENTERING BOOT ROM INTERFACE ....................................... 108
1.2 EXAMPLE: ENTERING BOOT ROM INTERFACE..................... 109
1.3 DIAGNOSTICS COMMAND ....................................................... 109
1.4 RUNNING DIAGNOSTICS ......................................................... 110
2 TROUBLESHOOTING SCENARIOS ............................................ 111
2.1 TROUBLESHOOTING SCENARIOS ......................................... 112
2.2 SYSTEM UPTIME ...................................................................... 112
2.3 SYSTEM STORAGE VERIFICATION ........................................ 113
2.4 EXERCISE 4: INVESTIGATE BOOT PROBLEMS ..................... 113
3 CHAPTER SUMMARY .................................................................. 114
6 ACTIVE AND HISTORY LOGS IN SSR......................................... 115
1 SYSTEM LOGGING INTRODUCTION.......................................... 116
1.1 LOGGD PROCESS.................................................................... 117
1.2 SYSTEM LOG COMMANDS ...................................................... 118
1.3 EVENT SEVERITY LEVELS IN LOG MESSAGES .................... 119
1.4 LOGS FROM CARDS ................................................................ 119
1.5 SHOW LOG AND TIME ............................................................. 120
1.6 LOG FILES ................................................................................ 121
1.6.1 CUSTOM LOG FILES AND FILTERS ..................................... 123
1.6.2 LOG FILES LOCATION .......................................................... 124
1.6.3 DISPLAY LOG FILES.............................................................. 124
1.7 FILTER BASED ON FACILITY ................................................... 125
1.7.1 FILTER BASED ON FACILITY EXAMPLE .............................. 125
1.8 PM PROCESS LOGS ................................................................ 126
1.9 CSM PROCESS LOGS .............................................................. 127
1.10 ISM PROCESS ........................................................................ 127
1.11 FILTER BASED ON FACILITY ON CARD................................ 128
1.12 LOGGER VERIFICATION ........................................................ 129
1.13 SHOW LOGGING CARD INFORMATION................................ 130
1.14 LOGGING DISPLAY INFO ....................................................... 130
1.15 LOGGING DEBUG ................................................................... 132
1.16 LOGGING DEBUG ................................................................... 134
1.17 LOG FILE COLLECTION ......................................................... 134
2 SYSLOG CONFIGURATION ........................................................ 135
2.1 SYSLOG SERVER..................................................................... 136
2.2 EXERCISE 5: LOGGING & SYSLOG......................................... 136
2.2.1 EXERCISE REVIEW: CONFIGURE SYSLOG & DEBUG........ 137
2.2.2 EXERCISE REVIEW: SYSLOG SERVER ENVIRONMENT .... 137
2.2.3 EXERCISE REVIEW: SAVE AND DISPLAY THE LOGS ........ 138
3 CHAPTER SUMMARY .................................................................. 139
7 USE AND IMPACT OF DEBUGGING ON THE SSR SYSTEM ..... 141
1 DEBUG INTRODUCTION ............................................................. 142
1.1 DEBUG COVERAGE ................................................................. 144
1.2 HOW TO RECOGNIZE A DEBUG FUNCTION .......................... 145
1.3 DEBUGGING WITHIN CONTEXT LOCAL ................................. 146
1.4 DEBUGGING IN DIFFERENT CONTEXTS................................ 146
1.5 DEBUG RELATIONSHIP WITH CONTEXTS ............................. 148
1.6 SEND DEBUG OUTPUT TO SCREEN ...................................... 149
1.7 ADMINISTRATOR PRIVACY ..................................................... 151
1.8 DEBUGGING AND IMPACT ...................................................... 152
1.9 EXERCISE 6: DEBUGGING ON SSR ........................................ 152
2 CHAPTER SUMMARY .................................................................. 153
8 TROUBLESHOOTING FOR TRAFFIC FLOW THROUGH PORTS, CIRCUITS AND INTERFACES ................................... 155
1 TROUBLESHOOTING BASIC CHECKS ....................................... 156
1.1 INTERFACE & PORT STATES .................................................. 157
1.2 VERIFYING INTERFACE STATUS ............................................ 159
1.3 IDENTIFYING INTERFACE PROBLEMS: UNBOUND STATE... 160
1.4 PORT STATUS: ADMIN STATE AND LINE STATE ................... 163
1.5 CIRCUIT STATUS ..................................................................... 164
2 TROUBLESHOOTING TRAFFIC .................................................. 165
2.1 TROUBLESHOOTING TRAFFIC PROBLEMS ........................... 165
2.2 PORT COUNTERS – OVERVIEW ............................................. 166
2.3 LIVE PORT COUNTERS ........................................................... 167
2.4 PORT COUNTERS .................................................................... 167
2.5 TROUBLESHOOTING CIRCUITS ............................................. 169
2.6 CIRCUIT COUNTERS................................................................ 169
2.7 VLAN CIRCUIT STATISTICS ..................................................... 170
2.8 CLEARING COUNTERS ............................................................ 171
2.9 IP TROUBLESHOOTING TOOL ................................................ 171
2.10 TRAFFIC TROUBLESHOOTING EXERCISE: INTRODUCTION ............... 172
2.11 TRAFFIC TROUBLESHOOTING EXERCISE: PREPARATION ................ 172
2.12 EXERCISE 7: TRAFFIC TROUBLESHOOTING....................... 172
2.12.1 CONTEXT TOPOLOGY FOR TRAFFIC TROUBLESHOOTING EXERCISE ..................................................... 173
2.12.2 TRAFFIC TROUBLESHOOTING EXERCISE REVIEW ......... 173
3 CHAPTER SUMMARY .................................................................. 177
9 ACRONYMS AND ABBREVIATIONS ........................................... 179
10 INDEX .......................................................................................... 183
11 TABLE OF FIGURES ................................................................... 185
1 CLI Tools for Troubleshooting
Chapter Objectives
After this course the participant will be able to:
› Identify the CLI tools for troubleshooting
› Describe grep and its options
› Understand the use of CLI command aliases as shortcuts
› Use CLI command macros to execute multiple commands with a
single command
Figure 1-1: Chapter Objectives
1 Review Fundamental Concepts
Figure 1-2: Review Fundamental Concepts
1.1 Context, Interfaces, & Bindings Architecture
Figure 1-3: Context, Interfaces, & Bindings Architecture
1.2 Terminology
Context
A context is an instance of a virtual router. A context has its own management
domain, Authentication, Authorization, and Accounting (AAA) name space, IP
address space, and routing protocols. You create and delete contexts with
configuration commands. Contexts share common resources, such as memory
and processor cycles, but each context functions independently of all other
contexts configured on the router.
Every configuration includes a local context, which cannot be deleted. In
single-context configurations, the local context is the only context.
All Ericsson IP Operating System features are implemented per context,
including the Command-Line Interface (CLI), Simple Network Management Protocol
(SNMP), troubleshooting features (such as ping, traceroute, debug, and system
logging), IP addresses, interfaces, and access control lists. Likewise, each
context has its own complete implementation of IP routing protocols, including
Border Gateway Protocol (BGP), Open Shortest Path First (OSPF), Intermediate
System-to-Intermediate System (IS-IS), and the complete IP multicast routing
protocol suite.
Each BGP instance has its own autonomous system number, policies, and import
and export properties. Each context can contain any mix of Interior Gateway
Protocol routing protocols.
Each context has its own IP address space, which can overlap with the address
space of other contexts. Any physical port and circuit can be associated with a
context through configuration commands and the binding process.
A context can have its own unique set of CLI administrators, each with their own
(possibly overlapping) administrator names and passwords, and each
authenticated through their own set of AAA databases. Each context can have its
own SNMP community strings. This support allows VPN customers visibility
into their own routing context for debugging and troubleshooting purposes.
Interface
The concept of an interface in the operating system differs from that in
earlier networking devices. In earlier devices, the term interface was often
used synonymously with port, channel, or circuit, which are physical entities,
to mean the point on a router where a physical port is bound to a physical
communication line, either fiber-optic or copper. In the operating system, an
interface is the managed object that provides higher layer protocol and service
information (such as Layer 3 addressing) to a context at the point where a
physical or logical circuit interfaces with the context. The decoupling of the
interface from the physical layer entities enables many of the advanced
features offered by the router.
An interface is a logical entity that provides higher layer protocol and service
information, such as Layer 3 addressing. Interfaces are configured as part of a
context and are independent of physical ports and circuits. The separation of the
interface from the physical layer allows for many of the advanced features
offered by the router. For higher layer protocols to become active, you must bind
a physical port or circuit to an interface.
When you add interfaces to a context, the following restrictions apply:
• A context can have only one interface per subnet.
• The host portion of an IP address assigned to an interface cannot be 0.
• The host portion of an IP address assigned to an interface cannot be the
  broadcast address for the subnet.
• For an unnumbered interface, the IP address borrowed must be in the
  same context as the unnumbered interface.
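The first three restrictions can be checked mechanically. The following is a minimal Python sketch, not part of the SSR software; the function name check_interface_address is hypothetical, and the sketch reads the third restriction as "the address cannot be the subnet broadcast address". It ignores the /31 and /32 special cases and the unnumbered-interface rule.

```python
import ipaddress


def check_interface_address(cidr, existing_cidrs):
    """Return a list of restriction violations for a candidate interface address.

    cidr           -- candidate address in prefix notation, e.g. "1.1.1.1/24"
    existing_cidrs -- addresses already assigned to interfaces in the same context
    """
    candidate = ipaddress.ip_interface(cidr)
    network = candidate.network
    problems = []

    # Restriction: the host portion cannot be 0 (that is the subnet address itself).
    if candidate.ip == network.network_address:
        problems.append("host portion is 0")

    # Restriction: the host portion cannot be the subnet broadcast address.
    if candidate.ip == network.broadcast_address:
        problems.append("host portion is the broadcast address")

    # Restriction: only one interface per subnet within a context.
    for other in existing_cidrs:
        if ipaddress.ip_interface(other).network == network:
            problems.append("subnet already used by " + other)

    return problems
```

An empty list means the candidate address passes these three checks; any other result names the restriction that was violated.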
Port
A port is a physical entity handling encapsulation and bits on the wire (e.g.,
ATM, Ethernet, PoS).
Ports in the router provide the physical connections to communication lines.
Ethernet ports are the simplest type of circuits provided by the router.
Circuit
In general telecommunications use, a circuit is a communications path between
two or more points. However, in the Ericsson IP Operating System, the term
circuit refers to the endpoint of any segment of a communications path that
terminates on a node in a network.
An 802.1Q Permanent Virtual Circuit (PVC) or VLAN is a separate,
administratively defined subgroup of a bridged LAN. Bridged LANs and 802.1Q
encapsulation are described in the 802.1Q IEEE Standard for Local and
Metropolitan Area Networks: Virtual Bridged Local Area Networks
specification, which defines the architecture and bridging protocols for
partitioning a bridged LAN into VLANs.
The router supports several types of circuits:
• Ethernet ports, single-tagged 802.1Q Permanent Virtual Circuits (PVCs)
  (VLANs), and double-tagged 802.1ad tunnels
  All of these circuit types are also supported as aggregated circuits in
  802.1AX link-groups.
• Layer 2 service instances
  Service instances are subinterfaces of a LAN that accept one or more
  Layer 2 (802.1Q) services for transport across local physical ports or a
  provider backbone network. However, because packets flow through
  service instances when in cross-connections and in bridge configurations,
  they are also Ericsson IP OS circuits.
• Generic Routing Encapsulation (GRE) tunnel circuits
• Layer 2 Tunneling Protocol (L2TP) tunnel circuits
• CLIPS virtual circuits
• PPPoE virtual circuits
SSR configuration example
Figure elements: Context ABC; Port: port eth 1/1; Cct: dot1q pvc 100;
Interface test; binding created by operator.
context ABC
!
 interface test
  ip address 1.1.1.1/24
!
port eth 1/1
 dot1q pvc 100
  bind interface test ABC
!
› Context: a ‘Virtual Router’ containing its own routing info, addresses, subscribers, VPNs etc.
› Interface: a logical IP entity residing in the context (NOT the same as port or circuit)
› Port: a physical entity handling encapsulation and bits on the wire (e.g., ATM, Ethernet, PoS)
› Circuit: the same as port, except at a more specific level (e.g., ATM PVC or Ethernet VLAN)
› Binding: a virtual ‘patch-cable’ connecting the port or circuit to the interface, in the context
Figure 1-4: Terminology
Binding
A binding forms the association between a port or circuit and an interface (in a
given context) on which a Layer 3 or higher protocol is configured. Data cannot
flow on a port or circuit until it is bound to an interface and the higher layer
protocol is configured and enabled.
Binding an Ethernet Port to an Interface
Bindings associate particular ports or circuits with the higher layer routing
protocols configured for a context. No user data can flow on a port or circuit until
a higher layer service is configured and associated with it. After a port or circuit
is bound to an interface, traffic flows through the context as it would through any
IP router.
Unlike other IP operating systems that use implicit binding throughout, the
operating system uses explicit binding; that is, the interface and the circuit
exist as separate objects and become bound to each other only through an
explicit command that either associates a static circuit with an interface or
dynamically associates a higher-layer protocol session with an interface.
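The separation of interface and circuit can be pictured as two independent objects joined by an explicit bind step. The Python sketch below is a conceptual model only; the class names Interface and Circuit and the method can_carry_traffic are illustrative assumptions, and the sample values come from the configuration example in Figure 1-4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interface:
    name: str
    context: str
    ip_address: Optional[str] = None  # Layer 3 information lives on the interface


@dataclass
class Circuit:
    port: str                          # for example "1/1"
    vlan: Optional[int] = None         # None means the Ethernet port itself
    bound_to: Optional[Interface] = None

    def bind(self, interface):
        """Explicit binding: the operator ties this circuit to an interface."""
        self.bound_to = interface

    def can_carry_traffic(self):
        # No user data flows until the circuit is bound to an interface
        # that has a higher layer protocol (here: an IP address) configured.
        return self.bound_to is not None and self.bound_to.ip_address is not None


# Values taken from the Figure 1-4 configuration example.
test_if = Interface(name="test", context="ABC", ip_address="1.1.1.1/24")
pvc = Circuit(port="1/1", vlan=100)
before = pvc.can_carry_traffic()   # circuit exists but is unbound
pvc.bind(test_if)                  # equivalent in spirit to "bind interface test ABC"
after = pvc.can_carry_traffic()
```

In this model the circuit carries no traffic before the bind call and does so afterwards, which mirrors the rule that a binding is required before data can flow.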
2 Command Line Interface (CLI) Structure
Figure 1-5: Command Line Interface (CLI) Structure
2.1 Introduction
By factory default, the SSR configuration is empty, and none of the features
and functions are enabled or configured.
To access the software and its CLI, use either of the following methods:
• Connect to the console port: You can connect a terminal to this port,
  either directly or through a terminal server.
• Connect to the Ethernet management port: You can connect a terminal
  to the system over a LAN if remote access using SSH or Telnet has been
  enabled.
› By factory default, the SSR configuration is empty and none of
the features and functions are enabled or configured
› The SSR platform is configured through the Command Line
Interface (CLI)
› The CLI is intuitive and offers many shortcuts that make
it easier to use
Figure 1-6: Introduction
› Before you can use the CLI to configure the SSR, you
need to be connected
› In the factory default state, you can configure the
SSR only through the console port
› The RPSW has a console port for configuration
purposes
› Initially, connect to the console port and start a
serial terminal (9600, N, 8, 1, no flow control).
– In the training lab, a terminal server is connected
to the console port to allow remote access to the system.
– The lab web page connects to the console port through the
terminal server IP address and the port number assigned to
a specific SSR.
Figure labels: Console Port, Ethernet Port
We assume power is connected and the system is running.
The management port is the 10/100/1000 Ethernet port located on the controller card
and is designated for system management. The management port is configured in
the local context.
Figure 1-7: Factory Default System: Step one
2.2 Maneuvering through the CLI
In the CLI, the two primary modes are exec and global configuration.
When a session is initiated, the CLI is set to the exec mode by default. The exec
mode allows you to examine the state of the system and perform most
monitoring, troubleshooting, and administration tasks using a subset of the
available CLI commands.
Exec mode prompts take one of the following forms, depending on the user
privilege level:
[local]hostname#
[local]hostname>
Connected
1 [local]Ericsson>                              (1: Operator monitoring)
2 [local]Ericsson> enable                       (2: Enable)
3 [local]Ericsson#                              (3: Administrator monitoring)
  [local]Ericsson# config
4 [local]Ericsson(config)#                      (4: Global config)
  [local]Ericsson(config)# context local
5 [local]Ericsson(config-ctx)#                  (5: Context)
  [local]Ericsson(config-ctx)# interface test
6 [local]Ericsson(config-if)#                   (6: Interface)
  [local]Ericsson(config-if)# exit
5 [local]Ericsson(config-ctx)#
  [local]Ericsson(config-ctx)# end
3 [local]Ericsson#
Modes shown at level 5: Context, Port, QoS.
Modes shown at level 6: Subscriber, Interface, Access-lists, OSPF, AAA.
Further labels in the figure: ATM PVC, Bind, password, OSPF Sub-modes.
Figure 1-8: Maneuvering through the CLI
In this example, local is the context in which commands are applied and
hostname is the currently configured hostname of the router. When you exit exec
mode using the exit command, the entire CLI session ends.
Global configuration mode is the top-level configuration mode; all other
configuration modes are accessed from this mode. The configuration modes
allow you to configure the system through the CLI, or to create and modify a
configuration file offline by entering configuration commands using any text
editor. After you have saved the file, you can then load it to the operating system.
To access global configuration mode, enter the configure command in exec
mode.
Configuration mode prompts take the following form:
[local]hostname(mode-name)#
In the example, local is the context in which commands are applied, hostname is
the currently configured hostname of the router, and mode-name is a string
indicating the name of the current configuration mode.
The prompt in global configuration mode, assuming the factory default hostname
of Ericsson and the local context, is:
[local]Ericsson(config)#
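When collecting CLI output in scripts, it can help to split a prompt into the parts described above. The following Python sketch is an illustration under assumptions: the regular expression and the function name parse_prompt are not part of the product, and they cover only the prompt forms shown in this section.

```python
import re

# [context]hostname, an optional (mode-name), then '#' or '>'.
PROMPT_RE = re.compile(
    r"^\[(?P<context>[^\]]+)\]"
    r"(?P<hostname>[^(#>]+)"
    r"(?:\((?P<mode>[^)]+)\))?"
    r"(?P<marker>[#>])$"
)


def parse_prompt(prompt):
    """Split a prompt string into its parts, or return None if it does not match."""
    match = PROMPT_RE.match(prompt.strip())
    if match is None:
        return None
    parts = match.groupdict()
    # A prompt without a (mode-name) part is an exec mode prompt.
    parts["mode"] = parts["mode"] or "exec"
    return parts
```

For example, parse_prompt("[local]Ericsson(config)#") yields the context local, the hostname Ericsson, and the mode name config.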
Command modes exist in a hierarchy. You must access the higher-level
command mode before you can access a lower-level command mode in the same
chain.
Exit: Exits the current configuration mode and returns to the next highest level
configuration mode. At the exec prompt, closes an active terminal or console
session, and terminates the session.
End: Exits the current configuration mode and returns to exec mode.
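The difference between exit and end can be modeled as operations on a stack of command modes. This is a toy Python model for study purposes, not the actual CLI implementation; the class name CliSession and its attributes are hypothetical.

```python
class CliSession:
    """Toy model of command mode navigation; not the real CLI implementation."""

    def __init__(self):
        self.modes = ["exec"]   # a session starts in exec mode
        self.open = True

    def enter(self, mode):
        # Push the lower-level mode on top of the current one.
        self.modes.append(mode)

    def exit(self):
        if len(self.modes) > 1:
            self.modes.pop()    # back to the next highest level mode
        else:
            self.open = False   # exit at the exec prompt ends the session

    def end(self):
        del self.modes[1:]      # straight back to exec mode

    @property
    def current(self):
        return self.modes[-1]
```

Entering config, config-ctx, and config-if and then calling exit leaves the model in config-ctx, whereas end returns it to exec from any depth.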
Navigate the CLI
› All configuration commands are stored in the transaction database
› But none of the commands are actually activated
› It is like writing your configuration on a sticky note
› Activate your configuration:
  – [local]Ericsson(config)# commit (right away, but do not leave configuration mode)
  – [local]Ericsson(config)# end (right away and leave configuration mode)
  – [local]Ericsson(config)# exit (only if you jump out of configuration mode)
› Throwing away your sticky note during configuration:
  – [local]Ericsson(config)# abort
› Undo a single command:
  – [local]Ericsson(config-ctx)# no interface test
Configuration will be committed automatically when you leave the
configuration mode using exit or end.
You lose configuration when:
• You get disconnected during configuration
• You type abort during configuration
Figure 1-9: If you are configuring…
Commit: Commits an outstanding configuration database transaction.
End: Use the end command to exit the current configuration mode and return to
exec mode. When this command is entered, all commands entered since the
beginning of the configuration session, or since the last abort or commit
command in configuration mode, are committed to the database.
Exit: Use the exit command to exit the current configuration mode, return to
exec mode, or close an active terminal or console session.
Entering this command in any configuration mode exits the current configuration
mode and returns to the next highest level configuration mode.
Abort: Use the abort command to delete an outstanding database transaction,
which includes all configuration commands entered since the beginning of the
configuration session or since the latest abort or commit command.
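The commit, end, and abort behavior described above amounts to a pending list of commands that is either applied or discarded. The Python sketch below is a simplified illustration; the class ConfigTransaction is hypothetical and does not reflect how the configuration database is implemented.

```python
class ConfigTransaction:
    """Toy model of the configuration transaction; an illustration only."""

    def __init__(self):
        self.active = []    # commands that are in effect
        self.pending = []   # entered but not yet committed

    def enter(self, command):
        self.pending.append(command)

    def commit(self):
        # Activate everything entered since the last commit or abort.
        self.active.extend(self.pending)
        self.pending.clear()

    def abort(self):
        # Discard everything entered since the last commit or abort.
        self.pending.clear()

    def end(self):
        # Leaving configuration mode with end commits the transaction.
        self.commit()
```

In this model a command has no effect until commit or end is called, and abort drops only the commands entered since the last commit or abort.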
Exiting Command Modes
The following example exits global configuration mode and returns to exec
mode:
[local]Ericsson(config)#exit
[local]Ericsson#
The following example exits a CLI session:
[local]Ericsson#exit
The following example exits context configuration mode and returns to exec
mode:
[local]Ericsson(config-ctx)#end
[local]Ericsson#
3 Monitoring with CLI
Figure 1-10: Monitoring with CLI
3.1 CLI Introduction and the prompt structure
The primary administrator interface to the operating system is the CLI.
It is an intuitive text-based command interpreter from which you can efficiently
operate, configure, verify and monitor the system by entering different
commands.
The system parses each entered command and either changes system
parameters or outputs a result. In this example we entered a show command.
Reading the prompt from right to left, the part before the command is the
configuration mode indicator. Here it indicates
that we are configuring a context. We will come back to this later in this course.
Next is the system hostname. By default the hostname is set to Ericsson. We will
show you later how to change this.
The left-most part is the context monitoring mode indicator. It indicates the name
of the context that is currently monitored. All contexts in the system have unique
names. The default administrative context called “local” is always there.
This part of the prompt indicates that only the information for this context will be
displayed when typing show commands. However, context “local” is an
exception: using a show command when monitoring from context local can
output information about all contexts.
[local]Ericsson(config-ctx)# show ...
[local] = currently monitored context
Ericsson = system hostname (default)
(config-ctx) = configuration mode indicator
show ... = enter commands
Figure 1-11: CLI Introduction and the prompt structure
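When collecting session logs it can help to split a prompt into its parts. The following Python sketch is an illustration only: the function parse_prompt and the regular expression are assumptions derived from the prompt examples in this book, not a documented prompt grammar.

```python
import re

# Assumed prompt layout: [context]hostname(mode)#  or  [context]hostname#
PROMPT_RE = re.compile(
    r"^\[(?P<context>[^\]]+)\]"      # currently monitored context
    r"(?P<hostname>[^(#]+)"          # system hostname
    r"(?:\((?P<mode>[^)]+)\))?#"     # optional configuration mode indicator
)

def parse_prompt(line):
    """Return the prompt parts as a dict, or None if the line has no prompt."""
    match = PROMPT_RE.match(line)
    if match is None:
        return None
    return {
        "context": match.group("context"),
        "hostname": match.group("hostname"),
        "mode": match.group("mode") or "exec",  # no indicator means exec mode
    }

print(parse_prompt("[local]Ericsson(config-ctx)# show ..."))
print(parse_prompt("[A]Ericsson#"))
```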
3.2 Context Monitoring
Physical routers (Router A, Router B):
To check configuration on Router A:
1. Connect wire to Router A
To check configuration on Router B:
1. Disconnect wire from Router A
2. Connect wire to Router B
Ericsson SSR (Context A, Context B): Connect wire to Ericsson SSR
To check configuration on Context A:
1. Switch to context A using:
[local]Ericsson# context A
[A]Ericsson#
To check configuration on Context B:
1. Switch to context B using:
[A]Ericsson# context B
[B]Ericsson#
Figure 1-12: Context monitoring
3.3 CLI Help
To access the online Help for the CLI:
• Use the ? command when entering a command to display the options available at the current state of the command syntax.
• Use the help command to display how to use the ? character to obtain help.
› Within the CLI there is help available!
› Use it, since it is very intuitive
› “?” lists commands at current level
[local]Ericsson# show ip ?
access-list   Display access list(s)
all-host      Display static and dynamic IP hosts to name mappings
dynamic-host  Display dynamic IP hosts to name mappings
host          Display static IP hosts to name mappings
. . .
› “command ?” lists options for that command
[local]Ericsson(config-ctx)#router ?
ancp  Access Node Control (GSMP)
bfd   Bidirectional Forwarding Detection (BFD)
bgp   Border Gateway Protocol (BGP)
. . . .
[local]Ericsson(config-ctx)# router bgp ?
1..4294967295  Autonomous system (AS) number
nn:nn          AS number in nn:nn format
Figure 1-13: CLI Help
3.4 CLI for the fast people
Use the Tab key in any mode to complete a command. Partially typing a command
name and pressing the Tab key causes the command to be displayed in full to the
point where it is no longer unique and a further choice has to be made.
› The CLI will accept partially submitted commands, as long as they are unique:
– Rather than typing “show configuration” you can also type “sho conf”
› The CLI will complete a command if you press the TAB key. This is nice for the
purists amongst us.
› Use the up/down arrow keys to scroll through previous commands.
› Use the left/right arrow keys to scroll through the current command
› Full Emacs editing support
› Press Enter at any time; parser reads the entire line
Figure 1-14: CLI for the fast people
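The rule that an abbreviation is accepted only while it is unique can be sketched in a few lines. This Python example is an illustration under assumptions: the function expand and the two keyword lists are invented and heavily shortened, and the real parser works on the full command tree.

```python
def expand(word, candidates):
    """Return the single candidate that starts with `word`, else raise ValueError."""
    if word in candidates:            # an exact match always wins
        return word
    matches = [c for c in candidates if c.startswith(word)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown keyword: {word!r}")
    raise ValueError(f"ambiguous keyword {word!r}: {sorted(matches)}")

# Hypothetical, shortened keyword lists for illustration only.
TOP_LEVEL = ["show", "save", "ping", "configure", "context"]
SHOW_ARGS = ["configuration", "bindings", "port", "history"]

# "sho conf" expands to "show configuration"
print(expand("sho", TOP_LEVEL), expand("conf", SHOW_ARGS))
```

With these lists, "s" and "con" are rejected as ambiguous at the top level, which is the situation where the Tab key stops completing and a further choice has to be made.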
4 Lab Environment
Figure 1-15: Lab environment
4.1 Connecting to Ericsson Training Labs
All labs are accessible via SSH from any location
• There are SDSL lines for the labs
• Each lab is connected via a dedicated SDSL link
o There is a backup link from another provider for each line
• Lab firewalls convert public IP addresses of SDSL lines to IP addresses of ssh servers
• SDSL link is mapped to the ssh1 server while ssh2 works as backup
Figure 1-16: Connecting to Ericsson Training labs
• Students access a lab through a gateway – ssh server
• ssh server performs multiple functions
o Verifies user’s credentials
o Handles telnet connections between student’s PC and lab equipment (telnet is tunneled inside ssh session)
o Provides http proxy functionality (needed to access CPE web interface)
o And more but not related to student access
5 Configure Management Interface
Figure 1-17: Configure Management Interface
5.1 Reference for this Module
Figure labels: Management side (Management, context local, Radius), Subscriber side (Subscriber), Ericsson SSR, Backbone side (Backbone)
Circuit:
•ATM PVC
•FR DLCI
•VLAN
•Pseudo circuit (session)
Session could be
•DHCP (CLIPS)
•PPPoE/A/oAoE
Figure 1-18: Reference for this module
5.2 Configure Management Interface
X = group number [1-5]
General Tasks:
1) System
2) Context
3) Interface
4) Port
5) Binding
6) Commit
! 1) System
system hostname Train
system location Training
system contact GroupX
! 2) Context
context local
administrator GrX password GrX
enable password ericsson
! 3) Interface
interface mgmt
ip address yy
! 4) Port
port ethernet management
no shutdown
! 5) Binding
bind interface mgmt local
! 6) Commit
commit
Remember:
1. Names are case sensitive
2. Names are just variables and you determine their value
3. Context local is always there! Why?
4. How does a binding know which interface to select and which context?
Figure 1-19: Configure Management interface
system hostname
Specifies the system hostname.
system location
Configures the system location information.
system location text
Text that explains the physical location of the system. This argument can be any
alphanumeric string, including spaces, from 1 to 39 characters.
system contact
Identifies the system contact. By default, no system contact information is configured. Use the
system contact command to configure the system to identify the person or
department to contact regarding system information.
port ethernet management
Configures the Ethernet management port on the controller card.
Use the port ethernet management command to create an Ethernet management
port for administrative access to the router. The slot is assigned automatically.
Use the no form of the command to remove an Ethernet management port.
bind interface
Statically binds a port, permanent virtual circuit (PVC), 802.1Q tunnel, a link
group, Generic Routing Encapsulation (GRE) tunnel or tunnel circuit, or IP-in-IP
tunnel to a previously created interface in the specified context.
context local
The special context local is always present and has unique qualities. Only an
administrator authenticated in the local context can configure the system.
Administrators authenticated in the local context can observe any portion of the
system, regardless of context. Administrators authenticated in other contexts are
restricted to the portion of the system relevant to that context.
Contexts are independent name spaces and data spaces. For example, a routing
process in one context can share routing information with a routing process in
another context through inter-context interfaces, just as physical routers are
connected together by physical cables.
administrator admin-name [encrypted encrypt-type password | password
password]
Creates an administrator logon account, or selects an existing one for
modification, and enters administrator configuration mode.
admin-name
Alphanumeric string representing a new or existing administrator.
encrypted encrypt-type password
Required only when configuring a new administrator account. Alphanumeric
string representing an encrypted type 1 or type 2 password for the administrator
account. Enter an already encrypted password to define the password.
password password
Required only when configuring a new administrator account. Alphanumeric
string representing an unencrypted password for the administrator account.
enable password ericsson
Use the enable password command to configure a password for the specified
privilege level. The system encrypts this password.
The router supports privilege levels 0 to 15 for both administrators and
commands. Privilege levels are enabled on a per-context basis.
If password authentication is enabled, the system prompts for the password when
the administrator enters the privilege level using the enable command in exec
mode. By default, local password authentication is enabled (see the enable
authentication command).
To protect your passwords, the system does not store or display this command.
Instead, the system stores and displays the password in an encrypted form. When
displaying the configuration, the system uses the enable encrypted command in
context configuration mode.
Use the no form of this command to delete the password for the specific privilege
level.
interface
ip address <ip-address>
Assigns a primary IP address, and optionally, one or more secondary IP
addresses, to an interface.
Use the ip address command to assign a primary IP address and, optionally, one
or more secondary IP addresses to an interface. This assignment enables IP
services on an interface.
Use the ip-addr argument and either the netmask or /prefix-length construct to
assign the interface a primary IP address and netmask or prefix length. For
nonloopback interfaces, use the bind interface command in port configuration
mode to bind a circuit to the interface on which IP services are enabled.
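The netmask and /prefix-length constructs describe the same assignment. The short Python sketch below uses the standard ipaddress module to show the equivalence; the address 10.1.1.105/24 and the ping target 10.1.1.3 are taken from examples in this chapter, and nothing here runs on the router.

```python
import ipaddress

# The same primary address written with a prefix length and with a netmask.
by_prefix = ipaddress.ip_interface("10.1.1.105/24")
by_netmask = ipaddress.ip_interface("10.1.1.105/255.255.255.0")

print(by_prefix == by_netmask)   # True: both describe the same assignment
print(by_prefix.netmask)         # 255.255.255.0
print(by_prefix.network)         # 10.1.1.0/24

# A host is directly reachable on this interface only if it is in the network.
print(ipaddress.ip_address("10.1.1.3") in by_prefix.network)   # True
```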
Commit
Commits an outstanding configuration database transaction.
Use the commit command to commit an outstanding configuration database
transaction. You can use the at or in keywords to schedule the transaction to be
committed later. You can also associate a comment with the transaction.
5.3 Validating the Configuration
› Check if you are in the right context
– [local]Train-1# show bind
– [local]Train-1# show ip interface brief
– [local]Train-1# ping 10.1.1.3
– [local]Train-1# show port
– [local]Train-1# show port counter management
– [local]Train-1#
– ….
– Connect Computer and start Telnet session to management interface address
– ….
› Save Configuration
› [local]Train-1# save config /md/Ericsson_GrX.cfg
Figure 1-20: Validating the configuration
show bindings
Displays information on the configured bindings of one or more ports or
permanent virtual circuits (PVCs) on the system.
show ip interface brief
Displays information about interfaces, including the interface bound to the
Ethernet management port on the controller card.
Use the show ip interface command to display information about all interfaces,
including those on the controller card. Use this command without optional syntax
to display detailed information on all configured interfaces.
if-name
Optional. Name of the interface to be displayed.
all-context
Optional. Displays interface information for all contexts.
brief
Optional. Displays the name, IP address, and other information, in brief, for all
configured interfaces in the current context or, if the optional all-context keyword
is used, all contexts.
ping <ip-address>
Tests whether the host is reachable.
show port
Displays a list of ports that are present or configured in the system.
show port counters
Displays the counters associated with system ports. The values shown are
accumulated since the counters were last cleared
with the clear port counters command in exec mode or since the line card was last
reloaded.
Use the persistent keyword to display counter values accumulated since the
system was last reloaded.
If you specify the optional slot or port argument, the display shows counter
information for the specified line card or port.
By default, this command displays only summary counter information for all
ports with their last known values, which are cached and updated every 60
seconds. Use the live keyword to read and display the current values for the
summary counters.
5.4 Binding Information
[local]Train-1# show bindings
Circuit      State  Encaps    Bind Type  Bind Name
management   Up     ethernet  interface  management@local
Summary:
total: 1
up: 1        down: 0
bound: 1     unbound: 0
auth: 0      interface: 1    subscriber: 0    bypass: 0
no-bind: 0   atm: 0          chdlc: 0         dot1q: 0
ether: 1     fr: 0           gre: 0
mpls: 0      ppp: 0          pppoe: 0
clips: 0     vpls: 0         ipip: 0
ipsec: 0     ipv6v4-man: 0   ipv6v4-auto: 0
[local]Train-1#
› Circuit information
› Binding state
› Encapsulation applicable for circuit / binding
› Binding type (bind interface)
› Binding Reference / name (interface management, context local)
Figure 1-21: Binding information
Use the show bindings command to display information on the configured
bindings of one or more ports or permanent virtual circuits (PVCs) on the system.
The example in Figure 1-21 displays all bindings in the current context (local).
5.5 Exercise 1: Management Configuration
› Please move to the exercises book.
Figure 1-22: Exercise 1: Management configuration
6 Troubleshooting Preparation Commands & Tools
Figure 1-23: Troubleshooting Preparation Commands & Tools
6.1 Troubleshooting Preparation
Before you begin troubleshooting, gather the evidence of what has been
happening on your router. Collect the output of the show tech-support command,
and optionally, other show commands and macros for specific problems. Collect
this evidence before beginning to troubleshoot, because some troubleshooting
techniques destroy or modify already stored data. If you need to escalate your
problem to customer support, you must include troubleshooting data with your
support request.
› It is very useful to prepare the system for more efficient and
structured troubleshooting
› The following slides present recommended commands
and tools to use while troubleshooting
Figure 1-24: Troubleshooting Preparation
6.2 Remote Terminal Session Timeout
› By default SmartEdge disconnects administrator’s sessions after 10 minutes of inactivity
› This is inconvenient when you are troubleshooting, so let's disable this function
[local]Train-1#
[local]Train-1# configure
Enter configuration commands, one per line, 'end' to exit
[local]Train-1(config)# timeout session idle 0
[local]Train-1(config)# end
[local]Train-1#exit
Connection closed by foreign host.
[student@ssh1-Gothenburg2] ~ $ telnet 10.1.1.106
Trying 10.1.1.106...
Connected to 10.1.1.106.
Escape character is '^]'.
Redback
login:
› Can also be configured per administrator
› Refresh terminal (exit and reconnect, as shown above)
Figure 1-25: Remote terminal session timeout
6.3 Who is logged into the SSR
› It might be useful to know who is logged in at the moment
and when they connected
[local]Train-1# show administrators
TTY        START TIME                REMOTE HOST    ADMINISTRATOR
-------------------------------------------------------------------------------
pts/24     Tue Dec 15 07:54:07 2015  10.1.1.1:tel   admin@local
* console  Tue Dec 15 07:54:07 2015  (null)         redback@local
[local]Train-1#
* (star): your current session
› Note! You need to refresh your connection to see new administrator sessions in the above output!
Figure 1-26: Who is logged into the SSR?
The show administrators command is used to display all administrator sessions
on a system. Use the active keyword to limit the display to active sessions. With
the active keyword, the admin-name argument can also be used to specify the
sessions corresponding to a particular administrator.
In the display, the asterisk (*) character denotes the administrator session in
which this command was entered.
6.4 Command History
› There is a CLI history log for the “monitoring mode” as well as the “configuration mode”
[local]train-1# show history ?
configuration  Display the session configuration command history
|              Output Modifiers
<cr>
[local]train-1#
History log of “monitoring commands”:
[local]train-1# show history
en
case-0
show circuit qount 3/7 queue
show circuit count 3/7 queue
show config qos
show hardware
config
show hardware
show hardware detail
History log of “configuration commands”:
[local]train-1# show history configuration
system hostname train-1
end
context xyz
exit
show history
[local]train-1#
[local]train-1(config)# show history
system hostname train-1
end
context xyz
exit
show history
[local]train-1(config)#
Figure 1-27: What did you type before?
The show history command is used to display the command history for the
current session. The history log contains up to 40 commands. To restrict the
history to only the configuration commands entered during the session, use the
optional configuration keyword, which is only available in exec mode.
Troubleshooting usually involves some unexpected behaviour, which may be the
result of user configuration. During configuration activities, or if multiple
people are logged in, it may be useful to see what commands have been executed
on the SSR. This is helpful in backtracking the cause of an event.
The show history command displays a history of commands executed in the
current operation mode. The operation mode we are in can be seen by
examining our prompt.
We can either be in configuration mode or user executive (exec) mode.
To see the history of configuration commands we have to run the command
“show history” from configuration mode.
Alternatively you can use the command ‘show history configuration’ from
User Executive Mode.
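The two history logs and the 40-command limit can be pictured as two bounded queues. The Python sketch below is an illustration only: the class SessionHistory is invented for this example, and it assumes that the limit of 40 applies to each log separately, which the text does not state.

```python
from collections import deque

HISTORY_SIZE = 40  # limit stated in the text; per-log scope is an assumption

class SessionHistory:
    """Hypothetical model of the per-session command history."""

    def __init__(self):
        self.exec_log = deque(maxlen=HISTORY_SIZE)
        self.config_log = deque(maxlen=HISTORY_SIZE)

    def record(self, command, in_config_mode):
        log = self.config_log if in_config_mode else self.exec_log
        log.append(command)          # the oldest entry drops out when full

    def show_history(self, in_config_mode=False, configuration=False):
        # "show history" reports the log of the current mode;
        # "show history configuration" in exec mode reports the config log.
        if in_config_mode or configuration:
            return list(self.config_log)
        return list(self.exec_log)

history = SessionHistory()
history.record("show hardware", in_config_mode=False)
history.record("system hostname train-1", in_config_mode=True)
print(history.show_history())                      # exec commands only
print(history.show_history(configuration=True))    # configuration commands only
```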
6.5 Troubleshooting by Searching
In many cases system troubleshooting involves interpreting the output of
system show commands, and you may need to search within the output
generated by these show commands.
› One very useful tool for troubleshooting is searching and
limiting the output
› SSR provides following features in the command line
interface:
– Emacs
– Regular expressions
– Aliases
– Macros
Figure 1-28: Troubleshooting by searching and limiting the output
You can search the whole output by using special characters and keys according
to EMACS text interpretation.
Another option for searching through the CLI output is using GREP.
The system has some powerful built-in GREP features that are useful with many
show commands.
GREP filters the output and displays only the rows which include a string of
characters matching the search pattern.
GREP is a powerful text interpretation tool and there are also extended options
available where you can use complex regular expressions.
Macros are similar to aliases, but a command macro is an extended alias that
allows you to define a sequence of commands that run one after another when
the macro name is entered, instead of entering each command separately.
6.6 Command Line Interface & Emacs
› Output with more than 24 lines is considered large output and the screen will pause after 24 lines (auto more function)
› If you want to abort the output, just type q when output shows ---more---
› Searching through the output can be done as well, very powerful
!
enable encrypted 1 $1$........$4qhlVuh2HDOCu/EbYfbM6.
!
!
administrator redback encrypted 1 $1$........$4qhlVuh2HDOCu/EbYfbM6.
---(more)---
Default 24 lines displayed; output paused at ---(more)---
Press h or H for help
---(CLI More Help)---
Display this help:                        h or H
Move down half display:                   d, or ^D
Move down one line:                       Enter, e, ^E, j or ^N
Move down one page:                       Space, f, ^F or ^V
Move to bottom of output:                 G, >, or ESC->
Move to top of output:                    g, < or ESC-<
Move up half display:                     u or ^U
Move up one line:                         y, ^Y, k, ^K or ^P
Move up one page:                         b, ^B, or ESC-v
Quit automore:                            q, Q, or ZZ
Redraw display:                           ^L, r or ^R
Repeat last search:                       n
Repeat last search in reverse direction:  N
Search backwards through the output:      ?<string>
Search forwards through the output:       /<string>
---(End of CLI Help)---
It’s sometimes useful to change terminal output length:
[local]Train-1#show terminal
terminal name    = /dev/ttyp0
terminal width   = 80
terminal length  = 24
terminal monitor = disabled
[local]Train-1#
To set the terminal output with no pause:
[local]Train-1#terminal length 0
[local]Train-1#
Figure 1-29: Command Line Interface & Emacs
You can search the whole output by using special characters and keys according
to EMACS text interpretation.
For example, to search the output for the string “abc”, type /abc
and press Enter. If the CLI finds a match, it moves to that line.
To repeat the previous search, press lowercase "n".
Press uppercase "N" to repeat the search in the reverse direction.
A number of other keys can also be used: lowercase "g" brings
you to the top of the output, uppercase "G" brings you to the bottom of the
output, lowercase "b" brings you up a page and the Spacebar brings you down one
page.
6.7 Command Line Interface & Emacs
Building configuration...
Current configuration:
!
! Configuration last changed by user '%RCM%' at Thu Feb 8 07:57:05 2011
!
service multiple-contexts
!
! Bridge global configuration
!
context local
!
no ip domain-lookup
!
interface 1
ip address 10.1.1.105/24
logging console
!
enable encrypted 1 $1$........$4qhlVuh2HDOCu/EbYfbM6.
!
administrator redback encrypted 1 $1$........$4qhlVuh2HDOCu/EbYfbM6.
---(more)---
/abc       This will search for a match on the characters “abc”
n          Repeat the previous search in forward direction
N          Repeat the previous search in reverse direction
g          Top of output
G          Bottom (end) of output
b          Move up one page
Space bar  Move down one page
Default 24 lines displayed
Very powerful for many show commands
The matched value will be the first line displayed
Figure 1-30: Command Line Interface & Emacs
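The search keys can be modeled as a pager that keeps track of the first displayed line. The Python sketch below is an illustration under assumptions: the class Pager is invented, a forward search is assumed to start on the line after the current top line, and only the /, n and N keys are modeled.

```python
import re

class Pager:
    """Hypothetical model of the automore search keys."""

    def __init__(self, lines, page=24):
        self.lines = lines
        self.page = page      # default 24 lines displayed
        self.top = 0          # index of the first displayed line
        self.last = None      # last compiled search pattern

    def search(self, pattern):                # "/pattern"
        self.last = re.compile(pattern)
        return self.repeat()

    def repeat(self):                         # "n": same search, forward
        return self._find(range(self.top + 1, len(self.lines)))

    def repeat_reverse(self):                 # "N": same search, backward
        return self._find(range(self.top - 1, -1, -1))

    def _find(self, indices):
        if self.last is None:
            return False
        for i in indices:
            if self.last.search(self.lines[i]):
                self.top = i                  # matched line is displayed first
                return True
        return False

    def view(self):
        return self.lines[self.top:self.top + self.page]

output = [f"line {i}" for i in range(100)]
output[30] = "ip address 10.1.1.101/24"
output[60] = "ip address 10.1.1.102/24"
pager = Pager(output)
pager.search(r"ip address")
print(pager.view()[0])
```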
6.8 GREP: Global Regular Expression Parser
GREP is a filter tool that searches each line for a match. When a match is
found, GREP displays the complete line containing the match.
› GREP is a filter tool which will search for a match in a line
› When a match is found, GREP will display the complete line containing the match
› It is a great tool to limit the output from almost any command within the SmartEdge
› Example:
[local]Train-1# show configuration | grep port
card ge-40-port 3
port ethernet 3/1
bind interface port27 abc
port ethernet 5/1
port ethernet management
[local]Train-1#
Display all the lines containing “port”
Figure 1-31: GREP, Global Regular Expression Parser
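The filtering idea is simple enough to sketch in a few lines. The Python example below is an illustration, not the router's implementation: the function grep is written for this example, and the sample configuration is shortened and partly made up around the "port" lines shown in the figure.

```python
import re

def grep(pattern, lines):
    """Keep every complete line that contains a match for `pattern`."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Shortened sample; "context local" and " interface port27" are made up.
config = [
    "context local",
    " interface port27",
    "card ge-40-port 3",
    "port ethernet 3/1",
    " bind interface port27 abc",
    "port ethernet management",
]

for line in grep("port", config):
    print(line)
```

Note that the match can be anywhere in the line, so "card ge-40-port 3" and the interface name port27 are listed as well. Anchoring the pattern as ^port would keep only the lines that start with "port".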
6.9 Extended GREP
› Extended GREP enables more options when using GREP.
[local]Train-1# show configuration | grep options '-E'
› Extended GREP supports Regular Expressions (presented in next slides)
› Other GREP command options:
› '-c' count
› '-i' ignore case
› '-An', '-Bn', '-Cn' After, Before, Context (n lines after, before, or around each match)
Figure 1-32: Extended GREP
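The effect of the count, ignore-case and before/after options can be sketched as follows. This Python example is an illustration under assumptions: the function grep and its keyword arguments are invented to mirror the options, the sample lines are taken from the alias example later in this chapter, and the group separators that standard grep prints between non-adjacent matches are not modeled.

```python
import re

def grep(pattern, lines, ignore_case=False, count=False, after=0, before=0):
    """Sketch of '-i', '-c', '-An' and '-Bn'; '-Cn' equals before=n, after=n."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    hits = [i for i, line in enumerate(lines) if regex.search(line)]
    if count:
        return len(hits)                      # '-c': number of matching lines
    keep = set()
    for i in hits:
        first = max(0, i - before)            # '-Bn': n lines before the match
        last = min(len(lines), i + after + 1) # '-An': n lines after the match
        keep.update(range(first, last))
    return [lines[i] for i in sorted(keep)]

lines = [
    "card ether-12-port 3",
    "!",
    "port ethernet 3/1",
    " no shutdown",
    " encapsulation pppoe",
    " bind authentication chap",
    "!",
    "end",
]

print(grep("port", lines, count=True))
print(grep("PPPOE", lines, ignore_case=True))
print(grep("pppoe", lines, before=2))
```

The before option is what makes a command such as grep with '-B6' useful on subscriber output: the matching line alone does not show which subscriber it belongs to, but the lines before it do.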
6.10 Other Searching tools
[local]Train-1# show configuration | ?
append      Append the output to the file
begin       Include lines beginning with the pattern
count       Count the number of lines
exclude     Exclude lines with the pattern
grep        Plain grep
include     Include lines with the pattern
join-lines  Join lines of a logical record for subsequent pattern matching
save        Save the output to the file
Figure 1-33: Other searching tools
The following example displays all lines from the output for the show
configuration command (in any mode) beginning with the line before the first
line that contains the word (pattern), ospf, and including the 6 lines after the first
occurrence of the pattern.
6.11 Regular Expressions
A regular expression (abbreviated regex or regexp) is a sequence of characters
that forms a search pattern, mainly for use in pattern matching with strings, or
string matching.
› Can be used in EMACS and GREP within the SmartEdge
› Following are some examples of reserved characters to build regular expressions:
[local]Train-1# show configuration | grep option '-E' '^hello'    ('-E' = extended grep)
› ^ = match expression at the start of a line, as in ^hello
› $ = match expression at the end of a line, as in hello$
› \ = turn off the special meaning of the next character, as in \^
› [ ] = match any one of the enclosed characters, as in [aeiou12]
Use Hyphen "-" for a range, as in [0-9]
› [^ ] = match any one character except those enclosed in [ ], as in [^0-9]
› . = match a single character of any value, except end of line
› * = match zero or more of the preceding character or expression
› ( ) = group function
› {n,m} = repeat previous n to m
› | = logical or, 'abc|ABC'
Figure 1-34: Regular expressions
^
Matches the starting position within the string. In line-based tools, it matches the
starting position of any line.
$
Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line.
\
The backslash character (\) in a regular expression indicates that the character
that follows it either is a special character, or should be interpreted literally.
\n
Matches what the nth marked subexpression matched, where n is a digit from 1 to
9.
[]
A bracket expression. Matches a single character that is contained within the
brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range
which matches any lowercase letter from "a" to "z".
[^ ]
Matches a single character that is not contained within the brackets. For example,
[^abc] matches any character other than "a", "b", or "c". [^a-z] matches any
single character that is not a lowercase letter from "a" to "z".
.
Matches any single character
*
Matches the preceding element zero or more times. For example, ab*c matches
"ac", "abc", "abbbc", etc. [xyz]* matches "", "x", "y", "z", "zx", "zyx", "xyzzy",
and so on.
()
Defines a marked subexpression.
{n,m}
Matches the preceding element at least n and not more than m times. For
example, a{3,5} matches only "aaa", "aaaa", and "aaaaa".
|
The choice (also known as alternation or set union) operator matches either the
expression before or the expression after the operator. For example, abc|def
matches "abc" or "def".
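The operators above can be tried out off the router. The Python sketch below uses the standard re module as a stand-in; this is an assumption for illustration, since Python's engine is not the router's grep, but the operators listed here behave the same way in these simple cases.

```python
import re

# (pattern, text, expected result of searching the text for the pattern)
checks = [
    (r"^hello",    "hello world", True),
    (r"^hello",    "say hello",   False),
    (r"hello$",    "say hello",   True),
    (r"\^",        "a^b",         True),    # backslash makes ^ literal
    (r"[aeiou12]", "xyz2",        True),
    (r"[^0-9]",    "12345",       False),   # no non-digit character present
    (r"ab*c",      "ac",          True),    # * allows zero occurrences of b
    (r"^a{3,5}$",  "aaaa",        True),
    (r"^a{3,5}$",  "aa",          False),   # fewer than 3 repetitions
    (r"abc|def",   "xxdefxx",     True),
    (r"(ab)\1",    "abab",        True),    # \1 repeats what group 1 matched
]

for pattern, text, expected in checks:
    found = re.search(pattern, text) is not None
    assert found == expected, (pattern, text)
print("all", len(checks), "checks passed")
```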
6.12 Regular Expression: Examples with GREP
› Grep example: Looking for connected subscribers with any IP address whose last octet starts with 2:
[local]Train-1# show sub act all | grep option '-E -B6' '([0-9]{1,3}\.){3}2'
user2@VeryNiceService
Circuit 3/1 pppoe 32
Internal Circuit 3/1:1023:63/6/2/32
Current port-limit 1
port-limit 1 (applied)
ip pool (applied from sub_default)
ip address 100.1.1.2 (applied from pool)
[local]Train-1#
In the pattern: ( ) = group, [0-9] = range, {1,3} and {3} = repeat
› EMACS example: Searching config for lines with any IP address whose last octet starts with 1 (1xx):
Search: enter this search pattern at --more--
! Bridge global configuration
!
!
context local
!
no ip domain-lookup
!
interface management
ip address 10.1.1.101/24
logging console
/([0-9]{1,3}\.){3}1
Result: this line matches the search
ip address 10.1.1.101/24
logging console
!
enable encrypted 1 $1$........$4qhlVuh2HDOCu/EbYfbM6.
!
!
administrator dj encrypted 1 $1$..
Figure 1-35: Regular expressions, examples with GREP
› Use the repeat to search for a match at a specific location within the line
[local]Train-1# show sub all | grep option '-E' '^.{35}user'
pppoe  3/1 pppoe 32                user2@VeryNiceServ NiceServi Feb 14 20:11:37
[local]Train-1#
[local]Train-1#show sub all | grep option '-E' '^.{35}user|--|TYPE'
TYPE   CIRCUIT                     SUBSCRIBER         CONTEXT   START TIME
-------------------------------------------------------------------------------
pppoe  3/1 pppoe 32                user2@VeryNiceServ NiceServi Feb 14 20:11:37
-------------------------------------------------------------------------------
[local]Train-1#
› Use the range and repeat to show processes with load equal to / higher than 10%
[local]Train-1# show proc | grep option '-E' '[1-9][0-9]{1,2}\...%|NAME'
NAME  PID  SPAWN  MEMORY  TIME         %CPU    STATE  UP/DOWN
aaad  288  3      6464K   00:00:00.12  11.00%  run    00:00:01
[local]Train-1#
Figure 1-36: Regular expressions , examples with GREP
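The three patterns from these examples can be checked against sample lines before they are used on a live system. The Python sketch below is an illustration only: Python's re module stands in for the router's grep, the process name csm and the column layout of the subscriber row are assumptions, and strict_ip is a hypothetical variant that is not part of the course material.

```python
import re

ip_pattern = re.compile(r"([0-9]{1,3}\.){3}2")
column_pattern = re.compile(r"^.{35}user")
cpu_pattern = re.compile(r"[1-9][0-9]{1,2}\...%")

# The pattern is not anchored after the 2, so it matches every address
# whose last octet starts with 2.
print(bool(ip_pattern.search("ip address 100.1.1.2 (applied from pool)")))  # True
print(bool(ip_pattern.search("ip address 100.1.1.25")))                     # True
print(bool(ip_pattern.search("ip address 100.1.1.12")))                     # False

# Hypothetical stricter variant: the last octet must end with the digit 2.
strict_ip = re.compile(r"([0-9]{1,3}\.){3}[0-9]{0,2}2([^0-9]|$)")
print(bool(strict_ip.search("ip address 100.1.1.12")))                      # True
print(bool(strict_ip.search("ip address 100.1.1.25")))                      # False

# Assumed layout: the subscriber name starts after exactly 35 characters.
row = "pppoe".ljust(7) + "3/1 pppoe 32".ljust(28) + "user2@VeryNiceServ NiceServi"
print(bool(column_pattern.search(row)))                                      # True

# Two or three digits before the decimal point means a load of 10% or more.
busy = "aaad 288 3 6464K 00:00:00.12 11.00% run 00:00:01"
idle = "csm 290 1 5000K 00:00:01.30 9.50% run 01:00:00"   # made-up line
print(bool(cpu_pattern.search(busy)), bool(cpu_pattern.search(idle)))        # True False
```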
7 Aliases and Macros
Figure 1-37: Aliases and Macros
7.1 Introduction to Alias
› An alias allows a short notation for a complex command string
[local]Train-1(config)# alias exec test2 show config port 3/1
[local]Train-1# test2
[0] (test2)# show config port 3/1
Building configuration...
Current configuration:
!
card ether-12-port 3
!
port ethernet 3/1
no shutdown
encapsulation pppoe
bind authentication chap
!
end
Name length is limited to 15 characters
Be careful: Creating an alias name that matches an existing command name will override the command. To restore the command, the alias configuration needs to be removed.
Figure 1-38: Introduction to Alias
7.2 Introduction to Macro
› Macro allows multiple command lines to be grouped together
The same name limitations as for aliases apply
[local]Train-1(config)# macro exec hallo
[local]Train-1(config-macro)# seq 10 show ip interface brief
[local]Train-1(config-macro)# seq 20 show bindings
[local]Train-1# hallo
[10] (hallo)# show ip interface brief
Thu Dec 15 12:23:13 2015
Name  Address        MTU   State  Bindings
mgmt  10.1.1.102/24  1500  Up     ethernet 7/1
[20] (hallo)# show bindings
Circuit     State  Encaps    Bind Type  Bind Name
management  Up     ethernet  interface  mgmt@local
-- cut --
Figure 1-39: Introduction to macro
7.3 Variables in Macros
› Macros can include variables $1…$10:
[local]Train-1# ping atm channel end-to-end 13/1 vpi 0 vci 100 count 10
[local]Train-1(config)# macro exec atm-ping
[local]Train-1(config-macro)# seq 10 show port $1/$2
[local]Train-1(config-macro)# seq 20 ping atm channel end-to-end $1/$2 vpi $3 vci $4 count $5
[local]Train-1# atm-ping 13 1 0 100 10
Arguments: $1 = 13, $2 = 1, $3 = 0, $4 = 100, $5 = 10
Figure 1-40: Variables in Macros
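How the positional variables are filled in can be sketched as a simple substitution. The Python example below is an illustration under assumptions: the function expand_macro is invented, the macro is represented as a dictionary keyed by sequence number, and the handling of missing arguments on the router is not known here (this sketch raises IndexError).

```python
import re

def expand_macro(sequence, args):
    """Replace $1, $2, ... in each macro line with the given arguments."""
    def substitute(match):
        index = int(match.group(1))
        return str(args[index - 1])   # $1 is the first argument
    # Run the lines in ascending sequence-number order.
    return [
        re.sub(r"\$(\d+)", substitute, line)
        for _, line in sorted(sequence.items())
    ]

atm_ping = {
    10: "show port $1/$2",
    20: "ping atm channel end-to-end $1/$2 vpi $3 vci $4 count $5",
}

for command in expand_macro(atm_ping, ["13", "1", "0", "100", "10"]):
    print(command)
```

Matching the whole number after the dollar sign in one step keeps $10 from being read as $1 followed by a literal 0.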
8 Exercise 2: Introduction, Searching and Filtering
› Exercise to learn searching and filtering using:
– Regular expressions
– EMACS
– GREP with macros
› Part of the exercise is about connected subscribers
– Because your system does not have any subscribers, the “show subscribers” commands are emulated:
[local]Train-1# show configuration subs_all ...      (emulates the “show subscriber all” command)
[local]Train-1# show configuration subs_active ...   (emulates the “show subscriber active” command)
Figure 1-41: Exercise 2: Introduction, Searching and Filtering
8.1
Exercise 2: Searching and Filtering
› Please move to the exercises book.
Figure 1-42: Exercise 2: Searching and Filtering
8.2
Exercise 2, Review (1-4)
› Exercise 2.1, Save filtered output to file:
[local]Train-1# show log | grep fail | save /flash/log_fail_2015.txt
[local]Train-1# dir
Contents of /flash/
...
-rw-r--r-- 1 root 0  413 Dec 15 07:40 log_fail_2015.txt
-rw-r--r-- 1 root 0 3327 Dec 15 08:24 redback.bin
-rw-r--r-- 1 root 0  986 Dec 15 08:24 redback.cfg
...
› Exercise 2.2, Searching the output using EMACS:
[local]Train-1# sh configuration subs_active
0016CED62A70@internet
Circuit
12/1:1 vpi-vci 30 381 pppoe 1292
Internal Circuit
12/1:1:63/2/2/34
Interface bound 192.168.166.0
...
qos-metering-policy marking-qos-1 (applied from sub_default)
00173391DE24@internet
We type “/” and pattern for match here
---(more)---
See next slide …
Figure 1-43: Exercise 2, review (1-4)
8.3
Exercise 2, Review (2-4)
›
Exercise 2.2, Searching the output using EMACS:
...
qos-metering-policy marking-qos-1 (applied from sub_default)
00173391DE24@internet
/192\.168\.162\.105
›
Result:
ip address 192.168.162.105 (applied from pool)
atm profile UBR-608 (applied)
qos-queuing-policy PQ (applied)
qos-metering-policy marking-qos-1 (applied from sub_default)
001733AE712C@internet
---(more)---
›
Type “u” for up half page to check the subscriber username and “n”
for next match:
– Username: instructor@internet
– Only one subscriber uses address 192.168.162.105
Figure 1-44: Exercise 2, review (2-4)
8.4
Exercise 2, Review (3-4)
› Exercise 2.3, Macro for searching domains
› Create Macro “subs_domain”:
[local]Train-1(config)# macro exec subs_domain
[local]Train-1(config-macro)# seq 10 show config subs_all | grep opt '-E -c -i' 'ericsson|redback'
[local]Train-1(config-macro)# end
› Execute Macro:
[local]Train-1# subs_domain
[10] (subs_domain)# show config subs_all | grep opt '-E -c -i' 'ericsson|redback'
2505
[local]Train-1#
Figure 1-45: Exercise 2, review (3-4)
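The three grep options in this macro combine as follows: -E enables extended regular expressions (so | means OR), -i ignores case, and -c prints the number of matching lines instead of the lines themselves. The Python sketch below reproduces that counting behaviour on a few made-up lines; the sample data and the function name count_matching_lines are assumptions for illustration, not router output.

```python
import re


def count_matching_lines(lines, pattern):
    """Count lines matching an extended regex, ignoring case (like grep -E -c -i)."""
    regex = re.compile(pattern, re.IGNORECASE)
    # A line is counted once, even if the pattern occurs in it several times.
    return sum(1 for line in lines if regex.search(line))


# Hypothetical subscriber lines, for illustration only.
sample = [
    "user1@ericsson.com",
    "q4gL2@Redback.com",
    "user2@customer2.co",
    "admin@ERICSSON.se redback-lab",
]
print(count_matching_lines(sample, r"ericsson|redback"))
```

The last sample line contains both keywords but is counted only once, which is why the macro result is a number of lines rather than a number of occurrences.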
8.5
Exercise 2, Review (4-4) (optional)
› Exercise 2.4: Macro for searching with dates. This exercise is optional (for
students who have more time).
› Number of subscribers logged in between Oct 20th – 29th:
[local]Train-1# show conf subs_all | grep option '-E' 'Oct 2[0-9]' | count
1245
[local]Train-1#
› Subscribers logged in on October 8th (17 matches):
[local]Train-1# show conf subs_all | grep option '-E' 'Oct {1,2}8'
pppoe
12/1:1 vpi-vci 34 340 pppo user2@customer2.co internet Oct 8 11:56:18
pppoe
12/2:1 vpi-vci 31 322 pppo q4gL2@redback.com internet Oct 8 17:52:47
pppoe
12/2:1 vpi-vci 31 233 pppo kS9qO@provider2.co internet Oct 8 17:54:45
pppoe
12/2:1 vpi-vci 31 419 pppo BoKcC@customer1.co internet Oct 8 17:54:28
pppoe
12/3:1 vpi-vci 30 426 pppo jQaeQ@redback.com internet Oct 8 23:05:09
...
Figure 1-46: Exercise 2, review (4-4) (optional)
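The two date patterns used in this exercise can be checked offline. The sketch below applies the same expressions with Python's re module to a few sample lines; the lines are shortened, partly invented variants of the output above and serve only as an illustration. 'Oct 2[0-9]' matches the days 20 to 29, while 'Oct {1,2}8' allows one or two spaces before the 8 and therefore matches the 8th but not the 18th or 28th.

```python
import re

OCT_20_TO_29 = re.compile(r"Oct 2[0-9]")
OCT_8 = re.compile(r"Oct {1,2}8")

# Illustrative lines only; the day is padded with one or two spaces.
lines = [
    "user2@customer2.co internet Oct  8 11:56:18",
    "q4gL2@redback.com internet Oct 8 17:52:47",
    "kS9qO@provider2.co internet Oct 18 09:00:00",
    "BoKcC@customer1.co internet Oct 28 17:54:28",
]

on_the_8th = [line for line in lines if OCT_8.search(line)]
in_the_20s = [line for line in lines if OCT_20_TO_29.search(line)]

print(len(on_the_8th), "line(s) for October 8th")
print(len(in_the_20s), "line(s) for October 20th to 29th")
```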
9
Chapter Summary
After this course the participant should be able to:
› Identify the CLI Tools for Troubleshooting
› Describe grep and its options
› Understand the use of CLI command aliases as shortcuts
› Use CLI command macros to execute multiple commands with a
single command
Figure 1-47: Chapter Summary
2 Operational Health of the SSR System
Chapter Objectives
After this course the participant will be able to:
› Understand the Operational Health of the SSR System
› Describe the basic RPSW health checks
› Explain the details of control plane interface
› Understand system storage
Figure 2-1: Chapter Objectives
1
Troubleshooting Procedure
Before you begin troubleshooting, gather the evidence of what has been
happening on your router. Collect the output of the show tech-support command,
and optionally, other show commands and macros for specific problems. Collect
this evidence before beginning to troubleshoot, because some troubleshooting
techniques destroy or modify already stored data. If you need to escalate your
problem to customer support, you must include troubleshooting data with your
support request.
System Hardware Health Check
› System alarms
› Hardware Status
System storage verification
› Internal
› External (optional)
System processes
› Processes verification
› Finding CPU intensive processes
› Process crash
› Manual Coredumps
Redundancy
› XCRP / Alarm Card redundancy
› Switch-over
Boot Problems
› Hardware related
› Software related
System Logging
› Active Logs, Log files
› Syslog
Debugging (Last Resort)
› Start debug
› Display debug
› Clear debug
Document cases in a database
Figure 2-2: Recommended Troubleshooting Procedure
In the next sections we present the recommended troubleshooting procedure,
using the tools presented earlier.
We strongly recommend documenting every case in a database that can be used
for later troubleshooting. This can reduce the time required for
troubleshooting, and therefore system downtime and OPEX.
1.1
System Hardware Health
System Hardware Health Checks are a good starting point for System
Troubleshooting. There are many components to the SSR and when
troubleshooting it’s important to verify the status of the hardware before trying
to investigate problems with system processes, routing, packet processing and so
on.
8x Control cards
-Switch Fabric
-Alarm
-Route Processor
20x Line/Service cards
-Line cards
40x1G
10x10G
2x40G, 1x100G
-Smart Services Cards
EPG, BNG, CDN,
Service Management
Figure 2-3: System Hardware Health
1.2
Overview: Hardware Status
This section describes how to troubleshoot hardware problems.
We can look at hardware status by typing ‘show hardware’. This command
displays information about each hardware component. Note that in the output
shown, some of the power module entries have been omitted for simplicity. Each
component has relevant information shown, including Slot, Type of Hardware,
Serial Number, Revision, and Manufacture Date. Some hardware components
also include Payload, which indicates the status of the hardware. In some cases
this attribute is not used and is listed as N/A (not applicable).
[local]Train-1# show hardware
Slot  Type                 Serial No      Rev     Mfg Date    Payload
----- -------------------- -------------- ------- ----------- -------
N/A   backplane            CF90000C81     R2G     02-FEB-2012 N/A
FT1   ft                   ce510004an     r2c     25-NOV-2011 N/A
FT2   ft                   ce510004ah     r2c     25-NOV-2011 N/A
PM1   pm                   BR81691974     R2B     05-NOV-2011 N/A
PM2   pm                   BR81691990     R2B     05-NOV-2011 N/A
--- cut ---
PM8   pm                   BR81691991     R2B     05-NOV-2011 N/A
RPSW1 rpsw                 Unavailable            Unavailable OK
RPSW2 rpsw                 CF90000AY0     R2H     06-DEC-2011 OK
ALSW1 alsw                 CF90000B4V     R2N     08-DEC-2011 OK
ALSW2 alsw                 CF90000B4X     R2N     08-DEC-2011 OK
SW1   sw                   CF90000BPF     R2M     13-DEC-2011 OK
SW2   sw                   CF90000B8K     R2M     11-unknown  OK
SW3   sw                   CF90000BN6     R2M     12-DEC-2011 OK
SW4   sw                   CF90000BKY     R2M     12-DEC-2011 OK
3     ge-40-port           CF90000AJQ     R2F     04-NOV-2011 OK
5     10ge-10-port         CF90000AG3     R2D     03-NOV-2011 OK
[local]Train-1#
Figure 2-4: Overview: Hardware Status
To check the hardware status of your router, use the show hardware command.
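When the show hardware output is captured to a file, the Payload column can be scanned with a short script. The following Python sketch is an offline helper written for this illustration, not an Ericsson tool; it assumes the six-column layout shown in this chapter, and the function name find_unhealthy_slots is hypothetical. The sample rows are taken from the "Card Powered Down" example later in this chapter, including its truncated "Power D" payload value.

```python
def find_unhealthy_slots(show_hardware_output):
    """Return (slot, payload) pairs whose Payload is neither OK nor N/A.

    Assumes six columns: Slot, Type, Serial No, Rev, Mfg Date, Payload.
    """
    unhealthy = []
    for line in show_hardware_output.splitlines():
        # Split into at most six fields so a payload such as "Power D"
        # stays in one piece.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # prompt lines, blank lines, "..." and similar
        if fields[0] == "Slot" or line.lstrip().startswith("-"):
            continue  # header row or separator row
        slot, payload = fields[0], fields[5].strip()
        if payload not in ("OK", "N/A"):
            unhealthy.append((slot, payload))
    return unhealthy


SAMPLE = """\
Slot  Type        Serial No   Rev  Mfg Date     Payload
----- ----------- ----------- ---- ------------ -------
N/A   backplane   CF90000CM9  R2G  29-MAY-2012  N/A
1     ge-40-port  CF90000BGX  R2H  27-DEC-2011  OK
12    ge-40-port  CF90000BYT  R2H  02-JAN-2012  Power D
"""

print(find_unhealthy_slots(SAMPLE))
```

Rows with a missing field, such as a card whose revision is not reported, would need extra handling; the sketch simply relies on whitespace-separated columns.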
[local]Ericsson# show hardware ?
  backplane      Display backplane hardware information
  card           Display hardware information for a specific card
  daughter-card  Display daughter-card hardware information
  detail         Display detail hardware information for all cards
  fantray        Display fantray hardware information
  power-module   Display power-module hardware information
  thermal        Display hardware thermal information for all cards
  |              Output Modifiers
  <cr>
[local]Train-1# show hardware card ?
  1..20          Slot number
  ALSW1..ALSW2   Slot number
  RPSW1..RPSW2   Slot number
  SW1..SW4       Slot number
Figure 2-5: More detailed hardware info
We can see the options for looking at hardware information related to Cards such
as Line Cards, Route Processor Switch Cards, Switch Cards and Alarm Cards and
other hardware components of the chassis such as fan trays and power modules.
1.3
Retrieving Hardware Details Line Cards
We can get detailed information for a particular hardware component by using
the keyword ‘detail’ at the end of the ‘show hardware’ command. As an example,
let us look at the detailed information available for a line card.
An important check that can be done here is the line card temperature. It is
important that the temperature of a line card is not too high. If the temperature
is too high or near the edge of the acceptable range, the card may flap
between up and down, causing problems.
Voltage values can also be examined here. The measured values should be within
5% of the expected values.
[local]Train-1#show hardware card 1 detail
Slot              : 1                   Type              : 1-10ge-20-4-port
Serial No         : D290092314          Hardware Rev      : R5H
Mfg Date          : 02-APR-2014
Activated Time    : 178 h
WLCC-W024         : 12
Fluffy-W024       : 16
FEX-W024          : 6
Voltage 3.300V    : 3.335 (+1%)         Voltage 1.200V    : 1.207 (+1%)
Voltage 3.300V    : 3.341 (+1%)         Voltage 1.100V    : 1.107 (+1%)
Voltage 1.000V    : 1.014 (+1%)         Voltage 1.800V    : 1.818 (+1%)
Voltage 1.500V    : 1.512 (+1%)         Voltage 1.000V    : 1.010 (+1%)
--omitted
Voltage 5.000V    : 5.140 (+3%)
Inlet Temp        : Normal (28 C)       Card Temp Status  : Normal
Payload Status    : OK                  OSD Status        : Passed
POD Status        : Passed
Failed LED        : Off                 IS LED            : On
Standby LED       : Off                 Swap LED          : Off
Ejector Switch    : 1 (Locked)
Last Payld Reset  : Power On
Active Alarms     : NONE
Notes on the output:
› Temperature and voltage are within normal range
› Power-on diagnostics (POD) passed
› LED status is shown
› Good news: no card alarms
Figure 2-6: Retrieving hardware details Line cards
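The 5% rule for voltages is simple arithmetic: the deviation is the measured value minus the nominal value, divided by the nominal value. The Python sketch below works through a few of the readings from the figure; the function names and the way the readings are entered by hand are assumptions for this illustration.

```python
def deviation_percent(nominal, measured):
    """Signed deviation of a measured voltage from its nominal value, in percent."""
    return (measured - nominal) / nominal * 100.0


def out_of_tolerance(readings, limit_percent=5.0):
    """Return the (nominal, measured) pairs that deviate by more than the limit."""
    return [
        (nominal, measured)
        for nominal, measured in readings
        if abs(deviation_percent(nominal, measured)) > limit_percent
    ]


# (nominal, measured) pairs copied from the line card output above.
readings = [(3.300, 3.335), (1.200, 1.207), (1.800, 1.818), (5.000, 5.140)]

for nominal, measured in readings:
    print("%.3fV: %+.1f%%" % (nominal, deviation_percent(nominal, measured)))
print("Out of tolerance:", out_of_tolerance(readings))
```

The 5.000 V rail is the largest deviation in this sample at about +2.8%, which the CLI rounds to +3%; all readings are therefore inside the 5% window.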
Each line card has one or more FPGAs. Each software release has a supported
FPGA version, and all FPGA images are bundled with the line card image.
FPGAs are upgraded automatically. The line cards have the following FPGAs:
• WLCC: Line card control. Terminates the Connection Manager (CM)
bus between the line card and RPSW (Controller) card and turns on the
power sequence on the board.
• WXFP/WSFP: Configuration status. Muxes system clocks and
aggregates interrupts (for example, SFP/XFP/Phy interrupts).
• WLCFAP: Connected to the FAP on the LC and has a PCI bridge.
Collects statistics from FAP. Programmed every time the FPGA is turned
on. The bus to the WSFP FPGA is used for programming.
1.4
RPSW Hardware Information
We can see similar information for an RP card. The Phalanx version used by
the RP is one piece of information that may be useful from the output of this
command. Phalanx is a CPLD chip that is not field upgradable, unlike the
FPGAs on the system, which are automatically upgraded when we upgrade IPOS on
the RP. Ensure that the Phalanx version on both RP cards is the same. The
output also lists voltage levels and temperature values, similar to the line
card output.
[local]Ericsson# show hardware card rpsw1 detail
Slot              : RPSW1               Type              : rpsw
Serial No         : CF90000BSH          Hardware Rev      : R2H
CLEI Code         : IPUCA272AA          Product Code      : W006
Mfg Date          : 22-DEC-2011
Activated Time    : 22 h
Phalanx           : 3.0.13
Spanky            : 02.02
Voltage 54.000V   : 54.032 (+0%)        Voltage 12.000V   : 12.031 (+0%)
Voltage 1.050V    : 1.054 (+0%)         Voltage 1.500V    : 1.500 (+0%)
Voltage 1.000V    : 1.000 (+0%)         Voltage 1.800V    : 1.800 (+0%)
Voltage 1.200V    : 1.199 (-0%)         Voltage 1.000V    : 0.999 (-0%)
Voltage 1.000V    : 1.000 (+0%)         Voltage 0.900V    : 0.900 (+0%)
Inlet Temp        : Normal (31 C)       Card Temp Status  : Normal
Payload Status    : OK                  OSD Status        : Passed
POD Status        : Passed
Failed LED        : Off                 IS LED            : On
Standby LED       : Off                 Swap LED          : Off
Ejector Switch    : 1 (Locked)
Last Payld Reset  : Admin
Active Alarms     : NONE
Figure 2-7: RPSW hardware information
1.5
ALSW Hardware Information
Detailed Hardware Information for the Alarm Switch Card is shown here.
The output displays similar information as for an RP card.
Notice the status of the Alarm LEDs we mentioned previously, which indicate the
existence of system alarms.
[local]Ericsson# sh hardware card alsw1 detail
Slot              : ALSW1               Type              : alsw
Serial No         : CF900009WK          Hardware Rev      : R2H
Mfg Date          : 07-OCT-2011
Activated Time    : 6 h
Farquaad          : 09
Shiba             : 03.06
Voltage 54.000V   : 53.287 (-1%)        Voltage 12.000V   : 12.000 (+0%)
Voltage 3.300V    : 3.299 (-0%)
Inlet Temp        : Normal (24 C)       Card Temp Status  : Normal
Payload Status    : OK                  OSD Status        : Not Run
POD Status        : Passed
Failed LED        : Off                 IS LED            : On
Standby LED       : Off                 Swap LED          : Off
Ejector Switch    : 1 (Locked)
Last Payld Reset  : Reset Button
Active Alarms     : NONE
Power LED         : On
Fan LED           : Off
Critical Alarm LED : On
Major Alarm LED   : Off
Minor Alarm LED   : On
Note! There are two sets of LEDs on the ALSW card:
› ALSW local card LEDs
› SSR System LEDs
› Alarms will be covered in later slides
Figure 2-8: ALSW hardware information
1.6
Finding Hardware Alarms (1-2)
› You can quickly see all alarms across the chassis using grep
[local]Ericsson# show hardware detail | grep option -E 'Alarm|Slot'
Slot              : N/A                 Type              : backplane
Active Alarms     : N/A
Slot              : FT1                 Type              : ft
Active Alarms     : NONE
Slot              : FT2                 Type              : ft
Active Alarms     : NONE
Slot              : PM1                 Type              : pm
Active Alarms     : Input Failure - Feed B
Slot              : PM2                 Type              : pm
Active Alarms     : Input Failure - Feed B
Slot              : PM3                 Type              : pm
Active Alarms     : Input Failure - Feed B
Slot              : PM4                 Type              : pm
Active Alarms     : Input Failure - Feed B
Slot              : PM5                 Type              : pm
Active Alarms     : Input Failure - Both Feeds
Slot              : PM6                 Type              : pm
Active Alarms     : Input Failure - Both Feeds
--More--
(Not all output is displayed)
Figure 2-9: Finding hardware alarms (1-2)
Note that the output is quite long and is only partially shown, but you can
see the active alarm status of each hardware component.
1.7
Finding Hardware Alarms (2-2)
› Simplifying with macro
[local]Train-1(config)# macro exec checkhw
[local]Train-1(config-macro)# seq 10 show clock
[local]Train-1(config-macro)# seq 20 show hardware detail | grep option '-E' 'Alarms|Slot'
[local]Train-1# checkhw
[10] (checkhw)# show clock
Thu Oct 27 08:49:36 2011 GMT
[20] (checkhw)# show hardware detail | grep option '-E' 'Alarms|Slot'
Slot              : N/A                 Type              : backplane
Active Alarms     : N/A
Slot              : FT1                 Type              : ft
Active Alarms     : NONE
Slot              : FT2                 Type              : ft
Active Alarms     : NONE
Slot              : PM1                 Type              : pm
Active Alarms     : Input Failure - Feed B
Slot              : PM2                 Type              : pm
Active Alarms     : Input Failure - Feed B
Slot              : PM3                 Type              : pm
--More--
Figure 2-10: Finding hardware alarms (2-2)
If we want to keep track of hardware alarms, a useful macro can be written using
the grep -E option on the output of show hardware detail, looking for the strings
Alarms and Slot. This lists all lines in the output of show hardware detail that
contain the keyword ‘Slot’ OR ‘Alarms’.
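Once the filtered output has been saved, pairing each Slot line with the Active Alarms line that follows it gives a compact list of affected components. The Python sketch below shows one way to do this offline. It is an illustration under stated assumptions: the function name slots_with_alarms is hypothetical, and the layout (a Slot line followed by an Active Alarms line) is the one shown in the figures above.

```python
import re

SLOT_RE = re.compile(r"^Slot\s*:\s*(\S+)")
ALARM_RE = re.compile(r"^Active Alarms\s*:\s*(.+?)\s*$")


def slots_with_alarms(filtered_output):
    """Map each slot to its active alarm text, skipping NONE and N/A."""
    result = {}
    current_slot = None
    for raw_line in filtered_output.splitlines():
        line = raw_line.strip()
        slot_match = SLOT_RE.match(line)
        if slot_match:
            current_slot = slot_match.group(1)
            continue
        alarm_match = ALARM_RE.match(line)
        if alarm_match and current_slot is not None:
            alarm = alarm_match.group(1)
            if alarm not in ("NONE", "N/A"):
                result[current_slot] = alarm
    return result


# Lines in the layout of the grep output shown above.
SAMPLE = """\
Slot              : N/A                 Type              : backplane
Active Alarms     : N/A
Slot              : FT1                 Type              : ft
Active Alarms     : NONE
Slot              : PM1                 Type              : pm
Active Alarms     : Input Failure - Feed B
Slot              : PM5                 Type              : pm
Active Alarms     : Input Failure - Both Feeds
"""

for slot, alarm in slots_with_alarms(SAMPLE).items():
    print(slot, "->", alarm)
```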
1.8
System Hardware Checks
A good place to start is viewing all system alarms on the chassis. The command
used is ‘show system alarm’.
As you can see, on this particular chassis there are quite a few alarms. It is
important to be able to distinguish the different types of alarms and
whether these are expected alarms or alarms that require urgent attention. In this
case we have a number of alarms related to the power modules. The
chassis we are using has only four of a possible eight power
modules, so four power modules are reported as missing. In addition, each
installed power module uses only one of its two power source inputs, so the
second feed of each active power module is reported as failed.
[local]Ericsson# show system alarm
Timestamp            Source  Severity  Description
-------------------------------------------------------------------------
Dec 15 17:33:06.509  PM1     Minor     Input Failure - Feed B
Dec 15 17:33:06.512  PM2     Minor     Input Failure - Feed B
Dec 15 17:33:06.514  PM3     Minor     Input Failure - Feed B
Dec 15 17:33:06.517  PM4     Minor     Input Failure - Feed B
Dec 15 17:33:09.875  PM5     Minor     Power Module Missing
Dec 15 17:33:09.875  PM6     Minor     Power Module Missing
Dec 15 17:33:09.885  PM7     Minor     Power Module Missing
Dec 15 17:33:09.911  PM8     Minor     Power Module Missing
Alarm levels: Minor, Major, Critical
Figure 2-11: System Hardware Checks
It is recommended that all customer SSR deployments have all power modules
loaded and dual feeds for each in which case these alarms would not be seen. We
will look at this in more detail when we discuss Troubleshooting of Power
Modules in a later section.
Each system alarm has one of three possible severities: Minor, Major, or
Critical. In this case the alarms are classed as Minor because the system is
still sufficiently powered and operational.
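When many alarms are active, it helps to look at the most severe ones first. The Python sketch below parses saved show system alarm rows and orders them Critical, Major, Minor. It is an offline illustration only; the row pattern is an assumption based on the four-column layout shown above (timestamp, source, severity, description), and the sample rows are taken from alarm examples in this chapter.

```python
import re
from typing import NamedTuple


class Alarm(NamedTuple):
    timestamp: str
    source: str
    severity: str
    description: str


# Assumed row layout: "Mon DD HH:MM:SS.mmm  source  severity  description"
ROW_RE = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:.]+)\s+(\S+)\s+(Minor|Major|Critical)\s+(.+)$"
)
SEVERITY_RANK = {"Critical": 0, "Major": 1, "Minor": 2}


def parse_alarms(text):
    """Return the alarms found in the text, most severe first."""
    alarms = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            alarms.append(Alarm(*match.groups()))
    # sorted() is stable, so alarms of equal severity keep their original order.
    return sorted(alarms, key=lambda alarm: SEVERITY_RANK[alarm.severity])


SAMPLE = """\
Timestamp            Source  Severity  Description
-------------------------------------------------------------------------
Dec 15 09:51:25.466  PM1     Minor     Input Failure - Feed B
Dec 15 14:44:08.197  2       Critical  Card Missing
Sep  7 15:50:25.421  3/1     Major     Link down
"""

for alarm in parse_alarms(SAMPLE):
    print(alarm.severity, alarm.source, alarm.description)
```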
1.9
System Alarms
The show system alarm command displays system, card, and port alarms.
It displays alarms for the chassis, line cards, or Smart Services Cards (SSCs) and,
optionally, for specific ports, controller cards, alarm cards, power modules,
fan trays, and switch fabric components.
›
Please note the “all” option!
To emulate this:
• Configure a card that is not inserted
[local]Train-1# show system alarm
Timestamp            Type  Source  Severity  Description
--------------------------------------------------------------------------------
Dec 15 14:44:08.197        2       Critical  Card Missing
Dec 15 09:51:25.466        PM1     Minor     Input Failure - Feed B
Dec 15 09:51:25.499        PM2     Minor     Input Failure - Feed B
Dec 15 09:26:09.550        5       Minor     Filesystem Full
[local]Train-1# show system alarm ?
  ALSW1..ALSW2               Display active alarms for specified ALSW slot
  FT1..FT2                   Display active alarms for specified FT slot
  PM1..PM8                   Display active alarms for specified PM slot
  RPSW1..RPSW2               Display active alarms for specified RPSW slot
  SW1..SW4                   Display active alarms for specified SW slot
  chassis                    Display active chassis alarms
  slot/port:ch:sub[:subsub]  Display active alarms for specified LC slot,
                             port, and channel numbers
  |                          Output Modifiers
  <cr>
[local]Train-1#
Figure 2-12: System alarms
1.10
System Alarm with Options
[local]Train-1# show system alarm chassis
Timestamp            Source  Severity  Description
--------------------------------------------------------------------------------
(No alarms)
[local]Train-1# show system alarm ALSW1
Timestamp            Source  Severity  Description
--------------------------------------------------------------------------------
(No alarms)
[local]Train-1# show system alarm PM1
Timestamp            Source  Severity  Description
--------------------------------------------------------------------------------
Sep  5 18:18:19.773  PM1     Minor     Input Failure - Feed B
(Power Module alarm)
[local]Train-1# show system alarm 3/1
Timestamp            Source  Severity  Description
--------------------------------------------------------------------------------
Sep  7 15:50:25.421  3/1     Major     Link down
(Port alarm)
Figure 2-13: System Alarm with Options, Examples
1.11
Example: Initiating Major System Alarm
If you have access to a chassis it is easy to create Major and Critical alarms. To
create a Major alarm configure a port that has no active cable connection and
execute the ‘no shutdown’ command.
This will then generate a major alarm because, as far as the SSR is concerned, a
link that should be up has no active connection. The Alarm for Link Down is
raised as shown.
› System alarm can be generated based on
configuration mistakes by administrator
Port with no cable connected
[local]Ericsson(config)# port ethernet 1/19
[local]Ericsson(config-port)# no shutdown
[local]Ericsson(config-port)# end
[local]Ericsson# show system alarm
Timestamp            Source  Severity  Description
---------------------------------------------------------------------
Jun 13 00:19:50.940  1/19    Major     Link down
› Example solutions:
– Port not activated (by default): enter “no shutdown”
– Missing cable: connect cable
– Other end missing config: Configure the port on the other end
– Wrong port configured: configure correct port
Figure 2-14: Example: Initiating Major System Alarm
1.12
Example: Initiating Critical System Alarm
It is also easy to create a Critical alarm. To do this, simply configure a card that is
not physically inserted in the chassis.
This will then generate a Critical alarm because as far as the SSR is concerned
there is a card that should be present that cannot be detected and the card may be
malfunctioning or may have been wrongly removed.
This alarm can be a nuisance if you configure a card before inserting it
into the chassis and do not want a Critical alarm to be generated. In this case,
the card configuration command ‘deactivate’ lets you keep the configuration of
the card without a Critical alarm being raised, because the SSR is aware that
the card is not activated yet. As you can see, this clears the alarm
previously raised.
[local]Ericsson(config)# card ge-40-port 17
[local]Ericsson(config-card)# end
(No card present in slot 17)
[local]Ericsson# show system alarm
Timestamp            Source  Severity  Description
-------------------------------------------------------------------------
Jun 13 00:23:45.080  17      Critical  Card Missing
› Solution:
[local]SR1-1(config)#card ge-40-port 17
[local]SR1-1(config-card)#deactivate
[local]SR1-1(config-card)#end
[local]SR1-1#sh sys alarm
Timestamp            Source  Severity  Description
-------------------------------------------------------------------------
Figure 2-15: Example: Initiating Critical System Alarm
1.13
System Hardware LED
Alarms that are generated are physically evident on the alarm card LEDs. The image
shows the alarm card of a chassis with both Minor and Critical alarms
currently raised.
(Image callouts: Critical alarm, Minor alarm)
Figure 2-16: System Hardware LED
The ALSW and ALSW-T alarm cards contain alarm management functionality,
timing/synchronization support, an instance of the central switch fabric, and an
internal management control plane. SSR 8000 Series must have at least one alarm
card, but may be equipped with two cards of the same type for redundancy.
The alarm cards are responsible for:
• Visual alarm management
• Synchronization timing distribution
• External T1/E1 BITS inputs through RJ-48 connectors with wiring 1,2 and 4,5
• Switch fabric (along with the other half-height cards)
1.14
Card Powered Down
Note that in the output shown, some of the power module outputs have been
omitted for simplicity.
We can see that the status of the card in slot 1 is OK, while the card in slot 12
is powered down. The reason is that card 12 is inserted in the chassis but has
not yet been configured by the administrator, as shown by the output of “show
configuration card 12”.
[local]Ericsson# show hardware
Slot  Type                 Serial No      Rev     Mfg Date    Payload
----- -------------------- -------------- ------- ----------- -------
N/A   backplane            CF90000CM9     R2G     29-MAY-2012 N/A
...
1     ge-40-port           CF90000BGX     R2H     27-DEC-2011 OK
12    ge-40-port           CF90000BYT     R2H     02-JAN-2012 Power D
(Card 12 Powered Down)
[local]SR1-1# show configuration card 12
Building configuration...
Current configuration:
!
end
Figure 2-17: Card Powered Down
1.15
System Storage Verification
Figure 2-18: System storage Verification
1.16
System Storage
The RPSW and RPSW-V2 (Controller) cards contain both the control processor and an
instance of the central switch fabric. The difference between these two controller
cards is the amount of internal storage. An SSR 8000 Series chassis must
have at least one controller card.
› Each RPSW card has internal storage media to store the
operating system, configuration files, and other system
files.
› The SSR has two internal 16 GB disks on each RPSW:
16 GB (flash) + 16 GB (md)
Internal storage:
– Three partitions on the first disk:
› p01, p02 (8 GB): system boot partitions that store
operating system image files (active partition, alternate
partition)
› /flash (8 GB): primarily used for storing and managing
configuration files
– One partition on the second disk:
› /md (16 GB): all kernel and application core files and
log files
› Optional: external USB storage for transferring software
images, logs, configuration files
› Each line card has one 2 GB internal storage disk that is
partitioned in four parts: /p01, /p02, /flash, and /var
(/var/md).
Figure 2-19: System storage
1.17
System Storage Verification
Show disk: displays status for the internal storage partitions and optional USB
mass-storage devices.
› Verification of errors and free space within storage media
show disk [card {slot-id | all}] [internal | external] [detail]
– Displays status for the internal storage partitions and optional USB
mass-storage devices.
Figure 2-20: System storage verification
1.18
System Storage Verification: Example
Use the show disk command to display status for the internal storage partitions
and an external USB storage device. The command also displays the soft and
hard error count for the system storage.
[local]Train-1# show disk internal detail
Manufacturer  : SMART (/dev/sda)
Model         : eUSB
Serial Number : 1E884210130309181111
Manufacturer  : SMART (/dev/sdb)
Model         : eUSB
Serial Number : 3E3B2F12132259181111
Filesystem    1k-blocks     Used Available Use% Mounted on
rootfs          3969036  1726740   2042260  46% /
/dev/sda2       3969068   916612   2852420  24% /p02
/dev/sdb1      15604376  1307320  13510644   9% /var
/dev/sda3       7665864   150644   7128880   2% /flash
(Disk usage: not full, OK)
Figure 2-21: System storage verification: Example
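The Use% column is the quickest indicator of a partition that is filling up. The Python sketch below scans saved show disk output and reports partitions above a threshold. It is an offline illustration: the function name partitions_over and the 80% default threshold are assumptions chosen for this example, not Ericsson recommendations. The sample rows are the ones from the figure above.

```python
def partitions_over(show_disk_output, threshold_percent=80):
    """Return (mount point, use%) pairs whose usage exceeds the threshold."""
    flagged = []
    for line in show_disk_output.splitlines():
        fields = line.split()
        # Data rows have exactly six fields:
        # filesystem, 1k-blocks, used, available, use%, mount point.
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue
        percent_text = fields[4][:-1]
        if not percent_text.isdigit():
            continue
        percent = int(percent_text)
        if percent > threshold_percent:
            flagged.append((fields[5], percent))
    return flagged


SAMPLE = """\
Manufacturer  : SMART (/dev/sda)
Filesystem    1k-blocks     Used Available Use% Mounted on
rootfs          3969036  1726740   2042260  46% /
/dev/sda2       3969068   916612   2852420  24% /p02
/dev/sdb1      15604376  1307320  13510644   9% /var
/dev/sda3       7665864   150644   7128880   2% /flash
"""

print(partitions_over(SAMPLE))        # nothing above the assumed 80% threshold
print(partitions_over(SAMPLE, 40))    # a stricter threshold flags the root filesystem
```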
show disk internal Example
The following example displays status for the internal storage partitions on
the active controller card.
[local]Ericsson> sh disk internal detail
Manufacturer  : SMART (/dev/sda)
Model         : eUSB
Serial Number : SPG124600L3
Manufacturer  : SMART (/dev/sdb)
Model         : eUSB
Serial Number : SPG124600KU
Filesystem    1k-blocks     Used Available Use% Mounted on
rootfs          3872856  2894516    783152  79% /
/dev/sda2       3880920  2724900    960432  74% /p02
/dev/sdb1      15499740   646580  14072004   4% /var
/dev/sda3       7745836   179680   7175780   2% /flash
/dev/md0       31047684   176260  29306704   1% /opt/disk
show disk external Example
The following example displays status for the USB mass-storage device in the
USB port of the active controller card.
[local]Ericsson>sh disk external
Filesystem    1k-blocks     Used Available Use% Mounted on
/dev/sdc1       2038464  1652032    386432  82% /media/flash
show disk external detail Example
The following example displays status for the USB device.
[local]Ericsson>sh disk external detail
Manufacturer : Generic
Model        : Mass Storage
Seial Num.   : BEE2D3C5
Filesystem    1k-blocks     Used Available Use% Mounted on
/dev/sdc1       2038464  1652032    386432  82% /media/flash
2
Chapter Summary
After this course the participant should be able to:
› Understand the Operational Health of the SSR System
› Describe the basic RPSW health checks
› Explain the details of control plane interface
› Understand system storage
Figure 2-22: Chapter Summary
3 Fundamental Concept of Processes Architecture on the System
Chapter Objectives
After this course the participant will be able to:
› Describe the Fundamental Concept of Processes
Architecture on the System
› Describe the SSR software architecture and system
processes
› Understand the concept of manual core dump
› Identify the different types of processes in SSR
› Work with Core Dumps of Faulty Processes
Figure 3-1: Chapter Objectives
1
Process Architecture
The SSR Operating System has a Modular system design.
All functions and protocols are split into separate processes each running in their
own protected memory space. As a result, failure of one protocol does not affect
other protocols. Each process can be stopped and restarted individually,
minimizing the impact on the overall system in case a failure occurs.
For example, if the OSPF process fails, only updates to the OSPF routes will be
temporarily affected, while all other protocols will continue to function.
PM checks heartbeat messages to monitor the health of the system
CSM: Specifies the Controller State Manager (CSM) process
‘Hub’ processes
• ISM: Monitors and broadcasts the state of all interfaces, ports, and
circuits in the system.
• RCM: Controls all system configurations using a transaction-oriented
database.
‘Spoke’ processes
Figure 3-2: Process Architecture
An important process to note is the Process Manager, sometimes known as the
“God Process”. The function of the Process Manager is to monitor all active
processes in its respective hardware component. Active processes send a
heartbeat to the PM. If the heartbeat is lost to a process, PM will restart this
process. This is all done automatically, and the process is restored with minimal
impact to the system. The huge benefit is that the whole system does not need to
be reloaded when individual processes fail.
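The heartbeat principle can be modelled in a few lines. The Python sketch below is a conceptual model only and does not reflect the actual Process Manager implementation: the class name, the timeout value, and the use of explicit timestamps are all assumptions made so that the example is deterministic. Each process reports a heartbeat, and any process whose last heartbeat is older than the timeout is restarted on its own, without touching the others.

```python
class ProcessManager:
    """Toy model of a heartbeat monitor (not the real PM implementation)."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_heartbeat = {}   # process name -> time of last heartbeat
        self.restart_count = {}    # process name -> number of restarts

    def heartbeat(self, name, now):
        """Record a heartbeat from a process at time 'now' (in seconds)."""
        self.last_heartbeat[name] = now
        self.restart_count.setdefault(name, 0)

    def check(self, now):
        """Restart every process whose heartbeat is overdue; return their names."""
        restarted = []
        for name, seen in self.last_heartbeat.items():
            if now - seen > self.timeout:
                self.restart_count[name] += 1
                # A restarted process starts with a fresh heartbeat timer.
                self.last_heartbeat[name] = now
                restarted.append(name)
        return restarted


pm = ProcessManager(timeout_seconds=3.0)
pm.heartbeat("ospf", now=0.0)
pm.heartbeat("ism", now=0.0)
pm.heartbeat("ism", now=4.0)       # ism keeps reporting, ospf has gone silent
print(pm.check(now=5.0))           # only the silent process is restarted
```

In this model only the silent process is restarted while the other continues untouched, which mirrors the benefit described above: the whole system does not need to be reloaded.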
1.1
RPSW Processes
The Ericsson IP Operating System is a set of interacting software modules that
are common to all Ericsson platforms running the operating system (OS).
The operating system provides general interfaces for configuring and interacting
with the system that are OS-independent, such as the Command Line Interface
(CLI), Simple Network Management Protocol (SNMP), and console logs. Even
OS-specific information, like lists of processes and counters, is displayed through
the CLI in an OS-independent way.
Most features and protocols have their own separate process. Each process has
its own protected memory space.
Implementing the major software components as independent processes allows a
particular process to be stopped, restarted, and upgraded without reloading the
entire system or individual traffic cards. In addition, if one component fails or is
disrupted, the system continues to operate.
› Most features and protocols have their own separate process.
› Each process has its own protected memory space.
› There are also quite a few internal system processes:
– ISM (Interface and Circuit State Manager): manages the configuration and state of all ports, circuits, and interfaces in the system, and is responsible for distributing this information through the system.
– CSM (Card Slot Module/Connection State Manager): manages all card and port configuration and state. Communicates with VxW and relays state information to ISM.
– RCM (Router Config Module): all configuration is managed by this process using a transaction-oriented database. Each process has agent manager code running in this process.
Figure 3-3: RPSW Processes (1-3)
Card State Manager (CSM)
The Card State Manager (CSM) is a back-end process corresponding to card and
port management. It relays card and port events to other back-end processes, such
as ISM.
CMA abstracts the details of the chassis so that the other software that is involved
in chassis management (mostly CSM) can be generic and portable to other
chassis architectures and types. This is not a separate process but rather a library
that is linked with the process that needs the abstraction.
Interface and Circuit State Manager (ISM)
Interface and Circuit State Manager (ISM) monitors and disseminates the state of
all interfaces, ports, and circuits in the system. ISM is the common hub for event
messages within the system.
When ISM receives an event, it marks the event as received and passes the event
to interested clients. ISM tries to not send duplicate events to a client, but if it
does, a client must handle the duplication. ISM sends events in a specific order,
starting first with circuit events and followed by interface events in
circuit/interface order. All circuit delete events are sent before any other circuit
events, and all interface delete events are sent before any other interface events.
This order is to ensure that deleted nodes are removed from the system as quickly
as possible, because they might interfere with other nodes trying to take their
place.
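The ordering rule above (circuit events before interface events, and delete events first within each group) can be sketched as a stable sort. This is a minimal illustration, not ISM code: the `Event` type, its field names, and the sample event names are hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    kind: str    # "circuit" or "interface"
    action: str  # for example "create", "delete", "up", "down"
    target: str  # hypothetical name of the circuit or interface


KIND_RANK = {"circuit": 0, "interface": 1}


def delivery_order(events):
    """Order events as described: circuits first, deletes first within each kind.

    sorted() is stable, so events that tie on the key keep their arrival order.
    """
    return sorted(events, key=lambda e: (KIND_RANK[e.kind], e.action != "delete"))


pending = [
    Event("interface", "up", "if-a"),
    Event("circuit", "create", "cct-2"),
    Event("interface", "delete", "if-b"),
    Event("circuit", "delete", "cct-1"),
]
ordered = delivery_order(pending)
for event in ordered:
    print(event.kind, event.action, event.target)
```

Sending the deletes first means a stale circuit or interface is gone before a replacement with the same identity is announced.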
Router Configuration Manager (RCM)
The Router Configuration Manager (RCM) controls all system configurations
using the configuration database.
The RCM engine is responsible for initializing all component managers and for
maintaining the list of all backend processes for communication. The set of
managers and backend processes is set at compile time. The registration of
manager to backend daemons occurs during RCM initialization, and each
manager is responsible for notifying the RCM engine with which backend
processes it communicates.
The RCM engine provides a session thread for processing any connection
requests from the interface layer. When a new interface layer component (CLI,
NetOpd, and so on) wants to communicate through the DCL to RCM, it starts a
new session with the RCM engine. Each session has a separate thread in RCM for
processing DCL messages. Because the RCM managers are stateless, the threads
only have mutual exclusion sections within the configuration database. Each
session modifies the database through a transaction. These transactions provide
all thread consistency for the RCM component managers.
The RCM has many other threads. These threads are either dynamically spawned
to perform a specific action or they live for the entire life of the RCM process.
Process Manager (PM)
The Process Manager (PM) monitors the health of every other process in the
system. The PM is the first process started when the system boots. It starts all the
other processes in the system. The list of processes to be started is described in a
text file that is packaged with the software distribution. PM also monitors the
processes and, if any process dies or appears to be stuck, it starts a new instance
of the process.
Reliable Database (RDB)
The configuration database, also known as the Reliable Database (RDB), is a
transactional database that maintains multiple transactions and avoids.
By using the combination of transactions and locks, the configuration database
maintains the ACID transactional properties: atomicity, consistency, isolation,
and durability. A database must maintain each property to ensure that data does
not get corrupted. Because every operation within the database occurs as one
atomic operation (atomicity), multiple users can interact with the system at the
same time (isolation) and the database remains recoverable (durability). The
database must also provide facilities that allow a user to easily ensure the
accuracy of data within the database (consistency).
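To make the atomicity property concrete, the sketch below uses SQLite from the Python standard library as a stand-in. It is not the RDB and says nothing about how the RDB is implemented; the table layout, the keys, and the `apply_changes` helper are invented for the illustration. The point is only that a group of changes is applied completely or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value TEXT)")
with conn:
    conn.execute("INSERT INTO config VALUES ('hostname', 'Train-1')")


def apply_changes(conn, changes):
    """Apply all (key, value) changes in one transaction, or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for key, value in changes:
                if value is None:
                    raise ValueError(f"missing value for {key}")
                conn.execute(
                    "INSERT OR REPLACE INTO config VALUES (?, ?)", (key, value)
                )
        return True
    except ValueError:
        return False


# The second change is invalid, so the first one must not survive either.
ok = apply_changes(conn, [("hostname", "Train-2"), ("mtu", None)])
hostname = conn.execute(
    "SELECT value FROM config WHERE key = 'hostname'"
).fetchone()[0]
print(ok, hostname)
```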
› Internal system processes continued:
– PM (Process Manager): the “God process”. Monitors and maintains the health of all processes in the system. Checks for periodic ‘heartbeat’ messages from each process to monitor their health. That is, if PM does not receive a heartbeat from a process, it initiates a core dump and a restart of that process.
– RIB (Routing Information Base): stores the main routing table.
– RDB (Reliable Database): shared memory where all static configuration is stored.
Figure 3-4: RPSW Processes (2-3)
› Process communication: all communication between processes is done using an Ericsson proprietary, highly optimized, reliable IPC:
– Stands for ‘Inter-Process Communication’
– Built on top of TCP
– Monitors the health of the system
– PM uses IPC to restart processes that are not responding
› RPSW to line card: communication between BSD processes and PPA is done with IPC.
› Hub-and-spoke architecture: some processes, such as ISM, RIB, AAA, and RPM, are server processes that talk to several client processes.
Figure 3-5: RPSW Processes (process communication) (3-3)
1.2 Process Scheduling
› Processes are scheduled to allow efficient use of shared CPU resources.
› Without scheduling, one or two busy processes could ‘starve’ all other processes, leading to harmful effects on the system.
› The ‘Run Queue’ is a metric used to measure the busyness of the system.
› Internally, the Run Queue is the number of processes, at any given time, that are waiting to be serviced by the CPU.
Figure 3-6: Process Scheduling (processes P1 and P2 queued for the CPU; the number of processes in the queue is the Run Queue)
1.3 RPSW Processes Verification
[local]Train-1# show process
Load Average : 1.40 1.32 1.27

NAME        PID  SPAWN  MEMORY  TIME         %CPU   STATE   UP/DOWN
csm          26      1   6616K  00:00:22.18  0.00%  run     05:35:42
rcm          27      1  13924K  00:00:07.51  0.00%  run     05:35:35
ism          28      1   4748K  00:00:05.95  0.00%  run     05:35:34
ped_parse    29      1   3676K  00:00:03.27  0.00%  run     05:35:33
rpm          30      1   3276K  00:00:02.70  0.00%  run     05:35:33
rib          31      1   4020K  00:00:04.61  0.00%  run     05:35:32
ntp           0      0      0K  Not Avail    0.00%  demand  05:37:29
arp          32      1   3492K  00:00:03.31  0.00%  run     05:35:31
static        0      0      0K  Not Avail    0.00%  demand  05:37:29
isis          0      0      0K  Not Avail    0.00%  demand  05:37:29
rip           0      0      0K  Not Avail    0.00%  demand  05:37:29
bgp           0      0      0K  Not Avail    0.00%  demand  05:37:29
igmp          0      0      0K  Not Avail    0.00%  demand  05:37:29
pim           0      0      0K  Not Avail    0.00%  demand  05:37:29
ospf          0      0      0K  Not Avail    0.00%  demand  05:37:29
sysmon       33      1   3860K  00:00:03.32  0.00%  run     05:35:31
---more---

› Load Average: run queue averaged over 5 seconds, 1 minute, and 5 minutes.
› UP/DOWN: time the process has been up or down, according to its state.
› State = run: active. State = stop: stopped. State = demand: sleeping.
› Spawn = 0 or 1: good news. Spawn > 1: the process has been restarted.
Figure 3-7: RPSW processes verification
show process: displays the current status of one or all processes running on the system.
show process [proc-name] [{crash-info | detail}]
Syntax Description
proc-name    Optional. Process for which you want to display information.
crash-info   Optional. Specifies that process crash information is to be monitored.
detail       Optional. Specifies that detailed process information is to be displayed.
We can examine all processes in the demand state. A process in the demand state
is waiting to be started by the configuration, because the corresponding feature
has not been configured on the SSR. For example, if IS-IS is not configured on
the SSR, the isis process is available to run but has not been started yet.
Spawn: generate
ISM (Interface State Manager): configuration and state of all ports, circuits and
interfaces
CSM (Card Slot Module/Connection State Manager)
RCM (Router Config Module)
RDB (Reliable dataBase)
monitor process: monitors the status of a process and provides continuous
updates. Enter this command in exec mode.
The UP/DOWN field shows how long the process has been in its current state,
whether up or down.
It is important not to confuse the UP/DOWN field with the TIME field. The TIME
field represents the total CPU time used by the process since it started.
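When the output is captured to a file, the same checks can be scripted. The sketch below parses a few rows in the column layout shown in this chapter and flags processes that are stopped or have a spawn count above 1. The sample rows are taken from figures in this chapter; the function names are hypothetical, and the parser assumes whitespace-separated columns with "Not Avail" as the only value containing a space.

```python
SAMPLE = """\
NAME       PID  SPAWN  MEMORY  TIME         %CPU    STATE   UP/DOWN
csm         26      1   6616K  00:00:22.18   0.00%  run     05:35:42
ntp          0      0      0K  Not Avail     0.00%  demand  05:37:29
aaad       288      3   6464K  00:00:00.12  11.00%  run     00:00:01
ospf         0      1      0K  Not Avail     0.00%  stop    00:00:03
"""


def parse_show_process(text):
    """Turn 'show process' style rows into dictionaries (assumed layout)."""
    rows = []
    for line in text.splitlines():
        # "Not Avail" is the only column value with a space; join it first.
        fields = line.replace("Not Avail", "NotAvail").split()
        if len(fields) != 8 or fields[0] == "NAME":
            continue  # skip the header, prompts, and the load average line
        name, pid, spawn, memory, cpu_time, cpu_pct, state, up_down = fields
        rows.append({
            "name": name,
            "pid": int(pid),
            "spawn": int(spawn),
            "memory": memory,
            "time": cpu_time,        # total CPU time used
            "cpu": float(cpu_pct.rstrip("%")),
            "state": state,
            "up_down": up_down,      # time spent in the current state
        })
    return rows


def needs_attention(rows):
    """Processes that are stopped or have been restarted at least once."""
    return [r for r in rows if r["state"] == "stop" or r["spawn"] > 1]


rows = parse_show_process(SAMPLE)
flagged = needs_attention(rows)
print([r["name"] for r in flagged])
```

Processes in the demand state are not flagged, because they are simply not configured.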
1.4 Finding CPU Intensive Processes
› Finding processes claiming 10% or more of CPU resources:
[local]Train-1# show proc | grep option '-E' '[1-9][0-9]{1,2}\...%'
rcm    254  2  13272K  00:00:00.38  17.53%  run  00:00:04
hr     255  1   3400K  00:00:00.09  14.61%  run  00:00:02
› Combined with a macro, it is easy to use:
[local]Train-1(config)# macro exec highload
[local]Train-1(config-macro)# seq 10 show clock
[local]Train-1(config-macro)# seq 20 show proc | grep option '-E' '[1-9][0-9]{1,2}\...%|NAME'
[local]Train-1(config-macro)# end
[local]Train-1# highload
[10] (highload)# show clock
Tue Dec 15 13:06:50 2015 UTC
[20] (highload)# show proc | grep option '-E' '[1-9][0-9]{1,2}\...%|NAME'
NAME   PID  SPAWN  MEMORY  TIME         %CPU    STATE  UP/DOWN
aaad   288  3       6464K  00:00:00.12  11.00%  run    00:00:01
[local]Train-1#
Figure 3-8: Finding CPU intensive processes
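The extended regular expression used above reads as follows: one digit from 1 to 9, then one or two more digits, a literal dot, any two characters, and a percent sign. In other words, it matches percentages with at least two digits before the decimal point. The same pattern can be tried offline with Python's `re` module, which treats this expression the same way as `grep -E`; the sample lines below are adapted from this chapter and the helper name is hypothetical.

```python
import re

# Same expression as in the grep command: 10.00% up to 999.99%.
HIGH_CPU = re.compile(r"[1-9][0-9]{1,2}\...%")


def is_high_cpu(line):
    """True if the line contains a CPU percentage of 10% or more."""
    return HIGH_CPU.search(line) is not None


lines = [
    "rcm    254  2  13272K  00:00:00.38  17.53%  run  00:00:04",
    "csm     26  1   6616K  00:00:22.18   0.00%  run  05:35:42",
    "aaad   480  1   6932K  00:00:15.83   7.00%  run  13:48:18",
]
matches = [line.split()[0] for line in lines if is_high_cpu(line)]
print(matches)
```

Note that the TIME column (for example 00:00:22.18) does not trigger a match, because the pattern requires the percent sign two characters after the dot.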
1.5 Single Process Verification
show process: displays the current status of one or all processes running on the
system.
show process [card {slot | RPSW1 | RPSW2 | standby | ALSW1 | ALSW2
|all}] [proc-name] [crash-info | detail]
slot         Optional. Displays process information for the traffic card installed in the specified slot.
RPSW1        Optional. Displays process information for the controller card in slot RPSW1.
RPSW2        Optional. Displays process information for the controller card in slot RPSW2.
standby      Optional. Displays process information for the controller card running in standby mode.
ALSW1        Optional. Displays process information for the ALSWT card in slot ALSW1.
ALSW2        Optional. Displays process information for the ALSWT card in slot ALSW2.
all          Optional. Displays process information for all traffic cards installed in the router.
proc-name    Optional. Name of the process for which to display information.
crash-info   Optional. Monitors process crash information.
detail       Optional. Displays detailed process information.
› Retrieving specific process information
[local]Train-1# show process ism
NAME  PID  SPAWN  MEMORY  TIME         %CPU   STATE  UP/DOWN
ism   282  2       4356K  00:00:00.20  0.00%  run    00:02:04
[local]Train-1#
› Details can be retrieved for each process; see the next section.
Figure 3-9: Single process verification
1.6 Single Process in Detail
The “crash-info” keyword displays process crash information.
[local]Train-1# show process ism detail
Process (PID)       : ism (282)
Spawn count         : 2
Memory              : 4356K
Time                : 00:00:00.23
%CPU                : 0.00%
State               : run
Up time             : 00:02:58
Heart beat          : Enabled
Spawn time          : 2 seconds
Max crashes allowed : 5
Crash thresh time   : 86400 seconds
Total crashes       : 0
Fast restart        : DISABLED
Images: (Spawns, Max spawns, Version, Path)
(*) 2, 3, v1, /usr/siara/bin/ism2
Client IPC Endpoints:
EP 7f000206 f0bc0008 - L2TP-ISM-EP-NAME:00000000
EP 7f000206 f0bc000c - L2TP-ISM-EP-NAME:00000000
EP 7f000206 f0bc0008 - PPPOE-ISM-EP-NAME:00000000
EP 7f000206 f0bc0008 - LM-IPC-ISM-EP-NAME:00000000
---
Server IPC Endpoints:
EP 7f000206 f0bc000c - ISM2-MBE-EVIN-EP-NAME:00000000
Dependent process aaad (288) EP 7f000206 f53c000a
Dependent process ppp (222) EP 7f000206 d6ad0007
Dependent process EPPA IPC SLOT 3 (-2130509824) EP 7f000a43 00000013
Dependent process IPPA IPC SLOT 3 (-2147287040) EP 7f000a03 00000015
Dependent process EPPA IPC SLOT 1 (-2130640892) EP 7f000a41 00040013
---
Figure 3-10: Single process in detail
1.7 Single Process Verification - ISM
[local] Ericsson# show process ism detail
Process (PID)       : ism (3615)
Spawn count         : 1             <- Process has not had to be restarted
Memory              : 8688K
Time                : 00:00:30.13
%CPU                : 0.01%
State               : run
Up time             : 2d18h         <- When did it restart?
Heart beat          : Enabled       <- PM controls health of the process
Spawn time          : 2 seconds
Max crashes allowed : 5
Crash thresh time   : 86400 seconds
Total crashes       : 0             <- Process has not crashed
Fast restart        : DISABLED
                                    <- No “Last Exit Status” shown
Figure 3-11: Single Process Verification – ISM
For each process we can retrieve detailed information using the “detail” keyword:
› A spawn count equal to 1 means that the process has never been restarted.
› From the “Up time” field we can derive when the process last started.
› “Heart beat: Enabled” means that the Process Manager is monitoring the health of the process by means of heartbeats.
› We can see how many times the process has crashed. In this case the total number of crashes is zero.
› Since the process has never restarted, no “Last exit status” message is shown.
1.8 Single Process Verification - OSPF
[local] Ericsson# show process ospf detail
Process (PID)       : ospf (23251)
Spawn count         : 2             <- Process has had to be restarted
Memory              : 5364K
Time                : 00:00:00.93
%CPU                : 0.27%
State               : run
Up time             : 00:13:36
Heart beat          : Enabled
Spawn time          : 2 seconds
Max crashes allowed : 5
Crash thresh time   : 86400 seconds
Total crashes       : 0             <- Process has not crashed
Fast restart        : DISABLED
Last exit status    : Kill (9)      <- Process was killed manually
Figure 3-12: Single Process Verification – OSPF
As another example, let’s look at the detailed information for the ospf process.
A spawn count equal to 2 means that the process has restarted once.
In this case the process has never crashed, which means that the restart was
caused by something else, for example manual intervention.
We can derive the reason why the process restarted by analyzing the “Last exit
status” message.
In this case, since the process has restarted, the “Last exit status” is shown. Exit
status “kill” means that the process was manually killed.
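The numbers in parentheses in this chapter (9, 15, and 133) look like classic Unix wait-status values: the low seven bits carry the signal number and bit 0x80 indicates that a core file was written. That interpretation is an assumption made for this sketch, not something the command reference states, and the helper name is hypothetical. Under that assumption the values decode as follows on a Unix-like Python installation.

```python
import signal


def decode_exit_status(status):
    """Split a raw wait-status style value into (signal name, core dumped).

    Assumes the Unix convention: bits 0-6 = signal number, bit 7 = core flag.
    """
    signum = status & 0x7F
    core_dumped = bool(status & 0x80)
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = f"signal {signum}"  # unknown on this platform
    return name, core_dumped


for raw in (9, 15, 133):
    print(raw, decode_exit_status(raw))
```

Read this way, “Kill (9)” is a SIGKILL without a core file, which fits a manual kill, while “Trap (133)” is a SIGTRAP that also produced a core dump.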
1.9 Maximum Crashes Allowed
If a process keeps crashing, there is a problem that needs to be dealt with. For
this reason there is a defined limit on the number of crashes allowed. The limit
is visible in the output of the show process detail command, where the “Max
crashes allowed” field gives the limit and the “Crash thresh time” field gives
the period it applies to.
[local] Ericsson# show process ism detail
Process (PID)       : ism (3615)
Spawn count         : 1
Memory              : 8688K
Time                : 00:00:30.13
%CPU                : 0.01%
State               : run
Up time             : 2d18h
Heart beat          : Enabled
Spawn time          : 2 seconds
Max crashes allowed : 5             <- Limit on number of crashes allowed
Crash thresh time   : 86400 seconds    (does not apply to manual restarts)
Total crashes       : 0
Fast restart        : DISABLED
› The process is allowed to crash a maximum of five times in 86400 s, after which it will not be restarted.
Figure 3-13: Maximum Crashes Allowed
The default values are 5 crashes in 86400 seconds, which is 1 day. This means
that if a process crashes and is forced to restart 5 times in a day, it is not
restarted again.
It is important to note that this limit only applies to actual crashes; manual
restarts have no effect on it. In other words, a manual restart does not count as
a crash.
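The crash limit can be pictured as a counter over a time window. The sketch below is one possible reading of the rule, with the default values from the output above. Whether the system uses a sliding window, and whether the fifth crash itself is the one that is no longer restarted, are assumptions of this sketch; the class name is hypothetical.

```python
from collections import deque


class CrashLimiter:
    """Decide whether a crashed process may be restarted (illustrative only)."""

    def __init__(self, max_crashes=5, window_seconds=86400):
        self.max_crashes = max_crashes
        self.window_seconds = window_seconds
        self.crash_times = deque()

    def record_crash(self, now):
        """Record a crash at time `now` (seconds); True means restart is allowed."""
        self.crash_times.append(now)
        # Forget crashes that fell out of the threshold window.
        while now - self.crash_times[0] >= self.window_seconds:
            self.crash_times.popleft()
        return len(self.crash_times) < self.max_crashes


limiter = CrashLimiter()
decisions = [limiter.record_crash(t) for t in (0, 100, 200, 300, 400)]
print(decisions)
```

A manual restart never calls `record_crash`, so it does not move the process closer to the limit.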
1.10 Process Crash
› The following sequence of events occurs when a crash happens:
– Crash event
– Automatic core dump initiated
– Process restarted after core dump completed:
– Spawn-count increments
– Process restarts and initializes
› If the process keeps crashing, PM stops restarting it after 5 crashes (by default) within 86400 seconds (24 hours).
› This number can be changed with:
- process set <process> max-crashes
› The rule does not apply to manual restarts; the process will keep coming up indefinitely.
Figure 3-14: Process crash (1-2)
So what happens when a process crashes?
There is a predictable sequence of events when a process crashes. First, a core
dump is generated automatically. The core dump contains information about the
system state at the time of the crash, in particular the memory state of the
process in question. It is saved to disk for future analysis. After the core
dump is completed, the Process Manager attempts to restart the process, and the
spawn count we saw earlier increases.
Figure 3-15: What happens when a process crashes? (process crash, core dump written to /md, process restarted, spawn count incremented by 1)
1.11 Software Process Failure Scenario
The figure shows three panels, each with the active RPSW, the standby RPSW, and
the active ALSW. Each RPSW runs the Process Manager, CLI/SNMP/other, the config
process and database, BGP, multicast, PPP, OSPF, static, the Routing Information
Base, and the OS kernel.
1) Problem occurs in software (OSPF died)
• Only the individual process is affected
• All other processes continue to run
• All established connections remain up and forward traffic
2) Process is restarted (OSPF restart)
• Only the affected process is restarted
• Done completely automatically
• All established connections remain up and forward traffic
3) Process comes back up
• Process starts running again
• No RPSW switchover has to occur
• All established connections remain up and forward traffic
Figure 3-16: Software Process Failure Scenario
Let's take a closer look at what happens during a software process failure.
First, a problem occurs in a software process. For example, the OSPF process
dies.
Only the individual process is affected.
All other processes continue to run, and all established connections remain up and
forward traffic.
Next, the system detects that the OSPF process is down and automatically restarts
it.
Finally, the process comes back up and starts running again. No hardware failure
or switchover occurs.
Data forwarding continues non-stop on the system, and the only impact is the
short period when OSPF is unavailable to make changes to OSPF-specific
routes.
› A coredump is created when a process crashes. In this example the ppp process crashed:
[local]Train-1# show crashfiles
344146 Oct 28 06:03 /md/pppd_859.core
342952 Oct 28 06:00 /md/20111028_060011_pppd_859.core
343897 Oct 28 06:02 /md/20111028_060212_pppd_859.core
[local]Train-1#
› Coredumps are saved on /md:
[local]Train-1# dir /md
Contents of /md
-rw------- 1 root 0 343712 Oct 28 06:05 pppd_859.core
-rw------- 1 root 0 342952 Oct 28 06:00 20111028_060011_pppd_859.core
-rw------- 1 root 0 343897 Oct 28 06:02 20111028_060212_pppd_859.core
-rw------- 1 root 0 344146 Oct 28 06:03 20111028_060329_pppd_859.core
[local]Train-1#
Figure 3-17: Process crash (2-2)
process coredump: initiates a core dump of a process and saves it in a crash file.
Enter this command in exec mode.
service upload-coredump: crash files can be automatically uploaded to a remote
server.
1.12 System Stopped Processes
A process in the stop state is stopped and will not be restarted by the Process
Manager unless this is requested.
[local] Ericsson# process stop ospf
[local] Ericsson# show process ospf
NAME  PID    SPAWN  MEMORY  TIME         %CPU   STATE  UP/DOWN
ospf  0      1      0K      Not Avail    0.00%  stop   00:00:03
[local] Ericsson# process start ospf
[local] Ericsson# show process ospf
NAME  PID    SPAWN  MEMORY  TIME         %CPU   STATE  UP/DOWN
ospf  23251  2      5188K   00:00:00.01  0.00%  run    00:00:04
Figure 3-18: System Stopped Processes
To illustrate, we use the “process stop” command to stop the ospf process.
We can then use “show process ospf” to see the ospf process state. As expected,
the process is in the stop state. Note that the spawn count is one, meaning this
process has started once and is currently stopped.
If we start the ospf process again and then run “show process ospf”, we can see
that it is in the run state again. Note how the spawn count has increased to two,
in this case because of a manual restart.
› Check for core dumps on the system (core dumps are stored on /md):
[local]Train-1# show crashfiles
630974 Jun 6 04:19 /md/20120606_041954_netopd.3936.1338956394.SSR8020.core.gz
386124 May 9 16:37 /md/20120509_163729_ism2.3602.1336581449.Ericsson.core.gz
293762 Aug 1 01:40 /md/20120801_014055_pppd.22065.1343785255.SSR8020.core.gz
392714 Aug 1 01:41 /md/20120801_014104_pppd.22185.1343785264.SSR8020.core.gz
[local]Train-1#
› You can run a quick check over the core dump:
[local]Train-1# show process ppp crash-info
NAME  TIME                      STATUS
ppp   Wed Jul 4 16:02:46 2007   Trap (133)
ppp   Wed Jul 4 16:03:32 2007   Trap (133)
ppp   Wed Jul 4 16:04:35 2007   Software termination (15)
[local]Train-1#
Figure 3-19: Did a process crash? (1-2)
› Manually create a coredump:
[local]Train-1#process coredump [process name]
Please turn on heart-beat once coredump is complete.
[local]Train-1#process set aaad heart-beat on
› After sharing core dumps with Ericsson TAC, please clean up the /md directory to free sufficient space:
[local]Train-1# dir /md/*core
Contents of /md/*.core
-rw-r--r-- 1 root root 624241 Mar 27 13:13 20120327_131307_clsd.6154.1332853987.Ericsson.core.gz
-rw-r--r-- 1 root root 386124 May 9 16:37 20120509_163729_ism2.3602.1336581449.Ericsson.core.gz
[local]Train-1#
[local]Train-1# del crashfile /md/20120327_131307_*.core.gz
Are you sure you want to delete
/md/20120327_131307_clsd.6154.1332853987.Ericsson.core.gz ?y
[local]Train-1#
› There is an even more important reason for deleting old core dump files than disk space concerns.
Figure 3-20: Did a process crash? (2-2)
The show crashfiles command is used to display the size, location, and name of
any crash files located in the system. Files are placed in the /md partition in
internal storage. Crash files are used by technical support to determine the cause
of a system failure.
1.13 Old Core File on RP
› Imagine you log in to your RP in the morning and type:
[local]Train-1 # show crashfiles
11228124 May 5 11:40 /md/ribd_41.core
7628860 May 5 11:48 /md/arpd_28220.core
5514348 May 5 11:48 /md/arpd_11987.core
7410476 May 5 11:48 /md/loggd_2979.core
6157548 May 5 11:48 /md/arpd_12047.core
23848268 May 5 11:49 /md/rcm_39_sb.core
16987068 May 5 11:53 /md/snmpd_65.core
12412108 May 5 11:49 /md/bgpd_20293.core
16434108 May 5 11:50 /md/snmpd_15145.core
13014124 May 5 11:50 /md/bgpd_29315.core
16442300 May 5 11:50 /md/snmpd_15183.core
7583276 May 5 11:50 /md/arpd_24146.core
7062796 May 5 11:51 /md/arpd_20287.core
› So you panic, call TAC, ask for chassis replacement ;-) and so on….
Figure 3-21: Old core files on RP – BAD IDEA
1.14 Core Files – Copied between RP
› But if you check the system log you will find something very interesting:
May 5 11:48:44: %SYSLOG-6-INFO: ftpd[11019]: connection from 127.3.252.1
May 5 11:48:44: %SYSLOG-6-INFO: ftpd[11019]: FTP LOGIN FROM 127.3.252.1 as nobody
May 5 11:48:44: %SYSLOG-6-INFO: ftpd[11019]: put /md/arpd_28220.core = 7628860 bytes
May 5 11:48:44: %SYSLOG-6-INFO: ftpd[11022]: connection from 127.3.252.1
May 5 11:48:44: %SYSLOG-6-INFO: ftpd[11022]: FTP LOGIN FROM 127.3.252.1 as nobody
May 5 11:48:44: %SYSLOG-6-INFO: ftpd[11022]: put /md/arpd_11987.core = 5514348 bytes
May 5 11:48:44: %SYSLOG-6-INFO: ftpd[11025]: connection from 127.3.252.1
May 5 11:48:44: %SYSLOG-6-INFO: ftpd[11025]: FTP LOGIN FROM 127.3.252.1 as nobody
May 5 11:49:04: %SYSLOG-6-INFO: ftpd[11025]: put /md/loggd_2979.core = 7410476 bytes
› All core files have been copied from the standby RP after the switchover occurred.
› ftp does not preserve the original time stamp.
› We can check the original file dates on the standby RP.
Figure 3-22: Core files are copied between RPs
1.15 Core Dump Files on Standby RP
[local]Train-1# dir mate /md
Opening connection to mate...
total 4128
-rw-r--r-- 1 root 0 11228124 Sep 16 2010 ribd_41.core
-rw-r--r-- 1 root 0  7628860 May 30 2010 arpd_28220.core
-rw-r--r-- 1 root 0  5514348 Aug 6 2011 arpd_11987.core
-rw-r--r-- 1 root 0  7410476 Aug 7 2011 loggd_2979.core
--- cut ---
› The core files are very old and most likely from a different OS release.
› Leaving core files on the RP introduced problems:
– It increased the operator’s adrenaline level.
– It diverted the operator’s attention from the real issue: instead of investigating the real problem (the reason for the switchover), the operator started to investigate multiple process crashes.
› After sharing core dumps with TAC, please clean up the /md directory on both RPs!
[local]Train-1# del mate /md/<filename>
Opening connection to mate...
Figure 3-23: Core dump files on standby RP
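Because a copy between RPs does not preserve file dates, the file name itself is a useful second source. In the newer core file names shown in this chapter (for example 20120801_014055_pppd.22065.1343785255.SSR8020.core.gz), the long number matches the leading date and time when read as a Unix epoch value in UTC. The field layout in the sketch below is inferred from those examples only and is an assumption; the function name is hypothetical, and older names such as ribd_41.core carry no timestamp at all.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <YYYYMMDD_HHMMSS>_<process>.<pid>.<epoch>.<hostname>.core.gz
CORE_NAME = re.compile(
    r"^(?P<stamp>\d{8}_\d{6})_(?P<process>[^.]+)\.(?P<pid>\d+)"
    r"\.(?P<epoch>\d+)\.(?P<host>[^.]+)\.core\.gz$"
)


def crash_time_from_name(path):
    """Return the crash time encoded in a core file name, or None."""
    match = CORE_NAME.match(path.rsplit("/", 1)[-1])
    if match is None:
        return None  # old-style name without an embedded timestamp
    return datetime.fromtimestamp(int(match.group("epoch")), tz=timezone.utc)


new_style = crash_time_from_name(
    "/md/20120801_014055_pppd.22065.1343785255.SSR8020.core.gz"
)
old_style = crash_time_from_name("/md/ribd_41.core")
print(new_style, old_style)
```

For old-style names the only reliable date is the one on the RP where the file was originally written, which is why checking the standby RP matters.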
2 Exercise 3: Introduction
› Exercise to learn to troubleshoot the system processes:
– Search for processes indicating problems
› Your system is not loaded, so you will probably not find any problems with the processes.
› You will emulate processes running on your system by looking at a saved file:
[local]Train-1# show configuration sh_proc | grep ...
(emulates the “show process” command)
Figure 3-24: Exercise 3: Introduction
2.1 Exercise 3: System Processes
› Please move to the exercises book.
Figure 3-25: Exercise 3: System Processes
2.1.1 Exercise 3: Review
“show conf sh_proc” emulates the output of the “show process” command.
› Exercise 3.1: Create a manual coredump
[local]Train-1# process coredump ppp no-restart
› Wait for the coredump to complete:
[local]Train-1# show process ppp
NAME  PID  SPAWN  MEMORY  TIME         %CPU   STATE  UP/DOWN
ppp   166  3      3436K   00:00:00.59  0.00%  run    00:02:20
[local]Train-1# process set ppp heart-beat on
› See the core dump file in /md:
[local]Train-1# dir /md/*ppp*.core
Contents of /md/*ppp*
-rw-r--r-- 1 root root 399221 Aug 01 01:40 /md/20120801_014041_pppd.3908.1343785241.Train-1.core.gz
-rw-r--r-- 1 root root 293762 Aug 01 01:40 /md/20120801_014055_pppd.22065.1343785255.Train-1.core.gz
› Exercise 3.2: Processes claiming 20% or more of CPU
[local]Train-1# show conf sh_proc | grep option '-E' '[2-9][0-9]{1,2}\...%'
ism  454  1  18280K  00:00:11.96  32.00%  run  13:48:30
ppp  473  1   4364K  00:00:08.36  23.00%  run  13:48:22
Figure 3-26: Exercise 3, review (1-2)
› Processes claiming any CPU resources
[local]Train-1# show conf sh_proc | grep option '-E' '[1-9][0-9]{0,2}\...%'
OR
[local]Train-1# show conf sh_proc | grep option '-E -v' ' {1,3}0\...%'
ism   454  1  18280K  00:00:11.96  32.00%  run  13:48:30
rib   458  1   4540K  00:00:10.35  10.00%  run  13:48:28
lm    466  1   4592K  00:00:07.74  12.00%  run  13:48:25
ppp   473  1   4364K  00:00:08.36  23.00%  run  13:48:22
aaad  480  1   6932K  00:00:15.83   7.00%  run  13:48:18
› Processes which restarted at least once (spawn count of 2 or more)
[local]Train-1# show conf sh_proc | grep option '-E' '^.{26}[2-9]'
rcm   453  3  14060K  00:00:14.27  0.00%  run  13:48:31
dlm   488  2   6828K  00:00:07.17  0.00%  run  13:42:18
l2tp  482  3   4656K  00:00:33.39  0.00%  run  13:48:16
[local]Train-1#
Figure 3-27: Exercise 3, review (2-2) (Optional parts)
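It is worth checking what the two “any CPU” expressions really select. Both treat a value below 1%, such as 0.27%, the same way as 0.00%, so in practice they list processes using 1% or more. The short check below uses Python's `re` module with the same expressions; the helper names are hypothetical and the ospf line is adapted from the detail output earlier in this chapter.

```python
import re

ANY_CPU = re.compile(r"[1-9][0-9]{0,2}\...%")  # first grep expression
ZERO_CPU = re.compile(r" {1,3}0\...%")         # expression used with -v


def shown_by_match(line):
    """Line is printed by the first command (pattern must match)."""
    return ANY_CPU.search(line) is not None


def shown_by_inverse(line):
    """Line is printed by the second command (-v: pattern must NOT match)."""
    return ZERO_CPU.search(line) is None


lines = {
    "aaad": "aaad  480    1  6932K  00:00:15.83   7.00%  run  13:48:18",
    "ospf": "ospf  23251  2  5364K  00:00:00.93   0.27%  run  00:13:36",
    "csm":  "csm   26     1  6616K  00:00:22.18   0.00%  run  05:35:42",
}
results = {
    name: (shown_by_match(line), shown_by_inverse(line))
    for name, line in lines.items()
}
print(results)
```

The inverse form also prints every line that has no percentage at all, such as the header, so the two commands are not exactly equivalent on full output.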
3 Chapter Summary
After this course the participant should be able to:
› Describe the Fundamental Concept of Processes
Architecture on the System
› Describe the SSR software architecture and system
processes
› Understanding the concept of manual core dump
› Identify the different types of processes in SSR
› Work with Core Dumps of Faulty Processes
Figure 3-28: Chapter Summary
4 Understand the SSR System Redundancy Issues
Chapter Objectives
After this course the participant will be able to:
› Identify the SSR System Redundancy Issues
› Explain the redundancy on active RP
› Analyze problems of standby RP
› Understand RP Failover Management
Figure 4-1: Chapter Objectives
1
RP Redundancy
When checking the Health of an SSR an import thing to verify is the RP
redundancy state. The SSR contains two RP’s, one RP is in Active state while the
other one is in standby state. The standby RP is ready to take over should the
Active RP fail. For this reason is very important to verify the state of this
redundancy.
›
Verifying current redundancy state of the system
[local]Train-1# show redundancy
--------------------------------This RPSW is active
--------------------------------STANDBY RPSW READY?
: YES
PAd in sync?
: YES
Database in sync?
: YES
Software Release in sync? : YES
Firmware in sync?
: YES
Mate-to-Mate link up?
: YES
--- cut --[local]Train-1#
›
Thumbs up
Those are software elements which are
synchronized during an upgrade.
During the boot sequence of the standby RP some fields might
indicate “NO”. This is part of the boot sequence where each
element is compared with the active RP
Figure 4-2: RP redundancy
The redundancy state is verified with the command “show redundancy”.
In this case we can see that the standby RP is ready and in sync with the active RP.
We can also see a list of processes that have been successfully synchronized, as well as the details of any RP reload switchover that has occurred since the system reload.
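When the output of “show redundancy” is captured to a text file, the YES/NO fields can be checked automatically. The following Python sketch is only an illustration: it assumes that each field and its value appear on one line, as on a terminal, and the function names (parse_redundancy, fields_not_in_sync, safe_to_switch_over) are hypothetical helpers, not part of the SSR software.

```python
import re

# Matches lines such as "Database in sync?         : YES".
# Assumption: field name and value are on the same line in the capture.
FIELD_RE = re.compile(r"^[ \t]*(.+?\?)[ \t]*:[ \t]*(YES|NO)[ \t]*$", re.MULTILINE)


def parse_redundancy(output):
    """Return {field name: True/False} for every 'question? : YES/NO' line."""
    return {name: value == "YES" for name, value in FIELD_RE.findall(output)}


def fields_not_in_sync(output):
    """Return the names of all fields that report NO."""
    return [name for name, ok in parse_redundancy(output).items() if not ok]


def safe_to_switch_over(output):
    """True only if at least one field was found and none of them reports NO."""
    fields = parse_redundancy(output)
    return bool(fields) and all(fields.values())


# Hypothetical capture in which the database has not finished synchronizing.
SAMPLE = """\
This RPSW is active
STANDBY RPSW READY?       : YES
PAd in sync?              : YES
Database in sync?         : NO
Software Release in sync? : YES
Firmware in sync?         : YES
Mate-to-Mate link up?     : YES
"""

print(fields_not_in_sync(SAMPLE))
print(safe_to_switch_over(SAMPLE))
```

A check of this kind is useful before a manual switchover, because a single field that still reports NO means the standby RP is not yet ready to take over.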
1.1 RP Redundancy Details
Part of the information available from the output of “show system redundancy” is also available by running the command “show redundancy detail”.
Here we can again see a side-by-side comparison of the software releases on both RPs.
[local]Train-1# show redundancy detail
Server (sync version3.0) is up
Client (sync version3.0) is connected
Client Mode: Service
           | Active's Version            | Standby's Version
___________|_____________________________|_________________________________
Firmware   | OpenFirmware 3.0.2.29       | OpenFirmware 3.0.2.29
           | PRODUCTION RELEASE          | PRODUCTION RELEASE
___________|_____________________________|_________________________________
Software   | /p02: 15.2.129.3.13         | /p02: 15.2.129.3.13
___________|_____________________________|_________________________________
Diagnostic | /p02: 15.2.129.3.13         | /p02: 15.2.129.3.13
___________|_____________________________|_________________________________
VPF        | /p02:                       | /p02:
           | SLES11SP3_ssc_xen-5.1.21    | SLES11SP3_ssc_xen-5.1.21
___________|_____________________________|_________________________________
Minikernel | v3.0.38-876-g113fd53-2532251| v3.0.38-876-g113fd53-2532251
           |                             |
___________|_____________________________|_________________________________
Software Sync Log:
------------------
Release Sync Type: release sync unnecessary
--more
Dec 16 2015 05:58:17: SUCCESS
Figure 4-3: RP redundancy details
Additional output from the command can be seen here.
It shows the logs of the synchronization between the RPs.
As we can see, the configuration files also have to be synchronized for successful redundancy.
1.2 Investigating Redundancy Issues
› The “show system redundancy” command provides a set of very useful information
› The output of this command is large
› It contains the following sections:
– Active controller alarms
– Hardware detail rp A & B
– Controller switch history
– Controller release sync history
– Firmware sync log
– Software sync log
– Configuration sync log
– Minikernel sync log
– Controller protection internal log
– Controller error log
Figure 4-4: Investigating redundancy issues
1.3 Show system redundancy
[local]Train-1# show system redundancy
Controller alarms for slot RPSW2:
-----------------------------
Controller alarms for slot RPSW1:
-----------------------------
Hardware detail for slot RPSW2:
---------------------------
Slot             : RPSW2          Type             : rpsw
Serial No        : CF90000AY0     Hardware Rev     : R2H
Mfg Date         : 06-DEC-2011
Activated Time   : 44 h
Phalanx          : 2.0.9
--cut--
Payload Status   : OK
POD Status       : Passed         OSD Status       : Not Run
Failed LED       : Off            IS LED           : On
Standby LED      : Off            Swap LED         : Off
Ejector Switch   : 1 (Locked)
Last Payld Reset : Admin
Active Alarms    : NONE
Hardware detail for slot RPSW1:
---------------------------
Slot             : RPSW1          Type             : rpsw
Serial No        : CF90000BSH     Hardware Rev     : R2H
Mfg Date         : 22-DEC-2011
Activated Time   : 28 min
Phalanx          : 2.0.9
Spanky           : 02.02
--cut--
Last Payld Reset : Admin
Active Alarms    : NONE
---(more)---
› Controller alarms: alarms summary for each RPSW
› Hardware detail: same output as “show hardware card RPSW2 detail” and “show hardware card RPSW1 detail”
› The output continues in the next figure
Figure 4-5: show system redundancy (1-3)
Controller switch history:
--------------------------
[Sat Jun 16 04:01:07 2012] Card Failed : (RPSW1)->(RPSW2)
Controller release sync status:
-------------------------------
Server (sync version3.0) is up
Client (sync version3.0) is connected
Client Mode: Service
           | Active's Version            | Standby's Version
___________|_____________________________|_________________________________
Firmware   | Mips,rev2.0.2.66            | Mips,rev2.0.2.66
___________|_____________________________|_________________________________
Software   | /p02: 11.1.2.1              | /p02: 11.1.2.1
___________|_____________________________|_________________________________
Minikernel | 11.7                        | 11.7
___________|_____________________________|_________________________________
Software Sync Log:
------------------
Release Sync Type: release sync unnecessary
Sep 7 2012 14:41:47: UNNECESSARY
Sep 7 2012 14:41:47: SUCCESS
Configuration Files Sync Log:
-----------------------------
Sep 7 2012 14:42:48: SUCCESS
---more---
› Controller switch history: information about the root cause of the switchover
› Controller release sync status: same output as “show redundancy detail”
› The output continues in the next figure
Figure 4-6: show system redundancy (2-3)
Controller protection internal log:
-----------------------------------
Sep 7 14:41:43: Controller::RPSW2 - Mate Link UP on ACTIVE card.
Sep 7 14:41:43: Controller::RPSW2 - Sw State: [Running]. Mate Sw State [Startup].
Sep 7 14:41:52: Controller::RPSW2 - processMateInsertionStartup, received lock request from peer
Sep 7 14:41:52: Controller::RPSW2 - Locking card for state synch
Sep 7 14:41:53: Controller::RPSW2 - Sw State: [Running] -> [WaitForPeer]. Mate Sw State [WaitForPeer].
Sep 7 14:42:20: Controller::RPSW2 - Exiting waitForPeerInit() Sw State: [WaitForPeer]. Mate Sw State [ReadyToRun].
Sep 7 14:42:20: Controller::RPSW2 - Sw State: [WaitForPeer] -> [ReadyToRun]. Mate Sw State [ReadyToRun].
Sep 7 14:42:52: Controller::RPSW2 - Sw State: [ReadyToRun] -> [Running]. Mate Sw State [ReadyToRun].
Sep 7 14:42:55: Controller::RPSW2 - Sw State: [Running]. Mate Sw State [ReadyToRun] -> [Running].
Sep 7 14:42:55: Controller::RPSW2 - Unlocking card after state synch
Sep 7 14:42:55: Controller::RPSW2 - Fault Severity change on Primary. Card Fail -> No Faults.
Controller error log:
---------------------
› Controller protection internal log: synch steps and state changes of the RPs, including mate-to-mate link state changes and SW state changes
› Controller error log: root causes of RP errors. No errors: good
Figure 4-7: show system redundancy (3-3)
2 Analyzing Problems of Standby RP
Figure 4-8: Analyzing Problems of Standby RP
2.1 Active or Standby RP
› Imagine the active RP encountered a serious problem and restarted
› We analyzed possible restart problems of the currently active RP
› This is very useful for single-RP systems; however, in a redundant configuration:
› The currently active RP was in standby mode before the problem occurred
› So we investigated a perfectly healthy RP!
Important to remember: in case of RP problems on redundant systems, most often you need to investigate the standby RP.
Diagram: after the failure (×) the active RP becomes the standby and the standby RP becomes the active, so analyzing the active RP means analyzing the healthy RP.
Figure 4-9: Which RP should you check, Active or Standby?
2.2 Connecting to Standby RP without Console
› You can telnet from the active RP to the standby RP using the internal loopback
[local]Train-1# telnet mate
Trying 127.2.252.1...
Connected to 127.2.252.1.
Escape character is '^]'.
login: ericsson
Password: ericsson
[local]standby#
› Once you are in the standby RP you can issue the same commands as discussed for the active RP
Figure 4-10: Connecting to standby RP without console
2.3 Searching for Restart Reason
› Both RPs maintain independent logs
› You should investigate the output of the following commands when searching for the restart reason:
[local]standby# show system redundancy
[local]standby# show log file <filename> | grep ...
› Note! On the standby RP, “show redundancy” reports:
[local]standby# show redundancy
This RPSW is standby
[local]standby#
Figure 4-11: Searching for restart reason
2.4 Repeating Commands on Standby RP
› As discussed before on active RP
[local]standby# show crashfiles
[local]standby# show disk internal
Filesystem   1k-blocks     Used  Available Use% Mounted on
rootfs         3969036  1653376    2115624  44% /
/dev/sda2      3969068  1659540    2109492  44% /p02
/dev/sdb1     15604376   418288   14399676   3% /var
/dev/sda3      7665864   150652    7128872   2% /flash
[local]standby#
[local]standby# show history global
Jul 4 17:29:00 show crashfiles
Jul 4 17:29:21 show disk internal
[local]standby#
Figure 4-12: Repeating commands on standby RP
› System statistics confirm processes are pending init mode because of standby function
[local]standby# show system status
System Status: OK
[local]standby#
Figure 4-13: Repeating commands on standby RP
2.5 Verify Processes on Standby RP
› Verify the processes running on the standby RP
[local]standby# show process
(5 sec, 1 and 5 minute )
Load Average : 1.35 1.31 1.26
NAME      PID   SPAWN  MEMORY  TIME         %CPU   STATE  UP/DOWN
ns        3185  1      4932K   00:00:00.50  0.00%  run    00:13:03
u2l       3206  1      3800K   00:00:00.02  0.00%  run    00:13:03
metad     3207  1      30712K  00:00:01.51  0.00%  run    00:13:03
evtmd     3238  1      4108K   00:00:00.03  0.00%  run    00:12:59
cmsp_sw0  3253  1      4328K   00:00:00.05  0.00%  run    00:12:59
cmsp_sw1  3254  1      4324K   00:00:00.05  0.00%  run    00:12:59
---more---
• Spawn = 1 means good news
• Spawn > 1 means you should investigate the reason
• Spawn = 0 means the process is not popular
• State = run / demand means good news IF up/down > days/weeks/months
• Time is just an indicator of the total activity of the process. If it is high, it just means the process is popular
• Load average provides an indication of total utilization over different sample times (average time)
Figure 4-14: Verify processes on standby RP
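The rules listed in the figure can be applied mechanically to a captured “show process” listing. The Python sketch below is an illustration only: it assumes one process per line with the eight columns shown in the figure, and the helper name process_findings is hypothetical.

```python
def process_findings(table):
    """Apply the SPAWN and STATE rules from the figure to each process row."""
    findings = []
    for line in table.splitlines():
        cols = line.split()
        # Data rows have 8 columns with numeric PID and SPAWN; this skips
        # the header, the load average line, and the ---more--- marker.
        if len(cols) != 8 or not (cols[1].isdigit() and cols[2].isdigit()):
            continue
        name, spawn, state = cols[0], int(cols[2]), cols[6]
        if spawn > 1:
            findings.append(f"{name}: spawn count {spawn}, investigate the reason")
        if state not in ("run", "demand"):
            findings.append(f"{name}: state {state}, not run/demand")
    return findings


# Hypothetical capture: evtmd has been spawned three times.
SAMPLE = """\
Load Average : 1.35 1.31 1.26
NAME      PID   SPAWN  MEMORY  TIME         %CPU   STATE  UP/DOWN
ns        3185  1      4932K   00:00:00.50  0.00%  run    00:13:03
evtmd     3238  3      4108K   00:00:00.03  0.00%  run    00:00:41
---more---
"""

for finding in process_findings(SAMPLE):
    print(finding)
```

The sketch deliberately ignores the UP/DOWN column; judging whether an uptime is "long enough" depends on when the RP was last reloaded.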
2.6 Copy Files from Standby RP
› Why would you need to copy files from standby?
› Because only the active RP has an active IP management connection toward the NOC
› Active RP confirms connectivity:
[local]Train-1# ping 10.1.1.3
PING 10.1.1.3 (10.1.1.3): source 10.1.1.101, 36 data bytes,
timeout is 1 second
!!!!!
----10.1.1.3 PING Statistics----
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.205/0.347/0.790/0.250 ms
[local]Train-1#
› No response on standby:
[local]standby# ping 10.1.1.3
PING 10.1.1.3 (10.1.1.3): source 127.0.2.6, 36 data bytes,
timeout is 1 second
.....
----10.1.1.3 PING Statistics----
5 packets transmitted, 0 packets received, 100.0% packet loss
Figure 4-15: Copy files from standby RP
› Copy files such as core dumps from the standby RP to the active RP (the example uses /flash/delta.cfg):
[local]standby# exit
[local]Train-1# copy mate /flash/delta.cfg /flash/deltafrommate.cfg
copying from mate /flash/delta.cfg to local:/flash/deltafrommate.cfg...
Opening connection...
Copying file...
/flash/deltafrommate.cfg:     50.00 B     336.96 B/s
[local]Train-1#
[local]Train-1# copy /flash/deltafrommate.cfg ftp://admin1@10.1.1.3/
copying from mate /flash/deltafrommate.cfg to ftp://deltafrommate.cfg...
Opening connection...
Copying file...
//deltafrommate.cfg:     50.00 B     336.96 B/s
[local]Train-1#
[local]Train-1# delete /flash/deltafrommate.cfg
[local]Train-1# delete mate /flash/delta.cfg
Figure 4-16: Copy files from standby RP
3 RP Failover Management
Figure 4-17: RP Failover Management
3.1 Managing Reloads and RP Switch-over
Two types of reload switchover may occur.
The first type is a manual switchover, which is triggered by running the command “reload switch-over”.
The second type is an automatic failover upon failure of the active RP. For example, the abnormal termination of a critical process running on the active RP causes the RP to reload and, as a consequence, triggers a switchover. Examples of critical processes are PM, PAD, NS, and CMS_SERVER.
[local]Train-1# reload ?
asp          Reload asp(s)
card         Reload card(s)
standby      Reload the standby card
switch-over  Reload active RPSW or ALSW and cause standby to active switch over
<cr>
› reload
– Instructs the system to reload
– Accounting off records will be sent prior to reload
› reload standby
– Reload the standby RP
› reload card
– Reload a specific traffic card
› reload switch-over
– Alternate Active to Standby RP
– Reloads current active RP
Figure 4-18: Managing Reloads and RP Switch-over
3.2 Manual RP Switchover
› BEFORE the switchover, make sure the RPs are in sync. They might not be, and then you are in trouble.
› Lack of patience is the number one cause of RPs being out of sync
› NOT GOOD:
[local]Train-1# show redundancy
---------------------------------
This RPSW is active
---------------------------------
STANDBY RPSW READY?       : NO
PAd in sync?              : NO
Database in sync?         : NO
Software Release in sync? : NO
Firmware in sync?         : NO
Mate-to-Mate link up?     : NO
[local]Train-1#
› Thumbs up:
[local]Train-1# show redundancy
---------------------------------
This RPSW is active
---------------------------------
STANDBY RPSW READY?       : YES
PAd in sync?              : YES
Database in sync?         : YES
Software Release in sync? : YES
Firmware in sync?         : YES
Mate-to-Mate link up?     : YES
[local]Train-1#
Figure 4-19: Manual RP Switchover (1-2)
[local]Train-1# reload switch-over
The "reload switch-over" command on this system will cause standby to
active switch over, some cards may be rebooted. Do you really want to
reload? (y/n)
[local]Train-1# show chassis
Current platform is SSR 8020
(Flags: A-Active Card  B-Standby Card)
Slot  : Configured Type  Installed Type  Operational State  Flags
--------------------------------------------------------------------------
RPSW1 : n/a              rpsw            OOS-Booting        B
RPSW2 : n/a              rpsw            IS                 A
ALSW1 : n/a              alsw            IS                 A
ALSW2 : n/a              alsw            IS                 B
--- cut …
[local]Train-1# show redund | grep NO
STANDBY RPSW READY?       : NO
VxWorks in sync?          : NO
Database in sync?         : NO
Software Release in sync? : NO
Firmware in sync?         : NO
[local]Train-1#
Figure 4-20: Manual RP switchover (2-2)
4 Chapter Summary
After this course the participant should be able to:
› Identify the SSR System Redundancy Issues
› Explain the redundancy on active RP
› Analyze problems of standby RP
› Understand RP Failover Management
Figure 4-21: Chapter Summary
5 Issues related with Boot Problem
Chapter Objectives
After this course the participant will be able to:
› Discuss the issues related with Boot Problem
› Understanding the booting in SSR
› Identify the issue related with booting in SSR
Figure 5-1: Chapter Objectives
1 Boot Problems
› In some cases there can be issues with the controller cards
– The reason could be a corrupt OS image or a hardware problem
› This could result in the route processor continuously rebooting
› You can stop this by entering the boot ROM interface on the RPSW
– Note! You can access the boot ROM interface only through the “CONSOLE port”
– Caution! Do not change any system boot parameters unless instructed! It may cause a non-responsive system.
Diagram labels: Active, Standby, Reboot
Figure 5-2: Boot Problems
1.1 Entering Boot ROM Interface
1. Access the system through the CONSOLE port on the RPSW.
– Usually this will be the standby.
2. Type “ssr” when you see the text:
Auto-boot in 5 sec, type 'ssr' to abort, [CR] to boot:
– You have 5 seconds, so you need to be quick
› You will then be in the “Boot ROM Interface”. Special commands apply here.
– Note! You cannot change the configuration
Figure 5-3: Entering Boot ROM Interface
1.2 Example: Entering Boot ROM Interface
Start to reload system ...
Sep 07 15:43:05 Initiated shutdown procedure.
--- cut ---
Welcome to CodeGenInc SmartFirmware(tm) version 3.0 for x86_64
SmartFirmware(tm) Copyright 1996-2008 by CodeGen, Inc.
All Rights Reserved.
--- cut ---
Board voltage reads 54.0V
Board current reads 0.96A
Board power reads 51.9W
Executing POST
SHORT DRAM Test : PASSED
PCI Devices Test : PASSED
RTC Test : PASSED
RTC battery check Test : PASSED
Disk1 presence Test : PASSED
Check EPROM Test : PASSED
PASSED Loop 1 of 1,
POST PASSED
2012/09/07 15:43:58
Auto-boot in 5 sec, type 'ssr' to abort, [CR] to boot:
Ok
› We entered “ssr” at the auto-boot prompt
› “ok” is the prompt for the Boot ROM Interface
› The booting has been stopped!
Figure 5-4: Example: Entering Boot ROM Interface
1.3 Diagnostics Command
Diagnostics are software tests that examine the hardware and operating system environment and detect malfunctions. The system supports the following automatic and user-initiated diagnostic tests for detecting problems related to hardware and software.
› It is possible to run a test of the required level from the Boot ROM Interface. The diag command and its options:
ok diag
usage: diag <POST# | testname> <loopcount>:
POST# Specify the level of Diags to run (1 to 2)
* OR * Specify the testname to run, with is one of:
set-led      : X86RP DEV Port 80 led   Level 0xF: ENABLED
Run on boot  : X86 BOOT L2 CACHE       Level 0x9: ENABLED
Run on boot  : X86 OFW L2 CACHE        Level 0x9: ENABLED
Run on boot  : DRAM ECC                Level 0x9: ENABLED
short-dram   : SHORT DRAM              Level 0x1: ENABLED
long-dram    : LONG DRAM               Level 0x2: ENABLED
post-pci     : PCI Devices             Level 0x1: ENABLED
rtc-check    : RTC                     Level 0x1: ENABLED
--- cut ---
Figure 5-5: Diagnostics Command
1.4 Running Diagnostics
› Run diag level 1:
ok diag POST1
SHORT DRAM Test : PASSED
PCI Devices Test : PASSED
RTC Test : PASSED
RTC battery check Test : PASSED
CPLD Register Test : PASSED
Disk0 presence Test : PASSED
Disk1 presence Test : PASSED
Check EPROM Test : PASSED
PASSED Loop 1 of 1,
POST level 1 PASSED
2012/09/14 15:46:03
› Report faults to the Ericsson tech group
Figure 5-6: Running Diagnostics
2
Troubleshooting Scenarios
Figure 5-7: Troubleshooting Scenarios
In case you need to
resume booting.
ok bootsys
Launching OS kernel: load flash:0 /boot/bzImage
Linux version 2.6.32.53-798-g5652359 (sysbuild@asglx-1-300) (gcc version 4.3.2
(Wind River Linux Sourcery G++ 4.3-85) ) #2 SMP PREEMPT Wed Apr 25 22:26:19 PDT
2012
Command line: console=ttyS0,9600n8 crashkernel=128m pci=hpmemsize=0,hpiosize=0
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
BIOS-e820: 00000000000c0000 - 00000000bf3db000 (usable)
BIOS-e820: 00000000bf3db000 - 00000000bf427000 (ACPI NVS)
BIOS-e820: 00000000bf427000 - 00000000bf42e000 (ACPI data)
BIOS-e820: 00000000bf42e000 - 00000000bf45f000 (unusable)
Figure 5-8: Resume Boot
2.1
Troubleshooting Scenarios
› We will present some common examples of troubleshooting the SSR System in coming sections:
1. Analyzing problems of active RP
2. Investigating redundancy issues
3. Analyzing problems on standby RP
4. RP boot problems
Figure 5-9: Troubleshooting Scenarios
2.2
System Uptime
› The first question is how long has this RP been up
[local]Train-1# show version
Ericsson IPOS Version IPOS-15.2.129.3.13-Release
Built by sysbuild@eussjlx7061.sj.us.am.ericsson.se Tue Oct 20 08:33:21
PDT 2015
Copyright (C) 1998-2015, Ericsson AB. All rights reserved.
Operating System version is Linux 3.0.75-1281-gd853cba
System Bootstrap version is OpenFirmware 3.0.2.29 PRODUCTION RELEASE
Installed minikernel version is v3.0.38-876-g113fd53-2532251
ippmd / mloam-service-layer component version: 0.2-194-gf6ac77e
Built by sysbuild@eussjlx7046.sj.us.am.ericsson.se Tue Jan 13 00:36:02
PST 2015
Copyright (C) 1998-2015, Ericsson AB. All rights reserved.
Router Up Time - 26 minutes 19 seconds
[local]Train-1#
Figure 5-10: System uptime
› What was typed, and when?
[local]Train-1# show history global
Dec 16 18:47:54 sho cont all
Dec 16 18:47:56 sho ver
Dec 16 18:48:17 sho chassis
Dec 16 18:48:20 sh hard
Dec 16 18:48:27 sho ver
Dec 16 15:49:51 en
Dec 16 15:49:52 ericsson
Dec 16 15:49:59 show config
Dec 16 15:50:01 config
Dec 16 15:50:04 context local
Dec 16 15:50:06 int 1
Dec 16 15:50:11 ip add 10.1.1.106/24
Dec 16 15:50:13 exit
Dec 16 15:50:21 administrator ericsson password ericsson
Dec 16 15:50:25 privilege start 15
Dec 16 15:50:29 port eth 7/1
---more---
› The history contains anything that was typed and was not rejected by the CLI parser. It covers both execute commands (show) and configuration commands.
› Although we trust everybody, it is good to check for possible human errors:
[local]Train-1# show history global | grep reload
Dec 16 15:58:18 reload card 5
[local]Train-1#
Figure 5-11: Check for human errors
2.3 System Storage Verification
› Verification of errors and free space within storage media
[local]Train-1# show disk internal detail
Manufacturer  : SMART (/dev/sda)
Model         : eUSB
Serial Number : 1E884210130309181111
Manufacturer  : SMART (/dev/sdb)
Model         : eUSB
Serial Number : 3E3B2F12132259181111
Filesystem   1k-blocks     Used  Available Use% Mounted on
rootfs         3969036  1726740    2042260  46% /
/dev/sda2      3969068   916612    2852420  24% /p02
/dev/sdb1     15604376  1307320   13510644   9% /var
/dev/sda3      7665864   150644    7128880   2% /flash
[local]Train-1#
› No disk errors!
› Disk usage OK!
Figure 5-12: System storage verification
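The free-space check can also be done on a saved copy of the “show disk internal” output. The following Python sketch is an illustration under stated assumptions: each filesystem occupies one line with six columns, and the 80 percent limit is a hypothetical threshold chosen for the example, not an Ericsson recommendation.

```python
def filesystems_over(df_output, threshold_percent=80):
    """Return (mount point, use%) for filesystems at or above the threshold."""
    flagged = []
    for line in df_output.splitlines():
        cols = line.split()
        # Data rows: Filesystem, 1k-blocks, Used, Available, Use%, Mounted on.
        if len(cols) != 6 or not cols[4].endswith("%"):
            continue
        percent = cols[4][:-1]
        if percent.isdigit() and int(percent) >= threshold_percent:
            flagged.append((cols[5], int(percent)))
    return flagged


SAMPLE = """\
Filesystem   1k-blocks     Used  Available Use% Mounted on
rootfs         3969036  1726740    2042260  46% /
/dev/sda2      3969068   916612    2852420  24% /p02
/dev/sdb1     15604376  1307320   13510644   9% /var
/dev/sda3      7665864   150644    7128880   2% /flash
"""

print(filesystems_over(SAMPLE))
print(filesystems_over(SAMPLE, threshold_percent=40))
```

With the values from the figure nothing is reported at the default threshold, which matches the "Disk usage OK!" remark.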
2.4
Exercise 4: Investigate Boot Problems
› Please move to the exercises book.
Figure 5-13: Exercise 4: Investigate Boot Problems
3
Chapter Summary
After this course the participant should be able to:
› Discuss the issues related with Boot Problem
› Understanding the booting in SSR
› Identify the issue related with booting in SSR
Figure 5-14: Chapter Summary
6 Active and History Logs in SSR
Chapter Objectives
After this course the participant will be able to:
› Describe the log in SSR
› Understanding the Active and History Logs
› Discuss the different type of logs in SSR
› Understanding the communication with Syslog Server
› Discuss the concept of communication with Syslog
Server
› Configure communication to a Syslog server
Figure 6-1: Chapter Objective
1 System Logging Introduction
In many cases system troubleshooting takes place after a problem has occurred, and we need to figure out what the problem was and how it occurred. For this we typically need to access historical data, and this is where system logging is useful.
In the SSR there is a system logger which collects information from any process that has information to be logged, and writes this information to a log buffer for future reference. The buffer used is a circular 1 Mb buffer, meaning that when it fills, older entries are overwritten.
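The behaviour of a circular buffer can be illustrated with a few lines of Python. This is a conceptual sketch only, not the loggd implementation: the class name CircularLog is hypothetical, and the capacity is counted in bytes of message text.

```python
from collections import deque


class CircularLog:
    """Size-limited log buffer that discards the oldest entries when full."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.entries = deque()
        self.size = 0

    def append(self, message):
        length = len(message.encode("utf-8"))
        if length > self.capacity:
            raise ValueError("message is larger than the whole buffer")
        self.entries.append(message)
        self.size += length
        # Overwrite semantics: drop the oldest entries until the buffer fits.
        while self.size > self.capacity:
            oldest = self.entries.popleft()
            self.size -= len(oldest.encode("utf-8"))


# A tiny 10-byte buffer makes the wrap-around visible.
log = CircularLog(capacity_bytes=10)
for text in ("aaaa", "bbbb", "cccc"):
    log.append(text)
print(list(log.entries))
```

After the third message the first one is gone, which is why older events have to be saved to files or sent to a syslog server if they are needed later.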
› Troubleshooting:
– Often after problem occurred
› Logs: historical information
› System logger: collects information from multiple sources
› Storage of log messages
– /md/loggd_dlog.bin
Figure 6-2: System logging introduction
1.1 Loggd Process
System logging on the SSR is done by the logger daemon (loggd) process.
The loggd process handles three different log types: main system logs, system debugs, and malicious packet logs. Each of these has its own memory buffer, which is limited to approximately 1 Mb in size.
The loggd process runs on all RPSW cards as well as on line cards.
Remote loggd processes (for example those running on a line card or on the standby RP) send their log messages to the loggd running on the active RP, so the active RP log buffer contains all the necessary logged messages.
Diagram: the logger daemon (loggd) on the active RPSW holds the LOG, Debug, and MAL PKT buffers (1 Mb buffer) and receives messages from the standby RPSW and the line cards.
Figure 6-3: Loggd Process
It is the main system logs that are used for historical troubleshooting, because data from this buffer can be saved for future analysis by writing it to files. Certain debug messages may also be saved, as we will see later.
1.2 System log Commands
The simplest way to check the current logs is to use the “show log” command. This displays the contents of the loggd buffer on the active RP. As mentioned, logs from other cards in the chassis are sent to this log process.
Log messages always take the same format. The first part contains the timestamp of the event. This is followed by the application which has written the log event. The next part contains a numeric value which indicates the severity of the event, followed by the description of the condition that caused the message to be logged. Finally we can read the content of the log message. Many log messages are normal and do not indicate a system problem, while others may be critical.
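The message format described above can be split into its parts with a regular expression, which is handy when filtering a saved log by severity. The Python sketch below is an illustration: the pattern is inferred from the sample messages in this chapter, the optional {slot/LP} and [nnnn] prefixes are treated as plain tags, and the names LOG_RE and parse_log_line are hypothetical.

```python
import re

# Severity names as listed in the "System log commands" figure.
SEVERITY_NAMES = {
    0: "Emergencies", 1: "Alerts", 2: "Critical", 3: "Errors",
    4: "Warnings", 5: "Notifications", 6: "Informational", 7: "Debugging",
}

# Assumed layout: timestamp: {origin}: [tag]: %APPLICATION-severity-CONDITION: message
LOG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}(?:\.\d+)?):\s*"
    r"(?:\{(?P<origin>[^}]+)\}:\s*)?"
    r"(?:\[(?P<tag>\d+)\]:\s*)?"
    r"%(?P<application>[A-Z0-9_-]+?)-(?P<severity>[0-7])-(?P<condition>[A-Z0-9_]+):\s*"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    fields["severity_name"] = SEVERITY_NAMES[fields["severity"]]
    return fields


line = ("Dec 16 12:44:41: {6/LP}: %SRVFWD-ESSIF-6-CTLD_INFO: "
        "essifcd_sc_notify_appvm_start():463: AppVM is alive")
print(parse_log_line(line))
```

Lines that wrap onto a second line in a terminal capture have to be joined first; continuation lines on their own return None.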
Looking at current log events
[local]Ericsson# show log
Dec 16 12:44:41: %IPC-3-ERR: loggd: ipc_sendto sendto errno 2: No such file or directory
Dec 16 12:44:41: {6/LP}: %SRVFWD-ESSIF-6-CTLD_INFO: essifcd_sc_notify_appvm_start():463: AppVM is alive
Dec 16 12:44:41: %IPC-3-ERR: loggd: ipcSendCommon: sendto rc=-3
Dec 16 12:44:41: %IPC-3-ERR: loggd: ipcContactPM: ipcSend(NS) err=-16
Dec 16 12:44:41: %ISP-6-INFO: [isp_heartbeat_register] is called on ACTIVE
-more--
› Fields of a log line: timestamp, application, severity level, log message
› Severity levels:
0 Emergencies
1 Alerts
2 Critical
3 Errors
4 Warnings
5 Notifications
6 Informational
7 Debugging
[local]Ericsson# show log card ?
1..20          slot number
RPSW1..RPSW2
all            all slots
Figure 6-4: System log commands
1.3 Event Severity Levels in log Messages
Figure 6-5: Event Severity Levels in Log Messages
There are 8 different severity levels.
Severity 7 logs are those used for debugging and must be enabled manually by the user. This is done through system debugging, which we will see in the next section. For this reason the logger daemon process only logs messages of severity 0 to 6 by default.
As mentioned earlier, there is a logger daemon process on each card, and if we want to display only the logs for a specific card we can use the “show log card” command.
As you can see, there are options to display logs for either RP as well as for the card in each slot of the SSR.
1.4 Logs from Cards
As mentioned, each RP and line card has a log process, and these communicate with the log process on the active RP.
Line cards send their log messages to the log process on the active RP by default, so the active RP log buffer contains all the logs of the line cards. The only exception would be some logs generated during a card reload, before communication is established with the RP.
We can identify a log message coming from another card because in this case, after the timestamp, there is a field indicating the component the log was generated from.
In the figure below, the messages marked {3/LP} are sent from the Line Processor on the card in slot 3.
The standby RP can also send its log messages to the active RP, and vice versa. However, this has to be configured from global config mode using the commands “logging standby” and “logging active”.
› Line cards send logs to the active RP by default
[local]Ericsson# show log startup
Ericsson Log
Ericsson IPOS
Context ID 0x40080001
Dec 16 05:59:26:{3/LP}: %FABRICD-6-INFO: Enable WLCFAP IRQ.
Dec 16 05:59:26:{3/LP}: %FABRICD-6-INFO: Enable FAP.0 IRQ.
Dec 16 05:59:26:{3/LP}: %FABRICD-6-INFO: IPC Event: FAPFMA_EVENT_IPC_FMM_BIRTH
Dec 16 05:59:26: %PAD-6-INFO: SVC - proc_asgSl_card_boot_events():885: slot 3, ASG_SL_CARD_INIT_PASSED received
Dec 16 05:59:26: %PAD-6-INFO: SVC - slMakeEvent_asg_cb():504: slot 3, Card_Boot_Event :6, image 0, source:1
Dec 16 05:59:26: %PAD-6-INFO: Card activation completed on slot 3
Dec 16 05:59:26: {3/LP}: %FABRICD-6-INFO: Sync Type: FAPFMA_FMR_SYNC
Dec 16 05:59:26: {3/LP}: %CAD-6-INFO: caCdlNpuUpdateInitPhase: All drivers have completed initialization. Making transition to Ready.
[local]Ericsson(config)# logging ?
active     Configure to log active event to standby controller
cct-valid  Configure to log only event with valid cct
cmd-audit  Configure to log commands
debug      Configure to log debug events
standby    Configure to log standby event to active controller
timestamp  Configure the timestamp information of log
[local]Ericsson(config)#
Figure 6-6: Logs from cards
1.5
Show log and time
[local]Ericsson# show log active all since 2015:06:23:21:54:17
Jun 23 21:54:35.086: %PAD-6-INFO: virtual bool
PktBaseEPortMgr::setPortOperation(EnableDisable): Port 1/14,
enableDisable=ENABLE
Jun 23 21:54:35.086: %PAD-6-INFO: caPktPortEnable(1/14)
Jun 23 21:54:35.195: %APP-6-INFO: submitting alarm, major: 193, minor: 1,
dn: ManagedElement=1,Equipment=1,Slot=0,Port=13, severity: 3, text: Link
down , time: 1340468651 (in applibcm_svr_cfg_event_callback)
Jun 23 21:54:35.807: %CSM-6-PORT: ethernet 1/14 link state UP service state
UP, overall admin is UP
Jun 23 21:54:35.811: [0002]: %VRRP-5-STATE_CHANGE: VRRP router
SS7_vrrp_1/151 state change from Init to Backup due to event Interface Up
Jun 23 21:54:35.811: [0003]: %VRRP-5-STATE_CHANGE: VRRP router
sr_om_1_sw01/150 state change from Init to Backup due to event Interface Up
Jun 23 21:54:35.811: [0004]: %VRRP-5-STATE_CHANGE: VRRP router
SR_GB_1_Sr1/10 state change from Init to Backup due to event Interface Up
Jun 23 21:54:38.201: %APP-6-INFO: submitting alarm, major: 193, minor: 1,
dn: ManagedElement=1,Equipment=1,Slot=0,Port=13, severity: 0, text: Link
down , time: 1340468677 (in applibcm_svr_cfg_event_callback)
Figure 6-7: Show log and time
A very useful facility when using log files is to search based on time. In many
cases we know when a certain event occurred and wish to analyze the logs
around that time.
In this case, by using the option “since”, we show events logged since the specified time.
[local]Ericsson# show log active all since 2015:06:23:21:54:17 until 2015:06:23:21:55
Jun 23 21:54:35.086: %PAD-6-INFO: virtual bool
PktBaseEPortMgr::setPortOperation(EnableDisable): Port 1/14,
enableDisable=ENABLE
Jun 23 21:54:35.086: %PAD-6-INFO: caPktPortEnable(1/14)
Jun 23 21:54:35.195: %APP-6-INFO: submitting alarm, major: 193, minor: 1,
dn: ManagedElement=1,Equipment=1,Slot=0,Port=13, severity: 3, text: Link
down , time: 1340468651 (in applibcm_svr_cfg_event_callback)
Jun 23 21:54:35.807: %CSM-6-PORT: ethernet 1/14 link state UP service state
UP, overall admin is UP
Jun 23 21:54:35.811: [0002]: %VRRP-5-STATE_CHANGE: VRRP router
SS7_vrrp_1/151 state change from Init to Backup due to event Interface Up
Jun 23 21:54:35.811: [0003]: %VRRP-5-STATE_CHANGE: VRRP router
sr_om_1_sw01/150 state change from Init to Backup due to event Interface
Up
Jun 23 21:54:35.811: [0004]: %VRRP-5-STATE_CHANGE: VRRP router
SR_GB_1_Sr1/10 state change from Init to Backup due to event Interface Up
Jun 23 21:54:38.201: %APP-6-INFO: submitting alarm, major: 193, minor: 1,
dn: ManagedElement=1,Equipment=1,Slot=0,Port=13, severity: 0, text: Link
down , time: 1340468677 (in applibcm_svr_cfg_event_callback)
Figure 6-8: Show log and time
You can also specify a particular time window while looking through the logs.
Here we see an example of show log between two specific times.
We simply add the “until” option to the previous command.
These commands save us from going through unnecessary log entries.
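The same since/until selection can be reproduced offline on a saved copy of a log. The following Python sketch is illustrative only and is not an SSR tool. The function names are hypothetical, and it assumes the millisecond timestamp format shown in Figure 6-8 (for example “Jun 23 21:54:35.807”), English month names, and a year supplied by the caller because the log lines carry none.

```python
from datetime import datetime


def parse_stamp(line, year):
    """Return the leading 'Mon DD HH:MM:SS.mmm' stamp of a line, or None."""
    try:
        return datetime.strptime(f"{year} {line[:19]}", "%Y %b %d %H:%M:%S.%f")
    except ValueError:
        return None  # wrapped continuation line or a different format


def in_window(lines, since, until, year):
    """Keep entries stamped between since and until (inclusive)."""
    kept, keep = [], False
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is not None:
            keep = since <= stamp <= until
        if keep:  # continuation lines follow the entry they belong to
            kept.append(line)
    return kept


sample = [
    "Jun 23 21:54:35.086: %PAD-6-INFO: caPktPortEnable(1/14)",
    "Jun 23 21:54:35.807: %CSM-6-PORT: ethernet 1/14 link state UP service state",
    "UP, overall admin is UP",
]
window = in_window(sample, datetime(2012, 6, 23, 21, 54, 17),
                   datetime(2012, 6, 23, 21, 55), year=2012)
print(len(window))
```

Lines that do not start with a timestamp are treated as the wrapped remainder of the previous entry, so a multi-line message is kept or dropped as a whole.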
1.6 Log Files
The operating system contains two log buffers: main and debug. By default,
messages are stored in the main log. If the system restarts, for example as a result
of a logging daemon or system error, and the logger daemon shuts down and
restarts cleanly, the main log buffer is saved.
The main log buffer is a circular buffer, so the oldest entries are eventually
overwritten by newer ones. For this reason, some log information is also stored
in files for future access.
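The overwrite behaviour of a circular buffer can be seen in a few lines of Python. This is a generic sketch of the data structure, not the logger daemon's implementation; the capacity of 3 entries is an arbitrary value chosen so that the wrap-around is visible.

```python
from collections import deque

# A deque with maxlen behaves like a circular buffer: once it is full,
# appending a new entry silently discards the oldest one.
main_log = deque(maxlen=3)
for number in range(1, 6):
    main_log.append(f"event {number}")

print(list(main_log))  # ['event 3', 'event 4', 'event 5']
```

Events 1 and 2 are gone, which is why anything that must survive has to be copied to a file before the buffer wraps.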
Logs stored in files
› /md/loggd_dlog.bin: LOG buffer contents, saved when the logger daemon restarts
› /md/loggd_startup.log, /md/loggd_startup.log1: log messages of severity
0,1,2,3,4,5,6
› /md/loggd_persistent.log, /md/loggd_persistent.log1, /md/loggd_persistent.log2,
/md/loggd_persistent.log3: log messages of severity 0,1,2,3
Figure 6-9: Log Files
First of all, if the logger daemon shuts down or restarts cleanly, the contents of
the main log buffer are saved in the loggd_dlog.bin file in the /md directory.
This preserves logs across a system reload.
In addition, two predefined log files are created.
The first of these is the Startup Log. A startup log is created for every reboot of
the SSR. It contains all logs of severity 0 to 6 since the last startup. The file is
written to continuously and does not wrap around, so the startup log always
contains the logs since the last system startup. The file does, however, have a
limited size of around 10 MB; once this limit is reached, the file is no longer
written to.
Two startup log files are stored: the current startup log, loggd_startup.log, and
the startup log from before the last system reload, loggd_startup.log.1.
The second predefined log file is the Persistent Log file. Persistent logs are logs
that are not lost on system reload. They are written to continuously, but only
Error and more severe log messages (severity 0 to 3) are written. Up to 4
persistent log files are stored, and each fills up to a maximum of around 10 MB.
When the persistent log reaches its maximum size it is moved to
loggd_persistent.log.1, a new loggd_persistent.log file is created, and so on.
Unlike startup logs, persistent logs are not rotated on reload; they are written to
until the maximum size is reached.
This is true for all RPs and line cards. These files can be found in the /md
directory of each of these cards.
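The persistent log behaviour described above (severity 0 to 3 only, rotation by size rather than by reload) can be modelled in a short Python sketch. This is an in-memory illustration under stated assumptions, not the logger daemon's code: the function name is hypothetical, sizes are counted in characters, and dropping the oldest file once the limit of 4 is exceeded is an assumption that the text does not confirm.

```python
MAX_BYTES = 10 * 1024 * 1024  # "around 10 MB" per file, from the text
MAX_FILES = 4                 # "up to 4 persistent log files", from the text


def write_persistent(files, message, severity,
                     max_bytes=MAX_BYTES, max_files=MAX_FILES):
    """Append one message to a model of the persistent log files.

    files[0] models loggd_persistent.log, files[1] the .1 file, and so on.
    """
    if severity > 3:
        return  # only error (3) and more severe messages are written
    if not files:
        files.append([])
    current_size = sum(len(entry) for entry in files[0])
    if files[0] and current_size + len(message) > max_bytes:
        files.insert(0, [])    # the full file becomes .1; a new current file starts
        del files[max_files:]  # assumption: files beyond the limit are dropped
    files[0].append(message)


files = []
for text, severity in [("aaaaa", 3), ("bbbbb", 2), ("info", 6), ("ccccc", 0)]:
    write_persistent(files, text, severity, max_bytes=10)
print(files)  # [['ccccc'], ['aaaaa', 'bbbbb']]
```

The severity 6 message never reaches the file, and the third accepted message triggers a rotation because the 10-character limit used in the example would otherwise be exceeded.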
1.6.1 Custom Log Files and Filters
Users may also generate their own custom log files using the “logging file”
command from context-config mode. By default, the file is stored in the /md
directory.
The file can then be customized to contain logs up to a particular severity using
the “logging filter file” command.
The logging display filter can also be applied to the console, the monitoring
terminal and the syslog server.
[local] Ericsson(config-ctx)# logging file MYLOG.log
[local] Ericsson(config-ctx)# logging filter file ?
alert          Log alert and more severe events          (priority 1)
critical       Log critical and more severe events       (priority 2)
debug          Log all events, including debug           (priority 7)
emergency      Log only emergency events                 (priority 0)
error          Log error and more severe events          (priority 3)
informational  Log informational and more severe events  (priority 6)
notice         Log notice and more severe events         (priority 5)
warning        Log warning and more severe events        (priority 4)
[local] Ericsson(config-ctx)# logging filter ?
console  Configure logging display filter for the console
file     Configure logging display filter for file
monitor  Configure logging display filter for monitoring terminal
syslog   Configure logging display filter for syslog server
Figure 6-10: Custom Log Files and filters
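The “and more severe” wording of the filter keywords means that a message passes when its severity number is less than or equal to the keyword's priority. The Python sketch below illustrates that comparison on log lines taken from this chapter. It is not part of the SSR CLI; the function name is hypothetical, and the regular expression assumes the %FACILITY-SEVERITY-MNEMONIC tag layout seen in the sample logs.

```python
import re

# Keyword -> priority, as listed by "logging filter file ?" in Figure 6-10.
PRIORITY = {
    "emergency": 0, "alert": 1, "critical": 2, "error": 3,
    "warning": 4, "notice": 5, "informational": 6, "debug": 7,
}

# A log tag looks like %FACILITY-SEVERITY-MNEMONIC, for example %CSM-6-PORT.
TAG = re.compile(r"%([A-Z0-9_-]+?)-(\d)-([A-Z0-9_]+)")


def passes(line, level):
    """True if the line's severity is at least as severe as the filter level."""
    match = TAG.search(line)
    if match is None:
        return False
    return int(match.group(2)) <= PRIORITY[level]


print(passes("Jun 23 21:43:54.474: %PM-5-GEN: restarting <rcm> now", "notice"))
print(passes("Jun 23 21:43:51.474: %PM-6-PROCDIE: rcm is dying, pid 3982", "error"))
```

The first call prints True (severity 5 passes a notice filter) and the second prints False (severity 6 is less severe than error).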
1.6.2 Log Files Location
Both default and custom files are saved in the /md folder.
In this example we can see the default files and the custom file we created earlier.
[local] Ericsson# cd /md
Current directory is now /md
[local] Ericsson# dir
Contents of /md/
total 184484
-rw-r--r-- 1 root root      16 Jun 23 02:23 loggd_ddbg.bin
-rw-r--r-- 1 root root  777848 Jun 23 02:23 loggd_dlog.bin
-rw-r--r-- 1 root root 5081846 Jun 23 06:21 loggd_persistent.log
-rw-r--r-- 1 root root 9751679 Jun 22 22:59 loggd_persistent.log.1
-rw-r--r-- 1 root root 9751727 Jun 18 01:16 loggd_persistent.log.2
-rw-r--r-- 1 root root 9751661 Dec 15 22:00 loggd_persistent.log.3
-rw-r--r-- 1 root root 9751660 May 24 21:02 loggd_persistent.log.4
-rw-r--r-- 1 root root 9751711 May 17 02:31 loggd_persistent.log.5
-rw-r--r-- 1 root root  289774 Jun 23 06:20 loggd_startup.log
-rw-r--r-- 1 root root  262601 Jun 23 05:01 loggd_startup.log.1
-rw-rw-r-- 1 root root     948 Dec 16 12:56 MYLOG.log
Figure 6-11: Log Files location
1.6.3 Display Log Files
[local] Ericsson# show log file loggd_persistent.log.5
Ericsson Log
Ericsson IPOS
Context ID 0x40080001
Sep 23 16:08:18: %LOG-6-SEC_ACTIVE: Sep 23 16:08:18: {5/LP}: %FABL-ALDSUPPORT-3-INTERNAL_ERR: fwd_al_adj_create_raw, Error Code: FWD_AL_ERROR_CIRCUIT_NOT_FOUND, Adj_id: 0x98e342;Adj_cookie: 0;Circuit handle: 5/1:511:63:31/1/2/304107;Port: 1;MTU: 1500;Encap length: 0;Encap s
Sep 23 16:08:18: %LOG-6-SEC_ACTIVE: Sep 23 16:08:18: {5/LP}: %FABL-ALDSUPPORT-3-INTERNAL_ERR: fwd_al_circuit_control_pkts, Error Code: FWD_AL_ERROR_CIRCUIT_NOT_FOUND, Circuit handle: 5/1:511:63:31/1/2/304107;FABL_API_MODULE_ID:IFACE;
Sep 23 16:08:18: %LOG-6-SEC_ACTIVE: Sep 23 16:08:18: {5/LP}: %FABL-ALDSUPPORT-3-INTERNAL_ERR: fwd_al_circuit_mac_config, Error Code: FWD_AL_ERROR_CIRCUIT_NOT_FOUND, Circuit handle: 5/1:511:63:31/1/2/304107; Mac: 00:02:3b:04:57:68;
Sep 23 16:08:18: %LOG-6-SEC_ACTIVE: Sep 23 16:08:18: {5/LP}: %FABL-ALDSUPPORT-3-INTERNAL_ERR: fwd_al_circuit_down, Error Code: FWD_AL_ERROR_CIRCUIT_NOT_FOUND, Circuit count: 1; Circuit List 5/1:511:63:31/1/2/304107;
Sep 23 16:08:18: %LOG-6-SEC_ACTIVE: Sep 23 16:08:18: {5/LP}: %FABL-ALDSUPPORT-3-INTERNAL_ERR: fwd_al_circuit_mac_config, Error Code: FWD_AL_ERROR_CIRCUIT_NOT_FOUND, Circuit handle: 5/1:511:63:31/1/2/304107; Mac: 00:02:3b:04:57:68;
Sep 23 16:08:18: %LOG-6-SEC_ACTIVE: Sep 23 16:08:18: {5/LP}: %FABL-ALDSUPPORT-3-INTERNAL_ERR: fwd_al_circuit_create, Error Code: FWD_AL_ERROR_INSUFFICIENT_MEMORY, Circuit handle: 5/1:511:63:31/1/2/304108;MTU: 1500;IPv6 MTU: 1500;Parent circuit: 5/1:511:63:31/1/2/303308;
Sep 23 16:08:18: %LOG-6-SEC_ACTIVE: Sep 23 16:08:18: {5/LP}: %FABL-ALDSUP
---more
Figure 6-12: Display Log Files
1.7 Filter Based on Facility
When debugging, it is very useful to be able to look only at logs from a particular
facility. This is done using the “show log fac” command.
This option can be applied to any type of log, for example a log file.
[local] Ericsson# show log fac ?
aaa     AAA facility
amcm    AMC Manager facility
aos     AOS facility
app     Application facility
arp     ARP facility
asesdk  ASESDK facility
asm     Remote mini-CSM facility
aspha   ASP HA Manager facility
atm     ATM facility
bgp     BGP facility
bot     SSC File Manager facility
--more--
[local] Ericsson# show log file loggd_startup.log fac ?
aaa     AAA facility
amcm    AMC Manager facility
aos     AOS facility
app     Application facility
arp     ARP facility
asesdk  ASESDK facility
asm     Remote mini-CSM facility
aspha   ASP HA Manager facility
…
Figure 6-13: Filter Based on Facility
1.7.1 Filter Based on Facility Example
In this example we want to get only the logs related to authentication,
authorization and accounting from the current event log.
As we can see, only logs generated by the “aaa” application are displayed.
[local] Ericsson# show log active fac aaa
Jun 23 05:06:24.312: %AAA-6-INFO: Perform non hitless switchover.
Jun 23 05:06:33.612: %AAA-5-NOTICE: [local] administrator: (test) logged in via tty:
/dev/pts/, host: 155.53.235.45
Jun 23 05:06:59.042: %AAA-5-NOTICE: [local] administrator: (test) logged in via tty:
/dev/pts/, host: 155.53.235.45
Jun 23 06:20:59.931: %AAA-5-NOTICE: [local] administrator: (test) logged in via tty:
/dev/pts/, host: 155.53.234.42
Jun 23 06:26:15.024: %AAA-5-NOTICE: [local] administrator: (test) logged in via tty:
/dev/pts/, host: 155.53.235.45
Jun 23 07:07:51.776: %AAA-5-NOTICE: [local] administrator: (test) found on /dev/pts/4 from
155.53.235.45 - record as logged out.
Jun 23 07:48:54.559: %AAA-5-NOTICE: [local] administrator: (test) found on /dev/pts/1 from
155.53.235.45 - record as logged out.
Jun 23 07:48:56.397: %AAA-5-NOTICE: [local] administrator: (test) found on /dev/pts/2 from
155.53.235.45 - record as logged out.
Figure 6-14: Filter Based on Facility example
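When working on a saved copy of a log, it can help to see which facilities are present before choosing one to filter on. The Python sketch below counts entries per facility. It is an offline illustration, not an SSR command; the function name is hypothetical, and the pattern assumes the %FACILITY-SEVERITY-MNEMONIC tag layout seen in the sample logs of this chapter.

```python
import re
from collections import Counter

# Facility is the part of the tag before the single-digit severity.
TAG = re.compile(r"%([A-Z0-9_-]+?)-\d-[A-Z0-9_]+")


def count_by_facility(lines):
    """Count log entries per facility name (lower case, as used by 'fac')."""
    counts = Counter()
    for line in lines:
        match = TAG.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


sample = [
    "Jun 23 05:06:24.312: %AAA-6-INFO: Perform non hitless switchover.",
    "Jun 23 05:06:22.206: %PM-5-GEN: PM received ACTIVE event",
    "Jun 23 05:06:22.206: %PM-5-GEN: This RP is going Active.",
    "Jun 23 05:05:15.504: %CSM-6-CARD: card ge-40-port INSERTED in slot 1 READY",
]
print(count_by_facility(sample).most_common())
```

For the four sample lines the result lists pm with 2 entries and aaa and csm with 1 each.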
1.8 PM Process Logs
One process whose logs are of particular interest is the Process Manager, or
pm. This process monitors and controls the operation of all other processes and
has IPC connections to each.
If a process is having a problem, PM logs this.
In this case we have simulated a crash of the RCM process. When we look at the
pm logs we can see information about the rcm process dying and restarting.
PM also shows information about any RPSW switchover event.
RCM Process Crash
PM: process manager
[local] Ericsson# show log active fac pm
Jun 23 21:43:51.474: %PM-6-PROCDIE: rcm is dying, pid 3982
Jun 23 21:43:54.474: %PM-5-GEN: restarting <rcm> now
RPSW Switchover
[local] Ericsson# show log active fac pm
Jun 23 05:06:22.206: %PM-5-GEN: PM received ACTIVE event
Jun 23 05:06:22.206: %PM-5-GEN: Set PM to run in primary mode.
Jun 23 05:06:22.206: %PM-5-GEN: This RP is going Active.
Jun 23 05:06:22.206: %PM-5-GEN: Setting PM as primary.
Jun 23 05:06:22.217: %PM-5-GEN: Reason for controller switch: Card Failed
Jun 23 05:06:22.218: %PM-6-INFO: pm_send_status: Notifying ns
Jun 23 05:06:22.218: %PM-6-INFO: pm_send_status: Notifying rpsw_dtp
Figure 6-15: Pm Process Logs
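The crash-and-restart pair in the PM log lends itself to a simple offline check. The Python sketch below pairs each PROCDIE entry with a later restart entry for the same process. It is illustrative only: the function name is hypothetical, and the two message patterns are taken from the sample lines in Figure 6-15, so other PM message wordings are not covered.

```python
import re

DIE = re.compile(r"%PM-6-PROCDIE: (\S+) is dying, pid (\d+)")
RESTART = re.compile(r"%PM-5-GEN: restarting <([^>]+)> now")


def crash_report(lines):
    """Map each process that died to its pid and whether PM restarted it."""
    report = {}
    for line in lines:
        died = DIE.search(line)
        if died:
            report[died.group(1)] = {"pid": int(died.group(2)), "restarted": False}
            continue
        restarted = RESTART.search(line)
        if restarted and restarted.group(1) in report:
            report[restarted.group(1)]["restarted"] = True
    return report


pm_log = [
    "Jun 23 21:43:51.474: %PM-6-PROCDIE: rcm is dying, pid 3982",
    "Jun 23 21:43:54.474: %PM-5-GEN: restarting <rcm> now",
]
print(crash_report(pm_log))  # {'rcm': {'pid': 3982, 'restarted': True}}
```

A process that appears with restarted set to False died without a matching restart entry in the lines examined, which is worth a closer look.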
1.9 CSM Process Logs
Another useful process is the Card State Manager (CSM) process. It is responsible
for processing events for cards and ports and may be useful when troubleshooting
issues with ports not coming up.
[local] Ericsson# show log active fac csm
Jun 23 05:05:15.504: %CSM-6-CARD: card ge-40-port INSERTED in slot 1 READY
Jun 23 05:05:15.505: %CSM-6-CARD: card ge-40-port INSERTED in slot 17 READY
Jun 23 05:05:15.506: %CSM-6-CARD: card alsw INSERTED in slot ALSW1
Jun 23 05:05:15.506: %CSM-6-CARD: card alsw INSERTED in slot ALSW2
Jun 23 05:05:15.506: %CSM-6-CARD: card sw INSERTED in slot SW1
Jun 23 05:05:15.507: %CSM-6-CARD: card sw INSERTED in slot SW2
Jun 23 05:05:15.507: %CSM-6-CARD: card sw INSERTED in slot SW3
Jun 23 05:05:15.507: %CSM-6-CARD: card sw INSERTED in slot SW4
Jun 23 05:05:24.765: %CSM-6-PORT: ethernet 1/11 link state UP service state UP,
overall admin is UP
Jun 23 05:05:24.765: %CSM-6-PORT: ethernet 1/12 link state UP service state UP,
overall admin is UP
Jun 23 05:05:24.765: %CSM-6-PORT: ethernet 1/14 link state UP service state UP,
overall admin is UP
Jun 23 05:05:39.570: %CSM-6-CARD: slot PM5, ALARM_CLEARED: Input Failure - Both Feeds
--More--
Figure 6-16: CSM Process Logs
1.10 ISM Process
Interface and Circuit State Manager process logs can also be useful for reporting
information about links going up and down or reload switchovers.
[local] Ericsson# show log active fac ism
Jun 23 05:05:04.586: %ISM-6-STATE_TOGGLE: This ISM going standby.
Jun 23 05:05:04.607: %ISM-6-CHKPT_OK: Marked ISM checkpoint as OK
Jun 23 05:05:04.796: %ISM-6-PPA_REG1: Switchover is complete and can process PPA
registration now.
Jun 23 05:06:22.224: %ISM-6-STATE_TOGGLE: This ISM going active.
Jun 23 05:06:22.226: %ISM-6-SWOVR_TYPE: Performing *** NON HITLESS *** switchover. All
dynamic
and subcribers circuits will be deleted.
Jun 23 05:06:22.309: %ISM-6-SENT_IPC: Sent RESYNC ipc to component: CSM.
Jun 23 05:06:24.311: %ISM-6-SENT_EVENT: Sent event: XC RESYNC, to MBE: dot1q
Jun 23 05:06:24.311: %ISM-6-SENT_EVENT: Sent event: XC RESYNC, to MBE: aaa
Jun 23 05:06:24.484: %ISM-6-SENT_EVENT: Sent event: XC DONE, to MBE: aaa
Jun 23 05:06:24.484: %ISM-6-SENT_IPC: Sent XC DONE ipc to component: ifmgr.
Jun 23 05:06:24.484: %ISM-6-SENT_EVENT: Sent event: XC DONE, to client: snmp
Jun 23 05:06:24.485: %ISM-6-PPA_REG1: Switchover is complete and can process PPA
registration
now.
Jun 23 05:09:23.292: %ISM-6-SB_RDY_SWOVR: Standby ISM is ready for switchover
Figure 6-17: ISM Process
1.11 Filter Based on Facility on Card
As each Card has its own logging facility, the commands mentioned can also be
applied to individual cards.
[local]SSR8020# show log card 3 fac pm
-------------------------------------------------------------
Slot number : 3/LP
Card Type   : ge-40-port
Aug 19 16:46:59: {3/LP}: %PM-6-INFO: All run processes initialized
Aug 19 16:47:59: {3/LP}: %PM-6-INFO: Declaring system healthy
[local]SSR8020# show log card 3 fac ns
-------------------------------------------------------------
Slot number : 3/LP
Card Type   : ge-40-port
Aug 19 16:46:47: {3/LP}: %NS-6-INFO: New namespace 'RP.ACTIVE' from ep
[127.2.253.1:6001|000|003]
Aug 19 16:46:48: {3/LP}: %NS-6-INFO: New namespace 'RP.STANDBY' from ep
[127.2.252.1:6001|000|003]
Aug 20 00:39:41: {3/LP}: %NS-6-INFO: New namespace 'LC.05' from ep
[127.2.4.1:6001|000|003]
Figure 6-18: Filter based on facility on card
1.12 Logger Verification
The show logging command displays some state information and statistics for the
main log buffer (Log) and the debug log buffer (Dbg).
The 'Logger Buffer Locked' line indicates whether the log buffer is currently
locked. While the buffer is locked, all messages are dropped. This lock should be
very transient. If a buffer is locked, repeat the 'show logging' command several
times, waiting a few seconds between each repeat. If a buffer is 'stuck' in the
locked state, this is very likely a bug.
The 'Logged msg' line is a count of the number of messages that have been
inserted into the log buffers.
The 'Logger Drop Counter' section lists counts for any dropped messages, by
component.
› Logging is a process and one can verify the state of the logger system
[local]Train-1# show logging
% Logging Information
% ===================
% Logger Uptime         : 09:37:56 Wed Dec 16 2015
% Logger Buffer (KB)    : Log: 981, Dbg: 1023
% Logger Buffer Locked  : Log: N, Dbg: N
% # Logged msg          : Log: 343, Dbg: 0
% # Logged Filtered     : Log: 0, Dbg: 0
% # Logged Rate Limited : Log: 0, Dbg: 0
% ==================
% Logger Drop Counter   : All drop counters are all ZERO
[local]Train-1#
Log: Main log buffer
Dbg: Debug log buffer
Buffer not locked – no performance problems
Good news - no dropped log messages
Figure 6-19: Logger verification
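The “repeat the command and see whether the lock persists” check can be sketched in Python for captured outputs. This is an illustration under assumptions, not an SSR feature: the function names are hypothetical, the pattern assumes the 'Logger Buffer Locked' line appears on one line as reconstructed in Figure 6-19, and the value Y for a locked buffer is assumed because the figure only shows N.

```python
import re

LOCKED = re.compile(r"Logger Buffer Locked\s*:\s*Log:\s*(\S+?),\s*Dbg:\s*(\S+)")


def locked_flags(show_logging_output):
    """Return the (Log, Dbg) lock flags from one 'show logging' capture."""
    match = LOCKED.search(show_logging_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


def stuck(samples, buffer_index=0, locked_value="Y"):
    """True if every capture shows the chosen buffer locked (0=Log, 1=Dbg)."""
    return bool(samples) and all(s[buffer_index] == locked_value for s in samples)


capture = "% Logger Buffer Locked  : Log: N, Dbg: N"
flags = locked_flags(capture)
print(flags, stuck([flags, flags, flags]))
```

Captures would be taken a few seconds apart, as the text recommends; a single locked reading is expected to be transient and is not treated as a problem by this check unless every capture agrees.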
1.13 Show Logging Card Information
This command is also applicable to individual cards.
[local] Ericsson# show logging card 1
-------------------------------------------------------------
Slot number : 1/LP
Card Type   : ge-40-port
% Logging Information
% ===================
% Logger Uptime         : 02:10:50 Wed Dec 16 2015
% Logger Buffer (KB)    : Log: 979, Dbg: 1023
% Logger Buffer Locked  : Log: N, Dbg: N
% # Logged msg          : Log: 431, Dbg: 0
% # Logged Filtered     : Log: 0, Dbg: 0
% # Logged Rate Limited : Log: 0, Dbg: 0
% ==================
% Logger Drop Counter   : All drop counters are all ZERO
Figure 6-20: Show Logging Card information
To reduce the number of informational messages displayed on the console,
changes were introduced in Release 12.1 to suppress the default display of INFO
messages on the console and terminal connections. By default, these messages
are no longer displayed, but they are still stored in the system log buffer.
1.14 Logging Display Info
[local]Train-1# logging display-info
[local]Train-1# terminal monitor
[local]Train-1# conf
Enter configuration commands, one per line, 'end' to exit
[local]Train-1(config)# port eth 2/8
[local]Train-1(config-port)# no shut
[local]Train-1(config-port)# commit
Transaction committed.
Feb 6 15:37:34: %CSM-6-PORT: ethernet 2/8 link state UP service state UP, overall admin is UP
[local]Train-1(config-port)# shut
[local]Train-1(config-port)# commit
Feb 6 15:38:03: %CSM-6-PORT: ethernet 2/8 link state DOWN service state DOWN, overall admin is DOWN
Feb 6 15:38:03: %CSM-6-PORT: ethernet 2/8 link state down, trigger source: Configuration changed
[local]Train-1(config-port)# end
[local]Train-1# no logging display-info
[local]Train-1# conf
[local]Train-1(config)# port eth 2/8
[local]Train-1(config-port)# no shut
[local]Train-1(config-port)# commit
[local]Train-1(config-port)#
› From release 12.1, level 6 info messages are not displayed by default
› Use of the logging display-info command is discouraged
For a Line Card:
logging card slot display-info
Figure 6-21: Logging display info
However, if you want to display INFO messages (for example, for scripting
purposes), you can enable them for the RP CLI logs by entering the hidden
command logging display-info. In this example we enter the command
“logging display-info” and then enable a port so as to generate a Level 6 log
message.
As we can see, the Level 6 message is displayed.
We check again by disabling the port: Level 6 messages are displayed.
However, it is important to note that use of this command is discouraged because
it can result in a large number of undocumented messages being displayed on the
console.
To disable the display of INFO messages on the console, use the no form of the
command.
If we now enable the port again, we do not see any Level 6 message.
Finally, to display INFO messages for a line card, use the command
logging card slot display-info, where slot is the number of the slot that hosts
the card.
1.15 Logging Debug
As we have already said, there are two separate buffers for log and debug
messages.
If the logger daemon shuts down and restarts cleanly, the contents of the log
buffer are saved in the /md/loggd_dlog.bin file, while the debug buffer is saved
in the /md/loggd_ddbg.bin file.
Because of this separation, by default you cannot use the command “show log”
to display debug messages.
However, we can use the command “logging debug” from global config mode
to send debug messages to the log buffer.
› /md/loggd_dlog.bin
LOG
Debug
› /md/loggd_ddbg.bin
› “show log” displays only content of log messages by
default
› “logging debug” sends debug message to log buffer
Figure 6-22: Logging debug
The command “logging” in global config mode has several options.
We have already talked about “logging active” and “logging standby”.
With “logging active”, the Active RP sends its log messages to the Standby RP,
while with “logging standby” the Standby RP sends its log messages to the Active
RP.
In a similar way, when we use “logging debug”, debug logs are sent to the log
buffer and from there to the log files.
However, it is important to note that “logging debug” sends to the log buffer only
those events which are displayed on either the console or a terminal screen.
[local]Train-1(config)#logging ?
active     Configure to log active event to standby controller
cct-valid  Configure to log only event with valid cct
debug      Configure to log debug events
standby    Configure to log standby event to active controller
timestamp  Configure the timestamp information of log
[local]Train-1(config)#
(Diagram labels: Active RPSW, Standby RPSW, Log engine, Log file, Debug engine,
Terminal / console, “Term monitor or Logging console”, “Logging debug”)
Logging debug ONLY sends events which are actually
displayed to either console or terminal screen
Figure 6-23: Logging debug (global config logging)
In the following example we enable debug for static rib.
Initially “logging debug” is not configured. We add a static route to generate a
debug message. However, this message cannot be seen in the output of “show log”.
We then enable “logging debug” and add another static route. This message can
be seen in the output of “show log” because it has been sent from the debug
buffer to the log buffer.
Finally, note that we have enabled terminal monitor because, as we said before,
only debug logs which are displayed are sent to the log buffer.
1.16 Logging Debug
[local]Train-1# debug static rib
[local]Train-1# terminal monitor
[local]Train-1# conf
[local]Train-1(config)# context local
[local]Train-1(config-ctx)# ip route 11.12.12.0/24 8.8.8.8
[local]Train-1(config-ctx)# commit
Feb 7 15:20:23: %STATIC-7-RIB: register nexthop: 8.8.8.8, context 0x40080001,
nexthop_afi 0, metric 4294967295, ifgrid 0x0, default 0, magic 0, bfd-disabled
[local]Train-1(config-ctx)# end
[local]Train-1# show log | grep "STATIC-7"
[local]Train-1# conf
[local]Train-1(config)# logging debug
[local]Train-1(config)# context local
[local]Train-1(config-ctx)# ip route 11.12.13.0/24 8.8.8.9
[local]Train-1(config-ctx)# commit
Feb 7 15:32:29: %STATIC-7-RIB: register nexthop: 8.8.8.9, context 0x40080001,
nexthop_afi 0, metric 4294967295, ifgrid 0x0, default 0, magic 0, bfd-disabled
[local]Train-1(config-ctx)# end
[local]Train-1# show log | grep "STATIC-7"
Feb 7 15:32:29: %STATIC-7-RIB: register nexthop: 8.8.8.9, context 0x40080001,
nexthop_afi 0, metric 4294967295, ifgrid 0x0, default 0, magic 0, bfd-disabled
Figure 6-24: Logging debug
1.17 Log File Collection
› In the R14A release, logs can be collected from both the active and standby
controller cards and the line cards.
› The “save tech-support log” command is used.
– Collects logs for the active controller card, for specific cards,
for the SSC1-A host, or for all controller cards and line cards.
– Stores them in the /md directory on the active controller card.
› Command Syntax:
– save tech-support log [card {slot | all | standby | host}]
Figure 6-25: Log File Collection
2 Syslog Configuration
Figure 6-26: Syslog Configuration
2.1 Syslog Server
The SSR OS contains two log buffers: main and debug.
By default, generic messages are stored in the main log.
At restart, the log buffers are saved to /md/loggd_dlog.bin for the main log
buffer and to /md/loggd_ddbg.bin for the debug log buffer.
› Log messages can be sent to a remote Syslog server
– Convenient for large installations where multiple systems can log to
a remote server for centralized management
› Log messages can be sent to multiple Syslog servers per
context
› Messages sent can be filtered to be a certain severity level
and higher
Figure 6-27: Syslog server
2.2 Exercise 5: Logging & Syslog
(Diagram labels: SSR; Subscriber side, Management side, Backbone side; subscriber
circuits on an Ethernet port and an ATM PVC, session will be PPPoE; Management,
Syslog, Backbone; contexts local and xyz)
Figure 6-28: Reference for Syslog lab
› Please move to the exercises book.
Figure 6-29: Exercise 5: Logging & Syslog
2.2.1 Exercise Review: Configure Syslog & Debug
Tasks:
1) Logging syslog (context GX_xyz): logging syslog IP# facility localX
X = group number
IP# = Assigned by the instructor
2) Logging debug: logging debug
(Diagram labels: Transport side, management, local, GX_xyz, Ethernet 3/5, READY)
As a result, each context will log a different
facility label to its Syslog server
› Messages from context GX_xyz have facility label localX
Figure 6-30: Exercise review: Configure Syslog & Debug
2.2.2 Exercise Review: Syslog Server Environment
Note: To view SYSLOG events on the lab server, telnet to the SSH server’s public
IP address, log in with the student credentials, and view the log file.
For example: tail -f /var/log/HOSTS/10.1.1.10X/messages
Lab Note: Instructors, you may need to change the permissions of the hostname
folder (for example, 10.1.1.101) because when the SSR OS first creates this folder
for syslog messages, it is only available to root. Change the folder permissions to
give access to all users.
Refer to the instructor’s guides for this course for details.
› Start debug for aaa
[G1_xyz]Train-1# debug aaa all
› For each SSR source IP address, the syslog server will
store the logging output in
/var/log/HOSTS/<hostname>
› In our example:
[student@ssh1-gothenburg-1]# tail -f /var/log/HOSTS/10.1.1.101/messages
Nov 7 13:21:38 [local6.notice] Nov 7 13:36:06: %AAA-5-NOTICE: [local] administrator: (ericsson) logged in via tty: /dev/ttyp0, host: 10.1.1.3
Nov 7 13:21:41 [local6.notice] Nov 7 13:36:09: %AAA-5-NOTICE: [local] administrator: (ericsson) found on /dev/ttyp0 from 10.1.1.3 - record as logged out.
--- cut ---
Figure 6-31: Exercise review: Syslog server environment
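The “local6.notice” label in the server output combines a syslog facility and a severity. In the standard syslog protocol these two values travel in a single PRI number, computed as facility code times 8 plus severity code, with local0 to local7 using facility codes 16 to 23. The Python sketch below shows the arithmetic; the function names are hypothetical, and the sketch describes the generic syslog encoding rather than anything specific to the SSR.

```python
# Facility codes for local0..local7 and severity codes per the syslog protocol.
FACILITIES = {f"local{n}": 16 + n for n in range(8)}
SEVERITIES = {
    "emergency": 0, "alert": 1, "critical": 2, "error": 3,
    "warning": 4, "notice": 5, "informational": 6, "debug": 7,
}


def encode_pri(facility, severity):
    """Combine facility and severity names into the syslog PRI value."""
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


def decode_pri(pri):
    """Split a PRI value back into (facility code, severity code)."""
    return divmod(pri, 8)


print(encode_pri("local6", "notice"))  # 181
print(decode_pri(181))                 # (22, 5)
```

Giving each context its own localX facility, as in the exercise, therefore lets the syslog server separate the contexts by facility code alone.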
2.2.3 Exercise Review: Save and Display the Logs
› Save (active) logs to the file “ericsson.log”:
[local]Train-1# save log text ericsson.log
› Display the content of this file:
[local]Train-1# show log file ericsson.log
--- cut ---
Dec 5 11:12:07.227: %DLM-6-INFO: Standby RPSW's /flash may not be in sync
Dec 5 11:12:07.573: %DLM-6-INFO: Standby RPSW synced /flash successfully
Dec 5 11:12:10.953: {1/LP}: %CAD-5-NOTICE: caLcSmSetState: Old State: Booting (1)
Dec 5 11:12:10.954: {1/LP}: %CAD-5-NOTICE: caLcSmSetState: New State: Running (2)
Dec 5 11:12:11.338: %CSM-6-PORT: ethernet 1/1 link state UP service state UP, overall
admin is UP
› Display the active logs for aaa only:
[local]Train-1# show log active fac aaa
Nov 7 13:11:26: %AAA-5-NOTICE: [local] administrator: (ericsson) found on /dev/ttyp0
from 10.1.1.3 - record as logged out.
Nov 7 13:15:24: %AAA-5-NOTICE: [local] administrator: (ericsson) logged in via tty:
/dev/ttyp0, host: 10.1.1.3
Nov 7 13:32:24: %AAA-5-NOTICE: [local] administrator: (ericsson) found on /dev/ttyp0
from 10.1.1.3 - record as logged out.
--- cut ---
Figure 6-32: Exercise review: Save and display the logs
3 Chapter Summary
After this course the participant should be able to:
› Describe the log in SSR
› Understand the Active and History Logs
› Discuss the different types of logs in SSR
› Understand the communication with Syslog Server
› Discuss the concept of communication with Syslog Server
› Configure communication to a Syslog server
Figure 6-33: Chapter Summary
7 Use and Impact of Debugging on the SSR System
Chapter Objectives
After this course the participant will be able to:
› Describe the Use and Impact of Debugging on the SSR
System
› Understand the SSR system debug structure
› Identify the SSR debug process
Figure 7-1: Chapter Objectives
1 Debug Introduction
When working on the SSR, it is useful to debug the system while events take
place in order to analyze exactly what is occurring. Debug is one of the most
powerful troubleshooting tools in the system. It helps you zoom in on and
identify failures.
There are some important facts to know before you start debugging.
Debug is resource intensive: it uses system memory and CPU. Debugging should
therefore be used with caution. When debug is enabled, the debug messages of a
process may compete with other messages of that process that are critical for it
to run. So while other processes are not affected, debug may impact the debugged
process to the point where the Process Manager fails to receive its heartbeat and
restarts the process.
› Debug – troubleshooting tool
Important facts:
› Resource intensive!
› Debug: last resort!
› Structured searching
› What to debug? (What function to debug?)
– port, routing ...
– System wide debug, for example: debug aaa authen
– Context specific debug, for example: debug ospf lsdb
› Where to start debug? (Which context to start debug from?)
– Contexts are autonomous
› Display debug output to screen
› Administrator privacy
(Diagram labels: port, context local, ABC, XYZ)
Figure 7-2: Debug introduction
Thus, use debug as a last resort while troubleshooting. Use the various show
commands, logs, alarms and similar tools before using debug.
Debug can generate massive output. Having a structured way of searching with
debug helps you minimize system downtime and operational costs. We
recommend following basic steps for structured searching.
A failure can occur in different system components, for example a port or a
context. It is recommended to focus on which function to debug.
It is important to remember that the SSR uses contexts, which is like having
multiple routers on the same system. Because of this, some debug functions are
system wide, for example aaa authentication, while other debugs are context
specific, for example debugging an ospf link-state database, which applies to the
specific context in which the ospf process runs. As there are different instances
of ospf in different contexts, running the debug command in different contexts
results in different outputs.
For this reason it is important to know which context a debug should be run in.
Contexts are autonomous routing environments. This means you can have many
different routers within the system. Once you have multiple contexts, you need to
decide where to start your debug functions.
IPOS in the SSR gives the option to view debug events per context, and also the
option to view debug events for all contexts.
› Contexts are autonomous
› This means you can have many different routers within the SSR
› Once you have multiple contexts you need to decide where you are going to
start your debug functions
› OS gives the option to view debug events
– Per context, and
– For all contexts
(Diagram labels: SSR with Context local, Context abc, Context 123)
Figure 7-3: The challenge
1.1 Debug Coverage
The SSR software supports multiple contexts. Each context is an instance of a
virtual router that runs on the same physical device. A context operates as a
separate routing and administrative domain with separate routing protocol
instances, addressing, authentication, authorization, and accounting. A context
does not share this information with other contexts.
› Debug functions on SSR can be divided into 2 categories:
› Context specific – they display debug information specific to a given context only
– Example: debug ospf lsdb (routing) is considered context specific, since you could have multiple contexts, each running their own ospf instance
› System wide – they display the same information regardless of the context they were started in
– Example: debug aaa authen (negotiation room) is considered system wide, since the negotiation room is actually located on port or circuit level and is not associated with a context
Figure 7-4: Debug coverage (what)
There are two types of contexts: local (a system-wide context) and
administrator-defined (a nonlocal context). The active context (the context
that you are in) affects your debug output.
To debug all contexts on your router, use the system-wide local context. You see
debug output related to this context and all contexts running on the router. For
example, to see all Open Shortest Path First (OSPF) instances on the router, issue
the debug ospf lsdb command in the local context.
[local] Ericsson# debug ospf lsdb
When you debug a local context, the software displays debug output for all
contexts. When a debug function is context specific, the debug output generated
by the local context includes a context ID that you can use to determine the
source of the event (the context in which the event has its origin).
Context-specific debugging refers to navigating to a specific context and running
debug commands from it and filtering out all debug output that is not related to
that context. Context-specific output consists of lines of output identified by a
context ID in brackets, which can be displayed either using context-specific
debugging or system-wide debugging.
1.2 How to Recognize a Debug Function
[NiceService]Train-2#
Dec 16 05:59:26: [0002]: [13/1:1:63/1/2/11]: %AAA-7-AUTHOR: aaa_idx 1000001e:
Dec 16 05:59:26: [0002]: [13/1:1:63/1/2/11]: %AAA-7-AUTHOR: aaa_idx 1000001e:
Dec 16 05:59:26: [0002]: [13/1:1:63/1/2/11]: %AAA-7-AUTHOR: aaa_idx 1000001e:
Fields after the timestamp, left to right: context identifier, internal circuit handle, debug function
[local]Train-2#
Dec 16 05:59:26: [13/1:1:63/1/2/11]: %AAA-7-AUTHEN: aaa_idx 1000001f:
Dec 16 05:59:26: [13/1:1:63/1/2/11]: %AAA-7-AUTHEN: aaa_idx 0:
Missing context identifier (means this type of debug is system wide)
Context identifier included (means this type of debug is context specific)
[local]Train-1#
Dec 16 05:59:26: [0002]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update Router LSA
Dec 16 05:59:26: [0003]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.2 Update Router LSA
Dec 16 05:59:26: [0004]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.3 Update Sum-Net
Figure 7-5: How to recognize a debug function is context specific? Context ID
After a few seconds we get the output shown here. The output includes debug
messages from different contexts.
How do we know this?
Each output row carries a context identification number.
If we type “show context all” we see a list of all contexts in the system and
their IDs.
From this we can match the debug output to a context. The context ID confirms
that the debug function we started is context specific.
In summary:
What debug function did we start? “OSPF LSDB”, which is context specific.
And where did we start it from? From context local.
Context local is “capture all”, and as expected we see output from all contexts
running ospf.
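The recognition rule above can be sketched in a few lines of Python. This is only an illustration for offline analysis of captured output, not an Ericsson tool: the line pattern is inferred from the sample lines in Figure 7-5, and the name classify is hypothetical.

```python
import re

# Pattern inferred from the sample lines in Figure 7-5 (an assumption, not a
# format specification): timestamp, optional [context ID], optional
# [circuit handle], then %FACILITY-SEVERITY-FUNCTION.
DEBUG_LINE = re.compile(
    r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}:+\s*"
    r"(?:\[(?P<ctx>[0-9A-Fa-f]{4})\]:\s*)?"   # four digits, assumed hexadecimal
    r"(?:\[(?P<circuit>[^\]]+)\]:\s*)?"
    r"%(?P<facility>[A-Z]+)-(?P<severity>\d)-(?P<function>[A-Z_]+):"
)


def classify(line):
    """Return scope information for one output line, or None if it does not match."""
    match = DEBUG_LINE.match(line)
    if match is None:
        return None
    # A context identifier in the line means the debug function is context specific.
    scope = "context-specific" if match.group("ctx") else "system-wide"
    return {
        "context_id": match.group("ctx"),
        "function": f'{match.group("facility")}-{match.group("function")}',
        "scope": scope,
    }


if __name__ == "__main__":
    samples = [
        "Dec 16 05:59:26: [0002]: [13/1:1:63/1/2/11]: %AAA-7-AUTHOR: aaa_idx 1000001e:",
        "Dec 16 05:59:26: [13/1:1:63/1/2/11]: %AAA-7-AUTHEN: aaa_idx 0:",
        "Dec 16 05:59:26: [0003]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.2 Update Router LSA",
    ]
    for sample in samples:
        print(classify(sample))
```

Feeding the three kinds of lines from the figure into classify separates the AAA authentication line (no context identifier) from the AAA authorization and OSPF LSDB lines (context identifier present).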
› Depending on the debug function, the location where debugging is started will make a difference
› When debugging, consider the debug function and its relationship to a context
› To allow a “system wide capture”, the context local is enabled as “capture all” context for context specific debug
– In case the debugging function is indeed context specific, the debug output generated by context local will include a context identifier allowing the operator to understand the “source” of the event
– Debug ospf lsdb (routing) within context local will include all ospf instances within the SSR
Figure 7-6: Debug coverage (where)
1.3 Debugging Within Context Local
[local]Train-1# show debug
Example Messages only
OSPF:
lsdb debugging is turned on
[local]Train-1#
Dec 16 05:59:26: %LOG-6-SEC_STANDBY: Dec 16 05:59:26:%CSM-6-PORT: ethernet 3/7 link state UP, admin is UP
Dec 16 05:59:26: %LOG-6-SEC_STANDBY: Dec 16 05:59:26:%CSM-6-PORT: ethernet 3/8 link state UP, admin is UP
Dec 16 05:59:26: %CSM-6-PORT: ethernet 3/7 link state UP, admin is UP
Dec 16 05:59:26: %CSM-6-PORT: ethernet 3/8 link state UP, admin is UP
Dec 16 05:59:26: [0002]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update Router LSA 200.1.1.1/200.1.1.1/80000013 cksum 26f1 len 72
Dec 16 05:59:26: [0003]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.2 Update Router LSA 200.1.2.1/200.1.2.1/80000009 cksum ce79 len 36
Dec 16 05:59:26: [0004]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.3 Update Sum-Net LSA 0.0.0.0/200.1.3.1/80000001 cksum bb74 len 28
Dec 16 05:59:26: [0004]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.3 Update Router LSA 200.1.3.1/200.1.3.1/8000000a cksum 142 len 36
Dec 16 05:59:26: [0004]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update Router LSA 200.1.1.1/200.1.1.1/80000013 cksum 26f1 len 72
Dec 16 05:59:26: [0003]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update Router LSA 200.1.1.1/200.1.1.1/80000013 cksum 26f1 len 72
Dec 16 05:59:26: [0005]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update Router LSA 2.2.2.2/2.2.2.2/8000000a cksum 983b len 36
Dec 16 05:59:26: [0006]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.2 Update Router LSA 2.2.2.6/2.2.2.6/80000009 cksum 7c4e len 36
Dec 16 05:59:26: [0007]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.3 Update Router LSA 2.2.2.10/2.2.2.10/8000000a cksum 803f len 36
Dec 16 05:59:26: [0005]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update AS-Ext LSA 30.1.1.4/2.2.2.2/80000001 cksum 2821 len 36
Dec 16 05:59:26: [0005]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update AS-Ext LSA 2.2.2.0/2.2.2.2/80000001 cksum a6c0 len 36
Dec 16 05:59:26: [0005]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update AS-Ext LSA 30.1.1.0/2.2.2.2/80000001 cksum 50fc len 36
---more---
[local]Train-1# show context all
Context Name    Context ID    VPN-RD    Description
-----------------------------------------------------------------------------
local           0x40080001
Rb-1            0x40080002
Rb-2            0x40080003
Rb-3            0x40080004
Re-1            0x40080005
Re-2            0x40080006
Re-3            0x40080007
[local]Train-1#
To generate this debug output
we performed port down/up
Figure 7-7: Debugging within context local
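Matching a debug line to a context can also be done mechanically. The sketch below assumes that the four-digit identifier in the debug output corresponds to the low 16 bits of the Context ID shown by “show context all”. That is an inference from this figure ([0002] next to 0x40080002), not a documented rule, and the function name is hypothetical.

```python
# Context IDs copied from the "show context all" output in Figure 7-7.
CONTEXTS = {
    "local": 0x40080001,
    "Rb-1": 0x40080002,
    "Rb-2": 0x40080003,
    "Rb-3": 0x40080004,
    "Re-1": 0x40080005,
    "Re-2": 0x40080006,
    "Re-3": 0x40080007,
}


def context_name(debug_id, contexts=CONTEXTS):
    """Map the bracketed ID of a debug line (e.g. '0002') to a context name.

    Assumption: the bracketed value equals the low 16 bits of the Context ID,
    written in hexadecimal.
    """
    wanted = int(debug_id, 16)
    for name, full_id in contexts.items():
        if full_id & 0xFFFF == wanted:
            return name
    return None


if __name__ == "__main__":
    for debug_id in ("0002", "0005", "0007"):
        print(debug_id, "->", context_name(debug_id))
```

Under that assumption, the OSPF lines tagged [0002], [0003] and [0004] come from contexts Rb-1, Rb-2 and Rb-3, and the lines tagged [0005] to [0007] from Re-1, Re-2 and Re-3.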
1.4 Debugging in Different Contexts
Let us look at another example, with debug started from different contexts.
First we start “debug aaa authorization” and verify that the debug has started
by typing “show debugging”.
We start the debug from context NiceService.
If we look at the output, we notice the context ID “0002”, which indicates that
aaa authorization is a context-specific debug function.
In summary: what debug function did we start? AAA authorization, which is
context specific.
[NiceService]Train-2# show debugging
AAA:
authorization debugging is turned on
exception debugging is turned on
[NiceService]Train-2#
When looking from within context NiceService
only authorization and exception debugging
output will be shown
[NiceService]Train-2#
Dec 16 05:59:26:[0002]: [13/1:1:63/1/2/11]: %AAA-7-AUTHOR: aaa_idx 1000001e: unprovision attr 13
Dec 16 05:59:26: [0002]: [13/1:1:63/1/2/11]: %AAA-7-AUTHOR: aaa_idx 1000001e: aaa_ip_addr_prov:
rem pool entry 0x64010117
Dec 16 05:59:26: [0002]: [13/1:1:63/1/2/11]: %AAA-7-AUTHOR: aaa_idx 1000001e: unprovision attr 3
[local]Train-2# show debugging
AAA:
authentication debugging is turned on
exception debugging is turned on
[local]Train-2#
When looking from within context local only
authentication and exception debugging
output will be shown
[local]Train-2#
Dec 16 05:59:26:: [13/1:1:63/1/2/11]: %AAA-7-AUTHEN: aaa_idx 1000001f: Received SESSION_DOWN
msg extern_handle 0
Dec 16 05:59:26:: [13/1:1:63/1/2/11]: %AAA-7-AUTHEN: aaa_idx 0: Received AUTHEN_REQUEST msg
from PPPd for username user2@NiceService with external handle = 0
The examples above are based on system wide debug functions. Hence if one would
enable the same type of debugging in both contexts, the output would be the same
Figure 7-8: Debugging in different contexts
And where did we start it from? From context NiceService.
This means that we only see authorization debugging from context NiceService.
We also see exception debugging, which is started automatically with aaa
authorization.
In the second exercise we start “debug aaa authentication” from context local
and verify which debug is turned on.
We do not see any context ID in the output, which indicates that aaa
authentication is a system-wide debug function.
When looking from within context local, only authentication and exception
debugging output is shown.
In the examples above, each context has a different debug function enabled.
Depending on which context the administrator is monitoring from, the debug
output will be different.
1.5 Debug Relationship with Contexts
Debug on SSR
Debug function type    Debug within a context (another context)             Debug within context local
Context specific       You only see debug output related to the context     You will see debug output related to all contexts
System wide            You would see all output                             You would see all output
For system wide debug functions there is no difference between the two levels.
Figure 7-9: Debug relationship with contexts
Let us look at the relationship between the debug function and the context
where we start the debug. We want to select the debug function, and we also
want to choose which context to start the debug from.
As mentioned, debug functions in the SSR can be divided into context specific
debug functions, which display debug information specific to a given context
only, and system wide debug functions, which display the same information
regardless of the context they were started in.
1. Context specific debug functions
The context specific debug function can be looked at from two levels.
• If you enable the debug function within a specific context other than context
local, you will see the output related only to that context.
• However, if you enable a context specific debugging function from context
local, you will see the output related to all contexts. Debug in context local
has a “capture all” effect.
2. System wide debug functions
• If you enable the system wide debug function from either context local or a
non-local context, you will see the debug output for the whole system.
• There is no difference between the debug outputs for the two levels in this
case.
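The two rules can be written down as a small decision function. This is a minimal sketch of the behaviour described in this section, useful as a memory aid; the function and argument names are hypothetical and nothing here talks to a real SSR.

```python
def is_visible(function_scope, started_in, event_context):
    """Decide whether a debug event is shown to the administrator.

    function_scope: "context-specific" or "system-wide"
    started_in:     context in which the debug was started (e.g. "local")
    event_context:  context in which the event originated
    """
    if function_scope == "system-wide":
        # Same output regardless of where the debug was started.
        return True
    if function_scope != "context-specific":
        raise ValueError(f"unknown scope: {function_scope!r}")
    # Context local acts as "capture all"; other contexts see only their own events.
    return started_in == "local" or started_in == event_context


if __name__ == "__main__":
    print(is_visible("context-specific", "local", "Rb-1"))       # started in local
    print(is_visible("context-specific", "Rb-1", "Rb-2"))        # other context
    print(is_visible("system-wide", "NiceService", "local"))     # system wide
```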
1.6 Send Debug Output to Screen
Another important thing to remember is that when we start a debug we must also
direct the output to the terminal we are using to log in to the system. By
default, when you start a debug you will not see anything. Let us see how to
display debug output on your terminal screen.
Displaying output in the session depends on how you are logged in to the
system: through the console port or through a Telnet/SSH session.
When connected to the console port, you need to enable logging to the console.
› When connected to the craft port (console):
– You need to enable logging to the console
[local]Train-1# config
[local]Train-1(config)# context local
[local]Train-1(config-ctx)# logging console
[local]Train-1(config-ctx)#
– Repeat for each context where debug output needs to be generated
› When connected via Telnet or SSH:
– You need to redirect debugging output to your terminal:
[local]Train-1# terminal monitor
– To pause debug output press CTRL-S; press any key to continue
Figure 7-10: Send debug output to screen
Enter the context configuration mode for the context from which you want to
start the debug.
Type logging console. Remember to commit.
Repeat for each context where debug output needs to be generated.
When connected via Telnet or SSH, you need to redirect debugging output to your
terminal.
From administrator monitoring mode, type terminal monitor.
Repeat for each context where debug output needs to be generated.
To pause debug output, press Ctrl+S.
Press any key to continue showing the debug output.
Finally, it is important to know that each administrator logged in to the
system can start their own debug functions and has a unique destination for
debug output. Administrators do not influence each other.
Displaying Debug Output through the Console Port
Use the logging console command in context configuration mode to view event
log messages on the console. By default, this is enabled in the local context.
[local]Ericsson#config
Enter configuration commands, one per line, 'end' to exit
[local]Ericsson(config)#context local
[local]Ericsson(config-ctx)#logging console
Displaying Debug Output through Telnet or SSH
Use the terminal monitor command in exec mode to view event log messages
on your terminal when you are connected through Telnet or SSH. To pause debug
output at your terminal, type Ctrl+S. To continue, type Ctrl+C.
[local]Ericsson# terminal monitor
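As a compact reminder of the two procedures, the sketch below builds the command sequence for a given session type. It is an illustration only: the helper name is hypothetical, and the exit, commit and end steps are assumptions based on the “Remember to commit” note and the 'end' hint in the config banner. Check them against the technical documentation for your release.

```python
def debug_output_commands(session, contexts=("local",)):
    """Return the CLI lines needed to see debug output for a session type."""
    if session == "console":
        commands = ["config"]
        for ctx in contexts:
            # "logging console" is entered per context, as described above.
            # "exit" (assumed) returns from context mode to global config mode.
            commands += [f"context {ctx}", "logging console", "exit"]
        # Assumed: commit the change, then leave configuration mode.
        commands += ["commit", "end"]
        return commands
    if session in ("telnet", "ssh"):
        # Entered in exec mode; repeat in each context where output is needed.
        return ["terminal monitor"]
    raise ValueError(f"unknown session type: {session!r}")


if __name__ == "__main__":
    for line in debug_output_commands("console", ["local", "NiceService"]):
        print(line)
    print(debug_output_commands("ssh"))
```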
1.7 Administrator Privacy
As we have already seen in the previous examples, we can display the status of
system debugging using the command “show debugging”. This shows which debugs
are currently active in the context from which we run the command.
To turn off all debugs we use the command “no debug all”.
As we can verify, this turns off all debugging in the current context.
› Each administrator is treated within the SSR as a unique destination for debugging output
› Each administrator can start its own debugging functionality without influencing other administrators
› Enabling debugging is context specific and requires
[local]Train-1# debug [function]
› Disabling debugging is context specific and requires
[local]Train-1# no debug [function]
[local]Train-1# no debug all
– will disable all debug functions in one step
› Disconnecting the telnet / SSH session will be handled as implicit “no debug all” for the associated administrator
Figure 7-11: Administrator privacy
1.8 Debugging and Impact
› Debugging is started within a process
[local]Train-1# debug PPP
› Output is sent to the “logger” process
› Debugging will share the time slice with its own process
› Worst case: its own primary process will slow down and perhaps not respond anymore to the PM keepalive…
› Causing the BSD kernel to restart the process. But most important…
› NO impact on traffic card state table
• Keepalive timer configured within process
• Process Manager learns this value to check process status
[Diagram labels: Process Manager, KeepAlive, PPP, AAA, Logger, debug, “Please restart PPP process”, Restart PPP, BSD Kernel]
Figure 7-12: Debugging and “impact”
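The worst case in the figure can be illustrated with a toy model. Everything in the sketch below is hypothetical: the time values, the idea of a fixed budget per keepalive interval, and the threshold of three consecutive missed keepalives are invented for illustration and do not describe how the Process Manager is actually implemented.

```python
def first_restart(debug_load_ms, work_ms=60, slice_ms=100, max_missed=3):
    """Return the index of the interval in which a restart would be requested.

    Toy model: in every keepalive interval the process has slice_ms of CPU
    time. Normal work (work_ms) and debug output (debug_load_ms[i]) share it.
    If both do not fit, the keepalive goes unanswered. After max_missed
    consecutive misses the process is restarted. Returns None if that never
    happens.
    """
    consecutive = 0
    for index, debug_ms in enumerate(debug_load_ms):
        if work_ms + debug_ms > slice_ms:
            consecutive += 1
            if consecutive >= max_missed:
                return index
        else:
            consecutive = 0
    return None


if __name__ == "__main__":
    print(first_restart([0, 10, 20, 30]))    # light debugging
    print(first_restart([50, 50, 50, 50]))   # heavy debugging in every interval
```

The point of the model is the same as the point of the slide: a narrow debug function leaves the process enough time to answer its keepalives, while a very verbose one may not.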
1.9 Exercise 6: Debugging on SSR
› Please move to the exercises book.
Figure 7-13: Exercise 6: Debugging on SSR
2 Chapter Summary
After this course the participant should be able to:
› Describe the Use and Impact of Debugging on the SSR System
› Understand the SSR system debug structure
› Identify the SSR debug process
Figure 7-14: Chapter Summary
8 Troubleshooting for Traffic Flow through Ports, Circuits and Interfaces
Chapter Objectives
After this course the participant will be able to:
› Perform Troubleshooting for Traffic Flow through Ports, Circuits and Interfaces
› Explain the traffic flow in SSR System
› Identify the Connectivity Issue and Troubleshooting
Figure 8-1: Chapter Objectives
1 Troubleshooting Basic Checks
Figure 8-2: Troubleshooting Basic Checks
1.1 Interface & Port States
In the SSR, an interface and a port are two separate, distinct entities, and
interfaces need to be bound to physical ports in order to pass traffic.
For this reason, different states are defined for an interface and for a port.
An interface can be in one of three states: Unbound, Bound, and Up. A port can
be in one of three states: Unconfigured, Down, and Up.
› Interfaces and ports are different entities on SSR
› They have distinct states
[Diagram labels: Binding, local, Port 1/1, VLAN, ABC, Interface]
Figure 8-3: Interface & Port States
› Interfaces and ports are different entities on SSR
› They have distinct states
› Three states for an interface: Unbound, Bound, Up
› Three states for a port/circuit: Unconfigured, Down, Up
Figure 8-4: Interface & Port States
› Interfaces and ports are different entities on SSR
› They have distinct states
› Three states for an interface: Unbound, Bound, Up
› Three states for a port/circuit: Unconfigured, Down, Up
Configured port states:
Port state    Line    Admin
Down          Down    Down
Down          Down    Up
Up            Up      Up
Figure 8-5: Interface & Port States
Finally, let’s see the interaction between a port and an interface state:
A port state is never affected by the interface state of an interface bound to it.
However, an interface state may be affected by a port state. In fact, if an interface
is bound to a port, the state of the port will determine the state of the interface.
› Interfaces and ports are different entities on SSR, with distinct states
› Three states for an interface: Unbound, Bound, Up
› Three states for a port/circuit: Unconfigured, Down, Up
› Bound-interface state is determined by port state
Configured port states and bound interface states:
Port state    Line    Admin    Bound interface state
Down          Down    Down     Bound
Down          Down    Up       Bound
Up            Up      Up       Up
Figure 8-6: Interface & Port States
In particular, if the port to which the interface is bound is in the Down
state, the interface will be in the Bound state.
If the port to which the interface is bound is in the Up state, the interface
will be in the Up state as well.
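The dependency between the two state machines fits in a few lines. The sketch below restates the rule from this section for an interface bound to at most one port. The function name is hypothetical, and None is used here to mean that no binding exists.

```python
def interface_state(port_state):
    """Derive the interface state from the state of the port it is bound to.

    port_state: "Up", "Down", or None when the interface has no binding.
    """
    if port_state is None:
        return "Unbound"   # no port or circuit is bound to the interface
    if port_state == "Up":
        return "Up"        # bound, and the port is up
    if port_state == "Down":
        return "Bound"     # bound, but the port is down
    raise ValueError(f"unexpected port state: {port_state!r}")


if __name__ == "__main__":
    for state in (None, "Down", "Up"):
        print(state, "->", interface_state(state))
```

Note that the function takes the port state as input and never the other way around, which mirrors the statement that a port state is never affected by the interface state.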
1.2 Verifying Interface Status
› The first basic step is verifying the IP interfaces of your router
[local]Train-1# show ip interface brief
Wed Jul 4 22:33:08 2013
Name    Address       MTU     State      Bindings
e0      1.1.1.1/30    1500    Up         dot1q 5/1 vlan-id 10
e1      1.1.1.5/30    0       UnBound
e2      1.1.1.9/30    1500    Bound      ethernet 3/4
[local]Train-1#
› Two things to check are:
– State: shows if interface is operational
– Binding: shows which physical circuit this interface uses to forward traffic
Figure 8-7: Verifying interface status
show ip interface [if-name | all-context | brief [all-context] | rp]
Displays information about interfaces, including the interface bound to the
Ethernet management port on the controller card.
Use the show ip interface command to display information about all interfaces,
including those on the controller card. Use this command without optional syntax
to display detailed information on all configured interfaces.
An interface can be in any of the following states:
Unbound—The interface is not currently bound to any port or circuit.
Bound—The interface is bound to at least one port or circuit. However, none of
the bound circuits are up; therefore, the interface is not up.
Up—At least one of the bound circuits is in the up state; therefore, the interface is
also up and traffic can be sent over the interface.
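When many interfaces are configured, the State column can be checked with a short script on saved output. The sketch below assumes the brief output has one whitespace-separated row per interface (Name, Address, MTU, State, Bindings), as in Figure 8-7. The parser, the NEXT_STEP table and its wording are illustrative and hypothetical, not part of the product.

```python
SAMPLE = """\
[local]Train-1# show ip interface brief
Wed Jul 4 22:33:08 2013
Name    Address       MTU     State      Bindings
e0      1.1.1.1/30    1500    Up         dot1q 5/1 vlan-id 10
e1      1.1.1.5/30    0       UnBound
e2      1.1.1.9/30    1500    Bound      ethernet 3/4
"""

# Where to look next for each interface state (summary of this chapter).
NEXT_STEP = {
    "unbound": "configuration: bind the interface to a port or circuit",
    "bound": "port level: none of the bound circuits is up",
    "up": "nothing to do: traffic can be sent over the interface",
}


def parse_brief(text):
    """Return one dict per interface row; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 4)
        # Keep only rows whose fourth column is a known interface state.
        if len(fields) < 4 or fields[3].lower() not in NEXT_STEP:
            continue
        rows.append({
            "name": fields[0],
            "address": fields[1],
            "mtu": int(fields[2]),
            "state": fields[3],
            "binding": fields[4].strip() if len(fields) > 4 else "",
        })
    return rows


if __name__ == "__main__":
    for row in parse_brief(SAMPLE):
        print(row["name"], row["state"], "->", NEXT_STEP[row["state"].lower()])
```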
1.3 Identifying Interface Problems: Unbound State
Let’s start by troubleshooting an interface in the Unbound state. Interface
“e1” has been created within context “local” but it is not bound to any port,
so it is in the Unbound state.
[local]Train-1# show ip interface brief
Wed Jul 4 22:33:08 2013
Name    Address       MTU     State      Bindings
e0      1.1.1.1/30    1500    Up         dot1q 5/1 vlan-id 10
e1      1.1.1.5/30    0       UnBound
e2      1.1.1.9/30    1500    Bound      ethernet 3/4
[local]Train-1#
› Interface e1 is in “UnBound” state
– There is no physical circuit attached to the interface
[Diagram labels: Context local, Port eth 5/2, Interface e1, No binding present]
› This is a configuration error
› A binding has to be configured between some physical circuit (like a port or vlan) and interface e1
port ethernet 5/2
no shutdown
bind interface e1 local
Figure 8-8: Identifying interface problems: Unbound state (1-3)
As a further confirmation, we can also check the configuration of the ports on
the system to see if any circuit or port binds to e1, using the command “show
configuration port” and looking for interface e1.
As you can see, there are no bindings listed for interface e1.
Another way to reach the same conclusion is to use the command “show binding
bound”, which displays information for bound circuits, and look for interface
e1.
To fix this problem we need to bind interface “e1” to a physical port or
circuit. Then, based on the state of the port we bind the interface to, the
interface will move to either the Bound or the Up state.
In this case we bind interface e1 in context local directly to port 5/2 and not
to a particular circuit within that port.
Now, if we run “show ip interface brief”, we can see that the state of the
interface has changed from Unbound to Up. Furthermore, we can see which port
this interface is bound to.
Since the interface moved to the Up state, the port we bound the interface to
must be in the Up state as well. Let’s verify this by running the command
“show port 5/2”. As expected, the port is in the Up state.
[local]Train-1# show ip interface brief
Wed Jul 4 22:33:08 2007
Name    Address       MTU     State      Bindings
e0      1.1.1.1/30    1500    Up         dot1q 5/1 vlan-id 10
e1      1.1.1.5/30    0       UnBound
e2      1.1.1.9/30    1500    Bound      ethernet 3/4
[local]Train-1#
› Interface e2 is in “Bound” state
– This confirms configuration has been done properly
› The problem exists on L1/L2 level
› Investigation needs to be continued on port level
Figure 8-9: Identifying interface problems: Bound state (2-3)
Let’s now see how to troubleshoot an interface in ‘Bound’ State. First we bind
interface “e2” to a circuit on port 3/4.
By checking the Bindings field from the output of the command “show ip int
brief” we can confirm this interface is now bound to the right circuit on port 3/4.
We can also see that the interface is in Bound state.
This means that the port we bound the interface to is in Down state. Let’s verify
this by running the command “show port 3/4”
As expected the port is in Down state.
At this point we have to troubleshoot why port 3/4 is in Down state. As we have
seen earlier the state of a port, either Up or Down, is determined by a
combination of two other underlying states: Admin and Line states. These two
states are associated respectively to Layer1 and Layer2 problems.
[local]Train-1# show ip interface brief
Wed Jul 4 22:33:08 2013
Name    Address       MTU     State      Bindings
e0      1.1.1.1/30    1500    Up         dot1q 5/1 vlan-id 10
e1      1.1.1.5/30    0       UnBound
e2      1.1.1.9/30    1500    Bound      ethernet 3/4
[local]Train-1#
[local]Train-1# show port 3/4
Slot/Port:Ch:SubCh    Type        State
3/4                   ethernet    Down
[local]Train-1#
› Port 3/4 is in “Down” state
› There are 2 possible reasons for that:
– Physical layer is down
– Port has been administratively shut down
Figure 8-10: Identifying interface problems: Bound state (cont.) (3-3)
1.4 Port Status: Admin State and Line State
• Unconfigured: the port is not configured
• Down: the port is configured but in Down state
• Up: the port is configured and in Up state
To check the state of all ports in the system we can use the command “show port
all”.
In this example we show only three ports from the output, and as we can see
each of them is in a different state.
To unconfigure a port, type “no port ethernet” followed by the port number in
configuration mode.
To configure a port, run the command “port ethernet” followed by the port
number in configuration mode.
The second port is in the Down state. This state is determined by a combination
of two underlying states, Line and Admin. If the port is in the Down state, the
Line and Admin states are in one of these two combinations: Down-Down or
Down-Up.
The third port is configured and in the Up state. If the port is in the Up
state, both the Line and Admin states are Up.
› Most important is to realize the difference between Admin State and Line State
Admin State (configuration)    Line State (physical)    Result
Down                           Down                     Down
Down                           Up                       Down
Up                             Down                     Down
Up                             Up                       Up
[local]Train-1# show port 3/4 detail
ethernet 3/4 state is Down
Description                : 
Port circuit               : 3/4:511:63:31/1/0/30
Link state                 : Up
Last link state change     : Nov 12 07:41:14.117
Line state                 : Down
Admin state                : Up
Link Dampening             : disabled
Undampened line state      : Up
Dampening Count            : 0
Encapsulation              : ethernet
MTU size                   : 1500 Bytes
NAS-Port-Type              : none
NAS-Port-Id                : none
MAC address                : 00:30:88:19:71:84
Media type                 : 100Base-T
Flow control               : on
Speed                      : 100 Mbps
Duplex mode                : full
Loopback                   : off
Mini-RJ21 Connector        : Ports 1-12
Support Lossless-Large-MTU : Not Configurable
Active Alarms              : Link down
Figure 8-11: Port status: Admin state and Line State
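The Admin/Line table in the figure is a logical AND: the port is Up only when both states are Up. A minimal sketch of that table follows, with hypothetical function names. The two “reason” strings paraphrase the two causes listed for a Down port in Figure 8-10.

```python
def port_state(admin_up, line_up, configured=True):
    """Combine Admin state (configuration) and Line state (physical)."""
    if not configured:
        return "Unconfigured"
    return "Up" if (admin_up and line_up) else "Down"


def down_reasons(admin_up, line_up):
    """List why a configured port is Down; empty when it is Up."""
    reasons = []
    if not admin_up:
        reasons.append("port has been administratively shut down")
    if not line_up:
        reasons.append("physical layer is down")
    return reasons


if __name__ == "__main__":
    # Values from the "show port 3/4 detail" output: Admin state Up, Line state Down.
    print(port_state(admin_up=True, line_up=False))
    print(down_reasons(admin_up=True, line_up=False))
```

For port 3/4 in the example output, Admin state is Up and Line state is Down, so the result is Down and the investigation continues at the physical layer.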
1.5 Circuit Status
Now that we have seen port and interface states, let us talk about circuit state.
Usually Circuit States follow the state of their parent port. However some circuits
can be down even when parent port is up - for example a PPP circuit may be
brought down on failure to receive keep-alives whereas its parent port may be up.
It is also possible to administratively shut down a circuit while keeping the Port
up by using the ‘shutdown’ command under the PVC configuration.
For example individual Dot1q Circuits may be administratively shut down on a
port. This allows us to shut down one vlan without shutting down the whole port.
Using the Show Circuit command we can look at the dot1q Circuits and see their
individual states.
› Usually circuit state follows the state of its parent port
› Some circuit types have their own keepalive mechanism and can be brought down regardless of port state up
› It is also possible to administratively shut down most circuit types
[local]Train-1(config-atm-pvc)#?
-- cut --
shutdown    Shutdown the PVC
[local]Train-1(config-dot1q-pvc)#?
-- cut --
shutdown    Shutdown the PVC
Figure 8-12: Circuit status
2 Troubleshooting Traffic
Figure 8-13: Troubleshooting Traffic
2.1 Troubleshooting Traffic Problems
› Basic problems like port down or binding misconfiguration are quite easy to spot
› Much more often problems are more selective and cannot be solved using basic checks
› SSR offers a broad range of counters which are very helpful when troubleshooting
› Statistics are collected:
– For ports and
– For separate circuits within a port
– Traffic cards collect statistics for Layers 1, 2 and 3
Figure 8-14: Troubleshooting traffic problems (counters)
2.2 Port Counters – Overview
[local]Train-1# show port count 3/1
Port    Type
3/1     ethernet
packets sent          : 16            bytes sent       : 680
packets recvd         : 9             bytes recvd      : 40
send packet rate      : 0.13          send bit rate    : 5.33
recv packet rate      : 0.63          recv bit rate    : 3.95
rate refresh interval : 60 seconds
Left column: data displayed in packets. Right column: data displayed in bytes.
Send and receive rate is always refreshed every 60 seconds.
Counters refresh interval:
› Port counters are refreshed every few minutes (except rate)
› How to get more up to date counters?
[local]Train-1# show port count 3/1 ?
  ces       show ces information
  detail    show detailed information
  live      show live information
  queue     show per-queue information
  |         Output Modifiers
  <cr>
[local]Train-1#
Figure 8-15: Port counters – overview
2.3 Live Port Counters
For more up-to-date counters we can add the “live” keyword at the end of the
“show port counters” command. The “show port counters live” command forces
real-time collection of counters. The output looks the same but is more up to
date. However, the updates are displayed only when the command is executed.
It is also important to note that even if we use the “show port counters live”
command, the send and receive packet rates are still only calculated every 60
seconds.
› Keyword “live” forces real-time collection of counters from traffic card
– Except rate counters
› It does not change output
[local]Train-1# show port count 3/1 live
Port    Type
3/1     ethernet
packets sent          : 1616          bytes sent       : 68680
packets recvd         : 8909          bytes recvd      : 534540
send packet rate      : 0.13          send bit rate    : 45.33
recv packet rate      : 0.63          recv bit rate    : 303.95
rate refresh interval : 60 seconds
Send and receive rate is always refreshed every 60 seconds.
[local]Train-1#
Figure 8-16: Live port counters
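Because the rate fields are refreshed only every 60 seconds, a shorter-term rate can be estimated by running the live command twice and dividing the counter difference by the elapsed time. The sketch below shows the arithmetic only. The first snapshot uses the “sent” values from Figure 8-16, while the second snapshot and the 10-second gap are made-up example values, and counter wrap-around is ignored.

```python
def rates(first, second, seconds):
    """Return (packets per second, bits per second) between two snapshots.

    first, second: dicts with cumulative "packets" and "bytes" counters.
    seconds:       time elapsed between the two snapshots.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    packet_rate = (second["packets"] - first["packets"]) / seconds
    bit_rate = (second["bytes"] - first["bytes"]) * 8 / seconds
    return packet_rate, bit_rate


if __name__ == "__main__":
    snapshot_1 = {"packets": 1616, "bytes": 68680}   # from Figure 8-16
    snapshot_2 = {"packets": 1716, "bytes": 72930}   # hypothetical, 10 s later
    packet_rate, bit_rate = rates(snapshot_1, snapshot_2, seconds=10)
    print(f"{packet_rate:.2f} packets/s, {bit_rate:.2f} bit/s")
```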
2.4 Port Counters
[local]Train-1# show port count 3/1 detail
Counters for port ethernet 3/1 - Interval: 04:03:41
NPU Port Counters            (regular port counters)
-- cut --
NPU Input Error Counters     (PPA input and output errors)
-- cut --
NPU Output Error Counters    (PPA input and output errors)
-- cut --
Packet Drop Counters         (IP packet errors)
-- cut --
General Counters             (L2 port statistics)
-- cut --
Transmit Counters            (L2 port statistics)
-- cut --
Receive Counters             (L2 port statistics)
-- cut --
The NPU and Packet Drop counters are PPA counters, independent of layer 2 & 1.
The General, Transmit and Receive counters are L2/L1 counters, depending on traffic card type.
Figure 8-17: Port counters – details (1-4)
[local]Train-1# show port count 3/1 detail
Counters for port ethernet 3/1 - Interval: 04:03:41
NPU Port Counters
packets sent          : 0         bytes sent         : 0
packets recvd         : 0         bytes recvd        : 0
send packet rate      : 0.00      send bit rate      : 0.00
recv packet rate      : 0.00      recv bit rate      : 0.00
IP mcast pkts rcv     : 0         IP mcast bytes rcv : 0
IP mcast pkts sent    : 0         IP mcast bytes snt : 0
rate refresh interval : 60 seconds
NPU Input Error Counters
idc other errors      : 0         crc port errors    : 0
idc overrun errors    : 0         idc abort errors   : 0
no cct packets        : 0         no cct bytes       : 0
cct down pkts         : 0         cct down bytes     : 0
unknown encap pkts    : 0         unknown encap byte : 0
unreach pkts          : 0         unreach bytes      : 0
media filter pkts     : 0         media filter bytes : 0
NPU Output Error Counters
WRED drop pkts        : 0         tail drop pkts     : 0
adj drop pkts         : 0         adj drop bytes     : 0
Packet Drop Counters
not IPv4 drop pkts    : 0         bad IP checksum    : 0
unhandled IP optns    : 0         link layer bcast   : 0
bad IP length         : 0
---(more)---
Idc = input descriptor cache (between FPGA and PPA)
Cct = circuit
Encap = encapsulation
Unreach = unreachable (unknown destination)
mcast = multicast
Examples of packets that increase a given counter:
› Received VLAN tag not configured for port
› VLAN/PPPoE received on plain Ethernet port
› No route for packet’s destination address
› Destination MAC address doesn’t match system’s MAC
› QoS contract exceeded
› 2000B IP packet with DF flag set needs to be forwarded over FE port (MTU = 1500B)
Figure 8-18: Port counters – details (2-4)
General Counters
packets sent          : 0         packets recvd       : 0
bytes sent            : 0         bytes recvd         : 0
mcast pkts sent       : 0         mcast pkts recvd    : 0
bcast pkts sent       : 0         bcast pkts recvd    : 0
dropped pkts out      : 0         dropped pkts in     : 0
pending pkts out      : 0         pending pkts in     : 0
port drops out        : 0         port drops in       : 0
Transmit Counters
underflow             : 0         eth 64 octets       : 0
late collision        : 0         eth 65-127 octs     : 0
regular collision     : 0         eth 128-255 octs    : 0
single collision      : 0         eth 256-511 octs    : 0
multiple colls        : 0         eth 512-1023 octs   : 0
excessive colls       : 0         eth 1024-1518 octs  : 0
deferred              : 0         eth > 1518 octs     : 0
error pkts sent       : 0         flow control        : 0
error bytes sent      : 0
Receive Counters
jabber                : 0         eth 64 octets       : 0
false carrier         : 0         eth 65-127 octs     : 0
runt frames           : 0         eth 128-255 octs    : 0
undersized frames     : 0         eth 256-511 octs    : 0
oversized frames      : 0         eth 512-1023 octs   : 0
crc errors            : 0         eth 1024-1518 octs  : 0
alignment errors      : 0         eth > 1518 octs     : 0
symbol errors         : 0         flow control        : 0
error pkts rcvd       : 0         overflows           : 0
error bytes rcvd      : 0         overflow bytes      : 0
General Counters: L2 (Ethernet in this case) counters. These packets were dropped
before reaching IPPA.
Transmit and Receive Counters: Ethernet transmit and receive counters in detail.
Very useful for troubleshooting L2 problems.
Figure 8-19: Port counters (3-4)
2.5 Troubleshooting Circuits
› A whole port is rarely used for a single stream of traffic
› Usually ports are divided into sub-channels (ATM PVC, Ethernet VLAN, PPPoE sessions)
  – These channels are referred to as circuits in the SE architecture
› Port-level counters indicate overall port problems, but they do not help when you
need to troubleshoot one of multiple circuits carrying traffic over the same port
› The SSR provides similar counters for circuits as it does for ports
Figure 8-20: Troubleshooting circuits
2.6 Circuit Counters
Packet counter values can also be obtained for circuits. In this example we
look at dot1q Ethernet circuits.
Counter values can also be retrieved for a single, specific circuit.
› Circuit counters can be retrieved at various levels
[local]Train-1# show circuit counters ?
  agent-circuit-id     Search for circuit based on agent-circuit-id attribute
  agent-remote-id      Search for circuit based on agent-remote-id attribute
  circuit-group        Display info for circuit-group circuits
  clips                Display info for CLIPS circuits
  detail               Display detailed counters
  dot1q                Display info for dot1q circuits
  ether                Display info for ethernet circuits
  gre                  Display info for GRE tunnel circuits
  ipip                 Display info for IP-in-IP tunnel circuits
  ipsec                Display info for IPSec tunnel circuits
  ipv6-auto            Display info for IPv6-over-v4 auto tunnel circuits
  ipv6-man             Display info for IPv6-over-v4 manual tunnel circuits
  l2tp                 Display info for L2TP LNS circuits
  l2vpn-cross-connect  Display info for l2vpn cross connect circuits
  lg                   Link-group of circuit(s)
  live                 Display live counters
  mp                   Display subscriber MP pseudo circuit information
  mpls                 Display info for MPLS LSP circuits
  persistent           Persistent counters - values do not reflect clear operations
  port-pseudowire      Display info for port pseudowire circuits
  ppp                  Display info for ppp circuits
  pppoe                Display info for pppoe circuits
  queue                Display per-queue counters
--cut
Figure 8-21: Circuit counters
As for a port, we can use the keyword “detail” to get more detailed information
about circuit counters.
2.7 VLAN Circuit Statistics
[local]Train-1# show circuit counters 3/7 vlan-id 10 detail
please wait...
Circuit: 3/7 vlan-id 10, Internal id: 1/2/6, Encap: ether-dot1q
Packets                            Bytes
-------------------------------------------------------------------------------
Receive          :       2550      Receive          :     140022
Receive/Second   :       0.50      Receive/Second   :      27.00
Transmit         :         45      Transmit         :       5309
Xmits/Queue                        Xmits/Queue
  0              :         45        0              :       5309
  1              :          0        1              :          0
  2              :          0        2              :          0
  3              :          0        3              :          0
  4              :          0        4              :          0
  5              :          0        5              :          0
  6              :          0        6              :          0
  7              :          0        7              :          0
  8              :          0        8              :          0
Xmit Q Deleted   :          0      Xmit Q Deleted   :          0
Transmit/Second  :       0.03      Transmit/Second  :       1.54
IP Multicast Rcv :          0      IP Multicast Rcv :          0
IP Multicast Tx  :          0      IP Multicast Tx  :          0
Unknown Encaps   :          0      Unknown Encaps   :          0
Down Drops       :          0      Down Drops       :          0
Unreach Drops    :          0      Unreach Drops    :          0
Adj Drops        :          0      Adj Drops        :          0
WRED Drops Total :          0      WRED Drops Total :          0
WRED Drops/Queue                   WRED Drops/Queue
  0              :          0        0              :          0
  0              :          0        0              :          0
  1              :          0        1              :          0
  2              :          0        2              :          0
  6              :          0        6              :          0
  7              :          0        7              :          0
---more---
Xmits/Queue: transmitted packets per QoS defined queue.
Unknown Encaps through Adj Drops: the same meaning as for port.
WRED Drops/Queue: QoS WRED drops per queue.
Figure 8-22: VLAN circuit statistics (1-2)
Tail Drops Total :          0      Tail Drops Total :          0
Tail Drops/Queue                   Tail Drops/Queue
  0              :          0        0              :          0
  1              :          0        1              :          0
  2              :          0        2              :          0
  3              :          0        3              :          0
  4              :          0        4              :          0
  5              :          0        5              :          0
  6              :          0        6              :          0
  7              :          0        7              :          0
IP Counters
Soft GRE MPLS    :          0      Soft GRE MPLS    :          0
Not IPv4 drops   :          0      Not IPv4 drops   :          0
Unhandled IP Opt :          0
Bad IP Length    :          0
Bad IP Checksum  :          0
Not IPv6 drops   :          0      Not IPv6 drops   :          0
Broadcast Drops  :          0
MPLS Counters
MPLS Drops       :          0      MPLS Drops       :          0
ARP Counters
Drops            :          0      Drops            :          0
Unreachable      :          0      Unreachable      :          0
Rate Refresh Interval : 60 seconds
[local]Train-1#
Tail Drops/Queue: QoS tail drops per queue.
IP Counters: the same meaning as for port.
ARP Counters: ARP statistics.
Figure 8-23: VLAN circuit statistics (2-2)
2.8 Clearing Counters
› There are multiple ways of clearing counters
› The global command clears all counters across the chassis
[local]Train-1# clear port counters
› A more specific command clears the counters of a single port
[local]Train-1# clear port counters 3/1
› The most specific command clears the counters of a single circuit
[local]Train-1# clear circuit counters 3/7 vlan-id 20
[local]Train-1# clear circuit counters 3/7 vlan-id 20 pppoe 15
› You can also clear counters for all circuits of a certain type
[local]Train-1# clear circuit counters pppoe
[local]Train-1# clear circuit counters dot1q
Figure 8-24: Clearing counters
Port and circuit counters can be cleared at any time with the “clear port
counters” and “clear circuit counters” commands. This is useful at the start of
an event that we need to troubleshoot, because any counter that is non-zero
afterwards was incremented during that event.
These commands can also be applied to a specific port or circuit. For example,
to clear the counters only for VLAN 100 on port 3/1, run “clear circuit counters
3/1 vlan-id 100”. Likewise, to clear the counters only for port 3/7, run “clear
port counters 3/7”.
2.9 IP Troubleshooting Tool
› Probably the most commonly used tool in IP connectivity troubleshooting is ICMP ping
› It has multiple very useful options
[local]Train-1# ping 1.1.1.1 ?
  1..2147483647  Enter number of PING to transmit
  df             Set the Don't Fragment bit in the IP header
  flood          Flood ping
  maxs           Sweep max size
  mins           Sweep min size
  numeric        Numeric output only
  pattern        Specify a pattern to fill in ICMP packet
  preload        Ping sends that many packets as fast as possible
  quiet          Do not display ICMP error messages
  record         Includes the RECORD_ROUTE option in the ECHO_REQUEST packet
  silent         Display only summary lines at startup and finish
  size           Size of the ICMP datagram to send
  source         Source IP address
  timeout        Specify PING timeout
  tos            Specify type of service
  ttl            Time-to-live
  verbose        Verbose output
Figure 8-25: Ping - key IP troubleshooting tool
2.10 Traffic Troubleshooting Exercise: Introduction
› There are multiple interconnected contexts on your SSR, and all of them have
some connectivity problem
› Your job is to find the problems and fix them (if possible)
› While there are several ways to find the root cause, it is best to use your
knowledge of port/circuit counters
  – Use only show port / circuit / ip commands and ping
› Often you have access to only one side of the link; in that case counters are
your only option
Figure 8-26: Traffic troubleshooting exercise: Introduction
2.11 Traffic Troubleshooting Exercise: Preparation
› You need to load a new configuration on your SSR before we can start
› Please don’t look into the content of the file, and don’t check your SSR
configuration after it has been loaded
› Looking into the config will take all the fun out of the exercises
[local]Train-1# configure scp://student@10.1.1.3/troubleshooting_1.cfg
student@10.1.1.3's password:
troubleshooting_1.cf 100% |*****************************|  1896  00:00
Configuration complete
% Configuration file processing took: 2 seconds
[local]Train-1#
Figure 8-27: Traffic troubleshooting exercise: Preparation
2.12 Exercise 7: Traffic Troubleshooting
› Please move to the exercises book.
Figure 8-28: Exercise 7: Traffic troubleshooting
2.12.1 Context Topology for Traffic Troubleshooting Exercise
[x1]Train-1# ping 2.2.2.2 df size 1400 silent
[Figure: context topology. Contexts a1, b1 and d1 use port 3/7; contexts a2, b2
and d2 use port 3/8; contexts c1 and c2 use ports 3/9 and 3/10; context e1 uses
port 5/4 toward the Internet. On the x1 side, interface e0 has IP address
1.1.1.1/30. On the peer side, interface e0 has IP address 1.1.1.2/30 and
interface lo has IP address 2.2.2.2/32.]
Figure 8-29: Context topology for traffic troubleshooting exercise
2.12.2 Traffic Troubleshooting Exercise Review
› Contexts a1 & a2
[a1]Train-1# show port count 3/7 detail
NPU Port Counters
no cct packets        : 0
cct down pkts         : 0
unknown encap pkts    : 3
Unknown encapsulation received from context a2
[a1]Train-1# show port count 3/8 detail
NPU Input Error Counters
idc other errors      : 0
idc overrun errors    : 0
no cct packets        : 3
No circuit for packets received from context a1
[a1]Train-1# show ip interface brief
Mon Dec 8 22:22:07 2008
Name   Address      MTU    State   Bindings
e0     1.1.1.1/30   1500   Up      dot1q 3/7 vlan-id 10
[a2]Train-1# show ip interface brief
Mon Dec 8 22:22:16 2008
Name   Address      MTU    State   Bindings
e0     1.1.1.2/30   1500   Up      ethernet 3/8
Encapsulation mismatch
Figure 8-30: Traffic troubleshooting exercises review (1-7)
› Contexts b1 & b2
[b1]Train-1# show port count 3/8 detail
NPU Input Error Counters
idc other errors      : 0
unknown encap pkts    : 0
unreach pkts          : 5
media filter pkts     : 0
Context b2 doesn’t know where to send packets from b1
[b2]Train-1# show ip route
Type   Network      Next Hop   Dist   Metric   UpTime     Interface
> C    1.1.1.0/30              0      0        00:42:29   e0
[b2]Train-1#
No route for 2.2.2.2
Figure 8-31: Traffic troubleshooting exercises review (2-7)
› Contexts c1 & c2
[c1]Train-1# show port counters 3/9 detail
NPU Output Error Counters
WRED drop pkts        : 0
adj drop pkts         : 5
Packets violate outgoing circuit characteristics
[c1]Train-1# show port 3/9 detail
ethernet 3/2 state is Up
Description            :
Port circuit           : 3/2:511:63:31/1/0/9
Link state             : Up
Last link state change : Apr 22 01:13:38.381
Line state             : Up
Admin state            : Up
Link Dampening         : disabled
Undampened line state  : Up
Dampening Count        : 0
Encapsulation          : dot1q
MTU size               : 1000 Bytes
NAS-Port-Type          : none
NAS-Port-Id            : none
MAC address            : 00:02:3b:04:65:67
Media type             : 1000Base-T
--- cut ---
Port 3/9 can transmit packets with maximum size of 1000B
Figure 8-32: Traffic troubleshooting exercises review (optional) (3-7)
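A size-related drop like the one in contexts c1 and c2 can be narrowed down by pinging with the df option at different sizes. As a sketch of how such a search can be organized, the Python below bisects for the largest size that still passes. The probe callable is a hypothetical stand-in for running a command such as "ping 2.2.2.2 df size <n>" and checking for a reply; here it is simulated with an arbitrary threshold. The search assumes that every size below a passing size also passes.

```python
from typing import Callable, Optional


def largest_passing_size(probe: Callable[[int], bool],
                         low: int, high: int) -> Optional[int]:
    """Largest size in [low, high] for which probe(size) is True, else None."""
    if not probe(low):
        return None  # even the smallest size fails: not a size problem
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            low = mid       # mid passes: answer is mid or larger
        else:
            high = mid - 1  # mid fails: answer is smaller than mid
    return low


# Simulated link: the threshold is an arbitrary illustration value,
# not a figure taken from the SSR.
SIMULATED_LIMIT = 972


def fake_probe(size: int) -> bool:
    return size <= SIMULATED_LIMIT


result = largest_passing_size(fake_probe, 36, 1500)
print(result)
```

Bisection needs about eleven probes for this range instead of one probe per size, which matters when each probe is a ping with a one-second timeout.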
› Contexts d1 & d2
[d1]Train-1# show port count 3/8 detail
NPU Input Error Counters
unreach pkts          : 0
media filter pkts     : 5
PPA received packets with wrong (not its own) destination MAC address
[d1]Train-1# show arp-cache
Host      Hardware address    Ttl    Type   Circuit
1.1.1.1   00:30:88:00:61:55   -      ARPA   3/7 vlan-id 30
1.1.1.2   00:55:88:00:33:77          ARPA   3/7 vlan-id 30
[d1]Train-1#
d1 has wrong MAC address for 1.1.1.2
[d2]Train-1# show arp-cache
Host      Hardware address    Ttl    Type   Circuit
1.1.1.1   00:30:88:00:61:55   3042   ARPA   3/8 vlan-id 30
1.1.1.2   00:30:88:00:33:78   -      ARPA   3/8 vlan-id 30
[d2]Train-1#
Figure 8-33: Traffic troubleshooting exercises review (optional) (4-7)
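The review cases for contexts a to d each tie one non-zero error counter to one kind of fault. That pairing can be kept as a small lookup table, sketched below in Python. The counter names are those printed by the detail output in this chapter; the hint texts paraphrase the review annotations and are starting points for investigation, not definitive diagnoses. The names HINTS and triage are hypothetical.

```python
# Counter name -> hint, paraphrased from the exercise review in this chapter.
HINTS = {
    "unknown encap pkts": "encapsulation mismatch between the two ends",
    "no cct packets": "no circuit configured for the received packets",
    "unreach pkts": "no route for the packet's destination address",
    "adj drop pkts": "packets violate outgoing circuit characteristics (check MTU)",
    "media filter pkts": "destination MAC is not the system's own (check ARP cache)",
}


def triage(counters: dict) -> list:
    """Return (counter, value, hint) for every known counter that is non-zero."""
    return [(name, counters[name], hint)
            for name, hint in HINTS.items()
            if counters.get(name, 0) > 0]


# Values as seen in the b1/b2 review case.
observed = {"idc other errors": 0, "unknown encap pkts": 0,
            "unreach pkts": 5, "media filter pkts": 0}

for name, value, hint in triage(observed):
    print(f"{name} = {value}: {hint}")
```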
› Context e1
[e1]Train-1# ping 2.2.2.2
PING 2.2.2.2 (2.2.2.2): source 1.1.1.1, 36 data bytes,
timeout is 1 second
.....
----2.2.2.2 PING Statistics----
5 packets transmitted, 0 packets received, 100.0% packet loss
[e1]Train-1# show port counters 5/4 detail
Counters for port ethernet 5/4 - Interval: 13:01:48
NPU Port Counters
packets sent          : 3         bytes sent         : 138
packets recvd         : 2         bytes recvd        : 120
send packet rate      : 0.00      send bit rate      : 0.00
recv packet rate      : 0.00      recv bit rate      : 0.00
IP mcast pkts rcv     : 0         IP mcast bytes rcv : 0
IP mcast pkts sent    : 0         IP mcast bytes snt : 0
rate refresh interval : 60 seconds
NPU Input Error Counters
idc other errors      : 0         crc port errors    : 0
idc overrun errors    : 0         idc abort errors   : 0
no cct packets        : 0         no cct bytes       : 0
--cut
Counters look good, no errors observed
Figure 8-34: Traffic troubleshooting exercises review (optional) (5-7)
› Context e1 continued
[e1]Train-1# show ip route
Type   Network      Next Hop   Dist   Metric   UpTime
> S    0.0.0.0/0    1.1.1.2    1      0        00:43:01
> C    1.1.1.0/30              0      0        00:43:02
[e1]Train-1#
[e1]Train-1# ping 1.1.1.2
PING 1.1.1.2 (1.1.1.2): source 1.1.1.1, 36 data bytes,
timeout is 1 second
.....
----1.1.1.2 PING Statistics----
5 packets transmitted, 0 packets received, 100.0% packet loss
[e1]Train-1#
[e1]Train-1# show arp-cache
Total number of arp entries in cache: 2
Resolved entry   : 1
Incomplete entry : 1
Host      Hardware address    Ttl   Type   Interface   Circuit
1.1.1.1   00:30:88:23:26:c2         ARPA   e0          5/4 vlan-id 10
1.1.1.2   incomplete          10    ARPA   e0          5/4 vlan-id 10
[e1]Train-1#
ARP failed, we need to debug it
Figure 8-35: Traffic troubleshooting exercises review (optional) (6-7)
› Context e1 continued
[e1]Train-1# term mon
[e1]Train-1# debug arp pktio
[e1]Train-1# ping 1.1.1.2 1
Dec 9 00:08:59: [0015]: %ARP-7-PKTIO: Build ether pkt with dot1q encap for 1.1.1.2, vlan id 10, eh 0x41b03024
Dec 9 00:08:59: [0015]: %ARP-7-PKTIO: Dump outgoing packet, pkt length 46
 0   ff ff ff ff ff ff 00 30 88 23 26 c2 81 00 00 0a
16   08 06 00 01 08 00 06 04 00 01 00 30 88 23 26 c2
32   01 01 01 01 ff ff ff ff ff ff 01 01 01 02
Dec 9 00:08:59: [0015]: %ARP-7-PKTIO: Received ARP pkt: context_id 0x4008000f, cct_handle 5/4:1023:63/1/2/22
Dec 9 00:08:59: [0015]: %ARP-7-PKTIO: Dump incoming ARP packet, pkt length 60
 0   ff ff ff ff ff ff 00 30 88 23 26 c2 81 00 00 0a
16   08 06 00 01 08 00 06 04 00 01 00 30 88 23 26 c2
32   01 01 01 01 ff ff ff ff ff ff 01 01 01 02 00 00
48   00 00 00 00 00 00 00 00 00 00 00 00
The sent and the received ARP request carry the same source MAC address.
The SE received its own ARP request: there is a loop at the physical level.
Figure 8-36: Traffic troubleshooting exercises review (optional) (7-7)
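The hex dumps in the debug output can be decoded field by field to confirm what the annotations state. The Python sketch below parses the outgoing dump as an Ethernet frame with one 802.1Q tag carrying an IPv4 ARP payload, following the standard field layout, and then checks that the incoming dump is the same frame plus zero padding. The function and variable names are illustrative; the byte values are copied from the dumps above.

```python
import struct


def mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def ip(raw: bytes) -> str:
    return ".".join(str(b) for b in raw)


def parse_dot1q_arp(frame: bytes) -> dict:
    """Decode an Ethernet frame with one 802.1Q tag and an IPv4 ARP payload."""
    # 6 bytes dst MAC, 6 bytes src MAC, TPID, tag control info, EtherType.
    dst, src, tpid, tci, ethertype = struct.unpack("!6s6sHHH", frame[:18])
    if tpid != 0x8100 or ethertype != 0x0806:
        raise ValueError("not a dot1q-tagged ARP frame")
    # ARP: htype, ptype, hlen, plen, opcode, then sender/target MAC and IP.
    _ht, _pt, _hl, _pl, oper, sha, spa, tha, tpa = struct.unpack(
        "!HHBBH6s4s6s4s", frame[18:46])
    return {
        "dst_mac": mac(dst), "src_mac": mac(src),
        "vlan": tci & 0x0FFF,          # low 12 bits of the tag are the VLAN id
        "opcode": oper,                # 1 = request, 2 = reply
        "sender_mac": mac(sha), "sender_ip": ip(spa),
        "target_mac": mac(tha), "target_ip": ip(tpa),
    }


# bytes.fromhex ignores ASCII whitespace (Python 3.7 and later).
OUTGOING = bytes.fromhex("""
    ff ff ff ff ff ff 00 30 88 23 26 c2 81 00 00 0a
    08 06 00 01 08 00 06 04 00 01 00 30 88 23 26 c2
    01 01 01 01 ff ff ff ff ff ff 01 01 01 02
""")
INCOMING = bytes.fromhex("""
    ff ff ff ff ff ff 00 30 88 23 26 c2 81 00 00 0a
    08 06 00 01 08 00 06 04 00 01 00 30 88 23 26 c2
    01 01 01 01 ff ff ff ff ff ff 01 01 01 02 00 00
    00 00 00 00 00 00 00 00 00 00 00 00
""")

fields = parse_dot1q_arp(OUTGOING)
looped_back = INCOMING[:len(OUTGOING)] == OUTGOING
print(fields)
print("incoming frame is a copy of the outgoing one:", looped_back)
```

A reply from a real neighbour would carry a different source MAC address and ARP opcode 2, so an incoming request that matches the outgoing one byte for byte points to the frame being looped back.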
3 Chapter Summary
After this course the participant should be able to:
› Perform troubleshooting for traffic flow through ports, circuits and interfaces
› Explain the traffic flow in the SSR system
› Identify and troubleshoot connectivity issues
Figure 8-37: Chapter Summary
9 Acronyms and Abbreviations
AAA    Authentication, Authorization, and Accounting
ALSW   Alarm Switch boards
AS     Autonomous system
ATM    Asynchronous Transfer Mode
BGP    Border Gateway Protocol
BNG    Broadband Network Gateway
CDN    Content Delivery Network
CLI    Command Line Interface
CPU    Central Processing Unit
CSM    Card Slot Module/Connection State Manager
DES    Data Encryption Standard
DRAM   Dynamic Random Access Memory
FABL   Forwarding Abstraction Layer
FIB    Forwarding Information Base
FTP    File Transfer Protocol
GB     GigaByte
GE     Gigabit Ethernet
GREP   Global Regular Expression Parser
HW     Hardware
IP     Internet Protocol
IPOS   IP Operating System
IPOS   Internet Protocol Operating System
IPsec  Internet Protocol Security
IS     In Service
ISIS   Intermediate System to Intermediate System
ISM    Interface and Circuit State Manager
LED    Light Emitting Diode
MB     MegaByte
MSE    Multi-Service Edge
NAT    Network Address Translation
NPU    Network Processing Unit
OAM    Operations Administration and Maintenance
OOS    Out of Service
OS     Operating System
OSD    Out of Service Diagnostics
OSPF   Open Shortest Path First
PEM    Power Entry Module
PFE    Packet Forwarding Engine
PIM    Protocol Independent Multicast
PM     Process Manager
POD    Power On Diagnostics
POF    Points of Failure
POST   Power-On Self Test
PVC    Permanent Virtual Circuit
RAN    Remote Access Network
RCM    Router Config Module
RCP    Remote Copy Protocol
RDB    Reliable dataBase
RIB    Routing information dataBase
ROM    Read Only Memory
RP     Route Processor
RPSW   Route Processor Switch
RSVP   Resource Reservation Protocol
SCP    Secure Copy Protocol
SFTP   Secured File Transfer Protocol
SSC    Smart Services Card
SSH    Secure Shell
SSR    Smart Service Router
SW     Switch
TCP    Transmission Control Protocol
TFTP   Trivial File Transfer Protocol
USB    Universal Serial Bus
VPN    Virtual Private Network
VTY    Virtual Terminal
10 Index
Alarm Switch boards, 5, 60, 66, 81, 181
Asynchronous Transfer Mode, 15, 181
Authentication, Authorization, and
Accounting, 14, 181
Autonomous system, 181
Border Gateway Protocol, 14, 181
Broadband Network Gateway, 181
Card Slot Module/Connection State Manager,
8, 75, 79, 80, 82, 129, 181
Central Processing Unit, 5, 78, 80, 144, 181
Command Line Interface, 3, 11, 14, 18, 20,
21, 22, 23, 25, 26, 27, 40, 41, 75, 76, 132,
181
Content Delivery Network, 181
Data Encryption Standard, 181
Dynamic Random Access Memory, 181
File Transfer Protocol, 181
Forwarding Abstraction Layer, 181
Forwarding Information Base, 181
Gigabit Ethernet, 181
GigaByte, 69, 70, 181
Global Regular Expression Parser, 4, 40, 42,
43, 46, 181
Hardware, 182
In Service, 14, 182
Interface and Circuit State Manager, 5, 8, 75,
76, 79, 80, 82, 83, 129, 182
Intermediate System - Intermediate System,
79, 182
Internet Protocol, 9, 14, 15, 16, 17, 30, 33,
34, 35, 75, 139, 173, 182
Internet Protocol Operating System, 59, 145,
182
Internet Protocol Security, 182
IP Operating System, 59, 145, 182
Light Emitting Diode, 5, 66, 182
MegaByte, 182
Multi-Service Edge, 182
Network Address Translation, 182
Network Processing Unit, 182
Open Shortest Path First, 6, 14, 74, 84, 87,
146, 147, 182
Operating System, 16, 75, 138, 139, 182
Operations Administration and Maintenance,
182
Out of Service, 182
Out of Service Diagnostics, 182
Packet Forwarding Engine, 182
Permanent Virtual Circuit, 15, 33, 166, 168,
183
Points of Failure, 182
Power Entry Module, 182
Power On Diagnostics, 182
Power-On Self Test, 182
Process Manager, 7, 74, 76, 106, 128, 182
Protocol Independent Multicast, 182
Read Only Memory, 7, 110, 111, 183
Reliable dataBase, 76, 79, 80, 82, 183
Remote Access Network, 183
Remote Copy Protocol, 183
Resource Reservation Protocol, 183
Route Processor, 6, 59, 60, 90, 91, 96, 97,
100, 101, 102, 103, 104, 105, 106, 107,
119, 120, 121, 122, 132, 134, 183
Route Processor Switch, 5, 58, 59, 69, 75,
77, 79, 119, 128, 183
Router Config Module, 76, 79, 80, 82, 128,
183
Routing information dataBase, 183
Secure Copy Protocol, 183
Secure Shell, 18, 30, 139, 152, 183
Secured File Transfer Protocol, 183
Smart Service Router, 4, 6, 7, 8, 18, 38, 39,
53, 55, 62, 64, 65, 66, 69, 74, 79, 95, 96,
117, 118, 119, 121, 124, 138, 139, 143,
144, 145, 146, 150, 154, 159, 183
Smart Services Card, 183
Switch, 183
Transmission Control Protocol, 183
Trivial File Transfer Protocol, 183
Universal Serial Bus, 70, 71, 183
Virtual Terminal, 183
Virtual Private Network, 14, 183
11 Table of Figures
Figure 1-1: Chapter Objectives ..................................................................................................... 11
Figure 1-2: Review Fundamental Concepts .................................................................................. 12
Figure 1-3: Context, Interfaces, & Bindings Architecture ............................................................... 13
Figure 1-4: Terminology ................................................................................................................ 16
Figure 1-5: Command Line Interface (CLI) Structure ..................................................................... 18
Figure 1-6: Introduction ................................................................................................................. 19
Figure 1-7: Factory Default System: Step one............................................................................... 19
Figure 1-8: Maneuvering through the CLI ...................................................................................... 20
Figure 1-9: If you are configuring… ............................................................................................... 22
Figure 1-10: Monitoring with CLI ................................................................................................... 23
Figure 1-11: CLI Introduction and the prompt structure ................................................................. 24
Figure 1-12: Context monitoring .................................................................................................... 25
Figure 1-13: CLI Help .................................................................................................................... 25
Figure 1-14: CLI for the fast people ............................................................................................... 26
Figure 1-15: Lab environment ....................................................................................................... 27
Figure 1-16: Connecting to Ericsson Training labs ........................................................................ 28
Figure 1-17: Configure Management Interface .............................................................................. 29
Figure 1-18: Reference for this module ......................................................................................... 29
Figure 1-19: Configure Management interface .............................................................................. 30
Figure 1-20: Validating the configuration ....................................................................................... 33
Figure 1-21: Binding information ................................................................................................... 34
Figure 1-22: Exercise 1: Management configuration ..................................................................... 35
Figure 1-23: Troubleshooting Preparation Commands & Tools ..................................................... 35
Figure 1-24: Troubleshooting Preparation ..................................................................................... 36
Figure 1-25: Remote terminal session timeout .............................................................................. 36
Figure 1-26: Who is logged into the SSR? .................................................................................... 36
Figure 1-27: What did you type before? ........................................................................................ 37
Figure 1-28: Troubleshooting by searching and limiting the output ................................................ 38
Figure 1-29: Command Line Interface & Emacs ............................................................................ 39
Figure 1-30: Command Line Interface & Emacs ............................................................................ 40
Figure 1-31: GREP, Global Regular Expression Parser ................................................................ 40
Figure 1-32: Extended GREP........................................................................................................ 41
Figure 1-33: Other searching tools ................................................................................................ 41
Figure 1-34: Regular expressions ................................................................................................. 42
Figure 1-35: Regular expressions, examples with GREP .............................................................. 44
Figure 1-36: Regular expressions , examples with GREP ............................................................. 44
Figure 1-37: Aliases and Macros ................................................................................................... 45
Figure 1-38: Introduction to Alias .................................................................................................. 45
Figure 1-39: Introduction to macro ................................................................................................ 46
Figure 1-40: Variables in Macros .................................................................................................. 46
Figure 1-41: Exercise 2: Introduction, Searching and Filtering ...................................................... 47
Figure 1-42: Exercise 2: Searching and Filtering ........................................................................... 47
Figure 1-43: Exercise 2, review (1-4) ............................................................................................ 47
Figure 1-44: Exercise 2, review (2-4) ............................................................................................ 48
Figure 1-45: Exercise 2, review (3-4) ............................................................................................ 48
Figure 1-46: Exercise 2, review (4-4) (optional) ............................................................................. 49
Figure 1-47: Chapter Summary ..................................................................................................... 50
Figure 2-1: Chapter Objectives ..................................................................................................... 51
Figure 2-2: Recommended Troubleshooting Procedure ................................................................ 52
Figure 2-3: System Hardware Health ............................................................................................ 53
Figure 2-4: Overview: Hardware Status ........................................................................................ 54
Figure 2-5: More detailed hardware info........................................................................................ 55
Figure 2-6: Retrieving hardware details Line cards ....................................................................... 56
Figure 2-7: RPSW hardware information ....................................................................................... 57
Figure 2-8: ALSW hardware information ....................................................................................... 58
Figure 2-9: Finding hardware alarms (1-2) ................................................................................... 59
Figure 2-10: Finding hardware alarms (2-2) ................................................................................. 59
Figure 2-11: System Hardware Checks......................................................................................... 60
Figure 2-12: System alarms .......................................................................................................... 61
Figure 2-13: System Alarm with Options, Examples...................................................................... 61
Figure 2-14: Example: Initiating Major System Alarm .................................................................... 62
Figure 2-15: Example: Initiating Critical System Alarm .................................................................. 63
Figure 2-16: System Hardware LED.............................................................................................. 64
Figure 2-17: Card Powered Down ................................................................................................. 65
Figure 2-18: System storage Verification ...................................................................................... 66
Figure 2-19: System storage ......................................................................................................... 67
Figure 2-20: System storage verification ....................................................................................... 68
Figure 2-21: System storage verification: Example ....................................................................... 68
Figure 2-22: Chapter Summary ..................................................................................................... 70
Figure 3-1: Chapter Objectives ..................................................................................................... 71
Figure 3-2: Process Architecture ................................................................................................... 72
Figure 3-3: RPSW Processes (1-3) ............................................................................................... 73
Figure 3-4: RPSW Processes (2-3) ............................................................................................... 75
Figure 3-5: RPSW Processes (process communication) (3-3)....................................................... 75
Figure 3-6: Process Scheduling .................................................................................................... 76
Figure 3-7: RPSW processes verification ...................................................................................... 77
Figure 3-8: Finding CPU intensive processes ............................................................................... 78
Figure 3-9: Single process verification .......................................................................................... 79
Figure 3-10: Single process in detail ............................................................................................. 80
Figure 3-11: Single Process Verification – ISM ............................................................................. 81
Figure 3-12: Single Process Verification – OSPF .......................................................................... 82
Figure 3-13: Maximum Crashes Allowed ....................................................................................... 83
Figure 3-14: Process crash (1-2)................................................................................................... 84
Figure 3-15: What happens when a process crashes? .................................................................. 84
Figure 3-16: Software Process Failure Scenario ........................................................................... 85
Figure 3-17: Process crash (2-2)................................................................................................... 86
Figure 3-18: System Stopped Processes ...................................................................................... 86
Figure 3-19: Did a process crash? (1-2) ........................................................................................ 87
Figure 3-20: Did a process crash? (2-2) ........................................................................................ 87
Figure 3-21: Old core files on RP – BAD IDEA.............................................................................. 88
Figure 3-22: Core files are copied between RPs ........................................................................... 88
Figure 3-23: Core dump files on standby RP................................................................................. 89
Figure 3-24: Exercise 3: Introduction............................................................................................. 90
Figure 3-25: Exercise 3: System Processes .................................................................................. 90
Figure 3-26: Exercise 3, review (1-2) ............................................................................................ 90
Figure 3-27: Exercise 3, review (2-2) (Optional parts) ................................................................... 91
Figure 3-28: Chapter Summary ..................................................................................................... 92
Figure 4-1: Chapter Objectives ..................................................................................................... 93
Figure 4-2: RP redundancy ........................................................................................................... 94
Figure 4-3: RP redundancy details ................................................................................................ 95
Figure 4-4: Investigating redundancy issues ................................................................................. 96
Figure 4-5: show system redundancy (1-3) ................................................................................... 96
Figure 4-6: show system redundancy (2-3) ................................................................................... 97
Figure 4-7: show system redundancy (3-3) ................................................................................... 97
Figure 4-8: Analyzing Problems of Standby RP ............................................................................ 98
Figure 4-9: Which RP should you check, Active or Standby? ....................................................... 99
Figure 4-10: Connecting to standby RP without console ............................................................... 99
Figure 4-11: Searching for restart reason .................................................................................... 100
Figure 4-12: Repeating commands on standby RP ..................................................................... 100
Figure 4-13: Repeating commands on standby RP ..................................................................... 100
Figure 4-14: Verify processes on standby RP ............................................................................. 101
Figure 4-15: Copy files from standby RP ..................................................................................... 101
Figure 4-16: Copy files from standby RP ..................................................................................... 102
Figure 4-17: RP Failover Management ....................................................................................... 103
Figure 4-18: Managing Reloads and RP Switch-over .................................................................. 104
Figure 4-19: Manual RP Switchover (1-2) ................................................................................... 104
Figure 4-20: Manual RP switchover (2-2) .................................................................................... 105
Figure 4-21: Chapter Summary ................................................................................................... 106
Figure 5-1: Chapter Objectives ................................................................................................... 107
Figure 5-2: Boot Problems .......................................................................................................... 108
Figure 5-3: Entering Boot ROM Interface .................................................................................... 108
Figure 5-4: Example: Entering Boot ROM Interface .................................................................... 109
Figure 5-5: Diagnostics Command .............................................................................................. 109
Figure 5-6: Running Diagnostics ................................................................................................. 110
Figure 5-7: Troubleshooting Scenarios ....................................................................................... 111
Figure 5-8: Resume Boot ............................................................................................................ 111
Figure 5-9: Troubleshooting Scenarios ....................................................................................... 112
Figure 5-10: System uptime ........................................................................................................ 112
Figure 5-11: Check for human errors .......................................................................................... 113
Figure 5-12: System storage verification ..................................................................................... 113
Figure 5-13: Exercise 4: Investigate Boot Problems .................................................................... 113
Figure 5-14: Chapter Summary ................................................................................................... 114
Figure 6-1: Chapter Objective ..................................................................................................... 115
Figure 6-2: System logging introduction ...................................................................................... 116
Figure 6-3: Loggd Process .......................................................................................................... 117
Figure 6-4: System log commands.............................................................................................. 118
Figure 6-5: Event Severity Levels in Log Messages .................................................................... 119
Figure 6-6: Logs from cards ........................................................................................................ 120
Figure 6-7: Show log and time .................................................................................................... 120
Figure 6-8: Show log and time .................................................................................................... 121
Figure 6-9: Log Files ................................................................................................................... 122
Figure 6-10: Custom Log files and filters..................................................................................... 123
Figure 6-11: Log Files location .................................................................................................... 124
Figure 6-12: Display Log Files..................................................................................................... 124
Figure 6-13: Filter Based on Facility ............................................................................................ 125
Figure 6-14: Filter Based on Facility example ............................................................................. 125
Figure 6-15: Pm Process Logs .................................................................................................... 126
Figure 6-16: CSM Process Logs ................................................................................................. 127
Figure 6-17: ISM Process ........................................................................................................... 127
Figure 6-18: Filter based on facility on card................................................................................. 128
Figure 6-19: Logger verification................................................................................................... 129
Figure 6-20: Show Logging Card information .............................................................................. 130
Figure 6-21: Logging display info ................................................................................................ 130
Figure 6-22: Logging debug ........................................................................................................ 132
Figure 6-23: Logging debug (global config logging)..................................................................... 133
Figure 6-24: Logging debug ........................................................................................................ 134
Figure 6-25: Log File Collection .................................................................................................. 134
Figure 6-26: Syslog Configuration ............................................................................................... 135
Figure 6-27: Syslog server .......................................................................................................... 136
Figure 6-28: Reference for Syslog lab ......................................................................................... 136
Figure 6-29: Exercise 5: Logging & Syslog ................................................................................. 136
Figure 6-30: Exercise review: Configure Syslog & Debug ........................................................... 137
Figure 6-31: Exercise review: Syslog server environment ........................................................... 138
Figure 6-32: Exercise review: Save and display the logs............................................................. 138
Figure 6-33: Chapter Summary ................................................................................................... 139
Figure 7-1: Chapter Objectives ................................................................................................... 141
Figure 7-2: Debug introduction .................................................................................................... 142
Figure 7-3: The challenge ........................................................................................................... 143
Figure 7-4: Debug coverage (what) ............................................................................................. 144
Figure 7-5: How to recognize a debug function is context specific? Context ID ........................... 145
Figure 7-6: Debug coverage (where)........................................................................................... 146
Figure 7-7: Debugging within context local .................................................................................. 146
Figure 7-8: Debugging in different contexts ................................................................................. 147
Figure 7-9: Debug relationship with contexts............................................................................... 148
Figure 7-10: Send debug output to screen .................................................................................. 149
Figure 7-11: Administrator privacy .............................................................................................. 151
Figure 7-12: Debugging and “impact” .......................................................................................... 152
Figure 7-13: Exercise 6: Debugging on SSR ............................................................................... 152
Figure 7-14: Chapter Summary ................................................................................................... 153
Figure 8-1: Chapter Objectives ................................................................................................... 155
Figure 8-2: Troubleshooting Basic Checks .................................................................................. 156
Figure 8-3: Interface & Port States .............................................................................................. 157
Figure 8-4: Interface & Port States .............................................................................................. 158
Figure 8-5: Interface & Port States .............................................................................................. 158
Figure 8-6: Interface & Port States .............................................................................................. 159
Figure 8-7: Verifying interface status ........................................................................................... 159
Figure 8-8: Identifying interface problems: Unbound state (1-3) .................................................. 160
Figure 8-9: Identifying interface problems: Bound state (2-3) ...................................................... 161
Figure 8-10: Identifying interface problems: Bound state (cont.) (3-3) ......................................... 162
Figure 8-11: Port status: Admin state and Line State .................................................................. 163
Figure 8-12: Circuit status ........................................................................................................... 164
Figure 8-13: Troubleshooting Traffic ........................................................................................... 165
Figure 8-14: Troubleshooting traffic problems (counters) ............................................................ 165
Figure 8-15: Port counters – overview......................................................................................... 166
Figure 8-16: Live port counters ................................................................................................... 167
Figure 8-17: Port counters – details (1-4) .................................................................................... 167
Figure 8-18: Port counters – details (2-4) .................................................................................... 168
Figure 8-19: Port counters (3-4) .................................................................................................. 168
Figure 8-20: Troubleshooting circuits .......................................................................................... 169
Figure 8-21: Circuit counters ....................................................................................................... 169
Figure 8-22: VLAN circuit statistics (1-2) ..................................................................................... 170
Figure 8-23: VLAN circuit statistics (2-2) ..................................................................................... 170
Figure 8-24: Clearing counters .................................................................................................... 171
Figure 8-25: Ping - key IP troubleshooting tool ............................................................................ 171
Figure 8-26: Traffic troubleshooting exercise: Introduction .......................................................... 172
Figure 8-27: Traffic troubleshooting exercise: Preparation .......................................................... 172
Figure 8-28: Exercise 7: Traffic troubleshooting .......................................................................... 172
Figure 8-29: Context topology for traffic troubleshooting exercise ............................................... 173
Figure 8-30: Traffic troubleshooting exercises review (1-7) ......................................................... 173
Figure 8-31: Traffic troubleshooting exercises review (2-7) ........................................................ 174
Figure 8-32: Traffic troubleshooting exercises review (optional) (3-7) ......................................... 174
Figure 8-33: Traffic troubleshooting exercises review (optional) (4-7) ......................................... 175
Figure 8-34: Traffic troubleshooting exercises review (optional) (5-7) ......................................... 175
Figure 8-35: Traffic troubleshooting exercises review (optional) (6-7) ......................................... 176
Figure 8-36: Traffic troubleshooting exercises review (optional) (7-7) ......................................... 176
Figure 8-37: Chapter Summary ................................................................................................... 177