Measuring Availability in Telecommunications Networks
Mattias Thulin, November 2004

Disposition
  1. Introduction
  2. Method
  3. Network Availability
  4. SDH Network description
  5. ITU-T standard G.826
  6. Analysis
  7. Implementation
  8. Result
  9. Conclusions

Song Networks
  Nordic network provider
  Optical fiber network covering Northern Europe
  Products:
    IP-VPN
    Internet connections
    Telephone services
    Hosting
    Carrier services

Introduction
  Market demand for network quality
  Important to measure network availability
    Maintain service-level agreements
    Attract new customers
    Indicator of network quality for internal maintenance
  Methods of measuring and defining Network Availability vary between operators

Purpose
  How is Network Availability defined?
  How can it be measured?
  Why should it be measured?
  What standards exist?
  Are there any recommended values for availability parameters?
  How can availability measurements be applied to Song Networks' SDH transmission network?

Delimitations
  1) General study on Network Availability
  2) Develop a method for availability measurement
     Nortel SDH equipment
     Four rings
     44 links
     Oriented towards network-operation

Method
  Literature study
  Network study
  Monitoring system Preside
  Interviews
  Standards
  Design model for availability measurement and presentation

Network Availability - definition
  The ability of a functional unit to be in a state to perform a required function under given conditions at a given instant of time or over a given time interval, assuming that the required external resources are provided.
  ISO 2382-14, 1997

Network Availability - The “five-nines”
  Percentage value of uptime for a given time period
  “Five-nines”: 99,999%, viewed as the desired uptime at network core level

  Availability   Downtime per year
  99,9999%       32s
  99,999%        5min 15s
  99,99%         52min 36s
  99,9%          8h 46min
  99%            3 days 15h 40min

Theoretical Availability
  Summing availability
  Serial units (A followed by B): Total availability = A * B
    Example with four units in series at 99,99% each: $0{,}9999^4 \approx 0{,}9996$
  Parallel units (A alongside B): Total availability = A + B - A * B

Reactive Availability
  Data from trouble-tickets
  Good for measuring customer-experienced availability
  Easy to identify what equipment failed and what solved the error
  Can lack information about short interruptions and about interruptions outside office hours

Customer- vs. Network-management oriented
  Important to know for whom and for what purpose we are measuring
  Customer oriented
    Includes all layers
    Calculate downtime when the customer connection is not working
  Network-management oriented
    What links have lower availability?
    Considered as downtime although the traffic is rerouted

SDH Network Description
  SDH - Synchronous Digital Hierarchy
  Based on American standard SONET
  Normally built in a ring structure
  Error correction and retransmission are done by overlying protocols

Song Networks' SDH network
  Sweden ring
  Nordic ring
  Baltic ring
  Europe ring

Song Networks' SDH network
  Sweden ring: 9 Network Elements
  Nordic ring: 7 Network Elements
  Europe ring: 7 Network Elements
  Baltic ring: 3 Network Elements
  [Figure: ring diagram showing network elements 6050, 6052, 6054 and 6056 with ports G11, G12, G17 and G18]

Surveillance and statistics
  [Figure: Network Elements (NE), OPC and Preside]
  Preside
    Global performance
    Alarm lists
    Query performance statistics

Preside log files
  Comma-delimited text files (CSV)
  One file per Network Element
  96 15-min counts (past 24 hours)
  8 24-hour counts (past week)
  Example record:
  1200,Ottawa,OC48,Term,DS3,G7,2,Line,Rx,Ne,SES,03/07/99,03/07/99,16:00,0,0,0,2,3,8,12,6,0,0,1,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0

ITU-T Standard G.826
  Parameters:
    Bit-error
    Errored Block (EB): a block with at least one bit-error
    Errored Second (ES): a one-second period with at least one EB
    Severely Errored Second (SES): a one-second period with at least 30% EBs
    Unavailable Second (UAS)
  [Figure: timeline of available and unavailable periods, marking Severely Errored Seconds, Errored Seconds (non-SES) and Error-free Seconds; unavailability and availability are each detected after 10 s]

  A period of unavailable time begins at the onset of ten consecutive SES events. These ten seconds are considered to be part of unavailable time. A new period of available time begins at the onset of ten consecutive non-SES events. These ten seconds are considered to be part of available time.
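The G.826 rule quoted above can be turned into a small counting routine. The following is a minimal sketch, not the method used in the Nortel equipment or in Preside: it assumes one boolean flag per second (True for an SES), assumes the link starts in available time, and counts seconds still undecided at the end of the input as belonging to the current state. The function name `count_unavailable_seconds` is hypothetical.

```python
def count_unavailable_seconds(ses_flags, window=10):
    """Count unavailable seconds (UAS) from per-second SES flags.

    Assumptions (not from the slides): the link starts in available time,
    and a trailing run shorter than `window` does not change the state.
    """
    unavailable = [False] * len(ses_flags)
    in_unavailable = False  # current state: available or unavailable time
    run = 0                 # consecutive seconds that contradict the state

    for i, is_ses in enumerate(ses_flags):
        # SES during available time, or non-SES during unavailable time
        contradicts = bool(is_ses) != in_unavailable
        run = run + 1 if contradicts else 0
        unavailable[i] = in_unavailable  # tentative classification

        if run == window:
            # State changes at the ONSET of the run, so the last
            # `window` seconds are reclassified into the new state.
            in_unavailable = not in_unavailable
            for j in range(i - window + 1, i + 1):
                unavailable[j] = in_unavailable
            run = 0

    return sum(unavailable)


# Nine SES in a row never trigger unavailability.
short_burst = [True] * 9 + [False] * 20
# Twelve SES, then a recovery of ten clean seconds.
outage = [True] * 12 + [False] * 10
```

With these inputs, `count_unavailable_seconds(short_burst)` gives 0 and `count_unavailable_seconds(outage)` gives 12, because the ten clean seconds that end the outage count as available time.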
  (ITU-T G.826, 2002)

Analysis
  Define availability
  Develop model for calculating average availability
  Define database structure for saving availability statistics
  Specify format for availability reports

Analysis
  Follow ITU-T Standard G.826
  Apply to all active links in the network
  Calculate average availability per link, per ring and for the total network
  Present the first five significant figures
  First calculate the average UAS, then convert to a percentage in the last step to avoid rounding errors

  $\text{Availability} = \frac{\text{MeasuredTime} - \text{UAS}}{\text{MeasuredTime}} \times 100$

Analysis
  [Figure: the UAS values of three links are added and divided by 3]
  Availability for a ring is calculated from the average UAS of all the links in the ring

Implementation
  Log files -> Parser -> Database -> Analyze -> Report

Implementation - parser program
  Programmed in Java for platform independence
  Parse all log files in a directory for:
    NE
    Link
    Day
    UAS count
  Insert into MySQL database table

Implementation - report generating
  Web interface for easy access
  Input parameters: start and end date
  PHP script queries the database for UAS values and calculates average availability
    Per link
    Per ring
    Total network
  Report can be saved to PDF format (PHP script)

Implementation - Graphic reports
  Crystal Reports
  Start date and end date are entered, and the program queries the database and produces graphic reports
  Can be exported to PDF file

Result
  Between 2004-07-12 and 2004-09-19

  SDH Ring     Availability in %
  Sweden DX    99.983
  Nordic DX    99.997
  Europe DX    99.462
  Baltic DX    99.814
  TOTAL        99.81

  NE     Link   Availability in %
  6052   G11    99.989
  6052   G12    99.989
  6052   G17    99.944
  6052   G18    99.958
  6053   G17    99.856
  6053   G18    99.959
  6054   G17    100
  6054   G18    99.95
  6056   G11    99.989
  6056   G12    100
  6056   G17    99.951
  6056   G18    100
  6060   G11    100
  6060   G12    100
  6060   G17    100
  6060   G18    99.993
  6058   G17    99.993
  6058   G18    100
  6044   G11    100
  6044   G12    99.989
  6044   G17    100
  6044   G18    100
  6045   G17    100
  6045   G18    100
  6046   G17    100
  6046   G18    100

Conclusions
  Background study can be used for planning future measurements
  Positive feedback from network operations management for the weekly reports
  Need more statistics in the database to observe general trends
  By studying trends, Song Networks can cut maintenance spending and better forecast future costs by directing resources to maintain a high network quality
  Future work:
    Measure backbone availability from a customer point of view using relational databases
    How do errors in the backbone affect the distribution layer?

Questions?
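As a recap of the calculation described in the Analysis slides, the sketch below averages UAS over a set of links and converts to a percentage only in the last step. It is written in Python for brevity, whereas the implementation described in the slides uses Java, MySQL and PHP. The function names, the ring contents and the UAS values are hypothetical, and counting both the start and end date as full measured days is an assumption.

```python
from datetime import date


def availability_percent(uas, measured_seconds):
    """Availability in percent: (MeasuredTime - UAS) / MeasuredTime * 100."""
    if measured_seconds <= 0:
        raise ValueError("measured_seconds must be positive")
    return (measured_seconds - uas) / measured_seconds * 100


def average_availability(link_uas, measured_seconds):
    """Average the UAS over the links first, then convert to percent.

    Works for a ring or for the total network, depending on which
    links are passed in.
    """
    mean_uas = sum(link_uas) / len(link_uas)
    return availability_percent(mean_uas, measured_seconds)


def five_significant(value):
    """Format with five significant figures (trailing zeros are dropped)."""
    return f"{value:.5g}"


# Measurement period from the Result slide; both end days are counted
# as full days, which is an assumption.
start, end = date(2004, 7, 12), date(2004, 9, 19)
measured_seconds = ((end - start).days + 1) * 86400

# Hypothetical UAS totals for three links of one ring (not measured data).
ring_uas = [0, 6048, 12096]
ring_result = five_significant(average_availability(ring_uas, measured_seconds))
```

Because every link is measured over the same period, averaging UAS before converting gives the same figure as averaging the per-link percentages, but without rounding any intermediate percentage.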