Journal of Network and Systems Management
Manuscript Draft

Full Title: Visualizing Internet Network Unavailability: Insights for Better Action
Article Type: S.I. : Responsible Internet
Keywords: Data analytics; Business Intelligence; Data visualization; informed decision making
Corresponding Author: Mesfin Woldmariam, PhD, Addis Ababa University, Addis Ababa, ETHIOPIA
Corresponding Author's Institution: Addis Ababa University
First Author: Mesfin Woldmariam, PhD
Order of Authors: Mesfin Woldmariam, PhD
Manuscript Region of Origin: FALKLAND ISLANDS (MALVINAS)

Abstract

Telecom companies capture network availability and unavailability data continuously and automatically, but it is not common to see them conduct analytics on these datasets. As a result, managers usually cannot tell which network stations are losing them revenue, which stations bring in more revenue, and at which times many networks become unavailable. These questions can be answered by neither the design science nor the behavioral science fields of information systems. Using network unavailability datasets from Ethio-telecom, this study demonstrates how an analytics-based information system can help managers visualize communication network unavailability across network stations in Ethiopia. The study reveals that network unavailability forms a pattern across regions and time. For example, the southern regions have persistently high unavailability, while central and western Addis Ababa have continuously high network availability. In terms of time, May, June, and July have high unavailability, while March, December, September, and November have the highest availability. Thursday is the day on which the most network unavailability is reported. It would be very interesting to analyze why this kind of uniform pattern emerged across regions and months. Insights of this kind alert decision makers to conduct more investigative studies for better action. From an academic perspective, the author urges information systems scholars to give attention to the overlooked field of analytics-based information systems. With ever-increasing datasets, analytics-based information systems deserve more attention than in the past.

1. Introduction

Service outages in internet networks can occur due to factors such as power outages, telecom towers falling in heavy rain and wind, sabotage by individuals, malfunctions in the core transmission network, and major damage to core network elements, such as controllers, or to the switching functions [1]. The prevalence of such causes means a huge loss of income for the service provider, and it also affects the businesses and individuals who use the network. For example, individuals and businesses seeking to rent office space need information about which parts of the city have regular internet interruptions and which do not. Similarly, service providers need to know which network sites are more profitable and which are not, what the pattern of network unavailability is across regions and zones, and when unavailability in the country is highest. With such information, both service providers and service users can make informed decisions. Service providers manage hundreds of thousands of network sites that are connected to a central controlling station.
When a network site encounters a fault, the information is captured by the central system so that the maintenance department can fix it. Promptly fixing such problems adds to network reliability [2, 3], whereas delays result in lost revenue for the service provider and dissatisfaction among service users. Uninterrupted network availability is essential for customer satisfaction as well as for profitability. In practice, however, achieving 100% network availability is unrealistic because the causes are beyond the control of telecom companies. Figure 1 shows how data about network problems is centrally and automatically captured by the system.

Figure 1: Central data capturing system

Insights about the characteristics and patterns of network unavailability across stations help managers prioritize maintenance and follow-up. They also help businesses and individuals make informed decisions when renting office space; consider a tenant who rents an office and then faces continuous internet interruptions and delayed maintenance. Delays in fixing network problems and repeated maintenance on specific network sites cost telecom companies a great deal. Particularly in developing nations, where almost all spare parts are imported, repeated maintenance is expensive and counterproductive. Because Ethio-telecom lacks data analytics departments and units, this problem went unnoticed by the company's top-level management. As a result, Ethio-telecom lost an estimated 500,000 USD from November 17 to May 17, 2020. From this perspective, this study demonstrates how data analytics gives insights that are useful to managers and help avoid extra expenses [4, 5].
From an academic point of view, this study demonstrates the need for analytics-based information systems, as stated by [6, 7].

2. The Need for Data Analytics Based Information Systems Study

Data-driven decision making means deciding on the basis of insights found in datasets. With the growth of digital products, services, and interactions, together with smart systems and agents, organizations collect huge datasets easily. Telecom companies in particular hold huge datasets that can be analyzed to deliver better services. In African countries like Ethiopia, however, it is not common to derive insights from data, and most decisions are made on a business-as-usual basis. Business managers are therefore missing opportunities.

Every day we create so much data that 90% of the data in the world today was created in the last two years alone [8]. Such datasets are the backbone of data analytics and informed decision making [9], which is the springboard for the efficient and effective performance of today's and future organizations. Applying data analytics to the day-to-day operations of telecom organizations is becoming imperative because it can provide valuable insights, knowledge, and early warning of issues [10]. Making sense of both structured and unstructured data through analytics helps organizations revisit their routine operations, processes, and decisions. For this to materialize, they need to turn their datasets into insights. Such practices can also help organizations gain competitive advantage in the industry. The application of analytics or knowledge discovery in the telecom industry has been reported by different scholars, for example [11, 12, 13]. These studies argued that information and data visualizations help executives make informed decisions.
Unlike the behavioral and design science approaches to information systems research, research in data analytics based information systems needs to be guided not by theories and frameworks but by a set of exploratory questions whose answers will be extracted from the dataset itself. The output can be predictions, automated decisions, models that learn from data, or any type of data visualization that delivers insights to decision makers [6]. For example, building dashboards from data visualizations can help decision making [7]. To demonstrate this argument, the network unavailability datasets of Ethio-telecom for 2019 and 2020 were analyzed. The following section gives the brief characteristics of the datasets.

3. Research Datasets

The dataset has 6 columns, as shown in Table 1 below, and contains 32,300 records. There are 23 telecommunication regions, and each of them has many network stations identified by their site IDs (column two).

Region     Site ID   TAG      Domain   Plan Date    Close Date
1. CAAZ    111104    Normal   Power    31/01/2019   5/2/2019
2. NAAZ    111143    Normal   Power    2/5/2019     7/5/2019
3. WAAZ    111202    Normal   Power    2/5/2019     7/5/2019
....       ....      ....     ....     ....         ....
23. SR     ....      ....     ....     ....         ....

Table 1: Sample format of original datasets

Column Name   Description
Region        Name of the region where network unavailability is reported. The column lists all the regions, such as CAAZ, SR, SWR, etc.
Site ID       Each region has hundreds of network sites, and each site has its own site ID.
TAG           The problem on the site. It has values like Core, Normal, and Solar, which mean a problem on the core network, a minor problem, and a problem related to solar power, respectively.
Domain        Source of the problem. It has only one value, "Power".
Plan Date     The date (day, month, and year) of the problem as captured by the central system.
Close Date    The date when the maintenance was completed.

Table 2: Description of dataset format

4. Insights from the datasets

Accurate decision making needs accurate data analysis and presentation in a form that decision makers can readily consume. In the big data era, this requires the use of tools such as visualization [14] and big data analytics techniques [15]. In line with this, the present study used the desktop version of Tableau 2020.3, one of the leading visualization tools in the industry. Before the analysis and visualization, the necessary data preprocessing and transformation were carried out.

Figure 2: Number of network unavailability days per region/zone

4.1. Network Unavailability across the Regions

As can be seen from Figure 2 above, the south (SR), south western (SWR), and south eastern (SSWR) regions have the highest number of network unavailable days. When there are interruptions and disturbances on the network infrastructure in these regions, fixing them takes longer (in days) than in other regions. Even though the data does not tell why this is happening in these regions, the insight is clear: these regions are costing the company more than other regions. The figure of roughly 25,000 unavailable days is calculated as the sum of the number of days that network sites were out of service in the region. For example, as indicated in Figure 2 above, there are around 2,500 reported cases of network site failure in the southern region. Dividing the total days by the number of failures (25,000 / 2,500) means that each network site failure lasts for an average of 10 days before being fixed.
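The averaging step above (total unavailable days divided by the number of reported failures) can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the procedure used in the study, which relied on Tableau: it assumes that Plan Date and Close Date are written as day/month/year, it uses only the three sample rows shown in Table 1, and the names `outage_days` and `summarize_by_region` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Assumption: dates are day/month/year, as the sample value 31/01/2019 suggests.
DATE_FORMAT = "%d/%m/%Y"

# Sample rows copied from Table 1: (region, site_id, plan_date, close_date).
SAMPLE_ROWS = [
    ("CAAZ", "111104", "31/01/2019", "5/2/2019"),
    ("NAAZ", "111143", "2/5/2019", "7/5/2019"),
    ("WAAZ", "111202", "2/5/2019", "7/5/2019"),
]


def outage_days(plan_date, close_date):
    """Number of days between the reported fault and the completed maintenance."""
    start = datetime.strptime(plan_date, DATE_FORMAT)
    end = datetime.strptime(close_date, DATE_FORMAT)
    return (end - start).days


def summarize_by_region(rows):
    """Total unavailable days, failure count, and mean days per failure for each region."""
    totals = defaultdict(lambda: [0, 0])  # region -> [total_days, failures]
    for region, _site_id, plan_date, close_date in rows:
        totals[region][0] += outage_days(plan_date, close_date)
        totals[region][1] += 1
    return {
        region: {"total_days": days, "failures": count, "mean_days": days / count}
        for region, (days, count) in totals.items()
    }


summary = summarize_by_region(SAMPLE_ROWS)
for region, stats in summary.items():
    print(region, stats)

# The same ratio applied to the approximate southern-region totals read from Figure 2.
southern_mean_days = 25_000 / 2_500
print("Southern region mean days per failure:", southern_mean_days)
```

Applied to the full 32,300-record dataset, the same grouping would give the per-region totals plotted in Figure 2 and the corresponding mean repair time per failure.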
Given this roughly ten-day average outage, customers of the company who reside in these regions are the most disadvantaged and have less competitive advantage in the use of internet-based services, products, or business models, as also reported by [16, 17]. For individuals and enterprises that intend to embark on digital business that depends heavily on the internet, this kind of information is crucial and worth considering. Given the current pandemic (COVID-19), many organizations (government and private), particularly those working in the education sector, are moving towards online education. Given the above information on internet network unavailability, businesses and institutions planning to operate from the southern part of the country will face a serious challenge, because the network in these areas is unreliable.

The uniform decline in the figure should also alert telecom managers. Under normal circumstances, the figure would be expected to look like a zigzag rather than a near-linear decline. This suggests that something is going unnoticed.

4.2. Network Unavailability of 2019-2020 for the regions

Comparing network unavailability across the sites for the two years reveals that the trend is uniform. As shown in Figure 3 below, regions with high network unavailability in 2019 also have high unavailability in 2020, and those with low unavailability in 2019 also have low unavailability in 2020. This uniformity should catch the attention of any stakeholder. Why has this happened? Under normal circumstances, one would expect ups and downs in the regions across the two years. The data hints at the presence of something that cannot be identified from the present data. Answering this question is not the objective of this paper.
However, the author strongly recommends that the management of the company assess what is going on behind the scenes.

Figure 3: Comparing network unavailability in the regions for two years

4.3. Network Unavailability across the months for the regions

As illustrated by Figure 4 below, the southern regions have the highest network unavailability in the months of May, June, January, July, and October. Given this insight, it would be interesting to explore what is going on in these regions during these months. In particular, the scale of interruption during the month of May is striking and needs further investigation. Based on the literature, strong wind, heavy rain, and political instability might be contributing factors. On the other hand, March and November are the two months with the least internet unavailability throughout the regions. This kind of information from the dataset is very informative to both service users and service providers. The service provider can easily spot which network sites are bringing in more revenue and which ones are costing it money. Similarly, service users (individuals or firms) who intend to provide internet-based services and products, such as e-learning and e-commerce, can make informed decisions when selecting their operational location. It would also be interesting to investigate what is special about these months.

Figure 4: Network unavailability for months

4.4. Network Unavailability across days of the week

Analysis at a more granular level (day-level interruption) reveals that Thursday has the maximum number of network unavailability reports for all the regions; see Figure 5 below. Again, the regions in the southern part of the country have the highest number of network unavailability reports.
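The month and day-of-week breakdowns in Sections 4.3 and 4.4 amount to grouping failure reports by calendar attributes of the Plan Date. The following minimal sketch shows that grouping; it is not the study's Tableau workflow. It assumes day/month/year dates, uses only the Plan Date values from the sample rows in Table 1, and the helper names `count_by_weekday` and `count_by_month` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Assumption: dates are day/month/year, as the sample value 31/01/2019 suggests.
DATE_FORMAT = "%d/%m/%Y"

# Fixed English names so the output does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Plan Date values copied from the sample rows in Table 1.
SAMPLE_PLAN_DATES = ["31/01/2019", "2/5/2019", "2/5/2019"]


def count_by_weekday(plan_dates):
    """Count failure reports per day of the week."""
    counts = Counter()
    for text in plan_dates:
        day = datetime.strptime(text, DATE_FORMAT)
        counts[WEEKDAYS[day.weekday()]] += 1  # weekday(): Monday is 0
    return counts


def count_by_month(plan_dates):
    """Count failure reports per calendar month."""
    counts = Counter()
    for text in plan_dates:
        day = datetime.strptime(text, DATE_FORMAT)
        counts[MONTHS[day.month - 1]] += 1
    return counts


weekday_counts = count_by_weekday(SAMPLE_PLAN_DATES)
month_counts = count_by_month(SAMPLE_PLAN_DATES)
print(weekday_counts.most_common())
print(month_counts.most_common())
```

Three sample rows cannot support any conclusion on their own; the point is only that the same counting, run over the full dataset and split by region, would yield the distributions shown in Figures 4 and 5.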
The Thursday peak should also catch the attention of telecom managers. It would be interesting to investigate what happens across all network stations in the country every Thursday.

Figure 5: Network unavailability during the days of the week across the regions

5. Conclusion

Internet network availability is key for businesses that intend to provide digital products and services to their customers. In the era of the global information society, network availability is also something like a basic need for individual citizens. The internet is the backbone of many businesses, stakeholders, and citizens. As a result, it is common to see organizations with websites, e-mail, a social media presence, and collaboration technologies. However, it is not possible to have the network available all the time.

In this study, the application of data visualization techniques reveals internet network unavailability in the country at different levels of aggregation: regions, months, and days. According to the data analysis, network stations in the southern part of the country have the highest number of days of internet unavailability. May, June, and July are the months with the highest unavailability. Thursday stands out as the day with the most reported unavailability throughout the nation. This information needs close attention and investigation from the service provider. More network unavailability means more lost revenue and more customer dissatisfaction.

Another striking finding is the uniform trend across all the regions for the two years, across the months, and even across days.
Under normal circumstances, this kind of uniformity is not expected. Some possible areas to investigate are the technology infrastructure deployed in the regions, the behavior of maintenance personnel and the efficiency of supervision, and the service load on network equipment.

From an academic point of view, the findings reported in this study demonstrate information systems research problems that have not received enough attention from the information systems community. These problems are not guided by design science or behavioral science research approaches. Therefore, the author calls on information systems scholars to reconsider the marginalized field of analytics-based information systems, both in research and in curricula.

References

[1] Säe, J., & Lempiäinen, J. Maintaining Mobile Network Coverage Availability in Disturbance Scenarios. Mobile Information Systems (2016).
[2] Karevan, A., Tee, K. F., & Vasili, M. A reliability-based and sustainability-informed maintenance optimization considering risk attitudes for telecommunications equipment. International Journal of Quality & Reliability Management (2020).
[3] Sehgal, A., & Vaishya, R. O. Availability of network with its maintenance in telecom industry. International Research Journal of Engineering and Technology (IRJET), PEC University of Technology, Chandigarh (2017).
[4] Mandinach, E. B. A perfect time for data use: Using data-driven decision making to inform practice. Educational Psychologist, 47(2), 71-85 (2012).
[5] Miragliotta, G., Sianesi, A., Convertini, E., & Distante, R. Data driven management in Industry 4.0: a method to measure Data Productivity. IFAC-PapersOnLine, 51(11), 19-24 (2018).
[6] Bichler, M., Heinzl, A., & van der Aalst, W. M. Business analytics and data science: once again? (2017).
[7] Holsapple, C., Lee-Post, A., & Pakath, R. A unified foundation for business analytics. Decision Support Systems, 64C (August), 130-141 (2014).
[8] Columbus, L. IBM predicts demand for data scientists will soar 28% by 2020. IBM White Paper (2017).
[9] Fiaz, A. S., Asha, N., Sumathi, D., & Navaz, A. S. Data visualization: enhancing big data more adaptable and valuable. International Journal of Applied Engineering Research, 11(4), 2801-2804 (2016).
[10] Malabocchia, F., Buriano, L., Mollo, M. J., Richeldi, M., & Rossotto, M. Mining telecommunications data bases: An approach to support the business management. In NOMS 98, 1998 IEEE Network Operations and Management Symposium (Vol. 1, pp. 196-204). IEEE (1998, February).
[11] Daki, H., El Hannani, A., Aqqal, A., Haidine, A., Dahbi, A., & Ouahmane, H. Towards adopting Big Data technologies by mobile networks operators: A Moroccan case study. In 2016 2nd International Conference on Cloud Computing Technologies and Applications (CloudTech) (pp. 154-161). IEEE (2016, May).
[12] Wei, C. P., & Chiu, I. T. Turning telecommunications call details to churn prediction: a data mining approach. Expert Systems with Applications, 23(2), 103-112 (2002).
[13] Zelezny, F., Miksovsky, P., Stepankova, O., & Zidek, J. KDD in Telecommunications. In Workshop on data mining, decision support, meta-learning and ILP: forum for practical problem presentation and prospective solutions, Lyon, University of Porto (2000).
[14] Zhang, Q., Tao, C., & Chen, C. Research on mobile internet data visualization technology and application. Telecommunications Science, 30(10), 8 (2014).
[15] Mahmood, T., & Munir, K. Enabling Predictive and Preventive Maintenance using IoT and Big Data in the Telecom Sector. In IoTBDS (pp. 169-176) (2020).
[16] Hailu, T. Assessing the effect of service quality factors on broadband internet customer's satisfaction in Addis Ababa, the case of ethio-telecom (Doctoral dissertation, St. Mary's University) (2014).
[17] Tadesse, A. Evaluating Quality of Services of 4G LTE Cellular Data Network: The Case of Addis Ababa (Doctoral dissertation, St. Mary's University) (2019).