Using Artificial Intelligence to Create Value in the Insurance Sector

Chapter 1

1.0 Introduction

In this chapter, the researcher explains the background of the study, the problem statement, the research questions, the objectives of the research, the significance of the study, the scope of the study, the limitations of the study, and the definitions of terms related to the research topic: using Artificial Intelligence to create value in the insurance sector, specifically in fraud detection and prevention.

1.1 Background of Study

The world we live in today is full of uncertainties and risks such as accidents, health hazards, business crises and many more. Financial losses follow when these uncertainties materialize. People therefore created a mechanism for managing these risks, which we now call insurance. Insurance is a contract, represented by a policy, in which an individual or an organization receives financial protection from an insurance company to prevent or reduce the cost of a loss (Investopedia, 2020). Over the past several years, human technologies have advanced significantly. One of the most valued of these technologies is Artificial Intelligence (AI). AI is a highly advanced technology that allows machines or computer systems to exhibit thinking and intelligent behavior similar to humans, including gathering information, analyzing data, learning new things and making decisions. Improvements in accessible data, computing power and predictive capability have greatly accelerated AI development. Humans now use AI throughout the landscape of their lives, often without realizing it. World-leading companies such as IBM, Apple, Google and Amazon use AI platforms and solutions for customers, partners and employees. AI is disrupting and improving companies and organizations across all industries, including the insurance industry.
In the insurance industry, AI has proved successful in areas such as underwriting, customer service and fraud detection (Content.naic.org, 2020). The purpose of this study is to evaluate how AI brings advantages to the insurance industry, more specifically in detecting fraudulent claims. Many people try to take advantage of the insurance system and make a quick profit by committing claims fraud. Insurance fraud happens when a claimant fabricates or exaggerates information in order to obtain the money guaranteed by their insurance policy. Some of the most common types of insurance fraud are accident insurance fraud, contractor insurance fraud, break-in insurance fraud and disaster insurance fraud (TransUnion, 2020). This study is therefore important because it helps us understand how AI can counter or reduce cases of fraudulent claims worldwide.

1.2 Problem Statement

The issue of fraudulent activity has existed for a long time. Insurers have historically relied on mathematics and manual work to measure risk and detect fraud, which is time-consuming and complex. They depend mainly on expert scrutiny, adjusters and special investigation teams (Dhieb, Ghazzai, Besbes and Massoud, 2020). According to the Federal Bureau of Investigation (FBI), insurance fraud costs a total of US$40 billion a year in the United States, the largest insurance market in the world. According to the Association of British Insurers, the United Kingdom detected around 125,000 insurance fraud cases amounting to 1.3 billion pounds in 2016 (The Edge Markets, 2020). According to Business Wire, the global insurance fraud detection market was valued at US$3.29 billion in 2018 and is expected to grow further, at a compound annual growth rate of 15.2% from 2019 to 2027 (WIRE, B, 2020).
The statistics gathered from around the world therefore suggest that relying on mathematicians and manual work to detect fraudulent activity in the insurance sector has not worked well. As a result, organizations have begun to use advanced technologies such as AI to accomplish, far more efficiently, what manual processes cannot. Studies show that Anadolu Sigorta, the first and one of the largest insurance companies in Turkey, saved US$5.7 million in fraud detection and prevention costs after adopting AI predictive software from Friss. Before applying the software, the company had to investigate 25,000 to 30,000 claims within two weeks using a manual process. Another example is AXA, one of France's top insurance companies, which also purchased AI fraud detection software. The software monitors AXA's entire network and is able to contain emerging threats before they become a major issue (Mejia, N, 2020). However, although most respondents whose companies deployed AI software claim that it achieves what manual work cannot, only 21% of those respondents apply it in any depth. This indicates that AI has not yet become the first choice of problem solver. Many organizations still lack the fundamental capabilities needed to use AI in detecting fraud, such as clear strategies for collecting the data AI requires (McKinsey, 2020). Besides that, studies show a significant difference in the adoption of AI in the insurance sector between Asia and Western countries. According to the Insurance AI & Analytics Survey 2018, more than 70% of all North American insurance carriers are already implementing and investing in AI projects (Bharadwaj, R, 2020). On the other hand, according to Srikanth Venkatesan, the Asia Pacific head of insurance, only around 40% of all large insurance organizations reported AI deployment in some of their projects (Olano, G, 2020).
The purpose of this research is therefore to give the insurance sector a better understanding of the importance and value that AI can provide to companies. AI can increase a company's efficiency by significantly outperforming human capabilities, and it can reduce fraudulent claims in the insurance sector, avoiding unnecessary monetary losses.

1.3 Research Questions

Central question: How does artificial intelligence relate to the insurance industry?
Sub-question: How does artificial intelligence create value in the insurance industry?

1.4 Significance of Study

The purpose of this study is to investigate the value of Artificial Intelligence in the insurance industry, specifically in detecting fraudulent activity. It allows insurers to understand the seriousness of fraudulent claims, which cause billions in financial losses in the insurance sector. Besides that, the researcher believes that this study will also inform insurers about the benefits of Artificial Intelligence in detecting and preventing fraudulent claims.

1.5 Scope of Study

The scope of this research is the insurance sector in general. There are many insurance companies in the world operating in different fields such as life insurance, general insurance, health insurance and many more. In the United States, according to the National Association of Insurance Commissioners, there were a total of 5,965 insurance companies in 2018 (Iii.org, 2020). In Malaysia, there are a total of 22 licensed insurance companies and takaful operators according to the Central Bank of Malaysia (Bnm.gov.my, 2020). The focal point of this study is insurance companies that implement artificial intelligence specifically to detect and prevent fraudulent claims.

1.6 Limitations of Study

Limitations of a study are inevitable. In the data collection process, there may be limitations such as the data or information needed being confidential, or participants being unwilling to provide it due to corporate restrictions.
Furthermore, artificial intelligence technologies are, on the whole, still under development and research, and the technology still has substantial room for improvement. In other words, artificial intelligence is not yet widely adopted or accepted. On top of that, time, cost and access to information are also limitations of this study, as the research must be completed within the limited time given. However, a flexible and suitable research methodology will be proposed in order to deal with these challenges.

1.7 Definitions of Terms

The conceptual and operational definitions of key terms in this study are as follows:

Machine Learning
Conceptual definition: Machine learning is the concept that a computer program can learn and adapt to new data without human intervention. Machine learning is a field of artificial intelligence that keeps a computer's built-in algorithms current regardless of changes in the worldwide economy (Investopedia, 2020).
Operational definition: Machine learning operations is the use of machine learning models by development/operations teams. It seeks to add discipline to the development and deployment of machine learning models by defining processes to make ML development more reliable and productive (WhatIs.com, 2020).

Supervised Learning
Conceptual definition: Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. The process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process (Brownlee, J., 2020).
Operational definition: Supervised learning is where you have input variables (x) and an output variable (Y) and you use an algorithm to learn the mapping function from the input to the output, Y = f(X). The goal is to approximate the mapping function so well that when you have new input data (x) you can predict the output variables (Y) for that data (Brownlee, J., 2020).
Unsupervised Learning
Conceptual definition: Unsupervised learning is a type of machine learning that looks for previously undetected patterns in a data set with no pre-existing labels and with a minimum of human supervision. Unlike supervised learning above, there are no correct answers and there is no teacher; algorithms are left to their own devices to discover and present the interesting structure in the data (Brownlee, J., 2020).
Operational definition: Unsupervised learning is where you only have input data (X) and no corresponding output variables. The goal of unsupervised learning is to model the underlying structure or distribution in the data in order to learn more about the data (Brownlee, J., 2020).

Data Mining
Conceptual definition: Data mining is a process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. It depends on effective data collection, warehousing, and computer processing (Investopedia, 2020).
Operational definition: Data mining involves exploring and analyzing large blocks of information to glean meaningful patterns and trends. It can be used in a variety of ways, such as database marketing, credit risk management, fraud detection, spam email filtering, or even to discern the sentiment or opinion of users (Investopedia, 2020).

Chapter 2: Literature Review

2.0 Introduction

This chapter reviews the extant literature on fraudulent claims in the insurance sector and the importance of artificial intelligence in the sector, specifically in detecting and preventing fraud, followed by how artificial intelligence works on fraudulent claims. Besides that, the researcher discusses the challenges and issues faced by artificial intelligence in detecting and preventing fraud.

2.1 Fraudulent Claims in the Insurance Sector

There are many definitions of fraud and fraudulent activities.
The Association of Certified Fraud Examiners (ACFE) defines fraud as the use of one's occupation for personal enrichment through the deliberate misuse or misapplication of the employing organization's resources or assets. In recent years, insurance fraud detection has garnered a great deal of attention because a range of fraudulent methods has brought great loss to insurance companies and society as a whole. Insurance fraud detection is a branch of financial fraud detection (Abdallah, A. and Zainal, A, 2016). The amounts involved in fraud have certainly increased as insurance made its transition into modern consumer society, and the industry faces a problem of increasing prevalence and sizeable proportions. Insurance fraud and, more generally, abuse of insurance not only put the profitability of the insurer at risk, but also negatively affect its value chain and the insurance industry, and may be extremely detrimental to established social and economic structures. They are believed to materially escalate the cost of certain types of insurance such as automobile, fire and health insurance. Eventually, they threaten the very principle of solidarity that keeps the insurance concept alive (The Geneva Papers on Risk and Insurance, 2004). Previous studies thus show that fraudulent insurance activity has occurred since the very beginning, and that almost any technological system that involves money and services can be compromised by fraudulent acts (Almeida, 2009).

2.2 Artificial Intelligence in the Insurance Industry

According to Wang Y (2018), in order to conquer the problem of insurance fraud, researchers have invested great effort in finding effective fraud indicators and methods. Fraud indicators play a critical role in insurance fraud detection: appropriate indicators make it possible for detection methods and algorithms to maximize the effectiveness of detection.
Previous studies show that insurers have historically relied on mathematicians to measure risk and formulate premium rates for policy underwriting that would ensure rational levels of payouts without endangering the company's financial health. Traditional insurance fraud detection methods are complex and time-consuming; they depend mainly on expert scrutiny, adjusters, and special investigation services. Added to that, manual detection results in additional costs and inaccurate results, and late decisions might lead to extra losses for the insurance companies (Dhieb, N., Ghazzai, H., Besbes, H. and Massoud, Y, 2020). Artificial intelligence methods include data mining, statistical, mathematical, and machine learning techniques to extract and identify useful information and subsequent knowledge from large databases. These systems have several main advantages: fraud patterns are obtained automatically from data; a "fraud likelihood" is specified for each case, so that efforts in investigating suspicious cases can be prioritized; and new fraud types that were not defined or seen before can be revealed (Li et al., 2008). With these advantages, AI can increase the efficiency of detecting and preventing fraudulent claims compared to manual processes.

2.3 How Artificial Intelligence Detects and Prevents Fraud

Fraud is increasing dramatically with the progression of modern technology and global communication. As a result, fighting fraud has become an important issue to be explored, and detection and prevention mechanisms are used mostly to combat it. Fraud protection systems can be divided into two categories: fraud prevention systems and fraud detection systems.

2.3.1 Fraud Prevention System

A fraud prevention system is the first layer of protection securing technological systems against fraud. The purpose of this phase is to stop fraud from occurring in the first place.
Mechanisms in this phase restrict, control, remove, or prevent the occurrence of attacks on computer systems (hardware and software), networks, or data. One example of such a mechanism is an encryption algorithm applied to scramble data. Another is a firewall, which forms a blockade between an internal privately owned network and external networks. It not only helps to secure systems from unauthorized access, but also allows an organization to enforce a network security policy on traffic flowing between its network and the Internet (Asherry, M, 2013). However, according to Belo and Vieira (2011), this layer of the system is not always efficient and strong; on some occasions, the prevention layer can be breached by fraudsters.

2.3.2 Fraud Detection System

The fraud detection system is the next layer of protection, and the main concern of this paper. Fraud detection tries to discover and identify fraudulent activities as they enter the systems and report them to a system administrator (Behdad et al., 2012). In previous years, manual fraud audit techniques such as discovery sampling were used to detect fraud, as in Tennyson and Forn (2002). These complicated and time-consuming techniques draw on various areas of knowledge such as economics, finance, law and business practices. Therefore, to raise the effectiveness of detection, computerized and automated fraud detection systems were invented. However, their capabilities were limited because detection fundamentally depended on predefined rules stated by experts (Li et al., 2008).

2.4 Machine Learning

Machine learning is an application of artificial intelligence that gives systems the ability to learn automatically and improve over time from experience. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.
According to Michalski and Kubat (1998), to go beyond this, a data analysis system has to be equipped with a substantial amount of background knowledge and be able to perform reasoning tasks involving that knowledge and the data provided. In an effort to meet this goal, researchers have turned to ideas from the machine learning field. This is a natural source of ideas, since the machine learning task can be described as turning background knowledge and examples into knowledge. Machine learning and artificial intelligence solutions can be categorized into two types, supervised and unsupervised learning. These methods look for data, such as accounts, customers and suppliers, that behave unusually (Bolton, R. & Hand, D, 2002).

2.4.1 Supervised Learning

In supervised learning, records are classified as fraudulent or non-fraudulent, which helps insurance fraud examiners save the time they would otherwise spend monitoring every claim. In the case of fraudulent activity, a large sample is required to obtain results. These labeled records are used to train a supervised machine learning algorithm, which then gains the ability to automatically classify new records as either fraudulent or non-fraudulent (Dal Pozzolo et al., 2014). However, supervised learning has several limitations. The first is the difficulty of collecting supervision, i.e. labels: when there is a huge volume of input data, it is prohibitively expensive, if not impossible, to label all of it. Second, it is sometimes extremely hard to assign a distinctive label, as there are uncertainties and ambiguities in the supervision. These limitations may obstruct the implementation of supervised learning approaches in some cases; unsupervised learning and semi-supervised learning are therefore used to overcome these disadvantages (Liu, Xu and Yu, 2011).
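As a minimal illustration of the supervised approach described above, the sketch below trains a small logistic-regression classifier on labeled claim records. The two features (claim amount and policy age), all the figures and the decision threshold are hypothetical, chosen only for demonstration; they are not drawn from any insurer's data or from the studies cited above, and a real system would use many more features and records.

```python
import math

def train_logistic(records, labels, lr=0.1, epochs=2000):
    """Fit a logistic-regression classifier by stochastic gradient descent.
    records: list of claim feature vectors; labels: 1 = fraudulent, 0 = legitimate."""
    w = [0.0] * len(records[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(records, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted fraud probability
            err = p - y                      # gradient of the log-loss w.r.t. z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Classify a new claim: 1 = flag as fraudulent, 0 = treat as legitimate."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0

# Hypothetical labeled claims: [claim amount in units of 10,000, policy age in years].
train_x = [[0.5, 3.0], [0.8, 5.0], [6.0, 0.1], [7.5, 0.2], [0.6, 4.0], [8.0, 0.3]]
train_y = [0, 0, 1, 1, 0, 1]   # 1 = claim known to be fraudulent

w, b = train_logistic(train_x, train_y)
print(predict(w, b, [7.0, 0.1]))  # large claim on a very new policy
print(predict(w, b, [0.7, 4.5]))  # small claim on a mature policy
```

This also makes the section's limitations concrete: the classifier is only as good as the labels `train_y`, which in practice must come from costly expert investigation.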
2.4.2 Unsupervised Learning

Unsupervised learning techniques detect fraudulent instances in an unlabeled test data set under the assumption that the majority of instances in the data set are non-fraudulent. Unlike supervised techniques, unsupervised learning uses no class labels for model construction. The main benefit of the unsupervised approach is that it does not rely on accurately labeled data, which is often in short supply or non-existent (Bolton, R. & Hand, D, 2002).

2.4.3 Semi-supervised Learning

Semi-supervised learning lies between supervised and unsupervised learning, since it involves a small number of labeled samples and a large number of unlabeled samples. The main goal of the semi-supervised approach is to train a classifier from both labeled and unlabeled data (Zhu et al., 2011). Semi-supervised learning has an advantage over supervised learning because it achieves better performance by utilizing both labeled and unlabeled data, with fewer labeled instances. Furthermore, semi-supervised learning also provides a computational model for understanding human category learning, where most of the input is self-evidently unlabeled (Xiaojin Zhu and Goldberg, 2009).

2.5 Challenges and Issues

Fraud detection is a complex domain; a fraud detection system may be prone to failure, have a low accuracy rate, or raise many false alarms. It is extremely difficult for electronic commerce systems to handle the fraud problem, forcing them to incur heavy losses. This happens because fraud detection systems must deal with multiple challenges.

2.5.1 Concept Drift

Fraud detection systems work in a dynamic environment where the behavior of legitimate users and fraudsters is continuously changing; this is known as concept drift. Concept drift primarily refers to an online supervised learning scenario in which the relation between the input data and the target variable changes over time.
In supervised learning, the aim is to predict a target variable y given a set of input features X. In the training instances used for model building, both X (the input data) and y (the target variable) are known. In a new instance to which the predictive model is applied, X is known but y is not known at the time of prediction, and the relation between the input data and the target variable may have changed (Gama et al., 2013).

2.5.2 Skewed Class Distribution

A skewed distribution (imbalanced classes) is considered one of the most critical issues faced by fraud detection systems. In general, the class imbalance issue is the situation where there are far fewer samples of fraudulent instances than of normal instances. In a supervised learning approach, the class imbalance problem arises when the minority class is very small, leading to numerous problems such as the inability of learners to discover patterns in the minority-class data. Furthermore, class imbalance has a serious impact on the performance of classifiers, which tend to be overwhelmed by the majority class and to ignore the minority class (Liu et al., 2012).

2.6 Theory

2.6.1 Benford's Law

Benford's Law specifies the distribution of the digits of naturally occurring phenomena. For a long time, this technique, commonly used in areas of taxation and accounting, was considered mostly a mathematical curiosity, as it described the frequency with which individual digits and sets of digits should appear in naturally growing phenomena such as population measures. Such naturally growing phenomena, however, have been shown to include practical areas such as spending records and stock market values. One of the limits to the use of classic Benford's Law in fraud detection has been its requirement that analyzed records have no artificial cutoffs; in other words, records must be complete (Lu and Boritz, 2005).
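To make Benford's Law concrete, the sketch below compares the leading-digit distribution of two synthetic sets of claim amounts against the Benford frequencies P(d) = log10(1 + 1/d). The mean-absolute-deviation score is a simple flagging heuristic assumed here for illustration; it is not the specific test used in the studies cited above, and both data sets are invented.

```python
import math
from collections import Counter

def benford_expected():
    """Benford's Law: P(leading digit = d) = log10(1 + 1/d) for d in 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(x):
    """First significant digit of a positive amount."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while 0 < x < 1:
        x *= 10
    return int(x)

def benford_mad(amounts):
    """Mean absolute deviation between the observed leading-digit frequencies
    and the Benford frequencies; larger scores suggest manipulated figures."""
    counts = Counter(leading_digit(a) for a in amounts if a != 0)
    n = sum(counts.values())
    expected = benford_expected()
    return sum(abs(counts.get(d, 0) / n - expected[d]) for d in range(1, 10)) / 9

# Naturally growing amounts (a geometric series spanning several orders of
# magnitude) follow Benford closely; invented amounts clustered in one narrow
# band do not.
natural = [1.05 ** k for k in range(1, 300)]
fabricated = [500 + k for k in range(100)]
print(benford_mad(natural) < benford_mad(fabricated))  # prints True
```

The `fabricated` set also shows the "artificial cutoff" caveat from the text: records truncated to a narrow range will fail the test even when legitimate, so completeness of the analyzed records matters.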
Chapter 3: Methodology

3.1 Research Philosophy

According to Saunders, Thornhill and Lewis (2009), the term research philosophy is defined as a system of beliefs and assumptions about the development of knowledge. In other words, a research philosophy is a belief about how knowledge and data should be gathered, analyzed and used. Positivism relates to the philosophical stance of the natural scientist and entails working with an observable social reality to produce law-like generalizations. Interpretivism developed as a critique of positivism, from a subjectivist perspective; it emphasizes that humans are different from physical phenomena because they create meanings (Saunders, Thornhill and Lewis, 2009). This research relies on interpretivism. Interpretive approaches encompass social theories and perspectives that embrace a view of reality as socially constructed or made meaningful through actors' understanding of events. This emphasizes the difference between conducting research among people rather than objects such as trucks and computers (Saunders, Thornhill and Lewis, 2009).

3.2 Research Approach

According to Saunders, Thornhill and Lewis (2009), research approaches are based mainly on research philosophies: the deductive approach is commonly used by researchers with traditional natural-scientific views (positivism), while the inductive approach is usually based on phenomenology (interpretivism). With deduction, a theory and hypothesis are developed and a research strategy is designed to test the hypothesis. With induction, data are collected and a theory is developed as a result of the data analysis. This research is based on the inductive approach so as to get a better understanding of the nature of the problem. Data will be collected through interviews, and the result of the analysis will be the formulation of a theory. Research using an inductive approach is likely to be particularly concerned with the context in which events take place.
The study of a small sample of subjects might therefore be more appropriate than the large numbers used with the deductive approach (Saunders, Thornhill and Lewis, 2009).

3.3 Research Strategy

According to Saunders, Thornhill and Lewis (2009), the choice of research strategy is guided by the research question and objectives, the extent of existing knowledge, the amount of time and other resources available, and the researcher's own philosophical underpinnings. Saunders identifies seven research strategies, namely: experiment, survey, case study, action research, grounded theory, ethnography and archival studies (Saunders, Thornhill and Lewis, 2009).

3.4 Research Choices

According to Vibha, Bijayini and Sanjay (2020), qualitative research focuses on understanding a research query through a humanistic or idealistic approach, whereas the quantitative approach is regarded as more reliable because it is based on numeric methods that can be applied objectively and replicated by other researchers. The qualitative method is used to understand people's beliefs, experiences, attitudes, behavior, and interactions. Thus, given the nature of this research, the researcher will adopt a qualitative approach, conducting in-depth interviews with a small sample.

3.5 Data

According to Hox and Boeijie (2020), primary data are data collected for a specific research problem, using the procedures that fit the problem best. Every time primary data are collected, new data are added to the existing store of social knowledge. On the other hand, when material created by other researchers is made available for reuse by the general research community, it is called secondary data. For this research, primary data will be collected through in-depth interviews.

3.6 Instrumentation

In this qualitative research, the researcher will conduct face-to-face in-depth interviews with participants.
Interviews are useful for exploring experiences, views, opinions, or beliefs on specific matters, and for developing an understanding of the underlying structures of beliefs. The interviews will use semi-structured, generally open-ended questions that are few in number and intended to elicit views and opinions from the participants. Around 4 to 5 questions will be prepared prior to each interview, followed by sub-questions asked as appropriate to probe for more detail and obtain more useful information.

3.7 Sources of Data Collection

According to DiCicco-Bloom and Crabtree (2020), interviews are among the most familiar strategies for collecting qualitative data. The different qualitative interviewing strategies in common use emerged from diverse disciplinary perspectives, resulting in wide variation among interviewing approaches. In this research, unlike the highly structured survey interviews and questionnaires used in epidemiology and most health services research, the researcher uses less structured interview strategies in which the person interviewed is more a participant in meaning-making than a conduit from which information is retrieved. Semi-structured interviews are often the sole data source for a qualitative research project and are usually scheduled in advance at a designated time and location outside of everyday events. They are generally organized around a set of predetermined open-ended questions, with other questions emerging from the dialogue between interviewer and interviewees. The researcher will be able to maintain the flow of the interviews by adjusting the pace to suit both interviewer and interviewees. Interviewees will thus have greater freedom to express and discuss their ideas and thoughts compared with other research methods such as questionnaires.
This research method will allow the researcher to interact directly with interviewees, develop deeper personal relationships and obtain a more in-depth understanding.

3.8 Sampling

For this research, purposeful sampling will be used. This technique is widely used in qualitative research for the identification and selection of information-rich cases for the most effective use of limited resources (Patton, 2002). It involves identifying and selecting individuals or groups of individuals who are especially knowledgeable about or experienced with a phenomenon of interest (Cresswell & Plano Clark, 2011). In addition to knowledge and experience, availability, willingness to participate, and the ability to communicate experiences and opinions in an articulate, expressive, and reflective manner are also important (Bernard, 2002). There are no specific rules for determining an appropriate sample size in qualitative research. Research suggests that sample size is influenced by many considerations, among them time and cost (Bryman, 2008); it may best be determined by the time allotted, resources available, and study objectives (Patton, 1990). For phenomenological studies, Creswell (1998) recommends 5 to 25 and Morse (1994) suggests at least six. Qualitative methods place primary emphasis on saturation, i.e., obtaining a comprehensive understanding by continuing to sample until no new substantive information is acquired (Miles & Huberman, 1994). This study will therefore comprise between 5 and 20 interviews, continuing until the saturation point is reached. Invitations will be sent to qualified respondents. The objective is to gather 'deep' information and perceptions through interviews and to represent them from the perspective of the research participants.

3.9 Data Analysis

Qualitative research studies involve a continuous interplay between data collection and data analysis.
This research study follows Creswell's (2016) six steps in the data analysis process, which treat qualitative data analysis as moving from the specific to the general and as involving multiple levels of analysis. Step 1 is to organize and prepare the data for analysis. This involves transcribing interviews, typing up field notes, or sorting and arranging the data into different types depending on the sources of information. Step 2 is to read through all the data to obtain a general sense of the information and to reflect on its overall meaning. Step 3 is to begin detailed analysis with a coding process. Coding is the process of organizing the material into chunks or segments of text before bringing meaning to the information. A qualitative codebook will be developed: a record containing a list of predetermined codes that the researcher uses for coding the data. Coding will be done manually by hand. Step 4 is to use the coding process to generate a description of the setting or people, as well as categories or themes for analysis. Step 5 is to decide how the description and themes will be represented in the qualitative narrative; the most popular approach is a narrative passage that conveys the findings of the analysis. Step 6, the final step, involves making an interpretation of the meaning of the data (Creswell, J., 2016).

3.10 Ethical Consideration

Ethical considerations relate to all phases of the research process. Research ethics refers to the appropriateness of the researcher's behavior in relation to the rights of those who become the subject of the work or are affected by it; it also relates to ensuring that no harm comes to the researcher and other researchers (Saunders, Thornhill and Lewis, 2009). As stated by Saunders, Thornhill and Lewis (2009), potential ethical issues should be recognized and considered from the outset of the research and are one of the criteria against which the research is judged.
As researchers anticipate data collection, they need to respect the participants and the research sites. Many ethical issues will arise during this stage of the research. In this proposal, the researcher develops an informed consent form for participants to sign before they engage in the research. This form acknowledges that participants' rights will be protected during data collection. A further confidentiality issue is that some participants may not want their identity to remain confidential. In the interpretation of data, researchers need to provide an accurate account of the information. Ensuring this accuracy may require debriefing between the researcher and the participants. In order to protect the personal information of the interviewees, they will remain anonymous throughout the whole research, from data collection to data analysis, and their identities will be hidden in the transcriptions.

References

Investopedia, 2020. What Everyone Should Know About Insurance. [online] Available at: <https://www.investopedia.com/terms/i/insurance.asp> [Accessed 2 August 2020].

TransUnion, 2020. What Is Insurance Claims Fraud. [online] Available at: <https://www.transunion.com/blog/what-is-insurance-claims-fraud> [Accessed 2 August 2020].

Dhieb, N., Ghazzai, H., Besbes, H. and Massoud, Y., 2020. A Secure AI-Driven Architecture for Automated Insurance Systems: Fraud Detection and Risk Measurement. IEEE Access, 8, pp.58546-58558.

Content.naic.org, 2020. Artificial Intelligence. [online] Available at: <https://content.naic.org/cipr_topics/topic_artificial_intelligence.htm> [Accessed 3 August 2020].

The Edge Markets, 2020.
Insurance: Combatting Fraud With Big Data. [online] Available at: <https://www.theedgemarkets.com/article/insurance-combatting-fraud-big-data> [Accessed 5 August 2020].

Business Wire, 2020. Insurance Fraud Detection Study, 2019: Global Market Size & Share Analysis, 2017-2027 - ResearchAndMarkets.com. [online] Businesswire.com. Available at: <https://www.businesswire.com/news/home/20191120005771/en/Insurance-Fraud-Detection-Study-2019-Global-Market> [Accessed 2 August 2020].

Mejia, N., 2020. Artificial Intelligence-Based Fraud Detection in Insurance. [online] Emerj. Available at: <https://emerj.com/ai-sector-overviews/artificial-intelligence-fraud-detection-insurance/> [Accessed 3 August 2020].

McKinsey, 2020. AI Adoption Advances, but Foundational Barriers Remain. [online] Available at: <https://www.mckinsey.com/featured-insights/artificial-intelligence/ai-adoption-advances-but-foundational-barriers-remain> [Accessed 1 September 2020].

Bharadwaj, R., 2020. Enterprise Adoption of AI in the Insurance Sector - An Overview. [online] Emerj. Available at: <https://emerj.com/partner-content/enterprise-adoption-of-ai-in-the-insurance-sector-an-overview/> [Accessed 1 September 2020].

Olano, G., 2020. AI to Reorganise Asia's Insurance Landscape. [online] Insurancebusinessmag.com. Available at: <https://www.insurancebusinessmag.com/asia/news/breaking-news/ai-to-reorganise-asias-insurance-landscape-170431.aspx> [Accessed 1 September 2020].

Iii.org, 2020. Facts + Statistics: Industry Overview. [online] Available at: <https://www.iii.org/fact-statistic/facts-statistics-industry-overview> [Accessed 9 September 2020].

Bnm.gov.my, 2020. List of Licensed Financial Institutions in Malaysia | Bank Negara Malaysia | Central Bank of Malaysia.
[online] Available at: <https://www.bnm.gov.my/index.php?ch=li&cat=insurance&type=L&lang=en> [Accessed 9 September 2020].

Investopedia, 2020. Machine Learning. [online] Available at: <https://www.investopedia.com/terms/m/machine-learning.asp> [Accessed 13 September 2020].

WhatIs.com, 2020. What Is Machine Learning Operations (MLOps)? [online] Available at: <https://whatis.techtarget.com/definition/machine-learning-operations-MLOps> [Accessed 13 September 2020].

Brownlee, J., 2020. Supervised and Unsupervised Machine Learning Algorithms. [online] Machine Learning Mastery. Available at: <https://machinelearningmastery.com/supervised-and-unsupervised-machine-learning-algorithms/> [Accessed 13 September 2020].

Investopedia, 2020. Data Mining: How Companies Use Data to Find Useful Patterns and Trends. [online] Available at: <https://www.investopedia.com/terms/d/datamining.asp> [Accessed 13 September 2020].

Abdallah, A. and Zainal, A., 2016. Fraud detection system. Journal of Network and Computer Applications, 68, pp.90-113.

The Geneva Papers on Risk and Insurance, 2004. Insurance Fraud: Issues and Challenges. 29, pp.313-333.

Almeida, P., Jorge, M., Cortesão, L., Martins, F., Vieira, M. and Gomes, P., 2009. Supporting Fraud Analysis in Mobile Telecommunications Using Case-Based Reasoning. pp.562-572.

Wang, Y., 2018. Leveraging deep learning with LDA-based text analytics to detect automobile insurance fraud. Decision Support Systems, pp.87-95.

Dhieb, N., Ghazzai, H., Besbes, H. and Massoud, Y., 2020.
A Secure AI-Driven Architecture for Automated Insurance Systems: Fraud Detection and Risk Measurement. IEEE Access, 8, pp.58546-58558.

Li, J., Huang, K.-Y., Jin, J. and Shi, J., 2008. A survey on statistical methods for health care fraud detection. Health Care Management Science, 11(3), pp.275-287.

Asherry, M., 2013. Security, Prevention and Detection of Cyber Crimes.

Belo, O. and Vieira, C., 2011. Applying User Signatures on Fraud Detection in Telecommunications Networks. pp.286-299.

Behdad, M., Barone, L., Bennamoun, M. and French, T., 2012. Nature-Inspired Techniques in the Context of Fraud Detection. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42(6), pp.1273-1290. doi: 10.1109/TSMCC.2012.2215851.

Tennyson, S. and Forn, P., 2002. Claims auditing in automobile insurance: fraud detection and deterrence objectives. Journal of Risk and Insurance, 69(3), pp.289-308.

Michalski, R.S., Bratko, I. and Kubat, M., 1998. Machine Learning and Data Mining: Methods and Applications. John Wiley & Sons.

Bolton, R. and Hand, D., 2002. Statistical Fraud Detection: A Review (With Discussion). Statistical Science, 17(3), pp.235-255.

Dal Pozzolo, A., Caelen, O., Le Borgne, Y., Waterschoot, S. and Bontempi, G., 2014. Learned lessons in credit card fraud detection from a practitioner perspective. Expert Systems with Applications, 41(10), pp.4915-4928.

Liu, X., Xu, Z. and Yu, R., 2011. Spatiotemporal variability of drought and the potential climatological driving factors in the Liao River basin. Hydrological Processes, 26(1), pp.1-14.

Zhu, S., Wang, Y. and Wu, Y., 2011. Health care fraud detection using nonnegative matrix factorization. 2011 6th International Conference on Computer Science & Education (ICCSE), Singapore, pp.499-503. doi: 10.1109/ICCSE.2011.6028688.

Zhu, X. and Goldberg, A.B., 2009. Introduction to Semi-Supervised Learning.

Artís, M., Ayuso, M. and Guillén, M., 2002.
Detection of automobile insurance fraud with discrete choice models and misclassified claims. Journal of Risk and Insurance, 69(3), pp.325-340.

Šubelj, L., Furlan, Š. and Bajec, M., 2011. An expert system for detecting automobile insurance fraud using social network analysis. Expert Systems with Applications, 38(1), pp.1039-1052.

Gama, J., Bifet, A., Pechenizkiy, M. and Bouchachia, A., 2013. A survey on concept drift adaptation. ACM Computing Surveys, 1(1).

Lu, F. and Boritz, J.E., 2005. Detecting Fraud in Health Insurance Data: Learning to Model Incomplete Benford's Law Distributions.

Saunders, M., Thornhill, A. and Lewis, P., 2009. Research Methods for Business Students.

Vibha, P., Bijayini, J. and Sanjay, K., 2020. Qualitative research. Perspectives in Clinical Research.

Hox, J. and Boeije, H., 2020. Data collection.

DiCicco-Bloom, B. and Crabtree, B.F., 2020. The qualitative research interview. Making Sense of Qualitative Research.

Creswell, J., 2016. Qualitative Inquiry and Research Design: Choosing Among Five Approaches.