MSE PROJECT TITLES
Semester II, Session 2015/2016

Lecturer: Associate Professor Dr. Ow Siew Hock
Email: show@um.edu.my

1. Title of project: Share Market Price Prediction Model based on Company News Analyses

Objectives of project:
1. To analyse, using data mining techniques, the trend of the share market prices of companies listed on BURSA Malaysia after dividend news or other share-related news has been released by companies from a specific sector.
2. To establish different share market price prediction models for different sectors based on the findings of the share market price analyses from No. 1 above.
3. To develop a share market price prediction system that incorporates the share market price prediction models (from No. 2 above).
4. To evaluate the accuracy of the share market price prediction models using the share market prices of the companies listed on the BURSA Malaysia website.

Brief description: This research aims to formulate different share market price prediction models (a model can be a mathematical formula) by analysing the trends of share market prices after dividend news or any other share-related news has been released by companies from different sectors. Students must choose companies listed on the main market of BURSA Malaysia from one of the sectors, such as finance, properties, hotels, or trading/services. Based on the trends (findings) in the past 10-20 years of share market prices, students shall establish different prediction models to predict the increase or decrease in a company’s share market price after the company releases its next dividend news or any other share-related news, such as a share offer for sale or a rights issue. Students shall develop a mobile application that incorporates the different prediction models.
The system shall record details about the companies (company name and stock code), the profits earned by the companies in the financial reporting period, and details about the dividend (dividend percentage, government tax (the percentage of tax imposed, if the dividend is not tax exempt), dividend payment date, etc.). The system shall show the predicted trends in tabular and graphical formats. Both formats shall show the predicted percentage increase or decrease in the share market prices from the day after the dividend news or share-related news is announced until the dividend payment date or the related event date (e.g. the rights issue date). The accuracy of the different prediction models shall be evaluated using data from the BURSA Malaysia website or related sources (i.e. past as well as current-year share market prices, dividend data, etc.).

Expected outputs:
1. Different share market price prediction models to predict and show the trends of the share market prices of companies from different sectors for the period from the release of the next company dividend news until the dividend payment date (period DAD-DPD), or from the release of any other share-related news until the related event date (period SRAD-RED).
2. A share market price prediction system (mobile application) to illustrate and predict the trends of the share market prices of companies from different sectors for the period DAD-DPD or SRAD-RED.

Note: DAD-DPD: Dividend Announcement Date-Dividend Payment Date; SRAD-RED: Share-Related Announcement Date-Related-Event Date

Tools/Programming languages to be used: Mobile development tools, graphical tools, and any other related tools and technologies. Students are required to discuss with the project supervisor and all other project team members working on this research to decide on the programming languages, tools and technologies to be used.
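The percentage trend over the DAD-DPD (or SRAD-RED) period described for this project can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the function name, the dictionary layout of the price data, the sample prices, and the choice of the last closing price on or before the announcement date as the baseline are all hypothetical and are not specified in the project description.

```python
from datetime import date


def percent_change_over_window(closes, announced, event_date):
    """Percentage change of each close after `announced`, up to `event_date`.

    closes: dict mapping a trading date to its closing price (assumed layout).
    The baseline is the last close on or before the announcement date
    (an assumption; the project text does not fix the baseline).
    """
    days = sorted(closes)
    base_days = [d for d in days if d <= announced]
    if not base_days:
        raise ValueError("no closing price on or before the announcement date")
    base = closes[base_days[-1]]
    # Keep only trading days from the day after the announcement to the event.
    return [
        (d, round((closes[d] - base) / base * 100, 2))
        for d in days
        if announced < d <= event_date
    ]


# Made-up sample prices, for illustration only.
closes = {
    date(2016, 3, 1): 10.00,  # announcement date (DAD)
    date(2016, 3, 2): 10.20,
    date(2016, 3, 3): 10.50,
    date(2016, 3, 4): 9.90,   # payment date (DPD)
}
trend = percent_change_over_window(closes, date(2016, 3, 1), date(2016, 3, 4))
for day, pct in trend:
    print(day.isoformat(), f"{pct:+.2f}%")
```

The resulting list of (date, percentage) pairs is the kind of data that could feed both the tabular and the graphical views; a real prediction model would produce forecast values in the same shape.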
Number of Students: 2 (Each student shall choose companies from a different sector, and the students shall establish the prediction models together, as companies from different sectors might have different factors that affect their share market prices after a dividend announcement or any other share-related news.)

MSE PROJECT TITLES
Semester II, Session 2015/16

Lecturer: Dr. Mumtaz Begum Mustafa
Email: mumtaz@um.edu.my

1. Title of project: Personalized Academic Planner Using Machine Learning Technique Incorporating Relevant Traits and Performance Measures

Objective(s) of project: To assist university students in maximizing their CGPA by providing directed and personalized information about their course outcomes, based on certain traits and the historical performance of the student.

Brief description: In today’s competitive global environment, especially with the establishment of the Trans-Pacific Partnership (TPP) and the ASEAN Economic Community (AEC), graduates are expected to have the knowledge and skills required by their respective industries, in line with increasing competition both locally and globally. However, not all students enrolled in higher learning institutions can meet the challenge of becoming high achievers without proper assistance. An academic planner can be an effective tool for providing such assistance. Machine learning algorithms can be used to predict the performance of a student based on his or her traits and past performance. Based on the prediction, students will have ample time to redirect their efforts towards an improved CGPA. Machine learning algorithms rely on the right inputs of traits and performance measures. Identifying and using relevant traits and performance measures is therefore essential to this process, so as to make the most of the student’s effort and resources.
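As an illustration of predicting a course outcome from traits and past performance, the following minimal sketch uses a k-nearest-neighbours vote. This is one possible technique chosen for brevity, not the technique the project is expected to propose. The three features (attendance rate, previous GPA, and study effort, each scaled to 0-1), the sample records, and the function name are hypothetical.

```python
import math
from collections import Counter


def predict_grade(history, student, k=3):
    """Predict a grade by majority vote among the k most similar past students.

    history: list of (features, grade) pairs; features are tuples of numbers
    already scaled to a common 0-1 range (an assumption of this sketch).
    student: feature tuple for the student whose grade is being predicted.
    """
    # Sort past records by Euclidean distance to the new student.
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], student))[:k]
    votes = Counter(grade for _, grade in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical records: (attendance rate, previous GPA / 4, study effort).
history = [
    ((0.95, 0.90, 0.80), "A"),
    ((0.90, 0.85, 0.70), "A"),
    ((0.70, 0.60, 0.50), "B"),
    ((0.65, 0.55, 0.40), "B"),
    ((0.40, 0.30, 0.20), "C"),
    ((0.35, 0.35, 0.10), "C"),
]

print(predict_grade(history, (0.88, 0.80, 0.75)))
```

The choice of features is exactly the open question the project raises: the quality of any such prediction depends on identifying which traits and performance measures are relevant.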
Expected Output: A proposed machine learning technique for predicting the academic result of a particular course taken by a student, based on the identified relevant traits, and a prototype produced as a proof of concept for the technique.

Tools/Programming languages to be used: Java or a similar programming environment, or the .NET environment; machine learning toolkits such as WEKA and MATLAB.

2. Title of project: An Intelligent Remote Cloud-based Simulation System to Manage and Control Power Distribution

Objectives of project:
1. To analyse the problems encountered in power (electricity) distribution.
2. To establish a cost-effective power distribution algorithm to resolve the problem(s) identified in No. 1 above.
3. To develop an intelligent Remote Control Monitoring System (RCMS) that incorporates the cost-effective power distribution algorithm (from No. 2 above).
4. To simulate the cost-effectiveness of the power distribution algorithm and the RCMS in an office environment.

Brief description: In daily life, electricity is often wasted without anyone noticing. This research aims to formulate a cost-effective power distribution algorithm (a mathematical formula) and develop an intelligent Remote Control Monitoring System (RCMS) to eliminate wastage of power, for example at home or in an office. The student must first conduct interviews with the manager and/or electricians in charge of power distribution at Tenaga Nasional Berhad (TNB) or another organisation (such as the University of Malaya) to understand the power distribution problems encountered, trends in power distribution, and pertinent issues. Based on the findings of the interviews, the student shall establish a cost-effective algorithm to monitor and control power distribution so as to eliminate wastage of electricity. An intelligent Remote Control Monitoring System (RCMS) that incorporates this algorithm shall be developed using cloud computing technology.
The system shall be used to simulate power distribution in an office to determine whether the established algorithm is able to manage and monitor power distribution efficiently. Cost-effectiveness shall be demonstrated by comparing the amount of simulated power distributed (or simulated power consumption) before and after the algorithm and the RCMS are applied in the simulation.

Expected outputs:
1. A cost-effective power distribution algorithm to eliminate wastage of power.
2. An intelligent Remote Control Monitoring System (RCMS) to manage and monitor power distribution.
3. A simulation board (hardware) controlled by the RCMS.

Tools/Programming languages to be used: C#, F#, MSSQL2014, Windows Azure, Visual Studio 2015, Windows 10, Windows Communication Foundation (WCF), Windows Presentation Foundation (WPF), Windows Server 2012, MS Office 2013, cloud computing technology, and related hardware (wire, protocol, etc.).

Number of Students: 1

Co-supervisor: A lecturer from the Faculty of Engineering will be appointed to supervise the student in the aspects of the research that relate to electrical engineering.

Semester II, Session 2015/2016

Lecturer: ASMIZA ABDUL SANI
Email: asmiza@um.edu.my

1. Title of project: Graphical access control specification language with integrated formal analysis capability for web application.

Objective(s) of project:
1) To identify access control concepts and patterns for supporting automated generation of formal specifications.
2) To produce a graphical notation for specifying access control for web applications.
3) To formally verify that the access control specification enforces the defined policy.

Brief description: Most web systems developed for an organization have some kind of access control implemented for the purpose of security and/or personalized sessions. Especially in a distributed system environment, unauthorised access is a problem that can create unwanted situations in which data may be corrupted or stolen.
Access control for a system is governed by its policy. Therefore, it is important to ensure that the policy is effectively enforced in the system specification. Particularly for sensitive data, access control is commonly defined using logical approaches. This requires software engineers with formal methods skills. Often, these formal specifications are not executable and are tested only after a version of the system has been implemented. The purpose of this project is to propose a practical way of specifying access control using a graphical language that is amenable to automated formal analysis. The framework aims to exploit the benefits of formalism, without the complexity of applying formal methods, to ensure that the specification includes the security required by the access control policy.

Expected Output: A framework for specifying access control that contains a graphical language and an integrated formal analysis tool.

Tools/Programming languages to be used: (Not limited to) Java and Alloy.

Number of students: 1

2. Title of project: Reverse engineering to identify features for web application system

Objective(s) of project:
1) To identify techniques for reverse engineering the data of a web application system into software engineering artefacts.

Brief description: The development of web applications often focuses on quick delivery, using development approaches such as agile programming, without much consideration for formal planning and design. As a result, any future transition to more structured, plan-driven methods for the evolution and maintenance of the system would be difficult in the absence of software engineering artefacts. Models and specifications are essential documents for any plan-driven development. The purpose of this research is to reverse engineer an existing web application system (e.g. code, user activity logs) into software engineering artefacts so that software features can be identified for a new software development project.
Expected Output: A method for reverse engineering web applications to identify software features.

Tools/Programming languages to be used: Any

Number of students: 1

MSE PROJECT TITLES
Semester 2, Session 2015/2016

Lecturer: Prof. Dr. Lee Sai Peck
Email: saipeck@um.edu.my

1. Title of project: A Graph Theory Driven Approach to Analyze UML Class Diagrams Relationships using XMI

Objective(s) of project:
To analyze the complexity of classes in a UML class model using XMI and graph theory.
To identify the classes with high impact, i.e. classes that are heavily reused, as well as utility classes, interface classes, and design patterns.
To measure the complexity and impact of classes in the UML model using graph theory metrics.

Brief description: This research focuses on software design, specifically on UML class diagrams. It aims to analyze the complexity of UML classes and subsequently identify the classes with high impact, i.e. classes that are heavily reused, utility classes, interface classes, and design patterns. The behavior of software systems can be analyzed and understood by modeling them as graphs (in the graph theory sense), where software components are represented as nodes and the inter-relationships among them are represented as edges. Graphs have been successfully applied in various domains, such as the World Wide Web, social networks, power grids, and scholarly citation networks, to provide a high-level abstract view. One way to analyze a UML class diagram is to use its XMI (XML Metadata Interchange) representation. XMI is used to exchange metadata information of UML models using an XML (Extensible Markup Language) representation. All major CASE (Computer Aided Software Engineering) tools, such as ArgoUML, Rational Rose, Enterprise Architect, MS Visio, Altova, and Visual Paradigm, can export and import class diagrams in XMI format.
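To make the graph view of a class diagram concrete, the sketch below treats classes as nodes and relationships as directed edges, then computes two simple degree-based metrics. It assumes the relationships have already been extracted from an XMI export into (source, target) pairs; the extraction step is omitted because XMI layouts differ between tools and versions. The class names, the function name, and the use of fan-in as a rough indicator of reuse are illustrative assumptions, not metrics prescribed by the project.

```python
from collections import defaultdict


def degree_metrics(edges):
    """Compute fan-in and fan-out for each class in a directed dependency graph.

    edges: iterable of (source, target) pairs meaning "source depends on target".
    """
    fan_in = defaultdict(int)
    fan_out = defaultdict(int)
    nodes = set()
    for source, target in edges:
        nodes.update((source, target))
        fan_out[source] += 1  # source uses one more class
        fan_in[target] += 1   # target is used by one more class
    return {n: {"fan_in": fan_in[n], "fan_out": fan_out[n]} for n in sorted(nodes)}


# Hypothetical relationships, as if extracted from a class diagram.
edges = [
    ("Order", "Customer"),
    ("Order", "Logger"),
    ("Invoice", "Order"),
    ("Invoice", "Logger"),
    ("Customer", "Logger"),
]

metrics = degree_metrics(edges)
# The class with the highest fan-in is a candidate heavily reused class.
most_reused = max(metrics, key=lambda n: metrics[n]["fan_in"])
print(most_reused, metrics[most_reused])
```

A class with high fan-in and zero fan-out, such as the hypothetical Logger here, has the profile one might expect of a utility class; richer graph metrics could be computed over the same edge list.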
With the support of XMI, measuring the complexity and impact of classes using graph theory metrics becomes easier.

Expected Outcome: An approach to analyze the complexity of a UML class model using XMI and graph theory metrics.

Tools/Programming languages to be used: Suitable development tools

2. Title of project: Automated Fact Sharing and Coordination Support for Change Management

Objective(s) of project:
To understand and analyze how facts, knowledge or issues are currently shared, managed and changed by teams within an organization.
To propose a change management framework with a conceptual model as a basis to support various mechanisms/techniques for effective sharing of facts, and for coordination and moderation of activities and facts, in support of change management.
To develop a software prototype as a proof of concept.

Brief description: This project aims to provide automated underlying support for change management by providing a framework for fact sharing and moderation as well as coordination of workflows for team members. The framework will be supported by a conceptual model underpinning the various mechanisms/techniques to be proposed to help in effective sharing of facts, and in coordination and moderation of activities and facts, before any change is made by a team member. Any change is first moderated by a moderator before finally being committed/approved by the super moderator. The project will start by identifying the aspects/issues/types of facts currently being shared, managed and changed within an organization. It will then derive a set of relevant concepts that form the basis of the proposed model and the related mechanisms/techniques in the framework, so that sharing and coordination/moderation of these aspects/issues/facts can happen in an automated and coordinated manner through a software prototype to be developed.
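The two-stage approval described above (moderator first, then super moderator) can be sketched as a small state machine. This is a minimal illustration under assumptions: the state names, role names, class name, and the sample fact are hypothetical, and a real framework would also need rejection paths and the conceptual model the project is expected to propose.

```python
from enum import Enum


class State(Enum):
    PROPOSED = "proposed"
    MODERATED = "moderated"
    COMMITTED = "committed"


# For each state: the role allowed to approve, and the state that follows.
TRANSITIONS = {
    State.PROPOSED: ("moderator", State.MODERATED),
    State.MODERATED: ("super_moderator", State.COMMITTED),
}


class ChangeRequest:
    """A proposed change to a shared fact, tracked through moderation."""

    def __init__(self, fact, proposed_by):
        self.fact = fact
        self.state = State.PROPOSED
        self.history = [(proposed_by, State.PROPOSED)]

    def approve(self, user, role):
        if self.state not in TRANSITIONS:
            raise ValueError("change is already committed")
        required_role, next_state = TRANSITIONS[self.state]
        if role != required_role:
            raise PermissionError(f"{required_role} approval required")
        self.state = next_state
        self.history.append((user, next_state))


change = ChangeRequest("Release date moved to Q3", proposed_by="alice")
change.approve("bob", role="moderator")
change.approve("carol", role="super_moderator")
print(change.state.value, [user for user, _ in change.history])
```

Keeping the allowed transitions in one table makes the ordering rule explicit: a super moderator cannot commit a change that has not yet been moderated.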
Expected Outcome: A software prototype that supports the change management framework with a conceptual model and related mechanisms/techniques.

Tools/Programming languages to be used: Suitable development tools

3. Title of project: Tracing Rationale of Structuring Design Patterns to Achieve Software Design Quality

Objective(s) of project:
To investigate properties that relate software design qualities to design patterns, and the rationale traceability of software design.
To propose a method to assist tracing between the design rationale in the text description and the visual model.
To identify the design primitives that are involved in achieving design qualities.
To propose a method to determine, through traceability support, how a given pattern structure and behaviour help achieve certain design qualities.

Brief description: Using patterns in software engineering can help achieve better software quality, as patterns encapsulate experience of commonly recurring solutions to design problems. A design pattern is structured in a certain way and contains various interactions among the elements in the model. However, the structure and interactions in the model do not describe the rationale of how the pattern achieves a certain design quality. Design rationales are important in many aspects of the design phase, such as design verification, design evaluation, design maintenance, design reuse, design communication and design documentation. A software model tends to degrade over time as software maintainers modify the software without understanding the underlying rationale of the decisions made. Much previous research focuses on specifying which pattern to use to achieve a particular software design quality, such as maintainability or reusability, but does not focus on the rationales and on how the design pattern helps achieve the set of design qualities.
Hence, this research aims to investigate and provide an approach for defining and tracing the rationales of structuring design patterns to achieve software design quality.

Expected Outcome: An approach and method for tracing the rationales of structuring design patterns to achieve software design quality.

Tools/Programming languages to be used: Suitable development tools

MSE PROJECT TITLES
Semester II, Session 2015/16

Lecturer: Assoc. Prof. Dr. Rodina Ahmad
Email: rodina@um.edu.my

1. Title of project: Agent based model for competency assessment and planning

Objective(s) of project: To develop an intelligent model to assist academic institutions in evaluating, preparing and planning for the essential competencies required of academic staff.

Brief description: Many existing models provide basic competencies as guidance for organizations to plan and assess their staff competency needs. Present-day organizations require a more supportive and intelligent model to assist them in assessing, preparing and planning for individual and organizational competency development. This research needs to make use of advances in human resource development and agent development to develop an appropriate model that addresses this problem.

Expected Output: A practical model and a supporting software prototype to demonstrate its usefulness and applicability in a real case.

Tools/Programming languages to be used: Java or a similar programming environment, or the .NET environment

MSE PROJECT TITLES
Semester II, Session 2015/16

Lecturer: Dr. Raja Jamilah Raja Yusof
Email: rjry@gmail.com

1. Title of project: An approach for Software component development for Personalized Learning Environment (PLE)

Objective(s) of project:
i) To develop reusable software components for PLE
ii) To use the reusable components above to develop an end-user programming paradigm for PLE

Brief description: PLE has been identified as the future trend in education.
It is seen to involve a digital learning platform or a distinct adaptive instructional software program that helps personalize lessons and adapt content and instruction in response to real-time feedback and assessment results identifying students’ academic needs (AIR, 2013). This project involves the development of reusable software components for PLE that can be used by software developers, and a simple end-user programming (EUP) paradigm for end-user programmers (Roschelle, 1999). The focus of the project should be on identifying the cognitive ability of students (based on a few psychological learning theories and tests, such as the Revelian and Gibson tests) to help developers/users personalize any learning environment to be developed in the future.

Expected Output: Reusable components and an end-user programming paradigm

References:
AIR, American Institute of Research (2013), Are Personalized Learning Environments the Next Wave of K–12 Education Reform?, Education Issue Paper Series, August 2013. http://www.air.org/sites/default/files/AIR_Personalized_Learning_Issue_Paper_2013.pdf
Roschelle, J., DiGiano, C., Koutlis, M., Repenning, A., Phillips, J., Jackiw, N., Suthers, D., "Developing Educational Software Components", Computer, vol. 32, no. 9, pp. 50-58, September 1999, doi:10.1109/2.789751
(Revelian Test): http://www.revelian.com/example-questions-cognitive-ability/
(Gibson Test): http://www.gcstest.com/web/

2. Lecturer: Dr Raja Jamilah Raja Yusof
Email: rjry@um.edu.my

Title of project: Developing cybernetic software libraries for human cognitive alertness

Objective(s) of project:
• To identify multimedia software elements that can help humans stay alert and awake
• To develop a reusable software library for the above
• To test the effectiveness of the above software

Brief description: There are many situations in which it is desirable to keep a person awake.
Examples of such situations are driving, staying up late to study for an examination, and completing certain jobs or tasks. This project focuses on keeping a person awake while driving a car. There are a few fundamental multimedia software elements, such as colour, sound and vibration, as well as integrations of these elements, such as generating event or episodic experience flashbacks.

Expected Output: Software libraries for the development of software for human cognitive alertness

Reference:
Louv, J. (2012), How Smart Drugs and Cybernetics Could Create a Superhuman Workforce, http://motherboard.vice.com/blog/how-smart-drugs-and-cybernetics-could-create-a-superhumanworkforce.

Lecturer: Dr. Raja Jamilah Raja Yusof
Email: rjry@gmail.com

1. Title of project: An approach for designing emoticons across culture

Objective(s) of project:
i) To identify emotions and map them to existing emoticons
ii) To investigate the suitability of the emoticons across cultures
iii) To develop an application to rapidly generate emoticons that can be readily integrated into existing systems
iv) To evaluate the usability of the application

Brief description: Emoticons have been used extensively in social media and mobile text chat applications such as Facebook and WhatsApp. However, the emoticons provided to express emotions and feelings are sometimes not suitable for certain emotions, especially with respect to culture. This project examines the existing emoticons and develops an application that can rapidly generate a tailored emoticon based on culture and user preference.

Expected Output: A system for rapidly generating emoticons for use across cultures.

References:
Soto JA, Levenson RW. Emotion Recognition across Cultures: The Influence of Ethnicity on Empathic Accuracy and Physiological Linkage. Emotion (Washington, DC). 2009;9(6):874-884. doi:10.1037/a0017399.
Shipra Kayan, Susan R. Fussell, and Leslie D. Setlock. 2006.
Cultural differences in the use of instant messaging in Asia and North America. In Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work (CSCW '06). ACM, New York, NY, USA, 525-528. DOI=http://dx.doi.org/10.1145/1180875.1180956 http://sfussell.hci.cornell.edu/pubs/Manuscripts/p525-kayan.pdf
Daantje Derks, Arjan E. R. Bos, and Jasper von Grumbkow. Emoticons and Online Message Interpretation. Social Science Computer Review, Fall 2008, 26: 379-388, first published on December 10, 2007. doi:10.1177/0894439307311611

MSE PROJECT TITLE
Semester 1, Session 2015/2016

Lecturer: Dr. Su Moon Ting
Email: smting@um.edu.my

1. Title of project: An Approach for Collaborative Filtering of Software Development Knowledge to Support Development

Objective(s) of project:
1. To analyse existing collaborative filtering techniques used in recommender systems and identify technique(s) suitable for collaborative filtering of software development artefacts/information/knowledge.
2. To propose an approach for collaborative filtering of software development artefacts/information/knowledge that incorporates the identified technique(s).
3. To develop a prototype tool that supports the proposed collaborative filtering approach.

Brief description: In every phase of the software development life cycle, software personnel access numerous development artefacts/information/knowledge and evaluate their usefulness for the task at hand. The evaluations happen implicitly in the minds of the software personnel, are often forgotten as time passes, and are not transferable between team members. The knowledge of the usefulness of the artefacts/information/knowledge can be re-used if software personnel provide evaluation information (for example, ratings, comments, and tags) on the evaluated artefacts/information/knowledge.
These evaluation data, as well as access data (for example, frequency of access and time spent on artefacts), can be analysed and subsequently used to filter artefacts/information/knowledge for the members of the development team. This collaborative filtering of artefacts/information/knowledge can support the acquisition of required information/knowledge during software development. There are three main objectives of this project. Firstly, to analyse existing collaborative filtering techniques used in current recommender systems and, based on the analysis, identify suitable technique(s) for collaborative filtering of software development artefacts/information/knowledge. The latter might involve adapting the technique(s) to the domain of software development. Secondly, to develop an approach for collaborative filtering of software development artefacts/information/knowledge. The identified technique(s) will be incorporated into the approach. Thirdly, to develop a prototype tool that supports the proposed collaborative filtering approach.

Research Method: To be determined by student.

Expected output:
1. Technique(s) suitable for collaborative filtering of software development artefacts/information/knowledge.
2. An approach for collaborative filtering of software development artefacts/information/knowledge that incorporates the identified technique(s).
3. A prototype tool.

Tools/Programming languages to be used: To be determined. This might involve customising, extending or integrating suitable existing tools.

Number of Students: 1
Status: Old

2. Title of project: A Stakeholder-Driven Approach for the Specification of Software Requirements.

Objective(s) of project:
1. To review and analyse to what extent current practices/approaches/techniques for the specification of software requirements produce Software Requirements Specifications (SRSs) that support the needs of their stakeholders.
2.
To propose an approach for the specification of software requirements to produce SRSs that better meet the needs of their stakeholders.
3. To develop a prototype tool that supports the proposed approach.

Brief description: The Software Requirements Specification (SRS) is “a specification for a particular software product, program, or set of programs that performs certain functions in a specific environment” [1]. A good SRS offers numerous potential benefits [1]. These include establishing the basis for agreement between the customers and the suppliers on what the software product is to do, reducing the development effort, providing a basis for estimating costs and schedules, providing a baseline for validation and verification, and so on [1]. Nevertheless, despite the efforts of the authors of an SRS to produce a good SRS, the usages of the SRS perceived by its authors could differ from the actual usages of the SRS by its stakeholders. This project requires the student to review and analyse to what extent current practices/approaches/techniques for the specification of software requirements support the needs of the stakeholders of SRSs. The insights gained will be used to propose an approach for the specification of software requirements to produce SRSs that better meet the needs of their stakeholders. A prototype tool will be developed to support the proposed requirements specification approach.

Research Method: To be determined by student.

Expected output:
1. Results of the review and analysis of current practices/approaches/techniques for the specification of software requirements.
2. Suggestions or guidelines for requirements specifications that better meet the needs of stakeholders.
3. A stakeholder-driven software requirements specification approach.
4. A prototype tool.

Tools/Programming languages to be used: To be determined. This might involve customising, extending or integrating suitable existing tools.
Number of Students: 1
Status: Old

References
[1] IEEE, IEEE Recommended Practice for Software Requirements Specifications, in IEEE Std 830-1998. 1998. p. 1-40.

MSE PROJECT TITLES
Semester II, Session 2015/16

Lecturer: Assoc. Prof. Dr. Chiew Thiam Kian
Email: tkchiew@um.edu.my

1. Title of project: Personalized Daily Life Profiling for Patients with Chronic Diseases using Mobile Technologies

Objective(s) of project:
To build a model for categorizing the daily activities of patients with chronic diseases using mobile technologies.
To develop an integrated system that can detect and profile the daily activities of patients with chronic diseases, as well as provide personalized feedback to the users based on doctors’ recommendations.

Brief description: Patients with chronic diseases such as diabetes mellitus and hypertension are required to maintain an active lifestyle at an appropriate level. Recording and profiling their daily activities is beneficial for monitoring and assessing their health status, especially their compliance with their doctors’ recommendations on activity levels. This project aims to use mobile technologies such as wearable devices, smart phones, and sensors to detect the users’ daily activities and categorize the activities into contexts such as location and type of activity. The recorded data will be analyzed to provide feedback to the users based on the recommendations given by their doctors.

Expected Output: A model for profiling user activities based on data collected using mobile technologies such as wearable devices, smart phones, and sensors.

Tools/Programming languages to be used: Android, wearable devices, smart phone, sensors

2. Title of project: Automated User Interface Personalization for Mobile Health Application

Objective(s) of project:
To build a model for profiling users of mobile health applications based on their usage patterns of the applications.
To develop a mobile health application that incorporates the proposed user profiling model to automatically personalize the user interface based on the user profile.

Brief description: User interface design is an important issue for mobile applications. It is difficult for a standard user interface design to cater for the needs of diverse user groups. The issue is more evident for mobile health applications. This project will focus on one type of mobile health application, e.g. applications for patients with diabetes, and categorize the users into groups with different user interface requirements. A model for profiling the users based on the ways they use the application will then be developed. A mobile health application will be built with the capability of user profiling. The application will be able to customize its own user interface based on the identified user profile.

Expected Output: A model for automated personalization of the user interface based on users’ usage patterns of mobile health applications.

Tools/Programming languages to be used: Android, smart phone

MSE PROJECT TITLES
Semester II, Session 2015/16

Name of lecturer: Assoc Prof Dr Zarinah Mohd Kasirun
Email: zarinahmk@um.edu.my

Title of project: Gamification towards Engagement in Requirements Engineering (RE)

Objective(s) of project:
• To investigate gamification properties to increase stakeholders’ engagement in RE
• To propose a method to facilitate the gamification technique.
• To develop a proof of concept of the method above.

Brief description: The various methods used in the RE process have helped achieve better quality in the requirements elicited and negotiated. Hence, many high-quality software systems and applications have been developed and are gaining popularity in our daily lives. Nevertheless, when preparing a particular business strategy, for instance, we still hear people say that using pencil and paper is much faster than using a tool. Why does this happen? Conventionally, practice makes perfect.
Likewise, if people keep using a particular tool, they will develop enthusiasm for it and continue to use it. They will not abandon good tools. This research will give the student an opportunity to investigate gamification concepts in the RE process and incorporate them into an RE tool.

Research Method:
The student should conduct a literature review, apply/enhance/propose a suitable model, and validate the model (through a proof of concept).

Number of Student: 1

Expected Outcome:
An approach and/or method for gamification suitable for the RE process to increase stakeholders’ engagement.

Tools/Programming languages to be used: Suitable development tools

MSE PROJECT TITLES
Semester II, Session 2015/2016
Lecturer: Dr. Chiam Yin Kia
Email: yinkia@um.edu.my

1. Title of project 1: Method for presenting uncertainties and statistical results to non-statistician decision makers and users

Objective(s) of project:
1. To review, investigate and validate existing image processing and simulation software frameworks for myocardial infarction and cardiac modelling that are used to analyse and predict the risks of cardiac disease.
2. To analyse the uncertainties and statistical results produced by the evaluation framework proposed in Project 1.
3. To develop a method to present uncertainties and statistical results to non-statistician decision makers and users.

Brief description:
Myocardial infarction (also known as a heart attack) happens when a blood clot stops the blood flowing properly to part of the heart. This causes severe chest pain because the heart muscle is injured by the lack of oxygen. Analysing the condition of the coronary arteries that supply blood to the heart and identifying blockages as early as possible can help to prevent a myocardial infarction and allow the patients to undergo treatment.
Various simulation and image processing software systems are available to help researchers and biomedical experts construct simulation and image models of myocardial infarction and the heart, in order to analyse and predict the risks of having a myocardial infarction. The information collected from the cardiac simulation models and images provides insights for designing new treatment strategies as well as optimizing and evaluating surgical procedures or medical devices. However, in many decision problems, the decision maker might wish to consider a combination of information (both simulation and image processing data). For example, clinicians/researchers might wish to combine a mixture of data analysed by the image processing or simulation software systems to optimize the predictions. In this research, we will investigate how to present machine-learning-produced models (from a big data perspective) to decision makers. A method will be proposed to present uncertainties and statistical results to non-statistician decision makers and users. Uncertainty visualisation techniques such as traditional error bars, scaled size of glyphs, color-mapping on glyphs, and color-random uncertainty components will be studied and adapted into the proposed method.

Expected Output:
In this research, a method will be proposed to present uncertainties and statistical results to non-statistician decision makers and users.

2. Title of project 2: Evaluation framework/method that compares and selects a combination of software systems for data analytics pipeline

Objective(s) of project:
1. To review, investigate and validate existing software tools for supporting the tasks in a data analytics pipeline.
2. To identify selection criteria from desirable features and quality attributes to compare and evaluate software systems for a data analytics pipeline.
3.
To develop a decision-making evaluation framework/method to compare and select a combination of software tools to support all the stages in a data analytics pipeline.
4. To evaluate the framework/method by conducting a case study on a cardiac simulation and modelling data analytics pipeline.

Brief description:
Data analytics is a broader term that includes data analysis as a necessary subcomponent. Analytics defines the cognitive processes an analyst uses to understand problems and analyse data in meaningful ways. Specific tools, techniques, and methods are used to perform analytics and communicate results successfully. Many real-world data analytics scenarios involve an integration of multiple data analytics stages, which are often carried out using different software tools. In this research, we call the integration of multiple data analytics stages a “Data Analytics Pipeline”. A data analytics pipeline is a set of data analytics processing stages connected in series, where the stages are connected one to the next to form a pipe. The output of one stage is the input of the next stage. The stages of a data analytics pipeline are often executed in parallel or overlap in execution.

Selecting a combination of data analytics tools is complex. Problems arise when various tools are available to support each stage of a data analytics pipeline. Each tool provides a set of features and can process certain data. Some tools are able to handle more than one task. Researchers face problems in selecting a combination of software tools to be used in a series of stages throughout the data analytics process. Techniques such as decision trees, fuzzy logic, case-based reasoning, hybrid knowledge-based systems and artificial neural networks are well-known approaches in software selection. However, most of the existing studies focus on selecting one software tool at a time, and not a combination of software tools.
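To make the selection problem concrete, the following minimal Python sketch enumerates tool combinations for a four-stage pipeline and keeps only those whose hand-overs share a data format. Everything in it is a hypothetical illustration: the stage names, the tools (ToolA to ToolD), their formats and their quality scores are invented, and the exhaustive search with an averaged score is only a stand-in, not the evaluation framework this project is expected to propose.

```python
from itertools import product

# Hypothetical pipeline stages, for illustration only.
STAGES = ["acquire", "preprocess", "analyse", "visualise"]

# Hypothetical tool catalogue: supported stages, input/output formats,
# and a single made-up quality score in [0, 1].
TOOLS = {
    "ToolA": {"stages": {"acquire", "preprocess"}, "reads": {"raw"},
              "writes": {"csv"}, "quality": 0.70},
    "ToolB": {"stages": {"preprocess", "analyse"}, "reads": {"csv"},
              "writes": {"csv"}, "quality": 0.80},
    "ToolC": {"stages": {"analyse", "visualise"}, "reads": {"csv", "hdf5"},
              "writes": {"png"}, "quality": 0.90},
    "ToolD": {"stages": {"visualise"}, "reads": {"csv"},
              "writes": {"png"}, "quality": 0.95},
}


def candidates(stage):
    """Tools that support the given stage."""
    return [name for name, spec in TOOLS.items() if stage in spec["stages"]]


def compatible(combo):
    """True if every hand-over between two different tools shares a format."""
    for prev, nxt in zip(combo, combo[1:]):
        if prev != nxt and not (TOOLS[prev]["writes"] & TOOLS[nxt]["reads"]):
            return False
    return True


def score(combo):
    """Average quality of the tools assigned to the stages."""
    return sum(TOOLS[name]["quality"] for name in combo) / len(combo)


def greedy_combination():
    """Pick the best tool for each stage in isolation (one tool at a time)."""
    return tuple(max(candidates(s), key=lambda n: TOOLS[n]["quality"])
                 for s in STAGES)


def best_combination():
    """Pick the best-scoring combination whose stages can actually connect."""
    valid = [c for c in product(*(candidates(s) for s in STAGES))
             if compatible(c)]
    return max(valid, key=score) if valid else None


if __name__ == "__main__":
    print("stage-by-stage choice:", greedy_combination(),
          "compatible:", compatible(greedy_combination()))
    print("combination choice:  ", best_combination())
```

With this invented catalogue, choosing the highest-scoring tool for each stage separately yields a chain that cannot be connected, because the chosen analysis tool does not write a format that the chosen visualisation tool reads. This illustrates why the combination has to be evaluated as a whole rather than one tool at a time.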
In a data analytics pipeline, different software tools are used for acquiring data, importing data (in a supported format) into the tool, conducting pre-processing, performing the analysis, visualizing and reporting results, or automating repeatable tasks. A tradeoff between features and quality attributes needs to be analyzed to select a combination of tools that allows the development teams to work within their project needs and constraints. In this research, an evaluation framework/method will be proposed by applying decision-making methods such as decision trees, case-based reasoning and fuzzy logic to discover the hidden patterns, associations, and anomalies in the characteristics of different kinds of data analytics software systems. The discovered knowledge can be used to assist software development teams and researchers who are new to data analytics in optimizing the selection of a combination of data analytics software tools to meet specific project or domain needs.

Figure 1: A Generic Data Analytics Pipeline

Expected Output:
In this research, an evaluation framework/method will be proposed to compare and select a combination of data analytics software tools that can be used to handle a series of tasks in a data analytics pipeline based on the identified criteria or constraints.

3. Title of project 3: Process reference model/framework to develop machine learning (ML) based systems

Objective(s) of project:
1. To identify the differences between ML-based systems development and generic software development.
2. To analyse the best practices or steps used to ensure the statistical validity of the ML components in ML-based systems.
3. To develop a process reference model/framework to integrate the ML component development process into the generic software development process to ensure the statistical validity of the ML components.
Brief description:
Machine learning (ML) will enable cognitive systems to assist us in making good decisions by bringing the right recommendation to us in a more natural and personalized way. Furthermore, many ML algorithms have been applied to analyse big data quickly and automatically in recent software systems development. However, many experienced software engineers are experts in generic software development but novices in ML-based system development, and are unfamiliar with the methodology of implementing ML components. On the other hand, a novel ML method is typically proposed by an ML researcher or a team of researchers. A problem is that the code is written by researchers who may or may not be trained in the discipline of software development to ensure the statistical validity of the ML components.

There are various risks that can be identified and need to be addressed when we integrate ML-based component(s) into a software system. Sculley et al. (2014) have identified several ML-specific risk factors that need to be avoided or refactored during ML-based system development. These include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, changes in the external world, and a variety of system-level anti-patterns (Sculley et al., 2014).

The role of software engineering changes in ML-based system development. The development teams need to ensure that the requirements we expect from ML-based systems can still be met as the ML-based components keep evolving. The outputs produced by ML components should always meet the statistical validity specified in the requirements. For example, an ML-based system may deal with big business/personal data, which is different from dealing with scientific data.
Unchanging scientific laws underlie scientific data, whereas the laws learned from big business/personal data are more fluid. In addition, the predictive results produced may "change" the end users and the underlying learned rules over time.

This research aims to address the following research questions:
What are the software development practices that could make a big difference when experimenting with and testing ML algorithms?
What are the good ML development practices that are applied by ML researchers to ensure the statistical validity of ML components?
How can the practices or steps used in ML component development be integrated into generic software development processes?

This research aims to propose a conceptual process reference model/framework for ML-based systems that supports software engineers, ML researchers, statisticians and/or data analysts in developing and maintaining ML-based systems to ensure the statistical validity of the ML components. The model/framework should consist of a holistic life cycle model for ML-based systems, product quality criteria and metrics for ML components, best practices for different stakeholders, and recommendations for action, as well as tools that support stakeholders in developing and maintaining ML-based systems.

Expected Output:
In this research, a conceptual process reference model/framework will be proposed for supporting ML-based systems development to manage the ML component development and ensure the statistical validity of the ML components.

Reference:
Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., and Young, M. (2014). Machine learning: The high interest credit card of technical debt. In SE4ML: Software Engineering for Machine Learning (NIPS 2014 Workshop).