Computing
Faculty of Arts, Science, and Technology

BSc (Hons) Computing, BSc (Hons) Computer Science

Level: 6
Module: COM641 Distributed Data & Data Analytics
Assignment: 2
Issue Date:
Review Date: Tutorial Sessions
Submission Dates: 25/04/2022 before 10 a.m.
Estimated Completion time: 72 Hours
Lecturer: Subrahmaniam Krishnan-Harihar
Verified by: Bindu Jose

To be completed by student:

I certify that, other than where collaboration has been explicitly permitted, this work is the result of my individual effort and that all sources for materials have been acknowledged. I also confirm that I have read and understood the codes of practice on plagiarism contained within the Glyndŵr Academic Regulations and that, by signing this printed form or typing my name on an electronically submitted version, I am agreeing to be dealt with accordingly in any case of suspected unfair practice. I also certify that my attendance for the module has been at least 70%.

Name: ..........................................................
Student Number: .........................................
Date Submitted: .........................................
Student Signature: ..........................................

Are extenuating circumstances being claimed? YES / NO
If YES, give reference number:

--------------------------------------

To be completed by lecturer

Comments:

Grade / Mark (Indicative: may change when moderated)

COM641: Assignment 2

Assignment Description

The assignment will entail reading a range of academic papers, journals and books. You will first write a report critically evaluating the adoption of data analytics and the issues it raises. You will then prepare data visualisations and explain the information they convey.

Task 1: Report on big data platforms (60%)

In Assignment 1, you designed a distributed data solution for a cruise company called Happy Cruise Lines.
In the last week, a much larger operator of cruise ships called Joyful Cruise Lines has taken over Happy Cruise Lines. A review of the IT systems of the two organisations is now being considered. Joyful Cruise Lines has nearly 100 ships and has served 2.5 million customers. These customers have made nearly 10 million voyages.

As part of the IT review, the development of a big data platform is being considered. The company would like in-depth analysis of customer and voyage data to better understand what types of cruises, fares and cities are most popular with different segments of customers (customers could be segmented by gender, age, nationality, etc.). Since Joyful Cruise Lines wants to use the data held by the two organisations for detailed customer analytics, the data may be used to personalise, aggregate, and measure customer preferences and experience. This may necessitate converting data into different formats, and applying various analysis techniques, tools and models.

Your task is to conduct research into big data platforms and produce a report which evaluates the benefits, costs, complexity, security and other relevant aspects the company should be aware of in adopting and implementing a big data platform. If your view is that the company need not implement a big data platform, you should come up with alternative recommendations.

In light of the above customer and voyage data analytics requirement, your report should critically evaluate the legal, social and ethical implications of collecting, analysing, and using personal data to create knowledge that supports business decision making using the chosen big data / data analytics solution.

You should support your report with relevant examples and justify your findings and recommendations with appropriate and relevant references. References can be drawn from books, journals and other quality information resources. The IEEE Referencing style must be used.
(1500 words +/- 10%) 60%

Task 2: Data Analysis & Visualisation (40%)

For this task, you should use the economic data that has been provided to you. In Moodle, you can find a spreadsheet containing this data.

The spreadsheet provided has two datasets: one is on the output value of different business sectors in different regions of the UK. The other is on employment in those sectors for the four nations that make up the UK. Your tasks are:

2.1 Data preparation and calculations: In any data analysis / business analytics activity, analysts must usually go through a data preparation stage. Often, raw data is generated in formats that are not immediately amenable to visualisation or comparison. You must consider the two datasets and identify how you can match them for comparative analysis. Some of the tasks require you to perform additional calculations, format the data differently and/or build a new dataset. If that is necessary, you should do so and submit the revised data and calculations. Calculations can be done using Excel or equivalent.
Marks: 10%

2.2 Visualisation of data: Using Oracle Analytics Cloud, you have to produce visualisations for the following:
2.2.1 GVA by sector for the four nations
2.2.2 Employment by sector for the four nations
2.2.3 Comparative visualisation showing GVA growth and employment growth by sector for each of the four nations
2.2.4 Choose any five sectors and plot the annual GVA growth in percentage terms for the chosen sectors
2.2.5 For the same sectors used in 2.2.4, plot the annual percentage growth in employment.
Marks for 2.2.1 to 2.2.5: 15%
2.2.6 You must consider what type of graph is suitable for each of the above visualisations. As part of the commentary in 2.3, briefly mention why the graph chosen is the appropriate one for easily understanding the data. Each visualisation must have relevant titles, labels and description.
Marks for 2.2.6: 5%
Total for task 2.2: 20%

2.3 Data interpretation and commentary: It is not enough to create graphs and visualisations.
Data interpretation and reporting is critical to decision making. You must, therefore, separately provide brief comments about what information is conveyed by each visualisation. The comments must be no more than 75 words per visualisation.
Marks: 10%

The visualisations must be downloaded from Oracle Analytics Cloud and submitted.

Total for task 2: 40%

Guidance

Students will get assistance to complete the tasks through the tutorial sessions. Drafts will be reviewed and formative feedback will be given in the tutorial sessions. You are therefore much less likely to obtain a good grade if you don’t attend the tutorial sessions.

All submitted work is expected to observe academic standards in terms of referencing (IEEE), academic writing, use of language etc. Failure to adhere to these instructions may result in your work being awarded a lower grade than it would otherwise deserve.

Failure to complete any of the above tasks or sub-tasks in time will result in a loss of marks.

Submission

Your final submission must be a zip file that contains the report, data formatting / calculations and visualisations. The report writing tasks should be word-processed. The visualisations must be downloaded from Oracle Analytics Cloud and submitted along with the accompanying spreadsheet.

This specification document should be filed at the front of the assignment, with the front sheet (with your Name, Student Number, Date and Signature) visible at the front.

Submissions must be made as one zip file including all of the above via the Turnitin link provided through the VLE (Moodle). The Glyndŵr policy on assignment submission will be rigidly adhered to (see your Student Handbook).

Task 1: Report, word processed.
Task 2: Excel (or equivalent) for data. PowerPoint (or equivalent) or PDF for visualisations.

Learning Outcomes

1. Critically assess some of the more advanced developments in database technology, e.g. Stored Procedures and Functions.
2. Evaluate the current issues associated with theory to practical implementations in database research.
3. Critically evaluate the adoption/use of data analytics and business intelligence practices for achieving organisational benefits.

Key skills for employability:

1. Written, oral and media communication skills
2. Leadership, team working and networking skills
3. Opportunity, creativity and problem-solving skills
4. Information technology skills and digital literacy
5. Information management skills
6. Research skills
7. Intercultural and sustainability skills
8. Career management skills
9. Learning to learn (managing personal and professional development, self-management)
10. Numeracy

Assessment Criteria

Grade A
80+: Work which fulfils all the criteria of the A grade but at a quite exceptional standard.
73-79: Work of distinguished quality which is based on very extensive reading and which demonstrates an authoritative grasp of the concepts, methodology, and content appropriate to the subject and to the assessment task. There is clear evidence of originality and insight, and an ability to sustain an argument, to think analytically and/or critically, and to synthesise material effectively.
70-72: Work of distinguished quality, which displays most, but not all, of the A grade attributes.

Grade B+
68-69: Work which clearly demonstrates all the qualities of a B+ grade but which reveals greater insight and more originality.
63-67: Work which demonstrates a sound and above average level of understanding of the concepts, methodology and content appropriate to the subject and which draws on a wide range of properly referenced sources. There is clear evidence of critical judgment in selecting, ordering and analysing content. Demonstrates some ability to synthesise material and to construct responses which reveal insight and may offer some occasional originality.
60-62: Work which contains most of the qualities of a B+ grade but where the critical judgment is less developed and there is less insight and originality.

Grade B
58-59: Work with the qualities of a B grade but which contains a greater degree of critical analysis and original insight.
53-57: Work derived from a solid basis of reading and which demonstrates a grasp of relevant material and key concepts, and an ability to structure and organise arguments. The performance may be rather routine but the work will be accurate, clearly written and include some critical analysis and a modest degree of original insight. There will be no serious omissions or irrelevancies.
50-52: Work which demonstrates many of the qualities of a B grade, but which contains less critical analysis and little or no original insight.

Grade C
48-49: Competent and suitably organised work which demonstrates a reasonable level of understanding, but which lacks sufficient analysis and interpretation to warrant a B grade. It will display some of the weaknesses of a C grade.
40-47: Work which covers the basic subject matter adequately and which is appropriately organised and presented but which is too descriptive and insufficiently analytical. There may be some misunderstanding of certain key concepts and limitations in the ability to select relevant material so that the work may be flawed by some omissions and irrelevancies. There will be some evidence of appropriate reading, but it may be too narrowly focused.

Grade R
0-39: There is sufficient information presented to indicate that the student has general familiarity with the subject area. Such answers contain very little appropriate or accurate material, cursory coverage of the basic material, with numerous errors, omissions or irrelevancies, loose structure, poor or non-existent development of arguments.
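
Illustration for Task 2.1 and Tasks 2.2.4 / 2.2.5

The annual percentage growth asked for in Tasks 2.2.4 and 2.2.5 is normally calculated as (this year's value - last year's value) / last year's value x 100. The brief allows Excel or an equivalent tool for this. The Python sketch below is only one possible way to do it. The function name annual_growth_pct and the sample figures are hypothetical and are not taken from the Moodle spreadsheet. The sketch also assumes that the years in the data are consecutive.

```python
def annual_growth_pct(values_by_year):
    """Return {year: percentage change from the previous year in the data}.

    Assumes the years supplied are consecutive; the first year has no
    previous value, so it does not appear in the result.
    """
    years = sorted(values_by_year)
    growth = {}
    for prev_year, year in zip(years, years[1:]):
        base = values_by_year[prev_year]
        if base == 0:
            # Growth from a zero base is undefined, so flag it instead.
            growth[year] = None
        else:
            growth[year] = (values_by_year[year] - base) / base * 100
    return growth


# Hypothetical figures for one sector, for illustration only.
sample_gva = {2018: 200.0, 2019: 210.0, 2020: 189.0}

growth = annual_growth_pct(sample_gva)
for year, pct in growth.items():
    print(year, round(pct, 1))
```

With these illustrative figures, the change from 200 to 210 is growth of 5%, and the change from 210 to 189 is a fall of 10%. The same calculation applies to the employment figures, provided both datasets have first been matched on sector, nation and year.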