M.S. RAMAIAH INSTITUTE OF TECHNOLOGY
BANGALORE
(Autonomous Institute, Affiliated to VTU)

SYLLABUS
(For the Academic year 2014 – 2015)

Dept of Information Science and Engineering
M.Tech (Software Engineering)
III and IV Semester

History of the Institute:
M. S. Ramaiah Institute of Technology was started in 1962 by the late Dr. M.S. Ramaiah, our Founder Chairman, who was a renowned visionary, philanthropist, and a pioneer in creating several landmark infrastructure projects in India. Noticing the shortage of talented engineering professionals required to build a modern India, Dr. M.S. Ramaiah envisioned MSRIT as an institute of excellence imparting quality and affordable education. Part of Gokula Education Foundation, MSRIT has grown over the years with significant contributions from various professionals in different capacities, ably led by Dr. M.S. Ramaiah himself, whose personal commitment has seen the institution through its formative years. Today, MSRIT stands tall as one of India’s finest names in Engineering Education and has produced around 35,000 engineering professionals who occupy responsible positions across the globe.

About the Department:
The Department of Information Science and Engineering (ISE) was established in the year 1992 with the objective of producing high-quality professionals to meet the demands of the emerging field of Information Science and Engineering. The department started with a UG programme with an annual sanctioned intake of 30 students; the intake was enhanced to 60 seats in the year 1991, to 90 seats in the year 2001 and then to 120 seats in the year 2008. The department started the M.Tech in Software Engineering in the year 2004. The department has been recognized as an R&D center by VTU. The department has well-equipped laboratories, some of which have been set up in collaboration with industries such as Intel, Apple, Honeywell, EMC2, Nokia Siemens and IBM.
The department has highly qualified and motivated faculty members. All faculty are involved in research and publish technical papers in reputed technical journals and conferences across the world. The department was accredited by the NBA in 2001 and 2004, and reaccredited in 2010. The department has successfully conducted seminars and workshops for students and academicians on emerging technologies.

Faculty List

Sl. No. | Name of Faculty | Qualification | Designation
1 | Dr. Vijaya Kumar B P | M.Tech (IITR), Ph.D (IISc.) | Professor & Head
2 | Dr. Lingaraju G M | Ph.D | Professor
3 | N Ramesh | M.Tech | Associate Professor
4 | Rajaram M Gowda | M.Tech (Ph.D) | Associate Professor
5 | Dr. Mydhili K Nair | M.Tech, Ph.D (Anna University) | Associate Professor
6 | Shashidhara H S | M.Tech (Ph.D) | Associate Professor
7 | George Philip C | M.Tech | Associate Professor
8 | T Tamilarasi | M.E (Ph.D) | Assistant Professor
9 | Dr. Megha P Arakeri | M.Tech, Ph.D (NITK) | Associate Professor
10 | Dr. Siddesh G M | M.Tech, Ph.D | Associate Professor
11 | Savita K Shetty | M.Tech | Assistant Professor
12 | Myna A N | M.Sc (Engg), (Ph.D) | Assistant Professor
13 | Deepthi K | M.Tech | Assistant Professor
14 | Lincy Meera Mathews | M.Tech (Ph.D) | Assistant Professor
15 | P M Krishna Raj | M.Sc (Engg), (Ph.D) | Assistant Professor
16 | Rajeshwari S B | B.E (M.Tech) | Assistant Professor
17 | Prathima M N | M.E | Assistant Professor
18 | Pushpalatha M N | M.Tech (Ph.D) | Assistant Professor
19 | Mohan Kumar S | M.Tech (Ph.D) | Assistant Professor
20 | Sumana M | M.Tech (Ph.D) | Assistant Professor
21 | Prashanth Kambli | M.Sc, (M.Sc (Engg) by Research) | Assistant Professor
22 | Naresh E | M.Tech, (Ph.D) | Assistant Professor
23 | Jagadeesh Sai D | M.Tech | Assistant Professor
24 | Mani Sekhar S R | M.Tech, (Ph.D) | Assistant Professor
25 | Suresh Kumar K R | M.Tech, (Ph.D) | Assistant Professor
26 | Sunitha R S | M.Tech | Assistant Professor
27 | Sandeep B L | M.Tech, (Ph.D) | Assistant Professor
28 | Dayananda P | M.Tech, (Ph.D) | Assistant Professor
29 | Koushik S | M.Tech, (Ph.D) | Assistant Professor

Vision and Mission of the Institute and the Department

The Vision of MSRIT:
To evolve into an autonomous institution of international standing for imparting quality technical education.

The Mission of the institute in pursuance of its Vision:
MSRIT shall deliver global quality technical education by nurturing a conducive learning environment for a better tomorrow through continuous improvement and customization.

Quality Policy
“We at M. S. Ramaiah Institute of Technology, Bangalore strive to deliver comprehensive, continually enhanced, global quality technical and management education through an established Quality Management System complemented by the synergistic interaction of the stakeholders concerned”.

The Vision of the Department:
To evolve as an outstanding education and research center of Information Technology to create high quality Engineering Professionals for the betterment of Society.

The Mission of the Department:
• To provide a conducive environment that offers well balanced Information Technology education and research
• To provide training and practical experience in fundamentals and emerging technologies

Program Education Objectives (PEOs)
Students will be able to:
PEO1: Contribute in the area of Software Engineering development, maintenance & research in social-technical system.
PEO2: Exhibit the Software Engineering skills for analysis, design & testing using modern tools & technologies within or outside discipline.
PEO3: Act according to professional ethics and communicate effectively with various stakeholders by demonstrating leadership qualities.
Program Outcomes (POs)
Students will be able to:
a: Nurture and enhance knowledge in Software Engineering from a global perspective that emphasizes designing, developing and testing software.
b: Analyze complex Software Engineering problems critically and pursue research independently.
c: Identify problems and arrive at an optimal software solution within constraints.
d: Perceive and develop a suitable model by applying research skills and lifelong learning that caters to multidisciplinary or Software Engineering domains, either individually or in a group.
e: Optimize the domain engineering activities with the help of modern IT tools.
f: Apply the knowledge of Software Engineering for effective project management considering ethical and social responsibility.
g: Communicate effectively, through documentation and presentation, with the engineering community and society.

Board of Studies

1. Head of the Department Concerned
Dr. Vijaya Kumar B P, Head of the Department, Information Science & Engg., M.S. Ramaiah Institute of Technology, MSRIT Post, Bangalore – 560 054. Status: Chair Person

2. Faculty members nominated by the Academic Council
Shashidhara H S, Associate Professor, Dept. of Information Science & Engineering, M.S. Ramaiah Institute of Technology, MSR Post, Bangalore – 560 054. Status: Member
Dr. Megha P Arakeri, Associate Professor, Dept. of Information Science & Engineering, M.S. Ramaiah Institute of Technology, MSR Post, Bangalore – 560 054. Status: Member
Dr. Siddesh G M, Associate Professor, Dept. of Information Science & Engineering, M.S. Ramaiah Institute of Technology, MSR Post, Bangalore – 560 054. Status: Member
P. M Krishna Raj, Asst. Professor, Dept. of Information Science & Engineering, M.S. Ramaiah Institute of Technology, MSR Post, Bangalore – 560 054. Status: Member

3. Experts in the subject from outside the College, to be nominated by the Academic Council
Dr. Satish Babu, Professor & Head, Dept. of Computer Science & Engineering, Siddaganga Institute of Technology, NH 206, B.H. Road, Tumkur, Karnataka 572103. bsbsit@gmail.com. Status: Autonomous Institute Member
Dr. Dilip Kumar, Professor & Head, University Visvesvaraya College of Engineering (UVCE), K R Circle, Dr Ambedkar Veedhi, Bangalore, Karnataka 560001. dilipkumarsm06@yahoo.com. Status: Government University Member

Dr. Y.N. Srikant, Professor, Department of Computer Science and Automation, Indian Institute of Science, Bangalore 560 012. Email: srikant@csa.iisc.ernet.in. Status: VTU Member from IISc
Madhu N. Belur, Department of Electrical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India. Status: Special Invitee
Dr. Chetan Kumar S, Manager Software Development, CISCO System, Cessna Business Park, Kadubeesanahalli Village, Varthur Hobli, Sarjapur Marathalli, Bangalore – 560 087. Shivakumar.chetan@gmail.com. Status: Expert Member from Industry
Mr. Niranjan Salimath, Beaglesloft, 37/5, Ulsoor Rd, Yellappa Chetty Layout, Sivanchetti Gardens, Bengaluru, Karnataka 560042. Email: ranju@beaglesloft.com. Status: Alumni Member

Scheme of Teaching for 2014-2016 Batch

I Semester M.Tech. (Software Engineering)

Sl. No | Subject Code | Subject | Credits*: L | T | P | Total
1 | MSWE11 | Advanced Mathematics | 4 | 1 | 0 | 05
2 | MSWE12 | Advanced Topics in Software Engineering | 4 | 0 | 1 | 05
3 | MSWE13 | Software Architecture and Design Patterns | 4 | 0 | 1 | 05
4 | MSWE14 | Technical Writing | 0 | 0 | 2 | 02
5 | MSWEAX | Elective – A | 4 | 0 | 1 | 05
6 | MSWEBX | Elective – B | 4 | 0 | 1 | 05
Total | | | 20 | 1 | 6 | 27
* L : Lecture, T : Tutorial, P : Practical

Elective - A: MSWEA1 Web Services; MSWEA2 Embedded Systems Design
Elective - B: MSWEB1 Parallel Computing; MSWEB2 Mobile Computing

II Semester M.Tech. (Software Engineering)

Sl. No | Subject Code | Subject | Credits*: L | T | P | Total
1 | MSWE21 | Software Measurements and Metrics | 4 | 0 | 1 | 05
2 | MSWE22 | Software Project Management | 4 | 0 | 1 | 05
3 | MSWE23 | Software Quality Assurance and Testing | 4 | 0 | 1 | 05
4 | MSWE24 | Seminar - I | 0 | 2 | 0 | 02
5 | MSWECX | Elective – C | 4 | 0 | 1 | 05
6 | MSWEDX | Elective – D | 4 | 0 | 1 | 05
Total | | | 20 | 2 | 5 | 27
* L : Lecture, T : Tutorial, P : Practical

Elective - C: MSWEC1 Cloud Computing; MSWEC2 Soft Computing
Elective - D: MSWED1 Storage Area Networks; MSWED2 Advanced Computer Graphics

III Semester M.Tech. (Software Engineering)

Sl. No | Subject Code | Subject | Credits*: L | T | P | Total
1 | MSWE31 | Software Engineering & Society | 4 | 1 | 0 | 05
2 | MSWE32 | Project Preliminaries | 0 | 2 | 6 | 08
4 | MSWE33 | Seminar - II | 0 | 2 | 0 | 02
5 | MSWEEX | Elective - E | 4 | 0 | 1 | 05
Total | | | 8 | 4 | 8 | 20
* L : Lecture, T : Tutorial, P : Practical

Elective - E: MSWEE1 Applied Parallel Computing; MSWEE2 Bioinformatics; MSWEE3 System Performance and Analysis (4:1:0); MSWEE4 Advanced Data Mining

IV Semester M.Tech. (Software Engineering)

Sl. No | Subject Code | Subject | Credits*: L | T | P | Total
1 | MSWE41 | Project-2 | 0 | 0 | 26 | 26
Total | | | | | | 26
* L : Lecture, T : Tutorial, P : Practical

Semester wise Credit Allocation

Semester | 1 | 2 | 3 | 4 | Total
Core | 15 | 15 | 05 | 00 | 30
Electives | 10 | 10 | 05 | 00 | 25
Project | 00 | 00 | 08 | 26 | 34
Others | 02 | 02 | 02 | 00 | 11
Total | 27 | 27 | 20 | 26 | 100

SOFTWARE ENGINEERING AND SOCIETY
Course Code: MSWE31
Prerequisites: NIL
Credits: 4:1:0
Contact Hours: 56L + 28T
Course coordinator(s): Krishna Raj P M

Course objectives:
1. Understand the evolution of computing and related ethical issues.
2. Apply the traditional moral and ethical theories to computer related issues.
3. Investigate the IPR issues related to software.
4. Analyze the risks and liabilities of software for developers and producers.
5. Study the new areas of software application and understand the ethical issues.

Course Contents:
UNIT I
History of Computing: Historical Development of Computing and Information, Development of the Internet, Development of the World Wide Web, The Emergence of Social and Ethical Problems in Computing, The Case for Computer Ethics Education. Morality and the Law, Ethics and Ethical Analysis: Traditional definitions, Ethical Theories, Functional Definition of Ethics, Ethical Reasoning and Decision Making, Codes of Ethics, Technology and Values.
UNIT II
Ethics and the Professions: Evolution of Professions, Education and Licensing, Professional Decision Making and Ethics, Professionalism and Ethical Responsibilities. Anonymity, Security, Privacy, and Civil Liberties, Ethical and Legal Framework for Information.
UNIT III
Intellectual Property Rights and Computer Technology: Computer Products and Services, Foundations of Intellectual Property, Ownership, Intellectual Property Crimes, Protection of Ownership Rights. Social Context of Computing: The Digital Divide, Obstacles to Overcoming the Digital Divide, ICT in the Workplace, Employee Monitoring, Workplace, Employee, Health, and Productivity.
UNIT IV
Software Issues: Risks and Liabilities, Causes of Software Failures, Consumer Protection, Improving Software Quality, Producer Protection. Computer Crimes: History of Computer Crimes, Types of Computer System Attacks, Motives of Computer Crimes, Costs and Social Consequences, Computer Crime Prevention.
UNIT V
New Frontiers for Computer Ethics: Artificial Intelligence, Virtualization and Virtual Reality, Cyberspace, Social Network Ecosystem.

Tutorials: Case studies on various topics covered in the class will be discussed.

References:
1. Joseph Migga Kizza: Ethical and Social Issues in the Information Age, Springer, Fifth Edition, 2013.
2. Robert Barger: Computer Ethics - A Case-based Approach, Cambridge University Press, First Edition, 2008.
3. Robert Plotkin: Computer Ethics (Computers, Internet and Society), Checkmark Books, First Edition, 2011.

Course outcomes:
Students will be able to:
1. Describe the evolution of computing and ethical issues therein (PO a)
2. Infer the socially relevant issues related to software like liberty and privacy (PO f)
3. Sketch the process of protecting the IPR issues in software (PO f)
4. Interpret the risks and liabilities of software in context of computer crimes (PO f)
5. Critique the ethical issues arising from new areas of software usage (PO f, PO g)

APPLIED PARALLEL COMPUTING
Course Code: MSWEE1
Prerequisites: NIL
Credits: 4:0:1
Contact Hours: 56L + 28P
Course coordinator(s): N Ramesh

Course objectives:
1. Understand the evolution of GPUs and Parallel Programming.
2. Understand Compute Unified Device Architecture (CUDA).
3. Understand CUDA memory organization and performance measurement.
4. Understand OpenCL.
5. Understand case studies.

Course Contents:
UNIT I
Introduction and History: GPUs as Parallel Computers; Architecture of a Modern GPU; Why More Speed or Parallelism; Parallel Programming Languages and Models; Overarching Goals; Evolution of Graphics Pipelines; The Era of Fixed-Function Graphics Pipelines; Evolution of Programmable Real-Time Graphics; Unified Graphics and Computing Processors; GPGPU: An Intermediate Step; GPU Computing; Scalable GPUs; Recent Developments; Future Trends.
UNIT II
Introduction to CUDA: Data Parallelism; CUDA Program Structure; A Matrix-Matrix Multiplication Example; Device Memories and Data Transfer; Kernel Functions and Threading; Function declarations; Kernel launch; Predefined variables; Runtime API. CUDA Thread Organization; Using blockIdx and threadIdx; Synchronization and Transparent Scalability; Thread Assignment; Thread Scheduling and Latency Tolerance.
UNIT III
CUDA Memories: Importance of Memory Access Efficiency; CUDA Device Memory Types; A Strategy for Reducing Global Memory Traffic; Memory as a Limiting Factor to Parallelism; Global Memory Bandwidth; Dynamic Partitioning of SM Resources; Data Prefetching; Instruction Mix; Thread Granularity; Measured Performance.
UNIT IV
Introduction to OpenCL: Background; Data Parallelism Model; Device Architecture; Kernel Functions; Device Management and Kernel Launch; Electrostatic Potential Map in OpenCL; Parallel Programming in CUDA C; CUDA C on multiple GPUs.
UNIT V
Case studies and Tools on CUDA: Introduction to tools: CUDA-GDB, NVIDIA Parallel Nsight. Case studies: Fast virus signature matching on the GPU, AES Encryption and Decryption on the GPU, Imaging Earth's Subsurface Using CUDA, Incremental Computation of the Gaussian.

Laboratory: Exercises and Mini-projects based on concepts & tools.

References:
1. David B. Kirk and Wen-mei W. Hwu: Programming Massively Parallel Processors: A Hands-on Approach, Elsevier India Private Limited, 2010.
2. Hubert Nguyen: GPU Gems 3, Addison Wesley Professional, 2007 (available free online at the NVIDIA site).
3. http://www.nvidia.co.in/object/cuda_home_new_in.html

Course outcomes:
Students will be able to:
1. Explain the evolution of GPUs and Parallel Programming. (PO a)
2. Explain the CUDA architecture and Thread organization. (PO a, PO e)
3. Explain CUDA memory organization and performance measurement. (PO a, PO c, PO e)
4. Understand OpenCL and Parallel programming in CUDA. (PO a, PO c)
5. Understand case studies. (PO a, PO e)

BIOINFORMATICS
Course Code: MSWEE2
Prerequisites: NIL
Credits: 4:0:1
Contact Hours: 56L + 28P
Course coordinator(s): Shashidhara H S

Course objectives:
• Study computational techniques for biological data analysis
• Apply and improve those computational techniques on biological data
• Study the different ways/techniques of analyzing biological data
• Study the different ways of storing, retrieving and updating biological data

Course Contents:
Unit I
The genetic material, gene structure and information content, protein structure and function, chemical bonds, molecular biology tools.
Unit II
Dot plots, simple alignments, gaps, scoring matrices, the Needleman and Wunsch algorithm, semiglobal alignments, the Smith and Waterman algorithm, database searches – BLAST and FASTA.
Unit III
Patterns of substitutions within genes, estimating substitution numbers, molecular clocks. Molecular phylogenetics, phylogenetic trees, distance matrix methods, maximum likelihood approaches.
Unit IV
Parsimony, Inferred Ancestral Sequences, strategies for fast searches – branch and bound and heuristic searches, consensus trees, tree confidence, molecular phylogenies. Genomics – 1: Prokaryotic genomes, prokaryotic gene structure, GC content and prokaryotic genomes, prokaryotic gene density, eukaryotic genomes.
Unit V
Genomics – 2: Eukaryotic gene structure, Open reading frames, GC contents in eukaryotic genomes, gene expression, transposition, repetitive elements. Amino acids, polypeptide composition, secondary structure, tertiary and quaternary structures, algorithms for modeling protein folding.

Laboratory: Exercises and Mini-projects based on concepts & tools.

References:
• Dan E. Krane, Michael L. Raymer, Fundamental Concepts of Bioinformatics, Pearson Education, 2008
• T K Attwood, D J Parry Smith, Introduction to Bioinformatics, Pearson Education, 2004
• Gary B. Fogel, David W. Corne, Evolutionary Computation in Bioinformatics, Morgan Kaufmann Publishers

Course outcomes:
Students will be able to:
1. Explain all the available molecular biology tools (PO e)
2. Solve sequence alignment problems with/without gap penalty (PO a, PO b, PO e)
3. Explain the pattern of substitution within genes (PO b)
4. Distinguish between character based and distance based phylogeny (PO a, PO b)
5. Identify different parts of prokaryotic and eukaryotic genes (PO a, PO b)

SYSTEM PERFORMANCE AND ANALYSIS
Course Code: MSWEE3
Prerequisites: NIL
Credits: 4:1:0
Contact Hours: 56L + 28T
Course coordinator(s): Myna A N

Course objectives:
• Introduce various analysis techniques such as statistics, probability theory, experimental design, simulation, and queuing theory.
• Provide basic modeling, simulation, and analysis background to systems analysts.
• Use measurement as well as modeling techniques to solve performance problems.

Course Contents:
UNIT I
AN OVERVIEW OF PERFORMANCE EVALUATION: Introduction, Common Mistakes in Performance Evaluation, A Systematic Approach to Performance Evaluation, Selecting an Evaluation Technique, Selecting Performance Metrics, Commonly used Performance Metrics, Utility Classification of Performance Metrics, Setting Performance Requirements.
UNIT II
WORKLOAD SELECTION AND CHARACTERIZATION: Types of Workloads, Addition instructions, Instruction Mixes, Kernels, Synthetic programs, Application Benchmarks, Popular benchmarks, Workload Selection: Services exercised, Level of detail, Representativeness, Timeliness, Other considerations in Workload selection, Workload Characterization & Techniques: Terminology, Averaging, Specifying dispersion, Single Parameter Histograms, Multi parameter histograms, Principal Component Analysis, Markov Models, Clustering.
UNIT III
MEASUREMENT TECHNIQUES AND TOOLS: Monitors: Terminology and classification, software and hardware monitors, Software versus hardware monitors, firmware and hybrid monitors, Distributed System Monitors, Program-Execution Monitors and Accounting Logs: Program Execution Monitors, Techniques for Improving Program Performance, Accounting Logs, Analysis and Interpretation of Accounting log data, Using accounting logs to answer commonly asked questions.
UNIT IV
CAPACITY PLANNING, BENCHMARKING AND EXPERIMENTAL DESIGN: Steps in capacity planning and management, Problems in capacity planning, Common mistakes in benchmarking, Remote-Terminal Emulation, Components of an RTE, Limitations of RTEs, Experimental design and analysis: Terminology, Common mistakes in experiments, Types of Experimental designs, 2^k factorial designs, concepts, computation of effects, Sign table method for computing effects, allocation of Variance, General 2^k factorial designs.
UNIT V
SIMULATION: Introduction to Simulation, Analysis of Simulation Results: Model Verification Techniques: Top-down Modular design, Antibugging, Structured Walk-through, Deterministic Models, Run simplified cases, Trace, Online graphic displays, Continuity Tests, Degeneracy Tests, Consistency Tests, Seed Independence, Model Validation Techniques: Expert Intuition, Real System Measurements, Theoretical Results, Transient Removal: Long Runs, Proper Initialization, Truncation, Initial Data Deletion, Moving Average of Independent Replications, Batch Means, Terminating Simulation, Stopping Criteria, Variance Reduction.

Tutorials: Exercises based on concepts.

References:
1. Raj Jain: The Art of Computer Systems Performance Analysis, John Wiley and Sons, New York, USA, 1991.
2. Law A M and Kelton W D: Simulation Modeling and Analysis, McGraw Hill, New York, USA, 1991.
3. Paul J Fortier, Howard E Michel: Computer Systems Performance Evaluation and Prediction, Elsevier, 2003.

Course outcomes:
Students will be able to:
• Understand performance terminology (PO d, PO e)
• Select proper workload and characterize the workload (PO a, PO d, PO e)
• Collect performance statistics, analyze the data and display results using monitors (PO a, PO d, PO e)
• Correctly design performance experiments (PO a, PO d, PO e)
• Perform computer system performance analysis using simulation (PO a, PO d, PO e)

ADVANCED DATA MINING
Course Code: MSWEE4
Prerequisites: NIL
Credits: 4:0:1
Contact Hours: 56L + 28P
Course coordinator(s): Pushpalatha M N

Course objectives:
• Understand Supervised learning: classification, regression
• Understand Algorithm-independent machine learning and association mining
• Introduce concepts of unsupervised learning and clustering
• Become familiar with Bayesian methods, MapReduce and Hadoop
• Become familiar with advanced concepts and their application to software engineering, web and text data

Course Contents:
Unit I
Introduction & Supervised learning: Data Mining, Classification by Decision Tree Induction, Rule-Based Classification, Classification by Backpropagation, Support Vector Machines, Lazy Learners and other classification methods, Prediction, Accuracy and Error measures, evaluating the accuracy of a classifier or predictor.
Unit II
Algorithm-independent machine learning and Association mining: Introduction, Lack of Inherent superiority of any classifier, Resampling for classifier Design – Bagging, Boosting. Mining frequent patterns and Associations: Basic concepts, Efficient and scalable frequent itemset mining methods and Mining various kinds of association rules.
Unit III
Unsupervised learning and clustering: Types of Data in Cluster Analysis, A categorization of major clustering methods, Partitioning methods, Hierarchical methods, Density-Based methods, Model-Based Clustering methods, Outlier Analysis.
Unit IV
Bayesian Decision Theory: Continuous Features, Minimum-Error-Rate Classification, Classifiers, Discriminant Functions, and Decision Surfaces, The Normal Density, Discriminant Functions for the Normal Density, Error Probabilities and Integrals, Error Bounds for Normal Densities, Bayes Decision Theory – Discrete Features, MapReduce and Hadoop.
Unit V
Advanced Concepts: Multidimensional Analysis and Descriptive mining of complex data objects, Text Mining: Text data analysis and information retrieval, text mining approaches, Analysis of World Wide Web, Social Impacts of Data Mining, Trends in data mining, Application: Software Engineering.

Laboratory: Give advanced understanding of
1. R, a programming language widely used for data analysis;
2. Weka, an environment which provides a collection of machine learning algorithms for data mining tasks.

R:
3. R as a calculator. Using the help system.
4. Data input and output. Vectors, arrays and matrices.
5. Data visualization (including plots in 2 and 3 dimensions, scatter plots, barplots, histograms).
6. Implementing concepts from Linear Algebra and Statistics (including probability distributions, matrix decompositions).
7. Programming: loops, conditional executions, string manipulations, data structures, etc.
8. Writing functions, debugging the code. Using packages and toolboxes.

WEKA:
9. Data Input: concepts, instances, attributes. Feature selection.
10. Using machine learning schemes (including decision trees, naive Bayes classifiers, clustering methods).
11. Training and Testing, predicting generalization performance, cross-validation.

References:
• Jiawei Han, Micheline Kamber: Data Mining - Concepts and Techniques, 2nd Edition, Morgan Kaufmann Publisher, 2006.
• Richard O. Duda, Peter E. Hart, and David G. Stork: Pattern Classification, Second Edition, Wiley, New York, 2000.
• Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani: An Introduction to Statistical Learning with Applications in R, Springer, 2013.
• Norman Matloff: The Art of R Programming: A Tour of Statistical Software Design, No Starch Press, 2011.
• Manoel Mendonca, Nancy L. Sunderhaft: Mining Software Engineering Data: A Survey, A DACS State-of-the-Art Report.

Course outcomes:
Students will be able to:
• Explain data mining and implement, execute and evaluate different classification methods and techniques. (PO a, PO b, PO e)
• Discuss algorithm independent machine learning and association mining. (PO a, PO b, PO e)
• Identify the appropriate clustering techniques for the given data sets. (PO a, PO b, PO e)
• Discuss Bayesian Decision Theory, MapReduce and Hadoop. (PO e)
• Discuss advanced concepts and apply concepts to software engineering, web and text data for extracting value and insight. (PO a, PO b, PO e)