M.S. RAMAIAH INSTITUTE OF TECHNOLOGY BANGALORE (Autonomous Institute, Affiliated to VTU) SYLLABUS

(For the Academic year 2014 – 2015)
Dept of Information Science and Engineering
M.Tech (Software Engineering)
III and IV Semester
History of the Institute:
M. S. Ramaiah Institute of Technology was started in 1962 by the late Dr. M.S. Ramaiah, our
Founder Chairman who was a renowned visionary, philanthropist, and a pioneer in creating
several landmark infrastructure projects in India. Noticing the shortage of talented
engineering professionals required to build a modern India, Dr. M.S. Ramaiah envisioned
MSRIT as an institute of excellence imparting quality and affordable education. Part of Gokula
Education Foundation, MSRIT has grown over the years with significant contributions from
various professionals in different capacities, ably led by Dr. M.S. Ramaiah himself, whose
personal commitment has seen the institution through its formative years. Today, MSRIT
stands tall as one of India’s finest names in Engineering Education and has produced around
35,000 engineering professionals who occupy responsible positions across the globe.
About the Department:
The Department of Information Science and Engineering (ISE) was established in the year
1992 with an objective of producing high-quality professionals to meet the demands of the
emerging field of Information Science and Engineering. The department started with a UG
programme with an annual sanctioned intake of 30 students; the intake was enhanced to
60 seats in the year 1991, to 90 seats in the year 2001 and then to 120 seats in the year
2008. The department started M.Tech in Software Engineering in the year 2004. The
department has been recognized as an R&D center by VTU. The department has well-equipped
laboratories, some of which have been set up in collaboration with industries
such as Intel, Apple, Honeywell, EMC2, Nokia Siemens and IBM. The department has highly
qualified and motivated faculty members. All faculty are involved in research and publish
technical papers in reputed journals and conferences across the world. The
department was accredited by the NBA in 2001 and 2004, and reaccredited in 2010. The
department has successfully conducted seminars and workshops on emerging technologies
for students and academicians.
Faculty List
Sl.No.  Name of Faculty        Qualification                     Designation
1       Dr. Vijaya Kumar B P   M.Tech (IITR), Ph.D (IISc.)       Professor & Head
2       Dr. Lingaraju G M      Ph.D                              Professor
3       N Ramesh               M.Tech                            Associate Professor
4       Rajaram M Gowda        M.Tech. (Ph.D)                    Associate Professor
5       Dr. Mydhili K Nair     M.Tech, Ph.D (Anna University)    Associate Professor
6       Shashidhara H S        M.Tech. (Ph.D)                    Associate Professor
7       George Philip C        M.Tech.                           Associate Professor
8       T Tamilarasi           M.E (Ph.D)                        Assistant Professor
9       Dr. Megha P Arakeri    M.Tech, Ph.D (NITK)               Associate Professor
10      Dr. Siddesh G M        M.Tech., Ph.D                     Associate Professor
11      Savita K Shetty        M.Tech.                           Assistant Professor
12      Myna A N               M.Sc(Engg), (Ph.D)                Assistant Professor
13      Deepthi K              M.Tech.                           Assistant Professor
14      Lincy Meera Mathews    M.Tech (Ph.D)                     Assistant Professor
15      P M Krishna Raj        M.Sc(Engg), (Ph.D)                Assistant Professor
16      Rajeshwari S B         B.E (M.Tech.)                     Assistant Professor
17      Prathima M N           M.E.                              Assistant Professor
18      Pushpalatha M N        M.Tech (Ph.D)                     Assistant Professor
19      Mohan Kumar S          M.Tech (Ph.D)                     Assistant Professor
20      Sumana M               M.Tech (Ph.D)                     Assistant Professor
21      Prashanth Kambli       M.Sc, (MSc(Engg.) by Research)    Assistant Professor
22      Naresh E               M.Tech, (Ph.D)                    Assistant Professor
23      Jagadeesh Sai D        M.Tech.                           Assistant Professor
24      Mani Sekhar S R        M.Tech, (Ph.D)                    Assistant Professor
25      Suresh Kumar K R       M.Tech, (Ph.D)                    Assistant Professor
26      Sunitha R S            M.Tech.                           Assistant Professor
27      Sandeep B L            M.Tech, (Ph.D)                    Assistant Professor
28      Dayananda P            M.Tech, (Ph.D)                    Assistant Professor
29      Koushik S              M.Tech, (Ph.D)                    Assistant Professor
Vision and Mission of the Institute and the Department
The Vision of MSRIT: To evolve into an autonomous institution of international standing for
imparting quality technical education
The Mission of the institute in pursuance of its Vision: MSRIT shall deliver global quality
technical education by nurturing a conducive learning environment for a better tomorrow
through continuous improvement and customization
Quality Policy
“We at M. S. Ramaiah Institute of Technology, Bangalore strive to deliver comprehensive,
continually enhanced, global quality technical and management education through an
established Quality Management system Complemented by the Synergistic interaction of the
stake holders concerned”.
The Vision of the Department: To evolve as an outstanding education and research center
of Information Technology to create high quality Engineering Professionals for the betterment
of Society
The Mission of the Department:
• To provide a conducive environment that offers well balanced Information Technology
education and research
• To provide training and practical experience in fundamentals and emerging
technologies
Program Education Objectives (PEOs)
Student will be able to
PEO1: Contribute to Software Engineering development, maintenance & research
in socio-technical systems.
PEO2: Exhibit the Software Engineering skills for analysis, design & testing using modern tools
& technologies within or outside discipline.
PEO3: Act according to professional ethics and communicate effectively with various
stakeholders by demonstrating leadership qualities.
Program Outcomes (POs)
Student will be able to
a: Nurture and enhance knowledge of Software Engineering from a global perspective,
with emphasis on designing, developing and testing software.
b: Analyze complex Software Engineering problems critically and pursue research
independently.
c: Identify problems and arrive at optimal software solutions within constraints.
d: Perceive & develop a suitable model by applying research skills and lifelong learning that
caters to multidisciplinary or Software Engineering domains, either individually or in a group.
e: Optimize domain engineering activities with the help of modern IT tools.
f: Apply the knowledge of Software Engineering for effective project management considering
ethical and social responsibility.
g: Communicate effectively, through documentation and presentation, with the engineering
community and society.
Board of Studies
(S. No., Category, Name of the Person with Official Address, Status)

1. Head of the Department Concerned
Dr. Vijaya Kumar B P, Head of the Department, Information Science & Engg.,
M.S. Ramaiah Institute of Technology, MSRIT Post, Bangalore – 560 054.
Status: Chair Person

2. Faculty members nominated by the Academic Council
Shashidhara H S, Associate Professor, Dept. of Information Science & Engineering,
M.S. Ramaiah Institute of Technology, MSR Post, Bangalore – 560 054.
Status: Member
Dr. Megha P Arakeri, Associate Professor, Dept. of Information Science & Engineering,
M.S. Ramaiah Institute of Technology, MSR Post, Bangalore – 560 054.
Status: Member
Dr. Siddesh G M, Associate Professor, Dept. of Information Science & Engineering,
M.S. Ramaiah Institute of Technology, MSR Post, Bangalore – 560 054.
Status: Member
P. M Krishna Raj, Asst. Professor, Dept. of Information Science & Engineering,
M.S. Ramaiah Institute of Technology, MSR Post, Bangalore – 560 054.
Status: Member

3. Experts in the subject from outside the College, to be nominated by the Academic Council
Dr. Satish Babu, Professor & Head, Dept. of Computer Science & Engineering,
Siddaganga Institute of Technology, NH 206, B.H. Road, Tumkur, Karnataka 572103.
bsbsit@gmail.com
Status: Autonomous Institute Member
Dr. Dilip Kumar, Professor & Head, University Visvesvaraya College of Engineering (UVCE),
K R Circle, Dr Ambedkar Veedhi, Bangalore, Karnataka 560001.
dilipkumarsm06@yahoo.com
Status: Government University Member

Dr. Y.N. Srikant, Professor, Department of Computer Science and Automation,
Indian Institute of Science, Bangalore 560 012.
Email: srikant@csa.iisc.ernet.in
Status: VTU Member from IISc
Madhu N. Belur, Department of Electrical Engineering,
Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India.
Status: Special Invitee
Dr. Chetan Kumar S, Manager Software Development, CISCO System,
Cessna Business Park, Kadubeesanahalli Village, Varthur Hobli, Sarjapur Marathalli,
Bangalore – 560 087.
Shivakumar.chetan@gmail.com
Status: Expert Member from Industry
Mr. Niranjan Salimath, Beaglesloft, 37/5, Ulsoor Rd, Yellappa Chetty Layout,
Sivanchetti Gardens, Bengaluru, Karnataka 560042.
Email: ranju@beaglesloft.com
Status: Alumni Member
Scheme of Teaching for 2014-2016 Batch
I Semester M.Tech. (Software Engineering)
Sl. No  Subject Code  Subject                                    Credits*: L   T   P   Total
1       MSWE11        Advanced Mathematics                                 4   1   0   05
2       MSWE12        Advanced Topics in Software Engineering              4   0   1   05
3       MSWE13        Software Architecture and Design Patterns            4   0   1   05
4       MSWE14        Technical Writing                                    0   0   2   02
5       MSWEAX        Elective – A                                         4   0   1   05
6       MSWEBX        Elective – B                                         4   0   1   05
Total                                                                      20  1   6   27
* L : Lecture   T : Tutorial   P : Practical
Elective - A
MSWEA1  Web Services
MSWEA2  Embedded Systems Design
Elective - B
MSWEB1  Parallel Computing
MSWEB2  Mobile Computing
II Semester M.Tech. (Software Engineering)
Sl. No  Subject Code  Subject                                    Credits*: L   T   P   Total
1       MSWE21        Software Measurements and Metrics                    4   0   1   05
2       MSWE22        Software Project Management                          4   0   1   05
3       MSWE23        Software Quality Assurance and Testing               4   0   1   05
4       MSWE24        Seminar - I                                          0   2   0   02
5       MSWECX        Elective – C                                         4   0   1   05
6       MSWEDX        Elective – D                                         4   0   1   05
Total                                                                      20  2   5   27
* L : Lecture   T : Tutorial   P : Practical
Elective - C
MSWEC1  Cloud Computing
MSWEC2  Soft Computing
Elective - D
MSWED1  Storage Area Networks
MSWED2  Advanced Computer Graphics
III Semester M.Tech. (Software Engineering)
Sl. No  Subject Code  Subject                                    Credits*: L   T   P   Total
1       MSWE31        Software Engineering & Society                       4   1   0   05
2       MSWE32        Project Preliminaries                                0   2   6   08
3       MSWE33        Seminar - II                                         0   2   0   02
4       MSWEEX        Elective - E                                         4   0   1   05
Total                                                                      8   4   8   20
* L : Lecture   T : Tutorial   P : Practical
Elective - E
MSWEE1  Applied Parallel Computing
MSWEE2  Bioinformatics
MSWEE3  System Performance and Analysis (4:1:0)
MSWEE4  Advanced Data Mining
IV Semester M.Tech. (Software Engineering)
Sl. No  Subject Code  Subject                                    Credits*: L   T   P   Total
1       MSWE41        Project-2                                            0   0   26  26
Total                                                                                  26
* L : Lecture   T : Tutorial   P : Practical
Semester wise Credit Allocation
Semester  Core  Electives  Project  Others  Total
1         15    10         00       02      27
2         15    10         00       02      27
3         05    05         08       02      20
4         00    00         26       00      26
Total     35    25         34       06      100
SOFTWARE ENGINEERING AND SOCIETY
Course Code : MSWE31
Prerequisites: NIL
Credits : 4:1:0
Contact Hours : 56L + 28T
Course coordinator(s): Krishna Raj P M
Course objectives:
1. Understand the evolution of computing and related ethical issues.
2. Apply the traditional moral and ethical theories to computer related issues
3. Investigate the IPR issues related to software
4. Analyze the risks and liabilities of software for developer and producers
5. Study the new areas of software application and understand the ethical issues
Course Contents:
UNIT I
History of Computing: Historical Development of Computing and Information,
Development of the Internet, Development of the World Wide Web, The Emergence of
Social and Ethical Problems in Computing, The Case for Computer Ethics Education.
Morality and the Law, Ethics and Ethical Analysis: Traditional definitions, Ethical
Theories, Functional Definition of Ethics, Ethical Reasoning and Decision Making, Codes
of Ethics, Technology and Values.
UNIT II
Ethics and the Professions: Evolution of Professions, Education and Licensing,
Professional Decision Making and Ethics, Professionalism and Ethical Responsibilities.
Anonymity, Security, Privacy, and Civil Liberties, Ethical and Legal Framework for
Information.
UNIT III
Intellectual Property Rights and Computer Technology: Computer Products and
Services, Foundations of Intellectual Property, Ownership, Intellectual Property Crimes,
Protection of Ownership Rights.
Social Context of Computing: The Digital Divide, Obstacles to Overcoming the Digital
Divide, ICT in the Workplace, Employee Monitoring, Workplace, Employee, Health, and
Productivity.
UNIT IV
Software Issues: Risks and Liabilities, Causes of Software Failures, Consumer
Protection, Improving Software Quality, Producer Protection.
Computer Crimes: History of Computer Crimes, Types of Computer System Attacks,
Motives of Computer Crimes, Costs and Social Consequences, Computer Crime
Prevention.
UNIT V
New Frontiers for Computer Ethics: Artificial Intelligence, Virtualization and Virtual
Reality, Cyberspace, Social Network Ecosystem.
Tutorials:
Case studies on various topics covered in the class will be discussed.
References:
1. Joseph Migga Kizza: Ethical and Social Issues in the Information Age, Springer,
Fifth Edition, 2013.
2. Robert N. Barger: Computer Ethics - A Case-based Approach, Cambridge University
Press, First Edition, 2008.
3. Robert Plotkin: Computer Ethics (Computers, Internet and Society), Checkmark
Books, First Edition, 2011
Course outcomes:
Students will be able to:
1. Describe the evolution of computing and ethical issues therein (PO a)
2. Infer the socially relevant issues related to software like liberty and privacy (PO f)
3. Sketch the process of protecting the IPR issues in software (PO f)
4. Interpret the risks and liabilities of software in context of computer crimes (PO f)
5. Critique the ethical issues arising from new areas of software usage (PO f, PO g)
APPLIED PARALLEL COMPUTING
Course Code : MSWEE1
Prerequisites: NIL
Credits : 4:0:1
Contact Hours : 56L + 28P
Course coordinator(s): N Ramesh
Course objectives:
1. Understand the evolution of GPUs and Parallel Programming.
2. Understand Compute Unified Device Architecture (CUDA).
3. Understand CUDA memory organization and performance measurement.
4. Understand OpenCL.
5. Understand case studies.
Course Contents:
UNIT I
Introduction and History: GPUs as Parallel Computers; Architecture of a Modern GPU;
Why More Speed or Parallelism; Parallel Programming Languages and Models;
Overarching Goals; Evolution of Graphics Pipelines; The Era of Fixed-Function Graphics
Pipelines; Evolution of Programmable Real-Time Graphics; Unified Graphics and
Computing Processors; GPGPU: An Intermediate Step; GPU Computing; Scalable GPUs;
Recent Developments; Future Trends.
UNIT II
Introduction to CUDA: Data Parallelism; CUDA Program Structure; A Matrix-Matrix
Multiplication Example; Device Memories and Data Transfer; Kernel Functions and
Threading; Function declarations; Kernel launch; Predefined variables; Runtime
API. CUDA Thread Organization; Using blockIdx and threadIdx; Synchronization
and Transparent Scalability; Thread Assignment; Thread Scheduling and Latency
Tolerance
UNIT III
CUDA Memories: Importance of Memory Access Efficiency; CUDA Device Memory
Types; A Strategy for Reducing Global Memory Traffic; Memory as a Limiting Factor
to Parallelism; Global Memory Bandwidth; Dynamic Partitioning of SM Resources; Data
Prefetching; Instruction Mix; Thread Granularity; Measured Performance
UNIT IV
Introduction to OpenCL: Introduction to OpenCL; Background; Data Parallelism Model;
Device Architecture; Kernel Functions; Device Management and Kernel Launch;
Electrostatic Potential Map in OpenCL; Parallel Programming in CUDA C; CUDA C on
multiple GPUs
UNIT V
Case studies and Tools on CUDA – Introduction to tools - CUDA-GDB; NVIDIA Parallel
Nsight, Case studies - Fast virus signature matching on GPU, AES Encryption and
Decryption on the GPU, Imaging Earth's Subsurface Using CUDA, Incremental
Computation of the Gaussian
Laboratory:
Exercises and Mini-projects based on concepts & tools.
References:
1. David B Kirk and Wen-mei W Hwu: Programming Massively Parallel Processors: A
Hands-on Approach, Elsevier India Private Limited, 2010.
2. Hubert Nguyen: GPU Gems 3, Addison Wesley Professional, 2007 ( Available
online free at NVIDIA site)
3. http://www.nvidia.co.in/object/cuda_home_new_in.html
Course outcomes:
Students will be able to:
1. Explain the evolution of GPUs and Parallel Programming (PO a)
2. Explain the CUDA architecture and Thread organization. (PO a, PO e)
3. Explain CUDA memory organization and performance measurement (POa, POc, POe)
4. Understand OpenCL and Parallel programming in CUDA. (PO a, PO c)
5. Understand case studies. (PO a, PO e)
BIOINFORMATICS
Course Code : MSWEE2
Prerequisites: NIL
Credits : 4:0:1
Contact Hours : 56L + 28P
Course coordinator(s): Shashidhara H S
Course objectives:
• Study computational techniques for biological data analysis
• Apply and improve those computational techniques on biological data
• Study the different ways/techniques of analyzing biological data
• Study the different ways of storing, retrieving and updating biological data
Course Contents:
Unit I
The genetic material, gene structure and information content, protein structure and
function, chemical bonds, molecular biology tools
Unit II
Dot plots, simple alignments, gaps, scoring matrices, the Needleman and Wunsch
algorithm, semiglobal alignments, the Smith and Waterman algorithm, database
searches – BLAST and FASTA
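As an illustration of the global alignment topic in this unit, the following is a minimal sketch of the Needleman and Wunsch dynamic-programming algorithm with a linear gap penalty. The function name `needleman_wunsch` and the default scores (match 1, mismatch -1, gap -2) are assumptions chosen for the example; they are not prescribed by the syllabus or the reference texts.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""

    def sub(x, y):
        # Substitution score for aligning character x with character y.
        return match if x == y else mismatch

    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),  # align two characters
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("GCATGCU", "GATTACA", gap=-1))
```

A local alignment in the style of Smith and Waterman would differ mainly in clamping each cell at zero and starting the traceback from the highest-scoring cell; that variant is not shown here.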
Unit III
Patterns of substitutions within genes, estimating substitution numbers, molecular clocks
Molecular phylogenetics, phylogenetic trees, distance matrix methods, maximum
likelihood approaches
Unit IV
Parsimony, Inferred Ancestral Sequences, strategies for fast searches – branch and
bound and heuristic searches, consensus trees, tree confidence, molecular phylogenies
Genomics – 1: Prokaryotic genomes, prokaryotic gene structure, GC content and
prokaryotic genomes, prokaryotic gene density, eukaryotic genomes
Unit V
Genomics – 2: Eukaryotic gene structure, Open reading frames, GC contents in
eukaryotic genomes, gene expression, transposition, repetitive elements
Amino acids, polypeptide composition, secondary structure, tertiary and quaternary
structures, algorithms for modeling protein folding
Laboratory:
Exercises and Mini-projects based on concepts & tools.
References:
• Dan E. Krane, Michael L. Raymer, Fundamental Concepts of Bioinformatics, Pearson
Education, 2008
• T K Attwood, D J Parry Smith, Introduction to Bioinformatics, Pearson Education,
2004
• Gary B. Fogel, David W. Corne, Evolutionary Computation in Bioinformatics, Morgan
Kaufmann Publishers
Course outcomes:
Students will be able to:
1. Explain all the available molecular biology tools (PO e)
2. Solve sequence alignment problems with/without gap penalty (PO a, PO b, PO e)
3. Explain the pattern of substitution within genes (PO b)
4. Distinguish between character based and distance based phylogeny (PO a, PO b)
5. Identify different parts of prokaryotic and Eukaryotic Genes (PO a, PO b)
SYSTEM PERFORMANCE AND ANALYSIS
Course Code : MSWEE3
Prerequisites: NIL
Credits : 4:1:0
Contact Hours : 56L + 28T
Course coordinator(s): Myna A N
Course objectives:
• Introduce various analysis techniques such as statistics, probability theory, experimental design, simulation, and queuing theory.
• Provide basic modeling, simulation, and analysis background to systems analysts.
• Use measurement as well as modeling techniques to solve performance problems.
Course Contents:
UNIT I
AN OVERVIEW OF PERFORMANCE EVALUATION: Introduction, Common Mistakes in
Performance Evaluation, A Systematic Approach to Performance Evaluation, Selecting an
Evaluation Technique, Selecting Performance Metrics, Commonly used Performance
Metrics, Utility Classification of Performance Metrics, Setting Performance Requirements.
UNIT II
WORKLOAD SELECTION AND CHARACTERIZATION: Types of Workloads, Addition
instructions, Instruction Mixes, Kernels; Synthetic programs, Application Benchmarks,
Popular benchmarks, Workload Selection: Services exercised, level of detail,
Representativeness, Timeliness, Other considerations in Workload selection, Workload
Characterization & Techniques: Terminology, Averaging, Specifying dispersion, Single
Parameter Histograms, Multi parameter histograms, Principal Component Analysis,
Markov Models, Clustering.
UNIT III
MEASUREMENT TECHNIQUES AND TOOLS: Monitors: Terminology and classification,
software and hardware monitors, Software versus hardware monitors, firmware and
hybrid monitors, Distributed System Monitors, Program-Execution Monitors and
Accounting Logs: Program Execution Monitors, Techniques for Improving Program
Performance, Accounting Logs, Analysis and Interpretation of Accounting log data, Using
accounting logs to answer commonly asked questions.
UNIT IV
CAPACITY PLANNING, BENCHMARKING AND EXPERIMENTAL DESIGN: Steps in
capacity planning and management, Problems in capacity planning, Common mistakes in
benchmarking, Remote-Terminal Emulation, Components of an RTE, Limitations of RTEs,
Experimental design and analysis: Terminology, Common mistakes in experiments,
Types of Experimental designs, 2^k factorial designs, concepts, computation of effects,
Sign table method for computing effects, allocation of Variance, General 2^k factorial
designs.
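The sign table method listed in this unit can be sketched as follows, under stated assumptions: each of the k factors takes levels -1 and +1, the responses are ordered so that the first factor varies fastest, and each effect is the dot product of its sign column with the response vector divided by 2^k. The function names and the four response values in the usage example are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from math import prod


def sign_table_effects(k, responses):
    """Compute the mean and all effects of a 2^k design via the sign table.

    Returns a dict keyed by tuples of factor indices: () is the mean,
    (0,) is the main effect of factor 0, (0, 1) their interaction, etc.
    """
    # Rows of the sign table, with factor 0 varying fastest.
    runs = [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]
    if len(responses) != len(runs):
        raise ValueError("expected 2**k responses")
    effects = {}
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            # Sign column of an interaction = product of its factors' columns.
            column = [prod(run[f] for f in subset) for run in runs]
            effects[subset] = sum(s * y for s, y in zip(column, responses)) / len(runs)
    return effects


def variation_fractions(effects):
    """Fraction of total variation attributed to each non-mean effect."""
    total = sum(q * q for subset, q in effects.items() if subset)
    if total == 0:
        return {subset: 0.0 for subset in effects if subset}
    return {subset: q * q / total for subset, q in effects.items() if subset}


if __name__ == "__main__":
    # Hypothetical responses for runs (A,B) = (-1,-1), (+1,-1), (-1,+1), (+1,+1).
    effects = sign_table_effects(2, [15, 45, 25, 75])
    print(effects)
    print(variation_fractions(effects))
```

For these four responses the mean is 40, the effect of the first factor is 20, of the second 10, and of their interaction 5, so the first factor accounts for 1600/2100 (about 76%) of the variation.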
UNIT V
SIMULATION: Introduction to Simulation, Analysis of Simulation Results: Model
Verification Techniques: Top-down Modular Design, Antibugging, Structured
Walk-through, Deterministic Models, Run Simplified Cases, Trace, Online Graphic
Displays, Continuity Tests, Degeneracy Tests, Consistency Tests, Seed Independence,
Model Validation Techniques: Expert Intuition, Real System Measurements, Theoretical
Results, Transient Removal: Long Runs, Proper Initialization, Truncation, Initial Data
Deletion, Moving Average of Independent Replications, Batch Means, Terminating
Simulation, Stopping Criteria, Variance Reduction.
Tutorials:
Exercises based on concepts.
References:
1. Raj Jain: The Art of Computer Systems Performance Analysis, John Wiley and Sons,
New York, USA, 1991.
2. Law A M and Kelton W D: Simulation Modeling and Analysis, McGraw Hill, New
York, USA, 1991.
3. Paul J Fortier, Howard E Michel: Computer Systems Performance Evaluation and Prediction, Elsevier, 2003
Course outcomes:
Students will be able to:
• Understand performance terminology (PO d, PO e)
• Select proper workload and characterize the workload (PO a, PO d, PO e)
• Collect performance statistics, analyze the data and display results using monitors (PO a, PO d, PO e)
• Correctly design performance experiments (PO a, PO d, PO e)
• Perform computer system performance analysis using simulation (POa, POd, POe)
ADVANCED DATA MINING
Course Code : MSWEE4
Prerequisites: NIL
Credits : 4:0:1
Contact Hours : 56L + 28P
Course coordinator(s): Pushpalatha M N
Course objectives:
• Understand supervised learning: classification, regression.
• Understand algorithm-independent machine learning and association mining.
• Introduce concepts of unsupervised learning and clustering.
• Become familiar with Bayesian methods, MapReduce and Hadoop.
• Become familiar with advanced concepts and their application to software engineering, web and
text data.
Course Contents:
Unit I
Introduction & Supervised learning: Data Mining, Classification by Decision Tree
Induction, Rule-Based Classification, Classification by Back propagation, Support Vector
machine, Lazy Learners and other classification methods, Prediction, Accuracy and Error
measures, evaluating the accuracy of a classifier or predictor.
Unit II
Algorithm-independent machine learning and Association mining: Introduction,
Lack of Inherent superiority of any classifier, Resampling for classifier Design – Bagging,
Boosting. Mining frequent patterns and Association- Basic concepts, Efficient and
scalable frequent itemset mining methods and Mining various kinds of association rules.
Unit III
Unsupervised learning and clustering: Types of Data in Cluster Analysis, A
categorization of major clustering methods, Partitioning methods, Hierarchical methods,
Density-Based methods, Model-Based Clustering methods, Outlier Analysis.
Unit IV
Bayesian Decision Theory: Continuous Features, Minimum-Error-Rate Classification,
Classifiers, Discriminant Functions, and Decision Surfaces, The Normal Density,
Discriminant Functions for the Normal Density, Error Probabilities and Integrals, Error
Bounds for Normal Densities, Bayes Decision Theory – Discrete Features, MapReduce
and Hadoop.
Unit V
Advanced Concepts: Multidimensional Analysis and Descriptive mining of complex data
objects, Text Mining: Text data analysis and information retrieval, text mining
approaches, Analysis of World Wide Web, Social Impacts of Data Mining, Trends in data
mining, Application: Software Engineering.
Laboratory:
Give advanced understanding of
1. R, a programming language widely used for data analysis;
2. Weka, an environment which provides a collection of machine learning algorithms for
data mining tasks.
R:
1. R as a calculator. Using the help system.
2. Data input and output. Vectors, arrays and matrices.
3. Data visualization (including plots in 2 and 3 dimensions, scatter plots, barplots, histograms).
4. Implementing concepts from Linear Algebra and Statistics (including probability distributions, matrix decompositions).
5. Programming: loops, conditional execution, string manipulation, data structures,
etc.
6. Writing functions, debugging the code. Using packages and toolboxes.
WEKA:
1. Data Input: concepts, instances, attributes. Feature selection.
2. Using machine learning schemes (including decision trees, naive Bayes classifiers,
clustering methods).
3. Training and Testing, predicting generalization performance, cross-validation.
References:
• Jiawei Han, Micheline Kamber: Data Mining - Concepts and Techniques, 2nd Edition,
Morgan Kaufmann Publisher, 2006.
• Richard O. Duda, Peter E. Hart, and David G. Stork, "Pattern Classification", Second
edition, Wiley, New York, 2000.
• Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, An Introduction
to Statistical Learning with Applications in R, Springer, 2013.
• Norman Matloff, The Art of R Programming: A Tour of Statistical Software Design,
No Starch Press, 2011.
• Manoel Mendonca, Nancy L. Sunderhaft, “Mining Software Engineering Data: A Survey” A DACS State-of-the-Art Report.
Course outcomes:
Students will be able to:
• Explain data mining and implement, execute and evaluate different classification
methods and techniques. (PO a, PO b, PO e)
• Discuss algorithm-independent machine learning and association mining. (PO a, PO
b, PO e)
• Identify the appropriate clustering techniques for the given data sets. (PO a, PO b,
PO e)
• Discuss Bayesian Decision Theory, MapReduce and Hadoop. (PO e)
• Discuss advanced concepts and apply them to software engineering, web and
text data for extracting value and insight. (PO a, PO b, PO e)