DOWNLOADED FROM STUCOR APP PART – A Q.No. Question 1 Define clustering How can the initial number of clusters for k-means algorithm be 2 estimated 3 Can you Pick K in a K-Means Algorithm? 4 What are the problems faced if clustering exists in non-Euclidean Point out the conclusions drawn from choosing clustroid? 5 8 Define Bayes Theorem 9 Discuss the number of clusters? 10 What is diagnotics? Examine the use of object Attributes? 13 14 15 16 17 18 BTL5 BTL2 BTL4 Evaluating Understanding Analyzing BTL4 Analyzing BTL6 Creating BTL 1 Remembering What is unit of measure? BTL3 Applying BTL 1 Remembering Summarize about Rescaling? BTL2 Understanding What is metoids? What is Customer segmentation Describe the prediction trees Analyze on internal nodes and leaf nodes BTL2 Understanding BTL 1 Remembering CO R 12 Applying BTL2 Understanding BTL 1 Remembering U 11 BTL3 AP P 7 Compare and contrast the relationship between centroids and clustering Generalize the initialization of K-Means algorithm? 6 BT Level Competence BTL 1 Remembering What is purity of node Point out the CART BTL2 Understanding BTL4 Analyzing BTL 1 Remembering BTL4 Analyzing BTL3 Applying Explain the K-means clustering algorithm with an example.(13) BTL2 Understanding 2 i. ii. Define K-Means algorithm(6) Examine how the data is processed in BFR Algorithm(7) BTL1 Remembering 3 What are the main features of GRGPF Algorithm and explain it?(13) BTL1 Remembering 4 Summarize the hierarchical clustering in Euclidean and non-Euclidean Spaces with its efficiency?(13) BTL2 Understanding 5 Describe the various hierarchical methods of cluster analysis. (13) BTL2 Understanding ST 19 20 1 6 7 Illustrate Naïve Bayes PART – B Explain and list the different hierarchical clustering techniques and BTL4 explain any one. (13) i. Given a one dimensional dataset {1, 5, 8, 10, 2} use the BTL6 agglomerative clustering algorithms with the complete link with DOWNLOADED FROM STUCOR APP Analyzing Creating DOWNLOADED FROM STUCOR APP relationship. By using the maximal lifetime as the cutting threshold, how many clusters are there? What is their membership in each cluster? (6) ii. State and explain the clustering in non-Euclidean space with example. (7) Describe about Market-Basket model.(13) 9 10 11 12 13 3 4 Illustrate in detail about Decision tree in R.(13) Explain in detail about Naïve Bayes Theorem, Classifier, Smoothing and Diagnostics. (13) PART – C Use the k-means algorithm and Euclidean distance to cluster the following 8 examples into 3 clusters:A1=(2,10),A2=(2,5),A3=(8,4),A4=(5,8),A5=(7,5),A6=(6,4),A 7=(1,2),A8=(4,9).Suppose that the initial seeds(centers of each cluster) are A1,A4 and A7.Run the k-means algorithm for 1 epoch only. At the end of this epoch show (i) The new clusters (5) (ii) The centers of the new clusters (6) (iii) How many more iterations are needed to coverage? Draw the result for each epoch. (4) Develop decision tree with an example to predict whether customers will buy a product(15) Explain in detail about evaluate the decision tree algorithm(15) U 2 Generalize about general algorithm and decision tree algorithm. (13) ST 1 Explain in detail about the applications of clustering. (13) CO R 14 Demonstrate about the two clustering techniques with suitable example.(13) Illustrate about the clustering? Explain it with proper example.(13) Understanding BTL1 Remembering BTL3 Applying BTL1 Remembering BTL4 Analyzing BTL 6 Creating AP P 8 BTL2 Explain in detail about two methods of using the naïve Bayes classifier in R with examples.(15) UNIT III BTL3 Applying BTL4 Analyzing BTL5 Evaluating BTL 6 Creating BTL5 Evaluating BTL 6 Creating ASSOCIATION AND RECOMMENDATION SYSTEM Advanced Analytical Theory and Methods: Association Rules - Overview - Apriori Algorithm - Evaluation of Candidate Rules - Applications of Association Rules - Finding Association& finding similarity Recommendation System: Collaborative Recommendation- Content Based Recommendation - Knowledge Based Recommendation- Hybrid Recommendation Approaches. PART – A BT Q.No. Question Competence Level Define apriori algorithm 1 BTL1 Remembering State the use of Apriori algorithm in data mining 2 BTL1 Remembering 3 State market basket analysis DOWNLOADED FROM STUCOR APP BTL1 Remembering DOWNLOADED FROM STUCOR APP What is the logic behind association rule? BTL2 Understanding 5 What is Prune BTL2 Understanding 6 Define Confidence BTL1 Remembering 7 What do you mean by lift BTL2 Understanding 8 Show the advantage of leverage 9 10 What is frequent itemset generation Analyze the Validation and testing BTL3 BTL1 Applying Remembering BTL4 Analyzing 11 Demonstrate the approaches to improve Apriori efficiency BTL3 Applying 12 How are interesting rules identified? BTL4 Analyzing 13 Examine the broad classification of Recommendation systems? 14 Summarize the interesting rules distinguished from coincidental rules? What is utility matrix 15 AP P 4 BTL3 BTL 5 Applying Evaluating BTL4 Analyzing Examine about item profile Give the Content Based recommendation system BTL1 BTL 5 Remembering Evaluating 18 Explain collaborative filtering system Define knowledge based Recommendation BTL4 BTL2 Analyzing Understanding Give the definition Hybrid recommendation BTL2 Understanding 19 20 CO R 16 17 PART – B Explain the apriori algorithm for mining frequent item sets with an example. (13) 2 Illustrate how will you find Association Rules with High confidence(13) Evaluate Apriori Algorithm in detail for the following super market Scenarios(13) 4 5 6 abora ST 3 U 1 Transacti Onion Potato Burger Milk Gee on ID T1 1 1 1 0 0 T2 0 1 1 1 0 T3 0 0 0 1 1 T4 1 1 0 1 0 T5 1 1 1 0 1 T6 1 1 1 1 1 Describe the Recommendation systems? Clearly explain the two applications for Reccomdation systems.(13) Discuss in detail about any one Ranking algorithm used by Search Engines?(13) Explain Recommendation based on User Ratings using appropriate example.(13) tive filtering and content based systems. (13) DOWNLOADED FROM STUCOR APP BTL1 Remembering BTL3 Applying BTL5 Evaluating BTL1 Remembering BTL2 Understanding BTL1 Remembering BTL2 Understanding DOWNLOADED FROM STUCOR APP 8 9 10 11 12 13 14 Explain collaborative filtering based recommendation system. (13) Differentiate between lexical similarity and semantic similarity of documents.(13) Explain in detail about Frequent item set generation and rule generation. (13) Explain in detail about evaluation of candidate rule. (13) BTL2 Understanding BTL1 Remembering BTL1 Remembering Outline in detail about the application of association rule. (13) Generalize in detail about utility matrix and long tail. (13) BTL3 Applying BTL 6 Creating Explain in detail about discovering features of documents. (13) BTL3 Applying PART – C 2 3 4 AP P A database has five transactions. Let min sup = 60% and min conf=80% TID ITEMS T100 Milk, Onion, Nuts, Kiwi, Egg, Yoghurt T200 Dhal, Onion, Nuts, Kiwi, Egg, Yoghurt T300 Milk, Apple, Kiwi, Egg T400 Milk, Curd, Kiwi,Yoghurt T500 Curd, Onion, Kiwi, Ice cream,Egg Find all frequent item sets using Apriori method(15) Narrate in detail about a model for Recommendation system.(15) CO R 1 BTL6 Creating BTL 5 Evaluating Illustrates with an example the application of the Apriori algorithm to a relatively simple case that generalizes to those used in practice. Show how to use the Apriori algorithm to generate frequent item sets and rules and to BTL6 evaluate and visualize the rules.(15) Explain in detail about Hybrid and Knowledge based recommendation.(15) Analyzing STREAM MEMORY U UNIT IV BTL 4 Creating ST Introduction to Streams Concepts – Stream Data Model and Architecture - Stream Computing, Sampling Data in a Stream – Filtering Streams – Counting Distinct Elements in a Stream – Estimating moments – Counting oneness in a Window – Decaying Window – Real time Analytics Platform(RTAP) applications - Case Studies Real Time Sentiment Analysis, Stock Market Predictions. Using Graph Analytics for Big Data: Graph Analytics PART – A Q.No. Question Differentiate between data stream mining and traditional data mining 1 BT BTL4 Competence Analyzing BTL3 Applying 3 Illustrate examples can you find for stream sources? How are moments estimated? BTL2 Understanding 4 List out the applications of data stream. BTL4 Analyzing 5 Compute the surprise number (second moment) for the stream 3, 1, 4, 1, BTL1 3, 4, 2, 1, 2. What is the third moment of this stream? Define decaying window BTL1 Remembering Outline the need for sampling data in a stream Analyze the term filtering a data stream. BTL5 Evaluating BTL3 Applying 2 6 7 8 9 DOWNLOADED FROM STUCOR APP Remembering DOWNLOADED FROM STUCOR APP 10 Give the advantages of the algorithm used in estimating moments BTL1 Remembering 11 Why stream data systems? Why do you think data stream management is relevant in data mining? BTL2 BTL2 Understanding Understanding How oneness is counted in window BTL5 Evaluating 13 14 15 16 17 18 19 20 What would result if the cost of exact counts doesn‟t match? BTL3 Illustrate how would you show your understanding of Market-Basket BTL3 Data? What is sentiment analysis? BTL1 Compare and contrast RTAP (Real Time Analytics Platform) and RTSA BTL5 (Real Time Sentiment Analysis)? Prove by induction on m that 1+3+5+· · · +(2m−1) = m2 BTL4 List any 4 online tool to perform sentiment analysis BTL1 AP P 12 Generalize information would you use to substitute the view of streams over databases? Applying Applying Remembering Evaluating Analyzing Remembering BTL6 Creating What are streams? Explain stream data model with its architecture.(13) BTL1 Remembering What is decaying window? briefly explain it with an example (13) BTL1 Remembering i. Write a short note on sampling in Data Streams.(7) ii. What are the applications of data stream.(6) i. List some common online tools used to perform sentiment analysis.(6) ii. What do you understand by sentiment analysis?(7) BTL2 Understanding BTL1 Remembering Explain any one algorithm to count number of distinct elements in a data stream. (13) Describe about Stream clustering and parallel clustering. (13) BTL2 Understanding BTL1 Remembering Discuss in detail about characteristics of a social network as a graph. (13) BTL3 Applying BTL6 Creating Discuss the concept of decaying window in detail.(13) BTL3 Applying i. BTL2 Understanding BTL4 Analyzing Show how the mining concept used in real time sentiment analysis? (13) BTL3 Applying How is sentiment analysis playing a major role in data mining? (13) Analyzing PART – B CO R 1 2 3 4 U 5 6 8 ST 7 i. With a neat sketch, explain the architecture of data-stream management system.(6) ii. Outline the algorithm used for counting distinct elements in a data stream.(7) 9 10 11 12 13 Explain in detail about how data analysis used in Stock Market Predictions(7) ii. Describe in detail about the usage of data analysis in Weather forecasting predictions. (6) Explain the concept of Bloom Filter with an example.(13) DOWNLOADED FROM STUCOR APP BTL4 DOWNLOADED FROM STUCOR APP 14 1 2 Explain in detail about Alon-Matias-Szegedy algorithm for second moments. (13) PART – C How does the Big Data Stream Analytics Framework (BDSAF)works and explain with a neat architecture diagram (15) Taking stock market preconditions as a case study, elaborate on the Real-time Analytics Platform (RTAP). Present the assumptions mode. (15) 4 UNIT V Applying BTL6 Creating BTL6 Creating BTL5 Evaluating BTL5 Evaluating AP P i. Classify approaches would you use to estimate the moments?(7) ii. Examine is the function cost of exact counts?(8) Can you identify the phases involved in real time data analyticsdeployment to production? Analyze. (15) 3 BTL3 NOSQL DATA MANAGEMENT FOR BIG DATA AND VISUALIZATION ST U CO R NoSQL Databases : Schema-less Models‖: Increasing Flexibility for Data Manipulation-Key Value StoresDocument Stores - Tabular Stores - Object Data Stores - Graph Databases Hive - Sharding –- Hbase – Analysing big data with twitter - Big data for E-Commerce Big data for blogs - Review of Basic Data Analytic Methods using R. PART – A Q.No. Question BT Competence What is NoSQL database 1 BTL1 Remembering What is Key Value data store? 2 BTL1 Remembering 3 Compare document store vs Key value store BTL1 Remembering 4 BTL1 Remembering Provide your own definition of what big data means to your organization? Outline the sharding? 5 BTL 5 Creating 6 Identify three “big data” sources either within or external to your BTL2 Understanding organization that would be relevant to your business Define Tabular store 7 BTL2 Understanding Summarize the features of Hive. 8 BTL4 Analyzing 9 10 What is Hive in Big data? List out any three business challenges in an organization 11 Point out the aspects of adopting big data techniques 12 Figure out the process of validating big data 13 14 Name the dimensions for measuring the quality of information used for big data analytics Define object data stores. 15 16 Justify how twitter data is useful for analyzing big data. Point out top data analytic tools. 17 18 Define R. What is a Graph database? 19 Enumerate the term Graph Analytics. DOWNLOADED FROM STUCOR APP BTL4 Analyzing BTL3 Applying BTL3 Applying BTL3 BTL4 Applying Analyzing BTL1 BTL2 Remembering Understanding BTL5 BTL1 Evaluating Remembering BTL1 Remembering BTL5 Evaluating DOWNLOADED FROM STUCOR APP 20 Describe a pilot application for graph analytics BTL4 Analyzing BTL3 Applying PART – B List the classification of NoSQL Databases and explain about Key Value Stores. (13) 1 2 3 8 9 10 12 13 4 Remembering Applying With suitable examples differentiate the applications, structure, working and usage of different NoSQL databases (13) BTL2 Understanding Differences between SQL Vs NoSQL explain it with suitable example. (13) i. Explain Hive Architecture(6) ii. Write down the features of Hive(7) Explain the types of NoSQL data stores in detail. (13) Discuss in detail about the characteristics of NoSQL databases(13) BTL2 Understanding BTL2 Understanding BTL3 Applying BTL4 BTL1 Analyzing Remembering BTL4 Analyzing BTL6 Creating BTL4 Analyzing BTL5 Evaluating BTL5 Evaluating BTL 6 Creating Explain in detail about Market and Business drives for Big data Analytics. (13) What is HBase? Give detailed note on features of HBASE(13) i. What is the purpose of sharding? (6) ii. Explain the process of sharding in MongoDB. (7) PART – C Analyze the use of Hive. How does Hive interact with Hadoop explain in detail.(15) Formulate how big data analytics helps business people to increase their revenue. Discuss with any one real time application.(15) Draw insights out of any one visualization tool.(15) ST 14 3 BTL1 BTL3 CO R 7 2 Explain about Graph databases and descriptive Statistics(13) Write short notes on i. NoSQL Databases and its types(7) ii. Illustrate in detail about Hive data manipulation, queries, data definition and data types(6) U 6 1 Remembering AP P 4 5 BTL1 Describe the system architecture and components of Hive and Hadoop(13) What is NoSQL? What are the advantages of NoSQL? Explain the types BTL1 of NoSQL databases. (13) Explain in detail about brief history of NoSQL. Explain in detail about ACID vs. BASE.(15) DOWNLOADED FROM STUCOR APP Remembering ST U CO R AP P DOWNLOADED FROM STUCOR APP DOWNLOADED FROM STUCOR APP