Model Answer for Homework 1

Majmaah University
College of Science in Zolfi
Computer Science Bridging Program
Data Mining
Model Answer of Homework (1)
Dept. of Computer Science & Info.
6-1-1437
1- Why data mining (scientific view)?
Huge amounts of data are collected and stored at enormous speeds (GB/hour).
2- Data mining may help scientists:
– in classifying and segmenting data
– in hypothesis formation
1. ___b___ is a non-trivial extraction of implicit, previously unknown and potentially useful
information from data.
a) Data warehousing
b) Data mining
c) Text mining
d) Data selection
2. ___b___ is an essential process where intelligent methods are applied to extract data
patterns; it is also referred to as knowledge discovery in databases.
a) Data warehousing
b) Data mining
c) Text mining
d) Data selection
3. Two fundamental goals of data mining are ___c___.
a) Analysis and Description
b) Data cleaning and organizing the data
c) Prediction and Description
d) Data cleaning and organizing the data
4. ___b___ is the process of finding a model that describes and distinguishes data
classes or concepts.
a) Data Characterization
b) Data Classification
c) Data clustering
d) Data selection
5. A cluster is ___a___.
a) A group of similar objects that differ significantly from other objects
b) Operations on a database to transform or simplify data in order to prepare it for a
machine-learning algorithm
c) Symbolic representation of facts or ideas from which information can potentially be
extracted
d) None of these
6. In the clustering algorithm, the distance between the cluster centroid and each object is
calculated using the ___a___ method.
a) Euclidean distance
b) Clustering distance
c) Central distance
d) Cluster
7. The classification task refers to ___a___.
a) A subdivision of a set of examples into a number of classes
b) A measure of the accuracy of the classification of a concept that is given by a certain
theory
c) The task of assigning a classification to a set of examples
d) None of these
3- What is data mining and what is not data mining?
a) Look up a phone number in a phone directory (Not Data Mining)
b) Query a Web search engine for information about “Amazon” (Not Data Mining)
c) Certain names are more prevalent in certain US locations (Data Mining)
d) Group together similar documents returned by a search engine according to their
context (Data Mining)
4- Draws ideas from machine learning/AI, pattern recognition, statistics, and data
mining?
[Figure labels: Statistics/AI; Machine learning and Pattern recognition; Data Mining]
5- What are data mining tasks?
A) Prediction methods:
– Classification
– Regression
– Deviation Detection
B) Description methods:
– Clustering
– Association Rule Discovery
– Sequential Pattern Discovery
C) What are the main steps to extract knowledge/information from data?
1- Data (input problem)
2- Selection (selected data)
3- Preprocessing (preprocessed data)
4- Transformation (transformed data)
5- Data mining (patterns)
6- Interpretation/Evaluation (knowledge)
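The six steps above can be sketched as a chain of small functions. This is only an illustration: the records, the function names, and the "pattern" (mean income per age group) are all hypothetical and not part of the homework.

```python
# Hypothetical toy records: (age, income); None marks a missing value.
raw_data = [(25, 30000), (47, None), (35, 52000), (52, 61000), (23, 28000)]

def select(records):
    # Selection: keep the records relevant to the task (here, all of them).
    return list(records)

def preprocess(records):
    # Preprocessing: drop records that contain a missing value.
    return [r for r in records if None not in r]

def transform(records):
    # Transformation: rescale income to thousands.
    return [(age, income / 1000) for age, income in records]

def mine(records):
    # Data mining: a trivial "pattern", the mean income per age group.
    groups = {}
    for age, income in records:
        key = "under_40" if age < 40 else "40_plus"
        groups.setdefault(key, []).append(income)
    return {k: sum(v) / len(v) for k, v in groups.items()}

def interpret(patterns):
    # Interpretation/Evaluation: report the group with the highest mean income.
    return max(patterns, key=patterns.get)

patterns = mine(transform(preprocess(select(raw_data))))
highest_group = interpret(patterns)
```

Each function consumes the output of the previous step, which mirrors the order data, selected data, preprocessed data, transformed data, patterns, knowledge.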
D) This figure is a model of ___Classification___.
[Figure: Training Set, Learn Classifier, Test Set, Model]
– This application is on the image (star or galaxy):
  – Segment the image (star/galaxy).
  – Measure the image attributes (features).
  – Model the prediction class (star or galaxy).
  – Success story: could find some new stars or galaxies.
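The training set / learn classifier / model / test set flow can be sketched with a very simple classifier. The single feature (elongation), its values, and the threshold rule are assumptions made for illustration. They are not the method used in the star/galaxy application.

```python
# Hypothetical one-feature data: (elongation, label).
training_set = [(1.0, "star"), (1.2, "star"), (3.0, "galaxy"), (3.4, "galaxy")]
test_set = [(1.1, "star"), (3.1, "galaxy"), (1.9, "star")]

def learn_classifier(examples):
    # Learn step: place a threshold halfway between the two class means.
    stars = [x for x, label in examples if label == "star"]
    galaxies = [x for x, label in examples if label == "galaxy"]
    return (sum(stars) / len(stars) + sum(galaxies) / len(galaxies)) / 2

def predict(threshold, x):
    # Model: values below the threshold are called stars, the rest galaxies
    # (assumes stars have the smaller feature values, as in the toy data).
    return "star" if x < threshold else "galaxy"

threshold = learn_classifier(training_set)   # model learned from the training set
correct = sum(predict(threshold, x) == label for x, label in test_set)
accuracy = correct / len(test_set)           # model evaluated on the test set
```

The model is learned only from the training set. The test set is held back to measure how well the model predicts classes for unseen objects.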
E) Define data clustering.
Data clustering groups objects into clusters; a cluster is a group of similar objects that differ significantly from other objects.
F) Define regression.
Regression predicts the value of a given variable based on the values of other
variables, assuming a linear or nonlinear model of dependency.
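As a small illustration of the linear case, the sketch below fits a straight line by ordinary least squares and predicts a new value. The data points and the function name fit_line are hypothetical.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # predict y for a new x = 5
```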
G) What are the similarity measures of clustering?
The similarity measure used for clustering is Euclidean distance.
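A minimal sketch of how Euclidean distance is used in clustering: each object is assigned to the centroid it is closest to. The centroids and objects below are hypothetical values chosen for illustration.

```python
import math

def euclidean_distance(p, q):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical cluster centroids and objects in two dimensions.
centroids = [(1.0, 1.0), (5.0, 5.0)]
objects = [(0.0, 0.0), (4.0, 6.0), (2.0, 1.0)]

# For each object, the index of the nearest centroid.
assignments = [
    min(range(len(centroids)), key=lambda k: euclidean_distance(obj, centroids[k]))
    for obj in objects
]
```

A smaller distance means the object is more similar to that cluster's centroid.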
H) What are the challenges of data mining?
1. Scalability
2. Dimensionality
3. Complex and Heterogeneous Data
4. Data Quality
5. Data Ownership and Distribution
6. Privacy Preservation
7. Streaming Data