METU Computer Engineering Ceng530

METU
Computer Engineering
Ceng574
Assignment 1:
Dataset selection and presentation
Seeds DATA SET
Student:
1714443 Alperen Eroğlu
22.10.2012
Instructor:
Prof.Dr. Volkan ATALAY
Outline
1. Name
2. Origin
3. Short Description
4. Dimension of feature vector (number of attributes)
5. Number of classes or groups
6. Number of samples (objects)
7. Explanation of each attribute: label, explanation, type, min, max, mean, std deviation
8. Goal
9. Any previous work on this dataset
Name and Origin
Seeds DATA SET
Source:
Małgorzata Charytanowicz, Jerzy Niewczas
Institute of Mathematics and Computer Science,
The John Paul II Catholic University of Lublin, Konstantynów 1 H,
PL 20-708 Lublin, Poland
e-mail: {mchmat,jniewczas}@kul.lublin.pl
Piotr Kulczycki, Piotr A. Kowalski, Szymon Lukasik, Slawomir Zak
Department of Automatic Control and Information Technology,
Cracow University of Technology, Warszawska 24, PL 31-155 Cracow, Poland
and
Systems Research Institute, Polish Academy of Sciences, Newelska 6,
PL 01-447 Warsaw, Poland
e-mail: {kulczycki,pakowal,slukasik,slzak}@ibspan.waw.pl
Short Description
● Measurements of geometrical properties of kernels belonging to three different varieties of wheat: Kama, Rosa and Canadian.
● A soft X-ray technique and the GRAINS package were used to construct all seven real-valued attributes.
● The data set can be used for classification and cluster analysis tasks.
Dimension of feature vector
● The number of attributes in this dataset is 7.
● Seven geometric parameters of wheat kernels were measured.
● All of these parameters are real-valued and continuous.
Number of classes and samples
● The number of classes in this dataset is 3, one per wheat variety: Kama, Rosa and Canadian.
● The number of instances in this dataset is 210.
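The class and sample counts above can be checked directly once the data file is loaded. The sketch below is a minimal example under stated assumptions: each row holds the seven attributes followed by an integer class label, separated by whitespace, and the labels 1, 2, 3 map to Kama, Rosa and Canadian. Both the layout and the label coding are assumptions to verify against the actual file. The function names and the two demo rows are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Assumed label coding for the three varieties; verify against the data file.
VARIETIES = {1: "Kama", 2: "Rosa", 3: "Canadian"}


def parse_seeds(text):
    """Parse whitespace-separated rows: 7 real attributes, then a class label."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 8:
            raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
        features = [float(x) for x in fields[:7]]
        label = int(fields[7])
        samples.append((features, label))
    return samples


def class_counts(samples):
    """Count how many samples fall into each class label."""
    return Counter(label for _, label in samples)


# Two made-up rows for illustration only (not taken from the dataset).
demo = "15.0\t14.5\t0.88\t5.6\t3.3\t2.2\t5.2\t1\n12.0 13.2 0.85 5.2 2.9 4.5 5.0 3\n"
rows = parse_seeds(demo)
print(len(rows), {VARIETIES[k]: v for k, v in class_counts(rows).items()})
```

Run on the full file, `len(rows)` should come out as 210 and `class_counts(rows)` should have three keys, if the layout assumptions hold.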
Explanation of each attribute (1)
1. area A,
2. perimeter P,
3. compactness C = 4*pi*A/P^2,
4. length of kernel,
5. width of kernel,
6. asymmetry coefficient,
7. length of kernel groove.
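Attribute 3 is derived from attributes 1 and 2, so it can be recomputed as a consistency check. The sketch below implements C = 4*pi*A/P^2 as given in the list; the function name `compactness` and the two test shapes are chosen here for illustration. A circle gives C = 1, and any other shape gives a smaller value.

```python
import math


def compactness(area, perimeter):
    """Return C = 4*pi*A / P^2; equals 1 for a circle and is smaller otherwise."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


# Sanity check on a circle of radius 2 (A = pi*r^2, P = 2*pi*r).
r = 2.0
circle_c = compactness(math.pi * r ** 2, 2.0 * math.pi * r)

# A 1 x 1 square: A = 1, P = 4, so C = pi / 4.
square_c = compactness(1.0, 4.0)
print(circle_c, square_c)
```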
Explanation of each attribute (2)
LABEL  TYPE  MIN     MAX     MEAN         STD. DEV.
1      REAL  10.59   21.18   14.8475238   2.90276331
2      REAL  12.41   17.25   14.5592857   1.30284559
3      REAL  0.8081  0.9183  0.87099857   0.02357309
4      REAL  4.899   6.675   5.62853333   0.44200731
5      REAL  2.63    4.033   3.25860476   0.37681405
6      REAL  0.7651  8.456   3.70020095   1.49997296
7      REAL  4.519   6.55    5.40807143   0.23603043
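The per-attribute statistics can be recomputed from the raw rows. The sketch below is one way to do it, assuming the population form of the standard deviation (divide by n); the slides do not say which form was used, so that choice is an assumption. The function names and the demo rows are hypothetical and are not dataset values.

```python
import math


def summarize(values):
    """Return (min, max, mean, population std. dev.) of a non-empty list."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # divide by n, not n - 1
    return min(values), max(values), mean, math.sqrt(variance)


def summarize_columns(rows):
    """Summarize each attribute (column) of a list of equal-length rows."""
    return [summarize(list(col)) for col in zip(*rows)]


# Made-up rows with two attributes, for illustration only.
demo_rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
for label, (lo, hi, mean, std) in enumerate(summarize_columns(demo_rows), start=1):
    assert lo <= mean <= hi  # a mean always lies between min and max
    print(label, lo, hi, round(mean, 4), round(std, 4))
```

The `lo <= mean <= hi` check is a cheap way to catch a value copied into the wrong row of a hand-built table.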
Goal
The goal is to predict which of the three wheat varieties a kernel belongs to: Kama, Rosa or Canadian.
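As an illustration of what such a prediction could look like, the sketch below uses a nearest-centroid rule: average the attribute vectors of each variety, then assign a new kernel to the variety with the closest average. This is only one simple baseline, chosen here for brevity; it is not the method used in the referenced work. The function names and the two-attribute training samples are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(features, labels):
    """Compute the per-class mean vector from training samples."""
    groups = defaultdict(list)
    for x, y in zip(features, labels):
        groups[y].append(x)
    return {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in groups.items()
    }


def predict(centroids, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


# Made-up two-attribute samples, for illustration only (not dataset values).
train_x = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [4.8, 5.2]]
train_y = ["Kama", "Kama", "Rosa", "Rosa"]
centroids = fit_centroids(train_x, train_y)
print(predict(centroids, [1.1, 0.9]), predict(centroids, [5.1, 4.9]))
```

Because the seven attributes have different ranges, standardizing each attribute before computing distances would likely be a sensible preprocessing step.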
Previous Works
Relevant Papers:
● M. Charytanowicz, J. Niewczas, P. Kulczycki, P.A. Kowalski, S. Lukasik, S. Zak, 'A Complete Gradient Clustering Algorithm for Features Analysis of X-ray Images', in: Information Technologies in Biomedicine, Ewa Pietka, Jacek Kawa (eds.), Springer-Verlag, Berlin-Heidelberg, 2010, pp. 15-24.
Citation Request:
● Contributors gratefully acknowledge support of their work by the Institute of Agrophysics of the Polish Academy of Sciences in Lublin.
References
● http://archive.ics.uci.edu/ml/datasets/seeds
THANKS...