CS 595 Knowledge Discovery and Data Mining
Assignment #1

Evaluation Report for WEKA (Waikato Environment for Knowledge Analysis)

Presented by: Manoj Wartikar, Sameer Sagade
Date: 14th March, 2000

The Weka Machine Learning Project

Machine Learning:

An exciting and potentially far-reaching development in contemporary computer science is the invention and application of methods of machine learning. These enable a computer program to automatically analyze a large body of data and decide what information is most relevant. This crystallized information can then be used to help people make decisions faster and more accurately.

One of the central problems of the information age is dealing with the enormous explosion in the amount of raw information that is available. Machine learning (ML) has the potential to sift through this mass of information and convert it into knowledge that people can use. So far, however, it has been used mainly on small problems under well-controlled conditions. The aim of the Weka Project is to bring the technology out of the laboratory and provide solutions that can make a difference to people. The overall goal of this research programme is to build a state-of-the-art facility for the development of ML techniques.

Objectives:

The team at Waikato has incorporated several standard ML techniques into a software "workbench" called WEKA (Waikato Environment for Knowledge Analysis). With WEKA, a specialist in a particular field is able to use ML to derive useful knowledge from databases that are far too large to be analyzed by hand. The main objectives of WEKA are to:

1. Make machine learning (ML) techniques generally available.
2. Apply them to practical problems, such as those in agriculture.
3. Develop new machine learning algorithms.
4. Design a theoretical framework for the field.

Documented Features:

WEKA presents a collection of algorithms for solving real-world data mining problems. The software is written in Java 2 and includes a uniform interface to the standard techniques in machine learning. The following data mining techniques are implemented in WEKA:

1. Attribute selection.
2. Clustering.
3. Classifiers (both numeric and non-numeric).
4. Association rules.
5. Filters.
6. Estimators.

Of these, only classifiers, association rules, and filters are available as direct executables; all the remaining functions are available as APIs. The data required by the software must be in the ".arff" (ARFF) format. Sample databases are provided with the software.

Features:

The WEKA package consists of a number of classes related by inheritance; to execute a technique, we create an instance of the corresponding class. The functionality of WEKA is organized around the steps of the machine learning process.

Classifiers:

Each classifier prints out its model (for example, a decision tree) for the dataset given as input, and a ten-fold cross-validation estimate of its performance is also calculated. The classifiers package implements the most common techniques separately for categorical and numerical prediction.
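Because the classifier classes are also exposed through the Java class library, the same runs can be driven programmatically. The following is a minimal sketch only, assuming the WEKA 3.x class library (the Instances, Evaluation, and J48 classes and the method signatures shown); names and signatures should be checked against the actual release:

import java.io.BufferedReader;
import java.io.FileReader;
import weka.core.Instances;
import weka.classifiers.Evaluation;
import weka.classifiers.j48.J48;

public class J48Demo {
    public static void main(String[] args) throws Exception {
        // Load the ARFF file; the class attribute is the last one.
        Instances data = new Instances(
            new BufferedReader(new FileReader("data/iris.arff")));
        data.setClassIndex(data.numAttributes() - 1);

        // Create an instance of the classifier class and train it
        // on the full dataset.
        J48 tree = new J48();
        tree.buildClassifier(data);
        System.out.println(tree);  // prints the pruned tree

        // The command-line behaviour (training error plus ten-fold
        // cross-validation) is reachable through the Evaluation class:
        System.out.println(Evaluation.evaluateModel(
            new J48(), new String[] {"-t", "data/iris.arff"}));
    }
}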
a) Classifiers for categorical prediction:

1. weka.classifiers.IBk - k-nearest neighbour learner
2. weka.classifiers.j48.J48 - C4.5 decision trees
3. weka.classifiers.j48.PART - rule learner
4. weka.classifiers.NaiveBayes - naive Bayes with/without kernels
5. weka.classifiers.OneR - Holte's OneR
6. weka.classifiers.KernelDensity - kernel density classifier
7. weka.classifiers.SMO - support vector machines
8. weka.classifiers.Logistic - logistic regression
9. weka.classifiers.AdaBoostM1 - AdaBoost
10. weka.classifiers.LogitBoost - LogitBoost
11. weka.classifiers.DecisionStump - decision stumps (for boosting)

Sample Executions of the Categorical Classifier Algorithms:

K Nearest Neighbour Algorithm:

> java weka.classifiers.IBk -t data/iris.arff

IB1 instance-based classifier using 1 nearest neighbour(s) for classification

=== Error on training data ===

Correctly Classified Instances         150              100      %
Incorrectly Classified Instances         0                0      %
Mean absolute error                      0.0085
Root mean squared error                  0.0091
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 50  0 |  b = Iris-versicolor
  0  0 50 |  c = Iris-virginica

=== Stratified cross-validation ===

Correctly Classified Instances         144               96      %
Incorrectly Classified Instances         6                4      %
Mean absolute error                      0.0356
Root mean squared error                  0.1618
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 47  3 |  b = Iris-versicolor
  0  3 47 |  c = Iris-virginica

J48 Pruned Tree Algorithm:

> java weka.classifiers.j48.J48 -t data/iris.arff

J48 pruned tree
---------------

petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6
|   petalwidth <= 1.7
|   |   petallength <= 4.9: Iris-versicolor (48.0/1.0)
|   |   petallength > 4.9
|   |   |   petalwidth <= 1.5: Iris-virginica (3.0)
|   |   |   petalwidth > 1.5: Iris-versicolor (3.0/1.0)
|   petalwidth > 1.7: Iris-virginica (46.0/1.0)

Number of Leaves  : 5
Size of the tree  : 9

=== Error on training data ===

Correctly Classified Instances         147               98      %
Incorrectly Classified Instances         3                2      %
Mean absolute error                      0.0233
Root mean squared error                  0.108
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 49  1 |  b = Iris-versicolor
  0  2 48 |  c = Iris-virginica

=== Stratified cross-validation ===

Correctly Classified Instances         143               95.3333 %
Incorrectly Classified Instances         7                4.6667 %
Mean absolute error                      0.0391
Root mean squared error                  0.1707
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 49  1  0 |  a = Iris-setosa
  0 47  3 |  b = Iris-versicolor
  0  3 47 |  c = Iris-virginica

=== Error on training data ===

Correctly Classified Instances         144               96      %
Incorrectly Classified Instances         6                4      %
Mean absolute error                      0.0324
Root mean squared error                  0.1495

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 48  2 |  b = Iris-versicolor
  0  4 46 |  c = Iris-virginica

SMO (support vector machines) and Logistic (logistic regression) can handle only two-class datasets, so they are not evaluated here. AdaBoostM1, LogitBoost, and DecisionStump are meta-level algorithms that boost the performance of the other classifiers: the boosted algorithm is run inside the booster, which monitors its errors on each round and reweights the training instances accordingly.
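As an illustration of this wrapper pattern (an invocation we sketch here rather than one taken from our runs, and assuming the -W switch that WEKA's meta-classifiers use to name the base classifier), a boosted decision stump would be run as:

> java weka.classifiers.AdaBoostM1 -W weka.classifiers.DecisionStump -t data/iris.arff

This would run AdaBoostM1 with decision stumps as the weak learner on the iris data, printing the same training and cross-validation summaries as the runs above.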
b) Classifiers for numerical prediction:

1. weka.classifiers.LinearRegression - linear regression
2. weka.classifiers.m5.M5Prime - model trees
3. weka.classifiers.IBk - k-nearest neighbour learner
4. weka.classifiers.LWR - locally weighted regression
5. weka.classifiers.RegressionByDiscretization - uses categorical classifiers

Sample Executions of the Numerical Classifier Algorithms:

Linear Regression Model:

> java weka.classifiers.LinearRegression -t data/cpu.arff

Linear Regression Model

class =
  -152.7641 * vendor=microdata,formation,prime,harris,dec,wang,perkinelmer,nixdorf,bti,sratus,dg,burroughs,cambex,magnuson,honeywell,ipl,ibm,cdc,ncr,basf,gould,siemens,nas,adviser,sperry,amdahl +
   141.8644 * vendor=formation,prime,harris,dec,wang,perkinelmer,nixdorf,bti,sratus,dg,burroughs,cambex,magnuson,honeywell,ipl,ibm,cdc,ncr,basf,gould,siemens,nas,adviser,sperry,amdahl +
   -38.2268 * vendor=burroughs,cambex,magnuson,honeywell,ipl,ibm,cdc,ncr,basf,gould,siemens,nas,adviser,sperry,amdahl +
    39.4748 * vendor=cambex,magnuson,honeywell,ipl,ibm,cdc,ncr,basf,gould,siemens,nas,adviser,sperry,amdahl +
   -39.5986 * vendor=honeywell,ipl,ibm,cdc,ncr,basf,gould,siemens,nas,adviser,sperry,amdahl +
    21.4119 * vendor=ipl,ibm,cdc,ncr,basf,gould,siemens,nas,adviser,sperry,amdahl +
   -41.2396 * vendor=gould,siemens,nas,adviser,sperry,amdahl +
    32.0545 * vendor=siemens,nas,adviser,sperry,amdahl +
  -113.6927 * vendor=adviser,sperry,amdahl +
   176.5204 * vendor=sperry,amdahl +
   -51.2583 * vendor=amdahl +
     0.0616 * MYCT +
     0.0171 * MMIN +
     0.0054 * MMAX +
     0.6654 * CACH +
    -1.4159 * CHMIN +
     1.5538 * CHMAX +
   -41.4854

=== Error on training data ===

Correlation coefficient                  0.963
Mean absolute error                     28.4042
Root mean squared error                 41.6084
Relative absolute error                 32.5055 %
Root relative squared error             26.9508 %
Total Number of Instances              209

=== Cross-validation ===

Correlation coefficient                  0.9328
Mean absolute error                     35.014
Root mean squared error                 55.6291
Relative absolute error                 39.9885 %
Root relative squared error             35.9513 %
Total Number of Instances              209

Pruned Training Model Tree:

> java weka.classifiers.m5.M5Prime -t data/cpu.arff

Pruned training model tree:

MMAX <= 14000 : LM1 (141/4.18%)
MMAX >  14000 : LM2 (68/51.8%)

Models at the leaves:

  Smoothed (complex):

  LM1: class = 4.15
       2.05vendor=honeywell,ipl,ibm,cdc,ncr,basf,gould,siemens,nas,adviser,sperry,amdahl
       + 5.43vendor=adviser,sperry,amdahl - 5.78vendor=amdahl
       + 0.00638MYCT + 0.00158MMIN + 0.00345MMAX
       + 0.552CACH + 1.14CHMIN + 0.0945CHMAX

  LM2: class = -113
       56.1vendor=honeywell,ipl,ibm,cdc,ncr,basf,gould,siemens,nas,adviser,sperry,amdahl
       + 10.2vendor=adviser,sperry,amdahl - 10.9vendor=amdahl
       + 0.012MYCT + 0.0145MMIN + 0.0089MMAX
       + 0.808CACH + 1.29CHMAX

Number of Leaves : 2

=== Error on training data ===

Correlation coefficient                  0.9853
Mean absolute error                     13.4072
Root mean squared error                 26.3977
Relative absolute error                 15.3431 %
Root relative squared error             17.0985 %
Total Number of Instances              209

=== Cross-validation ===

Correlation coefficient                  0.9767
Mean absolute error                     13.1239
Root mean squared error                 33.4455
Relative absolute error                 14.9884 %
Root relative squared error             21.6147 %
Total Number of Instances              209

K Nearest Neighbour Classifier Algorithm:

> java weka.classifiers.IBk -t data/cpu.arff

IB1 instance-based classifier using 1 nearest neighbour(s) for classification

=== Error on training data ===

Correlation coefficient                  1
Mean absolute error                      0
Root mean squared error                  0
Relative absolute error                  0      %
Root relative squared error              0      %
Total Number of Instances              209

=== Cross-validation ===

Correlation coefficient                  0.9475
Mean absolute error                     20.8589
Root mean squared error                 53.8162
Relative absolute error                 23.8223 %
Root relative squared error             34.7797 %
Total Number of Instances              209
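The IB1 header above reflects IBk's default of a single nearest neighbour, which also explains the perfect (and optimistic) fit on the training data. The number of neighbours can be raised from the command line; assuming IBk's documented -K switch (an assumption worth checking against the release's option listing), a three-neighbour run would be invoked as:

> java weka.classifiers.IBk -t data/cpu.arff -K 3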
Locally Weighted Regression:

> java weka.classifiers.LWR -t data/cpu.arff

Locally weighted regression
===========================
Using linear weighting kernels
Using all neighbours

=== Error on training data ===

Correlation coefficient                  0.9967
Mean absolute error                      8.9683
Root mean squared error                 12.6133
Relative absolute error                 10.2633 %
Root relative squared error              8.1699 %
Total Number of Instances              209

=== Cross-validation ===

Correlation coefficient                  0.9808
Mean absolute error                     14.9006
Root mean squared error                 31.0836
Relative absolute error                 17.0176 %
Root relative squared error             20.0884 %
Total Number of Instances              209

Regression by Discretization:

> java weka.classifiers.RegressionByDiscretization -t data/cpu.arff -W weka.classifiers.IBk

(The -W switch selects the categorical subclassifier.)

Regression by discretization

Class attribute discretized into 10 values

Subclassifier: weka.classifiers.IBk
IB1 instance-based classifier using 1 nearest neighbour(s) for classification

=== Error on training data ===

Correlation coefficient                  0.9783
Mean absolute error                     32.0353
Root mean squared error                 35.6977
Relative absolute error                 36.6609 %
Root relative squared error             23.1223 %
Total Number of Instances              209

=== Cross-validation ===

Correlation coefficient                  0.9244
Mean absolute error                     41.5572
Root mean squared error                 64.7253
Relative absolute error                 47.4612 %
Root relative squared error             41.8299 %
Total Number of Instances              209

Association Rules:

Association rule mining finds interesting association or correlation relationships among a large set of data items. With massive amounts of data continuously being collected and stored in databases, many industries are becoming interested in mining association rules from their databases. For example, the discovery of interesting association relationships among huge amounts of business transaction records can help catalog design, cross-marketing, loss-leader analysis, and other business decision-making processes.

A typical example of association rule mining is market basket analysis. This process analyzes customer buying habits by finding associations between the different items that customers place in their "shopping baskets". The discovery of such associations can help retailers develop marketing strategies by gaining insight into which items are frequently purchased together. For instance, if customers are buying milk, how likely are they to also buy bread (and what kind of bread) on the same trip to the supermarket? Such information can lead to increased sales.

The WEKA software efficiently produces association rules for the given dataset. The Apriori algorithm is the foundation of the package: it generates all the large (frequent) itemsets and then derives the rules that meet the specified minimum support and confidence.
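For reference, the two thresholds Apriori filters on are the standard textbook quantities (these definitions are not taken from the WEKA documentation). For a rule A => B over N transactions:

\mathrm{support}(A \Rightarrow B) = \frac{|\{\,t : A \cup B \subseteq t\,\}|}{N},
\qquad
\mathrm{confidence}(A \Rightarrow B) = \frac{\mathrm{support}(A \Rightarrow B)}{\mathrm{support}(A)}

In the run below, itemsets are kept only if their support is at least 0.2, and rules are reported only if their confidence is at least 0.9.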
A typical output of the Association package follows.

Apriori Principle:

> java weka.associations.Apriori -t data/weather.nominal.arff -I yes

Apriori
=======

Minimum support: 0.2
Minimum confidence: 0.9
Number of cycles performed: 17

Generated sets of large itemsets:

Size of set of large itemsets L(1): 12

Large Itemsets L(1):
outlook=sunny 5
outlook=overcast 4
outlook=rainy 5
temperature=hot 4
temperature=mild 6
temperature=cool 4
humidity=high 7
humidity=normal 7
windy=TRUE 6
windy=FALSE 8
play=yes 9
play=no 5

Size of set of large itemsets L(2): 47

Large Itemsets L(2):
outlook=sunny temperature=hot 2
outlook=sunny temperature=mild 2
outlook=sunny humidity=high 3
outlook=sunny humidity=normal 2
outlook=sunny windy=TRUE 2
outlook=sunny windy=FALSE 3
outlook=sunny play=yes 2
outlook=sunny play=no 3
outlook=overcast temperature=hot 2
outlook=overcast humidity=high 2
outlook=overcast humidity=normal 2
outlook=overcast windy=TRUE 2
outlook=overcast windy=FALSE 2
outlook=overcast play=yes 4
outlook=rainy temperature=mild 3
outlook=rainy temperature=cool 2
outlook=rainy humidity=high 2
outlook=rainy humidity=normal 3
outlook=rainy windy=TRUE 2
outlook=rainy windy=FALSE 3
outlook=rainy play=yes 3
outlook=rainy play=no 2
temperature=hot humidity=high 3
temperature=hot windy=FALSE 3
temperature=hot play=yes 2
temperature=hot play=no 2
temperature=mild humidity=high 4
temperature=mild humidity=normal 2
temperature=mild windy=TRUE 3
temperature=mild windy=FALSE 3
temperature=mild play=yes 4
temperature=mild play=no 2
temperature=cool humidity=normal 4
temperature=cool windy=TRUE 2
temperature=cool windy=FALSE 2
temperature=cool play=yes 3
humidity=high windy=TRUE 3
humidity=high windy=FALSE 4
humidity=high play=yes 3
humidity=high play=no 4
humidity=normal windy=TRUE 3
humidity=normal windy=FALSE 4
humidity=normal play=yes 6
windy=TRUE play=yes 3
windy=TRUE play=no 3
windy=FALSE play=yes 6
windy=FALSE play=no 2

Size of set of large itemsets L(3): 39

Large Itemsets L(3):
outlook=sunny temperature=hot humidity=high 2
outlook=sunny temperature=hot play=no 2
outlook=sunny humidity=high windy=FALSE 2
outlook=sunny humidity=high play=no 3
outlook=sunny humidity=normal play=yes 2
outlook=sunny windy=FALSE play=no 2
outlook=overcast temperature=hot windy=FALSE 2
outlook=overcast temperature=hot play=yes 2
outlook=overcast humidity=high play=yes 2
outlook=overcast humidity=normal play=yes 2
outlook=overcast windy=TRUE play=yes 2
outlook=overcast windy=FALSE play=yes 2
outlook=rainy temperature=mild humidity=high 2
outlook=rainy temperature=mild windy=FALSE 2
outlook=rainy temperature=mild play=yes 2
outlook=rainy temperature=cool humidity=normal 2
outlook=rainy humidity=normal windy=FALSE 2
outlook=rainy humidity=normal play=yes 2
outlook=rainy windy=TRUE play=no 2
outlook=rainy windy=FALSE play=yes 3
temperature=hot humidity=high windy=FALSE 2
temperature=hot humidity=high play=no 2
temperature=hot windy=FALSE play=yes 2
temperature=mild humidity=high windy=TRUE 2
temperature=mild humidity=high windy=FALSE 2
temperature=mild humidity=high play=yes 2
temperature=mild humidity=high play=no 2
temperature=mild humidity=normal play=yes 2
temperature=mild windy=TRUE play=yes 2
temperature=mild windy=FALSE play=yes 2
temperature=cool humidity=normal windy=TRUE 2
temperature=cool humidity=normal windy=FALSE 2
temperature=cool humidity=normal play=yes 3
temperature=cool windy=FALSE play=yes 2
humidity=high windy=TRUE play=no 2
humidity=high windy=FALSE play=yes 2
humidity=high windy=FALSE play=no 2
humidity=normal windy=TRUE play=yes 2
humidity=normal windy=FALSE play=yes 4

Size of set of large itemsets L(4): 6

Large Itemsets L(4):
outlook=sunny temperature=hot humidity=high play=no 2
outlook=sunny humidity=high windy=FALSE play=no 2
outlook=overcast temperature=hot windy=FALSE play=yes 2
outlook=rainy temperature=mild windy=FALSE play=yes 2
outlook=rainy humidity=normal windy=FALSE play=yes 2
temperature=cool humidity=normal windy=FALSE play=yes 2

Best rules found:

 1. humidity=normal windy=FALSE 4 ==> play=yes 4 (1)
 2. temperature=cool 4 ==> humidity=normal 4 (1)
 3. outlook=overcast 4 ==> play=yes 4 (1)
 4. temperature=cool play=yes 3 ==> humidity=normal 3 (1)
 5. outlook=rainy windy=FALSE 3 ==> play=yes 3 (1)
 6. outlook=rainy play=yes 3 ==> windy=FALSE 3 (1)
 7. outlook=sunny humidity=high 3 ==> play=no 3 (1)
 8. outlook=sunny play=no 3 ==> humidity=high 3 (1)
 9. temperature=cool windy=FALSE 2 ==> humidity=normal play=yes 2 (1)
10. temperature=cool humidity=normal windy=FALSE 2 ==> play=yes 2 (1)
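Reading rule 1 against the definitions given earlier (our own arithmetic, using the 14 instances of the weather data): the antecedent humidity=normal windy=FALSE occurs in 4 transactions, and play=yes holds in all 4 of them, so

\mathrm{support} = \frac{4}{14} \approx 0.29 \ge 0.2,
\qquad
\mathrm{confidence} = \frac{4}{4} = 1 \ge 0.9,

which is why the rule passes both thresholds and is reported with the confidence of 1 shown in parentheses.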
Advantages, Disadvantages and Future Upgrades:

The WEKA system covers the entire machine learning (knowledge discovery) process. Although a research project, the WEKA system implements and evaluates a number of different algorithms for the different steps of the machine learning process. The output and the information provided by the package are sufficient for an expert in machine learning and related topics. The results displayed by the system give a detailed description of the flow and the steps involved in the entire machine learning process. The outputs provided by the different algorithms are easy to compare, which makes analysis easier. The ARFF format is one of the most widely used data storage formats for research databases, which makes the system well suited to research-oriented projects. The package also provides a number of application program interfaces (APIs) that help novice data miners build their own systems on top of the "core WEKA system". Since the system provides a number of switches and options, the output can be customized to suit particular needs.

The first major disadvantage is that the system is Java based and requires a Java Virtual Machine for its execution. Since the system is driven entirely by command-line parameters and switches, it is difficult for an amateur to use efficiently. The textual interface and output make the results all the more difficult to interpret and visualize: important results such as pruned trees and other hierarchical outputs cannot be displayed graphically. Also, although it is a commonly used format, ARFF is the only data format that the WEKA system supports.

Although the current version, 3.0.1, has these bugs and disadvantages, the developers are working on a better system and have come up with a new version with a graphical user interface, making the system complete.
Appendix (Sample executions for the other algorithms covered)

PART Decision List Algorithm:

> java weka.classifiers.j48.PART -t data/iris.arff

PART decision list
------------------

petalwidth <= 0.6: Iris-setosa (50.0)

petalwidth <= 1.7 AND
petallength <= 4.9: Iris-versicolor (48.0/1.0)

: Iris-virginica (52.0/3.0)

Number of Rules : 3

=== Error on training data ===

Correctly Classified Instances         146               97.3333 %
Incorrectly Classified Instances         4                2.6667 %
Mean absolute error                      0.0338
Root mean squared error                  0.1301
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 47  3 |  b = Iris-versicolor
  0  1 49 |  c = Iris-virginica

=== Stratified cross-validation ===

Correctly Classified Instances         142               94.6667 %
Incorrectly Classified Instances         8                5.3333 %
Mean absolute error                      0.0454
Root mean squared error                  0.1805
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 49  1  0 |  a = Iris-setosa
  0 47  3 |  b = Iris-versicolor
  0  4 46 |  c = Iris-virginica

Naïve Bayes Classifier Algorithm:

> java weka.classifiers.NaiveBayes -t data/iris.arff

Naive Bayes Classifier

Class Iris-setosa: Prior probability = 0.33

sepallength:  Normal Distribution. Mean = 4.9913 StandardDev = 0.355  WeightSum = 50 Precision = 0.10588235294117648
sepalwidth:   Normal Distribution. Mean = 3.4015 StandardDev = 0.3925 WeightSum = 50 Precision = 0.10909090909090911
petallength:  Normal Distribution. Mean = 1.4694 StandardDev = 0.1782 WeightSum = 50 Precision = 0.14047619047619048
petalwidth:   Normal Distribution. Mean = 0.2743 StandardDev = 0.1096 WeightSum = 50 Precision = 0.11428571428571428

Class Iris-versicolor: Prior probability = 0.33

sepallength:  Normal Distribution. Mean = 5.9379 StandardDev = 0.5042 WeightSum = 50 Precision = 0.10588235294117648
sepalwidth:   Normal Distribution. Mean = 2.7687 StandardDev = 0.3038 WeightSum = 50 Precision = 0.10909090909090911
petallength:  Normal Distribution. Mean = 4.2452 StandardDev = 0.4712 WeightSum = 50 Precision = 0.14047619047619048
petalwidth:   Normal Distribution. Mean = 1.3097 StandardDev = 0.1915 WeightSum = 50 Precision = 0.11428571428571428

Class Iris-virginica: Prior probability = 0.33

sepallength:  Normal Distribution. Mean = 6.5795 StandardDev = 0.6353 WeightSum = 50 Precision = 0.10588235294117648
sepalwidth:   Normal Distribution. Mean = 2.9629 StandardDev = 0.3088 WeightSum = 50 Precision = 0.10909090909090911
petallength:  Normal Distribution. Mean = 5.5516 StandardDev = 0.5529 WeightSum = 50 Precision = 0.14047619047619048
petalwidth:   Normal Distribution. Mean = 2.0343 StandardDev = 0.2646 WeightSum = 50 Precision = 0.11428571428571428
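Each line above parameterizes a normal density for one attribute within one class. To classify a test instance x, naive Bayes in its standard form (the textbook rule, not WEKA-specific documentation) picks the class that maximizes the prior times the product of the per-attribute densities:

\hat{c} \;=\; \arg\max_{c}\; P(c)\,\prod_{i}\,
\frac{1}{\sqrt{2\pi}\,\sigma_{c,i}}
\exp\!\left(-\frac{(x_i-\mu_{c,i})^2}{2\sigma_{c,i}^2}\right)

where \mu_{c,i} and \sigma_{c,i} are the Mean and StandardDev entries printed for class c and attribute i.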
OneR Classifier Algorithm:

> java weka.classifiers.OneR -t data/iris.arff

petallength:
    <  2.45 -> Iris-setosa
    <  4.75 -> Iris-versicolor
    >= 4.75 -> Iris-virginica
(143/150 instances correct)

=== Error on training data ===

Correctly Classified Instances         143               95.3333 %
Incorrectly Classified Instances         7                4.6667 %
Mean absolute error                      0.0311
Root mean squared error                  0.1764
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 44  6 |  b = Iris-versicolor
  0  1 49 |  c = Iris-virginica

=== Stratified cross-validation ===

Correctly Classified Instances         142               94.6667 %
Incorrectly Classified Instances         8                5.3333 %
Mean absolute error                      0.0356
Root mean squared error                  0.1886
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 44  6 |  b = Iris-versicolor
  0  2 48 |  c = Iris-virginica

Kernel Density Algorithm:

> java weka.classifiers.KernelDensity -t data/iris.arff

Kernel Density Estimator

=== Error on training data ===

Correctly Classified Instances         148               98.6667 %
Incorrectly Classified Instances         2                1.3333 %
Mean absolute error                      0.0313
Root mean squared error                  0.0944
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 49  1 |  b = Iris-versicolor
  0  1 49 |  c = Iris-virginica

=== Stratified cross-validation ===

Correctly Classified Instances         144               96      %
Incorrectly Classified Instances         6                4      %
Mean absolute error                      0.0466
Root mean squared error                  0.1389
Total Number of Instances              150

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 48  2 |  b = Iris-versicolor
  0  4 46 |  c = Iris-virginica