Classification of grapevine cultivars using Kirlian camera and machine learning

Danijel SKOČAJ1, Igor KONONENKO2, Irma TOMAŽIČ3, Zora KOROŠEC-KORUZA4

1 M. Sc., Faculty of Computer and Information Science, Ljubljana, Tržaška 25, danijel.skocaj@fri.uni-lj.si
2 Associate professor, PhD., Faculty of Computer and Information Science
3 Dipl. Eng. Agr., SLO-1000, Ljubljana, Jamnikarjeva 101, P. O. Box 2995
4 Assist. Prof., Ph. D., M. Sc. Biol., ibid.

ABSTRACT

The aim of the study was to verify whether a Kirlian camera could be used to describe grapevines and whether the berry bioelectric field is influenced by disease. We measured the bioelectric fields of grape berries with a Kirlian camera, described the acquired coronas of the berries with numerical parameters, and used machine learning algorithms to classify grapevine cultivars. We tested this method on eight grapevine cultivars in several different tests. The results show that the coronas of grapevine berries contain useful information about the cultivar and its sanitary status.

INTRODUCTION

Grapevine variety identification, and even clone identification, are among the most interesting goals of ampelography. To characterise and classify this complex germplasm, new methods are continually proposed and studied. Several methods are based on the description of different vine organs, often using biometry in addition to visual observation. The analytical determination of primary and secondary metabolites may also contribute to the identification of species. Methods based on the analysis of DNA polymorphism have recently been described.

The heterogeneity of V. vinifera varieties can be explained by their genetic variability; it can also be caused by the environment or by disease status (the presence of pathogens such as viruses, viroids and perhaps phytoplasmas). Virus infections often result in modified morphological parameters and in changes in physiology, seen in viticulture as lower yield and lower quality of the crop (Walter and Martelli, 1998).

We tested a novel approach to the classification of grapevine cultivars. Classification was performed by processing data about the bioelectric fields of grape berries with machine learning algorithms. The data were acquired using the Gas Discharge Visualisation (GDV) technique. Since this technique has been successfully applied in different experiments (Kononenko et al., 1999a, b; Trampuž et al., 1999; Čater and Batič, 1998), we wanted to verify whether a Kirlian camera could be used to describe grapevines and whether the berry bioelectric field is influenced by disease.

MATERIAL AND METHODS

The study was carried out on the berries of eight different cultivars: 'Klarnica', 'Malvazija', 'Pinela', 'Planinka', 'Sladkočrn', 'Volovnik', 'Zelen' and 'Zweigeld', grown in the grapevine germplasm collection Vipava - 2S (Koruza et al., 1998). Cv. 'Zelen' was represented by two types, type A and type ŽG. For each cultivar or type we sampled 2 vines, 2 bunches per vine and 10 berries per bunch, resulting in 360 samples (9×2×2×10). ELISA was used to detect grapevine viruses in dormant cuttings. The sanitary status of the vines is shown in Table 1.

Table 1: Results of ELISA testing of grapevine
viruses.

Viruses tested (columns): GLRaV-1, GLRaV-2, GLRaV-3, GLRaV-6, GVA, GVB, GFLV, GFkV, ArMV
Plants tested (rows, cultivar and plant no.): 'Klarnica' 1, 3; 'Pinela' 28, 31; 'Planinka' 1, 5; 'Sladkočrn' 5, 9; 'Zelen' type A 3, 7; 'Zelen' type ŽG 21, 23; 'Volovnik' 47, 50; 'Zweigeld' 13, 22; 'Malvazija' (d) without symptoms, exhibiting symptoms of phytoplasma
Reaction marks (cell positions within the table are not recoverable from this copy):
  1 - +   3 + +   28 +   31   1 +   5 +   5 + +   9 + +   3   7   21   23   47   50   13   22
  - - + - + + + + - + + + + + + + + + + - + ? ? + + + + - ?
a: - negative reaction; b: + positive reaction; c: ? nonspecific reaction; d: diagnosed visually only

Acquiring bioelectric field of grape berries using Kirlian effect

To acquire the bioelectric field of grape berries we used the Kirlian effect. The Kirlian couple discovered it in 1939, and it has since been studied and refined by independent laboratories. Korotkov (1998) and his team developed the instrument Crown TV, which can be used routinely. It creates a high-intensity electric field (10 kV, 1024 Hz, 0.5 s) around the object placed on the plate of the instrument (GDV Experts, 1999). The electric field produces a visible gas discharge glow around the object (a "Kirlian picture"). The image is captured by a video camera and transferred to a computer. A computer program then processes the image and describes the corona with a set of numerical parameters (form coefficients, fractal dimension, brightness coefficient and deviation, etc.).

Classification using machine learning

Machine learning technology (Mitchell, 1997; Kononenko, 1993; Kononenko et al., 1998) is well suited to the induction of diagnostic and prognostic rules and to small, specialised classification problems. Classification knowledge can be derived automatically from descriptions of cases that have already been classified. We therefore acquired coronas of a large number of berries from different grapevines of known cultivar and sanitary status. The parameters of these coronas were then used by machine learning algorithms to derive classification knowledge.
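As a rough illustration of how classification knowledge can be derived from already classified cases, the sketch below trains a naive Bayesian classifier on labelled feature vectors and evaluates it on repeated random 70%/30% splits, reporting classification accuracy and the information score of Kononenko and Bratko (1991). It is not the implementation used in the study: the Gaussian likelihood model, the synthetic eleven-parameter data, the class labels and all function names are assumptions made for this example.

```python
import math
import random
import statistics


def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature mean/variance (Gaussian model is an assumption)."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        means = [statistics.fmean(col) for col in zip(*rows)]
        # A small variance floor avoids division by zero on constant features.
        variances = [statistics.pvariance(col) + 1e-9 for col in zip(*rows)]
        model[c] = (len(rows) / len(y), means, variances)
    return model


def predict_proba(model, x):
    """Return the posterior probability of every class for one feature vector."""
    log_post = {}
    for c, (prior, means, variances) in model.items():
        lp = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_post[c] = lp
    top = max(log_post.values())  # subtract the maximum for numerical stability
    unnorm = {c: math.exp(lp - top) for c, lp in log_post.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}


def information_score(prior, posterior):
    """Information score in bits for the true class of one example."""
    if posterior >= prior:
        return -math.log2(prior) + math.log2(posterior)
    return math.log2(1 - prior) - math.log2(1 - posterior)


def evaluate(X, y, repeats=10, train_fraction=0.7, seed=0):
    """Repeat a random train/test split; return mean and sd of accuracy (%) and score (bit)."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    accs, infos = [], []
    for _ in range(repeats):
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train, test = idx[:cut], idx[cut:]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct, info = 0, 0.0
        for i in test:
            post = predict_proba(model, X[i])
            correct += int(max(post, key=post.get) == y[i])
            info += information_score(model[y[i]][0], post[y[i]])
        accs.append(100 * correct / len(test))
        infos.append(info / len(test))
    return (statistics.fmean(accs), statistics.stdev(accs),
            statistics.fmean(infos), statistics.stdev(infos))


def make_synthetic(n_per_class=40, n_features=11, shift=1.0, seed=1):
    """Synthetic stand-in for corona parameters: two classes with shifted means."""
    rng = random.Random(seed)
    X, y = [], []
    for label, offset in (("A", 0.0), ("B", shift)):
        for _ in range(n_per_class):
            X.append([rng.gauss(offset, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


X, y = make_synthetic()
acc_mean, acc_sd, info_mean, info_sd = evaluate(X, y)
print(f"accuracy {acc_mean:.1f} +/- {acc_sd:.1f} %, "
      f"information score {info_mean:.2f} +/- {info_sd:.2f} bit")
```

In this sketch the prior of each class is taken from its frequency in the training set. The information score is positive when the classifier raises the probability of the correct class above that prior and negative when it lowers it, so a classifier that merely returns the priors scores 0 bit regardless of how unbalanced the classes are.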
RESULTS

We addressed two types of problems: classification of grape berries according to cultivar and according to sanitary status. We performed several tests, classifying different numbers of examples into different numbers of classes.

Classification according to cultivar:
  - all nine cultivars (the eight cultivars, with 'Zelen' counted as two types); 9 classes, 40 examples in each class,
  - four cultivars: 'Volovnik', 'Zweigeld', 'Pinela' and 'Zelen ŽG'; 4 classes, 40 examples in each class,
  - two cultivars: 'Volovnik' and 'Zweigeld'; 2 classes, 40 examples in each class,
  - two cultivars: 'Zelen ŽG' and 'Pinela'; 2 classes, 40 examples in each class,
  - two sets of cultivars: white grapes ('Volovnik' + 'Pinela') and red grapes ('Zweigeld' + 'Sladkočrn'); 2 classes, 80 examples in each class.

Classification according to sanitary status:
  - infected 'Pinela' (from plant no. 28) and non-infected 'Pinela' (from plant no. 31); 2 classes, 30 examples in each class,
  - 'Volovnik' + 'Zweigeld' (not infected with GLRaV viruses) and 'Sladkočrn' + 'Klarnica' (infected with GLRaV viruses); 2 classes, 80 examples in each class,
  - 'Malvazija' without symptoms and 'Malvazija' with symptoms of phytoplasma; 2 classes, 20 examples in each class.

As the machine learning algorithm we used the naive Bayesian classifier (Kononenko, 1993). Each corona of a berry was described with eleven parameters that were independent of the size of the berry. We measured the classification accuracy and the information score (Kononenko and Bratko, 1991). The latter measure eliminates the influence of prior probabilities and appropriately treats probabilistic answers of the classifier. For each problem we randomly split the available examples into a training set (70%) and a testing set (30%). We repeated this process 10 times, averaged the results and calculated the standard deviation. The results are presented in Table 2.

Table 2: Results of the classification.

problem                                    prior pr. (%)   class. accuracy (%)   inf. score (bit)
nine cultivars                             11.1            35.7 ± 3.1            1.09 ± 0.07
four cultivars                             25              55.6 ± 7.2            0.83 ± 0.09
'Volovnik' : 'Zweigeld'                    50              77.5 ± 9.2            0.45 ± 0.15
'Zelen ŽG' : 'Pinela'                      50              65.8 ± 8.0            0.29 ± 0.11
white grapes : red grapes                  50              77.5 ± 4.6            0.45 ± 0.05
infected : non-infected 'Pinela'           50              70.0 ± 11.1           0.30 ± 0.13
infected : non-infected with GLRaV         50              71.0 ± 5.5            0.35 ± 0.06
'Malvazija' with : without phytoplasma     50              88.3 ± 8.0            0.73 ± 0.16

In all tests, the classification accuracy is significantly higher than the prior probability of the classification. For example, when classifying all nine cultivars, the classification accuracy is 35.7%. Since all nine classes are of the same size, the prior probability of each class is 1/9 = 11.1%, which is less than a third of the classification accuracy. Because of this, the information score is very high. Classification according to sanitary status is also quite successful. In these cases the prior probability is 50%, while the classification accuracy ranges between 70% and 88.3%.

CONCLUSIONS

The results show that the classification is not random: the classification accuracy is significantly higher than that of a random classifier. This means that the coronas of grapevine berries contain useful information about the cultivar and about its sanitary status. We used only parameters that were independent of the size of the grape berries. However, new parameters that may contain additional useful information could be introduced. We plan to verify this hypothesis and to further improve the methodology in order to make it useful for the classification of grape cultivars and their diseases.

REFERENCES

Čater M. / Batič F. (1998): Determination of Seed Vitality by High Frequency Electrophotography.- Phyton (Horn, Austria), 38(2):225-237.
Experts in GDV Technology (1999). http://www.kolumbus.fi/pekka.kaariainen/gdv/gdv1.htm
Kononenko I. / Bratko I. (1991): Information based evaluation criterion for classifier's performance.- Machine Learning, 6:67-80.
Kononenko I. (1993): Inductive and Bayesian learning in medical diagnosis.- Applied Artificial Intelligence, 7:317-337.
Kononenko I. / Bratko I. / Kukar M. (1998): Application of machine learning to medical diagnosis. In: R. S. Michalski, I. Bratko, and M. Kubat (eds.): Machine Learning, Data Mining and Knowledge Discovery: Methods and Applications, John Wiley & Sons.
Kononenko I. / Zrimec T. / Prihavec B. / Bevk M. / Stanojevic S. (1999a): Machine learning and GDV images: Diagnosis and therapy verification. Proc. Information Society'99 - Biology and Cognitive Science, Ljubljana, 12-14 October 1999, pp. BKZ 84-87.
Kononenko I. / Zrimec T. / Sadikov A. / Mele K. / Mihalrcic T. (1999b): Machine learning and GDV images: Current research and results. Proc. Information Society'99 - Biology and Cognitive Science, Ljubljana, 12-14 October 1999, pp. BKZ 80-83.
Korotkov K. (1998): Aura and Consciousness: A New Stage of Scientific Understanding, St. Petersburg, Russia: State Editing & Publishing Unit "Kultura".
Koruza B. / Tomažič I. / Korošec-Koruza Z. / Lokar V. (1998): The collecting, conservating, evaluating and data collecting of genetic resources of grapevine (Vitis spp.) in Slovenia: [Lecture at] 2nd European Network for Grapevine Genetic Resources, Conservation and Characterisation, Torres Vedras, Portugal, 18.-20.11.1998. 82.
Mitchell T. (1997): Machine Learning, McGraw Hill.
Trampuž A. / Kononenko I. / Rus V. S. (1999): Experiential and Biophysical Effects of the Art of Living Programme on its Participants. Proc. Information Society'99 - Biology and Cognitive Science, Ljubljana, 12-14 October 1999, pp. BKZ 94-97.
Walter B. / Martelli G. P. (1998): Consideration on grapevine selection and certification.- Vitis, 37(1):87-90.