Data mining education
What’s cookin’?
Maja Skrjanc

Introduction
- Sol-Eu-Net WP 4: Dissemination/Education
- Analysis of machine learning and decision support courses
- Overview of machine learning, data mining and decision support courses available on the web

Resources
- Solomon European Network - Data Mining and Decision Support Courses (http://www.cs.bris.ac.uk/~ross/MLearnCourses.html)
- MLnet (http://www.mlnet.org/cgibin/mlnetois.pl/?File=courses.html)
- KDnuggets (http://www.kdnuggets.com/courses/index.html)
- David W. Aha home page (http://www.aic.nrl.navy.mil/~aha/research/ml/courses.html)
- Decision support system resources (http://dssresources.com/)

Classification
Intended audience:
- computer science students
- students from other areas
- managers (CEOs) and IT professionals (data analysts)

CS courses characteristics
- A review of data mining techniques, including decision trees, rule-based learning, neural networks and inductive logic programming
- Most web sites contain links to assigned reading materials, some of them available online as textbooks
- Various courses also link to required readings, usually very recent papers, which primarily cover newer topics like text and web mining
- United States vs. Europe: novel, popular topics
- Interdisciplinary area: statistics, data warehousing, complexity analysis, data visualization, privacy and security issues
- Orientation towards real-world problems

CS courses - examples
- Masters program in knowledge discovery and data mining in CALD Center (Center for Automated Learning and Discovery) at Carnegie Mellon University (http://www.cs.cmu.edu/~cald/about.html)
- MSc in machine learning at University of Bristol (http://www.cs.bris.ac.uk/Teaching/MachineLearning/)
- Principles of Knowledge Discovery in Databases, Department of Computing Science, University of Alberta (http://www.cs.ualberta.ca/~zaiane/courses/cmput690/index.html)
- Web data mining; Computer Science, Telecommunications, and Information Technology, DePaul University (http://maya.cs.depaul.edu/~mobasher/classes/cs589/syllabus.html)
- Ullman’s course on data mining at Stanford University:
  - Exam (http://www-db.stanford.edu/~ullman/mining/final.html)
  - Lecture notes (http://hake.stanford.edu/~ullman/mining/mining.html)

Non-CS courses characteristics
- Materials are hard to get; different keywords (DSS, DM, DA)
- Some courses are the same as the courses for CS students
- More domain driven

Non-CS courses examples
- Graduate Certificate Program in Data Warehousing and Business Intelligence at the Center for Information Management & Technology at Loyola University (http://gsb.luc.edu/centers/cimt/certificate/dwcert1.html)
- Department of Medical Informatics, Health Sciences campus of Columbia University (http://www.cpmc.columbia.edu/edu/degree/curriculum.html)
- The graduate school in Computational Biology, Bioinformatics, and Biometry (ComBi) (http://www.cs.helsinki.fi/research/hallinto/TOIMINTARAPORTIT/1999/report99/node4.html#SECTION00041200000000000000)

On-line tutorials
- Data Mining: Theory and Practice, Yike Guo, Department of Computing, Imperial College, UK (http://ruby.doc.ic.ac.uk/teaching/km99/)
  Basic concepts of data mining, basic data mining techniques, data mining procedure in real-world applications, future research trends, data warehouse and decision support.
- Kurt Thearling, Development Wheelhouse Corporation, Burlington, MA (http://www3.shore.net/~kht/text/dmwhite/dmwhite.shtml)
  Introduction to data mining, presentation of data mining techniques, real-world examples.

IT professionals and executives courses characteristics
- Customized for target audience, case studies
- Different approaches
- Mostly held in the USA
- Some of them are vendor independent
- Usual duration 1-3 days
- Themes: introductory DM seminars, tools, cross-selling, CRM, e-commerce, DSS: DW, basic statistics, Excel pivot tables, ...

IT professionals courses - examples
- SAS seminars (http://www.sas.com/service/edu/bks/index.html)
- SPSS Integral Solutions Limited (http://www.spss.com/training/descriptions.cfm)
- Vendor independent:
  - DCI (http://www.dci.com/events/datamin1/)
  - The Modeling Agency, The Woodlands, Texas (http://www.themodeling-agency.com/training/index.html)

General review of the current situation
- On-line materials, recent papers
- Exercises and projects that are not only theoretical, but also practical
- Including DW, statistics
- Combine DA techniques with special areas of application, like marketing (web-marketing, e-marketing), business intelligence, public policy, security issues, ...
- Raise awareness in the business world at different levels (managers, data analysts, IT professionals, ...)
- USA vs. Europe

WP4: Development and organization of seminars, training and distance learning I
- Participants: IJS, GMD, BRI
- WP4 coordinator: Tanja Urbančič
- Objectives:
  - Increase awareness of DM and DS (potential clients)
  - Provide seminars and workshops (internal, for open market, customized for clients)
  - Provide a tool for supporting distance learning activities

WP4: Development and organization of seminars, training and distance learning II
- Repository of Educational Modules (prepared by IJS)
  - questionnaire for project partners (20 questions, electronic form)
  - 12 proposals from 5 institutions (88 hours of program)
  - 6 can be customized, 7 have web material
  - method-centred and application-centred, basic and advanced
- Info about available related courses (collected by BRI)
  - 67 courses, academic and commercial
  - 20 European, 37 US, 10 other
- DALS and AED

DALS seminar (international)
- Data analysis in life sciences, May 2000, organized by IJS
- 2 days (1 day methods, 1 day cases)
- 5 lecturers from Slovenia and UK
- 18 attendees from Slovenia and Germany

AED seminar (international)
- Analysis of ecological data, December 2000, organized by IJS
- 4 days (5 for graduate students of PNG)
- 4 lecturers from Slovenia
- 27 participants from 9 countries (Slovenia, Belgium, Bosnia and Herzegovina, Croatia, France, Italy, The Netherlands, Poland, Slovak Republic)