W4 data exploration

Data exploration
Lukman Heryawan, PhD
10 March 2022
Lectures (before the midterm exam, UTS)
• eLOK link: https://elok.ugm.ac.id/course/view.php?id=9358 (Data mining - CS IUP)
• 7 sessions:
• Process model of data mining
• Data types and attributes
• Data distance and data collection
• Data exploration
• Data preparation
• Supervised model development and evaluation
• Supervised model improvement
• Assessment:
• Midterm exam (UTS): 5 questions (25%)
• Individual assignments: 5 (5%)
• Group assignments: 2, plus class participation (20%)
Lectures (after the midterm exam, UTS)
• eLOK link: https://elok.ugm.ac.id/course/view.php?id=9358 (Data mining - CS IUP)
• 7 sessions:
• Clustering methods
• Clustering evaluation
• Frequent itemset
• Association rule
• Sequential pattern
• Case study: solving issue using supervised methods
• Case study: solving issue using unsupervised methods
• Assessment:
• Final exam (UAS): 5 questions (25%)
• Individual assignments: 5 (5%)
• Group assignments: 2, plus class participation (20%)
Data mining (DM) definition
• Data mining is a process of extracting and discovering patterns in large
data sets involving methods at the intersection of machine learning,
statistics, and database systems. (Source: Wikipedia)
DM definition (cont)
Process model of CRISP-DM
• The Cross-Industry Standard Process for Data Mining, known as CRISP-DM,
is an open standard process model that describes common
approaches used by data mining experts.
• It is the most widely used analytics model.
CRISP-DM Process
Four stages of data mining
CRISP-DM and data exploration
Data exploration
• VOSviewer is a software tool for constructing and visualizing
bibliometric networks.
• These networks may for instance include journals, researchers, or
individual publications.
• VOSviewer also offers text mining functionality that can be used to
construct and visualize co-occurrence networks of important terms
extracted from a body of scientific literature.
• https://www.vosviewer.com/
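To make the idea of a term co-occurrence network concrete, here is a minimal sketch in plain Python. It is not VOSviewer's algorithm; it only counts, for each pair of terms, how many documents contain both. The function name `cooccurrence_counts` and the sample term lists are hypothetical and made up for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count in how many documents each unordered pair of terms appears together."""
    counts = Counter()
    for terms in documents:
        # De-duplicate and sort so each pair has one canonical order.
        unique_terms = sorted(set(terms))
        for pair in combinations(unique_terms, 2):
            counts[pair] += 1
    return counts


# Hypothetical, hand-made term lists (not extracted from real articles).
docs = [
    ["data mining", "clustering", "text mining"],
    ["text mining", "pubmed", "data mining"],
    ["clustering", "pubmed"],
]

edges = cooccurrence_counts(docs)
for (term_a, term_b), weight in edges.most_common():
    print(f"{term_a} -- {term_b}: {weight}")
```

Each pair can be read as an edge in a network, with the count as the edge weight; a tool such as VOSviewer adds layout and clustering on top of this kind of data.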
VOSviewer example
Case study
MeSH on Demand Tool:
An Easy Way to Identify Relevant MeSH Terms
Text mining of PubMed database
Abstract example
Related MeSH
Similar articles measurement using data distance
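One possible way to measure article similarity with a data distance is sketched below. The slide does not state which distance is used; this sketch assumes Jaccard distance over sets of MeSH terms. The function names, article labels, and term sets are hypothetical.

```python
def jaccard_distance(terms_a, terms_b):
    """Jaccard distance between two term sets: 1 - |A ∩ B| / |A ∪ B|."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        # Two empty sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def rank_similar(query_terms, articles):
    """Return article names ordered from nearest to farthest from the query."""
    return sorted(articles, key=lambda name: jaccard_distance(query_terms, articles[name]))


# Hypothetical MeSH term sets, made up for illustration.
articles = {
    "article_A": {"Data Mining", "Algorithms", "PubMed"},
    "article_B": {"Neoplasms", "Humans"},
    "article_C": {"Data Mining", "PubMed", "Medical Subject Headings",
                  "Natural Language Processing"},
}
query = {"Data Mining", "PubMed", "Medical Subject Headings"}

ranking = rank_similar(query, articles)
for name in ranking:
    print(name, round(jaccard_distance(query, articles[name]), 2))
```

A smaller distance means the article shares more terms with the query. Other distances from the earlier "Data distance" session, such as cosine distance on term-frequency vectors, could be swapped in without changing `rank_similar`.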
Data visualization for MeSH on demand application
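As a starting point for visualizing terms, the sketch below counts how often each term appears across articles and prints a text bar chart. It uses only the Python standard library; the term lists and the function names `term_frequencies` and `text_bar_chart` are hypothetical, and a plotting library could replace the text output.

```python
from collections import Counter


def term_frequencies(term_lists):
    """Count how many times each term occurs across all term lists."""
    counts = Counter()
    for terms in term_lists:
        counts.update(terms)
    return counts


def text_bar_chart(counts, width=30):
    """Return one line per term, most frequent first, with a bar scaled to `width`."""
    if not counts:
        return []
    top = max(counts.values())
    lines = []
    for term, n in counts.most_common():
        bar = "#" * max(1, round(width * n / top))
        lines.append(f"{term:<20} {bar} {n}")
    return lines


# Hypothetical term lists, one per article (not real PubMed data).
article_terms = [
    ["Data Mining", "PubMed"],
    ["Data Mining", "Algorithms"],
    ["Data Mining", "PubMed", "Humans"],
]

freq = term_frequencies(article_terms)
chart = text_bar_chart(freq)
print("\n".join(chart))
```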
Weekly assignment
(due date: 16 March 2022, 23:59)
• Create a GitHub account for this assignment
• Develop a MeSH on Demand application and store the
development progress report in your GitHub account
• You may use this reference:
• Write a detailed explanation of your development progress
in the GitHub report
• For example, you can explain a visualization script that is
used to visualize the terms of PubMed articles
• Share your progress and GitHub link by email
• Send the email with the subject:
Name of student_assignment4_DM_CSIUP
Example email format for assignment 4:
Questions and answers
• Email: lukmanh@ugm.ac.id
• Scholar profile: https://scholar.google.co.id/citations?user=V_iMAWYAAAAJ&hl=en