Classification Tutorial for IBM Intelligent Miner

1. Download three datasets (cartrain.txt, cartest.txt, carapp.txt) from the class
website to the local directory c:\im.
2. Start -> Programs -> DB2 Intelligent Miner -> Intelligent Miner
3. Define three datasets in Intelligent Miner as described in the clustering
tutorial, using the following information:
a. Dataset setting name: cartrain
Dataset path and file name: c:\im\cartrain.txt
b. Dataset setting name: cartest
Dataset path and file name: c:\im\cartest.txt
c. Dataset setting name: carapp
Dataset path and file name: c:\im\carapp.txt
Position   Field name
1-6        Price
7-12       maintenance
13-18      doors
19-23      capacity
24-29      luggage
30-35      safety
36-41      acceptability
4. Save your work in the mining base by selecting Mining Base -> Save Mining
Base As in the menu bar. Type your mining base name in the pop-up window
and click the Save button.
5. To perform ID3 classification, right-click the Mining folder in the Mining base
container and select create mining in the pop-up menu.
6. Click Next in the Welcome window. In the following window, select
Classification – Tree as your mining function, type in your Settings name (car ID3
Training). Check Show the advanced pages and controls. Click Next to continue.
7. In the following window, select cartrain as input data and choose to optimize the
mining run for Disk space. Click Next to continue.
8. In the following window, make sure you select the Training mode option and
check Use default for all four parameters. Click Next to continue.
9. In the following window, select capacity, doors, luggage, maintenance, price, and
safety as Input fields, and select acceptability as the Class label. Click Next to continue.
10. On the next Field parameters page of the wizard, click Next to continue.
11. On the Error matrix page of the wizard, click Next to continue.
12. In the output fields window, select the Do not create output option, and click
Next to continue.
13. In the result window, check If a result with this name exists, overwrite it. Click
Next to continue.
14. Click the Finish button in the last summary window.
15. In the main window, expand the Mining folder and select Classification in the
Mining base container, then select car ID3 Training in the upper-right container.
Click the Run icon to start mining.
16. After the mining run finishes, Intelligent Miner opens a result window that presents
the confusion matrix for the pruned tree. You can view the pruned tree and the unpruned
tree by clicking the corresponding button in the result window.
17. Please do not forget to save your work in the mining base.
18. To test the ID3 classification model, right-click the Mining folder in the Mining base
container and select create mining in the pop-up menu.
19. Click Next in the Welcome window. In the next window, select Classification –
Tree as your mining function. Type in your Settings name (car ID3 Testing).
Check Show the advanced pages and controls, and click Next to continue.
20. In the following window, select cartest as input data and choose to optimize the
mining run for Disk space. Click Next to continue.
21. In the following window, make sure you select the Test mode option and select
car ID3 Training. Click Next to continue.
22. In the next window, select capacity, doors, luggage, maintenance, price, and safety as
Input fields and select acceptability as the Class label. Click Next to continue.
23. On the next Field parameters page of the wizard, click Next to continue.
24. On the Error matrix page of the wizard, click Next to continue.
25. In the output fields window, select the Do not create output option, and click
Next to continue.
26. In the result window, check If a result with this name exists, overwrite it. Click
Next to continue.
27. Click the Finish button in the last summary window.
28. In the main window, expand the Mining folder and select Classification in the
Mining base container, then select car ID3 Testing in the upper-right container. Click
the Run icon to start mining.
29. After the mining run finishes, Intelligent Miner opens a result window that presents
the confusion matrix for the pruned tree. You can view the applied tree by clicking the
corresponding button in the result window.
30. Please do not forget to save your work in the mining base.
31. To apply the ID3 classification model, right-click the Mining folder in the Mining
base container and select create mining in the pop-up menu.
32. Click Next in the Welcome window. In the next window, select Classification –
Tree as your mining function. Type in your Settings name (car ID3 Apply). Check
Show the advanced pages and controls, and click Next to continue.
33. In the following window, select carapp as input data and choose to optimize the
mining run for Disk space. Click Next to continue.
34. In the following window, select the Application mode option and select
car ID3 Training. Click Next to continue.
35. In the next window, select capacity, doors, luggage, maintenance, price, and safety as
Input fields and select acceptability as the Class label. Click Next to continue.
36. On the next Field parameters page of the wizard, click Next to continue.
37. On the Error matrix page of the wizard, click Next to continue.
38. In the output fields window, make sure you select the Create output data option.
Select all available fields as output fields. Type class in the Class ID field name
entry field, and type conf in the Confidence field name entry field. Click Next to
continue.
39. In the output data window, click the Create data button.
40. In the Welcome window, click Next to continue.
41. Select Flat files and type a settings name, such as carid3out. Click Next.
42. On the Flat files page, change to the directory c:\im. In the Path and file name
entry field, append carid3out.txt to the path, then click on Add file. Select Read
and Write as Use mode and check “The specified flat file does not yet exist”.
Click Next to continue.
43. On the Summary page of the Data wizard, click Finish to continue.
44. When you return to the output data window, select carid3out as output data and click
Next.
45. In the result window, check If a result with this name exists, overwrite it. Click
Next to continue.
46. Click the Finish button in the last summary window.
47. In the main window, expand the Mining folder and select Classification in the
Mining base container, then select car ID3 Apply in the upper-right container. Click
the Run icon to start mining.
48. After running, please do not forget to save your work in the mining base.
49. Use Notepad or Textpad to open the output data file (carid3out.txt) in your
working directory (c:\im).
50. To perform Neural Network classification, right-click the Mining folder in the Mining
base container and select create mining in the pop-up menu.
51. Click Next in the Welcome window. In the following window, select
Classification – Neural as your mining function and type your Settings name (car NN
Training). Check Show the advanced pages and controls. Click Next to
continue.
52. In the following window, select cartrain as input data and choose to optimize the
mining run for Disk space. Click Next to continue.
53. In the following window, make sure you select the Training mode option and
check Use default for all four parameters. Click Next to continue.
54. In the following window, select capacity, doors, luggage, maintenance, price, and
safety as Input fields, and select acceptability as the Class label. Click Next to
continue.
55. On the next Training parameters page of the wizard, select Automatic for
Architecture determination and Parameter determination. Click Next to continue.
56. In the output fields window, select the Do not create output option, and click
Next to continue.
57. In the result window, check If a result with this name exists, overwrite it. Click
Next to continue.
58. Click the Finish button in the last summary window.
59. In the main window, expand the Mining folder and select Classification in the
Mining base container, then select car NN Training in the upper-right container.
Click the Run icon to start mining.
60. After the mining run finishes, Intelligent Miner opens a result window that presents
the confusion matrix for the neural network. You can view the corresponding percentages
and bar chart by selecting the appropriate item from the menu bar of the result window.
61. Please do not forget to save your work in the mining base.
62. To test the Neural Network classification model, right-click the Mining folder in the
Mining base container and select create mining in the pop-up menu.
63. Click Next in the Welcome window. In the next window, select Classification –
Neural as your mining function. Type your Settings name (car NN Testing),
check Show the advanced pages and controls, and click Next to continue.
64. In the following window, select cartest as input data and choose to optimize the
mining run for Disk space. Click Next to continue.
65. In the following window, select the Test mode option and select car NN
Training. Click Next to continue.
66. In the next window, select capacity, doors, luggage, maintenance, price, and safety as
Input fields and select acceptability as the Class label. Click Next to continue.
67. In the output fields window, select the Do not create output option, and click
Next to continue.
68. In the result window, check If a result with this name exists, overwrite it. Click
Next to continue.
69. Click the Finish button in the last summary window.
70. In the main window, expand the Mining folder and select Classification in the
Mining base container, then select car NN Testing in the upper-right container. Click
the Run icon to start mining.
71. After the mining run finishes, Intelligent Miner opens a result window that presents
the confusion matrix for the neural network. You can view the corresponding percentages
and bar chart by selecting the appropriate item from the menu bar of the result window.
72. Please do not forget to save your work in the mining base.
73. To apply the Neural Network classification model, right-click the Mining folder in the
Mining base container and select create mining in the pop-up menu.
74. Click Next in the Welcome window. In the next window, select Classification –
Neural as your mining function. Type your Settings name (car NN Apply), check
Show the advanced pages and controls, and click Next to continue.
75. In the following window, select carapp as input data and choose to optimize the
mining run for Disk space. Click Next to continue.
76. In the following window, select the Application mode option and select car
NN Training. Click Next to continue.
77. In the next window, select capacity, doors, luggage, maintenance, price, and safety as
Input fields and select acceptability as the Class label. Click Next to continue.
78. In the output fields window, select the Create output data option. Select all
available fields as output fields, type class in the Class ID field name entry field,
and type conf in the Confidence field name entry field. Click Next to continue.
79. In the output data window, click the Create data button.
80. In the Welcome window, click Next to continue.
81. Select Flat files and type a settings name, such as carnnout. Click Next.
82. On the Flat files page, change to the directory c:\im. In the Path and file name
entry field append carnnout.txt to the path, and then click on Add file. Select Read
and Write as Use mode and check “The specified flat file does not yet exist”.
Click Next to continue.
83. On the Summary page of the Data wizard, click Finish to continue.
84. When you return to the output data window, select carnnout as output data and click
Next.
85. In the result window, check If a result with this name exists, overwrite it. Click
Next to continue.
86. Click Finish in the last summary window.
87. In the main window, expand the Mining folder and select Classification in the
Mining base container. Select car NN Apply in the upper-right container. Click the
Run icon to start mining.
88. After running, please do not forget to save your work in the mining base.
89. Use Notepad or Textpad to open the output data file (carnnout.txt) in your
working directory (c:\im).
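
The field layout from step 3 and the confusion matrices reported in steps 16, 29,
60, and 71 can also be checked outside Intelligent Miner. The following is a minimal
Python sketch, not part of Intelligent Miner itself. It assumes each record is one
fixed-width line laid out as in step 3. The names LAYOUT, parse_record, and
confusion_matrix, the lowercase field keys, and the sample values are hypothetical
and chosen only for illustration.

```python
from collections import Counter

# Column layout copied from step 3 (1-based, inclusive character positions).
# Field names are written in lowercase here for convenience.
LAYOUT = [
    ("price", 1, 6),
    ("maintenance", 7, 12),
    ("doors", 13, 18),
    ("capacity", 19, 23),
    ("luggage", 24, 29),
    ("safety", 30, 35),
    ("acceptability", 36, 41),
]


def parse_record(line):
    """Slice one fixed-width line into a dict of whitespace-stripped values."""
    return {name: line[start - 1:end].strip() for name, start, end in LAYOUT}


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


if __name__ == "__main__":
    # Build one sample record by padding made-up values to the field widths.
    widths = [end - start + 1 for _, start, end in LAYOUT]
    values = ["vhigh", "high", "2", "4", "small", "low", "unacc"]
    sample = "".join(v.ljust(w) for v, w in zip(values, widths))
    print(parse_record(sample))

    # Tally a tiny, made-up list of actual versus predicted class labels.
    actual = ["acc", "unacc", "acc"]
    predicted = ["acc", "acc", "acc"]
    for (a, p), n in sorted(confusion_matrix(actual, predicted).items()):
        print(f"actual={a} predicted={p} count={n}")
```

To tally a matrix for real data, collect the acceptability value of each record and
the predicted class side by side. Where the class and conf fields appear in
carid3out.txt or carnnout.txt depends on the output fields you selected, so their
positions are not assumed here.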