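weka.classifiers.rules.DecisionTable is a decision table learner that handles both nominal (classification) and numeric (regression) class prediction.
By default it uses best-first search (greedy hill climbing with backtracking), scored by cross-validated accuracy for a nominal class or root mean squared error for a numeric class, to find the best attribute subset.
It then reduces the training set so that each instance is described only by the retained attributes.
Each distinct combination of retained attribute values becomes a rule: the antecedent is that combination of values, and the consequent is the majority class (or, for a numeric class, the mean class value) of the training instances that share it.
At prediction time, if a new instance matches the antecedent of a rule, the rule's consequent is the prediction.
If the decision table does not cover the new instance, the prediction falls back to k-nearest neighbours or to the overall majority class of the training data.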
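The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Weka's implementation: the function names, the hand-picked attribute subset (outlook and temperature, indices 0 and 1) and the tie-breaking rule are all assumptions, and the k-nearest-neighbour fallback is left out. The data is the weather.nominal set used later in this post.

```python
from collections import Counter, defaultdict

CLASS_ORDER = ["yes", "no"]  # assumption: ties go to the label listed first

DATA = """
sunny hot high FALSE no
sunny hot high TRUE no
rainy cool normal TRUE no
sunny mild high FALSE no
rainy mild high TRUE no
overcast hot high FALSE yes
rainy mild high FALSE yes
rainy cool normal FALSE yes
overcast cool normal TRUE yes
sunny cool normal FALSE yes
rainy mild normal FALSE yes
sunny mild normal TRUE yes
overcast mild high TRUE yes
overcast hot normal FALSE yes
"""
ROWS = [tuple(line.split()) for line in DATA.strip().splitlines()]


def majority(counts):
    """Most frequent label in a Counter; ties resolved by CLASS_ORDER."""
    return max(CLASS_ORDER, key=lambda label: counts[label])


def build_table(rows, kept):
    """Group rows by their values on the kept attribute indices.

    Returns (rules, default): rules maps each value combination to its
    majority class, default is the majority class of all rows.
    """
    cells = defaultdict(Counter)
    overall = Counter()
    for *features, label in rows:
        key = tuple(features[i] for i in kept)
        cells[key][label] += 1
        overall[label] += 1
    rules = {key: majority(counts) for key, counts in cells.items()}
    return rules, majority(overall)


def predict(rules, default, features, kept):
    """Use the matching rule if there is one, else the overall majority class."""
    key = tuple(features[i] for i in kept)
    return rules.get(key, default)


rules, default = build_table(ROWS, kept=(0, 1))
print(predict(rules, default, ("sunny", "hot", "high", "TRUE"), (0, 1)))   # no
print(predict(rules, default, ("rainy", "hot", "high", "FALSE"), (0, 1)))  # yes (not covered)
print(build_table(ROWS, kept=())[0])                                       # {(): 'yes'}
```

With an empty attribute subset the table collapses to a single rule that always predicts the majority class, which is the situation shown in the Weka run below (Number of Rules : 1).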
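Parameters:
-X crossVal: [1] number of cross-validation folds; 1 means leave-one-out (one instance held out for testing, the rest used for training)
-I useIBk: [false] if set, use k-nearest neighbours for instances the table does not cover; otherwise use the majority class
-R displayRules: [false] print the decision table
-E evaluationMeasure: [acc or rmse] by default, accuracy for a nominal class and root mean squared error for a numeric class
other measures: mae, auc
-S search: [weka.attributeSelection.BestFirst] attribute subset search strategy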
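-- The following are options of the search strategy --
-D direction: [1] 0 = backward (attributes are removed), 1 = forward (attributes are added), 2 = bidirectional
-S lookupCacheSize: [1] size of the cache of evaluated subsets, as a multiple of the number of attributes in the data set
-N searchTermination: [5] number of consecutive non-improving node expansions tolerated before the search gives up
-P startSet: [] attribute subset the search starts from; empty by default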
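To make the search options more concrete, here is a rough Python sketch of a forward best-first search with a stale limit. It is a simplification written for this post, not Weka's BestFirst: it only searches forward (like -D 1), starts from the empty set (like an empty -P), and max_stale stands in for -N. The toy_merit function is purely hypothetical; in DecisionTable the merit would be the cross-validated accuracy or error of the table built on that subset.

```python
import heapq
from itertools import count

TARGET = frozenset({0, 2})  # hypothetical "ideal" subset used only by toy_merit


def toy_merit(subset):
    """Hypothetical score: reward target attributes, penalise the others."""
    return len(subset & TARGET) - 0.1 * len(subset - TARGET)


def best_first_forward(n_attrs, merit, max_stale=5):
    """Forward best-first search over attribute subsets (higher merit is better).

    Stops after max_stale consecutive expansions that did not improve on
    the best subset found so far, or when nothing is left to expand.
    """
    start = frozenset()
    best, best_merit = start, merit(start)
    order = count()  # tie-breaker so the heap never compares frozensets
    open_list = [(-best_merit, next(order), start)]
    seen = {start}  # every subset evaluated so far
    stale = 0
    while open_list and stale < max_stale:
        _, _, subset = heapq.heappop(open_list)  # most promising open subset
        improved = False
        for attr in range(n_attrs):
            child = subset | {attr}
            if attr in subset or child in seen:
                continue
            seen.add(child)
            score = merit(child)
            heapq.heappush(open_list, (-score, next(order), child))
            if score > best_merit:
                best, best_merit, improved = child, score, True
        stale = 0 if improved else stale + 1
    return best, best_merit, len(seen)


best, best_merit, evaluated = best_first_forward(4, toy_merit, max_stale=5)
print(sorted(best), best_merit)  # [0, 2] 2.0
```

Unlike plain hill climbing, the open list keeps earlier candidates around, so the search can go back to a less promising subset once the current branch stops improving.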
Reference:
Kohavi, R. (1995). The Power of Decision Tables. ECML-95.
> java weka.classifiers.rules.DecisionTable -R -t data\weather.nominal.arff
Options: -R
Decision Table:
Number of training instances: 14
Number of Rules : 1
Non matches covered by Majority class.
Best first.
Start set: no attributes
Search direction: forward
Stale search after 5 node expansions
Total number of subsets evaluated: 12
Merit of best subset found: 64.286
Evaluation (for feature selection): CV (leave one out)
Feature set: 5
Rules:
================
play
================
yes
================
Time taken to build model: 0 seconds
Time taken to test model on training data: 0 seconds
=== Error on training data ===
Correctly Classified Instances 9 64.2857 %
Incorrectly Classified Instances 5 35.7143 %
Kappa statistic 0
Mean absolute error 0.4524
Root mean squared error 0.4797
Relative absolute error 97.4359 %
Root relative squared error 100.0539 %
Total Number of Instances 14
=== Confusion Matrix ===
a b <-- classified as
9 0 | a = yes
5 0 | b = no
=== Stratified cross-validation ===
Correctly Classified Instances 6 42.8571 %
Incorrectly Classified Instances 8 57.1429 %
Kappa statistic -0.3659
Mean absolute error 0.5318
Root mean squared error 0.5583
Relative absolute error 111.6786 %
Root relative squared error 113.1584 %
Total Number of Instances 14
=== Confusion Matrix ===
a b <-- classified as
6 3 | a = yes
5 0 | b = no
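The Kappa statistic lines in the output above can be checked by hand from the two confusion matrices. The sketch below uses the standard Cohen's kappa formula, (observed agreement - chance agreement) / (1 - chance agreement); the function name is my own and the code is only a check of the arithmetic, not Weka code.

```python
def kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows actual, columns predicted)."""
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(size)) / n
    # chance agreement: sum over classes of (row total * column total) / n^2
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


train = [[9, 0], [5, 0]]  # error on training data
cv = [[6, 3], [5, 0]]     # stratified cross-validation

print(round(kappa(train), 4))  # 0.0
print(round(kappa(cv), 4))     # -0.3659
```

A kappa of 0 on the training data says the single-rule table does no better than chance agreement, and the negative value under cross-validation says it does worse.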
The weather.nominal.arff data set, shown below, has 14 instances: 9 yes and 5 no.
outlook | temperature | humidity | windy | play |
sunny | hot | high | FALSE | no |
sunny | hot | high | TRUE | no |
rainy | cool | normal | TRUE | no |
sunny | mild | high | FALSE | no |
rainy | mild | high | TRUE | no |
overcast | hot | high | FALSE | yes |
rainy | mild | high | FALSE | yes |
rainy | cool | normal | FALSE | yes |
overcast | cool | normal | TRUE | yes |
sunny | cool | normal | FALSE | yes |
rainy | mild | normal | FALSE | yes |
sunny | mild | normal | TRUE | yes |
overcast | mild | high | TRUE | yes |
overcast | hot | normal | FALSE | yes |
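The class counts explain the 64.286 figure in the Weka output. With no attributes kept, the table can only predict the majority class, and a leave-one-out pass over the 14 labels gets 9 of them right. The small sketch below reproduces that arithmetic; it is my own reconstruction of the calculation, and whether Weka computes the merit in exactly this way internally is an assumption.

```python
from collections import Counter

labels = ["no"] * 5 + ["yes"] * 9  # class column of the table above


def loo_majority_accuracy(labels):
    """Leave-one-out accuracy (in percent) of predicting the majority class of the rest."""
    overall = Counter(labels)
    hits = 0
    for held_out in labels:
        rest = overall.copy()
        rest[held_out] -= 1  # remove the held-out instance from the counts
        prediction = max(sorted(rest), key=rest.get)
        hits += prediction == held_out
    return 100.0 * hits / len(labels)


print(round(loo_majority_accuracy(labels), 3))  # 64.286
```

Every yes instance is predicted correctly (8 yes against 5 no remain) and every no instance is missed (9 yes against 4 no remain), giving 9/14.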
Reference:
1. weka.classifiers.rules.DecisionTable