Thursday, March 9, 2017

weka.classifiers.rules.DecisionTable

 
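weka.classifiers.rules.DecisionTable is a decision table learner for both classification (nominal class) and numeric prediction.
By default it uses best-first search (greedy hill climbing with backtracking) to find the attribute subset that scores best under cross-validation, measured by accuracy for a nominal class or by root mean squared error (RMSE) for a numeric class.
The training set is then reduced so that each instance is described only by the retained attributes.
Each distinct combination of retained attribute values acts as a rule: its antecedent is that combination of values, and its consequent is the majority class (or, for a numeric class, the mean target value) of the training instances that share it.
At prediction time, a new instance that matches a rule's antecedent is given that rule's consequent.
If no rule in the table covers the new instance, the prediction falls back to either the k-nearest-neighbor method or the overall majority class.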
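
The table lookup described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Weka's implementation: the helper names `build_table`, `majority`, and `predict` are made up for this example, the attribute subset is chosen by hand rather than by search, and only the majority-class fallback is shown (no k-nearest-neighbor fallback).

```python
from collections import Counter, defaultdict

# weather.nominal.arff rows: (outlook, temperature, humidity, windy, play)
DATA = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("rainy", "mild", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
]


def build_table(rows, kept):
    """Group rows by their values on the kept attribute indices and count classes."""
    table = defaultdict(Counter)
    for row in rows:
        key = tuple(row[i] for i in kept)
        table[key][row[-1]] += 1
    return dict(table)


def majority(counts):
    """Return the most frequent class label in a Counter."""
    return counts.most_common(1)[0][0]


def predict(table, default, kept, row):
    """Use the matching rule if there is one, otherwise the default class."""
    counts = table.get(tuple(row[i] for i in kept))
    return majority(counts) if counts else default


# Overall majority class, used for instances that no rule covers.
default_class = majority(Counter(row[-1] for row in DATA))

# Empty attribute subset: every instance maps to the same key, so there is one rule.
empty_table = build_table(DATA, kept=[])
print(len(empty_table), predict(empty_table, default_class, [], DATA[0]))  # 1 yes

# Keeping only outlook (index 0) gives one rule per outlook value.
outlook_table = build_table(DATA, kept=[0])
print(predict(outlook_table, default_class, [0], ("sunny", "cool", "normal", "TRUE")))  # no
print(predict(outlook_table, default_class, [0], ("foggy", "cool", "normal", "TRUE")))  # yes
```

With the empty subset the table collapses to a single rule that always predicts yes, which is the situation shown in the run further down (Number of Rules : 1). The value "foggy" is a made-up outlook used only to trigger the fallback.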

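Parameters:
-X  crossVal: [1] number of cross-validation folds; 1 means leave-one-out (one instance held out for testing, the rest used for training)
-I  useIBk: [false] if set, use the k-nearest-neighbor method for instances the table does not cover; otherwise use the majority class
-R  displayRules: [false] print the decision table
-E  evaluationMeasure: [acc or rmse] by default, accuracy for a nominal class and RMSE for a numeric class;
                       other available measures are mae and auc
-S  search: [weka.attributeSelection.BestFirst] attribute subset search strategy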

  -- The following are parameters of the search strategy --

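-D  direction: [1] 0 = backward (attributes are removed), 1 = forward (attributes are added), 2 = bidirectional
-S  lookupCacheSize: [1] size of the cache of evaluated subsets, as a multiple of the number of attributes in the data set
-N  searchTermination: [5] number of consecutive non-improving node expansions tolerated before the search gives up
-P  startSet: [] attribute subset from which the search starts; empty by default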
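
To make the roles of crossVal = 1, direction = 1, searchTermination, and startSet more concrete, here is a rough Python sketch of a forward best-first search that scores attribute subsets by leave-one-out accuracy. It is a simplified stand-in, not Weka's BestFirst or DecisionTable code: the function names are invented for this example, there is no lookup cache, and tie-breaking and the treatment of uncovered instances are assumptions, so its results need not match Weka's on every subset.

```python
from collections import Counter

# weather.nominal.arff rows: (outlook, temperature, humidity, windy, play)
DATA = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("rainy", "mild", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
]


def majority(labels):
    """Most frequent label; ties go to the label seen first (an assumption)."""
    return Counter(labels).most_common(1)[0][0]


def loo_accuracy(rows, kept):
    """Leave-one-out accuracy of a decision table built on the kept attribute indices."""
    kept = sorted(kept)
    hits = 0
    for i, held_out in enumerate(rows):
        rest = rows[:i] + rows[i + 1:]
        key = tuple(held_out[a] for a in kept)
        matches = [r[-1] for r in rest if tuple(r[a] for a in kept) == key]
        # Fall back to the majority class of the remaining rows when no rule matches.
        guess = majority(matches) if matches else majority([r[-1] for r in rest])
        hits += guess == held_out[-1]
    return hits / len(rows)


def best_first_forward(rows, n_attrs, merit, max_stale=5):
    """Forward best-first search from the empty set; stop after max_stale
    consecutive node expansions that do not improve the best merit."""
    start = frozenset()
    best, best_merit = start, merit(rows, start)
    open_list = [(best_merit, start)]
    visited = {start}
    stale = 0
    while open_list and stale < max_stale:
        open_list.sort(key=lambda item: item[0])  # highest merit ends up last
        _, subset = open_list.pop()               # expand the most promising node
        improved = False
        for a in range(n_attrs):
            child = subset | {a}
            if a in subset or child in visited:
                continue
            visited.add(child)
            m = merit(rows, child)
            open_list.append((m, child))
            if m > best_merit + 1e-5:
                best, best_merit, improved = child, m, True
        stale = 0 if improved else stale + 1
    return best, best_merit, len(visited)


# Merit of the empty subset: always predicting the majority class under leave-one-out.
print(round(100 * loo_accuracy(DATA, frozenset()), 3))  # 64.286

best, score, evaluated = best_first_forward(DATA, 4, loo_accuracy)
print(sorted(best), round(100 * score, 3), evaluated)
```

The empty-subset merit works out to 9/14, the same 64.286 reported as "Merit of best subset found" in the run below: holding out a yes instance leaves 8 yes against 5 no (correct), while holding out a no instance leaves 9 yes against 4 no (wrong).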

Reference:
    Ron Kohavi. The Power of Decision Tables. ECML-95.

> java weka.classifiers.rules.DecisionTable -R -t data\weather.nominal.arff


Options: -R 

Decision Table:

Number of training instances: 14
Number of Rules : 1
Non matches covered by Majority class.
 Best first.
 Start set: no attributes
 Search direction: forward
 Stale search after 5 node expansions
 Total number of subsets evaluated: 12
 Merit of best subset found:   64.286
Evaluation (for feature selection): CV (leave one out) 
Feature set: 5

Rules:
================
play  
================
yes
================



Time taken to build model: 0 seconds
Time taken to test model on training data: 0 seconds

=== Error on training data ===

Correctly Classified Instances           9               64.2857 %
Incorrectly Classified Instances         5               35.7143 %
Kappa statistic                          0     
Mean absolute error                      0.4524
Root mean squared error                  0.4797
Relative absolute error                 97.4359 %
Root relative squared error            100.0539 %
Total Number of Instances               14     


=== Confusion Matrix ===

 a b   <-- classified as
 9 0 | a = yes
 5 0 | b = no



=== Stratified cross-validation ===

Correctly Classified Instances           6               42.8571 %
Incorrectly Classified Instances         8               57.1429 %
Kappa statistic                         -0.3659
Mean absolute error                      0.5318
Root mean squared error                  0.5583
Relative absolute error                111.6786 %
Root relative squared error            113.1584 %
Total Number of Instances               14     


=== Confusion Matrix ===

 a b   <-- classified as
 6 3 | a = yes
 5 0 | b = no
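
The accuracy and Kappa figures in the two reports can be checked by hand from their confusion matrices. The short Python sketch below does that arithmetic using the standard Cohen's kappa formula; the function name `accuracy_and_kappa` is made up for this example and is not part of Weka.

```python
def accuracy_and_kappa(matrix):
    """Accuracy and Cohen's kappa from a square confusion matrix (rows = actual class)."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, given the row and column totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return observed, (observed - expected) / (1 - expected)


train = [[9, 0], [5, 0]]  # error on training data
cv = [[6, 3], [5, 0]]     # stratified cross-validation

train_acc, train_kappa = accuracy_and_kappa(train)
cv_acc, cv_kappa = accuracy_and_kappa(cv)

print(round(100 * train_acc, 4))                  # 64.2857
print(round(100 * cv_acc, 4), round(cv_kappa, 4))  # 42.8571 -0.3659
```

On the training data the model always answers yes, so its observed agreement (9/14) equals the chance agreement and Kappa is 0. Under cross-validation the observed agreement (6/14) is below the chance agreement (114/196), which is why Kappa turns negative.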

The weather.nominal.arff data set, listed below, has 14 instances: 9 with play = yes and 5 with play = no.
outlook   temperature  humidity  windy  play
sunny     hot          high      FALSE  no
sunny     hot          high      TRUE   no
rainy     cool         normal    TRUE   no
sunny     mild         high      FALSE  no
rainy     mild         high      TRUE   no
overcast  hot          high      FALSE  yes
rainy     mild         high      FALSE  yes
rainy     cool         normal    FALSE  yes
overcast  cool         normal    TRUE   yes
sunny     cool         normal    FALSE  yes
rainy     mild         normal    FALSE  yes
sunny     mild         normal    TRUE   yes
overcast  mild         high      TRUE   yes
overcast  hot          normal    FALSE  yes
References:
1.weka.classifiers.rules.DecisionTable
   code | doc
