Seke Blog: weka.classifiers.rules.DecisionTable

 
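weka.classifiers.rules.DecisionTable is a decision table learner for both nominal-class and numeric prediction.
By default it uses best-first search (greedy hill climbing with backtracking) to find the attribute subset that scores best under cross-validation, measured by accuracy for a nominal class or root mean squared error for a numeric class.
The training set is then reduced so that each instance is described only by the retained attributes.
Each instance is treated as a rule: its antecedent consists of its values on the retained attributes, and its consequent is the majority class (or the mean value, for a numeric class) of the instances that share that antecedent.
At prediction time, a new instance that matches the antecedent of some rule is predicted with that rule's consequent.
If the decision table does not cover the new instance, the prediction comes from k-nearest neighbors or, failing that option, from the overall majority class.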
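
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Weka's implementation: `build_table` and `predict` are hypothetical names, the attribute subset is picked by hand instead of being searched for, ties are broken by whichever class was seen first, and only the majority-class fallback is shown. The rows are those of the weather.nominal.arff dataset used later in this post.

```python
from collections import Counter, defaultdict

# Rows of weather.nominal.arff: outlook, temperature, humidity, windy, play
DATA = """\
sunny hot high FALSE no
sunny hot high TRUE no
rainy cool normal TRUE no
sunny mild high FALSE no
rainy mild high TRUE no
overcast hot high FALSE yes
rainy mild high FALSE yes
rainy cool normal FALSE yes
overcast cool normal TRUE yes
sunny cool normal FALSE yes
rainy mild normal FALSE yes
sunny mild normal TRUE yes
overcast mild high TRUE yes
overcast hot normal FALSE yes
"""
ROWS = [tuple(line.split()) for line in DATA.splitlines()]


def build_table(rows, subset):
    """Map each value combination of the kept attributes to its majority class."""
    counts = defaultdict(Counter)
    for row in rows:
        key = tuple(row[i] for i in subset)  # antecedent: kept attribute values
        counts[key][row[-1]] += 1            # tally the class (last column)
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    # Overall majority class, used when no rule matches.
    default = Counter(row[-1] for row in rows).most_common(1)[0][0]
    return table, default


def predict(table, default, subset, instance):
    """Look up the instance's antecedent; fall back to the majority class."""
    key = tuple(instance[i] for i in subset)
    return table.get(key, default)


subset = (0, 1)  # keep outlook and temperature (hand-picked for illustration)
table, default = build_table(ROWS, subset)
print(predict(table, default, subset, ("sunny", "hot", "high", "FALSE")))    # covered by a rule
print(predict(table, default, subset, ("rainy", "hot", "normal", "TRUE")))   # not covered: fallback
```

With an empty subset, `build_table(ROWS, ())` collapses to a single rule that predicts the majority class, which is the shape of the one-rule table printed in the run below.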

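Options:
-X  crossVal: [1] number of cross-validation folds; 1 means leave-one-out (one instance held out for testing, the rest used for training)
-I  useIBk: [false] for instances not covered by the table, use k-nearest neighbors; otherwise use the majority class
-R  displayRules: [false] print the decision table
-E  evaluationMeasure: [acc or rmse] by default, accuracy for a nominal class and root mean squared error for a numeric class
                       other available measures: mae, auc
-S  search: [weka.attributeSelection.BestFirst] attribute subset search strategy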

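  -- Options of the search strategy (BestFirst) --

-D  direction: [1] 0 = backward (attributes are removed), 1 = forward (attributes are added), 2 = bidirectional
-S  lookupCacheSize: [1] size of the cache of evaluated subsets, as a multiple of the number of attributes in the dataset
-N  searchTermination: [5] number of consecutive non-improving node expansions tolerated before the search stops
-P  startSet: [] attribute subset the search starts from; empty by default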
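
To make the search options more concrete, here is a simplified sketch of a forward best-first search with the stale-expansion cutoff. It is an illustration only and not Weka's BestFirst code: `best_first_forward`, `merit`, `max_stale` and `toy_merit` are hypothetical names, only the forward direction (-D 1) and the -N cutoff are modeled, and the start set and lookup cache size are left out.

```python
import heapq


def best_first_forward(n_attrs, merit, max_stale=5):
    """Forward best-first search over attribute subsets.

    merit: function taking a frozenset of attribute indices, higher is better.
    max_stale: stop after this many consecutive expansions without improvement.
    """
    start = frozenset()
    best, best_merit = start, merit(start)
    open_list = [(-best_merit, 0, start)]  # max-heap via negated merit
    seen = {start}
    counter = 1  # unique tie-breaker so the heap never compares subsets
    stale = 0
    while open_list and stale < max_stale:
        _, _, node = heapq.heappop(open_list)  # most promising open subset
        improved = False
        for attr in range(n_attrs):
            if attr in node:
                continue
            child = node | {attr}  # forward step: add one attribute
            if child in seen:
                continue
            seen.add(child)
            m = merit(child)
            heapq.heappush(open_list, (-m, counter, child))
            counter += 1
            if m > best_merit:  # only a strict improvement counts
                best, best_merit = child, m
                improved = True
        stale = 0 if improved else stale + 1
    return best, best_merit


def toy_merit(subset):
    """Made-up merit: attributes 0 and 2 help, every attribute costs a little."""
    return 50 + 20 * (0 in subset) + 10 * (2 in subset) - 5 * len(subset)


print(best_first_forward(4, toy_merit))  # (frozenset({0, 2}), 70)
```

Because only a strict improvement resets the counter, a merit function that never beats the start set makes the search return the empty subset after `max_stale` expansions.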

Reference:
    Kohavi, "The Power of Decision Tables", ECML-95

> java weka.classifiers.rules.DecisionTable -R -t data\weather.nominal.arff


Options: -R 

Decision Table:

Number of training instances: 14
Number of Rules : 1
Non matches covered by Majority class.
 Best first.
 Start set: no attributes
 Search direction: forward
 Stale search after 5 node expansions
 Total number of subsets evaluated: 12
 Merit of best subset found:   64.286
Evaluation (for feature selection): CV (leave one out) 
Feature set: 5

Rules:
================
play  
================
yes
================



Time taken to build model: 0 seconds
Time taken to test model on training data: 0 seconds

=== Error on training data ===

Correctly Classified Instances           9               64.2857 %
Incorrectly Classified Instances         5               35.7143 %
Kappa statistic                          0     
Mean absolute error                      0.4524
Root mean squared error                  0.4797
Relative absolute error                 97.4359 %
Root relative squared error            100.0539 %
Total Number of Instances               14     


=== Confusion Matrix ===

 a b   <-- classified as
 9 0 | a = yes
 5 0 | b = no



=== Stratified cross-validation ===

Correctly Classified Instances           6               42.8571 %
Incorrectly Classified Instances         8               57.1429 %
Kappa statistic                         -0.3659
Mean absolute error                      0.5318
Root mean squared error                  0.5583
Relative absolute error                111.6786 %
Root relative squared error            113.1584 %
Total Number of Instances               14     


=== Confusion Matrix ===

 a b   <-- classified as
 6 3 | a = yes
 5 0 | b = no
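
The accuracy and kappa figures in the output above can be checked by hand from the two confusion matrices. The sketch below applies the standard definition of Cohen's kappa (observed agreement corrected for chance agreement); `accuracy_and_kappa` is a hypothetical helper written for this check, not a Weka call.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are actual classes, columns are predicted classes.
    """
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


training = [[9, 0], [5, 0]]          # "Error on training data" matrix
cross_validation = [[6, 3], [5, 0]]  # "Stratified cross-validation" matrix

for name, matrix in (("training", training), ("cv", cross_validation)):
    acc, kappa = accuracy_and_kappa(matrix)
    print(f"{name}: accuracy {100 * acc:.4f} %, kappa {kappa:.4f}")
```

A kappa of 0 on the training data means the single majority-class rule does no better than chance agreement, and the negative value under cross-validation means it does worse.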

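The weather.nominal.arff dataset, listed below, has 14 instances: 9 yes and 5 no. Since the search kept no attributes, the single rule predicts the majority class yes, which is why 9 of the 14 training instances (64.2857 %) are classified correctly.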

outlook	temperature	humidity	windy	play
sunny	hot	high	FALSE	no
sunny	hot	high	TRUE	no
rainy	cool	normal	TRUE	no
sunny	mild	high	FALSE	no
rainy	mild	high	TRUE	no
overcast	hot	high	FALSE	yes
rainy	mild	high	FALSE	yes
rainy	cool	normal	FALSE	yes
overcast	cool	normal	TRUE	yes
sunny	cool	normal	FALSE	yes
rainy	mild	normal	FALSE	yes
sunny	mild	normal	TRUE	yes
overcast	mild	high	TRUE	yes
overcast	hot	normal	FALSE	yes
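
The merit of 64.286 reported in the output above for the empty attribute subset can be reproduced from this table with a leave-one-out loop. This is a rough sketch under assumptions: `leave_one_out_accuracy` is a hypothetical name, the fallback for an unmatched instance is the majority class of the remaining rows, and ties go to the class seen first, which may not match Weka's internal bookkeeping for other subsets.

```python
from collections import Counter

# Rows of weather.nominal.arff: outlook, temperature, humidity, windy, play
DATA = """\
sunny hot high FALSE no
sunny hot high TRUE no
rainy cool normal TRUE no
sunny mild high FALSE no
rainy mild high TRUE no
overcast hot high FALSE yes
rainy mild high FALSE yes
rainy cool normal FALSE yes
overcast cool normal TRUE yes
sunny cool normal FALSE yes
rainy mild normal FALSE yes
sunny mild normal TRUE yes
overcast mild high TRUE yes
overcast hot normal FALSE yes
"""
ROWS = [tuple(line.split()) for line in DATA.splitlines()]


def leave_one_out_accuracy(rows, subset):
    """Hold out each row in turn and predict it from the remaining rows."""
    correct = 0
    for i, held_out in enumerate(rows):
        rest = rows[:i] + rows[i + 1:]
        key = tuple(held_out[j] for j in subset)
        # Classes of the remaining rows that share the held-out row's antecedent.
        matches = [r[-1] for r in rest if tuple(r[j] for j in subset) == key]
        # No matching row: fall back to the majority class of the remaining rows.
        votes = Counter(matches or [r[-1] for r in rest])
        if votes.most_common(1)[0][0] == held_out[-1]:
            correct += 1
    return correct / len(rows)


merit = leave_one_out_accuracy(ROWS, ())  # empty subset: class column only
print(f"{100 * merit:.3f}")  # 64.286
```

Holding out a yes row leaves 8 yes against 5 no, and holding out a no row leaves 9 yes against 4 no, so the prediction is always yes and exactly the 9 yes rows are counted as correct.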

References:
1.weka.classifiers.rules.DecisionTable
   code | doc
