Seke Blog: weka.classifiers.rules.OneR

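weka.classifiers.rules.OneR is a single-node decision tree (single-attribute rule) learner.
It predicts by majority vote within each value of one attribute (or within each interval, for a numeric attribute), and gives a baseline performance figure for a dataset that other learners can be benchmarked against.
Any learner should beat the OneR baseline to be worth using.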

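During training, OneR builds a single-node decision tree for each attribute and keeps the one with the lowest error rate.
At prediction time it uses only that tree: the predicted class is the majority class for the instance's value (or interval) of that one attribute.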
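
The training step for nominal attributes can be sketched in a few lines of Python. This is a minimal illustration, not Weka's code: the names `one_rule` and `one_r` are made up for this example, only the two nominal attributes (outlook and windy) of the weather data listed later in this post are included, and a tie inside a value simply goes to whichever class was seen first.

```python
from collections import Counter, defaultdict

# (outlook, windy, play) for the 14 weather instances; numeric columns omitted.
ROWS = [
    ("overcast", "FALSE", "yes"), ("overcast", "TRUE", "yes"),
    ("overcast", "TRUE", "yes"), ("overcast", "FALSE", "yes"),
    ("rainy", "TRUE", "no"), ("rainy", "TRUE", "no"),
    ("rainy", "FALSE", "yes"), ("rainy", "FALSE", "yes"),
    ("rainy", "FALSE", "yes"), ("sunny", "FALSE", "no"),
    ("sunny", "TRUE", "no"), ("sunny", "FALSE", "no"),
    ("sunny", "FALSE", "yes"), ("sunny", "TRUE", "yes"),
]
ATTRIBUTES = {"outlook": 0, "windy": 1}  # attribute name -> column index
CLASS_INDEX = 2


def one_rule(rows, attr_index, class_index):
    """Build the value -> majority class rule for one attribute."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[attr_index]][row[class_index]] += 1
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Training instances the rule gets right: the majority count of each value.
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return rule, correct


def one_r(rows, attributes, class_index):
    """Keep the single-attribute rule with the most correct training instances."""
    best = None
    for name, idx in attributes.items():
        rule, correct = one_rule(rows, idx, class_index)
        if best is None or correct > best[2]:
            best = (name, rule, correct)
    return best


name, rule, correct = one_r(ROWS, ATTRIBUTES, CLASS_INDEX)
print(name, rule, f"({correct}/{len(ROWS)} instances correct)")
```

On these rows the sketch keeps outlook with 10 of 14 instances correct; windy gets 9.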

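Parameters:
-B  Bucket-size parameter for splitting a numeric attribute into intervals; default 6.
    For an interval to stand, its majority class must have at least this many instances.
    The lower this threshold, the more easily small intervals appear and the more closely the rule fits the training instances.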

> java -cp weka.jar;. weka.classifiers.rules.OneR  -t data\weather.numeric.arff

outlook:
        sunny   -> no
        overcast        -> yes
        rainy   -> yes
(10/14 instances correct)


Time taken to build model: 0.01 seconds
Time taken to test model on training data: 0 seconds

=== Error on training data ===

Correctly Classified Instances          10               71.4286 %
Incorrectly Classified Instances         4               28.5714 %
Kappa statistic                          0.3778
Mean absolute error                      0.2857
Root mean squared error                  0.5345
Relative absolute error                 61.5385 %
Root relative squared error            111.4773 %
Total Number of Instances               14


=== Confusion Matrix ===

 a b   <-- classified as
 7 2 | a = yes
 2 3 | b = no



=== Stratified cross-validation ===

Correctly Classified Instances           6               42.8571 %
Incorrectly Classified Instances         8               57.1429 %
Kappa statistic                         -0.2444
Mean absolute error                      0.5714
Root mean squared error                  0.7559
Relative absolute error                120      %
Root relative squared error            153.2194 %
Total Number of Instances               14


=== Confusion Matrix ===

 a b   <-- classified as
 5 4 | a = yes
 4 1 | b = no

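Of the four attributes in the weather.numeric.arff dataset listed below, the single-node decision tree built on outlook has the lowest error rate (4 errors in 14 instances with the default -B 6).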

outlook	temperature	humidity	windy	play
overcast	83	86	FALSE	yes
overcast	64	65	TRUE	yes
overcast	72	90	TRUE	yes
overcast	81	75	FALSE	yes
rainy	65	70	TRUE	no
rainy	71	91	TRUE	no
rainy	70	96	FALSE	yes
rainy	68	80	FALSE	yes
rainy	75	80	FALSE	yes
sunny	85	85	FALSE	no
sunny	80	90	TRUE	no
sunny	72	95	FALSE	no
sunny	69	70	FALSE	yes
sunny	75	70	TRUE	yes

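For numeric attributes OneR provides the bucket-size parameter -B, default 6:
for an interval to stand, its majority class must have at least that many instances (6 by default). The lower this threshold, the more easily small intervals appear and the more closely the rule fits the training instances.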
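
How such a threshold can shape the intervals is easier to see in code. The sketch below is an assumed merge procedure written for illustration, not taken from the Weka source: `bucket_rule`, `merge_pass` and `majority` are hypothetical names, cut points are placed midway between neighbouring distinct values, and a tie in an interval goes to the class listed first. With `min_bucket=1` it arrives at the same cut points and the same 13/14 as the `-B 1` run in this post.

```python
# (temperature, play) pairs from weather.numeric.arff as listed in this post.
TEMPERATURE_PLAY = [
    (83, "yes"), (64, "yes"), (72, "yes"), (81, "yes"), (65, "no"),
    (71, "no"), (70, "yes"), (68, "yes"), (75, "yes"), (85, "no"),
    (80, "no"), (72, "no"), (69, "yes"), (75, "yes"),
]
CLASSES = ["yes", "no"]  # order matters: ties go to the class listed first


def majority(dist):
    """Index of the largest count; the first index wins a tie."""
    return dist.index(max(dist))


def merge_pass(intervals, should_merge):
    """Merge each interval into its left neighbour when should_merge says so."""
    out = []
    for dist, cut in intervals:
        if out and should_merge(out[-1][0], dist):
            out[-1] = ([a + b for a, b in zip(out[-1][0], dist)], cut)
        else:
            out.append((list(dist), cut))
    return out


def bucket_rule(pairs, classes, min_bucket):
    """Return (cut points, class per interval, training instances correct)."""
    intervals = []  # one (class counts, upper cut point) per distinct value
    last = None
    for value, label in sorted(pairs, key=lambda p: p[0]):
        if last is None or value > last:
            if last is not None:
                # Close the previous interval midway to the new value.
                intervals[-1] = (intervals[-1][0], (last + value) / 2.0)
            intervals.append(([0] * len(classes), float("inf")))
            last = value
        intervals[-1][0][classes.index(label)] += 1

    # Pass 1: absorb the next value while the class agrees or the bucket is not full.
    intervals = merge_pass(
        intervals,
        lambda old, new: majority(old) == majority(new) or max(old) < min_bucket,
    )
    # Pass 2: neighbours that ended up with the same class become one interval.
    intervals = merge_pass(intervals, lambda old, new: majority(old) == majority(new))

    cuts = [cut for _, cut in intervals[:-1]]
    labels = [classes[majority(dist)] for dist, _ in intervals]
    correct = sum(max(dist) for dist, _ in intervals)
    return cuts, labels, correct


cuts_b1, labels_b1, correct_b1 = bucket_rule(TEMPERATURE_PLAY, CLASSES, 1)
print(cuts_b1, labels_b1, correct_b1)
print(bucket_rule(TEMPERATURE_PLAY, CLASSES, 6))
```

Under the same assumptions, `min_bucket=6` collapses temperature into a single interval that gets only 9 of 14 training instances right, which fits the default run preferring outlook.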

> java -cp weka.jar;. weka.classifiers.rules.OneR  -t data\weather.numeric.arff -B 1

Options: -B 1

temperature:
        < 64.5  -> yes
        < 66.5  -> no
        < 70.5  -> yes
        < 71.5  -> no
        < 77.5  -> yes
        < 80.5  -> no
        < 84.0  -> yes
        >= 84.0 -> no
(13/14 instances correct)


Time taken to build model: 0.01 seconds
Time taken to test model on training data: 0 seconds

=== Error on training data ===

Correctly Classified Instances          13               92.8571 %
Incorrectly Classified Instances         1                7.1429 %
Kappa statistic                          0.8372
Mean absolute error                      0.0714
Root mean squared error                  0.2673
Relative absolute error                 15.3846 %
Root relative squared error             55.7386 %
Total Number of Instances               14


=== Confusion Matrix ===

 a b   <-- classified as
 9 0 | a = yes
 1 4 | b = no



=== Stratified cross-validation ===

Correctly Classified Instances           5               35.7143 %
Incorrectly Classified Instances         9               64.2857 %
Kappa statistic                         -0.3404
Mean absolute error                      0.6429
Root mean squared error                  0.8018
Relative absolute error                135      %
Root relative squared error            162.5137 %
Total Number of Instances               14


=== Confusion Matrix ===

 a b   <-- classified as
 4 5 | a = yes
 4 1 | b = no
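
The accuracy and Kappa figures of both runs can be recomputed from the confusion matrices printed above, which makes the gap between training and cross-validation easy to compare. A small sketch, using the usual definition of Cohen's kappa (observed agreement against the agreement expected from the row and column totals); the function name `accuracy_and_kappa` is made up for this example.

```python
def accuracy_and_kappa(matrix):
    """Accuracy and Cohen's kappa for a square confusion matrix (rows = actual)."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return observed, (observed - expected) / (1 - expected)


# Confusion matrices copied from the two runs in this post.
RUNS = {
    "default, training": [[7, 2], [2, 3]],
    "default, cross-validation": [[5, 4], [4, 1]],
    "-B 1, training": [[9, 0], [1, 4]],
    "-B 1, cross-validation": [[4, 5], [4, 1]],
}
results = {name: accuracy_and_kappa(m) for name, m in RUNS.items()}
for name, (acc, kappa) in results.items():
    print(f"{name}: accuracy {acc:.4%}, kappa {kappa:.4f}")
```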


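The same weather.numeric.arff instances are listed below, sorted by temperature. With -B 1, the single-node decision tree built on temperature has a low training error rate (1 error in 14),
but it merely fits the data it has seen and predicts unseen data poorly: cross-validation accuracy falls to 35.7143 %, below the 42.8571 % of the outlook rule.

outlook	temperature	humidity	windy	play
overcast	64	65	TRUE	yes
rainy	65	70	TRUE	no
rainy	68	80	FALSE	yes
sunny	69	70	FALSE	yes
rainy	70	96	FALSE	yes
rainy	71	91	TRUE	no
overcast	72	90	TRUE	yes
sunny	72	95	FALSE	no
rainy	75	80	FALSE	yes
sunny	75	70	TRUE	yes
sunny	80	90	TRUE	no
overcast	81	75	FALSE	yes
overcast	83	86	FALSE	yes
sunny	85	85	FALSE	no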

