Seke Blog: weka.classifiers.functions.Winnow

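weka.classifiers.functions.Winnow is a mistake-driven learner.
It handles only nominal attributes, converts them to binary attributes, and predicts a binary class value. It can learn incrementally online.
It suits datasets that have many attributes, most of them irrelevant to the prediction, because it quickly homes in on the relevant ones.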

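Given instance attributes (a0, a1, ..., ak), a threshold theta, a weight promotion coefficient alpha, a weight demotion coefficient beta,
and a weight vector (w0, w1, ..., wk) or (w0+ - w0-, w1+ - w1-, ..., wk+ - wk-),
where theta, alpha, beta and every w, w+ and w- are positive, each binary attribute is 0 or 1, and the augmented attribute a0 is always 1, there are two prediction rules:

  Unbalanced version: every component of the weight vector must be positive
     w0 * a0 + w1 * a1 + ... + wk * ak > theta means class 1; otherwise class 2

  Balanced version: components of the weight vector may be negative
    (w0+ - w0-) * a0 + (w1+ - w1-) * a1 + ... + (wk+ - wk-) * ak > theta means class 1; otherwise class 2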
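
The two prediction rules can be sketched in a few lines of Python. This is only an illustration of the formulas above, not Weka's code; the function names `predict_unbalanced` and `predict_balanced` and the sample numbers are made up for the example.

```python
from typing import Sequence


def predict_unbalanced(w: Sequence[float], a: Sequence[int], theta: float) -> int:
    """Unbalanced rule: class 1 if sum(w_i * a_i) > theta, otherwise class 2."""
    total = sum(wi * ai for wi, ai in zip(w, a))
    return 1 if total > theta else 2


def predict_balanced(w_pos: Sequence[float], w_neg: Sequence[float],
                     a: Sequence[int], theta: float) -> int:
    """Balanced rule: the effective weight of attribute i is w_pos[i] - w_neg[i]."""
    total = sum((p - n) * ai for p, n, ai in zip(w_pos, w_neg, a))
    return 1 if total > theta else 2


# Hypothetical instance: a0 is the always-on augmented attribute.
a = [1, 1, 0, 1]
print(predict_unbalanced([2.0, 2.0, 2.0, 2.0], a, theta=4.0))        # 2+2+2 = 6 > 4, class 1
print(predict_balanced([2.0] * 4, [1.0] * 4, a, theta=4.0))          # 1+1+1 = 3 <= 4, class 2
```
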

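When a prediction is wrong during learning, the weight vector is adjusted as follows (only the weights of attributes whose value is 1 in the misclassified instance are changed):
  Class 2 mistaken for class 1:   w *= beta  or w+ *= beta  and w- *= alpha, which makes the weights smaller
  Class 1 mistaken for class 2:   w *= alpha or w+ *= alpha and w- *= beta,  which makes the weights larger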
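
A minimal Python sketch of the update step follows. It assumes, as in the classic Winnow formulation, that only the weights of active attributes (value 1) are rescaled; the function names and sample values are hypothetical and this is not Weka's implementation.

```python
from typing import List, Sequence, Tuple


def update_unbalanced(w: Sequence[float], a: Sequence[int], predicted: int,
                      actual: int, alpha: float = 2.0, beta: float = 0.5) -> List[float]:
    """Return the new weights after one instance; unchanged if the prediction was right."""
    if predicted == actual:
        return list(w)
    # Class 1 mistaken for class 2: promote (alpha). Class 2 mistaken for class 1: demote (beta).
    factor = alpha if actual == 1 else beta
    # Assumption: only attributes that are 1 in this instance are touched.
    return [wi * factor if ai == 1 else wi for wi, ai in zip(w, a)]


def update_balanced(w_pos: Sequence[float], w_neg: Sequence[float], a: Sequence[int],
                    predicted: int, actual: int, alpha: float = 2.0,
                    beta: float = 0.5) -> Tuple[List[float], List[float]]:
    """Balanced variant: w+ and w- move in opposite directions."""
    if predicted == actual:
        return list(w_pos), list(w_neg)
    pos_factor, neg_factor = (alpha, beta) if actual == 1 else (beta, alpha)
    new_pos = [p * pos_factor if ai == 1 else p for p, ai in zip(w_pos, a)]
    new_neg = [n * neg_factor if ai == 1 else n for n, ai in zip(w_neg, a)]
    return new_pos, new_neg


# Class 2 mistaken for class 1: active weights are halved.
print(update_unbalanced([2.0, 2.0, 2.0], [1, 0, 1], predicted=1, actual=2))
# Class 1 mistaken for class 2: w+ doubles and w- halves on active attributes.
print(update_balanced([2.0, 2.0, 2.0], [1.0, 1.0, 1.0], [1, 0, 1], predicted=2, actual=1))
```
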

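Options:
 -L  Use the balanced version. Default: false
 -I  Number of passes over the training set when learning the weights. Default: 1
 -A  Weight promotion coefficient alpha; must be > 1. Default: 2.0
 -B  Weight demotion coefficient beta; must be < 1. Default: 0.5
 -H  Prediction threshold theta. Default: -1, which means the number of attributes
 -W  Initial weight; must be > 0. Default: 2.0
 -S  Random seed; affects the order in which the training instances are presented. Default: 1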
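
To see how these options fit together, here is a rough Python sketch of an unbalanced training loop. The keyword arguments mirror -I, -A, -B, -H, -W and -S with the defaults listed above. The function name `train_winnow`, the single shuffle before the first pass, and the "active attributes only" update are assumptions made for this example, so the weights it produces need not match Weka's output.

```python
import random
from typing import List, Sequence, Tuple


def train_winnow(instances: Sequence[Sequence[int]], labels: Sequence[int],
                 iterations: int = 1, alpha: float = 2.0, beta: float = 0.5,
                 threshold: float = -1.0, initial_weight: float = 2.0,
                 seed: int = 1) -> Tuple[List[float], int]:
    """Train unbalanced Winnow on 0/1 attribute vectors with labels 1 or 2."""
    n_attrs = len(instances[0])
    # -H: the default of -1 stands for "use the number of attributes".
    theta = float(n_attrs) if threshold == -1.0 else threshold
    weights = [initial_weight] * n_attrs            # -W
    order = list(range(len(instances)))
    random.Random(seed).shuffle(order)              # -S (assumed: shuffled once)
    mistakes = 0
    for _ in range(iterations):                     # -I
        for idx in order:
            a, actual = instances[idx], labels[idx]
            total = sum(w * x for w, x in zip(weights, a))
            predicted = 1 if total > theta else 2
            if predicted != actual:
                mistakes += 1
                factor = alpha if actual == 1 else beta   # -A promotes, -B demotes
                weights = [w * factor if x == 1 else w for w, x in zip(weights, a)]
    return weights, mistakes


# One class-2 instance, two passes: the first pass demotes the two active
# weights (sum 4 > theta 3), the second pass already predicts class 2.
print(train_winnow([[1, 1, 0]], [2], iterations=2))
```
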


> java  weka.classifiers.functions.Winnow  -t data\weather.nominal.arff


Winnow

Attribute weights

w0 8.0
w1 1.0
w2 2.0
w3 4.0
w4 2.0
w5 2.0
w6 1.0
w7 1.0

Cumulated mistake count: 7


Time taken to build model: 0 seconds
Time taken to test model on training data: 0 seconds

=== Error on training data ===

Correctly Classified Instances          10               71.4286 %
Incorrectly Classified Instances         4               28.5714 %
Kappa statistic                          0.3778
Mean absolute error                      0.2857
Root mean squared error                  0.5345
Relative absolute error                 61.5385 %
Root relative squared error            111.4773 %
Total Number of Instances               14     


=== Confusion Matrix ===

 a b   <-- classified as
 7 2 | a = yes
 2 3 | b = no



=== Stratified cross-validation ===

Correctly Classified Instances           7               50      %
Incorrectly Classified Instances         7               50      %
Kappa statistic                         -0.2564
Mean absolute error                      0.5   
Root mean squared error                  0.7071
Relative absolute error                105      %
Root relative squared error            143.3236 %
Total Number of Instances               14     


=== Confusion Matrix ===

 a b   <-- classified as
 7 2 | a = yes
 5 0 | b = no
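
The summary figures above can be re-derived from the two confusion matrices. The sketch below computes accuracy and Cohen's kappa in the usual way. For the mean absolute error and root mean squared error it assumes that the classifier outputs hard 0/1 class predictions, so that every mistake counts as an error of 1; that assumption is mine, although it does reproduce the numbers in the listing. The function name `summarize` is made up for the example.

```python
import math
from typing import Dict, List


def summarize(matrix: List[List[int]]) -> Dict[str, float]:
    """Accuracy, kappa, MAE and RMSE from a confusion matrix (rows = actual class)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    observed = correct / total
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    # Assumption: hard predictions, so MAE equals the error rate
    # and RMSE equals its square root.
    error_rate = 1 - observed
    return {"accuracy": observed, "kappa": kappa,
            "mae": error_rate, "rmse": math.sqrt(error_rate)}


train = summarize([[7, 2], [2, 3]])   # error on training data
cv = summarize([[7, 2], [5, 0]])      # stratified cross-validation
print({k: round(v, 4) for k, v in train.items()})
print({k: round(v, 4) for k, v in cv.items()})
```
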

The weather.nominal.arff dataset, shown below, has 14 instances; 4 nominal attributes are used to predict the nominal class attribute play.

  outlook   temperature  humidity  windy  play
  sunny     hot          high      FALSE  no
  sunny     hot          high      TRUE   no
  overcast  hot          high      FALSE  yes
  rainy     mild         high      FALSE  yes
  rainy     cool         normal    FALSE  yes
  rainy     cool         normal    TRUE   no
  overcast  cool         normal    TRUE   yes
  sunny     mild         high      FALSE  no
  sunny     cool         normal    FALSE  yes
  rainy     mild         normal    FALSE  yes
  sunny     mild         normal    TRUE   yes
  overcast  mild         high      TRUE   yes
  overcast  hot          normal    FALSE  yes
  rainy     mild         high      TRUE   no
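
Since Winnow works on binary attributes, the nominal columns have to be binarized first. The sketch below shows one way to do that for this dataset. It assumes that a two-valued attribute stays a single 0/1 column while an attribute with more values gets one column per value; the column order and which value maps to 1 are arbitrary choices for the example and may differ from what Weka does internally. Under these assumptions the 4 nominal attributes become 8 binary ones, the same count as the weights w0 to w7 in the output above.

```python
from typing import List, Sequence, Tuple

ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def binarize(rows: Sequence[Sequence[str]]) -> Tuple[List[str], List[List[int]]]:
    """Turn nominal values into 0/1 columns; the last field (the class) is skipped."""
    # Distinct values per attribute, in order of first appearance.
    seen: List[List[str]] = [[] for _ in ATTRIBUTES]
    for row in rows:
        for i, value in enumerate(row[:-1]):
            if value not in seen[i]:
                seen[i].append(value)
    # Two-valued attributes keep one column (1 = second value seen);
    # the others get one column per value.
    columns: List[Tuple[int, str]] = []
    for i, values in enumerate(seen):
        if len(values) == 2:
            columns.append((i, values[1]))
        else:
            columns.extend((i, v) for v in values)
    names = [f"{ATTRIBUTES[i]}={v}" for i, v in columns]
    encoded = [[1 if row[i] == v else 0 for i, v in columns] for row in rows]
    return names, encoded


names, encoded = binarize(ROWS)
print(len(names))     # 8 binary attributes
print(names)
print(encoded[0])     # first instance: sunny, hot, high, FALSE
```
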



