SMO
Package
weka.classifiers.functions
Synopsis
Implements John Platt's sequential minimal optimization algorithm for training a support vector classifier.
This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes by default. In that case, the coefficients in the output are based on the normalized data, not the original data, which matters when interpreting the classifier.
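To illustrate why coefficients reported on normalized data differ from coefficients on the original data, here is a minimal sketch. It assumes that normalization means min-max scaling of each attribute to [0, 1]; the class `NormalizationSketch` and its methods are hypothetical helpers written for this example and are not part of Weka.

```java
// Hypothetical helper, not part of Weka: illustrates min-max normalization
// and how a weight on a normalized attribute maps back to the original scale.
final class NormalizationSketch {

    // Scales each value of one attribute into [0, 1] using the column's min and max.
    static double[] normalize(double[] column) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : column) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            // A constant column has zero range; map it to 0 to avoid dividing by zero.
            out[i] = range == 0.0 ? 0.0 : (column[i] - min) / range;
        }
        return out;
    }

    // Converts a weight that applies to the normalized attribute x' = (x - min) / (max - min)
    // into the equivalent weight on the original attribute x: w * x' = (w / (max - min)) * (x - min).
    // Assumes max > min.
    static double weightOnOriginalScale(double normalizedWeight, double min, double max) {
        return normalizedWeight / (max - min);
    }
}
```

For an attribute ranging from 10 to 30, a weight of 2.0 on the normalized attribute corresponds to a weight of 0.1 per original unit, so the same model looks quite different depending on which scale the coefficients refer to.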
Multi-class problems are solved using pairwise classification (1-vs-1). If logistic models are built, the pairwise outputs are combined using pairwise coupling according to Hastie and Tibshirani (1998).
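As a rough illustration of pairwise (1-vs-1) classification, the sketch below counts how many binary classifiers are needed for k classes and combines their decisions by majority vote. Majority voting with ties going to the lowest class index is an assumption made for this example, not a statement of how SMO resolves ties, and the class `PairwiseVotingSketch` is hypothetical rather than part of Weka.

```java
// Hypothetical helper, not part of Weka: combines 1-vs-1 decisions by majority vote.
final class PairwiseVotingSketch {

    // Number of 1-vs-1 classifiers needed for k classes: k * (k - 1) / 2.
    static int numPairs(int numClasses) {
        return numClasses * (numClasses - 1) / 2;
    }

    // winners[i][j] (only entries with i < j are read) holds the class index,
    // either i or j, chosen by the binary classifier trained on classes i and j.
    // Returns the class with the most votes; ties go to the lowest class index.
    static int predict(int[][] winners) {
        int k = winners.length;
        int[] votes = new int[k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                votes[winners[i][j]]++;
            }
        }
        int best = 0;
        for (int c = 1; c < k; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }
}
```

With three classes there are three pairwise classifiers. If the (0, 1) and (1, 2) classifiers both choose class 1 and the (0, 2) classifier chooses class 2, class 1 wins with two votes.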
To obtain proper probability estimates, enable the buildLogisticModels option, which fits logistic regression models to the outputs of the support vector machine. In the multi-class case, the predicted probabilities are coupled using Hastie and Tibshirani's pairwise coupling method.
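The idea of fitting a logistic model to the SVM outputs can be sketched as follows. This example only shows how an already fitted one-dimensional logistic model turns a raw SVM output into a probability; the `slope` and `intercept` parameters are hypothetical stand-ins for fitted values, and the class `LogisticOutputSketch` is not part of Weka.

```java
// Hypothetical helper, not part of Weka: applies a fitted logistic model
// to a raw SVM output to obtain a probability for the positive class.
final class LogisticOutputSketch {

    // Computes p = 1 / (1 + exp(-(slope * f + intercept))), where f is the raw SVM output.
    // With a positive slope, larger outputs give probabilities closer to 1.
    static double probability(double svmOutput, double slope, double intercept) {
        return 1.0 / (1.0 + Math.exp(-(slope * svmOutput + intercept)));
    }
}
```

With a slope of 1 and an intercept of 0, an instance lying exactly on the decision boundary (output 0) receives probability 0.5, and instances further from the boundary receive probabilities closer to 0 or 1.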
Note: for improved speed, normalization should be turned off when operating on SparseInstances.
For more information on the SMO algorithm, see
J. Platt: Fast Training of Support Vector Machines using Sequential Minimal Optimization. In B. Schoelkopf and C. Burges and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, 1998.
S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy (2001). Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation. 13(3):637-649.
Trevor Hastie, Robert Tibshirani: Classification by Pairwise Coupling. In: Advances in Neural Information Processing Systems, 1998.
Options
The table below describes the options available for SMO.
Option | Description |
---|---|
buildLogisticModels | Whether to fit logistic models to the outputs (for proper probability estimates). |
c | The complexity parameter C. |
checksTurnedOff | Turns time-consuming checks off - use with caution. |
debug | If set to true, classifier may output additional info to the console. |
epsilon | The epsilon for round-off error (shouldn't be changed). |
filterType | Determines how/if the data will be transformed. |
kernel | The kernel to use. |
numFolds | The number of folds for cross-validation used to generate training data for logistic models (-1 means use training data). |
randomSeed | Random number seed for the cross-validation. |
toleranceParameter | The tolerance parameter (shouldn't be changed). |
Capabilities
The table below describes the capabilities of SMO.
Capability | Supported |
---|---|
Class | Binary class, Missing class values, Nominal class |
Attributes | Empty nominal attributes, Unary attributes, Nominal attributes, Numeric attributes, Binary attributes, Missing values |
Min # of instances | 1 |