SMO

Package

weka.classifiers.functions

Synopsis

Implements John Platt's sequential minimal optimization algorithm for training a support vector classifier.

This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes by default. In that case, the coefficients in the output are based on the normalized data, not the original data, which is important for interpreting the classifier.
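To make the point about interpreting coefficients concrete, here is a minimal sketch of how weights learned on normalized attributes relate to the raw attribute scale. It assumes min-max scaling to [0, 1] and a linear model; the class `NormalizedCoefficients` and its methods are hypothetical illustrations, not part of Weka.

```java
// Hypothetical helper, not part of Weka. Assumes min-max scaling to [0, 1].
class NormalizedCoefficients {

    // Linear decision value: sum_j w[j] * x[j] + b.
    static double output(double[] w, double b, double[] x) {
        double sum = b;
        for (int j = 0; j < w.length; j++) {
            sum += w[j] * x[j];
        }
        return sum;
    }

    // Min-max normalization of one instance; constant attributes map to 0.
    static double[] normalize(double[] x, double[] min, double[] max) {
        double[] out = new double[x.length];
        for (int j = 0; j < x.length; j++) {
            double range = max[j] - min[j];
            out[j] = range == 0.0 ? 0.0 : (x[j] - min[j]) / range;
        }
        return out;
    }

    // Rewrites weights learned on normalized data for use on raw data:
    //   w'_j * (x_j - min_j) / range_j  =  (w'_j / range_j) * x_j - w'_j * min_j / range_j
    // Returns the raw-scale weights followed by the raw-scale bias.
    static double[] toOriginalScale(double[] wNorm, double bNorm, double[] min, double[] max) {
        double[] result = new double[wNorm.length + 1];
        double bias = bNorm;
        for (int j = 0; j < wNorm.length; j++) {
            double range = max[j] - min[j];
            if (range != 0.0) {
                result[j] = wNorm[j] / range;
                bias -= wNorm[j] * min[j] / range;
            }
        }
        result[wNorm.length] = bias;
        return result;
    }

    public static void main(String[] args) {
        double[] min = {10.0, 0.0};
        double[] max = {20.0, 4.0};
        double[] wNorm = {2.0, -1.0};   // illustrative weights on normalized attributes
        double bNorm = 0.5;
        double[] x = {15.0, 1.0};       // one raw instance

        double viaNormalized = output(wNorm, bNorm, normalize(x, min, max));
        double[] raw = toOriginalScale(wNorm, bNorm, min, max);
        double viaRaw = output(new double[] {raw[0], raw[1]}, raw[2], x);
        System.out.println(viaNormalized + " vs " + viaRaw);
    }
}
```

Both routes give the same decision value up to rounding, but the weights themselves differ by a factor of each attribute's range, so the printed coefficients should not be read as raw-scale effects.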

Multi-class problems are solved using pairwise classification (1-vs-1). If logistic models are built, the pairwise predictions are combined using pairwise coupling according to Hastie and Tibshirani, 1998.
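As an illustration of the 1-vs-1 scheme, the sketch below counts votes from the k(k-1)/2 pairwise classifiers and returns the class with the most votes. The class `PairwiseVoting`, its input layout, and the tie-breaking rule (lowest class index wins) are assumptions made for this example and are not taken from Weka's source.

```java
// Hypothetical sketch, not Weka's code: 1-vs-1 voting over pairwise decisions.
class PairwiseVoting {

    // Number of binary classifiers needed for k classes.
    static int numPairs(int k) {
        return k * (k - 1) / 2;
    }

    // pairwiseWinner[i][j], for i < j, holds the class (either i or j) chosen
    // by the binary classifier trained on classes i and j. Other cells are ignored.
    static int predict(int numClasses, int[][] pairwiseWinner) {
        int[] votes = new int[numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = i + 1; j < numClasses; j++) {
                votes[pairwiseWinner[i][j]]++;
            }
        }
        // Assumed tie-break: the lowest class index among the top-voted classes.
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] winners = new int[3][3];
        winners[0][1] = 1;  // classifier (0 vs 1) picked class 1
        winners[0][2] = 2;  // classifier (0 vs 2) picked class 2
        winners[1][2] = 1;  // classifier (1 vs 2) picked class 1
        System.out.println("pairs: " + numPairs(3) + ", predicted class: " + predict(3, winners));
    }
}
```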

To obtain proper probability estimates, use the option that fits logistic regression models to the outputs of the support vector machine. In the multi-class case the predicted probabilities are coupled using Hastie and Tibshirani's pairwise coupling method.
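The idea of fitting a logistic model to the SVM outputs can be sketched as a one-dimensional logistic regression from the raw decision value f to a probability, p = 1 / (1 + exp(-(a*f + b))). The following is a simplified illustration using plain batch gradient ascent; the class `LogisticCalibration`, the learning rate, and the iteration count are hypothetical choices, and Weka's own fitting procedure may differ.

```java
// Hypothetical sketch, not Weka's code: maps SVM decision values to probabilities
// by fitting p(y = 1 | f) = sigmoid(a * f + b) with batch gradient ascent.
class LogisticCalibration {

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // scores: SVM outputs; labels: 0 or 1. Returns {a, b}.
    static double[] fit(double[] scores, int[] labels, int iterations, double rate) {
        double a = 0.0;
        double b = 0.0;
        int n = scores.length;
        for (int it = 0; it < iterations; it++) {
            double gradA = 0.0;
            double gradB = 0.0;
            for (int i = 0; i < n; i++) {
                // Gradient of the log-likelihood for one example.
                double err = labels[i] - sigmoid(a * scores[i] + b);
                gradA += err * scores[i];
                gradB += err;
            }
            a += rate * gradA / n;
            b += rate * gradB / n;
        }
        return new double[] {a, b};
    }

    static double probability(double score, double[] model) {
        return sigmoid(model[0] * score + model[1]);
    }

    public static void main(String[] args) {
        // Illustrative decision values with one overlapping pair (0.3 / 0.4).
        double[] scores = {-2.0, -1.0, -0.5, 0.3, 0.4, 1.0, 2.0};
        int[] labels = {0, 0, 0, 1, 0, 1, 1};
        double[] model = fit(scores, labels, 500, 0.5);
        System.out.println("P(y=1 | f=2.0)  = " + probability(2.0, model));
        System.out.println("P(y=1 | f=-2.0) = " + probability(-2.0, model));
    }
}
```

In the multi-class case, one such model per class pair would produce pairwise probabilities, which are then combined by pairwise coupling; that step is not shown here.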

Note: for improved speed, normalization should be turned off when operating on SparseInstances.

For more information on the SMO algorithm, see

J. Platt: Fast Training of Support Vector Machines using Sequential Minimal Optimization. In B. Schoelkopf and C. Burges and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, 1998.

S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy (2001). Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation. 13(3):637-649.

Trevor Hastie, Robert Tibshirani: Classification by Pairwise Coupling. In: Advances in Neural Information Processing Systems, 1998.

Options

The table below describes the options available for SMO.

Option               Description
buildLogisticModels  Whether to fit logistic models to the outputs (for proper probability estimates).
c                    The complexity parameter C.
checksTurnedOff      Turns time-consuming checks off - use with caution.
debug                If set to true, classifier may output additional info to the console.
epsilon              The epsilon for round-off error (shouldn't be changed).
filterType           Determines how/if the data will be transformed.
kernel               The kernel to use.
numFolds             The number of folds for cross-validation used to generate training data for logistic models (-1 means use training data).
randomSeed           Random number seed for the cross-validation.
toleranceParameter   The tolerance parameter (shouldn't be changed).
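Several of these options are commonly set through command-line style flags. The sketch below assembles such an option array in plain Java. The flag names (-C for c, -N for filterType, -M for buildLogisticModels, -V for numFolds, -W for randomSeed) and the filterType codes are assumptions taken from the SMO Javadoc rather than from this page, so check them against the Weka version in use. The class `SmoOptionsSketch` is hypothetical and does not call Weka.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: builds a command-line style option array.
// Flag names are assumptions based on the SMO Javadoc; verify for your Weka version.
class SmoOptionsSketch {

    static String[] build(double c, int filterType, boolean buildLogisticModels,
                          int numFolds, int randomSeed) {
        List<String> opts = new ArrayList<>();
        opts.add("-C");
        opts.add(Double.toString(c));            // complexity parameter C
        opts.add("-N");
        opts.add(Integer.toString(filterType));  // assumed codes: 0 normalize, 1 standardize, 2 neither
        if (buildLogisticModels) {
            opts.add("-M");                      // fit logistic models to the outputs
        }
        opts.add("-V");
        opts.add(Integer.toString(numFolds));    // -1 means use the training data
        opts.add("-W");
        opts.add(Integer.toString(randomSeed));  // seed for the cross-validation
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] opts = build(1.0, 0, true, -1, 1);
        System.out.println(String.join(" ", opts));
    }
}
```

With Weka on the classpath, an array like this would typically be handed to the classifier's `setOptions(String[])` method or appended to a command line. Turning normalization off for sparse data, as noted earlier, would correspond to the "neither" filterType code under the same assumptions.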

Capabilities

The table below describes the capabilities of SMO.

Capability          Supported
Class               Binary class, Missing class values, Nominal class
Attributes          Empty nominal attributes, Unary attributes, Nominal attributes, Numeric attributes, Binary attributes, Missing values
Min # of instances  1