GridSearch
Package
weka.classifiers.meta
Synopsis
Performs a grid search over parameter pairs for a classifier (Y-axis; the default is LinearRegression with the "Ridge" parameter) and the PLSFilter (X-axis, "# of Components"), and uses the best pair found for the actual prediction.
The initial grid is evaluated with 2-fold cross-validation (CV) to score each parameter pair under the selected evaluation criterion (e.g., accuracy). The best point in the grid is then taken and a 10-fold CV is performed on the adjacent parameter pairs. If a better pair is found, it becomes the new center and another 10-fold CV is performed (a form of hill climbing). This process is repeated until no better pair is found or the best pair is on the border of the grid.
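The two-stage search described above can be sketched roughly as follows. This is a minimal illustration, not Weka's implementation: the class `GridHillClimbSketch`, the `Score` interface and the `search` method are hypothetical names, the two score functions stand in for the 2-fold and 10-fold CV runs, and "adjacent" is assumed to mean the eight surrounding grid points.

```java
/** Hypothetical sketch of the two-stage search; not Weka's actual code. */
final class GridHillClimbSketch {

    /** Scores the grid point (x, y); higher is better. Stands in for a CV run. */
    interface Score {
        double at(int x, int y);
    }

    /**
     * Stage 1: score every point with the cheap estimate (2-fold CV in the text).
     * Stage 2: climb from the best point using the costlier estimate (10-fold CV),
     * stopping when no neighbour is better or the border is reached.
     * Returns the final grid indices as {x, y}.
     */
    static int[] search(int width, int height, Score coarse, Score fine) {
        int x = 0, y = 0;
        double bestCoarse = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < width; i++) {
            for (int j = 0; j < height; j++) {
                double s = coarse.at(i, j);
                if (s > bestCoarse) { bestCoarse = s; x = i; y = j; }
            }
        }
        double best = fine.at(x, y);
        // Only climb while the current center is strictly inside the grid.
        while (x > 0 && y > 0 && x < width - 1 && y < height - 1) {
            int nextX = x, nextY = y;
            // Assumption: "adjacent" means the eight surrounding points.
            for (int dx = -1; dx <= 1; dx++) {
                for (int dy = -1; dy <= 1; dy++) {
                    if (dx == 0 && dy == 0) continue;
                    double s = fine.at(x + dx, y + dy);
                    if (s > best) { best = s; nextX = x + dx; nextY = y + dy; }
                }
            }
            if (nextX == x && nextY == y) break; // no better neighbour
            x = nextX;
            y = nextY;
        }
        return new int[] {x, y};
    }

    public static void main(String[] args) {
        // Toy score surface with a single peak; real use would run CV instead.
        Score toy = (x, y) -> -((x - 3) * (x - 3) + (y - 2) * (y - 2));
        int[] best = search(7, 5, toy, toy);
        System.out.println("best grid point: " + best[0] + "," + best[1]);
    }
}
```

When the loop stops because the center sits on the border, the result may not be a true optimum, which is the case the grid extension options are meant to address.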
In case the best pair is on the border, one can let GridSearch automatically extend the grid and continue the search. Check out the properties 'gridIsExtendable' (option '-extend-grid') and 'maxGridExtensions' (option '-max-grid-extensions <num>').
GridSearch can handle doubles, integers (values are just cast to int) and booleans (0 is false, otherwise true). float, char and long are supported as well.
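The type handling can be pictured with a small conversion helper. This is only a sketch: `GridValueCastSketch` and `convert` are hypothetical names, the int and boolean rules follow the description above, and the plain casts used for float, char and long are an assumption, since the text only says those types are supported.

```java
/** Hypothetical sketch of mapping a numeric grid value to a property type. */
final class GridValueCastSketch {

    /** Converts a grid value to the given target type (primitive or wrapper). */
    static Object convert(double value, Class<?> type) {
        if (type == double.class || type == Double.class) {
            return value;
        }
        if (type == int.class || type == Integer.class) {
            return (int) value;          // values are just cast to int
        }
        if (type == boolean.class || type == Boolean.class) {
            return value != 0;           // 0 is false, otherwise true
        }
        // Assumption: the remaining supported types use plain casts.
        if (type == float.class || type == Float.class) {
            return (float) value;
        }
        if (type == long.class || type == Long.class) {
            return (long) value;
        }
        if (type == char.class || type == Character.class) {
            return (char) value;
        }
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    public static void main(String[] args) {
        System.out.println(convert(2.7, int.class));
        System.out.println(convert(0.0, boolean.class));
    }
}
```

Note that a cast to int truncates toward zero rather than rounding, so a step size that produces fractional values can yield repeated integer settings.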
The best filter/classifier setup can be accessed after the buildClassifier call via the getBestFilter/getBestClassifier methods.
Note on the implementation: after the data has been passed through the filter, a default NumericCleaner filter is applied to the data in order to avoid numbers that become too small and might produce NaNs in other schemes.
Available in Weka 3.6.x - 3.7.1. Available via the package management system for Weka >= 3.7.2 (gridSearch).
Options
The table below describes the options available for GridSearch.
Option | Description |
---|---|
XBase | The base of X. |
XExpression | The expression for the X value (parameters: BASE, FROM, TO, STEP, I). |
XMax | The maximum of X. |
XMin | The minimum of X. |
XProperty | The X property to test (normally the filter). |
XStep | The step size of X. |
YBase | The base of Y. |
YExpression | The expression for the Y value (parameters: BASE, FROM, TO, STEP, I). |
YMax | The maximum of Y. |
YMin | The minimum of Y. |
YProperty | The Y property to test (normally the classifier). |
YStep | The step size of Y. |
classifier | The base classifier to be used. |
debug | If set to true, classifier may output additional info to the console. |
evaluation | Sets the criterion for evaluating the classifier performance and choosing the best one. |
filter | The filter to be used (only used for setup). |
gridIsExtendable | Whether the grid can be extended. |
logFile | The log file to log the messages to. |
maxGridExtensions | The maximum number of grid extensions, -1 for unlimited. |
sampleSizePercent | The sample size (in percent) to use in the initial grid search. |
seed | The random number seed to be used. |
traversal | Sets type of traversal of the grid, either by rows or columns. |
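To illustrate how the min, max, step, base and expression options could combine into the values tested along one axis, here is a rough sketch. It rests on several assumptions that the table does not confirm: that I runs from the minimum to the maximum in increments of the step, and that FROM and TO correspond to the minimum and maximum. The names `GridAxisSketch`, `Expression` and `values` are hypothetical, and a Java lambda stands in for the expression string that GridSearch itself would parse.

```java
/** Hypothetical sketch of deriving axis values from min/max/step/base. */
final class GridAxisSketch {

    /** Stand-in for an axis expression over BASE, FROM, TO, STEP and I. */
    interface Expression {
        double eval(double base, double from, double to, double step, double i);
    }

    /** Evaluates the expression once per grid index I in [min, max]. */
    static double[] values(double min, double max, double step, double base,
                           Expression expr) {
        if (step <= 0 || max < min) {
            throw new IllegalArgumentException("need step > 0 and max >= min");
        }
        // Small epsilon guards against floating-point error in the division.
        int n = (int) Math.floor((max - min) / step + 1e-9) + 1;
        double[] out = new double[n];
        for (int k = 0; k < n; k++) {
            double i = min + k * step; // assumed meaning of I
            out[k] = expr.eval(base, min, max, step, i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Exponential axis: BASE raised to the power I.
        double[] ridge = values(-2, 2, 1, 10, (b, f, t, s, i) -> Math.pow(b, i));
        for (double v : ridge) {
            System.out.println(v);
        }
    }
}
```

An exponential expression of this kind is a common way to sweep a regularization parameter across several orders of magnitude, while a linear expression that simply returns I suits integer counts.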
Capabilities
The table below describes the capabilities of GridSearch.
Capability | Supported |
---|---|
Class | Date class, Numeric class |
Attributes | Missing values, Numeric attributes, Date attributes |
Min # of instances | 1 |