...
weka.classifiers.functions
Synopsis
Class for building and using linear regression for prediction. Uses the Akaike criterion for model selection, and is able to deal with weighted instances.
Options
The table below describes the options available for LinearRegression.
...
Option
...
Description
...
attributeSelectionMethod
...
Set the method used to select attributes for use in the linear regression. Available methods are: no attribute selection, attribute selection using M5's method (step through the attributes removing the one with the smallest standardised coefficient until no improvement is observed in the estimate of the error given by the Akaike information criterion), and a greedy selection using the Akaike information metric.
...
debug
...
a multinomial logistic regression model with a ridge estimator.
There are some modifications, however, compared to the paper of le Cessie and van Houwelingen (1992):
If there are k classes for n instances with m attributes, the parameter matrix B to be calculated will be an m*(k-1) matrix.
The probability for class j with the exception of the last class is
Pj(Xi) = exp(Xi*Bj)/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
The last class has probability
1-(sum[j=1..(k-1)]Pj(Xi))
= 1/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
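The two probability formulas above can be sketched in a few lines of plain Java. This is only an illustration of the arithmetic, not Weka's own code: the class name `ClassProbabilitiesSketch`, the method name, and the array layout (`b` stored as m rows by k-1 columns) are assumptions made for the example, and no safeguard against overflow in `exp` is included.

```java
import java.util.Arrays;

final class ClassProbabilitiesSketch {
    /**
     * x: attribute values of one instance (length m).
     * b: parameter matrix with m rows and (k-1) columns (assumed layout).
     * Returns k probabilities; the last entry belongs to the last class.
     */
    static double[] classProbabilities(double[] x, double[][] b) {
        int kMinus1 = b[0].length;
        double[] p = new double[kMinus1 + 1];
        double denom = 1.0; // the "+1" in the denominator
        for (int j = 0; j < kMinus1; j++) {
            double dot = 0.0; // Xi*Bj
            for (int a = 0; a < x.length; a++) {
                dot += x[a] * b[a][j];
            }
            p[j] = Math.exp(dot);
            denom += p[j];
        }
        for (int j = 0; j < kMinus1; j++) {
            p[j] /= denom; // exp(Xi*Bj) / (sum + 1)
        }
        p[kMinus1] = 1.0 / denom; // last class: 1 / (sum + 1)
        return p;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[][] b = {{0.5, -0.5}, {0.25, 0.0}};
        System.out.println(Arrays.toString(classProbabilities(x, b)));
    }
}
```

With all entries of B set to zero, every class receives the same probability 1/k, which is a quick sanity check on the formulas.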
The (negative) multinomial log-likelihood is thus:
L = -sum[i=1..n]{
sum[j=1..(k-1)](Yij * ln(Pj(Xi)))
+(1 - (sum[j=1..(k-1)]Yij))
* ln(1 - sum[j=1..(k-1)]Pj(Xi))
} + ridge * (B^2)
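As a rough sketch of how this objective could be evaluated, the Java below computes L for a small data set. It rests on two assumptions that the text does not spell out: Yij is treated as a 0/1 indicator of the true class (so each instance contributes the log-probability of its own class), and `ridge * (B^2)` is read as the ridge value times the sum of squared entries of B. The class and method names are hypothetical.

```java
final class RidgeNllSketch {
    /**
     * x: n instances by m attributes.
     * y: class index per instance in [0, k-1]; index k-1 is the last class.
     * b: m by (k-1) parameter matrix (assumed layout).
     */
    static double negLogLikelihood(double[][] x, int[] y, double[][] b, double ridge) {
        int kMinus1 = b[0].length;
        double nll = 0.0;
        for (int i = 0; i < x.length; i++) {
            double[] score = new double[kMinus1]; // Xi*Bj for each j
            double denom = 1.0;
            for (int j = 0; j < kMinus1; j++) {
                for (int a = 0; a < x[i].length; a++) {
                    score[j] += x[i][a] * b[a][j];
                }
                denom += Math.exp(score[j]);
            }
            // ln Pj(Xi) = Xi*Bj - ln(denom); for the last class it is -ln(denom)
            double logP = (y[i] < kMinus1 ? score[y[i]] : 0.0) - Math.log(denom);
            nll -= logP;
        }
        double penalty = 0.0; // assumed reading of B^2: sum of squared entries
        for (double[] row : b) {
            for (double v : row) {
                penalty += v * v;
            }
        }
        return nll + ridge * penalty;
    }
}
```

With B set to zero and k = 3, each instance contributes ln(3), and the ridge term vanishes.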
To find the matrix B that minimises L, a quasi-Newton method is used to search for the optimal values of the m*(k-1) variables. Note that before the optimization procedure is run, we 'squeeze' the matrix B into a vector of length m*(k-1). For details of the optimization procedure, see the weka.core.Optimization class.
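The 'squeezing' step can be pictured as copying the matrix into a flat array that a general-purpose optimizer can work on, then restoring the matrix shape afterwards. The sketch below assumes row-major ordering; the ordering actually used by the implementation is not stated here, and the names `FlattenSketch`, `flatten`, and `unflatten` are hypothetical.

```java
final class FlattenSketch {
    /** Copies an m by c matrix into a vector of length m*c (row-major, assumed). */
    static double[] flatten(double[][] b) {
        int m = b.length;
        int c = b[0].length;
        double[] v = new double[m * c];
        for (int a = 0; a < m; a++) {
            System.arraycopy(b[a], 0, v, a * c, c);
        }
        return v;
    }

    /** Inverse of flatten: rebuilds the m by c matrix from the vector. */
    static double[][] unflatten(double[] v, int m, int c) {
        double[][] b = new double[m][c];
        for (int a = 0; a < m; a++) {
            System.arraycopy(v, a * c, b[a], 0, c);
        }
        return b;
    }
}
```

Any fixed ordering works, as long as the objective function and its gradient use the same one when reading the vector back.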
Although the original logistic regression does not deal with instance weights, we modify the algorithm slightly to handle them.
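The text does not say how the weights enter the algorithm. One common approach, shown here purely as an assumption, is to multiply each instance's log-likelihood contribution by its weight. The class and method names are hypothetical, and `logP[i]` stands for the log-probability of the true class of instance i.

```java
final class WeightedNllSketch {
    /**
     * logP[i]: ln P(true class of instance i).
     * w[i]: weight of instance i (assumed to scale its contribution).
     */
    static double weightedNegLogLikelihood(double[] logP, double[] w) {
        if (logP.length != w.length) {
            throw new IllegalArgumentException("logP and w must have the same length");
        }
        double nll = 0.0;
        for (int i = 0; i < logP.length; i++) {
            nll -= w[i] * logP[i];
        }
        return nll;
    }
}
```

Under this reading, an instance with weight 2 has the same effect as the same instance appearing twice with weight 1.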
For more information see:
le Cessie, S., van Houwelingen, J.C. (1992). Ridge Estimators in Logistic Regression. Applied Statistics. 41(1):191-201.
Note: Missing values are replaced using a ReplaceMissingValuesFilter, and nominal attributes are transformed into numeric attributes using a NominalToBinaryFilter.
Options
The table below describes the options available for Logistic.
Option | Description |
---|---|
debug | Output debug information to the console. |
eliminateColinearAttributes | Eliminate colinear attributes. |
maxIts | Maximum number of iterations to perform. |
ridge | Set the ridge value in the log-likelihood. |
Capabilities
The table below describes the capabilities of Logistic.
Capability | Supported |
---|---|
Class | Missing class values, Nominal class, Binary class |
Attributes | Empty nominal attributes, Unary attributes, Nominal attributes, Binary attributes, Date attributes, Numeric attributes, Missing values |
Min # of instances | 1 |