Page Comparison

...

weka.classifiers.functions

Synopsis

Class for building and using linear regression for prediction. Uses the Akaike criterion for model selection, and is able to deal with weighted instances.

Options

The table below describes the options available for LinearRegression.

...

Option

...

Description

...

attributeSelectionMethod

...

Set the method used to select attributes for use in the linear regression. Available methods are: no attribute selection, attribute selection using M5's method (step through the attributes removing the one with the smallest standardised coefficient until no improvement is observed in the estimate of the error given by the Akaike information criterion), and a greedy selection using the Akaike information metric.

...

debug

...

a multinomial logistic regression model with a ridge estimator.

There are some modifications, however, compared to the paper of le Cessie and van Houwelingen (1992):

If there are k classes for n instances with m attributes, the parameter matrix B to be calculated will be an m*(k-1) matrix.

The probability for class j with the exception of the last class is

Pj(Xi) = exp(Xi*Bj)/((sum[j=1..(k-1)]exp(Xi*Bj))+1)

The last class has probability

1-(sum[j=1..(k-1)]Pj(Xi))
= 1/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
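
The two probability formulas above can be sketched in a few lines of Python. This is only an illustration, not Weka's implementation: the function name class_probabilities and the list-of-rows layout for B are assumptions made for the example.

```python
import math

def class_probabilities(x, B):
    """Return [P_1, ..., P_k] for one instance (hypothetical helper).

    x: list of m attribute values (one instance Xi).
    B: m x (k-1) parameter matrix, stored as a list of m rows.
    """
    m = len(x)
    k_minus_1 = len(B[0])
    # Linear scores Xi*Bj for the first k-1 classes.
    scores = [sum(x[a] * B[a][j] for a in range(m)) for j in range(k_minus_1)]
    # Shared denominator: sum of exp(Xi*Bj) over the first k-1 classes, plus 1.
    denom = sum(math.exp(s) for s in scores) + 1.0
    probs = [math.exp(s) / denom for s in scores]
    # The last class takes the remaining probability mass, 1/denom.
    probs.append(1.0 / denom)
    return probs
```

With B set to all zeros every class receives the same probability 1/k, which is a quick sanity check that the last class is handled consistently with the others.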

The (negative) multinomial log-likelihood is thus:

L = -sum[i=1..n]{
sum[j=1..(k-1)](Yij * ln(Pj(Xi)))
+(1 - (sum[j=1..(k-1)]Yij))
* ln(1 - sum[j=1..(k-1)]Pj(Xi))
} + ridge * (B^2)
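
As a rough sketch, the expression for L can be evaluated directly. The code below assumes that Y is an n x (k-1) indicator matrix (Yij is 1 when instance i belongs to class j) and that "B^2" means the sum of the squared entries of B; both are readings of the formula above rather than statements about Weka's code, and the function name is hypothetical.

```python
import math

def negative_log_likelihood(X, Y, B, ridge):
    """Evaluate L for data X (n x m), indicators Y (n x (k-1)), parameters B (m x (k-1))."""
    n, m, k1 = len(X), len(B), len(B[0])
    total = 0.0
    for i in range(n):
        scores = [sum(X[i][a] * B[a][j] for a in range(m)) for j in range(k1)]
        denom = sum(math.exp(s) for s in scores) + 1.0
        # First k-1 classes: Yij * ln(Pj(Xi)).
        for j in range(k1):
            total += Y[i][j] * math.log(math.exp(scores[j]) / denom)
        # Last class: (1 - sum_j Yij) * ln(1 - sum_j Pj(Xi)), where the
        # remaining probability equals 1/denom.
        total += (1.0 - sum(Y[i])) * math.log(1.0 / denom)
    # Assumption: the ridge term penalises the sum of squared entries of B.
    penalty = ridge * sum(v * v for row in B for v in row)
    return -total + penalty
```

With B equal to zero, every instance contributes ln(k) to L, so the value is n * ln(k) regardless of the labels.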

To find the matrix B that minimises L, a quasi-Newton method is used to search for the optimal values of the m*(k-1) variables. Before the optimization procedure is run, the matrix B is 'squeezed' into an m*(k-1) vector. For details of the optimization procedure, see the weka.core.Optimization class.
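
The 'squeeze' step can be illustrated as follows. The row-major ordering and the helper names squeeze and unsqueeze are assumptions made for this sketch; the source does not say in which order the entries of B are laid out.

```python
def squeeze(B):
    """Flatten an m x (k-1) matrix (list of rows) into a vector of length m*(k-1)."""
    # Assumption: row-major order, i.e. all entries of row 0 first.
    return [v for row in B for v in row]

def unsqueeze(vec, m, k_minus_1):
    """Rebuild the m x (k-1) matrix from the flat vector produced by squeeze."""
    return [vec[r * k_minus_1:(r + 1) * k_minus_1] for r in range(m)]
```

A general-purpose optimizer typically works on a flat vector of variables, so the objective would unsqueeze the vector, evaluate L on the resulting matrix, and return the value.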

Although the original logistic regression does not deal with instance weights, the algorithm is modified slightly to handle them.
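
The text does not spell out how the weights enter the algorithm. One plausible reading, sketched below purely as an assumption, is that each instance's log-likelihood term is multiplied by its weight, so that a weight of 2 has the same effect as listing the instance twice. The function name and argument layout are hypothetical.

```python
import math

def weighted_negative_log_likelihood(X, Y, w, B, ridge):
    """Weighted variant of L; w[i] is the (assumed multiplicative) weight of instance i."""
    m, k1 = len(B), len(B[0])
    total = 0.0
    for i in range(len(X)):
        scores = [sum(X[i][a] * B[a][j] for a in range(m)) for j in range(k1)]
        denom = sum(math.exp(s) for s in scores) + 1.0
        # Log-likelihood contribution of instance i (first k-1 classes, then last class).
        term = sum(Y[i][j] * math.log(math.exp(scores[j]) / denom) for j in range(k1))
        term += (1.0 - sum(Y[i])) * math.log(1.0 / denom)
        # Assumption: the weight scales the instance's contribution.
        total += w[i] * term
    # Assumption: the ridge penalty is the sum of squared entries of B.
    return -total + ridge * sum(v * v for row in B for v in row)
```

With all weights equal to 1 this reduces to the unweighted expression for L.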

For more information see:

le Cessie, S., van Houwelingen, J.C. (1992). Ridge Estimators in Logistic Regression. Applied Statistics. 41(1):191-201.

Note: Missing values are replaced using a ReplaceMissingValuesFilter, and nominal attributes are transformed into numeric attributes using a NominalToBinaryFilter.

Options

The table below describes the options available for Logistic.

Option	Description
debug	Output debug information to the console.
maxIts	Maximum number of iterations to perform.
ridge	Set the ridge value in the log-likelihood.

Capabilities

The table below describes the capabilities of Logistic.

Capability	Supported
Class	Missing class values, Nominal class, Binary class
Attributes	Nominal attributes, Binary attributes, Date attributes, Numeric attributes, Missing values, Empty nominal attributes, Unary attributes
Min # of instances	1

Version	Old Version 1	New Version 2
Changes made by	Former user	Former user
Saved on	Dec 04, 2008	Dec 04, 2008
