LMT
Package
weka.classifiers.trees
Synopsis
Classifier for building 'logistic model trees', which are classification trees with logistic regression functions at the leaves. The algorithm can deal with binary and multi-class target variables, numeric and nominal attributes, and missing values.
For more information see:
Niels Landwehr, Mark Hall, Eibe Frank (2005). Logistic Model Trees. Machine Learning. 59(1-2):161-205.
Marc Sumner, Eibe Frank, Mark Hall: Speeding up Logistic Model Tree Induction. In: 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, 675-683, 2005.
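A minimal usage sketch with the Weka Java API is shown below. The dataset path is a placeholder, and the example assumes an ARFF file whose last attribute is the class.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.LMT;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class LMTExample {
    public static void main(String[] args) throws Exception {
        // Load a dataset (path is a placeholder) and use the last attribute as the class.
        Instances data = new DataSource("iris.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // Build a logistic model tree with default settings.
        LMT lmt = new LMT();
        lmt.buildClassifier(data);

        // Estimate performance with 10-fold cross-validation.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(new LMT(), data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
    }
}
```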
Options
The table below describes the options available for LMT.
Option | Description
---|---
convertNominal | Convert all nominal attributes to binary ones before building the tree. This means that all splits in the final tree will be binary.
debug | If set to true, the classifier may output additional information to the console.
errorOnProbabilities | Minimize error on probabilities instead of misclassification error when cross-validating the number of LogitBoost iterations. When set, the number of LogitBoost iterations is chosen that minimizes the root mean squared error instead of the misclassification error.
fastRegression | Use a heuristic that avoids cross-validating the number of LogitBoost iterations at every node. When fitting the logistic regression functions at a node, LMT has to determine the number of LogitBoost iterations to run. Originally, this number was cross-validated at every node in the tree. To save time, this heuristic cross-validates the number only once and then uses that number at every node in the tree. Usually this does not decrease accuracy but improves runtime considerably.
minNumInstances | Set the minimum number of instances at which a node is considered for splitting. The default value is 15.
numBoostingIterations | Set a fixed number of iterations for LogitBoost. If >= 0, this sets a fixed number of LogitBoost iterations that is used everywhere in the tree. If < 0, the number is cross-validated.
splitOnResiduals | Set the splitting criterion based on the residuals of LogitBoost. There are two possible splitting criteria for LMT: the default is the C4.5 splitting criterion, which uses information gain on the class variable. The alternative criterion tries to improve the purity in the residuals produced when fitting the logistic regression functions. The choice of splitting criterion does not usually affect classification accuracy much, but can produce different trees.
useAIC | The AIC is used to determine when to stop LogitBoost iterations. The default is not to use AIC.
weightTrimBeta | Set the beta value used for weight trimming in LogitBoost. Only instances carrying (1 - beta)% of the weight from the previous iteration are used in the next iteration. Set to 0 for no weight trimming. The default value is 0.
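These options can also be set programmatically. The sketch below assumes the bean-style setter methods that correspond to the option names above; the values shown are simply the defaults described in the table.

```java
import weka.classifiers.trees.LMT;

public class LMTOptionsExample {
    public static void main(String[] args) throws Exception {
        LMT lmt = new LMT();

        // Bean-style setters matching the option names above (assumed to mirror the table).
        lmt.setConvertNominal(false);       // keep multiway splits on nominal attributes
        lmt.setFastRegression(true);        // cross-validate the LogitBoost iterations only once
        lmt.setErrorOnProbabilities(false); // select iterations by misclassification error
        lmt.setNumBoostingIterations(-1);   // < 0: cross-validate the number of iterations
        lmt.setMinNumInstances(15);         // default minimum node size for splitting
        lmt.setWeightTrimBeta(0.0);         // 0 = no weight trimming
        lmt.setUseAIC(false);               // do not use AIC as the stopping criterion
        lmt.setSplitOnResiduals(false);     // use the C4.5 (information gain) split criterion

        // Print the resulting command-line option string.
        System.out.println(java.util.Arrays.toString(lmt.getOptions()));
    }
}
```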
Capabilities
The table below describes the capabilities of LMT.
Capability | Supported
---|---
Class | Missing class values, Nominal class, Binary class
Attributes | Nominal attributes, Missing values, Empty nominal attributes, Date attributes, Numeric attributes, Unary attributes, Binary attributes
Min # of instances | 1
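The same information can be queried programmatically, as in the sketch below; the individual Capability constants shown are chosen to match the table above.

```java
import weka.classifiers.trees.LMT;
import weka.core.Capabilities;
import weka.core.Capabilities.Capability;

public class LMTCapabilitiesExample {
    public static void main(String[] args) {
        Capabilities caps = new LMT().getCapabilities();

        // Print everything the classifier reports it can handle.
        System.out.println(caps);

        // Query individual capabilities, e.g. before applying LMT to a dataset.
        System.out.println("Nominal class:      " + caps.handles(Capability.NOMINAL_CLASS));
        System.out.println("Numeric attributes: " + caps.handles(Capability.NUMERIC_ATTRIBUTES));
        System.out.println("Missing values:     " + caps.handles(Capability.MISSING_VALUES));
    }
}
```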