SubsetByExpression
Package
weka.filters.unsupervised.instance
Synopsis
Filters instances according to a user-specified expression.
Grammar:

    boolexpr_list ::= boolexpr_list boolexpr_part | boolexpr_part;

    boolexpr_part ::= boolexpr:e {: parser.setResult(e); :} ;

    boolexpr ::= BOOLEAN
               | true
               | false
               | expr < expr
               | expr <= expr
               | expr > expr
               | expr >= expr
               | expr = expr
               | ( boolexpr )
               | not boolexpr
               | boolexpr and boolexpr
               | boolexpr or boolexpr
               | ATTRIBUTE is STRING
               ;

    expr ::= NUMBER
           | ATTRIBUTE
           | ( expr )
           | opexpr
           | funcexpr
           ;

    opexpr ::= expr + expr
             | expr - expr
             | expr * expr
             | expr / expr
             ;

    funcexpr ::= abs ( expr )
               | sqrt ( expr )
               | log ( expr )
               | exp ( expr )
               | sin ( expr )
               | cos ( expr )
               | tan ( expr )
               | rint ( expr )
               | floor ( expr )
               | pow ( expr for base , expr for exponent )
               | ceil ( expr )
               ;
Notes:
- NUMBER: any integer or floating point number (but not in scientific notation!)
- STRING: any string surrounded by single quotes; the string may not contain a single quote though.
- ATTRIBUTE: the following placeholders are recognized for attribute values:
  - CLASS for the class value in case a class attribute is set.
  - ATTxyz with xyz a number from 1 to # of attributes in the dataset, representing the value of the indexed attribute.
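The token rules in the notes can be checked before an expression is handed to the filter. The following is a minimal sketch only: the class `ExpressionTokens` and its methods are hypothetical helpers, not part of Weka, and the regular expressions are an assumption about what the notes describe (for example, signed numbers and forms such as `.5` are rejected here, which may be stricter than the filter's actual lexer).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Weka: approximates the NUMBER, STRING and
// ATTRIBUTE rules described in the notes.
final class ExpressionTokens {
    // Digits with an optional fractional part; no exponent, so "1e5" is rejected.
    private static final Pattern NUMBER = Pattern.compile("\\d+(\\.\\d+)?");
    // Single-quoted text that contains no further single quote.
    private static final Pattern STRING = Pattern.compile("'[^']*'");
    // ATT followed by a 1-based index (no leading zero, at most 9 digits).
    private static final Pattern ATT = Pattern.compile("ATT([1-9]\\d{0,8})");

    static boolean isNumber(String token) {
        return NUMBER.matcher(token).matches();
    }

    static boolean isString(String token) {
        return STRING.matcher(token).matches();
    }

    // numAttributes is the number of attributes in the dataset being filtered.
    static boolean isAttribute(String token, int numAttributes) {
        if (token.equals("CLASS")) {
            return true;
        }
        Matcher m = ATT.matcher(token);
        return m.matches() && Integer.parseInt(m.group(1)) <= numAttributes;
    }

    public static void main(String[] args) {
        int numAttributes = 18; // assumed attribute count, for illustration only
        for (String token : new String[] {"3.14", "1e5", "'bird'", "ATT14", "ATT19"}) {
            System.out.println(token
                + " number=" + isNumber(token)
                + " string=" + isString(token)
                + " attribute=" + isAttribute(token, numAttributes));
        }
    }
}
```

A check like this only catches malformed tokens early; whether the class attribute is actually set, which CLASS depends on, still has to be verified against the dataset.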
Examples:
- extracting only mammals and birds from the 'zoo' UCI dataset:
  (CLASS is 'mammal') or (CLASS is 'bird')
- extracting only animals with at least 2 legs from the 'zoo' UCI dataset:
  (ATT14 >= 2)
- extracting only instances with non-missing 'wage-increase-second-year' from the 'labor' UCI dataset:
  not ismissing(ATT3)
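When expressions like the first two examples are assembled in code, two of the rules from the notes are easy to violate: labels must not contain a single quote, and numbers must not appear in scientific notation (Java's `Double.toString` produces `1.0E7` for large values). The following is a minimal sketch of a string builder that guards against both. `ExpressionBuilder` and its methods are hypothetical names introduced for illustration and are not part of Weka.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Weka: builds expression strings that
// follow the documented grammar.
final class ExpressionBuilder {

    // Produces "(CLASS is 'label')"; the grammar forbids a quote inside STRING.
    static String classIs(String label) {
        if (label.indexOf('\'') >= 0) {
            throw new IllegalArgumentException(
                "label must not contain a single quote: " + label);
        }
        return "(CLASS is '" + label + "')";
    }

    // Joins one class test per label with "or".
    static String classIsAnyOf(String... labels) {
        if (labels.length == 0) {
            throw new IllegalArgumentException("at least one label is required");
        }
        return Arrays.stream(labels)
            .map(ExpressionBuilder::classIs)
            .collect(Collectors.joining(" or "));
    }

    // Produces "(ATT<index> >= <min>)" with the number written in plain
    // decimal form, never in scientific notation.
    static String attAtLeast(int index, double min) {
        if (index < 1) {
            throw new IllegalArgumentException("attribute indices start at 1");
        }
        String number = BigDecimal.valueOf(min).stripTrailingZeros().toPlainString();
        return "(ATT" + index + " >= " + number + ")";
    }

    public static void main(String[] args) {
        System.out.println(classIsAnyOf("mammal", "bird"));
        System.out.println(attAtLeast(14, 2));
    }
}
```

The resulting string is what would be supplied as the filter's `expression` option described below.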
Options
The table below describes the options available for SubsetByExpression.
Option | Description |
---|---|
debug | Turns on output of debugging information. |
expression | The expression to use for filtering the dataset. |
Capabilities
The table below describes the capabilities of SubsetByExpression.
Capability | Supported |
---|---|
Class | Date class, Numeric class, Missing class values, Nominal class, No class, Binary class |
Attributes | Missing values, Numeric attributes, Nominal attributes, Empty nominal attributes, Date attributes, Binary attributes, Unary attributes |
Min # of instances | 0 |