SubsetByExpression
Package
weka.filters.unsupervised.instance
Synopsis
Filters instances according to a user-specified expression.
Grammar:
boolexpr_list ::= boolexpr_list boolexpr_part | boolexpr_part;
boolexpr_part ::= boolexpr:e {: parser.setResult(e); :} ;
boolexpr ::= BOOLEAN
| true
| false
| expr < expr
| expr <= expr
| expr > expr
| expr >= expr
| expr = expr
| ( boolexpr )
| not boolexpr
| boolexpr and boolexpr
| boolexpr or boolexpr
| ATTRIBUTE is STRING
;
expr ::= NUMBER
| ATTRIBUTE
| ( expr )
| opexpr
| funcexpr
;
opexpr ::= expr + expr
| expr - expr
| expr * expr
| expr / expr
;
funcexpr ::= abs ( expr )
| sqrt ( expr )
| log ( expr )
| exp ( expr )
| sin ( expr )
| cos ( expr )
| tan ( expr )
| rint ( expr )
| floor ( expr )
| pow ( expr for base , expr for exponent )
| ceil ( expr )
;
Notes:
NUMBER
any integer or floating point number
(but not in scientific notation!)
STRING
any string surrounded by single quotes;
the string may not contain a single quote though.
ATTRIBUTE
the following placeholders are recognized for
attribute values:
CLASS for the class value in case a class attribute is set.
ATTxyz with xyz a number from 1 to # of attributes in the
dataset, representing the value of the indexed attribute.
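The token rules above are easy to violate when an expression is assembled programmatically, since Java prints very small or very large doubles in scientific notation by default. The following is a minimal sketch, not part of Weka, of helpers that produce tokens obeying those rules. The class name `ExpressionLiterals` and its methods are hypothetical, and whether the parser accepts a leading minus sign on a NUMBER is not stated here, so the sketch does not address signs.

```java
import java.math.BigDecimal;

// Hypothetical helper class (not part of Weka) for building expression tokens.
class ExpressionLiterals {

    // NUMBER: plain decimal digits, never scientific notation.
    // Double.toString may yield "1.0E-5"; BigDecimal.toPlainString expands it.
    static String number(double value) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            throw new IllegalArgumentException("not a finite number: " + value);
        }
        return new BigDecimal(Double.toString(value)).toPlainString();
    }

    // STRING: wrapped in single quotes; an embedded single quote is rejected
    // because the notes say the string may not contain one.
    static String string(String text) {
        if (text.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("single quote not allowed: " + text);
        }
        return "'" + text + "'";
    }

    // ATTRIBUTE: ATTxyz with a 1-based index no larger than the attribute count.
    static String attribute(int index, int numAttributes) {
        if (index < 1 || index > numAttributes) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        return "ATT" + index;
    }

    public static void main(String[] args) {
        int numAttributes = 20; // hypothetical attribute count, for illustration only
        System.out.println(attribute(3, numAttributes) + " < " + number(0.00001));
        System.out.println("CLASS is " + string("bird"));
    }
}
```

Validating the tokens before the expression reaches the filter keeps a malformed literal from surfacing later as a parse error.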
Examples:
extracting only mammals and birds from the 'zoo' UCI dataset:
(CLASS is 'mammal') or (CLASS is 'bird')
extracting only animals with at least 2 legs from the 'zoo' UCI dataset:
(ATT14 >= 2)
extracting only instances with non-missing 'wage-increase-second-year'
from the 'labor' UCI dataset:
not ismissing(ATT3)
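An expression like the first example can be generated from a list of class labels instead of being typed by hand. The sketch below assumes nothing beyond the grammar shown earlier; the class `ClassLabelExpression` and its method `anyOf` are hypothetical names introduced for illustration and are not part of Weka.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Weka): joins class labels into an
// "or" chain of "CLASS is STRING" tests.
class ClassLabelExpression {

    static String anyOf(List<String> labels) {
        if (labels.isEmpty()) {
            throw new IllegalArgumentException("at least one label is required");
        }
        return labels.stream()
                .map(ClassLabelExpression::isTest)
                .collect(Collectors.joining(" or "));
    }

    // Builds one parenthesized test; labels containing a single quote are
    // rejected because a STRING may not contain one.
    private static String isTest(String label) {
        if (label.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("single quote not allowed: " + label);
        }
        return "(CLASS is '" + label + "')";
    }

    public static void main(String[] args) {
        // prints: (CLASS is 'mammal') or (CLASS is 'bird')
        System.out.println(anyOf(Arrays.asList("mammal", "bird")));
    }
}
```

The resulting string is what would be supplied as the value of the expression option described in the Options section.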
Options
The table below describes the options available for SubsetByExpression.
| Option | Description |
|---|---|
| debug | Turns on output of debugging information. |
| expression | The expression to use for filtering the dataset. |
Capabilities
The table below describes the capabilities of SubsetByExpression.
| Capability | Supported |
|---|---|
| Class | Date class, Numeric class, Missing class values, Nominal class, No class, Binary class |
| Attributes | Missing values, Numeric attributes, Nominal attributes, Empty nominal attributes, Date attributes, Binary attributes, Unary attributes |
| Min # of instances | 0 |