RandomProjection
Package
weka.filters.unsupervised.attribute
Synopsis
Reduces the dimensionality of the data by projecting it onto a lower-dimensional subspace using a random matrix with columns of unit length. Like PCA, it reduces the number of attributes while preserving much of the variation in the data, but at a much lower computational cost.
It first applies the NominalToBinary filter to convert all attributes to numeric before reducing the dimension. It preserves the class attribute.
For more information, see:
Dmitriy Fradkin, David Madigan: Experiments with random projections for machine learning. In: KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, New York, NY, USA, 517-522, 2003.
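The projection described in the synopsis can be illustrated with a minimal, self-contained sketch. It uses only the Java standard library and is not Weka's implementation: the class name `RandomProjectionSketch` and its methods are hypothetical, the matrix entries follow the Sparse1 distribution listed under Options, and the sketch skips the NominalToBinary step, class-attribute handling, and any column normalization.

```java
import java.util.Random;

// Hypothetical illustration only; none of these names are part of the Weka API.
class RandomProjectionSketch {

    // Draws one Sparse1 entry: sqrt(3) * {-1 w.p. 1/6, 0 w.p. 2/3, +1 w.p. 1/6}.
    static double sparse1Entry(Random rng) {
        int k = rng.nextInt(6);              // uniform over 0..5
        if (k == 0) return -Math.sqrt(3.0);  // one outcome in six
        if (k == 1) return Math.sqrt(3.0);   // one outcome in six
        return 0.0;                          // remaining four outcomes in six
    }

    // Builds a d x k random matrix (d input attributes, k output attributes).
    static double[][] sparse1Matrix(int d, int k, long seed) {
        Random rng = new Random(seed);
        double[][] r = new double[d][k];
        for (int i = 0; i < d; i++) {
            for (int j = 0; j < k; j++) {
                r[i][j] = sparse1Entry(rng);
            }
        }
        return r;
    }

    // Projects one numeric instance x (length d) onto k dimensions: y = x * R.
    static double[] project(double[] x, double[][] r) {
        int k = r[0].length;
        double[] y = new double[k];
        for (int j = 0; j < k; j++) {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) {
                sum += x[i] * r[i][j];
            }
            y[j] = sum;
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] r = sparse1Matrix(5, 2, 42L);
        double[] y = project(new double[] {1.0, 0.5, -2.0, 3.0, 0.0}, r);
        System.out.println(y.length + " projected attributes");
    }
}
```

The cost of building the matrix is independent of the data, and projecting each instance takes d * k multiplications, which is where the saving over PCA comes from.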
Options
The table below describes the options available for RandomProjection.
Option | Description |
---|---|
distribution | The distribution to use for calculating the random matrix. Sparse1 is: sqrt(3) * { -1 with prob(1/6), 0 with prob(2/3), +1 with prob(1/6) }. Sparse2 is: { -1 with prob(1/2), +1 with prob(1/2) }. |
numberOfAttributes | The number of dimensions (attributes) the data should be reduced to. |
percent | The percentage of dimensions (attributes) the data should be reduced to (inclusive of the class attribute). The numberOfAttributes option is ignored if this option is present or is greater than zero. |
randomSeed | The random seed used by the random number generator that generates the random matrix. |
replaceMissingValues | If set, the filter uses weka.filters.unsupervised.attribute.ReplaceMissingValues to replace the missing values. |
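As a quick check on the two sparse distributions, the moments of a single matrix entry can be worked out directly from the probabilities given in the table. The symbol r for one entry is introduced here for illustration only; it does not appear in the Weka documentation.

```latex
\begin{aligned}
\text{Sparse1:}\quad
\mathbb{E}[r] &= \sqrt{3}\left(-\tfrac{1}{6} + 0\cdot\tfrac{2}{3} + \tfrac{1}{6}\right) = 0,
&\quad
\mathbb{E}[r^{2}] &= 3\left(\tfrac{1}{6} + 0\cdot\tfrac{2}{3} + \tfrac{1}{6}\right) = 1,\\
\text{Sparse2:}\quad
\mathbb{E}[r] &= -\tfrac{1}{2} + \tfrac{1}{2} = 0,
&\quad
\mathbb{E}[r^{2}] &= \tfrac{1}{2} + \tfrac{1}{2} = 1.
\end{aligned}
```

Both distributions therefore give entries with zero mean and unit variance. The sqrt(3) factor in Sparse1 compensates for the two thirds of entries that are zero.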
Capabilities
The table below describes the capabilities of RandomProjection.
Capability | Supported |
---|---|
Class | Missing class values, Empty nominal class, Nominal class, String class, Date class, No class, Unary class, Relational class, Numeric class, Binary class |
Attributes | Unary attributes, Date attributes, Nominal attributes, Relational attributes, String attributes, Empty nominal attributes, Missing values, Numeric attributes, Binary attributes |
Min # of instances | 0 |