sIB

Package

weka.clusterers

Synopsis

Cluster data using the sequential information bottleneck algorithm.

Note: only the hard clustering scheme is supported. sIB assigns each instance to the cluster that has the minimum cost/distance to that instance. The trade-off parameter beta is set to infinity, so 1/beta is zero.

For more information, see:

Noam Slonim, Nir Friedman, Naftali Tishby: Unsupervised document classification using sequential information maximization. In: Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval, 129-136, 2002.
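The hard assignment rule in the synopsis can be illustrated with a small sketch. It assumes the merge cost from the cited paper, d(x, t) = (p(x) + p(t)) * JS[p(y|x), p(y|t)], where JS is the Jensen-Shannon divergence weighted by p(x) and p(t). The class and method names below (HardAssignmentSketch, mergeCost, assign) are hypothetical and are not part of the Weka API; this is not Weka's internal implementation.

```java
public class HardAssignmentSketch {

    // Kullback-Leibler divergence KL(p || q) in nats; terms with p[i] == 0 contribute 0.
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0.0) {
                sum += p[i] * Math.log(p[i] / q[i]);
            }
        }
        return sum;
    }

    // Assumed merge cost: (p(x) + p(t)) * JS[p(y|x), p(y|t)],
    // with JS weights proportional to p(x) and p(t).
    static double mergeCost(double px, double[] pyGivenX, double pt, double[] pyGivenT) {
        double w1 = px / (px + pt);
        double w2 = pt / (px + pt);
        double[] mix = new double[pyGivenX.length];
        for (int i = 0; i < mix.length; i++) {
            mix[i] = w1 * pyGivenX[i] + w2 * pyGivenT[i];
        }
        double js = w1 * kl(pyGivenX, mix) + w2 * kl(pyGivenT, mix);
        return (px + pt) * js;
    }

    // Hard assignment: return the index of the cluster with the minimum merge cost.
    static int assign(double px, double[] pyGivenX, double[] pt, double[][] pyGivenT) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int t = 0; t < pt.length; t++) {
            double c = mergeCost(px, pyGivenX, pt[t], pyGivenT[t]);
            if (c < bestCost) {
                bestCost = c;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] pt = {0.5, 0.4};                                  // cluster priors p(t)
        double[][] pyGivenT = {{0.8, 0.1, 0.1}, {0.1, 0.1, 0.8}};  // cluster distributions p(y|t)
        double[] pyGivenX = {0.7, 0.2, 0.1};                       // instance distribution p(y|x)
        System.out.println("assigned cluster: " + assign(0.1, pyGivenX, pt, pyGivenT));
    }
}
```

In this toy input the instance's distribution is much closer to the first cluster, so the program prints "assigned cluster: 0". There is no soft membership: each instance ends up in exactly one cluster.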

Options

The table below describes the options available for sIB.

| Option | Description |
|---|---|
| debug | If set to true, the clusterer may output additional information to the console. |
| maxIterations | Maximum number of iterations (default 100). |
| minChange | Minimum number of changes (default 0). |
| notUnifyNorm | Whether to normalize each instance to a uniform prior probability (e.g. 1). |
| numClusters | Number of clusters (default 2). |
| numRestarts | Number of restarts (default 5). |
| seed | The random number seed to be used. |
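The notUnifyNorm description refers to normalizing each instance to a uniform prior. The sketch below shows one plausible reading of that idea, assuming instances are rows of non-negative counts and that an instance's prior p(x) is its share of the total mass. The helper names (normalizeRow, priors) are hypothetical, and the code does not reproduce Weka's implementation or the polarity of the flag.

```java
import java.util.Arrays;

public class PriorNormalizationSketch {

    // Scale one instance so that its attribute values sum to 1.
    // Rows that sum to zero are returned unchanged.
    static double[] normalizeRow(double[] row) {
        double sum = 0.0;
        for (double v : row) {
            sum += v;
        }
        double[] out = row.clone();
        if (sum > 0.0) {
            for (int i = 0; i < out.length; i++) {
                out[i] = row[i] / sum;
            }
        }
        return out;
    }

    // Assumed prior p(x): each instance's share of the total mass over all instances.
    static double[] priors(double[][] rows) {
        double[] p = new double[rows.length];
        double total = 0.0;
        for (int i = 0; i < rows.length; i++) {
            for (double v : rows[i]) {
                p[i] += v;
            }
            total += p[i];
        }
        for (int i = 0; i < p.length; i++) {
            p[i] /= total;
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] counts = {{8, 2}, {1, 1}};

        // Without normalization, the larger row dominates the prior.
        System.out.println("raw priors:        " + Arrays.toString(priors(counts)));

        // After per-instance normalization, every row carries the same mass.
        double[][] normalized = new double[counts.length][];
        for (int i = 0; i < counts.length; i++) {
            normalized[i] = normalizeRow(counts[i]);
        }
        System.out.println("normalized priors: " + Arrays.toString(priors(normalized)));
    }
}
```

With the raw counts the priors are 10/12 and 2/12; after per-instance normalization both instances have a prior of 1/2, so long and short instances weigh equally in the clustering.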

Capabilities

The table below describes the capabilities of sIB.

| Capability | Supported |
|---|---|
| Class | No class |
| Attributes | Numeric attributes |
| Min # of instances | 1 |