What's new or improved in Weka 3.7.6

Core Weka

  • Weka 3.7 is now licensed under GPL 3.0.

  • Weka releases are now available on Maven Central.

  • Logistic now has an option to use conjugate gradient descent rather than quasi-Newton with BFGS updates.

  • weka.classifiers.bayes.NaiveBayesMultinomialText - naive Bayes multinomial classifier that operates directly on string attributes.

  • Appender component for the Knowledge Flow that can append sets of instances together.

  • SubstringLabeler component for the Knowledge Flow that can use substring or regex matching on string attribute values to assign user-defined nominal values to a new "label" attribute.

  • SubstringReplacer component for the Knowledge Flow that can replace substrings or regex matches in string attribute values with user-supplied strings.

  • Sorter component for the Knowledge Flow that implements a streaming merge sort that writes a sorted in-memory buffer to a file when full. Can sort descending or ascending on multiple attributes.

  • DatabaseSaver can now truncate the target table if desired.

  • Area under the precision-recall curve is now available as an evaluation metric.

  • Package manager's cache refresh mechanism is now much faster.

  • Package manager now checks for new versions of existing packages on the server as well as entirely new packages.

  • Random forest now has an option to print all the ensemble trees as part of its output.
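
As a rough illustration of the idea behind the Sorter item above (sort a bounded in-memory buffer, write it to a file when full, then merge the files), here is a minimal stand-alone sketch. It is not Weka's implementation: the class SpillingSorter and its methods are hypothetical names invented for this example, and it sorts single-line strings with a caller-supplied Comparator rather than instances on multiple attributes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

// Hypothetical illustration class; not part of Weka.
class SpillingSorter {
    private final int bufferSize;                          // max records held in memory
    private final Comparator<String> order;                // ascending or descending
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> runs = new ArrayList<>();     // sorted runs already on disk

    SpillingSorter(int bufferSize, Comparator<String> order) {
        this.bufferSize = bufferSize;
        this.order = order;
    }

    // Accept one streamed record (assumed to contain no line breaks);
    // spill the buffer as a sorted run when it fills up.
    void add(String record) {
        buffer.add(record);
        if (buffer.size() >= bufferSize) {
            spill();
        }
    }

    // Sort the in-memory buffer and write it to a temporary file.
    private void spill() {
        buffer.sort(order);
        try {
            Path run = Files.createTempFile("sorter-run", ".txt");
            Files.write(run, buffer);
            runs.add(run);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        buffer.clear();
    }

    // One open run file plus the record currently at its front.
    private static final class Cursor {
        final BufferedReader reader;
        String head;

        Cursor(Path run) throws IOException {
            reader = Files.newBufferedReader(run);
            head = reader.readLine();
        }
    }

    // Merge all runs with a heap keyed on each run's front record and
    // pass the records to the sink in sorted order.
    void finish(Consumer<String> sink) {
        if (!buffer.isEmpty()) {
            spill();
        }
        Comparator<Cursor> byHead = (a, b) -> order.compare(a.head, b.head);
        PriorityQueue<Cursor> heap = new PriorityQueue<>(byHead);
        try {
            for (Path run : runs) {
                Cursor c = new Cursor(run);
                if (c.head != null) heap.add(c); else c.reader.close();
            }
            while (!heap.isEmpty()) {
                Cursor c = heap.poll();
                sink.accept(c.head);
                c.head = c.reader.readLine();
                if (c.head != null) heap.add(c); else c.reader.close();
            }
            for (Path run : runs) {
                Files.deleteIfExists(run);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        runs.clear();
    }
}
```

A caller would feed records with add(...) as they arrive and then call finish(...) with a sink such as list::add. Passing Comparator.reverseOrder() instead of Comparator.naturalOrder() gives a descending sort. Only bufferSize records plus one record per run file are held in memory at a time, which is the point of spilling to disk.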

In Packages

  • weka.clusterers.CascadeSimpleKMeans, contributed by Martin Guetlein.

  • weka.classifiers.functions.RBFRegressor added to the RBFNetwork package.

  • jsonFieldExtractor package - Knowledge Flow step to extract one or more fields from repeating blocks of JSON text into new attributes.