...

Kettle Type                         Hadoop Type
ValueMetaInterface.TYPE_STRING      org.apache.hadoop.io.Text
ValueMetaInterface.TYPE_BIGNUMBER   org.apache.hadoop.io.Text
ValueMetaInterface.TYPE_DATE        org.apache.hadoop.io.Text
ValueMetaInterface.TYPE_INTEGER     org.apache.hadoop.io.LongWritable
ValueMetaInterface.TYPE_NUMBER      org.apache.hadoop.io.DoubleWritable
ValueMetaInterface.TYPE_BOOLEAN     org.apache.hadoop.io.BooleanWritable
ValueMetaInterface.TYPE_BINARY      org.apache.hadoop.io.BytesWritable
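
The default mapping above can be read as a simple lookup from a Kettle type to a Hadoop Writable class. The following is a minimal sketch of that idea for a few of the rows, not the actual Pentaho MapReduce implementation: the class name KettleHadoopTypeMap and the method hadoopTypeFor are hypothetical, and types are held as plain strings so that the example runs without the Kettle or Hadoop jars.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Kettle or Hadoop.
public class KettleHadoopTypeMap {

    // Kettle type constant name -> Hadoop Writable class name,
    // copied from a few rows of the table above.
    private static final Map<String, String> DEFAULTS = new LinkedHashMap<>();

    static {
        DEFAULTS.put("TYPE_STRING", "org.apache.hadoop.io.Text");
        DEFAULTS.put("TYPE_DATE", "org.apache.hadoop.io.Text");
        DEFAULTS.put("TYPE_INTEGER", "org.apache.hadoop.io.LongWritable");
        DEFAULTS.put("TYPE_BOOLEAN", "org.apache.hadoop.io.BooleanWritable");
        DEFAULTS.put("TYPE_BINARY", "org.apache.hadoop.io.BytesWritable");
    }

    // Returns the default Hadoop class name for a Kettle type name,
    // or throws if the sketch has no entry for it.
    public static String hadoopTypeFor(String kettleType) {
        String hadoopType = DEFAULTS.get(kettleType);
        if (hadoopType == null) {
            throw new IllegalArgumentException(
                "No default Hadoop type for " + kettleType);
        }
        return hadoopType;
    }

    public static void main(String[] args) {
        // Print every mapping held by the sketch.
        for (Map.Entry<String, String> entry : DEFAULTS.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Note that several Kettle types share org.apache.hadoop.io.Text, so the mapping cannot be inverted without extra information about the field.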

Defining your own Type Converter

TODO

Distributed Cache

Pentaho MapReduce relies on Hadoop's Distributed Cache to distribute the Kettle environment, configuration, and plugins across the cluster. By leveraging the Distributed Cache, network traffic is reduced for subsequent executions because the Kettle environment is automatically configured on each node. This also allows you to use multiple versions of Kettle against a single cluster.
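
The reuse behaviour described above can be modelled in a few lines. This is only a conceptual sketch under stated assumptions, not how Pentaho MapReduce is actually implemented: the class KettleCacheSketch, the per-version directory layout, the root path, and the version strings are all hypothetical, and an in-memory set stands in for the cluster file system.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, for illustration only; not part of Kettle or Hadoop.
public class KettleCacheSketch {

    // Stand-in for the cluster file system: directories already staged.
    private final Set<String> stagedDirs = new HashSet<>();
    private int uploads = 0;

    // Assumed layout: one directory per Kettle version under a shared root.
    public static String installDir(String root, String kettleVersion) {
        return root + "/" + kettleVersion;
    }

    // Stages the environment only if this version is not already present.
    // Returns true when an upload was needed, false when it was reused.
    public boolean ensureStaged(String root, String kettleVersion) {
        String dir = installDir(root, kettleVersion);
        if (stagedDirs.add(dir)) {
            uploads++;
            return true;
        }
        return false;
    }

    public int uploadCount() {
        return uploads;
    }

    public static void main(String[] args) {
        KettleCacheSketch cluster = new KettleCacheSketch();
        String root = "/opt/pentaho/mapreduce"; // assumed path

        System.out.println(cluster.ensureStaged(root, "4.3.0")); // true: first run uploads
        System.out.println(cluster.ensureStaged(root, "4.3.0")); // false: later runs reuse it
        System.out.println(cluster.ensureStaged(root, "4.4.0")); // true: a second version sits alongside
        System.out.println(cluster.uploadCount());               // 2
    }
}
```

Keying the staged environment by version is what lets repeated executions skip the upload while still allowing different Kettle versions to coexist on one cluster.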

...