Property Name

Description

pmr.kettle.installation.id

Version of Kettle to use from the Kettle HDFS installation directory. If not set, the version of Kettle used to submit the Pentaho MapReduce job is used.

pmr.kettle.dfs.install.dir

Installation path in HDFS for the Kettle environment used to execute a Pentaho MapReduce job. A path that starts with / is treated as absolute; any other path is relative and is anchored to the user's home directory.

pmr.libraries.archive.file

Pentaho MapReduce Kettle environment runtime archive to be preloaded into pmr.kettle.dfs.install.dir/pmr.kettle.installation.id.

pmr.kettle.additional.plugins

Comma-separated list of additional plugins (by directory name) to be installed with the Kettle environment.
e.g. "steps/DummyPlugin,my-custom-plugin"
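
The path rules in the table above can be illustrated with a short Python sketch. It is not Pentaho's implementation: the function names and the example property values are hypothetical, and it assumes that HDFS home directories live under /user/<username>, which is the Hadoop default but can be configured differently on a given cluster.

```python
import posixpath

# Assumption: HDFS home directories are /user/<username> (Hadoop default).
HDFS_HOME_PREFIX = "/user"


def resolve_install_dir(install_dir, username):
    """Resolve a pmr.kettle.dfs.install.dir value to an absolute HDFS path.

    A value starting with "/" is absolute; anything else is anchored
    to the user's home directory.
    """
    if install_dir.startswith("/"):
        return posixpath.normpath(install_dir)
    home = posixpath.join(HDFS_HOME_PREFIX, username)
    return posixpath.normpath(posixpath.join(home, install_dir))


def kettle_environment_path(props, username):
    """Return <install dir>/<installation id>, where the archive is preloaded.

    This sketch requires pmr.kettle.installation.id to be set; it does not
    model the fallback to the version of Kettle that submits the job.
    """
    base = resolve_install_dir(props["pmr.kettle.dfs.install.dir"], username)
    return posixpath.join(base, props["pmr.kettle.installation.id"])


def additional_plugins(props):
    """Split the comma-separated plugin list into directory names."""
    raw = props.get("pmr.kettle.additional.plugins", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


# Hypothetical values, for illustration only.
example_props = {
    "pmr.kettle.installation.id": "example-version",
    "pmr.kettle.dfs.install.dir": "opt/pentaho/mapreduce",
    "pmr.kettle.additional.plugins": "steps/DummyPlugin,my-custom-plugin",
}

if __name__ == "__main__":
    print(kettle_environment_path(example_props, "alice"))
    print(additional_plugins(example_props))
```

Because the example install directory has no leading /, it resolves under the submitting user's home directory; prefixing it with / would make it independent of the user.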

Customizing the Kettle Environment used by Pentaho MapReduce

TODO

Upgrading from the Pentaho Hadoop Distribution (PHD)

The PHD is no longer required and can be safely removed. If you have modified your Pentaho Hadoop Distribution installation, you may wish to preserve the modified files so that the new Distributed Cache mechanism can take advantage of them. To do so, follow the instructions here.

If you're using a version of the Pentaho Hadoop Distribution (PHD) that allows you to configure the installation directory via mapred-site.xml, perform the following on all TaskTracker nodes:

  1. Remove the pentaho.* properties from your mapred-site.xml
  2. Remove the directories those properties referenced
  3. Restart the TaskTracker process
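
Step 1 above can be sketched in Python for anyone who wants to script it. This is only an illustration under stated assumptions: the function name and the sample property pentaho.example.dir are hypothetical, and the sample file assumes the standard Hadoop configuration layout of property elements with name and value children. The removed values are returned so that the directories they reference can be reviewed for step 2.

```python
import xml.etree.ElementTree as ET


def strip_pentaho_properties(xml_text):
    """Remove every <property> whose <name> starts with "pentaho.".

    Returns the cleaned XML text and a dict of the removed
    name/value pairs.
    """
    root = ET.fromstring(xml_text)
    removed = {}
    # Iterate over a copy of the list so removal is safe during the loop.
    for prop in list(root.findall("property")):
        name = (prop.findtext("name") or "").strip()
        if name.startswith("pentaho."):
            removed[name] = (prop.findtext("value") or "").strip()
            root.remove(prop)
    return ET.tostring(root, encoding="unicode"), removed


# Hypothetical mapred-site.xml content, for illustration only.
SAMPLE = """<configuration>
  <property><name>mapred.job.tracker</name><value>host:9001</value></property>
  <property><name>pentaho.example.dir</name><value>/opt/pentaho/phd</value></property>
</configuration>"""

if __name__ == "__main__":
    cleaned, removed = strip_pentaho_properties(SAMPLE)
    print(cleaned)
    print(removed)
```

Note that the default ElementTree parser discards XML comments, so back up the original mapred-site.xml before writing a cleaned copy over it.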