...

Customizing the Kettle Environment used by Pentaho MapReduce

  1. Unzip pentaho-mapreduce-libraries.zip; it contains a single lib/ directory with the required Kettle dependencies
  2. Add your additional libraries to the lib/ directory
  3. Zip up the lib/ directory into pentaho-mapreduce-libraries-custom.zip so that the archive contains lib/ with all jars inside it. (You may create subdirectories within lib/; all jars found in lib/ and its subdirectories will be added to the classpath of the executing job.)
  4. Update pentaho-mapreduce.properties, setting the following properties:
    Code Block
    pmr.kettle.installation.id=custom
    pmr.libraries.archive.file=pentaho-mapreduce-libraries-custom.zip
    

The next time you execute Pentaho MapReduce, the custom Kettle environment will be copied into HDFS at pmr.kettle.dfs.install.dir/custom and used when executing the job. You can switch between Kettle environments by specifying the pmr.kettle.installation.id property, either as a User Defined property on a particular Pentaho MapReduce job entry or globally in the pentaho-mapreduce.properties file*.
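As an illustration of the two places the property can be set (the installation id "experimental" below is an assumed example name, not a shipped default):

```properties
# pentaho-mapreduce.properties -- global default used by all jobs
pmr.kettle.installation.id=custom

# Alternatively, on an individual Pentaho MapReduce job entry, add a
# User Defined property with the same name to select a different
# installation for that job only, e.g.:
#   pmr.kettle.installation.id=experimental
```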

*Note: If the installation referenced by pmr.kettle.installation.id does not exist, the currently configured archive file and additional plugins will be used to "install" it into HDFS.

Upgrading from the Pentaho Hadoop Distribution (PHD)

...

If you're using a version of the Pentaho Hadoop Distribution (PHD) that allows you to configure the installation directory via mapred-site.xml, perform the following on all TaskTracker nodes:

...