...
Customizing the Kettle Environment used by Pentaho MapReduce
To customize the Kettle environment used by Pentaho MapReduce, perform the following steps (a command-line sketch follows the list):
- Unzip pentaho-mapreduce-libraries.zip; it contains a single lib/ directory with the required Kettle dependencies
- Add additional libraries to the lib/ directory
- Zip up the lib/ directory into pentaho-mapreduce-libraries-custom.zip so the archive contains lib/ with all jars inside it. You may create subdirectories within lib/; all jars found in lib/ and its subdirectories will be added to the classpath of the executing job.
- Update pentaho-mapreduce.properties and set the following properties:

```
pmr.kettle.installation.id=custom
pmr.libraries.archive.file=pentaho-mapreduce-libraries-custom.zip
```
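The steps above amount to repackaging the archive from the command line. A minimal sketch, assuming the standard zip and unzip utilities are available and that /path/to/extra.jar stands in for your additional library:

```
# Unpack the stock archive; it yields a single lib/ directory
unzip pentaho-mapreduce-libraries.zip

# Add your additional libraries (path is illustrative)
cp /path/to/extra.jar lib/

# Repackage so lib/ sits at the root of the new archive;
# -r recurses into any subdirectories you created under lib/
zip -r pentaho-mapreduce-libraries-custom.zip lib/
```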
The next time you execute Pentaho MapReduce, the custom Kettle environment will be copied into HDFS at pmr.kettle.dfs.install.dir/custom and used when executing the job. You can switch between Kettle environments by specifying the pmr.kettle.installation.id property as a User Defined property per Pentaho MapReduce job entry, or globally in the pentaho-mapreduce.properties file*.
*Note: If the installation referenced by pmr.kettle.installation.id does not exist, the archive file and additional plugins currently configured will be used to "install" it into HDFS.
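To confirm the installation after the first run, you can list the DFS install directory with the standard Hadoop CLI. A minimal sketch, assuming pmr.kettle.dfs.install.dir is set to /opt/pentaho/mapreduce (substitute your configured value):

```
# List the installed Kettle environments; each installation id is a subdirectory
hadoop fs -ls /opt/pentaho/mapreduce

# Inspect the 'custom' installation created from the archive
hadoop fs -ls /opt/pentaho/mapreduce/custom
```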
Upgrading from the Pentaho Hadoop Distribution (PHD)
...
If you're using a version of the Pentaho Hadoop Distribution (PHD) that allows you to configure the installation directory via mapred-site.xml, perform the following on all TaskTracker nodes:
...