...
You can customize an existing Kettle environment installation in HDFS by copying additional jars and plugins into it. This can be done manually (hadoop fs -copyFromLocal <localsrc> ... <dst>) or with the Hadoop Copy Files job entry.
See Appendix B for the supported directory structure in HDFS.
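For example, here is a minimal sketch assuming the environment was installed under /opt/pentaho/mapreduce as shown in Appendix B; the local file names are placeholders for your own artifacts:

```
# Sketch only: the target paths follow the "custom" layout from Appendix B,
# and my-custom-code.jar / my-custom-plugin are placeholder names.
hadoop fs -copyFromLocal my-custom-code.jar /opt/pentaho/mapreduce/custom/lib/
hadoop fs -copyFromLocal my-custom-plugin /opt/pentaho/mapreduce/custom/plugins/
```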
Adding JDBC drivers to the Kettle environment
JDBC drivers and their required dependencies must be placed in the installation directory's lib/ directory.
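If the drivers are being added to a DFS-installed environment such as the one in Appendix B, the same copy mechanism applies; the driver file name and target path below are illustrative only:

```
# Illustrative sketch: copy the JDBC driver jar (plus any jars it
# depends on) into the environment's lib/ directory in HDFS.
hadoop fs -copyFromLocal mysql-connector-java-5.1.22.jar /opt/pentaho/mapreduce/custom/lib/
```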
Upgrading from the Pentaho Hadoop Distribution (PHD)
...
- Remove the pentaho.* properties from your mapred-site.xml (a combined sketch of these steps appears after the list)
- Remove the directories those properties referenced
- Restart the TaskTracker process
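A rough single-node sketch of the cleanup; the property names, the directory path, and the service name below are examples only and will differ per installation:

```
# 1. Remove every <property> whose <name> starts with "pentaho." from
#    mapred-site.xml (edit by hand or with your configuration tooling).
vi /etc/hadoop/conf/mapred-site.xml

# 2. Delete the directories those properties referenced
#    (hypothetical path shown).
rm -rf /opt/pentaho/phd

# 3. Restart the TaskTracker so the change takes effect
#    (service name varies by Hadoop distribution).
service hadoop-0.20-mapreduce-tasktracker restart
```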
Appendix A: pentaho-mapreduce-libraries.zip structure
...
```
pentaho-mapreduce-libraries.zip/
`- lib/
   +- kettle-core-{version}.jar
   +- kettle-engine-{version}.jar
   `- .. (all other required Kettle dependencies and optional jars)
```
Appendix B: Example Kettle environment installation directory structure within DFS
```
/opt/pentaho/mapreduce/
+- 4.3.0/
|  +- lib/
|  |  +- kettle-core-{version}.jar
|  |  +- kettle-engine-{version}.jar
|  |  `- .. (all other required Kettle dependencies and optional jars)
|  `- plugins/
|     +- pentaho-big-data-plugin/
|     `- .. (additional optional plugins)
`- custom/
   +- lib/
   |  +- kettle-core-{version}.jar
   |  +- kettle-engine-{version}.jar
   |  +- my-custom-code.jar
   |  `- .. (all other required Kettle dependencies and optional jars)
   `- plugins/
      +- pentaho-big-data-plugin/
      |  ..
      `- my-custom-plugin/
         ..
```