Setting up and configuring the Pentaho Hadoop Node Distribution, Kettle (PDI), and Reporting for MapR.
...
- Follow installation instructions provided by MapR for your architecture: Setting up the Client - MapR
...
Kettle Client
- Download and extract Kettle CE from the Downloads page.
- Configure PDI Client for MapR
- Overview:
- The MapR native libraries for your architecture must be added to java.library.path
- The MapR Hadoop configuration directory must be on the classpath
- The MapR Hadoop Core library must be on the classpath
- All architectures
- Update the classpath property in $PDI_HOME/launcher/launcher.properties to include the relative path to your MapR configuration directory (or use the attached launcher.properties), e.g.:
classpath=../:../ui:../ui/images:../libext/mondrian/config:${HADOOP_HOME}/conf:../libext/bigdata/pigConf:../../../../opt/mapr/conf
- Delete $PDI_HOME/libext/pentahobigdata/hadoop-0.20.2-core.jar
- Copy $MAPR_HOME/hadoop/hadoop-0.20.2/lib/hadoop-0.20.2-dev-core.jar into $PDI_HOME/libext/pentahobigdata
- Copy $MAPR_HOME/hadoop/hadoop-0.20.2/lib/maprfs-0.1.jar into $PDI_HOME/libext/pentahobigdata
- Linux x64
- Replace $PDI_HOME/spoon.sh with the attached spoon.sh
- Replace $PDI_HOME/pan.sh with the attached pan.sh
- Replace $PDI_HOME/kitchen.sh with the attached kitchen.sh
- Replace $PDI_HOME/carte.sh with the attached carte.sh
- Mac OS X 64-bit
- Replace Data Integration 64-bit.app/Contents/Info.plist with the attached Info.plist
- Apply the Hadoop client configuration by adding core-site.xml, hdfs-site.xml, and mapred-site.xml to the $PDI_HOME directory.
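The jar replacement steps above can be scripted. The following is a minimal sketch, not an official Pentaho or MapR tool: the function name swap_hadoop_jars is hypothetical, and it assumes the MapR client is already installed so that both MapR jars exist in the source directory.

```shell
#!/bin/sh
# Sketch: replace the stock Hadoop core jar with the MapR jars.
# swap_hadoop_jars is a hypothetical helper, not part of PDI or MapR.
#
# Example call for the Kettle client (paths taken from the steps above):
#   swap_hadoop_jars "$MAPR_HOME/hadoop/hadoop-0.20.2/lib" \
#                    "$PDI_HOME/libext/pentahobigdata"
swap_hadoop_jars() {
    mapr_lib="$1"    # directory holding the MapR jars
    target_dir="$2"  # directory holding hadoop-0.20.2-core.jar

    # Fail early if either MapR jar is missing, before touching the target.
    for jar in hadoop-0.20.2-dev-core.jar maprfs-0.1.jar; do
        if [ ! -f "$mapr_lib/$jar" ]; then
            echo "missing $mapr_lib/$jar" >&2
            return 1
        fi
    done

    # Remove the stock Hadoop core jar (no error if it is already gone).
    rm -f "$target_dir/hadoop-0.20.2-core.jar"

    # Copy the MapR replacements into place.
    cp "$mapr_lib/hadoop-0.20.2-dev-core.jar" \
       "$mapr_lib/maprfs-0.1.jar" \
       "$target_dir/"
}
```

Checking for the MapR jars before deleting anything means a missing MapR client leaves the existing installation untouched.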
Report Designer
- Download and extract PRD from the Downloads page.
- Configure PRD for MapR
- Delete $PRD_HOME/lib/jdbc/hadoop-0.20.2-core.jar
- Copy $MAPR_HOME/hadoop/hadoop-0.20.2/lib/hadoop-0.20.2-dev-core.jar into $PRD_HOME/lib
- Copy $MAPR_HOME/hadoop/hadoop-0.20.2/lib/maprfs-0.1.jar into $PRD_HOME/lib
- Linux x64:
- Add "-Djava.library.path=/opt/mapr/hadoop/hadoop-0.20.2/lib/native/Linux-amd64-64" to the last line in $PRD_HOME/report-designer.sh
- Mac OS X 64-bit:
- Add "-Djava.library.path=/opt/mapr/hadoop/hadoop-0.20.2/lib/native/Mac_OS_X-x86_64-64" to the "VMOptions" entry in $PRD_HOME/Pentaho\ Report\ Designer.app/Contents/Info.plist
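The java.library.path value differs only in the final native directory name. As a rough sketch, the option for the current platform could be derived from uname output. The helper names mapr_native_dir and library_path_option are hypothetical, and only the two platforms listed above are handled; the /opt/mapr prefix is the one used in the steps above.

```shell
#!/bin/sh
# Sketch: build the -Djava.library.path option for the MapR native libraries.
# Both function names are hypothetical helpers for illustration only.

# Map an OS name and machine type (as printed by `uname -s` and `uname -m`)
# to the MapR native library directory name used in this guide.
mapr_native_dir() {
    case "$1/$2" in
        Linux/x86_64)  echo "Linux-amd64-64" ;;
        Darwin/x86_64) echo "Mac_OS_X-x86_64-64" ;;
        *) echo "unsupported platform: $1/$2" >&2; return 1 ;;
    esac
}

# Print the full JVM option for the given OS name and machine type.
library_path_option() {
    native=$(mapr_native_dir "$1" "$2") || return 1
    echo "-Djava.library.path=/opt/mapr/hadoop/hadoop-0.20.2/lib/native/$native"
}

# Example call for the current machine:
#   library_path_option "$(uname -s)" "$(uname -m)"
```

The printed option still has to be added by hand to report-designer.sh or Info.plist as described above; the sketch only avoids typing the platform-specific directory name.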
Hadoop Node Configuration
Download the Pentaho Hadoop Node Distribution (PHD):
Ubuntu: phd-ce-mapr-bigdata-preview-4.3_all.deb
RedHat/CentOS: phd-ce-mapr-bigdata-preview-4.3.noarch.rpm
All TaskTracker nodes must have the pentaho-mapreduce (PHD) package installed. The packages require the MapR TaskTracker (mapr-tasktracker) package to be installed first.
At a high level, the packages perform the following steps:
...
<property>
<name>pentaho.kettle.home</name>
<value>/opt/pentaho/pentaho-mapreduce</value>
</property>
<property>
<name>pentaho.kettle.plugins.dir</name>
<value>/opt/pentaho/pentaho-mapreduce/plugins</value>
</property>
RedHat/CentOS
...
rpm -i phd-ce-mapr-bigdata-preview-4.3.noarch.rpm
Upgrade
rpm -U --force phd-ce-mapr-bigdata-preview-4.3.noarch.rpm
Ubuntu
Install
dpkg -i phd-ce-mapr-bigdata-preview-4.3_all.deb
Upgrade
Remove then reinstall:
dpkg -r pentaho-mapreduce
dpkg -i phd-ce-mapr-bigdata-preview-4.3_all.deb
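After installing or upgrading, a quick sanity check on each TaskTracker node can confirm that the directories named by the pentaho.kettle.home and pentaho.kettle.plugins.dir properties exist. This is a minimal sketch: check_phd_layout is a hypothetical helper, and it assumes the package creates the directories at the paths shown in those properties.

```shell
#!/bin/sh
# Sketch: verify the Pentaho MapReduce directories exist after installation.
# check_phd_layout is a hypothetical helper; the default path is the
# pentaho.kettle.home value shown in the properties above.
check_phd_layout() {
    kettle_home="${1:-/opt/pentaho/pentaho-mapreduce}"

    if [ ! -d "$kettle_home" ]; then
        echo "missing directory: $kettle_home" >&2
        return 1
    fi

    # pentaho.kettle.plugins.dir points at the plugins subdirectory.
    if [ ! -d "$kettle_home/plugins" ]; then
        echo "missing directory: $kettle_home/plugins" >&2
        return 1
    fi

    echo "ok: $kettle_home"
}

# Example call using the default path:
#   check_phd_layout
```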
Restart JobTracker and TaskTracker
To complete the installation, restart the JobTracker and TaskTracker nodes so that the HADOOP_CLASSPATH changes take effect.