...

Hadoop configurations, new in Pentaho Big Data Plugin v1.3, are collections of the Hadoop libraries required to communicate with a specific version of Hadoop and its related tools (Hive, HBase, Sqoop, Pig, etc.). They are designed to be easy to configure.

Configuring the default Hadoop configuration

The Pentaho Big Data Plugin uses the Hadoop configuration named in its plugin.properties file to communicate with Hadoop. By default, this is the "pentaho-hadoop-shims-hadoop-20" configuration. Update the active.hadoop.configuration property to match the Hadoop configuration you want to use:

# The Hadoop Configuration to use when communicating with a Hadoop cluster. This is used for all Hadoop client tools
# including HDFS, Hive, HBase, and Sqoop.
active.hadoop.configuration=pentaho-hadoop-shims-hadoop-20
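To check which configuration a given plugin.properties file selects, the property can be read with a few lines of script. The following is a minimal sketch, not part of the plugin: the helper names parse_properties and active_configuration_dir are hypothetical, the parser handles only simple key=value lines, and it assumes the property value is also the name of a folder under pentaho-big-data-plugin/hadoop-configurations.

```python
from pathlib import PurePosixPath


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments.

    This is a simplified reader; it does not cover the full Java
    .properties syntax (line continuations, ':' separators, escapes).
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def active_configuration_dir(properties_text, plugin_dir="pentaho-big-data-plugin"):
    """Return the folder the active configuration is assumed to live in.

    Assumption: the value of active.hadoop.configuration matches a folder
    name under <plugin_dir>/hadoop-configurations.
    """
    name = parse_properties(properties_text)["active.hadoop.configuration"]
    return PurePosixPath(plugin_dir) / "hadoop-configurations" / name


# Sample content mirroring the plugin.properties excerpt above.
SAMPLE = """\
# The Hadoop Configuration to use when communicating with a Hadoop cluster.
active.hadoop.configuration=pentaho-hadoop-shims-hadoop-20
"""

print(active_configuration_dir(SAMPLE))
```

In practice you would pass the text of your own plugin.properties file instead of SAMPLE and confirm that the resulting folder exists before starting the plugin.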

Structure

Hadoop configurations reside in pentaho-big-data-plugin/hadoop-configurations. They all share a basic structure:

...