
How to set up and configure Pentaho (Kettle, Pentaho Data Integration, Pentaho Business Analytics Suite) for your specific Hadoop distribution.


Warning - Pentaho 5.2, 5.3, and 5.4 - Configuration

...

Pentaho supports multiple versions of Hadoop distributions from several vendors, including Cloudera, Hortonworks, and MapR. To support this range of versions, Pentaho uses an abstraction layer, called a shim, that connects to the different Hadoop distributions. A shim is a small library that intercepts API calls and then redirects them, handles them itself, or changes the calling parameters. Pentaho periodically develops new shims as vendors release new Hadoop distributions and versions. Pentaho engineers test and certify these big data shims. The following steps will help you set up Pentaho to work with your Hadoop cluster.
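
To make the shim idea concrete, here is a minimal sketch of the general pattern in Python. It is not Pentaho's actual implementation: the class names, the `load_shim` helper, the vendor keys, and the `altfs` scheme are all hypothetical and exist only to show how one interface can hide per-distribution differences.

```python
from abc import ABC, abstractmethod


class HadoopShim(ABC):
    """Hypothetical common interface that application code calls."""

    @abstractmethod
    def default_fs_url(self, host: str, port: int) -> str:
        """Return the file system URL for this distribution."""


class VendorAShim(HadoopShim):
    # Passes the call through with the standard HDFS scheme.
    def default_fs_url(self, host: str, port: int) -> str:
        return f"hdfs://{host}:{port}"


class VendorBShim(HadoopShim):
    # Changes the calling parameters: this made-up distribution
    # uses a different (invented) scheme for its file system.
    def default_fs_url(self, host: str, port: int) -> str:
        return f"altfs://{host}:{port}"


# Hypothetical registry mapping a distribution key to its shim class.
SHIMS = {"vendor_a": VendorAShim, "vendor_b": VendorBShim}


def load_shim(active: str) -> HadoopShim:
    """Instantiate the shim registered under the given key."""
    try:
        return SHIMS[active]()
    except KeyError:
        raise ValueError(f"no shim registered for {active!r}") from None


if __name__ == "__main__":
    shim = load_shim("vendor_b")
    print(shim.default_fs_url("namenode.example.com", 8020))
```

The calling code only ever sees `HadoopShim`, so switching distributions in this sketch means changing the key passed to `load_shim` rather than touching the code that uses the cluster.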

...

Note: For instructions on preparing the shim to connect to a Kerberos-secured cluster, see our Mindtouch documentation: https://help.pentaho.com/Documentation/5.4/0P0/0W0/030/040.

Set Active Hadoop Distribution

...