{excerpt}How to set up and configure Kettle for your specific Hadoop distribution.{excerpt}

h2. {color:red}This page applies to Kettle and BA Suite version 4.4 (suite 4.8) only. For 5.0, go [here|Configuring Pentaho for your Hadoop Distro and Version (Pentaho Suite Version 5.1)].{color}

_The Pentaho applications come pre-configured for Apache Hadoop 0.20.2. If you are using this distro and version, no further configuration is required._

Documentation for configuring Pentaho for distros other than Apache Hadoop 0.20.2 is now located on the [Pentaho Infocenter|http://infocenter.pentaho.com/help/topic/bigdata_guide/reference_active_hadoop_configuration.html].

h2. Currently supported Hadoop distributions:
Pentaho uses an abstraction layer, called a shim, to keep up with the rapid and never-ending stream of distribution version updates.  The following list shows the currently known support status of each distribution.  A shim generally does not need to be updated for a minor or patch version change.

h2. {color:red}Upgrade{color} your Big Data Plugin to version 1.3.3.1 
The Big Data plugin has been updated to version 1.3.3.1 and is [available for download|https://support.pentaho.com/entries/24445558-Big-Data-Plugin-Version-1-3-3-for-Pentaho-BA-Server-4-8-1-x-and-PDI-4-4-1-x].

This upgrade works with PDI 4.4 (Suite 4.8) and is compatible with both the EE and CE editions of Pentaho.  Additional shims that were not shipped with the updated plugin are available on the [Additional Shims download page|https://pentaho.box.com/v4-AdditionalShims].

h2. Important information about supported Hadoop versions
Pentaho does not ship all available shims with the product.  Shims for older distributions, as well as new shims created after a release, are available for download.  If the Notes column says that a later version of a shim also supports your version, Pentaho recommends using the later version.

Click [Install Hadoop Distribution Shim] for installation instructions.
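
After a shim is installed, it usually has to be selected as the active one. The following is a minimal Python sketch of that step, assuming (this page does not state it) that the Big Data Plugin reads the active shim name from an {{active.hadoop.configuration}} key in its {{plugin.properties}} file. The helper name {{set_active_shim}} is hypothetical; the linked installation instructions remain the authoritative procedure.

```python
import re
from pathlib import Path

# ASSUMPTION: the property key below is not given on this page.
ACTIVE_KEY = "active.hadoop.configuration"


def set_active_shim(properties_path, shim):
    """Set ACTIVE_KEY to `shim` in a Java-style .properties file.

    An existing entry for the key is rewritten in place; if the key is
    missing, a new entry is appended. All other lines are kept unchanged.
    """
    path = Path(properties_path)
    pattern = re.compile(r"^\s*" + re.escape(ACTIVE_KEY) + r"\s*=")
    out = []
    replaced = False
    for line in path.read_text().splitlines():
        if pattern.match(line):
            out.append(f"{ACTIVE_KEY}={shim}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(f"{ACTIVE_KEY}={shim}")
    path.write_text("\n".join(out) + "\n")
```

For example, {{set_active_shim("plugin.properties", "cdh42")}} would select the cdh42 shim, where the file path depends on your installation. Back up the file first, because the sketch overwrites it.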

{composition-setup}{composition-setup}
{deck:id=MyDeck|class=tan}

{card:label=Apache}
||Hadoop Version||Shim||Pentaho Suite Ver||Download||Notes||
|0.20.x|hadoop-20|4.8+| included | |
|1.0.x|NS*| | | No support planned [See this blog post|http://funpdi.blogspot.com/2013/03/pentaho-data-integration-44-and-hadoop.html]|
|1.1.x|NS*| | | Not likely to be done in favor of 1.2.x [PDI-9964|http://jira.pentaho.com/browse/PDI-9964] |
|1.2.x|NS*| | | Possibly in patch post 5.0 but not committed [PDI-10393|http://jira.pentaho.com/browse/PDI-10393] |
|2.x.x|NS*| | | Distro is Alpha |
_Go to [Apache releases|http://hadoop.apache.org/releases.html]_
{card}

{card:label=Cloudera}
||Hadoop Version||Shim||Pentaho Suite Ver||Download||Notes||
|CDH3u3, u4 and u5|cdh3U4|4.8+ |[download|https://pentaho.box.com/s/97j347r3kv34tmyshqpo ] | Support will be dropped in 5.0 |
|CDH4.0, 4.0.1, 4.1, 4.1.1|cdh4|4.8+ | [download|https://pentaho.box.com/s/ricgmvpxbj409u3id6va ] | The cdh42 shim also supports this configuration |
|CDH4.1.2 |cdh412|4.8 + BD Plugin 1.3.2+| [download|https://pentaho.box.com/s/slsk5nqxi7uxc1r61iza ] | The cdh42 shim also supports this configuration |
|CDH4.1.3 |cdh413|4.8 + BD Plugin 1.3.2+| [download|https://pentaho.box.com/s/beiic9maz4jqfz6l659g ] | The cdh42 shim also supports this configuration |
|CDH4.2|cdh42|4.8 + BD Plugin 1.3.2+| included | Backward compatible with all earlier cdh4.x distros|
|CDH4.2.1|cdh42|4.8 + BD Plugin 1.3.3.1+| included | |
|CDH4.3|cdh42| 4.8 + BD Plugin 1.3.3.1+ | included |  |
|CDH4.4.x| cdh42 | 4.8 + BD Plugin 1.3.3.1+ | included | |
_Go to [Cloudera releases|https://ccp.cloudera.com/display/SUPPORT/CDH+Downloads]_

{color:navy}*NOTE: the cdh42 shim supports all versions of CDH from 4.0 through 4.4.x*{color}
{card}

{card:label=DataStax}
||Hadoop Version||Shim||Pentaho Suite Ver||Download||Notes||
|DSE 3.0.x| NS* | | | Possibly in patch post 5.0 but not committed [PDI-8036|http://jira.pentaho.com/browse/PDI-8036]|
|DSE 2.2.x| NS* | | | No current plans to support |
_Go to [DataStax releases|http://www.datastax.com/docs/datastax_enterprise3.0/dse_release_notes]_
{card}

{card:label=Hortonworks}
||Hadoop Version||Shim||Pentaho Suite Ver||Download||Notes||
|HDP 1.2.x| hdp12 |4.8 + BD Plugin 1.3.2+| included | |
|HDP 1.3.x| hdp13 |4.8 + BD Plugin 1.3.2+| [download|https://pentaho.box.com/s/0wqy2qty3szv7j3qt2za ] | |
|HDP 2.x| NS* | | | In patch post 5.0 - [PDI-8962|http://jira.pentaho.com/browse/PDI-8962] |
|HDP 1.1 for Win| NS* | | | In patch post 5.0 - [PDI-10266|http://jira.pentaho.com/browse/PDI-10266] |
_Go to [Hortonworks releases|http://hortonworks.com/download/]_
{card}

{card:label=Intel}
||Hadoop Version||Shim||Pentaho Suite Ver||Download||Notes||
|IDH 2.3|ihd23|4.8 + BD Plugin 1.3.2+| [download|https://pentaho.box.com/s/t8gzshgsnmfve6b2eg3h ] | |
_Go to [Intel releases|http://hadoop.intel.com/]_
{card}

{card:label=MapR}
||Hadoop Version||Shim||Pentaho Suite Ver||Download||Notes||
|1.1.3, 1.2.0|mapr|4.8+| [download|https://pentaho.box.com/s/dmdtw3vtq863cgz2l7k2 ] | |
|2.0.x|NS*| | | No support planned [PDI-9648|http://jira.pentaho.com/browse/PDI-9648]|
|2.1.x|mapr21|4.8 + BD Plugin 1.3.2+| included | |
|3.0.x|NS*| | | Planned for immediately after 5.0 [PDI-10037|http://jira.pentaho.com/browse/PDI-10037]|
_Go to [MapR releases|http://www.mapr.com/doc/display/MapR/MapR+Release+Notes]_
{card}

{deck}

_*\* NS - Not supported.*  See [Hadoop Configurations] for information on how to create or modify a shim to support your configuration._

_*\+ Pentaho Suite Ver* is the earliest version of the Pentaho suite that supports this shim.  Subsequent Pentaho versions will also support this shim unless otherwise noted._
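
The support tables above can be condensed into a small lookup. This Python sketch transcribes only the supported rows of those tables. The function name {{shim_for}}, the vendor keys, and the bare version strings (for example "4.1.2" rather than "CDH4.1.2") are hypothetical conventions chosen for illustration. All CDH 4.x versions map to {{cdh42}} because this page recommends the later shim wherever it also covers an earlier version. Anything marked NS or not listed returns None.

```python
from typing import Optional

# Rows that the tables list by exact version.
EXACT = {
    ("cloudera", "3u3"): "cdh3U4",
    ("cloudera", "3u4"): "cdh3U4",
    ("cloudera", "3u5"): "cdh3U4",
    ("mapr", "1.1.3"): "mapr",
    ("mapr", "1.2.0"): "mapr",
}

# Rows that the tables list by release family (major.minor).
FAMILY = {
    ("apache", "0.20"): "hadoop-20",
    ("cloudera", "4.0"): "cdh42",
    ("cloudera", "4.1"): "cdh42",
    ("cloudera", "4.2"): "cdh42",
    ("cloudera", "4.3"): "cdh42",
    ("cloudera", "4.4"): "cdh42",
    ("hortonworks", "1.2"): "hdp12",
    ("hortonworks", "1.3"): "hdp13",
    ("intel", "2.3"): "ihd23",
    ("mapr", "2.1"): "mapr21",
}


def shim_for(vendor: str, version: str) -> Optional[str]:
    """Return the shim name for a distro version, or None if unsupported."""
    vendor = vendor.strip().lower()
    version = version.strip().lower()
    # Exact matches take priority over family matches.
    if (vendor, version) in EXACT:
        return EXACT[(vendor, version)]
    # Reduce e.g. "4.1.2" to its "4.1" family before the second lookup.
    family = ".".join(version.split(".")[:2])
    return FAMILY.get((vendor, family))
```

For instance, {{shim_for("Cloudera", "4.1.2")}} yields the cdh42 shim, while {{shim_for("MapR", "2.0.1")}} yields None because MapR 2.0.x is listed as not supported.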

{HTMLComment}
{tip}
The Pentaho support policy for Hadoop is available on the [Pentaho Support Plan for Hadoop Distributions] page.
{tip}
{HTMLComment}

h2. Open JIRA cases for Distro Support
{jiraissues:anonymous=true|columns=key;fixVersion;summary;status;assignee;updated|url=http://jira.pentaho.com/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?jqlQuery=labels+%3D+BD_Distro+AND+status+in+%28Open%2C+%22In+Progress%22%2C+%22Resolved%22%2C+Reopened%2C+%22Ready+For+Test%22%2C+%22Ready+for+Publishing%22%29&tempMax=1000}

h2. Release resources

[!http://2.bp.blogspot.com/-GO6HF0OAFHw/UOfNEH-4sEI/AAAAAAAAAD0/dEWFFYTRgYw/s1600/output-file.png|width=100,height=75!|http://2.bp.blogspot.com/-GO6HF0OAFHw/UOfNEH-4sEI/AAAAAAAAAD0/dEWFFYTRgYw/s1600/output-file.png] [!http://hortonworks.com/wp-content/uploads/2013/05/hdp13.png|width=100,height=75!|http://hortonworks.com/wp-content/uploads/2013/05/hdp13.png]
* [Genealogy of Elephants II|http://drcos.boudnik.org/2013/01/what-you-wanted-to-know-about-hadoop.html]
* [A brief history of Apache Hadoop branches and releases|http://blog.cloudera.com/blog/2012/01/an-update-on-apache-hadoop-1-0/]