{scrollbar}
{excerpt}How to set up and configure Kettle for your specific Hadoop distribution.{excerpt}

_The Pentaho applications come pre-configured for Apache Hadoop 0.20.2. If you are using this distro and version, no further configuration is required._

Documentation for configuring Pentaho for distros other than Apache Hadoop 0.20.2 is now located on the [Pentaho Infocenter|http://infocenter.pentaho.com/help/topic/pdi_admin_guide/reference_active_hadoop_configuration.html].
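
As a rough illustration of what switching to another distro's configuration can involve, the minimal sketch below rewrites a single key in the text of a Java-style properties file. The key name {{active.hadoop.configuration}} and the assumption that it lives in the Big Data plugin's {{plugin.properties}} file are not stated on this page, and the helper {{set_active_configuration}} is hypothetical. Treat the Infocenter page as the authoritative source for the actual steps.

```python
import re

# Assumed property name; verify it against the Infocenter page for your version.
ACTIVE_KEY = "active.hadoop.configuration"


def set_active_configuration(properties_text: str, shim: str) -> str:
    """Return properties_text with ACTIVE_KEY set to shim.

    All other lines, including comments, are preserved. If the key is
    missing, it is appended at the end.
    """
    # Match "key=value" or "key: value"; commented-out lines do not match.
    pattern = re.compile(r"^\s*" + re.escape(ACTIVE_KEY) + r"\s*[=:].*$")
    new_line = f"{ACTIVE_KEY}={shim}"
    lines = properties_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if pattern.match(line):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "# sample plugin.properties\nactive.hadoop.configuration=hadoop-20\n"
    print(set_active_configuration(sample, "cdh42"), end="")
```

The function works on a string rather than a file path so that it can be tried without touching a real installation. Writing the result back to disk is left to the caller.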

h2. Currently supported Hadoop distributions:
Pentaho uses an abstraction layer, called a shim, to keep up with the rapid and never-ending version updates of Hadoop distributions.  The following list shows the currently known support status of each distribution.  A shim generally does not need to be updated for a minor or patch version change.

{composition-setup}{composition-setup}
{deck:id=MyDeck|class=tan}

{card:label=Apache}
||Hadoop Version||Shim||Pentaho Suite Ver||Notes||
|0.20.x|hadoop-20|4.8+| |
|1.0.x|NS*| | No support planned. [See this blog post|http://funpdi.blogspot.com/2013/03/pentaho-data-integration-44-and-hadoop.html] |
|1.1.x|NS*| | Unlikely to be done, in favor of 1.2.x [PDI-9964|http://jira.pentaho.com/browse/PDI-9964] |
|1.2.x|NS*| | Possibly in a patch after 5.0, but not committed [PDI-10393|http://jira.pentaho.com/browse/PDI-10393] |
|2.x.x|NS*| | Distro is Alpha |
_Go to [Apache releases|http://hadoop.apache.org/releases.html]_
{card}

{card:label=Cloudera}
||Hadoop Version||Shim||Pentaho Suite Ver||Notes||
|CDH3u3, u4 and u5|cdh3U4|4.8+ | Support will be dropped in 5.0 |
|CDH4.0, 4.0.1|cdh4|4.8+ |  The cdh42 shim also supports this configuration |
|CDH4.1, 4.1.1|cdh4|4.8+ |  The cdh42 shim also supports this configuration |
|CDH4.1.2, 4.1.3|cdh412, cdh413|4.8 + BD Plugin 1.3.2+| The cdh42 shim also supports this configuration |
|CDH4.2|cdh42|4.8 + BD Plugin 1.3.2+| Backward compatible with all earlier CDH4.x distros |
|CDH4.2.1|cdh42|4.8 + BD Plugin 1.3.3.1+| |
|CDH4.3|cdh42| 4.8 + BD Plugin 1.3.3.1+ |  |
_Go to [Cloudera releases|https://ccp.cloudera.com/display/SUPPORT/CDH+Downloads]_

{color:navy}*NOTE: the cdh42 shim supports all versions of CDH from 4.0 through 4.3*{color}
{card}

{card:label=DataStax}
||Hadoop Version||Shim||Pentaho Suite Ver||Notes||
|DSE 3.0.x| NS* | | Possibly in patch post 5.0 but not committed [PDI-8036|http://jira.pentaho.com/browse/PDI-8036]|
|DSE 2.2.x| NS* | | No current plans to support |
_Go to [DataStax releases|http://www.datastax.com/docs/datastax_enterprise3.0/dse_release_notes]_
{card}

{card:label=Hortonworks}
||Hadoop Version||Shim||Pentaho Suite Ver||Notes||
|HDP 1.2.x| hdp12 |4.8 + BD Plugin 1.3.2+| |
|HDP 1.3.x| hdp13 |4.8 + BD Plugin 1.3.2+| |
|HDP 2.x| NS* | | In patch post 5.0 - [PDI-8962|http://jira.pentaho.com/browse/PDI-8962] |
|HDP 1.1 for Win| NS* | | In patch post 5.0 - [PDI-10266|http://jira.pentaho.com/browse/PDI-10266] |
_Go to [Hortonworks releases|http://hortonworks.com/download/]_
{card}

{card:label=Intel}
||Hadoop Version||Shim||Pentaho Suite Ver||Notes||
|IDH 2.3|NS*|5.0| Planned for 4.8.1.2 (June) [PDI-9647|http://jira.pentaho.com/browse/PDI-9647]|
_Go to [Intel releases|http://hadoop.intel.com/]_
{card}

{card:label=MapR}
||Hadoop Version||Shim||Pentaho Suite Ver||Notes||
|1.1.3, 1.2.0|mapr|4.8+| |
|2.0.x|NS*| | No support planned [PDI-9648|http://jira.pentaho.com/browse/PDI-9648]|
|2.1.x|mapr21|4.8+| |
|3.0.x|NS*|5.0 | Planned for immediately post 5.0 [PDI-10037|http://jira.pentaho.com/browse/PDI-10037]|
_Go to [MapR releases|http://www.mapr.com/doc/display/MapR/MapR+Release+Notes]_
{card}

{deck}

_*\* NS - Not supported.*  See [Hadoop Configurations] for information on how to create or modify a shim to support your configuration._

_*\+ Pentaho Suite Ver* is the earliest version of the Pentaho suite that supports this shim.  Subsequent Pentaho versions will also support this shim unless otherwise noted._
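
The tables above amount to a lookup from distribution and version to shim name. The sketch below transcribes a subset of the rows into Python to show how such a lookup could work. The names {{SHIMS}} and {{find_shim}}, the short distribution keys and the prefix-matching rule are all assumptions made for illustration. For CDH 4.x it follows the note that the cdh42 shim covers 4.0 through 4.3 rather than listing the older per-version shims.

```python
from typing import Optional

# (distribution, version prefix, shim) rows transcribed from the tables above.
# A shim of None marks a version listed as NS (not supported).
SHIMS = [
    ("apache", "0.20", "hadoop-20"),
    ("apache", "1.0", None),
    ("apache", "1.1", None),
    ("apache", "1.2", None),
    ("apache", "2", None),
    ("cdh", "3u3", "cdh3U4"),
    ("cdh", "3u4", "cdh3U4"),
    ("cdh", "3u5", "cdh3U4"),
    ("cdh", "4.0", "cdh42"),
    ("cdh", "4.1", "cdh42"),
    ("cdh", "4.2", "cdh42"),
    ("cdh", "4.3", "cdh42"),
    ("hdp", "1.2", "hdp12"),
    ("hdp", "1.3", "hdp13"),
    ("hdp", "2", None),
    ("mapr", "1.1.3", "mapr"),
    ("mapr", "1.2.0", "mapr"),
    ("mapr", "2.0", None),
    ("mapr", "2.1", "mapr21"),
    ("mapr", "3.0", None),
]


def find_shim(distribution: str, version: str) -> Optional[str]:
    """Return the shim name for a listed version, or None if it is listed as NS.

    Raises LookupError when the version does not appear in the table at all,
    so that "not supported" and "unknown" stay distinguishable.
    """
    dist_key = distribution.lower()
    for dist, prefix, shim in SHIMS:
        # A row matches its exact version or any more specific release of it.
        if dist == dist_key and (version == prefix or version.startswith(prefix + ".")):
            return shim
    raise LookupError(f"{distribution} {version} is not listed in the table")


if __name__ == "__main__":
    for dist, ver in [("cdh", "4.1.2"), ("hdp", "1.3.0"), ("mapr", "2.0.1")]:
        print(dist, ver, "->", find_shim(dist, ver))
```

Returning {{None}} for NS rows and raising for unlisted versions keeps the two cases apart: an NS version may still be workable with a custom shim, while an unlisted one has no documented status.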
 
{tip}
The Pentaho support policy for Hadoop is available on the [Pentaho Support Plan for Hadoop Distributions] page.
{tip}

h2. Open JIRA cases for Distro Support
{jiraissues:anonymous=true|columns=key;fixVersion;summary;status;assignee;updated|url=http://jira.pentaho.com/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?jqlQuery=labels+%3D+BD_Distro+AND+status+in+%28Open%2C+%22In+Progress%22%2C+Reopened%2C+%22Ready+For+Test%22%2C+%22Ready+for+Publishing%22%29&tempMax=1000}

h2. Release resources

[!http://2.bp.blogspot.com/-GO6HF0OAFHw/UOfNEH-4sEI/AAAAAAAAAD0/dEWFFYTRgYw/s1600/output-file.png|width=100,height=75!|http://2.bp.blogspot.com/-GO6HF0OAFHw/UOfNEH-4sEI/AAAAAAAAAD0/dEWFFYTRgYw/s1600/output-file.png] [!http://hortonworks.com/wp-content/uploads/2013/05/hdp13.png|width=100,height=75!|http://hortonworks.com/wp-content/uploads/2013/05/hdp13.png]
* [Genealogy of Elephants II|http://drcos.boudnik.org/2013/01/what-you-wanted-to-know-about-hadoop.html]
* [A brief history of Apache Hadoop branches and releases|http://blog.cloudera.com/blog/2012/01/an-update-on-apache-hadoop-1-0/]