  • Kettle - The name of the open source project and also of its ETL engine. When the name Kettle is used, it usually refers to the engine that executes jobs and transforms. Many long-time Kettle users also use Kettle to mean Spoon, the Kettle graphical designer UI, which adds to the confusion. For more information, see the Kettle community home.
  • Pentaho Data Integration (PDI) - When Pentaho created the commercial Enterprise Edition of Kettle, it chose PDI as the brand name to distinguish the commercial version from the open source project. In practice the two names are used interchangeably, so the branding may have added to the confusion rather than reduced it.
  • Spoon - The Kettle desktop visual design tool used to create and edit ETL transformations and jobs. Spoon also has perspectives for running and debugging, visualizing and generating data models that can be used by the rest of the Pentaho Suite.
  • Pentaho Hadoop Distribution (PHD) - The Kettle engine packaged for distribution to a Hadoop cluster. The PHD allows Kettle transforms to run as a map task, reduce task, or combiner and so take advantage of the power of the Hadoop cluster. This distribution will eventually become unnecessary as Kettle is modified to use the Hadoop distributed cache to locate the resources it needs to execute within the cluster.
  • Pan - A program that executes transformations from the command line, usually via a scheduler.
  • Kitchen - A program that executes jobs from the command line, usually via a scheduler.
  • Carte - A simple web server that allows you to execute transformations and jobs remotely. It does so by accepting XML (using a small servlet) that contains the transformation to execute and the execution configuration. It also allows you to remotely monitor, start and stop the transformations and jobs that run on the Carte server.
  • Pentaho Report Designer (PRD) - The Kettle engine is embedded in the Pentaho Report Designer, which enables PRD to generate reports from a Kettle transform without having to stage the data. It also gives PRD access to all of the database connectors within Kettle, including the NoSQL databases. For more information, see the Pentaho Reporting community home.
  • Pentaho BI Platform - The Kettle engine is embedded in the BI Platform, which enables reports created with PRD that rely on transforms to be published to the web. For more information, see the Kettle community home.
  • Pentaho Data Integration Server (DI Server) EE - A standalone server for running Kettle jobs and transforms. It has a CMS repository for storing and versioning jobs and transforms, as well as a scheduler and a performance monitor. The DI Server is part of Pentaho Enterprise Edition and is not available in open source.
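
As a rough illustration of how Pan and Kitchen divide the work described above, the sketch below builds the command line a scheduler might run for a given file. Several details here are assumptions that do not come from this page: the launcher script names (pan.sh, kitchen.sh), the file extensions (.ktr for transformations, .kjb for jobs), the -file and -level options, and the example paths. Check them against the local installation before relying on them.

```python
from pathlib import PurePosixPath

# Assumed mapping from file type to launcher script (not taken from this page):
# transformations (.ktr) go to Pan, jobs (.kjb) go to Kitchen.
LAUNCHERS = {".ktr": "pan.sh", ".kjb": "kitchen.sh"}


def build_command(pdi_home, path, log_level="Basic"):
    """Return the argument list for running a transformation or job file.

    pdi_home is a hypothetical installation directory holding the launchers.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in LAUNCHERS:
        raise ValueError(f"expected a .ktr or .kjb file, got: {path}")
    launcher = str(PurePosixPath(pdi_home) / LAUNCHERS[suffix])
    # Assumed option syntax: -file=<path> and -level=<logging level>.
    return [launcher, f"-file={path}", f"-level={log_level}"]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_command("/opt/pdi", "/etl/load_sales.ktr")))
    print(" ".join(build_command("/opt/pdi", "/etl/nightly.kjb", "Debug")))
```

A scheduler entry could pass the returned list straight to a process launcher such as subprocess.run. Returning a list rather than a single string avoids shell quoting problems with paths that contain spaces.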
