What's new in PDI version 3.2
Introduction
Compared with the previous release, the changes in this one are evolutionary rather than revolutionary. Even so, there are a large number of changes for a minor version increase. The main focus of the release is once again stability and usability. We went through a long list of pet peeves and common misunderstandings and simply solved them, either by adding new features or by modifying existing ones. On top of that, we worked on clustering to make it "cloud-ready" with dynamic clustering, making it more solid and adding features along the way.
Once again, many many thanks go to our large community of Kettle enthusiasts for all the help they provided to make this release another success.
General changes
Visual changes
Hop color scheme with mini-icons, tooltips (note: tooltips not available on OSX currently)
After running with errors : show error icons (note: error details tooltip not available on OSX currently)
Visual feedback : reading from info steps
Visual feedback : writing to target steps that run in multiple copies
New step categories
Step filter in step tree tool bar
Long standing bugs attack
Long standing wish list attack
Resource Exporter to export transformations or complete jobs including their used resources (sub-transformations and sub-jobs) to a single ZIP file.
Jobs and transformations can now define parameters with default values that are available at runtime as variables. This makes it easy to configure a job or transformation dynamically from the command line (for example, specifying a date range to process, with yesterday as the default).
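The way declared defaults and supplied values combine can be sketched in a few lines of Java. This is only an illustration of the idea, not the Kettle API: the class `ParameterSketch`, its methods, and the parameter names `START_DATE` and `END_DATE` are all hypothetical. Rejecting undeclared parameters is a choice made for this sketch, not documented behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of parameters with defaults; not the Kettle API. */
class ParameterSketch {
    // Declared parameter names mapped to their default values.
    private final Map<String, String> defaults = new LinkedHashMap<>();

    /** Declare a parameter together with its default value. */
    void declare(String name, String defaultValue) {
        defaults.put(name, defaultValue);
    }

    /** Overlay supplied values on the defaults; the result plays the role of runtime variables. */
    Map<String, String> resolve(Map<String, String> supplied) {
        Map<String, String> variables = new LinkedHashMap<>(defaults);
        for (Map.Entry<String, String> entry : supplied.entrySet()) {
            // Assumption of this sketch: only declared parameters may be set.
            if (!defaults.containsKey(entry.getKey())) {
                throw new IllegalArgumentException("Undeclared parameter: " + entry.getKey());
            }
            variables.put(entry.getKey(), entry.getValue());
        }
        return variables;
    }

    public static void main(String[] args) {
        ParameterSketch sketch = new ParameterSketch();
        sketch.declare("START_DATE", "2009-01-01");
        sketch.declare("END_DATE", "2009-01-31");

        // Only END_DATE is overridden; START_DATE keeps its default.
        Map<String, String> supplied = new LinkedHashMap<>();
        supplied.put("END_DATE", "2009-02-28");
        System.out.println(sketch.resolve(supplied));
    }
}
```

A parameter that is not supplied simply keeps its default, which is what makes a run with no arguments at all meaningful.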
Instead of having to configure all of the slaves that a transformation will be executed on in clustered mode, you can run Carte slaves in dynamic mode, configuring them to register with a master (or multiple masters) when they start up. The clustered transformation is configured with a list of the masters it can run on. When the transformation is executed, it goes down the list of masters, attempting to submit the transformation to each one until it is accepted. That master then executes the transformation using all of the currently available slaves that are registered with it.
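The "go down the list until one accepts" behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Carte's actual code: `MasterFailoverSketch`, `submitToFirstAccepting`, the `trySubmit` callback and the host names are hypothetical, and a real submission would be a network call rather than a predicate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical illustration of trying masters in list order; not Carte code. */
class MasterFailoverSketch {

    /**
     * Tries each master in order and returns the first one that accepts.
     * trySubmit stands in for the real submission and returns true on acceptance.
     */
    static Optional<String> submitToFirstAccepting(List<String> masters,
                                                   Predicate<String> trySubmit) {
        for (String master : masters) {
            if (trySubmit.test(master)) {
                return Optional.of(master); // stop at the first acceptance
            }
        }
        return Optional.empty(); // no master accepted the transformation
    }

    public static void main(String[] args) {
        List<String> masters = Arrays.asList("master-a", "master-b", "master-c");
        // Pretend master-a is unreachable and the others are up.
        Optional<String> chosen =
                submitToFirstAccepting(masters, m -> !m.equals("master-a"));
        System.out.println(chosen.orElse("no master available"));
    }
}
```

Because the loop stops at the first acceptance, the order of the master list acts as a preference order.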
Step changes
New steps
Analytic Query : get information from previous/first rows
User Defined Java Expression : evaluate Java expressions, in-line compiled for maximum performance
Formula step : promoted from a plug-in to a native step
Synchronize after merge : performs updates, inserts or deletes in a database depending on a flag in the incoming data
SalesForce Input : reads information from SalesForce (promoted from a plug-in to a native step)
Replace in string : replace values in strings
Strings cut : cut strings down to size
If field value is null : ... then set default values per type or per field
Mail : send e-mails all over the globe
Process files : copy, move or delete files
Identify last row in a stream : sets a flag when the last row in a stream was reached
Credit card validator : validates a credit card number, extracts information
Mail validator : validates an e-mail address
Reservoir sampling : promoted from a plug-in to a native step
Univariate statistics : promoted from a plug-in to a native step + upgrade
LucidDB Bulk loader : high performance bulk loader for the LucidDB column database
Unique Rows by Hashset : removes duplicates from a stream without having to sort it first. Requires enough memory to store every unique combination of key values.
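To illustrate why the Unique Rows by Hashset step needs no sorted input but does need memory for every distinct key, here is a minimal sketch of hash-based de-duplication. It is not the step's source code: `UniqueRowsSketch`, `uniqueRows` and the sample rows are hypothetical, and rows are modelled as plain `Object[]` arrays.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of de-duplicating unsorted rows with a hash set. */
class UniqueRowsSketch {

    /** Keeps the first row seen for each combination of values at keyIndexes. */
    static List<Object[]> uniqueRows(List<Object[]> rows, int... keyIndexes) {
        // Every distinct key stays in memory for the whole run.
        Set<List<Object>> seen = new HashSet<>();
        List<Object[]> unique = new ArrayList<>();
        for (Object[] row : rows) {
            List<Object> key = new ArrayList<>(keyIndexes.length);
            for (int index : keyIndexes) {
                key.add(row[index]);
            }
            // Set.add returns false when the key was already present.
            if (seen.add(key)) {
                unique.add(row);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"a", 1});
        rows.add(new Object[] {"b", 2});
        rows.add(new Object[] {"a", 3}); // duplicate key "a", dropped
        System.out.println(uniqueRows(rows, 0).size());
    }
}
```

The trade-off against a sort-based approach is memory: the set grows with the number of distinct keys, not with the number of rows.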
Updated steps
Table Output: ability to specify the fields to insert
Calculator : all sorts of new calculations, string manipulations, etc.
Java script values : ability to replace values + improved script testing
Database lookup : pre-load cache option (load all values in memory)
Dimension Lookup/Update:
Cache pre-load (load all dimension entries in memory)
Support for alternative start of date range scenarios
Support for timestamp columns (last update/insert/both)
Support for current version column
Combination Lookup/Update : support for last update timestamp column
Data validator:
New option to report all errors, not only the first
Ability to read data from another step
Group By: added support for cumulative sum and average
Text File Input: New option to pass through additional fields from previous step (removing the need to do a Cartesian join)
Mapping:
Inherit all variables from parent transformation
Allow setting of variables in mapping
Allow preview of mapping output
Improved logging
New "Open mapping" option in transformation graph right click
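As a small illustration of the cumulative sum and average aggregates added to Group By, the sketch below computes a running sum and running average over the values of a single group. It shows one plausible reading of "cumulative" and is not taken from the step's implementation; `CumulativeSketch` and `cumulative` are hypothetical names.

```java
/** Hypothetical illustration of running sum and running average within one group. */
class CumulativeSketch {

    /** Returns one {cumulative sum, cumulative average} pair per input value. */
    static double[][] cumulative(double[] values) {
        double[][] result = new double[values.length][2];
        double sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            result[i][0] = sum;           // sum of values[0..i]
            result[i][1] = sum / (i + 1); // average of values[0..i]
        }
        return result;
    }

    public static void main(String[] args) {
        for (double[] pair : cumulative(new double[] {10, 20, 30})) {
            System.out.println(pair[0] + " " + pair[1]);
        }
    }
}
```

For the values 10, 20, 30 the running sums are 10, 30, 60 and the running averages are 10, 15, 20.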