Variables
Variables can be used throughout Pentaho Data Integration, including in transformation steps and job entries. You define variables by setting them with the Set Variable step in a transformation or by setting them in the kettle.properties file in the directory:
- $HOME/.kettle (Unix/Linux/OSX)
- C:\Documents and Settings\<username>\.kettle\ (Windows XP)
- C:\Users\<username>\.kettle\ (Windows Vista, 7 and later)
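As an illustration of what such a file contains, kettle.properties holds plain `key = value` pairs. The sketch below is written in Python, not in PDI itself, and parses that kind of text into a dictionary. The variable names `DB_HOST` and `STAGING_DIR` are invented for the example, and the parser covers only the simple `key=value` subset of the Java properties format.

```python
def parse_properties(text):
    """Parse simple `key = value` lines; skip blank lines and `#`/`!` comments."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' separator
            variables[key.strip()] = value.strip()
    return variables


# Hypothetical contents of a kettle.properties file.
SAMPLE = """
# connection settings
DB_HOST = db.example.com
STAGING_DIR=/data/staging
"""

print(parse_properties(SAMPLE))
```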
You use variables either by reading them with the Get Variable step or by specifying metadata strings such as:
- ${VARIABLE}
or:
- %%VARIABLE%%
Both formats can be used and even mixed. The first is derived from UNIX, the second from Microsoft Windows. Dialogs that support variables throughout Pentaho Data Integration are marked with a red dollar sign. Press the <CTRL>+Space hotkey to select a variable to insert into the property value. Hover over the variable icon to display the shortcut help.
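The two notations can be pictured with a small Python sketch. It is an illustration only, not PDI's own code: the function name `substitute` and the sample variables are made up, and leaving unknown names untouched is an assumption of the sketch.

```python
import re

UNIX_STYLE = re.compile(r"\$\{([^{}]+)\}")   # matches ${NAME}
WINDOWS_STYLE = re.compile(r"%%([^%]+)%%")   # matches %%NAME%%


def substitute(text, variables):
    """Replace ${NAME} and %%NAME%% with their values; leave unknown names as-is."""
    def lookup(match):
        return variables.get(match.group(1), match.group(0))

    text = WINDOWS_STYLE.sub(lookup, text)
    return UNIX_STYLE.sub(lookup, text)


variables = {"INPUT_DIR": "/data/in", "FILE": "sales.csv"}  # hypothetical values
print(substitute("${INPUT_DIR}/%%FILE%%", variables))  # /data/in/sales.csv
```

Both styles are mixed in a single string here, which is the case the paragraph above describes.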
Other ways to set and access variables:
- There are also System parameters, including command line arguments. These can be accessed using the Get System Info step in a transformation.
- You can also specify values for variables in the "Execute a transformation/job" dialog in Spoon or the Scheduling perspective. If you include the variable names in your transformation they will show up in these dialogs.
- It is also possible to set variables through Named Parameters.
Special Characters
Wherever variables can be used, special characters can be used as well (e.g. the ASCII character with hex code 01). Specify them with the format $[hex value], e.g. $[01], or $[31,32,33], which is equivalent to 123. The hex values can be looked up in an ASCII conversion table.
Special characters also make it possible to escape the variable syntax. For example, to keep a literal ${foobar} in your data stream, write $[24]{foobar}. $[24] is then replaced by '$', which results in ${foobar} without the variable being resolved. See also feature request PDI-6188.
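The hex notation can be sketched in Python as follows. This is a simplified model rather than PDI's implementation, and it assumes two-digit hex codes separated by commas, as in the examples above. The function name `substitute_hex` is made up for the illustration.

```python
import re

# Matches $[hh] or $[hh,hh,...] where each hh is a two-digit hex code.
HEX_ESCAPE = re.compile(r"\$\[([0-9A-Fa-f]{2}(?:,[0-9A-Fa-f]{2})*)\]")


def substitute_hex(text):
    """Replace each $[...] group with the characters its hex codes stand for."""
    def decode(match):
        return "".join(chr(int(code, 16)) for code in match.group(1).split(","))

    return HEX_ESCAPE.sub(decode, text)


print(substitute_hex("$[31,32,33]"))    # 123
print(substitute_hex("$[24]{foobar}"))  # ${foobar}
```

The escape works in this model only if hex codes are decoded after variables have been resolved, so that the resulting ${foobar} is never looked up. That ordering is an assumption drawn from the behaviour described above.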
Recursive usage of variables is possible by alternating between the Unix and Windows style syntax. For example, to resolve a variable that itself depends on another variable, you could use: ${%%inner_var%%}
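A minimal Python sketch of why alternating the two styles works: it assumes the Windows-style pass runs before the Unix-style pass, which is what the ${%%inner_var%%} example implies. The function name `resolve` and the variable `TARGET_DIR` are hypothetical.

```python
import re


def resolve(text, variables):
    """Resolve %%NAME%% first, then ${NAME}, so the inner value names the outer variable."""
    def lookup(match):
        return variables.get(match.group(1), match.group(0))

    text = re.sub(r"%%([^%]+)%%", lookup, text)     # Windows-style pass
    return re.sub(r"\$\{([^{}]+)\}", lookup, text)  # Unix-style pass


# Hypothetical variables: inner_var holds the *name* of another variable.
variables = {"inner_var": "TARGET_DIR", "TARGET_DIR": "/data/out"}
print(resolve("${%%inner_var%%}", variables))  # /data/out
```

After the first pass the string reads ${TARGET_DIR}, which the second pass then resolves.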
Variable scope
The scope of a variable is defined by the place in which it is defined.
Environment variables
The first usage (and only usage in previous Kettle versions) was to set an environment variable. Traditionally, this was accomplished by passing options to the Java Virtual Machine (JVM) with the -D option. It is also an easy way to specify the location of temporary files in a platform-independent way, for example using the variable ${java.io.tmpdir}. This variable points to the directory /tmp on Unix/Linux/OSX and to C:\Documents and Settings\<username>\Local Settings\Temp on Windows machines. The only problem with environment variables is that their usage is not dynamic, and problems arise if you try to use them in a dynamic way. For example, if you run two or more transformations or jobs at the same time on an application server (for example the Pentaho platform), you get conflicts, because changes to the environment variables are visible to all software running on the virtual machine.
Kettle variables
Because the scope of an environment variable is too broad, Kettle variables were introduced to provide a way to define variables that are local to the job in which the variable is set. The "Set Variable" step in a transformation lets you specify the job to which the variable is scoped (i.e. the parent job, the grand-parent job or the root job).
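The idea of job-local scope can be modelled with a short Python sketch. This is a deliberately simplified model, not how PDI implements variables. The class `VariableSpace`, the `levels_up` parameter and the variable `BATCH_ID` are all invented for the illustration.

```python
class VariableSpace:
    """A set of variables with an optional parent, as in a job hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.values = {}

    def get(self, key):
        # Look locally first, then walk up through the parent jobs.
        space = self
        while space is not None:
            if key in space.values:
                return space.values[key]
            space = space.parent
        return None

    def set(self, key, value, levels_up=0):
        # levels_up=1 targets the parent job, 2 the grand-parent job;
        # the walk stops early if it reaches the root job.
        space = self
        for _ in range(levels_up):
            if space.parent is None:
                break
            space = space.parent
        space.values[key] = value


root = VariableSpace("root job")
job_a = VariableSpace("job A", parent=root)
job_b = VariableSpace("job B", parent=root)
trans = VariableSpace("transformation", parent=job_a)

trans.set("BATCH_ID", "42", levels_up=1)  # scope: parent job (job A)
print(trans.get("BATCH_ID"), job_a.get("BATCH_ID"), job_b.get("BATCH_ID"))  # 42 42 None
```

Job B never sees the value set under job A, which is the isolation that a variable shared across the whole virtual machine cannot provide.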
Internal variables
The following variables are always defined:
Variable Name | Sample value |
---|---|
Internal.Kettle.Build.Date | 2007/05/22 18:01:39 |
Internal.Kettle.Build.Version | 2045 |
Internal.Kettle.Version | 2.5.0 |
These variables are defined in a transformation:
Variable Name | Sample value |
---|---|
Internal.Transformation.Filename.Directory | D:\Kettle\samples |
Internal.Transformation.Filename.Name | Denormaliser - 2 series of key-value pairs.ktr |
Internal.Transformation.Name | Denormaliser - 2 series of key-value pairs sample |
Internal.Transformation.Repository.Directory | / |
These are the internal variables that are defined in a Job:
Variable Name | Sample value |
---|---|
Internal.Job.Filename.Directory | /home/matt/jobs |
Internal.Job.Filename.Name | Nested jobs.kjb |
Internal.Job.Name | Nested job test case |
Internal.Job.Repository.Directory | / |
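To show how the Filename.Directory and Filename.Name values relate to a file's location, the Python sketch below splits a full path into the two parts. It reproduces the sample values from the transformation and job tables above. The full paths are reconstructed from those samples, the helper `filename_variables` is hypothetical, and the values PDI actually produces may be formatted differently.

```python
from pathlib import PurePosixPath, PureWindowsPath


def filename_variables(kind, path):
    """Split a file path into the Directory and Name values for `kind`."""
    return {
        f"Internal.{kind}.Filename.Directory": str(path.parent),
        f"Internal.{kind}.Filename.Name": path.name,
    }


job = filename_variables("Job", PurePosixPath("/home/matt/jobs/Nested jobs.kjb"))
trans = filename_variables(
    "Transformation",
    PureWindowsPath(r"D:\Kettle\samples\Denormaliser - 2 series of key-value pairs.ktr"),
)
print(job)
print(trans)
```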
These variables are defined in a transformation running on a slave server, executed in clustered mode:
Variable Name | Sample value |
---|---|
Internal.Slave.Transformation.Number | 0..<cluster size-1> (0,1,2,3 or 4) |
Internal.Cluster.Size | <cluster size> (5) |
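One way these two values could be combined is to give each slave its own share of the input. The Python sketch below shows a round-robin split for a cluster of 5. It is a hypothetical usage pattern, not something PDI does for you, and the function `rows_for_slave` and the sample rows are invented.

```python
def rows_for_slave(rows, slave_number, cluster_size):
    """Keep the rows whose position maps to this slave (round-robin split)."""
    return [row for index, row in enumerate(rows) if index % cluster_size == slave_number]


cluster_size = 5        # stands in for Internal.Cluster.Size
rows = list(range(12))  # hypothetical input rows

# slave_number stands in for Internal.Slave.Transformation.Number (0..4).
for slave_number in range(cluster_size):
    print(slave_number, rows_for_slave(rows, slave_number, cluster_size))
```

Each row lands on exactly one slave because the slave numbers run from 0 to cluster size minus 1.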