{panel}
May 23, 2006: Weekly Kettle Tip: Using Variables, from Matt Casters, Chief of Data Integration, Pentaho
{panel}

So there have been a lot of questions lately regarding ETL management issues like: how do I move files, how do I use parameters, etc.
I'm happy to say that version 2.3.0 has gained a lot of functionality in that regard in the last 2 weeks.
In fact, I've been working very hard not only to add support for dynamically setting variables, but also to make these variables local to a job so that they do not influence each other. This is very important in the long run, when we will be running several in-line jobs and transformations at the same time on the same virtual machine (J2EE, for example).
Please file bug reports for anything that doesn't work in your situation. We will fix those issues before 2.3.0 is released. Thank you in advance. Filing bug reports really helps Kettle!

What is a variable?

A variable used to be a synonym for "environment variable". You can use variables in a lot of places in a transformation. Most of the time, the fields that support them have a "Variable" button next to them.
Environment variables can still be used, but they are no longer the only option you have. You can also set variables only for a certain parent job, grand-parent job or root job.
Variables are accessed using this format:

${VariableName} : Unix style
%%VariableName%% : Windows style

Extra tip: you can set environment variables by defining a properties file in $HOME/.kettle/kettle.properties, using the format VARIABLE=a certain value
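
To make the two notations concrete, here is a minimal Python sketch of how such placeholders could be resolved against variables read from a properties-style text. It is an illustration only, not Kettle's actual implementation: the function names `parse_properties` and `substitute`, the `ARCHIVE` entry, and the choice to leave unknown variables untouched are all assumptions made for this example.

```python
import re

# Matches ${Name} (Unix style) or %%Name%% (Windows style).
_PATTERN = re.compile(r"\$\{([^}]+)\}|%%([^%]+)%%")


def parse_properties(text):
    """Parse simple NAME=value lines; skip blank lines and # comments."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        variables[name.strip()] = value.strip()
    return variables


def substitute(template, variables):
    """Replace both placeholder styles; unknown names are left as-is (assumption)."""
    def replace(match):
        name = match.group(1) or match.group(2)
        return variables.get(name, match.group(0))
    return _PATTERN.sub(replace, template)


# Hypothetical properties content in the VARIABLE=value format.
variables = parse_properties("TODAY=20060523\nARCHIVE=a certain value")
print(substitute("customer_01_${TODAY}.txt", variables))
print(substitute("customer_01_%%TODAY%%.txt", variables))
```

Both calls print the same file name, since the two notations refer to the same variable.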

Today's example of variable use

This is a sample job that does the following:
Define a new variable TODAY (20060523)
Process all text files in a directory that have ${TODAY} in the filename (customer_01_20060523.txt)
When all is processed correctly, move the files to the archive directory
Image Added
This is the use-case of the example:

Image Added

Setting the variables

To allow you to set variables dynamically we constructed a new step "Set Variables". In the transformation shown below you can see how it's done.
Please note that the new "Set Environment Variables" step accepts exactly ONE row of data, no more. Accepting more rows would not make sense, because each variable can hold only one value.
As you can see in the image, you can set the scope of the variable.

Image Added
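
The idea of scope can be pictured with a small Python model. This is only a sketch of the concept, assuming each job or transformation has its own set of variables with a link to its parent; the class `VariableSpace` and the `levels_up` parameter are hypothetical names, and nothing here describes how Kettle stores variables internally.

```python
class VariableSpace:
    """A set of variables that falls back to its parent when a name is not found."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.values = {}

    def get(self, key, default=None):
        # Look in this space first, then walk up through the ancestors.
        space = self
        while space is not None:
            if key in space.values:
                return space.values[key]
            space = space.parent
        return default

    def set(self, key, value, levels_up=0):
        """Set a variable on this space and on up to `levels_up` ancestors."""
        space = self
        for _ in range(levels_up + 1):
            if space is None:
                break
            space.values[key] = value
            space = space.parent


root = VariableSpace("root job")
job_a = VariableSpace("job A", parent=root)
job_b = VariableSpace("job B", parent=root)
trans = VariableSpace("transformation", parent=job_a)

# Valid in the parent job only: job B and the root job never see it.
trans.set("TODAY", "20060523", levels_up=1)
print(job_a.get("TODAY"))
print(job_b.get("TODAY"))
```

Because job B has no value of its own and the root job was not touched, the variable set from the transformation stays local to job A, which is the isolation the scope setting is meant to give.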

Using the variables

Variables can be used in many steps; usually there is a "Variable..." button next to the field.
The following 2 screenshots show the transformation and step that use the variable we defined earlier:

Image Added

As you can see in this transformation we collect the processed file names and send them to the next job entry.
Also, if you run or preview this transformation in Spoon, a dialog will pop up to allow you to enter a value for TODAY.
This way you can test the transformation without running the complete job.
Image Added
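
Outside of Kettle, the same selection logic can be sketched in a few lines of Python: build the TODAY value and keep only the text files whose name contains it. The helper `files_for_day` is a hypothetical name, and the extra sample files are made up for the demonstration; only customer_01_20060523.txt comes from the example above.

```python
import tempfile
from datetime import date
from pathlib import Path


def files_for_day(directory, today):
    """Return the .txt files in `directory` whose name contains `today`, sorted."""
    return sorted(p for p in Path(directory).glob("*.txt") if today in p.name)


# Format a date the same way as the TODAY variable in this tip (yyyyMMdd).
today = date(2006, 5, 23).strftime("%Y%m%d")

with tempfile.TemporaryDirectory() as tmp:
    # Create a few sample files; the last one belongs to another day.
    for name in ("customer_01_20060523.txt", "customer_02_20060523.txt",
                 "customer_01_20060522.txt"):
        (Path(tmp) / name).write_text("sample\n")
    selected = [p.name for p in files_for_day(tmp, today)]

print(selected)
```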

Moving files

When the previous transformation ran as expected (without errors), we can move the processed files to an archive directory.
For this, we use a very simple batch file with one line in it: move %1 C:\testfiles\archive\
Also note that there is an "Execute once for every input row" option enabled here that will execute this shell script once for every result row generated by the previous transformation.


Image Added
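
For readers who want to see the "once for every input row" behaviour spelled out, the following Python sketch does the equivalent of calling the batch file once per file name. It is a stand-in for the shell job entry, not a description of it: `archive_files` is a hypothetical helper, and a temporary directory is used here instead of C:\testfiles\archive\.

```python
import shutil
import tempfile
from pathlib import Path


def archive_files(result_rows, archive_dir):
    """Move each file named in `result_rows` into `archive_dir`, one move per row."""
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for row in result_rows:
        source = Path(row)
        target = archive / source.name
        # Equivalent of one "move %1 <archive>" call for this row.
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "customer_01_20060523.txt"
    source.write_text("sample\n")
    moved = archive_files([str(source)], Path(tmp) / "archive")
    moved_names = [p.name for p in moved]
    source_still_exists = source.exists()
    target_exists = moved[0].exists()

print(moved_names)
```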

Note: Contrary to what was the case in earlier versions, any output generated by the shell script will be "eaten" and put in the log as Basic logging. I'm sure this will make many people's lives easier.


I hope that this Kettle tip will once again clear up some questions and lead Kettle on a path to greater usability.
In three simple job entries, we have implemented functionality that would otherwise take shell scripting to solve.

I hope you found this Weekly Kettle Tip interesting. Join us next time for more Kettle fun.
Matt