
...

  • Log separation between objects (transformations, jobs, ...) on the same server
  • Central logging store infrastructure with central memory management
  • Logging data lineage so we know where each log row comes from
  • Incremental log updates from the central log store
  • Error line identification to allow for color coding
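
The architecture described above can be sketched as a store keyed by log channel id. This is an illustrative toy, not the Kettle API: the class and method names are ours, but the ideas (per-channel separation and lineage, incremental reads, per-channel discard) mirror the bullet points above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of a central log store; not the Kettle API.
class ToyCentralLogStore {
    private static final Map<String, List<String>> buffers = new LinkedHashMap<>();

    // Every log row is stored under the channel id of the object that wrote it,
    // so we always know where a row came from (data lineage).
    static synchronized void appendLine(String logChannelId, String line) {
        buffers.computeIfAbsent(logChannelId, k -> new ArrayList<>()).add(line);
    }

    // Incremental read: callers remember how many lines they already fetched
    // and only ask for what was added since then.
    static synchronized List<String> getLines(String logChannelId, int fromLineNr) {
        List<String> all = buffers.getOrDefault(logChannelId, new ArrayList<>());
        return new ArrayList<>(all.subList(Math.min(fromLineNr, all.size()), all.size()));
    }

    // Discard the lines of one object without touching the others: this is the
    // log separation between transformations and jobs on the same server.
    static synchronized void discardLines(String logChannelId) {
        buffers.remove(logChannelId);
    }
}
```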

...

Code Block
CentralLogStore.discardLines(trans.getLogChannelId(), false);

Please note that CentralLogStore was renamed to KettleLogStore in v5.

Logging levels

Since PDI version 4, it is no longer possible to change the logging level while a transformation or job is running. That is because every object that is executed keeps its own log level for the duration of the execution. This was implemented to allow different transformations and jobs to run with different logging levels on the same Carte, DI or BI server.
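
The mechanism can be sketched in a few lines. This is an illustrative sketch, not the Kettle API: each executing object copies the log level once when it starts, which is why the level of a running execution can no longer be changed from the outside, and why different levels can coexist on one server.

```java
// Sketch only; the enum values mirror some real Kettle log levels.
enum LogLevelSketch { BASIC, DEBUG, ROWLEVEL }

class ExecutionSketch {
    private final LogLevelSketch logLevel;  // frozen at the start of execution

    ExecutionSketch(LogLevelSketch requestedLevel) {
        this.logLevel = requestedLevel;     // snapshot taken here
    }

    LogLevelSketch getLogLevel() {
        return logLevel;                    // unaffected by later changes elsewhere
    }
}
```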

...

As you can imagine, keeping log lines in memory indefinitely will cause memory usage to grow over time, especially if a data integration user wants to log at an incredibly high level of detail.
If you don't want to discard lines explicitly (for example on a DI server, where you don't know when the user will be querying the log text), you can set a log line time-out and a maximum size for the central log buffer. These options are available with Pan/Kitchen/Carte-XML/DI Server, but also as a number of environment variables...

KETTLE_MAX_LOG_SIZE_IN_LINES
  Description: The maximum number of log lines that are kept internally by Kettle. Set to 0 to keep all rows.
  Default: 0 (<4.2), 5000 (>=4.2)

KETTLE_MAX_LOG_TIMEOUT_IN_MINUTES
  Description: The maximum age (in minutes) of a log line while being kept internally by Kettle. Set to 0 to keep all rows indefinitely.
  Default: 0 (<4.2), 1440 (>=4.2)
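
The effect of these two bounds can be sketched as a buffer that is trimmed by size and by age. This is an illustrative sketch, not Kettle's implementation: the class and method names are ours, and the time-out is expressed in milliseconds here purely for testability.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of a log buffer bounded the way KETTLE_MAX_LOG_SIZE_IN_LINES and
// KETTLE_MAX_LOG_TIMEOUT_IN_MINUTES bound the central log store; not Kettle code.
class BoundedLogBuffer {
    private static final class Entry {
        final String line; final long timeMs;
        Entry(String line, long timeMs) { this.line = line; this.timeMs = timeMs; }
    }

    private final int maxLines;   // 0 means: keep all lines
    private final long maxAgeMs;  // 0 means: keep lines indefinitely
    private final Deque<Entry> entries = new ArrayDeque<>();

    BoundedLogBuffer(int maxLines, long maxAgeMs) {
        this.maxLines = maxLines;
        this.maxAgeMs = maxAgeMs;
    }

    void add(String line, long nowMs) {
        entries.addLast(new Entry(line, nowMs));
        // Size bound: drop the oldest lines first.
        while (maxLines > 0 && entries.size() > maxLines) entries.removeFirst();
        prune(nowMs);
    }

    // Age bound: drop lines older than the time-out.
    void prune(long nowMs) {
        while (maxAgeMs > 0 && !entries.isEmpty()
                && nowMs - entries.peekFirst().timeMs > maxAgeMs) {
            entries.removeFirst();
        }
    }

    List<String> lines() {
        List<String> out = new ArrayList<>();
        for (Entry e : entries) out.add(e.line);
        return out;
    }
}
```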

The following options were introduced in 4.2.0-M1 to further keep memory usage under control while using repeat loops in jobs and so on:

KETTLE_MAX_JOB_ENTRIES_LOGGED
  Description: The maximum number of job entry results kept in memory for logging purposes.
  Default: 1000

KETTLE_MAX_JOB_TRACKER_SIZE
  Description: The maximum number of job trackers kept in memory.
  Default: 1000

KETTLE_MAX_LOGGING_REGISTRY_SIZE
  Description: The maximum number of logging registry entries kept in memory for logging purposes.
  Default: 1000

4.2.0 also sets sane defaults on all these values to make sure that by default you don't run out of memory.

Cleaning up after execution

In Spoon, Carte and the DI Server, cleanup of the logging records and logging registry entries is done automatically.  However, if you are executing transformations or jobs yourself using the Kettle API (see elsewhere in the SDK), you might want to explicitly remove the logging records when you are done with them:

Code Block

// Remove the logging records of the transformation or job
String logChannelId = trans.getLogChannelId(); // or job.getLogChannelId()
CentralLogStore.discardLines(logChannelId, true);

// Also remove the records of related objects like TransMeta or JobMeta...
CentralLogStore.discardLines(transMeta.getLogChannelId(), true);

// Remove the entries from the logging registry
LoggingRegistry.getInstance().removeIncludingChildren(logChannelId);