/
.03 Database Connections

.03 Database Connections

Unknown macro: {scrollbar}

Database Connections

A database connection describes the method by which Kettle connects to a database. You can create connections specific to a Job or Transformation or store them in the Kettle repository for re-use in multiple transformations or jobs.

Note: Please see also Special database issues and experiences

The following topics are covered in this section:

Creating a New Database Connection


This section describes how to create a new database connection and includes a detailed description of each connection property available in the Connection information dialog box.

To create a new connection right click the Database Connections in the tree and select New or New Connection Wizard. You can also double click Database Connections, or press F3.

The Connection information dialog box appears. The topics that follow describe the configuration options available on each tab of the Connection information dialog box.

General

The general tab is where you set up the basic information about your connection such as the connection name, type, access method, server name and log on credentials. The table below provides a detailed description of the options available under the General tab:

Feature

Description

Connection Name

Uniquely identifies a connection across transformations and jobs

Connection Type

Type of database you are connecting to (for example, MySQL, Oracle, and so on)

Method of access

This will be either Native (JDBC), ODBC, or OCI. Available access types depend on the type of database you are connecting to

Server host name

Defines the host name of the server on which the database resides. You can also specify the host by IP-address

Database name

Identifies the database name you want to connect to. In case of ODBC, specify the DSN name here

Port number

Sets the TCP/IP port number

Username

Optionally specifies the user name to connect to the database

Password

Optionally specifies the password to connect to the database

Pooling

The pooling tab allows you to configure your connection to use connection pooling and define options related to connection pooling like the initial pool size, maximum pool size and connection pool parameters. The table below provides a more detailed description of the options available on the Pooling tab:

Feature

Description

Use a connection pool

Enables connection pooling

The initial pool size

Sets the initial size of the connection pool.

The maximum pool size.

Sets the maximum number of connections in the connection pool.

Parameter Table

Allows you to define additional custom pool parameters.

MySQL

By default, MySQL returns complete query results in one block to the client, (Kettle in this case), so "result streaming" is enabled. One drawback associated result streaming is that it allows only one single query to be opened at any given time. You can disable this option in the MySQL tab of the database connection dialog box if necessary.

Another issue you may encounter is that the default timeout in the MySQL JDBC driver is set to 0 (no timeout). In certain instances, this may not allow Kettle to detect a server crash or sudden network failure if it occurs in the middle of a query or open database connection. This in turn leads to the infinite stalling of a transformation or job. To solve problem, set the "connectTimeout" and "socketTimeout" parameters for MySQL in the Options tab. The value to be specified is in milliseconds: for a 2 minute timeout you would specify value 120000 ( 2 x 60 x 1000 ).

You can also review other options on the linked MySQL help page by clicking on the 'Show help text on option usage' button found on the Options tab.

Note: please see also MySQL

Oracle

This tab allows you to specify the default data and index tablespaces which Kettle uses when generating SQL for Oracle tables and indexes.

This version of Pentaho Data Integration ships with the Oracle JDBC driver version 10.2.0. It is the most stable and recent driver we could find; however, if you have issues with Oracle connectivity or other problems, you may consider replacing the 10.2. JDBC driver to match your database server. Replace files "ojdbc14.jar" and "orai18n.jar" in the directory libext/JDBC of your distribution with the files found in the $ORACLE_HOME/jdbc directory on your server.

If you are using OCI and an Oracle Net8 client, the JDBC driver version used in Kettle needs to match your Oracle client version.  PDI 2.5.0 shipped with version 10.1, 3.0.0 ships with version 10.2. You can either install that version of the Oracle client or change the JDBC driver in PDI if versions don't match up.

Note: please see also Oracle

Informix

For Informix, you must specify the Informix Server name in the Informix tab in order for a connection to be usable.

SQL Server

This tab allows you configure the following properties specific to Microsoft SQL Server:

Feature

Description

SQL Server instance name

Sets the instance name property for the SQL Server connection.

Use .. to separate schema and table

Enable when using dot notation to separate schema and table.

Other properties can be configured by adding connection parameters on the options tab of the Connection information dialog box. For example, you can enable single sign-on login by defining the domain option on the Options tab as shown below:

From the jTDS FAQ on http://jtds.sourceforge.net/faq.html:

Specifies the Windows domain to authenticate in. If present and the user name and password are provided, jTDS uses Windows (NTLM) authentication instead of the usual SQL Server authentication (i.e. the user and password provided are the domain user and password). This allows non-Windows clients to log in to servers which are only configured to accept Windows authentication.

If the domain parameter is present but no user name and password are provided, jTDS uses its native Single-Sign-On library and logs in with the logged Windows user's credentials"

Note: please see also MS SQL Server

SAP R/3

This tab allows you configure the following properties specific to SAP R/3:

Feature

Description

Language

Specifies the language to be used when connecting to SAP.

System Number

Specifies the system number of the SAP system to which you want to connect.

SAP Client

Specifies the three digit client number for the connection.

Note: For this to work, you need a plug-in, please see List of Available Pentaho Data Integration Plug-Ins

Generic

This tab is where you specify the URL and Driver class for Generic Database connections. You can also dynamically set these properties using Kettle variables. Using Kettle variables provides you with t