Table Input
PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.
Description
This step is used to read information from a database, using a connection and SQL. Basic SQL statements can be generated automatically by clicking Get SQL select statement.
Options
Option | Description |
---|---|
Step name | Name of the step;the name has to be unique in a single transformation |
Connection | The database connection from which to read data |
SQL | The SQL statement used to read information from the database connection. You can also click Get SQL select statement... to browse tables and automatically generate a basic select statement. |
Enable lazy conversion | When enabled, lazy conversion avoids unnecessary data type conversions and can result in a significant performance improvements. |
Replace variables in script? | Enable to replace variables in the script; this feature was provided to allow you to test with or without performing variable substitutions. |
Insert data from step | Specify the input step name where Pentaho? can expect information to come from. This information can then be inserted into the SQL statement. The locators where Pentaho? inserts information is indicated by ? (question marks). |
Execute for each row? | Enable to perform the data insert for each individual row. |
Limit size | Sets the number of lines that is read from the database; zero (0) means read all lines. |
Metadata Injection Support
You can use the Metadata Injection supported fields with ETL Metadata Injection step to pass metadata to your transformation at runtime. The following Option fields of the Table Input step support metadata injection:
- Options: Connection, SQL, Enable Lazy Conversion, Replace Variables in Script?, Insert Data from Step, and Limit Size
Example
Below is a sample SQL statement:
SELECT * FROM customers WHERE changed_date BETWEEN ? AND ?
This statement needs two dates that are read on the Insert data from step.Note: The dates can be provided using the Get System Info step type. For example, if you want to read all customers that have had their data changed yesterday, you may get the range for yesterday and read the customer data.
Preview
Preview allows you to preview the step. This is accomplished by preview of a new transformation with two steps: this one and a
Dummy step. To see a detailed log of the execution, click Logs in the Preview window.
FAQ
Variables are not getting substituted
Q: I'm feeding parameters via a input hop to a table input, but they're not being replaced properly. My input row contains e.g. the fieldname "input_param". The following is my query in Table input step:
SELECT VID FROM CO_VEHICLES WHERE PONO = ${input_param};
the input_param does not get replaced. How to do this?
A:The reason it fails is that there's a difference between variables and field substitution. The above syntax would work if "input_param" would be a variable. However to use the field values of the incoming rows as arguments you need to use ? Parameters. The query would need to become:
SELECT VID FROM CO_VEHICLES WHERE PONO = ?;
These parameter markers are filled in positionally.
Parameters are working, but not as part of a tablename
Q: I'm using parameters in a table input step. This works when I use the parameter in a where clause as:
SELECT VID FROM CO_VEHICLES WHERE POID = ?;
If 1 argument would be passed to the table input step from the previous step. However when I use the parameter marker as part of a tablename as:
SELECT VID FROM CO_VEHICLES_?;
it fails. The intention of the above query would be e.g. to access multiple tables CO_VEHICLES_1, CO_VEHICLES_2, ... where the number would be passed as a parameter. This always fails, why?
A:The reason it fails is that JDBC does not allow parameter markers in a table. The reason for this is simple: the tablename determines the columns you can get retrieve, the access rights, ... These things are checked once for parametrized SQL statements and afterwards only the parameters are filled in to execute the query.
If parameter markers would be allowed in tablenames the SQL statement would have to be prepared every time, hence parameter markers in tablenames are not supported.
There is a way around it: you can use (kettle) variables as part of a tablename as in:
SELECT VID FROM CO_VEHICLES_${NUMBER};
What PDI does for these queries (if you enable "variable replacement" in the table input step) is:
- Replace the variables in the SQL statements;
- Bind the parameters per execution;
This way JDBC is not aware of the changing tablename (because it will be filled in by PDI before the SQL is sent over to the database), it will however result in different SQL statements being prepared by the database. There's also another section in is document on the setting of variables.