Excel Output
PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.
Description
The Excel Output step allows you to write data to one or more Excel files. The following sections describe the features available for configuring the Excel output step.Â
This step exports data to a Microsoft Excel 2003 spreadsheet file (xls). If you want to write to an xlsx file (Excel 2007 and above), check the Excel Writer step.
Options
File Tab
The File tab defines basic file properties for this step's output.
Option | Description |
---|---|
Step name | The name of this step in the transformation workspace. |
Filename | The name of the spreadsheet file you are reading from. |
Create Parent folder | If checked, the parent folder will be created |
Do not create file at start | If checked, does not create the file until the end of the step. This avoids to create empty files when no rows are in the data stream. |
Extension | The three-letter file extension to append to the file name. |
Include stepnr in filename | If you run the step in multiple copies (launching several copies of a step), the copy number is included in the file name, before the extension. (_0). |
Include date in file name | Includes the system date in the filename (_20101231). |
Include time in file name | Includes the system time (24-hour format) in the filename (_235959). |
Specify Date time format | If checked, the filename will include a date and time stamp that follows the selection you choose from the drop-down box. Selecting this option disables the previous two options. |
Date time format | Provide a specific date time format for the filename. |
Show file name(s) | Displays a list of the files that will be generated. This is a simulation and depends on the number of rows that will go into each file. |
Add filenames to result | Adds the filenames to the internal filename result set. This internal result set can be used later on, e.g. to process all created files. |
Content Tab
The content tab contains options for describing the file's content.
Option | Description |
---|---|
Append | When checked, appends lines to the end of the specified file. If the file does not exist, a new one will be created. |
Header | Enable this option if you want a header to appear before the spreadsheet grid data. |
Footer | Enable this option if you want a footer to appear after the spreadsheet grid data. |
Encoding | Specifies the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, Spoon searches your system for available encodings and populates this list accordingly. |
Split every ... rows | After this many rows, start a new spreadsheet file to continue data output. |
Sheet name | Specifies the name of the worksheet within the spreadsheet file. |
Protect sheet? | If checked, enables password protection on the worksheet. You must also specify a password in the Password field. |
Auto size columns | If checked, automatically sizes the worksheet columns to the largest value. |
Retain NULL values | If checked, NULL values are preserved in the output. If un-checked, NULLs are replaced with empty strings. |
Use temporary file | Sets whether a temporary file is used during the generation of the workbook. If not set, the workbook will take place entirely in memory. Setting this flag involves an assessment of the trade-offs between memory usage and performance. |
Temporary files directory | Define the temporary files directoy. |
Use Template | If checked, PDI will use the specified Excel template to create the output file. The template must be specified in the Excel template field. |
Excel Template | Define the Excel template to use. |
Append to Excel Template | Appends output to the specified Excel template. |
Custom Tab
Option | Description |
---|---|
Header Font | Defines the Header Font details |
Row Font | Defines the Row Font details |
Fields Tab
The Fields tab defines properties for the exported fields. The Get Fields button will automatically retrieve a list of fields from the inputstream and populate the list. The Minimal width button removes any padding from the output.
Option | Description |
---|---|
Name | The name of the field. |
Type | The field's data type; String, Date or Number. |
Format | The format mask (number type). |
Note: You can specify any format definitions available in Excel. These formats are not tied to any Kettle specific formatting.
Metadata Injection Support
All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.