S3 File Output
PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.
Description
This step exports data to a text file on an Amazon Simple Storage Service (S3) account.
Options
File Tab
The File tab defines basic file properties for this step's output.
Option | Description |
---|---|
Step name | The name of this step in the transformation workspace. |
Access Key | S3 Access Key (optional, see Filename) |
Secret Key | S3 Secret Key (optional, see Filename) |
Filename | The name of the output text file. The name of a file in S3 follows the schema: s3://(access_key):(secret_key)@s3/(s3_bucket_name)/(absolute_path_to_file) |
Do not create file at start | When this check box is selected, the file will be created at the end of processing. When cleared, the file will be created at the start of processing. |
Accept file name from field? | When checked, enables you to specify file names in a field in the input stream. |
File name field | When the Accept file name from field option is checked, specify the field that will contain the filenames. |
Extension | The three-letter file extension to append to the file name. |
Include stepnr in filename | When the step runs in multiple copies (launching several copies of a step), includes the copy number in the file name, before the extension (for example, _0). |
Include partition nr in file name? | Includes the data partition number in the file name. |
Include date in file name | Includes the system date in the filename (for example, _20101231). |
Include time in file name | Includes the system time in 24-hour format in the filename (for example, _235959). |
Specify Date time format | When checked, enables you to specify your own date time format for the filename. |
Date time format | The date time format used for the timestamp that is added to the filename. |
Show filename(s) | Displays a list of the files that will be generated. This is a simulation and depends on the number of rows that will go into each file. |
Add filenames to result | When this check box is selected, the names of the generated files are added to the result file list, so that later processing can refer to them. |
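The filename schema and the naming options above can be sketched in a few lines of Python. This is only an illustration, not the step's implementation: the function names `build_s3_uri` and `build_file_name` are hypothetical, percent-encoding the credentials is an assumption made so that characters such as `/` cannot be read as URI delimiters, and the order in which the suffixes are appended is also an assumption (the table only states that the copy number comes before the extension).

```python
from datetime import datetime
from urllib.parse import quote


def build_s3_uri(access_key, secret_key, bucket, path):
    """Assemble s3://(access_key):(secret_key)@s3/(bucket)/(path)."""
    # Assumption: percent-encode the credentials so that '/' or '+'
    # inside a key is not mistaken for a URI delimiter.
    user = quote(access_key, safe="")
    password = quote(secret_key, safe="")
    return f"s3://{user}:{password}@s3/{bucket}/{path.lstrip('/')}"


def build_file_name(base, extension, copy_nr=None, partition_nr=None,
                    add_date=False, add_time=False, now=None):
    """Append the optional suffixes, then the extension."""
    now = now or datetime.now()
    parts = [base]
    # Assumed order: copy number, partition number, date, time.
    if copy_nr is not None:
        parts.append(f"_{copy_nr}")
    if partition_nr is not None:
        parts.append(f"_{partition_nr}")
    if add_date:
        parts.append(now.strftime("_%Y%m%d"))
    if add_time:
        parts.append(now.strftime("_%H%M%S"))
    name = "".join(parts)
    return f"{name}.{extension}" if extension else name


if __name__ == "__main__":
    stamp = datetime(2010, 12, 31, 23, 59, 59)
    name = build_file_name("sales", "txt", copy_nr=0,
                           add_date=True, add_time=True, now=stamp)
    print(build_s3_uri("ACCESSKEY", "secret/key", "my-bucket", "out/" + name))
```

With the sample values, the generated name carries the same suffixes shown in the table (`_0`, `_20101231`, `_235959`) ahead of the extension.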
Content Tab
The Content tab contains options for describing the file's content.
Option | Description |
---|---|
Append | When checked, appends lines to the end of the file. |
Separator | Specifies the character that separates the fields in a single line of text; typically this is semicolon or a tab. |
Enclosure | Optionally specifies the character that defines a block of text that is allowed to have separator characters without causing separation. Typically a single or double quote. |
Force the enclosure around fields? | Forces all fields to be enclosed with the character specified in the Enclosure property above. |
Header | Enable this option if you want the text file to have a header row (first line in the file). |
Footer | Enable this option if you want the text file to have a footer row (last line in the file). |
Format | Specifies either DOS or UNIX file format. UNIX files separate lines with line feeds; DOS files separate lines with carriage returns and line feeds. |
Compression | Specifies the type of compression to use on the output file: either zip or gzip. Only one file is placed in a single archive. |
Encoding | Specifies the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, Spoon searches your system for available encodings. |
Right pad fields | When checked, fields will be right-padded to their defined width. |
Fast data dump (no formatting) | Improves the performance when dumping large amounts of data to a text file by not including any formatting information. |
Split every ... rows | If the number N is larger than zero, splits the resulting text file into multiple parts of N rows. |
Add Ending line of file | Enables you to specify an alternate ending row for the output file. |
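To see how the Separator, Enclosure, Force the enclosure, Header, Format, and Split every ... rows options interact, the following Python sketch reproduces their effect with the standard `csv` module. It is an approximation for illustration only: `render_rows` and `split_rows` are hypothetical helper names, and Python's quoting rules are assumed to be close to, but not necessarily identical to, what the step writes.

```python
import csv
import io


def render_rows(header, rows, separator=";", enclosure='"',
                force_enclosure=False, dos_format=True):
    """Render rows as delimited text, roughly as the Content tab describes."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=separator,
        quotechar=enclosure,
        # QUOTE_ALL encloses every field; QUOTE_MINIMAL encloses only
        # fields that contain the separator, enclosure, or a line break.
        quoting=csv.QUOTE_ALL if force_enclosure else csv.QUOTE_MINIMAL,
        # DOS: carriage return + line feed; UNIX: line feed only.
        lineterminator="\r\n" if dos_format else "\n",
    )
    if header:
        writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def split_rows(rows, n):
    """Split rows into parts of n rows; n <= 0 means no splitting."""
    rows = list(rows)
    if n <= 0:
        return [rows]
    return [rows[i:i + n] for i in range(0, len(rows), n)]


if __name__ == "__main__":
    data = [[1, "a;b"], [2, "c"], [3, "d"]]
    for part in split_rows(data, 2):
        print(repr(render_rows(["id", "note"], part, dos_format=False)))
```

In the sample data, the value `a;b` contains the separator, so it is enclosed even when enclosure is not forced; this is the case the Enclosure option exists for.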
Fields Tab
The Fields tab defines properties for the exported fields.
Option | Description |
---|---|
Name | The name of the field. |
Type | The field's data type: String, Date, or Number. |
Format | The format mask (number type). |
Length | The length option depends on the field type. Number: total number of significant figures in a number; String: total length of a string; Date: determines how much of the date string is printed or recorded. |
Precision | The precision option depends on the field type, but only Number is supported; it specifies the number of floating point digits. |
Currency | Symbol used to represent currencies. |
Decimal | The decimal symbol; this is either a dot or a comma. |
Group | The symbol used to separate groups of thousands in numbers of four or more digits. This is either a dot or a comma. |
Trim type | Trims the field (left, right, both) before processing. Useful for fields that have no static length. |
Null | Inserts the specified string into the text file if the field value is null. |
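The effect of the Precision, Decimal, Group, Null, and Trim type options on a single value can be illustrated with a short Python sketch. The helpers `format_number` and `trim` are hypothetical names, and the sketch uses Python's own format specification rather than the step's format mask syntax, so treat it as an analogy under those assumptions.

```python
def format_number(value, precision=2, decimal=",", group=".", null_text=""):
    """Format a number with custom decimal and grouping symbols."""
    if value is None:
        # Null option: write the configured string instead of the value.
        return null_text
    # Python formats with ',' for grouping and '.' for the decimal point;
    # translate() then swaps both symbols in a single pass.
    text = f"{value:,.{precision}f}"
    return text.translate(str.maketrans({",": group, ".": decimal}))


def trim(value, trim_type="both"):
    """Remove surrounding whitespace on the left, right, or both sides."""
    if trim_type == "left":
        return value.lstrip()
    if trim_type == "right":
        return value.rstrip()
    if trim_type == "both":
        return value.strip()
    return value  # any other trim type leaves the value unchanged


if __name__ == "__main__":
    print(format_number(1234567.891))
    print(format_number(None, null_text="NULL"))
    print(repr(trim("  abc  ", "left")))
```

Swapping the symbols in one `translate` call avoids the usual pitfall of chained replacements, where the first substitution would be overwritten by the second.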
Buttons
Button | Description |
---|---|
Get | Retrieves a list of fields from the input stream. |
Minimal width | Minimizes field width by removing unnecessary characters (such as superfluous zeros and spaces). If set, string fields will no longer be padded to their specified length. |
Metadata Injection Support
All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.