...
- Start PDI on your desktop. Once it is running, choose 'File' -> 'New' -> 'Transformation' from the menu, or click the 'New file' icon on the toolbar and choose the 'Transformation' option.
- Add a Hadoop File Input Step: You are going to read data from a file in the MapR file system (located through the CLDB), so expand the 'Big Data' section of the Design palette and drag a 'Hadoop File Input' node onto the transformation canvas. Your transformation should look like:
- Edit the Hadoop File Input Step: Double-click on the 'Hadoop File Input' node to edit its properties. Enter this information:
- File or directory: Enter 'maprfs://<CLDB>:<PORT>/weblogs/aggregate_mr'
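The 'File or directory' value above follows the pattern `maprfs://<CLDB>:<PORT>/<path>`. As a minimal sketch of how the placeholders fit together, the Python snippet below assembles such a URI and splits it back into its parts. The helper name `build_maprfs_uri`, the host `cldb.example.com`, and the port `7222` are illustrative assumptions, not values from this tutorial; substitute the CLDB host and port of your own cluster.

```python
from urllib.parse import urlparse


def build_maprfs_uri(cldb_host: str, cldb_port: int, path: str) -> str:
    """Assemble a maprfs:// URI in the form expected by the 'File or directory' field.

    Hypothetical helper for illustration only; it is not part of PDI.
    """
    # The path portion must be absolute, so add a leading slash if it is missing.
    if not path.startswith("/"):
        path = "/" + path
    return f"maprfs://{cldb_host}:{cldb_port}{path}"


# Assumed example values: replace the host and port with those of your CLDB node.
uri = build_maprfs_uri("cldb.example.com", 7222, "/weblogs/aggregate_mr")

# Split the URI back into scheme, host, port, and path to check each piece.
parsed = urlparse(uri)
print(uri)
print(parsed.scheme, parsed.hostname, parsed.port, parsed.path)
```

Checking the parsed pieces before pasting the value into the step dialog can help catch a missing port or a relative path early.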
...