...


5. Configure the Fields. Click the Get Fields button so that PDI can parse the copybook definition and determine what fields will be present.
Note

If you see an error saying that tools.jar could not be found, the JDK's tools.jar is not on PDI's classpath. A simple fix is to copy JDK_HOME/lib/tools.jar to data-integration/lib.
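
The copy described in the note can also be scripted. The following is a minimal sketch, not part of PDI: the function name copy_tools_jar and both directory arguments are assumptions for illustration, and you would pass your own JDK and data-integration locations.

```python
import shutil
from pathlib import Path


def copy_tools_jar(jdk_home: str, pdi_home: str) -> Path:
    """Copy JDK_HOME/lib/tools.jar into data-integration/lib.

    Hypothetical helper: jdk_home is your JDK install directory and
    pdi_home is your data-integration directory.
    """
    source = Path(jdk_home) / "lib" / "tools.jar"
    target_dir = Path(pdi_home) / "lib"

    # Fail early with a clear message instead of a partial copy.
    if not source.is_file():
        raise FileNotFoundError(f"tools.jar not found at {source}")
    if not target_dir.is_dir():
        raise NotADirectoryError(f"PDI lib directory not found at {target_dir}")

    # copy2 keeps file metadata and returns the path of the new file.
    return Path(shutil.copy2(source, target_dir))
```

Call it with the two directories for your own installation, then restart PDI so that the new jar is picked up when the classpath is rebuilt.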



6. Preview Data. Click the Preview button to confirm that all of the settings are correct.

...


12. You can now run this transformation, and it should complete successfully. The sample data contains 10,000 rows, so you should see the following in your Step Metrics tab after running the transformation.



Note

...

If your transformation reports errors on the HDFS step, double-check that you have configured your Big Data Shim correctly.


You should also see your file in the Hadoop file browser:

...


Note

If PDI reports that it cannot write to HDFS, you may not have configured the Big Data Shim yet. Double-check the instructions for configuring the shim for your distribution in Using Pentaho MapReduce to Parse Weblog Data.


Note

If the MapReduce job fails with an error that it is unable to split fields, follow the directions in Step 10 of "Convert z/OS File to HDFS" and make sure that every numeric value uses #.# for Format. The exception appears in the Hadoop logs like this:

Code Block
Split Fields - Unexpected error
Split Fields - org.pentaho.di.core.exception.KettleValueException:
Error converting value [ 1], when splitting field [value]!

Unexpected conversion error while converting value [CustomerId String] to an Integer

CustomerId String : couldn't convert String to Integer
Unparseable number: " 1"
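
When the Hadoop logs are long, it can help to pull the offending value and field name out of this exception automatically. The following is a minimal sketch that assumes the log text has the same shape as the excerpt above; the names SplitFieldError and find_split_field_error are hypothetical helpers, not part of PDI or Hadoop.

```python
import re
from typing import NamedTuple, Optional


class SplitFieldError(NamedTuple):
    value: str  # raw value that could not be converted, spaces preserved
    field: str  # name of the field that was being split


# Matches the "Error converting value [...], when splitting field [...]" line.
_SPLIT_ERROR = re.compile(
    r"Error converting value \[(?P<value>[^\]]*)\], "
    r"when splitting field \[(?P<field>[^\]]*)\]"
)


def find_split_field_error(log_text: str) -> Optional[SplitFieldError]:
    """Return the first Split Fields conversion error in log_text, if any."""
    # Only look at logs that actually contain the Kettle exception.
    if "KettleValueException" not in log_text:
        return None
    match = _SPLIT_ERROR.search(log_text)
    if match is None:
        return None
    return SplitFieldError(match.group("value"), match.group("field"))
```

A non-empty result whose value has leading whitespace, as in the excerpt, is a hint to revisit the Format setting on the numeric fields.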

Summary

Congratulations! You can now process mainframe files in Hadoop, so the "dark data" residing on the mainframe can be part of Big Data Analytics projects.