...
This step requires Python 2.7 or 3.4 to be installed, along with the pandas, numpy, matplotlib and scikit-learn (imported as sklearn) packages. The python executable must be available on the user's PATH.
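The requirements above can be checked before configuring the step. The following is a minimal sketch, not part of the step itself. The helper names `missing_packages` and `check_environment` are hypothetical, and the script assumes it is run with Python 3.4 or later, since `importlib.util.find_spec` and `shutil.which` are not available in Python 2.7.

```python
import importlib.util
import shutil
import sys

# Import names of the packages listed in the requirements above.
REQUIRED_PACKAGES = ("pandas", "numpy", "matplotlib", "sklearn")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names that cannot be located as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment():
    """Collect a simple report on the interpreter, PATH and packages."""
    return {
        # Version of the interpreter running this script, e.g. "3.4".
        "python_version": "%d.%d" % sys.version_info[:2],
        # True if an executable named "python" is found on PATH.
        "python_on_path": shutil.which("python") is not None,
        # Required packages that this interpreter cannot find.
        "missing_packages": missing_packages(),
    }


if __name__ == "__main__":
    print(check_environment())
```

Note that the report describes the interpreter that runs the script, which may differ from the `python` executable found on PATH. To check the interpreter the step would launch, run the script with that same `python` command.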
See also Mark Hall's blog post: CPython Scripting in Pentaho Data Integration