Transforming Data within a Hadoop Cluster
These guides show how to transform data within a Hadoop cluster using Pentaho MapReduce, Hive, and Pig.
- Using Pentaho MapReduce to Parse Weblog Data — How to use Pentaho MapReduce to convert raw weblog data into parsed, delimited records.
- Using Pentaho MapReduce to Generate an Aggregate Dataset — How to use Pentaho MapReduce to transform and summarize detailed data into an aggregate dataset.
- Transforming Data within Hive — How to read data from a Hive table, transform it, and write it to a Hive table within the workflow of a PDI job.
- Transforming Data with Pig — How to invoke a Pig script from a PDI job.
- Using Pentaho MapReduce to Parse Mainframe Data — How to use Pentaho to ingest a mainframe file into HDFS, then use MapReduce to process it into delimited records.
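To illustrate the kind of parsing the weblog guide above performs, here is a minimal sketch of a mapper-style transform that turns a raw log line into a delimited record. This is not Pentaho's implementation; the regex and field names are assumptions for an Apache-style access log, and in a real Pentaho MapReduce job this logic would live in a PDI transformation rather than hand-written code.

```python
import re

# Regex for an Apache-style access log line (an assumption; adjust to your weblog layout).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\S+)'
)

def parse_weblog_line(line):
    """Convert one raw weblog line into a pipe-delimited record, or None if unparseable."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    return "|".join(m.group("ip", "timestamp", "method", "path", "status", "bytes"))

raw = '127.0.0.1 - - [10/Oct/2011:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
print(parse_weblog_line(raw))
```

In a MapReduce setting this function would run per input line in the mapper, emitting the delimited record as the output value; unparseable lines are dropped (returned as None) rather than passed downstream.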
Related content
- Loading Data into a Hadoop Cluster
- Helpful Commands for Working with Hadoop Configurations
- Loading Data into HDFS
- Configuring Pentaho for your Hadoop Distro and Version