{scrollbar}
{excerpt}How to read data from a Hive table, transform it, and write it to a Hive table within the workflow of a PDI job.{excerpt}

h1. Prerequisites

In order to follow along with this how-to guide you will need the following:
* Hadoop
* Pentaho Data Integration
* Hive

h1. Sample Files

The source data for this guide will reside in a Hive table called weblogs. If you have previously completed the [Loading Data into Hive] guide, then you can skip to [#Create a Database Connection to Hive]. If not, then you will need the following data file and to perform the [#Create a Hive Table] instructions before proceeding.

The sample data file needed for the [#Create a Hive Table] instructions is:
| File Name | Content |
| [weblogs_parse.txt.zip|Transforming Data within Hive in MapR^weblogs_parse.zip] | Tab-delimited, parsed weblog data |

h1. Step-By-Step Instructions

h2. Setup

Start Hadoop if it is not already running.

Start Hive Server if it is not already running.

{anchor:Create a Hive Table}
{include:Include Transforming Data within Hive}
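The transform-within-Hive pattern described above boils down to a HiveQL statement that reads from one table and writes its result into another, which PDI can submit from a job. As a rough sketch only: the weblogs source table comes from this guide, but the weblogs_agg target table and its columns are placeholder assumptions, not the exact schema used in the included steps.

{code:sql}
-- Hypothetical target table; the authoritative steps are in the included page.
CREATE TABLE weblogs_agg (
  client_ip  STRING,
  year       SMALLINT,
  month_num  TINYINT,
  pageviews  BIGINT
);

-- Read from weblogs, aggregate, and write the result back into Hive.
INSERT OVERWRITE TABLE weblogs_agg
SELECT client_ip, year, month_num, COUNT(*)
FROM weblogs
GROUP BY client_ip, year, month_num;
{code}

Because the work runs inside Hive (as MapReduce jobs on the cluster), the PDI job only needs a SQL job entry pointed at the Hive connection; no row data flows through PDI itself.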