Create Hive Database Connection

Create a Database Connection to Hive

If you already have a shared Hive Database Connection defined within PDI then this task may be skipped.

  1. Start PDI on your desktop. Once it is running choose 'File' -> 'New' -> 'Job' from the menu system or click on the 'New file' icon on the toolbar and choose the 'Job' option.

  2. Create a New Connection: In the View Palette right click on 'Database connections' and select 'New'.


  3. Configure the Connection: In the Database Connections window enter the following:
    1. Connection Name: Enter 'Hive'
    2. Connection Type: Select 'Hadoop Hive'
    3. Host Name and Port Number: Your connection information. For local single node clusters use 'localhost' and port '10000'.
    4. Database Name: Enter 'Default'
      When you are done your window should look like:

      Click 'Test' to test the connection.
      If the test is successful click 'OK' to close the Database Connection window.
  4. Share the Hive Database Connection: You will want to use your Hive database connection in future transformations, so share the connection by expanding 'Database Connections' in the View Palette, right clicking on 'Hive', and selecting 'Share'.
    Sharing the connection will prevent you from having to recreate the connection every time you want to access the Hive database in a transformation.