What's a Pentaho Metadata Dialect?
A dialect defines how Pentaho Metadata should generate SQL for a given database type.
Pentaho Metadata's Dialect Plugin System
Introduced in BI Suite Release 3.7, Pentaho Metadata supports additional dialects through it's Service Provider Interface (SPI): SQLDialectInterface
. A Pentaho Metadata Dialect Plugin is a jar file that consists of a dialect implementation and some configuration. The SPI enables developers to develop new database dialects which Pentaho Metadata will automatically detect and register.
Implementing SQLDialectInterface
Database types are defined in Kettle with a unique database id, e.g. SampleDB. Dialects relate one-to-one with database type ids defined in Kettle DB Core or a Kettle Database Plugin. A dialect is an implementation of the SPI:
public interface SQLDialectInterface { public String getDatabaseType(); public String quoteStringLiteral(Object str); public boolean isSupportedFunction(String functionName); public boolean isAggregateFunction(String functionName); public boolean isSupportedInfixOperator(String operator); public SQLFunctionGeneratorInterface getFunctionSQLGenerator(String functionName); public SQLOperatorGeneratorInterface getInfixOperatorSQLGenerator(String operatorName); public String getDateSQL(int year, int month, int day); public String generateSelectStatement(SQLQueryModel model); public int getMaxTableNameLength(); }
A base implementation exists and can be used as a simple starting point for most new dialects. In addition to the SPI, DefaultSQLDialect
defines additional extension points to facilitate building new dialects quickly. Below is a dialect that defines the maximum table name length and a trivial method for quoting a String literal for our Sample Database:
public class SampleDialect extends DefaultSQLDialect { public SampleDialect() { super("SampleDB"); } @Override public int getMaxTableNameLength() { return 128; } @Override public String quoteStringLiteral(Object str) { return "'" + str + "'"; } }
Declaring your Dialect as a Service Provider
A jar file containing your new dialect should be created with an additional META-INF entry. Pentaho Metadata utilizes the Java Service Loader API to dynamically look up dialect implementations. The ServiceLoader API requires you list all Service Provider implementations in a file named after the interface of the service they provide in the META-INF/services
directory of the jar file:
META-INF/services/org.pentaho.pms.mql.dialect.SQLDialectInterface: org.pentaho.pms.mql.dialect.SampleDBDialect
Where org.pentaho.pms.mql.dialect.SampleDialect
is the implementation of SQLDialectInterface
for our fictitious database type: SampleDB.
For more information see Jar Service Provider Packaging Notes from Oracle.
Using your dialect
All that is required is the jar file to be in the classpath. In most applications this is a /lib
directory within the application's installation directory.
- Note: In order to use a dialect the database type must be defined in Kettle. See the sample project for how to create a Kettle Database Plugin, and the Unit Test for how to register it programmatically.
Sample Source Code
Attached is a sample project with the aforementioned SampleDB Dialect Plugin and a SampleDB Kettle Database Plugin. The attachment is an Eclipse project with independent Ant build scripts based on Subfloor. All dependencies are managed by Ivy. Subfloor will take care of downloading Ivy and resolving all dependencies by running the default target:
../sampledb-plugin> ant
http://wiki.pentaho.com/download/attachments/19235857/sampledb-plugin.tar.gz
Further Reading
For more information on how to test a Kettle Plugin click here: http://wiki.pentaho.com/display/EAI/How+to+debug+a+Kettle+4+plugin
For more general information about Java's ServiceLoader API see these links:
http://download.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html
http://java.sun.com/developer/technicalArticles/javase/extensible/index.html