Purpose
The purpose of this document is to provide Java developers with a set of simple, cookbook-style instructions for adding custom functions to LibFormula for use in the Pentaho BI Platform, Report Designer, Metadata Editor, etc. This is not intended to be a guide to the internals of LibFormula.
Tools and prerequisites
To add your own function to LibFormula, you should be an experienced Java developer and have:
- A Java IDE (Eclipse, IntelliJ, NetBeans, etc.)
- The LibFormula .jar file in your classpath
- The LibFormula source .jar file in your classpath
Before starting
Before you get started, you need to familiarize yourself with a few key classes and interfaces. After that, you need to examine the requirements of your new function and be prepared to answer some basic questions about what you're going to implement.
Key Classes and Interfaces
- org.pentaho.reporting.libraries.formula.typing.DefaultTypeRegistry, which provides the default implementation of org.pentaho.reporting.libraries.formula.typing.TypeRegistry
- org.pentaho.reporting.libraries.formula.typing.Type, which describes data types
- org.pentaho.reporting.libraries.formula.lvalues.TypeValuePair
- org.pentaho.reporting.libraries.formula.util, which is a package of helper utilities that you'll use for various conversions
- org.pentaho.reporting.libraries.formula.util.NumberUtil, which contains important methods (especially getAsBigDecimal) for safely getting and converting BigDecimal numbers
- org.pentaho.reporting.libraries.formula.util.DateUtil, which contains important functions for safely getting and converting dates
Basic questions you need to ask yourself before coding
Input parameters
- How many parameters does your function need? For example, a NOW function would need zero parameters, and a SIN function would need one parameter.
- Are any of the parameters optional?
- What is the expected data type of each input? For example, an LN function expects a numeric input parameter, and an UPPERCASE function expects a string input parameter.
- Does your function operate on a single value, or a sequence of values? For example, a SQUAREROOT function works on a single numeric input, and an AVERAGE function works on a sequence of numbers.
Output parameters
- What type of result does your function return? For example, a SUBSTRING function returns String data, a TAN function returns numeric data.
- Does your function return a single value, or multiple values (array or sequence)? For example, an ACOS function returns a single numeric type, and INDEX returns an array of numeric values.
- What is the category of your result? Categorization of functions allows user-interfaces to list your function in the right place. At the time of this writing, there were the following pre-defined categories of functions:
- DateTimeFunctionCategory (e.g. YEAR)
- FinancialFunctionCategory (e.g. IRR - Internal Rate of Return)
- InformationFunctionCategory (e.g. ISBLANK)
- LogicalFunctionCategory (e.g. XOR)
- MathFunctionCategory (e.g. ATAN)
- RoundingFunctionCategory (e.g. INT)
- TextFunctionCategory (e.g. TRIM)
- UserDefinedFunctionCategory (e.g. NULL)
The LibFormula Cookbook
Extending LibFormula is as simple as implementing two interfaces and creating a couple of .properties files that tell it about your new function. Here are the two interfaces you'll be implementing:
- org.pentaho.reporting.libraries.formula.function.Function
- org.pentaho.reporting.libraries.formula.function.FunctionDescription
To see a simple example that uses these classes, open up the LibFormula source in your IDE and go to the org.pentaho.reporting.libraries.formula.function.math package. Then open up the classes AbsFunction and AbsFunctionDescription, and the file Abs-Function.properties.
File Descriptions:
AbsFunction - This class performs the work. It evaluates the incoming parameters, and returns the result.
AbsFunctionDescription - This class describes the function to the outside world. It is the mechanism that allows user interfaces to recognize, categorize, and display your function in the correct place.
Abs-Function.properties - This file provides the name and description for your function as well as all the arguments to your function.
Fast-path to implementing Function
- Make sure you have a zero-argument constructor
- Check your parameter-count first, then check the types of each of your parameters
- Throw an EvaluationException if you have problems interpreting the parameters
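The order of the checks in the list above can be sketched in plain Java. This is not LibFormula code: SketchEvaluationException and AbsSketch are hypothetical stand-ins invented for illustration, and the real Function interface receives its arguments through LibFormula's own types rather than an Object[]. The sketch only shows the sequence: zero-argument constructor, parameter count, parameter types, and an exception when the parameters cannot be interpreted.

```java
import java.math.BigDecimal;

/** Hypothetical stand-in for LibFormula's EvaluationException; not the real class. */
class SketchEvaluationException extends Exception {
    SketchEvaluationException(String message) {
        super(message);
    }
}

/** Hypothetical ABS-like function written without any LibFormula types. */
final class AbsSketch {

    /** Zero-argument constructor, as the checklist requires. */
    AbsSketch() {
    }

    String getCanonicalName() {
        return "ABS";
    }

    BigDecimal evaluate(Object[] parameters) throws SketchEvaluationException {
        // 1. Check the parameter count first.
        if (parameters == null || parameters.length != 1) {
            throw new SketchEvaluationException("ABS expects exactly one parameter");
        }
        // 2. Then check the type of each parameter.
        Object raw = parameters[0];
        if (!(raw instanceof Number)) {
            throw new SketchEvaluationException("ABS expects a numeric parameter");
        }
        // 3. Convert and compute; values such as NaN have no BigDecimal form.
        try {
            BigDecimal value = (raw instanceof BigDecimal)
                ? (BigDecimal) raw
                : new BigDecimal(raw.toString());
            return value.abs();
        } catch (NumberFormatException e) {
            throw new SketchEvaluationException("Cannot interpret parameter as a number: " + raw);
        }
    }

    public static void main(String[] args) throws SketchEvaluationException {
        System.out.println(new AbsSketch().evaluate(new Object[] { -3 }));
    }
}
```

For the real signatures and the way AbsFunction reports errors, the AbsFunction source remains the reference.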
Fast-path to implementing FunctionDescription
- Subclass org.pentaho.reporting.libraries.formula.function.AbstractFunctionDescription and override only what's required.
- Create a zero-argument constructor, and call the super-class constructor with two parameters:
- The string you return from your Function's getCanonicalName() method.
- The path and name (using package notation, not file notation) of the .properties file that contains translatable strings for use in user interfaces (explained below), without the .properties extension. In this case, it's org.pentaho.reporting.libraries.formula.function.math.Abs-Function, which tells LibFormula to find the file Abs-Function.properties in the org.pentaho.reporting.libraries.formula.function.math package.
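To make "package notation, not file notation" concrete, the sketch below uses the standard java.util.ResourceBundle.Control helpers to show which classpath resource a dotted base name denotes. It assumes the description's strings are resolved using the usual ResourceBundle naming conventions, which this document does not state explicitly. BundleNameSketch is a hypothetical class name.

```java
import java.util.Locale;
import java.util.ResourceBundle;

/** Hypothetical helper that shows how a dotted bundle name maps to a resource path. */
final class BundleNameSketch {

    /** Maps a base name in package notation to the .properties resource it denotes. */
    static String toPropertiesResource(String baseName, Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        // Locale.ROOT keeps the base name as-is; other locales append a suffix such as "_de".
        String bundleName = control.toBundleName(baseName, locale);
        // Dots become slashes and the ".properties" extension is appended.
        return control.toResourceName(bundleName, "properties");
    }

    public static void main(String[] args) {
        String baseName = "org.pentaho.reporting.libraries.formula.function.math.Abs-Function";
        System.out.println(toPropertiesResource(baseName, Locale.ROOT));
        System.out.println(toPropertiesResource(baseName, Locale.GERMAN));
    }
}
```

Under that assumption, a translated file would sit next to the default one with a locale suffix in its name.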
Create your function .properties file
In the same package as your Function and FunctionDescription implementations, create the properties file you named in your FunctionDescription zero-argument constructor. This properties file provides names and descriptions for your function, and for all of your parameters. For a zero-parameter function, you'll only need two properties in the file:
- display-name
- description
Make sure that display-name is the same as the string you return from your Function's getCanonicalName() method.
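A quick way to catch a mismatch between display-name and the canonical name is a small check such as the one below. It is only a sketch: FunctionPropertiesCheck is a hypothetical class, and the property text describes an imagined zero-parameter NOW function rather than a file shipped with LibFormula.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical check for the two keys a zero-parameter function needs. */
final class FunctionPropertiesCheck {

    /** Returns true when both keys exist and display-name equals the canonical name. */
    static boolean isConsistent(String propertiesText, String canonicalName) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        String displayName = props.getProperty("display-name");
        String description = props.getProperty("description");
        return displayName != null
            && description != null
            && displayName.equals(canonicalName);
    }

    public static void main(String[] args) throws IOException {
        // Illustrative content for a zero-parameter function; the wording is made up.
        String text = "display-name=NOW\n"
            + "description=Returns the current date and time.\n";
        System.out.println(isConsistent(text, "NOW"));
    }
}
```

In a real project the text would come from the .properties file on the classpath instead of a string literal.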
Create libformula.properties
The class LibFormulaBoot is coded to look for all libformula.properties files that aren't in any package (that is, files at the root of the classpath). This file tells LibFormula about your new function. Your new function requires two properties. Each property name is broken into three parts separated by dots:
- The fully qualified category. For example, org.pentaho.reporting.libraries.formula.functions.datetime for date/time functions, or org.pentaho.reporting.libraries.formula.functions.math for math functions.
- The root name of your class (without Function or FunctionDescription). For example, if you created classes called VarianceFunction and VarianceFunctionDescription, then the root name of the class is Variance.
- class for Function, or description for FunctionDescription. The value of the class property is the fully qualified package and class for the Function implementation. The value of the description property is the fully qualified package and class for the FunctionDescription implementation.
For example, let's assume the following:
- You've created a new GCD (Greatest Common Denominator) function
- You work for the company Acme
- Your Function implementation is
com.acme.libformula.GreatestCommonDenomFunction
- Your FunctionDescription implementation is
com.acme.libformula.GreatestCommonDenomFunctionDescription
Given the above, your libformula.properties file would have the following two lines in it:
org.pentaho.reporting.libraries.formula.functions.math.GreatestCommonDenom.class=com.acme.libformula.GreatestCommonDenomFunction
org.pentaho.reporting.libraries.formula.functions.math.GreatestCommonDenom.description=com.acme.libformula.GreatestCommonDenomFunctionDescription
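The three-part key structure can be checked mechanically. The sketch below splits each key from the GCD example into category, root name, and kind. RegistrationKeySketch is a hypothetical helper written for this document. It does not reproduce how LibFormulaBoot reads the file. It only verifies that a key has the shape described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.TreeSet;

/** Hypothetical helper that splits a registration key into its three parts. */
final class RegistrationKeySketch {

    /** Returns { category, root name, kind }, where kind is "class" or "description". */
    static String[] splitKey(String key) {
        int kindDot = key.lastIndexOf('.');
        int rootDot = key.lastIndexOf('.', kindDot - 1);
        if (kindDot < 0 || rootDot < 0) {
            throw new IllegalArgumentException("Expected <category>.<root>.<kind>: " + key);
        }
        String kind = key.substring(kindDot + 1);
        if (!kind.equals("class") && !kind.equals("description")) {
            throw new IllegalArgumentException("Unknown kind '" + kind + "' in " + key);
        }
        return new String[] {
            key.substring(0, rootDot), key.substring(rootDot + 1, kindDot), kind
        };
    }

    public static void main(String[] args) throws IOException {
        String prefix = "org.pentaho.reporting.libraries.formula.functions.math.GreatestCommonDenom";
        String text = prefix + ".class=com.acme.libformula.GreatestCommonDenomFunction\n"
            + prefix + ".description=com.acme.libformula.GreatestCommonDenomFunctionDescription\n";
        Properties props = new Properties();
        props.load(new StringReader(text));
        // A sorted copy of the keys gives a stable print order.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            String[] parts = splitKey(key);
            System.out.println(parts[1] + " [" + parts[2] + "] in " + parts[0]
                + " -> " + props.getProperty(key));
        }
    }
}
```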
Sample Code
Attached is a sample Java project that implements a simple LibFormula function called SLEEP. The SLEEP function takes a numeric parameter and calls the Thread.sleep method, which causes the current thread to cease execution for the specified number of milliseconds.
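The core of such a function, stripped of all LibFormula types, might look like the sketch below. It is not the code from the attached project: SleepSketch and sleepFor are hypothetical names, and rejecting negative values and returning a boolean are assumptions made for this illustration.

```java
/** Hypothetical core of a SLEEP-style function, without any LibFormula types. */
final class SleepSketch {

    /**
     * Pauses the current thread for the given number of milliseconds.
     * Returns true if the full pause elapsed, false if the thread was interrupted.
     */
    static boolean sleepFor(Number millis) {
        if (millis == null || millis.longValue() < 0) {
            throw new IllegalArgumentException(
                "SLEEP expects a non-negative number of milliseconds");
        }
        try {
            Thread.sleep(millis.longValue());
            return true;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe the interruption.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(sleepFor(50));
    }
}
```

In a real Function implementation, the invalid-parameter case would be reported with an EvaluationException, as described in the fast-path list for Function.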
Summary
The Pentaho LibFormula library was designed from the very beginning to allow extensions in many different ways; adding your own function is only one of many possible extension points. We hope that the extensibility of LibFormula encourages community contributions like financial libraries, algebraic expressions, and other useful operations.