04. MQLQuery Architecture Documentation
The MQLQuery and pms.mql packages have been deprecated, this documentation is relevant to Metadata releases 3.0 and earlier.
Important Classes and Interfaces within the org.pentaho.pms.mql package:
MQLQuery |
This is the public facing interface of the MQL SQL generation code. This interface defines the basic object model of mql queries: selections, constraints, and orderings. The getQuery() call returns the necessary sql. |
MQLQueryFactory |
This is the factory class used by other systems for determining the instance of MQLQuery to use when generating SQL from MQL. |
MQLQueryImpl |
This is the default implementation of MQLQuery. This class contains the necessary members and methods for serializing and de-serializing MQL. It delegates SQL generation to the SQLGenerator class. |
Selection |
This is a single Business Column selection within an MQL query |
WhereCondition |
This is an open formula expression which constrains the MQL query |
OrderBy |
This is a single business column and type of ordering used by the MQL query |
SQLGenerator |
This class generates SQL based on MQL. The SQL Generator determines the join path for the selected business columns, and also determines the necessary group by organization if aggregates are used. |
PMSFormula |
PMSFormula translates an open formula expression into SQL. This is used in many contexts, including constraints, exact business column definitions, and complex join definitions. |
Path |
This class is used as a data structure when determining the correct join path of selected business columns. |
Query |
This is an interface returned by MQLQuery that allows access to the SQL generated. It also provides an API for generating Extended Metadata. |
MappedQuery |
This is the implementation of the Query interface. the SQL generated by MQL uses COL0 to COLN for sql columns, and these columns are mapped back to Business Columns via the ExtendedMetaData. |
ExtendedMetaData |
This class enhances MemoryMetaData by providing more information on each column selected by the query. |
MQL Query Model
Domain Type |
the type of domain, either relational or olap. Olap is not supported at this time. |
Domain |
the Metadata domain selected |
Model |
the Metadata Business Model selected within the domain |
Options |
options allow for customization of the SQL generation |
Selections |
Selections are the set of business columns returned in the generated SQL query |
Constraints |
Constraints are open formula expressions which constrain the SQL query |
Orders |
Orders are used for sorting the SQL generated. |
See the document Pentaho Metadata MQL Schema for more details on the XML configuration.
Important Algorithms
Deserialization of MQL XML
The MQLQueryImpl class provides the functionality to parse MQL XML. First an XML DOM is generated from a String, and then that DOM is traversed via the fromXML() methods. The dom is broken into five segments.
The first section includes the parsing of domain_type, domain_id, and model_id. These are all required.
The options element contains customized options that impact the generation of the SQL. The only option defined within MQLQueryImpl is disable_distinct.
Selection elements contain the business column selections for the query in the column child element. Selection objects are created for each xml element.
Constraint elements contain an open formula expression in the condition child element, and an operator definition as a child element. A new WhereCondition object is generated for each constraint.
Order elements contain direction, view_id, and column_id child elements. direction may equal "asc" or "desc", and the view_id and column_id elements must resolve to an existing business column. An OrderBy object is generated for each order element.
Serialization of MQL XML
The MQLQueryImpl class provides functionality to generate MQL XML based on it's object model of selections, constraints, orders, and options. The getXML() call first generates an XML DOM, and then prints that out to a string.
SQL Generation
The SQLGeneration class combined with the PMSFormula class contain the code necessary for generating SQL based on the MQL model. First, an optimal join path is determined that contains all selected and referenced business tables. From there, the SQL is generated in seven generation steps. These generation steps include Select, From, Join, Where, Group By, Having, and Order By.
Join Path Algorithm
The getShortestPathBetween() method determines the shortest path between the list of included tables within the MQL Query.
The algorithm first determines if there is an existing path between all selected tables. If not, the algorithm continues to add new tables to the list until a path is discovered. If more than one path is available with a certain number of tables, the algorithm uses the relative size values if specified to determine which path to traverse in the SQL Join.
Select
The generateSelect() method traverses the set of selections and renders those selections to the SQL string buffer. This method determines the SQL column aliases. It also calls getBusinessColumnSQL() which renders each individual business column in three different ways. Either as an MQL Formula, an aggregate function, or as a standard SQL column.
From
The generateFrom() method traverses the set of included business tables and renders those tables to the SQL string buffer.
Joins
The generateJoins() method traverses the set of included RelationMetas and renders them to the SQL string buffer via the getJoin() method
Where
The generateWhere() method renders all WhereCondition's that are not part of an aggregate column.
Group By
The generateGroupBy() method renders the non aggregate columns in this section, creating the correct SQL for group by statements.
Having
The generateHaving() method renders all the Where Conditions of aggregate columns.
Order By
The generateOrderBy() method renders all the Order By business columns
PMSFormula Class
The PMSFormula class is used in various contexts while generating SQL from MQL. These contexts include Business Column Exact Formula rendering, Relation Complex Join rendering, and Constraint rendering.
Business Column Exact Formula Rendering
Used within the SQLGeneration.getBusinessColumnSQL() method when a business column is configured as "exact", the formula specified as part of the business column is parsed as an open formula and then rendered as SQL.
Relation Complex Join Rendering
Used within the SQLGeneration.getJoin() method when a Relation is set to complex, the complex join string is parsed as open formula and rendered to SQL.
Constraint Rendering
Constraints are rendered using PMSFormula. Each WhereCondition class contains a PMSFormula object which manages the SQL rendering. Constraints are parsed and resolved before SQL generation, to determine which business tables are referenced. This is an important step so that the Join algorithm contains the correct set of tables.
 Extending MQLÂ
Specifying your own MQLQuery class
The default MQLQuery behavior may be extended or replaced. To accomplish this:
Step 1 |
Extend MQLQueryImpl or implement the MQLQuery interface |
Step 2 |
Make sure to have a constructor with the following signature: |
Step 3 |
Either pass in the fully qualified class name into MQLQueryFactory.getMQLQuery(), |
Extending MQLQueryImpl
To easily add modifications or additional information to the MQL XML syntax, one must extend MQLQueryImpl, SQLGenerator and PMSFormula. The MQL Query Model can be extended by overriding specific methods within these classes. In the org.pentaho.pms.example package, an AdvancedMQLQuery class is available that demonstrates the extension of Selections XML parsing.