JFR9DataProcessing

Data Processing and Functions

Introduction to the Data-Layer

JFreeReport strictly separates the data processing from the layout processing.

The Data-Layer provides a unified interface to the various data-sources and encapsulates all implementation-specific details. In JFreeReport, there are three general classes of data-sources: (1) Tables provide JFreeReport with mass-data. Table-data is usually read from a database or similar data storage. (2) Parameters feed single data values into the report from the outside. (3) Functions and Expressions compute values using other data-sources as input.

General Architecture

The datarow is the central interface for accessing values from the various data sources. The datarow is a stacked and strictly separated structure. A global view grants access to the local data and prevents the direct manipulation of the datastructures of the backend.

The backend itself is a collection of all datasources. In the default implementation there are four different datasource types available:

Parameters
Report Data (Tables queried from the ReportDataFactories)
Functions
Imported Parameters

Datarows are layered. Each report (either a master or a subreport) opens a new data-row context. Datarows from lower layers can be accessed from upper layers through the imported parameters mechanism.

Parameters

Each master report and subreport can take an unlimited number of parameters. When a report process is started, the parameters are used to perform the query. Parameters are always available, and it is assumed that they do not change during report processing.

Parameter values for subreports are read from the dataset of the innermost report. Parameters for subreports can either be defined explicitly, in which case they can be aliased to avoid naming conflicts, or a default mapping can be declared. With the default mapping, all columns of the master row are imported into the subreport under the same names they had in the master report.
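
The two import styles can be pictured with a small sketch. The class SubreportParameters and its methods are hypothetical; they only model the mapping rules described above and are not part of the JFreeReport API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that models how a subreport receives its parameters. */
final class SubreportParameters {
    private SubreportParameters() {
    }

    /** Explicit import: copies only the mapped columns, renaming each one (master name -> alias). */
    static Map<String, Object> importExplicit(Map<String, Object> masterRow,
                                              Map<String, String> aliases) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> alias : aliases.entrySet()) {
            result.put(alias.getValue(), masterRow.get(alias.getKey()));
        }
        return result;
    }

    /** Default mapping: copies every column of the master row under its original name. */
    static Map<String, Object> importAll(Map<String, Object> masterRow) {
        return new LinkedHashMap<>(masterRow);
    }

    public static void main(String[] args) {
        Map<String, Object> masterRow = new LinkedHashMap<>();
        masterRow.put("REGION", "East");
        masterRow.put("BUDGET", 1000);

        // Alias REGION so it cannot clash with a column of the subreport's own query.
        Map<String, String> aliases = new LinkedHashMap<>();
        aliases.put("REGION", "MASTER_REGION");

        System.out.println(importExplicit(masterRow, aliases));
        System.out.println(importAll(masterRow));
    }
}
```

In this sketch the explicit import exposes only the aliased column, while the default mapping exposes the whole master row unchanged.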

A subreport that defines export parameters maps those parameters into the datarow of the master report. Deep traversing functions can access these parameters for their computations.

Export parameters stay available after the sub-report has been fully processed until the next 'commit' operation has been done.

Report-Data and DataFactories

The report processing always starts with a (possibly empty) set of parameters and a query name. The parameters correspond to the report properties of the old reporting engine. Each parameter automatically appears as a static column in the data row. Sub-Reports have no explicit parameter set; they receive their parameters from the current data row context. Every report and subreport has a query name. The name, along with the current values from the data row, is used by the ReportDataFactory to query the underlying datasource and to return a valid ReportData object.

A report data object represents a data table with an assigned cursor. JFreeReport guarantees that report-data objects are iterated using the 'advance' operation. Any other cursor movement is limited to previously read data rows: the cursor will never be positioned after the last row read with 'advance'.
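
The cursor contract can be illustrated with a minimal sketch. CachingCursor is a hypothetical class written for this page, not the actual ReportData interface; it assumes that rows already read are kept in memory so the cursor can return to them.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical cursor: forward reads go through advance(), jumps only reach rows already read. */
final class CachingCursor<T> {
    private final Iterator<T> source;
    private final List<T> readRows = new ArrayList<>();
    private int position = -1; // -1 means "before the first row"

    CachingCursor(Iterator<T> source) {
        this.source = source;
    }

    /** Moves to the next row, reading it from the source if necessary. Returns false at the end. */
    boolean advance() {
        if (position + 1 >= readRows.size()) {
            if (!source.hasNext()) {
                return false;
            }
            readRows.add(source.next());
        }
        position++;
        return true;
    }

    /** Repositions the cursor onto a row that advance() has already read. */
    void setPosition(int row) {
        if (row < 0 || row >= readRows.size()) {
            throw new IndexOutOfBoundsException("Row " + row + " has not been read yet");
        }
        position = row;
    }

    /** Returns the row under the cursor. */
    T current() {
        if (position < 0) {
            throw new IllegalStateException("advance() has not been called yet");
        }
        return readRows.get(position);
    }
}
```

A caller would advance twice, jump back with setPosition(0), and then continue with advance(); a call such as setPosition(5) on a cursor that has only read two rows is rejected.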

The format and content of the 'query'-string are not defined and depend on the actual DataFactory implementation. It can be anything from valid SQL to a handle to an arbitrarily complex operation.

Classical reporting is driven by uniform mass-data given in two-dimensional tables. As the reporting process advances, this data is merged into the report definition. Additional computations, aggregation and grouping can be applied to provide additional information derived from that original data.

In the default implementation all repeating elements bind their repeatability to the report data. Repeating is only done if there is more data to process.

Classical reporting uses tables as the storage format for the mass-data. One reason is that reporting originated in (and is mostly used with) relational databases and business applications that use such databases as their primary storage system. Another reason is that tables can be used to construct almost any other data structure. Trees and other complex structures can be built using nested tables or subqueries. In the reporting domain, subqueries are done using subreports.

Tree Report makes three assumptions when dealing with tabular data.

The data is constant. Reading the same position multiple times must always return the same result. The number and order of the rows never change.
The data is randomly accessible. The current cursor position can change to any previously read row at any time.
The data is sorted in a suitable way. The reporting engine will not sort the data; it is the responsibility of the report author to provide a reasonably sorted data-set that fits the declared groupings of the report.
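
The sorting assumption matters because groups are detected while the rows are read in sequence. The following sketch is not engine code; the class GroupBreakDemo is hypothetical, and it assumes that a group ends whenever the grouping column changes from one row to the next.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical demo: counts the groups a sequential reader would open. */
final class GroupBreakDemo {
    private GroupBreakDemo() {
    }

    /** Opens a new group whenever the key differs from the key of the previous row. */
    static int countGroups(List<String> keys) {
        int groups = 0;
        String previous = null;
        for (String key : keys) {
            if (groups == 0 || !Objects.equals(key, previous)) {
                groups++;
            }
            previous = key;
        }
        return groups;
    }

    public static void main(String[] args) {
        // Unsorted input splits the "East" rows into two separate groups.
        System.out.println(countGroups(Arrays.asList("East", "West", "East")));
        // Sorted input keeps all rows of a region together.
        System.out.println(countGroups(Arrays.asList("East", "East", "West")));
    }
}
```

With unsorted input the same region appears as several groups, which is why an ORDER BY clause that matches the declared groupings belongs in the query.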

All report data factories described in this document can be created using the API or can be instantiated from the XML parser.

SQL-Data-Factories

The SQL-DataFactories allow JFreeReport to query JDBC-DataSources. By default, the query-string is an alias for the real SQL-Query-string.

SQLDataFactories can be defined in XML and can be referenced directly from the report definition.
The queries can be parametrized. The engine will translate named parameters into positional parameters. The actual query is always done using PreparedStatements; JDBC-Drivers that do not support prepared statements cannot be used with this implementation.

Relational databases are the most commonly used datasource for reports. JFreeReport provides a JDBC based ReportDataFactory implementation to access such databases.

To provide support for named parameters, the data factories use a special parametrisation syntax. Parameters are specified using '${parameter}'. The special characters can be escaped using the backslash.

Example: SELECT * FROM table WHERE column = ${parameter}

This query string is translated into a prepared statement and the value read from the column 'parameter' from the report parameters is used as argument for the query.
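
The translation from named to positional parameters can be sketched as follows. This is a simplified illustration, not the parser used by the data factories: the class NamedParameterQuery is hypothetical, and the sketch assumes that a backslash escapes the next character and that '${' never occurs inside SQL string literals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical result of translating '${name}' markers into '?' placeholders. */
final class NamedParameterQuery {
    final String sql;                  // query text with positional placeholders
    final List<String> parameterNames; // names in the order of their placeholders

    private NamedParameterQuery(String sql, List<String> parameterNames) {
        this.sql = sql;
        this.parameterNames = Collections.unmodifiableList(parameterNames);
    }

    static NamedParameterQuery parse(String query) {
        StringBuilder sql = new StringBuilder();
        List<String> names = new ArrayList<>();
        int i = 0;
        while (i < query.length()) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                // Escaped character: copy it literally and skip the backslash.
                sql.append(query.charAt(i + 1));
                i += 2;
            } else if (c == '$' && i + 1 < query.length() && query.charAt(i + 1) == '{') {
                int end = query.indexOf('}', i + 2);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated parameter at index " + i);
                }
                names.add(query.substring(i + 2, end));
                sql.append('?');
                i = end + 1;
            } else {
                sql.append(c);
                i++;
            }
        }
        return new NamedParameterQuery(sql.toString(), names);
    }

    public static void main(String[] args) {
        NamedParameterQuery q = parse("SELECT * FROM table WHERE column = ${parameter}");
        System.out.println(q.sql);
        System.out.println(q.parameterNames);
    }
}
```

The translated text would then be handed to Connection.prepareStatement, and each name in parameterNames would be looked up in the data row and bound with PreparedStatement.setObject at its one-based position.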

JFreeReport comes with two implementations of the ReportDataFactory. The SimpleSQLReportDataFactory expects a valid SQL-Query (optionally with parameter specifications) and executes these queries directly. The SQLReportDataFactory provides a naming mechanism so that each query is addressed using a symbolic name instead of the raw SQL string.

The SQLReportDataFactory can be fully defined using an XML-document.

All SQL-ReportDataFactories need a ConnectionProvider to gain access to a valid JDBC Connection object. We provide two implementations of the ConnectionProvider interface:

StaticConnectionProvider: Carries a user-provided connection object. The connection contained in the provider must be open and will be controlled by the data factory.
DriverConnectionProvider: A JDBC-Driver implementation is used to create a connection to the database.

Code Examples:

Defining a SQL-DataSource in XML:

<?xml version="1.0"?>
<!--
  ~ Copyright (c) 2006, Pentaho Corporation. All Rights Reserved.
  -->

<sql-datasource
        xmlns="http://jfreereport.sourceforge.net/namespaces/datasources/sql"
        xmlns:html="http://www.w3.org/1999/xhtml">
  <connection>
    <driver>org.hsqldb.jdbcDriver</driver>
    <url>jdbc:hsqldb:./sql/sampledata</url>
    <properties>
      <property name="user">sa</property>
      <property name="pass"></property>
    </properties>
  </connection>

  <!-- First query: get all regions .. -->
  <query name="default">
      SELECT DISTINCT
           QUADRANT_ACTUALS.REGION
      FROM
           QUADRANT_ACTUALS
      ORDER BY
          REGION
  </query>

  <query name="actuals-by-region">
      SELECT
           QUADRANT_ACTUALS.REGION,
           QUADRANT_ACTUALS.DEPARTMENT,
           QUADRANT_ACTUALS.POSITIONTITLE,
           QUADRANT_ACTUALS.ACTUAL,
           QUADRANT_ACTUALS.BUDGET,
           QUADRANT_ACTUALS.VARIANCE
      FROM
           QUADRANT_ACTUALS
      WHERE
          REGION = ${REGION}
      ORDER BY
          REGION, DEPARTMENT, POSITIONTITLE
  </query>
</sql-datasource>

Parsing the file can be done using the common LibLoader code:

    JFreeReport report; // created elsewhere
    Object sourceObject; // either a valid URL, File or String object

    ResourceManager manager = new ResourceManager();
    manager.registerDefaults(); // register the default loaders and factories
    Resource resource = manager.createDirectly
        (sourceObject, ReportDataFactory.class);
    ReportDataFactory dataFactory = (ReportDataFactory) resource.getResource();
    report.setDataFactory(dataFactory);

The SQLDataFactory can also be created using the API.

    DriverConnectionProvider provider = new DriverConnectionProvider();
    provider.setDriver("your.database.jdbc.Driver");
    provider.setProperty("user", "joe_user");
    provider.setProperty("pass", "secret");
    provider.setUrl("jdbc:yourdb://host/database");
    SQLReportDataFactory dataFactory = new SQLReportDataFactory(provider);
    dataFactory.setQuery("default",
        "SELECT DISTINCT REGION FROM QUADRANT_ACTUALS");
    dataFactory.setQuery("actuals-by-region",
        "SELECT * FROM QUADRANT_ACTUALS WHERE REGION=${REGION}");

Table-Data-Factories

Functions and Expressions

Reporting sometimes requires inline computations. JFreeReport uses several classes of computation functionality. Named expressions and functions are used to compute reusable values. The resulting value gets added to the data row and can be referenced from any other expression as long as the element that declared the expression is in scope.

Other expressions serve a specific purpose within the system. Grouping expressions define when a group is finished, display expressions control whether an element is printed on the output-target and so on.

Named Expressions: Usage patterns

Expressions for computation purposes generally get added to repeating sections. Expressions go out of scope as soon as the element that declared them is finished. The scoping can be used to control the visibility of computation results. Named expressions publish their computation result in the datarow. A named expression can reference itself to read the last computed value. This scheme allows some simple recursion in the form "i = i + n".
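
The self-referencing pattern can be sketched with a plain map standing in for the datarow. Everything in this example is hypothetical (the class RunningTotalDemo and the column names 'amount' and 'total'); it shows only the idea of an expression reading its own previous result and is not the JFreeReport expression API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical demo of a named expression of the form "total = total + amount". */
final class RunningTotalDemo {
    private RunningTotalDemo() {
    }

    /** Reads its own last result from the row; the first evaluation sees no previous value. */
    static final Function<Map<String, Object>, Object> TOTAL = row -> {
        Integer last = (Integer) row.get("total");
        Integer amount = (Integer) row.get("amount");
        return (last == null ? 0 : last) + amount;
    };

    /** Feeds one amount per row and publishes the expression result back into the row. */
    static List<Integer> runningTotals(int... amounts) {
        Map<String, Object> row = new HashMap<>();
        List<Integer> results = new ArrayList<>();
        for (int amount : amounts) {
            row.put("amount", amount);
            row.put("total", TOTAL.apply(row));
            results.add((Integer) row.get("total"));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runningTotals(10, 20, 5)); // prints [10, 30, 35]
    }
}
```

Because the result is published under the expression's own name, the next evaluation finds it in the row; once the declaring element goes out of scope, the value would no longer be visible.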