Pentaho Terminology Project

Thank you for contributing your terms! Read Me First...

Please follow the guidelines below when entering your terms. The Pentaho Community is encouraged to participate and contribute to this effort.

  1. Definitions are not required, but are welcome.
     
  2. New entries must be displayed in red font and must follow the formatting and style already established in the existing entries. I want to see terms in red boldface letters.
     
  3. Please tell me what Pentaho product your term relates to. I may not always know.

    Do not underestimate the importance of context! General terms like ACLs, Web Server, LDAP, cube, and so on must be defined in the context of how they apply to Pentaho products; otherwise, the general definition is worthless.

  4. Append your entry with your name.
     
  5. Don't worry about alphabetical order. Just plug your new term(s) under "Glossary of Pentaho Terms."
     
  6. Don't worry about the accuracy of your definition; we will work on getting consensus at a later time.
     
  7. Terms that are related to what we do, such as industry terms (e.g., LDAP), are also welcome. Definitions that are pulled from sources such as Wikipedia or Webopedia must be identified as such. We cannot tolerate plagiarism.
     
  8. Don't worry too much about grammar and usage; I will take care of any problems when I create the final draft.

"WAQR" and Other Abominations

I know I am inflicting pain on many of you... but from now on cease and desist from using these abominations:

  • WAQR — Correct term is Pentaho Ad Hoc Reporting
  • Pimper — Correct term is Pentaho Metadata Editor or PME
  • SWAG — Scientific Wild Assed Guess, let's not go there

Note: The term, "Hypersonic," persists despite all efforts to suppress it. The correct terminology is HSQLDB and has been HSQLDB since 2001. See #HSQLDB.

Glossary of Pentaho Terms

Pentaho BI Pillar

Also referred to as a Pentaho module. A grouping of products related by functionality. The six Pillars of the Pentaho BI Project are: Reporting, Analysis, Dashboards, Data Mining, Data Integration and BI Platform. (MBaker)

Pentaho BI Suite

The Pentaho BI Suite is a collection of Business Intelligence applications that includes Reporting, Analysis, Dashboards, Data Integration/ETL, Data Mining, and more. There are two versions of the Pentaho BI Suite - the Community Edition and the Enterprise Edition. See also, #Pentaho BI Platform. (MBaker)

Solution

A solution consists of a collection of documents (files) that collectively define the processes and activities that are the system's part in implementing a solution to a business problem. These documents include Action Sequences, workflow process definitions, report definitions, images, rules, queries, etc. A solution is represented in the file system as a top-level folder in the Solution Repository. (MBaker)

Solution Engine

The BI Server contains the engines and components for reporting, analysis, business rules, email, desktop notifications, and workflow. These components are integrated so that they can be used to solve a BI-related problem. In a solution, the behavior, inter-operation, and user interaction of each subsystem is defined by a collection of solution definition documents. These documents are XML-based and contain the definitions of business processes and of the activities that execute as part of processes, on demand, or when called by Web services. These activities include definitions for data sources, queries, report templates, delivery and notification rules, business rules, dashboards, and analytic views. (MBaker)

Solution Repository

The location where solutions and the metadata they rely on are stored and maintained. Requests made to the BI Platform to have actions executed rely on the action being defined in the Solution Repository. There are two implementations of the solution repository - the file-based solution repository and the DB-based solution repository. (MBaker)

Pentaho Deployment

Refers to the in-production installations of Pentaho software. (LanceW.)

Pentaho Enterprise Console

The subscription-only Pentaho Enterprise Console provides you with a central location from which to administer your Pentaho deployments. The console aggregates and simplifies many common administrative tasks such as setting up user authentication, monitoring performance, managing connections, testing configuration, configuring LDAP, and more. (MBaker)

Access Control Lists (ACLs)

In computer security, an access control list (ACL) is a list of permissions attached to an object. The list specifies who or what is allowed to access the object and what operations are allowed to be performed on the object. In a typical ACL, each entry in the list specifies a subject and an operation: for example, the entry (Alice, delete) on the ACL for file XYZ gives Alice permission to delete file XYZ. (excerpted from Wikipedia)
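
The (Alice, delete) example above can be sketched in Java as a map from subjects to permitted operations. This is only an illustration of the general ACL idea; the class and method names are hypothetical and are not part of any Pentaho API.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an ACL attached to one object (for example, file XYZ).
// Not a Pentaho class; all names here are assumptions made for illustration.
final class AclSketch {
    enum Operation { READ, WRITE, DELETE }

    // Each entry pairs a subject with the operations that subject may perform.
    private final Map<String, Set<Operation>> entries = new HashMap<>();

    // Adds the entry (subject, op) to the list.
    void grant(String subject, Operation op) {
        entries.computeIfAbsent(subject, s -> EnumSet.noneOf(Operation.class)).add(op);
    }

    // A subject with no matching entry is denied by default.
    boolean isAllowed(String subject, Operation op) {
        Set<Operation> ops = entries.get(subject);
        return ops != null && ops.contains(op);
    }
}
```

Granting DELETE to "Alice" and then asking isAllowed("Alice", DELETE) returns true, while any subject or operation that was never granted is refused.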

Please provide a short context for where Pentaho implements the use of ACLs. I know that Security is one area. (MBaker)

Action Definition

An XML definition specifying the parameters, resources, and settings required for the execution of a task within a single component. The Action Definition defines which component to call, what data to pass into and receive from the component, and any component-specific information required. An action definition is not a standalone document; it is a part of an Action Sequence. See also #Action Sequence. (MBaker)

Action Sequence

An XML document that defines the smallest complete task that the solution engine can perform. It is executed by a very lightweight process flow engine and defines the order of execution of one or more of the components of the Pentaho BI Platform. Action sequences are good for sequencing small, linear, success-oriented tasks like reporting and bursting. They have the ability to loop through a result set, call another action sequence, and conditionally execute components. Action Sequence documents have an ".xaction" suffix. (MBaker)

Action Sequence Editor

The Eclipse Plug-in that allows people to generate Action Sequences (scripts that run in the Pentaho BI Platform). (MBaker)

J2EE

J2EE stands for Java 2 Enterprise Edition. J2EE is a device and operating system independent Java environment for developing and deploying web-based applications. The BI Server can be deployed in various J2EE environments (presently Tomcat and JBoss). (Marc)

JNDI

JNDI stands for Java Naming and Directory Interface. It is used by Java applications as a standardized interface for accessing various directory and name services. In the BI Server, it is mainly used for locating data sources that are bound into the J2EE container for access by a Web application. (Marc)
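
As a rough illustration of how a Web application locates a container-bound data source through JNDI, the Java sketch below uses the standard javax.naming API. The "java:comp/env/jdbc/" prefix is a common J2EE convention, and the data source name is supplied by the caller; neither is taken from a Pentaho configuration. The lookup only succeeds inside a container that has bound a data source under that name.

```java
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical helper; not part of the BI Server.
final class JndiLookupSketch {
    // Builds the conventional JNDI name for a JDBC data source (assumed convention).
    static String jndiName(String dataSourceName) {
        return "java:comp/env/jdbc/" + dataSourceName;
    }

    // Asks the naming service for the data source bound under the given name.
    // Outside a J2EE container this throws a NamingException.
    static DataSource lookup(String dataSourceName) throws NamingException {
        InitialContext ctx = new InitialContext();
        try {
            return (DataSource) ctx.lookup(jndiName(dataSourceName));
        } finally {
            ctx.close();
        }
    }
}
```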

Java

Java is a computer programming language developed by Sun Microsystems. The BI Server software was developed using the Java programming language. (Marc)

Acegi

We don't use Spring Security yet (plagiarized from Mat Lowery's e-mail). (Anthony Carter)

From Spring Security version 2.0.0 on, the term "Acegi" should be purged from documentation and user interfaces. (M.Baker)

Regarding Acegi Security, we use the version of Acegi Security from when it was still referred to as Acegi Security. The change to Spring Security is a relatively recent change. I propose changing the name in our docs when we upgrade to version 2.0. (We use version 1.0.6.) To demonstrate the confusion that might arise should we change the name, consider the following links. The first one is for Acegi Security 1.0.7 Reference Documentation. The second is for Spring Security 2.0.x Reference Documentation. (M. Lowery)

HSQLDB

HSQLDB is a relational database management system written in Java. It is based on Thomas Mueller's discontinued Hypersonic SQL Project. (Anthony Carter)

The term "Hypersonic" should be purged from our documentation, file names, and user interfaces. This includes the commands that start and stop HSQLDB: start_hypersonic.bat, and so on. (MBaker)

Data Mining

The process of analyzing data from different angles and summarizing it into relevant information that can be used to increase profitability, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.

Pentaho BI Suite Enterprise Edition


A process-centric, solution-oriented platform that includes Business Intelligence (BI) components, which enable companies to develop complete solutions to BI-related issues. The Pentaho BI Suite Enterprise Edition includes Reporting, Analysis, Dashboards, Data Mining, Community Forums Interaction, Community Web Documentation (wiki), exclusive access to the Enterprise Edition online Knowledge Base, professional product documentation, and professional support. (MBaker on 11/23)

Attribute

A property or field of an object in the directory.

Authority, role, or group

In the BI Server, these three terms are synonymous. A role is a string that is associated with a user. A role is said to be granted to a user. A user is said to belong to or be a member of a role. The same role can be granted to multiple users and users can be granted zero or more roles. The BI Server uses roles to make authorization decisions.
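
The relationship described above (a role is a string granted to a user, and authorization decisions check role membership) can be sketched in Java as follows. The class and method names are hypothetical and do not reflect how the BI Server actually stores roles.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of user-to-role assignments; not a BI Server class.
final class RoleSketch {
    // A user maps to zero or more roles; the same role may appear under many users.
    private final Map<String, Set<String>> rolesByUser = new HashMap<>();

    // Grants a role (a plain string) to a user.
    void grant(String user, String role) {
        rolesByUser.computeIfAbsent(user, u -> new HashSet<>()).add(role);
    }

    // An authorization decision: is the user a member of the required role?
    boolean isMember(String user, String role) {
        return rolesByUser.getOrDefault(user, Collections.emptySet()).contains(role);
    }
}
```

A user who has been granted no roles simply fails every membership check, which matches the "zero or more roles" wording above.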

BI Server


The BI Server consists of the Pentaho BI Platform and the libraries that deliver end user BI capabilities. The server runs inside a J2EE-compliant Application Server such as Apache, JBoss AS, IBM WebSphere, WebLogic, and Oracle AS. The BI Server referred to in this document is your customized PCI. See also, #Pre-Configured Installation (PCI). The term is also used for the device that runs the application server (see BI Platform).

The BI Server is the thing that starts up when you type start-pentaho.bat (?). I would provide a better definition if I had one. This is an "inside looking out" perspective. It runs action sequences, etc., but I haven't found a good high-level way to describe the capabilities of the BI Server that would be more useful than the way we describe the capabilities of the BI Platform. (LanceW)

This definition needs to be updated. (MBaker)

End user capabilities

In the Pentaho Open BI Suite, end user capabilities include reporting, analysis, workflow, dashboards, and data mining.

LDAP User DN (Distinguished Name)

Used with LDAP authentication, this name consists of one or more strings identifying the user's assigned attributes in the LDAP Backend server and a user password.

Manager

A user with read access to relevant objects in the directory. If you're familiar with the JDBC API, a manager is analogous to a user name given along with a URL and password in a DriverManager.getConnection(url, user, password) call.

Pentaho BI Platform

The BI Platform is the core architecture and foundation of the Pentaho BI Suite. The BI Platform is composed of the libraries and compiled code that provide execution framework and services associated with logging, auditing, security, scheduling, ETL, Web Services, attribute repository, and rules engine. See also, #BI Server.

Pentaho Design Studio

The Pentaho Design Studio is a desktop Eclipse-based design environment that allows solutions, reports, queries, business rules, dashboards, and workflows to be viewed and edited graphically. The Pentaho Design Studio is a Java application that is installed on the system administrator's desktop.

Pre-Configured Installation (PCI) 


The PCI is a ready-to-use, pre-configured sample deployment that can be customized quickly and easily. The PCI deployment includes the following components: JBoss Application Server, JBoss Portal V2.0, sample JSPs that demonstrate platform component usage, sample data, sample reports and BI processes, and the users and roles used in the samples. The PCI can be modified to work with MySQL, Postgres, or Oracle for the RDBMS repository.

Provider URL

A URL usually specifying protocol (such as ldap:// or ldaps://), host name, port, and root DN. If you are familiar with the JDBC API, a provider URL is analogous to a URL given along with a user name and password in a DriverManager.getConnection(url, user, password) call.
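
To make the JDBC analogy concrete, the Java sketch below shows how a provider URL, a manager DN, and the manager's password are typically handed to the standard JNDI LDAP provider. The javax.naming.Context constants are part of the JDK; the helper class and any host, DN, or password passed to it are assumptions for illustration, not Pentaho settings.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper that only assembles connection settings; it does not connect.
final class LdapEnvSketch {
    static Hashtable<String, String> managerEnvironment(
            String providerUrl, String managerDn, String managerPassword) {
        Hashtable<String, String> env = new Hashtable<>();
        // The JDK's built-in LDAP provider.
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        // Plays the role of the JDBC URL: protocol, host name, port, and root DN.
        env.put(Context.PROVIDER_URL, providerUrl);
        // Plays the role of the JDBC user name and password.
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, managerDn);
        env.put(Context.SECURITY_CREDENTIALS, managerPassword);
        return env;
    }
}
```

Passing the resulting table to new javax.naming.directory.InitialDirContext(env) is what would actually open the connection, much as DriverManager.getConnection does for a database.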

Root DN

The distinguished name of an object to which all search bases are relative.

Search base

An LDAP directory is hierarchical. Objects in the directory can have children and those children can have children, and so on. To search for relevant sub trees in the directory, a search base is necessary. The base indicates the DN of an object from which to start searching. Search bases are relative to the root DN. Stated differently: A search base is appended to the root DN to form a search base DN.
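
A minimal Java sketch of how a search base and a root DN combine into a search base DN. The helper name and the sample DNs are hypothetical. Note that in DN string notation the most specific component is written first, so the relative search base ends up on the left of the root DN.

```java
// Hypothetical helper; the DNs used with it are illustrative only.
final class SearchBaseSketch {
    // Joins a search base (relative to the root DN) with the root DN.
    static String searchBaseDn(String searchBase, String rootDn) {
        if (searchBase == null || searchBase.isEmpty()) {
            return rootDn; // search from the root itself
        }
        if (rootDn == null || rootDn.isEmpty()) {
            return searchBase; // no root DN configured; the base is already absolute
        }
        return searchBase + "," + rootDn;
    }
}
```

For example, a search base of ou=users combined with a root DN of dc=example,dc=com gives ou=users,dc=example,dc=com.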

Search filter

A search filter is an expression that adheres to the rules specified in RFC 2254. It is always enclosed in parentheses.
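
As an illustration, the Java sketch below builds a simple equality filter such as (uid=joe) and escapes the characters that RFC 2254 requires to be escaped inside a value. The helper names and the attribute names used with them are hypothetical.

```java
// Hypothetical helper for building RFC 2254 equality filters.
final class SearchFilterSketch {
    // Escapes the characters that have special meaning inside a filter value.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Produces a filter of the form (attribute=value), enclosed in parentheses.
    static String equalityFilter(String attribute, String value) {
        return "(" + attribute + "=" + escape(value) + ")";
    }
}
```

Escaping matters when the value comes from user input: an unescaped asterisk would turn an exact match into a wildcard search.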

Server repositories

The BI Server includes three embedded repositories that store the data necessary to define, execute, and audit a solution. These include a solution repository, a runtime repository, and an audit repository. The solution repository contains the metadata that defines solutions. The runtime repository contains items of work managed by the workflow engine. The audit repository contains tracking and auditing information.

Other Glossaries

The links in this section are associated with other attempts to create terminology guides or glossaries. You may want to check these to see if your term already exists in one of these glossaries.

Glossary locations and descriptions:

  • Actions: Description of many of the Actions available for use with the Action Sequence Editor
  • Terminology: One of the first Pentaho attempts at creating a terminology document
  • Aggregation Designer: An early Aggregation Designer glossary
  • Pentaho Whitepaper: A whitepaper for version 1.6 that contains a short glossary of technical terms at the end