Code branching and concurrent development

We at Pentaho maintain numerous older versions of our software while actively developing both the next release and the ones after it.

To this end, we use the following arrangement of branches. Developers must be aware of this pattern and use it to decide which branch their pull requests and code changes should target.

The next release - a.k.a. the master branch

Every defect identified as needing a fix in the next release must be fixed here. This includes defects that also need to be fixed in older releases (more on that below).

All code changes must be made via a Git pull request (PR) and reviewed by a second developer before the code is checked into the repository. The code must also adhere to the coding, styling and architectural standards.

The numbered branches - maintenance branches

Some branches use a number as their name. These are maintenance branches for versions of the software that we have already released but to which we'd like to add fixes and enhancements for a re-release. They are typically named after the minor release from which they originate.

4.8
5.0
5.1
...
5.N

A maintenance branch is created by the build team once a new release is made from the master branch. There will be one branch per minor release, but we do not create new branches for revision releases.

General workflow with maintenance branches

If a fix needs to be backported to an older numbered branch, the developer must cherry-pick the required commits from master into the appropriate branches. For example, the general workflow looks like this:

Let's assume that the next release is 5.5. This is what's contained in the master branch.
A bug needs to be fixed in both 5.5 and 5.4.
The developer starts by sending a pull request (PR) against master.
The PR is reviewed and merged into master.
The original developer must validate the change in the master CI builds before going further.
Once it is validated, the developer cherry-picks only the commits related to this bug and sends a second PR against branch 5.4.
A second developer validates the PR against 5.4 and merges it.
The fix must now also be validated in the 5.4 CI builds.
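
The backport half of this workflow can be sketched as a small helper that builds the git commands for the second PR. This is only an illustration: the function name, the 'origin' remote and the working-branch naming scheme are assumptions, not part of our tooling. It assumes the fix has already been merged into master and validated in the master CI builds.

```python
from typing import List


def backport_commands(commits: List[str], maintenance_branch: str, topic: str) -> List[str]:
    """Build the git commands for backporting commits already merged into master.

    `commits` are the hashes of the commits related to the bug, oldest first.
    The remote name 'origin' and the working-branch name are assumptions.
    """
    if not commits:
        raise ValueError("nothing to cherry-pick")

    work_branch = f"{topic}-{maintenance_branch}"  # hypothetical naming scheme
    commands = [
        "git fetch origin",
        # Start the backport from the tip of the maintenance branch.
        f"git checkout -b {work_branch} origin/{maintenance_branch}",
    ]
    # Cherry-pick only the commits related to this bug.
    # -x records the original commit hash in the new commit message.
    commands += [f"git cherry-pick -x {sha}" for sha in commits]
    # Publish the branch so the second PR can be opened against the maintenance branch.
    commands.append(f"git push origin {work_branch}")
    return commands


if __name__ == "__main__":
    for line in backport_commands(["abc1234", "def5678"], "5.4", "fix-123"):
        print(line)
```

The helper only returns the commands as strings; opening the PR, the review by a second developer and the CI validation remain manual steps.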

Preventing self-made regressions

The workflow described above is very important. All code must first and foremost be checked into the master branch. If we were to fix an issue in a maintenance branch and, for some reason, the fix did not make it into the master branch, we would create a regression for our customers. If we fix an issue for a customer in a patch and it reappears when they migrate to a subsequent version, we have created this regression ourselves. A bug fixed in an old release but still present in the next one is a self-made regression, and this situation must not happen.

Revision releases and tags

We do not, in the general case, create a new numbered branch for revision releases. Instead, each revision release is identified by a tag within its minor release branch.

5.0
- 5.0.0.0-R
- 5.0.0.1-R
- 5.0.1.0-R
5.1
- 5.1.0.0-R
- 5.1.1.0-R

Notice that to disambiguate the tags from the branches, we use the '-R' suffix, implying a 'release'.
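
As an illustration of this naming convention, the sketch below tells tags apart from maintenance branches and maps a tag back to its minor release branch. The four-part tag shape is inferred from the examples above, and the function names are hypothetical.

```python
import re

# Shapes inferred from the examples above (an assumption, not a formal spec):
# tags have four numeric parts followed by "-R"; branches are "major.minor".
TAG_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)\.(\d+)-R")
BRANCH_PATTERN = re.compile(r"\d+\.\d+")


def is_release_tag(name: str) -> bool:
    """True if the name looks like a revision release tag such as 5.0.0.1-R."""
    return TAG_PATTERN.fullmatch(name) is not None


def is_maintenance_branch(name: str) -> bool:
    """True if the name looks like a numbered maintenance branch such as 5.0."""
    return BRANCH_PATTERN.fullmatch(name) is not None


def branch_for_tag(tag: str) -> str:
    """Return the minor release branch that a revision release tag belongs to."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a release tag: {tag!r}")
    return f"{match.group(1)}.{match.group(2)}"


if __name__ == "__main__":
    print(branch_for_tag("5.0.1.0-R"))
```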

Codenamed branches

When code is written for a future release, one that comes after whatever is scheduled for master, we must create a new development branch. We call these branches 'codenamed', as they do not possess an assigned version number yet. In most cases, there should be at most one such branch per project, and it usually has the name 'future-develop'.

These branches are created so that the team can collaborate on a common code base without checking code into the next release, either because the work is not planned for that release or because it is simply experimental.

Branch inheritance

The codenamed branches must be based off their 'previous' release. As an example:

The master branch contains code to be released in 5.4.
A developer wants to start working on new features for post-5.4.
A new codenamed branch is created. It is a fork from master.

More complex situations are possible. Take the following example, where a developer wants to work on features that are planned neither for the next release nor for the one after it.

The master branch contains code to be released in 5.4.
The project already contains a codenamed branch called future-develop. We estimate this code to ship around 6.0.
A developer wants to work on features to be released post-6.0.
We create a new codenamed branch, but this second branch must be forked off the previous codenamed branch.

Continuous integration and merging

While codenamed branches are being worked on, we must make sure that we can, at any given point in time, collapse them into the master codeline.

To make this possible, we use a weekly process which 'pulls up' the master branch into the codenamed branches. Every codenamed branch must pull the changes from the branch it was forked from.

As an example:

The master branch contains code to be released as 5.4.
There is a first codenamed branch called future-develop-1. It is based off master and merged with it each week.
There is a second codenamed branch called future-develop-2. It is based off future-develop-1 and is merged with it each week.
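
The weekly pull-up in this example can be sketched as follows. The function name and the dictionary representation are hypothetical; the sketch only computes which branch is merged into which. Processing parents before their children is a design choice of this sketch, made so that a week's master changes can reach every codenamed branch in a single pass.

```python
from typing import Dict, List, Tuple


def pull_up_order(fork_parent: Dict[str, str], root: str = "master") -> List[Tuple[str, str]]:
    """Return the weekly merges as (source, target) pairs, parents before children.

    `fork_parent` maps each codenamed branch to the branch it was forked from.
    """
    # Invert the mapping: for each branch, which branches were forked from it?
    children: Dict[str, List[str]] = {}
    for branch, parent in fork_parent.items():
        children.setdefault(parent, []).append(branch)

    merges: List[Tuple[str, str]] = []
    queue = [root]
    while queue:
        source = queue.pop(0)
        # Merge the source into every branch forked from it, then descend.
        for target in sorted(children.get(source, [])):
            merges.append((source, target))
            queue.append(target)
    return merges


if __name__ == "__main__":
    forks = {
        "future-develop-1": "master",
        "future-develop-2": "future-develop-1",
    }
    for source, target in pull_up_order(forks):
        print(f"merge {source} into {target}")
```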

We proceed this way to make sure that work done on the master branch will not cause problems in subsequent versions. It also allows fast and seamless 'collapse' and integration of future work into the master codeline.

Should a conflict arise between the master code and the codenamed code, it is the responsibility of the developer working on the future branch to take the lead and address the issue. Either the code must be changed in the codenamed branch, or the developer must get in touch with the person who designed the master code and make sure that it is changed in a way that accommodates the future code.