Problem statement

At maal we have recognized that well-written commit messages can greatly improve the code review experience and make resolving git conflicts easier, but just knowing that does not improve the situation. Complaining each time with an arbitrary set of remarks about the quality of the messages would be counterproductive; it is draining on both ends of the conversation. We needed a standard, ideally a machine-verifiable one. Among the many trends in git commits, the one that caught our attention is the Conventional Commits specification, so we went on to enforce it with our CI infrastructure.

What are conventional commits and how can they help?

Conventional commits standardise the structure of a commit message by introducing the notion of a type, a scope and some other elements. The type starts the message and signals the kind of change that the commit introduces. Just adding structure to the commits is not enough to guarantee better communication, but forcing developers to stop for a moment and select an appropriate type before pushing has proven beneficial. There are other gains we had not considered until we saw the number of tools that work with conventional commits, such as automatic change-log generation.
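
To make the structure concrete, here is a minimal sketch in POSIX shell that splits a commit header of the form type(scope): description into its parts. The parse_header helper and its regular expression are assumptions made for illustration (for instance, it only accepts lowercase types); this is not how git-cliff parses commits.

```shell
#!/bin/sh
# Hypothetical helper: split a header such as "feat(parser): add arrays"
# into type and scope. Simplified on purpose, not git-cliff's parser.
parse_header() {
    ph_header=$1
    # lowercase type, optional "(scope)", optional "!", then ": description"
    ph_pattern='^([a-z]+)(\(([^()]+)\))?(!)?: (.+)$'
    if ! printf '%s\n' "$ph_header" | grep -Eq "$ph_pattern"; then
        echo "unconventional"
        return 1
    fi
    # Group 1 is the type, group 3 the scope without its parentheses.
    ph_type=$(printf '%s\n' "$ph_header" | sed -E "s/$ph_pattern/\1/")
    ph_scope=$(printf '%s\n' "$ph_header" | sed -E "s/$ph_pattern/\3/")
    echo "type=$ph_type scope=$ph_scope"
}

parse_header "feat(parser): add ability to parse arrays"  # prints type=feat scope=parser
parse_header "ci: check commit messages"                  # prints type=ci scope=
parse_header "quickfix" || true                           # prints unconventional
```

The scope is optional, which is why the second call prints an empty scope, while a message with no type at all is reported as unconventional.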

What is git-cliff?

Git-cliff is not a tool for testing adherence to the conventional commits specification. It is actually a change-log generator that uses the git history to fill a template. However, because it distinguishes between conventional and unconventional commits, it can be used to do just that. The tool is built in Rust and uses bindings to libgit2 to fetch the commits from the repository, so you do not even need the git binary installed alongside it. As a result, the compressed docker image distributed with it is even (slightly) smaller than that of alpine/git. Since at maal we are now looking forward to generating change-logs from our commits, being able to use a single application for both verifying them and rendering the change-log would limit the number of dependencies in our development process.

Why would you check commit messages with a CI pipeline?

Commit message contents could be checked in multiple ways including:

a client-side git hook,
a server-side git hook,
an inspection during a code review,
and the CI pipeline.

When we are talking about a process that affects each and every commit, it is worthwhile to consider the costs and benefits of each approach.

If you can control or automate your contributors’ development environment, then client-side git hooks will probably be the least expensive option. These scripts can be made to execute before a commit is created, saving your remote git or CI pipeline server from unnecessary computation and teaching your developers discipline through immediate feedback. Contrary to what some might expect, they are not installed on git clone, and developers can add or remove them at any moment, so enforcing them top-down within an organisation requires additional tools. There are amazing solutions that integrate installation of the hook into local build tools, such as husky for npm and its re-implementations for other languages like cargo-husky. However, at maal we are yet to settle on a solution that would enforce a common development environment, and each new development dependency complicates onboarding, so we decided to try something else.
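
For comparison, a client-side check could be sketched as a commit-msg hook. Git passes such a hook the path to the file holding the proposed message and aborts the commit when the hook exits with a non-zero status. The check_commit_msg_file function and its simplified pattern are assumptions for illustration, not something we run at maal.

```shell
#!/bin/sh
# Sketch of the body of a .git/hooks/commit-msg script.
# Hypothetical helper: accept or reject the message stored in a file.
check_commit_msg_file() {
    cm_file=$1
    # Only the first line (the header) is inspected in this sketch.
    cm_header=$(head -n 1 "$cm_file")
    if printf '%s\n' "$cm_header" | grep -Eq '^[a-z]+(\([^()]+\))?!?: .+'; then
        return 0
    fi
    echo "commit message header is not conventional: $cm_header" >&2
    return 1
}

# An installed hook would end by checking the file git hands over:
# check_commit_msg_file "$1"
```

The feedback is immediate, but every contributor has to install the script, which is exactly the onboarding cost described above.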

Server-side git hooks, on the other hand, may require some control over the server hosting the git repository. With Gitea, for example, it is relatively easy to manage them per repository, but unless you plan to verify the commits with shell scripting alone (which we just don’t), you will need to install something on the machine or the docker image running the server. It could be a really good idea if you do not have a CI infrastructure at all. As we already have easily configurable CI tools, we decided to stick with them and not introduce complexity in a new area of the infrastructure.

An interface for managing server-side git hooks in the repository’s settings at Gitea.

Similarly, using code reviews to check something that is easily verifiable by a machine could unnecessarily increase tension between developers during the adaptation period. Unless your dev team is made up of a few very disciplined people, this will also be the most expensive approach.

Finally, with CI it is usually much easier to decouple the check from the users’ and the server’s configurations. Unlike git hooks, the CI configuration can be version-controlled. The downsides are that it will always be computationally more expensive than client-side hooks and that its feedback might not be as fast.

Considering prior practices at maal and the tools we had available, employing CI for the job was the most natural choice.

The solution we chose

At maal we are using Drone CI for our pipelines, so our whole solution will be contained in two files committed to our repository:

.drone.yml - defining CI steps executing git-cliff
cliff.toml - containing configuration for git-cliff

Writing a failing template

Since git-cliff is primarily a change-log generator, it does not have an explicit configuration option that would make it exit immediately with an error on a commit that does not follow conventional commits. What it does have is a templating system that can be hacked to throw an error. Git-cliff uses Tera for its templates, which offers a simple throw function. The most minimal template that does this is:

[changelog]
# template for the changelog body
body = """
{% for commit in commits %}
    {% if not commit.conventional %}
        {{ throw() }}
    {% endif %}
{% endfor %}
"""

However, for it to work as expected, we first need to add two options to our configuration:

[git]
# parse the commits based on https://www.conventionalcommits.org
conventional_commits = true
# keep the commits that are not conventional so that the template can reject them
filter_unconventional = false

Adding conventional_commits = true makes the tool parse the commits using the specification. Otherwise it wouldn’t try to recognise the scope part of the message.
Setting filter_unconventional to false tells the tool to put even the “unconventional” commits into the template context, which gives the template a chance to throw an error. Otherwise git-cliff would silently ignore commits that are not conventional.

With just those contents in cliff.toml, executing git-cliff in one of the repositories where we want to introduce conventional commits gives:

git-cliff
 ERROR git_cliff > Template render error:
Function call 'throw' failed  
Function `throw` was called without a `message` argument

This is the expected behaviour, considering that the repository is full of commits that do not follow the specification. The command returned with an exit code of 1, which works great for us since it will make the Drone CI pipeline reject the code. However, as is, we would be rejected forever, because the non-conventional commits will not disappear from our history.

Injecting context to the error

Before we figure out how to set up the CI so that git-cliff only checks the commits in the pull requests, we can improve the error message slightly. We can add a message to the throw call and build the string from other commit fields that are available in the template context.

[changelog]
body = """
{% for commit in commits %}
    {% if not commit.group or not commit.conventional %}
        {{ throw(message=
          "Invalid commit format of commit: "
          ~ commit.id
          ~ "\nMessage: "
          ~ commit.message
          ~ "does not adhere the conventional commits specification."
        ) }}
    {% endif %}
{% endfor %}
"""

Now when we run the program we get more details about what went wrong:

git-cliff
 ERROR git_cliff > Template render error:
Function call 'throw' failed
Invalid commit format of commit: 834b6b5df53395132e79bcd54c80a1f83778f29b
Message: Merge branch 'branch1' of ssh://remote-url into branch2
does not adhere the conventional commits specification.

Note that this requires at least version 2.2.0 of git-cliff, and in this tutorial we are using version 2.2.2.

Ignoring merge commits

By extending the error message we have discovered that merge commits might cause us some problems, which luckily we can fix by extending our cliff.toml with commit parsers:

[git]
conventional_commits = true
filter_unconventional = false
# parsing commits before they are put in the template context
commit_parsers = [
    { message = "^Merge branch", skip = true }
]

Parsers pre-process the commits before they land in the template context. They can do much more than just skip commits based on a regex pattern.
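
To see which messages the ^Merge branch pattern would catch, the sketch below uses grep -E as a stand-in for git-cliff’s own regex engine, which is a reasonable approximation only because the pattern is a simple anchored literal. The would_skip helper and the pull request message are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether a message matches the skip pattern.
would_skip() {
    if printf '%s\n' "$1" | grep -Eq '^Merge branch'; then
        echo "skip"
    else
        echo "keep"
    fi
}

would_skip "Merge branch 'branch1' of ssh://remote-url into branch2"  # prints skip
would_skip "Merge pull request #12 from feature/login"                # prints keep
would_skip "quickfix"                                                 # prints keep
```

Merge messages produced in other ways, such as the made-up pull request merge above, are not matched by this pattern and would need their own parser entry.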

After eliminating merge commits we should be left only with the bad, lazy messages that we want to fight against:

git-cliff
 ERROR git_cliff > Template render error:
Function call 'throw' failed
Invalid commit format of commit: f1ab72c9048eeea52f9a190c4adf97b0a1501e2d
Message: quickfix
does not adhere the conventional commits specification.

Ah yes.

quickfix

Clearly someone wanted to be done with writing that message ASAP.

Using git-cliff with Drone CI to check pull requests

To be fully aware of the issues with our bad past, let’s run git-cliff on the CI pipeline server. We can create a new dedicated pipeline for this task in our .drone.yml using the git-cliff image stored on the GitHub Container Registry:

kind: pipeline
type: docker
name: parse_commits

steps:
  - name: render_changelog
    image: ghcr.io/orhun/git-cliff/git-cliff:2.2.2
    pull: if-not-exists
    commands:
      - git-cliff

trigger:
  event:
    - pull_request

After adding the changes to the file, let’s commit them following the specification, open a pull request, and watch the pipeline execute.

git commit -m "ci: make sure that commit messages follow conventional commits spec"

It fails, on a different commit but with the same overall problem. Instead of feeling happy about implementing conventional commits, we have now blocked our pull requests.

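To finally escape our bad past we can use some environment variables that Drone CI sets, specifically DRONE_COMMIT_BEFORE and DRONE_COMMIT_AFTER, which hold the commit SHAs before and after the change and can be combined into a range for git-cliff.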

We have to be careful, however: during our experiments here at maal we found that DRONE_COMMIT_BEFORE is not always set, and while something like `..bc8abbf9f5d0304ad1aef059fa59e72554363f5f` will be understood by git-cliff as a range, it will not include the commit(s) we want. To work around this issue we will write a very brief script in POSIX shell (which is the only shell available in our docker image) for our pipeline step.

steps:
  - name: try_render_changelog
    image: ghcr.io/orhun/git-cliff/git-cliff:2.2.2
    pull: if-not-exists
    commands:
      - >
        if [ -z ${DRONE_COMMIT_BEFORE:-""} ];
          then git-cliff ${DRONE_COMMIT_AFTER}~1..${DRONE_COMMIT_AFTER};
          else git-cliff ${DRONE_COMMIT_BEFORE}..${DRONE_COMMIT_AFTER};
        fi
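
The fallback logic in the step above can be tried outside Drone as a plain shell function. The commit_range helper and the abbreviated SHAs are hypothetical; in the pipeline the two values come from DRONE_COMMIT_BEFORE and DRONE_COMMIT_AFTER.

```shell
#!/bin/sh
# Hypothetical helper mirroring the pipeline step: build the revision range
# passed to git-cliff from the "before" and "after" SHAs.
commit_range() {
    cr_before=$1
    cr_after=$2
    if [ -z "$cr_before" ]; then
        # No "before" SHA: start from the parent of the newest commit,
        # so the range covers only that single commit.
        printf '%s~1..%s\n' "$cr_after" "$cr_after"
    else
        printf '%s..%s\n' "$cr_before" "$cr_after"
    fi
}

commit_range "" "bc8abbf"         # prints bc8abbf~1..bc8abbf
commit_range "834b6b5" "bc8abbf"  # prints 834b6b5..bc8abbf
```

Note that when the fallback branch is taken, only the newest commit is checked, so earlier commits of a multi-commit pull request would pass unverified in that case.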

When reading this solution, please keep in mind that Drone performs its own variable substitution on the file before it executes the commands, so the ${...} expressions are resolved by Drone rather than by the shell. As a result we get the following run:

Minimising the pipeline duration

Drone starts each pipeline with a “hidden” step that clones the repository using a git docker image provided by Drone. It performs a plain git clone, which means that it pulls the whole repository with all its files just to check what is essentially a git log. We can greatly improve the performance of the pipeline by turning off this step and providing our own, which performs a git clone that pulls just the commit history and a single file: cliff.toml.

Here’s the final pipeline configuration:

kind: pipeline
type: docker
name: parse_commits

clone:
    disable: true

steps:
  - name: git_clone_history_only
    image: bitnami/git
    pull: if-not-exists
    commands:
      - git clone --filter=blob:none --filter=tree:0 --sparse --no-checkout --quiet ${DRONE_GIT_HTTP_URL} .
      - git checkout ${DRONE_COMMIT} -- cliff.toml
  - name: try_render_changelog
    image: ghcr.io/orhun/git-cliff/git-cliff:2.2.2
    pull: if-not-exists
    commands:
      - >
        if [ -z ${DRONE_COMMIT_BEFORE:-""} ];
          then git-cliff ${DRONE_COMMIT_AFTER}~1..${DRONE_COMMIT_AFTER};
          else git-cliff ${DRONE_COMMIT_BEFORE}..${DRONE_COMMIT_AFTER};
        fi

trigger:
  event:
    - pull_request

We are using several tricks to make the clone as minimal as possible:

git clone --filter=blob:none prevents the clone from downloading any file contents. We won’t need any files except cliff.toml, which we check out with the next command.
git clone --filter=tree:0 is similar to git clone --sparse in that it limits the depth of the file tree that is pulled; with a depth of 0, no trees are downloaded up front.
git clone --no-checkout prevents the clone from checking out the default branch of the repository.
git checkout ${DRONE_COMMIT} -- cliff.toml pulls the contents of the only file we care about, the git-cliff configuration, at the most recent version for a given PR.

Using those tricks we have managed to cut the pipeline duration in half - from more than 20 seconds to 11 seconds.

Making the most of git-cliff

Now that we have configured git-cliff as a commit message linter, we could extend the template and use its output for something else. We could have a release pipeline based on pull requests to a selected branch, with change-logs generated from commits. We could also make the CI post comments on pull requests with a neatly formatted summary. Basically, there’s the whole “change-log generator” aspect we haven’t covered here, but you can learn more about that from the git-cliff documentation.


How to enforce conventional commits using git-cliff with a CI pipeline?