JeB

Open Source Series: Automation

Updated: Aug 28


Hello everyone! For those of you who haven’t read the previous parts or are wondering what I have planned for the next ones:


Table of contents

  1. Intro

  2. Starting a Project

  3. Documentation

  4. Publicity

  5. Issues and PRs

  6. Automation

  7. Versions management (WIP)

Recap

In the previous chapter we talked about managing contributions, i.e. issues and PRs. Today we’re going to talk about Automation, probably one of the most important aspects of OSS project management.


Why automate

If there is anything I’ve learnt in my years of owning an open source project, it is that the less routine work you do yourself, the more time you have for actual work (like fixing bugs and developing new features).


Therefore, automate whatever you can.


Here’s how I’d like us to approach this question: let’s first examine both workflows, the non-automated and the fully automated, to see how much of your time actually goes into routine tasks. We’ll then look at how to achieve an improved workflow that leaves more time for fixing bugs.


Worst case — no automation

As you can see, when nothing is automated you do all the work. That is a lot of work for a single bug fix, and on top of that, it is work you’ll have to do every time there is a bug fix or a new feature… Now let’s take a look at another scenario.


Best case — everything is automated

In this case, you only do what you have to do: inspect the code and (sometimes) approve the pull request. Everything else happens automatically. Science fiction? No, it is called continuous integration and continuous deployment.


We’re not going to get into the details of build scripts and system-specific configurations; instead, we’ll review the tools you need to make it work, and I’ll let you decide on the specifics yourself.


Continuous integration (CI)

Continuous integration (CI) is the practice of automating the integration of code changes from multiple contributors into a single software project. The CI process consists of automated tools that assert the new code’s correctness before integration.

A very basic CI run would include a build and unit tests; however, it is not limited to these two. It might also include all kinds of static code analysis tools, linters, etc. This is where you define your standards.
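As a rough sketch, a basic CI run for a JavaScript project on GitHub Actions might look like the following (the file name, script names such as `lint`, and the Node version are assumptions; adapt them to your project):

```yaml
# .github/workflows/ci.yml -- a minimal, hypothetical CI workflow
name: CI
on:
  pull_request:
  push:
    branches: [master]
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm run lint    # static analysis / linters, if you define them
      - run: npm run build
      - run: npm test
```

The same structure translates directly to Travis CI, CircleCI and the others; only the configuration syntax differs.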


End to end tests

While builds and unit tests provide quick feedback for code changes, take relatively little time and fail fast when something goes wrong, end-to-end (E2E) tests have a special place in CI. E2E tests should cover not just the correctness of the code but also your deployment flow, package integrity and so on. I realized this myself when I accidentally published a new version of a package that didn’t contain any code. The build passed, and the unit tests were green, as were the E2E tests (which at the time were installed by linking the build output directory from the test project). Where did it fail? In the packaging phase.


A key takeaway: E2E tests should exercise your package as if it were used by a real user.


In order to achieve this I recommend the following:

  1. During your CI run, start up a local package registry. Each language/ecosystem has a few options: for Java or Scala projects, for example, there is Nexus Repository, and for JavaScript there is Verdaccio (which I use in @angular-builders).

  2. Have a separate project that makes use of your package (can reside in the same repo). The tests in this project should test your package’s functionality.

  3. Configure this project to use the local package registry.

  4. After your package is built, publish it to the local package registry (started up in your CI system).

  5. Install the latest version of the package (the one you’ve just published) into your test project.

  6. Run the tests.
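The steps above can be sketched as a CI job. This is only an illustration, not a drop-in configuration: the package name, the `e2e-tests` directory, Verdaccio’s default port 4873 and the omitted registry authentication are all assumptions.

```yaml
# Hypothetical E2E job using a local Verdaccio registry
e2e:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: npm ci && npm run build
    # 1. start a local package registry in the background
    - run: npx verdaccio & sleep 5
    # 4. publish the freshly built package to the local registry
    - run: npm publish --registry http://localhost:4873
    # 3+5. install the just-published version into the test project
    - run: npm install my-package@latest --registry http://localhost:4873
      working-directory: e2e-tests
    # 6. run the tests against the real, installed package
    - run: npm test
      working-directory: e2e-tests
```

Because the test project installs the package from a registry, exactly as a user would, a broken packaging step fails the CI run instead of reaching production.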

Not only will this test your package’s integrity and reliability, it will also save you some work when it comes to continuous deployment.


CI systems

There are plenty of CI systems with a free plan for open source projects, among them Travis CI, CircleCI, AppVeyor, GitHub Actions and others. They are all quite similar and do basically the same thing: check out your code to a virtual machine, run a script that you define (usually the build and the tests), and then report either success or failure to GitHub.

All these systems have an App for integration with GitHub, and the integration flow is pretty similar in all of them:

  1. Register on the platform.

  2. Install the corresponding App in your Github account.

  3. Configure access to the selected repositories.

  4. Create a configuration file (like .travis.yml) that defines the build matrix, the required build chain and the CI script.

  5. Push it to the master branch.

This will make your CI run on every PR and report the status to GitHub; however, it’s not enough. What you really want is to block merges to the master branch until the PR has passed all the checks.


This is done by defining branch protection rules. To define those, go to the Branches section in your repository Settings and press the Add rule button:

Then check the Require status checks to pass before merging checkbox:

As you can see, the corresponding GitHub Apps’ checkboxes already appear here, so the only thing left is to enable them.


The exact build script really depends on your ecosystem, the language your project is written in, the frameworks you’re using and more. Therefore we won’t cover it here; you’ll have to check out your CI system’s documentation to get into the specifics.


However, by now you have a pretty good idea of what CI is and how it automates your PRs, so let’s move on.


Continuous Deployment (CD)

Continuous deployment (CD) is a software release process that uses automated testing to validate whether changes to a codebase are correct and stable enough for immediate, autonomous deployment to a production environment.

In our case, the production environment is the package registry where the package becomes publicly available. This is a point-of-no-return phase: once you have published a version you cannot un-publish it, since it is publicly available and hence potentially already in use.


There are multiple strategies for continuous deployment, and the right one really depends on the project and its complexity, but in my opinion releases should be made solely from the master branch. This makes the workflow pretty simple:

  1. Each PR represents either a bug fix or a new feature.

  2. The code is tested (including E2E) before it even gets to the master.

  3. The master is a protected branch, so as long as you don’t merge failing PRs, master stays stable.

  4. Every PR merged to master triggers a master CI run, which eventually releases a new version.

This guarantees that all releases are sequential and makes it really easy to associate a certain PR with a specific version.


To automate package releases we’ll need a few things:

  1. Automatic version advancement based on commit messages.

  2. Automatic CHANGELOG updates based on commit messages.

  3. Automatic package publishing to a public package repository.

  4. Automatic release on Github.

Good news everyone: all these are already supported by semantic-release.


Bad news: you’ll have to invest some time to make it work (but eventually it pays off).


semantic-release

semantic-release automates the whole package release workflow, including determining the next version number, generating the release notes and publishing the package. This removes the immediate connection between human emotions and version numbers, strictly following the Semantic Versioning specification.
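To make the idea concrete, here is a toy sketch of the core rule semantic-release applies: the Conventional Commit messages since the last release determine the version bump. This is an illustration only, not semantic-release’s actual implementation or API.

```javascript
// Toy illustration: map Conventional Commit messages to a semver bump.
// fix: -> patch, feat: -> minor, breaking change ("!" or footer) -> major.
function releaseType(commits) {
  let type = null;
  for (const msg of commits) {
    if (/BREAKING CHANGE/.test(msg) || /^[a-z]+(\([^)]*\))?!:/.test(msg)) {
      return 'major';                       // breaking change wins outright
    }
    if (/^feat(\([^)]*\))?:/.test(msg)) type = 'minor';
    else if (/^fix(\([^)]*\))?:/.test(msg) && type === null) type = 'patch';
  }
  return type;                              // null -> nothing to release
}

function nextVersion(current, commits) {
  const [major, minor, patch] = current.split('.').map(Number);
  switch (releaseType(commits)) {
    case 'major': return `${major + 1}.0.0`;
    case 'minor': return `${major}.${minor + 1}.0`;
    case 'patch': return `${major}.${minor}.${patch + 1}`;
    default:      return current;           // e.g. only docs/chore commits
  }
}

console.log(nextVersion('1.2.3', ['fix: handle empty config']));       // 1.2.4
console.log(nextVersion('1.2.3', ['feat: add watch mode', 'fix: x'])); // 1.3.0
console.log(nextVersion('1.2.3', ['feat!: drop Node 10 support']));    // 2.0.0
```

This is exactly why the commit message format has to be enforced: the messages are the input to the release machinery.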

We won’t be covering the whole integration process here; the project has very good documentation and there is no reason to recapitulate it.
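Still, as a rough orientation, a release workflow wired to semantic-release might look something like this sketch (the file name, Node version and secret names are assumptions; the tokens semantic-release needs depend on the plugins you configure):

```yaml
# .github/workflows/release.yml -- hypothetical release job on master
name: Release
on:
  push:
    branches: [master]
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci && npm run build
      # semantic-release inspects the commits, bumps the version,
      # updates the changelog, publishes and creates a GitHub release
      - run: npx semantic-release
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
```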


I will mention a few points though:

  • Make sure you understand the Semantic Versioning specification and the Conventional Commits format before you start with semantic-release.

  • To make semantic-release work well you should enforce a certain commit message format. To do so, you can run commitlint as a husky commit-msg hook. It will enforce Conventional Commits when someone creates a local commit, but it can’t do anything about commits made directly from the GitHub web UI (which often happens when someone wants to make a quick fix to their PR). Therefore, I recommend backing it up with the commitlint GitHub Action.
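For illustration, a minimal local setup might look like this sketch (package versions and the hook layout are assumptions; check the commitlint and husky documentation for the versions you use):

```shell
# Install commitlint with the Conventional Commits ruleset, plus husky
npm install --save-dev @commitlint/cli @commitlint/config-conventional husky

# Minimal commitlint configuration
echo "module.exports = { extends: ['@commitlint/config-conventional'] };" > commitlint.config.js

# Wire commitlint into the commit-msg hook (husky v9 layout)
npx husky init
echo 'npx --no -- commitlint --edit "$1"' > .husky/commit-msg
```

With this in place, a commit like `fixed stuff` is rejected locally, while `fix: handle empty config` passes.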

After you set up semantic-release as part of your workflow, you’re pretty much done and no longer have to spend your time on these routine processes.


There is, however, one more optimization you can make.


Keeping the project up to date

If your project has no external dependencies — skip this part.


However, most projects depend on other packages. And other packages tend to change.


Keeping your project up to date with its dependencies is important, and it is time-consuming.


Luckily for us, there is a solution. In fact there are a few, such as Greenkeeper, Renovate and Dependabot. The idea is pretty much the same in all of them, so I’ll just quote Dependabot’s “How it works” section:

  1. Dependabot checks for updates: Dependabot pulls down your dependency files and looks for any outdated or insecure requirements.

  2. Dependabot opens pull requests: If any of your dependencies are out-of-date, Dependabot opens individual pull requests to update each one.

  3. You review and merge: You check that your tests pass, scan the included changelog and release notes, then hit merge with confidence.
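For a GitHub-hosted npm project, enabling Dependabot is a single configuration file. This is a minimal sketch; the schedule and ecosystem are assumptions you would adjust to your project:

```yaml
# .github/dependabot.yml -- minimal version-update configuration
version: 2
updates:
  - package-ecosystem: "npm"   # which package manager to check
    directory: "/"             # location of the manifest (package.json)
    schedule:
      interval: "weekly"       # how often to look for updates
```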

As you may have noticed, this only makes sense when you have a working CI.


Wrapping it up

If you have a fully automated CI/CD cycle, then when a new issue is opened in your OSS repository you can provide a bug fix within minutes. In fact, you can open GitHub’s mobile version on your phone, fix the buggy line or two and commit the code. The rest is done automatically, and your customers are provided with a new version right away. I myself have been able to quickly and painlessly get a fixed version to my customers multiple times.

Having great automation is not about freeing up some time for leisure; it’s about dedicating your time to the really important things and increasing your responsiveness.

In the next part we’ll discuss Versions Management, which sooner or later becomes relevant for every OSS project that has a decent amount of users.


If you enjoyed my writing, learned something new or insightful or if you just don’t want to miss the next part, make sure you’re following me on Twitter or here. Cheers!

© 2020 is a good year