Open Source Series: Version Management
Hey there, it’s been a rough couple of months, but here we are again, talking about Open Source. In this chapter (which is also the concluding part of the series) we’ll talk about version management. You’ll learn about version notations, breaking changes, back-ports and more.
Before reading this chapter, I highly recommend familiarizing yourself with the topics we discussed previously, especially the last one, on Automation.
Let’s see what Wikipedia has to say about software versioning.
Software upgrade versioning is the process of assigning either unique version names or unique version numbers to unique states of computer software.
Modern computer software is often tracked using two different software versioning schemes — internal version number that may be incremented many times in a single day, such as a revision control number, and a release version that typically changes far less often, such as semantic versioning or a project code name.
Indeed, there are multiple ways of uniquely identifying your software product version.
The most widely known way is giving it a name.
The vast majority of people on Earth, even those only indirectly connected to technology, have probably heard of Android Ice Cream Sandwich and Marshmallow, or of Mac OS Leopard, its frozen cousin Snow Leopard, and Big Sur.
Programmers have probably heard about Eclipse with its celestial bodies Luna, Mars and Photon.
All these are major versions of software products.
Though names are great for marketing, they can also be confusing. In fact, Google has dropped dessert names from its Android versions because they:
Heard feedback over the years from users that the names weren’t always intuitively understandable by everyone in the global community
And rightfully so, yet perhaps we just haven’t evolved enough to extrapolate version numbers from animal species, even though Snow Leopard is much cooler than Leopard.
Celestial bodies and candies are somewhat easier to grasp, but only if you name them alphabetically, like Android and Eclipse do.
But one thing is certain — there is no better way to determine succession than numbers.
Thus, if you name the first version of your software product “Product 1” and the second version “Product 2” it’s pretty intuitive to say that the second version is the more recent, isn’t it?
However, unlike standalone software products that don’t expose an API, software that is consumed by other software (like the majority of OSS products) needs a better versioning scheme than just a sequence of numbers.
For example, if we used a simple number sequence for versioning, how would the user distinguish between a bug fix and a change that breaks the existing API?
The answer is…
Semantic Versioning (also known as SemVer) is a widely adopted versioning scheme that uses a sequence of three numbers in the following format: MAJOR.MINOR.PATCH. The rules are simple: given a version number MAJOR.MINOR.PATCH, increment the:
MAJOR version when you make incompatible API changes
MINOR version when you add functionality in a backwards compatible manner
PATCH version when you make backwards compatible bug fixes.
Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
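The three rules above can be sketched in a few lines of Python. This is only an illustration, not part of any real tool: the names Version, parse and bump are made up for this example, and pre-release and build labels are deliberately ignored. Note that the lower-order numbers reset to zero on a bump, as the SemVer specification requires.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A plain MAJOR.MINOR.PATCH version (hypothetical helper type)."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH'; pre-release/build labels are not supported here."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def bump(version: Version, change: str) -> Version:
    """Return the next version for a 'major', 'minor' or 'patch' change."""
    if change == "major":   # incompatible API change: reset minor and patch
        return Version(version.major + 1, 0, 0)
    if change == "minor":   # backwards compatible feature: reset patch
        return Version(version.major, version.minor + 1, 0)
    if change == "patch":   # backwards compatible bug fix
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change kind: {change!r}")
```

For instance, bump(parse("2.2.1"), "minor") gives 2.3.0, while a "major" change gives 3.0.0.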
SemVer provides a clear and concise way to communicate the changes in your software product to your users.
But most importantly, it is widely adopted by all kinds of package managers and build tools (like npm and Maven), which allows users to depend on a range of versions rather than on one specific version.
For example, specifying the version range ^2.2.1 rather than the explicit version 2.2.1 lets the user accept any backwards compatible bug fixes or new features released on top of version 2.2.1, up to (but not including) 3.0.0.
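To make the caret range concrete, here is a minimal Python sketch of how such a range could be checked. It is not npm’s implementation: the function names are invented for this example and pre-release versions are ignored. It follows npm’s documented rule that a caret range allows changes that do not modify the left-most non-zero number.

```python
def parse(text):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release labels)."""
    return tuple(int(part) for part in text.split("."))


def caret_upper_bound(base):
    """Exclusive upper bound of the caret range that starts at `base`."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)   # ^2.2.1 -> anything below 3.0.0
    if minor > 0:
        return (0, minor + 1, 0)   # ^0.2.3 -> anything below 0.3.0
    return (0, 0, patch + 1)       # ^0.0.3 -> anything below 0.0.4


def satisfies_caret(version, base):
    """True if `version` falls inside the range ^base."""
    low = parse(base)
    # Tuples compare element by element, which matches version ordering here.
    return low <= parse(version) < caret_upper_bound(low)
```

With this sketch, satisfies_caret("2.9.0", "2.2.1") is True, while satisfies_caret("3.0.0", "2.2.1") is False.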
That said, build tools and package managers rely on a contract between a user and a package owner, a contract that is defined by SemVer. That means the responsibility is yours: you’re the one who defines what a breaking change is and what a minor change is. You can accidentally release a breaking change as a bug fix (patch version) and it will break builds that depend on a range.
Breaking builds is a horrible thing to do, so I’d recommend you use semantic-release with a predefined message format, together with a commit format enforcement tool. We covered it in the Automation chapter, so if you haven’t read it yet, now is the time.
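To give a feel for how a commit message format can drive the version number, here is a simplified Python sketch. It is not semantic-release’s actual code, and the real rules are configurable. It assumes a Conventional Commits style format, where a "fix" commit means a patch, a "feat" commit means a minor, and a "!" after the type or a "BREAKING CHANGE:" footer means a major release. The function names are made up for this example.

```python
import re

# Header shape assumed here: type, optional (scope), optional "!", then ": subject"
HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<bang>!)?: .+")
PRIORITY = {"major": 3, "minor": 2, "patch": 1}


def release_type(message):
    """Classify one commit message as 'major', 'minor', 'patch' or None."""
    lines = message.splitlines()
    header = HEADER.match(lines[0]) if lines else None
    if header is None:
        return None  # message does not follow the assumed format
    footer_breaks = any(line.startswith("BREAKING CHANGE:") for line in lines[1:])
    if header["bang"] or footer_breaks:
        return "major"
    if header["type"] == "feat":
        return "minor"
    if header["type"] == "fix":
        return "patch"
    return None  # docs, chore, etc. do not trigger a release in this sketch


def next_release(messages):
    """Pick the highest-impact release type among all commits, or None."""
    kinds = [kind for kind in map(release_type, messages) if kind]
    return max(kinds, key=PRIORITY.__getitem__, default=None)
```

The point of the design is that the highest-impact commit wins: one breaking commit among many fixes still forces a major release, which is exactly what protects users who depend on a range.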
You can find more info about Semantic Versioning on the official website semver.org.
Now that we’ve learned how to identify breaking changes, let’s talk about introducing them.
Breaking changes are changes to your public API that remove, rename or otherwise incompatibly change your contracts with the user.
Ideally you would maintain backward compatibility in your code and wouldn’t introduce any breaking changes ever.
But then you wake up to harsh reality.
Software evolves and so does your code. The needs of the users change and so does your API. You grow as a developer and so does your product. Therefore, especially as an open source developer who doesn’t get paid for their work, you just can’t afford to maintain all the legacy code that exists in your project. Sometimes, you need to get rid of it.
The question is how?
As always, it is a tradeoff. You know better than anyone how a given change impacts the users. You don’t have to maintain backward compatibility at any cost, nor do you have to implement all the new features in every old version. But it is certainly something that you should take into account.
If the migration cost is relatively low for the user then it’s fine to make a breaking change and it’s quite reasonable to not support this feature in older versions. However, if the migration cost is high and the vast majority of users cannot afford this effort, you should probably consider making this change backward compatible at first and releasing a deprecation warning.
A deprecation warning is often released together with a new API, while the old API is still supported. This way the users have time to migrate, and once they do, in the next major version, the deprecation warning and the old API can both be safely removed.
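As a rough illustration of this pattern, here is a minimal Python sketch that uses the standard warnings module. The functions get_items (old API) and fetch_items (new API) are hypothetical names invented for this example.

```python
import warnings


def fetch_items(limit=10):
    """New API (hypothetical): returns the first `limit` items."""
    return list(range(limit))


def get_items(limit=10):
    """Old API (hypothetical), kept as a thin wrapper until the next major version."""
    warnings.warn(
        "get_items() is deprecated and will be removed in the next major "
        "version; use fetch_items() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line, not this one
    )
    return fetch_items(limit)
```

Existing callers of get_items keep working and see a warning that tells them what to use instead; the wrapper and the warning can then be deleted together in the next major version.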
In any case whenever you introduce a breaking change make sure you have a migration guide that has step-by-step instructions for the migration.
In addition, as an act of courtesy, it would be very nice of you to give users time to prepare for a breaking change, especially if it doesn’t come with a grace period (a period in which both old and new APIs are supported). Give them a little heads-up that explains the breaking change, the reasoning behind it and the expected time frame. It can be a tweet, a blog post or even a new minor version of your product with a deprecation warning.
Remember that while a breaking change is essentially a negative experience, a sudden breaking change is an extremely negative experience.
We can divide breaking changes into two categories: non-deterministic and deterministic. Non-deterministic changes are the ones in which you can’t predict the outcome of the migration effort, for example when you completely remove a certain portion of an API. In this case it’s up to the user to decide whether they want to replace it with a third-party library, implement it themselves or drop the functionality as well.
Deterministic changes are the ones where, given code X and user input I, you can mechanically transform the code into code Y. Examples are changing a function name or an import statement.
If you introduce a deterministic breaking change, you can write an automation that changes the user’s code base and adjusts it to the new API. With this automation in place you don’t have to worry as much about backward compatibility and detailed migration guides. You give users a way to upgrade their code with almost no effort on their side, which is crucial in software updates.
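The simplest deterministic migration, a function rename, can be sketched in Python as a plain text transformation. This is a deliberately naive illustration with a made-up function name (rename_function) and made-up API names. Because it works on text rather than on a syntax tree, it will also rename matches inside strings and comments, so a human should still review the result.

```python
import re


def rename_function(source, old, new):
    """Replace whole-word occurrences of `old` with `new` in source text."""
    # \b keeps us from touching longer identifiers such as getItemsFast
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    # A function is used as the replacement so `new` is inserted literally
    return pattern.sub(lambda _match: new, source)


before = "import { getItems } from 'lib';\ngetItems(3); getItemsFast();"
after = rename_function(before, "getItems", "fetchItems")
```

Here both the import and the call are rewritten to fetchItems, while getItemsFast is left alone.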
However, there is an inherent tradeoff here as well. Writing code takes time, just as writing a migration guide does. And naturally, writing code that migrates a complex code flow into a new API will take more time than writing code that replaces a function name with a new one. Sometimes you just can’t afford this kind of effort.
In case you do decide to go for it, there are tools that can help you achieve what you want.
The most widely known and language-agnostic one is codemod by Facebook.
codemod is a tool/library to assist you with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention.
There are also more sophisticated tools that operate on an AST (abstract syntax tree) and can handle more complicated tasks than just Find & Replace. One example is jscodeshift, another Facebook library (JS/TS specific). Another is code-migrate, a tool (again JS/TS specific) that lets you write a guided migration relatively easily and gives the user nice CLI-based prompts.
Some big OSS projects even have a solution of their own. One example of such a solution is Angular schematics, a template-based code generator that supports complex logic.
Automatic code migration can be published as a separate package (like my-cool-oss-migrate-v4-v5) and mentioned as a step in the migration guide. Alternatively, the migration can be part of the major version that contains the breaking changes and be executed in the user’s code base when this version is installed. The choice is yours.
Another common practice is back-porting important changes to previous versions. For example, a critical bug has been found after a major release (with a breaking change) but it also applies to a previous version.
In this case you can’t expect your users to perform a tedious migration because of a single bug. On the other hand, checking out the older revision, implementing the fix on top of it and releasing it as a patch for an older version can be cumbersome.
The solution: have a protected branch per major version.
Every time you plan to release a major version, you create a branch from the main branch and name it c.x.x, where c is the current major version number. You make all such branches protected (just like the main branch) so that you don’t accidentally break them. Then, any time you have to back-port a feature or a bug fix from a newer major version, you either reimplement it on this branch or (if possible) cherry-pick the commits from the main branch.
In addition, a strategy worth mentioning is having a separate branch for the next major version as well (as opposed to only having branches for previous major versions). This is usually relevant for large-scale projects (like Webpack or Babel) that have a lot of changes in every new major version. Having a separate branch for the upcoming major version allows working on it and publishing it for testing, while still keeping the most relevant version (and working on it) in the main branch. Once the new major version is published, its branch becomes the main branch and a new branch is created for the next major version.
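The branch naming and the cherry-pick step can be sketched as a small Python helper that only builds the git commands without running them. The function names and the commit hash below are placeholders invented for this example. The -x flag asks git cherry-pick to record the original commit hash in the new commit message, which helps trace a back-port later.

```python
from typing import List


def maintenance_branch(version: str) -> str:
    """Map a version such as '4.3.1' to its protected branch name '4.x.x'."""
    major = version.split(".")[0]
    return f"{major}.x.x"


def backport_commands(version: str, commits: List[str]) -> List[List[str]]:
    """Build (not run) the git commands that back-port `commits` to an old major."""
    commands = [["git", "checkout", maintenance_branch(version)]]
    # One cherry-pick per commit, in the order given
    commands += [["git", "cherry-pick", "-x", sha] for sha in commits]
    return commands


# "abc123" is a placeholder hash, not a real commit
plan = backport_commands("4.3.1", ["abc123"])
```

Each entry of plan could be passed to subprocess.run as is; keeping the plan as data first makes it easy to review before anything touches the repository.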
This chapter is about version management, but it is also the concluding part of the series. So I’d like to share with you one thing that you should always keep in mind while owning an open source project.
Listen to your users
It might sound counter-intuitive but that is the truth — you’re not the only one who defines the road map, users define it too. In fact, users define most of it. If you own an open source project then you do it to help others, not yourself.
Have multiple channels for feedback. There are users that only have a quick question to which you can provide an answer within a second. There are potential contributors that would like to discuss the roadmap but don’t want to do this in public. Give them a way to contact you. Provide a link to Slack or Discord, share your Twitter account etc. The more channels the better.
If you enjoyed my writing, learned something new or insightful, or if you just don’t want to miss my next article, make sure you’re following me on Twitter or here. Cheers!