Canary deployment: improving your software development cycles
As software development teams adopt Agile methodologies, faster release cycles are becoming the norm. However, releasing more often gives problems more chances to reach production. That's where canary deployment comes in: a strategy that lets you test new features and changes in production on a small subset of users before rolling them out to everyone.
What is Canary Deployment?
Canary deployment is a strategy used by software development teams to reduce the risk of releasing new features or changes to production. It involves releasing the new changes to a small group of users (the "canary group") first, before rolling out the changes to everyone else. This approach lets you test new features in production with limited exposure and catch issues before they affect your entire user base.
Canary deployment is named after the practice of using canaries in coal mines. In the past, coal miners would bring a canary bird with them into the mines to detect the presence of dangerous gases. If the canary stopped singing or died, it was a sign that there was a buildup of toxic gas and that the miners needed to evacuate immediately. Similarly, canary deployment is used to detect problems in a new software release before it reaches all users.
Benefits of Canary Deployment
There are several benefits to using canary deployment in your software development cycle:
- Risk Reduction: Canary deployment reduces the risk of releasing new features or changes to production by limiting the initial release to a small group of users. This helps you catch any issues early and prevents them from affecting your entire user base.
- Early Feedback: By releasing to a small group of users first, you can get feedback on the new features or changes before rolling them out to everyone else. This can help you identify any potential issues and make adjustments quickly.
- Improved Quality: Canary deployment can improve the overall quality of your software by catching issues early and allowing you to make adjustments before the full release.
- Faster Rollouts: Because each release carries less risk, you can merge and release code faster and more safely.
How to Implement Canary Deployment?
Implementing canary deployment involves the following steps:
- Identify the Canary Group: Determine the group of users who will receive the new features or changes first.
- Release to the Canary Group: Release the new features or changes to the canary group first.
- Monitor and Collect Feedback: Monitor the canary group closely and collect feedback on the new features or changes.
- Adjust and Roll Out: Use the feedback collected from the canary group to make any necessary adjustments, and then roll out the changes to everyone else.
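The monitoring step above can be sketched as a simple comparison between the canary group and everyone else. This is only a minimal illustration: the `Metrics` class, the `canary_decision` function, and both thresholds are hypothetical names and values chosen for the example, and a real setup would typically look at more than error rates.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Request counts observed for one group during a monitoring window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Avoid dividing by zero when a group has received no traffic yet.
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(canary: Metrics, control: Metrics,
                    min_requests: int = 1000,
                    max_extra_error_rate: float = 0.01) -> str:
    """Return "wait", "rollback" or "promote" for the current window.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if canary.requests < min_requests:
        return "wait"  # not enough canary traffic to judge yet
    if canary.error_rate - control.error_rate > max_extra_error_rate:
        return "rollback"  # canary is noticeably worse than the control group
    return "promote"  # canary looks healthy: roll out to everyone else
```

For example, `canary_decision(Metrics(2000, 100), Metrics(10000, 50))` compares a 5% canary error rate with a 0.5% control error rate and returns `"rollback"`.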
Best Practices for Canary Deployment
While canary deployment can be a valuable strategy for software development teams, there are several best practices to keep in mind to ensure that it is implemented effectively. The most crucial part is selecting the right canary group.
The best way to select a canary group is to randomly pick users. This part is easier said than done and can completely ruin your efforts if not done correctly. When randomly assigning users to the canary group you need to consider:
- The random distribution must be uniform. This ensures that you do not have a biased population. For instance, if you test a feature for your running app on half of your users using the timestamp at which they created their account (morning = canary group, afternoon = control group), you might introduce a bias. Users who created an account in the morning are probably very different from users who created an account in the afternoon after work.
- The random distribution should not be correlated between features. If you take the user's ID modulo 100, for instance, a user with ID xx00 will be in all canary groups while a user with ID xx99 will be in none. This means that all features are now correlated and you cannot tell which feature is causing a drop in a given metric.
- The distribution should be sticky. A given user should always be part of the canary group (or always outside it), and this should not change when the user refreshes the page or restarts the app. This avoids flickering.
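One common way to get all three properties is to hash the user ID together with a per-feature key, instead of taking the raw ID modulo 100. The sketch below is an illustration under assumptions, not how any particular tool works: the function name `in_canary` and its parameters are hypothetical, and it assumes user IDs are available as strings.

```python
import hashlib


def in_canary(user_id: str, feature_key: str, percentage: float) -> bool:
    """Return True if this user is in the canary group for this feature.

    Hashing the feature key together with the user ID gives each feature
    its own split, so being in one canary group says nothing about another.
    """
    digest = hashlib.sha256(f"{feature_key}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket in 0..9999,
    # which gives a granularity of 0.01%.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percentage * 100


# The same inputs always give the same answer, so assignment is sticky
# without storing anything per user.
first = in_canary("user-42", "new-checkout", 10)
second = in_canary("user-42", "new-checkout", 10)
print(first == second)
```

Because the result depends only on the user ID and the feature key, it stays the same across page refreshes and app restarts, and raising `percentage` only adds users to the canary group without removing anyone already in it.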
Because these properties are easy to get wrong, it is often simpler to delegate group assignment to a third-party tool such as Tggl.io.
Conclusion
Canary deployment is a powerful tool for software development teams looking to reduce the risk of releasing new features or changes to production. By testing new features with a small group of users first, you can catch any issues early and make adjustments before the full release. This approach can help you improve the overall quality of your software, reduce risk, and speed up the rollout process.