Continuous Deployment: How To Make It Safer


Long gone are the days of the ‘big bang’ release.

At Raygun we deploy our code continuously – around six times per day.

This means that every change we make to our code that passes the automated tests is deployed to production automatically. Continuous deployment isn’t hard if you have the right people, processes and tools in place. Today, we have our Chief of Engineering talking about our continuous deployment process which allows Raygun to consistently and safely deploy multiple times a day (without disasters).

Please feel free to draw inspiration from our process and ask questions in the comments section!

Raygun’s Continuous Deployment Process

With a multitude of tools now available, moving to a continuous deployment model has never been easier.

What I plan to cover here is how we do continuous deployment and how we use Raygun to support the process.

(See how the Raygun Platform can improve your continuous deployment process – try free for 14 days)

How We Deploy Small Changes With Zero Perceived Downtime

The Raygun crash reporting software is a complex tool built to support many different programming languages and scale with the number of errors our customers produce.

We use a lot of different tech ourselves to deliver what may seem like a simple process.

Hidden behind a simple web interface are a Node.js public API and various workers that consume and classify errors, process minified JavaScript stack traces using source maps, and symbolicate iOS stack traces with dSYM files (to name a few).

(As a side note I think the architecture and infrastructure of Raygun would be a good follow up blog post, so feel free to leave a comment below if you’d like to hear more about it.)

To make continuous deployment easier we’ve broken down Raygun into small self-contained components. These communicate with each other using queues and APIs. Each component can be shut down, updated and, when restarted, pick up where it left off. This is important as it allows us to make small changes to the code and deploy those changes quickly with zero perceived downtime.
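To make the "pick up where it left off" idea concrete, here is a minimal sketch of how a queue-driven component can be stopped and replaced mid-stream. It is not Raygun's actual code: `DurableQueue` and `Worker` are hypothetical names, and the in-memory queue stands in for whatever real message broker sits between the components. The important detail is that a message is only removed after it has been processed, so anything left behind is handled by the updated worker.

```python
from collections import deque


class DurableQueue:
    """Hypothetical stand-in for a real broker: messages stay queued until acknowledged."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def peek(self):
        # Return the next message without removing it, or None if the queue is empty.
        return self._messages[0] if self._messages else None

    def ack(self):
        # Remove the message only once the worker has finished with it.
        self._messages.popleft()


class Worker:
    """Hypothetical component that processes messages one at a time."""

    def __init__(self, queue, version):
        self.queue = queue
        self.version = version
        self.processed = []

    def run(self, max_messages):
        for _ in range(max_messages):
            message = self.queue.peek()
            if message is None:
                break
            self.processed.append(f"{self.version}:{message}")
            self.queue.ack()  # ack last, so a crash mid-message leaves it queued


queue = DurableQueue()
for n in range(5):
    queue.publish(f"error-{n}")

old_worker = Worker(queue, "v1")
old_worker.run(max_messages=2)    # v1 handles two messages, then is shut down

new_worker = Worker(queue, "v2")  # the updated build starts against the same queue
new_worker.run(max_messages=10)   # and drains whatever v1 left behind

print(old_worker.processed)
print(new_worker.processed)
```

Because the state lives in the queue rather than in the worker, swapping `v1` for `v2` loses nothing; producers can keep publishing while the component is down.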

Where Our Process Starts

Our process starts with a developer creating a new branch from master (btw we use git for source control). Every change is made in its own branch and no-one is allowed to make changes directly in master. Master must be deployable at any point in time.

When a code change is pushed to the remote, our continuous build process immediately kicks in.

Our builds are handled by TeamCity, which is responsible for building the software and running automated unit tests. If this completes successfully, it also creates deployment packages and publishes them to the internal NuGet feed.

For our deployment process, we use the automated deployment tool Octopus Deploy. Every component of Raygun is configured as a project in Octopus and can be deployed individually with the click of a few buttons.

Octopus pulls the required packages from the TeamCity feed and installs the component onto the appropriate servers in the selected target environment. If you don’t know about Octopus Deploy and you are building software in .NET then I would highly recommend reading up more about this tool on their website.

We have multiple environments that we can use for internal and beta testing. Octopus can deploy to any one of these environments, allowing developers to test not only their code change but also the deployment process in a safe environment.

When ready, the branch is merged into master, and a new build is produced automatically and then deployed to our beta environment. From there it gets taken to production. Having a set of continuous build and automated deployment tools is key to making this a simple process. If you’d like to make it scalable and repeatable I highly recommend finding a core set of tools.

It’s Not All Machines, We Want Humans Involved Too

Our deployment process is not fully automatic…

Taking some of the automatic parts out can help to mitigate the risk of the continuous deployment model. Basically our process is automatic up to the point of pushing the deploy button.

And we want a human to control that.

Specifically we want the developer responsible for those changes to be the one pushing the button.
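As a rough illustration of that rule, the check below sketches what "automatic up to the button" could look like if you wrote it down as code. Everything here is an assumption made for the example: the `Release` fields and the `blocking_reasons` function are hypothetical names, not part of TeamCity, Octopus Deploy or Raygun.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    version: str
    author: str             # developer responsible for the change
    tests_passed: bool      # outcome of the automated build and unit tests
    deployed_to_beta: bool  # whether the build has already gone to beta


def blocking_reasons(release, requested_by):
    """Return the reasons a production deploy should not go ahead (empty list means go)."""
    reasons = []
    if not release.tests_passed:
        reasons.append("automated build or tests failed")
    if not release.deployed_to_beta:
        reasons.append("build has not been deployed to beta")
    if requested_by != release.author:
        reasons.append("deploy must be triggered by the change author")
    return reasons


# Made-up release used only to exercise the check.
release = Release("2.7.1", author="alice", tests_passed=True, deployed_to_beta=True)

print(blocking_reasons(release, requested_by="bob"))
print(blocking_reasons(release, requested_by="alice"))
```

The automated steps supply the first two facts; the last condition is the human part, and it is deliberately the only one a machine cannot satisfy on its own.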

Why?

Because it makes them more involved in the process. They get to see their change going live, they get to be the first one to test it. With tools like Slack they are immediately available if anything goes wrong…

…which at some point it will, which is why our error and crash reporting tools are so valuable…

…We Use Raygun Too…

When moving to a continuous deployment process, one of the most important things to do is to measure and monitor everything and make it visible to the whole team.

With a continuous deployment process, small code changes and features are constantly going into production.

This usually means less QA and regression testing is being done. So when something goes wrong, or performance is degraded, you need to know about it immediately and you need to know when it started.

With Raygun integrated in your software you will immediately be alerted to errors being produced in your software.

You will get detailed information about the error, including the stack trace, which user encountered the error and any context and environment related data that’s available. This information helps developers locate the source of an error and helps you decide whether you need to roll back to a previous version or fix the bug and deploy a new version.

With Raygun Real User Monitoring you can monitor the performance of your website and determine if your users are having a good or poor experience. With lists of the slowest pages and assets you can quickly determine how your latest changes are performing and even drill into a user’s session and monitor their experience in real time.

To make sure your team is notified as soon as an error occurs you should integrate Raygun with your team’s communication tools. Raygun will send you and your team an email when an error occurs. However, for more effective notifications you should add an integration.

Tools like Slack integrate with Raygun, allowing diagnostic details to reach developers quickly.

(If you don’t use Slack we also offer HipChat, PagerDuty, VictorOps and many other integrations that alert you to new errors in your software. If we don’t offer an integration for a tool you use then let us know as we may be able to help).

With Raygun deployment tracking you can see when an error started occurring.

This allows you to decide which version to roll back to or, if the fix is simple, to deploy a new version instead. The deployment tracking feature also allows you to see if a version has introduced a regression bug, raising errors you thought had previously been fixed.
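The reasoning behind that decision is simple enough to sketch: line up the time an error was first seen against your deployment history, and the most recent deployment before it is the likely culprit. The snippet below is only an illustration of that logic, not how Raygun implements deployment tracking; the version numbers and timestamps are made up, and `suspect_and_rollback` is a hypothetical helper.

```python
from bisect import bisect_right
from datetime import datetime

# Made-up deployment history, oldest first: (version, deployed_at).
deployments = [
    ("1.4.0", datetime(2024, 3, 1, 9, 0)),
    ("1.4.1", datetime(2024, 3, 1, 11, 30)),
    ("1.4.2", datetime(2024, 3, 1, 14, 15)),
]


def suspect_and_rollback(deployments, first_seen):
    """Return (suspect_version, rollback_version) for an error first seen at first_seen."""
    times = [deployed_at for _, deployed_at in deployments]
    # Index of the latest deployment made at or before the first occurrence.
    index = bisect_right(times, first_seen) - 1
    if index < 0:
        return None, None  # the error predates every known deployment
    suspect = deployments[index][0]
    rollback = deployments[index - 1][0] if index > 0 else None
    return suspect, rollback


first_seen = datetime(2024, 3, 1, 12, 5)
print(suspect_and_rollback(deployments, first_seen))
```

With small, frequent deployments the window between two versions is short, which is what makes this kind of lookup useful: the suspect release contains only a handful of changes.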

As an added bonus, if you use GitHub and have the GitHub integration enabled then you can even see which commits were included with the deployment, further helping you narrow down the cause of a bug.

In Conclusion

Continuous deployment isn’t hard if you have the right people, processes and tools in place.

You need to:

  • keep your changes small and easily deployable.
  • have a repeatable, largely automated build and deployment process.
  • have tools in place to monitor performance and alert you to problems.

But most of all you need people who support the process.

Raygun can help you along the road to a successful continuous deployment process.

But having people who buy in to the strategy, collaborate and support each other when things go wrong is the key to success!
