If you’re tired of users finding errors before you do, a library called Scientist improves canary testing by taking users out of the equation.

When we release new code, we test it with as much rigor as we can, but it's very hard to replicate the full range of data, workflows, and environments in which users run our software.

Because of this, many organizations recruit beta testers (or even alpha testers) and expose new features to them via feature flags.
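As a minimal sketch of that idea (the user list and routine names here are hypothetical, not part of any particular feature-flag library), gating a new routine behind a flag might look like this:

```csharp
using System;
using System.Collections.Generic;

public static class FeatureFlagDemo
{
    // Hypothetical set of users enrolled in the beta program.
    private static readonly HashSet<string> BetaUsers = new HashSet<string> { "alice" };

    public static string Analyze(string user)
    {
        // Route beta users to the new routine; everyone else gets the legacy one.
        return BetaUsers.Contains(user) ? NewAnalyze() : LegacyAnalyze();
    }

    private static string LegacyAnalyze() => "legacy result";
    private static string NewAnalyze() => "new result";

    public static void Main()
    {
        Console.WriteLine(Analyze("alice")); // beta user -> new routine
        Console.WriteLine(Analyze("bob"));   // regular user -> legacy routine
    }
}
```

The catch, as the rest of this article explores, is that the beta users behind that flag are the ones who hit any bugs in the new routine.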

This makes early users effectively canary testers, like the canaries coal miners brought into the mines to get early warning of hazardous gases.

A feature flag routing beta users to a new routine before other users get access to it

The problem is that we never want a user to become a "dead canary," encountering a bug we could have spared them from had something else found it first.


The Scientist library offers a solution. Using Scientist, the new code is deployed alongside the old, and Scientist runs an experiment in which it executes the legacy code as well as an experimental new version, then compares the two results.

Regardless of whether the results of the two methods match, the legacy version's result is returned to the caller, shielding the user from any issues caused by the new version.

Scientist performs a canary test by running the old and new routines: it returns the legacy routine's result, but also calls the new routine to compare the two

This means that errors in the new routine can be found without users ever seeing them. If a user's data exposes an error or logic gap in the new version of the code, the user remains completely unaware of it and keeps using the software as it worked before the update.

Instead, the results of the comparison are sent to a result publisher, which can log to a number of destinations, allowing the development team to fix the new routine before going live with the feature.

Scientist insulates the user from being a canary tester by always returning the legacy result: if there is a mistake in the new routine, it gets logged to a result publisher and the user gets the working old version

This allows us to rewrite or expand portions of our code, ship the new version alongside the old, and then compare the implementations against live production data. Once we've collected enough data to be confident in the new routine, we can issue a new release that removes Scientist and the legacy routine from the equation.

Using Scientist

An example in C# using Scientist .NET is listed below:

var legacyAnalyzer = new LegacyAnalyzer();
var newAnalyzer = new RewrittenAnalyzer();

var result = Scientist.Science<CustomResult>("Experiment Name", experiment =>
{
    experiment.Use(() => legacyAnalyzer.Analyze(resume, container));
    experiment.Try(() => newAnalyzer.Analyze(resume, container));
    experiment.Compare((x, y) => x.Score == y.Score);
});
In this example, we call Scientist.Science and declare that we expect a result of type CustomResult back from the invocation. From there, we give Scientist a name for the experiment (available to the result publisher) and tell it to Use the legacy implementation; this method's return value is always what gets returned to the caller. We can then declare one or more candidate implementations to compare against via the Try method. Finally, we can define a custom way to Compare the two results for equality.

Note that Scientist .NET will run the different routines in random order.

Testing with Scientist

Scientist can also be used inside unit tests to compare a legacy and a refactored way of doing things. In such cases, you wouldn't want the refactored version to exhibit different behavior, so you can rely on a custom result publisher to fail the unit test.

Scientist performing a canary test and reporting mismatches to a result publisher, which can report errors to a testing framework and fail a unit test if logic diverges from the legacy implementation

Result Publishing

The results of the experiment are then published to the result publisher. A custom result publisher used for unit testing is defined below for reference:

public class ThrowIfMismatchedPublisher : IResultPublisher
{
    public Task Publish<T, TClean>(Result<T, TClean> result)
    {
        if (result.Mismatched)
        {
            var sb = new StringBuilder();
            sb.AppendLine($"{result.ExperimentName} had mismatched results: ");
            foreach (var observation in result.MismatchedObservations)
            {
                sb.AppendLine($"Value: {observation.Value}");
            }
            throw new InvalidOperationException(sb.ToString());
        }
        return Task.CompletedTask;
    }
}

This allows Scientist to run in such a way that mismatches throw exceptions, which is useful only in unit-testing scenarios (in production you'd want to log to a file, an error-reporting service, a database, etc.).

This publisher can then be provided to Scientist by setting Scientist.ResultPublisher = new ThrowIfMismatchedPublisher();


Scientist has a lot of value for cases where you want to replace an application bit by bit, try a performance improvement, or make other forms of incremental change. Scientist is not, however, suited for code that creates external side effects such as modifying a database, sending an email, or making a state-changing external API call, since both the legacy and the new routines will run.

The library is available in a wide variety of languages, from Ruby to .NET to JavaScript to R to Kotlin and others. Additionally, the core concepts of the library are simple enough to implement in any language.
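To illustrate how simple the core idea is, here is a minimal, hypothetical sketch of the pattern in C# (this is not the real library's API, and it omits features like randomized execution order and timing): run both routines, report any mismatch or exception, and always return the legacy result.

```csharp
using System;

public static class MiniScientist
{
    // Runs both routines, reports a mismatch if the results differ,
    // and always returns the legacy result to the caller.
    public static T Experiment<T>(string name, Func<T> use, Func<T> candidate,
                                  Action<string> publish)
    {
        T legacyResult = use();
        try
        {
            T candidateResult = candidate();
            if (!Equals(legacyResult, candidateResult))
            {
                publish($"{name}: mismatch (legacy={legacyResult}, candidate={candidateResult})");
            }
        }
        catch (Exception ex)
        {
            // An exception in the candidate must never reach the user.
            publish($"{name}: candidate threw {ex.GetType().Name}");
        }
        return legacyResult;
    }

    public static void Main()
    {
        // The candidate has an off-by-one bug, but the caller still gets 42.
        int result = Experiment("score", () => 42, () => 43, Console.WriteLine);
        Console.WriteLine($"returned: {result}");
    }
}
```

The essential design choice is the same as Scientist's: the candidate routine runs inside a guard so that neither a wrong answer nor an exception can ever change what the caller receives.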

Ultimately, Scientist is a very helpful library for comparing old and new versions of your codebase and for giving your users a buffer between bleeding-edge features and unwittingly acting as a dead canary.