The Shadow Worrier: For a Future of Only Delightful API Surprises
We needed a way to introduce changes in our APIs without breaking people’s code. So we made The Shadow Worrier. Here’s the story of how it works and why we made it.
Published
Simen Svale
Co-founder and CTO at Sanity
At its best, the Cloud hides all the hard, boring yarn of machinery behind a cover of beautiful and fluffy APIs. It frees us to just get on with what we actually want to do. But as cloud dwellers, we live with worry. APIs that suddenly change in the night, breaking perfectly fine code we wrote months and years back, forcing us to do pointless firefighting when we should be getting on with the awesome.
Surprises should always be delightful and lovingly deliberate. Like gifts. We had to contend with this rule a year back as we were fixing some completely innocuous inconsistencies in our APIs. It turned out that for every change we made we broke someone’s stuff. We had reached that humbling, yet annoying milestone in the life of a cloud service where absolutely every inconsistency we had was being relied upon by someone, somewhere.
We detest APIs that go bump in the night. It feels really awful to have to inflict that onto people using our services. Sanity’s backend should feel light and easy, yet solid. The cloud where you can safely build your castle.
Our API inconsistencies are now part of our service offering. If anyone relies on some mistake we made, and it does not represent a security issue, we aim to maintain it. This moves much of the pain of migration from those who use our services to us. As it should be. That, however, poses some challenges for us. We have expansive test suites to help us keep our services working as intended as we change them. But they don’t capture all the possible permutations from real use. We are looking at queries and edge cases that by definition are not covered by our tests. This is about maintaining behavior for queries that are unknown to us. You shouldn’t have to worry about these; let us take that toll.
We too like to sleep at night though. So we built The Shadow Worrier.
The Shadow Worrier’s purpose is to look out for everyone who depends on some behavior in our APIs. It runs the upcoming version of our APIs in the background, on what we call the Shadow Backend. When your apps access our APIs we will service your requests as fast as we can. But then, behind the scenes, the Worrier will re-run your requests on the Shadow Backend. Any discrepancy between the two results means we are about to break a query that we didn't know of and have not yet covered in our tests. When that happens, we lock the behavior in using test cases and make sure we maintain it going forward. This is how the Shadow Worrier fights unwanted surprises.
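To make the replay-and-compare idea concrete, here is a minimal sketch. It is not our actual implementation: the two handlers are hypothetical stand-ins for production and the Shadow Backend, and the comparison runs inline, whereas the real thing happens behind the scenes after your request has been served.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

# Hypothetical handler type: takes a query string and returns a result.
Handler = Callable[[str], Any]


@dataclass
class Discrepancy:
    query: str
    production: Tuple[str, Any]
    shadow: Tuple[str, Any]


def run_safely(handler: Handler, query: str) -> Tuple[str, Any]:
    """Run a handler and turn both results and errors into comparable values."""
    try:
        return ("ok", handler(query))
    except Exception as exc:
        return ("error", f"{type(exc).__name__}: {exc}")


def shadow_check(query: str, production: Handler, shadow: Handler) -> Optional[Discrepancy]:
    """Replay a query on both backends; return a Discrepancy if they disagree."""
    prod_outcome = run_safely(production, query)
    shadow_outcome = run_safely(shadow, query)
    if prod_outcome == shadow_outcome:
        return None
    return Discrepancy(query, prod_outcome, shadow_outcome)


# Toy backends (assumptions for illustration only).
def production_backend(query: str) -> Any:
    if query == 'count("banana")':
        raise ValueError("Invalid function call: count(string)")
    return 3


def shadow_backend(query: str) -> Any:
    if query == 'count("banana")':
        return None
    return 3


found = shadow_check('count("banana")', production_backend, shadow_backend)
unchanged = shadow_check("count([1, 2, 3])", production_backend, shadow_backend)
```

Errors are captured as values on purpose: a query that used to throw and now returns something is exactly the kind of change we want to catch.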
Does it mean we will never fix bugs or introduce API-breaking changes? Of course not! That's what versioned APIs are for, right? Our goal is that even the weird stuff in any given version of our API is as dependable as the consistent, intended stuff. Then, when you are ready for our fixes, you explicitly upgrade to the newer version and fix any breakage you might encounter in a controlled situation. With help from our migration guides. This way we get to keep moving, and you still get to sleep at night even when we realize the result of that all-important (to you for some inexplicable reason) count("banana") should be null instead of throwing Invalid function call: count(string).
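Locking a quirk in with test cases could look roughly like the sketch below. Everything here is assumed for illustration: the count function is a toy stand-in, and HYPOTHETICAL_FIX_VERSION is an invented cutoff, not a real release.

```python
from datetime import date

# Invented cutoff for illustration: the version-date where the fix would apply.
HYPOTHETICAL_FIX_VERSION = date(2019, 6, 14)


def count(value, api_version: date):
    """Toy count() whose handling of non-arrays depends on the API version."""
    if isinstance(value, (list, tuple)):
        return len(value)
    if api_version >= HYPOTHETICAL_FIX_VERSION:
        return None  # newer versions: a non-array argument yields null
    kind = "string" if isinstance(value, str) else type(value).__name__
    raise ValueError(f"Invalid function call: count({kind})")


def test_old_version_keeps_throwing():
    """Pins the old, odd behavior so older versions never change silently."""
    try:
        count("banana", date(2019, 1, 1))
    except ValueError as exc:
        assert str(exc) == "Invalid function call: count(string)"
    else:
        raise AssertionError("expected the old behavior to raise")


def test_new_version_returns_null():
    """Pins the fixed behavior for anyone who has opted in to the newer version."""
    assert count("banana", HYPOTHETICAL_FIX_VERSION) is None
```

Both behaviors get a test, so the old one is as dependable as the new one until you choose to upgrade.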
The architecture of The Shadow Worrier and the Shadow Backend is largely made possible by running our backend on Kubernetes. When we’re introducing new features, we put these on the shadow branch and create a new deployment internally on the cluster. Then we enable its endpoints in the ingress, which routes traffic to the production clusters as well as to The Shadow Worrier. It’s a pretty simple setup in terms of operations, but oh so useful.
The Shadow Worrier then produces logs and statistics for every deviation it finds. As more and more people have started to use Sanity, we don’t need to run The Shadow Worrier for very long or particularly many times before having a decent amount of data to work from. Since it also doubles the pressure on some parts of our backend, it’s a good check on how well we scale.
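As a rough idea of what deviation statistics can look like, the sketch below groups deviating queries by their shape, so a thousand variations of the same quirk show up as one line. The normalization rules and the sample queries are assumptions for illustration, not our actual log format.

```python
import re
from collections import Counter
from typing import Iterable


def query_shape(query: str) -> str:
    """Replace string and number literals with placeholders."""
    shape = re.sub(r'"[^"]*"', '"?"', query)  # string literals
    shape = re.sub(r"\b\d+\b", "?", shape)    # standalone integers
    return shape


def summarize(deviating_queries: Iterable[str]) -> Counter:
    """Count deviations per query shape."""
    return Counter(query_shape(q) for q in deviating_queries)


# Made-up sample of queries where production and shadow disagreed.
sample = [
    'count("banana")',
    'count("apple")',
    '*[_type == "post"][0..10]',
]
stats = summarize(sample)
```

Sorting the counter with stats.most_common() then gives a quick view of which behaviors the most requests depend on.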
This is not to say that our intention is to never ever take an API out of service. But when we eventually decide it is time, we intend to give you ample heads up before we do.
So let's talk a little bit about our versioning scheme.
We are not only an API provider. Obviously, we also consume a lot of APIs: our platform vendor, marketing automation, analytics, service monitoring, payment processing… I could go on. But on that list, our payment provider Stripe really inspired us. They implemented a very similar scheme back in 2017.
We are intending to introduce API improvements piecemeal and continuously. We could have used numbered versions, but at least the way we plan to do this, we would very soon be on v311. Also, you would have to look up the latest version every time you wanted to set up a new project.
We like Stripe's approach to just version the API by date. So as I'm writing this, v2019-06-14 is guaranteed to give me all the latest changes. In the future, when you need an API feature you'll have to look it up in our release log to suss out what version-date you need in order to get it.
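Here is a small sketch of how date-based versions behave: pinning a version-date gives you every versioned change released on or before that date, and nothing after it. The release log entries are invented placeholders, and the helper names are hypothetical.

```python
from datetime import date
from typing import Dict, List


def parse_version(tag: str) -> date:
    """Parse a version tag like 'v2019-06-14' into a date."""
    if not tag.startswith("v"):
        raise ValueError(f"not a version tag: {tag!r}")
    return date.fromisoformat(tag[1:])


def changes_included(pinned: str, release_log: Dict[str, str]) -> List[str]:
    """List the changes a pinned version-date gets, oldest first."""
    pinned_date = parse_version(pinned)
    ordered = sorted(release_log.items(), key=lambda item: parse_version(item[0]))
    return [change for tag, change in ordered if parse_version(tag) <= pinned_date]


# Invented release log for illustration only.
RELEASE_LOG = {
    "v2019-09-01": "placeholder change C",
    "v2019-03-01": "placeholder change A",
    "v2019-06-14": "placeholder change B",
}

pinned_today = changes_included("v2019-06-14", RELEASE_LOG)
```

The upside of dates over numbers is visible here: today's date is always a valid pin, so a new project never has to look up the latest version.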
Only backward-incompatible changes will be versioned. New functionality that we deem purely supplementary, like new API endpoints or pure extensions of the GROQ syntax, can be introduced at any time. We also need a way to beta test changes and will do so by publishing version X (aka vX). This version will contain our experimental, unstable API features. Version X may change at any time, and you must not rely on it. But we intend to release new features early to gain experience before we completely lock them in. If you want to ride on the bleeding edge, X will be for you.
You can read more about our versioning scheme [here] and how to use it with the JS client [here]. We will soon be announcing some very subtle, yet useful improvements in GROQ's handling of arrays. This will be our first versioned feature.
We believe that The Shadow Worrier and the new API versioning scheme make a good foundation for providing a sustainable and reliable service for people’s structured content in the years to come. We’re very excited about bringing new features, without having to break people’s exciting code. Because, after all, our main job is to move, and get out of the way.