Custom URL validator timing out when checking slugs against published documents

17 replies
Last updated: Aug 21, 2022
Any advice on a custom URL validator? I want to ensure that any internal links point to published slugs. It works, but slowly. I keep getting this warning:

Custom validator at linkGroups[_key=="721a4739faa3"].links[_key=="e61f9228310a"].linkUrl has taken more than 5000ms to respond

here’s the code:


import client from "part:@sanity/base/client";

const listAllSlugs = async () => {
  const slugsList = await client
    .withConfig({ apiVersion: "2021-10-21" })
    // require slug and exclude unpublished drafts
    .fetch(
      `*[defined(slug.current) && !(_id in path("drafts.**"))][].slug.current`
    );

  return slugsList;
};

export const validateLink = (url) => {
 
  // if it's an internal link
  if (url.startsWith("/")) {
    // remove # and ?
    const clean = url.split("#")[0].split("?")[0];

    return listAllSlugs().then((allSlugs) => {
      return allSlugs.includes(clean)
        ? true
        : "Does not match any published slugs";
    });
  }
  // ... test against http url patterns ...
};
AI Update

Your validator is slow because it fetches all slugs from your entire dataset on every keystroke. The solution is to cache the slug list and reuse it across validation calls.

Here are a few strategies to speed this up:

1. Cache the slug list outside the validator function

The simplest fix is to fetch slugs once when the Studio loads and reuse that cached list:

import client from "part:@sanity/base/client";

// Cache slugs at module level
let slugsCache = null;
let cachePromise = null;

const getAllSlugs = () => {
  // If we already have slugs, return them
  if (slugsCache) return Promise.resolve(slugsCache);
  
  // If a fetch is already in progress, return that promise
  if (cachePromise) return cachePromise;
  
  // Otherwise, start a new fetch
  cachePromise = client
    .withConfig({ apiVersion: "2021-10-21" })
    .fetch(`*[defined(slug.current) && !(_id in path("drafts.**"))].slug.current`)
    .then((slugs) => {
      slugsCache = slugs;
      cachePromise = null;
      return slugs;
    });
  
  return cachePromise;
};

export const validateLink = (url) => {
  if (url.startsWith("/")) {
    const clean = url.split("#")[0].split("?")[0];
    
    return getAllSlugs().then((allSlugs) => {
      return allSlugs.includes(clean)
        ? true
        : "Does not match any published slugs";
    });
  }
  // ... test against http url patterns ...
};

This caches the result in memory, so subsequent validations are instant. The cache persists until the Studio reloads.
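
For completeness, the cached validator hooks into the schema the same way as before. A minimal sketch, assuming a string field named linkUrl like the one in your warning message (the import path is just an example):

import { validateLink } from "../lib/validateLink"; // path is an assumption

export default {
  name: "link",
  type: "object",
  fields: [
    {
      name: "linkUrl",
      type: "string",
      // Rule.custom accepts a promise-returning validator like validateLink
      validation: (Rule) => Rule.custom((url = "") => (url ? validateLink(url) : true)),
    },
  ],
};

The (url = "") guard simply skips the slug check while the field is still empty.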

2. Use a listener to keep the cache fresh

For a more robust solution, listen to document changes and update your cache:

import client from "part:@sanity/base/client";

let slugsCache = new Set();

// Initial fetch
client
  .withConfig({ apiVersion: "2021-10-21" })
  .fetch(`*[defined(slug.current) && !(_id in path("drafts.**"))]{_id, "slug": slug.current}`)
  .then((docs) => {
    slugsCache = new Set(docs.map(d => d.slug));
  });

// Listen for changes
client
  .listen('*[defined(slug.current)]')
  .subscribe((update) => {
    if (update.type === 'mutation') {
      // For simplicity, refetch all published slugs whenever a slugged document changes
      client
        .withConfig({ apiVersion: "2021-10-21" })
        .fetch(`*[defined(slug.current) && !(_id in path("drafts.**"))][].slug.current`)
        .then((slugs) => {
          slugsCache = new Set(slugs);
        });
    }
  });

export const validateLink = (url) => {
  if (url.startsWith("/")) {
    const clean = url.split("#")[0].split("?")[0];
    return slugsCache.has(clean) ? true : "Does not match any published slugs";
  }
  // ... test against http url patterns ...
};

3. Use a reference field instead

Consider whether you actually need URL validation, or if a reference field would work better. References give you a document picker and ensure links always point to real documents:

{
  name: 'internalLink',
  type: 'reference',
  to: [{type: 'page'}, {type: 'post'}] // whatever doc types have slugs
}

Then resolve the slug when querying. This is type-safe and doesn't require validation at all.
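
For instance, a quick sketch of how you might dereference that link at query time (the page type and field names here are assumptions):

import client from "part:@sanity/base/client";

// Follow the internalLink reference and pull the target document's published slug
const getPagesWithResolvedLinks = () =>
  client
    .withConfig({ apiVersion: "2021-10-21" })
    .fetch(`*[_type == "page"]{
      title,
      internalLink->{ "slug": slug.current }
    }`);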

Why your current code is slow

Every time someone types in the URL field, your validator:

  1. Makes a network request to Sanity
  2. Fetches potentially thousands of documents
  3. Extracts all slugs
  4. Checks if the typed value matches

With caching, you do steps 1-3 once, and step 4 becomes instant.

The 5000ms warning appears because Sanity expects validators to respond quickly (they run on every keystroke). Your approach works correctly; it's just the performance that needs optimization.

Original thread (17 replies)
I’ve got many links nested in groups on this document, so they’re all doing this slow query. Could I preload the slugsList once and make it available as a variable, or maybe get it into the validator’s document parameter?
May I ask how many different slugs it needs to evaluate? Are we talking 50 or 10,000?
user S
50
there are 50-100 documents with slugs, and maybe 50 links to validate in this one document
I have to step away for a bit, thank you very much for your thoughts on this!
That's super weird that it's that high, I feel.
I am running something similar that depends upon a thousand math comparisons (color hexes) gathered across about 200 docs and it comes back in a second.

Is it the same experience in any browser?


https://www.sanity.io/docs/validation#9e69d5db6f72 That's the closest kind of example I could find that's using the built-ins.
I wonder if there's another approach, like counting lengths of things that satisfy queries and comparing that so you don't have to bother with strings. I'll think some more on it while you're away.
Just tested and it’s also the live studio, not just local. I’m on the free tier but I haven’t approached any limits yet.
I don't have any reason to believe you'd be throttled for being free. It also looks like the service itself is doing okay: https://www.sanity-status.com/
Do you have another browser you can test in just to narrow down the culprits?

Is there anything of applicable use in the example I linked as far as other approaches?
Validators run on every keystroke. You might not want to query your entire dataset every time. So I would advise caching your response.
user F
One thing I was thinking about is that if they were inline blocks made to point at something that existed at the time, they'd necessarily have to exist when added. I mention it because this sounds a bit like a broken link checker after the fact. It's not programmatic to use your eyes to see complaints about references, but it also might make more sense as an additional view rather than validation.
If it was kept as a validation, do you think it's possible to do a custom input component where the validation assumes a default value of true with useState, runs a check after the promise returns once they've stopped typing, and only switches then if it finds an issue?
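
Something roughly like this, sketched with plain React state and a debounce rather than Sanity's actual input component API (every name here is hypothetical, including checkSlug):

import { useState, useEffect } from "react";

// Hypothetical hook: checkSlug(url) resolves to true or an error string
const useDebouncedLinkCheck = (url, checkSlug, delay = 500) => {
  // Assume the value is valid until the delayed check says otherwise
  const [message, setMessage] = useState(null);

  useEffect(() => {
    setMessage(null);
    if (!url || !url.startsWith("/")) return;

    // Only hit the API once the user has stopped typing for `delay` ms
    const timer = setTimeout(() => {
      checkSlug(url).then((result) => {
        if (result !== true) setMessage(result);
      });
    }, delay);

    return () => clearTimeout(timer);
  }, [url, checkSlug, delay]);

  return message; // null means "treat as valid for now"
};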
user S
The fetch returns MUCH faster on Safari than on Chrome. I have no idea why.

user F
Are there cache options for Sanity client .fetch that I’m not finding, or an idiomatic way you’d approach this? I’ve had big improvements by implementing the following:
export const validateLink = (cachedSlugs) => (url) => {
  // cachedSlugs is declared in the document schema that calls validateLink
  
  // ...

  // if it's an internal link
  if (url.startsWith("/")) {
    const cleanUrl = url.split("#")[0].split("?")[0];

    if (cachedSlugs) {
      // use the cache if it exists
      return cachedSlugs.includes(cleanUrl)
        ? true
        : "Does not match any published slugs";
    } else {
      return listAllSlugs().then((allSlugs) => {
        // cache results
        cachedSlugs = allSlugs;

        return allSlugs.includes(cleanUrl)
          ? true
          : "Does not match any published slugs";
      });
    }
  }
  // ...
};
user M
Yeah one of the exhausting things about any kind of web dev is there are so many variables. Browser, OS, device, time of day, network conditions, browser extensions, ISP, etc. I can't account for that either off the top of my head. In my experience on Windows Chrome is the slowest, but it shouldn't be for that sort of thing (believe it or not, on my machine, Edge is the fastest and it isn't even close).
Until she sees the message, if I had to guess, she might be talking about memoizing or useCallback; I see the latter a lot in official examples. My understanding is that if a function will always return the same result under the same circumstances (and given the same dependencies), it keeps that result in memory for re-use.
user S
Could you possibly share a link to one of those examples where you saw memoization? I’d love to compare with what I came up with.
I’m pretty happy with this outcome using lodash’s _.memoize. I’m passing in the document ID just to pull a fresh slug list when switching documents, in case a user is navigating around adding new slugs.

import _ from "lodash";
import client from "part:@sanity/base/client";

// docId is only used as the memoize cache key: a new document ID triggers a fresh fetch
const getAllSlugs = _.memoize(async (docId) => {
  const allSlugs = await client
    .withConfig({ apiVersion: "2021-10-21" })
    .fetch(
      `*[defined(slug.current) && !(_id in path("drafts.**"))][].slug.current`
    );

  return allSlugs;
});

export const validateLink = (url, { document }) => {
  // if it's an internal link
  if (url.startsWith("/")) {
    return getAllSlugs(document._id).then((allSlugs) => {
      const cleanUrl = url.split("#")[0].split("?")[0];

      return allSlugs.includes(cleanUrl)
        ? true
        : "Does not match any published slugs";
    });
  }
  // ... test against http url patterns ...
};
user M
Sure! The most recent one I saw is actually from the v3 beta docs in an example involving a custom input, but the concept should still apply: https://beta.sanity.io/docs/learn/custom-input-component/patching-to-the-content-lake#adding-performance-optimization-with-usecallback
If anyone else is reading this, lodash / underscore are so freaking cool. I forgot how much I valued them and how much they bailed me out with custom stuff in WordPress, like making a sticky table-of-contents column: it tracked all the page headings, calculated which one was nearest my scroll position, and used that to decide which side nav menu item to highlight. It was crazy fast given all it had to do.
lollll you just reminded me i really should memoize almost that exact function I have going on my site!!
