Faceted Taxonomy Setup & Use
Faceted taxonomies are a great option for organizing resources that don't fit neatly into hierarchical structures.
A search taxonomy (also called a "thesaurus") allows you to manage and accommodate the language variations your online visitors enter into your site search tool. Maybe potential customers for your artisanal dungarees search for "coverall" or "romper" or "onesie." Perhaps you host information on how to make the most of space in one's garage ... but you have a contingent of visitors who consistently search for "car hole." Search taxonomies help you anticipate these variants, and, where appropriate, guide users to the terms and language you recommend.
In this guide you'll learn what makes a thesaurus different from other kinds of taxonomy structures, like type, topic, or faceted taxonomies, and you'll learn how to set up a thesaurus in the Sanity Taxonomy Manager plugin. You'll also find tips on how to integrate your custom thesaurus into managed or self-hosted search tools in order to deliver a more effective site search experience for your users.
Site search tools include SaaS products like Algolia and Elasticsearch and front end solutions like Pagefind and FlexSearch. They all work by indexing site content, querying that index in response to visitor search terms, and then returning relevant results. A well-tuned search experience balances the precision of the results returned (do all these results match what the user has in mind?) and the recall of the result set (did the search return everything your site has on the topic?).
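To make the two measures concrete, here is a minimal sketch of how precision and recall could be computed for a single search. The function name `precisionAndRecall` and the document IDs are illustrative assumptions; in practice the list of relevant documents has to come from someone judging the results by hand.

```typescript
// Precision: what share of the returned results are relevant?
// Recall: what share of the relevant documents were returned?
// Both inputs are lists of document IDs (hypothetical values below).
function precisionAndRecall(
  returned: string[],
  relevant: string[],
): { precision: number; recall: number } {
  const relevantSet = new Set(relevant);
  // Ignore duplicate IDs in the returned list.
  const uniqueReturned = returned.filter((id, i) => returned.indexOf(id) === i);
  const hits = uniqueReturned.filter((id) => relevantSet.has(id)).length;
  return {
    precision: uniqueReturned.length === 0 ? 0 : hits / uniqueReturned.length,
    recall: relevantSet.size === 0 ? 0 : hits / relevantSet.size,
  };
}

// Three results returned, two of them relevant; four relevant pages exist.
const garageSearch = precisionAndRecall(
  ["garage-shelving", "garage-lighting", "car-wax-review"],
  ["garage-shelving", "garage-lighting", "garage-flooring", "garage-bike-racks"],
);
console.log(garageSearch); // precision is 2/3, recall is 0.5
```

Adding a synonym such as "car hole" to the thesaurus typically raises recall; whether precision holds up depends on how closely the synonym matches the concept.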
A search thesaurus helps your search tool increase the precision and recall of its results by allowing you to specify alternative labels and hidden labels that match the terms for which your users are searching—even if (especially if) those terms are "wrong."
Alternative labels include synonyms, near-synonyms, abbreviations, and acronyms of a concept. Alternative labels in your thesaurus help your search tool bridge the gaps between what users enter (term variants and misspellings) and what the search index picks up from crawling your content and extracting keywords and concepts.
Many modern search tools have built-in functionality for dealing with common synonyms and typos. When your content is in a specialized domain, however, these tools may need additional guidance. On the Mayo Clinic website, for example, a site search for "Lou Gehrig's" returns results for amyotrophic lateral sclerosis (ALS), both those that include the eponym "Lou Gehrig's" in the result text and those that do not mention "Lou Gehrig" at all.
Hidden labels include misspelled variants and other terms you don't want to be otherwise visible. For example, if a site visitor misspells "Lou Gehrig's" as "Lou Gherig's," the Mayo Clinic search tool suggests the correct spelling and offers a link to the properly spelled term. This serves a dual function of getting visitors the content they want, and educating them about the correct spelling.
Alternatively, a search for "handicap parking pass" on the Washington State Department of Licensing website returns results for "disabled parking permits," but otherwise hides references to a search query which is now considered outdated and unacceptable for referring to individuals or accessible environments.
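The difference between the two label types can be sketched as a small lookup that runs before a query reaches the search index. This is only an illustration under stated assumptions: it is not how the Mayo Clinic or Washington State sites implement search, the `resolveQuery` function and `showQueryInResults` flag are hypothetical names, and the sample labels are made up to mirror the examples above. Only the `prefLabel`, `altLabel`, and `hiddenLabel` field names come from Taxonomy Manager.

```typescript
interface Concept {
  prefLabel: string;
  altLabel?: string[];
  hiddenLabel?: string[];
}

interface Resolution {
  prefLabel: string;
  matchedOn: "preferred" | "alternative" | "hidden";
  // Hidden labels still match, but the query text is not echoed back.
  showQueryInResults: boolean;
}

const normalize = (s: string): string => s.trim().toLowerCase();

// Map a visitor's query to the concept it refers to, if any.
function resolveQuery(query: string, concepts: Concept[]): Resolution | null {
  const q = normalize(query);
  for (const concept of concepts) {
    if (normalize(concept.prefLabel) === q) {
      return { prefLabel: concept.prefLabel, matchedOn: "preferred", showQueryInResults: true };
    }
    if ((concept.altLabel ?? []).some((label) => normalize(label) === q)) {
      return { prefLabel: concept.prefLabel, matchedOn: "alternative", showQueryInResults: true };
    }
    if ((concept.hiddenLabel ?? []).some((label) => normalize(label) === q)) {
      return { prefLabel: concept.prefLabel, matchedOn: "hidden", showQueryInResults: false };
    }
  }
  return null;
}

// Illustrative data only.
const sampleConcepts: Concept[] = [
  {
    prefLabel: "Amyotrophic lateral sclerosis",
    altLabel: ["ALS", "Lou Gehrig's disease"],
    hiddenLabel: ["Lou Gherig's disease"],
  },
  {
    prefLabel: "Disabled parking permit",
    hiddenLabel: ["handicap parking pass"],
  },
];

console.log(resolveQuery("handicap parking pass", sampleConcepts));
```

A results page could use `matchedOn` to decide whether to show a "did you mean" suggestion or to display results for the preferred term without repeating the query.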
Due to the wide and ever-evolving range of search tools currently on the market, there isn't a single standard way to integrate a thesaurus managed in Sanity with site search tools. Many SaaS tools have APIs that can ingest thesaurus data as JSON or CSV, and have varying levels of support for alternative and hidden labels. Some tools also allow you to treat narrower terms as synonyms for their broader parent categories.
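For tools that accept CSV, the export can be a few lines of code. The sketch below assumes a layout of one row per concept, with the preferred label first and its synonyms after it; that layout, along with the `toSynonymCsv` and `csvField` names, is an assumption for illustration, so check the column order your search tool actually expects.

```typescript
interface ThesaurusEntry {
  prefLabel: string;
  altLabel?: string[];
  hiddenLabel?: string[];
}

// Quote a field if it contains a comma, a double quote, or a line break.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// One row per concept: preferred label, then alternative and hidden labels.
// Concepts with no synonyms are skipped because they add nothing to the file.
function toSynonymCsv(entries: ThesaurusEntry[]): string {
  return entries
    .map((e) => [e.prefLabel, ...(e.altLabel ?? []), ...(e.hiddenLabel ?? [])])
    .filter((row) => row.length > 1)
    .map((row) => row.map(csvField).join(","))
    .join("\n");
}

const csv = toSynonymCsv([
  { prefLabel: "Dungarees", altLabel: ["coverall", "romper"], hiddenLabel: ["onesie"] },
  { prefLabel: "Garage", hiddenLabel: ["car hole"] },
]);
console.log(csv);
```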
Front end search tools targeted toward static site generators such as Pagefind or FlexSearch allow you to customize what gets entered into the search index with data tags in your page templates or via custom additions to the search index. Terms can be weighted and shown (or not) on your search result page based on how you configure the results page on your site.
Some potential benefits of managing thesaurus terms in Sanity with the Taxonomy Manager plugin include:
You may not want to manage thesaurus terms in Sanity if:
The Taxonomy Manager plugin allows you to create standards-compliant relationships that help keep your taxonomy interoperable and reusable.
Concept Schemes are used to create multiple taxonomies in a single project, and, where needed, use the same concepts across them. This gives you a single source of truth for each concept you define, and allows you to establish semantic relationships between individual taxonomies.
Add a new Concept Scheme with either the global "new document" button or the "new document" button in the Concept Schemes list view.
Sometimes the concept scheme you need is the one you already have: if you need to create synonyms for a set of concepts in an existing taxonomy, create them there. Every concept created with Sanity Taxonomy Manager has Alternative Label and Hidden Label fields, so there's no reason your Type or Topic taxonomy can't also be your thesaurus.
Add a clear name and describe the purpose and goals of your thesaurus to users. Tagging content with managed terms may be new to your content creators: good descriptions can help users understand why the tagging step is important.
"Concepts" are the central modeling metaphor in simple knowledge organization system (SKOS) taxonomies. A concept's "Preferred Label" corresponds to the ISO 2788/5964 standard's idea of "term." For each concept for which you need synonyms, click "Add Concept" and provide a Preferred Label.
Once you have a concept and its Preferred Label defined, add Alternative Labels and Hidden Labels as necessary.
Preferred, alternative, and hidden label sets must not overlap. Taxonomy Manager will show a validation error if you accidentally duplicate a label across label sets.
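If you import labels in bulk or want to check data before it reaches the Studio, a similar check is easy to run yourself. The following is a minimal sketch, not the plugin's own validation code: `findOverlappingLabels` is a hypothetical helper, and comparing labels case-insensitively after trimming is an assumption that may be stricter than what Taxonomy Manager enforces.

```typescript
interface LabelSets {
  prefLabel: string;
  altLabel?: string[];
  hiddenLabel?: string[];
}

// Return every label that also appears in a different label set
// of the same concept (compared case-insensitively, whitespace trimmed).
function findOverlappingLabels(concept: LabelSets): string[] {
  const firstSeenIn = new Map<string, string>(); // normalized label -> set name
  const overlaps: string[] = [];
  const sets: [string, string[]][] = [
    ["prefLabel", [concept.prefLabel]],
    ["altLabel", concept.altLabel ?? []],
    ["hiddenLabel", concept.hiddenLabel ?? []],
  ];
  for (const [setName, labels] of sets) {
    for (const label of labels) {
      const key = label.trim().toLowerCase();
      const previousSet = firstSeenIn.get(key);
      if (previousSet === undefined) {
        firstSeenIn.set(key, setName);
      } else if (previousSet !== setName) {
        overlaps.push(label);
      }
    }
  }
  return overlaps;
}

// "bp cuff" duplicates the alternative label "BP Cuff", so it is reported.
console.log(
  findOverlappingLabels({
    prefLabel: "Sphygmomanometer",
    altLabel: ["Blood Pressure Cuff", "BP Cuff"],
    hiddenLabel: ["bp cuff"],
  }),
);
```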
You're now ready to publish your thesaurus, integrate it into your content schema, and start tagging content. Taxonomy Manager includes two helper functions for ensuring that only the appropriate concepts are available for a given field.
As noted above, adding synonyms to your search index is entirely dependent on the affordances provided by your search tool. The flexibility of the GROQ query language, however, means that getting your synonym data into a format that can be used by your search tool need not be a daunting task.
Here, for example, is a query for formatting synonyms for Algolia's synonym API:
*[_type == "skosConceptScheme" && schemeId == "69d9c8"].concepts[]->
{
  // Algolia identifies each synonym record by "objectID"
  "objectID": conceptId,
  "type": "oneWaySynonym",
  "input": prefLabel,
  // Combine both label sets; if one is missing, fall back to the other
  "synonyms": coalesce(
    altLabel[] + hiddenLabel[],
    altLabel[],
    hiddenLabel[]
  )
}
This data matches the shape Algolia specifies for creating or updating a one-way synonym with their API:
"data": [
  {
    "objectID": "f3ac93",
    "type": "oneWaySynonym",
    "input": "Bilirubinometer",
    "synonyms": [
      "Jaundice Meter",
      "Bilimeter"
    ]
  },
  {
    "objectID": "e84bd2",
    "type": "oneWaySynonym",
    "input": "Sphygmomanometer",
    "synonyms": [
      "Blood Pressure Cuff",
      "BP Cuff"
    ]
  },
  {
    "objectID": "caefe4",
    "type": "oneWaySynonym",
    "input": "Tongue Depressor",
    "synonyms": [
      "Spatula",
      "Popsicle Stick"
    ]
  }
]
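If you would rather fetch raw concepts and shape them in application code, the same transformation can be sketched in TypeScript. This is one possible approach under stated assumptions: the `ConceptDoc` interface reflects only the fields used in the query above, `toOneWaySynonyms` and `uniqueLabels` are hypothetical helper names, and the "Otoscope" concept and its ID are made up for illustration. Algolia's synonym records use the keys `objectID`, `type`, `input`, and `synonyms`. Unlike the query, this version drops concepts that have no synonyms rather than emitting an empty entry.

```typescript
interface ConceptDoc {
  conceptId: string;
  prefLabel: string;
  altLabel?: string[] | null;
  hiddenLabel?: string[] | null;
}

interface OneWaySynonym {
  objectID: string;
  type: "oneWaySynonym";
  input: string;
  synonyms: string[];
}

// Trim labels, drop blanks, and remove case-insensitive duplicates.
function uniqueLabels(labels: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const label of labels) {
    const trimmed = label.trim();
    const key = trimmed.toLowerCase();
    if (trimmed !== "" && !seen.has(key)) {
      seen.add(key);
      result.push(trimmed);
    }
  }
  return result;
}

// Build one record per concept that has at least one synonym.
function toOneWaySynonyms(concepts: ConceptDoc[]): OneWaySynonym[] {
  const records: OneWaySynonym[] = [];
  for (const c of concepts) {
    const synonyms = uniqueLabels([...(c.altLabel ?? []), ...(c.hiddenLabel ?? [])]);
    if (synonyms.length === 0) continue;
    records.push({
      objectID: c.conceptId,
      type: "oneWaySynonym",
      input: c.prefLabel,
      synonyms,
    });
  }
  return records;
}

const synonymRecords = toOneWaySynonyms([
  { conceptId: "f3ac93", prefLabel: "Bilirubinometer", altLabel: ["Jaundice Meter", "Bilimeter"], hiddenLabel: null },
  { conceptId: "000000", prefLabel: "Otoscope" }, // hypothetical concept with no synonyms
]);
console.log(JSON.stringify(synonymRecords, null, 2));
```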
Find more examples, applications, and tips in the Sanity Taxonomy Manager Docs.