Headless SEO 101: Everything You Need to Get Started
In the last few years we’ve seen a global rise in demand for headless CMSes. Because of a growing need for flexibility and control in how content is deployed across different channels, the headless CMS market is forecasted to grow from $605 million to $3.8 billion by 2032. They are also seen as more developer-friendly than traditional, monolithic CMSes.
Transitioning into a headless CMS can unlock a ton of potential for your content, but it does not come without its challenges.
Headless SEO refers to the unique processes required when optimizing content for search using a headless CMS. These include: content modeling for SEO, a more intentional technical set-up, and an omnichannel approach to search.
Many of the well-known tools and techniques of the SEO trade don't translate automatically when working with a headless CMS. The principles of SEO, however, stay exactly the same: we still aim to provide the best content to answer a user’s query.
There are three main aspects to consider when you’re creating your headless SEO strategy.
Headless CMSes decouple content from the presentation of the content. Instead of creating a series of web pages, you will typically build a content model that more naturally represents your content. Because you are thinking about the different content types, content is treated as an asset that can be stored as data in a central place and reused anywhere. Any front-end interface can consume the data in the format most appropriate for its purposes. A web page might not care about the same things as an app or an IoT device.
Content modeling is the process of defining the types of content needed, the attributes of each one, and the relationships between them. Instead of a series of page layout templates, models provide a map of content types that can be used to create exactly the type of content needed at any time.
Each item of content inherits the attributes of the content type. So a case study would always have the following fields: Case Study Title, Description, Customer Company, Goals, Outcomes, Story, Images, and Service Type. Any web page that displays information about the case study would pull in the fields needed for that page.
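As a rough sketch of what this looks like as data, the case study type could be described with a TypeScript interface. The field names mirror the list above. The `CaseStudyCard` type, the `toCard` helper and the sample values are hypothetical, and only illustrate a page pulling in the subset of fields it needs.

```typescript
// Hypothetical shape of the "case study" content type described above.
interface CaseStudy {
  title: string;
  description: string;
  customerCompany: string;
  goals: string[];
  outcomes: string[];
  story: string;
  images: { url: string; alt: string }[];
  serviceType: string;
}

// A listing page only needs a few of those fields.
type CaseStudyCard = Pick<CaseStudy, "title" | "description" | "customerCompany"> & {
  thumbnail?: { url: string; alt: string };
};

// Pull in just the fields the listing page displays.
function toCard(doc: CaseStudy): CaseStudyCard {
  return {
    title: doc.title,
    description: doc.description,
    customerCompany: doc.customerCompany,
    thumbnail: doc.images[0], // undefined when the case study has no images
  };
}

// Made-up sample document.
const exampleCaseStudy: CaseStudy = {
  title: "Replatforming Acme",
  description: "How Acme moved its catalogue to structured content.",
  customerCompany: "Acme",
  goals: ["Faster publishing"],
  outcomes: ["Shorter release cycles"],
  story: "Long-form story text goes here.",
  images: [{ url: "https://example.com/acme.png", alt: "Acme storefront" }],
  serviceType: "Migration",
};
```

A detail page would read the full `CaseStudy`, while the listing page works from `toCard(exampleCaseStudy)`. Both are fed by the same stored document.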
This 'structured content' approach lets you treat content as data, which means that typically all your SEO data can be retrieved from the content itself. In some cases, you might still want to have the ability to override certain fields.
Many SEOs learned their trade using WordPress, Joomla, Drupal and other similar CMSes. We have gotten used to working with tools such as Yoast that make sure all of our meta tags are there and canonical URLs are applied logically.
Though SEO plugins are available, SEOs need to adapt their mindset when working with a headless CMS. You can ask for fields to be added for all the tags you need. You will have full control over their technical setup, being able to add validation rules to prevent mistakes and apply personalized logic to canonicals, faceted navigation, pagination and much more.
The technical aspects of your SEO strategy will need to be clear. You will have to be very explicit with your technical requests to the dev team. We will dive into this a little bit later.
Omnichannel SEO is a growing trend, and the jobs of SEOs have evolved way beyond optimizing for Google rankings. Since you’re not building out pages, but content and content models, headless SEO goes beyond your website. This includes optimizing for search engines, social media, email, and other channels like YouTube, TikTok, Google Lens or voice search.
Having a well-optimized website that is easy to navigate and provides relevant, accurate information is still absolutely essential for a successful SEO strategy, but the job of an SEO does not end there anymore.
As technology integrates further into our lives, we’ve started performing searches on our home assistants, our smartwatches, IoT devices, Instagram, TikTok, Amazon and almost anything that’s connected to the internet.
Due to the rise of omnichannel search, SEOs must make sure businesses provide a consistent, positive experience for potential customers in every channel. An omnichannel approach to content can be a challenge, but it’s important to consider if you want to reach the widest possible audience and maximize your chances of success.
Providing the right content to users at the right time is at the core of SEO. To keep achieving this goal, we must make sure users can find our content wherever they are looking for it. With a headless CMS, you can repurpose your website content into an omnichannel SEO strategy and let users find you where they are.
We cannot deny that there are trade-offs to going headless and, most of the time, it’s not a decision made solely by the SEO team. While the technical aspect needs to be more carefully thought through in a headless setup, the opportunities for content distribution and repurposing are much larger.
Whether headless is going to be better for SEO or not will depend on each individual business, but it’s becoming increasingly relevant considering the rise of omnichannel search we covered earlier.
One of the biggest benefits of using a headless CMS is that it decouples the content from the presentation. This means that the same content can be used in multiple ways, and on multiple devices.
For example, a blog post can be displayed on a website, in a mobile app, or even on a smartwatch. This is possible because the content is stored in a format that can be easily consumed by any type of client.
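To make the idea concrete, here is a minimal sketch of one stored blog post being rendered for two clients. The `BlogPost` shape, both render functions and the 60-character limit are assumptions made for illustration, not part of any particular CMS.

```typescript
// Hypothetical blog post as stored in the CMS: plain data, no layout.
interface BlogPost {
  title: string;
  summary: string;
  paragraphs: string[];
}

const escapeHtml = (value: string): string =>
  value.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

// A website wants the full article as HTML.
function renderForWeb(post: BlogPost): string {
  const body = post.paragraphs.map((p) => `<p>${escapeHtml(p)}</p>`).join("");
  return `<article><h1>${escapeHtml(post.title)}</h1>${body}</article>`;
}

// A smartwatch only has room for a short plain-text notification.
function renderForWatch(post: BlogPost, maxLength = 60): string {
  const text = `${post.title}: ${post.summary}`;
  return text.length <= maxLength ? text : `${text.slice(0, maxLength - 1)}…`;
}

// Made-up sample content.
const samplePost: BlogPost = {
  title: "Headless SEO 101",
  summary: "What changes for SEO when content is decoupled from presentation.",
  paragraphs: ["Headless CMSes store content as data.", "Any front end can consume it."],
};
```

Both functions read the same `samplePost` object, and neither one needs to know how the other presents it.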
Another benefit of using a headless CMS is that it makes it easier to manage content in multiple languages, because the editorial workflow can be customized so extensively. With Sanity you have the option to integrate with third-party translation services, assign user permissions around markets or languages and manage translations at field or document level.
Headless CMSes have also proven to be a great alternative if you want to leverage the benefits of a static site generator. The combo of a headless CMS and a static site generator is a staple of the modern Jamstack web architecture.
Except for the considerations we’ve mentioned above, headless SEO follows the same guiding principles as any other SEO strategy.
No matter what CMS you are on, SEO should look to:
- Create valuable content with real expertise
- Help users fulfill their search intent
- Provide a great user experience
- Ensure search engines aren’t crawling bloated or meaningless pages
Let’s take a look at how to make this happen on a headless site.
You’ll have to request every meta tag you want on your site. Stuff you’ve not usually had to think about on a monolithic CMS becomes a lot more intentional here.
When you’re building on a headless CMS, remember to request the main meta tags you are going to want on your site. These are some you don’t want to miss:
- Title: this is the title users will see for your document in the SERPs.
- Meta description: the short snippet that tells users what your content is about, and that Google will most likely rewrite.
- Meta robots: you might want a field to edit your meta robots tags, unless you’re handling them through the X-Robots-Tag in the HTTP header.
- Viewport: this tag is key to ensure your content is mobile friendly. Most web developers will be sure to include it, but it’s never a bad idea to check in with them.
- Content type: this is used to specify the character encoding for the HTML document and it allows the browser to correctly display the characters in the document.
- Open Graph tags: These are not strictly SEO-related, but we’ve come to own these tags as an industry in the past few years. The main ones will be image, title and description. You can go a step further with enhanced tags.
- Language: This tag is used to determine the language a document is written in. This is key in international SEO.
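The tags in the list above could be rendered from CMS fields along these lines. This is only a sketch: the `SeoFields` shape and the function names are hypothetical, and your frontend framework will likely have its own way of writing to the document head.

```typescript
// Hypothetical SEO fields stored alongside a document in the CMS.
interface SeoFields {
  title: string;
  description: string;
  robots?: string; // e.g. "noindex, nofollow"; omit to use the default behaviour
  ogImage?: string; // absolute image URL
  language: string; // e.g. "en"
}

function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The language is declared on the <html> element rather than inside <head>.
function renderHtmlOpenTag(seo: SeoFields): string {
  return `<html lang="${escapeHtml(seo.language)}">`;
}

function renderHead(seo: SeoFields): string {
  const tags = [
    '<meta charset="utf-8">', // character encoding
    '<meta name="viewport" content="width=device-width, initial-scale=1">',
    `<title>${escapeHtml(seo.title)}</title>`,
    `<meta name="description" content="${escapeHtml(seo.description)}">`,
    `<meta property="og:title" content="${escapeHtml(seo.title)}">`,
    `<meta property="og:description" content="${escapeHtml(seo.description)}">`,
  ];
  // Optional tags are only emitted when the editor filled in the field.
  if (seo.robots) tags.push(`<meta name="robots" content="${escapeHtml(seo.robots)}">`);
  if (seo.ogImage) tags.push(`<meta property="og:image" content="${escapeHtml(seo.ogImage)}">`);
  return tags.join("\n");
}
```

Escaping every editor-supplied value keeps a stray quote or angle bracket in a title from breaking the markup.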
You can request to include validation rules for these fields. Some ideas are:
- Character limits for your title and meta description.
- Excluding pages from the sitemap if they include a noindex meta robots tag.
- Only accept language values included in the ISO 639-1 list of language codes.
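As an illustration, the character-limit and language rules could be expressed as a small validation function. The limits of 60 and 160 characters are assumptions rather than official values, the language list is deliberately partial, and `validateSeo` is a hypothetical name. The sitemap rule is a build-time concern and is not covered here.

```typescript
interface SeoInput {
  title: string;
  description: string;
  language: string;
}

// Assumed limits; use whatever your team agrees on.
const TITLE_MAX_LENGTH = 60;
const DESCRIPTION_MAX_LENGTH = 160;

// Deliberately partial sample of ISO 639-1 codes; a real check would use the full list.
const ISO_639_1_CODES = new Set(["ar", "de", "en", "es", "fr", "it", "ja", "nl", "pt", "zh"]);

// Returns a list of human-readable problems; an empty list means the input passed.
function validateSeo(input: SeoInput): string[] {
  const errors: string[] = [];
  if (input.title.trim().length === 0) {
    errors.push("Title is required.");
  }
  if (input.title.length > TITLE_MAX_LENGTH) {
    errors.push(`Title is longer than ${TITLE_MAX_LENGTH} characters.`);
  }
  if (input.description.length > DESCRIPTION_MAX_LENGTH) {
    errors.push(`Meta description is longer than ${DESCRIPTION_MAX_LENGTH} characters.`);
  }
  if (!ISO_639_1_CODES.has(input.language.toLowerCase())) {
    errors.push(`"${input.language}" is not a known ISO 639-1 code.`);
  }
  return errors;
}
```

Returning messages instead of throwing lets the editing interface show every problem at once, next to the field that caused it.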
You can include other logic rules that will help you streamline your work such as matching Open Graph values to other values in the document or using the first H1 as title if the title field is empty.
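A minimal sketch of those fallback rules, assuming a hypothetical `DocumentSeo` shape in which the title and Open Graph fields are optional:

```typescript
interface DocumentSeo {
  seoTitle?: string;
  description: string;
  ogTitle?: string;
  ogDescription?: string;
  headings: { level: number; text: string }[];
}

interface ResolvedSeo {
  title: string;
  description: string;
  ogTitle: string;
  ogDescription: string;
}

// Treat undefined and whitespace-only values as "not filled in".
function nonEmpty(value?: string): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

function resolveSeo(doc: DocumentSeo): ResolvedSeo {
  // Fall back to the first H1 when the title field is empty.
  const firstH1 = doc.headings.find((h) => h.level === 1)?.text;
  const title = nonEmpty(doc.seoTitle) ?? firstH1 ?? "";
  // Open Graph values match the regular ones unless an editor overrides them.
  return {
    title,
    description: doc.description,
    ogTitle: nonEmpty(doc.ogTitle) ?? title,
    ogDescription: nonEmpty(doc.ogDescription) ?? doc.description,
  };
}
```

Editors then only fill in the Open Graph fields when they want something different from the defaults.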
There should be a field within your backend where you can create and edit the URL where your content is going to live.
As always, make sure you keep your URL user friendly and include your main keyword.
Setting up your canonical URLs correctly will help search engines index the right pages for your site and it can help you avoid duplicate content issues.
Canonical URLs should be defined in the page's <head> or HTTP header.
If you have multiple versions of a page, the canonical URL should be the version you want to be indexed. Typical duplicates are URLs with UTM or other tracking parameters, versions with and without www, and http versions of a page that is served over https.
Some rules your setup should follow are:
- Use absolute URLs, including the domain and protocol.
- Define only one canonical URL per page.
You’ll need to consider what types of URLs are being generated for your site and evaluate this with your technical team to determine if you need any other canonicalisation rules. This is especially relevant on e-commerce CMSes, since they rely heavily on faceted navigation.
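The duplicate cases mentioned above (tracking parameters, www versus non-www, http versus https) could be normalized with something like the following sketch. It relies on the standard `URL` class. The `toCanonicalUrl` name, the `preferredHost` parameter and the list of tracking parameters are assumptions, and facet parameters are left untouched because handling them is site-specific.

```typescript
// Illustrative, non-exhaustive list of tracking parameters to drop.
const TRACKING_PARAMS = ["gclid", "fbclid"];

function isTrackingParam(key: string): boolean {
  const lower = key.toLowerCase();
  return lower.startsWith("utm_") || TRACKING_PARAMS.includes(lower);
}

function toCanonicalUrl(rawUrl: string, preferredHost: string): string {
  // new URL() throws on relative input, so the result is always absolute.
  const url = new URL(rawUrl);
  url.protocol = "https:";

  // Collapse the www and non-www variants onto the preferred host.
  const stripWww = (host: string): string => host.replace(/^www\./, "");
  if (stripWww(url.hostname) === stripWww(preferredHost)) {
    url.hostname = preferredHost;
  }

  // Collect keys first, then delete, to avoid mutating while iterating.
  const toDelete: string[] = [];
  url.searchParams.forEach((_value, key) => {
    if (isTrackingParam(key)) toDelete.push(key);
  });
  toDelete.forEach((key) => url.searchParams.delete(key));

  url.hash = "";
  return url.toString();
}
```

The returned string is what would go into the page's single `<link rel="canonical">` tag or the equivalent HTTP header.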
When requesting that your developers create an XML sitemap, there are a few things you should keep in mind.
You need to be clear in your request that your sitemap is not static and it will have to be updated regularly. If your website updates often, you might want a daily update to the sitemap.
You might also want to have the option to clear your sitemap’s cache and regenerate it on demand if you want the content you launch to be added fast.
You’ll have to define the rules to determine what pages will be included in your sitemap. You’ll want to add only:
- Indexable URLs
- Canonical URLs
- And URLs with a 200 HTTP response code
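The three inclusion rules above can be sketched as a filter applied before the XML is generated. The `PageRecord` shape is hypothetical and assumes your build step already knows each page's robots value, canonical URL and response code.

```typescript
// Hypothetical record of what the build step knows about each page.
interface PageRecord {
  url: string;
  canonicalUrl: string;
  robots: string; // value of the meta robots field, e.g. "noindex, follow"
  httpStatus: number;
  lastModified?: string; // ISO 8601 date such as "2024-01-31"
}

function isSitemapEligible(page: PageRecord): boolean {
  const isIndexable = !/\bnoindex\b/i.test(page.robots);
  const isCanonical = page.url === page.canonicalUrl;
  return isIndexable && isCanonical && page.httpStatus === 200;
}

function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function buildSitemap(pages: PageRecord[]): string {
  const entries = pages.filter(isSitemapEligible).map((page) => {
    const lastmod = page.lastModified ? `<lastmod>${page.lastModified}</lastmod>` : "";
    return `  <url><loc>${escapeXml(page.url)}</loc>${lastmod}</url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}
```

Because the sitemap is built from data rather than written by hand, it can be regenerated on a schedule or whenever content is published.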
Most of the time, you’ll want your sitemap to live in the root directory of your website. If that doesn’t work for you, it can live anywhere as long as you indicate it in your robots.txt file and submit it to your target search engines.
Many sites choose to divide their sitemaps based on the different types of content the website offers. This can lead to different sitemaps for posts, pages, authors, and different taxonomies.
XML sitemaps support more than just URLs, but it’s worth mentioning that Google ignores the <priority> and <changefreq> values.
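One way to sketch that split is to group URLs by content type and publish a sitemap index that points at one file per type. The `sitemap-<type>.xml` naming scheme and both function names are assumptions.

```typescript
interface SitemapEntry {
  url: string;
  contentType: string; // e.g. "post", "page", "author"
}

// Group URLs by content type so each type gets its own sitemap file.
function groupByContentType(entries: SitemapEntry[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const entry of entries) {
    const urls = groups.get(entry.contentType) ?? [];
    urls.push(entry.url);
    groups.set(entry.contentType, urls);
  }
  return groups;
}

// Build the index file that points at the per-type sitemaps.
// Assumes baseUrl has no trailing slash and no characters that need XML escaping.
function buildSitemapIndex(baseUrl: string, contentTypes: string[]): string {
  const items = contentTypes.map(
    (type) => `  <sitemap><loc>${baseUrl}/sitemap-${encodeURIComponent(type)}.xml</loc></sitemap>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...items,
    "</sitemapindex>",
  ].join("\n");
}
```

Calling `buildSitemapIndex(baseUrl, Array.from(groupByContentType(entries).keys()))` gives you the single file to reference from robots.txt.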
You can use sitemaps to submit content other than web pages. Search engines support specific sitemaps to submit content such as videos, images or news articles.
These are especially relevant for publishers and media companies, so make sure you look into it if that’s your case.
To make the most of your content, you should request a field to input schema markup. This can be included per URL or be set at a content component level, with some rules to deliver it all in one single JSON-LD script on the frontend.
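As a sketch of the "one single JSON-LD script" idea, schema fragments coming from different content components could be merged under a single `@graph`. The `JsonLdNode` type and the function name are hypothetical, and the fragments themselves would come from your own fields.

```typescript
// A schema.org fragment contributed by one content component.
type JsonLdNode = { "@type": string; [key: string]: unknown };

function buildJsonLdScript(nodes: JsonLdNode[]): string {
  const graph = { "@context": "https://schema.org", "@graph": nodes };
  // Escape "<" so a value containing "</script>" cannot close the tag early.
  const json = JSON.stringify(graph).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

// Made-up fragments from two components on the same page.
const breadcrumbNode: JsonLdNode = {
  "@type": "BreadcrumbList",
  itemListElement: [
    { "@type": "ListItem", position: 1, name: "Blog", item: "https://www.example.com/blog" },
  ],
};
const articleNode: JsonLdNode = {
  "@type": "Article",
  headline: "Headless SEO 101",
};
```

`buildJsonLdScript([breadcrumbNode, articleNode])` then yields one script tag for the frontend to place in the page.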
Schema markup will help search engines better understand your content and make you eligible for a wide range of rich results such as breadcrumbs, recipe enhancements, video results, FAQs, and many others.
Heading hierarchy is quite tricky when you’re working on a headless CMS. This is because you create your content decoupled from the layout it will have on the page, so you are going to have to pay special attention to heading hierarchy when you’re building out your content models.
According to best practice, heading hierarchy should reflect how the information is organized. This is a basic web accessibility requirement that helps visually impaired users navigate your content.
While heading hierarchy used to matter much more for search rankings than it does nowadays, ensuring your site is accessible to users of all abilities is much more than just best practice.
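Since headings are assembled from separate content components, an automated check can catch broken hierarchies before they ship. The sketch below assumes the frontend can list the headings of a rendered page in order. The single-H1 rule is a common convention rather than a hard requirement, and `findHeadingIssues` is a hypothetical name.

```typescript
interface Heading {
  level: number; // 1 for H1, 2 for H2, and so on
  text: string;
}

// Returns a list of problems; an empty list means the hierarchy looks sound.
function findHeadingIssues(headings: Heading[]): string[] {
  const issues: string[] = [];

  // Convention assumed here: exactly one H1 per page.
  const h1Count = headings.filter((h) => h.level === 1).length;
  if (h1Count !== 1) {
    issues.push(`Expected exactly one H1, found ${h1Count}.`);
  }

  // A heading may go at most one level deeper than the heading before it.
  let previousLevel = 0;
  for (const heading of headings) {
    if (heading.level > previousLevel + 1) {
      const from = previousLevel === 0 ? "the start of the page" : `H${previousLevel}`;
      issues.push(`"${heading.text}" skips from ${from} to H${heading.level}.`);
    }
    previousLevel = heading.level;
  }
  return issues;
}
```

Moving back up the hierarchy (for example from an H3 to an H2) is allowed; only skipped levels on the way down are flagged.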
A headless architecture is a great tool to empower businesses to own every aspect of the digital experience they’re creating for their users.
The need to request a specific technical set-up can create some friction when getting started with headless SEO, but the level of control and integration offered by headless CMSes is absolutely worth any extra upfront work.
The possibilities offered by headless SEO are endless and exciting. Imagine the incredible content-led experiences that ecommerce merchants will be able to create for their potential customers or how much easier content management will be for huge international enterprise websites.
But the beauty of headless SEO is everything we can’t imagine yet, all the amazing ways in which digital teams are going to leverage all of this potential.
Sanity: unlocking the potential of headless SEO
Sanity is the SEO-friendly structured content platform you’ve been dreaming of. With all the features of a top-tier headless CMS, Sanity goes beyond simple content management, enabling you to scale your SEO strategy beyond your website and into the next generation of omnichannel search.