Track
Replatforming from a legacy CMS to a Content Operation System

Migrating content from WordPress to Sanity

Lesson
4

Preparing your migration script

Import WordPress content into Sanity in bulk using the CLI Migration tool, creating a script that queries the WordPress REST API and writes documents to Sanity.


There are several ways to write scripts to import content into Sanity in bulk, which are covered in the Scripting content migrations lesson. For this course, you'll use the CLI Migration tooling, as recommended.

See the documentation for more about Migration.
Take the Handling schema changes confidently course to learn more about the tooling.

In the root of your Studio project, create a new migration script called "Import WP":

npx sanity@latest migration create "Import WP"
  • Skip defining document types in the following prompt
  • Choose the "minimalistic migration" template because you'll replace this file anyway.
Bootstrap a migration script with the instructions above

Once finished, you should have a file in your Studio project at the following path that looks something like this:

./migrations/import-wp/index.ts
import {at, defineMigration, setIfMissing, unset} from 'sanity/migrate'

export default defineMigration({
  title: 'import-wp',
  migrate: {
    document(doc, context) {
      // ...and so on

Migration scripts are primarily used for handling schema changes for a dataset in the Sanity Content Lake, such as renaming fields and removing or changing specific values across documents.

You can use the same tooling to do whatever you like – including creating new documents in bulk. Rather than targeting existing documents in the dataset, the migration script you're creating will query the WordPress REST API, iterate over the results, and return new createOrReplace mutations. Because each document's _id is derived from its WordPress ID, running the script again overwrites the same documents instead of duplicating them, which makes the script idempotent.
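To make the idempotency point concrete, here is a minimal sketch that models the dataset as an in-memory Map. The applyCreateOrReplace helper, the Doc type, and the sample posts are hypothetical stand-ins, not part of the Sanity API; they only mimic the idea that a document with a given _id is created if missing and overwritten otherwise.

```typescript
// Hypothetical, simplified document shape for illustration only
type Doc = {_id: string; _type: string; title?: string}

// Stand-in for a dataset: documents keyed by _id (an assumption for this sketch)
const dataset = new Map<string, Doc>()

// Mimics createOrReplace semantics: create when missing, overwrite when present
function applyCreateOrReplace(doc: Doc): void {
  dataset.set(doc._id, doc)
}

// A deterministic _id derived from the source ID is what makes reruns safe
function importPosts(wpPosts: {id: number; title: string}[]): void {
  for (const wpPost of wpPosts) {
    applyCreateOrReplace({_id: `post-${wpPost.id}`, _type: 'post', title: wpPost.title})
  }
}

const samplePosts = [
  {id: 1, title: 'Hello world'},
  {id: 2, title: 'Second post'},
]

importPosts(samplePosts)
importPosts(samplePosts) // running again overwrites instead of duplicating

console.log(dataset.size) // 2
```

Had the _id been random on every run, the second call would have doubled the number of documents.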

The major benefit of using the migration tooling is that it will automatically batch mutations into transactions to avoid hitting rate limits. It also supports testing with "dry runs" by default and provides visual feedback in the terminal when running the script.
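The general idea of batching can be sketched as below. This illustrates the technique only: the chunk helper and the batch size of 2 are hypothetical, and the migration tooling decides its own transaction boundaries internally.

```typescript
// Hypothetical helper: split a list of mutations into fixed-size batches
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('size must be at least 1')
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// Five pretend mutations, identified by document ID
const mutationIds = ['post-1', 'post-2', 'post-3', 'post-4', 'post-5']

// Each inner array stands for one transaction sent in a single request
const transactions = chunk(mutationIds, 2)
console.log(transactions.length) // 3
```

Sending three requests instead of five is a small saving here, but the same grouping applied to thousands of documents is what keeps a bulk import under API rate limits.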

You'll use some development dependencies as you build out the migration script. To save time, install them all now.

Install these development dependencies to your Studio project.
npm install -D wp-types html-entities p-limit @sanity/block-tools @wordpress/block-serialization-default-parser jsdom
  • wp-types is a collection of WordPress types for API responses
  • html-entities contains a helper function for decoding entities in HTML strings
  • p-limit can throttle the number of concurrent asynchronous operations in an array of promises – used when uploading assets
  • @sanity/block-tools converts HTML strings to Portable Text
  • jsdom converts HTML strings into a DOM that can be traversed
  • @wordpress/block-serialization-default-parser can convert pre-processed HTML stored in the WordPress block editor into an array of block objects
Create a file to store all the Types used in your migration script.
./migrations/import-wp/types.ts
import type {
  WP_REST_API_Categories,
  WP_REST_API_Pages,
  WP_REST_API_Posts,
  WP_REST_API_Tags,
  WP_REST_API_Users,
} from 'wp-types'

export type WordPressDataType = 'categories' | 'posts' | 'pages' | 'tags' | 'users'

export type WordPressDataTypeResponses = {
  categories: WP_REST_API_Categories
  posts: WP_REST_API_Posts
  pages: WP_REST_API_Pages
  tags: WP_REST_API_Tags
  users: WP_REST_API_Users
}

export type SanitySchemaType = 'category' | 'post' | 'page' | 'tag' | 'author'
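One way these two unions could be tied together is a typed lookup from WordPress endpoint to Sanity schema type. The sketch below is an assumption about how the types relate (each endpoint paired with the schema type in the same position); the wpToSanityType and sanityIdFor names are hypothetical and not part of the course code.

```typescript
// Types repeated from types.ts so this sketch is self-contained
type WordPressDataType = 'categories' | 'posts' | 'pages' | 'tags' | 'users'
type SanitySchemaType = 'category' | 'post' | 'page' | 'tag' | 'author'

// Hypothetical lookup: which Sanity schema type each WordPress endpoint feeds.
// Record<...> makes the compiler flag any endpoint left without a mapping.
const wpToSanityType: Record<WordPressDataType, SanitySchemaType> = {
  categories: 'category',
  posts: 'post',
  pages: 'page',
  tags: 'tag',
  users: 'author',
}

// Hypothetical helper that builds a deterministic document ID
function sanityIdFor(wpType: WordPressDataType, wpId: number): string {
  return `${wpToSanityType[wpType]}-${wpId}`
}

console.log(sanityIdFor('users', 7)) // author-7
```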
Create a file to store constants and update the BASE_URL variable to use your WordPress website's URL.
./migrations/import-wp/constants.ts
// Replace this with your WordPress site's WP-JSON REST API URL
export const BASE_URL = `https://<your-domain>/wp-json/wp/v2`
export const PER_PAGE = 100
Create a helper function that returns a page of results from the WordPress REST API.
./migrations/import-wp/lib/wpDataTypeFetch.ts
import {BASE_URL, PER_PAGE} from '../constants'
import type {WordPressDataType, WordPressDataTypeResponses} from '../types'

// Fetch a single page of results for the given WordPress data type
export async function wpDataTypeFetch<T extends WordPressDataType>(
  type: T,
  page: number
): Promise<WordPressDataTypeResponses[T]> {
  const wpApiUrl = new URL(`${BASE_URL}/${type}`)
  wpApiUrl.searchParams.set('page', page.toString())
  wpApiUrl.searchParams.set('per_page', PER_PAGE.toString())

  // A non-OK response resolves to null, which callers treat as "no more results"
  return fetch(wpApiUrl).then((res) => (res.ok ? res.json() : null))
}
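To see what URL wpDataTypeFetch ends up requesting, the sketch below repeats only its URL-building step without making a network call. The buildWpUrl name and the example.com base URL are placeholders for illustration.

```typescript
// Placeholder base URL; example.com is an assumption for this sketch
const EXAMPLE_BASE_URL = 'https://example.com/wp-json/wp/v2'
const EXAMPLE_PER_PAGE = 100

// Mirrors the URL-building step of wpDataTypeFetch without making a request
function buildWpUrl(type: string, page: number): string {
  const url = new URL(`${EXAMPLE_BASE_URL}/${type}`)
  url.searchParams.set('page', page.toString())
  url.searchParams.set('per_page', EXAMPLE_PER_PAGE.toString())
  return url.toString()
}

console.log(buildWpUrl('posts', 2))
// https://example.com/wp-json/wp/v2/posts?page=2&per_page=100
```

Pasting the equivalent URL for your own site into a browser is a quick way to check that the REST API is reachable before running the migration.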

With these files created, you should now have a directory structure at the root of your Studio like this:

migrations
└── import-wp
    ├── types.ts
    ├── index.ts
    ├── constants.ts
    └── lib
        └── wpDataTypeFetch.ts

With the required dependencies installed and some basic helpers in place, replace the migration script created by the CLI.

Replace the entire contents of your migration script with the code below.
./migrations/import-wp/index.ts
import type {SanityDocumentLike} from 'sanity'
import {createOrReplace, defineMigration} from 'sanity/migrate'

import {wpDataTypeFetch} from './lib/wpDataTypeFetch'

// This will import `post` documents into Sanity from the WordPress API
export default defineMigration({
  title: 'Import WP',

  async *migrate() {
    const wpType = 'posts'
    let page = 1
    let hasMore = true

    while (hasMore) {
      try {
        const wpData = await wpDataTypeFetch(wpType, page)

        if (Array.isArray(wpData) && wpData.length) {
          const docs: SanityDocumentLike[] = []

          for (const wpDoc of wpData) {
            const doc: SanityDocumentLike = {
              _id: `post-${wpDoc.id}`,
              _type: 'post',
              title: wpDoc.title?.rendered.trim(),
            }

            docs.push(doc)
          }

          yield docs.map((doc) => createOrReplace(doc))
          page++
        } else {
          hasMore = false
        }
      } catch (error) {
        console.error(`Error fetching data for page ${page}:`, error)
        // Stop the loop in case of an error
        hasMore = false
      }
    }
  },
})

This new version of the script uses the wpDataTypeFetch function to query the WordPress REST API at the URL set in your constants.ts file.

Each request returns up to 100 posts, which are then iterated over in a for...of loop to stage a new Sanity document for each:

  • The _id is generated from the WordPress post ID
  • The _type is set to post, which means that these documents will appear under Post in your Studio
  • The title is also set here so it's easier to visualize how this script works.
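The mapping described in the list above can be pulled out into a pure function, which makes it easy to try on sample data. This is a sketch only: WpPostLike and StagedPost are hypothetical, pared-down shapes (a real WordPress post has many more fields), and toStagedPost is not part of the course code.

```typescript
// Minimal, hypothetical subset of a WordPress post response
type WpPostLike = {id: number; title?: {rendered: string}}

// Minimal, hypothetical shape of the staged Sanity document
type StagedPost = {_id: string; _type: 'post'; title?: string}

// Same mapping as the loop body in the migration script, as a pure function
function toStagedPost(wpDoc: WpPostLike): StagedPost {
  return {
    _id: `post-${wpDoc.id}`, // deterministic ID derived from the WordPress ID
    _type: 'post',
    title: wpDoc.title?.rendered.trim(), // undefined when the post has no title
  }
}

const staged = toStagedPost({id: 685064, title: {rendered: '  Robotic Assembly  '}})
console.log(staged._id) // post-685064
console.log(staged.title) // Robotic Assembly
```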

You might have HTML entities in your titles – you will deal with these later in this course.

The while loop keeps requesting the next page of up to 100 posts until no more are returned. Yes, similar to the iconic while ( have_posts() ) : the_post(); WordPress loop.
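The script above stops when a page comes back empty or with a non-OK response. An alternative, sketched below, is to read the X-WP-TotalPages header that the WordPress REST API sends with collection responses and stop once the last page has been fetched. The HeaderReader type and both helper names are hypothetical; the headers object of a fetch response satisfies the same get(name) shape.

```typescript
// Structural type for anything with a get(name) lookup, such as fetch response headers
type HeaderReader = {get(name: string): string | null}

// Reads the total page count reported for a collection request.
// Returns null when the header is missing or not a positive integer.
function totalPagesFrom(headers: HeaderReader): number | null {
  const raw = headers.get('X-WP-TotalPages')
  if (raw === null) return null
  const total = Number.parseInt(raw, 10)
  return Number.isInteger(total) && total > 0 ? total : null
}

// Hypothetical helper: should the loop request another page?
function hasNextPage(currentPage: number, headers: HeaderReader): boolean {
  const total = totalPagesFrom(headers)
  return total !== null && currentPage < total
}

// Simulated response headers for a site with 3 pages of posts
const exampleHeaders: HeaderReader = {
  get: (name) => (name.toLowerCase() === 'x-wp-totalpages' ? '3' : null),
}

console.log(hasNextPage(1, exampleHeaders)) // true
console.log(hasNextPage(3, exampleHeaders)) // false
```

This avoids one extra request past the final page, at the cost of having wpDataTypeFetch return the headers alongside the data.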

As these posts are being "staged," the migration tooling will batch them into transactions, which are committed once the transaction reaches a certain size.

Run your migration script. By default, it will perform a "dry run" where nothing is written to the dataset.
npx sanity@latest migration run import-wp

You should get visual feedback in the terminal for the posts that would be created. Something like this:

Running migration "import-wp" in dry mode
Project id: f3lbec6z
Dataset: production
createOrReplace post post-685064
{
  "_id": "post-685064",
  "_type": "post",
  "title": "Robotic Assembly and Outfitting for NASA Space Missions"
}
Run the script again, with dry run disabled, to write documents to your dataset.

You will need to confirm you wish to proceed.

npx sanity@latest migration run import-wp --no-dry-run

Once run, you should receive a summary of the finished mutations and transactions (numbers will vary).

0 documents processed.
30 mutations generated.
1 transactions committed.

Fantastic! You now have a migration script that can query the WordPress REST API and write as many documents to Sanity as it finds.

However, it only looks for posts.

Let's make this script much smarter with a few options.
