Track
Replatforming from a legacy CMS to a Content Operation System

Migrating content from WordPress to Sanity

Lesson
5

Processing post types

Create a dynamic migration script for WordPress post and page types that accepts custom content transformation options, so the same script can be reused across types.

Currently, the migration script is hard-coded to query WordPress only for posts and to create only post documents.

You'll want to be able to import many WordPress built-in types, as well as custom types. It's beneficial to be able to choose which type to import when executing the migration script like this:

npx sanity@latest migration run import-wp --type=tags

In the example above, --type=tags is an argument that is passed from the terminal and read during the script's execution. Let's update the migration script to accept a specific WordPress post type when run and to select the matching Sanity Studio schema type for the new documents.

Update your constants file to create a map of WordPress types to Sanity Studio schema types
migrations/import-wp/constants.ts
import type {SanitySchemaType, WordPressDataType} from './types'

// Replace this with your WordPress site's WP-JSON REST API URL
export const BASE_URL = 'https://<your-domain>/wp-json/wp/v2'
export const PER_PAGE = 100

export const WP_TYPE_TO_SANITY_SCHEMA_TYPE: Record<WordPressDataType, SanitySchemaType> = {
  categories: 'category',
  posts: 'post',
  pages: 'page',
  tags: 'tag',
  users: 'author',
}

You'll extend this object in another lesson when importing custom post types. You'll also account for situations where something that lives in WordPress as a page may actually become structured content of a different type in Sanity.
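
The constants file imports two types from ./types that this lesson does not show. As a minimal sketch, assuming only the five built-in types mapped above, they could be string-literal unions like the ones below. Your own types file from earlier lessons may differ, and in the real file both types would be exported.

```typescript
// Hypothetical contents for migrations/import-wp/types.ts (an assumption,
// not shown in this lesson): string-literal unions for both sides of the map.
type WordPressDataType = 'categories' | 'posts' | 'pages' | 'tags' | 'users'
type SanitySchemaType = 'category' | 'post' | 'page' | 'tag' | 'author'

// Record<K, V> requires one entry per WordPress type, so adding a new member
// to the union without mapping it becomes a compile-time error.
const WP_TYPE_TO_SANITY_SCHEMA_TYPE: Record<WordPressDataType, SanitySchemaType> = {
  categories: 'category',
  posts: 'post',
  pages: 'page',
  tags: 'tag',
  users: 'author',
}

console.log(WP_TYPE_TO_SANITY_SCHEMA_TYPE.users) // author
```

Typing the map as a Record rather than a plain object is what lets the compiler flag a WordPress type that has no Sanity counterpart yet.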

Create a helper function to read the current arguments and throw an error if they are missing or invalid:
migrations/import-wp/lib/getDataTypes.ts
import {WP_TYPE_TO_SANITY_SCHEMA_TYPE} from '../constants'
import type {SanitySchemaType, WordPressDataType} from '../types'

// Get WordPress type from CLI arguments, and the corresponding Sanity schema type
export function getDataTypes(args: string[]): {
  wpType: WordPressDataType
  sanityType: SanitySchemaType
} {
  // Read the value after "--type=", if that argument was supplied
  const wpType = args
    .find((a) => a.startsWith('--type='))
    ?.split('=')
    .pop() as WordPressDataType
  const sanityType = WP_TYPE_TO_SANITY_SCHEMA_TYPE[wpType]

  // Fail early when the argument is missing or has no mapped schema type
  if (!wpType || !sanityType) {
    throw new Error(
      `Invalid WordPress data type, specify one with --type=<type>: ${Object.keys(
        WP_TYPE_TO_SANITY_SCHEMA_TYPE,
      ).join(', ')}`,
    )
  }

  return {wpType, sanityType}
}

This helper accepts the current arguments (from process.argv) and returns both the WordPress type being queried and the matching Sanity schema type to use for creating new documents.

So, for example, if you run:

npx sanity@latest migration run import-wp --type=tags

The helper function above will return:

{"wpType": "tags", "sanityType": "tag"}
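
To see the valid and invalid cases side by side, the sketch below inlines a trimmed copy of the map and helper so it can run on its own. The names TYPE_MAP and getDataTypesDemo, and the argument arrays standing in for process.argv, are made up for this illustration.

```typescript
// Trimmed, self-contained copy of the map and helper, for demonstration only.
const TYPE_MAP: Record<string, string> = {
  categories: 'category',
  posts: 'post',
  pages: 'page',
  tags: 'tag',
  users: 'author',
}

function getDataTypesDemo(args: string[]): {wpType: string; sanityType: string} {
  const wpType = args
    .find((a) => a.startsWith('--type='))
    ?.split('=')
    .pop()
  const sanityType = wpType ? TYPE_MAP[wpType] : undefined
  if (!wpType || !sanityType) {
    throw new Error(
      `Invalid WordPress data type, specify one with --type=<type>: ${Object.keys(TYPE_MAP).join(', ')}`,
    )
  }
  return {wpType, sanityType}
}

// Hypothetical argument list, shaped like process.argv
const valid = getDataTypesDemo(['node', 'sanity', 'migration', 'run', 'import-wp', '--type=tags'])
console.log(valid) // { wpType: 'tags', sanityType: 'tag' }

// An unmapped type throws, and the message lists the accepted values
let message = ''
try {
  getDataTypesDemo(['node', 'sanity', '--type=comments'])
} catch (error) {
  message = error instanceof Error ? error.message : String(error)
}
console.log(message)
```

Throwing before any request is made means a typo in --type stops the run immediately instead of producing an empty or partial import.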

Now you can use this helper in your main migration script.

Replace your migration script with the code below, which uses this new helper function.
migrations/import-wp/index.ts
import {decode} from 'html-entities'
import type {SanityDocumentLike} from 'sanity'
import {createOrReplace, defineMigration} from 'sanity/migrate'
import type {WP_REST_API_Post, WP_REST_API_Term, WP_REST_API_User} from 'wp-types'

import {getDataTypes} from './lib/getDataTypes'
import {wpDataTypeFetch} from './lib/wpDataTypeFetch'

// Allow the migration script to import a specific post type when run
export default defineMigration({
  title: 'Import WP JSON data',

  async *migrate() {
    const {wpType, sanityType} = getDataTypes(process.argv)
    let page = 1
    let hasMore = true

    while (hasMore) {
      try {
        const wpData = await wpDataTypeFetch(wpType, page)

        if (Array.isArray(wpData) && wpData.length) {
          const docs: SanityDocumentLike[] = []

          for (let wpDoc of wpData) {
            // Every document gets a deterministic _id so re-runs replace it
            const doc: SanityDocumentLike = {
              _id: `${sanityType}-${wpDoc.id}`,
              _type: sanityType,
            }

            // Add fields depending on which WordPress type is being imported
            if (wpType === 'posts' || wpType === 'pages') {
              wpDoc = wpDoc as WP_REST_API_Post
              doc.title = decode(wpDoc.title.rendered).trim()
            } else if (wpType === 'categories' || wpType === 'tags') {
              wpDoc = wpDoc as WP_REST_API_Term
              doc.name = decode(wpDoc.name).trim()
            } else if (wpType === 'users') {
              wpDoc = wpDoc as WP_REST_API_User
              doc.name = decode(wpDoc.name).trim()
            }

            docs.push(doc)
          }

          yield docs.map((doc) => createOrReplace(doc))
          page++
        } else {
          hasMore = false
        }
      } catch (error) {
        console.error(`Error fetching data for page ${page}:`, error)
        // Stop the loop in case of an error
        hasMore = false
      }
    }
  },
})

The key changes are the call to getDataTypes and the branching on wpType inside the loop. Now, when running the migration script, you must supply a valid WordPress type, and the script will create documents differently depending on which type they are.

Note also the decode function, which converts HTML entities (such as &amp;) in the WordPress REST API response back into plain characters. The string is also trimmed, as it may begin or end with whitespace.
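
The per-type branching inside the loop can also be read as a small pure function, which is easier to test without calling WordPress. The sketch below is only an illustration under stated assumptions: WpRecord is a pared-down stand-in for the wp-types interfaces, toSanityDoc is a hypothetical name, and HTML entity decoding is left out so the block runs without dependencies.

```typescript
// Minimal stand-ins for the fields this lesson reads from the REST API.
// The real script uses WP_REST_API_Post, WP_REST_API_Term and WP_REST_API_User.
type WpTitled<K extends string> = {kind: K; id: number; title: {rendered: string}}
type WpNamed<K extends string> = {kind: K; id: number; name: string}
type WpRecord =
  | WpTitled<'posts'>
  | WpTitled<'pages'>
  | WpNamed<'categories'>
  | WpNamed<'tags'>
  | WpNamed<'users'>

type StagedDoc = {_id: string; _type: string; title?: string; name?: string}

// Hypothetical helper: builds the staged document for one WordPress record.
function toSanityDoc(record: WpRecord, sanityType: string): StagedDoc {
  const doc: StagedDoc = {_id: `${sanityType}-${record.id}`, _type: sanityType}
  if (record.kind === 'posts' || record.kind === 'pages') {
    // Posts and pages expose their title as a rendered string
    doc.title = record.title.rendered.trim()
  } else {
    // Terms and users expose a plain name
    doc.name = record.name.trim()
  }
  return doc
}

const tagDoc = toSanityDoc({kind: 'tags', id: 7, name: ' News '}, 'tag')
console.log(tagDoc) // { _id: 'tag-7', _type: 'tag', name: 'News' }
```

Keeping the mapping separate from the fetch loop is one way to approach the reuse this lesson is aiming for, since each new type only needs a new branch or a new mapper.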

You could use a runtime validation library such as Zod to check the validity of every value added to a staged document before it is added to the transaction.
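
Zod is one option. To illustrate the idea without adding a dependency, the sketch below swaps in a hand-written type guard instead. The name isValidStagedDoc and the rules it enforces are assumptions made for this example, not part of the course code.

```typescript
type StagedDocument = {_id: string; _type: string; [key: string]: unknown}

// Hypothetical guard: a staged document needs a non-empty _id and _type,
// and any title or name that is present must be a non-empty string.
function isValidStagedDoc(value: unknown): value is StagedDocument {
  if (typeof value !== 'object' || value === null) return false
  const doc = value as Record<string, unknown>
  const isFilled = (v: unknown): boolean => typeof v === 'string' && v.trim().length > 0
  if (!isFilled(doc._id) || !isFilled(doc._type)) return false
  for (const key of ['title', 'name']) {
    if (key in doc && !isFilled(doc[key])) return false
  }
  return true
}

const candidates: unknown[] = [
  {_id: 'post-1', _type: 'post', title: 'Hello world'},
  {_id: 'post-2', _type: 'post', title: '   '}, // blank title is rejected
  {_type: 'tag', name: 'News'}, // missing _id is rejected
]

// Only documents that pass the guard would go on to the transaction
const validDocs = candidates.filter(isValidStagedDoc)
console.log(validDocs.map((d) => d._id)) // [ 'post-1' ]
```

In the migration script, a check like this would sit just before docs.push(doc), with rejected documents logged so they can be fixed at the source.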

Run the script now with a type argument
npx sanity@latest migration run import-wp --no-dry-run --type=posts

You should now see the same post documents with titles in your Studio. You can now re-run the script for pages, categories, tags, and users, and watch the Studio update as each transaction completes:

npx sanity@latest migration run import-wp --no-dry-run --type=pages
npx sanity@latest migration run import-wp --no-dry-run --type=categories
npx sanity@latest migration run import-wp --no-dry-run --type=tags
npx sanity@latest migration run import-wp --no-dry-run --type=users

Your script now handles multiple post types that are core to WordPress. In the next few lessons, we'll create more complete documents.
