Track
Replatforming from a legacy CMS to a Content Operation System

Refactoring content for migration

Lesson
6

Validating incoming content

Never trust your existing content source. Validate all data during a migration to avoid future headaches.

The formatting of your existing content may be problematic or insufficient. Examples include:

  • You may need to escape HTML entities or process Markdown formatting.
  • You may want to trim strings to remove whitespace.
  • Integers may need to be converted to strings, or vice versa.
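
The clean-up steps listed above can be sketched as a few small, dependency-free helpers. The function names are hypothetical, the entity table is deliberately incomplete, and the example decodes entities (the direction you need when the source already delivers entity-encoded text); treat it as a starting point rather than a complete sanitizer.

```typescript
// Hypothetical helpers for cleaning legacy values. Names are illustrative only.

// A handful of common HTML entities (assumption: not an exhaustive list).
const ENTITIES: Record<string, string> = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#39;': "'",
}

// Decode the entities above in a single pass, so '&amp;lt;' becomes '&lt;', not '<'.
function decodeBasicEntities(input: string): string {
  return input.replace(/&(?:amp|lt|gt|quot|#39);/g, (match) => ENTITIES[match] ?? match)
}

// Trim a string and collapse runs of internal whitespace into single spaces.
function cleanString(input: string): string {
  return input.trim().replace(/\s+/g, ' ')
}

// Convert a number or numeric string to an integer.
// Returns undefined for empty strings, non-numeric input, and non-integers.
function toInteger(value: unknown): number | undefined {
  if (typeof value === 'number') return Number.isInteger(value) ? value : undefined
  if (typeof value !== 'string' || value.trim() === '') return undefined
  const parsed = Number(value.trim())
  return Number.isInteger(parsed) ? parsed : undefined
}
```

Returning `undefined` instead of throwing lets the calling migration script decide whether to skip the field, fall back to a default, or abort.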

TypeScript adds static type definitions to JavaScript. It is useful in migration projects because you are probably moving from messy, idiosyncratic content (we assume!) to tidy, structured content, and types make the shape of that target content explicit.

If you have started configuring your content model for Sanity Studio, you can use Sanity TypeGen to generate types for it and use those types for the output in your migration scripts.

Check out the Typed content with Sanity TypeGen course.

Example of using generated types in a migration script:

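import type { Post } from './sanity.types' // Generated by Sanity TypeGen
import { createOrReplace, defineMigration } from 'sanity/migrate'
// Helper that fetches one page of WordPress REST API results.
// The path is an assumption; point it at wherever the helper lives in your project.
import { wpDataTypeFetch } from './lib/wpDataTypeFetch'

export default defineMigration({
  title: 'Import WP JSON data',
  async *migrate(documents) {
    const wpType = "posts";
    let page = 1;
    let hasMore = true;

    // Request pages until the API returns an empty result or an error occurs
    while (hasMore) {
      try {
        const wpData = await wpDataTypeFetch(wpType, page);

        if (Array.isArray(wpData) && wpData.length) {
          for (const wpDoc of wpData) {
            // Typing the output as Post makes the compiler flag missing or mistyped fields
            const doc: Post = {
              _id: `post-${wpDoc.id}`,
              _type: "post",
              // Add other required fields here based on wpDoc structure
            };

            yield createOrReplace(doc);
          }
          page++;
        } else {
          hasMore = false;
        }
      } catch (error) {
        console.error(`Error fetching data for page ${page}:`, error);
        hasMore = false; // Stop the loop in case of an error
      }
    }
  },
});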

TypeScript types are only checked at compile time, so they cannot catch malformed data that arrives while the script runs. For that, consider a runtime validation library such as Zod, a popular choice, to validate and transform incoming data and to throw or log errors when something unexpected turns up.
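
As a minimal sketch of runtime validation with Zod: the schema below assumes each incoming WordPress item has an `id`, a `slug`, and a `title.rendered` string. The schema name, the `parseWpPost` helper, and the choice of fields are illustrative assumptions, so adjust them to the data your source actually returns.

```typescript
import {z} from 'zod'

// Assumed shape of one incoming WordPress post; only three fields are modeled here.
const WpPostSchema = z.object({
  // Accept 12 or "12", then require a positive integer
  id: z.coerce.number().int().positive(),
  // Strip surrounding whitespace, then reject empty slugs
  slug: z.string().trim().min(1),
  title: z.object({rendered: z.string().trim()}),
})

type WpPost = z.infer<typeof WpPostSchema>

// Hypothetical helper: returns cleaned data, or undefined (with a log entry) if invalid.
function parseWpPost(input: unknown): WpPost | undefined {
  const result = WpPostSchema.safeParse(input)
  if (!result.success) {
    console.error('Skipping invalid WordPress post:', result.error.issues)
    return undefined
  }
  return result.data
}
```

Inside a migration loop you could call `parseWpPost(wpDoc)` and `continue` when it returns `undefined`. Using `safeParse` rather than `parse` means one bad record is logged instead of aborting the whole run; swap it for `parse` if you would rather fail fast.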

Your Sanity Studio schema types should also include validation rules on all content. This way, as your content is migrated into the Content Lake, you can validate all new documents against these rules from the terminal by running:

npx sanity@latest documents validate

See the documentation about Validation on schema types and validating all documents with the CLI.
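
For reference, validation rules on a schema type might look something like the following. This is a sketch of a hypothetical `post` type: the field names and the 120-character limit are assumptions, not part of this course's content model.

```typescript
import {defineField, defineType} from 'sanity'

// Hypothetical "post" document type with basic validation rules
export const postType = defineType({
  name: 'post',
  title: 'Post',
  type: 'document',
  fields: [
    defineField({
      name: 'title',
      type: 'string',
      // Required, with an assumed maximum length of 120 characters
      validation: (rule) => rule.required().max(120),
    }),
    defineField({
      name: 'slug',
      type: 'slug',
      options: {source: 'title'},
      validation: (rule) => rule.required(),
    }),
  ],
})
```

Documents that break these rules should then be reported when you run the validate command above, so you can catch them right after the migration instead of when an editor opens them later.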
