
Discussion about special characters in Portable Text to HTML conversion

15 replies
Last updated: Oct 5, 2022
Does Portable Text to HTML accept special characters? We need to be able to insert characters like Ă. (We currently use the old blocks to HTML, but will be upgrading soon.)
Oct 4, 2022, 7:10 PM
I believe it does! Is that not the behavior you're getting?
Oct 4, 2022, 7:15 PM
It works fine with emojis, Japanese characters, Cyrillic characters, etc. So it should be fine, there is no reason these characters won’t work. They’re not more special than any other. :)
Oct 4, 2022, 8:37 PM
I am able to get some special characters to work with our current blocks to html, but not all. Here is a screenshot from a coworker of what she is trying to insert vs what she gets
Oct 4, 2022, 8:44 PM
And here is what I type in vs what I get
Oct 4, 2022, 8:47 PM
Is it possible that those characters don't exist in the font you're using?
Oct 4, 2022, 9:15 PM
This looks like an encoding issue, not a font issue. For example, á (a with acute) is represented in UTF-8 encoding by the two bytes 0xC3 0xA1. Those same two bytes in Windows-1252 encoding represent Ã (A with tilde) followed by ¡ (inverted exclamation mark).
Oct 4, 2022, 9:36 PM
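The byte-level explanation above can be reproduced in a few lines. This is a minimal sketch in Python, using the standard library's `cp1252` codec as the name for Windows-1252. It only illustrates the mechanism and is not taken from the thread's actual setup.

```python
# "á" (U+00E1) encoded as UTF-8 is the two bytes 0xC3 0xA1.
utf8_bytes = "á".encode("utf-8")
print(utf8_bytes)  # b'\xc3\xa1'

# Decoding those same bytes with the wrong codec (Windows-1252)
# yields two characters instead of one: "Ã" followed by "¡".
garbled = utf8_bytes.decode("cp1252")
print(garbled)  # Ã¡

# As long as no bytes were lost, the mix-up is reversible:
# re-encode with the wrong codec, then decode with the right one.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)  # á
```

Seeing pairs such as "Ã¡" where a single accented letter was expected is a typical sign that UTF-8 bytes are being read as Windows-1252.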
Is this something I can fix, or is it a Sanity thing, or something different altogether?
Oct 5, 2022, 4:27 PM
ooh, maybe if I can change it to UTF-16?
Oct 5, 2022, 5:28 PM
UTF-8 is the most common and standard encoding in the web world, so I would use that if I had the choice. Sanity's API serves the results in that encoding as far as I can see (their Content-Type header says application/json;charset=utf-8 here), so you'd have to convert it if you need something else.
Oct 5, 2022, 5:35 PM
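As an illustration of reading the charset from a Content-Type header and converting only when something other than UTF-8 is needed, here is a small Python sketch. The helper names `charset_of` and `decode_body` and the sample JSON body are hypothetical; only the header value comes from the post above.

```python
from email.message import Message
from typing import Optional


def charset_of(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, e.g. "utf-8"


def decode_body(body: bytes, content_type: str, default: str = "utf-8") -> str:
    """Decode response bytes using the declared charset, falling back to a default."""
    return body.decode(charset_of(content_type) or default)


# Hypothetical response body, encoded the way the header declares.
body = '{"title": "á"}'.encode("utf-8")
text = decode_body(body, "application/json;charset=utf-8")

# Converting to another encoding is an explicit, separate step.
utf16_bytes = text.encode("utf-16")
```

Once the bytes are decoded with the declared charset, the text can be re-encoded into whatever a downstream system expects.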
Most likely you can fix it by configuring the web site (wherever the right-hand-side parts of the screenshots come from) to serve the content as UTF-8. Exactly how that is done will depend on how the site is hosted, but the HTTP header should say Content-Type: text/html;charset=utf-8.
Oct 5, 2022, 5:38 PM
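How the header is set depends on the host, as noted above. Purely as an illustration, this is what it could look like in a tiny server built on Python's standard `http.server` module. The handler name `Utf8Handler`, the helper `make_server`, the port, and the sample page are all assumptions for this sketch, not part of the original setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page containing non-ASCII characters.
PAGE = (
    "<!doctype html><html><head><title>Preview</title></head>"
    "<body><p>á Ă</p></body></html>"
)


class Utf8Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Encode the body as UTF-8 and declare that same encoding in the header.
        body = PAGE.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html;charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging for this sketch.
        pass


def make_server(port: int = 8000) -> HTTPServer:
    """Create a local server that serves PAGE with a UTF-8 Content-Type header."""
    return HTTPServer(("127.0.0.1", port), Utf8Handler)


# To try it locally: make_server().serve_forever()
```

The important part is that the bytes written and the charset declared in the header agree. Hosted platforms and web servers usually expose this as a configuration setting instead.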
Failing that, putting <meta charset="utf-8"> in the actual HTML's head element is an option.
Oct 5, 2022, 5:39 PM
oh ok, I misunderstood and thought some of the characters I wanted weren't in UTF-8. Just looked at the character list and see I was incorrect. I will look at what you suggested
Oct 5, 2022, 5:53 PM
Many thanks for stepping in here, user Q!
Oct 5, 2022, 5:54 PM
Ah, no worries. UTF-8 can encode everything in Unicode, just like UTF-16 🙂
Oct 5, 2022, 5:55 PM
Added <meta charset="UTF-8" /> in the preview where it was missing and it now displays as expected. So happy it was an easy fix. Thank you for the guidance!!!
Oct 5, 2022, 8:30 PM
