Sublime Forum

Extended characters getting corrupted

#1

Hello,

I’m having an issue where some characters are becoming corrupted in my HTML documents after I save and then reopen them.

For a site I’m working on, I have a capital U with an umlaut in the document, Ü. When I save these documents and reopen them, that character is replaced with a character I don’t recognize, 脺. (In case that character is scrambled after I post, it looks to be some Chinese character.)

Why is this happening? Any thoughts? It seems like I may have some character setting messed up in Sublime Text 2, but I’m not sure what the solution is. (Converting the character to an HTML entity isn’t always a good option, though I have done that as a workaround. I would prefer that the characters I enter not change.)

Machine details:
Mac running OS X 10.8.2

#2

What character encoding and font are you using in that file?

If the character encoding doesn’t support the character (if it’s UTF-8 then you’re safe), or if the font you’re using can’t display it, then that is where the problem lies. This is from my personal experience, BTW.
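
A quick way to check the first half of that (whether an encoding can represent the character at all) is a few lines of Python. This is only a sketch; `can_encode` is a hypothetical helper name, and the three encodings are just examples:

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Example encodings only; swap in whatever the file is supposed to use.
for name in ("utf-8", "cp1252", "ascii"):
    print(name, can_encode("Ü", name))
```

UTF-8 and Windows-1252 can both store Ü, while plain ASCII cannot. This says nothing about the font, which would have to be checked in the editor itself.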

#3

I’m not sure that I’ve specified any particular character set. I see my preferences say:

// The encoding to use when the encoding can't be determined automatically.
// ASCII, UTF-8 and UTF-16 encodings will be automatically detected.
"fallback_encoding": "Western (Windows 1252)",

// Encoding used when saving new files, and files opened with an undefined
// encoding (e.g., plain ascii files). If a file is opened with a specific
// encoding (either detected or given explicitly), this setting will be
// ignored, and the file will be saved with the encoding it was opened
// with.
"default_encoding": "UTF-8",

I assume that means any file I create (like the one I’m speaking about) will get UTF-8 as its character encoding, but I may be wrong about that.

#4

More investigation reveals that the file is saved properly, but the Ü character is changed upon opening the document.

For example, if I create the file in Sublime Text 2 with the Ü character and then save it and then open that document in BBEdit, the character is still correct when viewed in BBEdit. If I reopen the document in Sublime Text 2 though, in the opening process, it gets changed to the Chinese character.

I’ve tried using the “Reopen with Encoding” menu and selected nearly every option, but none open the document back in its original format with the Ü intact.
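
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the symptom matches what happens when the UTF-8 bytes for Ü are decoded with a Chinese legacy encoding. The snippet below assumes GBK is the encoding being wrongly applied on open; nothing in this thread confirms that is what Sublime Text 2 actually does.

```python
original = "Ü"

# What a correct UTF-8 save writes to disk for this character.
utf8_bytes = original.encode("utf-8")

# Assumption: the file is reopened as GBK instead of UTF-8.
misread = utf8_bytes.decode("gbk")

# Reversing the wrong decode recovers the original character,
# which shows the bytes on disk were never damaged.
recovered = misread.encode("gbk").decode("utf-8")

print(utf8_bytes, misread, recovered)
```

If that is the cause, it would fit the observation that BBEdit shows the file correctly: the saved bytes are fine and only the decoding step on open goes wrong.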

#5

Hmm… Well that seems like a bug, I don’t know what to suggest though. O_o

#6

Thanks, Eduan.

Do you or anyone else know of another way to report a bug, or is this forum it?

#7

Try opening the file with the “EncodingHelper” package installed.
Also, the console (View -> Show Console) tells you which encoding is used.
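
If you want a second opinion from outside the editor, a small Python check of the raw bytes can help. This is a minimal sketch, not a full encoding detector: `is_valid_utf8` and `describe_file` are hypothetical helper names, and it only distinguishes "decodes cleanly as UTF-8" from "does not".

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def describe_file(path: str) -> str:
    """Give a rough label for how the file's bytes are encoded."""
    data = Path(path).read_bytes()
    if data.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 with BOM"
    return "valid UTF-8" if is_valid_utf8(data) else "not UTF-8"
```

Calling `describe_file` with the path to the affected HTML file and getting "valid UTF-8" would suggest the save is fine and the problem is in how the file is detected when it is opened.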
