Sublime Forum

How does ST2 set the encoding when opening files?

#1

Could someone confirm how ST2 opens a file and which encoding it uses?

  1. Try reading with ‘ASCII’ encoding only -> Set the encoding to ‘Undefined’
  2. Try reading with ‘UTF-8’ encoding -> Set the encoding to ‘UTF-8’
  3. Try reading with the ‘fallback_encoding’ setting -> Set the encoding to ‘fallback_encoding’
  4. Error?

#2

More or less, yes, except if a BOM is present, the encoding indicated by the BOM will be used.

There’s no step 4 either: when the fallback encoding is used, the file is opened with that encoding on a best-effort basis. Typical fallback encodings, such as Windows 1252, have no notion of invalid input in any case, as every byte sequence is valid.
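The order described in this thread (BOM first, then ASCII, then UTF-8, then the fallback) can be sketched roughly as follows. This is only an illustration of the steps as stated here, not ST2's actual code: `guess_encoding` and the `_BOMS` table are hypothetical names, the set of BOMs checked is an assumption, and `cp1252` is just an example default for the fallback.

```python
import codecs

# BOMs to look for, with the Python codec that strips each one.
# Assumption for illustration: this is not a list of what ST2 recognises.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, fallback="cp1252"):
    """Return a Python codec name for the raw bytes in `data` (hypothetical helper)."""
    # A BOM, if present, takes priority over every other step.
    for bom, codec in _BOMS:
        if data.startswith(bom):
            return codec
    # Steps 1 and 2: strict ASCII, then strict UTF-8.
    for codec in ("ascii", "utf-8"):
        try:
            data.decode(codec)
            return codec
        except UnicodeDecodeError:
            pass
    # Step 3: nothing matched, so use the fallback. There is no error step.
    return fallback
```

The function only picks a codec name; the caller still has to decode the bytes with it.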


#3

Thanks for your help Jon.

I quickly put together the code below (partly copied and pasted), and it seems to work reasonably well. I still have to look into how to handle a BOM better.

But I have one more question:
How do I get the fallback_encoding?

When loading a file from a plugin, you’re not necessarily working with a view, so how do I get the fallback_encoding setting without view.settings()?
I wrote the get_fallback_encoding() function below, and it seems to work fine (as long as I don’t call it directly at module load time), but is it the best way to do it?
Isn’t a window.settings() API missing here?

[code]import codecs

import sublime

# st2python() and ST2DecodeError are not defined in this excerpt.

FALLBACK_ENCODING = ""
PY_FALLBACK_ENCODING = ""


def get_fallback_encoding():
    """Return (ST2 name, Python codec name) of the fallback encoding, cached after the first call."""
    global FALLBACK_ENCODING, PY_FALLBACK_ENCODING
    if not FALLBACK_ENCODING:
        s = sublime.load_settings("Preferences.sublime-settings")
        FALLBACK_ENCODING = s.get("fallback_encoding")
        PY_FALLBACK_ENCODING = st2python(FALLBACK_ENCODING)
    return (FALLBACK_ENCODING, PY_FALLBACK_ENCODING)


def open_st(filename, mode='rb', fallback_encoding=None):
    # Encodings to try, in order; the fallback encodings go last.
    list_encoding = ["ascii", "utf8", "utf16"]
    if fallback_encoding:
        list_encoding.append(fallback_encoding)
    def_fallback_encoding = get_fallback_encoding()
    if def_fallback_encoding[0] != fallback_encoding:
        list_encoding.append(def_fallback_encoding[1])

    f = None
    for encoding in list_encoding:
        try:
            f = codecs.open(filename, mode, encoding)
            # Read the whole file once to check that it decodes cleanly.
            for line in f:
                pass
            f.seek(0)
            break
        except UnicodeError:
            f.close()
            f = None

    if f is None:
        raise ST2DecodeError("unknown Sublime Text encoding")

    return f[/code]
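One thing to watch in the loop above: every attempt decodes strictly, so even the fallback can fail. Python's cp1252 codec leaves five byte values unmapped (0x81, 0x8D, 0x8F, 0x90, 0x9D) and raises on them, which does not match the best-effort behaviour described earlier in the thread. A minimal sketch of a lenient last step, assuming it is acceptable to substitute U+FFFD for undecodable bytes (`open_best_effort` is a hypothetical name, and `cp1252` is only an example default):

```python
import io


def open_best_effort(filename, encoding="cp1252"):
    """Open `filename` as text, never failing on undecodable bytes (hypothetical helper)."""
    # errors="replace" turns each undecodable byte into U+FFFD
    # instead of raising UnicodeDecodeError.
    return io.open(filename, "r", encoding=encoding, errors="replace")
```

This would only be used for the final fallback attempt; ASCII and UTF-8 still need strict decoding so that a failure moves on to the next encoding.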