
How can I use utf-8 codes in a buffer with view.substr() ?

Postby shvva on Mon Feb 18, 2013 8:52 am

Hi,

When I tried to use view.substr() to extract Japanese (UTF-8) characters from a buffer, it didn't work.
Is it possible to handle this correctly?

Text in the buffer:
Code: Select all
あいうえお


and I tried view.substr() in the console:
Code: Select all
print view.substr(sublime.Region(0,2))


Then codecs.py raised the following error:
Code: Select all
>>> print view.substr(sublime.Region(0,2))
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/codecs.py", line 352, in write
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/codecs.py", line 351, in write
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 0: ordinal not in range(128)


I would appreciate any help.
Thank you,
shvva
 
Posts: 7
Joined: Sun Dec 30, 2012 9:31 pm

Re: How can I use utf-8 codes in a buffer with view.substr() ?

Postby bizoo on Mon Feb 18, 2013 9:27 am

It works on Windows.
It looks like the encoding used for the console is ASCII, which means that you can't print these characters.
So the command works fine, but you can't print the result to the console.

Try to type only:
Code: Select all
view.substr(sublime.Region(0,2))


I don't know how to change the console encoding on OS X.
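
To illustrate the point about the console encoding, here is a minimal sketch in Python 3 (not the Python 2.6 that the traceback above comes from). The helper name can_write is hypothetical, and an in-memory stream stands in for the console. It only shows that the string itself is fine and that the failure depends on the encoding of the stream being written to; under Python 2.6 the same kind of mismatch surfaces as the UnicodeDecodeError shown in the traceback.

```python
import io

def can_write(text, encoding):
    """Return True if `text` can be written to a stream using `encoding`.

    Hypothetical helper: the in-memory stream stands in for the console.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
        return True
    except UnicodeEncodeError:
        # The stream's codec cannot represent these characters.
        return False

text = 'あいうえお'
first_two = text[0:2]  # slicing the string itself works regardless of the console

ascii_ok = can_write(first_two, 'ascii')   # an ASCII-only "console"
utf8_ok = can_write(first_two, 'utf-8')    # a UTF-8 "console"
```

Here ascii_ok is False and utf8_ok is True: extracting the text succeeds in both cases, and only writing it to an ASCII stream fails.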
bizoo
 
Posts: 873
Joined: Wed Dec 08, 2010 6:53 am
Location: Switzerland

Re: How can I use utf-8 codes in a buffer with view.substr() ?

Postby shvva on Tue Feb 19, 2013 4:26 am

Thanks for your prompt reply.

Yes, you may be right: substr() works without 'print', and my environment is indeed OS X.

However, this leads me to another question about handling UTF-8 characters in Python.

The webbrowser module does not seem to handle UTF-8 characters either, just like the console.
The following code fails to search Google for a query such as 寿司 (sushi):
Code: Select all
# import webbrowser
webbrowser.open_new_tab('http://www.google.com/search?q=寿司')

Given that Python can handle UTF-8 when a source file starts with the following lines,
is there a way to handle UTF-8 correctly in a Sublime Text 2 or 3 plugin that uses the webbrowser module?
Code: Select all
#!/usr/bin/env python
# -*- coding: utf-8 -*-


Thank you,
shvva
 
Posts: 7
Joined: Sun Dec 30, 2012 9:31 pm

Re: How can I use utf-8 codes in a buffer with view.substr() ?

Postby bizoo on Tue Feb 19, 2013 7:41 am

You must declare the source encoding in the header (as in your example) AND use a unicode string for the URL by prefixing it with u:
Code: Select all
# -*- coding: utf-8 -*-
import sublime, sublime_plugin
import webbrowser

class ExampleCommand(sublime_plugin.WindowCommand):
    def run(self):
        webbrowser.open_new_tab(u'http://www.google.com/search?q=寿司')
bizoo
 
Posts: 873
Joined: Wed Dec 08, 2010 6:53 am
Location: Switzerland

Re: How can I use utf-8 codes in a buffer with view.substr() ?

Postby shvva on Wed Feb 20, 2013 6:05 am

Thanks again, bizoo.

Now I suspect that this problem occurs ONLY on OS X, because:
    a. My original code works fine on Windows but doesn't work on OS X
    b. The sample code you provided doesn't work on OS X either
The difference in behavior might come from the Python interpreter itself: only the OS X version uses the system Python.

Is there a workaround for this? Any ideas?
shvva
 
Posts: 7
Joined: Sun Dec 30, 2012 9:31 pm

Re: How can I use utf-8 codes in a buffer with view.substr() ?

Postby sapphirehamster on Wed Feb 20, 2013 7:54 am

URLs need to be escaped, and typically need to be encoded in UTF-8. The following worked for me on OSX:

Code: Select all
# -*- coding: utf-8 -*-
import sublime, sublime_plugin
import webbrowser
import urllib

class ExampleCommand(sublime_plugin.WindowCommand):
    def run(self):
        # Encode the query to UTF-8 bytes, then percent-escape it for use in a URL
        quoted = urllib.quote_plus(u'寿司'.encode('utf-8'))
        webbrowser.open_new_tab('http://www.google.com/search?q='+quoted)
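
For Sublime Text 3, which embeds Python 3, quote_plus lives in urllib.parse and accepts a str directly, encoding it as UTF-8 by default. The following is a minimal sketch of the same URL-building step under that assumption; the function name google_search_url is hypothetical, and the call that opens the browser is left as a comment so the snippet can run outside a plugin.

```python
from urllib.parse import quote_plus

def google_search_url(query):
    """Build a search URL with the query percent-escaped as UTF-8.

    Hypothetical helper; the base URL is the one used in this thread.
    """
    return 'http://www.google.com/search?q=' + quote_plus(query)

url = google_search_url('寿司')
# url == 'http://www.google.com/search?q=%E5%AF%BF%E5%8F%B8'

# Inside a plugin's run() method this could then be opened with:
#   import webbrowser
#   webbrowser.open_new_tab(url)
```

The resulting URL contains only ASCII characters, so it no longer depends on how the platform's Python or browser launcher treats non-ASCII input.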
sapphirehamster
 
Posts: 83
Joined: Sun Jul 01, 2012 11:19 pm

Re: How can I use utf-8 codes in a buffer with view.substr() ?

Postby shvva on Thu Feb 21, 2013 6:45 am

Thanks sapphirehamster,

Now I understand this clearly, and I can consider the problem solved!

Thanks again, sapphirehamster, bizoo.
Kind regards,
shvva
 
Posts: 7
Joined: Sun Dec 30, 2012 9:31 pm

