Sublime Forum

Character usage

#1

Hi,

Complete beginner question… when coding HTML that declares UTF-8 encoding, what is the best practice for inputting special characters?

For example… I know HTML names exist for some characters such as ampersand… &amp; etc. but others such as commas and brackets only have a numeric code… &#44; etc.

Most browsers I’ve tried seem to display the characters as intended but for the best compatibility do you need to / should you write the numeric or html name rather than just typing the character itself?

If this is the case what’s the best plugin currently for auto formatting these numeric codes?

Thanks in advance!

#2

Just type the characters.

You may use the following: (in reverse order)

& → &amp; (ampersand, U+0026)
< → &lt; (less-than sign, U+003C)
> → &gt; (greater-than sign, U+003E)
" → &quot; (quotation mark, U+0022)
' → &#39; (apostrophe, U+0027)

Use them to prevent code injection. Imagine this forum evaluating markup typed into a post instead of displaying it as text.
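
As a hypothetical illustration of that risk, here is a minimal Python sketch. Python and the sample comment are assumptions made for illustration, not something from the thread. It uses the standard library's html.escape so that a comment containing a script tag would be shown as text rather than run by the browser.

```python
import html

# Hypothetical user comment. A forum that pasted it into a page unescaped
# would let the browser run the script.
comment = '<script>alert("hi")</script> Tom & Jerry'

# html.escape replaces &, < and > with entities; with quote=True it also
# replaces " and '.
safe = html.escape(comment, quote=True)
print(safe)
```

After escaping, no literal < or > remains in the text, so the browser has nothing to treat as a tag.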

You also use these to get valid XML.
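
A small sketch of the XML point, assuming Python's standard xml.etree.ElementTree parser as the checker. The helper name is_well_formed and the sample titles are made up for illustration. A bare ampersand makes the document fail to parse, while the entity form parses and decodes back to the plain character.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

raw = "<title>Tom & Jerry</title>"          # bare ampersand
escaped = "<title>Tom &amp; Jerry</title>"  # entity reference

print(is_well_formed(raw), is_well_formed(escaped))
print(ET.fromstring(escaped).text)  # the parser decodes &amp; back to &
```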


#3

Hi, thanks for the reply.

So, just to confirm… are you saying that for anything like a comma, semi-colon or question mark (non-letter characters that aren’t on the list you gave -including ampersand) you can just type it in the html and all browsers will recognise them and display them correctly provided that you state UTF-8?

And secondly, the list of characters you gave, including ampersand, have ‘special’ codes which should be used so that the server or browser doesn’t misinterpret them as commands?

Cheers


#4

The browser will recognise them if:
1 - the document is written in UTF-8 encoding
2 - the document is served to the browser with the UTF-8 encoding declared in the response headers
3 - the document has a charset declaration of its own, such as a <meta charset="utf-8"> tag
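
The three conditions above can be sketched in Python roughly as follows. This is only an illustration: build_response is a hypothetical helper, and the meta tag shown is one common way to declare the charset inside the document, assumed here rather than taken from the thread.

```python
import html

def build_response(body_text):
    """Return (headers, payload) for a small UTF-8 HTML page."""
    page = "\n".join([
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        '<meta charset="utf-8">',  # condition 3: declaration inside the document
        "<title>Demo</title>",
        "</head>",
        "<body><p>" + html.escape(body_text) + "</p></body>",
        "</html>",
    ])
    # condition 2: the response header names the encoding
    headers = {"Content-Type": "text/html; charset=utf-8"}
    # condition 1: the bytes sent really are UTF-8
    payload = page.encode("utf-8")
    return headers, payload

headers, payload = build_response("café, ¿qué tal?")
print(headers["Content-Type"])
```

The accented letters and the inverted question mark are typed directly. Only &, <, >, " and ' go through html.escape.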

The browser may display “squares”, possibly with codes in them (Firefox), if you don’t have the fonts required to display these characters. You may find some at the bottom of this page: wikipedia.org/
The browser will display mojibake when it fails to render these correctly because of an encoding issue.
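
A minimal Python sketch of how mojibake comes about, assuming the common case of UTF-8 bytes being read as Latin-1. The sample character is arbitrary.

```python
text = "é"
utf8_bytes = text.encode("utf-8")      # two bytes: 0xC3 0xA9

wrong = utf8_bytes.decode("latin-1")   # decoded with the wrong encoding
right = utf8_bytes.decode("utf-8")     # decoded with the declared encoding

print(wrong)  # Ã©
print(right)  # é
```

One character turning into two unrelated ones is the typical sign that the declared and actual encodings disagree.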

You just type every character you need (letters and non-letters), and when typing &, <, >, ", ' just type &amp;, &lt;, &gt;, &quot;, &#39; instead.

&quot; and &#39; are useful when the text sits inside a quoted attribute value, where a raw quote character would end the attribute early.
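
A hedged sketch of the quote case in Python, assuming the text is placed inside a double-quoted title attribute. The link markup and the sample string are made up for illustration. Note that html.escape writes the apostrophe as &#x27;, a hexadecimal form equivalent to &#39;.

```python
import html

title = 'He said "it\'s fine"'

# quote=True also escapes " and ', so the value cannot close the attribute.
attr_value = html.escape(title, quote=True)
tag = '<a href="#" title="' + attr_value + '">link</a>'
print(tag)

# Decoding the entities gives back the text the reader would see.
print(html.unescape(attr_value))
```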


#5

Brilliant, thank you tito - it’s much clearer now.
