Character Encoding


Character encoding options

Thought I would share some charset guidelines. A client had a problem with a funny character appearing where there should have been an acute accent.

 

I have always found that ISO-8859-1 encoding works best:

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

 

And NOT:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

 

This fixed the problem. We should use ISO-8859-1 as our default (but see the 2011 update on Maori macrons below, where I had to switch to UTF-8).

 

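Also, the text was not HTML encoded. Adding htmlencode is another way of fixing the problem: accented characters are then written as plain ASCII character references (for example &eacute; or &#233;), so it no longer matters what the charset is.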

 

 

In this case I did both. Since it is a newsletter, it is possible the content type meta tag will be stripped off, so it is a good idea to use htmlencode anyway. It is also always good to htmlencode in case some hacker has somehow got JavaScript into our database.
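To make the htmlencode idea concrete, here is a minimal sketch in JavaScript. It is not the server-side htmlencode call itself, just an illustration of the same principle; `htmlEncode` is a made-up name. It escapes the markup characters and turns every non-ASCII character into a numeric character reference, so the output is pure ASCII.

```javascript
// Hypothetical helper for illustration only (not part of our code base).
function htmlEncode(text) {
  var named = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text)
    // Escape markup characters first, so an injected <script> tag becomes inert text.
    .replace(/[&<>"']/g, function (ch) { return named[ch]; })
    // Then replace every non-ASCII character (surrogate pairs kept whole)
    // with a numeric character reference.
    .replace(/[\ud800-\udbff][\udc00-\udfff]|[\u0080-\uffff]/g, function (ch) {
      return "&#" + ch.codePointAt(0) + ";";
    });
}

console.log(htmlEncode("caf\u00e9"));                   // caf&#233;
console.log(htmlEncode("<script>alert(1)</script>"));   // &lt;script&gt;alert(1)&lt;/script&gt;
```

Because the result contains only ASCII, it should display the same whether the page (or a mail client that stripped the meta tag) assumes ISO-8859-1 or UTF-8.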

 

20110613 Maori Macrons issue

 

I found an issue with HorizonPoll when trying to use Maori macrons with ISO encoding and had to switch to UTF-8. Macrons are pretty important because the site has a Maori Panel and people can select which iwi they are from. In doing this I discovered that UTF-8 seems to work reasonably well with our .NET code but not very well with classic ASP. It seems the classic ASP problem is in saving to the SQL database, possibly an issue with the ADO database driver for classic ASP. I managed to get UTF-8 working properly with the macrons (in terms of being able to display, edit, save and load them and have the characters be the same onscreen as in the database)... but....
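As an aside on why ISO encoding could not cope with the macrons: ISO-8859-1 only covers code points up to U+00FF, and the macron vowels (such as U+0101) sit above that range. A small sketch that lists the characters in a string that ISO-8859-1 cannot represent; `charsOutsideLatin1` is a made-up name for illustration.

```javascript
// Hypothetical helper: collect the characters that do not fit in ISO-8859-1,
// which maps its single bytes directly to code points U+0000 to U+00FF.
function charsOutsideLatin1(text) {
  var missing = [];
  for (var i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 0xff) {
      missing.push(text.charAt(i));
    }
  }
  return missing;
}

console.log(charsOutsideLatin1("M\u0101ori").join(""));  // ā (a with macron, U+0101)
console.log(charsOutsideLatin1("caf\u00e9").length);     // 0 (e acute is U+00E9, so it fits)
```

This is consistent with the acute accent being fine under ISO-8859-1 while the macrons needed UTF-8.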

 

Then, several months later, I discovered that smart quotes and apostrophes do not paste in from Word correctly. I had already tried everything I could think of to get the macrons working in the first place, so rather than alter the charset (which was now UTF-8 throughout) we decided to dumb down the smart quotes. I applied the following code (in pagebegin.asp). I also added a call to DumbDownSmartQuotes in CleverPaste_AfterPaste() and PasteRawText() in htmltext-editor.js so this works in all input fields, textareas and rich text edits.

 

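 <script type="text/javascript" src="http://code.jquery.com/jquery-1.6.1.min.js"></script>
 <script type="text/javascript">
 // Replace "smart" typography pasted from Word with plain ASCII equivalents.
 function DumbDownSmartQuotes(txt) {
  var replacements, key;

  replacements = {
      "\xa0": " ",       // non-breaking space
      "\xa9": "(c)",     // copyright sign
      "\xae": "(r)",     // registered sign
      "\xb7": "*",       // middle dot
      "\u2018": "'",     // left single quote
      "\u2019": "'",     // right single quote / apostrophe
      "\u201c": '"',     // left double quote
      "\u201d": '"',     // right double quote
      "\u2026": "...",   // ellipsis
      "\u2002": " ",     // en space
      "\u2003": " ",     // em space
      "\u2009": " ",     // thin space
      "\u2013": "-",     // en dash
      "\u2014": "--",    // em dash
      "\u2122": "(tm)"}; // trade mark sign
  for (key in replacements) {
    // None of the keys are regex metacharacters, so each can be used as a pattern as-is.
    txt = txt.replace(new RegExp(key, 'g'), replacements[key]);
  }
  return txt;
 }

 jQuery(document).ready(function() {
  jQuery("input,textarea").bind("paste", function() {
   var field = this;
   // The paste event fires before the field value is updated, so clean up after a short delay.
   window.setTimeout(function() {
    field.value = DumbDownSmartQuotes(field.value);
   }, 500);
  });
 });
 </script>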
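For a quick check outside the browser, the same mapping can be exercised as a plain function. This is only a sketch: `dumbDown` and `table` are made-up names, the table is a trimmed copy of the one above (quotes, dashes and ellipsis only), and it uses a single character-class regex rather than one RegExp per key.

```javascript
// Trimmed copy of the replacement table from DumbDownSmartQuotes
// (assumption: only these entries are needed for this check).
var table = {
  "\u2018": "'", "\u2019": "'",
  "\u201c": '"', "\u201d": '"',
  "\u2013": "-", "\u2014": "--",
  "\u2026": "..."
};

// One pass over the text: each matched character is looked up in the table.
function dumbDown(txt) {
  return txt.replace(/[\u2018\u2019\u201c\u201d\u2013\u2014\u2026]/g, function (ch) {
    return table[ch];
  });
}

// Text as it might arrive when pasted from Word.
var pasted = "\u201cIt\u2019s done\u201d \u2014 see p. 3\u20135\u2026";
console.log(dumbDown(pasted));  // "It's done" -- see p. 3-5...
```

The single-pass version gives the same result for these characters; the per-key loop above is easier to extend, so which one to use is a matter of taste.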









