I am facing a strange encoding problem with an XSL transformation.
In short, I execute a SQL statement to fill a recordset that contains some fields with Greek characters stored as UTF-16 (nvarchar on SQL Server 2000), which I then save to the Response object of an ASP page as XML (adPersistXML). I then use an XSL file to transform the XML to HTML. The problem is that when I view the page in Internet Explorer, instead of the Greek characters I get something like this: "Ãâ¢ÃâºÃâºÃâÃÂÃâ¢Ãšßã äÃâ¢Ã¤ÃâºÃŸÃ£". Moreover, when I do a "View Source" and look at the HTML source in Notepad, the text is displayed correctly in Greek.
Currently I am using the following ASP code to do the transformation:
Code:
' rs is an ADO recordset filled with data from a sql Select statement
styleFile = Server.MapPath(xslfile)
set stylexml =Server.CreateObject("MSXML2.FreeThreadedDOMDocument")
stylexml.async = false
stylexml.load(styleFile)
set sourcexml = Server.CreateObject("MSXML2.FreeThreadedDOMDocument")
sourcexml.async = false
rs.Save sourcexml,1 ' Save as adPersistXML
set rs=Nothing
strPath=BuildPath(id,18)
dim xslt,xslProc
set xslt = Server.CreateObject("MSXML2.XSLTemplate")
xslt.stylesheet =stylexml
Set xslProc =xslt.createProcessor()
xslProc.input=sourcexml
xslProc.addParameter "Path",escape(strPath)
Response.charSet = "UTF-16"
xslProc.output = Response
xslProc.transform
The xsl file looks like this :
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
xmlns:rs="urn:schemas-microsoft-com:rowset"
xmlns:z="#RowsetSchema"
version="1.0">
<xsl:param name="Path" />
<xsl:output encoding="utf-16" method="html" version="4.0"/>
.....
<h2 style="color:white;font-family: 'Verdana', Arial, Helvetica, Tahoma, sans-serif;" align="left"><xsl:value-of disable-output-escaping="yes" select="rs:data/z:row/@Description_GR" /></h2>
......
I've posted the same question on the MSDN newsgroups with the title "Strange xsl encoding problem" and I got a really useful answer from Mike Sharp.
He told me that the problem has to do with a big-endian/little-endian switch. His answer was the following:
Quote:
This is proving difficult to nail down (no surprise to you, I'm sure), but I *think* I see what's happening...
It looks like it's an "endian" switch. That is, the ADO recordset is being saved as UTF-16 big endian (UTF-16BE). Rather, the data is. "Endianness" only matters when serializing UTF-16. When it's in-memory, it doesn't matter. When there is no byte order mark, it's supposed to be Big Endian.
When I play around with those bytecodes I can get a similar result. I'm not sure how this will appear in your newsreader, but as an example, the character codes: 03 B5 03 BB 03 B1 in UTF-16BE are displayed as ελα but if I interpret those same character codes as UTF-16M, they show up as µ » ± which is similar to your result.
This article by Mark Davis is extremely helpful in this area:
http://www-106.ibm.com/developerwork...rms/index.html
Since the entire file isn't saved as big endian, only the characters, I'm thinking that perhaps the original data is being stored that way.
I think you'll find Mark Davis's UTF-converter very useful for diagnosing problems like this. I use it all the time. The page is at:
http://www.macchiato.com/unicode/convert.html
He's got a lot of interesting stuff on his site, in fact.
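To make the byte-order point in the quote concrete, here is a minimal sketch in Python (used purely for illustration; it is not part of the ASP page, and the variable names are hypothetical). It takes the six bytes from the example, 03 B5 03 BB 03 B1, and decodes them three ways. Reading them one byte per character is only an assumption about what the second rendering in the quote corresponds to.

```python
import codecs

# The six bytes from the quoted example: U+03B5 U+03BB U+03B1 in big-endian order.
raw = bytes([0x03, 0xB5, 0x03, 0xBB, 0x03, 0xB1])

# Correct reading: big-endian UTF-16 gives the three Greek letters.
as_be = raw.decode("utf-16-be")

# Wrong byte order: each pair is swapped, giving unrelated code points
# (U+B503, U+BB03, U+B103).
as_le = raw.decode("utf-16-le")

# One byte per character (Latin-1): a control character followed by
# the micro sign, the right guillemet and the plus-minus sign.
as_latin1 = raw.decode("latin-1")

# With a byte order mark in front, the generic "utf-16" codec
# picks the right byte order by itself.
with_bom = (codecs.BOM_UTF16_BE + raw).decode("utf-16")

# ascii() keeps the output readable whatever the console encoding is.
for label, text in [("utf-16-be", as_be), ("utf-16-le", as_le),
                    ("latin-1", as_latin1), ("utf-16 + BOM", with_bom)]:
    print(label, ascii(text))
```

The single-byte reading yields µ » ± (with a control character before each), which matches the second rendering in the quote. So what matters is which encoding the consumer assumes for the bytes, not only how the data is stored.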
Anyway, I still can't figure out how to correct the problem. Those links were really useful, but I haven't managed to fix it.
Is there any way to fix it?
Thanks in advance,
Teo