Wrox Programmer Forums
You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
February 9th, 2005, 04:54 AM
Registered User
Join Date: Feb 2005
Posts: 3
Strange XSL encoding problem

I am facing a strange encoding problem with an XSL transformation.
In short, I execute a SQL statement to fill a recordset that contains some fields with Greek characters stored as UTF-16 (SQL Server 2000, nvarchar columns), and then save the recordset to the Response object of an ASP page as XML (adPersistXML). I then use an XSL file to transform the XML to HTML. The problem is that when I view the page in Internet Explorer, instead of the Greek characters I get something like this: "ΕΛΛΗΝΙΚΟΣ ΤΙΤΛΟΣ". Moreover, when I do a "View Source" and look at the HTML source in Notepad, the text is displayed correctly in Greek.

Currently I am using the following ASP code to do the transformation:


Code:
' rs is an ADO recordset filled with data from a SQL SELECT statement

' Load the stylesheet into a free-threaded DOM document
styleFile = Server.MapPath(xslfile)
Set stylexml = Server.CreateObject("MSXML2.FreeThreadedDOMDocument")
stylexml.async = False
stylexml.load(styleFile)

' Persist the recordset into a second DOM document as XML
Set sourcexml = Server.CreateObject("MSXML2.FreeThreadedDOMDocument")
sourcexml.async = False
rs.Save sourcexml, 1 ' 1 = adPersistXML
Set rs = Nothing

strPath = BuildPath(id, 18)

Dim xslt, xslProc

' Compile the stylesheet and write the transformation result to the Response
Set xslt = Server.CreateObject("MSXML2.XSLTemplate")
xslt.stylesheet = stylexml
Set xslProc = xslt.createProcessor()
xslProc.input = sourcexml
xslProc.addParameter "Path", Escape(strPath)
Response.Charset = "UTF-16"
xslProc.output = Response
xslProc.transform

The xsl file looks like this :

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
                xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
                xmlns:rs="urn:schemas-microsoft-com:rowset"
                xmlns:z="#RowsetSchema"
                version="1.0">

  <xsl:param name="Path" />
  <xsl:output encoding="utf-16" method="html" version="4.0"/>
.....

<h2 style="color:white;font-family: 'Verdana', Arial, Helvetica, Tahoma, sans-serif;" align="left"><xsl:value-of disable-output-escaping="yes" select="rs:data/z:row/@Description_GR" /></h2>

......




I posted the same question on the MSDN newsgroups under the title "Strange xsl encoding problem" and got a really useful answer from Mike Sharp.

He told me that the problem has to do with a switch between big-endian and little-endian byte order. His answer was the following:


 
Quote:
This is proving difficult to nail down (no surprise to you, I'm sure), but I *think* I see what's happening...
It looks like it's an "endian" switch. That is, the ADO recordset is being saved as UTF-16 big endian (UTF-16BE). Rather, the data is. "Endianness" only matters when serializing UTF-16. When it's in-memory, it doesn't matter. When there is no byte order mark, it's supposed to be Big Endian. When I play around with those bytecodes I can get a similar result. I'm not sure how this will appear in your newsreader, but as an example, the character codes: 03 B5 03 BB 03 B1 in UTF-16BE are displayed as åëá but if I interpret those same character codes as UTF-16M, they show up as µ » ± which is similar to your result.

This article by Mark Davis is extremely helpful in this area:
http://www-106.ibm.com/developerwork...rms/index.html
Since the entire file isn't saved as big endian, only the characters, I'm thinking that perhaps the original data is being stored that way.

I think you'll find Mark Davis's UTF-converter very useful for diagnosing problems like this. I use it all the time. The page is at:
http://www.macchiato.com/unicode/convert.html
He's got a lot of interesting stuff on his site, in fact.
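
The byte values in the quoted example can be checked directly. Below is a minimal sketch in Python (used here only for illustration, it is not part of the ASP setup in this thread) that decodes the same six bytes under three different assumptions. The variable names are made up for the example.

```python
# The six bytes from the quoted example: 03 B5 03 BB 03 B1
raw = bytes.fromhex("03B503BB03B1")

# Read as UTF-16 big-endian: each byte pair is one Greek code point
# (U+03B5, U+03BB, U+03B1).
as_utf16_be = raw.decode("utf-16-be")   # "ελα"

# Read as UTF-16 little-endian: the bytes of each pair are swapped, giving
# U+B503, U+BB03, U+B103, which are not Greek letters at all.
as_utf16_le = raw.decode("utf-16-le")
le_code_points = [hex(ord(ch)) for ch in as_utf16_le]

# Read one byte per character (ISO-8859-1): each 0x03 becomes an invisible
# control character, and 0xB5, 0xBB, 0xB1 become the symbols µ » ±.
as_latin1 = raw.decode("latin-1")
visible_latin1 = as_latin1.replace("\x03", "")   # "µ»±"
```

So the "µ » ±" style of output is what the same bytes look like when something reads them one byte at a time, which is a different failure from a pure byte-order swap.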





Anyway, I still can't figure out how to correct the problem. Those links were really useful, but I haven't managed to fix it.
Is there any way to fix it?


Thanks in advance
Teo



 
February 9th, 2005, 11:28 AM
sonicDace
Authorized User
Join Date: Nov 2003
Posts: 63

Teo,

I had the same, or maybe a similar, problem: the HTML itself was OK but was somehow displayed incorrectly by the browser.

In my case, I found that some computers would display the HTML correctly and some incorrectly. After some investigation, I noticed that the MS DOM parser inserted a default character encoding header into the rendered HTML after the transformation, and the different computers were interpreting it in different ways because of their different language settings (mine was Spanish, others English). This was causing my &#xA0; to be displayed as "?".

This is more of a workaround, but what I did was hardcode the same character encoding tag into my template with the ISO 8859-1 character set (if my memory serves me correctly, you'll have to find a way to insert it before the tag inserted by the MS DOM parser, but if that doesn't work, try inserting it after).

I'm sorry I don't have code samples... this was too long ago :-P

Hope this helps
 
February 9th, 2005, 08:03 PM
Registered User
Join Date: Feb 2005
Posts: 3

Hi
Thanks for the reply

 
Quote:
I noticed that MS DOM parsers inserted a default character encoding header into the rendered HTML after the transformation

When you say "a default character encoding header into the rendered HTML", do you mean an HTML <meta charset=".."> directive or something else?

I ask because my output HTML seems to have the correct encoding directive:
Code:
<html xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema">
<head>
<META http-equiv="Content-Type" content="text/html; charset=utf-16">
 
February 10th, 2005, 06:04 AM
Friend of Wrox
Join Date: Jun 2003
Posts: 1,212

I think Dace is saying he solved a similar problem by swapping encodings, but iso-8859-1 won't work for you because it doesn't contain Greek chars.

Is there any reason you must use UTF-16? Have you tried UTF-8 instead?
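
To illustrate why iso-8859-1 is not an option for Greek text while the Unicode encodings (and the Greek-specific code pages) are, here is a small Python sketch. Python is used only for demonstration, and the helper name `can_encode` is made up for this example.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The Greek sample string from the first post.
sample = "ΕΛΛΗΝΙΚΟΣ ΤΙΤΛΟΣ"

# Check a few candidate output encodings.
candidates = ("iso-8859-1", "iso-8859-7", "windows-1253", "utf-8", "utf-16")
results = {name: can_encode(sample, name) for name in candidates}
```

Only the iso-8859-1 entry comes out False, so any of the other four can carry the text as long as the browser is told which one is in use.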
 
February 10th, 2005, 06:29 AM
Registered User
Join Date: Feb 2005
Posts: 3

Yes, I've tried UTF-8, with no success.

I have also tried several ways of performing the transformation, including transformNode and transformNodeToObject, but the result was always the same.







 
May 30th, 2007, 09:13 AM
Registered User
Join Date: May 2007
Posts: 1

Try to declare the output encoding attribute in your xsl file like this:

<xsl:output method="html" encoding="WINDOWS-1253" indent="no"></xsl:output>

 
May 30th, 2007, 10:05 AM
mhkay
Wrox Author
Join Date: Apr 2004
Posts: 4,962

You've asked for UTF-16 output but what you've shown looks more like a rendition of UTF-8 by something that doesn't understand that it's looking at UTF-8.
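
As a rough illustration of that symptom, the Python sketch below (illustrative only, not part of the ASP code in this thread) encodes the sample string as UTF-8 and then decodes the bytes as if they were ISO-8859-1, which is an assumed stand-in for whatever single-byte encoding the viewer might fall back to.

```python
# The Greek sample string from the first post.
sample = "ΕΛΛΗΝΙΚΟΣ ΤΙΤΛΟΣ"

# Serialise as UTF-8: every Greek letter becomes two bytes, the space one.
utf8_bytes = sample.encode("utf-8")

# A viewer that wrongly assumes a single-byte encoding shows one character
# per byte, so each Greek letter turns into a pair such as "Î" plus a
# second symbol or control character.
misread = utf8_bytes.decode("latin-1")

# The damage is reversible as long as no bytes were lost on the way:
# re-encode with the wrong charset, then decode with the right one.
repaired = misread.encode("latin-1").decode("utf-8")
```

If the garbage on screen is about twice as long as the real text and full of "Î" characters, the bytes are most likely UTF-8 and only the declared charset is wrong.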

Is there a META element in the generated HTML that gives the charset, if so what does it say?

And what's in the HTTP headers?

Have you tried different browsers?

In Firefox, what does View/Character Encoding say?
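
Those checks can be sketched as a small diagnostic: compare the charset named in the HTTP Content-Type header, the charset named in the META element, and what the byte order mark (if any) says about the actual bytes. This is a minimal sketch in Python under stated assumptions: the function names are hypothetical, the header value and body bytes have to be captured separately (for example with a proxy or a command-line HTTP client), UTF-32 is ignored, and stripping NUL bytes is only a crude way to read the markup of a UTF-16 body.

```python
import codecs
import re
from typing import Dict, Optional

# Byte order marks this sketch recognises.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

CHARSET_RE = r"charset\s*=\s*[\"']?([\w.:-]+)"


def charset_from_header(content_type: str) -> Optional[str]:
    """Charset named in an HTTP Content-Type header value, if any."""
    match = re.search(CHARSET_RE, content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_bom(body: bytes) -> Optional[str]:
    """Encoding implied by a leading byte order mark, if any."""
    for bom, name in BOMS:
        if body.startswith(bom):
            return name
    return None


def charset_from_meta(body: bytes) -> Optional[str]:
    """Charset named in a META element near the start of the body, if any."""
    # Drop NUL bytes so that UTF-16 markup can be scanned as ASCII.
    head = body[:2048].replace(b"\x00", b"").decode("ascii", errors="replace")
    match = re.search(r"<meta[^>]+" + CHARSET_RE, head, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_report(content_type: str, body: bytes) -> Dict[str, Optional[str]]:
    """Collect the three declarations side by side for comparison."""
    return {
        "http_header": charset_from_header(content_type),
        "meta": charset_from_meta(body),
        "bom": charset_from_bom(body),
    }
```

A mismatch between the three values, for example a header and META element that both say utf-16 on a body with no UTF-16 byte order mark, would point at the declaration rather than the data.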

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference




