Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
December 14th, 2007, 05:52 AM
Authorized User
 
Join Date: Nov 2007
Posts: 15
Thanks: 0
Thanked 0 Times in 0 Posts
XML to XML through XSLT

Hi,

I am trying to convert an XML file with XSLT, but the Unicode entities in the source XML are converted to the literal symbols in the output XML file.

Example (the source XML first, then the output I get):

<quote><p>#x201C;I want my agents [sales women] to feel that their first duty is to humanity.#x201D;</p>
<byline>#x2014;Madame C. J. Walker</byline></quote>

<blockquote><p>“I want my agents [sales women] to feel that their first duty is to humanity.”</p>
<byline>—Madame C. J. Walker</byline></blockquote>

I need to retain the Unicode entity.

Please guide me.

Thanks in advance,

srkumar
:)
 
December 14th, 2007, 06:07 AM
samjudson
Friend of Wrox
 
Join Date: Aug 2007
Posts: 2,128
Thanks: 1
Thanked 189 Times in 188 Posts

Your problem is that both are equivalent as far as the XSLT processor is concerned.

The only control you have is over the output encoding. If you choose UTF-8 then either format is completely valid.

Choosing an alternative encoding such as Windows-1252 might give you the results you want.

/- Sam Judson : Wrox Technical Editor -/
 
December 14th, 2007, 06:47 AM
mhkay
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

I'm assuming the ampersands have somehow been lost in the system.

Your "entities" are not entities at all, but character references. It doesn't make a big difference in this case, but it's always easier to get information on a topic if you use the right terminology.

The XSLT processor can't retain character references because it doesn't know they were there. The XML parser converts them into regular characters. The recipient of the XML isn't supposed to make any distinction between a character reference and the character it represents, so the XML parser doesn't provide this information.

What you can do is to force non-ASCII characters to be represented in the output using character references by setting <xsl:output encoding="us-ascii"/> (or encoding="iso-8859-1" if you prefer).

However, this will cause a serialization failure if you have non-ASCII characters in places where character references aren't allowed, for example in element names or in comments.
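
As a rough illustration of the encoding approach, here is a minimal sketch of a stylesheet, assuming XSLT 1.0 and assuming the only structural change wanted is renaming quote to blockquote as in the example in the question. The template layout is an assumption made for illustration, not something taken from the original poster's stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialize as US-ASCII: any character outside ASCII in text or
       attribute values is written as a numeric character reference. -->
  <xsl:output method="xml" encoding="us-ascii"/>

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumed from the example in the question: quote becomes blockquote. -->
  <xsl:template match="quote">
    <blockquote>
      <xsl:apply-templates select="@*|node()"/>
    </blockquote>
  </xsl:template>

</xsl:stylesheet>
```

With a source document that really contains the reference for U+201C (ampersand included), the output should contain a character reference for that character rather than the literal curly quote. Whether the serializer writes it in decimal or hexadecimal form depends on the processor, so the output may not match the source byte for byte.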

I would accept a statement "I *want* to retain the Unicode [character reference]" as a reasonable requirement. Saying "I *need* to retain" gives me concern - it suggests that the XML is being sent to a recipient that can't handle arbitrary well-formed XML, and that's always bad news for the robustness of your system.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
Copyright (c) 2020 John Wiley & Sons, Inc.