Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old February 15th, 2008, 08:31 AM
Authorized User
 
Join Date: Jul 2007
Posts: 55
Thanks: 0
Thanked 0 Times in 0 Posts
Character to Unicode entities

Hi all,

Can somebody tell me why my character entities are being converted into Unicode character references in the transformed output?

My input is:
<surname>Norberg&hyphen;Sch&ouml;nfeldt</surname>

Output:

<surname>Norberg[#x02010]Sch[#x000F6]nfeldt</surname>

And here is the snippet of my stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" media-type="text/xml"/>

I am using EditiX 2008 on Windows.

Thanks for suggestions.

Pankaj
 
Old February 15th, 2008, 08:53 AM
mhkay
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

Because that's what XML parsers do.

The XSLT processor can't convert the characters back to entity references because it doesn't know that they started life as entity references - the XML parser doesn't pass on this information.
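
The behaviour described above can be seen outside XSLT as well. The following is a minimal sketch in Python, not the poster's actual setup: it assumes the entities are declared in an internal DTD subset (the thread does not show where &hyphen; and &ouml; are defined), and it uses the standard-library ElementTree parser in place of whatever parser EditiX uses.

```python
import xml.etree.ElementTree as ET

# Assumption: an internal DTD subset declaring the two entities from the
# question. The real document probably pulls them from an external DTD.
DOC = (
    '<!DOCTYPE surname [\n'
    '  <!ENTITY hyphen "&#x2010;">\n'
    '  <!ENTITY ouml "&#xF6;">\n'
    ']>\n'
    '<surname>Norberg&hyphen;Sch&ouml;nfeldt</surname>'
)

surname = ET.fromstring(DOC)

# The parser hands over plain characters; nothing records that two of them
# were written as entity references in the source.
text = surname.text
print(ascii(text))

# Serializing again therefore cannot bring the entity names back.
reserialized = ET.tostring(surname, encoding="unicode")
print("&hyphen;" in reserialized)
```

By the time any application code (or an XSLT processor) sees the text, it is just a string containing U+2010 and U+00F6.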

If you really need to preserve the entity references, the only way to do it is to convert them into something else first (for example, &hyphen; becomes <?pi hyphen?>) by preprocessing with non-XML-aware tools such as sed or Perl, and then reverse the process afterwards.
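
As a rough illustration of that round trip, here is a sketch using Python regular expressions rather than sed or Perl. The function names and the regular expressions are assumptions made for this example, not part of any tool mentioned in the thread. It leaves the five predefined XML entities and numeric character references alone, and it does not handle entity references inside attribute values, where a processing instruction would not be legal.

```python
import re

# Entities every XML parser understands without a DTD; leave them untouched.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Named entity references such as &hyphen; (numeric &#...; forms never match).
ENTITY_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")
# The placeholder form suggested in the post: <?pi hyphen?>
PLACEHOLDER = re.compile(r"<\?pi ([A-Za-z_][A-Za-z0-9._-]*)\?>")


def protect_entities(raw):
    """Run before the transformation: &name; becomes <?pi name?>."""
    def to_pi(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)
        return "<?pi %s?>" % name
    return ENTITY_REF.sub(to_pi, raw)


def restore_entities(raw):
    """Run after the transformation: <?pi name?> becomes &name; again."""
    return PLACEHOLDER.sub(lambda m: "&%s;" % m.group(1), raw)


source = "<surname>Norberg&hyphen;Sch&ouml;nfeldt</surname>"
protected = protect_entities(source)
restored = restore_entities(protected)
print(protected)
print(restored)
```

For this to work, the stylesheet has to copy the placeholder processing instructions through to the output, for instance with a template matching processing-instruction(); the built-in XSLT 1.0 template rule for processing instructions drops them.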

However, this only matters for human readers of your output. Any well-behaved software processing the output will treat the Unicode character as 100% equivalent to the original entity reference.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
Old February 15th, 2008, 08:59 AM
Authorized User
 
Join Date: Jul 2007
Posts: 55
Thanks: 0
Thanked 0 Times in 0 Posts

"Any well-behaved software processing the output will treat the Unicode character as 100% equivalent to the original entity reference."

Well, in that case I think I will be lucky. I need to import the transformed XML into Adobe InDesign CS2.

Thanks Michael, I will test it and will come back for help if required.

Pankaj
Copyright (c) 2020 John Wiley & Sons, Inc.