Wrox Programmer Forums

#1
February 15th, 2008, 08:31 AM
Authorized User
 
Join Date: Jul 2007
Location: New Delhi, Delhi, India.
Posts: 55
Thanks: 0
Thanked 0 Times in 0 Posts
Character to Unicode entities

Hi all,

Can somebody tell me why my character entities are being converted into Unicode values in the transformed output?

My Input is
<surname>Norberg&hyphen;Sch&ouml;nfeldt</surname>

Output:

<surname>Norberg[#x02010]Sch[#x000F6]nfeldt</surname>

And here is the relevant snippet of my stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" media-type="text/xml"/>

I am using EditiX 2008 on Windows.

Thanks for any suggestions.

Pankaj



#2
February 15th, 2008, 08:53 AM
mhkay
Wrox Author
Join Date: Apr 2004
Location: Reading, Berks, United Kingdom.
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

Because that's what XML parsers do.

The XSLT processor can't convert the characters back to entity references because it doesn't know that they started life as entity references - the XML parser doesn't pass on this information.
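The effect is easy to reproduce outside XSLT. The following is a minimal sketch in Python, using the standard library parser rather than the parser inside EditiX. The DOCTYPE with the two entity declarations is an assumption, since the original DTD is not shown in this thread; the code points are taken from the output quoted above.

```python
import xml.etree.ElementTree as ET

# Assumed declarations: the poster's DTD is not shown, so the two entities
# are declared inline with the code points seen in the transformed output.
DOC = """<!DOCTYPE surname [
  <!ENTITY hyphen "&#x2010;">
  <!ENTITY ouml "&#xF6;">
]>
<surname>Norberg&hyphen;Sch&ouml;nfeldt</surname>"""

root = ET.fromstring(DOC)

# The parser hands over plain character data. Nothing in the tree records
# that U+2010 and U+00F6 were written as &hyphen; and &ouml; in the source.
surname_text = root.text
print(ascii(surname_text))
```

Any application sitting on top of a parser, an XSLT processor included, only ever sees the expanded string.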

If you really need to preserve the entity references, the only way to do it is to convert them into something else first (for example, &hyphen; becomes <?pi hyphen?>) by preprocessing with non-XML-aware tools such as sed or Perl, and then to reverse the process afterwards.
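As a rough illustration of that round trip, here is a sketch in Python instead of sed or Perl. The function names protect_entities and restore_entities are hypothetical, and the regular expressions assume simple ASCII entity names. It works on the raw text, not on parsed XML.

```python
import re

# Named entity references such as &hyphen; (numeric ones like &#x2010; do not match).
ENTITY_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")
# The five entities predefined by XML are left alone.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
# The placeholder processing instruction, e.g. <?pi hyphen?>.
PLACEHOLDER = re.compile(r"<\?pi ([A-Za-z_][A-Za-z0-9._-]*)\?>")


def protect_entities(text):
    """Run before the transformation: &name; becomes <?pi name?>."""
    def replace(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)
        return "<?pi %s?>" % name
    return ENTITY_REF.sub(replace, text)


def restore_entities(text):
    """Run after the transformation: <?pi name?> becomes &name; again."""
    return PLACEHOLDER.sub(lambda match: "&%s;" % match.group(1), text)


source = "<surname>Norberg&hyphen;Sch&ouml;nfeldt</surname>"
protected = protect_entities(source)
restored = restore_entities(protected)
print(protected)
print(restored)
```

Two caveats apply to this sketch. The stylesheet has to copy processing instructions through explicitly, because the built-in XSLT template rule for them produces no output. A processing instruction also cannot appear inside an attribute value, so entity references in attributes would need a different placeholder.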

However, this only matters for human readers of your output. Any well-behaved software processing the output will treat the Unicode character as 100% equivalent to the original entity reference.
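The equivalence can be checked with a small sketch, again using Python's standard library as a stand-in for "software processing the output". It compares a literal character with a numeric character reference, then serializes to ASCII and parses the result back.

```python
import xml.etree.ElementTree as ET

# The same surname written with literal characters and with numeric references.
literal = "<surname>Norberg\u2010Sch\u00f6nfeldt</surname>"
numeric = "<surname>Norberg&#x2010;Sch&#xF6;nfeldt</surname>"

text_literal = ET.fromstring(literal).text
text_numeric = ET.fromstring(numeric).text
same_content = text_literal == text_numeric

# Serializing to an encoding that cannot hold the characters makes the
# serializer fall back to numeric character references.
ascii_bytes = ET.tostring(ET.fromstring(literal), encoding="us-ascii")

# Parsing the ASCII form gives back exactly the same character data.
round_trip = ET.fromstring(ascii_bytes).text

print(same_content, round_trip == text_literal)
print(ascii_bytes)
```

The spelling in the file changes from one serialization to the next, but the parsed content does not.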

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
#3
February 15th, 2008, 08:59 AM
Authorized User
 
Join Date: Jul 2007
Location: New Delhi, Delhi, India.
Posts: 55
Thanks: 0
Thanked 0 Times in 0 Posts

"Any well-behaved software processing the output will treat the Unicode character as 100% equivalent to the original entity reference."

Well, in that case I think I will be lucky. I need to import the transformed XML into Adobe InDesign CS2.

Thanks, Michael. I will test it and come back for help if required.

Pankaj


© 2013 John Wiley & Sons, Inc.