Wrox Programmer Forums

  #1 (permalink)  
Old April 29th, 2006, 03:21 PM
Registered User
 
UTF-8 in XML: "Undeclared Entity" errors?

Hey all,

I'm working on a web site using XSLT for the front end. Unfortunately, a single non-Roman character that slips in (such as an e-acute) brings the whole thing down with "Undeclared Entity" errors. I get this error both in Sablotron (on the server) and when loading the offending XML file directly into Firefox.

Escaping the offending character (as &eacute;) doesn't do a whole lot of good, since the XSLT parser decodes it internally and then chokes on the same error again. Double-escaping it (&amp;eacute;) works, but means I have to put 'disable-output-escaping="yes"' on every line of every XSLT file to display it properly, and write extra server-side code to keep "bad" things from getting unescaped too. That just seems like way too much work!

So can XML not handle non-Roman characters without going through all these steps? It seems very strange that XML would natively support UTF-8, yet treat any non-ASCII character as malformed...

Any advice?

(I do start both the source XML files and the XSLT files with the usual '<?xml version="1.0" encoding="UTF-8"?>' declaration.)

Thanks!
Richard
  #2 (permalink)  
Old April 29th, 2006, 04:46 PM
mhkay (Wrox Author)

If you want to use entity references such as &eacute; in XML, you have to declare them. You have a number of alternatives: you can use numeric character references instead (for example &#xE9; for e-acute), or you can use the native encoding of the character. Of course, the bytes in the file then have to be in the encoding you have declared in the XML declaration.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
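
The options described in the reply above can be tried out with a short script. This is only a sketch, using Python's standard xml.etree.ElementTree parser rather than Sablotron or Firefox, so the exact error wording will differ from what the original poster saw. The sample documents and the parse helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

DECL = '<?xml version="1.0" encoding="UTF-8"?>'


def parse(doc: str) -> str:
    """Encode the document as UTF-8 bytes, parse it, return the root's text."""
    return ET.fromstring(doc.encode("utf-8")).text


# Named entity with no declaration: the parser refuses the document.
try:
    parse(DECL + "<p>caf&eacute;</p>")
    undeclared_ok = True
except ET.ParseError as err:
    undeclared_ok = False
    print("undeclared entity rejected:", err)

# Option 1: declare the entity in an internal DTD subset.
declared = parse(
    DECL + '<!DOCTYPE p [<!ENTITY eacute "&#xE9;">]><p>caf&eacute;</p>'
)

# Option 2: use a numeric character reference; no declaration needed.
numeric = parse(DECL + "<p>caf&#xE9;</p>")

# Option 3: put the character itself in the file, saved in the declared encoding.
literal = parse(DECL + "<p>caf\u00e9</p>")

# Double-escaping parses, but yields the literal text "&eacute;", not the
# character, which is why it then needs disable-output-escaping downstream.
double_escaped = parse(DECL + "<p>caf&amp;eacute;</p>")

print(declared, numeric, literal, double_escaped)
```

The first three options all give the parser the same character. The double-escaped form gives it a seven-character string that only looks like an entity reference.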
  #3 (permalink)  
Old April 29th, 2006, 08:13 PM
Registered User
 

There lies my problem. The character causing all of this is the literal UTF-8 character itself, an e with an acute accent over it (the same character that &eacute; stands for, but typed directly). And the XML declaration states UTF-8 as the encoding.

Even though both Sablotron and Firefox report "undeclared entity" errors and refuse to render the page, there really is no XML entity in the file. There's just a UTF-8 extended Latin character that breaks things by being included.

Any ideas of what could be causing this?

Thanks!

Rich
  #4 (permalink)  
Old April 30th, 2006, 03:19 AM
joefawcett (Wrox Author)

You'll have to show some of the original code. If the files are UTF-8 and you're using e-acute, either typed directly or as a numeric character reference, then it should work.
Can you post a sample of the original XML where the character appears, or post a link to it so the forum software doesn't mangle it?

--

Joe (Microsoft MVP - XML)
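
Before posting a sample, it may help to check what is actually in the file at the byte level. The script below is a rough sketch of such a check. The diagnose function is hypothetical, and so are its sample inputs. It assumes the file is meant to be UTF-8, and it flags every named entity reference other than the five predefined ones, even those a DTD declares or that sit inside comments or CDATA sections.

```python
import re

# The five entities XML predefines; any other name needs a DTD declaration.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Crude pattern for a named entity reference such as &eacute; (ASCII names only).
NAMED_REF = re.compile(rb"&([A-Za-z_][\w.-]*);")


def diagnose(raw: bytes) -> list:
    """Return human-readable findings for the raw bytes of an XML file."""
    findings = []

    # A file declared as UTF-8 but saved in another encoding fails here.
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        findings.append(f"not valid UTF-8 at byte {err.start}")

    # Report named references that are not predefined by XML itself.
    for match in NAMED_REF.finditer(raw):
        name = match.group(1).decode("ascii")
        if name not in PREDEFINED:
            line = raw.count(b"\n", 0, match.start()) + 1
            findings.append(f"named entity &{name}; on line {line}")

    return findings


with_entity = b'<?xml version="1.0" encoding="UTF-8"?>\n<p>caf&eacute;</p>'
saved_as_latin1 = "<p>caf\u00e9</p>".encode("latin-1")
clean_utf8 = "<p>caf\u00e9 &amp; tea</p>".encode("utf-8")

for sample in (with_entity, saved_as_latin1, clean_utf8):
    print(diagnose(sample))
```

For a real file, the bytes would come from open(path, "rb").read(). An empty result means neither of these two causes applies, and the problem lies elsewhere in the pipeline.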