Wrox Programmer Forums
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XML section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old September 16th, 2005, 04:07 AM
Authorized User
 
Join Date: Jun 2003
Posts: 57
Thanks: 0
Thanked 0 Times in 0 Posts
Entity conversion (TidyCOM)

hi

I converted an HTML file to XML with TidyCOM.
For special characters, the conversion produces this:
é -> &eacute;

When I load the converted XML into an MSXML object, I get an error on those special-character entities.
I have tested a lot with TidyCOM, and it seems there's no way to keep 'é' as 'é'.

How can I fix this so that the converted XML loads into an MSXML object?

thx a lot, kabe.be
 
Old September 16th, 2005, 06:43 AM
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

I've no idea what TidyCom is, but I doubt that this forum is a good way of reporting bugs in the product, which is what this appears to be.



Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
Old September 16th, 2005, 07:09 AM
Authorized User
 
Join Date: Jun 2003
Posts: 57
Thanks: 0
Thanked 0 Times in 0 Posts

I wasn't trying to get support for TidyCOM here ...
just trying to get some help with handling the following piece of XML (produced by TidyCOM):

<TR>
<TD height="20">Algemeen | G&eacute;n&eacute;ral</TD>
<TD>Seco-M Manager</TD>
</TR>

I get an error on "&eacute;" when loading it into an MSXML object.

Is there a way to load it correctly?
Thx, Kabe


btw: more info about Tidy, which TidyCOM wraps: http://tidy.sourceforge.net/
 
Old September 16th, 2005, 07:36 AM
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

If your document contains a reference to &eacute; and doesn't reference a DTD that defines eacute, then it's not well-formed XML and no conforming parser will accept it. So you've got to fix the process that generates it.
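
The point about undeclared entities can be checked with any conforming XML parser. The following is a minimal sketch that uses Python's standard-library parser rather than MSXML (an assumption made purely for illustration; the helper name `parses` is hypothetical). It compares the fragment from the thread as Tidy produced it, the same fragment with numeric character references, and the same fragment preceded by an internal DTD subset that declares eacute.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is accepted as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The cell from the thread, with the named entity left undeclared.
named = '<TD height="20">Algemeen | G&eacute;n&eacute;ral</TD>'

# The same cell using numeric character references, which need no DTD.
numeric = '<TD height="20">Algemeen | G&#233;n&#233;ral</TD>'

# The same cell with eacute declared in an internal DTD subset.
declared = '<!DOCTYPE TD [<!ENTITY eacute "&#233;">]>' + named

for label, text in (("named", named), ("numeric", numeric), ("declared", declared)):
    print(label, parses(text))
```

Only the undeclared named entity should be rejected. The other two variants carry everything the parser needs inside the document itself.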

I didn't realize TidyCOM was just a binding for the well-known Tidy utility. I suspect you've somehow invoked it with some incorrect options, or something.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
Old September 16th, 2005, 07:55 AM
Authorized User
 
Join Date: Jun 2003
Posts: 57
Thanks: 0
Thanked 0 Times in 0 Posts

That wasn't a solution, but it always helps to have it confirmed that I had to search deeper :-)

Solution:
I had to set an option in the Tidy code to convert special characters to numeric entities.

Works fine now.
Regards, Kabe
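
The thread does not name the option that was set. Tidy documents a `numeric-entities` setting that looks like the probable candidate, but that is an assumption. Where changing the Tidy configuration is not possible, the same effect can be approximated by post-processing the output. The sketch below (with the hypothetical helper `named_to_numeric`) rewrites HTML named entities as numeric character references using Python's standard `html.entities` table, and leaves the five entities predefined by XML untouched.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that every XML parser already understands; leave them alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace known HTML named entities with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep predefined or unknown names as-is
        return "&#%d;" % name2codepoint[name]

    return ENTITY_REF.sub(replace, text)


# The fragment quoted earlier in the thread.
fragment = (
    "<TR>\n"
    '<TD height="20">Algemeen | G&eacute;n&eacute;ral</TD>\n'
    "<TD>Seco-M Manager</TD>\n"
    "</TR>"
)

fixed = named_to_numeric(fragment)
row = ET.fromstring(fixed)  # parses now that no undeclared entity remains
print(fixed)
print(row[0].text)
```

A regular expression is enough here because entity references have a fixed shape. A fuller version would also skip CDATA sections and comments, which this sketch does not attempt.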




