|
Subject:
|
✓ not output correctly from DOM parser
|
|
Posted By:
|
srivalli9
|
Post Date:
|
11/17/2003 12:40:02 PM
|
Need a small help again.
In my original XML file, I have an entity &#10003;. It represents (ALT 10003) a check symbol (a box).
But the output XML file from my DOM parser represents it as a question mark (?), which is NOT what I want.
I'm using a DOM parser with the JAXP API to parse the XML file.
Could you help me understand why my parser doesn't output the correct character for this particular entity only?
Any suggestion is greatly appreciated.
-Srivalli.
|
|
Reply By:
|
armmarti
|
Reply Date:
|
11/18/2003 1:09:05 AM
|
Probably the editor you're using to view your output is confusing you (it doesn't support the encoding you specified in the XML declaration). Check this first, or try an editor that offers a hexadecimal view.
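If no hex-capable editor is at hand, the same check can be sketched in a few lines of Java. This is only an illustration: the class name HexView and the helper toHex are made up for this example. It prints the bytes a check mark (U+2713, decimal 10003) becomes in a few encodings; a lone 3F is a literal question mark, which means the character was already lost when the file was written, not just displayed wrongly.

```java
import java.nio.charset.StandardCharsets;

public class HexView {
    // Hypothetical helper: format bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String check = "\u2713"; // the check mark, same as &#10003; or &#x2713;
        System.out.println(toHex(check.getBytes(StandardCharsets.UTF_8)));      // E2 9C 93
        System.out.println(toHex(check.getBytes(StandardCharsets.UTF_16BE)));   // 27 13
        // ISO-8859-1 has no check mark, so Java substitutes '?' (0x3F).
        System.out.println(toHex(check.getBytes(StandardCharsets.ISO_8859_1))); // 3F
    }
}
```

To inspect a real file, the bytes from java.nio.file.Files.readAllBytes could be passed to the same helper.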
Regards, Armen
|
|
Reply By:
|
joefawcett
|
Reply Date:
|
11/18/2003 3:50:23 AM
|
Or your file is not encoded properly. Show some code.
Joe (MVP - xml)
|
|
Reply By:
|
srivalli9
|
Reply Date:
|
11/18/2003 9:31:54 AM
|
Hi joefawcett, armmarti,
http://xml.coverpages.org/xml-ISOents.txt
This link gives the character equivalent for the check character, which is x2713, using the encoding scheme of ISO 10646.
Here is the statement in my DTD...
----------------------------------
<!ENTITY check "&#x2713;">
This is a statement in the input XML file...
------------------------------------
The foll ✓ owing is
This is my output XML file when viewed in EditPlus, a text editor...
---------------------------
The foll ? owing is
Here is the code in my Java file, where I did not explicitly specify any output encoding type...
--------------------------------------------------
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(true);
factory.setIgnoringElementContentWhitespace(true);
DocumentBuilder builder = factory.newDocumentBuilder();
currDocument = builder.parse(f);

TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer(new StreamSource("transformoutput.xsl"));
t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "nmstat.dtd");

FileWriter fileOut = new FileWriter("outputAnno.xml");
t.transform(new DOMSource(document), new StreamResult(fileOut));
This is the code in transformoutput.xsl...
----------------------------------------------
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <xsl:copy-of select="node()"/>
  </xsl:template>
</xsl:stylesheet>
-----------------------------------------------
Could you help me figure out the problem? Thank you very much.
-Srivalli.
|
|
Reply By:
|
armmarti
|
Reply Date:
|
11/19/2003 8:16:09 AM
|
Hi,
UTF-16 is an encoding of ISO 10646, so you must specify the encoding for the output document. Add this top-level element to your stylesheet:
<xsl:output method="xml" version="1.0" encoding="UTF-16"/>
Then any editor (viewer, browser, etc.) that supports UTF-16 should display that character properly.
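One caveat worth sketching: when the result wraps a java.io.Writer such as FileWriter, the bytes are produced by the Writer's own charset (historically the platform default), so the declared output encoding may not be what ends up in the file. A minimal sketch of the alternative follows, assuming the result is handed to the transformer as an OutputStream so that the transformer does the encoding itself. The class name EncodedOutput, the helpers parse and write, and the sample string are illustrative only; an identity transformer stands in for transformoutput.xsl.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public class EncodedOutput {
    // Hypothetical helper: parse an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Hypothetical helper: serialize the document to a byte stream in the
    // requested encoding. Because the target is an OutputStream, the
    // transformer itself converts characters to bytes.
    static void write(Document doc, OutputStream out, String encoding) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer(); // identity transform
        t.setOutputProperty(OutputKeys.ENCODING, encoding);
        t.transform(new DOMSource(doc), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<p>The foll&#x2713;owing is</p>");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(doc, out, "UTF-16");
        // Decode with the same encoding to confirm the check mark survived.
        String text = new String(out.toByteArray(), StandardCharsets.UTF_16);
        System.out.println(text.contains("\u2713"));
    }
}
```

In the code from the question, this would amount to passing a FileOutputStream for outputAnno.xml to StreamResult in place of the FileWriter.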
Regards, Armen
|