Subject: ✓ not output correctly from DOM parser
Posted By: srivalli9 Post Date: 11/17/2003 12:40:02 PM
I need a little help again.

In my original XML file, I have the character reference &#10003;
It represents (ALT 10003) a check symbol (a box).

But the output XML file from my DOM parser represents it as a question mark (?), which is NOT what I want.

I'm using DOM parser with JAXP API to parse the XML file.

Could you help me understand why my parser doesn't output the correct character for this particular entity only?

Any suggestion is greatly appreciated.

-Srivalli.


Reply By: armmarti Reply Date: 11/18/2003 1:09:05 AM
Probably the editor you're using to view your output is confusing you (it doesn't support the encoding you specified in the XML declaration). Check this first, or try an editor that offers a hexadecimal view.
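
If no hex editor is at hand, the same check can be sketched in a few lines of Java. This is only an illustration, and the class name CheckMarkBytes and its helpers are made up for the example. It encodes U+2713 in three charsets and prints the raw bytes, so you know what to look for in the output file. A charset that cannot represent the character (ISO-8859-1 here) silently substitutes 3F, which is the '?' byte.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class CheckMarkBytes {
    // Format a byte array as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode the check mark (U+2713) in the given charset and hex-dump it.
    static String encode(Charset cs) {
        return hex("\u2713".getBytes(cs));
    }

    public static void main(String[] args) {
        System.out.println("UTF-8      : " + encode(StandardCharsets.UTF_8));      // E2 9C 93
        System.out.println("UTF-16     : " + encode(StandardCharsets.UTF_16));     // FE FF 27 13 (BOM first)
        System.out.println("ISO-8859-1 : " + encode(StandardCharsets.ISO_8859_1)); // 3F, i.e. '?'
    }
}
```

If the hex view of the output file shows a single 3F where the check mark should be, the character was lost when the file was written, and the editor is not to blame.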

Regards,
Armen
Reply By: joefawcett Reply Date: 11/18/2003 3:50:23 AM
Or your file is not encoded properly. Show some code.

Joe (MVP - xml)
Reply By: srivalli9 Reply Date: 11/18/2003 9:31:54 AM
Hi joefawcett,armmarti....

http://xml.coverpages.org/xml-ISOents.txt

This link lists the character equivalent for the check character, which is x2713 in ISO 10646.

Here is the statement in my DTD...
----------------------------------
<!ENTITY check "&#x2713;">

This is a statement in input XML file...
------------------------------------
The foll &check; owing is


This is my Output XML file when viewed in EditPlus-a text editor....
---------------------------
The foll ? owing is


Here is the code in my Java file, where I did not explicitly specify any output encoding type....
--------------------------------------------------
DocumentBuilderFactory factory=DocumentBuilderFactory.newInstance();
factory.setValidating(true);
factory.setIgnoringElementContentWhitespace(true);
DocumentBuilder builder=factory.newDocumentBuilder();
currDocument=builder.parse(f);

TransformerFactory tf=TransformerFactory.newInstance();
Transformer t = tf.newTransformer(new StreamSource("transformoutput.xsl"));
t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "nmstat.dtd");

FileWriter fileOut = new FileWriter("outputAnno.xml");
t.transform(new DOMSource(currDocument), new StreamResult(fileOut));


This is the code in transformoutput.xsl...
----------------------------------------------
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<xsl:copy-of select="node()"/>
</xsl:template>
</xsl:stylesheet>
-----------------------------------------------

Could you help me figure out the problem?
Thank you very much.

-Srivalli.

Reply By: armmarti Reply Date: 11/19/2003 8:16:09 AM
Hi,

UTF-16 is an encoding of ISO 10646, so you must specify the encoding for the output document. Add this top-level element to your stylesheet:


<xsl:output method="xml" version="1.0" encoding="UTF-16"/>


Then any editor (viewer, browser, etc.) that supports UTF-16 should display that character properly.
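
One caveat on the Java side, offered as a sketch and not a definitive fix: a FileWriter converts characters to bytes with the platform default encoding, whatever the stylesheet declares, so the check mark can still end up as '?' on a system whose default charset lacks it. Handing the transformer an OutputStream lets the serializer do the encoding itself. The example below is self-contained and makes some assumptions: the class name EncodedOutputSketch is invented, it parses a small in-memory document instead of your file, and it uses the identity transformer with OutputKeys.ENCODING in place of transformoutput.xsl.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class for illustration only.
class EncodedOutputSketch {
    // Parse an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serialize the document to bytes, letting the transformer apply the encoding.
    static byte[] serialize(Document doc, String encoding) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer(); // identity transform
        t.setOutputProperty(OutputKeys.ENCODING, encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // A byte stream, not a Writer: no platform-default conversion happens here.
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<p>The foll &#x2713; owing is</p>");
        String text = new String(serialize(doc, "UTF-8"), StandardCharsets.UTF_8);
        System.out.println(text);
        System.out.println("check mark preserved: " + text.contains("\u2713"));
    }
}
```

In the original program the equivalent change would be to pass new FileOutputStream("outputAnno.xml") to StreamResult in place of the FileWriter, and to keep the encoding attribute on xsl:output.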

Regards,
Armen
