XSLT: General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
Welcome to the p2p.wrox.com Forums.
You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com

October 12th, 2007, 08:18 AM
Authorized User
Join Date: Oct 2007
Posts: 24
Thanks: 0
Thanked 0 Times in 0 Posts
Quote (originally posted by samjudson):
Correct - everything inside the CDATA is just text, not part of the XML at all.
/- Sam Judson : Wrox Technical Editor -/
|
Thank you for your reply. What if we have tags inside CDATA, like here:
<![CDATA[
<div class="Textkrper"><span class="hn_para">Many word formatting conventions have been used in the formatting of this </span></div>
]]>
In this case, how can I get the text between those tags using XSLT? I have tried everything suggested in the replies as well as my own ideas, but I couldn't get it to work. Please give me some ideas on how to do that.

October 12th, 2007, 08:28 AM
Wrox Author
Join Date: Jun 2003
Posts: 3,074
Thanks: 1
Thanked 38 Times in 37 Posts
Well, that's why putting markup within CDATA is a bad model to use. You can either use an extension function (Saxon provides saxon:parse, I believe) or write recursive templates to turn the text back into nodes, which is a very difficult task. If you are using MSXML or .NET, you can write script functions that load the text into an XML document and return the nodes.
--
Joe ( Microsoft MVP - XML)

October 12th, 2007, 08:52 AM
Authorized User
Quote (originally posted by joefawcett):
Well, that's why putting markup within CDATA is a bad model to use. You can either use an extension function (Saxon provides saxon:parse, I believe) or write recursive templates to turn the text back into nodes, which is a very difficult task. If you are using MSXML or .NET, you can write script functions that load the text into an XML document and return the nodes.
--
Joe (Microsoft MVP - XML)
|
Hi,
Thanks for your reply. I couldn't understand your point about "loading the text into an XML document and returning the nodes" using MSXML.
Did you mean that I should load the text inside the CDATA into another XML document, turn it into nodes, and then transform it? Please clarify.
Thanks
Aruna

October 12th, 2007, 09:03 AM
Wrox Author
Yes. If you are using MSXML, look in the SDK for the msxsl:script element; there is an example there similar to what you need. Load the text into a new DOMDocument (you'll have to take care of the document element, or use createDocumentFragment) and return the document element's children. This assumes that the text in the CDATA will be well-formed when parsed. If that is the case, then why is it being put into CDATA in the first place?
--
Joe ( Microsoft MVP - XML)

October 12th, 2007, 09:04 AM
Friend of Wrox
Join Date: Aug 2007
Posts: 2,128
Thanks: 1
Thanked 189 Times in 188 Posts
Putting XML inside a CDATA section, as in the following code:
Code:
<xml><![CDATA[
<div id="1">value</div>
]]></xml>
is equivalent to the following, if NOT inside CDATA:
Code:
<xml>&lt;div id="1"&gt;value&lt;/div&gt;</xml>
As you can now tell, there is no XML inside the value to actually select; it's just text.
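A minimal sketch of what this looks like from the XSLT side, assuming the input is the CDATA version of the `xml` element above. The stylesheet is only an illustration and uses nothing beyond standard XSLT 1.0:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Count the element children of <xml>. The CDATA content is a
         single text node, so there are no elements to find. -->
    <xsl:text>element children: </xsl:text>
    <xsl:value-of select="count(xml/*)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The string value of <xml> is the literal markup, tags included. -->
    <xsl:text>string value: </xsl:text>
    <xsl:value-of select="normalize-space(xml)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against the CDATA version, this reports 0 element children and prints the div markup as an ordinary string, which is why a path such as xml/div selects nothing.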
If you wrote an extension function in .NET or Java that took a string as its parameter and returned an XmlDocument, you could call that. I've never written an extension function, and how you call it depends on which XSLT processor you're using, but the following might give you some idea of what would be required:
Code:
<xsl:variable name="newxml" select="myfuncs:ConvertTextToXml(xml/text())" />
<xsl:apply-templates select="$newxml/div" />
and the function (in C#) might look like this:
Code:
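using System.Xml;

public class ExtensionMethods
{
    // Parses a string of markup and returns it as a document
    // whose nodes the stylesheet can then select.
    public static XmlDocument ConvertTextToXml(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml); // throws XmlException if the text is not well-formed
        return doc;
    }
}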
As mentioned (I think), Saxon contains an extension function that already does this, called saxon:parse().
Code:
<xsl:variable name="newxml" select="saxon:parse(xml/text())" />
<xsl:apply-templates select="$newxml/div" />
/- Sam Judson : Wrox Technical Editor -/

October 12th, 2007, 09:20 AM
Wrox Author
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts
There are no tags inside CDATA. The whole purpose of CDATA (its only purpose) is to say "the stuff inside here might look like markup, but I want it treated as plain text". If you want it treated as markup, don't put it in CDATA.
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
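For comparison, a minimal sketch of the case where the content is real markup. It assumes the div/span from the first post sits directly inside a wrapper element with no CDATA section; the wrapper name `xml` is borrowed from the earlier example and is an assumption, since the thread never gives the real element name.

```xml
<?xml version="1.0"?>
<!-- Assumed input: an xml element that directly contains the
     div/span markup from the first post, with no CDATA wrapper. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The span is an ordinary element node here, so a plain
         path expression reaches its text. -->
    <xsl:value-of select="xml/div/span[@class='hn_para']"/>
  </xsl:template>
</xsl:stylesheet>
```

No extension function or second parse is needed in this case, because the parser has already built the div and span as nodes.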

October 12th, 2007, 09:24 AM
Authorized User
Quote (originally posted by mhkay):
There are no tags inside CDATA. The whole purpose of CDATA (its only purpose) is to say "the stuff inside here might look like markup, but I want it treated as plain text". If you want it treated as markup, don't put it in CDATA.
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
|
I am not writing the XML document myself; it is generated by a web service, so I think I have to deal with it as it is.

October 12th, 2007, 09:46 AM
Wrox Author
Well, if you're unfortunate enough to have to deal with XML that has been designed this way, which is sadly all too common, then your only option is to disentangle it by putting it through the XML parser twice. The first parse extracts the text of the CDATA section as a string; then you have to feed this back into the XML parser to get it back as a tree. The saxon:parse() extension was designed for the poor souls who are faced with this problem.
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
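The two-pass approach can also be sketched without a vendor extension, assuming an XSLT 3.0 processor (which postdates this thread). `parse-xml-fragment()` is a standard XPath 3.0 function that performs the second parse. The wrapper element name `xml` is again an assumption carried over from the earlier example.

```xml
<?xml version="1.0"?>
<!-- Assumed input: an xml element whose only content is a CDATA
     section holding the div/span markup from the first post. -->
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- First parse (already done by the processor): the CDATA
         content arrives as the string value of <xml>. -->
    <xsl:variable name="markup" select="string(xml)"/>

    <!-- Second parse: turn that string back into a node tree. The
         fragment variant accepts surrounding whitespace and more
         than one top-level element. -->
    <xsl:variable name="newxml" select="parse-xml-fragment($markup)"/>

    <!-- The span is now a real element and can be selected. -->
    <xsl:value-of select="$newxml/div/span[@class='hn_para']"/>
  </xsl:template>
</xsl:stylesheet>
```

As with saxon:parse(), this only works if the text inside the CDATA section is well-formed; otherwise the second parse raises a dynamic error.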

October 19th, 2007, 04:37 AM
Authorized User
Hi All,
Thank you very much for your help. It's working for me now, but I am using completely different XML documents without CDATA (after asking my boss).
Regards,
Aruna.G

October 19th, 2007, 04:44 AM
Wrox Author
Good, we'll get rid of CDATAed markup eventually :)
--
Joe ( Microsoft MVP - XML)