Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old October 12th, 2007, 08:18 AM
Authorized User
 
Join Date: Oct 2007
Posts: 24
Thanks: 0
Thanked 0 Times in 0 Posts
Default

Quote:
Originally posted by samjudson
 Correct - everything inside the CDATA is just text, not part of the XML at all.

/- Sam Judson : Wrox Technical Editor -/
Thank you for your reply. Suppose we have tags inside CDATA, as here:

<![CDATA[

<div class="Textkrper"><span class="hn_para">Many word formatting conventions have been used in the formatting of this </span></div>

]]>


In this case, how can I get the text between those tags using XSLT? I have tried everything suggested in the replies, as well as my own ideas, but I couldn't get it to work. Please give me some ideas on how to do that.





 
Old October 12th, 2007, 08:28 AM
joefawcett's Avatar
Wrox Author
 
Join Date: Jun 2003
Posts: 3,074
Thanks: 1
Thanked 38 Times in 37 Posts
Default

Well, that's why putting markup within CDATA is a bad model. You can either use an extension function (Saxon has saxon:parse, I believe) or write recursive templates to turn the text back into nodes, which is a highly difficult task. If you are using MSXML or .NET, you can write script functions that load the text into an XML document and return the nodes.

--

Joe (Microsoft MVP - XML)
 
Old October 12th, 2007, 08:52 AM
Authorized User
 
Join Date: Oct 2007
Posts: 24
Thanks: 0
Thanked 0 Times in 0 Posts
Default

Quote:
Originally posted by joefawcett
 Well, that's why putting markup within CDATA is a bad model. You can either use an extension function (Saxon has saxon:parse, I believe) or write recursive templates to turn the text back into nodes, which is a highly difficult task. If you are using MSXML or .NET, you can write script functions that load the text into an XML document and return the nodes.

--

Joe (Microsoft MVP - XML)
 Hi,
  Thanks for your reply. I couldn't understand your point about "loading text into an XML document and returning the nodes using MSXML".

  Did you mean that I should retrieve the text inside the CDATA, load it into another XML document, convert it to nodes, and then transform it? Please clarify.

Thanks
Aruna


 
Old October 12th, 2007, 09:03 AM
joefawcett's Avatar
Wrox Author
 
Join Date: Jun 2003
Posts: 3,074
Thanks: 1
Thanked 38 Times in 37 Posts
Default

Yes. If you are using MSXML, look in the SDK for the msxsl:script element; there is an example similar to what you need. Load the text into a new DOMDocument (you'll have to take care of the document element, or use createDocumentFragment) and return the document element's children. This assumes that the text in the CDATA will be well-formed when parsed. If that is the case, then why is it being put into CDATA in the first place?
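
As a rough illustration of "taking care of the document element", here is a minimal sketch in Java using the JDK's built-in DOM parser rather than MSXML. The class name FragmentParser, the method parseFragment, and the synthetic wrapper element are assumptions made up for this example, not part of any library. Wrapping the text in one root element lets a fragment with several top-level elements parse as a single document, and the wrapper's child elements are what gets returned.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class FragmentParser {
    /** Parses a markup fragment and returns its top-level elements. */
    static List<Element> parseFragment(String fragment) {
        try {
            // Hypothetical synthetic root: a document needs exactly one
            // document element, but a fragment may contain several.
            String wrapped = "<wrapper>" + fragment + "</wrapper>";
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrapped)));
            List<Element> children = new ArrayList<>();
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    children.add((Element) n);
                }
            }
            return children;
        } catch (Exception e) {
            // Reached, for example, when the text is not well-formed.
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        List<Element> nodes = parseFragment("<div class=\"a\">one</div><div class=\"b\">two</div>");
        for (Element e : nodes) {
            System.out.println(e.getTagName() + " " + e.getAttribute("class") + ": " + e.getTextContent());
        }
    }
}
```

The sketch only works when the text is well-formed once wrapped, which is the same assumption stated above. Text that starts with an XML declaration would need the declaration stripped first.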

--

Joe (Microsoft MVP - XML)
 
Old October 12th, 2007, 09:04 AM
samjudson's Avatar
Friend of Wrox
 
Join Date: Aug 2007
Posts: 2,128
Thanks: 1
Thanked 189 Times in 188 Posts
Default

Putting XML inside a CDATA section, as in the following code:

Code:
<xml><![CDATA[ 
<div id="1">value</div>
]]></xml>
equals the following when it is NOT inside CDATA:

Code:
<xml>&lt;div id="1"&gt;value&lt;/div&gt;</xml>
As you can now tell, there is no 'xml' inside the value to actually select; it's just text.
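
The same point can be checked with any XML parser. The following is a minimal sketch in Java using the JDK's built-in DOM parser; the class name CdataIsText and its parse helper are made up for this illustration. It parses both forms (with the surrounding line breaks dropped so the strings match exactly) and shows that each yields the same string and no div element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class CdataIsText {
    /** Parses a string as an XML document using the JDK's DOM parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Element viaCdata = parse("<xml><![CDATA[<div id=\"1\">value</div>]]></xml>").getDocumentElement();
        Element viaEscapes = parse("<xml>&lt;div id=\"1\"&gt;value&lt;/div&gt;</xml>").getDocumentElement();

        // Both forms give the same character data.
        System.out.println(viaCdata.getTextContent());
        System.out.println(viaCdata.getTextContent().equals(viaEscapes.getTextContent()));

        // Neither form contains a div element; there is nothing to select.
        System.out.println(viaCdata.getElementsByTagName("div").getLength());
        System.out.println(viaEscapes.getElementsByTagName("div").getLength());
    }
}
```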

If you wrote an extension function in .NET or Java that took a string as its parameter and returned an XmlDocument, then you could call that. I've never written an extension function, and how you call it depends on which XSLT processor you're using, but the following might give you some idea of what would be required:

Code:
<xsl:variable name="newxml" select="myfuncs:ConvertTextToXml(xml/text())" />
<xsl:apply-templates select="$newxml/div" />
and the function (in C#) might look like this:

Code:
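using System.Xml;

public class ExtensionMethods
{
  // Parses the text passed in from the stylesheet (for example, the content
  // of a CDATA section) and returns it as a document the stylesheet can query.
  // Note: some processors may require a different return type, such as
  // XPathNavigator; check your processor's documentation.
  public static XmlDocument ConvertTextToXml(string xml)
  {
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(xml); // throws XmlException if the text is not well-formed
    return doc;
  }
}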
As mentioned (I think), Saxon already provides an extension function that does this, called saxon:parse().

Code:
<xsl:variable name="newxml" select="saxon:parse(xml/text())" />
<xsl:apply-templates select="$newxml/div" />

/- Sam Judson : Wrox Technical Editor -/
 
Old October 12th, 2007, 09:20 AM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts
Default

There are no tags inside CDATA. The whole purpose of CDATA (its only purpose) is to say "the stuff inside here might look like markup, but I want it treated as plain text". If you want it treated as markup, don't put it in CDATA.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
Old October 12th, 2007, 09:24 AM
Authorized User
 
Join Date: Oct 2007
Posts: 24
Thanks: 0
Thanked 0 Times in 0 Posts
Default

Quote:
Originally posted by mhkay
 There are no tags inside CDATA. The whole purpose of CDATA (its only purpose) is to say "the stuff inside here might look like markup, but I want it treated as plain text". If you want it treated as markup, don't put it in CDATA.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
I am not writing the XML document myself; it will be generated by a web service, so I think I have to deal with the CDATA as it comes.

 
Old October 12th, 2007, 09:46 AM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts
Default

Well, if you're unfortunate enough to have to deal with XML that has been designed this way, which is sadly all too common, then your only option is to disentangle it by putting it through the XML parser twice. The first parse extracts the text of the CDATA section as a string; then you have to feed this back into the XML parser to get it back as a tree. The saxon:parse() extension was designed for the poor souls who are faced with this problem.
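
Outside a stylesheet, the two-pass idea can be sketched in a few lines of Java with the JDK's built-in DOM parser. This is only an illustration under stated assumptions: the class name DoubleParse, the method reparseContent, and the payload element are invented for the example, and the inner text is assumed to be well-formed with a single root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DoubleParse {
    /** Parses a string as an XML document using the JDK's DOM parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** First parse: CDATA content comes out as a string. Second parse: that string becomes a tree. */
    static Document reparseContent(String outerXml) {
        String innerText = parse(outerXml).getDocumentElement().getTextContent();
        // Trim the line breaks that surround the markup inside the CDATA section.
        return parse(innerText.trim());
    }

    public static void main(String[] args) {
        // "payload" is a hypothetical wrapper element standing in for the web service's output.
        String outer = "<payload><![CDATA[\n"
                + "<div class=\"Textkrper\"><span class=\"hn_para\">Many word formatting conventions</span></div>\n"
                + "]]></payload>";
        Document inner = reparseContent(outer);
        System.out.println(inner.getDocumentElement().getTagName());
        System.out.println(inner.getElementsByTagName("span").item(0).getTextContent());
    }
}
```

After the second parse, the div and span are real elements, so ordinary path expressions can reach them.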

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
Old October 19th, 2007, 04:37 AM
Authorized User
 
Join Date: Oct 2007
Posts: 24
Thanks: 0
Thanked 0 Times in 0 Posts
Default

Hi All,

  Thank you very much for your help. It's working for me now, but I am using completely different XML documents without CDATA (after asking my boss).

Regards,
Aruna.G

 
Old October 19th, 2007, 04:44 AM
joefawcett's Avatar
Wrox Author
 
Join Date: Jun 2003
Posts: 3,074
Thanks: 1
Thanked 38 Times in 37 Posts
Default

Good, we'll get rid of CDATAed markup eventually :)

--

Joe (Microsoft MVP - XML)










Copyright (c) 2020 John Wiley & Sons, Inc.