check name of the current node element in xml

arunagottimukkala · October 12th, 2007, 08:18 AM

Quote:

quote:Originally posted by samjudson
Correct - everything inside the CDATA is just text, not part of the XML at all.

/- Sam Judson : Wrox Technical Editor -/

Thank you for your reply. Suppose we have tags inside CDATA, like this:

<![CDATA[

<div class="Textkrper"><span class="hn_para">Many word formatting conventions have been used in the formatting of this </span></div>

]]>

In this case, if I want to get the text between those tags using XSLT, how can I do it? I tried all the suggestions in the replies as well as my own ideas, but I couldn't get it to work. Please give me some ideas on how to do that.

joefawcett · October 12th, 2007, 08:28 AM

Well, that's why putting markup within CDATA is a bad model to use. You can either use an extension function (Saxon has saxon:parse, I believe) or write recursive templates to turn the text back into nodes, which is a highly difficult task. If you are using MSXML or .NET you can write script functions that load the text into an XML document and return the nodes.

--

Joe (Microsoft MVP - XML)

arunagottimukkala · October 12th, 2007, 08:52 AM

Quote:

quote:Originally posted by joefawcett
Well, that's why putting markup within CDATA is a bad model to use. You can either use an extension function (Saxon has saxon:parse, I believe) or write recursive templates to turn the text back into nodes, which is a highly difficult task. If you are using MSXML or .NET you can write script functions that load the text into an XML document and return the nodes.

--

Joe (Microsoft MVP - XML)

Hi,
Thanks for your reply. I couldn't understand your point about "loading the text into an XML document and returning the nodes" using MSXML.

Did you mean that I should retrieve the text inside the CDATA, load it into another XML document so that it becomes nodes, and then transform that? Please clarify.

Thanks
Aruna

joefawcett · October 12th, 2007, 09:03 AM

Yes. If you are using MSXML, look in the SDK for the msxsl:script element; there is an example similar to what you need. Load the text into a new DOMDocument (you'll have to take care of the document element, or use createDocumentFragment) and return the document element's children. This assumes that the text in the CDATA will be well-formed when parsed. If that is the case, then why is it being put into CDATA in the first place?
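
As a rough, language-neutral illustration of the "take care of the document element" point, here is a minimal Python sketch using the standard xml.etree.ElementTree module rather than MSXML. The function name parse_fragment and the wrapper element name are made up for the example. It wraps the text in a single root so that text with several top-level elements still parses, then returns that root's children.

```python
import xml.etree.ElementTree as ET


def parse_fragment(text):
    """Parse markup text that may contain several top-level elements.

    Hypothetical helper: wraps the text in a made-up <wrapper> root so the
    parser sees exactly one document element, then returns its children.
    Raises ET.ParseError if the text is not well-formed.
    """
    wrapper = ET.fromstring("<wrapper>" + text + "</wrapper>")
    return list(wrapper)


nodes = parse_fragment('<div class="a">one</div><div class="b">two</div>')
print([node.get("class") for node in nodes])  # prints ['a', 'b']
```

The wrapping trick only works if the text has no XML declaration of its own, and it still fails on text that is not well-formed.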

--

Joe (Microsoft MVP - XML)

samjudson · October 12th, 2007, 09:04 AM

Putting XML inside a CDATA section is the same as escaping it as text. This code:

Code:

<xml><![CDATA[ 
<div id="1">value</div>
]]></xml>

is equivalent to the following, written without CDATA:

Code:

<xml>&lt;div id="1"&gt;value&lt;/div&gt;</xml>

As you can now tell, there is no 'xml' inside the value to actually select; it's just text.

If you wrote an extension function in .NET or Java which took a string as its parameter and returned an XmlDocument, then you could call that. I've never written an extension function, and how you call it depends on which XSLT processor you're using, but the following might give you some idea of what would be required:
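
To see this from a parser's point of view, here is a small Python sketch using the standard xml.etree.ElementTree module (any conforming XML parser should behave the same way). The two input strings are single-line versions of the examples above.

```python
import xml.etree.ElementTree as ET

cdata_form = '<xml><![CDATA[<div id="1">value</div>]]></xml>'
escaped_form = '<xml>&lt;div id="1"&gt;value&lt;/div&gt;</xml>'

cdata_root = ET.fromstring(cdata_form)
escaped_root = ET.fromstring(escaped_form)

# Both forms give the same text content on the <xml> element.
print(cdata_root.text == escaped_root.text)  # prints True

# Neither form produces a child <div> element that could be selected.
print(len(cdata_root))         # prints 0
print(cdata_root.find("div"))  # prints None
```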

Code:

<xsl:variable name="newxml" select="myfuncs:ConvertTextToXml(xml/text())" />
<xsl:apply-templates select="$newxml/div" />

and the function (in C#) might look like this:

Code:

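using System.Xml;

public class ExtensionMethods
{
  // Parses a string of markup into a DOM tree that XPath can then navigate.
  public static XmlDocument ConvertTextToXml(string xml)
  {
    XmlDocument doc = new XmlDocument();
    // LoadXml throws an XmlException if the text is not well-formed.
    doc.LoadXml(xml);
    return doc;
  }
}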

As mentioned (I think), Saxon has an extension function which already does this, called saxon:parse().

Code:

<xsl:variable name="newxml" select="saxon:parse(xml/text())" />
<xsl:apply-templates select="$newxml/div" />

/- Sam Judson : Wrox Technical Editor -/

mhkay · October 12th, 2007, 09:20 AM

There are no tags inside CDATA. The whole purpose of CDATA (its only purpose) is to say "the stuff inside here might look like markup, but I want it treated as plain text". If you want it treated as markup, don't put it in CDATA.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference

arunagottimukkala · October 12th, 2007, 09:24 AM

Quote:

quote:Originally posted by mhkay
There are no tags inside CDATA. The whole purpose of CDATA (its only purpose) is to say "the stuff inside here might look like markup, but I want it treated as plain text". If you want it treated as markup, don't put it in CDATA.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference

I am not writing the XML document; it will be generated by a web service, so I think I have to deal with the CDATA in that case.

mhkay · October 12th, 2007, 09:46 AM

Well, if you're unfortunate enough to have to deal with XML that has been designed this way, which is sadly all too common, then your only option is to disentangle it by putting it through the XML parser twice. The first parse extracts the text of the CDATA section as a string; then you have to feed this back into the XML parser to get it back as a tree. The saxon:parse() extension was designed for the poor souls who are faced with this problem.
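
Outside XSLT, the two-parse idea can be sketched in a few lines of Python with the standard xml.etree.ElementTree module. This is only an illustration, not how saxon:parse() is implemented. The enclosing `<xml>` element is an assumption borrowed from the earlier example; the CDATA content is the markup from the original question.

```python
import xml.etree.ElementTree as ET

source = (
    '<xml><![CDATA[<div class="Textkrper"><span class="hn_para">'
    'Many word formatting conventions have been used in the formatting of this '
    '</span></div>]]></xml>'
)

# First parse: the CDATA section comes back as a plain string.
outer = ET.fromstring(source)
cdata_text = outer.text

# Second parse: feed that string back into the parser to get a tree.
inner = ET.fromstring(cdata_text.strip())

span = inner.find("span")
print(span.get("class"))  # prints hn_para
print(span.text)
```

The second parse only succeeds when the CDATA text is well-formed and has a single top-level element.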

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference

arunagottimukkala · October 19th, 2007, 04:37 AM

Hi All,

Thank you very much for your help. It's working for me, but I am now using completely different XML documents without CDATA (after asking my boss).

Regards,
Aruna.G

joefawcett · October 19th, 2007, 04:44 AM

Good, we'll get rid of CDATAed markup eventually :)

--

Joe (Microsoft MVP - XML)