Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
February 3rd, 2010, 07:16 AM
mattisimo
Registered User
 
Join Date: Feb 2010
Posts: 4
Thanks: 3
Thanked 0 Times in 0 Posts
Escaped CDATA - outputting HTML

Hi, I am trying to consume the XML output from a webservice and extract the XML and translate it to HTML using XSL.

I have run into an issue where the content of the nodes is surrounded with CDATA which is escaped, as is the content of the CDATA. I would like to be able to remove the CDATA and output the contents as unescaped HTML.

Here's a sample of the XML. Any suggestions gratefully received.

Code:
<?xml version="1.0" encoding="UTF-8"?>
<Articles>
	<Article>
		<ID>{GUID}</ID>
		<Title>Article Title</Title>
		<DatePublished>2008-11-26 12:19:00</DatePublished>
		<Intro>&lt;![CDATA[&lt;strong&gt;Intro text that is surrounded by a strong tag. &lt;/strong&gt;]]&gt;</Intro>
		<Summary>&lt;![CDATA[&lt;![CDATA[Summary text that sometimes has a double CDATA tag for some reason.  May also have strong or p tags also. ]]&gt;]]&gt;</Summary>
		<Url></Url>
	</Article>
</Articles>
Thanks,

Matt
 
February 3rd, 2010, 07:30 AM
samjudson
Friend of Wrox
 
Join Date: Aug 2007
Posts: 2,128
Thanks: 1
Thanked 189 Times in 188 Posts

I'm fairly sure that nested CDATA sections are invalid XML, unless one of the CDATA sections is escaped as well.

Saxon has an extension function called saxon:parse which can be used to parse an element's text as if it were XML. You can then output the result using xsl:copy-of, for example:

<xsl:copy-of select="saxon:parse(//Intro)" xmlns:saxon="http://saxon.sf.net/"/>
__________________
/- Sam Judson : Wrox Technical Editor -/

Think before you post: What have you tried?

Last edited by samjudson; February 3rd, 2010 at 07:37 AM..
 
February 3rd, 2010, 07:33 AM
Martin Honnen
Friend of Wrox
 
Join Date: Nov 2007
Posts: 1,243
Thanks: 0
Thanked 245 Times in 244 Posts

That is not easy to solve as the CDATA section markup is also escaped.
If you had e.g.
Code:
<Intro><![CDATA[&lt;strong&gt;foobar&lt;/strong&gt;]]></Intro>
you could simply use disable-output-escaping as in
Code:
<xsl:template match="Intro">
  <xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>
But as your CDATA section markup is also escaped that approach does not work. Which XSLT processor do you use? In case of Saxon 9 you might be able to parse the contents of the 'Intro' or 'Summary' elements with an extension function.
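If you are not sure which processor (or which XSLT version) is actually running your stylesheet, a small sketch like the following reports it. The three system-property names are standard in XSLT 1.0 and later; the text layout of the output is just an illustration, and the stylesheet can be run against any input document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Reports which XSLT processor is running; the input document is ignored. -->
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- xsl:version is the XSLT version the processor implements -->
    <xsl:text>XSLT version: </xsl:text>
    <xsl:value-of select="system-property('xsl:version')"/>
    <!-- xsl:vendor and xsl:vendor-url identify the processor itself -->
    <xsl:text>&#10;Vendor: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text>&#10;Vendor URL: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor-url')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

A reported version of 1.0 means XPath 2.0 functions and Saxon 9 extension functions will not be available.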
__________________
Martin Honnen
Microsoft MVP (XML, Data Platform Development) 2005/04 - 2013/03
My blog
 
February 3rd, 2010, 07:50 AM
mattisimo

Hi, thanks for your responses.

I'm not using any particular XSLT processor at the moment, just hand-coding the transform with basic templates and <xsl:value-of>-type stuff.

Perhaps I should investigate Saxon? The Parse solution looks good.

All the examples I have found so far online seem to have the CDATA unescaped which makes a lot more sense to me than to escape the lot.

I was wondering if I could do a simple replace on the starting and ending CDATA, replacing with an empty string and then unescape the rest? Does that sound reasonable? It might get around the illogical inclusion of double CDATAs.

Thanks again,

Matt
 
February 3rd, 2010, 08:06 AM
samjudson

Apologies, since Martin posted I've double checked - and the parse method doesn't work I'm afraid - at least not with the CData escaped as well.
 
February 3rd, 2010, 08:44 AM
Martin Honnen

If you use XSLT 2.0 and you know that elements like 'Intro' or 'Summary' start and end with that escaped CDATA section markup, then you could strip it as follows and output the remaining escaped HTML markup with output escaping disabled:
Code:
  <xsl:template match="Intro | Summary">
       <xsl:value-of select="replace(., '^(&lt;!\[CDATA\[)+|(\]\]&gt;)+$', '')" disable-output-escaping="yes"/>
  </xsl:template>
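Put together as a complete stylesheet, that template might look like the sketch below. It assumes an XSLT 2.0 processor, the Articles/Article structure from the sample in the first post, and that the escaped CDATA markers sit at the very start and end of the element text with no surrounding whitespace. The HTML wrapper (html, body, div, h2) is purely illustrative. Bear in mind that disable-output-escaping is an optional feature and only takes effect when the processor serializes the result itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Illustrative page wrapper around all articles -->
  <xsl:template match="/Articles">
    <html>
      <body>
        <xsl:apply-templates select="Article"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="Article">
    <div>
      <h2><xsl:value-of select="Title"/></h2>
      <xsl:apply-templates select="Intro | Summary"/>
    </div>
  </xsl:template>

  <!-- Strip one or more leading "<![CDATA[" and trailing "]]>" markers,
       then write what is left without escaping it again -->
  <xsl:template match="Intro | Summary">
    <div>
      <xsl:value-of
        select="replace(., '^(&lt;!\[CDATA\[)+|(\]\]&gt;)+$', '')"
        disable-output-escaping="yes"/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

Because the pattern uses + on both groups, the doubled CDATA markers in 'Summary' are removed in the same pass as the single ones in 'Intro'.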

Last edited by Martin Honnen; February 3rd, 2010 at 09:52 AM.. Reason: correcting typo
 
February 3rd, 2010, 10:48 AM
mattisimo

Hi again, thanks for your suggestion.

I've tried this approach in Altova XML Spy 2004 and it said that the replace function wasn't valid.

I tried writing an ASP.NET page to implement it instead, in case this version of the software didn't support it, and I get the same message:

'replace()' is an unknown XSLT function.

Here's the sample stylesheet:

Code:
<?xml version="1.0" encoding="UTF-16"?>
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:f="http://fxsl.sf.net/">
	<xsl:output method="xml" encoding="utf-16" omit-xml-declaration="yes"/>
	<xsl:template match="/">
		<xsl:apply-templates/>
	</xsl:template>
	
	<xsl:template match="LiveWellArticle">
		<xsl:apply-templates select="Intro"/>
	</xsl:template>
	
	<xsl:template match="Intro">
		<xsl:param name="text" select="."/>
        <br/><xsl:value-of select="replace($text, '^(&lt;!\[CDATA\[)+|(\]\]&gt;)+$', '')" disable-output-escaping="yes"/>
        <br/><xsl:value-of select="$text" disable-output-escaping="yes"/>
        <br/><xsl:value-of select="." disable-output-escaping="yes"></xsl:value-of>
	</xsl:template>

</xsl:stylesheet>
In XML Pad it just seems to ignore the line with the replace as it outputs nothing for that line and outputs the full contents including the CDATA for the other two lines.

Am I missing something with the implementation of <xsl:value-of select="replace(.......)">?

Thanks again for your help with this. As you may have realised I'm new to XSL.
 
February 3rd, 2010, 10:54 AM
Martin Honnen

replace() is defined in XPath 2.0, so you need an XSLT 2.0 processor running an XSLT 2.0 stylesheet to use that function. XSLT 2.0 and XPath 2.0 only became W3C Recommendations at the beginning of 2007, so an editor with 2004 in its name is unlikely to support that function, or XSLT/XPath 2.0 at all.
Try Saxon 9 or the free AltovaXML 2010 tools.
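If you are stuck with an XSLT 1.0 processor (the XSLT classes built into .NET, for instance, implement XSLT 1.0 only), the same stripping can be approximated without replace() by a recursive named template. This is only a sketch: the template name strip-cdata is made up for illustration, and it assumes the escaped markers sit at the very start and end of the text with no surrounding whitespace and that the processor honours disable-output-escaping.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Only process the elements that carry escaped markup -->
  <xsl:template match="/">
    <xsl:apply-templates select="//Intro | //Summary"/>
  </xsl:template>

  <xsl:template match="Intro | Summary">
    <div>
      <xsl:call-template name="strip-cdata">
        <xsl:with-param name="text" select="string(.)"/>
      </xsl:call-template>
    </div>
  </xsl:template>

  <!-- Hypothetical helper: peels off leading "<![CDATA[" and trailing "]]>"
       markers one at a time, then writes the remainder unescaped -->
  <xsl:template name="strip-cdata">
    <xsl:param name="text"/>
    <xsl:variable name="open" select="'&lt;![CDATA['"/>
    <xsl:variable name="close" select="']]&gt;'"/>
    <xsl:choose>
      <xsl:when test="starts-with($text, $open)">
        <xsl:call-template name="strip-cdata">
          <xsl:with-param name="text" select="substring-after($text, $open)"/>
        </xsl:call-template>
      </xsl:when>
      <!-- XPath 1.0 has no ends-with(), so compare the last three characters -->
      <xsl:when test="substring($text, string-length($text) - 2) = $close">
        <xsl:call-template name="strip-cdata">
          <xsl:with-param name="text"
            select="substring($text, 1, string-length($text) - 3)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text" disable-output-escaping="yes"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The helper writes straight to the output rather than returning its result in a variable, because in XSLT 1.0 disable-output-escaping is not reliable once the text has passed through a result tree fragment.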
 
February 3rd, 2010, 11:01 AM
mattisimo

Great, thanks for your help - I'll investigate these further.





Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
Copyright (c) 2020 John Wiley & Sons, Inc.