Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
November 1st, 2008, 02:09 AM
Registered User
Join Date: Sep 2008
XSLT replace for character entities

Hi all,

I have an XML file that contains HTML tags, and when I generate a PDF file the HTML is shown literally, e.g. the <br /> tag is output as it is, whereas I want a new line instead of the tag. I checked my XML file: it contained "br" tag in the form " &lt;br /&gt;".

My XSL code below works for a simple "br" tag but not for character entities.

      <xsl:template match="br">
          <fo:block />
      </xsl:template>

How can I interpret character entities as HTML tags in XSLT?
I tried string replace, but I think a different approach is needed here.

November 1st, 2008, 05:19 AM
mhkay
Wrox Author
Join Date: Apr 2004

>it contained "br" tag in the form " &lt;br /&gt;".

No, it didn't contain a "br" tag. If it was a tag it would be written <br/>. If a "<" is written as "&lt;", that is because the author doesn't want the character to be treated as part of a tag. If they don't want it treated as a tag, then why are you trying to treat it as one?

OK, that's harsh: this is a common situation (though in my view it is bad design). But it helps to be clear about the terminology. You have an XML text node whose contents contain a fragment of lexical unparsed HTML. If you want to process that HTML in a way that takes account of its structure then you first need to parse it. There are two ways to do that. You can try to parse it in XSLT code, but unless you're dealing with a very constrained subset of HTML that is going to be hard work. Or you can pass it to an HTML parser (or perhaps an XML parser if you know that it's actually XHTML). That will require use of extension functions - which might come from your vendor, like saxon:parse(), or which you might have to write yourself.
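
For the "very constrained subset" case, here is a minimal sketch in plain XSLT 1.0, assuming the only markup that ever appears in the text is the exact string <br /> (written in the source as &lt;br /&gt;). The element name "description" and the template name "break-lines" are hypothetical, chosen only for illustration. The named template splits the text at each marker and wraps every piece in its own fo:block, so each piece starts on a new line.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical element name: 'description' holds the escaped HTML text. -->
  <xsl:template match="description">
    <fo:block>
      <xsl:call-template name="break-lines">
        <xsl:with-param name="text" select="string(.)"/>
      </xsl:call-template>
    </fo:block>
  </xsl:template>

  <!-- Recursively split $text at each occurrence of $marker.
       After XML parsing, the text node holds the literal characters
       '<br />', so that is the string the marker must match. -->
  <xsl:template name="break-lines">
    <xsl:param name="text"/>
    <xsl:param name="marker" select="'&lt;br /&gt;'"/>
    <xsl:choose>
      <xsl:when test="contains($text, $marker)">
        <!-- Emit the part before the marker as its own block ... -->
        <fo:block>
          <xsl:value-of select="substring-before($text, $marker)"/>
        </fo:block>
        <!-- ... then process the remainder the same way. -->
        <xsl:call-template name="break-lines">
          <xsl:with-param name="text" select="substring-after($text, $marker)"/>
          <xsl:with-param name="marker" select="$marker"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No marker left: emit the final piece. -->
        <fo:block>
          <xsl:value-of select="$text"/>
        </fo:block>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

This only recognises one spelling of the tag. Variants such as <br/> or <br>, or any other HTML element, pass through as literal text, which is why this route gets hard quickly once the subset grows.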

Michael Kay
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
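
The parser route mentioned in the reply can be sketched as follows, under two assumptions: the escaped content is well-formed XHTML, and the processor supports XSLT 3.0, whose standard function parse-xml-fragment() postdates this thread. With an older processor, a vendor extension such as the saxon:parse() named above would take its place. The element name "description" is hypothetical.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical element name: 'description' holds the escaped markup.
       parse-xml-fragment() turns the string into a document node whose
       children are real text and element nodes. -->
  <xsl:template match="description">
    <fo:block>
      <xsl:apply-templates select="parse-xml-fragment(string(.))/node()"/>
    </fo:block>
  </xsl:template>

  <!-- Once the fragment is parsed, the original rule for br applies. -->
  <xsl:template match="br">
    <fo:block/>
  </xsl:template>

</xsl:stylesheet>
```

Text nodes in the parsed fragment are copied to the output by the built-in template rules. If the content is not well-formed (for example an unclosed <br>), the parse raises an error, and an HTML parser would be needed instead.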

