Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old October 8th, 2012, 06:12 AM
Registered User
 
Join Date: Oct 2012
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts
Default XML parsing in XSLT

Hi,

I need a generic XML parser (pref. SAX implementation) in XSLT.

Given an XML document, I need output that lists each tag, its depth, its value, and its attributes with their values, in a delimited string format.

Any help would be appreciated.

Cheers
 
Old October 8th, 2012, 06:37 AM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts
Default

Why would you need an XML parser in XSLT to achieve this? All the information you require is available in the tree model generated by a standard XML parser.

You just need a stylesheet that does something along the lines of

Code:
<xsl:output method="text"/>
<xsl:template match="*">
  name=<xsl:value-of select="name()"/>
  depth=<xsl:value-of select="count(ancestor::*)"/>
  attributes=(<xsl:apply-templates select="@*"/>)
</xsl:template>
<xsl:template match="@*">
  name=<xsl:value-of select="name()"/>
  value=<xsl:value-of select="."/>
</xsl:template>
etc.
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
 
Old October 8th, 2012, 06:53 AM
Registered User
 
Join Date: Oct 2012
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts
Default

Hi,

Thanks for the quick reply; I have modified your code slightly as follows..

Code:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text()"/>
<xsl:template match="*">
  <xsl:for-each select="//node()">
  name=<xsl:value-of select="name()"/>
  depth=<xsl:value-of select="count(ancestor::*)"/>
  attributes=(<xsl:apply-templates select="@*"/>)
  </xsl:for-each>
</xsl:template>
<xsl:template match="@*">
  name=<xsl:value-of select="name()"/>
  value=<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
The source XML I use is this..

Code:
<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
   </book>
</catalog>
And the output I get from the XSL is this..

Code:
  name=catalog
  depth=0
  attributes=()
  
  name=
  depth=1
  attributes=()
  
  name=book
  depth=1
  attributes=(
  name=id
  value=bk101)
The above output is missing the values of the tags themselves, e.g. publish_date.

It would be great if you could help me with this..

Thanks & cheers

Last edited by bsudhindra; October 8th, 2012 at 06:59 AM..
 
Old October 8th, 2012, 07:13 AM
samjudson's Avatar
Friend of Wrox
 
Join Date: Aug 2007
Posts: 2,128
Thanks: 1
Thanked 189 Times in 188 Posts
Default

The issue here is how you understand XML, and what a text node is. node() selects all the child nodes of an element, including text nodes. * selects only element nodes, so text nodes are excluded.
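
To see the difference, here is a minimal sketch (not taken from the posts above) that walks every node selected by //node() and labels its kind. The labels "element", "text" and "other" are only illustrative. The point it shows is that name() returns an empty string for a text node, and that a text node sits one level deeper than the element containing it.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Start at the document node and visit every node selected by //node() -->
  <xsl:template match="/">
    <xsl:for-each select="//node()">
      <xsl:choose>
        <!-- self::* is true only for element nodes -->
        <xsl:when test="self::*">
          <xsl:text>element name=</xsl:text>
        </xsl:when>
        <!-- self::text() is true only for text nodes -->
        <xsl:when test="self::text()">
          <xsl:text>text    name=</xsl:text>
        </xsl:when>
        <!-- comments and processing instructions end up here -->
        <xsl:otherwise>
          <xsl:text>other   name=</xsl:text>
        </xsl:otherwise>
      </xsl:choose>
      <!-- name() is the empty string for text nodes -->
      <xsl:value-of select="name()"/>
      <xsl:text> depth=</xsl:text>
      <xsl:value-of select="count(ancestor::*)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Replacing //node() with //* in the for-each restricts the listing to element nodes only.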


Try this on for size:

Code:
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>

<xsl:template match="*">
  name=<xsl:value-of select="name()"/>
  depth=<xsl:value-of select="count(ancestor::*)"/>
  attributes=(<xsl:apply-templates select="@*"/>)
  value=<xsl:value-of select="normalize-space(text()[1])"/>
  <xsl:text>
</xsl:text>
<xsl:apply-templates select="*"/>
  
</xsl:template>
<xsl:template match="@*">name=<xsl:value-of select="name()"/>,value=<xsl:value-of select="."/></xsl:template>
__________________
/- Sam Judson : Wrox Technical Editor -/

Think before you post: What have you tried?
 
Old October 8th, 2012, 07:19 AM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts
Default

It's not at all clear to me what form you want the output to take, but I would think that somewhere in the template match="*" you want to process the children of the element by calling xsl:apply-templates.
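
As a rough sketch of what that could look like, assuming one pipe-delimited line per element is an acceptable output format (the "|", ";" and "=" separators are my own choice, not something specified in this thread):

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <!-- Drop whitespace-only text nodes so indentation is not reported -->
  <xsl:strip-space elements="*"/>

  <!-- One line per element: name|depth|attributes|value -->
  <xsl:template match="*">
    <xsl:value-of select="name()"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="count(ancestor::*)"/>
    <xsl:text>|</xsl:text>
    <xsl:apply-templates select="@*"/>
    <xsl:text>|</xsl:text>
    <!-- First text child only, with whitespace collapsed -->
    <xsl:value-of select="normalize-space(text()[1])"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Recurse into child elements only; text nodes never become records -->
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!-- Attributes as name=value pairs, separated by semicolons -->
  <xsl:template match="@*">
    <xsl:if test="position() != 1">
      <xsl:text>;</xsl:text>
    </xsl:if>
    <xsl:value-of select="name()"/>
    <xsl:text>=</xsl:text>
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

Each element is visited exactly once because the recursion goes through xsl:apply-templates rather than a for-each over the whole document. Note that this sketch does not escape a "|" or ";" that appears inside a value.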
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
 
Old October 8th, 2012, 07:31 AM
Registered User
 
Join Date: Oct 2012
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts
Default

@samjudson & mhkay

Thanks again, the latest modified code is as follows:

Code:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
	<xsl:output method="text" />
	<xsl:strip-space elements="*" />
	<xsl:template match="*">
		<xsl:for-each select="//node()">
			name = 	<xsl:value-of select="name()" />
			depth = <xsl:value-of select="count(ancestor::*)" />
			attributes = (<xsl:apply-templates select="@*" />)
			value = <xsl:value-of select="normalize-space(text()[1])" />
			<xsl:text>
	</xsl:text>
		</xsl:for-each>
		<xsl:apply-templates select="*" />
	</xsl:template>
	<xsl:template match="@*">
				name = <xsl:value-of select="name()" />
				value = <xsl:value-of select="." />
	</xsl:template>
</xsl:stylesheet>
This works almost perfectly, except that when the tag depth is counted, the end tags (e.g. </Id>) seem to be counted as well and the depth is increased. This shouldn't happen. Any idea how I can avoid this?

The sample output I get is below. As you can see, the depth is increased for all end tags:

Code:
			name = 	catalog
			depth = 0
			attributes = ()
			value = 
	
			name = 	book
			depth = 1
			attributes = (
				name = id
				value = bk101)
			value = 
	
			name = 	author
			depth = 2
			attributes = ()
			value = Gambardella, Matthew
	
			name = 	
			depth = 3
			attributes = ()
			value = 
	
			name = 	title
			depth = 2
			attributes = ()
			value = XML Developer's Guide
	
			name = 	
			depth = 3
			attributes = ()
			value =









