Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
July 14th, 2009, 10:13 AM
Registered User
 
Join Date: Jul 2009
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts
Normalizing XML containing href/id

Hello,
In the spirit of talk radio, I'm a first-time poster, long-time listener. This site has been a gold mine of XSLT resources. Thanks. This will sound like a SOAP question, but the solution is using XSLT.

I am calling legacy SOAP rpc/encoded services from a newer platform that does not support the entire SOAP rpc spec. I need to transform XML formatted with multireference accessors (href/id) into more traditional inline XML. Our COTS software expects one child under the SOAP Body, whereas multireference rpc payloads contain multiple children under the SOAP Body. I am trying to transform this:

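<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://target.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/" env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">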
<env:Body>
<tns:legacyRpcServiceResponse>
<result href="#ID1"/>
</tns:legacyRpcServiceResponse>
<tns:LookupTableResult id="ID1" xsi:type="tns:LookupTableResult">
<returnCode xsi:type="xsd:string">0</returnCode>
<returnMessage xsi:type="xsd:string" xsi:nil="1"/>
<tableArray href="#ID2"/>
</tns:LookupTableResult>
<tns:ArrayOfTableArray id="ID2" xsi:type="enc:Array" enc:arrayType="tns:TableArray[2]">
<item href="#ID3"/>
<item href="#ID4"/>
</tns:ArrayOfTableArray>
<tns:TableArray id="ID3" xsi:type="tns:TableArray">
<lastChangeDt xsi:type="xsd:string">2009-05-12-13.13.42.145</lastChangeDt>
<tableName xsi:type="xsd:string">NameTable</tableName>
</tns:TableArray>
<tns:TableArray id="ID4" xsi:type="tns:TableArray">
<lastChangeDt xsi:type="xsd:string"/>
<tableName xsi:type="xsd:string">JobTable</tableName>
</tns:TableArray>
</env:Body>
</env:Envelope>

to this inline version:

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://target.org" env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<env:Body>
<tns:legacyRpcServiceResponse>
<result>
<tns:LookupTableResult type="tns:LookupTableResult">
<returnCode type="xsd:string">0</returnCode>
<returnMessage type="xsd:string" nil="1"/>
<tableArray>
<tns:ArrayOfTableArray type="enc:Array" arrayType="tns:TableArray[2]">
<item>
<tns:TableArray type="tns:TableArray">
<lastChangeDt type="xsd:string">2009-05-12-13.13.42.145</lastChangeDt>
<tableName type="xsd:string">NameTable</tableName>
</tns:TableArray>
</item>
<item>
<tns:TableArray type="tns:TableArray">
<lastChangeDt type="xsd:string"/>
<tableName type="xsd:string">JobTable</tableName>
</tns:TableArray>
</item>
</tns:ArrayOfTableArray>
</tableArray>
</tns:LookupTableResult>
</result>
</tns:legacyRpcServiceResponse>
</env:Body>
</env:Envelope>

My XSLT below uses templates to copy all nodes. When it encounters a node with an href attribute, it copies the element with the matching id attribute in its place. I had to use a mode to keep from copying the elements with an id attribute twice. Some of the services I call can return thousands of references, and the bulk of the transformation time is spent selecting the elements referenced by href. Using a key has saved a lot of time, but I believe there is still room for improvement. Are there tricks out there to:
1) select an element by id from a distinct large list of elements?
2) stop processing the document once I've finished processing the first child of the soap:Body ?

My xslt is:

<xsl:stylesheet version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Index every element that has an id attribute, keyed by that id. -->
<xsl:key name="linkableItems" match="*[@id]" use="@id"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<!-- Drop the id and href attributes from the output. -->
<xsl:template match="@id|@href"/>
<!-- Referenced target: copy it, then continue in the default mode. -->
<xsl:template match="node()" mode="normalize">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Element with an href: copy it and inline the element whose id matches. -->
<xsl:template match="node()[@href]">
<xsl:copy>
<xsl:apply-templates select="key('linkableItems',substring-after(@href,'#'))" mode="normalize"/>
</xsl:copy>
</xsl:template>
<!-- Targets are emitted only where they are referenced, so skip them in the default mode. -->
<xsl:template match="node()[@id]"/>
<!-- Identity template for everything else. -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Thank you.
 
July 14th, 2009, 12:30 PM
bonekrusher
Friend of Wrox
 
Join Date: Jul 2006
Posts: 430
Thanks: 28
Thanked 5 Times in 5 Posts

There are several ways to select the first child:

Code:
/env:Envelope/env:Body/tns:legacyRpcServiceResponse[1]
Code:
/env:Envelope/env:Body/*[1]
As for selecting an element by id from a large list of distinct elements, that will depend on what you want to select. In the input XML you posted, what do you want to select?
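
A minimal sketch of one way to limit the traversal to the first child element of the SOAP Body, assuming XSLT 1.0 and the `env` namespace from the posted sample. It selects `*[1]` (the first child element) because the first node under `env:Body` may be a whitespace text node. The identity template is included only to make the example complete; this is not the original poster's stylesheet.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">

  <!-- Identity template: copy everything by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Under the SOAP Body, apply templates to the attributes and the
       first child element only; later siblings are never visited in
       the default mode. -->
  <xsl:template match="env:Body">
    <xsl:copy>
      <xsl:apply-templates select="@*|*[1]"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that this only skips the default-mode walk over the remaining children. An XSLT 1.0 processor still parses the whole input document, and elements further down the Body remain reachable through key() lookups.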
 
July 14th, 2009, 01:58 PM
Registered User
 
Join Date: Jul 2009
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts

As my XSLT is copying nodes, for any node containing an href attribute, I want to go fetch the element with the matching id attribute and copy it and its children into the current node. So if the original xml is

<tns:legacyRpcServiceResponse>
<result href="#ID1"/>
</tns:legacyRpcServiceResponse>
...
the output xml after resolving the href to id references would be

<tns:legacyRpcServiceResponse>
<result>
<tns:LookupTableResult type="tns:LookupTableResult">
...
</tns:LookupTableResult>
</result>
</tns:legacyRpcServiceResponse>

I defined a Key on the @id attribute to significantly improve the selection of the element whose @id attribute matches the @href attribute of the context node.
98% of my transformation time is spent on this selection, so I was trying to see if there's an even faster way to do it. If I know I have a list of 10,000 elements with distinct @id values, is there a way to instruct the processor to stop searching the full node list once a match is found? I think the XSL engine is trying to find all nodes that match the criteria, even though I know the search will only ever return one node.

Adding the [1] predicate at the end happens after the fact and only selects the 1st node of a node list that contains one element. At least that's how I understand it to work.
Any other thoughts?
 
July 15th, 2009, 07:39 AM
bonekrusher
Friend of Wrox
 
Join Date: Jul 2006
Posts: 430
Thanks: 28
Thanked 5 Times in 5 Posts

I am a bit confused. Your first post gives a desired output:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://target.org" env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
    <env:Body>
        <tns:legacyRpcServiceResponse>
            <result>
                <tns:LookupTableResult type="tns:LookupTableResult">
                    <returnCode type="xsd:string">0</returnCode>
                    <returnMessage type="xsd:string" nil="1"/>
                    <tableArray>
                        <tns:ArrayOfTableArray type="enc:Array" arrayType="tns:TableArray[2]">
                            <item>
                                <tns:TableArray type="tns:TableArray">
                                    <lastChangeDt type="xsd:string">2009-05-12-13.13.42.145</lastChangeDt>
                                    <tableName type="xsd:string">NameTable</tableName>
                                </tns:TableArray>
                            </item>
                            <item>
                                <tns:TableArray type="tns:TableArray">
                                    <lastChangeDt type="xsd:string"/>
                                    <tableName type="xsd:string">JobTable</tableName>
                                </tns:TableArray>
                            </item>
                        </tns:ArrayOfTableArray>
                    </tableArray>
                </tns:LookupTableResult>
            </result>
        </tns:legacyRpcServiceResponse>
    </env:Body>
</env:Envelope>
which is what your xslt outputs.
 
July 15th, 2009, 08:40 AM
samjudson
Friend of Wrox
 
Join Date: Aug 2007
Posts: 2,128
Thanks: 1
Thanked 189 Times in 188 Posts

I think his question is not "how do I do this" but "is this the fastest way of doing this".

Your performance will vary greatly depending on the XSLT processor you use and how they have implemented keys etc. Some processors may do exactly what you want when faced with the [1] predicate and stop once they have found the first key, however others may not. Different processors will also likely have better or worse routines for doing the key indexing.

The only way to know for sure would be to develop some test cases and try running them against different XSLT processors. Basically you're going to have to benchmark.
__________________
/- Sam Judson : Wrox Technical Editor -/

Think before you post: What have you tried?
 
July 15th, 2009, 08:46 AM
mhkay
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

I would think any XSLT processor with a half-decent optimizer would ensure that xxx[1] stops searching xxx after finding the first item in the sequence. But perhaps there are some XSLT processors around without half-decent optimizers. The only way to find out is to do measurements.
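
For anyone wanting to measure this, here is a sketch of the href-resolving stylesheet with the `[1]` predicate applied to the key lookup. It reuses the `linkableItems` key name and the `normalize` mode from the first post; the rest is an illustrative rearrangement, not the original stylesheet, and it assumes each id value occurs at most once. Whether `[1]` changes the timing at all depends on the processor, since a key lookup may already be an indexed lookup rather than a scan.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Same key as in the first post: elements indexed by their id. -->
  <xsl:key name="linkableItems" match="*[@id]" use="@id"/>

  <!-- Identity template for everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop the linking attributes themselves. -->
  <xsl:template match="@id|@href"/>

  <!-- Targets are emitted only where they are referenced. -->
  <xsl:template match="*[@id]"/>

  <!-- Copy a referenced target, then carry on in the default mode. -->
  <xsl:template match="*" mode="normalize">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Resolve the href; [1] keeps only the first element the key returns. -->
  <xsl:template match="*[@href]">
    <xsl:copy>
      <xsl:apply-templates
          select="key('linkableItems', substring-after(@href, '#'))[1]"
          mode="normalize"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Running the same large input through both variants (with and without `[1]`) on each processor of interest is the only reliable way to see whether the predicate matters.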
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
 
July 15th, 2009, 11:11 AM
Registered User
 
Join Date: Jul 2009
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts

Thanks for the tip on how the [1] predicate affects optimization. I did not know that.
The processor definitely makes a difference. I have a 4 MB XML file with over 10,000 href attributes. XMLSpy processes the doc in 24 to 30 seconds, yet our Oracle SOA Suite server, using Xalan 2.7.0, processes the doc in under 7 seconds. This is fine for now since we only get a couple of these requests per day. The majority of our docs are in the 1 to 2 KB range and process in under a second. I'm just getting greedy and wanted to see how far I could optimize this XSLT.

I appreciate all of your tips. Thanks!

Copyright (c) 2020 John Wiley & Sons, Inc.