Wrox Programmer Forums

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
July 16th, 2008, 04:47 PM
Registered User
Join Date: Jul 2008
Posts: 4

Copying Branches From a Source to a Target

I have a long question -- I've been looking through XSLT 2.0 / XPath 2.0 and Beginning XML, and the Pawson FAQ, and I still can't figure this out.

I have two XML documents, Source and Target, with the same element structure, but with additional branches of data in Source. I would like to output an XML document combining the two: it should contain all items in Target, plus copies of the Source data for those items in Target that have an 'inherits_from' attribute. Those copies should not overwrite branches in Target that are not in Source, and where a branch is in both Target and Source, the Target data should be used.

For example:

Source
<geniuses>
  <genius>Turing
    <field>Mathematics</field>
    <contribution>Turing Machines</contribution>
  </genius>

  <genius>Godel
    <field>Logic</field>
    <contribution>No complete system can be consistent</contribution>
    <annoyed>Bertrand Russell</annoyed>
  </genius>
</geniuses>

Target
<geniuses>
  <note>Where would we be without these guys?</note>

  <genius inherits_from="Source">Turing</genius>
    <contribution>Broke Enigma</contribution>
    <fictionalized_in>Cryptonomicon</fictionalized_in>
  <genius inherits_from="Source">Godel</genius>
  <genius>Neumann</genius>
</geniuses>

So the result tree should output
- <geniuses> as root
- the <note> node
- the Turing node, but with 'Broke Enigma' in the <contribution> node and Target's <fictionalized_in> node, and all other branches as in Source
- the Godel branches as in Source
- the Neumann node as in target.

I've tried the following:

<xsl:template match="node()[not(attribute::inherits_from)]">
  <xsl:copy>
    <xsl:apply-templates select="node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="node()[attribute::inherits_from]">
  <xsl:copy>
    <xsl:variable name="crnt" select="."/>
    <xsl:apply-templates select="document('Source.xml')/crnt"/>
  </xsl:copy>
</xsl:template>

This distinguishes among nodes with and without the attribute, and it prints Target's branches for geniuses with the attribute, but it doesn't print the value of the genius element or the nested elements from Source.

I've fiddled with various constructions of the <<document('Source.xml')/>> expression. It seems that if I can construct a path expression with the inheriting elements' path address, but in the source document, I'd get what I want. But <<document('Source.xml')/.>> doesn't work, and neither does *. I suppose I could somehow pass the geniuses/genius path to document, but I'd like something more general.

Also, I'm not sure this iterates properly through all the possible subtrees of an inheriting element. I think I want copy rather than copy-of, so I can have more control over the processing, but I don't really know. I suppose I could pull the call on document into another template and use apply-templates recursively on those child branches that need copies and don't violate the no-overwrite requirement, but even then I don't know how to get a copy of the Source nodes, or to iterate through those and check for presence in Target.

I apologize for the long question but I'm missing some key point here, and I'm not making much progress without it.

 
July 17th, 2008, 06:26 AM
mhkay
Wrox Author
Join Date: Apr 2004
Posts: 4,962

I suspect you want to do parallel tree-walking down the two trees. That means treating one of the documents as the context document, and when you do apply-templates to its children, pass the "corresponding" node in the second document as a parameter on the template call. Then in your generic template for any element, you have a handle on the corresponding elements in the two documents. I'm not sure exactly what you want to do then, but it involves some kind of process where you copy children that are in one document but not the other, and call yourself recursively for children that "exist" in both documents.
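The parallel walk described above might be sketched in XSLT 1.0 roughly as follows. This is an illustration under stated assumptions, not code posted in the thread. It assumes the stylesheet is run against Target with Source.xml next to it, that two elements "correspond" when they share a name and the same leading text (after whitespace normalization), and that inside an inheriting element any Target child suppresses every Source child of the same name. The parameter name `peer` and the variable name `here` are made up for the sketch, and `name()` is compared directly, which ignores namespace concerns.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: Source.xml can be resolved from the stylesheet's location. -->
  <xsl:variable name="source" select="document('Source.xml')"/>

  <!-- Start the walk by pairing Target's root element with Source's root element. -->
  <xsl:template match="/">
    <xsl:apply-templates select="*">
      <xsl:with-param name="peer" select="$source/*"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Generic element template. $peer is the corresponding Source element,
       or an empty node-set when Source has no corresponding element. -->
  <xsl:template match="*">
    <xsl:param name="peer" select="/.."/>
    <xsl:variable name="here" select="."/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>

      <!-- Target's own content first. Each child element is processed
           recursively and handed its own peer from the Source side. -->
      <xsl:for-each select="node()">
        <xsl:choose>
          <xsl:when test="self::*">
            <xsl:apply-templates select=".">
              <xsl:with-param name="peer"
                  select="$peer/*[name() = name(current())]
                                 [normalize-space(text()[1]) =
                                  normalize-space(current()/text()[1])]"/>
            </xsl:apply-templates>
          </xsl:when>
          <xsl:otherwise>
            <!-- Text, comments and processing instructions are copied as they are. -->
            <xsl:copy-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>

      <!-- For an inheriting element, add the Source children whose names
           do not occur among Target's children. -->
      <xsl:if test="@inherits_from">
        <xsl:for-each select="$peer/*">
          <xsl:if test="not($here/*[name() = name(current())])">
            <xsl:copy-of select="."/>
          </xsl:if>
        </xsl:for-each>
      </xsl:if>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The `for-each` around `apply-templates` is there because in XSLT 1.0 a `with-param` value is evaluated once for the instruction, not once per selected node, so each child has to become the current node before its peer can be computed. A different definition of "corresponding" would only change the two predicates on `$peer/*`.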

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
July 17th, 2008, 06:42 AM
samjudson
Friend of Wrox
Join Date: Aug 2007
Posts: 2,128

The following seems to do what you want:

Code:
<xsl:variable name="source" select="document('Source.xml')"/>

<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

<xsl:template match="@inherits_from">
  <xsl:apply-templates select="$source/geniuses/genius[text()=current()/../text()]"/>
</xsl:template>
/- Sam Judson : Wrox Technical Editor -/
 
July 18th, 2008, 01:54 PM
Registered User
Join Date: Jul 2008
Posts: 4

Quote:
I suspect you want to do parallel tree-walking down the two trees....it involves some kind of process where you copy children that are in one document but not the other, and call yourself recursively for children that "exist" in both documents.
Yes, but don't I need some way to dynamically store and create location paths so I can check the Target for nodes in the Source? After all, that "existence" is defined by having the same location paths but for different roots.

I have figured out copying from the Source for "inheriting" nodes, but I'm using the hard-coded path expression (<<geniuses/genius>>), predicated for a particular text value, suggested by Sam. (Thanks for that.)

So, is there a way to 1) store a node's location path into a variable and 2) construct a location path for the second document using the location path stored in the variable asked for in 1)? Does that construction require stripping of the root step of the target node and replacing it with the root step returned by the document function?

With those location path variables I can walk the Source, testing for and copying corresponding Target children and otherwise taking the Source data. I'm thinking of something like <<"document('Source.xml')/{variable specifying location path}[text()={variable with text of target node}]>>. Without that dynamic location path construction and testing, I really don't know how to do this.

Here is some code that copies out the Source nodes. (Please note that the example documents posted above have an error -- I've reposted the corrected documents below.)

<xsl:variable name="source" select="document('Source.xml')"/>

<xsl:template match="node()[not(attribute::inherits_from)]">
    <xsl:copy>
        <xsl:apply-templates select="node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="node()[attribute::inherits_from]">
    <xsl:variable name="crnt" select="normalize-space(text())"/>
    <xsl:apply-templates select="$source/geniuses/genius[normalize-space(text())=$crnt]"/>
</xsl:template>

I suppose I could use set operators: an <<except>> to get all Source nodes not in the Target, and then a <<union>> (?) to combine those unique Source nodes with the Target nodes. But simply outputting that list would get the nodes, without any nesting.

I apologize again for the long question. I've spent the past two days in Michael's book, and in "Beginning XML", and I'm learning a ton, but I'm still clearly missing something. Thanks.

======================================
Corrected example documents:

Source
<geniuses>
  <genius>Turing
    <field>Mathematics</field>
    <contribution>Turing Machines</contribution>
  </genius>

  <genius>Godel
    <field>Logic</field>
    <contribution>No complete system can be consistent</contribution>
    <annoyed>Bertrand Russell</annoyed>
  </genius>
</geniuses>

Target
<geniuses>
  <note>Where would we be without these guys?</note>

  <genius inherits_from="Source">Turing
    <contribution>Broke Enigma</contribution>
    <fictionalized_in>Cryptonomicon</fictionalized_in>
  </genius>
  <genius inherits_from="Source">Godel</genius>
  <genius>Neumann</genius>
</geniuses>


 
July 18th, 2008, 06:05 PM
Registered User
Join Date: Jul 2008
Posts: 4

Michael, I see on p.879 of your XSLT 2.0/XPath 2.0 book that the function <<string-join(ancestor-or-self::*/name(), "/")>> will return something like <<book/chapter/section/title>>. As you don't mention any other function that would return such a path, I suspect that no such function exists.

I suppose a variable holding the return of the string-join function could be used for the dynamic location path checking I need.

Unfortunately, I'm using xalan, which doesn't support XPath 2.0 functions like string-join; and there don't seem to be XSLT 2.0 / XPath 2.0 tools for the linux environment I'm working in.
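For an XSLT 1.0 processor, a string like the one `string-join` returns can be assembled with a named template. The following is a minimal sketch, assuming plain element names without namespace handling; the template name `name-path` is invented for the example. Note that standard XSLT 1.0 has no way to evaluate such a string as a path expression afterwards, so the result is only useful for display or comparison unless the processor offers an extension for dynamic evaluation.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Demo driver: print one name path per element in the input document. -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <xsl:call-template name="name-path"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: writes the names of the context element and its
       ancestors, outermost first, separated by slashes. -->
  <xsl:template name="name-path">
    <xsl:param name="node" select="."/>
    <!-- for-each visits the selected nodes in document order,
         so the outermost ancestor comes first. -->
    <xsl:for-each select="$node/ancestor-or-self::*">
      <xsl:if test="position() &gt; 1">/</xsl:if>
      <xsl:value-of select="name()"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run against either example document, this should print lines such as `geniuses/genius/contribution`.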

While XSLT 2.0 seems to have strong features, I'm beginning to think it isn't the solution for my requirements, or that those solutions aren't within my capacity to learn the language.



 
July 19th, 2008, 04:04 AM
mhkay
Wrox Author
Join Date: Apr 2004
Posts: 4,962

I'm afraid I haven't really understood why you see a need to construct paths in this way (it's inevitably a tricky thing to get right because of namespace concerns). The way I was proposing doing it, you navigate down both trees in parallel, which means when you get to a particular point you know that the paths for those two nodes in the two trees are compatible, so you only need to look at the next level - the children of those two nodes - to decide how to match them up.

But this is a tricky problem and I think the first thing needed is a very precise and unambiguous statement of the requirement, which we don't yet have. Woolly language like "same element structure but additional branches" isn't good enough. No programming language is going to be any use at solving underspecified problems.

Incidentally, there are other design approaches that you may not have considered at all. There's one similar problem which I tackle by converting the document containing updates into a stylesheet that can then be applied to the base document.

Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
 
July 19th, 2008, 08:14 AM
Registered User
Join Date: Jul 2008
Posts: 4

Mr. Kay, you aren't under any obligation to help me, but I don't think it necessary to insult my precision to explain why you can't. "[I]s there a way to 1) store a node's location path into a variable" is, I think, a rather precise question; and if location path storage and construction aren't the best approach, well, I've already confessed I don't entirely understand the tools I'm using. And if my formal specification of the practical problem is lacking, I have given a fair description of it, and an example that I think reasonably shows what I'm trying to accomplish.

I'm not sure that "you navigate down both trees in parallel", without any discussion of the mechanisms for doing so, is any wonder of precision in its own right.

I would point out that I've sought better understanding of these tools in your book for three days running, so far without success. This forum is advertised as a source of clarification for people who need a little help understanding the printed material. I can understand sending off numbskulls who want you to write your code for them, but I'm frankly confused at your treatment of someone who has evidenced diligent attention to your own writings and an independent effort to solve his own problem.

In any case, it seems we can agree that my intelligence is not such as is needed to derive the required solutions from your writings. I won't trouble you further.

 
July 19th, 2008, 08:46 AM
mhkay
Wrox Author
Join Date: Apr 2004
Posts: 4,962

I'm sorry you took my criticism personally; I think one of the signs of a good programmer is that they positively welcome criticism.

"[I]s there a way to 1) store a node's location path into a variable"

There is no unique concept of a node's location path. There are many paths that locate any given node. For some purposes the most useful path is one like *[1]/*[12]/*[3], because it has no dependencies on namespaces. What I think you are looking for here is a path to a node in one document which can be used to locate a "corresponding" node in a different document; and to define such a path we need to understand what it means for two nodes to "correspond".
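A positional path of the `*[1]/*[12]/*[3]` kind can be generated in XSLT 1.0 by counting preceding element siblings at each level. The stylesheet below is a small sketch of that idea; the template name `positional-path` is invented here. It does not settle whether the node found at the same position in another document really is the "corresponding" one.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Demo driver: print one positional path per element in the input document. -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <xsl:call-template name="positional-path"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: for the context element and each of its ancestors,
       outermost first, write *[n] where n is the element's position among
       its element siblings. -->
  <xsl:template name="positional-path">
    <xsl:for-each select="ancestor-or-self::*">
      <xsl:if test="position() &gt; 1">/</xsl:if>
      <xsl:text>*[</xsl:text>
      <xsl:value-of select="count(preceding-sibling::*) + 1"/>
      <xsl:text>]</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the corrected Target document, the <contribution> element inside the Turing entry should come out as `*[1]/*[2]/*[1]`, because <note> precedes the first <genius>. In the corrected Source document the Turing <contribution> sits at `*[1]/*[1]/*[2]`, which shows why position alone does not define correspondence for this problem.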

>I'm not sure that "you navigate down both trees in parallel", without any discussion of the mechanisms for doing so, is any wonder of precision in its own right.

I deliberately gave only a sketch of the algorithm, because I didn't feel I understood the problem well enough to express it more precisely. And I did discuss the mechanism: apply-templates to the children of the current node in one tree, passing the "corresponding" node in the other tree as a parameter.

There are a couple of things that confuse me about the problem.

(a) you say that Source and Target have the same element structure, but in your example the structure is very different: in Source, the attributes such as <field> and <contribution> are children of <genius>, but in Target, they are siblings.

(b) you imply that the problem is recursive, so that "inherits_from" can appear in Target at any level. But it's not at all clear how the "corresponding" elements are to be identified. Presumably you want a match on element name, but it also seems you're looking for a match on content (Godel, Turing). But it's not clear in the general case what you're using as the key for content-based matching. It's relatively easy to come up with a solution that works for your example, but the problem as stated (and your attempt at solving it) suggests you are looking for something much more general.


Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference










Copyright (c) 2020 John Wiley & Sons, Inc.