Subject: Matching between two sibling nodes
Posted By: AForgue
Post Date: 11/25/2003 3:23:58 PM
I am creating a transformation for OpenOffice.org's native XML files and have run into a hitch. Because of the nature of OOo documents, paragraph styles cannot be nested within other paragraph styles. What I need is a template that matches everything between the "BeginIntroduction" and "EndIntroduction" nodes in the XML below and wraps it in <Introduction></Introduction> tags.
<office:body>
<text:p text:style-name="Standard">
<text:span text:style-name="ChapterNumber">1</text:span>
</text:p>
<text:p text:style-name="Standard">
<text:span text:style-name="ChapterTitle">Title</text:span>
</text:p>
<text:p text:style-name="BeginIntroduction">Begin Introduction</text:p>
<text:p text:style-name="InlineHeading">inline heading</text:p>
<text:p text:style-name="Normal">text</text:p>
<text:p text:style-name="Normal">text</text:p>
<text:p text:style-name="Normal">text</text:p>
<text:p text:style-name="Normal">text</text:p>
<text:p text:style-name="EndIntroduction">End Introduction</text:p>
<text:p text:style-name="InlineHeading">inline heading</text:p>
<text:p text:style-name="Normal">text</text:p>
<text:p text:style-name="Normal">text</text:p>
<text:p text:style-name="Normal">text</text:p>
<text:p text:style-name="Normal">text</text:p>
</office:body>
The ideal transformation should look like this:
<root>
<ChapterNumber>1</ChapterNumber>
<ChapterTitle>title</ChapterTitle>
<Introduction>
<InlineHeading>heading</InlineHeading>
<Paragraph>text</Paragraph>
<Paragraph>text</Paragraph>
<Paragraph>text</Paragraph>
<Paragraph>text</Paragraph>
</Introduction>
<InlineHeading>heading</InlineHeading>
<Paragraph>text</Paragraph>
<Paragraph>text</Paragraph>
<Paragraph>text</Paragraph>
<Paragraph>text</Paragraph>
</root>
Is this going to be possible? Or do I need to consider some different options? Also, it is important for me to note that I actually do have control over the "BeginIntroduction" and "EndIntroduction" tags. I can change the name of them, but they have to be siblings of the paragraphs.
So, for example, instead of:
<text:p text:style-name="BeginIntroduction">Begin Introduction</text:p>
...
<text:p text:style-name="EndIntroduction">End Introduction</text:p>
I could make it:
<text:p text:style-name="Introduction">Introduction</text:p>
...
<text:p text:style-name="Introduction">Introduction</text:p>
That is the extent of the control I have.
Thanks in advance for any advice!!!
Aaron
Reply By: armmarti
Reply Date: 11/26/2003 1:57:36 AM
I've added namespace declarations:
<office:body xmlns:office="uri-for-office-here" xmlns:text="uri-for-text-here">
So, the stylesheet is:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:office="uri-for-office-here" xmlns:text="uri-for-text-here"
exclude-result-prefixes="office text">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<root>
<xsl:apply-templates select="office:body/text:p/text:span"/>
<Introduction>
<xsl:apply-templates select="office:body/text:p[preceding-sibling::text:p[@text:style-name='BeginIntroduction'] and following-sibling::text:p[@text:style-name='EndIntroduction']]"/>
</Introduction>
<xsl:apply-templates select="office:body/text:p[preceding-sibling::text:p[@text:style-name='EndIntroduction']]"/>
</root>
</xsl:template>
<xsl:template match="office:body/text:p/text:span">
<xsl:element name="{@text:style-name}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
<xsl:template match="office:body/text:p[@text:style-name='Normal']">
<Paragraph>
<xsl:value-of select="."/>
</Paragraph>
</xsl:template>
<xsl:template match="office:body/text:p[@text:style-name='InlineHeading']">
<InlineHeading>
<xsl:value-of select="."/>
</InlineHeading>
</xsl:template>
</xsl:stylesheet>
This stylesheet relies on the positional structure of the source XML document in some places, so the code is somewhat awkward :) You can change some points in the stylesheet if you want; you'll find the proper places, I'm sure ;)
Regards, Armen
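A note on the positional dependence mentioned above: one way to loosen it is to let the start marker itself emit the wrapper, so the introduction can sit anywhere among the paragraphs. The following is only a sketch under assumptions, not a tested drop-in: it assumes exactly one BeginIntroduction/EndIntroduction pair per office:body, and it reuses the placeholder namespace URIs from the stylesheet above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:office="uri-for-office-here" xmlns:text="uri-for-text-here"
    exclude-result-prefixes="office text">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <root>
      <!-- Walk the paragraphs in document order, skipping those between the markers -->
      <xsl:apply-templates select="office:body/text:p[not(preceding-sibling::text:p[@text:style-name='BeginIntroduction'] and following-sibling::text:p[@text:style-name='EndIntroduction'])]"/>
    </root>
  </xsl:template>

  <!-- The start marker emits the wrapper and pulls in the paragraphs up to the end marker -->
  <xsl:template match="text:p[@text:style-name='BeginIntroduction']">
    <Introduction>
      <xsl:apply-templates select="following-sibling::text:p[following-sibling::text:p[@text:style-name='EndIntroduction']]"/>
    </Introduction>
  </xsl:template>

  <!-- The end marker produces no output of its own -->
  <xsl:template match="text:p[@text:style-name='EndIntroduction']"/>

  <!-- Standard paragraphs only carry spans; process the spans and ignore whitespace -->
  <xsl:template match="text:p[@text:style-name='Standard']">
    <xsl:apply-templates select="text:span"/>
  </xsl:template>

  <!-- Each span becomes an element named after its style -->
  <xsl:template match="text:p/text:span">
    <xsl:element name="{@text:style-name}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="text:p[@text:style-name='Normal']">
    <Paragraph>
      <xsl:value-of select="."/>
    </Paragraph>
  </xsl:template>

  <xsl:template match="text:p[@text:style-name='InlineHeading']">
    <InlineHeading>
      <xsl:value-of select="."/>
    </InlineHeading>
  </xsl:template>
</xsl:stylesheet>
```

Because the wrapper is produced where the start marker occurs, the output order follows the source order. With more than one marker pair the inner select would collect too much, so that case would need a stricter test.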
Reply By: AForgue
Reply Date: 11/26/2003 9:02:39 AM
Armen, this worked very nicely! Thanks for the help.
Just for education's sake, you mentioned that this relies on the position of the XML structure. I can see what you mean by this in that if the Introduction was in any other place in the document this would produce strange results.
Going off of your example, I am wondering if it would be possible to match anything where any preceding-sibling != 'BeginIntroduction' AND any following-sibling != 'EndIntroduction'. If I am thinking about this correctly, this should match everything that does not appear between 'BeginIntro' and 'EndIntro'. After calling that, the next step would be to match everything between them.
So for example:
<xsl:template match="office:body">
<root>
<xsl:apply-templates select="*[preceding-sibling::node()[@text:style-name != 'BeginIntroduction'] and following-sibling::node()[@text:style-name != 'EndIntroduction']]"/>
</root>
</xsl:template>
Although this sounds great in my mind, it is producing the wrong result set. I am wondering if my select statement is wrong in some way.
-Aaron
Reply By: armmarti
Reply Date: 11/26/2003 9:45:15 AM
quote:
Going off of your example, I am wondering if it would be possible to match anything where any preceding-sibling != 'BeginIntroduction' AND any following-sibling != 'EndIntroduction'. If I am thinking about this correctly, this should match everything that does not appear between 'BeginIntro' and 'EndIntro'. After calling that, the next step would be to match everything between them.
So for example:
<xsl:template match="office:body">
<root>
<xsl:apply-templates select="*[preceding-sibling::node()[@text:style-name != 'BeginIntroduction'] and following-sibling::node()[@text:style-name != 'EndIntroduction']]"/>
</root>
</xsl:template>
Although this sounds great in my mind, it is producing the wrong result set. I am wondering if my select statement is wrong in some way.
-Aaron
Your XPath expression produces the wrong result because not(A and B) is equivalent to not(A) or not(B), not to not(A) and not(B).
So just negate:
<xsl:apply-templates select="office:body/text:p[not(preceding-sibling::text:p[@text:style-name='BeginIntroduction'] and following-sibling::text:p[@text:style-name='EndIntroduction'])]"/>
Regards, Armen
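To make the difference concrete, here is a small counting sketch (an illustration only, using the same placeholder namespace URIs as above). Besides the and/or issue, a predicate such as [@text:style-name != 'BeginIntroduction'] is satisfied as soon as some sibling has a different style, which is not the same as no sibling having that style. The stylesheet prints three counts: the "!=" form, the negated form, and the expanded not(A) or not(B) form.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:office="uri-for-office-here" xmlns:text="uri-for-text-here">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- "!=" form: true when SOME preceding sibling is not the start marker
         and SOME following sibling is not the end marker -->
    <xsl:value-of select="count(office:body/text:p[preceding-sibling::text:p[@text:style-name != 'BeginIntroduction'] and following-sibling::text:p[@text:style-name != 'EndIntroduction']])"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Negated form: every paragraph that is NOT between the two markers -->
    <xsl:value-of select="count(office:body/text:p[not(preceding-sibling::text:p[@text:style-name = 'BeginIntroduction'] and following-sibling::text:p[@text:style-name = 'EndIntroduction'])])"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Same set written as not(A) or not(B) -->
    <xsl:value-of select="count(office:body/text:p[not(preceding-sibling::text:p[@text:style-name = 'BeginIntroduction']) or not(following-sibling::text:p[@text:style-name = 'EndIntroduction'])])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against the 14-paragraph sample from the first post (with the namespace declarations added), the counts should come out as 12, 9 and 9: the "!=" form only drops the first and last paragraphs, while the two negated forms agree and drop the five paragraphs between the markers.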
Reply By: AForgue
Reply Date: 11/26/2003 9:52:37 AM
Heh, I read that last post just as I was getting ready to post my answer, which turned out to be exactly what you said. I just negated the whole thing.
<xsl:apply-templates select="node()[not(preceding-sibling::node()[@text:style-name = 'BeginIntroduction'] and following-sibling::node()[@text:style-name = 'EndIntroduction'])]"/>
Thanks again for your help. Good to know that there are knowledgeable people out there willing to help out the not-so-knowledgeable!
-Aaron
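For the alternative markup from the first post, where both markers share the style name "Introduction", the begin/end tests no longer distinguish the two markers. One possible approach, sketched here as an assumption rather than something tried in this thread, is to count preceding markers: a paragraph with an odd number of "Introduction" markers before it lies inside an introduction. The namespace URIs are again the placeholders used above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:office="uri-for-office-here" xmlns:text="uri-for-text-here"
    exclude-result-prefixes="office text">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <root>
      <!-- Top level: the markers, plus paragraphs preceded by an even number of markers -->
      <xsl:apply-templates select="office:body/text:p[@text:style-name='Introduction' or count(preceding-sibling::text:p[@text:style-name='Introduction']) mod 2 = 0]"/>
    </root>
  </xsl:template>

  <!-- Opening markers (1st, 3rd, ...) emit the wrapper; closing markers emit nothing -->
  <xsl:template match="text:p[@text:style-name='Introduction']">
    <xsl:if test="count(preceding-sibling::text:p[@text:style-name='Introduction']) mod 2 = 0">
      <!-- Number of markers seen once this one has opened the block -->
      <xsl:variable name="opened" select="count(preceding-sibling::text:p[@text:style-name='Introduction']) + 1"/>
      <Introduction>
        <!-- Non-marker paragraphs that have exactly $opened markers before them -->
        <xsl:apply-templates select="following-sibling::text:p[not(@text:style-name='Introduction')][count(preceding-sibling::text:p[@text:style-name='Introduction']) = $opened]"/>
      </Introduction>
    </xsl:if>
  </xsl:template>

  <!-- Standard paragraphs only carry spans; process the spans and ignore whitespace -->
  <xsl:template match="text:p[@text:style-name='Standard']">
    <xsl:apply-templates select="text:span"/>
  </xsl:template>

  <xsl:template match="text:p/text:span">
    <xsl:element name="{@text:style-name}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="text:p[@text:style-name='Normal']">
    <Paragraph>
      <xsl:value-of select="."/>
    </Paragraph>
  </xsl:template>

  <xsl:template match="text:p[@text:style-name='InlineHeading']">
    <InlineHeading>
      <xsl:value-of select="."/>
    </InlineHeading>
  </xsl:template>
</xsl:stylesheet>
```

The counting makes each paragraph look back over all of its preceding siblings, so on very long documents this could be slow; it also assumes the markers always come in matched pairs.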
Reply By: armmarti
Reply Date: 11/26/2003 10:05:44 AM
It has no connection with XSLT or programming; it's just a mathematical truth! ;)