p2p.wrox.com Forums


msambasiva@gmail.com September 4th, 2019 03:39 AM

Excluded commented content from an xml
 
Hi,

I'm using XSLT 2.0 with the Saxon processor.

I have DB query statements in an XML file, and I need to exclude the commented-out content from the generated output.
There are three types of comments:
1) Single-line comments (--). I've already handled these.
2) Multi-line comments (/* ... */) that start and end on a single line. I've already handled these.
3) Multi-line comments that are spread across several lines.
I'm looking for suggestions on how to remove the third type.

Any pointers would be a great help.

Thanks in advance,
Samba.

Below are the sample XML content and the XSL logic.

Code:


<asset>
<query>
        <query>
                <sql-statement>SELECT  RT.TRANSACTION_ID RCV_TRANSACTION_ID              ,</sql-statement>
        </query>
        <query>
                <sql-statement>      --TRUNC(RT.TRANSACTION_DATE) RCV_TRANSACTION_DATE          ,</sql-statement>
        </query>
        <query>
                <sql-statement>      SL.ITEM_ID ITEM_ID                                ,</sql-statement>
        </query>
        <query>
                <sql-statement>      PL.ITEM_DESCRIPTION ITEM_DESCRIPTION --Comments              ,</sql-statement>
        </query>
        <query>
                <sql-statement>      PL.VENDOR_PRODUCT_NUM SUPPLIER_ITEM                  ,</sql-statement>
        </query>
        <query>
                <sql-statement>      ps.po_header_id PO_HEADER_ID                      ,</sql-statement>
        </query>
        <query>
                <sql-statement>      PH.SEGMENT1 PO_NUMBER                              ,</sql-statement>
        </query>
        <query>
                <sql-statement>      PL.LINE_NUM PO_LINE                                ,</sql-statement>
        </query>
       
        <query>
                <sql-statement>/*    Commented for bug#22652320</sql-statement>
        </query>
        <query>
                <sql-statement>      FND_LOOKUPS PLC  ,</sql-statement>
        </query>
        <query>
                <sql-statement>      PO_LOOKUP_CODES PLC1 ,</sql-statement>
        </query>
        <query>
                <sql-statement>      AP_LOOKUP_CODES ALC2 , </sql-statement>
        </query>
        <query>
                <sql-statement>      AP_LOOKUP_CODES ALC    */      </sql-statement>
        </query>
        <query>
                <sql-statement>      PS.SHIPMENT_NUM PO_SCHEDULE                        ,</sql-statement>
        </query>
        <query>
                <sql-statement>      SH.RECEIPT_NUM RECEIPT_NUMBER                      ,</sql-statement>
        </query>
        <query>
                <sql-statement>      SL.LINE_NUM RECEIPT_LINE                          ,</sql-statement>
        </query>
        <query>
                <sql-statement>      PS.NEED_BY_DATE NEED_BY_DATE                      ,</sql-statement>
        </query>
        <query>
                <sql-statement>        ALC1.lookup_code MATCH_BASIS_LOOKUP_CODE        ,</sql-statement>
        </query>
        <query>
                <sql-statement>        ALC1.displayed_field  MATCH_BASIS ,</sql-statement>
        </query>
        <query>
                <sql-statement>      /* bug#22652320 */ </sql-statement>
        </query>
</query>
</asset>

        <xsl:template match="asset">
                <!-- sql-statement is nested under query/query, not directly under asset -->
                <p><xsl:apply-templates select="query/query/sql-statement"/></p>
        </xsl:template>

        <xsl:template match="sql-statement">
                <xsl:choose>
                        <!-- Contains a single-line comment (two hyphens), either at the start of
                             the statement or after some SQL text. The sample lines are not quoted
                             here because a double hyphen is not allowed inside an XML comment. -->
                        <xsl:when test="contains(.,'--')">
                                <xsl:if test="normalize-space(substring-before(.,'--')) ne ''">
                                        <xsl:value-of select="substring-before(.,'--')"/><BR/>
                                </xsl:if>
                        </xsl:when>
                        <!--        Contains multi line comment in one line
                        <sql-statement>      /* bug#22652320 */ </sql-statement>
                        -->
                        <xsl:when test="contains(.,'/*') and contains(.,'*/')">
                                <xsl:if test="normalize-space(substring-before(.,'/*')) ne '' or normalize-space(substring-after(.,'*/')) ">
                                        <xsl:value-of select="concat(normalize-space(substring-before(.,'/*')), normalize-space(substring-after(.,'*/')))"/><BR/>
                                </xsl:if>
                        </xsl:when>
                        <!--        Contains multi line comment in different lines
                                <query>
                                <sql-statement>/*    Commented for bug#22652320</sql-statement>
                                </query>
                                <query>
                                        <sql-statement>      FND_LOOKUPS PLC  ,</sql-statement>
                                </query>
                                <query>
                                        <sql-statement>      PO_LOOKUP_CODES PLC1 ,</sql-statement>
                                </query>
                                <query>
                                        <sql-statement>      AP_LOOKUP_CODES ALC2 , </sql-statement>
                                </query>
                                <query>
                                        <sql-statement>      AP_LOOKUP_CODES ALC    */      </sql-statement>
                                </query>
                        -->
                       
                        <xsl:otherwise>
                                        <xsl:value-of select="."/><BR/>
                        </xsl:otherwise>
                </xsl:choose>
        </xsl:template>


mhkay September 4th, 2019 04:14 AM

I would suggest getting rid of the single-line comments the way you do now, then merging all the SQL lines into one string using string-join() with newline as a separator, then getting rid of the multi-line comments the way you do now, and finally splitting the result up again into separate lines (if that's the way you need it) using xsl:analyze-string.
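
As a rough sketch of just the merge and split steps (no comment stripping yet), something along these lines should work in XSLT 2.0. The match pattern and the output element names are assumptions taken from the sample XML in the question.

Code:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="asset/query">
    <!-- Merge: join the text of every sql-statement with a newline -->
    <xsl:variable name="merged"
        select="string-join(query/sql-statement, '&#xa;')"/>
    <query>
      <!-- Split: the regex matches the newlines, so each non-matching
           substring is one line of the merged text -->
      <xsl:analyze-string select="$merged" regex="\n">
        <xsl:non-matching-substring>
          <query><sql-statement><xsl:value-of select="."/></sql-statement></query>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </query>
  </xsl:template>

</xsl:stylesheet>

The comment removal would then go between the two steps, operating on the merged string.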

A caveat is that I don't think you are correctly handling lines that contain more than one comment. It would be easier to do that using regular expressions (e.g. replace() or xsl:analyze-string) rather than using substring-before and substring-after.
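
To illustrate the point about several comments on one line, here is a minimal sketch using nested replace() calls. It assumes each sql-statement holds a single line of text, and it reuses the BR output element from the question. A line such as "A /* x */ B /* y */ C" loses both comments, whereas substring-before/substring-after only deal with the first one.

Code:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="sql-statement">
    <!-- First drop every complete /* ... */ pair on the line,
         then drop a trailing single-line comment, if any -->
    <xsl:variable name="clean"
        select="replace(replace(., '/\*.*?\*/', ''), '--.*$', '')"/>
    <!-- Emit the line only if something other than whitespace is left -->
    <xsl:if test="normalize-space($clean) ne ''">
      <xsl:value-of select="$clean"/><BR/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

This still leaves alone a line that only opens or only closes a multi-line comment, which is why the merge step is needed for comments that span lines.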

msambasiva@gmail.com September 5th, 2019 05:33 AM

As I'm a beginner with XSLT, I'm a bit confused by your suggestion. Can we control the merging and splitting of the content within a template? How can the content be merged after the single-line comments have been removed in a template? Do we need to define a new template to handle the merge and split?
Below is the sample code where I got stuck on the merge and split for multi-line comments.

<xsl:template match="query/query">
  <xsl:choose>
    <xsl:when test="contains(.,'--')">
      <xsl:if test="normalize-space(substring-before(.,'--')) ne ''">
        <xsl:value-of select="substring-before(.,'--')"/><BR/>
      </xsl:if>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="."/><BR/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
Could you suggest an approach with some sample pseudocode?

mhkay September 5th, 2019 05:56 AM

Try something like this:

Code:

<xsl:template match="asset/query">
<query>
  <!-- Strip single-line comments from each line, then join the lines with newlines -->
  <xsl:variable name="merged"
      select="string-join(query/sql-statement!replace(., '--.*$', ''), '&#xa;')"/>
  <!-- Strip /* ... */ comments; the 's' flag lets "." match across newlines -->
  <xsl:variable name="stripped"
    select="replace($merged, '/\*.*?\*/', '', 's')"/>
  <xsl:for-each select="tokenize($stripped, '&#xa;')">
    <query><sql-statement>{.}</sql-statement></query>
  </xsl:for-each>
</query>
</xsl:template>

Here &#xa; is the character reference for a newline.

The idea is that the first variable $merged first removes single-line comments from each sql-statement using a replace() call, then joins all the lines with a newline character as separator; then the next variable $stripped removes multiline comments with another replace() call; then the tokenize() splits the resulting string on newline boundaries and for each line it regenerates a nested query/sql-statement element. The use of {.} assumes that expand-text="yes" is specified at xsl:stylesheet level.
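
If the stylesheet has to stay within XSLT 2.0, the same merge, strip, and split idea might look roughly like the sketch below. It swaps the ! operator for a "for" expression and {.} for xsl:value-of, and it uses the plain regex /\*.*?\*/ with the 's' flag so that "." also matches newlines. The [normalize-space(.)] filter is an assumption on my part about the desired output: it skips lines that are empty once the comments are gone.

Code:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="asset/query">
    <query>
      <!-- Step 1: strip single-line comments per line, then join with newlines -->
      <xsl:variable name="merged"
          select="string-join(
                    for $s in query/sql-statement return replace($s, '--.*$', ''),
                    '&#xa;')"/>
      <!-- Step 2: strip /* ... */ comments, including ones that span lines -->
      <xsl:variable name="stripped"
          select="replace($merged, '/\*.*?\*/', '', 's')"/>
      <!-- Step 3: split on newlines and rebuild one element per remaining line -->
      <xsl:for-each select="tokenize($stripped, '&#xa;')[normalize-space(.)]">
        <query><sql-statement><xsl:value-of select="."/></sql-statement></query>
      </xsl:for-each>
    </query>
  </xsl:template>

</xsl:stylesheet>

With the sample input from the first post, the four lines between "/*    Commented for bug#22652320" and "AP_LOOKUP_CODES ALC    */" should disappear along with the single-line comments.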

msambasiva@gmail.com September 9th, 2019 02:52 AM

It works with the basic regex /\*.*?\*/.

But I'm trying a more robust regex, shown below, to strip the multi-line comments:
<xsl:variable name="stripped"
select="replace($merged,
'/\*[^*]*\*+(?:[^\/*][^*]*\*+)*/', '', 'ms')"/>
With this I get the error message below.

Severity: fatal
Description: FORX0002: Syntax error at char 11 in regular expression: Non-capturing groups allowed only in XPath3.0
URL: http://www.w3.org/TR/2005/WD-xpath-f...1/#ERRFORX0002

A similar regex works fine in Perl but fails with XSLT 2.0.

Any clue would be a great help!

Thanks in advance,
Samba.

mhkay September 9th, 2019 03:52 AM

Non-capturing groups were not part of the regex syntax defined in XPath 2.0, but they were added to the syntax for XPath 3.0 and 3.1.

What Perl allows is irrelevant: every regex dialect is different.
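
If you need to stay with the XPath 2.0 regex syntax, one possible rewrite is to turn the non-capturing group into an ordinary group, since replace() does not care whether a group captures. The sketch below is self-contained and uses a hypothetical sample string in place of $merged from the earlier template. It also writes [^/*] instead of [^\/*], because \/ is not among the escapes that XPath regular expressions define, and it drops the flags, since the negated character classes already match newlines.

Code:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical sample input standing in for the merged SQL text -->
    <xsl:variable name="merged"
        select="'A, /* one&#xa;two */ B, /* three */ C'"/>
    <!-- Same pattern as in the question, with (...) instead of (?:...) -->
    <xsl:value-of
        select="replace($merged, '/\*[^*]*\*+([^/*][^*]*\*+)*/', '')"/>
  </xsl:template>

</xsl:stylesheet>

Both comments in the sample string, including the one that spans a newline, are removed and A, B, and C are left.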

mhkay September 9th, 2019 04:25 AM

Actually, it's not clear to me why Saxon should produce this message. In Saxon 9.9, we follow the XSLT 3.0 and XPath 3.1 rules whatever the stylesheet version says. In earlier releases we try to enforce XSLT 2.0 restrictions if the stylesheet specifies version="2.0". If you really want to write 2.0-conformant code, then (a) you can't use this regex, and (b) you'll need to use an earlier Saxon release that still has XSLT 2.0 support in order to ensure that any accidental use of 3.0 constructs is rejected. Otherwise, I suggest setting version="3.0" on the xsl:stylesheet element.
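
As a minimal sketch of that last option, the stylesheet below declares version="3.0" and keeps the non-capturing group. The sample string is hypothetical, and I have again written [^/*] rather than [^\/*] on the assumption that the \/ escape would otherwise be rejected.

Code:

<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical sample input standing in for the merged SQL text -->
    <xsl:variable name="merged" select="'A, /* one&#xa;two */ B'"/>
    <!-- (?:...) is a non-capturing group, allowed by the XPath 3.0/3.1 regex rules -->
    <xsl:variable name="stripped"
        select="replace($merged, '/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', '')"/>
    <xsl:value-of select="$stripped"/>
  </xsl:template>

</xsl:stylesheet>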


All times are GMT -4.
