p2p.wrox.com Forums


igraham July 14th, 2007 07:13 PM

Grouping plain text into paragraphs
 
I'm trying to process plain text to turn it into XML/DITA <p> and <pre> elements. The idea is that consecutive lines of text with indents of exactly n spaces should be grouped into a <p> element, whereas lines with either fewer or more spaces before non-whitespace content should be grouped into <pre> elements.

I've come up with the following template that does the job for a specific indent, in this case 15 spaces, but I haven't figured out how to support an indent defined by my indent parameter. Basically what I want is to dynamically create my regular expression with the correct indent value inserted where I currently have the value 15 hard-coded:
Code:

   <xsl:template name="convertFixedIndentToParagraphs">
     <xsl:param name="text"/>
     <xsl:param name="indent"/>
     <xsl:analyze-string select="$text" regex="(^ {{15}}[^ ][^\n]*\n?)+" flags="m">
        <xsl:matching-substring>
           <p>
              <xsl:for-each select="tokenize(., '\n')">
                 <xsl:text>&#10;</xsl:text>
                 <xsl:value-of select="substring(., $indent + 1)"/>
              </xsl:for-each>
           </p>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
           <xsl:if test="matches(., '\S')">
              <pre><xsl:text>&#10;</xsl:text>
                 <xsl:call-template name="eliminateMinimumIndent"/></pre>
           </xsl:if>
        </xsl:non-matching-substring>
     </xsl:analyze-string>
   </xsl:template>

I really thought I had this working well enough with the hard-coded indent value, until I discovered that many of the text nodes I'm processing have slightly different standard indents, so I need to be able to use the indent parameter properly.

Is there an easy way to parameterize that value in my regex? Or am I going to have to come up with a completely different solution?

Ian




mhkay July 15th, 2007 12:02 PM

The regex attribute in xsl:analyze-string is an AVT, so the value can be constructed at run-time.
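For illustration, here is a minimal sketch of that AVT approach applied to Ian's template. Inside an attribute value template, `{{` and `}}` are escaped literal braces and `{$indent}` is an expression, so `{{{$indent}}}` produces a quantifier such as `{4}`. The `match="/"` driver, the `<body>` wrapper, the template name, and the indent of 4 are assumptions added only to make the sketch runnable. The `<pre>` branch is simplified and omits the eliminateMinimumIndent step, which the thread does not show.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical driver: treats the whole input document as text
       and assumes a standard indent of 4 spaces -->
  <xsl:template match="/">
    <body>
      <xsl:call-template name="convertIndentToParagraphs">
        <xsl:with-param name="text" select="string(.)"/>
        <xsl:with-param name="indent" select="4"/>
      </xsl:call-template>
    </body>
  </xsl:template>

  <xsl:template name="convertIndentToParagraphs">
    <xsl:param name="text"/>
    <xsl:param name="indent"/>
    <!-- {{ and }} are literal braces; {$indent} is evaluated at run time,
         so with $indent = 4 the regex becomes (^ {4}[^ ][^\n]*\n?)+ -->
    <xsl:analyze-string select="$text"
        regex="(^ {{{$indent}}}[^ ][^\n]*\n?)+" flags="m">
      <xsl:matching-substring>
        <p>
          <!-- Drop the empty token left by a trailing newline -->
          <xsl:for-each select="tokenize(., '\n')[. != '']">
            <xsl:text>&#10;</xsl:text>
            <xsl:value-of select="substring(., $indent + 1)"/>
          </xsl:for-each>
        </p>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:if test="matches(., '\S')">
          <!-- Simplification: the text is copied as-is, indent included -->
          <pre><xsl:value-of select="."/></pre>
        </xsl:if>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

The only change to the regex itself is that the hard-coded `{{15}}` becomes `{{{$indent}}}`.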

However, I think I would use a completely different approach. First turn each line into an element node, then group adjacent lines having the same indentation: use xsl:for-each-group group-adjacent="f:indent(.)" where f:indent() counts the number of leading spaces in a string.
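A rough sketch of the grouping approach might look like the following. It is one possible reading of the suggestion, not a definitive implementation: the `urn:example:functions` namespace, the `<line>` element name, the `groupByIndent` template, the `<body>` wrapper, and the driver with an indent of 4 are all assumptions. The grouping key is also adapted slightly: instead of grouping on the raw value of f:indent(), it tests whether f:indent() equals the target indent, so that all other adjacent lines fall into a single `<pre>` regardless of their individual indents.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- Counts the leading spaces in a line: total length minus the
       length that remains once the leading spaces are removed -->
  <xsl:function name="f:indent" as="xs:integer">
    <xsl:param name="line" as="xs:string"/>
    <xsl:sequence select="string-length($line)
                          - string-length(replace($line, '^ +', ''))"/>
  </xsl:function>

  <!-- Hypothetical driver: whole input as text, indent of 4 assumed -->
  <xsl:template match="/">
    <body>
      <xsl:call-template name="groupByIndent">
        <xsl:with-param name="text" select="string(.)"/>
        <xsl:with-param name="indent" select="4"/>
      </xsl:call-template>
    </body>
  </xsl:template>

  <xsl:template name="groupByIndent">
    <xsl:param name="text" as="xs:string" required="yes"/>
    <xsl:param name="indent" as="xs:integer" required="yes"/>
    <!-- Step 1: turn each line into an element node -->
    <xsl:variable name="lines" as="element(line)*">
      <xsl:for-each select="tokenize($text, '\r?\n')">
        <line><xsl:value-of select="."/></line>
      </xsl:for-each>
    </xsl:variable>
    <!-- Step 2: group adjacent lines; the key is true for non-blank
         lines indented by exactly $indent spaces -->
    <xsl:for-each-group select="$lines"
        group-adjacent="f:indent(.) eq $indent and matches(., '\S')">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <p>
            <xsl:value-of separator="&#10;"
                select="for $l in current-group()
                        return substring($l, $indent + 1)"/>
          </p>
        </xsl:when>
        <!-- Skip groups that contain only blank lines -->
        <xsl:when test="some $l in current-group()
                        satisfies matches($l, '\S')">
          <pre><xsl:value-of select="current-group()"
                             separator="&#10;"/></pre>
        </xsl:when>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

Because the indent is an ordinary parameter compared with `eq`, no regular expression has to be built at run time. Removing the minimum indent from the `<pre>` content is left out of this sketch.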

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference

igraham July 15th, 2007 03:27 PM

Thanks Michael. I actually did try to construct the value at runtime, but couldn't figure out the syntax to do it. I'll read up on attribute value templates. Thanks also for the advice to use grouping - I haven't yet figured out how to use it, but it'll be easier now that I KNOW that that is the approach I should really take. I still haven't made it past the steepest hurdles in really getting XSLT. I still do a lot of hacking around just to get almost to what I want.

In particular, I have serious trouble understanding the processing of mixed content. Currently the books I have on XSLT are Learning XSLT and XSLT Cookbook, neither of which is suitable as a reference. It looks like I should really get your books, because I find it difficult to extract real understanding from the W3C specs, which have limited examples.

Thanks again,
Ian


pauljr8 July 15th, 2007 09:44 PM

Hi Ian,

I often want to display text exactly as I've entered it in an xml element. To do so I use this template and I've included an example of how I call it. So if I enter:
<tips name="Tip Number 1">
<tip>This is some

            text
I want displayed</tip>
<tip>of
course I
don't
really display
      anything
this way
</tip>
</tips>

It displays it that way. Hope it helps.

BTW I've had XSLT 2nd Edition Programmer's Reference for many years. Couldn't live without it, but I still need Mr. Kay's help from time to time. Stopped holding my breath to be able to use XSLT V. 2 when I looked like :(

<xsl:template match="tips">
   <xsl:for-each select="./tip">
      <!-- name is an attribute of the parent <tips> element, not of <tip> -->
      <h1 align="center"><xsl:value-of select="../@name"/></h1>
      <p>
         <xsl:call-template name="replace-text">
            <xsl:with-param name="text" select="."/>
            <xsl:with-param name="replace" select="'&#10;'"/>
            <xsl:with-param name="by" select="'&lt;br /&gt;'"/>
         </xsl:call-template>
      </p>
   </xsl:for-each>
</xsl:template>

<!-- Recursively replaces every occurrence of $replace in $text with $by -->
<xsl:template name="replace-text">
   <xsl:param name="text"/>
   <xsl:param name="replace"/>
   <xsl:param name="by"/>
   <xsl:choose>
      <xsl:when test="contains($text, $replace)">
         <xsl:value-of select="substring-before($text, $replace)"/>
         <!-- $by holds markup, so it is written without escaping -->
         <xsl:value-of select="$by" disable-output-escaping="yes"/>
         <xsl:call-template name="replace-text">
            <xsl:with-param name="text" select="substring-after($text, $replace)"/>
            <xsl:with-param name="replace" select="$replace"/>
            <xsl:with-param name="by" select="$by"/>
         </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
         <xsl:value-of select="$text"/>
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>


Paul Hickey

igraham July 16th, 2007 01:10 PM

Thanks for the <tips> Paul!

Ian


