XSLT: General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
Welcome to the p2p.wrox.com Forums.
You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com

July 7th, 2008, 11:45 AM
|
|
Authorized User
|
|
Join Date: Jul 2008
Posts: 11
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
separate a string with whitespace
Hi everyone,
I have spent days on this problem. I want to split a string like "lessThan" into "less than". The string contains one or more upper-case letters; what I need to do is add a space before every upper-case letter and translate that letter to lower case. The latter is easy. Can anybody help me? Thanks.
|
|

July 7th, 2008, 11:49 AM
|
 |
Friend of Wrox
|
|
Join Date: Aug 2007
Posts: 2,128
Thanks: 1
Thanked 189 Times in 188 Posts
|
|
In XSLT 2.0 you can use the <xsl:analyze-string> instruction. Otherwise you will likely have to write a recursive template using the substring() function.
/- Sam Judson : Wrox Technical Editor -/
|
|

July 7th, 2008, 11:58 AM
|
 |
Wrox Author
|
|
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts
|
|
In XSLT 2.0
<xsl:analyze-string select="$in" regex="\p{Lu}">
<xsl:matching-substring>
<xsl:value-of select="concat(' ', lower-case(.))"/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
In XSLT 1.0 it's rather more difficult...
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
|
|

July 7th, 2008, 11:59 AM
|
 |
Wrox Author
|
|
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts
|
|
Correction, should be regex="\p{{Lu}}" - the curlies need to be doubled.
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
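The curlies need doubling because the regex attribute of xsl:analyze-string is an attribute value template, so a single {Lu} would be evaluated as an XPath expression instead of being passed to the regex engine. A minimal sketch of a complete stylesheet with the corrected regex follows; the global parameter "in", its default value, and the text output method are assumptions made only so the fragment can run on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed for illustration: the thread only refers to a variable $in -->
  <xsl:param name="in" select="'lessThan'"/>

  <xsl:template name="main" match="/">
    <!-- {{ and }} are AVT escapes; the regex engine sees \p{Lu} -->
    <xsl:analyze-string select="$in" regex="\p{{Lu}}">
      <xsl:matching-substring>
        <!-- each upper-case letter becomes a space plus its lower-case form -->
        <xsl:value-of select="concat(' ', lower-case(.))"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- everything else is copied unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

With the default value 'lessThan' this should produce "less than". Note that a string starting with an upper-case letter gets a leading space.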
|
|

July 7th, 2008, 01:42 PM
|
|
Authorized User
|
|
Join Date: May 2008
Posts: 32
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
Another way of doing it in XSLT 2.0 would be:
Code:
<xsl:value-of select="lower-case(replace(val,'([A-Z])',' $1'))"/>
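For context, here is a sketch of that one-liner inside a complete stylesheet. The input shape (a root element containing val elements) and the text output with one result per line are assumptions for illustration; the one-liner itself only tells us that val is the node being converted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input: <root><val>lessThan</val>...</root> -->
  <xsl:template match="/root">
    <xsl:for-each select="val">
      <!-- insert a space before each A-Z, then lower-case the whole string -->
      <xsl:value-of select="lower-case(replace(., '([A-Z])', ' $1'))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For a val of "lessThan" this gives "less than". Because lower-case() is applied to the whole string and the pattern is [A-Z], this is only a fit for plain ASCII input.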
|
|

July 7th, 2008, 03:50 PM
|
|
Authorized User
|
|
Join Date: May 2008
Posts: 32
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
I tried to implement it in XSLT 1.0 using recursion.
Here is what I got:
INPUT XML:
Code:
<?xml version="1.0" encoding="ISO-8859-1"?>
<root>
<val>lessThan</val>
<val>hellOWorld</val>
</root>
--------------------------------------------------------------------------------------------
XSL:
Code:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE stylesheet [
  <!ENTITY UPPER "ABCDEFGHIJKLMNOPQRSTUVWXYZ">
  <!ENTITY LOWER "abcdefghijklmnopqrstuvwxyz">
  <!ENTITY UPPER_TO_LOWER " '&UPPER;' , '&LOWER;' ">
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/root">
    <root>
      <xsl:apply-templates select="val"/>
    </root>
  </xsl:template>
  <xsl:template match="val">
    <old_val>
      <xsl:value-of select="."/>
    </old_val>
    <new_val>
      <xsl:call-template name="split_replace">
        <xsl:with-param name="input" select="."/>
      </xsl:call-template>
    </new_val>
  </xsl:template>
  <!-- Divide and conquer: halve the string until single characters remain -->
  <xsl:template name="split_replace">
    <xsl:param name="input"/>
    <xsl:choose>
      <!-- longer than one character and contains an upper-case letter: split in half and recurse -->
      <xsl:when test="string-length($input)-string-length(translate($input,'&UPPER;',''))>0 and string-length($input)>1">
        <xsl:call-template name="split_replace">
          <xsl:with-param name="input" select="substring($input,1,floor(string-length($input) div 2))"/>
        </xsl:call-template>
        <xsl:call-template name="split_replace">
          <xsl:with-param name="input" select="substring($input,floor(string-length($input) div 2)+1)"/>
        </xsl:call-template>
      </xsl:when>
      <!-- a single upper-case letter: emit a space followed by its lower-case form -->
      <xsl:when test="string-length($input)-string-length(translate($input,'&UPPER;',''))=1 and string-length($input)=1">
        <xsl:value-of select="concat(' ', translate($input,&UPPER_TO_LOWER;))"/>
      </xsl:when>
      <!-- no upper-case letters left: copy unchanged -->
      <xsl:otherwise>
        <xsl:value-of select="$input"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
---------------------------------------------------------------------------------------
Resulting XML:
Code:
<?xml version="1.0" encoding="UTF-8" ?>
<root>
<old_val>lessThan</old_val>
<new_val>less than</new_val>
<old_val>hellOWorld</old_val>
<new_val>hell o world</new_val>
</root>
It would be nice to receive some criticism on how it can be improved.
|
|

July 7th, 2008, 04:10 PM
|
 |
Wrox Author
|
|
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts
|
|
>another way of doing it in XSLT 2.0
The replace() solution works if you assume that a character that doesn't match [A-Z] will be unchanged by the lower-case() function. That clearly works only for ASCII. Even if you change it to \p{Lu}, there are characters (such as title-case Fi) that aren't upper-case, but are modified by the lower-case() function.
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
|
|

July 7th, 2008, 10:53 PM
|
|
Authorized User
|
|
Join Date: May 2008
Posts: 32
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
>That clearly works only for ASCII.
Sure, I didn't mention it explicitly because it was quite obvious.
Judging by the example the OP provided, the English alphabet is probably enough for him.
>there are characters (such as title-case Fi) that aren't upper-case, but are modified by the lower-case() function.
You are talking about digraphs here, right?
But why shouldn't they be modified by the lower-case() function?
Of course digraphs are something special, but if one is upper-case inside a word it should still be transformed to lower case, shouldn't it?
PS
I re-read the OP's initial post,
and I think he wants to make the described changes only when a capital letter is 'within' a string, not when it is the first letter of a word,
so e.g. 'LessThan a Mile' should become 'Less than a Mile'.
So neither of our solutions works properly in this case.
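One possible way to handle that reading in XSLT 2.0 is to match an upper-case letter only when it directly follows a lower-case letter. The XSLT 2.0 regex dialect has no lookbehind, so this sketch captures both characters and rebuilds them with regex-group(). The input shape (root/val) and the text output are assumptions for illustration, and this is only one interpretation of what the OP wants.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input: <root><val>LessThan a Mile</val>...</root> -->
  <xsl:template match="/root">
    <xsl:for-each select="val">
      <!-- group 1: a lower-case letter; group 2: the upper-case letter after it -->
      <xsl:analyze-string select="string(.)" regex="(\p{{Ll}})(\p{{Lu}})">
        <xsl:matching-substring>
          <xsl:value-of select="concat(regex-group(1), ' ', lower-case(regex-group(2)))"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <!-- word-initial capitals and everything else pass through -->
          <xsl:value-of select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For "LessThan a Mile" this should give "Less than a Mile". Runs of consecutive capitals are not split by this pattern: "hellOWorld" comes out as "hell oWorld", because the W follows an upper-case letter.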
|
|

July 8th, 2008, 01:56 AM
|
 |
Wrox Author
|
|
Join Date: Jun 2003
Posts: 3,074
Thanks: 1
Thanked 38 Times in 37 Posts
|
|
A notoriously difficult problem: how do you handle the following?
TheBBCDeniedPoliticalBias
TheUNVotedForAVeto
In the general case it's unsolvable because you are trying to add information, or decrease entropy...
--
Joe ( Microsoft MVP - XML)