Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
April 6th, 2008, 03:21 PM
Authorized User
Separating strings and replacing characters

Hello,

I have strings (such as "100mW" or "35 s") consisting of digits followed by non-digit characters. I need to extract the digits and the non-digit characters separately, so that I end up with "100" and "mW", or "35" and "s". I am using the tokenize() function with regular expressions on my initial string. I can extract the digits, but I am having problems with the non-digit characters: I get "   mW" (with as many leading spaces as there are digits) instead of "mW". The expression I use to get the characters is tokenize($initial_string, '[0-9]').

I tried getting rid of the leading white space, but all the functions that I used give me something like "A sequence of more than one item is not allowed as the first argument of ...". The functions I tried include normalize-space(tokenize(...)) and translate(tokenize(...), ...).

How can I solve this?

Thank you!

Michael
 
April 7th, 2008, 01:52 AM
joefawcett
Wrox Author

It seems to me that you could either tokenize on a different pattern, such as \s+, so that you get two parts (the digits and the letters), or use replace() later in the process to remove the spaces. It depends on the XML you are processing and what fits your current process.

--

Joe (Microsoft MVP - XML)
 
April 7th, 2008, 03:02 AM
samjudson
Friend of Wrox

The tokenize() function returns a sequence of strings. It splits its first argument at every match of the pattern given as its second argument. So your example above gives the sequence:

'', '', '', 'mW'

Why the empty strings are then serialised as spaces, I'm not 100% sure.
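One likely explanation, assuming the result was written out with xsl:value-of in an XSLT 2.0 stylesheet: when its select expression returns a sequence, xsl:value-of joins the items with a single space by default. Three empty strings followed by 'mW' therefore come out as three spaces and then 'mW'. The fragment below is only a sketch of that behaviour. It assumes it sits inside a template of a version="2.0" stylesheet and that $initial_string (the name used in the question) holds '100mW'.

```xml
<!-- Tokens are '', '', '', 'mW'; the default separator (one space)
     is inserted between adjacent items, giving three leading spaces. -->
<xsl:value-of select="tokenize($initial_string, '[0-9]')"/>

<!-- Splitting on runs of digits and discarding empty tokens
     leaves the single item 'mW'. -->
<xsl:value-of select="tokenize($initial_string, '[0-9]+')[. != '']"/>
```

For an input such as '35 s' the second expression still returns ' s' with its leading space, so some trimming is needed as well.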

A better way might be to use <xsl:analyze-string>

Code:
<xsl:analyze-string select="$initial_string" regex="[0-9]+">
  <!-- Runs of digits, e.g. "100" -->
  <xsl:matching-substring>
    <xsl:value-of select="."/>
  </xsl:matching-substring>
  <!-- Everything else, e.g. "mW" -->
  <xsl:non-matching-substring>
    <xsl:value-of select="."/>
  </xsl:non-matching-substring>
</xsl:analyze-string>


/- Sam Judson : Wrox Technical Editor -/
 
April 7th, 2008, 03:39 AM
mhkay
Wrox Author

Firstly, you want to be processing one string at a time, typically by doing so inside a template rule or for-each loop. Your error message "A sequence of more than one item is not allowed as the first argument of ..." says that you are trying to process several strings at once, so there is some structural problem in your code. (But you've made the mistake of not showing your code, so we can't tell you what you're doing wrong.)
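Since the original code was never posted, the following is only a guess at the shape of a fix. normalize-space() accepts a single string, while tokenize() returns a sequence, so the tokens have to be handled one at a time. The fragment assumes an XSLT 2.0 stylesheet and reuses $initial_string from the question.

```xml
<!-- Option 1: iterate over the tokens. Inside xsl:for-each the
     context item is one token, so normalize-space(.) is allowed. -->
<xsl:for-each select="tokenize($initial_string, '[0-9]+')[. != '']">
  <xsl:value-of select="normalize-space(.)"/>
</xsl:for-each>

<!-- Option 2: the same idea as a single XPath 2.0 expression. -->
<xsl:value-of select="for $t in tokenize($initial_string, '[0-9]+')[. != '']
                      return normalize-space($t)"/>
```

In both cases the empty token that precedes the digits is filtered out first, so nothing is left to be joined with a separator.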

I think the simplest way of doing this with regular expressions is to use replace() twice: replace($in, '^([0-9]+)([^0-9]*)$', '$1') to get the digits, and replace($in, '^([0-9]+)([^0-9]*)$', '$2') to get the non-digits. (The pattern must not be able to match a zero-length string, otherwise replace() raises an error.) Or, if you used xsl:analyze-string, you would be able to extract both parts using regex-group().
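A minimal sketch of the xsl:analyze-string variant, assuming XSLT 2.0 and an input that always starts with at least one digit. The element names digits and unit are made up for illustration, and the \s* between the groups is an addition that swallows the space in inputs such as "35 s".

```xml
<xsl:analyze-string select="$in" regex="^([0-9]+)\s*([^0-9]*)$">
  <xsl:matching-substring>
    <!-- Group 1: the leading run of digits, e.g. "100" or "35" -->
    <digits><xsl:value-of select="regex-group(1)"/></digits>
    <!-- Group 2: what follows the digits and any white space, e.g. "mW" or "s" -->
    <unit><xsl:value-of select="regex-group(2)"/></unit>
  </xsl:matching-substring>
</xsl:analyze-string>
```

Strings that do not fit the pattern produce no output here; an xsl:non-matching-substring branch could report them if that matters.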

It's not hard to do it using translate():

translate($in, '0123456789', '') gives you the non-digits, and

translate($in, translate($in, '0123456789', ''), '') gives you the digits.
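Put together as a small stylesheet, the two translate() calls might look like the sketch below. The variable names and the sample value '35 s' are illustrative, and normalize-space() is an addition that trims the space left in front of the unit. Only XPath 1.0 functions are used, so this should also run on an XSLT 1.0 processor against any input document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="in" select="'35 s'"/>
    <!-- Delete every digit: leaves ' s' -->
    <xsl:variable name="non-digits" select="translate($in, '0123456789', '')"/>
    <!-- Delete every character that is not a digit: leaves '35' -->
    <xsl:variable name="digits" select="translate($in, $non-digits, '')"/>
    <!-- Trim surrounding white space: leaves 's' -->
    <xsl:variable name="unit" select="normalize-space($non-digits)"/>

    <!-- Writes 35|s -->
    <xsl:value-of select="$digits"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="$unit"/>
  </xsl:template>
</xsl:stylesheet>
```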

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
April 7th, 2008, 09:33 AM
Authorized User

Thank you, everybody; that solved my problem. I used the translate() function as suggested.

Michael
