Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old January 18th, 2010, 03:07 PM
Registered User
 
Join Date: Jan 2010
Posts: 5
Thanks: 1
Thanked 0 Times in 0 Posts
Format XML using XSLT and regex

Hello everyone,
I have an XML document that I want to reformat using XSLT and regular expressions.

I would like to transform this

Code:
<Mutter>
	<Kind>
		<Name>Jonas Maier</Name>
		<Geburtstag>12-3-2001</Geburtstag>
	</Kind>
	<Kind>
		<Name>Leon Jung</Name>
		<Geburtstag>7-30-1981</Geburtstag>
	</Kind>
	<Kind>
		<Name>Anna Krause</Name>
		<Geburtstag>4-14-1995</Geburtstag>
	</Kind>
</Mutter>
into something like this

Code:
<Mutter>
	<Kind>
		<Name>Maier, Jonas</Name>
		<Geburtstag>3-December-2001</Geburtstag>
	</Kind>
	<Kind>
		<Name>Jung, Leon</Name>
		<Geburtstag>30-July-1981</Geburtstag>
	</Kind>
	<Kind>
		<Name>Krause, Anna</Name>
		<Geburtstag>14-April-1995</Geburtstag>
	</Kind>
</Mutter>
So:
  1. The name is restructured by swapping its two parts and inserting a ", " between them.
    Example: Name Surname --> Surname, Name
  2. The date is also reformatted.
    Example: MM-dd-yyyy --> dd-MMMM-yyyy

Can you offer me some help?
 
Old January 18th, 2010, 05:05 PM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

As regards the names, you don't really have enough information to do the job properly: given the name Norman St John Stevas, you need some pretty sophisticated logic to turn that into St John Stevas, Norman. Ditto with names like de Klerk or van Dijk. But assuming you just want to blindly split it at the last space, it's

Code:
<xsl:variable name="tokens" select="tokenize($in, '\s+')"/>
<xsl:value-of select="$tokens[last()]"/>
<xsl:text>, </xsl:text>
<xsl:value-of select="$tokens[position() lt last()]"/>
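As an editorial aside: the same blind split at the last space can also be written as a single regular-expression replacement, which is closer to what the question asked for. This is only a minimal sketch, assuming an XSLT 2.0 processor and the Mutter/Kind/Name input shown in the question; the identity template is an addition of this sketch so that the rest of the document is copied unchanged.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template (an assumption of this sketch): copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- "Jonas Maier" becomes "Maier, Jonas": the greedy (.+) takes everything
       before the last run of whitespace, (\S+) takes the final word -->
  <xsl:template match="Name">
    <xsl:copy>
      <xsl:value-of select="replace(normalize-space(.), '^(.+)\s+(\S+)$', '$2, $1')"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

A name consisting of a single word does not match the pattern and is left as it is. Like the tokenize() version, this has no idea about multi-word surnames.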
For the dates: split it into three tokens using tokenize(), expand the day to two digits using format-number(), convert the month to a string by doing ("January", "February"...)[$month], and then use concat() to put it back together again.
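The steps just described could be sketched roughly as follows. This is an editorial illustration, not code from the thread: it assumes XSLT 2.0, the month-day-year input from the question, and English month names; the variable names ($months, $parts, $month, $day) and the identity template are made up for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Month names, looked up by position 1 to 12 -->
  <xsl:variable name="months" select="('January', 'February', 'March', 'April',
      'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December')"/>

  <!-- Identity template (an assumption of this sketch): copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Geburtstag">
    <!-- Input is month-day-year, for example 12-3-2001 -->
    <xsl:variable name="parts" select="tokenize(normalize-space(.), '-')"/>
    <!-- An integer in a predicate selects by position -->
    <xsl:variable name="month" select="xs:integer($parts[1])"/>
    <!-- Pad the day to two digits -->
    <xsl:variable name="day" select="format-number(number($parts[2]), '00')"/>
    <xsl:copy>
      <xsl:value-of select="concat($day, '-', $months[$month], '-', $parts[3])"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the day is padded, 12-3-2001 comes out as 03-December-2001. If the unpadded form from the question is wanted, xs:integer($parts[2]) could be used in place of the format-number() call.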
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
 
Old January 18th, 2010, 05:17 PM
Registered User
 
Join Date: Jan 2010
Posts: 5
Thanks: 1
Thanked 0 Times in 0 Posts

Thank you very much. Could you just show me an example of how to reformat the date with regex?
 
Old January 18th, 2010, 06:21 PM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

No, I'm not going to write the code for you. You've got to learn to do it for yourself. If there's part of my explanation you don't understand, please explain what you don't understand.
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
 
Old January 19th, 2010, 05:53 AM
Authorized User
 
Join Date: Jan 2010
Posts: 12
Thanks: 2
Thanked 0 Times in 0 Posts

What is the best way to handle this issue (date/time data)? Or: how should I store and format date fields within an XML file to avoid this kind of prehistoric format transformation? Sorry, but this technique reminds me of handling date 'formats' at the DOS prompt. What I am looking for is a mechanism like a date type within a database plus a convert function. Is there something similar in XSLT?
 
Old January 19th, 2010, 06:06 AM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

It's always best to store dates in ISO format (yyyy-mm-dd) and then you can use the XSLT format-date() function to display them using local conventions for your country or language.
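A minimal sketch of that advice, added editorially. It assumes the source stored the birthday as an ISO date such as <Geburtstag>2001-12-03</Geburtstag> (the data in this thread does not), and the identity template is again an assumption of the sketch.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Identity template (an assumption of this sketch): copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Cast the ISO string to xs:date, then let the picture string decide the layout:
       [D] is the day, [MNn] the month name in title case, [Y] the year -->
  <xsl:template match="Geburtstag">
    <xsl:copy>
      <xsl:value-of select="format-date(xs:date(.), '[D]-[MNn]-[Y]')"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With English as the processor's default language this should give dates of the form 3-December-2001. The language of the month name depends on the processor and its configuration.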
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
 
Old January 19th, 2010, 07:28 AM
Authorized User
 
Join Date: Jan 2010
Posts: 12
Thanks: 2
Thanked 0 Times in 0 Posts

I still don't get it. With the given data, I would convert the date to ISO format and then use format-date() to produce any representation I like. Like this?

Code:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet 
	version="2.0" 
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:output 
	method="xml" 
	encoding="iso-8859-1" 
	indent="yes" 
/>

<xsl:template match="/Mutter/Kind">

<xsl:variable name="tokens" select="tokenize(Geburtstag, '-')"/>
<xsl:variable name="demo-data">
	<xsl:value-of select="$tokens[3]"/>-<xsl:value-of select="substring(string(100+number($tokens[1])), 1, 2)"/>-<xsl:value-of select="substring(string(100+number($tokens[2])), 1, 2)"/>
</xsl:variable>
<xsl:value-of select="format-date($demo-data, '[D]. [MNn] [Y]')"/>
</xsl:template>

</xsl:stylesheet>
The result is close to what is required, but there are still some remarkable differences (Saxon 9.2.0.3):

Code:
[Language: en]10. November 2001
[Language: en]13. October 1981
[Language: en]11. October 1995
Regards

Christian
 
Old January 19th, 2010, 07:51 AM
samjudson's Avatar
Friend of Wrox
 
Join Date: Aug 2007
Posts: 2,128
Thanks: 1
Thanked 189 Times in 188 Posts

It would help if your code was correct.

substring(x, 1, 2) returns the first two characters - you want the second two, so substring(x, 2, 2).

Alternatively:

<xsl:variable name="demo-data">
<xsl:value-of select="$tokens[3]"/>-<xsl:value-of select="if (number($tokens[1]) &lt; 10) then concat('0',$tokens[1]) else $tokens[1]"/>-<xsl:value-of select="if (number($tokens[2]) &lt; 10) then concat('0',$tokens[2]) else $tokens[2]"/>
</xsl:variable>

While it's slightly more verbose, it explains what you are trying to do better and is easier to understand.
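Another way to do the padding, offered here as an editorial sketch and not as anything posted in the thread, is format-number() with the picture '00'. It assumes XSLT 2.0 and the month-day-year input from the question; the variable name $iso is made up for the example, and the template is meant to be dropped into a stylesheet that declares the xs prefix as shown in the comment.

```xml
<!-- Requires xmlns:xs="http://www.w3.org/2001/XMLSchema" on xsl:stylesheet -->
<xsl:template match="Geburtstag">
  <xsl:variable name="tokens" select="tokenize(normalize-space(.), '-')"/>
  <!-- Build yyyy-mm-dd and cast it to xs:date; format-number pads to two
       digits whether or not the input already carries a leading zero -->
  <xsl:variable name="iso" select="xs:date(concat(
      $tokens[3], '-',
      format-number(number($tokens[1]), '00'), '-',
      format-number(number($tokens[2]), '00')))"/>
  <xsl:copy>
    <xsl:value-of select="format-date($iso, '[D]. [MNn] [Y]')"/>
  </xsl:copy>
</xsl:template>
```

Casting to xs:date explicitly also means a malformed input fails at the cast instead of producing a plausible but wrong date.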
__________________
/- Sam Judson : Wrox Technical Editor -/

Think before you post: What have you tried?
 
Old January 19th, 2010, 08:13 AM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

There are a couple of known problems with localization of dates in Saxon 9.2.

This one concerns installation of localization code for languages other than English:

https://sourceforge.net/tracker/?fun...72&atid=397617

This one concerns the spurious [Language:en]:

https://sourceforge.net/tracker/?fun...72&atid=397617

I would recommend that if you want English format dates and your machine is configured to a language other than English, then you use the language argument of format-date() to say what language you want.
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
 
Old January 19th, 2010, 02:42 PM
Authorized User
 
Join Date: Jan 2010
Posts: 12
Thanks: 2
Thanked 0 Times in 0 Posts

Ad 1 (Sam):

Of course the substring argument was wrong and should be 2,2

The alternative code is not a true equivalent, as it only works when the input never has a leading '0'. To make it equivalent you have to add another conversion, so the code gets even longer.

Code:
<xsl:value-of select="$tokens[3]"/>-<xsl:value-of select="if (number($tokens[1]) &lt; 10) then concat('0',number($tokens[1])) else $tokens[1]"/>-<xsl:value-of select="if (number($tokens[2]) &lt; 10) then concat('0',number($tokens[2])) else $tokens[2]"/>
I thought that the if-statement alternative would be much slower, but it turned out to perform the same. I tested this by transforming a few million elements.

So in the end it is maybe only a matter of taste.

Ad 2 (Mike):
I have to use the binaries 'as is'. There is no way to use a customized/localized version at my customer's site. When I try to use the language argument like
Code:
format-date($demo-data, '[D]. [MNn] [Y]', 'de', (), ())
it has no effect. The result still has a leading [Language: en] and is not localized/translated.

Best regards

Christian




