
January 18th, 2010, 03:07 PM
Registered User
Format XML using XSLT and regex
Hello everyone,
I have an XML document that I want to reformat using XSLT and regular expressions.
I would like to transform this
Code:
<Mutter>
  <Kind>
    <Name>Jonas Maier</Name>
    <Geburtstag>12-3-2001</Geburtstag>
  </Kind>
  <Kind>
    <Name>Leon Jung</Name>
    <Geburtstag>7-30-1981</Geburtstag>
  </Kind>
  <Kind>
    <Name>Anna Krause</Name>
    <Geburtstag>4-14-1995</Geburtstag>
  </Kind>
</Mutter>
into something like this
Code:
<Mutter>
  <Kind>
    <Name>Maier, Jonas</Name>
    <Geburtstag>3-December-2001</Geburtstag>
  </Kind>
  <Kind>
    <Name>Jung, Leon</Name>
    <Geburtstag>30-July-1981</Geburtstag>
  </Kind>
  <Kind>
    <Name>Krause, Anna</Name>
    <Geburtstag>14-April-1995</Geburtstag>
  </Kind>
</Mutter>
So:
- The name is restructured by swapping its two parts and inserting a "," between them.
Example: Name Surname --> Surname, Name
- The date is also reformatted.
Example: M-d-yyyy --> d-MMMM-yyyy
Can you offer me some help?

January 18th, 2010, 05:05 PM
Wrox Author
As regards the names, you don't really have enough information to do the job properly: given the name Norman St John Stevas, you need some pretty sophisticated logic to turn that into St John Stevas, Norman. Ditto with names like de Klerk or van Dijk. But assuming you just want to blindly split it at the last space, it's
Code:
<xsl:variable name="tokens" select="tokenize($in, '\s+')"/>
<xsl:value-of select="$tokens[last()]"/>
<xsl:text>, </xsl:text>
<xsl:value-of select="$tokens[position() lt last()]"/>
For the dates: split it into three tokens using tokenize(), expand the day to two digits using format-number(), convert the month to a string by doing ("January", "February"...)[$month], and then use concat() to put it back together again.
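A minimal sketch of both steps as one XSLT 2.0 stylesheet, assuming the input looks exactly like the sample in the first post (month-day-year, separated by "-"). The identity template, the use of normalize-space(), and the variable names are illustrative assumptions, not part of the original answer. The day is left unpadded to match the requested output; format-number(xs:integer($parts[2]), '00') would give two digits instead.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- "Jonas Maier" becomes "Maier, Jonas": blind split at the last space -->
  <xsl:template match="Name">
    <xsl:variable name="tokens" select="tokenize(normalize-space(.), '\s+')"/>
    <xsl:copy>
      <xsl:value-of select="$tokens[last()]"/>
      <xsl:text>, </xsl:text>
      <!-- All remaining tokens, joined by single spaces (the XSLT 2.0 default) -->
      <xsl:value-of select="$tokens[position() lt last()]"/>
    </xsl:copy>
  </xsl:template>

  <!-- "12-3-2001" (month-day-year) becomes "3-December-2001" -->
  <xsl:template match="Geburtstag">
    <xsl:variable name="parts" select="tokenize(normalize-space(.), '-')"/>
    <xsl:variable name="months" select="('January', 'February', 'March',
        'April', 'May', 'June', 'July', 'August', 'September',
        'October', 'November', 'December')"/>
    <xsl:copy>
      <!-- xs:integer() strips any leading zero and yields a usable index -->
      <xsl:value-of select="concat(xs:integer($parts[2]), '-',
          $months[xs:integer($parts[1])], '-', $parts[3])"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that a name consisting of a single word would still get a trailing ", " with this sketch, and multi-part surnames are split at the last space only, as cautioned above.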
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference

January 18th, 2010, 05:17 PM
Registered User
Thank you very much. Could you just show me an example of how to reformat the date with regex?

January 18th, 2010, 06:21 PM
Wrox Author
No, I'm not going to write the code for you. You've got to learn to do it for yourself. If there's part of my explanation you don't understand, please explain what you don't understand.
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference

January 19th, 2010, 05:53 AM
Authorized User
What is the best way to handle this issue (date/time data)? Or: how should I store and format date fields within an XML file to avoid this kind of prehistoric format transformation? Sorry, but this technique reminds me of handling date 'formats' at the DOS prompt. What I am looking for is a mechanism like a date type within a database, plus a conversion function. Is there something similar in XSLT?

January 19th, 2010, 06:06 AM
Wrox Author
It's always best to store dates in ISO format (yyyy-mm-dd) and then you can use the XSLT format-date() function to display them using local conventions for your country or language.
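As a rough illustration, assuming an XSLT 2.0 processor and a date that is already in ISO form, format-date() takes an xs:date and a picture string. The literal date and the variable name below are illustrative only. The first picture should give something like 3-December-2001 when the processor's default language is English; the second is purely numeric and does not depend on language.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- An ISO 8601 date (yyyy-mm-dd) can be cast directly to xs:date -->
    <xsl:variable name="birthday" select="xs:date('2001-12-03')"/>

    <!-- [D] = day of month, [MNn] = month name in title case, [Y] = year -->
    <xsl:value-of select="format-date($birthday, '[D]-[MNn]-[Y]')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Numeric picture with zero padding: dd.mm.yyyy -->
    <xsl:value-of select="format-date($birthday, '[D01].[M01].[Y0001]')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```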
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference

January 19th, 2010, 07:28 AM
Authorized User
I still don't get it. With the given data, I would convert the date to ISO format and then use format-date() to produce whatever representation I like. Like this?
Code:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet
    version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:output
      method="xml"
      encoding="iso-8859-1"
      indent="yes"
  />
  <xsl:template match="/Mutter/Kind">
    <xsl:variable name="tokens" select="tokenize(Geburtstag, '-')"/>
    <xsl:variable name="demo-data">
      <xsl:value-of select="$tokens[3]"/>-<xsl:value-of select="substring(string(100+number($tokens[1])), 1, 2)"/>-<xsl:value-of select="substring(string(100+number($tokens[2])), 1, 2)"/>
    </xsl:variable>
    <xsl:value-of select="format-date($demo-data, '[D]. [MNn] [Y]')"/>
  </xsl:template>
</xsl:stylesheet>
The result is close to what is required, but there are still some notable differences (Saxon 9.2.0.3):
Code:
[Language: en]10. November 2001
[Language: en]13. October 1981
[Language: en]11. October 1995
Regards
Christian

January 19th, 2010, 07:51 AM
Friend of Wrox
It would help if your code were correct.
substring(x, 1, 2) returns the first two characters - you want the second two, so substring(x, 2, 2).
Alternatively:
<xsl:variable name="demo-data">
<xsl:value-of select="$tokens[3]"/>-<xsl:value-of select="if (number($tokens[1]) &lt; 10) then concat('0',$tokens[1]) else $tokens[1]"/>-<xsl:value-of select="if (number($tokens[2]) &lt; 10) then concat('0',$tokens[2]) else $tokens[2]"/>
</xsl:variable>
While it's slightly more verbose, it makes the intent clearer and is easier to understand.
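Another way to pad the month and day, sketched here under the assumption that the input matches the sample from the first post, is the format-number() function suggested earlier in the thread. It gives two digits whether or not the input already has a leading zero. The variable name iso-date and the text output are illustrative choices.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <xsl:template match="/Mutter">
    <xsl:for-each select="Kind">
      <!-- tokens: [1] = month, [2] = day, [3] = year (assumed four digits) -->
      <xsl:variable name="tokens"
          select="tokenize(normalize-space(Geburtstag), '-')"/>

      <!-- Build yyyy-mm-dd; the '00' picture pads to two digits -->
      <xsl:variable name="iso-date" as="xs:date"
          select="xs:date(string-join((
                    $tokens[3],
                    format-number(number($tokens[1]), '00'),
                    format-number(number($tokens[2]), '00')), '-'))"/>

      <xsl:value-of select="format-date($iso-date, '[D]-[MNn]-[Y]')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Casting to xs:date also means an invalid value such as month 13 raises an error instead of silently producing a wrong date.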

January 19th, 2010, 08:13 AM
Wrox Author
There are a couple of known problems with localization of dates in Saxon 9.2.
This one concerns installation of localization code for languages other than English:
https://sourceforge.net/tracker/?fun...72&atid=397617
This one concerns the spurious [Language:en]:
https://sourceforge.net/tracker/?fun...72&atid=397617
I would recommend that if you want English format dates and your machine is configured to a language other than English, then you use the language argument of format-date() to say what language you want.
__________________
Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference

January 19th, 2010, 02:42 PM
Authorized User
Ad 1 (Sam):
Of course the substring argument was wrong and should be 2, 2.
The alternative code is not really equivalent, though, as it only works when the input never has a leading '0'. To make it equivalent you have to add another conversion, so the code gets even longer.
Code:
<xsl:value-of select="$tokens[3]"/>-<xsl:value-of select="if (number($tokens[1]) &lt; 10) then concat('0',number($tokens[1])) else $tokens[1]"/>-<xsl:value-of select="if (number($tokens[2]) &lt; 10) then concat('0',number($tokens[2])) else $tokens[2]"/>
I thought the if-statement alternative would be much slower, but it turned out to perform the same. I tested this by transforming a few million elements.
So in the end it may just be a matter of taste.
Ad 2 (Mike):
I have to use the binaries 'as is'. There is no way to use a customized/localized version at my customer's site. When I try to use the language argument like
Code:
format-date($demo-data, '[D]. [MNn] [Y]', 'de', (), ())
it has no effect. The result still has a leading [Language: en] and is not localized/translated.
Best regards
Christian