You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com

September 22nd, 2004, 07:46 AM
Registered User
Join Date: Jun 2003
Posts: 8
Substring and special characters
Hello all. I have a situation where I have a single string that I need to break up into separate fields to create xml values.
The incoming data has escaped characters in a mix of amp, quot, gt, lt and apos.
Let's say I have the following:
$PARM,40,256 = 'Test Description &lt; '
$PARM,296,2 = 'OK'
When I do the following:
<xsl:variable name="FIELD1" select="substring($PARM,40,256)"/>
FIELD1 contains 'Test Description <' + whitespace + 'OK'
Any subsequent substrings that I take for other fields are then corrupted by the offset this creates beyond position 256.
Is there anything I can do to address this? I am using Xalan as the XSLT processor.
Regards.

September 22nd, 2004, 07:51 AM
Wrox Author
Join Date: Apr 2004
Posts: 4,962
Your string does not contain the four characters &,l,t,; - it contains the single character <. Remember that XSLT is always working on the data model created by parsing your XML source, not on the XML source in its lexical form.
Michael Kay
http://saxon.sf.net/
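The effect described in this reply can be reproduced outside XSLT with any XML parser. The following is a minimal sketch using Python's standard library rather than Xalan; the 24-character field width and the `<message>` wrapper content are assumptions chosen for illustration, not the poster's real layout. Note that Python slices are 0-based while XPath `substring()` is 1-based.

```python
import xml.etree.ElementTree as ET

WIDTH = 24  # hypothetical fixed width of the description field in the raw source

# The field is padded to WIDTH characters as written in the escaped source text.
raw_description = "Test Description &lt;".ljust(WIDTH)
source = "<message>" + raw_description + "OK</message>"

# After parsing, the four characters "&lt;" have become the single character "<",
# so the parsed string is three characters shorter than the raw text.
parsed = ET.fromstring(source).text

print(len(raw_description))   # length of the field in the lexical XML
print(len(parsed))            # length of the whole parsed string
print(repr(parsed[:WIDTH]))   # taking WIDTH characters now runs into the next field
```

Because the parsed value is shorter than the lexical text, a slice sized for the raw field swallows the following `OK` field, which matches the symptom in the original question.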

September 22nd, 2004, 08:04 AM
Registered User
Thanks Michael. I understand what it is reading, my question really should have been - how do I trick him into doing something different?
I am trying to cope with a limitation of our process(es). The data is sourced from a fixed format data source and wrapped in <message> tags, so the data *has* to be pre-escaped at source.
So, looking for advice here, if there is no way of doing it via just substrings (as I was attempting) because of what the parser does, is it then possible to pre-process the string before I start substringing so that the real offsets to the data can be maintained? Dynamic positioning based on existence of escaped characters perhaps?
Cheers....
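One way to picture the pre-processing idea raised here is to cut the fields out of the raw, still-escaped text before any XML parser sees it, and only then unescape each field. The sketch below is in Python, not XSLT, and is not something proposed in the thread; the `LAYOUT` table, field names and widths are hypothetical, and it assumes the fixed widths are defined on the escaped text and that no entity straddles a field boundary.

```python
from xml.sax.saxutils import unescape

# Hypothetical layout: (field name, 0-based start, length) in the raw escaped text.
LAYOUT = [("description", 0, 24), ("status", 24, 2)]

# saxutils.unescape handles &lt;, &gt; and &amp; itself; the other two
# predefined XML entities have to be supplied explicitly.
EXTRA_ENTITIES = {"&quot;": '"', "&apos;": "'"}


def split_fields(raw):
    """Slice fixed-width fields from the raw text, then unescape each one."""
    fields = {}
    for name, start, length in LAYOUT:
        chunk = raw[start:start + length]
        fields[name] = unescape(chunk, EXTRA_ENTITIES).rstrip()
    return fields


raw = "Test Description &lt;".ljust(24) + "OK"
fields = split_fields(raw)
print(fields)
```

Slicing first keeps the offsets aligned with the source system's layout; the cost is that this step has to run on the text before it is parsed as XML.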

September 23rd, 2004, 11:39 AM
Wrox Author
I don't think you understood my previous reply. The XSLT processor has absolutely no idea whether a character was originally written as ">" or "&gt;" or "&#x3E;" or "&#00000000062;". It is working on the real textual value of the node, not on the escaped representation used in the lexical XML.
Michael Kay
http://www.saxonica.com/
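The point about equivalent spellings can be checked with a few lines of code. This is a small sketch using Python's standard XML parser as a stand-in for whatever parser feeds the XSLT processor; the element name `m` is arbitrary.

```python
import xml.etree.ElementTree as ET

# Four lexical spellings of the same character in XML content.
spellings = [">", "&gt;", "&#x3E;", "&#00000000062;"]

# Parse each spelling as the text of a tiny document and collect the results.
values = [ET.fromstring("<m>" + s + "</m>").text for s in spellings]

# Every spelling yields the same one-character string; the original form is gone.
print(values)
print(len(set(values)))
```

Once the document has been parsed, nothing in the resulting text distinguishes one spelling from another, so no downstream step can recover the original offsets.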

September 24th, 2004, 02:27 AM
Registered User
Thanks Michael. Yes, I understood your initial reply (grateful - thanks), and understand your second reply.
What I was looking for was advice on the case where the escaped representation does appear in my incoming data and I need to parse it based on known lengths: I would then have to dynamically calculate the start and end positions whenever I found escaped characters in the string.
While I believe someone must have come up against this in the past and found a solution, I figured that this is just too much effort for little reward. In the end I kicked back to the source application owners, with "do us a favour and wrap your free-form special character data in tags so I don't have to give a stuff where it is positionally".
Works like a charm now ;)
Cheers
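For comparison, here is a rough sketch of why the tagged approach removes the offset problem. The element names `description` and `status` are hypothetical, since the thread does not show the final message format, and Python's standard library stands in for the XSLT processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical message in which each field has its own element
# instead of a fixed position inside one long string.
source = (
    "<message>"
    "<description>Test Description &lt;</description>"
    "<status>OK</status>"
    "</message>"
)

message = ET.fromstring(source)

# Fields are addressed by name, so escaped characters cannot shift anything.
description = message.findtext("description")
status = message.findtext("status")

print(repr(description))
print(repr(status))
```

In XSLT the same lookup would be a plain path expression on the child element, with no `substring()` arithmetic involved.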