|
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead. |
Welcome to the p2p.wrox.com Forums.
You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
|
|
|
January 26th, 2004, 06:50 AM
|
Registered User
|
|
Join Date: Jan 2004
Posts: 7
Thanks: 0
Thanked 0 Times in 0 Posts
|
|
German characters in XML/XSLT
Hi there
I have looked in a lot of places for an answer to this, so I apologize if it has already been answered - I couldn't find the solution.
I'm converting XML to HTML using XSLT/PHP. It works perfectly, except that I can't include German characters in the text; doing so produces an XSLT error (bad token). What do I have to do so that I can include the following characters in the XML: ä, ö, ü, Ä, Ö, Ü, ß and have them appear correctly in the HTML?
Thanks for any hints you can offer
Norm
|
January 26th, 2004, 07:11 AM
|
|
Wrox Author
|
|
Join Date: Jun 2003
Posts: 3,074
Thanks: 1
Thanked 38 Times in 37 Posts
|
|
The easiest way is to use Unicode. Make sure that the editor or system you use to create the xml file can create files using utf-8 encoding. Then as long as you are not using an esoteric font you will be okay.
If you can't follow this, explain how the xml file is created.
Joe (MVP - xml)
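To make the advice above concrete, here is a minimal sketch of what "utf-8 encoded" means at the byte level. It is written in Python rather than the PHP used in this thread, and the helper name `is_valid_utf8` is made up for illustration. The point it tries to show is that the same character is stored as different bytes depending on how the editor saves the file, so a file can carry a UTF-8 declaration while its bytes are actually something else.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same character, saved two different ways.
as_utf8 = "ü".encode("utf-8")      # two bytes: C3 BC
as_latin1 = "ü".encode("latin-1")  # one byte: FC

print(as_utf8.hex(), is_valid_utf8(as_utf8))
print(as_latin1.hex(), is_valid_utf8(as_latin1))
```

A parser that has been told to expect UTF-8 and then meets the single byte FC has no valid way to read it, which is one plausible source of an error on German characters.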
|
January 26th, 2004, 10:00 AM
|
Registered User
Quote:
Originally posted by joefawcett
The easiest way is to use Unicode. Make sure that the editor or system you use to create the xml file can create files using utf-8 encoding.
|
Thanks, Joe. The files are created using XML Spy and the first line of each file is, of course:
<?xml version="1.0" encoding="UTF-8"?>
So, I can't really see what the problem is. Now of course, when I write the HTML to a variable ($result), I could do a string replace. So I could include, say, "uumlaut" in the middle of a word and then do a string replace to change it to "&uuml;", which would be the correct HTML entity for "ü". However, that assumes that I have control over what is in the XML file, but in many cases I want to call a remote XML file over which I have no control.
I tried writing "ü" straight into an XML file, but the Sablotron processor threw an error (bad token).
PHP has the functions utf8_encode() and utf8_decode(), but I'm not sure if/how these can help me because, to tell you the truth, my knowledge of the problem with character encoding is shaky at best.
Thank you for any further light you can shed on this
Norm
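On the entity idea raised in this post, a small hedged sketch may help. XML itself predefines only five named entities (`&lt;`, `&gt;`, `&amp;`, `&apos;`, `&quot;`), so an HTML name such as `&uuml;` is undefined in a plain XML file unless a DTD declares it, whereas a numeric character reference such as `&#252;` is always allowed. The example below is Python, not PHP, and uses Python's bundled parser rather than Sablotron; the helper names `to_char_refs` and `parses` are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def parses(xml_text: str) -> bool:
    """Return True if the text is well-formed XML for this parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(to_char_refs("Müller"))
print(parses("<name>M&#252;ller</name>"))  # numeric reference
print(parses("<name>M&uuml;ller</name>"))  # HTML-only named entity, no DTD
```

Because the numeric form is pure ASCII, it survives any encoding mix-up between editor, PHP string, and processor, which makes it a reasonable fallback when the file's real encoding is in doubt.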
|
January 26th, 2004, 10:24 AM
|
|
Wrox Author
If you email me the xml and xsl files (joefawcett AT hotmail.com), I will have a look. It's no use posting them here, as the encoding can be lost. Can you also explain exactly how you do the transform?
Joe (MVP - xml)
|
January 27th, 2004, 04:45 AM
|
Registered User
OK - will do in the next 10 mins.
Thanks
Norm
|
January 28th, 2004, 12:46 PM
|
|
Wrox Author
Norman
Not received, did you sort this out?
--
Joe
|
January 28th, 2004, 03:57 PM
|
Registered User
Yeah, I sent it to you a couple of days ago. I'll try again.
Norm
|
January 28th, 2004, 04:00 PM
|
Registered User
OK - I sent it again 1 minute ago, to joefawcett at hotmail.com
Norm
|
January 29th, 2004, 05:51 AM
|
|
Wrox Author
Received the files. I don't use PHP, but the xml file you sent was not utf-8 encoded; as you copied it after opening it in IE, that may have caused it. When I opened it in my text editor, removed the extra bits put in by IE (the dashes that mark expansion points) and saved it as a utf-8 file, it viewed fine. On the other hand, it might be the way it is handled in PHP, because you load it as a string, not as a file. This often causes encoding to be lost; in VB, for example, strings are utf-16.
Perhaps you could search and replace using xml dom instead of string manipulation?
Hope this is of some help.
Joe (MVP - xml)
|
January 29th, 2004, 09:16 AM
|
Registered User
Hi Joe
Thanks for the info. If I try using PHP function utf8_encode() on the string, it might solve the problem. I convert it to a string because the Sablotron processor requires this. If the IE display is adding unwanted stuff, I can always do a string replace before UTF-8 encoding. I would need to have a look at what that 'stuff' is. I really don't know much about utf-8 or indeed XML.
Anyway, I'll get on it and let you know if I succeed. Thanks for your help.
Norm
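For readers following the utf8_encode() idea: PHP documents that function as converting ISO-8859-1 text to UTF-8, so it helps only when the input really is ISO-8859-1. The sketch below mimics that conversion in Python (not PHP) and shows what happens when it is applied to data that is already UTF-8. The helper names `latin1_to_utf8` and `ensure_utf8` are hypothetical, and the "try UTF-8 first" check is only a heuristic, since some ISO-8859-1 byte sequences also happen to be valid UTF-8.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret bytes as ISO-8859-1 and re-encode them as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def ensure_utf8(data: bytes) -> bytes:
    """Leave valid UTF-8 alone; otherwise assume ISO-8859-1 and convert."""
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        return latin1_to_utf8(data)


latin1_bytes = "für".encode("latin-1")
utf8_bytes = "für".encode("utf-8")

print(latin1_to_utf8(latin1_bytes).decode("utf-8"))  # converted correctly
print(latin1_to_utf8(utf8_bytes).decode("utf-8"))    # double-encoded
print(ensure_utf8(utf8_bytes).decode("utf-8"))       # left untouched
```

The second line prints the familiar "Ã" garbage in place of the umlaut, which is the usual sign that text was converted to UTF-8 twice. Checking the bytes before converting avoids that.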
|
|
|