Wrox Programmer Forums

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old July 26th, 2007, 12:04 PM
Removing UTF_8 Middle Dot using translate

Hi,
I am trying to remove the middle dot using the translate() function. I have a Word 2003 XML file that, when transformed, contains a middle dot (U+00B7): http://www.fileformat.info/info/unic...00b7/index.htm

I use the following code to remove certain characters that are not supported by another program. That program does not fully support UTF-8, so I have to work around it.

I understand that some of the characters below will be garbage. The first character is the most important one anyway.
Quote:
<xsl:variable name="Remove">#xb7;â#128;#156;â#128;#153;â#128; â#128;¢â#128;#152;â#128;#147;â#128;¦</xsl:variable>
<xsl:value-of select="translate(current(),$Remove,'')"/>
In the output I still find the middle dot:
Quote:
·Verify
I ran od on part of the file and the bytes were c2 b7 (the UTF-8 encoding of the middle dot).

I am using the JAXP XSLT processor, and have an input of UTF-8 characters with an intended output of UTF-8.

Thank you,

Vinay Anantharaman

 
Old July 26th, 2007, 12:34 PM
mhkay
Wrox Author
 

>I am using the JAXP XSLT processor

JAXP is an interface, not a processor. There are several XSLT processors that implement the JAXP interface, including Xalan, Saxon, and Oracle.

translate() works in terms of Unicode codepoints. The encodings of your input and output files (and of your stylesheet) are irrelevant. The codepoint for middle dot is xB7, so you just need translate($input, '&#xb7;', '').

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
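
For readers who want to try the suggestion above, here is a minimal sketch that runs the translate() call through JAXP. It assumes the XSLT processor bundled with the JDK; the class name MiddleDotDemo and the method stripMiddleDot are hypothetical names chosen for illustration, and the input is wrapped in a made-up <p> element. The point it shows is that the character reference in the stylesheet is resolved by the XML parser to the single codepoint U+00B7, so translate() never deals with the c2 b7 bytes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class MiddleDotDemo {
    // The Java string holds the literal text "&#xb7;"; the XML parser that
    // reads the stylesheet turns it into the codepoint U+00B7.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select=\"translate(., '&#xb7;', '')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Removes every middle dot from plain text (no markup characters expected).
    public static String stripMiddleDot(String text) {
        try {
            // JAXP picks whichever XSLT processor is configured (assumption:
            // the JDK default is used here).
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            String xml = "<p>" + text + "</p>";
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stripMiddleDot("\u00b7Verify")); // prints: Verify
    }
}
```

Because the sketch reads both documents from strings, no byte encoding is involved at all. With files, the same stylesheet should behave identically as long as each file's declared encoding matches its actual bytes.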





Copyright (c) 2020 John Wiley & Sons, Inc.