Properly supporting Unicode in C++
I have a few questions for anyone who is experienced in writing C++ applications that support Unicode.
First, I generally try to derive my exception classes from std::exception, which has a virtual function what() that returns a char const*. Do you typically have to abandon this exception base class in order to have exceptions that carry Unicode strings?
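To make the first question concrete, here is a minimal sketch of the only workaround I can think of: keep std::exception as the base and store the message as UTF-8 bytes, so that what() can still return a char const*. The class name Utf8Error and the helper catch_message are hypothetical names of my own, and the assumption that every handler agrees to interpret the bytes as UTF-8 is mine; nothing in the standard enforces it.

```cpp
#include <exception>
#include <string>
#include <utility>

// Hypothetical exception type: the message is stored as UTF-8 bytes.
// Assumption: whoever calls what() treats the result as UTF-8.
class Utf8Error : public std::exception {
public:
    explicit Utf8Error(std::string utf8_message)
        : message_(std::move(utf8_message)) {}

    // Returns a pointer to the stored bytes; valid while the exception lives.
    const char* what() const noexcept override { return message_.c_str(); }

private:
    std::string message_;
};

// Throws and catches through the std::exception base class, returning
// the bytes that a generic handler would see.
std::string catch_message() {
    try {
        // "\xE6\x96\x87\xE4\xBB\xB6" is the UTF-8 encoding of U+6587 U+4EF6.
        throw Utf8Error("\xE6\x96\x87\xE4\xBB\xB6 not found");
    } catch (const std::exception& e) {
        return e.what();
    }
}
```

The bytes survive the trip through the base class unchanged, but displaying them correctly is still left to the handler, which is part of what I am unsure about.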
Second, the standard library (both C and C++) provides support for manipulating wide character strings, but neither seems to provide support for filenames specified using wide characters. If you had a file that had a Chinese name, is it possible to open this file using standard I/O routines?
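The closest thing I am aware of is std::filesystem::path, which can be constructed from a wide string and passed to the file streams. The sketch below assumes a C++17 toolchain; round_trip is a hypothetical helper of my own, and I do not know how reliably the wide-to-native conversion handles non-ASCII names on every platform.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

// Hypothetical helper: writes one line to a file in the temporary directory
// whose name is given as a wide string, reads it back, and removes the file.
// Assumption: C++17 or later, where file streams accept std::filesystem::path.
std::string round_trip(const std::wstring& wide_name, const std::string& text) {
    namespace fs = std::filesystem;
    const fs::path file = fs::temp_directory_path() / fs::path(wide_name);
    {
        std::ofstream out(file);   // opened through the path, not a char const*
        out << text << '\n';
    }
    std::string line;
    {
        std::ifstream in(file);
        std::getline(in, line);    // stays empty if the file could not be opened
    }
    fs::remove(file);
    return line;
}
```

Calling round_trip(L"\u6587\u4EF6.txt", "hello") is the case I actually care about. Whether it succeeds appears to depend on how the implementation converts the wide name to the native filename encoding.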
Lastly, how do people typically represent Unicode strings using the C++ data type wchar_t? On platforms where wchar_t is two bytes wide (its size is implementation-defined), does that mean UCS-2 or UTF-16 encoding is always used, or is it largely up to the programmer how to handle this? Do you ever need to come up with a scheme for specifying the character encoding of text files that are meant to be exchanged with other programs?
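Part of my confusion about wchar_t is that its size differs between compilers, so the same wide literal may need a different number of code units. Here is a small sketch of how I would check this; the function names are hypothetical, and the comments reflect my understanding rather than anything I have verified on every platform.

```cpp
#include <cstddef>
#include <string>

// Size of wchar_t in bytes on the current platform (implementation-defined).
std::size_t wchar_bytes() {
    return sizeof(wchar_t);
}

// Number of wchar_t code units the compiler uses for U+1F600, a code point
// outside the Basic Multilingual Plane. As I understand it, a 2-byte wchar_t
// needs a UTF-16 surrogate pair, while a 4-byte wchar_t needs a single unit.
std::size_t units_for_non_bmp() {
    const std::wstring s = L"\U0001F600";
    return s.size();
}

// Number of code units for U+6587, which lies inside the BMP and should
// take one unit either way.
std::size_t units_for_bmp() {
    const std::wstring s = L"\u6587";
    return s.size();
}
```

If units_for_non_bmp() returns 2 on one compiler and 1 on another, then code that indexes a std::wstring by position cannot treat each element as a whole character everywhere, which is why I am asking how people handle this in practice.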
Are there any good books recommended for learning to support Unicode in a C++ application? I have a general familiarity with the concepts behind Unicode and character encodings. What I'm really looking for is something that explains how to apply it to C++'s wide-character and internationalization support.
Thanks in advance for any advice.
Regards,
Jake.