Wrox Programmer Forums

You are currently viewing the Beginning PHP section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old March 22nd, 2010, 10:13 AM
SimpleXMLElement and encoding

I used the following code to parse RSS documents:
Code:
		$page = file_get_contents($rss);
		$feed = new SimpleXMLElement($page);
But it failed to parse some BIG5-encoded RSS feeds and gave this warning:
Code:
Warning: SimpleXMLElement::__construct(): input conversion failed due to input error
Here is one example document:
http://rss.chinatimes.com/rss/latestnews.rss

The issue is intermittent. Being an RSS document, the content at the above link changes over time, and SimpleXMLElement parses some instances without any problem but fails on others.
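Since only some instances of the feed fail, it may help to capture libxml's own error messages and see which line of the input is rejected. The following is a minimal sketch, not code from the original setup: the helper name `parse_feed_with_errors` is hypothetical, and it assumes PHP 7.1 or later for the type declarations and array destructuring.

```php
<?php
// Hypothetical helper (not from the original post): parse XML and collect
// libxml's error messages instead of letting them surface as PHP warnings.
function parse_feed_with_errors(string $xml): array
{
    // Buffer libxml errors internally; remember the previous setting.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    // simplexml_load_string() returns false when the document cannot be parsed.
    $feed = simplexml_load_string($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Leave libxml in the state we found it.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$feed, $messages];
}
```

Calling `[$feed, $errors] = parse_feed_with_errors($page);` and printing `$errors` when `$feed === false` should show the line number libxml reports, which can then be compared against the raw bytes of the failing feed instance.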
 
Old March 22nd, 2010, 11:55 PM

Apparently PHP has more limited encoding support than some other languages. Besides the BIG5-encoded document mentioned in my original post, SimpleXMLElement also had trouble with a gb2312-encoded document. However, I found a workaround for the gb2312 document, presumably effective because GBK is a superset of GB2312: apply the following pre-processing step before passing the document to SimpleXMLElement:

Code:
$page = str_replace('encoding="gb2312"', 'encoding="GBK"', $page);
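A more general variant of the same idea is to transcode the document to UTF-8 before parsing, so that the parser never has to convert the bytes itself. The sketch below is an assumption-laden illustration, not a confirmed fix for the BIG5 feed: the helper name `load_feed_as_utf8` is hypothetical, it requires the mbstring extension, and it assumes mbstring recognizes the encoding name found in the XML declaration (which names are accepted depends on the PHP build). Bytes that cannot be mapped are replaced with mbstring's substitute character rather than aborting the parse.

```php
<?php
// Hypothetical helper (not from the original post): transcode a feed to UTF-8
// ourselves, then hand SimpleXMLElement a document that declares UTF-8.
function load_feed_as_utf8(string $page): SimpleXMLElement
{
    // Read the encoding named in the XML declaration; assume UTF-8 if absent.
    $declared = 'UTF-8';
    if (preg_match('/^\s*<\?xml[^>]*\bencoding=["\']([\w.-]+)["\']/i', $page, $m)) {
        $declared = $m[1];
    }

    if (strcasecmp($declared, 'UTF-8') !== 0) {
        // Convert the raw bytes. Assumes mbstring knows the declared name.
        $page = mb_convert_encoding($page, 'UTF-8', $declared);

        // Rewrite the declaration so it matches the converted bytes.
        $page = preg_replace(
            '/(<\?xml[^>]*\bencoding=)(["\'])[\w.-]+\2/i',
            '${1}${2}UTF-8${2}',
            $page,
            1
        );
    }

    // Throws an Exception if the document is still not well-formed.
    return new SimpleXMLElement($page);
}
```

With this in place, the original two lines would become `$feed = load_feed_as_utf8(file_get_contents($rss));`. Unlike the `str_replace` workaround, this does not depend on knowing in advance which encoding label a particular feed uses.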




