|
Subject:
|
Searching for body element in an html not working
|
|
Posted By:
|
aldwinenriquez
|
Post Date:
|
8/26/2008 12:40:17 AM
|
It seems that SelectSingleNode does not work well with HTML documents loaded as XML.
XmlDocument doc = new XmlDocument();
doc.Load(@"c:\temp\layout.html"); // load html as xml doc
// add namespace manager
XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
man.AddNamespace(string.Empty, "http://wwww.w3.org/1999/xhtml");
XmlNode body = doc.DocumentElement.SelectSingleNode("//body", man); // also tried without the namespace manager, but that didn't work either
if (body != null) Console.WriteLine(body.OuterXml);
However when I do doc.DocumentElement["body"], it gives me the node. What am I missing here?
Below is the HTML document:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
  <title>Untitled Page</title>
  <link href="workflow.css" rel="stylesheet" type="text/css" />
</head>
<body>
  <div style="height: 100%; width: 100%; overflow: auto; text-align: center; padding-left: 20px;">
    <br />
    <table class="workflow_container">
      <tr> <td> <div id="dvDevelop" class="workflow_active"> </div> </td> </tr>
      <tr> <td> <div class="down_arrow"> </div> </td> </tr>
      <tr> <td> <div id="dvReview" class="workflow_disable"> </div> </td> </tr>
    </table>
    <div class="right_arrow"> </div>
    <div id="dvContentQC" class="workflow_next"> </div>
    <div class="right_arrow"> </div>
    <div id="dvPublish" class="workflow_disable"> </div>
  </div>
  <p> </p> <p> </p> <p> </p> <p> </p>
  <div class="workflow_disable"> Approve
    <table>
      <tr> <td> 3<br /> APP-00-102-11 <br /> Apr 12, 2007 </td> </tr>
    </table>
  </div>
</body>
</html>
|
|
Reply By:
|
samjudson
|
Reply Date:
|
8/26/2008 2:05:56 AM
|
1) I assume that the error above is in the URL-checking code on the forum, but the XHTML namespace is http://www.w3.org/1999/xhtml.
2) Have you tried adding the namespace with a prefix (such as xhtml) and then using "//xhtml:body"?
/- Sam Judson : Wrox Technical Editor -/
|
|
Reply By:
|
aldwinenriquez
|
Reply Date:
|
8/26/2008 6:48:50 PM
|
XmlNode body = doc.DocumentElement.SelectSingleNode("//body", man);
This always returns null.
"//xhtml:body" does not work either.
"Don't you ever give up!"
|
|
Reply By:
|
joefawcett
|
Reply Date:
|
8/27/2008 1:57:01 AM
|
Show the relevant code where you read the document and set up the NamespaceManager.
--
Joe (Microsoft MVP - XML)
|
|
Reply By:
|
aldwinenriquez
|
Reply Date:
|
8/27/2008 6:38:55 PM
|
XmlDocument doc = new XmlDocument();
doc.Load(@"c:\temp\layout.html"); // load html as xml doc
// add namespace manager
XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
man.AddNamespace(string.Empty, "http://wwww.w3.org/1999/xhtml");
XmlNode body = doc.DocumentElement.SelectSingleNode("//body", man); // this is where I am searching for the node
if (body != null) Console.WriteLine(body.OuterXml);
The HTML document is available in the first post as inline text.
"Don't you ever give up!"
|
|
Reply By:
|
joefawcett
|
Reply Date:
|
8/28/2008 2:29:19 AM
|
Well, you don't assign a prefix; string.Empty cannot be used for this. You need to register a prefix such as "xhtml" and use "//xhtml:body" as your XPath. If that doesn't work, then your HTML is not XHTML.
--
Joe (Microsoft MVP - XML)
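For readers following along, here is a minimal sketch of the prefix approach suggested above. It is an illustration, not the original poster's code: the inline sample markup, the class name PrefixedXPathSketch and the helper FindBodyName are made up for this example. The sample declares the XHTML namespace explicitly on the root element, because the thread does not establish whether the elements of the posted file actually end up in that namespace after loading.

```csharp
using System;
using System.Xml;

class PrefixedXPathSketch
{
    // Hypothetical helper: returns the name of the matched element, or null.
    public static string FindBodyName()
    {
        // Assumed inline sample; the root declares the XHTML namespace.
        const string xhtml =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
            "<head><title>Untitled Page</title></head>" +
            "<body><p>Approve</p></body></html>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xhtml);

        // Bind a non-empty prefix to the namespace URI. The prefix only has to
        // agree between the manager and the XPath; the document need not use it.
        XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
        man.AddNamespace("xhtml", "http://www.w3.org/1999/xhtml");

        XmlNode body = doc.DocumentElement.SelectSingleNode("//xhtml:body", man);
        return body != null ? body.Name : null;
    }

    static void Main()
    {
        Console.WriteLine(FindBodyName() ?? "not found"); // prints "body"
    }
}
```

The URI passed to AddNamespace has to match the document's namespace character for character, so a misspelled host name in it would make the same query return null.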
|
|
Reply By:
|
samjudson
|
Reply Date:
|
8/28/2008 3:26:27 AM
|
Joe, according to the docs String.Empty can be used to define the default namespace. Should this not work then?
http://msdn.microsoft.com/en-us/library/system.xml.xmlnamespacemanager.addnamespace.aspx
/- Sam Judson : Wrox Technical Editor -/
|
|
Reply By:
|
mhkay
|
Reply Date:
|
8/28/2008 3:46:12 AM
|
In XPath 1.0 "//body" means "find body elements in no namespace", not "find body elements in the default namespace". So setting the default namespace should make no difference.
Michael Kay http://www.saxonica.com/ Author, XSLT 2.0 and XPath 2.0 Programmer's Reference
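The point about unprefixed names can be checked with a small sketch. The class name DefaultNamespaceSketch, the helper Matches and the two tiny sample documents are assumptions made for this illustration; only XmlDocument, XmlNamespaceManager and SelectSingleNode come from the thread.

```csharp
using System;
using System.Xml;

class DefaultNamespaceSketch
{
    // Hypothetical helper: true when the XPath selects at least one node.
    public static bool Matches(string xml, string xpath, string prefix, string uri)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
        man.AddNamespace(prefix, uri);
        return doc.SelectSingleNode(xpath, man) != null;
    }

    static void Main()
    {
        const string ns = "http://www.w3.org/1999/xhtml";
        string namespaced = "<html xmlns=\"" + ns + "\"><body/></html>";
        const string plain = "<html><body/></html>";

        // Registering the URI as the default namespace does not affect "//body":
        // the unprefixed name still means "body in no namespace".
        Console.WriteLine(Matches(namespaced, "//body", string.Empty, ns)); // False

        // The same document matches once a prefix is bound and used.
        Console.WriteLine(Matches(namespaced, "//x:body", "x", ns)); // True

        // A document whose elements are in no namespace matches "//body" directly.
        Console.WriteLine(Matches(plain, "//body", "x", ns)); // True
    }
}
```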
|
|
Reply By:
|
joefawcett
|
Reply Date:
|
8/28/2008 5:56:20 AM
|
quote: Originally posted by samjudson
Joe, according to the docs String.Empty can be used to define the default namespace. Should this not work then?
http://msdn.microsoft.com/en-us/library/system.xml.xmlnamespacemanager.addnamespace.aspx
/- Sam Judson : Wrox Technical Editor -/
It can be used to define the default namespace, but that doesn't help you use it in XPath, so I'm not sure what use that is.
--
Joe (Microsoft MVP - XML)
|