Wrox Programmer Forums

#1
August 26th, 2008, 12:40 AM
aldwinenriquez (Authorized User)
Searching for body element in an html not working

It seems that SelectSingleNode does not work well with HTML documents loaded as XML.

XmlDocument doc = new XmlDocument();
doc.Load(@"c:\temp\layout.html"); // load the HTML file as an XML document

// add namespace manager
XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
man.AddNamespace(string.Empty, "http://wwww.w3.org/1999/xhtml");

// also tried without the namespace manager, but that didn't work either
XmlNode body = doc.DocumentElement.SelectSingleNode("//body", man);
if (body != null)
    Console.WriteLine(body.OuterXml);

However when I do doc.DocumentElement["body"], it gives me the node.
What am I missing here?


Below is the HTML document:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
    <title>Untitled Page</title>
    <link href="workflow.css" rel="stylesheet" type="text/css" />
</head>
<body>
        <div style="height: 100%; width: 100%; overflow: auto; text-align: center; padding-left: 20px;">
        <br />
        <table class="workflow_container">
            <tr>
                <td>
                    <div id="dvDevelop" class="workflow_active">


                    </div>
                </td>
            </tr>
            <tr>
                <td>
                    <div class="down_arrow">
                        &nbsp;
                    </div>
                </td>
            </tr>
            <tr>
                <td>
                    <div id="dvReview" class="workflow_disable">

                        </div>
                </td>
            </tr>
        </table>
        <div class="right_arrow">
            &nbsp;
        </div>
        <div id="dvContentQC" class="workflow_next">

            </div>
        <div class="right_arrow">
            &nbsp;
        </div>
        <div id="dvPublish" class="workflow_disable">

            </div>
    </div>
<p>
    &nbsp;</p>
<p>
    &nbsp;</p>
<p>
    &nbsp;</p>
<p>
    &nbsp;</p>
                    <div class="workflow_disable">
                        Approve
                        <table>
                            <tr>
                                <td>
                                    3<br />
                                    APP-00-102-11
                                    <br />
                                    Apr 12, 2007
                                </td>
                            </tr>
                        </table>
                    </div>
                </body>
</html>
"Dont you ever give up!"

#2
August 26th, 2008, 02:05 AM
samjudson (Friend of Wrox)

1) I assume the error above comes from the URL-checking code on the forum, but note that the XHTML namespace is http://www.w3.org/1999/xhtml (three w's, not four).

2) Have you tried adding the namespace with a prefix (such as xhtml) and then using "//xhtml:body"?
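
A minimal sketch of the prefixed lookup from point 2. It assumes a small stand-in document that declares the XHTML namespace explicitly, so nothing has to be read from disk or downloaded; the class name PrefixedXPathDemo and the sample markup are illustrative and not taken from the original file.

```csharp
using System;
using System.Xml;

public static class PrefixedXPathDemo
{
    // Stand-in document (hypothetical, not the poster's layout.html).
    public const string Xhtml =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
        "<head><title>Untitled Page</title></head>" +
        "<body><div id=\"dvDevelop\" /></body></html>";

    // Registers the prefix "xhtml" for the XHTML namespace and uses it in the XPath.
    public static XmlNode FindBody(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
        man.AddNamespace("xhtml", "http://www.w3.org/1999/xhtml");

        return doc.SelectSingleNode("//xhtml:body", man);
    }

    public static void Main()
    {
        XmlNode body = FindBody(Xhtml);
        Console.WriteLine(body != null ? body.OuterXml : "body not found");
    }
}
```

The prefix used in the XPath only has to match the one registered with the manager; it does not need to appear anywhere in the document itself.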

/- Sam Judson : Wrox Technical Editor -/

#3
August 26th, 2008, 06:48 PM
aldwinenriquez (Authorized User)

XmlNode body = doc.DocumentElement.SelectSingleNode("//body",man);
This always returns null.

"//xhtml:body" does not work either.


"Dont you ever give up!"

#4
August 27th, 2008, 01:57 AM
joefawcett (Wrox Author)

Show the relevant code where you read the document and set up the NamespaceManager.

--

Joe (Microsoft MVP - XML)

#5
August 27th, 2008, 06:38 PM
aldwinenriquez (Authorized User)

XmlDocument doc = new XmlDocument();
doc.Load(@"c:\temp\layout.html"); // load the HTML file as an XML document

// add namespace manager
XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
man.AddNamespace(string.Empty, "http://wwww.w3.org/1999/xhtml");

// this is where I am searching for the node
XmlNode body = doc.DocumentElement.SelectSingleNode("//body", man);
if (body != null)
    Console.WriteLine(body.OuterXml);

The HTML document is included inline in the first post.


"Dont you ever give up!"

#6
August 28th, 2008, 02:29 AM
joefawcett (Wrox Author)

Well, you don't assign a prefix; string.Empty cannot be used. You need to register something like "xhtml" as the prefix and use "//xhtml:body" as your XPath.
If that doesn't work, then your HTML is not XHTML.
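
One way to check whether the loaded document really is in the XHTML namespace is to look at NamespaceURI on the root element before building the query. The sketch below is an assumption-laden illustration, not code from the thread: the class NamespaceCheckDemo, the arbitrary prefix "x", and the two tiny sample documents are made up, and it assumes body sits in the same namespace as the root element.

```csharp
using System;
using System.Xml;

public static class NamespaceCheckDemo
{
    // Finds the first body element, registering a prefix only when the
    // root element actually lives in a namespace.
    public static XmlNode FindBody(XmlDocument doc)
    {
        string ns = doc.DocumentElement.NamespaceURI;
        if (ns.Length == 0)
        {
            // Un-namespaced markup: an unprefixed name matches.
            return doc.SelectSingleNode("//body");
        }

        XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
        man.AddNamespace("x", ns); // "x" is an arbitrary prefix
        return doc.SelectSingleNode("//x:body", man);
    }

    public static XmlDocument Parse(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }

    public static void Main()
    {
        XmlDocument plain = Parse("<html><body /></html>");
        XmlDocument xhtml = Parse(
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body /></html>");

        Console.WriteLine("plain root namespace: '" + plain.DocumentElement.NamespaceURI + "'");
        Console.WriteLine("xhtml root namespace: '" + xhtml.DocumentElement.NamespaceURI + "'");
        Console.WriteLine(FindBody(plain) != null);
        Console.WriteLine(FindBody(xhtml) != null);
    }
}
```

For the file in the first post, note that the XHTML 1.0 Transitional DTD named in the DOCTYPE declares a fixed default xmlns attribute on html, so a parser that reads the DTD can place the elements in the XHTML namespace even though the markup carries no xmlns attribute. Printing NamespaceURI shows whether that happened.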

--

Joe (Microsoft MVP - XML)

#7
August 28th, 2008, 03:26 AM
samjudson (Friend of Wrox)

Joe, according to the docs String.Empty can be used to define the default namespace. Should this not work then?

http://msdn.microsoft.com/en-us/libr...namespace.aspx

/- Sam Judson : Wrox Technical Editor -/

#8
August 28th, 2008, 03:46 AM
mhkay (Wrox Author)

In XPath 1.0 "//body" means "find body elements in no namespace", not "find body elements in the default namespace". So setting the default namespace should make no difference.
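
A small sketch of that behaviour in .NET, assuming two made-up sample documents (the class UnprefixedNameDemo and the markup are illustrative only). The XHTML namespace is registered under the empty prefix, exactly as in the first post, and the unprefixed query still only matches elements in no namespace.

```csharp
using System;
using System.Xml;

public static class UnprefixedNameDemo
{
    public const string XhtmlNs = "http://www.w3.org/1999/xhtml";

    // Runs the unprefixed query "//body" with the XHTML namespace registered
    // as the manager's default (empty-prefix) namespace.
    public static XmlNode SelectUnprefixedBody(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNamespaceManager man = new XmlNamespaceManager(doc.NameTable);
        man.AddNamespace(string.Empty, XhtmlNs);

        return doc.SelectSingleNode("//body", man);
    }

    public static void Main()
    {
        string namespaced = "<html xmlns=\"" + XhtmlNs + "\"><body /></html>";
        string plain = "<html><body /></html>";

        // The namespaced body is not matched by the unprefixed name.
        Console.WriteLine(SelectUnprefixedBody(namespaced) == null);
        // The un-namespaced body is matched, default namespace or not.
        Console.WriteLine(SelectUnprefixedBody(plain) != null);
    }
}
```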

Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference

#9
August 28th, 2008, 05:56 AM
joefawcett (Wrox Author)

Quote (originally posted by samjudson):
 Joe, according to the docs String.Empty can be used to define the default namespace. Should this not work then?

 http://msdn.microsoft.com/en-us/libr...namespace.aspx

 /- Sam Judson : Wrox Technical Editor -/

It can be used to define the default namespace, but that doesn't help you use it in XPath, so I'm not sure what use that is.
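
For completeness, a sketch of two lookups that avoid registering a prefix altogether. Neither was proposed in this thread, so treat them as alternatives rather than the recommended fix: an XPath test on local-name(), and the DOM indexer that takes a local name plus a namespace URI. The class PrefixFreeLookupDemo and the sample markup are hypothetical.

```csharp
using System;
using System.Xml;

public static class PrefixFreeLookupDemo
{
    public const string XhtmlNs = "http://www.w3.org/1999/xhtml";

    public static XmlDocument Parse(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }

    // XPath test on the local name only; matches body in any namespace.
    public static XmlNode ByLocalName(XmlDocument doc)
    {
        return doc.SelectSingleNode("//*[local-name()='body']");
    }

    // DOM indexer taking a local name and a namespace URI; it only looks
    // at direct children of the root element.
    public static XmlNode ByIndexer(XmlDocument doc)
    {
        return doc.DocumentElement["body", XhtmlNs];
    }

    public static void Main()
    {
        XmlDocument doc = Parse(
            "<html xmlns=\"" + XhtmlNs + "\"><head /><body><p>text</p></body></html>");

        Console.WriteLine(ByLocalName(doc).OuterXml);
        Console.WriteLine(ByIndexer(doc).OuterXml);
    }
}
```

The local-name() form is namespace-blind, which is convenient for mixed input but can also match same-named elements from unrelated vocabularies.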

--

Joe (Microsoft MVP - XML)
All times are GMT -4.
© 2013 John Wiley & Sons, Inc.