Wrox Programmer Forums
C# Programming questions specific to the Microsoft C# language. See also the forum Beginning Visual C# to discuss that specific Wrox book and code.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the C# section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old May 9th, 2006, 06:28 PM
Registered User
 
Join Date: May 2006
Posts: 1
comparing HTMLDocuments

Hi everybody!

I want to compare two HTML files. I don't want to compare them character by character, and I don't care about attributes. For example:

------- text 1 ---------
<P align="justify">
    something
</p><SPAN style="color: Red;">buhuhu
</sPAn>
---- end of text 1 ----

------- text 2 ---------


<p>
    something
</p>


<span style="color: Red;">buhuhu</span>
---- end of text 2 ----

For my purposes these two files are equal. I want to use the HTMLDocument class.

So I have two HTMLDocuments and I want to check whether their trees are equal (ignoring attributes).

I don't know how to write a function like
bool compare(HTMLDocument, HTMLDocument)

Do you have any ideas?

Thanks for your help.

 
Old May 11th, 2006, 03:26 PM
Authorized User
 
Join Date: May 2006
Posts: 25

If you can be sure that each HTML document is well-formed XML (every tag closed, tag names matching in case, attribute values quoted), you could use XmlTextReader, which makes this a fairly simple task:

Given:

Doc 1:
<html>
<body>
    <p>
    something
    </p>
    <span style='color: Red;'>
        buhuhu
    </span>
</body>
</html>

Doc 2:
<html>
<body>
    <p>something else</p><span>buhuhu else</span>
</body>
</html>


Try this little example in an aspx:

        XmlTextReader r1 = new XmlTextReader(Server.MapPath(<< doc 1 path >>));
        XmlTextReader r2 = new XmlTextReader(Server.MapPath(<< doc 2 path >>));

        Response.Write("<table border='1'>");
        Response.Write("<tr><th>Node Type</th><th>Node Name</th><th>Node Value</th></tr>");
        Response.Write("<tr><td colspan='3'>Doc 1</td></tr>");
        while (r1.Read())
            Response.Write("<tr><td>" + r1.NodeType + "</td><td>" + r1.Name + "</td><td>" + r1.Value + "</td></tr>");

        Response.Write ("<tr><td colspan='3'>Doc 2</td></tr>");
        while (r2.Read())
            Response.Write("<tr><td>" + r2.NodeType + "</td><td>" + r2.Name + "</td><td>" + r2.Value + "</td></tr>");
        Response.Write("</table>");

        r1.Close(); r1 = null;
        r2.Close(); r2 = null;
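
Whether a file counts as "well-formed" for this purpose can be checked up front by reading it to the end and catching XmlException. The following is only a sketch: the class and method names (WellFormedCheck, IsWellFormedXml) are made up for illustration, and it reads from a string rather than from Server.MapPath so that it runs as a console program.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WellFormedCheck
{
    // Hypothetical helper: returns true if the markup can be read to the end
    // as XML, false if the reader reports an XmlException along the way.
    public static bool IsWellFormedXml(string markup)
    {
        try
        {
            using (XmlTextReader reader = new XmlTextReader(new StringReader(markup)))
            {
                // Read() throws XmlException at the first well-formedness error.
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Matching tags in matching case: readable as XML.
        Console.WriteLine(IsWellFormedXml("<html><body><p>something</p></body></html>")); // prints True

        // XML names are case-sensitive, so <P> closed by </p> is rejected.
        Console.WriteLine(IsWellFormedXml("<P>something</p>")); // prints False
    }
}
```

Note that markup like the first sample in the original question (<P> closed by </p>, <SPAN> closed by </sPAn>) fails this check, so it would have to be cleaned up before an XML reader can be used on it.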


Brandon
 
Old May 24th, 2006, 07:08 AM
Friend of Wrox
 
Join Date: May 2006
Posts: 106

Hi all. Brandon, your code is fine, but how will it compare the two documents?

Bijgupt
 
Old May 24th, 2006, 05:29 PM
Authorized User
 
Join Date: May 2006
Posts: 25

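    XmlTextReader r1 = new XmlTextReader(Server.MapPath(<< doc 1 path >>));
    XmlTextReader r2 = new XmlTextReader(Server.MapPath(<< doc 2 path >>));

    // Skip whitespace-only nodes so that indentation and blank lines are ignored.
    r1.WhitespaceHandling = WhitespaceHandling.None;
    r2.WhitespaceHandling = WhitespaceHandling.None;

    bool files_equal = true;
    bool done = false;
    while (!done)
    {
        // Read from both files before testing, so that neither Read() call
        // is short-circuited away when the other file has already ended.
        bool more1 = r1.Read();
        bool more2 = r2.Read();

        if (more1 && more2)
        {
            // Check that nodes are of the same depth, nodetype, name, valuetype and value.
            // All things being equal, the nodes should always be of the same depth since
            // we are sequentially looping through every node in each file at the same time.
            // However, it doesn't hurt to verify...

            // There are other properties you could check. This assumes that you don't care
            // about any attributes on any node. Values are trimmed so that leading and
            // trailing whitespace inside an element does not count as a difference.
            files_equal = (r1.Depth.Equals(r2.Depth)
                        && r1.NodeType.Equals(r2.NodeType)
                        && r1.Name.Equals(r2.Name)
                        && r1.ValueType.Equals(r2.ValueType)
                        && r1.Value.Trim().Equals(r2.Value.Trim()));

            if (!files_equal)
            {
                // No need to keep checking if we find even one inequality.
                done = true;
            }
        }
        else
        {
            // If we reach the end of either file, then we are done. The files are
            // equal only if both of them ended on the same pass.
            files_equal = (more1 == more2);
            done = true;
        }
    }

    r1.Close(); r1 = null;
    r2.Close(); r2 = null;

    // The conditional needs its own parentheses because + binds tighter than ?:.
    Response.Write("Files are " + (files_equal ? "equal." : "not equal."));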
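
The same idea can be packaged as the bool compare(...) function the original question asked for. This is a rough sketch under some assumptions, not a definitive implementation: it takes the markup as strings instead of HTMLDocument objects, the names MarkupComparer and Compare are made up for illustration, and it still requires both inputs to be well-formed XML (otherwise Read() throws XmlException).

```csharp
using System;
using System.IO;
using System.Xml;

public static class MarkupComparer
{
    // Hypothetical helper: true when both documents yield the same sequence of
    // nodes (depth, type, name, trimmed value). Attributes are never visited by
    // Read(), so they are ignored. Whitespace-only nodes are skipped.
    public static bool Compare(string markup1, string markup2)
    {
        using (XmlTextReader r1 = new XmlTextReader(new StringReader(markup1)))
        using (XmlTextReader r2 = new XmlTextReader(new StringReader(markup2)))
        {
            r1.WhitespaceHandling = WhitespaceHandling.None;
            r2.WhitespaceHandling = WhitespaceHandling.None;

            while (true)
            {
                // Advance both readers before testing either result.
                bool more1 = r1.Read();
                bool more2 = r2.Read();

                // At the end of either input: equal only if both ended together.
                if (!more1 || !more2)
                    return more1 == more2;

                if (r1.Depth != r2.Depth
                    || r1.NodeType != r2.NodeType
                    || r1.Name != r2.Name
                    || r1.Value.Trim() != r2.Value.Trim())
                    return false;
            }
        }
    }

    public static void Main()
    {
        string a = "<html><body><p align='justify'>\n  something\n</p>"
                 + "<span style='color: Red;'>buhuhu\n</span></body></html>";
        string b = "<html>\n<body>\n<p>something</p>\n\n<span>buhuhu</span></body></html>";
        string c = "<html><body><p>something else</p><span>buhuhu else</span></body></html>";

        Console.WriteLine(Compare(a, b)); // prints True: only attributes and whitespace differ
        Console.WriteLine(Compare(a, c)); // prints False: the text content differs
    }
}
```

One limitation of this sketch: an empty element written as <br/> and one written as <br></br> produce different node sequences, so they compare as different.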


Brandon
 
Old May 24th, 2006, 05:30 PM
Authorized User
 
Join Date: May 2006
Posts: 25

I forgot to say: That code was an example off the top of my head. I haven't actually tested it.

Brandon
 
Old May 25th, 2006, 08:00 AM
Friend of Wrox
 
Join Date: May 2006
Posts: 106

Nice piece of code. It works. Thanks.

Bijgupt










Copyright (c) 2020 John Wiley & Sons, Inc.