Wrox Programmer Forums

#1 (permalink)
June 30th, 2004, 07:20 AM
vijaya_murali (Registered User)
Reading html file from website

Hi,
I have a file containing the paths of around 500 HTML files. My program should fetch each HTML file from the website specified in its path and give me the page source as text.
I have to automate this process, which involves comparing some specific tags between the live site's page and our working version of the corresponding file.

I need to compare all 500 files (from the website and from the working directory) and run the comparison for the specific case we encounter.

Please help me.

#2 (permalink)
July 26th, 2004, 09:33 AM
Friend of Wrox

Hi Vijaya,
Normally I'd approach this with (*nix) shell scripting (wget and diff spring to mind). Better still, use proper version control such as CVS.

If you need to use Java, here's a possible approach.

First, make a class that reads a URL into an array. This one is slightly complex, and you'd probably be better off using a Vector rather than a String array so you don't need all the array copying (it would also have been better to use System.arraycopy(), but this is kinda old code):
Code:
import java.net.*;
import java.util.*;
import java.io.*;

/**
 *@author         Charlie
 *@created        06 September 2001
 *@description    Read a file from a URL, and return it as an array
 */
public class FileToArray {
    String[]  linesToReturn;
    String    urlString      = "";

    public FileToArray(URL textURL) {
        urlString = textURL.toString(); // remember which URL was read
        linesToReturn = new String[20];
        String[]  swap;
        String    line;
        try {
            BufferedReader  theStream  = new BufferedReader(new InputStreamReader(textURL.openStream()));
            int             index      = 0;
            while ((line = theStream.readLine()) != null) {
                if (index == linesToReturn.length) {//array needs to grow
                    swap = linesToReturn;
                    linesToReturn = new String[linesToReturn.length * 2];
                    for (int i = 0; i < swap.length; i++) {
                        linesToReturn[i] = swap[i];
                    }
                }
                linesToReturn[index] = line;
                index++;
            }
            theStream.close();
        } catch (Exception e) {
            linesToReturn[0] = "File i/o error\n" + e.toString();
        }
        //Shrink the array down
        int       howManyNonNulls  = 0;
        swap = linesToReturn;
        for (int x = 0; x < swap.length; x++) {
            if (linesToReturn[x] != null)
                howManyNonNulls++;
        }
        linesToReturn = new String[howManyNonNulls];
        for (int y = 0; y < linesToReturn.length; y++) {
            linesToReturn[y] = swap[y];
        }
    }

    public String[] getLines() {
        return linesToReturn;
    }
}
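The Vector suggestion above can be sketched as follows. This is only an illustration, not the original author's code: it swaps in ArrayList (the unsynchronized relative of Vector), and the names UrlLines and readLines are made up for the example. It also passes I/O errors to the caller rather than storing them in the result.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical list-based alternative to FileToArray. */
public class UrlLines {

    /** Reads every line from the reader into a list that grows on its own. */
    public static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader in = new BufferedReader(source);
        try {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            in.close(); // release the stream even if a read fails
        }
        return lines;
    }

    /** Opens the URL and reads it line by line; I/O errors go to the caller. */
    public static List<String> readLines(URL url) throws IOException {
        return readLines(new InputStreamReader(url.openStream()));
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for a real page so the demo needs no network.
        List<String> lines = readLines(new StringReader("<html>\n<title>Demo</title>\n</html>"));
        System.out.println(lines.size() + " lines read");
    }
}
```

Because the list resizes itself, the grow and shrink loops disappear entirely; call lines.toArray(new String[0]) if a String[] is still wanted.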
OK, now that you've abstracted the reading into an array, all you need to do is:
- read the list of URLs from a file into a String[],
- compare each line of the String[] read from a URL with the corresponding line of the matching file on your local filesystem

I'll let you figure out how to compare the arrays (it's a tricky one, and you may want to use regexps). This version just prints out the contents of each URL listed in the file d:/jclasses/url-list.txt.
Code:
import java.net.URL;

public class readLinks {
    public static void main(String[] argv) {
        try {
            // A local file URL needs an empty host, hence the three slashes
            FileToArray  linkFile  = new FileToArray(new URL("file:///d:/jclasses/url-list.txt"));
            String[]     urls      = linkFile.getLines();
            for (int i = 0; i < urls.length; i++) {
                FileToArray  url    = new FileToArray(new URL(urls[i]));
                String[]     lines  = url.getLines();
                for (int j = 0; j < lines.length; j++)
                    System.out.println(lines[j]);

            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}
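For the comparison step, here is one possible sketch using regexps. It is an assumption about what "comparing specific tags" might mean: it pulls the contents of one named tag out of each page and checks whether the two lists match. The class TagCompare, its methods, and the sample pages are all hypothetical, and a regexp like this will not cope with nested tags of the same name or with tags inside comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for the comparison step; tag name and inputs are illustrative. */
public class TagCompare {

    /** Collects the contents of every <tag ...>...</tag> element, in document order. */
    public static List<String> extractTag(String[] lines, String tag) {
        // Join the lines so an element may span several of them.
        StringBuilder page = new StringBuilder();
        for (String line : lines) {
            page.append(line).append('\n');
        }
        // DOTALL lets '.' cross line breaks; the reluctant .*? stops at the first closing tag.
        String name = Pattern.quote(tag);
        Pattern p = Pattern.compile("<" + name + "\\b[^>]*>(.*?)</" + name + ">",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = p.matcher(page);
        List<String> found = new ArrayList<String>();
        while (m.find()) {
            found.add(m.group(1).trim());
        }
        return found;
    }

    /** True when both pages carry the same contents for the given tag. */
    public static boolean sameTag(String[] live, String[] working, String tag) {
        return extractTag(live, tag).equals(extractTag(working, tag));
    }

    public static void main(String[] args) {
        String[] live = {"<html><head>", "<title>Home</title>", "</head></html>"};
        String[] working = {"<html><head><title>Home page</title></head></html>"};
        System.out.println("title matches: " + sameTag(live, working, "title"));
    }
}
```

The String[] arguments are the same shape that getLines() returns, so the live page and the working copy (read through a file: URL) could both be passed in directly.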
HTH charlie

--
Don't Stand on your head - you'll get footprints in your hair. http://charlieharvey.com