Wrox Programmer Forums
You are currently viewing the Java GUI section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
Old June 30th, 2004, 07:20 AM
Registered User
Join Date: Jun 2004
Posts: 1
Reading html file from website

My requirement is this:
I have the paths of around 500 HTML files listed in a file. My program should fetch each HTML file from the website (specified in the path) and give me its source as text.
I have to automate this process, which involves comparing some specific tags between the live site's page and our working version of the corresponding file.

I need to compare all 500 files (from the website and the working directory) and do the comparison for the specific cases we encounter.

Please help me.

Old July 26th, 2004, 09:33 AM
Friend of Wrox
Join Date: Dec 2003
Posts: 488

Hi Vijaya,
The way I'd approach this would normally be to use (*nix) shell scripting (diff && wget spring to mind). Or even better use proper version control like CVS.

If you need to use Java, here's a possible approach.

First, make a class that reads a URL into an array. This one is slightly complex, and you'd probably be better off using a Vector rather than a String array so you don't need all the array copying (actually it would have been better to use System.arraycopy(), but this is kinda old code):
import java.net.*;
import java.util.*;
import java.io.*;

/**
 *@author         Charlie
 *@created        06 September 2001
 *@description    Read a file from a URL, and return it as an array
 */
public class FileToArray {
    String[]  linesToReturn;
    String    urlString      = new String();

    public FileToArray(URL textURL) {
        // Assign the field (a local declaration here would shadow it)
        urlString = textURL.toString();
        linesToReturn = new String[20];
        String[]  swap;
        String    line             = new String("");
        try {
            BufferedReader  theStream  = new BufferedReader(new InputStreamReader(textURL.openStream()));
            int             index      = 0;
            while ((line = theStream.readLine()) != null) {
                if (index == linesToReturn.length) {//array needs to grow
                    swap = linesToReturn;
                    linesToReturn = new String[linesToReturn.length * 2];
                    for (int i = 0; i < swap.length; i++) {
                        linesToReturn[i] = swap[i];
                    }
                }
                linesToReturn[index] = line;
                index++;
            }
        } catch (Exception e) {
            linesToReturn[0] = "File i/o error\n" + e.toString();
        }
        //Shrink the array down
        int       howManyNonNulls  = 0;
        swap = linesToReturn;
        for (int x = 0; x < swap.length; x++) {
            if (linesToReturn[x] != null) {
                howManyNonNulls++;
            }
        }
        linesToReturn = new String[howManyNonNulls];
        for (int y = 0; y < linesToReturn.length; y++) {
            linesToReturn[y] = swap[y];
        }
    }

    public String[] getLines() {
        return linesToReturn;
    }
}
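
The post mentions that a Vector (or any growable list) would avoid the manual array copying, and that System.arraycopy() would have been the better way to copy. Here is a minimal sketch of both ideas. The class name UrlLines and its methods are made up for illustration and are not part of the original code; it also assumes Java 7 or later, which is newer than this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original post.
class UrlLines {

    // Read every line from any Reader into a growable list.
    public static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line); // the list grows itself; no manual copying
            }
        }
        return lines;
    }

    // Convenience overload: open the URL and read it with the platform
    // default charset, as the original class does.
    public static List<String> readLines(URL url) throws IOException {
        return readLines(new InputStreamReader(url.openStream()));
    }

    // If a String[] is still wanted, System.arraycopy replaces the
    // hand-written copy loop used when the array needs to grow.
    public static String[] grow(String[] old) {
        String[] bigger = new String[old.length * 2];
        System.arraycopy(old, 0, bigger, 0, old.length);
        return bigger;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for a real URL so the demo needs no network.
        List<String> lines = readLines(new StringReader("<html>\n<body>\n</html>"));
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```

Unlike the constructor above, this version lets the IOException propagate instead of storing an error message in the first array slot, so the caller has to decide what to do when a page cannot be fetched.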
OK, now that you've abstracted out the reading of a URL into an array, all you need to do is:
- read list of urls from a file into a String[],
- for each URL, compare each line of the fetched page to the corresponding line of the file on your local filesystem

I'll let you figure out how to compare between arrays (it's a tricky one, and you may want to use regexps). This version just prints out the contents of each URL listed in the file d:/jclasses/url-list.txt.
import java.net.*;

public class readLinks {
    public static void main(String[] argv) {
        try {
            // Three slashes: with only two, "d:" would be parsed as a host name
            FileToArray  linkFile  = new FileToArray(new URL("file:///d:/jclasses/url-list.txt"));
            String[]     urls      = linkFile.getLines();
            for (int i = 0; i < urls.length; i++) {
                FileToArray  url    = new FileToArray(new URL(urls[i]));
                String[]     lines  = url.getLines();
                for (int j = 0; j < lines.length; j++) {
                    System.out.println(lines[j]);
                }
            }
        } catch (Exception e) {
            System.out.println(e.toString());
        }
    }
}
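
For the comparison step that is left open above, one possible starting point is to pull out only the tags of interest with a regexp and compare those, rather than diffing whole pages. This is a rough sketch under some assumptions: the class TagCompare and its method names are invented for illustration, the element is assumed to open and close on the same line, and real-world HTML may need a proper parser instead of a regexp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
class TagCompare {

    // Collect every <tagName ...>...</tagName> occurrence that sits on a single line.
    // Elements spanning several lines are missed by this simple approach.
    public static List<String> extractTags(String[] lines, String tagName) {
        String name = Pattern.quote(tagName);
        Pattern p = Pattern.compile(
                "<" + name + "\\b[^>]*>.*?</" + name + ">",
                Pattern.CASE_INSENSITIVE);
        List<String> found = new ArrayList<>();
        for (String line : lines) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                found.add(m.group());
            }
        }
        return found;
    }

    // True when both pages contain the same occurrences of the tag, in the same order.
    public static boolean sameTags(String[] live, String[] local, String tagName) {
        return extractTags(live, tagName).equals(extractTags(local, tagName));
    }

    public static void main(String[] args) {
        // In practice these arrays would come from FileToArray.getLines().
        String[] live  = {"<html><head>", "<title>Home</title>", "</head></html>"};
        String[] local = {"<html><head>", "<title>Home (draft)</title>", "</head></html>"};
        System.out.println(extractTags(live, "title"));     // prints [<title>Home</title>]
        System.out.println(sameTags(live, local, "title")); // prints false
    }
}
```

Comparing extracted tags instead of raw line positions means that an extra blank line in one copy does not make every later line look different.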

HTH charlie

Don't Stand on your head - you'll get footprints in your hair. http://charlieharvey.com

Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
Copyright (c) 2020 John Wiley & Sons, Inc.