p2p.wrox.com Forums

beyondforsaken June 6th, 2003 07:15 PM

Eliminating HTML TAGs
 
Hi, I have some code here, but I get an error when I try to eliminate the HTML tags:

Dim a As String
        a = txtfile.Text
        If a <> <td bgcolor= "#C9EDFF" height="12" class="resulttext2" width="256">Featured Businesses </td><td height="12" width="120" class="resulttext2" >Address </td><td height="12" width="79" class="resulttext2" >Phone</td> Then
  txtfile.text=""

But I get blue wavy lines under it. How do I eliminate the lines that are not the lines above?

Hal Levy June 6th, 2003 07:29 PM

The code you pasted isn't close to something that would work. Perhaps you pasted something wrong?

If that's what it says in your file, there's no way you could expect it to work.

Perhaps if you tell us what you're trying to do...


Hal Levy
Daddyshome, LLC

beyondforsaken June 8th, 2003 07:09 PM

I am just trying to filter an HTML page, and the above is what I want. It's actually a table of information.

Hal Levy June 8th, 2003 09:59 PM

I think I understand... But the way you're going about it isn't going to work.

There are two problems:

1. Your string isn't in quotes.
2. The way you wrote this, you're saying strip out everything that isn't HTML.

If you're trying to strip HTML, and want to use this If statement, and you know that "<td bgcolor" will always start it, then write this:

If Left(a, 11) = "<td bgcolor" Then a = ""

This checks the string you get back from the file and, if it starts with that tag, sets your variable to an empty string.

That said, I would use a routine like This one to do what you're trying to do.
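
A minimal sketch of that prefix check as a small function, in case it helps to see it in context. The module and function names (PrefixCheckSketch, BlankIfStartsWith) are made up for illustration, and it assumes VB.NET on .NET 2.0 or later, where String.StartsWith accepts a StringComparison. It also trims leading spaces and ignores case, which the one-liner above does not do:

```vbnet
Imports System

Module PrefixCheckSketch
    ' Hypothetical helper: returns "" when the line (ignoring leading
    ' whitespace and letter case) starts with the given prefix,
    ' otherwise returns the line unchanged.
    Function BlankIfStartsWith(ByVal line As String, ByVal prefix As String) As String
        If line.TrimStart().StartsWith(prefix, StringComparison.OrdinalIgnoreCase) Then
            Return ""
        End If
        Return line
    End Function

    Sub Main()
        ' A line that begins with the tag is blanked, so its length is 0.
        Console.WriteLine(BlankIfStartsWith("<td bgcolor=""#C9EDFF"">Featured</td>", "<td bgcolor").Length)
        ' A line that does not begin with the tag comes back as-is.
        Console.WriteLine(BlankIfStartsWith("Plain text", "<td bgcolor"))
    End Sub
End Module
```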


Hal Levy
Daddyshome, LLC

mark.roworth June 9th, 2003 01:21 AM

HTML is just a type of XML. Load it into an XML DOM object, and you should be able to locate the information within that. For more information on manipulating XML, see http://www.w3schools.com/. I won't be able to answer anything more for a week, 'cos I'm going to Turkey in about 5 minutes. Hope this helps.

Mark

Mark Roworth

Hal Levy June 9th, 2003 08:59 AM

Mark,

I disagree. HTML, quite often, does not comply with XML. For example, many people will use a <BR> without an ending tag, which would confuse most XML processors. Not only that, it's overkill to do all that work when a simple regex will do it for him.
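
As a rough illustration of the "simple regex" idea, here is a sketch assuming VB.NET and System.Text.RegularExpressions. It is not the routine linked elsewhere in the thread, and the names StripTagsSketch and StripTags are invented for this example. The pattern simply drops anything between a "<" and the next ">", so it will misbehave on a ">" inside an attribute value, on comments, and on script blocks:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StripTagsSketch
    ' Hypothetical helper: removes every run of text that looks like <...>.
    Function StripTags(ByVal html As String) As String
        Return Regex.Replace(html, "<[^>]*>", "")
    End Function

    Sub Main()
        Dim row As String = "<td class=""resulttext2"">Address </td><td>Phone</td>"
        ' Prints: Address Phone
        Console.WriteLine(StripTags(row))
    End Sub
End Module
```

Because it never tries to parse the markup, an unclosed <BR> is no problem for it, which is the point being made above.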

Hal Levy
Daddyshome, LLC

beyondforsaken June 9th, 2003 08:09 PM

Q1) Thanks, but what does "this one" mean? I got the code, but, sorry to ask, how do I use it? Can I simply paste it under a Button_Click handler?

Q2) My main objective is actually to achieve real-time updating of information from a website, www.yellowpages.com.sg. I'm trying to build something like a search engine, where people use my program to look up information. But you never know when the website's information might change, so we need real-time updating. After surfing and searching on the internet, I found nothing close to that. Can you recommend a book or website to help me achieve this?

Hal Levy June 10th, 2003 07:31 AM

If you click on THIS ONE, it's a link to an example routine that removes HTML tags from a text stream.


Here is the link in plain text: http://www.planet-source-code.com/vb...txtCodeId=6269

How to use the function is pretty straightforward. Instructions are right there on the page.

Hal Levy
Daddyshome, LLC

beyondforsaken June 10th, 2003 09:37 PM

Hi Hal Levy,
Do you know the answer to my second question?

Hal Levy June 11th, 2003 10:18 AM

I can't write the code for you. If you use the HTTP functions that come with .NET along with the routine to parse out HTML, you will be able to scrape the HTML page for the data (as long as they don't change their format).

Personally, I would expect that what you're doing is against the TOS of the site you're hitting, so I can't condone or assist in that process.
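
For the general shape of that approach only, here is a sketch that assumes VB.NET on .NET 2.0 or later, uses System.Net.WebClient as one of the HTTP classes that ship with .NET, and reuses a naive tag-stripping regex as a stand-in for a proper parsing routine. The address is a placeholder, not the site discussed here, and the module name is made up:

```vbnet
Imports System
Imports System.Net
Imports System.Text.RegularExpressions

Module FetchAndStripSketch
    Sub Main()
        ' Placeholder address for illustration only (an assumption).
        Dim url As String = "http://www.example.com/"

        Using client As New WebClient()
            ' Download the page as one string.
            Dim html As String = client.DownloadString(url)

            ' Naive tag removal: drops anything that looks like <...>.
            Dim text As String = Regex.Replace(html, "<[^>]*>", "")

            Console.WriteLine(text)
        End Using
    End Sub
End Module
```

A sketch like this re-reads the page each time it runs, so results are only as fresh as the last request, and any extraction that depends on the page layout stops working when the layout changes.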



Hal Levy
Daddyshome, LLC


All times are GMT -4.