asp_web_howto thread: how to search engine...


Message #1 by "aly" <aly@s...> on Mon, 9 Dec 2002 18:59:29
I don't know if I should post this message to this forum, but my question is 
as follows.

My client wants to search the web like Google does.

Basically, is there any way to use ASP and SQL Server to design a mini 
search engine that searches pages on the web so we can index them?

I hope to receive some help or any response from you guys.

regards
Aly
Message #2 by Jack_Speranza <jsperanza@g...> on Mon, 9 Dec 2002 14:18:49 -0500
Well, there are a number of ways you can implement a search engine, but
you'd have a long way to go before you could even begin to emulate what the
major search engines accomplish.  What are the goals your client is looking
to accomplish by instituting his own search and indexing engine?  

As a caveat, I'm not sure I would try something like this using ASP...
though it can be done, it's not going to be a great performer.  I just built
something along the lines of what you're suggesting here at work, but it is
a compiled COM component.  Unless you're talking about simply spidering and
indexing a fairly small and discrete number of pages, I would not be
comfortable devoting the ASP scripting engine to the task... especially if
you're looking for that engine to simultaneously drive other processes.

If you're intent, however, I encourage you to take a look at the Microsoft
XMLHTTP COM object to seek out and retrieve your web content, and then
prepare to make heavy use of the regular expression interface provided by
the VBScript DLL to process your parsing and indexing (the latter of which
is an invaluable performance enhancer).  The SQL end of things is actually
quite easy, if you don't want to get overly sophisticated with your
retrieval algorithms and are looking for decent scalability and performance.
Just create about 3 separate tables in SQL, two of which will be fairly
narrow but deep.  The first is a two-column table to hold each word you
index and a corresponding key value.  The second holds info on each
document you scan (e.g., document key, document name, and other stuff).
The last holds word key, document key, and maybe a word count (the number
of times that word appears in the identified document).  Properly index
each of the tables, and your queries to retrieve documents that contain
groupings of search terms will run quite quickly.
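
To make the three-table layout concrete, here is a minimal sketch.  It is
written in Python with an in-memory SQLite database standing in for ASP and
SQL Server, so treat it as an illustration of the idea rather than the
component described above.  The table names (words, documents,
word_documents), the column names, the tokenizing pattern, and the functions
create_schema, index_document, and search are all assumptions made up for
this example.

```python
import re
import sqlite3
from collections import Counter

# Assumption: a "word" is any run of lowercase letters or digits.
WORD_RE = re.compile(r"[a-z0-9]+")


def create_schema(conn):
    """Create the three tables: words, documents, and the word/document link."""
    conn.executescript("""
        CREATE TABLE words (
            word_id INTEGER PRIMARY KEY,
            word    TEXT NOT NULL UNIQUE
        );
        CREATE TABLE documents (
            doc_id INTEGER PRIMARY KEY,
            name   TEXT NOT NULL UNIQUE
        );
        CREATE TABLE word_documents (
            word_id    INTEGER NOT NULL REFERENCES words(word_id),
            doc_id     INTEGER NOT NULL REFERENCES documents(doc_id),
            word_count INTEGER NOT NULL,
            PRIMARY KEY (word_id, doc_id)
        );
        CREATE INDEX idx_word_documents_doc ON word_documents(doc_id);
    """)


def index_document(conn, name, text):
    """Split text into words with a regular expression and store the counts."""
    counts = Counter(WORD_RE.findall(text.lower()))
    doc_id = conn.execute(
        "INSERT INTO documents (name) VALUES (?)", (name,)
    ).lastrowid
    for word, count in counts.items():
        # Add the word only if it is not already in the words table.
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT word_id FROM words WHERE word = ?", (word,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO word_documents (word_id, doc_id, word_count) "
            "VALUES (?, ?, ?)",
            (word_id, doc_id, count),
        )
    return doc_id


def search(conn, query):
    """Return (name, score) for documents that contain every search term."""
    terms = sorted(set(WORD_RE.findall(query.lower())))
    if not terms:
        return []
    placeholders = ",".join("?" * len(terms))
    sql = f"""
        SELECT d.name, SUM(wd.word_count) AS score
        FROM word_documents wd
        JOIN words w ON w.word_id = wd.word_id
        JOIN documents d ON d.doc_id = wd.doc_id
        WHERE w.word IN ({placeholders})
        GROUP BY d.doc_id, d.name
        HAVING COUNT(DISTINCT w.word_id) = ?
        ORDER BY score DESC, d.name
    """
    return conn.execute(sql, (*terms, len(terms))).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    index_document(conn, "a.html", "ASP search engine with SQL Server")
    index_document(conn, "b.html", "SQL Server indexing tips")
    print(search(conn, "sql server"))
```

The score here is just the summed word count, which is about the least
sophisticated ranking possible.  The HAVING clause is what restricts results
to documents that contain every term in the query.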

Hope this is helpful.

Jack

Message #3 by "Alex Shiell, ITS, EB, SE" <alex.shiell@s...> on Tue, 10 Dec 2002 10:15:00 -0000
If you're talking about searching the whole internet, forget reinventing the
wheel!!  All the major search engines allow you to incorporate their search
facilities into your site in exchange for displaying their logo.  A search
engine is an extremely complex beast.  Masses of resources have gone into
creating the ones that are out there, and they run on massive servers with
the power to search the vast amount of content out there and the storage
capacity to index billions of pages.  No offence, but nothing you could
write could possibly compare with them.

On the other hand, if you're just talking about searching your own site,
look into Index Server.  It comes free with IIS, with a whole bunch of
examples to get you started.




