pro_php thread: Search Engines & Databases


Message #1 by "Grant Ballard-Tremeer" <grant@e...> on Tue, 27 Feb 2001 12:24:56 -0000
Ciao Grant!

 > Some time ago there was some discussion about the indexing of web pages with
 > PHP. Since then I've been doing some research: I recently converted my sites
 > from hardwired HTML to a PHP/MySQL combination - with dynamic 'newsy' info
 > held in a database, tailored for each visitor, with user log-on etc.
 > Unfortunately the 'hit rate' from search engines has gone virtually to zero
 > as Altavista, Hotbot etc. removed me from their indexes - indeed they
 > *don't* index .php files as far as I can see, and also don't follow links on
 > php pages (I imagine, in fact I'm sure it's the same for ASP, ColdFusion
 > etc.). Once the old pages 'dropped off' the search engines' lists I got
 > fewer and fewer visitors. It's not all bad: 'Google' and 'AllTheWeb' have no
 > problems with the .php extension (and even index my .pdf files! I am a
 > dedicated google.com fan now).

Well, thanks a lot for telling me about that. I didn't know Altavista's 
Scooter excludes them because of their extension. And I am sure the 
extension is the only thing it can check, since every PHP document is 
returned with a text/html content type (unless specified otherwise).

 > I would be grateful for any wisdom on how best to deal with this. I could of
 > course make an HTML intro page, but that would be a pity since I'd hardly be
 > able to make an intro page that conveys all the necessary info to bring the right
 > visitors. Any advice on how to do this well? I could alternatively change
 > the setting of the PHP parser to parse all .html as if it's php.
 > Unfortunately I'm sharing a server right now and my ISP is unlikely to
 > accept this because of the reduction in performance (I may be prepared to
 > move to a dedicated server if necessary though). Lastly I thought of writing
 > a PHP routine that, run once a day, will make a few HTML 'mirror' pages
 > based on the content of the database to capture the search engine visitors.
 > I could do that, but it seems to defeat the purpose of the dynamic pages
 > somewhat... pity.

Well, let me share my technique. I always serve such documents as 
directory index pages and never link to them by their full file name. 
Let me explain.

You certainly have DirectoryIndex set to index.php or something similar, 
right? If you need to serve a document named show.php, do this:

- create a subdirectory called 'show'
- move show.php inside it and rename it 'index.php'
- when you need to link to it, just use '[path]/show/'
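
As a rough sketch, the three steps above could look like this from a shell. It assumes the server's DirectoryIndex already includes index.php; the scratch directory and the placeholder show.php are made up for illustration only, so nothing real gets touched.

```shell
# Sketch only: work in a scratch directory (hypothetical stand-in for the docroot).
docroot=$(mktemp -d)
cd "$docroot"
printf '<?php echo "hello"; ?>\n' > show.php   # placeholder for the real script

mkdir -p show                 # 1. create a subdirectory named after the script
mv show.php show/index.php    # 2. move the script in and rename it index.php

# 3. links now point at the directory instead of the file
echo "old link: [path]/show.php"
echo "new link: [path]/show/"
```

Any relative includes or links inside the script would need adjusting, since it now lives one directory level deeper.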

Done this way, the agent won't realize it is asking for a PHP document, 
and I guess this should always work. Of course, simply renaming the 
extension would be easier, but you don't actually know whether Scooter 
ignores .phtml files too.
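
To illustrate the reasoning, here is a toy version of an extension-based filter. It is purely hypothetical: the function name would_index and its rule are my own assumption of what such a check might look like, not Scooter's actual logic.

```shell
# Hypothetical crawler-side check: skip URLs whose path ends in .php.
# NOT Scooter's real behaviour, only an illustration of the assumption.
would_index() {
  path=${1%%\?*}              # drop any query string before looking at the extension
  case "$path" in
    *.php) echo "skip" ;;     # extension gives the document away
    *)     echo "index" ;;    # a directory URL carries no extension to match
  esac
}

would_index "/show.php"
would_index "/show.php?id=3"
would_index "/show/"
```

Under that assumption only the directory-style URL gets through, which is the whole point of the trick.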

Hope this helps ... and if you find something more, well, let *us* know!

Ciao
-Gabriele
-------------------------------------------------
Gabriele Bartolini - Computer Programmer
U.O. Rete Civica - Comune di Prato
Prato - Italia - Europa
e-mail: g.bartol@c...
http://www.po-net.prato.it
-------------------------------------------------
A "Supernova" is the celestial
equivalent of "rm -rf /*" with
root permissions.
-------------------------------------------------

