HTML Code Clinic
Do you have some HTML code you'd like to share and get suggestions from others for tweaking or improving it? This discussion is the place.
Welcome to the p2p.wrox.com Forums.
You are currently viewing the HTML Code Clinic section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com

April 5th, 2009, 06:26 PM
Friend of Wrox
Join Date: Jan 2005
Posts: 1,525
Thanks: 0
Thanked 0 Times in 0 Posts
prevent Google/Yahoo!/MSN spidering my webpage
Hi,
Am I safe enough with the following meta tags to prevent Google/Yahoo!/MSN Search from spidering my webpage?
Code:
<meta name="robots" content="noindex" />
<meta name="robots" content="nofollow" />
<meta name="robots" content="noarchive" />
<meta name="robots" content="noodp" />
<meta name="robots" content="noimageindex,nomediaindex" />
<meta name="robots" content="unavailable_after: 05-Apr-2009 22:00:00 CET" />
<meta name="googlebot" content="noindex" />
<meta name="googlebot" content="nosnippet" />
<meta name="slurp" content="noydir" />
My robots.txt file:
Code:
User-agent: *
Disallow: /picco.html
Also, is there any way to test if Google will pick it up?
Thanks,
Picco
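To double-check what a page actually declares, the meta tags can be read back programmatically. The following is only a rough sketch using Python's standard html.parser module; the class name RobotsMetaParser, the helper robots_directives, and the SAMPLE_HEAD string are made up for illustration, and the set of meta names it looks at (robots, googlebot, slurp) is taken from the tags in the post. Keep in mind that a crawler which obeys the Disallow rule in robots.txt does not fetch the page at all, so it would never read these tags.

```python
from html.parser import HTMLParser

# Hypothetical sample based on a subset of the tags quoted in the post.
SAMPLE_HEAD = """
<meta name="robots" content="noindex" />
<meta name="robots" content="nofollow" />
<meta name="robots" content="noimageindex,nomediaindex" />
<meta name="googlebot" content="noindex" />
<meta name="googlebot" content="nosnippet" />
<meta name="slurp" content="noydir" />
"""


class RobotsMetaParser(HTMLParser):
    """Collect directives from robots-related <meta> tags."""

    WATCHED_NAMES = {"robots", "googlebot", "slurp"}

    def __init__(self):
        super().__init__()
        # Maps a meta name (e.g. "robots") to the set of its directives.
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta ... />) are routed here by default too.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content") or ""
        if name in self.WATCHED_NAMES:
            parts = {p.strip().lower() for p in content.split(",") if p.strip()}
            self.directives.setdefault(name, set()).update(parts)


def robots_directives(html):
    """Return {meta name: set of directives} found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    for name, values in sorted(robots_directives(SAMPLE_HEAD).items()):
        print(name, sorted(values))
```

Because several tags with the same name are merged into one set, this also shows that the separate robots tags could be written as a single comma-separated content value.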

April 5th, 2009, 08:01 PM
Wrox Staff
Points: 18,059, Level: 58
Join Date: May 2003
Posts: 1,906
Thanks: 62
Thanked 139 Times in 101 Posts
I think you have it more than covered. To verify it with Google, sign up for a free Google Webmaster Tools account at www.google.com/webmasters/tools/. From there, you can check which pages Google is and isn't spidering under "URLs restricted by robots.txt."
__________________
Jim Minatel
Associate Publisher, WROX - A Wiley Brand
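The robots.txt rules can also be checked locally before any crawler visits. Here is a minimal sketch using Python's standard urllib.robotparser module. It only evaluates the rules as written, so it says nothing about whether a given crawler really honors them. The helper name is_allowed is invented for this example, www.example.com is a placeholder host, and the user-agent strings are the commonly cited ones for each engine, which is an assumption rather than something stated in this thread.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted in the first post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /picco.html
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # www.example.com is a placeholder, not the poster's real site.
    page = "http://www.example.com/picco.html"
    for agent in ("Googlebot", "Slurp", "msnbot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

Since the only rule is under "User-agent: *", every agent gets the same answer for /picco.html, and any other path on the site stays allowed.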

April 6th, 2009, 05:47 PM
Friend of Wrox
Join Date: Jan 2005
Posts: 1,525
Thanks: 0
Thanked 0 Times in 0 Posts
Thanks Jim, I'll do that. What about MSN Search and Yahoo!?
Will this prevent those two search engines from spidering the page?
Picco

April 7th, 2009, 04:28 PM
Wrox Staff
Points: 18,059, Level: 58
Join Date: May 2003
Posts: 1,906
Thanks: 62
Thanked 139 Times in 101 Posts
I think you're covered with the robots.txt, as I think both Yahoo and MSN (Live) Search are respectable crawlers and follow the robots.txt rules. Where you may run into problems is with some of the lesser-known crawlers, and even malicious crawlers that deliberately ignore robots.txt. If there is a link to this page from some other page, there's a good chance some crawlers will index it.
__________________
Jim Minatel
Associate Publisher, WROX - A Wiley Brand

April 21st, 2009, 04:13 AM
Friend of Wrox
Join Date: Jun 2007
Posts: 477
Thanks: 10
Thanked 19 Times in 18 Posts
Yeah, the robots.txt file is all you really need to disallow the page. None of the major engines are looking to catalog things you don't want indexed.
If it's something really sensitive, you may want to look at ASP.NET 2.0. There are some fairly basic ways of setting up a login system to password protect sensitive files like that. Check out www.asp.net for video tutorials on it if you're interested.
__________________
-------------------------
Whatever you can do or dream you can, begin it. Boldness has genius, power and magic in it. Begin it now.
-Johann von Goethe
When Two Hearts Race... Both Win.
-Dove Chocolate Wrapper
Chroniclemaster1, Founder of www.EarthChronicle.com
A Growing History of our Planet, by our Planet, for our Planet.
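To give a rough idea of what password protection involves, here is a sketch of checking an HTTP Basic Authorization header. It is written in Python rather than ASP.NET and is not the ASP.NET login system mentioned above. The function name check_basic_auth and the sample credentials are invented for illustration. A real site would use its framework's built-in authentication, store hashed passwords, and serve the page over HTTPS.

```python
import base64
import hmac


def check_basic_auth(header, expected_user, expected_password):
    """Return True if an HTTP Basic Authorization header matches the credentials.

    header is the raw header value, e.g. "Basic cGljY286czNjcmV0".
    """
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through timing differences.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    pass_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and pass_ok


if __name__ == "__main__":
    # Hypothetical credentials, for demonstration only.
    sample = "Basic " + base64.b64encode(b"picco:s3cret").decode("ascii")
    print(check_basic_auth(sample, "picco", "s3cret"))
```

Unlike robots.txt, a check like this applies to every visitor, including crawlers that ignore the exclusion rules.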

May 8th, 2009, 02:24 PM
Authorized User
Join Date: Nov 2008
Posts: 22
Thanks: 0
Thanked 0 Times in 0 Posts
Try PHP
Try PHP. It provides several ways to meet your needs.