My question is whether any of the following code in the crawler function is actually necessary:
// Handle redirects by scanning the response header for the Location field.
// This block never seems to run, even with 4 redirects and $maxredirs = 1
// (which should allow only 1 redirect), so is it necessary at all?
if ((stripos($urlinfo["header"], "location:") !== false) && ($maxredirs > 0)) {
    preg_match("/\r\nlocation:(.*)/i", $urlinfo["header"], $match);
    $redirect = trim($match[1]); // use the captured group, not the whole $match array
    //echo "Redirecting to ".$redirect."\n";
    // decrease the counter and call the function again to follow the redirect
    return mycrawler_single($redirect, $useragent, $timeout, $maxredirs - 1);
}
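As a side check, the Location parsing can be isolated and tested on its own. A minimal sketch (the function name extract_location is mine, not from the crawler):

```php
<?php
// Hypothetical helper (name is mine): pull the redirect target out of a
// raw response-header block.
function extract_location(string $header): ?string
{
    // stripos() returns 0 when the match sits at offset 0, and 0 is
    // falsy, so the result must be compared against false explicitly.
    if (stripos($header, "location:") === false) {
        return null;
    }
    // Capture everything after "Location:" on its own header line.
    if (preg_match("/^location:\s*(.*)$/im", $header, $match)) {
        return trim($match[1]); // $match[1] is the captured group
    }
    return null;
}

// Example: a typical 302 header block.
// extract_location("HTTP/1.1 302 Found\r\nLocation: redirect2.php\r\n")
// gives "redirect2.php".
```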
I am doing my redirects in all the test pages exactly as you specified, e.g.
and so on.
What I am saying is that the browser seems to follow all 4 of these redirects
without my Location-header-handling code even being in the function.
So I am asking whether that code is necessary and, if it is, what will
cause it to fire.
Say I have 4 pages that each redirect to the next, ending in a final HTML page
that I am trying to retrieve. In theory I would need to set the $maxredirs var
to at least 4 to allow for 4 redirects. However, if I set it to 1, the redirects
are still all followed. Even if I comment out the whole block of code,
the redirects are still all followed. So what is going on?
Do I even need to bother handling redirects myself by reading the headers
and looking for the Location value, or is something else going on that I am unaware of?
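One likely explanation (an assumption, since the fetch code isn't shown): the HTTP client doing the fetching follows redirects on its own. PHP's HTTP stream wrapper, for instance, follows up to 20 redirects automatically, so a manual Location-parsing branch never sees a 3xx response. Disabling that behaviour makes the raw redirect visible again:

```php
<?php
// Sketch: a stream context that stops PHP's HTTP wrapper from
// following redirects itself.
$context = stream_context_create([
    'http' => [
        'follow_location' => 0,    // do not follow redirects automatically
        'ignore_errors'   => true, // return the body even on 3xx statuses
    ],
]);
// With this context, file_get_contents($url, false, $context) returns
// the 3xx response itself, and $http_response_header then contains the
// "Location: ..." line that the crawler's redirect branch looks for.
```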
post with the URL of the first page redirect1.php passed as the value for u.
<a href="AjaxProxy.php?u=redirect1.php">click me</a>
would do the same job. Or you can just hardcode the first page in as the value
of the $url var, e.g.
$url = "redirect1.php";
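Either way, the proxy side can combine the two: read the parameter when it is present and fall back to the hardcoded page. A minimal sketch, assuming AjaxProxy.php hands the URL straight to the crawler (the real file isn't shown):

```php
<?php
// Hypothetical AjaxProxy.php fragment: take the target from the "u"
// query parameter, falling back to the hardcoded first page.
$url = isset($_GET['u']) ? $_GET['u'] : "redirect1.php";
// ...then pass $url to the crawler, e.g. mycrawler_single($url, ...).
```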
Is that enough?