|
Subject:
|
Need RegEx help
|
|
Posted By:
|
Snib
|
Post Date:
|
10/2/2004 12:25:53 PM
|
Hello,
I need a pattern that parses some HTML code and replaces all src and href values to include the absolute path to the resource.
So <a href='/index.php'> turns into <a href='http://www.mysite.com/index.php'> and <link href="styles.css"/> turns into <link href="http://www.mysite.com/directory/styles.css"/>
I would do it myself but I'm very new to regular expressions and can't figure it out.
Thanks,
-Snib <>< http://www.snibworks.com There are only two stupid questions: the one you don't ask, and the one you ask more than once ;-)
|
|
Reply By:
|
Richard Lightfoot
|
Reply Date:
|
10/4/2004 3:21:01 AM
|
Bit new to it myself, but
$picture = "<a href='/index.php'>";
// Literal replacement; str_replace() stands in for ereg_replace(), which was removed in PHP 7.
$picture = str_replace("<a href='/index.php'>", "<a href='http://www.mysite.com/index.php'>", $picture);
echo $picture;
Should work.
|
|
Reply By:
|
Snib
|
Reply Date:
|
10/4/2004 4:52:23 PM
|
Richard,
That will replace all instances of <a href='/index.php'> but what if it doesn't link to /index.php? What if I used double quotes (")? What if the tag really looks like this: <a style='color:red' href='/index.php'>. And also I need it to parse <img> tags, <script> tags and <link> tags.
Thanks for helping,
-Snib
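The cases listed in this post (either quote style, other attributes before href, and the extra tags) can be covered by one pattern. This is only a sketch: `$pattern`, `$html` and `$urls` are names made up for the illustration, and it assumes attribute values are quoted and contain no `>` character.

```php
<?php
// Hypothetical pattern, not from the thread:
//   group 1: the tag up to and including href= or src=
//   group 2: the opening quote (single or double)
//   group 3: the URL, up to the matching closing quote
$pattern = '/(<(?:a|img|script|link)\b[^>]*?\s(?:href|src)\s*=\s*)(["\'])(.*?)\2/i';

$html = <<<'HTML'
<a style='color:red' href='/index.php'>Home</a>
<link rel="stylesheet" href="styles.css"/>
<img src="images/logo.png" alt="logo">
HTML;

// Collect every URL the pattern finds.
preg_match_all($pattern, $html, $matches);
$urls = $matches[3];
print_r($urls);
```

This only finds the URLs. Deciding what to put in front of each one is a separate step, since it depends on whether the URL is already absolute, starts with a slash, or is relative.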
|
|
Reply By:
|
Moharo
|
Reply Date:
|
10/8/2004 9:21:59 AM
|
Hey Snib,
I have to admit that regular expressions are not my favorite, but this is what I came up with for your problem (it might be buggy):
<?php
$teststr = "<a href='index.php'>";
$newstr = preg_replace("/<(a|link) href='(.*?)'>/", "<\\1 href=\"http://www.mysite.com\\2\">", $teststr);
echo $newstr;
?>
After this script runs, you will not see anything in the browser window, because the output is itself an HTML tag. Look at the HTML code ("view source") instead.
Hope that helps.
crazy zoltalar
www.campusgrind.com the college portal
|
|
Reply By:
|
nikolai
|
Reply Date:
|
10/20/2004 12:10:54 PM
|
There's a problem with your regex replacement, Moharo. Check your own example: "index.php" gets matched as \2, so your replacement shows up as:
"http://www.mysite.comindex.php"
There are a bunch of cases you need to test for.
1) Does the link already specify an absolute URL? (e.g. http://www.example.com/foo/bar/page.html)
2) Does the link specify an absolute path? (e.g. /foo/bar/page.html)
3) Does the link specify a path relative to the current script?
   a) at or below the current path? (e.g. page.html, foo/bar/page.html)
   b) above or a sibling to the current path? (e.g. ../page.html, ../foo/bar/page.html)
For 1), do nothing. No replacement necessary.
For 2), simply prepend the host to the path.
For 3), you'll need to calculate the working directory of the currently executing script. For a), append the relative URL to the working directory. For b), modify the working directory to reflect the ".."s in the relative path, then append the rest.
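The three cases above could be sketched roughly like this. The function names `absolutize` and `rewrite_links`, the `$host` and `$dir` parameters, and the pattern itself are assumptions made for the illustration, not something posted in this thread. It assumes quoted attribute values and rewrites at most one href or src per tag.

```php
<?php
// Hypothetical helper: turn one URL into an absolute one.
// $host is e.g. 'http://www.mysite.com', $dir is e.g. '/directory/'.
function absolutize(string $url, string $host, string $dir): string
{
    // Case 1: already absolute (has a scheme or starts with //). Do nothing.
    if (preg_match('#^(?:[a-z][a-z0-9+.-]*:|//)#i', $url)) {
        return $url;
    }
    // Empty values and same-page anchors such as "#top" are left alone too.
    if ($url === '' || $url[0] === '#') {
        return $url;
    }
    // Case 2: absolute path. Simply prepend the host.
    if ($url[0] === '/') {
        return $host . $url;
    }
    // Case 3: relative path. Keep any query string or fragment aside,
    // join the path with the working directory, then collapse "." and "..".
    $cut  = strcspn($url, '?#');
    $path = substr($url, 0, $cut);
    $tail = substr($url, $cut);
    $out  = [];
    foreach (explode('/', trim($dir, '/') . '/' . $path) as $seg) {
        if ($seg === '' || $seg === '.') {
            continue;
        }
        if ($seg === '..') {
            array_pop($out); // go up one directory (no-op at the root)
            continue;
        }
        $out[] = $seg;
    }
    $resolved = '/' . implode('/', $out);
    if ($out && ($path === '' || substr($path, -1) === '/')) {
        $resolved .= '/'; // keep a trailing slash
    }
    return $host . $resolved . $tail;
}

// Hypothetical wrapper: rewrite href/src in <a>, <img>, <script>, <link>.
function rewrite_links(string $html, string $host, string $dir): string
{
    $pattern = '/(<(?:a|img|script|link)\b[^>]*?\s(?:href|src)\s*=\s*)(["\'])(.*?)\2/i';
    return preg_replace_callback($pattern, function (array $m) use ($host, $dir) {
        // $m[1] = tag up to "=", $m[2] = quote character, $m[3] = the URL
        return $m[1] . $m[2] . absolutize($m[3], $host, $dir) . $m[2];
    }, $html);
}

echo rewrite_links("<a href='/index.php'>", 'http://www.mysite.com', '/directory/'), "\n";
echo rewrite_links('<link href="styles.css"/>', 'http://www.mysite.com', '/directory/'), "\n";
```

For the working directory, something like `dirname($_SERVER['SCRIPT_NAME'])` is one possible source, though that depends on how the script is served. Using a callback rather than a plain replacement string is what lets each URL be checked against the three cases before anything is prepended.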
Make sense?
Take care,
Nik http://www.bigaction.org/
|
|
Reply By:
|
Snib
|
Reply Date:
|
10/20/2004 3:19:35 PM
|
Welcome back, Nik.
You seem to know your way around regular expressions better than either of us. Could you try to make a pattern for this?
I am still trying myself, unsuccessfully.
Thanks,
-Snib <>< Try new FreshView 0.2! There are only two stupid questions: the one you don't ask, and the one you ask more than once ;-)
|
|
Reply By:
|
anshul
|
Reply Date:
|
11/16/2004 5:59:59 AM
|
This may be a good URL for learning regular expressions: http://www.webreference.com/js/column5/
|