Wrox Programmer Forums

#1
richard.york (Wrox Author)
November 4th, 2003, 09:27 PM
perl compatible regular expressions

I was wondering if anyone knew of a good tutorial on Perl-compatible regular expressions. I am trying to write a regular expression that replaces links in an email program with HTML-formatted links.

I tried Nik's example in the following thread:
http://p2p.wrox.com/topic.asp?TOPIC_ID=5482

But I seem to have run into a snag: I designed my program as a class and cannot find a way to use the callback function from within the class. I also tried defining the callback function in global scope, but the regular expression function didn't return the mail body. I've attempted several examples that I found on the web, and none of them bring back the message body.

Here is one example that I tried:
$msg_body = imap_fetchbody($this->mailbox, $mid, $pid);

$msg_body = preg_replace("/([\w\.]+)(@)([\S\.]+)\b/i","<a href=\"mailto:$0\">$0</a>", $msg_body);
$msg_body = preg_replace("(^)"<a href=\"http$3://$4$5\"target=\"_blank\">$2$4$5</a>", $msg_body);

Neither of these looks like a very good solution.

If I comment out the preg_replace calls, the message body shows up; when I use them, I get a blank message body.

I don't know much about regular expressions anyway, so I am at a loss to see where it might be going wrong. None of my PHP books seem to discuss Perl-compatible regular expressions in any detail, though they talk quite a bit about POSIX-style regular expressions.

Thanks in advance!
: )
Rich


:::::::::::::::::::::::::::::::::
Smiling Souls
http://www.smilingsouls.net
:::::::::::::::::::::::::::::::::

#2
richard.york (Wrox Author)
November 4th, 2003, 11:37 PM

I was able to figure out a way to get Nik's example working.

Apparently my decode function, which decodes the message body from quoted-printable, was creating a conflict, so I moved the decoding to happen before the regular expression replacement.

I used create_function() to supply the callback for preg_replace_callback() from within my class.
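
To illustrate why the order matters, here is a minimal sketch of decoding a quoted-printable body before any URL matching. It uses PHP's built-in quoted_printable_decode() as a stand-in for the decode function mentioned above, whose implementation isn't shown in this thread; the sample body and URL are assumptions for illustration.

```php
<?php
// A quoted-printable body: "=3D" encodes "=", and "=\r\n" is a soft line break.
$raw = "See http://www.example.com/page?id=3D42 for de=\r\ntails";

// Built-in decoder, standing in for the post's own decode function.
$decoded = quoted_printable_decode($raw);

echo $decoded, "\n"; // See http://www.example.com/page?id=42 for details
```

If the pattern ran on $raw instead, the link would keep the stray "3D", so decoding first gives the regular expression the text the reader will actually see.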

$msg_body = imap_fetchbody($this->mailbox, $mid, $pid);
$msg_body = $this->decode_message($msg_body, $this->encoding[$mid][$i]);

$pattern = '!\bhttps?://([\w\-]+\.)+[a-zA-Z]{2,3}(/(\S+)?)?\b!';

$msg_body = htmlspecialchars($msg_body);
$msg_body = preg_replace_callback($pattern, create_function('$matches', 'return "<a href=\'".$matches[0]."\' target=\'_new\'>".$matches[0]."</a>";'), $msg_body);
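
For anyone reading this on a newer PHP: create_function() was deprecated in PHP 7.2 and removed in PHP 8.0. A minimal sketch of the same replacement with an anonymous function (PHP 5.3 or later) might look like the following. The function name linkify_urls and the sample text are assumptions for illustration; the pattern is the one from the post above.

```php
<?php
// Hypothetical helper: wraps each http/https URL in an anchor tag.
function linkify_urls($text)
{
    // Pattern taken from the post above.
    $pattern = '!\bhttps?://([\w\-]+\.)+[a-zA-Z]{2,3}(/(\S+)?)?\b!';

    // Escape the text first, as the post does, before adding markup.
    $text = htmlspecialchars($text);

    return preg_replace_callback(
        $pattern,
        function ($matches) {
            return "<a href='" . $matches[0] . "' target='_new'>" . $matches[0] . "</a>";
        },
        $text
    );
}

echo linkify_urls('See http://www.php.net/pcre for details'), "\n";
```

The closure is created once per call to linkify_urls(), and it can live inside a class method just as well as in a plain function.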

: )
Rich

#3
Friend of Wrox
November 5th, 2003, 03:55 PM

Hey Rich,

I'd recommend reading through PHP's manual pages:
  http://www.php.net/pcre

Check out their "pattern syntax" and "pattern modifiers" pages. Also, search for 'perl regular expression tutorial' on Google; there are lots of hits.


I don't think you need create_function() in your case. The problem with that approach is that you create a new unnamed function EVERY time execution reaches that point. I don't think it causes a huge amount of overhead, but it's there nonetheless.


I don't have the time to play with your original patterns, but I suspect a couple of reasons your patterns are failing:

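1) You're using a dollar sign to access your back references. I write back references in PHP's Perl-compatible functions as a backslash and a number between 0 and 99. The manual does list $n as a valid form for preg_replace too, so this one may not be the real culprit.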

2) Your 2nd pattern isn't a valid string:
  "(^)"<a href=\"http$3://$4$5\"target=\"_blank\">$2$4$5</a>"

The 4th character of your pattern string is a double-quote character, which ends the string and should cause a parse error.
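
For reference, the PHP manual for preg_replace lists both \n and $n as valid back references in the replacement string. A small sketch comparing the two forms follows; the simplified e-mail pattern and the sample text are assumptions for illustration, not the patterns from the original post.

```php
<?php
$text = 'Contact rich@example.com today';

// Simplified pattern for illustration only; it is not a full e-mail validator.
$pattern = '/[\w.]+@[\w.]+\w/';

// Single-quoted strings keep $0 and \0 literal, so preg_replace sees them.
$with_dollar    = preg_replace($pattern, '<a href="mailto:$0">$0</a>', $text);
$with_backslash = preg_replace($pattern, '<a href="mailto:\0">\0</a>', $text);

echo $with_dollar, "\n";
var_dump($with_dollar === $with_backslash); // bool(true)
```

Inside a double-quoted string, "\0" is read as an octal escape (a NUL byte), so the backslash form has to be written "\\0" there. "$0" is left alone in double quotes because PHP variable names cannot start with a digit.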

Good luck, and let me know if any more problems come up.





Take care,

Nik
http://www.bigaction.org/

#4
richard.york (Wrox Author)
November 6th, 2003, 12:33 AM

Thanks Nik,

I must have overlooked the pattern syntax links when I was looking through the manual. I have been trying out some patterns.

I saw in the user notes at http://www.php.net/preg_replace_callback that someone suggested passing an array with two elements, the first being the class name and the second the method name. Here is the quote:

Quote:
Also, if you want to use a *static* class method for the callback function, you can refer to it like this:
   preg_replace_callback(pattern, array('ClassName', 'methodName'), subject)

In PHP5, from within the class:
   preg_replace_callback(pattern, array('self', 'methodName'), subject)

I tried this and it works (the first method, that is). I'm waiting for PHP 5 to come out of beta before fooling with the second.
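
A minimal sketch of the array-style callback from the quoted note, showing a static method passed as array('ClassName', 'methodName') and an instance method passed as array($this, 'methodName'). The class name Library, its methods, the simplified patterns, and the sample text are all hypothetical, and the visibility keywords assume PHP 5 or later.

```php
<?php
class Library
{
    // Static callback: referenced as array('Library', 'mailify').
    public static function mailify($matches)
    {
        return '<a href="mailto:' . $matches[0] . '">' . $matches[0] . '</a>';
    }

    // Instance callback: referenced as array($this, 'shout').
    public function shout($matches)
    {
        return strtoupper($matches[0]);
    }

    public function process($text)
    {
        // Simplified address pattern, for illustration only.
        $text = preg_replace_callback('/[\w.\-]+@[\w.\-]+\w/', array('Library', 'mailify'), $text);

        // The same call style works with an object instead of a class name.
        return preg_replace_callback('/\bphp\b/', array($this, 'shout'), $text);
    }
}

$lib = new Library();
echo $lib->process('Ask rich@example.com about php'), "\n";
// Ask <a href="mailto:rich@example.com">rich@example.com</a> about PHP
```

The array($this, ...) form is handy when the callback needs access to the object's own state, which a static method does not have.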

I have been poring over your pattern for a while and cannot seem to modify it to accept any protocol.

The original I think was this:
$pattern = '!\bhttps?://([\w\-]+\.)+[a-zA-Z]{2,3}(/(\S+)?)?\b!';

I tried changing it to this:
$pattern = '!\b(https?|telnet|ftp)(:\/\/)([\w\-]+\.)+[a-zA-Z]{2,3}(/(\S+)?)?\b!';

I was also trying to include an optional '/' at the end of the URL, for cases where the URL contains only http://www.somesite.com/

I wrote this one for emails, which seems to work well. Actually, I took the example on the Zend website and modified it to match more addresses.

$body = preg_replace_callback('/[A-z0-9_\-\.]+[@][A-z0-9_\-]+([.][A-z0-9_\-]+)+[A-z0-9\-]+([.][A-z0-9_\-]+)?+[A-z]?/', array('library', 'mailify'), $body);

It matches dots in the address, optionally matches sub-domain addresses or double-suffix domains such as .co.uk, and matches addresses attached to a mailto: statement.
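
One detail worth checking in character classes of this kind: a range written A-z spans every ASCII code from 'A' (65) to 'z' (122), which includes the six punctuation characters [ \ ] ^ _ ` that sit between 'Z' and 'a'. A short sketch of the difference, using a throwaway test string (an assumption for illustration):

```php
<?php
// '^' (ASCII 94) lies between 'Z' (90) and 'a' (97), so A-z accepts it.
$loose  = preg_match('/^[A-z]+$/', 'abc^def');
$strict = preg_match('/^[A-Za-z]+$/', 'abc^def');

var_dump($loose);   // int(1)
var_dump($strict);  // int(0)
```

Writing the range as A-Za-z keeps the class to letters only.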

I would appreciate any comments you might be able to throw my way!

Thanks!
: )
Rich

#5
Friend of Wrox
November 6th, 2003, 03:07 PM

Your modified version of the pattern works for recognizing telnet and ftp protocol declarations. The trailing slash doesn't get recognized because the transition from a slash to whitespace (or the end of the line) does NOT constitute a word boundary. I thought that it would...

Remove the last \b in the pattern and the slashes should be recognized.
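
The word-boundary behaviour can be checked directly: \b only matches between a word character and a non-word character, and both '/' and a space are non-word characters. Here is a sketch using the multi-protocol pattern from the earlier post, with and without the final \b; the sample sentence is an assumption for illustration.

```php
<?php
$text = 'Visit http://www.somesite.com/ today';

$with_boundary    = '!\b(https?|telnet|ftp)(:\/\/)([\w\-]+\.)+[a-zA-Z]{2,3}(/(\S+)?)?\b!';
$without_boundary = '!\b(https?|telnet|ftp)(:\/\/)([\w\-]+\.)+[a-zA-Z]{2,3}(/(\S+)?)?!';

preg_match($with_boundary, $text, $a);
preg_match($without_boundary, $text, $b);

echo $a[0], "\n"; // http://www.somesite.com
echo $b[0], "\n"; // http://www.somesite.com/
```

One trade-off to keep in mind: without the trailing \b, the greedy \S+ will also swallow punctuation that directly follows a path, such as the full stop at the end of a sentence.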

When matching hostnames, most people find it sufficient to require the top-level domain to be either 2 or 3 characters. All country domains (ws, tv, uk, en, jp, etc...) and US domain types (net, com, org, edu, gov, mil) will be matched.
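
A quick check of the {2,3} limit on the top-level domain, using the http-only pattern from earlier in the thread. The hostnames are made-up examples; the third one uses a longer top-level domain (.info) to show what falls outside the 2-to-3-letter rule.

```php
<?php
$pattern = '!\bhttps?://([\w\-]+\.)+[a-zA-Z]{2,3}(/(\S+)?)?\b!';

$hosts = array(
    'http://www.example.com',
    'http://www.example.co.uk',
    'http://www.example.info',
);

foreach ($hosts as $host) {
    preg_match($pattern, $host, $m);
    // Print what the pattern actually captured for each hostname.
    echo $host, ' => ', isset($m[0]) ? $m[0] : '(no match)', "\n";
}
```

The first two hostnames match in full. The .info one prints "(no match)" because the trailing \b cannot sit in the middle of "info"; widening the quantifier is one way to accept longer suffixes.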


Take care,

Nik

#6
richard.york (Wrox Author)
November 6th, 2003, 04:31 PM

Thanks Nik, that did the trick.
