regexps in PHP, again

People keep insisting that preg_match is better than mb_ereg for non-Unicode lookups. So here is an actual benchmark to settle it.

Here are the results (wall-clock seconds for 100,000 iterations, two patterns per iteration):

preg_match:      19.8039090633
mb_ereg:         15.9386620522
mb_ereg_search:  1.24934506416

Here is the source:

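<?php
// First pattern: a crude "word@word.com" matcher; it matches once in $text.
$regexp = '[\w]+@[\w]+\.com';
// PCRE needs delimiters around the pattern; the mb_ereg family does not.
$pcre_regexp = '/'.$regexp.'/';

// Second pattern: whitespace on both sides of '@'; it never matches in $text.
$regexp2 = '[\s]+@[\s]+\.com';
$pcre_regexp2 = '/'.$regexp2.'/';

$text = 'blabla bla blbaaasdajkln dsfkl klewnjklfnjkne qwe123@gg.net adkljaskdlnkljnasdljk qwe@test.comasdjlajnsdklnasdklnjl';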

$t1 = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    $res1 = preg_match($pcre_regexp, $text);
    $res2 = preg_match($pcre_regexp2, $text);
}
$t2 = microtime(true);

$t3 = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    $res3 = mb_ereg($regexp, $text);
    $res4 = mb_ereg($regexp2, $text);
}
$t4 = microtime(true);

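$t5 = microtime(true);
// mb_ereg_search works on a string registered once with mb_ereg_search_init()
// and keeps its internal search position between calls.
mb_ereg_search_init($text);
for ($i = 0; $i < 100000; $i++) {
    $res5 = mb_ereg_search($regexp);
    $res6 = mb_ereg_search($regexp2);
}
$t6 = microtime(true);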

echo 'preg_match:      '.($t2 - $t1)."\n";
echo 'mb_ereg:         '.($t4 - $t3)."\n";
echo 'mb_ereg_search:  '.($t6 - $t5)."\n";

Feel free to check it out yourself.
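
One thing to keep in mind when checking it yourself: mb_ereg_search() does not rescan the registered string from the beginning on every call. It resumes from a stored position, which can be inspected and reset with mb_ereg_search_getpos() and mb_ereg_search_setpos(). The snippet below is only a small sketch of that behaviour, assuming the mbstring extension is loaded. It is not part of the benchmark above, and the variable names are just for illustration.

```php
<?php
// Sketch: mb_ereg_search() continues from where the previous search stopped.
// Assumes the mbstring extension is available.
$regexp = '[\w]+@[\w]+\.com';
$text = 'blabla bla blbaaasdajkln dsfkl klewnjklfnjkne qwe123@gg.net adkljaskdlnkljnasdljk qwe@test.comasdjlajnsdklnasdklnjl';

mb_ereg_search_init($text);

// First call scans from offset 0 and finds "qwe@test.com".
$first = mb_ereg_search($regexp);

// Second call resumes after that match; no further address is left to find.
$second = mb_ereg_search($regexp);

// Rewinding the position makes the next call scan the whole string again.
mb_ereg_search_setpos(0);
$third = mb_ereg_search($regexp);

var_dump($first, $second, $third);
```

If you want every iteration of the mb_ereg_search loop to scan the full text, as the preg_match and mb_ereg loops do, calling mb_ereg_search_setpos(0) before each search is one way to do it. Expect the timing to differ from the figure listed above.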

Comments

  • hex

    1. I strongly doubt that any serious application (i.e., not a spherical CMS in a vacuum) uses regular expressions that often; a hundred calls I can imagine, but not hundreds of thousands. That scales the comparison down to 0.019 vs 0.0012, which is not as significant as…

    2. Doesn't it depend on an extension that is not necessarily loaded? I've seen hosting providers without mbstring.

  • Unfortunately, I have seen CMSs (not too spherical, by the way) that used regexps a lot. And:

    a) they applied them to larger texts
    b) they had a lot of regexps to apply

    P.S. I should probably make a more real-life comparison, though.
