Regexps in PHP, again

People keep insisting that preg_match is better than mb_ereg for non-Unicode lookups. So, here are actual benchmarks to make it clear.

Here are the results (in seconds, for 100,000 iterations with two calls each):

preg_match:     19.8039090633
mb_ereg:        15.9386620522
mb_ereg_search: 1.24934506416

Here is the source:

<?php
// Two patterns: one that matches an address in $text and one that does not.
$regexp = '[\w]+@[\w]+\.com';
$pcre_regexp = '/'.$regexp.'/';

$regexp2 = '[\s]+@[\s]+\.com';
$pcre_regexp2 = '/'.$regexp2.'/';

$text = 'blabla bla blbaaasdajkln dsfkl klewnjklfnjkne qwe123@gg.net adkljaskdlnkljnasdljk qwe@test.comasdjlajnsdklnasdklnjl';

// preg_match
$t1 = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    $res1 = preg_match($pcre_regexp, $text);
    $res2 = preg_match($pcre_regexp2, $text);
}
$t2 = microtime(true);

// mb_ereg
$t3 = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    $res3 = mb_ereg($regexp, $text);
    $res4 = mb_ereg($regexp2, $text);
}
$t4 = microtime(true);

// mb_ereg_search (the subject string is set once, before the loop)
$t5 = microtime(true);
mb_ereg_search_init($text);
for ($i = 0; $i < 100000; $i++) {
    $res5 = mb_ereg_search($regexp);
    $res6 = mb_ereg_search($regexp2);
}
$t6 = microtime(true);

echo 'preg_match:'.($t2 - $t1)."\n";
echo 'mb_ereg:  '.($t4 - $t3)."\n";
echo 'mb_ereg_search:'.($t6 - $t5)."\n";

Feel free to check it out yourself.
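If you want to try variations, here is a minimal sketch of a reusable timing helper. The `bench()` function, the shortened `$text`, and the small iteration count are assumptions for illustration only; they are not part of the benchmark above. One deliberate difference: as far as I understand, mb_ereg_search() resumes from where the previous search stopped, so this sketch calls mb_ereg_search_init() before every search to make each call scan the whole text. That adds its own overhead, so treat the numbers as rough.

```php
<?php
// Hypothetical helper (not from the original benchmark):
// runs $fn $iterations times and returns the elapsed seconds.
function bench(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Shortened sample text and a small iteration count, for illustration only.
$text = 'blabla bla qwe123@gg.net adkljaskd qwe@test.com asdjl';
$regexp = '[\w]+@[\w]+\.com';
$iterations = 1000;

$results = [
    'preg_match' => bench(function () use ($regexp, $text) {
        preg_match('/' . $regexp . '/', $text);
    }, $iterations),
];

// The mb_ereg* functions exist only when mbstring is built with regex support.
if (function_exists('mb_ereg')) {
    $results['mb_ereg'] = bench(function () use ($regexp, $text) {
        mb_ereg($regexp, $text);
    }, $iterations);

    $results['mb_ereg_search'] = bench(function () use ($regexp, $text) {
        mb_ereg_search_init($text); // reset the search position on every call
        mb_ereg_search($regexp);
    }, $iterations);
}

foreach ($results as $name => $seconds) {
    printf("%-16s %.6f s total, %.3f us per call\n",
        $name, $seconds, $seconds / $iterations * 1e6);
}
```

The closure call adds the same small overhead to every variant, so the relative ordering should still be comparable.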

Liked this post? Follow me on Twitter: @jimi_dini.

  • http://deadchannel.ru hex

    1. I strongly doubt that any serious application (i.e., not a spherical CMS in a vacuum) uses regexps that often; a hundred calls I can imagine, but not hundreds of thousands. This takes the comparison to a different scale: 0.019 vs 0.0012, which is not as significant as…

    2. Does it not depend on an extension which is not necessarily loaded? I’ve seen hosters without mbstring.
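On the extension question: mbstring is indeed optional, so code that has to run on arbitrary hosts can check for the function first. A minimal sketch follows; `regex_matches()` is a hypothetical wrapper, and it assumes the pattern contains no unescaped `/` and uses only syntax that both engines accept.

```php
<?php
// Hypothetical wrapper (not from the post): use mb_ereg() when the mbstring
// regex functions are available, otherwise fall back to preg_match().
function regex_matches(string $pattern, string $text): bool
{
    if (function_exists('mb_ereg')) {
        // mb_ereg() returns false when nothing matches (or on error).
        return mb_ereg($pattern, $text) !== false;
    }
    // Assumption: the pattern has no unescaped '/' and is valid PCRE as well.
    return preg_match('/' . $pattern . '/', $text) === 1;
}

var_dump(regex_matches('[\w]+@[\w]+\.com', 'mail qwe@test.com now'));
```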

  • http://indeyets.pp.ru/ indeyets

    Unfortunately, I did see CMSs (not too spherical, by the way) which used regexps a lot. And:

    a) they applied those to larger texts
    b) they had a lot of regexps to apply

    P.S. I should probably make a more real-life comparison, though.

  • http://gameblog.me/2011/04/technical-posting-function-ereg-is-deprecated-possible-fix/ Function ereg() is deprecated possible fix | Gameblog.me

    [...] which is the recommended replacement for ereg() (see point 1 above). This author actually did a speed test for mb_ereg() function versus the preg_match(). So…*Shrugs*. The replacement / fix worked for us [...]
