{"id":2154,"date":"2005-04-09T14:12:12","date_gmt":"2005-04-09T19:12:12","guid":{"rendered":"\/?p=2154"},"modified":"2005-04-13T13:07:40","modified_gmt":"2005-04-13T18:07:40","slug":"regex-question","status":"publish","type":"post","link":"https:\/\/timesandseasons.org\/index.php\/2005\/04\/regex-question\/","title":{"rendered":"Regex question"},"content":{"rendered":"<p>UPDATE:  It&#8217;s working!  Thanks to everyone who made suggestions.  <\/p>\n<p>Also, as with any change to our filters, we&#8217;ve tried to be as careful as we can not to block innocent posts.  But it&#8217;s always possible that I didn&#8217;t design or implement this quite right, and that it will inadvertently catch your innocent post.  If it does, please let us know, as soon as possible.  Thanks!<!--more--><\/p>\n<p>(Original post:)<\/p>\n<p>There is a famous blog-spam-bot which we particularly hate.  It sends more spam than all of the rest of the spam world, combined.  And it uses a set formula which I&#8217;d like to take advantage of.  The bot draws from a list of 12 names and 16 domains, and all the spam it sends uses e-mail addresses constructed along this formula.  I&#8217;d like to put a regex instruction into our wp-comments-post.php file to keep out this bot.  But I&#8217;m a terrible regex writer, and I can&#8217;t get it to work.<\/p>\n<p>The bot&#8217;s formula (which I&#8217;ve learned through sad experience) is as follows:  The names it uses are otard, napoleon, luba, johndoe, jane_doe, huy_lo, grey_goose, gocha, bushmills, azaddin, absolut, and absinth.  The domains it uses are aol, yahoo, usmail, bigfoot, mail, tech, see, come, classnet, work, arrivo, does, rocketmail, mainl, freemail, and hotmail.<\/p>\n<p>Every spam that this bot sends (and it sends out hundreds a day) is built using those names, those domains, and a set of random digits after the name and before the @ symbol.  
So our most recent spams show the e-mail addresses:<\/p>\n<p>luba8734@tech.tv<br \/>\nluba9207@does.it<br \/>\ngocha9776@see.it<br \/>\ngocha9334@does.it<br \/>\nabsinth1434@rocketmail.com<br \/>\njane_doe6926@work.com<\/p>\n<p>etc.<\/p>\n<p>Given the bot&#8217;s predictability, I&#8217;d like to run a command in my wp-comments-post.php file that runs a regex search on all comments and doesn&#8217;t let through the ones with a combination of that name and domain.  <\/p>\n<p>Here&#8217;s what I tried (which didn&#8217;t work):<\/p>\n<p>\/\/\t$regex = (otard|napoleon|luba|johndoe|jane_doe|huy_lo|grey_goose|gocha|bushmills|azaddin|absolut|absinth)[-w_.]*(aol|yahoo|usmail|bigfoot|mail|tech|see|come|classnet|work|arrivo|does|rocketmail|mainl|freemail|hotmail);<br \/>\n\/\/\t\tif (@preg_match($regex, $email))<br \/>\n\/\/\t\tdie( __(&#8216;Sorry, but some of the content of this comment appears to violate our comment policies.  This determination has been made using filtering software.  If you believe you have received this message in error, please contact us by e-mail as soon as possible.  Thank you.&#8217;) );<\/p>\n<p>It&#8217;s giving me a parse error when it hits the brackets.  Does anyone know why that is?  (Or better yet, does anyone know what I <em>would<\/em> need to put in to the wp-comments-post.php file to run that regex?)<\/p>\n<p>Muchas gracias.  <\/p>\n","protected":false},"excerpt":{"rendered":"<p>UPDATE: It&#8217;s working! Thanks to everyone who made suggestions. Also, as with any change to our filters, we&#8217;ve tried to be as careful as we can not to block innocent posts. But it&#8217;s always possible that I didn&#8217;t design or implement this quite right, and that it will inadvertently catch your innocent post. If it does, please let us know, as soon as possible. 
Thanks!<\/p>\n","protected":false},"author":30,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-2154","post","type-post","status-publish","format-standard","hentry","category-corn"],"jetpack_sharing_enabled":true,"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/posts\/2154","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/users\/30"}],"replies":[{"embeddable":true,"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/comments?post=2154"}],"version-history":[{"count":0,"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/posts\/2154\/revisions"}],"wp:attachment":[{"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/media?parent=2154"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/categories?post=2154"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/timesandseasons.org\/index.php\/wp-json\/wp\/v2\/tags?post=2154"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}