Spam Comments

Comment spam is the art of crafting spammy comments and getting them posted on websites and blogs (and on blogs that are websites). The goal varies from just getting clicks to a malicious page to trying to augment page rank for some stupid website. Some comments don't even have any links to anything and are there just to see how easy it is to get a comment published.

It works because many blogs out there publish these comments, either automatically or because the moderator is so happy to read a positive comment that they don't notice it is spam. Indeed, many spam comments come in the form of flattery, like this one (an actual spam comment):

"Thanks for an idea, you sparked at thought from a angle I hadn’t given thought to yet. Now lets see if I can do something with it."

If you search online for that comment you will find the exact same text posted to numerous websites. In reality, it's spam. Maybe not harmful, but definitely not an actual person flattering the author.

Note the spelling mistakes: they are intentional. Their purpose is to make it easy to find out where the comment has been successfully posted. They can also be used to test how long it took for a comment to be moderated, if it was moderated at all.

To prevent spam comments from being posted I've chosen to manually moderate each comment. This works fine because this site receives few visits.

However, one day I suddenly started to receive many spam comments. Instead of giving up and using a 3rd party service, I found some patterns that were common to all the spam. I then wrote a script to detect those patterns and move the matching comments out of the moderation queue into a separate database dedicated to spam comments. I also made the database publicly available on this site.
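To give a rough idea of what such a script can look like, here is a minimal Python sketch. It is not the script running on this site: the patterns, the dictionary keys and the `triage` function are all hypothetical, chosen only to illustrate the idea of matching pending comments against a list of patterns and setting the matches aside.

```python
import re

# Hypothetical patterns, for illustration only. The patterns actually used
# on this site are not listed in this post.
SPAM_PATTERNS = [
    # two or more links in the same comment
    re.compile(r"https?://\S+.*https?://\S+", re.IGNORECASE | re.DOTALL),
    # typical sales vocabulary
    re.compile(r"\b(cheap|discount)\s+(pills|watches)\b", re.IGNORECASE),
    # BBCode link markup, which a plain HTML comment form never needs
    re.compile(r"\[url=", re.IGNORECASE),
]


def is_spam(comment):
    """Return True if the comment body matches any known pattern."""
    return any(p.search(comment["body"]) for p in SPAM_PATTERNS)


def triage(pending):
    """Split pending comments into (kept for manual moderation, spam)."""
    kept, spam = [], []
    for comment in pending:
        (spam if is_spam(comment) else kept).append(comment)
    return kept, spam


if __name__ == "__main__":
    queue = [
        {"ip": "203.0.113.5", "body": "Nice post, thanks!"},
        {"ip": "203.0.113.9", "body": "Buy cheap pills at http://x.example"},
    ]
    kept, spam = triage(queue)
    print(len(kept), "kept,", len(spam), "moved to the spam store")
```

In a real setup the `spam` list would be written to the separate database, and everything in `kept` would still go through manual moderation.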

The spam rush stopped, but I kept the system in place and kept adding more ways to detect spam. It's relatively easy and it mostly keeps the spam away, while real commenters don't have to go through visual contortions to decipher letters or "create an account" somewhere.

It's sort of a proof of concept that it might still be possible to let users interact with a website without forcing them to jump through hoops. I'm sure this is the last breath of that; soon our Interweb will be ruined by bots.

Anyway, the spam comment database is constantly growing, so I've moved it to a separate sub-domain. It should be easier to read through the comments there, and it can help you, as a webmaster, determine whether a comment you've received might actually be spam. Indeed, when I'm in doubt I do a verbatim search of a comment to see where it has been posted before, or whether it appears in some other spam comment database.
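The verbatim search can also be done locally once you have a list of known spam comments. The sketch below is only an illustration under assumptions: the `known_spam` list stands in for whatever export of a spam database you have at hand, and the function names are made up for this example. It normalizes whitespace and case so that trivial differences don't hide an otherwise identical comment.

```python
import re


def normalize(text):
    """Collapse whitespace and case so trivial differences don't hide a match."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def build_index(known_spam):
    """Build a set of normalized comments for fast exact lookups."""
    return {normalize(text) for text in known_spam}


def seen_before(comment, index):
    """Return True if the comment is already in the index, verbatim."""
    return normalize(comment) in index


if __name__ == "__main__":
    # Stand-in for a spam database export; the single entry is the
    # flattery comment quoted earlier in this post.
    known_spam = [
        "Thanks for an idea, you sparked at thought from a angle I hadn’t "
        "given thought to yet. Now lets see if I can do something with it.",
    ]
    index = build_index(known_spam)
    incoming = ("Thanks for an idea,  you sparked at thought from a angle "
                "I hadn’t given thought to yet.\nNow lets see if I can do "
                "something with it.")
    print(seen_before(incoming, index))
```

Because the intentional spelling mistakes are kept as they are, an exact match on the normalized text is a strong hint that the comment came from the same source.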

The spam comment database can also be interesting to look at; it's full of curiosities.

Please note/disclaimer: If your IP/comment is listed and it shouldn't be, then please do let me know. I am a human and I reserve the right to make mistakes.



By the way, wouldn't this pattern script be interesting to publish? You could, for example, offer an alternative to the "conventional" WordPress antispam. (Which is good, but not free for commercial websites.)


I was able to just copy and paste the table into Excel and then save it as a CSV.

manu -

Yoni, that's good to know, but it's not great for automation.

I added a new page with links to JSON arrays. There are a few options so you can get just the IPs or comments or everything.