So I was looking at my stats: at the start of the year about 8% of my traffic was SPAM. Yuck. Then something insane happened this week, and it jumped to 28%.
So I’d crossed the point where something had to be done!
I’ve already installed stuff to detect the spam, and it does a good overall job. But I wanted to take it to the next level and block all traffic from the spammers! Anyone who spams is probably engaged in other nonsense that makes me not want their traffic.
Thankfully for me and this brave new era of Google, I could quickly find that someone had done 99% of the legwork for me right here! Thanks to Sakis’s hard work I was able to add some minor tweaks and generate a full iptables config, flush & add the new rules, then have cron run it every few minutes.
Pretty cool stuff if I do say so myself!
—
Since the primary site is now offline, I’ve updated with an archive.org link. For what it’s worth, here is the meat of the article in question:
Dodging WordPress comment spammers
I admit: allowing anyone to post comments is bad practice. Still, I’ve got my reasons to stand my ground. Many times I’ve read something on a blog and had something to add. I could have helped the blog’s author or future visitors by sharing my own experience, or asked for a solution to one of my problems by posting a question. Guess what? I am so lazy that I rarely go through a registration procedure just to be able to post a comment.
I am one of those who insist that dialog and discussion are always constructive as long as both ends feel like establishing them. I do not want to lose the opinions and comments of passing visitors just because I want a “safe” thing that runs on its own. But “buts” exist. My blog is currently one month old, yet it manages to receive 300+ spam comments per day on average, and I’ve even witnessed 1k in a day.
Thank god, WordPress provides blacklist features based on both IP addresses and comment content. And it really does a good job: after working through your recent “spam” you can easily end up with a list that accurately detects non-constructive comments. However, you’ve not solved all your problems this way:
- New comments still come. They are just automatically rated as spam.
- Your database fills with garbage.
- Your web traffic statistics are spoiled.
- You waste bandwidth.
- You waste CPU time.
- If your spammer ever stops selling drugs and starts advertising flesh, all your content matching rules become useless.
- If your spammer loses interest in being a blog spammer and switches to port scanning, you will receive that too.
How about you refuse them a spare TCP socket? Besides, you don’t even wanna know them. All their connection attempts will end up in the void. Time for some iptables magic.
WordPress has already stored their IP addresses in its database. Consult the wp-config.php file you edited when you first installed WordPress, and refresh your memory on what your database name, username and password are. Mine are:
$ grep "DB_" wp-config.php
define('DB_NAME', 'mywordpress');
define('DB_USER', 'sakis');
define('DB_PASSWORD', 'myextrastrongpassword');
define('DB_HOST', 'localhost');
define('DB_CHARSET', 'utf8');
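If you would rather not retype these values by hand, a small helper along these lines could read them straight from wp-config.php. This is only a sketch: wp_config_value is a hypothetical name, and it assumes the simple single-quoted define('KEY', 'value'); layout shown in the grep output above.

```shell
# Hypothetical helper (not part of the original article): print the value of
# one define('KEY', 'value') line from a wp-config.php file.
# Assumption: keys and values are single-quoted, as in the grep output above.
wp_config_value() {
    local key="$1" file="$2"
    # Match the define() line for the requested key and print only the value;
    # head keeps the first match if the key somehow appears twice.
    sed -n "s/^[[:space:]]*define([[:space:]]*'${key}'[[:space:]]*,[[:space:]]*'\([^']*\)'.*/\1/p" "$file" | head -n 1
}
```

For example, db_name=$(wp_config_value DB_NAME wp-config.php) would set db_name to mywordpress for the file above.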
You now have to use that information to construct this one-line command:
mysql -f -p --user=DB_USER DB_NAME <<<"select distinct CONCAT('iptables -A INPUT -s ',comment_author_IP,'/32 -j DROP') from wp_comments where comment_approved='spam' order by 1 asc" | grep -v "^CONCAT" >> THEY_BOTHER_ME
Check my example:
$ mysql -f -p --user=sakis mywordpress <<<"select distinct CONCAT('iptables -A INPUT -s ',comment_author_IP,'/32 -j DROP') from wp_comments where comment_approved='spam' order by 1 asc" | grep -v "^CONCAT" >> THEY_BOTHER_ME
Enter password:
$ head THEY_BOTHER_ME
iptables -A INPUT -s 113.161.128.232/32 -j DROP
iptables -A INPUT -s 117.121.208.254/32 -j DROP
iptables -A INPUT -s 118.141.141.7/32 -j DROP
iptables -A INPUT -s 118.194.1.157/32 -j DROP
iptables -A INPUT -s 119.235.27.100/32 -j DROP
...
You now have a simple recipe, named “THEY_BOTHER_ME”, ready to be executed (as root):
$ su
# . ./THEY_BOTHER_ME
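Since that file is sourced as root, anything that ends up in it runs as a root command. One cautious option is to have MySQL return only the raw addresses and build the rules yourself, keeping only lines that look like IPv4 addresses. The sketch below assumes that approach; emit_drop_rules is a hypothetical helper and not part of the original recipe.

```shell
# Hypothetical filter: read one address per line on stdin and print a DROP
# rule only for lines that look like a dotted-quad IPv4 address.
# Anything else (IPv6 addresses, empty lines, stray text) is silently skipped.
emit_drop_rules() {
    grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$' | sort -u |
    while read -r ip; do
        printf 'iptables -A INPUT -s %s/32 -j DROP\n' "$ip"
    done
}
```

It could be fed with something like mysql -N -p --user=sakis mywordpress -e "select distinct comment_author_IP from wp_comments where comment_approved='spam'" | emit_drop_rules > THEY_BOTHER_ME, where -N suppresses the column header so the grep -v step is no longer needed. The pattern only checks the shape of the address, not that each octet is 255 or less.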
Make sure you hook “THEY_BOTHER_ME” into your system’s start-up procedure and set up a cron/at job to refresh it periodically.
I’ve created a file named /etc/cron.daily/update_spammers.sh, with the following contents:
#!/bin/bash
# bash, not sh: the here-string (<<<) used below is not available in POSIX sh.
fileloc="/etc/THEY_BOTHER_ME"
# Make sure the rule file exists, then count the rules already in it.
touch "${fileloc}"
before=$(wc -l < "${fileloc}")
cp "${fileloc}" /tmp/BOTHERS.$$
# Append one DROP rule per distinct spam IP recorded by WordPress.
mysql -f --user=sakis --password=myextrastrongpassword mywordpress <<<"select distinct CONCAT('iptables -A INPUT -s ',comment_author_IP,'/32 -j DROP') from wp_comments where comment_approved='spam' order by 1 asc" | grep -v "^CONCAT" >> /tmp/BOTHERS.$$
# Merge old and new rules, dropping duplicates.
sort -u /tmp/BOTHERS.$$ > "${fileloc}"
rm -f "/tmp/BOTHERS.$$"
# Load the rules. iptables -A always appends, so rules that are already
# loaded get added a second time on every run.
. "${fileloc}"
after=$(wc -l < "${fileloc}")
di=$((after - before))
printf "[%s] Spammers updated. Added %d new spammer(s) (Before: %d, After: %d)\n" "$(date)" "${di}" "${before}" "${after}"
And sadly his original script is now offline. This should be enough for anyone to get going on this exciting spam adventure…
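One thing to watch with the cron script above: every run sources the whole file again, and iptables -A appends even when an identical rule is already loaded. A flush-and-reload variant avoids that. The following is only a sketch of the idea, not the exact tweak used on this site: the chain name SPAMMERS and the function name build_spammer_chain are made up, and it only prints the commands so they can be reviewed before being run as root.

```shell
# Hypothetical variant: keep the spam rules in a dedicated chain (SPAMMERS is
# an assumed name) so each run can flush and rebuild it without touching any
# other INPUT rules. Reads one IPv4 address per line on stdin and prints the
# iptables commands instead of running them.
build_spammer_chain() {
    # Create the chain if it is missing (the error is hidden if it exists),
    # then empty it.
    echo 'iptables -N SPAMMERS 2>/dev/null'
    echo 'iptables -F SPAMMERS'
    while read -r ip; do
        if [ -n "$ip" ]; then
            printf 'iptables -A SPAMMERS -s %s/32 -j DROP\n' "$ip"
        fi
    done
    # Hook the chain into INPUT only if the jump is not already there.
    echo 'iptables -C INPUT -j SPAMMERS 2>/dev/null || iptables -I INPUT -j SPAMMERS'
}
```

Piping a list of spam addresses through build_spammer_chain and then into sh as root would rebuild the chain from scratch each time, so running it from cron every few minutes does not pile up duplicates.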
I’ve been seeing your blog on the sites I roam around on; one for Dell UNIX and the other for Neko.
Seems weird!
It isn’t that weird, I’m EVERYWHERE… or I have a wide & strange range of interests… 🙂
Be careful with blocking legacy IPv4 addresses… The use of CGNAT is extremely widespread so there might be many thousands of real users behind a single address. One customer of the ISP becomes infected with malware which starts sending out spam, and suddenly all customers are now blacklisted and can’t visit various sites. And from a user perspective, the site will look to be down.
In some cases you don’t even need any malicious behaviour, simply a large number of real users from a single address is enough to trigger a block in many cases.
I have frequent trouble accessing legacy IPv4 sites because of this. For more modern sites that have implemented IPv6 I never have a problem.
It’s been about a decade since I went through this, and I ended up using the Akismet plugin to deal with the spam. It was a much better way to deal with the noise, and it still allowed me to access the blog from within China when I was there.