The eagle-eyed among you may have noticed that the websites on my host, including this one, Political Staples and Transit Toronto, have had their internal search engines replaced by searches provided by Google. After testing Google’s ability to search within our websites, I was impressed enough that I decided to make the switch. As an added bonus, Google’s searches could be incorporated into our templates, and we could get a small boost in ad revenue through the Google AdSense program.
But the real reason I made the switch is that, if I hadn’t, our sites would be without an internal search engine, and on a site such as Transit Toronto, whose extensive content is best discovered by searching, this would be a crippling loss of functionality. I had no choice: the file which controlled the internal search engines had gone haywire.
A week ago Sunday, after finishing off the latest Bloggers Hotstove, I tried logging into my Movable Type system to post to my blog, only to find that the files weren’t working. My web pages were still online, but I couldn’t add or edit posts, and nobody could comment. This was server-wide, affecting Greg Staples’ blog as well, so I put in a call to my webhost’s technical support.
Now, let me take a moment to say how good a decision it was, after my big webhosting fiasco, to finally settle on Hostgator as my webhost. GoDaddy suspended me without warning when they encountered problems with my Movable Type installation, and forced me to run in circles to retrieve my files and my databases. 1and1 had shaky technical support and significant server downtime problems. Hostgator has been extremely reliable and robust and, when I called their technical support department at 11 p.m. on Sunday night, a human answered the phone immediately.
Initially, the prognosis didn’t look good. After checking my account, the technician notified me that my executable files had been disabled by order of the server administrators, and that I had to write an e-mail to their Abuse department to get the files re-enabled. He warned me that this could take 24 to 48 hours, but within 20 minutes of my sending off the e-mail, the Abuse department responded, not only re-enabling the files, but also identifying the specific file that was causing them problems, and setting me on the road to fixing the situation permanently.
Hostgator: best. webhosting. company. ever.
When I was told that my executable files had been disabled, I had a suspicion of what was going on, although the actual culprit surprised me. Movable Type has done an excellent job of ensuring that spammers trying to vandalize our blogs with spam comments or trackbacks are immediately shuffled off to our junk folders, but the spammers’ activity has not stopped, and even though most of the spam comments are not appearing on our blogs, the attempts still force Movable Type to do some work. And if a site gets targeted by a bunch of particularly bothersome spambots, these junk requests can come at a frequency of greater than one per minute, amounting to a mini denial-of-service attack.
When I was writing to the Abuse department, I explained this, and asked if the files mt-comment.cgi and mt-tb.cgi were the problem. They weren’t. Server resources were being stretched to the limit by mt-search.cgi, and I suspect the spammers are to blame.
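For anyone wanting to check which script is taking the hits on their own server, here is a minimal sketch in Python that tallies requests per .cgi file from an access log. It assumes the log is in the common Apache format; the sample lines, IP addresses and paths are made up for illustration and are not taken from my actual logs.

```python
import re
from collections import Counter

# Pulls the requested path out of a Common Log Format line,
# e.g. "GET /cgi-bin/mt/mt-search.cgi?search=foo HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def count_cgi_hits(log_lines):
    """Return a Counter mapping each .cgi script name to its request count."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path = match.group(1).split("?", 1)[0]  # drop the query string
        script = path.rsplit("/", 1)[-1]        # keep only the file name
        if script.endswith(".cgi"):
            counts[script] += 1
    return counts


# Hypothetical sample data (invented for this example).
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2000:23:01:02 -0500] "GET /cgi-bin/mt/mt-search.cgi?search=cheap+pills HTTP/1.1" 200 5120',
    '203.0.113.5 - - [01/Jan/2000:23:01:40 -0500] "POST /cgi-bin/mt/mt-search.cgi HTTP/1.1" 200 5120',
    '198.51.100.7 - - [01/Jan/2000:23:02:10 -0500] "POST /cgi-bin/mt/mt-comment.cgi HTTP/1.1" 200 310',
    '203.0.113.9 - - [01/Jan/2000:23:02:15 -0500] "GET /cgi-bin/mt/mt-search.cgi?search=casino HTTP/1.1" 200 5120',
    '198.51.100.7 - - [01/Jan/2000:23:03:00 -0500] "POST /cgi-bin/mt/mt-tb.cgi/42 HTTP/1.1" 200 120',
    '192.0.2.44 - - [01/Jan/2000:23:03:30 -0500] "GET /index.html HTTP/1.1" 200 9000',
]

if __name__ == "__main__":
    for script, hits in count_cgi_hits(SAMPLE_LOG).most_common():
        print(f"{script}: {hits}")
```

Note that the trackback line above ends in `/42` rather than `.cgi`, so this simple version does not count it; a real log would need the pattern adjusted to whatever URLs your installation actually serves.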
Earlier, I had noticed what looked like spambot activity on mt-search.cgi, with someone or something running searches, with surprising regularity, for text that was obviously a spam comment. At the time, I was baffled as to why the spambots would want to search mt-search; were they seeing if their acts of vandalism were getting through? But when I put this question to the community of Movable Type developers, I got this response:
The key thing to understand here is that spambots are often “brute-force” in nature. Meaning, they will often post to any form they can find on the web, whether it is a comment, guestbook, contact, order, or search form, etc. The important thing to note is that the mt-search form is not being “targeted”: these spambots are going after every form they can find, and then often find the mt-search form. This can cause HUGE problems for larger MT installations, because of the poor performance of mt-search.cgi against larger databases.
Bottom line, while we like to call it “blog spam” or “comment spam”, a better name for it is “form spam”.
Well, there you have it. And with mt-search.cgi taking up more resources than a typical request to mt-comment.cgi or mt-tb.cgi, you have the makings of a denial-of-service attack. I immediately disabled mt-search.cgi (which has resulted in no loss of functionality within the Movable Type system itself) and looked for a replacement. A number of individuals have suggested Fastsearch, a Movable Type plugin that is remarkably fast and efficient in delivering search results, but in the end I went for Google. I’ve decided that, in this area, I want to reduce the load on my server, and outsourcing the internal search engine is an effective way of doing this. Google’s results are also fast and efficient, and I’ve heard no complaints from my users yet.
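To illustrate the idea of handing searches off to Google, here is a rough sketch in Python that builds a site-restricted search link using Google’s public `site:` operator. This is not the AdSense search box I actually installed, just the simplest way to show how a search can run without touching your own server; the function name and the `example.com` domain are placeholders of my own invention.

```python
from urllib.parse import urlencode


def google_site_search_url(site, query):
    """Build a Google search URL restricted to one site via the site: operator."""
    # urlencode escapes the colon and turns spaces into plus signs.
    params = {"q": f"site:{site} {query}"}
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    # example.com is a placeholder domain, not one of my sites.
    print(google_site_search_url("example.com", "streetcar routes"))
```

Because the link points at Google, a spambot hammering the search form only generates traffic for Google, and the only thing my server has to deliver is the static page containing the form.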
Still, it is rather irksome, isn’t it, that these spamming individuals have nothing better to do than attempt to vandalize websites and drown out legitimate activities on the Internet. Already, the majority of e-mail on our system is spam, and image spam is proving difficult for our filters to recognize. We’ve already lost Usenet to this scourge; trackbacks are next to useless because of spam, and now we have to take all of these precautions to ensure that people can comment on our blogs. Why do these people have to ruin things for the rest of us?