Trying to Survive...
Quit a well paying job to start my own company.
Took the plunge to put my startup ideas to the test.
Making it into something huge!
Tuesday, September 21, 2004
Killer Robot Lawyers from Outer Space
I was looking through my server logs and noticed that someone had been going through every page on hurricane-tshirts.com. In fact, they went through the entire site in about 3 minutes. I'll usually see that sort of activity when a search engine is crawling a site, but this was different. Whoever, or whatever, had gone to my site was identifying itself as Internet Explorer on some pages, Netscape on other pages and Safari on others. Also, the IP addresses (e.g. 255.255.255.002) varied slightly from request to request, but they were all on the same subnet. So this definitely raised my suspicion about what was going on.
At first I thought it might be a spambot. A bot, or robot, is an automated program that goes through websites and collects information from them. Some bots are good, and some are bad. Spambots are bad. They go through sites and look for email addresses, which the spammers then use to send junk mail. The good bots are cataloging sites for search engines. This obviously wasn't a good bot, since it was trying to disguise itself. It also wasn't like any spambot I've seen before, because spambots at least don't try to disguise themselves as regular web browsers. There's really no need for them to cloak what they really are.
So after searching on the IP addresses I found that the bot on my site was from an intellectual property watch service. In essence, they crawl through sites and look for copyright and trademark infringements on their clients' behalf. In addition, they sometimes provide competitor research for their clients.
Well, I don't like this. Some might say that if you're not doing anything wrong then you don't need to worry. I look at it the same way I look at warrants. Even though I'm not doing anything wrong, I don't want the police to freely search my home. Same goes for these lawbots. I don't need them rummaging through my sites looking for anything they might be able to use to their advantage.
So after some more research I've implemented a way to block them from my sites.
Here's how I'm doing it: First I needed to find the IP address ranges that this company uses. To do this I ran a WHOIS lookup on the IP addresses from my server log. Then I modified the .htaccess file on my server. If you're on an Apache web server you probably already have one; if not, you can create a plain text file and name it ".htaccess". This is what I added to the .htaccess file:
RewriteEngine on
Options +FollowSymlinks
RewriteBase /
RewriteCond %{REMOTE_ADDR} ^65\.102\.23\. [OR]
RewriteCond %{REMOTE_ADDR} ^65\.102\.12\. [OR]
RewriteCond %{REMOTE_ADDR} ^63\.227\.217\. [OR]
# Only 12.148.209.190 through 12.148.209.255 (not .19, .2 or .20-.29)
RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[0-9]|2[0-9][0-9])$
# "-" leaves the URL unchanged; [G] answers with 410 Gone
RewriteRule ^.*$ - [G]
This sends anyone from the IP subnets 65.102.23.xxx, 65.102.12.xxx and 63.227.217.xxx, plus the addresses 12.148.209.190 - 12.148.209.255, to a server page that says the site is gone (an HTTP 410 response). I would have preferred to send them to a page with my own customized text, but I couldn't get that to work. Another downside is that I can only block lawbots, or anyone else for that matter, when I know their IP addresses. But this will do for now.
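One possible route to the customized page, offered as an untested sketch rather than a confirmed fix: Apache's ErrorDocument directive can point the 410 response at a page of your own. The file name /gone.html is hypothetical, so substitute whatever page you create. A likely reason a custom page fails here (an assumption, not something verified on this server) is that the request for the error page comes from the same blocked IP, so the rule fires again on the error page itself. Excluding that page from the rule avoids the loop:

```apache
# Serve a custom page for 410 Gone responses.
# /gone.html is a hypothetical file name; create it in the site root.
ErrorDocument 410 /gone.html

RewriteEngine on
RewriteBase /
# Do not block the error page itself, or the 410 handler would be blocked too
RewriteCond %{REQUEST_URI} !^/gone\.html$
# Same address ranges as in the rules above
RewriteCond %{REMOTE_ADDR} ^65\.102\.23\. [OR]
RewriteCond %{REMOTE_ADDR} ^65\.102\.12\. [OR]
RewriteCond %{REMOTE_ADDR} ^63\.227\.217\. [OR]
RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[0-9]|2[0-9][0-9])$
# "-" means no substitution; [G] returns 410 Gone
RewriteRule ^ - [G]
```

The first condition has no [OR] flag, so it is ANDed with the group of address checks that follows: the request is refused only when it is not for the error page and it comes from one of the listed ranges.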




