Pest-bots already scanning...
Thread Starter
Join Date: Dec 2003
Location: UK
Posts: 211
Hi,
So pests are scanning for my robots.txt file already!
About 21 days ago I registered a new www domain.
The site has only been up for an hour or two, on and off, as it's in development, and a blank page currently comes up when the URL is visited.
I went in tonight to change features on the hosting and thought I'd have a look at the visitor log. It's copied below:
216.144.233.206 - - [22/Nov/2004:01:19:39 +0000] "GET /robots.txt HTTP/1.0" 404 - "-" "-"
158-147-185-84.harris.com - - [23/Nov/2004:21:33:41 +0000] "GET / HTTP/1.1" 200 464 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
crawl12-public.alexa.com - - [26/Nov/2004:14:11:30 +0000] "GET /robots.txt HTTP/1.0" 404 - "-" "ia_archiver"
crawl12-public.alexa.com - - [26/Nov/2004:14:11:30 +0000] "GET / HTTP/1.0" 200 464 "-" "ia_archiver"
Excuse me if this is stupid, but why are they scanning my robots.txt file? Isn't that the text file that gets you into search engines? How do they extrapolate data from it?
Can you take precautions against them? What are they doing?
Yours confused
Thanks in advance
Supercalifragilisticexpialidocious
Join Date: Sep 2001
Location: Essex, UK
Posts: 588
Why do you say pest?
Hi WG774,
It looks like your site is being crawled by the bot for www.alexa.com
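For anyone wondering how to read that off the log: each line follows the common "combined" access log layout, with the visitor's host first and the user agent in the last quoted field. Below is a minimal Python sketch that pulls those fields out of one of the lines posted above. The names `LOG_PATTERN` and `parse_line` are made up for this illustration, and the pattern is only assumed to fit lines shaped like the ones in this thread.

```python
import re

# Assumed layout: host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# One of the lines from the visitor log in the first post.
sample = (
    'crawl12-public.alexa.com - - [26/Nov/2004:14:11:30 +0000] '
    '"GET /robots.txt HTTP/1.0" 404 - "-" "ia_archiver"'
)

entry = parse_line(sample)
print(entry["host"], entry["path"], entry["status"], entry["agent"])
```

The host name and the `ia_archiver` user agent are what identify the visitor as Alexa's crawler, and the 404 status just means no robots.txt file exists on the site yet.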
Now, I'd not choose to add the Alexa toolbar to my own machines, but I'm curious: why do you consider its attempt to index your site to be the actions of a pest?
regards
Memetic
P.S. Looking for your robots.txt file means the bot is being well behaved. The following might be of interest:
http://www.robotstxt.org/wc/exclusion-admin.html
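To show what a well-behaved bot does with that file, here is a small Python sketch using the standard library's `urllib.robotparser`. The rules in `rules` are purely hypothetical, written for this example and not taken from anyone's site; they shut out `ia_archiver` completely and keep every other bot out of a made-up `/private/` folder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
rules = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A polite crawler asks this question before fetching each page.
alexa_allowed = parser.can_fetch("ia_archiver", "/index.html")
other_allowed = parser.can_fetch("SomeOtherBot", "/index.html")
other_private = parser.can_fetch("SomeOtherBot", "/private/notes.html")

print(alexa_allowed, other_allowed, other_private)
```

A missing robots.txt (the 404 in the log) is treated as "no restrictions", so the crawler simply went on to fetch the home page. Note that the file is only a request: it works because polite bots choose to honour it.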
Last edited by Memetic; 1st Dec 2004 at 14:11.
Thread Starter
Join Date: Dec 2003
Location: UK
Posts: 211
Thanks Memetic.
Guess I overreacted, as I was led to believe by anti-pest software that anything relating to Alexa was sinister... I jumped to the conclusion that they were looking to harvest information on keywords / email addresses and sell this info on.
Supercalifragilisticexpialidocious
Join Date: Sep 2001
Location: Essex, UK
Posts: 588
That anti-pest software is probably talking about the Alexa toolbar.
The toolbar lets you quickly access the Alexa popularity rankings for sites and other Alexa services. It also lets them track which sites you visit so that they can compile the site rankings, which is why it is considered spyware by some. Personally, as they tell you what it does up front and don't attempt to force you to download it, I'd not consider it malware.
Memetic.
Spicy Meatball
Join Date: Jan 2004
Location: Liverpool UK
Age: 41
Posts: 1,115
What you want is for the big boys, such as Google, to come and spider you. When your site is finished, try submitting it to Google (presuming you want the site indexed)!
http://www.google.com/addurl.html
Regards
Maz :ok
Join Date: Jul 2003
Location: Scotland
Posts: 151
WG774
If you don't want bots trawling your site looking for email addresses, add the following bit of code to the <head> section of your pages.
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
I am sure there are unscrupulous types out there that will ignore this, but it is the official way to ask bots not to index a page or follow its links.
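As a rough picture of what a compliant bot does with that tag, here is a short Python sketch built on the standard library's `html.parser`. The class name `RobotsMetaParser` and the sample page are invented for this example; a real crawler's handling may well differ.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for item in attributes.get("content", "").split(","):
                self.directives.add(item.strip().lower())


# Made-up page carrying the tag from the post above.
page = """<html><head>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<title>Under construction</title>
</head><body></body></html>"""

checker = RobotsMetaParser()
checker.feed(page)

may_index = "noindex" not in checker.directives
may_follow = "nofollow" not in checker.directives
print(may_index, may_follow)
```

A bot that honours the tag would skip indexing the page and would not follow its links. One that ignores it, such as an email harvester, never runs a check like this at all.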