robots.txt

Kevin O'Gorman kevin
Mon May 17 11:44:10 PDT 2004


On Sun, 9 Feb 2003, Ken Moffat wrote:

> Anyone know the function of robots.txt? I have seen attempted access to 
> it in my apache logs.
> 

Yes.  It is the file where a site tells well-behaved robots which
parts of the site they should not crawl.  For instance, I run a website
with a database of millions of nearly identical dynamic pages.  There's
no point in letting random webcrawlers try to index them all -- it
wastes their time and my bandwidth -- so I put rules in there to limit
their activity.
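
To make that concrete, here is a rough sketch of how a well-behaved
robot might consult such a file, using Python's standard
urllib.robotparser module.  The rules, the /db/ path, the example.com
URLs and the "ExampleBot" name are all made up for illustration; they
are not taken from my actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The /db/ path is an assumption
# standing in for wherever the dynamic pages live.
ROBOTS_TXT = """\
User-agent: *
Disallow: /db/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real crawler would normally
    # call set_url() and read() to download the file from the site.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A dynamic database page falls under the Disallow rule.
    print(allowed("ExampleBot", "http://example.com/db/record?id=12345"))
    # An ordinary page is not covered by any rule, so it is allowed.
    print(allowed("ExampleBot", "http://example.com/index.html"))
```

The file only asks; nothing in it stops a robot that chooses not to
perform this check.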

Compliance is voluntary: some crawlers follow their own rules and some
don't obey any, but support for robots.txt is pretty common, and the
format is fairly standard.

++ kevin



-- 
Kevin O'Gorman, PhD  (805) 650-6274  mailto:kevin at kosmanor.com
Permanent e-mail forwarder: mailto:Kevin.O'Gorman.64 at Alum.Dartmouth.org
Permanent e-mail forwarder  mailto:kogorman at umail.ucsb.edu
Web: http://kosmanor.com/~kevin/index.html


