Using wget to snapshot a web site

Alan Jackson ajackson
Wed Dec 8 18:35:37 PST 2004


On Wed, 07 Dec 2005 15:00:49 -0600
Michael Hipp <Michael at Hipp.com> wrote:

> I'm trying to use wget to grab an offline copy of this website so I can 
> refer to it when doing development without Internet access.
> 
>     http://wiki.wxpython.org/index.cgi/FrontPage
> 
> But all the links in that page look like this:
> 
>     <a href="/index.cgi/ObstacleCourse">ObstacleCourse</a>
> 
> I can't find any combination of options for wget which will cause it to 
> follow these links. I presume it's because the link is written as an 
> absolute path when it is really just relative to the server root.
> 
> Anyone know how to get wget to grab these or another tool which might do 
> the job?
> 

The pages don't exist as static files - they are generated on the fly by
a cgi program.
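(For reference, the usual recursive-mirror flags - these all exist in
stock wget - would be something along the lines of

    wget --mirror --convert-links --page-requisites \
         http://wiki.wxpython.org/index.cgi/FrontPage

but, as you found, that doesn't pick up the generated /index.cgi/ pages.)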

One could presumably write a Perl script that acts like a user visiting
each page and captures the output, but it would be painful.
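
A rough sketch of what such a script might look like, using
LWP::UserAgent and HTML::LinkExtor from CPAN - the hard-coded host and
the filename mapping are just placeholders, not a tested tool:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTML::LinkExtor;
    use URI;

    my $host  = 'wiki.wxpython.org';
    my $base  = "http://$host";
    my @queue = ('/index.cgi/FrontPage');
    my %seen;
    my $ua = LWP::UserAgent->new;

    while (my $path = shift @queue) {
        next if $seen{$path}++;
        my $resp = $ua->get($base . $path);
        next unless $resp->is_success;
        my $html = $resp->decoded_content;

        # Save under a filename derived from the wiki page name.
        (my $name = $path) =~ s{^/index.cgi/?}{};
        $name =~ s{/}{_}g;
        $name ||= 'FrontPage';
        open my $fh, '>', "$name.html" or die "can't write $name.html: $!";
        print $fh $html;
        close $fh;

        # Queue every same-site /index.cgi/... link we haven't seen yet.
        my $p = HTML::LinkExtor->new(sub {
            my ($tag, %attr) = @_;
            return unless $tag eq 'a' and defined $attr{href};
            my $u = URI->new_abs($attr{href}, $base . $path);
            return unless $u->scheme eq 'http' and $u->host eq $host;
            push @queue, $u->path if $u->path =~ m{^/index.cgi/.};
        });
        $p->parse($html);
        $p->eof;
    }

That only grabs the HTML - it fetches no images or stylesheets, and it
makes no attempt to rewrite the links for offline browsing, which is
exactly the painful part.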

-- 
-----------------------------------------------------------------------
| Alan K. Jackson            | To see a World in a Grain of Sand      |
| alan at ajackson.org          | And a Heaven in a Wild Flower,         |
| www.ajackson.org           | Hold Infinity in the palm of your hand |
| Houston, Texas             | And Eternity in an hour. - Blake       |
-----------------------------------------------------------------------

