wget question
Alan Jackson
ajackson
Mon May 17 11:57:32 PDT 2004
On Mon, 29 Dec 2003 10:13:05 -0500
Joel Hammer <joel at hammershome.com> wrote:
> Some additional progress. I seem to be getting more files if I bypass
> the first couple of pages:
>
> wget -r -p http://atlases.muni.cz/atlases.muni.cz/_images-common
>
> This behavior has me puzzled. Could this be because these images are
> fetched by javascript?
>
wget doesn't do javascript. It speaks HTTP (or FTP) directly and, when run
recursively, retrieves only files that are linked from the web pages it
encounters. It also respects robots.txt files.
Retrieving a site that uses javascript or database lookups is hard.
If you can work out the file naming scheme, you could write a simple webpage
with links to all the images, and then run wget on *that*. I also have
a simple perl hack for sucking files off a site... as you can see by the
amount of stuff commented out, I always start with this file and hack
it until it works.
#!/usr/bin/perl -w
# Retrieve a numbered series of image files and save them to disk
# initialize
use LWP::UserAgent;
#use HTML::Parse;
#use HTML::FormatText;
#$url = "http://soiweb/~rjmiller/ISMAP/houtraf/new.houtraf.gif";
$url = "http://ccc.ece.utexas.edu/~thecap/pictures/Sewanee2000/2000-07-03/";
#print "$url\n";
# set up and open connection
$agent = new LWP::UserAgent;
#$agent->proxy('http','http://internet:80/'); # needed if going through firewall
#$agent->proxy('http','http://134.163.248.80:80/'); # needed if going through firewall
for ($i=1;$i<=50;$i++) {
    # zero-pad single-digit numbers: 1 -> 01
    my $z="";
    #if ($i<100) {$z='0';};
    if ($i<10) {$z='0';};
    ##next if $i < 40;
    #my $file = "c$z" . $i . 'a.jpg';
    #my $file = "amj$z" . $i . '.jpg';
    #my $file = "lezz$z" . $i . '.jpg';
    my $file = "tj$z" . $i . '.jpg';
    next if -s $file;    # skip files already downloaded
    print STDERR "fetch $file\n";
    my $newurl = $url . $file;
    $request = new HTTP::Request('GET', "$newurl");
    $response = $agent->request($request);
    if (!$response->is_success() ) {print STDERR "Couldn't get URL. Status code = ",$response->code,"\n";next;}
    $output = $response->content;
    # Save the image to disk
    open(GIF,">$file") or die "Can't write $file: $!\n";
    binmode(GIF);    # image data is binary
    print GIF $output;
    close(GIF);
    print STDERR "fetched $file\n";
}
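
As for the "simple webpage with links" idea mentioned above, here is a
rough sketch in plain sh of what I mean. The base URL (example.com), the
"tj" prefix, the 1-50 range and the index.html name are all placeholders
made up for illustration, not anything taken from a real site, so adjust
them to whatever naming scheme you manage to decode:

```shell
#!/bin/sh
# Print an HTML page with one link per numbered image, so that wget
# has real links to follow. All arguments are illustrative placeholders.
# usage: make_index BASE_URL PREFIX FIRST LAST
make_index() {
    base=$1; prefix=$2; first=$3; last=$4
    echo '<html><body>'
    i=$first
    while [ "$i" -le "$last" ]; do
        # %02d zero-pads the number: 1 -> 01
        printf '<a href="%s%s%02d.jpg">%s%02d.jpg</a>\n' \
            "$base" "$prefix" "$i" "$prefix" "$i"
        i=$((i + 1))
    done
    echo '</body></html>'
}

# Hypothetical site and file prefix; writes the page to index.html
make_index "http://example.com/pictures/" tj 1 50 > index.html
```

After that, something like "wget --force-html -i index.html" should fetch
whatever the page links to. Since the links are absolute, wget should not
need a --base option to resolve them.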
--
-----------------------------------------------------------------------
| Alan K. Jackson | To see a World in a Grain of Sand |
| alan at ajackson.org | And a Heaven in a Wild Flower, |
| www.ajackson.org | Hold Infinity in the palm of your hand |
| Houston, Texas | And Eternity in an hour. - Blake |
-----------------------------------------------------------------------