Saturday, October 30, 2010

wget notes

wget is used to download files from a web site over HTTP.  Too bad FileZilla, which is vastly easier to use and handles FTP sites (as well as SFTP over ssh), doesn't have an HTTP option.

wget is invoked with the URL as its argument.  Useful tuning options include:

-c continue a failed download
-r recurse into directories
-l depth (level) to recurse
--retry-connrefused  treat "connection refused" as a temporary error and retry, rather than stop
--no-host-directories  don't create a top-level directory named after the host
-t number  number of tries (overrides the default of 20)
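
To keep a working combination of these options around, here is a minimal Python sketch that assembles the command line. The function name build_wget_command and the example.com URL are placeholders of mine, not something from the wget docs; only the flags themselves come from the list above.

```python
import shlex


def build_wget_command(url, depth=1, retries=20, resume=True):
    """Assemble a wget argument list from the options listed above."""
    cmd = ["wget"]
    if resume:
        cmd.append("-c")                 # continue a partial download
    cmd += ["-r", "-l", str(depth)]      # recurse, limited to `depth` levels
    cmd.append("--retry-connrefused")    # retry on "connection refused"
    cmd.append("--no-host-directories")  # no top-level host-name directory
    cmd += ["-t", str(retries)]          # number of tries
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # example.com is a placeholder URL, not one from these notes.
    cmd = build_wget_command("http://example.com/pub/", depth=2, retries=5)
    print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd) as-is; building it as a list rather than a string avoids shell quoting problems with odd URLs.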

~/.wgetrc  saved options, per user
/etc/wgetrc  global option file if present

handy shell script at:

source package system:

When recursing, wget also climbs into parent directories and downloads lots of crap you don't want unless you use the --no-parent (-np) option.

here is an example:

wget -np -r -l 1

The next thing to figure out is how to save the files without all the crap at the front of the file name.

The above results in files with index.php?path=linux%2Fpackages/ at the front of each file name.

At least I can get all the sources for building Linux onto my system.  Next is to figure out how to clean up the names, probably with a script.

I need a slice-and-dice Python script to use for these occasions.
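
A rough first cut at such a script might look like the following. It assumes the junk prefix always ends at the last "/" once the %2F escapes are decoded, which matches the index.php?path=linux%2Fpackages/ case above but has not been checked against other sites. The names clean_name and rename_downloads are my own.

```python
import os
from urllib.parse import unquote


def clean_name(name):
    """Return the part of a downloaded name after the last (decoded) slash."""
    decoded = unquote(name)            # turns "%2F" back into "/"
    return decoded.rsplit("/", 1)[-1]  # keep only the final path component


def rename_downloads(directory):
    """Rename files in `directory` to their cleaned names; return the changes."""
    renamed = []
    for old in sorted(os.listdir(directory)):
        new = clean_name(old)
        src = os.path.join(directory, old)
        dst = os.path.join(directory, new)
        # Skip empty or unchanged names, non-files, and anything that
        # would overwrite an existing file.
        if not new or new == old or not os.path.isfile(src) or os.path.exists(dst):
            continue
        os.rename(src, dst)
        renamed.append((old, new))
    return renamed
```

Calling rename_downloads(".") in the download directory returns a list of (old, new) pairs, so it is easy to eyeball what was changed. It only looks at one directory level and never overwrites an existing file.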
