[Web4lib] Moving Content between Web Site Servers

Estee mommiemooga at gmail.com
Tue May 3 18:58:11 EDT 2011


Hello David,

Have you tried Wget? If the problems come from a non-persistent
connection, I would recommend it. I have enclosed the description below.
Wget is available from the GNU project; the manual is at
http://www.gnu.org/software/wget/manual/wget.html

Also, check file permissions (source and target) to be sure that you have
permission to write to all target directories.
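
As a rough sketch of one way to run that check, the small shell function
below lists any directory under a target tree that the current user cannot
write to. The function name is made up for this example, and it assumes a
POSIX shell with find available.

```shell
# Hypothetical helper: print every directory under "$1" that the current
# user cannot write to. Prints nothing if all of them are writable.
unwritable_dirs() {
    find "$1" -type d | while IFS= read -r dir; do
        [ -w "$dir" ] || printf '%s\n' "$dir"
    done
}
```

Call it with the top of the target tree, for example
"unwritable_dirs /path/to/target" (a placeholder path). Note that root can
write almost anywhere, so run it as the user who will do the copy.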

When logging in and switching user, be sure to use the dash so that the new
user's dot files are sourced. Otherwise, you may accidentally continue
writing with unintended permissions. For example:

% su - mommiemooga

When all else fails, use tar. To create a tar file that preserves all links
and permissions (sitedir below is a placeholder for the directory you want
to copy):

% tar cvfp tarfile.tar sitedir

And to extract a tar file:

% tar xvfp tarfile.tar
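
To see the whole round trip in one place, here is a small sketch that builds
a throwaway directory, archives it, lists the archive, and unpacks it
somewhere else. The names sitedir and restored are placeholders for this
example only, and everything happens in a temporary directory.

```shell
# Sketch: archive a directory, inspect the archive, unpack it elsewhere.
# "sitedir" and "restored" are placeholder names for this example.
work=$(mktemp -d)
cd "$work" || exit 1

mkdir -p sitedir/images
echo '<html></html>' > sitedir/index.html
chmod 640 sitedir/index.html            # a non-default mode to preserve
ln -s index.html sitedir/home.html      # a symbolic link to preserve

tar cpf tarfile.tar sitedir             # create, keeping permissions
tar tvf tarfile.tar                     # list contents without extracting

mkdir restored
tar xpf tarfile.tar -C restored         # extract into another directory
ls -l restored/sitedir                  # compare modes and the link
```

Listing with "tar tvf" before extracting is a cheap way to confirm that the
archive holds what you expect before you move it to the new server.
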

DESCRIPTION

GNU Wget is a free utility for non-interactive download of files from the
Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval
through HTTP proxies.

Wget is non-interactive, meaning that it can work in the background, while
the user is not logged on. This allows you to start a retrieval and
disconnect from the system, letting Wget finish the work. By contrast, most
of the Web browsers require constant user's presence, which can be a great
hindrance when transferring a lot of data.

Wget can follow links in HTML pages and create local versions of remote web
sites, fully recreating the directory structure of the original site. This
is sometimes referred to as "recursive downloading." While doing that,
Wget respects the Robot Exclusion Standard (/robots.txt). Wget can be
instructed to convert the links in downloaded HTML files to the local files
for offline viewing.

Wget has been designed for robustness over slow or unstable network
connections; if a download fails due to a network problem, it will keep
retrying until the whole file has been retrieved. If the server supports
regetting, it will instruct the server to continue the download from where
it left off.
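
As a sketch of how the features described above map onto command-line
options, here are three small wrapper functions. The function names and the
log file name are made up for this example; the options themselves are
standard Wget options. Nothing is downloaded until you call one of them
with a URL.

```shell
# Hypothetical wrappers; the function names and the log file name are
# invented for this example. Defining them does not start any download.

# Mirror a site recursively and rewrite links for offline viewing.
mirror_site() {
    wget --mirror --convert-links --page-requisites --no-parent "$1"
}

# Same mirror, but detached in the background with output sent to a log,
# so you can log out while it runs.
mirror_site_bg() {
    wget --background --output-file=wget-mirror.log \
         --mirror --convert-links --page-requisites --no-parent "$1"
}

# Resume a partially downloaded file, retrying without limit.
resume_file() {
    wget --continue --tries=0 "$1"
}
```

For example, "mirror_site_bg http://www.example.org/" (a placeholder URL)
starts the mirror and returns right away. By default Wget gives up after 20
tries, so --tries=0 is what makes it keep retrying indefinitely.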

Best,

Estee Donaghy

Tech Support Guru

mommiemooga at gmail.com

