[Web4lib] HTTP batch downloader that will submit cookies and POST data
John Fitzgibbon
jfitzgibbon at Galwaylibrary.ie
Tue Aug 25 06:50:56 EDT 2009
Hi,
In the past, I have used a batch downloader such as Download Accelerator Plus to download a number of web pages. Each web page has a URL with a different query string. For example, suppose I wish to download the pages
http://www.somesite.com?age=1
http://www.somesite.com?age=2
...
http://www.somesite.com?age=100
I can easily create a text file of such URLs and point the downloader at this file. The downloader then downloads each page in turn into a folder. I concatenate the resulting HTML files into one file and convert it to XML to extract the information I need.
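As an illustration of the URL-list step described above, here is a minimal sketch in Python. It assumes the placeholder site and the `age` parameter from the example; the function names and the one-URL-per-line layout are hypothetical choices, not something any particular downloader requires.

```python
from urllib.parse import urlencode


def build_urls(base, param, start, stop):
    """Return one URL per value of `param`, from start to stop inclusive."""
    return [f"{base}?{urlencode({param: n})}" for n in range(start, stop + 1)]


def write_url_file(urls, path):
    """Write the URLs to a text file, one per line (assumed layout)."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(urls) + "\n")
```

Calling `write_url_file(build_urls("http://www.somesite.com", "age", 1, 100), "urls.txt")` would write the hundred URLs to a file named `urls.txt`, which is itself just an example name.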
This approach does not work if the site requires a cookie to be submitted with each request. None of the downloaders I have tried will submit a cookie. Is there a downloader that will do this?
Secondly, if the page uses the POST method rather than the GET method to submit data to the server, specifying a file of URLs will not suffice. Are there downloaders that can POST data to a web server in batch download mode?
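To make the two requirements concrete, the following is a rough sketch, using only the Python standard library, of what a batch run that sends a cookie and POSTs form data might look like. It is not a recommendation of a specific downloader. The cookie string, the `age` field, the output file names, and all function names are assumptions for illustration; a real site would dictate its own cookie and form fields.

```python
import os
import urllib.request
from urllib.parse import urlencode


def make_request(url, cookie=None, post_fields=None):
    """Build a request; supplying post_fields turns it into a POST."""
    data = urlencode(post_fields).encode("ascii") if post_fields else None
    req = urllib.request.Request(url, data=data)
    if cookie:
        # Send the cookie verbatim, e.g. "session=abc" (hypothetical value).
        req.add_header("Cookie", cookie)
    return req


def download(req, path):
    """Fetch the response body and save it to path."""
    with urllib.request.urlopen(req) as resp, open(path, "wb") as out:
        out.write(resp.read())


def batch_download(url, values, cookie, folder="."):
    """POST each value of the assumed 'age' field and save one file per page."""
    for n in values:
        req = make_request(url, cookie=cookie, post_fields={"age": n})
        download(req, os.path.join(folder, f"page_{n}.html"))
```

For example, `batch_download("http://www.somesite.com", range(1, 101), "session=abc")` would issue one POST per value and save the responses as `page_1.html` through `page_100.html`. Omitting `post_fields` in `make_request` gives a GET request that still carries the cookie.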
I would appreciate any suggestions.
Regards,
John
John Fitzgibbon
w: www.galwaylibrary.ie
e: info at galwaylibrary.ie
p: 00 353 91 562471
f: 00 353 91 565039