[Web4lib] new book list script

Bret Parker Bret.Parker at ci.stockton.ca.us
Thu May 17 12:19:56 EDT 2007


Ouch. Yes, I see what you mean about the Millennium approach being inaccessible.

I have toyed with using Python's HTML and HTTP modules (not their formal names). The examples I used when I was playing with this were all from Alex Martelli's Python in a Nutshell (around p. 419; the httplib module). Perl may have similar features, but I find Python to be less Byzantine than Perl. I have another O'Reilly book, John Callender's Perl for Web Site Management, and I have used some scripts from it to extract links from HTML files that reside on the same computer as the Perl program. With Python, though, I was actually able to 'get' the HTML using HTTP calls (with Python acting like the browser on my workstation and contacting a web server), and then I attempted to parse the XHTML (in this case) with Python. I ran into some dead ends due to invalid XML or bad character data and abandoned my approach. But you might find something that would work for you, especially if you do not need valid XML as I did for what I was doing.
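
A minimal sketch of the fetch-then-parse idea described above, assuming a current Python 3 (where httplib has been renamed http.client, and urllib.request is the simpler way to fetch a page). It uses the standard library's forgiving HTML parser rather than a strict XML parser, so invalid markup does not stop it. The class and function names are made up for illustration, and any URL passed to fetch_page is whatever catalog page you want to scrape; nothing here is specific to Millennium.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleAndLinkParser(HTMLParser):
    """Collect the page title and every <a href>, tolerating sloppy markup."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears inside <title>...</title> is kept.
        if self._in_title:
            self.title += data


def parse_page(html_text):
    """Return (title, list_of_hrefs) for a string of HTML."""
    parser = TitleAndLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.links


def fetch_page(url):
    """Fetch a page the way a browser would and return it as text."""
    with urlopen(url) as response:
        # Assumes UTF-8; bad character data is replaced instead of raising.
        return response.read().decode("utf-8", errors="replace")
```

To try it against a live page, call parse_page(fetch_page(url)) with the address of the page you want. Pulling out titles, authors, or cover image URLs would mean adding handlers for whatever tags the OPAC page actually uses for them.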

For a helpful Python intro, try either of these sources:

  How to Think Like a Computer Scientist, Learning with Python
    http://www.ibiblio.org/obp/thinkCSpy/

  or  Learning Python by Mark Lutz (O'Reilly)  or any of the other books by Mark.

I really like Alex Martelli's Python in a Nutshell, but I would not recommend it as a systematic, example-driven way to learn Python scripting.

   Python can be freely downloaded at Python.org.

Bret Parker, Senior Applications Programmer Analyst (MLIS)
Stockton-San Joaquin County Public Library
City of Stockton (California)
bret.parker at ci.stockton.ca.us
(209) 937-7148

http://www.stockton.lib.ca.us


>>> "Ben Haines" <bhaines at forestparkpubliclibrary.org> 5/17/2007 8:16 AM >>>
Thanks for all your responses! 

Kathleen: the ISBNs are from other staff doing collection development, and ultimately from Baker and Taylor, etc.  I could manually put together a page by typing in the HTML to display the title and author for each book, then locating the cover image in our catalog and adding the URL.  But it would certainly be quicker and easier if the process could be automated. 

Bret: Our OPAC is managed at the consortium level, so I can't really do much configuration. Also, the Millennium reports server can't be accessed from member libraries at this point (although this might change in the near future). That's why I thought that scraping the OPAC page and reassembling the data might be the way to go. Is this sort of thing difficult to do in Python? Can you point me to any good examples?

-Ben

--
Ben Haines
Reference/Technology Librarian
Forest Park Public Library
bhaines at forestparkpubliclibrary.org 

-----Original Message-----
From: Turner,Kathleen [mailto:kt32 at drexel.edu] 
Sent: Thursday, May 17, 2007 8:25 AM
To: Ben Haines; web4lib at webjunction.org 
Subject: RE: [Web4lib] new book list script


Where are you getting the ISBN's and why couldn't that source also give
you the rest of the info?  

Kathleen 


Kathleen H. Turner
Web/Education Librarian
W.W. Hagerty Library
33rd and Market Streets
Philadelphia, PA 19104-2875

Tel: 215.895.6783
Fax: 215.895.2070
khturner at drexel.edu 
_______________________________________________
Web4lib mailing list
Web4lib at webjunction.org 
http://lists.webjunction.org/web4lib/


