[WEB4LIB] MARC -> XML
Joerg Messer
joerg.messer at ubc.ca
Wed Mar 23 19:23:29 EST 2005
Hi Anita,
I've been doing a little hacking using Python, the Zoom PyZ3950 package and
the Amara XML tools. I don't know if my sample code is of any use but I've
included it below. I'm a neophyte in this area, so I make no claims of this
being the preferred way of doing things. If you come up with a better
toolset, I'm all ears.
-----------------
from PyZ3950 import zoom, zmarc
from amara import binderytools

# Connect to the Library of Congress Z39.50 server and ask for MARC records
conn = zoom.Connection('z3950.loc.gov', 7090)
conn.databaseName = 'Voyager'
conn.preferredRecordSyntax = 'USMARC'

# Title search using CCL query syntax
query = zoom.Query('CCL', 'ti=law')
results = conn.search(query)
print "Number of results: " + str(len(results))

count = 0
for result in results:
    count = count + 1
    print "-------------------------------------------"
    print "Record: " + str(count)
    print "-------------------------------------------"
    raw = result.data

    # Convert to MARC
    marcdata = zmarc.MARC(raw)
    # print marcdata

    # Convert to MARCXML
    marcxml = marcdata.toMARCXML()
    print marcxml

    # Remove non-ascii characters (these cause problems for Amara)
    marcxmlascii = unicode(marcxml, 'ascii', 'ignore').encode('ascii')
    # print marcxmlascii

    doc = binderytools.bind_string(marcxmlascii)
    # print "[" + doc.record.leader.xml_text_content() + "]"

    # ISBN (020 $a)
    i = doc.xml_xpath("//datafield[@tag='020']/subfield[@code='a']")
    if len(i) > 0:
        isbn = i[0].xml_text_content()
        print "  ISBN: " + isbn

    # Title (245 $a)
    t = doc.xml_xpath("//datafield[@tag='245']/subfield[@code='a']")
    if len(t) > 0:
        title = t[0].xml_text_content()
        print " Title: " + title

    # Author (100 $a)
    a = doc.xml_xpath("//datafield[@tag='100']/subfield[@code='a']")
    if len(a) > 0:
        author = a[0].xml_text_content()
        print "Author: " + author

print "-------------------------------------------"
conn.close()
-----------------
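For comparison, here is a minimal sketch of the same three field lookups done with Python's standard xml.etree.ElementTree module instead of Amara. It is only an illustration under some assumptions: it is written for Python 3, the sample record is made up rather than taken from the server, and the helper names first_subfield and summarize are hypothetical. It also assumes the MARCXML elements carry no namespace, as the XPath expressions in the script above suggest; if the XML declares the MARC21 slim namespace, the paths would need to account for it.

```python
# Python 3 sketch using only the standard library. The sample record is
# invented for illustration; it is not output from the script above.
import xml.etree.ElementTree as ET

SAMPLE_MARCXML = """<record>
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="020" ind1=" " ind2=" ">
    <subfield code="a">0123456789</subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Café law :</subfield>
    <subfield code="b">an example.</subfield>
  </datafield>
</record>"""


def first_subfield(root, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = ".//datafield[@tag='%s']/subfield[@code='%s']" % (tag, code)
    node = root.find(path)
    return node.text if node is not None else None


def summarize(marcxml):
    """Pull ISBN (020 $a), title (245 $a) and author (100 $a) from a record."""
    root = ET.fromstring(marcxml)
    return {
        "isbn": first_subfield(root, "020", "a"),
        "title": first_subfield(root, "245", "a"),
        "author": first_subfield(root, "100", "a"),
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_MARCXML).items():
        print("%6s: %s" % (key, value))
```

Because the string is parsed as Unicode text here, the non-ASCII character in the sample title survives, so the ASCII-stripping step would not be needed with this approach.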
Anita Chiodo wrote:
> Hi,
> I'm looking for information on the best method (easiest/fastest/cleanest) to
> convert MARC to XML. Can anyone help guide me to any software packages
> or resources that are available?
>
> I've been through the web4lib archives and attempted to access
> information via LOC (received page errors); I've had little success with
> both.
>
> Sincerely,
> Anita
> Anita Chiodo, M.S.L.S.
> Manager, Library Services
> BrittleBOOK.com/BookARCHIVE.com
> Local Phone: 319-390-9442 x24
> Toll Free: 888-870-0484 x24
> Email: achiodo at newspaperarchive.com
--
Joerg Messer
Programmer/Analyst
UBC Library Systems
2206 East Mall
Vancouver, B.C. Canada V6T 1Z3
T: +1.604.822.5091
F: +1.604.822.3201
W: www.library.ubc.ca
E: joerg.messer at ubc.ca