[Web4lib] google & library catalogs

Andrew Cunningham andrewc at vicnet.net.au
Tue Apr 11 19:47:21 EDT 2006


Some of the search engine technologies out there could have interesting 
applications when combined with OPACs, but the idea of exposing 
library catalogues to Google is a frightening thought.

My concern is the problems that would result with non-English records. 
The ILS we use tends to store text as decomposed Unicode character 
sequences.

Most common keyboard layouts generate precomposed character sequences.

Google does not do any Unicode normalization.

Result: end users would not actually locate the items, even if the items 
are indexed in Google. Google will not match a precomposed search string 
with decomposed text. They're different byte sequences.
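
To illustrate the mismatch, here is a minimal Python sketch using the standard unicodedata module. The sample character (e with acute accent) and the variable names are my own choices for illustration, not anything taken from our ILS or from Google; it only shows that the two forms differ as raw strings and bytes, and that normalizing both sides to the same form (NFC here) would make them comparable.

```python
import unicodedata

# Precomposed form: a single code point, as most keyboard layouts produce.
precomposed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE

# Decomposed form: base letter plus combining mark, as an ILS might store it.
decomposed = "e\u0301"      # "e" + COMBINING ACUTE ACCENT


def nfc(text):
    """Return the text in Unicode Normalization Form C (composed)."""
    return unicodedata.normalize("NFC", text)


# Without normalization the two strings are not equal.
raw_match = precomposed == decomposed

# Their UTF-8 byte sequences differ as well.
bytes_differ = precomposed.encode("utf-8") != decomposed.encode("utf-8")

# After normalizing both sides to the same form, they compare equal.
normalized_match = nfc(precomposed) == nfc(decomposed)

print(raw_match, bytes_differ, normalized_match)
```

A search engine that normalized both the indexed text and the query in this way would match the two; one that compares the raw sequences would not.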

Currently we have enough problems with some language searches and Google.

And that is without even raising the issue of non-Unicode character 
encodings, which could create havoc.

-- 
Andrew Cunningham
Research and Development Coordinator
Vicnet, Public Libraries and Communications
State Library of Victoria
328 Swanston Street
Melbourne  VIC  3000
Australia

andrewc@vicnet.net.au

Ph. 3-8664-7430
Fax: 3-9639-2175

http://www.openroad.net.au/
http://www.libraries.vic.gov.au/
http://www.vicnet.net.au/

