[WEB4LIB] Re: Z39.50 Discussion on Web4Lib
Sebastian Hammer
quinn at indexdata.dk
Sun Feb 11 16:20:39 EST 2001
At 15:22 09-02-01 -0800, Matthew Dovey wrote:
> > There are scalability issues with Z39.50. Multithreaded searching of more
> > than 5-7 institutions at a time can result in bottlenecks due to the
> > client/server communications overhead.
>
>Sebastian Hammer from IndexData has successfully searched about 200 Z39.50
>targets simultaneously achieving the same response time as searching 1
>target. In fact the response time is determined by the slowest Z39.50
>server, so when searching 200 you are statistically more likely to hit one
>which is slow (due to insufficient hardware etc.) or down which can make the
>search appear slower.
Actually, it was "only" 100 targets, but Matthew is correct that we
observed no significant latency introduced by the concurrent searches, and
frequently saw response times of 5-7 seconds to search and fetch a pageful
of records (approximately equal to the average response time of the slowest
server in the group). Note that these tests were carried out running a
client on a well-connected network - this would most likely NOT work over a
56K modem. Before we formally publicise our results, I expect we will have
made the test with 200 targets as well.
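
As a rough illustration of why concurrent searching need not add latency, the sketch below fans one query out to several simulated targets and measures the wall-clock time. It is not a Z39.50 client: search_target, the target names and the latencies are all hypothetical, and the network wait is simulated with a sleep. Under those assumptions the total time tracks the slowest target rather than the sum of all of them.

```python
import asyncio
import time

# Hypothetical per-target latencies in seconds (illustrative, not measured).
SIMULATED_LATENCY = {f"target-{i}": 0.05 + 0.005 * i for i in range(20)}


async def search_target(name, latency):
    # Stand-in for a real search-and-fetch; only the waiting is simulated.
    await asyncio.sleep(latency)
    return name, latency


async def search_all(latencies):
    # Start every search at once and wait until all of them have answered.
    start = time.perf_counter()
    results = await asyncio.gather(
        *(search_target(name, latency) for name, latency in latencies.items())
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


def run_demo():
    return asyncio.run(search_all(SIMULATED_LATENCY))


if __name__ == "__main__":
    results, elapsed = run_demo()
    slowest = max(latency for _, latency in results)
    total = sum(latency for _, latency in results)
    print(f"targets={len(results)} elapsed={elapsed:.3f}s "
          f"slowest={slowest:.3f}s sequential={total:.3f}s")
```

The same shape applies whether the concurrency comes from threads or from non-blocking I/O; the point is only that the client waits on all targets at once instead of one after another.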
>Of course it is possible to write bad multithreading code as well as good
>efficient multithreading code, and I suspect that some programmers may have
>made the claim that you can't search more than 7 targets to cover poor
>performance in their clients....
I think mostly the problem has to do with a failure to analyse the sources
of delays. Most often, this turns out to be sub-optimal target
implementations that delay the whole process by taking inordinate amounts
of time to do simple tasks. Often these problems are resolved once
cross-searching clients are seriously deployed, and user needs are made
visible.
There *are* scalability issues, but my sense is in practice they will have
more to do with quality-of-service and reliability issues (i.e. if you
search 500 targets, do you *need* a response from each one?).
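
One way a client might act on that question is to stop waiting after a deadline and present whatever has arrived. The sketch below is a minimal illustration under stated assumptions: the targets are simulated with sleeps, and search_with_deadline and its deadline parameter are hypothetical names, not part of any Z39.50 toolkit.

```python
import asyncio


async def search_target(name, latency):
    # Stand-in for a real search; only the waiting is simulated.
    await asyncio.sleep(latency)
    return name


async def search_with_deadline(latencies, deadline):
    # Map each running task back to the target it is searching.
    tasks = {
        asyncio.create_task(search_target(name, latency)): name
        for name, latency in latencies.items()
    }
    done, pending = await asyncio.wait(tasks.keys(), timeout=deadline)

    # Give up on targets that missed the deadline and let them unwind.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    answered = sorted(task.result() for task in done)
    skipped = sorted(tasks[task] for task in pending)
    return answered, skipped


if __name__ == "__main__":
    answered, skipped = asyncio.run(
        search_with_deadline({"fast": 0.01, "slow": 2.0}, deadline=0.3)
    )
    print("answered:", answered)
    print("skipped:", skipped)
```

Reporting the skipped targets to the user, rather than dropping them silently, keeps the partial result honest.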
Perhaps the most important issue, and one which is sometimes ignored in
discussions of Z39.50-based virtual union catalogs of any form, has to do
with server-side scalability (rather than client-side scalability, which is
where the "mystic" bottlenecks are quoted). Many library systems in smaller
libraries are really only scaled to handle a handful of workstations and
perhaps the odd web-user visiting the OPAC from home. But, if you make the
local library visible in a large-scale, regional or national virtual union
catalogue using parallel searching - then you had better make sure the
local systems are capable of handling the load. Either that, or you need to
devise ways to avoid sending user queries to irrelevant databases.
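
To make the last point concrete, here is one possible (and deliberately naive) way to avoid querying irrelevant databases: keep a short subject profile per target and only send a query to targets whose profile overlaps the query's subjects. The profiles, target names and select_targets function are assumptions made up for this sketch; a real deployment would need some agreed source for such collection descriptions.

```python
# Hypothetical collection profiles, keyed by target name.
TARGET_PROFILES = {
    "small-town-library": {"fiction", "local-history"},
    "medical-school": {"medicine", "nursing"},
    "law-library": {"law"},
}


def select_targets(query_subjects, profiles):
    # Keep only targets whose profile shares at least one subject
    # with the query; all other targets are never contacted.
    wanted = set(query_subjects)
    return sorted(
        name for name, subjects in profiles.items() if subjects & wanted
    )


if __name__ == "__main__":
    print(select_targets(["medicine"], TARGET_PROFILES))
    print(select_targets(["law", "local-history"], TARGET_PROFILES))
```

Filtering of this kind trades recall for load: a small library is spared queries it cannot answer, at the risk of missing the occasional unexpected hit.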
--Sebastian
--
Sebastian Hammer <quinn at indexdata.dk> Index Data ApS
Ph.: +45 3341 0100 <http://www.indexdata.dk> Fax: +45 3341 0101