[Web4lib] MARC strictness

Alexander Johannesen alexander.johannesen at gmail.com
Mon Nov 28 18:51:19 EST 2005


Hi,

On 11/29/05, Mike Taylor <mike at miketaylor.org.uk> wrote:
> This tells me that all MARC records could be replaced by a single line of
> undifferentiated keywords and identifiers, like this: "c s lewis the
> abolition of man moral law subjectivism 0060652942".
>
> No!  Don't shoot me!  I'm only joking!  I think!

  1. You're both right and wrong.
  2. We all love and hate MARC at the same time.
    and
  3. Welcome to metadata heaven *and* hell.

Very concrete, isn't it? :)

> What it really _does_ show -- I think -- is that _for the purposes of
> Amazon-like searching_, this ultra-weak metadata suffices.  The
> question is what proportion of all catalogue searching is in this
> sense "Amazon-like", and my feeling is that the answer is very close
> to 100% of it.  Not quite 100%, though: sometimes you really do need
> to differentiate between searching for books _written by_ Winston
> Churchill and books _about_ Winston Churchill.
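
To make that distinction concrete, here is a minimal sketch in Python. The
two records and the field names "author", "title" and "subject" are
illustrative assumptions, not real MARC tags or anybody's actual catalogue.
It only shows that a flat keyword match cannot tell "by" from "about",
while a fielded match can.

```python
import re

# Hypothetical sample records; the field names are illustrative labels only.
RECORDS = [
    {"author": "Churchill, Winston",
     "title": "The Second World War",
     "subject": "World War, 1939-1945"},
    {"author": "Jenkins, Roy",
     "title": "Churchill: A Biography",
     "subject": "Churchill, Winston"},
]


def tokens(text):
    """Lower-case a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def flat_search(records, query):
    """Amazon-like search: every query word may match anywhere in the record."""
    wanted = tokens(query)
    return [r for r in records if wanted <= tokens(" ".join(r.values()))]


def fielded_search(records, field, query):
    """Structured search: every query word must match inside one named field."""
    wanted = tokens(query)
    return [r for r in records if wanted <= tokens(r[field])]


by_or_about = flat_search(RECORDS, "winston churchill")
written_by = fielded_search(RECORDS, "author", "winston churchill")
about = fielded_search(RECORDS, "subject", "winston churchill")
```

Here by_or_about holds both records, while written_by and about hold one
each, which is the small remainder of searches where the structure earns
its keep.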

I think you're taking MARC too literally. You have to remember that it
is a 30-year-old culture more than a strict standard, and my colleagues
and I certainly treat it that way. No one handles MARC out of the box;
it has to go through normalisation filters and procedures, a lot of
general second-guessing of meaning, and some black magic to work out
whether the identification of anything within the record is usable.
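
As a rough idea of what such a filter looks like, here is a minimal sketch
in Python. It is not our actual process: the record is a plain dict of
tag to list of strings (an assumption made to keep the example
self-contained), and clean_text, clean_isbn, FILTERS and normalise are
hypothetical names. The tags 020 (ISBN), 100 (main entry, personal name)
and 245 (title statement) are the usual MARC 21 ones.

```python
import re


def clean_text(value):
    """Collapse runs of whitespace and drop trailing ISBD-style punctuation."""
    value = re.sub(r"\s+", " ", value).strip()
    # Full stops are left alone so that initials such as "C. S." survive.
    return value.rstrip(" /:;,")


def clean_isbn(value):
    """Return a bare 10- or 13-character ISBN, or None if none can be found."""
    # Take the leading run of digits, X, spaces and hyphens; this ignores
    # trailing qualifiers such as "(pbk.)".
    match = re.match(r"[\dXx -]+", value.strip())
    digits = re.sub(r"[^0-9Xx]", "", match.group(0)) if match else ""
    return digits.upper() if len(digits) in (10, 13) else None


# Per-tag filters; any tag not listed here falls back to clean_text.
FILTERS = {"020": clean_isbn, "100": clean_text, "245": clean_text}


def normalise(record):
    """Apply the per-tag filters and discard values that end up empty."""
    out = {}
    for tag, values in record.items():
        fn = FILTERS.get(tag, clean_text)
        cleaned = [fn(v) for v in values]
        out[tag] = [v for v in cleaned if v]
    return out


raw = {
    "020": ["0-06-065294-2 (pbk.)", "(pbk.)"],
    "100": ["Lewis, C. S.,"],
    "245": ["The  abolition of man /"],
}
clean = normalise(raw)
```

The real work is in the cases this ignores: checksum validation, conflicting
local cataloguing habits, and fields used for things they were never meant
to hold.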

> Finally let me also say that of course metadata has other uses as well
> as searching.  Roughly, the other half of the equation is retrieval,
> or display.  But again, I find myself thinking that the world probably
> needs rather less in the way of structure here than we information
> professionals tend to want to give them.

MARC is simply wonderful ... *when* you know how to handle it! If you
just use it out of the box, you *will* get into trouble. You need to
define good measures for normalisation and cleaning up. There are a few
projects around that do that. Just to give you an idea, we've had three
dedicated developers working full time for the last year to create such
a normalisation process, and we're still not happy with it. It's a known
problem within the library world, which is the very reason a lot of us
want to push towards a more semantically rich format. But of course, it
ain't happening any time soon.

Good luck.


Alex
--
"Ultimately, all things are known because you want to believe you know."
                                                         - Frank Herbert
__ http://shelter.nu/ __________________________________________________

