[Web4lib] MARC strictness

Reynolds, Bess breynolds at debevoise.com
Mon Nov 28 09:39:46 EST 2005


Lars, the examples look like mistakes, as $d should be a date or
dates.
According to the Library of Congress, the proper order for subfields in
a personal name is:
a Personal name
b Numeration (followed by a comma)
c Titles and other words associated with the name (followed by a comma)
q Fuller form of name
d Dates associated with the name
The first indicator is 0 for a forename, 1 for a surname, or 3 for a
family name.
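
The subfield definitions above suggest a rough automated check: a $d
with no digit in it is probably not a date, and a $c that contains a
digit is probably not a title. The following is only a sketch under
that assumption. The function names are hypothetical, the input is the
"$a ... $c ..." display form used in this thread rather than a real
MARC record, and the digit test is a heuristic that will misjudge
unusual headings.

```python
import re

# Matches "$x value" pairs in the display form used in this thread.
SUBFIELD = re.compile(r"\$(\w)\s*([^$]*)")
HAS_DIGIT = re.compile(r"\d")

def parse_subfields(field_text):
    """Return (code, value) pairs from text like '$a Barth $d Professor'."""
    return [(code, value.strip()) for code, value in SUBFIELD.findall(field_text)]

def suspicious_subfields(field_text):
    """Flag $d values without any digit and $c values that contain one."""
    problems = []
    for code, value in parse_subfields(field_text):
        has_digit = bool(HAS_DIGIT.search(value))
        if code == "d" and not has_digit:
            problems.append((code, value))   # $d that does not look like dates
        elif code == "c" and has_digit:
            problems.append((code, value))   # $c that looks like dates
    return problems

for line in [
    "$a Meriam, James Lathrop, $c 1917-2000.",
    "$a Husayn ibn Ali, $d King of Hejaz, $c 1853?-1931.",
    "$a Barth $d Professor $4 aut",
]:
    print(line, "->", suspicious_subfields(line))
```

All three of the quoted examples are flagged by this check, while the
first one with $c changed to $d passes.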

These are usually based on authority records that are maintained by the
LOC.

Here is a link to the LOC standards:
http://www.loc.gov/marc/bibliographic/ecbdmain.html#mrcb100

Bess

-----Original Message-----
From: web4lib-bounces at webjunction.org
[mailto:web4lib-bounces at webjunction.org] On Behalf Of Lars Aronsson
Sent: Monday, November 28, 2005 5:51 AM
To: web4lib at webjunction.org
Subject: [Web4lib] MARC strictness


I'm looking at a set of MARC records from a library near me.  
Their cataloging guidelines are a very close translation of the 
Library of Congress' MARC21 guidelines, but there seems to be a 
lot of built-in tradition too, that isn't covered in documents.

My experience (and I should point out that I'm a programmer, not a 
librarian) tells me that people will follow formatting rules if it 
matters, but not otherwise.  All C, Java, and Perl programs have 
statements that end in a semicolon, or else they refuse to run.  
But not all programs are well structured, or easy to explain.  
And this seems to apply to MARC records as well.

The search interface to this library's catalog seems to handle 
every subfield just the same.  Sometimes in the personal names 
fields (100, 600, 700), I see subfields $c (title) and $d (years 
of birth and death) interchanged:

   100 1  $a Meriam, James Lathrop, $c 1917-2000.
   100 0  $a Husayn ibn Ali, $d King of Hejaz, $c 1853?-1931.
   700 1  $a Barth $d Professor $4 aut

In the first two examples, if the subfield markers are removed, 
the remainder is a human-readable line of text with commas and a 
period at the end.  This is the more common case, but the third 
example doesn't have these commas. Is there a rule for this?
In trying to clean up the records, simply removing the comma or 
period at the end of a subfield is insufficient, because there are 
cases such as "$c Dr." or "$a Eliot, T. S." where the period 
should be part of the subfield.
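
A minimal sketch of that cleanup problem, assuming a small hand-made 
list of abbreviations that keep their period and treating any single 
capital letter plus period as an initial.  The list and the function 
name are hypothetical, and a real catalog would need a much longer 
list:

```python
import re

# Assumed, incomplete list of abbreviations whose final period is kept.
KEEP_PERIOD = {"Dr.", "Prof.", "Jr.", "Sr.", "fl.", "B.C."}
INITIAL = re.compile(r"^[A-Z]\.$")   # a single capital plus period, e.g. "S."

def strip_trailing_punctuation(value):
    """Drop a trailing comma or period unless the period belongs to the text."""
    value = value.rstrip()
    if value.endswith(","):
        return value[:-1].rstrip()
    if value.endswith("."):
        last_word = value.split()[-1]
        if last_word in KEEP_PERIOD or INITIAL.match(last_word):
            return value             # period is part of an abbreviation or initial
        return value[:-1]
    return value

for text in ["Meriam, James Lathrop,", "1917-2000.", "Dr.", "Eliot, T. S."]:
    print(repr(text), "->", repr(strip_trailing_punctuation(text)))
```

The first two values lose their final comma or period; "Dr." and 
"Eliot, T. S." come back unchanged.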

The contents of subfield $d also vary greatly, e.g. the English 
"fl." (flourished) is mixed with the Swedish "levde", or the 
English "B.C." with the Swedish "f.Kr.", or more complicated 
statements such as "was born no later than 1751".  Circa can be 
abbreviated "c." (as in English) or "ca" or "c:a" (as in Swedish). 
Or the simple question mark after 1853 in the example above. In 
LoC's guidelines, I find no rules for the text inside the $d 
subfield.
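
The simpler variants listed above could be mapped to one form with a 
small replacement table.  This is only a sketch: the table covers just 
the variants mentioned here, the choice of "ca." as the target form 
for circa is arbitrary, the function name is hypothetical, and free 
text such as "was born no later than 1751" is left untouched.

```python
import re

# Assumed mapping from the variants mentioned above to a single form.
REPLACEMENTS = [
    (re.compile(r"\blevde\b"), "fl."),                        # Swedish "levde"
    (re.compile(r"\bf\.\s?Kr\."), "B.C."),                    # Swedish "f.Kr."
    (re.compile(r"\b(?:c:a|ca\.?|c\.)(?=\s|\d)"), "ca."),     # circa variants
]

def normalize_dates(value):
    """Apply each replacement in turn to the text of a $d subfield."""
    for pattern, replacement in REPLACEMENTS:
        value = pattern.sub(replacement, value)
    return value

for text in ["levde 1650", "100 f.Kr.", "c:a 1750", "ca 1750", "c. 1750"]:
    print(repr(text), "->", repr(normalize_dates(text)))
```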

Apparently, all these formatting inconsistencies exist because it 
really doesn't matter.  You can search for "Lathrop 1917" or "King 
Husayn ibn Ali" and you find what you're looking for.  Nobody 
would search for people having the title 1917.

Is this kind of inconsistency a problem, and how do libraries 
handle it?  Do you insist that such errors be corrected (and how 
do you motivate this requirement?), or have you long since given 
up that fight?



-- 
  Lars Aronsson (lars at aronsson.se)
  Aronsson Datateknik - http://aronsson.se




