In response to an idea that surfaced on the Autocat mailing list, suggesting that the rise of Google, Google Scholar, Open WorldCat, etc. reduces the need for individual cataloguers, I posted the following:
Which makes a lot of sense: why are thousands of trained cataloguers around the world all cataloguing the same books, so that we can all put variant records onto international cataloguing utilities? In the vast majority of cases, we would surely only need local holdings appended to one centralised catalogue record. The few trained cataloguers left would deal with the more individual items not held by anyone else, then contribute these records to GoogleCat so no-one else has to.
This, and the idea generally, didn’t get too much sympathy. One member of the list even replied to me off-list to say
Well, last one out the door, please turn off the lights.

I think this is a little over the top. I think there’s a lot of sense in the idea; I also think I didn’t explain myself too well.
At the moment, bibliographic records are shared between libraries, often internationally, via services provided by organisations such as CURL, RLG, and OCLC. Typically, a library will periodically upload new records to one or more of these organisations of which it is a member, and routinely download individual records to add to its own database. Each library will generally edit the records it downloads according to local policies and authority files, to correct small errors, or to update CIP data. If a library can’t find a record for the item, it will create a catalogue record from scratch. The main point here is that every library is maintaining its own database of bibliographic data, which duplicates a lot of other libraries’ catalogues in terms of the books described, though not necessarily in terms of the actual description. I think this is frequently needless in terms of staff time and leads to absolutely unnecessary differences between records.
To give a petty example, libraries A and B download the same CIP record from CURL which has no physical description (MARC 300 field). Library A notes that there are illustrations and thinks the map on p.46 is significant; Library B thinks it isn’t. Library A gives 300 $a200p :$bill., map;$c24cm.; Library B gives 300 $a200p :$bill. ;$c24cm.. This doesn’t really matter too much, but extend this to authorities (Library A follows LC authorities slavishly; Library B uses them only when there is a conflict of headings), classification (Library A uses Dewey so makes sure the 082 field is correctly filled in; Library B uses its own scheme so couldn’t care less), fixed fields, subject headings (Library A uses MeSH for medical books; Library B is content with LCSH for all books), not to mention GMDs, and the relative importance and content of various note fields. Many libraries seem to give LCRI equal weight to AACR2 and MARC 21; others, including my own library, don’t. All of these choices are valid ones for an intelligent cataloguing agency to make, and in some cases for individual cataloguers within an institution. The result is, as I said above,
thousands of trained cataloguers around the world all cataloguing the same books so we can all put variant records onto international cataloguing utilities. If Library C has catalogued a book once, why must hundreds of other libraries do the same? Downloading other libraries’ records should solve this problem, but doesn’t.
What if all libraries literally used the same record? GoogleWorldCat, or whoever, would hold a bibliographic record for Dan Brown’s The Da Vinci Code, which others link to. I’m no systems librarian or programmer of note, but it seems that catalogue records in modern systems (or at least Aleph) have an admin record from which hang the bibliographic (bib) record, item records, order records, etc. So, The Da Vinci Code might be held on admin record 100. When someone wants to view it on the OPAC, the system pulls in bib record 645 from the bibliographic database and item records 6789 and 7923 from the item database and shows them to the user. What if the bibliographic database were held remotely and the system merely fetched the bib record from there, via http://www.googleworldcat.com/bib.cgi?record=645 or something more realistic or elegant? Considering the increasing speed of network connections and the volume of traffic already carried by email and the web (I am probably showing my university library bias here), this would surely be possible: something like embedding an image from another website, except that the bandwidth would be given deliberately. Something similar is happening on most web catalogues anyway, as they are, of course, web catalogues. This is why I don’t think it would be hard for system suppliers to implement.
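To make the admin/bib/item split concrete, here is a minimal sketch of how an OPAC display might be assembled under this model. All record numbers, field names, and the in-memory "central" database are invented for illustration; in a real system the bib lookup would be an HTTP fetch against something like the bib.cgi URL above rather than a dictionary access.

```python
# Stand-in for the remote GoogleWorldCat bibliographic database.
# In practice this would be fetched over the network, e.g. from
# http://www.googleworldcat.com/bib.cgi?record=645 (hypothetical URL).
CENTRAL_BIBS = {
    645: {"title": "The Da Vinci Code", "author": "Brown, Dan"},
}

# Local databases: the admin record only *points* at the shared bib
# record; it does not hold a local copy of the description.
ADMIN_RECORDS = {100: {"bib_id": 645, "item_ids": [6789, 7923]}}
ITEM_RECORDS = {
    6789: {"barcode": "A0001", "location": "Main Library"},
    7923: {"barcode": "A0002", "location": "Short Loan"},
}

def fetch_remote_bib(bib_id):
    """Stand-in for fetching the shared record from the central service."""
    return CENTRAL_BIBS[bib_id]

def opac_view(admin_id):
    """Assemble what the OPAC shows: shared description + local holdings."""
    admin = ADMIN_RECORDS[admin_id]
    bib = fetch_remote_bib(admin["bib_id"])
    items = [ITEM_RECORDS[i] for i in admin["item_ids"]]
    return {"bib": bib, "items": items}

view = opac_view(100)
print(view["bib"]["title"], "-", len(view["items"]), "copies held locally")
```

The point of the sketch is that local data (items, orders) and shared data (the description) live in different places, and only the pointer is stored locally, so a correction made centrally reaches every library at once.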
This model is not a million miles away from what happens with authority control now. It is senseless for each library to maintain its own authority files when NACO and SACO do it anyway and vendors sell the file whole, with regular updates, although even here we don’t quite link directly in the same way. It would have other benefits too: if the record is changed in any way, the whole world gets the update seamlessly.
There are problems, of course. Security and bandwidth are a couple, and mirror sites would have to be used. Another is how to deal with items that don’t yet appear on GoogleWorldCat. The system outlined above could still accommodate local records if necessary (point to local record 789 rather than a remote one). Alternatively, libraries could catalogue such material and contribute the records to the central repository, in the same way that “qualified” libraries already do for LCSH and authority records. Although GoogleWorldCat might endanger cataloguers’ jobs, which I believe was the thrust of the original posting, it would instead give cataloguers the opportunity to spend more time actually cataloguing unique materials. The benefits for serials and electronic material in particular must be high.
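The local-fallback idea can be sketched the same way: an admin record points either at the central repository or at a purely local bib record for material the repository does not yet hold. Again, the record numbers and the two in-memory databases are invented for illustration; the "remote" lookup stands in for a network fetch.

```python
# Hypothetical databases: one shared, one purely local.
CENTRAL_BIBS = {645: {"title": "The Da Vinci Code"}}
LOCAL_BIBS = {789: {"title": "An uncatalogued local pamphlet"}}

def resolve_bib(pointer):
    """Resolve a bib pointer of the form ('remote', id) or ('local', id).

    'remote' would be a network fetch in practice; 'local' covers items
    not yet contributed to the central repository.
    """
    source, bib_id = pointer
    if source == "remote":
        return CENTRAL_BIBS[bib_id]
    return LOCAL_BIBS[bib_id]

print(resolve_bib(("remote", 645))["title"])
print(resolve_bib(("local", 789))["title"])
```

Once a local item is eventually contributed to the central repository, the pointer could simply be switched from local to remote without touching the holdings data.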
It’s an idea anyway.