What are libraries doing today?

Bernhard Eversberg

Universitätsbibliothek Braunschweig

2006-07-03

This presentation has been prepared for English-speaking students at Braunschweig Technical University. Some parts of it are a bit technical because it grew from a talk given to students of computer science, and of database theory in particular. Most of it, however, should be easy to follow.

Libraries and the Internet, together, contain the intellectual record

  • of all times,
  • from all places
  • and all cultures,
  • in all languages,
  • with contributions from all individuals who wanted to share their ideas, insights, memories, experience, and opinions.

Navigating this enormous universe cannot be very easy. Much work on making it easier began a long time ago, and it will continue for a long time to come.

Libraries were among the first to use the Internet to improve their services to the public. But the Internet is only one component.

Findability has become a primary concern in all intellectual activity and, increasingly, also in everyday activities: Never before has so much information been available to everybody, but nobody can buy more time! What follows is this:

You want to find the right thing / the best thing as directly and quickly as possible. You don't want to learn more about searching than absolutely necessary. You will prefer methods with which you have had success or which you expect to be the most economical (time-saving). Sometimes, though, learning more about searching and acquiring new insights and habits can get you ahead. And this includes library catalogs. We'll get there in a minute.

Two things are necessary for findability:

good software and good input.

Authors and publishers of books have never concerned themselves with the findability of their products (they always left that to libraries), but today more and more webmasters care a great deal about how easily their homepages can be found in Google. The usefulness of standardized metadata, however, has yet to be widely understood.

Internet search services have to earn money to be able to stay in business. They focus on popular content, on the very new items and on things people might want to buy, and they make it easy to find things that have many links pointing to them. Google and Amazon are best at finding the well-known sites and the high-demand items.

Libraries try to build broad collections that cover many subjects, they preserve rare items that are nowhere else to be found, and they try to help researchers who ask new questions to discover the unknown that no one has found before them.

Both approaches to findability are necessary today, and they complement each other - in fact they often link to each other. Database technology and the Internet infrastructure are needed in any case. Libraries are the natural places to combine all resources and make them available to everybody. Libraries are also the natural places, for the communities they serve, to combine all resources that cost money, and that includes real books as well as access to online versions of periodicals.

Without computers, a library had to have two or more manual files. Book ordering and keeping track of orders was highly labor intensive. Books on order or "in process" were not findable for the library patrons, and difficult to find even for staff.

License management is partly done by groups of libraries to share the expenses and labor.

Several databases have to interact with each other to make a modern library work efficiently. The largest database, supporting all functions, is the catalog, and it keeps growing all the time.

What should a good catalog do?

  1. Produce reliable results
  2. Clearly display differences
  3. Bring together what belongs together
  4. Present meaningful choices
  5. Locate what users want

Three kinds of searches:

Known-item search (standardized names etc.)

Subject search (controlled vocabulary, "thesaurus")

Collocation search (… what belongs together)

Requirements:

STANDARDS: Data format / Cataloging rules

INDEXING: Metadata such as names, keywords, titles, codes, subjects,… / The full text of books is mostly not available for indexing.

SEARCHING: Boolean combinations, Fuzzy logic

BROWSING: Display of alphanumeric indexes

NAVIGATION: From any record to related records
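The indexing, searching and browsing requirements can be illustrated with a minimal sketch. It assumes a tiny in-memory collection of metadata records; the sample records and the function names (build_index, search_and, search_or, fuzzy_terms, browse) are hypothetical, and the fuzzy matching simply uses Python's difflib rather than any particular catalog system's method.

```python
import difflib
import re
from collections import defaultdict

# Hypothetical sample records; only metadata is indexed, not full text.
RECORDS = {
    1: {"author": "Goethe, Johann Wolfgang von", "title": "Faust", "subjects": ["drama"]},
    2: {"author": "Mann, Thomas", "title": "Doktor Faustus", "subjects": ["novel"]},
    3: {"author": "Mann, Thomas", "title": "Der Zauberberg", "subjects": ["novel"]},
}

def tokenize(text):
    """Split a metadata value into lower-case index terms."""
    return re.findall(r"\w+", text.lower())

def build_index(records):
    """INDEXING: map every term to the set of record numbers containing it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for value in [rec["author"], rec["title"], *rec["subjects"]]:
            for word in tokenize(value):
                index[word].add(rec_id)
    return index

INDEX = build_index(RECORDS)

def search_and(*words):
    """SEARCHING: Boolean AND, records that contain all of the words."""
    sets = [INDEX.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()

def search_or(*words):
    """SEARCHING: Boolean OR, records that contain any of the words."""
    return set().union(*(INDEX.get(w.lower(), set()) for w in words))

def fuzzy_terms(word, cutoff=0.8):
    """Approximate matching: index terms that resemble a misspelled word."""
    return difflib.get_close_matches(word.lower(), list(INDEX), n=3, cutoff=cutoff)

def browse(prefix):
    """BROWSING: alphabetical slice of the index starting with a prefix."""
    return sorted(term for term in INDEX if term.startswith(prefix.lower()))
```

For instance, search_and("mann", "novel") narrows the result to the two Thomas Mann records, while browse("fau") lists the neighbouring index entries so that a user can pick one without knowing the exact spelling.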


Important: Standard Data structures

International: MARC21 for data exchange.

Systems can use different internal formats; they just need to be able to import and export MARC21.
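The idea of a private internal format with MARC21 import and export can be sketched as follows. The internal record layout and the function names are assumptions made for illustration. The tags used (100 personal name main entry, 245 title statement, 650 topical subject, 700 personal name added entry) are real MARC21 field tags, but the sketch only maps fields and does not produce the actual exchange file format.

```python
# Hypothetical internal record; the field names are assumptions.
internal = {
    "title": "Faust",
    "authors": ["Goethe, Johann Wolfgang von"],
    "subjects": ["Drama"],
}

def to_marc_fields(rec):
    """Export: map an internal record to (tag, subfields) pairs."""
    fields = []
    authors = rec.get("authors", [])
    if authors:
        fields.append(("100", {"a": authors[0]}))  # main entry, personal name
    fields.append(("245", {"a": rec["title"]}))    # title statement
    for subject in rec.get("subjects", []):
        fields.append(("650", {"a": subject}))     # topical subject
    for name in authors[1:]:
        fields.append(("700", {"a": name}))        # added entry, personal name
    return fields

def from_marc_fields(fields):
    """Import: rebuild the internal record from (tag, subfields) pairs."""
    rec = {"title": "", "authors": [], "subjects": []}
    for tag, subfields in fields:
        if tag in ("100", "700"):
            rec["authors"].append(subfields["a"])
        elif tag == "245":
            rec["title"] = subfields["a"]
        elif tag == "650":
            rec["subjects"].append(subfields["a"])
    return rec
```

A round trip through both functions returns the original record, which is the property an exchange format needs: whatever a system does internally, nothing is lost on export and re-import.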

RDBMS do not have all the necessary functions.

If an RDBMS is used, a large volume of additional software must be developed to provide the extra functions. An object-oriented approach is more suitable!

Some related problems affecting findability:

- A person can have more than one name (different spellings in different languages).

- A document may have more than one title.

- A document may have two or many authors.

- There may be other persons involved with a book, but also institutions, with many functions.

- If a title begins with an article, how to sort it?

- A document may consist of several parts.

- A document may cover many subjects.

- What to do with non-Latin scripts?
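Two of these problems, sorting titles that begin with an article and persons known under several names, can be sketched in a few lines. The article lists are deliberately incomplete, the name variants are illustrative rather than taken from a real authority file, and the function names filing_key and preferred_name are hypothetical.

```python
import unicodedata

# Assumed, incomplete lists of leading articles per language code.
ARTICLES = {
    "en": ("the ", "a ", "an "),
    "de": ("der ", "die ", "das ", "ein ", "eine "),
}

def filing_key(title, lang):
    """Return a sort key that skips a leading article and ignores diacritics."""
    key = title.lower()
    for article in ARTICLES.get(lang, ()):
        if key.startswith(article):
            key = key[len(article):]
            break
    # Decompose accented letters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", key)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

# Illustrative variant spellings mapped to one preferred form of the name.
NAME_VARIANTS = {
    "tschaikowsky, peter": "Tchaikovsky, Peter Ilich",
    "čajkovskij, pëtr il'ič": "Tchaikovsky, Peter Ilich",
}

def preferred_name(name):
    """Return the preferred form of a name, or the name itself if unknown."""
    return NAME_VARIANTS.get(name.lower(), name)
```

Note that the language of the title has to be recorded with it: "die" is an article in German but an ordinary word in English, so the key cannot be computed from the title string alone.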

There are interesting differences between library catalogs and search engines.

A catalog is, however, only a means to an end: Readers want the books, not descriptions of the books!

Unlike online documents, a physical book can be read by only one person at a time.

The catalog, therefore, has to be integrated with the circulation functions. This way, the catalog can also show when a book is not available because someone else has it. In that case, you may place a hold on it.
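How catalog display and circulation fit together can be sketched with a small class. This is a simplified model under stated assumptions: the class name Copy and its methods are hypothetical, and a returned book is handed directly to the next patron in the hold queue, which real systems handle with notification and pick-up periods.

```python
from collections import deque

class Copy:
    """One physical copy of a book, as seen by catalog and circulation."""

    def __init__(self, title):
        self.title = title
        self.borrower = None   # patron who currently has the book
        self.holds = deque()   # patrons waiting, first come first served

    def status(self):
        """Text the catalog would display next to the record."""
        if self.borrower is None:
            return "available"
        return f"on loan, {len(self.holds)} hold(s)"

    def check_out(self, patron):
        if self.borrower is not None:
            raise ValueError("copy is already on loan")
        self.borrower = patron

    def place_hold(self, patron):
        if self.borrower is None:
            raise ValueError("copy is available, no hold needed")
        self.holds.append(patron)

    def check_in(self):
        """Return the copy; the next waiting patron, if any, receives it."""
        self.borrower = self.holds.popleft() if self.holds else None
```

Because the status is computed from the same object that circulation updates, the catalog can never show a book as available while someone else has it.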

Availability (the item is in the collection) and accessibility (one can actually get it) are aspects of findability! Automation has enormously improved these as well.

Physical location has lost much of its importance.

Readers are not interested in where a document is stored or who owns it - as long as they can get it quickly. Increasingly, readers are getting used to consulting the larger catalogs of library networks. These show the combined holdings of many libraries, but also

  • Periodical articles (never before findable in normal library catalogs!)
  • E-documents and other Web objects. These can be found elsewhere, but it may help to find them in a context with printed sources. Virtual libraries try to do that.

Germany has six regional library networks, all of which can be accessed through a "virtual catalog", a software gateway installed in Karlsruhe.
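The principle of such a gateway, one query sent to several catalogs at once, can be sketched as follows. This is not a description of how the Karlsruhe software works: the catalogs are hypothetical in-memory lists, and grouping the hits by title is an assumption made to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the catalogs of regional library networks.
CATALOGS = {
    "Network A": ["Faust", "Der Zauberberg"],
    "Network B": ["Doktor Faustus", "Faust"],
}

def search_catalog(name, term):
    """Search one catalog; a real gateway would send a network request here."""
    return [(name, title) for title in CATALOGS[name] if term.lower() in title.lower()]

def virtual_search(term):
    """Send the same query to all catalogs in parallel and combine the hits."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda name: search_catalog(name, term), CATALOGS))
    combined = {}
    for hits in result_lists:
        for name, title in hits:
            combined.setdefault(title, []).append(name)
    return combined
```

The gateway itself holds no catalog data; it only forwards the query and collects the answers, so each network stays responsible for its own records.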

Braunschweig University Library also has, for example, all German online dissertations in the catalog as well as many e-journals.

Library catalogs, on the other hand, can also be integrated into portals together with other search services, like periodical article databases.

Publishing used to be very time-consuming and costly. In most cases, an intermediary was necessary: the publisher.

Today, "to publish" means to make something available to the public. Anybody can do it who has access to a webserver. Libraries can provide this service for the members of their institutions.

Advantages of an institutional repository:

  • Economy of scale
  • Standardization: file formats, metadata
  • Reliability and safety of server
  • 24/7 operation
  • Increased findability by integration: in catalogs and search engines

The institutional repository in Braunschweig is called "Digitale Bibliothek Braunschweig". Its contents can be found both in our catalog and in Google as well as other services, like OAIster.

Some older books are still much in demand but in poor condition, or too rare to risk their loss or damage. Libraries have turned to digitization (instead of the earlier microfilming) to make these works accessible to a much wider audience. Scholars no longer need to travel to the libraries that own precious resources; they can make virtual visits and study the texts at home. --> Example

Findability is achieved by including these objects in the online catalogs as well as placing metadata on webservers for harvesting by the search engines.
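What "metadata for harvesting" can look like is sketched below. The text does not name a metadata format, so the choice of a simple Dublin Core record (the minimal format used by OAI harvesters) is an assumption, as are the function name dublin_core_record and the example.org identifier.

```python
import xml.etree.ElementTree as ET

# Namespaces of the oai_dc container and the Dublin Core element set.
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

def dublin_core_record(title, creator, date, identifier):
    """Build a small Dublin Core description of one digitized object."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for name, value in [
        ("title", title),
        ("creator", creator),
        ("date", date),
        ("identifier", identifier),
    ]:
        ET.SubElement(root, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical example values.
record_xml = dublin_core_record(
    "Faust", "Goethe, Johann Wolfgang von", "1808", "https://example.org/doc/1"
)
```

A harvester only needs to read such records, not the digitized pages themselves, which is why a handful of standardized elements is enough to make the object findable elsewhere.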

Google is trying to do book digitization on a very large scale. To date (2006-06), their project is still in the beta phase and is being challenged by publishers for copyright infringement. Since they index the entire book content, the potential is highest for known-item searches, especially when looking for rare names or word combinations. The complete lack of controlled vocabulary makes collocation search impossible.

 

Preservation is one of the reasons for digitization.

Sometimes a book must be preserved as a physical object, but in most cases the intellectual content is more important than the paper.

Computer files, be they digital images or text, are very easily destroyed - even more easily than paper!

Like microforms, the earlier preservation technology, digital storage media may become unusable in the longer term:

  • The material substrate deteriorates physically.
  • The hardware fails or becomes unavailable.
  • The software becomes incompatible.

Strategies are under development to overcome or minimize these risks.

Libraries have to cooperate world-wide in order to avoid duplicate efforts.

Elementary learning necessities today

  • Media (Textbooks, Reference sources etc.)
  • Infrastructure (PCs, printer, scanner, copier, network)
  • A suitable place for working and meeting with other learners.

 

Today, it is the most economical solution for a university to provide all the needed resources under one roof - in the library. There should be no digital divide (between rich and poor) in the learning environment.

Important components of the learning environment

  • Information literacy (part of the curriculum?)
  • Information architecture (databases, connectivity)
  • Usability (smart functions, barrier-free interfaces)