On the way to a library ontology

11 April 2013 at 15:02, 2 comments

I have been working for some years on the specification and implementation of several APIs and exchange formats for data used in, and provided by, libraries. Unfortunately, most existing library standards are either fuzzy, complex, and misused (such as MARC21), or limited to bibliographic data, authority data, or both. Libraries, however, are much more than bibliographic data – they involve library patrons, library buildings, library services, library holdings, library databases etc.

During the work on formats and APIs for these parts of the library world, with the Patrons Account Information API (PAIA) being the newest piece, I found myself more and more on the way to a whole library ontology. The idea of a library ontology started in 2009 (now moved to this location), but designing such a broad data model from the bottom up would surely have led to yet another complex, impractical and unused library standard. Meanwhile there are several smaller ontologies for parts of the library world, to be combined and used as Linked Open Data.

In my opinion, ontologies, RDF, Semantic Web, Linked Data and all the buzz are overrated, but they include some opportunities for clean data modeling and data integration, which one rarely finds in library data. For this reason I try to design all APIs and formats to be at least compatible with RDF. For instance, the Document Availability Information API (DAIA), created in 2008 (and now being slightly redesigned for version 1.0), can be accessed in XML and in JSON format, and both can be fully mapped to RDF. Other micro-ontologies include:

  • Document Service Ontology (DSO) defines typical document-related services such as loan, presentation, and digitization
  • Simple Service Status Ontology (SSSO) defines a service instance as a kind of event that connects a service provider (e.g. a library) with a service consumer (e.g. a library patron). SSSO further defines typical service statuses (e.g. reserved, prepared, executed…) and limitations of a service (e.g. a waiting queue or a delay).
  • Patrons Account Information API (PAIA) will include a mapping to RDF to express basic patron information, fees, and a list of current services in a patron account, based on SSSO and DSO.
  • Document Availability Information API (DAIA) includes a mapping to RDF to express the current availability of library holdings for selected services. See here for the current draft.
  • A holdings ontology should define properties to relate holdings (or parts of holdings) to abstract documents and editions and to holding institutions.
  • GBV Ontology contains several concepts and relations used in GBV library network that do not fit into other ontologies (yet).
  • One might further create a database ontology to describe library databases with their provider, extent, APIs etc. – right now we use the GBV ontology for this purpose. Is there anything to reuse instead of creating just another ontology?!
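
To illustrate what "can be mapped to RDF" means in practice, here is a minimal sketch that turns a JSON-like availability record into subject-predicate-object triples. The record layout, the `EX` namespace and the predicate names are assumptions made up for this example; they are not the actual DAIA format or its RDF mapping.

```python
# Hypothetical sketch: the JSON keys and the namespace URI below are
# illustrative assumptions, not the actual DAIA format or ontology terms.
EX = "http://example.org/ns#"  # placeholder namespace


def availability_to_triples(record):
    """Turn one availability record (a dict parsed from JSON) into
    (subject, predicate, object) triples, using URIs as identifiers."""
    item = record["item"]  # URI identifying the holding
    triples = []
    for service, available in sorted(record["services"].items()):
        predicate = EX + ("availableFor" if available else "unavailableFor")
        triples.append((item, predicate, EX + service))
    return triples


record = {
    "item": "http://example.org/item/123",
    "services": {"loan": True, "presentation": True, "digitization": False},
}
for triple in availability_to_triples(record):
    print(triple)
```

The only hard requirement such a mapping places on the JSON side is that items and services are identified by URIs; everything else stays an ordinary data structure.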

The next step will probably be the creation of a small holdings ontology that fits nicely with the other micro-ontologies. This ontology should be aligned or compatible with the BIBFRAME initiative, other ontologies such as Schema.org, and existing holding formats, without becoming too complex. The German initiative DINI-KIM has just launched a working group to define such a holding format or ontology.

Thoughts on modeling library services

13 March 2013 at 12:44, 3 comments

P.S.: Restricting myself to library services that relate to documents, I have meanwhile started specifying a Document Service Ontology.

While developing the Patrons Account Information API (PAIA) and the Document Availability Information API (DAIA), I repeatedly ran into the question of what kinds of services libraries actually offer. My question on Libraries.StackExchange was answered by Adrian Pohl, who dealt with the topic in his master's thesis for lobid.org.

My question is of course one-sided: it is geared to possible use in APIs through which users and programs should be able to retrieve or request various library services. The best-known example is the loan service. PAIA and DAIA were primarily designed so that services such as borrowing books from libraries become possible in an open and machine-readable way. Open means that practically anyone, or at least all library patrons, can access the services directly through a well-documented API instead of only through particular user interfaces.

This blog post is, for now, just thinking out loud: I examine Adrian's classification for its suitability to my question. As with any classification, both the list provided by Adrian and my comments are not neutral but only one possible point of view.

1. Webbased services without direct personnel involvement:

  • OPAC and other research services
  • List of recent acquisitions
  • Online Tutorials
  • Place an order
  • Renewal of a loan
  • Online acquisition request service
  • Digitization Service
  • Consulting Service

A catalog is not a service that would have to be requested or reserved; the same goes for other information on the library's website. The items "place an order" and "renewal of a loan" belong to the loan service (or to presentation, for reference-only copies). It remains to be clarified how acquisition requests fit in. The digitization of library holdings (e.g. DigiWunschbuch) is presumably a service of its own, which could be added to the existing DAIA services. Why "Consulting Service" is listed here, I do not understand.

2. Webbased services with direct personnel involvement:

  • Chat reference/consulting

Is a chat a service? What about other communication channels (telephone, conversation on site…)? I would subsume all of this under reference service.

3. In-house services with direct personnel involvement

  • In-house guided tours and courses
  • Digitization Service
  • Microfiche readers
  • Reference Desk
  • Loan Desk
  • Registration Desk
  • Consulting Service
  • 3D printer
  • Makerspace

Training courses and guided tours are certainly a kind of service of their own, although there are different kinds of them. The loan desk and registration are not services of their own but part of other services. For equipment such as microfiche readers, 3D printers etc., it depends on whether they can be used directly or have to be reserved, and whether they can only be used on site or can also be borrowed.

4. In-house services without direct personnel involvement

  • Reading Room
  • Study room
  • Wifi
  • Computers with Internet Access
  • Photocopier
  • Scanner for use by visitors
  • 3D printer
  • Computer for self-loan
  • Cafeteria
  • Drinks machine

Study rooms and workplaces, photocopiers, scanners, coffee machines etc. belong to the equipment mentioned above. I think that DAIA can easily be extended so that patrons can query via API whether a piece of equipment is free, and that PAIA can be extended so that patrons can reserve equipment via API.
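
A minimal sketch of that split, assuming an in-memory model: one read-only query ("is it free?", the DAIA-like part) and one state-changing call ("reserve it", the PAIA-like part). The class and method names are hypothetical and are not part of either specification.

```python
class Equipment:
    """Hypothetical model of a reservable piece of library equipment."""

    def __init__(self, name):
        self.name = name
        self.reserved_by = None  # patron identifier, or None if free

    def is_free(self):
        # Availability query: the kind of lookup an extended DAIA might offer.
        return self.reserved_by is None

    def reserve(self, patron):
        # Reservation: the kind of action an extended PAIA might offer.
        # Returns True on success, False if someone else holds the item.
        if not self.is_free():
            return False
        self.reserved_by = patron
        return True


scanner = Equipment("scanner")
print(scanner.is_free())            # True
print(scanner.reserve("patron42"))  # True
print(scanner.reserve("patron7"))   # False
```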

Access to library accounts for better user experience

8 February 2013 at 11:10, 5 comments

I just stumbled upon ReadersFirst, a coalition of (public) libraries that call for a better user experience for library patrons, especially to access e-books. The libraries regret that

the products currently offered by e-content distributors, the middlemen from whom libraries buy e-books, create a fragmented, disjointed and cumbersome user experience.

One of the explicit goals of ReadersFirst is to urge providers of e-content and integrated library systems to offer systems that allow users to

Place holds, check-out items, view availability, manage fines and receive communications within individual library catalogs or in the venue the library believes will serve them best, without having to visit separate websites.

In a summary of the first ReadersFirst meeting on January 28, the president of Queens Library (NY) is quoted with the following request:

The reader should be able to look at their library account and see what they have borrowed regardless of the vendor that supplied the ebook.

This goal matches well with my activity at GBV: as part of a project to implement a mobile library app, I designed an API to access library accounts. The Patrons Account Information API (PAIA) is currently being implemented and tested by two independent developers. It will also be used to provide a better user experience in VuFind discovery interfaces.

During the research for PAIA I was surprised by the lack of existing methods to access library patron accounts. Some library systems do not even provide an internal API to connect to the loan system, not to speak of a public API that could be used directly by patrons and third parties. The only example I could find was York University Libraries with a simple, XML-based, read-only API. This lack of public APIs to library patron accounts is disappointing, given that it is almost ten years after the buzz around Web 2.0, service-oriented architecture, and mashups. All major providers of web applications (Google, Twitter, Facebook, StackExchange, GitHub etc.) support access to user accounts via APIs.

The Patrons Account Information API will hopefully fill this gap with defined methods to place holds and to view checked-out items and fines. PAIA is agnostic to specific library systems, aligned with similar APIs such as those listed above, and designed with RDF in mind (without any need to bother with RDF, apart from the requirement to use URIs as identifiers). Feedback and implementations are very welcome!

SWIB + MTSR = SSSO

4 December 2012 at 15:48, 3 comments

On my flight back from the Metadata and Semantics Research conference (MTSR) I thought about how to proceed with an RDF encoding of patron information, which I had presented before at the Semantic Web in Libraries conference (SWIB). I have written about the Patrons Account Information API (PAIA) before in this blog, and you can watch my SWIB slides and a video recording.

As I said in the talk, PAIA is primarily designed as an API, but it includes a conceptual model, which can be mapped to RDF. The term "conceptual model" needs some clarification: when dealing with some way to express information in data, one should have a conceptual model in her or his head. This model can be made explicit, but most of the time people prefer to use formal languages such as OWL directly, or they even deny the need for conceptual modeling languages altogether. People who deal with conceptual modeling languages, on the other hand, often underestimate the importance of implementations – to them RDF is just a technology that is subject to change, while models are independent of technology. Examples from the cultural domain include the CIDOC Conceptual Reference Model (CIDOC-CRM) and the Cultural Heritage Abstract Reference Model (CHARM), which I got to know in a talk at MTSR.

So, thinking about conceptual models, RDF and patron information, I came up with an expression of loan status in a library. In PAIA, expressed as an API, we have defined six (actually five) statuses:

  • 0. no relation
  • 1. reserved (the document is not accessible for the patron yet, but it will be)
  • 2. ordered (the document is being made accessible for the patron)
  • 3. held (the document is on loan by the patron)
  • 4. provided (the document is ready to be used by the patron)
  • 5. rejected (the document is not accessible at all)

This list defines a data type, which one can happily work with without need to think about RDF, models, and all this stuff. But there is a model behind the list, which could also be expressed in different forms in RDF.
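
As a sketch of what "a data type" means here, the status list can be written as a plain enumeration. The names `LoanStatus` and `describe` are my own choice for this example; only the numbers and labels come from the list above.

```python
from enum import IntEnum


class LoanStatus(IntEnum):
    """Status of a document in relation to a patron, numbered as in the list."""
    NO_RELATION = 0
    RESERVED = 1
    ORDERED = 2
    HELD = 3
    PROVIDED = 4
    REJECTED = 5


def describe(code):
    """Return the human-readable label for a numeric status code."""
    status = LoanStatus(code)  # raises ValueError for codes outside 0..5
    return status.name.lower().replace("_", " ")


print(describe(3))  # held
```

A client can work with such a type without ever hearing about RDF; the model behind it only becomes relevant once the same information is to be shared beyond the API.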

The first decision was to express each status as an event that connects patron, library, and document during a specific time. The second decision was to not put this into a PAIA ontology but into a little, specialized ontology that could also be used for other services. It turned out that lending a book in a library is not that different to having your hair cut at a barber or ordering a product from an online shop. So I created the Simple Service Status Ontology (SSSO), which eventually defines five OWL classes:

Service events can be connected through time: for instance, a service can be executed directly after reservation, or it could first be prepared. Putting this tiny model into the Semantic Web is not trivial: I found no fewer than eight (sic!) existing ontologies that define an "Event", which an SSSO Service is a subclass of. Maybe there are even more. As always, feedback is very welcome to finalize SSSO.
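
The idea of events that connect a provider and a consumer and follow each other in time can be sketched as a small linked structure. This is an illustration under my own assumptions: the class `ServiceEvent`, its fields and the status strings are hypothetical and do not reproduce the SSSO classes or properties.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ServiceEvent:
    provider: str    # e.g. URI of a library
    consumer: str    # e.g. URI of a patron
    status: str      # e.g. "reserved", "prepared", "executed"
    start: datetime
    next_event: Optional["ServiceEvent"] = None

    def then(self, status, start):
        """Append a follow-up event for the same provider and consumer."""
        if start < self.start:
            raise ValueError("a follow-up event cannot start before its predecessor")
        self.next_event = ServiceEvent(self.provider, self.consumer, status, start)
        return self.next_event


def history(event):
    """List the statuses of an event chain in temporal order."""
    statuses = []
    while event is not None:
        statuses.append(event.status)
        event = event.next_event
    return statuses


first = ServiceEvent("http://example.org/library", "http://example.org/patron/42",
                     "reserved", datetime(2012, 12, 4, 9, 0))
first.then("prepared", datetime(2012, 12, 5, 9, 0)).then("executed", datetime(2012, 12, 6, 9, 0))
print(history(first))  # ['reserved', 'prepared', 'executed']
```

Nothing in this structure is specific to libraries, which is the point of keeping it out of a PAIA-only ontology.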

First draft of Patrons Account Information API (PAIA)

29 May 2012 at 12:09, 3 comments

Integrated Library Systems often lack open APIs, or existing services are difficult to reuse because of access restrictions, complexity, and poor documentation. This also applies to patron information, such as loans, reservations, and fees. After reviewing standards such as NCIP, SLNP, and the DLF-ILS recommendations, the Patrons Account Information API (PAIA) was specified at the Common Library Network (GBV).

PAIA consists of a small set of precisely defined access methods to look up patron information including fees, to renew and request documents, and to cancel requests. With PAIA it should be possible to use all patron methods that can be accessed in OPAC interfaces in third-party applications as well, such as mobile apps and discovery interfaces. The specification is divided into core methods (PAIA core) and methods for authentication (PAIA auth). This design will facilitate migration from insecure username/password authentication to more flexible systems based on OAuth 2.0. OAuth is also used by major service providers such as Google, Twitter, and Facebook.
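
To show why separating authentication from the core methods helps, here is a minimal sketch of a client that only needs an access token to call a core method, no matter how the token was obtained. The URL layout, the method name "fees" and the function `build_patron_request` are assumptions for illustration, so check the current draft for the actual paths; the `Authorization: Bearer` header is the usual way to present an OAuth 2.0 access token. The request is only constructed here, not sent.

```python
from urllib.request import Request


def build_patron_request(base_url, patron_id, method, access_token):
    """Build (but do not send) a request for one core method of one patron."""
    # Hypothetical URL layout: <base>/<patron>/<method>
    url = f"{base_url.rstrip('/')}/{patron_id}/{method}"
    return Request(url, headers={
        "Authorization": f"Bearer {access_token}",  # token issued by the auth part
        "Accept": "application/json",
    })


req = build_patron_request("https://example.org/paia/core/", "patron42",
                           "fees", "s3cr3t-token")
print(req.full_url)
print(req.get_header("Authorization"))
```

Because the core request carries only a token, the way a patron proves their identity can change (for example from username/password to an OAuth 2.0 flow) without touching the code that looks up loans or fees.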

The current draft of PAIA is available at http://gbv.github.com/paia/ and comments are very welcome. The specification is hosted in a git repository, accompanied by a wiki. Both can be accessed publicly to correct and improve the specification until its final release.

PAIA complements the Document Availability Information API (DAIA) which was created to access current availability information about documents in libraries and related institutions. Both PAIA and DAIA are being designed with a mapping to RDF, to also publish library information as linked data.