On the way to a library ontology

11. April 2013 um 15:02 2 Kommentare

I have been working for some years on specification and implementation of several APIs and exchange formats for data used in, and provided by libraries. Unfortunately most existing library standards are either fuzzy, complex, and misused (such as MARC21), or limited to bibliographic data or authority data, or both. Libraries, however, are much more than bibliographic data – they involve library patrons, library buildings, library services, library holdings, library databases etc.

During the work on formats and APIs for these parts of library world, Patrons Account Information API (PAIA) being the newest piece, I found myself more and more on the way to a whole library ontology. The idea of a library ontology started in 2009 (now moved to this location) but designing such a broad data model from bottom would surely have lead to yet another complex, impractical and unused library standard. Meanwhile there are several smaller ontologies for parts of the library world, to be combined and used as Linked Open Data.

In my opinion, ontologies, RDF, Semantic Web, Linked Data and all the buzz is is overrated, but it includes some opportunities for clean data modeling and data integration, which one rarely finds in library data. For this reason I try to design all APIs and formats at least compatible with RDF. For instance the Document Availability Information API (DAIA), created in 2008 (and now being slightly redesigned for version 1.0) can be accessed in XML and in JSON format, and both can fully be mapped to RDF. Other micro-ontologies include:

  • Document Service Ontology (DSO) defines typical document-related services such as loan, presentation, and digitization
  • Simple Service Status Ontology (SSSO) defines a service instance as kind of event that connects a service provider (e.g. a library) with a service consumer (e.g. a library patron). SSSO further defines typical service status (e.g. reserved, prepared, executed…) and limitations of a service (e.g. a waiting queue or a delay
  • Patrons Account Information API (PAIA) will include a mapping to RDF to express basic patron information, fees, and a list of current services in a patron account, based on SSSO and DSO.
  • Document Availability Information API (DAIA) includes a mapping to RDF to express the current availability of library holdings for selected services. See here for the current draft.
  • A holdings ontology should define properties to relate holdings (or parts of holdings) to abstract documents and editions and to holding institutions.
  • GBV Ontology contains several concepts and relations used in GBV library network that do not fit into other ontologies (yet).
  • One might further create a database ontology to describe library databases with their provider, extent APIs etc. – right now we use the GBV ontology for this purpose. Is there anything to reuse instead of creating just another ontology?!

The next step will probably creation of a small holdings ontology that nicely fits to the other micro-ontologies. This ontology should be aligned or compatible with the BIBFRAME initiative, other ontologies such as Schema.org, and existing holding formats, without becoming too complex. The German Initiative DINI-KIM has just launched a a working group to define such holding format or ontology.

SWIB + MTSR = SSSO

4. Dezember 2012 um 15:48 3 Kommentare

On my flight back from the Metadata and Semantics Research conference (MTRS) I thought how to proceed with an RDF encoding of patron information, which I had presented before at the Sematic Web in Libraries conference (SWIB). I have written about the Patrons Account Information API (PAIA) before in this blog and you watch my SWIB slides and a video recording.

As I said in the talk, PAIA is primarily designed as API but it includes a conceptual model, which can be mapped to RDF. The term „conceptual model“ needs some clarification: when dealing with some way to express information in data, one should have a conceptual model in her or his head. This model can be made explicit, but most times people prefer to directly use formal languages, such as OWL, or they even neglect the need of conceptual modeling languages at all. People that deal with conceptual modeling languages, on the other hand, often underestimate the importance of implementations – to them RDF is just a technology that is subject to change while models are independent from technology. Examples from the cultural domain include the CIDOC Conceptual Reference Model (CIDOC-CRM) and the Cultural Heritage Abstract Reference Model (CHARM), which I got to know in a talk at MTSR.

So thinking about conceptual models, RDF and patron information I came up with an expression of loan status in a library. In PAIA expressed as API we have defined six (actually five) status:

  • 0: no relation
  • 1. reserved (the document is not accesible for the patron yet, but it will be)
  • 2. ordered (the document is beeing made accesible for the patron)
  • 3. held (the document is on loan by the patron)
  • 4. provided (the document is ready to be used by the patron)
  • 5. rejected (the document is not accesible at all)

This list defines a data type, which one can happily work with without need to think about RDF, models, and all this stuff. But there is a model behind the list, which could also be expressed in different forms in RDF.

The first decision was to express each status as an event that connects patron, library, and document during a specific time. The second decision was to not put this into a PAIA ontology but into a little, specialized ontology that could also be used for other services. It turned out that lending a book in a library is not that different to having your hair cut at a barber or ordering a product from an online shop. So I created the Simple Service Status Ontology (SSSO), which eventually defines five OWL classes:

Service events can be connected through time, for instance a service can be executed directly after reservation or it could first be prepared. Putting this tiny model into the Semantic Web is not trivial: I found not less than eight (sic!) existing ontologies that define an „Event“, which a SSSO Service is subclass of. Maybe there are even more. As always feedback is very welcome to finalize SSSO.