Class or Property? Objectification in RDF and data modeling

14 August 2009 at 00:23, 4 comments

A short Twitter message, in which Ross Singer asked about encoding MARC relator codes in RDF, reminded me of a basic data modeling question that I have been thinking about for a while: When should you model something as a class and when should you model it as a property? Is there a need to distinguish at all? The question is not limited to RDF but is fundamental to data and information modeling. In entity-relationship modeling (Chen 1976) the question is whether to use an entity or a relationship. Let me give an example with two subject-predicate-object statements in RDF Notation3:

:Work dc:creator :Agent .
:Agent rdf:type :Creator .

The first statement says that a specific agent (:Agent) has created (dc:creator) a specific work (:Work). The second statement says that :Agent is a creator (:Creator). In the first statement dc:creator is a property, while in the second :Creator is a class. You could define that the one implies the other, but you still need two different concepts because classes and properties are disjoint (at least in OWL; I am not sure about plain RDF). In Notation3 the implications may be written as:

@forAll X1, X2. { X1 dc:creator X2 } => { X2 a :Creator }.
@forAll Y1. { Y1 a :Creator } => { @forSome Y2. Y2 dc:creator Y1 }.

If you define two URIs for the class and the property of the same concept (the concept of a creator and of creating something), then the two things are tightly bound together: Everyone who ever created something is a creator, and to be a creator you must have created something. This logic rule sounds rather crude if you apply it to other concepts, like to lie and to be a liar, or to sing and to be a singer. Think about it!
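
To make the tight binding concrete, here is a minimal sketch in plain Python that applies both rules to a handful of triples. It is not a real N3 reasoner: triples are simple tuples of strings, the function name `apply_rules` and the sample resources are made up for illustration, and the "some work" of the second rule is represented by an invented blank node label.

```python
from itertools import count

# Strings that mirror the names used in the text; no RDF library is involved.
CREATOR_PROP = "dc:creator"
CREATOR_CLASS = ":Creator"
RDF_TYPE = "rdf:type"


def apply_rules(triples):
    """Return the input triples plus everything the two creator rules imply."""
    result = set(triples)
    fresh = count(1)  # counter for naming invented blank nodes (an assumption)

    # Rule 1: { ?work dc:creator ?agent } => { ?agent a :Creator }
    for s, p, o in triples:
        if p == CREATOR_PROP:
            result.add((o, RDF_TYPE, CREATOR_CLASS))

    # Rule 2: { ?agent a :Creator } => { some work dc:creator ?agent }
    # A blank node is only invented when no work is known for the agent.
    has_work = {o for s, p, o in result if p == CREATOR_PROP}
    for s, p, o in sorted(result):
        if p == RDF_TYPE and o == CREATOR_CLASS and s not in has_work:
            result.add((f"_:work{next(fresh)}", CREATOR_PROP, s))

    return result


# Hypothetical sample data: one agent known by a work, one known only by class.
data = {
    (":Work1", "dc:creator", ":Agent1"),
    (":Agent2", "rdf:type", ":Creator"),
}
for triple in sorted(apply_rules(data)):
    print(triple)
```

With both rules in place you can no longer say that someone is a creator without also claiming that some work of theirs exists, which is exactly the rigidity that feels wrong for liars and singers.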

Besides the lack of fuzzy logic on the Semantic Web, I miss an easy way to do “reification” (there is another concept called “reification” in RDF, but I have never seen it in the wild) or “objectification”: You cannot easily convert between classes and properties. In a closed ontology this is less of a problem because you can just decide whether to use a class or a property. But the Semantic Web is about sharing and combining data! What if Ontology A has defined a “Singer” class and Ontology B has defined a “sings” property, and both refer to the same real-world concept?
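
One way to picture such a bridge is the following rough sketch in plain Python. The prefixes `a:` and `b:`, the function names, and the sample data are all hypothetical, and the sketch covers only one direction: it turns subjects of the property into members of the class and silently drops what was sung.

```python
RDF_TYPE = "rdf:type"


def class_from_property(triples, prop, cls):
    """Objectify a property: every subject of `prop` becomes an instance of `cls`."""
    return {(s, RDF_TYPE, cls) for s, p, _ in triples if p == prop}


def merged_instances(triples, prop, cls):
    """Resources that count as `cls`, whether typed directly or derived from `prop`."""
    typed = {s for s, p, o in triples if p == RDF_TYPE and o == cls}
    derived = {s for s, _, _ in class_from_property(triples, prop, cls)}
    return typed | derived


# Hypothetical data combined from Ontology A (class) and Ontology B (property).
data = {
    (":Ella", "rdf:type", "a:Singer"),
    (":Bob", "b:sings", ":SomeSong"),
}
print(sorted(merged_instances(data, "b:sings", "a:Singer")))
```

The opposite direction is harder: from the bare fact that someone is an `a:Singer` you cannot recover what they sing, so a converter would have to invent an anonymous object.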

Other data modeling languages (more or less) support objectification. Terry Halpin, the creator and evangelist of Object-Role Modeling (ORM), wrote a detailed paper about objectification in ORM that does not fail to mention the underlying philosophical questions. My (doubtful) philosophical intuition makes me think that properties are more problematic than classes because the latter can easily be modeled as sets. I think the need for objectification, and for bringing together classes and properties with similar meaning, will increase the more “semantic” data we work with. In many natural languages you can use a verb or adjective as a noun by nominalization. The meaning may change slightly, but it is still very useful for communication. Maybe we should rely more on natural language instead of dreaming of definitions without ambiguity?

Dublin Core conference 2008 started

23 September 2008 at 12:20, 2 comments

Yesterday the Dublin Core Conference 2008 (DC 2008) started in Berlin. I spent the first day with several Dublin Core tutorials and with running after my bag, which I had forgotten on the train. Luckily the train ended in Berlin, so I only had to get to the other part of town to recover it! For the rest of the day I attended the DC tutorials by Pete Johnston and Marcia Zeng (slides are online as PDF). The tutorials were fine but got somewhat lost between theory and practice (see Paul’s comment). I cannot pin down the details, but there must be a better way to explain and summarize Dublin Core in brief. The problem may lie in a fuzzy definition of Dublin Core. To my taste there are far too many “cans”, “shoulds”, and “mays” instead of formal “musts”. I would also put more stress on the importance of publishing stable URIs for everything and of using syntax schemas.

What really annoys me about DC is the low commitment of the Dublin Core community to RDF. RDF is not promoted as the base but only as one possible way to encode Dublin Core. In the same way you could have argued in the early 1990s that HTTP/HTML is just one framework to build on. That’s right, and of course RDF is not the final answer to metadata issues, but it is the state of the art for encoding structured data on the web. I wonder when the Dublin Core community lost its tight connection with the W3C/RDF community (which for its part was spoiled by the XML community). In official talks you don’t hear these hidden stories of antipathies and self-interest in standardization.

The first keynote that I heard on day 2 was given by Jennifer Trant about results of steve.museum, one of the best projects analyzing tagging in real-world environments. Data, software and publications are available to build upon. The second talk, “Encoding Application Profiles in a Computational Model of the Crosswalk” by Carol Jean Godby (PDF slides), was interesting as well. In our library service center we deal a lot with translations (aka mappings, crosswalks etc.) between metadata formats, so the crosswalk web service by OCLC and its description language may be of great use, provided it is properly documented and supported. After this talk Maria Elisabete Catarino reported in “Relating Folksonomies with Dublin Core” (PDF slides) on a study of the purposes and usage of social tagging and of whether and how tags could be encoded by DC terms.

On Friday we will hold a first Seminar on User Generated Metadata with OpenStreetMap, Wikipedia, BibSonomy and The Open Library. Looking forward to it!

P.S.: Pete Johnston’s slides on DC basic concepts are now also available on SlideShare [via his blog]
