Comments on: Who identifies the identifiers?


From: Reinouts’ Nerdy Notes » Bibliographic metadata formats

Reinouts’ Nerdy Notes » Bibliographic metadata formats — Thu, 04 Feb 2010 16:07:50 +0000

[…] one in a machine-readable format. But I wouldn’t like to invent my own format when there are dozens to choose from. Could anybody point me to a (preferably semantic web-compatible) format suitable […]

From: Ross

Ross — Mon, 11 May 2009 18:55:32 +0000

Right, ok, so take my above example with application/x-turtle. So the Wikipedia article on Turtle makes the claim:

“The mime type of Turtle is application/x-turtle (if registered, application/turtle will be sought).”

This of course means they may not have their wish granted (in this case it’s unlikely, but I suppose anything can happen during the RFC process). What if IANA decides, instead, that the mime type should be application/rdf+turtle (honestly, I don’t know why it wouldn’t be that anyway)? What happens to all of the resources that have described themselves as http://www.iana.org/assignments/media-types/application/x-turtle?

Also, going back to the DC example, you may not always have HTTP headers to glean the content type from (again, taking the “other format as data transport” approach: Atom, METS, SRU, OpenURL, etc.).

Although maybe including the serialization as a mandatory attribute is overloading the role of the identifier.

From: jakob

jakob — Mon, 11 May 2009 18:12:13 +0000

OK, I get the point: the Dublin Core namespace for the core element set is http://purl.org/dc/elements/1.1/. This does not identify a concrete encoding schema (RDF/XML, RDF/Turtle, DCSV etc.) but an abstract data model. Luckily there must be 1-to-1 mappings between the encoding schemas, so they are interchangeable. In practice this is solved with HTTP Accept headers, so it is less of a problem, but in general you are right about this. However, I fear that the ambiguity between encoding schemas and abstract data models cannot be resolved once and for all, because data exchange always relies on some implicit context. On a higher level, encoding schemas are also abstract models. Interesting issue, I will think more about it. I don’t really understand the question about application/x-foobar and IANA.
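To make the Accept header point concrete, here is a minimal sketch of a client that asks for one particular serialization of the resource behind the Dublin Core namespace URI. The helper name `build_request` is made up for illustration, the request is only constructed and never sent, and nothing here claims that the server behind the URI honors any given media type.

```python
from urllib.request import Request

# Namespace URI quoted in the comment above.
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"


def build_request(uri, media_type):
    """Build (but do not send) a GET request that states the preferred
    serialization in the HTTP Accept header. Hypothetical helper."""
    return Request(uri, headers={"Accept": media_type})


# Same identifier, two different requested encodings.
rdf_xml = build_request(DC_NAMESPACE, "application/rdf+xml")
turtle = build_request(DC_NAMESPACE, "application/x-turtle")

print(rdf_xml.full_url, rdf_xml.get_header("Accept"))
print(turtle.full_url, turtle.get_header("Accept"))
```

The identifier stays the same in both requests; only the header carries the encoding. That is also why the approach breaks down when the data travels inside another format and there is no HTTP header to inspect.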

From: Ross

Ross — Mon, 11 May 2009 16:53:41 +0000

Actually an even better example of “namespace does not indicate format” would be Dublin Core, which is about as descriptive as your analogy about “MARC”.

For many, many developers, the “surprise” at getting back a response to a request for Dublin Core in, say, application/x-turtle would be noticeable.

I like your suggestion of the IANA registry for formats without namespaces, but how would it deal with a situation like application/x-foobar when it gets approved as application/foobaz?
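One possible way to cope with the renaming problem is an alias table that maps provisional names to the name that was eventually registered. The sketch below is an assumption for illustration only: the table `ALIASES` and the function `canonical_media_type` are hypothetical, the media types are the made-up ones from the question, and no registry is known to publish such a mapping.

```python
# Hypothetical alias table: provisional name -> registered name.
ALIASES = {
    "application/x-foobar": "application/foobaz",
}


def canonical_media_type(name, aliases=ALIASES):
    """Resolve a media type name to its current canonical form.

    Follows alias chains and stops if it detects a cycle, so a badly
    maintained table cannot cause an endless loop.
    """
    name = name.strip().lower()  # media type names are case-insensitive
    seen = set()
    while name in aliases and name not in seen:
        seen.add(name)
        name = aliases[name]
    return name


print(canonical_media_type("application/x-foobar"))
print(canonical_media_type("application/foobaz"))
```

With such a table, resources that described themselves with the old name remain comparable to newer ones, but somebody still has to maintain the mapping, which is the open question.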

From: Jakob

Jakob — Mon, 11 May 2009 14:08:36 +0000

An XML namespace identifies the set of all elements that are defined in this namespace. If the creator of a format did not define another identifier for the format but provided one or more XML schemas with a common XML namespace, then you should reuse this namespace as the format identifier instead of inventing something on your own. Of course the namespace does not identify your favorite undocumented subset of elements but all elements in the namespace, yet it is still a format. If this format does not suit your needs, you don’t need another identifier but another format (which of course can be a subset of an existing format). If you need a MODS variant that excludes modsCollection, then you should ask the LOC whether they can clarify the use of URI fragment identifiers for subsets of an XML Schema, so that http://www.loc.gov/standards/mods/v3#mods could identify the MODS variant without modsCollection (this is common practice at least in RDF Schemas and OWL).
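The fragment identifier idea can be sketched in a few lines: a consumer splits the format URI into the base identifier and an optional subset name. The function name `split_format_identifier` is hypothetical, and the reading of the fragment as a subset name is only the proposal from this comment, not a documented LOC convention.

```python
from urllib.parse import urldefrag


def split_format_identifier(uri):
    """Split a format URI into (base identifier, subset name or None)."""
    base, fragment = urldefrag(uri)
    return base, (fragment or None)


# URI taken from the comment above; '#mods' would name the single-record variant.
print(split_format_identifier("http://www.loc.gov/standards/mods/v3#mods"))
print(split_format_identifier("http://www.loc.gov/standards/mods/v3"))
```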

From: Rob Sanderson

Rob Sanderson — Mon, 11 May 2009 10:29:05 +0000

“If for a particular format there is a better identifier – like an XML or RDF namespace – then you should use that”

This is an unfortunately common misconception about the usage and meaning of XML namespaces. XML namespaces do NOT identify formats. As a demonstration of this, consider one namespace that defines two sets of elements which are never used together. Yes, this is very poor design, but it is quite possible. Also consider MODS, which defines two top-level elements, mods and modsCollection. If you used the namespace to ‘identify’ the MODS format, you would not know whether you had identified a collection or a single MODS instance. Finally, consider a namespace with many thousands of elements defined in it. Even if the top-level tag were the same, the utility of such a ‘format’ is minimal.
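The MODS case can be checked with a short sketch. It parses two tiny hand-written documents and reports the namespace and local name of each root element. The sample documents are made up for illustration, `namespace_and_root` is a hypothetical helper, and `MODS_NS` is assumed to be the MODS version 3 namespace URI.

```python
import xml.etree.ElementTree as ET

# Assumed MODS v3 namespace URI; the documents below are invented samples.
MODS_NS = "http://www.loc.gov/mods/v3"

single = f'<mods xmlns="{MODS_NS}"><titleInfo><title>Example</title></titleInfo></mods>'
collection = f'<modsCollection xmlns="{MODS_NS}"><mods/><mods/></modsCollection>'


def namespace_and_root(xml_text):
    """Return (namespace URI, local name) of the document's root element."""
    tag = ET.fromstring(xml_text).tag  # ElementTree writes it as '{uri}local'
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


print(namespace_and_root(single))
print(namespace_and_root(collection))
```

Both documents report the same namespace, so the namespace alone cannot tell a consumer whether it will receive one record or a collection; only the root element name distinguishes them.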

Therefore, there must be a second identifier which is neither the schema location (as this is not unique) nor the XML namespace. As IETF and W3C have NO interest in this sort of thing, standards are forced to build their own registries.