Mapping bibliographic record subfields to JSON

13 April 2011 at 16:26 · 4 comments

The current issue of the Code4Lib Journal contains an article by Luciano Ramalho about mapping a bibliographic record format to JSON. Luciano describes two approaches to expressing the CDS/ISIS format in a JSON structure for use in CouchDB. The article has already provoked some comments, which is how an online journal should work!

The commenters mentioned Ross Singer’s proposal to serialize MARC in JSON and Bill Dueber’s MARC-HASH. There is also a MARC-JSON draft from Andrew Houghton, OCLC. The ISIS format reminded me of the PICA format, which is also based on fields and subfields. As Luciano notes, you must preserve subfield ordering and allow for repeated subfields. The existing proposals use the following methods for subfields:

Luciano’s ISIS/JSON:

[ ["x","foo"],["a","bar"],["x","doz"] ]

Ross’s MARC/JSON:

"subfields": [ {"x":"foo"},{"a":"bar"},{"x":"doz"} ]

Bill’s MARC-HASH:

[ ["x","foo"],["a","bar"],["x","doz"] ]

Andrew’s MARC/JSON:

"subfield": [
  {"code":"x","data":"foo"},
  {"code":"a","data":"bar"},
  {"code":"x","data":"doz"} ]
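To make the comparison concrete, here is a minimal Python sketch that converts the same three subfields between the shapes listed above. The function names are my own and are not part of any of the cited proposals; only the subfield part of each format is modelled. It also shows why a plain JSON object keyed by subfield code is not enough: the repeated code x collapses into a single entry.

```python
import json

# Ordered (code, value) pairs: the shape used by ISIS/JSON and MARC-HASH.
pairs = [["x", "foo"], ["a", "bar"], ["x", "doz"]]


def to_single_key_objects(pairs):
    """One {code: value} object per subfield (the "subfields" variant)."""
    return [{code: value} for code, value in pairs]


def to_code_data_objects(pairs):
    """One {"code": ..., "data": ...} object per subfield (the "subfield" variant)."""
    return [{"code": code, "data": value} for code, value in pairs]


def from_single_key_objects(objs):
    """Back to ordered pairs; each object is expected to hold exactly one key."""
    return [[code, value] for obj in objs for code, value in obj.items()]


def from_code_data_objects(objs):
    """Back to ordered pairs from the code/data objects."""
    return [[obj["code"], obj["data"]] for obj in objs]


# A naive object keyed by code keeps only the last value of a repeated code.
lossy = dict(pairs)

print(json.dumps(to_single_key_objects(pairs)))
print(json.dumps(to_code_data_objects(pairs)))
print(json.dumps(lossy))
```

All three array-based shapes round-trip without loss, so the choice between them is mostly a matter of how the data will be queried.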

In the end, the specific encoding does not matter that much. The best form depends on which operations and access patterns are typical for your use case. Still, I could not resist throwing the encoding I use in luapica into the ring:

{ "foo", "bar", "doz",
  ["codes"] = {
    ["x"] = {1,3},
    ["a"] = {2}
  }
}

I am thinking about simplifying this further to:

{ "foo", "bar", "doz", ["x"] = {1,3}, ["a"] = {2} }

If f is a field, then you can access subfield values by position (f[1], f[2], f[3]) or by subfield code (f[f.x[1]], f[f.a[1]], f[f.x[2]]). By overloading the table access method and adding a few functions, you can write f.x as shorthand for f[f.x[1]] to get the first subfield value with code x, and f:all("x") to get a list of all subfield values with that code. A JSON array cannot mix positional items with named keys the way a Lua table can, so the same structure in JSON would be one of:

{ "values":["foo", "bar", "doz"], "x":[1,3], "a":[2] }
{ "values":["foo", "bar", "doz"], "codes":{"x":[1,3], "a":[2]} }
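As an illustration of the access pattern described above, here is a rough Python sketch of the second JSON form (values plus a "codes" index). The names build_field and Field are hypothetical and do not mirror the luapica API; positions are kept 1-based to match the Lua original.

```python
from collections import defaultdict


def build_field(pairs):
    """Build {"values": [...], "codes": {...}} from ordered (code, value) pairs."""
    values = []
    codes = defaultdict(list)
    for position, (code, value) in enumerate(pairs, start=1):
        values.append(value)
        codes[code].append(position)  # 1-based, as in the Lua tables
    return {"values": values, "codes": dict(codes)}


class Field:
    """Hypothetical wrapper offering access by position and by subfield code."""

    def __init__(self, data):
        self._values = data["values"]
        self._codes = data["codes"]

    def __getitem__(self, position):
        # Access by 1-based position, like f[1] in Lua.
        if not 1 <= position <= len(self._values):
            raise IndexError(position)
        return self._values[position - 1]

    def all(self, code):
        """All values with the given subfield code, in record order."""
        return [self._values[p - 1] for p in self._codes.get(code, [])]

    def first(self, code):
        """First value with the given code, or None if the code is absent."""
        found = self.all(code)
        return found[0] if found else None

    def __getattr__(self, code):
        # Called only when normal attribute lookup fails, so f.x maps to first("x").
        if code.startswith("_"):
            raise AttributeError(code)
        return self.first(code)


data = build_field([["x", "foo"], ["a", "bar"], ["x", "doz"]])
f = Field(data)
print(f[2], f.x, f.all("x"))
```

Attribute access only works for codes that are valid Python identifiers; for numeric subfield codes such as 0 or 9, first("0") would be needed instead.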

I think a good, compact mapping to JSON that includes an index could be:

[ ["x", "a", "x"], {"x":[1,3], "a":[2] },
  ["foo", "bar", "doz"], {"foo":[1], "bar":[2], "doz":[3] } ]
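The four-element mapping above can be derived mechanically from the ordered pairs. The following Python sketch shows one way to do that, under the assumption that both indexes use 1-based positions as in the example; the function names are illustrative only.

```python
import json


def positions(items):
    """Map each distinct item to the list of 1-based positions where it occurs."""
    index = {}
    for position, item in enumerate(items, start=1):
        index.setdefault(item, []).append(position)
    return index


def to_indexed(pairs):
    """Build [codes, code index, values, value index] from (code, value) pairs."""
    codes = [code for code, _ in pairs]
    values = [value for _, value in pairs]
    return [codes, positions(codes), values, positions(values)]


def to_pairs(indexed):
    """Recover the ordered pairs; the two indexes are redundant and ignored."""
    codes, _, values, _ = indexed
    return [[code, value] for code, value in zip(codes, values)]


pairs = [["x", "foo"], ["a", "bar"], ["x", "doz"]]
indexed = to_indexed(pairs)
print(json.dumps(indexed))
```

Since both indexes can be rebuilt from the two arrays, they only pay off when lookups by code or by value are frequent enough to justify the extra size.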

And, of course, the most compact form is:




  1. Congrats,

    somewhere along your line of thought you have reinvented the ISO 2709 directory which MARC is so famous for: +1 for compactness and +1 for acceptance by the friends of convoluted legacy solutions (focls)!

    Comment by Thomas Berger, 14 April 2011
