In this post I’d like to propose two small changes to the JTM notation. It’s not that there is a shortage of Topic Maps exchange formats. Implementers can already spend day and night writing import/export modules for their Topic Maps engines, instead of spending their valuable time writing TMQL lexers and parsers.

But in the past months I’ve been working with Topic Maps on mobile devices, more or less as a pet project. Not surprisingly, one of the challenges with mobile platforms is handling memory constraints. The app I’m creating imports a JTM file, and during the import I have to keep both the entire JTM file and the topic map itself in memory. It turned out that it would be really nice to reduce the size of the JTM file without too many changes to the JTM syntax.

Therefore I’d like to propose two changes to the JTM:

  1. A shortcut for type-instance associations in the form of an instance_of array
  2. Prefixes for locators

A shortcut for type-instance associations

Type-instance associations are common in Topic Maps, but their serialization is quite verbose. I would therefore like to add a new member to the topic object: instance_of. instance_of is an array of topic references. Example:

{"version":"1.1",
 "item_type":"topic",
 "subject_identifiers":["http://psi.topincs.com/people/thomas-vinterberg"],
 "instance_of":[
     "si:http://psi.semanticheadache.com/person"],
 "names":[
     {"value":"Thomas Vinterberg",
      "type":"si:http://psi.topicmaps.org/iso13250/model/topic-name"}]}

Alternatively, type-instance associations may still be exported as ordinary associations, so old files remain compatible.
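To make the equivalence concrete, here is a minimal Python sketch of how an importer might expand instance_of into the long-hand associations. The function names (topic_reference, expand_instance_of) are my own invention for illustration and not part of any JTM library, and the sketch assumes that the topic carries at least one identity.

```python
# PSI namespace of the TMDM, as used in the JTM 1.0 example in this post.
TMDM = "http://psi.topicmaps.org/iso13250/model/"


def topic_reference(topic):
    """Build a topic reference for a topic object (hypothetical helper).

    Prefers a subject identifier, then a subject locator,
    then an item identifier.
    """
    for member, kind in (("subject_identifiers", "si:"),
                         ("subject_locators", "sl:"),
                         ("item_identifiers", "ii:")):
        locators = topic.get(member)
        if locators:
            return kind + locators[0]
    raise ValueError("topic has no identity to refer to")


def expand_instance_of(topic):
    """Return one type-instance association per entry in instance_of."""
    instance = topic_reference(topic)
    return [
        {"type": "si:" + TMDM + "type-instance",
         "roles": [
             {"player": instance, "type": "si:" + TMDM + "instance"},
             {"player": type_ref, "type": "si:" + TMDM + "type"}]}
        for type_ref in topic.get("instance_of", [])
    ]


topic = {
    "subject_identifiers": ["http://psi.topincs.com/people/thomas-vinterberg"],
    "instance_of": ["si:http://psi.semanticheadache.com/person"],
}
associations = expand_instance_of(topic)
```

An engine that already understands type-instance associations could feed the result into its existing association import, so supporting instance_of should need little more than this.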

Prefixes

There is a lot of redundancy in locator strings. Adding prefixes similar to those in CTM and LTM would make JTM files more readable (and also easier to write). I suggest adding an optional prefixes member to the document. Its value is an object with the prefixes as keys and the IRIs they stand for as values:

 "prefixes":{
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/" 
 }

So much for defining prefixes. Now on to their use: item_identifiers (in all topic map items), subject_identifiers and subject_locators can hold an IRI or a Safe_CURIE. A Safe_CURIE is a CURIE wrapped in ‘[’ and ‘]’. A CURIE consists of a prefix and a reference, separated by a colon. In topic references, an IRI or a Safe_CURIE may appear after ‘si:’, ‘sl:’ and ‘ii:’. Safe_CURIEs are also allowed in the datatype members of occurrences and variants. Example:

 {"version":"1.1",
     "prefixes": {
         "topincs": "http://psi.topincs.com/people/",
         "tmdm": "http://psi.topicmaps.org/iso13250/model/"
     },
     "item_type":"topic",
     "subject_identifiers":["[topincs:thomas-vinterberg]"],
     "names":[
         {"value":"Thomas Vinterberg",
          "type":"si:[tmdm:topic-name]"}]}
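Resolving these locators during import could look roughly like the following Python sketch. The function names are hypothetical, and raising an error for an undeclared prefix is my own assumption; the proposal does not yet say what a parser should do in that case.

```python
def expand_locator(locator, prefixes):
    """Expand a Safe_CURIE such as "[tmdm:topic-name]" to a full IRI.

    Anything that is not wrapped in square brackets is treated as a
    plain IRI and returned unchanged.
    """
    if not (locator.startswith("[") and locator.endswith("]")):
        return locator
    # Split at the first colon only: the reference may contain more colons.
    prefix, colon, reference = locator[1:-1].partition(":")
    if not colon or prefix not in prefixes:
        # Assumption: unknown prefixes are an error, not passed through.
        raise ValueError("cannot expand Safe_CURIE: " + locator)
    return prefixes[prefix] + reference


def expand_topic_reference(reference, prefixes):
    """Expand the locator part of an "si:", "sl:" or "ii:" topic reference."""
    kind, locator = reference[:3], reference[3:]
    if kind not in ("si:", "sl:", "ii:"):
        raise ValueError("not a topic reference: " + reference)
    return kind + expand_locator(locator, prefixes)


prefixes = {
    "topincs": "http://psi.topincs.com/people/",
    "tmdm": "http://psi.topicmaps.org/iso13250/model/",
}
identifier = expand_locator("[topincs:thomas-vinterberg]", prefixes)
name_type = expand_topic_reference("si:[tmdm:topic-name]", prefixes)
```

Because the brackets decide the interpretation, a single check on the first and last character is enough to tell the two cases apart.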

CURIEs, or Compact URIs, are defined in http://www.w3.org/TR/curie/. CURIEs are a more relaxed version of QNames, which are often used to model prefixes for IRIs. With a prefix “t”: “http://psi.topincs.com/”, “t:movies/dear-wendy” would be a valid CURIE, but not a valid QName. I propose Safe_CURIEs for JTM 1.1 because they state explicitly whether a locator is to be interpreted as an IRI or as a CURIE. This avoids edge cases where an IRI gets expanded because a prefix with the same name as its scheme is defined (think of a prefix named http). It also makes parsing easier for both humans and machines.
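The edge case is easy to reproduce. The following Python sketch compares a naive expansion of bare CURIEs with the bracketed variant; both functions and the prefix declaration are made up for illustration only.

```python
def expand_bare_curie(locator, prefixes):
    """Naive expansion without brackets.

    Whatever precedes the first colon is looked up as a prefix, so an
    IRI scheme can be mistaken for one.
    """
    prefix, colon, reference = locator.partition(":")
    if colon and prefix in prefixes:
        return prefixes[prefix] + reference
    return locator


def expand_safe_curie(locator, prefixes):
    """Expansion with Safe_CURIEs: only bracketed locators are expanded."""
    if locator.startswith("[") and locator.endswith("]"):
        prefix, _, reference = locator[1:-1].partition(":")
        return prefixes[prefix] + reference
    return locator


# Hypothetical and unfortunate prefix declaration.
prefixes = {"http": "http://example.org/ns/"}
iri = "http://psi.topincs.com/movies/dear-wendy"

# The scheme "http" is taken for the prefix and the IRI gets mangled.
mangled = expand_bare_curie(iri, prefixes)

# With Safe_CURIEs the plain IRI is left alone ...
intact = expand_safe_curie(iri, prefixes)
# ... and the prefix is only applied where the brackets ask for it.
expanded = expand_safe_curie("[http:movie]", prefixes)
```

With bare CURIEs, a parser would need extra rules (or reserved prefix names) to avoid this; with Safe_CURIEs the problem does not arise in the first place.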

Comparison of JTM 1.0 and 1.1

Besides the two changes above, there is a third, quite obvious one: all JTM 1.1 documents must have a member version with the value “1.1”. To illustrate the changes, let’s compare the topic map from the JTM 1.0 document to its JTM 1.1 equivalent:

{"version":"1.0",
 "item_type":"topicmap",
 "topics":[
     {"subject_identifiers":["http://psi.topincs.com/movies/dear-wendy"],
      "names":[
          {"value":"Dear Wendy",
           "type":"si:http://psi.topincs.com/title",
           "scope":[
               "si:http://www.topicmaps.org/xtm/1.0/country.xtm#US",
               "si:http://www.topicmaps.org/xtm/1.0/country.xtm#DE"]}],
      "occurrences":[
          {"value":"2005",
           "type":"si:http://psi.topincs.com/publication-year",
           "datatype":"http://www.w3.org/2001/XMLSchema#gYear"}]}],
 "associations":[
     {"type":"si:http://psi.topicmaps.org/iso13250/model/type-instance",
      "roles":[
          {"player":"si:http://psi.topincs.com/movies/dear-wendy",
           "type":"si:http://psi.topicmaps.org/iso13250/model/instance"},
          {"player":"si:http://psi.topincs.com/movie",
           "type":"si:http://psi.topicmaps.org/iso13250/model/type"}]}]}

becomes:

{"version":"1.1",
 "item_type":"topicmap",
 "prefixes":{
     "t":"http://psi.topincs.com/",
     "xtm":"http://www.topicmaps.org/xtm/1.0/"},
 "topics":[
     {"subject_identifiers":["[t:movies/dear-wendy]"],
      "instance_of":["si:[t:movie]"],
      "names":[
          {"value":"Dear Wendy",
           "type":"si:http://psi.topincs.com/title",
           "scope":[
               "si:[xtm:country.xtm#US]",
               "si:[xtm:country.xtm#DE]"]}],
      "occurrences":[
          {"value":"2005",
           "type":"si:[t:publication-year]",
           "datatype":"http://www.w3.org/2001/XMLSchema#gYear"}]}]}

Apart from the version member, every JTM 1.0 file will still be a valid JTM 1.1 file, and changing existing parsers should not be too hard. In my opinion, these changes will make JTM even more compact while maintaining its simplicity.

Special thanks go to Lars Heuer for comments and feedback. What do you think? Comments welcome!




Published

23 September 2010

Tags