JSON

Developing the next generation of open data interchange


JSON Referencing Proposal and Library

19 October 2007

Based upon discussions from this post, I wanted to bring together some of the different ideas and propose a convention for referencing that is flexible, readable and usable for general-purpose JSON, and to provide a JSON referencing library that implements it for circular references, multiple references and intermessage references.

From our discussion and exploration of referencing, I have learned that neither id referencing nor path referencing alone is adequate for all problems. I was hoping that id referencing could solve all referencing problems, since that is what I previously used in JSPON. But alas, id referencing requires id generation, can be more difficult to implement where ids are not inherently available (especially when it is desirable to avoid unnecessary ids and clutter), and is difficult to use when referencing arrays. Path referencing alone, however, cannot deal with intermessage and remote referencing needs.

I think we can use a single convention that utilizes both of these approaches. JSPON’s previous approach of using the same property for both id and reference won’t fly here (conceptually it doesn’t make sense), so I propose the following approach, which I think takes the best of the different techniques. The library I am introducing can easily be modified to handle different conventions, but this one is intended as a compromise that uses the best concepts of each.

This convention uses a “$ref” property in an object to denote a reference. The property should have a string value that is used to resolve the reference. The reference can be an id, a path, or both. Following the JSONPath convention, “$” denotes the root object of the current message. The reference string can therefore start with an id or “$”, followed by property references. I will illustrate with some examples of an array with two objects that both refer to the same child object.

First we can use this convention for path referencing:

[{"name":"first",
   "child":{"name":"the child"}},
 {"name":"second",
   "child":{"$ref":"$[0].child"}}]
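
To make the path form concrete, here is a minimal sketch of how a string such as "$[0].child" could be followed from the root of a parsed message. The helper name followPath is hypothetical and is not taken from the library; the sketch also assumes that property names in a path are simple identifiers.

```javascript
// Hypothetical helper for illustration; not the library's API.
// Follows a "$"-rooted path such as "$[0].child" from the root of a parsed message.
function followPath(root, path) {
  if (path.charAt(0) !== "$") {
    throw new Error("path must start with $: " + path);
  }
  // Each step is either [index] or .name (simple identifier names are assumed).
  var steps = path.slice(1).match(/\[\d+\]|\.[A-Za-z_$][\w$]*/g) || [];
  return steps.reduce(function (node, step) {
    var key = step.charAt(0) === "[" ? step.slice(1, -1) : step.slice(1);
    return node[key];
  }, root);
}

var message = JSON.parse(
  '[{"name":"first","child":{"name":"the child"}},' +
  '{"name":"second","child":{"$ref":"$[0].child"}}]'
);
var placeholder = message[1].child;                  // still the {"$ref": ...} object
var target = followPath(message, placeholder.$ref);
console.log(target === message[0].child);            // true
console.log(target.name);                            // "the child"
```

A full parser would additionally replace the placeholder object with the target, so that both parents share one child object.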

Or if the JSON objects have ids in them, we can refer by id:

[{"id":"1","name":"first",
    "child":{"id":"3","name":"the child"}},
 {"id":"2","name":"second",
    "child":{"$ref":"3"}}]

Or we can combine ids with paths

[{"id":"1","name":"first",
    "child":{"name":"the child"}},
 {"id":"2","name":"second",
    "child":{"$ref":"1.child"}}]
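
As a rough illustration of how one reference string can cover ids, paths and the mixed form, the sketch below splits the string into a starting point (an id or "$") and an optional path, then swaps each placeholder for its target. The names collectIds, lookupRef and parseWithRefs are invented for this example and are not the library's API; ids are assumed to be strings stored in an "id" property and to contain no "." or "[" characters.

```javascript
// Illustrative sketch only; these function names are hypothetical.

// Collect every object that carries a string "id" property.
function collectIds(node, byId) {
  if (node === null || typeof node !== "object") return byId;
  if (typeof node.id === "string") byId[node.id] = node;
  Object.keys(node).forEach(function (key) { collectIds(node[key], byId); });
  return byId;
}

// Resolve one reference string: an id or "$", optionally followed by a path.
function lookupRef(ref, root, byId) {
  var m = /^([^.\[]+)(.*)$/.exec(ref);
  if (!m) return undefined;
  var start = m[1] === "$" ? root : byId[m[1]];
  var steps = m[2].match(/[^.\[\]]+/g) || [];
  return steps.reduce(function (node, step) {
    return node == null ? undefined : node[step];
  }, start);
}

// Parse JSON text, then replace each {"$ref": "..."} placeholder with its target.
function parseWithRefs(text) {
  var root = JSON.parse(text);
  var byId = collectIds(root, Object.create(null));
  (function link(node) {
    if (node === null || typeof node !== "object") return;
    Object.keys(node).forEach(function (key) {
      var value = node[key];
      if (value && typeof value === "object" && typeof value.$ref === "string") {
        var target = lookupRef(value.$ref, root, byId);
        if (target !== undefined) node[key] = target; // unresolved refs stay in place
      } else {
        link(value);
      }
    });
  })(root);
  return root;
}

var mixed = parseWithRefs(
  '[{"id":"1","name":"first","child":{"name":"the child"}},' +
  '{"id":"2","name":"second","child":{"$ref":"1.child"}}]'
);
console.log(mixed[1].child === mixed[0].child); // true
```

The same function handles the pure path form ("$[0].child") and the pure id form ("3"), since both are special cases of "starting point plus optional path".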

With this one convention we can use and mix both forms of referencing. The library that I created for JSON parsing and serialization supports both of these techniques as well. The referencing library is basically an enhanced version of Crockford’s JSON library, although I used the “old” API, which I prefer because it does not break enumerability (that is, it does not require defensive enumeration). To use my library with the examples above, we can parse any of them and the library will resolve the second object’s child reference to the original child object:

myArray = JSON.parse("[{\"name\":....");

The library can handle both path and id references (and mixed). We can then serialize the array using the stringify method:

var myArrayAsJson = JSON.stringify(myArray);

Once again, if there are ids available, the library will use the ids to create the references. Otherwise the "$" (root) with the full path reference is used for serialization. Of course, this library can handle circular references as well as the multiple references used in the example.
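
The serialization rule described here (use the id when one exists, otherwise the full path from "$") can be sketched as follows. This is an assumption-laden illustration rather than the library's code: toReferencedJson is a made-up name, ids are assumed to be string "id" properties, and property names are assumed to be simple identifiers so that the generated paths stay readable.

```javascript
// Hypothetical sketch; toReferencedJson is not the library's API.
function toReferencedJson(root) {
  var seen = new Map(); // object -> reference string for its first occurrence

  function copy(node, path) {
    if (node === null || typeof node !== "object") return node;
    // A repeated (or circular) occurrence becomes a {"$ref": ...} placeholder.
    if (seen.has(node)) return { $ref: seen.get(node) };
    // Prefer the object's id; fall back to its path from the root.
    seen.set(node, typeof node.id === "string" ? node.id : path);
    if (Array.isArray(node)) {
      return node.map(function (item, i) {
        return copy(item, path + "[" + i + "]");
      });
    }
    var out = {};
    Object.keys(node).forEach(function (key) {
      out[key] = copy(node[key], path + "." + key);
    });
    return out;
  }

  return JSON.stringify(copy(root, "$"));
}

var sharedChild = { name: "the child" };
var family = [
  { name: "first", child: sharedChild },
  { name: "second", child: sharedChild }
];
console.log(toReferencedJson(family));
// [{"name":"first","child":{"name":"the child"}},{"name":"second","child":{"$ref":"$[0].child"}}]

var loop = { name: "a" };
loop.self = loop; // circular reference
console.log(toReferencedJson(loop));
// {"name":"a","self":{"$ref":"$"}}
```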

The library can also handle intermessage referencing. For example, if we had used ids in our array example, we could process a subsequent JSON string:

[{"id":"1","name":"third",
    "child":{"$ref":"3"}},
 {"id":"2","name":"fourth",
    "child":{"$ref":"3"}}]

When parsed, these additional objects will resolve their child reference to the same object as the items in the first message.
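
One way to picture intermessage referencing is an id index that outlives a single parse call. The sketch below keeps such an index in a closure; createRefSession is a hypothetical name, not the library's API, and only plain id references are handled to keep it short.

```javascript
// Hypothetical sketch; createRefSession is an invented name.
function createRefSession() {
  var byId = Object.create(null); // id -> object, kept across messages

  function index(node) {
    if (node === null || typeof node !== "object") return;
    // A later object with the same id overwrites the earlier index entry.
    if (typeof node.id === "string") byId[node.id] = node;
    Object.keys(node).forEach(function (key) { index(node[key]); });
  }

  function link(node) {
    if (node === null || typeof node !== "object") return;
    Object.keys(node).forEach(function (key) {
      var value = node[key];
      if (value && typeof value === "object" && typeof value.$ref === "string") {
        if (value.$ref in byId) node[key] = byId[value.$ref];
      } else {
        link(value);
      }
    });
  }

  return {
    parse: function (text) {
      var root = JSON.parse(text);
      index(root); // register ids from this message first
      link(root);  // then resolve references against all messages so far
      return root;
    }
  };
}

var session = createRefSession();
var firstMessage = session.parse(
  '[{"id":"1","name":"first","child":{"id":"3","name":"the child"}},' +
  '{"id":"2","name":"second","child":{"$ref":"3"}}]'
);
var secondMessage = session.parse(
  '[{"id":"1","name":"third","child":{"$ref":"3"}},' +
  '{"id":"2","name":"fourth","child":{"$ref":"3"}}]'
);
console.log(secondMessage[0].child === firstMessage[0].child); // true
```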

The library also provides a means for remote referencing. If a reference to an id is encountered that is not in any of the parsed messages, then JSON.resolveRef(refString) will be called. You can provide an implementation of this method to resolve references that cannot be resolved automatically. For example, you might want an object that makes an external reference:

{"name":"fifth","child":{"$ref":"http://otherdomain.com/jsonObject"}}

When this is parsed, the resolveRef method will be called with the string in the $ref field as the argument.
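
The fallback behaviour can be sketched with a callback that is invoked only for references no known id satisfies. In this illustration the handler is passed in as a parameter named resolveRef, and linkRefs is an invented function name; the library itself calls JSON.resolveRef, as described above. What the handler returns (here a stand-in object) is also an assumption.

```javascript
// Hypothetical sketch; linkRefs and its resolveRef parameter are for illustration only.
function linkRefs(root, resolveRef) {
  var byId = Object.create(null);

  (function index(node) {
    if (node === null || typeof node !== "object") return;
    if (typeof node.id === "string") byId[node.id] = node;
    Object.keys(node).forEach(function (key) { index(node[key]); });
  })(root);

  (function link(node) {
    if (node === null || typeof node !== "object") return;
    Object.keys(node).forEach(function (key) {
      var value = node[key];
      if (value && typeof value === "object" && typeof value.$ref === "string") {
        // Known ids resolve locally; anything else is handed to the callback.
        node[key] = value.$ref in byId ? byId[value.$ref] : resolveRef(value.$ref);
      } else {
        link(value);
      }
    });
  })(root);

  return root;
}

var requested = [];
var fifth = linkRefs(
  JSON.parse('{"name":"fifth","child":{"$ref":"http://otherdomain.com/jsonObject"}}'),
  function (refString) {
    requested.push(refString);
    return { loadedFrom: refString }; // stand-in; a real handler might fetch the object
  }
);
console.log(requested.length);       // 1
console.log(fifth.child.loadedFrom); // "http://otherdomain.com/jsonObject"
```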

You can download the JSON referencing library here.

I am also updating JSPON to utilize this form of referencing to bring more consistency to JSON referencing.

It is important to note that while I am using the JSONPath convention, the references are not strictly resolved like JSONPath queries, since a JSONPath query returns a result array with an item in the array being the referenced object.

A few thoughts on the rationale behind this convention. It is intended to be highly readable. Some reference conventions embed the reference in a string, and some use an external list of references; however, an object with a simple JSON name/value pair in place of the referenced value seems to be conceptually the simplest way to define a reference and the most consistent with JSON. The name indicates that it is a reference, and the value is the string used to resolve it. I also found that using a property in an object, instead of a reference embedded in a string, appears to be slightly faster and easier to implement: there is no need for string matching and manipulation to get at the reference string. The property name is prefixed with “$” to indicate that it has a special purpose and to prevent name conflicts or a reduction in the useful expressive domain.

Posted in Libraries, Referencing

    7 Responses to “JSON Referencing Proposal and Library”

  1. Claudio D'Angelo Says:

    I think it is useful to use different naming conventions to reference an object:
    $ref => use the id
    $ref.path =>use the jsonpath
    $ref.ext => use external reference

    This way a system can optimize for performance.

  2. Kris Zyp Says:

    I have implemented this approach, and any performance benefit derived from multiple names would be negligible, if there is any at all. In typical JSON with few references, performance may actually be worse, because multiple property names must be checked on every object. It also complicates the communication. KISS: simplicity is especially important when it comes to interchangeable data. Separate names would also hinder the ability to combine id and path referencing (they are not mutually exclusive).

  3. Andreas Blixt Says:

    Hey there, I’d like to chime in as I’ve recently been working with serializing objects that have circular and/or many-to-many references that simply cannot be represented without either duplicating data or using references.

    As for your proposal it turns out I came to pretty much the same conclusion when making a generic class I decided to call ObjectManager. You can see it here: http://blixt.googlecode.com/svn/trunk/js/ObjectManager/

    But there’s also an alternative of bending the JSON rules while still keeping JavaScript syntax to make references more intuitive (and less prone to property collisions.) I came up with two solutions which are documented here:
    http://blixt.googlecode.com/svn/trunk/js/JsonReferences/jsonrefs.html

    As for why I posted this… Well, I’d like to share my thoughts which may give some ideas and get feedback on the things I’ve done so that I may get more ideas!

    Thanks for your time,
    Andreas

  4. Kris Zyp Says:

    So from what I understand from your code, instead of using “$ref” as the property name to identify a reference you used “$”. Also in your path-based referencing approach you used “this” to refer to the root instead of “$”. A few thoughts on this:

    I prefer using “$ref” as the property name (instead of “$”) because I think it makes the JSON text more readable. Readability is a very high priority for me, since I am most interested in using referencing in situations where different systems are interchanging data.

    Your use of “this” in the path-based referencing is actually more aesthetic, I think. My use of “$” to reference the root was to follow the lead of JSONPath. I didn’t want to directly introduce JavaScript syntax, whereas JSONPath is language independent (and happens to follow JS syntax). This might not be a tremendously important reason, though. I have ported this referencing library to Dojo (http://trac.dojotoolkit.org/browser/dojox/trunk/rpc/JsonReferencing.js?rev=12525), and I think it would be pretty easy to allow either “$” or “this” to reference the root.

    Also, in your id-based referencing, it looks like you are using auto-generated ids. This makes for very cryptic reading: you have to count nodes to determine the id of each object. I think id referencing should only be used when there are existing ids on the objects you are referencing; otherwise path-based referencing is much more readable.

    Next, the jsonrefs alternate solutions sample does not seem to resolve references properly. By eval’ing twice you create two sets of objects, with the returned object referencing children from the non-returned set, so you lose correct identity. Consequently, if you do:
    obj1 = jsonDecode1(json1)
    then:
    obj1.Orders[0].Customer != obj1.Customers[0]
    Which doesn’t seem right.

    Finally, I have also considered the approach of using a function call, ref(), to identify references. I think it is a good idea; it is appealing and visually very nice. I may include that in the Dojo module.

  5. Andreas Blixt Says:

    Ah, you’re entirely correct in that they shouldn’t refer to a copy of the object, because that would ruin the whole idea of referential integrity… I guess in my hurry to finish a working prototype I missed that very important detail.

    I’ll make a new version that references only one object in the coming days!

    Anyways, it’s nice to see that we’re on the same train of thought here… We must be onto something! ;)

  6. Ben Weaver Says:

    Hello,

    Have you considered making the JSON Referencing Proposal compatible with the YAML Node Anchor syntax (see http://yaml.org/spec/1.2/#id2585322)? Since YAML and JSON are so closely related, it’d be nice to keep them compatible.

    For example:

    [{"name":"first",
    "child": &c {"name":"the child"}},
    {"name":"second",
    "child": *c}]

    Or, if you wanted to incorporate path expressions:

    [&foo {"name":"first",
    "child": {"name":"the child"}},
    {"name":"second",
    "child": *foo.child}]

  7. Kris Zyp Says:

    @Ben, thanks for the suggestion, that is a good idea. A few thoughts on it:
    First, this JSON Referencing proposal is intended to be completely compatible with JSON; that is, it only uses JSON syntax and is a convention for JSON usage. I am not intending to expand the original JSON specification (application/json). YAML syntax is obviously not valid JSON.

    However, there are certainly people who feel that using extra syntax is more appropriate for doing things like referencing. This isn’t truly JSON; it is some superset of JSON. When considering a superset of JSON, I believe it is very reasonable to keep in mind that JSON is based on being a subset of JavaScript, and I think JSON supersets should generally attempt to also be subsets of JavaScript for consistency. This is more than just a matter of principle. Using “eval” to parse JavaScript is one of the most commonly used techniques for parsing JSON, and it should remain so, since the browser is a very bandwidth-limited environment and creating dependencies on complicated parsing algorithms would not be beneficial. A superset should continue to be easily eval’able, IMO. Using the YAML anchor syntax would certainly require some degree of parsing prior to eval to make it viable.

    In the realm of JSON-superset-based referencing, others have suggested, and I agree, that the best referencing strategy would probably be to simply replace the {$ref:"target"} with ref("target"). This is very easy to handle in JavaScript, and it is certainly not difficult in manual parsing schemes in other languages either.
