Standardizing RESTful JSON
19 August 2008
Joe Gregorio recently began discussing a protocol for RESTful JSON. Having spent a significant amount of time in this field, and having authored several implementations of RESTful JSON, I thought it might be beneficial to offer some comments on the subject. I am certainly supportive of Joe’s efforts towards RESTful JSON, and believe this could be very valuable for the community. Clearly defining the mechanisms for JSON-based RESTful interchange would surely improve interoperability of services and consumers. However, I think it worthwhile to offer some suggestions for this effort.
One of the most important keys to successful standardization is not to prematurely standardize on new inventions. Fortunately RESTful JSON is not a new invention, and implementations have forged the way in leveraging and interpreting existing standards and creating conventions as necessary. Any standardization should stand on the shoulders of these efforts. Initial attempts at CRUD-style JSON-based interaction were quite inventive, but inventiveness is not the friend of interoperability when standards already exist (de facto or formal), and HTTP provides a comprehensive set of mechanisms for data interaction. For many forms of data interaction, the intuitive combination of JSON and HTTP provides a high level of power. Many JSON-REST implementations mostly follow the HTTP specification, and often multiple implementations share common conventions. The notes in this post are based on the behavior of server implementations of JSON REST including Persevere, CouchDB, GrassyKnoll, Amazon S3, and RoR’s ActiveResource, and on client implementations including Dojo, Jester, and the Persevere JavaScript client.
Another key aspect of establishing a JSON REST protocol is to properly define and distinguish the orthogonal concerns involved. Basic JSON interchange using REST is actually well specified in HTTP, but applications often want higher-level constructs for effective interaction. Various applications need different features, and they should be able to retain maximum simplicity without overhead from unnecessary constructs. Some of these JSON protocols are certainly not specific to RESTful usage. Here is how I would characterize the different components that may be of interest to different RESTful JSON applications:
- Object identity to URL mapping - This is the critical protocol necessary for allowing JSON messages to contain objects/sub-resources that can be addressed through REST verbs with URLs. A convention is necessary for determining how any given object is mapped to a URL. Every JSON REST implementation that I am aware of defines an identity property and uses that identity to map to a URL in the form:
/table/id
Or tableless stores will just do:
/id
It is certainly possible to create more sophisticated URL mapping schemes with multiple identity properties and alternate URL forms. Most simplicity-driven JSON advocates would, and probably should, reject such complications.
- Data structure definition - JSON Schema seems to be emerging as the protocol for defining JSON structures.
- Service advertising - This defines a protocol for a server to advertise its services. Service Method Description builds on JSON Schema to do this effectively for JSON services.
- Topology - Plain JSON is composed of simple directed object graphs. Applications that need more sophisticated property linking can utilize JSON Referencing for cyclic and cross-domain references. It should be noted that JSON Referencing is an exercise in minimal inventiveness as well; it is basically an application of the hyperlink/relative URL scheme to JSON data. Existing standards are leveraged.
Of course these protocols are not completely orthogonal. JSON Referencing also utilizes object identities, and anticipates RESTful URLs for cross-site referencing lookups. JSON Schema can leverage referencing for describing sophisticated data structures, and can itself define identity properties.
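The identity-to-URL convention from the first item in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `object_url`, its parameters, and the percent-encoding of path segments are hypothetical choices of this sketch and are not taken from any of the implementations mentioned.

```python
from urllib.parse import quote


def object_url(obj, table=None, id_property="id"):
    """Map an object to its resource URL using its identity property.

    With a table name the result has the form /table/id; tableless
    stores get /id. Each path segment is percent-encoded (an assumption
    of this sketch, so that unusual ids still yield a valid URL path).
    """
    identity = quote(str(obj[id_property]), safe="")
    if table is None:
        return "/" + identity
    return "/" + quote(table, safe="") + "/" + identity


trail = {"id": "3", "name": "Wasatch Crest Trail"}
print(object_url(trail, table="Trail"))  # /Trail/3
print(object_url(trail))                 # /3
```

Keeping a single identity property, as here, is what makes the mapping trivial to reverse: the last path segment is always the identity.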
I certainly would be happy to see these protocols standardized. However, my main concern is that a new uber-protocol that invents new techniques for all these concepts, and creates divergence between RESTful JSON usage and other JSON usage, would not be good for the community. Simply taking a single example use case and lumping all of the needs into a single specification is certainly not a wise course. And most importantly, standardization of RESTful JSON should stand on the work of the current implementations, not reinvent wheels and contradict the conventions that are in use today.
Another important concern is that a RESTful JSON protocol should not add significant complexity. Complexity may be acceptable in other realms, but JSON pioneers have rightly fought for keeping things simple. RESTful web services should be able to simply provide data in the form in which it exists. For example, if I want to return a set of users, it should be as simple as returning:
[{"name":"Kris", "gender":"male", "id":"1"},
{"name":"Nicole", "gender":"female", "id":"2"}]
A RESTful JSON protocol is not the place to force developers to wrap everything in envelopes or to force extraneous metadata into objects.
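To make the point concrete, here is a minimal Python sketch of a service returning the collection as it exists and a consumer reading it back. The function name `render_collection` is hypothetical and used only for illustration.

```python
import json

users = [
    {"name": "Kris", "gender": "male", "id": "1"},
    {"name": "Nicole", "gender": "female", "id": "2"},
]


def render_collection(items):
    """Serialize the items as a bare JSON array, with no envelope."""
    return json.dumps(items)


# The consumer parses the body and iterates directly; there is no
# wrapper object to unwrap first.
body = render_collection(users)
names = [user["name"] for user in json.loads(body)]
print(names)  # ['Kris', 'Nicole']
```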
Finally, here is a simple example of combining HTTP REST with JSON Schema to provide comprehensive knowledge of the data structure for full interaction. Here is how we have been doing it:
GET /Trail
response:
{
name:"A database of trails",
properties:{
name:{type:"string"},
description:{type:"string"},
id:{type:"string",unique:true},
length:{type:"number"},
...
}
}
And then getting an object:
GET /Trail/3
response:
{
id:"3",
name:"Wasatch Crest Trail",
description:"Follows the ridgeline of the Wasatch mts",
length:3,
...
}
Likewise we can intuitively PUT or DELETE to /Trail/3.
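As a rough illustration of how little machinery this interaction needs, here is a minimal in-memory sketch in Python. It is not how Persevere or any other implementation works; the class `TrailStore`, the helper `conforms`, the status codes returned, and the choice to take the id from the URL on PUT are all assumptions of this sketch. The schema contains only the properties shown above, since the rest are elided.

```python
import json

# Schema from the example above, without the elided properties.
TRAIL_SCHEMA = {
    "name": "A database of trails",
    "properties": {
        "name": {"type": "string"},
        "description": {"type": "string"},
        "id": {"type": "string", "unique": True},
        "length": {"type": "number"},
    },
}

# Python types accepted for each schema type used in this sketch.
TYPE_CHECKS = {"string": str, "number": (int, float)}


def conforms(obj, schema):
    """Check every declared property that is present against its type."""
    for name, definition in schema["properties"].items():
        if name in obj:
            value = obj[name]
            expected = TYPE_CHECKS[definition["type"]]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                return False
    return True


class TrailStore:
    """Hypothetical in-memory store answering /Trail and /Trail/<id>."""

    def __init__(self):
        self.objects = {}

    def handle(self, method, path, body=None):
        """Return a (status, body) pair for the given request."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "Trail" or len(parts) > 2:
            return 404, None
        if len(parts) == 1:
            # GET /Trail returns the schema, as in the example above.
            if method == "GET":
                return 200, json.dumps(TRAIL_SCHEMA)
            return 405, None
        object_id = parts[1]
        if method == "GET":
            if object_id not in self.objects:
                return 404, None
            return 200, json.dumps(self.objects[object_id])
        if method == "PUT":
            obj = json.loads(body)
            if not isinstance(obj, dict) or not conforms(obj, TRAIL_SCHEMA):
                return 400, None
            obj["id"] = object_id  # the URL determines the identity
            self.objects[object_id] = obj
            return 200, json.dumps(obj)
        if method == "DELETE":
            if self.objects.pop(object_id, None) is None:
                return 404, None
            return 204, None
        return 405, None


store = TrailStore()
store.handle("PUT", "/Trail/3",
             json.dumps({"name": "Wasatch Crest Trail", "length": 3}))
status, body = store.handle("GET", "/Trail/3")
print(status, body)
```

The verbs map one-to-one onto store operations, and the schema served at /Trail is the same structure used to check incoming objects, so no extra protocol layer is needed.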
I hope we can together make progress towards a standard RESTful JSON protocol that is useful for the community, but I believe this can only be done if we keep things simple and build upon what has been done by existing implementations. I believe we have a great foundation for future progress.
15 Responses to “Standardizing RESTful JSON”
August 19th, 2008 at 5:51 pm
Good to point out existing implementations. Speaking of, I hope you’re using one to trace the paths of your trails?
August 20th, 2008 at 3:07 am
I agree that the implementation of this should be as simple as possible. However, when I began looking at the proposed JSON schema I felt it was very unwieldy.
For example, there’s a “pattern” property which can use a regexp, and there’s the format property which can take a gazillion different types including phone, street address and - yes - regex!
I think that perhaps an approach that would be simpler to implement would be to leave (even fairly common) arbitrary domain types (like telephone, etc.) out of the definition, and stick with basic types.
I don’t feel that domain constraints or typing adds any value to the definition and only makes it hard to grasp as well as overlapping in places.
Sorry to sound like a wet blanket, but I (also) think that simplicity is key and I’d like to see the schema proposal become as simple as possible, instead of being as expressive as possible.
In our current project we’ve decided not to use it (even though we really need a schema) as it is, just for the overlappings and complexity issues. We will probably use a simplified version of it, though.
Thanks a lot for the good works anyway, and sorry to be a wet blanket!
Cheers,
PS
August 20th, 2008 at 7:28 am
@Sean: The example is from a demo I am creating for Persevere.
@Peter: The “formats” are intended to be a non-normative extension (maybe that isn’t clear, but it is on a separate page).
Here is how you define a string property with formats in the spec:
foo:{type:"string"}
Here is how you define a string property without formats in the spec:
foo:{type:"string"}
I am not sure how removing such “extras” simplifies the bread-and-butter usage of JSON Schema. JSON Schema is intended to be easy to use for simple cases, and it gets more complex only for users that want to do more complex things.
I appreciate the feedback. Do you have a proposal for how to make it easier to use? We have been pretty aggressive in trying to make it easy and simple to write schemas, so any proposal towards that end would certainly be of interest.
August 20th, 2008 at 8:05 am
No of course not. I didn’t mean to remove the string type
What I meant was to have a central specification which only specified the types and constraints of the schema, and to have constraints perhaps as a string that was not further defined in the Schema.
So a minimal example would be to have only two properties for any item in the schema; type (String, number, integer, boolean, object, array) and constraint (any string at all, perhaps a regex, which is interpreted by the parsing party)
Possibly option and default could be included as well, but not minItems, maxLength, etc., since those pour a lot of semantics onto the schema.
Also, the format property, if present, should IMO not be defined at all either, especially not with domain-specific types such as telephone, address, gender or the like.
What I’m aiming at is to have a format which is as slim as possible and which makes few or no domain-specific assumptions about the items, but still allows room for uninterpreted strings which can be used “higher up”.
The main purpose of the schema is after all structure. Maybe one could have a central definition which does this minimal structure and minimal typing definition, and then have an optional extension to it (maybe as a separate object property to an item) that can be used if needed.
Our initial goal was to implement JSON Schema as it was, but what made us not do so was just these very long lists of properties that are in many cases application dependent.
Also, if you include telephone as a type of format, maybe you should include birthdate, and if not, why not? Etc.
The best route as I see it is to separate specifics from the schema definition as much as you can, and have those things added, if needed, as extensions at a later date.
Perhaps one could have the two parts in the following way 1) structural schema, 2) Several different domain-specific schemas for different domains; App Engine Models, Yahoo Search properties.
Another thought: it feels like what I’m after is something like the specification for the SMD format, whereas the current JSON Schema definition can be compared to an SMD file specification where all possible types (Google user, link to image, telephone number, max values, what have you) are penciled in as alternatives, which makes it quite large.
I feel that I’m rambling a bit, and I hope that I despite this manage to bring some sort of clarity to my otherwise generic gripes.
So, anyway; A shorter format, with just the basics would be much simpler to implement, where we could add more typing if absolutely needed.
Cheers,
PS
August 20th, 2008 at 8:14 am
Kris, I just noticed (stressed coffee break postings should be avoided!) that you said that the long and winding list of domain-specific formats was indeed optional. My bad!!
Cheers,
PS
August 20th, 2008 at 2:22 pm
In your example above GET /Trail returns a JSON schema and GET /Trail/3 returns an object - there doesn’t appear to be a way of retrieving a list of objects.
Have I missed something?
August 21st, 2008 at 5:36 pm
“A RESTful JSON protocol is not the place to force developers to wrap everything in envelopes…”
Unfortunately, there are legitimate security reasons to always return a JSON object as the top-level container. Returning a list at the top-level is a known XSS hole.
August 21st, 2008 at 11:06 pm
> Returning a list at the top-level is a known XSS hole.
More info: http://code.google.com/p/doctype/wiki/ArticleScriptInclusion
August 23rd, 2008 at 7:18 pm
@American: Returning a list as the top-level value is not a security hole in a properly secured environment; this FUD has circulated for some time. If you are delivering data with cookie-only authorization (bad practice), then you already have a security hole, and this hole can be exploited easily when the top-level value is an array. However, the top-level array is not the security hole; the lack of proper authorization is. Protected resources should never be delivered to cross-site requesters without proper authorization (not just authentication through cookies).
@Mark: Thanks for the link, hopefully that clears up the issue for those that think that arrays are a security hole.
October 11th, 2008 at 9:31 am
I think I agree that it is *far too early* to try to standardize this — it happens that I really don’t like the approach in the proposed JSON Schema to attribute naming - specifically that it is overly verbose and persists (’scuse pun) an unnecessary impedance mismatch with the relational model.
October 11th, 2008 at 9:36 am
@Steve: What is it about the attribute naming that you consider overly verbose? Would you like “props” instead of “properties”? Feel free to voice concerns or suggestions on the JSON Schema discussion group: http://groups.google.com/group/json-schema
December 22nd, 2008 at 6:16 pm
In your example above GET /Trail returns a JSON schema and GET /Trail/3 returns an object - there doesn’t appear to be a way of retrieving a list of objects.
May 20th, 2009 at 2:14 am
jhfsdf
May 29th, 2009 at 2:15 am
I’ve been working on resource discovery (or application/service description if you prefer) in JSON and I was recently made aware of JSON Schema. Please see Partial template expansion in described_routes. I’m keen to collaborate if anyone is interested.
May 29th, 2009 at 2:20 am
Oops - that link should have been Partial template expansion in described_routes.