JSONPath Query Expressions
16 October 2007
JSONPath is a query language for JSON authored by Stefan Gössner. It is built on the concepts of XPath, but applied to JSON. JSONPath is intended to utilize natural language characteristics and to be very lightweight and easy to implement. However, one of the remaining challenges of JSONPath is to clearly define the allowable query expressions. In a recent discussion with Stefan, he asked, “How should a small subset of language independent JSONPath query expressions look like?”
JSONPath allows filter expressions to be applied to the current result set. For example, one could search for books with a price less than 10, from a book array with the following query (example from http://goessner.net/articles/JsonPath/):
$.book[?(@.price<10)]
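To make the filter concrete, here is a minimal sketch in plain JavaScript of what the query above selects. The sample data and the names `store` and `cheapBooks` are made up for illustration; a real JSONPath implementation would parse the query string rather than hard-code the predicate.

```javascript
// Hypothetical sample data; titles and prices are illustrative only.
var store = {
  book: [
    { title: "A", price: 8.95 },
    { title: "B", price: 12.99 },
    { title: "C", price: 9.5 }
  ]
};

// Hand-written equivalent of $.book[?(@.price<10)]:
// "@" stands for the current array item being tested.
var cheapBooks = store.book.filter(function (item) {
  return item.price < 10;
});

console.log(cheapBooks.map(function (b) { return b.title; }));
```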
This is a very powerful feature of JSONPath, but the specification currently just defers to the expression capabilities of the underlying script engine. For JSONPath to be language independent (as JSON is), and for filters to be limited to expressions that can be executed safely, the domain of query expressions needs to be narrower than full JavaScript expressions. I wanted to invite discussion on this topic: what do you want to see in JSONPath query expressions? Are there lessons we can learn from XPath here? Should JSONPath query expressions be a strict subset of JavaScript? Staying within a subset of JavaScript simplifies JavaScript implementations, since syntax-checked, non-transformed strings can be eval’ed.
To help you think about the topic, here is a list of some JavaScript operators (from Narcissus):
‘,’: “COMMA”,
‘?’: “HOOK”,
‘:’: “COLON”,
‘||’: “OR”,
‘&&’: “AND”,
‘|’: “BITWISE_OR”,
‘^’: “BITWISE_XOR”,
‘&’: “BITWISE_AND”,
‘===’: “STRICT_EQ”,
‘==’: “EQ”,
‘=’: “ASSIGN”,
‘!==’: “STRICT_NE”,
‘!=’: “NE”,
‘<<’: “LSH”,
‘<=’: “LE”,
‘<’: “LT”,
‘>>>’: “URSH”,
‘>>’: “RSH”,
‘>=’: “GE”,
‘>’: “GT”,
‘++’: “INCREMENT”,
‘--’: “DECREMENT”,
‘+’: “PLUS”,
‘-’: “MINUS”,
‘*’: “MUL”,
‘/’: “DIV”,
‘%’: “MOD”,
‘!’: “NOT”,
‘~’: “BITWISE_NOT”,
‘.’: “DOT”,
‘[’: “LEFT_BRACKET”,
‘]’: “RIGHT_BRACKET”,
‘{’: “LEFT_CURLY”,
‘}’: “RIGHT_CURLY”,
‘(’: “LEFT_PAREN”,
‘)’: “RIGHT_PAREN”
What would you like to see in JSONPath filter expressions?
4 Responses to “JSONPath Query Expressions”
October 16th, 2007 at 11:57 pm
After a little bit of thought it seems like the filter expressions should be a subset of JS and should be something like:
Values should be limited to primitives like "string", true, 2, null, and linear path references from the current position (@) like @.foo.bar and @[3]["foo"] (I believe allowing JSONPaths inside expressions complicates implementation).
Operators should be limited to (with their normal JS meanings):
?,:,|,&,===,!==,==,!=,>,<,>=,<=,-,+,*,/,%,!,(,)
And I guess if we want expressions to eval, we need to define that unquoted path names that are JS reserved words are not allowed (you can’t do @.new, you have to do @["new"]).
We should also ensure that we can create a regex syntax checker that will check to make sure the expression is safe before running eval on it (like json.js does).
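As a rough illustration of such a check, here is a sketch of a token whitelist for the subset proposed above. The names `TOKEN` and `passesFilterWhitelist` are hypothetical, only double-quoted strings are handled, and this is a vocabulary check rather than a grammar check or a security guarantee: malformed input that passes would still raise a SyntaxError in eval.

```javascript
// Allowed tokens, built from small regexes so each piece can be commented.
var TOKEN = new RegExp([
  /\s+/,                               // whitespace
  /"(?:[^"\\]|\\.)*"/,                 // double-quoted string literal
  /\d+(?:\.\d+)?/,                     // plain decimal number
  /\b(?:true|false|null)\b/,           // primitive keywords
  // linear path from the current position: @.foo.bar, @[3]["foo"]
  /@(?:\.[A-Za-z_$][\w$]*|\[(?:\d+|"(?:[^"\\]|\\.)*")\])*/,
  /===|!==|==|!=|>=|<=/,               // comparison operators
  /\+(?!\+)|-(?!-)|<(?!<)|>(?!>)/,     // single + - < > (no ++, --, <<, >>)
  /[?:|&*\/%!()]/                      // remaining single-character operators
].map(function (r) { return r.source; }).join("|"), "g");

function passesFilterWhitelist(expr) {
  // Reject anything shaped like a call: an operand directly followed by "(".
  // This is deliberately conservative and also rejects "(" after text in strings.
  if (/[\w$\])"]\s*\(/.test(expr)) return false;
  // Remove every allowed token; any leftover character is a disallowed construct
  // (for example a lone "=" assignment or a bare identifier).
  return expr.replace(TOKEN, "") === "";
}

console.log(passesFilterWhitelist('@.price<10'));  // accepted
console.log(passesFilterWhitelist('@.price = 0')); // rejected: assignment
```

Rejecting call syntax matters because a path such as @.constructor.constructor is made only of allowed tokens, yet calling it would reach the Function constructor.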
October 17th, 2007 at 4:47 pm
Your post concisely sums up what needs to be improved in JSONPath. Some more aspects:
JSONPath syntax is oriented more towards E4X than XPath, emulating the core of XPath functionality though.
JSONPath should be lightweight and easy to implement in different languages, as you correctly emphasize. A consequence of this is that we cannot rely on ‘eval’, but need to parse the expression manually. In particular, statically compiled languages (Java, C#) shouldn’t need to embed an external script interpreter.
So I am thinking about a really small, very simple and easy to implement core set of expressions, for example
[/regex/] // regular expression
['pre*'] // prefix
['*post'] // suffix
['som? t?xt'] // single character wildcards
[last()] // last array/object member
and possibly simple comparisons.
October 17th, 2007 at 5:44 pm
For string matching, would it be easier to allow only regular expressions? Prefix, suffix and character wildcards can all be expressed with regular expressions. Also, does string matching like ['pre*'] or [/regex/] indicate that you are searching through the current results, looking for property names that match? So could it be used like: $.book[?(@.name='pre*')]
Also, what are you thinking for simple comparisons?
October 19th, 2007 at 12:30 pm
Yes, you are perfectly right. Prefix, suffix and character wildcards can all be expressed with regular expressions. Look at them as convenient shortcuts, but I also lean toward dropping them from the list in order to keep it free of redundancy. Regexes do string matching solely with the current member name.
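To show why the shortcuts are redundant, here is a small sketch that rewrites a wildcard pattern as an anchored regular expression and applies it to member names. The helper names `wildcardToRegExp` and `matchingMembers` are hypothetical, and treating `*` as "any run of characters" and `?` as "exactly one character" is an assumption about the intended semantics.

```javascript
// Turn a shortcut such as 'pre*' or 'som? t?xt' into an anchored RegExp.
function wildcardToRegExp(pattern) {
  var source = pattern
    .replace(/[.+^${}()|[\]\\\/]/g, "\\$&") // escape regex metacharacters except * and ?
    .replace(/\*/g, ".*")                   // * matches any run of characters
    .replace(/\?/g, ".");                   // ? matches exactly one character
  return new RegExp("^" + source + "$");
}

// Return the member names of obj that match the wildcard pattern.
function matchingMembers(obj, pattern) {
  var re = wildcardToRegExp(pattern);
  return Object.keys(obj).filter(function (name) {
    return re.test(name);
  });
}

console.log(matchingMembers({ prefix: 1, preview: 2, post: 3 }, "pre*"));
```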
Simple comparisons as query expressions work by looking for the value of child members and might have the syntax
$.any.member[?(@.child op value)]
where op is one of [=,!=,<,<=,>,>=].
Those expressions are highly useful, are likely to be used frequently, and are easy to implement. Adding them does not bloat JSONPath a lot and could be a good start … what do you think?
One problem here is how to differentiate syntactically between query expressions of the core set, to be parsed ‘manually’, and query expressions to be parsed by the underlying script engine. One way out would be to use square brackets or curly braces instead of parentheses. Do you see a better solution?
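A manual parser for this form can stay very small. The sketch below assumes the operator set is =, !=, <, <=, >, >= and that the value is a number, a quoted string, true, false or null; the names `SIMPLE`, `OPS`, `parseValue` and `compileSimpleFilter` are hypothetical. It handles only the inner "@.child op value" part and never calls eval.

```javascript
// Matches "@.child op value"; two-character operators are tried first.
var SIMPLE = /^\s*@\.([A-Za-z_$][\w$]*)\s*(<=|>=|!=|=|<|>)\s*(.+?)\s*$/;

// Comparison functions; "=" and "!=" are mapped to strict (in)equality here.
var OPS = {
  "=":  function (a, b) { return a === b; },
  "!=": function (a, b) { return a !== b; },
  "<":  function (a, b) { return a < b; },
  "<=": function (a, b) { return a <= b; },
  ">":  function (a, b) { return a > b; },
  ">=": function (a, b) { return a >= b; }
};

// Convert the literal on the right-hand side into a JavaScript value.
function parseValue(text) {
  if (/^-?\d+(\.\d+)?$/.test(text)) return Number(text);
  if (/^'[^']*'$/.test(text) || /^"[^"]*"$/.test(text)) return text.slice(1, -1);
  if (text === "true") return true;
  if (text === "false") return false;
  if (text === "null") return null;
  throw new SyntaxError("Unsupported value: " + text);
}

// Compile the expression into a predicate usable with Array.prototype.filter.
function compileSimpleFilter(expr) {
  var m = SIMPLE.exec(expr);
  if (!m) throw new SyntaxError("Not a simple comparison: " + expr);
  var child = m[1], test = OPS[m[2]], value = parseValue(m[3]);
  return function (item) {
    // Items without the child member never match.
    return item !== null && typeof item === "object" &&
      Object.prototype.hasOwnProperty.call(item, child) &&
      test(item[child], value);
  };
}

var books = [{ price: 8.95 }, { price: 12.99 }, { title: "x" }];
console.log(books.filter(compileSimpleFilter("@.price<10")));
```

Because the grammar is a single regular expression plus a lookup table, the same approach ports directly to languages without an embedded script engine.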
Finally there might be support of some indexing helper functions like ‘last()’ as in XPath.
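A helper like last() could be sketched as follows. Treating the "last member" of an object as the last key in Object.keys order is an assumption, not something settled in this thread, and the function name simply mirrors the proposed syntax.

```javascript
// Return the last element of an array, or the value of the last
// enumerable own key of an object (in Object.keys order).
function last(container) {
  if (Array.isArray(container)) {
    return container[container.length - 1];
  }
  var keys = Object.keys(container);
  return container[keys[keys.length - 1]];
}

console.log(last([1, 2, 3]));
console.log(last({ a: "first", b: "second" }));
```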