JSON Schema Proposal
This is out of date. The new proposal can be found in the JSON Schema discussion group.
JSON Schema is a specification for a JSON-based format for defining the structure of JSON data. JSON Schema provides a contract for what JSON data is required for a given application and how it can be modified, much like what XML Schema provides for XML. JSON Schema is intended to provide validation, documentation, and interaction control of JSON data. JSON Schema is based on the concepts from XML Schema, RelaxNG, and Kwalify, but is intended to be JSON-based. This means that JSON data in the form of a schema can be used to validate JSON data, the same serialization/deserialization tools can be used for the schema and the data, and the format can be self-descriptive.
For this specification, a schema denotes a JSON Schema definition, and an instance refers to the JSON object that the schema describes and validates. A JSON Schema is a JSON object with properties that correspond to properties in an instance. The value of each property in the schema should be a property definition that defines the valid values for the corresponding property in the instance. A property definition can define various attributes of the instance property that determine its usage and valid values. An example property definition looks like:
{
"type":"string",
"optional":true
}
And an example JSON Schema definition could look like:
{
"name": {"type":"string"},
"age" : {"type":"number",
"maximum":125}
}
JSON instance objects can also have a self-defined schema. An instance object can use a schema property to refer to a JSON Schema object that defines the schema of the referring object. Note that this is not limited to the root object: any JSON object can refer to a schema to provide self-definition. For example:
{
"name" : "John Doe",
"age" : 30,
"schema" : {
"name": {"type":"string"},
"age" : {"type":"number",
"maximum":125,
"optional":true}
}
}
This is a self-defined JSON object, and it can be validated without a separate schema object.
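The self-definition described above can be sketched in JavaScript. This is a minimal illustration, not a reference implementation: the function names `checkValue`, `validate`, and `validateSelfDefined` are hypothetical, only the `type`, `optional`, and `maximum` attributes are handled, and only the `"string"` and `"number"` type names that appear in the examples are recognized.

```javascript
// Hypothetical helper: checks one value against one property definition.
// Only the "string" and "number" type names from the examples are handled;
// any other type value accepts every instance value.
function checkValue(value, definition) {
  const type = definition.type;
  if ((type === "string" || type === "number") && typeof value !== type) {
    return false;
  }
  if (typeof value === "number" &&
      typeof definition.maximum === "number" &&
      value > definition.maximum) {
    return false;
  }
  return true;
}

// Hypothetical helper: validates an instance against a schema object and
// returns a list of error messages (empty when the instance is valid).
function validate(instance, schema) {
  const errors = [];
  for (const [name, definition] of Object.entries(schema)) {
    if (!(name in instance)) {
      // "optional" is false by default, so a missing property is an error.
      if (!definition.optional) errors.push(`missing required property: ${name}`);
      continue;
    }
    if (!checkValue(instance[name], definition)) {
      errors.push(`invalid value for property: ${name}`);
    }
  }
  return errors;
}

// Validates a self-defined object using its own "schema" property.
function validateSelfDefined(instance) {
  const { schema, ...data } = instance;
  if (schema === undefined) return []; // nothing to validate against
  return validate(data, schema);
}

const person = {
  name: "John Doe",
  age: 30,
  schema: {
    name: { type: "string" },
    age: { type: "number", maximum: 125, optional: true }
  }
};

const errors = validateSelfDefined(person);
```

Here `errors` is an empty array for the example object. Changing `age` to a value above 125 or removing `name` would each add one message.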
Property Definition
A property definition can have the following properties (all properties are optional):
| Property | Meaning |
| type | This is a type definition value. A type definition can be a string indicating a primitive, it can be another schema definition object, or it can be an array which is a union of type definitions. The following are acceptable strings to denote primitive values:
If the property is not defined or is not in this list, then any type of value is acceptable. Other type values may be used for custom purposes, but minimal implementations of the specification may allow any instance value for unknown type values. |
| optional | This indicates that the instance property in the instance object is optional. This is false by default. |
| nullable | This indicates that the instance property allows null. This is false by default. |
| unique | This indicates that the instance property should have unique values. No other property with the same name in the instance object tree should have the same value. |
| minimum | This indicates the minimum value for the instance property when the type of the instance value is a number, or it indicates the minimum number of values in an array when an array is the instance value. |
| maximum | This indicates the maximum value for the instance property when the type of the instance value is a number, or it indicates the maximum number of values in an array when an array is the instance value. |
| pattern | When the instance value is a string, this provides a regular expression that an instance string value should match in order to be valid. Regular expressions should follow the regular expression specification from ECMA 262/Perl 5. |
| length | When the instance value is a string, this indicates the maximum length of the string. |
| options | This provides an enumeration of possible values that are valid for the instance property. |
| unconstrained | When used in conjunction with the options property, this indicates a value can be used that is not in the list of options. This has no meaning when the options property is not a sibling. |
| readonly | This indicates that the instance property should not be changed (this is only for interaction, it has no effect for standalone validation). |
| description | This provides a description of the purpose of the instance property. The value can be a string, or it can be an object with properties corresponding to different languages (with an optional default property indicating the default description). For example, a multilingual description could be: "description":{"en":"My property","fr":"Ma propriété"} |
| format | This indicates what format the data is among some predefined formats which may include:
Need to provide definitions for these formats, probably in another page. |
| default | This indicates the default for the instance property. |
| transient | This indicates that the property will be used for transient/volatile values that should not be persisted. This is false by default. |
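Several of the attributes in the table above can be sketched as a single check in JavaScript. This is an illustration under stated assumptions rather than a definitive implementation: `matchesType` and `checkProperty` are hypothetical names, only the `"string"` and `"number"` type names used in this page's examples are recognized, nested schema objects given as a type are not descended into, `maximum` is treated as an upper bound, and `pattern` is tested without anchoring because the text does not say whether the whole string must match.

```javascript
// Hypothetical helper: tests a value against a type definition value.
function matchesType(value, type) {
  if (type === undefined) return true;                 // no type: anything goes
  if (Array.isArray(type)) {
    return type.some((t) => matchesType(value, t));    // union of type definitions
  }
  if (type === "string" || type === "number") return typeof value === type;
  return true; // unknown names and nested schema objects are accepted in this sketch
}

// Hypothetical helper: returns a list of problems for one instance value.
function checkProperty(value, def) {
  const errors = [];
  if (value === null) {
    // "nullable" is false by default.
    if (!def.nullable) errors.push("null is not allowed");
    return errors;
  }
  if (!matchesType(value, def.type)) errors.push("wrong type");

  // minimum/maximum apply to numbers and to the number of values in an array.
  const size = typeof value === "number" ? value
    : Array.isArray(value) ? value.length
    : undefined;
  if (size !== undefined) {
    if (typeof def.minimum === "number" && size < def.minimum) errors.push("below minimum");
    if (typeof def.maximum === "number" && size > def.maximum) errors.push("above maximum");
  }

  if (typeof value === "string") {
    // Assumption: an unanchored match is enough to satisfy "pattern".
    if (def.pattern !== undefined && !new RegExp(def.pattern).test(value)) {
      errors.push("does not match pattern");
    }
    if (typeof def.length === "number" && value.length > def.length) {
      errors.push("too long");
    }
  }

  // "unconstrained" lifts the restriction to the listed options.
  if (Array.isArray(def.options) && !def.unconstrained && !def.options.includes(value)) {
    errors.push("not one of the options");
  }
  return errors;
}

const genderErrors = checkProperty("other", { type: "string", options: ["male", "female"] });
```

With these definitions `genderErrors` holds one entry, because the value is not among the options and `unconstrained` is not set.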
Below is a more sophisticated example, first the instance object:
{
"name" : "John Doe",
"born" : "1979-03-23T06:26:07Z",
"gender" : "male",
"address" : {"street":"123 S Main St",
"city":"Springfield",
"state":"CA"}
}
And here is a schema to validate it:
{
"name": {"type":"string"},
"born" : {"type":["number","string"], // allow for a numeric year, or a full date
"format":"date", // format when a string value is used
"minimum":1900, // min/max for when a number value is used
"maximum":2010,
"optional":true
},
"gender" : {"type":"string",
"options":["male","female"]},
"address" : {"type":
{"street":{"type":"string"},
"city":{"type":"string"}, "state":{"type":"string"}},
"format":"address"}}
Schema Definition
A JSON Schema definition object is made up of properties with property definitions that define corresponding properties in instance objects. However, JSON Schema definition objects can also include a few extra properties that define additional characteristics of instance objects:
| Property | Meaning |
| items | This indicates that the instance should be an array and the value of this property should be a property definition for the items in the array. For example, an array of strings would be defined as: "items":{"type":"string"} |
| * | This provides a default property definition for all properties that are not explicitly defined in the schema object. |
| final | This indicates that the instance objects should not have any additional properties beyond what is defined in the schema object, and the schema object should not be extended. This is false by default. |
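The three schema-level properties in the table above could be handled along the following lines. This is a sketch only: `SCHEMA_KEYWORDS`, `itemsMatch`, `definitionFor`, and `extraProperties` are hypothetical names, item types are limited to the `"string"` and `"number"` names used in this page's examples, and because the text does not say how `*` and `final` interact, the `final` check here looks only at explicitly declared properties.

```javascript
// Names that describe the schema itself rather than instance properties.
const SCHEMA_KEYWORDS = new Set(["items", "*", "final"]);

// Hypothetical helper: true when the instance satisfies the "items" definition.
function itemsMatch(instance, schema) {
  if (schema.items === undefined) return true;
  if (!Array.isArray(instance)) return false;      // "items" implies an array
  const type = schema.items.type;
  if (type !== "string" && type !== "number") return true; // other types accepted
  return instance.every((item) => typeof item === type);
}

// Hypothetical helper: true when the schema explicitly declares the property.
function isDeclared(schema, name) {
  return Object.prototype.hasOwnProperty.call(schema, name) &&
    !SCHEMA_KEYWORDS.has(name);
}

// Returns the property definition for a name, falling back to the "*" default.
function definitionFor(schema, name) {
  return isDeclared(schema, name) ? schema[name] : schema["*"];
}

// Lists instance properties that a final schema does not declare.
function extraProperties(instance, schema) {
  if (!schema.final) return []; // "final" is false by default
  return Object.keys(instance).filter((name) => !isDeclared(schema, name));
}

const strict = { name: { type: "string" }, final: true };
const extras = extraProperties({ name: "John Doe", nickname: "JD" }, strict);
```

In this example `extras` contains `"nickname"`, the one property that the final schema does not declare.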
Extending and Referencing
JSON Schema will follow the same rules for referencing and inheritance as described in JSPON (note that the convention for referencing may change per this discussion). This allows for circular references within schemas and provides a way for schemas to extend other schemas. To reference an object, that object should have an id property with a unique value, and a reference to it should be an object with a single property, id, that refers to the original object. Extending a schema can be done by referring to the base schema with the basis property. For example, here is a self-defined schema using id referencing and inheritance/extension:
{ "id" : "johndoe",
"name" : "John Doe",
"spouse" : {
"name" : "Jane Doe",
"spouse" : {"$ref" : "johndoe"}}, // refer back to John Doe
"children" : [{"name" : "Jenny Doe"},
{"id" : "jimdoe",
"name" : "Jim Doe",
"spouse" : {"name" : "Janice Doe","spouse":{"$ref" : "jimdoe"}},
"schema" : {"$ref" : "marriedperson"}}], // specify the schema for Jim Doe: marriedperson
"schema" : {"id" : "marriedperson", // the married person schema extends the person schema and inherits the name property
"basis" : {"id" : "person", // here person just defines the name property
"name": {"type":"string"}},
"spouse" : {"type":{"$ref":"marriedperson"}}, // the spouse value should be a married person, we use a reference to indicate this schema
"children" : {"type":
{"items":{"$ref":"person"}}} // we define that each item in the array should be a person schema
}
}
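One way to process the example above is to index every object that carries an id, resolve `$ref` objects against that index, and flatten the basis chain into a single set of property definitions. The JavaScript below is a sketch under those assumptions: `collectIds`, `resolve`, and `effectiveSchema` are hypothetical names, and JSPON may define details that this sketch does not cover.

```javascript
// Walks a JSON tree and records every object that has a string "id".
function collectIds(node, ids = new Map()) {
  if (node === null || typeof node !== "object") return ids;
  if (typeof node.id === "string") ids.set(node.id, node);
  for (const child of Object.values(node)) collectIds(child, ids);
  return ids;
}

// Replaces a {"$ref": "..."} object with the object it refers to.
function resolve(node, ids) {
  if (node !== null && typeof node === "object" && typeof node.$ref === "string") {
    if (!ids.has(node.$ref)) throw new Error(`unknown reference: ${node.$ref}`);
    return ids.get(node.$ref);
  }
  return node;
}

// Merges a schema with the chain of schemas named by "basis".
// Definitions in the extending schema override inherited ones.
function effectiveSchema(schema, ids) {
  const { basis, id, ...properties } = resolve(schema, ids);
  const inherited = basis === undefined ? {} : effectiveSchema(basis, ids);
  return { ...inherited, ...properties };
}

// The example from the text, with the comments removed.
const john = {
  id: "johndoe",
  name: "John Doe",
  spouse: { name: "Jane Doe", spouse: { $ref: "johndoe" } },
  children: [
    { name: "Jenny Doe" },
    { id: "jimdoe",
      name: "Jim Doe",
      spouse: { name: "Janice Doe", spouse: { $ref: "jimdoe" } },
      schema: { $ref: "marriedperson" } }
  ],
  schema: {
    id: "marriedperson",
    basis: { id: "person", name: { type: "string" } },
    spouse: { type: { $ref: "marriedperson" } },
    children: { type: { items: { $ref: "person" } } }
  }
};

const ids = collectIds(john);
const jimSchema = effectiveSchema(john.children[1].schema, ids);
```

Here `jimSchema` carries the `name` definition inherited from person alongside the `spouse` and `children` definitions from marriedperson, and `resolve(john.spouse.spouse, ids)` returns the `john` object itself, which is how the circular reference is expressed.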
Schema Location Conventions
There are a couple of recommended, very natural ways for schemas to be correlated with JSON data without actually including the schema in the object. By using id referencing (per JSPON), ids provide an implicit URL through the web's relative URI scheme. For example, if an object is requested from http://mydomain.com/jsonData and returns:
{ "foo":"bar",
"schema":{"$ref":"mySchema"}}
The reference to the schema here is an id reference, and using relative URI rules, it indicates to a client that a schema for this object can be retrieved from http://mydomain.com/mySchema. We can also use an absolute URL reference:
{ "foo":"bar",
"schema":{"$ref":"http://mydomain.com/mySchema"}}
Another recommended approach is to use property referencing syntax to augment URLs as a convention for correlating data with schemas. With this convention, one can find the schema for a JSON data URL by appending ".schema". The ".schema" suffix follows naturally from the fact that objects can reference schemas with a "schema" property. So, for example, if http://mydomain.com/jsonData is a JSON data source, one should be able to find a schema for this data (without even examining the data) at: http://mydomain.com/jsonData.schema
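Both conventions come down to simple URL handling, sketched here in JavaScript with the standard `URL` class. The function names `schemaUrlFromReference` and `schemaUrlByConvention` are hypothetical, and the second function assumes the data URL has no query string or fragment.

```javascript
// Resolves an id reference against the URL the data was fetched from,
// using ordinary relative URL resolution. Absolute references pass through.
function schemaUrlFromReference(dataUrl, ref) {
  return new URL(ref, dataUrl).href;
}

// Builds the schema URL by appending ".schema" to the data URL.
// Assumption: the data URL carries no query string or fragment.
function schemaUrlByConvention(dataUrl) {
  return dataUrl + ".schema";
}

const dataUrl = "http://mydomain.com/jsonData";
const relative = schemaUrlFromReference(dataUrl, "mySchema");
const absolute = schemaUrlFromReference(dataUrl, "http://mydomain.com/mySchema");
const appended = schemaUrlByConvention(dataUrl);
```

Both `relative` and `absolute` come out as http://mydomain.com/mySchema, and `appended` is http://mydomain.com/jsonData.schema, matching the URLs given in the text.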
Tools
A JavaScript implementation of a JSON Schema validator is available here.