XML has a supporting schema-language in the form of the XML Schema. What you may be unaware of is that JSON has one too.
Read on to find out more, and why you might want to use it if you’re not already.
In recent years JSON has taken over from XML as the de facto data-format for integrations, whether using APIs or messaging. This has been driven by better support and ease of use for JavaScript clients, built-in data types, and terser syntax.
When designing and building our APIs my team standardised on using JSON in preference to XML some time back. However, when we first made the switch, there were a few things I missed about using XML on the server-side. The parsers weren’t quite as mature and lacked a standard API (at least on the Java platform), and there was no query language like XPath.
Things have since improved in these areas. Popular libraries like Jackson provide JSON parsers with serialisation support built on top. More recently a standard API for parsing and streaming has been added to Java in the form of JSON-P (JSR-353), which is part of Java EE 7, and JSON-B, a standard API for binding, is currently in the works as part of Java EE 8. JSONPath, a query language for JSON, has also been created, along with ported implementations.
But the other spec I missed when switching to JSON was the schema language, XML Schema. We’ve never relied on XML Schema to implement APIs (preferring to code the validation rather than apply a schema-validating parser, for simplicity and greater flexibility), but it did provide a way to improve the specification of data-formats, as described in more detail below.
Specification By Example
Specifying example (HTTP or message) payloads is a simple and quick way to communicate a data-format. This is what we do today, when spec’ing our APIs, e.g.
{
  "feedbacks": [
    {
      "rating": 3,
      "message": "Would’ve liked more on the topic of Docker.",
      "authored": "2016-06-09T13:43:57Z"
    }
  ],
  "received": "2016-06-09T13:44:02.123Z"
}
However, this approach has its limitations –
1) It’s ambiguous – It doesn’t answer questions like:
- Is this property mandatory or optional?
- What other constraints apply to this property, such as length or a set of legal values (enum)?
2) It’s incomplete – It’s sometimes practically infeasible, or at least very time-consuming, to try and specify every possible combination of a data-format through examples.
3) It’s not self-documenting – JSON’s lack of support for comments means there is no built-in way of describing the purpose of JSON objects and fields (without resorting to using your own additional fields).
So, specification-by-example, although very useful, is not on its own a sufficient way of documenting data-formats. These limitations can be addressed by additionally using a schema.
XML has a supporting schema-language in the form of the XML Schema. What you may be unaware of is that JSON has one too…
Introducing JSON Schema
JSON Schema specifies a means of describing a JSON data-format and the constraints that apply to it, by using another JSON document. It addresses all the previously cited limitations of specification-by-example, namely:
1) It supports describing the purpose of JSON objects and component fields (using the schema’s title and description fields).
2) It supports stating the names of fields and the constraints that apply to them, including, e.g.
- Their expected (JSON) types – object, array, string, integer, etc.
- Whether the field is mandatory (‘required’)
- For fields of type integer, the minimum and maximum values.
- For fields of type string, whether they have a minimum and/or maximum length; and also enumerations (sets) of supported values.
3) It supports specifying the valid pattern of a field of type string using either a regular expression, or one of the predefined formats which are built into JSON Schema, such as ‘date-time’ for an ISO-8601 format date/time string.
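To make these keywords concrete, here is a minimal sketch in Python (standard library only) of how a handful of them could be checked by hand. It is not a conforming JSON Schema validator: it covers only `type`, `required`, `properties`, `enum`, `minimum`/`maximum` and `minLength`/`maxLength`, and the `check` function, the `channel` field and the specific limits are hypothetical names and values chosen purely for illustration.

```python
def check(value, schema, path="$"):
    """Return a list of violations for a small subset of JSON Schema keywords."""
    type_checks = {
        "object": lambda v: isinstance(v, dict),
        "array": lambda v: isinstance(v, list),
        "string": lambda v: isinstance(v, str),
        # bool is a subclass of int in Python, so exclude it explicitly
        "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    }
    expected = schema.get("type")
    if expected in type_checks and not type_checks[expected](value):
        return [f"{path}: expected {expected}"]

    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: not one of {schema['enum']}")
    if isinstance(value, int) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    if isinstance(value, str):
        if "minLength" in schema and len(value) < schema["minLength"]:
            errors.append(f"{path}: shorter than minLength {schema['minLength']}")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            errors.append(f"{path}: longer than maxLength {schema['maxLength']}")
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}: missing required property '{name}'")
        # Only properties that are actually present are checked further
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                errors.extend(check(value[name], sub_schema, f"{path}.{name}"))
    return errors


# Hypothetical schema: 'channel' and the numeric limits are illustrative only
schema = {
    "type": "object",
    "properties": {
        "rating": {"type": "integer", "minimum": 1, "maximum": 5},
        "message": {"type": "string", "minLength": 1, "maxLength": 500},
        "channel": {"type": "string", "enum": ["web", "mobile"]},
    },
    "required": ["rating"],
}

for payload in ({"rating": 3, "message": "ok"}, {"rating": 9, "channel": "fax"}):
    print(payload, "->", check(payload, schema) or "valid")
```

In practice you would reach for one of the existing validator libraries rather than writing this yourself; the sketch only shows that each keyword corresponds to a simple, mechanical check.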
Example
Suppose you need to specify a JSON data-format to support the exchange of user activity containing one or more items of feedback. An example JSON payload, repeated from above, is as follows –
{
  "feedbacks": [
    {
      "rating": 3,
      "message": "Would’ve liked more on the topic of Docker.",
      "authored": "2016-06-09T13:43:57Z"
    }
  ],
  "received": "2016-06-09T13:44:02.123Z"
}
This specification by example is ambiguous, leaving a lot of unanswered questions, for the reasons previously described. The following JSON schema removes the ambiguity by providing both descriptive metadata and applying field constraints –
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "User viewing activity",
  "description": "Viewing activity generated by a user.",
  "type": "object",
  "properties": {
    "feedbacks": {
      "description": "One or more items of feedback.",
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "rating": {
            "enum": [1, 2, 3, 4, 5]
          },
          "message": {
            "type": "string"
          },
          "authored": {
            "description": "The date/time the feedback was authored. ISO-8601 format datetime string, to second resolution, with 'Z' time zone designator.",
            "type": "string",
            "format": "date-time",
            "pattern": "^\\d{4}\\-\\d{2}\\-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z$"
          }
        },
        "required": ["rating"]
      },
      "minItems": 1
    },
    "received": {
      "description": "The date/time the activity was received. ISO-8601 format datetime string, to millisecond resolution, with 'Z' time zone designator.",
      "type": "string",
      "format": "date-time",
      "pattern": "^\\d{4}\\-\\d{2}\\-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.[0-9]{3}Z$"
    }
  },
  "required": ["received"]
}
To validate the schema, and to check that the above (or any other) JSON payload conforms to it, use a JSON Schema validator such as this online validator.
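As a quick sanity check of the two `pattern` constraints alone, the following Python sketch decodes them from JSON text and tries them against the timestamps in the example payload. It is only an approximation: JSON Schema patterns are defined in terms of ECMA 262 regular expressions, whereas this uses Python’s `re` module, and the `matches` helper and `patterns` name are hypothetical.

```python
import json
import re

# The two "pattern" values exactly as written in the schema's JSON text.
# json.loads turns each escaped "\\d" into the regex token \d.
patterns = json.loads(r'''
{
  "authored": "^\\d{4}\\-\\d{2}\\-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z$",
  "received": "^\\d{4}\\-\\d{2}\\-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.[0-9]{3}Z$"
}
''')


def matches(field, text):
    """True if text satisfies the pattern declared for the given field."""
    # JSON Schema patterns are not implicitly anchored, hence re.search;
    # the patterns above carry their own ^ and $ anchors.
    return re.search(patterns[field], text) is not None


print(matches("authored", "2016-06-09T13:43:57Z"))      # second resolution
print(matches("received", "2016-06-09T13:44:02.123Z"))  # millisecond resolution
print(matches("authored", "2016-06-09T13:44:02.123Z"))  # milliseconds not allowed here
```

Note how the doubled backslashes are a JSON string-escaping requirement, not part of the regular expression itself.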
You can find other simple and more advanced examples of how to write JSON schema in the examples section of the json-schema.org website.
Tool Support
In addition, because the schema follows strict rules, it is also machine readable, so tools can be used in conjunction with the schema to automate certain tasks. For example –
Validation – Support for checking whether a piece of JSON conforms to a given schema. As mentioned above, this is not something I’d use in production APIs, but it can be useful when developing or writing automated tests. For example, this online JSON Schema validator is a personal favourite, which I’ve been using when writing new schemas and doing ad hoc checks of the conformance of example JSON payloads.
Data generation – Support for automating the generation of valid data that conforms to a schema, which is again useful for automating testing.
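To illustrate the idea behind data generation, here is a rough Python sketch that walks a schema and produces a random conforming value. It is not one of the tools referred to above and supports only a small subset of keywords (`type`, `enum`, `properties`, `required`, `items`, `minItems`, `minimum`/`maximum`, `minLength`/`maxLength`). The `generate` function is hypothetical, and the schema is a trimmed version of the example: `pattern` and `format` are left out, a `maxLength` is added, and `feedbacks` is made required, all as assumptions to keep the demo short.

```python
import random
import string


def generate(schema, rng):
    """Build one value satisfying a small subset of JSON Schema keywords."""
    if "enum" in schema:
        return rng.choice(schema["enum"])
    kind = schema.get("type")
    if kind == "object":
        required = set(schema.get("required", []))
        result = {}
        for name, sub_schema in schema.get("properties", {}).items():
            # Always emit required properties; include optional ones at random
            if name in required or rng.random() < 0.5:
                result[name] = generate(sub_schema, rng)
        return result
    if kind == "array":
        low = schema.get("minItems", 0)
        count = rng.randint(low, low + 2)  # arbitrary upper bound for the demo
        return [generate(schema["items"], rng) for _ in range(count)]
    if kind == "integer":
        high = schema.get("maximum")
        low = schema.get("minimum", 0 if high is None else min(0, high))
        return rng.randint(low, low + 100 if high is None else high)
    if kind == "string":
        low = schema.get("minLength", 0)
        length = rng.randint(low, schema.get("maxLength", low + 12))
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    raise ValueError(f"unsupported schema: {schema}")


# Trimmed, partly hypothetical variant of the feedback schema
schema = {
    "type": "object",
    "properties": {
        "feedbacks": {
            "type": "array",
            "minItems": 1,
            "items": {
                "type": "object",
                "properties": {
                    "rating": {"enum": [1, 2, 3, 4, 5]},
                    "message": {"type": "string", "maxLength": 20},
                },
                "required": ["rating"],
            },
        },
    },
    "required": ["feedbacks"],
}

# A seeded generator keeps test data reproducible between runs
print(generate(schema, random.Random(42)))
```

Generating strings that satisfy a `pattern` is considerably harder, which is one reason to prefer an existing tool for anything beyond simple cases.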
For a full list of tools in these and other categories see the software page of the json-schema.org website.
Specification History and Status
The JSON Schema spec has actually been around for some time. A first draft was produced around December 2009. Since then the specification has evolved and been extended. The latest version, still referred to as a draft, is now at v4. It’s now split into three component specs: Core (the basics), Validation (additional support for validation constraints) and Hyper-Schema (additional support for hypermedia).
The JSON Schema spec has been submitted to the IETF as a proposed draft standard, which may bode well for its uptake and stability. One concern is that the spec has been languishing in this draft status with the IETF for 3 years, although a recent (04/2016) post and thread on the project’s discussion forum suggest there is still motivation for getting it approved.
See the “docs” section of the json-schema.org website for the latest story.
Summary and Next Steps
JSON Schema provides a draft standard for a schema language for JSON that allows you to precisely and unambiguously describe JSON data-formats.
My first experience of JSON Schema and the available 3rd party tools has been positive.
I now intend to use JSON schema, in addition to specification-by-example, to fully document every new JSON data-format I specify, whether it be for an API resource or a message payload.
Useful Resources
json-schema.org – “The home of JSON Schema”. Contains the latest specs, and examples/tutorials, along with links to 3rd party tools.
- JSON Schema: core definitions and terminology – The latest version of the core component spec.
- JSON Schema: interactive and non-interactive validation – The latest version of the component spec. which adds support for defining validation constraints.
Understanding JSON Schema – Another overview providing a simpler introduction than the specs, and some additional examples.