Shineyrock web design & consultancy


    Validating Data With JSON-Schema, Part 1

    When you’re dealing with complex, structured data, you need a way to determine whether that data is valid. JSON-Schema is a standard for JSON documents that describe the structure and requirements of your JSON data. In this two-part series, you’ll learn how to use JSON-Schema to validate data.

    Let’s say you have a database of users where each record looks similar to this example:

    The question we are going to deal with is how to determine whether a record like the one above is valid or not.

    Examples are very useful but not sufficient when describing your data requirements. JSON-Schema comes to the rescue. This is one of the possible schemas describing a user record:

    Have a look at the schema above and the user record it describes (which is valid according to this schema). There is a lot of explaining to do here.

    JavaScript code to validate the user record against the schema could be:

    or for better performance:

```javascript
var validate = ajv.compile(userSchema);
var valid = validate(userData);
if (!valid) console.log(validate.errors);
```

    All the code samples are available in the GitHub repo tutsplus-json-schema. You can also try it in the browser.

    Ajv, the validator used in the example, is the fastest JSON-Schema validator for JavaScript. I created it, so I am going to use it in this tutorial.

    Before we continue, let’s quickly deal with all the whys.

    Why Validate Data as a Separate Step?

    • to fail fast
    • to avoid data corruption
    • to simplify processing code
    • to use validation code in tests

    Why JSON (and not XML)?

    • as wide adoption as XML
    • easier to process and more concise than XML
    • dominates web development because of JavaScript

    Why Use Schemas?

    • declarative
    • easier to maintain
    • can be understood by non-coders
    • no need to write code, third party open-source libraries can be used

    Why JSON-Schema?

    • the widest adoption among all standards for JSON validation
    • very mature (current version is 4, there are proposals for version 5)
    • covers a big part of validation scenarios
    • uses easy-to-parse JSON documents for schemas
    • platform independent
    • easily extensible
    • 30+ validators for different languages, including 10+ for JavaScript, so no need to code it yourself

    Tasks

    This tutorial includes several relatively simple tasks to help you better understand JSON-Schema and how it can be used. There are simple JavaScript scripts to check that you’ve done them correctly. To run them you will need to install Node.js (no prior experience with it is needed). Just install nvm (Node Version Manager) and a recent Node.js version:

    You also need to clone the repo and run npm install (it will install the Ajv validator).

    Let’s Dive Into the Schemas!

    A JSON-schema is always an object, and its properties are called “keywords”. Some of them describe the rules for the data (e.g., “type” and “properties”), and some describe the schema itself (“$schema”, “id”, “title”, “description”); we will get to them later.

    The data is valid according to the schema if it is valid according to all keywords in this schema—that’s really simple.
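    As a rough illustration of this rule (not how Ajv or any real validator is implemented), the sketch below treats each keyword as an independent check and accepts the data only if every check passes. The names keywordChecks and isValid are hypothetical, and only two keywords are handled.

```javascript
// Toy checks for two keywords; a real validator supports many more.
const keywordChecks = {
  // "type": only the "integer" type is handled in this sketch.
  type: (expected, data) => expected === 'integer' && Number.isInteger(data),
  // "minimum": applies to numbers only, so other data passes this check.
  minimum: (min, data) => typeof data !== 'number' || data >= min
};

// Data is valid only if it passes the check for every keyword in the schema.
function isValid(schema, data) {
  return Object.keys(schema).every((keyword) => {
    const check = keywordChecks[keyword];
    // Keywords this toy does not know about are ignored.
    return check ? check(schema[keyword], data) : true;
  });
}

const ageSchema = { type: 'integer', minimum: 13 };
console.log(isValid(ageSchema, 13));   // true: both keywords pass
console.log(isValid(ageSchema, 12));   // false: "minimum" fails
console.log(isValid(ageSchema, '13')); // false: "type" fails
```

    Adding a keyword to a schema can therefore only make it stricter, never more permissive.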

    Data Properties

    Because most JSON data consists of objects with multiple properties, the keyword “properties” is probably the most commonly used keyword. It only applies to objects (see the next section about what “apply” means).

    You might have noticed in the example above that each property inside the “properties” keyword describes the corresponding property in your data.

    The value of each property is itself a JSON-schema—JSON-schema is a recursive standard. Each property in the data should be valid according to the corresponding schema in the “properties” keyword.

    The important thing here is that the “properties” keyword doesn’t make any property required; it only defines schemas for the properties that are present in the data.

    For example, if our schema is:

    then objects with or without property “foo” can be valid according to this schema:

    and only objects that have a property “foo” that is not a string are invalid:

    Try this example in the browser.
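    The behaviour of “properties” can be sketched in plain JavaScript. The schema below is an assumption based on the description above (a property “foo” that must be a string when present), and validProperties is a hypothetical helper that only understands a string “type” in the sub-schemas.

```javascript
// Assumed example schema: "foo", if present, must be a string.
const fooSchema = { properties: { foo: { type: 'string' } } };

// Toy check for the "properties" keyword only.
function validProperties(schema, data) {
  // "properties" applies to objects only; any other data passes.
  if (typeof data !== 'object' || data === null || Array.isArray(data)) return true;
  return Object.keys(schema.properties).every((name) => {
    // A property that is absent from the data is not checked at all.
    if (!Object.prototype.hasOwnProperty.call(data, name)) return true;
    // Simplification: the sub-schema is assumed to be { type: 'string' }.
    return typeof data[name] === schema.properties[name].type;
  });
}

console.log(validProperties(fooSchema, { foo: 'bar' })); // true
console.log(validProperties(fooSchema, {}));             // true: "foo" is not required
console.log(validProperties(fooSchema, { bar: 1 }));     // true: "bar" is not described
console.log(validProperties(fooSchema, { foo: 1 }));     // false: "foo" is not a string
```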

    Data Type

    You’ve already figured out what the keyword “type” does. It is probably the most important keyword. Its value (a string or array of strings) defines what type (or types) the data must be to be valid.

    As you can see in the example above, the user data must be an object.

    Most keywords apply to certain data types—for example, the keyword “properties” only applies to objects, and the keyword “pattern” only applies to strings.

    What does “apply” mean? Let’s say we have a really simple schema:

    You might expect that, to be valid according to this schema, the data must be a string matching the pattern:

    But the JSON-schema standard specifies that if a keyword doesn’t apply to the data type, then the data is valid according to this keyword. That means that any data that is not of type “string” is valid according to the schema above: numbers, arrays, objects, booleans, and even null. If you want only strings matching the pattern to be valid, your schema should be:
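    To make the difference concrete, here is a minimal sketch with a toy validator that understands only “type”: “string” and “pattern”. The pattern ^[a-z]+$ and the function name isValid are illustrative assumptions, not taken from the original example.

```javascript
// Toy validator supporting only "type": "string" and "pattern".
function isValid(schema, data) {
  if (schema.type === 'string' && typeof data !== 'string') return false;
  // "pattern" is checked only when the data is a string.
  if (schema.pattern !== undefined && typeof data === 'string') {
    if (!new RegExp(schema.pattern).test(data)) return false;
  }
  return true;
}

const patternOnly = { pattern: '^[a-z]+$' }; // hypothetical pattern
const stringWithPattern = { type: 'string', pattern: '^[a-z]+$' };

console.log(isValid(patternOnly, 'abc'));       // true
console.log(isValid(patternOnly, 'ABC'));       // false: a string that does not match
console.log(isValid(patternOnly, 42));          // true: "pattern" does not apply to numbers
console.log(isValid(patternOnly, null));        // true
console.log(isValid(stringWithPattern, 42));    // false: "type" rejects non-strings
console.log(isValid(stringWithPattern, 'abc')); // true
```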

    Because of this, you can make very flexible schemas that will validate multiple data types.

    Look at the property “id” in the user example. It should be valid according to this schema:

    This schema requires the data to be either a “string” or an “integer”. There is also the keyword “pattern” that applies only to strings; it requires that the string consists of digits only and does not start with 0. There is the keyword “minimum” that applies only to numbers; it requires that the number is at least 1.

    Another, more verbose, way to express the same requirement is:

    But because of the way JSON-schema is defined, this schema is equivalent to the first one, which is shorter and faster to validate in most validators.
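    The equivalence can be checked with a small experiment. Both schemas below are reconstructed from the description in the text (a string of digits not starting with 0, or an integer of at least 1), so they are assumptions rather than the original listings, and isValid is a toy validator that knows only the four keywords involved.

```javascript
// Schemas reconstructed from the prose description (assumed, not the original listings).
const compact = { type: ['string', 'integer'], pattern: '^[1-9][0-9]*$', minimum: 1 };
const verbose = {
  anyOf: [
    { type: 'string', pattern: '^[1-9][0-9]*$' },
    { type: 'integer', minimum: 1 }
  ]
};

// Toy type test covering only the two types used here.
function hasType(type, data) {
  return type === 'string' ? typeof data === 'string'
    : type === 'integer' ? Number.isInteger(data)
    : false;
}

// Toy validator for "type", "pattern", "minimum" and "anyOf" only.
function isValid(schema, data) {
  if (schema.type !== undefined) {
    const types = [].concat(schema.type); // accept a string or an array of strings
    if (!types.some((t) => hasType(t, data))) return false;
  }
  if (schema.pattern !== undefined && typeof data === 'string' &&
      !new RegExp(schema.pattern).test(data)) return false;
  if (schema.minimum !== undefined && typeof data === 'number' &&
      data < schema.minimum) return false;
  if (schema.anyOf !== undefined &&
      !schema.anyOf.some((sub) => isValid(sub, data))) return false;
  return true;
}

// Print each sample with the verdict of both schemas; the two columns agree.
for (const sample of ['42', 42, '042', 0, 1.5, null, 'abc']) {
  console.log(JSON.stringify(sample), isValid(compact, sample), isValid(verbose, sample));
}
```

    The compact form works because “pattern” is skipped for integers and “minimum” is skipped for strings.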

    Data types you can use in schemas are “object”, “array”, “number”, “integer”, “string”, “boolean”, and “null”. Note that “number” includes “integer”—all integers are numbers too.
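    In JavaScript these type names do not map directly onto typeof, because typeof null is "object" and arrays are objects too. A minimal sketch of the mapping, with hasType as a hypothetical helper name:

```javascript
// Returns true if `data` has the given JSON-schema type name.
function hasType(type, data) {
  switch (type) {
    case 'null':    return data === null;
    case 'boolean': return typeof data === 'boolean';
    case 'string':  return typeof data === 'string';
    case 'number':  return typeof data === 'number' && Number.isFinite(data);
    case 'integer': return Number.isInteger(data);
    case 'array':   return Array.isArray(data);
    case 'object':  return typeof data === 'object' && data !== null && !Array.isArray(data);
    default:        return false; // unknown type name
  }
}

console.log(hasType('integer', 5));   // true
console.log(hasType('number', 5));    // true: every integer is also a number
console.log(hasType('integer', 5.5)); // false
console.log(hasType('object', null)); // false
console.log(hasType('object', []));   // false: arrays have their own type
```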

    Numbers Validation

    There are several keywords to validate numbers. All the keywords in this section apply to numbers only (including integers).

    “minimum” and “maximum” are self-explanatory. In addition to them, there are the keywords “exclusiveMinimum” and “exclusiveMaximum”. In our user example, the user age is required to be an integer that is 13 or greater. If the schema for the user age were:

    then this schema would require the user age to be strictly greater than 13, i.e. the lowest allowed age would be 14.
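    A minimal sketch of the difference, assuming the version 4 form of the standard in which “exclusiveMinimum” is a boolean that modifies “minimum” (later drafts of the standard changed it to a number). The two schemas and the helper passesMinimum are illustrative assumptions.

```javascript
// Toy check for "minimum" with the version 4 boolean "exclusiveMinimum".
function passesMinimum(schema, data) {
  // These keywords apply to numbers only; without "minimum" there is nothing to check.
  if (typeof data !== 'number' || schema.minimum === undefined) return true;
  return schema.exclusiveMinimum === true
    ? data > schema.minimum
    : data >= schema.minimum;
}

const inclusiveAge = { type: 'integer', minimum: 13 };
const exclusiveAge = { type: 'integer', minimum: 13, exclusiveMinimum: true };

console.log(passesMinimum(inclusiveAge, 13)); // true
console.log(passesMinimum(exclusiveAge, 13)); // false
console.log(passesMinimum(exclusiveAge, 14)); // true
```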

    Another keyword to validate numbers is “multipleOf”. Its name also explains what it does, and you can check out the JSON-schema keywords reference to see how it works.
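    For a quick intuition, “multipleOf” can be approximated as a division test. This is a simplified sketch with a hypothetical helper name, not the way any particular validator implements the keyword.

```javascript
// Toy check for "multipleOf": the data divided by the divisor must be an integer.
function passesMultipleOf(divisor, data) {
  // The keyword applies to numbers only; other data passes.
  if (typeof data !== 'number') return true;
  return Number.isInteger(data / divisor);
}

console.log(passesMultipleOf(5, 15));    // true
console.log(passesMultipleOf(5, 12));    // false
console.log(passesMultipleOf(0.5, 2.5)); // true
console.log(passesMultipleOf(5, '15'));  // true: a string, so the keyword does not apply
```

    Floating-point rounding can make this naive test unreliable for decimal divisors such as 0.1, so it is best read as an explanation of the rule rather than production code.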

    Strings Validation

    There are also several keywords to validate strings. All the keywords in this section apply to strings only.

    “maxLength” and “minLength” require that the string is no longer or no shorter than the given number of characters. The JSON-schema standard requires that a Unicode surrogate pair, e.g. an emoji character, is counted as a single character. JavaScript counts it as two characters when you access the .length property of strings.

    Some validators determine string lengths as required by the standard, and some do it the JavaScript way, which is faster. Ajv allows you to specify how to determine string lengths, and the default is to comply with the standard.
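    The two ways of counting can be compared directly in JavaScript. The function names below are hypothetical; the sketch relies only on the .length property and on string iteration, which walks a string by code point.

```javascript
// Length in UTF-16 code units: what the .length property reports.
function lengthInCodeUnits(str) {
  return str.length;
}

// Length in Unicode code points: string iteration keeps surrogate pairs together.
function lengthInCodePoints(str) {
  return [...str].length;
}

const smiley = '\u{1F600}'; // one emoji, stored as a surrogate pair
console.log(lengthInCodeUnits(smiley));  // 2
console.log(lengthInCodePoints(smiley)); // 1
console.log(lengthInCodePoints('abc'));  // 3
```

    With a “maxLength” of 1, this emoji would be accepted when counting code points and rejected when counting code units.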
