Validation rules¶

PyKwalify supports all rules implemented by the original kwalify and includes many more that extend the specification.

type¶

A type specifies what rules and constraints should be applied to this node in the data structure.

The following types are available:

any

Will always be true no matter what the value is, even unimplemented types

bool

Only True/False validates. Integers or strings like 0 or 1, "True" or "False" do not validate for bool

date

A string or datetime.date object that follows a date format

float

Any object that is a float type, or object that python can interpret as a float with the following python code float(obj). Scientific notation is supported for this type, for example 1e-06.

int

Validates only for integers and not floats

mapping or map

Validates only for dict objects

none

Validates only for None values

number

Validates if value is int or float

scalar

Validates for all but seq or map. None values will also fail validation.

sequence or seq

Validates for lists

str

Validates if value is a python string object

text

Validates if value is str or number

time

Not yet implemented [NYI]

timestamp

Validates for basic timestamp formats

Example

# Schema
type: str

# Data
'Foobar'
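The float entry above refers to the Python expression float(obj). As a rough illustration of that clause only, here is a minimal sketch; parses_as_float is a hypothetical helper written for this page, not PyKwalify's own check, which may apply additional type restrictions.

```python
def parses_as_float(obj):
    """Hypothetical helper: True when Python's float(obj) succeeds."""
    try:
        float(obj)
    except (TypeError, ValueError):
        # float() raises TypeError for e.g. None and ValueError for bad strings
        return False
    return True


print(parses_as_float(3.14))     # True
print(parses_as_float("1e-06"))  # True: scientific notation is accepted
print(parses_as_float("abc"))    # False
print(parses_as_float(None))     # False
```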

Mapping¶

A mapping validates against the dict data structure.

Aliases

mapping

map

The map type is implicitly assumed when mapping or its alias map is present in the rule.

# Schema
type: map
mapping:
  key_one:
    type: str

# Data
key_one: 'bar'

The schema below sets the mapping type implicitly and is also a valid schema.

# Schema
map:
  key_one:
    type: str

There are some constraints which are available only for the map type, and expand its functionality. See the allowempty, regex;(regex-pattern) and matching-rule sections below for details.

By default, map keys specified in the map rule can be omitted unless they have the required constraint explicitly set to True.

Sequence¶

A sequence (list) of values, each of which is validated against the given type.

The sequence type is implicitly assumed when sequence or its alias seq is present in the rule.

Aliases

sequence

seq

Example

# Schema
type: seq
sequence:
  - type: str

# Data
- 'Foobar'
- 'Barfoo'

The schema below sets the sequence type implicitly and is also a valid schema.

# Schema
seq:
  - type: str

Multiple list entries are supported to enable validation of different types of data inside the sequence.

Note

The original kwalify specification only allowed one entry in the list. This has been extended in PyKwalify to give more flexibility when validating.

Example

# Schema
type: seq
sequence:
  - type: str
  - type: int

# Data
- 'Foobar'
- 123456

Will be valid.

Matching¶

Multiple subrules can be used within the sequence block. Sequences can also be nested to any depth, with subrules constraining list items to be sequences of sequences.

The matching constraint can be used when the type is sequence to control how the parser handles a list of different subrules for the sequence block.

any
- Each list item must satisfy at least one subrule
all
- Each list item must satisfy every subrule
*
- At least one list item must satisfy at least one subrule
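The three matching values listed above can be read as combinations of Python's any() and all(). The sketch below models each subrule as a plain predicate function; sequence_ok, is_str and is_int_seq are hypothetical names used for illustration and are not part of PyKwalify.

```python
def is_str(item):
    # Stand-in for the subrule "type: str"
    return isinstance(item, str)


def is_int_seq(item):
    # Stand-in for the subrule "type: seq" containing "type: int"
    return isinstance(item, list) and all(type(i) is int for i in item)


def sequence_ok(items, subrules, matching="any"):
    # One row per list item, one boolean per subrule
    hits = [[rule(item) for rule in subrules] for item in items]
    if matching == "any":
        return all(any(row) for row in hits)
    if matching == "all":
        return all(all(row) for row in hits)
    if matching == "*":
        return any(any(row) for row in hits)
    raise ValueError("unknown matching value: %r" % matching)


rules = [is_str, is_int_seq]
data = [[123], "foobar"]

print(sequence_ok(data, rules, "any"))          # True
print(sequence_ok(data, rules, "all"))          # False
print(sequence_ok(data + [4.5], rules, "any"))  # False: 4.5 fits no subrule
print(sequence_ok(data + [4.5], rules, "*"))    # True
```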

Example

# Schema
type: seq
matching: "any"
sequence:
  - type: str
  - type: seq
    sequence:
      - type: int

# Data
- - 123
- "foobar"

Timestamp¶

Parse a string or integer to determine if it is a valid unix timestamp.

Integer timestamps must lie between 1 and 2147483647; the upper limit itself is accepted, as the example below shows.

Parsing is done with python-dateutil. You can see all valid formats in the relevant dateutil documentation.

Example

# Schema
type: map
mapping:
  d1:
    type: timestamp
  d2:
    type: timestamp

# Data
d1: "2015-03-29T18:45:00+00:00"
d2: 2147483647

All datetime objects will validate as a valid timestamp.

PyYaml can sometimes automatically convert data to datetime objects.
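To put the integer limits quoted above in context, the following sketch converts them to UTC datetimes using only the standard library. It does not reproduce PyKwalify's validation and does not use python-dateutil; the names LOWER and UPPER are chosen here for illustration.

```python
from datetime import datetime, timezone

LOWER = 1           # lower limit quoted in the text
UPPER = 2147483647  # upper limit quoted in the text

# The upper limit is the largest signed 32-bit integer.
print(UPPER == 2**31 - 1)  # True

# Interpret both limits as seconds since the Unix epoch, in UTC.
print(datetime.fromtimestamp(LOWER, tz=timezone.utc).isoformat())
# 1970-01-01T00:00:01+00:00
print(datetime.fromtimestamp(UPPER, tz=timezone.utc).isoformat())
# 2038-01-19T03:14:07+00:00
```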

Date¶

Parse a string or datetime object to determine if it is a valid date. Date has multiple valid formats based on what standard you are using.

For example, 2016-12-31 and 31-12-16 are both valid formats.

If you want to parse a custom format, you can use the format keyword to specify a valid datetime parsing syntax. The valid syntax is described in the Python documentation: python-strptime

Example:

# Schema
type: date

# Data
"2015-12-31"

Format¶

Only valid when using the date or datetime type. It lets you define custom datetime formats if the default formats are not enough.

Define the value as a string, or as a list with formats as values, using the builtin Python datetime string formatting language. The syntax is described in the Python documentation: python-strptime
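As a rough illustration of what a single format string or a list of format strings means in strptime terms, here is a minimal standard-library sketch. matches_any_format is a hypothetical helper, and accepting a value when at least one listed format parses is an assumption about how a list is interpreted, not PyKwalify's confirmed behaviour.

```python
from datetime import datetime


def matches_any_format(value, formats):
    """Hypothetical helper: True if value parses with at least one format."""
    if isinstance(formats, str):
        formats = [formats]  # accept a single string or a list of strings
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue  # try the next format
    return False


print(matches_any_format("2015-12-31", "%Y-%m-%d"))                # True
print(matches_any_format("31-12-15", ["%Y-%m-%d", "%d-%m-%y"]))    # True
print(matches_any_format("2015/12/31", "%Y-%m-%d"))                # False
```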

# Schema
type: date
format: "%Y-%m-%d"

# Data
"2015-12-31"

Required¶

If the required constraint is set to True, the key and its value must be present, otherwise a validation error will be raised.

Default is False.

Aliases

required

req

Example

# Schema
type: map
mapping:
  key_one:
    type: str
    required: True

# Data
key_one: foobar

Enum¶

Set of possible elements, the value must be a member of this set.

Object in enum must be a list of items.

Currently only exact case matching is implemented. If you need complex validation you should use pattern.

Example

# Schema
type: map
mapping:
  blood:
    type: str
    enum: ['A', 'B', 'O', 'AB']

# Data
blood: AB
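The exact-case behaviour described above corresponds to plain list membership in Python. The sketch below shows that, plus one possible case-insensitive alternative using a regular expression, in the spirit of the advice to use pattern for more complex checks. The helper in_enum and the regex are illustrative choices, not taken from PyKwalify.

```python
import re

allowed = ['A', 'B', 'O', 'AB']


def in_enum(value, allowed_values):
    # Exact, case-sensitive membership test
    return value in allowed_values


print(in_enum('AB', allowed))  # True
print(in_enum('ab', allowed))  # False: case differs

# A case-insensitive check expressed as a regular expression instead
print(bool(re.match(r'(?i)(A|B|O|AB)$', 'ab')))  # True
```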

Pattern¶

Specifies a regular expression pattern which the value must satisfy.

Uses re.match internally. Pattern works for all scalar types.

For using regex to define possible key names in mapping, see regex;(regex-pattern) instead.

Example

# Schema
type: map
mapping:
  email:
    type: str
    pattern: .+@.+

# Data
email: foo@mail.com
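Because re.match anchors only at the start of the string, a pattern can succeed on a prefix of the value. The short sketch below calls Python's re module directly to show the effect; it illustrates re.match itself rather than PyKwalify's surrounding logic.

```python
import re

pattern = r".+@.+"
print(bool(re.match(pattern, "foo@mail.com")))  # True
print(bool(re.match(pattern, "@mail.com")))     # False: nothing before the @

# re.match anchors at the start only, so trailing text is ignored...
print(bool(re.match(r"\d{3}", "123abc")))   # True
# ...unless the pattern itself ends with an anchor.
print(bool(re.match(r"\d{3}$", "123abc")))  # False
# Text before the match is never skipped.
print(bool(re.match(r"\d{3}", "abc123")))   # False
```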

Range¶

Limits the value to a range defined by a lower bound (min or min-ex) and an upper bound (max or max-ex).

For numeric types (int, float and number), the value must be within the specified range, and for non-numeric types (map, seq and str) the length of the dict/list/string as given by len() must be within the range.

For the data value (or length) x, with lower bound a and upper bound b, the range can be specified to test for the following:

min provides an inclusive lower bound, a <= x
max provides an inclusive upper bound, x <= b
min-ex provides an exclusive lower bound, a < x
max-ex provides an exclusive upper bound, x < b

Non-numeric types require non-negative values for the boundaries, since length can not be negative.

Types bool and any are not compatible with range.

Example

# Schema
type: map
mapping:
  password:
    type: str
    range:
      min: 8
      max: 16
  age:
    type: int
    range:
      min: 18
      max-ex: 30

# Data
password: foobar123
age: 25
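The four bounds can be sketched as simple comparisons on either the value itself or its length. in_range is a hypothetical helper written for illustration; it ignores the bool and any restrictions mentioned above and is not PyKwalify's implementation.

```python
def in_range(value, bounds):
    """Hypothetical helper applying min / max / min-ex / max-ex."""
    # Numbers are compared directly, everything else by its length.
    x = value if isinstance(value, (int, float)) else len(value)
    checks = {
        "min": lambda a: a <= x,     # inclusive lower bound
        "max": lambda b: x <= b,     # inclusive upper bound
        "min-ex": lambda a: a < x,   # exclusive lower bound
        "max-ex": lambda b: x < b,   # exclusive upper bound
    }
    return all(checks[name](limit) for name, limit in bounds.items())


print(in_range("foobar123", {"min": 8, "max": 16}))  # True: length is 9
print(in_range("short", {"min": 8, "max": 16}))      # False: length is 5
print(in_range(25, {"min": 18, "max-ex": 30}))       # True
print(in_range(30, {"min": 18, "max-ex": 30}))       # False: 30 is excluded
```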

Unique¶

If unique is set to True, then the sequence cannot contain any repeated entries.

The unique constraint can only be set when the type is seq / sequence. It has no effect when used with map / mapping.

Default is False.

Example

# Schema
type: seq
sequence:
  - type: str
    unique: True

# Data
- users
- foo
- admin
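A uniqueness check of this kind amounts to looking for repeated entries in a list. The sketch below uses a hypothetical find_duplicates helper; it compares with == rather than hashing so that unhashable entries such as dicts also work. This is an illustration, not PyKwalify's own code.

```python
def find_duplicates(items):
    """Hypothetical helper: return entries that occur more than once."""
    seen, duplicates = [], []
    for item in items:
        if item in seen:
            if item not in duplicates:
                duplicates.append(item)
        else:
            seen.append(item)
    return duplicates


print(find_duplicates(["users", "foo", "admin"]))  # []
print(find_duplicates(["users", "foo", "users"]))  # ['users']
print(find_duplicates([{"a": 1}, {"a": 1}]))       # [{'a': 1}]
```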

Allowempty¶

Only applies to mapping.

If True, the map can have keys which are not present in the schema, and these can map to anything.

Any keys which are specified in the schema must have values which conform to their corresponding constraints, if they are present.

Default is False.

Example

# Schema
type: map
mapping:
  datasources:
    type: map
    allowempty: True

# Data
datasources:
  test1: test1.py
  test2: test2.py
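The effect of allowempty on unknown keys can be sketched as a set comparison between the keys in the data and the keys named in the schema. unknown_keys and map_keys_ok are hypothetical helpers; regex keys and the validation of the values themselves are deliberately left out of this simplified model.

```python
def unknown_keys(data, schema_keys):
    # Keys present in the data but not named in the schema
    return sorted(key for key in data if key not in schema_keys)


def map_keys_ok(data, schema_keys, allowempty=False):
    # Unknown keys are tolerated only when allowempty is set
    return allowempty or not unknown_keys(data, schema_keys)


datasources = {"test1": "test1.py", "test2": "test2.py"}

print(unknown_keys(datasources, []))                  # ['test1', 'test2']
print(map_keys_ok(datasources, []))                   # False
print(map_keys_ok(datasources, [], allowempty=True))  # True
```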

Regex;(regex-pattern)¶

Only applies to mapping type.

Aliases

re;(regex-pattern)

This is only implemented for mappings: a key inside the mapping keyword can be written in the regex;(regex-pattern) form, and all keys in the data will then be matched against that pattern.

Please note that the regex should be wrapped with ( ) and these parentheses will be removed at runtime.

If a match is found, the value is validated against the subrules on that key. A single key can be matched against multiple regex rules and the normal map rules.

When defining a regex key, matching-rule should also be set to configure the behaviour when using multiple regexes.

Example

# Schema
type: map
matching-rule: 'any'
mapping:
  regex;(mi.+):
    type: seq
    sequence:
      - type: str
  regex;(me.+):
    type: number

# Data
mic:
  - foo
  - bar
media: 1
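The handling of a regex key can be sketched in two steps: strip the regex; (or re;) prefix and the wrapping parentheses, then test each data key against what remains. regex_from_key is a hypothetical helper, and the use of re.search (matching anywhere in the key) is an assumption made here, inferred from end-anchored patterns such as [1-2]$ used elsewhere on this page.

```python
import re


def regex_from_key(schema_key):
    """Hypothetical helper: extract the pattern from 'regex;(...)'."""
    prefix, _, wrapped = schema_key.partition(";")
    if prefix not in ("regex", "re"):
        return None  # an ordinary map key
    # Remove the wrapping parentheses around the pattern
    if wrapped.startswith("(") and wrapped.endswith(")"):
        wrapped = wrapped[1:-1]
    return wrapped


schema_keys = ["regex;(mi.+)", "regex;(me.+)"]
print([regex_from_key(k) for k in schema_keys])  # ['mi.+', 'me.+']

for data_key in ["mic", "media"]:
    # Assumption: re.search is used to test the data key
    matched = [k for k in schema_keys if re.search(regex_from_key(k), data_key)]
    print(data_key, matched)
# mic ['regex;(mi.+)']
# media ['regex;(me.+)']
```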

Matching-rule¶

Only applies to mapping. This enables more fine-grained control over how the matching rule should behave when validating regex keys inside mappings.

Currently supported constraint settings are

any

One or more of the regex must match.

all

All defined regex must match each key.

Default is any.

Example

The following dataset will raise an error because the key bar2 does not match all of the regexes. If the constraint was instead matching-rule: any, the same data would be valid because every key in the data matches at least one of the regexes and its associated constraints in the schema.

# Schema
type: map
matching-rule: all
mapping:
  regex;([1-2]$):
    type: int
  regex;(^foobar):
    type: int

# Data
foobar1: 1
foobar2: 2
bar2: 3
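The difference between the two settings can be sketched by testing every key against both patterns and combining the results with all() or any(). key_matches is a hypothetical helper, and re.search is an assumption here (a start-anchored match could never satisfy [1-2]$ for a key such as foobar1); only key matching is modelled, not the type checks on the values.

```python
import re

patterns = [r"[1-2]$", r"^foobar"]


def key_matches(key, regexes, rule="any"):
    """Hypothetical helper combining per-regex results for one key."""
    hits = [re.search(regex, key) is not None for regex in regexes]
    return all(hits) if rule == "all" else any(hits)


for key in ["foobar1", "foobar2", "bar2"]:
    print(key, key_matches(key, patterns, "all"), key_matches(key, patterns, "any"))
# foobar1 True True
# foobar2 True True
# bar2 False True
```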

Name¶

Name of the schema.

This has no effect on the parsing, but is useful for humans to read.

Example

# Schema
name: foobar schema

Desc¶

Description of schema.

This has no effect on the parsing, but is useful for humans to read. Similar to name.

Value for desc MUST be a string, otherwise a RuleError will be raised upon usage.

Example

# Schema
desc: This schema is very foobar

Example¶

Write an example that shows which values are supported, or just type any comment into the schema for future reference.

It can be used at any level and in any place in the schema. It has no effect on the parsing, but is useful for humans to read. Similar to desc.

Value for example MUST be a string, otherwise a RuleError will be raised upon usage.

Example

# Schema
example: List of values
type: seq
sequence:
  - type: str
    unique: true
    example: Each value must be unique and a string