Validation rules

PyKwalify supports all rules implemented by the original kwalify and includes many more that extend the specification.

type

A type specifies what rules and constraints should be applied to this node in the data structure.

The following types are available:

  • any
    • Will always be true no matter what the value is, even unimplemented types
  • bool
    • Only True/False validates. Integers or strings like 0 or 1, "True" or "False" do not validate for bool
  • date
    • A string or datetime.date object that follows a date format
  • float
    • Any object that is a float, or any object that Python can convert to a float with float(obj). Scientific notation is supported for this type, for example 1e-06.
  • int
    • Validates only for integers and not floats
  • mapping or map
    • Validates only for dict objects
  • none
    • Validates only for None values
  • number
    • Validates if value is int or float
  • scalar
    • Validates for all but seq or map. None values will also fail validation.
  • sequence or seq
    • Validates for lists
  • str
    • Validates if value is a python string object
  • text
    • Validates if value is str or number
  • time
    • Not yet implemented [NYI]
  • timestamp
    • Validates for basic timestamp formats

Example

# Schema
type: str
# Data
'Foobar'
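Python's bool is a subclass of int, which is why the bool and int descriptions above matter in practice. The sketch below shows one way such checks could be written. The helper names is_bool, is_int and is_number are hypothetical, and this is not PyKwalify's own code; excluding bool from int is an assumption of the sketch.

```python
def is_bool(obj):
    # Only real booleans pass; 0, 1, "True" and "False" do not.
    return isinstance(obj, bool)


def is_int(obj):
    # bool is a subclass of int in Python, so it is excluded explicitly
    # (an assumption of this sketch).
    return isinstance(obj, int) and not isinstance(obj, bool)


def is_number(obj):
    # "number" is described above as int or float.
    return is_int(obj) or isinstance(obj, float)


print(is_bool(True), is_bool(1), is_bool("True"))  # True False False
print(is_int(5), is_int(5.0))                      # True False
print(is_number(1e-06))                            # True
```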

Mapping

A mapping validates against the dict data structure.

Aliases

  • mapping
  • map

The map type is implicitly assumed when mapping or its alias map is present in the rule.

# Schema
type: map
mapping:
  key_one:
    type: str
# Data
key_one: 'bar'

The schema below sets the mapping type implicitly and is also a valid schema.

# Schema
map:
  key_one:
    type: str

There are some constraints which are available only for the map type, and expand its functionality. See the allowempty, regex;(regex-pattern) and matching-rule sections below for details.

By default, map keys specified in the map rule can be omitted unless they have the required constraint explicitly set to True.

Sequence

A sequence (list) of values, each of which must match the given type.

The sequence type is implicitly assumed when sequence or its alias seq is present in the rule.

Aliases

  • sequence
  • seq

Example

# Schema
type: seq
sequence:
  - type: str
# Data
- 'Foobar'
- 'Barfoo'

The schema below sets the sequence type implicitly and is also a valid schema.

# Schema
seq:
  - type: str

Multiple list entries are supported to enable validation of different types of data inside the sequence.

Note

The original kwalify specification only allowed one entry in the list. This has been extended in PyKwalify to give more flexibility when validating.

Example

# Schema
type: seq
sequence:
  - type: str
  - type: int
# Data
- 'Foobar'
- 123456

The data above is valid.

Matching

Multiple subrules can be used within the sequence block. It can also be nested to any depth, with subrules constraining list items to be sequences of sequences.

The matching constraint can be used when the type is sequence to control how the parser handles a list of different subrules for the sequence block.

  • any
    • Each list item must satisfy at least one subrule
  • all
    • Each list item must satisfy every subrule
  • *
    • At least one list item must satisfy at least one subrule

Example

# Schema
type: seq
matching: "any"
sequence:
  - type: str
  - type: seq
    sequence:
      - type: int
# Data
- - 123
- "foobar"
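The three matching modes can be read as different combinations of any() and all() over the items and the subrules. The sketch below is an illustration of the descriptions above, not PyKwalify's implementation; check_sequence and the two predicate helpers are hypothetical names, and subrules are modelled as plain Python predicates.

```python
def check_sequence(items, subrules, matching="any"):
    """Apply every subrule (a predicate) to every item and combine the results."""
    results = [[rule(item) for rule in subrules] for item in items]
    if matching == "any":
        # Each item must satisfy at least one subrule.
        return all(any(row) for row in results)
    if matching == "all":
        # Each item must satisfy every subrule.
        return all(all(row) for row in results)
    if matching == "*":
        # At least one item must satisfy at least one subrule.
        return any(any(row) for row in results)
    raise ValueError("unknown matching mode: %r" % matching)


def is_str(value):
    return isinstance(value, str)


def is_int_list(value):
    return isinstance(value, list) and all(isinstance(i, int) for i in value)


data = [[123], "foobar"]
print(check_sequence(data, [is_str, is_int_list], "any"))  # True
print(check_sequence(data, [is_str, is_int_list], "all"))  # False
print(check_sequence([1.5, "x"], [is_str, is_int_list], "*"))  # True
```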

Timestamp

Parse a string or integer to determine if it is a valid unix timestamp.

Timestamps must be above 1 and below 2147483647.

Parsing is done with python-dateutil. You can see all valid formats in the relevant dateutil documentation.

Example

# Schema
type: map
mapping:
  d1:
    type: timestamp
  d2:
    type: timestamp
# Data
d1: "2015-03-29T18:45:00+00:00"
d2: 2147483647

All datetime objects will validate as a valid timestamp.

PyYaml can sometimes automatically convert data to datetime objects.

Date

Parse a string or datetime object to determine if it is a valid date. Date has multiple valid formats based on what standard you are using.

For example, 2016-12-31 and 31-12-16 are both valid formats.

If you want to parse a custom format, use the format keyword to specify a valid datetime parsing syntax. The valid syntax is described in the Python documentation for strptime (python-strptime).

Example:

# Schema
type: date
# Data
"2015-12-31"

Format

Only valid when using the date or datetime type. It lets you define custom datetime formats if the default formats are not enough.

Define the value as a string, or as a list of formats, using the built-in Python datetime string formatting language. The syntax is described in the Python documentation for strptime (python-strptime).
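A format string can be tried out directly with datetime.strptime from the standard library. The sketch below accepts either a single format or a list of formats, as described above. The helper name matches_format is hypothetical, and the loop only illustrates the idea; it is not PyKwalify's own code.

```python
from datetime import datetime


def matches_format(value, formats):
    """Return True if value parses with at least one of the given formats."""
    if isinstance(formats, str):
        formats = [formats]  # allow a single format string
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue  # try the next format
    return False


print(matches_format("2015-12-31", "%Y-%m-%d"))                # True
print(matches_format("31-12-2015", "%Y-%m-%d"))                # False
print(matches_format("31-12-2015", ["%Y-%m-%d", "%d-%m-%Y"]))  # True
```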

# Schema
type: date
format: "%Y-%m-%d"
# Data
"2015-12-31"

Required

If the required constraint is set to True, the key and its value must be present, otherwise a validation error will be raised.

Default is False.

Aliases

  • required
  • req

Example

# Schema
type: map
mapping:
  key_one:
    type: str
    required: True
# Data
key_one: foobar
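The rule can be pictured as a lookup of each schema key in the data. The sketch below works on plain dicts and is only an illustration under that assumption; missing_required_keys is a hypothetical helper, not part of PyKwalify.

```python
def missing_required_keys(data, mapping):
    """Return the schema keys marked as required that are absent from data."""
    missing = []
    for key, rule in mapping.items():
        # "req" is the alias of "required"; the default is False.
        required = rule.get("required", rule.get("req", False))
        if required and key not in data:
            missing.append(key)
    return missing


schema_mapping = {
    "key_one": {"type": "str", "required": True},
    "key_two": {"type": "str"},
}
print(missing_required_keys({"key_one": "foobar"}, schema_mapping))  # []
print(missing_required_keys({"key_two": "x"}, schema_mapping))       # ['key_one']
```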

Enum

Set of possible elements; the value must be a member of this set.

The value of enum must be a list of items.

Currently only exact case matching is implemented. If you need complex validation you should use pattern.

Example

# Schema
type: map
mapping:
  blood:
    type: str
    enum: ['A', 'B', 'O', 'AB']
# Data
blood: AB

Pattern

Specifies a regular expression pattern which the value must satisfy.

Uses re.match internally. Pattern works for all scalar types.

For using regex to define possible key names in mapping, see regex;(regex-pattern) instead.

Example

# Schema
type: map
mapping:
  email:
    type: str
    pattern: .+@.+
# Data
email: foo@mail.com
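Because re.match only anchors at the start of the string, a pattern that matches a prefix of the value is enough to pass. The sketch below shows the effect with the standard re module. The helper pattern_ok is a hypothetical name, and converting non-string scalars with str() is an assumption of the sketch.

```python
import re


def pattern_ok(pattern, value):
    # re.match anchors at the start of the string but not at the end.
    # Converting non-string scalars with str() is an assumption of this sketch.
    return re.match(pattern, str(value)) is not None


print(pattern_ok(r".+@.+", "foo@mail.com"))  # True
print(pattern_ok(r"foo", "foobar"))          # True: a prefix match is enough
print(pattern_ok(r"bar", "foobar"))          # False: no match at the start
print(pattern_ok(r"foo$", "foobar"))         # False: $ requires the value to end after foo
```

Add an explicit $ to the pattern when the whole value has to match.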

Range

Restricts a value to a range, bounded by
  • min and/or max
  • min-ex and/or max-ex.

For numeric types (int, float and number), the value must be within the specified range, and for non-numeric types (map, seq and str) the length of the dict/list/string as given by len() must be within the range.

For the data value (or length), x, the range can be specified to test for the following:
  • min provides an inclusive lower bound, a <= x
  • max provides an inclusive upper bound, x <= b
  • min-ex provides an exclusive lower bound, a < x
  • max-ex provides an exclusive upper bound, x < b

Non-numeric types require non-negative values for the boundaries, since a length cannot be negative.

Types bool and any are not compatible with range.

Example

# Schema
type: map
mapping:
  password:
    type: str
    range:
      min: 8
      max: 16
  age:
    type: int
    range:
      min: 18
      max-ex: 30
# Data
password: foobar123
age: 25
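The four bounds translate directly into comparisons. The sketch below applies them to the value itself for numbers and to len() for everything else, following the description above. The function in_range is a hypothetical helper and not PyKwalify's implementation; bool and any are not handled since range does not apply to them.

```python
def in_range(value, rng):
    """Check value (or its length) against min, max, min-ex and max-ex."""
    # Numbers are compared directly; maps, lists and strings by their length.
    x = value if isinstance(value, (int, float)) else len(value)
    if "min" in rng and not rng["min"] <= x:
        return False
    if "max" in rng and not x <= rng["max"]:
        return False
    if "min-ex" in rng and not rng["min-ex"] < x:
        return False
    if "max-ex" in rng and not x < rng["max-ex"]:
        return False
    return True


print(in_range("foobar123", {"min": 8, "max": 16}))  # True, length is 9
print(in_range(25, {"min": 18, "max-ex": 30}))       # True
print(in_range(30, {"min": 18, "max-ex": 30}))       # False, 30 is excluded
```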

Unique

If unique is set to True, then the sequence cannot contain any repeated entries.

The unique constraint can only be set when the type is seq / sequence. It has no effect when used with map / mapping.

Default is False.

Example

# Schema
type: seq
sequence:
  - type: str
    unique: True
# Data
- users
- foo
- admin
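A uniqueness check amounts to finding entries that occur more than once. The sketch below uses lists rather than sets so that unhashable entries such as nested dicts also work. The name find_duplicates is hypothetical and the code is only an illustration.

```python
def find_duplicates(seq):
    """Return each repeated entry once, in order of first repetition."""
    seen = []
    duplicates = []
    for item in seq:
        if item in seen:
            if item not in duplicates:
                duplicates.append(item)
        else:
            seen.append(item)
    return duplicates


print(find_duplicates(["users", "foo", "admin"]))  # []
print(find_duplicates(["users", "foo", "users"]))  # ['users']
```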

Allowempty

Only applies to mapping.

If True, the map can have keys which are not present in the schema, and these can map to anything.

Any keys which are specified in the schema must have values which conform to their corresponding constraints, if they are present.

Default is False.

Example

# Schema
type: map
mapping:
  datasources:
    type: map
    allowempty: True
# Data
datasources:
  test1: test1.py
  test2: test2.py
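The effect of allowempty can be sketched as a filter on keys that the schema does not mention. The helper unexpected_keys below is hypothetical and works on plain dicts; it illustrates the rule rather than reproducing PyKwalify's code.

```python
def unexpected_keys(data, mapping, allowempty=False):
    """Return the keys in data that the schema mapping does not define."""
    if allowempty:
        return []  # extra keys are accepted and may hold anything
    return [key for key in data if key not in mapping]


datasources = {"test1": "test1.py", "test2": "test2.py"}
print(unexpected_keys(datasources, {}, allowempty=True))   # []
print(unexpected_keys(datasources, {}, allowempty=False))  # ['test1', 'test2']
```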

Regex;(regex-pattern)

Only applies to mapping type.

Aliases

  • re;(regex-pattern)

This is only implemented for mappings: a key inside the mapping keyword can be written as regex;(regex-pattern), and every key in the data is then matched against that pattern.

Please note that the regex should be wrapped with ( ) and these parentheses will be removed at runtime.

If a match is found then it will be parsed against the subrules on that key. A single key can be matched against multiple regex rules and the normal map rules.

When defining a regex key, matching-rule should also be set to configure the behaviour when using multiple regexes.

Example

# Schema
type: map
matching-rule: 'any'
mapping:
  regex;(mi.+):
    type: seq
    sequence:
      - type: str
  regex;(me.+):
    type: number
# Data
mic:
  - foo
  - bar
media: 1
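The key syntax can be split into its prefix and pattern with a few string operations. The sketch below is an illustration only: parse_regex_key and matching_schema_keys are hypothetical helpers, and the use of re.search (an unanchored match) is an assumption inferred from the examples in this document, not a confirmed detail of PyKwalify.

```python
import re


def parse_regex_key(key):
    """Return the pattern of a 'regex;(...)' or 're;(...)' key, else None."""
    prefix, sep, rest = key.partition(";")
    if sep and prefix in ("regex", "re") and rest.startswith("(") and rest.endswith(")"):
        return rest[1:-1]  # drop the wrapping parentheses
    return None


def matching_schema_keys(data_key, mapping):
    """List the regex keys in the schema mapping whose pattern matches data_key."""
    hits = []
    for schema_key in mapping:
        pattern = parse_regex_key(schema_key)
        # re.search is an assumption of this sketch.
        if pattern is not None and re.search(pattern, data_key):
            hits.append(schema_key)
    return hits


mapping = {"regex;(mi.+)": {"type": "seq"}, "regex;(me.+)": {"type": "number"}}
print(parse_regex_key("regex;(mi.+)"))         # mi.+
print(matching_schema_keys("mic", mapping))    # ['regex;(mi.+)']
print(matching_schema_keys("media", mapping))  # ['regex;(me.+)']
```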

Matching-rule

Only applies to mapping. This enables more fine-grained control over how the matching rule should behave when validating regex keys inside mappings.

Currently supported constraint settings are

  • any
    • One or more of the regex must match.
  • all
    • All defined regex must match each key.

Default is any.

Example

The following dataset will raise an error because the key bar2 does not match all of the regexes. If the constraint were instead matching-rule: any, the same data would be valid because every key in the data matches at least one of the regexes and its associated constraints in the schema.

# Schema
type: map
matching-rule: all
mapping:
  regex;([1-2]$):
    type: int
  regex;(^foobar):
    type: int
# Data
foobar1: 1
foobar2: 2
bar2: 3
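The difference between the two settings can be reproduced with the standard re module. The sketch below checks only the key names from the example, not the int constraint on the values. The helper key_is_valid is a hypothetical name, and using re.search is an assumption inferred from the example, where foobar1 is expected to fit the pattern [1-2]$.

```python
import re


def key_is_valid(key, patterns, matching_rule="any"):
    """Apply every pattern to the key and combine the results."""
    hits = [re.search(pattern, key) is not None for pattern in patterns]
    if matching_rule == "all":
        return all(hits)
    if matching_rule == "any":
        return any(hits)
    raise ValueError("unknown matching-rule: %r" % matching_rule)


patterns = ["[1-2]$", "^foobar"]
keys = ["foobar1", "foobar2", "bar2"]
print([key_is_valid(k, patterns, "all") for k in keys])  # [True, True, False]
print([key_is_valid(k, patterns, "any") for k in keys])  # [True, True, True]
```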

Name

Name of the schema.

This has no effect on the parsing, but is useful for humans to read.

Example

# Schema
name: foobar schema

Desc

Description of schema.

This has no effect on the parsing, but is useful for humans to read. Similar to name.

The value for desc MUST be a string, otherwise a RuleError will be raised upon usage.

Example

# Schema
desc: This schema is very foobar

Example

Write an example that shows which values are supported, or just add any comment to the schema for future reference.

It can be used at all levels and places in the schema. It has no effect on the parsing, but is useful for humans to read. Similar to desc.

The value for example MUST be a string, otherwise a RuleError will be raised upon usage.

Example

# Schema
example: List of values
type: seq
sequence:
  - type: str
    unique: true
    example: Each value must be unique and a string