Disallow Non-required Properties in Lisk Codec

Hello everyone,

I would like to propose a new LIP for the roadmap objective “Guarantee uniqueness of binary messages in Lisk codec”. This LIP makes Lisk codec stricter by disallowing non-required properties in Lisk JSON schemas. Note that some links to LIP 0055 may not work until this PR has been merged in the lips repository.

I’m looking forward to your feedback.

Here is the complete LIP draft:

LIP: <LIP number>
Title: Disallow Non-required Properties in Lisk Codec
Author: Andreas Kendziorra <andreas.kendziorra@lightcurve.io>
Discussions-To: https://research.lisk.com/t/disallow-non-required-properties-in-lisk-codec/341
Type: Standards Track
Created: <YYYY-MM-DD>
Updated: <YYYY-MM-DD>
Replaces: 0027
Requires: 0055

Abstract

This proposal defines a stricter version of Lisk codec, the serialization method introduced in LIP 0027. It is stricter in that every schema property must be marked as required. This simplifies the rules significantly and makes the serialization method much less error-prone, without requiring any change to the current Lisk SDK implementation.

Copyright

This LIP is licensed under the Creative Commons Zero 1.0 Universal.

Motivation

The generic serialization method defined in LIP 0027, also referred to as Lisk codec, permits properties in a Lisk JSON schema that are not marked as required. The reason was to allow a single schema to be used to serialize an object both with and without a signature property. For example, there is only one block header schema in LIP 0029, which is used for serializing unsigned block headers (needed for computing block header signatures) and for serializing signed block headers (needed, for example, for computing block IDs). However, the implementation of Lisk SDK 5 does not make use of this flexibility. Instead, whenever two different serializations (with and without signature) are needed, two different schemas are used, and in each schema every property is marked as required (one schema simply does not contain the signature property - see the block header schemas). Moreover, this flexibility comes with several downsides:

  1. It is error-prone: LIP 0027 encourages marking every property in a schema as required, as otherwise unexpected things like encode(decode( binaryMsg )) != binaryMsg for a binary message binaryMsg can happen. But some developers may unintentionally miss this recommendation, or may intentionally ignore it without foreseeing all consequences.
  2. Lisk codec is supposed to be compatible with proto2 in the sense that protobuf implementations with the appropriate .proto file deserialize valid binary messages correctly. However, this cannot be fully achieved, because even the proto2 specification does not fix how missing optional fields (which correspond to non-required properties in Lisk codec) are decoded: it does not define the default value for message fields (which correspond to objects in Lisk codec). For proto3, it is mentioned that this is even language-specific.
  3. The rules for Lisk codec and the implementation are significantly more complex than they would be if every property had to be marked as required.

For these reasons, we propose to make the rules for Lisk codec stricter by requiring that every property in a Lisk JSON schema be marked as required. In particular, this implies that:

  1. encode(decode( binaryMsg )) == binaryMsg holds for every valid binary message binaryMsg.
  2. Every protobuf implementation will decode a valid binary message correctly.
  3. The rules, especially for decoding, become simpler.

Specification

The serialization and deserialization method defined in this document works as defined in LIP 0027 with exceptions and differences specified in the following subsections.

Lisk JSON Schemas

We make the definition of Lisk JSON schema more strict: A Lisk JSON schema is a JSON schema as defined in the Lisk JSON Schemas section in LIP 0027 with the additional requirement that

  • every schema of type object must use the required keyword, and every property of the object must be contained in its value.

See below for examples.
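
The additional requirement can be checked mechanically. The following is a minimal sketch in Python, not the Lisk SDK implementation: the function name find_non_required, the path notation, and the three sample schemas are assumptions made for illustration. It walks a schema recursively and reports every object that lacks the required keyword and every property that is not listed in it.

```python
from typing import Any, Dict, List


def find_non_required(schema: Dict[str, Any], path: str = "$") -> List[str]:
    """Return a list of violations of the 'every property is required' rule.

    Hypothetical helper for illustration; it only checks the rule added by
    this proposal, not the remaining Lisk JSON schema rules of LIP 0027.
    """
    problems: List[str] = []
    schema_type = schema.get("type")
    if schema_type == "object":
        properties = schema.get("properties", {})
        if "required" not in schema:
            problems.append(f"{path}: missing 'required' keyword")
            # Report the missing keyword once instead of once per property.
            required = set(properties)
        else:
            required = set(schema["required"])
        for name, sub_schema in properties.items():
            if name not in required:
                problems.append(f"{path}.{name}: not listed in 'required'")
            # Nested objects and arrays must satisfy the rule as well.
            problems.extend(find_non_required(sub_schema, f"{path}.{name}"))
    elif schema_type == "array":
        items = schema.get("items")
        if isinstance(items, dict):
            problems.extend(find_non_required(items, f"{path}[]"))
    return problems


# Illustrative schemas (assumed for this sketch).
VALID_SCHEMA = {
    "type": "object",
    "required": ["foo", "bar"],
    "properties": {
        "foo": {"dataType": "uint32", "fieldNumber": 1},
        "bar": {"dataType": "uint32", "fieldNumber": 2},
    },
}
NO_REQUIRED_KEYWORD = {
    "type": "object",
    "properties": VALID_SCHEMA["properties"],
}
PARTIAL_REQUIRED = {
    "type": "object",
    "required": ["foo"],
    "properties": VALID_SCHEMA["properties"],
}

for candidate in (VALID_SCHEMA, NO_REQUIRED_KEYWORD, PARTIAL_REQUIRED):
    print(find_non_required(candidate))
```

An empty list means the schema satisfies the additional requirement; any other result names the offending object or property.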

Encoding

The encoding rules are exactly the same as in LIP 0027.

Decoding

The decoding rules become simpler compared to LIP 0027:

Only valid messages can be decoded (note that point 3.iii in the definition of valid binary message becomes obsolete with this proposal). For invalid binary messages, decoding fails. When a valid binary message is parsed, the binary message is decoded according to the proto2 specifications. In particular, whenever a binary message does not contain a specific field of type array, the corresponding field in the parsed object is set to the empty array. This holds for arrays using packed encoding as well as for arrays using non-packed encoding.
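
The rule for absent array fields can be sketched as a post-processing step. This is a rough illustration under stated assumptions, not the reference implementation: parsed stands for the output of a proto2 wire-format parser (assumed here, not shown) that maps property names to decoded values, and both the function name fill_missing_arrays and the sample schema are hypothetical. Validity checks and the handling of other field types are out of scope.

```python
from typing import Any, Dict


def fill_missing_arrays(schema: Dict[str, Any], parsed: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `parsed` in which absent array properties are set to [].

    Hypothetical helper: `parsed` is assumed to come from a proto2 parser
    that simply omits fields not present in the binary message.
    """
    result = dict(parsed)
    for name, sub_schema in schema.get("properties", {}).items():
        kind = sub_schema.get("type")
        if kind == "array":
            items = sub_schema.get("items", {})
            # An array field that is absent from the message becomes [].
            values = result.get(name, [])
            if items.get("type") == "object":
                values = [fill_missing_arrays(items, value) for value in values]
            result[name] = values
        elif kind == "object" and name in result:
            result[name] = fill_missing_arrays(sub_schema, result[name])
    return result


# Illustrative schema with one array property (assumed for this sketch).
SCHEMA = {
    "type": "object",
    "required": ["id", "votes"],
    "properties": {
        "id": {"dataType": "uint32", "fieldNumber": 1},
        "votes": {
            "type": "array",
            "fieldNumber": 2,
            "items": {"dataType": "uint32"},
        },
    },
}

# The binary message contained field 1 only; field 2 (the array) was absent.
print(fill_missing_arrays(SCHEMA, {"id": 7}))
```

The same default applies regardless of whether the array would have used packed or non-packed encoding, since in both cases an empty array contributes no bytes to the message.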

Backwards Compatibility

Implementations following this proposal are not fully backwards compatible with the codec defined in LIP 0027 and implemented in Lisk SDK 5. That is, Lisk JSON schemas that are valid in the sense of LIP 0027 but have properties not marked as required can no longer be used for encoding and decoding. However, the proposal is backwards compatible with valid Lisk JSON schemas in which every property is marked as required.

Conversely, every Lisk JSON schema valid in accordance with this proposal is also valid with LIP 0027, and the encoding and decoding rules are the same.

Of the Lisk JSON schemas used in the protocol of Lisk SDK 5, only the block header schema is not valid with respect to this proposal. However, it is supposed to be superseded by LIP 0055, in which two schemas are used: one with the block header signature and one without.

Reference Implementation

TBD

Appendix

Examples

Invalid JSON schemas

The root schema does not use the required keyword:

{
  "type": "object",
  "properties": {
    "foo": {
      "dataType": "uint32",
      "fieldNumber": 1
    },
    "bar": {
      "dataType": "uint32",
      "fieldNumber": 2
    }
  }
}

Not all properties of the root schema are listed in the value of required:

{
  "type": "object",
  "required": ["foo"],
  "properties": {
    "foo": {
      "dataType": "uint32",
      "fieldNumber": 1
    },
    "bar": {
      "dataType": "uint32",
      "fieldNumber": 2
    }
  }
}

I created a PR for this LIP in the lips repository: Pull Request #142 in LiskHQ/lips, Add LIP: "Disallow Non-required Properties in Lisk Codec", by ricott1.

This PR has been merged.