Discussion: Hot Module Initialization

Following the discussion on the new Lisk SDK architecture. Quick recap:

@shuse2

Genesis block execution happens only once when starting the blockchain without any data

@JesusTheHun

I think the chain state should store whether a module has been initialized, rather than relying on the genesis block. My rationale is that new modules can be introduced later in the blockchain as soft forks and still require state initialization. Lisk should support that by design and not expect developers to implement their own checks in the new module down the line. Happy to discuss the implementation details at a later stage.

@shuse2

I agree that it would be nice to be able to initialize a module without the genesis block, and I think it is already possible using the beforeTransactionsExecute hook or a block asset.
The main problem I see is how we inject large off-chain data without using the genesis block.

If we remove the genesis block handling, it will be a protocol change, so that would be out of scope of this LIP. However, if you have a good idea, please open a topic on this forum, and we can discuss further =)

Currently adding a new module that requires state initialization must be done through a hard fork because the genesis block must be regenerated.
To avoid a protocol change, we must keep the possibility of initializing the state using the genesis block.

I think we should also allow state initialization without the need for the genesis block.

The module would declare its activation block height and an initialization method.
When the block at that height arrives, the initialization method is executed and the module is used to process transactions from then on.
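
A minimal sketch of this idea, to make the proposal concrete. All names here (`HotModule`, `Chain`, `activationHeight`, the `initialized:` key prefix) are hypothetical and are not part of the Lisk SDK. The point it illustrates is that the "initialized" flag lives in the chain state, so no genesis block entry is needed.

```typescript
// Hypothetical types: none of these names come from the Lisk SDK.
type StateStore = Map<string, string>;

interface HotModule {
  name: string;
  activationHeight: number;
  initialize(state: StateStore): void;
  processTransaction(state: StateStore, tx: string): void;
}

const INIT_PREFIX = "initialized:";

class Chain {
  readonly state: StateStore = new Map();
  private readonly modules: HotModule[];

  constructor(modules: HotModule[]) {
    this.modules = modules;
  }

  executeBlock(height: number, txs: string[]): void {
    for (const mod of this.modules) {
      // The module is ignored until its activation height is reached.
      if (height < mod.activationHeight) continue;

      // The flag is stored in the chain state, not in the genesis block.
      const flag = INIT_PREFIX + mod.name;
      if (!this.state.has(flag)) {
        mod.initialize(this.state);
        this.state.set(flag, String(height));
      }

      for (const tx of txs) mod.processTransaction(this.state, tx);
    }
  }
}

// Example module: counts the transactions it has processed since activation.
const counter: HotModule = {
  name: "counter",
  activationHeight: 3,
  initialize: (state) => {
    state.set("counter:value", "0");
  },
  processTransaction: (state) => {
    const current = Number(state.get("counter:value"));
    state.set("counter:value", String(current + 1));
  },
};

const chain = new Chain([counter]);
for (let height = 1; height <= 4; height++) {
  chain.executeBlock(height, ["tx"]);
}
```

In this sketch the module only sees the blocks at heights 3 and 4, and the initialization runs exactly once because the flag is checked on every block.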

About the concern of loading a big chunk of data, I assume you meant from off-chain to on-chain, like the Lisk Core migration from v2 to v3.

First of all, I want to point out that hot module initialization is optional and that genesis initialization remains possible.
That being said, we could imagine a prepareInitialization method that would be executed as soon as the node starts - or at a given height if we want. This method could start a RocksDB transaction with the WriteUnprepared policy that would only be committed once the activation height is reached.
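
To illustrate the staging idea, here is a rough sketch that uses an in-memory map as a stand-in for the database transaction. It does not call any RocksDB API, and `StagedInitialization`, `stage` and `commitIfDue` are hypothetical names. A real WriteUnprepared transaction would also write the pending data to disk before the commit, which this sketch does not model.

```typescript
// Hypothetical in-memory stand-in for a transaction that is filled early
// and committed only once the activation height is reached.
class StagedInitialization {
  private readonly pending = new Map<string, string>();
  private committed = false;

  // Can be called any time before activation, for example at node start.
  stage(key: string, value: string): void {
    if (this.committed) throw new Error("initialization already committed");
    this.pending.set(key, value);
  }

  // Applies all staged writes to the live state once the height is due.
  // Returns true only on the call that performs the commit.
  commitIfDue(
    height: number,
    activationHeight: number,
    state: Map<string, string>,
  ): boolean {
    if (this.committed || height < activationHeight) return false;
    this.pending.forEach((value, key) => state.set(key, value));
    this.committed = true;
    return true;
  }
}

const liveState = new Map<string, string>();
const staged = new StagedInitialization();

// Prepared ahead of time, invisible to the live state.
staged.stage("token:alice", "100");
staged.stage("token:bob", "50");

const committedAtHeight9 = staged.commitIfDue(9, 10, liveState);
const sizeBeforeActivation = liveState.size;
const committedAtHeight10 = staged.commitIfDue(10, 10, liveState);
```

The heavy work (building the pending writes) happens before the activation block, so only the commit itself has to fit into the block time.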

Small remark here

Currently adding a new module that requires state initialization must be done through a hard fork because the genesis block must be regenerated.

This is true on SDK v5, but with v6, a module is allowed to have no entry in the genesis block. Therefore, it is possible to add a module without an entry in the genesis block.

However, we don’t have any default mechanism to initialize the state without the genesis block.

The module would declare its activation block height and an initialization method.
When the block at that height arrives, the initialization method is executed and the module is used to process transactions from then on.

I think this would be quite a nice feature if we can find a good way to define how it gets the initial state information.
Also, in the opposite direction, it would be nice to have a way to remove a module in a similar way, like deconstruction.

About the concern of loading a big chunk of data, I assume you meant from off-chain to on-chain, like the Lisk Core migration from v2 to v3.

Yes, the concern is how we distribute the initial state information in a decentralized way, where the size could be big, since all the nodes must agree on the initial state.
Additionally, if the size is big, the time to execute and compute the state root will be very long, and it would be quite difficult to fit into the defined block time.

This is true on SDK v5, but with v6, a module is allowed to have no entry in the genesis block. Therefore, it is possible to add a module without an entry in the genesis block.

This is nice :partying_face:

Also, in the opposite direction, it would be nice to have a way to remove a module in a similar way, like deconstruction.

The thing is, you can remove a module without removing the data bound to it. The data just sits there, possibly unused, or maybe still used by another module.

all the nodes must agree on the initial state

I don’t see the issue here. Just like every node must run the same code, they will have to get the same file as the initial state. They download the module code from third parties already, why not the module initial state file?
You can hardcode the SHA hash of the file in the module to prevent poisoning.
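
A minimal sketch of that check, assuming SHA-256 and Node's built-in `crypto` module. The function names and the sample file content are made up for illustration. In a real module the expected hash would be a constant shipped with the code, not computed at runtime as done here for the demo.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of the given bytes.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Throws if the downloaded initial state does not match the expected hash.
function verifyInitialState(file: Uint8Array, expectedSha256: string): void {
  const actual = sha256Hex(file);
  if (actual !== expectedSha256) {
    throw new Error(`initial state hash mismatch: got ${actual}`);
  }
}

// Hypothetical initial state file content.
const initialStateFile = Buffer.from('{"balances":{"alice":"100"}}', "utf8");

// Assumption for the demo only: a real module would hardcode this value.
const EXPECTED_SHA256 = sha256Hex(initialStateFile);

// Passes silently for the genuine file.
verifyInitialState(initialStateFile, EXPECTED_SHA256);

// A tampered file is rejected.
const tamperedFile = Buffer.from('{"balances":{"alice":"999"}}', "utf8");
let tamperedRejected = false;
try {
  verifyInitialState(tamperedFile, EXPECTED_SHA256);
} catch {
  tamperedRejected = true;
}
```

Since every node runs the same module code, every node carries the same expected hash and therefore accepts only the same initial state file, wherever it was downloaded from.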

Additionally, if the size is big, the time to execute and compute the state root will be very long, and it would be quite difficult to fit into the defined block time.

Assuming the hot module state initialization is append-only, you can very well calculate the hash of that new branch root ahead of time, so when the activation height comes you only have to recalculate the parents of the branch root.
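
A simplified sketch of the precomputation, assuming a plain binary Merkle tree and a state root that is extended by hashing the current root together with the new module's subtree root. This layout is an assumption made for illustration; the actual Lisk state tree is structured differently, and all names below are hypothetical.

```typescript
import { createHash } from "node:crypto";

// SHA-256 over the concatenation of all parts.
const hash = (...parts: Buffer[]): Buffer =>
  createHash("sha256").update(Buffer.concat(parts)).digest();

// Root of a simple binary Merkle tree over the given leaves.
function merkleRoot(leaves: Buffer[]): Buffer {
  if (leaves.length === 0) return hash();
  let level = leaves.map((leaf) => hash(leaf));
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      // An unpaired node is promoted unchanged to the next level.
      next.push(i + 1 < level.length ? hash(level[i], level[i + 1]) : level[i]);
    }
    level = next;
  }
  return level[0];
}

// Expensive part, done before the activation height: hash every entry
// of the module's initial state into one subtree root.
const moduleEntries = ["alice:100", "bob:50", "carol:7"].map((entry) =>
  Buffer.from(entry, "utf8"),
);
const precomputedModuleRoot = merkleRoot(moduleEntries);

// Cheap part, done at the activation height: a single hash that attaches
// the precomputed subtree under a new parent.
function activate(currentStateRoot: Buffer, moduleRoot: Buffer): Buffer {
  return hash(currentStateRoot, moduleRoot);
}

const existingStateRoot = merkleRoot([Buffer.from("existing:data", "utf8")]);
const newStateRoot = activate(existingStateRoot, precomputedModuleRoot);
```

However large the initial state is, the work left for the activation block in this sketch is one hash, because the subtree does not change between precomputation and activation.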