
Using State Sync

Any new node joining a network needs to first catch up with the existing nodes. The default approach is to replay all the blocks from genesis. Depending on the number of blocks in the chain, this process can take a long time.

State sync is a feature that allows a new node to bootstrap directly from a state snapshot instead of replaying all the blocks from genesis. State sync greatly reduces the time needed to sync a new node to a network; however, the node will have a truncated transaction history starting from the height of the snapshot, and it requires a trusted provider to validate the snapshot.

info

Snapshot creation requires the pg_dump tool on the node, while snapshot restoration requires psql. Both are typically included in a postgresql-client package. They must be the same version as the PostgreSQL daemon (postgres) required by the version of Kwil DB you are running.
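A version mismatch is easiest to catch before a snapshot is needed. The following is a minimal sketch of such a check, not part of Kwil DB: the helper names are hypothetical, and it relies only on the standard `--version` flag of `pg_dump`, `psql`, and `postgres`. It compares major versions only, which is an assumption about how strict the match needs to be.

```python
import re
import shutil
import subprocess
from typing import Dict, Optional

TOOLS = ("pg_dump", "psql", "postgres")

def parse_major(version_output: str) -> int:
    """Extract the leading version number from text such as 'pg_dump (PostgreSQL) 16.2'.

    For PostgreSQL 10 and later, this leading number is the major version.
    """
    match = re.search(r"\(PostgreSQL\)\s+(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognized version output: {version_output!r}")
    return int(match.group(1))

def installed_major(tool: str) -> Optional[int]:
    """Return the tool's major version, or None if the tool is not on PATH."""
    if shutil.which(tool) is None:
        return None
    result = subprocess.run(
        [tool, "--version"], capture_output=True, text=True, check=True
    )
    return parse_major(result.stdout)

def versions_agree(majors: Dict[str, Optional[int]]) -> bool:
    """True only if every tool was found and all report the same major version."""
    found = set(majors.values())
    return None not in found and len(found) == 1
```

Calling `versions_agree({tool: installed_major(tool) for tool in TOOLS})` gives a quick yes or no. If the PostgreSQL server runs in a container or on another host, `postgres` will not be on the local PATH, and the server version has to be checked there instead.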

Snapshot Creation

Kwil nodes may be configured to create and share snapshots with the following configuration in the config.toml file. Refer to the snapshot configuration for more details.

# Snapshot creation and provider configuration
[snapshots]
# enable creating and providing snapshots for peers using statesync
enable = true
# snapshot creation period in blocks
recurring_height = 14400
# number of snapshots to keep; once this limit is exceeded, the oldest is removed when a new one is created
max_snapshots = 3

If snapshots are enabled, the node generates a snapshot at every block height that is a multiple of recurring_height. Each snapshot is a logical SQL dump of the underlying PostgreSQL database, compressed and divided into chunks. The node keeps at most max_snapshots snapshots on disk; when the limit is reached, the oldest snapshot is deleted.

For example, a node with the above configuration generates a snapshot every 14,400 blocks. If the node currently holds snapshots at heights 14400, 28800, and 43200, it will generate the next snapshot at height 57600. Because max_snapshots is set to 3, the snapshot at height 14400 is deleted when the snapshot at height 57600 is created.
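The schedule and retention rule can be expressed in a few lines. The sketch below is illustrative only: the function names are hypothetical and it is not how the node is implemented. It uses the values from the configuration above (recurring_height = 14400, max_snapshots = 3).

```python
from typing import List

def next_snapshot_height(current_height: int, recurring_height: int) -> int:
    """Return the first multiple of recurring_height strictly above current_height."""
    return (current_height // recurring_height + 1) * recurring_height

def retained_after(existing: List[int], new_height: int, max_snapshots: int) -> List[int]:
    """Return the snapshot heights kept on disk after a new snapshot is created.

    Only the newest max_snapshots heights are kept; older ones are dropped.
    """
    heights = sorted(existing + [new_height])
    return heights[-max_snapshots:]

if __name__ == "__main__":
    held = [14400, 28800, 43200]
    upcoming = next_snapshot_height(held[-1], recurring_height=14400)
    print(upcoming)
    print(retained_after(held, upcoming, max_snapshots=3))
```

With snapshots held at 14400, 28800, and 43200, the next snapshot height is 57600, and the retained set becomes 28800, 43200, and 57600.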

State Sync

If existing nodes on a network are creating snapshots, a new node may use state sync to bootstrap itself from these snapshots.

info

State sync is only used for node bootstrapping. Catch-up after a restart will use block sync.

Configure the new node to use state sync by providing the following configuration in the config.toml file.

# Statesync configuration (vs block sync)
[state_sync]
# enable using statesync rather than blocksync
enable = true
# trusted snapshot providers in node ID format (see bootnodes)
trusted_providers = []
# how long to discover snapshots before selecting one to use
discovery_time = '15s'
# how many times to try after failing to apply a snapshot before switching to blocksync
max_retries = 3

When state sync is enabled, the node first discovers snapshots from its peers. It then selects the latest of the discovered snapshots and validates its integrity against the RPC service of a trusted snapshot provider. Once a valid snapshot is identified, the node fetches the snapshot from its peers and restores the application state from it. After state restoration, the node begins syncing blocks starting from the snapshot height.
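The selection step can be sketched as follows. This is an illustration under stated assumptions, not the node's actual logic: `SnapshotOffer`, `select_snapshot`, and the `is_trusted` callback are hypothetical names, and falling back to an older snapshot when the latest one fails validation is an assumption the text above does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

@dataclass(frozen=True)
class SnapshotOffer:
    """Hypothetical record of a snapshot advertised by a peer during discovery."""
    height: int
    checksum: str  # digest advertised by the peer
    peer: str

def select_snapshot(
    offers: Iterable[SnapshotOffer],
    is_trusted: Callable[[SnapshotOffer], bool],
) -> Optional[SnapshotOffer]:
    """Return the highest offer that the trusted provider confirms, or None.

    Offers are tried from the latest height downward (assumed fallback order).
    """
    for offer in sorted(offers, key=lambda o: o.height, reverse=True):
        if is_trusted(offer):
            return offer
    return None

if __name__ == "__main__":
    # Stand-in for the trusted provider's answers: height -> expected checksum.
    trusted = {28800: "aa11", 43200: "bb22"}
    offers = [
        SnapshotOffer(28800, "aa11", "peer-1"),
        SnapshotOffer(43200, "ffff", "peer-2"),  # checksum does not match
    ]
    chosen = select_snapshot(offers, lambda o: trusted.get(o.height) == o.checksum)
    print(chosen)
```

In this run the offer at height 43200 is rejected because its checksum differs from the trusted one, so the offer at height 28800 is chosen. If no offer validates, the function returns None, which corresponds to the node remaining in the discovery phase.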

note

The node stays in the discovery phase until a snapshot is discovered and validated. If there are no snapshots in the network, the node will be stuck in the discovery phase. To make progress, disable state sync (set enable = false under [state_sync]) and restart the node. The node will then proceed with block sync.

Trusted Snapshot Providers

A network should have at least one trusted snapshot provider responsible for verifying snapshot integrity.

Trusted snapshot providers are regular full nodes that have snapshot creation enabled and that serve snapshot checksums on their RPC service. Joining nodes use this RPC service to validate snapshots received via P2P communication, whether from the trusted provider or from other nodes.

In addition to the trusted snapshot providers, other nodes in the network can also enable snapshots and distribute them to joining nodes during the state sync process. However, a joining node only accepts these snapshots after validating them with the trusted snapshot providers.
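The reason untrusted peers can safely serve snapshot data is that the joining node compares what it downloads against a checksum from a source it trusts. The sketch below illustrates the idea only: the function name is hypothetical, and SHA-256 over the concatenated chunks is an assumption, since the hash algorithm and chunk layout Kwil DB uses are not stated here.

```python
import hashlib
import hmac
from typing import Iterable

def chunks_match_checksum(chunks: Iterable[bytes], trusted_hex: str) -> bool:
    """Hash the chunks in order and compare the digest to the trusted checksum.

    SHA-256 is an assumed algorithm, chosen for illustration.
    """
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    # compare_digest performs a constant-time string comparison.
    return hmac.compare_digest(digest.hexdigest(), trusted_hex.lower())

if __name__ == "__main__":
    original = [b"CREATE TABLE t (id int);", b"INSERT INTO t VALUES (1);"]
    trusted_hex = hashlib.sha256(b"".join(original)).hexdigest()

    print(chunks_match_checksum(original, trusted_hex))
    tampered = [original[0], b"INSERT INTO t VALUES (2);"]
    print(chunks_match_checksum(tampered, trusted_hex))
```

The untouched chunks pass the check, while the tampered set fails it, regardless of which peer supplied the data.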