This document is still a DRAFT.

This document covers our improvement plans for state sync and catchup. Before reading this doc, you should take a look at How sync works.

State sync is used in two situations:

  • when your node is behind by more than 2 epochs (and it is not an archival node) - rather than trying to apply blocks one by one (which can take hours), you 'give up', download the fresh state (a.k.a. state sync), and apply blocks from there (a rough sketch of this decision follows the list).
  • when you're a block (or chunk) producer and, in the upcoming epoch, you'll have to track a shard that you are not currently tracking (this is the catchup case).
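
As an illustration of the first case, here is a minimal sketch of the "apply blocks vs. state sync" decision. The names (`SyncMode`, `EPOCHS_BEHIND_THRESHOLD`, `choose_sync_mode`) are hypothetical, not actual nearcore identifiers:

```rust
// Hypothetical sketch of the "apply blocks vs. state sync" decision.

const EPOCHS_BEHIND_THRESHOLD: u64 = 2;

#[derive(Debug, PartialEq)]
enum SyncMode {
    /// Catch up by applying blocks one by one.
    ApplyBlocks,
    /// Give up on replaying history and download the fresh state.
    StateSync,
}

fn choose_sync_mode(head_epoch: u64, current_epoch: u64, is_archival: bool) -> SyncMode {
    // Archival nodes need the full history, so they always apply blocks.
    if !is_archival && current_epoch.saturating_sub(head_epoch) > EPOCHS_BEHIND_THRESHOLD {
        SyncMode::StateSync
    } else {
        SyncMode::ApplyBlocks
    }
}

fn main() {
    assert_eq!(choose_sync_mode(10, 13, false), SyncMode::StateSync);
    assert_eq!(choose_sync_mode(10, 11, false), SyncMode::ApplyBlocks);
}
```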

In the past (and currently), state sync was mostly used in the first scenario: all block & chunk producers had to track all the shards for security reasons, so they didn't actually have to do catchup at all.

As we progress towards phase 2 and keep increasing the number of shards, catchup becomes a lot more critical. When we're running a network with 100 shards, a single machine is simply not capable of tracking (a.k.a. applying all transactions of) all the shards, so it will have to track just a subset. And it will have to change this subset almost every epoch, as the protocol rebalances the shard-to-producer assignment based on stakes.

This means that we have to make some larger changes to the state sync design, as the requirements start to differ a lot:

  • catchups are high priority (the validator MUST catch up within 1 epoch - otherwise it will not be able to produce blocks for the new shards in the next epoch, and therefore will not earn rewards).
  • a lot more catchups in progress (with lots of shards, basically every validator will have to catch up at least one shard at each epoch boundary) - this leads to a lot more potential traffic on the network.
  • malicious attacks & incentives - state data can be large and can cause a lot of network traffic. At the same time it is quite critical (see the point above), so we'll have to make sure that nodes are incentivised to provide the state parts upon request.
  • only a subset of peers will be available to request the state from (as not all of our peers will be tracking the shard that we're interested in).

Things that we're actively analysing

Performance of state sync on the receiver side

We're looking at the performance of state sync on the side that receives the requests and serves the parts:

  • how long it takes to create the parts,
  • pro-actively creating the parts as soon as the epoch starts,
  • creating them in parallel,
  • allowing the user to ask for many parts at once,
  • allowing the user to provide a bitmask of the parts that are required (therefore allowing the server to return only the ones that it has already cached) - see the sketch after this list.
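
To make the last two points concrete, here is a minimal sketch of what a batched, bitmask-based request could look like. All names (`StatePartsRequest`, `part_mask`, `answer_request`) are hypothetical, not part of the current protocol:

```rust
use std::collections::HashMap;

/// Request for a batch of state parts for one shard in one epoch.
struct StatePartsRequest {
    shard_id: u64,
    /// Identifies the state root being synced (hex for simplicity).
    sync_hash: String,
    /// Bit i set => the requester still needs part i.
    part_mask: Vec<u8>,
}

impl StatePartsRequest {
    /// True if bit `part_id` is set in the mask.
    fn needs_part(&self, part_id: usize) -> bool {
        let (byte, bit) = (part_id / 8, part_id % 8);
        self.part_mask.get(byte).map_or(false, |&b| (b >> bit) & 1 != 0)
    }
}

/// The server answers only with the requested parts that it already has
/// cached, skipping the ones it would have to generate on the fly.
fn answer_request(
    req: &StatePartsRequest,
    cache: &HashMap<usize, Vec<u8>>, // part_id -> part bytes
    num_parts: usize,
) -> Vec<(usize, Vec<u8>)> {
    (0..num_parts)
        .filter(|&i| req.needs_part(i))
        .filter_map(|i| cache.get(&i).map(|bytes| (i, bytes.clone())))
        .collect()
}
```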

Better performance on the requestor side

Currently the parts are applied only once all of them are downloaded. Instead, we should try to apply each part as soon as it is received, in parallel with downloading the remaining ones.
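
A toy sketch of this pipeline, assuming a hypothetical `StatePart` type and using a plain channel plus a thread (a real implementation could fan out to several workers):

```rust
use std::sync::mpsc;
use std::thread;

struct StatePart {
    part_id: usize,
    data: Vec<u8>,
}

fn apply_part(part: &StatePart) {
    // Placeholder for writing the part's trie nodes into storage.
    println!("applied part {} ({} bytes)", part.part_id, part.data.len());
}

fn main() {
    let (tx, rx) = mpsc::channel::<StatePart>();

    // Consumer that applies each part as soon as it is received.
    let applier = thread::spawn(move || {
        for part in rx {
            apply_part(&part);
        }
    });

    // Simulated download: parts can arrive in any order.
    for part_id in [3, 0, 2, 1] {
        tx.send(StatePart { part_id, data: vec![0; 1024] }).unwrap();
    }
    drop(tx); // closing the channel lets the applier finish
    applier.join().unwrap();
}
```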

When we receive a part, we should announce this to our peers, so that they know that they can request it from us if they need it.
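
For example, a hypothetical announcement message and handler (illustrative names only, not a real nearcore API):

```rust
#[derive(Clone, Debug)]
struct PartAnnouncement {
    shard_id: u64,
    sync_hash: String,
    part_id: u64,
}

trait Network {
    /// Send a message to all currently connected peers.
    fn broadcast(&self, announcement: PartAnnouncement);
}

fn on_part_received<N: Network>(network: &N, shard_id: u64, sync_hash: &str, part_id: u64) {
    // Tell our peers we now hold this part, so they can request it from us
    // instead of going to the original source.
    network.broadcast(PartAnnouncement {
        shard_id,
        sync_hash: sync_hash.to_string(),
        part_id,
    });
}
```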

Ideas we're not actively working on yet

Better networking (a.k.a. Tier 3)

Currently our networking code picks the peers to connect to at random (which works today, as most of them are tracking all the shards). With phase 2 this will no longer be the case, so we should work on improving our peer-selection mechanism.

In general, we should make sure that we have direct connections to at least a few nodes that are tracking the same shards that we're tracking right now (or that we'll want to track in the near future).
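
A minimal sketch of such shard-aware peer ranking, with made-up `PeerInfo` and `rank_peers` names:

```rust
use std::collections::HashSet;

/// Hypothetical view of a peer - not an actual nearcore type.
struct PeerInfo {
    id: String,
    tracked_shards: HashSet<u64>,
}

/// Sort candidates so that peers tracking more of the shards we care
/// about (now or in the near future) come first.
fn rank_peers(candidates: &mut [PeerInfo], wanted_shards: &HashSet<u64>) {
    candidates.sort_by_key(|peer| {
        let overlap = peer.tracked_shards.intersection(wanted_shards).count();
        std::cmp::Reverse(overlap) // larger overlap sorts first
    });
}
```

Peers at the front of the ranking would then be preferred when opening new connections or deciding which existing connections to keep.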

Dedicated nodes optimized towards state sync responses

The idea is to create a set of nodes that would specialize in state sync responses (similar to how we have archival nodes today).

A sub-idea of this is to store such data with one of the cloud providers (AWS, GCP).
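
One way to structure this is a common interface for fetching parts, with a local and a cloud-backed implementation behind it. This is only a sketch with made-up names (`StatePartStore`, etc.), not a real design:

```rust
trait StatePartStore {
    fn get_part(&self, shard_id: u64, sync_hash: &str, part_id: u64) -> Option<Vec<u8>>;
}

/// Parts generated and cached by the node itself.
struct LocalStore;

impl StatePartStore for LocalStore {
    fn get_part(&self, _shard_id: u64, _sync_hash: &str, _part_id: u64) -> Option<Vec<u8>> {
        None // a real implementation would look this up in the node's database
    }
}

/// Parts uploaded to a cloud bucket by dedicated nodes.
struct CloudStore {
    bucket_url: String,
}

impl StatePartStore for CloudStore {
    fn get_part(&self, shard_id: u64, sync_hash: &str, part_id: u64) -> Option<Vec<u8>> {
        // A real implementation would fetch this key over HTTP.
        let _key = format!("{}/{}/{}/{}", self.bucket_url, sync_hash, shard_id, part_id);
        None
    }
}
```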

Sending deltas instead of full state syncs

In the case of catchup, the requesting node might have tracked that shard in the past, so we could consider sending just a delta of the state rather than the whole state.

While this helps with the amount of data being sent, it might require the node that receives the request to do a lot more work: the data that it is about to send depends on the requester's previous state, so it cannot be easily cached.
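
To illustrate the idea on a flat key-value map (the real state is a trie, and a production delta would be expressed in trie nodes), here is a toy delta computation with hypothetical names:

```rust
use std::collections::HashMap;

type State = HashMap<String, Vec<u8>>;

enum DeltaOp {
    Put(String, Vec<u8>),
    Delete(String),
}

/// Everything needed to turn `old` into `new`.
fn compute_delta(old: &State, new: &State) -> Vec<DeltaOp> {
    let mut ops = Vec::new();
    // Keys that were added or changed.
    for (key, value) in new {
        if old.get(key) != Some(value) {
            ops.push(DeltaOp::Put(key.clone(), value.clone()));
        }
    }
    // Keys that were removed.
    for key in old.keys() {
        if !new.contains_key(key) {
            ops.push(DeltaOp::Delete(key.clone()));
        }
    }
    ops
}

fn apply_delta(state: &mut State, delta: Vec<DeltaOp>) {
    for op in delta {
        match op {
            DeltaOp::Put(key, value) => {
                state.insert(key, value);
            }
            DeltaOp::Delete(key) => {
                state.remove(&key);
            }
        }
    }
}
```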