# Benchmarking synthetic workloads
Benchmarking a synthetic workload starts with a new network with empty state. State is then created, and afterwards transactions involving that state are generated. For example, the native token transfer workload creates `n` accounts holding NEAR balance and then generates transactions that transfer the native token between these accounts.
This approach has the following benefits:
- Relatively simple and quick setup, as no state from real world networks is involved.
- Fine-grained control over traffic intensity.
- Enables comparing `neard` performance at different points in time or with different features.
- Might expose performance bottlenecks.
The main drawbacks of synthetic benchmarks are:
- The conclusions that can be drawn are limited, as real world traffic is not as homogeneous as a synthetic workload.
- Calibrating traffic generation parameters can be cumbersome.
The tooling for synthetic benchmarks is available in `benchmarks/synth-bm`.
## Workflows
The tooling's `justfile` contains recipes for the most relevant workflows.
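If you are unsure which recipes exist, `just` can list them (assuming `just` is installed and you are in the tool's directory):

```sh
cd benchmarks/synth-bm
just --list
```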
### Benchmark native token transfers
A typical workflow for benchmarking native token transfers using the above `justfile` would be something along these lines:
- Set up the network (`[t1]` denotes the first terminal):

  ```sh
  rm -rf .near && just init_localnet
  # Modify the configuration (see the "Un-limit configuration" section)
  [t1]$ just run_localnet
  [t1]$ just create_sub_accounts
  ```

- Run the benchmark in a second terminal (`[t2]`):

  ```sh
  # set the desired tx rate (`--interval-duration-micros`) and the total volume (`--num-transfers`) in the justfile
  [t2]$ just benchmark_native_transfers
  ```
This benchmark generates a native token transfer workload involving the accounts provided via `--user-data-dir`. Transactions are generated by iterating over these accounts and sending native tokens to a randomly chosen receiver from the same set of accounts. To view all options, run:

```sh
cargo run --release -- benchmark-native-transfers --help
```
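As a concrete illustration, an invocation might combine the parameters mentioned in this document roughly as follows. The values are examples only and the exact set of required flags may differ between versions, so treat this as a sketch and consult `--help` for the authoritative list:

```sh
# Hypothetical invocation; flag names are taken from this document, values are examples only.
cargo run --release -- benchmark-native-transfers \
    --rpc-url http://127.0.0.1:3030 \
    --user-data-dir user-data \
    --num-transfers 200000 \
    --interval-duration-micros 500 \
    --channel-buffer-size 30000
```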
For the native transfer benchmark, transactions are sent with `wait_until: None`, meaning the responses received by the `near_synth_bm` tool are basically just an ACK from the RPC confirming that it received the transaction. Thus the numbers reported by the tool, as in

```
[2025-01-27T14:05:12Z INFO near_synth_bm::native_transfer] Sent 200000 txs in 6.50 seconds
[2025-01-27T14:05:12Z INFO near_synth_bm::rpc] Received 200000 tx responses in 6.49 seconds
```

are not directly indicative of runtime performance or transaction outcomes.
The number of successfully processed transactions can be obtained by querying the `near_transaction_processed_successfully_total` metric, e.g. with:

```sh
curl -s http://localhost:3030/metrics | grep transaction_processed
```
Automatic calculation of transactions per second (TPS) for RPC requests sent with `wait_until: NONE` is coming up shortly.
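Until then, a rough manual estimate can be obtained by sampling the counter twice; the sketch below assumes the metric is exported as a single unlabeled series and that `curl` and `awk` are available:

```sh
# Rough manual estimate; assumes a single unlabeled series for the counter.
BEFORE=$(curl -s http://localhost:3030/metrics \
  | awk '/^near_transaction_processed_successfully_total/ {print $2}')
sleep 10
AFTER=$(curl -s http://localhost:3030/metrics \
  | awk '/^near_transaction_processed_successfully_total/ {print $2}')
echo "processed TPS over the last 10s: $(( (AFTER - BEFORE) / 10 ))"
```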
### Benchmark calls to the `sign` method of an MPC contract
This benchmark assumes the accounts that send the transactions invoking `sign` have been created as described above. Transactions can be sent to an RPC node of a network on which an instance of the `mpc/chain-signatures` contract is deployed.
Transactions are sent to the RPC with `wait_until: EXECUTED_OPTIMISTIC`, as the throughput for `sign` is at a level at which neither the network nor the RPC is expected to be a bottleneck.
All options of the command can be shown with:

```sh
cargo run -- benchmark-mpc-sign --help
```
## Auxiliary steps
### Network setup and `neard` configuration
Details of bringing up and configuring a network are out of scope for this document. Instead, we give a brief overview of the setup regularly used to benchmark the TPS of common workloads in a single-node, single-shard setup.
#### Build `neard`
Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS. Place the corresponding `neard` binary in the `justfile`'s directory or set the `NEARD_PATH` environment variable to point to it.
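For instance, a release build from the `nearcore` repository root could look like the following; adjust the cargo features to what you want to benchmark:

```sh
# Build neard in release mode (add --features ... as needed for your benchmark).
cargo build --release -p neard
# Let the justfile pick up the freshly built binary.
export NEARD_PATH="$(pwd)/target/release/neard"
```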
#### Create sub accounts
Creating the state for synthetic benchmarks usually starts with creating accounts. We create sub accounts for the account specified by `--signer-key-path`. This avoids dealing with the registrar, which would be required for creating top level accounts. To view all options, run:

```sh
cargo run --release -- create-sub-accounts --help
```
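As an illustration, an invocation could look roughly like the sketch below. `--rpc-url` and `--signer-key-path` are mentioned in this document; `--num-sub-accounts` and the `validator_key.json` path are hypothetical placeholders, so rely on `--help` for the actual flags and defaults:

```sh
# Sketch only: --num-sub-accounts and the key path are assumptions, not verified flags.
cargo run --release -- create-sub-accounts \
    --rpc-url http://127.0.0.1:3030 \
    --signer-key-path .near/validator_key.json \
    --num-sub-accounts 100
```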
#### Initialize the network

```sh
./neard --home .near init --chain-id localnet
```
#### Enable memtrie
The configuration generated by the above command does not enable memtrie. However, most benchmarks should run against a node with memtrie enabled, which can be achieved by setting the following in `.near/config.json`:

```
"load_mem_tries_for_tracked_shards": true
```
#### Un-limit configuration
Following the steps so far creates a configuration that throttles throughput due to various limits related to state witness size, gas/compute, and congestion control. If you want to benchmark a node that fully utilizes its hardware, apply the following modifications to effectively run with an unlimited configuration:
```
# Modifications in .near/genesis.json
"chain_id": "benchmarknet"
"gas_limit": 20000000000000000  # increase default by x20

# Modifications in .near/config.json
"view_client_threads": 8  # increase default by x2
"load_mem_tries_for_tracked_shards": true  # enable memtrie
"produce_chunk_add_transactions_time_limit": {
  "secs": 0,
  "nanos": 800000000  # increase default by x4
}
```
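One way to apply these modifications is with `jq`. This is just a convenience sketch (it assumes `jq` is installed and that the keys shown above live at the top level of the respective files); editing the files by hand works just as well:

```sh
# Convenience sketch; hand-editing achieves the same result.
# genesis.json: must be edited before the node runs for the first time.
jq '.chain_id = "benchmarknet" | .gas_limit = 20000000000000000' \
    .near/genesis.json > /tmp/genesis.json && mv /tmp/genesis.json .near/genesis.json
# Some jq versions reformat very large integers; double-check gas_limit afterwards.

# config.json
jq '.view_client_threads = 8
    | .load_mem_tries_for_tracked_shards = true
    | .produce_chunk_add_transactions_time_limit = {"secs": 0, "nanos": 800000000}' \
    .near/config.json > /tmp/config.json && mv /tmp/config.json .near/config.json
```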
Note that as `nearcore` evolves, these steps and the `BENCHMARKNET` adjustments might need to be updated to achieve the effect of un-limiting the configuration.
Modifications of `genesis.json` need to be applied after initializing the network with `just init_localnet` but before the node runs for the first time; otherwise `just run_localnet` will fail. If you ran the node with the default config and want to switch to the unlimited config, the required steps are:

```sh
# Remove .near as you will need to initialize localnet again.
$ rm -rf .near
$ just init_localnet
# Modify the configuration
$ just run_localnet
```
### Common parameters
The following parameters are common to multiple tasks:
#### `rpc-url`
The RPC endpoint to which transactions are sent.
Synthetic benchmarking may create thousands of transactions per second, which can hit network limitations if the RPC is located on a separate machine. In particular, sending transactions to nodes running on GCP requires care, as it can cause temporary IP address bans. For that scenario it is recommended to run a separate traffic-generation VM in the same GCP zone as the RPC node and to send transactions to its internal IP.
#### `interval-duration-micros`
Controls the rate at which transactions are sent. Assuming your hardware is able to send a request at every interval tick, the number of transactions sent per second equals `1_000_000 / interval-duration-micros`. The rate might be slowed down if `channel-buffer-size` becomes a bottleneck.
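For example, `--interval-duration-micros 500` targets 1,000,000 / 500 = 2,000 transactions per second.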
#### `channel-buffer-size`
Before an RPC request is sent, the tooling awaits capacity on a buffered channel, so the number of outstanding RPC requests is limited by `channel-buffer-size`. This can slow down the rate at which transactions are sent if the node is congested. To disable that behavior, set `channel-buffer-size` to a large value, e.g. the total number of transactions to be sent.
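To illustrate the mechanism, here is a minimal sketch of the bounded-channel backpressure pattern described above, using `tokio`. It is not the actual `near_synth_bm` code, just an illustration of how a buffer of size `channel-buffer-size` caps the number of dispatched-but-unhandled requests:

```rust
use std::time::Duration;
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    // Stand-in for `channel-buffer-size`: the channel buffers at most this many
    // dispatched requests that have not yet been picked up by the response handler.
    let channel_buffer_size = 4;
    let (requests, mut responses) = mpsc::channel::<u64>(channel_buffer_size);

    // Response handler: simulates awaiting RPC responses, freeing one slot each time.
    let handler = tokio::spawn(async move {
        while let Some(id) = responses.recv().await {
            tokio::time::sleep(Duration::from_millis(10)).await; // simulated RPC latency
            println!("response received for request {id}");
        }
    });

    for id in 0..16u64 {
        // `send` waits while the buffer is full, so a slow (congested) receiver
        // throttles the effective send rate.
        requests.send(id).await.expect("response handler is alive");
        println!("request {id} dispatched");
    }
    drop(requests); // close the channel so the handler terminates
    handler.await.unwrap();
}
```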