Highly scalable, causally-consistent smart contracts
The ideal safety goal for any smart contract platform should be external consistency, the notion that any transaction will always operate on current information. This is currently not a scalable option in an open blockchain setting where precise timing information is not available and where network latencies are high.
Cross-shard smart contracts can be treated as processes that access distributed data stores to execute transactions. For these distributed data stores, the CAP theorem implies that, of consistency, availability and partition tolerance, we can only guarantee two. In a blockchain network we would have to guarantee partition tolerance. Hence we have to make a choice between consistency and availability. At Harmony we anticipate that:
- In an open blockchain setting, network latencies are high.
- Cross-shard read latencies will be a severe bottleneck for smart contracts, and so smart contracts should continue to execute despite these high latencies
As a result, we choose availability and partition tolerance, and we try to offer strongest possible consistency within those constraints.
Convergent causal consistency
Harmony provides convergent causal consistency. In a convergently causal system, all clients eventually agree on a common transaction result (convergence), and each transaction preserves causal ordering (causal consistency). Causal consistency is the strongest form of consistency available to highly scalable, latency-tolerant systems.
Causal consistency guarantees can be provided through a shim layer added onto a convergent system. Harmony follows a similar model by providing convergence and causal consistency through independent mechanisms.
Smart contract data in Harmony is managed as follows:
- When a smart contract is created, it is assigned to a shard.
- Each shard maintains a shard-local data store of key-value pairs. At any given point in time, this data store maintains a causally-consistent snapshot of data available to its own smart contracts.
- Each key is owned by a single smart contract, and only that smart contract may update the values associated with that key. A smart contract can read and write to values whose keys it owns, but it can only read values corresponding to keys owned by other smart contracts.
- Before the smart contract can begin executing, its shard must register for updates to data required by that smart contract. Once registered, the smart contract’s shard will receive a data feed for updates to remote data sources (other shards) that the smart contract depends on.
- Nodes in a shard are randomly assigned responsibility for disseminating updates from their own shard-local storage to nodes of other shards.
- When a node receives an update from a data feed, it disseminates that data to the rest of the shard. The incoming data is initially considered pending and is unavailable to smart contracts.
- Pending updates must be collated into into discrete chunks called causal cuts before they can be marked as available for smart contracts. A causal cut consists of a set of key-value pairs that represent a causal graph of non-conflicting updates. An update is applied once all variables in the causal cut are available.
- An update is considered non-conflicting so long as it creates no circular causal dependencies and so long as each step of its causal history has been validated against the current available state.
The data feed mechanism is responsible for providing liveness of smart contract data updates, and local data stores provide safety by only revealing data in discrete chunks of causal cuts. By providing only data whose causal cuts have been validated, each shard ensures that its read-ready data comes from some plausible chain of events, consistent with the history of all other smart contracts.
Liveness
Liveness of the data feed is necessary for convergence of shard-local data stores. Because of this, it is necessary for honest nodes to eventually be responsible for disseminating updates to remote shards.
-
In the simplest case, this is solved by randomly assigning a rotating set of nodes to publish and consume these cross-shard data feeds. Publisher nodes can connect to consumer nodes whenever its shard publishes a fixed number of new blocks to the beacon chain.
-
To make this process more robust, nodes can generate proofs of information dispersal that are validated along with other transactions. A successful consensus result on a proof of dispersal implies that a majority of nodes received a data update.
-
In a more advanced case, it may be possible to allow non-consensus nodes to claim the rewards for a proof of dispersal. This would enable a separation of responsibilities between transaction processing and data dispersal.
Safety of the causal cuts depends on a shard’s ability to validate the causal history of incoming data before revealing it to smart contracts. This validation ensures that, for a given key, when a newer value has been used, explicitly or implicitly, by some shard, all older values will cease to be used for subsequent operations by that same shard. This is accomplished by having nodes pass the causal history associated with each key-value pair along with that key-value pair. When sending data through the data feed, the sender must first query the last-known state of keys whose values are being provided. The sender then computes the section of the causal history unknown to the recipient and transmits that section.
Result
Under this scheme, smart contracts in a shard may optimistically read values from remote shards without suffering the latency costs associated with cross-shard transactions. As a result, smart contract transactions in Harmony can be executed with the same throughput and latency as regular token transactions, with zero slow-down resulting from cross-shard data reads.
Each Harmony smart contract currently executes within a single shard. On a single shard, this scheme provides finality of smart contract execution results. This is because nodes in a shard must get consensus to commit conflicting writes for a given key, and an honest quorum of those same nodes prevents such conflicting writes. Finality may be sacrificed in favor of convergence to support smart contracts that operate simultaneously on multiple shards.