in π Notes
Gossip
Table of Contents
4 min read
The Gossip architecture is a weak consistency replication technique where clients send requests to the nearest replica and then the replicas propagate the information to each other periodically just like a normal gossip. It allows for high availability and tolerance to net partitions (A+P).
Algorithm
The algorithm for this architecture can be adapted to the needs of each implementation. It does not need to be rigorously as described bellow. From now on, we will assume there are $n$ replicas.
The Timestamp
There is the concept of timestamp, which is a vector with $n$ entries, where each entry symbolizes the current state of the $i$th replica. Let’s define the merge operation:
merge(tsA, tsB):
for each entry i in tsA
if tsA[i] > tsB[i]
tsB[i] = tsA[i]
On each request, the client sends the prev
timestamp (from the last request). When a replica replies to the client, it returns the data, as well as their own timestamp new
. In this case, the client executes merge(new, prev)
.
Replica State
Each replica contains the following state:
- Value: the value that is stored on each replica we want to keep in sync.
- Value Timestamp: a timestamp representing the operations executed to achieve the current value.
- Log: all the update operations the replica has accepted so far. It may contain already processed operations that are yet required to be propagated to other replicas.
- Replica Timestamp: represents all the operations the current replica has accepted, i.e., that were placed in the log.
- Executed Operations: a list with the unique IDs of the executed operations on the current replica.
- Timestamp Table: a list with the the known replica timestamps from other replicas.
Get Operations
When a client makes a read request, it sends the prev
timestamp as well as the request information. Then, the server proceeds as follows:
- Compares the
prev
timestamp with thevalueTimestamp
to see if we can safely read the value to ensure consistency, i.e., ifprev
<=valueTimestamp
. - If the previous condition is met, the server replies the required information.
Otherwise, wait.
Update Operations
When a client wants to make an update, it generates a UUID id to uniquely identify the request. Then, it sends the id
, the data
and prev
, When the request req
arrives in the server, the replica checks decides whether of not to discard the request. A request is discarded if:
- The
id
is present on the executed operations list; or - The
id
is present on any log record.
If the request is accepted, then we follow the algorithm:
- Update the replica timestamp by incrementing the $i$th entry, where
i
is the current instance number starting on 0. - Create a unique
timestamp
to represent the operation from now on. This timestamp is created by duplicatingprev
and replacing the ith entry with the previously calculated value. - Creates a new log record with the new
timestamp
, the current instance number, theid
and the data to add. - Returns the new timestamp to the client.
- Checks if the operation can be executed immediately by checking if
prev
<=valueTimestamp
. If possible, executemerge(timestamp, valueTimestamp)
.
Otherwise, wait.
Note: if the system’s data has no casual dependencies, there is no need to wait to have the updated data to apply the changes. In that specific case, most of the logic above can be simplified in order to just avoid duplications, but the updates can be immediately applied.
Gossip Operations
This architecture requires each replica to gossip every $t$ time to each other in order to keep consistency. On a gossip request, a replica i sends to the replica j:
- The instance number i;
- The
sourceTimestamp
, which is the replica timestamp of replica i; - The logs records we estimate the replica j does not have.
When the replica j receives the gossip message from the replica i, it then proceeds as follows:
- Updates the entry i in the timestamp table with the
replicaTimestamp
; - For each log record
r
:- Checks if
r.id
is on the executed operations list. If so, discards. - Checks if
r.timestamp
>replicaTimestamp
. If not, discards. - Adds the record to its own record log.
- Checks if
- Updates
replicaTimestamp
, by executingmerge(sourceTimestamp, replicaTimestamp)
. - Goes through the log and executes all stable operations.
- Finally, cleans up the log.
Notes and Advices
- This is an architecture, not a protocol.
- Each use-case can be implemented with its specifics.
- There are ways to ensure the client does not get inconsistent results if they switch replicas.
Or if you don't know what a response is, you can always write a webmention comment (you don't need to know what that is).