distributed

Replication

Data replication means storing the same data on multiple storage devices, which may not be on the same network.

Benefits

- High availability: the system remains available even when some nodes are offline.
- Higher performance and scalability: clients can connect to their closest data source.

Requisites

- Transparency: the client should see a single system rather than a cluster of collaborating systems.
- Consistency: ideally, when a client makes an update, all observers see it immediately.

Faults

The number of faults that replication can tolerate depends on the type of fault:

- Fail-silent faults: with $f+1$ nodes we can tolerate $f$ faults, since a single working replica is enough to get the value.
- Byzantine faults: with $2f+1$ nodes we can tolerate $f$ faults. At least $f+1$ replicas are then correct, so we are assured of receiving $f+1$ equal responses, which outnumber the at most $f$ faulty ones.

Active vs Passive

There are two types of data replication:

- Active, where all servers execute all requests.
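
As a rough illustration of the fault bounds above, the following sketch shows how a client of an actively replicated service might combine the replies it gets. The function names (`max_faults`, `read_fail_silent`, `vote`) and the convention that a silent replica is represented by `None` are assumptions made for this example, not part of any particular system.

```python
from collections import Counter


def max_faults(n, byzantine):
    """Largest f tolerated by n replicas.

    Fail-silent: n >= f + 1, so f = n - 1.
    Byzantine:   n >= 2f + 1, so f = (n - 1) // 2.
    """
    return (n - 1) // 2 if byzantine else n - 1


def read_fail_silent(responses):
    """Fail-silent replicas either answer correctly or not at all,
    so the first reply that arrives (non-None) can be trusted."""
    return next((r for r in responses if r is not None), None)


def vote(responses, f):
    """Byzantine replicas may answer with wrong values, so accept a
    value only if at least f + 1 replicas report it. Returns None
    when no value reaches that threshold."""
    counts = Counter(r for r in responses if r is not None)
    if not counts:
        return None
    value, votes = counts.most_common(1)[0]
    return value if votes >= f + 1 else None


# Active replication: every replica executed the same request.
# With 3 replicas and f = 1, one wrong reply is outvoted.
f = max_faults(3, byzantine=True)
print(vote([42, 42, 7], f))
print(read_fail_silent([None, None, 5]))
```

With three replicas and one Byzantine fault, the two matching replies reach the $f+1$ threshold. If a second replica also failed, no value would collect enough votes and `vote` would return `None` rather than guess.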