State Machine Replication (SMR)

#distributed-computing #knowledge-bite

State Machine Replication (SMR) is a great way to provide high availability for your application. However, it requires the logic behind your application to be deterministic: given the same inputs in the same order, every replica must end up in the same state.
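To make the determinism requirement concrete, here is a minimal sketch in Python. The `CounterMachine` class, the `replay` helper, and the command format are all hypothetical names made up for illustration; they do not come from any particular SMR library.

```python
from dataclasses import dataclass


@dataclass
class CounterMachine:
    """Hypothetical deterministic state machine.

    Its state depends only on the commands applied and their order:
    no clocks, no randomness, no reads from the outside world.
    """
    value: int = 0

    def apply(self, command: tuple[str, int]) -> int:
        op, amount = command
        if op == "add":
            self.value += amount
        elif op == "mul":
            self.value *= amount
        else:
            raise ValueError(f"unknown op: {op}")
        return self.value


def replay(log: list[tuple[str, int]]) -> int:
    """Start from the initial state and apply every command in log order."""
    machine = CounterMachine()
    for command in log:
        machine.apply(command)
    return machine.value


# Three replicas replaying the same log independently reach the same state.
log = [("add", 2), ("mul", 5), ("add", 1)]
replica_states = [replay(log) for _ in range(3)]

# The same commands in a different order give a different state,
# which is why replicas must also agree on the order.
reordered_state = replay([("mul", 5), ("add", 2), ("add", 1)])
```

If `apply` consulted something like the current time or a random number, the replicas could drift apart even with identical logs, and replication would no longer be safe.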

Servers are physical machines that can break, and networks can fail. If you run your application's logic on only one server, any issue with that server immediately affects your availability. That is not a problem for a personal blog like this one, but for a financial institution, missing even a single user action might be catastrophic.

Developers provide fault tolerance by using SMR together with a consensus algorithm such as Paxos or Raft. The idea is to run the same deterministic logic on multiple servers and then only apply the end result if a majority of servers agree.
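The majority rule can be sketched as a simple quorum check. This is only an illustration of the "majority agrees" idea, not an implementation of Paxos or Raft, which also handle leader election, log ordering, and recovery. The function name `majority_result` and the convention of using `None` for an unresponsive replica are assumptions made for this example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_result(
    results: Sequence[Optional[Hashable]], cluster_size: int
) -> Optional[Hashable]:
    """Return the value reported by a strict majority of the cluster, else None.

    A None entry in `results` stands for a replica that crashed or timed out
    (an assumption of this sketch).
    """
    quorum = cluster_size // 2 + 1  # strict majority of the whole cluster
    counts = Counter(r for r in results if r is not None)
    if not counts:
        return None
    value, votes = counts.most_common(1)[0]
    return value if votes >= quorum else None


# Three replicas, one of them down: two matching answers still form a majority.
accepted = majority_result([11, 11, None], cluster_size=3)

# Two live replicas that disagree: no majority, so nothing is applied.
rejected = majority_result([11, 7, None], cluster_size=3)
```

Counting the quorum against the full cluster size, rather than against the replicas that happened to respond, is what lets a cluster of three keep working with one server down while refusing to act when too few servers agree.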