October 22, 2021

Coordination And Agreement In Distributed Systems

In the absence of failures, the problem of distributed coordination is essentially just the problem of concurrency control in a distributed setting, and we can adapt the approaches we have already seen for multiprocessors, most notably mutual exclusion. But we need to make sure that whatever we use still works when our only underlying communication mechanism is message passing. We also have to accept that we no longer have as much control over the order of events: while a write to a memory location can be treated as an atomic operation (if our hardware is properly configured), transmitting a message across a network necessarily takes time. This can create confusion between widely distributed processors about when certain events occur, or even about the order in which they occur.

The common pattern across systems such as email and DNS: to build a successful distributed system at Internet scale, it seems that you need an application that does not require centralized coordination, that allows newcomers to join easily, and that scales economically by lining up costs and benefits so that they are roughly balanced for most potential users. This also characterizes other large distributed systems such as UUCP (formerly) or various peer-to-peer content distribution systems (more recently).

From a practical point of view, we can simply allow our model to include the features that real systems actually have (such as timeouts or reliable clocks). This gives us a more useful model and allows the use of standard coordination algorithms of the types described in Chapter 18 of SGG.

The SMTP email system.

This is to packet routing what the web is to DNS: a store-and-forward system that is implemented at a high level in the protocol stack and not in the routers. Reliability is important (the Two Generals problem comes up here, because my mail server cannot delete an outgoing message until it is sure that your mail server has saved it), but for the most part there is no need for any kind of global coordination or consistency.

There are three main approaches to looking at distributed systems, each offering a slightly different perspective from the others.

The Domain Name System (DNS). Essentially a large hierarchical database for translating domain names like www.cs.yale.edu into IP addresses like 128.36.229.30. The underlying mechanism is that the user sends RPC-like queries through a tree of name servers, finding first the name server for .edu, then for .yale.edu, then for .cs.yale.edu, and finally the destination host. Fault tolerance is achieved through replication, scalability through caching, and sound economics by carefully assigning most of the cost of a query to an organization that cares about the user getting the answer (e.g., Yale ITS runs the .yale.edu name server). Note that very little coordination is required: domain records do not change very quickly and can be published in a central location.

Even more practically, we can adopt the principle “if we do this right, we won't need algorithms” and aim for distributed systems that do not require much global coordination or consistency. Most of the distributed systems we actually use fall into this last category. The failure of the coordinator is more difficult to handle.
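The DNS lookup described above, where a query walks down a tree of name servers from .edu to .yale.edu to .cs.yale.edu, can be illustrated with a small in-memory sketch. This is a toy model only, not real DNS: the NameServer and Resolver classes, the build_demo_root helper, the extra root server above .edu, and the MAX_HOPS limit are all hypothetical names and assumptions made for illustration, and no network traffic is involved. The resolver also keeps a simple cache, to show roughly why caching helps with scalability.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple, Union

MAX_HOPS = 10  # assumed safety limit so a bad delegation cannot loop forever


@dataclass
class NameServer:
    """Toy name server for one zone (hypothetical, in-memory only)."""
    zone: str
    delegations: Dict[str, "NameServer"] = field(default_factory=dict)  # child zone -> server
    hosts: Dict[str, str] = field(default_factory=dict)  # host name -> IP address

    def query(self, name: str) -> Tuple[str, Union[str, "NameServer", None]]:
        """Answer directly, refer the caller to a child zone, or report no match."""
        if name in self.hosts:
            return ("answer", self.hosts[name])
        for child_zone, server in self.delegations.items():
            if name == child_zone or name.endswith("." + child_zone):
                return ("referral", server)
        return ("nxdomain", None)


class Resolver:
    """Toy client that follows referrals down the tree and caches answers."""

    def __init__(self, root: NameServer) -> None:
        self.root = root
        self.cache: Dict[str, str] = {}
        self.queries_sent = 0  # counts queries that actually reached a server

    def resolve(self, name: str) -> Optional[str]:
        if name in self.cache:  # cache hit: no server is contacted
            return self.cache[name]
        server = self.root
        for _ in range(MAX_HOPS):
            self.queries_sent += 1
            kind, value = server.query(name)
            if kind == "answer":
                self.cache[name] = value
                return value
            if kind == "referral":
                server = value  # descend one level in the name server tree
                continue
            return None  # no server knows this name
        return None  # gave up after MAX_HOPS referrals


def build_demo_root() -> NameServer:
    """Build the example hierarchy from the text, plus an assumed root server."""
    cs = NameServer("cs.yale.edu", hosts={"www.cs.yale.edu": "128.36.229.30"})
    yale = NameServer("yale.edu", delegations={"cs.yale.edu": cs})
    edu = NameServer("edu", delegations={"yale.edu": yale})
    return NameServer(".", delegations={"edu": edu})


if __name__ == "__main__":
    resolver = Resolver(build_demo_root())
    print(resolver.resolve("www.cs.yale.edu"), resolver.queries_sent)
    print(resolver.resolve("www.cs.yale.edu"), resolver.queries_sent)
```

In this sketch the first lookup contacts four servers (the assumed root, then .edu, .yale.edu, and .cs.yale.edu), while the repeated lookup is served from the resolver's cache and contacts none. No server ever needs to coordinate with its siblings, which is the property the text emphasizes.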

About Bob Bergey

Bob has been driving motorcoaches since 2002, in every state east of the Mississippi and a few west, as well as the four southeastern-most provinces of Canada. In addition to driving, he's an avid photographer (and former professional) and enjoys writing and technology.