The complications associated with applications distributed over networked computers include the possibility that more than one process, each without any knowledge of the others, will attempt to make use of an unsharable resource at the same time (for example, updating the same data item). Figure illustrates an example of a common conflict pertaining to an entry in a data base. Each process wishes to update the credit remaining in an account when more than one person has access to the account. Two valid transactions, each executing without knowledge of the other, collide. The result is that one transaction is cancelled by the other. The problem illustrated in the figure is the interleaving of the reads and writes to the database. The result is called the lost transaction problem. Concurrency controls would prevent this from happening if both access attempts were under the control of the same multi-tasking operating system. However, since these processes are not subject to a common control, a valid transaction is lost.
Process 1 Credit Limit Process 2 --------- ------------ --------- | 5,000 | Read Credit | (5,000) T | I | Read Credit M | (5,000) E | | Subtract Purchase | (5,000 - 850) | Write New Credit | 4,150 | Subtract Purchase | (5,000 - 54.50) | Write New Credit | 4,945.50 V
In addition to the concurrency problem, distributed applications are further complicated by the fact that more than one processing unit is involved and one may fail while the other continues to function normally. What should happen when one application process requests that another perform some activity and then one or the other of the processes crashes? What must happen when the failed system restarts? These are the types of questions addressed by the Application Service Element Commitment, Concurrency, and Recovery (CCR).
The central concept in a distributed application is the notion of an atomic action. An atomic action is a sequence of operations such that either
The concept of an atomic action is familiar from the study of operating systems. However, it is complicated in distributed systems by the separation and independence of the processors. Since no single process controls scheduling and access for all the processes active through the network, control requires a distributed effort shared by separate processes that may be unaware of each other's existence.
Processes built from atomic actions are called transaction processes. A transaction is initiated by a client and carried out by a server. For the server, a transaction consists of two phases. During phase 1, the server receives a description of the atomic action (the sequence of operations) and makes arrangements to carry out the atomic action. (The server may carry out the requested operation, but retain the ability to return to its state before the action, or it may record the desired action pending final instruction to do it.) When the atomic action is fully defined, phase 2 begins. During phase 2, the server commits to successful completion of the transaction if possible; otherwise, the server aborts the transaction and reports failure to the client.
Once the server is committed, it must complete the requested operations regardless of any failures it may encounter. If the server crashes during execution of the operations that constitute the transaction, it must resume processing upon restart. Once the server has committed to carry out the transaction, neither the server nor the client can abort the processing.