Communication overhead is the most commonly used performance metric for the operation complexity of distributed algorithms in message-passing environments. However, aside with communication, many distributed operations utilize complex computations to reach their desired outcomes. Therefore, a most accurate operation latency measure should account of both computation and communication metrics. In this paper we focus on the efficiency of read and write operations in an atomic read/write shared memory emulation in the message-passing environment. We examine the operation complexity of the best known atomic register algorithm, presented in , that allows all read and write operations to complete in a single communication round-trip. Such operations are called fast. At its heart, the algorithm utilizes a predicate to allow processes to compute their outcome. We show that the predicate used in  is computationally hard, by devising a computationally equivalent problem and reducing that to Maximum Biclique, a known NP-hard problem. To improve the computational complexity of the algorithm we derive a new predicate that leads to a new algorithm, we call CCFAST, and has the following properties: (i) can be computed in polynomial time, rendering each read operation in CCFAST tractable compared to the read operations in the original algorithm, (ii) the messages used in CCFAST are reduced in size, compared to the original algorithm, by almost a linear factor, (iii) allows all operations in CCFAST to be fast, and (iv) allows CCFAST to preserve atomicity. A linear time algorithm for the computation of the new predicate is presented along with an analysis of the message complexity of the new algorithm. We believe that the new algorithm redefines the term fast capturing both the communication and the computation metrics of each operation.