can begin detailed analysis of those RPC requests and responses. Comparison-based verification offers a simpler solution, assuming that the benchmark runs properly when using the reference server. Comparing the SUT’s responses to the problem-free responses produced by the reference server can quickly identify the specific RPC requests for which there are differences. Comparison provides the most benefit when problems involve nuances in responses that cause problems for clients (as contrasted with problems where the server crashes)—often, these will be places where the server implementors interpreted the specification differently. For such problems, the exact differences between the two servers’ responses can be identified, providing detailed guidance to the developer who needs to find and fix the implementation problem.

2. Bug compatibility: In discussing vagueness in specifications, we have noted that some aspects are often open to interpretation. Sometimes, implementors misinterpret them even if they are not vague. Although it is tempting to declare both situations “the other implementor’s problem,” that is simply not a viable option for those seeking to achieve widespread use of their server. For example, companies attempting to introduce a new server product into an existing market must make that server work for the popular clients. Thus, deployed clients introduce de facto standards that a server must accommodate. Further, if clients (existing and new) conform to particular “features” of a popular server’s implementation (or a previous version of the new server), then that again becomes a de facto standard. Some use the phrase “bug compatibility” to describe what must be achieved given these issues.

As a concrete example of bug compatibility, consider the following real problem encountered with a previous NFSv2 server we developed: Linux clients (at the time) did not invalidate directory cookies when manipulating directories, which our interpretation of the specification (and the implementations of some other clients) indicated should be done. So, with that Linux client, an “rm -rf” of a large directory would read part of the directory, remove those files, and then do another READDIR with the cookie returned by the first READDIR. Our server compressed directories when entries were removed, and thus the old cookie (an index into the directory) would point beyond some live entries after some files were removed—the “rm -rf” would thus miss some files. We considered keeping a table of cookie-to-index mappings instead, but without a way to invalidate entries safely (there are no definable client sessions in NFSv2), the table would have to be kept persistently; we finally just disabled directory compression. (NFSv3 has a “cookie verifier,” which would allow a server to solve this problem, even when other clients change the directory.)

Comparison-based verification is a great tool for achieving bug compatibility. Specifically, one can compare each response from the SUT with that produced by a reference server that implements the de facto standard. Such comparisons expose differences that might indicate differing interpretations of the specification or other forms of failure to achieve bug compatibility. Of course, one needs an input workload that has good coverage to fully uncover de facto standards.
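To make this comparison step concrete, a field-by-field check of each duplicated response against the reference server’s response might look like the following Python sketch. All of the names here (COMPARABLE_FIELDS, log_mismatch, and the dictionary-shaped decoded responses) are illustrative assumptions rather than our Tee’s actual interface; the comparison rules our Tee really uses are discussed in the next section.

    # Illustrative sketch: flag responses where the SUT diverges from the
    # de facto standard set by the reference server. Decoded responses are
    # assumed to be plain dictionaries mapping field names to values.
    COMPARABLE_FIELDS = {"status", "file_data", "attributes"}  # hypothetical

    def log_mismatch(request, mismatches):
        for field, ref_val, sut_val in mismatches:
            print(f"{request!r}: field {field!r}: "
                  f"reference={ref_val!r} SUT={sut_val!r}")

    def check_bug_compatibility(request, ref_resp, sut_resp):
        """Return True if the SUT's response matches the reference's."""
        mismatches = [(f, ref_resp.get(f), sut_resp.get(f))
                      for f in COMPARABLE_FIELDS
                      if ref_resp.get(f) != sut_resp.get(f)]
        if mismatches:
            # Each mismatch pinpoints the exact RPC and field for which
            # the SUT's interpretation differs from the deployed server's.
            log_mismatch(request, mismatches)
        return not mismatches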
3. In situ verification: Testing and benchmarking allow offline verification that a server works as desired, which is perfect for those developing a new server. These approaches are of less value to IT administrators seeking comfort before replacing an existing server with a new one. In high-end environments (e.g., bank data centers), expensive service agreements and penalty clauses can provide the desired comfort. But, in less resource-heavy environments (e.g., university departments or small businesses), administrators often have to take the plunge with less comfort. Comparison-based verification offers an alternative, which is to run the new server as the SUT for a period of time while using the existing server as the reference server.³ This requires inserting a server Tee into the live environment, which could introduce robustness and performance issues. But, because only the reference server’s responses are sent to clients, this approach can support reasonably safe in situ verification.

4. Isolating performance differences: Performance comparisons are usually done with benchmarking. Some benchmarks provide a collection of results on different types of server operations, while others provide overall application performance for more realistic workloads. Comparison-based verification could be adapted to performance debugging by comparing per-request response times as well as response contents. Doing so would allow detailed request-by-request profiles of performance differences between servers, perhaps in the context of application benchmark workloads where disappointing overall performance results are observed. Such an approach might be particularly useful, when combined with in situ verification, for determining what benefits might be expected from a new server being considered.

³Although not likely to be its most popular use, this was our original reason for exploring this idea. We are developing a large-scale storage service to be deployed and maintained on the Carnegie Mellon campus as a research expedition into self-managing systems [4]. We wanted a way to test new versions in the wild before deploying them. We also wanted a way to do live experiments safely in the deployed environment, which is a form of the fourth item.

3 Components of a file system Tee

Comparison-based server verification happens at an interposition point between clients and servers. Although there are many ways to do this, we believe it will often take the form of a distinct proxy that we call a “server Tee”. This section details what a server Tee is by describing its four primary tasks. The subsequent section describes the design and implementation of a server Tee for NFSv3.

Relaying traffic to/from reference server: Because it interposes, a Tee must relay RPC requests and responses between clients and the reference server. The work involved in doing so depends on whether the Tee is a passive or an active intermediary. A passive intermediary observes the client-server exchanges but does not manipulate them at all—this minimizes the relaying effort, but increases the effort for the duplicating and comparing steps, which now must reconstruct RPC interactions from the observed packet-level communications. An active intermediary acts as the server for clients and as the only client for the server—it receives and parses the RPC requests/responses and generates like messages for the final destination. Depending on the RPC protocol, doing so may require modifying some fields (e.g., request IDs, since all requests will come from one system, the Tee), which is extra work. The benefit is that other Tee tasks are simplified.
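To illustrate the request-ID bookkeeping an active intermediary takes on, the following Python sketch remaps RPC transaction IDs so that every request leaving the Tee carries a unique ID and every reply can be routed back to the originating client. The class and method names are hypothetical, chosen for illustration only.

    import itertools

    class XidRemapper:
        """Sketch of an active intermediary's request-ID rewriting.

        All requests leave the Tee from a single endpoint, so transaction
        IDs chosen independently by different clients may collide; we issue
        fresh IDs toward the server and map replies back.
        """

        def __init__(self):
            self._next_xid = itertools.count(1)
            self._origin = {}  # tee_xid -> (client_addr, client_xid)

        def outbound(self, client_addr, client_xid):
            """Assign the Tee's own ID to a relayed request."""
            tee_xid = next(self._next_xid)
            self._origin[tee_xid] = (client_addr, client_xid)
            return tee_xid

        def inbound(self, tee_xid):
            """Recover the client and original ID for a server reply."""
            return self._origin.pop(tee_xid)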
Whether a Tee is an active intermediary or a passive one, it must see all accesses that affect server state in order to avoid flagging false positives. For example, an unseen file write to the reference server would cause a subsequent read to produce a mismatch during comparison that has nothing to do with the correctness of the SUT. One consequence of the need for complete interposition is that tapping the interconnect (e.g., via a network card in promiscuous mode or via a mirrored switch port) in front of the reference server will not work—such tapping is susceptible to dropped packets in heavy-traffic situations, which would violate this fundamental Tee assumption.

Synchronizing state on the SUT: Before RPC requests can be productively sent to the SUT, its state must be initialized such that its responses could be expected to match the reference server’s. For example, a file read’s responses won’t match unless the file’s contents are the same on both servers. Synchronizing the SUT’s state involves querying the reference server and updating the SUT accordingly. For servers with large amounts of state, synchronizing can take a long time.

Since only synchronized objects can be compared, few comparisons can be done soon after a SUT is inserted. Requests for objects that have yet to be synchronized produce no useful comparison data. To combat this, the Tee could simply deny client requests until synchronization is complete. Then, when all objects have been synchronized, the Tee could relay and duplicate client requests knowing that they will all be for synchronized state. However, because we hope for the Tee to scale to terabyte- and petabyte-scale storage systems, complete state synchronization can take so long that denying client access would create significant downtime. To maintain acceptable availability, if a Tee is to be used for in situ testing, requests must be handled during initial synchronization even if they fail to yield meaningful comparison results.

Duplicating requests for the SUT: For RPC requests that can be serviced by the SUT (because the relevant state has been synchronized), the Tee needs to duplicate them, send them, and process the responses. This is often not as simple as just sending the same RPC request packets to the SUT, because IDs for the same object on the two servers may differ. For example, our NFS Tee must deal with the fact that the two file handles (the reference server’s and the SUT’s) corresponding to a particular file will differ; they are assigned independently by each server. During synchronization, any such ID mappings must be recorded for use during request duplication, as in the sketch below.
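One plausible shape for that recorded mapping is the following Python sketch; HandleMap and duplicate_request are hypothetical names, and real NFS file handles are opaque byte strings rather than the simple values used here.

    class HandleMap:
        """Sketch: map a reference server's file handles to the SUT's."""

        def __init__(self):
            self._map = {}  # reference file handle -> SUT file handle

        def record(self, ref_fh, sut_fh):
            # Called as each object is synchronized onto the SUT.
            self._map[ref_fh] = sut_fh

        def translate(self, ref_fh):
            # None means the object has not yet been synchronized.
            return self._map.get(ref_fh)

    def duplicate_request(request, handles):
        """Rewrite a decoded request's file handle for the SUT."""
        sut_fh = handles.translate(request["fh"])
        if sut_fh is None:
            return None  # skip: no meaningful comparison is possible yet
        dup = dict(request)
        dup["fh"] = sut_fh
        return dup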
Comparing responses from the two servers: Comparing the responses from the reference server and the SUT involves more than simple bitwise comparison. Each field of a response falls into one of three categories: bitwise-comparable, non-comparable, or loosely-comparable. Bitwise-comparable fields should be identical for any correct server implementation. Most bitwise-comparable fields consist of data provided directly by clients, such as file contents returned by a file read. Most non-comparable fields are either server-chosen values (e.g., cookies) or server-specific information (e.g., free space remaining). Differences in these fields do not indicate a problem, unless detailed knowledge of the internal meanings and states suggests that they do. For example, the disk space utilized by a file could be compared if both servers are known to use a common internal block size and approach to space allocation. Fields are loosely-comparable if comparing them requires more analysis than bitwise comparison—the reference and SUT values must be compared in the context of the field’s semantic meaning. For example, timestamps can be compared (loosely) by allowing differences small enough that they could be explained by clock skew, communication delay variation, and processing time variation.
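The sketch below, which is illustrative rather than a description of our actual implementation, shows how a comparator might dispatch on these three categories; the field classification and the two-second timestamp slack are placeholder assumptions.

    # Hypothetical classification of response fields by comparison category.
    FIELD_CATEGORY = {
        "file_data": "bitwise",        # client-provided data
        "cookie": "non-comparable",    # server-chosen value
        "mtime": "loose",              # timestamp, compared with slack
    }

    MAX_SKEW_SECONDS = 2.0  # placeholder slack for clock skew and delays

    def timestamps_match(ref_ts, sut_ts, slack=MAX_SKEW_SECONDS):
        """Loose comparison: timestamps (in seconds) match within slack."""
        return abs(ref_ts - sut_ts) <= slack

    def field_matches(name, ref_val, sut_val):
        category = FIELD_CATEGORY.get(name, "bitwise")
        if category == "non-comparable":
            return True  # differences here are expected, not failures
        if category == "loose":
            return timestamps_match(ref_val, sut_val)
        return ref_val == sut_val  # bitwise-comparable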