or
or
By clicking below, you agree to our terms of service.
New to HackMD? Sign up
Syntax | Example | Reference | |
---|---|---|---|
# Header | Header | 基本排版 | |
- Unordered List |
|
||
1. Ordered List |
|
||
- [ ] Todo List |
|
||
> Blockquote | Blockquote |
||
**Bold font** | Bold font | ||
*Italics font* | Italics font | ||
~~Strikethrough~~ | |||
19^th^ | 19th | ||
H~2~O | H2O | ||
++Inserted text++ | Inserted text | ||
==Marked text== | Marked text | ||
[link text](https:// "title") | Link | ||
 | Image | ||
`Code` | Code |
在筆記中貼入程式碼 | |
```javascript var i = 0; ``` |
|
||
:smile: | ![]() |
Emoji list | |
{%youtube youtube_id %} | Externals | ||
$L^aT_eX$ | LaTeX | ||
:::info This is a alert area. ::: |
This is a alert area. |
On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?
Please give us some advice and help us improve HackMD.
Do you want to remove this version name and description?
Syncing
xxxxxxxxxx
We thank the reviewers for their detailed and thoughtful comments.
Overview
Our response primarily focuses on (i) the technical questions posed by Reviewer-A; (ii) the relationship of our work with DRF-SC proofs as questioned by Reviewer-B; and (iii) Reviewer-C's questions on how our technique compares to model-checking approaches for weak memory.
Proposed Changes
In addition to addressing all the technical clarifications and minor typos raised in the reviews, our revision will provide a more detailed in-depth comparison with both DRF-based verification techniques as well as model-checking approaches developed for weak memory systems. We will also move salient material from the appendix to the main body of the paper as suggested by Reviewer-A.
Response
Review-A
Dependence on specifications: In order to discharge verification conditions, we typically require \(hb\) relations to be established among events. Some of them can be automatically inferred through static analysis (e.g. events in the same invocation, or update events caused by CAS statements). However, in some cases, we do require information about LPs. For example, in the register library, since there are no CAS statements, and we require \(hb\) information between events in different invocations, LPs provide the necessary information required to establish correctness.
Client locations: We require that a library implementations "owns" its accessed locations, and that they should not be accesible by clients. This is consistent with the assumptions made by previous work related to library verification [e.g., Raad et. al 2019], and is necessary for modular verification.
C11 with LB: Allowing LB introduces \(so \cup rf\) cycles which essentially allows read events to read from write events in the future (according to \(hb_{SC}\)). Our verification technique is based on maintaining equality between \(hb\) and \(hb_{SC}\) among write events in the past, which ensures robust behavior under RC20. Hence, our technique, as currently defined, cannot be used to detect LB type violations. We will add this requirement in the revised version. We leave the question of whether the libraries proven correct under our current method are indeed robust under C11-semantics with LB as a topic for future investigation.
Method calls inside other methods: We note that we only disallow method calls from inside other methods to simplify our formal treatment. Because our verification conditions operate at the lower granularity of memory events, it is not sensitive to whether these memory events are generated directly by statements inside a method body, or by a method call inside the body. In either case, our verification technique would still search for a potential or actual non-robustness witness.
Minor Comments:
Review-B
Comparison with DRF: The DRF property associated with a relaxed memory model typically asserts that if a program does not have races under SC (where a race can be defined either operationally or declaratively) then the program will not exhibit new behaviors under the relaxed memory model. As observed by the reviewer, this closely resembles the definition of execution-graph robustness, and indeed prior work ([Lahav and Margarit, PLDI 19], [Margarit and Lahav, POPL 21]) have formally proved a correspondence between the two definitions (see e.g., p14 of the POPL'21 paper). In this work, however, our goal is to automatically verify that a library implementation satisfies this property across all of its (unbounded) executions. Towards this end, we propose a novel inductive technique that crucially defines both (a) a general inductive invariant to handle unbounded executions and (b) a verification strategy that takes advantage of the high-level SC specifications of the library that helps to strengthen the inductive hypothesis and reduces the space of possible executions that need to be examined. We are unaware of any prior DRF-based verification technique that provides an automated verification methodology in this setting.
Moreover, our definition of effect robustness is specifically tailored for dealing with libraries, which often may have benign races that do not affect its observable behavior; effect robustness allows us to ignore such races, which, in turn, leads to important precision gains.
Specification language: Our technique establishes effect robustness, which ensures that every observable behavior of the library implementation (in terms of invocation argument and return values) under RC20 should also be possible under SC (L-834, Def. 5.1). Towards this end, we only consider robustness of events that contribute (directly or indirectly) towards the return value. Hence, the example provided by the reviewer is indeed effect-robust, since the return values of both methods are independent of all the other operations. Because the 'fail()' operation is an internal library event, it does not have any impact on observable behavior.
We specify correctness of SC executions using only the invocation events in an execution (Line 545, Def. 3.2, notice that each axiom \(\mathcal{A}_i\) is only applied on \(\Gamma\), the set of invocation events). As such, specification axioms can only constrain argument and return values, as well as the happens-before ordering among invocation events. We will expand on this description in the revised version.
For specifications involving unreachability of a program point, if it can affect the observable library behavior, then our technique will indeed be applicable. In the example provided by the reviewer, if the fail() location affected the return value of the method, then the program would be neither effect-robust nor execution-graph robust, and this would be detected by our tool in the form of an explicit non-robustness witness.
Review-C
We note that the primary contribution of this work is a fully automated technique to verify robustness of library implementations assuming the most general client. We are unaware of any prior work that automatically establishes robustness of arbitrarily long executions involving an unbounded number of threads, only given an SC correctness specification.
Novelty of the proposed work: While the definition of C11 robustness facilitates incremental verification, all prior works targeting library robustness (including those given by the reviewer) only perform bounded verification, assuming a constant number of threads. Hence, we do not believe they can be directly used for verifying library implementations in the presence of a most general client. Further, performing bounded verification for library implementation is unsound, as the library may manifest non-robust executions depending on a certain number of threads/invocations. As a concrete example, note the implementation in Fig. 7 in the paper exhibits a non-(execution-graph) robust execution with at least 4 invocations. Running this program on existing model-checking tools assuming e.g., only 3 invocations would not reveal the robustness violation.
As a more realistic example, we created a simple concurrent set library implementation to demonstrate this problem. The set is implemented as a sorted linked list. The add operation is implemented as a search for the correct position in the list, followed by the allocation of a new node; field updates are implemented using CAS operations. Our test program creates a designated thread to check membership and multiple concurrent threads to add elements. We run this program through GenMC (https://plv.mpi-sws.org/genmc/) a state-of-the-art stateless model checker for weak memory systems to check for robustness violations. We note that if the CAS operations are relaxed (not acquire release), then in two threads no robustness violations are observed, although a violation is detected when using 3 threads. In the two thread case, the thread executing the add operations induces a total order on its CAS operations (due to session order). But in 3 threads, concurrent CAS operations in the 2 concurrent adding threads do not form a total order. We highlight that we directly run the model checker as-is on the library implementation and the test code, without any changes on our end. The implementation and results of running the model checker on the implementation can be found at this anonymized link - https://github.com/m5vnv4sjvtc/automated-robustness-rebuttal. The output of our tool, which successfully identifies the robustness violation, along with an explanation of its results is also provided.
Finally, we also note that performing incremental robustness verification in the unbounded case is not straightforward, as we cannot directly use robustness as an inductive invariant (see Line 660). Our main insight is to strengthen the inductive hypothesis with two conditions \(\Phi_{src}\) and \(\Phi_{dst}\), which together ensure robustness at both source and destination to make it inductive. Our second important insight is to demonstrate that both these conditions can be automatically inferred from the high level SC specification of the library.
Silent actions: These correspond to actions in an invocation that do not access the shared memory (Line 377). These actions will occur in the program semantics transition system (\(\Omega_L\)), and not in the memory model transition system.
States in \(\S 3\): The states of the program semantics LTS maintain necessary bookkeeping information such as active invocations per thread and local variables valuation per invocation. The LTS is described in detail in the supplemental material. The states of the memory model LTS are just execution graphs; this is the standard LTS formulation used in previous work.
Well-formedness conditions: We do require \(rf\) to be functional and total in its range. We thank the reviewer for pointing out this omission; we will be sure to add this in the revised version.