# Translation: C++ sequenced-before graphs

Original article: https://josephmansfield.uk/articles/c++-sequenced-before-graphs.html

## Glossary

* evaluation: "to judge or calculate the quality, importance, amount, or value of something"; rendered as 求值 in Chinese.
* sequenced-before: rendered as 先序 (先排序) in Chinese.
* side effect:
* Implementation-defined behaviour:

## Main text

`i = i++`

This expression, and many like it, demonstrates the sequencing rules of C++ and how they can cause your program to behave in ways you might not expect. What should the result of this expression be, given some initial value for `i`? Questions like this are extremely frequent on Stack Overflow. In this case, the answer is that this code is completely undefined: it could do anything. Literally anything (at least formally speaking).

The C++ standard defines an execution path according to your code. Given particular inputs, the program will always follow this execution path. Sometimes, the standard allows multiple possible execution paths for your program. This gives compilers extra freedom to optimize in various ways. When the standard allows a particular set of possible paths, this is known as unspecified behavior. In other cases, the standard gives absolutely no requirements about the behavior of your program, and this is known as undefined behavior. Undefined behavior is certainly not something you want in your code.

Implementation-defined behavior is a subset of unspecified behavior, for which the implementation is required to document its choice of behavior. The sequencing rules of C++, which describe how the evaluations of expressions and their subexpressions are ordered, may determine that an expression is undefined.
`C++11` introduced a smarter but slightly more complex approach to specifying these rules, which is still used in `C++14`. C++17 later added further ordering guarantees, so some of the conclusions below hold only for `C++11` and `C++14`. There's a great overview of the rules on the cppreference.com wiki.

In simple terms, evaluations are ordered by a sequenced-before relationship. That is, evaluation of one part of an expression may be sequenced before the evaluation of another part. Evaluations can be one of two things: value computations, which work out the result of an expression, and side effects, which are modifications of objects. When neither of two evaluations is sequenced before the other, they are unsequenced (we cannot say which will occur first, and they may even overlap).

:::info
Evaluations are divided into value computations and side effects. In the expression `i++`, obtaining the value of `i++` is the value computation; the side effect is the modification of `i` itself, that is, incrementing `i` by 1.
:::

Informally, the basic sequencing rules are as follows:

* The value computations of an operator's operands are sequenced before the value computation of its result (otherwise, how would we compute the result?). Note, however, that the side effects of the operands are not necessarily sequenced before it.
* In general, the evaluations of an operator's operands are unsequenced with respect to each other. For example, in `x + y`, the evaluations of `x` and `y` are unsequenced.
* For `&&`, `||`, and `,`, however, evaluation of the left operand is sequenced before evaluation of the right operand.
* The value computation of postfix `++` and `--` is sequenced before their side effects.
* The side effects of prefix `++` and `--` are sequenced before their value computation.
* The value computations of the operands of any assignment operator (`=`, `+=`, `-=`, etc.) are sequenced before the side effect of the assignment, which is itself sequenced before the value computation of the assignment expression.

The rules that define when the sequenced-before relationship exists between two evaluations inherently form a directed acyclic graph. As an example, let's consider the earlier line of code once again:

`i = i++`

Let's first split it up into its subexpressions. The full expression is the assignment, which has two operands. The left operand is just `i`, while the right operand is `i++`, which itself has `i` as an operand. The following tree represents this structure, where the arrows represent the sequenced-before relationship of the value computations of those subexpressions:

![](https://hackmd.io/_uploads/HkIQ5yLEh.png)
The subexpression tree of `i = i++`.

:::info
The whole expression can be split into an assignment and an increment. Because both of them change the value of the object `i`, both have side effects. By the assignment rule, the value computations of the assignment's operands come before the assignment's side effect, but how the other side effect (the increment) is ordered relative to it is not determined.
:::

Both the assignment and the increment have side effects (i.e. they modify the value of an object). We know how they are sequenced with respect to the value computations by looking at the above sequencing rules. The assignment comes after the value computation of its operands and before its own value computation. The increment comes after its value computation, but is not sequenced before the assignment.
If we add them as red nodes, we have:

![](https://hackmd.io/_uploads/S1hE9y8Nh.png)
The almost-complete sequenced-before graph for `i = i++`.

We can read this graph as flowing chronologically upwards.

For the purposes of determining undefined behavior, we also need to identify any value computations that use the value of an object. This is quite a subtle point: a value computation uses the value of an object only when an expression denoting an object (like `i`) is used where an rvalue is expected. When such an expression is used where an lvalue is expected instead, as in the left operand of `=`, its value is not important.

For those who want to read more, look up value categories. The left operand of `=` is an lvalue, which means that we don't care about its value. Lvalue-to-rvalue conversion can be thought of as reading the value from an object.

If we highlight value computations that use the value of objects in blue, we get:

![](https://hackmd.io/_uploads/Sy_B51LEh.png)
An annotated sequenced-before graph for `i = i++`.

Notice the interesting placement of the `i` increment: nothing else is sequenced after it, so it could even occur after the assignment. Herein lies the undefined behaviour. Depending on whether the increment occurs before or after the assignment, we could get different results.

To determine whether an expression might be undefined, we can simply look at its graph. Two evaluations are unsequenced if they lie in different branches, that is, if there is no directed path between them.
The standard states that we have undefined behaviour if we have two unsequenced side effects on the same scalar object, or if we have a side effect on a scalar object and a value computation using the value of the same object that are unsequenced. In the above graph, I have connected the problematic pair of evaluations (two unsequenced side effects) with a dotted line.

As you can see, this gives a great way to visualize the sequencing rules as applied to a particular expression and makes it much easier to see why certain expressions might result in undefined behaviour.

Let's take a look at some more examples of both undefined and well-defined expressions:

`i = ++i`

![](https://hackmd.io/_uploads/ryWIvMLVn.png)
A sequenced-before graph for `i = ++i`.

It is interesting to see that by merely switching the postfix increment to a prefix increment, this expression has become well-defined. That's because the increment of `i` has to occur before the value computation of the increment expression and, therefore, before the assignment to `i`.

Before `C++11`, this was actually considered undefined, despite having only one possible result.

`i = i++ + i++`

![](https://hackmd.io/_uploads/HJUPDGIE2.png)
A sequenced-before graph for `i = i++ + i++`.

This is a slightly more complex adaptation of the expression we first looked at. It seems to be quite a popular example. It has undefined behaviour for the same reasons, but also exhibits unsequenced side effects and value computations between the two increments. In the same way, `i++ + i++` alone is undefined.

`i = v[i++]`

![](https://hackmd.io/_uploads/HknwvG8N3.png)
A sequenced-before graph for `i = v[i++]`.

This one tends to be hard for many to grasp, although its problem is exactly the same as in the previous examples. I suspect that most people think of `i++` as an operation that "returns the value of `i`, then increments `i`." The problem is that the increment can happen at any point after the value computation of `i++`, and might happen after the assignment to `i`. The fact that we're using it as a subscript doesn't change anything.

`i++ || i++`

![](https://hackmd.io/_uploads/HJMuwGLEh.png)
A sequenced-before graph for `i++ || i++`.

This demonstrates the ability of some operators (namely `||`, `&&`, and `,`) to enforce sequencing between their operands. The standard states that every value computation and side effect associated with the left operand is sequenced before those of the right operand.

`f(i = 1, i = 2)`

![](https://hackmd.io/_uploads/B1d_wfUVh.png)
A sequenced-before graph for `f(i = 1, i = 2)`.

Here we're calling a function and in each argument we're assigning a different value to `i`. The evaluations of function arguments are also unsequenced, so we cannot say what the value of `i` will be after this expression has been evaluated.

It's worth noting that function calls are always indeterminately sequenced. This means that one always occurs before the other, but we cannot say which way around. This is to prevent function calls from interleaving. However, the undefined behaviour rule only applies to unsequenced evaluations. Therefore, if you have two function calls in different branches that modify the same scalar object, you don't have undefined behaviour; instead you have unspecified behaviour.

Also remember that overloaded operators are actually treated as function calls. That is, if you want to know how `x + y` is sequenced, and `operator+` is overloaded for `x`, you need to treat the expression as `x.operator+(y)` or `operator+(x, y)`, whichever is defined.

Finally, it is not always enough to look at the identifiers in expressions. Previously, we noted that an object `i` was modified in two parts of an expression. However, it's possible to have two different identifiers, say `i` and `j`, that either refer to the same scalar object or are more complex data types that have a shared scalar object that they modify. Keep this in mind.

Next time you come across a complex expression (which is hopefully not very often), you now have a great way to analyse the sequencing of evaluations and determine the possible execution paths of that expression. Perhaps this might reduce the number of questions on Stack Overflow about this topic, but if not, at least it gives a nice visual way to answer them.