# Response

We thank the reviewers for the thorough and valuable feedback, corrections, and suggestions. We have extensively fixed the problems and adopted the suggestions reported by the reviewers. We have also included additional explanations and illustrations.

## Main changes

- We compressed and rewrote parts of Sections 2 and 3 for readability and conciseness.
- In different sections (Sections 6 and 10), we re-emphasize that since we adopt the design decision that the unknown type represents any type, $?$ can stand for a type variable $X$. This decision is crucial to understanding the parametric reasoning of GSF: it allows a function of type $\forall X. ? \rightarrow X$ to behave as any function whose type belongs to the concretization of $\forall X. ? \rightarrow X$, including $\forall X. X \rightarrow X$ (the type of the polymorphic identity function), $\forall X. \mathit{Int} \rightarrow X$ (a function that always fails when applied), and many others.
- We restructured and improved the explanation of designing evidence for parametricity in Section 8.
- In Section 9, we added some missing definitions and improved the motivation of the design of the term precision relation, re-emphasizing our goal when designing $\leqslant$: to obtain as large as possible a subrelation of the natural notion of precision for which the DGG is satisfied.
- In Section 10, we addressed the problems reported by Reviewer 1 regarding the use of the terminology *synchronized* and the references to the compositionality lemma in previous work. We also improved the description of the logical relations of Ahmed et al. [2017], especially regarding the distinction between casts and conversions.

Additionally, we have updated the paper with the corrections and suggestions indicated by the reviewers. For convenience, in addition to the revised paper, we also upload a version where all changes and additions are in purple. This should help reviewers quickly spot the new material. We respond in detail to all the reviewer comments below.

### Reviewer 1

#### Review of 1st version

> 410 "the term (t id) true should evaluate to a runtime type error"
>
> It took me a long time to understand why you make this claim. It would be clearer to write out your intended interpretation in full.
>
>     let t = \Lambda P. (\lambda x: (\all X. P X). x [Int])
>             \all P. (\all X. P X -> P Int)
>     in (t Q id) true
>
> Then for (t Q id) to be well-typed, Q must be Q X = X->X, but then (t Q id) has type Int -> Int, and hence (t Q id) true is ill-typed.
>
> There is no response to the above comment.

Thank you, we added this explanation in the document.

> 1802 "ascribing the System F identity function to ∀X. ? → X yields a function that behaves exactly as the identity function (and hence never fails)"
>
> So parametricity in GSF is different from parametricity in λB? Before you seemed to be claiming they were largely the same. This is an important difference, which perhaps should be brought out earlier.
>
> > We say that our development of parametricity is based on that of λB, but cheap theorems do not only depend on the parametricity logical relation, they also depend on the runtime semantics (weak dynamic gradual guarantee).
>
> I don't see how the response addresses my comment. In a normal system satisfying parametricity, one would expect that functions of type (∀X. A → X) where X does not appear free in A cannot return a value, but that is not the case in GSF for functions of type (∀X. ? → X).
> So parametricity in GSF must differ from ordinary parametricity, or even parametricity in λB of Ahmed et al [2017]. I did not spot where in the paper this particular difference is noted and explained.

- Regarding "X does not appear free in ?": since we adopt the design decision that the unknown type represents any type, $?$ can stand for $X$ as well, i.e. "$X$ is free in $?$" (formally, $X \in C(?)$, i.e. $X$ belongs to the concretization of $?$). Therefore, a function of type $\forall X. ? \rightarrow X$ may behave as any function whose type belongs to the concretization of $\forall X. ? \rightarrow X$, including: $\forall X. X \rightarrow X$ (the type of the polymorphic identity function), $\forall X. \mathsf{Int} \rightarrow X$ (a function that always fails when applied), $\forall X. (X \times X) \rightarrow X$ (a function that, given a pair, returns its first or second projection), $\forall X. (X \times (X \rightarrow X)) \rightarrow X$, etc. To make this clearer, we re-emphasize it in different parts of the text (6.1 Syntax and Syntactic Meaning of Gradual Types, and 10.1 On Gradual Parametricities).
- Regarding "parametricity in GSF and λB": we insist on the idea that the notion of parametricity enjoyed by GSF is based on, and similar to, that of λB. Although some parametric programs behave differently in these languages, the differences stem from the dynamic semantics, not from the definition of the parametricity logical relations. For instance, we could create a language that always fails upon type instantiation; it would preserve a "stronger" notion of parametricity, but this would not be driven by how parametricity is formalized. In Section 10.1, "On Gradual Parametricities", we make this clearer by discussing an example of why GSF's notion of parametricity is weaker than PolyG's when dealing with imprecise types. We also added a note in the contributions emphasizing this.

#### Review of 2nd version

> 201. Somewhere near here, add the observation that precision is covariant rather than contravariant, even for the domain of functions.

Observation added.

> 262. "Harmless imprecise ascriptions." Say more about how this property differs from the gradual guarantee, which seems to also specify that making ascriptions less precise has no effect. In particular, if t : A then we should have t = (t :: A) = (t :: A :: A) = (t :: B :: A) for B less precise than A. Or is the point that "Harmless imprecise ascription" is a weaker property than the gradual guarantee?

Yes, "Harmless Imprecise Ascription" is a weaker property than the DGG, since it is implied by the DGG. The weak DGG satisfied by GSF, in particular, implies that imprecise ascriptions are harmless. We clarified this in the text.

> 271. "Consider the STLC term t = λx : _.x" This is not a term of STLC, where each bound variable must be labelled with a simple type. I would rephrase this development, perhaps by noting that the System F identity function (rather than λx : _.x) can be mimicked in Gradual STLC by (λx : ?.x).

Thank you, we reformulated the sentence.

> 332 "that has been considered" ??? maybe "that has to be reconsidered"?

Fixed.

> 418 "∀X .X → X and ∀X .X → ? are inconsistent with each other" Not in Ahmed et al [2009]!

Here we are referring to Igarashi et al. [2017a] (System FG). We clarify this in the text now.

> 421 "(weak) dynamic gradual guarantee" This is the first mention of the weak dynamic gradual guarantee. Omit "(weak)", or define.

We added text explaining the meaning of the (weak) gradual guarantee, which is based on a more restrictive notion of precision than the natural one (the restricted precision implies the natural precision).

> 439 "which are cast calculus" --> "which are cast calculi"

Fixed.

> 445 "Said differently" That says the two sentences have the same content expressed in different ways. But to me they have quite different content: the first is about modularity, an essential desideratum for any programming language; the second is about syntactic particulars of System F. Much as we all like System F its syntax is not necessarily sacrosanct, but modularity should be. (Your underlying assumption is that the syntax of System F supports modularity, but you need to make that explicit.)

Thank you, we now clarify that we are referring to modularity.

> 466 Same comments as previously regarding _.

Fixed.

> 526 "we annotate with a "+" when an embedding has been proven for a language more expressive than the untyped lambda calculus" I don't know which more expressive language you mean; please clarify.

For the embedding of a dynamic language (ED), we annotate with a "+" when the dynamic language used to state and prove ED is not the untyped lambda calculus, but a richer one, such as the untyped lambda calculus with sealing primitives. We clarify this in the text.

> 684 "in the scenario above, the runtime system automatically wraps a type abstraction around h instead of failing at the application." Introduce a phrase such as "we conjecture that" to clarify that this is work in progress.

Clarified.

> 843. Shouldn't you now state as a theorem that your definition of join matches the usual definition? "Theorem. $G \sqcup G' \sqsubseteq G'' \iff G \sqsubseteq G'' \land G' \sqsubseteq G''$."

Here we are defining the meet (not join), for which the proposed lemma is not true. For instance, $\mathsf{Int} \sqcap {?} \sqsubseteq \mathsf{Int}$ but $? \not\sqsubseteq \mathsf{Int}$. We added the inductive definition for the meet, and a lemma that establishes the equivalence between the inductive definition and the definition derived by AGT.
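
For concreteness, here is a minimal sketch of the inductive meet on the base/function/universal fragment, assuming the standard AGT-style definition (the figure added to the paper may include further cases, e.g. pairs and type variables). Writing $B$ for base types:

$$B \sqcap B = B \qquad G \sqcap {?} = G \qquad {?} \sqcap G = G \qquad (G_1 \rightarrow G_2) \sqcap (G_1' \rightarrow G_2') = (G_1 \sqcap G_1') \rightarrow (G_2 \sqcap G_2') \qquad \forall X.G \sqcap \forall X.G' = \forall X.(G \sqcap G')$$

and $G \sqcap G'$ is undefined in every other case.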

> 947 "A term t is more precise than a term t′ if they both have the same structure and t is more precisely annotated than t." You need to say more here. I guess it is the case that
>
>     t $\sqsubseteq$ t'    G $\sqsubseteq$ G'
>     ----------------------------------------
>     (t :: G) $\sqsubseteq$ (t' :: G')
>
> But is the following the case?
>
>     ----------------------
>     t $\sqsubseteq$ (t :: ?)
>
> By my understanding of the gradual guarantee it ought to be allowed, but it is not clear to me whether or not your definition intends to cover this case.

[IMPORTANT] This work focuses on a syntactic definition of term precision [Siek et al. 2015], rather than a semantic definition [New et al. 2020]. In particular, t' = t :: ? is not a variant of t with fewer type annotations. Therefore, our syntactic term precision relation does not cover t ⊑ t :: ?, since the two terms do not have the same syntactic structure; but it does include t :: G ⊑ t :: ?, where ⊢ t : G. Analogously, in a language without ascriptions, there is no rule that relates t and (λx:?.x) t in syntactic precision. As a final note, the definition of precision that we use is not ad hoc, because it is driven systematically by AGT (from the definition of concretization, which captures the intended design).

> 983 "we use the dark blue color for evidence ε" Nice idea, but as a colour-blind person, I can barely distinguish the dark blue from black. Please see Marco Patrignani's paper (https://arxiv.org/pdf/2001.11334.pdf) which provides guidelines on using highlighting that is useful even to colour-blind people.

Thank you! We changed the evidence's color to RoyalBlue, as suggested by the cited paper.

> 1086 "Because the translation from GSF to GSFε introduces explicit ascriptions everywhere consistency is used" Give a forward pointer to Section 8.3 before this statement. Possibly rewrite statement to be in future tense, to reflect that the translation has not been seen yet. Or move the translation earlier.

Thank you, forward reference added to the text.

> 1112-1124 The rule mentions \hat\alpha and uses the notations t[\hat\alpha/X] and lift_{\Xi'}(\alpha), none of which are explained in the following description. You need to mention them and give forward pointers to their definition, at least.

Done.

> 1126 "ε[αˆ] \fatsemi ε_out = ε[αˆ,Gˆ], where ε[αˆ,Gˆ′] is the result of ..." It would be clearer to omit the = and the remainder of the sentence. The reader has seen \fatsemi, so explaining why you don't use it is a good idea. But introducing ε[αˆ,Gˆ] and then explaining why you don't use it only confuses.

Fixed.

> 1134 "in details" --> "in detail"

Fixed.

> 1143 Explain early on that you have two different relations, "~>_v" and "~>", and how they relate.

Added explanations.

> 1183 In addition to t ~> t : G, mention the relation v ~>_v t : G.

We do not use the relation $v \leadsto_{v} t : G$ but $v \leadsto_{v} u$, where $u$ is not contained in $t$. It is important to note that the relations $t \leadsto t : G$ and $v \leadsto_{v} u : G$ are mutually defined. Also, observe that a value $v$ is also a term $t$ in the source language GSF; therefore, $t \leadsto t : G$ also covers $v \leadsto t : G$.

> 1226 "The third property is key for establishing the dynamic gradual guarantee" --> "Third, it is monotonic; this property is key for establishing the dynamic gradual guarantee".

Changed, thanks.

> 1232 "Monotonicity. If ε1 ⊑ ε1' and ε2 ⊑ ε2' and ε1;ε2 is defined, then ε1;ε2 ⊑ ε1';ε2'." --> "Monotonicity. If ε1 ⊑ ε1' and ε2 ⊑ ε2' and ε1;ε2 is defined, then ε1';ε2' is defined and ε1;ε2 ⊑ ε1';ε2'." (Or did you mean something different?)

Yes, we mean: "If ε1 ⊑ ε1' and ε2 ⊑ ε2' and ε1;ε2 is defined, then ε1';ε2' is defined and ε1;ε2 ⊑ ε1';ε2'". We changed it.

> 1233 "following the same approach" Which approach?

The systematic AGT methodology to derive consistent transitivity. We clarified this in the text.

> 1244 Σ appears in this line for the first time in the definition. Was it meant to be Ξ? Or how do Σ and Ξ relate?

Σ is the static counterpart of Ξ, i.e. Σ is a mapping from type names to static types, and it is contained in the concretization of Ξ. We added this explanation and the formal definition of concretization for Ξ in the text.
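
As a sketch of what this concretization presumably looks like (our pointwise reconstruction; the definition added to the paper may differ in its details):

$$\Sigma \in C(\Xi) \iff \mathit{dom}(\Sigma) = \mathit{dom}(\Xi) \ \text{ and } \ \Sigma(\alpha) \in C(\Xi(\alpha)) \ \text{ for every } \alpha \in \mathit{dom}(\Xi)$$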

> 1245-1270 I could not follow this discussion. See below to get some understanding of my issues with this paragraph. I suggest you might put 1275-1294 before the discussion at 1252-1270, as setting out the computations systematically as a table provides a solid outline on which to hang the discussion.

We reworked the explanation of that part following your suggestions.

> 1275-1294 Setting out the reduction systematically as a table in this way is a great idea! However, I find it impossible to follow the details. The rule for the first reduction appears at 1112, and refers to several concepts (lift, \hat\alpha, \epsilon[\hat\alpha], \epsilon_{out}) that have not yet been defined. So how is one supposed to understand this reduction? In particular, we seem to have that <\alpha->Int,Int->Int> in line 1277 is the value of \epsilon_out, but \epsilon_out has not yet been defined. This is especially confusing as until this point all evidence has had the form <G,G>, and one would like to see an intuitive explanation of how evidence not of that form arises. The text points out when evidence <Int,\alpha> first appears, but doesn't say where it comes from or provide any intuition of what it means. More details of the computation that yields <Int,\alpha> on line 1278 would also be welcome. In short, I could not follow the end of Section 8.1, which provides crucial motivation for 8.2. All of this needs to be done differently.

[IMPORTANT] We have reformulated the explanation of that part. We added an explanation of why evidence is represented as a pair of types, and an informal definition of $\epsilon_{out}$: if $\epsilon_{out} \vdash G[\alpha/X] \sim G[G'/X]$, then $\epsilon_{out} = \langle G[\alpha/X], G[G'/X] \rangle$. For instance, if a type abstraction with type $\forall X. X \rightarrow X$ is applied to $\mathsf{Int}$, then $\epsilon_{out}$ is computed as $\langle (X \rightarrow X)[\alpha/X], (X \rightarrow X)[\mathsf{Int}/X]\rangle = \langle \alpha \rightarrow \alpha, \mathsf{Int}\rightarrow\mathsf{Int}\rangle$: it justifies that $(X \rightarrow X)[\alpha/X] = \alpha \rightarrow \alpha$ is consistent with $(X \rightarrow X)[\mathsf{Int}/X] = \mathsf{Int} \rightarrow \mathsf{Int}$. An evidence such as $\langle\mathsf{Int}, \alpha \rangle$ can be obtained from the domain information of $\epsilon_{out}$ ($dom(\epsilon_{out})$) and justifies that $\mathsf{Int}$ is consistent with $\alpha$ (under an appropriate environment).

> 1275-1294 Colouring is inconsistent in the two parts. Evidence is uniformly purple in the first part, but in the second part epsilon is purple and other occurrences of evidence are blue. Please be consistent throughout!

Fixed.

> 1316 "a type name is enriched with its transitive bindings in the store" What is a "transitive binding"?

We changed the text: "...a type name is enriched recursively with the type that is instantiated in the store". For instance, if $\alpha \mapsto \beta$ and $\beta \mapsto \mathsf{Int}$, then $\alpha$ is enriched as $\alpha^{\beta^{\mathsf{Int}}}$.

> 1317 lift_\Xi is defined in terms of lift_\Xi, which appears to be a non-terminating recursion. I guess it terminates because lift_\Xi on something other than \alpha has a definition that is not recursive. It would be better to give the full definition of lift at this point, rather than just one part of the definition. Aha! I see two sentences later you point to the remainder of the definition. Please point to the full definition *first*, and *then* explain the key part, not the other way around.

Fixed.

> 1318 Function unlift_\Xi appears to ignore \Xi, so why isn't it written simply unlift without the subscript? (I see it is written without the subscript in Figure 6.)

The subscript is not needed and we removed it, thanks.

> 1324-1327 Typo. Three instances of E should be G.

Fixed.

> 1337/1342 In the first occurrence of \epsilon[\alpha^E/X], the \epsilon is purple but the rest is blue, while the second occurrence is entirely purple. Please be consistent in your use of colour.

Fixed.

> 1404 Figure 8 (and surrounding discussion). One of the strengths of AGT is that important operations can be calculated from abstraction and concretion functions. But here an inductive definition is pulled from thin air, which doesn't seem entirely in line with the AGT philosophy. That is not a fatal weakness, but perhaps it is a weakness worth observing (or defending as not a weakness, if you have a defense in mind).

Normally, when using AGT, an algorithmic/inductive definition for operators such as consistent transitivity is proposed, and then one has to prove that these algorithmic definitions are equivalent to their AI-based definitions. In 8.1 we show that the formal definition of consistent transitivity given by AGT breaks parametricity. To ensure parametricity, we not only had to refine evidence, but also to customize consistent transitivity. We add an explicit note about this shortcoming of AGT at the beginning of Section 8. We also explain that most of the consistent transitivity rules are equivalent to the formal definition of consistent transitivity given by AGT, except for (idL), which always preserves the left evidence instead.
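
To give a concrete feel for meet-based evidence combination, here is a minimal executable sketch, under heavy simplifying assumptions of ours: no type names, sealing, or polymorphism, so it follows the plain AI-based definition (where both evidence components collapse to a single meet) rather than the paper's refined inductive rules such as (idL).

```haskell
-- Gradual types for the base/function fragment only.
data GType = Unk | TInt | TBool | TFun GType GType
  deriving (Eq, Show)

-- Precision meet: the most precise common refinement of two gradual
-- types, or Nothing when no such refinement exists.
meet :: GType -> GType -> Maybe GType
meet Unk g = Just g
meet g Unk = Just g
meet (TFun d1 c1) (TFun d2 c2) = TFun <$> meet d1 d2 <*> meet c1 c2
meet g1 g2
  | g1 == g2  = Just g1
  | otherwise = Nothing

-- Evidence for a consistency judgment is a pair of evidence types.
type Evidence = (GType, GType)

-- Consistent transitivity: combine evidence for G1 ~ G2 with evidence
-- for G2 ~ G3. The shared middle types are met, the gained precision is
-- propagated outward, and the combination fails (a runtime error) when
-- the accumulated information is contradictory.
trans :: Evidence -> Evidence -> Maybe Evidence
trans (t1, t2) (t3, t4) = do
  m  <- meet t2 t3            -- refine the shared middle type
  m' <- meet t1 m >>= meet t4 -- push the refinement to both endpoints
  pure (m', m')

-- trans (Unk, Unk) (TInt, TInt)  == Just (TInt, TInt) -- precision flows outward
-- trans (TInt, Unk) (Unk, TBool) == Nothing           -- contradiction: error
```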

> 1370 "Also, there are symmetric variants for some rules—such as (idR) and (sealR)—in which every evidence and every evidence type is swapped." --> "Also, there are symmetric variants for some rules—such as (idR) and (sealR)—in which the left and right components of each piece of evidence are swapped."

Changed, thanks.

> 1418 "the imprecision required for parametricity." I don't understand why imprecision helps preserve parametricity. Please explain in more detail.

Actually, it is a technical requirement for one of the main lemmas in the parametricity proof. In particular, $\epsilon_{out}$ is defined in such a way that it makes sealing and unsealing possible, and at the same time it does not introduce runtime errors when combined. Perhaps the proof could have been done differently by defining this evidence in another way, but the current definition makes the proof relatively straightforward. We rephrased the text.

> 1420 This line refers to several free variables, \epsilon, \alpha, G', \Xi, \Xi', that are bound in rule (RappG). Strictly speaking, what you have done is not formally correct; as well as requiring the poor reader to remember an awful lot of detail from rule (RappG). Suggest you copy rule (RappG) here, with the definition of \epsilon_out now written out in full, providing the appropriate bindings for the free variables.

Suggestion added.

> 1420 Here E_* is computed from \pi_2(\epsilon), that is, the right part of \epsilon, and then used on *both* the left and the right. Why is the right part propagated to the left? Why not take left to left and right to right, or even propagate the left part to the right instead? Intuition about why this definition looks as it does is still sorely lacking.

[IMPORTANT] $E_*$ is obtained using the information of the second component of evidence $\epsilon$, and not the information of the first component. This is because $\epsilon$ justifies that $\forall X.G''$ is consistent with $\forall X.G$, where $\forall X.G''$ is the intrinsic type of the type abstraction and $\forall X.G$ is the ascribed type. Therefore, evidence $\epsilon[\hat{\alpha}]$ (the result of instantiating $\epsilon$ with $\alpha$) justifies that $G''[\alpha/X]$ is consistent with $G[\alpha/X]$, where the right (resp. left) component of $\epsilon[\hat{\alpha}]$ corresponds to the most precise information about $G[\alpha/X]$ (resp. $G''[\alpha/X]$). As $\epsilon_{out}$ must justify that $G[\alpha/X]$ is consistent with $G[G'/X]$, we use only the information of the second component of $\epsilon$ (which corresponds to the most precise information about $G$). To illustrate this, consider $\epsilon = \langle \forall X. X \rightarrow \mathsf{Bool}, \forall X. X \rightarrow \beta^{\mathsf{Bool}} \rangle$ and $\epsilon[\hat{\alpha}] = \langle \alpha^{\mathsf{Int}} \rightarrow \mathsf{Bool}, \alpha^{\mathsf{Int}} \rightarrow \beta^{\mathsf{Bool}} \rangle$; then $\epsilon_{out} = \langle \alpha^{\mathsf{Int}} \rightarrow \beta^{\mathsf{Bool}}, \mathsf{Int} \rightarrow \beta^{\mathsf{Bool}} \rangle$, and $\epsilon[\hat{\alpha}] \,\mathsf{o}\, \epsilon_{out} = \epsilon_{out}$. If $\epsilon_{out}$ had been constructed using the first component, then $\epsilon_{out} = \langle \alpha^{\mathsf{Int}} \rightarrow \mathsf{Bool}, \mathsf{Int} \rightarrow \beta^{\mathsf{Bool}} \rangle$, but $\epsilon[\hat{\alpha}] \,\mathsf{o}\, \epsilon_{out}$ would not be defined. We explain this now in the text.

> 1430 The example suggests that the extra lift and unlift operations *reduce* the precision of the evidence. Is that correct? Why is it important to do this? Usually, the point of evidence is to carry around at run-time information that is *more* precise than the static gradual type information. Why here is it important to return something that is *less* precise?

This technical relaxation is used to prove parametricity (specifically, to prove Lemma 10.4 (Compositionality)). It is important to note that although evidence $\epsilon_{out}$ can lose precision with respect to $\epsilon$, this local loss does not affect the precision of the entire program, since evidence $\epsilon[\hat{\alpha}]$ maintains the precision achieved so far. The role of the outer evidence $\epsilon_{out}$ is just to seal/unseal and not to introduce new runtime errors; any possible runtime errors must come from the inner evidence $\epsilon[\hat{\alpha}]$. We added an explanation in the document.

> 1441 I guess here you intend \Xi = \alpha := Int. You must make that explicit!

Fixed.

> 1456-1463 This paragraph appears to be about a detail of an example, but actually it is conveying a key insight about the entire design. This insight needs to be brought out earlier, and supplied with additional intuition and justification. In particular, it means that we have
>
>     ((\Lambda X. \lambda x:X. x) :: \all X. X -> X) [Int] 1 --> 1
>
> as we would expect, and hence to satisfy a form of the gradual guarantee the definitions also give us
>
>     ((\Lambda X. \lambda x:X. x) :: \all X. X -> ?) [Int] 1 --> 1
>
> But we also have
>
>     ((\Lambda X. \lambda x:X. x::?) :: \all X. X -> ?) [Int] 1 --> error
>
> which appears to violate the gradual guarantee. Why did you choose that replacing \all X. X -> X by \all X. X -> ? doesn't result in an error, but also choose that replacing x by x::? does result in an error? In particular, it is surprising for you to choose that replacing \all X. X -> X by \all X. X -> ? doesn't result in an error, since by parametricity one might expect \all X. X -> ? to only contain functions where the result does not depend on the input. These choices require much more discussion. Further, that discussion belongs in the introduction of the paper, not hidden in the last paragraph of a technical section.

[IMPORTANT]

- It was not a design decision. These behaviors are a consequence of how evidence is combined, as driven by the AGT methodology laid out by Garcia et al. [2016]. In particular, the evidence keeps the most precise information about a judgment at each reduction step through consistent transitivity. For example, consider the following program: $((\Lambda X. \lambda x{:}X. x) :: \forall X. ? \rightarrow X) [\mathsf{Int}]\ 1$. The type abstraction has type $\forall X. X \rightarrow X$, and it is ascribed to $\forall X. ? \rightarrow X$.
Therefore, the result of the evidence combination will retain the more precise information between those types, which is $\forall X. X \rightarrow X$. As a result, the function $((\Lambda X. \lambda x{:}X. x) :: \forall X. ? \rightarrow X)$ behaves like the static identity function (of type $\forall X. X \rightarrow X$), returning 1. However, consider the following program: $((\Lambda X. \lambda x{:}?. x :: X) :: \forall X. ? \rightarrow X) [\mathsf{Int}]\ 1$. Here the type abstraction has type $\forall X. ? \rightarrow X$, which is also the ascribed type, so the evidence combination retains $\forall X. ? \rightarrow X$. As a result, the function $((\Lambda X. \lambda x{:}?. x :: X) :: \forall X. ? \rightarrow X)$ behaves as an imprecise parametric function that does not seal the argument, throwing an error (since the type information $\forall X. ? \rightarrow X$ drives the dynamic semantics when a type abstraction is applied). Note that these two programs illustrate the violation of the DGG.
- We insist that the notion of parametricity in the presence of imprecise types is open to debate in the community (of course, Reynolds does not consider imprecise types). In the case of GSF, we adopt the view that the unknown type represents any type, including any type variable ($X$), as specified by the definition of concretization (following AGT). Therefore, a function of type $\forall X. X \rightarrow ?$ can behave like the identity function of type $\forall X. X \rightarrow X$ if, at runtime, its evidence evolves into something as fully precise as the identity function.
- In the paper's contributions, the limitations of GSF concerning the DGG are mentioned. In Section 4, where we present how GSF behaves with examples, we added a paragraph and an example regarding the violation of the DGG, and a reference to Section 9, where we explain in detail the reason for this violation (after having presented the dynamic semantics of GSF in detail).
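
In summary, the pair of programs from the first point above differs only in an internal annotation, yet the observable behavior changes:

$$((\Lambda X. \lambda x{:}X.\ x) :: \forall X. ? \rightarrow X)\ [\mathsf{Int}]\ 1 \;\longmapsto^{*}\; 1 \qquad\qquad ((\Lambda X. \lambda x{:}?.\ x :: X) :: \forall X. ? \rightarrow X)\ [\mathsf{Int}]\ 1 \;\longmapsto^{*}\; \mathsf{error}$$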

> 1512 Please add a sentence before the proposition to explain informally why it holds.

It is simple: since two successive ascriptions are collapsed via consistent transitivity, violation of monotonicity for consistent transitivity immediately implies violation of monotonicity for ascription (and hence violation of the DGG). Suggestion and example added, thanks.

> 1525 Again, please add a sentence before the proposition to explain informally why it holds.

Again, it is simple: since type application makes use of instance instantiation, violation of monotonicity for the latter implies violation of monotonicity for the former (and hence violation of the DGG). Suggestion added, thanks.

> 1583 "because it is more precise than ΛX .λx : Int.x + 1" → "because it is less precise than ΛX .λx : Int.x + 1".

Fixed, thanks.

> 1605 Case ( C ) is surprising to me. You seem to be saying that extrinsic *loss* of precision is not harmful, but extrinsic *increase* of precision may be helpful. Please explain why you make this choice.

[IMPORTANT] As we explained in point 1456-1463, it was not a design decision. These behaviors are a consequence of how the dynamic semantics is derived using the AGT methodology. Evidence is the most important piece of information in GSF, and it dictates how a program behaves. The evidence keeps the most precise information about a judgment at each reduction step through consistent transitivity [Garcia et al. 2016]. For this reason, in example C, although the two ascribed type abstractions have different precision internally, they behave like the static identity function, since the external ascription refines the evidence to be that of the identity function.

> 1620 If I understand correctly, the rule
>
>     G1 -> G2 \leq ? -> ?
>     --------------------
>     G1 -> G2 \leq ?
>
> could just as well be replaced by
>
>     G \leq ? -> ?
>     -------------
>     G \leq ?
>
> (And similarly for the rule for evidence types.) Is that correct? If so, why do you pick the formulation you use?

Yes. Both rules are equivalent, but we think that the one proposed in the paper is more syntax-directed.

> 1644 Why? I doubt it greatly increases brevity or clarity, but it does leave some points for the reader to fill in (perhaps incorrectly).

We added the rules related to pairs and operators throughout this section.

> 1667 "Ω ⊢ Ξ ⩽ Ξ ▷ s:G ⩽ Ξ_2 ▷ s:G" --> "Ω ⊢ Ξ ▷ s:G ⩽ Ξ ▷ s:G"

Fixed, thanks.

> 1686-1688 Shouldn't the t's in rule (⩽Mascε) be u's?

No. We need it to be a "t" term in rule (⩽Mascε), and it cannot be a u value. When it is a u value, the rule to use is (⩽ascε), requiring that the evidence be related by strict precision (⩽). The rule (⩽Mascε) is used to account for GSFε terms resulting from the elaboration from GSF. Notice that rule (⩽Mascε) is key to satisfying example B (accepted by the DGG that uses the natural notion of precision $\sqsubseteq$). This was one of our goals when designing $\leqslant$: to design a precision relation as large as possible, for which the DGG is satisfied.

> 1701 If I read your informal description properly, the following is implied: If Ω ⊢ Ξ_1 ▷ s_1:G_1 ⩽ Ξ_2 ▷ s_2:G_2 then Ξ_1 ⩽ Ξ_2 and G_1 ⩽ G_2. If this is the intention, please state it explicitly. I think it does not hold for your current rules without additional restrictions on Ξ_1 and Ξ_2. On second thought, it looks like G_1 ⩽ G_2 doesn't always hold, sometimes one has the weaker G_1 ⊑ G_2, so additional discussion is required.

[IMPORTANT] We modified (⩽$b_{\epsilon}$) and (⩽$x_{\epsilon}$) such that Ω ⊢ Ξ_1 ▷ s_1:G_1 ⩽ Ξ_2 ▷ s_2:G_2 now implies Ξ_1 ⩽ Ξ_2. The other implication (Ω ⊢ Ξ_1 ▷ s_1:G_1 ⩽ Ξ_2 ▷ s_2:G_2 implies G_1 ⩽ G_2) is not true, since sometimes we have G_1 ⩽ G_2 and sometimes G_1 ⊑ G_2. Actually, in this judgment it is not important whether G_1 ⩽ G_2 or G_1 ⊑ G_2, since the type annotations (in functions and ascriptions) have no impact on the dynamic semantics (as we explain in point 1709); evidences play the main role. In all rules, except rule (⩽Mascε) (where it is more relaxed), the evidences must be related by the strict precision (⩽). We rewrote the text to clarify this.

> 1705 "Furthermore, using ⩽ on Ω makes the precision relation on terms overly conservative rejecting example (C) above." Does this mean that you could replace \sqleq by \leq throughout in the definition of \leq if you gave up on (C) holding? If so, perhaps it is worthwhile to make this clear. Indeed, it may be worth giving up on (C) altogether and giving a simpler definition of \leq, where \sqleq is never mentioned.

Having ⊑ instead of ⩽ allows us to accept example C and many other examples. Indeed, we could have made the term precision relation simpler by using only strict precision. However, our goal in this section was to design a precision relation as large as possible. We added a note clarifying this.
> 1709 "except for types that do not influence evidence in the runtime semantics, namely function argument types and ascription types" Why do you say that ascription types don't influence evidence? Ascriptions are where evidence appears! In the source language GSF, these types are crucial because they determine how the evidence are constructed in its elaboration to GSFε. But once the evidences are elaborated, these type annotations are no longer relevant in GSFε because the evidences already have all the necessary information for the reduction. Additionally, we could have not related these types, but we wanted to be closer to the natural notion of term precision. We added a note clarifying this. > 1731 "Note that these evidences are harmless: when combined during reduction they do not contribute to any increase in the precision of other evidences." Which evidences? If you mean \epsilon_1 and \epsilon_2 in rule (⩽Mascε), there appears to be no restriction on where they can be used, so why are they harmless? But if they are not harmless, what justifies the use of ε1 ⊑ ε2 rather than ε1 ⩽ ε2 in rule (⩽Mascε)? [IMPORTANT] Yes, we are referring to evidences ε1 and ε2. We rephrase the sentence: "Note that evidences $\epsilon_1$ and $\epsilon_2$ does not contribute to any increase in precision: when combined with some arbitrary evidence $\epsilon$ during reduction $\epsilon ; \epsilon_i$, either the combination fails or the result of the combination is $\epsilon$." This happens for two reasons. First, these evidences are created from the Interior function (ε1 = Interior_Ξ1(G1,G1) and ε2 = Interior_Ξ2(G2,G2)), i.e., they are evidences created during the elaboration from GSF into GSFε, so they are the least precise evidences that can justify a judgment between two types. Second, note that types G1’ and G1 (G2’ and G2) are not just any types. They have to be related by the type matching relation (G'1 —> G1). Therefore, either G1’ and G1 are the same, or G1’ is the unknown type and G1 is a ground type (the same applies to G2’ and G2). This ground type was selected during the elaboration from GSF to GFS$_\epsilon$ depending on the elimination positions of the terms (type application, function application, etc.), and represent the least precise type for a given constructor. Similarly, in GTLC (from [Garcia et al, 2016], $\langle ? \rightarrow ?, ? \rightarrow ?\rangle$ can be combined with any other function evidence without producing errors. For these reasons, we say that these evidences do not contribute to precision. Given that G1 ⊑ G2, ε1 = Interior_Ξ1(G1,G1) and ε2 = Interior_Ξ2(G2,G2), we know that ε1 ⊑ ε2. Therefore, we can omit ε1 ⊑ ε2 in the rule ⩽Mascε; we put it there for consistency. Note that these evidences are related by ⊑ yielding a more permissive relation, as we saw in the example explained above (related to the point in lines 1686-1688). > At this point, I turned to the supplementary material to check its proof related to (⩽Mascε). The first occurrence of (⩽Mascε) is on p45 which is part of the proof of Proposition 5.26 starting on p42. Shouldn't it be mentioned in the proof of Proposition 5.18 on p37? [IMPORTANT] The proof of Proposition 5.18 follows by induction on the term precision relation in the target language (GSFε) : |- \Xi1▷ t1 : G1 \leq \Xi2 ▷t2:G2. The last case of the proof covered rule ⩽Mascε, but the name of the rule is not mentioned. We fixed this problem and organized the proof better. 

> At this point, I turned to the supplementary material to check its proof related to (⩽Mascε). The first occurrence of (⩽Mascε) is on p45 which is part of the proof of Proposition 5.26 starting on p42. Shouldn't it be mentioned in the proof of Proposition 5.18 on p37?

[IMPORTANT] The proof of Proposition 5.18 follows by induction on the term precision relation in the target language (GSFε): ⊢ Ξ1 ▷ t1 : G1 ⩽ Ξ2 ▷ t2 : G2. The last case of the proof covered rule (⩽Mascε), but the name of the rule was not mentioned. We fixed this problem and organized the proof better.

> Further, I had trouble reading and following the proof in the supplementary material. For instance, Proposition 5.18, p37, begins "If \Xi1 |- t1 \leq \Xi2 |- t2, we know that |- t1 \leq t2 : G1 \leq G2", but I cannot find a definition for |- t1 \leq t2 : G1 \leq G2.

It was an error in the macro; we fixed it. This judgement refers to the term precision relation in the target language (GSFε): $\vdash \Xi_1 \triangleright t_1 : G_1 ⩽ \Xi_2 \triangleright t_2 : G_2$.

> 1762. The quantifiers should not be left implicit. If I understand correctly, line (a) holds for *any* t1' and for *some* t2', and line (b) holds for *any* v1 and for *some* v2.

Changed.

> 1785. The formulation of Lemma 9.7 appears needlessly loose. "Let ⊢ t : G such that t ⇓ v, and let G ⊑ G′, then t :: G′ ⇓ v′ such that ⊢ Ξ ▷ v : G ⩽ Ξ ▷ v′ : G′, for some Ξ." Wouldn't it be better to replace t ⇓ v by t ⇓ Ξ ▷ v, to explain where Ξ comes from?

Changed, thanks.

> 1848 "If ⊢ · ▷ ε1 t1′ :: G1 : G1 ⩽ · ▷ ε2 t2′ :: G2 : G2" How can this be relevant? The original and final rules both involve arbitrary contexts ∆;Γ and Ω, but here we are talking about empty contexts?

[IMPORTANT] We changed the text. The correct form is: "If Ω ⊢ · ▷ ε1 t1′ :: G1 : G1 ⩽ · ▷ ε2 t2′ :: G2 : G2", adding the Ω to the rule, where Ω binds a term variable to a pair of types related by precision. Ω has to be well-formed with respect to Γ1 and Γ2, i.e. Ω ⊢ Γ1 ⊑ Γ2 (now in Figure 11). Observe that we do not require type stores because source terms only exist prior to evaluation, and hence do not contain type names. ∆ is not relevant because it is not needed in the precision relation.

> 1851 "then we require that G1 ⊓ G1 ⩽ G2 ⊓ G2" --> "then we require that G1' ⊓ G1 ⩽ G2' ⊓ G2"

Fixed, thanks.

> 1873 Similar to the comment for 1848.

Fixed.

> 1937 It appears you are restricting atoms to closed terms. Please make this explicit.

We added text clarifying it.

> 1946 "ρR ={X→R1,Y→R2}." Missing ellipses.

Fixed.

> 1967 Here you write "t |→∗ v" where previously you wrote "t |→ v" or "t ⇓ v".

We use these notations as follows:

- $t \Downarrow v$: this notation covers two operations, translating the GSF term $t$ to a GSF$_{\epsilon}$ term $t'$, and then reducing $t'$ to $v$ in zero or more steps.
- $t \mapsto v$: this notation is used for the dynamic semantics of target languages, like GSFε, for which the dynamic semantics is defined directly; $t \mapsto v$ means that $t$ reduces to the value $v$ in one step.
- $t \mapsto^{*} v$: the target term $t$ reduces to the value $v$ in zero or more steps.

> 1996 You should give some intuition here to relate step indexes to number of reduction steps.

We added an explanation about this in the text.

> 2008-2011 "then as ... (λx :α.x): α→α ⇒ Int→Int, according to the classic definition of parametricity" I would think "classic definition of parametricity" refers to Reynolds, but Reynolds has no construction such as α→α ⇒ Int→Int. More explanation is required here to make the discussion self-contained. In particular, you need to explain about casts and conversions in Ahmed et al [2017], and you need to explain that the above is a conversion, labeled with α. It is important to say something about conversions as they are closely related to sealing and unsealing in New et al [2020].

[IMPORTANT]

- We added in the text what we mean by the classic definition of parametricity, i.e. two type abstractions are related if their instantiations to any type are related under an arbitrary relation on the selected types. Therefore, if the instantiations reduce to values, these values must be related.
- We replaced casts with conversions throughout the text, as suggested, along with their meaning: a conversion form is used to make explicit the conversion between a type name and the type it is bound to in the store.
- Note that conversions are introduced in the reduction of type applications, and they are analogous to our evidence $\epsilon_{out}$.
- For the definition of related type abstractions Vρ⟦∀X.G⟧, both logical relations, λB and GSF, take one step of reduction and reason inductively by stripping out the outermost conversion in λB, and the $\epsilon_{out}$ evidence in GSF.
- A conversion cannot introduce runtime errors: its objective is to seal and unseal values. Analogously, evidence $\epsilon_{out}$ in GSF does not introduce runtime errors.
- We also added in the text that conversions are closely related to sealing and unsealing terms in PolyG.

> 2014 You should give some intuition here to relate future worlds to step indexes and reduction. I guess that if terms (t1,t2) are related in some world W and t1 --> t1' and t2 --> t2', then (t1',t2') are related in a world W', where W' is a future world of W?

Yes, this statement is correct. We added in the text the intuition behind future worlds: "A future world intuitively captures how the world changes upon reduction: while the step-index decreases by one at each step of reduction, the store is extended after each type instantiation."

> 2050-2052 "one needs to use a compositionality lemma ... Essentially, such a lemma says ..." The lemma statement resembles the conversion case of the Fundamental Lemma, described on page 39:21 of Ahmed et al [2017]. It doesn't appear to be called a "compositionality" lemma in that paper. ("Compositionality" is mentioned only once in Ahmed et al [2017], as a "technical lemma".)

Thank you, we fixed this. We now refer to the conversion lemma.

> 2053 "It is important to notice that to apply this lemma, α must be synchronized in W" There is no such constraint on W in the corresponding result on page 39:21 of Ahmed et al [2017], and the word "synchronized" does not appear in that paper. There are two things here that need to be fixed. First, you need to explain the concept of synchronization in more detail, as I see it is important later in the current paper. Explain how synchronised values arise? The whole point of parametricity is that a type may be associated with an arbitrary relation, but here the type is associated with one specific relation. How does that happen and why is this treated as a special case? Second, you must either explain in detail how you think this corresponds to something in Ahmed et al [2017], or you should remove that claim.

[IMPORTANT]

- Indeed, the constraint does not appear on page 39:21, since the lemma is not formally written there. However, the technical report contains the formalization of the lemma (Lemma 6.3, page 29) (https://perma.cc/L74G-6A8W), which includes the constraints mentioned.
- We agree that the terminology *synchronized* does not appear in the literature. We introduce the terminology "α must be synchronized in W" (which we use hereafter in this work) as a shorthand for stating that α must be bound to the same type on both stores (W.Σ1(α) = W.Σ2(α) = T′) and to the value relation of that type (W.κ(α) = Vρ[T']).
We have changed the text to make it clear that this is new terminology, but that the restrictions/constraints themselves have been used before, such as in parametricity for System F [Ahmed 2006] (Lemma C.12, page 93) (https://www.ccs.neu.edu/home/amal/papers/lr-recquant-techrpt.pdf).
- We added in the text an explanation of where and how this constraint is sufficient to demonstrate parametricity in System F (adapted) and λB.

> 2079-2081 "Therefore after type application, and following the definition of the relation for functions, we know that (W ,(λx : Int.x) 1,(λx : Int.x) 2) ∈ T_ρ[[X]]." This is confusing. The previous paragraph made the point that unseal_X, seal_X, and [X = Int] play a vital role in the syntax, but now they are nowhere in sight, even though X is in sight. More explanation is required.

Yes, you are correct, the example is confusing, and instead of helping to understand better, it has the opposite effect :-( We decided to remove it from the document, because the logical relation is not defined on terms of PolyG but on terms of CBPVOSum, which is very complex/verbose to explain.

> 2082 Comparing parametricities. I found this discussion helpful.

Thank you!

> 2112 Here the word "synchronized" appears in conjunction with New et al [2020]. Again, it does not appear in that paper, just as it does not appear in Ahmed et al [2017]. The word requires more explanation.

[IMPORTANT] We removed that part of the text. We now explain better what we call *synchronized* (see above).

> 2143-2146. It is intimidating to see so many concepts defined. Can you focus attention on the main two, Atom_ρ[G] and Atom^=_ρ[G]?

Done.

> 2146 "Atom_ρ[G] to denote a set of terms of the same type after substitution" I misread this as saying that after substitution \rho was applied to G that the two types must be the same. Rewrite.

Yes, after the substitution ρ is applied to G, the two types must be the same. Observe that ρ is a map from type variables to a single type name, not two types; therefore, we do not need to use ρ_1 and ρ_2. For example, if ρ = X $\mapsto$ α, then Atom_ρ[X] represents pairs of values of type ρ(X) = α, i.e. the pairs of values must belong to $\mathrm{Atom}_n[\alpha,\alpha]$. Note that we need $\mathrm{Atom}_n[G_1,G_2]$ because the arbitrary relations R in the logical relation for type abstractions may be sets of pairs of values of arbitrary, possibly different, types.

> 2147 "but restricts the set to values that have, after substitution, equally precise evidences (the equality is after unlifting because two sealed values may be related under different instantiations)" This definition needs more motivation and more explanation. (Why require equally precise evidences? Better motivate the need for unlifting.)

The intuitive reason for equalizing evidences after unlifting is to ensure an error-sensitive view of parametricity, i.e. we need that, for any context, one of the values does not fail more than the other one. Later in this section, when the logical relations for values are presented, we explain this in detail.

> 2191 Atom^var_n[G_1,G_2] appears to be defined but not used.

Atom^var_n[G_1,G_2] is used in the definition of Reln[G1, G2] to restrict the elements of the relation to values.

> 2191 "Atom_ρ[G] = ... Atom [ρ(G),ρ(G)] ..." → "Atom_ρ[G] = ... Atom [ρ_1(G),ρ_2(G)] ..."

In this case, we do not use ρ_1 and ρ_2 because a type variable X is associated with only one type name α in ρ. For example, ρ = X → α applied to X is α (ρ(X) = α).

> 2213-2218 There is no mention nor motivation for the fact that one reduction is for i steps while the other is for an arbitrary number of steps.

We added the following motivational text: "Observe that one reduction takes i steps in the definition of the interpretation of terms, while the other one takes an arbitrary number of steps. This is because only one index is needed for this definition to be well-founded, and it would be challenging to establish the number of steps for the second reduction."

> 2218 "Note that Atom_ρ^=[G] ..." But the term form mentions Atom_ρ[G], not Atom_ρ^=[G].

We moved this sentence to the paragraph related to the logical relations for values.

> 2230 "Recall that world extension is formally defined as ..." This definition should be in Figure 12 rather than the text.

Definition added to Figure 12 (auxiliary definitions of the logical relation).

> 2236 "if α is synchronized ..." You cite previous work on λB as motivation, but here you depart from that previous work--you should highlight that departure here. What is the motivation for this special case? You say "Intuitively, if α is synchronized, then after multiple possible unsealings the resulting values are kept related" But why is that required?

[IMPORTANT] We added text highlighting that we have departed from the λB definition for this particular case, and we explain the motivation for it better. The first part of the definition is faithful to λB, while the second part is new. In particular, we also require that, for any evidence ε justifying the judgment between α and any type G, and in any store whose interpretation contains W, ascribing the values to the type G with evidence ε keeps them related. This technical extension is necessary to prove the ascription lemma, because of the differences in the dynamic semantics between GSF and λB: the dynamic semantics of GSF combines evidence eagerly, whereas λB accumulates casts (lazily). For instance, the conversion (1: Int ⇒ α): α ⇒ β from λB needs two reduction steps to obtain 1 (reducing the step-index by two), while in GSF this information is compressed in a single evidence $\langle \mathsf{Int}, β^{α^{\mathsf{Int}}}\rangle$, and it needs only one reduction step to extract the underlying value 1 (reducing the step-index by one).

> 2245 "The function const extracts the top-level constructor of an evidence type" This definition should be in Figure 12 rather than the text. (You can break the figure into two if not everything fits in one figure, similar to Ahmed et al [2017].)

We broke Figure 12 into two and included the definition of const in one of them.

> 2265 "a world whose indexed is lowered" --> "a world whose index is lowered"

Fixed.

> 2266 "(iii) we require in the definition of related functions that arguments must be related at (at least) one step down" I guess this refers to ↓W that appears in line 2159, where there is no corresponding operation in λB. Again, you need to flag the difference from the preceding work and give some motivation for why you require this downstep where they do not.

Observe that, unlike λB, the definition of the logical relation for the unknown case does not decrease the index of the world W by 1; in λB this is done on a case-by-case basis where needed (i.e., function, type name, etc.). We made this technical change to facilitate the parametricity proof. We clarify this in the text. Also, the operator $\downarrow W$ in GSF has the same definition as $\blacktriangleright W$ in λB.
We now make this clear in the text.

> 2267 "(W,⟨?→?,?→?⟩u1 ::?,⟨?→?,?→?⟩u1 ::?)∈V_ρ[[?]]" Was the second u1 meant to be u2? Similarly in the next two lines.

Yes, thank you.

> 2299 Lemma 10.2. Why is it the same evidence in each term, rather than related evidence?

Observe that the terms ε(ΛX.t1)::G′ and ε(ΛX.t2)::G′ are open. Therefore, when they are closed with ρ, such that (W,ρ) ∈ D[∆], different evidences can be obtained for each term (ρ1(ε) and ρ2(ε)). There is no formal notion of related evidence, and one is not needed for the proof of parametricity.

> 2301 Lemma 10.3. Why does this relate t1 and t2 both applied to the same G? In the corresponding compatibility lemma in Ahmed et al [2017], we have v1[\rho_1(B)] and v2[\rho_2(B)], so different but related types? (The text of Ahmed et al [2017] has v1[\rho(B)] and v2[\rho(B)], but I believe that is a typo.)

Intuitively, the fundamental property states that if an open term type checks, then, given two value substitutions, it is related to itself. This means that every type application will be to the same type (including to the same type name if $\rho$ maps the variable to some $\alpha$). For this reason, the lemma relates the application of related type ascriptions to the same type G'. Otherwise the lemma would be false. Observe that if we chose two different types G'1 and G'2, then we would have to choose whether the terms t1[G′1] and t2[G′2] are related at type G[G′1/X] or G[G′2/X], i.e. Ξ; ∆; Γ ⊢ t1[G′1] ⪯ t2[G′2] : G[G′i/X], which is impossible, as relations are indexed by a single type. Furthermore, the compatibility lemma relates programs **before** any type/value substitution, so in theory both occurrences of G' after substitution can be replaced by some type name $\alpha$ which is bound to completely different types. Also, we reviewed the technical report of "Theorems for Free for Free: Parametricity, With and Without Types" by Ahmed et al. [2017] (https://perma.cc/L74G-6A8W), Lemma 7.10 (Compatibility: Type Application), page 54, and it is established in the same way as in our work. In addition, we reach the same conclusion in "Step-Indexed Syntactic Logical Relations for Recursive and Quantified Types" [Ahmed 2006] (https://www.ccs.neu.edu/home/amal/papers/lr-recquant-techrpt.pdf), Lemma C.24, page 111.

> 2310 You refer here to "the compositionality lemma" as if it were a well-known thing. But this is nothing like the compositionality lemma that appears in (the tech report of) Ahmed et al [2017]. You should explain informally what the purpose of the lemma is (it seems to be related to synchronisation, which also requires more explanation; see above). If this form of "compositionality" is an original contribution of your work, say so; if it corresponds to something common in other work, explain where.

[IMPORTANT] You're right. We added some clarification notes in the text: "In order to prove Lemma 10.3 (Compatibility-EappG), we establish another lemma to relate terms after a type substitution. We call this lemma Compositionality (Lemma 10.4). Observe that this lemma is informally the combination of the compositionality and conversion lemmas from Ahmed et al. [2017]."

> 2347 "(ΛX.λx : ?.t)" --> "(ΛX.λx : ?.x)"

We refer in this part to the term (ΛX.λx : ?.t), which is used in Lemma 10.6. We clarify this in the (re-ordered) text now.

> 2348 "then the pair of any v′" I don't understand, a single v' is not a pair!

Text reformulated.
Here we are referring to the pair formed by two different ascriptions of any value $v$ to ?, using the domains of the two outer evidences (the $\epsilon_{out}$ evidence for each instantiation).

> 2402 Lemma 10.7 is much easier to comprehend than Lemma 10.6. I suggest you begin with Lemma 10.7 to give the reader a simple overall view, and then relegate details such as Lemma 10.6 to the proof of Lemma 10.7 (and don't bother to write them out separately as a lemma).

We changed the order of the exposition as suggested.

> 2417 Section 10.4. Another way to summarise the result of this section might be in terms of external loss of precision being harmless, while internal loss of precision may change the meaning of a term. I suggest you tie the results given here to internal and external loss of precision.

We changed the text, linking the results with the external and internal loss of precision. Corollary 10.9 is a consequence of the fact that imprecise ascriptions are harmless in GSF. Conversely, Theorem 10.10, which is proved using the parametricity result, shows that the internal loss of precision may change the resulting behavior of a term. For example, the function $\Lambda X.\lambda x:?.t$ from Theorem 10.10 might be the imprecise identity function $\Lambda X.\lambda x:?.x :: X$. Therefore, one could expect that, applied to a type and a value of that type, it returns the same value. However, by Theorem 10.10, we know that it always fails or diverges.

> 2579 The term su is interesting, and should be mentioned in the introduction to the paper, as it conveys a great deal about the power of GSF.

[IMPORTANT] We added in the introduction, specifically in the contributions, an explanation of the term su: "The embedding is built using a general seal/unseal generator, which is expressed as a GSF term. The proposed term is a polymorphic pair of two functions, with type $\forall X.(X \rightarrow ?)\times(? \rightarrow X)$, instantiated at the unknown type. The first component of the pair justifies sealing the function's argument but not its return, while the second one works the other way around."
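
For intuition, the described generator could be rendered in GSF roughly as follows (our reconstruction from the description above, not necessarily the paper's exact term):

$$\mathit{su} \;=\; (\Lambda X.\ \langle\, \lambda x{:}X.\ x :: {?},\ \ \lambda x{:}?.\ x :: X \,\rangle)\ [?]$$

Before instantiation the pair has type $\forall X.(X \rightarrow ?)\times(? \rightarrow X)$; instantiating it at the unknown type yields the two coercing functions described above.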
> 3384 "itetahmedAl:icfp2017" typo

Fixed.

> 3397 "already in the simply-typed setting, embedding typed terms in gradual contexts is not fully abstract, because gradual types admit non-terminating terms" True but not fully satisfying. The immediate question that comes to mind is if we add recursion to the simply-typed lambda calculus, do we then have an embedding that is fully abstract?

Jacobs et al. [2021] show that fully abstract embedding holds for a gradual version of the simply-typed lambda calculus with recursive types and sums. We added this to the text.

### Reviewer 2

> Please discuss Max New's counter-example to the parametricity "principle" for GSF on page 263 of his thesis:
>
> const = (ΛX.λx : X.true) :: ∀X.? → Bool
> const[Int] 3 -->* true
> const[Bool] 3 -->* error

We updated this discussion in Section 10.1, paragraph "Comparing Parametricities".
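For contrast, in the fully static setting parametricity alone settles the behavior of any term at such a type. A minimal Haskell illustration (our own gloss, not from the thesis or the paper):

```haskell
-- Any total inhabitant of (forall a. a -> Bool) must ignore its argument,
-- so its result cannot depend on the type at which it is instantiated:
constTrue :: a -> Bool
constTrue _ = True

-- By contrast, New's term const = (ΛX.λx:X.true) :: ∀X.?→Bool carries
-- runtime evidence introduced by the ascription: at const[Bool], the Int
-- argument 3 cannot be justified at X = Bool and evidence combination
-- fails, while at const[Int] it succeeds.
main :: IO ()
main = print (constTrue 'a', constTrue (3 :: Int))  -- (True,True)
```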
> p. 3 "Parametric polymorphism allows the definition of terms" => "Parametric polymorphism enables the definition of terms"

Changed.

> "In contrast to this intrinsic, Church-style formulation" => "In contrast to this explicit, Church-style formulation"

Fixed.

> "the use of the judgment Int → Int <: ∀X .Int → Int is materialized by inserting a type abstraction constructor ΛX." That's a strange type to use as an example, X is not used. While its true that Mitchell and Odersky studied subtyping for polymorphic types, the traditional Hindley-Milner inference that ML and Haskell are based on does not use the subtyping relations of Mitchell or Odersky. Perhaps it would be more helpful to talk about Damas and Milner's GEN and INST rules (POPL 1982).

We added the reference to Damas and Milner's GEN and INST rules (POPL 1982).

> p. 4 "letting programmers in control of which discipline to adopt where" => "letting programmers be in control of which discipline to adopt where" or "giving programmers control of which discipline to adopt where"

Changed.

> p. 5 "Therefore, runtime type checks are inserted to ensure that, for example, the value bound to y is indeed of type Int."
>
> No, in program A, there is not a runtime check to ensure that the value bound to y is of type Int. (Program B does that.) I recommend deleting the above sentence because you're following sentence is correct and tells the whole story for program A: "This program typechecks and fails at runtime when trying to add 1 to true."

Fixed.

> p. 7 It would be good to explain up-front the boxed "v" in the example instead of waiting until the next paragraph. A reader may get hung up on the statement "this program is well-typed" when clearly it is not! The variable "v" is not defined!

Changed.

> p. 8 "λB, System FG, CSA ... in these gradual languages" λB isn't a gradual language.

We removed "gradual" from the text. Although λB is a polymorphic cast calculus, we find it interesting to include it in the discussion, since all the other gradual languages, like System FG and CSA, translate to λB to establish their dynamic semantics and properties (including the dynamic gradual guarantee).

> "However, its compatibility relation for types is not a conservative extension of type equality in System F." You're evaluating λB as if it should satisfy the criteria for a gradual language.

Many properties of the refined criteria can be applied to internal languages/cast calculi, as long as there is a notion of precision between types. The properties that do not apply are marked with a "-" in Table 1. Regarding the conservative extension, we are saying that λB is a conservative extension of System F, although its compatibility relation is slightly different because λB supports implicit polymorphism, whereas System F supports only explicit polymorphism.

> p. 9 "If one removes unsealX around the application of f, the program behaves properly." It seems that you're revealing some bias regarding what "behaves properly" means in this case.

Indeed; we removed this phrase.

> p. 10 let t : (∀X .?) → ? = λx:(∀X .?). x [X=Int] in (t id) (sealX true)
>
> Could this example instead be written with the sealing performed inside the λx?
>
> let t : (∀X .?) → ? = λx:(∀X .?). λy:?. x [X=Int] (sealX y) in (t id) true

Yes. The proposed version typechecks, but it introduces new problems regarding modularity, because one would need to know the actual implementation of t; this goes against the purpose of t, which is to be a general instantiator of any polymorphic function at the type Int. For example, the following program typechecks and successfully reduces to 1:

let t : (∀X .?) → ? = λx:(∀X .?). λy:?. unsealX (x [X=Int] (sealX y)) in (t id) 1

But if we apply t to the function f = (ΛX.λx:Int.x), as in

(t f) 1

the program typechecks but fails at runtime, because t seals 1 to X, while f expects a non-sealed value as argument. Similarly, if we apply t to fst = (ΛX.ΛY.λx:X.λy:Y.x), as in

(t fst) [Bool] 1 true

the program also fails at runtime, at the instantiation to Bool, because t returns a function and not a type abstraction.

> p. 12 Table 1. Please include the explanation of the symbols in the caption.

Done.

> p. 15 "SF: Simple Static Polymorphic Language" I'd recommend removing the term "simple" from the name of this language, as "simple" is often associated with the Simply Typed Lambda Calculus which is not polymorphic.

Fixed.

> p. 17 "whenever an heterogeneous" => "whenever a heterogeneous"

Fixed.

> p. 34 "Strict term precision is the natural lifting of strict type precision ⩽ to terms, except for types that do not influence evidence in the runtime semantics, namely function argument types and ascription types; for these, we can use the more liberal type precision relation ⊑." Do you need to use ⊑? If these annotations do not influence the runtime semantics, then perhaps they don't need to be related at all.

We do not need to use ⊑ for those types. However, we include it so that ⩽ is a subrelation of the natural notion of precision for terms. For example, the terms $\lambda x:\mathsf{Bool}.\mathsf{true}$ and $\lambda x:\mathsf{Int}.\mathsf{true}$ are not related by the natural notion of precision, and their evidence-augmented counterparts would not be related either, because the underlying evidences would not be related.
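To fix intuitions, here is a toy Haskell model of the liberal type precision relation ⊑ (the grammar and names are ours, not the paper's; strict precision ⩽ would further restrict where the unknown type may be introduced):

```haskell
-- Toy grammar of gradual types, for illustration only.
data GType = TInt | TBool | TUnknown | TFun GType GType
  deriving (Eq, Show)

-- prec g1 g2 models g1 ⊑ g2: g1 is at least as precise as g2.
-- The unknown type ? is the least precise type, and precision is
-- covariant in both function positions (unlike subtyping).
prec :: GType -> GType -> Bool
prec _ TUnknown            = True
prec (TFun a b) (TFun c d) = prec a c && prec b d
prec g1 g2                 = g1 == g2

main :: IO ()
main = do
  print (prec (TFun TInt TBool) (TFun TUnknown TBool))  -- True: Int→Bool ⊑ ?→Bool
  print (prec TUnknown TInt)                            -- False: ? is not more precise than Int
```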
> p. 37 "First, given a term t that reduces to some value, ascribing it to a less precise type results also reduces to (less strictly precise) value." Two grammar problems. Perhaps: "Given a term t that reduces to some value, ascribing it to a less precise type also results in a (less strictly precise) value."

Fixed.

> p. 40 "This definition of parametricity for the statically-typed polymorphic lambda calculus is standard..." Move this sentence to the beginning of "Preliminaries" paragraph! "has only be proven for λB " => "has only been proven for λB "

Fixed.

> p. 42, 45 and 46 The excerpt of Vρ⟦∀X.G⟧ from λB is incorrect. It includes casts around t₁ and t₂ where there should be static casts (not regular casts). Also the text describes them incorrectly as casts. Please fix that. So the logical relation of Amal et. al. (ICFP 2017) is different from the one in Fig. 12, where v₁[G₁] takes a step to a term with evidence ε₁ applied to t₁. A static cast cannot fail at runtime, whereas evidence can fail. It's not clear to me how much this difference matters, but some of Max New's critique of the logical relation of Fig. 12 hinges on this part of the definition. I recommend pointing out this difference between your logical relation and that of Amal et al. and explaining why yours differs in this regard. Also, you may need to revisit the statements about λB in section 10.3. For example, when you proved lemma 10.6 and 10.7 for λB, were you using the logical relation of Amal et al or the one in Fig. 12?

- We changed casts to conversions throughout the text.
- We added an explanation of conversions to the text, pointing out the similarity with the e_out evidence from GSF.
- In the definition of Vρ⟦∀X.G⟧, both logical relations (λB's and GSF's) take one step of reduction and strip the outermost conversion in the case of λB, and the e_out evidence in the case of GSF.
- A conversion cannot fail at runtime; similarly, the e_out evidence from GSF cannot fail. The notion of parametricity enjoyed by GSF is based on, and similar to, that of λB. Although some parametric programs behave differently across these languages, those differences come from the dynamic semantics, not from the definition of the parametricity logical relations.
- We adjusted the text: we first establish and prove Lemmas 10.6 and 10.7 for GSF (using its logical relation), and then mention that we also prove these (adapted) lemmas for λB (using its logical relation).

> p. 50 "and as such unusable" => "and as such is unusable"

Fixed.

> p. 69 "with the proof of parametricity by ̧itetahmedAl:icfp2017" Some kind of LaTeX error.

Fixed.

> p. 70 "recently established by [Azevedo de Amorim et al. 2020]" => "recently established by Azevedo de Amorim et al. [2020]"

Fixed.
