arXiv:1501.04511v1 [cs.PL] 19 Jan 2015



Fragments of ML Decidable by Nested Data Class Memory Automata

Conrad Cotton-Barratt1,⋆, David Hopkins1,⋆⋆, Andrzej S. Murawski2,⋆⋆⋆, and C.-H. Luke Ong1,†

1 Department of Computer Science, University of Oxford, UK
2 Department of Computer Science, University of Warwick, UK

Abstract. The call-by-value language RML may be viewed as a canonical restriction of Standard ML to ground-type references, augmented by a “bad variable” construct in the sense of Reynolds. We consider the fragment of (finitary) RML terms of order at most 1 with free variables of order at most 2, and identify two subfragments of this for which we show observational equivalence to be decidable. The first subfragment, RML^{P-Str}_{2⊢1}, consists of those terms in which the P-pointers in the game semantic representation are determined by the underlying sequence of moves. The second subfragment consists of terms in which the O-pointers of moves corresponding to free variables in the game semantic representation are determined by the underlying moves. These results are shown using a reduction to a form of automata over data words in which the data values have a tree-structure, reflecting the tree-structure of the threads in the game semantic plays. In addition we show that observational equivalence is undecidable at every third- or higher-order type, every second-order type which takes at least two first-order arguments, and every second-order type (of arity greater than one) that has a first-order argument which is not the final argument.

1 Introduction

RML is a call-by-value functional language with state [2]. It is similar to Reduced ML [19], the canonical restriction of Standard ML to ground-type references, except that it includes a “bad variable” constructor (in the absence of the constructor, the equality test is definable). This paper concerns the decidability of observational equivalence of finitary RML, RML_f. Our ultimate goal is to classify the decidable fragments of RML_f completely. In the case of finitary Idealized Algol (IA), the decidability of observational equivalence depends only on the type-theoretic order [15] of the type sequents. In contrast, the decidability of RML_f sequents is not so neatly characterised by order (see Figure 1): there are undecidable sequents of order as low as 2 [14], amidst interesting classes of decidable sequents at each of orders 1 to 4.

Following Ghica and McCusker [6], we use game semantics to decide observational equivalence of RML_f. Take a sequent Γ ⊢ M : θ with Γ = x1 : θ1, ···, xn : θn. In game semantics [7][10], the type sequent is interpreted as a P-strategy ⟦Γ ⊢ M : θ⟧ for playing (against O, who takes the environment’s perspective) in the prearena ⟦θ1, ···, θn ⊢ θ⟧. A play between P and O is a sequence of moves in which each non-initial move has a justification pointer to some earlier move, its justifier. Thanks to the fully abstract game semantics of RML, observational equivalence is characterised by complete plays, i.e. Γ ⊢ M ≅ N iff the P-strategies ⟦Γ ⊢ M⟧ and ⟦Γ ⊢ N⟧ contain the same set of complete plays. Strategies may be viewed as highly constrained processes, and are amenable to automata-theoretic representations; the chief technical challenge lies in the encoding of pointers.

⋆ Supported by an EPSRC Doctoral Training Grant
⋆⋆ Supported by Microsoft Research and Tony Hoare. Now at Ensoft Limited, UK.
⋆⋆⋆ Supported by EPSRC (EP/J019577/1)
† Partially supported by Merton College Research Fund

Preprint submitted to Springer 16 January 2015

In [9] we introduced the O-strict fragment of RML_f, RML_{O-Str}, consisting of sequents x1 : θ1, ···, xn : θn ⊢ M : θ such that θ is short (i.e. order at most 2 and arity at most 1), and every argument type of every θi is short. Plays over prearenas denoted by O-strict sequents enjoy the property that the pointers from O-moves are uniquely determined by the underlying move sequence. The main result in [9] is that the set of complete plays of an RML_{O-Str}-sequent is representable as a visibly pushdown automaton (VPA). A key idea is that it suffices to require each word of the representing VPA to encode the pointer from only one P-question. The point is that, when the full word language is analysed, it will be possible to uniquely place all justification pointers.

The simplest type that is not O-strict is β → β → β where β ∈ {int, unit}. Encoding the pointers from O-moves is much harder because O-moves are controlled by the environment rather than the term. As observational equivalence is defined by a quantification over all contexts, the strategy for a term must consider all legal locations of the pointer from an O-move, rather than just a single location as in the case of a pointer from a P-move. In this paper, we show that automata over data words can precisely capture strategies over a class of non-O-strict types.

Contributions. We identify two fragments of RML_f in which we can use deterministic weak nested data class memory automata [4] (equivalent to the locally prefix-closed nested data automata in [5]) to represent the set of complete plays of terms in these fragments. These automata operate over a data set which has a tree structure, and we use this structured data to encode O-pointers in words.

Both fragments are contained within the fragment RML_{2⊢1}, which consists of terms-in-context Γ ⊢ M where every type in Γ is of order at most 2, and the type of M is of order at most 1. The first fragment, the P-Strict subfragment, consists of those terms in RML_{2⊢1} for which the game semantic arenas have the property that the P-pointers in plays are uniquely determined by the underlying sequence of moves. This consists of terms-in-context Γ ⊢ M : θ in which θ is any first-order type, and each type in Γ has arity at most 1 and order at most 2. The second fragment, RML^{res}_{2⊢1}, consists of terms-in-context Γ ⊢ M : θ in which θ, again, is any first-order type, and each type θ′ ∈ Γ is of order at most 2, such that each argument for θ′ has arity at most 1. Although these two fragments are very similar, they use different encodings of data values, and we discuss the difficulties in extending these techniques to larger fragments of RML_f.

Finally we show that observational equivalence is undecidable at every third- or higher-order type, every second-order type which takes at least two first-order arguments, and every second-order type (of arity greater than one) that has a first-order argument which is not the final argument. See Figure 1 for a summary.

| Status | Fragment | Representative Type Sequent | Recursion | Ref. |
|---|---|---|---|---|
| Decidable | O-Strict / RML_{O-Str} (EXPTIME-Complete) | ((β → ... → β) → β) → ... → β ⊢ (β → ... → β) → β | while | [8,9] |
| Decidable | O-Strict + Recursion (DPDA-Hard) | ((β → ... → β) → β) → ... → β ⊢ (β → ... → β) → β | β → β | [8] |
| Decidable | RML^{P-Str}_{2⊢1} | (β → ··· → β) → β ⊢ β → ··· → β | while | † |
| Decidable | RML^{res}_{2⊢1} | (β → β) → ··· → (β → β) → β ⊢ β → ··· → β | while | † |
| Undecidable | Third-Order | ⊢ ((β → β) → β) → β ; (((β → β) → β) → β) → β ⊢ β | ⊥ | [8],† |
| Undecidable | Second-Order | ⊢ (β → β) → β → β ; ((β → β) → β → β) → β ⊢ β | ⊥ | [8],† |
| Undecidable | Recursion | Any | (β → β) → β | [8],† |
| Unknown | RML_{2⊢1} | (β → ··· → β) → ··· → (β → ··· → β) → β ⊢ β → ··· → β | ⊥ | - |
| Unknown | RML_X | ⊢ β → (β → β) → β ; ((β → β) → β) → β ⊢ β → β → β | ⊥ | - |
| Unknown | FO RML + Recursion | ⊢ β → ··· → β | β → β → β | - |

Fig. 1: Summary of RML Decidability Results. († marks new results presented here; β ∈ {int, unit}; we write ⊥ to mean an undecidability result holds (or none is known) even if no recursion or loops are present, and the only source of non-termination is through the constant Ω)

Related Work. A related language with full ground references (i.e. with an int ref ref type) was studied in [17], and observational equivalence was shown to be undecidable even at types ⊢ unit → unit → unit. In contrast, for RML_f terms, we show decidability at the same type. The key technical innovation of our work is the use of automata over infinite alphabets to encode justification pointers. Automata over infinite alphabets have already featured in papers on game semantics [16,17] but there they were used for a different purpose, namely, to model fresh-name generation. The nested data class memory automata we use in this paper are an alternative presentation of locally prefix-closed data automata [5].

2 Preliminaries

RML. We assume base types unit, for commands, int for a finite set of integers, and an integer variable type, int ref. Types are built from these in the usual way. The order of a type θ → θ′ is given by max(order(θ) + 1, order(θ′)), where base types unit and int have order 0, and int ref has order 1. The arity of a type θ → θ′ is arity(θ′) + 1 where unit and int have arity 0, and int ref has arity 1. A full syntax and set of typing rules for RML is given in Figure 2. Note that although we include only the arithmetic operations succ(i) and pred(i), these are sufficient to define all the usual comparisons and operations. We will write let x = M in N as syntactic sugar for (λx.N)M, and M;N for (λx.N)M where x is a fresh variable.
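As a concrete reading of the two definitions above, the following Python sketch computes order and arity over a small type representation. The names `Base`, `Arrow`, `order` and `arity` are illustrative assumptions introduced here and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Base:
    """A base type: "unit", "int" or "int ref" (hypothetical encoding)."""
    name: str


@dataclass(frozen=True)
class Arrow:
    """A function type arg -> res."""
    arg: "Type"
    res: "Type"


Type = Union[Base, Arrow]

UNIT, INT, INT_REF = Base("unit"), Base("int"), Base("int ref")


def order(t: Type) -> int:
    # unit and int have order 0, int ref has order 1
    if isinstance(t, Base):
        return 1 if t == INT_REF else 0
    # order(θ → θ′) = max(order(θ) + 1, order(θ′))
    return max(order(t.arg) + 1, order(t.res))


def arity(t: Type) -> int:
    # unit and int have arity 0, int ref has arity 1
    if isinstance(t, Base):
        return 1 if t == INT_REF else 0
    # arity(θ → θ′) = arity(θ′) + 1
    return arity(t.res) + 1
```

For instance, `Arrow(Arrow(INT, INT), INT)` models (int → int) → int, which has order 2 and arity 1, whereas `Arrow(INT, Arrow(INT, INT))` models int → int → int, which has order 1 and arity 2.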

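$$
\frac{}{\Gamma \vdash () : \mathsf{unit}}
\qquad
\frac{i \in \mathbb{N}}{\Gamma \vdash i : \mathsf{int}}
\qquad
\frac{\Gamma \vdash M : \mathsf{int}}{\Gamma \vdash \mathsf{succ}(M) : \mathsf{int}}
\qquad
\frac{\Gamma \vdash M : \mathsf{int}}{\Gamma \vdash \mathsf{pred}(M) : \mathsf{int}}
$$

$$
\frac{\Gamma \vdash M : \mathsf{int} \quad \Gamma \vdash M_0 : \theta \quad \Gamma \vdash M_1 : \theta}{\Gamma \vdash \mathsf{if}\ M\ \mathsf{then}\ M_1\ \mathsf{else}\ M_0 : \theta}
\qquad
\frac{\Gamma \vdash M : \mathsf{int\ ref}}{\Gamma \vdash\ !M : \mathsf{int}}
\qquad
\frac{\Gamma \vdash M : \mathsf{int\ ref} \quad \Gamma \vdash N : \mathsf{int}}{\Gamma \vdash M := N : \mathsf{unit}}
$$

$$
\frac{\Gamma \vdash M : \mathsf{int}}{\Gamma \vdash \mathsf{ref}\ M : \mathsf{int\ ref}}
\qquad
\frac{}{\Gamma, x : \theta \vdash x : \theta}
\qquad
\frac{\Gamma \vdash M : \theta \to \theta' \quad \Gamma \vdash N : \theta}{\Gamma \vdash M N : \theta'}
\qquad
\frac{\Gamma, x : \theta \vdash M : \theta'}{\Gamma \vdash \lambda x^{\theta}.M : \theta \to \theta'}
$$

$$
\frac{\Gamma \vdash M : \mathsf{int} \quad \Gamma \vdash N : \mathsf{unit}}{\Gamma \vdash \mathsf{while}\ M\ \mathsf{do}\ N : \mathsf{unit}}
\qquad
\frac{\Gamma \vdash M : \mathsf{unit} \to \mathsf{int} \quad \Gamma \vdash N : \mathsf{int} \to \mathsf{unit}}{\Gamma \vdash \mathsf{mkvar}(M, N) : \mathsf{int\ ref}}
$$

Fig. 2: Syntax of RML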

The operational semantics, defined in terms of a big-step relation, are standard [14]. For closed terms ⊢ M we write M⇓ just if there exist s, V such that ∅, M ⇓ s, V. Two terms Γ ⊢ M : θ and Γ ⊢ N : θ are observationally equivalent (or contextually equivalent) if for all (closing) contexts C[−] such that ∅ ⊢ C[M], C[N] : unit, C[M]⇓ if and only if C[N]⇓.

It can be shown that every RML term is effectively convertible to an equivalent term in canonical form [8, Prop. 3.3], defined by the following grammar (β ∈ {unit, int}).

C ::= () | i | x^β | succ(x^β) | pred(x^β) | if x^β then C else C | x^{int ref} := y^{int} | !x^{int ref}
    | λx^θ.C | mkvar(λx^{unit}.C, λy^{int}.C) | let x = ref 0 in C | while C do C | let x^β = C in C
    | let x = z y^β in C | let x = z mkvar(λu^{unit}.C, λv^{int}.C) in C | let x = z (λx^θ.C) in C

Game Semantics. We use a presentation of call-by-value game semantics in the style of Honda and Yoshida [7], as opposed to Abramsky and McCusker’s isomorphic model [2], as Honda and Yoshida’s more concrete constructions lend themselves more easily to recognition by automata. We recall the following presentation of the game semantics for RML from [9].

An arena A is a triple (M_A, ⊢_A, λ_A) where M_A is a set of moves, of which I_A ⊆ M_A consists of initial moves, ⊢_A ⊆ M_A × (M_A \ I_A) is called the justification relation, and λ_A : M_A → {O, P} × {Q, A} is a labelling function such that for all i_A ∈ I_A we have λ_A(i_A) = (P, A), and if m ⊢_A m′ then (π1 λ_A)(m) ≠ (π1 λ_A)(m′) and (π2 λ_A)(m′) = A ⇒ (π2 λ_A)(m) = Q.

The function λ_A labels moves as belonging to either Opponent or Proponent and as being either a Question or an Answer. Note that answers are always justified by questions, but questions can be justified by either a question or an answer. We will use arenas to model types. However, the actual games will be played over prearenas, which are defined in the same way except that initial moves are O-questions.

Three basic arenas are 0, the empty arena, 1, the arena containing a single initial move •, and Z, which has the integers as its set of moves, all of which are initial P-answers. The constructions on arenas are defined in Figure 3. Here we use Ī_A as an abbreviation for M_A \ I_A, and λ̄_A for the O/P-complement of λ_A. Intuitively A ⊗ B is the union of the arenas A and B, but with the initial moves combined pairwise. A ⇒ B is slightly more complex. First we add a new initial move, •. We take the O/P-complement of A, change the initial moves into questions, and set them to now be justified by •. Finally, we take B and set its initial moves to be justified by A’s initial moves. The final construction, A → B, takes two arenas A and B and produces a prearena, as shown below. This is essentially the same as A ⇒ B without the initial move •.

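$$
\begin{aligned}
M_{A \Rightarrow B} &= \{\bullet\} \uplus M_A \uplus M_B, \qquad I_{A \Rightarrow B} = \{\bullet\} \\
\lambda_{A \Rightarrow B} &= m \mapsto
\begin{cases}
PA & \text{if } m = \bullet \\
OQ & \text{if } m \in I_A \\
\bar{\lambda}_A(m) & \text{if } m \in \bar{I}_A \\
\lambda_B(m) & \text{if } m \in M_B
\end{cases} \\
\vdash_{A \Rightarrow B} &= \{(\bullet, i_A) \mid i_A \in I_A\} \cup \{(i_A, i_B) \mid i_A \in I_A, i_B \in I_B\} \cup \vdash_A \cup \vdash_B
\end{aligned}
$$

$$
\begin{aligned}
M_{A \otimes B} &= I_A \times I_B \uplus \bar{I}_A \uplus \bar{I}_B, \qquad I_{A \otimes B} = I_A \times I_B \\
\lambda_{A \otimes B} &= m \mapsto
\begin{cases}
PA & \text{if } m \in I_A \times I_B \\
\lambda_A(m) & \text{if } m \in \bar{I}_A \\
\lambda_B(m) & \text{if } m \in \bar{I}_B
\end{cases} \\
\vdash_{A \otimes B} &= \{((i_A, i_B), m) \mid i_A \in I_A \wedge i_B \in I_B \wedge (i_A \vdash_A m \vee i_B \vdash_B m)\} \\
&\quad \cup (\vdash_A \cap (\bar{I}_A \times \bar{I}_A)) \cup (\vdash_B \cap (\bar{I}_B \times \bar{I}_B))
\end{aligned}
$$

$$
\begin{aligned}
M_{A \to B} &= M_A \uplus M_B, \qquad I_{A \to B} = I_A \\
\lambda_{A \to B}(m) &=
\begin{cases}
OQ & \text{if } m \in I_A \\
\bar{\lambda}_A(m) & \text{if } m \in \bar{I}_A \\
\lambda_B(m) & \text{if } m \in M_B
\end{cases} \\
\vdash_{A \to B} &= \{(i_A, i_B) \mid i_A \in I_A, i_B \in I_B\} \cup \vdash_A \cup \vdash_B
\end{aligned}
$$

Fig. 3: Constructions on Arenas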

We intend arenas to represent types, in particular ⟦unit⟧ = 1, ⟦int⟧ = Z (or a finite subset of Z for RML_f) and ⟦θ1 → θ2⟧ = ⟦θ1⟧ ⇒ ⟦θ2⟧. A term x1 : θ1, ..., xn : θn ⊢ M : θ will be represented by a strategy for the prearena ⟦θ1⟧ ⊗ ... ⊗ ⟦θn⟧ → ⟦θ⟧.

A justified sequence in a prearena A is a sequence of moves from A in which the first move is initial and all other moves m are equipped with a pointer to an earlier move m′, such that m′ ⊢_A m. A play s is a justified sequence which additionally satisfies the standard conditions of Alternation, Well-Bracketing, and Visibility.

A strategy σ for prearena A is a non-empty, even-prefix-closed set of plays from A, satisfying the determinism condition: if s m1, s m2 ∈ σ then s m1 = s m2. We can think of a strategy as being a playbook telling P how to respond by mapping odd-length plays to moves. A play is complete if all questions have been answered. Note that (unlike in the call-by-name case) a complete play is not necessarily maximal. We denote the set of complete plays in strategy σ by comp(σ).

In the game model of RML, a term-in-context x1 : θ1, ..., xn : θn ⊢ M : θ is interpreted by a strategy of the prearena ⟦θ1⟧ ⊗ ... ⊗ ⟦θn⟧ → ⟦θ⟧. These strategies are defined by recursion over the syntax of the term. Free identifiers x : θ ⊢ x : θ are interpreted as copy-cat strategies where P always copies O’s move into the other copy of ⟦θ⟧, λx.M allows multiple copies of ⟦M⟧ to be run, application MN requires a form of parallel composition plus hiding, and the other constructions can be interpreted using special strategies. The game semantic model is fully abstract in the following sense.

Theorem 1 (Abramsky and McCusker [1,2]). If Γ ⊢ M : θ and Γ ⊢ N : θ are RML type sequents, then Γ ⊢ M ≅ N iff comp(⟦Γ ⊢ M⟧) = comp(⟦Γ ⊢ N⟧).

Nested Data Class Memory Automata. We will be using automata to recognise game semantic strategies as languages. Equality of strategies can then be reduced to equivalence of the corresponding automata. However, to represent strategies as languages we must encode pointers in the words. To do this we use data languages, in which every position in a word has an associated data value, which is drawn from an infinite set (which we call the data set). Pointers between positions in a play can thus be encoded in the word by the relevant positions having suitably related data values. Reflecting the hierarchical structure of the game semantic prearenas, we use a data set with a tree-structure.

Recall that a tree is a simple directed graph ⟨D, pred⟩ where pred : D ⇀ D is the predecessor map defined on every node of the tree except the root, such that every node has a unique path to the root. A node n has level l just if pred^l(n) is the root (thus the root has level 0). A tree is of level l just if every node in it has level ≤ l. We define a nested data set of level l to be a tree of level l such that each data value of level strictly less than l has infinitely many children. We fix a nested data set of level l, D, and a finite alphabet Σ, to give a data alphabet 𝔻 = Σ × D.
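One way to make a nested data set concrete is sketched below in Python, assuming data values are represented as tuples of natural numbers: the root is the empty tuple and the children of a value d are the tuples extending d by one number. This representation, and the names `ROOT`, `pred`, `level`, `ancestors` and `FreshSupply`, are assumptions made for illustration and are not taken from the paper.

```python
from itertools import count

ROOT = ()  # the unique level-0 data value


def pred(d):
    """Predecessor map: undefined on the root, otherwise drop the last step."""
    if d == ROOT:
        raise ValueError("the root has no predecessor")
    return d[:-1]


def level(d):
    """A value has level l just if applying pred l times reaches the root."""
    return len(d)


def ancestors(d):
    """The path from the root down to d, inclusive of both ends."""
    return [d[:k] for k in range(len(d) + 1)]


class FreshSupply:
    """Hands out children that have never been used before."""

    def __init__(self):
        self._next = count()

    def child_of(self, d):
        # every value has infinitely many children, so a new one always exists
        return d + (next(self._next),)
```

Restricting to tuples of length at most l gives a tree of level l in which every value of level below l has infinitely many children, which is the shape required above.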

We will use a form of automaton over these data sets based on class memory automata [3]. Class memory automata operate over an unstructured data set, and on reading an input letter (a, d), the transitions available depend both on the state the automaton is currently in, and the state the automaton was in after it last read an input letter with data value d. We will be extending a weaker variant of these automata, in which the only acceptance condition is reaching an accepting state. The variant of class memory automata we will be using, nested data class memory automata [4], works similarly: on reading input (a, d) the transitions available depend on the current state of the automaton, the state the automaton was in when it last read a descendant (under the pred function) of d, and the states the automaton was in when it last read a descendant of each of d’s ancestors. We also add some syntactic sugar (not presented in [4]) to this formalism, allowing each transition to determine the automaton’s memory of where it last saw the read data value and each of its ancestors: this does not extend the power of the automaton, but will make the constructions we make in this paper easier to define.

Formally, a Weak Nested Data Class Memory Automaton (WNDCMA) of level l is a tuple ⟨Q, Σ, δ, q0, F⟩ where Q is the set of states, q0 ∈ Q is the initial state, F ⊆ Q is the set of accepting states, and the transition function δ = ⋃_{i=0}^{l} δ_i where each δ_i is a function:

$$\delta_i : Q \times \Sigma \times \bigl(\{i\} \times (Q \uplus \{\bot\})^{i+1}\bigr) \to \mathcal{P}(Q \times Q^{i+1})$$

We write Q_⊥ for the set Q ⊎ {⊥}, and may refer to the Q_⊥^j part of a transition as its signature. The automaton is deterministic if each set in the image of δ is a singleton. A configuration is a pair (q, f) where q ∈ Q, and f : D → Q_⊥ is a class memory function (i.e. f(d) = ⊥ for all but finitely many d ∈ D). The initial configuration is (q0, f0) where f0 is the class memory function mapping every data value to ⊥. The automaton can transition from configuration (q, f) to configuration (q′, f′) on reading input (a, d) just if d is of level i,

$$(q', (t_0, t_1, \ldots, t_i)) \in \delta\bigl(q, a, (i, f(pred^i(d)), \ldots, f(pred(d)), f(d))\bigr),$$

and

$$f' = f[d \mapsto t_i,\ pred(d) \mapsto t_{i-1},\ \ldots,\ pred^{i-1}(d) \mapsto t_1,\ pred^i(d) \mapsto t_0].$$

A run is defined in the usual way, and is accepting if the last configuration (q_n, f_n) in the run is such that q_n ∈ F. We say w ∈ L(A) if there is an accepting run of A on w.

Weak nested data class memory automata have a decidable emptiness problem, reducible to coverability in a well-structured transition system [4,5], and are closed under union and intersection by the standard automata product constructions. Further, Deterministic WNDCMA are closed under complementation again by the standard method of complementing the final states. Hence they have a decidable equivalence problem.
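The final step is the usual reduction of equivalence to emptiness. As a sketch, assuming A and B are deterministic WNDCMA over the same data alphabet and writing $\overline{L}$ for the complement of a language L within the set of all data words:

$$L(A) = L(B) \iff L(A) \cap \overline{L(B)} = \emptyset \ \text{ and } \ \overline{L(A)} \cap L(B) = \emptyset.$$

Each language on the right-hand side is obtained by a complementation followed by a product, so two emptiness checks suffice.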

3 P-Strict RML_{2⊢1}

In [9], the authors identify a fragment of RML, the O-strict fragment, for which the plays in the game-semantic strategies representing terms have the property that the justification pointers of O-moves are uniquely reconstructible from the underlying moves. Analogously, we define the P-strict fragment of RML to consist of typed terms in which the pointers for P-moves are uniquely determined by the underlying sequence of moves. Then our encoding of strategies for this fragment will only need to encode the O-pointers, for which we will use data values.

3.1 Characterising P-Strict RML

In working out which type sequents for RML lead to prearenas which are P-strict, it is natural to ask for a general characterisation of such prearenas. The following lemma, which provides exactly that, is straightforward to prove:

Lemma 1. A prearena is P-strict iff there is no enabling sequence q ⊢ ··· ⊢ q′ in which both q and q′ are P-questions.
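The condition in Lemma 1 is a reachability check on the enabling relation, which the following Python sketch makes explicit. The representation (a dictionary of labels and a dictionary of enabled moves) and the two example prearenas are assumptions made for illustration: they were built by hand from the constructions in Figure 3, for the sequents x : β → β ⊢ β and x : β → β → β ⊢ β with β = unit, and the move names are hypothetical.

```python
def is_p_strict(labels, enables):
    """Lemma 1: no P-question hereditarily enables another P-question.

    labels maps each move to a pair (player, kind), e.g. ("P", "Q");
    enables maps each move to the list of moves it directly enables.
    """
    p_questions = [m for m, lab in labels.items() if lab == ("P", "Q")]
    for q in p_questions:
        stack, seen = list(enables.get(q, [])), set()
        while stack:                       # depth-first search below q
            m = stack.pop()
            if m in seen:
                continue
            seen.add(m)
            if labels[m] == ("P", "Q"):
                return False               # found q ⊢ ... ⊢ q′
            stack.extend(enables.get(m, []))
    return True


# x : β → β ⊢ β  (free variable of arity 1)
UNARY_LABELS = {"q0": ("O", "Q"), "qx": ("P", "Q"), "ax": ("O", "A"),
                "a0": ("P", "A")}
UNARY_ENABLES = {"q0": ["qx", "a0"], "qx": ["ax"]}

# x : β → β → β ⊢ β  (free variable of arity 2)
BINARY_LABELS = {"q0": ("O", "Q"), "qx1": ("P", "Q"), "ax1": ("O", "A"),
                 "qx2": ("P", "Q"), "ax2": ("O", "A"), "a0": ("P", "A")}
BINARY_ENABLES = {"q0": ["qx1", "a0"], "qx1": ["ax1"], "ax1": ["qx2"],
                  "qx2": ["ax2"]}
```

The first prearena passes the check, while in the second the P-question `qx1` enables `ax1`, which enables the P-question `qx2`, so the check fails.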

Which type sequents lead to a P-question hereditarily justifying another P-question? It is clear, from the construction of the prearena from the type sequent, that if a free variable in the sequent has arity > 1 or order > 2, the resulting prearena will have such an enabling sequence, and so will not be P-strict. Conversely, if a free variable is of a type of order at most 2 and arity at most 1, it will not break P-strictness. On the RHS of the type sequent, things are a little more complex: there will be a “first” P-question whenever the type has an argument of order ≥ 1. To prevent this P-question hereditarily justifying another P-question, the argument must be of arity 1 and order ≤ 2. Hence the P-strict fragment consists of type sequents of the following form:

(β → ··· → β) → β ⊢ ((β → ··· → β) → β) → ··· → ((β → ··· → β) → β) → β

(where β ∈ {unit, int}).

From results shown here and in [8], we know that observational equivalence of all type sequents with an order 3 type or order 2 type with order 1 non-final argument on the RHS is undecidable. Hence the only P-strict types for which observational equivalence may be decidable are of the form (β → ··· → β) → β ⊢ β → ··· → β or (β → ··· → β) → β ⊢ β → ··· → β → (β → β) → β. In this section we show that the first of these, which is the intersection of the P-strict fragment and RML_{2⊢1}, does lead to decidability.

Definition 1. The P-Strict fragment of RML_{2⊢1}, which we denote RML^{P-Str}_{2⊢1}, consists of typed terms of the form x1 : Θ̂1, ..., xn : Θ̂1 ⊢ M : Θ1 where the type classes are as described below:

Θ0 ::= unit | int        Θ1 ::= Θ0 | Θ0 → Θ1 | int ref        Θ̂1 ::= Θ0 | Θ1 → Θ0 | int ref

This means we allow types of the form (β → ··· → β) → β ⊢ β → ··· → β where β ∈ {unit, int}.

3.2 Deciding Observational Equivalence of RML^{P-Str}_{2⊢1}

Our aim is to decide observational equivalence by constructing, from a term M, an automaton that recognises a language representing ⟦M⟧. As ⟦M⟧ is a set of plays, the language representing ⟦M⟧ must encode both the moves and the pointers in the play. Since answer moves’ pointers are always determined by well-bracketing, we only represent the pointers of question moves, and we do this with the nested data values. The idea is simple: if a play s is in ⟦M⟧ the language L(⟦M⟧) will contain a word, w, such that the string projection of w is the underlying sequence of moves of s, and such that:

– The initial move takes the (unique) level-0 data value; and
– Answer moves take the same data value as that of the question they are answering; and
– Other question moves take a fresh data value whose predecessor is the data value taken by the justifying move.

Of course, the languages recognised by nested data automata are closed under automorphisms of the data set, so in fact each play s will be represented by an infinite set of data words, all equivalent to one another by automorphism of the data set.
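The three rules for assigning data values can be sketched as a small Python function that turns a play with explicit pointers into one representative data word. The input format (triples of move name, "Q" or "A", and the index of the justifier, with an answer pointing at the question it answers) and the tuple representation of data values are assumptions made here for illustration; the function picks just one of the infinitely many equivalent data words.

```python
from itertools import count


def encode(play):
    """Map a play [(name, kind, pointer), ...] to a data word [(name, d), ...]."""
    fresh = count()
    word = []
    for name, kind, pointer in play:
        if pointer is None:
            d = ()                                   # initial move: level-0 value
        elif kind == "A":
            d = word[pointer][1]                     # same value as the question answered
        else:
            d = word[pointer][1] + (next(fresh),)    # fresh child of the justifier's value
        word.append((name, d))
    return word


# A hypothetical complete play for a closed term of type β → β: after q0 a0,
# O opens two threads by playing q1 twice, each time pointing at a0.
PLAY = [("q0", "Q", None), ("a0", "A", 0),
        ("q1", "Q", 1), ("a1", "A", 2),
        ("q1", "Q", 1), ("a1", "A", 4)]
```

Here `encode(PLAY)` gives q0 and a0 the root value, the first q1/a1 pair the child (0,) and the second pair the child (1,), so the two threads opened by O are told apart by their data values.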

Theorem 2. For every typed term Γ ⊢ M : θ in RML^{P-Str}_{2⊢1} that is in canonical form we can effectively construct a deterministic weak nested data class memory automaton, A^M, recognising the complete plays of L(⟦Γ ⊢ M⟧).

Proof. We prove this by induction over the canonical forms. We note that for each canonical form construction, if the construction is in RML^{P-Str}_{2⊢1} then each constituent canonical form must also be. For convenience of the inductive constructions, we in fact construct automata A^M_γ recognising ⟦Γ ⊢ M⟧ restricted to the initial move γ. Here we sketch two illustrative cases. A full proof is provided in Appendix A.

λx^β.M : β → θ. The prearenas for ⟦M⟧ and ⟦λx^β.M⟧ are shown in Figure 4. Note that in this case we must have that Γ, x : β ⊢ M : θ, and so the initial moves in ⟦M⟧ contain an x-component. We therefore write these initial moves as (γ, i_x) where γ is the Γ-component and i_x is the x-component.

P’s strategy ⟦λx^β.M⟧ is as follows: after an initial move γ, P plays the unique a0-move •, and waits for a q1-move. Once O plays a q1-move i_x, P plays as in ⟦Γ, x ⊢ M⟧ when given an initial move (γ, i_x). However, as the q1-moves are not initial, it is possible that O will play another q1-move, i′_x. Each time O does this it opens a new thread which P plays as per ⟦Γ, x ⊢ M⟧ when given initial move (γ, i′_x). Only O may switch between threads, and this can only happen immediately after P plays an a_j-move (for any j).

[Fig. 4: Prearenas for ⟦Γ, x : β ⊢ M : θ⟧ and ⟦Γ ⊢ λx^β.M : β → θ⟧. (a) ⟦Γ, β ⊢ θ⟧, with moves q1, a1, ..., qn, an and ⟦Γ⟧; (b) ⟦Γ ⊢ β → θ⟧, with moves q0, a0, q1, a1, ..., qn, an and ⟦Γ⟧.]

By our inductive hypothesis, for each initial move (γ, i_x) of ⟦Γ, x : β ⊢ θ⟧ we have an automaton A^M_{γ,i_x} recognising the complete plays of ⟦Γ, x : β ⊢ M : θ⟧ starting with the initial move (γ, i_x). We construct the automaton A^{λx.M}_γ by taking a copy of each A^M_{γ,i_x}, and quotienting together the initial states of these automata to one state, p (which by conditions on the constituent automata we can assume has no incoming transitions). This state p will hold the unique level-0 data value for the run, and states and transitions are added to have initial transitions labelled with q0 and a0, ending in state p. The final states will be the new initial state, the quotient state p, and the states which are final in the constituent automata. The transitions inside the constituent automata fall into two categories: those labelled with moves corresponding to the RHS of the term in context Γ ⊢ M, and those labelled with moves corresponding to the LHS. Those transitions corresponding to moves on the RHS are altered to have their level increased by 1, with their signature correspondingly altered by requiring a level-0 data value in state p. Those transitions corresponding to moves on the LHS retain the same level, but have the top value of their data value signature replaced with the state p. Finally, transitions are added between the constituent automata to allow switching between threads: whenever there is a transition out of a final state in one of the automata, copies of the transition are added from every final state (though keeping the data-value signature the same). Note that the final states correspond to precisely the points in the run where the environment is able to switch threads.

let x^β = M in N. Here we assume we have automata recognising ⟦M⟧ and ⟦N⟧. The strategy ⟦let x^β = M in N⟧ essentially consists of a concatenation of ⟦M⟧ and ⟦N⟧, with the result of playing ⟦M⟧ determining the value of x to use in ⟦N⟧. Hence the automata construction is very similar to the standard finite automata construction for concatenation of languages, though branching on the different results for ⟦M⟧ to different automata for ⟦N⟧.

Corollary 1. Observational equivalence of terms in RML^{P-Str}_{2⊢1} is decidable.

4 A Restricted Fragment of RML_{2⊢1}

It is important, for the reduction to nested data automata for RML^{P-Str}_{2⊢1}, that variables cannot be partially evaluated: in prearenas where variables have only one argument, once a variable is evaluated those moves cannot be used to justify any future moves. If we could later return to them we would need to ensure that they were accessed only in ways which did not break visibility. We now show that this can be done, using a slightly different encoding of pointers, for a fragment in which variables have unlimited arity, but each argument for the variable must be evaluated all at once. This means that the moves corresponding to these variables have their O-pointers uniquely determined by the underlying sequence of moves.

4.1 Fragment definition

Definition 2. The fragment we consider in this section, which we denote RML^{res}_{2⊢1}, consists of typed terms of the form x1 : Θ^1_2, ..., xn : Θ^1_2 ⊢ M : Θ1 where the type classes are as described below:

Θ0 ::= unit | int        Θ^1_1 ::= Θ0 | Θ0 → Θ0 | int ref
Θ1 ::= Θ0 | Θ0 → Θ1 | int ref        Θ^1_2 ::= Θ1 | Θ^1_1 → Θ^1_2

[Fig. 5: Shape of arenas in RML^{res}_{2⊢1}, with the sections of the prearena marked A, B and C.]

This allows types of the form (β → β) → ··· → (β → β) → β ⊢ β → ··· → β where β ∈ {unit, int}. The shape of the prearenas for this fragment is shown in Figure 5. Note that moves in section A of the prearena (marked in Figure 5) relate to the type Θ1 on the RHS of the typing judgement, and that we need only represent O-pointers for this section, since the P-moves are all answers so have their pointers uniquely determined by well-bracketing. Moves in sections B and C of the prearena correspond to the types on the LHS of the typing judgement. Moves in section B need only have their P-pointers represented, since the O-moves are all answer moves. Moves in section C have both their O- and P-pointers determined by the underlying sequence of moves: the P-pointers because all P-moves in this section are answer moves, the O-pointers by the visibility condition.

4.2 Deciding Observational Equivalence

Similarly to the P-Strict case, we provide a reduction to weak nested data class memory automata that uses data values to encode O-pointers. However, this time we do not need to represent any O-pointers on the LHS of the typing judgement, so use data values only to represent pointers of the questions on the RHS. We do, though, need to represent P-pointers of moves on the LHS. This we do using the same technique used for representing P-pointers in [9]: in each word in the language we represent only one pointer by using a “tagging” of moves: the string s m s′ m′, with m and m′ tagged, is used to represent the pointer from m′ to m in the play s m s′ m′. Because P’s strategy is deterministic, representing one pointer in each word is enough to uniquely reconstruct all P-pointers in the plays from the entire language. Due to space constraints we do not provide a full explanation of this technique in this paper: for a detailed discussion see [8,9]. Hence for a term ⟦Γ ⊢ M : θ⟧ the data language we seek to recognise, L(⟦Γ ⊢ M⟧), represents pointers in the following manner:

– The initial move takes the (unique) level-0 data value;
– Moves in ⟦Γ⟧ (i.e. in section B or C of the prearena) take the data value of the previous move;
– Answer moves in ⟦θ⟧ (i.e. in section A of the prearena) take the data value of the question they are answering; and
– Non-initial question moves in ⟦θ⟧ (i.e. in section A of the prearena) take a fresh data value nested under the data value of the justifying answer move.
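The four rules above can be sketched as a small procedure that assigns a data value to each move of a play. This is only an illustration under stated assumptions: the `Move` record, its field names, and the encoding of a nested data value as a tuple (whose prefix is its predecessor) are choices made for this sketch and are not notation from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumed encoding: a data value is a path; dropping the last entry gives pred(d).
DataValue = Tuple[int, ...]


@dataclass
class Move:
    name: str
    section: str              # "A" (RHS type) or "B"/"C" (LHS types); hypothetical field
    is_question: bool
    justifier: Optional[int]  # index of the justifying move, None for the initial move


def assign_data_values(play: List[Move]) -> List[DataValue]:
    values: List[DataValue] = []
    children: Dict[DataValue, int] = {}  # fresh values created so far under each value
    for index, move in enumerate(play):
        if move.justifier is None:
            value = (0,)                     # the unique level-0 data value
        elif move.section in ("B", "C"):
            value = values[index - 1]        # moves in the context copy the previous move
        elif not move.is_question:
            value = values[move.justifier]   # answers in section A copy their question
        else:
            parent = values[move.justifier]  # value of the justifying answer move
            fresh = children.get(parent, 0)
            children[parent] = fresh + 1
            value = parent + (fresh,)        # fresh value nested under the parent
        values.append(value)
    return values
```

For a play in which O opens two threads from the same answer move, the two thread questions receive sibling data values under the level-0 value, which is what lets the automaton tell the threads apart.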

Theorem 3. For every typed term Γ ⊢ M : θ in $\mathrm{RML}^{\mathrm{res}}_{2\vdash 1}$ that is in canonical form we can effectively construct a deterministic weak nested data class memory automaton, $A^M$, recognising the complete plays of L(⟦Γ ⊢ M⟧).

Proof. This proof takes a similar form to that of Theorem 2: by induction over canonical forms. We here sketch the λ-abstraction case. A full proof is provided in Appendix B.

$\lambda x^{\beta}.M : \beta \to \theta$. This construction is almost identical to that in the proof of Theorem 2: again the strategy for P is interleavings of P’s strategy for M : θ. The only difference in the construction is that where in the encoding for Theorem 2 the moves in each $A^M_{\gamma,i_x}$ corresponding to the LHS and RHS of the prearena needed to be treated separately, in this case they can be treated identically: all being nested under the new level-0 data value. We demonstrate this construction in Example 1.

Example 1. Figure 6 shows two weak nested data class memory automata. We draw a transition $p, a, \left(j, \begin{pmatrix} s_0 \\ \vdots \\ s_j \end{pmatrix}\right) \to p', \begin{pmatrix} s'_0 \\ \vdots \\ s'_j \end{pmatrix} \in \delta$ as an arrow from state p to p′ labelled with “$a, \begin{pmatrix} s_0 \\ \vdots \\ s_j \end{pmatrix} \to \begin{pmatrix} s'_0 \\ \vdots \\ s'_j \end{pmatrix}$”. We omit the “$\to \begin{pmatrix} s'_0 \\ \vdots \\ s'_j \end{pmatrix}$” part of the label if $s'_j = p'$ and $s_i = s'_i$ for all i ∈ {0, 1, . . . , j − 1}.

The automaton obtained by the constructions in Theorem 3 for the term-in-context ⟦⊢ let c = ref 0 in λy^{unit}. if !c = 0 then c := 1 else Ω⟧ is shown in Figure 6a (to aid readability, we have removed most of the dead and unreachable states and transitions).

[Figure 6: two automaton diagrams over states 1 to 7, including the states (5, 0) and (5, 1).]

(a) Automaton for ⟦⊢ let c = ref 0 in λy^{unit}. if !c = 0 then c := 1 else Ω⟧

(b) Automaton for ⟦⊢ λx^{unit}. let c = ref 0 in λy^{unit}. if !c = 0 then c := 1 else Ω⟧

Fig. 6: Automata recognising strategies

Note that we have the states (5, 0) and (5, 1): here the second part of the state label is the value of the variable c. The top-level data value will remain in one of these two states, and by doing so store the value of c at that point in the run. The move $q_2$ in this example corresponds to the environment providing an argument y: note that in a run of the automaton the first time a y argument is passed, the automaton proceeds to reach an accepting state, but in doing so sets the top-level data value to the state (5, 1). This means the outgoing transition shown from state 7 cannot fire.

The automaton for ⟦⊢ λx^{unit}. let c = ref 0 in λy^{unit}. if !c = 0 then c := 1 else Ω⟧ is shown in Figure 6b (again, cleaned of dead/unreachable transitions for clarity). Note that this contains the first automaton as a sub-automaton, though with a new top-level data value added to the transitions. The $q_1$ move now corresponds to providing a new argument for x, thus starting a thread. Transitions have been added from the accepting states (5) and (7), allowing a new x-thread to be started from either of these locations. Note that the transition from (7) to (6), which could not fire before, now can fire because several data values (corresponding to different x-threads) can be generated and left in the state (5, 0).
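To make the behaviour of the two terms in Example 1 concrete, here is a minimal sketch that mimics them with Python closures. Divergence (Ω) is modelled by raising a hypothetical `Diverge` exception, which is an assumption of this sketch; a genuinely divergent term would simply never return. The function names `term_a` and `term_b` are likewise illustrative.

```python
class Diverge(Exception):
    """Stand-in for the divergent term Ω (an assumption of this sketch)."""


def term_a():
    # let c = ref 0 in λy. if !c = 0 then c := 1 else Ω
    c = [0]  # a one-element list plays the role of the reference cell

    def body(y=None):
        if c[0] == 0:
            c[0] = 1
            return None  # the unit value
        raise Diverge()

    return body


def term_b():
    # λx. let c = ref 0 in λy. if !c = 0 then c := 1 else Ω
    def outer(x=None):
        return term_a()  # each x-thread allocates its own fresh cell c

    return outer
```

With `f = term_a()`, the first call `f()` returns and a second call raises `Diverge`, mirroring the transition out of state 7 that cannot fire in Figure 6a. With `g = term_b()`, every call `g()` yields a function with its own cell, so a y-argument can be supplied once per x-thread, mirroring the situation in Figure 6b.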

5 Undecidable Fragments

In this section we consider which type sequents and forms of recursion are expressive enough to prove undecidability. The proofs of the results in this section proceed by identifying terms such that the induced complete plays correspond to runs of Turing-complete machine models. Full proofs are given in Appendix C.

On the Right of the Turnstile. In [13] it is shown that observational equivalence is undecidable for 5th-order terms. The proof takes the strategy that was used to show undecidability for 4th-order IA and finds an equivalent call-by-value strategy. It is relatively straightforward to adapt the proof to show that observational equivalence is undecidable at 3rd-order types, e.g. ((unit → unit) → unit) → unit. A further result in [14] showed that the problem is undecidable at the type (unit → unit) → (unit → unit) → unit. Both results easily generalise to show that the problem is undecidable at every 3rd-order type and every 2nd-order type which takes at least two 1st-order arguments. We modify the second of these proofs to show undecidability at (unit → unit) → unit → unit. Our proof of this easily adapts to a proof of the following.

Theorem 4. Observational equivalence is undecidable at every 2nd-order type (of arity at least two) which contains a 1st-order argument that is not the final argument.

On the Left of the Turnstile. Note that ⊢ M ≅ N : θ if, and only if, f : θ → unit ⊢ fM ≅ fN : unit. Thus, for any sequent ⊢ θ at which observational equivalence is undecidable, the sequent θ → unit ⊢ unit is also undecidable. So the problem is undecidable if, on the left of the turnstile, we have a fourth-order type or a (third-order) type which has a second-order argument whose first-order argument is not the last.

Recursion. In IA, observational equivalence becomes undecidable if we add recursive first-order functions [18]. The analogous results for RML with recursion also hold:

Theorem 5. Observational equivalence is undecidable in $\mathrm{RML}_{\text{O-Str}}$ equipped with recursive functions (unit → unit) → unit.

6 Conclusion

We have used two related encodings of pointers to data values to decide two related fragments of $\mathrm{RML}_{2\vdash 1}$: $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$, in which the free variables were limited to arity 1, and $\mathrm{RML}^{\mathrm{res}}_{2\vdash 1}$, in which the free variables were unlimited in arity but each argument of the free variable was limited to arity 1. It is natural to ask whether we can extend or combine these approaches to decide the whole of $\mathrm{RML}_{2\vdash 1}$. Here we discuss why this seems likely to be impossible with the current machinery used.

In deciding $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$ we used the nested data value tree-structure to mirror the shape of the prearenas. These data values can be seen as names for different threads, with the sub-thread relation captured by the nested structure. What happens if we attempt to use this approach to recognise strategies on types where the free variables have arity greater than 1? With free variables having arity 1, whenever they are interrogated by P, they are entirely evaluated immediately: they cannot be partially evaluated. With arity greater than 1, this partial evaluation can happen: P may provide the first argument at some stage, and then at later points evaluate the variable possibly several times with different second arguments. P will only do this subject to visibility conditions though: if P partially evaluates a variable x while in a thread T, it can only continue that partial evaluation of x in T or a sub-thread of T. This leads to problems when our automata recognise interleavings of similar threads using the same part of the automaton. If P’s strategy for the thread T is the strategy ⟦M⟧ for a term M, and recognised by an automaton $A^M$, then ⟦λy.M⟧ will consist of interleavings of ⟦M⟧. The automaton $A^{\lambda y.M}$ will use a copy of $A^M$ to simulate an unbounded number of M-threads. If T is one such thread, which performs a partial evaluation of x, this partial evaluation will be represented by input letters with data values unrelated to the data value of T. If a sibling of T, T′, does the same, the internal state of the automaton will have no way of telling which of these partial evaluations was performed by T and which by T′. Hence it may recognise data words which represent plays that break the visibility condition.

Therefore, to recognise strategies for terms with free variables of arity greater than 1, the natural approach to take is to have the data value of free-variable moves be related to the thread we are in. This is the approach we took in deciding $\mathrm{RML}^{\mathrm{res}}_{2\vdash 1}$: the free variable moves precisely took the data value of the part of the thread they were in. Then information about the partial evaluation was stored by the thread’s data value. This worked when the arguments to the free variables had arity at most 1: however if we allow the arity of this to increase we need to start representing O-pointers in the evaluation of these arguments. For this to be done in a way that makes an inductive construction work for let x = (λy.M) in N, we must use some kind of nesting of data values for the different M-threads. The naïve approach to take is to allow the M-thread data values to be nested under the data value of whatever part of the N-thread they are in. However, the M-thread may be started and partially evaluated in one part of the N-thread, and then picked up and continued in a descendant part of that N-thread. The data values used in continuing the M-thread must therefore be related to the data values used to represent the partial evaluation of the M-thread, but also to the part of the N-thread the play is currently in. This would break the tree-structure of the data values, and so seems to require a richer structure on the data values.

Further Work. A natural direction for further work, therefore, is to investigate richer data structures and automata models over them that may provide a way to decide $\mathrm{RML}_{2\vdash 1}$.

The automata we used have a non-primitive recursive emptiness problem, and hence the resulting algorithms both have non-primitive recursive complexity also. Although work in [8] shows that this is not the best possible result in the simplest cases, the exact complexities of the observational equivalence problems are still unknown.

To complete the classification of $\mathrm{RML}_f$ also requires deciding (or showing undecidable) the fragment containing order 2 types (on the RHS) with one order 1 argument, which is the last argument. A first step to deciding this would be the fragment labelled $\mathrm{RML}_X$ in figure 1. Deciding this fragment via automata reductions similar to those in this paper would seem to require both data values to represent O-pointers, and some kind of visible stack to nest copies of the body of the function, as used in [9]. In particular, recognising strategies of second-order terms such as λf.f() requires the ability to recognise data languages (roughly) of the form $\{d_1 d_2 \dots d_n d_n \dots d_2 d_1 \mid n \in \mathbb{N}, \text{ each } d_i \text{ is distinct}\}$. A simple pumping argument shows such languages cannot be recognised by nested data class memory automata, and so some kind of additional stack would seem to be required.
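As an informal illustration of why a stack seems natural for the language just described, the following sketch checks membership of a word in the set of words d1 d2 … dn dn … d2 d1 with pairwise distinct d_i. The function name and the use of a Python list as a stack are choices made for this sketch; it is not an automaton construction from the paper.

```python
def is_nested_data_word(word):
    """Check whether word = d1 d2 ... dn dn ... d2 d1 with all d_i distinct."""
    if len(word) % 2 != 0:
        return False
    half = len(word) // 2
    first = word[:half]
    if len(set(first)) != half:  # the d_i must be pairwise distinct
        return False
    stack = list(first)          # push d1, ..., dn in order
    for value in word[half:]:
        # each later value must match the most recently opened one
        if not stack or stack.pop() != value:
            return False
    return not stack
```

The check needs to remember the order in which an unbounded number of distinct values were opened, which is exactly the information a stack provides.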

References

1. Samson Abramsky and Guy McCusker. Linearity, sharing and state: a fully abstract game semantics for idealized algol with active expressions. Electr. Notes Theor. Comput. Sci., 3:2–14, 1996.

2. Samson Abramsky and Guy McCusker. Call-by-value games. In Mogens Nielsen and Wolfgang Thomas, editors, CSL 1997, volume 1414 of Lecture Notes in Computer Science, pages 1–17. Springer, 1997.

3. Henrik Björklund and Thomas Schwentick. On notions of regularity for data languages. Theor. Comput. Sci., 411(4-5):702–715, 2010.

4. Conrad Cotton-Barratt, Andrzej S. Murawski, and C.-H. Luke Ong. Weak and nested class memory automata. Proceedings of LATA 2015, to appear, 2015.

5. Normann Decker, Peter Habermehl, Martin Leucker, and Daniel Thoma. Ordered navigation on multi-attributed data words. In Paolo Baldan and Daniele Gorla, editors, CONCUR 2014, volume 8704 of Lecture Notes in Computer Science, pages 497–511. Springer, 2014.

6. Dan R. Ghica and Guy McCusker. The regular-language semantics of second-order idealized algol. Theor. Comput. Sci., 309(1-3):469–502, 2003.

7. Kohei Honda and Nobuko Yoshida. Game-theoretic analysis of call-by-value computation. Theor. Comput. Sci., 221(1-2):393–456, 1999.

8. David Hopkins. Game Semantics Based Equivalence Checking of Higher-Order Programs. PhD thesis, Department of Computer Science, University of Oxford, 2012.

9. David Hopkins, Andrzej S. Murawski, and C.-H. Luke Ong. A fragment of ML decidable by visibly pushdown automata. In Luca Aceto, Monika Henzinger, and Jiri Sgall, editors, ICALP 2011, volume 6756 of Lecture Notes in Computer Science, pages 149–161. Springer, 2011.

10. J. M. E. Hyland and C.-H. Luke Ong. On full abstraction for PCF: I, II, and III. Inf. Comput., 163(2):285–408, 2000.

11. N. D. Jones and S. S. Muchnick. The complexity of finite memory programs with recursion. J. ACM, 25(2), 1978.

12. M. L. Minsky. Computation: Finite and Infinite Machines. Prentice-Hall, Inc., 1967.

13. Andrzej S. Murawski. On program equivalence in languages with ground-type references. In 18th IEEE Symposium on Logic in Computer Science (LICS 2003), page 108. IEEE Computer Society, 2003.

14. Andrzej S. Murawski. Functions with local state: Regularity and undecidability. Theor. Comput. Sci., 338(1-3):315–349, 2005.

15. Andrzej S. Murawski, C.-H. Luke Ong, and Igor Walukiewicz. Idealized algol with ground recursion, and DPDA equivalence. In Luís Caires, Giuseppe F. Italiano, Luís Monteiro, Catuscia Palamidessi, and Moti Yung, editors, ICALP 2005, volume 3580 of Lecture Notes in Computer Science, pages 917–929. Springer, 2005.

16. Andrzej S. Murawski and Nikos Tzevelekos. Algorithmic nominal game semantics. In Gilles Barthe, editor, 20th European Symposium on Programming, ESOP 2011, volume 6602 of Lecture Notes in Computer Science, pages 419–438. Springer, 2011.

17. Andrzej S. Murawski and Nikos Tzevelekos. Algorithmic games for full ground references. In Artur Czumaj, Kurt Mehlhorn, Andrew M. Pitts, and Roger Wattenhofer, editors, ICALP 2012, volume 7392 of Lecture Notes in Computer Science, pages 312–324. Springer, 2012.

18. C.-H. Luke Ong. An approach to deciding the observational equivalence of algol-like languages. Ann. Pure Appl. Logic, 130(1-3):125–171, 2004.

19. Andrew M. Pitts and Ian D. B. Stark. Operational reasoning for functions with local state. Higher order operational techniques in semantics, pages 227–273, 1998.

20. E. L. Post. Formal reductions of the general combinatorial decision problem. AJM, 65(2), 1943.

A Proof of Theorem 2

Given a $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$ term-in-context Γ ⊢ M we construct a deterministic weak NDCMA $A^{\Gamma \vdash M}$ recognising, as a language, comp(⟦Γ ⊢ M⟧). By the full abstraction theorem, observational equivalence can then be checked by testing the corresponding automata for equivalence.

For notational convenience, in this appendix we write $\binom{a}{d}$ for the letter (a, d) ∈ D.

[Figure 7: prearena diagram with nodes $q_0, a_0, q_1, a_1, \dots, q_n, a_n$ and $q', q'_0, a'_0, \dots, q'_m, a'_m, a'$.]

Fig. 7: Shape of prearenas for $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$

The shape of the prearena for terms ⟦Γ ⊢ M⟧ in $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$ is shown in figure 7. The moves on the right of the prearena correspond to M, while moves on the left correspond to Γ.

For type sequents Γ ⊢ θ in $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$, a play p in ⟦Γ ⊢ θ⟧ is represented in the data language as a word w where the string projection of w is equal to the underlying sequence of moves in p. Pointers are only ambiguous for question moves in sections A and C of the arena. Pointers for questions are represented in the following manner:

– The initial question takes a (fresh) level-0 data value.
– If a is an answer-move in the play, then the corresponding letter in the word will be $\binom{a}{d}$ where d is the same data value as the answer’s justifier.
– Question moves in sections A, B, and C of the arena take a fresh data value d, such that pred(d) is the data value of the justifying move.

Essentially: question moves take a data value whose predecessor is the data value of the justifying move, answer moves take the data value of the question they answer.
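The encoding just summarised can be sketched as follows. The play format (a list of triples giving the move name, whether it is a question, and the index of its justifier) and the tuple representation of nested data values are assumptions of this sketch, chosen only to keep the example self-contained.

```python
from collections import defaultdict


def encode(play):
    """play: list of (name, is_question, justifier_index or None)."""
    values = []
    children = defaultdict(int)  # fresh values created so far under each value
    for name, is_question, justifier in play:
        if justifier is None:
            value = (0,)                          # initial question: level-0 value
        elif is_question:
            parent = values[justifier]
            value = parent + (children[parent],)  # fresh d with pred(d) = parent
            children[parent] += 1
        else:
            value = values[justifier]             # answers copy their question's value
        values.append(value)
    return [(move[0], value) for move, value in zip(play, values)]


def data_classes(word):
    """Group the letters of a data word by their data value."""
    classes = defaultdict(list)
    for name, value in word:
        classes[value].append(name)
    return dict(classes)
```

Grouping the encoded word by data value makes it easy to inspect the data classes of a complete play: each one holds a question followed by its answer.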

We note that this has the following (convenient) consequence: each data class of a word in such a language is either empty or of the form $\binom{q}{d}\binom{a}{d}$. It cannot be longer than this.

Reduction from $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$. The reduction is inductive on the construction of the canonical form. We make the construction indexed by initial moves, with each automaton $A_i$ recognising the appropriate language restricted to the initial move i. The construction to combine these into one automaton as per the specification above is a straightforward union of the automata and merging of the initial states.

Our inductive hypothesis is slightly stronger than that the constructed automata recognise the appropriate languages. We also require the following conditions on the automaton $A^M_i$:

– Initial states are never revisited (or have data values assigned to them).
– The automaton is deterministic.
– Each state can only ever “hold” data values of one, fixed, level.
– There is precisely one transition from the initial state, labelled i, (0, ⊥). We will call the target state of this transition the “secondary state” of the automaton. Further, this is the only transition in the automaton with signature (0, ⊥).
– If q and q′ are (non-initial) final states in the automaton, then if there is a transition (q, a, ξ, p, ξ′) then (q′, a, ξ, p, ξ′) is also a transition.

Notation describing NDCMA. In the following, I represent transitions of NDCMA in a couple of ways. The most standard notation I use is to write something of the form $p \xrightarrow{m,(k,\bar{p})} q, \bar{q}$. Here we have m ∈ Σ, p, q ∈ Q, and $\bar{p}, \bar{q} \in (Q_\bot)^k$. This means that $(q, \bar{q}) \in \delta_k(p, m, (k, \bar{p}))$.

I may write $\binom{s}{\bar{s}}$ for the (k + 1)-vector of elements of $Q_\bot$ obtained by putting s “on top” of the k-vector $\bar{s}$. Similarly $\binom{\bar{s}}{s}$ puts s “below” $\bar{s}$.

Sometimes I omit the final $\bar{q}$: in this case it is implicitly assumed to only update the currently-read data value, which is updated to q. Formally: this means $\bar{q} = \bar{p}[q/p_k]$.

When I am omitting this final $\bar{q}$, it is also possible to draw the automata in a relatively standard manner (e.g. in the first couple of cases below).
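The shorthand for an omitted final vector can be spelled out in a few lines. In this sketch a signature is represented as a Python tuple of states, with `None` standing for ⊥; both choices, and the function name `expand_shorthand`, are assumptions made for illustration.

```python
BOT = None  # stands for ⊥, meaning "no state recorded yet"


def expand_shorthand(signature, target_state):
    """Return the full updated signature when the final vector is omitted.

    Only the last component (the currently read data value) changes; the
    components recorded for its ancestors are left as they were.
    """
    return tuple(signature[:-1]) + (target_state,)
```

For instance, a transition into state `s3` written with signature `("s2", BOT)` and no explicit final vector is read as updating the signature to `("s2", "s3")`.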

A.1 () : unit

For ⟦Γ ⊢ () : unit⟧ the complete plays of the strategy are of the form γ • (or the empty play). Hence $A_i$ is simply:

$$s_1 \xrightarrow{\gamma,(0,\bot)} s_2 \xrightarrow{\bullet,(0,s_2)} s_3$$

A.2 i : int

This is also straightforward, identical to the last case but with a differently labelled move:

$$s_1 \xrightarrow{\gamma,(0,\bot)} s_2 \xrightarrow{i,(0,s_2)} s_3$$

A.3 $x^{\beta} : \beta$

Here we have $\Gamma \vdash x^{\beta}$, so x : β is in Γ. Thus the initial moves have an x-component, so an initial move is of the form (γ, j) where j is in the x-component. For such an initial move, the plays recognised are just $\left\{ \binom{(\gamma,j)}{d}\binom{j}{d} : d \in D \text{ is level-0} \right\}$, and again the appropriate automaton is straightforwardly given:

$$s_1 \xrightarrow{(\gamma,j),(0,\bot)} s_2 \xrightarrow{j,(0,s_2)} s_3$$

A.4 $\mathrm{succ}(x^{\mathrm{int}}) : \mathrm{int}$ and $\mathrm{pred}(x^{\mathrm{int}}) : \mathrm{int}$

These are just as in the previous case, but adding or subtracting one to the j (modulo the fragment of ℤ being used).

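A.5 $x^{\mathrm{int\ ref}} := y^{\mathrm{int}} : \mathrm{unit}$

Here we have $\Gamma \vdash x^{\mathrm{int\ ref}} := y^{\mathrm{int}}$, so x : int ref and y : int are in Γ. Thus the initial moves have a y-component, say j. Thus the language recognised by $A_{(\gamma,j)}$ is just:

$$\left\{ \binom{(\gamma,j)}{d}\binom{\mathit{write}_x(j)}{d'}\binom{\mathit{ok}_x}{d'}\binom{\bullet}{d} \;\middle|\; d \text{ is level-0 and } \mathit{pred}(d') = d \right\}$$

This is recognised by the following automaton:

$$s_1 \xrightarrow{(\gamma,j),(0,\bot)} s_2 \xrightarrow{\mathit{write}_x(j),\left(1,\binom{s_2}{\bot}\right)} s_3 \xrightarrow{\mathit{ok}_x,\left(1,\binom{s_2}{s_3}\right)} s_4 \xrightarrow{\bullet,(0,s_2)} s_5$$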
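To see how such an automaton consumes a data word, the following sketch runs a deterministic weak NDCMA given as a dictionary. Several things here are assumptions of the sketch rather than definitions from the paper: data values are tuples whose prefixes are their predecessors, the signature read at each step lists the states last recorded for the ancestors of the data value followed by the value itself, acceptance depends only on the final control state, and the letter names and the choice of `s5` as the only final state are illustrative.

```python
BOT = None  # stands for ⊥


def accepts(transitions, initial, finals, word):
    """Run a deterministic weak NDCMA, given as a dict, on a data word.

    transitions maps (state, letter, level, signature) to (state', signature').
    Data values are tuples; dropping the last entry gives the predecessor.
    """
    state, memory = initial, {}
    for letter, value in word:
        level = len(value) - 1
        chain = [value[: i + 1] for i in range(level + 1)]  # ancestors, then value
        signature = tuple(memory.get(d, BOT) for d in chain)
        step = transitions.get((state, letter, level, signature))
        if step is None:
            return False  # no transition is enabled
        state, updated = step
        memory.update(zip(chain, updated))  # record the new states along the chain
    return state in finals


j = 3  # hypothetical value of the y-component
automaton = {
    ("s1", ("gamma", j), 0, (BOT,)): ("s2", ("s2",)),
    ("s2", ("write", j), 1, ("s2", BOT)): ("s3", ("s2", "s3")),
    ("s3", "ok", 1, ("s2", "s3")): ("s4", ("s2", "s4")),
    ("s4", "done", 0, ("s2",)): ("s5", ("s5",)),
}
```

Running `accepts(automaton, "s1", {"s5"}, word)` on the word whose write and ok letters share one fresh level-1 data value under the level-0 value succeeds, while a word in which the ok letter carries a different level-1 value is rejected because its signature does not match.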

A.6 $!x^{\mathrm{int\ ref}} : \mathrm{int}$

This is similar to the previous case, except that the value for P to return is given by O’s response to $\mathit{read}_x$. The desired language for $A_\gamma$ is just

$$\left\{ \binom{\gamma}{d}\binom{\mathit{read}_x}{d'}\binom{j_x}{d'}\binom{j}{d} \;\middle|\; j \in \mathbb{N},\ d \text{ is level-0 and } \mathit{pred}(d') = d \right\}$$

This is recognised by a very similar automaton to the previous case, except that from state $s_3$ the automaton splits into different states for each possible answer $j_x$.

A.7 $\mathrm{if}\ x^{\beta}\ \mathrm{then}\ M\ \mathrm{else}\ N : \theta$

The initial move contains an x-component. If this x-component is 0 then the automaton is as the automaton for N, otherwise it is as the automaton for M.

A.8 $\mathrm{mkvar}(\lambda x^{\mathrm{unit}}.M, \lambda y^{\mathrm{int}}.N) : \mathrm{int\ ref}$

This construction is very similar to that provided in the $\mathrm{RML}^{\mathrm{res}}_{2\vdash 1}$ case, in appendix B.4. The only difference is that now the automata for M and N needn’t be level-0, as they may make plays in Γ. These parts of the automata must retain the data-levels used, but nested under the initial level-0 data value used: this is a simple construction.

A.9 while M do N

The strategy ⟦while M do N⟧ plays as if playing M until the final move would be made. If this would be 0, P gives the • answer to the initial move, and stops. Otherwise it plays as if playing N, until the final move would be made, when it starts as if playing M again. This is easy to construct from the automata $A^M_\gamma$ and $A^N_\gamma$: for a full formal description of how, see appendix B.5 which deals with the $\mathrm{RML}^{\mathrm{res}}_{2\vdash 1}$ case. The construction here is very similar: the only difference is that the constituent automata may no longer be level-0. This could lead to difficulties if data values spawned in one run-through of the loop could be used in a later run-through, but this cannot happen as arguments cannot be partially evaluated in this fragment, so cannot be returned to later.

A.10 let x = ref 0 in M : θ

We assume we have a family of automata, $A^M_i$, recognising the strategy ⟦Γ, x : int ref ⊢ M : θ⟧.

⟦Γ ⊢ let x = ref 0 in M : θ⟧ is constructed by restricting behaviour of x to “good variable” behaviour (i.e. after a read-move the response is an immediate reply of the last integer written to the variable), and then hiding those moves. The automata construction is done in these two stages.
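At the level of individual plays, the two stages can be sketched as below. The move format is an assumption of this sketch: x-moves are tuples tagged `"write"`, `"ok"`, `"read"` and `"value"` (the response to a read), and every other tuple is treated as a visible move. The automaton construction in the text works on transitions rather than on single plays, so this is only meant to convey the idea.

```python
def restrict_and_hide(play):
    """Return the play with x-moves hidden, or None if x misbehaves.

    Moves are tuples: ("write", j), ("ok",), ("read",), ("value", j) are the
    x-moves (names are assumptions of this sketch); anything else is kept.
    """
    current = 0  # let x = ref 0: the cell starts at 0
    visible = []
    for move in play:
        kind = move[0]
        if kind == "write":
            current = move[1]
        elif kind == "value":
            if move[1] != current:  # a read must return the last value written
                return None
        elif kind not in ("ok", "read"):
            visible.append(move)
    return visible
```

A play in which a read is answered with anything other than the last written value is discarded by the first stage; the surviving plays lose their x-moves in the second.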

Restriction to good-variable behaviour. The value of the variable will be stored in both the current state of the automaton, and by the level-0 data value. The level-0 data value will be used to ensure that when O switches between threads (see the λ-abstraction construction in section A.11), the correct variable value is retained. By keeping the value in the current state, the correct value is retained when moves in Γ are being made. Assume the finitary fragment we are using is {0, 1, . . . , K}. Let Q be the non-initial states of $A^M_\gamma$. We construct $C_\gamma$ as follows:

– The states of the automaton are $\{q_I\} \uplus (Q \times \{0, 1, \dots, K\})$
– The final states are $q_I$ and those which are final in $A^M_\gamma$ paired with any integer.
– The initial state is $q_I$
– The transitions are given as follows:
  • $q_I \xrightarrow{q_0,(0,\bot)} (s_M, 0)$ where $s_M$ is the secondary state of $A^M_\gamma$.
  • if $q_1 \xrightarrow{m,\left(k,\begin{pmatrix} s_0 \\ \vdots \\ s_k \end{pmatrix}\right)} q_2, \begin{pmatrix} t_0 \\ \vdots \\ t_k \end{pmatrix}$ is in $A^M_\gamma$ where m is not an x-write move or a response to an x-read move, then:
    ∗ if m is a $q_j$ move for some j ≠ 0, for each $i_0, i_1, \dots, i_k \in \{0, 1, \dots, K\}$, we have the transition $(q_1, i_0) \xrightarrow{m,\left(k,\begin{pmatrix} (s_0,i_0) \\ \vdots \\ (s_k,i_k) \end{pmatrix}\right)} (q_2, i_0), \begin{pmatrix} (t_0,i_0) \\ \vdots \\ (t_k,i_0) \end{pmatrix}$ (for convenience, if $s_k = \bot$ we interpret $(s_k, i)$ as ⊥ also).
    ∗ if m is not a $q_j$ move, for each $i, j_0, j_1, \dots, j_k \in \{0, 1, \dots, K\}$, we have the transition $(q_1, i) \xrightarrow{m,\left(k,\begin{pmatrix} (s_0,j_0) \\ \vdots \\ (s_k,j_k) \end{pmatrix}\right)} (q_2, i), \begin{pmatrix} (t_0,i) \\ \vdots \\ (t_k,i) \end{pmatrix}$.
  • For each j, if $q_1 \xrightarrow{\mathit{write}_x(j),\left(1,\binom{s_0}{\bot}\right)} q_2, \binom{t_0}{t_1}$ is in $A^M_\gamma$, then we have the transition $(q_1, i_1) \xrightarrow{\mathit{write}_x(j),\left(1,\binom{(s_0,i_0)}{\bot}\right)} (q_2, j), \binom{(t_0,j)}{(t_1,j)}$ for each $i_1, i_0$.
  • For each response to an x-read move, $j_x$, if $q_1 \xrightarrow{j_x,\left(1,\binom{s_0}{s_1}\right)} q_2, \binom{t_0}{t_1}$ is in $A^M_\gamma$, then we have $(q_1, j) \xrightarrow{j_x,\left(1,\binom{(s_0,i_0)}{(s_1,i_1)}\right)} (q_2, j), \binom{(t_0,j)}{(t_1,j)}$ for each $i_0, i_1$.

Note that adding the transitions between accepting states as required by the inductive hypothesis will not change the language recognised, since all the outward transitions added would be labelled with a $q_j$ move, and by construction these moves require a level-0 data value to be in the correct place.

Hiding. $A^{\mathrm{let}\ x = \mathrm{ref}\ 0\ \mathrm{in}\ M}_\gamma$ is constructed from $C_\gamma$ as follows:

If we are in a configuration $(s_1, f)$ of $C_i$ where we can perform a transition $s_1 \xrightarrow{m_x,(j,\bar{s})} s_2, \bar{t}$ where $m_x$ is an x-move then by determinacy of strategies combined with the restriction to good variable behaviour, it is the only possible transition from this configuration. Thus for every state $s_0$ of $C_\gamma$ and every possible “signature” $\binom{t_0}{t'_0}$, there is a unique maximal (and not necessarily finite) sequence of transitions:

$$s_0 \xrightarrow{m_0,\left(1,\binom{t_0}{\bot}\right)} s_1, \binom{t_1}{t'_1} \xrightarrow{m_1,\left(1,\binom{t_1}{t'_1}\right)} s_2, \binom{t_2}{t'_2} \xrightarrow{m_2,\left(1,\binom{t_2}{\bot}\right)} \dots$$

or

$$s_0 \xrightarrow{m_0,\left(1,\binom{t_0}{t'_0}\right)} s_1, \binom{t_1}{t'_1} \xrightarrow{m_1,\left(1,\binom{t_1}{\bot}\right)} s_2, \binom{t_2}{t'_2} \xrightarrow{m_2,\left(1,\binom{t_2}{t'_2}\right)} \dots$$

where each $m_i$ is an x-move.

From each $C_\gamma$ we construct the automaton $A^{\mathrm{let}\ x = \mathrm{ref}\ 0\ \mathrm{in}\ M}_\gamma$ by considering where this sequence terminates for each state. Everything is the same as in $C_\gamma$ except for the transition relation, which is altered as follows:

– If the maximal sequence of x-moves with signature $\binom{t_0}{t'_0}$ out of state $s_0$ is empty then all transitions requiring signature $\binom{t_0}{t'_0}$ out of $s_0$ are unchanged.
– If the maximal sequence out of $s_0$ with signature $\binom{t_0}{t'_0}$ is finite and non-empty and ends in state $s_n$ and with signature $\binom{t_n}{t'_n}$, then for every transition $s_n \xrightarrow{m,\left(1,\binom{t_n}{t'_n}\right)} s_{n+1}, \bar{t}$ we add the transition $s_0 \xrightarrow{\epsilon} s_{n+1}$ (note that by determinacy of the strategy and restriction to good variable behaviour, this ε transition can be compressed out without loss of determinacy).
– All transitions on x-moves are removed.
– Transitions from final states as required by the inductive hypothesis are added. This does not affect the language recognised, since the added transitions will require a level-0 data value to be “in” the relevant copy of $Q_0$, and there can only be one level-0 data value in runs of this automaton.

Determinacy of the resulting automaton is inherited from determinism of $C_\gamma$ (and thence from $A^M_\gamma$).

A.11 $\lambda x^{\beta}.M : \beta \to \theta$

We have Γ, x : β ⊢ M : θ, and therefore assume there is a family of automata $A^M_i$ recognising ⟦M⟧. The prearenas for ⟦Γ, x ⊢ M⟧ and ⟦Γ ⊢ λx.M⟧ are shown in figure 4. Note that the initial moves in ⟦Γ, x ⊢ M⟧ contain an x-component, so may be considered pairs $(\gamma, i_x)$, while the initial moves in ⟦Γ ⊢ λx.M⟧ contain the same Γ-component, but no x-component. The move $q_0$ therefore corresponds to the Γ-component, and the move $q_1$ precisely corresponds to an x-move.

⟦Γ ⊢ λx.M⟧ is as follows: after an initial move γ, P plays the unique $a_0$-move •, and waits for a $q_1$-move. Once O plays a $q_1$-move $i_x$, P plays as in ⟦Γ, x ⊢ M⟧ when given an initial move $(\gamma, i_x)$. However, as the $q_1$-moves are not initial, it is possible that O will play another $q_1$-move, $i'_x$. Each time O does this it opens a new thread which P plays as per ⟦Γ, x ⊢ M⟧ when given initial move $(\gamma, i'_x)$. Only O may switch between threads, and this can only happen immediately after P plays an $a_i$-move (for any i). Hence we construct $A^{\lambda x.M}_\gamma$ as follows:

– The set of states is the disjoint union of the set of non-initial states of each $A^M_{(\gamma,i_x)}$, plus new states (1), (2), and (3).
– The initial state is (1).
– The final states are those that are final in each $A^M_{(\gamma,i_x)}$, as well as (1) and (3).
– The transition relation is as follows:
  • $(1) \xrightarrow{\gamma,(0,\bot)} (2)$
  • $(2) \xrightarrow{a_0,(0,(2))} (3)$
  • For each $i_x$, $(3) \xrightarrow{i_x,\left(1,\binom{(3)}{\bot}\right)} s_{i_x}$ where $s_{i_x}$ is the secondary state of $A^M_{\gamma,i_x}$
  • If $s_1 \xrightarrow{m,(j,\bar{s})} s_2, \bar{t}$ is a (non-initial) transition in one of the $A^M_{i_x}$, then:
    ∗ if m is a $q_i$ or $a_i$ move, $s_1 \xrightarrow{m,\left(j+1,\binom{(3)}{\bar{s}}\right)} s_2, \binom{(3)}{\bar{t}}$ is a transition.
    ∗ if m is a move in ⟦Γ⟧, $s_1 \xrightarrow{m,(j,\bar{s}[(3)/s_0])} s_2, \bar{t}[(3)/t_0]$ is a transition.
  • If $s_1$ and $s'_1$ are both (non-initial) final states and $s_1 \xrightarrow{m,(j,\bar{s})} s_2, \bar{t}$ is a transition already given by the above rules, then $s'_1 \xrightarrow{m,(j,\bar{s})} s_2, \bar{t}$. (This allows O to switch between threads).
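The step that nests the transitions of the sub-automata under the new state (3) can be sketched as a transformation on a transition table. The dictionary format, mapping `(state, move, level, signature)` to `(state', signature')` with signatures as tuples, is an assumption of this sketch, and only the first sub-case (moves $q_i$ and $a_i$, whose level increases by one) is covered; moves in ⟦Γ⟧ are handled by substitution in the text and are not modelled here.

```python
def nest_under(transitions, top):
    """Lift each transition one level by placing `top` above its signatures."""
    lifted = {}
    for (source, move, level, signature), (target, updated) in transitions.items():
        # the level goes from j to j + 1 and `top` becomes the new first entry
        lifted[(source, move, level + 1, (top,) + tuple(signature))] = (
            target,
            (top,) + tuple(updated),
        )
    return lifted
```

Applied with `top` set to a label for state (3), every lifted transition requires the new level-0 data value to sit in (3) and leaves it there, so all threads share the same top-level data value.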

A.12 $\mathrm{let}\ x^{\beta} = M\ \mathrm{in}\ N : \theta$

Here we have Γ ⊢ M : β and Γ, x : β ⊢ N : θ. The initial moves of ⟦Γ, x : β ⊢ N : θ⟧ contain an x-component, so we index the family of automata recognising ⟦Γ, x : β ⊢ N : θ⟧ as $A^N_{\gamma,j}$ where j is the x-component. The family of automata recognising ⟦M⟧ are indexed as $A^M_\gamma$.

The strategy ⟦let $x^{\beta}$ = M in N⟧ is essentially a concatenation of the strategies for ⟦M⟧ and ⟦N⟧, with the result of the ⟦M⟧ strategy determining the x-component of the initial move of ⟦N⟧. $A^{\mathrm{let}\ x^{\beta} = M\ \mathrm{in}\ N}_i$ is constructed as follows:

If $L(A^M_i) = \left\{ \binom{\gamma}{d}\binom{j}{d} \right\}$ then $A^{\mathrm{let}\ x^{\beta} = M\ \mathrm{in}\ N}_\gamma = A^N_{\gamma,j}$. Otherwise, by determinacy of the strategy there cannot be a transition from the secondary state to a final state in $A^M_\gamma$, and $A^{\mathrm{let}\ x^{\beta} = M\ \mathrm{in}\ N}_\gamma$ is given by:

– The set of states is the disjoint union of the non-initial states of $A^M_i$ and each $A^N_{i,j}$, plus a new state (1).
– The initial state is (1).
– The final states are those which are final in each $A^N_{\gamma,j}$.
– The transitions are given as follows:
  • $(1) \xrightarrow{\gamma,(0,\bot)} s_M$ where $s_M$ is the secondary state of $A^M_\gamma$
  • All transitions in $A^M_\gamma$ not going to a final state (or from the initial state) are preserved
  • If $s_1 \xrightarrow{j,(0,s_3)} s_2, \bar{t}$ is a transition in $A^M_\gamma$ with $s_2$ final (in $A^M_\gamma$) and $s_{N,j}$ is the secondary state of $A^N_{\gamma,j}$:
    ∗ if $s_{N,j} \xrightarrow{m,(0,s_{N,j})} s_4, \bar{t}'$ is in $A^N_{\gamma,j}$ then we have the transition $s_1 \xrightarrow{m,(0,s_3)} s_4, \bar{t}'$.
    ∗ if $s_{N,j} \xrightarrow{m,\left(1,\binom{s_{N,j}}{\bot}\right)} s_4, \bar{t}'$ is in $A^N_{\gamma,j}$ then we have the transition $s_1 \xrightarrow{m,\left(1,\binom{s_3}{\bot}\right)} s_4, \bar{t}'$.
  • All other transitions in each $A^N_{\gamma,j}$ are preserved unchanged.
  • Transitions from final states as required by the inductive hypothesis are added. This does not affect the language recognised, since the added transitions will require a level-0 data value to be “in” the relevant copy of $A^N_{\gamma,j}$, and there can only be one level-0 data value in runs of this automaton.

Determinacy is inherited from $A^M_\gamma$ and $A^N_{\gamma,j}$.
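The idea that the strategy for a let-term is "essentially a concatenation" can be illustrated on single plays, ignoring data values. The list-based play format and the dictionary `n_plays` keyed by the pair (γ, j) are assumptions of this sketch; the construction in the text splices automata rather than individual plays.

```python
def let_play(m_play, n_plays):
    """Combine a complete play of M with a play of N, hiding the value of x.

    m_play:  [gamma, ..., j]  where j is M's final answer
    n_plays: maps (gamma, j) to N's play [(gamma, j), ...] (hypothetical format)
    """
    gamma, result = m_play[0], m_play[-1]
    n_play = n_plays[(gamma, result)]
    # keep M's moves up to (not including) its answer, then N's moves after
    # N's initial move
    return m_play[:-1] + n_play[1:]
```

When M answers immediately, so that its play is just γ followed by j, the combined play is exactly N's play with the initial move (γ, j) replaced by γ, matching the first case of the construction.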

A.13 $\mathrm{let}\ x = z\,y^{\beta}\ \mathrm{in}\ M : \theta$

As x must be of type β for this to be in $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$, this is essentially the same as the previous case.

A.14 let x = z(λy.M) in N : θ

Here we have $\Gamma, y : \beta, z : (\beta \to \theta_1) \to \beta \vdash M : \theta_1$ and $\Gamma, x : \beta, z : (\beta \to \theta_1) \to \beta \vdash N : \theta$. As in the previous cases, plays in ⟦let x = z(λy.M) in N⟧ consist of P playing ⟦z(λy.M)⟧ until x has been evaluated, and then playing as N with this value of x. The prearena for this case is shown in figure 8.

Plays in ⟦let x = z(λy.M) in N⟧ start with P playing $q_z$. O can then either play $q'_0$, starting a ⟦λy.M⟧-thread, or play $a_z$, giving a value for x in the rest of the play. If O chooses the former, that thread is played as ⟦λy.M⟧, with $q'_0$-moves providing a new value for y, until O plays an $a_z$ move. Once O does play an $a_z$ move, P plays as ⟦N⟧ with the answer O provided as the value for x.

Here a construction similar to that in A.11 is used, to allow O to interleave plays of ⟦M⟧. At any point when O would be able to change threads, it is also able to finish evaluating M and give a value for x. Once this happens play continues in the corresponding $A^N_{\gamma,j_x}$. The formal construction for $A^{\mathrm{let}\ x = z(\lambda y.M)\ \mathrm{in}\ N}_{\gamma,i_z}$ is as follows:

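[Figure 8: prearena diagram with initial move $(\gamma, i_z)$, the component ⟦Γ⟧ containing $q_z$, $q'_0$, $a'_0$ and $a_z$, and the component $⟦\theta_1⟧$, together with the moves $a_0, q_1, a_1, \dots, q_n, a_n$.]

Fig. 8: Prearena for $⟦\Gamma, z : (\beta \to \theta_1) \to \beta \vdash \theta⟧$

– The set of states consists of:
  • Fresh states (1), (2), and (3)
  • A copy of the states of each $A^M_{\gamma,i_y}$
  • A copy of the states of each $A^N_{\gamma,j_x}$
– The initial state is (1)
– The final states are those which are final in each $A^N_{\gamma,j_x}$, and (1)
– The transitions are: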

  • $(1) \xrightarrow{\gamma,(0,\bot)} (2)$
  • $(2) \xrightarrow{q_z,\left(1,\binom{(2)}{\bot}\right)} (3)$
  • For each available $a_z$ move labelled $j_x$ we have the transition $(3) \xrightarrow{j_x,\left(1,\binom{(2)}{(3)}\right)} s_{N,j}$ where $s_{N,j}$ is the secondary state of $A^N_{\gamma,j_x}$
  • For each available $q'_0$ move labelled $i_y$, we have $(3) \xrightarrow{i_y,\left(2,\begin{pmatrix} (2) \\ (3) \\ \bot \end{pmatrix}\right)} s_M$ where $s_M$ is the secondary state of $A^M_{\gamma,i_y}$
  • If $s_1 \xrightarrow{m,(j,\bar{s})} s_2, \bar{t}$ is a (non-initial) transition in one of the $A^M_{i_y}$, then:
    ∗ if m is a $q_i$ or $a_i$ move in $⟦\theta_1⟧$, $s_1 \xrightarrow{m,\left(j+2,\begin{pmatrix} (2) \\ (3) \\ \bar{s} \end{pmatrix}\right)} s_2, \begin{pmatrix} (2) \\ (3) \\ \bar{t} \end{pmatrix}$ is a transition.
    ∗ if m is a move in ⟦Γ⟧, $s_1 \xrightarrow{m,(j,\bar{s}[(2)/s_0])} s_2, \bar{t}[(2)/t_0]$ is a transition.
  • If $q_1 \xrightarrow{m,(k,\bar{s})} q_2, \bar{t}$ is a transition already defined by one of these, and $q_1$ is either (3) or a (non-initial) final state in one of the $A^M_{\gamma,i_y}$, and $q_3$ is a (non-initial) final state in one of the $A^M_{\gamma,i_y}$ then we have the transition $q_3 \xrightarrow{m,(k,\bar{s})} q_2, \bar{t}$.
  • For each transition $s_{N,j} \xrightarrow{m,(k,\bar{s})} q, \bar{t}$ in each $A^N_{\gamma,j_x}$, where $s_{N,j}$ is the secondary state of $A^N_{\gamma,j_x}$, we have the transition $s_{N,j} \xrightarrow{m,(k,\bar{s}[(2)/s_0])} q, \bar{t}$
  • All other transitions in each $A^N_{\gamma,j_x}$ are left unchanged
  • Transitions from final states as required by the inductive hypothesis are added. This does not affect the language recognised, since the added transitions will require a level-0 data value to be “in” the relevant copy of $A^N_{\gamma,j_x}$, and there can only be one level-0 data value in runs of this automaton.

Determinism is inherited from the constituent automata.

A.15 $\mathrm{let}\ x = z\,\mathrm{mkvar}(\lambda u^{\mathrm{unit}}.M_1, \lambda v^{\mathrm{int}}.M_2)\ \mathrm{in}\ N : \theta$

This is very similar to the previous case: the difference is that the $q'_0$ moves from the last case can now be either read or write(j), leading to playing as either $⟦M_1⟧$ or $⟦M_2⟧$ respectively. The formal construction is almost identical to that given above.

B Proof of Theorem 3

Given an $\mathrm{RML}^{\mathrm{res}}_{2\vdash 1}$ term-in-context $\Gamma \vdash M$, we construct a deterministic weak NDCMA $\mathcal{A}^{\Gamma\vdash M}$ recognising, as a language, $\mathrm{comp}(\llbracket \Gamma \vdash M \rrbracket)$. By the full abstraction theorem, observational equivalence can then be checked by testing the corresponding automata for equivalence.

The shape of the prearena for terms $\llbracket \Gamma \vdash M \rrbracket$ in $\mathrm{RML}^{\mathrm{res}}_{2\vdash 1}$ is shown in figure 5. The moves in section A of the prearena correspond to $M$, while moves in sections B and C correspond to $\Gamma$.

A play $p$ in $\llbracket \Gamma \vdash \alpha(n) \rrbracket$ is represented in the data language as a word $w$ where the string projection of $w$ is equal to the underlying sequence of moves in $p$. Pointers are only ambiguous for question moves (as for answers well-bracketing is enough to ensure justification is clear). Pointers for questions are represented in the following manner:

– Initial questions (of which there is precisely one, at the beginning of the play) take a fresh level-0 data value.
– If $a$ is an answer-move in the play, then the corresponding letter in the word will be $\binom{a}{d}$, where $d$ is the same data value as the answer’s justifier.
– Question moves in section A of the arena above take a fresh data value, $d$, such that $\mathrm{pred}(d) = d'$, where $d'$ is the data value of the justifier. These data values will be enough to determine the justifiers.
– All other moves (i.e. those in sections B and C) take the data value of the most recent move in A (or the initial move, if no move in A has yet been made). Moves in B will have their pointers represented using the “tagging” of source- and target-moves, as used in [9] for $\mathrm{RML}_{\text{O-Str}}$. We will not encode pointers of such moves justified by the initial move (i.e. q(1) moves), as they are unambiguously justified.

Reduction from $\mathrm{RML}^{\mathrm{res}}_{2\vdash 1}$. The reduction is inductive on the construction of the canonical form. We make the construction indexed by initial moves, with each automaton $\mathcal{A}_i$ recognising the appropriate language restricted to the initial move $i$. The construction to combine these into one automaton as per the specification above is a straightforward union of the automata and merging of the initial states.

Our inductive hypothesis is slightly stronger than the claim that the constructed automata recognise the appropriate languages. We also require the following conditions on the automaton $\mathcal{A}^{M}_{i}$:

– Initial states are never revisited (or have data values assigned to them).
– The automaton is deterministic.
– Each state can only ever “hold” data values of one, fixed, level.
– There is precisely one transition from the initial state, labelled $i, (0,\bot)$. We will call the target state of this transition the “secondary state” of the automaton. Further, this is the only transition in the automaton with signature $(0,\bot)$.
– If $q$ and $q'$ are (non-initial) final states in the automaton, then if there is a transition $(q, a, \xi, p, \xi')$ then $(q', a, \xi, p, \xi')$ is also a transition.

For the cases $() : \mathsf{unit}$, $i : \mathsf{int}$, $x^{\beta} : \beta$, $\mathsf{succ}(x^{\mathsf{int}}) : \mathsf{int}$, and $\mathsf{pred}(x^{\mathsf{int}}) : \mathsf{int}$, the constructions are exactly as in $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$. We deal with the remaining cases here:

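B.1 $x^{\mathsf{int\ ref}} := y^{\mathsf{int}} : \mathsf{unit}$

Here we have $\Gamma \vdash x^{\mathsf{int\ ref}} := y^{\mathsf{int}}$, so $x : \mathsf{int\ ref}$ and $y : \mathsf{int}$ are in $\Gamma$. Thus the initial moves have a $y$-component, say $j$. Thus the language recognised by $\mathcal{A}_{(\gamma,j)}$ is just

$\left\{ \binom{(\gamma,j)}{d}\binom{\mathit{write}_x(j)}{d}\binom{\mathit{ok}_x}{d}\binom{\bullet}{d} \;\middle|\; d \in \mathcal{D} \text{ and } d \text{ is level-0} \right\}$.

This is recognised by the following automaton:

$s_1 \xrightarrow{(\gamma,j),(0,\bot)} s_2 \xrightarrow{w_x(j),(0,s_2)} s_3 \xrightarrow{\mathit{ok}_x,(0,s_3)} s_4 \xrightarrow{\bullet,(0,s_4)} s_5$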
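As an informal illustration of how such an automaton reads a data word, the following Python sketch runs the four-transition automaton above on words of (letter, data value) pairs. It is a simplification under stated assumptions, not the NDCMA definition used in this paper: all data values are level-0, the memory of a data value is taken to be the state reached by the transition that last read it, and the names `accepts` and `B1_TRANSITIONS`, the choice $j = 3$, and the choice of `s5` as the only final state are hypothetical.

```python
def accepts(word, transitions, initial, finals):
    """Run a deterministic, level-0 class memory automaton on a data word.

    word        -- list of (letter, data_value) pairs
    transitions -- dict mapping (state, letter, memory) to the next state,
                   where memory is the state that last read the data value,
                   or None (standing in for the bottom symbol) if it is fresh
    """
    state = initial
    memory = {}  # data value -> state in which it was last read
    for letter, d in word:
        key = (state, letter, memory.get(d))
        if key not in transitions:
            return False  # no transition available: the run is stuck
        state = transitions[key]
        memory[d] = state  # assumption: the value is remembered in the target state
    return state in finals


# Automaton for x := y with y-component j = 3 (illustrative choice of j).
B1_TRANSITIONS = {
    ("s1", "(γ,3)", None): "s2",
    ("s2", "write_x(3)", "s2"): "s3",
    ("s3", "ok_x", "s3"): "s4",
    ("s4", "•", "s4"): "s5",
}

# Every letter carries the same data value d: accepted.
good = [("(γ,3)", "d"), ("write_x(3)", "d"), ("ok_x", "d"), ("•", "d")]
# The second letter carries a fresh value e: rejected.
bad = [("(γ,3)", "d"), ("write_x(3)", "e"), ("ok_x", "d"), ("•", "d")]

print(accepts(good, B1_TRANSITIONS, "s1", {"s5"}))
print(accepts(bad, B1_TRANSITIONS, "s1", {"s5"}))
```

The rejected word shows why the signature $(0, s_2)$ matters: the second transition only fires if the data value being read is the one that was last seen in state $s_2$.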

B.2 $!x^{\mathsf{int\ ref}} : \mathsf{int}$

This is similar to the previous case, only the value to return is given by O’s play in the $x$-section of $\Gamma$. The language recognised by $\mathcal{A}_{\gamma}$ is just:

$\left\{ \binom{\gamma}{d}\binom{\mathit{read}_x}{d}\binom{j_x}{d}\binom{j}{d} \;\middle|\; d \in \mathcal{D} \text{ and } d \text{ is level-0} \right\}$.

The automaton is thus similar to that given above, except that from state $s_3$ the automaton splits into different states for each possible answer $j_x$.

B.3 $\mathsf{if}\ x^{\beta}\ \mathsf{then}\ M\ \mathsf{else}\ N : \theta$

The initial move contains an $x$-component. If this $x$-component is 0 then the automaton is the automaton for $N$, otherwise it is the automaton for $M$.

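B.4 $\mathsf{mkvar}(\lambda x^{\mathsf{unit}}.M, \lambda y^{\mathsf{int}}.N) : \mathsf{int\ ref}$

Here we have $\Gamma, x : \mathsf{unit} \vdash M : \mathsf{int}$ and $\Gamma, y : \mathsf{int} \vdash N : \mathsf{unit}$, and this “bad-variable” construction uses these methods as read- and write-methods respectively. The string projection of the language for $\llbracket \mathsf{mkvar}(\lambda x^{\mathsf{unit}}.M, \lambda y^{\mathsf{int}}.N) \rrbracket$ is then

$\gamma \cdot \bullet \cdot \Big(\mathit{read} \cdot L_M + \sum_{j} \mathit{write}(j) \cdot L^{j}_{N}\Big)^{*}$

where $L_M$ is the language for $\llbracket M \rrbracket$ without the initial move, and $L^{j}_{N}$ is the language for $\llbracket N \rrbracket$ when $y = j$, without the initial move. Note that the representing automata are level-0.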
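The shape of this string language can be made concrete with a small Python sketch that assembles it as a regular expression over space-separated move names. The function name `mkvar_language`, the token spellings, and the example instance are illustrative assumptions: we take $M \equiv 5$, so that $L_M$ is the single answer `5`, and $N \equiv ()$, so that each $L^{j}_{N}$ is the single answer `•`, with $j$ ranging over $\{0, 1\}$.

```python
import re


def mkvar_language(l_m, l_n_by_j):
    """Regex for  γ · • · (read · L_M + Σ_j write(j) · L^j_N)*.

    l_m      -- regex source for L_M (plays of M without the initial move)
    l_n_by_j -- dict mapping each integer j to a regex source for L^j_N
    Moves are written as tokens separated by single spaces.
    """
    branches = [rf"read (?:{l_m})"]
    for j, l_n in sorted(l_n_by_j.items()):
        branches.append(rf"write\({j}\) (?:{l_n})")
    body = "|".join(branches)
    return re.compile(rf"γ •(?: (?:{body}))*")


# Hypothetical instance: mkvar(λx.5, λy.()) over the integers {0, 1}.
lang = mkvar_language("5", {0: "•", 1: "•"})

print(bool(lang.fullmatch("γ • read 5 write(1) • read 5")))
print(bool(lang.fullmatch("γ • read 1")))
```

In this instance a read still answers 5 after `write(1)`, which is exactly the kind of behaviour that distinguishes a “bad variable” from a genuine reference cell.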

For an initial move $\gamma$, we make the following construction of $\mathcal{A}^{\mathsf{mkvar}(\lambda x.M,\lambda y.N)}_{\gamma}$:

– The set of states is the disjoint union of the states of $\mathcal{A}^{M}_{\gamma}$ and each $\mathcal{A}^{N}_{(\gamma,j)}$, minus the initial states, plus additional states (1), (2), and (3).
– The initial state is the state (1).
– The final states are those which are final in the constituent automata $\mathcal{A}^{M}_{\gamma}$ and each $\mathcal{A}^{N}_{(\gamma,j)}$, and (1) and (3).
– The transition relation is given as follows:
  • $(1) \xrightarrow{\gamma,(0,\bot)} (2)$
  • $(2) \xrightarrow{\bullet,(0,(2))} (3)$
  • $(3) \xrightarrow{\mathit{read},\left(1,\begin{pmatrix}(3)\\ \bot\end{pmatrix}\right)} s_M$, where $s_M$ is the secondary state of $\mathcal{A}^{M}_{\gamma}$
  • $(3) \xrightarrow{\mathit{write}(j),\left(1,\begin{pmatrix}(3)\\ \bot\end{pmatrix}\right)} s_{N,j}$, where $s_{N,j}$ is the secondary state of $\mathcal{A}^{N}_{(\gamma,j)}$
  • For all transitions $s_1 \xrightarrow{m,(0,\xi)} s_2$ in a constituent automaton (not including the initial transition), we have the transition $s_1 \xrightarrow{m,\left(1,\begin{pmatrix}(3)\\ \xi\end{pmatrix}\right)} s_2$.
  • From each final state, $s$, in one of the constituent automata, we add transitions $s \xrightarrow{\mathit{read},\left(1,\begin{pmatrix}(3)\\ \bot\end{pmatrix}\right)} s_M$ and $s \xrightarrow{\mathit{write}(j),\left(1,\begin{pmatrix}(3)\\ \bot\end{pmatrix}\right)} s_{N,j}$ (where $s_M$ and $s_{N,j}$ are as before)

We note that determinism is inherited from the constituent automata. Further, the only transitions from final states we need to add for the inductive hypothesis have already been added.

B.5 $\mathsf{while}\ M\ \mathsf{do}\ N : \mathsf{unit}$

The strategy $\llbracket \mathsf{while}\ M\ \mathsf{do}\ N \rrbracket$ plays as if playing $M$ until the final move would be made. If this would be 0, P gives the $\bullet$ answer to the initial move, and stops. Otherwise it plays as if playing $N$, until the final move would be made, when it starts as if playing $M$ again. Note that $\mathcal{A}^{M}_{\gamma}$ and $\mathcal{A}^{N}_{\gamma}$ are both level-0. The automaton $\mathcal{A}^{\mathsf{while}\ M\ \mathsf{do}\ N}_{\gamma}$ is thus given by:

– The set of states is given by the disjoint union of the set of states of $\mathcal{A}^{M}_{\gamma}$ and $\mathcal{A}^{N}_{\gamma}$, without the initial states, plus new states (1) and (2).
– The initial state is (1).
– The final states are (1) and (2).
– The transitions are given as follows:
  • $(1) \xrightarrow{\gamma,(0,\bot)} s_M$, where $s_M$ is the secondary state of $\mathcal{A}^{M}_{\gamma}$.
  • If $s'$ is a final state of $\mathcal{A}^{M}_{\gamma}$ and $s \xrightarrow{m,(0,\xi)} s'$ is a transition in $\mathcal{A}^{M}_{\gamma}$ with $m \neq 0$, we have the transition $s \xrightarrow{\epsilon} s_N$, where $s_N$ is the secondary state of $\mathcal{A}^{N}_{\gamma}$. (We can compress the silent transition $\epsilon$ out, since by determinism of the strategy this is the only transition from $s$ in $\mathcal{A}^{M}_{\gamma}$.)
  • If $s'$ is a final state of $\mathcal{A}^{M}_{\gamma}$ and $s \xrightarrow{m,(0,\xi)} s'$ is a transition in $\mathcal{A}^{M}_{\gamma}$ with $m = 0$, we have the transition $s \xrightarrow{\bullet,(0,\xi)} (2)$.
  • If $s'$ is a final state of $\mathcal{A}^{N}_{\gamma}$ and $s \xrightarrow{m,(0,\xi)} s'$ is a transition in $\mathcal{A}^{N}_{\gamma}$, we have the transition $s \xrightarrow{\epsilon} s_M$, where $s_M$ is the secondary state of $\mathcal{A}^{M}_{\gamma}$. (We can compress the silent transition $\epsilon$ out, since by determinism of the strategy this is the only transition from $s$ in $\mathcal{A}^{N}_{\gamma}$.)

Determinacy is inherited from the constituent automata, and there are no transitions from final states that need be added.

B.6 $\mathsf{let}\ x = \mathsf{ref}\ 0\ \mathsf{in}\ M : \theta$

This is similar to the construction for $\mathrm{RML}^{\text{P-Str}}_{2\vdash 1}$ in appendix A, but this time the value of the variable will be stored just by the level-0 data value. This will correctly capture the scope of the variable.

We assume we have a family of automata, $\mathcal{A}^{M}_{i}$, recognising the strategy $\llbracket \Gamma, x : \mathsf{int\ ref} \vdash M : \theta \rrbracket$.

$\llbracket \Gamma \vdash \mathsf{let}\ x = \mathsf{ref}\ 0\ \mathsf{in}\ M : \theta \rrbracket$ is constructed by restricting behaviour of $x$ to “good variable” behaviour (i.e. after a read-move the response is an immediate reply of the last integer written to the variable), and then hiding those moves. The automata construction is done in these two stages.
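The two stages can be illustrated at the level of individual plays, before any automata are involved. The Python sketch below is only an illustration under assumptions of our own: plays are lists of tuples, $x$-moves are tagged with `"x"`, every write is answered by an `ok` move, and the names `is_good_variable_play` and `hide_x_moves` are hypothetical. Pointers and data values are ignored.

```python
def is_good_variable_play(play, initial=0):
    """Check that every x-read is answered at once by the last value written.

    x-moves are ("x", "write", j), ("x", "ok"), ("x", "read") and
    ("x", "value", j); any other tuple is a move outside the x-component.
    """
    stored = initial   # let x = ref 0: the variable starts at 0
    expected = None    # the reply that must come next, if any
    for move in play:
        if expected is not None:
            if move != expected:
                return False
            expected = None
        elif move[0] == "x":
            if move[1] == "write":
                stored = move[2]
                expected = ("x", "ok")
            elif move[1] == "read":
                expected = ("x", "value", stored)
            else:
                return False  # an x-answer with no pending x-question
    return True  # a pending reply at the end is allowed (plays are prefix-closed)


def hide_x_moves(play):
    """Second stage: erase all moves in the x-component."""
    return [move for move in play if move[0] != "x"]


good = [("o", "q"), ("x", "write", 3), ("x", "ok"),
        ("x", "read"), ("x", "value", 3), ("o", "a")]
bad = [("o", "q"), ("x", "write", 3), ("x", "ok"),
       ("x", "read"), ("x", "value", 5), ("o", "a")]

print(is_good_variable_play(good), hide_x_moves(good))
print(is_good_variable_play(bad))
```

The automata construction that follows performs the same two steps on $\mathcal{A}^{M}_{\gamma}$ itself: the stored value is carried in the states, and the $x$-transitions are then removed.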

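Restriction to good-variable behaviour. Assume the finitary fragment we are using is $\{0, 1, \ldots, k\}$. By our inductive hypothesis, we know that each state can only “hold” data values of one level: let $Q_0$ be the set of states of $\mathcal{A}^{M}_{\gamma}$ which hold level-0 data values, and let $Q_{\geq 1}$ be the set of states of $\mathcal{A}^{M}_{\gamma}$ which hold data values of level $\geq 1$, so the states of $\mathcal{A}^{M}_{\gamma}$ are partitioned into $Q_0$, $Q_{\geq 1}$, and the initial state $q_I$. We construct $C_{\gamma}$ as follows:

– The states of the automaton are $\{q_I\} \uplus Q_{\geq 1} \uplus (Q_0 \times \{0, 1, \ldots, k\})$
– The final states are those which are final in $\mathcal{A}^{M}_{\gamma}$, and those which are final in $\mathcal{A}^{M}_{\gamma}$ paired with any integer.
– The initial state is $q_I$, the initial state in $\mathcal{A}^{M}_{\gamma}$.
– The transitions are given as follows (here $\mathbf{t}$ denotes the remaining components of the updated signature):
  • $q_I \xrightarrow{q_0,(0,\bot)} (s_M, 0)$, where $s_M$ is the secondary state of $\mathcal{A}^{M}_{\gamma}$
  • If $s_1 \xrightarrow{m,(0,s_3)} s_2, t$ is in $\mathcal{A}^{M}_{\gamma}$, where $m$ is not an $x$-write move or a response to an $x$-read move, then $s_1, s_2 \in Q_0$, and we have $(s_1, i) \xrightarrow{m,(0,(s_3,i))} (s_2, i), (t, i)$ for each $i$
  • If $s_1 \xrightarrow{m,\left(k,\begin{pmatrix}s_3\\ \xi\end{pmatrix}\right)} s_2, \begin{pmatrix}t\\ \mathbf{t}\end{pmatrix}$ is in $\mathcal{A}^{M}_{\gamma}$ (where $k \geq 1$), where $m$ is not an $x$-write move or a response to an $x$-read move, then $s_1, s_2 \in Q_{\geq 1}$, and we have $s_1 \xrightarrow{m,\left(k,\begin{pmatrix}(s_3,i)\\ \xi\end{pmatrix}\right)} s_2, \begin{pmatrix}(t,i)\\ \mathbf{t}\end{pmatrix}$ for each $i$
  • For each $j$, if $s_1 \xrightarrow{\mathit{write}_x(j),(0,s_3)} s_2, t$ is in $\mathcal{A}^{M}_{\gamma}$, then $s_1, s_2 \in Q_0$, and we have $(s_1, i) \xrightarrow{\mathit{write}_x(j),(0,(s_3,i))} (s_2, j), (t, j)$ for each $i$
  • For each $j$, if $s_1 \xrightarrow{\mathit{write}_x(j),\left(k,\begin{pmatrix}s_3\\ \xi\end{pmatrix}\right)} s_2, \begin{pmatrix}t\\ \mathbf{t}\end{pmatrix}$ is in $\mathcal{A}^{M}_{\gamma}$ (where $k \geq 1$), then $s_1, s_2 \in Q_{\geq 1}$, and we have $s_1 \xrightarrow{\mathit{write}_x(j),\left(k,\begin{pmatrix}(s_3,i)\\ \xi\end{pmatrix}\right)} s_2, \begin{pmatrix}(t,j)\\ \mathbf{t}\end{pmatrix}$ for each $i$
  • For each response to an $x$-read move, $j_x$, if $s_1 \xrightarrow{j_x,(0,s_3)} s_2, t$ is in $\mathcal{A}^{M}_{\gamma}$, then $s_1, s_2 \in Q_0$, and we have $(s_1, j) \xrightarrow{j_x,(0,(s_3,j))} (s_2, j), (t, j)$
  • For each response to an $x$-read move, $j_x$, if $s_1 \xrightarrow{j_x,\left(k,\begin{pmatrix}s_3\\ \xi\end{pmatrix}\right)} s_2, \begin{pmatrix}t\\ \mathbf{t}\end{pmatrix}$ is in $\mathcal{A}^{M}_{\gamma}$ (where $k \geq 1$), then $s_1, s_2 \in Q_{\geq 1}$, and we have $s_1 \xrightarrow{j_x,\left(k,\begin{pmatrix}(s_3,j)\\ \xi\end{pmatrix}\right)} s_2, \begin{pmatrix}(t,j)\\ \mathbf{t}\end{pmatrix}$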

Hiding. $\mathcal{A}^{\mathsf{let}\ x=\mathsf{ref}\ 0\ \mathsf{in}\ M}_{\gamma}$ is constructed from $C_{\gamma}$ as follows:

If we are in a configuration $(s_1, f)$ of $C_i$ where we can perform a transition $s_1 \xrightarrow{m_x,(j,s)} s_2, t$ where $m_x$ is an $x$-move, then by determinacy of strategies combined with the restriction to good variable behaviour, it is the only possible transition from this configuration. Further, we note that using only $x$-transitions cannot lead to a change in the data value being read. Thus for every state $s_0$ of $C_{\gamma}$ and every possible “signature” $\xi_0$, there is a unique maximal (and not necessarily finite) sequence of transitions:

$s_0 \xrightarrow{m_0,(k,\xi_0)} s_1, \xi_1 \xrightarrow{m_1,(k,\xi_1)} s_2, \xi_2 \xrightarrow{m_2,(k,\xi_2)} \cdots$

where each $m_i$ is an $x$-move.

From each $C_{\gamma}$ we construct the automaton $\mathcal{A}^{\mathsf{let}\ x=\mathsf{ref}\ 0\ \mathsf{in}\ M}_{\gamma}$ by considering where this sequence terminates for each state. Everything is the same as in $C_{\gamma}$ except for the transition relation, which is altered as follows:

– If the maximal sequence of $x$-moves with signature $\xi_0$ out of state $s_0$ is empty then all transitions requiring signature $\xi_0$ out of $s_0$ are unchanged.
– If the maximal sequence out of $s_0$ with signature $\xi_0$ is finite and non-empty and ends in state $s_n$ and with signature $\xi_n$, then for every transition $s_n \xrightarrow{m,(k,\xi_n)} s_{n+1}, t$ we add the transition $s_0 \xrightarrow{m,(k,\xi_0)} s_{n+1}, t$.
– All transitions on $x$-moves are removed.
– Transitions from final states as required by the IH are added. This does not affect the language recognised, since the added transitions will require a level-0 data value to be “in” the relevant location, and there can only be one level-0 data value in runs of this automaton.
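The first three rules amount to short-cutting each maximal run of $x$-moves. The Python sketch below shows this on an ordinary finite transition table; it is a deliberate simplification that ignores levels, signatures and data values, and it does not add the transitions required by the inductive hypothesis. The names `hide_x_transitions` and `is_x_move`, and the example table, are hypothetical.

```python
def hide_x_transitions(transitions, is_x_move):
    """Short-cut maximal runs of x-moves and drop the x-transitions.

    transitions -- dict mapping a state to a list of (letter, target) pairs;
                   assumed deterministic on x-moves, as in the text
    is_x_move   -- predicate selecting the letters that are x-moves
    """
    def end_of_x_run(state):
        # Follow x-transitions until none is available.
        # Returns None if the run never terminates (a cycle of x-moves).
        seen = set()
        while True:
            x_targets = [t for (m, t) in transitions.get(state, []) if is_x_move(m)]
            if not x_targets:
                return state
            if state in seen:
                return None
            seen.add(state)
            state = x_targets[0]

    hidden = {}
    for s0 in transitions:
        end = end_of_x_run(s0)
        if end is None:
            hidden[s0] = []  # infinite x-run: nothing visible ever follows
        else:
            # Empty run: end == s0 and its transitions are kept unchanged.
            # Non-empty run: s0 inherits the transitions available at the end.
            hidden[s0] = [(m, t) for (m, t) in transitions.get(end, [])
                          if not is_x_move(m)]
    return hidden


table = {
    "a": [("write_x", "b")],
    "b": [("ok_x", "c")],
    "c": [("done", "d")],
    "d": [],
    "e": [("read_x", "f")],
    "f": [("val_x", "e")],
}
result = hide_x_transitions(table, lambda m: m.endswith("_x"))
print(result["a"], result["e"])
```

States `e` and `f` form an infinite run of $x$-moves, which is the “not necessarily finite” case of the text; no transition is added for them.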

Determinacy of the resulting automaton is inherited from determinism of $C_{\gamma}$ (and thence from $\mathcal{A}^{M}_{\gamma}$).

B.7 $\lambda x^{\beta}.M : \theta$

We have $\Gamma, x : \beta \vdash M : \theta'$, and therefore assume there is a family of automata $\mathcal{A}^{M}_{i}$ recognising $\llbracket M \rrbracket$. The prearenas for $\llbracket \Gamma, x \vdash M \rrbracket$ and $\llbracket \Gamma \vdash \lambda x.M \rrbracket$ are shown in figure 4. Note that the initial moves in $\llbracket \Gamma, x \vdash M \rrbracket$ contain an $x$-component, so may be considered pairs $(\gamma, i_x)$, while the initial moves in $\llbracket \Gamma \vdash \lambda x.M \rrbracket$ contain the same $\Gamma$-component, but no $x$-component. The move $q_0$ therefore corresponds to the $\Gamma$-component, and the move $q_1$ precisely corresponds to an $x$-move.

$\llbracket \Gamma \vdash \lambda x.M \rrbracket$ is as follows: after an initial move $\gamma$, P plays the unique $a_0$-move $\bullet$, and waits for a $q_1$-move. Once O plays a $q_1$-move $i_x$, P plays as in $\llbracket \Gamma, x \vdash M \rrbracket$ when given an initial move $(\gamma, i_x)$. However, as the $q_1$-moves are not initial, it is possible that O will play another $q_1$-move, $i'_x$. Each time O does this it opens a new thread which P plays as per $\llbracket \Gamma, x \vdash M \rrbracket$ when given initial move $(\gamma, i'_x)$. Only O may switch between threads, and this can only happen immediately after P plays an $a_i$-move (for any $i$). Thus we construct $\mathcal{A}^{\lambda x.M}_{\gamma}$ as follows:

– The set of states is the disjoint union of the set of non-initial states of each $\mathcal{A}^{M}_{(\gamma,i_x)}$, plus new states (1), (2), and (3).
– The initial state is (1).
– The final states are those that are final in each $\mathcal{A}^{M}_{(\gamma,i_x)}$, as well as (1) and (3).
– The transition relation is as follows:
  • $(1) \xrightarrow{\gamma,(0,\bot)} (2)$
  • $(2) \xrightarrow{a_0,(0,(2))} (3)$
  • For each $i_x$, $(3) \xrightarrow{i_x,\left(1,\begin{pmatrix}(3)\\ \bot\end{pmatrix}\right)} s_{i_x}$, where $s_{i_x}$ is the secondary state of $\mathcal{A}^{M}_{\gamma,i_x}$
  • If $s_1 \xrightarrow{m,(j,s)} s_2, t$ is a (non-initial) transition in one of the $\mathcal{A}^{M}_{i_x}$, then $s_1 \xrightarrow{m,\left(j+1,\begin{pmatrix}(3)\\ s\end{pmatrix}\right)} s_2, \begin{pmatrix}(3)\\ t\end{pmatrix}$ is a transition.
  • If $s_1$ and $s'_1$ are both (non-initial) final states and $s_1 \xrightarrow{m,(j,s)} s_2, t$ is a transition already given by the above rules, then $s'_1 \xrightarrow{m,(j,s)} s_2, t$ is also a transition.

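B.8 $\mathsf{let}\ x^{\beta} = M\ \mathsf{in}\ N : \theta$

This is very similar to the equivalent case in appendix A.

The strategy $\llbracket \mathsf{let}\ x^{\beta} = M\ \mathsf{in}\ N \rrbracket$ is a concatenation of the strategies for $\llbracket M \rrbracket$ and $\llbracket N \rrbracket$, with the result of the $\llbracket M \rrbracket$ strategy determining the $x$-component of the initial move of $\llbracket N \rrbracket$. We have $\Gamma \vdash M : \beta$ and $\Gamma, x : \beta \vdash N : \theta$. The initial moves of $\llbracket \Gamma, x : \beta \vdash N : \theta \rrbracket$ contain an $x$-component, so we index the family of automata recognising $\llbracket \Gamma, x : \beta \vdash N : \theta \rrbracket$ as $\mathcal{A}^{N}_{\gamma,j}$, where $j$ is the $x$-component. The family of automata recognising $\llbracket M \rrbracket$ are indexed as $\mathcal{A}^{M}_{\gamma}$. $\mathcal{A}^{\mathsf{let}\ x^{\beta}=M\ \mathsf{in}\ N}_{i}$ is constructed as follows:

If $L(\mathcal{A}^{M}_{i}) = \left\{\binom{\gamma}{d}\binom{j}{d}\right\}$ then $\mathcal{A}^{\mathsf{let}\ x^{\beta}=M\ \mathsf{in}\ N}_{\gamma} = \mathcal{A}^{N}_{\gamma,j}$. Otherwise, by determinacy of the strategy there cannot be a transition from the secondary state to a final state in $\mathcal{A}^{M}_{\gamma}$, and $\mathcal{A}^{\mathsf{let}\ x^{\beta}=M\ \mathsf{in}\ N}_{\gamma}$ is given by:

– The set of states is the disjoint union of the non-initial states of $\mathcal{A}^{M}_{i}$ and each $\mathcal{A}^{N}_{i,j}$, plus a new state (1).
– The initial state is (1).
– The final states are those which are final in each $\mathcal{A}^{N}_{\gamma,j}$.
– The transitions are given as follows:
  • $(1) \xrightarrow{\gamma,(0,\bot)} s_M$, where $s_M$ is the secondary state of $\mathcal{A}^{M}_{\gamma}$
  • All transitions in $\mathcal{A}^{M}_{\gamma}$ not going to a final state (or from the initial state) are preserved
  • If $s_1 \xrightarrow{j,(0,s_3)} s_2, t$ is a transition in $\mathcal{A}^{M}_{\gamma}$ with $s_2$ final (in $\mathcal{A}^{M}_{\gamma}$), $s_{N,j}$ is the secondary state of $\mathcal{A}^{N}_{\gamma,j}$, and $s_{N,j} \xrightarrow{m,(0,s_{N,j})} s_4, t'$ is in $\mathcal{A}^{N}_{\gamma,j}$, then we have the transition $s_1 \xrightarrow{m,(0,s_3)} s_4, t'$.
  • All other transitions in each $\mathcal{A}^{N}_{\gamma,j}$ are preserved unchanged.
  • Transitions from final states as required by the inductive hypothesis are added. This does not affect the language recognised, since the added transitions will require a level-0 data value to be “in” the relevant copy of $\mathcal{A}^{N}_{\gamma,j}$, and there can only be one level-0 data value in runs of this automaton.

Determinacy is inherited from $\mathcal{A}^{M}_{\gamma}$ and $\mathcal{A}^{N}_{\gamma,j}$.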

B.9 $\mathsf{let}\ x = z\,y^{\beta}\ \mathsf{in}\ M : \theta$

We assume $x$ is not of type $\beta$, as otherwise this could be handled by the previous construction.

We have $\Gamma, x : \theta', z : \beta \to \theta', y : \beta \vdash M : \theta$. Plays in $\llbracket \mathsf{let}\ x = z\,y^{\beta}\ \mathsf{in}\ M \rrbracket$ begin with P copying the $y$-component of the initial move into the $z$-component, and O must respond with the unique answer, $\bullet_z$ (which corresponds to the initial move of $\llbracket \theta' \rrbracket$). Play then continues as $\llbracket M \rrbracket$ except that all $x$-moves are relabelled as $z$-moves, hereditarily justified by the occurrence of $\bullet_z$ that O was forced to play. The pointers for moves justified by $\bullet_z$ will have to be made explicit as part of the construction.

$\mathcal{A}^{\mathsf{let}\ x=z y^{\beta}\ \mathsf{in}\ M}_{\gamma,i_y,i_z}$ is then constructed as follows:

– The states are two copies of the non-initial states of $\mathcal{A}^{M}_{\gamma,i_y,i_z,\bullet_x}$ (where $\bullet_x$ is the move $\bullet_z$ that O will be forced to play, relabelled as an $x$-move) plus new states (1), (2) and (3). The second copy of $\mathcal{A}^{M}_{\gamma,i_y,i_z,\bullet_x}$ will be used to encode P-pointers, so we write states in the second copy as ${}^{\bullet}s$.
– The initial state is (1).
– The final states are those final in $\mathcal{A}^{M}_{\gamma,i_y,i_z,\bullet_x}$, and (1).
– The transitions are as follows:
  • $(1) \xrightarrow{(\gamma,i_y,i_z),(0,\bot)} (2)$
  • $(2) \xrightarrow{j_z,(0,(2))} (3)$, where $j_z$ is the initial move for $y$ copied into the $z$-component.
  • $(3) \xrightarrow{\bullet_z,(0,(3))} s_M$ and $(3) \xrightarrow{\bullet_z,(0,(3))} {}^{\bullet}s_M$, where $s_M$ is the secondary state of $\mathcal{A}^{M}_{\gamma,i_y,i_z,\bullet_x}$.
  • If $s_1 \xrightarrow{m,(k,s)} s_2, t$ is a transition in $\mathcal{A}^{M}_{\gamma,i_y,i_z,\bullet_x}$ and $m$ is not an $x$-move, then we have the transitions $s_1 \xrightarrow{m,(k,s)} s_2, t$ and ${}^{\bullet}s_1 \xrightarrow{m,(k,{}^{\bullet}s)} {}^{\bullet}s_2, t$ (where ${}^{\bullet}s$ replaces each element of $s$ with its tagged copy).
  • If $s_1 \xrightarrow{m_x,(k,s)} s_2, t$ is a transition in $\mathcal{A}^{M}_{\gamma,i_y,i_z,\bullet_x}$ and $m_x$ is a non-initial $x$-move, then we have the transitions $s_1 \xrightarrow{m_z,(k,s)} s_2, t$ and ${}^{\bullet}s_1 \xrightarrow{m_z,(k,{}^{\bullet}s)} {}^{\bullet}s_2, t$, where $m_z$ is the relabelling of $m_x$ into the $z$-component.
  • If $s_1 \xrightarrow{m_x,(k,s)} s_2, t$ is a transition in $\mathcal{A}^{M}_{\gamma,i_y,i_z,\bullet_x}$ and $m_x$ is the initial $x$-move, then we have the transitions $s_1 \xrightarrow{m_z,(k,s)} s_2, t$ and ${}^{\bullet}s_1 \xrightarrow{m_z,(k,{}^{\bullet}s)} {}^{\bullet}s_2, t$ and ${}^{\bullet}s_1 \xrightarrow{m_z,(k,{}^{\bullet}s)} {}^{\bullet}s_2, {}^{\bullet}t$, where $m_z$ is the relabelling of $m_x$ into the $z$-component.
  • Transitions from final states as required by the inductive hypothesis are added. This does not affect the language recognised, since the added transitions will require a level-0 data value to be “in” the relevant location, and there can only be one level-0 data value in runs of this automaton.

Determinism is inherited from the constituent automaton.

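B.10 $\mathsf{let}\ x = z(\lambda y.M)\ \mathsf{in}\ N : \theta$

Here we have $\Gamma, y : \beta, z : (\beta \to \beta) \to \theta_1 \vdash M : \beta$ and $\Gamma, x : \theta_1, z : (\beta \to \beta) \to \theta_1 \vdash N : \theta$. The prearena is as follows:

[Prearena diagram. Moves shown: the initial move $(\gamma, i_z)$; $\llbracket\Gamma\rrbracket$; $\bullet$, $j_z$, $l_z$, $\bullet_z$; and, in $\llbracket\theta_1\rrbracket$, the moves $a_0, q_1, a_1, \ldots, q_n, a_n$.]

Plays in $\llbracket \mathsf{let}\ x = z(\lambda y.M)\ \mathsf{in}\ N \rrbracket$ start with P playing $\bullet$. O can then either play $j_z$, starting an $\llbracket M \rrbracket$-thread, or play $\bullet_z$, the initial $x$-move. If O chooses the former, that thread is played to completion, as in $\llbracket M \rrbracket$. Once this is finished (with P playing $k_z$ as the final move), we return to the situation where O can play either $j_z$ or $\bullet_z$. Once O does play $\bullet_z$, P plays as $\llbracket N \rrbracket$, except that all $x$-moves are renamed to $z$-moves (justified by $\bullet_z$). Further, whenever P plays in $x$ (which becomes a $z$-move), O can again play $j_z$ and start an $\llbracket M \rrbracket$ thread.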

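The automaton $\mathcal{A}^{\mathsf{let}\ x=z(\lambda y.M)\ \mathsf{in}\ N}_{\gamma,i_z}$ is constructed as follows:

– The set of states consists of:
  • Fresh states (1), (2), and (3)
  • Two copies of the set of non-initial states of $\mathcal{A}^{N}_{\gamma,\bullet_x,i_z}$, the second marked as ${}^{\bullet}s$
  • Define $S$, the set of states from which an $\llbracket M \rrbracket$-thread can be opened, as
    $S = \{(3)\} \uplus \{\, r : (r = s \text{ or } r = {}^{\bullet}s) \text{ and } t \xrightarrow{m_x} s \text{ in } \mathcal{A}^{N}_{\gamma,\bullet_x,i_z} \text{ with } m_x \text{ a P-}x\text{ move} \,\}$
    We then take states $(s, t)$ where $s$ is a state in some $\mathcal{A}^{M}_{\gamma,i_y,i_z}$ and $t \in S$
– The initial state is (1)
– The final states are those which are final in $\mathcal{A}^{N}_{\gamma,\bullet_x,i_z}$ (both tagged and untagged), and (1)
– The transitions are (here bold symbols such as $\mathbf{s}$ and $\mathbf{t}$ denote signatures):
  • $(1) \xrightarrow{(\gamma,i_x),(0,\bot)} (2)$
  • $(2) \xrightarrow{\bullet,(0,(2))} (3)$
  • $(3) \xrightarrow{\bullet_z,(0,(3))} s_N$ and $(3) \xrightarrow{\bullet_z,(0,(3))} {}^{\bullet}s_N$, where $s_N$ is the secondary state of $\mathcal{A}^{N}_{\gamma,\bullet_x,i_z}$
  • If $s_1 \xrightarrow{m,(k,s)} s_2, t$ is a transition in $\mathcal{A}^{N}_{\gamma,\bullet_x,i_z}$ and $m$ is not an $x$-move, then we have the transitions $s_1 \xrightarrow{m,(k,s)} s_2, t$ and ${}^{\bullet}s_1 \xrightarrow{m,(k,{}^{\bullet}s)} {}^{\bullet}s_2, t$
  • If $s_1 \xrightarrow{m_x,(k,s)} s_2, t$ is a transition in $\mathcal{A}^{N}_{\gamma,\bullet_x,i_z}$ and $m_x$ is a non-initial $x$-move, then we have the transitions $s_1 \xrightarrow{m_z,(k,s)} s_2, t$ and ${}^{\bullet}s_1 \xrightarrow{m_z,(k,{}^{\bullet}s)} {}^{\bullet}s_2, t$, where $m_z$ is the relabelling of $m_x$ into the $z$-component.
  • If $s_1 \xrightarrow{m_x,(k,s)} s_2, t$ is a transition in $\mathcal{A}^{N}_{\gamma,\bullet_x,i_z}$ and $m_x$ is the initial $x$-move, then we have the transitions $s_1 \xrightarrow{m_z,(k,s)} s_2, t$ and ${}^{\bullet}s_1 \xrightarrow{m_z,(k,{}^{\bullet}s)} {}^{\bullet}s_2, t$ and ${}^{\bullet}s_1 \xrightarrow{m_z,(k,{}^{\bullet}s)} {}^{\bullet}s_2, {}^{\bullet}t$, where $m_z$ is the relabelling of $m_x$ into the $z$-component.
  • If $s \in S$ then for all transitions $t \xrightarrow{m_x,(k,\mathbf{t})} s, \mathbf{s}$ we have the transition $s \xrightarrow{j_z,(k,\mathbf{s})} (q_{M,j}, s), \mathbf{s}_{q_{M,j}}$, where $q_{M,j}$ is the secondary state of $\mathcal{A}^{M}_{\gamma,j_y,i_z}$, and $\mathbf{s}_{q_{M,j}}$ is the same as $\mathbf{s}$ but with the last element paired with $q_{M,j}$. Further:
    ∗ If $p_1 \xrightarrow{m,(0,p'_1)} p_2, p'_2$ is in $\mathcal{A}^{M}_{\gamma,j_y,i_z}$ where $p_2$ is not final (in $\mathcal{A}^{M}_{\gamma,j_y,i_z}$), we have $(p_1, s) \xrightarrow{m,(k,\mathbf{s}_{p'_1})} (p_2, s), \mathbf{s}_{p'_2}$
    ∗ If $p_1 \xrightarrow{l,(0,p'_1)} p_2, p'_2$ is in $\mathcal{A}^{M}_{\gamma,j_y,i_z}$ where $p_2$ is final (in $\mathcal{A}^{M}_{\gamma,j_y,i_z}$), we have $(p_1, s) \xrightarrow{m,(k,\mathbf{s}_{p'_1})} s, \mathbf{s}$
  • Transitions from final states as required by the IH are added. This does not affect the language recognised, since the added transitions will require a level-0 data value to be “in” the sub-automaton, and there can only be one level-0 data value in runs of this automaton.

Determinism is inherited from the constituent automata.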

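B.11 $\mathsf{let}\ x = z\,\mathsf{mkvar}(\lambda u^{\mathsf{unit}}.M_1, \lambda v^{\mathsf{int}}.M_2)\ \mathsf{in}\ N : \theta$

This is very similar to the previous case: the difference is that the $j_z$ moves from the last case can now be either $\mathit{read}$ or $\mathit{write}(j)$, leading to playing as either $\llbracket M_1 \rrbracket$ or $\llbracket M_2 \rrbracket$ respectively. The formal construction is almost identical to that given above.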

C Proofs of Undecidability Results

C.1 Proof of Theorem 4

Q-Stores. Following previous game-semantics-based undecidability results, we will reduce the halting problem for a class of finite state machines equipped with a queue to observational equivalence of RML-terms. The universality of such machines goes back to Post’s work on simple rewriting systems [20,12]. In particular, we will utilise automata equipped with a Q-store [13]. Q-stores are a generalisation of a queue which do not always follow queue behaviour. However, we will be able to detect whether the queue discipline has been followed correctly or not.

Definition 3. A Q-store stores characters from a finite alphabet $\Sigma$. Its content is defined by a natural number $n$ and a function $f : \{0, \ldots, n\} \to \Sigma \times \{+,-\} \times \{+,-\}$. The three fields of $f(i)$ will be referred to as $f(i).\mathrm{SYMBOL}$, $f(i).\mathrm{ACCESSED}$ and $f(i).\mathrm{MARKED}$ respectively. The first holds the character stored in this element of the Q-store and the other two are used for bookkeeping.

The empty Q-store is defined by $n = 0$ and $f(0) = (\dagger, +, -)$, where $\dagger$ is a dummy symbol set as accessed but unmarked.

There are two operations which can be performed on a Q-store.

– ADD $x$ adds $x \in \Sigma$ to the store. The new Q-store $f' : \{0, \ldots, n+1\} \to \Sigma \times \{+,-\} \times \{+,-\}$ is defined by $f \subseteq f'$, $f'(n+1) = (x, -, -)$.
– FETCH is the only access method. It can return any previously unaccessed element in the store $f(i).\mathrm{SYMBOL}$ (i.e. $f(i).\mathrm{ACCESSED} = -$) provided an index $j$ can be found such that $0 \leq j < i \leq n$, $f(j).\mathrm{ACCESSED} = +$ and $f(j).\mathrm{MARKED} = -$. As well as returning the value stored in the $i$th element, the operation sets $f(i).\mathrm{ACCESSED}$ and $f(j).\mathrm{MARKED}$ to $+$.

We see that a FETCH operation can return any unaccessed element $i$ provided there is an earlier element $j$ which has already been accessed but has not yet been marked. The choice of $(i, j)$ is made nondeterministically and different choices can affect the store in different ways. It is possible that the Q-store might behave as a queue. This occurs if, during every FETCH, $i$ is chosen to be the first unaccessed element and $j$ to be $i - 1$. If this happens then the Q-store will have a characteristic pattern: no unaccessed element occurs between two accessed elements. The only way to have a Q-store with this pattern is if its behaviour has been that of a queue. In particular, if all elements of a Q-store have been accessed then its behaviour was that of a queue.
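To make Definition 3 concrete, the following Python sketch implements a Q-store directly from the two operations. The class and method names (`QStore`, `fetch_choices`, `has_queue_pattern`, and so on) are our own and purely illustrative; the booleans `True` and `False` stand for $+$ and $-$.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    symbol: str
    accessed: bool  # True stands for +, False for -
    marked: bool


class QStore:
    def __init__(self):
        # Empty Q-store: n = 0 and f(0) = (dagger, +, -).
        self.cells = [Cell("†", True, False)]

    def add(self, x):
        # ADD x: extend f with f'(n + 1) = (x, -, -).
        self.cells.append(Cell(x, False, False))

    def fetch_choices(self):
        # All pairs (i, j) with 0 <= j < i <= n, f(i) unaccessed,
        # and f(j) accessed but not yet marked.
        return [(i, j)
                for i, ci in enumerate(self.cells) if not ci.accessed
                for j in range(i)
                if self.cells[j].accessed and not self.cells[j].marked]

    def fetch(self, i, j):
        # FETCH with an explicit (otherwise nondeterministic) choice of (i, j).
        if (i, j) not in self.fetch_choices():
            raise ValueError("illegal FETCH choice")
        self.cells[i].accessed = True
        self.cells[j].marked = True
        return self.cells[i].symbol

    def has_queue_pattern(self):
        # No unaccessed element occurs between two accessed elements,
        # i.e. the accessed flags form a prefix of the store.
        flags = [c.accessed for c in self.cells]
        return flags == sorted(flags, reverse=True)

    def all_accessed(self):
        return all(c.accessed for c in self.cells)


# Queue discipline: always fetch the first unaccessed i with j = i - 1.
queue_like = QStore()
for x in "abc":
    queue_like.add(x)
fetched = [queue_like.fetch(1, 0), queue_like.fetch(2, 1), queue_like.fetch(3, 2)]

# Violating the discipline: fetch the second element first.
out_of_order = QStore()
out_of_order.add("a")
out_of_order.add("b")
out_of_order.fetch(2, 0)
```

In the second store the element `a` can no longer be fetched at all: the only earlier element, the dummy at index 0, has been marked. This is the mechanism by which a store with all elements accessed is guaranteed to have behaved as a queue.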

We can now consider finite state machines equipped with Q-stores.

Definition 4. A Q-machine is a tuple $A = \langle Q, \Sigma, q_0, F, \delta_{\mathrm{ADD}}, \delta_{\mathrm{FETCH}} \rangle$, where:

– $Q = Q^A + Q^F + F$ is the finite set of states with $q_0 \in Q$ the initial state.
– $\delta_{\mathrm{ADD}} : Q^A \to Q \times \Sigma$ defines transitions out of states in $Q^A$. If the machine is in state $q_1$ and $\delta_{\mathrm{ADD}}(q_1) = (q_2, a)$ then the machine transitions into state $q_2$ and performs ADD $a$ on the machine’s Q-store.
– $\delta_{\mathrm{FETCH}} : Q^F \times \Sigma \to Q$ defines the machine’s action when in a state from $Q^F$. When in state $q_1 \in Q^F$ the Q-machine will attempt to perform a FETCH. If this is successful and returns symbol $a$ then the machine transitions into state $\delta_{\mathrm{FETCH}}(q_1, a)$.

We say that a Q-machine halts if there exists a run (starting in the initial state) which ends in a final state (a state in $F$) with a Q-store in which all elements have been accessed.

Since Q-machines only halt when every element in the Q-store has been accessed (so when the Q-store has acted as a queue), as far as halting is concerned they are the same as finite state automata equipped with a queue. Hence, from Post’s work we can infer that they have an undecidable halting problem.
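The observation that, for halting purposes, a Q-machine is a finite state machine with a queue can be sketched in Python as follows. The sketch relies on that observation: it replaces the Q-store by an ordinary FIFO queue, so the run becomes deterministic. The function name `run_with_queue`, the dictionary encoding of $\delta_{\mathrm{ADD}}$ and $\delta_{\mathrm{FETCH}}$, the step bound, and the example machines are all illustrative assumptions. Since halting is undecidable, the function can only report "unknown" (`None`) when the bound is reached.

```python
from collections import deque


def run_with_queue(q0, finals, delta_add, delta_fetch, max_steps=1000):
    """Simulate a Q-machine whose store is used with strict queue discipline.

    delta_add   -- dict: state in Q^A -> (next state, symbol to ADD)
    delta_fetch -- dict: (state in Q^F, fetched symbol) -> next state
    Returns True if a final state is reached with every element accessed
    (empty queue), False if the run gets stuck or ends with unaccessed
    elements, and None if max_steps is exhausted.
    """
    state, queue = q0, deque()
    for _ in range(max_steps):
        if state in finals:
            return len(queue) == 0
        if state in delta_add:
            state, symbol = delta_add[state]
            queue.append(symbol)
        else:
            if not queue:
                return False  # FETCH impossible: nothing unaccessed
            symbol = queue.popleft()
            nxt = delta_fetch.get((state, symbol))
            if nxt is None:
                return False  # no transition defined in this encoding
            state = nxt
    return None


# Adds a and b, then fetches both and stops.
halting = run_with_queue(
    "p0", {"done"},
    {"p0": ("p1", "a"), "p1": ("p2", "b")},
    {("p2", "a"): "p3", ("p3", "b"): "done"},
)

# Reaches a final state while b is still unaccessed.
leftover = run_with_queue(
    "p0", {"done"},
    {"p0": ("p1", "a"), "p1": ("p2", "b")},
    {("p2", "a"): "done"},
)

# Alternates ADD and FETCH forever.
looping = run_with_queue(
    "p0", set(),
    {"p0": ("p1", "a")},
    {("p1", "a"): "p0"},
    max_steps=50,
)
```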

Representing Q-machines. We now consider how to represent the run of an arbitrary Q-machine at the type sequent $\vdash (\mathsf{unit} \to \mathsf{unit}) \to \mathsf{unit} \to \mathsf{unit}$. The relevant prearena is shown in Figure 9. For technical convenience we will assume that the initial state of the Q-store results from a dummy ADD action executed once at the very start of the run.

Our representation of the Q-machine will begin with $q_0\ a_0$.

Each ADD operation (including the dummy operation initializing the store) will then be interpreted by the segment $q_1\ q\ q_1\ a_1$.

Each FETCH will be represented by segments $q_2\ q\ q_2\ a_2\ a\ a_2$ where the first $q_2$ is justified by the $a_1$ from the $i$th ADD, $q$ is justified by the $q_1$ immediately before that $a_1$, and the second $q_2$ is justified by the $a_1$ in the $j$th ADD. Here we are using the visibility condition to force the choice of $j$ to be a strictly earlier ADD-block than the choice of $i$.

$q_0\ a_0 \cdots q_1\ q\ q_1\ a_1 \cdots q_1\ q\ q_1\ a_1 \cdots q_2\ q\ q_2\ a_2\ a\ a_2$

Once the Q-machine has reached a final state at the end of the computation, we must check that the Q-store has the correct shape. This is performed in a finishing up state where we visit each ADD-block from last to first and check each of them has been accessed.

$q_0\ a_0 \cdots q_1\ q\ q_1\ a_1 \cdots q_2\ a_2\ a\ a_1$

[Fig. 9: Prearena for $\vdash (\mathsf{unit} \to \mathsf{unit}) \to \mathsf{unit} \to \mathsf{unit}$, with moves $q_0$, $a_0$, $q_1$, $a_1$, $q_2$, $a_2$, $q$, $a$.]

In order to construct a term which follows this strategy we first consider some terms which perform the various responses. Our final term will keep track of which state the simulation is in and imitate one of these terms accordingly.

– $\lambda f. \ldots$ will respond to the initial $q_0$ with $a_0$.
– $\lambda f.f(); \lambda x.\Omega$ responds to $q_1$ with $q$. Once this is (eventually) answered with $a$ it responds with $a_1$. This $a_1$ can never be used to justify anything or else P will not respond.
– $\lambda f.\lambda x.f()$ responds to $q_1$ with $a_1$. If this $a_1$ is used to justify a $q_2$ then it responds with $q$. If this is answered with $a$ then it responds with $a_2$.
– $\lambda f.\lambda x.()$ responds to $q_1$ with $a_1$ and to $q_2$ with $a_2$.

In order to keep track of which stage of the computation we are in, we will use a number of global variables.

– State: keeping track of which state the simulated Q-machine is in.
– First: a flag letting us know if the first dummy ADD-operation has occurred.
– AddState: keeping track of how far through an ADD-operation we are.
– FetchState: keeping track of how far through a FETCH-operation we are.
– FinishingState: keeping track of how far through a finishing up operation we are.

Additionally, we will create several local variables for each ADD.

– Symbol, Accessed and Marked: representing the appropriate fields in the Q-store.
– Finalised: a flag keeping track of whether this ADD-operation has been visited during the finishing up stage. This is needed to ensure that each ADD is visited exactly once during this phase.

The term is shown in Figure 10. We use the syntax $[B_1, \ldots, B_n]$ as an abbreviation for $\mathsf{if}\ \bigwedge B_i\ \mathsf{then}\ ()\ \mathsf{else}\ \Omega$. The local variables are associated with the $q_1 \cdot a_1$ part of each ADD-block. This ensures they can be accessed during a FETCH or the finishing up stage when moves are hereditarily justified by them. Note that we cannot enforce that during the finishing up stage, the $q_2$ is justified by the last unfinalised $a_1$. However, we do ensure that each $a_1$ justifies at most one $q_2$ during this phase. Since we can rely on the second part of the finishing up state ($a \cdot a_1$) to hide (by visibility) the $a_1$ from the last (by bracketing) unfinalised ADD-block, we know that the only way to reach a complete play is if O does indeed finalise the ADD-blocks in order from last to first.

To establish undecidability we note that the represented Q-machine will halt if and only if the term is not observationally equivalent to $\lambda f.\Omega$. Hence, observational equivalence is undecidable if the type contains a first-order (or higher) argument which is not the final argument (i.e. any type of the form $\theta_n \to \ldots \to \theta_4 \to (\theta_3 \to \theta_2) \to \theta_1 \to \theta_0$ for any RML types $\theta_i$ and $n \geq 3$).

C.2 Proof of Theorem 5

We again rely on finite state systems equipped with a queue. However, rather than rely on Q-machines, this time we utilise a programming system called Queue.

Definition 5. A Queue program has a single memory cell $z$ that can store a symbol from $\Sigma$ and a queue (which can contain symbols from $\Sigma$). A program consists of a finite sequence of instructions of the form $1 : I_1, 2 : I_2, \ldots, m : I_m$, where each $I_i$ is one of the following:

– enqueue $a$: add the symbol $a \in \Sigma$ to the end of the queue and go to the next instruction.
– dequeue: if the queue is empty then halt, otherwise remove the element at the front of the queue and store it in $z$ then go to the next instruction.
– if $z = a$ goto $L$, where $a \in \Sigma$ and $L \geq 0$ is a label. If the value stored in $z$ is $a$ then go to the $L$th instruction, otherwise go to the next instruction.
– halt.
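As a reading aid for Definition 5, here is a small Python interpreter for Queue programs. It is a sketch under assumptions of our own: instructions are encoded as tuples, `z` starts as `None`, a jump or fall-through to a label outside $1, \ldots, m$ is treated as halting (the definition does not say what happens in that case), and a step bound is imposed because halting is undecidable in general. The name `run_queue_program` is hypothetical.

```python
from collections import deque


def run_queue_program(program, max_steps=10_000):
    """Interpret a Queue program given as a list of instruction tuples.

    Instructions: ("enqueue", a), ("dequeue",), ("if_goto", a, L), ("halt",).
    Labels are 1-based, as in 1 : I_1, ..., m : I_m.
    Returns (status, z, remaining queue) with status "halted" or "step limit".
    """
    queue, z, pc = deque(), None, 1
    for _ in range(max_steps):
        if pc < 1 or pc > len(program):
            return ("halted", z, list(queue))  # assumption: no such label
        instr = program[pc - 1]
        op = instr[0]
        if op == "enqueue":
            queue.append(instr[1])
            pc += 1
        elif op == "dequeue":
            if not queue:
                return ("halted", z, [])  # dequeue on an empty queue halts
            z = queue.popleft()
            pc += 1
        elif op == "if_goto":
            _, a, label = instr
            pc = label if z == a else pc + 1
        elif op == "halt":
            return ("halted", z, list(queue))
        else:
            raise ValueError(f"unknown instruction: {instr!r}")
    return ("step limit", z, list(queue))


# Enqueue a and b, then dequeue until b has been read.
terminating = [
    ("enqueue", "a"),
    ("enqueue", "b"),
    ("dequeue",),
    ("if_goto", "b", 6),
    ("if_goto", "a", 3),
    ("halt",),
]

# Keeps re-enqueueing a forever.
diverging = [
    ("enqueue", "a"),
    ("dequeue",),
    ("if_goto", "a", 1),
]

print(run_queue_program(terminating))
print(run_queue_program(diverging, max_steps=30)[0])
```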

The halting problem for Queue programs is undecidable [11].

We will simulate Queue programs using a recursive function of type $(\mathsf{unit} \to \mathsf{unit}) \to \mathsf{unit}$. We will model the queue using the call-stack. Every enqueue will cause a recursive call which will allocate a variable cur containing the value to be enqueued. When an item is removed from the queue we will set cur to 0, which we assume is a special value not in $\Sigma$. This means that we know that the head of the queue corresponds to the oldest recursive call whose cur does not contain 0.

    let
      State = ref q0
      First = ref 1
      AddState = ref 0
      FetchState = ref 0
      FinishingState = ref 0
    in
    λf.
      [!State ∈ Q^A];
      if !AddState = 0 then
        AddState := 1;
        f();
        [!State ∈ F, !FinishingState = 1];
        FinishingState := 0;
        λx.Ω
      else if !AddState = 1 then
        let
          Symbol = ref ‡
          Accessed = ref (if !First then + else −)
          Marked = ref −
          Finalised = ref −
        in
        AddState := 0;
        if !First then
          First := 0; Symbol := †;
        else
          (Symbol, State) := δ_ADD(!State);
        λx.
          if !State ∈ Q^F then
            if !FetchState = 0 then
              [!Accessed = −];
              Accessed := +; FetchState := 1;
              f();
              [!FetchState = 2];
              FetchState := 0;
              State := δ_FETCH(!State, !Symbol);
            else if !FetchState = 1 then
              [!Accessed = +, !Marked = −];
              FetchState := 2; Marked := +;
            else Ω
          else if !State ∈ F then
            [!FinishingState = 0, !Accessed = +, !Finalised = −];
            FinishingState := 1; Finalised := +;
          else Ω
      else Ω

Fig. 10: The term encoding a Q-machine

In addition to the local variable cur we will also need global variables halt (a flag letting us know we should stop the computation and collapse the call-stack), pc (which instruction we are currently on), z (the Queue program’s memory cell) and two variables G and H. When we make our recursive call, the new value to be added to the queue will be (temporarily) stored in G. Further, the argument to the call (a function of type $\mathsf{unit} \to \mathsf{unit}$) will be such that if it is run when H = 0 then the value of cur from the previous call will be written to G. If, on the other hand, the argument is run when H = 1 it will cause the value at the front of the queue to be written to G and the appropriate cur to be set to 0 (i.e. that element is removed from the queue).

Our term encoding a Queue program is then

$\mathsf{let}\ \mathit{halt}, \mathit{pc}, z, G, H = \mathsf{ref}\ 0, \mathsf{ref}\ 1, \mathsf{ref}\ 0, \mathsf{ref}\ 0, \mathsf{ref}\ 0\ \mathsf{in}\ (\mu F^{(\mathsf{unit}\to\mathsf{unit})\to\mathsf{unit}}.\lambda \mathit{arg}^{\mathsf{unit}\to\mathsf{unit}}.\mathit{body})(\lambda c^{\mathsf{unit}}.\Omega)$

where $\mathit{body}$ has the form

$\mathsf{let}\ \mathit{cur} = \mathsf{ref}\ (!G)\ \mathsf{in}\ \mathsf{while}\ {!\mathit{halt}} = 0\ \mathsf{do}\ \mathsf{case}(!\mathit{pc})[1 \mapsto J_1, \ldots, m \mapsto J_m]$.

Each $J_i$ depends on $I_i$ according to Table 1. This term is equivalent to $\vdash ()$ if and only if the simulated Queue program halts. Hence, observational equivalence of $\mathrm{RML}_{\text{O-Str}}$ with recursive functions of type $(\mathsf{unit} \to \mathsf{unit}) \to \mathsf{unit}$ is undecidable.

Table 1: Simulations for each Queue program instruction

    I_i                  J_i
    enqueue n            pc := i + 1; G := n; F(λx. if !H = 0 then L else R)
                           where L ≡ G := !cur
                                 R ≡ if (H := 0; arg(); !G = 0) then z := cur; cur := 0
                                     else H := 1; arg()
    dequeue              if !cur = 0 then halt := 1
                         else if H := 0; arg(); !G = 0 then z := !cur; cur := 0
                         else H := 1; arg(); pc := i + 1
    halt                 halt := 1
    if z = n goto L      if !z = n then pc := L else pc := i + 1