An approach to a unified theory of grammar and L forms


INFORMATION SCIENCES 15, 77-94 (1978)

Communicated by John M. Richardson

ABSTRACT

We present one approach within which the dissimilar developments of the grammar form and the L form theories can be unified. Our results also shed light on the inherent differences of parallel and sequential rewriting.

1. INTRODUCTION

Grammar forms were originally introduced in the pioneering paper of Cremers and Ginsburg [3] and followed up in [4], [11] and [19], for example. L forms were originally introduced by Maurer, Salomaa and Wood in [14] and have been followed up subsequently in [15], [16] and [17], for example.

In both cases there is the notion of a "master" grammar or system (the so-called "form") from which, by interpretation, a family of related grammars or systems is obtained. However, there is one major distinction in the two approaches, and that is the mode of interpretation used. In [3] a general or g-interpretation is found; in [14] a strict or s-interpretation.

It seems to be clear that s-interpretations (restricted g-interpretations) have to be used for L forms, while g-interpretations cannot be so used. On the other hand, both g- and s-interpretations can be applied to grammar forms.

In this paper we first demonstrate how g-interpretations may be applied to L forms. Secondly, we examine "master" grammars in which terminal rewriting is allowed, the so-called terminal grammars. These two extensions, one to L form theory and one to grammar form theory, enable a unified theory of grammar and L forms to be initiated.

78 V. K. VAISHNAVI AND D. WOOD

In Sec. 2 we present the terminology and basic results, explicitly introducing the underlying notion of a production scheme, which has been explored implicitly in [8], [9] and [2].

In Sec. 3 we develop the theory of g-interpretations of EOL forms, and in Sec. 4 the theory of terminal grammar forms is begun. Finally, in Sec. 5 we close with a discussion of the similarities and contrasts between the two developments.

2. BASIC TERMINOLOGY AND NOTATION

We first introduce a context-free production scheme, which is then used as the basis to define context-free grammars and EOL systems.

DEFINITION. Let $V_\infty$ and $\Sigma_\infty$ be countably infinite alphabets with $\Sigma_\infty \subseteq V_\infty$ and $V_\infty - \Sigma_\infty$ infinite. $V_\infty$ is the universal alphabet, $\Sigma_\infty$ is the universal terminal alphabet, and $V_\infty - \Sigma_\infty$ is the universal nonterminal alphabet. A quadruple $G = (V, \Sigma, P, S)$ is a context-free production scheme (or simply production scheme, when the context-freeness is understood), if the following conditions obtain:

(i) $V \subseteq V_\infty$ is a finite alphabet,

(ii) $\Sigma \subseteq \Sigma_\infty$, $\Sigma \subseteq V$ and $\Sigma \neq \emptyset$,

(iii) $P \subseteq V \times V^*$, $P$ is finite, and for each $A$ in $V$ there is some couple $(A, \alpha)$ in $P$ [we usually write $(A, \alpha)$ as $A \to \alpha$], and

(iv) $S$ is in $V - \Sigma$.

DEFINITION. Let $G = (V, \Sigma, P, S)$ be a production scheme. A nonterminal $N$ in $V - \Sigma$ is said to be a blocking nonterminal if $N \to N$ is the only production for $N$ in $P$. We say $G$ is blocking if for all $a$ in $\Sigma$, $a \to \alpha$ in $P$ implies $\alpha$ is a blocking nonterminal. Whenever productions are not explicitly supplied for a particular symbol $A$ in $V$, by convention it is assumed that $A \to N$ is in $P$, where $N$ is a blocking nonterminal in $V$.
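The blocking condition can be checked mechanically. The following is a minimal sketch, assuming a production scheme is encoded with symbols as strings and productions as `(lhs, rhs)` pairs whose right-hand side is a tuple of symbols; the function names and this encoding are illustrative assumptions, not notation from the paper.

```python
def blocking_nonterminals(V, Sigma, P):
    """Return the nonterminals N whose only production in P is N -> N."""
    return {
        n
        for n in V - Sigma
        if {rhs for lhs, rhs in P if lhs == n} == {(n,)}
    }


def is_blocking(V, Sigma, P):
    """A scheme is blocking if every terminal production a -> alpha
    has a single blocking nonterminal as its right-hand side."""
    blockers = blocking_nonterminals(V, Sigma, P)
    return all(
        len(rhs) == 1 and rhs[0] in blockers
        for lhs, rhs in P
        if lhs in Sigma
    )


# Scheme S -> a; a -> N; N -> N (blocking).
P_blocking = {("S", ("a",)), ("a", ("N",)), ("N", ("N",))}
# Scheme S -> a; a -> b; b -> b (terminals are rewritten, so not blocking).
P_terminal = {("S", ("a",)), ("a", ("b",)), ("b", ("b",))}
```

With this encoding, `is_blocking({"S", "a", "N"}, {"a"}, P_blocking)` holds, while the second scheme, which rewrites terminals into terminals, fails the test.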

Let $P^N \subseteq P$, the set of nonterminal productions, be defined by $P^N = \{A \to \alpha : A$ is in $V - \Sigma$ and $A \to \alpha$ is in $P\}$.

Letting $M, N \subseteq V^*$, the notation $M \to N$, denoting $\{\alpha \to \beta : \alpha$ in $M$ and $\beta$ in $N\}$, is used.

DEFINITION. A finite substitution $f$ defined on an alphabet $V$ is said to be a dfl-substitution (a disjoint finite letter substitution) if $f(A) \subseteq V_\infty$ for all $A$ in $V$, and for all $A, B$ in $V$ with $A \neq B$, $f(A) \cap f(B) = \emptyset$.
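As an illustration of this definition, the sketch below tests whether a mapping from letters to finite sets of letters is a dfl-substitution. The dictionary encoding and the finite set `universe`, standing in for the infinite alphabet $V_\infty$, are assumptions made only for the example.

```python
def is_dfl_substitution(f, universe):
    """Check that f maps letters to sets of letters of `universe`
    and that the images of distinct letters are pairwise disjoint."""
    seen = set()
    for image in f.values():
        if not image <= universe:   # every image consists of letters only
            return False
        if seen & image:            # overlaps the image of an earlier letter
            return False
        seen |= image
    return True


universe = {"A1", "A2", "B1", "X"}
good = {"A": {"A1", "A2"}, "B": {"B1"}}
bad = {"A": {"X"}, "B": {"X"}}      # images of A and B overlap
```

Here `good` is a dfl-substitution and `bad` is not, because two distinct letters share the image letter `X`.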

We can now introduce the two basic modes of interpretation discussed in this paper.

DEFINITION 1. Let $G_i = (V_i, \Sigma_i, P_i, S_i)$, $i = 1, 2$, be two blocking production schemes. We say $G_1$ is a g(eneral)-interpretation of $G_2$ modulo $\mu$, denoted $G_1 \triangleleft_g G_2(\mu)$, where $\mu$ is a finite substitution on $V_2^*$, if the following conditions obtain:

(i) $\mu$ is a dfl-substitution from $V_2 - \Sigma_2$ into $V_1 - \Sigma_1$,

(ii) $\mu(a) \subseteq \Sigma_1^*$ for all $a$ in $\Sigma_2$,

(iii) $P_1^N \subseteq \mu(P_2^N)$, where $\mu(P_2^N) = \bigcup_{A \to \alpha \in P_2^N} \mu(A) \to \mu(\alpha)$, and

(iv) $S_1$ is in $\mu(S_2)$.

The g-interpretation is the one first introduced in [3], and is studied exclusively in the area of grammar forms; see [6] for a 1977 survey of work in this area.

Similarly we obtain the second mode of interpretation.

DEFINITION 2. Let $G_i = (V_i, \Sigma_i, P_i, S_i)$, $i = 1, 2$, be two production schemes. We say $G_1$ is an s(trict)-interpretation of $G_2$ modulo $\mu$, denoted $G_1 \triangleleft_s G_2(\mu)$, where $\mu$ is a dfl-substitution on $V_2$, if the following conditions obtain:

(i) $\mu(A) \subseteq V_1 - \Sigma_1$ for all $A$ in $V_2 - \Sigma_2$,

(ii) $\mu(a) \subseteq \Sigma_1$ for all $a$ in $\Sigma_2$,

(iii) $P_1 \subseteq \mu(P_2)$, and

(iv) $S_1$ is in $\mu(S_2)$.
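The four conditions of Definition 2 translate directly into a membership test. The sketch below is one possible encoding, assuming symbols are strings, productions are `(lhs, rhs)` pairs with tuple right-hand sides, and the substitution is a dictionary defined on every symbol of the form; `Scheme` and `is_s_interpretation` are hypothetical names introduced for the example.

```python
from collections import namedtuple
from itertools import combinations

# P is a set of (lhs, rhs) pairs; rhs is a tuple of symbols.
Scheme = namedtuple("Scheme", "V Sigma P S")


def is_s_interpretation(g1, g2, mu):
    """Test whether g1 is an s-interpretation of g2 modulo mu."""
    # mu must be a dfl-substitution: images pairwise disjoint.
    if any(mu[x] & mu[y] for x, y in combinations(g2.V, 2)):
        return False
    # (i) nonterminals go to nonterminals, (ii) terminals to terminals.
    for sym in g2.V:
        target = g1.Sigma if sym in g2.Sigma else g1.V - g1.Sigma
        if not mu[sym] <= target:
            return False

    # (iii) every production of g1 is a letter-by-letter image
    # of some production of g2 (hence of the same length).
    def covered(lhs, rhs):
        return any(
            lhs in mu[l2]
            and len(rhs) == len(r2)
            and all(x in mu[y] for x, y in zip(rhs, r2))
            for l2, r2 in g2.P
        )

    if not all(covered(lhs, rhs) for lhs, rhs in g1.P):
        return False
    # (iv) the start symbol of g1 lies in the image of that of g2.
    return g1.S in mu[g2.S]


# Form: S -> a; S -> SS; a -> N; N -> N.
F = Scheme({"S", "a", "N"}, {"a"},
           {("S", ("a",)), ("S", ("S", "S")), ("a", ("N",)), ("N", ("N",))},
           "S")
# Candidate interpretation with two terminals b and c in place of a.
F1 = Scheme({"S", "b", "c", "N"}, {"b", "c"},
            {("S", ("b",)), ("S", ("c",)), ("S", ("S", "S")),
             ("b", ("N",)), ("c", ("N",)), ("N", ("N",))},
            "S")
mu = {"S": {"S"}, "a": {"b", "c"}, "N": {"N"}}
```

Under this encoding `F1` is an s-interpretation of `F` modulo `mu`. Adding the production S -> bc to `F1` breaks condition (iii), since no production of `F` of length two has a terminal in its right-hand side; this is the length-preserving, letter-to-letter character of s-interpretations.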

Comparing g-interpretations with s-interpretations, observe:

(a) In a g-interpretation a terminal symbol in $G_2$ may be interpreted as a finite set of terminal words in $G_1$, whereas in s-interpretations terminals in $G_2$ are treated in a similar manner to nonterminals in $G_2$.

(b) In s-interpretations distinct terminals are interpreted as distinct sets of terminals. This notion was introduced in [1] for grammar forms and in [14] for EOL forms. We consider this notion of interpretation to be the most useful one, although as we shall see we lose many closure properties. S-interpretations have been widely studied in L form theory (see [6]), and their study has been initiated in grammar form theory (see [7], [10] and [13]).

(c) G-interpretations are only defined for blocking production schemes, since only nonterminal productions are interpreted. In general, $\mu(a) \to \mu(\alpha)$, $a$ in $\Sigma_2$, although well defined, gives rise to productions of the kinds $\varepsilon \to \eta$ and $\beta \to \eta$, where $|\beta| > 1$. This problem is avoided when using s-interpretations, since they are length-preserving.


DEFINITION. Let $G$ be a blocking production scheme; then the family of blocking production schemes generated by $G$ under g-interpretations, denoted by $\mathcal{G}_g(G)$, is defined by

$\mathcal{G}_g(G) = \{G' : G' \triangleleft_g G\}$.

Similarly, letting $G$ be a production scheme, the family of production schemes generated by $G$ under s-interpretations, denoted $\mathcal{G}_s(G)$, is defined by

$\mathcal{G}_s(G) = \{G' : G' \triangleleft_s G\}$.

It should be mentioned that the relations $\triangleleft_g$ and $\triangleleft_s$ are decidable [3] and are also pre-orders [2]. We now define (context-free) grammars and EOL systems.

DEFINITION. A terminal (context-free) grammar $G$ is a production scheme $(V, \Sigma, P, S)$, together with a sequential derivation rule. The derivation rule is defined as follows: For $\alpha, \beta$ in $V^*$, $\alpha \Rightarrow_G \beta$ (or simply $\alpha \Rightarrow \beta$ if $G$ is understood) if $\alpha = uAv$, $\beta = u\gamma v$ for some $u$ and $v$ in $V^*$ and $A \to \gamma$ is in $P$. We extend $\Rightarrow$ to $\Rightarrow^n$, $\Rightarrow^+$ and $\Rightarrow^*$ in the usual way.

If $(V, \Sigma, P, S)$ is a blocking production scheme, we then say that $G$ is a (context-free) grammar.

The language generated by $G$, denoted $L(G)$, is defined by $L(G) = \{x : x$ is in $\Sigma^*$ and $S \Rightarrow^+ x\}$.

Note that because we require a production for every symbol in $V$, this notion of a grammar differs slightly from the usual one. However, in this case the additional terminal productions are blocking and are therefore, in one sense, irrelevant. We include them so that the underlying notion of a production scheme is consistent in both the definition of a grammar and that of an EOL system.

DEFINITION. An EOL system $G$ is a production scheme $(V, \Sigma, P, S)$, together with a parallel derivation rule. The derivation rule is defined as follows: For $\alpha, \beta$ in $V^*$, $\alpha \Rightarrow_G \beta$ (or simply $\alpha \Rightarrow \beta$ if $G$ is understood) if $\alpha = A_1 \cdots A_m$, $\beta = \beta_1 \cdots \beta_m$, $A_i$ is in $V$, $\beta_i$ is in $V^*$ and $A_i \to \beta_i$ is in $P$, $1 \le i \le m$, $m > 0$. We extend $\Rightarrow$ to $\Rightarrow^n$, $\Rightarrow^+$ and $\Rightarrow^*$ in the usual way.

If $(V, \Sigma, P, S)$ is a blocking production scheme, we say that $G$ is a synchronized EOL system.

The language generated by $G$, denoted $L(G)$, is defined by $L(G) = \{x : x$ is in $\Sigma^+$ and $S \Rightarrow^+ x\}$.


We do not distinguish notationally the derivation rules for sequential and parallel derivations; however, no confusion should arise, since the context will always indicate which is meant.
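To make the contrast between the two derivation rules concrete, here is a small sketch of a single derivation step under each rule. Words are tuples of symbols and productions are `(lhs, rhs)` pairs with tuple right-hand sides; this encoding and the function names are assumptions of the example only.

```python
from itertools import product


def sequential_step(word, P):
    """Grammar rule: rewrite exactly one occurrence of one symbol."""
    result = set()
    for i, sym in enumerate(word):
        for lhs, rhs in P:
            if lhs == sym:
                result.add(word[:i] + rhs + word[i + 1:])
    return result


def parallel_step(word, P):
    """EOL rule: rewrite every symbol of the word simultaneously."""
    choices = [[rhs for lhs, rhs in P if lhs == sym] for sym in word]
    return {sum(combo, ()) for combo in product(*choices)}


# Blocking scheme S -> a; S -> SS; a -> N; N -> N.
P = {("S", ("a",)), ("S", ("S", "S")), ("a", ("N",)), ("N", ("N",))}
```

From the word `S a`, a sequential step may rewrite `S` and leave `a` untouched, whereas a parallel step must also rewrite `a` into the blocking nonterminal `N`. This is the sense in which terminals of a synchronized EOL system must all be produced at the same time.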

We now have our central notions:

DEFINITION. A (terminal) grammar form $G$ is a (terminal) grammar, and a (synchronized) EOL form $G$ is a (synchronized) EOL system.

If $G_1$ and $G_2$ are both terminal grammar forms or both EOL forms then we say $G_1$ is an s-interpretation of $G_2$ modulo $\mu$, written $G_1 \triangleleft_s G_2(\mu)$, if for the production schemes $G_1$ and $G_2$, $G_1 \triangleleft_s G_2(\mu)$. Similarly, for two grammar forms or two synchronized EOL forms $G_1$ and $G_2$, $G_1$ is a g-interpretation of $G_2$ modulo $\mu$, written $G_1 \triangleleft_g G_2(\mu)$, if this is true for the underlying blocking production schemes. We usually write $G_1 \triangleleft_s G_2$ or $G_1 \triangleleft_g G_2$.

DEFINITION. Let $G$ be a (terminal) grammar form or (synchronized) EOL form; then $\mathcal{G}_g(G)$ or $\mathcal{G}_s(G)$ denotes the corresponding family of forms generated under g- or s-interpretation. Note that we only consider $G$ as a production scheme here.

Let $G$ be a (terminal) grammar form; then the corresponding family of languages generated by $G$, denoted by $\mathcal{L}_{Cg}(G)$ or $\mathcal{L}_{Cs}(G)$, is defined by

$\mathcal{L}_{Cg}(G) = \{L(G') : G' \triangleleft_g G\}$

or

$\mathcal{L}_{Cs}(G) = \{L(G') : G' \triangleleft_s G\}$,

respectively. Similarly, for $G$ a (synchronized) EOL form we obtain

$\mathcal{L}_{Lg}(G) = \{L(G') : G' \triangleleft_g G\}$ and $\mathcal{L}_{Ls}(G) = \{L(G') : G' \triangleleft_s G\}$.

“C” stands for Chomsky and “L” for Lindenmayer. In the following whenever the C and L are implied by the context we will omit them.

Two languages are equal if they differ by at most the empty word. Similarly, two language families are equal if for each nonempty language in one family there is an equal language in the other family and vice versa.


DEFINITION. Let $G_1$ and $G_2$ be both (terminal) grammar forms or both (synchronized) EOL forms. Then

$G_1$ is s-form equivalent to $G_2$ if $\mathcal{L}_s(G_1) = \mathcal{L}_s(G_2)$, and $G_1$ is g-form equivalent to $G_2$ if $\mathcal{L}_g(G_1) = \mathcal{L}_g(G_2)$.

We also need:

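DEFINITION. A production scheme $G = (V, \Sigma, P, S)$ is:

(a) separated if $A \to \alpha$ in $P$ implies $\alpha$ is in $\Sigma \cup (V - \Sigma)^*$ and $A$ in $\Sigma$ implies $\alpha$ is not in $\Sigma$,

(b) short if $A \to \alpha$ in $P$ implies $|\alpha| \le 2$,

(c) binary if each production is of one of the types $A \to \varepsilon$, $A \to a$, $A \to B$, $A \to BC$ or $a \to A$, where $a$ is in $\Sigma$ and $A, B, C$ are in $V - \Sigma$, and

(d) propagating if for each production $X \to \alpha$ in $P$, $\alpha \neq \varepsilon$.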
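These four properties are simple syntactic conditions on the production set. The sketch below spells them out, assuming productions are `(lhs, rhs)` pairs with tuple right-hand sides and the empty tuple standing for the empty word; the function names are illustrative.

```python
def is_short(P):
    """Every right-hand side has length at most 2."""
    return all(len(rhs) <= 2 for _, rhs in P)


def is_propagating(P):
    """No production has the empty word as right-hand side."""
    return all(len(rhs) > 0 for _, rhs in P)


def is_separated(P, Sigma):
    """Right-hand sides are a single terminal or a nonterminal word,
    and terminals are never rewritten into a terminal."""
    for lhs, rhs in P:
        single_terminal = len(rhs) == 1 and rhs[0] in Sigma
        nonterminal_word = all(x not in Sigma for x in rhs)
        if not (single_terminal or nonterminal_word):
            return False
        if lhs in Sigma and single_terminal:
            return False
    return True


def is_binary(P, Sigma):
    """Productions have the shapes A -> empty, A -> a, A -> B, A -> BC, a -> A."""
    for lhs, rhs in P:
        if lhs in Sigma:
            ok = len(rhs) == 1 and rhs[0] not in Sigma
        else:
            ok = len(rhs) <= 1 or (
                len(rhs) == 2 and all(x not in Sigma for x in rhs)
            )
        if not ok:
            return False
    return True


# S -> a; S -> SS; a -> N; N -> N satisfies all four properties.
P_cnf = {("S", ("a",)), ("S", ("S", "S")), ("a", ("N",)), ("N", ("N",))}
# S -> a; S -> aS; a -> N; N -> N is short and propagating only.
P_mixed = {("S", ("a",)), ("S", ("a", "S")), ("a", ("N",)), ("N", ("N",))}
```

The second scheme mixes a terminal and a nonterminal in one right-hand side, so it is neither separated nor binary, although it is still short and propagating.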

LEMMA 2.1. Let $F = (V, \Sigma, P, S)$ be a terminal grammar form (or EOL form); then $\mathcal{L}_{Cs}(F)$ (or $\mathcal{L}_{Ls}(F)$) is closed under dfl-substitution.

Proof. Straightforward.

LEMMA 2.2. Let $F = (V, \Sigma, P, S)$ be a terminal grammar form (or EOL form), where $\Sigma$ consists of one symbol; then $\mathcal{L}_{Cs}(F)$ (or $\mathcal{L}_{Ls}(F)$) is closed under union.

Proof. Straightforward.

LEMMA 2.3. Let $F = (V, \Sigma, P, S)$ be a grammar form (or synchronized EOL form); then $\mathcal{L}_{Cs}(F)$ (or $\mathcal{L}_{Ls}(F)$) is closed under intersection with regular sets.

Proof. We include the proof given in [17] for completeness. We only consider grammar forms; synchronized EOL forms are handled similarly.

Since the standard construction assumes closure under union it has to be modified slightly. Assume $F$ to be in binary normal form, without loss of generality. Consider $F' = (V', \Sigma', P', S')$, an arbitrary s-interpretation of $F$, and let $M = (Q, \Sigma', \delta, q_0, F)$ be an arbitrary finite state acceptor. We construct an interpretation $F''$ of $F'$, $F'' = (V'', \Sigma', P'', S') \triangleleft_s F'(\mu)$, such that $L(F'') = L(F') \cap L(M)$.

Let $V'' = \{[p, X, q] : p, q$ in $Q$, $X$ in $V'\} \cup \Sigma' \cup \{S'\}$. Define $\mu(X) = \{X\} \cup \{[p, X, q] : p, q$ in $Q\}$, and take into $P''$ the following productions:

(a) For each production $S' \to A$, $A$ in $V' - \Sigma'$, take all productions $S' \to [q_0, A, q]$ with $q$ in $F$.

(b) For each production $S' \to AB$, $A, B$ in $V' - \Sigma'$, take all productions $S' \to [q_0, A, p][p, B, q]$ with $p$ in $Q$ and $q$ in $F$.

(c) For each production $S' \to a$, $a$ in $\Sigma'$, take the production $S' \to a$ iff $a$ is in $L(M)$, and take $S' \to [q_0, a, q_0]$ otherwise.

(d) For each production $a \to N$, $a$ in $\Sigma'$, take the production $a \to [q_0, N, q_0]$.

(e) For each production $A \to B$, $A, B$ in $V' - \Sigma'$, take all productions $[p, A, q] \to [p, B, q]$ for all $p, q$ in $Q$.

(f) For each production $A \to BC$, $A, B, C$ in $V' - \Sigma'$, take all productions $[p, A, q] \to [p, B, r][r, C, q]$ for all $p, q, r$ in $Q$.

(g) For each production $A \to a$ take the production $[p, A, q] \to a$ for all $p, q$ in $Q$ with $\delta(p, a) = q$, and take $[p, A, q] \to [q_0, N, q_0]$ for all $p, q$ in $Q$ with $\delta(p, a) \neq q$, where $N$ is a blocking nonterminal.

Note that $F''$ is a grammar, and further that $F'' \triangleleft_s F'(\mu)$, as desired. That $L(F'') = L(F') \cap L(M)$ is readily established.

Finally, we need:

DEFINITION. Let $\mathcal{L}$ be a family of languages closed under union, finite substitution and intersection with a regular set; then we say $\mathcal{L}$ is a pre-semi-AFL. If $\mathcal{L}$ is also closed under regular substitution, $\mathcal{L}$ is a semi-AFL. Let $\mathcal{L}$ be a family of languages; then the smallest pre-semi-AFL [semi-AFL] containing $\mathcal{L}$ is denoted by $\mathcal{P}(\mathcal{L})$ [$\mathcal{S}(\mathcal{L})$]. These are said to be the pre-semi-AFL and semi-AFL closure of $\mathcal{L}$, respectively.

Readers are referred to [18] and [12] for a more detailed exposition of grammars and L systems, respectively. Any unexplained terminology will be found therein.

3. G-INTERPRETATIONS OF SYNCHRONIZED EOL FORMS

In this section we present the basic results on the g-interpretations of synchronized EOL forms. In many ways these are analogous to those for grammar forms [3].

NOTATION. Throughout this section we denote $\mathcal{L}_{Lg}(F)$ and $\mathcal{L}_{Ls}(F)$ by $\mathcal{L}_g(F)$ and $\mathcal{L}_s(F)$, respectively.

The simulation lemmas in [14] for EOL forms carry over straightforwardly to the present situation. Therefore we immediately have results corresponding to those in [14]. We summarize these below.

THEOREM 3.1. Let $F$ be a synchronized EOL form. There exists a synchronized EOL form $G$, g-form equivalent to $F$, such that

(i) $G$ is reduced,

(ii) $G$ is separated,

(iii) $G$ is short,

(iv) $G$ is binary, and

(v) $G$ is propagating (and binary).


In both [3] and [14] there is a tacit assumption that any derivation in an interpretation grammar or system is, in fact, an image of a derivation in the master grammar or system respectively. However, when g-interpretations of synchronized EOL forms are considered, this is no longer true. For example, let $F$ have productions $S \to aA$; $A \to AA$; $A \to A$; $A \to a$; $a \to N$; $N \to N$. Now $S \Rightarrow aA \Rightarrow^+$ blocking, hence $L(F) = \emptyset$. However $F' \triangleleft_g F$ defined by $S \to A$; $A \to AA$; $A \to A$; $A \to a$; $a \to N$; $N \to N$ has $L(F') \neq \emptyset$. In fact $\mathcal{L}_g(F') = \mathcal{L}(EOL)$, by Theorem 3.8, and hence $\mathcal{L}_g(F) = \mathcal{L}(EOL)$. In contradistinction, under s-interpretation $\mathcal{L}_s(F) = \{\emptyset\}$. This leads to the following notion, which we will find useful.

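DEFINITION. Let $F = (V, \Sigma, P, S)$ be a synchronized EOL form. Define $F_\varepsilon = (V, \Sigma, P_\varepsilon, S)$, $F_\varepsilon \triangleleft_g F(\mu)$, the $\varepsilon$-interpretation of $F$, where $\mu(A) = \{A\}$ for all $A$ in $V - \Sigma$, $\mu(a) = \{a, \varepsilon\}$ for all $a$ in $\Sigma$, and $P_\varepsilon^N = \mu(P^N)$. In the example mentioned above $F_\varepsilon$ is given by:

$S \to aA$; $S \to A$; $A \to AA$; $A \to A$; $A \to a$; $A \to \varepsilon$; $a \to N$; $N \to N$.

Immediately, $L(F_\varepsilon) \neq \emptyset$.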

We first consider closure properties of synchronized EOL forms under g-interpretations. We say that a blocking EOL form $F$ is empty if $\mathcal{L}_s(F_\varepsilon)$ is either $\{\emptyset\}$ or $\{\emptyset, \{\varepsilon\}\}$; otherwise $F$ is nonempty.

THEOREM 3.2. For all nonempty synchronized EOL forms $F = (V, \Sigma, P, S)$, $\mathcal{L}_g(F)$ is closed under (i) union, (ii) finite substitution and (iii) intersection with a regular set, i.e., $\mathcal{L}_g(F)$ is a pre-semi-AFL.

Proof. By construction.

(i) Let $F'$ and $F''$ be interpretations of $F$, with $F' = (V', \Sigma', P', S')$ and $F'' = (V'', \Sigma'', P'', S'')$. Construct $F''' = (V''', \Sigma''', P''', S''')$ such that $L(F''') = L(F') \cup L(F'')$. Without loss of generality assume $(V' - \Sigma') \cap (V'' - \Sigma'') = \emptyset$.

Let $F' \triangleleft_g F(\mu')$ and $F'' \triangleleft_g F(\mu'')$.

Define $\mu'''$ by:

$\mu'''(a) = \mu'(a) \cup \mu''(a)$ for all $a$ in $\Sigma' \cup \Sigma''$,
$\mu'''(A) = \mu'(A)$ for all $A$ in $V' - (\Sigma' \cup S')$,
$\mu'''(A) = \mu''(A)$ for all $A$ in $V'' - (\Sigma'' \cup S'')$, and
$\mu'''(S) = \mu'(S) \cup \mu''(S) \cup \{S'''\}$.

Finally, let $V''' = V' \cup V'' \cup \{S'''\}$, $\Sigma''' = \Sigma' \cup \Sigma''$ and $P''' = P' \cup P'' \cup \{S''' \to \alpha : S' \to \alpha$ is in $P'$ or $S'' \to \alpha$ is in $P''\}$. Clearly, $F''' \triangleleft_g F(\mu''')$, and since $F'$ and $F''$ are synchronized EOL forms, $L(F''') = L(F') \cup L(F'')$.

(ii) Let $F' = (V', \Sigma', P', S') \triangleleft_g F(\mu')$ and $f$ be a finite substitution from $\Sigma'$-symbols to finite subsets of $\Sigma''^*$. Let $F'' = (V'', \Sigma'', P'', S'')$ be a new synchronized EOL form, where $V'' = (V' - \Sigma') \cup \Sigma''$, $S'' = S'$ and $P'' = \{A \to f(\alpha) : A \to \alpha$ is in $P'$, $A$ in $V' - \Sigma'$, $\alpha$ in $\Sigma'^*\} \cup \{A \to \alpha : A$ in $V' - \Sigma'$, $\alpha$ in $(V' - \Sigma')^*\} \cup \{a \to N : a$ in $\Sigma''$, $N$ is a blocking symbol in $V'\}$. Then $L(F'') = f(L(F'))$ and $F'' \triangleleft_g F$.

(iii) This follows by a slight modification of the construction in Lemma 2.3.

Recall that an a-NGSM (nondeterministic generalized sequential machine with accepting states) is defined by a 6-tuple $M = (Q, \Sigma, \Delta, \delta, q_0, F)$, where $Q$ is a finite set of states, $\Sigma$ and $\Delta$ are the input and output alphabets respectively, $\delta$ is a transition function $\delta : Q \times \Sigma \to 2^{Q \times \Delta^*}$, $q_0$ is an initial state and $F \subseteq Q$ is a set of final states. Let $M$ be an a-NGSM. Then an a-NGSM map $M : \Sigma^* \to 2^{\Delta^*}$ is defined by $M(\varepsilon) = \{\varepsilon\}$ and $M(x) = \{y : (p, y)$ is in $\delta(q_0, x)$, for some $p$ in $F$, where $\delta$ has been extended to $Q \times \Sigma^*$ in the natural way$\}$.
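The a-NGSM map can be computed for a given input word by tracking all reachable pairs of state and output. The following is a minimal sketch, assuming the transition function is given as a dictionary from `(state, input symbol)` to a finite set of `(state, output word)` pairs, with words encoded as tuples; the name `angsm_map` and this encoding are assumptions of the example.

```python
def angsm_map(delta, q0, finals, word):
    """Return M(word): all outputs produced along runs that read the
    whole input word and end in a final state."""
    if not word:
        return {()}          # M(empty word) = {empty word} by definition
    configs = {(q0, ())}     # reachable (state, output so far) pairs
    for sym in word:
        configs = {
            (p, out + y)
            for q, out in configs
            for p, y in delta.get((q, sym), ())
        }
    return {out for q, out in configs if q in finals}


# One-state machine: A is rewritten as a or b, B is erased.
delta = {
    ("q", "A"): {("q", ("a",)), ("q", ("b",))},
    ("q", "B"): {("q", ())},
}
```

A single-state machine of this kind applies a finite substitution letter by letter, which is the shape of machine used later in the proof of Theorem 3.7.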

COROLLARY 3.3. Let $F$ be a nonempty synchronized EOL form. Then $\mathcal{L}_g(F)$ is closed under a-NGSM maps.

Clearly Theorem 3.2 provides the strongest closure results. Since $\mathcal{L}(EOL)$ is not a semi-AFL, $\mathcal{L}_g(F)$ is in general not closed under inverse homomorphism. This observation holds even when we consider synchronized ETOL forms, since synchronized EOL forms are a special case.

We now have:

THEOREM 3.4. Let $F = (V, \Sigma, P, S)$ be a nonempty synchronized EOL form. Then $\mathcal{L}_g(F) = \mathcal{P}(\mathcal{L}_s(F_\varepsilon))$.

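Proof. Clearly $\mathcal{P}(\mathcal{L}_s(F_\varepsilon)) \subseteq \mathcal{L}_g(F)$, since $\mathcal{L}_s(F_\varepsilon) \subseteq \mathcal{L}_g(F)$. For an arbitrary $F' = (V', \Sigma', P', S') \triangleleft_g F(\mu')$ construct $F'' = (V'', \Sigma'', P'', S'') \triangleleft_s F_\varepsilon(\mu'')$ such that $L(F') = h(L(F''))$ for some homomorphism $h$. Let

$V'' = (V' - \Sigma') \cup \Sigma''$,
$\Sigma'' = \{[a, x] : a$ is in $\Sigma$ and $x$ is in $\mu'(a)\}$,
$P'' = \{A \to \alpha'' : A \to h(\alpha'')$ is in $P'\}$, where $h(A) = A$ for all $A$ in $V'' - \Sigma''$, and $h([a, x]) = x$ for all $[a, x]$ in $\Sigma''$,

and finally $S'' = S'$.

Now $L(F') = h(L(F''))$ and $F'' \triangleleft_s F_\varepsilon(\mu'')$, where $\mu''(A) = \mu'(A)$ for all $A$ in $V - \Sigma$, and $\mu''(a) = \{[a, x] : x$ is in $\mu'(a)\}$ for all $a$ in $\Sigma$. This gives us the required result, since $h(L(F''))$ is in $\mathcal{P}(\mathcal{L}_s(F_\varepsilon))$.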

Letting $\mathfrak{L}_s = \{\mathcal{L}_s(F) : F$ is a synchronized EOL form$\}$ and $\mathfrak{L}_g = \{\mathcal{L}_g(F) : F$ is a synchronized EOL form$\}$, we have:

THEOREM 3.5. $\mathfrak{L}_g$ and $\mathfrak{L}_s$ are incomparable.


Proof. First consider the form $F$: $S \to a$; $a \to N$; $N \to N$. $\mathcal{L}_s(F) \subsetneq \mathcal{L}(FIN)$, since $\mathcal{L}_s(F)$ is the family of alphabets. However, no nontrivial subset of $\mathcal{L}(FIN)$ can be generated by any synchronized EOL form under g-interpretation (see Theorem 3.6).

On the other hand, $\mathcal{L}(FIN)$ cannot be generated by any synchronized EOL form under s-interpretation. This follows from the observation that s-interpretations are length-preserving.

Finally, we have incomparability and not disjointness, since $\mathcal{L}(EOL)$ is in both classes.

One consequence of this theorem is that the pre-semi-AFL closure of $\mathcal{L}_s(F)$, for some synchronized EOL form $F$, is not necessarily in $\mathfrak{L}_s$. This is similar to the situation obtaining for grammar forms (see Theorem 4.6).

We continue by studying the generative capacity of synchronized EOL forms under g-interpretation.

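THEOREM 3.6. Let $F = (V, \Sigma, P, S)$ be a nonempty synchronized EOL form. Then $\mathcal{L}_g(F) = \mathcal{L}(FIN)$ iff $L(F_\varepsilon)$ is finite.

Proof. If: Since $L(F)$ is nonempty, there exists at least one nonempty word $x = a_1 \cdots a_m$, for some $m > 0$, in $L(F)$. Consider the derivation in $F$, $S \Rightarrow^+ x$. Let $K$ be an arbitrary finite set. Construct an isolating interpretation $F' \triangleleft_g F(\mu)$, where $\mu(a) = K \cup \{\varepsilon\}$ for all $a$ in $\Sigma$, such that $S' \Rightarrow \alpha_2 \Rightarrow \cdots \Rightarrow \alpha_n \Rightarrow x' \Rightarrow^+$ blocking in $F'$ implies (i) $x'$ is in $K$ and (ii) $S = \mu^{-1}(S') \Rightarrow \mu^{-1}(\alpha_2) \Rightarrow \mu^{-1}(\alpha_3) \Rightarrow \cdots \Rightarrow \mu^{-1}(\alpha_n) \Rightarrow x \Rightarrow^+$ blocking in $F$. Clearly $L(F') = K$.

Only if: Since $\mathcal{L}_g(F) = \mathcal{L}(FIN)$, it follows immediately that $L(F_\varepsilon)$ must be finite.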

DEFINITION. An OL system $G$ is an EOL system $(\Sigma, \Sigma, P, S)$ in which $S$ can be an arbitrary word in $\Sigma^+$.

This leads to:

THEOREM 3.7. Let $F = (V, \Sigma, P, S)$ be a synchronized EOL form. Then $\mathcal{L}_g(F) = \mathcal{L}(EOL)$ iff $\mathcal{L}(OL) \subseteq \mathcal{L}_g(F)$.

Proof. If: Let $F' = (V', \Sigma', P', S') \triangleleft_g F(\mu)$ be an arbitrary interpretation of $F$.

Consider the OL system $G = (V'', V'', P'', S'')$ where $V'' = (V' - \Sigma') \cup \{N\}$ and $P'' = P' \cap (V'' \times V''^*)$, and a single state a-NGSM $M = (\{q\}, V'', \Sigma', \delta, q, \{q\})$ which has transitions $\delta(q, A) = \{(q, x) : A \to x$ is in $P'$ and $x$ is in $\Sigma'^*\}$ (see [5]). Immediately $L(F') = M(L(G))$. Since $\mathcal{L}_g(F)$ is closed under a-NGSM maps, we have the result.

Only if: Obvious.

We need the following notion:


DEFINITION. An EOL form $F = (V, \Sigma, P, S)$ is expansive if there exists a nonterminal $A$ in $V - \Sigma$ fulfilling the following conditions:

(i) $A \Rightarrow^+ x$ for some $x$ in $\Sigma^+$, and

(ii) $A \Rightarrow^+ \alpha A \beta A \delta$ for some $\alpha, \beta, \delta$ in $V^*$.

The following result should be compared with the corresponding result for $\mathcal{L}(CF)$ in [3].

THEOREM 3.8. Let $F = (V, \Sigma, P, S)$ be a synchronized EOL form. Then $\mathcal{L}_g(F) = \mathcal{L}(EOL)$ iff $F$ is expansive.

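Proof. If: Assume $F$ is properly expansive. By definition there is an $A$ in $V - \Sigma$ such that $A \Rightarrow^n \alpha A \beta A \delta$, where $\alpha, \beta, \delta$ are in $V^+$, and $A \Rightarrow^+ x$, $x$ in $\Sigma^+$. Consider the derived EOL form $G = (V, \Sigma, P, A)$.

We first argue that $\mathcal{L}_g(G) \subseteq \mathcal{L}_g(F)$, omitting the details. For each $G' = (V', \Sigma', P', S') \triangleleft_g G(\mu)$ we can construct an interpretation $F'$ of $F$ in such a way that a derivation $S \Rightarrow^+ \alpha A \beta$ in $F$ is isolated, and furthermore $\alpha\beta \Rightarrow^* \varepsilon$ is the only terminating derivation for $\alpha\beta$ in $F'$. Finally, including all productions $P'$ in $P_{F'}$ also, we have $L(F') = L(G')$.

Secondly, we show that $\mathcal{L}_g(G) \supseteq \mathcal{L}(EOL)$, giving the required result. Since $A \Rightarrow^m \alpha A \beta A \delta$ for some $m > 1$, we can construct an isolating interpretation $G' \triangleleft_g G$ such that

(i) $A \Rightarrow^n \alpha' A \beta' A \delta'$ and $\alpha'\beta'\delta' \Rightarrow^k \varepsilon$ is the only terminating derivation for $\alpha'\beta'\delta'$, where $k < n$,

(ii) $A \Rightarrow^n \alpha'' A'' \beta'' A \delta''$ and $\alpha'' A'' \beta'' \delta'' \Rightarrow^k \varepsilon$ is the only terminating derivation for $\alpha'' A'' \beta'' \delta''$, where $k < n$, and

(iii) $A \Rightarrow^n a$, $a$ in $\Sigma'$.

Now by arguments similar to those in Theorem 5.3 of [14] we have that $\mathcal{L}(EOL) \subseteq \mathcal{L}_g(G')$.

Only if: If $\mathcal{L}_g(F) = \mathcal{L}(EOL)$, clearly $F$ is expansive, by arguments similar to those in [14] for $\mathcal{L}_s(F)$. Hence the theorem is proved.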

THEOREM 3.9. For every synchronized EOL form $F$, $\mathcal{L}_g(F) \neq \mathcal{L}(CF)$.

Proof. Assume otherwise, and let $F$ be a synchronized EOL form $F = (V, \{a\}, P, S)$ such that $\mathcal{L}_g(F) = \mathcal{L}(CF)$. Now since $\mathcal{P}(\mathcal{L}_s(F_\varepsilon)) = \mathcal{L}_g(F)$ by Theorem 3.4, we obtain a contradiction as follows. Observe that $\mathcal{L}_s(F_\varepsilon)$ is closed under both union and intersection with a regular set (Lemmas 2.2 and 2.3). Observe that $\mathcal{L}_s(F_\varepsilon)$ must contain at least one metalinear language $K$ which is not linear. For otherwise, either $\mathcal{L}_s(F_\varepsilon) \subseteq \mathcal{L}(LIN)$, in which case $\mathcal{P}(\mathcal{L}_s(F_\varepsilon)) \subseteq \mathcal{L}(LIN)$, or there exists a language $M$ in $\mathcal{L}_s(F_\varepsilon)$ which is not metalinear. In this case $F$ is expansive. Now it is straightforward to construct an $F' \triangleleft_s F_\varepsilon$ such that $L(F')$ is metalinear and not linear.

Hence $K$ exists. Now by arguments in [20], the presence of such a $K$ in $\mathcal{L}_s(F_\varepsilon)$ implies there is a non-context-free language in $\mathcal{L}_s(F_\varepsilon)$. By contradiction the theorem is proved.

4. S-INTERPRETATIONS OF GRAMMAR FORMS

We continue our study with an investigation of s-interpretations of terminal grammar forms. These results are analogous to those in [14] for EOL forms, since we allow nonblocking productions with terminals on the left hand side.

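NOTATION. Throughout this section we denote $\mathcal{L}_{Cg}(F)$ and $\mathcal{L}_{Cs}(F)$ by $\mathcal{L}_g(F)$ and $\mathcal{L}_s(F)$, respectively.

DEFINITION. Let $G = (V, \Sigma, P, S)$ be a terminal grammar form. A derivation $A \Rightarrow^+ \alpha$ in $G$ is a tnt-derivation (total nonterminal derivation) if for all sequences $\alpha_0, \ldots, \alpha_m$, where $m > 0$, $\alpha_0 = A$, $\alpha_m = \alpha$ and $\alpha_0 \Rightarrow \alpha_1 \Rightarrow \alpha_2 \Rightarrow \cdots \Rightarrow \alpha_m$, $\alpha_i$ is in $V^*(V - \Sigma)V^*$, $0 \le i < m$.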

The following simulation lemma is the appropriate generalization of the simulation lemma in [3].

LEMMA 4.1. Let $F = (V, \Sigma, P, S)$ be a terminal grammar form and $A \Rightarrow^+ \alpha$ a tnt-derivation in $F$. Then $G = (V, \Sigma, P \cup \{A \to \alpha\}, S)$ is s-form equivalent to $F$.

Proof. Clearly $\mathcal{L}_s(F) \subseteq \mathcal{L}_s(G)$, since $F \triangleleft_s G$. $A \Rightarrow^+ \alpha$ is a tnt-derivation in $F$; therefore, in any interpretation $G' \triangleleft_s G$ a derivation involving an interpretation of $A \to \alpha$ can be "simulated" in an $F' \triangleleft_s F$ without adding to or subtracting from $L(G')$.

The above lemma is not true when the tnt condition is dropped. For example, consider $F$: $S \to a$; $a \to b$; $b \to N$; $N \to N$: clearly $a \Rightarrow^+ N$ in $F$, and this is not a tnt-derivation. Immediately $G$: $S \to a$; $a \to b$; $a \to N$; $b \to N$; $N \to N$ is not s-form equivalent to $F$, since $G' \triangleleft_s G$, $G'$: $S \to a$; $a \to N$; $N \to N$ gives $L(G') = \{a\}$, and there is no $F' \triangleleft_s F$ with $L(F') = \{a\}$.

It is a straightforward observation that if $F$ is a grammar form, any nonblocking derivation can be simulated in the original [3] manner.

We first contrast terminal grammar forms with grammar forms. A number of results in [3] also hold under s-interpretation and even when grammar form is replaced by terminal grammar form. We summarize these in the following theorem.

THEOREM 4.2. Let $F$ be a terminal grammar form. There exists $G$, a terminal grammar form, s-form equivalent to $F$, such that:

(i) $G$ is reduced,

(ii) $G$ is separated,

(iii) $G$ is short, and

(iv) $G$ is binary.

Proof. (i) is straightforward. (ii), (iii) and (iv) all follow by use of Lemma 4.1.

In [14] it is shown that there exist EOL forms which have no s-form equivalent blocking EOL form. Here we have a similar result, namely:

THEOREM 4.2. Let $G$ be the terminal grammar form given by $S \to a$; $a \to b$; $b \to b$; then there is no s-form equivalent grammar form $F$.

Proof. Assume such a grammar form $F$ exists. Now there exists $F' \triangleleft_s F$ with $L(F') = L(G)$, since $F$ and $G$ are s-form equivalent. However, since $F'$ is, by definition, blocking, there is a derivation $S' \Rightarrow^+ a$ in $F'$, and there is no derivation $a \Rightarrow^+ \alpha$ in $F'$ for any $\alpha$ in $\Sigma'^*$. Consider the interpretation $F'' \triangleleft_s F'$, in which the $F'$-derivation $S' \Rightarrow^+ a$ is isolated, that is, $L(F'') = \{a\}$. We now have a contradiction, since each language $K$ in $\mathcal{L}_s(G)$ contains at least two words.

We also have a similar result for removal of empty rules, rules of the form $A \to \varepsilon$.

THEOREM 4.3. Let $G$ be the terminal grammar form given by $S \to ab$; $a \to N$; $b \to \varepsilon$; $N \to N$; then there is no s-form equivalent propagating terminal grammar form $F$.

Proof. Assume such an $F$ exists. Consider $F' = (V', \Sigma', P', S') \triangleleft_s F$ with $L(F') = L(G)$. On the one hand assume there are two distinct $F'$-derivations:

$S' \Rightarrow^+ ab \Rightarrow^+$ blocking and $S' \Rightarrow^+ a \Rightarrow^+$ blocking,

such that neither $a \Rightarrow^+ ab$ nor $ab \Rightarrow^+ a$. Immediately there is an isolating interpretation $F'' \triangleleft_s F'$ such that $L(F'') = \{a\}$, say. This leads to a contradiction, since for all $G' \triangleleft_s G$, $L(G')$ contains at least two words. On the other hand, there are two possible $F'$-derivations:

(i) $S' \Rightarrow^+ ab \Rightarrow^+ a \Rightarrow^+ \cdots$ or (ii) $S' \Rightarrow^+ a \Rightarrow^+ ab \Rightarrow^+ \cdots$.

Now in case (ii) we must also have $ab \Rightarrow^+ ab^2 \Rightarrow^+ ab^i$ for all $i \ge 2$, an immediate contradiction, since all words in $L(G')$, $G' \triangleleft_s G$, are of length at most 2. In case (i) we can only have $b \Rightarrow^+ \varepsilon$, which contradicts the assumption that $F$ and, hence, $F'$ are propagating.

We next show that there are terminal grammar forms which give rise to AFLs and anti-AFLs, a result similar in spirit to that of [14] for EOL forms. This demonstrates, in particular, that it is not parallel rewriting that gives rise to the nonclosure properties of EOL forms (even of synchronized EOL forms), but rather the s-interpretation mechanism itself.

THEOREM 4.4. Consider the following terminal grammar forms:

(i) $G_1$: $S \to a$; $S \to SS$; $a \to N$; $N \to N$,

(ii) $G_2$: $S \to ab$; $a \to N$; $b \to c$; $c \to N$; $N \to N$;

then $\mathcal{L}_s(G_1)$ is an AFL and $\mathcal{L}_s(G_2)$ is an anti-AFL.

Proof. (i) Any context-free grammar in Chomsky normal form is an interpretation of $G_1$; hence $\mathcal{L}_s(G_1)$ is the family of $\varepsilon$-free context-free languages. This is a well-known AFL.

(ii) It is necessary to show that $\mathcal{L}_s(G_2)$ is not closed under any AFL operations. Note that $\mathcal{L}_s(G_2) \subsetneq \mathcal{L}(FIN)$, and therefore it is not an AFL. Proper containment arises, since for each $F' \triangleleft_s G_2$, $L(F')$ only contains words of length 2. This observation implies that $\mathcal{L}_s(G_2)$ is not closed under catenation or catenation closure. Secondly, for each $F' \triangleleft_s G_2$, $L(F')$ contains at least two words. This implies that $\mathcal{L}_s(G_2)$ is not closed under homomorphism nor under intersection with a regular set. $\mathcal{L}_s(G_2)$ is not closed under union, since $\{ab, ba\}$ cannot be generated by any $F' \triangleleft_s G_2$. Finally, letting $h(a) = a$, $h(b) = b$ and $h(c) = \varepsilon$, we have that $h^{-1}(\{ab\})$ is not in $\mathcal{L}_s(G_2)$, since it is not a finite language.

However, if we restrict attention to grammar forms then we can only obtain a nearly anti-AFL result, since for each grammar form $G$, $\mathcal{L}_s(G)$ is closed under intersection with regular sets by Lemma 2.3. In this case we have:

THEOREM 4.5. Consider the grammar form $G_3$: $S \to ab$; $a \to N$; $b \to N$; $N \to N$; then $\mathcal{L}_s(G_3)$ is only closed under intersection with regular sets.

Proof. The arguments are identical with those for $G_2$ in Theorem 4.4, except that $\mathcal{L}_s(G_3)$ is closed under intersection with a regular set.

In [3] it is shown that every nontrivial grammar form (F is nontrivial if L(F) is infinite) gives rise to a semi-AFL. As we have seen, this is not true under s-interpretation. However we do have:

THEOREM 4.6. Let $F$ be a nonempty grammar form; then there exists a grammar form $H$ such that $\mathcal{L}_s(H) = \mathcal{S}(\mathcal{L}_s(F))$, the semi-AFL closure of $\mathcal{L}_s(F)$.

Proof. Let $F = (V, \Sigma, P, S)$. First define $G = ((V - \Sigma) \cup \{a\}, \{a\}, P_G, S)$, letting $h(A) = A$, $A$ in $V - \Sigma$, and $h(b) = a$, $b$ in $\Sigma$; then $P_G = \{A \to h(\alpha) : A \to \alpha$ in $P\}$. In other words all terminals in $\Sigma$ are identified with $a$. Now it is clear that $\mathcal{L}_s(F) \subseteq \mathcal{L}_s(G)$, and further, $\mathcal{L}_s(G)$ is closed under union (Lemma 2.2). Now construct $H = (V_H, \{a\}, P_H, S)$, where $V_H = (V - \Sigma) \cup \{a, S_a\}$, in which $S_a$ is a new nonterminal, and $g(A) = A$, $A$ in $V - \Sigma$, $g(a) = S_a$; then $P_H = \{A \to g(\alpha) : A \to \alpha$ is in $P_G\} \cup \{S_a \to a, S_a \to aS_a\}$. Again $\mathcal{L}_s(G) \subseteq \mathcal{L}_s(H)$; however, $\mathcal{L}_s(H)$ is a semi-AFL. The reason for this is that $\mathcal{L}_s(H)$ is closed under intersection with a regular set, union and regular substitution, since we have added this capability to $G$.

It remains to show that $\mathcal{S}(\mathcal{L}_s(F)) = \mathcal{L}_s(H)$. Since $\mathcal{S}(\mathcal{L}_s(F))$ is the least semi-AFL containing $\mathcal{L}_s(F)$, we have $\mathcal{S}(\mathcal{L}_s(F)) \subseteq \mathcal{L}_s(H)$. It suffices to show that the reverse inclusion obtains.

Consider an arbitrary interpretation $H' \triangleleft_s H(\mu)$, where $H' = (V', \Sigma', P', S')$.

We construct an interpretation $F' \triangleleft_s F$ from $H'$. Let $F' = (V'', \Sigma'', P'', S'')$, where

$V'' = \mu(V - \Sigma) \cup \{A'' : A$ is in $\mu(S_a)\}$,
$\Sigma'' = \{A'' : A$ is in $\mu(S_a)\}$,
$P'' = \{B \to \beta'' : B \to \beta$ is in $P'$, $B$ is in $\mu(V - \Sigma)$, and $\beta''$ is $\beta$ except all $\mu(S_a)$ symbols in $\beta$ are replaced by their double primed versions$\}$, and
$S'' = S'$.

Finally, define the regular substitution $\tau$ on $\Sigma''^*$ by $\tau(A'') = \{x : x$ in $\Sigma'^*$ and $A \Rightarrow^+ x$ in $H'\}$. Immediately $L(H') = \tau(L(F'))$, and clearly $F' \triangleleft_s F$; therefore we have the result.

We have also the much stronger result:

THEOREM 4.7. Let $F = (V, \Sigma, P, S)$ be a nontrivial grammar form. Then $\mathcal{S}(\mathcal{L}_s(F)) = \mathcal{L}_g(F)$.

Proof. Since $\mathcal{L}_s(F) \subseteq \mathcal{L}_g(F)$ (an s-interpretation is always a g-interpretation), we have $\mathcal{S}(\mathcal{L}_s(F)) \subseteq \mathcal{L}_g(F)$. It remains to show that for an arbitrary g-interpretation $F' \triangleleft_g F(\mu')$, $L(F')$ is in $\mathcal{S}(\mathcal{L}_s(F))$. Construct an s-interpretation $F'' \triangleleft_s F(\mu'')$ and a homomorphism $h$ as in Theorem 3.4 such that $L(F') = h(L(F''))$, whence the result.

We now have:

COROLLARY 4.8. Let $F$ be a nontrivial grammar form. There exists a grammar form $H$ such that $\mathcal{L}_g(F) = \mathcal{L}_s(H)$.

Proof. By Theorems 4.6 and 4.7.

Letting $\mathfrak{L}_g$ and $\mathfrak{L}_s$ denote the classes of grammar form language families, we have shown:

THEOREM 4.9. $\mathfrak{L}_g$ and $\mathfrak{L}_s$ are incomparable.

Proof. Analogous to the proof of Theorem 3.5.

We say a (terminal) grammar form $G$ is complete if $\mathcal{L}_s(G) = \mathcal{L}(CF)$. Similarly $G$ is very complete (or vomplete) if for all (terminal) grammar forms $F$, $\mathcal{L}_s(F) = \mathcal{L}_s(G')$ for some $G' \triangleleft_s G$. We now have results similar to those in [14] and [16] for EOL forms.

THEOREM 4.10. The following terminal grammar forms are complete:

(i) $H_1$: $S \to a$; $S \to SS$; $a \to N$; $N \to N$;

(ii) $H_2$: $S \to a$; $S \to aS$; $a \to S$;

(iii) $H_3$: $S \to a$; $S \to aS$; $S \to aSS$; $a \to N$; $N \to N$;

(iv) $H_4$: $S \to a$; $S \to \varepsilon$; $S \to aSS$; $a \to S$.

Proof. $H_1$ is in Chomsky normal form, and hence is complete, while $H_1 \triangleleft_s H_2$, so that $H_2$ is complete. $H_3$ is in Greibach normal form and can be obtained from $H_4$ via the simulation lemma, Lemma 4.1.

Recently, in [17], a "super-normal form" result has been proved. This was also announced in [6]. Standard normal forms, for example the Chomsky and the Greibach normal forms, follow as immediate consequences of this theorem.

THEOREM 4.11. Let $F = (\{S, a\}, \{a\}, P, S)$ be a grammar form such that (i) $L(F) = a^*$, and (ii) there is a production $S \to \alpha$ in $P$ with $\alpha$ containing at least two $S$ symbols. Then $\mathcal{L}_s(F) = \mathcal{L}(CF)$.

Proof. See [17].

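COROLLARY 4.12. The following terminal grammar forms are complete:

$H_5$: $S \to a$; $S \to aa$; $S \to SaS$; $a \to N$; $N \to N$,
$H_6$: $S \to \varepsilon$; $S \to SaS$; $a \to N$; $N \to N$.

COROLLARY 4.13. The following terminal grammar forms are vomplete:

$E_1$: $S \to \varepsilon$; $S \to a$; $S \to SS$; $a \to S$,
$E_2$: $S \to \varepsilon$; $S \to SaS$; $a \to S$.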

Proof. $E_1$ and $E_2$ are clearly complete by Theorem 4.11. Now $E_1$ and $E_2$ are very complete, since any arbitrary terminal grammar form G can be "simulated" by appropriate interpretations $E_1'$ and $E_2'$ of $E_1$ and $E_2$ respectively. On the one hand, without loss of generality, we can assume G is a binary terminal grammar form, in which case $E_1'$ is easily constructed. On the other hand, by a result found in [6] (a stronger version of Theorem 4.11), each grammar form G can be assumed to be in the normal form specified by $E_2$.

5. CONCLUDING REMARKS

The similarities and contrasts found in sequential and parallel context-free rewriting are clearly brought out in this study of (terminal) grammar forms and (synchronized) EOL forms. On the one hand, letting F be a blocking production scheme, we have found, for example:

(i) l&(F) is a semi-AFL, whereas r;L, (F) is a pre-semi-AFL;
(ii) &-- (F) = $\mathcal{L}(\mathrm{CF})$ if F is properly expansive, whereas lZL, (F) = $\mathcal{L}(\mathrm{EOL})$ if F is expansive;
(iii) S(&--(F)) = &(F) and 9(iZb(F)) = $\mathcal{L}_{LG}(F)$;
(iv) E,,(F) = $\mathcal{L}(\mathrm{FIN})$ iff L(F) is finite iff i&(F) = $\mathcal{L}(\mathrm{FIN})$.

On the other hand, letting F be a production scheme, we have, for example:

(i) i!&,(F) and f?,(F) may both be anti-AFLs, and
(ii) there does not necessarily exist a propagating production scheme G or a blocking production scheme H with

$\mathcal{L}_{GS}(G) = \mathcal{L}_{GS}(F) = \mathcal{L}_{GS}(H)$ or $\mathcal{L}_{LS}(G) = \mathcal{L}_{LS}(F) = \mathcal{L}_{LS}(H)$.

Clearly much remains to be done to see how much further these analogies and differences can be carried. We close by citing two decidability problems which are now easily solved. Namely, given a blocking production scheme F under s-interpretation, can it be decided whether $\mathcal{L}_{GS}(F)$ or $\mathcal{L}_{LS}(F)$ equals the family of regular sets?

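By a result similar to Theorem 4.11, $\mathcal{L}_{GS}(F) \supseteq \mathcal{L}(\mathrm{REG})$ iff $L_G(F) \supseteq a^*$ for some a. [In this case, there is an $F' \triangleleft F$ such that $L_G(F') = a^*$. Since $\mathcal{L}_{GS}(F')$ is closed under intersection with a regular set, for each K in $\mathcal{L}(\mathrm{REG})$ there is an $F'' \triangleleft F'$ with $L_G(F'') = K$.] This is decidable. We have a similar result for $\mathcal{L}_{LS}(F) \supseteq \mathcal{L}(\mathrm{REG})$. Now $L(F')$ is in $\mathcal{L}(\mathrm{REG})$ for all $F' \triangleleft F$ iff F is non-self-embedding. Hence $\mathcal{L}_{GS}(F) = \mathcal{L}(\mathrm{REG})$ iff (i) F is non-self-embedding and (ii) $L_G(F) \supseteq a^*$ for some a.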
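Condition (i) is a purely syntactic property and can be tested directly. The following sketch is a minimal illustration, assuming an ε-free, reduced context-free grammar, so that every symbol derives a nonempty terminal word. The dictionary encoding and the name `is_self_embedding` are assumptions of this example and do not come from the paper. The function searches for a derivation A ⇒+ uAv with both u and v nonempty by tracking, along chains of productions, whether material has appeared to the left and to the right of the distinguished nonterminal.

```python
from collections import deque


def is_self_embedding(productions):
    """Decide self-embedding for an epsilon-free, reduced grammar.

    productions maps each nonterminal to a list of right sides, each a list
    of symbols; symbols that are not keys are terminals (assumed encoding).
    """
    nonterminals = set(productions)
    # One-step edges: A -> x B y gives (B, x is nonempty, y is nonempty).
    edges = {a: set() for a in nonterminals}
    for a, right_sides in productions.items():
        for rhs in right_sides:
            for i, symbol in enumerate(rhs):
                if symbol in nonterminals:
                    edges[a].add((symbol, i > 0, i < len(rhs) - 1))
    for start in nonterminals:
        seen = set(edges[start])
        queue = deque(seen)
        while queue:
            b, left, right = queue.popleft()
            if b == start and left and right:
                return True  # start derives u start v with u, v nonempty
            for c, l2, r2 in edges[b]:
                state = (c, left or l2, right or r2)
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False


print(is_self_embedding({"S": [["a", "S", "b"], ["a"]]}))
print(is_self_embedding({"S": [["a", "S"], ["a"]]}))
```

Each search visits at most four flag combinations per nonterminal, so the test runs in time polynomial in the size of the grammar. Grammars with ε-productions or useless symbols would first have to be brought into the assumed form.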

REFERENCES

1. E. Bertsch, An observation on relative parsing time, J. Assoc. Comput. Mach. 22, 493-498 (1975).

2. B. von Braunmühl, E. Hotzel, and D. Wood, Pre-orders, closure operators and grammar forms, Report 78-CS-17, McMaster Univ., Hamilton, Canada, 1978.

3. A. B. Cremers and S. Ginsburg, Context-free grammar forms, J. Comput. System Sci. 11, 86-116 (1975).

4. A. B. Cremers, S. Ginsburg, and E. H. Spanier, The structure of context-free grammatical families, J. Comput. System Sci., to be published.

5. K. Culik II, On some families of languages related to developmental systems, Internat. J. Comput. Math. 4 (A-1), 31-42 (1974).

6. S. Ginsburg, A survey of grammar forms--1977, unpublished manuscript.

7. S. Ginsburg, B. L. Leong, O. Mayer, and D. Wotschke, On strict interpretations of grammar forms, unpublished manuscript.

8. S. Ginsburg and H. A. Maurer, On strongly equivalent grammar forms, Computing 16, 281-290 (1976).

9. S. Ginsburg and H. A. Maurer, ~-~t~~tio~ of grammar forms, Computing 19, 141-147 (1977).

10. S. Ginsburg and D. Wood, Simple precedence relations in grammar forms, Report 76-CS-2, McMaster Univ., Hamilton, Ontario, Canada, 1976.

11. S. A. Greibach, Control sets on context-free grammar forms, J. Comput. System Sci. 15 (1), 35-98 (1977).

12. G. Herman and G. Rozenberg, Developmental Systems and Languages, North-Holland/American Elsevier, New York, 1975.

13. H. A. Maurer, M. Penttonen, A. Salomaa, and D. Wood, On non-context-free grammar forms, Report 59, Institut für Angewandte Informatik und Formale Beschreibungsverfahren, Universität Karlsruhe, 1977.

14. H. A. Maurer, A. Salomaa, and D. Wood, EOL forms, Acta Informat. 8, 75-96 (1977).

15. H. A. Maurer, A. Salomaa, and D. Wood, ETOL forms, J. Comput. System Sci. 16, to be published.

16. H. A. Maurer, A. Salomaa, and D. Wood, On good EOL forms, SIAM J. Comput. 7, 2 (1978).

17. H. A. Maurer, A. Salomaa, and D. Wood, On generators and generative capacity of EOL forms, Report 4, Institut für Informationsverarbeitung, TU Graz, Jan. 1978.

18. A. Salomaa, Formal Languages, Academic, New York, 1973.

19. H. Walter, Grammar forms and grammar homomorphisms, Acta Informat. 7, 75-94 (1976).

20. J. Albert and H. A. Maurer, The class of context-free languages is not an EOL family, Information Processing Letters 6, 190-195 (1977).

Received February 1978