Static Checking of Variable Handling in Dijkstra's Guarded Commands Language


Comput. Lang. Vol. 11, No. 3/4, pp. 123-142, 1986. 0096-0551/86 $3.00+0.00. Printed in Great Britain. Pergamon Journals Ltd

STATIC CHECKING OF VARIABLE HANDLING IN DIJKSTRA'S GUARDED COMMANDS LANGUAGE

PAUL A. BAILES Department of Computer Science, University of Queensland, St Lucia, Queensland 4067, Australia

(Received 10 July 1985; revision received 13 January 1986)

Abstract--A procedure to check that the correct basic relationships hold between visibility, initialisation, use and update of variables in Dijkstra's Guarded Commands language, from static analysis of program text, is motivated, presented and verified. To simplify the presentation and verification, an abstracted language for the static analysis of Dijkstra's language, together with a mapping from the concrete syntax, is introduced. A precise statement of the objectives of the checking procedure precedes a specification of the procedure's principal data structure. Then follows the procedure itself, and a detailed outline of its verification according to the stated objectives. A number of interesting developments of this work are foreshadowed.

Dijkstra Guarded commands Static checking

1. INTRODUCTION

Dijkstra's [1] Guarded Commands language offers a threefold challenge to the implementor:

(a) nondeterministic selection of alternatives;
(b) flexible arrays;
(c) checking adherence to the rules governing visibility, initialisation, and access to variables.

We have previously [2] sketched the outline of solutions to these three problems, and have also [3] pursued the third in some depth. This paper further explores the topic of developing, discussing and verifying a procedure that will check, by way of static analysis of program text, whether or not the rules governing the handling of variables are adhered to.

In order to concentrate on essentials, we identify and deal in detail with only four basic operations on variables:

(a) a variable may be made visible in a block;
(b) a variable may be initialised;
(c) a variable may be used (i.e. its value accessed);
(d) a variable may be updated.

We proceed by first defining an abstracted version of the Guarded Commands language (henceforth called GC) which embodies only those parts of GC at least indirectly germane to (a)-(d) above. This allows the subsequent derivation and verification of the static checking procedure to proceed free of the detail of the concrete syntax of GC. The static analysis language furthermore has a syntax that makes the tree structure of programs easy to recognise, which in turn facilitates the presentation and verification of our static checking procedures. A semiformal syntax-directed translation scheme (SDTS) from a concrete syntax to the abstracted static analysis language is provided.

2. SYNTACTIC METALANGUAGE

In definitions of context-free syntax throughout this paper, the form

S --> E

defines non-terminal symbol S to be the expression E. Such expressions may be of the form

E1 ... En

denoting the concatenation of the languages defined by the Ei, or the form

E1 | ... | En

denoting alternatives from among the languages defined by the Ei. The alternation operator '|' has lower precedence than the "invisible" concatenation operator. Parentheses '(' ')' may be used to override precedence. The form [E] means that E is optional (i.e. has the empty string as an alternative). The form {E} means zero or more repetitions of E. Terminal symbols are either characters (including strings) enclosed in single quotes, or names for which no definitions as non-terminal symbols appear.

3. ABSTRACT REPRESENTATION OF GC PROGRAMS

After defining a language for the static analysis of GC and explaining its constructs, we semi-formally define the correspondence between the GC concrete syntax and this language, incidentally specifying the concrete syntax in so doing.

3.1 Static Analysis Language for GC

program --> block

block --> 'BLOCK' '(' nomenclist ',' stmtpart ')'

nomenclist --> {nomenc ','} nomenc

nomenc --> nomenc-class '(' name-list ')'

nomenc-class --> 'PRIVAR'
    | 'PRICON'
    | 'VIRVAR'
    | 'VIRCON'
    | 'GLOVAR'
    | 'GLOCON'

name-list --> {name ','} name

stmtpart --> 'STMT' '(' [stmt-list] ')'

stmt-list --> {stmt ','} stmt

stmt --> 'INIT' '(' name-list ',' use ')'
    | 'UPDATE' '(' name-list ',' use ')'
    | 'DO' '(' [gcmd-list] ')'
    | 'IF' '(' [gcmd-list] ')'
    | block

gcmd-list --> {gcmd ','} gcmd

gcmd --> 'GUARD' '(' use ',' stmtpart ')'

use --> 'USE' '(' [name-list] ')'

The symbol "name" is a token, and includes strings of symbols beginning with a letter, and followed by any number of letters or digits.
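Because every construct of the static language has the shape operator-plus-parenthesised-list, a very small recursive-descent parser suffices to read it. The following is a minimal sketch in Python, not part of the original paper; the function names `tokenize` and `parse` and the nested `(operator, children)` tuple representation are assumptions chosen for illustration, and the parser assumes well-formed input rather than enforcing the full grammar above.

```python
import re

# A token is a name (a letter followed by letters or digits) or one of ( ) ,
TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9]*|[(),])")

def tokenize(text):
    """Split static-language text into names and punctuation symbols."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    """Parse 'OP (s, ..., s)' into nested (operator, [children]) tuples."""
    tokens = tokenize(text)
    node, end = _parse_node(tokens, 0)
    if end != len(tokens):
        raise SyntaxError("trailing input after the outermost construct")
    return node

def _parse_node(tokens, i):
    head = tokens[i]
    # A name not followed by '(' is a leaf (a variable name).
    if i + 1 >= len(tokens) or tokens[i + 1] != "(":
        return head, i + 1
    i += 2                      # skip the operator and its '('
    children = []
    while tokens[i] != ")":
        child, i = _parse_node(tokens, i)
        children.append(child)
        if tokens[i] == ",":
            i += 1              # skip the separator
    return (head, children), i + 1

print(parse("UPDATE (x, USE (x, y))"))
```

Leaves are plain strings and interior nodes are tuples, so an empty list such as `USE ()` parses to `("USE", [])`.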

3.2 Significance of Static Language Constructs

GC static language programs are thus seen to be comprised of substrings of the form

XXXX (s, s, ..., s)

where XXXX is a string of capital letters, and the s are further sub-strings. This suggests an equivalent tree representation

        XXXX
      /  |   \
     s   s ... s

Such a form is intended to indicate the application of the semantic operator XXXX to the various s.

The semantic operators of the static language are now described in terms of the sub-sentential forms of which they are prefixes (i.e. sub-trees of which they are root nodes).

3.2.1 BLOCK (nomenclist, stmtpart)

              BLOCK
            /   |    \
     nomenc ... nomenc stmtpart

The names made visible by the nomencs are available in stmtpart.

3.2.2 PRIVAR (name, ..., name)

        PRIVAR
        /    \
     name ... name

The names have a nomenclature of PRIVAR (PRICON, VIRVAR, VIRCON, GLOVAR, GLOCON similar).

3.2.3 STMT (stmt, ..., stmt)

         STMT
        /    \
     stmt ... stmt

The stmt are executed in sequence.

3.2.4 INIT (name, ..., name, use)

           INIT
         /   |   \
     name ... name use

The named variables are initialised, in which certain others may be used (see use).

3.2.5 UPDATE (name, ..., name, use)

          UPDATE
         /   |   \
     name ... name use

The named variables are updated, in which certain others may be used (see use).

3.2.6 USE (name, ..., name)

          USE
        /     \
     name ... name

The named variables are used.

3.2.7 DO (gcmd, ..., gcmd)

           DO
         /    \
     gcmd ... gcmd

A repetition involving the given gcmd.

3.2.8 IF (gcmd, ..., gcmd)

           IF
         /    \
     gcmd ... gcmd

A selection involving the given gcmd.

3.2.9 GUARD (use, stmtpart)

        GUARD
        /    \
     use   stmtpart

An arm of a selection or repetition construct in which the guarding expression makes use of some variables prior to any selection of stmtpart to be executed.

3.3 Augmenting the Metalanguage for SDTS Rules

In order to define the translations to be performed on programs in the concrete syntax to produce programs in the static language, we augment the syntactic metalanguage as follows.

Metasyntactic expressions of the form

E1 ... En

may be followed by the form

==> G1 ... Gm

Each Gi is either:

(a) of the form $j, for some integer j;
(b) a character string (delimited by ').

The translation of a concrete syntax fragment matching E1 ... En is defined as the concatenation of the Gi, provided that the Gi are given. A Gi of form $j refers to the result of the translation of the corresponding Ej. A Gi which is a character string is just that.

If the Gi are not given, then the translation of a fragment matching E1 ... En is the concatenation of the translations of each of the Ei. If such an Ei is a token, then the translation is the concrete syntax representation of the token.
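The two cases just described (an output form is given, or it is not) can be sketched as a single function. This is an illustrative Python sketch only; the name `apply_rule` and the encoding of an output form as a list whose items are either an integer j (standing for $j) or a literal string are assumptions of the example, not notation from the paper.

```python
def apply_rule(translations, output_form=None):
    """Combine the translations of E1..En according to an SDTS rule.

    translations: translations of E1..En in order (index 0 holds $1).
    output_form:  list of items G1..Gm, each an int j (meaning $j) or a
                  literal string; None when the rule gives no output form.
    """
    if output_form is None:
        # Default: concatenate the translations of every Ei.
        return "".join(translations)
    pieces = []
    for item in output_form:
        if isinstance(item, int):
            pieces.append(translations[item - 1])   # $j is 1-based
        else:
            pieces.append(item)                     # literal character string
    return "".join(pieces)

# A rule with two constituents whose output wraps $2 in parentheses after $1.
print(apply_rule(["PRIVAR", "x, y"], [1, " (", 2, ")"]))
```

A translator built this way would call `apply_rule` bottom-up, so that each `translations` list already holds the static-language text of the constituents.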

3.4 A Semi-Formal SDTS: Concrete Syntax to Static Language

In the semi-formal definitions which follow, ellipsis (e.g. $1, ..., $n) and forms such as $cn+d (e.g. $1, ..., $2n-1) are used to indicate a repeated pattern involving various $i's.

program --> block

block --> 'begin' nomenclature-part ';' statement-part 'end'
    ==> 'BLOCK (' $2 ', STMT (' $4 '))'

nomenclature-part --> {nomenclature ';'} nomenclature
    ==> $1 ',' $3 ',' ... ',' $2n-1
    /* n nomenclatures in nomenclature-part */

nomenclature --> nomenclature-class name-list
    ==> $1 '(' $2 ')'

nomenclature-class --> 'privar'
        ==> 'PRIVAR'
    | 'pricon'
        ==> 'PRICON'
    | 'virvar'
        ==> 'VIRVAR'
    | 'vircon'
        ==> 'VIRCON'
    | 'glovar'
        ==> 'GLOVAR'
    | 'glocon'
        ==> 'GLOCON'

statement-part --> {statement ';'} statement
    ==> S1 ',' ... ',' Sn
    /* Si represents $2j-1, where the j'th statement in the statement-part is the i'th one for which the corresponding static language form is non-null. */

statement --> initialisation-statement
    | update-statement
    | do-statement
    | if-statement
    | block
    | 'skip'
    | 'abort'

update-statement --> assignment-statement
    | array-statement

assignment-statement --> name-list ':=' expression-list
    ==> 'UPDATE (' $1 ', USE (' $3 '))'

name-list --> {name ','} name
    ==> $1 ',' $3 ',' ... ',' $2n-1
    /* n names in name-list */

expression-list --> {expression ','} expression
    ==> E1 ',' ... ',' En
    /* Ei represents $2j-1, where the j'th expression is the i'th one for which the corresponding static language form, i.e. the list of used names, is non-null. */

array-statement --> name ':' 'shift' '(' expression ')'
        ==> 'UPDATE (' $1 ', USE (' $5 '))'
    | name ':' 'hiext' '(' expression ')'
        ==> 'UPDATE (' $1 ', USE (' $5 '))'
    | name ':' 'loext' '(' expression ')'
        ==> 'UPDATE (' $1 ', USE (' $5 '))'
    | name ':' 'hirem'
        ==> 'UPDATE (' $1 ', USE ())'
    | name ':' 'lorem'
        ==> 'UPDATE (' $1 ', USE ())'
    | name ':' 'swap' '(' expression ',' expression ')'
        ==> 'UPDATE (' $1 ', USE (' NU '))'
        /* NU is the list of names (possibly empty) used in the expressions */

if-statement --> 'if' guarded-command-list 'fi'
    ==> 'IF (' $2 ')'

do-statement --> 'do' guarded-command-list 'od'
    ==> 'DO (' $2 ')'

guarded-command-list --> [{guarded-command '[]'} guarded-command]
    ==> $1 ',' $3 ',' ... ',' $2n-1
    /* n guarded-commands in guarded-command-list */

guarded-command --> expression '-->' statement-part
    ==> 'GUARD (USE (' $1 '), STMT (' $3 '))'

initialisation-statement --> simple-initialisation
    | array-initialisation

simple-initialisation --> typed-name-list ':=' init-expression-list
    ==> 'INIT (' $1 ', USE (' $3 '))'

typed-name-list --> {name 'vir' type ','} name 'vir' type
    ==> $1 ',' $4 ',' ... ',' $3n-2
    /* n names in typed-name-list */

type --> ('int' | 'bool') ['array']

init-expression-list --> {init-expression ','} init-expression
    ==> I1 ',' ... ',' In
    /* Ii represents $2j-1, where the j'th init-expression is the i'th one for which the corresponding static language form is non-null. */

init-expression --> expression
        ==> NU
        /* NU is a string which consists of the names which occur as part of expression, separated by commas. */
    | array-init-expression

array-init-expression --> '(' expression-list ')'
    ==> $2

expression --> expression logop log-expression
    | log-expression

log-expression --> 'non' log-expression
    | rel-expression

rel-expression --> add-expression relop add-expression
    | add-expression

add-expression --> add-expression addop mul-expression
    | mul-expression

mul-expression --> mul-expression mulop term
    | term

term --> signop term
    | factor

factor --> name
    | '(' expression ')'
    | truth-value
    | number
    | array-attribute

array-attribute --> name '.' simple-attribute
    | name '.' 'val' '(' expression ')'

simple-attribute --> 'dom'
    | 'hib'
    | 'lob'
    | 'high'
    | 'low'

truth-value --> 'true'
    | 'false'

signop --> ...

mulop --> ... | '/'

addop --> ... | ...

relop --> ... | ... | ... | ... | '4'

logop --> 'and'
    | 'or'
    | 'cand'
    | 'cor'

Symbols "name" and "number" are tokens. Names are as in the static language (see Section 3.1). Numbers are the standard representations of natural numbers in base ten.
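Several of the rules above translate an expression to NU, the list of names occurring in it. A rough way to compute such a list, without parsing the expression fully, is sketched below in Python. The function name `used_names` is hypothetical; skipping any identifier that follows a '.' (so that array attributes such as `dom` or `low` are not mistaken for variables) and listing each name only once are assumptions of this sketch, not requirements stated in the paper.

```python
import re

# Reserved words of the expression grammar that are not variable names.
RESERVED = {"and", "or", "cand", "cor", "non", "true", "false"}

# An optional preceding '.', then an identifier (letter, then letters/digits).
IDENTIFIER = re.compile(r"(\.?)\s*([A-Za-z][A-Za-z0-9]*)")

def used_names(expression):
    """Return NU: the variable names in an expression, in order of first use."""
    names = []
    for match in IDENTIFIER.finditer(expression):
        dot, word = match.groups()
        if dot or word in RESERVED:
            continue            # array attribute or reserved word
        if word not in names:
            names.append(word)  # assumption: each name is listed once
    return names

print(", ".join(used_names("y - x")))
print(used_names("X.dom # 0"))
```

Joining the result with commas gives text of the kind that appears inside `USE (...)` in the static language.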

4. EXAMPLE

Now follows a Guarded Commands language program which computes the greatest common divisor of each pair in a list of pairs of numbers (given by corresponding elements of arrays X and Y), placing the results in an array table. Note that "keywords" are not displayed in some special font, thus allowing the program to appear as it would in a common environment, following the suggestion of Hanson [4].

begin glovar X, Y; virvar table;
    table vir int array := (0);
    do X.dom # 0 -->
        begin glovar table, X, Y; privar x, y;
            x vir int, y vir int := X.low, Y.low;
            X: lorem; Y: lorem;
            do x > y --> x := x - y
            [] x < y --> y := y - x
            od;
            table: hiext (x)
        end
    od
end

Now follows the corresponding static language representation.

BLOCK (
    GLOVAR (X, Y),
    VIRVAR (table),
    STMT (
        INIT (table, USE ()),
        DO (
            GUARD (
                USE (X),
                STMT (
                    BLOCK (
                        GLOVAR (table, X, Y),
                        PRIVAR (x, y),
                        STMT (
                            INIT (x, y, USE (X, Y)),
                            UPDATE (X, USE ()),
                            UPDATE (Y, USE ()),
                            DO (
                                GUARD (
                                    USE (x, y),
                                    STMT (
                                        UPDATE (x, USE (x, y))
                                    )
                                ),
                                GUARD (
                                    USE (x, y),
                                    STMT (
                                        UPDATE (y, USE (y, x))
                                    )
                                )
                            ),
                            UPDATE (table, USE (x))
                        )
                    )
                )
            )
        )
    )
)

The Static Language representation in graphical (tree) form appears in the accompanying figure.
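For readers working from the linear text alone, an indented outline conveys the same tree structure as the graphical form. The short Python sketch below produces such an outline; the function name `render` and the representation of a node as an `(operator, children)` tuple with plain strings as leaves are assumptions made for this illustration.

```python
def render(node, depth=0):
    """Return the lines of an indented outline of a static-language tree."""
    indent = "  " * depth
    if isinstance(node, str):           # leaf: a variable name
        return [indent + node]
    operator, children = node
    lines = [indent + operator]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines

# The first inner guarded command of the example, written as nested tuples.
arm = ("GUARD", [("USE", ["x", "y"]),
                 ("STMT", [("UPDATE", ["x", ("USE", ["x", "y"])])])])
print("\n".join(render(arm)))
```

Each level of indentation corresponds to one edge of the tree, so siblings appear at the same depth.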

5. CONDITIONS ON HANDLING OF VARIABLES

The following rules govern the visibility of, initialisation of, and access to variables, and the relationships between these.

5.1 Visibility

A variable may not be INITialised, UPDATEd or USEd unless a nomenclature for it is given in the current 'innermost' block.

5.2 Nomenclatures

The nomenclature given for a variable in a block must correspond with the nomenclature given for it in the immediately enclosing block as follows:

Inner Block          Enclosing Block
PRIVAR or PRICON     no correspondence
GLOVAR               PRIVAR (A), VIRVAR (A) or GLOVAR
GLOCON               PRIVAR (A), PRICON (A), VIRVAR (A), VIRCON (A), GLOVAR or GLOCON
VIRVAR or VIRCON     PRIVAR (B), PRICON (B), VIRVAR (B), VIRCON (B)

Notes

(A) INITialisation of the relevant variable must have occurred by entry of the inner block.
(B) INITialisation of the relevant variable must not have occurred by entry of the inner block.

[Figure: Graphical (tree) form of Static Language representation.]
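The correspondence table and its notes (A) and (B) can be written down directly as data. The following Python sketch is one possible encoding, offered as an illustration rather than as the paper's own formulation; the names `CORRESPONDENCE` and `corresponds`, and the use of True, False and None for "must be initialised", "must not be initialised" and "no requirement", are assumptions.

```python
# Keyed by the inner block's nomenclature. Each permitted enclosing
# nomenclature maps to the initialisation state required on entry to the
# inner block: True (note A), False (note B) or None (no requirement).
_UNINITIALISED = {"PRIVAR": False, "PRICON": False, "VIRVAR": False, "VIRCON": False}
CORRESPONDENCE = {
    "GLOVAR": {"PRIVAR": True, "VIRVAR": True, "GLOVAR": None},
    "GLOCON": {"PRIVAR": True, "PRICON": True, "VIRVAR": True,
               "VIRCON": True, "GLOVAR": None, "GLOCON": None},
    "VIRVAR": _UNINITIALISED,
    "VIRCON": _UNINITIALISED,
}

def corresponds(inner, enclosing, initialised):
    """Check an inner nomenclature against the enclosing block's entry."""
    if inner in ("PRIVAR", "PRICON"):
        return True                     # no correspondence is required
    allowed = CORRESPONDENCE[inner]
    if enclosing not in allowed:
        return False
    required = allowed[enclosing]
    return required is None or required == initialised

print(corresponds("GLOVAR", "PRIVAR", initialised=True))
print(corresponds("VIRVAR", "PRIVAR", initialised=True))
```

The second call is rejected because a variable passed inwards as VIRVAR must still be uninitialised on entry to the inner block.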

5.3 Initialisation

A variable may be INITialised only once.

5.4 Nomenclatures and Initialisation

A variable is to be INITialised inside a block, if and only if its nomenclature for the block is one of PRIVAR, PRICON, VIRVAR, VIRCON, and the other conditions regarding initialisation are satisfied.

5.5 Initialisation and Use

A variable may be USEd only after it is INITialised.

5.6 Initialisation and Update

A variable may only be UPDATEd after it is INITialised.


5.7 Nomenclature and Update

A variable may be UPDATEd if and only if its nomenclature for the current block is one of PRIVAR, VIRVAR or GLOVAR, and all other conditions regarding updating are satisfied.

5.8 Extra Considerations for Static Checking

The halting problem [5] manifests itself in that we cannot generally tell statically what the value of an expression will be. This means that we cannot tell which "arm" of an IF or DO will be selected (aside from any considerations of non-deterministic selection), nor can we tell how many iterations a DO will cause.

The following additional rules, which we shall see are easy to check statically, will be of importance in showing that the above conditions may be checked statically.

5.8.1 DO

A variable which exists external to a DO statement may not be INITialised inside it.

5.8.2 IF

A variable which exists external to an IF statement, if INITialised in one of its "arms", must be initialised in all of them.

6. BASIS OF THE CHECKING PROCEDURE

The procedure will maintain a stack of symbol tables, each level of which is associated with a subtree (or region) of the static analysis language representation of a GC program. Such subtrees are characterised by being rooted by the following:

(a) a node:

    BLOCK

(b) a node:

    DO

(c) a subtree:

    IF
    |
    GUARD

That is, a region is

(a) a block
(b) a DO-statement
(c) an "arm" of an IF-statement

We shall use the term "current region" with respect to a point in a program to denote the smallest such subtree which includes the given point.

A table is a set of tuples

(name, nomenclature, initialisation status)

each of which describes how a visible variable may be used at the point (in the region with which the table is associated) which the checking procedure has reached.

The nomenclatures are, of course, PRIVAR, PRICON, etc. The initialisation statuses and their significances are as follows.

6.1 LOCINIT

INITialisation of the variable has occurred inside the current region (maybe in a nested region).

6.2 INIT

INITialisation of the variable has occurred in an enclosing region (maybe in one of the enclosing region's nested regions).


6.3 NOT

The variable has not yet been INITialised.

6.4 MUST

The variable has not yet been INITialised, but must be by the end of this region.

6.5 DONOT

The variable is not yet INITialised, and may not be in this region (including any nested regions).
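The stack of tables and the five initialisation statuses might be represented as follows. This is a minimal Python sketch under stated assumptions: the class names `Status` and `TableStack` are hypothetical, each table is held as a dictionary from name to a (nomenclature, status) pair (which presumes at most one entry per name in a table), and `top` and `next` are taken to mean the uppermost table and the one directly beneath it.

```python
from enum import Enum

class Status(Enum):
    LOCINIT = "initialised inside the current region"
    INIT = "initialised in an enclosing region"
    NOT = "not yet initialised"
    MUST = "not yet initialised, but must be by the end of this region"
    DONOT = "not yet initialised, and may not be in this region"

class TableStack:
    """A stack of symbol tables, one per region being checked."""

    def __init__(self):
        self.tables = []

    def push(self):
        self.tables.append({})

    def pop(self):
        return self.tables.pop()

    @property
    def top(self):
        return self.tables[-1]

    @property
    def next(self):
        # Table of the enclosing region; empty if there is none (assumption).
        return self.tables[-2] if len(self.tables) > 1 else {}

stack = TableStack()
stack.push()                                   # region of an outer block
stack.top["x"] = ("PRIVAR", Status.MUST)
stack.push()                                   # a nested region
print(stack.next["x"])
```

Keeping the statuses in an enumeration makes the later transition tables easy to express as dictionaries keyed by status.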

7. THE CHECKING PROCEDURE

The procedure applies to each node in the static language tree as follows.

7.1 BLOCK

To process a program portion of the form

                BLOCK
              /   |   \
           N1 ... Nm   STMT
          /  \          |
     name ... name     etc.

where Ni is in the set {PRIVAR, PRICON, VIRVAR, VIRCON, GLOVAR, GLOCON}:

(1) Push an empty table on the stack
(2) Process each Ni (name, ..., name)
(3) Process the STMT
(4) For each entry in TOP (the table on top of the stack; NEXT denotes the table immediately beneath it)
    -- if (a) its nomenclature is not PRIVAR or PRICON and (b) its initialisation status is LOCINIT, change the initialisation status in the corresponding entry in NEXT to LOCINIT
    -- if its initialisation status is MUST, then an error (failure to perform a compulsory initialisation) is detected
(5) Pop the stack

7.2 Nomenclatures

To process

          N
        /   \
     name ... name

where N is in the set {PRIVAR, PRICON, VIRVAR, VIRCON, GLOVAR, GLOCON}:

7.2.1 PRIVAR

For each name, make an entry in TOP of the form (name, PRIVAR, MUST)

7.2.2 PRICON

For each name, make an entry in TOP of the form (name, PRICON, MUST)

7.2.3 VIRVAR

For each name
(1) Check that there is an entry for it in NEXT of the form
    (name, PPVV, NM)
    where
    -- PPVV denotes one of the set {PRIVAR, PRICON, VIRVAR, VIRCON}
    -- NM is one of {NOT, MUST}
(2) Make an entry in TOP of the form
    (name, VIRVAR, MUST)

7.2.4 VIRCON

For each name
(1) See 7.2.3 (1) above
(2) Make an entry in TOP of the form
    (name, VIRCON, MUST)

7.2.5 GLOVAR

For each name
(1) Check that there is an entry for it in NEXT of one of the forms
    (name, PV, LI)
    (name, GLOVAR, don't care)
    where
    -- PV denotes one of the set {PRIVAR, VIRVAR}
    -- LI denotes one of the set {LOCINIT, INIT}
(2) Make an entry in TOP of the form
    (name, GLOVAR, INIT)

7.2.6 GLOCON

For each name
(1) Check that there is an entry for it in NEXT of one of the forms
    (name, PPVV, LI)
    (name, GG, don't care)
    where
    -- PPVV is as in section 7.2.3 above
    -- LI is as in 7.2.5 above
    -- GG is one of {GLOVAR, GLOCON}
(2) Make an entry in TOP of the form
    (name, GLOCON, INIT)

7.3 STMT

To process a program portion of the form

      STMT
     /    \
    S ... S

process each stmt S in sequence.

7.4 INIT

To process a program portion of the form

            INIT
          /   |   \
     name ... name USE
                    |
                   etc.

(1) Process the USE
(2) For each name, check that TOP contains an entry of the form
    (name, don't care, NM)
    where NM is one of {NOT, MUST}; otherwise an error (attempt to INITialise an initialised variable, or prohibited INITialisation inside IF or DO) is detected. Change the initialisation status to LOCINIT.

7.5 UPDATE

To process a program portion of the form

           UPDATE
          /   |   \
     name ... name USE
                    |
                   etc.

(1) Process the USE
(2) For each name, check that TOP contains an entry of the form
    (name, PVG, LI)
    where
    -- PVG is one of {PRIVAR, VIRVAR, GLOVAR}
    -- LI is as in 7.2.5 above
    Otherwise, an error (update of uninitialised variable or read-only variable, i.e., PRICON, VIRCON, GLOCON variable) is detected.

7.6 USE

To process a program portion of the form

         USE
        /   \
     name ... name

For each name, check that TOP contains an entry of the form

(name, don't care, LI)

where LI is as in 7.2.5 above. Otherwise, an error (attempt to access value of uninitialised variable) is detected.

7.7 DO

To process a program portion of the form

                DO
              /    \
        GUARD  ...  GUARD
        /   \       /   \
     USE   STMT  USE   STMT
      |     |     |     |
     etc.  etc.  etc.  etc.

(1) Push an empty table on the stack
(2) For entries in NEXT:
    (a) of the form
        (name, don't care, LI)
        where LI is as in 7.2.5, place an entry in TOP
        (name, NN, INIT)
        where NN is the nomenclature for the entry in NEXT;
    (b) of the form
        (name, don't care, NMD)
        where NMD is one of {NOT, MUST, DONOT}, place an entry in TOP
        (name, NN, DONOT)
(3) Process each USE and each STMT in sequence
(4) Pop the stack.

7.8 IF

To process a program portion of the form

                   IF
                /      \
          GUARD   ...   GUARD
          /   \         /   \
     USE 1  STMT 1  USE n  STMT n
       |      |       |      |
      etc.   etc.    etc.   etc.

(1) Push an empty table on the stack
(2) For entries in NEXT, place a corresponding entry in TOP, changing the initialisation status as follows

    NEXT             TOP
    LOCINIT, INIT    INIT
    NOT, MUST        NOT
    DONOT            DONOT

(3) For each i such that i < n (where n is the number of "arms" of the IF), in sequence
    -- process USE i and STMT i
    -- change the initialisation status of each entry in TOP as follows

       Before        After
       LOCINIT       MUST
       INIT          INIT
       NOT, DONOT    DONOT
       MUST          Error

       (The error detected is a failure to INITialise a variable in an "arm" when that variable was INITialised in all preceding "arms")
    -- repeat, using the next i
(4) Process USE n and STMT n
(5) For each entry in TOP
    -- if its initialisation status is LOCINIT, change the initialisation status in the corresponding entry of NEXT to LOCINIT
    -- if its initialisation status is MUST, then an error (failure to INITialise a variable in an "arm" when that variable was INITialised in all preceding arms) is detected.
(6) Pop the stack.
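The status changes on entering an IF and on moving from one arm to the next are the subtlest part of the procedure, so a small simulation may help. The Python sketch below abstracts each arm to just the list of names it INITialises; the function name `check_if`, that abstraction, and the choice to leave a status at MUST after the corresponding error has been reported are assumptions of the sketch, not details given in the paper.

```python
ENTER_IF = {"LOCINIT": "INIT", "INIT": "INIT",
            "NOT": "NOT", "MUST": "NOT", "DONOT": "DONOT"}
BETWEEN_ARMS = {"LOCINIT": "MUST", "INIT": "INIT",
                "NOT": "DONOT", "DONOT": "DONOT"}

def check_if(arms, nxt):
    """Simulate the IF rule; arms[i] lists the names INITialised in arm i+1.

    nxt maps name -> (nomenclature, status) for the enclosing region and is
    updated in place. Every name in arms is assumed to be visible in nxt.
    """
    if not arms:
        return []                                   # empty IF: no operation
    errors = []
    top = {n: (nom, ENTER_IF[st]) for n, (nom, st) in nxt.items()}  # step (2)
    for i, arm in enumerate(arms):
        for name in arm:                            # stand-in for USE i, STMT i
            nom, st = top[name]
            if st in ("NOT", "MUST"):
                top[name] = (nom, "LOCINIT")
            else:
                errors.append(f"{name}: INIT not permitted in arm {i + 1}")
        if i < len(arms) - 1:                       # step (3): before next arm
            for name, (nom, st) in list(top.items()):
                if st == "MUST":
                    errors.append(f"{name}: not INITialised in arm {i + 1}")
                else:
                    top[name] = (nom, BETWEEN_ARMS[st])
    for name, (nom, st) in top.items():             # step (5)
        if st == "LOCINIT":
            nxt[name] = (nxt[name][0], "LOCINIT")
        elif st == "MUST":
            errors.append(f"{name}: not INITialised in arm {len(arms)}")
    return errors

enclosing = {"x": ("PRIVAR", "MUST")}
print(check_if([["x"], ["x"]], enclosing), enclosing)
```

When both arms INITialise x, no error is reported and the enclosing entry for x becomes LOCINIT; if only one arm does, an error results whichever arm it is.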

7.9 Empty IF and DO

To process program portions of the form

    IF

and

    DO

(i.e. with no sub-trees), perform no operation.
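Taken together, the rules of this section amount to a recursive walk over the static-language tree. The Python sketch below is one possible rendering, intended only as an illustration. Several things are assumptions: the class name `Checker`, the `(operator, children)` tuple representation, the `environment` argument (a bottom table standing in for whatever encloses the program), entering a name in TOP even when its nomenclature check fails, leaving a status at MUST between IF arms after reporting it, and treating MUST in the BLOCK rule as an initialisation status.

```python
LI = ("LOCINIT", "INIT")
PPVV = ("PRIVAR", "PRICON", "VIRVAR", "VIRCON")
ENTER_DO = {"LOCINIT": "INIT", "INIT": "INIT", "NOT": "DONOT", "MUST": "DONOT", "DONOT": "DONOT"}
ENTER_IF = {"LOCINIT": "INIT", "INIT": "INIT", "NOT": "NOT", "MUST": "NOT", "DONOT": "DONOT"}
NEXT_ARM = {"LOCINIT": "MUST", "INIT": "INIT", "NOT": "DONOT", "DONOT": "DONOT", "MUST": "MUST"}

class Checker:
    def __init__(self, environment=None):
        # Bottom table: an assumed stand-in for whatever encloses the program.
        self.stack = [dict(environment or {})]
        self.errors = []

    def check(self, node):
        operator, children = node
        getattr(self, "do_" + operator)(children)   # dispatch on the operator
        return self.errors

    def enter_region(self, mapping):
        nxt = self.stack[-1]
        self.stack.append({n: (nom, mapping[st]) for n, (nom, st) in nxt.items()})

    def do_BLOCK(self, children):
        nxt, top = self.stack[-1], {}
        self.stack.append(top)
        for nom, names in children[:-1]:
            for name in names:
                self.declare(nom, name, top, nxt)
        self.check(children[-1])
        for name, (nom, st) in top.items():
            if st == "MUST":
                self.errors.append(f"{name}: compulsory initialisation missing")
            elif st == "LOCINIT" and nom not in ("PRIVAR", "PRICON") and name in nxt:
                nxt[name] = (nxt[name][0], "LOCINIT")
        self.stack.pop()

    def declare(self, nom, name, top, nxt):
        o_nom, o_st = nxt.get(name, (None, None))
        if nom in ("PRIVAR", "PRICON"):
            ok = True
        elif nom in ("VIRVAR", "VIRCON"):
            ok = o_nom in PPVV and o_st in ("NOT", "MUST")
        elif nom == "GLOVAR":
            ok = o_nom == "GLOVAR" or (o_nom in ("PRIVAR", "VIRVAR") and o_st in LI)
        else:  # GLOCON
            ok = o_nom in ("GLOVAR", "GLOCON") or (o_nom in PPVV and o_st in LI)
        if not ok:
            self.errors.append(f"{name}: {nom} not allowed here")
        top[name] = (nom, "INIT" if nom in ("GLOVAR", "GLOCON") else "MUST")

    def do_STMT(self, children):
        for child in children:
            self.check(child)

    do_GUARD = do_STMT  # a GUARD is processed as its USE, then its STMT

    def do_USE(self, names):
        for n in names:
            if self.stack[-1].get(n, (None, None))[1] not in LI:
                self.errors.append(f"{n}: USE before INIT")

    def do_INIT(self, children):
        *names, use = children
        self.check(use)
        top = self.stack[-1]
        for n in names:
            nom, st = top.get(n, (None, None))
            if st in ("NOT", "MUST"):
                top[n] = (nom, "LOCINIT")
            else:
                self.errors.append(f"{n}: INIT not permitted")

    def do_UPDATE(self, children):
        *names, use = children
        self.check(use)
        for n in names:
            nom, st = self.stack[-1].get(n, (None, None))
            if nom not in ("PRIVAR", "VIRVAR", "GLOVAR") or st not in LI:
                self.errors.append(f"{n}: UPDATE not permitted")

    def do_DO(self, guards):
        if guards:  # an empty DO is a no-op
            self.enter_region(ENTER_DO)
            self.do_STMT(guards)
            self.stack.pop()

    def do_IF(self, guards):
        if not guards:  # an empty IF is a no-op
            return
        self.enter_region(ENTER_IF)
        top, nxt = self.stack[-1], self.stack[-2]
        for i, guard in enumerate(guards):
            self.check(guard)
            last = i == len(guards) - 1
            for name, (nom, st) in list(top.items()):
                if st == "MUST":
                    self.errors.append(f"{name}: not INITialised in every arm")
                elif last and st == "LOCINIT":
                    nxt[name] = (nxt[name][0], "LOCINIT")
                if not last:
                    top[name] = (nom, NEXT_ARM[st])
        self.stack.pop()
```

For instance, `Checker().check(("BLOCK", [("PRIVAR", ["x"]), ("STMT", [("UPDATE", ["x", ("USE", [])])])]))` reports two errors: the UPDATE of the uninitialised x, and the missing compulsory initialisation at block exit. To check the example of Section 4 with this sketch, X, Y and table must first be supplied through `environment`, since the outermost block names them as GLOVAR and VIRVAR.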

8. VERIFICATION OF THE CHECKING PROCEDURE

It is appropriate at this point to only sketch the verification, a more rigorous treatment being planned for the future.

8.1 Temporal and Logical Dependencies

The concept of time permeates the conditions to be checked that are prescribed in Section 5. The checking procedure meets such criteria by processing constructs in the order in which they would be executed (e.g. 7.3, 7.4) or in the order determined by logical dependencies (e.g. 7.1 (2) and (3)).

8.2 Initialisation Statuses

Another central concept is the role of the initialisation statuses. That an initialisation status of LOCINIT does indeed indicate that the relevant variable has been INITialised inside the current region is seen by the fact that the only ways such a status can be conferred are:

(a) by an INIT operation (7.4) in the current region; (b) by the variable having attained a status of LOCINIT in a nested region.

The correctness of statuses INIT and NOT is basically determined by inspecting the parts of the procedure concerned with entry to regions (7.2, 7.7 (2), 7.8 (2)).

With regard to MUST status, compulsion to INITialise is a consequence of either:

(a) a nomenclature of one of PRIVAR, PRICON, VIRVAR, VIRCON; or

(b) the variable having been INITialised in all "arms" of an IF prior to the arm currently under consideration.

Part (a) is dealt with by inspection of the components (7.2.1-7.2.4) dealing with these nomenclatures. Part (b) is dealt with by inspection of the component (7.8 (3)) which deals with the transition from one "arm" of an IF to the next. We see that INITialisation inside one arm (leading to a status of LOCINIT for that region) induces a status of MUST for the next.

Finally, a DONOT status arises from either:

(a) an INITialised variable not having been INITialised in all prior "arms" of an IF, in which case 7.8 (3) applies; or

(b) a DO statement being entered with uninitialised variables visible, in which case 7.7 (2) (b) applies.

Rule 7.4 (2) implements the provision that INITialisation is prohibited.

8.3 Specific Conditions

Instruction on how the specific conditions of section 5 may be seen to be satisfied now follows.

8.3.1 Visibility

Any access of a variable is accompanied by an access of TOP.

8.3.2 Nomenclatures

The rules of 7.2 apply. The conditions noted (A) and (B) are met by the references in 7.2 to particular required initialisation statuses in NEXT.

8.3.3 Initialisation

INITialisation induces a status of INIT or LOCINIT ever after. Both these statuses are anathema as input to rule 7.4, which relates to the INIT operation.

8.3.4 Nomenclatures and initialisation

Permission to INITialise, given nomenclatures PRIVAR, PRICON, VIRVAR, VIRCON, is given in rule 7.4. Compulsion to INITialise is given by rule 7.1 (4).

8.3.5 Initialisation and use

Inspect rule 7.6 and the initialisation statuses it permits.

8.3.6 Initialisation and update

Inspect rule 7.5 (2) and the initialisation statuses it permits.

8.3.7 Nomenclature and update

Inspect rule 7.5 (2) and the nomenclatures it permits.

8.3.8 Prohibition of initialisation inside DO

Refer to the discussion of the status DONOT above.

8.3.9 Compatible initialisations inside IF

Reference to the discussion of the status MUST above deals with the compulsion to INITialise in successors to the first "arm", providing INITialisation occurred therein. Likewise, reference to the discussion of the status DONOT deals with the prohibition on INITialisation in successors to the first "arm", if no INITialisation occurred therein.

9. FURTHER DEVELOPMENTS

Four possible projects come to mind. The first is the formalisation of the verification of the checking procedure, felt to be more appropriate to be presented separately and subsequently. The second is the extension of the procedure to embrace more of the rules of the language that could be expected to be checked statically, notably type comparability of operators and operands. Our earlier work referred to contains some thoughts on the matter. The third is a study and consequent exploitation of the way in which multiple entries for a given variable are stacked in the symbol table. As the checking procedure can be thought of as operating, with respect to any one variable, like a push-down automaton [6, 7] whose push-down store is the stack of symbol-table entries for that variable, we are led to the following question: "Can the derivation and verification of this checking procedure, and of any similar procedures that may be required in the analysis of programs in other languages, be simplified by proceeding via the deduction of a Context-Free Grammar [6] which describes the allowed and/or compulsory operations on one variable, followed by the automatic generation of the appropriate push-down machine?". The fourth project is of course the incorporation of our procedure into an implementation of GC.

10. SUMMARY

The key points of this paper are as follows:

(a) the identification of those parts of Dijkstra's Guarded Commands language that most challenge the implementor;
(b) the provision of an abstracted language for static analysis, and of a semi-formal mapping from the concrete syntax thereto, for those parts of GC that deal with variable handling;
(c) the enunciation of the conditions, adherence to which is to be checked by static analysis of program text;
(d) the description of the data structures and algorithms of the checking procedure applied to programs in static analysis language form;
(e) the verification (in overview) of the checking procedure;
(f) suggestions for further work.

The static analysis language has been designed with dual purposes in mind: to allow for easy translation from the concrete syntax, and to allow easy comprehension, by human readers, of program structure. The first of these is fulfilled because the static language is one-dimensional, i.e. programs in it are still simply a string of symbols. The second purpose is fulfilled because strings of the form

XXXX (s, ..., s)

may be represented as trees

      XXXX
     /    \
    s ... s

We gave a sample program in concrete syntax and in both of the above views of the static language.

By using an abstracted language for static analysis, discussion of and about the checking procedure has been able to proceed free of syntactic irritants, e.g. punctuation symbols such as semicolons. Ascribing a tree structure allows a graphical presentation which facilitates understanding both of the recursive nature of the checking procedure and of the inductive nature of (parts of) the verification.

The checking procedure is, in effect, a set of parallel push-down automata, one for each visible variable. As part of further work, it is planned to explore the suggested alternative means of deriving the checking procedure: that is, to begin with a context-free grammar which specifies the valid operations on a single variable, and then to derive the corresponding automaton. Consequently, there may be applications of this technique to the static checking of programming languages other than GC.

Acknowledgements--Gordon Oulsnam introduced me to the problems of implementing GC. A number of former colleagues, mostly at the University of Wollongong, kept my interest in the area warm. Olwen Schubert conquered my cryptic script and layout in the preparation of the manuscript. David Billington proved to be an excellent proof-reader. Particular thanks are due to the anonymous referee of the first version of this paper, who gave most helpful comments.

REFERENCES

1. Dijkstra E. W., A Discipline of Programming. Prentice-Hall, Englewood Cliffs, N.J. (1976).
2. Bailes P. A., Implementing Guarded Commands. Computing and Information Studies Occasional Paper No. 3. Griffith University, Queensland (1985).
3. Bailes P. A., Static checking for Dijkstra's Guarded Commands language. Proceedings of the 11th Australian Computer Conference, pp. 29-39 (1984).
4. Hanson D. R., Is block structure necessary? Software--Pract. exper. 11(8), 853-866 (1981).
5. Turing A. M., On computable numbers with an application to the Entscheidungsproblem. Proc. Lond. math. Soc. Ser. 2 42, 230-265 (1936).
6. Aho A. V. and Ullman J. D., The Theory of Translation, Parsing and Compiling, 2 Vols. Prentice-Hall, Englewood Cliffs, N.J. (1972).
7. Aho A. V. and Ullman J. D., Principles of Compiler Design. Addison-Wesley, Reading, Mass. (1977).

About the Author--PAUL A. BAILES was born in 1957 and qualified in Computer Science from the University of Queensland, Australia with the degrees of B.Sc. in 1978 and Ph.D. in 1984. His teaching experience includes lecturing in the Department of Computing Science at the University of Wollongong, Australia, and playing a central role in the planning and implementation of Australia's first interdisciplinary undergraduate programme in Informatics, at Griffith University. Currently a Senior Lecturer in Computer Science at the University of Queensland, his research is focused on problems in the relationship between language design and programming methodology (especially with regard to functional programming), with strong interests in programming language implementation and Computer Science education.