Computer tools to teach formal reasoning

Pergamon S0360-1315(96)00016-4

Computers Educ. Vol. 27, No. 1, pp. 59-69, 1996. Copyright © 1996 Elsevier Science Ltd. Printed in Great Britain. All rights reserved 0360-1315/96 $15.00+0.00

COMPUTER TOOLS TO TEACH FORMAL REASONING

P. FUNG,1† T. O'SHEA,1 D. GOLDSON,2 S. REEVES3 and R. BORNAT4

1Institute of Educational Technology, Open University, Milton Keynes, England, 2Massey University, Palmerston North, New Zealand, 3University of Waikato, Hamilton, New Zealand and 4Queen Mary and Westfield College, London University, Mile End Road, England

(Received 6 October 1995; accepted 17 November 1995)

Abstract--Computer science undergraduates, for a number of reasons, find it difficult to learn formal reasoning methods. In an experiment designed to address certain of these difficulties a complete first year undergraduate computer science intake was supplied with a selection of computer-based tools providing a mixture of graphical and textual on-screen help. This paper reports on the experiment and the evaluation studies which were undertaken to assess the effect of the tools upon the students' progress in learning formal reasoning methods. The results indicated that the tools had a positive effect upon the learning process, both in qualitative and quantitative terms. In addition, data from the experiment pointed to other factors which may exercise an influence on the degree of success which students have in learning formal methods. Copyright © 1996 Elsevier Science Ltd.

INTRODUCTION

Both in industry and universities, improving the reliability of computer software has led to interest in adopting a more rigorous approach to computer programming. Employing formal methods for program verification is seen to be one way of achieving a higher degree of program correctness at an earlier stage in the process of software development.
Reflecting this interest, many university computer science undergraduate courses incorporate an element of teaching formal reasoning methods and techniques. However, introducing formal methods into university computer science courses has highlighted the problems experienced in teaching and learning in this subject area. A series of empirical studies [1, 2] investigated the nature of the difficulties students experience in learning formal reasoning techniques. Analysis of the data obtained strongly supported the hypothesis that a lack of mathematical experience was the most significant factor in predicting the performance of students learning formal reasoning techniques. Looking closely at the difficulties reported by students, among those found were difficulties in interpreting and manipulating formal notations, in abstracting general principles from particular cases, in reducing problems to their component parts and in recognising the relationships between those parts. Students themselves often saw their problems as arising from a lack of prior knowledge of mathematics and/or programming experience. The experiment reported in this paper is an attempt, through the means of computer-based tools, to address the difficulties students experience in learning formal reasoning. The underlying hypothesis of the experiment is that the use of computer tools will help address the apparent lack of mathematical experience by developing a familiarity with formal language and notations, by making the calculation and proof procedures explicit and by allowing students to manipulate and experiment with the processes involved. By alleviating the more tedious aspects of the computation involved in formal reasoning, it is expected that computer-based tools can also address the question of motivation.
This paper outlines the tools which were selected for the experiment, reports briefly on the results of studies which were undertaken to monitor the use of those tools and puts forward an interpretation of the results obtained. The paper concludes by looking at the aspects of their use which the experiment indicated needed more attention, by summarising the advantages which the computer-based tools offered and by outlining the current plans for their future use and development.

†Author for correspondence. e-mail: [email protected]; fax: 01908 653744.



COMPUTER-BASED TOOLS

Reasons for using computer-based tools for teaching formal reasoning stem from the belief that such tools give students the opportunity to investigate and manipulate relatively complex concepts at a stage where students' own limited expertise would make this difficult in a traditional context. By representing concepts graphically and/or textually on screen in such a manner that students can explore the constraints and possibilities relatively easily, a basis may be laid for a deeper understanding of those concepts. In addition, by using computer-based tools which make it simple to display and control some of the more mechanistic and "book-keeping" aspects of formal reasoning processes, students can become familiar with those processes and more confident in carrying them out. In certain respects, the analogy could be that of a calculator. A user can play with a calculator, exploring its operations, finding it fun to use and, in doing so, strengthening knowledge of those operations. Equally, a calculator can act as an enabler, allowing a user to perform complex calculations which would otherwise be time-consuming, tedious and, in some cases, demotivating.

Computer-based tools were introduced into two of the first year courses, a first course in functional programming (FP1) and an introductory logic course (ITL). Prior to the experiment a review was made of the systems which were commercially available [3]. Tarski's World [4], designed to teach first order logic, was chosen for inclusion in the curriculum. This system satisfied the criteria of being robust, easy to use, attractive to use and offering help in the required areas. Many aspects of this tool relate directly to helping students develop those skills which previous studies have indicated they lacked. Logical formulae are easy to construct and check. The truth or falsity of logical statements can be checked interactively by the system and it is a simple matter to try again when mistakes are made.
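The style of interactive truth-checking that Tarski's World supports can be sketched in a few lines of code. This is purely an illustration: the representation, predicate names and sample world below are assumptions for the sketch, not the tool's actual implementation.

```python
# Illustrative sketch of Tarski's World-style truth checking (hypothetical
# representation, not the tool's code).  A "world" maps object labels to a
# size and an x-coordinate on a one-dimensional grid.
world = {
    "a": {"size": 3, "x": 1},
    "b": {"size": 1, "x": 4},
    "c": {"size": 3, "x": 7},
}

def large(o):
    return world[o]["size"] >= 3

def left_of(o1, o2):
    return world[o1]["x"] < world[o2]["x"]

def between(o, o1, o2):
    # o lies strictly between o1 and o2, in either order
    return (left_of(o1, o) and left_of(o, o2)) or (left_of(o2, o) and left_of(o, o1))

def forall(pred):
    return all(pred(o) for o in world)

def exists(pred):
    return any(pred(o) for o in world)

# "Some object is between a and c" is true in this world (b qualifies):
assert exists(lambda o: between(o, "a", "c"))
# "Every large object is left of b" is false: c is large and to the right of b.
assert not forall(lambda o: not large(o) or left_of(o, "b"))
```

The point of the sketch is the feedback loop the text describes: a student edits the world or the sentence and immediately re-checks truth or falsity against the new situation.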

Figure 1 shows the three main components of the program as they appear on screen: a "world" module, a sentences module and a keyboard. The world module displays a graphical representation of objects and the relationships which hold between them, such as Large, Larger, Between, LeftOf and so on. The sentences module displays the same relationships expressed in formal notation. In addition to a number of given worlds and their associated sentences, the application allows the user to alter these worlds or construct her own worlds and logical statements relating to them. Double clicking on an object shown on the "palette" to the left of the world allows the user to add that object to the world. A further double click on the object then allows it to be sized and labelled. The keyboard provides an easy means of constructing the logical sentences necessary to describe the new or altered worlds.

Fig. 1. World module, sentences module and keyboard.

[Fig. 2 screenshot: a parsed Miranda program displayed in MiraCalc, with menu options including Expand, Contract, Free, Bound and Type.]

Fig. 2. A tool for reasoning about programs.

The MiraCalc tool (Fig. 2), in contrast, was developed "in house" to help students to reason formally about functional programs [5], there being no tool commercially available which was considered suitable. As with the tools designed to assist in learning formal reasoning in the logic domain, MiraCalc directly addresses a number of the problems students reported experiencing.

Being able to experiment with typing, to experiment with reducing terms and to explore the structure of a program offers practical help to students who are struggling to come to terms with a number of new programming concepts within a fairly short time span. It is designed to help students become increasingly familiar with the terminology used and with the structuring of programs, and to develop more confidence in their own ability to analyse a program. In doing so, MiraCalc is intended to be an enabler, helping students to bridge the gap between the abstract learning of formal reasoning concepts and being able to put these into practice. As input the system takes either a program which the student has constructed or one from a given set of example programs. After performing an initial parsing and pretty-printing operation, the system then offers the user a variety of options from a pull-down menu (Fig. 2). Users can query and confirm the scope of variables, determine the relationships between the component parts of the program, or use the system interactively to confirm the correct program type. A stepping facility allows the user to trace the reduction of terms which takes place as a program works through the evaluation process.
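The evaluation-order issue that MiraCalc's stepper makes visible can be sketched with explicit thunks. This is a minimal illustration of the principle, not MiraCalc's implementation: under normal order (as in Miranda) an argument is only evaluated if it is actually needed, whereas applicative order would evaluate it first.

```python
# Minimal sketch of normal-order vs applicative-order evaluation using
# explicit thunks (illustrative only; names are invented for the example).

forced = []                     # records which arguments were ever evaluated

def thunk(name, value):
    """Delay a computation; record its name if it is ever forced.
    A value of None stands in for a diverging computation."""
    def force():
        forced.append(name)
        if value is None:
            raise RuntimeError("diverging computation forced")
        return value
    return force

def fst(x, y):
    """Lazy pair selection: arguments arrive as thunks; only x is forced."""
    return x()

# Under normal order the diverging second argument is never needed, so the
# call terminates; applicative order would evaluate it first and diverge.
result = fst(thunk("x", 1), thunk("bottom", None))
assert result == 1
assert forced == ["x"]          # "bottom" was never forced
```

Students used to applicative evaluation in procedural languages find exactly this behaviour surprising, which is why tracing the reduction step by step is instructive.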

Another tool, also based in the logic domain, JAPE (Just Another Proof Editor), developed by [6], was made available in the following spring semester. This tool is intended to address a number of the difficulties which students reported that they experienced in learning to construct proofs using natural deduction methods. Again these problems centre on the difficulty of manipulating formulae, of relating abstract concepts to actual situations and of not knowing when to take which steps in constructing proofs. JAPE allows the student, by choosing steps from a menu selection, to construct on screen a proof of the truth or falsity of a given statement. The steps chosen can be shown in either Fitch-box or tree representations. The system presents the construction of natural deduction proofs within a framework which actively encourages the user to experiment in manipulating formulae. Increased familiarity with the deduction rules being used can be gained through the ease with which proofs can be constructed and manipulated. It was hypothesised that this should, in strengthening the connection between the abstract learning of deduction rules and the process of putting them into practice, be instrumental in developing students' experience and confidence in deciding "which step" to take next in proof construction.
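The rule-application style of proof construction described above can be sketched as a toy checker. This is an illustrative simplification with only two rules; JAPE's rule set, representation and interaction model are far richer.

```python
# Toy natural deduction step checker (illustrative sketch, not JAPE's code).
# Formulae are tuples such as ("->", A, B) and ("&", A, B), or atoms like "p".

def imp_elim(imp, ant):
    """Modus ponens (->E): from A -> B and A, conclude B."""
    op, a, b = imp
    assert op == "->" and a == ant, "rule does not apply"
    return b

def and_intro(a, b):
    """Conjunction introduction (&I): from A and B, conclude A & B."""
    return ("&", a, b)

# A small proof of q & p from the premises p and p -> q:
p, q = "p", "q"
premise1 = p                          # 1. p           (premise)
premise2 = ("->", p, q)               # 2. p -> q      (premise)
line3 = imp_elim(premise2, premise1)  # 3. q           (->E on 2, 1)
line4 = and_intro(line3, premise1)    # 4. q & p       (&I on 3, 1)
assert line4 == ("&", "q", "p")
```

Each step either applies cleanly or is rejected, which mirrors the way a proof editor constrains the student to legal rule applications while leaving the choice of "which step" open.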


EVALUATING THE TOOLS IN USE

The approach adopted to evaluating the effect of the computer-based tools upon students' learning employed a multimethod perspective [7] which combined qualitative and quantitative measures. At the outset of the academic year a background profile of the intake was constructed, based on information such as academic achievements, computing experience and motivational factors in choosing a computer science degree course. In-depth interviews were conducted with a number of the intake, questionnaires were used to gauge the reactions of students to the tools provided and students' level of use of the computer tools was logged where possible. Students' departmental progress markers were monitored over the course of the academic year. A detailed report of the methodology of these studies and of the data obtained can be found in [8]. In this section we simply give a summary of the outcomes of the empirical work undertaken in relation to the use of the software tools. Data from the studies relate to 93 first year computer science students who initially registered for the two courses on which the tools were introduced. All these students completed the FP1 course, sat the relevant examination and received an end of year grade. Although all 93 students initially registered for ITL, 13 did not register for the examination, so data relating specifically to that course refer only to the 80 students who followed the ITL course, sat the examination and received an end of year grade. In the subject group of 93, approximately 60% had use of a computer at home prior to university, mostly used for word processing and games. Almost 60% considered themselves to have had a fair amount or a lot of programming experience. Over 50% had obtained a BTEC qualification, while approximately 30% had studied maths at 'A' level.
Of the 74 out of 93 students who responded to questions about their motivations for choosing their computer science course, over 60% of those gave either "interest in computers" or "a more structured approach to programming" as one of their motivations.

Over the two semesters 27 students were interviewed individually in depth. In the first semester interviews related to the course Functional Programming 1 (FP1) and use of MiraCalc, in the second to the Introduction to Logic (ITL) course and use of Tarski's World and JAPE. The format of the interviews was similar in both cases, consisting of two sections: initially a period during which the subject was asked for comments on the particular course and computer-based tool(s), followed by a session during which the subject logged on to a machine and used the relevant computer tool to "talk through" an exercise of their choice.

Seventeen students were interviewed in relation to the course FP1. They talked of it as being an absolutely new experience, of their panic and shock when confronted with abstract figures and abstract things. They spoke of the difficulty of understanding the formal notation, of it being like another language ... like learning Latin, of there being too much to do in too short a time, of not knowing what was the underlying direction or purpose of formal reasoning. A number of students felt that, while it was not exactly the same as mathematical thinking, they would perhaps find it easier if they had studied mathematics.

The principal sources of difficulties mentioned by interviewees related to the amount of lab work and the level of difficulty of this work. Those interviewed mentioned a number of help sources. During lab sessions, these included the software tool provided, lab assistants, or friends. Peer discussion was by far the most frequently mentioned source of help, both during lab sessions and outside them. While some found the help given by lab assistants invaluable, there were reservations expressed; among these, interestingly, was that it was embarrassing to be seen by peers as always needing help from the lab assistant(s). Of those interviewed, the majority volunteered that the software tool provided, MiraCalc, had been useful as a source of help. Its main use had been in helping them to understand normal order reduction, in reasoning about the evaluation process and in helping them to consolidate their knowledge of basic types. The principal reservations expressed were of two sorts. One, of a technical nature, was that using the tool entailed running a process under 24-bit addressing, so limiting what other computing operations students could carry out during that session, and that the tool itself was slow. The second type of reservation was that the help it provided, while welcome and given considerable praise, was as yet limited.

The second part of each interview consisted of the interviewee logging on and stepping through an exercise on-line using the software tool MiraCalc. The purpose of this was to obtain feedback on how students were actually using the tool, as opposed to how they said they were; what comments they had to make on it when they were using it; and what help it appeared to be to them in addressing their difficulties and in understanding their work. It was apparent that the majority of those observed at interview enjoyed using MiraCalc. Even at interview, many of the interviewees used the MiraCalc session as a learning process and it was obvious that the tool was particularly useful to them in clarifying their conceptions of normal order reduction and helping them to reason about types. Students who felt proficient at procedural languages and were used to reasoning about programs using applicative evaluation found it particularly instructive to use the evaluation stepper which illustrated the normal order reduction process. Interviewees commented that they found the "type" facility useful in helping them to understand inconsistencies in their reasoning about their programs. This was supported by watching them use the tool to check their work as they completed the exercise(s). Some focused primarily on its use as a checking mechanism, while others used it as a stimulus to help them think about their program.

Ten students were interviewed in relation to the course ITL. The software tool JAPE was made available for those areas of the course in which students were introduced to the processes of natural deduction. It was made available at a later stage of the second semester and introduced as an optional software tool which students were welcome to take advantage of, if they wished to do so. The software tool Tarski's World (TW) was used as an integrated tool in those parts of the course which covered the use of first order logic and predicate calculus. In the event, the different approaches to the use of these tools within the context of the curriculum resulted in very different use being made of them. Ultimately, given additional factors related to the timing of its introduction, the number of students using JAPE was insufficient to use as a basis for its evaluation during the experimental period. At interview, students rarely mentioned having difficulty with the content of the ITL lectures and the use of the software tool Tarski's World by the lecturer for demonstrating points was mentioned as helpful to understanding. In commenting on this software all subjects interviewed were very positive about its use. In the second part of the interview subjects logged onto a machine and talked through an exercise of their choosing, using this software tool. Even where students reported that they had not used it regularly, they had no problems adapting to the interface. The advantage of being able to experiment without having to "draw it all out" was commented on, as was the advantage of being able to visualise logical statements, e.g. "it helps make sense of it", "it is good to be able to see everything". Comments also indicated that students appreciated the ability the tool gave them to build up logical statements while checking incrementally for correctness. 
Where students did mention being confused this was usually in relation to work in areas of the course concerning the processes of natural deduction. Comments indicated that students were aware that a lot of practice was required to get a feel for which rules to use in the deduction process, e.g. "there are so many laws that ... you just get stuck ..." and "there is no way except to just keep trying different ways until you find one that works".

Questionnaires were distributed towards the end of each semester, in the first semester relating to the use of MiraCalc, in the second to the use of Tarski's World. 51 students responded to the questionnaire relating to MiraCalc and 49 to that relating to Tarski's World, respectively representing 54 and 63% of the student population concerned. In both cases questionnaire design included a number of questions based on views about the software tools expressed by those students who had been interviewed individually. The purpose of this was not only to make the questionnaire itself more relevant to its recipients in terms of the language used, but also to assist in gauging the extent to which those views were representative of a wider body of student opinion. Overall, replies from the 51 students responding to the questionnaire on MiraCalc were very positive. Over 55% said that they used MiraCalc a lot and over 85% said that using MiraCalc when they were working was better than using the Miranda interpreter alone. Over 60% indicated that they used the Guess Type option to try to reason about their use of basic types and over 75% that it had definitely helped them to understand the principles of normal order reduction. The majority of respondents would have liked to use MiraCalc outside their allocated lab times. Of those who said that they did not use MiraCalc often, 14% related this lack of frequent use to the length of time the software took to operate and the need to log on to a 24-bit session, while 11% related it to the time and learning which they thought might be necessary to be able to use it. Adverse criticism related to development issues which they would like addressed, while two respondents indicated that lack of time had been a significant factor preventing them from making more use of MiraCalc. Of the 49 students returning completed questionnaires on their use of Tarski's World, 90% said that they had found it useful. Over 90% said that they would recommend its use in ITL for the next year's intake. Most students responding (76%) found that the most useful role of Tarski's World had been in helping them to understand reasoning with quantifiers. All students replying agreed that it was easier to keep track of what they were doing and easier to try out ideas using the software than to perform the same operations with paper and pencil. According to their replies, only 14% of these students used the software simply to do the exercises which they had been set. 45% of respondents stated that in most of their ITL lab sessions they had also spent some time exploring TW, i.e. using it as more than simply a tool to help them complete their exercises. An additional 27% had done so, but in relatively few of their lab sessions. As with MiraCalc, students appreciated the ability which the software gave them to check their work as they progressed, over 70% replying that they liked to do so. Approximately the same proportion of students (71%) said that they found it useful to visualise objects when interpreting logical statements, 65% saying that usually they used Tarski's World type objects to do so. Overall questionnaire responses to the use of Tarski's World were very positive, there being virtually no adverse comments relating to it and every indication that its use had been very helpful.

In addition to interviews and questionnaires, a record was kept of students' assessed coursework scores and end of year marks in the two relevant subjects. As far as was possible a record was kept of students' use of the software tools provided. In the case of MiraCalc, the time of each student starting and finishing a MiraCalc session was logged to a central file. The time which students spent using MiraCalc varied considerably. For the purposes of the study, students not using MiraCalc were taken to be either those who had not logged on at all to a MiraCalc session, or those who had in total spent an hour or less using the software. The average length of use was 5 h; the maximum was 27 h. Overall the total number of hours logged in the semester in which this software tool was introduced approached 500.
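The session summarisation described above can be sketched as follows. The log format and names here are assumptions made for illustration; the actual MiraCalc log layout is not documented in this paper.

```python
# Illustrative sketch of summarising a central session log (hypothetical
# format: "student_id start_hour end_hour", times in fractional hours).

log_lines = [
    "s01 10.0 12.5",   # 2.5 h
    "s01 14.0 15.0",   # 1.0 h
    "s02 10.0 10.5",   # 0.5 h -> counts as a non-user (<= 1 h in total)
]

def total_hours(lines):
    """Accumulate total logged hours per student."""
    totals = {}
    for line in lines:
        sid, start, end = line.split()
        totals[sid] = totals.get(sid, 0.0) + (float(end) - float(start))
    return totals

def users(totals, threshold=1.0):
    """Students counted as tool users: more than `threshold` hours logged,
    matching the study's one-hour cut-off for 'not using MiraCalc'."""
    return {sid for sid, h in totals.items() if h > threshold}

totals = total_hours(log_lines)
assert totals["s01"] == 3.5
assert users(totals) == {"s01"}
```

The one-hour threshold is the only detail taken from the text; everything else in the sketch is invented for the example.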

The body of data on the use of Tarski's World was drawn from students' own estimations of the number of lab sessions during which they used it. It is clear that all 80 students monitored on the ITL course would have used Tarski's World to some degree, since it was integrated with the work in the relevant areas of the course. Fortnightly exercise sheets were set within the context of Tarski's World objects and TW course work files were sent on-line for assessment. However, no direct log of each individual's use of Tarski's World was kept, so information on the amount of use of Tarski's World was extracted from questionnaire responses. This is considerably less satisfactory than the data collection for MiraCalc usage, since it relies on students' perceptions of the amount of time they spent using the tool and, in addition, data are not available from the 37% of students who did not complete a questionnaire on their use of Tarski's World. Of the 49 students who completed questionnaires approximately 50% reported using all or most of their allocated ITL lab sessions (sessions were not compulsory). In those sessions which they did use, over 70% of those replying reported that they used Tarski's World for all or most of the allocated time, although it was possible for students to have used this time for other work and a number of students did indeed record using this time to catch up on other lab work.

End of year examination scores for FP1 and ITL were also monitored. The average end of year marks for FP1 and ITL were 56 and 59 respectively, though these averages hid a considerable difference in the range of scores, particularly in relation to FP1, where 35 students received grade A and 28 an F or fail grade. In summary, a combination of qualitative and quantitative information about students' use of the software tools was gathered over two semesters, through observation and monitoring. What does this information suggest and is it a basis for evaluation?

EVALUATION OF THE EFFECT OF THEIR USE

The main focus of the experiment was to evaluate the effect of using software tools on students' learning. There is no doubt from the data presented above that students felt that the use of computer-based tools helped them and had a beneficial effect on their learning of formal reasoning. Here we look at ways in which the tools appear to have helped and whom they have most helped. It must be noted that the experimental use of this software and its evaluation took place in the course of an academic year where the tools were used as part of the undergraduate programme. In contrast to laboratory or controlled experiments, all subjects monitored had access to the tools. Analysis of the data has been undertaken within the constraints of that context. Where quantitative information relating to effectiveness in learning was sought, we have looked for possible contrasts within the group, between those who have used the tools and those who have not. While this precludes straightforward reporting of results from experimental and control groups in supporting any subsequent claims to the effectiveness of the tools, it also has advantages. In dealing with a complete cohort of undergraduates and observing them over the course of the two semesters, it is not difficult to get a feel for how useful the tools are, how much students like using them and how the tools integrate with the curriculum. Given their wide dissemination it is relatively easy to see how robust the tools are, how effective their interfaces are and in what ways a wide range of students use them. Data in relation to Tarski's World are of a qualitative nature, using student perceptions and responses to questionnaires to gauge its effect. These data are strongly positive in showing the use that students have made of the tool and the benefits which they perceive in using it. Replies to questionnaires came from almost 65% of the target population. Of these, 90% had found it useful. Even more impressive is the 91% of those students positively recommending that it should be used for the following student intake. The benefits that they perceived from using Tarski's World were varied, depending in many ways upon the learning styles and needs of the individual. It was used regularly and students enjoyed using it. A major benefit noted by students was its appealingly simple, user-friendly interface, which encouraged exploration.
Another was being able to visualise the more complex ideas of universal and existential quantification to which students were introduced. A most interesting finding from the data collected was the absence of comments on difficulties. In a study of the previous intake at the same institution [9], an area of difficulty which students had noted in relation to this introductory logic course was that of interpreting, building and manipulating logical statements. The ease with which students are able to do so using Tarski's World suggests that this difficulty is being addressed by the tool. Another difficulty mentioned in the previous study was that of not knowing whether the proofs students had worked out were correct, together with the laboriousness of writing proofs out and frequently having to go back and change them. This type of difficulty was also conspicuous by its absence: in the present study the majority, over 70% of questionnaire respondents, found Tarski's World useful in allowing them to check the truth and validity of their logical statements as they worked. Both qualitative and quantitative data were collected on the use of MiraCalc. Qualitative data collected from interviews and returns from questionnaires indicate that where students have used MiraCalc, they perceive the tool as useful and as helping them in a number of areas (Figs 3 and 4).
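The kind of check described above, evaluating the truth of a quantified statement against a small world of objects, can be sketched in a few lines of Python. The blocks-world vocabulary below is a hypothetical illustration, not Tarski's World's actual implementation or syntax:

```python
# A tiny "world" of named blocks (hypothetical data for illustration).
world = [
    {"name": "a", "shape": "cube", "size": "small"},
    {"name": "b", "shape": "tet",  "size": "large"},
    {"name": "c", "shape": "cube", "size": "large"},
]

def forall(pred, domain):
    # Universal quantifier: pred holds of every object in the domain.
    return all(pred(x) for x in domain)

def exists(pred, domain):
    # Existential quantifier: pred holds of at least one object.
    return any(pred(x) for x in domain)

# "Every cube is large" is false in this world: block a is a small cube.
every_cube_large = forall(lambda x: x["size"] == "large",
                          [x for x in world if x["shape"] == "cube"])

# "Some tetrahedron is large" is true, witnessed by block b.
some_large_tet = exists(lambda x: x["shape"] == "tet" and x["size"] == "large",
                        world)
```

A statement is checked simply by evaluating it in the model, which is what gives students the immediate truth feedback they reported finding useful.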

Quantitative data available to help assess the effect of using MiraCalc upon learning outcomes consist of records of the number and length of each student's MiraCalc sessions, together with background information given by students at the beginning of the first semester. For the purposes of this study, learning outcomes were represented by the marks which students scored at the end of the year for the course Functional Programming 1 (FP1). Initial analysis showed a significant difference in FP1 scores between those who had used MiraCalc in their first semester and those who had not (P ≤ 0.001, Mann-Whitney), the mean score of those using MiraCalc being higher than that of the students who did not use it.

Fig. 3. MiraCalc use: responses to "Do you use MiraCalc a lot?" (yes / no / sometimes).

Fig. 4. An aid to understanding: responses to "Has MiraCalc helped you understand normal order reduction?" (yes / no / a little).

Next, each student's level of MiraCalc use was assessed on a five-point scale ranging from very low (1 h or less) to very high (9 h and over). Comparing levels of MiraCalc use with FP1 scores, there was a significant difference between the mean FP1 scores of those with a low or very low level of use (42.2) and those with an average level of use (68.5), although there were no significant differences between the mean scores of groups as the level of use rose from average to very high. Other factors which might have had an effect on FP1 scores were also considered, among them mathematical background prior to university, previous experience of programming, previous home use of a computer, and students' motivations and expectations of the course. These factors and FP1 scores were examined using Spearman's rank correlation coefficient as a measure of the relationships between them. Only mathematical background showed a significant correlation with FP1 scores. As mentioned in the introduction to this paper, previous studies [1] had indicated that whether or not students had studied 'A' level maths was likely to be a significant factor. Analysis of the data confirmed that there was a significant difference in FP1 scores between students who had studied 'A' level maths and those who had not (P ≤ 0.003, Mann-Whitney), the mean score of the former group being 71.4 and that of the latter group 49.4.
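The group comparisons reported here are rank-based. As an illustration of the statistic involved (a sketch using none of the study's actual data; the reported P-values come from the full test, including its significance calculation), the Mann-Whitney U statistic can be computed as follows:

```python
def ranks(values):
    # Assign 1-based ranks, giving tied values the average of their ranks.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def mann_whitney_u(a, b):
    # U statistic for sample a against sample b:
    # rank sum of a in the pooled sample, minus its minimum possible value.
    combined = list(a) + list(b)
    r = ranks(combined)
    r1 = sum(r[: len(a)])
    return r1 - len(a) * (len(a) + 1) / 2
```

If every score in one group exceeds every score in the other, U is 0 for the lower group, which is the extreme case the test measures departure from.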

While it is very encouraging that the use of MiraCalc shows a positive effect in relation to the end of year scores, it is even more interesting to look at where the effect is most evident. Different groupings of students, by level of programming expertise and by maths background, were considered in order to address this point. Perhaps surprisingly, there appeared to be no significant relationship between the benefits of using MiraCalc and levels of programming expertise in relation to FP1 scores. Looking at the relationship between the use of MiraCalc, maths background and FP1 scores, however, was very enlightening. Here two groups were considered in relation to FP1 scores: those students with a mathematical background, i.e. those who had studied maths at 'A' level, and those without, and, within each of these groups, whether or not they had used MiraCalc. In the group of students who had studied 'A' level maths, the results show a significant difference in FP1 end of year scores between those who had used MiraCalc and those who had not (P ≤ 0.03). This difference is, however, far more pronounced within the group of students who had not studied 'A' level maths (P ≤ 0.001), suggesting that the use of MiraCalc was more beneficial to those students who lacked a mathematical background. The indications are that the use of MiraCalc has had a positive effect upon learning outcomes and that this effect is seen most strongly in the results of those students who had not studied 'A' level maths. Given that one of the principal motivations for developing and introducing these software tools had been to address the difficulties which non-mathematical students experience, this is a most encouraging result.

Important as this finding is, it does not of course in itself answer all the questions of evaluation. Aspects which must be considered in evaluating software designed as an aid to teaching and learning are those relating to the software interface and technical development. On these issues there is very little that needs to be said in respect of Tarski's World: this system was selected from those available precisely because it met the criteria sought in those respects. It is a commercial product which had been developed and tested, with an interface which is particularly simple to operate and visually engaging. MiraCalc, by contrast, is an "in-house" tool developed to fill a need which was not met by any available commercial software. Over the academic year its use demonstrated that there had been a real need for such a tool, and it proved to be robust and, in principle, easy to use. Its use "in the field" also showed the shortcomings of this prototype. Implemented in LPA Prolog, it could not be run under 32-bit addressing sessions, which was understandably seen as a serious drawback by students who, in order to use it, had to remember initially to log onto a 24-bit addressing session or else spend precious time logging off and logging on again just to use MiraCalc. It says a lot for its usefulness that so many students felt it was worth that extra time. In terms of interface it was based on familiar, user-friendly Macintosh principles. Design choices had, however, been made relating to the number, roles and positions of windows on screen which, in the light of feedback gained from its use, could be considerably improved. In addition, as noted in the report on student feedback, there was a need for the scope of the tool to be extended to cover further areas of reasoning about functional programming which students found difficult, i.e. list manipulation and type inferencing.
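The subject matter MiraCalc animates, normal order reduction (Fig. 4), can be given a flavour in Python by passing an argument as an unevaluated thunk. This is an illustrative sketch of non-strict behaviour only, not a description of how MiraCalc itself is implemented:

```python
def const(x):
    # The K combinator (const x y = x, in Miranda-style notation).
    # Under normal order reduction the argument bound to y is never reduced.
    return lambda y_thunk: x

def diverge():
    # A computation that would never terminate if actually evaluated.
    while True:
        pass

# Applicative (strict) order would evaluate diverge() first and loop forever.
# Passing diverge unevaluated, as a thunk, mirrors normal order: const
# returns its first argument without ever forcing the second.
result = const(42)(diverge)
```

Seeing which redexes a reduction strategy chooses, step by step, is precisely the kind of process students reported MiraCalc helping them to understand.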

Another important aspect is the place of software tools in the curriculum. As hinted earlier, this can have an important effect upon their use, or non-use. At one end of the spectrum, tools such as these can be introduced as an integrated part of the curriculum; at the other, they can simply be made available and their "take-up rate" left to be determined by students' perceptions of how useful they are. Of the three tools which were the subject of this experiment, Tarski's World and JAPE can perhaps be viewed as lying at the two ends of the spectrum, with MiraCalc near the mid-point. As reported above, there is no doubt that every student on the course concerned used Tarski's World, at least minimally, to satisfy the requirements of the course into which its use was integrated. MiraCalc was initially introduced within the context of a course lecture as an aid which students might find useful, but it was not presented as an integral part of the course. As a result, its initial take-up rate was slow. This changed significantly when it was pointed out and demonstrated to the students concerned that MiraCalc was of tangible benefit to them in the work they were doing, and when its use was related directly to sections of the lab exercises students had been given. JAPE, a tool with great potential to help students in the area of natural deduction, was made available to students but was not introduced within the setting of course lectures, not "officially sanctioned" and not seen as part of the course. Its introduction also coincided with a time of year, mid-second semester, when students begin to feel that time is very precious and end of year exams begin to feature highly among their concerns. This combination of circumstances resulted in an extremely low take-up rate for a tool which would have addressed some of the very concerns which students expressed in relation to the part of the logic course which dealt with natural deduction.

The main point which this issue brings to the fore is the importance of introducing any new software tools into the curriculum in a manner which will encourage their use and ensure that students are aware of the benefits which will accrue from using them. Experience of watching students at interview, and data collected from those interviews, makes it clear that in the case of Tarski's World, and perhaps more so in the case of MiraCalc, these are potentially powerful teaching tools, whether teaching in the sense of one experienced person helping others or in the sense of individuals or groups exploring and using them on their own. The concept that motivated the development and use of the tool-based approach to learning formal reasoning was based to a large extent on the belief that if the tools were well designed, robust and salient to the problems concerned, the benefits to be gained from using them would be self-evident. The experiment has shown that this is not necessarily so. Students feeling pressured, and these are the students for whom such tools are probably most beneficial, are likely to perceive the task of "learning how to use tools" as another distraction which they have no time to investigate. While this in no way invalidates the criteria envisaged as necessary for the "here is a useful tool on the desktop, take it or leave it" approach, in practice the students addressed, in this case first year students, needed to be convinced of the benefits in advance of using such aids. Nor does it invalidate the approach; it is simply a matter of being aware of the context and the position within the curriculum which will allow such tools to be exploited to the best advantage of the intended users.

Finally, although it is very clear that the software tools have had a positive effect upon the learning of formal reasoning within the context of the cohort studied, there are other factors which contribute to the overall level of achievement which students attain. While the purpose of this experiment was to focus on the role and effectiveness of the on-line tools in students' learning of formal reasoning, the resulting data has provided much information besides.

Distribution of end of year scores for both FP1 and ITL revealed interesting groupings at the upper and lower ranges of scores. As a topic which merits further investigation, one could usefully look at the upper and lower quartiles, for instance, and consider the combination of factors which are common to those groups. While this discussion is largely outside the scope of this paper, cross-tabulation procedures flag several interesting similarities among those 23 students whose end of year scores were in the highest percentile, i.e. whose scores were over 88. Of these 23, 19 had had the use of a computer at home before they came to university, as opposed to 1 who had not (this data is missing for the remaining 3 of the group). Of the 18 in the top percentile for whom we have such data, 12 had given "interest in computers" as an important reason for choosing their course, while 16 had made no mention of choosing it for "job" motivation. These pointers to profiles of those students who achieve a higher degree of success in learning formal reasoning are interesting in that they draw attention to the widely varied backgrounds and abilities of the students on these courses, who have very different motivations and expectations. Even when every avenue is explored to help them develop their potential in this area, there is still very likely to be a significant difference in how individual students respond to the materials provided.

CONCLUSIONS

There are research questions which this work raises but which are beyond the scope of this paper. A closer and more detailed investigation of the underlying reasons for the useful role of visualisation in this area merits further research. The overall curriculum planning and pedagogical approaches to formal reasoning courses within the undergraduate computer science programme are also areas which deserve discussion and attention. A study such as that reported here can only be viewed as a first assessment of this tool-based approach to teaching formal reasoning. To gain a more objective view of the extent of its use in a wider context, an investigation of a different nature would now be necessary. On the one hand, a more detailed look at the use of these tools is indicated: this experiment has reported that the tools were helpful, in which areas they were helpful and how they were helpful; closer research is needed to determine why this is so. On the other hand, these tools need to be tested across a broader spectrum, with other cohorts of students and in other contexts. These are topics for future research. In the context of this particular research enterprise, there are a number of immediate steps to be taken. There are aspects of the particular tools investigated which merit further attention. These are being addressed and involve modifications and extensions to MiraCalc and its development for use on other platforms; the latest version, MiraCalc2 for UNIX, is currently under development [10]. JAPE is also being developed further in the light of subsequent field trials, and the results of this will be reported in due course. The principal conclusion to be drawn from the research reported here is that the software tools had an overall beneficial effect on students learning formal reasoning. Data has shown that their use has been of particular help to those students with a less mathematical background.
The tools helped students to reason about their programs and helped them to understand what they were learning. Drawing on what we know from the data, this was because of the facilities the tools provided for visualising the processes of formal reasoning and for checking those processes as they were being performed, and because they offered students the opportunity to explore the domain in a way which would not otherwise have been possible.

REFERENCES

1. Fung P., O'Shea T., Goldson D., Reeves S. and Bornat R., Computer science students' perceptions of learning formal reasoning methods. Int. J. Math. Educ. Sci. Technol. 24, 749-760 (1993).

2. Fung P. and O'Shea T., Learning to reason formally about programs. CITE Report No. 168. Open University, Milton Keynes (1993).

3. Goldson D. and Reeves S., Using programs to teach logic to computer scientists. In Developments in the Teaching of Computer Science Conference (Edited by Bateman D. and Hopkins T.), pp. 167-176. University of Kent at Canterbury (1992).

4. Barwise J. and Etchemendy J., The Language of First Order Logic. Centre for the Study of Language and Information, Stanford, Calif. (1990).

5. Goldson D., A symbolic calculator for non-strict functional programs. Comp. J. 37, 176-187 (1994).

6. Bornat R. and Sufrin B., JAPE: Just Another Proof Editor. Computer Science Department Technical Report, Queen Mary and Westfield College, University of London, in press.

7. Brewer J. and Hunter A., Multimethod Research: A Synthesis of Styles. Sage, London (1989).

8. Fung P. and O'Shea T., Using software tools to learn formal reasoning: a first assessment. CITE Report No. 197. Open University, Milton Keynes (1994).

9. Fung P., O'Shea T., Goldson D., Reeves S. and Bornat R., Why computer science students find formal reasoning frightening. J. Comp. Assist. Learn. 10, 240-250 (1994).

10. Goldson D., Hopkins M. and Reeves S., MiraCalc: the Miranda calculator, the UNIX version. Computer Science Department Working Paper 94/5. University of Waikato, New Zealand (1994).