Teaching the Cache Memory System Using a Reconfigurable Approach


336 IEEE TRANSACTIONS ON EDUCATION, VOL. 51, NO. 3, AUGUST 2008


Ricardo Quislant, Ezequiel Herruzo, Oscar Plata, José Ignacio Benavides, and Emilio L. Zapata

Abstract—This paper presents a tool that simulates a reconfigurable cache whose parameters can be changed at runtime through a special instruction at the instruction set architecture (ISA) level. The proposed tool simulates a cache system that can be reconfigured within a variety of 298 combinations of cache capacity, number of ways or associativity, and line/block size in words (C, W, and L) without changing its architecture. The simulator was developed through a series of laboratory exercises in computer architecture. The students are introduced to the reconfigurable cache architecture while refreshing their knowledge on computer architecture issues like logic design, and register transfer level and computer system level architectures, as well as reinforcing concepts about memory system organization and architecture. This paper presents an overview of the reconfigurable cache and a description of the simulator interface. Finally, feedback from the students provides assessment on using the simulator in the laboratory.

Index Terms—Cache memories, computer-aided instruction, education, reconfigurable cache design, simulation software.

I. INTRODUCTION

The growing disparity between processor and memory speeds makes the cache memory and its efficient utilization a critical factor in determining program performance.

There are many ways to improve the cache performance resulting in a reduction of the application execution time. One solution integrates a number of cache levels inside the processor chip, although this may cause a significant increment in energy consumption and problems with heat dissipation. Other solutions are based on the modification of the cache organization [1], optimization of cache algorithms [2], or improvement of data locality [3].

This paper presents the SImulator of Reconfigurable CAche (SIRCA) tool, a reconfigurable cache simulator designed to show that tuning cache parameters properly may improve performance, as a result of a reduction in the miss rate. Tuning the cache architecture may also optimize other critical aspects, like energy consumption, by using only the modules that are needed at any particular moment.

This tool allows students to study the cache behavior (e.g., number of read/write misses and hits) for a set of memory references, testing different cache organizations, such as direct-mapped, n-way set-associative, and so on, with different block and cache sizes.

Manuscript received June 28, 2007; revised November 26, 2007. First published June 17, 2008; last published August 6, 2008 (projected).

R. Quislant, O. Plata, and E. L. Zapata are with the Department of Computer Architecture, University of Málaga ETSI Informatica, Campus de Teatinos, University of Málaga, 29071 Málaga, Spain (e-mail: [email protected]).

E. Herruzo and J. I. Benavides are with the Department of Computer Architecture and Electronics, University of Córdoba, 14071 Córdoba, Spain.

Digital Object Identifier 10.1109/TE.2008.916767

The simulation tool shows the students a particular example of a reconfigurable cache design, and motivates them by using concepts that they already know implemented within a multimedia resource. The convenience of using multimedia resources as teaching materials has been extensively demonstrated [4]–[6].

Other simulators of the cache memory similar to SIRCA have been proposed in the literature. A representative list follows.

• Dinero IV [7]: A trace-driven uniprocessor cache simulator that allows many cache design parameters to be varied (e.g., write-back versus write-through, least recently used (LRU) versus random replacement, demand fetching versus prefetching). Although it is a powerful tool, Dinero is difficult to configure and does not provide a graphic user interface (GUI). Dinero is a command-line driven simulator written in C.

• CAMERA [8]: Written in Java Swing [9], this simulator provides users with interactive tutorials and simulations to help them understand concepts about cache mapping and virtual memory using paging.

• Java Cache Simulator [10]: This tool addresses the concepts of the three cache mapping schemes, and allows flexibility in changing the cache size, block size and associativity. However, Java Cache Simulator demonstrates neither the flow of data throughout the system nor the flow of the cache read and write algorithms.

SIRCA is written in Java Swing, like CAMERA and Java Cache Simulator. However, SIRCA’s interface differs from these in how it shows cache modules. The register transfer level view of SIRCA is more intuitive for students than are the text boxes and labels shown by CAMERA and Java Cache Simulator. Furthermore, SIRCA simulates a specific reconfigurable cache design and provides hit and miss rates.

The rest of this paper is organized as follows. Section II describes the educational framework where the simulator is used. Section III shows a description of the reconfigurable cache design and simulator, along with an explanation of practical laboratory exercises. Section IV presents the feedback collected from students. Finally, conclusions are drawn in Section V.

II. EDUCATIONAL FRAMEWORK

A. Educational Objectives

The SIRCA tool has been used in several courses on Computer Design and Architecture at the University of Córdoba, Córdoba, Spain. More precisely, this software tool is being used in two specific courses: Computer Architecture and Engineering, and Processor Design and Configurations Assessment, both of them in the computer engineering curriculum. This simulator is a helpful tool in achieving some important aims in these courses, namely

• introducing a reconfigurable cache as a way to improve aspects like performance and energy saving;

• performing some practical exercises with a reconfigurable cache simulator;

• studying in depth the cache memory parameters, such as associativity, block size, replacement policies, and writing policies;

• reinforcing the knowledge acquired in previous related courses;

• motivating the students with these concepts.

The IEEE-ACM Computer Science Curricula 2001 [11] lists computer architecture (AR) as one of the subjects that should be in a computer engineering curriculum, and which should include several core elements. A reconfigurable cache simulator is a powerful tool to enhance the teaching of at least three of these elements, including digital logic and digital systems (AR1), memory system organization and architecture (AR4), and multiprocessing and alternative architectures (AR7).

SIRCA has been used in several lab exercises prepared for the students on the aforementioned courses, together with other additional simulators also developed by the Computer Architecture and Electronics Department of the University of Córdoba, namely: the MESI coherence protocol simulator [12], the multilevel cache memory simulator (SiCaM [13]), and the Web system to simulate the multilevel cache memory occupation [14].

B. Theoretical Context

SIRCA has been placed in different theoretical contexts, such as an introductory course on Computer Architecture and Engineering and a more specific course on Computer Configurations Assessment. A brief outline of these courses follows.

Computer Architecture and Engineering is a mandatory annual course in the fourth year of the Informatics Engineering degree at the Polytechnic University of Córdoba. The course is assigned a total of 90 h that are divided into 45 h of theoretical lectures and 45 h of practical laboratory exercises. Three sessions, each two hours long, are assigned to practical exercises with SIRCA. This course is a general approach to computer architecture (CA). Hence, these practical exercises with the simulator are designed to study the influence of the cache parameters on the processor performance.

Processor Design and Configurations Assessment is a first-semester course in the fifth and last year of the Informatics Engineering degree at the Polytechnic University of Córdoba. This course is assigned a total of 60 h that are divided into 30 h of theoretical lectures and 30 h of practical laboratory exercises. Three sessions, each two hours long, are assigned to practical exercises with the reconfigurable cache simulator. The main topics of the course are hardware optimizations in modern processors, and processor design oriented to optimizing software applications. SIRCA is used to perform some exercises to achieve those objectives.

III. USING SIRCA

This section describes in detail the reconfigurable cache design and some features of the simulator, as well as some simulator-based practical exercises.

A. Reconfigurable Cache Overview

The reconfigurable cache is assumed to be embedded in a system with a 256 MB main memory organized in 32-bit words (26-bit memory address register). The cache capacity (C) ranges from 256 bytes to 256 KB (that is, from 64 to 65536 words, in powers of two). The line size (L) can be 1, 2, 4, 8, 16, 32, or 64 words. The number of blocks per set (W) can be 1, 2, 4, or 8. Every combination of cache capacity, number of ways or associativity, and line/block size in words (C, W, and L) is allowed.
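As a quick consistency check, the parameter ranges above can be enumerated. The following Java sketch assumes a rule that the text does not state explicitly: a configuration is usable only when the capacity holds at least one complete set (C ≥ W × L). Under that assumption the count agrees with the 298 combinations quoted in the abstract. The class and method names are illustrative only.

```java
/** Counts the (C, W, L) configurations under an assumed validity rule. */
class CacheConfigCount {
    static final int[] LINE_WORDS = {1, 2, 4, 8, 16, 32, 64};
    static final int[] WAYS = {1, 2, 4, 8};

    /** Assumed rule (not from the paper): the cache must hold at least one set. */
    static boolean isValid(int capacityWords, int ways, int lineWords) {
        return capacityWords >= ways * lineWords;
    }

    static int countValid() {
        int count = 0;
        // Capacities are the powers of two from 64 to 65536 words.
        for (int c = 64; c <= 65536; c *= 2) {
            for (int w : WAYS) {
                for (int l : LINE_WORDS) {
                    if (isValid(c, w, l)) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countValid()); // prints 298
    }
}
```

Without the rule there would be 11 × 4 × 7 = 308 tuples; the ten excluded ones are the smallest capacities combined with the largest lines and associativities (for example, C = 64 words with W = 8 and L = 64).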

The cache is unified (there are no separate caches for data and instructions). The writing policy chosen is write-back, so there is a dirty bit in the tag RAM to mark a block that has been modified in the cache.
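The role of the dirty bit under a write-back policy can be sketched as follows. This is a deliberately simplified model and not SIRCA's implementation: it assumes a direct-mapped cache with one-word lines and write-allocate behavior, and every name in it is hypothetical.

```java
/** Minimal write-back sketch: a direct-mapped cache of one-word lines. */
class WriteBackSketch {
    final int[] memory;      // backing main memory, word addressed
    final int[] data;        // cache data array
    final int[] tag;         // cache tag array
    final boolean[] valid;
    final boolean[] dirty;   // true when the cached copy differs from memory
    int writeBacks = 0;      // lines copied back to memory so far

    WriteBackSketch(int memoryWords, int cacheLines) {
        memory = new int[memoryWords];
        data = new int[cacheLines];
        tag = new int[cacheLines];
        valid = new boolean[cacheLines];
        dirty = new boolean[cacheLines];
    }

    /** Ensures the addressed word is cached, writing back a dirty victim first. */
    private int fill(int address) {
        int index = address % data.length;
        int newTag = address / data.length;
        if (!valid[index] || tag[index] != newTag) {
            if (valid[index] && dirty[index]) {
                memory[tag[index] * data.length + index] = data[index];
                writeBacks++;
            }
            data[index] = memory[address];
            tag[index] = newTag;
            valid[index] = true;
            dirty[index] = false;
        }
        return index;
    }

    int read(int address) {
        return data[fill(address)];
    }

    void write(int address, int value) {
        int index = fill(address);  // write-allocate: an assumption of this sketch
        data[index] = value;
        dirty[index] = true;        // main memory is updated only on eviction
    }

    public static void main(String[] args) {
        WriteBackSketch cache = new WriteBackSketch(64, 4);
        cache.write(1, 42);
        System.out.println(cache.memory[1]); // still 0: the write stayed in the cache
        cache.read(5);                       // same line index, evicts the dirty line
        System.out.println(cache.memory[1]); // 42 after the write-back
    }
}
```

The point of the sketch is that main memory sees the new value only when the modified line is replaced, which is why each tag entry needs the extra dirty bit.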

The reconfigurable cache is an inline cache, which means that a memory access causes a cache lookup and, only if a miss occurs, a main memory lookup begins. In a look-aside cache, the lookup process begins in the cache and in the main memory at the same time. The look-aside scheme is faster in case of a miss, but increases the occupation of the system bus [15].

The replacement policy considered is a pseudo-LRU [15], a variant of the LRU algorithm that requires less memory. The reconfigurable cache implements this algorithm in the LRU module, depending on the associativity supplied by the cache reconfigurable register (CRR) and other signals.
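To make the idea of a pseudo-LRU policy concrete, the sketch below shows the widely used binary-tree variant for a single four-way set. It is an assumption that this resembles the scheme in the LRU module; the text only states that a pseudo-LRU is used and that it adapts to the associativity held in the CRR, whereas this sketch fixes the associativity at four. The class name is hypothetical.

```java
/** Tree pseudo-LRU for one 4-way set, using three bits of state. */
class TreePlru4 {
    // bits[0] is the root, bits[1] covers ways 0-1, bits[2] covers ways 2-3.
    // Each bit points at the side used less recently (false = lower, true = upper).
    private final boolean[] bits = new boolean[3];

    /** Records an access by pointing every bit on the path away from the way. */
    void touch(int way) {
        if (way < 2) {
            bits[0] = true;        // the upper pair is now the older half
            bits[1] = (way == 0);  // inside the lower pair, the other way is older
        } else {
            bits[0] = false;       // the lower pair is now the older half
            bits[2] = (way == 2);
        }
    }

    /** Follows the bits from the root to the way to replace. */
    int victim() {
        if (!bits[0]) {
            return bits[1] ? 1 : 0;
        }
        return bits[2] ? 3 : 2;
    }

    public static void main(String[] args) {
        TreePlru4 set = new TreePlru4();
        for (int way : new int[] {0, 1, 2, 3, 0}) {
            set.touch(way);
        }
        System.out.println(set.victim()); // prints 2
    }
}
```

After the access sequence 0, 1, 2, 3, 0 the sketch selects way 2, whereas a true LRU policy would select way 1. That loss of precision is the price of keeping only three bits per set instead of a full ordering of the four ways.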

The reconfigurable cache includes the dynamic mapping module (DMM) that maps the addresses issued by the CPU dynamically, depending on the content of the CRR. This register can be set by a reconfiguration instruction, and it holds the current parameters of the cache (C, W, and L) in the form of masks. The programmer is provided with these masks, but they are easy to calculate from the values of C, W, and L.

Each mask is applied to certain bits of the memory address by means of AND gates. For instance, the set address is obtained by ANDing the current CPU address with the mask that extracts the set address.

There is also a Cache–Main memory interface that allows burst line transfers between cache and main memory, which are faster than individual word transfers. The burst size is set on the reconfiguration cycle to match the line size.
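The mask-based address split performed by the DMM can be illustrated with a short sketch. The mask layout used here is an assumption made for illustration: the line offset occupies the low bits of the 26-bit word address, the set index comes next, and the tag takes the remaining high bits. The actual encoding of the masks in the CRR is described in [16] and may differ; the class and method names are hypothetical.

```java
/** Illustrative address split using AND masks; the mask layout is an assumption. */
class AddressMasks {
    static final int ADDRESS_BITS = 26;                      // MAR width given in the text
    static final int ADDRESS_MASK = (1 << ADDRESS_BITS) - 1;

    final int offsetMask;
    final int setMask;
    final int tagMask;
    final int offsetBits;
    final int setBits;

    /** All parameters are powers of two, with capacityWords >= ways * lineWords. */
    AddressMasks(int capacityWords, int ways, int lineWords) {
        int sets = capacityWords / (ways * lineWords);
        offsetBits = Integer.numberOfTrailingZeros(lineWords);
        setBits = Integer.numberOfTrailingZeros(sets);
        offsetMask = lineWords - 1;
        setMask = (sets - 1) << offsetBits;
        tagMask = ADDRESS_MASK & ~(offsetMask | setMask);
    }

    int offset(int address) { return address & offsetMask; }

    int set(int address) { return (address & setMask) >>> offsetBits; }

    int tag(int address) { return (address & tagMask) >>> (offsetBits + setBits); }

    public static void main(String[] args) {
        AddressMasks m = new AddressMasks(1024, 2, 8); // C = 1024 words, W = 2, L = 8
        int address = 0x12345;
        System.out.printf("tag=0x%X set=%d offset=%d%n",
                m.tag(address), m.set(address), m.offset(address));
        // prints tag=0x91 set=40 offset=5
    }
}
```

With C = 1024 words, W = 2, and L = 8 there are 64 sets, so three bits select the word within the line, six bits select the set, and the remaining 17 bits form the tag. Changing C, W, or L only changes the masks, which is what lets the same datapath serve every configuration.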

Finally, the Way selection module consists of combinational logic to select the ways that are to be used while keeping in standby those that are not necessary.

A more detailed description of the reconfigurable cache can be found in [16].

B. Simulator Overview

SIRCA provides a GUI (Fig. 1) that consists of a menu bar providing access to every functionality of the application; a toolbar, with buttons to simulate cycle by cycle, reference by reference and execute all references; a references table, that shows the current set of memory addresses being simulated (top left); a statistics chart, that shows the hit and miss rates (bottom left); a reconfigurable cache block chart, that shows the reconfigurable cache structure in RTL (center); and a status bar.

Fig. 1. SIRCA graphic user interface.

Fig. 2. Format of the input trace file for the simulator.

SIRCA highlights in blue the active buses and the selected tag and data arrays. Microoperations selected by the cache controller are highlighted in red, as well as the control lines such as DIRTY, MATCH, and so on. Rolling the mouse over the buses causes a tool tip to show their contents. Rolling the mouse over the CRR shows the current configuration of the cache.

To run a simulation, a file containing the memory references issued by a program execution, a so-called trace, is needed. A special reference has been defined to reconfigure the simulated cache system by setting up the parameters of the cache via a reconfiguration word (see Fig. 2). Such a reconfiguration word encodes the cache parameters C, W, and L. The “cache configurations table” in the help menu holds every available tuple of C, W, L, and the corresponding reconfiguration word. An example of a memory reference file for SIRCA is given in Fig. 2.

Several windows of the user interface detail parts of the cache memory system. Fig. 3 depicts the microprogrammed cache controller window. The students can track the execution of read, write and reconfiguration algorithms since microoperations are highlighted in red and control bits are shown in boxes at the bottom right corner.

Fig. 3. Microprogrammed cache controller window.

Fig. 4. Tag RAMs, data RAMs, and main memory windows.

Fig. 5. Dynamic mapping module window.

Memory content is shown in three different windows (see Fig. 4). The data RAM window shows the cache data array contents. The tag RAM window shows the cache tag array contents in a tabbed pane. Each entry has its valid and dirty bit. Finally, the main memory window shows the contents of main memory.

The last window of the GUI shows the DMM (Fig. 5), the core of the reconfigurable cache system. This window depicts, in an intuitive fashion, how the masks in the CRR are applied to the address in the memory address register (MAR). The numbers in red change as the simulation proceeds. The masks dynamically divide the MAR address into tag, set, and line offset. The buses that connect these parts to the corresponding memory address ports have a fixed length; however, the masks set to zero the bits that are not used in the current configuration.

Practical exercises with SIRCA can be done in two ways.


1) Students are provided with memory reference (trace) files that are used as input to SIRCA. These reference files are short and simple, and are intended to allow the students to see the flow of microoperations fetched by the cache controller and the flow of data through the registers, buffers and buses. This insight is very important in order to understand the operation of a cache memory.

2) Students are provided with source code files (e.g., matrix product). They must instrument the code to obtain the corresponding trace file for that code. This mode of using the simulator helps the students to understand certain compiler-related tasks.

The simulator also includes a user’s manual that contains a comprehensive explanation of its functionality and of the digital design of the modules in the reconfigurable cache block chart. This information is important for the student to understand the complete operation of SIRCA.

The simulator is free and available at www.uco.es/~el1hegoe/download/src for academic use. Minimum system requirements are: Java Runtime Environment 1.2 or later for Windows or Linux, running under a Pentium II/AMD K-6II 350-MHz processor, with at least 64 MB of main memory and a 10-MB hard disk.

C. Practical Laboratory Exercises

Section II-B placed the simulator in the context of two courses at the University of Córdoba: Computer Architecture and Engineering, and Processor Design and Configurations Assessment. In the first course, the simulator is used in three practical sessions.

• The first session introduces SIRCA, to familiarize the students with the tool. Furthermore, the reconfigurable cache logic and architectural design is explained in detail. The students can see theoretical concepts implemented within the software tool.

• In the second session, students are provided with several files containing short sequences of memory references that force different cache events, such as a cache hit, a miss or a block replacement. They have to simulate the given files, cycle by cycle, in order to see the cache operation. Finally, they have to justify why the cache operates in that way, referring to the policies the cache memory implements.

• In the last session, students can see the effects of changing block size, capacity and associativity, as well as practicing with some cache miss rate reduction techniques, described in [17].

In the second course, the SIRCA simulator is used to achieve more specific objectives. The following subsections detail the exercises proposed in that course.

1) Software Optimization: The first session is designed to study some software optimization techniques that can be used by the compiler or the programmer in the area of the cache memory system. This topic links programming languages and memory system concepts such as array storage and indexing.

With respect to array storage, some programming languages, like C/C++, store arrays in memory following row-major order, while languages like Fortran store arrays following column-major order. Some techniques have been developed to optimize array access locality. Loop interchange [17], for instance, may be used to reduce the cache miss rate by indexing arrays in the order in which they are stored in memory. Consequently, the execution time is also reduced, by enhancing locality of reference. Blocking [17] is another powerful technique that tries to reduce cache misses by improving temporal locality. Instead of operating on entire rows or columns of an array, blocked algorithms operate on submatrices or blocks, in order to maximize the use of the data stored in the cache before such data is replaced.

Fig. 6. Java code fragment for the matrix product.
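The effect that loop interchange has on the miss count can be reproduced with a few lines of code. The sketch below is not the instrumented code of Fig. 6 and does not use SIRCA's trace format; it is a hypothetical, minimal direct-mapped cache model that counts misses for two traversal orders of the same row-major array. The cache geometry (16 lines of 8 words) and the array size (64 × 64) are arbitrary choices made for illustration.

```java
import java.util.Arrays;

/** Compares miss counts for two loop orders over a row-major n x n array. */
class LoopInterchangeSketch {

    /** Counts the misses of a direct-mapped cache for a list of word addresses. */
    static int countMisses(int[] addresses, int cacheLines, int lineWords) {
        int[] tags = new int[cacheLines];
        Arrays.fill(tags, -1);               // -1 marks an empty line
        int misses = 0;
        for (int address : addresses) {
            int block = address / lineWords; // memory block holding the word
            int index = block % cacheLines;  // cache line the block maps to
            if (tags[index] != block) {
                tags[index] = block;
                misses++;
            }
        }
        return misses;
    }

    /** Word addresses touched when every element is visited once. */
    static int[] trace(int n, boolean rowByRow) {
        int[] out = new int[n * n];
        int k = 0;
        for (int outer = 0; outer < n; outer++) {
            for (int inner = 0; inner < n; inner++) {
                int i = rowByRow ? outer : inner;
                int j = rowByRow ? inner : outer;
                out[k++] = i * n + j;        // row-major offset of element (i, j)
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(countMisses(trace(64, true), 16, 8));  // prints 512
        System.out.println(countMisses(trace(64, false), 16, 8)); // prints 4096
    }
}
```

Visiting the array in storage order misses once per 8-word line (512 misses), while the interchanged order misses on every one of the 4096 accesses, because each line is evicted before its neighboring words are used.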

The exercise given to the students concerns loop interchange, and was designed as follows. Students are provided with a Java or C code that implements a matrix product. The code was instrumented to generate an output trace file with the format required by the simulator (see Fig. 6). Initially, array storage is in row-major order (i.e., element (i, j) is stored at offset i × N + j, where the array has dimension N × N). Then students must test the results obtained with column-major order array storage, and with a mixture of both orders of storage. Students might observe that better results are obtained when the other two matrices are stored in row-major order because these are accessed in row-major order, and when matrix B is stored in column-major order because this is accessed in column-major order (i.e., element (i, j) is stored at offset j × N + i, where the array has dimension N × N).

2) Hardware Optimization: The objectives of the second session are to

• introduce the need for reducing power dissipation due to on-chip caching;

• explore some tradeoffs between cache performance and power dissipation;

• introduce reconfigurable caches.

Students use the trace file obtained in the previous session to gather results from the simulator. They have to change associativity, while maintaining the same values for block size and capacity. The students are expected to see that while the hit and miss rates have not changed significantly, reducing cache associativity results in lower power dissipation, since the design of the reconfigurable cache requires fewer transistors to be operating, and bearing in mind that a transistor consumes energy on changing state. Students should also observe the benefits of increasing block size, while taking into account the penalty in increasing misses.

3) Collaborative Session: The last session is devoted to collaborative learning. Students must compare and explain the results obtained in their previous sessions. They form discussion groups to exchange their conclusions for the various exercises. Subsequently, they present these conclusions to the teacher and to the other groups in order to create a social interaction between the teacher and the students.

The idea is to encourage students to help each other, thus reinforcing the social dimension of education [6]. The collaborative session is based on proven pedagogical theories of collaborative learning [18] and constructional learning [19].

IV. FEEDBACK FROM THE STUDENTS

SIRCA was introduced in the first term of the 2006–2007 academic year as a reinforcement to theoretical lectures. As a basic assessment about the use of the simulator, students were required to fill out a questionnaire. They gave their opinions on the following statements, among others.

• I clearly understood the purpose and operation of the cache memory system before practicing with the simulator.

• SIRCA helps a user to understand memory system concepts.

• The practical exercises are easy to perform.

• I would like to have had extra time to carry out the laboratory exercises and to understand them better.

• I think the simulator is a user-friendly tool that allows me to learn at my own pace.

The statements previously mentioned had to be answered on a five-point Likert scale [20] where 1 means “strongly disagree,” 2 means “disagree,” 3 means “neither agree nor disagree,” 4 means “agree,” and 5 means “strongly agree.” A total of 52 students from the fourth-year course filled out the questionnaire and the results are shown in Fig. 7.

Fig. 7(a) shows that most of the students did not previously have a good understanding of the purpose and operation of the cache memory system. As shown in Fig. 7(b), most of the students agreed or strongly agreed with the second statement. They found that the simulator was a helpful tool in gaining a better understanding of theoretical concepts. The results obtained for the third statement, in Fig. 7(c), suggest that the practical exercises were easy to perform but could be easier, since several students disagreed with this statement. However, about 75% of the students indicated that they would like to have had extra time to carry out the practical exercises and to understand them better, as shown in Fig. 7(d). This fact and the results shown in Fig. 7(e) led to a decision: since increasing the time devoted to the simulator laboratory exercises is not possible because of the credits assigned to the courses, the students have now been given the possibility of downloading the simulator at home (see Section III-B), where they can complete the lab exercises and test the tool in depth, if they wish. Furthermore, since the tool was developed using Java technology, it can run under any operating system with a Java virtual machine.

Fig. 7. Questionnaire results for the main questions asked to students.

Finally, Table I shows the students’ grades obtained from theoretical tests during 2006 and 2007. Tests in Computer Architecture and Engineering include an exercise on memory system organization and architecture. The results shown here are the partial grades obtained on these exercises. It should be noted that except for the use of the SIRCA practical laboratory exercises, the educational environment has not changed between years (same professor, same textbook and lecture notes, and same type of exercises). The number of students doing the tests was approximately the same (60 students in 2005–2006 and 64 students in 2006–2007). A substantial 20% of the students, instead of earning either F or E, managed to earn either C or D. The number of students earning either an A or B grade increased slightly.

TABLE I
STUDENTS’ GRADES ON MEMORY SYSTEM ORGANIZATION AND ARCHITECTURE EXERCISES

V. CONCLUSION

A new simulator tool has been developed, along with a complementary set of laboratory exercises, to motivate students and to facilitate teaching the cache memory, which is one of the most important hardware structures devised to enhance CPU performance by hiding the memory latency.

The reconfigurable cache simulator was used in the courses Computer Architecture and Engineering, and Processor Design and Configurations Assessment at the University of Córdoba. Student opinions of the simulator were obtained through a questionnaire, which shows that they think the simulator was helpful in understanding the theoretical concepts, and that they would have liked to have had more time to work on the suggested exercises. Statistically, students in courses concerning memory system organization and architecture featuring the simulator obtained higher scores on theoretical tests than did students in previous years, who did not have access to the simulator.

Overall, the main aims of the courses were achieved successfully, demonstrating that the simulator is a motivating tool to introduce reconfigurable cache design and to reinforce the students’ knowledge of computer architecture. Finally, as previously mentioned, SIRCA is free and available for academic use at http://www.uco.es/~el1hegoe/download/src.

REFERENCES

[1] S. Coleman and K. S. McKinley, “Tile size selection using cache or-ganization and data layout,” in Proc. SIGPLAN Conf. ProgrammingLanguage Design and Implementation, La Jolla, CA, Jun. 1995, pp.279–290.

[2] M. D. Lam, E. E. Rothberg, and M. E. Wolf, “The cache performanceand optimizations of blocked algorithms,” in Proc. 4th Int. Conf. Archi-tectural Support for Programming Languages and Operating Systems(ASPLOS-IV) , Santa Clara, CA, Apr. 1991, pp. 63–74.

[3] D. F. Bacon, S. L. Graham, and O. J. Sharp, “Compiler transformationsfor high-performance computing,” ACM Comput. Surv., vol. 26, no. 4,pp. 345–420, Dec. 1994.

[4] A. A. Shiga and C. A. G. Pegollo, “The media as a complement of en-gineering teaching,” presented at the Int. Conf. Engineering Education,Taipei, Taiwan, 2000.

[5] J. C. Boluda, M. A. Peiró, M. A. L. Torres, R. Gironés, and R. J. C.Palero, “An active methodology for teaching electronic systems de-sign,” IEEE Trans. Educ., vol. 49, no. 3, pp. 355–359, Aug. 2006.

[6] L. Moreno, C. González, I. Castilla, E. J. González, and J. Sigut, “Use of constructivism and collaborative teaching in an ILP processors course,” IEEE Trans. Educ., vol. 50, no. 2, pp. 101–111, May 2007.

[7] J. Edler and M. D. Hill, Dinero IV Trace-Driven Uniprocessor Cache Simulator [Online]. Available: www.cs.wisc.edu/~markhill/DineroIV

[8] L. Null and K. Rao, “CAMERA: Introducing memory concepts via visualization,” SIGCSE Bull., vol. 37, no. 1, pp. 96–100, 2005.

[9] Java, Trail: Creating a GUI With JFC/Swing [Online]. Available: java.sun.com/docs/books/tutorial/uiswing

[10] K. Rich, H. Pang, E. Weathers, and G. Zhong, Java Cache Simulator [Online]. Available: huron.cs.ucdavis.edu/students/weathers/public_html/index.html

[11] Computing Curricula 2001, Computer Science, The Joint Task Force on Computing Curricula, IEEE Computer Society, Association for Computing Machinery [Online]. Available: www.sigcse.org/cc2001

[12] F. J. Jiménez, J. Gómez, A. Mesones, E. Herruzo, J. I. Benavides, and F. J. Sánchez, “Teaching the cache memory coherence with the MESI protocol simulator,” presented at the Congreso de Tecnologías Aplicadas a la Enseñanza de la Electrónica, Madrid, Spain, Jul. 2006.

[13] E. Herruzo, J. I. Benavides, Saez, M. A. Montijano, and J. M. Palomares, “Desarrollo de simuladores de arquitectura de computadores y su aplicación en la enseñanza,” presented at the Congreso de Tecnologías Aplicadas a la Enseñanza de la Electrónica, Las Palmas de Gran Canaria, Spain, Feb. 2002.

[14] M. A. Herrera, J. M. Palomares, E. Herruzo, and J. I. Benavides, “Web system to simulate the multilevel cache memory occupation for teaching purposes,” presented at the Int. Conf. Multimedia and Information and Communication Technologies in Education, Cáceres, Spain, Jul. 2005.

[15] J. Handy, The Cache Memory Book. San Diego, CA: Academic, 1993, ISBN 0-12-322985-5.

[16] R. Quislant, “Reconfigurable cache memory system. Design and simulation,” M.S. thesis, University of Córdoba, Córdoba, Spain, 2004.

[17] J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, 3rd ed. San Mateo, CA: Morgan Kaufmann, 2003.

[18] A. Inaba, T. Supnithi, M. Ikeda, R. Mizoguchi, and J. Toyoda, “How can we form effective collaborative learning groups?,” in Proc. 5th Int. Conf. Intelligent Tutoring Systems, Montreal, QC, Canada, Jun. 2000, pp. 282–291.

[19] R. Reiser and J. V. Dempsey, Trends and Issues in Instructional Design and Technology, 2nd ed. Columbus, OH: Merrill/Prentice-Hall, 2006, ch. II, sec. 5.

[20] R. A. Likert, “Technique for the measurement of attitudes,” Arch. Psychol., vol. 140, no. 55, pp. 1–55, 1932.

Ricardo Quislant received the M.Sc. degree in computer engineering from the University of Granada, Granada, Spain, in 2006. He is currently working towards the Ph.D. degree in the Department of Computer Architecture, University of Málaga, Málaga, Spain.

His main research interests are computer memory systems and high-performance computing, with special regard to transactional memory.

Ezequiel Herruzo received the M.Sc. degree in computer engineering from the University of Málaga, Málaga, Spain, in 1991.

Since 1999, he has been an Assistant Professor in the Department of Computer Architecture and Electronics, University of Córdoba, Córdoba, Spain. His research is mainly focused on the interaction between the compiler and novel computer architectures, as well as on the implementation of new educational tools to teach computer architecture issues.

Oscar Plata received the M.Sc. and Ph.D. degrees in physics from the University of Santiago de Compostela, Santiago de Compostela, Spain, in 1985 and 1989, respectively.

He is currently a Full Professor in the Department of Computer Architecture, University of Málaga, Málaga, Spain. His research interests include high performance computer architecture, automatic parallelization, and compiler technologies for parallel applications.

José Ignacio Benavides received the Ph.D. degree in physics from the University of Santiago de Compostela, Santiago de Compostela, Spain, in 1990.

Since 1990, he has been an Associate Professor in the Department of Computer Architecture and Electronics, University of Córdoba, Córdoba, Spain. His research is mainly focused on processor design and on the implementation of new educational tools to teach computer architecture issues.

Emilio L. Zapata received the M.Sc. degree in physics from the University of Granada, Granada, Spain, and the Ph.D. degree in physics from the University of Santiago de Compostela, Santiago de Compostela, Spain, in 1978 and 1983, respectively.

Since 1991, he has been a Full Professor in the Department of Computer Architecture, University of Málaga, Málaga, Spain. His main research interests are numerical and audiovisual applications, high performance architectures, and compilation techniques for parallel computers.