
A Thesis

entitled

A BIST Architecture for Testing LUTs in a Virtex-4 FPGA

by

Priyanka Gadde

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the

Master of Science Degree in Electrical Engineering

_______________________________________

Dr. Mohammad Niamat, Committee Chair

_______________________________________

Dr. Mansoor Alam, Committee Member

_______________________________________

Dr. Weiqing Sun, Committee Member

_______________________________________

Dr. Patricia R. Komuniecki, Dean

College of Graduate Studies

The University of Toledo

December 2013

Copyright 2013, Priyanka Gadde

This document is copyrighted material. Under copyright law, no parts of this document

may be reproduced without the expressed permission of the author.


An Abstract of

A BIST Architecture for Testing LUTs in a Virtex-4 FPGA

by

Priyanka Gadde

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the

Master of Science Degree in Electrical Engineering

The University of Toledo

December 2013

Field Programmable Gate Arrays (FPGAs) are programmable logic devices that

can be used to implement a given digital design. Built-In Self-Test (BIST) is a testing

technique that enables the device to test itself without the need for any external test

equipment. The re-programmability feature of the FPGAs makes BIST a favorable

approach for testing FPGAs because it eliminates any area or performance degradation

associated with BIST.

To ensure proper operation of the Look-Up Tables (LUTs) in Xilinx Virtex-4 FPGAs, a dependable and resource-efficient test technique is needed to verify the functional operation of the LUT memory. Traditional BIST techniques for FPGAs require a large number of logic resources and long test times to implement and test the circuit.

The work presented in this research simplifies the BIST architecture and reduces the test time required to test the Look-Up Tables in a Virtex-4 FPGA. The proposed technique is capable of detecting the following types of memory faults in an SRAM-based FPGA: stuck-at fault, transition fault, address decoder fault, incorrect read fault, read destructive fault, deceptive read destructive fault, data retention fault, state coupling fault, transition coupling fault, incorrect read coupling fault, read destructive coupling fault, and deceptive read destructive coupling fault.

This Thesis is dedicated to my Grandparents, for all their Love and Support.


Acknowledgements

I would like to thank Dr. Mohammad Niamat for giving me an opportunity to work under his leadership and for guiding me with his valuable advice. I would also like to thank Dr. Mansoor Alam and Dr. Weiqing Sun for serving on my thesis committee, and the Department of Electrical Engineering and Computer Sciences for partially funding my Master’s degree.

I would like to thank my grandparents Mr. Adinarayana and Mrs. Jayasri and my

parents, Mr. Ramprasad and Mrs. Karuna for their constant love, support, understanding,

encouragement and for always being my source of motivation. I am very grateful to them

for their sacrifices and efforts that made this thesis possible. I would love to thank my

sister Hema Prasanthi for being there to share my happiness, cheer me up in tough times

and being my best friend always.

My acknowledgments would be incomplete without thanking my friends. Primarily, I would like to thank Pradyuma Thayi for being my best companion and for helping and guiding me throughout my Master’s. I would like to thank my friends Aditya, Ahmad, Anu, Jayaram, Karthik, Prem, Sandeep, Swetha, and Teja for all their encouragement at every step, and my Uncle Madhusudan and Aunt Padmaja for all their love and support.


Contents

Abstract

Acknowledgements

Table of Contents

List of Tables

List of Figures

1 Introduction

1.1 Field Programmable Gate Arrays

1.2 Built in Self-Test (BIST)

1.3 Advantages of BIST

1.4 Disadvantages of BIST

1.5 Literature Survey

1.6 Organization of Thesis

2 Fault Types and Algorithms

2.1 Introduction

2.2 SRAM Cell

2.3 Functional Model

2.4 Electrical Structure for SRAMs

2.5 SRAM Read and Write Circuitries

2.6 Faults

2.6.1 SRAM Memory Faults

2.7 Analysis of Faults in SRAM Cell

2.8 Advanced Memory Test

2.9 MATS and MATS+ Algorithms

2.10 March C- Algorithm

2.11 Extended March C- Algorithm

2.12 March Tests

2.13 Selection of the Testing Algorithm

3 SRAM Based FPGA

3.1 Introduction

3.2 Anatomy of the FPGA

3.3 Benefits and Drawbacks of FPGAs

3.4 FPGA Applications

3.5 FPGA Device Manufacturers

3.6 SRAM Programmable Virtex-4 FPGA

3.6.1 I/O Blocks

3.6.2 Block RAM Modules (BRAMs)

3.6.3 Cascadable Embedded XtremeDSP Slices

3.6.4 Digital Clock Managers (DCMs)

3.6.5 Configurable Logic Blocks (CLBs)

3.7 Need for Testing FPGAs

4 Proposed Architecture for Testing Look up Tables in a Virtex-4 FPGA

4.1 Test Pattern Generator (TPG)

4.2 Circuit Under Test (CUT) and Output Response Analyzer (ORA)

4.3 BIST Architecture

4.4 Fault Modeling and Detection using Extended March C- Algorithm

4.5 Pseudo Code

4.6 Fault Modeling and Detection

4.6.1 Stuck-at Fault

4.6.2 Transition Fault

4.6.3 Address Decoder Fault

4.6.4 Incorrect Read Fault

4.6.5 Read Destructive Fault

4.6.6 Deceptive Read Destructive Fault

4.6.7 Data Retention Fault

4.6.8 Coupling Faults

5 Simulation Results and Performance Analysis

5.1 Introduction

5.2 Simulation Results

5.3 Simulations without Faults

5.4 Stuck-at 1 Fault

5.5 Stuck-at 0 Fault

5.6 Up-Transient Fault

5.7 Down-Transient Fault

5.8 Address Decoder Fault

5.9 Incorrect Read Fault

5.10 Read Destructive Fault

5.11 Deceptive Read Destructive Fault

5.12 Data Retention Fault

5.13 State Coupling Fault

5.14 Up-Transient Coupling Fault

5.15 Down-Transient Coupling Fault

5.16 Incorrect Read Coupling Fault

5.17 Read Destructive Coupling Fault

5.18 Deceptive Read Destructive Coupling Fault

5.19 Analysis of Results

6 Conclusion

6.1 Contributions

6.2 Future work

References


List of Tables

Table 2.1. Characteristics of different Memory Architectures

Table 2.2. List of other March Tests

Table 3.1. Logic Resources in a CLB

Table 3.2. ROM Configurations

Table 4.1. The Test Patterns Generated by the TPG

Table 5.1. ORA outputs

Table 5.2. Fault Coverage

List of Figures

Figure 1-1: Simple BIST scheme.

Figure 1-2: Huang's Interconnection scheme.

Figure 1-3: Lalla’s proposed Interconnection Scheme.

Figure 2-1: SRAM Memory Model.

Figure 2-2: 6T SRAM Cell.

Figure 2-3: Read Circuitry.

Figure 2-4: Single-ended Voltage Sense Amplifier.

Figure 2-5: State Diagram of a Fault Free Cell.

Figure 2-6: State Diagram of (a) SA0 Fault and (b) SA1 Fault.

Figure 2-7: Up-Transient Fault.

Figure 2-8: State Diagram of Down-Transient Fault.

Figure 2-9: Address Decoder Faults.

Figure 2-10: State diagram for Incorrect Read Fault.

Figure 2-11: State Diagram for Read Destructive Fault.

Figure 2-12: State diagram for Deceptive Read Destructive Fault.

Figure 2-13: State diagram for Data Retention Fault.

Figure 2-14: State Diagram for State Coupling Fault.

Figure 2-15: State Diagram for Transient Coupling Fault.

Figure 2-16: State Diagram for Incorrect Read Coupling Fault.

Figure 2-17: State Diagram for Read Destructive Coupling Fault.

Figure 2-18: State Diagram for Deceptive Read Destructive Coupling Fault.

Figure 2-19: Defects Injected into SRAM Core cell.

Figure 2-20: MATS+ Algorithm.

Figure 2-21: March C- Algorithm.

Figure 2-22: Extended March C- Algorithm.

Figure 3-1: Basic FPGA Architecture.

Figure 3-2: CLB Architecture.

Figure 3-3: Distributed RAM.

Figure 3-4: Representation of a Shift Register.

Figure 3-5: Representation of MUX F5 and MUX FX Multiplexers.

Figure 4-1: Slice L [42].

Figure 4-2: Slice M [42].

Figure 4-3: Detailed Diagram for a UP Counter.

Figure 4-4: Detailed Diagram for a Down Counter.

Figure 4-5: XOR operation of a Down Counter.

Figure 4-6: Extended March Algorithm.

Figure 4-7: Comparator Based ORA Architecture.

Figure 4-8: Comparator Operation.

Figure 4-9: Proposed Architecture.

Figure 4-10: Interconnection Scheme of the Proposed Architecture.

Figure 4-11: Circular Comparison BIST Architecture.

Figure 4-12: Model of Stuck-at Fault.

Figure 4-13: Model of Transition Fault.

Figure 4-14: Address Decoder with Stuck-at Faults.

Figure 4-15: Model of Address Decoder Fault.

Figure 4-16: Model of Incorrect Read Fault.

Figure 4-17: Model of Read Destructive Fault.

Figure 4-18: Model of Deceptive Read Destructive Fault.

Figure 4-19: Model of Coupling Fault.

Figure 5-1: Fault free simulation of M0 Operation.

Figure 5-2: Fault free simulation of M1 operation.

Figure 5-3: Fault free simulation of M2 operation.

Figure 5-4: Fault free simulation of M3 Operation.

Figure 5-7: Stuck-at 1 Fault at CLB#3 during M1 operation.

Figure 5-8: Stuck-at 1 Fault at CLB#3 during M3 operation.

Figure 5-9: Stuck-at 1 Fault at CLB#3 during M5 operation.

Figure 5-10: Stuck-at 0 Fault at CLB#3 during M2 operation.

Figure 5-11: Up-Transient fault at CLB#2 during M2 operation.

Figure 5-12: Down-Transient fault at CLB#2 during M3 operation.

Figure 5-13: Address Decoder fault at CLB#2 during M3 operation.

Figure 5-14: Address Decoder fault at CLB#3 during M1 operation.

Figure 5-15: Incorrect Read Fault at CLB#1 during M1 operation.

Figure 5-16: Read Destructive Fault at CLB#1 during M1 operation.

Figure 5-17: Deceptive Read Destructive Fault at CLB#3 during M4 operation.

Figure 5-18: Data Retention Fault at CLB#3 during M4 operation.

Figure 5-19: State Coupling Fault at CLB#3 during M1 operation.

Figure 5-20: State Coupling Fault at CLB#3 during M1 operation.

Figure 5-21: Up-Transient Coupling Fault at CLB#4 during M1 operation.

Figure 5-22: Down-Transient Coupling Fault at CLB#1 during M3 operation.

Figure 5-23: Incorrect Read Coupling Fault at CLB#1 during M1 operation.

Figure 5-24: Incorrect Coupling Fault at CLB#1 during M2 operation.

Figure 5-25: Read Destructive Coupling Fault at CLB#3 during M1 operation.

Figure 5-26: Read Destructive Coupling Fault at CLB#1 during M2 operation.

Figure 5-27: Deceptive Read Destructive Coupling Fault at CLB#4 during M4 operation.


Chapter 1

1 Introduction

1.1 Field Programmable Gate Arrays

A Field Programmable Gate Array (FPGA) is an integrated circuit that can be configured by the user in the field, unlike devices such as Application Specific Integrated Circuits (ASICs), which are configured by the manufacturer [1]. FPGAs contain Configurable Logic Blocks (CLBs) and Random Access Memories (RAMs) that allow the user to implement combinational or sequential logic functions. Some FPGAs can also be partially reprogrammed at run time, which makes it possible to implement reconfigurable hardware circuits. Because of these versatile features, FPGAs are in great demand for military and space applications. However, FPGAs may be prone to errors when they are subjected to severe environmental conditions such as exposure to gamma radiation. With the proliferation of FPGAs in system-critical applications, testing FPGAs before programming them is becoming a necessity.

Testing an FPGA is a complex task since it involves testing both logic functions and interconnections. New testing schemes are being developed to decrease the overhead circuitry cost and test time while increasing the fault coverage. In general, testing is carried out by applying a test vector to the circuit and comparing its output with the expected output. With the decrease in feature size and the increase in device complexity, large test vectors are required to test a circuit, and external circuitry might be needed to store all the test configurations. Built-In Self-Test (BIST) overcomes the problems of large test vectors and external circuitry by testing the circuit with components on board the FPGA.

1.2 Built in Self-Test (BIST)

The need for an efficient and economical testing method such as the Built-In Self-

Test (BIST) increases with the increase in complexity of Very Large Scale Integration

(VLSI) devices [2]. The idea behind BIST is to design a circuit that is capable of

verifying itself as being either faulty or fault-free and then continue its operation when

the testing is not being carried out.

As shown in Figure 1-1, a simple BIST scheme contains three major components [3] [4]:

Test Pattern Generator (TPG)

Circuit Under Test (CUT)

Output Response Analyzer (ORA)

Figure 1-1: Simple BIST scheme.

The TPG serves as a stimulus to the CUT, producing a sequence of patterns that

will cause the CUT to generate an expected output. The result from the CUT is analyzed

by an ORA. Depending on whether an ORA receives the expected output or an erroneous

one, it generates some sort of pass/fail indication [1] [3]. For the system level

implementation, components such as an Isolation Circuitry and a Test Controller are

needed. The isolation circuit can be a 2:1 multiplexer which switches between normal

operation and BIST. The test controller ensures that all the components in the BIST

circuit are initialized to prevent any unknown data from entering into an ORA. The BIST

scheme contains an output bit to indicate the status (pass/fail) of the system to an external

device. Optionally, BIST start and done flags are used to indicate the start and end of a

test sequence. The effectiveness of a BIST test is determined by the number of faults that

are detected compared to the total number of faults possible in a system (fault coverage)

and the test time [5] [6].
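The TPG, CUT, and ORA roles described above can be sketched in a few lines of Python. This is a minimal illustrative model, not the VHDL implementation used in this research: the 16-cell LUT, the function names (`tpg`, `make_cut`, `ora`, `run_bist`, `fault_coverage`), and the stuck-at fault-injection hook are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

LUT_SIZE = 16  # assumed: a 4-input LUT with 16 one-bit cells


def tpg() -> range:
    """Test Pattern Generator: an up counter over all LUT addresses."""
    return range(LUT_SIZE)


def make_cut(contents: List[int],
             stuck_at: Optional[Dict[int, int]] = None) -> Callable[[int], int]:
    """Circuit Under Test: a LUT read port, optionally with stuck-at cells.

    stuck_at maps an address to the value that cell is stuck at
    (hypothetical fault-injection hook, for illustration only).
    """
    faults = stuck_at or {}

    def read(address: int) -> int:
        return faults.get(address, contents[address])

    return read


def ora(expected: List[int], observed: List[int]) -> bool:
    """Output Response Analyzer: True means pass, False means fail."""
    return expected == observed


def run_bist(contents: List[int], cut: Callable[[int], int]) -> bool:
    """Apply every TPG pattern to the CUT and let the ORA judge the responses."""
    observed = [cut(address) for address in tpg()]
    return ora(contents, observed)


def fault_coverage(detected: int, total: int) -> float:
    """Fraction of the modeled faults that the test detects."""
    return detected / total


golden = [a % 2 for a in range(LUT_SIZE)]  # alternating 0/1 contents
fault_free_pass = run_bist(golden, make_cut(golden))

# Inject every single stuck-at fault in turn and count the failing runs.
faults = [(a, v) for a in range(LUT_SIZE) for v in (0, 1)]
detected = sum(not run_bist(golden, make_cut(golden, {a: v})) for a, v in faults)
coverage = fault_coverage(detected, len(faults))
```

In this sketch a single stored pattern exposes only half of the injected stuck-at faults (a coverage of 0.5), because a cell stuck at the value it already holds still reads correctly. This is one reason memory tests write and read both logic values in every cell.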

1.3 Advantages of BIST

Given that BIST enables a circuit to test itself, the main advantages of BIST are:


A device can be validated at any stage of production, which is known as Vertical Testability.

BIST is a lower-cost technique compared to external testing using an Automatic Test Pattern Generator (ATPG).

BIST uses the system’s internal clock for at-speed testing, which enables it to detect components that cause excessive delay in an otherwise working circuit.

Testing at high speed helps reduce the test time.

The circuit can be tested in the field by the user.

Pseudorandom patterns help detect unmodeled defects in a circuit.

1.4 Disadvantages of BIST

Disadvantages of implementing BIST include:

Additional design time.

Applying pseudorandom patterns can send illegal patterns to signals that have constraints on the set of logic values they can take.

An experienced BIST design engineer is required.

Additional circuitry increases the overall cost of the chip.

Despite these drawbacks, studies [7] [8] have shown that the benefits gained from BIST outweigh its implementation costs. BIST is well suited to FPGAs because of their re-programmable nature: the BIST circuitry can be programmed into the FPGA, the circuit can be tested, and the chip can then be reprogrammed to its original function. Because of the abundant logic resources available in an FPGA, BIST structures can be easily implemented. In this research, a Virtex-4 FPGA is used for implementation.

1.5 Literature Survey

BIST techniques have been implemented for testing embedded memory [9] [10]. Techniques that use external test equipment increase the area overhead on the chip [11]. Therefore, it is advantageous to use the reprogrammability inherent in FPGAs. An additional advantage of reprogrammability is that, after testing, the BIST logic can be removed and the circuit can be configured for its normal operation, which avoids the problem of permanent area overhead. Because of these advantages, BIST techniques have been widely implemented to test various ICs, including System-on-Chips and FPGAs [7] [12-15]. There has been considerable research on developing BIST techniques for the programmable logic resources in an FPGA, including CLBs [16] and the interconnect matrix of routing resources [17]. Testing of the embedded SRAM modules of FPGAs has been addressed in [18-22], with each study proposing a different testing scheme.

Abramovici and Stroud [16] presented a BIST architecture to test CLBs in an

FPGA. In the proposed scheme, a group of CLBs are configured to generate pseudo-

exhaustive test patterns to test the circuit and a group of CLBs are configured to compare

the outputs. Each testing session covers only half of the CLBs in an FPGA and another

session is required to test the other half.


In [18], Huang proposes connecting the output of the first module to the input of the second module using N test configurations. This method achieves full controllability, but it is very time consuming. Figure 1-2 shows the proposed scheme. The scheme uses a single chain of connected CLBs, which increases the time taken to detect faults: a fault must traverse n-1 arrays in a row before it can be observed. This is the main drawback of the scheme.

Figure 1-2: Huang’s Interconnection scheme.

In [20], Renovell proposes a pseudo-register interconnection scheme to test a 4-input RAM module using a single test configuration. This method guarantees full controllability and observability of all the SRAM modules. The output of each LUT/RAM module is connected to the data input of the next SRAM module in the chain. To propagate data from a memory location X to Y, the data is first read and then written to the same address location in the RAM module of the next CLB. If there are ‘n’ CLBs in a chain, it takes ‘n’ read operations to read a particular memory address. Similarly, it requires ‘n’ write operations to write to a particular memory address in all SRAM modules. Despite these advantages, the main disadvantage of this scheme is that it cannot locate the faulty CLB in the chain.
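The chained read-then-write propagation can be illustrated with a small Python model. It is only a sketch under stated assumptions: the function `propagate`, the 16-cell module size, and the single stuck-at cell used as the injected fault are hypothetical and are not taken from [20].

```python
from typing import Dict, Optional

MODULE_SIZE = 16  # assumed: each LUT/RAM module has 16 one-bit cells


def propagate(chain_length: int, address: int, data_in: int,
              faulty_module: Optional[int] = None, stuck_value: int = 0) -> int:
    """Shift data_in through the chain at one address.

    Each module is written with the value read from the previous module,
    and only the output of the last module is observed.
    """
    modules = [[0] * MODULE_SIZE for _ in range(chain_length)]
    value = data_in
    for index, module in enumerate(modules):
        module[address] = value              # write into this module
        if index == faulty_module:
            module[address] = stuck_value    # hypothetical stuck-at cell
        value = module[address]              # read feeds the next data input
    return value


n = 4
good = propagate(n, address=5, data_in=1)

# Place the same fault in each module in turn and record the chain output.
observations: Dict[int, int] = {
    m: propagate(n, address=5, data_in=1, faulty_module=m, stuck_value=0)
    for m in range(n)
}
```

Every entry in `observations` is the same wrong value, so the chain output reveals that a fault exists but not which module holds it, which matches the drawback noted above.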


In [21], a new Split Array Technique (SAT) was introduced by Nemade and later developed by Lalla in [22]. The SAT scheme is proposed to reduce the fault detection time and to make efficient use of I/O pins. The entire FPGA is divided into two halves and tested for various faults. Figure 1-3 shows the proposed interconnection scheme. In this scheme, the TPG provides the test vectors that are applied to the circuit, and the outputs are then analyzed by a response analyzer. The drawback of this scheme is that it uses almost two complete CLBs to test a portion of a CLB.

Figure 1-3: Lalla’s proposed Interconnection Scheme.

Most of the research in [5-8] focuses on testing embedded RAM modules for the presence of classic faults. The current research proposes a new BIST scheme that overcomes the drawbacks of the above-mentioned schemes and tests SRAM memories for the presence of single-cell and coupling fault models (namely, Stuck-at Fault; Transition Fault; Address Decoder Fault; Incorrect Read Fault; Read Destructive Fault; Deceptive Read Destructive Fault; Data Retention Fault; Transition Coupling Fault; State Coupling Fault; Incorrect Read Coupling Fault; Read Destructive Coupling Fault; and Deceptive Read Destructive Coupling Fault). An optimized March C- algorithm is applied to detect the faults. The selection of this algorithm is justified in Section 2.13. The Xilinx Virtex-4 series FPGA is used as a model for implementing the algorithm to detect the above-mentioned faults. VHDL is used to model the FPGA, and simulation results are presented to verify the system.
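For orientation, the textbook March C- test can be sketched in Python as six march elements, M0 to M5, applied to a small one-bit-wide memory. This is the standard algorithm, not the optimized variant developed in this thesis, and the `FaultyMemory` class, its fault-injection parameters, and `run_march` are hypothetical names introduced only for this example.

```python
from typing import List, Optional, Tuple


class FaultyMemory:
    """One-bit-wide memory with at most one injected fault (illustrative)."""

    def __init__(self, size: int = 16,
                 stuck_at: Optional[Tuple[int, int]] = None,
                 no_rise: Optional[int] = None) -> None:
        self.cells: List[int] = [0] * size
        self.stuck_at = stuck_at  # (address, value) of a stuck-at cell
        self.no_rise = no_rise    # address of a cell that cannot go 0 -> 1
        if stuck_at is not None:
            self.cells[stuck_at[0]] = stuck_at[1]

    def write(self, address: int, value: int) -> None:
        if self.stuck_at is not None and self.stuck_at[0] == address:
            return  # a stuck-at cell ignores every write
        if self.no_rise == address and self.cells[address] == 0 and value == 1:
            return  # up-transition fault: the 0 -> 1 write fails
        self.cells[address] = value

    def read(self, address: int) -> int:
        return self.cells[address]


# March C-: {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}
MARCH_C_MINUS = [
    ("any", [("w", 0)]),
    ("up", [("r", 0), ("w", 1)]),
    ("up", [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
    ("any", [("r", 0)]),
]


def run_march(memory: FaultyMemory, size: int = 16) -> Optional[int]:
    """Return the index of the first failing march element, or None on pass."""
    for index, (order, operations) in enumerate(MARCH_C_MINUS):
        addresses = range(size - 1, -1, -1) if order == "down" else range(size)
        for address in addresses:
            for operation, value in operations:
                if operation == "w":
                    memory.write(address, value)
                elif memory.read(address) != value:
                    return index
    return None


fault_free = run_march(FaultyMemory())
stuck_at_1 = run_march(FaultyMemory(stuck_at=(3, 1)))
stuck_at_0 = run_march(FaultyMemory(stuck_at=(3, 0)))
up_transition = run_march(FaultyMemory(no_rise=2))
```

In this model the stuck-at-1 cell is caught by the read in M1, while the stuck-at-0 cell and the up-transition fault are caught by the read in M2, because only then is a 1 expected from the faulty cell.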

1.6 Organization of Thesis

The organization of this thesis is as follows:

Chapter 2 describes various memory faults and the different testing algorithms used to

test them. Chapter 3 gives an overview of Virtex-4 series FPGA Architecture. Chapter 4

discusses the proposed BIST Architecture as well as the implementation of BIST using

an Extended March C- algorithm. Chapter 5 shows the simulation results. Chapter 6

presents the conclusion and suggestions for future work.


Chapter 2

2 Fault Types and Algorithms

2.1 Introduction

For the last decade, semiconductor memory devices have been shown to offer the highest performance and versatility among all types of memories (floppy disks, CDs, etc.) [23]. These memories are classified as Read Only Memories (ROMs) and Random Access Memories (RAMs). ROMs are programmed memory devices that always give the same output, while RAMs are memory devices in which any cell can be accessed for read and write operations. ROMs have two erasable variants: Erasable Programmable ROMs (EPROMs), which are erased with ultraviolet light, and Electrically Erasable Programmable ROMs (EEPROMs), which are erased electrically. RAMs are classified into Dynamic RAMs (DRAMs) and Static RAMs (SRAMs). DRAMs store their information as charge on a capacitor; they have high density but slow access times. Inherently, DRAMs suffer from leakage currents, which cause their cells to lose charge over time. In order to maintain the data in a cell, DRAMs need to be refreshed from time to time (typically every 64 ms).


The word ‘dynamic’ refers to the fact that the data stored in the DRAM cell has to be

refreshed after a given period of time [24].

SRAMs are constructed from bistable multivibrator circuits, that is, circuits that have two different stable states. Each state represents a given logical level, ‘1’ or ‘0’. The word ‘static’ refers to the fact that when the cell is forced into a certain state, it will stay in it as long as the memory is kept in contact with the power supply. SRAMs offer the fastest access of the memories discussed here (typically 2 ns).

Hybrid memories combine the features of both RAMs and ROMs. Table 2.1 shows the characteristics of the various memory types that are widely used in the industry [25]. This research is focused on testing SRAMs.

Table 2.1. Characteristics of different Memory Architectures

Memory Type | Volatile | Writeable | Speed | Erase Size
PROM | No | Yes, once with a device programmer | Fast | N/A
EPROM | No | Yes, multiple times with a device programmer | Fast | Entire chip
EEPROM | No | Yes | Fast to read, slow to write | Byte
DRAM | Yes | Yes | Fast | Byte
SRAM | Yes | Yes | Fast | Byte
Flash | No | Yes | Fast to read, slow to write | Sector

The most popular hybrid memories are Flash Memories and Phase Change

Memories (PCMs). Flash Memories are low cost and non-volatile memory devices. They

are used extensively in embedded systems. PCM is a type of non-volatile random-access

memory. It has high storage capacity and is small in size, but the greatest challenge for


PCM has been the requirement of high programming current density. Also, this memory

is still in the research phase.

2.2 SRAM Cell

SRAM has excellent read and write speeds, integrates readily into the process technology of embedded applications, requires little power for data retention, and does not need refresh logic to maintain its data.

2.3 Functional Model

An SRAM memory consists of a memory cell array, two address decoders, read/write circuits, data flow, and control circuits, as shown in Figure 2-1.

Figure 2-1: SRAM Memory Model.

The memory cell array is the basic part of the memory. It consists of ‘n’ cells,

which are organized as an array of R rows and C columns. The memory cell capacity is

determined by the number of rows and columns (RxC bits). The number of rows is not

restricted and it can be any integer whereas the number of columns is restricted. There is

always an integer number of words in one row.


The address is divided into high-order and low-order bits and is decoded by two address decoders. The high-order bits are connected to the row decoder, while the low-order bits are connected to the column decoder, and these decoders select the appropriate rows and columns respectively. The number of columns determines the number of bits that can be accessed during a read/write operation.
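The row/column split described above can be illustrated with a minimal Python sketch. The function name and the `column_bits` parameter are assumptions introduced for illustration only; they do not come from the thesis or from any vendor tool.

```python
def split_address(address: int, column_bits: int) -> tuple[int, int]:
    """Split a flat address into (row, column) select values.

    The high-order bits go to the row decoder and the low-order bits go to
    the column decoder. `column_bits` is an assumed parameter.
    """
    if address < 0:
        raise ValueError("address must be non-negative")
    row = address >> column_bits                 # high-order bits -> row decoder
    column = address & ((1 << column_bits) - 1)  # low-order bits -> column decoder
    return row, column


# Example: a 6-bit address with 2 column bits (16 rows x 4 columns).
print(split_address(0b101101, column_bits=2))  # (11, 1)
```

Shifting and masking are used here only because they mirror the way the address bus is physically partitioned between the two decoders.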

To read the memory cells, appropriate row and column select lines must be

selected. The contents of the selected memory cells are amplified by the read circuits,

loaded on to the data registers, and presented on the output lines. Conversely, during a

write operation, the data on the data lines is loaded into the data registers and written into

the selected cells through the write circuits.

2.4 Electrical Structure for SRAMs

A memory cell is the basic part of the memory whose design depends on various

factors including the memory application and the implementation style. A standard

SRAM memory cell is a bi-stable circuit being driven into one of two states ‘0’ and ‘1’.

After removing the trigger, the circuit remains in its state. A standard SRAM cell with 6

transistors is shown in the Figure 2-2. The 6T SRAM cell consists of two load elements

LT1 and LT2, two storage elements ST1 and ST2, and two pass transistors PT1 and PT2.

Transistor ST1 forms an inverter with LT1 and transistor ST2 forms an inverter with LT2.

These two inverters are cross coupled forming a latch. This latch can be accessed for read

and write operations.

Figure 2-2: 6T SRAM Cell.

Data can be written into the cell by driving the bit line BL with the value given by Data-in and the complementary bit line (BL-bar) with its complement. Also, to perform a write operation, the Word Line (WL) should be driven high. Since the two bit lines are driven with more force than the force with which the cell retains its information, the memory cell is forced into the state presented on these lines.

To read data from a cell, the bitlines need to be pre-charged to a high voltage

level, after which the desired WL is driven high. At this time the data in the cell will

discharge one of the bitlines. This creates a difference in voltage levels between the two

bitlines which is amplified by the read circuitry and read out through the data register.


2.5 SRAM Read and Write Circuitries

Once a particular cell has been selected by the Address Decoder, circuitry is required to write and read the cell. Typical write circuitry is shown in Figure 2-3 (a) and (b). Figure 2-3 (a) consists of a pair of inverters and a pass gate with a write enable control, while Figure 2-3 (b) consists of a pair of NAND gates. The data to be written, ‘Data In’, is presented on BL and its complement BL-bar.

Figure 2-3: Write Circuitry.

The read circuitry is more complex than the write circuitry and depends on the

type of memory cell and the technique used to transmit the signal. A memory cell can be single-ended or differential, and it can use a voltage-mode or a current-mode technique to transmit the signal. Figure 2-4 shows a sample voltage-mode single-ended sense amplifier.

In the figure, when the data on BL is ‘1’, the transistor N1 turns on, and the

transistor P2 gives an output ‘1’ at the ‘out’ line. Similarly, when the data on BL is ‘0’,

the transistor N2 turns on, and gives an output ‘0’. Using these circuitries, the data can be read from a cell. If there is any delay in reading data there may be a read fault. Also, the

resistive effects between transistors may lead to different faults [26].

Figure 2-4: Single-ended Voltage Sense Amplifier.

2.6 Faults

A defect is an imperfection in a circuit that, depending on the abstraction level,

can be modeled as a fault. A fault is identified when a difference is observed between the

observed and expected response in the circuit. Fault detection means discovering the

existence of the fault.

A simple way to categorize the faults is according to the way they manifest

themselves in time. For example, faults can be categorized as permanent and temporary

faults. Permanent faults affect the functionality of the system permanently; these faults

usually occur during the manufacturing process or in the early life cycle of FPGAs. For

example, the presence of broken components or design errors could cause such faults.


Temporary faults can be caused by transient or intermittent disturbances that are present

only for a short period of time. For example, exposure to cosmic rays, high temperature

conditions, aging components, wear out failures, or power supply fluctuations can result

in temporary faults. Detecting either type of fault is not a trivial task, as the feature size of semiconductor devices keeps shrinking.

Fault detection in a logic circuit is carried out by applying a series of test patterns

and observing the resulting outputs [27]. When the number of the test sequences and the

number of components used to implement the testing circuit increases, the cost of testing

the circuit increases. One of the main objectives of testing the circuit is to minimize the

length of the test sequence so as to reduce cost. For example, a combinational circuit with

‘n’ inputs can be tested by applying 2^n test vectors to it. The number of test patterns

increases exponentially as the value of ‘n’ increases. Hence, to reduce the size of the test

patterns, optimization of the test pattern is required; that is, the input pattern that detects

most of the faults in the circuit needs to be identified [28].
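The exponential growth of an exhaustive test set, and the idea of keeping only the vectors that expose a fault, can be sketched in Python as follows. The 3-input AND circuit and its faulty copy are hypothetical examples chosen for illustration; they are not circuits from this research.

```python
from itertools import product


def exhaustive_vectors(n):
    """Return all 2^n input vectors for an n-input combinational circuit."""
    return list(product((0, 1), repeat=n))


def detecting_vectors(good, faulty, n):
    """Vectors on which the faulty circuit's output differs from the good one."""
    return [v for v in exhaustive_vectors(n) if good(*v) != faulty(*v)]


# Hypothetical circuit: 3-input AND; in the faulty copy input `a` is stuck at 1.
def good_and(a, b, c):
    return a & b & c


def faulty_and(a, b, c):
    return 1 & b & c


print(len(exhaustive_vectors(3)))                   # 8
print(detecting_vectors(good_and, faulty_and, 3))   # [(0, 1, 1)]
```

In this example only one of the eight vectors detects the injected fault, which is the kind of observation that motivates reducing the test set.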

2.6.1 SRAM Memory Faults

Faults in SRAM memories are classified into two categories:

1. Simple Faults: Faults that involve one cell are simple faults. These faults cannot

influence the behavior of each other such that masking cannot occur. Some

examples of single cell faults are stuck-at faults and transition faults.

2. Coupling Faults: Faults that involve neighboring cells are called coupling faults.

These faults influence the behavior of the other cells such that masking can occur.

These faults have the property that the cell which sensitizes the fault is different


from the cell in which the fault appears. Some examples are state coupling faults

and transition coupling faults.

If the actual output from a circuit is the same as the expected output, then the cell

is considered fault free. Figure 2-5 shows the state diagram of a fault free memory cell.

S0 is the state when the cell contains logic ‘0’ and S1 is the state when the cell contains

logic ‘1’ [29].

Figure 2-5: State Diagram of a Fault Free Cell.

During the normal fault free operation,

When the cell is in state S0, a write 0 operation (denoted by w0) causes the cell to

remain in the same state, while the write 1 operation (denoted by w1 ) causes the

cell to undergo a transition from ‘0’ to ‘1’.

When the cell is in state S1, a w1 operation causes the cell to remain in the same

state and a w0 operation causes the cell to undergo a transition from ‘1’ to ‘0’.
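The fault-free behavior described above amounts to a two-state machine. A minimal Python sketch of its transition table is given below; the dictionary layout and the helper name `apply_writes` are illustrative assumptions.

```python
# Transition table of the fault-free cell: (state, operation) -> next state.
FAULT_FREE = {
    ("S0", "w0"): "S0",  # writing 0 to a cell holding 0 leaves it unchanged
    ("S0", "w1"): "S1",  # up transition
    ("S1", "w1"): "S1",  # writing 1 to a cell holding 1 leaves it unchanged
    ("S1", "w0"): "S0",  # down transition
}


def apply_writes(state, operations):
    """Apply a sequence of write operations and return the final state."""
    for op in operations:
        state = FAULT_FREE[(state, op)]
    return state


print(apply_writes("S0", ["w1", "w1", "w0"]))  # S0
```

The fault models in the following sections can be read as modifications of this table, in which one or more of the four transitions behaves differently.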


2.6.1.1 Simple Faults

There are various fault modes that need to be considered in SRAM memories. In this research, faults that may occur in the address decoder, read/write circuitry, and memory cell array of the SRAM core are considered [30] [31].

2.6.1.1.1 Stuck-at Fault (SF)

A stuck-at fault occurs when the logic value of the cell is always ‘0’ or ‘1’. If the

value of the cell is always ‘0’ then it is a stuck-at 0 fault (SF0), and if the value of the cell

is always ‘1’ then it is a stuck-at 1 fault (SF1). Figure 2-6 shows the state diagram for

SF0 and SF1 faults.

In the case of an SF0, as shown in Figure 2-6, a w1 operation on the cell in state S0 does not change the content of the cell. Similarly, in the case of an SF1, a w0 operation on the cell in state S1 does not change the content of the cell. An SF0 is detected by a w1 operation followed by an r1 operation, while an SF1 is detected by a w0 operation followed by an r0 operation.
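As a rough illustration of stuck-at detection, the following Python sketch writes a value to a cell and reads it back. The `Cell` class and the `detect_stuck_at` helper are hypothetical models introduced here, not part of the BIST hardware described in this thesis.

```python
class Cell:
    """Single memory cell; stuck_at of 0 or 1 models SF0 or SF1, None is fault free."""

    def __init__(self, stuck_at=None):
        self.stuck_at = stuck_at
        self.value = 0 if stuck_at is None else stuck_at

    def write(self, value):
        if self.stuck_at is None:  # a stuck cell ignores every write
            self.value = value

    def read(self):
        return self.value


def detect_stuck_at(cell):
    """Write 1 and read it back, then write 0 and read it back."""
    cell.write(1)
    if cell.read() != 1:
        return "SF0"
    cell.write(0)
    if cell.read() != 0:
        return "SF1"
    return None


print(detect_stuck_at(Cell()), detect_stuck_at(Cell(0)), detect_stuck_at(Cell(1)))
# None SF0 SF1
```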

Figure 2-6: State Diagram of (a) SA0 Fault and (b) SA1 Fault.

2.6.1.1.2 Transition Faults (TFs)

A transition fault is a special case of a stuck-at fault in which a cell fails to undergo a transition from ‘0’ to ‘1’ (up transition) or a transition from ‘1’ to ‘0’ (down transition). When the cell fails to make the transition from ‘1’ to ‘0’, it should not be mistaken for a stuck-at fault, because the cell can take and store the value 1 if a 0 has not yet been written to the cell. Figure 2-7 and Figure 2-8 show the state diagrams for up-transient and down-transient faults.

Figure 2-7: Up-Transient Fault.

Figure 2-8: State Diagram of Down-Transient Fault.

A test that detects transition faults must drive each cell through an up transition and a down transition, and must read the cell after each transition before performing any further operations.
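The requirement above can be sketched in Python with a simple behavioral cell model. The class `TransitionFaultCell` and the sequence w0, w1, r1, w0, r0 used in `transition_test` are illustrative assumptions, not the test sequence implemented in this research.

```python
class TransitionFaultCell:
    """Cell that fails one transition: 'up' (0 -> 1), 'down' (1 -> 0), or None."""

    def __init__(self, failing=None, initial=0):
        self.failing = failing
        self.value = initial

    def write(self, value):
        if self.failing == "up" and self.value == 0 and value == 1:
            return  # the up transition silently fails
        if self.failing == "down" and self.value == 1 and value == 0:
            return  # the down transition silently fails
        self.value = value

    def read(self):
        return self.value


def transition_test(cell):
    """Apply w0, w1, r1, w0, r0 and report which transition failed, if any."""
    cell.write(0)
    cell.write(1)          # attempt the up transition
    if cell.read() != 1:   # read immediately after the transition
        return "up"
    cell.write(0)          # attempt the down transition
    if cell.read() != 0:
        return "down"
    return None


print(transition_test(TransitionFaultCell("up")))    # up
print(transition_test(TransitionFaultCell("down")))  # down
print(transition_test(TransitionFaultCell()))        # None
```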

2.6.1.1.3 Address Decoder Fault

Address Decoder Faults are critical as a wrong address generated can result in

addressing a completely different set of data in a memory cell. Faulty address decoders

can result in the following:

No cell is accessed with a certain address.

A cell cannot be accessed with any address.

More than one cell can be accessed simultaneously.

Figure 2-9: Address Decoder Faults.

To detect an ADF (shown in Figure 2-9), a cell has to be written and read with a

‘0’ and ‘1’ in increasing and decreasing address order.


2.6.1.1.4 Incorrect Read Faults (IRFs)

Incorrect Read Faults are hard to detect, as the content of the cell is not changed

by the fault. IRF faults are explained using the example provided in Figure 2-10.

Figure 2-10: State diagram for Incorrect Read Fault.

In Figure 2-10, when cell 2 is being read for logic value ‘1’, the read operation

sensitizes the fault and returns a ‘0’, while retaining ‘1’ in the cell. This fault is identified

by reading a ‘1’ and ‘0’ from each cell.

2.6.1.1.5 Read Destructive Faults (RDFs)

A cell is said to have a read destructive fault if the read operation performed on

the memory cell returns an incorrect logic value while changing the content in the cell.

The state diagram for RDF is depicted in Figure 2-11. In the figure, when cell 2 is read

for a logic value ‘1’, the read operation sensitizes the fault and changes the content stored

in the cell, returning an incorrect logic value at the output. This is shown by a blue circle

in the figure. To detect a RDF, a ‘1’ and ‘0’ should be read from each cell.

Figure 2-11: State Diagram for Read Destructive Fault.

2.6.1.1.6 Deceptive Read Destructive Faults (DRDFs)

A cell is said to have a deceptive read destructive fault when a read operation

followed by a write operation is performed on the cell and the read operation returns the

correct logic value, while changing the content of the cell. The state diagram for this fault

is shown in Figure 2-12.

In the figure, the value stored in the cell is ‘1’. After the first read operation it

returns the correct logic value ‘1’, while inverting the content on the cell to ‘0’. When the

cell is being read for the second time, it returns a ‘0’ (marked by a blue circle in the

figure), indicating the presence of a DRDF. To identify this fault, two successive read operations are required.
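The reason a second read is needed can be shown with a small Python model. `DeceptiveReadCell` is a hypothetical behavioral model in which every read returns the stored value and then inverts the cell; real defects may only affect one logic value.

```python
class DeceptiveReadCell:
    """Cell with a DRDF: a read returns the stored value, then flips the cell."""

    def __init__(self, value=0):
        self.value = value

    def write(self, value):
        self.value = value

    def read(self):
        returned = self.value
        self.value ^= 1      # the read disturbs the cell after the value is sensed
        return returned


cell = DeceptiveReadCell()
cell.write(1)
first = cell.read()    # correct value, but the cell now holds 0
second = cell.read()   # the disturbed content becomes visible
print(first, second)   # 1 0
```

A single read followed by a write would overwrite the disturbed content, which is why a test that always writes after each read cannot observe this fault.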

Figure 2-12: State diagram for Deceptive Read Destructive Fault.

2.6.1.1.7 Data Retention Faults (DRFs)

A memory cell is said to have a data retention fault when the cell loses its stored

logic value after a certain period during which it is not accessed. The state diagram for

this fault is as shown in Figure 2-13.

In the figure, the value stored in the cell is ‘0’. After an immediate read operation,

the output of the read operation shows the exact value written into the cell. However,

when the cell is kept on hold for a certain amount of time and read for the expected value,

it shows the complement value stored in the cell indicating the presence of a DRF [32].

Figure 2-13: State diagram for Data Retention Fault.

2.6.1.2 Coupling Faults

A coupling fault is said to exist if a transition in the coupling cell forces the

contents in the coupled cell to change.

2.6.1.2.1 State Coupling Fault (CFst)

A cell is said to have a state coupling fault when the coupled cell is forced to

change. This could happen when the coupling cell is in a given logical state. State

coupling fault is not sensitized by a transition write operation; it is sensitized by the

logical state of the coupling cell. The state diagram for this fault is shown in Figure 2-14.

Figure 2-14: State Diagram for State Coupling Fault.

In the figure, the state of the coupling cell (marked by a blue circle) and coupled

cell (marked by a green square) is shown. Initially the state of the coupled cell shows the

exact data, but when the coupling cell is at a given state ‘1’, the content of the coupled

cell is inverted (marked by a red square), which proves the existence of a state coupling

fault. This fault is detected when the coupling cell is read for a ‘0’ and ‘1’ when the

coupled cell is in a given state.
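A state coupling fault can be modeled behaviorally as a pair of cells in which the state of one forces the content of the other. The Python sketch below is a hypothetical model; the class name and the `trigger_state` and `forced_value` parameters are assumptions made for illustration.

```python
class CoupledPair:
    """Two cells with a state coupling fault (hypothetical model).

    While the coupling cell holds trigger_state, the coupled cell is forced
    to forced_value, whatever was written to it.
    """

    def __init__(self, trigger_state=1, forced_value=1):
        self.trigger_state = trigger_state
        self.forced_value = forced_value
        self.coupling = 0
        self.coupled = 0

    def _apply_fault(self):
        if self.coupling == self.trigger_state:
            self.coupled = self.forced_value

    def write_coupling(self, value):
        self.coupling = value
        self._apply_fault()

    def write_coupled(self, value):
        self.coupled = value
        self._apply_fault()

    def read_coupled(self):
        return self.coupled


pair = CoupledPair(trigger_state=1, forced_value=1)
pair.write_coupled(0)
pair.write_coupling(0)
before = pair.read_coupled()   # coupling cell is not in the trigger state
pair.write_coupling(1)
after = pair.read_coupled()    # coupled cell has been forced by the fault
print(before, after)           # 0 1
```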

2.6.1.2.2 Transient Coupling Fault

A cell is said to have a transient coupling fault when the state of the coupling cell

causes the failure of a write operation performed on the coupled cell. This fault is

sensitized by a transition write operation on the coupled cell when the coupling cell is in

a given state. Depending on the transition, it is categorized as an up-transient or down-transient coupling fault. The state diagram for this fault is shown in Figure 2-15.

Figure 2-15: State Diagram for Transient Coupling Fault.

In the figure, with the coupling cell in a given state ‘0’, the transition in the coupled cell fails. This confirms the existence of a transient coupling fault.

2.6.1.2.3 Incorrect Read Coupling Fault (CFir)

A cell is said to have an incorrect read coupling fault, if a read operation

performed on the coupling cell, which is in a given state, returns an incorrect value from

the coupled cell. During this operation, the content of the coupled cell will not be

changed, only the output changes. The state diagram for this fault is as shown in Figure

2-16.

In the figure, the initial state of the coupling cell (denoted by a blue circle) and

coupled cell (denoted by a green square) after write operation is shown. When a read

operation is performed on the coupling cell, it affects the coupled cell and changes its

output leaving the content of the cell unchanged.

Figure 2-16: State Diagram for Incorrect Read Coupling Fault.

2.6.1.2.4 Read Destructive Coupling Fault (CFrd)

A cell is said to have a read destructive coupling fault if a read operation

performed on the coupling cell that is in a given state changes the content of the coupled

cell and returns an incorrect value at the output. In this research, to detect this fault, the

content of the cell is stored in a buffer and compared with expected output. The state

diagram for this fault is as shown in Figure 2-17. The change in the content of the cell

after the read operation is shown in the figure.

Figure 2-17: State Diagram for Read Destructive Coupling Fault.

2.6.1.2.5 Deceptive Read Destructive Coupling Fault (CFdr)

A deceptive read coupling fault is a special case of read fault. To detect this fault

two read operations are required. A read operation performed on the coupling cell which

is in a given state returns the correct logic value while changing the content of the

coupled cell. The state diagram for this fault is shown in Figure 2-18.

In the figure, after performing a read operation on the coupling cell that is in a

given state, it results in change in the content of the coupled cell. However, the output of

the coupled cell (denoted by a red square) will be same as the expected output which

might mask the fault. Hence, it is difficult to detect these faults. In order to detect these faults, before the content of the coupled cell changes, a second read operation needs to be performed on the coupling cell. The change in the content of the coupled cell when a read operation is performed on the coupling cell is shown in the figure.

Figure 2-18: State Diagram for Deceptive Read Destructive Coupling Fault.

2.7 Analysis of Faults in a SRAM Cell

To analyze the faults described in Section 2.6, the defects need to be injected into

the SRAM cell. Each injected defect induces a faulty behavior during the memory

operation as well as in HOLD mode [33] [34]. The defect injection in the SRAM core

cell is depicted in Figure 2-19.

Defect RDF1: This defect is responsible for the delay of charge or discharge of the bit

line BL through transistor tn4 during write operations. This defect leads to a transition

fault. Also, RDF1 is on the path which is responsible for read operation and may lead to

a read destructive fault.

Figure 2-19: Defects Injected into SRAM Core cell.

Defect RDF2: This defect induces a delay in the output of INV1, which leads to RDFs.

During r1 operation, the bit line BL is pre-charged to VDD. After it is pre-charged, it

tries to pull up the INV2 which is at logic ‘0’. This pull up is not well counterbalanced by

the pull down of INV2 which may lead to the change of state at INV2 and swap of the

core cell content. In some cases, data loss does not involve incorrect read immediately;

thus a further read operation is required. This leads to a DRDF.

Defect RDF3: This defect also produces similar effects to those of RDF2. This defect

also leads to RDF and DRDF.

Defect RDF4: This defect is placed in the pull up of INV1 and delay in this operation

might lead to RDF and RDFs for large values of resistance. For very large values of

resistance, this might lead to spontaneous data loss, resulting in DRF.

Defect RDF5: This defect represents the resistance of long interconnects as word lines.

This defect affects the switching activity of the pass transistors, reducing the operating

time of the read or write operations leading to IRFs and TFs.


Defect RDF6: This defect is placed at the gates of two transistors of INV2 and for high

value of resistance no bias current enters into the MOS transistor gate. This defect might

cause a delay in pull up and pull down operations of INV2. This may result in TFs.

There are many traditional memory test algorithms, such as the zero-one, checkerboard, and walking 1/0 tests. These algorithms are well known and simple to implement. A zero-one test pattern is also referred to as a blanket pattern or MSCAN (Adams). In a zero-one test, a ‘0’ is written to every cell and read back; similarly, a ‘1’ is written and read back. This test has limited coverage. It can find stuck-at faults, but not transition or coupling faults. It has a test length of 4*2^N operations, where N stands for the number of address bits and 2^N is the common notation used for the number of addresses in memory [35].
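The zero-one (MSCAN) test described above can be sketched in Python against a simulated memory. The `Memory` class and its `stuck` fault-injection argument are hypothetical; the sketch only illustrates the write-all, read-all structure and the resulting operation count.

```python
class Memory:
    """Simulated memory; `stuck` maps an address to its stuck-at value (assumed model)."""

    def __init__(self, size, stuck=None):
        self.stuck = dict(stuck or {})
        self.cells = [self.stuck.get(addr, 0) for addr in range(size)]

    def write(self, addr, value):
        self.cells[addr] = self.stuck.get(addr, value)

    def read(self, addr):
        return self.cells[addr]


def zero_one_test(mem, size):
    """Write and read back all 0s, then all 1s. Returns (failing addresses, operations)."""
    failures = set()
    ops = 0
    for value in (0, 1):
        for addr in range(size):
            mem.write(addr, value)
            ops += 1
        for addr in range(size):
            if mem.read(addr) != value:
                failures.add(addr)
            ops += 1
    return sorted(failures), ops


# 2^3 = 8 addresses, with address 3 stuck at 1.
print(zero_one_test(Memory(8, stuck={3: 1}), 8))  # ([3], 32)
```

With 8 addresses the test performs 32 operations, which matches the 4*2^N count for N = 3.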

The checkerboard test is another simple test, in which the cells in memory are

written with alternating values; each cell is surrounded by a cell whose value is different.

This test has the same test strength as the zero-one test and also takes the same length of 4*2^N operations, or O(n) [35].

The walking 1/0 test is not as simple as the other tests, but it can detect transition faults and coupling faults. In this test, the memory is written with all 0s (or 1s) except for a "base" cell, which contains the opposite logic value, and the base cell is "walked" or stepped through the memory. All cells are read for each step. This test fails to cover all coupling faults and takes an enormous test time. The test time is 2*(2^N + 2*n + n^2), which is an O(n^2) test (Goor). The GALPAT (GALloping PATtern) test is like the walking 1/0 test except that, in GALPAT, after each read the base cell is also read.


2.8 Advanced Memory Test

With processor memory size growing exponentially, new efficient test patterns with larger test coverage are needed. March test algorithms are superior at detecting faults

and have reduced test time [36]. The test 'marches' through the memory and hence the

name. March tests consist of March elements which are applied to every cell either in

increasing or decreasing address order. There are four operations in a March test and they

are:

Write ‘0’ in all cells (w0).

Read ‘0’ from all cells (r0).

Write ‘1’ in all cells (w1).

Read ‘1’ from all cells (r1).

2.9 MATS and MATS+ Algorithms

MATS, which stands for Modified Algorithmic Test Sequence, is the shortest March test for detecting stuck-at faults. The Algorithmic Test Sequence was proposed by Knaizuk and Hartmann and later improved by Nair as MATS+. MATS+ consists of 5n operations [37]. Figure 2-20 shows the MATS+ algorithm, which consists of three March elements, M0-M2. The MATS+ algorithm has a complexity of 5n, with better fault coverage than the equivalent zero-one and checkerboard tests.

Figure 2-20: MATS+ Algorithm.

{↕(w0); ↕(r0,w1); ↕(r1,w0)}

M0 M1 M2
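To make the March notation concrete, the following Python sketch applies the three MATS+ elements shown in Figure 2-20 to a simulated memory. It assumes ascending address order for every element (the ↕ symbol allows either order), and the `stuck` argument is a hypothetical fault-injection mechanism.

```python
# March elements as displayed in Figure 2-20. The address order is arbitrary,
# so this sketch uses ascending order for every element (an assumption).
MATS_PLUS = [["w0"], ["r0", "w1"], ["r1", "w0"]]


def run_mats_plus(size, stuck=None):
    """Run MATS+ on a simulated memory; return (element, address, op) failures.

    `stuck` maps an address to the value it is stuck at (hypothetical model).
    """
    stuck = dict(stuck or {})
    cells = [stuck.get(addr, 0) for addr in range(size)]
    failures = []
    for index, operations in enumerate(MATS_PLUS):
        for addr in range(size):
            for op in operations:
                value = int(op[1])
                if op[0] == "w":
                    cells[addr] = stuck.get(addr, value)  # stuck cells ignore writes
                elif cells[addr] != value:
                    failures.append((f"M{index}", addr, op))
    return failures


print(run_mats_plus(4))          # []
print(run_mats_plus(4, {2: 0}))  # [('M2', 2, 'r1')]
print(run_mats_plus(4, {1: 1}))  # [('M1', 1, 'r0')]
```

A stuck-at-0 cell is exposed by the r1 in M2 and a stuck-at-1 cell by the r0 in M1, so every cell is read once with each expected value.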


2.10 MARCH C- Algorithm

March C- is a popular testing algorithm used in the industry [35] [38], and it detects SF, TF, IRF, and RDF. Figure 2-21 shows the algorithm, which consists of six March elements, M0-M5. The March C- algorithm has a complexity of 10n. It has better fault coverage than MATS+, but it is not able to detect DRDFs and data retention faults.

Figure 2-21: March C- Algorithm.

{↕ (w0); ↑ (r0,w1); ↑ (r1,w0); ↓ (r0,w1); ↓(r1,w0); ↕(r0)}
M0 M1 M2 M3 M4 M5

2.11 Extended March C- Algorithm

This test detects all the faults detected by March C- and also detects DRDFs, data retention faults, and read coupling faults. The algorithm, shown in Figure 2-22, has 12n operations [39].

Figure 2-22: Extended March C- Algorithm.

{↕ (w0); ↑ (r0, w1); ↑ (r1, w0); ↓ (r0, w1) HOLD; ↓ (r1, r1, w0) HOLD; ↕ (r0, r0)}
M0 M1 M2 M3 M4 M5

Stuck-at faults are detected because each cell is read with expected value ‘0’ (by M1) or ‘1’ (by M2). Up-transient faults are detected by M1 followed by M2, and down-transient faults are detected by M2 followed by M3; all address decoder faults are detected by this algorithm. The incorrect read and read destructive faults are detected when the cell is read with ‘0’ or ‘1’ and then compared with the expected value and with the value stored in the buffer. If the actual output and the value stored in the buffer are different, then it is an incorrect read fault, and if they are the same, then it is a read destructive fault. Deceptive read destructive and data retention faults are detected by M4 and M5.
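The difference between March C- and the Extended March C- algorithm can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: the HOLD periods are ignored, the `DrdfMemory` class is a hypothetical model in which every read of one faulty address returns the stored value and then inverts it, and the operation counts are obtained simply by counting the operations in the elements as displayed.

```python
UP, DOWN, ANY = "up", "down", "any"

MARCH_C_MINUS = [
    (ANY, ["w0"]), (UP, ["r0", "w1"]), (UP, ["r1", "w0"]),
    (DOWN, ["r0", "w1"]), (DOWN, ["r1", "w0"]), (ANY, ["r0"]),
]
EXTENDED_MARCH_C_MINUS = [
    (ANY, ["w0"]), (UP, ["r0", "w1"]), (UP, ["r1", "w0"]),
    (DOWN, ["r0", "w1"]),        # HOLD after this element is not modeled here
    (DOWN, ["r1", "r1", "w0"]),  # HOLD after this element is not modeled here
    (ANY, ["r0", "r0"]),
]


class DrdfMemory:
    """Reading `faulty_addr` returns the stored value, then inverts the cell."""

    def __init__(self, size, faulty_addr=None):
        self.cells = [0] * size
        self.faulty_addr = faulty_addr

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        value = self.cells[addr]
        if addr == self.faulty_addr:
            self.cells[addr] ^= 1
        return value


def run_march(elements, mem, size):
    """Apply each March element to every address; return the failing reads."""
    failures = []
    for index, (order, operations) in enumerate(elements):
        addresses = range(size - 1, -1, -1) if order == DOWN else range(size)
        for addr in addresses:
            for op in operations:
                expected = int(op[1])
                if op[0] == "w":
                    mem.write(addr, expected)
                elif mem.read(addr) != expected:
                    failures.append((f"M{index}", addr, op))
    return failures


def ops_per_cell(elements):
    return sum(len(operations) for _, operations in elements)


print(ops_per_cell(MARCH_C_MINUS), ops_per_cell(EXTENDED_MARCH_C_MINUS))  # 10 12
print(run_march(MARCH_C_MINUS, DrdfMemory(4, faulty_addr=1), 4))
# []
print(run_march(EXTENDED_MARCH_C_MINUS, DrdfMemory(4, faulty_addr=1), 4))
# [('M4', 1, 'r1'), ('M5', 1, 'r0')]
```

In this model March C- always writes after each read, so the disturbed content is overwritten before it can be observed, whereas the back-to-back reads in M4 and M5 of the extended algorithm expose it.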

State coupling faults are detected by the March elements M1 and M2, and these faults are useful for differentiating other coupling faults from simple faults. A transition fault is differentiated from a transient coupling fault by introducing a state coupling fault at the coupling cell. After introducing the fault, if the transient fault still exists, then it is a simple fault; otherwise it can be concluded to be a transient coupling fault. Transient coupling faults are detected by the March elements M2 and M3. Incorrect read and read destructive coupling faults are detected by March elements M1 and M2, whereas deceptive read destructive and data retention coupling faults are detected by the March elements M4 and M5.

2.12 March Tests

There are other March tests available. Table 2.2 lists these March tests and their fault coverage.

Table 2.2. List of other March Tests

March Test Algorithm | No. of operations | Algorithm | Fault Coverage
March SR | 14n | {↓(w0); ↑(r0,w1,r1,w0); ↓(r0,r0); ↑(w1); ↓(r1,w0,r0,w1); ↑(r1,r1)} | SF, TF, RDF, IRF, DRDF, DRF, CFst, CFtf, CFir, CFrd
March B | 17n | {↕(w0); ↑(r0,w1,r1,w0,r0,w1); ↑(r1,w0,w1); ↓(r1,w0,w1,w0); ↓(r0,w1,w0)} | SF, ADF, TF, RDF, IRF, CFst
March C- | 10n | {↕(w0); ↑(r0,w1); ↑(r1,w0); ↓(r0,w1); ↓(r1,w0); ↕(r0)} | SF, ADF, TF, RDF, IRF, CFst, CFtf, CFir

2.13 Selection of the Testing Algorithm

One of the important steps in testing any circuit is the selection of the testing algorithm. The time taken and the fault coverage are important factors to be considered while selecting the algorithm. In this research, the focus is on testing Look up Tables in an SRAM FPGA for the presence of address decoder, stuck-at, transient, incorrect read, read destructive, deceptive read destructive, data retention, state coupling, transient coupling, incorrect read coupling, read destructive coupling, and deceptive read destructive coupling faults. There are many March tests available to detect these faults, and the most efficient is selected after analyzing the available algorithms. The Extended March C- algorithm proposed in [39] was chosen for this research because it covers all the simple and coupling faults within the scope of this research with less test time.


Chapter 3

3 SRAM Based FPGA

3.1 Introduction

FPGAs are programmable logic devices that are programmed to perform tasks

specific to any digital application. FPGAs have gained popularity because of their

flexibility, portability, and short time-to-market, making them ideal for prototyping

systems. Also, these devices allow “in-the-field” reconfiguration which makes them

suitable for a wide variety of applications, including military and airborne applications.

3.2 Anatomy of the FPGA

A FPGA consists of an array of Configurable Logic Blocks (CLBs),

Programmable Interconnects, Input/Output Buffers (IOBs), and RAM cores. Newer

FPGAs have additional embedded cores like DSP cores, embedded microprocessors, and

high-speed I/O interfaces for better system performance. The CLBs, which comprise Look up Tables (LUTs) and flip-flops, form the logic resources of an FPGA. A programmable interconnect network is comprised of wire segments and programmable switches that either connect or disconnect the wire segments. The CLBs are surrounded by these programmable interconnect networks, which allow the CLBs to be interconnected. The CLBs are also surrounded by the IOBs, which in turn connect the chip to the outside world. The basic FPGA architecture is shown in Figure 3-1.

Figure 3-1: Basic FPGA Architecture.

3.3 Benefits and Drawbacks of FPGAs

The main advantages of the FPGA are:

Programmability and re-programmability.

Short development time.

ASICs are microchips specifically designed for a given application. The

implementation of an ASIC consumes a lot of time and money. On the other hand, an FPGA eliminates the need for customization during manufacturing, which reduces the need for a custom-made package and customized testing. Programming an FPGA is easy, and it can be reprogrammed even after the design has been manufactured, allowing engineers to reconfigure the hardware for design enhancements. It also allows the designer to test

the design extensively without any additional manufacturing costs. Once the design is

validated and approved, it can then be sent for fabrication, which saves a lot of time and

money.

There are also some disadvantages of using FPGAs compared to ASICs. FPGAs have on-chip programming circuitry that enables efficient programming and re-programming of the devices, but it adds overhead to the circuit. The additional circuitry also slows down the interconnect paths in the FPGA, because the additional resistance and capacitance in the connection paths cause signal delay.

3.4 FPGA Applications

Due to their programmable nature and flexibility, FPGAs are an ideal fit for a lot

of industries [40]:

In the fields of Aerospace & Defense, radiation-tolerant FPGAs are used for

image processing, waveform generation, and partial reconfiguration for SDRs.

ASIC prototyping with FPGAs enables fast and accurate SoC modeling and verification of the embedded software.


In the fields of Multimedia and Teleprocessing, FPGAs are used to design

platforms which enable higher degrees of flexibility and lower overall non-

recurring engineering costs (NRE).

FPGAs are used in cost-effective, full-featured consumer applications such as

converged handsets, digital flat panel displays, information appliances, home

networking, and residential set top boxes.

3.5 FPGA Device Manufacturers

A list of FPGA product manufacturers is shown below:

Xilinx

Altera

Actel

Cypress Semiconductor

i-Cube

Motorola

Quicklogic

Gatefield

A Virtex-4 FPGA from Xilinx was chosen as a hardware platform for this research.

3.6 SRAM Programmable Virtex-4 FPGA

The Virtex-4 family of FPGAs combines traditional FPGA logic with embedded processors,
multipliers, and high-speed I/O interfaces in a single package [41]. The architectural and
operational features of these FPGAs can be exploited in the implementation of BIST in order
to reduce the test time. Virtex-4 devices implement the following functionality:

I/O blocks

Configurable Logic Blocks (CLBs)

Block RAM

Cascadable embedded XtremeDSP slices

Digital Clock Manager (DCM)

3.6.1 I/O Blocks

I/O Blocks control the data flow between package pins and the internal

configurable logic blocks. All the popular and leading-edge I/O standards are supported

by programmable I/O Blocks (IOBs). The IOBs are enhanced for source-synchronous

applications including per-bit deskew, data serializer/deserializer, clock dividers, and

dedicated local clocking resources.

3.6.2 Block RAM Modules (BRAMs)

BRAMs provide flexible 18 Kbit dual-port RAM blocks that are cascadable to form larger
memory blocks. In addition, BRAMs in Virtex-4 FPGAs contain optional programmable FIFO
logic for increased device utilization.

3.6.3 Cascadable Embedded XtremeDSP Slices

The DSP slices contain a dedicated 18 x 18-bit multiplier, an integrated adder, and a
48-bit accumulator. These blocks are designed to implement high-speed DSP applications.

3.6.4 Digital Clock Managers (DCMs)

Digital Clock Manager (DCM) blocks and Global Clock Multiplexers (GCMs) provide
self-calibrating, fully digital solutions for clock distribution delay compensation, clock
multiplication or division, and coarse or fine-grained clock phase shifting.

3.6.5 Configurable Logic Blocks (CLBs)

CLBs provide the basic logic elements for Xilinx FPGAs and are the main logic resources for
realizing sequential and combinatorial circuits. They provide combinatorial and synchronous
logic, as well as distributed memory and SRL16 shift register capability. In order to
access the general routing matrix, each CLB element is connected to a switch matrix, as
shown in Figure 3-2. A CLB element contains four slices [42]. These slices are grouped in
pairs, and each pair is organized as a column. In the figure, SLICEM designates the pair of
slices in the left column, and SLICEL designates the pair of slices in the right column.
Each pair in a column has an independent carry chain. However, only the slices in SLICEM
have a common shift chain.

In the figure,

The letter “X” followed by a number identifies the position of a slice in a pair as

well as in the column.

The letter “Y” followed by a number identifies the position of each slice in a pair

as well as in the CLB row.

The number following “X” counts up in sequence from left to right. The number following
“Y” counts the slices from bottom to top. Figure 3-2 shows the CLB located in the
bottom-left corner. The elements common to both slice pairs (SLICEM and SLICEL) are
function generators (or look-up tables), storage elements, wide-function multiplexers,
carry logic, and arithmetic gates.

Figure 3-2: CLB Architecture.

Table 3.1 details the logic resources in one CLB. These elements are used by both SLICEM
and SLICEL to provide logic, arithmetic, and ROM functions. In addition, SLICEM supports
two further functions: storing data using distributed RAM and shifting data with 16-bit
shift registers.

Table 3.1. Logic Resources in a CLB

Slices  LUTs  Flip-Flops  MULT_ANDs  Arithmetic and Carry Chains  Distributed RAMs  Shift Registers
4       8     8           8          2                            64 bits           64 bits

3.6.5.1 Look Up Table (LUT)

The function generators in Virtex-4 FPGAs are implemented as 4-input Look up Tables (LUTs),
and each of the two function generators (F and G) in a slice has four inputs. The LUTs can
implement any arbitrarily defined four-input Boolean function, and the propagation delay is
independent of the function implemented. Signals from the LUTs can exit the slice through
the output lines X or Y, enter the dedicated XOR gate, enter the select line of the
carry-logic multiplexer, feed the D input of the storage element, or go to MUXF5.
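The look-up behaviour described above can be illustrated with a short behavioural sketch in Python (not the VHDL model used in this research). A 4-input LUT is treated as a 16-bit truth table that is indexed by the four inputs. The names `make_lut4` and `XOR4_TABLE`, and the ordering of the address bits, are assumptions made only for illustration.

```python
def make_lut4(truth_table):
    """Return a 4-input Boolean function defined by a 16-bit truth table.

    Bit i of truth_table holds the output for input address i.
    """
    if not 0 <= truth_table < (1 << 16):
        raise ValueError("truth table must fit in 16 bits")

    def lut(a3, a2, a1, a0):
        # The four inputs form the address of one of the 16 stored bits.
        address = (a3 << 3) | (a2 << 2) | (a1 << 1) | a0
        return (truth_table >> address) & 1

    return lut


# Example configuration: 4-input XOR (odd parity of the address bits).
XOR4_TABLE = sum((bin(i).count("1") & 1) << i for i in range(16))
xor4 = make_lut4(XOR4_TABLE)
```

Because every function is just a different 16-bit constant, the access path is the same for all of them, which is why the propagation delay does not depend on the function implemented.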

In addition to the basic LUTs, the Virtex-4 FPGA slices contain multiplexers (MUXF5 and
MUXFX), which can combine LUTs within the same CLB or across different CLBs to implement
logic functions with more input variables. As mentioned earlier, Slice L does not have any
memory capability, so all of its function generators act as LUTs. Slice M LUTs, on the
other hand, can also be configured as 16-bit SRAM memories.

3.6.5.2 Distributed RAM and Memory (Available in SLICEM only)

Multiple LUTs in a SLICEM can be combined to store larger amounts of data. This is possible
because each function generator (LUT) available in a SLICEM can be implemented as a 16 x 1-bit
synchronous RAM resource called a distributed RAM element (Figure 3-3). Distributed RAM
modules are synchronous-write resources; a synchronous read can be implemented with a
storage element in the same slice. The distributed RAM and the storage element share the
same control signals (CLK, CE, and Set/Reset). To perform a write operation, the write
enable signal must be set high.

Figure 3-3: Distributed RAM.

3.6.5.3 Storage Elements

The storage elements in a Virtex-4 FPGA slice can be configured in two ways:

Edge-triggered D-type flip-flops or

Level-sensitive latches.

The input of each flip-flop can be driven directly by a LUT output or by the slice

inputs bypassing the function generators. The control signals clock (CLK), clock enable

(CE) and set/reset (SR) are common to both storage elements in a slice. All of the control

signals have independent polarity and the clock-enable signal (CE) is active High by

default. If left unconnected, the clock enable defaults to the active state.

3.6.5.4 Read Only Memory (ROM)

Each function generator in SLICEM and SLICEL can implement a 16 x 1-bit ROM, with its
contents loaded at device configuration. Four ROM configurations are available: ROM16x1,
ROM32x1, ROM64x1, and ROM128x1. The ROM elements are cascadable to implement wider and
deeper ROMs. The number of LUTs occupied by each configuration is shown in Table 3.2.

Table 3.2. ROM Configurations.

Number of LUTs    ROM
1                 16x1
2                 32x1
4                 64x1
8                 128x1
16 (2 CLBs)       256x1

3.6.5.5 Shift Registers (SLICEM only)

A function generator in a SLICEM can also be configured as a 16-bit shift register

without using the flip-flops available in a slice. This way, each LUT can delay serial data

from one to 16 clock cycles. The SHIFTIN and SHIFTOUT lines are cascaded to other

LUTs to form larger shift registers. The four LUTs in a SLICEM of a single CLB can be

cascaded to produce delays from one to 64 clock cycles. It is also possible to combine

shift registers across different CLBs to produce longer delays. The resulting programmable
delays can be used to balance the timing of data pipelines as well as to implement
synchronous FIFO and Content Addressable Memory (CAM) designs.

The write operation with a clock input (CLK) and a Clock Enable (CE) is shown

in Figure 3-4. The write operation is synchronous and the read operation is asynchronous

by default. However, a storage element or flip-flop is provided to implement synchronous

reads.
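As a hedged behavioural illustration of the shift register mode described above, the following Python sketch models a single LUT used as a 16-bit shift register with a synchronous, clock-enabled write and an asynchronous, addressable read. The class name `SRL16` and its method names are hypothetical and do not correspond to any Xilinx primitive interface.

```python
class SRL16:
    """Behavioural sketch of one LUT configured as a 16-bit shift register."""

    def __init__(self):
        # bits[0] always holds the most recently shifted-in value.
        self.bits = [0] * 16

    def clock(self, data_in, ce=1):
        """Synchronous write: shift one bit in on a clock edge when CE is high."""
        if ce:
            self.bits = [data_in & 1] + self.bits[:-1]

    def read(self, address):
        """Asynchronous read: address n taps the bit shifted in n + 1 edges ago."""
        if not 0 <= address <= 15:
            raise ValueError("address must be in the range 0..15")
        return self.bits[address]
```

Selecting a different read address changes the delay without changing the stored data, which is what makes the delay programmable from one to 16 clock cycles.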

Figure 3-4: Representation of a Shift Register.

3.6.5.6 Multiplexers

Each Virtex-4 FPGA slice has one MUXF5 and one MUXFX multiplexer. The MUXFX multiplexer
implements MUXF6, MUXF7, or MUXF8, depending on the slice position in the CLB, as shown in
Figure 3-5. Each CLB element has two MUXF6, one MUXF7, and one MUXF8 multiplexer. These
multiplexers are used to combine up to 16 LUTs. The following multiplexer configurations
can be implemented [42]:

4x1 multiplexer in one slice.

8x1 multiplexer in two slices.

16x1 multiplexer in one CLB element (4 slices).

32x1 multiplexer in two CLB elements (8 slices - 2 adjacent CLBs).

Figure 3-5: Representation of MUX F5 and MUX FX Multiplexers.

Each Multiplexer shown in the figure has a defined function:

MUXF5 combines the outputs of two LUTs

MUXF6 combines the outputs of MUXF5 from all the four slices S0- S3

MUXF7 combines the outputs of MUXF6 from slices S0 and S1

MUXF8 combines the outputs of MUXF7
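The first configuration in the list above, a 4x1 multiplexer in one slice, can be sketched behaviourally in Python as follows. Each of the two LUTs is configured as a 2:1 multiplexer, and MUXF5 selects between their outputs. The helper names, the assignment of data and select signals to LUT inputs, and the choice of which LUT MUXF5 selects for a given select value are assumptions made for illustration only.

```python
def lut4(table, a3, a2, a1, a0):
    """Read one bit of a 16-bit truth table addressed by four inputs."""
    return (table >> ((a3 << 3) | (a2 << 2) | (a1 << 1) | a0)) & 1


# Truth table of a 2:1 multiplexer: a0 and a1 are data, a2 is select, a3 is unused.
MUX2_TABLE = 0
for addr in range(16):
    d0, d1, sel = addr & 1, (addr >> 1) & 1, (addr >> 2) & 1
    MUX2_TABLE |= (d1 if sel else d0) << addr


def muxf5(f_out, g_out, select):
    """Dedicated 2:1 multiplexer combining the two LUT outputs of one slice."""
    return g_out if select else f_out


def mux4(d0, d1, d2, d3, s1, s0):
    """4:1 multiplexer built from two LUTs and MUXF5."""
    f_out = lut4(MUX2_TABLE, 0, s0, d1, d0)  # F LUT chooses between d0 and d1
    g_out = lut4(MUX2_TABLE, 0, s0, d3, d2)  # G LUT chooses between d2 and d3
    return muxf5(f_out, g_out, s1)
```

The wider configurations follow the same pattern, with MUXF6, MUXF7, and MUXF8 each adding one more select input.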

After the detailed analysis of slice architecture, the next section describes the need

for testing FPGAs.

3.7 Need for Testing FPGAs

Field Programmable Gate Arrays (FPGAs) can be configured in the field to implement an
arbitrary desired function according to user demands. This ability helps users achieve a
faster design cycle, lower development costs, and a reduced time-to-market compared to
conventional Application Specific Integrated Circuits (ASICs). FPGAs are widely used in
many system-critical applications, including military, airborne, and adaptive computing.
However, the operating environments of these applications can cause many defects in FPGAs
due to exposure to gamma radiation. Hence, testing methods are required that efficiently
detect faults with minimum test time and maximum fault coverage.

Chapter 4

4 Proposed Architecture for Testing Look up Tables in a Virtex-4 FPGA

A BIST architecture consists of a Test Pattern Generator (TPG), a Circuit Under Test (CUT),
and an Output Response Analyzer (ORA). For testing Look up Tables (LUTs) in an SRAM-based
FPGA, a 4-bit up/down counter, which generates addresses to access the memory cells, is
used as the TPG. March test algorithms used for testing memories require sequential access
to memory cells in both ascending and descending address order; hence, an up/down counter
is used. The ORA used for analyzing the outputs is an XOR comparator. The ORA compares the
outputs of two identically configured CUTs and generates a pass/fail indication.

Based on the slice mode being tested, the CLB BIST architecture is divided into two
categories. The first set of configurations tests every CLB in the FPGA in Slice M (memory)
mode of operation, and the second set tests every CLB in Slice L (logic) mode. The set of
BIST configurations is repeated twice with the roles of the CLBs reversed so that every CLB
is tested. Figures 4-1 and 4-2 (reproduced from [42]) show the elements in Slice L and
Slice M, respectively.

Figure 4-1: Slice L [42].

Figure 4-2: Slice M [42].

4.1 Test Pattern Generator (TPG)

The test pattern generator, which generates the addresses for testing the circuit, is an
important part of the BIST architecture. It is designed using four LUTs: two LUTs from
Slice L and two LUTs from Slice M. The method proposed in [22] uses an entire CLB, that is,
eight LUTs, to implement the TPG, which adds a lot of area overhead to the test circuitry
and is not optimal. The method implemented in this research improves on the architecture
proposed in [22] by building the TPG using four LUTs instead of eight. The method
implemented in [43] uses a DSP to implement the TPG and a CLB as a CUT. Hence, reversing
the roles of the CUT and TPG to detect a faulty TPG can be difficult with that approach.

The TPG is divided into two modules: module 1 is used as an up counter and module 2 is used
as a down counter. Module 1 generates addresses from “0000” to “1111” and module 2 counts
from “1111” to “0000”. The detailed diagram for the up counter is shown in Figure 4-3. As
shown in the figure, the initial address for all LUTs is “0000”, and it then increments or
decrements based on the up/down signal. The cell currently being accessed contains the
address of the next cell. For example, if the contents of all the LUTs read as “0000”, then
the outputs from the TPG would be “0000”, and as a result, the value of the signal changes
from “0000” to “0001”. This is fed back to the LUTs, cell 1 of all LUTs is read, and the
process continues until the address “1111” is reached. At this point, the up/down signal is
checked. If it has not changed, the TPG is reinitialized to “0000” and counting continues
until the address “1111” is reached again. If the up/down signal has changed, the rollover
takes place and the TPG operates as a down counter, which forms the second module of the
TPG.

Figure 4-3: Detailed Diagram for an Up Counter.

The detailed diagram of the down counter is shown in Figure 4-4. Unlike the method
implemented in [22], which uses a completely separate set of LUTs for the down counter,
this method reuses the circuitry of the up counter, thus significantly reducing the area
overhead.

Figure 4-4: Detailed Diagram for a Down Counter.

The down counter generates addresses from “1111” to “0000”, and the addresses are fed back
to the LUT inputs to access the next address. This is achieved by using XOR logic, as shown
in Figure 4-5.

Figure 4-5: XOR operation of a Down Counter.
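The counter scheme described in this section can be sketched behaviourally in Python. The sketch assumes that each of the four TPG LUTs stores one bit of the next address for every current address, and that the down count is obtained by XORing each address bit with the up/down signal, which is consistent with the patterns listed in Table 4.1. The exact LUT contents and wiring of Figures 4-3 to 4-5 are not reproduced here, and the function names are hypothetical.

```python
def build_next_state_luts():
    """Return four 16-bit tables; entry a of table k is bit k of (a + 1) mod 16."""
    tables = [0, 0, 0, 0]
    for addr in range(16):
        nxt = (addr + 1) % 16
        for k in range(4):
            tables[k] |= ((nxt >> k) & 1) << addr
    return tables


def tpg_sequence(up_down, steps=16):
    """Generate TPG addresses; up_down = 0 counts up, up_down = 1 counts down."""
    tables = build_next_state_luts()
    state = 0  # all LUTs start at address "0000"
    mask = 0b1111 if up_down else 0b0000
    addresses = []
    for _ in range(steps):
        # XOR with the up/down signal complements the address for the down count.
        addresses.append(state ^ mask)
        # The cell being read holds the next address, which is fed back.
        state = sum(((tables[k] >> state) & 1) << k for k in range(4))
    return addresses
```

Because the same four tables serve both directions, only the XOR stage differs between the up and down counters in this sketch.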

The extended March algorithm used in this research is shown in Figure 4-6.

During M0, M1, and M2 operations, the addresses are generated in increasing order from

“0000” to “1111”. During this period, the up/down signal is kept low. During operations

M3, M4, and M5, the addresses are generated in reverse order. Hence, the up/down signal

is kept high. The pattern for an up/down counter is shown in Table 4.1.

{↑ (w0); ↑ (r0, w1); ↑ (r1, w0); ↓ (r0, w1) HOLD; ↓ (r1, r1, w0) HOLD; ↓ (r0, r0)}
   M0        M1          M2          M3                 M4                 M5

Figure 4-6: Extended March Algorithm.

Table 4.1. The Test Patterns Generated by the TPG.

Bit Signal   Up/Down Signal   Output for the Up Counter   Up/Down Signal   Output for the Down Counter
0000         0                0000                        1                1111
0001         0                0001                        1                1110
0010         0                0010                        1                1101
0011         0                0011                        1                1100
0100         0                0100                        1                1011
0101         0                0101                        1                1010
0110         0                0110                        1                1001
0111         0                0111                        1                1000
1000         0                1000                        1                0111
1001         0                1001                        1                0110
1010         0                1010                        1                0101
1011         0                1011                        1                0100
1100         0                1100                        1                0011
1101         0                1101                        1                0010
1110         0                1110                        1                0001
1111         0                1111                        1                0000

4.2 Circuit Under Test (CUT) and Output Response Analyzer (ORA)

The Circuit Under Test is the object actually being tested. Initially, Slice M, which
contains the memory resources, is tested; the set of BIST configurations is then repeated
twice with the roles of the TPGs and ORAs reversed so that every slice serves as a CUT. The
outputs of each CUT are compared by an ORA with the outputs of two adjacent, identically
configured CUTs in the same row.

The ORA is used to compare the actual output with the expected output. In the proposed
architecture, signals are compared using an XOR comparator implemented in a LUT. The output
of a circuit is compared with that of the adjacent identically configured memory in the
same row, as shown in Figure 4-7. Any deviation from the expected output latches a logic 1
in the ORA flip-flop. Otherwise, a logic 0 is stored, which indicates that the circuit is
fault free.

The ORA is implemented using Slice L, which contains no embedded SRAM memories. Hence, no
external resources are used in mapping the ORA, which reduces the cost of testing and the
area overhead. The output of the memory under test is XORed with the output of the adjacent
memory and displayed at the ORA output. The implementation of a comparator-based ORA is
shown in Figure 4-8. The ORA identifies faults at the LUT level, and it receives the
following inputs:

Output of F LUT for the memory under test

Output of F LUT of the adjacent memory

Output of G LUT for the memory under test

Output of G LUT of the adjacent memory

When the output of the current memory under test doesn’t match the adjacent

memory, the faulty signal for the LUT goes high.
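A minimal behavioural sketch of such a comparator-based ORA is given below in Python. It assumes one XOR comparison per LUT (F and G) and a flip-flop per comparison that stays set once a mismatch has been seen. The class name `ORA` and its attribute names are hypothetical and are not taken from the VHDL model.

```python
class ORA:
    """XOR comparator with one latched fail flag for each of the F and G LUTs."""

    def __init__(self):
        self.f_fail = 0
        self.g_fail = 0

    def compare(self, f_cut, f_adjacent, g_cut, g_adjacent):
        """Compare the memory under test with the adjacent memory for one read."""
        # XOR is 1 only when the two outputs differ; OR keeps the flag latched.
        self.f_fail |= f_cut ^ f_adjacent
        self.g_fail |= g_cut ^ g_adjacent
        return self.f_fail, self.g_fail
```

Latching the flags separately for F and G is what allows the faulty LUT within a CLB to be identified rather than only the faulty CLB.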

Figure 4-7: Comparator Based ORA Architecture.

Figure 4-8: Comparator Operation.

4.3 BIST Architecture

The basic concept of the BIST architecture, illustrated in Figure 4-9, is to

configure the TPG, CUT, and ORA into one CLB thereby reducing the effects of

interconnects. This also helps to reduce the test time taken to send the test patterns to the

circuit being tested. After applying the test patterns, the output response of the circuit

under test is compared with the responses of other identically configured CUTs by

circular comparison-based ORAs to detect faults. All the CLBs in one row are connected

through a scan chain mechanism. Each CUT receives an address from a different TPG.

This reduces the chance of a faulty TPG sending the wrong addresses to all the CLBs

[44-45].

Figure 4-9: Proposed Architecture.

Figure 4-10 shows the interconnection scheme of the proposed architecture. It illustrates
the interconnects between four CLBs that have all three BIST components embedded in them.
The TPG generates the address for both the F and G LUTs and sends it to the CUT for testing.
Subsequently, the response of the CUT is analyzed by an ORA. Each ORA compares the output of
the current memory under test once with a memory in the same row and once with a memory in
the next row, to prevent masking of faults.

Figure 4-10: Interconnection Scheme of the Proposed Architecture.

For example, if the third and fourth memories are both faulty in the same way, comparing the
third with the fourth memory will not result in a faulty signal. However, a faulty signal
results when the third memory is compared with the memory in the next row. Each ORA
compares the F LUT and G LUT modules separately and gives out two faulty signals, F1 and G1
respectively. The circular comparison of the BIST architecture is shown in Figure 4-11.

Figure 4-11: Circular Comparison BIST Architecture.

Detection of the faulty LUT/RAM (F or G) is possible through the ORA outputs, which provide
two faulty signals, one for each LUT. If all the ORA outputs (FO1-FO4 or G1-G4) show “0000”,
then it can be concluded that no fault exists in the row. When a fault exists, the
corresponding signal goes high. For example, when the ORA output for the F LUTs shows
“0010”, it can be determined that the fault exists in the F LUT of CLB#2. Similarly, “0100”
(CLB#3) and “1000” (CLB#4) identify the faulty CLB. The exact address at which the fault is
present can be found from the TPG.
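The decoding of the ORA patterns and the reason for the second comparison can be illustrated with a small Python sketch. It assumes that the rightmost character of the pattern corresponds to CLB#1, as implied by the examples above, and that the two comparisons of one ORA are combined with a logical OR. Both points, and the function names, are assumptions made for illustration.

```python
def faulty_clbs(ora_bits):
    """Map an ORA pattern such as "0010" to the list of flagged CLB numbers."""
    if len(ora_bits) != 4 or set(ora_bits) - {"0", "1"}:
        raise ValueError("expected a 4-character string of 0s and 1s")
    # The rightmost character corresponds to CLB#1, the leftmost to CLB#4.
    return [pos + 1 for pos, bit in enumerate(reversed(ora_bits)) if bit == "1"]


def ora_flag(cut, row_neighbour, next_row_neighbour):
    """Flag a fault if the CUT differs from either of its two comparison partners."""
    return (cut ^ row_neighbour) | (cut ^ next_row_neighbour)
```

With `ora_flag(1, 1, 0)`, the same-row comparison alone would report no difference because both memories return the same wrong value, while the comparison with the next row still raises the flag.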

4.4 Fault Modeling and Detection using Extended March C- Algorithm

In this research, the Extended March C- Algorithm was applied to test the LUTs

in a CLB. The set of BIST configurations is repeated twice to ensure the entire CLB is

tested. In order to detect the faults, faults are inserted using VHDL before applying the

March algorithm. The pseudo code for the algorithm is shown below.

A March test consists of a finite sequence of March elements, where a March element is a
finite sequence of operations applied to each cell in the memory array before proceeding to
the next cell.

4.5 Pseudo Code

Initialize the memory cells
Inject faults

--March Element M0
for i= 0 to 15 do
    Ram[i]= write 0
end for

--March Element M1
for i= 0 to 15 do
    read values from the cell then
    update the cell value to 1
end for

--March Element M2
for i= 0 to 15 do
    read values from the cell then
    update the cell value to 0
end for

--March Element M3
for i= 15 to 0 do
    read values from the cell then
    update the cell value to 1
    wait for 5 ns;
end for

--March Element M4
for i= 15 to 0 do
    read values from the cell twice and then
    update the cell value to 0
    wait for 5 ns;
end for

--March Element M5
for i= 15 to 0 do
    read values from the cell twice
end for
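For readers who want to experiment with the algorithm outside a VHDL simulator, the March notation of Figure 4-6 can be written as a short Python sketch. It runs the six elements over a 16 x 1 memory accessed through caller-supplied `read` and `write` functions and reports every read that does not return the expected value. The HOLD delays are not modelled, and all names are hypothetical.

```python
UP = range(16)
DOWN = range(15, -1, -1)

# (element name, address order, operations applied to each cell)
MARCH = [
    ("M0", UP, ["w0"]),
    ("M1", UP, ["r0", "w1"]),
    ("M2", UP, ["r1", "w0"]),
    ("M3", DOWN, ["r0", "w1"]),
    ("M4", DOWN, ["r1", "r1", "w0"]),
    ("M5", DOWN, ["r0", "r0"]),
]


def run_march(read, write):
    """Run M0..M5 and return a list of (element, address, operation) mismatches."""
    failures = []
    for name, order, ops in MARCH:
        for addr in order:
            for op in ops:
                value = int(op[1])
                if op[0] == "w":
                    write(addr, value)
                elif read(addr) != value:
                    failures.append((name, addr, op))
    return failures
```

For a fault-free memory, for example a Python list wrapped in two small functions, the returned list is empty; injected faults show up as entries naming the March element and address at which they were observed.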

4.6 Fault Modeling and Detection

4.6.1 Stuck-at Fault

A fault-free write driver writes the value specified by the ‘Data’ pin, and a fault-free
read driver reads the data written into the memory cell. In the presence of a stuck-at
fault, the data in the cell is always stuck at one logic value regardless of changes in the
input.

To model an SF1, logic ‘0’ is written into all the memory cells and a logic ‘1’ is inserted
at the SF address, as shown in Figure 4-12.

Figure 4-12: Model of Stuck-at Fault.

The fault is inserted at “0010” of the G LUT in CLB#2. This fault is detected by the read
‘0’ operation of the M1 element of the Extended March algorithm. The detection implies that
a ‘0’ was not stored in every cell by the write ‘0’ operation. Similarly, an SF0 can be
detected by the read ‘1’ operation of the M2 element.
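The detection points stated above can be checked with a small Python sketch that follows a single cell through the operations it sees in March elements M0 to M5. The cell model simply ignores writes and always returns the stuck value. The names `StuckAtCell`, `CELL_TRACE`, and `first_detection` are hypothetical.

```python
# Operations applied to any one cell, in order, across March elements M0..M5.
CELL_TRACE = [
    ("M0", "w0"),
    ("M1", "r0"), ("M1", "w1"),
    ("M2", "r1"), ("M2", "w0"),
    ("M3", "r0"), ("M3", "w1"),
    ("M4", "r1"), ("M4", "r1"), ("M4", "w0"),
    ("M5", "r0"), ("M5", "r0"),
]


class StuckAtCell:
    """Memory cell whose content is fixed at stuck_value."""

    def __init__(self, stuck_value):
        self.stuck_value = stuck_value

    def write(self, value):
        pass  # the write has no effect on a stuck cell

    def read(self):
        return self.stuck_value


def first_detection(cell):
    """Return the (element, operation) of the first mismatching read, or None."""
    for element, op in CELL_TRACE:
        expected = int(op[1])
        if op[0] == "w":
            cell.write(expected)
        elif cell.read() != expected:
            return element, op
    return None
```

In this model a stuck-at 1 cell first fails at the read ‘0’ of M1 and a stuck-at 0 cell at the read ‘1’ of M2.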

4.6.2 Transition Fault

In a fault-free circuit, a cell undergoes an up or down transition when the corresponding
write operation is applied. With a transition fault, the cell fails to undergo a ‘0’ to ‘1’
or a ‘1’ to ‘0’ transition.

To model a transition fault, the cell needs to be checked for any possible transitions from
its previously stored value. As shown in Figure 4-13, the modeling of a transition fault can
be achieved by using an AND gate and ANDing the output of the memory cell with its previous
output. For example, if the memory output is ‘1’ and the faulty address previously contained
‘0’, the output of the AND gate replaces the value in the cell, thus preventing the up
transition.

Figure 4-13: Model of Transition Fault.

The up-transition fault is detected by March element M2. The results appear similar to a
stuck-at fault. Hence, to distinguish them, a state coupling fault should be added at the
same location. If the value of the cell changes, then it is concluded that the fault is a
transition fault. Similarly, the down-transition fault can be modeled and detected by the
March element M3.
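The AND-gate model described above can be mirrored in a few lines of Python. The up-transition fault ANDs the new data with the previous content, as in the text; using an OR for the down-transition fault is an assumed dual and is not taken from Figure 4-13. The names used are hypothetical.

```python
class TransitionFaultCell:
    """Cell that cannot make a 0 -> 1 ("up") or 1 -> 0 ("down") transition."""

    def __init__(self, kind, initial=0):
        if kind not in ("up", "down"):
            raise ValueError("kind must be 'up' or 'down'")
        self.kind = kind
        self.value = initial

    def write(self, data):
        if self.kind == "up":
            self.value = data & self.value  # AND with old content blocks 0 -> 1
        else:
            self.value = data | self.value  # assumed dual: OR blocks 1 -> 0

    def read(self):
        return self.value


def values_after_writes(cell, writes):
    """Apply each write in turn and record what a following read would return."""
    observed = []
    for data in writes:
        cell.write(data)
        observed.append(cell.read())
    return observed
```

One cell sees the writes 0, 1, 0, 1 in M0 to M3, and the reads that follow them belong to M1, M2, M3, and M4. The up-transition cell first deviates after the second write (the read ‘1’ of M2), and the down-transition cell after the third (the read ‘0’ of M3).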

4.6.3 Address Decoder Fault

Address decoder faults are caused by shorts and/or opens between the gates of the

decoder. Due to this fault, the cell might not be accessed or it might be accessed with two

addresses.

A typical LUT contains a 4:16 decoder, and the fault can occur if any of the input lines is
stuck-at ‘0’ or ‘1’. Figure 4-14 shows a detailed diagram of an address decoder with
stuck-at faults.

Figure 4-14: Address Decoder with Stuck-at Faults.

It is observed that if an entire input line is stuck-at ‘1’ or ‘0’, the cells are accessed
at the wrong time due to faulty addresses, and if an AND gate input is stuck-at ‘1’ or ‘0’,
multiple cells are accessed at the same time. Also, if a gate input is open, the particular
cell is undefined and can never be accessed.

To model the fault, a bit signal is used to determine which AND gate is stuck.

Figure 4-15 shows when the AND gate input is stuck and it also shows when the cell is

never accessed.

Figure 4-15: Model of Address Decoder Fault.

To detect these faults, faults are introduced in the LUT at “0010” and “1110”. Initially,
the memory is assumed to contain unknown or garbage values. During fault-free operation,
March element M0 writes ‘0’ in all memory locations. Due to the address decoder fault at
“0010”, the cell is never accessed and shows an output ‘X’ during M2 operation. This detects
the address decoder fault at “0010”.

When the AND gate input is stuck at ‘1’ and the address is “1110”, cells 14 and 15 are
accessed simultaneously. During M3 operation, when cell 15 is accessed, it writes a ‘1’ to
cell 14 as well as to itself. So, when cell 14 is read for a ‘0’, the operation fails,
confirming the existence of an address decoder fault.
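The multiple-access case can be reproduced with a short Python sketch. It assumes, following the narrative above, that every write to cell 15 also writes cell 14, and that all cells start at 0 rather than at unknown values. The class `AliasedMemory`, the compact March table, and `first_failure` are hypothetical names.

```python
ELEMENTS = [
    ("M0", +1, ["w0"]),
    ("M1", +1, ["r0", "w1"]),
    ("M2", +1, ["r1", "w0"]),
    ("M3", -1, ["r0", "w1"]),
    ("M4", -1, ["r1", "r1", "w0"]),
    ("M5", -1, ["r0", "r0"]),
]


class AliasedMemory:
    """16 x 1 memory in which a write to `source` also writes `victim`."""

    def __init__(self, source, victim):
        self.cells = [0] * 16
        self.source = source
        self.victim = victim

    def write(self, addr, value):
        self.cells[addr] = value
        if addr == self.source:
            self.cells[self.victim] = value  # second cell selected by the decoder

    def read(self, addr):
        return self.cells[addr]


def first_failure(mem):
    """Return (element, address, operation) of the first failing read, or None."""
    for name, direction, ops in ELEMENTS:
        addresses = range(16) if direction > 0 else range(15, -1, -1)
        for addr in addresses:
            for op in ops:
                expected = int(op[1])
                if op[0] == "w":
                    mem.write(addr, expected)
                elif mem.read(addr) != expected:
                    return name, addr, op
    return None
```

In this model the ascending elements leave the fault hidden, because cell 15 is always written after cell 14 with the same value. The fault only shows once the address order is reversed in M3, which illustrates why the March test needs both address directions.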

4.6.4 Incorrect Read Fault

During no fault operation, the read circuit should be able to read the value stored

in the cell. With incorrect read faults, the read operation fails to read the value stored in

the cell.

To model an IRF, the cell needs to be checked for any read operation. If there is a

read operation at the faulty address, the output value is changed according to the logic

implemented in the MUX as shown in Figure 4-16.

Figure 4-16: Model of Incorrect Read Fault.

The IRF is inserted at “0010” of the G LUT in CLB#3. This fault is detected by the read ‘0’
operation of the M1 element of the Extended March algorithm. The detection implies that,
although a ‘0’ was written to all the cells by the write ‘0’ operation, a defect in the read
circuitry results in a faulty output.

4.6.5 Read Destructive Fault

During no fault operation, the read circuit should be able to read the value stored

in the cell. With read destructive fault, the read operation changes the value stored in the

cell and results in a faulty output.

To model an RDF, the cell needs to be checked for any read operation. If there is

a read operation at the faulty address, the value stored in the cell is changed according to

the logic implemented in the MUX. This is shown in Figure 4-17.

Figure 4-17: Model of Read Destructive Fault.

The RDF is detected by March element M1. The results appear similar to those of an IRF.
Hence, to distinguish the two faults, the value of the cell is stored in a buffer. If the
output obtained is different from the value stored in the buffer, it is concluded that an
RDF exists.

4.6.6 Deceptive Read Destructive Fault

During a no fault operation, the read circuit should be able to read the value stored

in the cell. With deceptive read destructive fault, the read operation returns the correct

logic value, while changing the content of the cell.

To model a DRDF, the cell needs to be checked for any read operation. If there is

a read operation at the faulty address, the value stored in the cell is changed after the

value is sent to the output. This is achieved by changing the value at the falling edge of

the clock cycle as shown in Figure 4-18.

Figure 4-18: Model of Deceptive Read Destructive Fault.

A deceptive read destructive fault is a special case of a read fault. To detect this fault,
two read operations are required. A read operation performed on a cell that is in a given
state returns the correct logic value while changing the content of the cell. The DRDF is
inserted at “0110” of the F LUT in CLB#1. This fault is detected by the second read
operation of the M4 element of the Extended March algorithm. The detection implies that a
read operation has changed the content of the cell.
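The three read-related fault models (IRF, RDF, and DRDF) differ only in what a read returns and what it leaves in the cell, which a short Python sketch makes explicit. The sketch assumes that the fault is triggered by every read, regardless of the stored state, and the class name `ReadFaultCell` is hypothetical.

```python
class ReadFaultCell:
    """Single cell with an incorrect read, read destructive, or deceptive RDF fault."""

    def __init__(self, kind):
        if kind not in ("IRF", "RDF", "DRDF"):
            raise ValueError("kind must be 'IRF', 'RDF' or 'DRDF'")
        self.kind = kind
        self.value = 0

    def write(self, data):
        self.value = data

    def read(self):
        stored = self.value
        if self.kind == "IRF":
            return stored ^ 1  # wrong value returned, cell content untouched
        self.value = stored ^ 1  # RDF and DRDF both flip the cell content
        # RDF returns the flipped value; DRDF still returns the original one.
        return self.value if self.kind == "RDF" else stored
```

After a write of ‘1’, a DRDF cell returns ‘1’ on the first read and ‘0’ on the second, which is why a March element with two consecutive reads, such as M4, is needed to expose it.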

4.6.7 Data Retention Fault

During fault-free operation, the memory writes and reads the value specified by the Data
input. In the case of a data retention fault, a delayed read operation following a write
operation fails to read the written data, because the cell fails to retain the data after a
specific time. The fault is exposed by introducing a delay between writing and reading the
data.

To model a data retention fault, a write operation followed by a delayed read operation is
required. The DRF is inserted at “1010” of the F LUT in CLB#4. This fault is detected by the
read operation of the M4 element of the Extended March algorithm. Detection of the fault
only by the read operation of the M4 element indicates that the fault is a data retention
fault.

4.6.8 Coupling Faults

During a no fault operation, the logical state of one cell will not change the data

stored in the coupled cell. With state coupling fault, the data stored in the coupled cell is

affected by the value stored in the coupling cell.

To model the state coupling fault, the logical value stored in the coupling cell is

checked and if it matches with the given state, the value of the coupled cell is inverted

using an inverter as shown in Figure 4-19.

The CFst is inserted at “0110” of the F LUT in CLB#1. This fault is detected by March
element M1 of the Extended March algorithm. This fault is used to differentiate between
single-cell faults and coupling faults. For example, if a CFst is introduced at the faulty
cell and the value of the cell changes, then the fault is concluded to be a single-cell
fault. If it does not, the fault is concluded to be a coupling fault.
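The state coupling model can be sketched in Python as follows. The sketch assumes that the inverter acts on the value read from the coupled (victim) cell whenever the coupling (aggressor) cell holds the given state, and it runs only March elements M0 and M1. The aggressor address, the class name, and the function name are hypothetical choices made for illustration.

```python
class StateCoupledMemory:
    """16 x 1 memory with a state coupling fault between two cells."""

    def __init__(self, aggressor, victim, state):
        self.cells = [0] * 16
        self.aggressor = aggressor
        self.victim = victim
        self.state = state

    def write(self, address, data):
        self.cells[address] = data

    def read(self, address):
        value = self.cells[address]
        if address == self.victim and self.cells[self.aggressor] == self.state:
            value ^= 1  # inverter on the coupled cell's output
        return value


def first_m1_mismatch(memory):
    """Run M0 (w0) and M1 (r0, w1) upward; return the first failing address."""
    for address in range(16):
        memory.write(address, 0)
    for address in range(16):
        if memory.read(address) != 0:
            return address
        memory.write(address, 1)
    return None
```

With the aggressor at address 5, the victim at address 6 (“0110”), and coupling state 1, the read ‘0’ of M1 fails at address 6, because cell 5 has already been written to ‘1’ when cell 6 is read.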

Figure 4-19: Model of Coupling Fault.

Similarly, using the approach described in the above figure, the remaining

coupling faults, including CFir, CFrd, and CFdrdf are modeled and detected and the results

are shown in Chapter 5.

Chapter 5

5 Simulation Results and Performance Analysis

5.1 Introduction

A functional model of a Virtex-4 series FPGA is developed in VHDL. To increase the accuracy
and prevent masking of faults, a chain of 4 CLBs is used to test the system. An optimized
March C- algorithm is used to test the embedded SRAM memories of the Virtex-4 FPGA. The
simulation results and performance analysis are discussed below.

5.2 Simulation Results

Preliminary simulations are done without any faults. Subsequently, various faults

described in Section 2.6 are introduced into the memory. The unlatched outputs of RAM

modules are used for comparing the outputs. Due to this, the final fault signal output is

available instantaneously and there is no delay due to the scan chain. However, the

detection of fault using optimized March C- algorithm takes a certain amount of time.

This is the only timing constraint observed, and is listed in terms of number of clock

cycles taken in each subsection.

5.3 Simulations without Faults

Figure 5-1 to Figure 5-6 show the simulation results when no fault is introduced

in the system for March elements M0, M1, M2, M3, M4, and M5. M0 is a write operation

and during M0, the write enable signal must be held high. Data input is sampled and a ‘0’

is written in all memory locations. During the write cycle, the memory outputs are in high

impedance (blue lines), and the faulty outputs FO1-FO4 are in an undefined state (red lines)

as shown in Figure 5-1.

Figure 5-1: Fault free simulation of M0 operation.

M1 is a read ‘0’ and write ‘1’ operation. During this operation, the data written by the M0
operation is read from each address and a ‘1’ is then written to each address. During the
read operation, the write enable signal is held low. The data read is propagated through the
ORA, which compares the output with the fault-free output and sets the PASS/FAIL signal
instantaneously. In this case, the ORA outputs FO1-FO4 and G1-G4 show “0000”, indicating a
fault-free operation. During the write operation, the write enable signal is held high. The
process continues in increasing address order. Figure 5-2 shows the simulation results.

Figure 5-2: Fault free simulation of M1 operation.

March element M2 is performed on the memory cells in the same way as explained above. During
this operation, a read ‘1’ and a write ‘0’ are performed. The ORA outputs show “00000000”,
indicating a fault-free simulation. During the write operation, the ORA outputs remain in a
high impedance state because the output cannot be determined during a write operation.
Figure 5-3 presents the simulation results.

M3 is applied to the memory cells in reverse order. After the read and write operations, a
HOLD command is applied to the memory cells. During this period, the cells remain in a
saturation state and the value of each cell remains the same. This operation is used as a
test for many faults. Simulation results are presented in Figure 5-4.

Figure 5-3: Fault free simulation of M2 operation.


The M4 operation occurs after the HOLD command is performed. During this operation, each
memory cell is read twice and then the new data ‘0’ is written to the cell. The multiple
reads avoid masking of faults and help in detecting deceptive read faults. The M4 operation
is performed in decreasing address order, and after the operation there is a HOLD command,
during which the ‘0’ written in the memory cell is held for a time ‘T’. The simulation
result is shown in Figure 5-5.

The M5 operation is performed in decreasing address order, and during the operation each
cell is read for the value ‘0’. During this operation, the ORA output reads “00000000”,
indicating a fault-free circuit. Simulation results are shown in Figure 5-6. Table 5.1 shows
the ORA outputs.

Table 5.1. ORA outputs

Fault inserted in CLB#    ORA outputs
                          F1/G1   F2/G2   F3/G3   F4/G4
No CLB is Faulty          0       0       0       0
CLB 1 is Faulty           0       0       0       1




5.4 Stuck-at 1 Fault

A stuck-at 1 fault is introduced in the G LUT of CLB#3 at address “0101”. The simulation result is shown in Figure 5-7. When the output of CUT #3 is compared with that of the adjacent, identically configured CUT, the ORA signals show “00000100”, indicating the presence of a fault in CLB#3.

Figure 5-7: Stuck-at 1 Fault at CLB#3 during M1 operation.

The exact location can be obtained from the TPG address. The stuck-at 1 fault is detected during the M1 operation, and detection takes 22 clock cycles. Assuming a clock period of 10 ns (a 100 MHz clock), it takes 0.22 µs to detect and locate the fault.
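The conversion from clock cycles to detection time used throughout this chapter is a single multiplication. A minimal sketch, assuming the 10 ns clock period stated in the text (the function name `detection_time_us` is illustrative):

```python
CLOCK_PERIOD_NS = 10  # 100 MHz clock, as assumed in the text

def detection_time_us(clock_cycles: int, period_ns: float = CLOCK_PERIOD_NS) -> float:
    """Convert a clock-cycle count into a detection time in microseconds."""
    return clock_cycles * period_ns / 1000.0

# 22 cycles at 10 ns per cycle, the stuck-at 1 case described above
print(detection_time_us(22))
```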

As shown in Figure 5-7, when the memory cell “0101” is read for an expected value ‘0’ during March element M1, it reads a ‘1’. After the value is read, the ORA receives the output and compares it with the adjacent LUT signal. Because the values do not match, the ORA pass/fail signal goes high, as marked by the yellow circle. The pattern “00000100” indicates a fault in the G LUT of CLB#3.
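The mapping from an ORA pattern to a faulty LUT can be sketched in a few lines of Python. The bit order assumed here (F and G LUT of CLB#1 on the left through F and G LUT of CLB#4 on the right) is inferred from patterns such as “00000100” for the G LUT of CLB#3; it is an assumption, as is the function name `decode_ora`.

```python
from typing import List, Tuple

# Assumed bit order, left to right: F1, G1, F2, G2, F3, G3, F4, G4.
ORA_BITS = [(clb, lut) for clb in range(1, 5) for lut in ("F", "G")]

def decode_ora(pattern: str) -> List[Tuple[int, str]]:
    """Return the (CLB number, LUT) pairs whose ORA bit is '1'."""
    if len(pattern) != len(ORA_BITS) or set(pattern) - {"0", "1"}:
        raise ValueError("expected an 8-character string of 0s and 1s")
    return [ORA_BITS[i] for i, bit in enumerate(pattern) if bit == "1"]

print(decode_ora("00000100"))  # pattern reported for the stuck-at 1 fault
print(decode_ora("00000000"))  # fault-free pattern
```

The cell address inside the flagged LUT is not part of the pattern; as stated in the text, it is taken from the TPG address at the moment the pass/fail signal goes high.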

Figure 5-8: Stuck-at 1 Fault at CLB#3 during M3 operation.

The same fault can be identified by March elements M3 and M5. Figures 5-8 and 5-9 show the simulation results.

Figure 5-9: Stuck-at 1 Fault at CLB#3 during M5 operation.

5.5 Stuck-at 0 Fault

A stuck-at 0 fault is introduced in the G RAM module of CLB#3 at address “0101”. Initially, during the M1 operation, the memory cell is read with an expected ‘0’ and returns the expected output. At the end of M1 a ‘1’ is written into the cell, and during M2, when the cell is read for an expected ‘1’, it returns a ‘0’. This confirms the presence of a stuck-at 0 fault, and the ORA signals show “00001000”, indicating a fault at address “0101”. Figure 5-10 shows the stuck-at fault detection at address “0101” (marked by a yellow circle). The exact location of the fault can be found from the TPG address, and detection of the fault takes 33 clock cycles. Assuming a clock period of 10 ns (100 MHz), it takes 0.33 µs to detect and locate the fault. The stuck-at 0 fault (SAF0) can also be detected by March element M4.

Figure 5-10: Stuck-at 0 Fault at CLB#3 during M2 operation.

5.6 Up-Transient Fault

An up-transient fault is introduced in a memory cell of the F RAM module of CLB#2 at address “0100”. Figure 5-11 shows the detection of the up-transient fault at address “0100”. The fault can be detected by March elements M2 and M4. The ORA output shows “00100000”, indicating a fault (yellow circle). This occurs at the same time as the cell “0100” is read for a ‘1’. Detecting the up-transient fault takes 37 clock cycles; assuming a clock period of 10 ns, this corresponds to 0.37 µs.

Figure 5-11: Up-Transient fault at CLB#2 during M2 operation.

When the F RAM module is read for an expected value ‘1’, it reads a ‘0’. The existence of this fault is confirmed by the ORA signal going high (yellow circle in the figure). Observed in this way, however, the up-transient fault looks like a stuck-at fault. The two faults can be distinguished by introducing a state coupling fault at the same location. The output of a cell with a stuck-at fault is not affected by the coupling fault, whereas the output of a cell with a transient fault is.
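The difference between the two behaviours can be illustrated with a small cell model. This is a sketch under stated assumptions, not the fault-injection code used in the simulations: the class names, the `force` method standing in for a state imposed by a coupling fault, and the way each fault is modelled (an up-transient cell ignores a 0 to 1 write, a stuck-at 0 cell always reads 0) are illustrative.

```python
class StuckAt0Cell:
    """Cell whose read value is always 0, whatever is written or forced."""

    def write(self, value: int) -> None:
        pass  # writes have no effect

    def force(self, value: int) -> None:
        pass  # a coupled aggressor cannot change the output either

    def read(self) -> int:
        return 0


class UpTransitionFaultCell:
    """Cell that cannot make the 0 -> 1 transition through a write."""

    def __init__(self) -> None:
        self.state = 0

    def write(self, value: int) -> None:
        if self.state == 0 and value == 1:
            return  # the up transition fails; the cell keeps its 0
        self.state = value

    def force(self, value: int) -> None:
        self.state = value  # state imposed by a coupling fault (assumption)

    def read(self) -> int:
        return self.state


def read_after_w0_w1(cell) -> int:
    """Apply w0 then w1 and return the value read back (expected: 1)."""
    cell.write(0)
    cell.write(1)
    return cell.read()


saf, tf = StuckAt0Cell(), UpTransitionFaultCell()
same_symptom = (read_after_w0_w1(saf), read_after_w0_w1(tf))  # both read 0

saf.force(1)
tf.force(1)
after_coupling = (saf.read(), tf.read())  # only the transition-fault cell changes
```

Both cells fail the same read of an expected ‘1’, which is why the two faults look alike to the ORA; only after a state is imposed from outside do their outputs differ.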


5.7 Down-Transient Fault

A down-transient fault is introduced in the F LUT of CLB#2 at address “1100”. Initially, during the M1 operation, the memory cell at address “1100” is read for the expected value ‘0’. The output returns the expected value and the circuit appears to be fault free. However, during the M3 operation, when the cell is read for a ‘0’, it returns a ‘1’, confirming a down-transient fault. This fault is detected only by March element M3. The ORA returns the output “00100000”, indicating a fault in the F LUT of CLB#2 (yellow circle), and the exact fault location is obtained from the TPG address. Figure 5-12 shows the simulation result for the down-transient fault. It takes 51 clock cycles to detect the fault; assuming a clock period of 10 ns, the down-transient fault is detected in 0.51 µs.

Figure 5-12: Down-Transient fault at CLB#2 during M3 operation.

5.8 Address Decoder Fault

An address decoder fault is inserted at address “0100” in the G RAM module of CLB#2. A stuck-at 1 fault is introduced on a decoder input line so that faults caused by input lines stuck at 1 can be detected. Figure 5-13 shows the detection of the address decoder fault at “0100”.

When the input of the AND gate is stuck-at ‘1’ and the address is “0100”, both cell 4 and cell 5 are accessed. During the M3 operation, a write ‘1’ is performed on cell 5. Because more than one cell is accessed with the same address, a ‘1’ is also written to cell 4. Thus, when a read operation is subsequently performed on cell 4, it fails and returns a ‘1’ instead. It takes 59 clock cycles to detect the fault; assuming a clock period of 10 ns, the address decoder fault is detected in 0.59 µs.

Figure 5-13: Address Decoder fault at CLB#2 during M3 operation.

A second type of address decoder fault occurs when the cell “0100” is never accessed because of an open gate line. As the cell is never accessed, it shows an ‘X’ (undefined value). This fault is detected by March element M1, and Figure 5-14 shows the simulation result. It takes 22 clock cycles to detect the fault; assuming a clock period of 10 ns, the address decoder fault is detected in 0.22 µs.

Figure 5-14: Address Decoder fault at CLB#3 during M1 operation.

5.9 Incorrect Read Fault

An incorrect read fault is introduced in a memory cell of the F RAM module of CLB#1 at address “1010”. The simulation result is shown in Figure 5-15, and the fault can be detected by March element M1. The ORA output shows “10000000”, indicating a fault. The fault is detected when the cell “1010” is read for a ‘0’, and it takes 27 clock cycles to detect it. Assuming a clock period of 10 ns, the incorrect read fault is detected in 0.27 µs.

Figure 5-15: Incorrect Read Fault at CLB#1 during M1 operation.

5.10 Read Destructive Fault

A read destructive fault is introduced in a memory cell of the F RAM module of CLB#1 at address “1010”. To detect a read destructive fault, both a ‘0’ and a ‘1’ should be read from each cell. The fault can be detected by March elements M1 and M2. The value of a cell affected by a read destructive fault (RDF) changes during the read operation (M1), whereas the value of a cell affected by an incorrect read fault (IRF) does not change. This helps in differentiating the two faults, as shown in Figure 5-15 and Figure 5-16 (identified by the “value of cell 1010”). The simulation results are shown in Figure 5-16. The ORA output shows “10000000”, indicating a fault. It takes 27 clock cycles to detect the fault; assuming a clock period of 10 ns, the read destructive fault is detected in 0.27 µs.

Figure 5-16: Read Destructive Fault at CLB#1 during M1 operation.

5.11 Deceptive Read Destructive Fault

A deceptive read destructive fault is introduced in a memory cell of the G RAM module of CLB#4 at address “0110”. To detect the fault, two successive read operations are applied to each cell: the first read sensitizes the fault and the second detects it. The fault can be detected by M4 and M5. The simulation result is shown in Figure 5-17, and the ORA output shows “00000010”, indicating a fault. It takes 100 clock cycles to detect the fault; assuming a clock period of 10 ns, the deceptive read destructive fault is detected in 1 µs.

Figure 5-17: Deceptive Read Destructive Fault at CLB#3 during M4 operation.

As shown in the figure, the fault is sensitized by the first read operation and detected

by the second read operation (marked by yellow circle).
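Why two back-to-back reads are needed can be seen in a minimal cell model. The sketch below assumes the usual definition of a deceptive read destructive fault (the read returns the correct value but flips the stored bit); the class name `DeceptiveReadDestructiveCell` is illustrative and the model is not the one used in the simulations.

```python
class DeceptiveReadDestructiveCell:
    """Read returns the stored value, then the stored value flips."""

    def __init__(self, value: int = 0) -> None:
        self.state = value

    def write(self, value: int) -> None:
        self.state = value

    def read(self) -> int:
        observed = self.state
        self.state ^= 1  # the read disturbs the cell after the value is sensed
        return observed


cell = DeceptiveReadDestructiveCell()
cell.write(1)
first_read = cell.read()   # sensitizes the fault; still returns the expected 1
second_read = cell.read()  # observes the flipped cell
```

A single read followed by a write would overwrite the disturbed value before it is ever observed, which is the masking that the double read in M4 avoids.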

5.12 Data Retention Fault

A data retention fault is introduced in a memory cell of the F RAM module of CLB#3 at address “0100”. To detect the fault, the memory cell needs to be held in a given state; this is achieved by the HOLD command in the March algorithm. The fault is sensitized by the HOLD command and detected by the read operation that follows it. The fault can be detected by March element M4, and the simulation results are shown in Figure 5-18. The ORA output shows “00001000”, indicating a fault. It takes 89 clock cycles to detect the fault; assuming a clock period of 10 ns, this corresponds to 0.89 µs.

Figure 5-18: Data Retention Fault at CLB#3 during M4 operation.

5.13 State Coupling Fault

A state coupling fault is introduced in a memory cell of the F RAM module of CLB#3 at address “1000” and is coupled to a cell at address “0111” of the F RAM, as shown in the figure. A state coupling fault occurs when a coupled cell is forced to a complement state while the coupling cell is in a given state. The simulation results are shown in Figure 5-19 and Figure 5-20.

Figure 5-19: State Coupling Fault at CLB#3 during M1 operation.

Figure 5-19 shows the coupling cell in a ‘0’ state (given state) and the coupled

cell forced to ‘0’. The ORA detects the change and shows “00001000” at the output,

indicating a fault in the F RAM module of CLB#3. The fault can be detected by the

March element M1 and the exact location of the faulty cell can be obtained from the TPG

address. It takes approximately 25 clock cycles to detect the fault and, assuming a clock

period of 10 ns, it takes 0.25 µs to detect the fault.

Figure 5-20: State Coupling Fault at CLB#3 during M1 operation.

Figure 5-20 shows the condition when the coupling cell is in a ‘1’ state (given

state), and the coupled cell is forced to “0”. This can be detected by the March Element

M2 and the ORA output shows “00000100” indicating a fault in the G RAM module of

CLB#3. The simulation results are shown in Figure 5-20 and the exact location of the

faulty cell can be obtained from the TPG address. It takes approximately 37 clock cycles

to detect the fault and, assuming a clock period of 10 ns, it takes 0.37 µs to detect the

fault.


5.14 Up-Transient Coupling Fault

An up-transient coupling fault is introduced in a memory cell of the G RAM module of CLB#4 at address “1001”. It is coupled to a cell at address “1000” of the F RAM. The fault can be detected by M1 and M2. The simulation result is shown in Figure 5-21. The ORA output shows “00000001”, indicating a fault. This fault can be differentiated from the up-transient fault by introducing a state coupling fault on the aggressor cell.

Figure 5-21: Up-Transient Coupling Fault at CLB#4 during M1 operation.

5.15 Down-Transient Coupling Fault

A down-transient coupling fault is introduced in a memory cell of the G RAM module of CLB#1 at address “1010”. It is coupled to a cell at address “1011” of the F RAM. The fault can be detected by M3. Figure 5-22 shows the simulation result, and the ORA output shows “01000000”, indicating a fault. It takes approximately 67 clock cycles to detect the fault; assuming a clock period of 10 ns (100 MHz), it takes 0.67 µs to detect and locate the fault.

Figure 5-22: Down-Transient Coupling Fault at CLB#1 during M3 operation.

The exact location of the faulty cell can be obtained from the TPG address. It takes approximately 26 clock cycles to detect the fault and, assuming a clock period of 10 ns (100 MHz), it takes 0.26 µs to detect and locate the fault. The down-transient coupling fault and the down-transient fault can be distinguished by introducing a coupling fault at the aggressor location. The output of the down-transient fault will not be affected by a coupling fault.

5.16 Incorrect Read Coupling Fault

An incorrect read coupling fault is introduced in a memory cell of the F RAM module of CLB#3 at address “1000”. It is coupled to a cell at address “0111” of the F RAM. The state of the coupling cell gives rise to two types of incorrect read coupling fault: one when the coupling cell is in the ‘0’ state and one when it is in the ‘1’ state. Figure 5-23 shows the case in which the coupling cell is at ‘0’, and Figure 5-24 shows the case in which it is at ‘1’. The fault can be detected by March elements M1 and M2. The ORA output shows “00000010”, indicating a fault. The exact location of the faulty cell can be obtained from the TPG address. It takes approximately 24 clock cycles to detect the fault; assuming a clock period of 10 ns (100 MHz), it takes 0.24 µs to detect and locate the fault.

Figure 5-23: Incorrect Read Coupling Fault at CLB#1 during M1 operation.

Figure 5-24 shows that when the coupling cell is in a ‘1’ state (given state), the coupled cell is forced to ‘0’. This can be detected by March element M2, and the ORA output shows “00001000”, indicating a fault in the G RAM module of CLB#3. The fault can be detected by M1, and the exact location of the faulty cell can be obtained from the TPG address. It takes approximately 41 clock cycles to detect the fault; assuming a clock period of 10 ns, this corresponds to 0.41 µs.

Figure 5-24: Incorrect Coupling Fault at CLB#1 during M2 operation.

5.17 Read Destructive Coupling Fault

A read destructive coupling fault is introduced in a memory cell of the F RAM module of CLB#4 at address “1000”. It is coupled to a cell at address “0111” of the F RAM. The read destructive coupling fault is also classified into two types based on the state of the coupling cell; these can be detected by March elements M1 and M2, respectively. The simulation results are shown in Figure 5-25 and Figure 5-26. Figure 5-25 shows the coupling cell in a ‘1’ state (given state). The ORA output shows “00000010”, indicating a fault. It takes approximately 24 clock cycles to detect the fault; assuming a clock period of 10 ns, this corresponds to 0.24 µs.

Figure 5-25: Read Destructive Coupling Fault at CLB#3 during M1 operation.

Figure 5-26 shows the coupling cell in a ‘0’ state (given state). This can be

detected by the March Element M2 and the ORA output shows “00001000”, indicating a

fault in the G RAM module of CLB#3. The fault can be detected by M1 and the exact

location of the faulty cell can be obtained from the TPG address. It takes approximately

41 clock cycles to detect the fault and, assuming a clock period of 10 ns, it takes 0.41 µs

to detect the fault.

Figure 5-26: Read Destructive Coupling Fault at CLB#1 during M2 operation.

5.18 Deceptive Read Destructive Coupling Fault

A deceptive read destructive coupling fault is introduced in a memory cell of the F RAM module of CLB#4 at address “1000”. It is coupled to a cell at address “0111” of the F RAM. The fault can be detected by M4 and M5, and the simulation results are shown in Figure 5-27. The ORA output shows “00000010”, indicating a fault. The exact location of the faulty cell can be obtained from the TPG address. It takes approximately 82 clock cycles to detect the fault; assuming a clock period of 10 ns, this corresponds to 0.82 µs.

Figure 5-27: Deceptive Read Destructive Coupling Fault at CLB#4 during M4 operation.

5.19 Analysis of Results

From the simulation results, the presence of a fault can be identified when the ORA output goes high. The detection of a fault, however, depends on the type of fault; each fault can be differentiated by the methods explained above and can be uniquely identified. The algorithm used requires 12n operations to completely identify the faults. For the 4-input LUT, it requires 128 operations to completely detect the fault, and read and write operations are performed in a single clock cycle using both the rising and falling edges. Table 5.2 summarizes the time taken to detect a particular fault based on the cell address.
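The entries of Table 5.2 follow a simple linear pattern in the cell address: each group of faults has a base cycle count at address “0000” and a fixed number of cycles added per address. The sketch below reproduces the tabulated times from that pattern. The base and step values are read off Table 5.2; the group keys and the function name `table_time_us` are illustrative assumptions.

```python
CLOCK_PERIOD_NS = 10  # clock period assumed in the text

# group key -> (cycles at address 0000, cycles added per address), from Table 5.2
TABLE_5_2_PATTERN = {
    # Stuck-at Fault 0; Incorrect Read / Read Destructive; State Coupling;
    # Incorrect Read Coupling / Read Destructive Coupling
    "group_17": (17, 1),
    # Stuck-at Fault 1, Up-Transient, Address Decoder (open gate);
    # Up-Transient Coupling
    "group_33": (33, 1),
    # Down-Transient, Address Decoder (stuck-at input lines);
    # Down-Transient Coupling
    "group_50": (50, 2),
    # Deceptive Read Destructive, Data Retention
    "group_82": (82, 2),
    # Deceptive Read Destructive Coupling
    "group_113": (113, 1),
}

def table_time_us(group: str, address: int) -> float:
    """Detection time in microseconds for a 4-bit cell address."""
    if not 0 <= address <= 0b1111:
        raise ValueError("address must be a 4-bit value")
    base, step = TABLE_5_2_PATTERN[group]
    cycles = base + step * address
    return round(cycles * CLOCK_PERIOD_NS / 1000.0, 2)

# Address 0101 in the first group of the table
print(table_time_us("group_17", 0b0101))
```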


Table 5.2. Fault Coverage.

Fault Type
    Address Inserted    Time Taken for Detection (µs)

Stuck-at Fault 0
    0000    0.17
    0001    0.18
    0010    0.19
    0011    0.20
    0100    0.21
    0101    0.22
    0110    0.23
    0111    0.24
    1000    0.25
    1001    0.26
    1010    0.27
    1011    0.28
    1100    0.29
    1101    0.30
    1110    0.31
    1111    0.32

Stuck-at Fault 1, Up-Transient Fault, Address Decoder Fault-Open Gate
    0000    0.33
    0001    0.34
    0010    0.35
    0011    0.36
    0100    0.37
    0101    0.38
    0110    0.39
    0111    0.40
    1000    0.41
    1001    0.42
    1010    0.43
    1011    0.44
    1100    0.45
    1101    0.46
    1110    0.47
    1111    0.48

Down-Transient Fault, Address Decoder Fault Stuck-at input lines
    0000    0.50
    0001    0.52
    0010    0.54
    0011    0.56
    0100    0.58
    0101    0.60
    0110    0.62
    0111    0.64
    1000    0.66
    1001    0.68
    1010    0.70
    1011    0.72
    1100    0.74
    1101    0.76
    1110    0.78
    1111    0.80

Incorrect Read Fault, Read Destructive Fault
    0000    0.17
    0001    0.18
    0010    0.19
    0011    0.20
    0100    0.21
    0101    0.22
    0110    0.23
    0111    0.24
    1000    0.25
    1001    0.26
    1010    0.27
    1011    0.28
    1100    0.29
    1101    0.30
    1110    0.31
    1111    0.32

Deceptive Read Destructive Fault, Data Retention Fault
    0000    0.82
    0001    0.84
    0010    0.86
    0011    0.88
    0100    0.90
    0101    0.92
    0110    0.94
    0111    0.96
    1000    0.98
    1001    1.00
    1010    1.02
    1011    1.04
    1100    1.06
    1101    1.08
    1110    1.10
    1111    1.12

State Coupling Fault
    0000    0.17
    0001    0.18
    0010    0.19
    0011    0.20
    0100    0.21
    0101    0.22
    0110    0.23
    0111    0.24
    1000    0.25
    1001    0.26
    1010    0.27
    1011    0.28
    1100    0.29
    1101    0.30
    1110    0.31
    1111    0.32

Up-Transient Coupling Fault
    0000    0.33
    0001    0.34
    0010    0.35
    0011    0.36
    0100    0.37
    0101    0.38
    0110    0.39
    0111    0.40
    1000    0.41
    1001    0.42
    1010    0.43
    1011    0.44
    1100    0.45
    1101    0.46
    1110    0.47
    1111    0.48

Down-Transient Coupling Fault
    0000    0.50
    0001    0.52
    0010    0.54
    0011    0.56
    0100    0.58
    0101    0.60
    0110    0.62
    0111    0.64
    1000    0.66
    1001    0.68
    1010    0.70
    1011    0.72
    1100    0.74
    1101    0.76
    1110    0.78
    1111    0.80

Incorrect Read Coupling Fault, Read Destructive Coupling Fault
    0000    0.17
    0001    0.18
    0010    0.19
    0011    0.20
    0100    0.21
    0101    0.22
    0110    0.23
    0111    0.24
    1000    0.25
    1001    0.26
    1010    0.27
    1011    0.28
    1100    0.29
    1101    0.30
    1110    0.31
    1111    0.32

Deceptive Read Destructive Coupling Fault
    0000    1.13
    0001    1.14
    0010    1.15
    0011    1.16
    0100    1.17
    0101    1.18
    0110    1.19
    0111    1.20
    1000    1.21
    1001    1.22
    1010    1.23
    1011    1.24
    1100    1.25
    1101    1.26
    1110    1.27
    1111    1.28


Chapter 6

6 Conclusion

This thesis presents the development and verification of a BIST architecture for testing the LUTs in Virtex-4 FPGAs. The primary aim is to test the LUTs within SRAM-based Virtex-4 FPGAs for the presence of stuck-at, transient, address decoder, incorrect read, read destructive, deceptive read destructive, data retention, transient coupling, and other coupling faults with minimum test time, and also to design a diagnostic scheme to locate the faulty LUT in the FPGA.

With the increasing number of FPGA applications and their wide use in critical systems, testing FPGAs for correct operation is very important. Prior research on testing memory modules has been reported in [16], [20-22]. The method presented in [16] takes a long time to test the entire FPGA because it covers only half of the FPGA in one test session. The method proposed in [21] requires external logic resources to implement the scheme, and the scheme proposed in [22] consumes an entire CLB for the TPG and is time consuming. The method proposed in [20] cannot locate the faulty CLB.

Hence, to overcome these disadvantages, a novel BIST scheme is proposed, designed and mapped on the Virtex-4 FPGA. The proposed scheme can not only test for faults but also locate them. An extended March C- algorithm is used for testing the LUTs; it takes 14n operations and can detect all the faults. In Table 5.3, the exact number of clock cycles taken by the proposed method is calculated and compared with [22].

Table 5.3. Fault Coverage Comparison

Columns: # | Fault model (fault at address 0001) | No. of clock cycles, proposed technique | No. of clock cycles [22] | Time taken, proposed technique (µs) | Time taken [22] (µs)

1  | Stuck-at 0 Fault                           | 18  | 17 | 0.18 | 0.17
2  | Stuck-at 1 Fault                           | 34  | 77 | 0.34 | 0.77
3  | Address Decoder Fault                      | 34  | 77 | 0.34 | 0.77
4  | Up-Transient Fault                         | 34  | 77 | 0.34 | 0.77
5  | Down-Transient Fault                       | 78  | 83 | 0.78 | 0.83
6  | Incorrect Read Fault                       | 18  | NA | 0.18 | NA
7  | Read Destructive Fault                     | 18  | NA | 0.18 | NA
8  | Data Retention Fault                       | 108 | NA | 1.08 | NA
9  | Deceptive Read Destructive Fault           | 109 | NA | 1.09 | NA
10 | State Coupling Fault                       | 18  | NA | 0.18 | NA
11 | Up-Transient Coupling Fault                | 34  | NA | 0.34 | NA
12 | Down-Transient Coupling Fault              | 78  | NA | 0.78 | NA
13 | Incorrect Read Coupling Fault              | 18  | NA | 0.18 | NA
14 | Read Destructive Coupling Fault            | 18  | NA | 0.18 | NA
15 | Deceptive Read Destructive Coupling Fault  | 128 | NA | 1.28 | NA

6.1 Contributions

The following are the main contributions in this research:

Developing a new interconnection scheme which eliminates the drawbacks presented in earlier works.

Testing stuck-at, transient, address decoder, incorrect read, read destructive, deceptive read destructive, data retention, state coupling, transient coupling, and other coupling faults.

Detecting and locating the exact address of multiple faults in the LUTs of an SRAM-based FPGA.

Performing simulations in ModelSim.

Reducing the fault detection time.

It was observed that the time taken to detect a fault depends on the type of the fault and the address at which the fault is present. Depending on the address of the memory cell (0000-1111), it takes 17 to 32 and 33 to 48 clock cycles to detect stuck-at 1 and stuck-at 0 faults, respectively. Up-transient and down-transient faults are detected in 33 to 48 and 50 to 80 clock cycles, respectively. To detect incorrect read and read destructive faults, 17 to 32 clock cycles are needed, and for deceptive read destructive and data retention faults, 82 to 112 clock cycles are required. Detecting address decoder faults takes 50 to 80 clock cycles with stuck-at input lines and 17 to 32 clock cycles with open lines. Similarly, for incorrect read coupling and read destructive coupling faults, 17 to 32 clock cycles are needed, and to detect deceptive read destructive coupling faults, 82 to 112 clock cycles are required.

6.2 Future work

The following ideas can be considered for extending this work:

The SRAM BIST design could be applied to other families of FPGAs, including the Virtex-5 and Spartan-3. Additionally, the improvements made with this BIST approach can be applied to the families that preceded the Virtex-4 device.

The techniques used for test time speed-up and better fault coverage can be explored for other programmable resources in FPGAs.

Multiple faults located at the same address in a LUT may not be detected by the architecture presented in this research; detecting them could be achieved by further modifying the architecture.

The fault detection approach can be applied to future memories, including flash and phase change memories.


References

[1] C. Stroud, N. A. Touba, and L. Wang, "System-on-Chip Test Architectures:

Nanometer Design for Testability," Morgan Kaufmann, 2008.

[2] P. Prinetto, M. S. Reorda, S. Barbagallo, A. Burri, D. Medina, and P. Camurati,

"Industrial BIST of Embedded RAMs," in Proceedings of IEEE Design and Test, pp.

86-95.

[3] C. R. Kime, K. K. Saluja, and V. D. Agrawal, "A Tutorial on Built-in Self-Test. I.

Principles," IEEE Design and Test of Computers, vol. 10, no. 1, pp. 73-82, 1993.

[4] Charles Stroud, "A Designer's Guide to Built-in Self-test", Springer, 2002.

[5] J. L. Dailey, "Analysis and Implementation of Built-In Self-Test for Block Random

Access Memories in Virtex-5 Field Programmable Gate Arrays," Dissertation,

Auburn University, 2011.

[6] D. Niggemeyer, E. M. Rudnick, and T. J. Bergfeld, "Diagnostic Testing of

Embedded Memories using BIST," in Proceedings of IEEE Design, Automation and

Test in Europe Conference and Exhibition, 2000, pp. 305-309.

[7] F. Tony, T. Nagesh, K. Mark, H. Abu, R. Janusz, and H. Graham, "Logic BIST for Large Industrial Designs: Real Issues and Case Studies," in International Test Conference, 1999, pp. 358-367.

[8] E. J. McCluskey, "Built-In Self Test Techniques," IEEE Design and Test of

Computers, vol. 2, 1985, pp. 21-28.

[9] V. D. Agrawal and T. J. Chakraborty, "High-Performance Circuit Testing with

Slow-Speed Testers," in Proceedings of International Test Conference, vol. 21-25 ,

1995, pp. 302-310.

[10] A. J. van de Goor, I. Schanstra, and Y. Zorian, "An effective BIST scheme for ring-

address type FIFOs," in Proceedings of International Test Conference, 1994, pp.

378-387.

[11] T. Arnaout, H. J. Wunderlich, and O. Heron, "On the Reliability Evaluation of SRAM-based FPGA designs," International Conference of Field Programmable Logic Applications, 2005, pp. 403–408.

[12] Y. Zorian, "A Distributed BIST Control Scheme for Complex VLSI Devices," Proceedings of VLSI Test Symposium, 1993, pp. 4-9.

[13] T. Inoue, H. Fujiwara, H. Michinishi, T. Yokohira, and T. Okamoto, "Universal Test

Complexity of Field-Programmable Gate Arrays," in Proceedings of the 4th Asian

Test Symposium, 1995, pp. 259-265.

[14] Riaz Naseer and Jeff Draper, "Improve Memory Reliability in Sub-100nm Technologies," in Proceedings of the IEEE International Conference on Electronics, Circuits and Systems, 2008, pp. 586-589.

[15] B. Dixon, C. Stroud, V. Nelson, and Jia Yao, "System-level Built-in Self-Test of

Global Routing Resources in Virtex-4 FPGAs," in 41st Southeastern Symposium on

System Theory(SSST), 2009.

[16] M. Abramovici and C. Stroud, "BIST-Based Test and Diagnosis of FPGA Logic

Blocks," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.

9(10), 2001, pp. 159-172.

[17] T. Xia, C. Stroud, and J. Smith, "An Automated BIST Architecture for Testing and

Diagnosing FPGA Interconnect Faults," in Journal of Electronic Testing, vol. 22,

2006, pp. 239-253.

[18] W. K. Huang, F. J. Meyer, N. Park, and F. Lombardi, "Testing Memory Modules in

SRAM-Based Configurable FPGAs," in IEEE International Workshop on Memory

Technology, Design and Test, 1997, pp. 79-86.

[19] J. Jenicek, O. Novak, and M. Rozkovec, "An Evaluation of the Application

Dependent FPGA Test Method," in 2012, IEEE 15th International Symposium on

Design and Diagnostics of Electronic Circuits & Systems (DDECS).

[20] J. M. Portal, J. Figueras, Y. Zorian, and M. Renovell, "SRAM-Based FPGAs:

Testing the Embedded SRAM Modules," in Journal of Electronic Testing, vol. 14,

1998.


[21] M. Y. Niamat and D. M. Nemade, "Test, Diagnosis and Fault simulation of

Embedded RAM Modules in SRAM-Based FPGAs," in Microelectronic

Engineering, vol. 84, 2007, pp. 194-203.

[22] M. Lalla, K. Junghwan, and M. Niamat, "Testing Faults in SRAM Memory of

Virtex-4 FPGA," in 52nd IEEE International Midwest Symposium on Circuits and

Systems, 2009.

[23] Said Hamdioui, Testing Static Random Access Memories: Defects, Fault Models and Test Patterns. Kluwer Academic Publishers, 2004.

[24] M. Grosso, M. S. Reorda, Y. Zhang, and P. Bernardi, "A Programmable BIST for

DRAM Testing and Diagnosis," in International Test Conference (ITC), 2010, pp. 1-

10.

[25] Barr Michael, Memory Types Embedded Systems Programming., 2001.

[26] M. H. Abu-Rahma and M. Anis, "Variation-Tolerant SRAM Write and Read Assist

Techniques in Nanometer Variation-Tolerant SRAM," in Springer New York, 2013,

pp. 49-95.

[27] P. K. Lala, "Digital Circuit Testing and Testability", Academic Press, 1997.

[28] A. J. van de Goor and S. Hamdioui, "Advanced Embedded Memory Testing: Reducing the Defect per Million Level at Lower Test Cost," in IEEE 13th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS), 2010, pp. 7-7.

[29] Siti Aisah Mat Junos, A. Razak, M. Idris, and N. Haron, "Modeling and Simulation

of Finite State Machine Memory Built-in Self Test Architecture for Embedded

Memories," in Asia-Pacific Conference on Applied Electromagnetics, 2007, pp. 1-5.

[30] D. T. Milton, C. E. Stroud, and B. R. Garrison, "Built-in Self-Test for Memory

Resources in Virtex-4 Field Programmable Gate Arrays," in International

Conference on Computers and Their Applications(CATA), 2009, pp. 63-68.

[31] Jian-Hua Su, You-Ren Wang, and C. Ze-Wang, "An Effective Test Algorithm and Diagnostic Implementation for Embedded Static Random Access Memories," in Journal of Circuits, Systems, and Computers, vol. 07, 2011, pp. 1389-1402.

[32] E. Karl, M. Meterelliyoz, F. Hamzaoglu, Ng. Yong-Gee, S. Ghosh, and K. Zhang Y.

Wang, "Dynamic Behavior of SRAM Data Retention and a Novel Transient Voltage

Collapse," in IEEE International In Electron Devices Meeting, 2011, pp. 32-1.

[33] P. Girard , S. Pravossoudovitch, A. Virazel, S. Borri, and M. Hage-Hassan L. Dilillo,

"Resistive-Open Defects in Embedded-SRAM core cells:Analysis and March Test

Solution," in Asian Test Symposium, 2004.

[34] A. Bosio, L. Dilillo, P. Girard, S. Pravossoudovitch, A. Virazel, N. Badereddine, and

L. B. Zordan, "Optimized March Test Flow for Detecting Memory Faults in SRAM

Devices under Bit Line Coupling," in IEEE 14th International Symposium on Design

and Diagnostics of Electronic Circuits & Systems (DDECS), 2011, pp. 353-358.


[35] A. J. Van de Goor, Testing Semiconductor Memories, Theory and Practice.: John

Wiley and Sons, Inc, 1991.

[36] A. J. Van De Goor, "Using March Tests to Test SRAMs," in IEEE Design and Test

of Computers Conference, 1993, pp. 8-14.

[37] S. M. Thatte, J. A. Abraham, and R. Nair, "Efficient Algorithms for Testing

Semiconductor Random-Access Memories," in IEEE Transactions on Computers,

vol. 6, 1978, pp. 572-576.

[38] Z. Navabi, S. M. Fakhraie, and M. H. Tehranipour, "An Efficient BIST Method for

Testing of Embedded SRAMs," in IEEE International Symposium on Circuits and

Systems (ISCAS), vol. 5, 2001, pp. 73-76.

[39] M. Hamid, B. Swarup, K. Roy, and Q. Chen, "Efficient Testing of SRAM With

Optimized March Sequences and a Novel DFT Technique for Emerging Failures due

to Process Variations," in IEEE Transactions on Very Large Scale Integration

(VLSI) Systems, vol. 13, 2005.

[40] [Online]. http://www.xilinx.com/training/fpga/fpga-field-programmable-gate-array.htm

[41] Xilinx Inc. (2005) Virtex-4 Family Overview Product Specification DS-112.

[42] "Virtex-4 FPGA Userguide," in Xilinx Inc., 2008.

[43] B. F. Dutton, "Embedded Soft-Core Processor-Based Built-in Self-Test of Field Programmable Gate Arrays", Dissertation, Auburn University, 2010.

[44] Priyanka Gadde and Mohammed Niamat, "FPGA Memory Testing Technique using

BIST," in IEEE 56th International Midwest Symposium on Circuits and Systems

(MWSCAS), 2013.
