Tatyasaheb Kore Institute of Engineering and Technology, Warananagar Department of Computer Science and Engineering

MCQ: Advanced Computer Architecture

Subject Teacher: Prof. Kanishka. N. Kamble, Asst. Prof., TKIET, Warananagar

Chapter 1: Introduction

1. Vacuum tubes and relay memories were used by ____________ computers.

a) First generation

b) Second generation

c) Third generation

d) Fourth generation

2. In which generation were CPUs involved in all memory access and input/output operations?

a) First generation

b) Second generation

c) Third generation

d) Fourth generation

3. High level languages along with compilers were introduced in ________ computers.

a) First generation

b) Second generation

c) Third generation

d) Fourth generation

4. Pipelining and cache memory were introduced to close up the speed gap between the

CPU and main memory in _________ computers.

a) First generation

b) Second generation

c) Third generation

d) Fourth generation

5. The idea of multiprogramming was introduced in _________ computers.

a) First generation

b) Second generation

c) Third generation

d) Fourth generation

6. The __________ is used to initiate program execution through the OS kernel.

a) compiler

b) assembler

c) loader

d) None of the above

7. Which of the following is an approach used to upgrade the compiler?

a) preprocessor

b) precompiler

c) parallelizing compiler

d) All of the above

8. Look-ahead techniques were introduced to

a) Prefetch Instructions

b) Fetch Instructions

c) Decode Instruction

d) None of the above

9. Functional parallelism is supported using which of the following approaches?

a) Use of Multiple Functional Units Simultaneously

b) Use Pipelining at various processing Levels

c) Both (a) & (b)

d) None of above

10. Conventional sequential machines are called

a) SISD machines

b) SIMD machines

c) MIMD machines

d) MISD machines

11. Vector computers are

a) SISD machines

b) SIMD machines

c) MIMD machines

d) MISD machines

12. Systolic arrays are

a) SISD machines

b) SIMD machines

c) MIMD machines

d) MISD machines

13. Memory to Memory architecture supports

a) Flow of Vector Operands directly from memory

b) Pipelined Flow of Vector Operands directly from memory

c) Pipelined Flow of Vector Operands directly from memory to pipeline

d) Pipelined Flow of Vector Operands directly from memory to pipelines and

back to memory

14. The CPU time needed to execute the program is estimated by finding the product of

a) Instruction count, Cycles per Instruction

b) Instruction count, Cycles per Instruction, cycle time

c) Instruction count, Cycles per Instruction, frequency

d) Instruction count, frequency, cycle time

15. Instruction set architecture influences which of the following performance factors?

a) Instruction count

b) Processor cycles per instruction

c) Memory access latency

d) Both (a) and (b)

16. Compiler technology influences which of the following performance factors?

a) Instruction count

b) Processor cycles per instruction

c) Memory references per instruction

d) All of the above

17. Which of the following statements is true?

a) System throughput is often lower than the CPU throughput

b) System throughput is often greater than the CPU throughput

c) System throughput is often equal to the CPU throughput

d) None of the above

18. Which of the following approaches reduces the burden on the compiler to detect parallelism?

a) Implicit parallelism

b) Explicit parallelism

c) Internal parallelism

d) External parallelism

19. In which of the following multiprocessor models do all processors have equal access time to all memory words?

a) Uniform memory access model

b) Nonuniform memory access model

c) Cache only memory architecture

d) None of the above

20. An operational model of SIMD computer is specified by

a) <N,C,I,M,R>

b) <N,C,I,M>

c) <N,C,I>

d) <N,C,M,R>

21. In SIMD computers, which of the following schemes is used to partition the set of PEs into enabled and disabled sets?

a) Routing scheme

b) Broadcasting

c) Network topology

d) Masking scheme

22. In SIMD computers, which of the following schemes is used to specify the various patterns to be set up in the interconnection network for inter-PE communication?

a) Routing scheme

b) Broadcasting

c) Network topology

d) Masking scheme

23. Modern processors manage heat by

a) Reduction in clock rate

b) Thermal overload trip

c) Both (a) and (b)

d) None of the above

24. The energy required per transistor is proportional to the product of the ________ driven

by the transistor and the square of the ________ .

a) Capacitive load, Voltage

b) Voltage , Current

c) Capacitive load, Current

d) Voltage, Resistance
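
A quick numeric sketch of the proportionality behind question 24; all values below are made-up illustrative constants, not figures from the question.

# Illustrative sketch of the dynamic energy/power relations (question 24).
capacitive_load = 1e-15   # farads (assumed)
voltage = 1.0             # volts (assumed)
frequency = 2e9           # switching frequency in Hz (assumed)

energy_per_switch = capacitive_load * voltage ** 2               # E is proportional to C * V^2
dynamic_power = 0.5 * capacitive_load * voltage ** 2 * frequency

# Halving the voltage cuts the switching energy to one quarter.
print(energy_per_switch, dynamic_power)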

25. Which of the following is a technique used to improve energy efficiency in modern microprocessors?

a) Do nothing well

b) Dynamic Voltage-Frequency Scaling (DVFS)

c) Overclocking

d) All of the above

26. The percentage of manufactured devices that survive the testing procedure is called

a) Quality product

b) Tested product

c) Yield

d) Volume

27. An increase in volume

a) decreases the cost

b) decreases the learning curve

c) both (a) and (b)

d) none of the above

28. Commodities are products that are sold by ____________ in large volumes and are essentially ____________.

a) multiple vendors, different

b) single vendor, different

c) single vendors, identical

d) multiple vendors, identical

29. Which of the following is used to decide whether system is up or down?

a) Service accomplishment

b) Service level agreements

c) Service interruption

d) All of the above

30. ___________ is a measure of the continuous service accomplishment (or, equivalently, of

the time to failure) from a reference initial instant.

a) Module availability

b) Module reliability

c) Module efficiency

d) None of the above

31. ____________ is a measure of the service accomplishment with respect to the alternation

between the two states of accomplishment and interruption.

a) Module availability

b) Module reliability

c) Module efficiency

d) None of the above

32. Which of the following equations is correct?

a) MTBF = MTTF - MTTR

b) MTBF = MTTF + MTTR

c) MTTF = MTBF + MTTR

d) MTTF = MTBF - MTTR

33. Which of the following equations is correct?

a) Module availability = MTTF/ (MTTF + MTTR)

b) Module availability = MTBF/ (MTTF + MTTR)

c) Module availability = MTBF/ (MTTF - MTTR)

d) Module availability = MTTF/ (MTTF - MTTR)
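
A minimal sketch of the reliability relations tested in questions 32 and 33; the MTTF and MTTR numbers below are assumed, only the two formulas matter.

# Sketch of the reliability relations in questions 32-33 (assumed numbers).
mttf = 1_000_000   # mean time to failure, hours (assumed)
mttr = 24          # mean time to repair, hours (assumed)

mtbf = mttf + mttr                     # MTBF = MTTF + MTTR
availability = mttf / (mttf + mttr)    # Module availability

print(mtbf, round(availability, 6))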

34. Which of the following statements is correct?

a) execution time is the reciprocal of performance

b) execution time is directly proportional to the performance

c) execution time is independent of performance

d) None of the above

35. A 40 MHz processor was used to execute a benchmark program with the following instruction mix and clock cycle counts:

Instruction Type      Instruction Count   Clock Cycle Count
Integer arithmetic    45000               1
Data transfer         32000               2
Floating point        15000               2
Control transfer      8000                2

What is the value of the instruction count?

a) 1,00,000

b) 1,55,000

c) 92,000

d) 45,000

36. How many cycles are needed to execute the program in question 35?

a) 1,00,000

b) 1,55,000

c) 92,000

d) 45,000

37. What is the average CPI for the program in question 35?

a) 1.4

b) 1.55

c) 1.0

d) 9.2

38. What is the execution time (in ms) of the program in question 35?

a) 2.7

b) 3.87

c) 2.87

d) 4.78
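
A short worked sketch for questions 35-38, using only the figures from the table in question 35; the print statement shows the computed instruction count, total cycle count, average CPI, and execution time in milliseconds.

# Worked sketch for questions 35-38 (values taken from the table in question 35).
clock_rate = 40e6  # 40 MHz
mix = {            # instruction type: (count, cycles per instruction)
    "integer arithmetic": (45000, 1),
    "data transfer":      (32000, 2),
    "floating point":     (15000, 2),
    "control transfer":   (8000, 2),
}

instruction_count = sum(count for count, _ in mix.values())     # total instructions
total_cycles = sum(count * cpi for count, cpi in mix.values())  # total clock cycles
average_cpi = total_cycles / instruction_count                  # cycles per instruction
execution_time_ms = total_cycles / clock_rate * 1e3             # time at 40 MHz, in ms

print(instruction_count, total_cycles, average_cpi, round(execution_time_ms, 2))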

39. Consider the execution of an object code with 200,000 instructions on a 40 MHz processor. The program consists of four major types of instructions. The instruction mix and number of cycles (CPI) needed for each instruction type are given below, based on the result of a program trace experiment:

Instruction Type                   CPI   Instruction Mix
Arithmetic and logic               1     60%
Load/Store with cache hit          2     18%
Branch                             4     12%
Memory reference with cache miss   8     10%

What is the average CPI when the program is executed on a uniprocessor with the above

trace result?

a) 2.45

b) 2.24

c) 4.38

d) 4.45

40. What is the MIPS rate for the program in question 39?

a) 17.86

b) 18.78

c) 15.45

d) 14.28
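
A short worked sketch for questions 39-40, using the trace data from the table in question 39; it computes the mix-weighted average CPI and the resulting MIPS rate.

# Worked sketch for questions 39-40 (values taken from the table in question 39).
clock_rate = 40e6  # 40 MHz
trace = {          # instruction type: (CPI, fraction of instruction mix)
    "arithmetic and logic":             (1, 0.60),
    "load/store with cache hit":        (2, 0.18),
    "branch":                           (4, 0.12),
    "memory reference with cache miss": (8, 0.10),
}

average_cpi = sum(cpi * frac for cpi, frac in trace.values())   # weighted CPI
mips_rate = clock_rate / (average_cpi * 1e6)                    # MIPS = f / (CPI * 10^6)

print(average_cpi, round(mips_rate, 2))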

41. Find the number of dies per 300 mm (30 cm) wafer for a die that is 1.5 cm on a side.

a) 240

b) 270

c) 180

d) 290

42. Find the die yield for dies that are 1.5 cm on a side, assuming a defect density of 0.031 per cm² and N = 13.5.

a) 0.40

b) 0.66

c) 0.75

d) 0.87
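
A worked sketch for questions 41-42, assuming the standard Hennessy-Patterson dies-per-wafer and die-yield formulas and a wafer yield of 100%; the numeric inputs come from the two questions above.

# Worked sketch for questions 41-42 (Hennessy-Patterson formulas assumed).
import math

wafer_diameter = 30.0     # cm (300 mm wafer)
die_area = 1.5 * 1.5      # cm^2 for a die 1.5 cm on a side
defect_density = 0.031    # defects per cm^2
N = 13.5                  # process-complexity exponent

# Dies per wafer = wafer area / die area, minus the dies lost around the edge.
dies_per_wafer = (math.pi * (wafer_diameter / 2) ** 2) / die_area \
                 - (math.pi * wafer_diameter) / math.sqrt(2 * die_area)

# Die yield = wafer yield * 1 / (1 + defect density * die area)^N, wafer yield = 1.
die_yield = 1 / (1 + defect_density * die_area) ** N

print(round(dies_per_wafer), round(die_yield, 2))   # roughly 270 dies, yield about 0.40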

Chapter 2: Principles of Pipelining and Vector Processing

43. The pipelining process is also called ______

a) Superscalar operation

b) Assembly line operation

c) Von Neumann cycle

d) None of the mentioned

44. The fetch and execution cycles are interleaved with the help of ________

a) Modification in processor architecture

b) Clock

c) Special unit

d) Control unit

45. Each stage in pipelining should be completed within ___________ cycle.

a) 1

b) 2

c) 3

d) 4

46. To increase the speed of memory access in pipelining, we make use of _______

a) Special memory locations

b) Special purpose registers

c) Cache

d) Buffers

47. A pipeline stage is

a) Sequential circuit

b) Combinational circuit

c) Both sequential circuit and combinational circuit

d) None of the above

48. In a pipelined computer, the clock period is defined by

a) Maximum of time delays of all stages plus time delay of latch

b) Minimum of time delays of all stages plus time delay of latch

c) Average of time delays of all stages plus time delay of latch

d) None of the above

49. The reciprocal of the clock period is called

a) Throughput

b) Efficiency

c) Frequency

d) None of the above

50. Ideally, a linear pipeline with k stages can process n tasks in _______ clock periods

a) k-(n+1)

b) k*(n-1)

c) k+(n+1)

d) k+(n-1)

51. Ideally, a nonpipelined processor with k stages can process n tasks in _______ clock

periods

a) k*(n-1)

b) k*(n+1)

c) n*(k-1)

d) n*k


52. The maximum speedup the linear pipeline with k stages can achieve for n number of

tasks where n >>k is

a) n*k

b) k

c) n

d) n*(k-1)

53. The efficiency of a linear pipeline is measured by the percentage of

a) busy time-space span over the total time-space span

b) idle time-space span over the total time-space span

c) busy time-space span over the idle time-space span

d) None of the above

54. In the ideal case, the maximum efficiency that can be achieved with a linear pipeline is

a) 1

b) f

c) k

d) None of the above

55. In the ideal case, the maximum throughput that can be achieved with a linear pipeline is

a) 1/τ

b) f

c) both (a) and (b)

d) k
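
The ideal linear-pipeline measures tested in questions 50-55 can be checked with a short sketch; k, n, and tau below are assumed illustrative values, not figures from the questions.

# Sketch of the ideal linear-pipeline measures behind questions 50-55.
k = 8          # pipeline stages (assumed)
n = 10_000     # tasks, with n >> k (assumed)
tau = 25e-9    # clock period in seconds (assumed)

pipelined_cycles = k + (n - 1)                       # question 50
nonpipelined_cycles = n * k                          # question 51
speedup = nonpipelined_cycles / pipelined_cycles     # approaches k as n grows (question 52)
efficiency = speedup / k                             # approaches 1 (question 54)
throughput = n / (pipelined_cycles * tau)            # approaches 1/tau = f (question 55)

print(round(speedup, 3), round(efficiency, 4), round(throughput))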

56. In the S-access memory organization, which address bits are used to retrieve the information from a particular module?

a) Higher (n-m) bits

b) Lower (n-m) bits

c) Higher m bits

d) Lower m bits

57. In the S-access memory organization, which address bits are used to address all M memory modules simultaneously?

a) Higher (n-m) bits

b) Lower (n-m) bits

c) Higher m bits

d) Lower m bits

58. The S-access memory configuration is suitable for

a) Sequential address words

b) Non-sequential address words

c) Both (a) and (b)

d) None of the above

59. In the S-access memory organization, what is the total time required to access k consecutive words in sequence starting in module i, with a memory access time Ta and a latch delay of τ, if i+k ≤ M?

a) τ + kTa

b) τ + (k-1)Ta

c) Ta + (k-1)τ

d) Ta + kτ

60. The C-access memory configuration is suitable for

a) Sequential address words

b) Non-sequential address words

c) Both (a) and (b)

d) None of the above

61. Which approach is used to enhance the vector processing capability

a) Enrich the vector instruction set

b) Combine scalar instructions

c) Choose suitable algorithms

d) All of the above

62. Cray-1 has ____ functional pipeline units.

a) 5

b) 8

c) 10

d) 12

63. The clock rate in the Cray-1 is _____ ns.

a) 5.5

b) 8.5

c) 12.5

d) 16.5

64. In Cray-1, there are _______ I/O channels.

a) 12

b) 24

c) 36

d) 48

65. The memory section in the Cray-1 computer is organized in 8 or 16 banks with ______

modules per bank.

a) 32

b) 64

c) 16

d) 72

66. In Cray-1, how many bits are used for parity check?

a) 8

b) 16

c) 32

d) 64

67. In Cray-1, parity bits checks for ____________________.

a) Single error correction and double error detection

b) Single error detection and double error correction

c) Double error correction and double error detection

d) None of the above

68. In Cray-1 bipolar memory has a cycle time of _____ ns.

a) 12.5

b) 25

c) 50

d) 75

69. In Cray-1 at most _____ 64-bit word can be transferred per channel during each clock

period.

a) One

b) Two

c) Three

d) Four

70. In Cray-1, the computation section contains ___________ instruction buffer.

a) 32*4

b) 64*4

c) 72*4

d) 16*4

71. In Cray-1, there are ______ address registers (A) with _______ bits each used for

memory addressing.

a) 8,24

b) 8,32

c) 16, 64

d) 8,64

72. In Cray-1, there are ______ vector registers (V); each has ______ component registers.

a) 16,64

b) 8,64

c) 16,32

d) 8,32

73. In Cray-1, the P register is a _____-bit program counter indicating the next parcel of program code to enter the next instruction parcel in linear program sequence.

a) 16

b) 32

c) 64

d) 22

74. In Cray-1, which register holds the instruction waiting to be issued?

a) CIP

b) NIP

c) LIP

d) VM

75. In Cray-1, which register holds the parcel of the program code prior to entering the CIP?

a) NIP

b) LIP

c) VM

d) BA

76. In Cray-1, if an instruction has 32 bits, the _______ register holds the upper half of the instruction.

a) CIP

b) NIP

c) LIP

d) BA

77. In Cray-1, if an instruction has 32 bits, the _______ register holds the lower half of the instruction.

a) CIP

b) NIP

c) LIP

d) BA

Chapter 3: Different parallel processing architectures

78. Associative memory is called as

a) Content addressable memory

b) Parallel search memory

c) Multiaccess memory

d) All of the above

79. In associative memory, which register is used to hold the key operand being searched for or compared with?

a) Masking register

b) Temporary register

c) Indicator register

d) Comparand register

80. In associative memory, which register is used to enable or disable the bit slices to be involved in the parallel comparison operations across all the words in the associative memory?

a) Masking register

b) Temporary register

c) Indicator register

d) Comparand register

81. In associative memory, which register is used to hold the current match patterns in the associative memory?

a) Masking register

b) Temporary register

c) Indicator register

d) Comparand register

82. In associative memory, which register is used to hold the previous match patterns in the associative memory?

a) Masking register

b) Temporary register

c) Indicator register

d) Comparand register

83. In the bit-parallel organization, the comparison operation is performed

a) All words at a time

b) One word at a time

c) One bit slice at a time

d) All bit slices which are not masked off at a time

84. In the bit-serial organization, the comparison operation is performed

a) All words at a time

b) One word at a time

c) One bit slice at a time

d) All bit slices which are not masked off at a time

85. The associative processor STARAN has the

a) Bit serial memory organization

b) Bit parallel memory organization

86. The associative processor PEPE has the

a) Bit serial memory organization

b) Bit parallel memory organization

87. Which latency hiding technique moves data close to the processor before it is actually needed?

a) Prefetching

b) Distributed coherent cache

c) Relaxed memory consistency

d) Multithreading

88. In which type of prefetching technique is the data visible to the cache coherence protocol and kept consistent?

a) Binding

b) Nonbinding

c) Both (a) and (b)

d) None of the above

89. Which parameter affects the performance of a multithreaded MPP system?

a) Number of threads

b) Context switching overhead

c) Interval between switches

d) All of the above

90. Which latency is unpredictable ?

a) Remote load latency

b) Synchronization latency

c) Both (a) and (b)

d) None of the above

91. Which factor is not used to calculate the efficiency of a single-threaded machine?

a) Number of cycles for which processor is busy (R)

b) Number of cycles for which processor is idle (L)

c) Number of cycles wasted in context switching (C)

d) None of the above

92. In which region does a multithreaded processor operate with maximum utilization?

a) Linear region

b) Saturation region

93. In a multithreaded processor, the saturation point Nd is given by

a) [L/(R-C)]+1

b) [R/(L+C)]+1

c) [L/(R-C)]+1

d) [L/(R+C)]+1

94. Which of the following context switching policies improves the cache-hit ratio due to locality?

a) Switch on every instruction

b) Switch on block of instruction

Chapter 4: Distributed Memory Architecture

95. Which of the following can tolerate a high degree of interaction between tasks without a significant decrease in performance?

a) Loosely coupled systems (LCS)

b) Tightly coupled systems (TCS)

96. For LCS configurations that use a single time shared bus, the performance is limited by

a) Message arrival rate on the bus

b) Message length

c) Bus capacity

d) All of the above

97. Which bus is used to connect the computer modules within the cluster?

a) Inter-cluster bus

b) LSI-11 bus

c) Control bus

d) Map bus

98. In the Cm* architecture, __________ is responsible for address mapping.

a) Kmap

b) Slocal

c) CAS

d) Computer module

99. In the Cm* architecture, a cluster is made up of

a) Kmap

b) Computer modules

c) Map bus

d) All of the above

100. In the Cm* architecture, clusters communicate via ________, which are connected between ________.

a) Map buses, Kmap

b) Inter-cluster buses, Kmap

c) LSI-11, Slocal

d) None of the above

101. The Kmap is a microprogrammed, _________ cycle, ________ processor complex.

a) 150 ns, 3

b) 12.5 ns, 5

c) 25 ns, 1

d) 250 ns, 3

102. The _____ is the bus controller which arbitrates requests to the map bus.

a) Kbus

b) Linc

c) Pmap

d) Slocal

103. The _____ manages communication between a Kmap and other Kmaps.

a) Kbus

b) Linc

c) Pmap

d) Slocal

104. The Kmap is multiprogrammed to handle up to _____ concurrent requests.

a) two

b) four

c) six

d) eight

105. Communication between tasks allocated to the same processor takes place through input ports stored in ____________.

a) local memory

b) remote memory

c) communication memory

d) None of the above

106. Communication between tasks allocated to different processors takes place through communication ports stored in ____________.

a) local memory

b) remote memory

c) communication memory

d) None of the above

107. In tightly coupled systems, which module directs memory references to either the ULM or the private cache of the processor?

a) PMIN

b) ISIN

c) IOPIN

d) Memory map

108. In tightly coupled systems, which memory is used to store the kernel code and operating system tables often used by the processes running on the processor?

a) Shared memory

b) Memory map (MM)

c) Private Cache (PC)

d) Unmapped local memory (ULM)

109. _______________ determines whether memory reference is local or nonlocal.

a) Kbus

b) Linc

c) Memory Map

d) Mapping table

110. The Kmap is much faster than the memory in the computer modules. True or false?

a) True

b) False

111. Which processor is responsible for context switching in the Kmap?

a) Kbus

b) Linc

c) Pmap

d) Slocal

112. In a tightly coupled architecture, to avoid excessive conflicts, the number of memory modules is ________ the number of processors.

a) Less than

b) Greater than or equal to

Chapter 5: Data-Level Parallelism in Vector, SIMD and GPU Architectures

113. The execution time of a sequence of vector operations primarily depends on

a) length of the operand vectors

b) structural hazards

c) data dependences

d) all of the above

114. In a vector processor, the set of vector instructions that could potentially execute together is called a _________.

a) chime

b) convoy

c) stride

d) chaining

115. Which technique is used to tackle the problem where the vector is longer than the

maximum length ?

a) chaining

b) stride

c) data mining

d) strip mining

116. Which register is used to handle the IF statements in Vector loops?

a) Vector length register

b) Scalar register

c) Vector mask register

d) None of the above

117. The distance separating elements to be gathered into a single vector register is called _______.

a) chime

b) convoy

c) stride

d) chaining

118. The primary mechanism for supporting sparse matrices is ________ using index vectors.

a) Masking operation

b) gather-scatter operations

c) stride operations

d) chaining operations
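
A minimal sketch of gather-scatter with an index vector, as referenced in questions 117-118 (and again in questions 129-130); the array contents below are purely illustrative.

# Illustrative gather/scatter using an index vector (questions 117-118, 129-130).
A = [0.0] * 16                  # sparse operand vector in memory (illustrative values)
A[2], A[5], A[11] = 3.0, 7.0, 1.5
K = [2, 5, 11]                  # index vector holding positions of the nonzero elements

dense = [A[k] for k in K]       # gather: produces a dense vector [3.0, 7.0, 1.5]

dense = [x * 2 for x in dense]  # operate on the dense vector

for k, x in zip(K, dense):      # scatter: write results back to the sparse positions
    A[k] = x

print(dense, A)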

119. Which of the following types of parallelism in GPUs can be captured by the programming environment?

a) Multithreading

b) MIMD

c) SIMD

d) instruction-level

e) all of the above

120. In GPU computational structure, a Grid consists of

a) ThreadBlocks

b) Threads

c) Registers

d) None of the above

121. In GPU computational structure, ____________ is assigned to the multithreaded

SIMD Processor.

a) Grid

b) Thread Blocks

c) Threads

d) All of the above

122. In GPU memory structure, local memory is shared by

a) All threads in multithreaded SIMD Processor.

b) All threads in a Thread Block

c) All Grids

d) None of the above

123. The case where data accesses in later iterations depend on data values produced in earlier iterations is called a

a) Data unit

b) Loop carried unit

c) Data dependence

d) Loop carried dependence

124. A form of loop-carried dependence is

a) Multithreading

b) Recurrence

c) CUDA thread

d) Unit Block

125. _____________ is needed to detect the structural and data hazards.

a) Vector functional unit

b) Vector Load/Store unit

c) Vector registers

d) Control unit

126. __________ allows a vector operation to start as soon as the individual elements of its vector source operand become available.

a) chaining

b) stride

c) data mining

d) strip mining

127. ____________ allows a vector instruction to chain to essentially any other active

vector instruction, assuming that we don’t generate a structural hazard.

a) fixed chaining

b) dynamic chaining

c) flexible chaining

d) None of the above

128. When an array is allocated memory, it is laid out in either row-major (as in C) or column-major (as in Fortran) order; this is called ___________.

a) linearization

b) quantization

c) discretization

d) normalization
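
A small sketch of the layout idea in question 128: mapping a 2-D index onto the 1-D offset used by row-major and column-major layouts; the array dimensions are assumed for illustration.

# Sketch of array linearization (question 128): 2-D indices to a 1-D offset.
rows, cols = 3, 4   # assumed array shape

def row_major_offset(i, j):      # C-style layout: rows are contiguous
    return i * cols + j

def column_major_offset(i, j):   # Fortran-style layout: columns are contiguous
    return j * rows + i

print(row_major_offset(1, 2), column_major_offset(1, 2))   # 6 and 7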

129. The result of gather operation using index vector is __________

a) Identity vector

b) Dense vector

c) Sparse vector

d) Null vector

130. The result of scatter operation using index vector is __________

a) Identity vector

b) Dense vector

c) Sparse vector

d) Null vector

131. CUDA produces the ___________ for the GPU.

a) C and C++

b) C and C++ dialects

c) Java

d) C#

132. To distinguish between functions for the GPU (device) and functions for the system processor (host), CUDA uses ________ for the former and __host__ for the latter.

a) _device_ and _host_

b) _gpu_ and _cpu_

c) _graphics_ and _system_

d) None of the above

Chapter 6: Program and Network Properties

133. The opportunities that exist for parallelization and vectorization are identified by analysis of

a) Precedence graph

b) Dependence graph

c) Analysis graph

d) Program graph

134. If an execution path exists from statement S1 to S2 and if at least one output of S1

feeds in as input to S2, data dependence is called

a) Output dependence

b) I/O dependence

c) Antidependence

d) Flow dependence

135. If an execution path exists from statement S1 to S2 and if the output of S2

overlaps the input to S1, data dependence is called

a) Output dependence

b) I/O dependence

c) Antidependence

d) Flow dependence

136. If two statements S1 and S2 produce or write to the same output variable, the data dependence is called

a) Output dependence

b) I/O dependence

c) Antidependence

d) Flow dependence
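
A minimal sketch, with hypothetical statements, illustrating the three data-dependence types asked about in questions 134-136.

# Illustrative statements for the dependence types in questions 134-136.
B, C, E = 1, 2, 3

# Flow dependence (S1 -> S2): S2 reads A, which S1 writes.
A = B + C        # S1
D = A + E        # S2

# Antidependence (S3 -> S4): S4 writes E, which S3 reads.
F = E + 1        # S3
E = 10           # S4

# Output dependence (S5, S6): both statements write the same variable G.
G = B * 2        # S5
G = C * 3        # S6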

137. The order of execution of statements cannot be determined before run time due to

a) Data dependence

b) Control dependence

c) Resource dependence

d) None of the above

138. The dependence relation between two statements cannot be determined in the following situation

a) The subscript of a variable is itself subscribed

b) The subscript does not contain the loop index variable

c) A variable appears more than once with subscript having different

coefficient of the loop variable

d) The subscript is nonlinear in the loop index variable

e) All of the above

139. Bernstein’s conditions imply that two processes can execute in parallel if they are

a) Flow-independent

b) Anti-independent

c) Output-independent

d) All of the above
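
A small sketch of Bernstein's conditions from question 139, checking whether two processes with given input and output sets can run in parallel; the two statements used are hypothetical.

# Sketch of Bernstein's conditions (question 139): P1 and P2 can execute in
# parallel if I1 ∩ O2 = ∅, I2 ∩ O1 = ∅, and O1 ∩ O2 = ∅.
def bernstein_parallel(I1, O1, I2, O2):
    return not (I1 & O2) and not (I2 & O1) and not (O1 & O2)

# P1: A = B + C        P2: D = E + F   (hypothetical statements)
I1, O1 = {"B", "C"}, {"A"}
I2, O2 = {"E", "F"}, {"D"}
print(bernstein_parallel(I1, O1, I2, O2))   # True: all three intersections are empty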

140. The type of parallelism defined by the control and data dependence of programs is

a) Hardware parallelism

b) Software parallelism

c) Instruction parallelism

d) Loop parallelism

141. The type of parallelism defined by the machine architecture and hardware

multiplicity is

a) Hardware parallelism

b) Software parallelism

c) Instruction parallelism

d) Loop parallelism

142. The mismatch problem between software parallelism and hardware parallelism

can be solved by using the following approach

a) Develop compilation support

b) Hardware redesign

c) Both (a) and (b)

d) None of the above

143. Which of the following has a heavier communication overhead?

a) Fine grain

b) Medium grain

c) Coarse grain

d) None of the above

144. Which of the following provides a higher degree of parallelism ?

a) Fine grain

b) Medium grain

c) Coarse grain

d) None of the above

145. The grain size problem demands the determination of

a) Number of grains

b) Size of grain

c) Both (a) and (b)

d) Schedule time

146. Node duplication is used to

a) Eliminate idle time

b) Reduce the communication delays among processors

c) Both (a) and (b)

d) None of the above

147. Flow dependence is represented by

a) Directed arrow

b) Directed arrow crossed with bar

c) Directed arrow with hollow circle on it

d) None of the above

148. Antidependence is represented by

a) Directed arrow

b) Directed arrow crossed with bar

c) Directed arrow with hollow circle on it

d) None of the above

149. ___________ is the measure of the amount of computation involved in a software process.

a) Latency

b) Grain Pack

c) Grain Size

d) Grain schedule

150. ____________ is the most optimized program construct to execute on a parallel or vector computer.

a) Instruction level parallelism

b) Loop level parallelism

c) Procedure level parallelism

d) Subprogram level parallelism

151. Which of the following requires a higher communication demand and

synchronization overhead ?

a) Fine grain

b) Medium grain

c) Coarse grain

d) None of the above

152. ___________ parallelism is handled by the program loader and operating system

in general.

a) Instruction level parallelism

b) Loop level parallelism

c) Procedure level parallelism

d) Program level parallelism