buses.  One is a read-only bus; the other is a read-write bus.  The FB has to send enough data per cycle to a whole row or column of the RC Array.  Since there are eight RCs, each needing two 8-bit operands, a total of 128 bits (8 RCs * 2 operands/RC * 8 bits/operand = 128 bits) is necessary, hence the two 64-bit read buses.  One 64-bit bus is needed to write data back to the FB from the RC Array because each RC produces an 8-bit output (8 RCs * 1 output/RC * 8 bits/output = 64 bits).  Since one of the data buses between the RC Array and the FB is used for both reading and writing, a read and write between these two modules may not occur at the same time. However, the DMA Controller and RC Array buses are independent of each other and may each read or write at will, with the constraint presented below.
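The bus-width arithmetic above can be checked with a few lines of Python. This is only a restatement of the figures given in the text; the constant names are chosen here for illustration and do not come from the MorphoSys design documents.

```python
# Bus-width arithmetic for the FB <-> RC Array interface, using the
# figures stated in the text. All names are illustrative.
RCS_PER_ROW = 8        # RCs in one row or column of the RC Array
OPERANDS_PER_RC = 2    # operands each RC needs per cycle
BITS_PER_OPERAND = 8   # operand width in bits
BITS_PER_OUTPUT = 8    # result width in bits
BUS_WIDTH = 64         # width of each FB <-> RC Array bus in bits

# Bits the FB must supply per cycle, and bits the RC Array writes back.
read_bits = RCS_PER_ROW * OPERANDS_PER_RC * BITS_PER_OPERAND
write_bits = RCS_PER_ROW * 1 * BITS_PER_OUTPUT

# Number of 64-bit buses needed in each direction.
read_buses = read_bits // BUS_WIDTH
write_buses = write_bits // BUS_WIDTH

print(read_bits, read_buses)    # 128 2
print(write_bits, write_buses)  # 64 1
```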

The FB is divided into two separate sets (memory buffers).  This configuration allows the DMA Controller to access one set while the RC Array is accessing the other.  Each set may be accessed by either the DMA Controller or the RC Array, but not by both simultaneously.  Each set is further divided into two banks (Figure 3), each 64 bits wide.  The DMA Controller accesses one bank at a time, while the RC Array accesses both banks within the same set at the same time.  Thus, to keep pace, the DMA Controller must perform at least two FB accesses for every access made by the RC Array.  Fortunately, this stipulation does not degrade the performance of the RC Array, because many typical applications require the RC Array to perform several operations on the same set of data before the desired result is obtained (the DMA Controller can fill one set of the FB before the RC Array needs another set of data).
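The two-set, two-bank organization can be pictured with a small behavioral model. This is a sketch for illustration only, not the actual hardware interface: the class and method names, the explicit swap() call, and the number of words per bank are all assumptions made here.

```python
class FrameBuffer:
    """Toy model of the FB: two sets, each with two 64-bit-wide banks."""

    def __init__(self, words_per_bank):
        # sets[s][b] holds the 64-bit words of set s, bank b.
        self.sets = [[[0] * words_per_bank for _ in range(2)] for _ in range(2)]
        # Each client owns one set at a time, never the same one.
        self.owner = {"dma": 0, "rc": 1}

    def swap(self):
        """Exchange the sets between the DMA Controller and the RC Array
        (hypothetical operation; the text does not say how the switch happens)."""
        self.owner["dma"], self.owner["rc"] = self.owner["rc"], self.owner["dma"]

    def dma_write(self, bank, addr, word64):
        """DMA side: one bank (64 bits) per access."""
        self.sets[self.owner["dma"]][bank][addr] = word64 & (2**64 - 1)

    def rc_read(self, addr):
        """RC Array side: both banks of its set (2 x 64 bits) per access."""
        banks = self.sets[self.owner["rc"]]
        return banks[0][addr], banks[1][addr]


fb = FrameBuffer(words_per_bank=4)
# The DMA Controller needs two accesses, one per bank, to fill one location.
fb.dma_write(bank=0, addr=0, word64=0x1111)
fb.dma_write(bank=1, addr=0, word64=0x2222)
fb.swap()                      # the RC Array now owns the freshly filled set
operands = fb.rc_read(addr=0)  # a single access returns both banks
```

In this model the DMA Controller makes two accesses for every one made by the RC Array, which is the rate mismatch discussed above.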

Figure 3
FB Internal Block Diagram

 

Direct Memory Access Controller:
The DMA Controller acts as the interface between the main memory of the processor and the FB and RC Array modules.  The data bus between the DMA Controller and the FB is a 64-bit read-write bus, while the data bus between the DMA Controller and the RC Array is 32 bits wide (Figure 2).  Since the data bus to and from memory is 32 bits wide, the DMA Controller needs two cycles to assemble 64 bits of data from the memory for the FB, and one cycle to assemble the 32-bit data for the RC Array.
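The packing step and the resulting cycle counts can be sketched as follows. The function names are hypothetical, and the choice to place the first fetched word in the upper half of the 64-bit word is an assumption; the text does not specify the ordering.

```python
MASK32 = 0xFFFFFFFF


def pack64(first_word, second_word):
    """Combine two 32-bit memory words into one 64-bit FB word.

    Assumption: the first word fetched lands in the upper 32 bits.
    """
    return ((first_word & MASK32) << 32) | (second_word & MASK32)


def memory_cycles(fb_words=0, rc_words=0):
    """Memory-bus cycles needed over the 32-bit memory bus:
    two per 64-bit FB word, one per 32-bit RC Array word."""
    return 2 * fb_words + rc_words


packed = pack64(0xDEADBEEF, 0x01234567)
print(hex(packed))                 # 0xdeadbeef01234567
print(memory_cycles(fb_words=8))   # 16
```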

The DMA Controller has three main components: the Data Packing Register, the Address Generator Unit, and the State Controller.  The Data Packing Register assembles the 64 bits of data for the FB.  The Address Generator Unit generates and tracks addresses for the memory, FB, and RC Array.  The State Controller receives information from the Tiny RISC processor and determines the subsequent sequence of data transfers to and from the FB and RC Array.  The amount of data transferred is specified by the information from Tiny RISC.
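As a rough illustration of how address tracking and data packing might interleave during a load from memory into the FB, the sketch below lists the addresses touched on each memory cycle. It is a simplified model under stated assumptions: memory is addressed in 32-bit words, the FB in 64-bit words, addresses advance sequentially, and the function name plan_fb_load is invented here.

```python
def plan_fb_load(mem_base, fb_base, n_fb_words):
    """Return one (mem_addr, fb_addr) pair per memory cycle.

    fb_addr is None on cycles where the Data Packing Register holds
    only half of a 64-bit word and nothing is written to the FB yet.
    Assumes word-addressed memory and sequential addresses.
    """
    steps = []
    for i in range(n_fb_words):
        first = mem_base + 2 * i
        # Cycle 1: fetch the first 32-bit half; no FB write yet.
        steps.append((first, None))
        # Cycle 2: fetch the second half; the packed word goes to the FB.
        steps.append((first + 1, fb_base + i))
    return steps


# Transfer size as it might be specified by Tiny RISC: two 64-bit words.
steps = plan_fb_load(mem_base=100, fb_base=0, n_fb_words=2)
print(steps)  # [(100, None), (101, 0), (102, None), (103, 1)]
```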

Figure 4
RC Array - 8x8 array of RCs


Reconfigurable Cell Array:
The RC Array is a Single-Instruction Multiple-Data (SIMD) multiprocessor.  It consists of an 8x8 array (Figure 4) of processing units called RCs.  The array is row or column reconfigurable, meaning that a whole row or column can be reconfigured at the same time, with the same context across all eight cells.  Sharing one context, the cells of a row or column execute the same instruction on different data, which makes each row or column a SIMD multiprocessor.  Each RC stores a copy of its current context in its Context Register, which is internal to the RC and separate from the Context Memory.  Individual RCs can also be reconfigured.  The power of the RC Array lies in the fact that, depending on a specific application's needs, it can be configured to be a row or a column of eight multi


U-Tee Cheah - The MorphoSys Project: Dynamically Configurable...