Computing with memory: Difference between revisions

Added {{lead rewrite}} and {{technical}} tags to article (TW)
Yobot (talk | contribs)
m clean up, References after punctuation per WP:REFPUNC and WP:PAIC using AWB (8434)
<!-- Deleted image removed: [[Image:Memory Logic Block.png|thumb|right|alt=Time-multiplexed execution of mapped application using embedded memory blocks .|Functional block diagram of Memory Based Computation.]] -->
 
Computing with memory platforms are typically used to provide the benefit of hardware reconfigurability. Reconfigurable computing platforms offer advantages in terms of reduced design cost, early time-to-market, rapid prototyping and easily customizable hardware systems. FPGAs are a popular reconfigurable computing platform for implementing digital circuits. They follow a purely spatial computing model. Since their inception in 1985, the basic structure of FPGAs has continued to consist of a two-dimensional array of configurable logic blocks (CLBs) and a programmable interconnect matrix.<ref name="Ref 1">K. Compton and S. Hauck, "Reconfigurable Computing: A Survey of Systems and Software", ACM Computing Surveys, Vol. 34, No. 2, June 2002.</ref> FPGA performance and power dissipation are largely dominated by the elaborate programmable interconnect (PI) architecture.<ref name="Ref 2">S.M. Trimberger, "Field Programmable Gate Array Technology", Norwell, MA: Kluwer, 1994.</ref><ref name="Ref 3">A. Rahman, S. Das, A.P. Chandrakasan, R. Reif, "Wiring Requirement and Three-Dimensional Integration Technology for Field Programmable Gate Arrays", IEEE Trans. on Very Large Scale Integration Systems, Vol. 11, No. 1, February 2003.</ref> An effective way of reducing the impact of the PI architecture in FPGAs is to place small lookup tables (LUTs) in close proximity (referred to as clusters) and to allow intra-cluster communication using local interconnects. Due to the benefits of a clustered FPGA architecture, major FPGA vendors have incorporated it in their commercial products.<ref name="Ref 4">[http://www.xilinx.com Xilinx Corporation]</ref><ref name="Ref 5">[http://www.altera.com Altera Corporation]</ref> Investigations have also been made to reduce the overhead due to PI in fine-grained FPGAs by mapping larger multi-input multi-output LUTs to embedded memory blocks.
Although this approach follows a similar spatial computing model, part of the logic functions is implemented using embedded memory blocks while the remaining part is realized using smaller LUTs.<ref name="Ref 6">J. Cong and S. Xu, "Technology Mapping for FPGAs with Embedded Memory Blocks", Symposium on Field Programmable Gate Arrays, 1998.</ref> Such a heterogeneous mapping can improve area and performance by reducing the contribution of programmable interconnects.
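The idea of mapping a multi-input, multi-output logic function to a memory block can be illustrated with a minimal Python sketch. The function names (<code>build_lut</code>, <code>add2</code>, <code>lookup</code>) and the choice of a 2-bit adder are hypothetical and serve only as an illustration; they are not taken from the cited work. The inputs are packed into an address, and the stored word at that address holds all output bits.

```python
def build_lut(func, n_inputs):
    """Precompute func for every input combination.

    The list index plays the role of the memory address,
    and the stored integer holds all output bits at once.
    """
    return [func(addr) for addr in range(1 << n_inputs)]


def add2(addr):
    """Hypothetical example function: add two 2-bit operands.

    The 4-bit address packs operand a in bits 3..2 and operand b in bits 1..0.
    The result (0..6) fits in 3 output bits.
    """
    a = (addr >> 2) & 0b11
    b = addr & 0b11
    return a + b


# A 4-input, 3-output LUT occupying 16 memory words.
ADD2_LUT = build_lut(add2, 4)


def lookup(lut, a, b):
    """Evaluate the function with a single memory read instead of logic gates."""
    return lut[(a << 2) | b]
```

A single read such as <code>lookup(ADD2_LUT, 2, 1)</code> replaces the network of small LUTs and interconnect that a purely fine-grained mapping would need for the same function, at the cost of a table whose size grows exponentially with the number of inputs.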
 
Contrary to the purely spatial computing model of FPGAs, a reconfigurable computing platform that employs a temporal computing model (or a combination of temporal and spatial) has also been investigated in the context of improving performance and energy over conventional FPGAs.<ref name="Ref 7">S. Paul and S. Bhunia, "Reconfigurable Computing Using Content Addressable Memory for Improved Performance and Resource Usage", Design Automation Conference, 2008.</ref><ref name="Ref 8">S. Paul, S. Chatterjee, S. Mukhopadhyay and S. Bhunia, "Nanoscale Reconfigurable Computing Using Non-Volatile 2-D STTRAM Array", International Conference on Nanotechnology, 2009.</ref> These platforms, referred to as Memory Based Computing (MBC), use a dense two-dimensional memory array to store the LUTs. Such frameworks rely on breaking a complex function (''f'') into small sub-functions; representing the sub-functions as multi-input, multi-output LUTs in the memory array; and evaluating the function ''f'' over multiple cycles. MBC can leverage the high density, low power and high performance advantages of nanoscale memory.<ref name="Ref 8"/> [[:Image:Memory Logic Block.png]] shows the high-level block diagram of MBC. Each computing element incorporates a two-dimensional memory array for storing LUTs, a small controller for sequencing the evaluation of sub-functions and a set of temporary registers to hold the intermediate outputs from individual partitions. A fast, local routing framework inside each computing block generates the address for LUT access. Multiple such computing elements can be spatially connected using an FPGA-like programmable interconnect architecture to enable mapping of large functions. The local time-multiplexed execution inside the computing elements can drastically reduce the requirement for programmable interconnects, leading to a large improvement in energy-delay product and better scalability of performance across technology generations. The memory array inside each computing element can be realized with [[Content-addressable memory]] (CAM) to drastically reduce the memory requirement for certain applications.<ref name="Ref 7"/>
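The time-multiplexed evaluation described above can be sketched in Python under simplifying assumptions. Here the function ''f'' is taken to be a 4-bit addition, partitioned into two 2-bit adder slices with carry; the names <code>MEMORY</code>, <code>adder_slice</code> and <code>evaluate</code>, the partitioning, and the address layout are all hypothetical and are not drawn from the cited papers. The loop stands in for the controller, the variable <code>carry</code> for a temporary register, and the address computation for the local routing framework.

```python
N_PARTITIONS = 2   # assumed number of sub-functions that f is broken into
SLICE_BITS = 2     # assumed operand width handled by each sub-function


def adder_slice(addr):
    """Sub-function stored as a LUT: 5-bit address -> 3-bit word.

    Address layout (an assumption): bit 4 = carry in, bits 3..2 = a, bits 1..0 = b.
    Word layout: bit 2 = carry out, bits 1..0 = sum.
    """
    cin = (addr >> 4) & 1
    a = (addr >> 2) & 0b11
    b = addr & 0b11
    return a + b + cin


# Two-dimensional memory array: one row of 32 words per partition.
MEMORY = [[adder_slice(addr) for addr in range(32)] for _ in range(N_PARTITIONS)]


def evaluate(a, b):
    """Evaluate f(a, b) = a + b for 4-bit operands, one partition per cycle."""
    carry = 0    # temporary register holding the intermediate output
    result = 0
    for cycle in range(N_PARTITIONS):            # controller sequencing the sub-functions
        a_bits = (a >> (SLICE_BITS * cycle)) & 0b11
        b_bits = (b >> (SLICE_BITS * cycle)) & 0b11
        # Address generation from primary inputs and the temporary register.
        address = (carry << 4) | (a_bits << 2) | b_bits
        word = MEMORY[cycle][address]            # one LUT access per cycle
        result |= (word & 0b11) << (SLICE_BITS * cycle)
        carry = word >> 2
    return result | (carry << (SLICE_BITS * N_PARTITIONS))
```

In this sketch the intermediate carry never leaves the computing element, which is the property that lets time-multiplexed execution reduce the demand on programmable interconnect. Both partitions happen to store the same table here; in general each row of the memory array would hold a different sub-function.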
 
==See also==
[[Category:Computer engineering]]
[[Category:Models of computation]]
 
[[ru:Вычисления с памятью]]