Using GALS Interconnect for Single-Clock-Cycle SoC Design

Andrew Lines


Abstract

This presentation will explore how globally asynchronous, locally synchronous (GALS) interconnect techniques help SoC designers distribute the clock signal across an SoC in one clock period. GALS makes use of an asynchronous crossbar and lets designers bridge the varying clock domains on an SoC without adding timing margin that can impact chip performance. The presentation will detail how to implement the GALS interconnect architecture using Fulcrum’s Nexus crossbar switching technology. With a total capacity of 1.6 Tbps, the Nexus crossbar switching logic is designed for high-speed on-chip communications. Measuring less than 2 mm², the crossbar takes full advantage of our asynchronous design style to connect 16 full-duplex ports in a non-blocking manner using 36-bit data paths of asynchronous flow-controlled channels.

Data and control, in the form of 36-bit words, are transferred at 1.4 GHz; round-trip latency through the crossbar is under 3 ns. Each port supports 50 Gbps duplex, and the chip is non-blocking up to 800 Gbps. Fair arbitration is implemented to handle contention on output ports.

Power scales linearly with bandwidth, from a few mW to 4 W. This compares favorably with crossbar fabric chips and other SoC interconnect solutions on the market, which have significantly less bandwidth, substantially higher power, and longer latencies.

Die photos and comparative performance results will be given, and we will discuss on-chip interconnect applications. Nexus opens many possibilities for advancing multiprocessor designs by providing low-latency peer-to-peer interconnection of embedded processors, their shared memories, and high-speed interfaces, and by transparently bridging between clock domains.
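The bandwidth figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that all 36 bits of each word count toward the quoted per-port rate, and that the 1.6 Tbps total counts both directions of all 16 full-duplex ports. Neither assumption is stated in the abstract, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the quoted Nexus bandwidth figures.
# Assumption (not stated in the abstract): every bit of a 36-bit word
# counts toward the per-port rate, and "total capacity" sums both
# directions of all 16 full-duplex ports.

WORD_BITS = 36        # width of each data/control word
WORD_RATE_GHZ = 1.4   # words transferred per nanosecond
PORTS = 16            # full-duplex ports on the crossbar


def port_gbps(word_bits=WORD_BITS, rate_ghz=WORD_RATE_GHZ):
    """Raw rate of one port in one direction, in Gbps."""
    return word_bits * rate_ghz


def one_way_gbps(ports=PORTS):
    """Aggregate rate of all ports in one direction, in Gbps."""
    return ports * port_gbps()


def duplex_tbps(ports=PORTS):
    """Aggregate rate counting both directions, in Tbps."""
    return 2 * one_way_gbps(ports) / 1000.0


if __name__ == "__main__":
    print(f"per port, one direction: {port_gbps():.1f} Gbps")
    print(f"all ports, one direction: {one_way_gbps():.1f} Gbps")
    print(f"all ports, both directions: {duplex_tbps():.2f} Tbps")
```

Under these assumptions the results are about 50.4 Gbps per port per direction, about 806 Gbps across all ports in one direction, and about 1.61 Tbps counting both directions, which is consistent with the rounded figures of 50 Gbps, 800 Gbps, and 1.6 Tbps.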

Keywords:

Low-power networking
Gb/sec and Tb/sec switching and routing technologies