Introduction
With the popularity of mobile wireless devices soaring, the wireless communication market continues to grow rapidly. This growth brings a significant challenge. Many applications, such as digital video, need new wireless communication algorithms that support high data rates. The continuous evolution of wireless specifications is steadily widening the gap between algorithmic innovation and hardware implementation. In addition, low power consumption is now a critical design issue, since battery life is a key differentiator among consumer mobile devices. The chip designer's most important task is to implement highly complex algorithms in hardware as quickly as possible, while still retaining power efficiency. High Level Synthesis (HLS) methodology has been widely adopted to meet this challenge. This article gives an example in which an HLS tool is used, together with architectural innovation, to create a low power LDPC decoder.
High Level Synthesis Methodology
HLS methodology allows the hardware design to be completed at a higher level of abstraction, such as an algorithmic description in C/C++. This provides significant time and cost savings, and paves the way for designers to handle complex designs quickly and efficiently, producing results that compare favorably with hand design.
HLS tools also offer features aimed specifically at power optimization. In any design, the largest opportunities for power reduction are at the system and architecture levels. HLS can make a significant contribution to power reduction at the architecture level, specifically by offering the following:
Ease of architecture and micro-architecture exploration
Ease of frequency and voltage exploration
Use of high level power reduction techniques such as multi-level clock gating, which are time-consuming and error-prone when applied manually at RTL
By comparison, power-saving opportunities at the RTL and gate levels are limited and have a much smaller impact on total power consumption.
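To see why architecture, frequency, and voltage exploration matter so much, it helps to recall the standard first-order model of CMOS dynamic power, in which power scales linearly with switched capacitance and clock frequency but quadratically with supply voltage. The sketch below is only an illustration of that model; the function name and all numbers used with it are assumptions, not figures from this article or from any particular HLS tool.

```cpp
// First-order CMOS dynamic power model (an illustrative assumption, not a
// tool output): P = activity * C * Vdd^2 * f.
// Leakage and short-circuit power are deliberately ignored.
double dynamic_power(double activity, double cap_farads,
                     double vdd_volts, double freq_hz) {
    return activity * cap_farads * vdd_volts * vdd_volts * freq_hz;
}
```

As a hypothetical example, suppose a datapath is duplicated so that the switched capacitance doubles, the clock is halved to keep the same throughput, and the relaxed timing allows the supply to drop from 1.0 V to 0.8 V. The model then gives 2 x 0.64 x 0.5 = 0.64 of the baseline dynamic power. Whether such a voltage reduction is actually available depends on the process and the design, which is exactly the kind of trade-off that is easier to explore at the architecture level.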
Low-Density Parity-Check Decoders
Forward Error Correction (FEC) coding, a core technology in wireless communications, has already advanced from 2G convolutional/block codes to more powerful 3G Turbo codes. Recently, designers have been looking elsewhere for help with the more complex 4G systems. A Low-Density Parity-Check (LDPC) coding scheme is an attractive proposition for these systems, because of its excellent error correction performance and highly parallel decoding scheme.
Nevertheless, it is a major challenge for any designer to quickly and efficiently create a high performance LDPC decoder that also meets the data rate and power consumption constraints of wireless handsets.
LDPC decoders vary significantly in their levels of parallelism, which range from fully parallel to partially parallel to fully sequential. A fully parallel decoder requires a large amount of hardware resources. Moreover, it hard-wires the entire parity-check matrix into hardware, and therefore can only support one particular LDPC code. This makes it impractical to implement in a wireless system-on-a-chip (SoC), because different or multiple LDPC codes might eventually need to be supported. Partially parallel architectures can achieve high throughput decoding at reduced hardware complexity. However, the level of parallelism in these architectures has to be at the sub-circulant (shifted identity matrix) level, which makes them code-specific as well and therefore potentially too inflexible for a wireless SoC.
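The role of the sub-circulant can be made concrete with a small sketch. In structured LDPC codes of this kind, the parity-check matrix is built from Z x Z blocks, each either all-zero or an identity matrix cyclically shifted by some amount. The toy shift table, the block size Z, and the function name below are hypothetical and chosen only for illustration; the shift convention (row r of a block shifted by s has its single 1 in column (r + s) mod Z) is also an assumption, since conventions differ between standards.

```cpp
#include <vector>

// Hypothetical toy code: 2 block rows x 4 block columns, circulant size Z = 4.
// Each entry is the cyclic shift of a Z x Z identity block; -1 marks an
// all-zero block. These values are illustrative, not from any standard.
constexpr int Z = 4;
constexpr int kBlockRows = 2;
constexpr int kBlockCols = 4;
constexpr int kShift[kBlockRows][kBlockCols] = {
    {0, 1, -1, 2},
    {3, -1, 0, 1},
};

// Returns true when every parity check is satisfied, i.e. H * c^T = 0 over GF(2).
// bits must hold kBlockCols * Z hard-decision values (0 or 1).
bool syndrome_is_zero(const std::vector<int>& bits) {
    for (int br = 0; br < kBlockRows; ++br) {
        // The Z checks inside one block row touch disjoint bits of each
        // block column, so they can be evaluated independently.
        for (int r = 0; r < Z; ++r) {
            int parity = 0;
            for (int bc = 0; bc < kBlockCols; ++bc) {
                const int s = kShift[br][bc];
                if (s < 0) continue;  // all-zero block contributes nothing
                // Assumed convention: the 1 in row r sits in column (r + s) mod Z.
                parity ^= bits[bc * Z + (r + s) % Z] & 1;
            }
            if (parity != 0) return false;
        }
    }
    return true;
}
```

Because the Z checks within a block row are independent, Z is the natural unit of parallelism for a partially parallel decoder. That is also why such hardware tends to be tied to a particular block size and shift table, which is the inflexibility described above.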
This article explores the design space of scalable parallel realizations of LDPC decoders using an HLS methodology. Under the guidance of the designer, an HLS tool can effectively exploit the parallelism of a given algorithm. The article demonstrates how two scalable parallel LDPC decoding algorithms can be implemented by the HLS tool to produce area- and power-efficient hardware.
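As a rough illustration of what a scalable parallel description might look like in C++, the sketch below uses the min-sum check-node update, a widely used simplification of LDPC message passing. It is not necessarily one of the two algorithms this article implements. The names kDegree, Msgs, check_node_update, and update_checks, and the parallelism parameter P, are all hypothetical. No tool-specific directives are shown; the assumption is simply that an HLS tool can unroll the inner loop over P because its iterations are independent.

```cpp
#include <algorithm>
#include <array>
#include <cstdlib>
#include <limits>

constexpr int kDegree = 4;  // hypothetical check-node degree (must be >= 2)
using Msgs = std::array<int, kDegree>;

// Min-sum update for one check node: each outgoing message takes the product
// of the signs and the minimum of the magnitudes of all OTHER incoming messages.
Msgs check_node_update(const Msgs& in) {
    Msgs out{};
    for (int i = 0; i < kDegree; ++i) {
        int sign = 1;
        int min_mag = std::numeric_limits<int>::max();
        for (int j = 0; j < kDegree; ++j) {
            if (j == i) continue;  // exclude the edge being updated
            if (in[j] < 0) sign = -sign;
            min_mag = std::min(min_mag, std::abs(in[j]));
        }
        out[i] = sign * min_mag;
    }
    return out;
}

// Processes num_checks check nodes in groups of P. The P iterations of the
// inner loop have no dependencies on one another, so they are candidates for
// unrolling into P parallel hardware units.
template <int P>
void update_checks(const Msgs* in, Msgs* out, int num_checks) {
    for (int base = 0; base < num_checks; base += P) {
        for (int p = 0; p < P; ++p) {
            if (base + p < num_checks) {
                out[base + p] = check_node_update(in[base + p]);
            }
        }
    }
}
```

Changing P trades area for throughput without touching the algorithm itself: update_checks<1> describes a sequential datapath, while a larger P describes a wider one that produces identical results.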