An area-efficient minimum-time FFT schedule using single-ported memory
Article 2013 English
Authors
SR
Stephen Richardson
OS
Ofer Shacham
DM
Dejan Marković
Abstract
1 min read
FFT design requires an exhaustive recoupling of data across successive stages of computation. The resulting memory access patterns have constantly-changing strides, making it hard to interleave the data for reliable conflict-free access of operand pairs. We modify an existing method of "swizzling" data locations so as to guarantee conflict-free access within any given stage and, with minimal support for buffering, we provide conflict-free access across the boundaries of adjoining stages as well. As a result, implementations that would naively require either a fully-associative, or at the very least a multiported register file, can be implemented using four single-ported banks of memory per butterfly unit, plus one bypass buffer. Because fewer ports means less area, and given that a butterfly must read two inputs and write two results for each cycle of operation, this solution should represent the least-area memory configuration for a resource-constrained FFT. Using this scheme, we show examples including a minimal one-butterfly FFT having 9% less area versus a competing equal-performance design and 20% better throughput versus a competing equal-area design.
Discussion(0)
No comments yet. Be the first to comment.