Motion estimation plays a key role in video coding (e.g., video telephony, MPEG, HDTV). Among existing motion estimation algorithms, full-search block matching algorithms (BMAs) are preferred because of their simplicity and lower control overhead when implemented on VLSI array processors. Previous full-search BMAs have considered one block match at a time. However, the search areas of adjacent template blocks share data. Therefore, processing adjacent template blocks in parallel reduces the data memory accesses for the shared data. In this paper, we propose a new data flow scheme for an efficient, systolic, full-search BMA on programmable array processors that processes as many adjacent template blocks as possible in unison in order to reduce data memory accesses. We present an efficient implementation of the BMA on the Micro Grained Array Processor (MGAP), a fine-grained, mesh-connected, programmable VLSI array processor being developed at Penn State University. As a result, the BMA for the MPEG SIF video format (352×240 pixels), with a block size of 16×16 pixels, a displacement range of 16 pixels, and a frame rate of 30 frames/sec, can be computed at a real-time processing rate on the MGAP.
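To illustrate the idea of full-search block matching and the data shared between adjacent search areas, the following is a minimal sequential sketch in Python, not the systolic MGAP data flow proposed in the paper. It assumes a sum-of-absolute-differences (SAD) matching criterion and a symmetric displacement range of ±p pixels, neither of which the abstract specifies. The function names `sad`, `full_search`, and `search_area_overlap` are hypothetical.

```python
def sad(template, frame, top, left):
    """Sum of absolute differences between the template and the
    equally sized frame block whose top-left corner is (top, left)."""
    n = len(template)
    return sum(
        abs(template[r][c] - frame[top + r][left + c])
        for r in range(n)
        for c in range(n)
    )


def full_search(template, frame, top, left, p):
    """Test every displacement (dy, dx) with |dy|, |dx| <= p around
    (top, left) and return (best_cost, dy, dx).

    Candidates that would fall outside the frame are skipped
    (an assumption; border handling is not described in the abstract).
    """
    n = len(template)
    height, width = len(frame), len(frame[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > height or x + n > width:
                continue
            cost = sad(template, frame, y, x)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


def search_area_overlap(n, p):
    """Return (search_width, shared_columns) for two horizontally
    adjacent n-by-n template blocks with displacement range +/- p.

    Each search area is n + 2p pixels wide, and the two areas are
    offset by n pixels, so they share 2p columns.
    """
    search_width = n + 2 * p
    shared_columns = search_width - n
    return search_width, shared_columns
```

Under these assumptions, `search_area_overlap(16, 16)` gives a 48-pixel-wide search area of which 32 columns are shared with the horizontally adjacent block's search area. This is the redundancy that processing adjacent template blocks in unison is meant to exploit.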
|Original language||English (US)|
|Number of pages||10|
|Journal||International Conference on Application-Specific Systems, Architectures and Processors, Proceedings|
|State||Published - 1995|
All Science Journal Classification (ASJC) codes
- Hardware and Architecture