Abstract
Despite the speed-up of PC technology over the years, real-time performance
of video processing in medical X-ray procedures continues to be an issue,
as the size and number of concurrent data streams are increasing steadily. Since
computing power evolves faster than memory technology, there is increasing
pressure to use the off-chip memory bandwidth efficiently. Additionally, as
a multitude of video functions is carried out in parallel, the memory-bandwidth
problem is further stressed. In this paper, we present an architecture study for
performance prediction and optimization of medical X-ray video processing on
multiple cores. Careful modeling of the critical stages of the architecture reveals
the bottlenecks in detail. Model descriptions for the video-processing algorithms
are inserted into the architecture model, making explicit where data and
functions need to be partitioned to obtain higher throughput. For the application
under study, we propose a scheme that combines 2-level data partitioning with
functional partitioning and results in a bandwidth and latency reduction of 40-70%
compared to straightforward implementations.
Original language | English |
---|---|
Title of host publication | Proceedings High performance embedded architectures and compilers : fourth international conference, HiPEAC 2009, Paphos, Cyprus, January 25-28, 2009 |
Editors | André Seznec, Joel Emer, Michael O'Boyle |
Place of Publication | Berlin |
Publisher | Springer |
Pages | 1-12 |
ISBN (Print) | 978-3-540-92989-5 |
Publication status | Published - 2009 |