Tremendous amounts of data are processed by relatively small and cheap processors nowadays. Consider the radio waves arriving at a cell phone’s antenna or the light intensities on a DVD player’s lens: these are digitized into a stream of bits, which is then filtered, demodulated and decoded into a clear audio or video stream fit for the human senses. My research is about these so-called stream processing systems, i.e., systems that perform relatively simple computations on a large stream of data. Research into these systems is necessary because their data rates have been growing faster than the speed of our processors. If this trend continues, we will face a shortage of processing power.

The solution to this problem is parallelism. By using multiple processing elements it is possible to achieve the same data rate, or throughput, as with one very fast processor. Ideally the throughput obtained in this manner is proportional to the number of processing elements, i.e., the throughput scales linearly with the amount of hardware used. Linear scalability makes it possible to avoid the shortage of processing power: a higher throughput can be achieved by simply using more, rather than faster, hardware.

Is linear scalability achievable in every case? Amdahl’s law, also known as the law of diminishing returns, seems to indicate that linear scalability is not possible for systems in which even a small fraction of the computation is sequential. This would suggest that linear scalability is out of reach for many stream processing systems, since many of them contain some form of sequential computation; consider recursive filters or Viterbi decoders, for example. It turns out, however, that there is no theoretical limit on the throughput of any stream processing system: any throughput can be obtained by using block processing algorithms.
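To make the Amdahl's law argument concrete, the sketch below evaluates the standard speedup formula S(n) = 1 / (s + (1 - s)/n) for a program with sequential fraction s on n processing elements; this is a minimal illustration of the law as commonly stated, not an analysis from the thesis itself.

```python
def amdahl_speedup(seq_fraction, n_processors):
    """Speedup predicted by Amdahl's law for a program whose
    fraction `seq_fraction` of the work is inherently sequential,
    run on `n_processors` processing elements."""
    return 1.0 / (seq_fraction + (1.0 - seq_fraction) / n_processors)

# Even a 5% sequential fraction caps the speedup at 1/0.05 = 20x,
# no matter how many processing elements are used:
print(amdahl_speedup(0.05, 10))       # roughly 6.9
print(amdahl_speedup(0.05, 100000))   # approaches, but never reaches, 20
```

The bound 1/s is exactly what block processing sidesteps: by restructuring the computation itself, the sequential fraction is not taken as fixed.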
Block processing algorithms process, as their name implies, blocks of data instead of single data elements. A block processing algorithm achieves linear scalability by correctly balancing the block size, the computation time per block and the number of processing elements needed for implementation. This thesis describes existing and new methods for devising such algorithms. The few existing methods that offer linear scalability produce algorithms that must be implemented in dedicated hardware; the new methods, however, are also suited for implementation on SIMD processors. In this thesis I present the theory and design methods required to design a block processing algorithm with linear scalability for any stream processing system. Additionally, I show how these methods benefit the implementation of various specific systems, such as the moving sum, recursive filters and rank order filters. All of these can be implemented on a SIMD processor, like the NXP Embedded Vector Processor (EVP), such that the throughput scales linearly with the number of processing elements.
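As a toy illustration of the block processing idea applied to one of the named systems, the sketch below computes a moving sum per block: each block of outputs depends only on its own slice of the input (plus a small overlap of w - 1 samples), so the blocks are mutually independent and could be assigned to separate processing elements or SIMD lanes. This is an assumed minimal formulation for exposition, not the thesis's design method, and the function names are mine.

```python
import numpy as np

def moving_sum(x, w):
    """Reference implementation: sliding-window sum of width w over x."""
    return np.convolve(x, np.ones(w, dtype=x.dtype), mode="valid")

def blocked_moving_sum(x, w, block):
    """Block-processing moving sum: each group of `block` outputs is
    computed independently from its own input slice (plus w-1 samples
    of overlap), so the loop iterations are parallelizable."""
    n_out = len(x) - w + 1
    out = np.empty(n_out, dtype=x.dtype)
    for start in range(0, n_out, block):   # iterations are independent
        stop = min(start + block, n_out)
        chunk = x[start:stop + w - 1]      # block inputs incl. overlap
        out[start:stop] = moving_sum(chunk, w)
    return out
```

Because the per-block work is constant and the blocks share no state, doubling the number of processing elements halves the time per input sample, which is the linear throughput scaling the abstract refers to.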
Qualification: Doctor of Philosophy
Award date: 4 Sep 2008
Place of Publication: Eindhoven
Publication status: Published - 2008