Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture

M. Pastrnak

Research output: Thesis, Phd Thesis 1 (Research TU/e / Graduation TU/e), Academic

Abstract

The introduction of Arbitrary-Shaped (AS) Video Objects (VO) in the MPEG-4 coding standard has enabled various applications using both natural and synthetic composition of video scenes. The work presented in this thesis aims at realizing an embedded-systems design involving the mapping of this type of application onto a multiprocessor platform, such as a Network-on-Chip (NoC). The research has focused on the upper design layers, dealing with the applications and their control for efficient execution. The aspects addressed for the mapping are performance modeling of the MPEG-4 decoding, granularity optimization of the algorithm, introduction of task-level scalability, and control of the quality of the applications by a Quality-of-Service (QoS) manager. The AS VO MPEG-4 decoding algorithm comprises the conventional DCT coding techniques from MPEG-1/2, extended with the coding of object shapes and with specific processing that improves the picture quality of object borders by employing padding and block-based filtering. At the system level, AS VO MPEG-4 coding allows the designer to think in individual planes and objects that together compose the scene. The target platform for such an application should be able to handle the features of MPEG-4 coding: the combination of high-level control-driven operations and streaming-oriented processing at the video-data level. The platform features a tile-based computing network, in which each tile is separated from the network by buffered communication. This allows multiple instantiations of object decoding, each having its own dynamic behavior. The Synchronous Data Flow (SDF) graph is a traditional model of computation for multimedia applications mapped onto a multiprocessor system. However, SDF cannot cope with the dynamic behavior of object-based video. Therefore, this research has extended SDF with a linear parametrical model of the required computation resources.
The model is based on the coding parameters of the input stream (BAB type of the block, number of non-transparent sub-blocks, number of AC coefficients coded by an ESC code, etc.) and on weighting coefficients that depend on the target processor architecture. Similarly, the thesis proposes a parametrical model for the communication resources. The resulting parametrical timing model was found to deviate by about 5% from the real execution on an Æthereal NoC with ARM7 cores. A comparison with the commonly used worst-case approach for communication resource allocation revealed that the parametrical model reduces the required resources by a factor of 2.5. For more efficient system control, the thesis presents a hierarchical Quality-of-Service (QoS) concept in combination with a scalable MPEG-4 decoder. To serve scalable execution, we have classified the tasks involved in AS VO MPEG-4 decoding into two classes. The first class contains essential tasks that cannot be skipped, while the second class contains the enhancement functions. Scalability of AS VO MPEG-4 decoding was obtained by enabling or disabling the optional functions of the non-essential tasks alongside the essential tasks. The resource distribution is controlled by hierarchical QoS management, which is based on two QoS managers. In our experimental implementation, the Local QoS provides the estimation of the resource usage of an application and monitors the real execution. The Global QoS selects the best quality levels of the active applications and reserves resources for each application. The key contribution of our work on QoS is the design of a heuristic algorithm that searches for suitable combinations of quality levels for individual jobs, so that a set of jobs can be mapped onto the available resources. To further improve the efficiency of the mapping, we have distinguished reservation-based QoS control and, as an addition on top of it, best-effort computing.
This combination was studied for controlling the bandwidth of the communication resources. The reservation-based approach guarantees that the video object will always be decoded at least at the lowest quality level, while the best-effort computing improves the quality by using resources as far as they are available, as controlled by the Global QoS. The complete system was experimentally verified with a network of eight ARM processor cores, using an MPEG-4 Video Object decoder at the ACE profile and at CCIR-601 resolution. The proposed framework showed that adaptation at finer granularity, e.g. at the VOP level within a GOV, significantly improves the image quality (provided that resources are constrained). The mapping exploration of AS VO MPEG-4 decoding for execution on an NoC addresses a general case of running modern multimedia applications, because of the variability and dynamics of tasks. It has been shown that parametrical models help in planning the execution, and that QoS management and best-effort computing clearly improve the efficiency of multiple tasks executed in parallel.
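
The abstract does not give the actual model coefficients or the heuristic itself, so the following is only a minimal Python sketch of the two ideas it describes: a linear parametrical resource estimate, and a quality-level selection that first reserves the lowest level for every job and then spends leftover resources on upgrades. All names (`estimate_cycles`, `Job`, `select_quality_levels`), the parameter keys, and the greedy cheapest-upgrade rule are assumptions made for illustration, not the method of the thesis.

```python
from dataclasses import dataclass


def estimate_cycles(params: dict[str, float], weights: dict[str, float],
                    base: float = 0.0) -> float:
    """Linear parametric estimate: base + sum(weight * stream parameter).

    `params` holds per-block coding parameters taken from the input stream;
    `weights` holds processor-dependent coefficients (hypothetical values).
    """
    return base + sum(weights[name] * value for name, value in params.items())


@dataclass
class Job:
    name: str
    level_costs: list[float]  # resource cost per quality level; index 0 is the lowest level


def select_quality_levels(jobs: list[Job], budget: float) -> dict[str, int]:
    """Assumed greedy heuristic, not the algorithm from the thesis."""
    # Reservation step: every job is guaranteed its lowest quality level.
    levels = {job.name: 0 for job in jobs}
    used = sum(job.level_costs[0] for job in jobs)
    if used > budget:
        raise ValueError("lowest quality levels do not fit the budget")
    # Best-effort step: apply the cheapest one-level upgrade that still fits,
    # and stop when no further upgrade fits.
    while True:
        best = None
        for job in jobs:
            nxt = levels[job.name] + 1
            if nxt < len(job.level_costs):
                extra = job.level_costs[nxt] - job.level_costs[nxt - 1]
                if used + extra <= budget and (best is None or extra < best[1]):
                    best = (job, extra)
        if best is None:
            return levels
        job, extra = best
        levels[job.name] += 1
        used += extra
```

In this sketch a Local QoS manager would supply the `level_costs` entries (for instance from `estimate_cycles`), and a Global QoS manager would call `select_quality_levels` with the platform budget. A real system would also need to re-run the selection as the stream parameters change.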
Language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Department of Electrical Engineering
Supervisors/Advisors
  • de With, Peter, Promotor
  • van Meerbergen, Jef, Promotor
Award date: 24 Jan 2008
Place of Publication: Eindhoven
Publisher: Technische Universiteit Eindhoven
Print ISBNs: 978-90-386-1744-2
DOIs: 10.6100/IR632362
State: Published - 2008

Fingerprint

Quality of service
Decoding
Tile
Scalability
Communication
Managers
ARM processors
Data flow graphs
Motion Picture Experts Group standards
Level control
Embedded systems
Systems analysis
Control systems
Planning
Processing

Cite this

Pastrnak, M. (2008). Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture. Eindhoven: Technische Universiteit Eindhoven. DOI: 10.6100/IR632362
Pastrnak, M. / Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture. Eindhoven : Technische Universiteit Eindhoven, 2008. 177 p.
@phdthesis{02db061230824652adbdfb920e014ea1,
title = "Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture",
author = "M. Pastrnak",
year = "2008",
doi = "10.6100/IR632362",
language = "English",
isbn = "978-90-386-1744-2",
publisher = "Technische Universiteit Eindhoven",
school = "Department of Electrical Engineering",

}

Pastrnak, M 2008, 'Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture', Doctor of Philosophy, Department of Electrical Engineering, Eindhoven. DOI: 10.6100/IR632362

Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture. / Pastrnak, M.

Eindhoven : Technische Universiteit Eindhoven, 2008. 177 p.

TY - THES

T1 - Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture

AU - Pastrnak, M.

PY - 2008

Y1 - 2008

U2 - 10.6100/IR632362

DO - 10.6100/IR632362

M3 - Phd Thesis 1 (Research TU/e / Graduation TU/e)

SN - 978-90-386-1744-2

PB - Technische Universiteit Eindhoven

CY - Eindhoven

ER -

Pastrnak M. Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture. Eindhoven: Technische Universiteit Eindhoven, 2008. 177 p. DOI: 10.6100/IR632362