Optimization of video capturing and tone mapping in video camera systems

S.D. Cvetkovic

Research output: ThesisPhd Thesis 2 (Research NOT TU/e / Graduation TU/e)Academic

Abstract

Image enhancement techniques are widely employed in many areas of professional and consumer imaging, machine vision and computational imaging. Image enhancement techniques used in surveillance video cameras are complex systems involving controllable lenses, sensors and advanced signal processing. In surveillance, a high output image quality with very robust and stable operation under difficult imaging conditions are essential, combined with automatic, intelligent camera behavior without user intervention. The key problem discussed in this thesis is to ensure this high quality under all conditions, which specifically addresses the discrepancy of the dynamic range of input scenes and displays. For example, typical challenges are High Dynamic Range (HDR) and low-dynamic range scenes with strong light-dark differences and overall poor visibility of details, respectively. The detailed problem statement is as follows: (1) performing correct and stable image acquisition for video cameras in variable dynamic range environments, and (2) finding the best image processing algorithms to maximize the visualization of all image details without introducing image distortions. Additionally, the solutions should satisfy complexity and cost requirements of typical video surveillance cameras. For image acquisition, we develop optimal image exposure algorithms that use a controlled lens, sensor integration time and camera gain, to maximize SNR. For faster and more stable control of the camera exposure system, we remove nonlinear tone-mapping steps from the level control loop and we derive a parallel control strategy that prevents control delays and compensates for the non-linearity and unknown transfer characteristics of the used lenses. For HDR imaging we adopt exposure bracketing that merges short and long exposed images. To solve the involved non-linear sensor distortions, we apply a non-linear correction function to the distorted sensor signal, implementing a second-order polynomial with coefficients adaptively estimated from the signal itself. The result is a good, dynamically controlled match between the long- and short-exposed image. The robustness of this technique is improved for fluorescent light conditions, preventing serious distortions by luminance flickering and color errors. To prevent image degradation we propose both fluorescent light detection and fluorescence locking, based on measurements of the sensor signal intensity and color errors in the short-exposed image. The use of various filtering steps increases the detector robustness and reliability for scenes with motion and the appearance of other light sources. In the alternative algorithm principle of fluorescence locking, we ensure that light integrated during the short exposure time has a correct intensity and color by synchronizing the exposure measurement to the mains frequency. The second area of research is to maximize visualization of all image details. This is achieved by both global and local tone mapping functions. The largest problem of Global Tone Mapping Functions (GTMF) is that they often significantly deteriorate the image contrast. We have developed a new GTMF and illustrate, both analytically and perceptually, that it exhibits only a limited amount of compression, compared to conventional solutions. Our algorithm splits GTMF into two tasks: (1) compressing HDR images (DRC transfer function) and (2) enhancing the (global) image contrast (CHRE transfer function). The DRC subsystem adapts the HDR video signal to the remainder of the system, which can handle only a fraction of the original dynamic range. Our main contribution is a novel DRC function shape which is adaptive to the image, so that details in the dark image parts are enhanced simultaneously while only moderately compressing details in the bright areas. Also, the DRC function shape is well matched with the sensor noise characteristics in order to limit the noise amplification. Furthermore, we show that the image quality can be significantly improved in DRC compression if a local contrast preservation step is included. The second part of GTMF is a CHRE subsystem that fine-tunes and redistributes the luminance (and color) signal in the image, to optimize global contrast of the scene. The contribution of the proposed CHRE processing is that unlike standard histogram equalization, it can preserve details in statistically unpopulated but visually relevant luminance regions. One of the important cornerstones of the GTMF is that both DRC and CHRE algorithms are performed in the perceptually uniform space and optimized for the salient regions obtained by the improved salient-region detector, to maximize the relevant information transfer to the HVS. The proposed GTMF solution offers a good processing quality, but cannot sufficiently preserve local contrast for extreme HDR signals and it gives limited improvement low-contrast scenes. The local contrast improvement is based on the Locally Adaptive Contrast Enhancement (LACE) algorithm. We contribute by using multi-band frequency decomposition, to set up the complete enhancement system. Four key problems occur with real-time LACE processing: (1) "halo" artifacts, (2) clipping of the enhancement signal, (3) noise degradation and (4) the overall system complexity. "Halo" artifacts are eliminated by a new contrast gain specification using local energy and contrast measurements. This solution has a low complexity and offers excellent performance in terms of higher contrast and visually appealing performance. Algorithms preventing clipping of the output signal and reducing noise amplification give a further enhancement. We have added a supplementary discussion on executing LACE in the logarithmic domain, where we have derived a new contrast gain function solving LACE problems efficiently. For the best results, we have found that LACE processing should be performed in the logarithmic domain for standard and HDR images, and in the linear domain for low-contrast images. Finally, the complexity of the contrast gain calculation is reduced by a new local energy metric, which can be calculated efficiently in a 2D-separable fashion. Besides the complexity benefit, the proposed energy metric gives better performance compared to the conventional metrics. The conclusions of our work are summarized as follows. For acquisition, we need to combine an optimal exposure algorithm, giving both improved dynamic performance and maximum image contrast/SNR, with robust exposure bracketing that can handle difficult conditions such as fluorescent lighting. For optimizing visibility of details in the scene, we have split the GTMF in two parts, DRC and CHRE, so that a controlled optimization can be performed offering less contrast compression and detail loss than in the conventional case. Local contrast is enhanced with the known LACE algorithm, but the performance is significantly improved by individually addressing "halo" artifacts, signal clipping and noise degradation. We provide artifact reduction by new contrast gain function based on local energy, contrast measurements and noise estimation. Besides the above arguments, we have contributed feasible performance metrics and listed ample practical evidence of the real-time implementation of our algorithms in FPGAs and ASICs, used in commercially available surveillance cameras, which obtained awards for their image quality.
LanguageEnglish
QualificationDoctor of Philosophy
Awarding Institution
  • Department of Electrical Engineering
Supervisors/Advisors
  • de With, Peter, Promotor
Award date8 Dec 2011
Place of PublicationEindhoven
Publisher
Print ISBNs978-90-386-2880-6
DOIs
StatePublished - 2011

Fingerprint

Video cameras
Cameras
Sensors
Image quality
Color
Luminance
Lenses
Image enhancement
Image acquisition
Processing
Imaging techniques
Visibility
Degradation
Amplification
Transfer functions
Visualization
Fluorescence
Flickering
Detectors
Level control

Cite this

Cvetkovic, S. D. (2011). Optimization of video capturing and tone mapping in video camera systems Eindhoven: Technische Universiteit Eindhoven DOI: 10.6100/IR719365
Cvetkovic, S.D.. / Optimization of video capturing and tone mapping in video camera systems. Eindhoven : Technische Universiteit Eindhoven, 2011. 296 p.
@phdthesis{a5f52ffe2a6d49e189a3a031f2436710,
title = "Optimization of video capturing and tone mapping in video camera systems",
abstract = "Image enhancement techniques are widely employed in many areas of professional and consumer imaging, machine vision and computational imaging. Image enhancement techniques used in surveillance video cameras are complex systems involving controllable lenses, sensors and advanced signal processing. In surveillance, a high output image quality with very robust and stable operation under difficult imaging conditions are essential, combined with automatic, intelligent camera behavior without user intervention. The key problem discussed in this thesis is to ensure this high quality under all conditions, which specifically addresses the discrepancy of the dynamic range of input scenes and displays. For example, typical challenges are High Dynamic Range (HDR) and low-dynamic range scenes with strong light-dark differences and overall poor visibility of details, respectively. The detailed problem statement is as follows: (1) performing correct and stable image acquisition for video cameras in variable dynamic range environments, and (2) finding the best image processing algorithms to maximize the visualization of all image details without introducing image distortions. Additionally, the solutions should satisfy complexity and cost requirements of typical video surveillance cameras. For image acquisition, we develop optimal image exposure algorithms that use a controlled lens, sensor integration time and camera gain, to maximize SNR. For faster and more stable control of the camera exposure system, we remove nonlinear tone-mapping steps from the level control loop and we derive a parallel control strategy that prevents control delays and compensates for the non-linearity and unknown transfer characteristics of the used lenses. For HDR imaging we adopt exposure bracketing that merges short and long exposed images. To solve the involved non-linear sensor distortions, we apply a non-linear correction function to the distorted sensor signal, implementing a second-order polynomial with coefficients adaptively estimated from the signal itself. The result is a good, dynamically controlled match between the long- and short-exposed image. The robustness of this technique is improved for fluorescent light conditions, preventing serious distortions by luminance flickering and color errors. To prevent image degradation we propose both fluorescent light detection and fluorescence locking, based on measurements of the sensor signal intensity and color errors in the short-exposed image. The use of various filtering steps increases the detector robustness and reliability for scenes with motion and the appearance of other light sources. In the alternative algorithm principle of fluorescence locking, we ensure that light integrated during the short exposure time has a correct intensity and color by synchronizing the exposure measurement to the mains frequency. The second area of research is to maximize visualization of all image details. This is achieved by both global and local tone mapping functions. The largest problem of Global Tone Mapping Functions (GTMF) is that they often significantly deteriorate the image contrast. We have developed a new GTMF and illustrate, both analytically and perceptually, that it exhibits only a limited amount of compression, compared to conventional solutions. Our algorithm splits GTMF into two tasks: (1) compressing HDR images (DRC transfer function) and (2) enhancing the (global) image contrast (CHRE transfer function). The DRC subsystem adapts the HDR video signal to the remainder of the system, which can handle only a fraction of the original dynamic range. Our main contribution is a novel DRC function shape which is adaptive to the image, so that details in the dark image parts are enhanced simultaneously while only moderately compressing details in the bright areas. Also, the DRC function shape is well matched with the sensor noise characteristics in order to limit the noise amplification. Furthermore, we show that the image quality can be significantly improved in DRC compression if a local contrast preservation step is included. The second part of GTMF is a CHRE subsystem that fine-tunes and redistributes the luminance (and color) signal in the image, to optimize global contrast of the scene. The contribution of the proposed CHRE processing is that unlike standard histogram equalization, it can preserve details in statistically unpopulated but visually relevant luminance regions. One of the important cornerstones of the GTMF is that both DRC and CHRE algorithms are performed in the perceptually uniform space and optimized for the salient regions obtained by the improved salient-region detector, to maximize the relevant information transfer to the HVS. The proposed GTMF solution offers a good processing quality, but cannot sufficiently preserve local contrast for extreme HDR signals and it gives limited improvement low-contrast scenes. The local contrast improvement is based on the Locally Adaptive Contrast Enhancement (LACE) algorithm. We contribute by using multi-band frequency decomposition, to set up the complete enhancement system. Four key problems occur with real-time LACE processing: (1) {"}halo{"} artifacts, (2) clipping of the enhancement signal, (3) noise degradation and (4) the overall system complexity. {"}Halo{"} artifacts are eliminated by a new contrast gain specification using local energy and contrast measurements. This solution has a low complexity and offers excellent performance in terms of higher contrast and visually appealing performance. Algorithms preventing clipping of the output signal and reducing noise amplification give a further enhancement. We have added a supplementary discussion on executing LACE in the logarithmic domain, where we have derived a new contrast gain function solving LACE problems efficiently. For the best results, we have found that LACE processing should be performed in the logarithmic domain for standard and HDR images, and in the linear domain for low-contrast images. Finally, the complexity of the contrast gain calculation is reduced by a new local energy metric, which can be calculated efficiently in a 2D-separable fashion. Besides the complexity benefit, the proposed energy metric gives better performance compared to the conventional metrics. The conclusions of our work are summarized as follows. For acquisition, we need to combine an optimal exposure algorithm, giving both improved dynamic performance and maximum image contrast/SNR, with robust exposure bracketing that can handle difficult conditions such as fluorescent lighting. For optimizing visibility of details in the scene, we have split the GTMF in two parts, DRC and CHRE, so that a controlled optimization can be performed offering less contrast compression and detail loss than in the conventional case. Local contrast is enhanced with the known LACE algorithm, but the performance is significantly improved by individually addressing {"}halo{"} artifacts, signal clipping and noise degradation. We provide artifact reduction by new contrast gain function based on local energy, contrast measurements and noise estimation. Besides the above arguments, we have contributed feasible performance metrics and listed ample practical evidence of the real-time implementation of our algorithms in FPGAs and ASICs, used in commercially available surveillance cameras, which obtained awards for their image quality.",
author = "S.D. Cvetkovic",
year = "2011",
doi = "10.6100/IR719365",
language = "English",
isbn = "978-90-386-2880-6",
publisher = "Technische Universiteit Eindhoven",
school = "Department of Electrical Engineering",

}

Cvetkovic, SD 2011, 'Optimization of video capturing and tone mapping in video camera systems', Doctor of Philosophy, Department of Electrical Engineering, Eindhoven. DOI: 10.6100/IR719365

Optimization of video capturing and tone mapping in video camera systems. / Cvetkovic, S.D.

Eindhoven : Technische Universiteit Eindhoven, 2011. 296 p.

Research output: ThesisPhd Thesis 2 (Research NOT TU/e / Graduation TU/e)Academic

TY - THES

T1 - Optimization of video capturing and tone mapping in video camera systems

AU - Cvetkovic,S.D.

PY - 2011

Y1 - 2011

N2 - Image enhancement techniques are widely employed in many areas of professional and consumer imaging, machine vision and computational imaging. Image enhancement techniques used in surveillance video cameras are complex systems involving controllable lenses, sensors and advanced signal processing. In surveillance, a high output image quality with very robust and stable operation under difficult imaging conditions are essential, combined with automatic, intelligent camera behavior without user intervention. The key problem discussed in this thesis is to ensure this high quality under all conditions, which specifically addresses the discrepancy of the dynamic range of input scenes and displays. For example, typical challenges are High Dynamic Range (HDR) and low-dynamic range scenes with strong light-dark differences and overall poor visibility of details, respectively. The detailed problem statement is as follows: (1) performing correct and stable image acquisition for video cameras in variable dynamic range environments, and (2) finding the best image processing algorithms to maximize the visualization of all image details without introducing image distortions. Additionally, the solutions should satisfy complexity and cost requirements of typical video surveillance cameras. For image acquisition, we develop optimal image exposure algorithms that use a controlled lens, sensor integration time and camera gain, to maximize SNR. For faster and more stable control of the camera exposure system, we remove nonlinear tone-mapping steps from the level control loop and we derive a parallel control strategy that prevents control delays and compensates for the non-linearity and unknown transfer characteristics of the used lenses. For HDR imaging we adopt exposure bracketing that merges short and long exposed images. To solve the involved non-linear sensor distortions, we apply a non-linear correction function to the distorted sensor signal, implementing a second-order polynomial with coefficients adaptively estimated from the signal itself. The result is a good, dynamically controlled match between the long- and short-exposed image. The robustness of this technique is improved for fluorescent light conditions, preventing serious distortions by luminance flickering and color errors. To prevent image degradation we propose both fluorescent light detection and fluorescence locking, based on measurements of the sensor signal intensity and color errors in the short-exposed image. The use of various filtering steps increases the detector robustness and reliability for scenes with motion and the appearance of other light sources. In the alternative algorithm principle of fluorescence locking, we ensure that light integrated during the short exposure time has a correct intensity and color by synchronizing the exposure measurement to the mains frequency. The second area of research is to maximize visualization of all image details. This is achieved by both global and local tone mapping functions. The largest problem of Global Tone Mapping Functions (GTMF) is that they often significantly deteriorate the image contrast. We have developed a new GTMF and illustrate, both analytically and perceptually, that it exhibits only a limited amount of compression, compared to conventional solutions. Our algorithm splits GTMF into two tasks: (1) compressing HDR images (DRC transfer function) and (2) enhancing the (global) image contrast (CHRE transfer function). The DRC subsystem adapts the HDR video signal to the remainder of the system, which can handle only a fraction of the original dynamic range. Our main contribution is a novel DRC function shape which is adaptive to the image, so that details in the dark image parts are enhanced simultaneously while only moderately compressing details in the bright areas. Also, the DRC function shape is well matched with the sensor noise characteristics in order to limit the noise amplification. Furthermore, we show that the image quality can be significantly improved in DRC compression if a local contrast preservation step is included. The second part of GTMF is a CHRE subsystem that fine-tunes and redistributes the luminance (and color) signal in the image, to optimize global contrast of the scene. The contribution of the proposed CHRE processing is that unlike standard histogram equalization, it can preserve details in statistically unpopulated but visually relevant luminance regions. One of the important cornerstones of the GTMF is that both DRC and CHRE algorithms are performed in the perceptually uniform space and optimized for the salient regions obtained by the improved salient-region detector, to maximize the relevant information transfer to the HVS. The proposed GTMF solution offers a good processing quality, but cannot sufficiently preserve local contrast for extreme HDR signals and it gives limited improvement low-contrast scenes. The local contrast improvement is based on the Locally Adaptive Contrast Enhancement (LACE) algorithm. We contribute by using multi-band frequency decomposition, to set up the complete enhancement system. Four key problems occur with real-time LACE processing: (1) "halo" artifacts, (2) clipping of the enhancement signal, (3) noise degradation and (4) the overall system complexity. "Halo" artifacts are eliminated by a new contrast gain specification using local energy and contrast measurements. This solution has a low complexity and offers excellent performance in terms of higher contrast and visually appealing performance. Algorithms preventing clipping of the output signal and reducing noise amplification give a further enhancement. We have added a supplementary discussion on executing LACE in the logarithmic domain, where we have derived a new contrast gain function solving LACE problems efficiently. For the best results, we have found that LACE processing should be performed in the logarithmic domain for standard and HDR images, and in the linear domain for low-contrast images. Finally, the complexity of the contrast gain calculation is reduced by a new local energy metric, which can be calculated efficiently in a 2D-separable fashion. Besides the complexity benefit, the proposed energy metric gives better performance compared to the conventional metrics. The conclusions of our work are summarized as follows. For acquisition, we need to combine an optimal exposure algorithm, giving both improved dynamic performance and maximum image contrast/SNR, with robust exposure bracketing that can handle difficult conditions such as fluorescent lighting. For optimizing visibility of details in the scene, we have split the GTMF in two parts, DRC and CHRE, so that a controlled optimization can be performed offering less contrast compression and detail loss than in the conventional case. Local contrast is enhanced with the known LACE algorithm, but the performance is significantly improved by individually addressing "halo" artifacts, signal clipping and noise degradation. We provide artifact reduction by new contrast gain function based on local energy, contrast measurements and noise estimation. Besides the above arguments, we have contributed feasible performance metrics and listed ample practical evidence of the real-time implementation of our algorithms in FPGAs and ASICs, used in commercially available surveillance cameras, which obtained awards for their image quality.

AB - Image enhancement techniques are widely employed in many areas of professional and consumer imaging, machine vision and computational imaging. Image enhancement techniques used in surveillance video cameras are complex systems involving controllable lenses, sensors and advanced signal processing. In surveillance, a high output image quality with very robust and stable operation under difficult imaging conditions are essential, combined with automatic, intelligent camera behavior without user intervention. The key problem discussed in this thesis is to ensure this high quality under all conditions, which specifically addresses the discrepancy of the dynamic range of input scenes and displays. For example, typical challenges are High Dynamic Range (HDR) and low-dynamic range scenes with strong light-dark differences and overall poor visibility of details, respectively. The detailed problem statement is as follows: (1) performing correct and stable image acquisition for video cameras in variable dynamic range environments, and (2) finding the best image processing algorithms to maximize the visualization of all image details without introducing image distortions. Additionally, the solutions should satisfy complexity and cost requirements of typical video surveillance cameras. For image acquisition, we develop optimal image exposure algorithms that use a controlled lens, sensor integration time and camera gain, to maximize SNR. For faster and more stable control of the camera exposure system, we remove nonlinear tone-mapping steps from the level control loop and we derive a parallel control strategy that prevents control delays and compensates for the non-linearity and unknown transfer characteristics of the used lenses. For HDR imaging we adopt exposure bracketing that merges short and long exposed images. To solve the involved non-linear sensor distortions, we apply a non-linear correction function to the distorted sensor signal, implementing a second-order polynomial with coefficients adaptively estimated from the signal itself. The result is a good, dynamically controlled match between the long- and short-exposed image. The robustness of this technique is improved for fluorescent light conditions, preventing serious distortions by luminance flickering and color errors. To prevent image degradation we propose both fluorescent light detection and fluorescence locking, based on measurements of the sensor signal intensity and color errors in the short-exposed image. The use of various filtering steps increases the detector robustness and reliability for scenes with motion and the appearance of other light sources. In the alternative algorithm principle of fluorescence locking, we ensure that light integrated during the short exposure time has a correct intensity and color by synchronizing the exposure measurement to the mains frequency. The second area of research is to maximize visualization of all image details. This is achieved by both global and local tone mapping functions. The largest problem of Global Tone Mapping Functions (GTMF) is that they often significantly deteriorate the image contrast. We have developed a new GTMF and illustrate, both analytically and perceptually, that it exhibits only a limited amount of compression, compared to conventional solutions. Our algorithm splits GTMF into two tasks: (1) compressing HDR images (DRC transfer function) and (2) enhancing the (global) image contrast (CHRE transfer function). The DRC subsystem adapts the HDR video signal to the remainder of the system, which can handle only a fraction of the original dynamic range. Our main contribution is a novel DRC function shape which is adaptive to the image, so that details in the dark image parts are enhanced simultaneously while only moderately compressing details in the bright areas. Also, the DRC function shape is well matched with the sensor noise characteristics in order to limit the noise amplification. Furthermore, we show that the image quality can be significantly improved in DRC compression if a local contrast preservation step is included. The second part of GTMF is a CHRE subsystem that fine-tunes and redistributes the luminance (and color) signal in the image, to optimize global contrast of the scene. The contribution of the proposed CHRE processing is that unlike standard histogram equalization, it can preserve details in statistically unpopulated but visually relevant luminance regions. One of the important cornerstones of the GTMF is that both DRC and CHRE algorithms are performed in the perceptually uniform space and optimized for the salient regions obtained by the improved salient-region detector, to maximize the relevant information transfer to the HVS. The proposed GTMF solution offers a good processing quality, but cannot sufficiently preserve local contrast for extreme HDR signals and it gives limited improvement low-contrast scenes. The local contrast improvement is based on the Locally Adaptive Contrast Enhancement (LACE) algorithm. We contribute by using multi-band frequency decomposition, to set up the complete enhancement system. Four key problems occur with real-time LACE processing: (1) "halo" artifacts, (2) clipping of the enhancement signal, (3) noise degradation and (4) the overall system complexity. "Halo" artifacts are eliminated by a new contrast gain specification using local energy and contrast measurements. This solution has a low complexity and offers excellent performance in terms of higher contrast and visually appealing performance. Algorithms preventing clipping of the output signal and reducing noise amplification give a further enhancement. We have added a supplementary discussion on executing LACE in the logarithmic domain, where we have derived a new contrast gain function solving LACE problems efficiently. For the best results, we have found that LACE processing should be performed in the logarithmic domain for standard and HDR images, and in the linear domain for low-contrast images. Finally, the complexity of the contrast gain calculation is reduced by a new local energy metric, which can be calculated efficiently in a 2D-separable fashion. Besides the complexity benefit, the proposed energy metric gives better performance compared to the conventional metrics. The conclusions of our work are summarized as follows. For acquisition, we need to combine an optimal exposure algorithm, giving both improved dynamic performance and maximum image contrast/SNR, with robust exposure bracketing that can handle difficult conditions such as fluorescent lighting. For optimizing visibility of details in the scene, we have split the GTMF in two parts, DRC and CHRE, so that a controlled optimization can be performed offering less contrast compression and detail loss than in the conventional case. Local contrast is enhanced with the known LACE algorithm, but the performance is significantly improved by individually addressing "halo" artifacts, signal clipping and noise degradation. We provide artifact reduction by new contrast gain function based on local energy, contrast measurements and noise estimation. Besides the above arguments, we have contributed feasible performance metrics and listed ample practical evidence of the real-time implementation of our algorithms in FPGAs and ASICs, used in commercially available surveillance cameras, which obtained awards for their image quality.

U2 - 10.6100/IR719365

DO - 10.6100/IR719365

M3 - Phd Thesis 2 (Research NOT TU/e / Graduation TU/e)

SN - 978-90-386-2880-6

PB - Technische Universiteit Eindhoven

CY - Eindhoven

ER -

Cvetkovic SD. Optimization of video capturing and tone mapping in video camera systems. Eindhoven: Technische Universiteit Eindhoven, 2011. 296 p. Available from, DOI: 10.6100/IR719365