We study universal compression for discrete data sequences that were corrupted by noise. We show that while, as expected, there exist many cases in which the entropy of these sequences increases from that of the original data, somewhat surprisingly and counter-intuitively, the universal coding redundancy of such sequences cannot increase compared to that of the original data. We derive conditions that guarantee that this redundancy does not decrease asymptotically (in first order) from the original sequence redundancy in the stationary memoryless case. We then provide bounds on the redundancy for coding large, finite-length noisy blocks generated by stationary memoryless sources and corrupted by some specific memoryless channels. Finally, we propose a sequential probability estimation method that can be used to compress binary data corrupted by some noisy channel. While there is much benefit in using this method in compressing short blocks of noise-corrupted data, the new method is more general and allows sequential compression of binary sequences for which the probability of a bit is known to be limited to any given sub-interval of [0, 1] (not necessarily the full unit interval). Additionally, this method has many different applications, including prediction, sequential channel estimation, and others.
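To illustrate the kind of interval-constrained sequential probability estimation the abstract refers to, the following is a minimal sketch (not the authors' actual method): a standard Krichevsky–Trofimov (add-1/2) estimator whose predictive probability is projected onto a known sub-interval `[lo, hi]`, together with the ideal code length an arithmetic coder would achieve with those predictions. The function names and the choice of the KT estimator are illustrative assumptions.

```python
import math

def clipped_kt_probabilities(bits, lo=0.2, hi=0.8):
    """Sequentially estimate P(next bit = 1) with a Krichevsky-Trofimov
    (add-1/2) estimator, projected onto a known interval [lo, hi].
    Returns the predictive probability assigned to each observed bit.
    (Illustrative sketch; not the estimator proposed in the paper.)"""
    ones = 0
    probs = []
    for t, b in enumerate(bits):
        p1 = (ones + 0.5) / (t + 1.0)      # KT estimate of P(bit = 1)
        p1 = min(max(p1, lo), hi)          # project into the known sub-interval
        probs.append(p1 if b == 1 else 1.0 - p1)
        ones += b
    return probs

def code_length(probs):
    """Ideal code length in bits (the arithmetic-coding cost, ignoring
    the coder's small constant overhead)."""
    return -sum(math.log2(p) for p in probs)
```

When the true bit probability is known to lie inside `[lo, hi]`, clipping can only reduce the worst-case log loss of the first few predictions, since the unconstrained KT estimate may wander outside the interval early in the sequence.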
|Title of host publication||Proceedings of the 2nd Information Theory and Applications workshop (ITA 2007) 29 January - 2 February 2007, San Diego, California, USA|
|Publication status||Published - 2007|
|Event||ITA 2007, San Diego, California, USA|
Duration: 29 Jan 2007 → 2 Feb 2007