We give a characterization of Maximum Entropy/Minimum Relative Entropy inference by providing two ‘strong entropy concentration’ theorems. These theorems unify and generalize Jaynes’ ‘concentration phenomenon’ and Van Campenhout and Cover’s ‘conditional limit theorem’. The theorems characterize exactly in what sense a ‘prior’ distribution Q conditioned on a given constraint and the distribution P minimizing D(P//Q) over all P satisfyingthe constraint are ‘close’ to each other. We show how our theorems are related to ‘universal models’ for exponential families, thereby establishinga link with Rissanen’s MDL/stochastic complexity. We then apply our theorems to establish the relationship (A) between entropy concentration and a game-theoretic characterization of Maximum Entropy Inference due to Topsøe and others; (B) between maximum entropy distributions and sequences that are random (in the sense of Martin-Löf/Kolmogorov) with respect to the given constraint. These two applications have strong implications for the use of Maximum Entropy distributions in sequential prediction tasks, both for the logarithmic loss and for general loss functions. We identify circumstances under which Maximum Entropy predictions are almost optimal.
|Title of host publication||Computational learning theory : proceedings 14th annual conference on computational learning theory, COLT 2001, and 5th European conference on computational learning theory, EuroCOLT 2001, Amsterdam, The Netherlands, july 16-19, 2001|
|Editors||D.P. Helmbold, B. Williamson|
|Place of Publication||Berlin|
|Publication status||Published - 2001|
|Name||Lecture Notes in Computer Science|