The authors consider a binary classification problem with a high-dimensional feature vector. Spam mail filters are a popular example of this. A Bayesian approach requires estimating the probability of a feature vector given the class of the object. Because the feature vector is so large, estimating this probability directly is infeasible. A useful approach is to split the feature space into several (conditionally) independent subspaces. This raises a new problem, namely how to find the "best" subdivision. In this paper the authors consider a weighting approach that performs (asymptotically) as well as the best subdivision and still has manageable complexity.
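The idea of weighting over subdivisions, rather than selecting a single one, can be illustrated with a small sketch. It is not the authors' algorithm: it assumes binary features, a Dirichlet prior with parameter `alpha = 0.5` on each block's sub-vector distribution, and uniform weights over all set partitions of the feature indices. It also enumerates every partition by brute force, so it is only usable for a handful of features and does not reproduce the manageable complexity claimed in the abstract. The function names (`partitions`, `block_log_marginal`, `log_joint`, `predict_proba`) are hypothetical.

```python
from collections import Counter
from math import exp, lgamma, log


def partitions(items):
    """Yield every set partition of a list of feature indices."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + part


def block_log_marginal(rows, block, alpha=0.5):
    """Log Dirichlet-multinomial marginal of the binary sub-vectors in `block`."""
    k = 2 ** len(block)  # number of possible sub-vector values
    counts = Counter(tuple(r[i] for i in block) for r in rows)
    out = lgamma(k * alpha) - lgamma(len(rows) + k * alpha)
    for c in counts.values():
        out += lgamma(c + alpha) - lgamma(alpha)
    return out


def log_joint(data, partition, alpha=0.5):
    """Log-probability of all feature vectors given their classes,
    treating the blocks of `partition` as independent within each class."""
    total = 0.0
    for c in (0, 1):
        rows = [x for x, y in data if y == c]
        for block in partition:
            total += block_log_marginal(rows, block, alpha)
    return total


def predict_proba(train, x, alpha=0.5):
    """P(class = 1 | x, train), mixing uniformly over all partitions."""
    parts = list(partitions(list(range(len(x)))))
    scores = []
    for c in (0, 1):
        n_c = sum(1 for _, y in train if y == c)
        log_prior = log((n_c + 0.5) / (len(train) + 1.0))
        augmented = train + [(x, c)]  # training data plus the labelled query
        logs = [log_joint(augmented, s, alpha) for s in parts]
        m = max(logs)  # log-sum-exp for numerical stability
        scores.append(log_prior + m + log(sum(exp(v - m) for v in logs)))
    m = max(scores)
    p = [exp(s - m) for s in scores]
    return p[1] / (p[0] + p[1])


# Toy data: in class 1 the first two features agree, in class 0 they differ.
train = [((a, a, b), 1) for a in (0, 1) for b in (0, 1)] * 2
train += [((a, 1 - a, b), 0) for a in (0, 1) for b in (0, 1)] * 2
print(predict_proba(train, (1, 1, 0)))
```

In this toy data each feature on its own is uninformative, so a model that treats all features as independent cannot separate the classes. The mixture places most of its weight on partitions that keep the first two features in one block, which is the behaviour the weighting approach is meant to deliver without choosing the subdivision in advance.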
|Title of host publication||Information Theory and Applications Workshop, 2008, 3rd, 27 January - 1 February 2008, San Diego, U.S.A.|
|Place of Publication||Piscataway|
|Publisher||Institute of Electrical and Electronics Engineers|
|Publication status||Published - 2008|