Abstract
The author considers a binary classification problem with a high-dimensional feature vector. Spam mail filters are a popular example. A Bayesian approach requires us to estimate the probability of a feature vector given the class of the object. Because of the size of the feature vector, this is infeasible. A useful approach is to split the feature space into several (conditionally) independent subspaces. This raises a new problem, namely how to find the "best" subdivision. In this paper the author considers a weighting approach that performs (asymptotically) as well as the best subdivision and still has manageable complexity.
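The idea of weighting over subdivisions can be illustrated with a small sketch. This is not the algorithm from the paper: it enumerates every partition of a handful of binary features by brute force, which is feasible only for very small feature counts, and it assumes a uniform prior over partitions and an add-one (Laplace) estimator within each block. All function names are hypothetical.

```python
from collections import Counter
from math import exp, lgamma


def partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        yield [[first]] + part  # `first` forms its own block
        for i in range(len(part)):  # or joins an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def block_log_marginal(rows, block):
    """Log probability that a sequential add-one estimator assigns to the
    observed joint values of the binary features in `block`."""
    k = 2 ** len(block)  # number of joint values of the block
    counts = Counter(tuple(r[j] for j in block) for r in rows)
    n = len(rows)
    return lgamma(k) - lgamma(n + k) + sum(lgamma(c + 1) for c in counts.values())


def block_predictive(rows, block, x):
    """Add-one estimate of the block's value in `x`, given the rows seen so far."""
    k = 2 ** len(block)
    key = tuple(x[j] for j in block)
    c = sum(1 for r in rows if tuple(r[j] for j in block) == key)
    return (c + 1) / (len(rows) + k)


def weighted_likelihood(rows, x, n_features):
    """Estimate P(x | class) as a mixture over all partitions (assumed uniform
    prior), each weighted by how well it explained the class's training rows."""
    parts = list(partitions(list(range(n_features))))
    log_w = [sum(block_log_marginal(rows, b) for b in p) for p in parts]
    m = max(log_w)
    w = [exp(lw - m) for lw in log_w]  # subtract the max for numerical stability
    total = sum(w)
    pred = 0.0
    for wi, p in zip(w, parts):
        block_product = 1.0
        for b in p:
            block_product *= block_predictive(rows, b, x)
        pred += (wi / total) * block_product
    return pred


def classify(data_by_class, x, n_features):
    """Posterior over classes, with add-one class priors."""
    n = sum(len(rows) for rows in data_by_class.values())
    scores = {
        c: (len(rows) + 1) / (n + len(data_by_class))
        * weighted_likelihood(rows, x, n_features)
        for c, rows in data_by_class.items()
    }
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}
```

A call such as `classify({"spam": spam_rows, "ham": ham_rows}, x, 3)` returns a normalized posterior over the two classes, where each row is a tuple of 0/1 feature values. The number of partitions grows as the Bell numbers, so this brute-force mixture only illustrates what a weighting over subdivisions computes; it does not achieve the manageable complexity the abstract refers to.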
Original language | English |
---|---|
Title of host publication | Information Theory and Applications Workshop, 2008, 3rd, 27 January - 1 February 2008, San Diego, U.S.A. |
Place of Publication | Piscataway |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 1-11 |
ISBN (Print) | 978-1-4244-2670-6 |
DOIs | |
Publication status | Published - 2008 |