Abstract
I consider a binary classification problem with a high-dimensional feature vector. Spam mail filters are a popular example. A Bayesian approach requires us to estimate the probability of a feature vector given the class of the object. Due to the size of the feature vector, this is an infeasible task. A useful approach is to split the feature space into several (conditionally) independent subspaces. This results in a new problem, namely how to find the "best" subdivision. In this paper I consider a weighting approach that will perform (asymptotically) as well as the best subdivision and still has a manageable complexity.
Original language | English |
---|---|
Title | Proceedings of the 29th Symposium on Information Theory in the Benelux, May 29-30, 2008, Leuven, Belgium |
Editors | L. Van der Perre, A. Dejonghe, V. Ramon |
Place of publication | Leuven |
Publisher | IMEC |
Pages | 121-128 |
ISBN (print) | 978-90-9023135-8 |
Status | Published - 2008 |