Asymptotics in empirical risk minimization

L. Mohammadi, S.A. van de Geer

Research output: Contribution to journal › Article › Academic › peer-review



In this paper, we study a two-category classification problem. We indicate the categories by labels Y = 1 and Y = -1. We observe a covariate, or feature, X ∈ 𝒳 ⊂ ℝ^d. Consider a collection {h_a} of classifiers indexed by a finite-dimensional parameter a, and the classifier h_{a*} that minimizes the prediction error over this class. The parameter a* is estimated by the empirical risk minimizer â_n over the class, where the empirical risk is calculated on a training sample of size n. We apply the Kim-Pollard theorem to show that, under certain differentiability assumptions, â_n converges to a* with rate n^{-1/3}, and we also present the asymptotic distribution of the renormalized estimator.

For example, let V_0 denote the set of x on which, given X = x, the label Y = 1 is more likely than the label Y = -1. If X is one-dimensional, the set V_0 is a union of disjoint intervals. The problem is then to estimate the thresholds of these intervals. We obtain the asymptotic distribution of the empirical risk minimizer when the classifiers have K thresholds, where K is fixed. We furthermore consider an extension to higher-dimensional X, assuming essentially that V_0 has a smooth boundary in some given parametric class.

We also discuss various rates of convergence when the differentiability conditions are possibly violated. Here, we again restrict ourselves to one-dimensional X. We show that the rate is n^{-1} in certain cases, and we then also obtain the asymptotic distribution of the empirical prediction error.
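To make the setup concrete, here is a minimal sketch (not the authors' code) of empirical risk minimization for the simplest case described above: one-dimensional X and a single-threshold classifier h_a(x) = sign(x - a). The data-generating parameters (a true threshold a* = 0.3, labels agreeing with sign(X - a*) with probability 0.8) and the function name `empirical_risk` are illustrative assumptions, chosen only to show that the empirical risk is piecewise constant in a and can be minimized exactly over the observed covariate values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data-generating process (an assumption, not from the paper):
# one-dimensional X, a single true threshold a*, and labels that agree with
# sign(X - a*) with probability 0.8.
a_star = 0.3
n = 1000
X = rng.uniform(-1.0, 1.0, size=n)
agree = rng.random(n) < 0.8
Y = np.where(agree, np.sign(X - a_star), -np.sign(X - a_star)).astype(int)
Y[Y == 0] = 1  # X == a* is a measure-zero event; break ties arbitrarily

def empirical_risk(a, X, Y):
    """Fraction of training points misclassified by h_a(x) = sign(x - a)."""
    predictions = np.where(X >= a, 1, -1)
    return np.mean(predictions != Y)

# The empirical risk jumps only at the observed X values and is constant in
# between, so searching over the sorted sample finds a global minimizer.
candidates = np.sort(X)
risks = np.array([empirical_risk(a, X, Y) for a in candidates])
a_hat = candidates[np.argmin(risks)]

print(f"true threshold a* = {a_star:.3f}, ERM estimate â_n = {a_hat:.3f}")
```

Rerunning this sketch with growing n would show â_n fluctuating around a* at the cube-root scale, consistent with the n^{-1/3} rate obtained via the Kim-Pollard theorem.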
Original language: English
Pages (from-to): 2027-2047
Journal: Journal of Machine Learning Research
Publication status: Published - 2005

