Abstract
This thesis presents a general discussion of active learning and adaptive sampling.
In many practical scenarios it is possible to use information gleaned from previous observations to focus the sampling process, in the spirit of the "twenty questions" game.
As more samples are collected, one can learn how to improve the sampling process, for example by deciding where to sample next. These feedback-driven sampling techniques are generically known as active learning or adaptive sampling. Although appealing, the analysis of such methodologies is difficult, since there are strong dependencies between the observed data. This is especially important in the presence of measurement uncertainty or noise. The main thrust of this thesis is to characterize the potential and fundamental limitations of active learning, particularly in nonparametric settings.
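The "twenty questions" intuition can be made concrete with a minimal sketch (an illustration in this spirit, not an algorithm from the thesis): locating the jump of a noiseless step function on [0, 1] by bisection, where each new sample location depends on all previous answers. The step function and its change point below are hypothetical.

```python
# Minimal sketch of feedback-driven sampling: bisection search for the
# change point of a noiseless step function. Each query location is
# chosen based on all previous observations.

def step(x, t=0.6180339887):
    """Step function with change point t (hypothetical example)."""
    return 0.0 if x < t else 1.0

def bisect_change_point(f, n_samples):
    """Adaptively localize the change point using n_samples queries."""
    lo, hi = 0.0, 1.0
    for _ in range(n_samples):
        mid = 0.5 * (lo + hi)
        if f(mid) == 0.0:   # the jump lies to the right of mid
            lo = mid
        else:               # the jump lies at or to the left of mid
            hi = mid
    return 0.5 * (lo + hi)

# 20 adaptive samples localize the change point to within 2**-21,
# whereas 20 samples on a fixed uniform grid only localize it to
# within roughly 1/20.
estimate = bisect_change_point(step, 20)
```

The contrast between exponential and linear localization is exactly the kind of gain adaptivity can offer in the noiseless case; the abstract's point is that quantifying such gains under noise is far more delicate.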
First, we consider the probabilistic classification setting. Using minimax analysis techniques, we investigate the achievable rates of convergence of the classification error for broad classes of distributions characterized by decision boundary regularity and by noise conditions (which describe the observation noise near the decision boundary). The results clearly indicate the conditions under which one can expect significant gains from active learning. Furthermore, we show that the derived learning rates are tight for "boundary fragment" classes in d-dimensional feature spaces when the marginal density of the features is bounded from above and below.
Second, we study the problem of estimating an unknown function from noisy pointwise samples, where each sample location is chosen adaptively based on previous samples and observations, as described above. We present results characterizing the potential and fundamental limits of active learning for certain classes of nonparametric regression problems, and we also present practical algorithms capable of exploiting the sampling adaptivity and provably improving upon non-adaptive techniques. Our active sampling procedure is based on a novel coarse-to-fine strategy, motivated by the success of spatially adaptive methods such as wavelet analysis in nonparametric function estimation.
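A coarse-to-fine strategy can be sketched in simplified form (a hypothetical two-stage procedure in the same spirit, not the thesis algorithm): spend part of the sampling budget on a uniform "preview" pass, find where the observed function varies most, and concentrate the remaining budget there.

```python
# Hedged sketch of two-stage coarse-to-fine sampling (illustrative only).
import random

def coarse_to_fine_samples(f, budget, sigma=0.05, frac_coarse=0.5):
    """Return the sample locations chosen by a two-stage strategy."""
    # Stage 1: uniform coarse grid with noisy observations.
    n_coarse = max(2, int(budget * frac_coarse))
    xs = [i / (n_coarse - 1) for i in range(n_coarse)]
    ys = [f(x) + random.gauss(0.0, sigma) for x in xs]

    # Locate the coarse interval with the largest observed jump,
    # i.e. the region most likely to contain a boundary.
    jumps = [abs(ys[i + 1] - ys[i]) for i in range(n_coarse - 1)]
    k = jumps.index(max(jumps))
    lo, hi = xs[k], xs[k + 1]

    # Stage 2: concentrate the remaining budget inside that interval.
    n_fine = budget - n_coarse
    fine = [lo + (hi - lo) * (j + 0.5) / n_fine for j in range(n_fine)]
    return xs + fine
```

For a piecewise constant function, the second stage packs half the budget into a single coarse cell around the discontinuity, which is where a non-adaptive uniform design wastes most of its resolution.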
Using the ideas developed in solving the function regression problem, we present a greedy algorithm for estimating piecewise constant functions with smooth boundaries that is near minimax optimal but computationally much more efficient than the best dictionary-based method (in this case, wedgelet approximations). Finally, we compare adaptive sampling (where feedback guiding the sampling process is present) with non-adaptive compressive sampling (where non-traditional projection samples are used). It is shown that under mild noise compressive sampling can be competitive with adaptive sampling, but adaptive sampling significantly outperforms compressive sampling in lower signal-to-noise conditions. Furthermore, this work also improves the understanding of the different behavior of compressive sampling in noisy and noiseless settings.
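The non-adaptive measurement model behind compressive sampling can be illustrated with a toy example (assumptions: a 1-sparse signal, a Gaussian projection matrix, and matched-filter recovery; this is an illustration of the measurement model, not the recovery analysis in the thesis).

```python
# Toy illustration of non-adaptive compressive sampling: m random
# projections of a length-n signal with a single active coordinate,
# recovered by normalized matched filtering.
import math
import random

def compressive_demo(n=40, m=15, true_index=7, amplitude=3.0,
                     sigma=0.0, seed=1):
    rng = random.Random(seed)
    # Non-adaptive design: all m projection vectors are drawn in advance,
    # with no feedback from the observations.
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    # y = A x + noise, where x has a single nonzero entry.
    y = [amplitude * A[i][true_index] + rng.gauss(0.0, sigma)
         for i in range(m)]

    # Matched filter: normalized correlation of y with each column of A.
    def score(j):
        col = [A[i][j] for i in range(m)]
        dot = sum(yi * ci for yi, ci in zip(y, col))
        return abs(dot) / math.sqrt(sum(c * c for c in col))

    return max(range(n), key=score)
```

In the noiseless case the active coordinate is recovered exactly from m < n projections; as the noise level grows, recovery degrades, whereas an adaptive scheme could re-focus subsequent measurements on promising coordinates — the qualitative contrast the abstract describes.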
Original language: English
Qualification: Doctor of Philosophy
Date of award: 1 Jan 2007
Place of publication: Houston
Status: Published - 2007
Cite this
Castro, R. M. (2007). Active learning and adaptive sampling for nonparametric inference. Rice University.