Abstract
Original language  English 

Qualification  Doctor of Philosophy 
Awarding Institution 

Supervisors/Advisors 

Award date  1 Jan 2007 
Place of Publication  Houston 
Publisher  
Publication status  Published  2007 
Fingerprint
Cite this
}
Active learning and adaptive sampling for nonparametric inference. / Castro, R.M.
Houston : Rice University, 2007. 202 p.Research output: Thesis › Phd Thesis 4 Research NOT TU/e / Graduation NOT TU/e)
TY  THES
T1  Active learning and adaptive sampling for nonparametric inference
AU  Castro, R.M.
PY  2007
Y1  2007
N2  This thesis presents a general discussion of active learning and adaptive sampling. In many practical scenarios it is possible to use information gleaned from previous observations to focus the sampling process, in the spirit of the "twentyquestions" game. As more samples are collected one can learn how to improve the sampling process by deciding where to sample next, for example. These sampling feedback techniques are generically known as active learning or adaptive sampling. Although appealing, analysis of such methodologies is difficult, since there are strong dependencies between the observed data. This is especially important in the presence of measurement uncertainty or noise. The main thrust of this thesis is to characterize the potential and fundamental limitations of active learning, particularly in nonparametric settings. First, we consider the probabilistic classification setting. Using minimax analysis techniques we investigate the achievable rates of classification error convergence for broad classes of distributions characterized by decision boundary regularity and noise conditions (which describe the observation noise near the decision boundary). The results clearly indicate the conditions under which one can expect significant gains through active learning. Furthermore we show that the learning rates derived are tight for "boundary fragment" classes in ddimensional feature spaces when the feature marginal density is bounded from above and below. Second we study the problem of estimating an unknown function from noisy pointwise samples, where the sample locations are adaptively chosen based on previous samples and observations, as described above. We present results characterizing the potential and fundamental limits of active learning for certain classes of nonparametric regression problems, and also present practical algorithms capable of exploiting the sampling adaptivity and provably improving upon nonadaptive techniques. Our active sampling procedure is based on a novel coarsetofine strategy, based on and motivated by the success of spatiallyadaptive methods such as wavelet analysis in nonparametric function estimation. Using the ideas developed when solving the function regression problem we present a greedy algorithm for estimating piecewise constant functions with smooth boundaries that is near minimax optimal but is computationally much more efficient than the best dictionary based method (in this case wedgelet approximations). Finally we compare adaptive sampling (where feedback guiding the sampling process is present) with nonadaptive compressive sampling (where nontraditional projection samples are used). It is shown that under mild noise compressive sampling can be competitive with adaptive sampling, but adaptive sampling significantly outperforms compressive sampling in lower signaltonoise conditions. Furthermore this work also helps the understanding of the different behavior of compressive sampling under noisy and noiseless settings.
AB  This thesis presents a general discussion of active learning and adaptive sampling. In many practical scenarios it is possible to use information gleaned from previous observations to focus the sampling process, in the spirit of the "twentyquestions" game. As more samples are collected one can learn how to improve the sampling process by deciding where to sample next, for example. These sampling feedback techniques are generically known as active learning or adaptive sampling. Although appealing, analysis of such methodologies is difficult, since there are strong dependencies between the observed data. This is especially important in the presence of measurement uncertainty or noise. The main thrust of this thesis is to characterize the potential and fundamental limitations of active learning, particularly in nonparametric settings. First, we consider the probabilistic classification setting. Using minimax analysis techniques we investigate the achievable rates of classification error convergence for broad classes of distributions characterized by decision boundary regularity and noise conditions (which describe the observation noise near the decision boundary). The results clearly indicate the conditions under which one can expect significant gains through active learning. Furthermore we show that the learning rates derived are tight for "boundary fragment" classes in ddimensional feature spaces when the feature marginal density is bounded from above and below. Second we study the problem of estimating an unknown function from noisy pointwise samples, where the sample locations are adaptively chosen based on previous samples and observations, as described above. We present results characterizing the potential and fundamental limits of active learning for certain classes of nonparametric regression problems, and also present practical algorithms capable of exploiting the sampling adaptivity and provably improving upon nonadaptive techniques. Our active sampling procedure is based on a novel coarsetofine strategy, based on and motivated by the success of spatiallyadaptive methods such as wavelet analysis in nonparametric function estimation. Using the ideas developed when solving the function regression problem we present a greedy algorithm for estimating piecewise constant functions with smooth boundaries that is near minimax optimal but is computationally much more efficient than the best dictionary based method (in this case wedgelet approximations). Finally we compare adaptive sampling (where feedback guiding the sampling process is present) with nonadaptive compressive sampling (where nontraditional projection samples are used). It is shown that under mild noise compressive sampling can be competitive with adaptive sampling, but adaptive sampling significantly outperforms compressive sampling in lower signaltonoise conditions. Furthermore this work also helps the understanding of the different behavior of compressive sampling under noisy and noiseless settings.
UR  http://www.win.tue.nl/~rmcastro/publications/castro_phd.pdf
M3  Phd Thesis 4 Research NOT TU/e / Graduation NOT TU/e)
PB  Rice University
CY  Houston
ER 