Active learning is a major area of interest within the field of machine learning, especially when the labeled instances are very difficult, time-consuming or expensive to obtain. In this paper, we review various active learning methods for manifold data, where the intrinsic manifold structure of data is also incorporated into the active learning query strategies. In addition, we present a new manifold-based active learning algorithm for Gaussian process classification. This new method uses a data-dependent kernel derived from a semi-supervised model that considers both labeled and unlabeled data. The method performs a regularization on the smoothness of the fitted function with respect to both the ambient space and the manifold where the data lie. The regularization parameter is treated as an additional kernel (covariance) parameter and estimated from the data, permitting adaptation of the kernel to the given dataset manifold geometry. Comparisons with other AL methods for manifold data show faster learning performance in our empirical experiments. MATLAB code that reproduces all examples is provided as supplementary materials.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty