
dc.contributor.advisor	Prof. Kenji Fukumizu
dc.contributor.author	Alam, Md. Ashad
dc.date.accessioned	2022-04-24T06:35:37Z
dc.date.available	2022-04-24T06:35:37Z
dc.date.issued	2014-09
dc.identifier.uri	http://localhost:8080/xmlui/handle/123456789/762
dc.description	Kernel methods, i.e., methods using a positive definite kernel (PDK), play an increasingly prominent role in solving various problems in statistical machine learning, such as web design, pattern recognition, human action recognition for a robot, computational protein function prediction, remote sensing data analysis, and many other research fields. Due to the kernel trick and the reproducing property, linear techniques can be applied in feature spaces without knowing explicit forms of either the feature map or the feature space. Kernel methods thus offer versatile tools to process, analyze, and compare many types of data, with state-of-the-art performance. Nowadays, PDKs have become a popular tool in most branches of statistical machine learning, e.g., supervised learning, unsupervised learning, reinforcement learning, and non-parametric inference. Many kernel methods have been proposed, including the support vector machine (SVM; Boser et al., 1992), kernel ridge regression (KRR; Saunders et al., 1998), kernel principal component analysis (kernel PCA; Schölkopf et al., 1998), kernel canonical correlation analysis (kernel CCA; Akaho, 2001; Bach and Jordan, 2002), Bayesian inference with positive definite kernels (kernel Bayes' rule; Fukumizu et al., 2013), gradient-based kernel dimension reduction for regression (gKDR; Fukumizu and Leng, 2014), and the kernel two-sample test (Gretton et al., 2012).	en_US
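As a minimal sketch of the kernel trick described above (assuming only NumPy; the Gaussian bandwidth gamma and the toy dataset are illustrative, not from the thesis), the following snippet performs kernel PCA entirely through the Gram matrix, without ever forming an explicit feature map:

```python
import numpy as np

def gaussian_gram(X, gamma):
    # Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2); only inner
    # products in the feature space are needed -- the kernel trick.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def kernel_pca_scores(X, gamma, n_components):
    n = X.shape[0]
    K = gaussian_gram(X, gamma)
    # Center the (implicit) feature vectors through the Gram matrix.
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    # Eigenvectors of the centered Gram matrix give the principal
    # directions in feature space; eigh returns ascending eigenvalues.
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Projections of the mapped points onto the unit eigenvectors.
    return Kc @ vecs / np.sqrt(np.maximum(vals, 1e-12))

# Toy example: two noisy concentric circles, which linear PCA cannot
# separate but a Gaussian kernel handles easily.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, 200)
r = np.repeat([1.0, 3.0], 100)
X = np.c_[r * np.cos(t), r * np.sin(t)] + 0.05 * rng.standard_normal((200, 2))
print(kernel_pca_scores(X, gamma=0.5, n_components=2).shape)  # (200, 2)
```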
dc.description.abstract	In kernel methods, choosing a suitable kernel is indispensable for favorable results. While cross-validation is a useful method for choosing the kernel and its parameters in supervised learning, such as for support vector machines, no well-founded method has been established in general for unsupervised learning. We focus on kernel principal component analysis (kernel PCA) and kernel canonical correlation analysis (kernel CCA), the nonlinear extensions of principal component analysis (PCA) and canonical correlation analysis (CCA), respectively. Both methods have been used effectively for extracting nonlinear features and reducing dimensionality. As kernel methods, kernel PCA and kernel CCA also suffer from the problem of kernel choice, and although cross-validation is a popular method for choosing hyperparameters, it is not straightforwardly applicable to choosing the kernel and the number of components in kernel PCA and kernel CCA. It is thus important to develop well-founded methods for choosing the hyperparameters of these unsupervised methods. In kernel PCA, cross-validation cannot be applied directly because the norms given by different kernels are not comparable. The first goal of the dissertation is to propose a method for choosing the hyperparameters of kernel PCA (the kernel and the number of components) based on cross-validation of the reconstruction errors of pre-images in the original space, where errors are comparable across kernels. Experimental results on synthesized and real-world datasets demonstrate that the proposed method successfully selects an appropriate kernel and number of components for kernel PCA, in terms of both visualization and classification errors on the principal components. These results imply that the proposed method enables the automatic design of hyperparameters in kernel PCA.

In recent years, the influence function of kernel PCA and a robust kernel PCA have been theoretically derived. One observation from that analysis is that kernel PCA with a bounded kernel, such as the Gaussian kernel, is robust in the sense that the influence function does not diverge, whereas for kernel PCA with an unbounded kernel, for example a polynomial kernel, the influence function goes to infinity. This can be understood from the boundedness of the data mapped into the feature space by a bounded kernel. While this result holds for kernel PCA rather than kernel CCA, it is reasonable to expect that kernel CCA with a bounded kernel is also robust. This consideration motivates empirical studies on the robustness of kernel CCA: it is essential to know how kernel CCA is affected by outliers and to develop measures of accuracy. We therefore study a number of conventional robust estimators and kernel CCA with different kernel functions but fixed kernel parameters. The second goal of the dissertation is to discuss five canonical correlation coefficients and to investigate their robustness via the influence function, the sensitivity curve, the qualitative robustness index, and the breakdown point, using different types of simulated datasets.

The final goal of the dissertation is to examine the limitations of cross-validation for kernel CCA and to propose a new regularization approach that overcomes them. As we demonstrate for Gaussian kernels, the cross-validation errors for kernel CCA tend to decrease as the bandwidth parameter of the kernel decreases, which yields inappropriate features with all the data concentrated in a few points. This is caused by the ill-posedness of kernel CCA under cross-validation. To solve this problem, we propose to use constraints on the fourth-order moments of the canonical variables in addition to their variances. Experiments on synthesized and real-world datasets, including human action recognition for a robot, demonstrate that the proposed higher-order regularized kernel CCA can be applied effectively with cross-validation to find appropriate kernel and regularization parameters.	en_US
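To make the first goal concrete, here is a minimal sketch of choosing the kernel and the number of components by cross-validating pre-image reconstruction errors in the original input space, where errors from different kernels are comparable. This is not the dissertation's implementation: it relies on scikit-learn's KernelPCA, whose fit_inverse_transform option learns an approximate pre-image map by kernel ridge regression, and the parameter grids and toy data are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import KFold

def cv_reconstruction_error(X, gamma, n_components, n_splits=5):
    # Pre-image reconstruction errors live in the original input space,
    # so they remain comparable across kernels, unlike feature-space norms.
    errs = []
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(X):
        kpca = KernelPCA(n_components=n_components, kernel="rbf",
                         gamma=gamma, fit_inverse_transform=True)
        kpca.fit(X[train])
        # Project held-out points, map back to approximate pre-images,
        # and measure the squared error in the original space.
        X_rec = kpca.inverse_transform(kpca.transform(X[test]))
        errs.append(np.mean(np.sum((X[test] - X_rec) ** 2, axis=1)))
    return float(np.mean(errs))

# Synthetic data: a noisy circle, a standard nonlinear test case.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, 200)
X = np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.standard_normal((200, 2))

# Illustrative grids for the kernel bandwidth and number of components;
# the selected pair minimizes the cross-validated reconstruction error.
gammas = [0.01, 0.1, 1.0, 10.0]
components = [1, 2, 4]
best = min(((g, k) for g in gammas for k in components),
           key=lambda gk: cv_reconstruction_error(X, *gk))
print("selected (gamma, n_components):", best)
```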
dc.language.iso	en	en_US
dc.publisher	The Graduate University for Advanced Studies	en_US
dc.subject	Unsupervised	en_US
dc.subject	Kernel Methods	en_US
dc.subject	Properties of RKHS	en_US
dc.title	Kernel Choice for Unsupervised Kernel Methods	en_US
dc.type	Thesis	en_US

