Information theoretic approaches to principal component selection

Component selection is the main step in principal component analysis, both for dimension reduction and for any further work with the extracted factors. Existing selection methods range from hypothesis tests to subjective judgement of eigenvalue plots, and most of them are not suitable for automatically selecting the number of principal components. Several versions of description length criteria have been developed for automatic selection, but each has its own methodological advantages and disadvantages, and no single method performs best on all datasets. In the absence of general guidance on when to use a particular method, component selection procedures depend largely on personal preference rather than on statistical evidence. In this article, we survey the theoretical grounds of three commonly used minimum description length criteria based on the inherent Karhunen-Loève expansion of the observed process, and examine their performance in a series of simulation experiments. Finally, we present empirical results demonstrating that the theoretical properties of these criteria are reflected in the simulation experiments, and that the simulation results are in turn reflected in real data analysis.
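
To make the idea of automatic selection concrete, here is a minimal sketch of one description length score computed from the eigenvalues of the sample covariance matrix. It assumes the Wax-Kailath form of the criterion, which may or may not be one of the three criteria surveyed in the article; the function name `mdl_num_components` and the penalty term are illustrative assumptions, and other parameter counts appear in the literature.

```python
import numpy as np

def mdl_num_components(X):
    """Return the k minimizing a Wax-Kailath-style MDL score (illustrative sketch).

    X is an (n, p) data matrix with observations in rows.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # Eigenvalues of the sample covariance, sorted in descending order.
    eig = np.linalg.eigvalsh(Xc.T @ Xc / n)[::-1]
    eig = np.clip(eig, 1e-12, None)  # guard against log(0)
    scores = []
    for k in range(p):
        tail = eig[k:]  # the p - k smallest eigenvalues
        # Log of (geometric mean / arithmetic mean) of the tail; always <= 0.
        log_ratio = np.mean(np.log(tail)) - np.log(np.mean(tail))
        fit = -n * (p - k) * log_ratio
        # Assumed penalty: k(2p - k)/2 free parameters, each costing log(n).
        penalty = 0.5 * k * (2 * p - k) * np.log(n)
        scores.append(fit + penalty)
    return int(np.argmin(scores)), scores
```

The score trades off how close the discarded eigenvalues are to being equal (the fit term) against the number of parameters needed to describe the retained components (the penalty term), so no threshold or visual inspection is required.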

