What is canonical correlation
For example, someone interested in further understanding the relationships between the multidimensional constructs of personality and a healthy behavioral lifestyle might identify two sets of variables that measure those constructs.
In the personality set, one might include factors like conscientiousness, openness to experience, and neuroticism, whereas in the healthy behavior set, one might include physical activity, healthy eating, sleep, or dental hygiene. To use this technique, the researcher should identify two sets of measured variables.
The variables selected for a set should measure different dimensions of the same construct. As a second example, consider variables measured on environmental health and environmental toxins.
A number of environmental health variables might be measured, such as frequencies of sensitive species, species diversity, total biomass, and productivity of the environment, alongside a second set of variables measuring environmental toxins. For a third example, consider a group of sales representatives, on whom we have recorded several sales performance variables along with several measures of intellectual and creative aptitude. We may wish to explore the relationships between the sales performance variables and the aptitude variables. One approach to studying relationships between two sets of variables is canonical correlation analysis, which describes the relationship between the first set of variables and the second set of variables.
We do not necessarily think of one set of variables as independent and the other as dependent, though that may potentially be another approach.
In Stata, the canon command conducts a canonical correlation analysis. It requires two sets of variables, each enclosed in a pair of parentheses. In this example, we specify our psychological variables as the first set of variables and our academic variables plus gender as the second set.
The output for canonical correlation analysis is made up of two parts. The first part contains the raw canonical coefficients. The second part begins with the canonical correlations and includes the overall multivariate tests for dimensionality. The raw canonical coefficients can be used to generate the canonical variates for each set, represented by the columns 1, 2, and 3 in the coefficient tables.
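They are interpreted in a manner analogous to regression coefficients: each coefficient gives the change in the canonical variate associated with a one-unit increase in that variable when the other variables in the set are held constant. For a binary variable such as gender, the coefficient gives the change in the canonical variate associated with being female. The number of possible canonical variates, also known as canonical dimensions, is equal to the number of variables in the smaller set.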
In this example the smaller set contains three variables, which leads to three possible canonical variates for each set. These correspond to the three columns for each set and the three canonical correlation coefficients in the output. Canonical dimensions are latent variables that are analogous to factors obtained in factor analysis, except that canonical variates also maximize the correlation between the two sets of variables.
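The count of canonical dimensions can be checked numerically. The following is a minimal sketch in Python with NumPy, not the computation Stata's canon performs internally: it obtains the canonical correlations as the singular values of the product of orthonormal bases for the two centered sets. The function name `canonical_correlations` and the simulated data (three variables in one set, five in the other) are assumptions made for illustration only.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations and raw coefficients for X (n x p) and Y (n x q)."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)    # center each column
    Yc = Y - Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)  # reduced QR: orthonormal basis for each set
    Qy, Ry = np.linalg.qr(Yc)
    # Singular values of Qx'Qy are the canonical correlations, largest first.
    left, corrs, right_t = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    # Raw coefficients, scaled so every canonical variate has sample variance 1.
    A = np.linalg.solve(Rx, left) * np.sqrt(n - 1)
    B = np.linalg.solve(Ry, right_t.T) * np.sqrt(n - 1)
    return np.clip(corrs, 0.0, 1.0), A, B

# Simulated data (an assumption for illustration): 3 variables in set 1,
# 5 variables in set 2, linked through one shared latent factor.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 1))
X = latent @ rng.normal(size=(1, 3)) + rng.normal(size=(n, 3))
Y = latent @ rng.normal(size=(1, 5)) + rng.normal(size=(n, 5))

corrs, A, B = canonical_correlations(X, Y)
U = (X - X.mean(axis=0)) @ A   # canonical variates for set 1
V = (Y - Y.mean(axis=0)) @ B   # canonical variates for set 2
print(len(corrs))              # number of canonical dimensions = min(3, 5)
```

Each column of `A` and `B` plays the role of one column of raw coefficients, and the correlation between matching columns of `U` and `V` equals the corresponding canonical correlation.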
In general, not all the canonical dimensions will be statistically significant. A significant dimension corresponds to a significant canonical correlation and vice versa. To test whether a canonical correlation is statistically different from zero, we can use the test option of the canon command. To test all the canonical dimensions, we specify test(1 2 3).
Essentially test 1 is the overall test on three dimensions, test 2 will test the significance of canonical correlations 2 and 3, and test 3 will test the significance of the third canonical correlation alone.
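The nesting of these three tests can be sketched as follows. Each test uses Wilks' lambda built from the canonical correlations that remain at that step. This sketch pairs it with Bartlett's chi-square approximation, which is an assumption made here for simplicity and may differ from the approximation a given package reports. The function name, the canonical correlations, and the sample size are hypothetical values chosen for illustration.

```python
import math

def wilks_lambda_tests(canon_corrs, n, p, q):
    """Sequential tests: test k asks whether correlations k, k+1, ... are all zero."""
    results = []
    for k in range(len(canon_corrs)):
        # Wilks' lambda from the correlations still under test at this step.
        lam = math.prod(1 - r ** 2 for r in canon_corrs[k:])
        # Bartlett's chi-square approximation and its degrees of freedom.
        chi2 = -(n - 1 - (p + q + 1) / 2) * math.log(lam)
        df = (p - k) * (q - k)
        results.append((k + 1, lam, chi2, df))
    return results

# Hypothetical canonical correlations for sets of 3 and 5 variables, n = 100.
results = wilks_lambda_tests([0.5, 0.2, 0.1], n=100, p=3, q=5)
for test_number, lam, chi2, df in results:
    print(f"test {test_number}: lambda={lam:.4f} chi2={chi2:.2f} df={df}")
```

Test 1 uses all three correlations, test 2 drops the first, and test 3 uses only the last, which mirrors the description above.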
It is possible to create pairwise scatter plots of the variables in the first set against the variables in the second set. But if the dimension of the first set is p and that of the second set is q, there will be pq such scatter plots. It may be difficult, if not impossible, to look at all of these graphs together and interpret the results.
Similarly, you could compute all correlations between variables from the first set and variables from the second set, but interpreting pq correlations together is just as difficult when pq is large. Canonical correlation analysis allows us to summarize the relationships in a smaller number of statistics while preserving the main facets of the relationships.
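To make the pq count concrete, here is a small sketch that computes every cross-set correlation directly. The helper names and the tiny data set (p = 2, q = 3) are made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance over the square root of the variance product."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cross_correlations(first_set, second_set):
    """Return the p x q table of correlations between the two sets."""
    return [[pearson(x, y) for y in second_set] for x in first_set]

# Made-up data: each inner list holds one variable's observations.
first_set = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]]                        # p = 2
second_set = [[2, 4, 6, 8, 10], [5, 3, 4, 1, 2], [1, 0, 1, 0, 1]]     # q = 3

table = cross_correlations(first_set, second_set)
n_pairs = sum(len(row) for row in table)   # p * q correlations to inspect
print(n_pairs)
```

Even with only two and three variables there are already six correlations to read, and the count grows multiplicatively with the size of each set.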
In a way, the motivation for canonical correlation is very similar to that of principal component analysis: it is another dimension reduction technique. Suppose the first set, X, contains p variables and the second set, Y, contains q variables, with the sets labeled so that p ≤ q. This is done for computational convenience. As in principal components analysis, we look at linear combinations of the data. We define two sets of linear combinations named U and V. U contains the linear combinations of the variables in the first set, X, and V contains the linear combinations of the variables in the second set, Y. Each member of U is paired with a member of V.
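Written out, the linear combinations take the following form. The coefficient symbols a and b are notation assumed here for illustration, with p variables in the first set and q in the second.

```latex
\begin{align*}
U_i &= a_{i1} X_1 + a_{i2} X_2 + \dots + a_{ip} X_p, \\
V_i &= b_{i1} Y_1 + b_{i2} Y_2 + \dots + b_{iq} Y_q,
\qquad i = 1, \dots, \min(p, q).
\end{align*}
```

The pair (U_i, V_i) is the i-th canonical variate pair.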
And so on for the remaining pairs. We hope to find linear combinations that maximize the correlations between the members of each canonical variate pair. The correlation between the two members of a pair is the covariance between the two variables divided by the square root of the product of their variances.
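In symbols, and writing the i-th canonical correlation as ρ*_i (a notation assumed here), this quantity can be expressed as:

```latex
\[
\rho^{*}_{i}
  = \frac{\operatorname{cov}(U_i, V_i)}
         {\sqrt{\operatorname{var}(U_i)\,\operatorname{var}(V_i)}}
\]
```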
The canonical correlation is a specific type of correlation. This is the quantity to maximize. We want to find linear combinations of the X's and linear combinations of the Y's that maximize the above correlation.
For the first canonical variate pair, this maximization is subject to the constraint that the variances of the two canonical variates in that pair are both equal to one. For each subsequent pair, we again maximize the canonical correlation subject to the constraint that the variances of the individual canonical variates are both equal to one. In summary, every canonical variate is constrained to have unit variance.
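The constraints can be collected as below. The unit-variance conditions come from the text above. The zero-covariance conditions are the usual additional requirement that each new pair be uncorrelated with all earlier pairs, and they are included here as an assumption about the standard formulation rather than something stated in this article.

```latex
\begin{align*}
\operatorname{var}(U_i) &= \operatorname{var}(V_i) = 1
  && \text{for all } i, \\
\operatorname{cov}(U_i, U_j) &= \operatorname{cov}(V_i, V_j) = 0
  && \text{for } i \neq j, \\
\operatorname{cov}(U_i, V_j) &= 0
  && \text{for } i \neq j.
\end{align*}
```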
Two collections of variables were measured on the sales representatives: the sales performance variables and the aptitude variables. The data are provided as a text file, with an accompanying SAS program. Canonical correlation analysis is carried out in SAS using the canonical correlation procedure, abbreviated as cancorr.