

This page shows an example of a discriminant analysis in SPSS with footnotes explaining the output. The data used in this example are from a data file with 244 observations on four variables: three continuous, numeric variables (outdoor, social and conservative) and one categorical variable (job) with three levels: 1) customer service, 2) mechanic and 3) dispatcher.

We are interested in the relationship between the three continuous variables and our categorical variable. Specifically, we would like to know how many dimensions we would need to express this relationship. We can predict a classification based on the continuous variables or assess how well the continuous variables separate the categories in the classification. We will be discussing the degree to which the continuous variables can be used to discriminate between the groups. Some options for visualizing what occurs in discriminant analysis can be found in the Discriminant Analysis Data Analysis Example.

To start, we can examine the overall means of the continuous variables. We are interested in how job relates to outdoor, social and conservative. Let's look at summary statistics of these three continuous variables for each job category.

    means tables=outdoor social conservative by job.

From this output, we can see that some of the means of outdoor, social and conservative differ noticeably from group to group in job. These differences will hopefully allow us to use these predictors to distinguish observations in one job group from observations in another job group.

Next, we can look at the correlations between these three predictors. These correlations will give us some indication of how much unique information each predictor will contribute to the analysis. If two predictors are very highly correlated, then they will be contributing shared information to the analysis. Uncorrelated variables are likely preferable in this respect. We will also look at the frequency of each job group.

The discriminant command in SPSS performs canonical linear discriminant analysis, which is the classical form of discriminant analysis. In this example, we specify in the groups subcommand that we are interested in the variable job, and we list in parentheses the minimum and maximum values seen in job. We list the discriminating variables, or predictors, in the variables subcommand. In this example, we have selected three predictors: outdoor, social and conservative. We will be interested in comparing the actual groupings in job to the predicted groupings generated by the discriminant analysis. For this, we use the statistics subcommand. This will provide us with classification statistics in our output.

Analysis Case Processing Summary: This table summarizes the analysis dataset in terms of valid and excluded cases. The reasons why SPSS might exclude an observation from the analysis are listed here.

I am trying to use R to replicate the more detailed output from a Linear Discriminant Analysis that is produced by SPSS. The R output lacks several of the statistics which are given with SPSS; however, it should be possible to calculate these from the available information.

Having read through previous answers on this issue, I can see that an earlier answer here gave a detailed comparison of the SPSS and R output, as well as instructions on how to calculate the various statistics here. This is also complemented by the question and answer by Wilson here. However, I am still having difficulty replicating the structure matrix produced by SPSS in R.

My question has two parts, which I will summarise here before explaining the details:

Firstly, I can produce a structure matrix using R; however, it does not match the one given by SPSS. I am interested in what the matrix that R produces is and whether it is a useful measure to describe the results of the Linear Discriminant Analysis.

Secondly, I have tried calculating the structure matrix more directly, but end up with a matrix that doesn't match either the R output or the SPSS output, so I suspect I have made a mistake somewhere.

Here are the iris data:

    Sepal.Length Sepal.Width Petal.Length Petal.Width Species

In R, the lda can be performed using:

    library(MASS)

The unstandardised discriminant coefficients and discriminant scores match those in the SPSS output and can be obtained using:

    #Unstandardised discriminant coefficients

Additional outputs can be obtained using the package candisc mentioned in this helpful post by Wilson here.

    Man1 <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species, data = iris)

The structure matrix from candisc (which I believe is the same as the pooled within-groups correlations, as mentioned here) doesn't match the SPSS output (copied from the answer). A friend was also able to replicate this same output for me in SPSS.
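One way to investigate the mismatch is to compute two candidate structure matrices side by side. This is a minimal sketch, not a confirmed reproduction of either program's output. It assumes that the SPSS structure matrix holds pooled within-groups correlations between each predictor and the discriminant scores, and that a correlation computed on the whole sample (ignoring groups) is the likely alternative. The names `centre_by_group`, `structure_within` and `structure_total` are illustrative and do not come from MASS or candisc.

```r
library(MASS)  # provides lda()

# Fit the LDA on the built-in iris data.
fit <- lda(Species ~ ., data = iris)

X <- as.matrix(iris[, 1:4])
scores <- predict(fit)$x  # discriminant scores, columns LD1 and LD2

# Hypothetical helper: subtract each group's mean from every column.
centre_by_group <- function(m, g) {
  apply(m, 2, function(col) col - ave(col, g))
}
Xc <- centre_by_group(X, iris$Species)
Sc <- centre_by_group(scores, iris$Species)

# Pooled within-groups cross-products and sums of squares.
W_xs <- crossprod(Xc, Sc)    # predictors x discriminant functions
W_xx <- diag(crossprod(Xc))  # within-group SS of each predictor
W_ss <- diag(crossprod(Sc))  # within-group SS of each function

# Assumed SPSS-style definition: pooled within-groups correlations.
structure_within <- W_xs / sqrt(outer(W_xx, W_ss))

# Total-sample correlations, which ignore group membership.
structure_total <- cor(X, scores)

print(round(structure_within, 3))
print(round(structure_total, 3))
```

If the two matrices differ noticeably, the choice between within-groups and total-sample correlations is a plausible source of the discrepancy. Signs of whole columns can also differ between programs, because the direction of a discriminant function is arbitrary.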
