MCA in R Using FactoMineR: Quick Scripts and Videos

This article presents quick-start R code and a video series for computing MCA (Multiple Correspondence Analysis) in R using the FactoMineR package. Recall that MCA is used to analyze multivariate data sets containing categorical variables, such as survey data.

MCA in R using FactoMineR: Video course

Contents:

Quick start R code
Theory and key concepts
MCA examples in R
Further reading
Related Books

Quick start R code

Install FactoMineR package:

install.packages("FactoMineR")

Compute MCA using the demo data set poison [in FactoMineR]. This data set comes from a survey of primary school children who suffered from food poisoning. They were asked about their symptoms and about what they ate.

library(FactoMineR)
data("poison")
res.mca <- MCA(poison, 
              quanti.sup = 1:2, # Supplementary quantitative variable
              quali.sup = 3:4,  # Supplementary qualitative variable
              graph=FALSE)
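Before running the call above, it can be worth checking that the column indices passed to quanti.sup and quali.sup match the variable types. The sketch below assumes the layout of poison as shipped with FactoMineR (55 rows and 15 columns, with the quantitative variables first); check it against your installed version.

```r
library(FactoMineR)
data("poison")

# Dimensions of the data set: individuals x variables
dim(poison)

# Class of each column: numeric columns can only enter MCA as
# supplementary quantitative variables; factors can be active
# or supplementary qualitative variables
col.classes <- sapply(poison, function(x) class(x)[1])
col.classes

# Indices of the numeric columns (candidates for quanti.sup)
quanti.idx <- which(sapply(poison, is.numeric))
quanti.idx
```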

Key terms:

Active individuals and variables are used during the MCA.
Supplementary individuals and variables: their coordinates will be predicted after the MCA.
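As a minimal sketch of the distinction above, the call below also declares a few supplementary individuals through the ind.sup argument of MCA(). The choice of rows 53 to 55 is arbitrary and made only for illustration. Supplementary elements do not influence the dimensions; their coordinates are computed afterwards.

```r
library(FactoMineR)
data("poison")

# Arbitrary choice for illustration: the last three children
# are treated as supplementary individuals
res.sup <- MCA(poison,
               ind.sup = 53:55,   # Supplementary individuals
               quanti.sup = 1:2,  # Supplementary quantitative variables
               quali.sup = 3:4,   # Supplementary qualitative variables
               graph = FALSE)

# Number of active individuals used to build the dimensions
nrow(res.sup$ind$coord)

# Predicted coordinates of the supplementary elements
res.sup$ind.sup$coord     # supplementary individuals
res.sup$quali.sup$coord   # categories of supplementary qualitative variables
res.sup$quanti.sup$coord  # supplementary quantitative variables
```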

Visualize the eigenvalues (scree plot), showing the percentage of variance explained by each dimension.

eig.val <- res.mca$eig
barplot(eig.val[, 2], 
        names.arg = 1:nrow(eig.val), 
        main = "Variances Explained by Dimensions (%)",
        xlab = "Principal Dimensions",
        ylab = "Percentage of variances",
        col ="steelblue")
# Add connected line segments to the plot
lines(x = 1:nrow(eig.val), eig.val[, 2], 
      type = "b", pch = 19, col = "red")
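The eigenvalue table behind the scree plot can also be read directly. The sketch below prints its first rows and looks up how many dimensions are needed to reach a given cumulative percentage; the 50% threshold is a hypothetical value chosen for illustration, not a recommendation.

```r
library(FactoMineR)
data("poison")
res.mca <- MCA(poison, quanti.sup = 1:2, quali.sup = 3:4, graph = FALSE)

eig.val <- res.mca$eig
# Column 1: eigenvalue
# Column 2: percentage of variance
# Column 3: cumulative percentage of variance
round(head(eig.val), 2)

# Smallest number of dimensions whose cumulative percentage
# reaches the (hypothetical) 50% threshold
n.dim <- which(eig.val[, 3] >= 50)[1]
n.dim
```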

Biplot of individuals and variables showing the link between them.

plot(res.mca, autoLab = "yes")

Blue: individuals
Red: variables
Dark green: supplementary qualitative variables

Graph of individuals. Individuals with a similar profile are grouped together. Use the argument invisible to hide active and supplementary variables on the plot.

plot(res.mca,
     invisible = c("var", "quali.sup", "quanti.sup"),
     cex = 0.8,                                    
     autoLab = "yes")

Graph of active variables. Use the argument invisible to hide individuals and supplementary variables on the plot.

plot(res.mca, 
     invisible = c("ind", "quali.sup", "quanti.sup"),
     cex = 0.8,
     autoLab = "yes")

Color individuals by groups and add confidence ellipses around the mean of groups.

plotellipses(res.mca, keepvar = c("Vomiting", "Fish"))

For ggplot2-based visualization, read this: MCA - Multiple Correspondence Analysis in R: Essentials

Access the results:

# Eigenvalues
res.mca$eig
  
# Results for active Variables
res.var <- res.mca$var
res.var$coord          # Coordinates
res.var$contrib        # Contributions to the PCs
res.var$cos2           # Quality of representation 
# Results for qualitative supp. variables
res.mca$quali.sup
# Results for active individuals
res.ind <- res.mca$ind
res.ind$coord          # Coordinates
res.ind$contrib        # Contributions to the PCs
res.ind$cos2           # Quality of representation
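The result tables are ordinary matrices, so they can be sorted and summarized with base R. The sketch below is one possible way to list the categories contributing most to the first dimension and the individuals best represented on the first two dimensions; the cutoff of five rows is arbitrary.

```r
library(FactoMineR)
data("poison")
res.mca <- MCA(poison, quanti.sup = 1:2, quali.sup = 3:4, graph = FALSE)

# Categories contributing most to dimension 1
# (contributions are expressed in percent)
var.contrib <- res.mca$var$contrib
top.var <- head(sort(var.contrib[, 1], decreasing = TRUE), 5)
round(top.var, 2)

# Individuals best represented on the plane formed by
# dimensions 1 and 2: sum of their cos2 on both dimensions
ind.cos2 <- res.mca$ind$cos2
plane.quality <- rowSums(ind.cos2[, 1:2])
round(head(sort(plane.quality, decreasing = TRUE), 5), 2)
```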

The following video series explains the basics of MCA and shows practical examples and their interpretation in R.

Theory and key concepts

Data types

This video describes the data format and the goals of MCA.

Visualizing the point cloud of individuals

This video shows how to build the point cloud of rows/individuals and how to interpret it using the categories of the variables.

Visualizing the cloud of categories

In this video, you’ll learn how to build point clouds of categories and how to get an optimal representation of them. You will also discover the link between the optimal representation of individuals and the optimal representation of categories.

Interpretation

This video describes some interpretation aids, shared by all principal component methods. Additionally, it shows how to use supplementary information, including supplementary variables, in MCA.

Course video materials

MCA examples in R

MCA in practice with FactoMineR

Handling missing values

This video shows how to handle missing values in MCA using the missMDA and FactoMineR packages.
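A minimal sketch of this workflow, assuming the missMDA package is installed: imputeMCA() completes the indicator (disjunctive) table, which is then passed to MCA() through its tab.disj argument. The poison data has no missing values, so a few answers are removed artificially here; the affected rows and the choice ncp = 2 are arbitrary and serve only as an illustration.

```r
library(FactoMineR)
library(missMDA)
data("poison")

# Keep only the active qualitative variables
active <- poison[, 5:15]

# Artificially remove a few answers (illustration only)
active[c(3, 17, 42), "Vomiting"] <- NA

# Impute the indicator matrix; ncp = 2 is an arbitrary choice
res.impute <- imputeMCA(active, ncp = 2)

# Run MCA on the incomplete data, supplying the imputed
# indicator matrix through tab.disj
res.mca.na <- MCA(active, tab.disj = res.impute$tab.disj, graph = FALSE)
head(res.mca.na$eig)
```

In practice, the number of dimensions used for imputation is usually chosen from the data rather than fixed by hand; missMDA provides estim_ncpMCA() for that purpose.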

MCA Graphical user interface: Factoshiny

Automatic interpretation: FactoInvestigate

The FactoInvestigate R package automatically generates a report interpreting the results of a principal component analysis. Learn more in our previous article: FactoInvestigate R Package: Automatic Reports and Interpretation of Principal Component Analyses

Related Books

Practical Guide to Principal Component Methods in R
Exploratory Multivariate Analysis by Example Using R
Practical Guide to Cluster Analysis in R
