<?xml version="1.0" encoding="UTF-8" ?>
<!-- RSS generated by PHPBoost on Sat, 25 Apr 2026 10:45:55 +0200 -->

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title><![CDATA[Easy Guides]]></title>
		<atom:link href="https://www.sthda.com/english/syndication/rss/wiki/32" rel="self" type="application/rss+xml"/>
		<link>https://www.sthda.com</link>
		<description><![CDATA[Last articles of the category: Factor analysis]]></description>
		<copyright>(C) 2005-2026 PHPBoost</copyright>
		<language>en</language>
		<generator>PHPBoost</generator>
		
		
		<item>
			<title><![CDATA[Multiple Correspondence Analysis Essentials: Interpretation and application to investigate the associations between categories of multiple qualitative variables  - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/multiple-correspondence-analysis-essentials-interpretation-and-application-to-investigate-the-associations-between-categories-of-multiple-qualitative-variables-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/multiple-correspondence-analysis-essentials-interpretation-and-application-to-investigate-the-associations-between-categories-of-multiple-qualitative-variables-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">

<div id="TOC">
<ul>
<li><a href="#required-packages">Required packages</a></li>
<li><a href="#load-factominer-and-factoextra">Load FactoMineR and factoextra</a></li>
<li><a href="#data-format">Data format</a></li>
<li><a href="#exploratory-data-analysis">Exploratory data analysis</a></li>
<li><a href="#multiple-correspondence-analysis-mca">Multiple Correspondence Analysis (MCA)</a></li>
<li><a href="#summary-of-mca-outputs">Summary of MCA outputs</a></li>
<li><a href="#interpretation-of-mca-outputs">Interpretation of MCA outputs</a></li>
<li><a href="#eigenvaluesvariances-and-screeplot">Eigenvalues/variances and screeplot</a></li>
<li><a href="#mca-scatter-plot-biplot-of-individuals-and-variable-categories">MCA scatter plot: Biplot of individuals and variable categories</a></li>
<li><a href="#variable-categories">Variable categories</a><ul>
<li><a href="#correlation-between-variables-and-principal-dimensions">Correlation between variables and principal dimensions</a></li>
<li><a href="#coordinates-of-variable-categories">Coordinates of variable categories</a></li>
<li><a href="#contribution-of-variable-categories-to-the-dimensions">Contribution of variable categories to the dimensions</a></li>
<li><a href="#cos2-the-quality-of-representation-of-variable-categories">Cos2 : The quality of representation of variable categories</a></li>
</ul></li>
<li><a href="#individuals">Individuals</a><ul>
<li><a href="#coordinates-of-individuals">Coordinates of individuals</a></li>
<li><a href="#contribution-of-individuals-to-the-dimensions">Contribution of individuals to the dimensions</a></li>
<li><a href="#cos2-the-quality-of-representation-of-individuals">Cos2 : The quality of representation of individuals</a></li>
<li><a href="#change-the-color-of-individuals-by-groups">Change the color of individuals by groups</a></li>
</ul></li>
<li><a href="#mca-using-supplementary-individuals-and-variables">MCA using supplementary individuals and variables</a><ul>
<li><a href="#make-a-biplot-of-individuals-and-variable-categories">Make a biplot of individuals and variable categories</a></li>
<li><a href="#visualize-supplementary-variables">Visualize supplementary variables</a><ul>
<li><a href="#supplementary-qualitative-variable-categories">Supplementary qualitative variable categories</a></li>
<li><a href="#supplementary-quantitative-variables">Supplementary quantitative variables</a></li>
</ul></li>
<li><a href="#visualize-supplementary-individuals">Visualize supplementary individuals</a></li>
</ul></li>
<li><a href="#filter-the-mca-result">Filter the MCA result</a></li>
<li><a href="#dimension-description">Dimension description</a></li>
<li><a href="#infos">Infos</a></li>
<li><a href="#references-and-further-reading">References and further reading</a></li>
</ul>
</div>

<p><br/></p>
<p>As described in my previous article, the simple <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining"><strong>correspondence analysis (CA)</strong></a> is used to analyse the contingency table formed by two <strong>categorical variables</strong>.</p>
<p>To learn more about <strong>CA</strong>, read this article: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a>.</p>
<p><strong>Multiple Correspondence Analysis (MCA)</strong> is an extension of simple <strong>CA</strong> to analyse a data table containing more than two categorical variables.</p>
<p><strong>MCA</strong> is generally used to analyse survey data.</p>
<p>The objectives are to identify:</p>
<ul>
<li>Groups of individuals with similar profiles in their answers to the questions</li>
<li>The associations between variable categories</li>
</ul>
<p>There are several R functions from different packages to compute <strong>MCA</strong>, including:</p>
<ul>
<li><strong>MCA()</strong> [in <em>FactoMineR</em> package]</li>
<li><strong>dudi.mca()</strong> [in <em>ade4</em> package]</li>
</ul>
<p>These packages also provide some standard functions to visualize the results of the analysis. It’s also possible to use the package <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-quick-multivariate-data-analysis-pca-ca-mca-and-visualization-r-software-and-data-mining"><strong>factoextra</strong></a> to easily generate elegant graphs.</p>
<p><span class="success">This article describes how to perform and interpret <strong>multiple correspondence analysis</strong> using the <strong>FactoMineR</strong> package.</span></p>
<div id="required-packages" class="section level1">
<h1>Required packages</h1>
<p>The packages <strong>FactoMineR</strong> (for computing <strong>MCA</strong>) and <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> (for MCA visualization) are used.</p>
<p>These packages can be installed as follows:</p>
<pre class="r"><code>install.packages("FactoMineR")

# install.packages("devtools")
devtools::install_github("kassambara/factoextra")</code></pre>
<p><span class="warning">Note that factoextra version 1.0.2 or later is required for this tutorial. If an older version is already installed on your computer, re-install it to get the latest version.</span></p>
</div>
<div id="load-factominer-and-factoextra" class="section level1">
<h1>Load FactoMineR and factoextra</h1>
<pre class="r"><code>library("FactoMineR")
library("factoextra")</code></pre>
</div>
<div id="data-format" class="section level1">
<h1>Data format</h1>
<p>We’ll use the data set <em>poison</em> [in <em>FactoMineR</em>]:</p>
<pre class="r"><code>data(poison)
head(poison[, 1:7])</code></pre>
<pre><code>  Age Time   Sick Sex   Nausea Vomiting Abdominals
1   9   22 Sick_y   F Nausea_y  Vomit_n     Abdo_y
2   5    0 Sick_n   F Nausea_n  Vomit_n     Abdo_n
3   6   16 Sick_y   F Nausea_n  Vomit_y     Abdo_y
4   9    0 Sick_n   F Nausea_n  Vomit_n     Abdo_n
5   7   14 Sick_y   M Nausea_n  Vomit_y     Abdo_y
6  72    9 Sick_y   M Nausea_n  Vomit_n     Abdo_y</code></pre>
<p>An image of the data is shown below:</p>
<p><a href="https://www.sthda.com/english/sthda/RDoc/images/mca-poison-big.png" title="Click to zoom!" target="_blank"> <img src="https://www.sthda.com/english/sthda/RDoc/images/mca-poison.png" alt="Multiple Correspondence analysis data"/> </a></p>
<p>These data come from a survey of primary school children who suffered from food poisoning. They were asked about their symptoms and about what they ate.</p>
<p>The data contains 55 rows (children, individuals) and 15 columns (variables).</p>
<br/>

<div class="warning">
<p>Only some of these individuals (children) and variables will be used to perform the <em>multiple correspondence analysis (MCA)</em>.</p>
<p>The coordinates of the remaining individuals and variables on the factor map will be <strong>predicted</strong> after the MCA.</p>
</div>
<p><br/></p>
<p>In <strong>MCA</strong> terminology, our data contains:</p>
<br/>
<div class="block">
<ul>
<li><strong>Active individuals</strong> (rows 1:55): Individuals that are used to perform the <strong>MCA</strong>.</li>
<li><strong>Active variables</strong> (columns 5:15): Variables that are used to perform the <strong>MCA</strong>.</li>
<li><strong>Supplementary variables</strong>: They don’t participate in the MCA. The coordinates of these variables will be predicted.</li>
<li><strong>Supplementary continuous variables</strong>: Columns 1 and 2, corresponding to the columns <em>Age</em> and <em>Time</em>, respectively.</li>
<li><strong>Supplementary qualitative variables</strong>: Columns 3 and 4, corresponding to the columns <em>Sick</em> and <em>Sex</em>, respectively. These factor variables will be used to color individuals by groups.</li>
</ul>
</div>
<p><br/></p>
<p>Subset only active individuals and variables for multiple correspondence analysis:</p>
<pre class="r"><code>poison.active <- poison[1:55, 5:15]
head(poison.active[, 1:6])</code></pre>
<pre><code>    Nausea Vomiting Abdominals   Fever   Diarrhae   Potato
1 Nausea_y  Vomit_n     Abdo_y Fever_y Diarrhea_y Potato_y
2 Nausea_n  Vomit_n     Abdo_n Fever_n Diarrhea_n Potato_y
3 Nausea_n  Vomit_y     Abdo_y Fever_y Diarrhea_y Potato_y
4 Nausea_n  Vomit_n     Abdo_n Fever_n Diarrhea_n Potato_y
5 Nausea_n  Vomit_y     Abdo_y Fever_y Diarrhea_y Potato_y
6 Nausea_n  Vomit_n     Abdo_y Fever_y Diarrhea_y Potato_y</code></pre>
</div>
<div id="exploratory-data-analysis" class="section level1">
<h1>Exploratory data analysis</h1>
<p>The function <strong>summary()</strong> can be used to compute the frequency of variable categories. As the data table contains a large number of variables, we’ll display only the results for the first 4 variables.</p>
<p><strong>Statistical summaries</strong>:</p>
<pre class="r"><code># Summary of the first 4 variables
summary(poison.active)[, 1:4]</code></pre>
<pre><code>      Nausea        Vomiting     Abdominals       Fever     
 "Nausea_n:43  " "Vomit_n:33  " "Abdo_n:18  " "Fever_n:20  "
 "Nausea_y:12  " "Vomit_y:22  " "Abdo_y:37  " "Fever_y:35  "</code></pre>
<p>It’s also possible to plot the frequency of variable categories:</p>
<pre class="r"><code>for (i in 1:ncol(poison.active)) {
  plot(poison.active[,i], main=colnames(poison.active)[i],
       ylab = "Count", col="steelblue", las = 2)
  }</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-2.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-3.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-4.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-5.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-6.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img 
src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-7.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-8.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-9.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-10.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-frequency-variables-data-mining-11.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="172.8" style="margin-bottom:10px;" /></p>
<p><span class="warning">The graphs above can be used to identify variable categories with a very low frequency. Such categories can distort the analysis.</span></p>
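<p>Low-frequency categories can also be listed numerically. The code below is a minimal sketch: the 5% cutoff and the object names <em>counts</em>, <em>threshold</em> and <em>rare</em> are illustrative assumptions, not a rule from FactoMineR.</p>
<pre class="r"><code>library("FactoMineR")
data(poison)
poison.active <- poison[1:55, 5:15]

# Number of individuals in each category of each active variable
counts <- unlist(lapply(poison.active, table))

# Hypothetical cutoff: categories observed in fewer than 5% of individuals
threshold <- 0.05 * nrow(poison.active)
rare <- counts[counts < threshold]
print(rare)</code></pre>
<p>Any category listed in <em>rare</em> deserves a closer look before running the MCA; the cutoff can be adapted to the size of the data.</p>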
</div>
<div id="multiple-correspondence-analysis-mca" class="section level1">
<h1>Multiple Correspondence Analysis (MCA)</h1>
<p>The function <strong>MCA()</strong> [in <em>FactoMineR</em> package] can be used. A simplified format is:</p>
<pre class="r"><code>MCA(X, ncp = 5, graph = TRUE)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>X</strong> : a data frame with n rows (individuals) and p columns (categorical variables)</li>
<li><strong>ncp</strong> : number of dimensions kept in the final results.</li>
<li><strong>graph</strong> : a logical value. If TRUE a graph is displayed.</li>
</ul>
</div>
<p><br/></p>
<p>In the R code below, the MCA is performed only on the active individuals/variables:</p>
<pre class="r"><code>res.mca <- MCA(poison.active, graph = FALSE)</code></pre>
<p>The output of the function <strong>MCA()</strong> is a list whose elements can be displayed with <strong>print()</strong>:</p>
<pre class="r"><code>print(res.mca)</code></pre>
<pre><code>**Results of the Multiple Correspondence Analysis (MCA)**
The analysis was performed on 55 individuals, described by 11 variables
*The results are available in the following objects:

   name              description                       
1  "$eig"            "eigenvalues"                     
2  "$var"            "results for the variables"       
3  "$var$coord"      "coord. of the categories"        
4  "$var$cos2"       "cos2 for the categories"         
5  "$var$contrib"    "contributions of the categories" 
6  "$var$v.test"     "v-test for the categories"       
7  "$ind"            "results for the individuals"     
8  "$ind$coord"      "coord. for the individuals"      
9  "$ind$cos2"       "cos2 for the individuals"        
10 "$ind$contrib"    "contributions of the individuals"
11 "$call"           "intermediate results"            
12 "$call$marge.col" "weights of columns"              
13 "$call$marge.li"  "weights of rows"                 </code></pre>
<p><span class="success">The object that is created using the function <strong>MCA()</strong> contains results as lists. These values are described in the next sections.</span></p>
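<p>As a quick illustration, the elements listed in the printed output can be accessed with the <em>$</em> operator. This sketch recomputes the MCA so that it runs on its own; it only uses element names shown in the output above.</p>
<pre class="r"><code>library("FactoMineR")
data(poison)
res.mca <- MCA(poison[1:55, 5:15], graph = FALSE)

# Eigenvalues (one row per dimension)
head(res.mca$eig)

# Coordinates of the variable categories
head(res.mca$var$coord)

# Coordinates of the individuals
head(res.mca$ind$coord)</code></pre>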
</div>
<div id="summary-of-mca-outputs" class="section level1">
<h1>Summary of MCA outputs</h1>
<p>The function <strong>summary.MCA()</strong> [in <em>FactoMineR</em>] is used to print a summary of <strong>multiple correspondence analysis</strong> results:</p>
<pre class="r"><code>summary(object, nb.dec = 3, nbelements = 10, 
        ncp = 3, file ="", ...)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>object</strong>: an object of class <strong>MCA</strong></li>
<li><strong>nb.dec</strong>: number of decimals printed</li>
<li><strong>nbelements</strong>: number of elements (individuals, categories, variables) to be printed. To print all the elements, use <em>nbelements = Inf</em>.</li>
<li><strong>ncp</strong>: number of dimensions to be printed</li>
<li><strong>file</strong>: an optional file name for exporting the summaries.</li>
</ul>
</div>
<p><br/></p>
<p><strong>Print the summary of the MCA for the dimensions 1 and 2:</strong></p>
<pre class="r"><code>summary(res.mca, nb.dec = 2, ncp = 2)</code></pre>
<pre><code>

Eigenvalues
                      Dim.1  Dim.2  Dim.3  Dim.4  Dim.5  Dim.6  Dim.7  Dim.8  Dim.9 Dim.10 Dim.11
Variance               0.34   0.13   0.11   0.10   0.08   0.07   0.06   0.06   0.04   0.01   0.01
% of var.             33.52  12.91  10.73   9.59   7.88   7.11   6.02   5.58   4.12   1.30   1.23
Cumulative % of var.  33.52  46.44  57.17  66.76  74.64  81.75  87.77  93.35  97.47  98.77 100.00

Individuals (the 10 first)
             Dim.1   ctr  cos2   Dim.2   ctr  cos2  
1          | -0.45  1.11  0.35 | -0.26  0.98  0.12 |
2          |  0.84  3.79  0.56 | -0.03  0.01  0.00 |
3          | -0.45  1.09  0.55 |  0.14  0.26  0.05 |
4          |  0.88  4.20  0.75 | -0.09  0.10  0.01 |
5          | -0.45  1.09  0.55 |  0.14  0.26  0.05 |
6          | -0.36  0.70  0.02 | -0.44  2.68  0.04 |
7          | -0.45  1.09  0.55 |  0.14  0.26  0.05 |
8          | -0.64  2.23  0.62 | -0.01  0.00  0.00 |
9          | -0.45  1.11  0.35 | -0.26  0.98  0.12 |
10         | -0.14  0.11  0.04 |  0.12  0.21  0.03 |

Categories (the 10 first)
             Dim.1   ctr  cos2 v.test   Dim.2   ctr  cos2 v.test  
Nausea_n   |  0.27  1.52  0.26   3.72 |  0.12  0.81  0.05   1.69 |
Nausea_y   | -0.96  5.43  0.26  -3.72 | -0.43  2.91  0.05  -1.69 |
Vomit_n    |  0.48  3.73  0.34   4.31 | -0.41  7.07  0.25  -3.68 |
Vomit_y    | -0.72  5.60  0.34  -4.31 |  0.61 10.61  0.25   3.68 |
Abdo_n     |  1.32 15.42  0.85   6.76 | -0.04  0.03  0.00  -0.18 |
Abdo_y     | -0.64  7.50  0.85  -6.76 |  0.02  0.01  0.00   0.18 |
Fever_n    |  1.17 13.54  0.78   6.51 | -0.17  0.78  0.02  -0.97 |
Fever_y    | -0.67  7.74  0.78  -6.51 |  0.10  0.45  0.02   0.97 |
Diarrhea_n |  1.18 13.80  0.80   6.57 |  0.00  0.00  0.00  -0.02 |
Diarrhea_y | -0.68  7.88  0.80  -6.57 |  0.00  0.00  0.00   0.02 |

Categorical variables (eta2)
             Dim.1 Dim.2  
Nausea     |  0.26  0.05 |
Vomiting   |  0.34  0.25 |
Abdominals |  0.85  0.00 |
Fever      |  0.78  0.02 |
Diarrhae   |  0.80  0.00 |
Potato     |  0.03  0.40 |
Fish       |  0.01  0.03 |
Mayo       |  0.38  0.03 |
Courgette  |  0.02  0.45 |
Cheese     |  0.19  0.05 |</code></pre>
<p>The result of the function <strong>summary()</strong> contains 4 tables:</p>
<ul>
<li><strong>Table 1 - Eigenvalues</strong>: contains the variances and the percentage of variance retained by each dimension.</li>
<li><strong>Table 2 - Individuals</strong>: contains the coordinates, the contribution and the cos2 (quality of representation, between 0 and 1) of the first 10 active individuals on dimensions 1 and 2.</li>
<li><strong>Table 3 - Categories</strong>: contains the coordinates, the contribution and the cos2 (quality of representation, between 0 and 1) of the first 10 active variable categories on dimensions 1 and 2. This table also contains a column called <em>v.test</em>. If the coordinate of a category were not different from 0, its <em>v.test</em> would generally lie between -2 and 2. For a given variable category, an absolute <em>v.test</em> value greater than 2 therefore means that the coordinate is significantly different from 0.</li>
<li><strong>Table 4 - Categorical variables (eta2)</strong>: contains the squared correlation ratio between each variable and the dimensions.</li>
</ul>
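<p>The rule of thumb on the <em>v.test</em> can be sketched in base R. The values below are copied from the dimension 2 column of the categories table above; the standard normal approximation used to get p-values is an assumption of this sketch, and the object names are illustrative.</p>
<pre class="r"><code># v.test values on dimension 2, copied from the summary table above
v.test <- c(Nausea_n = 1.69, Vomit_n = -3.68, Abdo_n = -0.18, Fever_n = -0.97)

# Two-sided p-values under a standard normal approximation
p.value <- 2 * pnorm(-abs(v.test))
round(p.value, 4)

# Categories whose coordinate differs significantly from 0 (|v.test| > 2)
names(v.test)[abs(v.test) > 2]</code></pre>
<p>Among these four categories, only <em>Vomit_n</em> has an absolute <em>v.test</em> greater than 2 on dimension 2.</p>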
<br/>
<div class="warning">
<ul>
<li>To export the summary to a file, use the code: <em>summary(res.mca, file = "myfile.txt")</em></li>
<li>To display the summary for more than 10 elements, use the argument <em>nbelements</em> in the function <strong>summary()</strong></li>
</ul>
</div>
<p><br/></p>
</div>
<div id="interpretation-of-mca-outputs" class="section level1">
<h1>Interpretation of MCA outputs</h1>
<p>MCA results are interpreted in the same way as the results of a simple correspondence analysis (CA).</p>
<p>I recommend reading the interpretation of simple CA, which is described in detail in my previous post: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a>.</p>
</div>
<div id="eigenvaluesvariances-and-screeplot" class="section level1">
<h1>Eigenvalues/variances and screeplot</h1>
<p>The proportion of variance retained by the different dimensions (axes) can be extracted using the function <strong>get_eigenvalue()</strong> [in <em>factoextra</em>] as follows:</p>
<pre class="r"><code>eigenvalues <- get_eigenvalue(res.mca)
head(round(eigenvalues, 2))</code></pre>
<pre><code>      eigenvalue variance.percent cumulative.variance.percent
Dim.1       0.34            33.52                       33.52
Dim.2       0.13            12.91                       46.44
Dim.3       0.11            10.73                       57.17
Dim.4       0.10             9.59                       66.76
Dim.5       0.08             7.88                       74.64
Dim.6       0.07             7.11                       81.75</code></pre>
<p>The function <strong>fviz_screeplot()</strong> [in <em>factoextra</em> package] can be used to draw the scree plot (the percentages of inertia explained by the MCA dimensions):</p>
<pre class="r"><code>fviz_screeplot(res.mca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-scree-pot-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about eigenvalues and screeplot: <a href="https://www.sthda.com/english/english/wiki/eigenvalues-quick-data-visualization-with-factoextra-r-software-and-data-mining">Eigenvalues data visualization</a></span></p>
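<p>A short worked calculation helps to read this table. In an MCA of J variables with K categories in total, the total inertia is K/J - 1 and there are K - J non-trivial dimensions. The sketch below applies this to the 11 active variables and 22 categories of this example; the object names are illustrative.</p>
<pre class="r"><code>J <- 11   # number of active variables
K <- 22   # total number of categories

total.inertia <- K / J - 1          # sum of all eigenvalues
n.dim <- K - J                      # number of non-trivial dimensions
mean.eig <- total.inertia / n.dim   # average eigenvalue

c(total = total.inertia, dimensions = n.dim, average = round(mean.eig, 3))</code></pre>
<p>The total inertia is 1 here, which is why an eigenvalue of 0.34 corresponds to about 34% of the variance. The average eigenvalue is 1/11 (about 9.1%), so in the table above only dimensions 1 to 4 retain more than an average share.</p>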
</div>
<div id="mca-scatter-plot-biplot-of-individuals-and-variable-categories" class="section level1">
<h1>MCA scatter plot: Biplot of individuals and variable categories</h1>
<p>The function <strong>plot.MCA()</strong> [in <strong>FactoMineR</strong> package] can be used. A simplified format is:</p>
<pre class="r"><code>plot(x, axes = c(1,2), choix=c("ind", "var"))</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>x</strong> : An object of class <strong>MCA</strong></li>
<li><strong>axes</strong> : A numeric vector of length 2 specifying the dimensions to plot</li>
<li><strong>choix</strong> : The graph to be plotted. Possible values are "ind" for the individuals and "var" for the variables</li>
</ul>
</div>
<p><br/></p>
<p><strong>FactoMineR base graph for MCA</strong>:</p>
<pre class="r"><code>plot(res.mca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-factor-map-factominer-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s also possible to use the function <strong>fviz_mca_biplot()</strong> [in <em>factoextra</em> package] to draw a nicer-looking plot:</p>
<pre class="r"><code>fviz_mca_biplot(res.mca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-biplot-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change the theme
fviz_mca_biplot(res.mca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-biplot-factoextra-data-mining-2.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_mca_biplot()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-mca-quick-multiple-correspondence-analysis-data-visualization-r-software-and-data-mining">fviz_mca_biplot</a></span></p>
<p>The graph above shows a global pattern within the data. Rows (individuals) are represented by blue points and columns (variable categories) by red triangles.</p>
<p>The distance between any two row points or any two column points gives a measure of their similarity (or dissimilarity).</p>
<p>Row points with similar profiles are close to each other on the factor map. The same holds true for column points.</p>
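<p>To make the notion of distance on the factor map concrete, the sketch below computes the Euclidean distances between the first 5 individuals on dimensions 1 and 2. Restricting to 5 individuals and 2 dimensions is an arbitrary choice for illustration.</p>
<pre class="r"><code>library("FactoMineR")
data(poison)
res.mca <- MCA(poison[1:55, 5:15], graph = FALSE)

# Coordinates of the first 5 individuals on dimensions 1 and 2
coord <- res.mca$ind$coord[1:5, 1:2]

# Euclidean distances between these individuals on the factor map
round(dist(coord), 2)</code></pre>
<p>In the summary table shown earlier, individuals 3 and 5 have the same coordinates on dimensions 1 and 2, so their distance on this map should be close to 0.</p>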
</div>
<div id="variable-categories" class="section level1">
<h1>Variable categories</h1>
<p>The function <strong>get_mca_var()</strong> [in <em>factoextra</em>] is used to extract the results for variable categories. This function returns a list containing the coordinates, the cos2 and the contribution of variable categories:</p>
<pre class="r"><code>var <- get_mca_var(res.mca)
var</code></pre>
<pre><code>Multiple Correspondence Analysis Results for variables
 ===================================================
  Name       Description                  
1 "$coord"   "Coordinates for categories" 
2 "$cos2"    "Cos2 for categories"        
3 "$contrib" "contributions of categories"</code></pre>
<div id="correlation-between-variables-and-principal-dimensions" class="section level2">
<h2>Correlation between variables and principal dimensions</h2>
<p>Variables can be visualized as follows:</p>
<pre class="r"><code>plot(res.mca, choix = "var")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-unnamed-chunk-11-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="success">
<ul>
<li><p>The plot above helps to identify the variables that are the most correlated with each dimension. The squared correlations between the variables and the dimensions are used as coordinates.</p></li>
<li>It can be seen that the variables <em>Diarrhae, Abdominals and Fever</em> are the most correlated with dimension 1. Similarly, the variables <em>Courgette and Potato</em> are the most correlated with dimension 2.</li>
</ul>
</div>
<p><br/></p>
</div>
<div id="coordinates-of-variable-categories" class="section level2">
<h2>Coordinates of variable categories</h2>
<pre class="r"><code>head(round(var$coord, 2))</code></pre>
<pre><code>         Dim 1 Dim 2 Dim 3 Dim 4 Dim 5
Nausea_n  0.27  0.12 -0.27  0.03  0.07
Nausea_y -0.96 -0.43  0.95 -0.12 -0.26
Vomit_n   0.48 -0.41  0.08  0.27  0.05
Vomit_y  -0.72  0.61 -0.13 -0.41 -0.08
Abdo_n    1.32 -0.04 -0.01 -0.15 -0.07
Abdo_y   -0.64  0.02  0.00  0.07  0.03</code></pre>
<p>Use the function <strong>fviz_mca_var()</strong> [in <em>factoextra</em>] to visualize only variable categories:</p>
<pre class="r"><code># Default plot
fviz_mca_var(res.mca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-ca-row-points-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s possible to change the color and the shape of the variable points using the arguments <em>col.var</em> and <em>shape.var</em> as follows:</p>
<pre class="r"><code>fviz_mca_var(res.mca, col.var="black", shape.var = 15)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-ca-row-points-color-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="notice">
<p>Note that it’s also possible to plot only the variable categories using the <em>FactoMineR</em> base graph. The argument <em>invisible</em> is used to hide the individual points:</p>
<pre class="r"><code># Hide individuals
plot(res.mca, invisible="ind") </code></pre>
</div>
<p><br/></p>
</div>
<div id="contribution-of-variable-categories-to-the-dimensions" class="section level2">
<h2>Contribution of variable categories to the dimensions</h2>
<p>The contribution of the variable categories (in %) to the definition of the dimensions can be extracted as follows:</p>
<pre class="r"><code>head(round(var$contrib,2))</code></pre>
<pre><code>         Dim 1 Dim 2 Dim 3 Dim 4 Dim 5
Nausea_n  1.52  0.81  4.67  0.08  0.49
Nausea_y  5.43  2.91 16.73  0.30  1.76
Vomit_n   3.73  7.07  0.36  4.26  0.19
Vomit_y   5.60 10.61  0.54  6.39  0.29
Abdo_n   15.42  0.03  0.00  0.73  0.18
Abdo_y    7.50  0.01  0.00  0.36  0.09</code></pre>
<p><span class="success">The variable categories with the largest values contribute the most to the definition of the dimensions.</span></p>
<p>The different categories in the table are:</p>
<pre class="r"><code>categories <- rownames(var$coord)
length(categories)</code></pre>
<pre><code>[1] 22</code></pre>
<pre class="r"><code>print(categories)</code></pre>
<pre><code> [1] "Nausea_n"   "Nausea_y"   "Vomit_n"    "Vomit_y"    "Abdo_n"     "Abdo_y"     "Fever_n"   
 [8] "Fever_y"    "Diarrhea_n" "Diarrhea_y" "Potato_n"   "Potato_y"   "Fish_n"     "Fish_y"    
[15] "Mayo_n"     "Mayo_y"     "Courg_n"    "Courg_y"    "Cheese_n"   "Cheese_y"   "Icecream_n"
[22] "Icecream_y"</code></pre>
<p>It’s possible to use the function <strong>corrplot()</strong> [in the <em>corrplot</em> package] to highlight the most contributing variable categories for each dimension:</p>
<pre class="r"><code>library("corrplot")
corrplot(var$contrib, is.corr = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-variable-contribution-r-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>The function <strong>fviz_contrib()</strong>[in <em>factoextra</em>] can be used to draw a bar plot of variable contributions:</p>
<pre class="r"><code># Contributions of variables on Dim.1
fviz_contrib(res.mca, choice = "var", axes = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-variable-contribution-dim-1-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="warning">
<ul>
<li><p>If the contribution of variable categories were uniform, the expected value would be 1/number_of_categories = 1/22 = 4.5%.</p></li>
<li>The red dashed line on the graph above indicates the expected average contribution. For a given dimension, any category with a contribution larger than this threshold could be considered as important in contributing to that dimension.</li>
</ul>
</div>
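<p>As a minimal sketch, the threshold can be checked by hand in base R. The values below are copied from the Dim 1 column of the contribution table printed above (first six categories only). The object names <em>dim1_contrib</em>, <em>threshold</em> and <em>above</em> are illustrative and are not part of <em>factoextra</em>:</p>
<pre class="r"><code># Dim 1 contributions (in %), copied from head(round(var$contrib, 2))
dim1_contrib <- c(Nausea_n = 1.52, Nausea_y = 5.43, Vomit_n = 3.73,
                  Vomit_y = 5.60, Abdo_n = 15.42, Abdo_y = 7.50)

# Expected contribution (in %) if all 22 categories contributed equally
threshold <- 100 / 22

# Categories whose contribution exceeds the uniform expectation
above <- dim1_contrib[dim1_contrib > threshold]
sort(above, decreasing = TRUE)</code></pre>
<p>Among these six categories, <em>Abdo_n</em>, <em>Abdo_y</em>, <em>Vomit_y</em> and <em>Nausea_y</em> lie above the threshold on the first dimension.</p>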
<p><br/></p>
<p><span class="success"> It can be seen that the categories <em>Abdo_n, Diarrhea_n, Fever_n and Mayo_n</em> are the most important in the definition of the first dimension.</span></p>
<pre class="r"><code># Contributions of variable categories on Dim.2
fviz_contrib(res.mca, choice = "var", axes = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-variable-contribution-dim-2-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">The categories <em>Courg_n, Potato_n, Vomit_y and Icecream_n</em> contribute the most to dimension 2.</span></p>
<pre class="r"><code># Total contribution on Dim.1 and Dim.2
fviz_contrib(res.mca, choice = "var", axes = 1:2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-variable-contribution-2-dimension-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="block">
<p>The total contribution of a category to explaining the variation retained by Dim.1 and Dim.2 is calculated as follows: (C1 * Eig1) + (C2 * Eig2).</p>
<p>C1 and C2 are the contributions of the category to dimensions 1 and 2, respectively. Eig1 and Eig2 are the eigenvalues of dimensions 1 and 2, respectively.</p>
<p>The expected average contribution of a category for Dim.1 and Dim.2 is: (4.5 * Eig1) + (4.5 * Eig2) = (4.5 * 0.34) + (4.5 * 0.13) = 2.12%</p>
</div>
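<p>The calculation described in the block above can be sketched in base R. This is only an illustration of the stated formula: the eigenvalues (0.34 and 0.13) are the rounded values quoted in the text, the contributions are copied from the table printed earlier, and the object names are illustrative:</p>
<pre class="r"><code># Eigenvalues of Dim.1 and Dim.2, as quoted in the text (rounded)
eig <- c(Dim1 = 0.34, Dim2 = 0.13)

# Contributions (in %) of two categories, copied from the table above
contrib <- rbind(Abdo_n  = c(15.42, 0.03),
                 Vomit_y = c(5.60, 10.61))
colnames(contrib) <- names(eig)

# Total contribution: (C1 * Eig1) + (C2 * Eig2)
total <- drop(contrib %*% eig)

# Reference value, using the rounded 4.5% from the text
expected <- sum(4.5 * eig)

# TRUE for categories above the reference value
total > expected</code></pre>
<p>Both categories exceed the reference value here: <em>Abdo_n</em> almost entirely through Dim.1, and <em>Vomit_y</em> through a mix of both dimensions.</p>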
<p><br/></p>
<p>If your data contains many categories, the top contributing categories can be displayed as follows:</p>
<pre class="r"><code>fviz_contrib(res.mca, choice = "var", axes = 1, top = 10)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-top-5-contributing-variables-r-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_contrib()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-contrib-quick-visualization-of-row-column-contributions-r-software-and-data-mining">fviz_contrib</a></span></p>
<p>A second option is to draw a scatter plot of categories and to highlight categories according to the amount of their contributions. The function <strong>fviz_mca_var()</strong> is used.</p>
<p><span class="warning">Note that, using <strong>factoextra</strong> package, the color or the transparency of the variable categories can be automatically controlled by the value of their contributions, their cos2, their coordinates on x or y axis.</span></p>
<pre class="r"><code># Control category point colors using their contribution
# Possible values for the argument col.var are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_mca_var(res.mca, col.var = "contrib")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-variables-graph-colors-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change the gradient color
fviz_mca_var(res.mca, col.var="contrib")+
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=2)+theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-variables-graph-colors-factoextra-data-mining-2.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<br/>
<div class="success">
<p>The scatter plot is also helpful to highlight the most important categories in the determination of the dimensions.</p>
<p>In addition we can have an idea of what pole of the dimensions the categories are actually contributing to.</p>
<p>It is evident that the categories <em>Abdo_n, Diarrhea_n, Fever_n and Mayo_n</em> have an important contribution to the positive pole of the first dimension, while the categories <em>Fever_y and Diarrhea_y</em> have a major contribution to the negative pole of the first dimension, and so on.</p>
</div>
<p><br/></p>
<p>It’s also possible to control automatically the transparency of variable categories by their contributions. The argument <em>alpha.var</em> is used:</p>
<pre class="r"><code># Control the transparency of categories using their contribution
# Possible values for the argument alpha.var are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_mca_var(res.mca, alpha.var="contrib")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-variables-graph-colors-transparency-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">It’s possible to select and display only the top contributing categories as illustrated in the R code below.</span></p>
<pre class="r"><code># Select the top 10 contributing categories
fviz_mca_var(res.mca, select.var=list(contrib=10))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-select-top-variables-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="notice">Variable category/individual selections are discussed in detail in the next sections.</span></p>
<p><span class="warning">Read more about <em>fviz_mca_var()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-mca-quick-multiple-correspondence-analysis-data-visualization-r-software-and-data-mining">fviz_mca_var</a></span></p>
</div>
<div id="cos2-the-quality-of-representation-of-variable-categories" class="section level2">
<h2>Cos2 : The quality of representation of variable categories</h2>
<p>The two dimensions 1 and 2 are sufficient to retain 46% of the total inertia contained in the data.</p>
<p><span class="warning">However, not all the points are equally well displayed in the two dimensions.</span></p>
<p><span class="success">The <strong>quality of representation</strong> of the categories on the factor map is called the <strong>squared cosine</strong> (cos2) or the <strong>squared correlations</strong>.</span></p>
<p>The cos2 measures the degree of association between variable categories and a particular axis.</p>
<p>The cos2 of variable categories can be extracted as follows:</p>
<pre class="r"><code>head(var$cos2)</code></pre>
<pre><code>             Dim 1        Dim 2        Dim 3       Dim 4       Dim 5
Nausea_n 0.2562007 0.0528025759 2.527485e-01 0.004084375 0.019466197
Nausea_y 0.2562007 0.0528025759 2.527485e-01 0.004084375 0.019466197
Vomit_n  0.3442016 0.2511603912 1.070855e-02 0.112294813 0.004126898
Vomit_y  0.3442016 0.2511603912 1.070855e-02 0.112294813 0.004126898
Abdo_n   0.8451157 0.0006215864 1.262496e-05 0.011479077 0.002374929
Abdo_y   0.8451157 0.0006215864 1.262496e-05 0.011479077 0.002374929</code></pre>
<p>The values of the cos2 lie between 0 and 1.</p>
<p><strong>The sum of the cos2</strong> of a variable category over all the MCA dimensions is equal to one.</p>
<p><span class="warning">The quality of representation of a variable category or an individual in n dimensions is simply the sum of the squared cosine of that variable category or individual over the n dimensions.</span></p>
<p>If a variable category is well represented by two dimensions, the sum of its cos2 on these dimensions is close to one.</p>
<p>For some of the categories, more than 2 dimensions are required to perfectly represent the data.</p>
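<p>As a small sketch of this rule, the quality of representation on the first two dimensions can be obtained by summing the corresponding cos2 columns. The values below are copied from the output of <em>head(var$cos2)</em> shown above; the object names <em>cos2</em> and <em>quality</em> are illustrative:</p>
<pre class="r"><code># cos2 on Dim 1 and Dim 2, copied from head(var$cos2) above
cos2 <- rbind(
  Nausea_n = c(0.2562007, 0.0528025759),
  Vomit_n  = c(0.3442016, 0.2511603912),
  Abdo_n   = c(0.8451157, 0.0006215864)
)
colnames(cos2) <- c("Dim 1", "Dim 2")

# Quality of representation on the Dim 1 x Dim 2 factor map
quality <- rowSums(cos2)
round(quality, 2)</code></pre>
<p>With these values, <em>Abdo_n</em> (about 0.85) is well represented on the first two dimensions, whereas <em>Nausea_n</em> (about 0.31) needs further dimensions.</p>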
<p><strong>Visualize the cos2 of variable categories using corrplot</strong>:</p>
<pre class="r"><code>library("corrplot")
corrplot(var$cos2, is.corr=FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-row-cos2-r-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>The function <strong>fviz_cos2()</strong> [in <em>factoextra</em>] can be used to draw a bar plot of the cos2 of variable categories:</p>
<pre class="r"><code># Cos2 of variable categories on Dim.1 and Dim.2
fviz_cos2(res.mca, choice = "var", axes = 1:2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-variable-cos2-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Note that, variable categories <em>Fish_n, Fish_y, Icecream_n and Icecream_y</em> are not very well represented by the first two dimensions. This implies that the position of the corresponding points on the scatter plot should be interpreted with some caution. A higher dimensional solution is probably necessary.</span></p>
<p><span class="warning">Read more about <em>fviz_cos2()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-cos2-quick-visualization-of-the-quality-of-representation-of-rows-columns-r-software-and-data-mining">fviz_cos2</a></span></p>
</div>
</div>
<div id="individuals" class="section level1">
<h1>Individuals</h1>
<p>The function <strong>get_mca_ind()</strong>[in <em>factoextra</em>] is used to extract the results for individuals. This function returns a list containing the coordinates, the cos2 and the contributions of individuals:</p>
<pre class="r"><code>ind <- get_mca_ind(res.mca)
ind</code></pre>
<pre><code>Multiple Correspondence Analysis Results for individuals
 ===================================================
  Name       Description                       
1 "$coord"   "Coordinates for the individuals" 
2 "$cos2"    "Cos2 for the individuals"        
3 "$contrib" "contributions of the individuals"</code></pre>
<p><span class="warning">The result for <em>individuals</em> gives the same information as described for variable categories. For this reason, I’ll just display the results for individuals in this section without commenting.</span></p>
<div id="coordinates-of-individuals" class="section level2">
<h2>Coordinates of individuals</h2>
<pre class="r"><code>head(ind$coord)</code></pre>
<pre><code>       Dim 1       Dim 2       Dim 3       Dim 4       Dim 5
1 -0.4525811 -0.26415072  0.17151614  0.01369348 -0.11696806
2  0.8361700 -0.03193457 -0.07208249 -0.08550351  0.51978710
3 -0.4481892  0.13538726 -0.22484048 -0.14170168 -0.05004753
4  0.8803694 -0.08536230 -0.02052044 -0.07275873 -0.22935022
5 -0.4481892  0.13538726 -0.22484048 -0.14170168 -0.05004753
6 -0.3594324 -0.43604390 -1.20932223  1.72464616  0.04348157</code></pre>
<p>Use the function <strong>fviz_mca_ind()</strong> [in <em>factoextra</em>] to visualize only the individuals:</p>
<pre class="r"><code>fviz_mca_ind(res.mca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-individuals-graph-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_mca_ind()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-mca-quick-multiple-correspondence-analysis-data-visualization-r-software-and-data-mining">fviz_mca_ind</a></span></p>
<br/>
<div class="warning">
<p>Note that it’s also possible to draw the graph of individuals alone using the <strong>FactoMineR</strong> base graph. The argument <em>invisible</em> is used to hide the variable categories on the factor map:</p>
<pre class="r"><code># Hide variable categories
plot(res.mca, invisible="var") </code></pre>
</div>
<p><br/></p>
</div>
<div id="contribution-of-individuals-to-the-dimensions" class="section level2">
<h2>Contribution of individuals to the dimensions</h2>
<pre class="r"><code>head(ind$contrib)</code></pre>
<pre><code>     Dim 1      Dim 2        Dim 3        Dim 4      Dim 5
1 1.110927 0.98238297  0.498254685  0.003555817 0.31554778
2 3.792117 0.01435818  0.088003703  0.138637089 6.23134138
3 1.089470 0.25806722  0.856229950  0.380768961 0.05776914
4 4.203611 0.10259105  0.007132055  0.100387990 1.21319013
5 1.089470 0.25806722  0.856229950  0.380768961 0.05776914
6 0.700692 2.67693398 24.769968729 56.404214518 0.04360547</code></pre>
<p><span class="notice">Note that, you can use the previously mentioned <strong>corrplot()</strong> function to visualize the contribution of individuals.</span></p>
<p>Use the function <strong>fviz_contrib()</strong> [in <em>factoextra</em>] to visualize the contributions of individuals to dimensions 1+2:</p>
<pre class="r"><code>fviz_contrib(res.mca, choice = "ind", axes = 1:2, top = 20)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-individuals-contribution-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="warning">
<ul>
<li><p>If the individual contributions were uniform, the expected value would be 1/nrow(poison) = 1/55 = 1.8%.</p></li>
<li>The expected average contribution (reference line) of an individual for Dim.1 and Dim.2 is: (1.8 * Eig1) + (1.8 * Eig2) = (1.8 * 0.34) + (1.8 * 0.13) = 0.85%.</li>
</ul>
</div>
<p><br/></p>
<p><strong>Draw a scatter plot of individual points</strong> and highlight individuals according to the amount of their contributions. The function <strong>fviz_mca_ind()</strong> [in <em>factoextra</em>] is used:</p>
<pre class="r"><code># Control individual colors using their contribution
# Possible values for the argument col.ind are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_mca_ind(res.mca, col.ind="contrib")+
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=0.85)+theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-individuals-graph-colors-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<br/>
<div class="warning">
<p>Note that, it’s also possible to control automatically the transparency of individuals by their contributions using the argument <em>alpha.ind</em>:</p>
<pre class="r"><code># Control the transparency of individuals using their contribution
# Possible values for the argument alpha.ind are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_mca_ind(res.mca, alpha.ind="contrib")</code></pre>
</div>
<p><br/></p>
</div>
<div id="cos2-the-quality-of-representation-of-individuals" class="section level2">
<h2>Cos2 : The quality of representation of individuals</h2>
<pre class="r"><code>head(ind$cos2)</code></pre>
<pre><code>       Dim 1        Dim 2        Dim 3        Dim 4        Dim 5
1 0.34652591 0.1180447167 0.0497683175 0.0003172275 0.0231460846
2 0.55589562 0.0008108236 0.0041310808 0.0058126211 0.2148103098
3 0.54813888 0.0500176790 0.1379484860 0.0547920948 0.0068349171
4 0.74773962 0.0070299584 0.0004062504 0.0051072923 0.0507479873
5 0.54813888 0.0500176790 0.1379484860 0.0547920948 0.0068349171
6 0.02485357 0.0365775483 0.2813443706 0.5722083217 0.0003637178</code></pre>
<p><span class="warning">Note that the value of the cos2 is between 0 and 1. A cos2 close to 1 corresponds to a variable category or an individual that is well represented on the factor map.</span></p>
<p>The function <strong>fviz_cos2()</strong>[in <em>factoextra</em>] can be used to draw a bar plot of individuals cos2:</p>
<pre class="r"><code># Cos2 of individuals on Dim.1 and Dim.2
fviz_cos2(res.mca, choice = "ind", axes = 1:2, top = 20)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-mca-individuals-cos2-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="change-the-color-of-individuals-by-groups" class="section level2">
<h2>Change the color of individuals by groups</h2>
<p><span class="warning"> As mentioned above, our data contains <strong>supplementary qualitative variables</strong>: Columns 3 and 4 corresponding to the columns <em>Sick</em> and <em>Sex</em>, respectively. These factor variables will be used to color individuals by groups. </span></p>
<pre class="r"><code>sick <- as.factor(poison$Sick)
head(sick)</code></pre>
<pre><code>[1] Sick_y Sick_n Sick_y Sick_n Sick_y Sick_y
Levels: Sick_n Sick_y</code></pre>
<pre class="r"><code>sex <- as.factor(poison$Sex)
head(sex)</code></pre>
<pre><code>[1] F F F F M M
Levels: F M</code></pre>
<p><strong>Individuals factor map</strong> :</p>
<pre class="r"><code># Default plot
fviz_mca_ind(res.mca, label ="none")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-individuals-factor-map-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><strong>Change individual colors by groups</strong> using the levels of the variable <em>sick</em>. The argument <strong>habillage</strong> is used:</p>
<pre class="r"><code>fviz_mca_ind(res.mca, label = "none", habillage=sick)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-individuals-factor-map-color-by-groups-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><strong>Add ellipses of point concentrations</strong> : the argument <em>habillage</em> is used to specify the factor variable for coloring the observations by groups.</p>
<pre class="r"><code>fviz_mca_ind(res.mca, label="none", habillage = sick,
             addEllipses = TRUE, ellipse.level = 0.95)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-individuals-factor-map-concentration-ellipse-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>Now, let’s :</p>
<ul>
<li>make a biplot of individuals and variable categories</li>
<li>change the color of individuals by groups (sick levels)</li>
<li>show only the labels for variables</li>
</ul>
<pre class="r"><code>fviz_mca_biplot(res.mca, 
  habillage = sick, addEllipses = TRUE,
  label = "var", shape.var = 15) +
  scale_color_brewer(palette="Dark2")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-biplot-change-color-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">Note that it’s possible to color the individuals using any of the qualitative variables in the initial data table (poison).</span></p>
<p>Let’s color the individuals by groups using the levels of the variable <em>Vomiting</em>:</p>
<pre class="r"><code>fviz_mca_ind(res.mca, 
  habillage = poison$Vomiting, addEllipses = TRUE) +
  scale_color_brewer(palette="Dark2")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-biplot-change-color-group-2-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>It’s also possible to use the index of the column as follows (<em>habillage = 2</em>):</p>
<pre class="r"><code>fviz_mca_ind(res.mca, 
  habillage = 2, addEllipses = TRUE) +
  scale_color_brewer(palette="Dark2")+
  theme_minimal()</code></pre>
<p>You can also use the function plotellipses() [in <em>FactoMineR</em>] to draw confidence ellipses around the categories. The simplified format is:</p>
<pre class="r"><code>plotellipses(model, keepvar="all", axis =c(1,2))</code></pre>
<ul>
<li><strong>model</strong>: object of class MCA or PCA</li>
<li><strong>keepvar</strong>: a boolean, a numeric vector of variable indexes, or a character vector of variable names. If <em>keepvar</em> is “all”, “quali” or “quali.sup”, the plotted variables are, respectively, all the categorical variables, only the active ones (those used to compute the dimensions), or only the supplementary categorical variables. If <em>keepvar</em> is a numeric vector of indexes or a character vector of names, only the specified variables are plotted.</li>
</ul>
<pre class="r"><code>plotellipses(res.mca, keepvar=1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-plot-ellipse-datamining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code>plotellipses(res.mca, keepvar=1:4)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-plot-ellipse-datamining-2.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code>plotellipses(res.mca, keepvar="Vomiting")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-plot-ellipse-datamining-3.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code>plotellipses(res.mca, keepvar=c("Vomiting", "Fever"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-plot-ellipse-datamining-4.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code>plotellipses(res.mca, keepvar="all")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-plot-ellipse-factominer-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="mca-using-supplementary-individuals-and-variables" class="section level1">
<h1>MCA using supplementary individuals and variables</h1>
<br/>
<div class="warning">
<p>As described above, the data set <em>poison</em> contains:</p>
<ul>
<li><strong>supplementary continuous variables</strong> (quanti.sup = 1:2, columns 1 and 2 corresponding to the columns <em>Age</em> and <em>Time</em>, respectively)</li>
<li><strong>supplementary qualitative variables</strong> (quali.sup = 3:4, corresponding to the columns <em>Sick</em> and <em>Sex</em>, respectively). These factor variables are used to color individuals by groups</li>
</ul>
<p>The data doesn’t contain <strong>supplementary individuals</strong>. However for demonstration, we’ll use the individuals 53:55 as supplementary individuals. The coordinates of these individuals will be predicted from the parameters of the MCA on the active individuals (1:52)</p>
</div>
<p><br/></p>
<p>Supplementary variables and individuals are not used for the determination of the principal dimensions. Their coordinates are predicted using only the information provided by the performed multiple correspondence analysis on active variables/individuals.</p>
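<p>To make the split explicit, here is a minimal base R sketch of how the row indexes divide into active and supplementary individuals in this example. It only manipulates index vectors and does not run the analysis itself; the object names are illustrative:</p>
<pre class="r"><code>n_ind   <- 55      # number of individuals in poison, as stated above
ind_sup <- 53:55   # rows treated as supplementary individuals

# Active individuals are all the remaining rows
ind_active <- setdiff(seq_len(n_ind), ind_sup)

length(ind_active)   # number of individuals used to build the dimensions
range(ind_active)    # first and last active row index</code></pre>
<p>Only the rows in <em>ind_active</em> are used to build the dimensions; the rows in <em>ind_sup</em> are projected afterwards.</p>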
<p>To specify supplementary individuals and variables, the function <strong>MCA()</strong> can be used as follows:</p>
<pre class="r"><code>MCA(X,  ncp = 5, ind.sup = NULL,
    quanti.sup=NULL, quali.sup=NULL, graph=TRUE, axes = c(1,2))</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>X</strong> : a data frame. Rows are individuals and columns are variables.</li>
<li><strong>ncp</strong> : number of dimensions kept in the final results.</li>
<li><strong>ind.sup</strong> : a numeric vector specifying the indexes of the supplementary individuals</li>
<li><strong>quanti.sup</strong>, <strong>quali.sup</strong> : numeric vectors specifying, respectively, the indexes of the supplementary quantitative and qualitative variables</li>
<li><strong>graph</strong> : a logical value. If TRUE a graph is displayed.</li>
<li><strong>axes</strong> : a vector of length 2 specifying the components to be plotted</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.mca <- MCA(poison, ind.sup=53:55, 
               quanti.sup = 1:2, quali.sup = 3:4,  graph=FALSE)</code></pre>
<p>The summary of the MCA is :</p>
<pre class="r"><code>summary(res.mca, nb.dec = 2, ncp = 2)</code></pre>
<pre><code>

Eigenvalues
                      Dim.1  Dim.2  Dim.3  Dim.4  Dim.5  Dim.6  Dim.7  Dim.8  Dim.9 Dim.10 Dim.11
Variance               0.33   0.13   0.11   0.10   0.09   0.07   0.06   0.06   0.04   0.01   0.01
% of var.             32.88  13.04  10.63   9.67   8.60   6.66   6.40   5.94   3.89   1.33   0.95
Cumulative % of var.  32.88  45.92  56.56  66.23  74.83  81.49  87.89  93.83  97.72  99.05 100.00

Individuals (the 10 first)
             Dim.1   ctr  cos2   Dim.2   ctr  cos2  
1          | -0.44  1.14  0.35 | -0.27  1.10  0.13 |
2          |  0.85  4.23  0.54 | -0.01  0.00  0.00 |
3          | -0.43  1.09  0.50 |  0.13  0.24  0.04 |
4          |  0.91  4.81  0.77 | -0.03  0.01  0.00 |
5          | -0.43  1.09  0.50 |  0.13  0.24  0.04 |
6          | -0.34  0.67  0.02 | -0.45  2.93  0.04 |
7          | -0.43  1.09  0.50 |  0.13  0.24  0.04 |
8          | -0.63  2.32  0.61 | -0.02  0.00  0.00 |
9          | -0.44  1.14  0.35 | -0.27  1.10  0.13 |
10         | -0.12  0.08  0.03 |  0.14  0.27  0.04 |

Supplementary individuals
             Dim.1  cos2   Dim.2  cos2  
53         |  1.08  0.36 |  0.52  0.08 |
54         | -0.12  0.03 |  0.14  0.04 |
55         | -0.43  0.50 |  0.13  0.04 |

Categories (the 10 first)
             Dim.1   ctr  cos2 v.test   Dim.2   ctr  cos2 v.test  
Nausea_n   |  0.29  1.78  0.28   3.77 |  0.13  0.94  0.06   1.72 |
Nausea_y   | -0.97  5.94  0.28  -3.77 | -0.44  3.12  0.06  -1.72 |
Vomit_n    |  0.46  3.56  0.33   4.13 | -0.39  6.57  0.24  -3.53 |
Vomit_y    | -0.73  5.70  0.33  -4.13 |  0.63 10.51  0.24   3.53 |
Abdo_n     |  1.32 15.80  0.85   6.58 |  0.02  0.01  0.00   0.12 |
Abdo_y     | -0.64  7.68  0.85  -6.58 | -0.01  0.01  0.00  -0.12 |
Fever_n    |  1.17 13.89  0.79   6.35 | -0.12  0.36  0.01  -0.65 |
Fever_y    | -0.68  8.00  0.79  -6.35 |  0.07  0.21  0.01   0.65 |
Diarrhea_n |  1.26 15.31  0.85   6.57 |  0.04  0.04  0.00   0.20 |
Diarrhea_y | -0.67  8.10  0.85  -6.57 | -0.02  0.02  0.00  -0.20 |

Categorical variables (eta2)
             Dim.1 Dim.2  
Nausea     |  0.28  0.06 |
Vomiting   |  0.33  0.24 |
Abdominals |  0.85  0.00 |
Fever      |  0.79  0.01 |
Diarrhae   |  0.85  0.00 |
Potato     |  0.03  0.40 |
Fish       |  0.01  0.03 |
Mayo       |  0.33  0.04 |
Courgette  |  0.02  0.48 |
Cheese     |  0.13  0.03 |

Supplementary categories
             Dim.1  cos2 v.test   Dim.2  cos2 v.test  
Sick_n     |  1.42  0.89   6.75 |  0.00  0.00   0.01 |
Sick_y     | -0.63  0.89  -6.75 |  0.00  0.00  -0.01 |
F          | -0.03  0.00  -0.23 |  0.11  0.01   0.83 |
M          |  0.03  0.00   0.23 | -0.12  0.01  -0.83 |

Supplementary categorical variables (eta2)
             Dim.1 Dim.2  
Sick       |  0.89  0.00 |
Sex        |  0.00  0.01 |

Supplementary continuous variables
             Dim.1   Dim.2  
Age        |  0.00 | -0.01 |
Time       | -0.84 | -0.08 |</code></pre>
<p><span class="notice">For the supplementary individuals/variable categories, the coordinates and the quality of representation (cos2) on the factor maps are shown. They don’t contribute to the dimensions.</span></p>
<div id="make-a-biplot-of-individuals-and-variable-categories" class="section level2">
<h2>Make a biplot of individuals and variable categories</h2>
<p><strong>FactoMineR base graph</strong>:</p>
<pre class="r"><code>plot(res.mca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-biplot-supplementary-factominer-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<br/>
<div class="block">
<ul>
<li>Active individuals are in blue</li>
<li>Supplementary individuals are in darkblue</li>
<li>Active variable categories are in red</li>
<li>Supplementary variable categories are in darkgreen</li>
</ul>
</div>
<p><br/></p>
<p><strong>Use factoextra</strong>:</p>
<pre class="r"><code>fviz_mca_biplot(res.mca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-biplot-supplementary-factoextra-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
</div>
<div id="visualize-supplementary-variables" class="section level2">
<h2>Visualize supplementary variables</h2>
<p>The graph below highlights the correlation between variables (active &amp; supplementary) and dimensions:</p>
<pre class="r"><code>plot(res.mca, choix ="var")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-variables-correlation-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<div id="supplementary-qualitative-variable-categories" class="section level3">
<h3>Supplementary qualitative variable categories</h3>
<p>All the results (coordinates, cos2, v.test and eta2) for the supplementary qualitative variable categories can be extracted as follows:</p>
<pre class="r"><code>res.mca$quali.sup</code></pre>
<pre><code>$coord
             Dim 1         Dim 2       Dim 3        Dim 4       Dim 5
Sick_n  1.41809140  0.0020394048  0.13199139 -0.016036841 -0.08354663
Sick_y -0.63026284 -0.0009064021 -0.05866284  0.007127485  0.03713184
F      -0.03108147  0.1123143957  0.05033124 -0.055927173 -0.06832928
M       0.03356798 -0.1212995474 -0.05435774  0.060401347  0.07379562

$cos2
             Dim 1        Dim 2       Dim 3        Dim 4       Dim 5
Sick_n 0.893770319 1.848521e-06 0.007742990 0.0001143023 0.003102240
Sick_y 0.893770319 1.848521e-06 0.007742990 0.0001143023 0.003102240
F      0.001043342 1.362369e-02 0.002735892 0.0033780765 0.005042401
M      0.001043342 1.362369e-02 0.002735892 0.0033780765 0.005042401

$v.test
            Dim 1        Dim 2      Dim 3       Dim 4      Dim 5
Sick_n  6.7514655  0.009709509  0.6284047 -0.07635063 -0.3977615
Sick_y -6.7514655 -0.009709509 -0.6284047  0.07635063  0.3977615
F      -0.2306739  0.833551410  0.3735378 -0.41506855 -0.5071119
M       0.2306739 -0.833551410 -0.3735378  0.41506855  0.5071119

$eta2
           Dim 1        Dim 2       Dim 3        Dim 4       Dim 5
Sick 0.893770319 1.848521e-06 0.007742990 0.0001143023 0.003102240
Sex  0.001043342 1.362369e-02 0.002735892 0.0033780765 0.005042401</code></pre>
<p><strong>Factor map</strong> :</p>
<pre class="r"><code>fviz_mca_var(res.mca) + theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-supplementary-rows-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Hide active variables
fviz_mca_var(res.mca, invisible ="var") +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-supplementary-rows-data-mining-2.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Hide supplementary qualitative variables
fviz_mca_var(res.mca, invisible ="quali.sup") +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-supplementary-rows-data-mining-3.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">Supplementary variable categories are shown in dark green.</span></p>
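<p>The <em>v.test</em> values printed above can help decide which supplementary categories are worth interpreting. The lines below are only a sketch. They assume the common rule of thumb that an absolute v.test above 2 marks a coordinate significantly different from zero, and they assume that <em>res.mca</em> was computed on the <em>poison</em> data with the call shown in the setup lines (that call is not repeated in this section):</p>
<pre class="r"><code>library("FactoMineR")

# Assumed setup: the MCA call is not shown in this section
data(poison)
res.mca <- MCA(poison, ind.sup = 53:55, quanti.sup = 1:2,
               quali.sup = 3:4, graph = FALSE)

# v.test of the supplementary categories on the first two dimensions
vtest <- res.mca$quali.sup$v.test[, 1:2]

# TRUE where a category lies significantly away from the origin
# (threshold of 2 is a rule of thumb, not a fixed constant)
abs(vtest) > 2</code></pre>
<p>With the values printed above, only <em>Sick_n</em> and <em>Sick_y</em> exceed the threshold, and only on dimension 1.</p>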
</div>
<div id="supplementary-quantitative-variables" class="section level3">
<h3>Supplementary quantitative variables</h3>
<p>The coordinates of supplementary quantitative variables are:</p>
<pre class="r"><code>res.mca$quanti.sup</code></pre>
<pre><code>$coord
            Dim 1       Dim 2       Dim 3       Dim 4       Dim 5
Age   0.003934896 -0.00741340 -0.26494536  0.20015501  0.02928483
Time -0.838158507 -0.08330586 -0.08718851 -0.08421599 -0.02316931</code></pre>
<p>Graph using FactoMineR base graph:</p>
<pre class="r"><code>plot(res.mca, choix="quanti.sup")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-supplementary-quantitative-variables-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="visualize-supplementary-individuals" class="section level2">
<h2>Visualize supplementary individuals</h2>
<p>The results for supplementary individuals can be extracted as follows:</p>
<pre class="r"><code>res.mca$ind.sup</code></pre>
<pre><code>$coord
        Dim 1     Dim 2      Dim 3      Dim 4      Dim 5
53  1.0835684 0.5172478  0.5794063  0.5390903  0.4553650
54 -0.1249473 0.1417271 -0.1765234 -0.1526587 -0.2779565
55 -0.4315948 0.1270468 -0.2071580 -0.1186804 -0.1891760

$cos2
        Dim 1      Dim 2      Dim 3      Dim 4      Dim 5
53 0.36304957 0.08272764 0.10380536 0.08986204 0.06411692
54 0.03157652 0.04062716 0.06302535 0.04713607 0.15626590
55 0.50232519 0.04352713 0.11572730 0.03798314 0.09650827</code></pre>
<p><strong>Factor map for individuals</strong>:</p>
<pre class="r"><code>fviz_mca_ind(res.mca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-supplementary-individuals-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Show the label of ind.sup only
fviz_mca_ind(res.mca, label="ind.sup") +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-supplementary-individuals-data-mining-2.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">Supplementary individuals are shown in darkblue.</span></p>
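<p>To judge how well each supplementary individual is represented on the first factor map, the cos2 values of dimensions 1 and 2 can be added up. This is a minimal sketch; the setup lines assume the same <em>poison</em> data and MCA call as the rest of the tutorial, which are not repeated in this section:</p>
<pre class="r"><code>library("FactoMineR")

# Assumed setup: the MCA call is not shown in this section
data(poison)
res.mca <- MCA(poison, ind.sup = 53:55, quanti.sup = 1:2,
               quali.sup = 3:4, graph = FALSE)

# Quality of representation on the plane formed by dimensions 1 and 2
cos2.plane <- rowSums(res.mca$ind.sup$cos2[, 1:2])
round(cos2.plane, 2)</code></pre>
<p>Based on the cos2 table printed above, individual 55 is the best represented on this plane (about 0.55) and individual 54 the worst (about 0.07).</p>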
</div>
</div>
<div id="filter-the-mca-result" class="section level1">
<h1>Filter the MCA result</h1>
<p>If you have many individuals/variable categories, it’s possible to visualize only some of them using the arguments <em>select.ind</em> and <em>select.var</em>.</p>
<br/>
<div class="block">
<p><strong>select.ind, select.var:</strong> a selection of individuals/variable categories to be drawn. Allowed values are <em>NULL</em> or a <em>list</em> containing the arguments name, cos2 or contrib:</p>
<ul>
<li><em>name</em>: is a character vector containing individuals/variable category names to be drawn</li>
<li><em>cos2</em>: if cos2 is in [0, 1], ex: 0.6, then individuals/variable categories with a cos2 > 0.6 are drawn; if cos2 > 1, ex: 5, then the top 5 active individuals/variable categories and the top 5 supplementary individuals/variable categories with the highest cos2 are drawn</li>
<li><em>contrib</em>: if contrib > 1, ex: 5, then the top 5 individuals/variable categories with the highest contributions are drawn</li>
</ul>
</div>
<p><br/></p>
<pre class="r"><code># Visualize variable categories with cos2 >= 0.4
fviz_mca_var(res.mca, select.var = list(cos2 = 0.4))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-filter-r-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Top 10 active variables with the highest cos2
fviz_mca_var(res.mca, select.var= list(cos2 = 10))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-filter-r-data-mining-2.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">The top 10 active variable categories and the top 10 supplementary variable categories are shown.</span></p>
<pre class="r"><code># Select by names
name <- list(name = c("Fever_n", "Abdo_y", "Diarrhea_n", "Fever_y", "Vomit_y", "Vomit_n"))
fviz_mca_var(res.mca, select.var = name)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-filter-2-r-data-mining-1.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Top 5 contributing individuals and variable categories
fviz_mca_biplot(res.mca, select.ind = list(contrib = 5), 
               select.var = list(contrib = 5)) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/multiple-correspondance-analysis-filter-2-r-data-mining-2.png" title="Multiple Correspondence Analysis - R software and data mining" alt="Multiple Correspondence Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">Supplementary individuals/variable categories are not shown because they don’t contribute to the construction of the axes.</span></p>
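<p>The cos2-based selection can also be approximated by hand, which is useful for checking which categories a filter will keep. The sketch below assumes that the selection is based on the cos2 summed over the two plotted dimensions; the exact rule used internally by factoextra may differ in detail. The setup lines are an assumption as well, since the MCA call is not repeated in this section:</p>
<pre class="r"><code>library("FactoMineR")
library("factoextra")

# Assumed setup: the MCA call is not shown in this section
data(poison)
res.mca <- MCA(poison, ind.sup = 53:55, quanti.sup = 1:2,
               quali.sup = 3:4, graph = FALSE)

# cos2 of the active variable categories, summed over dimensions 1 and 2
var <- get_mca_var(res.mca)
cos2.plane <- rowSums(var$cos2[, 1:2])

# Categories with cos2 >= 0.4 on the plane
names(cos2.plane)[cos2.plane >= 0.4]

# The 10 categories with the highest cos2
names(sort(cos2.plane, decreasing = TRUE))[1:10]</code></pre>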
</div>
<div id="dimension-description" class="section level1">
<h1>Dimension description</h1>
<p>The function <strong>dimdesc()</strong> can be used to identify the variables most correlated with a given dimension.</p>
<p>A simplified format is:</p>
<pre class="r"><code>dimdesc(res, axes = 1:2, proba = 0.05)</code></pre>
<br/>
<div>
<ul>
<li><strong>res</strong> : an object of class MCA</li>
<li><strong>axes</strong> : a numeric vector specifying the dimensions to be described</li>
<li><strong>proba</strong> : the significance level</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.desc <- dimdesc(res.mca, axes = c(1,2))
# Description of dimension 1
res.desc$`Dim 1`</code></pre>
<pre><code>$quanti
     correlation     p.value
Time  -0.8381585 9.12658e-15

$quali
                  R2      p.value
Sick       0.8937703 5.368221e-26
Abdominals 0.8493262 3.429439e-22
Diarrhae   0.8467702 5.229788e-22
Fever      0.7916690 1.168654e-18
Vomiting   0.3348718 7.001487e-06
Mayo       0.3257425 9.967995e-06
Nausea     0.2794053 5.623583e-05
Cheese     0.1344785 7.495656e-03

$category
             Estimate      p.value
Sick_n      0.5872910 5.368221e-26
Abdo_n      0.5632879 3.429439e-22
Diarrhea_n  0.5545730 5.229788e-22
Fever_n     0.5297728 1.168654e-18
Vomit_n     0.3410366 7.001487e-06
Mayo_n      0.4325471 9.967995e-06
Nausea_n    0.3597065 5.623583e-05
Cheese_n    0.3290968 7.495656e-03
Cheese_y   -0.3290968 7.495656e-03
Nausea_y   -0.3597065 5.623583e-05
Mayo_y     -0.4325471 9.967995e-06
Vomit_y    -0.3410366 7.001487e-06
Fever_y    -0.5297728 1.168654e-18
Diarrhea_y -0.5545730 5.229788e-22
Abdo_y     -0.5632879 3.429439e-22
Sick_y     -0.5872910 5.368221e-26</code></pre>
<pre class="r"><code># Description of dimension 2
res.desc$`Dim 2`</code></pre>
<pre><code>$quali
                 R2      p.value
Courgette 0.4839477 1.039252e-08
Potato    0.4020987 4.489421e-07
Vomiting  0.2449186 1.917736e-04
Icecream  0.1366683 6.989716e-03

$category
             Estimate      p.value
Courg_n     0.4261065 1.039252e-08
Potato_y    0.4910893 4.489421e-07
Vomit_y     0.1836850 1.917736e-04
Icecream_n  0.2863045 6.989716e-03
Icecream_y -0.2863045 6.989716e-03
Vomit_n    -0.1836850 1.917736e-04
Potato_n   -0.4910893 4.489421e-07
Courg_y    -0.4261065 1.039252e-08</code></pre>
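<p>The figures returned by <em>dimdesc()</em> can be related to quantities computed by hand. The sketch below assumes that, for a quantitative variable, the reported value is the correlation with the coordinates of the active individuals, and that, for a qualitative variable, R2 comes from a one-way analysis of variance of those coordinates. The setup lines are also an assumption, since the MCA call is not repeated in this section:</p>
<pre class="r"><code>library("FactoMineR")

# Assumed setup: the MCA call is not shown in this section
data(poison)
res.mca <- MCA(poison, ind.sup = 53:55, quanti.sup = 1:2,
               quali.sup = 3:4, graph = FALSE)

# Coordinates of the 52 active individuals on dimension 1
dim1 <- res.mca$ind$coord[, 1]
active <- poison[1:52, ]

# Correlation of the supplementary variable Time with dimension 1
cor(active$Time, dim1)

# R2 of a one-way ANOVA of dimension 1 on the variable Sick
summary(lm(dim1 ~ Sick, data = active))$r.squared</code></pre>
<p>Both results can be compared with the <em>Time</em> and <em>Sick</em> rows of the description of dimension 1 above.</p>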
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.2.1), <strong>FactoMineR</strong> (ver. 1.30) and <strong>factoextra</strong> (ver. 1.0.2) </span></p>
</div>
<div id="references-and-further-reading" class="section level1">
<h1>References and further reading</h1>
<ul>
<li>Bendixen M. 1995, Compositional perceptual mapping using chi-squared tree analysis and Correspondence Analysis, «Journal of Marketing Management», 11, 571-581.</li>
<li>Bendixen M. 2003, A Practical Guide to the Use of Correspondence Analysis in Marketing Research, Marketing Bulletin, 2003, 14, Technical Note 2. <a href="http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf" class="uri">http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf</a></li>
<li>Greenacre M. Contribution biplots. <a href="http://www.econ.upf.edu/docs/papers/downloads/1162.pdf" class="uri">http://www.econ.upf.edu/docs/papers/downloads/1162.pdf</a></li>
<li>François Husson, <a href="http://factominer.free.fr/contact/index.html" class="uri">http://factominer.free.fr/contact/index.html</a></li>
</ul>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->



<!-- END HTML -->]]></description>
			<pubDate>Wed, 01 Jul 2015 15:48:33 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[ca package and factoextra : Correspondence Analysis - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/ca-package-and-factoextra-correspondence-analysis-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/ca-package-and-factoextra-correspondence-analysis-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->

            
  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<div id="TOC">
<ul>
<li><a href="#required-packages">Required packages</a></li>
<li><a href="#load-ca-and-factoextra">Load ca and factoextra</a></li>
<li><a href="#data-format">Data format</a></li>
<li><a href="#correspondence-analysis-ca">Correspondence analysis (CA)</a></li>
<li><a href="#summary-of-ca-outputs">Summary of CA outputs</a></li>
<li><a href="#interpretation-of-ca-outputs">Interpretation of CA outputs</a></li>
<li><a href="#eigenvalues-and-scree-plot">Eigenvalues and scree plot</a></li>
<li><a href="#biplot-of-row-and-column-variables">Biplot of row and column variables</a></li>
<li><a href="#references-and-further-reading">References and further reading</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/></p>
<p>As described <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">here</a>, correspondence analysis is used to analyse the contingency table formed by two qualitative variables.</p>
<p><span class="success">This article describes how to perform a <strong>correspondence analysis</strong> using the <strong>ca</strong> package.</span></p>
<div id="required-packages" class="section level1">
<h1>Required packages</h1>
<p>The <strong>ca</strong> (for computing CA) and <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> (for CA visualization) packages are used.</p>
<p>These packages can be installed as follows:</p>
<pre class="r"><code>install.packages("ca")

# install.packages("devtools")
devtools::install_github("kassambara/factoextra")</code></pre>
<p><span class="warning">Note that factoextra version >= 1.0.1 is required for this tutorial. If it’s already installed on your computer, re-install it to get the latest version.</span></p>
</div>
<div id="load-ca-and-factoextra" class="section level1">
<h1>Load ca and factoextra</h1>
<pre class="r"><code>library("ca")
library("factoextra")</code></pre>
</div>
<div id="data-format" class="section level1">
<h1>Data format</h1>
<p>We’ll use the data set <em>housetasks</em> taken from the package <strong>ade4</strong>.</p>
<pre class="r"><code>data(housetasks)
head(housetasks, 13)</code></pre>
<pre><code>           Wife Alternating Husband Jointly
Laundry     156          14       2       4
Main_meal   124          20       5       4
Dinner       77          11       7      13
Breakfeast   82          36      15       7
Tidying      53          11       1      57
Dishes       32          24       4      53
Shopping     33          23       9      55
Official     12          46      23      15
Driving      10          51      75       3
Finances     13          13      21      66
Insurance     8           1      53      77
Repairs       0           3     160       2
Holidays      0           1       6     153</code></pre>
<br/>
<div class="block">
<p>The data is a contingency table containing 13 household tasks and their distribution within the couple:</p>
<ul>
<li>rows are the different tasks</li>
<li>values are the frequencies with which each task is done:
<ul>
<li>by the <em>wife</em> only</li>
<li>alternately (by either partner in turn)</li>
<li>by the husband only</li>
<li>or jointly</li>
</ul></li>
</ul>
</div>
<p><br/></p>
</div>
<div id="correspondence-analysis-ca" class="section level1">
<h1>Correspondence analysis (CA)</h1>
<p>The function <strong>ca()</strong> [in <em>ca</em> package] can be used. A simplified format is:</p>
<pre class="r"><code>ca(obj,  nd = NA)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>obj</strong> : a data frame, matrix or table (contingency table)</li>
<li><strong>nd</strong> : number of dimensions to be included in the output</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.ca <- ca(housetasks, nd = 3)</code></pre>
<p>The output of the function <strong>ca()</strong> is structured as a list including :</p>
<pre class="r"><code>names(res.ca)</code></pre>
<pre><code> [1] "sv"         "nd"         "rownames"   "rowmass"    "rowdist"    "rowinertia" "rowcoord"  
 [8] "rowsup"     "colnames"   "colmass"    "coldist"    "colinertia" "colcoord"   "colsup"    
[15] "call"      </code></pre>
<p>The standard coordinates of row variables can be extracted as follows:</p>
<pre class="r"><code>res.ca$rowcoord</code></pre>
<pre><code>                 Dim1       Dim2       Dim3
Laundry    -1.3461225 -0.7425167 -0.8885935
Main_meal  -1.1883460 -0.7347025 -0.4602894
Dinner     -0.9399625 -0.4618664 -0.5819061
Breakfeast -0.6902730 -0.6787794  0.6183521
Tidying    -0.5344773  0.6511077 -0.2643198
Dishes     -0.2564623  0.6625334  0.7489349
Shopping   -0.1597173  0.6045960  0.5684434
Official    0.3075858 -0.3801811  2.5905284
Driving     1.0067309 -0.9795065  1.5274961
Finances    0.3674852  0.9262210  0.0976236
Insurance   0.8782125  0.7102288 -0.8118104
Repairs     2.0748608 -1.2955835 -1.3244577
Holidays    0.3426748  2.1511592 -0.3635596</code></pre>
<p>The standard coordinates of columns are:</p>
<pre class="r"><code>res.ca$colcoord</code></pre>
<pre><code>                   Dim1       Dim2       Dim3
Wife        -1.13682130 -0.5474873 -0.5608580
Alternating -0.08439706 -0.4371162  2.3807453
Husband      1.57560041 -0.9023133 -0.5298508
Jointly      0.20280133  1.5389023 -0.1302974</code></pre>
<p><span class="warning">Note that the <strong>print()</strong> and <strong>summary()</strong> methods are available for <em>ca</em> objects.</span></p>
<pre class="r"><code># printing method
print(x)

# Summary method
summary(object, scree = TRUE, rows = TRUE, columns = TRUE)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>x, object</strong>: CA object</li>
<li><strong>scree</strong>: If TRUE, the scree plot is included in the output</li>
<li><strong>rows</strong>: If TRUE, the results for rows are included in the output</li>
<li><strong>columns</strong>: If TRUE, the results for columns are included in the output</li>
</ul>
</div>
<p><br/></p>
</div>
<div id="summary-of-ca-outputs" class="section level1">
<h1>Summary of CA outputs</h1>
<pre class="r"><code>summary(res.ca)</code></pre>
<pre><code>
Principal inertias (eigenvalues):

 dim    value      %   cum%   scree plot               
 1      0.542889  48.7  48.7  ************             
 2      0.445003  39.9  88.6  **********               
 3      0.127048  11.4 100.0  ***                      
        -------- -----                                 
 Total: 1.114940 100.0                                 


Rows:
     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr    k=3 cor ctr  
1  | Lndr |  101 1000  120 | -992 740 183 | -495 185  56 | -317  75  80 |
2  | Mn_m |   88 1000   81 | -876 742 124 | -490 232  47 | -164  26  19 |
3  | Dnnr |   62 1000   34 | -693 777  55 | -308 154  13 | -207  70  21 |
4  | Brkf |   80 1000   37 | -509 505  38 | -453 400  37 |  220  95  31 |
5  | Tdyn |   70 1000   22 | -394 440  20 |  434 535  30 |  -94  25   5 |
6  | Dshs |   65 1000   18 | -189 118   4 |  442 646  28 |  267 236  36 |
7  | Shpp |   69 1000   13 | -118  64   2 |  403 748  25 |  203 189  22 |
8  | Offc |   55 1000   48 |  227  53   5 | -254  66   8 |  923 881 369 |
9  | Drvn |   80 1000   91 |  742 432  81 | -653 335  76 |  544 233 186 |
10 | Fnnc |   65 1000   27 |  271 161   9 |  618 837  56 |   35   3   1 |
11 | Insr |   80 1000   52 |  647 576  61 |  474 309  40 | -289 115  53 |
12 | Rprs |   95 1000  281 | 1529 707 407 | -864 226 159 | -472  67 166 |
13 | Hldy |   92 1000  176 |  252  30  11 | 1435 962 425 | -130   8  12 |

Columns:
    name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr    k=3 cor ctr  
1 | Wife |  344 1000  270 | -838 802 445 | -365 152 103 | -200  46 108 |
2 | Altr |  146 1000  106 |  -62   5   1 | -292 105  28 |  849 890 825 |
3 | Hsbn |  218 1000  342 | 1161 772 542 | -602 208 178 | -189  20  61 |
4 | Jntl |  292 1000  282 |  149  21  12 | 1027 977 691 |  -46   2   5 |</code></pre>
<p>The result of the function <strong>summary()</strong> contains 3 tables:</p>
<ul>
<li><strong>Table 1 - Eigenvalues</strong>: table 1 contains the eigenvalues and the percentage of inertia retained by each dimension. Additionally, accumulated percentages and a scree plot are shown.</li>
<li><strong>Table 2</strong> contains the results for row variables (×1000):
<ul>
<li>The principal coordinates for the first 3 dimensions (k = 1, k = 2 and k = 3).</li>
<li>Squared correlations (<strong>cor</strong> or cos2) and contributions (<strong>ctr</strong>) of the points. Note that <strong>cor</strong> and <strong>ctr</strong> are expressed per mille (‰).</li>
<li><strong>mass</strong>: the mass (relative marginal frequency) of each point (×1000).</li>
<li><strong>qlt</strong> is the total quality (×1000) of representation of points by the 3 included dimensions. In our example, it is the sum of the squared correlations over the three included dimensions.</li>
<li><strong>inr</strong>: the inertia of the point (per mille of the total inertia).</li>
</ul></li>
<li><strong>Table 3</strong> contains the results for column variables (the same as the row variables).</li>
</ul>
<p><span class="warning">The function <em>summary.ca()</em> returns a list: list(scree, rows, columns).</span></p>
<p>Use the R code below to get the table containing the results for rows:</p>
<pre class="r"><code>summary(res.ca)$rows</code></pre>
<pre><code>   name mass  qlt  inr  k=1 cor ctr  k=2 cor ctr  k=3 cor ctr
1  Lndr  101 1000  120 -992 740 183 -495 185  56 -317  75  80
2  Mn_m   88 1000   81 -876 742 124 -490 232  47 -164  26  19
3  Dnnr   62 1000   34 -693 777  55 -308 154  13 -207  70  21
4  Brkf   80 1000   37 -509 505  38 -453 400  37  220  95  31
5  Tdyn   70 1000   22 -394 440  20  434 535  30  -94  25   5
6  Dshs   65 1000   18 -189 118   4  442 646  28  267 236  36
7  Shpp   69 1000   13 -118  64   2  403 748  25  203 189  22
8  Offc   55 1000   48  227  53   5 -254  66   8  923 881 369
9  Drvn   80 1000   91  742 432  81 -653 335  76  544 233 186
10 Fnnc   65 1000   27  271 161   9  618 837  56   35   3   1
11 Insr   80 1000   52  647 576  61  474 309  40 -289 115  53
12 Rprs   95 1000  281 1529 707 407 -864 226 159 -472  67 166
13 Hldy   92 1000  176  252  30  11 1435 962 425 -130   8  12</code></pre>
<p>The summary for column variables is:</p>
<pre class="r"><code>summary(res.ca)$columns</code></pre>
<pre><code>  name mass  qlt  inr  k=1 cor ctr  k=2 cor ctr  k=3 cor ctr
1 Wife  344 1000  270 -838 802 445 -365 152 103 -200  46 108
2 Altr  146 1000  106  -62   5   1 -292 105  28  849 890 825
3 Hsbn  218 1000  342 1161 772 542 -602 208 178 -189  20  61
4 Jntl  292 1000  282  149  21  12 1027 977 691  -46   2   5</code></pre>
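<p>The columns of the summary tables can be related to the raw components of the <em>ca</em> object. As a sketch: a principal coordinate is the standard coordinate multiplied by the singular value of its dimension, and a contribution is the mass times the squared principal coordinate, divided by the eigenvalue. The names <em>row.principal</em> and <em>row.ctr</em> are only illustrative, and loading <em>housetasks</em> through factoextra is an assumption:</p>
<pre class="r"><code>library("ca")
library("factoextra")  # assumed here to provide the housetasks data

data(housetasks)
res.ca <- ca(housetasks, nd = 3)

# Singular values of the retained dimensions
sv <- res.ca$sv[seq_len(ncol(res.ca$rowcoord))]

# Principal coordinates = standard coordinates x singular values
row.principal <- sweep(res.ca$rowcoord, 2, sv, "*")
round(1000 * row.principal)   # compare with columns k=1, k=2, k=3

# Contributions = mass x (principal coordinate)^2 / eigenvalue
row.ctr <- sweep(res.ca$rowmass * row.principal^2, 2, sv^2, "/")
round(1000 * row.ctr)         # compare with the ctr columns</code></pre>
<p>For example, the standard coordinate of Laundry on dimension 1 is -1.346 and the first eigenvalue is 0.5429, which gives -1.346 x sqrt(0.5429), about -0.992, the value shown as -992 in the summary.</p>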
</div>
<div id="interpretation-of-ca-outputs" class="section level1">
<h1>Interpretation of CA outputs</h1>
<p>The interpretation of correspondence analysis has been described in my previous post: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a>.</p>
</div>
<div id="eigenvalues-and-scree-plot" class="section level1">
<h1>Eigenvalues and scree plot</h1>
<p>The proportion of inertia explained by the principal dimensions can be extracted using the function <strong>get_eigenvalue()</strong> [in <em>factoextra</em>] as follows:</p>
<pre class="r"><code>eigenvalues <- get_eigenvalue(res.ca)
eigenvalues</code></pre>
<pre><code>      eigenvalue variance.percent cumulative.variance.percent
Dim.1  0.5428893         48.69222                    48.69222
Dim.2  0.4450028         39.91269                    88.60491
Dim.3  0.1270484         11.39509                   100.00000</code></pre>
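<p>The same table can be rebuilt from the singular values stored in the <em>ca</em> object, since an eigenvalue is the square of a singular value. This is a minimal sketch; the object name <em>eig</em> is illustrative, and loading <em>housetasks</em> through factoextra is an assumption:</p>
<pre class="r"><code>library("ca")
library("factoextra")  # assumed here to provide the housetasks data

data(housetasks)
res.ca <- ca(housetasks, nd = 3)

# Eigenvalues (principal inertias) are the squared singular values
eig <- res.ca$sv^2

# Share of the total inertia carried by each dimension
pct <- 100 * eig / sum(eig)
data.frame(eigenvalue = eig,
           variance.percent = pct,
           cumulative.variance.percent = cumsum(pct))</code></pre>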
<p>The function <strong>fviz_screeplot()</strong> [in <em>factoextra</em> package] can be used to draw the scree plot (the percentages of inertia explained by the CA dimensions):</p>
<pre class="r"><code>fviz_screeplot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ca-correspondance-analysis-scree-pot-factoextra-data-mining-1.png" title="Correspondence analysis - R software and data mining" alt="Correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about eigenvalues and screeplot: <a href="https://www.sthda.com/english/english/wiki/eigenvalues-quick-data-visualization-with-factoextra-r-software-and-data-mining">Eigenvalues data visualization</a></span></p>
</div>
<div id="biplot-of-row-and-column-variables" class="section level1">
<h1>Biplot of row and column variables</h1>
<p>The base <strong>plot()</strong> function [in <strong>ca</strong> package] can be used:</p>
<pre class="r"><code>plot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ca-correspondance-analysis-plot-data-mining-1.png" title="Correspondence analysis - R software and data mining" alt="Correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s also possible to use the function <strong>fviz_ca_biplot()</strong> [in <em>factoextra</em>]:</p>
<pre class="r"><code>fviz_ca_biplot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ca-correspondance-analysis-plot-factoextra-data-mining-1.png" title="Correspondence analysis - R software and data mining" alt="Correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_ca_biplot()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_biplot</a></span></p>
</div>
<div id="references-and-further-reading" class="section level1">
<h1>References and further reading</h1>
<ul>
<li><a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a></li>
<li><a href="https://www.sthda.com/english/english/wiki/ade4-and-factoextra-correspondence-analysis-r-software-and-data-mining">Correspondence Analysis using ade4 and factoextra</a></li>
<li>Oleg Nenadić and Michael Greenacre. Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca Package. Journal of Statistical Software, May 2007. <a href="http://www.jstatsoft.org/v20/i03/paper">http://www.jstatsoft.org/v20/i03/paper</a></li>
</ul>
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2), <strong>ca</strong> (ver. 0.58) and <strong>factoextra</strong> (ver. 1.0.2) </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->



<!-- END HTML -->]]></description>
			<pubDate>Sun, 28 Jun 2015 12:01:34 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[MASS package and factoextra : Correspondence Analysis - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/mass-package-and-factoextra-correspondence-analysis-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/mass-package-and-factoextra-correspondence-analysis-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<div id="TOC">
<ul>
<li><a href="#required-packages">Required packages</a></li>
<li><a href="#load-mass-and-factoextra">Load MASS and factoextra</a></li>
<li><a href="#data-format">Data format</a></li>
<li><a href="#correspondence-analysis-ca">Correspondence analysis (CA)</a></li>
<li><a href="#interpretation-of-ca-outputs">Interpretation of CA outputs</a></li>
<li><a href="#eigenvalues-and-scree-plot">Eigenvalues and scree plot</a></li>
<li><a href="#biplot-of-row-and-column-variables">Biplot of row and column variables</a></li>
<li><a href="#row-variables">Row variables</a></li>
<li><a href="#column-varables">Column variables</a></li>
<li><a href="#references-and-further-reading">References and further reading</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/></p>
<p>As illustrated in my <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">previous article</a>, <strong>correspondence analysis</strong> (<strong>CA</strong>) is used to analyse the contingency table formed by two <strong>categorical variables</strong>.</p>
<p><span class="success">This article describes how to perform <strong>correspondence analysis</strong> using the <strong>MASS</strong> package.</span></p>
<div id="required-packages" class="section level1">
<h1>Required packages</h1>
<p>The <strong>MASS</strong> (for computing CA) and <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> (for CA visualization) packages are used.</p>
<p>These packages can be installed as follows:</p>
<pre class="r"><code>install.packages("MASS")

# install.packages("devtools")
devtools::install_github("kassambara/factoextra")</code></pre>
<p><span class="warning">Note that factoextra version >= 1.0.1 is required for this tutorial. If it’s already installed on your computer, re-install it to get the latest version.</span></p>
</div>
<div id="load-mass-and-factoextra" class="section level1">
<h1>Load MASS and factoextra</h1>
<pre class="r"><code>library("MASS")
library("factoextra")</code></pre>
</div>
<div id="data-format" class="section level1">
<h1>Data format</h1>
<p>We’ll use the data set <em>housetasks</em> [in <em>factoextra</em>].</p>
<pre class="r"><code>data(housetasks)
head(housetasks)</code></pre>
<pre><code>           Wife Alternating Husband Jointly
Laundry     156          14       2       4
Main_meal   124          20       5       4
Dinner       77          11       7      13
Breakfeast   82          36      15       7
Tidying      53          11       1      57
Dishes       32          24       4      53</code></pre>
<br/>
<div class="block">
<p>The data is a contingency table containing 13 household tasks and their distribution within the couple:</p>
<ul>
<li>rows are the different tasks</li>
<li>values are the frequencies of the tasks done :
<ul>
<li>by the <em>wife</em> only</li>
<li>alternately</li>
<li>by the husband only</li>
<li>or jointly</li>
</ul></li>
</ul>
</div>
<p><br/></p>
</div>
<div id="correspondence-analysis-ca" class="section level1">
<h1>Correspondence analysis (CA)</h1>
<p>The function <strong>corresp()</strong> [in the <em>MASS</em> package] can be used. A simplified format is:</p>
<pre class="r"><code>corresp(x,  nf = 1)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>x</strong> : a data frame, matrix or table (contingency table)</li>
<li><strong>nf</strong> : number of dimensions to be included in the output</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.ca <- corresp(housetasks, nf = 3)</code></pre>
<p>The output of the function <strong>corresp()</strong> is an object of class <em>correspondence</em>, structured as a list including:</p>
<pre class="r"><code>names(res.ca)</code></pre>
<pre><code>[1] "cor"    "rscore" "cscore" "Freq"  </code></pre>
<ul>
<li><em>cor</em>: the canonical correlations, i.e. the square roots of the eigenvalues</li>
<li><em>rscore</em>, <em>cscore</em>: the row and column scores</li>
<li><em>Freq</em>: the initial contingency table</li>
</ul>
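<p>As a quick check, squaring the canonical correlations should give back the eigenvalues (principal inertias) of the analysis. The sketch below assumes the <em>housetasks</em> data and the same <strong>corresp()</strong> call as above; the names <em>eig</em> and <em>pct</em> are introduced only for illustration:</p>
<pre class="r"><code>library("MASS")
library("factoextra")  # provides the housetasks data set
data(housetasks)
res.ca <- corresp(housetasks, nf = 3)

# Squared canonical correlations = eigenvalues of the CA
eig <- res.ca$cor^2
# Share of the total inertia carried by each dimension (in %)
pct <- 100 * eig / sum(eig)
round(cbind(eigenvalue = eig, percent = pct), 3)</code></pre>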
</div>
<div id="interpretation-of-ca-outputs" class="section level1">
<h1>Interpretation of CA outputs</h1>
<p>To interpret the results, read this article: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a>.</p>
</div>
<div id="eigenvalues-and-scree-plot" class="section level1">
<h1>Eigenvalues and scree plot</h1>
<p>The proportion of inertia explained by the principal axes can be obtained using the function <strong>get_eigenvalue()</strong> [in <em>factoextra</em>] as follows:</p>
<pre class="r"><code>eigenvalues <- get_eigenvalue(res.ca)
eigenvalues</code></pre>
<pre><code>      eigenvalue variance.percent cumulative.variance.percent
Dim.1  0.5428893         48.69222                    48.69222
Dim.2  0.4450028         39.91269                    88.60491
Dim.3  0.1270484         11.39509                   100.00000</code></pre>
<p>The function <strong>fviz_screeplot()</strong> [in <em>factoextra</em> package] can be used to draw the scree plot (the percentages of inertia explained by the CA dimensions):</p>
<pre class="r"><code>fviz_screeplot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/mass-correspondance-analysis-scree-pot-factoextra-data-mining-1.png" title="Correspondence analysis - R software and data mining" alt="Correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about eigenvalues and screeplot: <a href="https://www.sthda.com/english/english/wiki/eigenvalues-quick-data-visualization-with-factoextra-r-software-and-data-mining">Eigenvalues data visualization</a></span></p>
</div>
<div id="biplot-of-row-and-column-variables" class="section level1">
<h1>Biplot of row and column variables</h1>
<p>You can use the base R function <strong>biplot(res.ca)</strong> or the function <strong>fviz_ca_biplot()</strong> [in the <em>factoextra</em> package] to draw a nice-looking plot:</p>
<pre class="r"><code>fviz_ca_biplot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/mass-correspondance-analysis-ca-biplot-factoextra-data-mining-1.png" title="Correspondence analysis - R software and data mining" alt="Correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change the theme
fviz_ca_biplot(res.ca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/mass-correspondance-analysis-ca-biplot-factoextra-data-mining-2.png" title="Correspondence analysis - R software and data mining" alt="Correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_ca_biplot()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_biplot</a></span></p>
</div>
<div id="row-variables" class="section level1">
<h1>Row variables</h1>
<p>The function <strong>get_ca_row()</strong> [in <em>factoextra</em>] is used to extract the results for row variables. This function returns a list containing the coordinates, the cos2, the contribution and the inertia of row variables. The function <strong>fviz_ca_row()</strong> [in <em>factoextra</em>] is used to visualize only row points.</p>
<pre class="r"><code>row <- get_ca_row(res.ca)
row</code></pre>
<pre><code>Correspondence Analysis - Results for rows
 ===================================================
  Name       Description                
1 "$coord"   "Coordinates for the rows" 
2 "$cos2"    "Cos2 for the rows"        
3 "$contrib" "contributions of the rows"
4 "$inertia" "Inertia of the rows"      </code></pre>
<pre class="r"><code># Coordinates
head(row$coord)</code></pre>
<pre><code>                Dim.1      Dim.2       Dim.3
Laundry    -0.9918368 -0.4953220 -0.31672897
Main_meal  -0.8755855 -0.4901092 -0.16406487
Dinner     -0.6925740 -0.3081043 -0.20741377
Breakfeast -0.5086002 -0.4528038  0.22040453
Tidying    -0.3938084  0.4343444 -0.09421375
Dishes     -0.1889641  0.4419662  0.26694926</code></pre>
<pre class="r"><code># Visualize row variables only 
fviz_ca_row(res.ca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/mass-correspondance-analysis-ca-row-points-factoextra-data-mining-1.png" title="Correspondence analysis - R software and data mining" alt="Correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
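<p>The other components of the row results can be inspected in the same way as the coordinates. A minimal sketch, assuming the same <strong>corresp()</strong> fit as above (the object name <em>row.res</em> is chosen here only to avoid masking the base R function <em>row()</em>):</p>
<pre class="r"><code>library("MASS")
library("factoextra")  # provides housetasks and get_ca_row()
data(housetasks)
res.ca <- corresp(housetasks, nf = 3)

row.res <- get_ca_row(res.ca)
# Quality of representation of the rows on each dimension
head(row.res$cos2)
# Contributions of the rows to each dimension
head(row.res$contrib)</code></pre>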
</div>
<div id="column-varables" class="section level1">
<h1>Column variables</h1>
<p><span class="warning"> The result for columns gives the same information as described for rows.</span></p>
<pre class="r"><code>col <- get_ca_col(res.ca)
# Coordinates
head(col$coord)</code></pre>
<pre><code>                  Dim.1      Dim.2       Dim.3
Wife        -0.83762154 -0.3652207 -0.19991139
Alternating -0.06218462 -0.2915938  0.84858939
Husband      1.16091847 -0.6019199 -0.18885924
Jointly      0.14942609  1.0265791 -0.04644302</code></pre>
<pre class="r"><code># Visualize column variables only 
fviz_ca_col(res.ca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/mass-correspondance-analysis-unnamed-chunk-5-1.png" title="Correspondence analysis - R software and data mining" alt="Correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="references-and-further-reading" class="section level1">
<h1>References and further reading</h1>
<ul>
<li><a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a></li>
<li><a href="https://www.sthda.com/english/english/wiki/ade4-and-factoextra-correspondence-analysis-r-software-and-data-mining">Correspondence Analysis using ade4 and factoextra</a></li>
<li>Oleg Nenadić and Michael Greenacre. Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca Package. Journal of Statistical Software, May 2007. <a href="http://www.jstatsoft.org/v20/i03/paper">http://www.jstatsoft.org/v20/i03/paper</a></li>
</ul>
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2), <strong>MASS</strong> and <strong>factoextra</strong> (ver. 1.0.2) </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Wed, 24 Jun 2015 07:37:48 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->
           
  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">


<div id="TOC">
<ul>
<li><a href="#how-this-article-is-organized">How this article is organized</a></li>
<li><a href="#required-packages">Required packages</a></li>
<li><a href="#load-factominer-and-factoextra">Load FactoMineR and factoextra</a></li>
<li><a href="#data-format-contingency-tables">Data format: Contingency tables</a></li>
<li><a href="#exploratory-data-analysis-eda">Exploratory data analysis (EDA)</a><ul>
<li><a href="#visual-inspection">Visual inspection</a></li>
<li><a href="#visualize-a-contingency-table-using-graphical-matrix">Visualize a contingency table using graphical matrix</a></li>
<li><a href="#mosaic-association-plots">Mosaic / association plots</a></li>
<li><a href="#chi-square-statistic">Chi-square statistic</a></li>
</ul></li>
<li><a href="#correspondence-analysis-ca">Correspondence analysis (CA)</a></li>
<li><a href="#summary-of-ca-outputs">Summary of CA outputs</a></li>
<li><a href="#interpretation-of-ca-outputs">Interpretation of CA outputs</a><ul>
<li><a href="#significance-of-the-association-between-rows-and-columns">Significance of the association between rows and columns</a></li>
<li><a href="#eigenvalues-and-scree-plot">Eigenvalues and scree plot</a></li>
<li><a href="#ca-scatter-plot-biplot-of-row-and-column-variables">CA scatter plot: Biplot of row and column variables</a></li>
<li><a href="#row-variables">Row variables</a><ul>
<li><a href="#coordinates-of-rows">Coordinates of rows</a></li>
<li><a href="#contribution-of-rows-to-the-dimensions">Contribution of rows to the dimensions</a></li>
<li><a href="#cos2-the-quality-of-representation-of-rows">Cos2 : The quality of representation of rows</a></li>
</ul></li>
<li><a href="#column-varables">Column variables</a><ul>
<li><a href="#coordinates-of-columns">Coordinates of columns</a></li>
<li><a href="#contribution-of-columns-to-the-dimensions">Contribution of columns to the dimensions</a></li>
</ul></li>
<li><a href="#cos2-the-quality-of-representation-of-columns">Cos2 : The quality of representation of columns</a></li>
</ul></li>
<li><a href="#biplot-of-rows-and-columns">Biplot of rows and columns</a><ul>
<li><a href="#symmetric-biplot">Symmetric biplot</a></li>
<li><a href="#asymmetric-biplot-for-correspondence-analysis">Asymmetric biplot for correspondence analysis</a></li>
<li><a href="#contribution-biplot">Contribution biplot</a></li>
<li><a href="#plot-rows-or-columns-only">Plot rows or columns only</a></li>
</ul></li>
<li><a href="#correspondence-analysis-using-supplementary-rows-and-columns">Correspondence analysis using supplementary rows and columns</a><ul>
<li><a href="#data">Data</a></li>
<li><a href="#ca-with-supplementary-rowscolumns">CA with supplementary rows/columns</a></li>
<li><a href="#make-a-biplot-of-rows-and-columns">Make a biplot of rows and columns</a></li>
<li><a href="#visualize-supplementary-rows">Visualize supplementary rows</a></li>
<li><a href="#visualize-supplementary-columns">Visualize supplementary columns</a></li>
</ul></li>
<li><a href="#filter-ca-results">Filter CA results</a></li>
<li><a href="#dimension-description">Dimension description</a></li>
<li><a href="#ca-and-outliers">CA and outliers</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/></p>
<p><strong>Correspondence analysis</strong> (<strong>CA</strong>) is an extension of <a href="https://www.sthda.com/english/english/wiki/factominer-and-factoextra-principal-component-analysis-visualization-r-software-and-data-mining"><strong>Principal Component Analysis</strong> (<strong>PCA</strong>)</a> suited to handle <strong>qualitative variables</strong> (or categorical data).</p>
<p><strong>CA</strong> is used to analyze frequencies formed by categorical data (i.e., a <strong>contingency table</strong>). It provides factor scores (coordinates) for both the rows and the columns of the contingency table. These coordinates are used to visualize the association between row and column variables.</p>
<p><span class="success">This article describes how to compute and interpret a <strong>correspondence analysis</strong> using the <strong>FactoMineR</strong> and <strong>factoextra</strong> R packages.</span></p>
<p>The <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-basics-r-software-and-data-mining">mathematical procedures of CA</a> have been described in my <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-basics-r-software-and-data-mining">previous tutorial</a>. In the current tutorial, we’ll focus on the practical application and interpretation of correspondence analysis rather than the mathematical and statistical details.</p>
<div id="how-this-article-is-organized" class="section level1">
<h1>How this article is organized</h1>
<p>This article has five main parts:</p>
<ul>
<li>Part I describes the <strong>exploratory data analysis tools</strong> for contingency tables</li>
<li>Part II shows how to use FactoMineR package for <strong>computing correspondence analysis</strong> (CA)</li>
<li>Part III is a step-by-step guide for <strong>interpreting</strong> and <strong>visualizing</strong> the output of CA</li>
<li>Part IV explains <strong>symmetric and asymmetric biplots</strong>. This section is very important and we’ll see why.</li>
<li>Part V covers how to apply correspondence analysis using <strong>supplementary rows and columns</strong>. This is important if you want to make predictions with CA.</li>
</ul>
<p>The last sections of this guide also describe how to <strong>filter CA results</strong> in order to keep only the most contributing variables. Finally, we’ll see how to deal with outliers.</p>
</div>
<div id="required-packages" class="section level1">
<h1>Required packages</h1>
<p>Several functions from different <strong>R</strong> packages can be used to perform correspondence analysis:</p>
<ul>
<li><strong>CA()</strong> [in <strong>FactoMineR</strong> package]</li>
<li><strong>ca()</strong> [in <strong>ca</strong> package]</li>
<li><strong>dudi.coa()</strong> [in <strong>ade4</strong> package]</li>
<li><strong>corresp()</strong> [in <strong>MASS</strong> package]</li>
</ul>
<p><span class="success"> In this tutorial, the <strong>FactoMineR</strong> (for computing CA) and <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> (for CA visualization) packages are used.</span></p>
<p><span class="warning">Note that, no matter what function you decide to use for computing CA, the output can be visualized using the R functions available in <em>factoextra</em> package, as described in the next sections.</span></p>
<p>The FactoMineR and factoextra R packages can be installed as follows:</p>
<pre class="r"><code>install.packages("FactoMineR")

# install.packages("devtools")
devtools::install_github("kassambara/factoextra")</code></pre>
<p><span class="notice">Note that factoextra version >= 1.0.2 is required for this tutorial. If it’s already installed on your computer, re-install it to get the latest version.</span></p>
</div>
<div id="load-factominer-and-factoextra" class="section level1">
<h1>Load FactoMineR and factoextra</h1>
<pre class="r"><code>library("FactoMineR")
library("factoextra")</code></pre>
</div>
<div id="data-format-contingency-tables" class="section level1">
<h1>Data format: Contingency tables</h1>
<p>We’ll use the data set <em>housetasks</em> [in <em>factoextra</em>].</p>
<pre class="r"><code>data(housetasks)
# head(housetasks)</code></pre>
<p>An image of the data is shown below:</p>
<p><img src="https://www.sthda.com/english/sthda/RDoc/images/ca-housetasks.png" alt="Data format correspondence analysis" /></p>
<br/>
<div class="block">
<p>The data is a contingency table containing 13 household tasks and their distribution within the couple:</p>
<ul>
<li>rows are the different tasks</li>
<li>values are the frequencies of the tasks done:
<ul>
<li>by the <em>wife</em> only</li>
<li>alternately</li>
<li>by the husband only</li>
<li>or jointly</li>
</ul></li>
</ul>
</div>
<p><br/></p>
</div>
<div id="exploratory-data-analysis-eda" class="section level1">
<h1>Exploratory data analysis (EDA)</h1>
<p>Most of the EDA methods presented here (<strong>graphical matrix</strong>, <strong>mosaic/association plots</strong> and <strong>Chi-square statistic</strong>) have already been described in my <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-basics-r-software-and-data-mining">previous tutorial: correspondence analysis basics</a>.</p>
<p>If you’re already familiar with these approaches, you can skip this section.</p>
<div id="visual-inspection" class="section level2">
<h2>Visual inspection</h2>
<p>The above <em>contingency table</em> is not very large. Therefore, it’s easy to visually inspect and interpret row and column profiles:</p>
<ul>
<li>The housetasks <em>Laundry, Main_meal and Dinner</em> are most frequently done by the “Wife”.</li>
<li>Repairs and driving are mostly done by the husband</li>
<li>Holidays are frequently associated with the column “jointly”</li>
</ul>
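<p>Row profiles make this kind of inspection more systematic. The minimal sketch below uses base R only (the object name <em>row.profiles</em> is illustrative): each row is divided by its total, so that tasks with different overall frequencies can be compared directly.</p>
<pre class="r"><code>library("factoextra")  # provides the housetasks data set
data(housetasks)

# Row profiles: each row divided by its row total (rows sum to 1)
row.profiles <- prop.table(as.matrix(housetasks), margin = 1)
round(row.profiles, 2)</code></pre>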
</div>
<div id="visualize-a-contingency-table-using-graphical-matrix" class="section level2">
<h2>Visualize a contingency table using graphical matrix</h2>
<p>It’s also possible to visualize a contingency table using the function <strong>balloonplot()</strong> [in <em>gplots</em> package]. This function draws a graphical matrix where each cell contains a dot whose size reflects the relative magnitude of the corresponding component.</p>
<p><span class="notice">To execute the R code below, you should install the package <strong>gplots</strong>: <strong>install.packages("gplots")</strong>.</span></p>
<pre class="r"><code>library("gplots")
# 1. convert the data as a table
dt <- as.table(as.matrix(housetasks))
# 2. Graph
balloonplot(t(dt), main ="housetasks", xlab ="", ylab="",
            label = FALSE, show.margins = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-graph-contingency-table-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">Note that, row and column sums are printed by default in the bottom and right margins, respectively. These values can be hidden using the argument <em>show.margins = FALSE</em>.</span></p>
</div>
<div id="mosaic-association-plots" class="section level2">
<h2>Mosaic / association plots</h2>
<p>The function <strong>mosaicplot()</strong> from the built-in R package <strong>graphics</strong> can also be used to visualize a contingency table.</p>
<pre class="r"><code>library("graphics")
mosaicplot(dt, shade = TRUE, las=2,
           main = "housetasks")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-contingency-table-graph-mosaic-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<ul>
<li>The argument <strong>shade</strong> is used to color the graph</li>
<li>The argument <strong>las = 2</strong> produces vertical labels</li>
</ul>
<p><span class="warning">The surface of an element of the mosaic reflects the relative magnitude of its value.</span></p>
<ul>
<li>Blue indicates that the observed value is higher than the value expected if rows and columns were independent</li>
<li>Red indicates that the observed value is lower than the value expected if rows and columns were independent</li>
</ul>
<p><span class="success">From this mosaic plot, it can be seen that the housetasks <em>Laundry, Main_meal, Dinner and Breakfeast</em> (blue color) are mainly done by the wife in our example.</span></p>
<p><span class="warning">It’s also possible to use the package <strong>vcd</strong> to make a mosaic plot (function <strong>mosaic()</strong>) or an association plot (function <strong>assoc()</strong>).</span></p>
<pre class="r"><code># install.packages("vcd")
library("vcd")
# plot just a subset of the table
assoc(head(dt), shade = TRUE, las = 3)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-contingency-table-graph-association-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="chi-square-statistic" class="section level2">
<h2>Chi-square statistic</h2>
<p>Another method to analyse a frequency table is to use the <strong>Chi-square test</strong> of independence. The Chi-square test evaluates whether there is a significant dependence between row and column categories.</p>
<p>The Chi-square statistic can be easily computed using the function <strong>chisq.test()</strong> as follows:</p>
<pre class="r"><code>chisq <- chisq.test(housetasks)
chisq</code></pre>
<pre><code>
    Pearson's Chi-squared test

data:  housetasks
X-squared = 1944.456, df = 36, p-value < 2.2e-16</code></pre>
<p><span class="success">In our example, the row and the column variables are statistically significantly associated (<em>p-value</em> &lt; 2.2e-16).</span></p>
<p>Read more: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-basics-r-software-and-data-mining">correspondence analysis basics</a></p>
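<p>The object returned by <strong>chisq.test()</strong> also stores the expected counts and the Pearson residuals, which help to see which cells drive the association. A short sketch, assuming the <em>housetasks</em> data loaded above:</p>
<pre class="r"><code>library("factoextra")  # provides the housetasks data set
data(housetasks)
chisq <- chisq.test(housetasks)

# Counts expected if rows and columns were independent
round(head(chisq$expected), 1)
# Pearson residuals: (observed - expected) / sqrt(expected)
round(head(chisq$residuals), 2)
# The squared residuals add up to the chi-square statistic
sum(chisq$residuals^2)</code></pre>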
</div>
</div>
<div id="correspondence-analysis-ca" class="section level1">
<h1>Correspondence analysis (CA)</h1>
<p>The EDA methods described in the previous sections are useful only for small contingency tables. For a large contingency table, statistical approaches such as CA are required to reduce the dimension of the data without losing the most important information. In other words, CA is used to graphically visualize row points and column points in a low dimensional space.</p>
<p>The function <strong>CA()</strong> [in the <em>FactoMineR</em> package] can be used. A simplified format is:</p>
<pre class="r"><code>CA(X, ncp = 5, graph = TRUE)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>X</strong> : a data frame (contingency table)</li>
<li><strong>ncp</strong> : number of dimensions kept in the final results.</li>
<li><strong>graph</strong> : a logical value. If TRUE a graph is displayed.</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.ca <- CA(housetasks, graph = FALSE)</code></pre>
<p>The output of the function <strong>CA()</strong> is a list including:</p>
<pre class="r"><code>print(res.ca)</code></pre>
<pre><code>**Results of the Correspondence Analysis (CA)**
The row variable has  13  categories; the column variable has 4 categories
The chi square of independence between the two variables is equal to 1944.456 (p-value =  0 ).
*The results are available in the following objects:

   name              description                   
1  "$eig"            "eigenvalues"                 
2  "$col"            "results for the columns"     
3  "$col$coord"      "coord. for the columns"      
4  "$col$cos2"       "cos2 for the columns"        
5  "$col$contrib"    "contributions of the columns"
6  "$row"            "results for the rows"        
7  "$row$coord"      "coord. for the rows"         
8  "$row$cos2"       "cos2 for the rows"           
9  "$row$contrib"    "contributions of the rows"   
10 "$call"           "summary called parameters"   
11 "$call$marge.col" "weights of the columns"      
12 "$call$marge.row" "weights of the rows"         </code></pre>
<p><span class="success">The object created by the function <strong>CA()</strong> contains a lot of information stored in several lists and matrices. These values are described in the next sections.</span></p>
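<p>The elements listed in the printed output can be accessed directly with the <em>$</em> operator. A minimal sketch, assuming the same <strong>CA()</strong> call as above:</p>
<pre class="r"><code>library("FactoMineR")
library("factoextra")  # provides the housetasks data set
data(housetasks)
res.ca <- CA(housetasks, graph = FALSE)

# Eigenvalues and percentages of variance
res.ca$eig
# Coordinates of the first rows
head(res.ca$row$coord)
# Contributions of the columns to the dimensions
res.ca$col$contrib</code></pre>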
</div>
<div id="summary-of-ca-outputs" class="section level1">
<h1>Summary of CA outputs</h1>
<p>The function <strong>summary.CA()</strong> is used to print a summary of <strong>correspondence analysis</strong> results:</p>
<pre class="r"><code>summary(object, nb.dec = 3, nbelements = 10, 
        ncp = TRUE, file ="", ...)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>object</strong>: an object of class <strong>CA</strong></li>
<li><strong>nb.dec</strong>: number of decimals printed</li>
<li><strong>nbelements</strong>: number of row/column variables to be written. To have all the elements, use <em>nbelements = Inf</em>.</li>
<li><strong>ncp</strong>: Number of dimensions to be printed</li>
<li><strong>file</strong>: an optional file name for exporting the summaries.</li>
</ul>
</div>
<p><br/></p>
<p><strong>Print the summary of the CA analysis for the dimensions 1 and 2:</strong></p>
<pre class="r"><code>summary(res.ca, nb.dec = 2, ncp = 2)</code></pre>
<pre><code>
Call:
rmarkdown::render("factominer-correspondance-analysis.Rmd", encoding = "UTF-8") 

The chi square of independence between the two variables is equal to 1944.456 (p-value =  0 ).

Eigenvalues
                      Dim.1  Dim.2  Dim.3  Dim.4
Variance               0.54   0.45   0.13   0.00
% of var.             48.69  39.91  11.40   0.00
Cumulative % of var.  48.69  88.60 100.00 100.00

Rows (the 10 first)
               Dim.1    ctr   cos2    Dim.2    ctr   cos2  
Laundry     |  -0.99  18.29   0.74 |   0.50   5.56   0.18 |
Main_meal   |  -0.88  12.39   0.74 |   0.49   4.74   0.23 |
Dinner      |  -0.69   5.47   0.78 |   0.31   1.32   0.15 |
Breakfeast  |  -0.51   3.82   0.50 |   0.45   3.70   0.40 |
Tidying     |  -0.39   2.00   0.44 |  -0.43   2.97   0.54 |
Dishes      |  -0.19   0.43   0.12 |  -0.44   2.84   0.65 |
Shopping    |  -0.12   0.18   0.06 |  -0.40   2.52   0.75 |
Official    |   0.23   0.52   0.05 |   0.25   0.80   0.07 |
Driving     |   0.74   8.08   0.43 |   0.65   7.65   0.34 |
Finances    |   0.27   0.88   0.16 |  -0.62   5.56   0.84 |

Columns
               Dim.1    ctr   cos2    Dim.2    ctr   cos2  
Wife        |  -0.84  44.46   0.80 |   0.37  10.31   0.15 |
Alternating |  -0.06   0.10   0.00 |   0.29   2.78   0.11 |
Husband     |   1.16  54.23   0.77 |   0.60  17.79   0.21 |
Jointly     |   0.15   1.20   0.02 |  -1.03  69.12   0.98 |</code></pre>
<p>The result of the function <strong>summary()</strong> contains the <strong>chi-square statistic</strong> and 3 tables:</p>
<ul>
<li><strong>Table 1 - Eigenvalues</strong>: table 1 contains the variances and the percentage of variances retained by each dimension.</li>
<li><strong>Table 2</strong> contains the coordinates, the contribution and the cos2 (quality of representation [in 0-1]) of the first 10 active row variables on the dimensions 1 and 2.</li>
<li><strong>Table 3</strong> contains the coordinates, the contribution and the cos2 (quality of representation [in 0-1]) of the first 10 active column variables on the dimensions 1 and 2.</li>
</ul>
<br/>
<div class="warning">
<p>Note that,</p>
<ul>
<li>to export the summary into a file use <em>summary(res.ca, file = "myfile.txt")</em></li>
<li>to display the summary of more than 10 elements, use the argument <strong>nbelements</strong> in the function <strong>summary()</strong></li>
</ul>
</div>
<p><br/></p>
</div>
<div id="interpretation-of-ca-outputs" class="section level1">
<h1>Interpretation of CA outputs</h1>
<div id="significance-of-the-association-between-rows-and-columns" class="section level2">
<h2>Significance of the association between rows and columns</h2>
<p>To interpret correspondence analysis, the first step is to evaluate whether there is a significant dependency between the rows and columns.</p>
<p>There are two methods to inspect the significance:</p>
<ol style="list-style-type: decimal">
<li>Using the <strong>trace</strong></li>
<li>Using the <strong>Chi-square statistic</strong></li>
</ol>
<p>The <strong>trace</strong> is the total inertia of the table (i.e., the sum of the eigenvalues). The square root of the trace is interpreted as the <strong>correlation coefficient</strong> between rows and columns.</p>
<p>The correlation coefficient is calculated as follows:</p>
<pre class="r"><code>eig <- get_eigenvalue(res.ca)
trace <- sum(eig$eigenvalue) 
cor.coef <- sqrt(trace)
cor.coef</code></pre>
<pre><code>[1] 1.055907</code></pre>
<p><span class="warning">Note that, as a rule of thumb <strong>0.2</strong> is the threshold above which the correlation can be considered as important (Bendixen 1995, 576; Healey 2013, 289-290).</span></p>
<p><span class="success">In our example, the <strong>correlation coefficient</strong> is about 1.056, well above the 0.2 threshold, indicating a strong association between row and column variables.</span></p>
<p>A more rigorous method is to use the <strong>chi-square statistic</strong> for examining the association. This appears at the top of the report generated by the function <strong>summary.CA()</strong>. A high chi-square statistic means a strong link between row and column variables.</p>
<p><span class="success">In our example, the association is highly significant (<em>chi-square: 1944.456, p = 0</em>).</span></p>
<br/>
<div class="block">
<p>Note that the <strong>chi-square statistic = trace * n</strong>, where n is the grand total of the table (total frequency); see the R code below:</p>
<pre class="r"><code># Chi-square statistic
chi2 <- trace*sum(as.matrix(housetasks))
chi2</code></pre>
<pre><code>[1] 1944.456</code></pre>
<pre class="r"><code># Degrees of freedom
df <- (nrow(housetasks) - 1) * (ncol(housetasks) - 1)
# P-value
pval <- pchisq(chi2, df = df, lower.tail = FALSE)
pval</code></pre>
<pre><code>[1] 0</code></pre>
</div>
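<p>The identity <em>chi-square = trace * n</em> can also be cross-checked without FactoMineR. The sketch below is a minimal base-R illustration on a small made-up contingency table (the counts and the object names <em>toy</em>, <em>r_mass</em>, <em>c_mass</em> and <em>toy_trace</em> are assumptions for illustration, not the housetasks data). It computes the total inertia from the standardized residuals and compares trace * n with the statistic returned by <strong>chisq.test()</strong>:</p>
<pre class="r"><code># Hypothetical 3 x 3 contingency table (illustrative counts only)
toy <- matrix(c(20,  5, 10,
                 8, 25,  7,
                 6,  9, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))

n      <- sum(toy)               # grand total of the table
P      <- toy / n                # correspondence matrix (proportions)
r_mass <- rowSums(P)             # row masses
c_mass <- colSums(P)             # column masses
E      <- outer(r_mass, c_mass)  # expected proportions under independence
S      <- (P - E) / sqrt(E)      # standardized residuals

eig       <- svd(S)$d^2          # eigenvalues (squared singular values)
toy_trace <- sum(eig)            # total inertia
toy_chi2  <- toy_trace * n       # chi-square statistic = trace * n

# Pearson chi-square statistic computed by base R, for comparison
ref <- chisq.test(toy, correct = FALSE)
all.equal(toy_chi2, unname(ref$statistic))</code></pre>
<p>If the two values agree, the trace reported by the correspondence analysis and the usual Pearson chi-square test carry the same information, up to the factor n.</p>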
<p><br/></p>
</div>
<div id="eigenvalues-and-scree-plot" class="section level2">
<h2>Eigenvalues and scree plot</h2>
<p><span class="question">How many dimensions are sufficient for the data interpretation?</span></p>
<p>The number of dimensions to retain in the solution can be determined by examining the table of eigenvalues.</p>
<p>As mentioned above, the <strong>trace</strong> is the sum of the eigenvalues. For a given axis, the ratio of its eigenvalue to the trace is the percentage of variance (also called the percentage of total inertia or of the chi-square value) explained by that axis.</p>
<p>The proportion of variance retained by the different dimensions (axes) can be extracted using the function <strong>get_eigenvalue()</strong> [in <em>factoextra</em>] as follows:</p>
<pre class="r"><code>eigenvalues <- get_eigenvalue(res.ca)
head(round(eigenvalues, 2))</code></pre>
<pre><code>      eigenvalue variance.percent cumulative.variance.percent
Dim.1       0.54            48.69                       48.69
Dim.2       0.45            39.91                       88.60
Dim.3       0.13            11.40                      100.00
Dim.4       0.00             0.00                      100.00</code></pre>
<div class="success">
<strong>Eigenvalues</strong> correspond to the amount of information retained by each axis. Dimensions are ordered decreasingly and listed according to the amount of variance explained in the solution. Dimension 1 explains the most variance in the solution, followed by dimension 2 and so on.
</div>
<p>There is no “rule of thumb” to choose the number of dimensions to keep for the data interpretation. It depends on the research question and the researcher’s need. For example, if you are satisfied with 80% of the total inertia explained, then use the number of dimensions necessary to achieve that.</p>
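<p>As a small illustration of that idea, the sketch below finds the smallest number of dimensions whose cumulative percentage of inertia reaches a chosen target. The percentages are copied from the eigenvalue table above; the 80% target and the object names <em>pct</em>, <em>target</em> and <em>n_dim</em> are only assumptions for the example:</p>
<pre class="r"><code># Percentages of inertia per dimension, copied from the table above
pct <- c(Dim.1 = 48.69, Dim.2 = 39.91, Dim.3 = 11.40, Dim.4 = 0.00)

target <- 80                               # illustrative target, in percent
n_dim  <- which(cumsum(pct) >= target)[1]  # first dimension reaching the target
n_dim</code></pre>
<p>With the <em>eigenvalues</em> object extracted above, the same logic can be applied directly to its column <em>cumulative.variance.percent</em>.</p>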
<p>Another method is to visually inspect the <strong>scree plot</strong>, in which dimensions are ordered decreasingly according to the amount of explained inertia.</p>
<p>The function <strong>fviz_screeplot()</strong> [in <em>factoextra</em> package] can be used to draw the scree plot (the percentages of inertia explained by the CA dimensions):</p>
<pre class="r"><code>fviz_screeplot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-scree-pot-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">The point at which the <em>scree plot</em> shows a bend (the so-called “elbow”) can be considered as indicating an optimal dimensionality.</span></p>
<p>It’s also possible to calculate an average eigenvalue above which the axis should be kept in the solution.</p>
<br/>
<div class="block">
<p>Our data contains 13 rows and 4 columns.</p>
<p>If the data were random, the expected value of the eigenvalue for each axis would be 1/(nrow(housetasks)-1) = 1/12 = 8.33% in terms of rows.</p>
Likewise, the average axis should account for 1/(ncol(housetasks)-1) = 1/3 = 33.33% in terms of the 4 columns.
</div>
<p><br/></p>
<p><span class="success">Any axis with a contribution larger than the maximum of these two percentages should be considered as important and included in the solution for the interpretation of the data (see, Bendixen 1995, 577).</span></p>
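<p>The cut-off described above can be computed in a few lines. This is a minimal sketch that uses only the table dimensions (13 rows, 4 columns) and the percentages of inertia quoted earlier; the object names are illustrative assumptions:</p>
<pre class="r"><code># Table dimensions, as stated above for housetasks
n_row <- 13
n_col <- 4

# Percentages of inertia per dimension, copied from the eigenvalue table
pct <- c(Dim.1 = 48.69, Dim.2 = 39.91, Dim.3 = 11.40, Dim.4 = 0.00)

thr_row <- 100 / (n_row - 1)      # average axis in terms of rows (about 8.33%)
thr_col <- 100 / (n_col - 1)      # average axis in terms of columns (about 33.33%)
cutoff  <- max(thr_row, thr_col)  # keep axes above the larger of the two

keep <- names(pct)[pct > cutoff]  # dimensions passing the cut-off
keep</code></pre>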
<p>The R code below draws the scree plot with a red dashed line specifying the average eigenvalue:</p>
<pre class="r"><code>fviz_screeplot(res.ca) +
 geom_hline(yintercept=33.33, linetype=2, color="red")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-scree-pot-cut-off-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>According to the graph above, only dimensions 1 and 2 should be used in the solution. Dimension 3 explains only 11.4% of the total inertia, which is below the average eigenvalue (33.33%) and too little to be kept for further analysis.</p>
<p><span class="notice">Note that you can use more than 2 dimensions. However, the supplementary dimensions are unlikely to contribute significantly to the interpretation of the nature of the association between the rows and columns.</span></p>
<p>Dimensions 1 and 2 explain approximately 48.7% and 39.9% of the total inertia respectively. This corresponds to a cumulative total of 88.6% of total inertia retained by the 2 dimensions.</p>
<p><span class="success">The higher the retention, the more subtlety in the original data is retained in the low-dimensional solution (Mike Bendixen, 2003).</span></p>
<p><span class="warning">Read more about eigenvalues and screeplot: <a href="https://www.sthda.com/english/english/wiki/eigenvalues-quick-data-visualization-with-factoextra-r-software-and-data-mining">Eigenvalues data visualization</a></span></p>
</div>
<div id="ca-scatter-plot-biplot-of-row-and-column-variables" class="section level2">
<h2>CA scatter plot: Biplot of row and column variables</h2>
<p>The function <strong>plot.CA()</strong>[in <em>FactoMineR</em>] can be used to plot the coordinates of rows and columns presented in the correspondence analysis output.</p>
<p>A simplified format is:</p>
<pre class="r"><code>plot.CA(x, axes = c(1,2), col.row = "blue", col.col = "red")</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>x</strong> : An object of class <strong>CA</strong></li>
<li><strong>axes</strong> : A numeric vector of length 2 specifying the dimensions to plot</li>
<li><strong>col.row</strong>, <strong>col.col</strong> : colors for rows and columns respectively</li>
</ul>
</div>
<p><br/></p>
<p><strong>FactoMineR base graph for CA</strong>:</p>
<pre class="r"><code>plot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-factor-map-factominer-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s also possible to use the function <strong>fviz_ca_biplot()</strong>[in <em>factoextra</em> package] to draw a nice looking plot:</p>
<pre class="r"><code>fviz_ca_biplot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-biplot-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change the theme
fviz_ca_biplot(res.ca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-biplot-factoextra-data-mining-2.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_ca_biplot()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_biplot</a></span></p>
<p>The graph above is called a <strong>symmetric plot</strong> and shows a global pattern within the data. Rows are represented by blue points and columns by red triangles.</p>
<p>The distance between any row points or column points gives a measure of their similarity (or dissimilarity).</p>
<p>Row points with similar profiles are close to each other on the factor map. The same holds true for column points.</p>
<br/>

<div class="success">
<p>This graph shows that :</p>
<ul>
<li>Housetasks such as Dinner, Breakfeast and Laundry are done more often by the wife</li>
<li>Driving and Repairs are done by the husband</li>
<li>……</li>
</ul>
</div>
<p><br/></p>
<br/>
<div class="warning">
<ul>
<li><p>The <strong>symmetric plot</strong> represents the row and column profiles simultaneously in a common space (Bendixen, 2003). In this case, only the distance between row points or the distance between column points can be really interpreted.</p></li>
<li><p>The distance between any row and column items is not meaningful! You can only make general statements about the observed pattern.</p></li>
<li>In order to interpret the distance between column and row points, the column profiles must be presented in row space or vice-versa. This type of map is called <strong>asymmetric biplot</strong> and is discussed at the end of this article.</li>
</ul>
</div>
<p><br/></p>
<p>The next step for the interpretation is to determine which row and column variables contribute the most in the definition of the different dimensions retained in the model.</p>
</div>
<div id="row-variables" class="section level2">
<h2>Row variables</h2>
<p>The function <strong>get_ca_row()</strong>[in <em>factoextra</em>] is used to extract the results for row variables. This function returns a list containing the coordinates, the cos2, the contribution and the inertia of row variables:</p>
<pre class="r"><code>row <- get_ca_row(res.ca)
row</code></pre>
<pre><code>Correspondence Analysis - Results for rows
 ===================================================
  Name       Description                
1 "$coord"   "Coordinates for the rows" 
2 "$cos2"    "Cos2 for the rows"        
3 "$contrib" "contributions of the rows"
4 "$inertia" "Inertia of the rows"      </code></pre>
<div id="coordinates-of-rows" class="section level3">
<h3>Coordinates of rows</h3>
<pre class="r"><code>head(row$coord)</code></pre>
<pre><code>                Dim 1      Dim 2       Dim 3
Laundry    -0.9918368  0.4953220 -0.31672897
Main_meal  -0.8755855  0.4901092 -0.16406487
Dinner     -0.6925740  0.3081043 -0.20741377
Breakfeast -0.5086002  0.4528038  0.22040453
Tidying    -0.3938084 -0.4343444 -0.09421375
Dishes     -0.1889641 -0.4419662  0.26694926</code></pre>
<p><span class="success">The data indicate the coordinates of each row point in each dimension (1, 2 and 3)</span></p>
<p>Use the function <strong>fviz_ca_row()</strong> [in factoextra] to visualize only row points:</p>
<pre class="r"><code># Default plot
fviz_ca_row(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-row-points-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s possible to change the color and the shape of the row points using the arguments <em>col.row</em> and <em>shape.row</em> as follows:</p>
<pre class="r"><code>fviz_ca_row(res.ca, col.row="steelblue", shape.row = 15)</code></pre>
<br/>
<div class="nootice">
<p>Note that, it’s also possible to make the graph of rows only using <em>FactoMineR</em> base graph. The argument <em>invisible</em> is used to hide the column points:</p>
<pre class="r"><code># Hide columns
plot(res.ca, invisible="col") </code></pre>
</div>
<p><br/></p>
<p><span class="warning">Read more about <em>fviz_ca_row()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_row</a></span></p>
</div>
<div id="contribution-of-rows-to-the-dimensions" class="section level3">
<h3>Contribution of rows to the dimensions</h3>
<p>The contribution of rows (in %) to the definition of the dimensions can be extracted as follows:</p>
<pre class="r"><code>head(row$contrib)</code></pre>
<pre><code>                Dim 1    Dim 2    Dim 3
Laundry    18.2867003 5.563891 7.968424
Main_meal  12.3888433 4.735523 1.858689
Dinner      5.4713982 1.321022 2.096926
Breakfeast  3.8249284 3.698613 3.069399
Tidying     1.9983518 2.965644 0.488734
Dishes      0.4261663 2.844117 3.634294</code></pre>
<p><span class="success">The row variables with the largest values contribute the most to the definition of the dimensions.</span></p>
<p>It’s possible to use the function <strong>corrplot</strong> to highlight the most contributing variables for each dimension:</p>
<pre class="r"><code>library("corrplot")
corrplot(row$contrib, is.corr=FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-row-contribution-r-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>The function <strong>fviz_contrib()</strong>[in <em>factoextra</em>] can be used to draw a bar plot of row contributions:</p>
<pre class="r"><code># Contributions of rows on Dim.1
fviz_contrib(res.ca, choice = "row", axes = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-row-contribution-dim-1-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="warning">
<ul>
<li><p>If the row contributions were uniform, the expected value would be 1/nrow(housetasks) = 1/13 = 7.69%.</p></li>
<li>The red dashed line on the graph above indicates the expected average contribution. For a given dimension, any row with a contribution larger than this threshold could be considered as important in contributing to that dimension.</li>
</ul>
</div>
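<p>The same threshold can be applied numerically. The sketch below is only an illustration: it re-enters the six rows of <em>row$contrib</em> printed above (so rows such as Repairs and Driving, which are not in that output, are not covered) and flags the contributions larger than the uniform value 100/13. The object names <em>contrib_head</em>, <em>expected</em> and <em>above</em> are assumptions for the example:</p>
<pre class="r"><code># First six rows of row$contrib, copied from the output above (values in %)
contrib_head <- matrix(
  c(18.2867003, 5.563891, 7.968424,
    12.3888433, 4.735523, 1.858689,
     5.4713982, 1.321022, 2.096926,
     3.8249284, 3.698613, 3.069399,
     1.9983518, 2.965644, 0.488734,
     0.4261663, 2.844117, 3.634294),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("Laundry", "Main_meal", "Dinner",
                    "Breakfeast", "Tidying", "Dishes"),
                  c("Dim 1", "Dim 2", "Dim 3")))

expected <- 100 / 13               # uniform contribution with 13 rows (about 7.69%)
above    <- contrib_head > expected  # TRUE where a row exceeds the threshold

# Rows (among the six listed) that exceed the threshold on dimension 1
dim1_rows <- rownames(above)[above[, "Dim 1"]]
dim1_rows</code></pre>
<p>With the full analysis, replacing <em>contrib_head</em> by <em>row$contrib</em> would apply the same check to all 13 rows.</p>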
<p><br/></p>
<p><span class="success"> It can be seen that the row items <em>Repairs, Laundry, Main_meal and Driving</em> are the most important in the definition of the first dimension.</span></p>
<pre class="r"><code># Contributions of rows on Dim.2
fviz_contrib(res.ca, choice = "row", axes = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-row-contribution-dim-2-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">The row items <em>Holidays and Repairs</em> contribute the most to dimension 2.</span></p>
<pre class="r"><code># Total contribution on Dim.1 and Dim.2
fviz_contrib(res.ca, choice = "row", axes = 1:2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-row-contribution-2-dimension-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="block">
<p>The total contribution of a row, on explaining the variations retained by Dim.1 and Dim.2, is calculated as follows: (C1 * Eig1) + (C2 * Eig2).</p>
<p>C1 and C2 are the contributions of the row to dimensions 1 and 2, respectively. Eig1 and Eig2 are the eigenvalues of dimensions 1 and 2, respectively.</p>
The expected average contribution of a row for Dim.1 and Dim.2 is: (7.69 * Eig1) + (7.69 * Eig2) = (7.69 * 0.54) + (7.69 * 0.44) = 7.53%
</div>
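<p>As a worked example of the formula above, the sketch below applies it to the row <em>Laundry</em>, using its contributions from the <em>row$contrib</em> output and the rounded eigenvalues quoted in the text (0.54 and 0.44). The helper <em>combined_contrib()</em> is hypothetical and written only for this illustration; it is not a factoextra function:</p>
<pre class="r"><code># Hypothetical helper: combined contribution on two axes, (C1 * Eig1) + (C2 * Eig2)
combined_contrib <- function(c1, c2, eig1, eig2) {
  (c1 * eig1) + (c2 * eig2)
}

# Laundry: contributions (in %) taken from row$contrib above;
# eigenvalues rounded as quoted in the text
laundry <- combined_contrib(c1 = 18.2867, c2 = 5.5639,
                            eig1 = 0.54, eig2 = 0.44)

# Reference value: a row contributing uniformly (100/13 %) to both axes
reference <- combined_contrib(c1 = 100 / 13, c2 = 100 / 13,
                              eig1 = 0.54, eig2 = 0.44)

round(c(laundry = laundry, reference = reference), 2)
laundry > reference   # is Laundry above the expected average?</code></pre>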
<p><br/></p>
<p>If your data contains many row items, the top contributing rows can be displayed as follows:</p>
<pre class="r"><code>fviz_contrib(res.ca, choice = "row", axes = 1, top = 5)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-top-5-contributing-rows-r-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_contrib()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-contrib-quick-visualization-of-row-column-contributions-r-software-and-data-mining">fviz_contrib</a></span></p>
<p>A second option is to draw a scatter plot of row points and to highlight rows according to the amount of their contributions. The function <strong>fviz_ca_row()</strong> is used.</p>
<p><span class="warning">Note that, using <strong>factoextra</strong> package, the color or the transparency of the row variables can be automatically controlled by the value of their contributions, their cos2, their coordinates on x or y axis.</span></p>
<pre class="r"><code># Control row point colors using their contribution
# Possible values for the argument col.row are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_ca_row(res.ca, col.row = "contrib")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-rows-graph-colors-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change the gradient color
fviz_ca_row(res.ca, col.row="contrib")+
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=10)+theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-rows-graph-colors-factoextra-data-mining-2.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<br/>
<div class="success">
<p>The scatter plot is also helpful to highlight the most important row variables in the determination of the dimensions.</p>
<p>In addition we can have an idea of what pole of the dimensions the row categories are actually contributing to.</p>
<p>It is evident that the row categories <em>Repairs and Driving</em> have an important contribution to the positive pole of the first dimension, while the categories <em>Laundry and Main_meal</em> have a major contribution to the negative pole of the first dimension, and so on.</p>
In other words, dimension 1 is mainly defined by the opposition of <em>Repairs and Driving</em> (positive pole), and <em>Laundry and Main_meal</em> (negative pole).
</div>
<p><br/></p>
<p>It’s also possible to control automatically the transparency of rows by their contributions. The argument <em>alpha.row</em> is used:</p>
<pre class="r"><code># Control the transparency of rows using their contribution
# Possible values for the argument alpha.row are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_ca_row(res.ca, alpha.row="contrib")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-rows-graph-colors-transparency-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">It’s possible to select and display only the top contributing rows, as illustrated in the R code below.</span></p>
<pre class="r"><code># Select the top 5 contributing rows
fviz_ca_row(res.ca, alpha.row="contrib", select.row=list(contrib=5))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-select-top-rows-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="notice">Row/column selections are discussed in detail in the next sections</span></p>
<p><span class="warning">The contribution of row/column variables can be visualized using the so-called <strong>contribution biplots</strong> (discussed in the last sections of this article).</span></p>
<p><span class="warning">Read more about <em>fviz_ca_row()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_row</a></span></p>
</div>
<div id="cos2-the-quality-of-representation-of-rows" class="section level3">
<h3>Cos2 : The quality of representation of rows</h3>
<p>The result of the analysis shows that the contingency table has been successfully represented in a low-dimensional space using <strong>correspondence analysis</strong>. The two dimensions 1 and 2 are sufficient to retain 88.6% of the total inertia contained in the data.</p>
<p><span class="warning">However, not all the points are equally well displayed in the two dimensions.</span></p>
<p><span class="success">The <strong>quality of representation</strong> of the rows on the factor map is called the <strong>squared cosine</strong> (cos2) or the <strong>squared correlations</strong>.</span></p>
<p>The cos2 measures the degree of association between rows/columns and a particular axis.</p>
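<p>To make this more concrete, the cos2 of a row on an axis can be seen as its squared coordinate on that axis divided by its squared distance to the origin. The sketch below is an illustration under one assumption: all three non-trivial dimensions are kept, so the squared distance is the sum of the squared coordinates. It re-enters the row coordinates printed in the section <em>Coordinates of rows</em>; the object names are illustrative. The result can be compared with the cos2 values extracted below.</p>
<pre class="r"><code># First six rows of row$coord, copied from the output shown earlier
coord_head <- matrix(
  c(-0.9918368,  0.4953220, -0.31672897,
    -0.8755855,  0.4901092, -0.16406487,
    -0.6925740,  0.3081043, -0.20741377,
    -0.5086002,  0.4528038,  0.22040453,
    -0.3938084, -0.4343444, -0.09421375,
    -0.1889641, -0.4419662,  0.26694926),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("Laundry", "Main_meal", "Dinner",
                    "Breakfeast", "Tidying", "Dishes"),
                  c("Dim 1", "Dim 2", "Dim 3")))

# cos2 = squared coordinate / squared distance of the row to the origin
# (each row of the matrix is divided by its own sum of squares)
cos2_head <- coord_head^2 / rowSums(coord_head^2)
round(cos2_head, 3)</code></pre>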
<p>The cos2 of rows can be extracted as follows:</p>
<pre class="r"><code>head(row$cos2)</code></pre>
<pre><code>               Dim 1     Dim 2      Dim 3
Laundry    0.7399874 0.1845521 0.07546047
Main_meal  0.7416028 0.2323593 0.02603787
Dinner     0.7766401 0.1537032 0.06965666
Breakfeast 0.5049433 0.4002300 0.09482670
Tidying    0.4398124 0.5350151 0.02517249
Dishes     0.1181178 0.6461525 0.23572969</code></pre>
<p>The values of the cos2 lie between 0 and 1.</p>
<p><strong>The sum of the cos2</strong> for rows on all the CA dimensions is equal to one.</p>
<p><span class="warning">The quality of representation of a row or column in n dimensions is simply the sum of the squared cosine of that row or column over the n dimensions.</span></p>
<p>If a row item is well represented by two dimensions, the sum of the cos2 is close to one.</p>
<p>For some of the row items, more than 2 dimensions are required to perfectly represent the data.</p>
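<p>The statements above can be checked on the values already printed. The following sketch re-enters the six rows of <em>row$cos2</em> shown above and sums them over the first two dimensions, and then over all three; the object names <em>cos2_head</em> and <em>quality_12</em> are assumptions for the example:</p>
<pre class="r"><code># First six rows of row$cos2, copied from the output above
cos2_head <- matrix(
  c(0.7399874, 0.1845521, 0.07546047,
    0.7416028, 0.2323593, 0.02603787,
    0.7766401, 0.1537032, 0.06965666,
    0.5049433, 0.4002300, 0.09482670,
    0.4398124, 0.5350151, 0.02517249,
    0.1181178, 0.6461525, 0.23572969),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("Laundry", "Main_meal", "Dinner",
                    "Breakfeast", "Tidying", "Dishes"),
                  c("Dim 1", "Dim 2", "Dim 3")))

# Quality of representation on the first two dimensions
quality_12 <- rowSums(cos2_head[, c("Dim 1", "Dim 2")])
round(quality_12, 3)

# Over all three dimensions, the cos2 of each row sums to one
round(rowSums(cos2_head), 3)</code></pre>
<p>Among these six rows, <em>Dishes</em> has the lowest two-dimensional quality, because a noticeable share of its cos2 lies on dimension 3.</p>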
<p><strong>Visualize the cos2 of rows using corrplot</strong>:</p>
<pre class="r"><code>library("corrplot")
corrplot(row$cos2, is.corr=FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-row-cos2-r-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>The function <strong>fviz_cos2()</strong>[in <em>factoextra</em>] can be used to draw a bar plot of rows cos2:</p>
<pre class="r"><code># Cos2 of rows on Dim.1 and Dim.2
fviz_cos2(res.ca, choice = "row", axes = 1:2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-row-cos2-dim-1-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Note that all row points except <em>Official</em> are well represented by the first two dimensions. This implies that the position of the point corresponding to the item <em>Official</em> on the scatter plot should be interpreted with some caution. A higher dimensional solution is probably necessary for the item <em>Official</em>.</span></p>
<p><span class="warning">Read more about <em>fviz_cos2()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-cos2-quick-visualization-of-the-quality-of-representation-of-rows-columns-r-software-and-data-mining">fviz_cos2</a></span></p>
</div>
</div>
<div id="column-varables" class="section level2">
<h2>Column variables</h2>
<p>The function <strong>get_ca_col()</strong> [in <em>factoextra</em>] is used to extract the results for column variables. This function returns a list containing the coordinates, the cos2, the contribution and the inertia of column variables:</p>
<pre class="r"><code>col <- get_ca_col(res.ca)
col</code></pre>
<pre><code>Correspondence Analysis - Results for columns
 ===================================================
  Name       Description                   
1 "$coord"   "Coordinates for the columns" 
2 "$cos2"    "Cos2 for the columns"        
3 "$contrib" "contributions of the columns"
4 "$inertia" "Inertia of the columns"      </code></pre>
<p><span class="warning">The result for columns gives the same information as described for rows. For this reason, I’ll just display the result for columns in this section without commenting.</span></p>
<div id="coordinates-of-columns" class="section level3">
<h3>Coordinates of columns</h3>
<pre class="r"><code>head(col$coord)</code></pre>
<pre><code>                  Dim 1      Dim 2       Dim 3
Wife        -0.83762154  0.3652207 -0.19991139
Alternating -0.06218462  0.2915938  0.84858939
Husband      1.16091847  0.6019199 -0.18885924
Jointly      0.14942609 -1.0265791 -0.04644302</code></pre>
<p>Use the function <strong>fviz_ca_col()</strong> [in <em>factoextra</em>] to visualize only column points:</p>
<pre class="r"><code>fviz_ca_col(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-columns-graph-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="warning">
<p>Note that it’s also possible to make the graph of columns only, using the <strong>FactoMineR</strong> base graph. The argument <em>invisible</em> is used to hide the rows on the factor map:</p>
<pre class="r"><code># Hide rows
plot(res.ca, invisible="row") </code></pre>
</div>
<p><br/></p>
<p><span class="warning">Read more about <em>fviz_ca_col()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_col</a></span></p>
</div>
<div id="contribution-of-columns-to-the-dimensions" class="section level3">
<h3>Contribution of columns to the dimensions</h3>
<pre class="r"><code>head(col$contrib)</code></pre>
<pre><code>                Dim 1     Dim 2      Dim 3
Wife        44.462018 10.312237 10.8220753
Alternating  0.103739  2.782794 82.5492464
Husband     54.233879 17.786612  6.1331792
Jointly      1.200364 69.118357  0.4954991</code></pre>
<p><span class="notice">Note that, you can use the previously mentioned <strong>corrplot()</strong> function to visualize the contribution of columns.</span></p>
<p>Use the function <a href="https://www.sthda.com/english/english/wiki/fviz-contrib-quick-visualization-of-row-column-contributions-r-software-and-data-mining"><strong>fviz_contrib()</strong></a> [in <em>factoextra</em>] to visualize column contributions on dimensions 1+2:</p>
<pre class="r"><code>fviz_contrib(res.ca, choice = "col", axes = 1:2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-column-contribution-dim-1-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="warning">
<ul>
<li><p>If the column contributions were uniform, the expected value would be 1/ncol(housetasks) = 1/4 = 25%.</p></li>
<li>The expected average contribution (reference line) of a column for Dim.1 and Dim.2 is : (25 * Eig1) + (25 * Eig2) = (25 * 0.54) + (25 * 0.44) = 24.5%.</li>
</ul>
</div>
<p><br/></p>
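<p>The reference value quoted above is plain arithmetic, so it can be checked in base R. The sketch below simply reuses the rounded eigenvalues given in the text (0.54 and 0.44); the variable names are illustrative and are not part of <em>factoextra</em>.</p>
<pre class="r"><code># Number of active columns in housetasks
n_col <- 4
# Rounded eigenvalues of Dim.1 and Dim.2, as quoted in the text
eig <- c(0.54, 0.44)

# Contribution (in %) of each column if contributions were uniform
uniform_contrib <- 100 / n_col

# Expected average contribution over Dim.1 and Dim.2
expected_contrib <- sum(uniform_contrib * eig)
print(expected_contrib)</code></pre>
<p>A column whose bar rises above this value can be read as contributing more than average to the first two dimensions.</p>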
<p><strong>Draw a scatter plot of column points</strong> and highlight columns according to the amount of their contributions. The function <strong>fviz_ca_col()</strong> [in <em>factoextra</em>] is used:</p>
<pre class="r"><code># Control column point colors using their contribution
# Possible values for the argument col.col are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_ca_col(res.ca, col.col="contrib")+
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=24.5)+theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-columns-graph-colors-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<br/>
<div class="warning">
<p>Note that it’s also possible to automatically control the transparency of columns by their contributions using the argument <em>alpha.col</em>:</p>
<pre class="r"><code># Control the transparency of columns using their contribution
# Possible values for the argument alpha.col are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_ca_col(res.ca, alpha.col="contrib")</code></pre>
</div>
<p><br/></p>
</div>
</div>
<div id="cos2-the-quality-of-representation-of-columns" class="section level2">
<h2>Cos2 : The quality of representation of columns</h2>
<pre class="r"><code>head(col$cos2)</code></pre>
<pre><code>                  Dim 1     Dim 2       Dim 3
Wife        0.801875947 0.1524482 0.045675847
Alternating 0.004779897 0.1051016 0.890118521
Husband     0.772026244 0.2075420 0.020431728
Jointly     0.020705858 0.9772939 0.002000236</code></pre>
<p><span class="warning">Note that the value of cos2 lies between 0 and 1. A cos2 close to 1 indicates a row/column that is well represented on the factor map.</span></p>
<p>The function <a href="https://www.sthda.com/english/english/wiki/fviz-cos2-quick-visualization-of-the-quality-of-representation-of-rows-columns-r-software-and-data-mining"><strong>fviz_cos2()</strong></a> [in <em>factoextra</em>] can be used to draw a bar plot of columns cos2:</p>
<pre class="r"><code># Cos2 of columns on Dim.1 and Dim.2
fviz_cos2(res.ca, choice = "col", axes = 1:2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-ca-columns-cos2-dim-1-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Note that, only the column item <em>Alternating</em> is not very well displayed on the first two dimensions. The position of this item must be interpreted with caution in the space formed by dimensions 1 and 2.</span></p>
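<p>To make this reading concrete, the cos2 values printed above can be summed over Dim.1 and Dim.2 by hand. This is only a sketch: the matrix is typed in from the output shown earlier, and <em>quality_12</em> is an illustrative name.</p>
<pre class="r"><code># cos2 of the four columns, copied from the output above
cos2 <- matrix(c(0.801875947, 0.1524482, 0.045675847,
                 0.004779897, 0.1051016, 0.890118521,
                 0.772026244, 0.2075420, 0.020431728,
                 0.020705858, 0.9772939, 0.002000236),
               ncol = 3, byrow = TRUE,
               dimnames = list(c("Wife", "Alternating", "Husband", "Jointly"),
                               c("Dim 1", "Dim 2", "Dim 3")))

# Quality of representation on the map formed by Dim.1 and Dim.2
quality_12 <- rowSums(cos2[, 1:2])
print(round(quality_12, 3))

# Column with the poorest representation on the first two dimensions
print(names(which.min(quality_12)))</code></pre>
<p>Over all three dimensions the cos2 of each column sums to 1 (up to rounding), which is why a low value on Dim.1 and Dim.2 means the column is mostly described by Dim.3.</p>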
</div>
</div>
<div id="biplot-of-rows-and-columns" class="section level1">
<h1>Biplot of rows and columns</h1>
<div id="symmetric-biplot" class="section level2">
<h2>Symmetric biplot</h2>
<p>As mentioned above, the standard plot of <strong>correspondence analysis</strong> is a <strong>symmetric biplot</strong> in which both rows (blue points) and columns (red triangles) are represented in the same space using the <strong>principal coordinates</strong>. These coordinates represent the row and column profiles. In this case, only the distance between row points or the distance between column points can be really interpreted.</p>
<p><span class="notice">With a symmetric plot, the distance between a row point and a column point can’t be interpreted. Only general statements can be made about the pattern.</span></p>
<pre class="r"><code>fviz_ca_biplot(res.ca)+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-biplot-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><strong>Remove the points from the graph, use texts only</strong> :</p>
<pre class="r"><code>fviz_ca_biplot(res.ca, geom="text")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-columns-graph-factoextra-remove-points-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>

<div class="warning">
<p>Note that, allowed values for the argument <strong>geom</strong> are the combination of :</p>
<ul>
<li><strong>“point”</strong> to show only points (dots)</li>
<li><strong>“text”</strong> to show only labels</li>
<li><strong>c(“point”, “text”)</strong> to show both types</li>
</ul>
</div>
<p><br/></p>
<p><span class="warning">Note that, in order to interpret the distance between column points and row points, the simplest way is to make an <strong>asymmetric plot</strong> (Bendixen, 2003). This means that, the column profiles must be presented in row space or vice-versa.</span></p>
<p><span class="warning">Read more about <em>fviz_ca_biplot()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_biplot</a></span></p>
</div>
<div id="asymmetric-biplot-for-correspondence-analysis" class="section level2">
<h2>Asymmetric biplot for correspondence analysis</h2>
<p>To make an <strong>asymmetric plot</strong>, row (or column) points are plotted from the <strong>standard coordinates</strong> (<em>S</em>) and the profiles of the columns (or the rows) are plotted from the <strong>principal coordinates</strong> (<em>P</em>) (Bendixen 2003).</p>
<br/>
<div class="block">
<p>For a given axis, the standard and principal coordinates are related as follows:</p>
<p><em>P = sqrt(eigenvalue) X S</em></p>
<ul>
<li><em>P</em>: the principal coordinate of a row (or a column) on the axis</li>
<li><em>S</em>: the standard coordinate of the same row (or column) on the axis</li>
<li><em>eigenvalue</em>: the eigenvalue of the axis</li>
</ul>
</div>
<p><br/></p>
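<p>The relation between the two coordinate systems can be illustrated with a few lines of base R. This is a minimal sketch, not the code used by <em>FactoMineR</em>: the contingency table is hypothetical, the names are illustrative, and the CA is computed by hand from the singular value decomposition of the standardized residuals.</p>
<pre class="r"><code># Hypothetical 3 x 3 contingency table (illustrative counts only)
N <- matrix(c(20,  5, 10,
               8, 15,  7,
               4,  6, 25), nrow = 3, byrow = TRUE)

P <- N / sum(N)              # correspondence matrix
row_mass <- rowSums(P)
col_mass <- colSums(P)

# Standardized residuals and their singular value decomposition
expected <- outer(row_mass, col_mass)
resid <- (P - expected) / sqrt(expected)
dec <- svd(resid)

k <- 1:2                     # non-trivial dimensions: min(3, 3) - 1
eig <- dec$d[k]^2            # eigenvalues of the axes

# Standard coordinates (S) of the rows
row_std <- dec$u[, k] / sqrt(row_mass)
# Principal coordinates: P = sqrt(eigenvalue) x S, axis by axis
row_princ <- sweep(row_std, 2, sqrt(eig), "*")

# The mass-weighted variance of the principal coordinates
# on each axis should match the eigenvalue of that axis
print(colSums(row_mass * row_princ^2))
print(eig)</code></pre>
<p>Standard coordinates have a weighted variance of 1 on every axis, whereas principal coordinates are shrunk by the square root of the eigenvalue. This is why the points plotted in principal coordinates sit closer to the origin in an asymmetric plot.</p>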
<p>Depending on the situation, other types of display can be set using the argument <em>map</em> of the function <strong>fviz_ca_biplot()</strong> [in <em>factoextra</em>]. This is inspired by the <em>ca</em> package (Michael Greenacre).</p>
<p>The allowed options for the argument <em>map</em> are:</p>
<ol style="list-style-type: decimal">
<li><strong>“rowprincipal”</strong> or <strong>“colprincipal”</strong> - these are the so-called <strong>asymmetric biplots</strong>, with either rows in principal coordinates and columns in standard coordinates, or vice versa (also known as row-metric-preserving or column-metric-preserving respectively).
</li>
</ol>
<ul>
<li><strong>“rowprincipal”</strong>: columns are represented in row space</li>
<li><strong>“colprincipal”</strong>: rows are represented in column space</li>
</ul>
<ol start="2" style="list-style-type: decimal">
<li><p><strong>“symbiplot”</strong> - both rows and columns are scaled to have variances equal to the singular values (square roots of eigenvalues), which gives a <strong>symmetric biplot</strong> but does not preserve row or column metrics.</p></li>
<li><strong>“rowgab”</strong> or <strong>“colgab”</strong>: <strong>Asymmetric maps</strong> proposed by <em>Gabriel &amp; Odoroff (1990)</em>:</li>
</ol>
<ul>
<li>“<em>rowgab</em>”: rows in principal coordinates and columns in standard coordinates multiplied by the mass.</li>
<li>“<em>colgab</em>”: columns in principal coordinates and rows in standard coordinates multiplied by the mass.</li>
</ul>
<ol start="4" style="list-style-type: decimal">
<li><strong>“rowgreen”</strong> or <strong>“colgreen”</strong>: The so-called <strong>contribution biplots</strong> showing visually the most contributing points (Greenacre 2006b).</li>
</ol>
<ul>
<li>“<em>rowgreen</em>”: rows in principal coordinates and columns in standard coordinates multiplied by square root of the mass.</li>
<li>“<em>colgreen</em>”: columns in principal coordinates and rows in standard coordinates multiplied by the square root of the mass.</li>
</ul>
<p>The R code below draws a standard <strong>asymmetric biplot</strong>:</p>
<pre class="r"><code>fviz_ca_biplot(res.ca, map ="rowprincipal", arrows = c(TRUE, TRUE))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-biplot-asymmetric-map-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="notice">The argument <em>arrows</em> is a vector of two logicals specifying if the plot should contain points (FALSE, default) or arrows (TRUE). First value sets the rows and the second value sets the columns.</span></p>
<br/>
<div class="warning">
<p>If the angle between two arrows is acute, then there is a strong association between the corresponding row and column.</p>
<p>To interpret the distance between rows and a column, you should perpendicularly project the row points onto the column arrow.</p>
</div>
<p><br/></p>
</div>
<div id="contribution-biplot" class="section level2">
<h2>Contribution biplot</h2>
<p>In correspondence analysis, <strong>biplot</strong> is a graphical display of rows and columns in 2 or 3 dimensions.</p>
<p>In the standard <strong>symmetric biplot</strong> (mentioned in the previous sections), it’s difficult to know the most contributing points to the solution of the CA.</p>
<p><a href="http://www.econ.upf.edu/docs/papers/downloads/1162.pdf"><em>Michael Greenacre</em></a> proposed a new scaling of the display (called the contribution biplot) which incorporates the contribution of points. In this display, points that contribute very little to the solution are close to the center of the biplot and are relatively unimportant to the interpretation.</p>
<p><span class="warning">A <strong>contribution biplot</strong> can be drawn using the argument <strong>map = “rowgreen”</strong> or <strong>map = “colgreen”</strong>.</span></p>
<p>Firstly, you have to decide whether to analyse the contributions of rows or columns to the definition of the axes.</p>
<p>In our example we’ll interpret the contribution of rows to the axes. The argument <strong>map =“colgreen”</strong> is used. In this case, remember that columns are in principal coordinates and rows in standard coordinates multiplied by the square root of the mass. For a given row, the square of the new coordinate on an axis i is exactly the contribution of this row to the inertia of the axis i.</p>
<pre class="r"><code>fviz_ca_biplot(res.ca, map ="colgreen",
               arrows = c(TRUE, FALSE))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-contribution-biplot-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>In the graph above, the position of the column profile points is unchanged relative to that in the conventional biplot. However, the distances of the row points from the plot origin are related to their contributions to the two-dimensional factor map.</p>
<p>The closer an arrow is (in terms of angular distance) to an axis the greater is the contribution of the row category on that axis relative to the other axis. If the arrow is halfway between the two, its row category contributes to the two axes to the same extent.</p>
<br/>
<div class="block">
<ul>
<li><p>It is evident that the row category <em>Repairs</em> has an important contribution to the positive pole of the first dimension, while the categories <em>Laundry</em> and <em>Main_meal</em> have a major contribution to the negative pole of the first dimension;</p></li>
<li><p>Dimension 2 is mainly defined by the row category <em>Holidays</em>.</p></li>
<li>The row category <em>Driving</em> contributes to the two axes to the same extent.</li>
</ul>
</div>
<p><br/></p>
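<p>The scaling behind the contribution biplot can be checked numerically. The sketch below uses a hypothetical contingency table and a hand-made CA in base R (it does not call <em>FactoMineR</em>; all names are illustrative). It shows that the square of a row’s standard coordinate multiplied by the square root of its mass equals the row’s contribution to the axis, expressed as a proportion.</p>
<pre class="r"><code># Hypothetical 3 x 3 contingency table (illustrative counts only)
N <- matrix(c(20,  5, 10,
               8, 15,  7,
               4,  6, 25), nrow = 3, byrow = TRUE)

P <- N / sum(N)
row_mass <- rowSums(P)
col_mass <- colSums(P)
expected <- outer(row_mass, col_mass)
dec <- svd((P - expected) / sqrt(expected))

k <- 1:2                     # non-trivial dimensions
eig <- dec$d[k]^2

# Standard and principal coordinates of the rows
row_std <- dec$u[, k] / sqrt(row_mass)
row_princ <- sweep(row_std, 2, sqrt(eig), "*")

# Contribution of each row to each axis, as a proportion:
# mass x (principal coordinate)^2 / eigenvalue
row_contrib <- sweep(row_mass * row_princ^2, 2, eig, "/")

# Contribution-biplot coordinates of the rows:
# standard coordinates multiplied by the square root of the mass
row_green <- row_std * sqrt(row_mass)

print(all.equal(row_green^2, row_contrib))
print(colSums(row_contrib))   # contributions sum to 1 on each axis</code></pre>
<p>Because the squared coordinates on an axis add up to 1, a row lying far from the origin along an axis necessarily accounts for a large share of that axis.</p>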
</div>
<div id="plot-rows-or-columns-only" class="section level2">
<h2>Plot rows or columns only</h2>
<p><span class="warning">It’s also possible to draw the rows or columns only using the function <strong>fviz_ca_biplot()</strong> (instead of using fviz_ca_row() and fviz_ca_col()).</span></p>
<p>Plot rows only by hiding the columns (<em>invisible =“col”</em>):</p>
<pre class="r"><code>fviz_ca_biplot(res.ca, invisible = "col")+
  theme_minimal()</code></pre>
<p>Plot columns only by hiding the rows (<em>invisible =“row”</em>):</p>
<pre class="r"><code>fviz_ca_biplot(res.ca, invisible = "row")+
  theme_minimal()</code></pre>
</div>
</div>
<div id="correspondence-analysis-using-supplementary-rows-and-columns" class="section level1">
<h1>Correspondence analysis using supplementary rows and columns</h1>
<div id="data" class="section level2">
<h2>Data</h2>
<p>We’ll use the data set <em>children</em> [in <em>FactoMineR</em> package]. It contains 18 rows and 8 columns:</p>
<pre class="r"><code>data(children)
# head(children)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/images/ca-children.png" alt="Data format correspondence analysis" /></p>
<p><span class="notice"> The data used here is a contingency table describing the answers given by different categories of people to the following question: What are the reasons that can make a woman or a couple hesitate to have children? </span></p>
<br/>

<div class="warning">
<p>Only some of the rows and columns will be used to perform the correspondence analysis (CA).</p>
<p>The coordinates of the remaining (supplementary) rows/columns on the factor map will be <strong>predicted</strong> after the CA.</p>
</div>
<p><br/></p>
<p>In CA terminology, our data contains :</p>
<br/>
<div class="block">
<ul>
<li><strong>Active rows</strong> (rows 1:14) : Rows that are used during the correspondence analysis.</li>
<li><strong>Supplementary rows</strong> (row.sup 15:18) : The coordinates of these rows will be predicted using the CA information and parameters obtained with active rows/columns</li>
<li><strong>Active columns</strong> (columns 1:5) : Columns that are used for the correspondence analysis.</li>
<li><strong>Supplementary columns</strong> (col.sup 6:8) : As with supplementary rows, the coordinates of these columns will also be predicted.</li>
</ul>
</div>
<p><br/></p>
</div>
<div id="ca-with-supplementary-rowscolumns" class="section level2">
<h2>CA with supplementary rows/columns</h2>
<p>As mentioned above, supplementary rows and columns are not used for the definition of the principal dimensions. Their coordinates are predicted using only the information provided by the CA performed on active rows/columns.</p>
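<p>One common way to predict such coordinates is the CA transition formula: the profile of a row is multiplied by the standard coordinates of the active columns. The base R sketch below illustrates this on a hypothetical table. It is an assumption that this matches what <em>CA()</em> does internally, and all names and counts are illustrative.</p>
<pre class="r"><code># Hypothetical active table (3 x 3) and one supplementary row
N <- matrix(c(20,  5, 10,
               8, 15,  7,
               4,  6, 25), nrow = 3, byrow = TRUE)
sup_row <- c(10, 10, 10)

# CA of the active table only
P <- N / sum(N)
row_mass <- rowSums(P)
col_mass <- colSums(P)
expected <- outer(row_mass, col_mass)
dec <- svd((P - expected) / sqrt(expected))
k <- 1:2                     # non-trivial dimensions

# Principal coordinates of active rows, standard coordinates of columns
row_princ <- sweep(dec$u[, k] / sqrt(row_mass), 2, dec$d[k], "*")
col_std <- dec$v[, k] / sqrt(col_mass)

# Transition formula applied to the active rows (sanity check):
# row profile x column standard coordinates
active_check <- prop.table(N, 1) %*% col_std
print(all.equal(active_check, row_princ))

# Same formula applied to the supplementary row
sup_coord <- (sup_row / sum(sup_row)) %*% col_std
print(sup_coord)</code></pre>
<p>The supplementary row is projected onto axes that were built without it, so it receives coordinates but has no influence on the eigenvalues or on the positions of the active points.</p>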
<p>To specify supplementary rows/columns, the function <strong>CA()</strong> [in <em>FactoMineR</em>] can be used as follows:</p>
<pre class="r"><code>CA(X,  ncp = 5, row.sup = NULL, col.sup = NULL,
   graph = TRUE)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>X</strong> : a data frame (contingency table)</li>
<li><strong>row.sup</strong> : a numeric vector specifying the indexes of the supplementary rows</li>
<li><strong>col.sup</strong> : a numeric vector specifying the indexes of the supplementary columns</li>
<li><strong>ncp</strong> : number of dimensions kept in the final results.</li>
<li><strong>graph</strong> : a logical value. If TRUE a graph is displayed.</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.ca <- CA (children, row.sup = 15:18, col.sup = 6:8,
              graph = FALSE)</code></pre>
<p>The summary of the CA is :</p>
<pre class="r"><code>summary(res.ca, nb.dec = 2, ncp = 2)</code></pre>
<pre><code>
Call:
rmarkdown::render("factominer-correspondance-analysis.Rmd", encoding = "UTF-8") 

The chi square of independence between the two variables is equal to 98.80159 (p-value =  9.748064e-05 ).

Eigenvalues
                      Dim.1  Dim.2  Dim.3  Dim.4  Dim.5
Variance               0.04   0.01   0.01   0.01   0.00
% of var.             57.04  21.13  11.76  10.06   0.00
Cumulative % of var.  57.04  78.17  89.94 100.00 100.00

Rows (the 10 first)
                      Dim.1   ctr  cos2   Dim.2   ctr  cos2  
money               | -0.12  4.55  0.43 |  0.02  0.37  0.01 |
future              |  0.18 17.57  0.72 | -0.10 14.59  0.22 |
unemployment        | -0.21 22.62  0.87 | -0.07  6.78  0.10 |
circumstances       |  0.40  6.27  0.58 |  0.33 11.54  0.40 |
hard                | -0.25  2.99  0.88 |  0.07  0.59  0.06 |
economic            |  0.35 12.00  0.48 |  0.32 26.60  0.40 |
egoism              |  0.06  0.68  0.07 | -0.03  0.34  0.01 |
employment          | -0.14  2.62  0.16 |  0.22 17.55  0.41 |
finances            | -0.24  2.79  0.28 | -0.21  5.69  0.21 |
war                 |  0.22  2.17  0.75 | -0.07  0.69  0.09 |

Columns
                      Dim.1   ctr  cos2   Dim.2   ctr  cos2  
unqualified         | -0.21 25.11  0.68 | -0.08 10.08  0.10 |
cep                 | -0.14 18.30  0.64 |  0.06  8.08  0.11 |
bepc                |  0.11  6.76  0.31 | -0.03  1.25  0.02 |
high_school_diploma |  0.27 37.98  0.76 | -0.12 20.10  0.15 |
university          |  0.23 11.86  0.31 |  0.32 60.49  0.59 |

Supplementary rows
                      Dim.1 cos2   Dim.2 cos2  
comfort             |  0.21 0.07 |  0.70 0.78 |
disagreement        |  0.15 0.13 |  0.12 0.09 |
world               |  0.52 0.88 |  0.14 0.07 |
to_live             |  0.31 0.14 |  0.50 0.37 |

Supplementary columns
                      Dim.1  cos2   Dim.2  cos2  
thirty              |  0.11  0.14 | -0.06  0.04 |
fifty               | -0.02  0.01 |  0.05  0.09 |
more_fifty          | -0.18  0.29 | -0.05  0.02 |</code></pre>
<p><span class="notice">For the supplementary rows/columns, the coordinates and the quality of representation (cos2) on the factor maps are displayed. They don’t contribute to the dimensions.</span></p>
</div>
<div id="make-a-biplot-of-rows-and-columns" class="section level2">
<h2>Make a biplot of rows and columns</h2>
<p><strong>FactoMineR base graph</strong>:</p>
<pre class="r"><code>plot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-biplot-supplementary-factominer-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<br/>
<div class="block">
<ul>
<li>Active rows are in blue</li>
<li>Supplementary rows are in darkblue</li>
<li>Columns are in red</li>
<li>Supplementary columns are in darkred</li>
</ul>
</div>
<p><br/></p>
<p><strong>Use factoextra</strong>:</p>
<pre class="r"><code>fviz_ca_biplot(res.ca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-biplot-supplementary-factoextra-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>It’s also possible to hide supplementary rows and columns using the argument <em>invisible</em>:</p>
<pre class="r"><code>fviz_ca_biplot(res.ca, invisible = c("row.sup", "col.sup") ) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-hide-supplementary-rows-columns-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="notice">The argument <em>invisible</em> is also available in <em>FactoMineR</em> base graph.</span></p>
</div>
<div id="visualize-supplementary-rows" class="section level2">
<h2>Visualize supplementary rows</h2>
<p>All the results (coordinates and cos2) for the supplementary rows can be extracted as follows:</p>
<pre class="r"><code>res.ca$row.sup</code></pre>
<pre><code>$coord
                 Dim 1     Dim 2      Dim 3      Dim 4
comfort      0.2096705 0.7031677 0.07111168  0.3071354
disagreement 0.1462777 0.1190106 0.17108916 -0.3132169
world        0.5233045 0.1429707 0.08399269 -0.1063597
to_live      0.3083067 0.5020193 0.52093397  0.2557357

$cos2
                  Dim 1      Dim 2       Dim 3      Dim 4
comfort      0.06892759 0.77524032 0.007928672 0.14790342
disagreement 0.13132177 0.08692632 0.179649183 0.60210272
world        0.87587685 0.06537746 0.022564054 0.03618163
to_live      0.13899699 0.36853645 0.396830367 0.09563620</code></pre>
<p><strong>Factor map for rows</strong> :</p>
<pre class="r"><code>fviz_ca_row(res.ca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-supplementary-rows-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">Supplementary rows are shown in darkblue color.</span></p>
</div>
<div id="visualize-supplementary-columns" class="section level2">
<h2>Visualize supplementary columns</h2>
<p><strong>Factor map for columns</strong>:</p>
<pre class="r"><code>fviz_ca_col(res.ca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-supplementary-columns-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">Supplementary columns are shown in darkred.</span></p>
<p>The results for supplementary columns can be extracted as follows:</p>
<pre class="r"><code>res.ca$col.sup</code></pre>
<pre><code>$coord
                 Dim 1       Dim 2       Dim 3       Dim 4
thirty      0.10541339 -0.05969594 -0.10322613  0.06977996
fifty      -0.01706444  0.04907657 -0.01568923 -0.01306117
more_fifty -0.17706810 -0.04813788  0.10077299 -0.08517528

$cos2
               Dim 1      Dim 2       Dim 3       Dim 4
thirty     0.1375601 0.04411543 0.131910759 0.060278490
fifty      0.0108695 0.08990298 0.009188167 0.006367804
more_fifty 0.2860989 0.02114509 0.092666735 0.066200714</code></pre>
</div>
</div>
<div id="filter-ca-results" class="section level1">
<h1>Filter CA results</h1>
<p>If you have many row/column variables, it’s possible to visualize only some of them using the arguments <em>select.row</em> and <em>select.col</em>.</p>
<br/>
<div class="block">
<p><strong>select.col, select.row:</strong> a selection of columns/rows to be drawn. Allowed values are <em>NULL</em> or a <em>list</em> containing the arguments name, cos2 or contrib:</p>
<ul>
<li><em>name</em>: is a character vector containing column/row names to be drawn</li>
<li><em>cos2</em>: if cos2 is in [0, 1], ex: 0.6, then columns/rows with a cos2 > 0.6 are drawn; if cos2 > 1, ex: 5, then the top 5 active columns/rows and top 5 supplementary columns/rows with the highest cos2 are drawn</li>
<li><em>contrib</em>: if contrib > 1, ex: 5, then the top 5 columns/rows with the highest contributions are drawn</li>
</ul>
</div>
<p><br/></p>
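<p>The selection rule for <em>cos2</em> can be mimicked in base R to see which points would be kept. This is only a sketch: <em>select_by_cos2()</em> is a hypothetical helper, not a <em>factoextra</em> function, the cos2 values are the rounded ones printed by <em>summary()</em> for the first ten active rows, and keeping values equal to the threshold is an assumption.</p>
<pre class="r"><code># Rounded cos2 on Dim.1 and Dim.2 for the ten rows printed by summary()
cos2_dim1 <- c(money = 0.43, future = 0.72, unemployment = 0.87,
               circumstances = 0.58, hard = 0.88, economic = 0.48,
               egoism = 0.07, employment = 0.16, finances = 0.28,
               war = 0.75)
cos2_dim2 <- c(0.01, 0.22, 0.10, 0.40, 0.06, 0.40, 0.01, 0.41, 0.21, 0.09)

# Quality of representation on the map formed by Dim.1 and Dim.2
cos2_map <- cos2_dim1 + cos2_dim2

# Hypothetical helper reproducing the rule described above
select_by_cos2 <- function(cos2, value) {
  if (value > 1) {
    # value > 1: keep the 'value' points with the highest cos2
    n_keep <- min(value, length(cos2))
    names(sort(cos2, decreasing = TRUE))[seq_len(n_keep)]
  } else {
    # value in [0, 1]: keep points at or above the threshold
    names(cos2)[cos2 >= value]
  }
}

print(select_by_cos2(cos2_map, 0.8))   # threshold on cos2
print(select_by_cos2(cos2_map, 5))     # top 5 by cos2</code></pre>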
<pre class="r"><code># Visualize rows with cos2 >= 0.8
fviz_ca_row(res.ca, select.row = list(cos2 = 0.8))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-filter-r-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Top 5 active rows and 5 suppl. rows with the highest cos2
fviz_ca_row(res.ca, select.row = list(cos2 = 5))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-filter-r-data-mining-2.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">The top 5 active rows and the top 5 supplementary rows are shown.</span></p>
<pre class="r"><code># Select by names
name <- list(name = c("employment", "fear", "future"))
fviz_ca_row(res.ca, select.row = name)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-filter-2-r-data-mining-1.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code>#top 5 contributing rows and columns
fviz_ca_biplot(res.ca, select.row = list(contrib = 5), 
               select.col = list(contrib = 5)) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-correspondance-analysis-filter-2-r-data-mining-2.png" title="Correspondence analysis, visualization and interpretation - R software and data mining" alt="Correspondence analysis, visualization and interpretation - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">Supplementary rows/columns are not shown because they don’t contribute to the construction of the axes.</span></p>
</div>
<div id="dimension-description" class="section level1">
<h1>Dimension description</h1>
<p>The function <strong>dimdesc()</strong> [in <em>FactoMineR</em>] can be used to identify the variables most correlated with a given dimension.</p>
<p>A simplified format is :</p>
<pre class="r"><code>dimdesc(res, axes = 1:2, proba = 0.05)</code></pre>
<br/>
<div>
<ul>
<li><strong>res</strong> : an object of class CA</li>
<li><strong>axes</strong> : a numeric vector specifying the dimensions to be described</li>
<li><strong>proba</strong> : the significance level</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.desc <- dimdesc(res.ca, axes = c(1,2))
# Description of dimension 1
res.desc$`Dim 1`</code></pre>
<pre><code>$row
                     coord
hard          -0.249984356
finances      -0.236995598
unemployment  -0.212227692
work          -0.211677086
employment    -0.136754598
money         -0.115267468
housing       -0.006680991
egoism         0.059889455
health         0.111651752
disagreement   0.146277736
future         0.176449413
fear           0.203347917
comfort        0.209670471
war            0.216824026
to_live        0.308306674
economic       0.353963920
circumstances  0.400922001
world          0.523304472

$col
                          coord
unqualified         -0.20931790
more_fifty          -0.17706810
cep                 -0.13857658
fifty               -0.01706444
thirty               0.10541339
bepc                 0.10875778
university           0.23123279
high_school_diploma  0.27403930</code></pre>
<pre class="r"><code># Description of dimension 2
res.desc$`Dim 2`</code></pre>
<pre><code>$row
                    coord
finances      -0.20598461
future        -0.09786326
war           -0.07466267
unemployment  -0.07071770
fear          -0.05806796
egoism        -0.02566733
health         0.00429124
money          0.02004613
hard           0.06765048
work           0.10888448
disagreement   0.11901056
housing        0.12824218
world          0.14297067
employment     0.21539408
economic       0.32072390
circumstances  0.33098674
to_live        0.50201935
comfort        0.70316769

$col
                          coord
high_school_diploma -0.12134373
unqualified         -0.08072742
thirty              -0.05969594
more_fifty          -0.04813788
bepc                -0.02848299
fifty                0.04907657
cep                  0.05604703
university           0.31785751</code></pre>
</div>
<div id="ca-and-outliers" class="section level1">
<h1>CA and outliers</h1>
<p>If one or more “outliers” are present in the contingency table, they can dominate the interpretation of the axes (Bendixen M. 2003).</p>
<p>Outliers are points that have high absolute coordinate values and high contributions. They are represented, on the graph, very far from the centroid. In this case, the remaining row/column points tend to be tightly clustered in the graph, which becomes difficult to interpret.</p>
<p>In the CA output, the coordinates of row/column points represent the number of standard deviations the row/column is away from the barycentre (Bendixen M. 2003).</p>
<p><span class="warning">Outliers are points that are at least <strong>one standard deviation away from the barycentre</strong>. They also contribute significantly to the interpretation of one pole of an axis (Bendixen M. 2003).</span></p>
<p><span class="success">There are no apparent outliers in our data.</span></p>
<p><span class="notice">If there are outliers in the data, they must be suppressed or treated as supplementary points when re-running the correspondence analysis.</span></p>
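<p>The coordinate part of this rule can be screened quickly in base R. The sketch below reuses the row coordinates printed by <em>dimdesc()</em> above, rounded to three decimals; the object names are illustrative, and the check covers only the distance criterion, not the contributions.</p>
<pre class="r"><code># Row coordinates on Dim 1 and Dim 2, rounded from the dimdesc() output
dim1 <- c(hard = -0.250, finances = -0.237, unemployment = -0.212,
          work = -0.212, employment = -0.137, money = -0.115,
          housing = -0.007, egoism = 0.060, health = 0.112,
          disagreement = 0.146, future = 0.176, fear = 0.203,
          comfort = 0.210, war = 0.217, to_live = 0.308,
          economic = 0.354, circumstances = 0.401, world = 0.523)
dim2 <- c(finances = -0.206, future = -0.098, war = -0.075,
          unemployment = -0.071, fear = -0.058, egoism = -0.026,
          health = 0.004, money = 0.020, hard = 0.068, work = 0.109,
          disagreement = 0.119, housing = 0.128, world = 0.143,
          employment = 0.215, economic = 0.321, circumstances = 0.331,
          to_live = 0.502, comfort = 0.703)

# Align both axes on the same row order
coords <- cbind(Dim1 = dim1, Dim2 = dim2[names(dim1)])

# Rows lying at least one unit away from the barycentre on either axis
far_rows <- rownames(coords)[apply(abs(coords) >= 1, 1, any)]
print(far_rows)
print(max(abs(coords)))</code></pre>
<p>No row reaches the threshold on either axis, which agrees with the statement that there are no apparent outliers in this data set.</p>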
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2), <strong>FactoMineR</strong> (ver. 1.29) and <strong>factoextra</strong> (ver. 1.0.2) </span></p>
<p><strong>References and further reading</strong>:</p>
<ul>
<li>Bendixen M. 1995, Compositional perceptual mapping using chi-squared tree analysis and Correspondence Analysis, «Journal of Marketing Management», 11, 571-581.</li>
<li>Bendixen M. 2003, A Practical Guide to the Use of Correspondence Analysis in Marketing Research, Marketing Bulletin, 2003, 14, Technical Note 2. <a href="http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf">http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf</a></li>
<li>G Alberti, An R Script to Facilitate Correspondence Analysis. A Guide to the Use and the Interpretation of Results from an Archaeological Perspective, in Archeologia e Calcolatori 24 2013, 25-53. <a href="http://soi.cnr.it/archcalc/indice/PDF24/02_Alberti.pdf">http://soi.cnr.it/archcalc/indice/PDF24/02_Alberti.pdf</a></li>
<li>Greenacre M. Contribution biplots. <a href="http://www.econ.upf.edu/docs/papers/downloads/1162.pdf">http://www.econ.upf.edu/docs/papers/downloads/1162.pdf</a></li>
<li>Healey J.F. 2013, The Essentials of Statistics. A Tool for Social Research, 3rd ed., Belmont, Wadsworth.</li>
<li>Laura Doey and Jessica Kurta. Correspondence Analysis applied to psychological research. Tutorials in Quantitative Methods for Psychology 2011, Vol. 7(1), p. 5-14. <a href="http://www.tqmp.org/RegularArticles/vol07-1/p005/p005.pdf">http://www.tqmp.org/RegularArticles/vol07-1/p005/p005.pdf</a></li>
<li>François Husson. FactoMineR. <a href="http://factominer.free.fr">http://factominer.free.fr</a></li>
</ul>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->

<!-- END HTML -->]]></description>
			<pubDate>Mon, 22 Jun 2015 08:17:47 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[ade4 and factoextra : Correspondence Analysis - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/ade4-and-factoextra-correspondence-analysis-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/ade4-and-factoextra-correspondence-analysis-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<div id="TOC">
<ul>
<li><a href="#required-packages">Required packages</a></li>
<li><a href="#load-ade4-and-factoextra">Load ade4 and factoextra</a></li>
<li><a href="#data-format-contingency-tables">Data format: Contingency tables</a></li>
<li><a href="#correspondence-analysis-ca">Correspondence analysis (CA)</a></li>
<li><a href="#eigenvalues-and-scree-plot">Eigenvalues and scree plot</a><ul>
<li><a href="#extract-the-eigenvalues">Extract the eigenvalues</a></li>
<li><a href="#make-a-scree-plot-using-ade4-base-graphics">Make a scree plot using ade4 base graphics</a></li>
<li><a href="#make-the-scree-plot-using-factoextra">Make the scree plot using factoextra</a></li>
</ul></li>
<li><a href="#ca-scatter-plot-biplot-of-row-and-column-variables">CA scatter plot: Biplot of row and column variables</a></li>
<li><a href="#row-variables">Row variables</a><ul>
<li><a href="#coordinates-of-rows">Coordinates of rows</a></li>
<li><a href="#contribution-of-rows-to-the-dimensions">Contribution of rows to the dimensions</a></li>
<li><a href="#cos2-quality-of-representation-of-rows-on-the-factor-map">Cos2 : quality of representation of rows on the factor map</a></li>
</ul></li>
<li><a href="#column-variables">Column variables</a><ul>
<li><a href="#coordinates-of-columns">Coordinates of columns</a></li>
<li><a href="#contribution-of-columns">Contribution of columns</a></li>
<li><a href="#cos2-the-quality-of-representation-of-columns">Cos2 : The quality of representation of columns</a></li>
</ul></li>
<li><a href="#correspondence-analysis-using-supplementary-rows-and-columns">Correspondence analysis using supplementary rows and columns</a><ul>
<li><a href="#data">Data</a></li>
<li><a href="#r-functions">R functions</a></li>
<li><a href="#supplementary-rows">Supplementary rows</a></li>
<li><a href="#supplementary-columns">Supplementary columns</a></li>
</ul></li>
<li><a href="#further-reading">Further reading</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/></p>
<p><strong>Correspondence Analysis (CA)</strong> is an adaptation of <a href="https://www.sthda.com/english/english/wiki/factominer-and-factoextra-principal-component-analysis-visualization-r-software-and-data-mining"><strong>Principal Component Analysis</strong></a> used to analyse a contingency (or frequency) table formed by two qualitative variables.</p>
<p>A comprehensive guide for CA computing, analysis and visualization has been provided in my previous post: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a>.</p>
<p>The basic idea and the <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-basics-r-software-and-data-mining">mathematical procedures</a> of correspondence analysis are covered here: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-basics-r-software-and-data-mining">Correspondence analysis basics</a>.</p>
<p>This <strong>R tutorial</strong> describes how to compute <strong>CA</strong> using <strong>R software</strong> and the <strong>ade4</strong> package.</p>
<div id="required-packages" class="section level1">
<h1>Required packages</h1>
<p>The R packages <strong>ade4</strong> (for computing CA) and <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> (for CA visualization) are used.</p>
<p>They can be installed as follows:</p>
<pre class="r"><code>install.packages("ade4")

# install.packages("devtools")
devtools::install_github("kassambara/factoextra")</code></pre>
<p><span class="warning">Note that this tutorial requires factoextra version >= 1.0.2. If an older version is already installed on your computer, re-install it to get the latest version.</span></p>
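<p>A minimal sketch for checking whether the installed factoextra meets this requirement. The variable names <em>min_version</em> and <em>needs_update</em> are illustrative only; <em>requireNamespace()</em> and <em>packageVersion()</em> are standard R functions.</p>
<pre class="r"><code># Minimum factoextra version stated in this tutorial
min_version <- "1.0.2"

if (requireNamespace("factoextra", quietly = TRUE)) {
  installed <- utils::packageVersion("factoextra")
  # TRUE when the installed version is older than the required one
  needs_update <- !(installed >= min_version)
} else {
  # The package is not installed at all
  needs_update <- TRUE
}
needs_update</code></pre>
<p>If <em>needs_update</em> is TRUE, run the installation commands again.</p>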
</div>
<div id="load-ade4-and-factoextra" class="section level1">
<h1>Load ade4 and factoextra</h1>
<pre class="r"><code>library("ade4")
library("factoextra")</code></pre>
</div>
<div id="data-format-contingency-tables" class="section level1">
<h1>Data format: Contingency tables</h1>
<p>We’ll use the data set <em>housetasks</em> taken from the package <strong>ade4</strong>.</p>
<pre class="r"><code>data(housetasks)
# head(housetasks)</code></pre>
<p>An image of the data is shown below:</p>
<p><img src="https://www.sthda.com/english/sthda/RDoc/images/ca-housetasks.png" alt="Data format correspondence analysis" /></p>
<br/>
<div class="block">
<p>The data is a contingency table containing 13 household tasks and their distribution within the couple:</p>
<ul>
<li>rows are the different tasks</li>
<li>values are the frequencies of the tasks done:
<ul>
<li>by the <em>wife</em> only</li>
<li>alternately</li>
<li>by the husband only</li>
<li>or jointly</li>
</ul></li>
</ul>
</div>
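<p>The structure described above can be checked directly in R. This is a small sketch that assumes the <em>ade4</em> package is installed; it only uses base R inspection functions.</p>
<pre class="r"><code>library("ade4")
data(housetasks)

# Number of rows (tasks) and columns (who does the task)
dim(housetasks)

# Column names of the contingency table
colnames(housetasks)

# First three tasks with their frequencies
head(housetasks, 3)</code></pre>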
<p><br/></p>
<br/>
<div class="warning">
<p>Note that it’s possible to visualize a contingency table using the functions <strong>balloonplot()</strong> [in <em>gplots</em> package], <strong>mosaicplot()</strong> [in <em>graphics</em> package] and <strong>assoc()</strong> [in <em>vcd</em> package].</p>
<p>To learn more about these functions, read this article: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a></p>
</div>
<p><br/></p>
</div>
<div id="correspondence-analysis-ca" class="section level1">
<h1>Correspondence analysis (CA)</h1>
<p>The function <strong>dudi.coa()</strong> [in <em>ade4</em> package] can be used. A simplified format is :</p>
<pre class="r"><code>dudi.coa(df, scannf = TRUE, nf = 2)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>df</strong> : a data frame (contingency table)</li>
<li><strong>scannf</strong> : a logical value specifying whether the eigenvalues bar plot should be displayed</li>
<li><strong>nf</strong> : number of dimensions kept in the final results.</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage:</p>
<pre class="r"><code>res.ca <- dudi.coa(housetasks, scannf = FALSE, nf = 5)</code></pre>
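<p>The object returned by <em>dudi.coa()</em> is a list. The sketch below looks at a few of its components (<em>eig</em>, <em>li</em> and <em>co</em>, which are also used later in this tutorial). Although <em>nf = 5</em> is requested, a table with 13 rows and 4 columns yields at most min(13, 4) - 1 = 3 axes, so only three eigenvalues are expected.</p>
<pre class="r"><code>library("ade4")
data(housetasks)
res.ca <- dudi.coa(housetasks, scannf = FALSE, nf = 5)

# Names of the components stored in the result
names(res.ca)

# Eigenvalues (one per axis)
res.ca$eig

# Row coordinates (first rows only) and column coordinates
head(res.ca$li)
res.ca$co</code></pre>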
</div>
<div id="eigenvalues-and-scree-plot" class="section level1">
<h1>Eigenvalues and scree plot</h1>
<div id="extract-the-eigenvalues" class="section level2">
<h2>Extract the eigenvalues</h2>
<p><strong>Eigenvalues</strong> measure the amount of variation retained by a <strong>principal axis</strong> :</p>
<pre class="r"><code>summary(res.ca)</code></pre>
<pre><code>Class: coa dudi
Call: dudi.coa(df = housetasks, scannf = FALSE, nf = 5)

Total inertia: 1.115

Eigenvalues:
    Ax1     Ax2     Ax3 
 0.5429  0.4450  0.1270 

Projected inertia (%):
    Ax1     Ax2     Ax3 
  48.69   39.91   11.40 

Cumulative projected inertia (%):
    Ax1   Ax1:2   Ax1:3 
  48.69   88.60  100.00 </code></pre>
<p>You can also use the function <strong>get_eigenvalue()</strong> [in <strong>factoextra</strong> package] to extract the eigenvalues :</p>
<pre class="r"><code>eig.val <- get_eigenvalue(res.ca)
head(eig.val)</code></pre>
<pre><code>      eigenvalue variance.percent cumulative.variance.percent
Dim.1  0.5428893         48.69222                    48.69222
Dim.2  0.4450028         39.91269                    88.60491
Dim.3  0.1270484         11.39509                   100.00000</code></pre>
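<p>The percentages above follow directly from the eigenvalues: each one is divided by the total inertia (the sum of all eigenvalues). As a worked sketch in base R, using the eigenvalues printed above (the variable names are illustrative):</p>
<pre class="r"><code># Eigenvalues copied from the output above
eig <- c(Dim.1 = 0.5428893, Dim.2 = 0.4450028, Dim.3 = 0.1270484)

# Total inertia is the sum of the eigenvalues
total.inertia <- sum(eig)

# Percentage of inertia retained by each axis, and cumulative percentage
variance.percent <- 100 * eig / total.inertia
cumulative.percent <- cumsum(variance.percent)

round(cbind(eigenvalue = eig, variance.percent, cumulative.percent), 2)</code></pre>
<p>Up to rounding, this should reproduce the table returned by <em>get_eigenvalue()</em>.</p>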
</div>
<div id="make-a-scree-plot-using-ade4-base-graphics" class="section level2">
<h2>Make a scree plot using ade4 base graphics</h2>
<p>The function <strong>screeplot()</strong> can be used to draw the amount of inertia (variance) retained by the dimensions.</p>
<p>A simplified format is:</p>
<pre class="r"><code>screeplot(x, npcs = length(x$eig), type = c("barplot", "lines"))</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>x</strong> : an object of class dudi</li>
<li><strong>npcs</strong> : the number of components to be plotted</li>
<li><strong>type</strong> : the type of plot</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>screeplot(res.ca, main ="Screeplot - Eigenvalues")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-screeplot-ade4-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="336" style="margin-bottom:10px;" /></p>
<p><span class="success">~89% of the information contained in the data is retained by the first two dimensions.</span></p>
</div>
<div id="make-the-scree-plot-using-factoextra" class="section level2">
<h2>Make the scree plot using factoextra</h2>
<p>It’s also possible to use the function <strong>fviz_screeplot()</strong> [in <em>factoextra</em>] to make the <strong>scree plot</strong>. In the R code below, we’ll draw the percentage of variance retained by each component:</p>
<pre class="r"><code>fviz_screeplot(res.ca, ncp=3)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-eigenvalue-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="336" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about eigenvalues and screeplot: <a href="https://www.sthda.com/english/english/wiki/eigenvalues-quick-data-visualization-with-factoextra-r-software-and-data-mining">Eigenvalues data visualization</a></span></p>
</div>
</div>
<div id="ca-scatter-plot-biplot-of-row-and-column-variables" class="section level1">
<h1>CA scatter plot: Biplot of row and column variables</h1>
<p>The function <strong>scatter()</strong> or <strong>biplot()</strong> can be used as follows:</p>
<pre class="r"><code># Remove the scree plot (posieig ="none")
scatter(res.ca, posieig = "none")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-biplot-ade4-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="notice">By default, the scree plot is displayed on the <strong>scatter plot</strong>. The argument <strong>posieig =“none”</strong> is used to remove the <strong>scree plot</strong>.</span></p>
<p><span class="warning">Note that, if you want to remove row or column labels the argument <strong>clab.row = 0</strong> or <strong>clab.col = 0</strong> can be used.</span></p>
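<p>As a short sketch of these label options (assuming <em>ade4</em> is installed and using the same data and settings as above):</p>
<pre class="r"><code>library("ade4")
data(housetasks)
res.ca <- dudi.coa(housetasks, scannf = FALSE, nf = 5)

# Hide row labels, keep column labels
scatter(res.ca, posieig = "none", clab.row = 0)

# Hide column labels, keep row labels
scatter(res.ca, posieig = "none", clab.col = 0)</code></pre>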
<p>A biplot can also be drawn by combining the two functions below:</p>
<ul>
<li>s.label() to plot rows or columns as points</li>
<li>s.arrow() to add rows or columns as arrows</li>
</ul>
<pre class="r"><code># Plot of rows as points
s.label(res.ca$li, xax = 1, yax = 2)
# Add column variables as arrows
s.arrow(res.ca$co, add.plot = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-biplot2-ade4-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s also possible to use the function <strong>fviz_ca_biplot()</strong>[in <em>factoextra</em> package] to draw a nice looking plot:</p>
<pre class="r"><code>fviz_ca_biplot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-ca-biplot-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change the theme
fviz_ca_biplot(res.ca) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-ca-biplot-factoextra-data-mining-2.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">The graph above is called a <strong>symmetric plot</strong>; it represents row and column profiles. Rows are represented by blue points and columns by red triangles. </span></p>
<p><span class="warning">Read more about <em>fviz_ca_biplot()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_biplot</a></span></p>
</div>
<div id="row-variables" class="section level1">
<h1>Row variables</h1>
<p>The simplest way is to use the function <strong>get_ca_row()</strong> [in <em>factoextra</em>] to extract the results for row variables. This function returns a list containing the coordinates, the cos2 and the contribution of row variables:</p>
<pre class="r"><code>row <- get_ca_row(res.ca)
row</code></pre>
<pre><code>Correspondence Analysis - Results for rows
 ===================================================
  Name       Description                
1 "$coord"   "Coordinates for the rows" 
2 "$cos2"    "Cos2 for the rows"        
3 "$contrib" "contributions of the rows"
4 "$inertia" "Inertia of the rows"      </code></pre>
<pre class="r"><code># Print the coordinates
head(row$coord)</code></pre>
<pre><code>               Dim.1      Dim.2       Dim.3
Laundry    0.9918368 -0.4953220 -0.31672897
Main_meal  0.8755855 -0.4901092 -0.16406487
Dinner     0.6925740 -0.3081043 -0.20741377
Breakfeast 0.5086002 -0.4528038  0.22040453
Tidying    0.3938084  0.4343444 -0.09421375
Dishes     0.1889641  0.4419662  0.26694926</code></pre>
<p><span class="notice">In the next section, I’ll show how to extract row coordinates, cos2 and contribution using <strong>ade4</strong> base code.</span></p>
<div id="coordinates-of-rows" class="section level2">
<h2>Coordinates of rows</h2>
<p>The coordinates of the rows on the factor map are :</p>
<pre class="r"><code>head(res.ca$li)</code></pre>
<pre><code>               Axis1      Axis2       Axis3
Laundry    0.9918368 -0.4953220 -0.31672897
Main_meal  0.8755855 -0.4901092 -0.16406487
Dinner     0.6925740 -0.3081043 -0.20741377
Breakfeast 0.5086002 -0.4528038  0.22040453
Tidying    0.3938084  0.4343444 -0.09421375
Dishes     0.1889641  0.4419662  0.26694926</code></pre>
<p>Use the function <strong>fviz_ca_row()</strong> [in <em>factoextra</em> package] to visualize only row points:</p>
<pre class="r"><code># Default plot
fviz_ca_row(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-ca-row-points-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="notice">
<p>Note that, it’s also possible to plot rows only using the <em>ade4</em> base graph:</p>
<pre class="r"><code>s.label(res.ca$li, xax = 1, yax = 2)</code></pre>
</div>
<p><br/></p>
</div>
<div id="contribution-of-rows-to-the-dimensions" class="section level2">
<h2>Contribution of rows to the dimensions</h2>
<p>The <strong>cos2</strong> and the <strong>contributions</strong> of rows / columns are calculated using the function <strong>inertia.dudi()</strong> as follows:</p>
<pre class="r"><code>inertia <- inertia.dudi(res.ca, row.inertia = TRUE,
                        col.inertia = TRUE)</code></pre>
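<p><span class="warning">Note that the contributions and the cos2 are expressed in units of 1/10 000. The sign is that of the coordinates.</span></p>
<p>The contributions can be printed in % as follows. The example uses the column contributions (<em>inertia$col.abs</em>); the row contributions are stored in <em>inertia$row.abs</em>:</p>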
<pre class="r"><code># absolute contribution of columns
contrib <- inertia$col.abs/100
head(contrib)</code></pre>
<pre><code>            Comp1 Comp2 Comp3
Wife        44.46 10.31 10.82
Alternating  0.10  2.78 82.55
Husband     54.23 17.79  6.13
Jointly      1.20 69.12  0.50</code></pre>
<p><span class="warning">Recall that, as mentioned above, the simplest way is to use the function <strong>get_ca_row()</strong> [in <em>factoextra</em> package]. It provides a list of matrices containing all the results for the active rows (coordinates, squared cosine and contributions).</span></p>
<pre class="r"><code>row <- get_ca_row(res.ca)
row</code></pre>
<pre><code>Correspondence Analysis - Results for rows
 ===================================================
  Name       Description                
1 "$coord"   "Coordinates for the rows" 
2 "$cos2"    "Cos2 for the rows"        
3 "$contrib" "contributions of the rows"
4 "$inertia" "Inertia of the rows"      </code></pre>
<pre class="r"><code># Row contributions
row$contrib</code></pre>
<pre><code>           Dim.1 Dim.2 Dim.3
Laundry    18.29  5.56  7.97
Main_meal  12.39  4.74  1.86
Dinner      5.47  1.32  2.10
Breakfeast  3.82  3.70  3.07
Tidying     2.00  2.97  0.49
Dishes      0.43  2.84  3.63
Shopping    0.18  2.52  2.22
Official    0.52  0.80 36.94
Driving     8.08  7.65 18.60
Finances    0.88  5.56  0.06
Insurance   6.15  4.02  5.25
Repairs    40.73 15.88 16.60
Holidays    1.08 42.45  1.21</code></pre>
<p><span class="success">The row categories with the largest values contribute the most to the definition of the dimensions.</span></p>
<p>The function <strong>fviz_contrib()</strong>[in <em>factoextra</em>] can be used to visualize the most important row variables:</p>
<pre class="r"><code># Contributions of rows on Dim.1
fviz_contrib(res.ca, choice = "row", axes = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-ca-row-contribution-dim-1-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="warning">
<ul>
<li><p>The red dashed line represents the expected average row contributions if the contributions were uniform: 1/nrow(housetasks) = 1/13 = 7.69%.</p></li>
<li>For a given dimension, any row with a contribution above this threshold could be considered as important in contributing to that dimension.</li>
</ul>
</div>
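<p>The threshold rule can be applied by hand. The sketch below uses the Dim.1 contributions printed earlier (copied as a plain vector; the variable names are illustrative) and keeps the rows above the uniform-contribution threshold.</p>
<pre class="r"><code># Row contributions (%) to Dim.1, copied from the output above
contrib.dim1 <- c(Laundry = 18.29, Main_meal = 12.39, Dinner = 5.47,
                  Breakfeast = 3.82, Tidying = 2.00, Dishes = 0.43,
                  Shopping = 0.18, Official = 0.52, Driving = 8.08,
                  Finances = 0.88, Insurance = 6.15, Repairs = 40.73,
                  Holidays = 1.08)

# Expected contribution (%) if all rows contributed equally
threshold <- 100 / length(contrib.dim1)
round(threshold, 2)

# Rows contributing more than the threshold, largest first
important <- sort(contrib.dim1[contrib.dim1 > threshold], decreasing = TRUE)
names(important)</code></pre>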
<p><br/></p>
<p><span class="success"> The row items <em>Repairs, Laundry, Main_meal and Driving</em> contribute the most to the definition of the first axis.</span></p>
<pre class="r"><code># Contributions of rows on Dim.2
fviz_contrib(res.ca, choice = "row", axes = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-ca-row-contribution-dim-2-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_contrib()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-contrib-quick-visualization-of-row-column-contributions-r-software-and-data-mining">fviz_contrib</a></span></p>
<p><span class="warning">Using <strong>factoextra</strong> package, the color of rows can be automatically controlled by the value of their contributions</span></p>
<pre class="r"><code>fviz_ca_row(res.ca, col.row="contrib")+
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=10)+theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-rows-graph-colors-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success">The graph above highlights the most important rows in the correspondence analysis solution.</span></p>
<p><span class="warning">Read more about <em>fviz_ca_row()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-ca-quick-correspondence-analysis-data-visualization-using-factoextra-r-software-and-data-mining">fviz_ca_row</a></span></p>
</div>
<div id="cos2-quality-of-representation-of-rows-on-the-factor-map" class="section level2">
<h2>Cos2 : quality of representation of rows on the factor map</h2>
<ul>
<li>A high cos2 indicates a good representation of the row on the factor map.</li>
<li>A low cos2 indicates that the row is not well represented by the principal dimensions.</li>
</ul>
<p>The cos2 of the rows are (<em>factoextra</em> code) :</p>
<pre class="r"><code>head(row$cos2)</code></pre>
<pre><code>            Dim.1  Dim.2  Dim.3
Laundry    0.7400 0.1846 0.0755
Main_meal  0.7416 0.2324 0.0260
Dinner     0.7766 0.1537 0.0697
Breakfeast 0.5049 0.4002 0.0948
Tidying    0.4398 0.5350 0.0252
Dishes     0.1181 0.6462 0.2357</code></pre>
<p>Note that, the <strong>ade4</strong> code is:</p>
<pre class="r"><code># relative contributions of rows
cos2 <- abs(inertia$row.rel/10000)
head(cos2)</code></pre>
<p>The values of the cos2 lie between 0 and 1.</p>
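<p>For a given row, the cos2 values over all dimensions add up to 1, and the quality of representation on a plane is the sum of the cos2 on its two axes. A small base R sketch using the values printed above (copied into a matrix; the object names are illustrative):</p>
<pre class="r"><code># Row cos2 copied from the output above
cos2 <- matrix(c(0.7400, 0.1846, 0.0755,
                 0.7416, 0.2324, 0.0260,
                 0.7766, 0.1537, 0.0697,
                 0.5049, 0.4002, 0.0948,
                 0.4398, 0.5350, 0.0252,
                 0.1181, 0.6462, 0.2357),
               ncol = 3, byrow = TRUE,
               dimnames = list(c("Laundry", "Main_meal", "Dinner",
                                 "Breakfeast", "Tidying", "Dishes"),
                               c("Dim.1", "Dim.2", "Dim.3")))

# Sum over all dimensions: close to 1 for every row (up to rounding)
rowSums(cos2)

# Quality of representation on the plane formed by Dim.1 and Dim.2
quality.12 <- rowSums(cos2[, c("Dim.1", "Dim.2")])
round(quality.12, 4)</code></pre>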
<p>The function <strong>fviz_cos2()</strong>[in <em>factoextra</em>] can be used to draw a bar plot of rows cos2:</p>
<pre class="r"><code># Cos2 of rows on Dim.1 and Dim.2
fviz_cos2(res.ca, choice = "row", axes = 1:2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-ca-row-cos2-dim-1-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Note that all row points except <em>Official</em> are well represented by the first two dimensions. The position of the point corresponding to the item <em>Official</em> on the scatter plot should be interpreted with some caution.</span></p>
<p><span class="warning">Using <strong>factoextra</strong> package, the color of rows can be automatically controlled by the value of their cos2.</span></p>
<pre class="r"><code>fviz_ca_row(res.ca, col.row="cos2")+
scale_color_gradient2(low="white", mid="blue", 
      high="red", midpoint=0.5) + theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-cos2-colors-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_cos2()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-cos2-quick-visualization-of-the-quality-of-representation-of-rows-columns-r-software-and-data-mining">fviz_cos2</a></span></p>
</div>
</div>
<div id="column-variables" class="section level1">
<h1>Column variables</h1>
<p>The function <strong>get_ca_col()</strong> [in <em>factoextra</em>] is used to extract the results for column variables. This function returns a list containing the coordinates, the cos2 and the contribution of column variables:</p>
<pre class="r"><code>col <- get_ca_col(res.ca)
col</code></pre>
<pre><code>Correspondence Analysis - Results for columns
 ===================================================
  Name       Description                   
1 "$coord"   "Coordinates for the columns" 
2 "$cos2"    "Cos2 for the columns"        
3 "$contrib" "contributions of the columns"
4 "$inertia" "Inertia of the columns"      </code></pre>
<pre class="r"><code># Coordinates
col$coord</code></pre>
<pre><code>                  Dim.1      Dim.2       Dim.3
Wife         0.83762154 -0.3652207 -0.19991139
Alternating  0.06218462 -0.2915938  0.84858939
Husband     -1.16091847 -0.6019199 -0.18885924
Jointly     -0.14942609  1.0265791 -0.04644302</code></pre>
<p><span class="warning"> The results for columns give the same information as described for rows. For this reason, I’ll just display the results for columns in this section without commenting.</span></p>
<div id="coordinates-of-columns" class="section level2">
<h2>Coordinates of columns</h2>
<p>The coordinates of the columns on the factor maps can be extracted as follows:</p>
<pre class="r"><code># ade4 code
head(res.ca$co)</code></pre>
<pre><code>                  Comp1      Comp2       Comp3
Wife         0.83762154 -0.3652207 -0.19991139
Alternating  0.06218462 -0.2915938  0.84858939
Husband     -1.16091847 -0.6019199 -0.18885924
Jointly     -0.14942609  1.0265791 -0.04644302</code></pre>
<p>Use the function <strong>fviz_ca_col()</strong> [in <em>factoextra</em>] to visualize only column points:</p>
<pre class="r"><code>fviz_ca_col(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-ca-columns-graph-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>
<div class="notice">
<p>Note that, it’s also possible to plot columns only using the <em>ade4</em> base graph:</p>
<pre class="r"><code>s.label(res.ca$co, xax = 1, yax = 2)</code></pre>
</div>
<p><br/></p>
</div>
<div id="contribution-of-columns" class="section level2">
<h2>Contribution of columns</h2>
<p>The contributions can be printed in % as follows:</p>
<pre class="r"><code># absolute contributions of columns
# ade4 code
contrib <- inertia$col.abs/100
head(contrib)</code></pre>
<pre><code>            Comp1 Comp2 Comp3
Wife        44.46 10.31 10.82
Alternating  0.10  2.78 82.55
Husband     54.23 17.79  6.13
Jointly      1.20 69.12  0.50</code></pre>
<p><span class="success">It’s simpler to use the function get_ca_col() [in factoextra package], which provides a list of matrices containing all the results for the active columns (coordinates, squared cosine and contributions).</span></p>
<pre class="r"><code>columns <- get_ca_col(res.ca)
columns</code></pre>
<pre><code>Correspondence Analysis - Results for columns
 ===================================================
  Name       Description                   
1 "$coord"   "Coordinates for the columns" 
2 "$cos2"    "Cos2 for the columns"        
3 "$contrib" "contributions of the columns"
4 "$inertia" "Inertia of the columns"      </code></pre>
<pre class="r"><code># Contributions of columns
head(columns$contrib)</code></pre>
<pre><code>            Dim.1 Dim.2 Dim.3
Wife        44.46 10.31 10.82
Alternating  0.10  2.78 82.55
Husband     54.23 17.79  6.13
Jointly      1.20 69.12  0.50</code></pre>
<p>Use the function <strong>fviz_contrib()</strong> [in <em>factoextra</em> package] to visualize the most contributing columns:</p>
<pre class="r"><code># Contributions of columns on Dim.1
fviz_contrib(res.ca, choice = "col", axes = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-columns-contribution-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="336" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Contributions of columns on Dim.2
fviz_contrib(res.ca, choice = "col", axes = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-columns-contribution-data-mining-2.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="336" style="margin-bottom:10px;" /></p>
<p><span class="warning">Read more about <em>fviz_contrib()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-contrib-quick-visualization-of-row-column-contributions-r-software-and-data-mining">fviz_contrib</a></span></p>
<p><strong>Draw a scatter plot of column points</strong> and highlight columns according to the amount of their contributions. The function <strong>fviz_ca_col()</strong> [in <em>factoextra</em>] is used:</p>
<pre class="r"><code># Control column point colors using their contribution
# Possible values for the argument col.col are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_ca_col(res.ca, col.col="contrib")+
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=24.5)+theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-columns-graph-colors-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
</div>
<div id="cos2-the-quality-of-representation-of-columns" class="section level2">
<h2>Cos2 : The quality of representation of columns</h2>
<pre class="r"><code># relative contributions of columns
cos2 <- abs(inertia$col.rel)/10000
head(cos2)</code></pre>
<pre><code>             Comp1  Comp2  Comp3 con.tra
Wife        0.8019 0.1524 0.0457  0.2700
Alternating 0.0048 0.1051 0.8901  0.1057
Husband     0.7720 0.2075 0.0204  0.3421
Jointly     0.0207 0.9773 0.0020  0.2823</code></pre>
<p>The function <strong>fviz_cos2()</strong> [in <em>factoextra</em>] can be used to draw a bar plot of the column cos2 values:</p>
<pre class="r"><code># Cos2 of columns on Dim.1 and Dim.2
fviz_cos2(res.ca, choice = "col", axes = 1:2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-ca-columns-cos2-dim-1-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Note that only the column item <em>Alternating</em> is poorly represented on the first two dimensions. Its position in the space formed by dimensions 1 and 2 must be interpreted with caution.</span></p>
<p><span class="warning">Read more about <em>fviz_cos2()</em>: <a href="https://www.sthda.com/english/english/wiki/fviz-cos2-quick-visualization-of-the-quality-of-representation-of-rows-columns-r-software-and-data-mining">fviz_cos2</a></span></p>
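<p>The caution about <em>Alternating</em> can be checked numerically: the quality of representation on a plane is the sum of the cos2 values over its two axes. The sketch below simply re-types the cos2 values printed above into a matrix (the object names <em>cos2.tab</em> and <em>quality.12</em> are introduced here for illustration only) and adds up the first two columns:</p>
<pre class="r"><code># cos2 values copied from the output above (Comp1 to Comp3)
cos2.tab <- matrix(c(0.8019, 0.1524, 0.0457,
                     0.0048, 0.1051, 0.8901,
                     0.7720, 0.2075, 0.0204,
                     0.0207, 0.9773, 0.0020),
                   ncol = 3, byrow = TRUE,
                   dimnames = list(c("Wife", "Alternating", "Husband", "Jointly"),
                                   c("Comp1", "Comp2", "Comp3")))
# Quality of representation on the plane formed by dimensions 1 and 2
quality.12 <- rowSums(cos2.tab[, 1:2])
round(sort(quality.12), 4)</code></pre>
<p>With these values, <em>Alternating</em> reaches only about 0.11 on the first two dimensions, whereas the three other columns are above 0.95.</p>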
</div>
</div>
<div id="correspondence-analysis-using-supplementary-rows-and-columns" class="section level1">
<h1>Correspondence analysis using supplementary rows and columns</h1>
<div id="data" class="section level2">
<h2>Data</h2>
<p>We’ll use the data set <em>children</em> available on the STHDA website. It contains 18 rows and 8 columns:</p>
<pre class="r"><code>ff <- "https://www.sthda.com/sthda/RDoc/data/ca-children.txt"
children <- read.table(file = ff, sep ="\t", 
                       header = TRUE, row.names = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/images/ca-children.png" alt="Data format correspondence analysis" /></p>
<p><span class="notice"> The data used here is a contingency table describing the answers given by different categories of people to the following question: What are the reasons that can make a woman or a couple hesitate to have children? (source of the data: FactoMineR package) </span></p>
<br/>

<div class="warning">
<p>Only some of the rows and columns will be used to compute the correspondence analysis (CA).</p>
<p>The coordinates of the remaining (supplementary) rows/columns on the factor map will be <strong>predicted</strong> after the CA.</p>
</div>
<p><br/></p>
<p>In CA terminology, our data contain:</p>
<br/>
<div class="block">
<ul>
<li><strong>Active rows</strong> (rows 1:14) : Rows that are used during the correspondence analysis.</li>
<li><strong>Supplementary rows</strong> (row.sup 15:18) : The coordinates of these rows will be predicted using the CA information and parameters obtained with the active rows/columns.</li>
<li><strong>Active columns</strong> (columns 1:5) : Columns that are used for the correspondence analysis.</li>
<li><strong>Supplementary columns</strong> (col.sup 6:8) : As with supplementary rows, the coordinates of these columns will also be predicted.</li>
</ul>
</div>
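<p>The index ranges listed above translate directly into R subsetting. The sketch below uses a hypothetical stand-in matrix named <em>toy</em> with the same shape as <em>children</em> (18 rows, 8 columns), so that it runs without downloading the data; the values in it are meaningless:</p>
<pre class="r"><code># Hypothetical stand-in with the same shape as 'children' (18 x 8)
toy <- matrix(seq_len(18 * 8), nrow = 18, ncol = 8,
              dimnames = list(paste0("row", 1:18), paste0("col", 1:8)))
active  <- toy[1:14, 1:5]                  # active rows x active columns
row.sup <- toy[15:18, 1:5, drop = FALSE]   # supplementary rows, active columns only
col.sup <- toy[1:14, 6:8, drop = FALSE]    # active rows only, supplementary columns
# Dimensions (rows, columns) of each block
sapply(list(active = active, row.sup = row.sup, col.sup = col.sup), dim)</code></pre>
<p>Note that supplementary rows are restricted to the active columns, and supplementary columns to the active rows, so that each block lines up with the table used to compute the CA.</p>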
<p><br/></p>
</div>
<div id="r-functions" class="section level2">
<h2>R functions</h2>
<p>The functions <strong>suprow()</strong> and <strong>supcol()</strong> [in ade4 package] are used to calculate the coordinates of supplementary rows and columns, respectively.</p>
<p>The simplified formats are :</p>
<pre class="r"><code># For supplementary rows
suprow(x, Xsup)

# For supplementary columns
supcol(x, Xsup)</code></pre>
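<p>For intuition, the prediction step can be sketched in base R. A supplementary row is commonly projected with the CA transition formula: its profile (counts divided by the row total) is multiplied by the standard coordinates of the active columns. The table <em>N</em>, the helper <em>project.rows()</em> and all object names below are hypothetical, and the internal implementation in ade4 may differ in details:</p>
<pre class="r"><code># Hypothetical small contingency table (4 active rows x 3 active columns)
N <- matrix(c(20,  5,  3,
               8, 15,  6,
               4,  7, 18,
              10, 10, 10), nrow = 4, byrow = TRUE)
P <- N / sum(N)                 # correspondence matrix
row.mass <- rowSums(P)
col.mass <- colSums(P)
# Standardized residuals and their singular value decomposition
S <- diag(1 / sqrt(row.mass)) %*% (P - row.mass %o% col.mass) %*%
     diag(1 / sqrt(col.mass))
sv <- svd(S)
k <- min(dim(N)) - 1            # number of non-trivial axes
col.std   <- diag(1 / sqrt(col.mass)) %*% sv$v[, 1:k]      # column standard coordinates
row.princ <- diag(1 / sqrt(row.mass)) %*% sv$u[, 1:k] %*%
             diag(sv$d[1:k])                               # row principal coordinates
# Transition formula: row profiles times column standard coordinates
project.rows <- function(X) sweep(X, 1, rowSums(X), "/") %*% col.std
# Sanity check: projecting the active rows recovers their own coordinates
max(abs(project.rows(N) - row.princ))
# Coordinates predicted for a new (supplementary) row
new.row <- matrix(c(6, 9, 12), nrow = 1)
project.rows(new.row)</code></pre>
<p>The sanity check should return a value that is zero up to floating-point error. Supplementary columns follow the symmetric rule: column profiles multiplied by the standard coordinates of the active rows.</p>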
</div>
<div id="supplementary-rows" class="section level2">
<h2>Supplementary rows</h2>
<pre class="r"><code># Data for the supplementary rows
row.sup <- children[15:18, 1:5, drop = FALSE]
head(row.sup)</code></pre>
<pre><code>             unqualified cep bepc high_school_diploma university
comfort                2   4    3                   1          4
disagreement           2   8    2                   5          2
world                  1   5    4                   6          3
to_live                3   3    1                   3          4</code></pre>
<p><strong>STEP 1/2 - CA using active rows/columns</strong>:</p>
<pre class="r"><code>d.active <- children[1:14, 1:5]
res.ca <- dudi.coa(d.active, scannf = FALSE, nf =5)</code></pre>
<p><strong>STEP 2/2 - Predict the coordinates of the supplementary rows</strong>:</p>
<pre class="r"><code>row.sup.ca <- suprow(res.ca, row.sup)
names(row.sup.ca)</code></pre>
<pre><code>[1] "tabsup" "lisup" </code></pre>
<pre class="r"><code># coordinates 
row.sup.coord <- row.sup.ca$lisup
head(row.sup.coord)</code></pre>
<pre><code>                 Axis1     Axis2      Axis3      Axis4
comfort      0.2096705 0.7031677 0.07111168  0.3071354
disagreement 0.1462777 0.1190106 0.17108916 -0.3132169
world        0.5233045 0.1429707 0.08399269 -0.1063597
to_live      0.3083067 0.5020193 0.52093397  0.2557357</code></pre>
<p><span class="question">How to visualize supplementary rows on the factor map?</span></p>
<p>The function <strong>fviz_add()</strong> is used :</p>
<pre class="r"><code># Plot of active rows
p <- fviz_ca_row(res.ca)
# Add supplementary rows
fviz_add(p, row.sup.coord, color ="darkgreen")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-supplementary-rows-factoextra-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="supplementary-columns" class="section level2">
<h2>Supplementary columns</h2>
<pre class="r"><code># Data for the supplementary columns
col.sup <- children[1:14, 6:8, drop = FALSE]
head(col.sup)</code></pre>
<pre><code>              thirty fifty more_fifty
money             59    66         70
future           115   117         86
unemployment      79    88        177
circumstances      9     8          5
hard               2    17         18
economic          18    19         17</code></pre>
<p><span class="notice">Recall that rows 15:18 are supplementary rows. They are excluded from the current analysis, which is why only rows 1:14 are extracted. </span></p>
<p><strong>Predict the coordinates of the supplementary columns</strong> :</p>
<pre class="r"><code>col.sup.ca <- supcol(res.ca, col.sup)
names(col.sup.ca)</code></pre>
<pre><code>[1] "tabsup" "cosup" </code></pre>
<pre class="r"><code># coordinates 
col.sup.coord <- col.sup.ca$cosup
head(col.sup.coord)</code></pre>
<pre><code>                 Comp1       Comp2       Comp3       Comp4
thirty      0.10541339 -0.05969594 -0.10322613  0.06977996
fifty      -0.01706444  0.04907657 -0.01568923 -0.01306117
more_fifty -0.17706810 -0.04813788  0.10077299 -0.08517528</code></pre>
<p><strong>Visualize supplementary columns on the factor map using factoextra :</strong></p>
<pre class="r"><code># Plot of active columns
p <- fviz_ca_col(res.ca)
# Add supplementary columns
fviz_add(p, col.sup.coord , color ="darkgreen")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-correspondence-analysis-supplementary-columns-data-mining-1.png" title="ade4 and factoextra : correspondence analysis - R software and data mining" alt="ade4 and factoextra : correspondence analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="further-reading" class="section level1">
<h1>Further reading</h1>
<p>To learn more about CA, read this article: <a href="https://www.sthda.com/english/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining">Correspondence Analysis in R: The Ultimate Guide for the Analysis, the Visualization and the Interpretation</a></p>
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2), <strong>ade4</strong> (ver. 1.6-2) and <strong>factoextra</strong> (ver. 1.0.2) </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Sun, 21 Jun 2015 19:40:26 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Correspondence analysis basics - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/correspondence-analysis-basics-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/correspondence-analysis-basics-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->

            
  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<div id="TOC">
<ul>
<li><a href="#required-package">Required package</a></li>
<li><a href="#load-factominer-and-factoextra">Load FactoMineR and factoextra</a></li>
<li><a href="#data-format-contingency-tables">Data format: Contingency tables</a></li>
<li><a href="#visualize-a-contingency-table-using-graphical-matrix">Visualize a contingency table using graphical matrix</a></li>
<li><a href="#row-sums-and-column-sums">Row sums and column sums</a></li>
<li><a href="#row-variables">Row variables</a><ul>
<li><a href="#row-profiles">Row profiles</a></li>
<li><a href="#distance-or-similarity-between-row-profiles">Distance (or similarity) between row profiles</a></li>
<li><a href="#squared-distance-between-each-row-profile-and-the-average-row-profile">Squared distance between each row profile and the average row profile</a></li>
<li><a href="#distance-matrix">Distance matrix</a></li>
<li><a href="#row-mass-and-inertia">Row mass and inertia</a></li>
<li><a href="#row-summary">Row summary</a></li>
</ul></li>
<li><a href="#column-variables">Column variables</a><ul>
<li><a href="#column-profiles">Column profiles</a></li>
<li><a href="#distance-similarity-between-column-profiles">Distance (similarity) between column profiles</a></li>
<li><a href="#squared-distance-between-each-column-profile-and-the-average-column-profile">Squared distance between each column profile and the average column profile</a></li>
<li><a href="#distance-matrix-1">Distance matrix</a></li>
<li><a href="#column-mass-and-inertia">column mass and inertia</a></li>
<li><a href="#column-summary">Column summary</a></li>
</ul></li>
<li><a href="#association-between-row-and-column-variables">Association between row and column variables</a><ul>
<li><a href="#chi-square-test">Chi-square test</a></li>
<li><a href="#chi-square-statistic-and-the-total-inertia">Chi-square statistic and the total inertia</a></li>
</ul></li>
<li><a href="#graphical-representation-of-a-contingency-table-mosaic-plot">Graphical representation of a contingency table: Mosaic plot</a></li>
<li><a href="#g-test-likelihood-ratio-test">G-test: Likelihood ratio test</a><ul>
<li><a href="#likelihood-ratio-test-in-r">Likelihood ratio test in R</a></li>
<li><a href="#interpret-the-association-between-rows-and-columns-using-likelihood-ratio">Interpret the association between rows and columns using likelihood ratio</a></li>
</ul></li>
<li><a href="#correspondence-analysis">Correspondence analysis</a></li>
<li><a href="#ca---singular-value-decomposition-of-the-standardized-residuals">CA - Singular value decomposition of the standardized residuals</a><ul>
<li><a href="#eigenvalues-and-screeplot">Eigenvalues and screeplot</a></li>
<li><a href="#row-coordinates">Row coordinates</a></li>
<li><a href="#column-coordinates">Column coordinates</a></li>
<li><a href="#biplot-of-rows-and-columns-to-view-the-association">Biplot of rows and columns to view the association</a></li>
<li><a href="#diagnostic">Diagnostic</a></li>
<li><a href="#contribution-of-rows-and-columns">Contribution of rows and columns</a></li>
<li><a href="#quality-of-the-representation">Quality of the representation</a></li>
<li><a href="#cos2-of-columns">Cos2 of columns</a></li>
<li><a href="#supplementary-rowscolumns">Supplementary rows/columns</a><ul>
<li><a href="#the-supplementary-row-coordinates">The supplementary row coordinates</a></li>
</ul></li>
</ul></li>
<li><a href="#packages-in-r">Packages in R</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/></p>
<p><strong>Correspondence analysis</strong> (<strong>CA</strong>) is an extension of <a href="https://www.sthda.com/english/english/wiki/factominer-and-factoextra-principal-component-analysis-visualization-r-software-and-data-mining"><strong>Principal Component Analysis</strong> (<strong>PCA</strong>)</a> suited to analyzing frequencies formed by <strong>qualitative variables</strong> (i.e., a <strong>contingency table</strong>).</p>
<p>This <strong>R tutorial</strong> describes the idea and the mathematical procedures of <strong>Correspondence Analysis</strong> (<strong>CA</strong>) using <strong>R software</strong>.</p>
<p>The mathematical procedures of <strong>CA</strong> are complex and require matrix algebra.</p>
<p><span class="success">In this tutorial, I put a lot of effort into writing all the formulas in a very simple format so that every beginner can understand the methods.</span></p>
<div id="required-package" class="section level1">
<h1>Required package</h1>
<p><strong>FactoMineR</strong> (for computing CA) and <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> (for CA visualization) packages are used.</p>
<p>These packages can be installed as follows:</p>
<pre class="r"><code>install.packages("FactoMineR")

# install.packages("devtools")
devtools::install_github("kassambara/factoextra")</code></pre>
</div>
<div id="load-factominer-and-factoextra" class="section level1">
<h1>Load FactoMineR and factoextra</h1>
<pre class="r"><code>library("FactoMineR")
library("factoextra")</code></pre>
</div>
<div id="data-format-contingency-tables" class="section level1">
<h1>Data format: Contingency tables</h1>
<p>We’ll use the data set <strong>housetasks</strong> [in <em>factoextra</em>]:</p>
<pre class="r"><code>data(housetasks)
head(housetasks)</code></pre>
<pre><code>           Wife Alternating Husband Jointly
Laundry     156          14       2       4
Main_meal   124          20       5       4
Dinner       77          11       7      13
Breakfeast   82          36      15       7
Tidying      53          11       1      57
Dishes       32          24       4      53</code></pre>
<p>An image of the data is shown below:</p>
<p><img src="https://www.sthda.com/english/sthda/RDoc/images/ca-housetasks.png" alt="Data format correspondence analysis" /></p>
<br/>
<div class="block">
<p>The data is a contingency table containing 13 household tasks and their distribution within the couple:</p>
<ul>
<li>rows are the different tasks</li>
<li>values are the frequencies of the tasks done:
<ul>
<li>by the <em>wife</em> only</li>
<li>alternately</li>
<li>by the <em>husband</em> only</li>
<li>or jointly</li>
</ul></li>
</ul>
</div>
<p><br/></p>
<p>As the above contingency table is not very large, with a quick visual examination it can be seen that:</p>
<ul>
<li>The house tasks <em>Laundry, Main_meal and Dinner</em> are dominant in the column <em>Wife</em></li>
<li><em>Repairs</em> are dominant in the column <em>Husband</em></li>
<li><em>Holidays</em> are dominant in the column <em>Jointly</em></li>
</ul>
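<p>The visual reading above can be cross-checked by asking R for the largest cell in each row. The sketch below re-types only the six rows printed by <em>head(housetasks)</em>, so <em>Repairs</em> and <em>Holidays</em> are not covered; the object names <em>ht6</em> and <em>dominant</em> are illustrative:</p>
<pre class="r"><code># First six rows of housetasks, copied from the output above
ht6 <- matrix(c(156, 14,  2,  4,
                124, 20,  5,  4,
                 77, 11,  7, 13,
                 82, 36, 15,  7,
                 53, 11,  1, 57,
                 32, 24,  4, 53),
              ncol = 4, byrow = TRUE,
              dimnames = list(c("Laundry", "Main_meal", "Dinner",
                                "Breakfeast", "Tidying", "Dishes"),
                              c("Wife", "Alternating", "Husband", "Jointly")))
# Column holding the largest frequency in each row
dominant <- colnames(ht6)[apply(ht6, 1, which.max)]
names(dominant) <- rownames(ht6)
dominant</code></pre>
<p>For these six rows, <em>Laundry</em>, <em>Main_meal</em>, <em>Dinner</em> and <em>Breakfeast</em> come out under <em>Wife</em>, while <em>Tidying</em> and <em>Dishes</em> come out under <em>Jointly</em>.</p>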
</div>
<div id="visualize-a-contingency-table-using-graphical-matrix" class="section level1">
<h1>Visualize a contingency table using graphical matrix</h1>
<p>To easily interpret the contingency table, a <strong>graphical matrix</strong> can be drawn using the function <strong>balloonplot()</strong> [in <em>gplots</em> package]. In this graph, each cell contains a dot whose size reflects the relative magnitude of the value it contains.</p>
<pre class="r"><code>library("gplots")
# 1. convert the data as a table
dt <- as.table(as.matrix(housetasks))
# 2. Graph
balloonplot(t(dt), main ="housetasks", xlab ="", ylab="",
            label = FALSE, show.margins = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-graph-contingency-table-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="warning">For a very large contingency table, the visual interpretation would be very hard. Other methods are required such as correspondence analysis.</span></p>
<p><span class="success"> I will describe step by step many tools and statistical approaches to visualize, analyse and interpret a contingency table.</span></p>
</div>
<div id="row-sums-and-column-sums" class="section level1">
<h1>Row sums and column sums</h1>
<p><strong>Row sums</strong> (row.sum) and <strong>column sums</strong> (col.sum) are called <strong>row margins</strong> and <strong>column margins</strong>, respectively. They can be calculated as follows:</p>
<pre class="r"><code># Row margins
row.sum <- apply(housetasks, 1, sum)
head(row.sum)</code></pre>
<pre><code>   Laundry  Main_meal     Dinner Breakfeast    Tidying     Dishes 
       176        153        108        140        122        113 </code></pre>
<pre class="r"><code># Column margins
col.sum <- apply(housetasks, 2, sum)
head(col.sum)</code></pre>
<pre><code>       Wife Alternating     Husband     Jointly 
        600         254         381         509 </code></pre>
<pre class="r"><code># grand total
n <- sum(housetasks)</code></pre>
<p><span class="notice">The <strong>grand total</strong> is the total sum of all values in the contingency table.</span></p>
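<p>As a possible shortcut, the base R function <strong>addmargins()</strong> appends both margins and the grand total to a table in one call. The sketch below applies it to the six rows printed by <em>head(housetasks)</em>, re-typed by hand so that it runs on its own; the object names are illustrative, and the total obtained covers these six rows only:</p>
<pre class="r"><code># First six rows of housetasks, copied from the output shown earlier
ht6 <- matrix(c(156, 14,  2,  4,
                124, 20,  5,  4,
                 77, 11,  7, 13,
                 82, 36, 15,  7,
                 53, 11,  1, 57,
                 32, 24,  4, 53),
              ncol = 4, byrow = TRUE,
              dimnames = list(c("Laundry", "Main_meal", "Dinner",
                                "Breakfeast", "Tidying", "Dishes"),
                              c("Wife", "Alternating", "Husband", "Jointly")))
# Append a "Sum" row and a "Sum" column (row and column margins)
with.margins <- addmargins(as.table(ht6))
with.margins
# The margins add up to the same grand total
sum(rowSums(ht6)) == sum(colSums(ht6))</code></pre>
<p>The "Sum" column reproduces the row margins computed above (176 for <em>Laundry</em>, 153 for <em>Main_meal</em>, and so on), and the bottom-right cell holds the total of the six rows.</p>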
<p>The contingency table with row and column margins is shown below:</p>
<table style="margin-left:0px;margin-right:auto;"><thead><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;"></p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Wife</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Alternating</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Husband</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Jointly</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">TOTAL</span>
</p></td></tr></thead><tbody><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Laundry</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">156</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">14</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">2</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">4</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">176</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Main_meal</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">124</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">20</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">5</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">4</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">153</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Dinner</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">77</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">11</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">7</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">13</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">108</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Breakfeast</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">82</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">36</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">15</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">7</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">140</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Tidying</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">53</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">11</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">57</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">122</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Dishes</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">32</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">24</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">4</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">53</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">113</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Shopping</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">33</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">23</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">9</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">55</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">120</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Official</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">12</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">46</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">23</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">15</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">96</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Driving</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">10</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">51</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">75</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">3</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">139</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Finances</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">13</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">13</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">21</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">66</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">113</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Insurance</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">8</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">53</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">77</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">139</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Repairs</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">3</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">160</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">2</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">165</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Holidays</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">6</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">153</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">160</span>
</p></td></tr><tr><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">TOTAL</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">600</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">254</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">381</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">509</span>
</p></td><td style="background-color:#FFC0CB;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1744</span>
</p></td></tr></tbody><tfoot></tfoot></table>
       

<ul>
<li>Row margins: light gray</li>
<li>Column margins: light blue</li>
<li>The grand total (the total of all values in the table): pink</li>
</ul>
</div>
<div id="row-variables" class="section level1">
<h1>Row variables</h1>
<p>To compare rows, we can analyse their profiles in order to identify rows with similar profiles.</p>
<div id="row-profiles" class="section level2">
<h2>Row profiles</h2>
<p>The profile of a given row is calculated by dividing each value in the row by the row margin (i.e., the sum of all values in that row). The formula is:</p>
<br/>
<div class="block">
<span class="math">\[
row.profile = \frac{row}{row.sum}
\]</span>
</div>
<p><br/></p>
<p>For example, the profile of the row point Laundry/Wife is <strong>P = 156/176 = 88.6%</strong>.</p>
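<p>As a quick check of this arithmetic, the minimal sketch below recomputes the profiles of two rows by hand. The counts are assumed to match the Laundry and Main_meal rows of the housetasks table, and the object name <code>tasks</code> is hypothetical, introduced only for this illustration.</p>
<pre class="r"><code># Counts assumed to match two rows of the housetasks contingency table
tasks <- matrix(c(156, 14, 2, 4,
                  124, 20, 5, 4),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("Laundry", "Main_meal"),
                                c("Wife", "Alternating", "Husband", "Jointly")))

# Row margins: 176 for Laundry and 153 for Main_meal
row.sum <- rowSums(tasks)

# Dividing a matrix by a vector of length nrow divides each row by its own margin
row.profile <- tasks / row.sum
round(row.profile, 3)

# Each row profile sums to 1
rowSums(row.profile)</code></pre>
<p>The Laundry/Wife entry of <code>row.profile</code> is 156/176, which is the 88.6% quoted in the example.</p>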
<p>The R code below can be used to compute <strong>row profiles</strong>:</p>
<pre class="r"><code>row.profile <- housetasks/row.sum
# head(row.profile)</code></pre>
<table style="margin-left:0px;margin-right:auto;"><thead><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;"></p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Wife</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Alternating</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Husband</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Jointly</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">TOTAL</span>
</p></td></tr></thead><tbody><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Laundry</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.88636364</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.079545455</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.011363636</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.02272727</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Main_meal</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.81045752</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.130718954</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.032679739</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.02614379</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Dinner</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.71296296</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.101851852</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.064814815</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.12037037</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Breakfeast</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.58571429</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.257142857</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.107142857</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.05000000</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Tidying</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.43442623</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.090163934</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.008196721</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.46721311</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Dishes</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.28318584</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.212389381</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.035398230</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.46902655</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Shopping</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.27500000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.191666667</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.075000000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.45833333</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Official</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.12500000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.479166667</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.239583333</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.15625000</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Driving</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.07194245</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.366906475</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.539568345</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.02158273</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Finances</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.11504425</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.115044248</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.185840708</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.58407080</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Insurance</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.05755396</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.007194245</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.381294964</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.55395683</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Repairs</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.00000000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.018181818</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.969696970</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.01212121</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Holidays</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.00000000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.006250000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.037500000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.95625000</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr><tr><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">TOTAL</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.34403670</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.145642202</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.218463303</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.29185780</span>
</p></td><td style="background-color:#FFC0CB;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1</span>
</p></td></tr></tbody><tfoot></tfoot></table>
     



<p><span class="warning">In the table above, the row <strong>TOTAL</strong> (in light blue) is called the <strong>average row profile</strong> (also known as the <strong>marginal profile of columns</strong> or the <strong>column margin</strong>).</span></p>
<p>The <strong>average row profile</strong> is computed as follows:</p>
<br/>
<div class="block">
<span class="math">\[
average.rp = \frac{column.sum}{grand.total}
\]</span>
</div>
<p><br/></p>
<p>For example, the average row profile is (600/1744, 254/1744, 381/1744, 509/1744). It can be computed in R as follows:</p>
<pre class="r"><code># Column sums
col.sum <- apply(housetasks, 2, sum)
# average row profile = Column sums / grand total
average.rp <- col.sum/n 
average.rp</code></pre>
<pre><code>       Wife Alternating     Husband     Jointly 
  0.3440367   0.1456422   0.2184633   0.2918578 </code></pre>
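<p>As a cross-check, the average row profile can also be read as the mean of the row profiles, each weighted by its row mass (row total divided by the grand total). The sketch below illustrates this on a small made-up table. The object <em>tab</em> and its counts are hypothetical and are not part of the housetasks data:</p>
<pre class="r"><code># Hypothetical 3 x 3 contingency table (illustration only)
tab <- matrix(c(20, 10, 10,
                10, 30, 20,
                10, 20, 70),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")))
grand.total <- sum(tab)
# Average row profile = column sums / grand total
average.rp <- colSums(tab) / grand.total
# Row profiles (each row divided by its row sum) and row masses
row.profile <- tab / rowSums(tab)
row.mass <- rowSums(tab) / grand.total
# Mass-weighted mean of the row profiles
weighted.mean.rp <- colSums(row.profile * row.mass)
# The two results should agree, and the profile should sum to 1
all.equal(average.rp, weighted.mean.rp)
sum(average.rp)</code></pre>
<p>Both computations should give the same vector, which is why the TOTAL row can be interpreted as an "average" profile.</p>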
</div>
<div id="distance-or-similarity-between-row-profiles" class="section level2">
<h2>Distance (or similarity) between row profiles</h2>
<p>To compare two rows (row1 and row2), we compute the squared distance between their profiles as follows:</p>
<br/>
<div class="block">
<span class="math">\[ 
d^2(row_1, row_2) = \sum{\frac{(row.profile_1 - row.profile_2)^2}{average.profile}}
\]</span>
</div>
<p><br/></p>
<p><span class="warning">This distance is called <strong>Chi-square distance</strong>.</span></p>
<p>For example, the squared distance between the rows <em>Laundry</em> and <em>Main_meal</em> is:</p>
<p><span class="math">\[
d^2(Laundry, Main\_meal) = \frac{(0.886-0.810)^2}{0.344} + \frac{(0.0795-0.131)^2}{0.146} + ... \approx 0.037
\]</span></p>
<p>The distance between Laundry and Main_meal can be calculated in R as follows:</p>
<pre class="r"><code># Laundry and Main_meal profiles
laundry.p <- row.profile["Laundry",]
main_meal.p <- row.profile["Main_meal",]
# Distance between Laundry and Main_meal
d2 <- sum(((laundry.p - main_meal.p)^2) / average.rp)
d2</code></pre>
<pre><code>[1] 0.03684787</code></pre>
<p>The distance between <em>Laundry</em> and <em>Driving</em> is:</p>
<pre class="r"><code># Driving profile
driving.p <- row.profile["Driving",]
# Distance between Laundry and Driving
d2 <- sum(((laundry.p - driving.p)^2) / average.rp)
d2</code></pre>
<pre><code>[1] 3.772028</code></pre>
<p><span class="success">Note that the rows Laundry and Main_meal are very close (d2 ~ 0.037, similar profiles) compared with the rows Laundry and Driving (d2 ~ 3.77).</span></p>
<p><span class="warning">You can also compute the squared distance between each row profile and the average row profile in order to view rows that are the most similar or different to the average row.</span></p>
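<p>Since the same formula is applied to several pairs of rows, it can be convenient to wrap it in a small helper. The function name <em>chi2.dist</em> and the table <em>tab</em> below are hypothetical and only serve as an illustration on made-up counts:</p>
<pre class="r"><code># Squared Chi-square distance between two profiles
# p1, p2: numeric vectors of proportions
# average.profile: the average profile used as weights
chi2.dist <- function(p1, p2, average.profile) {
  sum((p1 - p2)^2 / average.profile)
}

# Hypothetical 3 x 3 contingency table (illustration only)
tab <- matrix(c(20, 10, 10,
                10, 30, 20,
                10, 20, 70),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")))
row.profile <- prop.table(tab, margin = 1)
average.rp <- colSums(tab) / sum(tab)

# Distance between rows A and B, and between rows A and C
d2.AB <- chi2.dist(row.profile["A", ], row.profile["B", ], average.rp)
d2.AC <- chi2.dist(row.profile["A", ], row.profile["C", ], average.rp)
c(A.B = d2.AB, A.C = d2.AC)</code></pre>
<p>The distance is symmetric, and a row is at distance 0 from itself.</p>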
</div>
<div id="squared-distance-between-each-row-profile-and-the-average-row-profile" class="section level2">
<h2>Squared distance between each row profile and the average row profile</h2>
<br/>
<div class="block">
<span class="math">\[ 
d^2(row_i, average.profile) = \sum{\frac{(row.profile_i - average.profile)^2}{average.profile}}
\]</span>
</div>
<p><br/></p>
<p>The R code below computes the squared distance between each row profile and the average row profile:</p>
<pre class="r"><code>d2.row <- apply(row.profile, 1, 
        function(row.p, av.p){sum(((row.p - av.p)^2)/av.p)}, 
        average.rp)
as.matrix(round(d2.row,3))</code></pre>
<pre><code>            [,1]
Laundry    1.329
Main_meal  1.034
Dinner     0.618
Breakfeast 0.512
Tidying    0.353
Dishes     0.302
Shopping   0.218
Official   0.968
Driving    1.274
Finances   0.456
Insurance  0.727
Repairs    3.307
Holidays   2.140</code></pre>
<p><span class="success">The rows <em>Repairs, Holidays, Laundry and Driving</em> have the most different profiles from the average profile.</span></p>
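<p>The same quantity can also be obtained without an explicit function, by centering and rescaling the profile matrix with the base R function <em>sweep()</em>. The sketch below uses a made-up table (<em>tab</em> is hypothetical) and compares the result with the <em>apply()</em> approach shown above:</p>
<pre class="r"><code># Hypothetical 3 x 3 contingency table (illustration only)
tab <- matrix(c(20, 10, 10,
                10, 30, 20,
                10, 20, 70),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")))
row.profile <- prop.table(tab, margin = 1)
average.rp <- colSums(tab) / sum(tab)

# Subtract the average profile from every row profile
centered <- sweep(row.profile, 2, average.rp, "-")
# Weight each squared deviation by 1 / average profile, then sum by row
d2.row <- rowSums(sweep(centered^2, 2, average.rp, "/"))

# Same computation with apply(), for comparison
d2.row.apply <- apply(row.profile, 1,
                      function(p) sum((p - average.rp)^2 / average.rp))
all.equal(d2.row, d2.row.apply)

# Rows ordered from the most to the least different from the average
sort(d2.row, decreasing = TRUE)</code></pre>
<p>Sorting the distances makes it easier to spot the rows whose profiles differ most from the average profile.</p>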
</div>
<div id="distance-matrix" class="section level2">
<h2>Distance matrix</h2>
<p>In this section, the squared distance is computed between every pair of row profiles in the contingency table.</p>
<p>The result is a distance matrix (a kind of dissimilarity matrix): it is symmetric, with zeros on the diagonal.</p>
<p>The custom R function below is used to compute the distance matrix:</p>
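<pre class="r"><code>## data: a data frame or matrix of profiles (one profile per row)
## average.profile: average profile used to weight each squared difference
dist.matrix <- function(data, average.profile){
  # Transpose so that each column of mat holds one profile
  mat <- as.matrix(t(data))
  n <- ncol(mat)
  dist.mat <- matrix(NA, n, n)
  diag(dist.mat) <- 0
  # seq_len() yields an empty loop when there is a single profile
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # Squared Chi-square distance between profiles i and j
      d2 <- sum(((mat[, i] - mat[, j])^2) / average.profile)
      dist.mat[i, j] <- dist.mat[j, i] <- d2
    }
  }
  colnames(dist.mat) <- rownames(dist.mat) <- colnames(mat)
  dist.mat
}</code></pre>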
<p>Compute and visualize the distance between row profiles. The package <strong>corrplot</strong> is required for the visualization. It can be installed as follows: <strong>install.packages("corrplot")</strong>.</p>
<pre class="r"><code># Distance matrix
dist.mat <- dist.matrix(row.profile, average.rp)
dist.mat <- round(dist.mat, 2)
# Visualize the matrix
library("corrplot")
corrplot(dist.mat, type="upper",  is.corr = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-visualize-row-profile-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">The size of the circle is proportional to the magnitude of the distance between row profiles.</span></p>
<p><span class="warning">When the data contains many categories, correspondence analysis is very useful to visualize the similarity between items.</span></p>
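<p>As an alternative to the custom function, the squared Chi-square distance is an ordinary squared Euclidean distance once every column of the profile matrix has been divided by the square root of the average profile. Under that observation, the base R function <em>dist()</em> can produce the same matrix. The sketch below is an illustration on a made-up table (<em>tab</em> is hypothetical), not the method used elsewhere in this article:</p>
<pre class="r"><code># Hypothetical 3 x 3 contingency table (illustration only)
tab <- matrix(c(20, 10, 10,
                10, 30, 20,
                10, 20, 70),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")))
row.profile <- prop.table(tab, margin = 1)
average.rp <- colSums(tab) / sum(tab)

# Divide each column by the square root of the average profile
scaled <- sweep(row.profile, 2, sqrt(average.rp), "/")
# Euclidean distances between rescaled rows, squared
dist.mat <- as.matrix(dist(scaled))^2
round(dist.mat, 3)

# Direct computation for rows A and B, for comparison
d2.AB <- sum((row.profile["A", ] - row.profile["B", ])^2 / average.rp)
all.equal(dist.mat["A", "B"], d2.AB)</code></pre>
<p>The resulting matrix is symmetric with zeros on the diagonal, so it can be passed to <em>corrplot()</em> with <em>is.corr = FALSE</em> in the same way as above.</p>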
</div>
<div id="row-mass-and-inertia" class="section level2">
<h2>Row mass and inertia</h2>
<p>The <strong>row mass</strong> (or <strong>row weight</strong>) is the relative frequency of a given row, that is, its row total divided by the grand total. It is calculated as follows:</p>
<br/>
<div class="block">
<span class="math">\[
row.mass = \frac{row.sum}{grand.total}
\]</span>
</div>
<p><br/></p>
<pre class="r"><code>row.sum <- apply(housetasks, 1, sum)
grand.total <- sum(housetasks)
row.mass <- row.sum/grand.total
head(row.mass)</code></pre>
<pre><code>   Laundry  Main_meal     Dinner Breakfeast    Tidying     Dishes 
0.10091743 0.08772936 0.06192661 0.08027523 0.06995413 0.06479358 </code></pre>
<p>The <strong>Row inertia</strong> is calculated as the row mass multiplied by the squared distance between the row and the average row profile:</p>
<br/>
<div class="block">
<span class="math">\[
row.inertia = row.mass * d^2(row)
\]</span>
</div>
<p><br/></p>
<br/>
<div class="warning">
<ul>
<li>The <strong>inertia of a row (or a column)</strong> is the amount of information it contains.</li>
<li>The <strong>total inertia</strong> is the total information contained in the data table. It is computed as the sum of the row inertias (or, equivalently, as the sum of the column inertias).</li>
</ul>
</div>
<p><br/></p>
<pre class="r"><code># Row inertia
row.inertia <- row.mass * d2.row
head(row.inertia)</code></pre>
<pre><code>   Laundry  Main_meal     Dinner Breakfeast    Tidying     Dishes 
0.13415976 0.09069235 0.03824633 0.04112368 0.02466697 0.01958732 </code></pre>
<pre class="r"><code># Total inertia
sum(row.inertia)</code></pre>
<pre><code>[1] 1.11494</code></pre>
<p><span class="success">The total inertia measures the amount of information contained in the data.</span></p>
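<p>The total inertia is also equal to the Pearson Chi-square statistic of the contingency table divided by the grand total. The sketch below checks this on a made-up table, using the base R function <em>chisq.test()</em>. The object <em>tab</em> and its counts are hypothetical:</p>
<pre class="r"><code># Hypothetical 3 x 3 contingency table (illustration only)
tab <- matrix(c(20, 10, 10,
                10, 30, 20,
                10, 20, 70),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")))
grand.total <- sum(tab)
row.profile <- prop.table(tab, margin = 1)
average.rp <- colSums(tab) / grand.total

# Row mass, squared distance to the average profile, and row inertia
row.mass <- rowSums(tab) / grand.total
d2.row <- apply(row.profile, 1,
                function(p) sum((p - average.rp)^2 / average.rp))
row.inertia <- row.mass * d2.row
total.inertia <- sum(row.inertia)

# Pearson Chi-square statistic of the table
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)

# Total inertia should equal Chi-square / grand total
all.equal(total.inertia, chi2 / grand.total)</code></pre>
<p>This link explains why a table with a strong association between rows and columns has a large total inertia.</p>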
</div>
<div id="row-summary" class="section level2">
<h2>Row summary</h2>
<p>The results for the rows can be summarized as follows:</p>
<pre class="r"><code>row <- cbind.data.frame(d2 = d2.row, mass = row.mass, inertia = row.inertia)
round(row,3)</code></pre>
<pre><code>              d2  mass inertia
Laundry    1.329 0.101   0.134
Main_meal  1.034 0.088   0.091
Dinner     0.618 0.062   0.038
Breakfeast 0.512 0.080   0.041
Tidying    0.353 0.070   0.025
Dishes     0.302 0.065   0.020
Shopping   0.218 0.069   0.015
Official   0.968 0.055   0.053
Driving    1.274 0.080   0.102
Finances   0.456 0.065   0.030
Insurance  0.727 0.080   0.058
Repairs    3.307 0.095   0.313
Holidays   2.140 0.092   0.196</code></pre>
</div>
</div>
<div id="column-variables" class="section level1">
<h1>Column variables</h1>
<div id="column-profiles" class="section level2">
<h2>Column profiles</h2>
<p>These are calculated in the same way as the <strong>row profiles</strong> table.</p>
<p>The profile of a given column is computed as follows:</p>
<br/>
<div class="block">
<span class="math">\[
col.profile = \frac{col}{col.sum}
\]</span>
</div>
<p><br/></p>
<p>The R code below can be used to compute the column profiles:</p>
<pre class="r"><code>col.profile <- t(housetasks)/col.sum
col.profile <- as.data.frame(t(col.profile))
# head(col.profile)</code></pre>
<table style="margin-left:0px;margin-right:auto;"><thead><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;"></p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Wife</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Alternating</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Husband</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">Jointly</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;font-weight:bold;color:#000000;">TOTAL</span>
</p></td></tr></thead><tbody><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Laundry</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.26000000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.055118110</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.005249344</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.007858546</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.10091743</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Main_meal</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.20666667</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.078740157</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.013123360</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.007858546</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.08772936</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Dinner</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.12833333</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.043307087</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.018372703</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.025540275</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.06192661</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Breakfeast</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.13666667</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.141732283</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.039370079</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.013752456</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.08027523</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Tidying</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.08833333</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.043307087</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.002624672</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.111984283</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.06995413</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Dishes</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.05333333</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.094488189</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.010498688</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.104125737</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.06479358</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Shopping</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.05500000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.090551181</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.023622047</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.108055010</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.06880734</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Official</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.02000000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.181102362</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.060367454</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.029469548</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.05504587</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Driving</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.01666667</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.200787402</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.196850394</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.005893910</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.07970183</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Finances</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.02166667</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.051181102</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.055118110</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.129666012</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.06479358</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Insurance</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.01333333</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.003937008</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.139107612</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.151277014</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.07970183</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Repairs</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.00000000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.011811024</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.419947507</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.003929273</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.09461009</span>
</p></td></tr><tr><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">Holidays</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.00000000</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.003937008</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.015748031</span>
</p></td><td style="background-color:#FFFFFF;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.300589391</span>
</p></td><td style="background-color:lightgray;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">0.09174312</span>
</p></td></tr><tr><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">TOTAL</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1.00000000</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1.000000000</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1.000000000</span>
</p></td><td style="background-color:lightblue;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1.000000000</span>
</p></td><td style="background-color:#FFC0CB;border-bottom-color:#000000;border-bottom-style:solid;border-bottom-width:1px;border-top-color:#000000;border-top-style:solid;border-top-width:1px;border-right-color:#000000;border-right-style:solid;border-right-width:1px;border-left-color:#000000;border-left-style:solid;border-left-width:1px;vertical-align:middle;padding-left:0pt;padding-right:0pt;padding-top:0pt;padding-bottom:0pt;"><p style="text-align:left;padding-top:0pt;padding-bottom:0pt;padding-right:0pt;padding-left:0pt;">
<span style="font-size:11pt;font-family:Helvetica;color:#000000;">1.00000000</span>
</p></td></tr></tbody><tfoot></tfoot></table>
     


<p><span class="warning">In the table above, the column <strong>TOTAL</strong> is called the <strong>average column profile</strong> (or <strong>marginal profile of rows</strong>).</span></p>
<p>The average column profile is calculated as follows:</p>
<br/>
<div class="block">
<span class="math">\[
average.cp = row.sum/grand.total
\]</span>
</div>
<p><br/></p>
<p>For example, the average column profile is: (176/1744, 153/1744, 108/1744, 140/1744, …). It can be computed in R as follows:</p>
<pre class="r"><code># Row sums
row.sum <- apply(housetasks, 1, sum)
# average column profile= row sums/grand total
average.cp <- row.sum/n 
head(average.cp)</code></pre>
<pre><code>   Laundry  Main_meal     Dinner Breakfeast    Tidying     Dishes 
0.10091743 0.08772936 0.06192661 0.08027523 0.06995413 0.06479358 </code></pre>
</div>
<div id="distance-similarity-between-column-profiles" class="section level2">
<h2>Distance (similarity) between column profiles</h2>
<p>If we want to compare columns, we need to compute the squared distance between their profiles as follows:</p>
<br/>
<div class="block">
<span class="math">\[ 
d^2(col_1, col_2) = \sum{\frac{(col.profile_1 - col.profile_2)^2}{average.profile}}
\]</span>
</div>
<p><br/></p>
<p>For example, the squared distance between the columns <em>Wife</em> and <em>Husband</em> is:</p>
<p><span class="math">\[
d^2(Wife, Husband) = \frac{(0.26-0.005)^2}{0.10} + \frac{(0.21-0.013)^2}{0.09} + \dots = 4.05
\]</span></p>
<p>The squared distance between Wife and Husband can be calculated in R as follows:</p>
<pre class="r"><code># Wife and Husband profiles
wife.p <- col.profile[, "Wife"]
husband.p <- col.profile[, "Husband"]
# Distance between Wife and Husband
d2 <- sum(((wife.p - husband.p)^2) / average.cp)
d2</code></pre>
<pre><code>[1] 4.050311</code></pre>
<p><span class="warning">You can also compute the squared distance between each column profile and the average column profile</span></p>
</div>
<div id="squared-distance-between-each-column-profile-and-the-average-column-profile" class="section level2">
<h2>Squared distance between each column profile and the average column profile</h2>
<br/>
<div class="block">
<span class="math">\[ 
d^2(col_i, average.profile) = \sum{\frac{(col.profile_i - average.profile)^2}{average.profile}}
\]</span>
</div>
<p><br/></p>
<p>The R code below computes the squared distance from the average profile for all the column variables:</p>
<pre class="r"><code>d2.col <- apply(col.profile, 2, 
        function(col.p, av.p){sum(((col.p - av.p)^2)/av.p)}, 
        average.cp)
round(d2.col,3)</code></pre>
<pre><code>       Wife Alternating     Husband     Jointly 
      0.875       0.809       1.746       1.078 </code></pre>
</div>
<div id="distance-matrix-1" class="section level2">
<h2>Distance matrix</h2>
<pre class="r"><code># Distance matrix
dist.mat <- dist.matrix(t(col.profile), average.cp)
dist.mat <-round(dist.mat, 2)
dist.mat</code></pre>
<pre><code>            Wife Alternating Husband Jointly
Wife        0.00        1.71    4.05    2.93
Alternating 1.71        0.00    2.67    2.58
Husband     4.05        2.67    0.00    3.70
Jointly     2.93        2.58    3.70    0.00</code></pre>
<pre class="r"><code># Visualize the matrix
library("corrplot")
corrplot(dist.mat, type="upper", order="hclust", is.corr = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-visualize-column-profile-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="column-mass-and-inertia" class="section level2">
<h2>Column mass and inertia</h2>
<p>The <strong>column mass</strong> (or column weight) is the total frequency of each column relative to the grand total. It’s calculated as follows:</p>
<br/>
<div class="block">
<span class="math">\[
col.mass = \frac{col.sum}{grand.total}
\]</span>
</div>
<p><br/></p>
<pre class="r"><code>col.sum <- apply(housetasks, 2, sum)
grand.total <- sum(housetasks)
col.mass <- col.sum/grand.total
head(col.mass)</code></pre>
<pre><code>       Wife Alternating     Husband     Jointly 
  0.3440367   0.1456422   0.2184633   0.2918578 </code></pre>
<p>The <strong>column inertia</strong> is calculated as the column mass multiplied by the squared distance between the column and the average column profile:</p>
<br/>
<div class="block">
<span class="math">\[
col.inertia = col.mass * d^2(col)
\]</span>
</div>
<p><br/></p>
<pre class="r"><code>col.inertia <- col.mass * d2.col
head(col.inertia)</code></pre>
<pre><code>       Wife Alternating     Husband     Jointly 
  0.3010185   0.1178242   0.3813729   0.3147248 </code></pre>
<pre class="r"><code># total inertia
sum(col.inertia)</code></pre>
<pre><code>[1] 1.11494</code></pre>
<p><span class="success">Recall that the total inertia corresponds to the amount of information the data contains. Note that the total inertia obtained from the column profiles is the same as the one obtained from the row profiles. This is expected, because we are analyzing the same data from a different angle.</span></p>
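<p>As a rough check of this statement, the sketch below recomputes the total inertia from both the row and the column profiles. It assumes that the housetasks counts are rebuilt as a plain matrix (the tutorial loads them as a data frame); the names <code>P</code>, <code>d2.row</code> and <code>row.inertia</code> are illustrative and not part of the original code.</p>
<pre class="r"><code># housetasks counts rebuilt as a matrix, filled column by column (assumption)
housetasks <- matrix(
  c(156, 124, 77, 82, 53, 32, 33, 12, 10, 13,  8,   0,   0,   # Wife
     14,  20, 11, 36, 11, 24, 23, 46, 51, 13,  1,   3,   1,   # Alternating
      2,   5,  7, 15,  1,  4,  9, 23, 75, 21, 53, 160,   6,   # Husband
      4,   4, 13,  7, 57, 53, 55, 15,  3, 66, 77,   2, 153),  # Jointly
  nrow = 13,
  dimnames = list(
    c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
      "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
      "Holidays"),
    c("Wife", "Alternating", "Husband", "Jointly")))

P        <- housetasks / sum(housetasks)   # relative frequencies
row.mass <- rowSums(P)                     # also the average column profile
col.mass <- colSums(P)                     # also the average row profile

# Row side: profiles, squared distances to the average row profile, inertia
row.profile <- P / row.mass
d2.row      <- rowSums(sweep(sweep(row.profile, 2, col.mass, "-")^2,
                             2, col.mass, "/"))
row.inertia <- row.mass * d2.row

# Column side: profiles, squared distances to the average column profile, inertia
col.profile <- sweep(P, 2, col.mass, "/")
d2.col      <- colSums((col.profile - row.mass)^2 / row.mass)
col.inertia <- col.mass * d2.col

# The two totals should be identical up to floating-point error
c(rows = sum(row.inertia), columns = sum(col.inertia))</code></pre>
<p>Both totals should agree with each other and with the total inertia printed above.</p>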
</div>
<div id="column-summary" class="section level2">
<h2>Column summary</h2>
<p>The result for columns can be summarized as follows:</p>
<pre class="r"><code>col <- cbind.data.frame(d2 = d2.col, mass = col.mass, 
                        inertia = col.inertia)
round(col,3)</code></pre>
<pre><code>               d2  mass inertia
Wife        0.875 0.344   0.301
Alternating 0.809 0.146   0.118
Husband     1.746 0.218   0.381
Jointly     1.078 0.292   0.315</code></pre>
</div>
</div>
<div id="association-between-row-and-column-variables" class="section level1">
<h1>Association between row and column variables</h1>
<p>When the contingency table is not very large (as above), it’s easy to visually inspect and interpret row and column profiles:</p>
<ul>
<li>It’s evident that the housetasks <em>Laundry, Main_meal and Dinner</em> are more frequently done by the “Wife”.
</li>
<li>Repairs and driving are predominantly done by the husband.</li>
<li>Holidays are more frequently taken jointly.</li>
</ul>
<p>Larger contingency tables are difficult to interpret visually, and other methods are needed to help with this process.</p>
<p>One statistical method that can be applied to a contingency table is the <strong>Chi-square test</strong> of independence.</p>
<div id="chi-square-test" class="section level2">
<h2>Chi-square test</h2>
<p>The <strong>Chi-square test</strong> is used to examine whether rows and columns of a contingency table are statistically significantly associated.</p>
<ul>
<li><strong>Null hypothesis (H0)</strong>: the row and the column variables of the contingency table are independent.</li>
<li><strong>Alternative hypothesis (H1)</strong>: row and column variables are dependent</li>
</ul>
<p>For each cell of the table, we have to calculate the expected value under the null hypothesis.</p>
<p>For a given cell, the expected value is calculated as follows:</p>
<br/>
<div class="block">
<span class="math">\[
e = \frac{row.sum * col.sum}{grand.total}
\]</span>
</div>
<p>The Chi-square statistic is calculated as follows:</p>
<br/>
<div class="block">
<p><span class="math">\[
\chi^2 = \sum{\frac{(o - e)^2}{e}}
\]</span></p>
<ul>
<li>o is the observed value</li>
<li>e is the expected value</li>
</ul>
</div>
<p><br/></p>
<p>The calculated Chi-square statistic is compared to the critical value (obtained from statistical tables) with <span class="math">\(df = (r - 1)(c - 1)\)</span> degrees of freedom at a significance level of 0.05.</p>
<ul>
<li><em>r</em> is the number of rows in the contingency table</li>
<li><em>c</em> is the number of columns in the contingency table</li>
</ul>
<p>If the calculated Chi-square statistic is greater than the critical value, we reject the null hypothesis and conclude that the row and the column variables are not independent of each other. This implies that they are significantly associated.</p>
<p><span class="warning">Note that the Chi-square test should only be applied when the expected frequency of every cell is at least 5.</span></p>
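<p>The steps above can be sketched by hand in base R. This is only an illustration under the assumption that the housetasks counts are rebuilt as a plain matrix; the names <code>expected</code>, <code>chi2</code> and <code>critical</code> are chosen here for the example.</p>
<pre class="r"><code># housetasks counts rebuilt as a matrix, filled column by column (assumption)
housetasks <- matrix(
  c(156, 124, 77, 82, 53, 32, 33, 12, 10, 13,  8,   0,   0,   # Wife
     14,  20, 11, 36, 11, 24, 23, 46, 51, 13,  1,   3,   1,   # Alternating
      2,   5,  7, 15,  1,  4,  9, 23, 75, 21, 53, 160,   6,   # Husband
      4,   4, 13,  7, 57, 53, 55, 15,  3, 66, 77,   2, 153),  # Jointly
  nrow = 13,
  dimnames = list(
    c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
      "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
      "Holidays"),
    c("Wife", "Alternating", "Husband", "Jointly")))

n       <- sum(housetasks)       # grand total
row.sum <- rowSums(housetasks)
col.sum <- colSums(housetasks)

# Expected counts under independence: e = row.sum * col.sum / grand.total
expected <- outer(row.sum, col.sum) / n

# Chi-square statistic: sum over all cells of (o - e)^2 / e
chi2 <- sum((housetasks - expected)^2 / expected)

# Degrees of freedom and critical value at the 0.05 significance level
df       <- (nrow(housetasks) - 1) * (ncol(housetasks) - 1)
critical <- qchisq(0.95, df)

all(expected >= 5)   # is every expected count at least 5?
chi2 > critical      # TRUE means the null hypothesis is rejected</code></pre>
<p>The value of <code>chi2</code> should match the statistic reported by <strong>chisq.test()</strong>.</p>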
<p>The Chi-square statistic can be easily computed using the function <strong>chisq.test()</strong> as follows:</p>
<pre class="r"><code>chisq <- chisq.test(housetasks)
chisq</code></pre>
<pre><code>
    Pearson's Chi-squared test

data:  housetasks
X-squared = 1944.456, df = 36, p-value < 2.2e-16</code></pre>
<p><span class="success">In our example, the row and the column variables are statistically significantly associated (<em>p-value</em> below 2.2e-16).</span></p>
<p><span class="warning">Note that, while the Chi-square test can help to establish dependence between the rows and the columns, it does not reveal the nature of the dependency.</span></p>
<p>The observed and the expected counts can be extracted from the result of the test as follows:</p>
<pre class="r"><code># Observed counts
chisq$observed</code></pre>
<pre><code>           Wife Alternating Husband Jointly
Laundry     156          14       2       4
Main_meal   124          20       5       4
Dinner       77          11       7      13
Breakfeast   82          36      15       7
Tidying      53          11       1      57
Dishes       32          24       4      53
Shopping     33          23       9      55
Official     12          46      23      15
Driving      10          51      75       3
Finances     13          13      21      66
Insurance     8           1      53      77
Repairs       0           3     160       2
Holidays      0           1       6     153</code></pre>
<pre class="r"><code># Expected counts
round(chisq$expected,2)</code></pre>
<pre><code>            Wife Alternating Husband Jointly
Laundry    60.55       25.63   38.45   51.37
Main_meal  52.64       22.28   33.42   44.65
Dinner     37.16       15.73   23.59   31.52
Breakfeast 48.17       20.39   30.58   40.86
Tidying    41.97       17.77   26.65   35.61
Dishes     38.88       16.46   24.69   32.98
Shopping   41.28       17.48   26.22   35.02
Official   33.03       13.98   20.97   28.02
Driving    47.82       20.24   30.37   40.57
Finances   38.88       16.46   24.69   32.98
Insurance  47.82       20.24   30.37   40.57
Repairs    56.77       24.03   36.05   48.16
Holidays   55.05       23.30   34.95   46.70</code></pre>
<p><span class="success">As mentioned above, the Chi-square statistic is 1944.456196.</span></p>
<p><span class="question">Which cells contribute the most to the total Chi-square statistic?</span></p>
<p>To find the cells that contribute the most to the total Chi-square score, compute the following quantity for each cell:</p>
<p><span class="math">\[
r = \frac{o - e}{\sqrt{e}}
\]</span></p>
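<p><span class="success">The above formula returns the so-called <strong>Pearson residuals (r)</strong> for each cell (referred to as standardized residuals in this tutorial). The squared residual of a cell is its contribution to the Chi-square statistic.</span></p>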
<p><span class="warning">Cells with the highest absolute standardized residuals contribute the most to the total Chi-square score.</span></p>
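<p>As a small sanity check, the residual formula can be applied by hand and compared with the output of <strong>chisq.test()</strong>. The sketch assumes the housetasks counts are rebuilt as a plain matrix, and <code>res</code> is an illustrative name. Note that <strong>chisq.test()</strong> also returns a component <code>stdres</code>, which holds adjusted standardized residuals and is a different quantity from the Pearson residuals used here.</p>
<pre class="r"><code># housetasks counts rebuilt as a matrix, filled column by column (assumption)
housetasks <- matrix(
  c(156, 124, 77, 82, 53, 32, 33, 12, 10, 13,  8,   0,   0,   # Wife
     14,  20, 11, 36, 11, 24, 23, 46, 51, 13,  1,   3,   1,   # Alternating
      2,   5,  7, 15,  1,  4,  9, 23, 75, 21, 53, 160,   6,   # Husband
      4,   4, 13,  7, 57, 53, 55, 15,  3, 66, 77,   2, 153),  # Jointly
  nrow = 13,
  dimnames = list(
    c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
      "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
      "Holidays"),
    c("Wife", "Alternating", "Husband", "Jointly")))

chisq <- chisq.test(housetasks)
o <- chisq$observed
e <- chisq$expected

# Pearson residuals computed by hand: r = (o - e) / sqrt(e)
res <- (o - e) / sqrt(e)

# They should match the residuals stored in the test result
all.equal(res, chisq$residuals)

# The squared residuals add up to the Chi-square statistic
all.equal(sum(res^2), unname(chisq$statistic))</code></pre>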
<p>Pearson residuals can be easily extracted from the output of the function <strong>chisq.test()</strong>:</p>
<pre class="r"><code>round(chisq$residuals, 3)</code></pre>
<pre><code>             Wife Alternating Husband Jointly
Laundry    12.266      -2.298  -5.878  -6.609
Main_meal   9.836      -0.484  -4.917  -6.084
Dinner      6.537      -1.192  -3.416  -3.299
Breakfeast  4.875       3.457  -2.818  -5.297
Tidying     1.702      -1.606  -4.969   3.585
Dishes     -1.103       1.859  -4.163   3.486
Shopping   -1.289       1.321  -3.362   3.376
Official   -3.659       8.563   0.443  -2.459
Driving    -5.469       6.836   8.100  -5.898
Finances   -4.150      -0.852  -0.742   5.750
Insurance  -5.758      -4.277   4.107   5.720
Repairs    -7.534      -4.290  20.646  -6.651
Holidays   -7.419      -4.620  -4.897  15.556</code></pre>
<p>Let’s visualize Pearson residuals using the package <strong>corrplot</strong>:</p>
<pre class="r"><code>library(corrplot)
corrplot(chisq$residuals, is.corr = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-residuals-chi-square-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="notice">For a given cell, the size of the circle is proportional to the amount of the cell contribution.</span></p>
<p>The sign of the standardized residuals is also very important to interpret the association between rows and columns as explained in the block below.</p>
<br/>
<div class="block">
<ol style="list-style-type: decimal">
<li><strong>Positive residuals</strong> are in blue. Positive values in cells specify an attraction (positive association) between the corresponding row and column variables.</li>
</ol>
<ul>
<li>In the image above, it’s evident that there is an association between the column <strong>Wife</strong> and the rows <strong>Laundry</strong> and <strong>Main_meal</strong>.</li>
<li>There is a strong positive association between the column <strong>Husband</strong> and the row <strong>Repairs</strong>.</li>
</ul>
<ol start="2" style="list-style-type: decimal">
<li><strong>Negative residuals</strong> are in red. Negative values in cells specify a repulsion (negative association) between the corresponding row and column variables. For example, the column <em>Wife</em> is negatively associated with the row <strong>Repairs</strong>. There is also a repulsion between the column <em>Husband</em> and the rows <strong>Laundry</strong> and <strong>Main_meal</strong>.</li>
</ol>
</div>
<p><br/></p>
<p><span class="warning">Note that, correspondence analysis is just the singular value decomposition of the standardized residuals. This will be explained in the next section.</span></p>
<p>The contribution (in %) of a given cell to the total Chi-square score is calculated as follows:</p>
<br/>
<div class="block">
<span class="math">\[
contrib = 100 \times \frac{r^2}{\chi^2}
\]</span>
</div>
<p><br/></p>
<ul>
<li><strong>r</strong> is the residual of the cell</li>
</ul>
<pre class="r"><code># Contribution in percentage (%)
contrib <- 100*chisq$residuals^2/chisq$statistic
round(contrib, 3)</code></pre>
<pre><code>            Wife Alternating Husband Jointly
Laundry    7.738       0.272   1.777   2.246
Main_meal  4.976       0.012   1.243   1.903
Dinner     2.197       0.073   0.600   0.560
Breakfeast 1.222       0.615   0.408   1.443
Tidying    0.149       0.133   1.270   0.661
Dishes     0.063       0.178   0.891   0.625
Shopping   0.085       0.090   0.581   0.586
Official   0.688       3.771   0.010   0.311
Driving    1.538       2.403   3.374   1.789
Finances   0.886       0.037   0.028   1.700
Insurance  1.705       0.941   0.868   1.683
Repairs    2.919       0.947  21.921   2.275
Holidays   2.831       1.098   1.233  12.445</code></pre>
<pre class="r"><code># Visualize the contribution
corrplot(contrib, is.corr = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-contribution-chi-square-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>The relative contribution of each cell to the total Chi-square score gives some indication of the nature of the dependency between rows and columns of the contingency table.</p>
<p>It can be seen that:</p>
<ol style="list-style-type: decimal">
<li>The column “Wife” is strongly associated with Laundry, Main_meal, Dinner</li>
<li>The column “Husband” is strongly associated with the row Repairs</li>
<li>The column “Jointly” is strongly associated with the row Holidays</li>
</ol>
<div class="success">
<p>From the image above, it can be seen that the cells contributing the most to the Chi-square statistic are Wife/Laundry (7.74%), Wife/Main_meal (4.98%), Husband/Repairs (21.9%) and Jointly/Holidays (12.44%).</p>
<p>Together, these cells contribute about 47% of the total Chi-square score and thus account for nearly half of the difference between expected and observed values.</p>
<p>This confirms the earlier visual interpretation of the data. As stated earlier, visual interpretation may be complex when the contingency table is very large. In this case, the contribution of one cell to the total Chi-square score becomes a useful way of establishing the nature of dependency.</p>
</div>
</div>
<div id="chi-square-statistic-and-the-total-inertia" class="section level2">
<h2>Chi-square statistic and the total inertia</h2>
<p>As mentioned above, the total inertia is the amount of information contained in the data table.</p>
<p>It’s called <span class="math">\(\phi^2\)</span> (squared phi) and is calculated as follows:</p>
<br/>
<div class="block">
<span class="math">\[
\phi^2 = \frac{\chi^2}{grand.total}
\]</span>
</div>
<p><br/></p>
<pre class="r"><code>phi2 <- as.numeric(chisq$statistic/sum(housetasks))
phi2</code></pre>
<pre><code>[1] 1.11494</code></pre>
<p>The square root of <span class="math">\(\phi^2\)</span> is called the <strong>trace</strong> and may be interpreted as a correlation coefficient (Bendixen, 2003). Any value of the trace greater than 0.2 indicates a significant dependency between rows and columns (Bendixen M., 2003).</p>
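<p>A minimal sketch of the trace computation is shown below. It assumes the housetasks counts are rebuilt as a plain matrix; <code>trace.value</code> is an illustrative name, and the 0.2 threshold is the rule of thumb quoted above.</p>
<pre class="r"><code># housetasks counts rebuilt as a matrix, filled column by column (assumption)
housetasks <- matrix(
  c(156, 124, 77, 82, 53, 32, 33, 12, 10, 13,  8,   0,   0,   # Wife
     14,  20, 11, 36, 11, 24, 23, 46, 51, 13,  1,   3,   1,   # Alternating
      2,   5,  7, 15,  1,  4,  9, 23, 75, 21, 53, 160,   6,   # Husband
      4,   4, 13,  7, 57, 53, 55, 15,  3, 66, 77,   2, 153),  # Jointly
  nrow = 13,
  dimnames = list(
    c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
      "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
      "Holidays"),
    c("Wife", "Alternating", "Husband", "Jointly")))

chisq <- chisq.test(housetasks)

# Total inertia: Chi-square statistic divided by the grand total
phi2 <- as.numeric(chisq$statistic) / sum(housetasks)

# Trace: square root of the total inertia
trace.value <- sqrt(phi2)
trace.value

# Rule of thumb quoted above: a trace above 0.2 suggests a dependency
trace.value > 0.2</code></pre>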
</div>
</div>
<div id="graphical-representation-of-a-contingency-table-mosaic-plot" class="section level1">
<h1>Graphical representation of a contingency table: Mosaic plot</h1>
<p>A mosaic plot is used to visualize a contingency table in order to examine the association between categorical variables.</p>
<p>The function <strong>mosaicplot()</strong> [in the <strong>graphics</strong> package] can be used.</p>
<pre class="r"><code>library("graphics")
# Mosaic plot of observed values
mosaicplot(housetasks,  las=2, col="steelblue",
           main = "housetasks - observed counts")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-contingency-table-graph-mosaic-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Mosaic plot of expected values
mosaicplot(chisq$expected,  las=2, col = "gray",
           main = "housetasks - expected counts")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-contingency-table-graph-mosaic-data-mining-2.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>In these plots, the column variables are split first (vertical split) and then the row variables are split (horizontal split). For each cell, the height of the bar is proportional to the observed relative frequency it contains:</p>
<p><span class="math">\[
\frac{cell.value}{column.sum}
\]</span></p>
<p>The blue plot is the <strong>mosaic plot</strong> of the observed values. The gray one is the <strong>mosaic plot</strong> of the expected values under the null hypothesis.</p>
<p><span class="success">If row and column variables were completely independent, the mosaic bars for the observed values (blue graph) would be aligned like the mosaic bars for the expected values (gray graph).
</span></p>
<p>It’s also possible to color the mosaic plot according to the value of the <strong>standardized residuals</strong>:</p>
<pre class="r"><code>mosaicplot(housetasks, shade = TRUE, las=2,main = "housetasks")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-contingency-table-graph-mosaic-color-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<ul>
<li>The argument <strong>shade</strong> is used to color the graph</li>
<li>The argument <strong>las = 2</strong> produces vertical labels</li>
</ul>
<br/>
<div class="block">
<ul>
<li>This plot clearly shows that Laundry, Main_meal, Dinner and Breakfeast are more often done by the “Wife”.</li>
<li>Repairs are mostly done by the Husband.</li>
</ul>
</div>
<p><br/></p>
</div>
<div id="g-test-likelihood-ratio-test" class="section level1">
<h1>G-test: Likelihood ratio test</h1>
<p>The <strong>G-test</strong> of independence is an alternative to the <strong>chi-square test</strong> of independence, and the two tests generally lead to the same conclusion.</p>
<p>The test is based on the likelihood ratio defined as follows:</p>
<br/>
<div class="block">
<span class="math">\[
ratio = \frac{o}{e}
\]</span>
</div>
<p><br/></p>
<ul>
<li>o is the observed value</li>
<li>e is the expected value under null hypothesis</li>
</ul>
<p>This <strong>likelihood ratio</strong>, or its logarithm, can be used to compute a <em>p-value</em>. When the logarithm of the likelihood ratio is used, the statistic is known as a <strong>log-likelihood ratio statistic</strong>.</p>
<p>This test is called the <strong>G-test</strong> (also known as the <strong>likelihood ratio test</strong> or the <strong>maximum likelihood statistical significance test</strong>) and can be used in situations where <strong>Chi-square tests</strong> were previously recommended.</p>
<p>The <em>G-test</em> statistic is generally defined as follows:</p>
<br/>
<div class="block">
<span class="math">\[
G = 2 * \sum{o * log(\frac{o}{e})}
\]</span>
</div>
<p><br/></p>
<br/>
<div class="block">
<ul>
<li><em>o</em> is the observed frequency in a cell</li>
<li><em>e</em> is the expected frequency under the null hypothesis</li>
<li><em>log</em> is the natural logarithm</li>
<li>The <em>sum</em> is taken over all non-empty cells.</li>
</ul>
</div>
<p><br/></p>
<p>The <strong>distribution of G</strong> is approximately a <strong>chi-squared distribution</strong>, with the same number of degrees of freedom as in the corresponding chi-squared test:</p>
<br/>
<div class="block">
<span class="math">\[df = (r - 1)(c - 1)\]</span>

</div>
<p><br/></p>
<ul>
<li><em>r</em> is the number of rows in the contingency table</li>
<li><em>c</em> is the number of columns in the contingency table</li>
</ul>
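<p>The definition above can be turned into a short base R sketch. It assumes the housetasks counts are rebuilt as a plain matrix and treats empty cells as contributing zero to the sum; the names <code>G</code>, <code>expected</code> and <code>p.value</code> are illustrative.</p>
<pre class="r"><code># housetasks counts rebuilt as a matrix, filled column by column (assumption)
housetasks <- matrix(
  c(156, 124, 77, 82, 53, 32, 33, 12, 10, 13,  8,   0,   0,   # Wife
     14,  20, 11, 36, 11, 24, 23, 46, 51, 13,  1,   3,   1,   # Alternating
      2,   5,  7, 15,  1,  4,  9, 23, 75, 21, 53, 160,   6,   # Husband
      4,   4, 13,  7, 57, 53, 55, 15,  3, 66, 77,   2, 153),  # Jointly
  nrow = 13,
  dimnames = list(
    c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
      "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
      "Holidays"),
    c("Wife", "Alternating", "Husband", "Jointly")))

# Expected counts under the null hypothesis of independence
expected <- outer(rowSums(housetasks), colSums(housetasks)) / sum(housetasks)

# G = 2 * sum(o * log(o / e)), summed over non-empty cells only
terms <- ifelse(housetasks > 0, housetasks * log(housetasks / expected), 0)
G     <- 2 * sum(terms)

# Degrees of freedom and p-value from the chi-squared approximation
df      <- (nrow(housetasks) - 1) * (ncol(housetasks) - 1)
p.value <- pchisq(G, df, lower.tail = FALSE)

c(G = G, df = df, p.value = p.value)</code></pre>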
<br/>
<div class="block">
<p>The commonly used Pearson Chi-square test is, in fact, just an approximation of the <strong>log-likelihood ratio</strong> on which the G-tests are based.</p>
<p>Remember that, the Chi-square formula is:</p>
<p><span class="math">\[
\chi^2 = \sum{\frac{(o - e)^2}{e}}
\]</span></p>
</div>
<p><br/></p>
<div id="likelihood-ratio-test-in-r" class="section level2">
<h2>Likelihood ratio test in R</h2>
<p>The functions <strong>likelihood.test()</strong> [in the <em>Deducer</em> package] or <strong>G.test()</strong> [in <em>RVAideMemoire</em>] can be used to perform a <strong>G-test</strong> on a contingency table.</p>
<p>We’ll use the package <strong>RVAideMemoire</strong>, which can be installed as follows: <strong>install.packages("RVAideMemoire")</strong>.</p>
<p>The function <strong>G.test()</strong> works like <strong>chisq.test()</strong>:</p>
<pre class="r"><code>library("RVAideMemoire")
gtest <- G.test(as.matrix(housetasks))
gtest</code></pre>
<pre><code>
    G-test

data:  as.matrix(housetasks)
G = 1907.658, df = 36, p-value < 2.2e-16</code></pre>
</div>
<div id="interpret-the-association-between-rows-and-columns-using-likelihood-ratio" class="section level2">
<h2>Interpret the association between rows and columns using likelihood ratio</h2>
<p>To interpret the association between the rows and the columns of the contingency table, the likelihood ratio can be used as an index (<em>i</em>):</p>
<br/>
<div class="block">
<span class="math">\[
ratio = \frac{o}{e}
\]</span>
</div>
<p><br/></p>
<p>For a given cell,</p>
<ul>
<li>If ratio > 1, there is an “attraction” (association) between the corresponding column and row</li>
<li>If ratio < 1, there is a “repulsion” between the corresponding column and row</li>
</ul>
<p>The ratio can be calculated as follows:</p>
<pre class="r"><code>ratio <- chisq$observed/chisq$expected
round(ratio,3)</code></pre>
<pre><code>            Wife Alternating Husband Jointly
Laundry    2.576       0.546   0.052   0.078
Main_meal  2.356       0.898   0.150   0.090
Dinner     2.072       0.699   0.297   0.412
Breakfeast 1.702       1.766   0.490   0.171
Tidying    1.263       0.619   0.038   1.601
Dishes     0.823       1.458   0.162   1.607
Shopping   0.799       1.316   0.343   1.570
Official   0.363       3.290   1.097   0.535
Driving    0.209       2.519   2.470   0.074
Finances   0.334       0.790   0.851   2.001
Insurance  0.167       0.049   1.745   1.898
Repairs    0.000       0.125   4.439   0.042
Holidays   0.000       0.043   0.172   3.276</code></pre>
<p><span class="warning">Note that, you can also use the R code : <strong>gtest$observed/gtest$expected</strong></span></p>
<p>The package <strong>corrplot</strong> can be used to make a graph of the likelihood ratio:</p>
<pre class="r"><code>corrplot(ratio, is.corr = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-likelihood-ratio-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>The image above confirms our previous observations:</p>
<ul>
<li>The rows <em>Laundry, Main_meal and Dinner</em> are associated with the column <em>Wife</em></li>
<li><em>Repairs</em> are done more often by the <em>Husband</em></li>
<li><em>Holidays</em> are taken Jointly</li>
</ul>
<p>Let’s take the log(ratio) to see the attraction and the repulsion in different colors:</p>
<ul>
<li>If ratio < 1 => log(ratio) < 0 (negative values) => red color</li>
<li>If ratio > 1 => log(ratio) > 0 (positive values) => blue color</li>
</ul>
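<p>We’ll also add a small value (0.5) to all cells to avoid log(0). Note that this offset moves the sign change from ratio = 1 to ratio = 0.5, so cells with a ratio between 0.5 and 1 are still drawn in blue:</p>
<pre class="r"><code>corrplot(log2(ratio + 0.5), is.corr = FALSE)</code></pre>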
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-log-likelihood-ratio-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="correspondence-analysis" class="section level1">
<h1>Correspondence analysis</h1>
<p>Correspondence analysis (CA) is particularly useful for large contingency tables.</p>
<p>It is used to graphically visualize row points and column points in a low-dimensional space.</p>
<p>CA is a dimension reduction method applied to a contingency table. The information retained by each dimension is called its eigenvalue.</p>
<p>The total information (or inertia) contained in the data is called <span class="math">\(\phi^2\)</span> (squared phi) and can be calculated as follows:</p>
<br/>
<div class="block">
<span class="math">\[
\phi^2 = \frac{\chi^2}{grand.total}
\]</span>
</div>
<p><br/></p>
<p>For a given axis, the eigenvalue (<span class="math">\(\lambda\)</span>) is computed as follows:</p>
<br/>
<div class="block">
<span class="math">\[
\lambda_{axis} = \sum{\frac{row.sum}{grand.total} * row.coord^2}
\]</span>
</div>
<p><br/></p>
<p>Or equivalently</p>
<br/>
<div class="block">
<span class="math">\[
\lambda_{axis} = \sum{\frac{col.sum}{grand.total} * col.coord^2}
\]</span>
</div>
<p><br/></p>
<ul>
<li>row.coord and col.coord are the coordinates of row and column variables on the axis.</li>
</ul>
<p>The association index between a row and a column for the principal axes can be computed as follows:</p>
<br/>
<div class="block">
<p><span class="math">\[
i = 1 + \sum{\frac{row.coord * col.coord}{\sqrt{\lambda}}}
\]</span></p>
<ul>
<li><span class="math">\(\lambda\)</span> is the eigenvalue of the axes</li>
<li>The sum is taken over all axes</li>
</ul>
</div>
<p><br/></p>
<p>If there is an attraction, the corresponding row and column coordinates have the same sign on the axes. If there is a repulsion, the corresponding row and column coordinates have different signs on the axes. A large absolute value of the product indicates a strong attraction or repulsion.</p>
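<p>The eigenvalue and association index formulas can be checked numerically. The sketch below is only an illustration: it assumes the housetasks counts are rebuilt as a plain matrix and obtains the coordinates from the singular value decomposition described in the next section. The names <code>S</code>, <code>delta</code>, <code>lambda</code> and <code>index</code> are chosen for this example.</p>
<pre class="r"><code># housetasks counts rebuilt as a matrix, filled column by column (assumption)
housetasks <- matrix(
  c(156, 124, 77, 82, 53, 32, 33, 12, 10, 13,  8,   0,   0,   # Wife
     14,  20, 11, 36, 11, 24, 23, 46, 51, 13,  1,   3,   1,   # Alternating
      2,   5,  7, 15,  1,  4,  9, 23, 75, 21, 53, 160,   6,   # Husband
      4,   4, 13,  7, 57, 53, 55, 15,  3, 66, 77,   2, 153),  # Jointly
  nrow = 13,
  dimnames = list(
    c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
      "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
      "Holidays"),
    c("Wife", "Alternating", "Husband", "Jointly")))

n        <- sum(housetasks)
P        <- housetasks / n
row.mass <- rowSums(P)
col.mass <- colSums(P)

# Pearson residuals divided by sqrt(n)
S <- (P - outer(row.mass, col.mass)) / sqrt(outer(row.mass, col.mass))

# SVD restricted to the non-trivial axes
nb.axes <- min(nrow(S) - 1, ncol(S) - 1)
res.svd <- svd(S, nu = nb.axes, nv = nb.axes)
delta   <- res.svd$d[1:nb.axes]   # singular values
lambda  <- delta^2                # eigenvalues

# Row and column coordinates on each axis
row.coord <- sweep(res.svd$u, 2, delta, "*") / sqrt(row.mass)
col.coord <- sweep(res.svd$v, 2, delta, "*") / sqrt(col.mass)

# Eigenvalues recomputed as mass-weighted sums of squared coordinates
lambda.row <- colSums(row.mass * row.coord^2)
lambda.col <- colSums(col.mass * col.coord^2)
rbind(lambda, lambda.row, lambda.col)

# Association index: 1 + sum over axes of row.coord * col.coord / sqrt(lambda)
index <- 1 + row.coord %*% diag(1 / sqrt(lambda)) %*% t(col.coord)

# It should reproduce the ratio observed / expected for every cell
ratio <- housetasks / (outer(rowSums(housetasks), colSums(housetasks)) / n)
max(abs(index - ratio))</code></pre>
<p>If all axes are kept, as here, the index coincides with the observed/expected ratio; keeping fewer axes would give an approximation of it.</p>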
</div>
<div id="ca---singular-value-decomposition-of-the-standardized-residuals" class="section level1">
<h1>CA - Singular value decomposition of the standardized residuals</h1>
<p>Correspondence analysis (CA) is used to represent graphically the table of distances between row variables or between column variables.</p>
<p>CA approach includes the following steps:</p>
<ul>
<li>STEP 1. <strong>Compute the standardized residuals</strong></li>
</ul>
<p>The <strong>standardized residuals</strong> (S) are:</p>
<p><span class="math">\[
S = \frac{o - e}{\sqrt{e}}
\]</span></p>
<p><span class="notice">In fact, the entries of S are just the signed square roots of the terms comprising the <span class="math">\(\chi^2\)</span> statistic.</span></p>
<p>STEP 2. Compute the <strong>singular value decomposition</strong> (SVD) of the <strong>standardized residuals</strong>.</p>
<p>Let M be: <span class="math">\(M = \frac{1}{\sqrt{grand.total}} \times S\)</span></p>
<p>SVD means that we want to find orthogonal matrices <em>U</em> and <em>V</em>, together with a diagonal matrix <span class="math">\(\Delta\)</span>, such that:</p>
<br/>
<div class="block">
<span class="math">\[
M = U \Delta V^T
\]</span>
</div>
<p><br/> (Phillip M. Yelland, 2010)</p>
<ul>
<li><span class="math">\(U\)</span> is a matrix containing the row eigenvectors (left singular vectors)</li>
<li><span class="math">\(\Delta\)</span> is a diagonal matrix. The numbers on its diagonal are called singular values (SV). The eigenvalues are the squared SV.</li>
<li><span class="math">\(V\)</span> is a matrix containing the column eigenvectors (right singular vectors)</li>
</ul>
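<p>The properties of the decomposition listed above can be verified numerically. The sketch below uses a made-up table (<code>tab</code> is an assumption for illustration) and the base R functions <code>svd()</code>, <code>crossprod()</code> and <code>eigen()</code>.</p>
<pre class="r"><code># Toy contingency table (made-up counts, for illustration only)
tab <- matrix(c(20,  5,  5,
                 5, 20,  5,
                 5,  5, 30), nrow = 3, byrow = TRUE)
n <- sum(tab)
expected <- outer(rowSums(tab), colSums(tab))/n
M <- (tab - expected)/sqrt(expected)/sqrt(n)
res <- svd(M)
# The columns of U and V are orthonormal
all.equal(crossprod(res$u), diag(3))
all.equal(crossprod(res$v), diag(3))
# U %*% Delta %*% t(V) gives back M
all.equal(res$u %*% diag(res$d) %*% t(res$v), M)
# The squared singular values are the eigenvalues of t(M) %*% M
all.equal(res$d^2, eigen(crossprod(M), symmetric = TRUE)$values)</code></pre>
<p>The last singular value is numerically zero, which is why at most min(r-1, c-1) axes are kept.</p>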
<p>The eigenvalue of a given axis is:</p>
<br/>
<div class="block">
<span class="math">\[
\lambda = \delta^2
\]</span>
</div>
<p><br/></p>
<ul>
<li><span class="math">\(\delta\)</span> is the singular value</li>
</ul>
<p>The coordinates of row variables on a given axis are:</p>
<br/>
<div class="block">
<span class="math">\[
row.coord = \frac{U * \delta }{\sqrt{row.mass}}
\]</span>
</div>
<p><br/></p>
<p>The coordinates of columns are:</p>
<br/>
<div class="block">
<span class="math">\[
col.coord = \frac{V * \delta }{\sqrt{col.mass}}
\]</span>
</div>
<p><br/></p>
<p>Compute SVD in R:</p>
<pre class="r"><code># Grand total
n <- sum(housetasks)
# Standardized residuals
residuals <- chisq$residuals/sqrt(n)
# Number of dimensions
nb.axes <- min(nrow(residuals)-1, ncol(residuals)-1)
# Singular value decomposition
res.svd <- svd(residuals, nu = nb.axes, nv = nb.axes)
res.svd</code></pre>
<pre><code>$d
[1] 7.368102e-01 6.670853e-01 3.564385e-01 1.012225e-16

$u
             [,1]        [,2]        [,3]
 [1,] -0.42762952 -0.23587902 -0.28228398
 [2,] -0.35197789 -0.21761257 -0.13633376
 [3,] -0.23391020 -0.11493572 -0.14480767
 [4,] -0.19557424 -0.19231779  0.17519699
 [5,] -0.14136307  0.17221046 -0.06990952
 [6,] -0.06528142  0.16864510  0.19063825
 [7,] -0.04189568  0.15859251  0.14910925
 [8,]  0.07216535 -0.08919754  0.60778606
 [9,]  0.28421536 -0.27652950  0.43123528
[10,]  0.09354184  0.23576569  0.02484968
[11,]  0.24793268  0.20050833 -0.22918636
[12,]  0.63820133 -0.39850534 -0.40738669
[13,]  0.10379321  0.65156733 -0.11011902

$v
            [,1]       [,2]       [,3]
[1,] -0.66679846 -0.3211267 -0.3289692
[2,] -0.03220853 -0.1668171  0.9085662
[3,]  0.73643655 -0.4217418 -0.2476526
[4,]  0.10956112  0.8313745 -0.0703917</code></pre>
<pre class="r"><code>sv <- res.svd$d[1:nb.axes] # singular values
u <- res.svd$u
v <- res.svd$v</code></pre>
<div id="eigenvalues-and-screeplot" class="section level2">
<h2>Eigenvalues and screeplot</h2>
<pre class="r"><code># Eigenvalues
eig <- sv^2
# Variances in percentage
variance <- eig*100/sum(eig)
# Cumulative variances
cumvar <- cumsum(variance)

eig <- data.frame(eig = eig, variance = variance,
                  cumvariance = cumvar)
head(eig)</code></pre>
<pre><code>        eig variance cumvariance
1 0.5428893 48.69222    48.69222
2 0.4450028 39.91269    88.60491
3 0.1270484 11.39509   100.00000</code></pre>
<pre class="r"><code>barplot(eig[, 2], names.arg=1:nrow(eig), 
       main = "Variances",
       xlab = "Dimensions",
       ylab = "Percentage of variances",
       col ="steelblue")
# Add connected line segments to the plot
lines(x = 1:nrow(eig), eig[, 2], 
      type="b", pch=19, col = "red")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-eigenvalue-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>How many dimensions should be retained?</p>
<ol style="list-style-type: decimal">
<li>The maximum number of axes in the CA is:</li>
</ol>
<br/>
<div class="block">
<span class="math">\[
nb.axes = min( r-1, c-1)
\]</span>
</div>
<p><br/></p>
<p>r and c are respectively the number of rows and columns in the table.</p>
<ol start="2" style="list-style-type: decimal">
<li>Use the elbow method: retain the axes located before the point where the scree plot levels off</li>
</ol>
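<p>The elbow rule is usually applied by eye on the scree plot. As a crude numerical stand-in, the sketch below looks for the largest drop in explained variance between consecutive axes, using the three eigenvalues printed in the table above. The rule "keep the axes before the largest drop" and the names <code>eig.values</code>, <code>drops</code> and <code>nb.keep</code> are assumptions made for this illustration, not a standard criterion.</p>
<pre class="r"><code># Eigenvalues printed above for the housetasks table
eig.values <- c(0.5428893, 0.4450028, 0.1270484)
variance <- 100*eig.values/sum(eig.values)
# Drop in explained variance between consecutive axes
drops <- -diff(variance)
round(drops, 2)
# Crude elbow: keep the axes located before the largest drop
nb.keep <- which.max(drops)
nb.keep</code></pre>
<p>Here the largest drop occurs between axes 2 and 3, which agrees with keeping two dimensions.</p>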
</div>
<div id="row-coordinates" class="section level2">
<h2>Row coordinates</h2>
<p>We can use the function <strong>apply</strong> to perform arbitrary operations on the rows and columns of a matrix.</p>
<p>A simplified format is:</p>
<pre class="r"><code>apply(X, MARGIN, FUN, ...)</code></pre>
<ul>
<li><strong>X</strong>: a matrix</li>
<li><strong>MARGIN</strong>: allowed values can be 1 or 2. 1 specifies that we want to operate on the rows of the matrix. 2 specifies that we want to operate on the columns.</li>
<li><strong>FUN</strong>: the function to be applied</li>
<li><strong>…</strong>: optional arguments to FUN</li>
</ul>
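<p>A short illustration of <strong>apply</strong> on a small matrix may help. The matrix <code>m</code> and the weights <code>w</code> are made up for the example. Note that when FUN returns a vector, <strong>apply</strong> stores each result as a column, which is why the result is transposed with <code>t()</code> when operating on rows; the base R function <code>sweep()</code> is shown as an equivalent alternative.</p>
<pre class="r"><code># Small made-up matrix: 2 rows, 3 columns
m <- matrix(1:6, nrow = 2)
# Sum of each row, then of each column
apply(m, 1, sum)
apply(m, 2, sum)
# Multiply each row of m by the vector w
w <- c(10, 100, 1000)
scaled <- t(apply(m, 1, "*", w))
scaled
# Same result with sweep(): column j is multiplied by w[j]
all.equal(scaled, sweep(m, 2, w, "*"))</code></pre>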
<pre class="r"><code># row sum
row.sum <- apply(housetasks, 1, sum)
# row mass
row.mass <- row.sum/n

# row coord = sv * u /sqrt(row.mass)
cc <- t(apply(u, 1, "*", sv)) # multiply each row of u by sv
row.coord <- apply(cc, 2, "/", sqrt(row.mass))
rownames(row.coord) <- rownames(housetasks)
colnames(row.coord) <- paste0("Dim.", 1:nb.axes)
round(row.coord,3)</code></pre>
<pre><code>            Dim.1  Dim.2  Dim.3
Laundry    -0.992 -0.495 -0.317
Main_meal  -0.876 -0.490 -0.164
Dinner     -0.693 -0.308 -0.207
Breakfeast -0.509 -0.453  0.220
Tidying    -0.394  0.434 -0.094
Dishes     -0.189  0.442  0.267
Shopping   -0.118  0.403  0.203
Official    0.227 -0.254  0.923
Driving     0.742 -0.653  0.544
Finances    0.271  0.618  0.035
Insurance   0.647  0.474 -0.289
Repairs     1.529 -0.864 -0.472
Holidays    0.252  1.435 -0.130</code></pre>
<pre class="r"><code># plot
plot(row.coord, pch=19, col = "blue")
text(row.coord, labels =rownames(row.coord), pos = 3, col ="blue")
abline(v=0, h=0, lty = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-row-coordinates-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="column-coordinates" class="section level2">
<h2>Column coordinates</h2>
<pre class="r"><code># Coordinates of columns
col.sum <- apply(housetasks, 2, sum)
col.mass <- col.sum/n
# coordinates sv * v /sqrt(col.mass)
cc <- t(apply(v, 1, "*", sv)) # multiply each row of v by sv
col.coord <- apply(cc, 2, "/", sqrt(col.mass))
rownames(col.coord) <- colnames(housetasks)
colnames(col.coord) <- paste0("Dim", 1:nb.axes)
head(col.coord)</code></pre>
<pre><code>                   Dim1       Dim2        Dim3
Wife        -0.83762154 -0.3652207 -0.19991139
Alternating -0.06218462 -0.2915938  0.84858939
Husband      1.16091847 -0.6019199 -0.18885924
Jointly      0.14942609  1.0265791 -0.04644302</code></pre>
<pre class="r"><code># plot
plot(col.coord, pch=17, col = "red")
text(col.coord, labels =rownames(col.coord), pos = 3, col ="red")
abline(v=0, h=0, lty = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-column-coordinates-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="biplot-of-rows-and-columns-to-view-the-association" class="section level2">
<h2>Biplot of rows and columns to view the association</h2>
<pre class="r"><code>xlim <- range(c(row.coord[,1], col.coord[,1]))*1.1
ylim <- range(c(row.coord[,2], col.coord[,2]))*1.1
# Plot of rows
plot(row.coord, pch=19, col = "blue", xlim = xlim, ylim = ylim)
text(row.coord, labels =rownames(row.coord), pos = 3, col ="blue")
# Plot of columns
points(col.coord, pch=17, col = "red")
text(col.coord, labels =rownames(col.coord), pos = 3, col ="red")
abline(v=0, h=0, lty = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-biplot-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>You can interpret the distance between row points or between column points, but the distance between a row point and a column point is not meaningful.</p>
</div>
<div id="diagnostic" class="section level2">
<h2>Diagnostic</h2>
<p>Recall that the total inertia contained in the data is:</p>
<br/>
<div class="block">
<span class="math">\[
\phi^2 = \frac{\chi^2}{n} = 1.11494 
\]</span>
</div>
<p><br/></p>
<p>Our two-dimensional plot captures about 88.6% of the total inertia of the table.</p>
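<p>Both statements can be checked with a few lines of R. The first part of the sketch reuses the eigenvalues printed earlier for the housetasks table; the second part verifies, on a made-up table (<code>tab</code> is an assumption for illustration), that <span class="math">\(\chi^2/n\)</span> equals the sum of the squared singular values.</p>
<pre class="r"><code># Eigenvalues printed earlier for the housetasks table
eig.values <- c(0.5428893, 0.4450028, 0.1270484)
total.inertia <- sum(eig.values)
round(total.inertia, 5)
# Percentage of inertia captured by the first two axes
round(100*sum(eig.values[1:2])/total.inertia, 1)

# Toy contingency table (made-up counts, for illustration only)
tab <- matrix(c(20,  5,  5,
                 5, 20,  5,
                 5,  5, 30), nrow = 3, byrow = TRUE)
chisq <- chisq.test(tab)
chi2 <- unname(chisq$statistic)
M <- chisq$residuals/sqrt(sum(tab))
# Total inertia: chi2/n equals the sum of the eigenvalues
all.equal(chi2/sum(tab), sum(svd(M)$d^2))</code></pre>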
</div>
<div id="contribution-of-rows-and-columns" class="section level2">
<h2>Contribution of rows and columns</h2>
<p>The contributions of the rows/columns to the definition of a principal axis are:</p>
<br/>
<div class="block">
<span class="math">\[
row.contrib = \frac{row.mass * row.coord^2}{eigenvalue}
\]</span>
</div>
<p><br/></p>
<br/>
<div class="block">
<span class="math">\[
col.contrib = \frac{col.mass * col.coord^2}{eigenvalue}
\]</span>
</div>
<p><br/></p>
<p>Contribution of rows in %</p>
<pre class="r"><code># contrib <- row.mass * row.coord^2/eigenvalue
cc <- apply(row.coord^2, 2, "*", row.mass)
row.contrib <- t(apply(cc, 1, "/", eig[1:nb.axes,1])) *100
round(row.contrib, 2)</code></pre>
<pre><code>           Dim.1 Dim.2 Dim.3
Laundry    18.29  5.56  7.97
Main_meal  12.39  4.74  1.86
Dinner      5.47  1.32  2.10
Breakfeast  3.82  3.70  3.07
Tidying     2.00  2.97  0.49
Dishes      0.43  2.84  3.63
Shopping    0.18  2.52  2.22
Official    0.52  0.80 36.94
Driving     8.08  7.65 18.60
Finances    0.88  5.56  0.06
Insurance   6.15  4.02  5.25
Repairs    40.73 15.88 16.60
Holidays    1.08 42.45  1.21</code></pre>
<pre class="r"><code>library("corrplot")
corrplot(row.contrib, is.cor = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-row-contribution-graph-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>Contribution of columns in %</p>
<pre class="r"><code># contrib <- col.mass * col.coord^2/eigenvalue
cc <- apply(col.coord^2, 2, "*", col.mass)
col.contrib <- t(apply(cc, 1, "/", eig[1:nb.axes,1])) *100
round(col.contrib, 2)</code></pre>
<pre><code>             Dim1  Dim2  Dim3
Wife        44.46 10.31 10.82
Alternating  0.10  2.78 82.55
Husband     54.23 17.79  6.13
Jointly      1.20 69.12  0.50</code></pre>
<pre class="r"><code>corrplot(col.contrib, is.cor = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-column-contrib-graph-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
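<p>A useful sanity check is that, on each axis, the contributions of the rows (and of the columns) sum to 100%. The sketch below verifies this on a made-up table; <code>tab</code> and <code>eig.values</code> are names introduced for the illustration, and <code>sweep()</code> is used in place of the nested <strong>apply</strong> calls.</p>
<pre class="r"><code># Toy contingency table (made-up counts, for illustration only)
tab <- matrix(c(20,  5,  5,
                 5, 20,  5,
                 5,  5, 30), nrow = 3, byrow = TRUE)
n <- sum(tab)
row.mass <- rowSums(tab)/n
col.mass <- colSums(tab)/n
expected <- outer(rowSums(tab), colSums(tab))/n
M <- (tab - expected)/sqrt(expected)/sqrt(n)
nb.axes <- min(nrow(tab), ncol(tab)) - 1
res <- svd(M, nu = nb.axes, nv = nb.axes)
sv <- res$d[1:nb.axes]
eig.values <- sv^2
row.coord <- sweep(res$u, 2, sv, "*")/sqrt(row.mass)
col.coord <- sweep(res$v, 2, sv, "*")/sqrt(col.mass)
# contrib = mass * coord^2 / eigenvalue, in percent
row.contrib <- 100*sweep(row.coord^2 * row.mass, 2, eig.values, "/")
col.contrib <- 100*sweep(col.coord^2 * col.mass, 2, eig.values, "/")
# Each axis sums to 100
colSums(row.contrib)
colSums(col.contrib)</code></pre>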
</div>
<div id="quality-of-the-representation" class="section level2">
<h2>Quality of the representation</h2>
<p>The quality of the representation is called <strong>COS2</strong>.</p>
<p>The quality of the representation of a row on an axis is:</p>
<br/>
<div class="block">
<span class="math">\[
row.cos2 = \frac{row.coord^2}{d^2}
\]</span>
</div>
<p><br/></p>
<ul>
<li>row.coord is the coordinate of the row on the axis</li>
<li><span class="math">\(d^2\)</span> is the squared distance from the average profile</li>
</ul>
<p>Recall that the distance between each row profile and the average row profile is:</p>
<br/>
<div class="block">
<span class="math">\[ 
  d^2(row_i, average.profile) = \sum{\frac{(row.profile_i - average.profile)^2}{average.profile}}
\]</span>
</div>
<p><br/></p>
<pre class="r"><code>row.profile <- housetasks/row.sum
head(round(row.profile, 3))</code></pre>
<pre><code>            Wife Alternating Husband Jointly
Laundry    0.886       0.080   0.011   0.023
Main_meal  0.810       0.131   0.033   0.026
Dinner     0.713       0.102   0.065   0.120
Breakfeast 0.586       0.257   0.107   0.050
Tidying    0.434       0.090   0.008   0.467
Dishes     0.283       0.212   0.035   0.469</code></pre>
<pre class="r"><code>average.profile <- col.sum/n
head(round(average.profile, 3))</code></pre>
<pre><code>       Wife Alternating     Husband     Jointly 
      0.344       0.146       0.218       0.292 </code></pre>
<p>The R code below computes the squared distance from the average profile for all the row variables:</p>
<pre class="r"><code>d2.row <- apply(row.profile, 1, 
                function(row.p, av.p){sum(((row.p - av.p)^2)/av.p)}, 
                average.profile)
head(round(d2.row,3))</code></pre>
<pre><code>   Laundry  Main_meal     Dinner Breakfeast    Tidying     Dishes 
     1.329      1.034      0.618      0.512      0.353      0.302 </code></pre>
<p>The cos2 of rows on the factor map are:</p>
<pre class="r"><code>row.cos2 <- apply(row.coord^2, 2, "/", d2.row)
round(row.cos2, 3)</code></pre>
<pre><code>           Dim.1 Dim.2 Dim.3
Laundry    0.740 0.185 0.075
Main_meal  0.742 0.232 0.026
Dinner     0.777 0.154 0.070
Breakfeast 0.505 0.400 0.095
Tidying    0.440 0.535 0.025
Dishes     0.118 0.646 0.236
Shopping   0.064 0.748 0.189
Official   0.053 0.066 0.881
Driving    0.432 0.335 0.233
Finances   0.161 0.837 0.003
Insurance  0.576 0.309 0.115
Repairs    0.707 0.226 0.067
Holidays   0.030 0.962 0.008</code></pre>
<p>Visualize the cos2:</p>
<pre class="r"><code>corrplot(row.cos2, is.cor = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-cos2-graph2-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
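<p>Two properties follow from the definition of cos2 and can be checked numerically: when all axes are kept, the cos2 of each row sum to 1, and the squared distance to the average profile equals the sum of the squared coordinates. The sketch below uses a made-up table (<code>tab</code> is an assumption for illustration).</p>
<pre class="r"><code># Toy contingency table (made-up counts, for illustration only)
tab <- matrix(c(20,  5,  5,
                 5, 20,  5,
                 5,  5, 30), nrow = 3, byrow = TRUE)
n <- sum(tab)
row.mass <- rowSums(tab)/n
col.mass <- colSums(tab)/n
expected <- outer(rowSums(tab), colSums(tab))/n
M <- (tab - expected)/sqrt(expected)/sqrt(n)
nb.axes <- min(nrow(tab), ncol(tab)) - 1
res <- svd(M, nu = nb.axes, nv = nb.axes)
row.coord <- sweep(res$u, 2, res$d[1:nb.axes], "*")/sqrt(row.mass)
# Squared chi-square distance of each row profile to the average profile
row.profile <- tab/rowSums(tab)
d2.row <- apply(row.profile, 1,
                function(p) sum((p - col.mass)^2/col.mass))
# cos2 = coord^2 / d2
row.cos2 <- row.coord^2/d2.row
rowSums(row.cos2)
all.equal(d2.row, rowSums(row.coord^2))</code></pre>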
</div>
<div id="cos2-of-columns" class="section level2">
<h2>Cos2 of columns</h2>
<br/>
<div class="block">
<span class="math">\[
col.cos2 = \frac{col.coord^2}{d^2}
\]</span>
</div>
<p><br/></p>
<pre class="r"><code>col.profile <- t(housetasks)/col.sum
col.profile <- t(col.profile)
#head(round(col.profile, 3))

average.profile <- row.sum/n
#head(round(average.profile, 3))</code></pre>
<p>The R code below computes the squared distance from the average profile for all the column variables:</p>
<pre class="r"><code>d2.col <- apply(col.profile, 2, 
        function(col.p, av.p){sum(((col.p - av.p)^2)/av.p)}, 
        average.profile)
#round(d2.col,3)</code></pre>
<p>The cos2 of columns on the factor map are:</p>
<pre class="r"><code>col.cos2 <- apply(col.coord^2, 2, "/", d2.col)
round(col.cos2, 3)</code></pre>
<pre><code>             Dim1  Dim2  Dim3
Wife        0.802 0.152 0.046
Alternating 0.005 0.105 0.890
Husband     0.772 0.208 0.020
Jointly     0.021 0.977 0.002</code></pre>
<p>Visualize the cos2:</p>
<pre class="r"><code>corrplot(col.cos2, is.cor = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-cos2-graph-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="supplementary-rowscolumns" class="section level2">
<h2>Supplementary rows/columns</h2>
<div id="the-supplementary-row-coordinates" class="section level3">
<h3>The supplementary row coordinates</h3>
<br/>
<div class="block">
<span class="math">\[
sup.row.coord = sup.row.profile * \frac{v}{\sqrt{col.mass}}
\]</span>
</div>
<p><br/></p>
<pre class="r"><code># Supplementary row
sup.row <- as.data.frame(housetasks["Dishes",, drop = FALSE])
# Supplementary row profile
sup.row.sum <- apply(sup.row, 1, sum)
sup.row.profile <- sweep(sup.row, 1, sup.row.sum, "/")
# V/sqrt(col.mass)
vv <- sweep(v, 1, sqrt(col.mass), FUN = "/")
# Supplementary row coord
sup.row.coord <- as.matrix(sup.row.profile) %*% vv
sup.row.coord</code></pre>
<pre><code>             [,1]      [,2]      [,3]
Dishes -0.1889641 0.4419662 0.2669493</code></pre>
<pre class="r"><code>## COS2 = coord^2 / squared distance from the average profile
# col.mass (col.sum/n) is the average row profile
sup.d2.row <- apply(sup.row.profile, 1, 
        function(row.p, av.p){sum(((row.p - av.p)^2)/av.p)}, 
        col.mass)
sup.row.cos2 <- sweep(sup.row.coord^2, 1, sup.d2.row, FUN = "/")</code></pre>
</div>
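<p>Supplementary columns can be handled by symmetry with the row formula, that is, sup.col.coord = sup.col.profile * u / sqrt(row.mass). This symmetric formula is not written out in the text above, so the sketch below should be read as an assumption checked on a made-up table (<code>tab</code>): an active column projected as if it were supplementary should land on its own coordinates.</p>
<pre class="r"><code># Toy contingency table (made-up counts, for illustration only)
tab <- matrix(c(20,  5,  5,
                 5, 20,  5,
                 5,  5, 30), nrow = 3, byrow = TRUE)
n <- sum(tab)
row.mass <- rowSums(tab)/n
col.mass <- colSums(tab)/n
expected <- outer(rowSums(tab), colSums(tab))/n
M <- (tab - expected)/sqrt(expected)/sqrt(n)
nb.axes <- min(nrow(tab), ncol(tab)) - 1
res <- svd(M, nu = nb.axes, nv = nb.axes)
sv <- res$d[1:nb.axes]
u <- res$u
v <- res$v
col.coord <- sweep(v, 2, sv, "*")/sqrt(col.mass)
# Treat the first active column as if it were supplementary
sup.col <- tab[, 1, drop = FALSE]
sup.col.profile <- sup.col/sum(sup.col)
# U/sqrt(row.mass)
uu <- u/sqrt(row.mass)
# Supplementary column coord
sup.col.coord <- t(sup.col.profile) %*% uu
sup.col.coord
all.equal(as.vector(sup.col.coord), col.coord[1, ])</code></pre>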
</div>
</div>
<div id="packages-in-r" class="section level1">
<h1>Packages in R</h1>
<p>There are many packages for CA:</p>
<ul>
<li>FactoMineR</li>
<li>ade4</li>
<li>ca</li>
</ul>
<pre class="r"><code>library(FactoMineR)
res.ca <- CA(housetasks, graph = FALSE)
# print
res.ca</code></pre>
<pre><code>**Results of the Correspondence Analysis (CA)**
The row variable has  13  categories; the column variable has 4 categories
The chi square of independence between the two variables is equal to 1944.456 (p-value =  0 ).
*The results are available in the following objects:

   name              description                   
1  "$eig"            "eigenvalues"                 
2  "$col"            "results for the columns"     
3  "$col$coord"      "coord. for the columns"      
4  "$col$cos2"       "cos2 for the columns"        
5  "$col$contrib"    "contributions of the columns"
6  "$row"            "results for the rows"        
7  "$row$coord"      "coord. for the rows"         
8  "$row$cos2"       "cos2 for the rows"           
9  "$row$contrib"    "contributions of the rows"   
10 "$call"           "summary called parameters"   
11 "$call$marge.col" "weights of the columns"      
12 "$call$marge.row" "weights of the rows"         </code></pre>
<pre class="r"><code># eigenvalue
head(res.ca$eig)[, 1:2]</code></pre>
<pre><code>        eigenvalue percentage of variance
dim 1 5.428893e-01           4.869222e+01
dim 2 4.450028e-01           3.991269e+01
dim 3 1.270484e-01           1.139509e+01
dim 4 5.119700e-33           4.591904e-31</code></pre>
<pre class="r"><code># barplot of percentage of variance
barplot(res.ca$eig[,2], names.arg = rownames(res.ca$eig))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-factominer-data-mining-1.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Plot row points
plot(res.ca, invisible ="col")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-factominer-data-mining-2.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Plot column points
plot(res.ca, invisible ="row")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-factominer-data-mining-3.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Biplot of rows and columns
plot(res.ca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/correspondence-analysis-basics-factominer-data-mining-4.png" title="Correspondence analysis basics - R software and data mining" alt="Correspondence analysis basics - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2), <strong>FactoMineR</strong> (ver. 1.29) and <strong>factoextra</strong> (ver. 1.0.2) </span></p>
<ul>
<li>Phillip M. Yelland. An introduction to correspondence analysis. Mathematica Journal. 2010. <a href="http://www.mathematica-journal.com/data/uploads/2010/09/Yelland.pdf">http://www.mathematica-journal.com/data/uploads/2010/09/Yelland.pdf</a></li>
<li>Ricco RAKOTOMALALA (in French). Analyse factorielle des correspondances. University Lyon 2. <a href="http://eric.univ-lyon2.fr/~ricco/cours/slides/AFC.pdf">http://eric.univ-lyon2.fr/~ricco/cours/slides/AFC.pdf</a></li>
<li>Bendixen M. A Practical Guide to the Use of Correspondence Analysis in Marketing Research. Marketing Bulletin. 2003, 14, Technical Note 2. <a href="http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf">http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf</a></li>
</ul>
</div>


<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Mon, 01 Jun 2015 00:01:30 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Principal component analysis in R : prcomp() vs. princomp() - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/principal-component-analysis-in-r-prcomp-vs-princomp-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/principal-component-analysis-in-r-prcomp-vs-princomp-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">


<div id="TOC">
<ul>
<li><a href="#packages-in-r-for-principal-component-analysis">Packages in R for principal component analysis</a></li>
<li><a href="#prcomp-and-princomp-functions">prcomp() and princomp() functions</a></li>
<li><a href="#install-factoextra-for-visualization">Install factoextra for visualization</a></li>
<li><a href="#prepare-the-data">Prepare the data</a></li>
<li><a href="#use-the-r-function-prcomp-for-pca">Use the R function prcomp() for PCA</a></li>
<li><a href="#variances-of-the-principal-components">Variances of the principal components</a></li>
<li><a href="#graph-of-variables-the-correlation-circle">Graph of variables : The correlation circle</a><ul>
<li><a href="#coordinates-of-variables-on-the-principal-components">Coordinates of variables on the principal components</a></li>
<li><a href="#graph-of-variables-using-r-base-graph">Graph of variables using R base graph</a></li>
<li><a href="#graph-of-variables-using-factoextra">Graph of variables using factoextra</a></li>
<li><a href="#cos2-quality-of-representation-for-variables-on-the-factor-map">Cos2 : quality of representation for variables on the factor map</a></li>
<li><a href="#contributions-of-the-variables-to-the-principal-components">Contributions of the variables to the principal components</a></li>
</ul></li>
<li><a href="#graph-of-individuals">Graph of individuals</a><ul>
<li><a href="#coordinates-of-individuals-on-the-principal-components">Coordinates of individuals on the principal components</a></li>
<li><a href="#cos2-quality-of-representation-for-individuals-on-the-principal-components">Cos2 : quality of representation for individuals on the principal components</a></li>
<li><a href="#contribution-of-individuals-to-the-princial-components">Contribution of individuals to the principal components</a></li>
<li><a href="#graph-of-individuals-base-graph">Graph of individuals : base graph</a></li>
<li><a href="#graph-of-individuals-factoextra">Graph of individuals : factoextra</a><ul>
<li><a href="#extract-the-results-for-the-individuals">Extract the results for the individuals</a></li>
<li><a href="#graph-of-individuals-using-factoextra">Graph of individuals using factoextra</a></li>
</ul></li>
</ul></li>
<li><a href="#prediction-using-principal-component-analysis">Prediction using Principal Component Analysis</a><ul>
<li><a href="#supplementary-quantitative-variables">Supplementary quantitative variables</a></li>
<li><a href="#supplementary-qualitative-variables">Supplementary qualitative variables</a></li>
<li><a href="#supplementary-individuals">Supplementary individuals</a></li>
<li><a href="#a-simple-function-to-predict-the-coordinates-of-new-individuals-data">A simple function to predict the coordinates of new individuals data</a></li>
<li><a href="#calculate-the-predicted-coordinates-by-hand">Calculate the predicted coordinates by hand</a></li>
<li><a href="#make-a-factor-map-including-the-supplementary-individuals-using-factoextra">Make a factor map including the supplementary individuals using factoextra</a></li>
</ul></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/>
The basics of <strong>Principal Component Analysis</strong> (<strong>PCA</strong>) have already been described in my previous article: <a href="https://www.sthda.com/english/english/wiki/principal-component-analysis-the-basics-you-should-read-r-software-and-data-mining">PCA basics</a>.</p>
<p>This <strong>R tutorial</strong> describes how to perform a <strong>Principal Component Analysis</strong> (<strong>PCA</strong>) using the built-in <strong>R</strong> functions <strong>prcomp()</strong> and <strong>princomp()</strong>.</p>
<p>You will learn how to :</p>
<ul>
<li>determine the number of components to retain for summarizing the information in your data</li>
<li>calculate the <strong>coordinates</strong>, the <strong>cos2</strong> and the <strong>contribution</strong> of variables</li>
<li>calculate the <strong>coordinates</strong>, the <strong>cos2</strong> and the <strong>contribution</strong> of individuals</li>
<li>interpret the correlation circle of PCA</li>
<li>make a prediction with PCA</li>
</ul>
<div id="packages-in-r-for-principal-component-analysis" class="section level1">
<h1>Packages in R for principal component analysis</h1>
<p>There are two general methods to perform PCA in R :</p>
<ul>
<li><em>Spectral decomposition</em> which examines the covariances / correlations between variables</li>
<li><em>Singular value decomposition</em> which examines the covariances / correlations between individuals</li>
</ul>
<p><span class="notice">The singular value decomposition method is the preferred analysis for numerical accuracy.</span></p>
<p>There are several functions from different packages for performing PCA :</p>
<ul>
<li>The functions <strong>prcomp()</strong> and <strong>princomp()</strong> from the built-in <strong>R stats</strong> package</li>
<li><strong>PCA()</strong> from <strong>FactoMineR</strong> package. Read more here : <a href="https://www.sthda.com/english/english/wiki/factominer-and-factoextra-principal-component-analysis-visualization-r-software-and-data-mining">PCA with FactoMineR</a></li>
<li><strong>dudi.pca()</strong> from <strong>ade4</strong> package. Read more here : <a href="https://www.sthda.com/english/english/wiki/ade4-and-factoextra-principal-component-analysis-r-software-and-data-mining">PCA with ade4</a></li>
</ul>
<p><span class="notice">The functions <strong>prcomp()</strong> and <strong>princomp()</strong> are described in the next section.</span></p>
</div>
<div id="prcomp-and-princomp-functions" class="section level1">
<h1>prcomp() and princomp() functions</h1>
<p>The function <strong>princomp()</strong> uses the <strong>spectral decomposition</strong> approach.</p>
<p>The functions <strong>prcomp()</strong> and <strong>PCA()</strong>[FactoMineR] use the <em>singular value decomposition</em> (SVD).</p>
<p><span class="warning">According to R help, SVD has slightly better numerical accuracy. Therefore, <em>prcomp()</em> is the preferred function.</span></p>
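<p>To make the comparison concrete, the sketch below runs both functions on the built-in <code>USArrests</code> data (chosen only for this illustration; it is not the data set used in this tutorial) and compares them with a direct spectral decomposition of the correlation matrix. Note that the argument of <em>prcomp()</em> is formally named <code>scale.</code>; writing <code>scale</code> also works thanks to R's partial argument matching.</p>
<pre class="r"><code># SVD-based PCA on centered and scaled data
res.prcomp <- prcomp(USArrests, scale. = TRUE)
# Spectral decomposition of the correlation matrix
res.princomp <- princomp(USArrests, cor = TRUE)
eig.cor <- eigen(cor(USArrests))$values
# Both give the same variances (eigenvalues) for the components
all.equal(res.prcomp$sdev^2, eig.cor)
all.equal(unname(res.princomp$sdev)^2, eig.cor)
# The loadings agree up to the sign of each component
all.equal(abs(unname(res.prcomp$rotation)),
          abs(unname(unclass(res.princomp$loadings))))</code></pre>
<p>On well-conditioned data such as this, the two approaches agree to within rounding error; the accuracy advantage of SVD matters mainly for ill-conditioned data.</p>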
<p>The simplified formats of these two functions are:</p>
<pre class="r"><code>prcomp(x, scale = FALSE)

princomp(x, cor = FALSE, scores = TRUE)</code></pre>
<br/>
<div class="block">
<ol style="list-style-type: decimal">
<li>Arguments for <em>prcomp()</em> :</li>
</ol>
<ul>
<li><strong>x</strong> : a numeric matrix or data frame</li>
<li><strong>scale</strong> : a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place</li>
</ul>
<ol start="2" style="list-style-type: decimal">
<li>Arguments for <em>princomp()</em> :</li>
</ol>
<ul>
<li><strong>x</strong> : a numeric matrix or data frame</li>
<li><strong>cor</strong> : a logical value. If TRUE, the analysis uses the correlation matrix, which amounts to centering and scaling the data before the analysis; if FALSE, the covariance matrix is used</li>
<li><strong>scores</strong> : a logical value. If TRUE, the coordinates on each principal component are calculated</li>
</ul>
</div>
<p><br/></p>
<p>The elements of the outputs returned by the functions <em>prcomp()</em> and <em>princomp()</em> include:</p>
<table>
<thead>
<tr class="header">
<th align="left">prcomp() name</th>
<th align="left">princomp() name</th>
<th align="left">Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">sdev</td>
<td align="left">sdev</td>
<td align="left">the standard deviations of the principal components</td>
</tr>
<tr class="even">
<td align="left">rotation</td>
<td align="left">loadings</td>
<td align="left">the matrix of variable loadings (columns are eigenvectors)</td>
</tr>
<tr class="odd">
<td align="left">center</td>
<td align="left">center</td>
<td align="left">the variable means (means that were subtracted)</td>
</tr>
<tr class="even">
<td align="left">scale</td>
<td align="left">scale</td>
<td align="left">the variable standard deviations (the scalings applied to each variable )</td>
</tr>
<tr class="odd">
<td align="left">x</td>
<td align="left">scores</td>
<td align="left">The coordinates of the individuals (observations) on the principal components.</td>
</tr>
</tbody>
</table>
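<p>The elements listed in the table can be inspected directly on the returned objects. The sketch below uses the built-in <code>USArrests</code> data, chosen only for this illustration, and also checks that the coordinates of the individuals returned by <em>prcomp()</em> are the centered and scaled data multiplied by the rotation matrix.</p>
<pre class="r"><code>res.prcomp <- prcomp(USArrests, scale. = TRUE)
res.princomp <- princomp(USArrests, cor = TRUE)
# Names of the elements returned by each function
names(res.prcomp)
names(res.princomp)
# Standard deviations and variances (eigenvalues) of the components
res.prcomp$sdev
res.prcomp$sdev^2
# Coordinates of the individuals: x for prcomp(), scores for princomp()
head(res.prcomp$x, 3)
head(res.princomp$scores, 3)
# x is the centered and scaled data multiplied by the rotation matrix
all.equal(res.prcomp$x, scale(USArrests) %*% res.prcomp$rotation,
          check.attributes = FALSE)</code></pre>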
<p><span class="success">In the following sections, we’ll focus only on the function <strong>prcomp()</strong></span></p>
</div>
<div id="install-factoextra-for-visualization" class="section level1">
<h1>Install factoextra for visualization</h1>
<p>The package <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> is used for the visualization of the <strong>principal component analysis</strong> results.</p>
<p><em>factoextra</em> can be installed and loaded as follows:</p>
<pre class="r"><code># install.packages("devtools")
devtools::install_github("kassambara/factoextra")

# load
library("factoextra")</code></pre>
</div>
<div id="prepare-the-data" class="section level1">
<h1>Prepare the data</h1>
<p>We’ll use the data set <em>decathlon2</em> from the package <strong>factoextra</strong>:</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)</code></pre>
<p><span class="warning">This data is a subset of <em>decathlon</em> data in <strong>FactoMineR</strong> package</span></p>
<p>As illustrated below, the data used here describes athletes’ performance during two sporting events (Decastar and OlympicG). It contains 27 individuals (athletes) described by 13 variables:</p>
<p><a href="https://www.sthda.com/english/sthda/RDoc/images/pca-decathlon-big.png" title="Click to zoom!"> <img src="https://www.sthda.com/english/sthda/RDoc/images/pca-decathlon.png" alt="principal component analysis data"/> </a></p>
<br/>

<div class="warning">
<p>Only some of these individuals and variables will be used to perform the principal component analysis (PCA).</p>
The coordinates of the remaining individuals and variables on the factor map will be <strong>predicted</strong> after the PCA.
</div>
<p><br/></p>
<p>In PCA terminology, our data contains :</p>
<br/>
<div class="block">
<ul>
<li><strong>Active individuals</strong> (in blue, rows 1:23) : Individuals that are used during the principal component analysis.</li>
<li><strong>Supplementary individuals</strong> (in green, rows 24:27) : The coordinates of these individuals will be predicted using the PCA information and parameters obtained with active individuals/variables</li>
<li><strong>Active variables</strong> (in pink, columns 1:10) : Variables that are used for the principal component analysis.</li>
<li><strong>Supplementary variables</strong> : As with supplementary individuals, the coordinates of these variables will also be predicted.</li>
<li><strong>Supplementary continuous variables</strong> : Columns 11 and 12 corresponding respectively to the rank and the points of athletes.</li>
<li><strong>Supplementary qualitative variables</strong> : Column 13 corresponding to the two athletic meetings (2004 Olympic Games or 2004 Decastar). This factor variable will be used to color individuals by groups.</li>
</ul>
</div>
<p><br/></p>
<p>Extract only active individuals and variables for principal component analysis:</p>
<pre class="r"><code>decathlon2.active <- decathlon2[1:23, 1:10]
head(decathlon2.active[, 1:6])</code></pre>
<pre><code>          X100m Long.jump Shot.put High.jump X400m X110m.hurdle
SEBRLE    11.04      7.58    14.83      2.07 49.81        14.69
CLAY      10.76      7.40    14.26      1.86 49.37        14.05
BERNARD   11.02      7.23    14.25      1.92 48.93        14.99
YURKOV    11.34      7.09    15.19      2.10 50.42        15.31
ZSIVOCZKY 11.13      7.30    13.48      2.01 48.62        14.17
McMULLEN  10.83      7.31    13.76      2.13 49.91        14.38</code></pre>
</div>
<div id="use-the-r-function-prcomp-for-pca" class="section level1">
<h1>Use the R function prcomp() for PCA</h1>
<pre class="r"><code>res.pca <- prcomp(decathlon2.active, scale = TRUE)</code></pre>
<p><strong>The values returned by the function prcomp()</strong> are :</p>
<pre class="r"><code>names(res.pca)</code></pre>
<pre><code>[1] "sdev"     "rotation" "center"   "scale"    "x"       </code></pre>
<ol style="list-style-type: decimal">
<li><strong>sdev</strong> : the standard deviations of the principal components (the square roots of the eigenvalues)</li>
</ol>
<pre class="r"><code>head(res.pca$sdev)</code></pre>
<pre><code>[1] 2.0308159 1.3559244 1.1131668 0.9052294 0.8375875 0.6502944</code></pre>
<ol start="2" style="list-style-type: decimal">
<li><strong>rotation</strong> : the matrix of variable loadings (columns are eigenvectors)</li>
</ol>
<pre class="r"><code>head(unclass(res.pca$rotation)[, 1:4])</code></pre>
<pre><code>                    PC1         PC2        PC3         PC4
X100m        -0.4188591  0.13230683 -0.2708996  0.03708806
Long.jump     0.3910648 -0.20713320  0.1711752 -0.12746997
Shot.put      0.3613881 -0.06298590 -0.4649778  0.14191803
High.jump     0.3004132  0.34309742 -0.2965280  0.15968342
X400m        -0.3454786 -0.21400770 -0.2547084  0.47592968
X110m.hurdle -0.3762651  0.01824645 -0.4032525 -0.01866477</code></pre>
<ol start="3" style="list-style-type: decimal">
<li><strong>center</strong>, <strong>scale</strong> : the centering and scaling used, or FALSE</li>
</ol>
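<p>To see how these elements fit together, here is a minimal sketch (the names <em>z</em> and <em>scores</em> are only illustrative) that rebuilds the coordinates of the individuals from <strong>center</strong>, <strong>scale</strong> and <strong>rotation</strong>, and compares them with the element <strong>x</strong> :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)

# Standardize the data with the stored center and scale vectors
z <- scale(decathlon2.active, center = res.pca$center,
           scale = res.pca$scale)

# Project the standardized data onto the eigenvectors (loadings)
scores <- z %*% res.pca$rotation

# Compare the manual projection with the scores stored in res.pca$x
all.equal(scores, res.pca$x, check.attributes = FALSE)</code></pre>
<p>The comparison should return TRUE, since prcomp() obtains the scores by the same projection of the standardized data.</p>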
</div>
<div id="variances-of-the-principal-components" class="section level1">
<h1>Variances of the principal components</h1>
<p><strong>The variance retained by each principal component</strong> can be obtained as follows :</p>
<pre class="r"><code># Eigenvalues
eig <- (res.pca$sdev)^2

# Variances in percentage
variance <- eig*100/sum(eig)

# Cumulative variances
cumvar <- cumsum(variance)

eig.decathlon2.active <- data.frame(eig = eig, variance = variance,
                     cumvariance = cumvar)
head(eig.decathlon2.active)</code></pre>
<pre><code>        eig  variance cumvariance
1 4.1242133 41.242133    41.24213
2 1.8385309 18.385309    59.62744
3 1.2391403 12.391403    72.01885
4 0.8194402  8.194402    80.21325
5 0.7015528  7.015528    87.22878
6 0.4228828  4.228828    91.45760</code></pre>
<p><span class="warning">Note that, you can use the function <strong>summary()</strong> to extract the eigenvalues and variances from an object of class <strong>prcomp</strong>.</span></p>
<pre class="r"><code>summary(res.pca)</code></pre>
<p>You can also use the package <strong>factoextra</strong>. It’s simple :</p>
<pre class="r"><code>library("factoextra")
eig.val <- get_eigenvalue(res.pca)
head(eig.val)</code></pre>
<pre><code>      eigenvalue variance.percent cumulative.variance.percent
Dim.1  4.1242133        41.242133                    41.24213
Dim.2  1.8385309        18.385309                    59.62744
Dim.3  1.2391403        12.391403                    72.01885
Dim.4  0.8194402         8.194402                    80.21325
Dim.5  0.7015528         7.015528                    87.22878
Dim.6  0.4228828         4.228828                    91.45760</code></pre>
<p><span class="question">What do <strong>eigenvalues</strong> mean?</span></p>
<p>Recall that <strong>eigenvalues</strong> measure the variability retained by each PC. They are large for the first PCs and small for the subsequent PCs.</p>
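<p>As a small sanity check (a sketch, with <em>eig.cor</em> as an illustrative name), the eigenvalues can be compared with those of the correlation matrix of the active variables. Because the data are standardized, the two should agree and the eigenvalues should add up to the number of variables :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)

# Eigenvalues from prcomp(): squared standard deviations of the PCs
eig <- res.pca$sdev^2

# Eigenvalues of the correlation matrix of the active variables
eig.cor <- eigen(cor(decathlon2.active))$values

# Both approaches give the same values (up to numerical tolerance)
all.equal(eig, eig.cor)

# The total variance of standardized data is the number of variables
sum(eig)</code></pre>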
<p>The importance of <strong>principal components</strong> (PCs) can be visualized with a <strong>scree plot</strong>.</p>
<p><strong>Scree plot using base graphics</strong> :</p>
<pre class="r"><code>barplot(eig.decathlon2.active[, 2], names.arg=1:nrow(eig.decathlon2.active), 
       main = "Variances",
       xlab = "Principal Components",
       ylab = "Percentage of variances",
       col ="steelblue")
# Add connected line segments to the plot
lines(x = 1:nrow(eig.decathlon2.active), 
      eig.decathlon2.active[, 2], 
      type="b", pch=19, col = "red")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-eigenvalue-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">About 60% of the information (variance) contained in the data is retained by the first two principal components.</span></p>
<p><strong>Scree plot using factoextra</strong> :</p>
<pre class="r"><code>fviz_screeplot(res.pca, ncp=10)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-variance-factoextra-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s also possible to visualize the eigenvalues instead of the variances :</p>
<pre class="r"><code>fviz_screeplot(res.pca, ncp=10, choice="eigenvalue")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-eigenvalue-factoextra-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>Read more about <a href="https://www.sthda.com/english/english/wiki/fviz-screeplot-eigenvalue-visualizations-principal-component-analysis-r-software-and-data-mining">fviz_screeplot</a>.</p>
<p><span class="question">How to determine the <strong>number of components to retain</strong>?</span></p>
<ul>
<li>An <strong>eigenvalue</strong> > 1 indicates that the corresponding PC accounts for more variance than a single original variable in standardized data. This is commonly used as a cutoff point for deciding which PCs to retain.</li>
<li>You can also limit the number of components to the number that accounts for a certain fraction of the total variance. For example, if you are satisfied with 80% of the total variance explained, then use the number of components needed to achieve that.</li>
</ul>
<p><span class="success">Note that a good dimension reduction is achieved when the first few PCs account for a large proportion of the variability (80-90%).</span></p>
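<p>The two rules above can be sketched in a few lines of R. The names <em>n.kaiser</em> and <em>n.80</em> are only illustrative, and the 80% threshold is the example value mentioned above, not a universal recommendation :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)

eig <- res.pca$sdev^2
cumvar <- cumsum(eig) * 100 / sum(eig)

# Rule 1: keep the components whose eigenvalue exceeds 1
n.kaiser <- sum(eig > 1)

# Rule 2: keep the smallest number of components reaching 80%
# of the total variance
n.80 <- which(cumvar >= 80)[1]

c(eigenvalue.rule = n.kaiser, variance.rule = n.80)</code></pre>
<p>With the eigenvalues printed earlier in this section, the first rule keeps 3 components and the second one keeps 4. The two rules do not have to agree, so the final choice remains a judgment call.</p>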
</div>
<div id="graph-of-variables-the-correlation-circle" class="section level1">
<h1>Graph of variables : The correlation circle</h1>
<p>A simple method to extract the results, for variables, from a <em>PCA</em> output is to use the function <strong>get_pca_var()</strong> [<em>factoextra</em>]. This function provides a list of matrices containing all the results for the active variables (coordinates, correlation between variables and axes, squared cosine and contributions)</p>
<pre class="r"><code>var <- get_pca_var(res.pca)
var</code></pre>
<pre><code>Principal Component Analysis Results for variables
 ===================================================
  Name       Description                                    
1 "$coord"   "Coordinates for the variables"                
2 "$cor"     "Correlations between variables and dimensions"
3 "$cos2"    "Cos2 for the variables"                       
4 "$contrib" "contributions of the variables"               </code></pre>
<pre class="r"><code># Coordinates of variables
var$coord[, 1:4]</code></pre>
<pre><code>                    Dim.1       Dim.2       Dim.3       Dim.4
X100m        -0.850625692  0.17939806 -0.30155643  0.03357320
Long.jump     0.794180641 -0.28085695  0.19054653 -0.11538956
Shot.put      0.733912733 -0.08540412 -0.51759781  0.12846837
High.jump     0.610083985  0.46521415 -0.33008517  0.14455012
X400m        -0.701603377 -0.29017826 -0.28353292  0.43082552
X110m.hurdle -0.764125197  0.02474081 -0.44888733 -0.01689589
Discus        0.743209016 -0.04966086 -0.17652518  0.39500915
Pole.vault   -0.217268042 -0.80745110 -0.09405773 -0.33898477
Javeline      0.428226639 -0.38610928 -0.60412432 -0.33173454
X1500m        0.004278487 -0.78448019  0.21947068  0.44800961</code></pre>
<p><span class="success">In this section I’ll show you, step by step, how to calculate the <strong>coordinates</strong>, the <strong>cos2</strong> and the <strong>contribution</strong> of variables.</span></p>
<div id="coordinates-of-variables-on-the-principal-components" class="section level2">
<h2>Coordinates of variables on the principal components</h2>
<p>The <strong>correlation between variables and principal components</strong> is used as coordinates. It can be calculated as follows :</p>
<p><span class="warning">Variable correlations with PCs = loadings * the component standard deviations. </span></p>
<pre class="r"><code># Helper function : 
# Correlation between variables and principal components
var_cor_func <- function(var.loadings, comp.sdev){
  var.loadings*comp.sdev
  }

# Variable correlation/coordinates
loadings <- res.pca$rotation
sdev <- res.pca$sdev

var.coord <- var.cor <- t(apply(loadings, 1, var_cor_func, sdev))
head(var.coord[, 1:4])</code></pre>
<pre><code>                    PC1         PC2        PC3         PC4
X100m        -0.8506257  0.17939806 -0.3015564  0.03357320
Long.jump     0.7941806 -0.28085695  0.1905465 -0.11538956
Shot.put      0.7339127 -0.08540412 -0.5175978  0.12846837
High.jump     0.6100840  0.46521415 -0.3300852  0.14455012
X400m        -0.7016034 -0.29017826 -0.2835329  0.43082552
X110m.hurdle -0.7641252  0.02474081 -0.4488873 -0.01689589</code></pre>
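<p>The same multiplication can also be written without a helper function. The sketch below uses the base R function <strong>sweep()</strong> to multiply each column of the loadings by the standard deviation of the corresponding component (<em>var.coord2</em> is an illustrative name), and checks that the result matches the helper-based computation :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)

# Helper-based version, as in the code above
var_cor_func <- function(var.loadings, comp.sdev){
  var.loadings*comp.sdev
}
var.coord <- t(apply(res.pca$rotation, 1, var_cor_func, res.pca$sdev))

# Vectorized version: multiply column j of the loadings by sdev[j]
var.coord2 <- sweep(res.pca$rotation, 2, res.pca$sdev, FUN = "*")

# Both versions give the same coordinates
all.equal(var.coord2, var.coord, check.attributes = FALSE)</code></pre>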
</div>
<div id="graph-of-variables-using-r-base-graph" class="section level2">
<h2>Graph of variables using R base graph</h2>
<pre class="r"><code># Plot the correlation circle
a <- seq(0, 2*pi, length = 100)
plot( cos(a), sin(a), type = "l", col="gray",
      xlab = "PC1",  ylab = "PC2")

abline(h = 0, v = 0, lty = 2)

# Add active variables
arrows(0, 0, var.coord[, 1], var.coord[, 2], 
      length = 0.1, angle = 15, code = 2)

# Add labels
text(var.coord, labels=rownames(var.coord), cex = 1, adj=1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-correlation-circle-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="graph-of-variables-using-factoextra" class="section level2">
<h2>Graph of variables using factoextra</h2>
<pre class="r"><code>fviz_pca_var(res.pca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-correlation-circle-factoextra-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>Read more about the function <strong>fviz_pca_var()</strong> : <a href="https://www.sthda.com/english/english/wiki/fviz-pca-var-graph-of-variables-principal-component-analysis-r-software-and-data-mining">Graph of variables - Principal Component Analysis</a></p>
<p><span class="question"> How to interpret the correlation plot?</span></p>
<p>The graph of variables shows the relationships between all variables :</p>
<ul>
<li><strong>Positively correlated</strong> variables are grouped together.</li>
<li><strong>Negatively correlated</strong> variables are positioned on opposite sides of the plot origin (opposed quadrants).
</li>
<li><strong>The distance between variables and the origin</strong> measures the quality of the variables on the factor map. Variables that are away from the origin are well represented on the factor map.</li>
</ul>
</div>
<div id="cos2-quality-of-representation-for-variables-on-the-factor-map" class="section level2">
<h2>Cos2 : quality of representation for variables on the factor map</h2>
<p><span class="success">The cos2 of variables are calculated as the squared coordinates : var.cos2 = var.coord * var.coord </span></p>
<pre class="r"><code>var.cos2 <- var.coord^2
head(var.cos2[, 1:4])</code></pre>
<pre><code>                   PC1          PC2        PC3          PC4
X100m        0.7235641 0.0321836641 0.09093628 0.0011271597
Long.jump    0.6307229 0.0788806285 0.03630798 0.0133147506
Shot.put     0.5386279 0.0072938636 0.26790749 0.0165041211
High.jump    0.3722025 0.2164242070 0.10895622 0.0208947375
X400m        0.4922473 0.0842034209 0.08039091 0.1856106269
X110m.hurdle 0.5838873 0.0006121077 0.20149984 0.0002854712</code></pre>
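<p>Two properties of the cos2 are worth checking, as a sketch : over all 10 components the cos2 of a variable add up to 1, and the sum over the first two components is the squared length of its arrow in the correlation circle. The coordinates are recomputed here with <strong>sweep()</strong> so that the code runs on its own :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)

# Coordinates (loadings * component standard deviations) and cos2
var.coord <- sweep(res.pca$rotation, 2, res.pca$sdev, FUN = "*")
var.cos2 <- var.coord^2

# Over all the components, the cos2 of each variable add up to 1
rowSums(var.cos2)

# Quality of representation on the PC1-PC2 factor map,
# from the best to the worst represented variable
sort(rowSums(var.cos2[, 1:2]), decreasing = TRUE)</code></pre>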
<p><span class="warning">Using <strong>factoextra</strong> package, the color of variables can be automatically controlled by the value of their cos2.</span></p>
<pre class="r"><code>fviz_pca_var(res.pca, col.var="cos2") +
scale_color_gradient2(low="white", mid="blue", 
      high="red", midpoint=0.5) + theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-correlation-circle-colors-factoextra-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
</div>
<div id="contributions-of-the-variables-to-the-principal-components" class="section level2">
<h2>Contributions of the variables to the principal components</h2>
<p><span class="success">The contribution of a variable to a given principal component is (in percentage) : (var.cos2 * 100) / (total cos2 of the component)</span></p>
<pre class="r"><code>comp.cos2 <- apply(var.cos2, 2, sum)

contrib <- function(var.cos2, comp.cos2){var.cos2*100/comp.cos2}

var.contrib <- t(apply(var.cos2,1, contrib, comp.cos2))
head(var.contrib[, 1:4])</code></pre>
<pre><code>                   PC1        PC2       PC3         PC4
X100m        17.544293  1.7505098  7.338659  0.13755240
Long.jump    15.293168  4.2904162  2.930094  1.62485936
Shot.put     13.060137  0.3967224 21.620432  2.01407269
High.jump     9.024811 11.7715838  8.792888  2.54987951
X400m        11.935544  4.5799296  6.487636 22.65090599
X110m.hurdle 14.157544  0.0332933 16.261261  0.03483735</code></pre>
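<p>As a quick check of this formula (a sketch, recomputing the cos2 so that the code runs on its own), the contributions of each component should add up to 100. Since the total cos2 of a component is its eigenvalue, the contributions are also simply the squared loadings expressed in percent :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)

var.coord <- sweep(res.pca$rotation, 2, res.pca$sdev, FUN = "*")
var.cos2 <- var.coord^2

# Divide each column by its total, then convert to percentages
var.contrib <- sweep(var.cos2, 2, colSums(var.cos2), FUN = "/") * 100

# The contributions of each component add up to 100
colSums(var.contrib)

# Same values as the squared loadings, in percent
all.equal(var.contrib, 100 * res.pca$rotation^2)</code></pre>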
<p>Highlight the most important (i.e., contributing) variables :</p>
<pre class="r"><code>fviz_pca_var(res.pca, col.var="contrib") +
scale_color_gradient2(low="white", mid="blue", 
      high="red", midpoint=50) + theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-variable-contribution-factoextra-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>You can also use the function <strong>fviz_contrib()</strong> described here : <a href="https://www.sthda.com/english/english/wiki/principal-component-analysis-how-to-reveal-the-most-important-variables-in-your-data-r-software-and-data-mining">Principal Component Analysis: How to reveal the most important variables in your data?</a></p>
</div>
</div>
<div id="graph-of-individuals" class="section level1">
<h1>Graph of individuals</h1>
<div id="coordinates-of-individuals-on-the-principal-components" class="section level2">
<h2>Coordinates of individuals on the principal components</h2>
<pre class="r"><code>ind.coord <- res.pca$x
head(ind.coord[, 1:4])</code></pre>
<pre><code>                 PC1        PC2        PC3         PC4
SEBRLE     0.1912074 -1.5541282 -0.6283688  0.08205241
CLAY       0.7901217 -2.4204156  1.3568870  1.26984296
BERNARD   -1.3292592 -1.6118687 -0.1961500 -1.92092203
YURKOV    -0.8694134  0.4328779 -2.4739822  0.69723814
ZSIVOCZKY -0.1057450  2.0233632  1.3049312 -0.09929630
McMULLEN   0.1185550  0.9916237  0.8435582  1.31215266</code></pre>
</div>
<div id="cos2-quality-of-representation-for-individuals-on-the-principal-components" class="section level2">
<h2>Cos2 : quality of representation for individuals on the principal components</h2>
<p>To calculate the cos2 of individuals, 2 simple steps are required :</p>
<ol style="list-style-type: decimal">
<li>Calculate the squared distance between each individual and the PCA center of gravity</li>
</ol>
<ul>
<li>d2 = [(var1_ind_i - mean_var1)/sd_var1]^2 + … + [(var10_ind_i - mean_var10)/sd_var10]^2</li>
</ul>
<ol start="2" style="list-style-type: decimal">
<li>Calculate the cos2 = ind.coord^2/d2</li>
</ol>
<pre class="r"><code># Compute the square of the distance between an individual and the
# center of gravity
center <- res.pca$center
scale<- res.pca$scale
getdistance <- function(ind_row, center, scale){
  return(sum(((ind_row-center)/scale)^2))
  }
d2 <- apply(decathlon2.active,1,getdistance, center, scale)

# Compute the cos2
cos2 <- function(ind.coord, d2){return(ind.coord^2/d2)}
ind.cos2 <- apply(ind.coord, 2, cos2, d2)
head(ind.cos2[, 1:4])</code></pre>
<pre><code>                  PC1        PC2         PC3         PC4
SEBRLE    0.007530179 0.49747323 0.081325232 0.001386688
CLAY      0.048701249 0.45701660 0.143628117 0.125791741
BERNARD   0.197199804 0.28996555 0.004294015 0.411819183
YURKOV    0.096109800 0.02382571 0.778230322 0.061812637
ZSIVOCZKY 0.001574385 0.57641944 0.239754152 0.001388216
McMULLEN  0.002175437 0.15219499 0.110137872 0.266486530</code></pre>
<p><span class="success">The sum of each row is 1, if we consider the 10 components</span></p>
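<p>A shorter, vectorized sketch of the same computation is given below. It relies on the fact that, when all the components are kept, the squared distance to the center of gravity equals the sum of the squared coordinates of the individual (<em>d2.data</em> is an illustrative name used for the comparison) :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)
ind.coord <- res.pca$x

# Squared distance to the center of gravity, from the coordinates
d2 <- rowSums(ind.coord^2)

# Same distance computed from the standardized data
d2.data <- rowSums(scale(decathlon2.active)^2)
all.equal(d2, d2.data, check.attributes = FALSE)

# Dividing a matrix by a vector of length nrow() works row by row
ind.cos2 <- ind.coord^2 / d2

# Over the 10 components, the cos2 of each individual add up to 1
rowSums(ind.cos2)</code></pre>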
</div>
<div id="contribution-of-individuals-to-the-princial-components" class="section level2">
<h2>Contribution of individuals to the principal components</h2>
<p>The contribution of individuals (in percentage) to the principal components can be computed as follows :</p>
<p>100 * (1 / number_of_individuals)*(ind.coord^2 / comp_sdev^2)</p>
<pre class="r"><code># Contributions of individuals
contrib <- function(ind.coord, comp.sdev, n.ind){
  100*(1/n.ind)*ind.coord^2/comp.sdev^2
}

ind.contrib <- t(apply(ind.coord,1, contrib, 
                       res.pca$sdev, nrow(ind.coord)))
head(ind.contrib[, 1:4])</code></pre>
<pre><code>                 PC1        PC2        PC3         PC4
SEBRLE    0.03854254  5.7118249  1.3854184  0.03572215
CLAY      0.65814114 13.8541889  6.4600973  8.55568792
BERNARD   1.86273218  6.1441319  0.1349983 19.57827284
YURKOV    0.79686310  0.4431309 21.4755770  2.57939100
ZSIVOCZKY 0.01178829  9.6816398  5.9748485  0.05231437
McMULLEN  0.01481737  2.3253860  2.4967890  9.13531719</code></pre>
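<p><span class="warning">Note that, because <strong>prcomp()</strong> computes variances with the divisor n - 1, the contributions computed this way sum to 100 * (n - 1) / n per column (about 95.65 for n = 23) rather than exactly 100</span></p>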
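<p>The column totals can be checked with the sketch below. With <strong>prcomp()</strong>, the sum of the squared coordinates of a component is (n - 1) times its eigenvalue, so the formula above gives totals of 100 * (n - 1) / n. If totals of exactly 100 are preferred, each column can be rescaled by its own sum (<em>ind.contrib.100</em> is an illustrative name) :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)
ind.coord <- res.pca$x
n <- nrow(ind.coord)

# Contributions as defined above: 100 * (1/n) * coord^2 / eigenvalue
ind.contrib <- sweep(ind.coord^2, 2, res.pca$sdev^2, FUN = "/") * 100 / n

# Column totals: 100 * (n - 1) / n for every component
colSums(ind.contrib)

# Rescale each column by its own total to get exactly 100
ind.contrib.100 <- sweep(ind.contrib, 2, colSums(ind.contrib),
                         FUN = "/") * 100
colSums(ind.contrib.100)</code></pre>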
</div>
<div id="graph-of-individuals-base-graph" class="section level2">
<h2>Graph of individuals : base graph</h2>
<pre class="r"><code>plot(ind.coord[,1], ind.coord[,2], pch = 19,  
     xlab="PC1 - 41.2%",ylab="PC2 - 18.4%")
abline(h=0, v=0, lty = 2)
text(ind.coord[,1], ind.coord[,2], labels=rownames(ind.coord),
        cex=0.7, pos = 3)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-individuals-graph-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>Biplot of individuals and variables :</p>
<pre class="r"><code>biplot(res.pca, cex = 0.8, col = c("black", "red") )</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-biplot-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="graph-of-individuals-factoextra" class="section level2">
<h2>Graph of individuals : factoextra</h2>
<div id="extract-the-results-for-the-individuals" class="section level3">
<h3>Extract the results for the individuals</h3>
<p><span class="success"> factoextra provides, with less code, a list of matrices containing all the results for the active individuals (coordinates, squared cosine, contributions). </span></p>
<pre class="r"><code>ind <- get_pca_ind(res.pca)
ind</code></pre>
<pre><code>Principal Component Analysis Results for individuals
 ===================================================
  Name       Description                       
1 "$coord"   "Coordinates for the individuals" 
2 "$cos2"    "Cos2 for the individuals"        
3 "$contrib" "contributions of the individuals"</code></pre>
<pre class="r"><code># Coordinates for individuals
head(ind$coord[, 1:4])</code></pre>
<pre><code>               Dim.1      Dim.2      Dim.3       Dim.4
SEBRLE     0.1912074 -1.5541282 -0.6283688  0.08205241
CLAY       0.7901217 -2.4204156  1.3568870  1.26984296
BERNARD   -1.3292592 -1.6118687 -0.1961500 -1.92092203
YURKOV    -0.8694134  0.4328779 -2.4739822  0.69723814
ZSIVOCZKY -0.1057450  2.0233632  1.3049312 -0.09929630
McMULLEN   0.1185550  0.9916237  0.8435582  1.31215266</code></pre>
</div>
<div id="graph-of-individuals-using-factoextra" class="section level3">
<h3>Graph of individuals using factoextra</h3>
<br/>
<div class="warning">
<p>Note that, in the R code below, the argument <em>data</em> is required only when <em>res.pca</em> is an object of class <em>princomp</em> or <em>prcomp</em> (two functions from the built-in <strong>R</strong> <strong>stats</strong> package).</p>
<p>In other words, if <em>res.pca</em> is a result of PCA functions from <em>FactoMineR</em> or <strong>ade4</strong> package, the argument <em>data</em> can be omitted.</p>
Yes, <strong>factoextra</strong> can also handle the output of <strong>FactoMineR</strong> and <strong>ade4</strong> packages.
</div>
<p><br/></p>
<p><strong>Default individuals factor map</strong> :</p>
<pre class="r"><code>fviz_pca_ind(res.pca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-individuals-graph-factoextra-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><strong>Control automatically the color of individuals</strong> using the cos2 values (the quality of the individuals on the factor map) :</p>
<pre class="r"><code>fviz_pca_ind(res.pca, col.ind="cos2") +
scale_color_gradient2(low="white", mid="blue", 
    high="red", midpoint=0.50) + theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-individuals-graph-color-factoextra-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>Read more about fviz_pca_ind() : <a href="https://www.sthda.com/english/english/wiki/fviz-pca-ind-graph-of-individuals-principal-component-analysis-r-software-and-data-mining">Graph of individuals - principal component analysis</a></p>
<p><strong>Make a biplot of individuals and variables</strong> :</p>
<pre class="r"><code>fviz_pca_biplot(res.pca,  geom = "text") +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-biplot-factoextra-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>Read more about <strong>fviz_pca_biplot()</strong> : <a href="https://www.sthda.com/english/english/wiki/fviz-pca-biplot-biplot-of-individuals-and-variables-principal-component-analysis-r-software-and-data-mining">Biplot of individuals and variables - principal component analysis</a></p>
</div>
</div>
</div>
<div id="prediction-using-principal-component-analysis" class="section level1">
<h1>Prediction using Principal Component Analysis</h1>
<div id="supplementary-quantitative-variables" class="section level2">
<h2>Supplementary quantitative variables</h2>
<p><span class="warning">As described above, the data set <em>decathlon2</em> contains some <strong>supplementary continuous variables</strong> at columns 11 and 12 corresponding respectively to the rank and the points of athletes.</span></p>
<pre class="r"><code># Data for the supplementary quantitative variables
quanti.sup <- decathlon2[1:23, 11:12, drop = FALSE]
head(quanti.sup)</code></pre>
<pre><code>          Rank Points
SEBRLE       1   8217
CLAY         2   8122
BERNARD      4   8067
YURKOV       5   8036
ZSIVOCZKY    7   8004
McMULLEN     8   7995</code></pre>
<p><span class="notice">Recall that rows 24:27 are supplementary individuals. We don’t want them in the current analysis, which is why I extracted only rows 1:23. </span></p>
<p><span class="success">In this section we’ll see how to calculate the predicted coordinates of these two variables using the information provided by the previously performed principal component analysis.</span></p>
<p><strong>Two simple steps are required</strong> :</p>
<ol style="list-style-type: decimal">
<li><strong>Calculate the correlation</strong> between each supplementary quantitative variable and the principal components</li>
<li><strong>Make a factor map</strong> of all variables (active and supplementary ones) to visualize the position of the supplementary variables</li>
</ol>
<p>The <strong>R code</strong> below can be used :</p>
<pre class="r"><code># Calculate the correlations between supplementary variables
# and the principal components
ind.coord <- res.pca$x
quanti.coord <- cor(quanti.sup, ind.coord)
head(quanti.coord[, 1:4])</code></pre>
<pre><code>              PC1         PC2        PC3         PC4
Rank   -0.7014777  0.24519443  0.1834294  0.05575186
Points  0.9637075 -0.07768262 -0.1580225 -0.16623092</code></pre>
<pre class="r"><code># Variable factor maps
#++++++++++++++++++
# Plot the correlation circle
a <- seq(0, 2*pi, length = 100)
plot( cos(a), sin(a), type = "l", col="gray",
      xlab = "PC1",  ylab = "PC2")
abline(h = 0, v = 0, lty = 2)
# Add active variables
var.coord <- get_pca_var(res.pca)$coord
arrows(0 ,0, x1=var.coord[,1], y1 = var.coord[,2], 
       col="black", length = 0.09)
text(var.coord[,1], var.coord[,2],
     labels=rownames(var.coord), cex=0.8)
# Add supplementary quantitative variables
arrows(0 ,0, x1= quanti.coord[,1], y1 = quanti.coord[,2], 
       col="blue", lty =2, length = 0.09)
text(quanti.coord[,1], quanti.coord[,2],
     labels=rownames(quanti.coord), cex=0.8, col ="blue")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-supplementary-quantitative-variables-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><strong>It’s also possible to make the graph of variables using factoextra:</strong></p>
<pre class="r"><code># Plot of active variables
p <- fviz_pca_var(res.pca)
# Add supplementary quantitative variables
fviz_add(p, quanti.coord, color ="blue", geom="arrow")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-quantitative-supplementary-variable-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># get the cos2 of the supplementary quantitative variables
(quanti.coord^2)[, 1:4]</code></pre>
<pre><code>             PC1         PC2        PC3        PC4
Rank   0.4920710 0.060120310 0.03364635 0.00310827
Points 0.9287322 0.006034589 0.02497110 0.02763272</code></pre>
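<p>Why are correlations a sensible way to place supplementary variables? As a sketch, the same <strong>cor()</strong> call applied to the active variables reproduces their coordinates (loadings multiplied by the component standard deviations), so supplementary and active variables end up on the same scale. The names <em>active.cor</em> and <em>active.coord</em> are only illustrative :</p>
<pre class="r"><code>library("factoextra")
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- prcomp(decathlon2.active, scale = TRUE)
ind.coord <- res.pca$x

# Correlations between the active variables and the components
active.cor <- cor(decathlon2.active, ind.coord)

# Coordinates of the active variables
active.coord <- sweep(res.pca$rotation, 2, res.pca$sdev, FUN = "*")

# Both matrices contain the same values
all.equal(active.cor, active.coord)</code></pre>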
</div>
<div id="supplementary-qualitative-variables" class="section level2">
<h2>Supplementary qualitative variables</h2>
<p>The data set <em>decathlon2</em> contains a <strong>supplementary qualitative variable</strong> at column 13 corresponding to the type of competition.</p>
<p>Qualitative variables can be helpful for interpreting the data and for coloring individuals by groups :</p>
<pre class="r"><code># Data for the supplementary qualitative variables
quali.sup <- as.factor(decathlon2[1:23, 13])
head(quali.sup)</code></pre>
<pre><code>[1] Decastar Decastar Decastar Decastar Decastar Decastar
Levels: Decastar OlympicG</code></pre>
<p><strong>Color individuals by groups</strong> :</p>
<pre class="r"><code>fviz_pca_ind(res.pca, 
  habillage = quali.sup, addEllipses = TRUE, ellipse.level = 0.68) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-supplementary-qualitative-variables-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success">Note that, the argument <em>habillage</em> is used to specify the variable containing the groups of individuals</span></p>
<p>It’s very easy to get the coordinates for the levels of a supplementary qualitative variables. The helper function below can be used :</p>
<pre class="r"><code># Return the coordinates of the group levels
# x : coordinates of individuals on the x axis
# y : coordinates of individuals on the y axis
# groups : factor giving the group of each individual
get_coord_quali <- function(x, y, groups){
  data.frame(
    x = tapply(x, groups, mean),
    y = tapply(y, groups, mean)
  )
}</code></pre>
<p>Calculate the coordinates on components 1 and 2 :</p>
<pre class="r"><code>coord.quali <- get_coord_quali(ind.coord[,1], ind.coord[,2],
                               groups = quali.sup)
coord.quali</code></pre>
<pre><code>                 x          y
Decastar -1.313921 -0.1191322
OlympicG  1.204428  0.1092046</code></pre>
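<p>The same idea can be sketched in a self-contained way. The example below is only an illustration: it uses the built-in <em>iris</em> data rather than <em>decathlon2</em>, and the object names <em>pca.iris</em>, <em>scores</em> and <em>centroids</em> are assumptions introduced here.</p>
<pre class="r"><code># Illustrative sketch on iris (not the decathlon2 data)
pca.iris <- prcomp(iris[, -5], scale. = TRUE)
scores <- pca.iris$x[, 1:2]
# Coordinates of each level = mean coordinates of the individuals in that group
centroids <- aggregate(scores, by = list(Species = iris$Species), FUN = mean)
centroids</code></pre>
<p>Each row of <em>centroids</em> gives the position of one level (here, one species) on the first two components.</p>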
</div>
<div id="supplementary-individuals" class="section level2">
<h2>Supplementary individuals</h2>
<p>The data set <em>decathlon2</em> contains some <strong>supplementary individuals</strong> in rows 24 to 27.</p>
<pre class="r"><code># Data for the supplementary individuals
ind.sup <- decathlon2[24:27, 1:10, drop = FALSE]
ind.sup[, 1:6]</code></pre>
<pre><code>        X100m Long.jump Shot.put High.jump X400m X110m.hurdle
KARPOV  11.02      7.30    14.77      2.04 48.37        14.09
WARNERS 11.11      7.60    14.31      1.98 48.68        14.23
Nool    10.80      7.53    14.26      1.88 48.81        14.80
Drews   10.87      7.38    13.07      1.88 48.51        14.01</code></pre>
<p><span class="notice">Remember that columns 11:13 are supplementary variables. We don’t want them in the current analysis, which is why I extracted only columns 1:10. I also used the argument <strong>drop = FALSE</strong> to preserve the type of the data (a data.frame).</span></p>
<p><span class="success">In this section we’ll see how to predict the coordinates of the supplementary individuals using only the information provided by the previously performed principal component analysis.</span></p>
</div>
<div id="a-simple-function-to-predict-the-coordinates-of-new-individuals-data" class="section level2">
<h2>A simple function to predict the coordinates of new individuals data</h2>
<p>One simple approach is to use the function <strong>predict()</strong> from the built-in R <strong>stats</strong> package :</p>
<pre class="r"><code>ind.sup.coord <- predict(res.pca, newdata = ind.sup)
ind.sup.coord[, 1:4]</code></pre>
<pre><code>               PC1         PC2       PC3        PC4
KARPOV   0.7772521 -0.76237804 1.5971253  1.6863286
WARNERS -0.3779697  0.11891968 1.7005146 -0.6908084
Nool    -0.5468405 -1.93402211 0.4724184 -2.2283706
Drews   -1.0848227 -0.01703198 2.9818031 -1.5006207</code></pre>
</div>
<div id="calculate-the-predicted-coordinates-by-hand" class="section level2">
<h2>Calculate the predicted coordinates by hand</h2>
<p><strong>Two simple steps are required</strong> :</p>
<ol style="list-style-type: decimal">
<li>Center and scale the values for the supplementary individuals using the center and the scale of the PCA</li>
<li>Calculate the predicted coordinates by multiplying the scaled values with the eigenvectors (loadings) of the principal components.</li>
</ol>
<p>The <strong>R code</strong> below can be used :</p>
<pre class="r"><code># Centering and scaling the supplementary individuals
scale_func <- function(ind_row, center, scale){
  (ind_row-center)/scale
}

ind.scaled <- t(apply(ind.sup, 1, scale_func, res.pca$center, res.pca$scale))

# Coordinates of the individuals
pca.loadings <- res.pca$rotation
coord_func <- function(ind, loadings){
  r <- loadings*ind
  r <- apply(r, 2, sum)
  r
}

ind.sup.coord <- t(apply(ind.scaled, 1, coord_func, pca.loadings ))
ind.sup.coord[, 1:4]</code></pre>
<pre><code>               PC1         PC2       PC3        PC4
KARPOV   0.7772521 -0.76237804 1.5971253  1.6863286
WARNERS -0.3779697  0.11891968 1.7005146 -0.6908084
Nool    -0.5468405 -1.93402211 0.4724184 -2.2283706
Drews   -1.0848227 -0.01703198 2.9818031 -1.5006207</code></pre>
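<p>The two steps above can also be written in matrix form. The following is a minimal sketch on the built-in <em>iris</em> data, where rows 1:140 are treated as active and rows 141:150 as supplementary (an arbitrary choice made only for this illustration). The object names are hypothetical. It checks that the manual calculation agrees with <strong>predict()</strong>.</p>
<pre class="r"><code># Illustrative sketch on iris: rows 1:140 are "active", rows 141:150 "supplementary"
pca.train <- prcomp(iris[1:140, -5], scale. = TRUE)
new.ind <- iris[141:150, -5]
# 1. Center and scale the new individuals with the parameters of the PCA
new.scaled <- scale(new.ind, center = pca.train$center, scale = pca.train$scale)
# 2. Multiply the scaled values by the loadings (eigenvectors)
coord.manual <- new.scaled %*% pca.train$rotation
# Compare with the coordinates returned by predict()
coord.predict <- predict(pca.train, newdata = new.ind)
all.equal(coord.manual, coord.predict, check.attributes = FALSE)</code></pre>
<p>The matrix product replaces the two <em>apply()</em> loops: each row of the scaled data is multiplied by every column of the loadings in a single operation.</p>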
</div>
<div id="make-a-factor-map-including-the-supplementary-individuals-using-factoextra" class="section level2">
<h2>Make a factor map including the supplementary individuals using factoextra</h2>
<pre class="r"><code># Plot of active individuals
p <- fviz_pca_ind(res.pca)
# Add supplementary individuals
fviz_add(p, ind.sup.coord, color ="blue")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-prcomp-supplementary-individuals-data-mining-1.png" title="Principal component analysis - R software and data mining" alt="Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2) and <strong>factoextra</strong> (ver. 1.0.2) </span></p>
<p>Read more :</p>
<ul>
<li>Gregory B. Anderson, principal component analysis in R, <a href="https://www.ime.usp.br/~pavan/pdf/MAE0330-PCA-R-2013">https://www.ime.usp.br/~pavan/pdf/MAE0330-PCA-R-2013</a></li>
</ul>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Mon, 25 May 2015 09:13:12 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Principal component analysis : the basics you should read - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/principal-component-analysis-the-basics-you-should-read-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/principal-component-analysis-the-basics-you-should-read-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">


<div id="TOC">
<ul>
<li><a href="#what-is-principal-component-analysis">What is principal component analysis?</a></li>
<li><a href="#pca-basics">PCA basics</a></li>
<li><a href="#main-purpose-of-pca">Main purpose of PCA</a></li>
<li><a href="#basic-statistics---covariance-between-two-variables">Basic statistics - Covariance between two variables</a></li>
<li><a href="#covariancecorrelation-matrix">Covariance/correlation matrix</a></li>
<li><a href="#interpretention-of-the-covariance-matrix">Interpretation of the covariance matrix</a></li>
<li><a href="#how-to-minimize-the-distortion-in-the-data">How to minimize the distortion in the data ?</a></li>
<li><a href="#pca-terminologies-eigenvalues-eigenvectors">PCA terminologies : Eigenvalues / eigenvectors</a></li>
<li><a href="#steps-for-principal-component-analysis">Steps for principal component analysis</a></li>
<li><a href="#compute-principal-component-analysis-step-by-step">Compute principal component analysis (step by step)</a></li>
<li><a href="#packages-in-r-for-the-principal-component-analysis">Packages in R for the principal component analysis</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<div id="what-is-principal-component-analysis" class="section level1">
<h1>What is principal component analysis?</h1>
<p><strong>Principal component analysis</strong> (<strong>PCA</strong>) is used to summarize the information in a data set described by multiple variables.</p>
<p><span class="notice">Note that the information in a data set is the total <strong>variation</strong> it contains.</span></p>
<p><strong>PCA reduces the dimensionality</strong> of data containing a large set of variables. This is achieved by transforming the initial variables into a new, smaller set of variables without losing the most important information in the original data set.</p>
<p><span class="success">These new variables correspond to <strong>linear combinations</strong> of the original variables and are called <strong>principal components</strong>.</span></p>
<p>This article describes, step by step, how PCA works using <strong>R software</strong>.</p>
</div>
<div id="pca-basics" class="section level1">
<h1>PCA basics</h1>
<p>Understanding the details of PCA requires knowledge of linear algebra. In this section, we’ll explain the basics with a simple graphical representation of the data.</p>
<p>In Figure 1A below, the data are represented in the X-Y coordinate system. The dimension reduction is achieved by identifying the principal directions, called <strong>principal components</strong>, in which the data vary.</p>
<p><strong>PCA</strong> assumes that the directions with the largest variances are the most “important” (i.e., the most principal).</p>
<p>In the figure below, the <em>PC1 axis</em> is the <strong>first principal direction</strong> along which the samples show the largest variation. The <strong>PC2 axis</strong> is the <strong>second most important direction</strong> and it is <strong>orthogonal</strong> to the PC1 axis.</p>
<p>The dimensionality of our two-dimensional data can be reduced to a single dimension by projecting each sample onto the first principal component (Figure 1B).</p>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-basics-scatter-plot-data-mining-1.png" title="Principal component analysis basics - R software and data mining" alt="Principal component analysis basics - R software and data mining" width="240" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-basics-scatter-plot-data-mining-2.png" title="Principal component analysis basics - R software and data mining" alt="Principal component analysis basics - R software and data mining" width="240" style="margin-bottom:10px;" /></p>
</div>
<div id="main-purpose-of-pca" class="section level1">
<h1>Main purpose of PCA</h1>
<p>The main goals of <strong>principal component analysis</strong> are :</p>
<ul>
<li>to identify hidden patterns in a data set</li>
<li>to reduce the dimensionality of the data by removing the noise and redundancy in the data</li>
<li>to identify correlated variables</li>
</ul>
<p><span class="notice">The PCA method is particularly useful when the variables within the data set are highly correlated. </span></p>
<p><strong>Correlation</strong> indicates that there is <strong>redundancy</strong> in the data. Due to this redundancy, PCA can be used to reduce the original variables into a smaller number of new variables ( = <strong>principal components</strong>) explaining most of the variance in the original variables.</p>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-basics-unnamed-chunk-1-1.png" title="Principal component analysis basics - R software and data mining" alt="Principal component analysis basics - R software and data mining" width="240" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/principal-component-analysis-basics-unnamed-chunk-1-2.png" title="Principal component analysis basics - R software and data mining" alt="Principal component analysis basics - R software and data mining" width="240" style="margin-bottom:10px;" /></p>
<p><span class="question">How to remove the redundancy?</span></p>
<p>PCA is traditionally performed on the covariance matrix or the correlation matrix.</p>
</div>
<div id="basic-statistics---covariance-between-two-variables" class="section level1">
<h1>Basic statistics - Covariance between two variables</h1>
<p>Let x and y be two variables with length n.</p>
<p>The variance of x is :</p>
<p><span class="math">\[\sigma^2_{xx} = \frac{\sum_i(x_i - m_x)(x_i - m_x)}{n - 1}\]</span></p>
<p>The variance of y is :</p>
<p><span class="math">\[\sigma^2_{yy} = \frac{\sum_i(y_i - m_y)(y_i - m_y)}{n - 1}\]</span></p>
<p>The covariance of x and y is :</p>
<p><span class="math">\[\sigma^2_{xy} = \frac{\sum_i(x_i - m_x)(y_i - m_y)}{n - 1}\]</span></p>
<p><span class="math">\(m_x\)</span> and <span class="math">\(m_y\)</span> are the means of x and y variables, respectively.</p>
<p><span class="success">The covariance measures the degree of the linear relationship between x and y.</span></p>
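<p>As a quick check of the formulas above, the sketch below computes a variance and a covariance by hand and compares them with the built-in functions <strong>var()</strong> and <strong>cov()</strong>. The choice of the two <em>iris</em> columns is arbitrary and only serves as an illustration.</p>
<pre class="r"><code>x <- iris$Sepal.Length
y <- iris$Petal.Length
n <- length(x)
# Variance of x and covariance of x and y, following the formulas above
var.x <- sum((x - mean(x)) * (x - mean(x))) / (n - 1)
cov.xy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)
# Compare with the built-in functions
all.equal(var.x, var(x))
all.equal(cov.xy, cov(x, y))</code></pre>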
</div>
<div id="covariancecorrelation-matrix" class="section level1">
<h1>Covariance/correlation matrix</h1>
<p>A covariance matrix contains the covariances between all possible pairs of variables in the data set :</p>
<pre class="r"><code>df <- iris[, -5]
res.cov <- cov(df)
round(res.cov,2)</code></pre>
<pre><code>             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length         0.69       -0.04         1.27        0.52
Sepal.Width         -0.04        0.19        -0.33       -0.12
Petal.Length         1.27       -0.33         3.12        1.30
Petal.Width          0.52       -0.12         1.30        0.58</code></pre>
<p><span class="warning">Note that, the covariance matrix is symmetric. In the table above, covariance between Sepal.Length and Sepal.Width = covariance between Sepal.Width and Sepal.Length.</span></p>
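<p>The correlation matrix is obtained from the covariance matrix by dividing each covariance by the product of the standard deviations of the two variables involved. The sketch below illustrates this on the same <em>iris</em> columns; the names <em>s</em> and <em>res.cor</em> are introduced here for the example only.</p>
<pre class="r"><code>df <- iris[, -5]
res.cov <- cov(df)
# Standard deviations = square roots of the diagonal elements (variances)
s <- sqrt(diag(res.cov))
# Divide each covariance by the product of the two standard deviations
res.cor <- res.cov / outer(s, s)
round(res.cor, 2)
# Same result as cor() and as the base R helper cov2cor()
all.equal(res.cor, cor(df))
all.equal(res.cor, cov2cor(res.cov))</code></pre>
<p>The correlation matrix has ones on its diagonal, which is why it is preferred when the variables are measured in different units.</p>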
</div>
<div id="interpretention-of-the-covariance-matrix" class="section level1">
<h1>Interpretation of the covariance matrix</h1>
<ol style="list-style-type: decimal">
<li>The diagonal elements are the <strong>variances</strong> of the different variables. <strong>Large diagonal values correspond to strong signal</strong>.</li>
</ol>
<pre class="r"><code>diag(res.cov)</code></pre>
<pre><code>Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
   0.6856935    0.1899794    3.1162779    0.5810063 </code></pre>
<ol start="2" style="list-style-type: decimal">
<li>The off-diagonal values are the <strong>covariances</strong> between variables. They reflect distortions in the data (noise, redundancy, …). <strong>Large off-diagonal values correspond to high distortions in our data</strong>.</li>
</ol>
<p><span class="success">The aim of PCA is to minimize these distortions and to summarize the essential information in the data.</span></p>
</div>
<div id="how-to-minimize-the-distortion-in-the-data" class="section level1">
<h1>How to minimize the distortion in the data ?</h1>
<p>In the covariance table above, the off-diagonal values are different from zero. This indicates the presence of redundancy in the data. In other words, there is a certain amount of correlation between variables.</p>
<p><span class="success">This kind of matrix, with non-zero off-diagonal values, is called a <strong>“non-diagonal” matrix</strong>.</span></p>
<p>We need to redefine our initial variables (x, y, z, …) in order to diagonalize the covariance matrix.</p>
<p>This means that we want to change the covariance matrix so that the off-diagonal elements are close to zero (i.e., zero correlation between pairs of distinct variables).</p>
<p>The new variables (x’, y’, z’, …) are a linear combination of the old ones :</p>
<p><span class="math">\[X' = a_1X + a_2Y + a_3Z + \dots\]</span></p>
<p><span class="math">\[Y' = b_1X + b_2Y + b_3Z + \dots\]</span></p>
<p><span class="notice">In PCA, the constants a1, a2, …, an and b1, b2, …, bn are calculated such that the covariance matrix of the new variables is diagonal.</span></p>
</div>
<div id="pca-terminologies-eigenvalues-eigenvectors" class="section level1">
<h1>PCA terminologies : Eigenvalues / eigenvectors</h1>
<p><strong>Eigenvalues</strong> : The numbers on the diagonal of the diagonalized covariance matrix are called eigenvalues of the covariance matrix. Large eigenvalues correspond to large variances.</p>
<p><strong>Eigenvectors</strong> : The directions of the new rotated axes are called the eigenvectors of the covariance matrix.</p>
<p><strong>Eigenvalues</strong> and <strong>eigenvectors</strong> can be easily calculated in R as follows :</p>
<pre class="r"><code>eigen(res.cov)</code></pre>
<pre><code>$values
[1] 4.22824171 0.24267075 0.07820950 0.02383509

$vectors
            [,1]        [,2]        [,3]       [,4]
[1,]  0.36138659 -0.65658877 -0.58202985  0.3154872
[2,] -0.08452251 -0.73016143  0.59791083 -0.3197231
[3,]  0.85667061  0.17337266  0.07623608 -0.4798390
[4,]  0.35828920  0.07548102  0.54583143  0.7536574</code></pre>
<p><span class="success">The <strong>first principal components</strong> of the data are the directions explaining the largest variances. They correspond to the first eigenvectors of the covariance matrix, that is, those with the largest eigenvalues.</span></p>
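<p>To illustrate what diagonalizing the covariance matrix means in practice, the sketch below projects the centered <em>iris</em> data onto the eigenvectors and checks that the covariance matrix of the new variables is diagonal, with the eigenvalues on the diagonal. The object names <em>e</em>, <em>df.centered</em> and <em>pcs</em> are assumptions made for this example.</p>
<pre class="r"><code>df <- iris[, -5]
res.cov <- cov(df)
e <- eigen(res.cov)
# Project the centered data onto the eigenvectors
df.centered <- scale(df, center = TRUE, scale = FALSE)
pcs <- df.centered %*% e$vectors
# Covariance matrix of the new variables:
# variances = eigenvalues, covariances = 0 (up to rounding error)
round(cov(pcs), 4)
all.equal(diag(cov(pcs)), e$values)</code></pre>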
</div>
<div id="steps-for-principal-component-analysis" class="section level1">
<h1>Steps for principal component analysis</h1>
<p>The procedure includes 5 simple steps :</p>
<ol style="list-style-type: decimal">
<li><strong>Prepare the data</strong> :</li>
</ol>
<ul>
<li><em>Center the data</em> : subtract the mean from each variable. This produces a data set whose mean is zero.</li>
<li><em>Scale the data</em> : If the variances of the variables in your data are significantly different, it’s a good idea to scale the data to unit variance. This is achieved by dividing each variable by its standard deviation.</li>
</ul>
<ol start="2" style="list-style-type: decimal">
<li><strong>Calculate the covariance/correlation matrix</strong></li>
<li><strong>Calculate the eigenvectors and the eigenvalues</strong> of the covariance matrix</li>
<li><strong>Choose principal components</strong> : eigenvectors are ordered by eigenvalues from the highest to the lowest. The number of chosen eigenvectors will be the number of dimensions of the new data set. eigenvectors = (eig_1, eig_2,…, eig_n)</li>
<li><strong>Compute the new dataset</strong> :</li>
</ol>
<ul>
<li><em>transpose the eigenvectors</em> : rows are eigenvectors</li>
<li><em>transpose the adjusted data</em> (rows are variables and columns are individuals)</li>
<li><em>new.data</em> = eigenvectors.transposed X adjustedData.transposed, where X denotes the matrix product</li>
</ul>
</div>
<div id="compute-principal-component-analysis-step-by-step" class="section level1">
<h1>Compute principal component analysis (step by step)</h1>
<p>The data set <em>iris</em> is used : columns are variables and rows are observations:</p>
<pre class="r"><code>df <- iris[, -5]
head(df)</code></pre>
<pre><code>  Sepal.Length Sepal.Width Petal.Length Petal.Width
1          5.1         3.5          1.4         0.2
2          4.9         3.0          1.4         0.2
3          4.7         3.2          1.3         0.2
4          4.6         3.1          1.5         0.2
5          5.0         3.6          1.4         0.2
6          5.4         3.9          1.7         0.4</code></pre>
<p><strong>1. Center and scale the data</strong></p>
<pre class="r"><code>df.scaled <- scale(df, center = TRUE, scale = TRUE)</code></pre>
<p><strong>2. Compute the correlation matrix</strong> :</p>
<pre class="r"><code># 2. Correlation matrix
res.cor <- cor(df.scaled)
round(res.cor, 2)</code></pre>
<pre><code>             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length         1.00       -0.12         0.87        0.82
Sepal.Width         -0.12        1.00        -0.43       -0.37
Petal.Length         0.87       -0.43         1.00        0.96
Petal.Width          0.82       -0.37         0.96        1.00</code></pre>
<p><strong>3. Calculate the eigenvectors/eigenvalues</strong> of the correlation matrix :</p>
<pre class="r"><code># 3. Calculate eigenvectors/eigenvalues
res.eig <- eigen(res.cor)
res.eig</code></pre>
<pre><code>$values
[1] 2.91849782 0.91403047 0.14675688 0.02071484

$vectors
           [,1]        [,2]       [,3]       [,4]
[1,]  0.5210659 -0.37741762  0.7195664  0.2612863
[2,] -0.2693474 -0.92329566 -0.2443818 -0.1235096
[3,]  0.5804131 -0.02449161 -0.1421264 -0.8014492
[4,]  0.5648565 -0.06694199 -0.6342727  0.5235971</code></pre>
<p><span class="success">The first eigenvalue (<em>2.9</em>) is much larger than the second (<em>0.9</em>), and so on…. The highest eigenvalues correspond to the first data principal components.</span></p>
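<p>Step 4 (choosing the principal components) is usually based on the share of the total variance carried by each eigenvalue. The following is a minimal sketch of that calculation; the name <em>variance.percent</em> is introduced here for illustration.</p>
<pre class="r"><code>res.eig <- eigen(cor(iris[, -5]))
# Percentage of the total variance explained by each component
variance.percent <- 100 * res.eig$values / sum(res.eig$values)
round(variance.percent, 1)
# Cumulative percentage, useful for choosing the number of components
round(cumsum(variance.percent), 1)</code></pre>
<p>With the eigenvalues shown above, the first two components together retain about 96% of the total variance, so keeping two dimensions loses very little information.</p>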
<p><strong>5. Compute the new dataset</strong> :</p>
<pre class="r"><code># Transpose the eigenvectors
eigenvectors.t <- t(res.eig$vectors)
# Transpose the adjusted data
df.scaled.t <- t(df.scaled)
# The new dataset
df.new <- eigenvectors.t %*% df.scaled.t
# Transpose the new data and rename the columns
df.new <- t(df.new)
colnames(df.new) <- c("PC1", "PC2", "PC3", "PC4")
head(df.new)</code></pre>
<pre><code>           PC1        PC2         PC3          PC4
[1,] -2.257141 -0.4784238  0.12727962  0.024087508
[2,] -2.074013  0.6718827  0.23382552  0.102662845
[3,] -2.356335  0.3407664 -0.04405390  0.028282305
[4,] -2.291707  0.5953999 -0.09098530 -0.065735340
[5,] -2.381863 -0.6446757 -0.01568565 -0.035802870
[6,] -2.068701 -1.4842053 -0.02687825  0.006586116</code></pre>
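<p>As a sanity check, the step-by-step result can be compared with the function <strong>prcomp()</strong> from the built-in <strong>stats</strong> package. The sketch below recomputes the new data set in a compact form and compares absolute values, because the sign of each principal component is arbitrary and may differ between the two methods. The name <em>res.prcomp</em> is an assumption made for this example.</p>
<pre class="r"><code>df <- iris[, -5]
res.prcomp <- prcomp(df, center = TRUE, scale. = TRUE)
# Manual computation, equivalent to the steps above
df.scaled <- scale(df, center = TRUE, scale = TRUE)
res.eig <- eigen(cor(df))
df.new <- df.scaled %*% res.eig$vectors
# Coordinates agree up to the sign of each component
all.equal(abs(unname(df.new)), abs(unname(res.prcomp$x)))
# Squared standard deviations from prcomp() are the eigenvalues
all.equal(res.prcomp$sdev^2, res.eig$values)</code></pre>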
</div>
<div id="packages-in-r-for-the-principal-component-analysis" class="section level1">
<h1>Packages in R for the principal component analysis</h1>
<p>There are several functions from different packages for performing PCA :</p>
<ul>
<li>The functions <strong>prcomp()</strong> and <strong>princomp()</strong> from the built-in <strong>R stats</strong> package. Read more here: <a href="https://www.sthda.com/english/english/wiki/principal-component-analysis-the-basics-you-should-read-r-software-and-data-mining">prcomp and princomp</a></li>
<li><strong>PCA()</strong> from <strong>FactoMineR</strong> package. Read more here : <a href="https://www.sthda.com/english/english/wiki/factominer-and-factoextra-principal-component-analysis-visualization-r-software-and-data-mining">PCA with FactoMineR</a></li>
<li><strong>dudi.pca()</strong> from <strong>ade4</strong> package. Read more here : <a href="https://www.sthda.com/english/english/wiki/ade4-and-factoextra-principal-component-analysis-r-software-and-data-mining">PCA with ade4</a></li>
</ul>
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2) and <strong>ggplot2</strong> (ver. 1.0.0) </span></p>
<p>Read more :</p>
<ul>
<li>Gregory B. Anderson, principal component analysis in R, <a href="https://www.ime.usp.br/~pavan/pdf/MAE0330-PCA-R-2013">https://www.ime.usp.br/~pavan/pdf/MAE0330-PCA-R-2013</a></li>
<li>Wikibooks, <a href="http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Dimensionality_Reduction/Principal_Component_Analysis">http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Dimensionality_Reduction/Principal_Component_Analysis</a></li>
<li>Carlos Pinto, Data reduction, <a href="https://medicine.tcd.ie/neuropsychiatric-genetics/assets/pdf/2009_7_PCA_+_Factor_analyses.pdf">https://medicine.tcd.ie/neuropsychiatric-genetics/assets/pdf/2009_7_PCA_+_Factor_analyses.pdf</a></li>
</ul>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>

<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Mon, 25 May 2015 09:06:37 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[ade4 and factoextra : Principal Component Analysis - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/ade4-and-factoextra-principal-component-analysis-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/ade4-and-factoextra-principal-component-analysis-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->
 
  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<div id="TOC">
<ul>
<li><a href="#required-packages">Required packages</a></li>
<li><a href="#prepare-the-data">Prepare the data</a></li>
<li><a href="#principal-component-analysis">Principal component analysis</a></li>
<li><a href="#variances-of-the-principal-components">Variances of the principal components</a><ul>
<li><a href="#extract-the-eigenvalues">Extract the eigenvalues</a></li>
<li><a href="#make-a-scree-plot-using-ade4-base-graphics">Make a scree plot using ade4 base graphics</a></li>
<li><a href="#make-the-scree-plot-using-the-package-factoextra">Make the scree plot using the package factoextra</a></li>
</ul></li>
<li><a href="#graph-of-variables-the-circle-of-correlations">Graph of variables : the circle of correlations</a><ul>
<li><a href="#coordinates-of-variables-on-the-principal-components">Coordinates of variables on the principal components</a></li>
<li><a href="#graph-of-variables-using-ade4-base-graph">Graph of variables using ade4 base graph</a></li>
<li><a href="#graph-of-variables-using-factoextra">Graph of variables using factoextra</a></li>
<li><a href="#cos2-quality-of-the-representation-for-variables-on-the-factor-map">Cos2 : quality of the representation for variables on the factor map</a></li>
<li><a href="#contributions-of-the-variables-to-the-principal-components">Contributions of the variables to the principal components</a></li>
</ul></li>
<li><a href="#graph-of-individuals">Graph of individuals</a><ul>
<li><a href="#coordinates-of-individuals-on-the-principal-components">Coordinates of individuals on the principal components</a></li>
<li><a href="#cos2-quality-of-the-representation-for-individuals-on-the-principal-components">Cos2 : quality of the representation for individuals on the principal components</a></li>
<li><a href="#contribution-of-the-individuals-to-the-princial-components">Contribution of the individuals to the princial components</a></li>
<li><a href="#graph-of-individuals-using-ade4-base-graph">Graph of individuals using ade4 base graph</a></li>
<li><a href="#biplot-of-individuals-and-variables-using-ade4">Biplot of individuals and variables using ade4</a></li>
<li><a href="#graph-of-individuals-using-factoextra">Graph of individuals using factoextra</a></li>
<li><a href="#change-the-color-of-individuals-by-groups">Change the color of individuals by groups</a></li>
</ul></li>
<li><a href="#principal-component-analysis-using-supplementary-individuals-and-variables">Principal component analysis using supplementary individuals and variables</a><ul>
<li><a href="#supplementary-individuals">Supplementary individuals</a></li>
<li><a href="#supplementary-quantitative-variables">Supplementary quantitative variables</a></li>
</ul></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/></p>
<p>This <strong>R tutorial</strong> describes how to perform a <strong>Principal Component Analysis</strong> (<strong>PCA</strong>) using <strong>R software</strong> and <strong>ade4</strong> package.</p>
<div id="required-packages" class="section level1">
<h1>Required packages</h1>
<ul>
<li>The package <strong>ade4</strong> can be installed and loaded as follows :</li>
</ul>
<pre class="r"><code>install.packages("ade4")

library("ade4")</code></pre>
<ul>
<li>The package <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> is used for the visualization of the <strong>principal component analysis</strong> results</li>
</ul>
<p><strong>factoextra</strong> can be installed as follows :</p>
<pre class="r"><code># install.packages("devtools")
devtools::install_github("kassambara/factoextra")</code></pre>
<p>Load it :</p>
<pre class="r"><code>library("factoextra")</code></pre>
</div>
<div id="prepare-the-data" class="section level1">
<h1>Prepare the data</h1>
<p>We’ll use the data set <em>decathlon2</em> from the package <strong>factoextra</strong> :</p>
<pre class="r"><code>library("factoextra")

data(decathlon2)
head(decathlon2[, 1:6])</code></pre>
<pre><code>           X100m Long.jump Shot.put High.jump X400m X110m.hurdle
SEBRLE     11.04      7.58    14.83      2.07 49.81        14.69
CLAY       10.76      7.40    14.26      1.86 49.37        14.05
BERNARD    11.02      7.23    14.25      1.92 48.93        14.99
YURKOV     11.34      7.09    15.19      2.10 50.42        15.31
ZSIVOCZKY  11.13      7.30    13.48      2.01 48.62        14.17
McMULLEN   10.83      7.31    13.76      2.13 49.91        14.38</code></pre>
<p><span class="warning">This data is a subset of <em>decathlon</em> data in <strong>FactoMineR</strong> package</span></p>
<p>As illustrated below, the data used here describe athletes’ performance during two sporting events (Decastar and OlympicG). They contain 27 individuals (athletes) described by 13 variables :</p>
<p><a href="https://www.sthda.com/english/sthda/RDoc/images/pca-decathlon-big.png" title="Click to zoom!"> <img src="https://www.sthda.com/english/sthda/RDoc/images/pca-decathlon.png" alt="principal component analysis data"/> </a></p>
<br/>

<div class="warning">
<p>Only some of these individuals and variables will be used to perform the principal component analysis (PCA).</p>
The coordinates of the remaining individuals and variables on the factor map will be <strong>predicted</strong> after the PCA.
</div>
<p><br/></p>
<p>In PCA terminology, our data contains :</p>
<br/>
<div class="block">
<ul>
<li><strong>Active individuals</strong> (in blue, rows 1:23) : Individuals that are used during the principal component analysis.</li>
<li><strong>Supplementary individuals</strong> (in green, rows 24:27) : The coordinates of these individuals will be predicted using the PCA information and parameters obtained with active individuals/variables</li>
<li><strong>Active variables</strong> (in pink, columns 1:10) : Variables that are used for the principal component analysis.</li>
<li><strong>Supplementary variables</strong> : As with supplementary individuals, the coordinates of these variables will also be predicted.</li>
<li><strong>Supplementary continuous variables</strong> : Columns 11 and 12 corresponding respectively to the rank and the points of athletes.</li>
<li><strong>Supplementary qualitative variables</strong> : Column 13 corresponding to the two athletic meetings (2004 Olympic Games or 2004 Decastar). This factor variable will be used to color individuals by groups.</li>
</ul>
</div>
<p><br/></p>
<p>Extract only active individuals and variables for principal component analysis:</p>
<pre class="r"><code>decathlon2.active <- decathlon2[1:23, 1:10]
head(decathlon2.active[, 1:6])</code></pre>
<pre><code>           X100m Long.jump Shot.put High.jump X400m X110m.hurdle
SEBRLE     11.04      7.58    14.83      2.07 49.81        14.69
CLAY       10.76      7.40    14.26      1.86 49.37        14.05
BERNARD    11.02      7.23    14.25      1.92 48.93        14.99
YURKOV     11.34      7.09    15.19      2.10 50.42        15.31
ZSIVOCZKY  11.13      7.30    13.48      2.01 48.62        14.17
McMULLEN   10.83      7.31    13.76      2.13 49.91        14.38</code></pre>
</div>
<div id="principal-component-analysis" class="section level1">
<h1>Principal component analysis</h1>
<p>The function <strong>dudi.pca()</strong> [in <em>ade4</em> package] can be used. A simplified format is :</p>
<pre class="r"><code>dudi.pca(df, center = TRUE,  scale = TRUE, 
         scannf = TRUE, nf = 2)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>df</strong> : a data frame. Rows are individuals and columns are numeric variables</li>
<li><strong>center</strong> : a logical value specifying whether the variables should be shifted to be zero centered.</li>
<li><strong>scale</strong> : a logical value. If <em>TRUE</em>, the data are scaled to unit variance before the analysis. This standardization prevents some variables from becoming dominant just because of their large measurement units.</li>
<li><strong>scannf</strong> : a logical value specifying whether the scree plot should be displayed</li>
<li><strong>nf</strong> : number of dimensions kept in the final results.</li>
</ul>
</div>
<p><br/></p>
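<p>As a rough illustration of what <strong>center</strong> and <strong>scale</strong> do, the base R sketch below standardizes a small made-up data frame by hand. The data and the object names (<em>toy</em>, <em>centered</em>, <em>scaled</em>) are hypothetical. The use of a 1/n variance is an assumption based on uniform row weights; it is a sketch of the idea, not a copy of the dudi.pca() internals :</p>
<pre class="r"><code># Hypothetical toy data: two variables with very different units
toy <- data.frame(height_m = c(1.6, 1.7, 1.8, 1.9),
                  weight_g = c(55000, 64000, 72000, 90000))
X <- as.matrix(toy)
n <- nrow(X)
# center = TRUE : subtract the column means
centered <- sweep(X, 2, colMeans(X), "-")
# scale = TRUE : divide by the standard deviation (1/n version, assumed)
sd.n <- sqrt(colSums(centered^2) / n)
scaled <- sweep(centered, 2, sd.n, "/")
# Each column now has mean 0 and variance 1
colMeans(scaled)
colSums(scaled^2) / n</code></pre>
<p>After this step, height and weight carry the same weight in the analysis even though their raw units differ by several orders of magnitude.</p>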
<p>In the R code below, the PCA is performed only on the active individuals/variables :</p>
<pre class="r"><code>library("ade4")
res.pca <- dudi.pca(decathlon2.active, scannf = FALSE, nf = 5)</code></pre>
</div>
<div id="variances-of-the-principal-components" class="section level1">
<h1>Variances of the principal components</h1>
<div id="extract-the-eigenvalues" class="section level2">
<h2>Extract the eigenvalues</h2>
<p><strong>Eigenvalues</strong> measure the amount of variation retained by a <strong>principal component</strong> :</p>
<pre class="r"><code>summary(res.pca)</code></pre>
<pre><code>Class: pca dudi
Call: dudi.pca(df = decathlon2.active, scannf = FALSE, nf = 5)

Total inertia: 10

Eigenvalues:
    Ax1     Ax2     Ax3     Ax4     Ax5 
 4.1242  1.8385  1.2391  0.8194  0.7016 

Projected inertia (%):
    Ax1     Ax2     Ax3     Ax4     Ax5 
 41.242  18.385  12.391   8.194   7.016 

Cumulative projected inertia (%):
    Ax1   Ax1:2   Ax1:3   Ax1:4   Ax1:5 
  41.24   59.63   72.02   80.21   87.23 

(Only 5 dimensions (out of 10) are shown)</code></pre>
<p>You can also use the package <strong>factoextra</strong> to extract the eigenvalues :</p>
<pre class="r"><code>library("factoextra")
eig.val <- get_eigenvalue(res.pca)
head(eig.val)</code></pre>
<pre><code>      eigenvalue variance.percent cumulative.variance.percent
Dim 1  4.1242133        41.242133                    41.24213
Dim 2  1.8385309        18.385309                    59.62744
Dim 3  1.2391403        12.391403                    72.01885
Dim 4  0.8194402         8.194402                    80.21325
Dim 5  0.7015528         7.015528                    87.22878
Dim 6  0.4228828         4.228828                    91.45760</code></pre>
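<p>The percentages above follow directly from the eigenvalues: each eigenvalue is divided by the total inertia (10 here, as reported by summary()). The short check below retypes the first six eigenvalues from the output; the object names are only illustrative :</p>
<pre class="r"><code># Eigenvalues copied from the output above
eig <- c(4.1242133, 1.8385309, 1.2391403,
         0.8194402, 0.7015528, 0.4228828)
total.inertia <- 10
# Percentage of variance retained by each component
variance.percent <- 100 * eig / total.inertia
# Cumulative percentage
cumulative.percent <- cumsum(variance.percent)
round(cbind(eig, variance.percent, cumulative.percent), 3)</code></pre>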
</div>
<div id="make-a-scree-plot-using-ade4-base-graphics" class="section level2">
<h2>Make a scree plot using ade4 base graphics</h2>
<p>The function <strong>screeplot()</strong> can be used to represent the amount of inertia (variance) associated with each principal component (PC).</p>
<p>A simplified format is :</p>
<pre class="r"><code>screeplot(x, npcs = length(x$eig), type = c("barplot", "lines"))</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>x</strong> : an object of class dudi</li>
<li><strong>npcs</strong> : the number of components to be plotted</li>
<li><strong>type</strong> : the type of plot</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>screeplot(res.pca, main ="Screeplot - Eigenvalues")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-screeplot-ade4-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>You can also customize the plot using the standard <strong>barplot()</strong> function. In the R code below, we’ll draw the percentage of variances retained by each component :</p>
<pre class="r"><code>barplot(eig.val[, 2], names.arg=1:nrow(eig.val), 
       main = "Variances",
       xlab = "Principal Components",
       ylab = "Percentage of variances",
       col ="steelblue")
# Add connected line segments to the plot
lines(x = 1:nrow(eig.val), eig.val[, 2], 
      type="b", pch=19, col = "red")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-eigenvalue-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">~60% of the information (variance) contained in the data is retained by the first two principal components.</span></p>
</div>
<div id="make-the-scree-plot-using-the-package-factoextra" class="section level2">
<h2>Make the scree plot using the package factoextra</h2>
<pre class="r"><code>fviz_screeplot(res.pca, ncp=10)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-eigenvalue-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="graph-of-variables-the-circle-of-correlations" class="section level1">
<h1>Graph of variables : the circle of correlations</h1>
<div id="coordinates-of-variables-on-the-principal-components" class="section level2">
<h2>Coordinates of variables on the principal components</h2>
<p>The coordinates of the variables on the factor map are :</p>
<pre class="r"><code># Column coordinates
head(res.pca$co)</code></pre>
<pre><code>                  Comp1       Comp2      Comp3       Comp4      Comp5
X100m         0.8506257 -0.17939806 -0.3015564  0.03357320  0.1944440
Long.jump    -0.7941806  0.28085695  0.1905465 -0.11538956 -0.2331567
Shot.put     -0.7339127  0.08540412 -0.5175978  0.12846837  0.2488129
High.jump    -0.6100840 -0.46521415 -0.3300852  0.14455012 -0.4027002
X400m         0.7016034  0.29017826 -0.2835329  0.43082552 -0.1039085
X110m.hurdle  0.7641252 -0.02474081 -0.4488873 -0.01689589 -0.2242200</code></pre>
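<p>For a PCA on standardized variables, these coordinates can be read as the correlations between each variable and the principal components, which is why they are drawn on a correlation circle. The sketch below illustrates that relationship with base R only, using prcomp() and the iris data rather than ade4 (a substitution made purely to keep the example self-contained) :</p>
<pre class="r"><code># Standardized PCA with base R (not ade4)
X <- scale(iris[, 1:4])
pca <- prcomp(X)
# Variable coordinates = loadings * standard deviation of each PC
coord <- sweep(pca$rotation, 2, pca$sdev, "*")
# Correlations between the variables and the PC scores
cors <- cor(X, pca$x)
# Both matrices hold the same values
all.equal(coord, cors, check.attributes = FALSE)</code></pre>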
</div>
<div id="graph-of-variables-using-ade4-base-graph" class="section level2">
<h2>Graph of variables using ade4 base graph</h2>
<p>The function <strong>s.corcircle()</strong> can be used to plot the correlation circle. A simplified format is :</p>
<pre class="r"><code>s.corcircle(dfxy, label = row.names(dfxy), grid = TRUE,
            box = FALSE)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>dfxy</strong> : a data frame specifying the coordinates of variables</li>
<li><strong>label</strong> : a vector of strings specifying point labels</li>
<li><strong>grid</strong> : a logical value specifying whether a grid in the background of the plot should be drawn</li>
<li><strong>box</strong> : a logical value indicating whether a box should be drawn</li>
</ul>
</div>
<p><br/></p>
<pre class="r"><code># Graph of variables
s.corcircle(res.pca$co)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-correlation-circle-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="graph-of-variables-using-factoextra" class="section level2">
<h2>Graph of variables using factoextra</h2>
<p>The function <strong>fviz_pca_var()</strong> is used to visualize variables :</p>
<pre class="r"><code># Default plot
fviz_pca_var(res.pca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-correlation-circle-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change color and theme
fviz_pca_var(res.pca, col.var="steelblue")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-correlation-circle-factoextra-data-mining-2.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>Read more about the function <strong>fviz_pca_var()</strong> : <a href="https://www.sthda.com/english/english/wiki/fviz-pca-var-graph-of-variables-principal-component-analysis-r-software-and-data-mining">Graph of variables - Principal Component Analysis</a></p>
<p><span class="question">How to calculate the cos2 and the contribution of variables?</span></p>
<p>The cos2 and the contributions of variables (columns) / individuals (rows) are calculated using the function <strong>inertia.dudi()</strong> as follows :</p>
<pre class="r"><code>inertia <- inertia.dudi(res.pca, row.inertia = TRUE,
                        col.inertia = TRUE)</code></pre>
<p><span class="warning">Note that the contributions and the cos2 are printed in 1/10 000. The sign is that of the coordinates.</span></p>
</div>
<div id="cos2-quality-of-the-representation-for-variables-on-the-factor-map" class="section level2">
<h2>Cos2 : quality of the representation for variables on the factor map</h2>
<p>The squared coordinates of variables are called cos2.</p>
<ul>
<li><p>A high cos2 indicates a good representation of the variable on the principal component. In this case the variable is positioned close to the circumference of the correlation circle.</p></li>
<li><p>A low cos2 indicates that the variable is not perfectly represented by the PCs. In this case the variable is close to the center of the circle.</p></li>
</ul>
<p>The cos2 of the variables are :</p>
<pre class="r"><code># relative contributions of columns
var.cos2 <- abs(inertia$col.rel/10000)
head(var.cos2)</code></pre>
<pre><code>              Comp1  Comp2  Comp3  Comp4  Comp5 con.tra
X100m        0.7236 0.0322 0.0909 0.0011 0.0378     0.1
Long.jump    0.6307 0.0789 0.0363 0.0133 0.0544     0.1
Shot.put     0.5386 0.0073 0.2679 0.0165 0.0619     0.1
High.jump    0.3722 0.2164 0.1090 0.0209 0.1622     0.1
X400m        0.4922 0.0842 0.0804 0.1856 0.0108     0.1
X110m.hurdle 0.5839 0.0006 0.2015 0.0003 0.0503     0.1</code></pre>
<p>They can also be calculated as follows :</p>
<pre class="r"><code># squared coordinates
head(res.pca$co^2)</code></pre>
<pre><code>                 Comp1        Comp2      Comp3        Comp4      Comp5
X100m        0.7235641 0.0321836641 0.09093628 0.0011271597 0.03780845
Long.jump    0.6307229 0.0788806285 0.03630798 0.0133147506 0.05436203
Shot.put     0.5386279 0.0072938636 0.26790749 0.0165041211 0.06190783
High.jump    0.3722025 0.2164242070 0.10895622 0.0208947375 0.16216747
X400m        0.4922473 0.0842034209 0.08039091 0.1856106269 0.01079698
X110m.hurdle 0.5838873 0.0006121077 0.20149984 0.0002854712 0.05027463</code></pre>
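<p>Because cos2 values add up across components, the quality of representation of a variable on the Comp1-Comp2 plane is the sum of its two cos2 values, and the square root of that sum is the length of its arrow in the correlation circle. A minimal sketch, retyping the coordinates of two variables from the output above (the object names are illustrative) :</p>
<pre class="r"><code># Coordinates copied from res.pca$co (first two components)
co <- rbind(X100m     = c(Comp1 =  0.8506257, Comp2 = -0.17939806),
            High.jump = c(Comp1 = -0.6100840, Comp2 = -0.46521415))
cos2 <- co^2
# Quality of representation on the Comp1-Comp2 plane
plane.quality <- rowSums(cos2)
# Distance from the origin in the correlation circle
arrow.length <- sqrt(plane.quality)
round(cbind(plane.quality, arrow.length), 4)</code></pre>
<p>X100m has the larger value of the two, so its arrow ends closer to the circumference than the arrow of High.jump.</p>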
<p><span class="warning">Using the <strong>factoextra</strong> package, the color of variables can be automatically controlled by the value of their cos2.</span></p>
<pre class="r"><code>fviz_pca_var(res.pca, col.var="cos2")+
scale_color_gradient2(low="white", mid="blue", 
      high="red", midpoint=0.5) + theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-correlation-circle-colors-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
</div>
<div id="contributions-of-the-variables-to-the-principal-components" class="section level2">
<h2>Contributions of the variables to the principal components</h2>
<p>The contributions can be printed in % as follows :</p>
<pre class="r"><code># absolute contribution of columns
var.contrib <- inertia$col.abs/100
head(var.contrib)</code></pre>
<pre><code>             Comp1 Comp2 Comp3 Comp4 Comp5
X100m        17.54  1.75  7.34  0.14  5.39
Long.jump    15.29  4.29  2.93  1.62  7.75
Shot.put     13.06  0.40 21.62  2.01  8.82
High.jump     9.02 11.77  8.79  2.55 23.12
X400m        11.94  4.58  6.49 22.65  1.54
X110m.hurdle 14.16  0.03 16.26  0.03  7.17</code></pre>
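<p>These values are consistent with the usual definition of a variable's contribution to a component: its cos2 on that component divided by the component's eigenvalue, expressed in %. The check below retypes numbers from the outputs above and is only a sketch of that formula, not the inertia.dudi() implementation :</p>
<pre class="r"><code># cos2 on Comp1, copied from res.pca$co^2
cos2.comp1 <- c(X100m = 0.7235641, Long.jump = 0.6307229,
                Shot.put = 0.5386279)
# First eigenvalue, copied from the eigenvalue table
eig1 <- 4.1242133
# Contribution (%) of each variable to Comp1
contrib.comp1 <- 100 * cos2.comp1 / eig1
round(contrib.comp1, 2)</code></pre>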
<p><span class="warning">Note that you can also use the function get_pca_var() [from the factoextra package]. It provides a list of matrices containing all the results for the active variables (coordinates, correlation between variables and axes, squared cosine and contributions).</span></p>
<pre class="r"><code>var <- get_pca_var(res.pca)
names(var)</code></pre>
<pre><code>[1] "coord"   "cor"     "cos2"    "contrib"</code></pre>
<pre class="r"><code># Contributions of variables
head(var$contrib)</code></pre>
<pre><code>                 Dim.1      Dim.2     Dim.3       Dim.4     Dim.5
X100m        17.544293  1.7505098  7.338659  0.13755240  5.389252
Long.jump    15.293168  4.2904162  2.930094  1.62485936  7.748815
Shot.put     13.060137  0.3967224 21.620432  2.01407269  8.824401
High.jump     9.024811 11.7715838  8.792888  2.54987951 23.115504
X400m        11.935544  4.5799296  6.487636 22.65090599  1.539012
X110m.hurdle 14.157544  0.0332933 16.261261  0.03483735  7.166193</code></pre>
<p><span class="warning">Using the <strong>factoextra</strong> package, the color of variables can be automatically controlled by the value of their contributions.</span></p>
<pre class="r"><code>fviz_pca_var(res.pca, col.var="contrib") +
scale_color_gradient2(low="white", mid="blue", 
      high="red", midpoint=50) + theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-variable-contribution-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success">This is helpful to highlight the most important variables for the principal components.</span></p>
<p>The most important variables for a given PC can be visualized using the function <strong>fviz_pca_contrib()</strong>[factoextra package] :</p>
<p>(factoextra >= 1.0.1 is required)</p>
<pre class="r"><code># Contributions of variables on PC1
fviz_pca_contrib(res.pca, choice = "var", axes = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-variable-contribution-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="384" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Contributions of variables on PC2
fviz_pca_contrib(res.pca, choice = "var", axes = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-variable-contribution-data-mining-2.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="384" style="margin-bottom:10px;" /></p>
<p>Read more about <strong>fviz_pca_contrib()</strong> : <a href="https://www.sthda.com/english/english/wiki/principal-component-analysis-how-to-reveal-the-most-important-variables-in-your-data-r-software-and-data-mining">Principal Component Analysis: How to reveal the most important variables in your data?</a></p>
</div>
</div>
<div id="graph-of-individuals" class="section level1">
<h1>Graph of individuals</h1>
<div id="coordinates-of-individuals-on-the-principal-components" class="section level2">
<h2>Coordinates of individuals on the principal components</h2>
<p>The coordinates of the individuals on the factor maps can be extracted as follows :</p>
<pre class="r"><code># The row coordinates
head(res.pca$li)</code></pre>
<pre><code>                Axis1      Axis2      Axis3       Axis4       Axis5
SEBRLE     -0.1955047  1.5890567 -0.6424912  0.08389652 -1.16829387
CLAY       -0.8078795  2.4748137  1.3873827  1.29838232  0.82498206
BERNARD     1.3591340  1.6480950 -0.2005584 -1.96409420 -0.08419345
YURKOV      0.8889532 -0.4426067 -2.5295843  0.71290837 -0.40782264
ZSIVOCZKY   0.1081216 -2.0688377  1.3342591 -0.10152796  0.20145217
McMULLEN   -0.1212195 -1.0139102  0.8625170  1.34164291 -1.62151286</code></pre>
</div>
<div id="cos2-quality-of-the-representation-for-individuals-on-the-principal-components" class="section level2">
<h2>Cos2 : quality of the representation for individuals on the principal components</h2>
<pre class="r"><code># relative contributions of rows
ind.cos2 <- abs(inertia$row.rel)/10000
head(ind.cos2)</code></pre>
<pre><code>            Axis1  Axis2  Axis3  Axis4  Axis5 con.tra
SEBRLE     0.0075 0.4975 0.0813 0.0014 0.2689  0.0221
CLAY       0.0487 0.4570 0.1436 0.1258 0.0508  0.0583
BERNARD    0.1972 0.2900 0.0043 0.4118 0.0008  0.0407
YURKOV     0.0961 0.0238 0.7782 0.0618 0.0202  0.0357
ZSIVOCZKY  0.0016 0.5764 0.2398 0.0014 0.0055  0.0323
McMULLEN   0.0022 0.1522 0.1101 0.2665 0.3893  0.0294</code></pre>
</div>
<div id="contribution-of-the-individuals-to-the-princial-components" class="section level2">
<h2>Contribution of the individuals to the principal components</h2>
<p>The contributions can be printed in % as follows :</p>
<pre class="r"><code># absolute contributions of rows
ind.contrib <- inertia$row.abs/100
head(ind.contrib)</code></pre>
<pre><code>           Axis1 Axis2 Axis3 Axis4 Axis5
SEBRLE      0.04  5.97  1.45  0.04  8.46
CLAY        0.69 14.48  6.75  8.94  4.22
BERNARD     1.95  6.42  0.14 20.47  0.04
YURKOV      0.83  0.46 22.45  2.70  1.03
ZSIVOCZKY   0.01 10.12  6.25  0.05  0.25
McMULLEN    0.02  2.43  2.61  9.55 16.29</code></pre>
<p><span class="success">It’s also possible to use the function get_pca_ind() [from the factoextra package]. It provides a list of matrices containing all the results for the active individuals (coordinates, squared cosine and contributions).</span></p>
<pre class="r"><code>ind <- get_pca_ind(res.pca)
names(ind)</code></pre>
<pre><code>[1] "coord"   "cos2"    "contrib"</code></pre>
<pre class="r"><code># Contributions of individuals
head(ind$contrib)</code></pre>
<pre><code>                Dim.1      Dim.2      Dim.3       Dim.4       Dim.5
SEBRLE     0.04029447  5.9714533  1.4483919  0.03734589  8.45894063
CLAY       0.68805664 14.4839248  6.7537381  8.94458283  4.21794385
BERNARD    1.94740183  6.4234107  0.1411345 20.46819433  0.04393073
YURKOV     0.83308415  0.4632733 22.4517396  2.69663605  1.03075263
ZSIVOCZKY  0.01232413 10.1217143  6.2464325  0.05469230  0.25151025
McMULLEN   0.01549089  2.4310854  2.6102794  9.55055888 16.29493304</code></pre>
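<p>The contribution of an individual to a component can be reproduced from its coordinate: the squared coordinate, weighted by 1/n, divided by the eigenvalue of the component. The sketch below assumes uniform row weights (n = 23 active individuals) and retypes the Axis2 coordinates and the second eigenvalue from the outputs above; the object names are illustrative :</p>
<pre class="r"><code># Number of active individuals
n <- 23
# Axis2 coordinates copied from res.pca$li
li.axis2 <- c(SEBRLE = 1.5890567, CLAY = 2.4748137, BERNARD = 1.6480950)
# Second eigenvalue, copied from the eigenvalue table
eig2 <- 1.8385309
# Contribution (%) of each individual to Axis2
contrib.axis2 <- 100 * (li.axis2^2 / n) / eig2
round(contrib.axis2, 2)</code></pre>
<p>The result matches the Dim.2 column printed above to within rounding of the retyped inputs.</p>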
<p>Use the function <strong>fviz_pca_contrib()</strong>[factoextra package] to visualize the most contributing individuals :</p>
<p>(factoextra >= 1.0.1 is required)</p>
<pre class="r"><code># Contributions of individuals on PC1
fviz_pca_contrib(res.pca, choice = "ind", axes = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-individuals-contribution-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Contributions of individuals on PC2
fviz_pca_contrib(res.pca, choice = "ind", axes = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-individuals-contribution-data-mining-2.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>Read more about <strong>fviz_pca_contrib()</strong> : <a href="https://www.sthda.com/english/english/wiki/principal-component-analysis-how-to-reveal-the-most-important-variables-in-your-data-r-software-and-data-mining">Principal Component Analysis: How to reveal the most important variables in your data?</a></p>
</div>
<div id="graph-of-individuals-using-ade4-base-graph" class="section level2">
<h2>Graph of individuals using ade4 base graph</h2>
<p>The function <strong>s.label()</strong> can be used. A simplified format is :</p>
<pre class="r"><code>s.label(dfxy, xax = 1, yax = 2)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>dfxy</strong> : a data frame with at least two coordinates</li>
<li><strong>xax</strong> : a numeric value specifying the column number containing x values</li>
<li><strong>yax</strong> : a numeric value specifying the column number containing y values</li>
</ul>
</div>
<p><br/></p>
<p>Factor map of individuals :</p>
<pre class="r"><code>s.label(res.pca$li, xax = 1, yax = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-individus-graph-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="biplot-of-individuals-and-variables-using-ade4" class="section level2">
<h2>Biplot of individuals and variables using ade4</h2>
<p>A biplot can be drawn by combining the two functions below :</p>
<ul>
<li>s.label() to plot individuals</li>
<li>s.arrow() to add variables</li>
</ul>
<pre class="r"><code># Plot of individuals
s.label(res.pca$li, xax = 1, yax = 2)
# Add variables
s.arrow(7*res.pca$c1, add.plot = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-biplot-ade4-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s also possible to use the function <strong>scatter()</strong> or <strong>biplot()</strong> :</p>
<pre class="r"><code>scatter(res.pca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-biplot-ade4-data-mining2-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Remove the scree plot (posieig ="none")
# Remove row labels (clab.row = 0)
scatter(res.pca,  posieig = "none", clab.row = 0)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-biplot-ade4-data-mining2-2.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre><code>NULL</code></pre>
<p><span class="warning">Note that, to remove variable labels the argument <strong>clab.col = 0</strong> can be used.</span></p>
</div>
<div id="graph-of-individuals-using-factoextra" class="section level2">
<h2>Graph of individuals using factoextra</h2>
<p>The function <strong>fviz_pca_ind()</strong> is used to visualize individuals :</p>
<pre class="r"><code>fviz_pca_ind(res.pca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-individuals-graph-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><strong>Automatically control the color of individuals</strong> using the cos2 values (the quality of the individuals on the factor map) :</p>
<pre class="r"><code>fviz_pca_ind(res.pca, col.ind="cos2") +
scale_color_gradient2(low="white", mid="blue", 
                  high="red", midpoint=0.50)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-individuals-graph-color-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><strong>Change the theme</strong> :</p>
<pre class="r"><code>fviz_pca_ind(res.pca, col.ind="cos2") +
scale_color_gradient2(low="white", mid="blue", 
    high="red", midpoint=0.50) + theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-individuals-graph-theme-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>Read more about fviz_pca_ind() : <a href="https://www.sthda.com/english/english/wiki/fviz-pca-ind-graph-of-individuals-principal-component-analysis-r-software-and-data-mining">Graph of individuals - principal component analysis</a></p>
<p><strong>Make a biplot of individuals and variables</strong> :</p>
<pre class="r"><code>fviz_pca_biplot(res.pca, geom = "text") +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-biplot-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>Read more about <strong>fviz_pca_biplot()</strong> : <a href="https://www.sthda.com/english/english/wiki/fviz-pca-biplot-biplot-of-individuals-and-variables-principal-component-analysis-r-software-and-data-mining">Biplot of individuals and variables - principal component analysis</a></p>
</div>
<div id="change-the-color-of-individuals-by-groups" class="section level2">
<h2>Change the color of individuals by groups</h2>
<p>The data set <em>decathlon2</em> contains a <strong>supplementary qualitative variable</strong> in column 13, corresponding to the type of competition.</p>
<p>Qualitative variables can be helpful for interpreting the data and for coloring individuals by groups :</p>
<pre class="r"><code># Data for the supplementary qualitative variables
quali.sup <- as.factor(decathlon2[1:23, 13])
head(quali.sup)</code></pre>
<pre><code>[1] Decastar Decastar Decastar Decastar Decastar Decastar
Levels: Decastar OlympicG</code></pre>
<p>The function <strong>s.class()</strong> can be used to visualize the classes (groups) of points :</p>
<pre class="r"><code>s.class(dfxy, fac, xax = 1, yax = 2, col)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>dfxy</strong> : a data frame containing the two columns for x and y axes</li>
<li><strong>fac</strong> : a factor variable partitioning the individuals in classes</li>
<li><strong>xax</strong>, <strong>yax</strong> : numeric values specifying the column numbers containing x and y values</li>
<li><strong>col</strong> : a vector of colors used to draw each class in a different color</li>
</ul>
</div>
<p><br/></p>
<p><strong>Color individuals by groups</strong> :</p>
<pre class="r"><code>s.class(res.pca$li, fac = quali.sup, xax = 1, yax = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-color-individuals-by-groups-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change the colors
s.class(res.pca$li, fac = quali.sup, col = c("blue", "red"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-color-individuals-by-groups-data-mining-2.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Make a biplot
# clab.row : hide the label for rows (individuals)
res <- scatter(res.pca, clab.row = 0, posieig = "none")
s.class(res.pca$li, fac = quali.sup, col = c("blue", "red"),
        add.plot = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-color-individuals-by-groups-data-mining-3.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Customize the biplot
# - remove row labels (clab.row = 0)
# - hide the scree plot (posieig = "none")
# - remove stars (cstar = 0)
# - remove ellipse (cellipse = 0)
res <- scatter(res.pca, clab.row = 0, posieig = "none")
s.class(res.pca$li, fac = quali.sup, col = c("blue", "red"),
        add.plot = TRUE, cstar = 0, cellipse = 0)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-color-individuals-by-groups-data-mining-4.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># remove labels for classes (clabel = 0)
res <- scatter(res.pca, clab.row = 0, posieig = "none")
s.class(res.pca$li, fac = quali.sup, col = c("blue", "red"),
        add.plot = TRUE, cstar = 0, cellipse = 0, clabel = 0)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-color-individuals-by-groups-data-mining-5.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
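<p>The class labels and the centres of the stars drawn by s.class() sit at the centre of gravity of each group, that is, the mean coordinates of its individuals. The base R sketch below computes such group centres by hand; it uses prcomp() and the iris data instead of ade4 and decathlon2 (an assumption made only to keep the example self-contained) :</p>
<pre class="r"><code># Coordinates of individuals on the first two components (base R)
scores <- prcomp(iris[, 1:4], scale. = TRUE)$x[, 1:2]
# Mean coordinates of each group
centroids <- aggregate(scores, by = list(group = iris$Species),
                       FUN = mean)
centroids</code></pre>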
<p>It’s also possible to use factoextra :</p>
<pre class="r"><code>fviz_pca_ind(res.pca, habillage = quali.sup,
     addEllipses =TRUE, ellipse.level = 0.68) +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-individuals-graph-supplementary-qualitative-variables-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><strong>Elegant biplot using factoextra and iris data</strong> :</p>
<pre class="r"><code>data(iris)

head(iris)</code></pre>
<pre><code>  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa</code></pre>
<pre class="r"><code># The variable Species (index = 5) is removed
# before the PCA
iris.pca <- dudi.pca(iris[,-5], scannf = FALSE, nf = 2)</code></pre>
<p>Now, let’s :</p>
<ul>
<li>make a biplot of individuals and variables</li>
<li>change the color of individuals by groups</li>
<li>change the transparency of variables by their cos2 values</li>
<li>show only the labels for variables</li>
</ul>
<pre class="r"><code>fviz_pca_biplot(iris.pca, 
  habillage = iris$Species, addEllipses = TRUE,
  col.var = "red", alpha.var ="cos2",
  label = "var") +
  scale_color_brewer(palette="Dark2")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-biplot-change-color-transparency-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="principal-component-analysis-using-supplementary-individuals-and-variables" class="section level1">
<h1>Principal component analysis using supplementary individuals and variables</h1>
<p><span class="warning">As described above, the data set <em>decathlon2</em> contains <strong>supplementary continuous variables</strong> (quanti.sup, columns 11:12), a <strong>supplementary qualitative variable</strong> (quali.sup, column 13) and <strong>supplementary individuals</strong> (ind.sup, rows 24:27).</span></p>
<p>Supplementary variables and individuals are not used to compute the principal components. Their coordinates are predicted using only the information provided by the principal component analysis performed on the active variables and individuals.</p>
<p>The functions <strong>suprow()</strong> and <strong>supcol()</strong> [in ade4 package] are used to calculate the coordinates of supplementary rows (individuals) and columns (variables), respectively.</p>
<p>The simplified formats are :</p>
<pre class="r"><code># For supplementary individuals (rows)
suprow(x, Xsup)

# For supplementary variables (columns)
supcol(x, Xsup)</code></pre>
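<p>To make the idea of predicted coordinates concrete, here is a minimal base R sketch. It uses <strong>prcomp()</strong> on the iris measurements rather than ade4, and the split into 140 active and 10 supplementary rows is an arbitrary assumption made for illustration. The supplementary rows are standardized with the means and standard deviations of the active rows, then projected onto the principal axes. This is an analogue of the idea behind suprow(), not a description of its internals :</p>
<pre class="r"><code># Illustration only: arbitrary active / supplementary split of iris
X <- iris[, 1:4]
active <- X[1:140, ]
sup <- X[141:150, ]

# PCA on the active rows only (standardized variables)
pca <- prcomp(active, center = TRUE, scale. = TRUE)

# Standardize the supplementary rows with the ACTIVE means and sds,
# then project them onto the principal axes
sup.std <- scale(sup, center = pca$center, scale = pca$scale)
sup.coord <- sup.std %*% pca$rotation
head(sup.coord[, 1:2])

# predict() performs the same computation
all.equal(sup.coord, predict(pca, newdata = sup))</code></pre>
<p>The supplementary rows never enter the call to prcomp(), so they have no influence on the axes themselves.</p>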
<div id="supplementary-individuals" class="section level2">
<h2>Supplementary individuals</h2>
<pre class="r"><code># Data for the supplementary individuals
ind.sup <- decathlon2[24:27, 1:10, drop = FALSE]
ind.sup[, 1:6]</code></pre>
<pre><code>         X100m Long.jump Shot.put High.jump X400m X110m.hurdle
KARPOV   11.02      7.30    14.77      2.04 48.37        14.09
WARNERS  11.11      7.60    14.31      1.98 48.68        14.23
Nool     10.80      7.53    14.26      1.88 48.81        14.80
Drews    10.87      7.38    13.07      1.88 48.51        14.01</code></pre>
<p>Predict the coordinates of the supplementary individuals :</p>
<pre class="r"><code>ind.sup.pca <- suprow(res.pca, ind.sup)
names(ind.sup.pca)</code></pre>
<pre><code>[1] "tabsup" "lisup" </code></pre>
<pre class="r"><code># coordinates 
ind.sup.coord <- ind.sup.pca$lisup
head(ind.sup.coord)</code></pre>
<pre><code>              Axis1       Axis2     Axis3      Axis4      Axis5
KARPOV   -0.7947206  0.77951227 1.6330203  1.7242283 0.75070396
WARNERS   0.3864645 -0.12159237 1.7387332 -0.7063341 0.03230011
Nool      0.5591306  1.97748871 0.4830358 -2.2784526 0.25461493
Drews     1.1092038  0.01741477 3.0488182 -1.5343468 0.32642192</code></pre>
<p><span class="question">How to visualize supplementary individuals on the factor map?</span></p>
<p>The function <strong>fviz_add()</strong> is used :</p>
<pre class="r"><code># Plot of active individuals
p <- fviz_pca_ind(res.pca)
# Add supplementary individuals
fviz_add(p, ind.sup.coord, color ="blue")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-supplementary-individuals-factoextra-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="question">How to calculate the cos2 (quality of the representation) for supplementary individuals?</span></p>
<pre class="r"><code>cos2.func <-function(x){x^2/sum(x^2)}
ind.sup.cos2 <- t(apply(ind.sup.coord, 1, cos2.func))
head(ind.sup.cos2)</code></pre>
<pre><code>              Axis1        Axis2      Axis3     Axis4        Axis5
KARPOV   0.08486144 8.164458e-02 0.35831467 0.3994579 0.0757214366
WARNERS  0.04050537 4.009646e-03 0.81989704 0.1353050 0.0002829447
Nool     0.03218782 4.026179e-01 0.02402281 0.5344967 0.0066747159
Drews    0.09473792 2.335268e-05 0.71575477 0.1812793 0.0082046453</code></pre>
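<p>Note that this cos2 is computed relative to the axes kept in the coordinate matrix, so each row sums to 1. A small sketch with made-up coordinates (the matrix <em>toy.coord</em> is hypothetical and only serves to show the arithmetic) :</p>
<pre class="r"><code>cos2.func <- function(x) x^2 / sum(x^2)

# Hypothetical coordinates of two individuals on two axes
toy.coord <- matrix(c(3, 4,
                      1, 1),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("ind1", "ind2"), c("Axis1", "Axis2")))

# Squared coordinate divided by the squared distance to the origin
toy.cos2 <- t(apply(toy.coord, 1, cos2.func))
toy.cos2
#      Axis1 Axis2
# ind1  0.36  0.64
# ind2  0.50  0.50

# Each row sums to 1 because only the listed axes are considered
rowSums(toy.cos2)</code></pre>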
</div>
<div id="supplementary-quantitative-variables" class="section level2">
<h2>Supplementary quantitative variables</h2>
<pre class="r"><code># Data for the supplementary quantitative variables
quanti.sup <- decathlon2[1:23, 11:12, drop = FALSE]
head(quanti.sup)</code></pre>
<pre><code>           Rank Points
SEBRLE        1   8217
CLAY          2   8122
BERNARD       4   8067
YURKOV        5   8036
ZSIVOCZKY     7   8004
McMULLEN      8   7995</code></pre>
<p><span class="notice">Remember that rows 24:27 are supplementary individuals. We don’t want them in the current analysis, which is why I extracted only rows 1:23.</span></p>
<p><strong>Predict the coordinates of the supplementary variables</strong> :</p>
<p>(You have to scale the supplementary variables before the analysis as the PCA has been performed on scaled data.)</p>
<pre class="r"><code>quanti.pca <- supcol(res.pca, scale(quanti.sup))
names(quanti.pca)</code></pre>
<pre><code>[1] "tabsup" "cosup" </code></pre>
<pre class="r"><code># coordinates 
quanti.coord <- quanti.pca$cosup
head(quanti.coord)</code></pre>
<pre><code>            Comp1      Comp2      Comp3      Comp4      Comp5
Rank    0.6860587 -0.2398049  0.1793975  0.0545264 0.07220371
Points -0.9425246  0.0759751 -0.1545490 -0.1625770 0.03046248</code></pre>
<p><strong>Visualize supplementary variables on the factor map using factoextra :</strong></p>
<pre class="r"><code># Plot of active variables
p <- fviz_pca_var(res.pca)
# Add supplementary quantitative variables
fviz_add(p, quanti.coord, geom="arrow", color ="blue")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/ade4-principal-component-analysis-quantitative-supplementary-variable-data-mining-1.png" title="ade4 and factoextra : Principal Component Analysis - R software and data mining" alt="ade4 and factoextra : Principal Component Analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Get the cos2 of the supplementary quantitative variables
(quanti.coord^2)[, 1:4]</code></pre>
<pre><code>           Comp1       Comp2      Comp3       Comp4
Rank   0.4706766 0.057506383 0.03218347 0.002973128
Points 0.8883526 0.005772216 0.02388540 0.026431296</code></pre>
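<p>In a PCA on standardized variables, the coordinate of a supplementary quantitative variable on a component can be read as its correlation with the individuals’ scores on that component, and its cos2 as the squared correlation. The sketch below illustrates this in base R with <strong>prcomp()</strong> and the mtcars data. Treating <em>mpg</em> as the supplementary variable is an assumption made for illustration, and the code is not meant to reproduce the supcol() output above :</p>
<pre class="r"><code># Illustration only: five active variables, mpg kept aside as supplementary
active <- mtcars[, c("disp", "hp", "drat", "wt", "qsec")]
sup <- mtcars[, "mpg", drop = FALSE]

# PCA on the standardized active variables
pca <- prcomp(active, center = TRUE, scale. = TRUE)

# Coordinates of the supplementary variable:
# correlations between mpg and the scores of the individuals
sup.coord <- cor(sup, pca$x)
round(sup.coord, 3)

# cos2 = squared coordinates; the row total cannot exceed 1
sup.cos2 <- sup.coord^2
rowSums(sup.cos2)</code></pre>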
</div>
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2) and <strong>ggplot2</strong> (ver. 1.0.0) </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->

<!-- END HTML -->]]></description>
			<pubDate>Sun, 15 Mar 2015 08:43:27 +0100</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[FactoMineR and factoextra : Principal Component Analysis Visualization - R software and data mining]]></title>
			<link>https://www.sthda.com/english/wiki/factominer-and-factoextra-principal-component-analysis-visualization-r-software-and-data-mining</link>
			<guid>https://www.sthda.com/english/wiki/factominer-and-factoextra-principal-component-analysis-visualization-r-software-and-data-mining</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<div id="TOC">
<ul>
<li><a href="#install-and-load-factominer-package">Install and load FactoMineR package</a></li>
<li><a href="#install-and-load-factoextra-for-visualization">Install and load factoextra for visualization</a></li>
<li><a href="#prepare-the-data">Prepare the data</a></li>
<li><a href="#exploratory-data-analysis">Exploratory data analysis</a><ul>
<li><a href="#descriptive-statistics">Descriptive statistics</a></li>
<li><a href="#correlation-matrix">Correlation matrix</a></li>
</ul></li>
<li><a href="#principal-component-analysis">Principal component analysis</a><ul>
<li><a href="#variances-of-the-principal-components">Variances of the principal components</a></li>
</ul></li>
<li><a href="#graph-of-individus-and-variables">Graph of individuals and variables</a></li>
<li><a href="#variables-factor-map-the-correlation-circle">Variables factor map : The correlation circle</a><ul>
<li><a href="#coordinates-of-variables-on-the-principal-components">Coordinates of variables on the principal components</a></li>
<li><a href="#cos2-quality-of-variables-on-the-factor-map">Cos2 : quality of variables on the factor map</a></li>
<li><a href="#contributions-of-the-variables-to-the-principal-components">Contributions of the variables to the principal components</a></li>
<li><a href="#graph-of-variables-using-factominer-base-graph">Graph of variables using FactoMineR base graph</a></li>
<li><a href="#graph-of-variables-using-factoextra">Graph of variables using factoextra</a></li>
</ul></li>
<li><a href="#graph-of-individuals">Graph of individuals</a><ul>
<li><a href="#coordinates-of-individuals-on-the-principal-components">Coordinates of individuals on the principal components</a></li>
<li><a href="#cos2-quality-of-representation-of-individuals-on-the-principal-components">Cos2 : quality of representation of individuals on the principal components</a></li>
<li><a href="#contribition-of-individuals-to-the-princial-components">Contribution of individuals to the principal components</a></li>
<li><a href="#graph-of-individuals-using-factominer-base-graph">Graph of individuals using FactoMineR base graph</a></li>
<li><a href="#graph-of-individuals-using-factoextra">Graph of individuals using factoextra</a></li>
<li><a href="#change-the-color-of-individuals-by-groups">Change the color of individuals by groups</a></li>
</ul></li>
<li><a href="#principal-component-analysis-using-supplementary-individuals-and-variables">Principal component analysis using supplementary individuals and variables</a><ul>
<li><a href="#visualize-supplementary-quantitative-variables">Visualize supplementary quantitative variables</a></li>
<li><a href="#visualize-supplementary-individuals">Visualize supplementary individuals</a></li>
<li><a href="#supplementary-qualitative-variables">Supplementary qualitative variables</a></li>
</ul></li>
<li><a href="#dimension-description">Dimension description</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/></p>
<p><strong>Principal component analysis</strong> (PCA) allows us to summarize the variation (information) in a data set described by multiple variables. Each variable can be considered as a different dimension. If you have more than 3 variables in your data set, it can be very difficult to visualize the resulting multi-dimensional space.</p>
<p><strong>The goal of principal component analysis</strong> is to transform the initial variables into a new set of variables that explain the variation in the data. These new variables are linear combinations of the original ones and are called principal components.</p>
<p>PCA reduces the dimensionality of multivariate data to two or three principal components that can be visualized graphically with minimal loss of information.</p>
<p>Several functions from different packages are available in R for performing PCA : prcomp() and princomp() (built-in R stats package), PCA() (FactoMineR package) and dudi.pca() (ade4 package).</p>
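<p>As a quick illustration of principal components being linear combinations of the original variables, here is a minimal sketch with the built-in <strong>prcomp()</strong> function. The USArrests data set is chosen only for illustration and is not used elsewhere in this tutorial :</p>
<pre class="r"><code># PCA on standardized variables with the built-in prcomp()
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Loadings: coefficients of the linear combinations
round(pca$rotation[, 1:2], 2)

# Scores are the standardized data multiplied by the loadings
scores <- scale(USArrests) %*% pca$rotation
all.equal(scores, pca$x)

# Proportion of variance carried by each component
round(pca$sdev^2 / sum(pca$sdev^2), 3)</code></pre>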
<br/>
<div class="success">
<p>This <strong>R tutorial</strong> describes :</p>
<ol style="list-style-type: decimal">
<li>How to perform a <strong>principal component analysis</strong> using <strong>R software</strong> and <strong>FactoMineR</strong> package</li>
<li>How to visualize the output of the PCA using the R package <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining">factoextra</a></li>
</ol>
</div>
<p><br/></p>
<div id="install-and-load-factominer-package" class="section level1">
<h1>Install and load FactoMineR package</h1>
<p><strong>FactoMineR</strong> (Husson et al.) is one of the most <strong>powerful R packages</strong> and my favorite one for performing multivariate exploratory data analysis. Rich documentation is available on the official FactoMineR website (<a href="http://factominer.free.fr/index.html">http://factominer.free.fr/index.html</a>) and on YouTube. Many thanks to François Husson for this effort…</p>
<p><strong>FactoMineR</strong> can be installed and loaded as follows :</p>
<pre class="r"><code>install.packages("FactoMineR")

library("FactoMineR")</code></pre>
</div>
<div id="install-and-load-factoextra-for-visualization" class="section level1">
<h1>Install and load factoextra for visualization</h1>
<p><span class="success">The package <a href="https://www.sthda.com/english/english/wiki/factoextra-r-package-visualization-of-the-outputs-of-a-multivariate-analysis-r-software-and-data-mining"><strong>factoextra</strong></a> has flexible methods for the classes PCA, prcomp, princomp and dudi in order to extract and visualize quickly the results of the analysis. The ggplot2 plotting system is used for the data visualization.</span></p>
<p>Install <strong>factoextra</strong> as follows :</p>
<pre class="r"><code>library("devtools")
install_github("kassambara/factoextra")</code></pre>
<p>Load it :</p>
<pre class="r"><code>library("factoextra")</code></pre>
</div>
<div id="prepare-the-data" class="section level1">
<h1>Prepare the data</h1>
<p>We’ll use the data set <em>decathlon2</em> from the package <strong>factoextra</strong> :</p>
<pre class="r"><code>data(decathlon2)
head(decathlon2[, 1:6])</code></pre>
<pre><code>           X100m Long.jump Shot.put High.jump X400m X110m.hurdle
SEBRLE     11.04      7.58    14.83      2.07 49.81        14.69
CLAY       10.76      7.40    14.26      1.86 49.37        14.05
BERNARD    11.02      7.23    14.25      1.92 48.93        14.99
YURKOV     11.34      7.09    15.19      2.10 50.42        15.31
ZSIVOCZKY  11.13      7.30    13.48      2.01 48.62        14.17
McMULLEN   10.83      7.31    13.76      2.13 49.91        14.38</code></pre>
<p><span class="warning">This data set is just a subset of the <em>decathlon</em> data in the <strong>FactoMineR</strong> package</span></p>
<p>As illustrated below, the data used here describe athletes’ performance during two sporting events (Decastar and OlympicG). The data set contains 27 individuals (athletes) described by 13 variables :</p>
<p><a href="https://www.sthda.com/english/sthda/RDoc/images/pca-decathlon-big.png" title="Click to zoom!"> <img src="https://www.sthda.com/english/sthda/RDoc/images/pca-decathlon.png" alt="principal component analysis data"/> </a></p>
<br/>

<div class="warning">
<p>Only some of these individuals and variables will be used to perform the principal component analysis (PCA).</p>
<p>The coordinates of the remaining individuals and variables on the factor map will be <strong>predicted</strong> after the PCA.</p>
</div>
<p><br/></p>
<p>In PCA terminology, our data contains :</p>
<br/>
<div class="block">
<ul>
<li><strong>Active individuals</strong> (in blue, rows 1:23) : Individuals that are used during the principal component analysis.</li>
<li><strong>Supplementary individuals</strong> (in green, rows 24:27) : The coordinates of these individuals will be predicted using the PCA information and parameters obtained with active individuals/variables</li>
<li><strong>Active variables</strong> (in pink, columns 1:10) : Variables that are used for the principal component analysis.</li>
<li><strong>Supplementary variables</strong> : Like supplementary individuals, the coordinates of these variables will also be predicted.</li>
<li><strong>Supplementary continuous variables</strong> : Columns 11 and 12 corresponding respectively to the rank and the points of athletes.</li>
<li><strong>Supplementary qualitative variables</strong> : Column 13 corresponding to the two athletic meetings (2004 Olympic Games or 2004 Decastar). This factor variable will be used to color individuals by groups.</li>
</ul>
</div>
<p><br/></p>
<p>Extract only active individuals and variables for principal component analysis:</p>
<pre class="r"><code>decathlon2.active <- decathlon2[1:23, 1:10]
head(decathlon2.active[, 1:6])</code></pre>
<pre><code>           X100m Long.jump Shot.put High.jump X400m X110m.hurdle
SEBRLE     11.04      7.58    14.83      2.07 49.81        14.69
CLAY       10.76      7.40    14.26      1.86 49.37        14.05
BERNARD    11.02      7.23    14.25      1.92 48.93        14.99
YURKOV     11.34      7.09    15.19      2.10 50.42        15.31
ZSIVOCZKY  11.13      7.30    13.48      2.01 48.62        14.17
McMULLEN   10.83      7.31    13.76      2.13 49.91        14.38</code></pre>
</div>
<div id="exploratory-data-analysis" class="section level1">
<h1>Exploratory data analysis</h1>
<p>Before principal component analysis, we can perform some exploratory data analysis such as descriptive statistics, correlation matrix and scatter plot matrix.</p>
<div id="descriptive-statistics" class="section level2">
<h2>Descriptive statistics</h2>
<pre class="r"><code>decathlon2.active_stats <- data.frame(
  Min = apply(decathlon2.active, 2, min), # minimum
  Q1 = apply(decathlon2.active, 2, quantile, 1/4), # First quartile
  Med = apply(decathlon2.active, 2, median), # median
  Mean = apply(decathlon2.active, 2, mean), # mean
  Q3 = apply(decathlon2.active, 2, quantile, 3/4), # Third quartile
  Max = apply(decathlon2.active, 2, max) # Maximum
  )
decathlon2.active_stats <- round(decathlon2.active_stats, 1)
head(decathlon2.active_stats)</code></pre>
<pre><code>              Min   Q1  Med Mean   Q3  Max
X100m        10.4 10.8 11.0 11.0 11.2 11.6
Long.jump     6.8  7.2  7.3  7.3  7.5  8.0
Shot.put     12.7 14.2 14.7 14.6 15.1 16.4
High.jump     1.9  1.9  2.0  2.0  2.1  2.1
X400m        46.8 49.0 49.4 49.4 50.0 51.2
X110m.hurdle 14.0 14.2 14.4 14.5 14.9 15.7</code></pre>
<p><span class="warning">Note that you can also use the built-in R function <strong>summary()</strong> for the descriptive statistics, but I don’t like the format of its output for a data frame.</span></p>
</div>
<div id="correlation-matrix" class="section level2">
<h2>Correlation matrix</h2>
<p>The correlation between variables can be calculated as follows :</p>
<pre class="r"><code>cor.mat <- round(cor(decathlon2.active),2)
head(cor.mat[, 1:6])</code></pre>
<pre><code>             X100m Long.jump Shot.put High.jump X400m X110m.hurdle
X100m         1.00     -0.76    -0.45     -0.40  0.59         0.73
Long.jump    -0.76      1.00     0.44      0.34 -0.51        -0.59
Shot.put     -0.45      0.44     1.00      0.53 -0.31        -0.38
High.jump    -0.40      0.34     0.53      1.00 -0.37        -0.25
X400m         0.59     -0.51    -0.31     -0.37  1.00         0.58
X110m.hurdle  0.73     -0.59    -0.38     -0.25  0.58         1.00</code></pre>
<p><strong>Visualize the correlation matrix using a correlogram</strong> : the package <strong>corrplot</strong> is required.</p>
<pre class="r"><code># install.packages("corrplot")
library("corrplot")
corrplot(cor.mat, type="upper", order="hclust", 
         tl.col="black", tl.srt=45)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-correlogram-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success"> Read more about visualizing correlation matrix : <a href="https://www.sthda.com/english/english/wiki/correlation-matrix-a-quick-start-guide-to-analyze-format-and-visualize-a-correlation-matrix-using-r-software">Correlation matrix visualization</a></span></p>
<p><strong>Make a scatter plot matrix</strong> showing the correlation coefficients between variables and the significance levels : the package <strong>PerformanceAnalytics</strong> is required.</p>
<pre class="r"><code># install.packages("PerformanceAnalytics")
library("PerformanceAnalytics")
chart.Correlation(decathlon2.active[, 1:6], histogram=TRUE, pch=19)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-correlation-matrix-chart-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success"> You can read more about this plot here : <a href="https://www.sthda.com/english/english/wiki/correlation-matrix-a-quick-start-guide-to-analyze-format-and-visualize-a-correlation-matrix-using-r-software">Correlation matrix visualization</a></span></p>
</div>
</div>
<div id="principal-component-analysis" class="section level1">
<h1>Principal component analysis</h1>
<p>The function <strong>PCA()</strong> [in <em>FactoMineR</em> package] can be used. A simplified format is :</p>
<pre class="r"><code>PCA(X, scale.unit = TRUE, ncp = 5, graph = TRUE)</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>X</strong> : a data frame. Rows are individuals and columns are numeric variables</li>
<li><strong>scale.unit</strong> : a logical value. If <em>TRUE</em>, the data are scaled to unit variance before the analysis. This standardization prevents some variables from becoming dominant just because of their large measurement units.</li>
<li><strong>ncp</strong> : number of dimensions kept in the final results.</li>
<li><strong>graph</strong> : a logical value. If TRUE a graph is displayed.</li>
</ul>
</div>
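<p>The effect of scaling can be checked with a short base R sketch. It relies on <strong>prcomp()</strong>, whose <em>scale.</em> argument plays a role comparable to <em>scale.unit</em>, and on the built-in USArrests data, both chosen here only for illustration :</p>
<pre class="r"><code># Variances differ by orders of magnitude across variables
round(apply(USArrests, 2, var), 1)

# PCA without and with scaling to unit variance
pca.raw <- prcomp(USArrests, scale. = FALSE)
pca.std <- prcomp(USArrests, scale. = TRUE)

# Absolute loadings on the first component:
# without scaling, the variable with the largest variance (Assault) dominates
round(abs(pca.raw$rotation[, 1]), 2)

# with scaling, the loadings are much more balanced
round(abs(pca.std$rotation[, 1]), 2)</code></pre>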
<p><br/></p>
<p>In the R code below, the PCA is performed only on the active individuals/variables :</p>
<pre class="r"><code>library("FactoMineR")
res.pca <- PCA(decathlon2.active, graph = FALSE)</code></pre>
<p>The output of the function <strong>PCA()</strong> is a list including :</p>
<pre class="r"><code>print(res.pca)</code></pre>
<pre><code>**Results for the Principal Component Analysis (PCA)**
The analysis was performed on 23 individuals, described by 10 variables
*The results are available in the following objects:

   name               description                          
1  "$eig"             "eigenvalues"                        
2  "$var"             "results for the variables"          
3  "$var$coord"       "coord. for the variables"           
4  "$var$cor"         "correlations variables - dimensions"
5  "$var$cos2"        "cos2 for the variables"             
6  "$var$contrib"     "contributions of the variables"     
7  "$ind"             "results for the individuals"        
8  "$ind$coord"       "coord. for the individuals"         
9  "$ind$cos2"        "cos2 for the individuals"           
10 "$ind$contrib"     "contributions of the individuals"   
11 "$call"            "summary statistics"                 
12 "$call$centre"     "mean of the variables"              
13 "$call$ecart.type" "standard error of the variables"    
14 "$call$row.w"      "weights for the individuals"        
15 "$call$col.w"      "weights for the variables"          </code></pre>
<p><span class="success">The object created by the function <strong>PCA()</strong> contains a lot of information stored in several lists and matrices. These values are described in the next sections.</span></p>
<div id="variances-of-the-principal-components" class="section level2">
<h2>Variances of the principal components</h2>
<p>The proportion of variance retained by the principal components can be extracted as follows :</p>
<pre class="r"><code>eigenvalues <- res.pca$eig
head(eigenvalues[, 1:2])</code></pre>
<pre><code>       eigenvalue percentage of variance
comp 1  4.1242133              41.242133
comp 2  1.8385309              18.385309
comp 3  1.2391403              12.391403
comp 4  0.8194402               8.194402
comp 5  0.7015528               7.015528
comp 6  0.4228828               4.228828</code></pre>
<div class="success">
<ul>
<li><p><strong>Eigenvalues</strong> correspond to the amount of the variation explained by each principal component (PC). Eigenvalues are large for the first PC and small for the subsequent PCs.</p></li>
<li>A PC with an eigenvalue > 1 accounts for more variance than any single original variable in standardized data. This is commonly used as a cutoff point to determine the number of PCs to retain.</li>
</ul>
</div>
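<p>The link between eigenvalues and percentages can be verified by hand. The sketch below reuses the six eigenvalues printed above and assumes a PCA on the 10 standardized active variables, in which case the eigenvalues sum to the number of variables :</p>
<pre class="r"><code># First six eigenvalues, copied from the output above
eig <- c(4.1242133, 1.8385309, 1.2391403, 0.8194402, 0.7015528, 0.4228828)

# Number of active (standardized) variables = total variance
n.var <- 10

# Percentage of variance and cumulative percentage
pct <- 100 * eig / n.var
round(pct, 2)
round(cumsum(pct), 2)

# Components with an eigenvalue > 1
which(eig > 1)</code></pre>
<p>With these values, the first three components pass the eigenvalue > 1 cutoff.</p>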
<p><strong>Make a scree plot using base graphics</strong> : A scree plot is a graph of the eigenvalues/variances associated with components.</p>
<pre class="r"><code># barplot() returns the x positions (midpoints) of the bars
bar.pos <- barplot(eigenvalues[, 2], names.arg=1:nrow(eigenvalues), 
       main = "Variances",
       xlab = "Principal Components",
       ylab = "Percentage of variances",
       col ="steelblue")
# Add connected line segments to the plot, centered on the bars
lines(x = bar.pos, eigenvalues[, 2], 
      type="b", pch=19, col = "red")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-eigenvalue-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">About 60% of the information (variance) contained in the data is retained by the first two principal components.</span></p>
<p><strong>Make the scree plot using the package factoextra</strong> :</p>
<pre class="r"><code>fviz_screeplot(res.pca, ncp=10)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-eigenvalue-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="graph-of-individus-and-variables" class="section level1">
<h1>Graph of individuals and variables</h1>
<p>The function <strong>plot.PCA()</strong> can be used. A simplified format is :</p>
<pre class="r"><code>plot.PCA(x, axes = c(1,2), choix=c("ind", "var"))</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>x</strong> : An object of class <strong>PCA</strong></li>
<li><strong>axes</strong> : A numeric vector of length 2 specifying the two components to plot</li>
<li><strong>choix</strong> : The graph to be plotted. Possible values are “ind” for the individuals and “var” for the variables</li>
</ul>
</div>
<p><br/></p>
</div>
<div id="variables-factor-map-the-correlation-circle" class="section level1">
<h1>Variables factor map : The correlation circle</h1>
<div id="coordinates-of-variables-on-the-principal-components" class="section level2">
<h2>Coordinates of variables on the principal components</h2>
<pre class="r"><code>head(res.pca$var$coord)</code></pre>
<pre><code>                  Dim.1       Dim.2      Dim.3       Dim.4      Dim.5
X100m        -0.8506257 -0.17939806  0.3015564  0.03357320 -0.1944440
Long.jump     0.7941806  0.28085695 -0.1905465 -0.11538956  0.2331567
Shot.put      0.7339127  0.08540412  0.5175978  0.12846837 -0.2488129
High.jump     0.6100840 -0.46521415  0.3300852  0.14455012  0.4027002
X400m        -0.7016034  0.29017826  0.2835329  0.43082552  0.1039085
X110m.hurdle -0.7641252 -0.02474081  0.4488873 -0.01689589  0.2242200</code></pre>
</div>
<div id="cos2-quality-of-variables-on-the-factor-map" class="section level2">
<h2>Cos2 : quality of variables on the factor map</h2>
<p><span class="success">The quality of representation of the variables on the principal components is called the cos2.</span></p>
<pre class="r"><code>head(res.pca$var$cos2)</code></pre>
<pre><code>                 Dim.1        Dim.2      Dim.3        Dim.4      Dim.5
X100m        0.7235641 0.0321836641 0.09093628 0.0011271597 0.03780845
Long.jump    0.6307229 0.0788806285 0.03630798 0.0133147506 0.05436203
Shot.put     0.5386279 0.0072938636 0.26790749 0.0165041211 0.06190783
High.jump    0.3722025 0.2164242070 0.10895622 0.0208947375 0.16216747
X400m        0.4922473 0.0842034209 0.08039091 0.1856106269 0.01079698
X110m.hurdle 0.5838873 0.0006121077 0.20149984 0.0002854712 0.05027463</code></pre>
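<p>For the variables, the cos2 values shown here are the squared coordinates. A short check using the first two rows and dimensions of the coordinates printed in the previous section :</p>
<pre class="r"><code># Coordinates copied from the output of res.pca$var$coord
coord <- matrix(c(-0.8506257, -0.17939806,
                   0.7941806,  0.28085695),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("X100m", "Long.jump"), c("Dim.1", "Dim.2")))

# cos2 = squared coordinates
cos2 <- coord^2
round(cos2, 4)

# Quality of representation on the first factor plane (Dim.1 + Dim.2)
rowSums(cos2)</code></pre>
<p>A row total close to 1 means the variable is well represented on the plane formed by the two dimensions.</p>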
</div>
<div id="contributions-of-the-variables-to-the-principal-components" class="section level2">
<h2>Contributions of the variables to the principal components</h2>
<p><span class="success">The contribution of a variable to a given principal component is (in percentage) : (var.cos2 * 100) / (total cos2 of the component)</span></p>
<pre class="r"><code>head(res.pca$var$contrib)</code></pre>
<pre><code>                 Dim.1      Dim.2     Dim.3       Dim.4     Dim.5
X100m        17.544293  1.7505098  7.338659  0.13755240  5.389252
Long.jump    15.293168  4.2904162  2.930094  1.62485936  7.748815
Shot.put     13.060137  0.3967224 21.620432  2.01407269  8.824401
High.jump     9.024811 11.7715838  8.792888  2.54987951 23.115504
X400m        11.935544  4.5799296  6.487636 22.65090599  1.539012
X110m.hurdle 14.157544  0.0332933 16.261261  0.03483735  7.166193</code></pre>
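<p>The formula can be checked by hand for the first dimension. The sketch below reuses the cos2 values printed earlier and assumes that the total cos2 of a component equals its eigenvalue (4.1242133 for Dim.1), which is consistent with the numbers shown in this tutorial :</p>
<pre class="r"><code># cos2 on Dim.1, copied from the output of res.pca$var$cos2
cos2.dim1 <- c(X100m = 0.7235641, Long.jump = 0.6307229, Shot.put = 0.5386279)

# Total cos2 of Dim.1, taken here as the first eigenvalue (assumption)
eig1 <- 4.1242133

# Contributions in percentage
contrib.dim1 <- 100 * cos2.dim1 / eig1
round(contrib.dim1, 2)</code></pre>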
</div>
<div id="graph-of-variables-using-factominer-base-graph" class="section level2">
<h2>Graph of variables using FactoMineR base graph</h2>
<pre class="r"><code>plot(res.pca, choix = "var")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-correlation-circle-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="graph-of-variables-using-factoextra" class="section level2">
<h2>Graph of variables using factoextra</h2>
<p>The function <strong>fviz_pca_var()</strong> is used to visualize variables :</p>
<pre class="r"><code># Default plot
fviz_pca_var(res.pca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-correlation-circle-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change color and theme
fviz_pca_var(res.pca, col.var="steelblue")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-correlation-circle-factoextra-data-mining-2.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="warning">Note that, with the <strong>factoextra</strong> package, the color or the transparency of variables can be controlled automatically by their contributions, their cos2, or their coordinates on the x or y axis.</span></p>
<pre class="r"><code># Control variable colors using their contribution
# Possible values for the argument col.var are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_pca_var(res.pca, col.var="contrib")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-correlation-circle-colors-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<pre class="r"><code># Change the gradient color
fviz_pca_var(res.pca, col.var="contrib")+
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=55)+theme_bw()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-correlation-circle-colors-factoextra-data-mining-2.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success">This is helpful to highlight the most important variables in the determination of the principal components.</span></p>
<p>It’s also possible to automatically control the transparency of variables by their contributions :</p>
<pre class="r"><code># Control the transparency of variables using their contribution
# Possible values for the argument alpha.var are :
  # "cos2", "contrib", "coord", "x", "y"
fviz_pca_var(res.pca, alpha.var="contrib")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-correlation-circle-colors-transparency-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success">Read more about ggplot2 and colors here : <a href="https://www.sthda.com/english/english/wiki/rggplot2-colors-how-to-change-colors-automatically-and-manually">ggplot2 colors - How to change colors automatically and manually?</a></span></p>
</div>
</div>
<div id="graph-of-individuals" class="section level1">
<h1>Graph of individuals</h1>
<div id="coordinates-of-individuals-on-the-principal-components" class="section level2">
<h2>Coordinates of individuals on the principal components</h2>
<pre class="r"><code>head(res.pca$ind$coord)</code></pre>
<pre><code>                Dim.1      Dim.2      Dim.3       Dim.4       Dim.5
SEBRLE      0.1955047  1.5890567  0.6424912  0.08389652  1.16829387
CLAY        0.8078795  2.4748137 -1.3873827  1.29838232 -0.82498206
BERNARD    -1.3591340  1.6480950  0.2005584 -1.96409420  0.08419345
YURKOV     -0.8889532 -0.4426067  2.5295843  0.71290837  0.40782264
ZSIVOCZKY  -0.1081216 -2.0688377 -1.3342591 -0.10152796 -0.20145217
McMULLEN    0.1212195 -1.0139102 -0.8625170  1.34164291  1.62151286</code></pre>
</div>
<div id="cos2-quality-of-representation-of-individuals-on-the-principal-components" class="section level2">
<h2>Cos2 : quality of representation of individuals on the principal components</h2>
<pre class="r"><code>head(res.pca$ind$cos2)</code></pre>
<pre><code>                 Dim.1      Dim.2       Dim.3       Dim.4        Dim.5
SEBRLE     0.007530179 0.49747323 0.081325232 0.001386688 0.2689026575
CLAY       0.048701249 0.45701660 0.143628117 0.125791741 0.0507850580
BERNARD    0.197199804 0.28996555 0.004294015 0.411819183 0.0007567259
YURKOV     0.096109800 0.02382571 0.778230322 0.061812637 0.0202279796
ZSIVOCZKY  0.001574385 0.57641944 0.239754152 0.001388216 0.0054654972
McMULLEN   0.002175437 0.15219499 0.110137872 0.266486530 0.3892621478</code></pre>
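<p>The cos2 of an individual on a component is its squared coordinate divided by its squared distance to the origin in the standardized data. A minimal sketch of that computation follows; the names <em>pca.chk</em>, <em>z</em> and <em>d2</em> are illustrative, and the active data are assumed to be rows 1:23 and columns 1:10 of <em>decathlon2</em>.</p>
<pre class="r"><code>library(FactoMineR)
library(factoextra)   # provides the decathlon2 data set
data(decathlon2)

X <- as.matrix(decathlon2[1:23, 1:10])
n <- nrow(X)
pca.chk <- PCA(X, graph = FALSE)

# scale() divides by the sample sd (n - 1); PCA() uses the population sd (n),
# hence the correction factor
z  <- scale(X) * sqrt(n / (n - 1))
d2 <- rowSums(z^2)                  # squared distance of each individual to the origin

# Squared coordinate over squared distance, row by row
manual <- pca.chk$ind$coord^2 / d2
all.equal(manual, pca.chk$ind$cos2)</code></pre>
<p>A cos2 close to 1 means the individual lies almost entirely along that component; the values over all the components (not only the 5 kept here) add up to 1.</p>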
</div>
<div id="contribition-of-individuals-to-the-princial-components" class="section level2">
<h2>Contribution of individuals to the principal components</h2>
<pre class="r"><code>head(res.pca$ind$contrib)</code></pre>
<pre><code>                Dim.1      Dim.2      Dim.3       Dim.4       Dim.5
SEBRLE     0.04029447  5.9714533  1.4483919  0.03734589  8.45894063
CLAY       0.68805664 14.4839248  6.7537381  8.94458283  4.21794385
BERNARD    1.94740183  6.4234107  0.1411345 20.46819433  0.04393073
YURKOV     0.83308415  0.4632733 22.4517396  2.69663605  1.03075263
ZSIVOCZKY  0.01232413 10.1217143  6.2464325  0.05469230  0.25151025
McMULLEN   0.01549089  2.4310854  2.6102794  9.55055888 16.29493304</code></pre>
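<p>These contributions can be recomputed from the coordinates. With equal weights, an individual contributes (in percent) 100 * coord^2 / (n * eigenvalue) to a component. The sketch below assumes the active data are rows 1:23 and columns 1:10 of <em>decathlon2</em>; <em>pca.chk</em> and <em>manual</em> are illustrative names.</p>
<pre class="r"><code>library(FactoMineR)
library(factoextra)   # provides the decathlon2 data set
data(decathlon2)

pca.chk <- PCA(decathlon2[1:23, 1:10], graph = FALSE)
n   <- nrow(pca.chk$ind$coord)
eig <- pca.chk$eig[1:5, 1]          # eigenvalues of the 5 components kept

# Squared coordinates divided, column by column, by n * eigenvalue
manual <- sweep(pca.chk$ind$coord^2, 2, n * eig, "/") * 100
all.equal(manual, pca.chk$ind$contrib)</code></pre>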
</div>
<div id="graph-of-individuals-using-factominer-base-graph" class="section level2">
<h2>Graph of individuals using FactoMineR base graph</h2>
<pre class="r"><code>plot(res.pca, choix = "ind")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-individus-graph-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="graph-of-individuals-using-factoextra" class="section level2">
<h2>Graph of individuals using factoextra</h2>
<p>The function <strong>fviz_pca_ind()</strong> is used to visualize individuals :</p>
<pre class="r"><code>fviz_pca_ind(res.pca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-individuals-graph-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><strong>Remove the points from the graph and show the labels only</strong> :</p>
<pre class="r"><code>fviz_pca_ind(res.pca, geom="text")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-individuals-graph-factoextra-remove-points-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<br/>

<div class="warning">
<p>Note that the allowed values for the argument <strong>geom</strong> are :</p>
<ul>
<li><strong>"point"</strong> to show only points (dots)</li>
<li><strong>"text"</strong> to show only labels</li>
<li><strong>c("point", "text")</strong> to show both types</li>
</ul>
</div>
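<p>For instance, the three values listed above can be compared side by side. This is a small sketch; <em>pca.chk</em> and the plot names are illustrative, and the active data are assumed to be rows 1:23 and columns 1:10 of <em>decathlon2</em>.</p>
<pre class="r"><code>library(FactoMineR)
library(factoextra)   # provides fviz_pca_ind() and the decathlon2 data set
data(decathlon2)

pca.chk <- PCA(decathlon2[1:23, 1:10], graph = FALSE)

p.points <- fviz_pca_ind(pca.chk, geom = "point")              # dots only
p.labels <- fviz_pca_ind(pca.chk, geom = "text")               # labels only
p.both   <- fviz_pca_ind(pca.chk, geom = c("point", "text"))   # dots and labels

# Each call returns a ggplot object; print one to display it
print(p.both)</code></pre>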
<p><br/></p>
<p><strong>Automatically control the color of individuals</strong> using their cos2 values (the quality of representation of the individuals on the factor map) :</p>
<pre class="r"><code>fviz_pca_ind(res.pca, col.ind="cos2") +
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=0.50)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-individuals-graph-color-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success">Read more about ggplot2 and colors here : <a href="https://www.sthda.com/english/english/wiki/rggplot2-colors-how-to-change-colors-automatically-and-manually">ggplot2 colors - How to change colors automatically and manually?</a></span></p>
<p><strong>Change the theme</strong> :</p>
<pre class="r"><code>fviz_pca_ind(res.pca,  col.ind="cos2") +
scale_color_gradient2(low="white", mid="blue", 
                      high="red", midpoint=0.50)+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-individuals-graph-theme-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success">Read more about ggplot2 themes here : <a href="https://www.sthda.com/english/english/wiki/ggplot2-themes-and-background-colors-the-3-elements">ggplot2 themes and background colors</a></span></p>
<p><strong>Make a biplot of individuals and variables</strong> :</p>
<pre class="r"><code>fviz_pca_biplot(res.pca,  geom = "text")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-biplot-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
</div>
<div id="change-the-color-of-individuals-by-groups" class="section level2">
<h2>Change the color of individuals by groups</h2>
<p>We will use the iris data set in this section :</p>
<pre class="r"><code>data(iris)

head(iris)</code></pre>
<pre><code>  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa</code></pre>
<pre class="r"><code># The variable Species (index = 5) is removed
# before PCA analysis
iris.pca <- PCA(iris[,-5], graph = FALSE)</code></pre>
<p><strong>Individuals factor map</strong> :</p>
<pre class="r"><code># Default plot
fviz_pca_ind(iris.pca, label="none")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-individuals-factor-map-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><strong>Change individual colors by groups</strong> :</p>
<pre class="r"><code>fviz_pca_ind(iris.pca,  label="none", habillage=iris$Species)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-individuals-factor-map-color-by-groups-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><strong>Add concentration ellipses</strong> : the argument <em>habillage</em> specifies the factor variable used to color the observations by groups; <em>addEllipses = TRUE</em> draws an ellipse around each group, at the level given by <em>ellipse.level</em>.</p>
<pre class="r"><code>fviz_pca_ind(iris.pca, label="none", habillage=iris$Species,
             addEllipses=TRUE, ellipse.level=0.95)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-individuals-factor-map-concentration-ellipse-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p>Now, let’s :</p>
<ul>
<li>make a biplot of individuals and variables</li>
<li>change the color of individuals by groups</li>
<li>change the transparency of variable colors by their cos2 values</li>
<li>show only the labels for variables</li>
</ul>
<pre class="r"><code>fviz_pca_biplot(iris.pca, 
  habillage = iris$Species, addEllipses = TRUE,
  col.var = "red", alpha.var ="cos2",
  label = "var") +
  scale_color_brewer(palette="Dark2")+
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-biplot-change-color-transparency-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
</div>
</div>
<div id="principal-component-analysis-using-supplementary-individuals-and-variables" class="section level1">
<h1>Principal component analysis using supplementary individuals and variables</h1>
<p><span class="warning">As described above, the data set <em>decathlon2</em> contains <strong>supplementary continuous variables</strong> (quanti.sup, columns 11:12), a <strong>supplementary qualitative variable</strong> (quali.sup, column 13) and <strong>supplementary individuals</strong> (ind.sup, rows 24:27).</span></p>
<p>Supplementary variables and individuals are not used to determine the principal components. Their coordinates are predicted using only the information provided by the principal component analysis performed on the active variables/individuals.</p>
<p>To specify supplementary individuals and variables, the function <strong>PCA()</strong> can be used as follows :</p>
<pre class="r"><code>PCA(X, scale.unit = TRUE, ncp = 5, ind.sup = NULL,
    quanti.sup=NULL, quali.sup=NULL, graph=TRUE, axes = c(1,2))</code></pre>
<br/>
<div class="block">
<ul>
<li><strong>X</strong> : a data frame. Rows are individuals and columns are numeric variables.</li>
<li><strong>scale.unit</strong> : a logical value. If <em>TRUE</em>, the data are scaled to unit variance before the analysis.</li>
<li><strong>ncp</strong> : number of dimensions kept in the final results.</li>
<li><strong>ind.sup</strong> : a numeric vector specifying the indexes of the supplementary individuals.</li>
<li><strong>quanti.sup</strong>, <strong>quali.sup</strong> : numeric vectors specifying, respectively, the indexes of the supplementary quantitative and qualitative variables.</li>
<li><strong>graph</strong> : a logical value. If TRUE a graph is displayed.</li>
<li><strong>axes</strong> : a vector of length 2 specifying the components to be plotted</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.pca <- PCA(decathlon2, ind.sup=24:27, 
               quanti.sup = 11:12, quali.sup = 13, graph=FALSE)</code></pre>
<div id="visualize-supplementary-quantitative-variables" class="section level2">
<h2>Visualize supplementary quantitative variables</h2>
<p>All the results (coordinates, correlation and cos2) for the supplementary quantitative variables can be extracted as follows :</p>
<pre class="r"><code>res.pca$quanti.sup</code></pre>
<pre><code>$coord
            Dim.1       Dim.2      Dim.3       Dim.4       Dim.5
Rank   -0.7014777 -0.24519443 -0.1834294  0.05575186 -0.07382647
Points  0.9637075  0.07768262  0.1580225 -0.16623092 -0.03114711

$cor
            Dim.1       Dim.2      Dim.3       Dim.4       Dim.5
Rank   -0.7014777 -0.24519443 -0.1834294  0.05575186 -0.07382647
Points  0.9637075  0.07768262  0.1580225 -0.16623092 -0.03114711

$cos2
           Dim.1       Dim.2      Dim.3      Dim.4        Dim.5
Rank   0.4920710 0.060120310 0.03364635 0.00310827 0.0054503477
Points 0.9287322 0.006034589 0.02497110 0.02763272 0.0009701427</code></pre>
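<p>In the output above, <em>$coord</em> and <em>$cor</em> are identical: in a scaled PCA, the coordinate of a supplementary quantitative variable is its correlation with the component, computed on the active individuals. A sketch of that computation, where <em>pca.sup</em> and <em>manual</em> are illustrative names:</p>
<pre class="r"><code>library(FactoMineR)
library(factoextra)   # provides the decathlon2 data set
data(decathlon2)

pca.sup <- PCA(decathlon2, ind.sup = 24:27,
               quanti.sup = 11:12, quali.sup = 13, graph = FALSE)

# Correlation between the supplementary variables (active rows only)
# and the coordinates of the active individuals on each component
manual <- cor(decathlon2[1:23, 11:12], pca.sup$ind$coord)

all.equal(unname(manual), unname(pca.sup$quanti.sup$coord))</code></pre>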
<p><strong>Variables factor map using FactoMineR base graph</strong> :</p>
<pre class="r"><code>plot(res.pca, choix = "var")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-supplementary-quantitative-variables-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">Supplementary quantitative variables are shown in blue with dashed lines.</span></p>
<p><strong>It’s also possible to make the variables factor map using factoextra</strong> :</p>
<pre class="r"><code>fviz_pca_var(res.pca)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-quantitative-supplementary-variable-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
</div>
<div id="visualize-supplementary-individuals" class="section level2">
<h2>Visualize supplementary individuals</h2>
<p>The data set <em>decathlon2</em> contains <strong>supplementary individuals</strong> in rows 24 to 27.</p>
<pre class="r"><code># Data for the supplementary individuals
ind.sup <- decathlon2[24:27, 1:10]
ind.sup[, 1:6]</code></pre>
<pre><code>         X100m Long.jump Shot.put High.jump X400m X110m.hurdle
KARPOV   11.02      7.30    14.77      2.04 48.37        14.09
WARNERS  11.11      7.60    14.31      1.98 48.68        14.23
Nool     10.80      7.53    14.26      1.88 48.81        14.80
Drews    10.87      7.38    13.07      1.88 48.51        14.01</code></pre>
<p><strong>Individuals factor map using FactoMineR base graph</strong> :</p>
<pre class="r"><code>plot(res.pca, choix="ind")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-supplementary-individuals-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p><span class="success">Supplementary individuals are shown in blue. The levels of the supplementary qualitative variable are shown in magenta.</span></p>
<p>The results for supplementary individuals can be extracted as follows :</p>
<pre class="r"><code>res.pca$ind.sup</code></pre>
<pre><code>$coord
              Dim.1       Dim.2      Dim.3      Dim.4       Dim.5
KARPOV    0.7947206  0.77951227 -1.6330203  1.7242283 -0.75070396
WARNERS  -0.3864645 -0.12159237 -1.7387332 -0.7063341 -0.03230011
Nool     -0.5591306  1.97748871 -0.4830358 -2.2784526 -0.25461493
Drews    -1.1092038  0.01741477 -3.0488182 -1.5343468 -0.32642192

$cos2
              Dim.1        Dim.2      Dim.3      Dim.4        Dim.5
KARPOV   0.05104677 4.911173e-02 0.21553730 0.24028620 0.0455487744
WARNERS  0.02422707 2.398250e-03 0.49039677 0.08092862 0.0001692349
Nool     0.02897149 3.623868e-01 0.02162236 0.48108780 0.0060077529
Drews    0.09207094 2.269527e-05 0.69560547 0.17617609 0.0079736753

$dist
 KARPOV  WARNERS     Nool    Drews  
3.517470 2.482899 3.284943 3.655527 </code></pre>
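<p>To illustrate how these coordinates are predicted, the sketch below projects the supplementary individuals by hand: the new rows are standardized with the means and standard deviations of the active individuals, then multiplied by the eigenvectors of the active analysis. The names <em>pca.sup</em>, <em>ctr</em>, <em>sdev</em>, <em>z</em> and <em>manual</em> are illustrative.</p>
<pre class="r"><code>library(FactoMineR)
library(factoextra)   # provides the decathlon2 data set
data(decathlon2)

pca.sup <- PCA(decathlon2, ind.sup = 24:27,
               quanti.sup = 11:12, quali.sup = 13, graph = FALSE)

active <- as.matrix(decathlon2[1:23, 1:10])
sup    <- as.matrix(decathlon2[24:27, 1:10])

# Means and population standard deviations (divided by n) of the ACTIVE rows
ctr  <- colMeans(active)
sdev <- sqrt(colMeans(sweep(active, 2, ctr)^2))

# Standardize the supplementary rows with the active statistics
z <- sweep(sweep(sup, 2, ctr), 2, sdev, "/")

# Project onto the eigenvectors of the active analysis
manual <- z %*% pca.sup$svd$V
all.equal(unname(manual), unname(pca.sup$ind.sup$coord))</code></pre>
<p>Because the supplementary rows never enter the means, the standard deviations or the eigenvectors, they cannot influence the components.</p>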
</div>
<div id="supplementary-qualitative-variables" class="section level2">
<h2>Supplementary qualitative variables</h2>
<p>The data set <em>decathlon2</em> contains a <strong>supplementary qualitative variable</strong> in column 13, corresponding to the type of competition.</p>
<p>Qualitative variables can be helpful for interpreting the data and for coloring individuals by groups.</p>
<p>The argument <em>habillage</em> is used to specify the index of the supplementary qualitative variable :</p>
<pre class="r"><code>plot(res.pca, choix = "ind", habillage = 13)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-supplementary-qualitative-variable-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="432" style="margin-bottom:10px;" /></p>
<p>It’s also possible to use factoextra :</p>
<pre class="r"><code>fviz_pca_ind(res.pca, habillage = 13,
  addEllipses =TRUE, ellipse.level = 0.68) +
  scale_color_brewer(palette="Dark2") +
  theme_minimal()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/factor-analysis/factominer-principal-component-analysis-supplementary-qualitative-variable-factoextra-data-mining-1.png" title="FactoMineR and factoextra :  Principal component analysis - R software and data mining" alt="FactoMineR and factoextra :  Principal component analysis - R software and data mining" width="480" style="margin-bottom:10px;" /></p>
<p><span class="success">Supplementary individuals are shown in blue color</span></p>
<p>The results concerning the supplementary qualitative variable are :</p>
<pre class="r"><code>res.pca$quali.sup</code></pre>
<pre><code>$coord
             Dim.1      Dim.2       Dim.3      Dim.4      Dim.5
Decastar -1.343451  0.1218097 -0.03789524  0.1808357  0.1343364
OlympicG  1.231497 -0.1116589  0.03473730 -0.1657661 -0.1231417

$cos2
             Dim.1       Dim.2        Dim.3      Dim.4       Dim.5
Decastar 0.9051233 0.007440939 0.0007201669 0.01639956 0.009050062
OlympicG 0.9051233 0.007440939 0.0007201669 0.01639956 0.009050062

$v.test
             Dim.1      Dim.2      Dim.3      Dim.4      Dim.5
Decastar -2.970766  0.4034256 -0.1528767  0.8971036  0.7202457
OlympicG  2.970766 -0.4034256  0.1528767 -0.8971036 -0.7202457

$dist
Decastar OlympicG 
1.412108 1.294433 

$eta2
                Dim 1      Dim 2       Dim 3      Dim 4      Dim 5
Competition 0.4011568 0.00739783 0.001062332 0.03658159 0.02357972</code></pre>
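<p>The coordinates of a category can be read as the barycenter (mean point) of the active individuals that belong to it. A sketch of that computation follows; <em>pca.sup</em>, <em>grp</em> and <em>bary</em> are illustrative names.</p>
<pre class="r"><code>library(FactoMineR)
library(factoextra)   # provides the decathlon2 data set
data(decathlon2)

pca.sup <- PCA(decathlon2, ind.sup = 24:27,
               quanti.sup = 11:12, quali.sup = 13, graph = FALSE)

# Competition of each active individual
grp <- droplevels(decathlon2$Competition[1:23])

# Mean coordinate of the active individuals, per category and per component
bary <- apply(pca.sup$ind$coord, 2, function(v) tapply(v, grp, mean))

all.equal(unname(bary), unname(pca.sup$quali.sup$coord))</code></pre>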
</div>
</div>
<div id="dimension-description" class="section level1">
<h1>Dimension description</h1>
<p>The function <strong>dimdesc()</strong> can be used to identify the variables most correlated with a given principal component.</p>
<p>A simplified format is :</p>
<pre class="r"><code>dimdesc(res, axes = 1:3, proba = 0.05)</code></pre>
<br/>
<div>
<ul>
<li><strong>res</strong> : an object of class PCA</li>
<li><strong>axes</strong> : a numeric vector specifying the dimensions to be described</li>
<li><strong>proba</strong> : the significance level</li>
</ul>
</div>
<p><br/></p>
<p>Example of usage :</p>
<pre class="r"><code>res.desc <- dimdesc(res.pca, axes = c(1,2))
# Description of dimension 1
res.desc$Dim.1</code></pre>
<pre><code>$quanti
             correlation      p.value
Points         0.9637075 1.605675e-13
Long.jump      0.7941806 6.059893e-06
Discus         0.7432090 4.842563e-05
Shot.put       0.7339127 6.723102e-05
High.jump      0.6100840 1.993677e-03
Javeline       0.4282266 4.149192e-02
Rank          -0.7014777 1.917657e-04
X400m         -0.7016034 1.910387e-04
X110m.hurdle  -0.7641252 2.195812e-05
X100m         -0.8506257 2.727129e-07

$quali
                   R2     p.value
Competition 0.4011568 0.001177378

$category
          Estimate     p.value
OlympicG  1.287474 0.001177378
Decastar -1.287474 0.001177378</code></pre>
<pre class="r"><code># Description of dimension 2
res.desc$Dim.2</code></pre>
<pre><code>$quanti
           correlation      p.value
Pole.vault   0.8074511 3.205016e-06
X1500m       0.7844802 9.384747e-06
High.jump   -0.4652142 2.529390e-02</code></pre>
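<p>A single row of the <em>$quanti</em> table can be cross-checked with a plain correlation test between a variable and the coordinates of the active individuals on a component. This is a sketch, not the internal code of <strong>dimdesc()</strong>; <em>pca.sup</em>, <em>ct</em> and <em>desc</em> are illustrative names.</p>
<pre class="r"><code>library(FactoMineR)
library(factoextra)   # provides the decathlon2 data set
data(decathlon2)

pca.sup <- PCA(decathlon2, ind.sup = 24:27,
               quanti.sup = 11:12, quali.sup = 13, graph = FALSE)

# Pearson correlation test between X100m and the first component (active rows)
ct <- cor.test(decathlon2[1:23, "X100m"], pca.sup$ind$coord[, "Dim.1"])
ct$estimate
ct$p.value

# Corresponding row in the dimension description
desc <- dimdesc(pca.sup, axes = 1:2)
desc$Dim.1$quanti["X100m", ]</code></pre>
<p>The correlation from the test should agree with the <em>X100m</em> row reported for dimension 1.</p>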
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.1.2) and <strong>ggplot2</strong> (ver. ) </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('h1').addClass('wiki_paragraph1');
    jQuery('h2').addClass('wiki_paragraph2');
    jQuery('h3').addClass('wiki_paragraph3');
    jQuery('h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Wed, 11 Mar 2015 21:35:38 +0100</pubDate>
			
		</item>
		
	</channel>
</rss>
