<?xml version="1.0" encoding="UTF-8" ?>
<!-- RSS generated by PHPBoost on Fri, 17 Apr 2026 00:59:18 +0200 -->

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title><![CDATA[Easy Guides]]></title>
		<atom:link href="https://www.sthda.com/english/syndication/rss/wiki/42" rel="self" type="application/rss+xml"/>
		<link>https://www.sthda.com</link>
		<description><![CDATA[Last articles of the category: R packages]]></description>
		<copyright>(C) 2005-2026 PHPBoost</copyright>
		<language>en</language>
		<generator>PHPBoost</generator>
		
		
		<item>
			<title><![CDATA[Bar Plots and Modern Alternatives]]></title>
			<link>https://www.sthda.com/english/wiki/bar-plots-and-modern-alternatives</link>
			<guid>https://www.sthda.com/english/wiki/bar-plots-and-modern-alternatives</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">



<p><br/></p>
<p>This article describes how to easily create basic and ordered <strong>bar plots</strong> using ggplot2-based helper functions available in the <a href="https://www.sthda.com/english/rpkgs/ggpubr/index.html">ggpubr R package</a>. We’ll also present some modern alternatives to bar plots, including <strong>lollipop charts</strong> and <strong>Cleveland’s dot plots</strong>.</p>
<div class="block">
<p>
Note that the approach to building a bar plot with standard ggplot2 verbs is described in our previous article: <a href="https://www.sthda.com/english/wiki/ggplot2-barplots-quick-start-guide-r-software-and-data-visualization">ggplot2 barplots : Quick start guide</a>.
</p>
<p>
You might also be interested in the following articles:
</p>
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/add-p-values-and-significance-levels-to-ggplots">Add P-values and Significance Levels to ggplots</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/facilitating-exploratory-data-visualization-application-to-tcga-genomic-data">Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data</a>
</li>
</ul>
</div>
<p><br/></p>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-logo-1.png" alt="Bar plots and modern alternatives" width="672" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
<p><strong>Contents</strong>:</p>

<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a></li>
<li><a href="#basic-bar-plots">Basic bar plots</a></li>
<li><a href="#multiple-grouping-variables">Multiple grouping variables</a></li>
<li><a href="#ordered-bar-plots">Ordered bar plots</a></li>
<li><a href="#deviation-graphs">Deviation graphs</a></li>
<li><a href="#alternatives-to-bar-plots">Alternatives to bar plots</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="required-r-package" class="section level3">
<h3>Required R package</h3>
<p>You need to install the R package <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr (version >= 0.1.3)</a> to easily create ggplot2-based, publication-ready plots.</p>
<p>Install from CRAN:</p>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<p>Or, install the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</p>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<p>Load ggpubr:</p>
<pre class="r"><code>library(ggpubr)</code></pre>
</div>
</div>
<div id="basic-bar-plots" class="section level2">
<h2>Basic bar plots</h2>
<p>Create a demo data set:</p>
<pre class="r"><code>df <- data.frame(dose=c("D0.5", "D1", "D2"),
                 len=c(4.2, 10, 29.5))
print(df)</code></pre>
<pre><code>  dose  len
1 D0.5  4.2
2   D1 10.0
3   D2 29.5</code></pre>
<p>Basic bar plots:</p>
<pre class="r"><code># Basic bar plot
p <- ggbarplot(df, x = "dose", y = "len",
          color = "black", fill = "lightgray")
p

# Rotate to create horizontal bar plots
p + rotate()</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-basics-1.png" alt="Bar plots and modern alternatives" width="336" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-basics-2.png" alt="Bar plots and modern alternatives" width="336" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
<p>Change fill and outline colors by groups:</p>
<pre class="r"><code>ggbarplot(df, x = "dose", y = "len",
   fill = "dose", color = "dose", palette = "jco")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-color-1.png" alt="Bar plots and modern alternatives" width="336" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
</div>
<div id="multiple-grouping-variables" class="section level2">
<h2>Multiple grouping variables</h2>
<p>Create a demo data set:</p>
<pre class="r"><code>df2 <- data.frame(supp=rep(c("VC", "OJ"), each=3),
                  dose=rep(c("D0.5", "D1", "D2"),2),
                  len=c(6.8, 15, 33, 4.2, 10, 29.5))
print(df2)</code></pre>
<pre><code>  supp dose  len
1   VC D0.5  6.8
2   VC   D1 15.0
3   VC   D2 33.0
4   OJ D0.5  4.2
5   OJ   D1 10.0
6   OJ   D2 29.5</code></pre>
<p>Plot y = “len” by x = “dose” and change color by a second group: “supp”</p>
<pre class="r"><code># Stacked bar plots, add labels inside bars
ggbarplot(df2, x = "dose", y = "len",
  fill = "supp", color = "supp", 
  palette = c("gray", "black"),
  label = TRUE, lab.col = "white", lab.pos = "in")

# Change position: Interleaved (dodged) bar plot
ggbarplot(df2, x = "dose", y = "len",
          fill = "supp", color = "supp", 
          palette = c("gray", "black"),
          position = position_dodge(0.9))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-stacked-bar-plots-1.png" alt="Bar plots and modern alternatives" width="336" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-stacked-bar-plots-2.png" alt="Bar plots and modern alternatives" width="336" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
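<p>As a rough sketch of what the two plots encode (base R only, not how ggpubr computes anything internally): the total height of each stacked bar is the sum of <em>len</em> over the two <em>supp</em> groups for that dose, whereas the dodged plot shows the individual cells side by side. The object names <em>len_by_group</em> and <em>stack_height</em> are illustrative.</p>
<pre class="r"><code># Same demo data as above
df2 <- data.frame(supp = rep(c("VC", "OJ"), each = 3),
                  dose = rep(c("D0.5", "D1", "D2"), 2),
                  len = c(6.8, 15, 33, 4.2, 10, 29.5))

# Cross-tabulate len by supp (rows) and dose (columns)
len_by_group <- xtabs(len ~ supp + dose, data = df2)
print(len_by_group)

# Height of each stacked bar = column sum over the supp groups
stack_height <- colSums(len_by_group)
print(stack_height)</code></pre>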
</div>
<div id="ordered-bar-plots" class="section level2">
<h2>Ordered bar plots</h2>
<p>Load and prepare data:</p>
<pre class="r"><code># Load data
data("mtcars")
dfm <- mtcars
# Convert the cyl variable to a factor
dfm$cyl <- as.factor(dfm$cyl)
# Add the name column
dfm$name <- rownames(dfm)
# Inspect the data
head(dfm[, c("name", "wt", "mpg", "cyl")])</code></pre>
<pre><code>                               name    wt  mpg cyl
Mazda RX4                 Mazda RX4 2.620 21.0   6
Mazda RX4 Wag         Mazda RX4 Wag 2.875 21.0   6
Datsun 710               Datsun 710 2.320 22.8   4
Hornet 4 Drive       Hornet 4 Drive 3.215 21.4   6
Hornet Sportabout Hornet Sportabout 3.440 18.7   8
Valiant                     Valiant 3.460 18.1   6</code></pre>
<p>Create an ordered bar plot and set the fill color by the grouping variable “cyl”. Bars are sorted globally, not within each group.</p>
<pre class="r"><code>ggbarplot(dfm, x = "name", y = "mpg",
          fill = "cyl",               # change fill color by cyl
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. see ?ggpar
          sort.val = "desc",          # Sort the values in descending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 90           # Rotate vertically x axis texts
          )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-ordered-bar-plots-1.png" alt="Bar plots and modern alternatives" width="576" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
<p>Sort bars inside each group. Use the argument <strong>sort.by.groups = TRUE</strong>.</p>
<pre class="r"><code>ggbarplot(dfm, x = "name", y = "mpg",
          fill = "cyl",               # change fill color by cyl
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. see ?ggpar
          sort.val = "asc",           # Sort the values in ascending order
          sort.by.groups = TRUE,      # Sort inside each group
          x.text.angle = 90           # Rotate vertically x axis texts
          )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-ordered-bar-plots-by-groups-1.png" alt="Bar plots and modern alternatives" width="576" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
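<p>To make the difference between the two sorting modes concrete, here is a minimal base R sketch of the orderings involved. It only illustrates the idea with <strong>order</strong>(); it is not ggpubr’s internal implementation, and the object names <em>global_desc</em> and <em>grouped_asc</em> are illustrative.</p>
<pre class="r"><code># Prepare the data as above
data("mtcars")
dfm <- mtcars
dfm$cyl <- as.factor(dfm$cyl)
dfm$name <- rownames(dfm)

# Global sort: descending mpg, ignoring cyl
global_desc <- dfm[order(dfm$mpg, decreasing = TRUE), c("name", "mpg", "cyl")]
print(head(global_desc, 3))

# Grouped sort: by cyl first, then ascending mpg within each cyl level
grouped_asc <- dfm[order(dfm$cyl, dfm$mpg), c("name", "mpg", "cyl")]
print(head(grouped_asc, 3))</code></pre>
<p>In the grouped ordering, all 4-cylinder cars come first, then the 6- and 8-cylinder cars, which corresponds to the blocks of color in the second plot.</p>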
</div>
<div id="deviation-graphs" class="section level2">
<h2>Deviation graphs</h2>
<p>The deviation graph shows the deviation of quantitative values to a reference value. In the R code below, we’ll plot the mpg z-score from the mtcars data set.</p>
<p>Calculate the z-score of the mpg data:</p>
<pre class="r"><code># Calculate the z-score of the mpg data
dfm$mpg_z <- (dfm$mpg -mean(dfm$mpg))/sd(dfm$mpg)
dfm$mpg_grp <- factor(ifelse(dfm$mpg_z < 0, "low", "high"), 
                     levels = c("low", "high"))
# Inspect the data
head(dfm[, c("name", "wt", "mpg", "mpg_z", "mpg_grp", "cyl")])</code></pre>
<pre><code>                               name    wt  mpg      mpg_z mpg_grp cyl
Mazda RX4                 Mazda RX4 2.620 21.0  0.1508848    high   6
Mazda RX4 Wag         Mazda RX4 Wag 2.875 21.0  0.1508848    high   6
Datsun 710               Datsun 710 2.320 22.8  0.4495434    high   4
Hornet 4 Drive       Hornet 4 Drive 3.215 21.4  0.2172534    high   6
Hornet Sportabout Hornet Sportabout 3.440 18.7 -0.2307345     low   8
Valiant                     Valiant 3.460 18.1 -0.3302874     low   6</code></pre>
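<p>As a quick sanity check, z-scores should have mean 0 and standard deviation 1. The sketch below verifies this in base R and compares the manual formula with <strong>scale</strong>(); the names <em>mpg_z</em> and <em>mpg_z_scale</em> are illustrative.</p>
<pre class="r"><code>data("mtcars")

# Manual z-score: center on the mean, divide by the standard deviation
mpg_z <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)

# scale() performs the same centering and scaling (it returns a matrix)
mpg_z_scale <- as.numeric(scale(mtcars$mpg))
print(all.equal(mpg_z, mpg_z_scale))

# Mean is (numerically) 0 and standard deviation is 1
print(c(mean = round(mean(mpg_z), 10), sd = sd(mpg_z)))

# Cars below the mean fall in "low", the others in "high"
mpg_grp <- factor(ifelse(mpg_z < 0, "low", "high"), levels = c("low", "high"))
print(table(mpg_grp))</code></pre>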
<p>Create an ordered bar plot, colored according to the level of mpg:</p>
<pre class="r"><code>ggbarplot(dfm, x = "name", y = "mpg_z",
          fill = "mpg_grp",           # change fill color by mpg_grp
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. see ?ggpar
          sort.val = "asc",           # Sort the values in ascending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 90,          # Rotate vertically x axis texts
          ylab = "MPG z-score",
          xlab = FALSE,
          legend.title = "MPG Group"
          )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-deviation-graphs-1.png" alt="Bar plots and modern alternatives" width="576" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
<p>To create a horizontal version of the plot, use rotate = TRUE and sort.val = “desc”:</p>
<pre class="r"><code>ggbarplot(dfm, x = "name", y = "mpg_z",
          fill = "mpg_grp",           # change fill color by mpg_grp
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. see ?ggpar
          sort.val = "desc",          # Sort the values in descending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 90,          # Rotate vertically x axis texts
          ylab = "MPG z-score",
          legend.title = "MPG Group",
          rotate = TRUE,
          ggtheme = theme_minimal()
          )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-deviation-graphs-horizontal-1.png" alt="Bar plots and modern alternatives" width="624" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
</div>
<div id="alternatives-to-bar-plots" class="section level2">
<h2>Alternatives to bar plots</h2>
<div id="lollipop-chart" class="section level3">
<h3>Lollipop chart</h3>
<p>A lollipop chart is an alternative to a bar plot when you have a large set of values to visualize.</p>
<p>Lollipop chart colored by the grouping variable “cyl”:</p>
<pre class="r"><code>ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "ascending",                        # Sort value in ascending order
           add = "segments",                             # Add segments from y = 0 to dots
           ggtheme = theme_pubr()                        # ggplot2 theme
           )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-lollipop-chart-1.png" alt="Bar plots and modern alternatives" width="720" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
<ul>
<li>Sort in descending order. <strong>sorting = “descending”</strong>.</li>
<li>Rotate the plot vertically, using <strong>rotate = TRUE</strong>.</li>
<li>Sort the mpg value inside each group by using <strong>group = “cyl”</strong>.</li>
<li>Set <strong>dot.size</strong> to 6.</li>
<li>Add mpg values as label. <strong>label = “mpg”</strong> or <strong>label = round(dfm$mpg)</strong>.</li>
</ul>
<pre class="r"><code>ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "descending",                       # Sort value in descending order
           add = "segments",                             # Add segments from y = 0 to dots
           rotate = TRUE,                                # Rotate vertically
           group = "cyl",                                # Order by groups
           dot.size = 6,                                 # Large dot size
           label = round(dfm$mpg),                        # Add mpg values as dot labels
           font.label = list(color = "white", size = 9, 
                             vjust = 0.5),               # Adjust label parameters
           ggtheme = theme_pubr()                        # ggplot2 theme
           )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-lollipop-chart-rotate-1.png" alt="Bar plots and modern alternatives" width="480" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
<p>Deviation graph:</p>
<ul>
<li>Use y = “mpg_z”</li>
<li>Change segment color and size: add.params = list(color = “lightgray”, size = 2)</li>
</ul>
<pre class="r"><code>ggdotchart(dfm, x = "name", y = "mpg_z",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "descending",                       # Sort value in descending order
           add = "segments",                             # Add segments from y = 0 to dots
           add.params = list(color = "lightgray", size = 2), # Change segment color and size
           group = "cyl",                                # Order by groups
           dot.size = 6,                                 # Large dot size
           label = round(dfm$mpg_z, 1),                  # Add mpg z-scores as dot labels
           font.label = list(color = "white", size = 9, 
                             vjust = 0.5),               # Adjust label parameters
           ggtheme = theme_pubr()                        # ggplot2 theme
           )+
  geom_hline(yintercept = 0, linetype = 2, color = "lightgray")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-lollipop-chart-deviation-1.png" alt="Bar plots and modern alternatives" width="720" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
</div>
<div id="clevelands-dot-plot" class="section level3">
<h3>Cleveland’s dot plot</h3>
<p>Color y text by groups. Use y.text.col = TRUE.</p>
<pre class="r"><code>ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "descending",                       # Sort value in descending order
           rotate = TRUE,                                # Rotate vertically
           dot.size = 2,                                 # Small dot size
           y.text.col = TRUE,                            # Color y text by groups
           ggtheme = theme_pubr()                        # ggplot2 theme
           )+
  theme_cleveland()                                      # Add dashed grids</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/bar-plots-and-alternatives-cleveland-dot-plots-1.png" alt="Bar plots and modern alternatives" width="480" style="margin-bottom:10px;" />
<p class="caption">
Bar plots and modern alternatives
</p>
</div>
</div>
</div>
<div id="infos" class="section level2">
<h2>Infos</h2>
<p>This analysis has been performed using <strong>R software</strong> (ver. 3.3.2) and <strong>ggpubr</strong> (ver. 0.1.4).</p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->


<!-- END HTML -->]]></description>
			<pubDate>Wed, 28 Jun 2017 15:00:41 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data]]></title>
			<link>https://www.sthda.com/english/wiki/facilitating-exploratory-data-visualization-application-to-tcga-genomic-data</link>
			<guid>https://www.sthda.com/english/wiki/facilitating-exploratory-data-visualization-application-to-tcga-genomic-data</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">

<p><br/></p>
<p>In genomics, it’s very common to explore the <strong>gene expression</strong> profile of one gene or a list of genes involved in a pathway of interest. Here, we present some helper functions in the <a href="https://www.sthda.com/english/rpkgs/ggpubr/">ggpubr R package</a> to facilitate <strong>exploratory data analysis</strong> (<strong>EDA</strong>) for life scientists.</p>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-logo-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="720" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>Standard <a href="https://www.sthda.com/english/wiki/data-visualization">graphical techniques</a> used in EDA include:</p>
<ul>
<li>Box plot</li>
<li>Violin plot</li>
<li>Stripchart</li>
<li>Dot plot</li>
<li>Histogram and density plots</li>
<li>ECDF plot</li>
<li>Q-Q plot</li>
</ul>
<p>All these plots can be created using the <a href="http://ggplot2.tidyverse.org/reference/"><strong>ggplot2</strong> R package</a>, which is highly flexible.</p>
<p>However, the syntax needed to customize a ggplot can appear opaque to beginners, which makes it harder to use for researchers without advanced R programming skills. If you’re not familiar with the ggplot2 system, you can start by reading our <a href="https://www.sthda.com/english/wiki/ggplot2-essentials">Guide to Create Beautiful Graphics in R</a>.</p>
<div class="block">
<p>
Previously, we described how to <a href="https://www.sthda.com/english/wiki/add-p-values-and-significance-levels-to-ggplots">Add P-values and Significance Levels to ggplots</a>. In this article, we present the <a href="https://www.sthda.com/english/rpkgs/ggpubr/">ggpubr package</a>, a wrapper around ggplot2, which provides easy-to-use functions for creating ggplot2-based, publication-ready plots. We’ll use the ggpubr functions to visualize <strong>gene expression</strong> profiles from <strong>TCGA</strong> genomic data sets.
</p>
</div>
<p><strong>Contents:</strong></p>
<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a></li>
<li><a href="#gene-expression-data">Gene expression data</a></li>
<li><a href="#box-plots">Box plots</a></li>
<li><a href="#violin-plots">Violin plots</a></li>
<li><a href="#stripcharts-and-dot-plots">Stripcharts and dot plots</a></li>
<li><a href="#density-plots">Density plots</a></li>
<li><a href="#histogram-plots">Histogram plots</a></li>
<li><a href="#empirical-cumulative-density-function">Empirical cumulative density function</a></li>
<li><a href="#quantile---quantile-plot">Quantile - Quantile plot</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="ggpubr-package" class="section level3">
<h3>ggpubr package</h3>
<p>Required R package: <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr (version >= 0.1.3)</a>.</p>
<ul>
<li>Install from <a href="https://cran.r-project.org/package=ggpubr">CRAN</a> as follows:</li>
</ul>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<ul>
<li>Or, install the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</li>
</ul>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<ul>
<li>Load ggpubr:</li>
</ul>
<pre class="r"><code>library(ggpubr)</code></pre>
</div>
<div id="tcga-data" class="section level3">
<h3>TCGA data</h3>
<p><a href="https://cancergenome.nih.gov/">The Cancer Genome Atlas (TCGA)</a> is a publicly available resource containing clinical and genomic data across 33 cancer types. These data include gene expression, CNV profiling, SNP genotyping, DNA methylation, miRNA profiling, exome sequencing, and other types of data.</p>
<p>The <a href="https://github.com/RTCGA/RTCGA">RTCGA</a> R package, by Marcin Kosinski et al., provides a convenient way to access the clinical and genomic data available in TCGA. Each data set is distributed as a separate package, which must be installed (once) individually.</p>
<p>The following R code installs the core RTCGA package as well as the clinical and mRNA gene expression data packages.</p>
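<pre class="r"><code># Install the Bioconductor package manager, if needed.
# (The biocLite.R script used with older R versions has been retired.)
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

# Install the main RTCGA package
BiocManager::install("RTCGA")

# Install the clinical and mRNA gene expression data packages
BiocManager::install("RTCGA.clinical")
BiocManager::install("RTCGA.mRNA")</code></pre>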
<p>To see the type of data available for each cancer type, use this:</p>
<pre class="r"><code>library(RTCGA)
infoTCGA()</code></pre>
<pre><code># A tibble: 38 x 13
     Cohort    BCR Clinical     CN   LowP Methylation   mRNA mRNASeq    miR miRSeq   RPPA    MAF rawMAF
 *   <fctr> <fctr>   <fctr> <fctr> <fctr>      <fctr> <fctr>  <fctr> <fctr> <fctr> <fctr> <fctr> <fctr>
 1      ACC     92       92     90      0          80      0      79      0     80     46     90      0
 2     BLCA    412      412    410    112         412      0     408      0    409    344    130    395
 3     BRCA   1098     1097   1089     19        1097    526    1093      0   1078    887    977      0
 4     CESC    307      307    295     50         307      0     304      0    307    173    194      0
 5     CHOL     51       45     36      0          36      0      36      0     36     30     35      0
 6     COAD    460      458    451     69         457    153     457      0    406    360    154    367
 7 COADREAD    631      629    616    104         622    222     623      0    549    491    223    489
 8     DLBC     58       48     48      0          48      0      48      0     47     33     48      0
 9     ESCA    185      185    184     51         185      0     184      0    184    126    185      0
10     FPPP     38       38      0      0           0      0       0      0     23      0      0      0
# ... with 28 more rows</code></pre>
<div class="success">
<p>
More information about the disease names can be found at: <a href="http://gdac.broadinstitute.org/" class="uri">http://gdac.broadinstitute.org/</a>
</p>
</div>
</div>
</div>
<div id="gene-expression-data" class="section level2">
<h2>Gene expression data</h2>
<p>The R function <strong>expressionsTCGA</strong>() [in RTCGA package] can be used to easily extract the expression values of genes of interest in one or multiple cancer types.</p>
<p>In the following R code, we start by extracting the mRNA expression for five genes of interest - GATA3, PTEN, XBP1, ESR1 and MUC1 - from 3 different data sets:</p>
<ul>
<li>Breast invasive carcinoma (BRCA),</li>
<li>Ovarian serous cystadenocarcinoma (OV) and</li>
<li>Lung squamous cell carcinoma (LUSC)</li>
</ul>
<pre class="r"><code>library(RTCGA)
library(RTCGA.mRNA)
expr <- expressionsTCGA(BRCA.mRNA, OV.mRNA, LUSC.mRNA,
                        extract.cols = c("GATA3", "PTEN", "XBP1","ESR1", "MUC1"))
expr</code></pre>
<pre><code># A tibble: 1,305 x 7
            bcr_patient_barcode   dataset     GATA3       PTEN      XBP1       ESR1      MUC1
                          <chr>     <chr>     <dbl>      <dbl>     <dbl>      <dbl>     <dbl>
 1 TCGA-A1-A0SD-01A-11R-A115-07 BRCA.mRNA  2.870500  1.3613571  2.983333  3.0842500  1.652125
 2 TCGA-A1-A0SE-01A-11R-A084-07 BRCA.mRNA  2.166250  0.4283571  2.550833  2.3860000  3.080250
 3 TCGA-A1-A0SH-01A-11R-A084-07 BRCA.mRNA  1.323500  1.3056429  3.020417  0.7912500  2.985250
 4 TCGA-A1-A0SJ-01A-11R-A084-07 BRCA.mRNA  1.841625  0.8096429  3.131333  2.4954167 -1.918500
 5 TCGA-A1-A0SK-01A-12R-A084-07 BRCA.mRNA -6.025250  0.2508571 -1.451750 -4.8606667 -1.171500
 6 TCGA-A1-A0SM-01A-11R-A084-07 BRCA.mRNA  1.804500  1.3107857  4.041083  2.7970000  3.529750
 7 TCGA-A1-A0SO-01A-22R-A084-07 BRCA.mRNA -4.879250 -0.2369286 -0.724750 -4.4860833 -1.455750
 8 TCGA-A1-A0SP-01A-11R-A084-07 BRCA.mRNA -3.143250 -1.2432143 -1.193083 -1.6274167 -0.986750
 9 TCGA-A2-A04N-01A-11R-A115-07 BRCA.mRNA  2.034000  1.2074286  2.278833  4.1155833  0.668000
10 TCGA-A2-A04P-01A-31R-A034-07 BRCA.mRNA -0.293125  0.2883571 -1.605083  0.4731667  0.011500
# ... with 1,295 more rows</code></pre>
<p>To display the number of samples in each data set, type this:</p>
<pre class="r"><code>nb_samples <- table(expr$dataset)
nb_samples</code></pre>
<pre><code>
BRCA.mRNA LUSC.mRNA   OV.mRNA 
      590       154       561 </code></pre>
<p>We can simplify data set names by removing the “mRNA” tag. This can be done using the R base function <strong>gsub</strong>().</p>
<pre class="r"><code>expr$dataset <- gsub(pattern = ".mRNA", replacement = "",  expr$dataset)</code></pre>
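<p>One detail worth knowing: by default, <strong>gsub</strong>() treats the pattern as a regular expression, in which “.” matches any character. For the data set names used here the result is the same either way, but <em>fixed = TRUE</em> matches the text literally. A small sketch (the label "XmRNA.mRNA" is a made-up example, not a real data set name):</p>
<pre class="r"><code>datasets <- c("BRCA.mRNA", "OV.mRNA", "LUSC.mRNA")

# Literal matching: remove exactly the text ".mRNA"
clean <- gsub(".mRNA", "", datasets, fixed = TRUE)
print(clean)

# Regex matching also removes "XmRNA", because "." matches the "X"
regex_result <- gsub(".mRNA", "", "XmRNA.mRNA")
fixed_result <- gsub(".mRNA", "", "XmRNA.mRNA", fixed = TRUE)
print(c(regex = regex_result, fixed = fixed_result))</code></pre>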
<p>Let’s also simplify the patient barcode column. The following R code changes the barcodes to BRCA1, BRCA2, …, OV1, OV2, …, etc.</p>
<pre class="r"><code>expr$bcr_patient_barcode <- paste0(expr$dataset, c(1:590, 1:561, 1:154))
expr</code></pre>
<pre><code># A tibble: 1,305 x 7
   bcr_patient_barcode dataset     GATA3       PTEN      XBP1       ESR1      MUC1
                 <chr>   <chr>     <dbl>      <dbl>     <dbl>      <dbl>     <dbl>
 1               BRCA1    BRCA  2.870500  1.3613571  2.983333  3.0842500  1.652125
 2               BRCA2    BRCA  2.166250  0.4283571  2.550833  2.3860000  3.080250
 3               BRCA3    BRCA  1.323500  1.3056429  3.020417  0.7912500  2.985250
 4               BRCA4    BRCA  1.841625  0.8096429  3.131333  2.4954167 -1.918500
 5               BRCA5    BRCA -6.025250  0.2508571 -1.451750 -4.8606667 -1.171500
 6               BRCA6    BRCA  1.804500  1.3107857  4.041083  2.7970000  3.529750
 7               BRCA7    BRCA -4.879250 -0.2369286 -0.724750 -4.4860833 -1.455750
 8               BRCA8    BRCA -3.143250 -1.2432143 -1.193083 -1.6274167 -0.986750
 9               BRCA9    BRCA  2.034000  1.2074286  2.278833  4.1155833  0.668000
10              BRCA10    BRCA -0.293125  0.2883571 -1.605083  0.4731667  0.011500
# ... with 1,295 more rows</code></pre>
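<p>The renaming code hard-codes the group sizes (590, 561 and 154) in the order in which the data sets appear in the table. As a hedged alternative, the per-group counters can be derived from the data with the base R function <strong>ave</strong>(). The sketch below uses a small toy vector rather than the real <em>expr</em> data; the names <em>dataset</em>, <em>idx</em> and <em>barcode</em> are illustrative.</p>
<pre class="r"><code># Toy stand-in for expr$dataset
dataset <- c("BRCA", "BRCA", "OV", "OV", "OV", "LUSC")

# Running counter within each data set: 1, 2, ... restarted for every group
idx <- ave(seq_along(dataset), dataset, FUN = seq_along)

# Combine the data set name and the counter
barcode <- paste0(dataset, idx)
print(barcode)</code></pre>
<p>This version does not depend on the row order or on knowing the group sizes in advance.</p>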
<p>The (expr) data set above has been saved at <a href="https://raw.githubusercontent.com/kassambara/data/master/expr_tcga.txt" class="uri">https://raw.githubusercontent.com/kassambara/data/master/expr_tcga.txt</a>. You’ll need it to run the R code provided in this tutorial.</p>
<p>If you have trouble installing the RTCGA packages, you can simply load the data as follows:</p>
<pre class="r"><code>expr <- read.delim("https://raw.githubusercontent.com/kassambara/data/master/expr_tcga.txt",
                   stringsAsFactors = FALSE)</code></pre>
</div>
<div id="box-plots" class="section level2">
<h2>Box plots</h2>
<p>(<a href="https://www.sthda.com/english/wiki/ggplot2-box-plot-quick-start-guide-r-software-and-data-visualization">ggplot2 way of creating box plot</a>)</p>
<p>Create a box plot of a gene expression profile, colored by groups (here data set/cancer type):</p>
<pre class="r"><code>library(ggpubr)
# GATA3
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco")

# PTEN
ggboxplot(expr, x = "dataset", y = "PTEN",
          title = "PTEN", ylab = "Expression",
          color = "dataset", palette = "jco")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-boxplot-gene-expression-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="355.2" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-boxplot-gene-expression-2.png" alt="Exploratory Data visualization: Gene Expression Data" width="355.2" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
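<p>To relate the boxes to numbers, the sketch below computes Tukey’s five-number summary (minimum, lower hinge, median, upper hinge, maximum) per group in base R. This is roughly what a box plot summarizes; the exact quartile rule used for plotting may differ slightly. The values in <em>toy</em> are invented for illustration and are not TCGA data.</p>
<pre class="r"><code># Hypothetical expression values for two groups
toy <- data.frame(dataset = rep(c("BRCA", "OV"), each = 5),
                  GATA3 = c(1, 2, 3, 4, 5, -2, -1, 0, 1, 2))

# Five-number summary of GATA3 within each data set
stats <- tapply(toy$GATA3, toy$dataset, fivenum)
print(stats)

# The middle value of each summary is the group median
medians <- sapply(stats, function(s) s[3])
print(medians)</code></pre>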
<div class="block">
<p>
Note that the argument <strong>palette</strong> is used to change the color palette. Allowed values include:
</p>
<ul>
<li>
“grey” for grey color palettes;
</li>
<li>
brewer palettes, e.g. “RdBu”, “Blues”, …. To view them all, type this in R: <strong>RColorBrewer::display.brewer.all()</strong> or <a href="https://www.sthda.com/english/wiki/ggplot2-colors-how-to-change-colors-automatically-and-manually#use-rcolorbrewer-palettes">click here to see all brewer palettes</a>;
</li>
<li>
or custom color palettes e.g. c(“blue”, “red”) or c(“#00AFBB”, “#E7B800”);
</li>
<li>
and scientific journal palettes from the <a href="https://cran.r-project.org/web/packages/ggsci/vignettes/ggsci.html">ggsci R package</a>, e.g.: “npg”, “aaas”, “lancet”, “jco”, “ucscgb”, “uchicago”, “simpsons” and “rickandmorty”.
</li>
</ul>
</div>
<p>Instead of repeating the same R code for each gene, you can create a list of plots at once, as follows:</p>
<pre class="r"><code># Create a  list of plots
p <- ggboxplot(expr, x = "dataset", 
               y = c("GATA3", "PTEN", "XBP1"),
               title = c("GATA3", "PTEN", "XBP1"),
               ylab = "Expression", 
               color = "dataset", palette = "jco")

# View GATA3
p$GATA3

# View PTEN
p$PTEN

# View XBP1
p$XBP1</code></pre>
<div class="block">
<p>
Note that when the argument <em>y</em> contains multiple variables (here, multiple gene names), the arguments <em>title</em>, <em>xlab</em> and <em>ylab</em> can also be character vectors of the same length as <em>y</em>.
</p>
</div>
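<p>The list returned in this case can also be arranged on one page. The sketch below is an illustration under assumptions: it uses the built-in <em>iris</em> data instead of <em>expr</em>, and the titles and axis labels are made up for the example. It relies on <strong>ggarrange</strong>() from ggpubr:</p>
<pre class="r"><code>library(ggpubr)

# One plot per y variable; title and ylab have the same length as y
p <- ggboxplot(iris, x = "Species",
               y = c("Sepal.Length", "Petal.Length"),
               title = c("Sepal length", "Petal length"),
               ylab = c("Length (cm)", "Length (cm)"),
               color = "Species", palette = "jco")

# The list elements are named after the y variables
names(p)

# Arrange the plots side by side
ggarrange(plotlist = p, ncol = 2)</code></pre>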
<p>To add p-values and significance levels to the boxplots, read our previous article: <a href="https://www.sthda.com/english/wiki/add-p-values-and-significance-levels-to-ggplots">Add P-values and Significance Levels to ggplots</a>. Briefly, you can do this:</p>
<pre class="r"><code>my_comparisons <- list(c("BRCA", "OV"), c("OV", "LUSC"))
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco")+
  stat_compare_means(comparisons = my_comparisons)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-compare-means-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="384" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>For each of the genes, you can compare the different groups as follows:</p>
<pre class="r"><code>compare_means(c(GATA3, PTEN, XBP1) ~ dataset, data = expr)</code></pre>
<pre><code># A tibble: 9 x 8
     .y. group1 group2             p         p.adj p.format p.signif   method
  <fctr>  <chr>  <chr>         <dbl>         <dbl>    <chr>    <chr>    <chr>
1  GATA3   BRCA     OV 1.111768e-177 3.335304e-177  < 2e-16     **** Wilcoxon
2  GATA3   BRCA   LUSC  6.684016e-73  1.336803e-72  < 2e-16     **** Wilcoxon
3  GATA3     OV   LUSC  2.965702e-08  2.965702e-08  3.0e-08     **** Wilcoxon
4   PTEN   BRCA     OV  6.791940e-05  6.791940e-05  6.8e-05     **** Wilcoxon
5   PTEN   BRCA   LUSC  1.042830e-16  3.128489e-16  < 2e-16     **** Wilcoxon
6   PTEN     OV   LUSC  1.280576e-07  2.561153e-07  1.3e-07     **** Wilcoxon
7   XBP1   BRCA     OV 2.551228e-123 7.653685e-123  < 2e-16     **** Wilcoxon
8   XBP1   BRCA   LUSC  1.950162e-42  3.900324e-42  < 2e-16     **** Wilcoxon
9   XBP1     OV   LUSC  4.239570e-11  4.239570e-11  4.2e-11     **** Wilcoxon</code></pre>
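<p>The <em>method</em> column above shows that Wilcoxon tests were used here. As a hedged sketch, the test and the p-value adjustment can be chosen with the arguments <em>method</em> and <em>p.adjust.method</em> of <strong>compare_means</strong>(). The example uses the built-in <em>ToothGrowth</em> data so that it runs without <em>expr</em>; <em>df</em> and <em>res</em> are illustrative names:</p>
<pre class="r"><code>library(ggpubr)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Pairwise t-tests between dose levels, Bonferroni-adjusted
res <- compare_means(len ~ dose, data = df,
                     method = "t.test",
                     p.adjust.method = "bonferroni")
res

# Global test across all dose levels
compare_means(len ~ dose, data = df, method = "kruskal.test")</code></pre>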
<p>If you want to select items (here, cancer types) to display, or to remove a particular item from the plot, use the argument <strong>select</strong> or <strong>remove</strong>, as follows:</p>
<pre class="r"><code># Select BRCA and OV cancer types
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco",
          select = c("BRCA", "OV"))

# or remove BRCA
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco",
          remove = "BRCA")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-select-dataset-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="336" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-select-dataset-2.png" alt="Exploratory Data visualization: Gene Expression Data" width="336" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>To change the order of the data sets on x axis, use the argument <strong>order</strong>. For example <em>order = c(“LUSC”, “OV”, “BRCA”)</em>:</p>
<pre class="r"><code># Order data sets
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco",
          order = c("LUSC", "OV", "BRCA"))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-order-dataset-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="336" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>To create horizontal plots, use the argument <strong>rotate = TRUE</strong>:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco",
          rotate = TRUE)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-horizontal-plot-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="432" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>To combine the three gene expression plots into a multi-panel plot, use the argument <strong>combine = TRUE</strong>:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          ylab = "Expression",
          color = "dataset", palette = "jco")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-boxplot-gene-expression-multi-panel-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="720" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>You can also merge the 3 plots using the argument <strong>merge = TRUE</strong> or <strong>merge = “asis”</strong>:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          merge = TRUE,
          ylab = "Expression", 
          palette = "jco")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-merge-plot-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="576" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>In the plot above, it’s easy to visually compare the expression levels of the different genes in each cancer type.</p>
<p>But you might want to put the genes (y variables) on the x axis, in order to compare the expression level of each gene across the different cancer types.</p>
<p>In this situation, the y variables (i.e., genes) become x tick labels and the x variable (i.e., dataset) becomes the grouping variable. To do this, use the argument <strong>merge = “flip”</strong>.</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          merge = "flip",
          ylab = "Expression", 
          palette = "jco")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-merge-flip-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="576" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>You might want to add jittered points to the boxplot. Each point corresponds to an individual observation. To add jittered points, use the argument <strong>add = “jitter”</strong> as follows. To customize the added elements, specify the argument <strong>add.params</strong>.</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "jitter",                              # Add jittered points
          add.params = list(size = 0.1, jitter = 0.2)  # Point size and the amount of jittering
          )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-boxplot-with-jitter-points-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="720" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<div class="block">
<p>
Note that when using <strong>ggboxplot</strong>(), sensible values for the argument <strong>add</strong> are “jitter” and “dotplot”. If you decide to use <strong>add = “dotplot”</strong>, you can adjust <em>dotsize</em> and <em>binwidth</em> when the dotplot is very dense. <a href="http://r4ds.had.co.nz/eda.html">Read more about binwidth</a>.
</p>
</div>
<p>You can add and adjust a dotplot as follows:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "dotplot",                              # Add dotplot
          add.params = list(binwidth = 0.1, dotsize = 0.3)
          )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-boxplot-with-dotplot-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="720" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>You might want to label the boxplot by showing the names of samples with the top n highest or lowest values. In this case, you can use the following arguments:</p>
<ul>
<li><strong>label</strong>: the name of the column containing point labels.</li>
<li><strong>label.select</strong>: can be of two formats:
<ul>
<li>a <em>character vector</em> specifying some labels to show.</li>
<li>a <em>list</em> containing one or the combination of the following components:
<ul>
<li><em>top.up</em> and <em>top.down</em>: to display the labels of the top up/down points. For example, <em>label.select = list(top.up = 10, top.down = 4)</em>.</li>
<li><em>criteria</em>: to filter, for example, by x and y variables values, use this: <em>label.select = list(criteria = “`y` > 3.9 &amp; `y` < 5 &amp; `x` %in% c(‘BRCA’, ‘OV’)”)</em>.</li>
</ul></li>
</ul></li>
</ul>
<p>For example:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "jitter",                               # Add jittered points
          add.params = list(size = 0.1, jitter = 0.2),  # Point size and the amount of jittering
          label = "bcr_patient_barcode",                # column containing point labels
          label.select = list(top.up = 2, top.down = 2),# Select some labels to display
          font.label = list(size = 9, face = "italic"), # label font
          repel = TRUE                                  # Avoid label text overplotting
          )</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-boxplot-with-point-labels-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="720" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
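<p>The other format of <strong>label.select</strong>, a character vector of labels, can be sketched as follows. This is an illustration on the built-in <em>mtcars</em> data, not on <em>expr</em>; the column <em>name</em> is created only for the example and the two car names are taken from the row names of <em>mtcars</em>:</p>
<pre class="r"><code>library(ggpubr)

df <- mtcars
df$name <- rownames(df)        # column holding the point labels
df$cyl <- as.factor(df$cyl)

p <- ggboxplot(df, x = "cyl", y = "mpg",
               color = "cyl", palette = "jco",
               add = "jitter",
               label = "name",                    # column containing point labels
               label.select = c("Toyota Corolla", # show only these two labels
                                "Cadillac Fleetwood"),
               repel = TRUE)                      # avoid label overplotting
p</code></pre>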
<p>A more complex criterion for labeling can be specified as follows:</p>
<pre class="r"><code>label.select.criteria <- list(criteria = "`y` > 3.9 &amp; `x` %in% c('BRCA', 'OV')")
ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          label = "bcr_patient_barcode",              # column containing point labels
          label.select = label.select.criteria,       # Select some labels to display
          font.label = list(size = 9, face = "italic"), # label font
          repel = TRUE                                # Avoid label text overplotting
          )</code></pre>
<div class="warning">
<p>
Other types of plots, with the same arguments as the function <strong>ggboxplot</strong>(), are available, such as stripchart and violin plots.
</p>
</div>
</div>
<div id="violin-plots" class="section level2">
<h2>Violin plots</h2>
<p>(<a href="https://www.sthda.com/english/wiki/ggplot2-violin-plot-quick-start-guide-r-software-and-data-visualization">ggplot2 way of creating violin plot</a>)</p>
<p>The following R code draws violin plots with box plots inside:</p>
<pre class="r"><code>ggviolin(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE, 
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "boxplot")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-violin-plots-and-box-plots-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>Instead of adding a box plot inside the violin plot, you can add the median + interquartile range as follows:</p>
<pre class="r"><code>ggviolin(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE, 
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "median_iqr")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-violin-plots-and-median-iqr-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<div class="block">
<p>
When using the function <strong>ggviolin</strong>(), sensible values for the argument <strong>add</strong> include: “mean”, “mean_se”, “mean_sd”, “mean_ci”, “mean_range”, “median”, “median_iqr”, “median_mad”, “median_range”.
</p>
<p>
You can also add “jitter” points and “dotplot” inside the violin plot as described previously in the box plot section.
</p>
</div>
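<p>For instance, jittered points inside a violin plot might look like the sketch below. It uses the built-in <em>ToothGrowth</em> data so that it is self-contained; the point size and jitter amount are arbitrary choices for the example:</p>
<pre class="r"><code>library(ggpubr)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Violin plot with individual observations as jittered points
p <- ggviolin(df, x = "dose", y = "len",
              color = "dose", palette = "jco",
              add = "jitter",
              add.params = list(size = 0.8, jitter = 0.1))
p</code></pre>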
</div>
<div id="stripcharts-and-dot-plots" class="section level2">
<h2>Stripcharts and dot plots</h2>
<p>To draw a stripchart, type this:</p>
<pre class="r"><code>ggstripchart(expr, x = "dataset",
             y = c("GATA3", "PTEN", "XBP1"),
             combine = TRUE, 
             color = "dataset", palette = "jco",
             size = 0.1, jitter = 0.2,
             ylab = "Expression", 
             add = "median_iqr",
             add.params = list(color = "gray"))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-stripchart-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>(<a href="https://www.sthda.com/english/wiki/ggplot2-stripchart-jitter-quick-start-guide-r-software-and-data-visualization">ggplot2 way of creating stripcharts</a>)</p>
<p>For a dot plot, use this:</p>
<pre class="r"><code>ggdotplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE, 
          color = "dataset", palette = "jco",
          fill = "white",
          binwidth = 0.1,
          ylab = "Expression", 
          add = "median_iqr",
          add.params = list(size = 0.9))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-dot-plots-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<p>(<a href="https://www.sthda.com/english/wiki/ggplot2-dot-plot-quick-start-guide-r-software-and-data-visualization">ggplot2 way of creating dot plots</a>)</p>
</div>
<div id="density-plots" class="section level2">
<h2>Density plots</h2>
<p>(<a href="https://www.sthda.com/english/wiki/ggplot2-density-plot-quick-start-guide-r-software-and-data-visualization">ggplot2 way of creating density plots</a>)</p>
<p>To visualize the distribution as a density plot, use the function <strong>ggdensity</strong>() as follows:</p>
<pre class="r"><code># Basic density plot
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..density..",
       combine = TRUE,                  # Combine the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE                       # Add marginal rug
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-density-plot-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Change color and fill by dataset
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..density..",
       combine = TRUE,                  # Combine the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE,                      # Add marginal rug
       color = "dataset", 
       fill = "dataset",
       palette = "jco"
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-density-plot-2.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Merge the 3 plots
# and use y = "..count.." instead of "..density.."
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-density-plot-3.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># color and fill by x variables
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       color = ".x.", fill = ".x.",     # color and fill by x variables
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-density-plot-4.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Facet by "dataset"
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       color = ".x.", fill = ".x.", 
       facet.by = "dataset",            # Split by "dataset" into multi-panel
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-density-plot-5.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
</div>
<div id="histogram-plots" class="section level2">
<h2>Histogram plots</h2>
<p>(<a href="https://www.sthda.com/english/wiki/ggplot2-histogram-plot-quick-start-guide-r-software-and-data-visualization">ggplot2 way of creating histogram plots</a>)</p>
<p>To visualize the distribution as a histogram, use the function <strong>gghistogram</strong>() as follows:</p>
<pre class="r"><code># Basic histogram plot 
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..density..",
       combine = TRUE,                  # Combine the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE                       # Add marginal rug
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-histogram-plot-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Change color and fill by dataset
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..density..",
       combine = TRUE,                  # Combine the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE,                      # Add marginal rug
       color = "dataset", 
       fill = "dataset",
       palette = "jco"
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-histogram-plot-2.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Merge the 3 plots
# and use y = "..count.." instead of "..density.."
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-histogram-plot-3.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># color and fill by x variables
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       color = ".x.", fill = ".x.",     # color and fill by x variables
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-histogram-plot-4.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Facet by "dataset"
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       color = ".x.", fill = ".x.", 
       facet.by = "dataset",            # Split by "dataset" into multi-panel
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-histogram-plot-5.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
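<p>The calls above leave the number of bins at its default. As a small sketch on simulated data (the data frame <em>wdata</em> and its columns are made up for the example), the argument <em>bins</em> of <strong>gghistogram</strong>() sets it explicitly:</p>
<pre class="r"><code>library(ggpubr)

# Simulated weights for two groups (illustrative data only)
set.seed(1234)
wdata <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 55), rnorm(200, 58))
)

# Histogram with an explicit number of bins and a mean line per group
p <- gghistogram(wdata, x = "weight",
                 bins = 30,
                 add = "mean", rug = TRUE,
                 color = "sex", fill = "sex",
                 palette = c("#00AFBB", "#E7B800"))
p</code></pre>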
</div>
<div id="empirical-cumulative-density-function" class="section level2">
<h2>Empirical cumulative distribution function</h2>
<p>(<a href="https://www.sthda.com/english/wiki/ggplot2-qq-plot-quantile-quantile-graph-quick-start-guide-r-software-and-data-visualization">ggplot2 way of creating ECDF plots</a>)</p>
<pre class="r"><code># Basic ECDF plot 
ggecdf(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       combine = TRUE,                 
       xlab = "Expression", ylab = "F(expression)"
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-ecdf-plot-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Change color  by dataset
ggecdf(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       combine = TRUE,                 
       xlab = "Expression", ylab = "F(expression)",
       color = "dataset", palette = "jco"
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-ecdf-plot-2.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Merge the 3 plots and color by x variables
ggecdf(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       merge = TRUE,                 
       xlab = "Expression", ylab = "F(expression)",
       color = ".x.", palette = "jco"
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-ecdf-plot-3.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Merge the 3 plots and color by x variables
# facet by "dataset" into multi-panel
ggecdf(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       merge = TRUE,                 
       xlab = "Expression", ylab = "F(expression)",
       color = ".x.", palette = "jco",
       facet.by = "dataset"
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-ecdf-plot-4.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
</div>
<div id="quantile---quantile-plot" class="section level2">
<h2>Quantile - Quantile plot</h2>
<p>(<a href="https://www.sthda.com/english/wiki/ggplot2-qq-plot-quantile-quantile-graph-quick-start-guide-r-software-and-data-visualization">ggplot2 way of creating QQ plots</a>)</p>
<pre class="r"><code># Basic QQ plot 
ggqqplot(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       combine = TRUE, size = 0.5
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-qq-plot-1.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Change color  by dataset
ggqqplot(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       combine = TRUE, color = "dataset", palette = "jco",
       size = 0.5
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-qq-plot-2.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Merge the 3 plots and color by x variables
ggqqplot(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       merge = TRUE,  
       color = ".x.", palette = "jco"
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-qq-plot-3.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
<pre class="r"><code># Merge the 3 plots and color by x variables
# facet by "dataset" into multi-panel
ggqqplot(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       merge = TRUE, size = 0.5,
       color = ".x.", palette = "jco",
       facet.by = "dataset"
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/exploratory-data-visualization-qq-plot-4.png" alt="Exploratory Data visualization: Gene Expression Data" width="768" style="margin-bottom:10px;" />
<p class="caption">
Exploratory Data visualization: Gene Expression Data
</p>
</div>
</div>
<div id="infos" class="section level2">
<h2>Infos</h2>
<p>This analysis has been performed using <strong>R software</strong> (ver. 3.3.2) and <strong>ggpubr</strong> (ver. 0.1.3).</p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Mon, 12 Jun 2017 16:30:13 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Add P-values and Significance Levels to ggplots]]></title>
			<link>https://www.sthda.com/english/wiki/add-p-values-and-significance-levels-to-ggplots</link>
			<guid>https://www.sthda.com/english/wiki/add-p-values-and-significance-levels-to-ggplots</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<p><br/></p>
<p>In this article, we’ll describe how to easily i) <strong>compare means</strong> of two or more groups, and ii) automatically add <strong>p-values</strong> and <strong>significance levels</strong> to a ggplot (such as box plots, dot plots, bar plots and line plots).</p>
<p><strong>Contents</strong>:</p>

<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a></li>
<li><a href="#methods-for-comparing-means">Methods for comparing means</a></li>
<li><a href="#r-functions-to-add-p-values">R functions to add p-values</a></li>
<li><a href="#compare-two-independent-groups">Compare two independent groups</a></li>
<li><a href="#compare-two-paired-samples">Compare two paired samples</a></li>
<li><a href="#compare-more-than-two-groups">Compare more than two groups</a></li>
<li><a href="#multiple-grouping-variables">Multiple grouping variables</a></li>
<li><a href="#other-plot-types">Other plot types</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>
 
<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="install-and-load-required-r-packages" class="section level3">
<h3>Install and load required R packages</h3>
<p>Required R package: <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr (version >= 0.1.3)</a>, for ggplot2-based publication ready plots.</p>
<ul>
<li>Install from <a href="https://cran.r-project.org/package=ggpubr">CRAN</a> as follows:</li>
</ul>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<ul>
<li>Or, install the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</li>
</ul>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<ul>
<li>Load ggpubr:</li>
</ul>
<pre class="r"><code>library(ggpubr)</code></pre>
<div class="success">
<p>
Official documentation of ggpubr is available at: <a href="https://www.sthda.com/english/rpkgs/ggpubr" class="uri">https://www.sthda.com/english/rpkgs/ggpubr</a>
</p>
</div>
</div>
<div id="demo-data-sets" class="section level3">
<h3>Demo data sets</h3>
<p>Data: the <a href="https://www.sthda.com/english/wiki/r-built-in-data-sets#toothgrowth">ToothGrowth</a> data set.</p>
<pre class="r"><code>data("ToothGrowth")
head(ToothGrowth)</code></pre>
<pre><code>   len supp dose
1  4.2   VC  0.5
2 11.5   VC  0.5
3  7.3   VC  0.5
4  5.8   VC  0.5
5  6.4   VC  0.5
6 10.0   VC  0.5</code></pre>
</div>
</div>
<div id="methods-for-comparing-means" class="section level2">
<h2>Methods for comparing means</h2>
<p>The standard methods for comparing the means of two or more groups in R are described in detail at: <a href="https://www.sthda.com/english/wiki/comparing-means-in-r">comparing means in R</a>.</p>
<p>The most common methods for comparing means include:</p>
<table>
<thead>
<tr class="header">
<th>Methods</th>
<th>R function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>T-test</td>
<td>t.test()</td>
<td>Compare two groups (parametric)</td>
</tr>
<tr class="even">
<td>Wilcoxon test</td>
<td>wilcox.test()</td>
<td>Compare two groups (non-parametric)</td>
</tr>
<tr class="odd">
<td>ANOVA</td>
<td>aov() or anova()</td>
<td>Compare multiple groups (parametric)</td>
</tr>
<tr class="even">
<td>Kruskal-Wallis</td>
<td>kruskal.test()</td>
<td>Compare multiple groups (non-parametric)</td>
</tr>
</tbody>
</table>
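<p>As a quick orientation, the four base R functions listed above can be called directly on the ToothGrowth demo data. This is a minimal sketch, not part of ggpubr: the object names (<em>tt</em>, <em>wt</em>, <em>fit</em>, <em>kt</em>) are illustrative, <em>dose</em> is converted to a factor so that it is treated as a grouping variable, and <em>exact = FALSE</em> is set only to avoid the warning that ties in the data would otherwise trigger.</p>
<pre class="r"><code>data("ToothGrowth")

# Two groups (supp: OJ vs VC)
tt <- t.test(len ~ supp, data = ToothGrowth)                      # parametric (Welch t-test)
wt <- wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE)  # non-parametric

# More than two groups (dose: 0.5, 1, 2)
fit <- aov(len ~ factor(dose), data = ToothGrowth)                # parametric
kt  <- kruskal.test(len ~ factor(dose), data = ToothGrowth)       # non-parametric

# Extract the p-values
tt$p.value
wt$p.value
summary(fit)[[1]][["Pr(>F)"]][1]
kt$p.value</code></pre>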
<p>Practical guides for computing and interpreting the results of each of these methods are available at the following links:</p>
<div class="block">
<ul>
<li>
Comparing a one-sample mean to a known standard mean:
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/one-sample-t-test-in-r">One-Sample T-test (parametric)</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/one-sample-wilcoxon-signed-rank-test-in-r">One-Sample Wilcoxon Test (non-parametric)</a>
</li>
</ul>
</li>
<li>
Comparing the means of two independent groups:
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/unpaired-two-samples-t-test-in-r">Unpaired Two Samples T-test (parametric)</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/unpaired-two-samples-wilcoxon-test-in-r">Unpaired Two-Samples Wilcoxon Test (non-parametric)</a>
</li>
</ul>
</li>
<li>
Comparing the means of paired samples:
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/paired-samples-t-test-in-r">Paired Samples T-test (parametric)</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/paired-samples-wilcoxon-test-in-r">Paired Samples Wilcoxon Test (non-parametric)</a>
</li>
</ul>
</li>
<li>
Comparing the means of more than two groups:
<ul>
<li>
Analysis of variance (ANOVA, parametric):
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/one-way-anova-test-in-r">One-Way ANOVA Test in R</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/two-way-anova-test-in-r">Two-Way ANOVA Test in R</a>
</li>
</ul>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/kruskal-wallis-test-in-r">Kruskal-Wallis Test in R (non-parametric alternative to one-way ANOVA)</a>
</li>
</ul>
</li>
</ul>
</div>
</div>
<div id="r-functions-to-add-p-values" class="section level2">
<h2>R functions to add p-values</h2>
<p>Here we present two new R functions in the <strong>ggpubr</strong> package:</p>
<ul>
<li><strong>compare_means</strong>(): an easy-to-use solution for performing one or multiple mean comparisons.</li>
<li><strong>stat_compare_means</strong>(): an easy-to-use solution for automatically adding p-values and significance levels to a ggplot.</li>
</ul>
<div id="compare_means" class="section level3">
<h3>compare_means()</h3>
<p>As we’ll show in the next sections, this function offers several useful options compared to the standard R functions.</p>
<p>The simplified format is as follows:</p>
<pre class="r"><code>compare_means(formula, data, method = "wilcox.test", paired = FALSE,
  group.by = NULL, ref.group = NULL, ...)</code></pre>
<div class="block">
<ul>
<li>
<strong>formula</strong>: a formula of the form <em>x ~ group</em>, where x is a numeric variable and group is a factor with one or multiple levels. For example, <em>formula = TP53 ~ cancer_group</em>. It’s also possible to perform the test for multiple response variables at the same time. For example, <em>formula = c(TP53, PTEN) ~ cancer_group</em>.
</li>
<li>
<p>
<strong>data</strong>: a data.frame containing the variables in the formula.
</p>
</li>
<li>
<strong>method</strong>: the type of test. Default is <em>“wilcox.test”</em>. Allowed values include:
<ul>
<li>
<em>“t.test”</em> (parametric) and <em>“wilcox.test”</em> (non-parametric). Perform a comparison between two groups of samples. If the grouping variable contains more than two levels, then pairwise comparisons are performed.
</li>
<li>
<em>“anova”</em> (parametric) and <em>“kruskal.test”</em> (non-parametric). Perform a global test (one-way ANOVA or Kruskal-Wallis, respectively) comparing multiple groups.
</li>
</ul>
</li>
<li>
<p>
<strong>paired</strong>: a logical indicating whether you want a paired test. Used only in <em>t.test</em> and in <em>wilcox.test</em>.
</p>
</li>
<li>
<p>
<strong>group.by</strong>: variables used to group the data set before applying the test. When specified the mean comparisons will be performed in each subset of the data formed by the different levels of the group.by variables.
</p>
</li>
<li>
<p>
<strong>ref.group</strong>: a character string specifying the reference group. If specified, for a given grouping variable, each of the group levels will be compared to the reference group (i.e. control group). ref.group can also be <em>“.all.”</em>. In this case, each of the grouping variable levels is compared to all (i.e. base-mean).
</p>
</li>
</ul>
</div>
</div>
<div id="stat_compare_means" class="section level3">
<h3>stat_compare_means()</h3>
<p>This function extends ggplot2 for adding mean comparison p-values to a ggplot, such as box plots, dot plots, bar plots and line plots.</p>
<p>The simplified format is as follows:</p>
<pre class="r"><code>stat_compare_means(mapping = NULL, comparisons = NULL, hide.ns = FALSE,
                   label = NULL,  label.x = NULL, label.y = NULL,  ...)</code></pre>
<div class="block">
<ul>
<li>
<p>
<strong>mapping</strong>: Set of aesthetic mappings created by aes().
</p>
</li>
<li>
<p>
<strong>comparisons</strong>: A list of length-2 vectors. The entries in the vector are either the names of 2 values on the x-axis or the 2 integers that correspond to the index of the groups of interest, to be compared.
</p>
</li>
<li>
<p>
<strong>hide.ns</strong>: logical value. If TRUE, hide ns symbol when displaying significance levels.
</p>
</li>
<li>
<p>
<strong>label</strong>: character string specifying label type. Allowed values include “p.signif” (shows the significance levels), “p.format” (shows the formatted p value).
</p>
</li>
<li>
<p>
<strong>label.x, label.y</strong>: numeric values. Coordinates (in data units) to be used for absolute positioning of the label. If too short, they will be recycled.
</p>
</li>
<li>
<p>
<strong>…</strong>: other arguments passed to the function <strong>compare_means</strong>() such as <em>method</em>, <em>paired</em>, <em>ref.group</em>.
</p>
</li>
</ul>
</div>
</div>
</div>
<div id="compare-two-independent-groups" class="section level2">
<h2>Compare two independent groups</h2>
<p>Perform the test:</p>
<pre class="r"><code>compare_means(len ~ supp, data = ToothGrowth)</code></pre>
<pre><code># A tibble: 1 x 8
    .y. group1 group2          p      p.adj p.format p.signif   method
  <chr>  <chr>  <chr>      <dbl>      <dbl>    <chr>    <chr>    <chr>
1   len     OJ     VC 0.06449067 0.06449067    0.064       ns Wilcoxon</code></pre>
<div class="warning">
<p>
By default <strong>method = “wilcox.test”</strong> (non-parametric test). You can also specify <strong>method = “t.test”</strong> for a parametric t-test.
</p>
</div>
<p>The returned value is a data frame with the following columns:</p>
<ul>
<li>.y.: the y variable used in the test.</li>
<li>group1, group2: the two groups being compared.</li>
<li>p: the p-value.</li>
<li>p.adj: the adjusted p-value. The default is p.adjust.method = “holm”.</li>
<li>p.format: the formatted p-value.</li>
<li>p.signif: the significance level.</li>
<li>method: the statistical test used to compare groups.</li>
</ul>
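<p>The mapping from p-values to the symbols in the <em>p.signif</em> column can be reproduced in base R. The sketch below assumes the convention documented for ggpubr (ns: p > 0.05, *: p <= 0.05, **: p <= 0.01, ***: p <= 0.001, ****: p <= 0.0001); the helper name <em>p_to_signif</em> is hypothetical and not a ggpubr function.</p>
<pre class="r"><code># Hypothetical helper: convert p-values to significance symbols
p_to_signif <- function(p) {
  as.character(cut(p,
    breaks = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
    labels = c("****", "***", "**", "*", "ns"),
    include.lowest = TRUE))   # right-closed intervals, so p = 0.05 maps to "*"
}

# p-values taken from the compare_means() outputs in this article
p_to_signif(c(0.06449067, 0.004312554, 1.772382e-04, 8.406447e-08))</code></pre>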
<p>Create a box plot with p-values:</p>
<pre class="r"><code>p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
          color = "supp", palette = "jco",
          add = "jitter")
#  Add p-value
p + stat_compare_means()

# Change method
p + stat_compare_means(method = "t.test")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-compare-means-two-independent-groups-1.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-compare-means-two-independent-groups-2.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<p>Note that the p-value label position can be adjusted using the arguments: <em>label.x, label.y, hjust and vjust</em>.</p>
<p>The default p-value label displayed is obtained by concatenating the <strong>method</strong> and the <strong>p</strong> columns of the data frame returned by the function <strong>compare_means</strong>(). You can specify other combinations using the <strong>aes</strong>() function.</p>
<p>For example,</p>
<ul>
<li><strong>aes(label = ..p.format..)</strong> or <strong>aes(label = paste0(“p =”, ..p.format..))</strong>: display only the formatted p-value (without the method name)</li>
<li><strong>aes(label = ..p.signif..)</strong>: display only the significance level.</li>
<li><strong>aes(label = paste0(..method.., “\n”, “p =”, ..p.format..))</strong>: Use line break (“\n”) between the method name and the p-value.</li>
</ul>
<p>As an illustration, type this:</p>
<pre class="r"><code>p + stat_compare_means( aes(label = ..p.signif..), 
                        label.x = 1.5, label.y = 40)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-compare-means-two-independent-groups-significance-level-1.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<p>If you prefer, it’s also possible to specify the argument <em>label</em> as a character vector:</p>
<pre class="r"><code>p + stat_compare_means( label = "p.signif", label.x = 1.5, label.y = 40)</code></pre>
</div>
<div id="compare-two-paired-samples" class="section level2">
<h2>Compare two paired samples</h2>
<p>Perform the test:</p>
<pre class="r"><code>compare_means(len ~ supp, data = ToothGrowth, paired = TRUE)</code></pre>
<pre><code># A tibble: 1 x 8
    .y. group1 group2           p       p.adj p.format p.signif   method
  <chr>  <chr>  <chr>       <dbl>       <dbl>    <chr>    <chr>    <chr>
1   len     OJ     VC 0.004312554 0.004312554   0.0043       ** Wilcoxon</code></pre>
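<p>For orientation, a paired test in base R matches observations by their order within each group: the first OJ value is paired with the first VC value, and so on. A minimal base R sketch of the same kind of test, assuming the rows of ToothGrowth are ordered consistently within each supplement group (the variable names <em>oj</em>, <em>vc</em> and <em>res</em> are illustrative):</p>
<pre class="r"><code>oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Pairing only makes sense when both groups have the same number of observations
stopifnot(length(oj) == length(vc))

# Paired Wilcoxon signed-rank test; exact = FALSE avoids the warning about ties
res <- wilcox.test(oj, vc, paired = TRUE, exact = FALSE)
res$p.value</code></pre>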
<p>Visualize paired data using the <strong>ggpaired</strong>() function:</p>
<pre class="r"><code>ggpaired(ToothGrowth, x = "supp", y = "len",
         color = "supp", line.color = "gray", line.size = 0.4,
         palette = "jco")+
  stat_compare_means(paired = TRUE)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-compare-means-paired-tests-1.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
</div>
<div id="compare-more-than-two-groups" class="section level2">
<h2>Compare more than two groups</h2>
<ul>
<li>Global test:</li>
</ul>
<pre class="r"><code># Global test
compare_means(len ~ dose,  data = ToothGrowth, method = "anova")</code></pre>
<pre><code># A tibble: 1 x 6
    .y.            p        p.adj p.format p.signif method
  <chr>        <dbl>        <dbl>    <chr>    <chr>  <chr>
1   len 9.532727e-16 9.532727e-16  9.5e-16     ****  Anova</code></pre>
<p>Plot with global p-value:</p>
<pre class="r"><code># Default method = "kruskal.test" for multiple groups
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+
  stat_compare_means()

# Change method to anova
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+
  stat_compare_means(method = "anova")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-multiple-independent-groups-1.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-multiple-independent-groups-2.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<ul>
<li><strong>Pairwise comparisons</strong>. If the grouping variable contains more than two levels, then pairwise tests will be performed automatically. The default method is “wilcox.test”. You can change this to “t.test”.</li>
</ul>
<pre class="r"><code># Perform pairwise comparisons
compare_means(len ~ dose,  data = ToothGrowth)</code></pre>
<pre><code># A tibble: 3 x 8
    .y. group1 group2            p        p.adj p.format p.signif   method
  <chr>  <chr>  <chr>        <dbl>        <dbl>    <chr>    <chr>    <chr>
1   len    0.5      1 7.020855e-06 1.404171e-05  7.0e-06     **** Wilcoxon
2   len    0.5      2 8.406447e-08 2.521934e-07  8.4e-08     **** Wilcoxon
3   len      1      2 1.772382e-04 1.772382e-04  0.00018      *** Wilcoxon</code></pre>
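<p>The <em>p.adj</em> column above is consistent with a Holm correction applied across the three pairwise p-values. A minimal base R check, using the p-values copied from the output above:</p>
<pre class="r"><code># Raw p-values for 0.5 vs 1, 0.5 vs 2 and 1 vs 2 (copied from the table above)
p <- c(7.020855e-06, 8.406447e-08, 1.772382e-04)

# Holm: the smallest p-value is multiplied by 3, the next by 2, the largest by 1
p.adjust(p, method = "holm")</code></pre>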
<pre class="r"><code># Visualize: Specify the comparisons you want
my_comparisons <- list( c("0.5", "1"), c("1", "2"), c("0.5", "2") )
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+ 
  stat_compare_means(comparisons = my_comparisons)+ # Add pairwise comparisons p-value
  stat_compare_means(label.y = 50)     # Add global p-value</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-pairwise-comparisons-1.png" alt="Add p-values and significance levels to ggplots" width="480" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<p>If you want to specify the precise y location of bars, use the argument <strong>label.y</strong>:</p>
<pre class="r"><code>ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+ 
  stat_compare_means(comparisons = my_comparisons, label.y = c(29, 35, 40))+
  stat_compare_means(label.y = 45)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-pairwise-comparisons-bar-location-1.png" alt="Add p-values and significance levels to ggplots" width="480" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<p>(Adding bars that connect the compared groups is made possible by the <a href="https://github.com/Artjom-Metro/ggsignif">ggsignif</a> R package.)</p>
<ul>
<li><strong>Multiple pairwise tests against a reference group</strong>:</li>
</ul>
<pre class="r"><code># Pairwise comparison against reference
compare_means(len ~ dose,  data = ToothGrowth, ref.group = "0.5",
              method = "t.test")</code></pre>
<pre><code># A tibble: 2 x 8
    .y. group1 group2            p        p.adj p.format p.signif method
  <chr>  <chr>  <chr>        <dbl>        <dbl>    <chr>    <chr>  <chr>
1   len    0.5      1 6.697250e-09 6.697250e-09  6.7e-09     **** T-test
2   len    0.5      2 1.469534e-16 2.939068e-16  < 2e-16     **** T-test</code></pre>
<pre class="r"><code># Visualize
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+
  stat_compare_means(method = "anova", label.y = 40)+      # Add global p-value
  stat_compare_means(label = "p.signif", method = "t.test",
                     ref.group = "0.5")                    # Pairwise comparison against reference</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-reference-group-1.png" alt="Add p-values and significance levels to ggplots" width="480" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<ul>
<li><strong>Multiple pairwise tests against all (base-mean)</strong>:</li>
</ul>
<pre class="r"><code># Comparison of each group against base-mean
compare_means(len ~ dose,  data = ToothGrowth, ref.group = ".all.",
              method = "t.test")</code></pre>
<pre><code># A tibble: 3 x 8
    .y. group1 group2            p        p.adj p.format p.signif method
  <chr>  <chr>  <chr>        <dbl>        <dbl>    <chr>    <chr>  <chr>
1   len  .all.    0.5 1.244626e-06 3.733877e-06  1.2e-06     **** T-test
2   len  .all.      1 5.667266e-01 5.667266e-01     0.57       ns T-test
3   len  .all.      2 1.371704e-05 2.743408e-05  1.4e-05     **** T-test</code></pre>
<pre class="r"><code># Visualize
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+
  stat_compare_means(method = "anova", label.y = 40)+      # Add global p-value
  stat_compare_means(label = "p.signif", method = "t.test",
                     ref.group = ".all.")                  # Pairwise comparison against all</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-comparison-against-base-mean-1.png" alt="Add p-values and significance levels to ggplots" width="480" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<p>A typical situation, where pairwise comparisons against “all” can be useful, is illustrated here using the <em>myeloma</em> data set from the <a href="https://github.com/kassambara/survminer"><strong>survminer</strong></a> package.</p>
<p>We’ll plot the expression profile of the DEPDC1 gene according to the patients’ molecular groups. We want to know if there is any difference between groups. If so, where is the difference?</p>
<p>To answer this question, you can perform a pairwise comparison between all the 7 groups. This leads to 21 comparisons, one for each possible pair. With this many groups, the results can be difficult to interpret.</p>
<p>Another easy solution is to compare each of the seven groups against “all” (i.e. base-mean). When the test is significant, you can conclude that DEPDC1 is significantly overexpressed or downexpressed in the given group compared to all.</p>
<pre class="r"><code># Load myeloma data from survminer package
if(!require(survminer)) install.packages("survminer")
data("myeloma", package = "survminer")

# Perform the test
compare_means(DEPDC1 ~ molecular_group,  data = myeloma,
              ref.group = ".all.", method = "t.test")</code></pre>
<pre><code># A tibble: 7 x 8
     .y. group1           group2            p        p.adj p.format p.signif method
   <chr>  <chr>            <chr>        <dbl>        <dbl>    <chr>    <chr>  <chr>
1 DEPDC1  .all.       Cyclin D-1 1.496896e-01 4.490687e-01  0.14969       ns T-test
2 DEPDC1  .all.       Cyclin D-2 5.231428e-01 1.000000e+00  0.52314       ns T-test
3 DEPDC1  .all.     Hyperdiploid 2.815333e-04 1.689200e-03  0.00028      *** T-test
4 DEPDC1  .all. Low bone disease 5.083985e-03 2.541992e-02  0.00508       ** T-test
5 DEPDC1  .all.              MAF 8.610664e-02 3.444265e-01  0.08611       ns T-test
6 DEPDC1  .all.            MMSET 5.762908e-01 1.000000e+00  0.57629       ns T-test
7 DEPDC1  .all.    Proliferation 1.241416e-09 8.689910e-09  1.2e-09     **** T-test</code></pre>
<pre class="r"><code># Visualize the expression profile
ggboxplot(myeloma, x = "molecular_group", y = "DEPDC1", color = "molecular_group", 
          add = "jitter", legend = "none") +
  rotate_x_text(angle = 45)+
  geom_hline(yintercept = mean(myeloma$DEPDC1), linetype = 2)+ # Add horizontal line at base mean
  stat_compare_means(method = "anova", label.y = 1600)+        # Add global ANOVA p-value
  stat_compare_means(label = "p.signif", method = "t.test",
                     ref.group = ".all.")                      # Pairwise comparison against all</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-comparison-against-base-mean2-1.png" alt="Add p-values and significance levels to ggplots" width="672" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<div class="success">
<p>
From the plot above, we can conclude that DEPDC1 is significantly overexpressed in the Proliferation group and significantly downexpressed in the Hyperdiploid and Low bone disease groups compared to all.
</p>
</div>
<div class="warning">
<p>
Note that if you want to hide the ns symbol, specify the argument <em>hide.ns = TRUE</em>.
</p>
</div>
<pre class="r"><code># Visualize the expression profile
ggboxplot(myeloma, x = "molecular_group", y = "DEPDC1", color = "molecular_group", 
          add = "jitter", legend = "none") +
  rotate_x_text(angle = 45)+
  geom_hline(yintercept = mean(myeloma$DEPDC1), linetype = 2)+ # Add horizontal line at base mean
  stat_compare_means(method = "anova", label.y = 1600)+        # Add global ANOVA p-value
  stat_compare_means(label = "p.signif", method = "t.test",
                     ref.group = ".all.", hide.ns = TRUE)      # Pairwise comparison against all</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-comparison-against-base-mean-hide-ns-1.png" alt="Add p-values and significance levels to ggplots" width="672" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
</div>
<div id="multiple-grouping-variables" class="section level2">
<h2>Multiple grouping variables</h2>
<ul>
<li><strong>Two independent sample comparisons after grouping the data by another variable</strong>:</li>
</ul>
<p>Perform the test:</p>
<pre class="r"><code>compare_means(len ~ supp, data = ToothGrowth, 
              group.by = "dose")</code></pre>
<pre><code># A tibble: 3 x 9
   dose   .y. group1 group2           p      p.adj p.format p.signif   method
  <dbl> <chr>  <chr>  <chr>       <dbl>      <dbl>    <chr>    <chr>    <chr>
1   0.5   len     OJ     VC 0.023186427 0.04637285    0.023        * Wilcoxon
2   1.0   len     OJ     VC 0.004030367 0.01209110    0.004       ** Wilcoxon
3   2.0   len     OJ     VC 1.000000000 1.00000000    1.000       ns Wilcoxon</code></pre>
<div class="notice">
<p>
In the example above, for each level of the variable “dose”, we compare the means of the variable “len” in the different groups formed by the grouping variable “supp”.
</p>
</div>
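<p>Conceptually, <em>group.by</em> splits the data by the grouping variable and runs the test within each subset. A rough base R equivalent is sketched below (an illustration of the idea, not ggpubr’s actual implementation; the names <em>by_dose</em> and <em>p_by_dose</em> are illustrative):</p>
<pre class="r"><code># Split ToothGrowth by dose, then compare OJ vs VC within each subset
by_dose <- split(ToothGrowth, ToothGrowth$dose)
p_by_dose <- sapply(by_dose, function(d) {
  wilcox.test(len ~ supp, data = d, exact = FALSE)$p.value  # exact = FALSE: data contain ties
})
p_by_dose

# Holm adjustment across the three dose levels
p.adjust(p_by_dose, method = "holm")</code></pre>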
<p>Visualize (1/2). Create multi-panel box plots faceted by group (here, “dose”):</p>
<pre class="r"><code># Box plot faceted by "dose"
p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
          color = "supp", palette = "jco",
          add = "jitter",
          facet.by = "dose", short.panel.labs = FALSE)
# Use only p.format as label. Remove method name.
p + stat_compare_means(label = "p.format")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-facet-1.png" alt="Add p-values and significance levels to ggplots" width="672" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<pre class="r"><code># Or use significance symbol as label
p + stat_compare_means(label =  "p.signif", label.x = 1.5)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-facet-2.png" alt="Add p-values and significance levels to ggplots" width="672" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<div class="warning">
<p>
To hide the ‘ns’ symbol, use the argument <strong>hide.ns = TRUE</strong>.
</p>
</div>
<p>Visualize (2/2). Create one single panel with all box plots. Plot y = “len” by x = “dose” and color by “supp”:</p>
<pre class="r"><code>p <- ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "supp", palette = "jco",
          add = "jitter")
p + stat_compare_means(aes(group = supp))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-compare-means-interaction-1.png" alt="Add p-values and significance levels to ggplots" width="672" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<pre class="r"><code># Show only p-value
p + stat_compare_means(aes(group = supp), label = "p.format")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-compare-means-interaction-2.png" alt="Add p-values and significance levels to ggplots" width="672" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<pre class="r"><code># Use significance symbol as label
p + stat_compare_means(aes(group = supp), label = "p.signif")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-compare-means-interaction-3.png" alt="Add p-values and significance levels to ggplots" width="672" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<ul>
<li><strong>Paired sample comparisons after grouping the data by another variable</strong>:</li>
</ul>
<p>Perform the test:</p>
<pre class="r"><code>compare_means(len ~ supp, data = ToothGrowth, 
              group.by = "dose", paired = TRUE)</code></pre>
<pre><code># A tibble: 3 x 9
   dose   .y. group1 group2          p      p.adj p.format p.signif   method
  <dbl> <chr>  <chr>  <chr>      <dbl>      <dbl>    <chr>    <chr>    <chr>
1   0.5   len     OJ     VC 0.03296938 0.06593876    0.033        * Wilcoxon
2   1.0   len     OJ     VC 0.01905889 0.05717667    0.019        * Wilcoxon
3   2.0   len     OJ     VC 1.00000000 1.00000000    1.000       ns Wilcoxon</code></pre>
<p>Visualize. Create multi-panel box plots faceted by group (here, “dose”):</p>
<pre class="r"><code># Box plot faceted by "dose"
p <- ggpaired(ToothGrowth, x = "supp", y = "len",
          color = "supp", palette = "jco", 
          line.color = "gray", line.size = 0.4,
          facet.by = "dose", short.panel.labs = FALSE)
# Use only p.format as label. Remove method name.
p + stat_compare_means(label = "p.format", paired = TRUE)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-facet-paired-1.png" alt="Add p-values and significance levels to ggplots" width="672" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
</div>
<div id="other-plot-types" class="section level2">
<h2>Other plot types</h2>
<ul>
<li><strong>Bar and line plots</strong> (one grouping variable):</li>
</ul>
<pre class="r"><code># Bar plot of mean +/-se
ggbarplot(ToothGrowth, x = "dose", y = "len", add = "mean_se")+
  stat_compare_means() +                                         # Global p-value
  stat_compare_means(ref.group = "0.5", label = "p.signif",
                     label.y = c(22, 29))                   # compare to ref.group

# Line plot of mean +/-se
ggline(ToothGrowth, x = "dose", y = "len", add = "mean_se")+
  stat_compare_means() +                                         # Global p-value
  stat_compare_means(ref.group = "0.5", label = "p.signif",
                     label.y = c(22, 29))     </code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-bar-line-plot-p-value-one-grouping-var-1.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-bar-line-plot-p-value-one-grouping-var-2.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
<ul>
<li><strong>Bar and line plots</strong> (two grouping variables):</li>
</ul>
<pre class="r"><code>ggbarplot(ToothGrowth, x = "dose", y = "len", add = "mean_se",
          color = "supp", palette = "jco", 
          position = position_dodge(0.8))+
  stat_compare_means(aes(group = supp), label = "p.signif", label.y = 29)

ggline(ToothGrowth, x = "dose", y = "len", add = "mean_se",
          color = "supp", palette = "jco")+
  stat_compare_means(aes(group = supp), label = "p.signif", 
                     label.y = c(16, 25, 29))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-bar-line-plot-p-value-two-grouping-var-1.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/ggpubr/add-p-values-to-ggplots-bar-line-plot-p-value-two-grouping-var-2.png" alt="Add p-values and significance levels to ggplots" width="355.2" style="margin-bottom:10px;" />
<p class="caption">
Add p-values and significance levels to ggplots
</p>
</div>
</div>
<div id="infos" class="section level2">
<h2>Infos</h2>
<p>This analysis has been performed using <strong>R software</strong> (ver. 3.3.2) and <strong>ggpubr</strong> (ver. 0.1.3).</p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Thu, 08 Jun 2017 11:48:52 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[R packages]]></title>
			<link>https://www.sthda.com/english/wiki/r-packages</link>
			<guid>https://www.sthda.com/english/wiki/r-packages</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<p><br/></p>
<style>
@media (min-width: 769px) {#rdoc .small-block{width: 250px; height:400px;}} /*large width*/
#rdoc .small-block{text-align:center; font-size:1.1em; margin-right:5px; display:block; float:left;}
.rpkg-title{text-align:center; font-size:1.8em; font-weight:bold; margin-bottom:20px;}
</style>
<p><span class="success"> In this section, you’ll find R packages developed by STHDA for easy data analyses.</span></p>
<br/>
<div class="small-block">
<div class="rpkg-title" style="color:#86AA00;">
factoextra
</div>
<p>factoextra lets you extract and visualize the results of multivariate data analyses, including PCA, CA, MCA, MFA, HMFA and clustering methods, as elegant ggplot2-based plots. 
</p>
<p><a href= "https://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization" target = "_blank">Overview >></a><br/> <a href= "https://www.sthda.com/english/rpkgs/factoextra" target = "_blank">factoextra Site Link >></a><br/></p>
</div>
<div class="small-block">
<div class="rpkg-title" style="color:#00AFBB;">
survminer
</div>
<p>survminer provides functions for facilitating survival analysis and visualization. 
</p>
<p><a href= "https://www.sthda.com/english/wiki/survminer-r-package-survival-data-analysis-and-visualization" target = "_blank">Overview >></a><br/> <a href= "https://www.sthda.com/english/rpkgs/survminer" target = "_blank">survminer Site Link >></a>
</p>
<p>Releases: <a href= "https://www.sthda.com/english/wiki/survminer-0-2-4" target = "_blank">v0.2.4</a> |</p>
</div>
<div class="small-block">
<div class="rpkg-title" style="color:#FF6600;">
ggpubr
</div>
<p>The default plots generated by ggplot2 require some formatting before they can be submitted for publication. In addition, the syntax for customizing a ggplot is opaque, which raises the level of difficulty for researchers with no advanced R programming skills. ggpubr provides some easy-to-use functions for creating and customizing ggplot2-based, publication-ready plots. 
</p>
<p><a href= "https://www.sthda.com/english/wiki/ggpubr-r-package-ggplot2-based-publication-ready-plots" target = "_blank">Overview >></a><br/> <a href= "https://www.sthda.com/english/rpkgs/ggpubr" target = "_blank">ggpubr Site Link >> </a><br/></p>
</div>
<br/>
<div style="clear:both;">

</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.3.2) </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Tue, 13 Dec 2016 00:24:14 +0100</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Bar plot of Group Means with Individual Observations]]></title>
			<link>https://www.sthda.com/english/wiki/bar-plot-of-group-means-with-individual-observations</link>
			<guid>https://www.sthda.com/english/wiki/bar-plot-of-group-means-with-individual-observations</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">


<div id="TOC">
<ul>
<li><a href="#example-data-sets">Example data sets</a></li>
<li><a href="#install-ggpubr">Install ggpubr</a></li>
<li><a href="#bar-plot-of-group-means-with-individual-informations">Bar plot of group means with individual observations</a></li>
</ul>
</div>

<p><br/></p>
<p><a href="https://www.sthda.com/english/english/wiki/ggpubr-r-package-ggplot2-based-publication-ready-plots">ggpubr</a> is great for data visualization and very easy to use, even for people who are not R programmers. It makes it easy to produce elegant ggplot2-based graphs. Read more about ggpubr: <a href="https://www.sthda.com/english/english/wiki/ggpubr-r-package-ggplot2-based-publication-ready-plots">ggpubr</a>.</p>
<p>Here we demonstrate how to easily draw a bar plot of group means +/- standard error with individual observations.</p>
<div id="example-data-sets" class="section level2">
<h2>Example data sets</h2>
<pre class="r"><code>d <- as.data.frame(mtcars[, c("am", "hp")])
d$rowname <- rownames(d)
head(d)</code></pre>
<pre><code>##                   am  hp           rowname
## Mazda RX4          1 110         Mazda RX4
## Mazda RX4 Wag      1 110     Mazda RX4 Wag
## Datsun 710         1  93        Datsun 710
## Hornet 4 Drive     0 110    Hornet 4 Drive
## Hornet Sportabout  0 175 Hornet Sportabout
## Valiant            0 105           Valiant</code></pre>
</div>
<div id="install-ggpubr" class="section level2">
<h2>Install ggpubr</h2>
<p>The latest version of ggpubr can be installed as follows:</p>
<pre class="r"><code># Install ggpubr
if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
</div>
<div id="bar-plot-of-group-means-with-individual-informations" class="section level2">
<h2>Bar plot of group means with individual observations</h2>
<ul>
<li>Plot y = “hp” by groups x = “am”</li>
<li>Add mean +/- standard error and individual points: <strong>add = c(“mean_se”, “point”)</strong>. Allowed values are one or a combination of: “none”, “dotplot”, “jitter”, “boxplot”, “point”, “mean”, “mean_se”, “mean_sd”, “mean_ci”, “mean_range”, “median”, “median_iqr”, “median_mad”, “median_range”.</li>
<li>Color and fill by groups: color = “am”, fill = “am”</li>
<li>Add row names as labels.</li>
</ul>
<pre class="r"><code>library(ggpubr)
# Bar plot of group means + points
ggbarplot(d, x = "am", y = "hp",
          add = c("mean_se", "point"),
          color = "am", fill = "am", alpha = 0.5)+ 
  ggrepel::geom_text_repel(aes(label = rowname))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-barplot-group-means-individual-information-bar-plot-group-means-1.png" alt="Bar plot of Group Means with Individual Observations" width="672" />
<p class="caption">
Bar plot of Group Means with Individual Observations
</p>
</div>
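<p>To make explicit what <strong>add = “mean_se”</strong> summarizes, the group means and standard errors can also be computed by hand. The following is a minimal base R sketch, not ggpubr’s internal code; the helper se() and the column names of the stats data frame are illustrative assumptions.</p>
<pre class="r"><code># Same example data as above
d <- mtcars[, c("am", "hp")]

# Standard error of the mean (hypothetical helper)
se <- function(x) sd(x) / sqrt(length(x))

# Mean and standard error of hp within each level of am
stats <- data.frame(
  am   = sort(unique(d$am)),
  mean = as.numeric(tapply(d$hp, d$am, mean)),
  se   = as.numeric(tapply(d$hp, d$am, se))
)

# Limits of the error bars: mean - SE and mean + SE
stats$lower <- stats$mean - stats$se
stats$upper <- stats$mean + stats$se
print(stats)</code></pre>
<p>The lower and upper columns correspond to the ends of the error bars drawn on top of each bar.</p>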
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Thu, 27 Oct 2016 16:55:39 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[ggpubr R Package: ggplot2-Based Publication Ready Plots]]></title>
			<link>https://www.sthda.com/english/wiki/ggpubr-r-package-ggplot2-based-publication-ready-plots</link>
			<guid>https://www.sthda.com/english/wiki/ggpubr-r-package-ggplot2-based-publication-ready-plots</guid>
			<description><![CDATA[<!-- START HTML -->

            
  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">


<div id="TOC">
<ul>
<li><a href="#why-ggpubr">Why ggpubr?</a><ul>
<li><a href="#installation-and-loading">Installation and loading</a></li>
</ul></li>
<li><a href="#geting-started">Getting started</a><ul>
<li><a href="#density-and-histogram-plots">Density and histogram plots</a></li>
<li><a href="#box-plots-violin-plots-dot-plots-and-strip-charts">Box plots, violin plots, dot plots and strip charts</a></li>
</ul></li>
<li><a href="#bar-plots">Bar plots</a><ul>
<li><a href="#line-plots">Line plots</a></li>
<li><a href="#pie-chart">Pie chart</a></li>
<li><a href="#scatter-plots">Scatter plots</a></li>
<li><a href="#clevelands-dot-plots">Cleveland’s dot plots</a></li>
<li><a href="#ggpar-customize-ggplot-easily">ggpar(): customize ggplot easily</a><ul>
<li><a href="#main-titles-axis-labels-and-legend-titles">Main titles, axis labels and legend titles</a></li>
<li><a href="#legend-position-and-appearance">Legend position and appearance</a></li>
<li><a href="#color-palettes">Color palettes</a></li>
<li><a href="#axis-limits-and-scales">Axis limits and scales</a></li>
<li><a href="#axis-ticks-customize-tick-marks-and-labels">Axis ticks: customize tick marks and labels</a></li>
<li><a href="#themes">Themes</a></li>
<li><a href="#rotate-a-plot">Rotate a plot</a></li>
</ul></li>
<li><a href="#more">More</a></li>
</ul></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<p><br/></p>
<div id="why-ggpubr" class="section level1">
<h1>Why ggpubr?</h1>
<p>ggplot2 by <a href="http://docs.ggplot2.org/current/">Hadley Wickham</a> is an excellent and flexible package for elegant data visualization in R. However, the default plots require some formatting before they can be submitted for publication. Furthermore, the syntax for customizing a ggplot is opaque, which raises the level of difficulty for researchers with no advanced R programming skills.</p>
<p><span class="success">The ‘ggpubr’ package provides some easy-to-use functions for creating and customizing ‘ggplot2’- based publication ready plots.</span></p>
<div id="installation-and-loading" class="section level2">
<h2>Installation and loading</h2>
<ul>
<li>Install from <a href="https://cran.r-project.org/package=ggpubr">CRAN</a> as follows:</li>
</ul>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<ul>
<li>Or, install the latest version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</li>
</ul>
<pre class="r"><code># Install
if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<ul>
<li>Load ggpubr as follows:</li>
</ul>
<pre class="r"><code>library(ggpubr)</code></pre>
</div>
</div>
<div id="geting-started" class="section level1">
<h1>Getting started</h1>
<p><span class="success">See the online documentation (<a href="https://www.sthda.com/english/english/rpkgs/ggpubr" class="uri">https://www.sthda.com/english/rpkgs/ggpubr</a>) for a complete list of available functions.</span></p>
<div id="density-and-histogram-plots" class="section level2">
<h2>Density and histogram plots</h2>
<ol style="list-style-type: decimal">
<li><strong>Create some data</strong></li>
</ol>
<pre class="r"><code>set.seed(1234)
wdata = data.frame(
   sex = factor(rep(c("F", "M"), each=200)),
   weight = c(rnorm(200, 55), rnorm(200, 58)))
head(wdata, 4)</code></pre>
<pre><code>##   sex   weight
## 1   F 53.79293
## 2   F 55.27743
## 3   F 56.08444
## 4   F 52.65430</code></pre>
<ol start="2" style="list-style-type: decimal">
<li><strong>Density plot with mean lines and marginal rug</strong></li>
</ol>
<pre class="r"><code># Change outline and fill colors by groups ("sex")
# Use custom palette
ggdensity(wdata, x = "weight",
   add = "mean", rug = TRUE,
   color = "sex", fill = "sex",
   palette = c("#00AFBB", "#E7B800"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-density-plot-1.png" width="528" /></p>
<br/>
<div class="block">
<p>Note that: <br/></p>
<ol style="list-style-type: decimal">
<li>the argument <strong>palette</strong> is used for coloring or filling by groups. Allowed values include:
<ul>
<li>“grey” for grey color palettes;</li>
<li>brewer palettes e.g. “RdBu”, “Blues”, …; <a href="https://www.sthda.com/english/english/wiki/ggplot2-colors-how-to-change-colors-automatically-and-manually#use-rcolorbrewer-palettes">click here to see all brewer palettes</a>.</li>
<li>or custom color palettes e.g. c(“blue”, “red”) or c(“#00AFBB”, “#E7B800”);</li>
<li>and scientific journal palettes from <a href="https://cran.r-project.org/web/packages/ggsci/vignettes/ggsci.html">ggsci R package</a>, e.g.: “npg”, “aaas”, “lancet”, “jco”, “ucscgb”, “uchicago”, “simpsons” and “rickandmorty”.</li>
</ul></li>
<li>the argument <strong>add</strong> can be used to add mean or median lines to density and to histogram plots. Allowed values are: “mean” and “median”.</li>
</ol>
</div>
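<p>As a short sketch of the two arguments described in the note above, the density plot can be redrawn with median lines and one of the named journal palettes. The data are the same simulated values as before; the object name p_med is only illustrative.</p>
<pre class="r"><code>library(ggpubr)

# Same simulated data as above
set.seed(1234)
wdata <- data.frame(
   sex = factor(rep(c("F", "M"), each = 200)),
   weight = c(rnorm(200, 55), rnorm(200, 58)))

# Median lines instead of mean lines, with the "jco" journal palette
p_med <- ggdensity(wdata, x = "weight",
   add = "median", rug = TRUE,
   color = "sex", fill = "sex",
   palette = "jco")
print(p_med)</code></pre>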
<p><br/></p>
<ol start="3" style="list-style-type: decimal">
<li><strong>Histogram plot with mean lines and marginal rug</strong></li>
</ol>
<pre class="r"><code># Change outline and fill colors by groups ("sex")
# Use custom color palette
gghistogram(wdata, x = "weight",
   add = "mean", rug = TRUE,
   color = "sex", fill = "sex",
   palette = c("#00AFBB", "#E7B800"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-histogram-plot-1.png" width="528" /></p>
<p><span class="warning">If you want to create the above histogram with the standard ggplot2 functions, the syntax is extremely complex for beginners (see the R script below). The ggpubr package is a wrapper around ggplot2 functions that makes your life easier and lets you quickly produce a publication-ready plot.</span></p>
<pre class="r"><code># ggplot2 standard syntax for creating histogram
# +++++++++++++++++++++++++++++++++++++
# Compute group mean
library("dplyr")
mu <- wdata %>%
  group_by(sex) %>%
  summarise(grp.mean = mean(weight))
# Plot
ggplot(data = wdata, aes(weight)) +
  geom_histogram(aes(color = sex, fill = sex),
                 position = "identity", alpha = 0.5)+
  geom_vline(data = mu, aes(xintercept=grp.mean, color = sex),
             linetype="dashed", size=1) +
  scale_color_manual(values = c("#00AFBB", "#E7B800"))+
  scale_fill_manual(values = c("#00AFBB", "#E7B800"))+
  theme_classic()+
  theme(
    axis.text.x = element_text(size = 12, colour = "black",face = "bold"),
    axis.text.y = element_text(size = 12, colour = "black",face = "bold"),
    axis.line.x = element_line(colour = "black", size = 1),
    axis.line.y = element_line(colour = "black", size = 1),
    legend.position = "bottom"
    )</code></pre>
</div>
<div id="box-plots-violin-plots-dot-plots-and-strip-charts" class="section level2">
<h2>Box plots, violin plots, dot plots and strip charts</h2>
<ol style="list-style-type: decimal">
<li><strong>Load data</strong></li>
</ol>
<pre class="r"><code>data("ToothGrowth")
df <- ToothGrowth
head(df, 4)</code></pre>
<pre><code>##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5</code></pre>
<ol start="2" style="list-style-type: decimal">
<li><strong>Box plots with jittered points</strong></li>
</ol>
<pre class="r"><code># Change outline colors by groups: dose
# Use custom color palette
# Add jitter points and change the shape by groups
 ggboxplot(df, x = "dose", y = "len",
    color = "dose", palette =c("#00AFBB", "#E7B800", "#FC4E07"),
    add = "jitter", shape = "dose")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-box-plot-points-1.png" width="528" /></p>
<br/>
<div class="block">
<p>Note that, when using ggpubr functions for drawing box plots, violin plots, dot plots, strip charts, bar plots, line plots or error plots, the argument <strong>add</strong> can be used for adding another plot element (e.g.: dot plot or error bars).</p>
In this case, allowed values for the argument <strong>add</strong> are one or a combination of: “none”, “dotplot”, “jitter”, “boxplot”, “mean”, “mean_se”, “mean_sd”, “mean_ci”, “mean_range”, “median”, “median_iqr”, “median_mad”, “median_range”; see ?desc_statby for more details.
</div>
<p><br/></p>
<ol start="3" style="list-style-type: decimal">
<li><strong>Violin plots with box plots inside</strong></li>
</ol>
<pre class="r"><code># Change fill color by groups: dose
# add boxplot with white fill color
ggviolin(df, x = "dose", y = "len", fill = "dose",
   palette = c("#00AFBB", "#E7B800", "#FC4E07"),
   add = "boxplot", add.params = list(fill = "white"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-violin-plots-with-box-plots-1.png" width="528" /></p>
<ol start="4" style="list-style-type: decimal">
<li><strong>Dot plots with summary statistics</strong></li>
</ol>
<pre class="r"><code># Change outline and fill colors by groups: dose
# Add mean + sd
ggdotplot(df, x = "dose", y = "len", color = "dose", fill = "dose", 
          palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          add = "mean_sd", add.params = list(color = "gray"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-dot-plots-1.png" width="528" /></p>
<br/>
<div class="block">
Recall that, possible summary statistics include “boxplot”, “mean”, “mean_se”, “mean_sd”, “mean_ci”, “mean_range”, “median”, “median_iqr”, “median_mad”, “median_range”; see ?desc_statby for more details.
</div>
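<p>The summary statistics listed above can also be inspected as numbers with desc_statby(), the function the note refers to. This is a minimal sketch; the column selection in the last line assumes the column names returned by recent ggpubr versions.</p>
<pre class="r"><code>library(ggpubr)
df <- ToothGrowth

# Descriptive statistics of "len" for each level of "dose"
stats <- desc_statby(df, measure.var = "len", grps = "dose")

# Keep a few columns (names assumed from recent ggpubr versions)
stats[, c("dose", "length", "mean", "sd", "se")]</code></pre>
<p>Each row describes one dose group, so the values can be checked against the error bars drawn by add = “mean_sd” or add = “mean_se”.</p>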
<p><br/></p>
<ol start="5" style="list-style-type: decimal">
<li><strong>Strip chart with summary statistics</strong></li>
</ol>
<pre class="r"><code># Change points size
# Change point colors and shapes by groups: dose
# Use custom color palette
 ggstripchart(df, "dose", "len",  size = 2, shape = "dose",
   color = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
   add = "mean_sd")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-strip-chart-1.png" width="528" /></p>
</div>
</div>
<div id="bar-plots" class="section level1">
<h1>Bar plots</h1>
<ol style="list-style-type: decimal">
<li><strong>Basic plot with labels outside</strong></li>
</ol>
<pre class="r"><code># Data
df2 <- data.frame(dose=c("D0.5", "D1", "D2"),
   len=c(4.2, 10, 29.5))
print(df2)</code></pre>
<pre><code>##   dose  len
## 1 D0.5  4.2
## 2   D1 10.0
## 3   D2 29.5</code></pre>
<pre class="r"><code># Change outline and fill colors by groups: dose
# Use custom color palette
# Add labels
 ggbarplot(df2, x = "dose", y = "len",
   fill = "dose", color = "dose",
   palette = c("#00AFBB", "#E7B800", "#FC4E07"),
   label = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-bar-plots-1.png" width="528" /></p>
<br/>
<div class="block">
<ul>
<li>Use lab.pos = “in”, to put labels inside bars</li>
<li>Use lab.col, to change label colors</li>
</ul>
</div>
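<p>A minimal sketch of the two options listed above, reusing the same small data set; the object name p_in is only illustrative.</p>
<pre class="r"><code>library(ggpubr)

# Same data as above
df2 <- data.frame(dose = c("D0.5", "D1", "D2"),
   len = c(4.2, 10, 29.5))

# Put labels inside the bars and draw them in white
p_in <- ggbarplot(df2, x = "dose", y = "len",
   fill = "dose", color = "dose",
   palette = c("#00AFBB", "#E7B800", "#FC4E07"),
   label = TRUE, lab.pos = "in", lab.col = "white")
print(p_in)</code></pre>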
<p><br/></p>
<ol start="2" style="list-style-type: decimal">
<li><strong>Bar plot with multiple groups</strong></li>
</ol>
<pre class="r"><code># Create some data
df3 <- data.frame(supp=rep(c("VC", "OJ"), each=3),
   dose=rep(c("D0.5", "D1", "D2"),2),
   len=c(6.8, 15, 33, 4.2, 10, 29.5))
print(df3)</code></pre>
<pre><code>##   supp dose  len
## 1   VC D0.5  6.8
## 2   VC   D1 15.0
## 3   VC   D2 33.0
## 4   OJ D0.5  4.2
## 5   OJ   D1 10.0
## 6   OJ   D2 29.5</code></pre>
<pre class="r"><code># Plot "len" by "dose" and change color by a second group: "supp"
# Add labels inside bars
ggbarplot(df3, x = "dose", y = "len",
  fill = "supp", color = "supp", palette = c("#00AFBB", "#E7B800"),
  label = TRUE, lab.col = "white", lab.pos = "in")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-bar-plot-multiple-groups-1.png" width="528" /></p>
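<p>The plot above stacks the two supp groups within each dose. As an alternative sketch, and assuming a ggpubr version in which ggbarplot() exposes a <strong>position</strong> argument, the bars can be placed side by side with ggplot2’s position_dodge(). The dodge width of 0.9 is an arbitrary choice.</p>
<pre class="r"><code>library(ggpubr)

# Same data as above
df3 <- data.frame(supp = rep(c("VC", "OJ"), each = 3),
   dose = rep(c("D0.5", "D1", "D2"), 2),
   len = c(6.8, 15, 33, 4.2, 10, 29.5))

# Side-by-side (dodged) bars instead of stacked bars
p_dodge <- ggbarplot(df3, x = "dose", y = "len",
  fill = "supp", color = "supp", palette = c("#00AFBB", "#E7B800"),
  label = TRUE, position = position_dodge(0.9))
print(p_dodge)</code></pre>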
<ol start="3" style="list-style-type: decimal">
<li><strong>Bar plot visualizing the mean of each group with error bars</strong></li>
</ol>
<pre class="r"><code># Data: the ToothGrowth data set will be used.
df <- ToothGrowth
head(df, 10)</code></pre>
<pre><code>##     len supp dose
## 1   4.2   VC  0.5
## 2  11.5   VC  0.5
## 3   7.3   VC  0.5
## 4   5.8   VC  0.5
## 5   6.4   VC  0.5
## 6  10.0   VC  0.5
## 7  11.2   VC  0.5
## 8  11.2   VC  0.5
## 9   5.2   VC  0.5
## 10  7.0   VC  0.5</code></pre>
<pre class="r"><code># Visualize the mean of each group
# Change point and outline colors by groups: dose
# Add jitter points and errors (mean_se)
ggbarplot(df, x = "dose", y = "len", color = "dose",
          palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          add = c("mean_se", "jitter"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-unnamed-chunk-10-1.png" width="528" /></p>
<div id="line-plots" class="section level2">
<h2>Line plots</h2>
<ol style="list-style-type: decimal">
<li><strong>Line plots with multiple groups</strong></li>
</ol>
<pre class="r"><code># Plot "len" by "dose" and
# Change line types and point shapes by a second group: "supp"
# Change color by groups "supp"
ggline(df3, x = "dose", y = "len",
  linetype = "supp", shape = "supp",
  color = "supp",  palette = c("#00AFBB", "#E7B800"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-unnamed-chunk-11-1.png" width="528" /></p>
<ol start="2" style="list-style-type: decimal">
<li><strong>Line plot visualizing the mean of each group with error bars</strong></li>
</ol>
<pre class="r"><code># Visualize the mean of each group: dose
# Change colors by a second group: supp
# Add jitter points and errors (mean_se)
ggline(df, x = "dose", y = "len", 
       color = "supp", 
       palette = c("#00AFBB", "#E7B800", "#FC4E07"),
       add = c("mean_se", "jitter"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-unnamed-chunk-12-1.png" width="528" /></p>
</div>
<div id="pie-chart" class="section level2">
<h2>Pie chart</h2>
<ol style="list-style-type: decimal">
<li><strong>Create some data</strong></li>
</ol>
<pre class="r"><code>df4 <- data.frame(
  group = c("Male", "Female", "Child"),
  value = c(25, 25, 50))
head(df4)</code></pre>
<pre><code>##    group value
## 1   Male    25
## 2 Female    25
## 3  Child    50</code></pre>
<ol start="2" style="list-style-type: decimal">
<li><strong>Pie chart</strong></li>
</ol>
<pre class="r"><code># Change fill color by group
# set outline line color to white
# Use custom color palette
# Show group names and value as labels
labs <- paste0(df4$group, " (", df4$value, "%)")
ggpie(df4, x = "value", fill = "group", color = "white",
   palette = c("#00AFBB", "#E7B800", "#FC4E07"),
   label = labs, lab.pos = "in", lab.font = "white")</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-pie-chart-1.png" width="576" /></p>
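<p>The labels above rely on the values already being percentages. When starting from raw counts, the percentages can be derived first. The following base R sketch uses made-up counts (the vector counts is hypothetical) that reduce to the same proportions.</p>
<pre class="r"><code># Hypothetical raw counts per group
counts <- c(Male = 12, Female = 12, Child = 24)

# Convert counts to rounded percentages of the total
pct <- round(100 * counts / sum(counts))

# Build labels of the form "group (xx%)"
labs <- paste0(names(counts), " (", pct, "%)")
print(labs)</code></pre>
<p>The resulting labs vector can then be passed to the label argument of ggpie() as in the example above.</p>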
</div>
<div id="scatter-plots" class="section level2">
<h2>Scatter plots</h2>
<ol style="list-style-type: decimal">
<li><strong>Load and prepare data</strong></li>
</ol>
<pre class="r"><code>data("mtcars")
df5 <- mtcars
df5$cyl <- as.factor(df5$cyl) # grouping variable
df5$name = rownames(df5) # for point labels
head(df5[, c("wt", "mpg", "cyl")], 3)</code></pre>
<pre><code>##                  wt  mpg cyl
## Mazda RX4     2.620 21.0   6
## Mazda RX4 Wag 2.875 21.0   6
## Datsun 710    2.320 22.8   4</code></pre>
<ol start="2" style="list-style-type: decimal">
<li><strong>Scatter plots with regression line and confidence interval</strong></li>
</ol>
<pre class="r"><code>ggscatter(df5, x = "wt", y = "mpg",
   color = "black", shape = 21, size = 4, # Points color, shape and size
   add = "reg.line",  # Add regression line
   add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
   conf.int = TRUE, # Add confidence interval
   cor.coef = TRUE # Add correlation coefficient
   )</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-scatter-plot-regression-line-1.png" width="528" /></p>
<br/>
<div class="block">
Note that, when using ggpubr functions for drawing scatter plots, allowed values for the argument <strong>add</strong> are one of “none”, “reg.line” (for adding linear regression line) or “loess” (for adding local regression fitting).
</div>
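<p>As a sketch of the other allowed value mentioned above, a local regression curve can replace the linear regression line. The object name p_loess is only illustrative.</p>
<pre class="r"><code>library(ggpubr)
df5 <- mtcars

# Local regression (loess) fit with its confidence band
p_loess <- ggscatter(df5, x = "wt", y = "mpg",
   add = "loess", conf.int = TRUE)
print(p_loess)</code></pre>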
<p><br/></p>
<ol start="3" style="list-style-type: decimal">
<li><strong>Scatter plot with concentration ellipses and labels</strong></li>
</ol>
<pre class="r"><code># Change point colors and shapes by groups: cyl
# Use custom palette
# Add concentration ellipses with mean points (barycenters)
# Add marginal rug
# Add label and use repel = TRUE to avoid label overplotting
ggscatter(df5, x = "wt", y = "mpg",
   color = "cyl", shape = "cyl",
   palette = c("#00AFBB", "#E7B800", "#FC4E07"),
   ellipse = TRUE, mean.point = TRUE,
   rug = TRUE, label = "name", font.label = 10, repel = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-scatter-plot-concentration-ellipse-1.png" width="672" /></p>
<br/>
<div class="block">
Note that, it’s possible to change the ellipse type by using the argument <strong>ellipse.type</strong>. Possible values are ‘convex’, ‘confidence’ or types supported by ggplot2::stat_ellipse() including one of c(“t”, “norm”, “euclid”).
</div>
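<p>For instance, a convex hull can be drawn around each group instead of a concentration ellipse. This is a minimal sketch of the ellipse.type argument described above; the object name p_hull is only illustrative.</p>
<pre class="r"><code>library(ggpubr)
df5 <- mtcars
df5$cyl <- as.factor(df5$cyl) # grouping variable

# Convex hull around the points of each cyl group
p_hull <- ggscatter(df5, x = "wt", y = "mpg",
   color = "cyl", shape = "cyl",
   palette = c("#00AFBB", "#E7B800", "#FC4E07"),
   ellipse = TRUE, ellipse.type = "convex")
print(p_hull)</code></pre>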
<p><br/></p>
</div>
<div id="clevelands-dot-plots" class="section level2">
<h2>Cleveland’s dot plots</h2>
<pre class="r"><code># Change colors by  group cyl
ggdotchart(df5, x = "mpg", label = "name",
   group = "cyl", color = "cyl",
   palette = c("#00AFBB", "#E7B800", "#FC4E07") )</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-cleveland-dot-plots-1.png" width="384" /></p>
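<p>The call above follows the ggdotchart() interface of the ggpubr version used for this article. More recent releases appear to expect both an x variable (the labels) and a y variable (the values). The following sketch is written under that assumption and may need adjusting for your installed version.</p>
<pre class="r"><code>library(ggpubr)
df5 <- mtcars
df5$cyl <- as.factor(df5$cyl) # grouping variable
df5$name <- rownames(df5)     # car names used as labels

# Assumes the newer interface: x = labels, y = values
p_dot <- ggdotchart(df5, x = "name", y = "mpg",
   group = "cyl", color = "cyl",
   palette = c("#00AFBB", "#E7B800", "#FC4E07"),
   sorting = "descending", # sort values within each group
   rotate = TRUE)          # put the car names on the vertical axis
print(p_dot)</code></pre>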
</div>
<div id="ggpar-customize-ggplot-easily" class="section level2">
<h2>ggpar(): customize ggplot easily</h2>
<p>The function <strong>ggpar</strong>() [in ggpubr] can be used to simply and easily customize any ggplot2-based graph. The graphical parameters that can be changed using ggpar() include:</p>
<ul>
<li>Main titles, axis labels and legend titles</li>
<li>Legend position and appearance</li>
<li>colors</li>
<li>Axis limits</li>
<li>Axis transformations: log and sqrt</li>
<li>Axis ticks</li>
<li>Themes</li>
<li>Rotate a plot</li>
</ul>
<p><span class="warning">Note that all the arguments accepted by the function ggpar() can be also directly passed to the plotting functions in ggpubr package.</span></p>
<p>We start by creating a basic box plot colored by groups as follows:</p>
<pre class="r"><code>df <- ToothGrowth
p <- ggboxplot(df, x = "dose", y = "len",
               color = "dose")
print(p)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-basic-box-plot-1.png" width="259.2" /></p>
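<p>Since the arguments accepted by ggpar() can also be passed directly to the plotting functions, the same kind of customization can be written in a single call. The following is a minimal sketch; the labels are arbitrary and the object name p_direct is only illustrative.</p>
<pre class="r"><code>library(ggpubr)
df <- ToothGrowth

# ggpar() arguments passed directly to ggboxplot()
p_direct <- ggboxplot(df, x = "dose", y = "len", color = "dose",
   xlab = "Dose (mg)", ylab = "Teeth length",
   legend = "right", legend.title = "Dose (mg)")
print(p_direct)</code></pre>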
<div id="main-titles-axis-labels-and-legend-titles" class="section level3">
<h3>Main titles, axis labels and legend titles</h3>
<pre class="r"><code># Change title texts and fonts
ggpar(p, main = "Plot of length \n by dose",
      xlab ="Dose (mg)", ylab = "Teeth length",
      legend.title = "Dose (mg)",
      font.main = c(14,"bold.italic", "red"),
      font.x = c(14, "bold", "#2E9FDF"),
      font.y = c(14, "bold", "#E7B800"))

# Hide titles
ggpar(p, xlab = FALSE, ylab = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-main-titles-axis-labels-legend-titles-1.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-main-titles-axis-labels-legend-titles-2.png" width="259.2" /></p>
<br/>
<div class="block">
<p>Note that, <br/></p>
<ol style="list-style-type: decimal">
<li><strong>font.main, font.x, font.y</strong> are vectors of length 3 giving the size (e.g.: 14), the style (e.g.: “plain”, “bold”, “italic”, “bold.italic”) and the color (e.g.: “red”) of the main title, xlab and ylab, respectively. For example, font.x = c(14, “bold”, “red”). Use font.x = 14 to change only the font size, or font.x = “bold” to change only the font face.</li>
<li>you can use <strong>\n</strong> to split a long title into multiple lines.</li>
</ol>
</div>
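<p>A minimal sketch of the shortened forms mentioned in the note, changing one font property at a time. The object names p_size and p_face are only illustrative.</p>
<pre class="r"><code>library(ggpubr)
p <- ggboxplot(ToothGrowth, x = "dose", y = "len", color = "dose")

# Change only the font size of the x axis title
p_size <- ggpar(p, font.x = 14)

# Change only the font face of the y axis title
p_face <- ggpar(p, font.y = "bold")

print(p_size)
print(p_face)</code></pre>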
<p><br/></p>
</div>
<div id="legend-position-and-appearance" class="section level3">
<h3>Legend position and appearance</h3>
<pre class="r"><code>ggpar(p,
 legend = "right", legend.title = "Dose (mg)",
 font.legend = c(10, "bold", "red"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-legend-position-appearance-1.png" width="307.2" /></p>
<br/>
<div class="block">
Note that the <strong>legend</strong> argument is a character vector specifying the legend position. Allowed values are one of c(“top”, “bottom”, “left”, “right”, “none”). The default is the “bottom” position. To remove the legend, use legend = “none”. The legend position can also be specified using a numeric vector c(x, y), whose values should be between 0 and 1: c(0,0) corresponds to the “bottom left” and c(1,1) to the “top right” position.
</div>
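<p>The two other forms described in the note can be sketched as follows. The coordinates c(0.9, 0.2) are an arbitrary choice, and recent ggplot2 versions may emit a deprecation warning for numeric legend positions.</p>
<pre class="r"><code>library(ggpubr)
p <- ggboxplot(ToothGrowth, x = "dose", y = "len", color = "dose")

# Remove the legend
p_none <- ggpar(p, legend = "none")

# Place the legend inside the plotting area, near the bottom right
# (numeric positions may trigger a warning in recent ggplot2 versions)
p_xy <- ggpar(p, legend = c(0.9, 0.2))

print(p_none)
print(p_xy)</code></pre>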
<p><br/></p>
</div>
<div id="color-palettes" class="section level3">
<h3>Color palettes</h3>
<p>As mentioned above, the argument <strong>palette</strong> is used to change group color palettes. Allowed values include:</p>
<ul>
<li>Custom color palettes e.g. c(“blue”, “red”) or c(“#00AFBB”, “#E7B800”);</li>
<li>“grey” for grey color palettes;</li>
<li>brewer palettes e.g. “RdBu”, “Blues”, …; <a href="https://www.sthda.com/english/english/wiki/ggplot2-colors-how-to-change-colors-automatically-and-manually#use-rcolorbrewer-palettes">click here to see all brewer palettes</a>.</li>
<li>and scientific journal palettes from <a href="https://cran.r-project.org/web/packages/ggsci/vignettes/ggsci.html">ggsci R package</a>, e.g.: “npg”, “aaas”, “lancet”, “jco”, “ucscgb”, “uchicago”, “simpsons” and “rickandmorty”.</li>
</ul>
<pre class="r"><code># Use custom color palette
ggpar(p, palette = c("#00AFBB", "#E7B800", "#FC4E07"))

# Use brewer palette
ggpar(p, palette = "Dark2" )

# Use grey palette
ggpar(p, palette = "grey")
   

# Use scientific journal palette from ggsci package
# Allowed values: "npg", "aaas", "lancet", "jco", 
#   "ucscgb", "uchicago", "simpsons" and "rickandmorty".
ggpar(p, palette = "npg") # nature</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-color-1.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-color-2.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-color-3.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-color-4.png" width="259.2" /></p>
</div>
<div id="axis-limits-and-scales" class="section level3">
<h3>Axis limits and scales</h3>
<p>The following arguments can be used:</p>
<br/>
<div class="block">
<ul>
<li><strong>xlim, ylim</strong>: a numeric vector of length 2, specifying x and y axis limits (minimum and maximum values), respectively. e.g.: ylim = c(0, 50).</li>
<li><strong>xscale, yscale</strong>: x and y axis scale, respectively. Allowed values are one of c(“none”, “log2”, “log10”, “sqrt”); e.g.: yscale=“log2”.</li>
<li><strong>format.scale</strong>: logical value. If TRUE, axis tick mark labels will be formatted when xscale or yscale = “log2” or “log10”.</li>
</ul>
</div>
<p><br/></p>
<pre class="r"><code># Change y axis limits
ggpar(p, ylim = c(0, 50))

# Change y axis scale to log2
ggpar(p, yscale = "log2")

# Format axis scale
ggpar(p, yscale = "log2", format.scale = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-axis-limits-scales-1.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-axis-limits-scales-2.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-axis-limits-scales-3.png" width="259.2" /></p>
</div>
<div id="axis-ticks-customize-tick-marks-and-labels" class="section level3">
<h3>Axis ticks: customize tick marks and labels</h3>
<p>The following arguments can be used:</p>
<br/>
<div class="block">
<ul>
<li><strong>ticks</strong>: logical value. Default is TRUE. If FALSE, hide axis tick marks.</li>
<li><strong>tickslab</strong>: logical value. Default is TRUE. If FALSE, hide axis tick labels.</li>
<li><strong>font.tickslab</strong>: Font style (size, face, color) for tick labels, e.g.: c(14, “bold”, “red”).</li>
<li><strong>xtickslab.rt, ytickslab.rt</strong>: Rotation angle of x and y axis tick labels, respectively. Default value is 0.</li>
<li><strong>xticks.by, yticks.by</strong>: numeric value controlling x and y axis breaks, respectively. For example, if yticks.by = 5, a tick mark is shown every 5 units. Default value is NULL.</li>
</ul>
</div>
<p><br/></p>
<pre class="r"><code># Axis tick labels style: "plain", "italic", "bold" or "bold.italic"
# Rotation angle = 45
ggpar(p, font.tickslab = c(12, "bold", "#2E9FDF"),
      xtickslab.rt = 45, ytickslab.rt = 45)

# Hide ticks and tickslab
ggpar(p, ticks = FALSE, tickslab = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-axis-ticks-1.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-axis-ticks-2.png" width="259.2" /></p>
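<p>The arguments <strong>xticks.by</strong> and <strong>yticks.by</strong> are not used in the code above. A minimal sketch follows, assuming a box plot of the built-in ToothGrowth data set; the object names <em>bxp</em> and <em>bxp_ticks</em> are hypothetical.</p>
<pre class="r"><code>library(ggpubr)

# Assumed example plot: tooth length by dose
bxp <- ggboxplot(ToothGrowth, x = "dose", y = "len")

# Show a y axis tick mark every 5 units
bxp_ticks <- ggpar(bxp, yticks.by = 5)
bxp_ticks</code></pre>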
</div>
<div id="themes" class="section level3">
<h3>Themes</h3>
<p>The R package <strong>ggpubr</strong> contains two main functions for changing the default ggplot theme to a publication ready theme:</p>
<ul>
<li><strong>theme_pubr</strong>(): Change the theme to a publication ready theme</li>
<li><strong>labs_pubr</strong>(): Format only plot labels to a publication ready style</li>
</ul>
<p><span class="success">theme_pubr() produces plots with bold axis labels, bold tick mark labels and a legend at the bottom, leaving extra space for the plotting area.</span></p>
<br/>
<div class="block">
The argument <strong>ggtheme</strong> can be used in any ggpubr plotting function to change the plot theme. The default value is <strong>theme_pubr</strong>(), a publication ready theme. Allowed values include the official ggplot2 themes: theme_gray(), theme_bw(), theme_minimal(), theme_classic(), theme_void(), etc. It’s also possible to add a theme with the <strong>“+”</strong> operator.
</div>
<p><br/></p>
<pre class="r"><code># Gray theme
p + theme_gray()

# Minimal theme
p + theme_minimal()

# Format only plot labels to a publication ready style
# by using the function labs_pubr()
p + theme_minimal() + labs_pubr(base_size = 16)</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-theme-1.png" width="307.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-theme-2.png" width="307.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-theme-3.png" width="307.2" /></p>
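<p>The <strong>ggtheme</strong> argument described above can also be set directly in the plotting call. A minimal sketch, assuming a box plot of the built-in ToothGrowth data set; the object name <em>bxp_bw</em> is hypothetical.</p>
<pre class="r"><code>library(ggpubr)
library(ggplot2)  # provides theme_bw()

# Set the theme when creating the plot, instead of adding it with "+"
bxp_bw <- ggboxplot(ToothGrowth, x = "dose", y = "len",
                    ggtheme = theme_bw())
bxp_bw</code></pre>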
</div>
<div id="rotate-a-plot" class="section level3">
<h3>Rotate a plot</h3>
<ul>
<li>Create some data</li>
</ul>
<pre class="r"><code>set.seed(1234)
wdata = data.frame(
   sex = factor(rep(c("F", "M"), each=200)),
   weight = c(rnorm(200, 55), rnorm(200, 58)))</code></pre>
<ul>
<li>Create a density plot and change plot orientation</li>
</ul>
<pre class="r"><code># Basic density plot
p <- ggdensity(wdata, x = "weight") + theme_gray()
p

# Horizontal plot
ggpar(p, orientation = "horizontal") + theme_gray()

# y axis reversed
ggpar(p, orientation = "reverse") + theme_gray()</code></pre>
<p><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-rotate-plot-1.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-rotate-plot-2.png" width="259.2" /><img src="https://www.sthda.com/english/sthda/RDoc/figure/r-packages/ggpubr/ggpubr-rotate-plot-3.png" width="259.2" /></p>
</div>
</div>
<div id="more" class="section level2">
<h2>More</h2>
<p><span class="success">See the online documentation (<a href="https://www.sthda.com/english/rpkgs/ggpubr" class="uri">https://www.sthda.com/english/rpkgs/ggpubr</a>) for a complete list.</span></p>
</div>
</div>
<div id="infos" class="section level1">
<h1>Infos</h1>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.2.4) and <strong>ggpubr</strong> (ver. 0.1.0.999) </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Sun, 24 Jul 2016 16:44:26 +0200</pubDate>
			
		</item>
		
	</channel>
</rss>
