<?xml version="1.0" encoding="UTF-8" ?>
<!-- RSS generated by PHPBoost on Tue, 14 Apr 2026 05:00:31 +0200 -->

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title><![CDATA[Last articles - STHDA : ggpubr: Publication Ready Plots]]></title>
		<atom:link href="https://www.sthda.com/english/syndication/rss/articles/24" rel="self" type="application/rss+xml"/>
		<link>https://www.sthda.com</link>
		<description><![CDATA[Last articles - STHDA : ggpubr: Publication Ready Plots]]></description>
		<copyright>(C) 2005-2026 PHPBoost</copyright>
		<language>en</language>
		<generator>PHPBoost</generator>
		
		
		<item>
			<title><![CDATA[Add Text Labels to Histogram and Density Plots]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/84-add-text-labels-to-histogram-and-density-plots/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/84-add-text-labels-to-histogram-and-density-plots/</guid>
			<description><![CDATA[<!-- START HTML -->

<div id="rdoc">

<p>In this article, we’ll explain how to create <strong>histograms</strong>/<strong>density plots</strong> with <strong>text labels</strong> using the ggpubr package.</p>
<p>I used this type of plot in my recent scientific publication entitled “<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5449613/pdf/gkx327.pdf">Global miRNA expression analysis identifies novel key regulators of plasma cell differentiation and malignant plasma cell</a>”, in the journal Nucleic Acids Research, where I was interested in visualizing the distribution of the citation index of some key genes (Figure 4A, A. Kassambara et al., NAR 2017). The plot was generated using the ggpubr package.</p>
<p>In the examples presented here, we’ll use the demo data set <strong>gene_citation</strong> [in ggpubr]. It contains the mean citation index of 66 genes, defined by assessing PubMed abstracts and annotations using two keyword queries: i) Gene name + b cell differentiation and ii) Gene name + plasma cell differentiation. The citation index of each gene is the average number of citations obtained with the two queries. Only genes with a mean citation index >= 3 are kept in the data.</p>
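<p>As a rough illustration of that computation, the sketch below averages two made-up query counts per gene and applies the same >= 3 cut-off. The gene names, counts and column names are hypothetical; they are not taken from gene_citation.</p>
<pre class="r"><code># Hypothetical PubMed hit counts for three made-up genes
counts <- data.frame(
  gene = c("GENE_A", "GENE_B", "GENE_C"),
  b_cell = c(10, 2, 5),        # hits for: gene name + b cell differentiation
  plasma_cell = c(4, 1, 1),    # hits for: gene name + plasma cell differentiation
  stringsAsFactors = FALSE
)

# Citation index = average of the two query counts
counts$citation_index <- rowMeans(counts[, c("b_cell", "plasma_cell")])

# Keep only genes with a mean citation index >= 3
kept <- counts[counts$citation_index >= 3, c("gene", "citation_index")]
kept</code></pre>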
<p>Bar plot of the gene citation index sorted in descending order:</p>
<pre class="r"><code>library(ggpubr)
# Load data
data(gene_citation)
head(gene_citation)</code></pre>
<pre><code>##      gene citation_index
## 2   CASP3           68.0
## 4    CDK6           10.5
## 7   CCND2           10.0
## 8     SCD            8.5
## 10 SLAMF6            4.5
## 11 BCL2L1           56.5</code></pre>
<pre class="r"><code>ggbarplot(gene_citation, x = "gene", y = "citation_index",
          fill = "lightgray", 
          xlab = "Gene name", ylab = "Citation index",
          sort.val = "desc", # Sort in descending order
          top = 20,          # Select the 20 most cited genes
          x.text.angle = 45  # x axis text rotation angle
          )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/050-add-text-to-histogram-and-density-plots-bar-plot-citation-index-1.png" width="480" /></p>
<p>The plot below shows the distribution of the citation index. Some key genes known to be involved in plasma cell differentiation are highlighted.</p>
<pre class="r"><code># Some key genes of interest to be highlighted
key.gns <- c("MYC", "PRDM1", "CD69", "IRF4", "CASP3",
             "BCL2L1", "MYB",  "BACH2", "BIM1",  "PTEN",
             "KRAS", "FOXP1", "IGF1R", "KLF4", "CDK6", "CCND2",
             "IGF1", "TNFAIP3", "SMAD3", "SMAD7",
             "BMPR2", "RB1", "IGF2R", "ARNT")
        
# Histogram distribution
gghistogram(gene_citation, x = "citation_index", y = "..count..",
            xlab = "Number of citation",
            ylab = "Number of genes",
            binwidth = 5, 
            fill = "lightgray", color = "black",
            label = "gene", label.select = key.gns, repel = TRUE,
            font.label = list(color= "citation_index"),
            xticks.by = 20, # Break x ticks by 20
            gradient.cols = c("blue", "red"),
            legend = c(0.7, 0.6),                                 
            legend.title = ""       # Hide legend title
            )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/050-add-text-to-histogram-and-density-plots-gene-citation-1.png" width="768" /></p>
<pre class="r"><code># Density distribution
ggdensity(gene_citation, x = "citation_index", y = "..count..",
            xlab = "Number of citation",
            ylab = "Number of genes",
            fill = "lightgray", color = "black",
            label = "gene", label.select = key.gns, repel = TRUE,
            font.label = list(color= "citation_index"),
            xticks.by = 20, # Break x ticks by 20
            gradient.cols = c("blue", "red"),
            legend = c(0.7, 0.6),                                 
            legend.title = ""       # Hide legend title
            )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/050-add-text-to-histogram-and-density-plots-gene-citation-2.png" width="768" /></p>
</div>


</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Sat, 02 Sep 2017 22:49:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Create and Customize Multi-panel ggplots: Easy Guide to Facet]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/83-create-and-customize-multi-panel-ggplots-easy-guide-to-facet/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/83-create-and-customize-multi-panel-ggplots-easy-guide-to-facet/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">



<p>This article describes how to split up your data by one or more variables and visualize the subsets together. The function <strong>facet</strong>() [in ggpubr] lets you draw <strong>multi-panel</strong> plots of a data set grouped by one or two variables. Additionally, we’ll show how to easily modify panel labels.</p>

<p><strong>Contents:</strong></p>
<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a></li>
<li><a href="#basic-plots">Basic plots</a></li>
<li><a href="#facet-by-one-grouping-variables">Facet by one grouping variable</a></li>
<li><a href="#facet-by-two-grouping-variables">Facet by two grouping variables</a></li>
<li><a href="#modifying-panel-label-appearance">Modifying panel label appearance</a></li>
</ul>
</div>

<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<p>Required R packages: ggpubr to easily create ggplot2-based publication ready plots.</p>
<p>Install from CRAN:</p>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<p>Or, install the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</p>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<p>Load ggpubr:</p>
<pre class="r"><code>library(ggpubr)</code></pre>
</div>
<div id="basic-plots" class="section level2">
<h2>Basic plots</h2>
<p>Demo data set:</p>
<pre class="r"><code>df <- ToothGrowth
df$dose <- as.factor(df$dose)
head(df)</code></pre>
<pre><code>##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5</code></pre>
<p>Plot:</p>
<pre class="r"><code>p <- ggdensity(df, x = "len", fill = "dose", 
               palette = "jco", 
               ggtheme = theme_light(), legend = "top")
p</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/045-facet-ggplot-basic-density-1.png" width="384" /></p>
</div>
<div id="facet-by-one-grouping-variables" class="section level2">
<h2>Facet by one grouping variable</h2>
<p>Divide by the levels of the <em>supp</em> variable in the horizontal direction:</p>
<pre class="r"><code>facet(p, facet.by = "supp")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/045-facet-ggplot-facet-by-one-variables-1.png" width="576" /></p>
<p>To stack the panels vertically, one per level of <em>supp</em>, use <em>ncol = 1</em>:</p>
<pre class="r"><code>facet(p, facet.by = "supp", ncol = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/045-facet-ggplot-facet-by-one-variables-vertical-1.png" width="240" /></p>
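<p>By default, all panels share the same axes. The sketch below assumes the <em>scales</em> argument of facet() (see <em>?facet</em>) behaves like the one in ggplot2 faceting, so that “free_y” gives each stacked panel its own y axis range. The object name <em>pf</em> is only used for illustration.</p>
<pre class="r"><code>library(ggpubr)

# Same demo data and density plot as above
df <- ToothGrowth
df$dose <- as.factor(df$dose)
p <- ggdensity(df, x = "len", fill = "dose", palette = "jco")

# One panel per level of supp, stacked vertically,
# with the y axis range chosen independently for each panel
pf <- facet(p, facet.by = "supp", ncol = 1, scales = "free_y")
pf</code></pre>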
</div>
<div id="facet-by-two-grouping-variables" class="section level2">
<h2>Facet by two grouping variables</h2>
<p>The data can be split up by one or two variables that vary in the horizontal and/or vertical direction.</p>
<p>For example, in <strong>facet.by = c(“supp”, “dose”)</strong>:</p>
<ul>
<li>“supp”, the first variable, will be displayed in vertical direction</li>
<li>“dose”, the second variable, will be displayed in horizontal direction.</li>
</ul>
<pre class="r"><code># Divide with "supp" vertical, "dose" horizontal
facet(p, facet.by = c("supp", "dose"),
      short.panel.labs = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/045-facet-ggplot-facet-by-two-variables-1.png" width="672" /></p>
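<p>For readers who know ggplot2, a roughly comparable layout can be sketched with facet_grid(), where the variable on the left of the formula defines the rows and the one on the right defines the columns. This is only an analogy written in plain ggplot2, not a description of how facet() is implemented; the object name <em>p_gg</em> is illustrative.</p>
<pre class="r"><code>library(ggplot2)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# supp varies over rows (vertical direction), dose over columns (horizontal)
# label_both prints "variable: level" in the strips, similar in spirit
# to short.panel.labs = FALSE
p_gg <- ggplot(df, aes(x = len, fill = dose)) +
  geom_density(alpha = 0.5) +
  facet_grid(supp ~ dose, labeller = label_both)
p_gg</code></pre>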
</div>
<div id="modifying-panel-label-appearance" class="section level2">
<h2>Modifying panel label appearance</h2>
<p>Additional arguments are available to customize the appearance of panel labels (see <em>?facet</em>). These include:</p>
<div class="block">
<ul>
<li>
<p>
<strong>short.panel.labs</strong>: logical value. If TRUE, create short labels for panels by omitting variable names; in other words panels will be labelled only by variable grouping levels.
</p>
</li>
<li>
<p>
<strong>panel.labs</strong>: a list of one or two character vectors to modify facet label text. For example, panel.labs = list(sex = c(“Male”, “Female”)) specifies the labels for the “sex” variable. For two grouping variables, you can use for example panel.labs = list(sex = c(“Male”, “Female”), rx = c(“Obs”, “Lev”, “Lev2”) ).
</p>
</li>
<li>
<strong>panel.labs.background</strong>: a list of aesthetics to customize the background of panel labels. Should contain the combination of the following elements:
<ul>
<li>
<em>color, linetype, size</em>: background line color, type and size
</li>
<li>
<em>fill</em>: background fill color. For example, panel.labs.background = list(color = “blue”, fill = “pink”).
</li>
</ul>
</li>
<li>
<p>
<strong>panel.labs.font</strong>: a list of aesthetics indicating the size (e.g.: 14), the face/style (e.g.: “plain”, “bold”, “italic”, “bold.italic”), the color (e.g.: “red”) and the orientation angle (e.g.: 45) of panel labels. Use <strong>panel.labs.font.x</strong> and <strong>panel.labs.font.y</strong> to customize only labels in the x direction and y direction, respectively.
</p>
</li>
</ul>
</div>
<pre class="r"><code># Divide with "supp" vertical, "dose" horizontal
facet(p, facet.by = c("supp", "dose"),
       panel.labs = list(
         supp = c("Orange Juice", "Vitamin C"),
         dose = c("D0.5", "D1", "D2")
         ),
       panel.labs.background = list(color = "steelblue", fill = "steelblue", size = 0.5),
       panel.labs.font = list(color = "white"),
       panel.labs.font.x = list(angle = 45, color = "white")
      )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/045-facet-ggplot-label-appearance-1.png" width="672" /></p>
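<p>The same arguments also apply when faceting by a single variable. Here is a minimal sketch that relabels only the <em>supp</em> panels and makes the label text bold; the label strings and the object name <em>pf1</em> are arbitrary choices for illustration.</p>
<pre class="r"><code>library(ggpubr)

df <- ToothGrowth
df$dose <- as.factor(df$dose)
p <- ggdensity(df, x = "len", fill = "dose", palette = "jco")

# Relabel the two levels of supp (OJ, VC) and use a bold 12-point font
pf1 <- facet(p, facet.by = "supp",
             panel.labs = list(supp = c("Orange Juice", "Vitamin C")),
             panel.labs.font = list(size = 12, face = "bold"))
pf1</code></pre>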
</div>
</div>


</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Sat, 02 Sep 2017 19:49:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[ggplot2 - Easy Way to Change Graphical Parameters]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/82-ggplot2-easy-way-to-change-graphical-parameters/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/82-ggplot2-easy-way-to-change-graphical-parameters/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">

<p>This article describes the function <strong>ggpar</strong>() [in ggpubr], which can be used to simply and easily customize any <strong>ggplot2</strong>-based graph. The <strong>graphical parameters</strong> that can be changed using ggpar() include:</p>
<ul>
<li>Main titles, axis labels and legend titles</li>
<li>Legend position and appearance</li>
<li>Colors</li>
<li>Axis limits</li>
<li>Axis transformations: log and sqrt</li>
<li>Axis ticks</li>
<li>Themes</li>
<li>Rotate a plot</li>
</ul>
<div class="warning">
<p>
Note that all the arguments accepted by the function ggpar() can also be passed directly to the plotting functions in the ggpubr package, such as <em>ggboxplot</em>(), <em>ggdotplot</em>(), <em>ggscatter</em>(), …
</p>
</div>

<p><strong>Contents:</strong></p>
<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a></li>
<li><a href="#basic-plots">Basic plots</a></li>
<li><a href="#change-titles-and-axis-labels">Change titles and axis labels</a></li>
<li><a href="#change-legend-position-appearance">Change legend position &amp; appearance</a></li>
<li><a href="#change-color-palettes">Change color palettes</a><ul>
<li><a href="#group-colors">Group colors</a></li>
<li><a href="#gradient-colors">Gradient colors</a></li>
</ul></li>
<li><a href="#change-axis-limits-and-scales">Change axis limits and scales</a></li>
<li><a href="#customize-axis-text-and-ticks">Customize axis text and ticks</a></li>
<li><a href="#rotate-a-plot">Rotate a plot</a></li>
<li><a href="#change-themes">Change themes</a></li>
<li><a href="#remove-ggplot-components">Remove ggplot components</a></li>
</ul>
</div>

<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<p>Install <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr (version >= 0.1.4)</a>, for easily creating ggplot2-based publication ready plots.</p>
<p>Install from CRAN:</p>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<p>Or, install the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</p>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<p>Load ggpubr:</p>
<pre class="r"><code>library(ggpubr)</code></pre>
</div>
<div id="basic-plots" class="section level2">
<h2>Basic plots</h2>
<p>We start by creating a basic box plot colored by groups. To add grid lines, a panel border line and a background color, we’ll use the helper functions <strong>grids</strong>(), <strong>border</strong>() and <strong>bgcolor</strong>() [in ggpubr].</p>
<pre class="r"><code># Basic plot
p <- ggboxplot(ToothGrowth, x = "dose", y = "len",
               color = "dose")
p

# Add grids
p + grids(linetype = "dashed")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-add-grids-1.png" width="259.2" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-add-grids-2.png" width="259.2" /></p>
<pre class="r"><code># Add panel border line
p + border("black")
  
# Change background color
p + bgcolor("#BFD5E3") +
  border("#BFD5E3") </code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-panel-border-and-bacground-color-1.png" width="259.2" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-panel-border-and-bacground-color-2.png" width="259.2" /></p>
</div>
<div id="change-titles-and-axis-labels" class="section level2">
<h2>Change titles and axis labels</h2>
<p>Change plot titles and labels as follows:</p>
<pre class="r"><code># Change titles and axis labels
p2 <- ggpar(p, 
            title = "Box Plot created with ggpubr",
            subtitle = "Length by dose",
            caption = "Source: ggpubr",
            xlab ="Dose (mg)", 
            ylab = "Teeth length",
            legend.title = "Dose (mg)")
p2</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-change-titles-and-axis-labels-1.png" width="336" /></p>
<p>Change the font/appearance of plot titles and labels. Use <strong>ggpar</strong>():</p>
<pre class="r"><code>ggpar(p2, 
      font.title = c(14, "bold.italic", "red"),
      font.subtitle = c(10,  "orange"),
      font.caption = c(10,  "orange"),
      font.x = c(14,  "blue"),
      font.y = c(14,  "#993333")
      )</code></pre>
<p>or, equivalently, use <strong>font</strong>():</p>
<pre class="r"><code>p2 +
 font("title", size = 14, color = "red", face = "bold.italic")+
 font("subtitle", size = 10, color = "orange")+
 font("caption", size = 10, color = "orange")+
 font("xlab", size = 14, color = "blue")+
 font("ylab", size = 14, color = "#993333")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-change-apperance-titles-and-axis-labels-1.png" width="336" /></p>
<p>Note that you can change the text and the appearance of titles/labels at once using the function <strong>ggpar</strong>() as follows:</p>
<pre class="r"><code># Change title texts and fonts
# line break: \n
ggpar(p, title = "Plot of length \n by dose",
      xlab ="Dose (mg)", ylab = "Teeth length",
      legend.title = "Dose (mg)",
      font.title = c(14,"bold.italic", "red"),
      font.x = c(14, "bold", "#2E9FDF"),
      font.y = c(14, "bold", "#E7B800"))</code></pre>
<div class="block">
<p>
Note that,
</p>
<ul>
<li>
<p>
<strong>font.title, font.subtitle, font.caption, font.x, font.y</strong> are vectors of length 3 giving, in order, the size (e.g.: 14), the style (e.g.: “plain”, “bold”, “italic”, “bold.italic”) and the color (e.g.: “red”) of the main title, subtitle, caption, xlab and ylab, respectively. For example font.x = c(14, “bold”, “red”). Use font.x = 14 to change only the font size, or font.x = “bold” to change only the font face.
</p>
</li>
<li>
<p>
You can use line breaks (\n) to split a long title into multiple lines.
</p>
</li>
</ul>
</div>
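<p>As a small sketch of the shorter forms mentioned above, the calls below change only the size, or only the face, of the x axis label. The object names <em>p_size</em> and <em>p_face</em> are illustrative.</p>
<pre class="r"><code>library(ggpubr)

p <- ggboxplot(ToothGrowth, x = "dose", y = "len", color = "dose")

# Change only the font size of the x axis label
p_size <- ggpar(p, xlab = "Dose (mg)", font.x = 14)

# Change only the font face of the x axis label
p_face <- ggpar(p, xlab = "Dose (mg)", font.x = "bold")

p_size
p_face</code></pre>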
</div>
<div id="change-legend-position-appearance" class="section level2">
<h2>Change legend position &amp; appearance</h2>
<pre class="r"><code>ggpar(p,
      legend = "right", legend.title = "Dose (mg)") + 
  font("legend.title", color = "blue", face = "bold")+ 
  font("legend.text", color = "red")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-legend-position-appearance-1.png" width="307.2" /></p>
<div class="block">
<p>
Note that the <em>legend</em> argument is a character vector specifying the legend position. Allowed values are one of c(“top”, “bottom”, “left”, “right”, “none”). To remove the legend, use legend = “none”. The legend position can also be specified using a numeric vector c(x, y), whose values should be between 0 and 1: c(0,0) corresponds to the “bottom left” and c(1,1) to the “top right” position.
</p>
</div>
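<p>A minimal sketch of the numeric form follows: it places the legend inside the plotting area, near the bottom right corner. The coordinates are an arbitrary choice and <em>p_in</em> is an illustrative name. Depending on the installed ggplot2 version, a deprecation notice about numeric legend positions may be printed.</p>
<pre class="r"><code>library(ggpubr)

p <- ggboxplot(ToothGrowth, x = "dose", y = "len", color = "dose")

# x = 0.9 (towards the right), y = 0.2 (towards the bottom)
p_in <- ggpar(p, legend = c(0.9, 0.2), legend.title = "Dose (mg)")
p_in

# Remove the legend entirely
p_none <- ggpar(p, legend = "none")
p_none</code></pre>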
</div>
<div id="change-color-palettes" class="section level2">
<h2>Change color palettes</h2>
<div id="group-colors" class="section level3">
<h3>Group colors</h3>
<p>The argument <strong>palette</strong> (in the ggpar() function) can be used to change group color palettes. Allowed values include:</p>
<ul>
<li>Custom color palettes e.g. c(“blue”, “red”) or c(“#00AFBB”, “#E7B800”);</li>
<li>“grey” for grey color palettes;</li>
<li>brewer palettes e.g. “RdBu”, “Blues”, …; To view all, type this in R: <em>RColorBrewer::display.brewer.all()</em>.</li>
<li>and scientific journal palettes from <a href="https://cran.r-project.org/web/packages/ggsci/vignettes/ggsci.html">ggsci R package</a>, e.g.: “npg”, “aaas”, “lancet”, “jco”, “ucscgb”, “uchicago”, “simpsons” and “rickandmorty”.</li>
</ul>
<pre class="r"><code># Use custom color palette
ggpar(p, palette = c("#00AFBB", "#E7B800", "#FC4E07"))

# Use brewer palette. 
# Type RColorBrewer::display.brewer.all(), to see possible palettes
ggpar(p, palette = "Dark2" )

# Use grey palette
ggpar(p, palette = "grey")
   
# Use scientific journal palette from ggsci package
# Allowed values: "npg", "aaas", "lancet", "jco", 
#   "ucscgb", "uchicago", "simpsons" and "rickandmorty".
ggpar(p, palette = "npg") # nature</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-color-1.png" width="259.2" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-color-2.png" width="259.2" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-color-3.png" width="259.2" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-color-4.png" width="259.2" /></p>
<p>Alternatively, you can directly use the functions <strong>color_palette</strong>() and <strong>fill_palette</strong>() [in ggpubr] as follows:</p>
<pre class="r"><code># jco color palette
p + color_palette("jco")

# Custom color
p + color_palette(c("#00AFBB", "#E7B800", "#FC4E07"))</code></pre>
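<p>The example above only changes outline colors. As a sketch of the fill counterpart, the code below builds a box plot whose boxes are filled by group and then applies fill_palette(). The object name <em>pfill</em> is illustrative.</p>
<pre class="r"><code>library(ggpubr)

# Box plot filled (rather than outlined) by dose
pfill <- ggboxplot(ToothGrowth, x = "dose", y = "len", fill = "dose")

# jco fill palette
pfill + fill_palette("jco")

# Custom fill colors
pfill + fill_palette(c("#00AFBB", "#E7B800", "#FC4E07"))</code></pre>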
</div>
<div id="gradient-colors" class="section level3">
<h3>Gradient colors</h3>
<p>To easily change gradient colors, the ggpubr package provides the functions <strong>gradient_color</strong>() and <strong>gradient_fill</strong>().</p>
<p>For example, start by creating a scatter plot colored according to the values of a continuous variable “mpg”.</p>
<pre class="r"><code>p3 <- ggscatter(mtcars, x = "wt", y = "mpg", color = "mpg",
                size = 2)</code></pre>
<p>Change gradient color:</p>
<pre class="r"><code># Use one custom color
p3 + gradient_color("red")

# Two colors
p3 + gradient_color(c("blue",  "red"))

# Three colors
p3 + gradient_color(c("blue", "white", "red"))

# Use RColorBrewer palette
p3 + gradient_color("RdYlBu")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-gradient-colors-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-gradient-colors-2.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-gradient-colors-3.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-gradient-colors-4.png" width="336" /></p>
<p>For <strong>gradient_fill</strong>(), the same syntax holds true. Start by creating a scatter plot filled by a continuous variable. Then, change the gradient color.</p>
<pre class="r"><code>p4 <- ggscatter(mtcars, x = "wt", y = "mpg", fill = "mpg",
                size = 4, shape = 21)
p4 + gradient_fill(c("blue", "white", "red"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-gradient-fill-1.png" width="384" /></p>
</div>
</div>
<div id="change-axis-limits-and-scales" class="section level2">
<h2>Change axis limits and scales</h2>
<p>The following arguments can be used in <strong>ggpar</strong>():</p>
<div class="block">
<ul>
<li>
<strong>xlim, ylim</strong>: a numeric vector of length 2, specifying x and y axis limits (minimum and maximum values), respectively. e.g.: ylim = c(0, 50).
</li>
<li>
<strong>xscale, yscale</strong>: x and y axis scale, respectively. Allowed values are one of c(“none”, “log2”, “log10”, “sqrt”); e.g.: yscale=“log2”.
</li>
<li>
<strong>format.scale</strong>: logical value. If TRUE, axis tick mark labels will be formatted when xscale or yscale = “log2” or “log10”.
</li>
</ul>
</div>
<pre class="r"><code># Change y axis limits
ggpar(p, ylim = c(0, 50))

# Change y axis scale to log2
ggpar(p, yscale = "log2")

# Format axis scale
ggpar(p, yscale = "log2", format.scale = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-axis-limits-scales-1.png" width="240" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-axis-limits-scales-2.png" width="240" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-axis-limits-scales-3.png" width="240" /></p>
<p>Alternatively, you can also use the functions <strong>xscale</strong>() and <strong>yscale</strong>() [in ggpubr], as follows:</p>
<pre class="r"><code>p + yscale("log2", .format = TRUE)</code></pre>
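<p>The box plot above has a discrete x axis, so only its y axis can be transformed. To sketch the x axis counterpart, the example below uses a scatter plot of two continuous variables from mtcars; the object name <em>p5</em> is illustrative.</p>
<pre class="r"><code>library(ggpubr)

# Scatter plot with continuous x and y axes
p5 <- ggscatter(mtcars, x = "wt", y = "mpg")

# Transform both axes through ggpar()
p5_log <- ggpar(p5, xscale = "log2", yscale = "log2", format.scale = TRUE)
p5_log

# Or add the transformations with xscale() and yscale()
p5 + xscale("log2", .format = TRUE) + yscale("log2", .format = TRUE)</code></pre>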
</div>
<div id="customize-axis-text-and-ticks" class="section level2">
<h2>Customize axis text and ticks</h2>
<pre class="r"><code># Change the font of x and y axis texts.
# Rotate x and y texts, angle = 45
p + 
  font("xy.text", size = 12, color = "blue", face = "bold") +
  rotate_x_text(45)+       
  rotate_y_text(45)

# remove ticks and axis texts
p + rremove("ticks")+
  rremove("axis.text")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-axis-text-and-axis-ticks-1.png" width="240" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-axis-text-and-axis-ticks-2.png" width="240" /></p>
</div>
<div id="rotate-a-plot" class="section level2">
<h2>Rotate a plot</h2>
<pre class="r"><code># Horizontal box plot
p + rotate()</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-rotate-plot-1.png" width="259.2" /></p>
</div>
<div id="change-themes" class="section level2">
<h2>Change themes</h2>
<p>The default theme in ggpubr is <strong>theme_pubr</strong>(), a publication-ready theme.</p>
<p>The argument <strong>ggtheme</strong> can be used in any ggpubr plotting function to change the plot theme.</p>
<p>Allowed values include the official ggplot2 themes: theme_gray(), theme_bw(), theme_minimal(), theme_classic(), theme_void(), etc. It’s also possible to use the <strong>“+”</strong> operator to add a theme.</p>
<pre class="r"><code># Gray theme
p + theme_gray()

# Black and white theme
p + theme_bw()</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-theme-1.png" width="240" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-theme-2.png" width="240" /></p>
<pre class="r"><code># Theme light
p + theme_light()

# Minimal theme
p + theme_minimal()

# Empty theme
p + theme_void()</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-theme--1.png" width="240" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-theme--2.png" width="240" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-theme--3.png" width="240" /></p>
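<p>The examples above add the theme with “+”. As a sketch of the other route, the code below sets the theme through the <em>ggtheme</em> argument when the plot is created. The object name <em>p_min</em> is illustrative.</p>
<pre class="r"><code>library(ggpubr)

# Set the theme at creation time instead of adding it afterwards
p_min <- ggboxplot(ToothGrowth, x = "dose", y = "len", color = "dose",
                   ggtheme = theme_minimal())
p_min

# Going back to the ggpubr default theme
p_min + theme_pubr()</code></pre>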
</div>
<div id="remove-ggplot-components" class="section level2">
<h2>Remove ggplot components</h2>
<p>The function <strong>rremove</strong>() [in ggpubr] can be used to remove a specific component from a ggplot.</p>
<p>Usage:</p>
<pre class="r"><code>rremove(object)</code></pre>
<p><em>object</em>: character string specifying the plot component to remove. Allowed values include:</p>
<ul>
<li>“grid” for both x and y grids</li>
<li>“x.grid” for x axis grids</li>
<li>“y.grid” for y axis grids</li>
<li>“axis” for both x and y axes</li>
<li>“x.axis” for x axis</li>
<li>“y.axis” for y axis</li>
<li>“xlab”, or “x.title” for x axis label</li>
<li>“ylab”, or “y.title” for y axis label</li>
<li>“xylab”, “xy.title” or “axis.title” for both x and y axis labels</li>
<li>“x.text” for x axis texts (x axis tick labels)</li>
<li>“y.text” for y axis texts (y axis tick labels)</li>
<li>“xy.text” or “axis.text” for both x and y axis texts</li>
<li>“ticks” for both x and y ticks</li>
<li>“x.ticks” for x ticks</li>
<li>“y.ticks” for y ticks</li>
<li>“legend.title” for the legend title</li>
<li>“legend” for the legend</li>
</ul>
<p>Examples:</p>
<pre class="r"><code># Basic plot
p <- ggboxplot(ToothGrowth, x = "dose", y = "len",
  ggtheme = theme_gray())
p

# Remove all grids
p + rremove("grid")

# Remove only x grids
p + rremove("x.grid")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-remove-ggplot-components-1.png" width="288" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-remove-ggplot-components-2.png" width="288" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/035-graphical-parameters-remove-ggplot-components-3.png" width="288" /></p>
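<p>Several calls to rremove() can be chained. The sketch below uses a box plot colored by group, so that a legend exists, and then drops the legend and the x axis label using values from the list above. The object names are illustrative.</p>
<pre class="r"><code>library(ggpubr)

# Box plot colored by dose, which produces a legend
pl <- ggboxplot(ToothGrowth, x = "dose", y = "len", color = "dose")

# Remove the legend and the x axis label
p_clean <- pl + rremove("legend") + rremove("xlab")
p_clean</code></pre>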
</div>
</div>


</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Fri, 01 Sep 2017 23:47:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[ggplot2 - Easy Way to Mix Multiple Graphs on The Same Page]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/81-ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/81-ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">


<p>To arrange <strong>multiple</strong> <strong>ggplot2</strong> graphs on the same page, the standard R functions - <em>par()</em> and <em>layout()</em> - cannot be used.</p>
<p>The basic solution is to use the <a href="https://github.com/baptiste/gridextra/wiki/arrangeGrob"><strong>gridExtra</strong></a> R package, which comes with the following functions:</p>
<ul>
<li><em>grid.arrange</em>() and <em>arrangeGrob</em>() to arrange multiple ggplots on one page</li>
<li><em>marrangeGrob</em>() for arranging multiple ggplots over multiple pages.</li>
</ul>
<p>However, these functions make no attempt at aligning the plot panels; instead, the plots are simply placed into the grid as they are, and so the axes are not aligned.</p>
<p>If axis alignment is required, you can switch to the <a href="https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html"><strong>cowplot</strong></a> package, which includes the function <strong>plot_grid</strong>() with the argument <em>align</em>. However, the cowplot package doesn’t contain any solution for multi-page layouts. Therefore, we provide the function <strong>ggarrange</strong>() [in ggpubr], a wrapper around the plot_grid() function, to arrange multiple ggplots over multiple pages. It can also create a single common legend for multiple plots.</p>
<div class="block">
<p>
This article will show you, step by step, how to combine multiple <strong>ggplots</strong> on the same page, as well as over multiple pages, using helper functions available in the following R packages: <a href="https://www.sthda.com/english/rpkgs/ggpubr/index.html"><strong>ggpubr</strong> R package</a>, <strong>cowplot</strong> and <strong>gridExtra</strong>. We’ll also describe how to export the arranged plots to a file.
</p>
</div>
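<p>As a first taste, here is a minimal sketch of ggarrange() that places two plots side by side, aligns them horizontally and shares one legend. The two plots and the object names (<em>bxp</em>, <em>dp</em>, <em>fig</em>) are chosen only for illustration.</p>
<pre class="r"><code>library(ggpubr)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Two plots of the same data
bxp <- ggboxplot(df, x = "dose", y = "len", color = "dose", palette = "jco")
dp <- ggdotplot(df, x = "dose", y = "len", color = "dose", palette = "jco",
                binwidth = 1)

# Arrange side by side, align horizontally, use one common legend
fig <- ggarrange(bxp, dp, ncol = 2, labels = c("A", "B"),
                 align = "h", common.legend = TRUE, legend = "bottom")
fig</code></pre>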
<p><img src="https://www.sthda.com/english/sthda-upload/images/ggpubr/arrange-multiple-ggplots.png" alt="Arrange multiple ggplots" /></p>
<p>Related articles:</p>
<div class="notice">
<ul>
<li>
<a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/80-bar-plots-and-modern-alternatives/">Bar Plots and Modern Alternatives</a>
</li>
<li>
<a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/77-facilitating-exploratory-data-visualization-application-to-tcga-genomic-data/">Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data</a>
</li>
<li>
<a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/76-add-p-values-and-significance-levels-to-ggplots/">Add P-values and Significance Levels to ggplots</a>
</li>
</ul>
</div>
<p><strong>Contents</strong>:</p>
<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a><ul>
<li><a href="#required-r-package">Required R package</a></li>
<li><a href="#demo-data-sets">Demo data sets</a></li>
</ul></li>
<li><a href="#create-some-plots">Create some plots</a></li>
<li><a href="#arrange-on-one-page">Arrange on one page</a></li>
<li><a href="#annotate-the-arranged-figure">Annotate the arranged figure</a></li>
<li><a href="#align-plot-panels">Align plot panels</a></li>
<li><a href="#change-columnrow-span-of-a-plot">Change column/row span of a plot</a><ul>
<li><a href="#use-ggpubr-r-package">Use ggpubr R package</a></li>
<li><a href="#use-cowplot-r-package">Use cowplot R package</a></li>
<li><a href="#use-gridextra-r-package">Use gridExtra R package</a></li>
<li><a href="#use-grid-r-package">Use grid R package</a></li>
</ul></li>
<li><a href="#use-common-legend-for-combined-ggplots">Use common legend for combined ggplots</a></li>
<li><a href="#scatter-plot-with-marginal-density-plots">Scatter plot with marginal density plots</a></li>
<li><a href="#mix-table-text-and-ggplot">Mix table, text and ggplot2 graphs</a></li>
<li><a href="#insert-a-graphical-element-inside-a-ggplot">Insert a graphical element inside a ggplot</a><ul>
<li><a href="#place-a-table-within-a-ggplot">Place a table within a ggplot</a></li>
<li><a href="#place-a-box-plot-within-a-ggplot">Place a box plot within a ggplot</a></li>
<li><a href="#add-background-image-to-ggplot2-graphs">Add background image to ggplot2 graphs</a></li>
</ul></li>
<li><a href="#arrange-over-multiple-pages">Arrange over multiple pages</a></li>
<li><a href="#nested-layout-with-ggarrange">Nested layout with ggarrange()</a></li>
<li><a href="#export-plots">Export plots</a></li>
<li><a href="#acknoweledgment">Acknowledgment</a></li>
</ul>
</div>

<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="required-r-package" class="section level3">
<h3>Required R package</h3>
<p>You need to install the R package <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr (version >= 0.1.3)</a>, to easily create ggplot2-based publication ready plots.</p>
<p>We recommend installing the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</p>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<p>If the installation from GitHub fails, install the package from <a href="https://cran.r-project.org/package=ggpubr">CRAN</a> as follows:</p>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<div class="warning">
<p>
Note that, the installation of <strong>ggpubr</strong> will automatically install the <strong>gridExtra</strong> and the <strong>cowplot</strong> package; so you don’t need to re-install them.
</p>
</div>
<p>Load ggpubr:</p>
<pre class="r"><code>library(ggpubr)</code></pre>
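<p>Before running the examples, it can be useful to confirm that the installed version meets the minimum stated above. A minimal sketch, using base R’s <em>packageVersion</em>():</p>
<pre class="r"><code># Compare the installed ggpubr version against the minimum required (0.1.3);
# the result is TRUE when the installed version is recent enough
packageVersion("ggpubr") >= "0.1.3"</code></pre>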
</div>
<div id="demo-data-sets" class="section level3">
<h3>Demo data sets</h3>
<p>Data: ToothGrowth and mtcars data sets.</p>
<pre class="r"><code># ToothGrowth
data("ToothGrowth")
head(ToothGrowth)</code></pre>
<pre><code>##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5</code></pre>
<pre class="r"><code># mtcars 
data("mtcars")
mtcars$name <- rownames(mtcars)
mtcars$cyl <- as.factor(mtcars$cyl)
head(mtcars[, c("name", "wt", "mpg", "cyl")])</code></pre>
<pre><code>##                                name   wt  mpg cyl
## Mazda RX4                 Mazda RX4 2.62 21.0   6
## Mazda RX4 Wag         Mazda RX4 Wag 2.88 21.0   6
## Datsun 710               Datsun 710 2.32 22.8   4
## Hornet 4 Drive       Hornet 4 Drive 3.21 21.4   6
## Hornet Sportabout Hornet Sportabout 3.44 18.7   8
## Valiant                     Valiant 3.46 18.1   6</code></pre>
</div>
</div>
<div id="create-some-plots" class="section level2">
<h2>Create some plots</h2>
<p>Here, we’ll use ggplot2-based plotting functions available in <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr</a>. You can use any ggplot2 functions to create the plots that you want for arranging them later.</p>
<p>We’ll start by creating 4 different plots:</p>
<ul>
<li>Box plots and dot plots using the <em>ToothGrowth</em> data set</li>
<li>Bar plots and scatter plots using the <em>mtcars</em> data set</li>
</ul>
<p>You’ll learn how to combine these plots in the next sections using specific functions.</p>
<ul>
<li><strong>Create a box plot and a dot plot</strong>:</li>
</ul>
<pre class="r"><code># Box plot (bxp)
bxp <- ggboxplot(ToothGrowth, x = "dose", y = "len",
                 color = "dose", palette = "jco")
bxp

# Dot plot (dp)
dp <- ggdotplot(ToothGrowth, x = "dose", y = "len",
                 color = "dose", palette = "jco", binwidth = 1)
dp</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-box-dot-plot-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-box-dot-plot-2.png" width="336" /></p>
<ul>
<li><strong>Create an ordered bar plot and a scatter plot</strong>:</li>
</ul>
<p>Create ordered bar plots. Change the fill color by the grouping variable “cyl”. With <em>sort.by.groups = TRUE</em>, as in the code below, the bars are sorted within each group; set it to FALSE to sort globally instead.</p>
<pre class="r"><code># Bar plot (bp)
bp <- ggbarplot(mtcars, x = "name", y = "mpg",
          fill = "cyl",               # change fill color by cyl
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. See ?ggpar
          sort.val = "asc",           # Sort the value in ascending order
          sort.by.groups = TRUE,      # Sort inside each group
          x.text.angle = 90           # Rotate vertically x axis texts
          )
bp + font("x.text", size = 8)

# Scatter plots (sp)
sp <- ggscatter(mtcars, x = "wt", y = "mpg",
                add = "reg.line",               # Add regression line
                conf.int = TRUE,                # Add confidence interval
                color = "cyl", palette = "jco", # Color by groups "cyl"
                shape = "cyl"                   # Change point shape by groups "cyl"
                )+
  stat_cor(aes(color = cyl), label.x = 3)       # Add correlation coefficient
sp</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-ordered-bar-plots-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-ordered-bar-plots-2.png" width="336" /></p>
</div>
<div id="arrange-on-one-page" class="section level2">
<h2>Arrange on one page</h2>
<p>To arrange multiple ggplots on a single page, we’ll use the function <strong>ggarrange</strong>() [in <strong>ggpubr</strong>], which is a wrapper around the function <em>plot_grid</em>() [in the <em>cowplot</em> package]. Compared to the standard function <em>plot_grid</em>(), <strong>ggarrange</strong>() can arrange multiple ggplots over multiple pages.</p>
<pre class="r"><code>ggarrange(bxp, dp, bp + rremove("x.text"), 
          labels = c("A", "B", "C"),
          ncol = 2, nrow = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-ggarrange-1.png" width="576" /></p>
<p>Alternatively, you can also use the function <strong>plot_grid</strong>() [in <strong>cowplot</strong>]:</p>
<pre class="r"><code>library("cowplot")
plot_grid(bxp, dp, bp + rremove("x.text"), 
          labels = c("A", "B", "C"),
          ncol = 2, nrow = 2)</code></pre>
<p>or, the function <strong>grid.arrange</strong>() [in <strong>gridExtra</strong>]:</p>
<pre class="r"><code>library("gridExtra")
grid.arrange(bxp, dp, bp + rremove("x.text"), 
             ncol = 2, nrow = 2)</code></pre>
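<p>When the plots are already stored in a list, they can be passed through the <em>plotlist</em> argument of ggarrange() instead of being listed one by one. The sketch below is illustrative only: the object names (plot_list, fig) are hypothetical, and the relative column widths are arbitrary.</p>
<pre class="r"><code>library(ggpubr)
data("ToothGrowth")

# Store the plots in a list (hypothetical name: plot_list)
plot_list <- list(
  ggboxplot(ToothGrowth, x = "dose", y = "len",
            color = "dose", palette = "jco"),
  ggdotplot(ToothGrowth, x = "dose", y = "len",
            color = "dose", palette = "jco", binwidth = 1)
)

# Arrange the list on one row; the second column is 1.5 times wider
fig <- ggarrange(plotlist = plot_list, labels = c("A", "B"),
                 ncol = 2, nrow = 1, widths = c(1, 1.5))
fig</code></pre>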
</div>
<div id="annotate-the-arranged-figure" class="section level2">
<h2>Annotate the arranged figure</h2>
<p>R function: <strong>annotate_figure</strong>() [in ggpubr].</p>
<pre class="r"><code>figure <- ggarrange(sp, bp + font("x.text", size = 10),
                    ncol = 1, nrow = 2)

annotate_figure(figure,
                top = text_grob("Visualizing mpg", color = "red", face = "bold", size = 14),
                bottom = text_grob("Data source: \n mtcars data set", color = "blue",
                                   hjust = 1, x = 1, face = "italic", size = 10),
                left = text_grob("Figure arranged using ggpubr", color = "green", rot = 90),
                right = "I'm done, thanks :-)!",
                fig.lab = "Figure 1", fig.lab.face = "bold"
                )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-annotate-arranged-ggplots-1.png" width="576" /></p>
<div class="warning">
<p>
Note that, the function annotate_figure() supports any ggplots.
</p>
</div>
</div>
<div id="align-plot-panels" class="section level2">
<h2>Align plot panels</h2>
<p>A real use case is, for example, when plotting <a href="https://www.sthda.com/english/wiki/survival-analysis-basics">survival curves</a> with the risk table placed under the main plot.</p>
<p>To illustrate this case, we’ll use the survminer package. First, install it using <em>install.packages("survminer")</em>, then type this:</p>
<pre class="r"><code># Fit survival curves
library(survival)
fit <- survfit( Surv(time, status) ~ adhere, data = colon )

# Plot survival curves
library(survminer)
ggsurv <- ggsurvplot(fit, data = colon, 
                     palette = "jco",                              # jco palette
                     pval = TRUE, pval.coord = c(500, 0.4),        # Add p-value
                     risk.table = TRUE                            # Add risk table
                     )
names(ggsurv)</code></pre>
<pre><code>## [1] "plot"           "table"          "data.survplot"  "data.survtable"</code></pre>
<p>ggsurv is a list including the following components:</p>
<ul>
<li><em>plot</em>: survival curves</li>
<li><em>table</em>: the risk table plot</li>
</ul>
<p>You can arrange the survival plot and the risk table as follows:</p>
<pre class="r"><code>ggarrange(ggsurv$plot, ggsurv$table, heights = c(2, 0.7),
          ncol = 1, nrow = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-align-plot-panels-1.png" width="576" /></p>
<div class="warning">
<p>
It can be seen that the axes of the survival plot and the risk table are not aligned vertically. To align them, specify the argument <strong>align</strong> as follows.
</p>
</div>
<pre class="r"><code>ggarrange(ggsurv$plot, ggsurv$table, heights = c(2, 0.7),
          ncol = 1, nrow = 2, align = "v")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-align-plot-panels-2-1.png" width="576" /></p>
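<p>The <em>align</em> argument of ggarrange() accepts one of "none", "h", "v" or "hv". The same idea can be tried without the survival packages. In the sketch below (the object names p_top and p_bottom are hypothetical), the two stacked scatter plots have y axis labels of different widths, so their panels only line up when <em>align = "v"</em> is set:</p>
<pre class="r"><code>library(ggpubr)
data("mtcars")

# Two plots sharing the same x variable but with y axis labels of different widths
p_top    <- ggscatter(mtcars, x = "wt", y = "mpg")   # two-digit y labels
p_bottom <- ggscatter(mtcars, x = "wt", y = "disp")  # three-digit y labels

# Stack them and align the panels vertically
aligned <- ggarrange(p_top, p_bottom, ncol = 1, nrow = 2, align = "v")
aligned</code></pre>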
</div>
<div id="change-columnrow-span-of-a-plot" class="section level2">
<h2>Change column/row span of a plot</h2>
<div id="use-ggpubr-r-package" class="section level3">
<h3>Use ggpubr R package</h3>
<p>We’ll use nested <strong>ggarrange</strong>() functions to change column/row span of plots.</p>
<p>For example, using the R code below:</p>
<ul>
<li>the scatter plot (sp) will live in the first row and span two columns</li>
<li>the box plot (bxp) and the dot plot (dp) will be first arranged and will live in the second row with two different columns</li>
</ul>
<pre class="r"><code>ggarrange(sp,                                                 # First row with scatter plot
          ggarrange(bxp, dp, ncol = 2, labels = c("B", "C")), # Second row with box and dot plots
          nrow = 2, 
          labels = "A"                                        # Labels of the scatter plot
          ) </code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-ggarrange-column-row-span-1.png" width="576" /></p>
</div>
<div id="use-cowplot-r-package" class="section level3">
<h3>Use cowplot R package</h3>
<p>The combination of the functions <strong>ggdraw</strong>() + <strong>draw_plot</strong>() + <strong>draw_plot_label</strong>() [in <strong>cowplot</strong>] can be used to place graphs at particular locations with a particular size.</p>
<p><strong>ggdraw(). Initialize an empty drawing canvas</strong>:</p>
<pre class="r"><code>ggdraw()</code></pre>
<div class="warning">
<p>
Note that, by default, coordinates run from 0 to 1, and the point (0, 0) is in the lower left corner of the canvas (see the figure below).
</p>
</div>
<p><img src="https://www.sthda.com/english/sthda-upload/images/ggpubr/canva.png" alt="draw_plot" /></p>
<p><strong>draw_plot(). Places a plot somewhere onto the drawing canvas</strong>:</p>
<pre class="r"><code>draw_plot(plot, x = 0, y = 0, width = 1, height = 1)</code></pre>
<ul>
<li><em>plot</em>: the plot to place (ggplot2 or a gtable)</li>
<li><em>x, y</em>: The x/y location of the lower left corner of the plot.</li>
<li><em>width, height</em>: the width and the height of the plot</li>
</ul>
<p><strong>draw_plot_label()</strong>. Adds a plot label to the upper left corner of a graph. It can handle vectors of labels with associated coordinates.</p>
<pre class="r"><code>draw_plot_label(label, x = 0, y = 1, size = 16, ...)</code></pre>
<ul>
<li><em>label</em>: a vector of labels to be drawn</li>
<li><em>x, y</em>: Vector containing the x and y position of the labels, respectively.</li>
<li><em>size</em>: Font size of the label to be drawn</li>
</ul>
<p>For example, you can combine multiple plots, with particular locations and different sizes, as follows:</p>
<pre class="r"><code>library("cowplot")
ggdraw() +
  draw_plot(bxp, x = 0, y = .5, width = .5, height = .5) +
  draw_plot(dp, x = .5, y = .5, width = .5, height = .5) +
  draw_plot(bp, x = 0, y = 0, width = 1, height = 0.5) +
  draw_plot_label(label = c("A", "B", "C"), size = 15,
                  x = c(0, 0.5, 0), y = c(1, 1, 0.5))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-cowplot-row-column-span-1.png" width="576" /></p>
</div>
<div id="use-gridextra-r-package" class="section level3">
<h3>Use gridExtra R package</h3>
<p>The function <strong>arrangeGrob</strong>() [in <strong>gridExtra</strong>] helps to change the row/column span of a plot.</p>
<p>For example, using the R code below:</p>
<ul>
<li>the scatter plot (sp) will live in the first row and span two columns</li>
<li>the box plot (bxp) and the dot plot (dp) will live in the second row with two plots in two different columns</li>
</ul>
<pre class="r"><code>library("gridExtra")
grid.arrange(sp,                             # First row with one plot spanning over 2 columns
             arrangeGrob(bxp, dp, ncol = 2), # Second row with 2 plots in 2 different columns
             nrow = 2)                       # Number of rows</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-grid-arrange-column-row-span-1.png" width="576" /></p>
<p>It’s also possible to use the argument <strong>layout_matrix</strong> in the <strong>grid.arrange</strong>() function, to create a complex layout.</p>
<p>In the R code below <em>layout_matrix</em> is a 2x2 matrix (2 columns and 2 rows). The first row is all 1s, that’s where the first plot lives, spanning the two columns; the second row contains plots 2 and 3 each occupying one column.</p>
<pre class="r"><code>grid.arrange(bp,                                    # bar plot spanning two columns
             bxp, sp,                               # box plot and scatter plot
             ncol = 2, nrow = 2, 
             layout_matrix = rbind(c(1,1), c(2,3)))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-grid-arrange-layout-matrix-1.png" width="576" /></p>
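<p>To see how <em>layout_matrix</em> maps plots to cells, it can help to print the matrix on its own: each number is the index of a plot (in the order passed to grid.arrange()), and repeating a number makes that plot span several cells. The name lay is used here only for illustration.</p>
<pre class="r"><code># Build the layout used above and inspect it
lay <- rbind(c(1, 1),   # row 1: plot 1 spans both columns
             c(2, 3))   # row 2: plot 2 in column 1, plot 3 in column 2
lay</code></pre>
<pre><code>##      [,1] [,2]
## [1,]    1    1
## [2,]    2    3</code></pre>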
<div class="warning">
<p>
Note that, it’s also possible to annotate the output of the <em>grid.arrange</em>() function using the helper function <strong>draw_plot_label</strong>() [in cowplot].
</p>
</div>
<p>To easily annotate the <em>grid.arrange</em>() / <em>arrangeGrob</em>() output (a gtable), you should first transform it to a ggplot using the function <em>as_ggplot</em>() [in ggpubr]. Next, you can annotate it using the function <em>draw_plot_label</em>() [in cowplot].</p>
<pre class="r"><code>library("gridExtra")
library("cowplot")

# Arrange plots using arrangeGrob
# returns a gtable (gt)
gt <- arrangeGrob(bp,                               # bar plot spanning two columns
             bxp, sp,                               # box plot and scatter plot
             ncol = 2, nrow = 2, 
             layout_matrix = rbind(c(1,1), c(2,3)))

# Add labels to the arranged plots
p <- as_ggplot(gt) +                                # transform to a ggplot
  draw_plot_label(label = c("A", "B", "C"), size = 15,
                  x = c(0, 0, 0.5), y = c(1, 0.5, 0.5)) # Add labels
p</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-add-labels-to-grid-arrange-1.png" width="576" /></p>
<div class="notice">
<p>
In the above R code, we used <em>arrangeGrob</em>() instead of <em>grid.arrange</em>().
</p>
<p>
Note that the main difference between these two functions is that <em>grid.arrange</em>() automatically draws the arranged plots, whereas <em>arrangeGrob</em>() only returns them.
</p>
<p>
As we want to annotate the arranged plots before drawing it, the function <em>arrangeGrob</em>() is preferred in this case.
</p>
</div>
</div>
<div id="use-grid-r-package" class="section level3">
<h3>Use grid R package</h3>
<p>The grid R package can be used to create a complex layout with the help of the function <strong>grid.layout</strong>(). It also provides the helper function <strong>viewport</strong>() to define a region or a viewport on the layout. The function <strong>print</strong>() is used to place plots in a specified region.</p>
<p>The different steps can be summarized as follows:</p>
<ol style="list-style-type: decimal">
<li>Create plots : p1, p2, p3, ….</li>
<li>Move to a new page on a grid device using the function <strong>grid.newpage</strong>()</li>
<li>Create a layout, here 3X2 - number of rows = 3; number of columns = 2</li>
<li>Define a grid viewport : a rectangular region on a graphics device</li>
<li>Print a plot into the viewport</li>
</ol>
<pre class="r"><code>library(grid)

# Move to a new page
grid.newpage()

# Create layout : nrow = 3, ncol = 2
pushViewport(viewport(layout = grid.layout(nrow = 3, ncol = 2)))

# A helper function to define a region on the layout
define_region <- function(row, col){
  viewport(layout.pos.row = row, layout.pos.col = col)
} 

# Arrange the plots
print(sp, vp = define_region(row = 1, col = 1:2))   # Span over two columns
print(bxp, vp = define_region(row = 2, col = 1))
print(dp, vp = define_region(row = 2, col = 2))
print(bp + rremove("x.text"), vp = define_region(row = 3, col = 1:2))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-viewport-1.png" width="576" /></p>
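<p>Before printing plots into the viewports, the layout itself can be previewed with <strong>grid.show.layout</strong>() [in grid], which draws the cells of a layout on the current device. A minimal sketch, using the same 3X2 layout as above (the name lay is hypothetical):</p>
<pre class="r"><code>library(grid)

# Same layout as in the example above: 3 rows, 2 columns
lay <- grid.layout(nrow = 3, ncol = 2)

# Draw a diagram of the layout cells (starts a new page by default)
grid.show.layout(lay)</code></pre>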
</div>
</div>
<div id="use-common-legend-for-combined-ggplots" class="section level2">
<h2>Use common legend for combined ggplots</h2>
<p>To place a common unique legend in the margin of the arranged plots, the function <strong>ggarrange</strong>() [in ggpubr] can be used with the following arguments:</p>
<ul>
<li><em>common.legend = TRUE</em>: place a common legend in a margin</li>
<li><em>legend</em>: specify the legend position. Allowed values include one of c(“top”, “bottom”, “left”, “right”)</li>
</ul>
<pre class="r"><code>ggarrange(bxp, dp, labels = c("A", "B"),
          common.legend = TRUE, legend = "bottom")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-common-legend-for-combined-ggplots-1.png" width="576" /></p>
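<p>If you only need the legend itself, for instance to check what the shared legend will look like, the function <strong>get_legend</strong>() [in ggpubr] extracts it from a single plot, and <strong>as_ggplot</strong>() [in ggpubr] turns it into a drawable object. The sketch below is self-contained; the names bxp_demo and leg are hypothetical.</p>
<pre class="r"><code>library(ggpubr)
data("ToothGrowth")

# A box plot with a color legend
bxp_demo <- ggboxplot(ToothGrowth, x = "dose", y = "len",
                      color = "dose", palette = "jco")

# Extract the legend and draw it on its own
leg <- get_legend(bxp_demo)
as_ggplot(leg)</code></pre>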
</div>
<div id="scatter-plot-with-marginal-density-plots" class="section level2">
<h2>Scatter plot with marginal density plots</h2>
<pre class="r"><code># Scatter plot colored by groups ("Species")
sp <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
                color = "Species", palette = "jco",
                size = 3, alpha = 0.6)+
  border()                                         

# Marginal density plot of x (top panel) and y (right panel)
xplot <- ggdensity(iris, "Sepal.Length", fill = "Species",
                   palette = "jco")
yplot <- ggdensity(iris, "Sepal.Width", fill = "Species", 
                   palette = "jco")+
  rotate()

# Cleaning the plots
yplot <- yplot + clean_theme() 
xplot <- xplot + clean_theme()

# Arranging the plot
ggarrange(xplot, NULL, sp, yplot, 
          ncol = 2, nrow = 2,  align = "hv", 
          widths = c(2, 1), heights = c(1, 2),
          common.legend = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-marginal-plot-grouped-data-1.png" width="576" /></p>
</div>
<div id="mix-table-text-and-ggplot" class="section level2">
<h2>Mix table, text and ggplot2 graphs</h2>
<p>In this section, we’ll show how to plot a table and text alongside a chart. The iris data set will be used.</p>
<p>We start by creating the following plots:</p>
<ol style="list-style-type: decimal">
<li>a <strong>density plot</strong> of the variable “Sepal.Length”. R function: <strong>ggdensity</strong>() [in ggpubr]</li>
<li>a plot of the <strong>summary table</strong> containing the descriptive statistics (mean, sd, … ) of Sepal.Length.
<ul>
<li>R function for computing descriptive statistics: <strong>desc_statby</strong>() [in ggpubr].</li>
<li>R function to draw a textual table: <strong>ggtexttable</strong>() [in ggpubr].</li>
</ul></li>
<li>a plot of a text <strong>paragraph</strong>. R function: <strong>ggparagraph</strong>() [in ggpubr].</li>
</ol>
<p>We finish by arranging/combining the three plots using the function <strong>ggarrange</strong>() [in ggpubr]</p>
<pre class="r"><code># Density plot of "Sepal.Length"
#::::::::::::::::::::::::::::::::::::::
density.p <- ggdensity(iris, x = "Sepal.Length", 
                       fill = "Species", palette = "jco")

# Draw the summary table of Sepal.Length
#::::::::::::::::::::::::::::::::::::::
# Compute descriptive statistics by groups
stable <- desc_statby(iris, measure.var = "Sepal.Length",
                      grps = "Species")
stable <- stable[, c("Species", "length", "mean", "sd")]
# Summary table plot, medium orange theme
stable.p <- ggtexttable(stable, rows = NULL, 
                        theme = ttheme("mOrange"))

# Draw text
#::::::::::::::::::::::::::::::::::::::
text <- paste("iris data set gives the measurements in cm",
              "of the variables sepal length and width",
              "and petal length and width, respectively,",
              "for 50 flowers from each of 3 species of iris.",
             "The species are Iris setosa, versicolor, and virginica.", sep = " ")
text.p <- ggparagraph(text = text, face = "italic", size = 11, color = "black")

# Arrange the plots on the same page
ggarrange(density.p, stable.p, text.p, 
          ncol = 1, nrow = 3,
          heights = c(1, 0.5, 0.3))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-add-table-add-text-data-visualization-1.png" width="576" /></p>
</div>
<div id="insert-a-graphical-element-inside-a-ggplot" class="section level2">
<h2>Insert a graphical element inside a ggplot</h2>
<p>The function <strong>annotation_custom</strong>() [in ggplot2] can be used for adding tables, plots or other grid-based elements within the plotting area of a ggplot. The simplified format is :</p>
<pre class="r"><code>annotation_custom(grob, xmin, xmax, ymin, ymax)</code></pre>
<div class="block">
<ul>
<li>
<strong>grob</strong>: the external graphical element to display
</li>
<li>
<strong>xmin, xmax</strong> : x location in data coordinates (horizontal location)
</li>
<li>
<strong>ymin, ymax</strong> : y location in data coordinates (vertical location)
</li>
</ul>
</div>
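<p>As a small illustration of these arguments, the sketch below places a text grob in the upper right area of a scatter plot. The grob, its label and the coordinates are chosen for this example only. Any bound that is omitted defaults to -Inf or Inf, meaning the grob extends to the edge of the panel in that direction.</p>
<pre class="r"><code>library(ggplot2)
library(grid)

# A simple grid graphical object (grob): an italic text label
note <- textGrob("n = 150", gp = gpar(fontface = "italic"))

# Place the grob in a rectangle given in data coordinates
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point() +
  annotation_custom(note, xmin = 6.5, xmax = 8, ymin = 4, ymax = 4.4)
p</code></pre>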
<div id="place-a-table-within-a-ggplot" class="section level3">
<h3>Place a table within a ggplot</h3>
<p>We’ll use the plots - density.p and stable.p - created in the previous section (<a href="#mix-table-text-and-ggplot">Mix table, text and ggplot2 graphs</a>).</p>
<pre class="r"><code>density.p + annotation_custom(ggplotGrob(stable.p),
                              xmin = 5.5, ymin = 0.7,
                              xmax = 8)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-add-table-within-a-ggplot-1.png" width="480" /></p>
</div>
<div id="place-a-box-plot-within-a-ggplot" class="section level3">
<h3>Place a box plot within a ggplot</h3>
<ol style="list-style-type: decimal">
<li>Create a scatter plot of y = “Sepal.Width” by x = “Sepal.Length” using the iris data set. R function <strong>ggscatter</strong>() [ggpubr]</li>
<li>Create separately the box plot of x and y variables with transparent background. R function: <strong>ggboxplot</strong>() [ggpubr].</li>
<li>Transform the box plots into graphical objects called “grobs” in Grid terminology. R function <strong>ggplotGrob</strong>() [ggplot2].</li>
<li>Place the box plot grobs inside the scatter plot. R function: <strong>annotation_custom</strong>() [ggplot2].</li>
</ol>
<div class="warning">
<p>
As the inset box plot overlaps with some points, a <em>transparent background</em> is used for the box plots.
</p>
</div>
<pre class="r"><code># Scatter plot colored by groups ("Species")
#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
sp <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
                color = "Species", palette = "jco",
                size = 3, alpha = 0.6)

# Create box plots of x/y variables
#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
# Box plot of the x variable
xbp <- ggboxplot(iris$Sepal.Length, width = 0.3, fill = "lightgray") +
  rotate() +
  theme_transparent()
# Box plot of the y variable
ybp <- ggboxplot(iris$Sepal.Width, width = 0.3, fill = "lightgray") +
  theme_transparent()

# Create the external graphical objects
# called a "grob" in Grid terminology
xbp_grob <- ggplotGrob(xbp)
ybp_grob <- ggplotGrob(ybp)


# Place box plots inside the scatter plot
#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
xmin <- min(iris$Sepal.Length); xmax <- max(iris$Sepal.Length)
ymin <- min(iris$Sepal.Width); ymax <- max(iris$Sepal.Width)
yoffset <- (1/15)*ymax; xoffset <- (1/15)*xmax

# Insert xbp_grob inside the scatter plot
sp + annotation_custom(grob = xbp_grob, xmin = xmin, xmax = xmax, 
                       ymin = ymin-yoffset, ymax = ymin+yoffset) +
  # Insert ybp_grob inside the scatter plot
  annotation_custom(grob = ybp_grob,
                       xmin = xmin-xoffset, xmax = xmin+xoffset, 
                       ymin = ymin, ymax = ymax)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-place-box-plot-within-scatter-plot-1.png" width="480" /></p>
</div>
<div id="add-background-image-to-ggplot2-graphs" class="section level3">
<h3>Add background image to ggplot2 graphs</h3>
<p><strong>Import the background image</strong>. Use either the function <strong>readJPEG</strong>() [in <em>jpeg</em> package] or the function <strong>readPNG</strong>() [in <em>png</em> package] depending on the format of the background image.</p>
<p>To test the example below, make sure that the <em>png</em> package is installed. You can install it using the R command <em>install.packages("png")</em>.</p>
<pre class="r"><code># Import the image
img.file <- system.file(file.path("images", "background-image.png"),
                        package = "ggpubr")
img <- png::readPNG(img.file)</code></pre>
<p><strong>Combine a ggplot with the background image</strong>. R function: <strong>background_image</strong>() [in ggpubr].</p>
<pre class="r"><code>library(ggplot2)
library(ggpubr)
ggplot(iris, aes(Species, Sepal.Length))+
  background_image(img)+
  geom_boxplot(aes(fill = Species), color = "white")+
  fill_palette("jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-ggplot2-with-background-image-1.png" width="432" /></p>
<p>Change box plot fill color transparency by specifying the argument alpha. Value should be in [0, 1], where 0 is full transparency and 1 is no transparency.</p>
<pre class="r"><code>library(ggplot2)
library(ggpubr)
ggplot(iris, aes(Species, Sepal.Length))+
  background_image(img)+
  geom_boxplot(aes(fill = Species), color = "white", alpha = 0.5)+
  fill_palette("jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-ggplot2-with-background-image-transparency-1.png" width="432" /></p>
<p>Another example, overlaying the France map and a ggplot2:</p>
<pre class="r"><code>mypngfile <- download.file("https://upload.wikimedia.org/wikipedia/commons/thumb/e/e4/France_Flag_Map.svg/612px-France_Flag_Map.svg.png", 
                           destfile = "france.png", mode = 'wb') 

img <- png::readPNG('france.png') 

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  background_image(img)+
  geom_point(aes(color = Species), alpha = 0.6, size = 5)+
  color_palette("jco")+
  theme(legend.position = "top")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-overlay-ggplot2-with-country-map-1.png" width="384" /></p>
</div>
</div>
<div id="arrange-over-multiple-pages" class="section level2">
<h2>Arrange over multiple pages</h2>
<p>If you have a long list of ggplots, say n = 20 plots, you may want to arrange the plots and to place them on multiple pages. With 4 plots per page, you need 5 pages to hold the 20 plots.</p>
<p>The function <strong>ggarrange</strong>() [in ggpubr] provides a convenient solution to arrange multiple ggplots over multiple pages. After specifying the arguments <em>nrow</em> and <em>ncol</em>, the function <strong>ggarrange</strong>() computes automatically the number of pages required to hold the list of the plots. It returns a list of arranged ggplots.</p>
<p>For example the following R code,</p>
<pre class="r"><code>multi.page <- ggarrange(bxp, dp, bp, sp,
                        nrow = 1, ncol = 2)</code></pre>
<p>returns a list of two pages with two plots per page. You can visualize each page as follows:</p>
<pre class="r"><code>multi.page[[1]] # Visualize page 1
multi.page[[2]] # Visualize page 2</code></pre>
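<p>The number of pages follows from the number of plots and the number of cells per page. The sketch below first reproduces the arithmetic for the 20-plot example, then builds a small list of five placeholder plots to show the same rule with ggarrange(). All object names are hypothetical.</p>
<pre class="r"><code>library(ggpubr)
data("mtcars")

# 20 plots, 2 rows x 2 columns per page
n_plots  <- 20
per_page <- 2 * 2
n_pages  <- ceiling(n_plots / per_page)
n_pages  # 5

# Five placeholder plots arranged two per page
plots <- lapply(1:5, function(i) {
  ggscatter(mtcars, x = "wt", y = "mpg") + ggtitle(paste("Plot", i))
})
pages <- ggarrange(plotlist = plots, nrow = 1, ncol = 2)
length(pages)  # number of pages; ceiling(5 / 2) pages are expected</code></pre>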
<p>You can also export the arranged plots to a pdf file using the function <strong>ggexport</strong>() [in ggpubr]:</p>
<pre class="r"><code>ggexport(multi.page, filename = "multi.page.ggplot2.pdf")</code></pre>
<p>PDF file: <a href="//www.slideshare.net/kassambara/multipageggplot2">Multi.page.ggplot2</a></p>
<iframe src="//www.slideshare.net/slideshow/embed_code/key/zT39z4AS2fVIY" width="570" height="510" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" allowfullscreen>
</iframe>
<div class="warning">
<p>
Note that it’s also possible to use the function <strong>marrangeGrob()</strong> [in gridExtra] to create a multi-page output.
</p>
</div>
<pre class="r"><code>library(gridExtra)
res <- marrangeGrob(list(bxp, dp, bp, sp), nrow = 1, ncol = 2)

# Export to a pdf file
ggexport(res, filename = "multi.page.ggplot2.pdf")

# Visualize interactively
res</code></pre>
</div>
<div id="nested-layout-with-ggarrange" class="section level2">
<h2>Nested layout with ggarrange()</h2>
<p>We’ll arrange the plots created in the sections <a href="#mix-table-text-and-ggplot">Mix table, text and ggplot2 graphs</a> and <a href="#create-some-plots">Create some plots</a>.</p>
<pre class="r"><code>p1 <- ggarrange(sp, bp + font("x.text", size = 9),
                ncol = 1, nrow = 2)

p2 <- ggarrange(density.p, stable.p, text.p, 
                ncol = 1, nrow = 3,
                heights = c(1, 0.5, 0.3))

ggarrange(p1, p2, ncol = 2, nrow = 1)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/030-arrange-multiple-ggplots-nested-layout-1.png" width="720" /></p>
</div>
<div id="export-plots" class="section level2">
<h2>Export plots</h2>
<p>R function: <strong>ggexport</strong>() [in ggpubr].</p>
<p>First, create a list of 4 ggplots corresponding to the variables Sepal.Length, Sepal.Width, Petal.Length and Petal.Width in the iris data set.</p>
<pre class="r"><code>plots <- ggboxplot(iris, x = "Species",
                   y = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
                   color = "Species", palette = "jco"
                   )
plots[[1]]  # Print the first plot
plots[[2]]  # Print the second plot, and so on...</code></pre>
<p>Next, you can export individual plots to a file (pdf, eps or png), one plot per page. It’s also possible to arrange the plots (for example, 2 plots per page) when exporting them.</p>
<p>Export individual plots to a pdf file (one plot per page):</p>
<pre class="r"><code>ggexport(plotlist = plots, filename = "test.pdf")</code></pre>
<p>Arrange and export. Specify nrow and ncol to display multiple plots on the same page:</p>
<pre class="r"><code>ggexport(plotlist = plots, filename = "test.pdf",
         nrow = 2, ncol = 1)</code></pre>
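<p>For comparison, a one-plot-per-page PDF can also be written with the base R graphics device. This is a minimal sketch that assumes only ggplot2 is installed; the output file name is hypothetical and the plots are plain ggplot2 box plots rather than the ggboxplot() output above.</p>
<pre class="r"><code>library(ggplot2)

# Build one box plot per measurement column of iris
vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
plots <- lapply(vars, function(v) {
  ggplot(iris, aes(x = Species, y = .data[[v]], color = Species)) +
    geom_boxplot() +
    labs(y = v)
})

# Hypothetical output file, written to the session temporary directory
out_file <- file.path(tempdir(), "test-base.pdf")
pdf(out_file, onefile = TRUE)      # one PDF, one page per printed plot
invisible(lapply(plots, print))
dev.off()</code></pre>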
</div>
<div id="acknoweledgment" class="section level2">
<h2>Acknowledgment</h2>
<p>We sincerely thank all developers for their efforts behind the packages that ggpubr depends on, namely:</p>
<ul>
<li>Baptiste Auguie (2016). gridExtra: Miscellaneous Functions for “Grid” Graphics. R package version 2.2.1. <a href="https://CRAN.R-project.org/package=gridExtra" class="uri">https://CRAN.R-project.org/package=gridExtra</a></li>
<li>Claus O. Wilke (2016). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 0.7.0. <a href="https://CRAN.R-project.org/package=cowplot" class="uri">https://CRAN.R-project.org/package=cowplot</a></li>
<li>H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.</li>
</ul>
</div>
</div>


</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Fri, 01 Sep 2017 10:59:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Bar Plots and Modern Alternatives]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/80-bar-plots-and-modern-alternatives/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/80-bar-plots-and-modern-alternatives/</guid>
			<description><![CDATA[<!-- START HTML -->
  <div id="rdoc">
<p>This article describes how to easily create basic and ordered <strong>bar plots</strong> using the ggplot2-based helper functions available in the <a href="https://www.sthda.com/english/rpkgs/ggpubr/index.html">ggpubr R package</a>. We’ll also present some modern alternatives to bar plots, including <strong>lollipop charts</strong> and <strong>Cleveland’s dot plots</strong>.</p>
<div class="block">
<p>
You might also be interested in the following articles:
</p>
<ul>
<li>
<a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/76-add-p-values-and-significance-levels-to-ggplots/">Add P-values and Significance Levels to ggplots</a>
</li>
<li>
<a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/77-facilitating-exploratory-data-visualization-application-to-tcga-genomic-data/">Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data</a>
</li>
</ul>
</div>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-logo-1.png" width="720" /></p>

<p><strong>Contents</strong>:</p>



<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a><ul>
<li><a href="#required-r-package">Required R package</a></li>
</ul></li>
<li><a href="#basic-bar-plots">Basic bar plots</a></li>
<li><a href="#multiple-grouping-variables">Multiple grouping variables</a></li>
<li><a href="#ordered-bar-plots">Ordered bar plots</a></li>
<li><a href="#deviation-graphs">Deviation graphs</a></li>
<li><a href="#alternatives-to-bar-plots">Alternatives to bar plots</a><ul>
<li><a href="#lollipop-chart">Lollipop chart</a></li>
<li><a href="#clevelands-dot-plot">Cleveland’s dot plot</a></li>
</ul></li>
</ul>
</div>

<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="required-r-package" class="section level3">
<h3>Required R package</h3>
<p>You need to install the R package <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr (version >= 0.1.3)</a>, to easily create ggplot2-based publication ready plots.</p>
<p>Install from CRAN:</p>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<p>Or, install the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</p>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<p>Load ggpubr:</p>
<pre class="r"><code>library(ggpubr)</code></pre>
</div>
</div>
<div id="basic-bar-plots" class="section level2">
<h2>Basic bar plots</h2>
<p>Create a demo data set:</p>
<pre class="r"><code>df <- data.frame(dose=c("D0.5", "D1", "D2"),
                 len=c(4.2, 10, 29.5))
print(df)</code></pre>
<pre><code>##   dose  len
## 1 D0.5  4.2
## 2   D1 10.0
## 3   D2 29.5</code></pre>
<p>Basic bar plots:</p>
<pre class="r"><code># Basic bar plot
p <- ggbarplot(df, x = "dose", y = "len",
          color = "black", fill = "lightgray")
p

# Rotate to create horizontal bar plots
p + rotate()</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-basics-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-basics-2.png" width="336" /></p>
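<p>To see what the helper function saves you, here is a roughly equivalent plot in plain ggplot2. It is a sketch under the assumption that only ggplot2 is available; the theme differs from the ggpubr default, so the result will not look identical.</p>
<pre class="r"><code>library(ggplot2)

df <- data.frame(dose = c("D0.5", "D1", "D2"),
                 len = c(4.2, 10, 29.5))

# geom_col() draws bars whose heights are the values in the data
p <- ggplot(df, aes(x = dose, y = len)) +
  geom_col(color = "black", fill = "lightgray") +
  theme_classic()
p

# Horizontal version
p + coord_flip()</code></pre>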
<p>Change fill and outline colors by groups:</p>
<pre class="r"><code>ggbarplot(df, x = "dose", y = "len",
   fill = "dose", color = "dose", palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-color-1.png" width="336" /></p>
</div>
<div id="multiple-grouping-variables" class="section level2">
<h2>Multiple grouping variables</h2>
<p>Create a demo data set:</p>
<pre class="r"><code>df2 <- data.frame(supp=rep(c("VC", "OJ"), each=3),
                  dose=rep(c("D0.5", "D1", "D2"),2),
                  len=c(6.8, 15, 33, 4.2, 10, 29.5))
print(df2)</code></pre>
<pre><code>##   supp dose  len
## 1   VC D0.5  6.8
## 2   VC   D1 15.0
## 3   VC   D2 33.0
## 4   OJ D0.5  4.2
## 5   OJ   D1 10.0
## 6   OJ   D2 29.5</code></pre>
<p>Plot y = “len” by x = “dose” and change color by a second group: “supp”</p>
<pre class="r"><code># Stacked bar plots, add labels inside bars
ggbarplot(df2, x = "dose", y = "len",
  fill = "supp", color = "supp", 
  palette = c("gray", "black"),
  label = TRUE, lab.col = "white", lab.pos = "in")

# Change position: Interleaved (dodged) bar plot
ggbarplot(df2, x = "dose", y = "len",
          fill = "supp", color = "supp", 
          palette = c("gray", "black"),
          position = position_dodge(0.9))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-stacked-bar-plots-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-stacked-bar-plots-2.png" width="336" /></p>
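<p>In the stacked version, the total height of each bar is the sum of the “len” values of both supplements for that dose, whereas the dodged version shows each value as its own bar. The stacked heights can be checked with a short base R sketch:</p>
<pre class="r"><code>df2 <- data.frame(supp = rep(c("VC", "OJ"), each = 3),
                  dose = rep(c("D0.5", "D1", "D2"), 2),
                  len = c(6.8, 15, 33, 4.2, 10, 29.5))

# Total bar height per dose in the stacked bar plot
totals <- aggregate(len ~ dose, data = df2, FUN = sum)
totals</code></pre>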
</div>
<div id="ordered-bar-plots" class="section level2">
<h2>Ordered bar plots</h2>
<p>Load and prepare data:</p>
<pre class="r"><code># Load data
data("mtcars")
dfm <- mtcars
# Convert the cyl variable to a factor
dfm$cyl <- as.factor(dfm$cyl)
# Add a name column
dfm$name <- rownames(dfm)
# Inspect the data
head(dfm[, c("name", "wt", "mpg", "cyl")])</code></pre>
<pre><code>##                                name   wt  mpg cyl
## Mazda RX4                 Mazda RX4 2.62 21.0   6
## Mazda RX4 Wag         Mazda RX4 Wag 2.88 21.0   6
## Datsun 710               Datsun 710 2.32 22.8   4
## Hornet 4 Drive       Hornet 4 Drive 3.21 21.4   6
## Hornet Sportabout Hornet Sportabout 3.44 18.7   8
## Valiant                     Valiant 3.46 18.1   6</code></pre>
<p>Create ordered bar plots. Change the fill color by the grouping variable “cyl”. Sorting will be done globally, but not by groups.</p>
<pre class="r"><code>ggbarplot(dfm, x = "name", y = "mpg",
          fill = "cyl",               # change fill color by cyl
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. See ?ggpar
          sort.val = "desc",          # Sort the values in descending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 90           # Rotate vertically x axis texts
          )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-ordered-bar-plots-1.png" width="576" /></p>
<p>Sort bars inside each group. Use the argument <strong>sort.by.groups = TRUE</strong>.</p>
<pre class="r"><code>ggbarplot(dfm, x = "name", y = "mpg",
          fill = "cyl",               # change fill color by cyl
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. See ?ggpar
          sort.val = "asc",           # Sort the values in ascending order
          sort.by.groups = TRUE,      # Sort inside each group
          x.text.angle = 90           # Rotate vertically x axis texts
          )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-ordered-bar-plots-by-groups-1.png" width="576" /></p>
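<p>The difference between the two sorting modes can be illustrated with base R's order(). This is only a sketch of the idea, not the code ggpubr runs internally; the names global_order and grouped_order are hypothetical.</p>
<pre class="r"><code>dfm <- mtcars
dfm$cyl <- as.factor(dfm$cyl)
dfm$name <- rownames(dfm)

# Global sort: descending mpg, ignoring cyl
global_order <- dfm$name[order(-dfm$mpg)]

# Grouped sort: by cyl first, then ascending mpg within each cyl level
grouped_order <- dfm$name[order(dfm$cyl, dfm$mpg)]

head(global_order, 3)
head(grouped_order, 3)</code></pre>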
</div>
<div id="deviation-graphs" class="section level2">
<h2>Deviation graphs</h2>
<p>The deviation graph shows the deviation of quantitative values to a reference value. In the R code below, we’ll plot the mpg z-score from the mtcars data set.</p>
<p>Calculate the z-score of the mpg data:</p>
<pre class="r"><code># Calculate the z-score of the mpg data
dfm$mpg_z <- (dfm$mpg -mean(dfm$mpg))/sd(dfm$mpg)
dfm$mpg_grp <- factor(ifelse(dfm$mpg_z < 0, "low", "high"), 
                     levels = c("low", "high"))
# Inspect the data
head(dfm[, c("name", "wt", "mpg", "mpg_z", "mpg_grp", "cyl")])</code></pre>
<pre><code>##                                name   wt  mpg  mpg_z mpg_grp cyl
## Mazda RX4                 Mazda RX4 2.62 21.0  0.151    high   6
## Mazda RX4 Wag         Mazda RX4 Wag 2.88 21.0  0.151    high   6
## Datsun 710               Datsun 710 2.32 22.8  0.450    high   4
## Hornet 4 Drive       Hornet 4 Drive 3.21 21.4  0.217    high   6
## Hornet Sportabout Hornet Sportabout 3.44 18.7 -0.231     low   8
## Valiant                     Valiant 3.46 18.1 -0.330     low   6</code></pre>
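<p>As a quick sanity check, the manual z-score formula should agree with base R's scale(), which centers and scales by the sample mean and standard deviation by default. A minimal sketch:</p>
<pre class="r"><code># Manual z-score, as computed above
z_manual <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)

# Same computation using scale()
z_scale <- as.numeric(scale(mtcars$mpg))

all.equal(z_manual, z_scale)

# A z-score has mean 0 and standard deviation 1 (up to rounding)
round(c(mean = mean(z_manual), sd = sd(z_manual)), 3)</code></pre>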
<p>Create an ordered bar plot, colored according to the level of mpg:</p>
<pre class="r"><code>ggbarplot(dfm, x = "name", y = "mpg_z",
          fill = "mpg_grp",           # change fill color by mpg_grp
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. See ?ggpar
          sort.val = "asc",           # Sort the values in ascending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 90,          # Rotate vertically x axis texts
          ylab = "MPG z-score",
          xlab = FALSE,
          legend.title = "MPG Group"
          )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-deviation-graphs-1.png" width="576" /></p>
<p>Rotate the plot: use rotate = TRUE and sort.val = “desc”</p>
<pre class="r"><code>ggbarplot(dfm, x = "name", y = "mpg_z",
          fill = "mpg_grp",           # change fill color by mpg_grp
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette. See ?ggpar
          sort.val = "desc",          # Sort the values in descending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 90,          # Rotate vertically x axis texts
          ylab = "MPG z-score",
          legend.title = "MPG Group",
          rotate = TRUE,
          ggtheme = theme_minimal()
          )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-deviation-graphs-horizontal-1.png" width="624" /></p>
</div>
<div id="alternatives-to-bar-plots" class="section level2">
<h2>Alternatives to bar plots</h2>
<div id="lollipop-chart" class="section level3">
<h3>Lollipop chart</h3>
<p>A lollipop chart is an alternative to a bar plot when you have a large set of values to visualize.</p>
<p>Lollipop chart colored by the grouping variable “cyl”:</p>
<pre class="r"><code>ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "ascending",                        # Sort value in ascending order
           add = "segments",                             # Add segments from y = 0 to dots
           ggtheme = theme_pubr()                        # ggplot2 theme
           )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-lollipop-chart-1.png" width="720" /></p>
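<p>A lollipop chart is essentially a point plus a segment drawn from zero to the value. The following plain ggplot2 sketch shows that construction; it assumes only ggplot2 is installed and does not reproduce the exact ggdotchart() styling.</p>
<pre class="r"><code>library(ggplot2)

dfm <- mtcars
dfm$cyl <- as.factor(dfm$cyl)
dfm$name <- rownames(dfm)

# Order the x axis by ascending mpg
dfm$name <- factor(dfm$name, levels = dfm$name[order(dfm$mpg)])

p <- ggplot(dfm, aes(x = name, y = mpg, color = cyl)) +
  geom_segment(aes(xend = name, y = 0, yend = mpg)) +   # the "stick"
  geom_point(size = 3) +                                # the "candy"
  scale_color_manual(values = c("#00AFBB", "#E7B800", "#FC4E07")) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
p</code></pre>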
<ul>
<li>Sort in descending order. <strong>sorting = “descending”</strong>.</li>
<li>Rotate the plot vertically, using <strong>rotate = TRUE</strong>.</li>
<li>Sort the mpg value inside each group by using <strong>group = “cyl”</strong>.</li>
<li>Set <strong>dot.size</strong> to 6.</li>
<li>Add mpg values as label. <strong>label = “mpg”</strong> or <strong>label = round(dfm$mpg)</strong>.</li>
</ul>
<pre class="r"><code>ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "descending",                       # Sort value in descending order
           add = "segments",                             # Add segments from y = 0 to dots
           rotate = TRUE,                                # Rotate vertically
           group = "cyl",                                # Order by groups
           dot.size = 6,                                 # Large dot size
           label = round(dfm$mpg),                        # Add mpg values as dot labels
           font.label = list(color = "white", size = 9, 
                             vjust = 0.5),               # Adjust label parameters
           ggtheme = theme_pubr()                        # ggplot2 theme
           )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-lollipop-chart-rotate-1.png" width="480" /></p>
<p>Deviation graph:</p>
<ul>
<li>Use y = “mpg_z”</li>
<li>Change segment color and size: add.params = list(color = “lightgray”, size = 2)</li>
</ul>
<pre class="r"><code>ggdotchart(dfm, x = "name", y = "mpg_z",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "descending",                       # Sort value in descending order
           add = "segments",                             # Add segments from y = 0 to dots
           add.params = list(color = "lightgray", size = 2), # Change segment color and size
           group = "cyl",                                # Order by groups
           dot.size = 6,                                 # Large dot size
           label = round(dfm$mpg_z,1),                        # Add mpg z-scores as dot labels
           font.label = list(color = "white", size = 9, 
                             vjust = 0.5),               # Adjust label parameters
           ggtheme = theme_pubr()                        # ggplot2 theme
           )+
  geom_hline(yintercept = 0, linetype = 2, color = "lightgray")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-lollipop-chart-deviation-1.png" width="720" /></p>
</div>
<div id="clevelands-dot-plot" class="section level3">
<h3>Cleveland’s dot plot</h3>
<p>Color y text by groups. Use y.text.col = TRUE.</p>
<pre class="r"><code>ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "descending",                       # Sort value in descending order
           rotate = TRUE,                                # Rotate vertically
           dot.size = 2,                                 # Small dot size
           y.text.col = TRUE,                            # Color y text by groups
           ggtheme = theme_pubr()                        # ggplot2 theme
           )+
  theme_cleveland()                                      # Add dashed grids</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/025-bar-plots-and-alternatives-cleveland-dot-plots-1.png" width="480" /></p>
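<p>Base R also ships a dotchart() function for Cleveland's dot plots. The sketch below is offered only as a comparison point; it groups the cars by cyl and reuses the color palette from above, with hypothetical variable names.</p>
<pre class="r"><code># Sort cars by mpg so that dots are ordered within each group
x <- mtcars[order(mtcars$mpg), ]
grp <- factor(x$cyl)

# One color per cyl level
cols <- c("#00AFBB", "#E7B800", "#FC4E07")[as.integer(grp)]

dotchart(x$mpg, labels = rownames(x), groups = grp,
         color = cols, pch = 19, cex = 0.7, xlab = "mpg")</code></pre>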
</div>
</div>
</div>


</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Fri, 01 Sep 2017 01:07:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Plot Means/Medians and Error Bars]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/79-plot-meansmedians-and-error-bars/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/79-plot-meansmedians-and-error-bars/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">
<p>In this article, we’ll describe how to easily <strong>plot</strong> <strong>means</strong> or <strong>medians</strong> with <strong>error bars</strong>. We’ll use the ggplot2-based helper functions available in the <a href="https://www.sthda.com/english/rpkgs/ggpubr/index.html">ggpubr R package</a>.</p>

<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-bmeans-with-error-bars-logo-1.png" width="432" /></p>
<p><strong>Contents</strong>:</p>
<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a><ul>
<li><a href="#required-r-package">Required R package</a></li>
<li><a href="#demo-data-sets">Demo data sets</a></li>
</ul></li>
<li><a href="#error-plots">Error plots</a></li>
<li><a href="#line-plots">Line plots</a></li>
<li><a href="#bar-plots">Bar plots</a></li>
<li><a href="#add-labels">Add labels</a></li>
<li><a href="#application-to-gene-expression-data">Application to gene expression data</a></li>
</ul>
</div>
<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="required-r-package" class="section level3">
<h3>Required R package</h3>
<p>You need to install the R package <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr</a>, to easily create ggplot2-based publication ready plots.</p>
<p>We recommend installing the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</p>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<p>If the installation from GitHub fails, try installing from <a href="https://cran.r-project.org/package=ggpubr">CRAN</a> as follows:</p>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<p>Load ggpubr:</p>
<pre class="r"><code>library(ggpubr)</code></pre>
</div>
<div id="demo-data-sets" class="section level3">
<h3>Demo data sets</h3>
<p>Data: ToothGrowth and mtcars data sets.</p>
<pre class="r"><code># ToothGrowth
data("ToothGrowth")
head(ToothGrowth)</code></pre>
<pre><code>##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5</code></pre>
<pre class="r"><code># mtcars 
data("mtcars")
head(mtcars[, c("wt", "mpg", "cyl")])</code></pre>
<pre><code>##                     wt  mpg cyl
## Mazda RX4         2.62 21.0   6
## Mazda RX4 Wag     2.88 21.0   6
## Datsun 710        2.32 22.8   4
## Hornet 4 Drive    3.21 21.4   6
## Hornet Sportabout 3.44 18.7   8
## Valiant           3.46 18.1   6</code></pre>
</div>
</div>
<div id="error-plots" class="section level2">
<h2>Error plots</h2>
<p>R function: <strong>ggerrorplot</strong>() [in <strong>ggpubr</strong>].</p>
<p>Simplified format:</p>
<pre class="r"><code>ggerrorplot(data, x, y, desc_stat = "mean_se")</code></pre>
<div class="block">
<ul>
<li>
<em>data</em>: a data frame
</li>
<li>
<em>x, y</em>: x and y variables for plotting
</li>
<li>
<em>desc_stat</em>: descriptive statistics to be used for visualizing errors. Default value is “mean_se”. Allowed values are one of “mean”, “mean_se”, “mean_sd”, “mean_ci”, “mean_range”, “median”, “median_iqr”, “median_mad”, “median_range”
</li>
</ul>
</div>
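<p>To make the most common choices concrete, the sketch below computes by hand the quantities that, as the names suggest, “mean_sd” and “mean_se” refer to: the group mean together with its standard deviation or standard error. It uses base R only and is not ggpubr's own implementation; the object names are hypothetical.</p>
<pre class="r"><code>data("ToothGrowth")

# Split tooth length by dose and summarise each group
by_dose <- split(ToothGrowth$len, ToothGrowth$dose)
desc <- data.frame(
  dose = names(by_dose),
  mean = unname(sapply(by_dose, mean)),
  sd   = unname(sapply(by_dose, sd)),
  se   = unname(sapply(by_dose, function(v) sd(v) / sqrt(length(v))))
)

# Error bar limits for mean +/- standard error
desc$lower_se <- desc$mean - desc$se
desc$upper_se <- desc$mean + desc$se
desc</code></pre>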
<p>For example, the following R code uses the ToothGrowth data set and plots y = “len” by x = “dose”.</p>
<pre class="r"><code># Mean +/- standard deviation
ggerrorplot(ToothGrowth, x = "dose", y = "len", 
            desc_stat = "mean_sd")
# Change error plot type and add mean points
ggerrorplot(ToothGrowth, x = "dose", y = "len", 
            desc_stat = "mean_sd",
            error.plot = "errorbar",            # Change error plot type
            add = "mean"                        # Add mean points
            )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-error-plot-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-error-plot-2.png" width="336" /></p>
<p>It’s also possible to add jitter points (representing individual points), dot plots and violin plots:</p>
<pre class="r"><code># Add jittered points
ggerrorplot(ToothGrowth, x = "dose", y = "len", 
            desc_stat = "mean_sd", color = "black",
            add = "jitter", add.params = list(color = "darkgray")
            )
# Add dot plots
ggerrorplot(ToothGrowth, x = "dose", y = "len", 
            desc_stat = "mean_sd", color = "black",
            add = "dotplot", add.params = list(color = "darkgray")
            )
# Add violin plots
ggerrorplot(ToothGrowth, x = "dose", y = "len", 
            desc_stat = "mean_sd", color = "black",
            add = "violin", add.params = list(color = "darkgray")
            )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-error-plot-add-points-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-error-plot-add-points-2.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-error-plot-add-points-3.png" width="336" /></p>
<p>To add p-values comparing means, use this:</p>
<pre class="r"><code># Specify the comparisons you want
my_comparisons <- list( c("0.5", "1"), c("1", "2"), c("0.5", "2") )
ggerrorplot(ToothGrowth, x = "dose", y = "len",
            desc_stat = "mean_sd", color = "black",
            add = "violin", add.params = list(color = "darkgray"))+ 
  stat_compare_means(comparisons = my_comparisons)+ # Add pairwise comparisons p-value
  stat_compare_means(label.y = 50)                  # Add global p-value</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-add-p-values-1.png" width="336" /></p>
<p>Read more at : <a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/76-add-p-values-and-significance-levels-to-ggplots/">Add P-values and Significance Levels to ggplots</a>.</p>
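<p>The p-values can also be computed directly in base R. The sketch below assumes the non-parametric defaults (a Wilcoxon test for each pair and a Kruskal-Wallis test for the global comparison); check ?stat_compare_means to confirm which method your ggpubr version applies. The helper pair() is hypothetical.</p>
<pre class="r"><code>data("ToothGrowth")

# Global comparison across the three doses
kw <- kruskal.test(len ~ factor(dose), data = ToothGrowth)
kw$p.value

# Hypothetical helper: Wilcoxon test between two dose levels
pair <- function(a, b) {
  sub <- ToothGrowth[ToothGrowth$dose %in% c(a, b), ]
  wilcox.test(len ~ factor(dose), data = sub, exact = FALSE)$p.value
}

p_values <- c("0.5 vs 1" = pair(0.5, 1),
              "1 vs 2"   = pair(1, 2),
              "0.5 vs 2" = pair(0.5, 2))
p_values</code></pre>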
<p>Color by a grouping variable:</p>
<pre class="r"><code># Color by "dose" (same variable used on x-axis)
ggerrorplot(ToothGrowth, x = "dose", y = "len", 
            desc_stat = "mean_sd", 
            color = "dose", palette = "jco")
# Color by another grouping variable "supp"
ggerrorplot(ToothGrowth, x = "dose", y = "len", 
            desc_stat = "mean_sd", 
            color = "supp", palette = "jco",
            position = position_dodge(0.3)     # Adjust the space between bars
            )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-color-by-groups-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-color-by-groups-2.png" width="336" /></p>
</div>
<div id="line-plots" class="section level2">
<h2>Line plots</h2>
<p>You can create a line plot of mean +/- error using the function <strong>ggline</strong>() [in <strong>ggpubr</strong>]. The format is as follows:</p>
<pre class="r"><code># Basic line plots of means +/- se with jittered points
ggline(ToothGrowth, x = "dose", y = "len", 
       add = c("mean_se", "jitter"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-basic-line-plots-1.png" width="432" /></p>
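<p>For readers who want to see the building blocks, a similar plot can be assembled in plain ggplot2 with stat_summary(). This is a sketch that assumes ggplot2 version 3.3.0 or later (for the <em>fun</em> argument); it approximates the ggline() result but does not reproduce its exact styling.</p>
<pre class="r"><code>library(ggplot2)

data("ToothGrowth")
tg <- ToothGrowth
tg$dose <- factor(tg$dose)     # treat dose as a discrete x variable

p <- ggplot(tg, aes(x = dose, y = len)) +
  geom_jitter(width = 0.1, color = "darkgray") +              # individual points
  stat_summary(fun = mean, geom = "line", aes(group = 1)) +   # line through the means
  stat_summary(fun = mean, geom = "point") +                  # mean points
  stat_summary(fun.data = mean_se, geom = "errorbar",
               width = 0.1) +                                 # mean +/- se
  theme_classic()
p</code></pre>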
<p>Color by groups:</p>
<pre class="r"><code>ggline(ToothGrowth, x = "dose", y = "len", 
       add = c("mean_se", "jitter"),
       color = "supp", palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-line-plots-color-by-groups-1.png" width="432" /></p>
</div>
<div id="bar-plots" class="section level2">
<h2>Bar plots</h2>
<p>R function: <strong>ggbarplot</strong>() [in <strong>ggpubr</strong>]. The format is as follows:</p>
<pre class="r"><code># Basic bar plots of means +/- se with jittered points
ggbarplot(ToothGrowth, x = "dose", y = "len", 
       add = c("mean_se", "jitter"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-basic-bar-plots-1.png" width="432" /></p>
<p>Color by groups:</p>
<pre class="r"><code>ggbarplot(ToothGrowth, x = "dose", y = "len", 
          add = c("mean_se", "jitter"),
          color = "supp", palette = "jco",
          position = position_dodge(0.8))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-bar-plots-color-by-groups-1.png" width="432" /></p>
</div>
<div id="add-labels" class="section level2">
<h2>Add labels</h2>
<p>In this section we’ll plot group means with individual information.</p>
<p><em>Data</em>: mtcars.</p>
<pre class="r"><code># Prepare the data set
# Use row names as individual names
df <- as.data.frame(mtcars[, c("am", "hp")])
df$name <- rownames(df)
head(df)</code></pre>
<pre><code>##                   am  hp              name
## Mazda RX4          1 110         Mazda RX4
## Mazda RX4 Wag      1 110     Mazda RX4 Wag
## Datsun 710         1  93        Datsun 710
## Hornet 4 Drive     0 110    Hornet 4 Drive
## Hornet Sportabout  0 175 Hornet Sportabout
## Valiant            0 105           Valiant</code></pre>
<p>Create a bar plot with individual labels.</p>
<pre class="r"><code>set.seed(123)
# Bar plot of mean +/- se, add individual points
ggbarplot(df, x = "am", y = "hp",
          add = c("mean_se", "point"),
          color = "am", fill = "am", alpha = 0.5,
          palette = "jco")+
   ggrepel::geom_text_repel(aes(label = name))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-add-labels-1.png" width="480" /></p>
</div>
<div id="application-to-gene-expression-data" class="section level2">
<h2>Application to gene expression data</h2>
<p>In our previous article - <a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/77-facilitating-exploratory-data-visualization-application-to-tcga-genomic-data/">Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data</a> - we described how to visualize gene expression data using box plots, violin plots, dot plots and stripcharts. We also demonstrated how to combine the plots of multiple variables (genes) in the same plot.</p>
<p>Here we provide some R code to visualize the mean expression profile of one or multiple genes. We’ll use the gene expression data set described in our previous tutorial: <a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/77-facilitating-exploratory-data-visualization-application-to-tcga-genomic-data/">Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data</a>.</p>
<pre class="r"><code>expr <- read.delim("https://raw.githubusercontent.com/kassambara/data/master/expr_tcga.txt",
                   stringsAsFactors = FALSE)</code></pre>
<p>The data set contains the mRNA expression for five genes of interest - GATA3, PTEN, XBP1, ESR1 and MUC1 - from 3 different data sets:</p>
<ul>
<li>Breast invasive carcinoma (BRCA),</li>
<li>Ovarian serous cystadenocarcinoma (OV) and</li>
<li>Lung squamous cell carcinoma (LUSC)</li>
</ul>
<p>The R code below displays the mean expression of three genes - “GATA3”, “PTEN” and “XBP1”.</p>
<pre class="r"><code>ggline(expr, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      combine = TRUE,
      ylab = "Expression", 
      add = "mean_sd")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-plot-1.png" width="768" /></p>
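<p>One way to picture plotting several y variables at once is as a reshape from wide to long format followed by a per-gene summary. The base R sketch below uses a small toy data frame with made-up values (not the TCGA data), and the reshaping step is an assumption about the general idea rather than ggpubr's actual code.</p>
<pre class="r"><code># Toy data with made-up expression values (NOT the TCGA data)
toy <- data.frame(
  dataset = rep(c("BRCA", "OV", "LUSC"), each = 2),
  GATA3 = c(5, 7, 1, 3, 2, 4),
  PTEN  = c(2, 4, 3, 5, 1, 1)
)

# Wide to long: one row per (sample, gene)
stacked <- stack(toy[, c("GATA3", "PTEN")])   # columns: values, ind
long <- data.frame(dataset = rep(toy$dataset, times = 2),
                   gene = stacked$ind,
                   expression = stacked$values)

# Mean expression per dataset and gene: the values a mean plot would show
means <- aggregate(expression ~ dataset + gene, data = long, FUN = mean)
means</code></pre>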
<p>You can also add other geometries on the mean plot such as jitter points, dotplot or violin. To add a violin plot, type this:</p>
<pre class="r"><code>ggline(expr, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      combine = TRUE,
      ylab = "Expression", 
      color = "gray",                                     # Line color
      add = c("mean_sd", "violin"),                     
      add.params = list(color = "dataset"),
      palette = "jco"
      )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-violin-1.png" width="768" /></p>
<p>To add jitter points, we’ll use a small subset of data for readability:</p>
<pre class="r"><code># Subset 50 random rows
set.seed(123)
random_rows <- sample(1:nrow(expr), 50)
expr2 <- expr[random_rows, ]
# Visualize
ggline(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      combine = TRUE,
      ylab = "Expression", 
      color = "gray",                           
      add = c("mean_sd", "jitter"),                     
      add.params = list(color = "dataset", size = 0.5),
      palette = "jco"
      )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-add-jitter-1.png" width="768" /></p>
<p>As <a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/77-facilitating-exploratory-data-visualization-application-to-tcga-genomic-data/">previously shown</a>, you can merge the three plots as follows:</p>
<pre class="r"><code># Merge the three plots
ggline(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      merge = TRUE,
      ylab = "Expression", 
      add = "mean_sd",
      palette = "jco")
# Add jitter
ggline(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      merge = TRUE,
      ylab = "Expression", 
      add = c("mean_sd", "jitter"),              # Add jitter points
      palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-merge-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-merge-2.png" width="336" /></p>
<p>Show line labels:</p>
<pre class="r"><code>ggline(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      merge = TRUE,
      ylab = "Expression",                   
      add = "mean",   
      show.line.label = TRUE,
      repel = TRUE,
      legend = "none",
      palette = rep("black", 3)   # Black color for each line
      )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-line-labels-1.png" width="480" /></p>
<p>Let’s create a more complex plot with point labels:</p>
<pre class="r"><code># Merged line plot with error bars, jitter points and point labels
ggline(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      merge = TRUE,
      ylab = "Expression", 
      add = c("mean_se", "jitter"),              # Add mean_se and jitter points
      add.params = list(size = 0.7),             # Add point size
      label = "bcr_patient_barcode",             # Add point labels
      label.select = list(top.up = 2),           # show only labels for the top 2 points
      font.label = list(color = ".y."),          # Color labels by .y., here gene names
      repel = TRUE,                              # Use repel to avoid labels overplotting
      palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-merge-and-labels-1.png" width="672" /></p>
<p>Plot a bar plot of means:</p>
<pre class="r"><code># Create bar plots
ggbarplot(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      combine = TRUE,
      ylab = "Expression", 
      add = "mean_se")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-barplot-1.png" width="768" /></p>
<pre class="r"><code># Merge bar plots
ggbarplot(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      merge = TRUE,
      ylab = "Expression", 
      add = "mean_se", palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-barplot-2.png" width="768" /></p>
<p>Error plots:</p>
<pre class="r"><code># Create error plots
ggerrorplot(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      combine = TRUE,
      ylab = "Expression", 
      add = "mean_sd")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-error-plot-1.png" width="768" /></p>
<pre class="r"><code># Merge error plots
ggerrorplot(expr2, x = "dataset",
      y = c("GATA3", "PTEN", "XBP1"),
      merge = TRUE,
      ylab = "Expression", 
      add = "mean_sd", palette = "jco",
      position = position_dodge(0.3)
      )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/020-plot-means-medians-and-error-bars-mean-gene-expression-error-plot-2.png" width="768" /></p>
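<p>To make the error bars above more concrete, here is a minimal base R sketch of the quantities behind “mean_sd” and “mean_se”, assuming they correspond to the group mean plus or minus one standard deviation or one standard error. The data frame <em>toy</em> and its values are hypothetical and are not taken from the TCGA data.</p>
<pre class="r"><code># Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  dataset = rep(c("BRCA", "OV"), each = 4),
  GATA3   = c(2.9, 2.2, 1.3, 1.8, -0.5, 0.1, -0.3, 0.3)
)

# Group-wise mean, standard deviation and sample size
grp_mean <- tapply(toy$GATA3, toy$dataset, mean)
grp_sd   <- tapply(toy$GATA3, toy$dataset, sd)
grp_n    <- tapply(toy$GATA3, toy$dataset, length)

# Standard error of the mean: sd / sqrt(n)
grp_se <- grp_sd / sqrt(grp_n)

# Lower and upper limits of the error bars
summary_stats <- data.frame(
  dataset = names(grp_mean),
  mean    = as.vector(grp_mean),
  sd_low  = as.vector(grp_mean - grp_sd),
  sd_high = as.vector(grp_mean + grp_sd),
  se_low  = as.vector(grp_mean - grp_se),
  se_high = as.vector(grp_mean + grp_se)
)
summary_stats</code></pre>
<p>Because the standard error shrinks with the sample size, “mean_se” bars are never wider than “mean_sd” bars computed on the same data.</p>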
</div>
</div>
</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Fri, 01 Sep 2017 01:00:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Perfect Scatter Plots with Correlation and Marginal Histograms]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/78-perfect-scatter-plots-with-correlation-and-marginal-histograms/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/78-perfect-scatter-plots-with-correlation-and-marginal-histograms/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">

<p><strong>Scatter plots</strong> are used to display the relationship between two variables x and y. In this article, we’ll start by showing how to create beautiful scatter plots in R. We’ll use helper functions in the <a href="https://www.sthda.com/english/rpkgs/ggpubr/index.html">ggpubr R package</a> to automatically display the <strong>correlation coefficient</strong> and the <strong>significance level</strong> on the plot. We’ll also describe how to color points by groups and how to add concentration ellipses around each group. Additionally, we’ll show how to create <strong>bubble charts</strong> and how to add <strong>marginal plots</strong> (histogram, density or boxplot) to a scatter plot.</p>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-scatter-plots-logo-1.png" width="528" /></p>
<br/>
<p><strong>Contents</strong>:</p>
<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a><ul>
<li><a href="#required-r-package">Required R package</a></li>
<li><a href="#demo-data-sets">Demo data sets</a></li>
</ul></li>
<li><a href="#basic-plots">Basic plots</a></li>
<li><a href="#color-by-groups">Color by groups</a></li>
<li><a href="#add-concentration-ellipses">Add concentration ellipses</a></li>
<li><a href="#add-point-labels">Add point labels</a></li>
<li><a href="#bubble-chart">Bubble chart</a></li>
<li><a href="#color-by-a-continuous-variable">Color by a continuous variable</a></li>
<li><a href="#add-marginal-plots">Add marginal plots</a></li>
<li><a href="#add-2d-density-estimation">Add 2d density estimation</a></li>
<li><a href="#application-to-gene-expression-data">Application to gene expression data</a></li>
<li><a href="#further-readings">Further readings</a></li>
</ul>
</div>

<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="required-r-package" class="section level3">
<h3>Required R package</h3>
<p>You need to install the R package <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr (version >= 0.1.3)</a> to easily create ggplot2-based publication-ready plots.</p>
<p>We recommend installing the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</p>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<p>If the installation from GitHub fails, try installing from <a href="https://cran.r-project.org/package=ggpubr">CRAN</a> as follows:</p>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<p>Load ggpubr:</p>
<pre class="r"><code>library(ggpubr)</code></pre>
<p>The following R functions will be used:</p>
<div class="warning">
<ul>
<li>
<strong>ggscatter</strong>()[in ggpubr]: plot scatter plots
</li>
<li>
<strong>stat_cor</strong>()[in ggpubr]: Add correlation coefficients and significance levels
</li>
</ul>
</div>
</div>
<div id="demo-data-sets" class="section level3">
<h3>Demo data sets</h3>
<p>Data: the <a href="https://www.sthda.com/english/wiki/r-built-in-data-sets#mtcars-motor-trend-car-road-tests">mtcars</a> data set.</p>
<pre class="r"><code># Load data
data("mtcars")
df <- mtcars
# Convert cyl as a grouping variable
df$cyl <- as.factor(df$cyl)

# Inspect the data
head(df[, c("wt", "mpg", "cyl", "qsec")])</code></pre>
<pre><code>##                     wt  mpg cyl qsec
## Mazda RX4         2.62 21.0   6 16.5
## Mazda RX4 Wag     2.88 21.0   6 17.0
## Datsun 710        2.32 22.8   4 18.6
## Hornet 4 Drive    3.21 21.4   6 19.4
## Hornet Sportabout 3.44 18.7   8 17.0
## Valiant           3.46 18.1   6 20.2</code></pre>
</div>
</div>
<div id="basic-plots" class="section level2">
<h2>Basic plots</h2>
<pre class="r"><code>ggscatter(df, x = "wt", y = "mpg",
          add = "reg.line",                                 # Add regression line
          conf.int = TRUE,                                  # Add confidence interval
          add.params = list(color = "blue",
                            fill = "lightgray")
          )+
  stat_cor(method = "pearson", label.x = 3, label.y = 30)  # Add correlation coefficient</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-basic-1.png" width="480" /></p>
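<p>The correlation coefficient and p-value displayed on the plot can be cross-checked with base R. The sketch below uses <strong>cor.test</strong>() [in stats] on the same two variables; the object names are only illustrative.</p>
<pre class="r"><code># Pearson correlation test between weight and miles per gallon
data("mtcars")
res <- cor.test(mtcars$wt, mtcars$mpg, method = "pearson")

# Extract the correlation coefficient and the p-value
r_value <- unname(res$estimate)
p_value <- res$p.value

round(r_value, 2)   # -0.87
p_value</code></pre>
<p>Replacing method = "pearson" by "spearman" or "kendall" gives the corresponding rank-based coefficients.</p>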
<p>You can change the point shape, by specifying the argument <em>shape</em>, for example:</p>
<pre class="r"><code>ggscatter(df, x = "wt", y = "mpg",
          shape = 18)</code></pre>
<p>To see the different point shapes commonly used in R, type this:</p>
<pre class="r"><code>show_point_shapes()</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-point-shapes-1.png" width="384" /></p>
</div>
<div id="color-by-groups" class="section level2">
<h2>Color by groups</h2>
<p>Grouping variable: cyl. To add a correlation coefficient per group, specify the grouping variable using the mapping function <strong>aes</strong>() as follows:</p>
<pre class="r"><code>ggscatter(df, x = "wt", y = "mpg",
          add = "reg.line",                         # Add regression line
          conf.int = TRUE,                          # Add confidence interval
          color = "cyl", palette = "jco",           # Color by groups "cyl"
          shape = "cyl"                             # Change point shape by groups "cyl"
          )+
  stat_cor(aes(color = cyl), label.x = 3)           # Add correlation coefficient</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-color-by-groups-1.png" width="528" /></p>
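<p>As a rough cross-check of the per-group coefficients, the same quantity can be computed in base R by splitting the data by cyl. This is only a sketch, assuming the default Pearson method; the object names are illustrative.</p>
<pre class="r"><code># Split the data by number of cylinders
data("mtcars")
by_cyl <- split(mtcars, mtcars$cyl)

# Pearson correlation between wt and mpg within each group
r_by_cyl <- sapply(by_cyl, function(d) cor(d$wt, d$mpg, method = "pearson"))
round(r_by_cyl, 2)</code></pre>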
<pre class="r"><code># Extending the regression line --> fullrange = TRUE
# Add marginal rug (marginal density) ---> rug = TRUE
ggscatter(df, x = "wt", y = "mpg",
          add = "reg.line",                         # Add regression line
          color = "cyl", palette = "jco",           # Color by groups "cyl"
          shape = "cyl",                            # Change point shape by groups "cyl"
          fullrange = TRUE,                         # Extending the regression line
          rug = TRUE                                # Add marginal rug
          )+
  stat_cor(aes(color = cyl), label.x = 3)           # Add correlation coefficient</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-color-by-groups-2.png" width="528" /></p>
</div>
<div id="add-concentration-ellipses" class="section level2">
<h2>Add concentration ellipses</h2>
<p>Main arguments:</p>
<div class="block">
<ul>
<li>
<strong>ellipse = TRUE</strong>: Draw ellipses around groups.
</li>
<li>
<strong>ellipse.level</strong>: The size of the concentration ellipse in normal probability. Default is 0.95.
</li>
<li>
<strong>ellipse.type</strong>: Ellipse types. Possible values are ‘convex’, ‘confidence’ or types supported by ggplot2::stat_ellipse including one of c(“t”, “norm”, “euclid”). Default is “norm”.
</li>
</ul>
</div>
<pre class="r"><code>ggscatter(df, x = "wt", y = "mpg",
          color = "cyl", palette = "jco",
          shape = "cyl",
          ellipse = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-concentration-ellipses-1.png" width="528" /></p>
<pre class="r"><code># Change the ellipse type to 'convex'
ggscatter(df, x = "wt", y = "mpg",
          color = "cyl", palette = "jco",
          shape = "cyl",
          ellipse = TRUE, ellipse.type = "convex")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-concentration-ellipses-2.png" width="528" /></p>
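<p>To give an idea of what a ‘convex’ outline represents, the sketch below computes the convex hull of each group with <strong>chull</strong>() [in grDevices]. It assumes that the ‘convex’ type corresponds to the convex hull of each group’s points; it illustrates the concept and is not the code used internally by ggpubr.</p>
<pre class="r"><code># Convex hull vertices of (wt, mpg) within each cyl group
data("mtcars")
hull_by_cyl <- lapply(split(mtcars, mtcars$cyl), function(d) {
  idx <- grDevices::chull(d$wt, d$mpg)   # row indices of the hull vertices
  d[idx, c("wt", "mpg")]
})

# Number of hull vertices per group
sapply(hull_by_cyl, nrow)</code></pre>
<p>Every point of a group lies inside or on the polygon defined by these vertices.</p>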
<pre class="r"><code># Add group mean points and stars
ggscatter(df, x = "wt", y = "mpg",
          color = "cyl", palette = "jco",
          shape = "cyl",
          ellipse = TRUE, 
          mean.point = TRUE,
          star.plot = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-concentration-ellipses-3.png" width="528" /></p>
</div>
<div id="add-point-labels" class="section level2">
<h2>Add point labels</h2>
<p>Main arguments:</p>
<div class="block">
<ul>
<li>
<strong>label</strong>: the name of the column containing point labels.
</li>
<li>
<strong>font.label</strong>: a list which can contain a combination of the following elements: the size (e.g.: 14), the style (e.g.: "plain", "bold", "italic", "bold.italic") and the color (e.g.: "red") of labels. For example, font.label = list(size = 14, face = "bold", color = "red"). To specify only the size and the style, use font.label = list(size = 14, face = "plain").
</li>
<li>
<strong>label.select</strong>: character vector specifying some labels to show, or a list of selection criteria such as list(criteria = ...) or list(top.up = 2).
</li>
<li>
<strong>repel = TRUE</strong>: Avoid label overlapping.
</li>
</ul>
</div>
<pre class="r"><code># Use row names as point labels
df$name <- rownames(df)
ggscatter(df, x = "wt", y = "mpg",
   color = "cyl", palette = "jco",
   label = "name", repel = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-add-label-1.png" width="576" /></p>
<pre class="r"><code># Select some labels to show
ggscatter(df, x = "wt", y = "mpg",
   color = "cyl", palette = "jco",
   label = "name", repel = TRUE,
   label.select = c("Toyota Corolla", "Merc 280", "Duster 360"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-add-label-2.png" width="576" /></p>
<pre class="r"><code># Show labels according to some criteria: x and y values
ggscatter(df, x = "wt", y = "mpg",
   color = "cyl", palette = "jco",
   label = "name", repel = TRUE,
   label.select = list(criteria = "`x` > 4 &amp; `y` < 15"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-add-label-3.png" width="576" /></p>
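<p>In the criteria string, `x` and `y` stand for the variables mapped to the x and y axes, here wt and mpg. The following base R sketch lists the cars that satisfy the same condition; the object names are illustrative.</p>
<pre class="r"><code># Cars heavier than 4 (x > 4) ...
data("mtcars")
heavy <- mtcars[mtcars$wt > 4, ]

# ... that also have mpg below 15 (y < 15)
selected <- rownames(heavy[heavy$mpg < 15, ])
selected
# "Cadillac Fleetwood" "Lincoln Continental" "Chrysler Imperial"</code></pre>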
</div>
<div id="bubble-chart" class="section level2">
<h2>Bubble chart</h2>
<p>In a bubble chart, the point size is controlled by a continuous variable, here “qsec”. In the R code below, the argument alpha is used to control color transparency. alpha should be between 0 and 1.</p>
<pre class="r"><code>ggscatter(df, x = "wt", y = "mpg",
   color = "cyl", palette = "jco",
   size = "qsec", alpha = 0.5)+
  scale_size(range = c(0.5, 15))    # Adjust the range of points size</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-bubble-chart-1.png" width="576" /></p>
</div>
<div id="color-by-a-continuous-variable" class="section level2">
<h2>Color by a continuous variable</h2>
<p>The R code below colors points according to the values of a continuous variable, here “mpg”. By default, a blue gradient color is created. This can be changed using the helper function <strong>gradient_color</strong>() [in ggpubr].</p>
<pre class="r"><code># Color by continuous variable
p <- ggscatter(df, x = "wt", y = "mpg",
               color = "mpg")
p

# Change gradient color
p + gradient_color(c("blue", "white", "red"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-color-by-continuous-variable-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-color-by-continuous-variable-2.png" width="336" /></p>
</div>
<div id="add-marginal-plots" class="section level2">
<h2>Add marginal plots</h2>
<p>The function <strong>ggMarginal</strong>() [in ggExtra package] can be used to easily add a marginal histogram, density or boxplot to a scatter plot.</p>
<p>First, install the ggExtra package as follows: <em>install.packages("ggExtra")</em>; then type the following R code:</p>
<pre class="r"><code># Add density distribution as marginal plot
library("ggExtra")
p <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
               color = "Species", palette = "jco",
               size = 3, alpha = 0.6)
ggMarginal(p, type = "density")

# Change marginal plot type
ggMarginal(p, type = "boxplot")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-marginal-plot-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-marginal-plot-2.png" width="336" /></p>
<p>One limitation of ggExtra is that it can’t cope with multiple groups in the scatter plot and the marginal plots. In the R code below, we provide a solution using the <em>cowplot</em> package.</p>
<pre class="r"><code># Scatter plot colored by groups ("Species")
sp <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
                color = "Species", palette = "jco",
                size = 3, alpha = 0.6)+
  border()                                         

# Marginal density plot of x (top panel) and y (right panel)
xplot <- ggdensity(iris, "Sepal.Length", fill = "Species",
                   palette = "jco")
yplot <- ggdensity(iris, "Sepal.Width", fill = "Species", 
                   palette = "jco")+
  rotate()

# Cleaning the plots
sp <- sp + rremove("legend")
yplot <- yplot + clean_theme() + rremove("legend")
xplot <- xplot + clean_theme() + rremove("legend")

# Arranging the plot using cowplot
library(cowplot)
plot_grid(xplot, NULL, sp, yplot, ncol = 2, align = "hv", 
          rel_widths = c(2, 1), rel_heights = c(1, 2))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-marginal-plot-grouped-data-1.png" width="576" /></p>
<p>Add marginal boxplot:</p>
<pre class="r"><code># Scatter plot colored by groups ("Species")
sp <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
                color = "Species", palette = "jco",
                size = 3, alpha = 0.6, ggtheme = theme_bw())             

# Marginal boxplot of x (top panel) and y (right panel)
xplot <- ggboxplot(iris, x = "Species", y = "Sepal.Length", 
                   color = "Species", fill = "Species", palette = "jco",
                   alpha = 0.5, ggtheme = theme_bw())+
  rotate()
yplot <- ggboxplot(iris, x = "Species", y = "Sepal.Width",
                   color = "Species", fill = "Species", palette = "jco",
                   alpha = 0.5, ggtheme = theme_bw())

# Cleaning the plots
sp <- sp + rremove("legend")
yplot <- yplot + clean_theme() + rremove("legend")
xplot <- xplot + clean_theme() + rremove("legend")

# Arranging the plot using cowplot
library(cowplot)
plot_grid(xplot, NULL, sp, yplot, ncol = 2, align = "hv", 
          rel_widths = c(2, 1), rel_heights = c(1, 2))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-add-marginal-boxplot-grouped-data-1.png" width="576" /></p>
<p>The problem with the above plots is the extra space between the main plot and the marginal plots. In a <a href="https://twitter.com/ClausWilke/status/900776341494276096">tweet</a>, Claus Wilke provided the following solution for creating a perfect scatter plot with marginal density plots or histograms:</p>
<pre class="r"><code>library(cowplot) 

# Main plot
pmain <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species))+
  geom_point()+
  ggpubr::color_palette("jco")

# Marginal densities along x axis
xdens <- axis_canvas(pmain, axis = "x")+
  geom_density(data = iris, aes(x = Sepal.Length, fill = Species),
              alpha = 0.7, size = 0.2)+
  ggpubr::fill_palette("jco")

# Marginal densities along y axis
# Need to set coord_flip = TRUE, if you plan to use coord_flip()
ydens <- axis_canvas(pmain, axis = "y", coord_flip = TRUE)+
  geom_density(data = iris, aes(x = Sepal.Width, fill = Species),
                alpha = 0.7, size = 0.2)+
  coord_flip()+
  ggpubr::fill_palette("jco")


p1 <- insert_xaxis_grob(pmain, xdens, grid::unit(.2, "null"), position = "top")
p2<- insert_yaxis_grob(p1, ydens, grid::unit(.2, "null"), position = "right")
ggdraw(p2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-marginal-density-1.png" width="576" /></p>
</div>
<div id="add-2d-density-estimation" class="section level2">
<h2>Add 2d density estimation</h2>
<pre class="r"><code># Add 2d density estimation
sp <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
                color = "lightgray")
sp + geom_density_2d()

# Gradient color
sp + stat_density_2d(aes(fill = ..level..), geom = "polygon")

# Change gradient color: custom
sp + stat_density_2d(aes(fill = ..level..), geom = "polygon")+
  gradient_fill(c("white", "steelblue"))

# Change the gradient color: RColorBrewer palette
sp + stat_density_2d(aes(fill = ..level..), geom = "polygon") +
  gradient_fill("YlOrRd")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-add-2d-density-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-add-2d-density-2.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-add-2d-density-3.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-add-2d-density-4.png" width="336" /></p>
</div>
<div id="application-to-gene-expression-data" class="section level2">
<h2>Application to gene expression data</h2>
<p>We’ll use the gene expression data set described in our previous tutorial: <a href="https://www.sthda.com/english/wiki/facilitating-exploratory-data-visualization-application-to-tcga-genomic-data">Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data</a>.</p>
<pre class="r"><code>expr <- read.delim("https://raw.githubusercontent.com/kassambara/data/master/expr_tcga.txt",
                   stringsAsFactors = FALSE)</code></pre>
<p>The data set contains the mRNA expression for five genes of interest - GATA3, PTEN, XBP1, ESR1 and MUC1 - from 3 different data sets:</p>
<ul>
<li>Breast invasive carcinoma (BRCA),</li>
<li>Ovarian serous cystadenocarcinoma (OV) and</li>
<li>Lung squamous cell carcinoma (LUSC)</li>
</ul>
<p>The following plots show the association between GATA3 and ESR1 genes expression.</p>
<pre class="r"><code># Association between GATA3 and ESR1
# Color points by dataset
# Add correlation coefficient by dataset
ggscatter(expr, x = "GATA3", y = "ESR1", size = 0.3, 
          rug = TRUE,                                # Add marginal rug
          color = "dataset", palette = "jco") +
  stat_cor(aes(color = dataset), method = "spearman")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-gene-expression-1.png" width="432" /></p>
<p>Facet/split by data set, add regression line and confidence interval:</p>
<pre class="r"><code>ggscatter(expr, x = "GATA3", y = "ESR1", size = 0.3,
          color = "dataset", palette = "jco",
          facet.by = "dataset", #scales = "free_x",
          add = "reg.line", conf.int = TRUE) +
  stat_cor(aes(color = dataset), method = "spearman", label.y = 6)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-gene-expression-facet-1.png" width="672" /></p>
<p>Combining multiple plots. Visualize the correlation of GATA3 with two other genes (ESR1 and MUC1):</p>
<pre class="r"><code>ggscatter(expr, x = "GATA3", y = c("ESR1", "MUC1"), size = 0.3,
          combine = TRUE, ylab = "Expression",
          color = "dataset", palette = "jco",
          add = "reg.line", conf.int = TRUE) +
  stat_cor(aes(color = dataset), method = "spearman")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/015-scatter-plot-and-correlation-gene-expression-combine-1.png" width="672" /></p>
</div>
<div id="further-readings" class="section level2">
<h2>Further readings</h2>
<p>See also the <a href="https://CRAN.R-project.org/package=ggpmisc">ggpmisc</a> R package to add a linear model equation to a scatter plot.</p>
</div>
</div>


</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Fri, 01 Sep 2017 00:22:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/77-facilitating-exploratory-data-visualization-application-to-tcga-genomic-data/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/77-facilitating-exploratory-data-visualization-application-to-tcga-genomic-data/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">
<p>In genomic fields, it’s very common to explore the <strong>gene expression</strong> profile of one or a list of genes involved in a pathway of interest. Here, we present some helper functions in the <a href="https://www.sthda.com/english/rpkgs/ggpubr/">ggpubr R package</a> to facilitate <strong>exploratory data analysis</strong> (<strong>EDA</strong>) for life scientists.</p>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-logo-1.png" width="672" /></p>
<p>Standard graphical techniques used in EDA include:</p>
<ul>
<li>Box plot</li>
<li>Violin plot</li>
<li>Stripchart</li>
<li>Dot plot</li>
<li>Histogram and density plots</li>
<li>ECDF plot</li>
<li>Q-Q plot</li>
</ul>
<p>All these plots can be created using the <a href="http://ggplot2.tidyverse.org/reference/"><strong>ggplot2</strong> R package</a>, which is highly flexible.</p>
<p>However, to customize a ggplot, the syntax might appear opaque for a beginner and this raises the level of difficulty for researchers with no advanced R programming skills.</p>
<div class="block">
<p>
Here, we present the <a href="https://www.sthda.com/english/rpkgs/ggpubr/">ggpubr package</a>, a wrapper around ggplot2, which provides some easy-to-use functions for creating ggplot2-based publication-ready plots. We’ll use the ggpubr functions to visualize gene expression profiles from <strong>TCGA</strong> genomic data sets.
</p>
</div>
  <br/>  
<p>
    <strong>Contents:</strong>
</p>
<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a><ul>
<li><a href="#ggpubr-package">ggpubr package</a></li>
<li><a href="#tcga-data">TCGA data</a></li>
</ul></li>
<li><a href="#gene-expression-data">Gene expression data</a></li>
<li><a href="#box-plots">Box plots</a></li>
<li><a href="#violin-plots">Violin plots</a></li>
<li><a href="#stripcharts-and-dot-plots">Stripcharts and dot plots</a></li>
<li><a href="#density-plots">Density plots</a></li>
<li><a href="#histogram-plots">Histogram plots</a></li>
<li><a href="#empirical-cumulative-density-function">Empirical cumulative density function</a></li>
<li><a href="#quantile---quantile-plot">Quantile - Quantile plot</a></li>
</ul>
</div>
<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="ggpubr-package" class="section level3">
<h3>ggpubr package</h3>
<p>Required R package: <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr</a>.</p>
<ul>
<li>Install from <a href="https://cran.r-project.org/package=ggpubr">CRAN</a> as follows:</li>
</ul>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<ul>
<li>Or, install the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</li>
</ul>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<ul>
<li>Load ggpubr:</li>
</ul>
<pre class="r"><code>library(ggpubr)</code></pre>
</div>
<div id="tcga-data" class="section level3">
<h3>TCGA data</h3>
<p><a href="https://cancergenome.nih.gov/">The Cancer Genome Atlas (TCGA) data</a> is a publicly available collection of clinical and genomic data across 33 cancer types. These data include gene expression, CNV profiling, SNP genotyping, DNA methylation, miRNA profiling, exome sequencing, and other types of data.</p>
<p>The <a href="https://github.com/RTCGA/RTCGA">RTCGA</a> R package, by Marcin Kosinski et al., provides a convenient way to access the clinical and genomic data available in TCGA. Each data set is distributed as a separate data package, which must be installed (once) individually.</p>
<p>The following R code installs the core RTCGA package as well as the clinical and mRNA gene expression data packages.</p>
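<pre class="r"><code># Install the Bioconductor package manager
# (BiocManager replaces the older biocLite.R installer script)
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
# Install the main RTCGA package
BiocManager::install("RTCGA")
# Install the clinical and mRNA gene expression data packages
BiocManager::install("RTCGA.clinical")
BiocManager::install("RTCGA.mRNA")</code></pre>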
<p>To see the type of data available for each cancer type, use this:</p>
<pre class="r"><code>library(RTCGA)
infoTCGA()</code></pre>
<pre><code>## # A tibble: 38 x 13
##   Cohort    BCR Clinical     CN   LowP Methylation   mRNA mRNASeq    miR
## * <fctr> <fctr>   <fctr> <fctr> <fctr>      <fctr> <fctr>  <fctr> <fctr>
## 1    ACC     92       92     90      0          80      0      79      0
## 2   BLCA    412      412    410    112         412      0     408      0
## 3   BRCA   1098     1097   1089     19        1097    526    1093      0
## 4   CESC    307      307    295     50         307      0     304      0
## 5   CHOL     51       45     36      0          36      0      36      0
## 6   COAD    460      458    451     69         457    153     457      0
## # ... with 32 more rows, and 4 more variables: miRSeq <fctr>, RPPA <fctr>,
## #   MAF <fctr>, rawMAF <fctr></code></pre>
<div class="success">
<p>
More information about the disease names can be found at: <a href="http://gdac.broadinstitute.org/" class="uri">http://gdac.broadinstitute.org/</a>
</p>
</div>
</div>
</div>
<div id="gene-expression-data" class="section level2">
<h2>Gene expression data</h2>
<p>The R function <strong>expressionsTCGA</strong>() [in RTCGA package] can be used to easily extract the expression values of genes of interest in one or multiple cancer types.</p>
<p>In the following R code, we start by extracting the mRNA expression for five genes of interest - GATA3, PTEN, XBP1, ESR1 and MUC1 - from 3 different data sets:</p>
<ul>
<li>Breast invasive carcinoma (BRCA),</li>
<li>Ovarian serous cystadenocarcinoma (OV) and</li>
<li>Lung squamous cell carcinoma (LUSC)</li>
</ul>
<pre class="r"><code>library(RTCGA)
library(RTCGA.mRNA)
expr <- expressionsTCGA(BRCA.mRNA, OV.mRNA, LUSC.mRNA,
                        extract.cols = c("GATA3", "PTEN", "XBP1","ESR1", "MUC1"))
expr</code></pre>
<pre><code>## # A tibble: 1,305 x 7
##            bcr_patient_barcode   dataset GATA3  PTEN  XBP1   ESR1  MUC1
##                          <chr>     <chr> <dbl> <dbl> <dbl>  <dbl> <dbl>
## 1 TCGA-A1-A0SD-01A-11R-A115-07 BRCA.mRNA  2.87 1.361  2.98  3.084  1.65
## 2 TCGA-A1-A0SE-01A-11R-A084-07 BRCA.mRNA  2.17 0.428  2.55  2.386  3.08
## 3 TCGA-A1-A0SH-01A-11R-A084-07 BRCA.mRNA  1.32 1.306  3.02  0.791  2.99
## 4 TCGA-A1-A0SJ-01A-11R-A084-07 BRCA.mRNA  1.84 0.810  3.13  2.495 -1.92
## 5 TCGA-A1-A0SK-01A-12R-A084-07 BRCA.mRNA -6.03 0.251 -1.45 -4.861 -1.17
## 6 TCGA-A1-A0SM-01A-11R-A084-07 BRCA.mRNA  1.80 1.311  4.04  2.797  3.53
## # ... with 1,299 more rows</code></pre>
<p>To display the number of samples in each data set, type this:</p>
<pre class="r"><code>nb_samples <- table(expr$dataset)
nb_samples</code></pre>
<pre><code>## 
## BRCA.mRNA LUSC.mRNA   OV.mRNA 
##       590       154       561</code></pre>
<p>We can simplify the data set names by removing the “.mRNA” suffix. This can be done using the base R function <strong>gsub</strong>().</p>
<pre class="r"><code>expr$dataset <- gsub(pattern = ".mRNA", replacement = "",  expr$dataset)</code></pre>
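<p>Note that gsub() interprets its pattern as a regular expression by default, so an unescaped dot matches any character. This makes no difference for the data set names used here, but the sketch below, on a small hypothetical vector, shows how to match the suffix literally.</p>
<pre class="r"><code># Hypothetical vector of data set names
x <- c("BRCA.mRNA", "OV.mRNA", "LUSC.mRNA")

# Literal match: the dot is treated as a plain character
literal <- gsub(".mRNA", "", x, fixed = TRUE)
literal   # "BRCA" "OV" "LUSC"

# Regular expression: an unescaped dot matches any single character
regex_hit   <- gsub(".mRNA", "", "XmRNA")                 # "" (the "X" is consumed)
literal_hit <- gsub(".mRNA", "", "XmRNA", fixed = TRUE)   # "XmRNA" (unchanged)

# Equivalent regular expression, with the dot escaped and anchored at the end
escaped <- gsub("\\.mRNA$", "", x)</code></pre>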
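<p>Let’s also simplify the patient barcode column. The following R code changes the barcodes into BRCA1, BRCA2, …, OV1, OV2, …, etc. It assumes that the rows are ordered as BRCA (590 samples), then OV (561), then LUSC (154), which is the order in which the data sets were passed to expressionsTCGA():</p>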
<pre class="r"><code>expr$bcr_patient_barcode <- paste0(expr$dataset, c(1:590, 1:561, 1:154))
expr</code></pre>
<pre><code>## # A tibble: 1,305 x 7
##   bcr_patient_barcode dataset GATA3  PTEN  XBP1   ESR1  MUC1
##                 <chr>   <chr> <dbl> <dbl> <dbl>  <dbl> <dbl>
## 1               BRCA1    BRCA  2.87 1.361  2.98  3.084  1.65
## 2               BRCA2    BRCA  2.17 0.428  2.55  2.386  3.08
## 3               BRCA3    BRCA  1.32 1.306  3.02  0.791  2.99
## 4               BRCA4    BRCA  1.84 0.810  3.13  2.495 -1.92
## 5               BRCA5    BRCA -6.03 0.251 -1.45 -4.861 -1.17
## 6               BRCA6    BRCA  1.80 1.311  4.04  2.797  3.53
## # ... with 1,299 more rows</code></pre>
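<p>The index vector <em>c(1:590, 1:561, 1:154)</em> is hard-coded: it assumes that the rows are stored in blocks of 590 BRCA, 561 OV and 154 LUSC samples, in that order. As a hedged alternative, the counters can be derived from the data with the base R function <strong>ave</strong>(). The sketch below uses a small made-up vector in place of <em>expr$dataset</em>:</p>
<pre class="r"><code># Toy stand-in for the 'dataset' column (hypothetical values, deliberately unordered)
dataset <- c("BRCA", "BRCA", "OV", "LUSC", "OV", "BRCA")
# Running counter within each data set, following the row order
idx <- ave(seq_along(dataset), dataset, FUN = seq_along)
# Build the simplified identifiers
barcode <- paste0(dataset, idx)
barcode</code></pre>
<p>With this approach the identifiers stay correct even if the number of samples or the row order changes.</p>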
<p>The above data set (expr) has been saved at <a href="https://raw.githubusercontent.com/kassambara/data/master/expr_tcga.txt" class="uri">https://raw.githubusercontent.com/kassambara/data/master/expr_tcga.txt</a>. This data is required to practice the R code provided in this tutorial.</p>
<p>If you experience issues installing the RTCGA packages, you can simply load the data as follows:</p>
<pre class="r"><code>expr <- read.delim("https://raw.githubusercontent.com/kassambara/data/master/expr_tcga.txt",
                   stringsAsFactors = FALSE)</code></pre>
</div>
<div id="box-plots" class="section level2">
<h2>Box plots</h2>
<p>Create a box plot of a gene expression profile, colored by groups (here data set/cancer type):</p>
<pre class="r"><code>library(ggpubr)
# GATA3
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco")
# PTEN
ggboxplot(expr, x = "dataset", y = "PTEN",
          title = "PTEN", ylab = "Expression",
          color = "dataset", palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-boxplot-gene-expression-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-boxplot-gene-expression-2.png" width="336" /></p>
<div class="block">
<p>
Note that the argument <strong>palette</strong> is used to change color palettes. Allowed values include:
</p>
<ul>
<li>
“grey” for grey color palettes;
</li>
<li>
brewer palettes, e.g. “RdBu”, “Blues”, …. To view all of them, type this in R: <strong>RColorBrewer::display.brewer.all()</strong> or <a href="https://www.sthda.com/english/wiki/ggplot2-colors-how-to-change-colors-automatically-and-manually#use-rcolorbrewer-palettes">click here to see all brewer palettes</a>;
</li>
<li>
or custom color palettes e.g. c(“blue”, “red”) or c(“#00AFBB”, “#E7B800”);
</li>
<li>
and scientific journal palettes from the <a href="https://cran.r-project.org/web/packages/ggsci/vignettes/ggsci.html">ggsci R package</a>, e.g.: “npg”, “aaas”, “lancet”, “jco”, “ucscgb”, “uchicago”, “simpsons” and “rickandmorty”.
</li>
</ul>
</div>
<p>Instead of repeating the same R code for each gene, you can create a list of plots at once, as follows:</p>
<pre class="r"><code># Create a  list of plots
p <- ggboxplot(expr, x = "dataset", 
               y = c("GATA3", "PTEN", "XBP1"),
               title = c("GATA3", "PTEN", "XBP1"),
               ylab = "Expression", 
               color = "dataset", palette = "jco")
# View GATA3
p$GATA3
# View PTEN
p$PTEN
# View XBP1
p$XBP1</code></pre>
<div class="block">
<p>
Note that when the argument <em>y</em> contains multiple variables (here multiple gene names), the arguments <em>title</em>, <em>xlab</em> and <em>ylab</em> can also be character vectors of the same length as <em>y</em>.
</p>
</div>
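<p>A list of plots like this one can also be laid out on a single page with the ggpubr function <strong>ggarrange</strong>(). The following is a minimal sketch, not part of the original analysis: it uses a small simulated data frame (<em>toy</em>, hypothetical values) so that it runs without the TCGA data.</p>
<pre class="r"><code>library(ggpubr)
# Simulated stand-in for 'expr' (hypothetical values, illustration only)
set.seed(123)
toy <- data.frame(
  dataset = rep(c("BRCA", "OV", "LUSC"), each = 20),
  GATA3 = rnorm(60), PTEN = rnorm(60), XBP1 = rnorm(60)
)
# One box plot per gene, returned as a named list
p <- ggboxplot(toy, x = "dataset",
               y = c("GATA3", "PTEN", "XBP1"),
               title = c("GATA3", "PTEN", "XBP1"),
               ylab = "Expression",
               color = "dataset", palette = "jco")
# Arrange the three plots side by side with a shared legend
fig <- ggarrange(plotlist = p, ncol = 3, nrow = 1, common.legend = TRUE)
fig</code></pre>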
<p>To add p-values and significance levels to the boxplots, read our previous article: <a href="https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/76-add-p-values-and-significance-levels-to-ggplots/">Add P-values and Significance Levels to ggplots</a>. Briefly, you can do this:</p>
<pre class="r"><code>my_comparisons <- list(c("BRCA", "OV"), c("OV", "LUSC"))
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco")+
  stat_compare_means(comparisons = my_comparisons)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-compare-means-1.png" width="384" /></p>
<p>For each of the genes, you can compare the different groups as follows:</p>
<pre class="r"><code>compare_means(c(GATA3, PTEN, XBP1) ~ dataset, data = expr)</code></pre>
<pre><code>## # A tibble: 9 x 8
##      .y. group1 group2         p     p.adj p.format p.signif   method
##   <fctr>  <chr>  <chr>     <dbl>     <dbl>    <chr>    <chr>    <chr>
## 1  GATA3   BRCA     OV 1.11e-177 3.34e-177  < 2e-16     **** Wilcoxon
## 2  GATA3   BRCA   LUSC  6.68e-73  1.34e-72  < 2e-16     **** Wilcoxon
## 3  GATA3     OV   LUSC  2.97e-08  2.97e-08  3.0e-08     **** Wilcoxon
## 4   PTEN   BRCA     OV  6.79e-05  6.79e-05  6.8e-05     **** Wilcoxon
## 5   PTEN   BRCA   LUSC  1.04e-16  3.13e-16  < 2e-16     **** Wilcoxon
## 6   PTEN     OV   LUSC  1.28e-07  2.56e-07  1.3e-07     **** Wilcoxon
## # ... with 3 more rows</code></pre>
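<p>The <em>p.adj</em> column in the output above is consistent with Holm’s correction applied separately to the three pairwise comparisons of each gene. This is inferred from the printed values, so treat it as an assumption and check the documentation of <strong>compare_means</strong>() for the adjustment method actually used. A minimal sketch reproducing the GATA3 adjusted values with the base R function <strong>p.adjust</strong>():</p>
<pre class="r"><code># Rounded GATA3 p-values copied from the output above
p <- c(1.11e-177, 6.68e-73, 2.97e-08)
# Holm: the smallest p-value is multiplied by 3, the next by 2, the largest by 1
p_holm <- p.adjust(p, method = "holm")
p_holm</code></pre>
<p>The small difference from the printed 3.34e-177 comes from using the rounded p-value as input.</p>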
<p>If you want to select items (here cancer types) to display or to remove a particular item from the plot, use the argument <strong>select</strong> or <strong>remove</strong>, as follows:</p>
<pre class="r"><code># Select BRCA and OV cancer types
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco",
          select = c("BRCA", "OV"))
# or remove BRCA
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco",
          remove = "BRCA")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-select-dataset-1.png" width="336" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-select-dataset-2.png" width="336" /></p>
<p>To change the order of the data sets on the x axis, use the argument <strong>order</strong>. For example, <em>order = c(“LUSC”, “OV”, “BRCA”)</em>:</p>
<pre class="r"><code># Order data sets
ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco",
          order = c("LUSC", "OV", "BRCA"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-order-dataset-1.png" width="336" /></p>
<p>To create horizontal plots, use the argument <strong>rotate = TRUE</strong>:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset", y = "GATA3",
          title = "GATA3", ylab = "Expression",
          color = "dataset", palette = "jco",
          rotate = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-horizontal-plot-1.png" width="432" /></p>
<p>To combine the three gene expression plots into a multi-panel plot, use the argument <strong>combine = TRUE</strong>:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          ylab = "Expression",
          color = "dataset", palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-boxplot-gene-expression-multi-panel-1.png" width="720" /></p>
<p>You can also merge the 3 plots using the argument <strong>merge = TRUE</strong> or <strong>merge = “asis”</strong>:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          merge = TRUE,
          ylab = "Expression", 
          palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-merge-plot-1.png" width="576" /></p>
<p>In the plot above, it’s easy to visually compare the expression level of the different genes in each cancer type.</p>
<p>But you might want to put genes (y variables) on the x axis, in order to compare the expression level of each gene across the different cancer types.</p>
<p>In this situation, the y variables (i.e. genes) become x tick labels and the x variable (i.e. dataset) becomes the grouping variable. To do this, use the argument <strong>merge = “flip”</strong>.</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          merge = "flip",
          ylab = "Expression", 
          palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-merge-flip-1.png" width="576" /></p>
<p>You might want to add jittered points on the boxplot. Each point corresponds to an individual observation. To add jittered points, use the argument <strong>add = “jitter”</strong> as follows. To customize the added elements, specify the argument <strong>add.params</strong>.</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "jitter",                              # Add jittered points
          add.params = list(size = 0.1, jitter = 0.2)  # Point size and the amount of jittering
          )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-boxplot-with-jitter-points-1.png" width="720" /></p>
<div class="block">
<p>
Note that when using <strong>ggboxplot</strong>(), sensible values for the argument <strong>add</strong> are one of c(“jitter”, “dotplot”). If you decide to use <strong>add = “dotplot”</strong>, you can adjust <em>dotsize</em> and <em>binwidth</em> when the dotplot is very dense. <a href="http://r4ds.had.co.nz/eda.html">Read more about binwidth</a>.
</p>
</div>
<p>You can add and adjust a dotplot as follows:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "dotplot",                              # Add dotplot
          add.params = list(binwidth = 0.1, dotsize = 0.3)
          )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-boxplot-with-dotplot-1.png" width="720" /></p>
<p>You might want to label the boxplot by showing the names of samples with the top n highest or lowest values. In this case, you can use the following arguments:</p>
<ul>
<li><strong>label</strong>: the name of the column containing point labels.</li>
<li><strong>label.select</strong>: can be of two formats:
<ul>
<li>a <em>character vector</em> specifying some labels to show.</li>
<li>a <em>list</em> containing one or the combination of the following components:
<ul>
<li><em>top.up</em> and <em>top.down</em>: to display the labels of the top up/down points. For example, <em>label.select = list(top.up = 10, top.down = 4)</em>.</li>
<li><em>criteria</em>: to filter, for example, by x and y variables values, use this: <em>label.select = list(criteria = “`y` > 3.9 &amp; `y` < 5 &amp; `x` %in% c(‘BRCA’, ‘OV’)”)</em>.</li>
</ul></li>
</ul></li>
</ul>
<p>For example:</p>
<pre class="r"><code>ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "jitter",                               # Add jittered points
          add.params = list(size = 0.1, jitter = 0.2),  # Point size and the amount of jittering
          label = "bcr_patient_barcode",                # column containing point labels
          label.select = list(top.up = 2, top.down = 2),# Select some labels to display
          font.label = list(size = 9, face = "italic"), # label font
          repel = TRUE                                  # Avoid label text overplotting
          )</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-boxplot-with-point-labels-1.png" width="720" /></p>
<p>More complex labeling criteria can be specified as follows:</p>
<pre class="r"><code>label.select.criteria <- list(criteria = "`y` > 3.9 &amp; `x` %in% c('BRCA', 'OV')")
ggboxplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE,
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          label = "bcr_patient_barcode",              # column containing point labels
          label.select = label.select.criteria,       # Select some labels to display
          font.label = list(size = 9, face = "italic"), # label font
          repel = TRUE                                # Avoid label text overplotting
          )</code></pre>
<div class="warning">
<p>
Other types of plots, with the same arguments as the function <strong>ggboxplot</strong>(), are available, such as stripchart and violin plots.
</p>
</div>
</div>
<div id="violin-plots" class="section level2">
<h2>Violin plots</h2>
<p>The following R code draws violin plots with box plots inside:</p>
<pre class="r"><code>ggviolin(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE, 
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "boxplot")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-violin-plots-and-box-plots-1.png" width="768" /></p>
<p>Instead of adding a box plot inside the violin plot, you can add the median + interquartile range as follows:</p>
<pre class="r"><code>ggviolin(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE, 
          color = "dataset", palette = "jco",
          ylab = "Expression", 
          add = "median_iqr")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-violin-plots-and-median-iqr-1.png" width="768" /></p>
<div class="block">
<p>
When using the function <strong>ggviolin</strong>(), sensible values for the argument <strong>add</strong> include: “mean”, “mean_se”, “mean_sd”, “mean_ci”, “mean_range”, “median”, “median_iqr”, “median_mad”, “median_range”.
</p>
<p>
You can also add “jitter” points and “dotplot” inside the violin plot as described previously in the box plot section.
</p>
</div>
</div>
<div id="stripcharts-and-dot-plots" class="section level2">
<h2>Stripcharts and dot plots</h2>
<p>To draw a stripchart, type this:</p>
<pre class="r"><code>ggstripchart(expr, x = "dataset",
             y = c("GATA3", "PTEN", "XBP1"),
             combine = TRUE, 
             color = "dataset", palette = "jco",
             size = 0.1, jitter = 0.2,
             ylab = "Expression", 
             add = "median_iqr",
             add.params = list(color = "gray"))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-stripchart-1.png" width="768" /></p>
<p>For a dot plot, use this:</p>
<pre class="r"><code>ggdotplot(expr, x = "dataset",
          y = c("GATA3", "PTEN", "XBP1"),
          combine = TRUE, 
          color = "dataset", palette = "jco",
          fill = "white",
          binwidth = 0.1,
          ylab = "Expression", 
          add = "median_iqr",
          add.params = list(size = 0.9))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-dot-plots-1.png" width="768" /></p>
</div>
<div id="density-plots" class="section level2">
<h2>Density plots</h2>
<p>To visualize the distribution as a density plot, use the function <strong>ggdensity</strong>() as follows:</p>
<pre class="r"><code># Basic density plot
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..density..",
       combine = TRUE,                  # Combine the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE                       # Add marginal rug
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-density-plot-1.png" width="768" /></p>
<pre class="r"><code># Change color and fill by dataset
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..density..",
       combine = TRUE,                  # Combine the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE,                      # Add marginal rug
       color = "dataset", 
       fill = "dataset",
       palette = "jco"
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-density-plot-2.png" width="768" /></p>
<pre class="r"><code># Merge the 3 plots
# and use y = "..count.." instead of "..density.."
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-density-plot-3.png" width="768" /></p>
<pre class="r"><code># color and fill by x variables
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       color = ".x.", fill = ".x.",     # color and fill by x variables
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-density-plot-4.png" width="768" /></p>
<pre class="r"><code># Facet by "dataset"
ggdensity(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       color = ".x.", fill = ".x.", 
       facet.by = "dataset",            # Split by "dataset" into multi-panel
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-density-plot-5.png" width="768" /></p>
</div>
<div id="histogram-plots" class="section level2">
<h2>Histogram plots</h2>
<p>To visualize the distribution as a histogram, use the function <strong>gghistogram</strong>() as follows:</p>
<pre class="r"><code># Basic histogram plot 
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..density..",
       combine = TRUE,                  # Combine the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE                       # Add marginal rug
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-histogram-plot-1.png" width="768" /></p>
<pre class="r"><code># Change color and fill by dataset
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..density..",
       combine = TRUE,                  # Combine the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE,                      # Add marginal rug
       color = "dataset", 
       fill = "dataset",
       palette = "jco"
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-histogram-plot-2.png" width="768" /></p>
<pre class="r"><code># Merge the 3 plots
# and use y = "..count.." instead of "..density.."
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-histogram-plot-3.png" width="768" /></p>
<pre class="r"><code># color and fill by x variables
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       color = ".x.", fill = ".x.",     # color and fill by x variables
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-histogram-plot-4.png" width="768" /></p>
<pre class="r"><code># Facet by "dataset"
gghistogram(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       y = "..count..",
       color = ".x.", fill = ".x.", 
       facet.by = "dataset",            # Split by "dataset" into multi-panel
       merge = TRUE,                    # Merge the 3 plots
       xlab = "Expression", 
       add = "median",                  # Add median line. 
       rug = TRUE ,                     # Add marginal rug
       palette = "jco"                  # Change color palette
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-histogram-plot-5.png" width="768" /></p>
</div>
<div id="empirical-cumulative-density-function" class="section level2">
<h2>Empirical cumulative distribution function</h2>
<pre class="r"><code># Basic ECDF plot 
ggecdf(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       combine = TRUE,                 
       xlab = "Expression", ylab = "F(expression)"
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-ecdf-plot-1.png" width="768" /></p>
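<p>As a reminder of what the y axis of such a plot represents: for a value <em>x</em>, the empirical cumulative distribution function gives the proportion of observations less than or equal to <em>x</em>. A minimal base R sketch with a few made-up values:</p>
<pre class="r"><code># Hypothetical expression values, for illustration only
x <- c(2.1, 0.4, 3.3, 1.8, 2.1)
# ecdf() returns a step function F
Fn <- ecdf(x)
# F(2.1): proportion of values less than or equal to 2.1
f_at <- Fn(2.1)
# The same quantity computed by hand
f_manual <- mean(x <= 2.1)
c(f_at, f_manual)</code></pre>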
<pre class="r"><code># Change color  by dataset
ggecdf(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       combine = TRUE,                 
       xlab = "Expression", ylab = "F(expression)",
       color = "dataset", palette = "jco"
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-ecdf-plot-2.png" width="768" /></p>
<pre class="r"><code># Merge the 3 plots and color by x variables
ggecdf(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       merge = TRUE,                 
       xlab = "Expression", ylab = "F(expression)",
       color = ".x.", palette = "jco"
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-ecdf-plot-3.png" width="768" /></p>
<pre class="r"><code># Merge the 3 plots and color by x variables
# facet by "dataset" into multi-panel
ggecdf(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       merge = TRUE,                 
       xlab = "Expression", ylab = "F(expression)",
       color = ".x.", palette = "jco",
       facet.by = "dataset"
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-ecdf-plot-4.png" width="768" /></p>
</div>
<div id="quantile---quantile-plot" class="section level2">
<h2>Quantile - Quantile plot</h2>
<pre class="r"><code># Basic QQ plot 
ggqqplot(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       combine = TRUE, size = 0.5
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-qq-plot-1.png" width="768" /></p>
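<p>A quantile-quantile plot compares the sorted sample values with the quantiles expected under a reference distribution; points falling close to a straight line suggest that the sample follows that distribution. The sketch below assumes the conventional normal reference and uses the base R function <strong>qqnorm</strong>() on simulated values to show the two sets of coordinates being plotted:</p>
<pre class="r"><code># Simulated values (hypothetical), drawn from a normal distribution
set.seed(42)
x <- rnorm(100, mean = 2, sd = 1.5)
# Compute the QQ coordinates without drawing the plot
qq <- qqnorm(x, plot.it = FALSE)
# qq$x: theoretical normal quantiles; qq$y: sample values
head(data.frame(theoretical = qq$x, sample = qq$y))
# For normally distributed data the two are almost perfectly correlated
qq_cor <- cor(qq$x, qq$y)
qq_cor</code></pre>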
<pre class="r"><code># Change color  by dataset
ggqqplot(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       combine = TRUE, color = "dataset", palette = "jco",
       size = 0.5
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-qq-plot-2.png" width="768" /></p>
<pre class="r"><code># Merge the 3 plots and color by x variables
ggqqplot(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       merge = TRUE,  
       color = ".x.", palette = "jco"
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-qq-plot-3.png" width="768" /></p>
<pre class="r"><code># Merge the 3 plots and color by x variables
# facet by "dataset" into multi-panel
ggqqplot(expr,
       x = c("GATA3", "PTEN",  "XBP1"),
       merge = TRUE, size = 0.5,
       color = ".x.", palette = "jco",
       facet.by = "dataset"
)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/005-exploratory-data-visualization-qq-plot-4.png" width="768" /></p>
</div>
</div>
</div><!--end rdoc-->
 

<!-- END HTML -->]]></description>
			<pubDate>Thu, 31 Aug 2017 23:24:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Add P-values and Significance Levels to ggplots]]></title>
			<link>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/76-add-p-values-and-significance-levels-to-ggplots/</link>
			<guid>https://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/76-add-p-values-and-significance-levels-to-ggplots/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">
<p>In this article, we’ll describe how to easily i) <strong>compare means</strong> of two or more groups and ii) automatically add <strong>p-values</strong> and <strong>significance levels</strong> to a ggplot (such as box plots, dot plots, bar plots and line plots).</p>
<p><img src="https://www.sthda.com/english/sthda-upload/images/ggpubr/add-pvalues-to-ggplots.png" alt="Add p-values to ggplots" /></p>
<p><strong>Contents</strong></p>
<div id="TOC">
<ul>
<li><a href="#prerequisites">Prerequisites</a><ul>
<li><a href="#install-and-load-required-r-packages">Install and load required R packages</a></li>
<li><a href="#demo-data-sets">Demo data sets</a></li>
</ul></li>
<li><a href="#methods-for-comparing-means">Methods for comparing means</a></li>
<li><a href="#r-functions-to-add-p-values">R functions to add p-values</a><ul>
<li><a href="#compare_means">compare_means()</a></li>
<li><a href="#stat_compare_means">stat_compare_means()</a></li>
</ul></li>
<li><a href="#compare-two-independent-groups">Compare two independent groups</a></li>
<li><a href="#compare-two-paired-samples">Compare two paired samples</a></li>
<li><a href="#compare-more-than-two-groups">Compare more than two groups</a></li>
<li><a href="#multiple-grouping-variables">Multiple grouping variables</a></li>
<li><a href="#other-plot-types">Other plot types</a></li>
</ul>
</div>
<div id="prerequisites" class="section level2">
<h2>Prerequisites</h2>
<div id="install-and-load-required-r-packages" class="section level3">
<h3>Install and load required R packages</h3>
<p>Required R package: <a href="https://www.sthda.com/english/rpkgs/ggpubr">ggpubr (version >= 0.1.3)</a>, for ggplot2-based publication ready plots.</p>
<ul>
<li>Install from <a href="https://cran.r-project.org/package=ggpubr">CRAN</a> as follows:</li>
</ul>
<pre class="r"><code>install.packages("ggpubr")</code></pre>
<ul>
<li>Or, install the latest development version from <a href="https://github.com/kassambara/ggpubr">GitHub</a> as follows:</li>
</ul>
<pre class="r"><code>if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")</code></pre>
<ul>
<li>Load ggpubr:</li>
</ul>
<pre class="r"><code>library(ggpubr)</code></pre>
<div class="success">
<p>
Official documentation of ggpubr is available at: <a href="https://www.sthda.com/english/rpkgs/ggpubr" class="uri">https://www.sthda.com/english/rpkgs/ggpubr</a>
</p>
</div>
</div>
<div id="demo-data-sets" class="section level3">
<h3>Demo data sets</h3>
<p>Data: the <a href="https://www.sthda.com/english/wiki/r-built-in-data-sets#toothgrowth">ToothGrowth</a> data set.</p>
<pre class="r"><code>data("ToothGrowth")
head(ToothGrowth)</code></pre>
<pre><code>##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5</code></pre>
</div>
</div>
<div id="methods-for-comparing-means" class="section level2">
<h2>Methods for comparing means</h2>
<p>The standard methods to compare the means of two or more groups in R have been described in detail at: <a href="https://www.sthda.com/english/wiki/comparing-means-in-r">comparing means in R</a>.</p>
<p>The most common methods for comparing means include:</p>
<table>
<thead>
<tr class="header">
<th>Methods</th>
<th>R function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>T-test</td>
<td>t.test()</td>
<td>Compare two groups (parametric)</td>
</tr>
<tr class="even">
<td>Wilcoxon test</td>
<td>wilcox.test()</td>
<td>Compare two groups (non-parametric)</td>
</tr>
<tr class="odd">
<td>ANOVA</td>
<td>aov() or anova()</td>
<td>Compare multiple groups (parametric)</td>
</tr>
<tr class="even">
<td>Kruskal-Wallis</td>
<td>kruskal.test()</td>
<td>Compare multiple groups (non-parametric)</td>
</tr>
</tbody>
</table>
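<p>As a quick illustration of the functions in the table, the sketch below runs each of them with base R on the ToothGrowth data. The choice of comparisons (<em>len</em> by <em>supp</em> for two groups, <em>len</em> by <em>dose</em> for three groups) is ours, made for illustration only.</p>
<pre class="r"><code>data("ToothGrowth")
# Two groups (supp: OJ vs VC)
tt <- t.test(len ~ supp, data = ToothGrowth)
# exact = FALSE requests the normal approximation, since the data contain ties
wt <- wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE)
# More than two groups (dose: 0.5, 1, 2), with dose treated as a factor
av <- aov(len ~ factor(dose), data = ToothGrowth)
kt <- kruskal.test(len ~ factor(dose), data = ToothGrowth)
# Collect the four p-values
pvals <- c(t.test = tt$p.value,
           wilcox.test = wt$p.value,
           anova = summary(av)[[1]][["Pr(>F)"]][1],
           kruskal.test = kt$p.value)
pvals</code></pre>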
<p>Practical guides to computing and interpreting the results of each of these methods are provided at the following links:</p>
<div class="block">
<ul>
<li>
Comparing one-sample mean to a standard known mean:
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/one-sample-t-test-in-r">One-Sample T-test (parametric)</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/one-sample-wilcoxon-signed-rank-test-in-r">One-Sample Wilcoxon Test (non-parametric)</a>
</li>
</ul>
</li>
<li>
Comparing the means of two independent groups:
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/unpaired-two-samples-t-test-in-r">Unpaired Two Samples T-test (parametric)</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/unpaired-two-samples-wilcoxon-test-in-r">Unpaired Two-Samples Wilcoxon Test (non-parametric)</a>
</li>
</ul>
</li>
<li>
Comparing the means of paired samples:
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/paired-samples-t-test-in-r">Paired Samples T-test (parametric)</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/paired-samples-wilcoxon-test-in-r">Paired Samples Wilcoxon Test (non-parametric)</a>
</li>
</ul>
</li>
<li>
Comparing the means of more than two groups
<ul>
<li>
Analysis of variance (ANOVA, parametric):
<ul>
<li>
<a href="https://www.sthda.com/english/wiki/one-way-anova-test-in-r">One-Way ANOVA Test in R</a>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/two-way-anova-test-in-r">Two-Way ANOVA Test in R</a>
</li>
</ul>
</li>
<li>
<a href="https://www.sthda.com/english/wiki/kruskal-wallis-test-in-r">Kruskal-Wallis Test in R (non parametric alternative to one-way ANOVA)</a>
</li>
</ul>
</li>
</ul>
</div>
</div>
<div id="r-functions-to-add-p-values" class="section level2">
<h2>R functions to add p-values</h2>
<p>Here we present two new R functions in the <strong>ggpubr</strong> package:</p>
<ul>
<li><strong>compare_means</strong>(): easy to use solution to perform one or multiple mean comparisons.</li>
<li><strong>stat_compare_means</strong>(): easy to use solution to automatically add p-values and significance levels to a ggplot.</li>
</ul>
<div id="compare_means" class="section level3">
<h3>compare_means()</h3>
<p>As we’ll show in the next sections, it has multiple useful options compared to the standard R functions.</p>
<p>The simplified format is as follows:</p>
<pre class="r"><code>compare_means(formula, data, method = "wilcox.test", paired = FALSE,
  group.by = NULL, ref.group = NULL, ...)</code></pre>
<div class="block">
<ul>
<li>
<strong>formula</strong>: a formula of the form <em>x ~ group</em>, where x is a numeric variable and group is a factor with one or multiple levels. For example, <em>formula = TP53 ~ cancer_group</em>. It’s also possible to perform the test for multiple response variables at the same time. For example, <em>formula = c(TP53, PTEN) ~ cancer_group</em>.
</li>
<li>
<p>
<strong>data</strong>: a data.frame containing the variables in the formula.
</p>
</li>
<li>
<strong>method</strong>: the type of test. Default is <em>“wilcox.test”</em>. Allowed values include:
<ul>
<li>
<em>“t.test”</em> (parametric) and <em>“wilcox.test”</em> (non-parametric). Compare two groups of samples. If the grouping variable contains more than two levels, then pairwise comparisons are performed.
</li>
<li>
<em>“anova”</em> (parametric) and <em>“kruskal.test”</em> (non-parametric). Perform one-way ANOVA test comparing multiple groups.
</li>
</ul>
</li>
<li>
<p>
<strong>paired</strong>: a logical indicating whether you want a paired test. Used only in <em>t.test</em> and in <em>wilcox.test</em>.
</p>
</li>
<li>
<p>
<strong>group.by</strong>: variables used to group the data set before applying the test. When specified the mean comparisons will be performed in each subset of the data formed by the different levels of the group.by variables.
</p>
</li>
<li>
<p>
<strong>ref.group</strong>: a character string specifying the reference group. If specified, for a given grouping variable, each of the group levels will be compared to the reference group (i.e. control group). ref.group can be also <em>“.all.”</em>. In this case, each of the grouping variable levels is compared to all (i.e. base-mean).
</p>
</li>
</ul>
</div>
</div>
<div id="stat_compare_means" class="section level3">
<h3>stat_compare_means()</h3>
<p>This function extends ggplot2 for adding mean comparison p-values to a ggplot, such as box plots, dot plots, bar plots and line plots.</p>
<p>The simplified format is as follows:</p>
<pre class="r"><code>stat_compare_means(mapping = NULL, comparisons = NULL, hide.ns = FALSE,
                   label = NULL,  label.x = NULL, label.y = NULL,  ...)</code></pre>
<div class="block">
<ul>
<li>
<p>
<strong>mapping</strong>: Set of aesthetic mappings created by aes().
</p>
</li>
<li>
<p>
<strong>comparisons</strong>: A list of length-2 vectors. The entries in the vector are either the names of 2 values on the x-axis or the 2 integers that correspond to the index of the groups of interest, to be compared.
</p>
</li>
<li>
<p>
<strong>hide.ns</strong>: logical value. If TRUE, hide ns symbol when displaying significance levels.
</p>
</li>
<li>
<p>
<strong>label</strong>: character string specifying label type. Allowed values include “p.signif” (shows the significance levels), “p.format” (shows the formatted p value).
</p>
</li>
<li>
<p>
<strong>label.x, label.y</strong>: numeric values. Coordinates (in data units) used for absolute positioning of the label. If too short, they will be recycled.
</p>
</li>
<li>
<p>
<strong>…</strong>: other arguments passed to the function <strong>compare_means</strong>() such as <em>method</em>, <em>paired</em>, <em>ref.group</em>.
</p>
</li>
</ul>
</div>
</div>
</div>
<div id="compare-two-independent-groups" class="section level2">
<h2>Compare two independent groups</h2>
<p>Perform the test:</p>
<pre class="r"><code>compare_means(len ~ supp, data = ToothGrowth)</code></pre>
<pre><code>## # A tibble: 1 x 8
##     .y. group1 group2      p  p.adj p.format p.signif   method
##   <chr>  <chr>  <chr>  <dbl>  <dbl>    <chr>    <chr>    <chr>
## 1   len     OJ     VC 0.0645 0.0645    0.064       ns Wilcoxon</code></pre>
<div class="warning">
<p>
By default <strong>method = “wilcox.test”</strong> (non-parametric test). You can also specify <strong>method = “t.test”</strong> for a parametric t-test.
</p>
</div>
<p>The returned value is a data frame with the following columns:</p>
<ul>
<li>.y.: the y variable used in the test.</li>
<li>group1, group2: the two groups being compared.</li>
<li>p: the p-value.</li>
<li>p.adj: the adjusted p-value. The default adjustment is p.adjust.method = “holm”.</li>
<li>p.format: the formatted p-value.</li>
<li>p.signif: the significance level.</li>
<li>method: the statistical test used to compare groups.</li>
</ul>
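<p>The <em>p</em> and <em>p.signif</em> columns can be cross-checked with base R. The sketch below is only an approximation of what <strong>compare_means</strong>() does: it calls <em>wilcox.test()</em> directly and maps the p-value to a symbol using cutpoints (0.0001, 0.001, 0.01, 0.05) that we assume match the ggpubr defaults. The helper name <em>p_to_signif</em> is hypothetical and not part of ggpubr.</p>
<pre class="r"><code># Wilcoxon rank-sum test on the same data (base R only).
# exact = FALSE: the data contain ties, so an exact p-value is not available.
res <- wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE)
res$p.value

# Hypothetical helper: map p-values to significance symbols.
# The cutpoints are assumed to match the ggpubr defaults.
p_to_signif <- function(p) {
  as.character(cut(p,
    breaks = c(0, 1e-04, 0.001, 0.01, 0.05, Inf),
    labels = c("****", "***", "**", "*", "ns"),
    include.lowest = TRUE))
}
p_to_signif(res$p.value)</code></pre>
<p>With the ToothGrowth data, this should give a p-value close to 0.0645 and the symbol “ns”, in line with the table above.</p>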
<p>Create a box plot with p-values:</p>
<pre class="r"><code>p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
          color = "supp", palette = "jco",
          add = "jitter")
#  Add p-value
p + stat_compare_means()
# Change method
p + stat_compare_means(method = "t.test")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-compare-means-two-independent-groups-1.png" width="288" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-compare-means-two-independent-groups-2.png" width="288" /></p>
<p>Note that the p-value label position can be adjusted using the arguments <em>label.x, label.y, hjust and vjust</em>.</p>
<p>The default p-value label is obtained by concatenating the <strong>method</strong> and the <strong>p</strong> columns of the data frame returned by the function <strong>compare_means</strong>(). You can specify other combinations using the <strong>aes</strong>() function.</p>
<p>For example,</p>
<ul>
<li><strong>aes(label = ..p.format..)</strong> or <strong>aes(label = paste0(“p =”, ..p.format..))</strong>: display only the formatted p-value (without the method name)</li>
<li><strong>aes(label = ..p.signif..)</strong>: display only the significance level.</li>
<li><strong>aes(label = paste0(..method.., “\n”, “p =”, ..p.format..))</strong>: Use line break (“\n”) between the method name and the p-value.</li>
</ul>
<p>As an illustration, type this:</p>
<pre class="r"><code>p + stat_compare_means( aes(label = ..p.signif..), 
                        label.x = 1.5, label.y = 40)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-compare-means-two-independent-groups-significance-level-1.png" width="355.2" /></p>
<p>If you prefer, it’s also possible to specify the argument <em>label</em> as a character string:</p>
<pre class="r"><code>p + stat_compare_means( label = "p.signif", label.x = 1.5, label.y = 40)</code></pre>
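<p>In recent ggplot2 releases (3.4.0 and later), the <em>..name..</em> notation is deprecated in favour of <em>after_stat()</em>. Assuming such a version is installed, the same label can be written as in the sketch below; the box plot is rebuilt here only so that the example is self-contained.</p>
<pre class="r"><code>library(ggpubr)
p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
               color = "supp", palette = "jco", add = "jitter")
# after_stat() refers to a variable computed by the stat (here p.signif)
p + stat_compare_means(aes(label = after_stat(p.signif)),
                       label.x = 1.5, label.y = 40)</code></pre>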
</div>
<div id="compare-two-paired-samples" class="section level2">
<h2>Compare two paired samples</h2>
<p>Perform the test:</p>
<pre class="r"><code>compare_means(len ~ supp, data = ToothGrowth, paired = TRUE)</code></pre>
<pre><code>## # A tibble: 1 x 8
##     .y. group1 group2       p   p.adj p.format p.signif   method
##   <chr>  <chr>  <chr>   <dbl>   <dbl>    <chr>    <chr>    <chr>
## 1   len     OJ     VC 0.00431 0.00431   0.0043       ** Wilcoxon</code></pre>
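<p>A paired test assumes that the observations are matched. In the sketch below we assume that the i-th OJ value is paired with the i-th VC value, in the order in which they appear in the data. This is a base R illustration of that assumption, not the ggpubr internals.</p>
<pre class="r"><code># Split the response by supplement; pairing is by position within each vector
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
stopifnot(length(oj) == length(vc))  # paired tests need equal-length groups

# Paired Wilcoxon signed-rank test; exact = FALSE because of ties
paired_res <- wilcox.test(oj, vc, paired = TRUE, exact = FALSE)
paired_res$p.value</code></pre>
<p>If your data are not ordered this way, sort them by the pairing identifier before running a paired test.</p>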
<p>Visualize paired data using the <strong>ggpaired</strong>() function:</p>
<pre class="r"><code>ggpaired(ToothGrowth, x = "supp", y = "len",
         color = "supp", line.color = "gray", line.size = 0.4,
         palette = "jco")+
  stat_compare_means(paired = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-compare-means-paired-tests-1.png" width="355.2" /></p>
</div>
<div id="compare-more-than-two-groups" class="section level2">
<h2>Compare more than two groups</h2>
<ul>
<li>Global test:</li>
</ul>
<pre class="r"><code># Global test
compare_means(len ~ dose,  data = ToothGrowth, method = "anova")</code></pre>
<pre><code>## # A tibble: 1 x 6
##     .y.        p    p.adj p.format p.signif method
##   <chr>    <dbl>    <dbl>    <chr>    <chr>  <chr>
## 1   len 9.53e-16 9.53e-16  9.5e-16     ****  Anova</code></pre>
<p>Plot with global p-value:</p>
<pre class="r"><code># Default method = "kruskal.test" for multiple groups
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+
  stat_compare_means()
# Change method to anova
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+
  stat_compare_means(method = "anova")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-multiple-independent-groups-1.png" width="288" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-multiple-independent-groups-2.png" width="288" /></p>
<ul>
<li><strong>Pairwise comparisons</strong>. If the grouping variable contains more than two levels, then pairwise tests will be performed automatically. The default method is “wilcox.test”. You can change this to “t.test”.</li>
</ul>
<pre class="r"><code># Perform pairwise comparisons
compare_means(len ~ dose,  data = ToothGrowth)</code></pre>
<pre><code>## # A tibble: 3 x 8
##     .y. group1 group2        p    p.adj p.format p.signif   method
##   <chr>  <chr>  <chr>    <dbl>    <dbl>    <chr>    <chr>    <chr>
## 1   len    0.5      1 7.02e-06 1.40e-05  7.0e-06     **** Wilcoxon
## 2   len    0.5      2 8.41e-08 2.52e-07  8.4e-08     **** Wilcoxon
## 3   len      1      2 1.77e-04 1.77e-04  0.00018      *** Wilcoxon</code></pre>
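<p>The <em>p.adj</em> column above is consistent with Holm’s correction applied across the three comparisons. As a quick check, the sketch below takes the rounded p-values printed in the table as input, so the last digits may differ slightly from a full-precision computation.</p>
<pre class="r"><code># Raw p-values copied (rounded) from the table above
p_raw <- c("0.5 vs 1" = 7.02e-06, "0.5 vs 2" = 8.41e-08, "1 vs 2" = 1.77e-04)
# Holm: the smallest p-value is multiplied by 3, the next by 2, the largest by 1
p_holm <- p.adjust(p_raw, method = "holm")
signif(p_holm, 3)</code></pre>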
<pre class="r"><code># Visualize: Specify the comparisons you want
my_comparisons <- list( c("0.5", "1"), c("1", "2"), c("0.5", "2") )
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+ 
  stat_compare_means(comparisons = my_comparisons)+ # Add pairwise comparisons p-value
  stat_compare_means(label.y = 50)     # Add global p-value</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-pairwise-comparisons-1.png" width="480" /></p>
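<p>When there are many groups, typing every pair by hand is tedious. One possible way, shown here as a sketch, is to build the list with <em>combn()</em> from the levels of the grouping variable. The names <em>dose_levels</em> and <em>all_pairs</em> are hypothetical.</p>
<pre class="r"><code># All pairs of dose levels, as a list of length-2 character vectors
dose_levels <- levels(factor(ToothGrowth$dose))
all_pairs <- combn(dose_levels, 2, simplify = FALSE)
all_pairs</code></pre>
<p>The resulting list has the same structure as <em>my_comparisons</em> and can be passed to the <em>comparisons</em> argument in the same way.</p>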
<p>If you want to specify the precise y location of bars, use the argument <strong>label.y</strong>:</p>
<pre class="r"><code>ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+ 
  stat_compare_means(comparisons = my_comparisons, label.y = c(29, 35, 40))+
  stat_compare_means(label.y = 45)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-pairwise-comparisons-bar-location-1.png" width="480" /></p>
<p>(Adding bars that connect the compared groups has been facilitated by the <a href="https://github.com/Artjom-Metro/ggsignif">ggsignif</a> R package.)</p>
<ul>
<li><strong>Multiple pairwise tests against a reference group</strong>:</li>
</ul>
<pre class="r"><code># Pairwise comparison against reference
compare_means(len ~ dose,  data = ToothGrowth, ref.group = "0.5",
              method = "t.test")</code></pre>
<pre><code>## # A tibble: 2 x 8
##     .y. group1 group2        p    p.adj p.format p.signif method
##   <chr>  <chr>  <chr>    <dbl>    <dbl>    <chr>    <chr>  <chr>
## 1   len    0.5      1 6.70e-09 6.70e-09  6.7e-09     **** T-test
## 2   len    0.5      2 1.47e-16 2.94e-16  < 2e-16     **** T-test</code></pre>
<pre class="r"><code># Visualize
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+
  stat_compare_means(method = "anova", label.y = 40)+      # Add global p-value
  stat_compare_means(label = "p.signif", method = "t.test",
                     ref.group = "0.5")                    # Pairwise comparison against reference</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-reference-group-1.png" width="480" /></p>
<ul>
<li><strong>Multiple pairwise tests against all (base-mean)</strong>:</li>
</ul>
<pre class="r"><code># Comparison of each group against base-mean
compare_means(len ~ dose,  data = ToothGrowth, ref.group = ".all.",
              method = "t.test")</code></pre>
<pre><code>## # A tibble: 3 x 8
##     .y. group1 group2        p    p.adj p.format p.signif method
##   <chr>  <chr>  <chr>    <dbl>    <dbl>    <chr>    <chr>  <chr>
## 1   len  .all.    0.5 1.24e-06 3.73e-06  1.2e-06     **** T-test
## 2   len  .all.      1 5.67e-01 5.67e-01     0.57       ns T-test
## 3   len  .all.      2 1.37e-05 2.74e-05  1.4e-05     **** T-test</code></pre>
<pre class="r"><code># Visualize
ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "dose", palette = "jco")+
  stat_compare_means(method = "anova", label.y = 40)+      # Add global p-value
  stat_compare_means(label = "p.signif", method = "t.test",
                     ref.group = ".all.")                  # Pairwise comparison against all</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-comparison-against-base-mean-1.png" width="480" /></p>
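<p>To make the idea of “base-mean” concrete, the sketch below assumes that the reference <em>.all.</em> is the full set of observations, and compares each dose level against it with a Welch t-test in base R. This is an interpretation for illustration, not the ggpubr source code, so small differences from the table above are possible. The name <em>p_vs_all</em> is hypothetical.</p>
<pre class="r"><code># Compare each dose level against all observations pooled together
p_vs_all <- sapply(split(ToothGrowth$len, ToothGrowth$dose), function(x) {
  t.test(x, ToothGrowth$len)$p.value
})
p_vs_all
# Holm adjustment across the three comparisons
p.adjust(p_vs_all, method = "holm")</code></pre>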
<p>A typical situation, where pairwise comparisons against “all” can be useful, is illustrated here using the <em>myeloma</em> data set available on Github.</p>
<p>We’ll plot the expression profile of the DEPDC1 gene according to the patients’ molecular groups. We want to know if there is any difference between groups. If so, where is the difference?</p>
<p>To answer this question, you can perform pairwise comparisons between all 7 groups. This leads to 21 comparisons, one for each possible pair of groups. With this many groups, the results can be difficult to interpret.</p>
<p>A simpler alternative is to compare each of the seven groups against “all” (i.e. base-mean). When the test is significant for a given group, you can conclude that DEPDC1 is significantly overexpressed or downexpressed in that group compared to all.</p>
<pre class="r"><code># Load myeloma data from GitHub
myeloma <- read.delim("https://raw.githubusercontent.com/kassambara/data/master/myeloma.txt")
# Perform the test
compare_means(DEPDC1 ~ molecular_group,  data = myeloma,
              ref.group = ".all.", method = "t.test")</code></pre>
<pre><code>## # A tibble: 7 x 8
##      .y. group1           group2        p   p.adj p.format p.signif method
##    <chr>  <chr>            <chr>    <dbl>   <dbl>    <chr>    <chr>  <chr>
## 1 DEPDC1  .all.       Cyclin D-1 0.149690 0.44907  0.14969       ns T-test
## 2 DEPDC1  .all.       Cyclin D-2 0.523143 1.00000  0.52314       ns T-test
## 3 DEPDC1  .all.     Hyperdiploid 0.000282 0.00169  0.00028      *** T-test
## 4 DEPDC1  .all. Low bone disease 0.005084 0.02542  0.00508       ** T-test
## 5 DEPDC1  .all.              MAF 0.086107 0.34443  0.08611       ns T-test
## 6 DEPDC1  .all.            MMSET 0.576291 1.00000  0.57629       ns T-test
## # ... with 1 more rows</code></pre>
<pre class="r"><code># Visualize the expression profile
ggboxplot(myeloma, x = "molecular_group", y = "DEPDC1", color = "molecular_group", 
          add = "jitter", legend = "none") +
  rotate_x_text(angle = 45)+
  geom_hline(yintercept = mean(myeloma$DEPDC1), linetype = 2)+ # Add horizontal line at base mean
  stat_compare_means(method = "anova", label.y = 1600)+        # Add global anova p-value
  stat_compare_means(label = "p.signif", method = "t.test",
                     ref.group = ".all.")                      # Pairwise comparison against all</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-comparison-against-base-mean2-1.png" width="672" /></p>
<div class="success">
<p>
From the plot above, we can conclude that DEPDC1 is significantly overexpressed in the proliferation group and significantly downexpressed in the Hyperdiploid and Low bone disease groups compared to all.
</p>
</div>
<div class="warning">
<p>
Note that, if you want to hide the ns symbol, specify the argument <em>hide.ns = TRUE</em>.
</p>
</div>
<pre class="r"><code># Visualize the expression profile
ggboxplot(myeloma, x = "molecular_group", y = "DEPDC1", color = "molecular_group", 
          add = "jitter", legend = "none") +
  rotate_x_text(angle = 45)+
  geom_hline(yintercept = mean(myeloma$DEPDC1), linetype = 2)+ # Add horizontal line at base mean
  stat_compare_means(method = "anova", label.y = 1600)+        # Add global anova p-value
  stat_compare_means(label = "p.signif", method = "t.test",
                     ref.group = ".all.", hide.ns = TRUE)      # Pairwise comparison against all</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-comparison-against-base-mean-hide-ns-1.png" width="672" /></p>
</div>
<div id="multiple-grouping-variables" class="section level2">
<h2>Multiple grouping variables</h2>
<ul>
<li><strong>Two independent sample comparisons after grouping the data by another variable</strong>:</li>
</ul>
<p>Perform the test:</p>
<pre class="r"><code>compare_means(len ~ supp, data = ToothGrowth, 
              group.by = "dose")</code></pre>
<pre><code>## # A tibble: 3 x 9
##    dose   .y. group1 group2       p  p.adj p.format p.signif   method
##   <dbl> <chr>  <chr>  <chr>   <dbl>  <dbl>    <chr>    <chr>    <chr>
## 1   0.5   len     OJ     VC 0.02319 0.0464    0.023        * Wilcoxon
## 2   1.0   len     OJ     VC 0.00403 0.0121    0.004       ** Wilcoxon
## 3   2.0   len     OJ     VC 1.00000 1.0000    1.000       ns Wilcoxon</code></pre>
<div class="notice">
<p>
In the example above, for each level of the variable “dose”, we compare the means of the variable “len” in the different groups formed by the grouping variable “supp”.
</p>
</div>
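<p>Conceptually, <em>group.by</em> amounts to splitting the data and running the test in each subset. A rough base R equivalent is offered below as a sketch rather than the actual implementation. The name <em>p_by_dose</em> is hypothetical.</p>
<pre class="r"><code># One Wilcoxon test (len ~ supp) per dose level
p_by_dose <- sapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  # suppressWarnings(): ties trigger a warning about exact p-values
  suppressWarnings(wilcox.test(len ~ supp, data = d)$p.value)
})
p_by_dose
# Holm adjustment across the dose levels
p.adjust(p_by_dose, method = "holm")</code></pre>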
<p>Visualize (1/2). Create multi-panel box plots faceted by group (here, “dose”):</p>
<pre class="r"><code># Box plot facetted by "dose"
p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
          color = "supp", palette = "jco",
          add = "jitter",
          facet.by = "dose", short.panel.labs = FALSE)
# Use only p.format as label. Remove method name.
p + stat_compare_means(label = "p.format")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-facet-1.png" width="672" /></p>
<pre class="r"><code># Or use significance symbol as label
p + stat_compare_means(label =  "p.signif", label.x = 1.5)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-facet-2.png" width="672" /></p>
<div class="warning">
<p>
To hide the ‘ns’ symbol, use the argument <strong>hide.ns = TRUE</strong>.
</p>
</div>
<p>Visualize (2/2). Create one single panel with all box plots. Plot y = “len” by x = “dose” and color by “supp”:</p>
<pre class="r"><code>p <- ggboxplot(ToothGrowth, x = "dose", y = "len",
          color = "supp", palette = "jco",
          add = "jitter")
p + stat_compare_means(aes(group = supp))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-compare-means-interaction-1.png" width="672" /></p>
<pre class="r"><code># Show only p-value
p + stat_compare_means(aes(group = supp), label = "p.format")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-compare-means-interaction-2.png" width="672" /></p>
<pre class="r"><code># Use significance symbol as label
p + stat_compare_means(aes(group = supp), label = "p.signif")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-compare-means-interaction-3.png" width="672" /></p>
<ul>
<li><strong>Paired sample comparisons after grouping the data by another variable</strong>:</li>
</ul>
<p>Perform the test:</p>
<pre class="r"><code>compare_means(len ~ supp, data = ToothGrowth, 
              group.by = "dose", paired = TRUE)</code></pre>
<pre><code>## # A tibble: 3 x 9
##    dose   .y. group1 group2      p  p.adj p.format p.signif   method
##   <dbl> <chr>  <chr>  <chr>  <dbl>  <dbl>    <chr>    <chr>    <chr>
## 1   0.5   len     OJ     VC 0.0330 0.0659    0.033        * Wilcoxon
## 2   1.0   len     OJ     VC 0.0191 0.0572    0.019        * Wilcoxon
## 3   2.0   len     OJ     VC 1.0000 1.0000    1.000       ns Wilcoxon</code></pre>
<p>Visualize. Create multi-panel box plots faceted by group (here, “dose”):</p>
<pre class="r"><code># Box plot facetted by "dose"
p <- ggpaired(ToothGrowth, x = "supp", y = "len",
          color = "supp", palette = "jco", 
          line.color = "gray", line.size = 0.4,
          facet.by = "dose", short.panel.labs = FALSE)
# Use only p.format as label. Remove method name.
p + stat_compare_means(label = "p.format", paired = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-facet-paired-1.png" width="672" /></p>
</div>
<div id="other-plot-types" class="section level2">
<h2>Other plot types</h2>
<ul>
<li><strong>Bar and line plots</strong> (one grouping variable):</li>
</ul>
<pre class="r"><code># Bar plot of mean +/-se
ggbarplot(ToothGrowth, x = "dose", y = "len", add = "mean_se")+
  stat_compare_means() +                                         # Global p-value
  stat_compare_means(ref.group = "0.5", label = "p.signif",
                     label.y = c(22, 29))                   # compare to ref.group
# Line plot of mean +/-se
ggline(ToothGrowth, x = "dose", y = "len", add = "mean_se")+
  stat_compare_means() +                                         # Global p-value
  stat_compare_means(ref.group = "0.5", label = "p.signif",
                     label.y = c(22, 29))                   # compare to ref.group</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-bar-line-plot-p-value-one-grouping-var-1.png" width="288" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-bar-line-plot-p-value-one-grouping-var-2.png" width="288" /></p>
<ul>
<li><strong>Bar and line plots</strong> (two grouping variables):</li>
</ul>
<pre class="r"><code>ggbarplot(ToothGrowth, x = "dose", y = "len", add = "mean_se",
          color = "supp", palette = "jco", 
          position = position_dodge(0.8))+
  stat_compare_means(aes(group = supp), label = "p.signif", label.y = 29)
ggline(ToothGrowth, x = "dose", y = "len", add = "mean_se",
          color = "supp", palette = "jco")+
  stat_compare_means(aes(group = supp), label = "p.signif", 
                     label.y = c(16, 25, 29))</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-bar-line-plot-p-value-two-grouping-var-1.png" width="355.2" /><img src="https://www.sthda.com/english/sthda-upload/figures/ggpubr/010-add-p-values-to-ggplots-bar-line-plot-p-value-two-grouping-var-2.png" width="355.2" /></p>
</div>
</div>
</div><!--end rdoc-->
<!-- END HTML -->]]></description>
			<pubDate>Thu, 31 Aug 2017 17:00:00 +0200</pubDate>
			
		</item>
		
	</channel>
</rss>
