ggplot2 barplots : Quick start guide - R software and data visualization


This R tutorial describes how to create a barplot using R software and the ggplot2 package.

The function geom_bar() is used to draw the bars.


Basic barplots

Data

The data used here are derived from the ToothGrowth data set, which describes the effect of vitamin C on tooth growth in guinea pigs.

df <- data.frame(dose=c("D0.5", "D1", "D2"),
                len=c(4.2, 10, 29.5))
head(df)
##   dose  len
## 1 D0.5  4.2
## 2   D1 10.0
## 3   D2 29.5
  • len : Tooth length
  • dose : Dose in milligrams (0.5, 1, 2)

Create barplots

library(ggplot2)
# Basic barplot
p<-ggplot(data=df, aes(x=dose, y=len)) +
  geom_bar(stat="identity")
p
   
# Horizontal bar plot
p + coord_flip()
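
As a side note, ggplot2 versions 2.2.0 and later (newer than the 1.0.0 release used for this tutorial) also provide geom_col(), which is a shorthand for geom_bar(stat="identity"). The sketch below assumes a recent ggplot2 is installed; the object name p_col is only illustrative :

```r
library(ggplot2)

# Same summary data as in the tutorial
df <- data.frame(dose = c("D0.5", "D1", "D2"),
                 len = c(4.2, 10, 29.5))

# geom_col() takes the bar heights directly from the y values,
# so no stat argument is needed
p_col <- ggplot(data = df, aes(x = dose, y = len)) +
  geom_col()
p_col
```

Both forms should draw the same bars; geom_bar(stat="identity") remains the form that also works with older ggplot2 releases.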


Change the width and the color of bars :

# Change the width of bars
ggplot(data=df, aes(x=dose, y=len)) +
  geom_bar(stat="identity", width=0.5)
# Change colors
ggplot(data=df, aes(x=dose, y=len)) +
  geom_bar(stat="identity", color="blue", fill="white")
# Minimal theme + blue fill color
p<-ggplot(data=df, aes(x=dose, y=len)) +
  geom_bar(stat="identity", fill="steelblue")+
  theme_minimal()
p


Choose which items to display :

p + scale_x_discrete(limits=c("D0.5", "D2"))
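
With scale_x_discrete(limits=...), the unlisted categories are dropped at plotting time, and ggplot2 typically warns that rows were removed. An alternative sketch, assuming you prefer to filter the data before plotting, is shown below; the names keep and df_sub are illustrative :

```r
library(ggplot2)

df <- data.frame(dose = c("D0.5", "D1", "D2"),
                 len = c(4.2, 10, 29.5))

# Keep only the doses to display (base R subsetting)
keep <- c("D0.5", "D2")
df_sub <- df[df$dose %in% keep, ]

# Plot the filtered data; only two bars are drawn
ggplot(data = df_sub, aes(x = dose, y = len)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  theme_minimal()
```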


Bar plot with labels

# Outside bars
ggplot(data=df, aes(x=dose, y=len)) +
  geom_bar(stat="identity", fill="steelblue")+
  geom_text(aes(label=len), vjust=-0.3, size=3.5)+
  theme_minimal()
# Inside bars
ggplot(data=df, aes(x=dose, y=len)) +
  geom_bar(stat="identity", fill="steelblue")+
  geom_text(aes(label=len), vjust=1.6, color="white", size=3.5)+
  theme_minimal()


Barplot of counts

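In the R code above, we used the argument stat = “identity” so that the bar heights are taken directly from the y values. Note that the default value of the argument stat for geom_bar() is “count” in ggplot2 2.0.0 and later (it was called “bin” in earlier versions, including the 1.0.0 release used for this tutorial). With the default, the height of each bar represents the number of cases in each category.

To make a barplot of counts, we will use the mtcars data set :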

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
# Don't map a variable to y
ggplot(mtcars, aes(x=factor(cyl)))+
  geom_bar(width=0.7, fill="steelblue")+
  theme_minimal()
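
The counts that geom_bar() computes can also be obtained beforehand with table() and then plotted with stat="identity". This is only a sketch to make the link between the two approaches explicit; the name cyl_counts is illustrative :

```r
library(ggplot2)

# Count the number of cars for each number of cylinders
cyl_counts <- as.data.frame(table(cyl = mtcars$cyl))
cyl_counts  # columns: cyl (factor) and Freq (count)

# Plot the pre-computed counts as bar heights
ggplot(cyl_counts, aes(x = cyl, y = Freq)) +
  geom_bar(stat = "identity", width = 0.7, fill = "steelblue") +
  theme_minimal()
```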


Change barplot colors by groups

Change outline colors

Barplot outline colors can be automatically controlled by the levels of the variable dose :

# Change barplot line colors by groups
p<-ggplot(df, aes(x=dose, y=len, color=dose)) +
  geom_bar(stat="identity", fill="white")
p


Barplot outline colors can also be changed manually using the functions :

  • scale_color_manual() : to use custom colors
  • scale_color_brewer() : to use color palettes from RColorBrewer package
  • scale_color_grey() : to use grey color palettes
# Use custom color palettes
p+scale_color_manual(values=c("#999999", "#E69F00", "#56B4E9"))
# Use brewer color palettes
p+scale_color_brewer(palette="Dark2")
# Use grey scale
p + scale_color_grey() + theme_classic()


Read more on ggplot2 colors here : ggplot2 colors

Change fill colors

In the R code below, barplot fill colors are automatically controlled by the levels of dose :

# Change barplot fill colors by groups
p<-ggplot(df, aes(x=dose, y=len, fill=dose)) +
  geom_bar(stat="identity")+theme_minimal()
p


Barplot fill colors can also be changed manually using the functions :

  • scale_fill_manual() : to use custom colors
  • scale_fill_brewer() : to use color palettes from RColorBrewer package
  • scale_fill_grey() : to use grey color palettes
# Use custom color palettes
p+scale_fill_manual(values=c("#999999", "#E69F00", "#56B4E9"))
# use brewer color palettes
p+scale_fill_brewer(palette="Dark2")
# Use grey scale
p + scale_fill_grey()
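
When colors are passed to scale_fill_manual() as an unnamed vector, they are assigned to the levels in order. A possible variant, sketched here, is to use a named vector so that each dose level is tied explicitly to one color; the names dose_colors and p_named are illustrative :

```r
library(ggplot2)

df <- data.frame(dose = c("D0.5", "D1", "D2"),
                 len = c(4.2, 10, 29.5))

# Named vector: each level of dose is matched to a color by name
dose_colors <- c("D0.5" = "#999999", "D1" = "#E69F00", "D2" = "#56B4E9")

p_named <- ggplot(df, aes(x = dose, y = len, fill = dose)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = dose_colors) +
  theme_minimal()
p_named
```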


Use black outline color :

ggplot(df, aes(x=dose, y=len, fill=dose))+
  geom_bar(stat="identity", color="black")+
  scale_fill_manual(values=c("#999999", "#E69F00", "#56B4E9"))+
  theme_minimal()


Change the legend position

# Change bar fill colors to blues
p <- p+scale_fill_brewer(palette="Blues")
p + theme(legend.position="top")
p + theme(legend.position="bottom")
# Remove legend
p + theme(legend.position="none")


The allowed values for the argument legend.position are : “left”, “top”, “right”, “bottom” and “none” (which removes the legend).

Read more on ggplot legend : ggplot2 legend

Change the order of items

The function scale_x_discrete() can be used to change the order of the items on the x axis to “D2”, “D0.5”, “D1” :

p + scale_x_discrete(limits=c("D2", "D0.5", "D1"))
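
Note that scale_x_discrete(limits=...) acts on the x axis only. An alternative sketch, assuming you want the same order on both the axis and the legend, is to set the factor levels in the data before plotting; the names df_ord and p_ord are illustrative :

```r
library(ggplot2)

df_ord <- data.frame(dose = c("D0.5", "D1", "D2"),
                     len = c(4.2, 10, 29.5))

# The order of the factor levels determines the order of the bars
# on the x axis and of the keys in the fill legend
df_ord$dose <- factor(df_ord$dose, levels = c("D2", "D0.5", "D1"))

p_ord <- ggplot(df_ord, aes(x = dose, y = len, fill = dose)) +
  geom_bar(stat = "identity") +
  theme_minimal()
p_ord
```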


Barplot with multiple groups

Data

Data derived from the ToothGrowth data set are used. ToothGrowth describes the effect of vitamin C on tooth growth in guinea pigs. Three dose levels of vitamin C (0.5, 1, and 2 mg) with each of two delivery methods [orange juice (OJ) or ascorbic acid (VC)] are used :

df2 <- data.frame(supp=rep(c("VC", "OJ"), each=3),
                dose=rep(c("D0.5", "D1", "D2"),2),
                len=c(6.8, 15, 33, 4.2, 10, 29.5))
head(df2)
##   supp dose  len
## 1   VC D0.5  6.8
## 2   VC   D1 15.0
## 3   VC   D2 33.0
## 4   OJ D0.5  4.2
## 5   OJ   D1 10.0
## 6   OJ   D2 29.5
  • len : Tooth length
  • dose : Dose in milligrams (0.5, 1, 2)
  • supp : Supplement type (VC or OJ)

Create barplots

A stacked barplot is created by default. You can use the function position_dodge() to place the bars side by side instead. The barplot fill color is controlled by the levels of supp :

# Stacked barplot with multiple groups
ggplot(data=df2, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity")
# Use position=position_dodge()
ggplot(data=df2, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity", position=position_dodge())


Change the color manually :

# Change the colors manually
p <- ggplot(data=df2, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity", color="black", position=position_dodge())+
  theme_minimal()
# Use custom colors
p + scale_fill_manual(values=c('#999999','#E69F00'))
# Use brewer color palettes
p + scale_fill_brewer(palette="Blues")


Add labels

Add labels to a dodged barplot :

ggplot(data=df2, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity", position=position_dodge())+
  geom_text(aes(label=len), vjust=1.6, color="white",
            position = position_dodge(0.9), size=3.5)+
  scale_fill_brewer(palette="Paired")+
  theme_minimal()
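
In the code above, position_dodge(0.9) in geom_text() matches the default bar width of 0.9, which keeps each label centered on its bar. If the bar width is changed, the dodge width of the labels has to follow. A minimal sketch, where bar_width, df_dodge and p_dodge are illustrative names and 0.6 is an arbitrary value :

```r
library(ggplot2)

df_dodge <- data.frame(supp = rep(c("VC", "OJ"), each = 3),
                       dose = rep(c("D0.5", "D1", "D2"), 2),
                       len = c(6.8, 15, 33, 4.2, 10, 29.5))

bar_width <- 0.6  # illustrative value, narrower than the default 0.9

# Use the same dodge width for the bars and for the labels
p_dodge <- ggplot(df_dodge, aes(x = dose, y = len, fill = supp)) +
  geom_bar(stat = "identity", width = bar_width,
           position = position_dodge(width = bar_width)) +
  geom_text(aes(label = len), vjust = 1.6, color = "white",
            position = position_dodge(width = bar_width), size = 3.5) +
  scale_fill_brewer(palette = "Paired") +
  theme_minimal()
p_dodge
```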


Add labels to a stacked barplot : 3 steps are required

  1. Sort the data by dose and supp : the package plyr is used
  2. Calculate the cumulative sum of the variable len for each dose
  3. Create the plot
library(plyr)
# Sort by dose and supp
df_sorted <- arrange(df2, dose, supp) 
head(df_sorted)
##   supp dose  len
## 1   OJ D0.5  4.2
## 2   VC D0.5  6.8
## 3   OJ   D1 10.0
## 4   VC   D1 15.0
## 5   OJ   D2 29.5
## 6   VC   D2 33.0
# Calculate the cumulative sum of len for each dose
df_cumsum <- ddply(df_sorted, "dose",
                   transform, label_ypos=cumsum(len))
head(df_cumsum)
##   supp dose  len label_ypos
## 1   OJ D0.5  4.2        4.2
## 2   VC D0.5  6.8       11.0
## 3   OJ   D1 10.0       10.0
## 4   VC   D1 15.0       25.0
## 5   OJ   D2 29.5       29.5
## 6   VC   D2 33.0       62.5
# Create the barplot
ggplot(data=df_cumsum, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity")+
  geom_text(aes(y=label_ypos, label=len), vjust=1.6, 
            color="white", size=3.5)+
  scale_fill_brewer(palette="Paired")+
  theme_minimal()


If you want to place the labels at the middle of the bars, you have to modify the cumulative sum as follows :

df_cumsum <- ddply(df_sorted, "dose",
                   transform, 
                   label_ypos=cumsum(len) - 0.5*len)
# Create the barplot
ggplot(data=df_cumsum, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity")+
  geom_text(aes(y=label_ypos, label=len), vjust=1.6, 
            color="white", size=3.5)+
  scale_fill_brewer(palette="Paired")+
  theme_minimal()
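
The cumulative-sum approach above was written for ggplot2 1.0.0. In ggplot2 2.2.0 and later, position_stack() accepts a vjust argument, so the labels can be centered in each stacked segment without pre-computing their positions. These newer versions also stack the groups in the reverse order compared with older releases, so manually computed label positions may need the data sorted the other way round. A sketch assuming a recent ggplot2; df_stack and p_stack are illustrative names :

```r
library(ggplot2)

df_stack <- data.frame(supp = rep(c("VC", "OJ"), each = 3),
                       dose = rep(c("D0.5", "D1", "D2"), 2),
                       len = c(6.8, 15, 33, 4.2, 10, 29.5))

# position_stack(vjust = 0.5) places each label at the middle
# of the segment it belongs to
p_stack <- ggplot(data = df_stack, aes(x = dose, y = len, fill = supp)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = len), position = position_stack(vjust = 0.5),
            color = "white", size = 3.5) +
  scale_fill_brewer(palette = "Paired") +
  theme_minimal()
p_stack
```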


Barplot with a numeric x-axis

If the variable on the x-axis is numeric, it can be treated either as a continuous variable or as a factor, depending on what you want to show :

# Create some data
df2 <- data.frame(supp=rep(c("VC", "OJ"), each=3),
                dose=rep(c("0.5", "1", "2"),2),
                len=c(6.8, 15, 33, 4.2, 10, 29.5))
head(df2)
##   supp dose  len
## 1   VC  0.5  6.8
## 2   VC    1 15.0
## 3   VC    2 33.0
## 4   OJ  0.5  4.2
## 5   OJ    1 10.0
## 6   OJ    2 29.5
# x axis treated as continuous variable
df2$dose <- as.numeric(as.vector(df2$dose))
ggplot(data=df2, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity", position=position_dodge())+
  scale_fill_brewer(palette="Paired")+
  theme_minimal()
# Axis treated as discrete variable
df2$dose<-as.factor(df2$dose)
ggplot(data=df2, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity", position=position_dodge())+
  scale_fill_brewer(palette="Paired")+
  theme_minimal()


Barplot with error bars

The helper function below will be used to calculate the mean and the standard deviation, for the variable of interest, in each group :

#+++++++++++++++++++++++++
# Function to calculate the mean and the standard deviation
# for each group
#+++++++++++++++++++++++++
# data : a data frame
# varname : the name of a column containing the variable
#   to be summarized
# groupnames : vector of column names to be used as
#   grouping variables
data_summary <- function(data, varname, groupnames){
  require(plyr)
  summary_func <- function(x, col){
    c(mean = mean(x[[col]], na.rm=TRUE),
      sd = sd(x[[col]], na.rm=TRUE))
  }
  data_sum<-ddply(data, groupnames, .fun=summary_func,
                  varname)
  data_sum <- rename(data_sum, c("mean" = varname))
  return(data_sum)
}

Summarize the data :

df3 <- data_summary(ToothGrowth, varname="len", 
                    groupnames=c("supp", "dose"))
# Convert dose to a factor variable
df3$dose=as.factor(df3$dose)
head(df3)
##   supp dose   len       sd
## 1   OJ  0.5 13.23 4.459709
## 2   OJ    1 22.70 3.910953
## 3   OJ    2 26.06 2.655058
## 4   VC  0.5  7.98 2.746634
## 5   VC    1 16.77 2.515309
## 6   VC    2 26.14 4.797731
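
For readers who prefer not to depend on plyr, the same summary can be sketched with base R's aggregate(). This is an alternative under that assumption, not the method used in this tutorial; the names agg and df3_base are illustrative :

```r
# Mean and standard deviation of len for each supp/dose combination
agg <- aggregate(len ~ supp + dose, data = ToothGrowth,
                 FUN = function(x) c(mean = mean(x), sd = sd(x)))

# aggregate() returns the two statistics as a matrix column;
# flatten it into ordinary columns and rename them like df3
df3_base <- do.call(data.frame, agg)
names(df3_base) <- c("supp", "dose", "len", "sd")

# Convert dose to a factor variable
df3_base$dose <- as.factor(df3_base$dose)
df3_base  # same values as df3; the row order may differ
```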

The function geom_errorbar() can be used to produce a bar graph with error bars :

# Error bars show the mean +/- one standard deviation
p <- ggplot(df3, aes(x=dose, y=len, fill=supp)) +
  geom_bar(stat="identity", position=position_dodge()) +
  geom_errorbar(aes(ymin=len-sd, ymax=len+sd), width=.2,
                position=position_dodge(.9))
  
p + scale_fill_brewer(palette="Paired") + theme_minimal()


Customized barplots

# Change fill colors by groups, add a title and axis labels
# (p already contains the error bars)
p + labs(title="Plot of length per dose",
         x="Dose (mg)", y = "Length")+
  scale_fill_manual(values=c('black','lightgray'))+
  theme_classic()


Change fill colors using brewer palettes :

# Greens
p + scale_fill_brewer(palette="Greens") + theme_minimal()
# Reds
p + scale_fill_brewer(palette="Reds") + theme_minimal()


Infos

This analysis has been performed using R software (ver. 3.1.2) and ggplot2 (ver. 1.0.0)







