# Introduction

ggplot2.histogram is an easy-to-use function for plotting histograms with the ggplot2 package and the R statistical software. This ggplot2 tutorial shows how to make a histogram and how to customize its graphical parameters, including the main title, axis labels, legend, background and colors. The ggplot2.histogram function comes from the easyGgplot2 R package; an R script to install the package is given in the next section.

At the end of this tutorial you will be able to draw a customized histogram with a few lines of R code.

The ggplot2.histogram function is described in detail at the end of this document.

# Install and load easyGgplot2 package

The easyGgplot2 R package can be installed as follows:

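``````install.packages("devtools")
library(devtools)
# easyGgplot2 is hosted on GitHub under the user "kassambara";
# current devtools versions expect the "user/repo" form
install_github("kassambara/easyGgplot2")``````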

Load the package with the following R code:

``library(easyGgplot2)``

# Data format

The data must be a numeric vector or a data.frame (columns are variables and rows are observations).

The weight data set, from the easyGgplot2 package, is used in the following examples.

``````# create a numeric vector
numVector<-rnorm(100)
head(numVector)``````

``## [1]  0.5738  1.1956 -0.1904  0.4465  0.2567 -1.6642``

``````# data.frame
head(weight)``````

``````##      sex weight
## 1 Female  63.79
## 2 Female  65.28
## 3 Female  66.08
## 4 Female  62.65
## 5 Female  65.43
## 6 Female  65.51``````

The weight data set contains the weights of 400 people (200 females and 200 males).
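If easyGgplot2 is not installed, a stand-in data frame with the same shape can be simulated to follow along. This is only a sketch: the name `weight_sim`, the seed, and the means and standard deviations below are assumptions chosen for illustration, not the values behind the packaged weight data.

```r
# Simulated stand-in for the packaged `weight` data (hypothetical values).
set.seed(1234)  # arbitrary seed, for reproducibility only
weight_sim <- data.frame(
  sex    = factor(rep(c("Female", "Male"), each = 200)),
  weight = c(rnorm(200, mean = 65, sd = 2),   # assumed female distribution
             rnorm(200, mean = 69, sd = 2))   # assumed male distribution
)

str(weight_sim)          # 400 observations of 2 variables
table(weight_sim$sex)    # 200 rows per group
head(weight_sim)
```

Any function in this tutorial that takes `data=weight` should accept such a data frame in the same way, since only the column names `sex` and `weight` are referenced.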

# Basic histograms

``````# Histogram from a single numeric vector
# ggplot2.histogram(data=numVector)
# Basic histogram plot from the vector "weight"
ggplot2.histogram(data=weight, xName='weight')
# Change the width of bars
ggplot2.histogram(data=weight, xName='weight', binwidth=0.1)
# Change y axis values to density
ggplot2.histogram(data=weight, xName='weight', scale="density")``````
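To see what the bin width and the density scale change, the same quantities can be computed with base R's `hist()`. This sketch does not use easyGgplot2; the bin width of 0.5 and the variable names are assumptions for illustration.

```r
set.seed(42)
x <- rnorm(100)

binwidth <- 0.5
# Breaks covering the whole range of x in steps of `binwidth`
breaks <- seq(floor(min(x) / binwidth) * binwidth,
              ceiling(max(x) / binwidth) * binwidth,
              by = binwidth)

h <- hist(x, breaks = breaks, plot = FALSE)

h$counts    # frequency scale: number of observations per bin
h$density   # density scale: counts / (n * binwidth)

# The two scales differ only by a constant factor,
# and the density bars have a total area of 1.
all.equal(h$density, h$counts / (length(x) * binwidth))
sum(h$density * binwidth)
```

A smaller bin width gives more, narrower bars with lower counts each; the density scale compensates for this, which makes histograms with different bin widths comparable.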

# Change the histogram orientation

``````# Horizontal histogram plot
ggplot2.histogram(data=weight, xName='weight',
orientation="horizontal")
# y Axis reversed
ggplot2.histogram(data=weight, xName='weight',
orientation="yAxisReversed")``````

# Add mean line and density curve

``````# Add mean line to the histogram plot.
# Change fill color and line color
ggplot2.histogram(data=weight, xName='weight',
fill="white", color="black",
addMeanLine=TRUE, meanLineColor="red",
meanLineType="dashed", meanLineSize=1)
# Add density curve
ggplot2.histogram(data=weight, xName='weight',
fill="white", color="black",
addDensityCurve=TRUE, densityFill="#FF6666")``````

# Change the line type of the histogram plot

Different point shapes and line types can be used in the plot. By default, ggplot2 uses solid line type and circle shape.

``````#Change the histogram line color and line type
ggplot2.histogram(data=weight, xName='weight',
fill="white", color="black",
linetype="longdash")``````

# Histogram plot with multiple groups

``````# Multiple histograms on the same plot
# Color the histogram plot by the groupName "sex"
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', legendPosition="top")
# Histogram plots with semi-transparent fill.
# alpha is the transparency of the overlaid color
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', legendPosition="top",
alpha=0.5 )
# Histogram plots with mean lines
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', legendPosition="top",
alpha=0.5, addMeanLine=TRUE)``````

You can change the position adjustment used for overlapping bars on the layer. Possible values for the argument `position` are “identity”, “stack” and “dodge”, as shown in the following histograms.

``````# Default value of position is "identity"
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', legendPosition="top",
alpha=0.5, position="identity")
# Interleaved histograms
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', legendPosition="top",
alpha=0.5, position="dodge")
#stacked histograms
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', legendPosition="top",
alpha=0.5, position="stack")``````
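These three values have the same names as the position adjustments of ggplot2's `geom_histogram()`. As a rough illustration in plain ggplot2 (an assumption about equivalent calls, not the internal code of easyGgplot2), the sketch below uses `mtcars` so that no extra data is needed; the bin width of 2.5 is arbitrary.

```r
library(ggplot2)

# mpg grouped by transmission type (am: 0 = automatic, 1 = manual)
cars <- transform(mtcars, am = factor(am))

base <- ggplot(cars, aes(x = mpg, fill = am))

# identity: bars of both groups are drawn over each other
p_identity <- base + geom_histogram(binwidth = 2.5, alpha = 0.5, position = "identity")
# stack: bars of the second group sit on top of the first
p_stack    <- base + geom_histogram(binwidth = 2.5, alpha = 0.5, position = "stack")
# dodge: bars of the two groups are placed side by side
p_dodge    <- base + geom_histogram(binwidth = 2.5, alpha = 0.5, position = "dodge")

# Bar heights after the position adjustment
d_identity <- layer_data(p_identity)
d_stack    <- layer_data(p_stack)
max(d_identity$ymax)  # tallest single-group bar
max(d_stack$ymax)     # tallest stacked bar (both groups in one bin)
```

With “identity” a semi-transparent fill (`alpha`) matters most, because otherwise one group hides the other.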

## Parameters

The arguments that can be used to customize the titles and the x and y axes are listed below:

| Parameters | Description |
|---|---|
| mainTitle | the title of the plot |
| mainTitleFont | a vector of length 3 indicating respectively the size, the style (“italic”, “bold”, “bold.italic”) and the color of the main title. Default value is `mainTitleFont=c(14, "bold", "black")`. |
| xShowTitle, yShowTitle | if TRUE, x and y axis titles will be shown. Set the value to FALSE to hide axis labels. Default values are `TRUE`. |
| xtitle, ytitle | x and y axis labels. Default values are `NULL`. |
| xtitleFont, ytitleFont | a vector of length 3 indicating respectively the size, the style and the color of x and y axis titles. Possible values for the style: “plain”, “italic”, “bold”, “bold.italic”. Color can be specified as a hexadecimal code (e.g. “#FFCC00”) or by name (e.g. “red”, “green”). Default values are `xtitleFont=c(14,"bold", "black"), ytitleFont=c(14,"bold", "black")`. |
| xlim, ylim | limits for the x and y axes. Default values are `NULL`. |
| xScale, yScale | x and y axis scales. Possible values: c(“none”, “log2”, “log10”), e.g. yScale=“log2”. Default values are `NULL`. |
| xShowTickLabel, yShowTickLabel | if TRUE, x and y axis tick mark labels will be shown. Default values are `TRUE`. |
| xTickLabelFont, yTickLabelFont | a vector of length 3 indicating respectively the size, the style and the color of x and y axis tick label fonts. Default values are `xTickLabelFont=c(12, "bold", "black"), yTickLabelFont=c(12, "bold", "black")`. |
| xtickLabelRotation, ytickLabelRotation | rotation angle of x and y axis tick labels. Default values are `0`. |
| hideAxisTicks | if TRUE, x and y axis ticks are hidden. Default value is `FALSE`. |
| axisLine | a vector of length 3 indicating respectively the size, the line type and the color of axis lines. Default value is `c(0.5, "solid", "#E5E5E5")`. |

## Main title and axis labels

``````# basic plot
plot<-ggplot2.histogram(data=weight, xName='weight')
print(plot)
# Change main title and axis titles
ggplot2.customize(plot, mainTitle="Weight histo.",
xtitle="Weight (Kg)", ytitle="Histogram")
# Customize title styles. Possible values for the font style :
# 'plain', 'italic', 'bold', 'bold.italic'.
ggplot2.customize(plot,  mainTitle="Weight histo.",
mainTitleFont=c(14,"bold.italic", "red"),
xtitle="Weight (Kg)", ytitle="Histogram",
xtitleFont=c(14,"bold", "#993333"),
ytitleFont=c(14,"bold", "#993333"))
# Hide x and y axis titles
ggplot2.customize(plot, xShowTitle=FALSE, yShowTitle=FALSE)``````

## Axis ticks

``````# Axis tick labels and orientation
ggplot2.histogram(data=weight, xName='weight',
xShowTitle=FALSE, yShowTitle=FALSE,
xTickLabelFont=c(14,"bold", "#993333"),
yTickLabelFont=c(14,"bold", "#993333"),
xtickLabelRotation=45, ytickLabelRotation=45)
# Hide axis tick labels
ggplot2.histogram(data=weight, xName='weight',
xShowTitle=FALSE, yShowTitle=FALSE,
xShowTickLabel=FALSE, yShowTickLabel=FALSE)
# Hide axis ticks
ggplot2.histogram(data=weight, xName='weight',
xShowTitle=FALSE, yShowTitle=FALSE,
xShowTickLabel=FALSE, yShowTickLabel=FALSE,
hideAxisTicks=TRUE)
# AxisLine : a vector of length 3 indicating the size,
#the line type and the color of axis lines
ggplot2.histogram(data=weight, xName='weight',
axisLine=c(1, "solid", "darkblue"))``````

## Background and colors

### Change histogram plot background and fill colors

``````# Change background color to "white". Default is "gray"
ggplot2.histogram(data=weight, xName='weight',
backgroundColor="white")
# Change background color to "lightblue" and grid color to "white"
ggplot2.histogram(data=weight, xName='weight',
backgroundColor="lightblue", gridColor="white")
# Change plot fill color
# color = color of the histogram border
ggplot2.histogram(data=weight, xName='weight',
fill="lightblue", color="darkblue")
# Remove grid; remove top and right borders around the plot;
# change  axis lines
ggplot2.histogram(data=weight, xName='weight',
removePanelGrid=TRUE,removePanelBorder=TRUE,
axisLine=c(0.5, "solid", "black"))``````

### Fill the histogram by count value

``````# Fill the histogram according to the count value
ggplot2.histogram(data=weight, xName='weight')+
geom_histogram(aes(fill = ..count..))

# Change the scale
ggplot2.histogram(data=weight, xName='weight')+
geom_histogram(aes(fill = ..count..))+
scale_fill_gradient("Count", low = "green", high = "red")``````
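In recent ggplot2 releases (3.4.0 and later) the `..count..` notation is deprecated in favor of `after_stat(count)`. A sketch of the same idea in plain ggplot2, using `mtcars$mpg` and an assumed bin width of 2.5 rather than the weight data:

```r
library(ggplot2)

# Fill each bar according to the number of observations in its bin
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(aes(fill = after_stat(count)), binwidth = 2.5) +
  scale_fill_gradient("Count", low = "green", high = "red")

print(p)
```

With the ggplot2 version listed at the end of this document (1.0.0), `after_stat()` is not available and the `..count..` form shown above is the one to use.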

### Change histogram plot color according to the group

Colors can be specified as a hexadecimal RGB triplet, such as `"#FFCC00"`, or by name (e.g. `"red"`). You can also use other color scales, such as those taken from the RColorBrewer package. The different color systems available in R have been described in detail here.

To change the histogram color according to the group, specify the name of the data column containing the groups with the argument `groupName`. Use the argument `groupColors` to specify colors by hexadecimal code or by name; in this case, the length of `groupColors` must equal the number of groups. Use the argument `brewerPalette` to specify colors from an `RColorBrewer` palette.

``````# Change group colors using hexadecimal colors
# alpha is the transparency level of the overlaid color.
# The value can vary from 0 (total transparency)
# to 1 (no transparency)
ggplot2.histogram(data=weight, xName='weight', groupName='sex',
legendPosition="top",
groupColors=c('#999999','#E69F00'), alpha=0.5 )
# Change group colors using brewer palette: "Paired"
ggplot2.histogram(data=weight, xName='weight', groupName='sex',
legendPosition="top",
brewerPalette="Paired",alpha=0.5)``````

Colors can also be changed by using names, as follows:

``````#Change group colors using color names
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', groupColors=c('aquamarine3','goldenrod1'),
alpha=0.5 )``````
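Before passing colors to `groupColors`, it can help to check that every entry is a valid R color and that there is one color per group. A small base R sketch; the helper name `check_group_colors` is hypothetical and not part of easyGgplot2:

```r
# Hypothetical helper: validate a color vector against a grouping variable
check_group_colors <- function(groups, colors) {
  n_groups <- nlevels(factor(groups))
  if (length(colors) != n_groups) {
    stop("Expected ", n_groups, " colors, got ", length(colors))
  }
  # col2rgb() raises an error for names or hex codes that R does not recognise
  grDevices::col2rgb(colors)
  invisible(TRUE)
}

sex <- rep(c("Female", "Male"), each = 200)
check_group_colors(sex, c("#999999", "#E69F00"))         # hex codes
check_group_colors(sex, c("aquamarine3", "goldenrod1"))  # color names

# A hex code and a name can describe the same RGB values
grDevices::col2rgb("red")
grDevices::col2rgb("#FF0000")
```

The full list of color names known to R is returned by `colors()`.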

## Legend

### Legend position

``````# Change the legend position to "top"
# (possible values: "left","top", "right", "bottom")
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', alpha=0.5,
legendPosition="top")
# legendPosition can be also a numeric vector c(x, y)
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', alpha=0.5,
legendPosition=c(0.8,0.2))
#Remove legend
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', alpha=0.5,
showLegend=FALSE)``````

It is also possible to position the legend inside the plotting area by giving the x, y coordinates of the legend box. The x and y values must be between 0 and 1: `c(0,0)` corresponds to the `"bottom left"` position and `c(1,1)` to the `"top right"` position.

### Legend background color, title and text font styles

``````# Change legend background color, title and text font styles
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', alpha=0.5,
# legendTitleFont=c(size, style, color)
legendTitle="Groups", legendTitleFont=c(10, "bold", "blue"),
#legendTextFont=c(size, style, color)
legendTextFont=c(10, "bold.italic", "red"),
#legendBackground: c(fill, lineSize, lineType, lineColor)
legendBackground=c("lightblue", 0.5, "solid", "darkblue" )
)``````

## Axis scales

Possible values for the x axis scale are “none”, “log2” and “log10”. The default value is “none”.

``````# Change x axis limit
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', alpha=0.5,
showLegend=FALSE, xlim=c(60,72))
# x Log scale. Possible values="none", "log2" and "log10"
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', alpha=0.5,
showLegend=FALSE, xScale="log2")``````

## Create a customized plot with a few lines of R code

``````# basic plot
plot<-ggplot2.histogram(data=weight, xName='weight', groupName='sex',
groupColors=c('#999999','#E69F00'), alpha=0.5,
backgroundColor="white")
#print(plot)
# Customized histogram
plot<-ggplot2.customize(plot, xtitle="Weight (Kg)", ytitle="Count",
showLegend=FALSE,
mainTitle="Weight histogram \nby sex")
print(plot)
# Remove grid; remove top and right borders around the plot
ggplot2.customize(plot,
removePanelGrid=TRUE,removePanelBorder=TRUE,
axisLine=c(0.5, "solid", "black"))``````

The argument alpha is used to specify the transparency of colors.

# Facet : split a plot into a matrix of panels

The facet approach splits a plot into a matrix of panels. Each panel shows a different subset of the data.

## Facet with one variable

``````# Facet by "sex" variable
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', legendPosition="top",
faceting=TRUE, facetingVarNames="sex")
# Change the direction. Possible values are "vertical", "horizontal".
# Default is "vertical".
ggplot2.histogram(data=weight, xName='weight',
groupName='sex', legendPosition="top",
faceting=TRUE, facetingVarNames="sex",
facetingDirection="horizontal") ``````

## Facet with two variables

The mtcars data is used in the following examples.

``````data(mtcars)
head(mtcars)``````

``````##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1``````

mtcars (Motor Trend Car Road Tests) comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles.

``````# Facet by two variables: vs and am.
# Rows are vs and columns are am
ggplot2.histogram(data=mtcars, xName='mpg', groupName='vs',
legendPosition="top",
faceting=TRUE, facetingVarNames=c("vs", "am"))
#Facet by two variables: reverse the order of the 2 variables
#Rows are am and columns are vs
ggplot2.histogram(data=mtcars, xName='mpg', groupName='vs',
legendPosition="top",
faceting=TRUE, facetingVarNames=c("am", "vs"))``````
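With a small data set, some panels may hold very few observations. A quick base R cross-tabulation shows how many cars fall in each `vs` by `am` panel before plotting (the name `panel_counts` is only illustrative):

```r
data(mtcars)

# Rows are vs, columns are am: one cell per facet panel
panel_counts <- table(vs = mtcars$vs, am = mtcars$am)
panel_counts

# Every car lands in exactly one panel
sum(panel_counts) == nrow(mtcars)
```

Swapping the two arguments of `table()` transposes the result, in the same way that reversing `facetingVarNames` swaps rows and columns of the panel matrix.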

## Facet scales

By default, all the panels have the same scale (`facetingScales="fixed"`). They can be made independent, by setting scales to `free`, `free_x`, or `free_y`.

``````# Facet with free scales
ggplot2.histogram(data=mtcars, xName='mpg', groupName='vs',
legendPosition="top",
faceting=TRUE, facetingVarNames=c("vs", "am"),
facetingScales="free")``````

As you can see in the above plot, the y axes have different scales in the different panels.
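In plain ggplot2 the option of the same name is the `scales` argument of `facet_grid()`. The sketch below is an assumed equivalent of the faceted call above, not the package's internal code; the bin width of 2.5 is arbitrary.

```r
library(ggplot2)

cars <- transform(mtcars, vs = factor(vs), am = factor(am))

p <- ggplot(cars, aes(x = mpg, fill = vs)) +
  geom_histogram(binwidth = 2.5)

# Rows are vs, columns are am
p_fixed <- p + facet_grid(vs ~ am)                     # shared axes (default)
p_free  <- p + facet_grid(vs ~ am, scales = "free")    # x free per column, y free per row
p_freey <- p + facet_grid(vs ~ am, scales = "free_y")  # only y free per row

print(p_free)
```

Free scales make each panel easier to read on its own, but bar heights can no longer be compared directly across rows.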

## Facet label appearance

``````# Change facet text font. Possible values for the font style:
#'plain', 'italic', 'bold', 'bold.italic'.
ggplot2.histogram(data=mtcars, xName='mpg', groupName='vs',
legendPosition="top",
faceting=TRUE, facetingVarNames=c("vs", "am"),
facetingFont=c(12, 'bold.italic', "red"))
# Change the appearance of the rectangle around facet label
ggplot2.histogram(data=mtcars, xName='mpg', groupName='vs',
legendPosition="top",
faceting=TRUE, facetingVarNames=c("vs", "am"),
facetingRect=list(background="white", lineType="solid",
lineColor="black", lineSize=1.5)
)``````

# ggplot2.histogram function

## Description

Easily plot a histogram with the easyGgplot2 R package.

## Usage

``````ggplot2.histogram(data, xName=NULL, groupName=NULL,
position=c("identity", "stack", "dodge"),
meanLineType="dashed", meanLineSize=1,
densityAlpha=0.2,
densityLineType="solid", densityLineColor="#2F2F2F",
scale=c("frequency", "density"),
groupColors=NULL, brewerPalette=NULL,...)``````

## Arguments

| Arguments | Descriptions |
|---|---|
| data | data.frame or a numeric vector. Columns are variables and rows are observations. |
| xName | The name of the column containing the x variable. Default value is NULL. |
| groupName | The name of the column containing the group variable. This variable is used to color the plot according to the group. |
| position | The position adjustment to use for overlapping bars on the layer. Possible values for the argument position are “identity”, “stack”, “dodge”. Default value is identity. |
| addMeanLine | If TRUE, the mean line is added on the plot for each group. Default value is FALSE. |
| meanLineColor, meanLineType, meanLineSize | Mean line color, type and size. |
| densityFill | The fill color of the density plot. The value is considered only when groupName=NULL. If groupName is specified, density curves are colored according to groupColors or brewerPalette. |
| densityAlpha | Degree of transparency of overlaid colors for density curves. Default is 0.2 (20%). |
| densityLineType, densityLineColor | Line type and color for the density curve. |
| scale | Indicates whether y axis values are density or frequency. Default value is frequency. |
| groupColors | Color of groups. groupColors should have the same length as the number of groups. |
| brewerPalette | This can also be used to indicate group colors. In this case the parameter groupColors should be NULL. e.g. brewerPalette=“Paired”. |
| ... | Other arguments passed on to the ggplot2.customize custom function or to the geom_histogram and geom_density functions from the ggplot2 package. |

The other arguments that can be used are described at this link: ggplot2 customize. They customize the plot (axis, title, background, color, legend, ...) generated using the ggplot2 or easyGgplot2 R package.

## Examples

``````library(easyGgplot2)
#plot
plot<-ggplot2.histogram(data=weight, xName='weight',groupName='sex',
groupColors=c('#999999','#E69F00'))
plot<-ggplot2.customize(plot,
mainTitle="Plot of weight histogram \nper sex",
xtitle="Weight (Kg)", ytitle="Histogram")
print(plot)``````

# Easy ggplot2 ebook

Note that an eBook is available on easyGgplot2 package here.

September 2014 : First edition.

Licence : This document is under creative commons licence (http://creativecommons.org/licenses/by-nc-sa/3.0/).

# Infos

This analysis was performed using R (ver. 3.1.0), easyGgplot2 (ver 1.0.0) and ggplot2 (ver 1.0.0).