# Articles - R Graphics Essentials

## Plot Time Series Data Using GGPlot

In this chapter, we start by describing how to plot simple and multiple time series data using the R function `geom_line()` [in ggplot2].

Next, we show how to set date axis limits and add a smoothed trend line to a time series graph. Finally, we introduce some extensions to the ggplot2 package for easily handling and analyzing time series objects.

Additionally, you’ll learn how to detect peaks (maxima) and valleys (minima) in time series data.

## Basic ggplot of time series

• Plot types: line plot with dates on x-axis
• Demo data set: the `economics` time series data set [in ggplot2].

In this section, we’ll plot the variable `pop` (total population, in thousands) by `date` (x-axis).

• Load required packages and set the default theme:
``````library(ggplot2)
theme_set(theme_minimal())
# Demo dataset
head(economics)``````
``````## # A tibble: 6 x 6
##         date   pce    pop psavert uempmed unemploy
##
## 1 1967-07-01   507 198712    12.5     4.5     2944
## 2 1967-08-01   510 198911    12.5     4.7     2945
## 3 1967-09-01   516 199113    11.7     4.6     2958
## 4 1967-10-01   513 199311    12.5     4.9     3143
## 5 1967-11-01   518 199498    12.5     4.7     3066
## 6 1967-12-01   526 199657    12.1     4.8     3018``````
• Create basic line plots
``````# Basic line plot
# (linewidth requires ggplot2 >= 3.4.0; use size in older versions)
ggplot(data = economics, aes(x = date, y = pop)) +
  geom_line(color = "#00AFBB", linewidth = 2)
# Plot a subset of the data
ss <- subset(economics, date > as.Date("2006-01-01"))
ggplot(data = ss, aes(x = date, y = pop)) +
  geom_line(color = "#FC4E07", linewidth = 2)``````

• Control line width by the value of a continuous variable:
``````ggplot(data = economics, aes(x = date, y = pop)) +
  geom_line(aes(linewidth = unemploy/pop), color = "#FC4E07")``````
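
The date axis in the plots above appears because `economics$date` is stored as a `Date` vector. As a minimal sketch for data where dates arrive as text, the hypothetical data frame `raw` below (invented for illustration, not part of the demo data) is converted with `as.Date()` before plotting.

```r
library(ggplot2)

# Hypothetical data: dates stored as character strings
raw <- data.frame(
  date  = c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01"),
  value = c(10, 12, 9, 14),
  stringsAsFactors = FALSE
)

# Convert to Date so that ggplot2 picks a date scale for the x-axis
raw$date <- as.Date(raw$date, format = "%Y-%m-%d")
class(raw$date)

p_raw <- ggplot(raw, aes(x = date, y = value)) +
  geom_line(color = "#00AFBB")
p_raw
```

Without the conversion, the x values would be treated as discrete labels rather than as positions on a time axis.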

## Plot multiple time series data

Here, we’ll plot the variables `psavert` (personal savings rate) and `uempmed` (median duration of unemployment, in weeks) by date. You should first reshape the data using the `tidyr` package:

• Collapse the `psavert` and `uempmed` values into a single new column. R function: `gather()` [in tidyr].
• Create a grouping variable with the levels `psavert` and `uempmed`.

``````library(tidyr)
library(dplyr)
df <- economics %>%
  select(date, psavert, uempmed) %>%
  gather(key = "variable", value = "value", -date)
head(df, 3)``````
``````## # A tibble: 3 x 3
##         date variable value
##
## 1 1967-07-01  psavert  12.5
## 2 1967-08-01  psavert  12.5
## 3 1967-09-01  psavert  11.7``````
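
In tidyr 1.0.0 and later, `gather()` is superseded by `pivot_longer()`. The sketch below, which assumes such a tidyr version is installed, performs an equivalent reshape; the name `df_long` is chosen here for illustration. The rows come out in a different order (the two variables alternate for each date), which does not affect the plots.

```r
library(ggplot2)  # provides the economics data set
library(dplyr)
library(tidyr)

df_long <- economics %>%
  select(date, psavert, uempmed) %>%
  pivot_longer(
    cols      = -date,        # reshape every column except date
    names_to  = "variable",   # former column names go here
    values_to = "value"       # former cell values go here
  )

head(df_long, 3)
```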
``````# Multiple line plot
ggplot(df, aes(x = date, y = value)) +
  geom_line(aes(color = variable), linewidth = 1) +
  scale_color_manual(values = c("#00AFBB", "#E7B800")) +
  theme_minimal()``````

``````# Area plot
ggplot(df, aes(x = date, y = value)) +
  geom_area(aes(color = variable, fill = variable),
            alpha = 0.5, position = position_dodge(0.8)) +
  scale_color_manual(values = c("#00AFBB", "#E7B800")) +
  scale_fill_manual(values = c("#00AFBB", "#E7B800"))``````

## Set date axis limits

Key R function: `scale_x_date()`

``````# Base plot with date axis
p <- ggplot(data = economics, aes(x = date, y = psavert)) +
  geom_line(color = "#00AFBB", linewidth = 1)
p
# Set axis limits c(min, max); NA keeps the default for that end
min_date <- as.Date("2002-01-01")
max_date <- NA
p + scale_x_date(limits = c(min_date, max_date))``````
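
Limits set on the scale remove observations that fall outside the range before drawing, and ggplot2 warns about the dropped rows. As a sketch of an alternative, `coord_cartesian()` zooms the view without discarding data. The date range in `lims` is an arbitrary choice for illustration.

```r
library(ggplot2)

p <- ggplot(data = economics, aes(x = date, y = psavert)) +
  geom_line(color = "#00AFBB")

# Arbitrary example range with both ends fixed
lims <- as.Date(c("2002-01-01", "2010-01-01"))

# Scale limits: observations outside the range are dropped
p_scale <- p + scale_x_date(limits = lims)

# Coordinate limits: the view is zoomed, no observation is dropped
p_zoom <- p + coord_cartesian(xlim = lims)
```

The difference matters mostly when a statistic such as a smoother is added, since with scale limits it is computed on the retained observations only.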

## Format date axis labels

Key function: `scale_x_date()`.

To format date axis labels, you can use different combinations of days, weeks, months and years:

• Weekday name: use `%a` and `%A` for abbreviated and full weekday name, respectively
• Month name: use `%b` and `%B` for abbreviated and full month name, respectively
• `%d`: day of the month as decimal number
• `%Y`: Year with century.
• See more options in the documentation of the function `?strptime`
``````# Format : month/year
p + scale_x_date(date_labels = "%b/%Y")``````
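
The same format codes can be tried out with base R's `format()` before using them on an axis. The sketch below also uses the `date_breaks` argument of `scale_x_date()` to control tick spacing; the ten-year spacing and the extra code `%m` (month as a decimal number) are choices made here for illustration.

```r
library(ggplot2)

# Preview format codes on a single date
d <- as.Date("2002-01-15")
format(d, "%d/%m/%Y")   # day/month/year, all numeric
format(d, "%Y")         # four-digit year

p <- ggplot(data = economics, aes(x = date, y = psavert)) +
  geom_line(color = "#00AFBB")

# One tick every 10 years, labelled with the four-digit year
p_years <- p + scale_x_date(date_breaks = "10 years", date_labels = "%Y")
p_years
```

Note that `%a`, `%A`, `%b` and `%B` produce names in the language of the current locale.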

## Add trend smoothed line

Key function: `stat_smooth()`

``````p + stat_smooth(
  color = "#FC4E07", fill = "#FC4E07",
  method = "loess"
)``````
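
How closely the loess curve follows the data is controlled by the `span` argument. The sketch below uses an arbitrary value of 0.2 and hides the confidence band with `se = FALSE`. It also fits a comparable curve directly with `stats::loess()` in case the trend values themselves are needed; this is an illustration, not necessarily the exact computation ggplot2 performs.

```r
library(ggplot2)

p <- ggplot(data = economics, aes(x = date, y = psavert)) +
  geom_line(color = "#00AFBB")

# Smaller span = more local fit, so the curve follows the data more closely
p_smooth <- p + stat_smooth(
  method = "loess", span = 0.2, se = FALSE,
  color = "#FC4E07"
)
p_smooth

# A comparable fit computed outside ggplot2 (dates converted to numbers)
fit   <- loess(psavert ~ as.numeric(date), data = economics, span = 0.2)
trend <- predict(fit)   # one fitted value per observation
head(trend)
```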

## ggplot2 extensions for ts objects

The `ggfortify` package is an extension to ggplot2 that makes it easy to plot time series objects (Horikoshi and Tang 2017). It can handle the output of many time series packages, including: `zoo::zooreg()`, `xts::xts()`, `timeSeries::timeSeries()`, `tseries::irts()`, `forecast::forecast()`, `vars::VAR()`.

Another interesting package is `ggpmisc` (Aphalo 2017), which provides two useful functions for time series objects:

• `stat_peaks()` finds at which x positions local y maxima are located, and
• `stat_valleys()` finds at which x positions local y minima are located.

Here, we’ll show how to easily:

• Visualize a time series object, using the data set `AirPassengers` (monthly airline passenger numbers 1949-1960).
• Identify shifts in mean and/or variance in a time series using the `changepoint` package.
• Detect jumps in data using the `strucchange` package and the data set `Nile` (measurements of the annual flow of the river Nile at Aswan).
• Detect peaks and valleys using the `ggpmisc` package and the data set `lynx` (Annual Canadian Lynx trappings 1821–1934).

First, install required R packages:

``````install.packages(
  c("ggfortify", "changepoint",
    "strucchange", "ggpmisc")
)``````

Then use the `autoplot.ts()` function to visualize time series objects, as follows:

``````library(ggfortify)
library(magrittr) # for piping %>%
# Plot ts objects
autoplot(AirPassengers)
# Identify change points in mean and variance
AirPassengers %>%
  changepoint::cpt.meanvar() %>%  # Identify change points
  autoplot()
# Detect jumps in the data
strucchange::breakpoints(Nile ~ 1) %>%
  autoplot()``````
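
If you prefer to build the plot with plain ggplot2, a `ts` object can first be turned into a data frame with a `Date` column. The sketch below does this by hand for the monthly `AirPassengers` series; the names `ap_df` and `passengers` are chosen here for illustration, and the approach assumes a regular monthly series starting in January 1949.

```r
library(ggplot2)

# AirPassengers is monthly and starts in January 1949
ap_df <- data.frame(
  date = seq(as.Date("1949-01-01"), by = "month",
             length.out = length(AirPassengers)),
  passengers = as.numeric(AirPassengers)
)
head(ap_df, 3)

p_ap <- ggplot(ap_df, aes(x = date, y = passengers)) +
  geom_line(color = "#00AFBB")
p_ap
```

Once the data is in this form, everything shown earlier for `economics` (date limits, label formats, smoothing) applies unchanged.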

Detect peaks and valleys:

``````library(ggpmisc)
ggplot(lynx, as.numeric = FALSE) + geom_line() +
  stat_peaks(colour = "red") +
  stat_peaks(geom = "text", colour = "red",
             vjust = -0.5, x.label.fmt = "%Y") +
  stat_valleys(colour = "blue") +
  stat_valleys(geom = "text", colour = "blue", angle = 45,
               vjust = 1.5, hjust = 1, x.label.fmt = "%Y") +
  ylim(-500, 7300)``````
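
To get the peak and valley positions as numbers rather than as plot layers, a simple rule can be coded in base R. The sketch below marks a point as a peak when it is strictly higher than both of its immediate neighbours (and as a valley when strictly lower). This is an illustration of the idea only: `stat_peaks()` and `stat_valleys()` search within a wider window, so their results may differ, and flat plateaus are not detected by this rule.

```r
x     <- as.numeric(lynx)
years <- as.numeric(time(lynx))

# sign(diff(x)) is +1 while the series rises and -1 while it falls.
# A change from +1 to -1 (difference of -2) marks a strict local maximum,
# a change from -1 to +1 (difference of +2) a strict local minimum.
turn    <- diff(sign(diff(x)))
peaks   <- which(turn == -2) + 1
valleys <- which(turn ==  2) + 1

# Years and values of the detected peaks
peaks_df <- data.frame(year = years[peaks], trappings = x[peaks])
head(peaks_df)
```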

## References

Aphalo, Pedro J. 2017. ggpmisc: Miscellaneous Extensions to 'ggplot2'. https://CRAN.R-project.org/package=ggpmisc.

Horikoshi, Masaaki, and Yuan Tang. 2017. ggfortify: Data Visualization Tools for Statistical Analysis Results. https://CRAN.R-project.org/package=ggfortify.