Data Manipulation in R

Rename Data Frame Columns in R

In this tutorial, you will learn how to rename the columns of a data frame in R. This can be done easily using the function rename() [dplyr package]. It’s also possible to use R base functions, but they require more typing.


Required packages

Load the tidyverse packages, which include dplyr:

library(tidyverse)

Demo dataset

We’ll use the R built-in iris data set, which we start by converting into a tibble data frame (tbl_df) for easier data analysis.

my_data <- as_tibble(iris)
my_data
## # A tibble: 150 x 5
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
##          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
## 1          5.1         3.5          1.4         0.2 setosa 
## 2          4.9         3            1.4         0.2 setosa 
## 3          4.7         3.2          1.3         0.2 setosa 
## 4          4.6         3.1          1.5         0.2 setosa 
## 5          5           3.6          1.4         0.2 setosa 
## 6          5.4         3.9          1.7         0.4 setosa 
## # ... with 144 more rows

Renaming columns with dplyr::rename()

Rename the column Sepal.Length to sepal_length and Sepal.Width to sepal_width:

my_data %>% 
  rename(
    sepal_length = Sepal.Length,
    sepal_width = Sepal.Width
    )
## # A tibble: 150 x 5
##   sepal_length sepal_width Petal.Length Petal.Width Species
##          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
## 1          5.1         3.5          1.4         0.2 setosa 
## 2          4.9         3            1.4         0.2 setosa 
## 3          4.7         3.2          1.3         0.2 setosa 
## 4          4.6         3.1          1.5         0.2 setosa 
## 5          5           3.6          1.4         0.2 setosa 
## 6          5.4         3.9          1.7         0.4 setosa 
## # ... with 144 more rows
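
When many columns follow the same naming pattern, listing each pair in rename() becomes tedious. As a minimal sketch, assuming dplyr version 1.0.0 or later, the function rename_with() applies a renaming function to a selection of columns. The helper to_snake_case() and the object renamed are hypothetical names introduced here for illustration only.

```r
library(dplyr)

my_data <- as_tibble(iris)

# Hypothetical helper: lower-case a name and replace dots with underscores
to_snake_case <- function(x) {
  tolower(gsub(".", "_", x, fixed = TRUE))
}

# Apply the helper to every column except Species
renamed <- my_data %>%
  rename_with(to_snake_case, .cols = -Species)

colnames(renamed)
## [1] "sepal_length" "sepal_width"  "petal_length" "petal_width"  "Species"
```

Because the new names are computed from the old ones, the same call works unchanged whether the data has 5 columns or 200.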

Renaming columns with R base functions

To rename the column Sepal.Length to sepal_length, the procedure is as follows:

  1. Get the column names using the function names() or colnames()
  2. Change the column name where the name is Sepal.Length

# get column names
colnames(my_data)
## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
## [5] "Species"
# Rename the columns named "Sepal.Length" and "Sepal.Width"
names(my_data)[names(my_data) == "Sepal.Length"] <- "sepal_length"
names(my_data)[names(my_data) == "Sepal.Width"] <- "sepal_width"
my_data
## # A tibble: 150 x 5
##   sepal_length sepal_width Petal.Length Petal.Width Species
##          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
## 1          5.1         3.5          1.4         0.2 setosa 
## 2          4.9         3            1.4         0.2 setosa 
## 3          4.7         3.2          1.3         0.2 setosa 
## 4          4.6         3.1          1.5         0.2 setosa 
## 5          5           3.6          1.4         0.2 setosa 
## 6          5.4         3.9          1.7         0.4 setosa 
## # ... with 144 more rows

It’s also possible to rename by index in the names vector as follows:

names(my_data)[1] <- "sepal_length"
names(my_data)[2] <- "sepal_width"
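
Renaming by index renames the wrong column if the column order ever changes. A minimal base R sketch that renames several columns by name in one step is shown below. It uses a named lookup vector, and the object names df, lookup and idx are illustrative choices.

```r
df <- iris  # fresh copy of the built-in data set

# Old names on the left, new names on the right
lookup <- c(Sepal.Length = "sepal_length", Sepal.Width = "sepal_width")

# Positions of the old names in the data frame
idx <- match(names(lookup), names(df))

# Fail early if one of the old names is not present
stopifnot(!anyNA(idx))

# Replace only the matched names; all other columns keep their names
names(df)[idx] <- unname(lookup)
names(df)
## [1] "sepal_length" "sepal_width"  "Petal.Length" "Petal.Width"  "Species"
```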

Summary

In this chapter, we described how to rename data frame columns using the function rename() [dplyr package] as well as R base functions.




Comments ( 16 )

  • Suhani

    what should i do if i want to change setosa to Setosa

    • Kassambara

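      It’s possible to use the function mutate() as follows:

      library("tidyverse")
      # Species is a factor: convert it to character first, otherwise
      # ifelse() returns the integer codes of the other levels
      iris.modified <- iris %>%
        mutate(Species = ifelse(Species == "setosa", "Setosa", as.character(Species)))
      head(iris.modified)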
      
      • Norman Munyengwa

        How do I add the letter “V” to row names in R? For example, the row name codes are 1023, 1024, 1025 and I want to change them to V1023, V1024, V1025.

        Thank you.

        • Sala Lotfi

          This might help you. Kan has described this nicely:
          https://blog.exploratory.io/selecting-columns-809bdd1ef615

          df %>%
            select(-starts_with("user."), -starts_with("milestone."),
                   -starts_with("pull_"), -ends_with("url")) %>%
            rename(developer = assignee.login) %>%
            select(-starts_with("assignee"), -title, -comments, -locked, -labels, -id, -body)

  • Kassambara

    You can proceed as follows:

    rownames(mydata) <- paste0("V",  rownames(mydata))
    
  • Anil Kumar

    What if I have quite a big data set, say 200+ columns?

    • Kassambara

      The functions described here still work, even if you have a large number of columns

      • Thomas

        Hi Kassambara,

        You seem to be really on top of how to rename columns, and I’ve been struggling to write code that can rename columns based on their names. I have many different datasets where a number of columns start with “alt” (e.g. alt1.price, alt1.pol, alt1.x, alt2.price, alt2.pol, alt2.x), and I would like to rename these columns to price_1, pol_1, x_1, price_2, pol_2, x_2.

        Essentially, I would like to select columns starting with alt, add an underscore, delete the ‘alt’ and move the number to the end of the column name. Is that possible in any way?

        Kind regards, Thomas

        • Kassambara

          Hi Thomas,

          You need to perform some string manipulations as shown below.

          library(tidyverse)  # attaches dplyr, purrr and stringr

          # Demo data preparation
          iris <- as_tibble(iris)
          colnames(iris) <- c("alt1.price", "alt2.price", "alt2.pol", "alt2.x", "y")
          iris

          # Helper function to rename a column whose name starts with "alt<number>."
          rename_column <- function(x){
            alt <- str_extract(x, "^alt[0-9]+\\.")
            if(is.na(alt)){
              # stop here and return x, because it doesn't start with "alt"
              return(x)
            }
            # fixed() matches the prefix literally, so "." is not a wildcard
            suffix <- str_replace(x, pattern = fixed(alt), replacement = "")
            number <- str_replace_all(alt, pattern = "alt|\\.", replacement = "")
            paste(suffix, number, sep = "_")
          }

          # Renaming columns: map_chr() returns a character vector, not a list
          columns <- colnames(iris)
          colnames(iris) <- map_chr(columns, rename_column)
          iris
          
          • Thomas

            Kassambara – you are a hero. Thanks a million for your extremely detailed answer. I was hoping for some hints and get a full code – much appreciated.
            /T

          • Moses

            You are goooood!

  • Felix Kennith Chan

    What if I have a large data set with 200+ columns?
    Is there a way to avoid renaming each column manually, one by one? Could you possibly create a for loop or something to do it? If you can, how would that work and what would it look like?

    Thanks

    • Kassambara

      You can also proceed as follows:

      colnames(my_data) = c("newname1", "newname2", "newname3")
      
      • Felix Kennith Chan

        Is there a way to avoid typing c("newname1", "newname2", "newname3", … , "newname200")?

  • Abdoulaye Sarr

    I have a matrix whose column names are years stored as dates, but as.Date() expects something like %y%m%d. How can I rename the columns to %Y only, as a date and not a character?
    Example: 2001-01-01 renamed as 2001.

  • Venkatapanchumarthi

    Hi, I am Venkatapanchumarthi.
    Awesome. This is an excellent article, the best I have read in recent days. This post is very informative and helped me a lot. Thanks for sharing such useful information.


Teacher
Alboukadel Kassambara
Role : Founder of Datanovia