Reading Data From Excel Files (xls|xlsx) into R
Previously, we described the essentials of R programming and some best practices for preparing your data. We also provided quick start guides for reading and writing txt and csv files using R base functions as well as using a most modern R package named readr, which is faster (X10) than R base functions.
In this article, you’ll learn how to read data from Excel xls or xlsx file formats into R. This can be done either by:
- copying data from Excel
- using readxl package
- or using xlsx package
Preleminary tasks
Launch RStudio as described here: Running RStudio and setting up your working directory
Prepare your data as described here: Best practices for preparing your data
Copying data from Excel and import into R
On Windows system
Open the Excel file containing your data: select and copy the data (ctrl + c)
Type the R code below to import the copied data from the clipboard into R and store the data in a data frame (my_data):
my_data <- read.table(file = "clipboard",
sep = "\t", header=TRUE)
On Mac OSX system
Select and copy the data (Cmd + c)
Use the function pipe(pbpaste) to import the data you’ve copied (with Cmd + c):
my_data <- read.table(pipe("pbpaste"), sep="\t", header = TRUE)
Importing Excel files into R using readxl package
The readxl package, developed by Hadley Wickham, can be used to easily import Excel files (xls|xlsx) into R without any external dependencies.
Installing and loading readxl package
- Install
install.packages("readxl")
- Load
library("readxl")
Using readxl package
The readxl package comes with the function read_excel() to read xls and xlsx files
- Read both xls and xlsx files
# Loading
library("readxl")
# xls files
my_data <- read_excel("my_file.xls")
# xlsx files
my_data <- read_excel("my_file.xlsx")
The above R code, assumes that the file “my_file.xls” and “my_file.xlsx” is in your current working directory. To know your current working directory, type the function getwd() in R console.
- It’s also possible to choose a file interactively using the function file.choose(), which I recommend if you’re a beginner in R programming:
my_data <- read_excel(file.choose())
If you use the R code above in RStudio, you will be asked to choose a file.
- Specify sheet with a number or name
# Specify sheet by its name
my_data <- read_excel("my_file.xlsx", sheet = "data")
# Specify sheet by its index
my_data <- read_excel("my_file.xlsx", sheet = 2)
- Case of missing values: NA (not available). If NAs are represented by something (example: “—”) other than blank cells, set the na argument:
my_data <- read_excel("my_file.xlsx", na = "---")
Importing Excel files using xlsx package
The xlsx package, a java-based solution, is one of the powerful R packages to read, write and format Excel files.
Installing and loading xlsx package
- Install
install.packages("xlsx")
- Load
library("xlsx")
Using xlsx package
There are two main functions in xlsx package for reading both xls and xlsx Excel files: read.xlsx() and read.xlsx2() [faster on big files compared to read.xlsx function].
The simplified formats are:
read.xlsx(file, sheetIndex, header=TRUE)
read.xlsx2(file, sheetIndex, header=TRUE)
- file: file path
- sheetIndex: the index of the sheet to be read
- header: a logical value. If TRUE, the first row is used as column names.
Example of usage:
library("xlsx")
my_data <- read.xlsx(file.choose(), 1) # read first sheet
Read more
Read more about for reading, writing and formatting Excel files:
Summary
Read Excel files using readxl package: read_excel(file.choose(), sheet = 1)
- Read Excel files using xlsx package: read.xlsx(file.choose(), sheetIndex = 1)
Infos
This analysis has been performed using R (ver. 3.2.3).
Show me some love with the like buttons below... Thank you and please don't forget to share and comment below!!
Montrez-moi un peu d'amour avec les like ci-dessous ... Merci et n'oubliez pas, s'il vous plaît, de partager et de commenter ci-dessous!
Recommended for You!
Recommended for you
This section contains the best data science and self-development resources to help you on your path.
Books - Data Science
Our Books
- Practical Guide to Cluster Analysis in R by A. Kassambara (Datanovia)
- Practical Guide To Principal Component Methods in R by A. Kassambara (Datanovia)
- Machine Learning Essentials: Practical Guide in R by A. Kassambara (Datanovia)
- R Graphics Essentials for Great Data Visualization by A. Kassambara (Datanovia)
- GGPlot2 Essentials for Great Data Visualization in R by A. Kassambara (Datanovia)
- Network Analysis and Visualization in R by A. Kassambara (Datanovia)
- Practical Statistics in R for Comparing Groups: Numerical Variables by A. Kassambara (Datanovia)
- Inter-Rater Reliability Essentials: Practical Guide in R by A. Kassambara (Datanovia)
Others
- R for Data Science: Import, Tidy, Transform, Visualize, and Model Data by Hadley Wickham & Garrett Grolemund
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurelien Géron
- Practical Statistics for Data Scientists: 50 Essential Concepts by Peter Bruce & Andrew Bruce
- Hands-On Programming with R: Write Your Own Functions And Simulations by Garrett Grolemund & Hadley Wickham
- An Introduction to Statistical Learning: with Applications in R by Gareth James et al.
- Deep Learning with R by François Chollet & J.J. Allaire
- Deep Learning with Python by François Chollet
Click to follow us on Facebook :
Comment this article by clicking on "Discussion" button (top-right position of this page)