Reading Data From TXT|CSV Files: R Base Functions
Previously, we described the essentials of R programming and some best practices for preparing your data.
Preliminary tasks
Launch RStudio as described here: Running RStudio and setting up your working directory
Prepare your data as described here: Best practices for preparing your data
R base functions for importing data
The R base function read.table() is a general function that can be used to read a file in table format. The data will be imported as a data frame.
Note that, depending on the format of your file, several variants of read.table() are available to make your life easier, including read.csv(), read.csv2(), read.delim() and read.delim2().
- read.csv(): for reading “comma separated value” files (“.csv”).
- read.csv2(): variant for files that use a comma “,” as the decimal point and a semicolon “;” as the field separator, as is common in some countries.
- read.delim(): for reading “tab-separated value” files (“.txt”). By default, a point (“.”) is used as the decimal point.
- read.delim2(): for reading “tab-separated value” files (“.txt”). By default, a comma (“,”) is used as the decimal point.
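To make the difference between the variants concrete, here is a minimal sketch. The two small files are made up for this illustration and written to temporary locations; only the read.csv2() and read.delim2() calls come from the list above.

```r
# Hypothetical sample data: semicolon-separated, decimal comma
csv_file <- tempfile(fileext = ".csv")
writeLines(c("name;height", "A;1,75", "B;1,62"), csv_file)

# Hypothetical sample data: tab-separated, decimal comma
txt_file <- tempfile(fileext = ".txt")
writeLines(c("name\theight", "A\t1,75", "B\t1,62"), txt_file)

eu_csv <- read.csv2(csv_file)    # defaults: sep = ";", dec = ","
eu_txt <- read.delim2(txt_file)  # defaults: sep = "\t", dec = ","

# In both cases the height column is parsed as numeric
str(eu_csv)
str(eu_txt)

# Clean up the temporary files
unlink(c(csv_file, txt_file))
```

If the same files were read with read.csv() or read.delim(), the decimal commas would not be interpreted as decimal points, so choosing the variant that matches your file matters.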
The simplified formats of these functions are as follows:
# Read tabular data into R
read.table(file, header = FALSE, sep = "", dec = ".")
# Read "comma separated value" files (".csv")
read.csv(file, header = TRUE, sep = ",", dec = ".", ...)
# Or use read.csv2: variant used in countries that
# use a comma as decimal point and a semicolon as field separator.
read.csv2(file, header = TRUE, sep = ";", dec = ",", ...)
# Read TAB delimited files
read.delim(file, header = TRUE, sep = "\t", dec = ".", ...)
read.delim2(file, header = TRUE, sep = "\t", dec = ",", ...)
- file: the path to the file containing the data to be imported into R.
- sep: the field separator character. “\t” is used for tab-delimited files.
- header: logical value. If TRUE, the first row of the file is used as the column names. read.table() defaults to header = FALSE, whereas read.csv(), read.csv2(), read.delim() and read.delim2() default to header = TRUE. If your file has no header row, set header = FALSE.
- dec: the character used in the file for decimal points.
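The arguments described above can be tried out on a small file. The sketch below is only an illustration: the file and its contents are invented, and it assumes a simple file with no quotes or comment characters, for which read.csv() and read.table() with matching arguments should give the same result.

```r
# Hypothetical sample file created only for this illustration
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3.5", "2,4.25"), tmp)

# read.csv() with its default arguments
a <- read.csv(tmp)

# read.table() with header, sep and dec set by hand
b <- read.table(tmp, header = TRUE, sep = ",", dec = ".")

# Compare the two data frames
all.equal(a, b)

# Without header = TRUE, read.table() treats the first line as data
c_raw <- read.table(tmp, sep = ",")
c_raw

unlink(tmp)
```

Note that read.csv() also differs from read.table() in a few other defaults (quote, fill and comment.char), so the two calls are not interchangeable for every file.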
Reading a local file
- To import a local .txt or a .csv file, the syntax would be:
# Read a txt file, named "mtcars.txt"
my_data <- read.delim("mtcars.txt")
# Read a csv file, named "mtcars.csv"
my_data <- read.csv("mtcars.csv")
The R code above assumes that the file “mtcars.txt” or “mtcars.csv” is in your current working directory. To find your current working directory, run getwd() in the R console.
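If you are not sure whether the file is where R expects it, a quick check before reading can save a confusing error message. This is a small sketch using the file name from the example above; the sub-folder name "data" is a hypothetical placeholder.

```r
# Show the current working directory
getwd()

# Check that the file exists before trying to read it
file_name <- "mtcars.csv"
if (file.exists(file_name)) {
  my_data <- read.csv(file_name)
} else {
  message("File not found in ", getwd(), ": ", file_name)
}

# A full path can also be built with file.path()
# ("data" is a hypothetical sub-folder name)
full_path <- file.path(getwd(), "data", file_name)
full_path
```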
- It’s also possible to choose a file interactively using the function file.choose(), which I recommend if you’re a beginner in R programming:
# Read a txt file
my_data <- read.delim(file.choose())
# Read a csv file
my_data <- read.csv(file.choose())
If you use the R code above in RStudio, you will be asked to choose a file.
If your data contains columns with text, R may import those columns as factors, that is, as grouping variables (e.g.: “good”, “good”, “bad”, “bad”, “bad”). This is the default behavior in R versions before 4.0.0; since R 4.0.0, text columns are imported as character by default. If you don’t want your text data to be converted to factors, add stringsAsFactors = FALSE in the read.delim(), read.csv() and read.table() functions. In this case, the data frame columns corresponding to strings in your text file will be of type character.
For example:
my_data <- read.delim(file.choose(),
                      stringsAsFactors = FALSE)
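The effect of this argument can be checked on a small file. The sketch below uses an invented tab-separated file written to a temporary location and reads it both ways.

```r
# Hypothetical sample file with one text column
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tquality", "1\tgood", "2\tbad", "3\tbad"), tmp)

as_text   <- read.delim(tmp, stringsAsFactors = FALSE)
as_factor <- read.delim(tmp, stringsAsFactors = TRUE)

class(as_text$quality)     # "character"
class(as_factor$quality)   # "factor"
levels(as_factor$quality)  # the distinct values found in the column

unlink(tmp)
```

Setting the argument explicitly, as done here, makes the result independent of the default in your R version.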
- If your field separator is, for example, “|”, you can use the general function read.table() with additional arguments:
my_data <- read.table(file.choose(),
                      sep = "|", header = TRUE, dec = ".")
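Here is a self-contained version of the same idea that does not need an interactive file chooser. The pipe-delimited file and its column names are made up for this illustration.

```r
# Hypothetical pipe-delimited file
tmp <- tempfile(fileext = ".txt")
writeLines(c("name|group|value", "IND1|A|10.5", "IND2|B|7"), tmp)

# General function with the separator given explicitly
my_data <- read.table(tmp, sep = "|", header = TRUE, dec = ".")
str(my_data)

# The wrappers accept sep as well; read.delim() already has header = TRUE
same <- read.delim(tmp, sep = "|")

unlink(tmp)
```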
Reading a file from the internet
It’s possible to use the functions read.delim(), read.csv() and read.table() to import files from the web.
my_data <- read.delim("https://www.sthda.com/upload/boxplot_format.txt")
head(my_data)
Nom variable Group
1 IND1 10 A
2 IND2 7 A
3 IND3 20 A
4 IND4 14 A
5 IND5 14 A
6 IND6 12 A
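Reading from the web can fail, for example when there is no connection or the address has changed. One possible way to handle this is to wrap the call in tryCatch(). The helper name read_delim_safely() below is hypothetical, not part of base R, and the sketch is demonstrated on local temporary files so that it runs without a network.

```r
# Hypothetical helper: returns a data frame, or NULL if the source
# (a local path or a URL) cannot be read
read_delim_safely <- function(src, ...) {
  tryCatch(
    read.delim(src, ...),
    error = function(e) {
      message("Could not read '", src, "': ", conditionMessage(e))
      NULL
    }
  )
}

# Demonstration with a local file standing in for a remote one
tmp <- tempfile(fileext = ".txt")
writeLines(c("Nom\tvariable\tGroup", "IND1\t10\tA", "IND2\t7\tA"), tmp)
ok <- read_delim_safely(tmp)
head(ok)

# A source that does not exist gives NULL instead of stopping the script
missing_src <- file.path(tempdir(), "no_such_file.txt")
failed <- read_delim_safely(missing_src)
is.null(failed)

unlink(tmp)
```

The same helper can be called with the URL used above in place of the local path; when the connection works it should behave like the plain read.delim() call.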
Summary
- Import a local .txt file: read.delim(file.choose())
- Import a local .csv file: read.csv(file.choose())
- Import a file from the internet: read.delim(url) for a .txt file or read.csv(url) for a .csv file
Infos
This analysis has been performed using R (ver. 3.2.3).