Reading Data From TXT|CSV Files: R Base Functions
Previously, we described the essentials of R programming and some best practices for preparing your data.
Launch RStudio as described here: Running RStudio and setting up your working directory
Prepare your data as described here: Best practices for preparing your data
R base functions for importing data
The R base function read.table() is a general function that can be used to read a file in table format. The data will be imported as a data frame.
Note that, depending on the format of your file, several variants of read.table() are available to make your life easier, including read.csv(), read.csv2(), read.delim() and read.delim2().
- read.csv(): for reading “comma separated value” files (“.csv”).
- read.csv2(): variant used in countries that use a comma “,” as the decimal mark and a semicolon “;” as the field separator.
- read.delim(): for reading “tab-separated value” files (“.txt”). By default, a point (“.”) is used as the decimal mark.
- read.delim2(): for reading “tab-separated value” files (“.txt”). By default, a comma (“,”) is used as the decimal mark.
The simplified formats of these functions are as follows:
# Read tabular data into R
read.table(file, header = FALSE, sep = "", dec = ".")
# Read "comma separated value" files (".csv")
read.csv(file, header = TRUE, sep = ",", dec = ".", ...)
# Or use read.csv2: variant used in countries that
# use a comma as decimal point and a semicolon as field separator.
read.csv2(file, header = TRUE, sep = ";", dec = ",", ...)
# Read TAB delimited files
read.delim(file, header = TRUE, sep = "\t", dec = ".", ...)
read.delim2(file, header = TRUE, sep = "\t", dec = ",", ...)
- file: the path to the file containing the data to be imported into R.
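- sep: the field separator character. “\t” is used for tab-delimited files. The default in read.table(), sep = "", means any white space.
- header: logical value. If TRUE, the first row of the file is read as the column names. read.table() defaults to header = FALSE, whereas read.csv(), read.csv2(), read.delim() and read.delim2() default to header = TRUE. If your file has no header row, set header = FALSE.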
- dec: the character used in the file for decimal points.
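As a rough illustration of how header, sep and dec work together, the sketch below writes a small made-up file and reads it in two ways. The file contents and the names tmp, via_csv2 and via_table are assumptions chosen for this example only.

```r
# Create a small semicolon-separated file with decimal commas (made-up data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("name;value", "a;1,5", "b;2,25"), tmp)

# read.csv2() presets header = TRUE, sep = ";" and dec = ","
via_csv2 <- read.csv2(tmp)

# The same settings spelled out with the general read.table()
via_table <- read.table(tmp, header = TRUE, sep = ";", dec = ",")

# The "value" column is numeric because dec = "," was applied
str(via_csv2)

# Compare the two results
all.equal(via_csv2, via_table)
```

Spelling the arguments out with read.table() is mostly useful when your file does not match any of the preset variants.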
Reading a local file
- To import a local .txt or a .csv file, the syntax would be:
# Read a txt file, named "mtcars.txt"
my_data <- read.delim("mtcars.txt")
# Read a csv file, named "mtcars.csv"
my_data <- read.csv("mtcars.csv")
The above R code assumes that the file “mtcars.txt” or “mtcars.csv” is in your current working directory. To find out your current working directory, type getwd() in the R console.
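If you are unsure whether a file is in the working directory, a check along the following lines can help. This is only a sketch: it works in a temporary directory and creates its own copy of “mtcars.txt” from the built-in mtcars data set, so the file written here is an assumption for the example, not your real data.

```r
# Work in a temporary directory so nothing in your project is touched
old_wd <- setwd(tempdir())

# Create a tab-delimited example file from the built-in mtcars data set
write.table(mtcars, "mtcars.txt", sep = "\t", quote = FALSE, row.names = FALSE)

# Show the current working directory
getwd()

# Read the file only if it exists in the working directory
if (file.exists("mtcars.txt")) {
  my_data <- read.delim("mtcars.txt")
  print(dim(my_data))
} else {
  message("mtcars.txt was not found in ", getwd())
}

# Restore the previous working directory
setwd(old_wd)
```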
- It’s also possible to choose a file interactively using the function file.choose(), which I recommend if you’re a beginner in R programming:
# Read a txt file
my_data <- read.delim(file.choose())
# Read a csv file
my_data <- read.csv(file.choose())
If you use the R code above in RStudio, you will be asked to choose a file.
If your data contains columns with text, R may treat those columns as factors, that is, as grouping variables (e.g.: “good”, “good”, “bad”, “bad”, “bad”). In R versions before 4.0.0, this conversion is the default; since R 4.0.0, the default is stringsAsFactors = FALSE. If you don’t want your text data to be converted to factors, add stringsAsFactors = FALSE in the read.delim(), read.csv() and read.table() functions. In this case, the data frame columns corresponding to strings in your text file will be of type character.
my_data <- read.delim(file.choose(), stringsAsFactors = FALSE)
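The effect of this argument can be seen on a small made-up file. In the sketch below, the file contents and the names tmp, as_chr and as_fct are assumptions for illustration.

```r
# Create a small tab-delimited file with a text column (made-up data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tquality", "1\tgood", "2\tgood", "3\tbad"), tmp)

# Keep text columns as character vectors
as_chr <- read.delim(tmp, stringsAsFactors = FALSE)
class(as_chr$quality)

# Convert text columns to factors
as_fct <- read.delim(tmp, stringsAsFactors = TRUE)
class(as_fct$quality)
levels(as_fct$quality)
```

Setting the argument explicitly makes the result independent of the R version you are running.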
- If your field separator is, for example, “|”, you can use the general function read.table() with additional arguments:
my_data <- read.table(file.choose(), sep = "|", header = TRUE, dec = ".")
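To try this without choosing a file interactively, the following sketch builds a small pipe-separated file first. The data and the variable name tmp are made up for the example.

```r
# Create a small file that uses "|" as the field separator (made-up data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("name|score", "IND1|10.5", "IND2|7.25"), tmp)

# Read it with the general read.table() function
my_data <- read.table(tmp, sep = "|", header = TRUE, dec = ".")

# Inspect the column types
str(my_data)
```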
Reading a file from the internet
It’s possible to use the functions read.delim(), read.csv() and read.table() to import files from the web.
my_data <- read.delim("http://www.sthda.com/upload/boxplot_format.txt")
head(my_data)
   Nom variable Group
1 IND1       10     A
2 IND2        7     A
3 IND3       20     A
4 IND4       14     A
5 IND5       14     A
6 IND6       12     A
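Reading from the web can fail if the address is unreachable or you are offline. One possible way to handle this, shown here only as a sketch, is to wrap the call in tryCatch() so that the script continues with a message instead of stopping. The fallback to NULL is a design choice made for this example.

```r
# URL taken from the example above
url <- "http://www.sthda.com/upload/boxplot_format.txt"

# Try to read the remote file; return NULL if an error occurs
my_data <- tryCatch(
  read.delim(url),
  error = function(e) {
    message("Could not read the file: ", conditionMessage(e))
    NULL
  }
)

# Only inspect the data if the import succeeded
if (!is.null(my_data)) {
  print(head(my_data))
}
```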
- Import a local .txt file: read.delim(file.choose())
- Import a local .csv file: read.csv(file.choose())
- Import a file from the internet: read.delim(url) for a txt file or read.csv(url) for a csv file
This analysis has been performed using R (ver. 3.2.3).