Reading Data From TXT|CSV Files: R Base Functions


Previously, we described the essentials of R programming and some best practices for preparing your data.


In this article, you’ll learn how to import data from .txt (tab-separated values) and .csv (comma-separated values) file formats into R.



Preliminary tasks

  1. Launch RStudio as described here: Running RStudio and setting up your working directory

  2. Prepare your data as described here: Best practices for preparing your data

R base functions for importing data

The R base function read.table() is a general function that can be used to read a file in table format. The data will be imported as a data frame.

Note that, depending on the format of your file, several variants of read.table() are available to make your life easier, including read.csv(), read.csv2(), read.delim() and read.delim2().


  • read.csv(): for reading “comma separated value” files (“.csv”).
  • read.csv2(): variant used in countries that use a comma “,” as the decimal mark and a semicolon “;” as the field separator.
  • read.delim(): for reading “tab-separated value” files (“.txt”). By default, a point (“.”) is used as the decimal mark.
  • read.delim2(): for reading “tab-separated value” files (“.txt”). By default, a comma (“,”) is used as the decimal mark.


The simplified formats of these functions are as follows:

# Read tabular data into R
read.table(file, header = FALSE, sep = "", dec = ".")
# Read "comma separated value" files (".csv")
read.csv(file, header = TRUE, sep = ",", dec = ".", ...)
# Or use read.csv2: variant used in countries that 
# use a comma as decimal point and a semicolon as field separator.
read.csv2(file, header = TRUE, sep = ";", dec = ",", ...)
# Read TAB delimited files
read.delim(file, header = TRUE, sep = "\t", dec = ".", ...)
read.delim2(file, header = TRUE, sep = "\t", dec = ",", ...)

  • file: the path to the file containing the data to be imported into R.
  • sep: the field separator character. “\t” is used for tab-delimited file.
  • header: logical value. If TRUE, the first row of the file is used as the column names. read.table() defaults to header = FALSE, whereas read.csv(), read.csv2(), read.delim() and read.delim2() default to header = TRUE. If your file has no header row, set header = FALSE.
  • dec: the character used in the file for decimal points.
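
To see how sep and dec work together, here is a minimal sketch. It assumes a small made-up file (the column names and values are purely illustrative) that uses a semicolon as field separator and a comma as decimal mark, written to a temporary location so the example is self-contained.

```r
# Hypothetical sample data: semicolon-separated, comma as decimal mark
tmp <- tempfile(fileext = ".csv")
writeLines(c("name;weight", "a;1,5", "b;2,25"), tmp)

# read.csv2() already uses sep = ";" and dec = ","
d1 <- read.csv2(tmp)

# The general function with the same settings spelled out
d2 <- read.table(tmp, header = TRUE, sep = ";", dec = ",")

str(d1)            # weight should be imported as numeric, not as text
all.equal(d1, d2)  # compares the two imported data frames
```

If dec were left at its default ".", values such as 1,5 would not be recognized as numbers, so the column would be imported as text.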


Reading a local file

  • To import a local .txt or a .csv file, the syntax would be:
# Read a txt file, named "mtcars.txt"
my_data <- read.delim("mtcars.txt")
# Read a csv file, named "mtcars.csv"
my_data <- read.csv("mtcars.csv")

The above R code assumes that the file “mtcars.txt” or “mtcars.csv” is in your current working directory. To find your current working directory, run the function getwd() in the R console.
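
If the file is not in the working directory, a full path can be passed instead. The sketch below is only an illustration: it writes the built-in mtcars data set to a temporary folder (an assumption made so the example runs anywhere) and reads it back using the full path.

```r
# Write the built-in mtcars data set to a temporary folder (illustrative only)
txt_path <- file.path(tempdir(), "mtcars.txt")
write.table(mtcars, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

getwd()                # the folder R searches for relative file names
file.exists(txt_path)  # check that the file is where you expect it

# A full path works regardless of the current working directory
my_data <- read.delim(txt_path)
head(my_data)
```

Checking file.exists() before importing is a quick way to tell a wrong path apart from a malformed file.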

  • It’s also possible to choose a file interactively using the function file.choose(), which I recommend if you’re a beginner in R programming:
# Read a txt file
my_data <- read.delim(file.choose())
# Read a csv file
my_data <- read.csv(file.choose())

If you use the R code above in RStudio, you will be asked to choose a file.

If your data contains columns with text, R may treat those columns as factors, i.e. grouping variables (e.g.: “good”, “good”, “bad”, “bad”, “bad”). This was the default behavior before R 4.0.0; since R 4.0.0, text columns are imported as character by default. If you don’t want your text data to be converted to factors, add stringsAsFactors = FALSE in the read.delim(), read.csv() and read.table() functions. In this case, the data frame columns corresponding to strings in your text file will be of type character.

For example:

my_data <- read.delim(file.choose(), 
                      stringsAsFactors = FALSE)
  • If your field separator is, for example, “|”, you can use the general function read.table() with additional arguments:
my_data <- read.table(file.choose(), 
                      sep = "|", header = TRUE, dec = ".")
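
The following sketch combines a custom separator with the stringsAsFactors argument. The pipe-separated file and its column names are invented for illustration and written to a temporary file instead of being chosen interactively.

```r
# Hypothetical pipe-separated data with one text column
tmp <- tempfile(fileext = ".txt")
writeLines(c("id|quality|score",
             "1|good|3.5",
             "2|bad|1.25",
             "3|good|4.0"), tmp)

# Keep text columns as character strings
as_text <- read.table(tmp, sep = "|", header = TRUE, dec = ".",
                      stringsAsFactors = FALSE)

# Convert text columns to factors (grouping variables)
as_groups <- read.table(tmp, sep = "|", header = TRUE, dec = ".",
                        stringsAsFactors = TRUE)

class(as_text$quality)     # type of the text column without conversion
class(as_groups$quality)   # type of the text column with conversion
levels(as_groups$quality)  # the distinct groups found in the column
```

Setting stringsAsFactors explicitly makes the result independent of the R version in use.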

Reading a file from the internet

It’s possible to use the functions read.delim(), read.csv() and read.table() to import files from the web.

my_data <- read.delim("http://www.sthda.com/upload/boxplot_format.txt")
head(my_data)
   Nom variable Group
1 IND1       10     A
2 IND2        7     A
3 IND3       20     A
4 IND4       14     A
5 IND5       14     A
6 IND6       12     A
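
Remote files can be unavailable, so it may be useful to handle a failed import gracefully. The sketch below assumes a hypothetical helper named read_remote_delim() (not part of base R) that wraps read.delim() in tryCatch() and returns NULL when the file cannot be read. To keep the example independent of a network connection, the failure case is shown with a path that does not exist.

```r
# Hypothetical helper: returns a data frame, or NULL if the import fails
read_remote_delim <- function(path_or_url) {
  tryCatch(
    read.delim(path_or_url, stringsAsFactors = FALSE),
    error = function(e) {
      message("Could not read ", path_or_url, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Failure case: a file that does not exist yields NULL instead of an error
result <- suppressWarnings(
  read_remote_delim(file.path(tempdir(), "no_such_file.txt"))
)
is.null(result)
```

The same helper can be called with a web address such as the one used above; the script then continues even if the server cannot be reached.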

Summary


  • Import a local .txt file: read.delim(file.choose())

  • Import a local .csv file: read.csv(file.choose())

  • Import a file from the internet: read.delim(url) for a .txt file or read.csv(url) for a .csv file


Infos

This analysis has been performed using R (ver. 3.2.3).








