Reading Data From TXT|CSV Files: R Base Functions


Previously, we described the essentials of R programming and some best practices for preparing your data.


In this article, you’ll learn how to import data from .txt (tab-separated values) and .csv (comma-separated values) file formats into R.


Preliminary tasks

  1. Launch RStudio as described here: Running RStudio and setting up your working directory

  2. Prepare your data as described here: Best practices for preparing your data

R base functions for importing data

The R base function read.table() is a general function that can be used to read a file in table format. The data will be imported as a data frame.

Note that, depending on the format of your file, several variants of read.table() are available to make your life easier, including read.csv(), read.csv2(), read.delim() and read.delim2().


  • read.csv(): for reading “comma-separated value” files (“.csv”).
  • read.csv2(): variant used in countries that use a comma “,” as the decimal point and a semicolon “;” as the field separator.
  • read.delim(): for reading “tab-separated value” files (“.txt”). By default, a point (“.”) is used as the decimal point.
  • read.delim2(): for reading “tab-separated value” files (“.txt”). By default, a comma (“,”) is used as the decimal point.
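As a minimal sketch of the decimal-comma variant, the example below writes a small semicolon-separated file to a temporary location and reads it back with read.csv2(). The file contents and the column names (name, height) are made up for illustration.

```r
# Create a small semicolon-separated file with decimal commas (illustrative data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("name;height", "a;1,5", "b;2,25"), tmp)

# read.csv2() defaults to sep = ";" and dec = ",",
# so "1,5" is parsed as the number 1.5
d <- read.csv2(tmp)
d$height
is.numeric(d$height)
```

The same file read with read.csv() would be split on the commas instead, which is why it matters to pick the variant that matches the file's conventions.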


The simplified formats of these functions are as follows:

# Read tabular data into R
read.table(file, header = FALSE, sep = "", dec = ".")
# Read "comma separated value" files (".csv")
read.csv(file, header = TRUE, sep = ",", dec = ".", ...)
# Or use read.csv2: variant used in countries that 
# use a comma as decimal point and a semicolon as field separator.
read.csv2(file, header = TRUE, sep = ";", dec = ",", ...)
# Read TAB delimited files
read.delim(file, header = TRUE, sep = "\t", dec = ".", ...)
read.delim2(file, header = TRUE, sep = "\t", dec = ",", ...)

  • file: the path to the file containing the data to be imported into R.
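  • sep: the field separator character. “\t” is used for tab-delimited files.
  • header: logical value. If TRUE, the function assumes that your file has a header row, so row 1 contains the name of each column. If that’s not the case, add the argument header = FALSE. Note that read.table() defaults to header = FALSE, whereas read.csv(), read.csv2(), read.delim() and read.delim2() default to header = TRUE.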
  • dec: the character used in the file for decimal points.
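To make the role of these arguments concrete, here is a small sketch that reads the same file with read.csv() and with read.table() using explicit arguments. The temporary file and its columns (id, score) are invented for the example.

```r
# Create a small comma-separated file (illustrative data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3.5", "2,4.0"), tmp)

# read.csv() already sets header = TRUE, sep = "," and dec = "."
a <- read.csv(tmp)

# The general read.table() needs those arguments spelled out
b <- read.table(tmp, header = TRUE, sep = ",", dec = ".")
str(b)

# Without header = TRUE, the first row is treated as data and
# the columns receive the default names V1, V2
no_header <- read.table(tmp, sep = ",")
names(no_header)
```

For a simple file like this one, both calls should give the same data frame; the variants mainly save you from typing the separator and decimal arguments.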


Reading a local file

  • To import a local .txt or a .csv file, the syntax would be:
# Read a txt file, named "mtcars.txt"
my_data <- read.delim("mtcars.txt")
# Read a csv file, named "mtcars.csv"
my_data <- read.csv("mtcars.csv")

The above R code assumes that the file “mtcars.txt” or “mtcars.csv” is in your current working directory. To find out your current working directory, run the function getwd() in the R console.
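If you are unsure whether the file is where R expects it, a quick check along these lines can help. This is only a sketch: it assumes the file is named “mtcars.txt” as in the example above, and the variable name path is arbitrary.

```r
# Show the current working directory
getwd()

# List the .txt and .csv files that R can see there
list.files(pattern = "\\.(txt|csv)$")

# Build the full path and only read the file if it exists
path <- file.path(getwd(), "mtcars.txt")
if (file.exists(path)) {
  my_data <- read.delim(path)
} else {
  message("File not found: ", path)
}
```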

  • It’s also possible to choose a file interactively using the function file.choose(), which I recommend if you’re a beginner in R programming:
# Read a txt file
my_data <- read.delim(file.choose())
# Read a csv file
my_data <- read.csv(file.choose())

If you use the R code above in RStudio, you will be asked to choose a file.

If your data contains columns with text, R may treat those columns as factors or grouping variables (e.g.: “good”, “good”, “bad”, “bad”, “bad”). This conversion was the default before R 4.0.0. If you don’t want your text data to be converted to factors, add stringsAsFactors = FALSE in the read.delim(), read.csv() and read.table() functions. In this case, the data frame columns corresponding to strings in your text file will be of type character.

For example:

my_data <- read.delim(file.choose(), 
                      stringsAsFactors = FALSE)
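The effect of this argument can be seen in a small, self-contained sketch. The tab-separated file below is written to a temporary location and its contents (id, quality) are invented for illustration; setting the argument explicitly in both calls makes the result independent of the R version's default.

```r
# Create a small tab-separated file with a text column (illustrative data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tquality", "1\tgood", "2\tbad", "3\tbad"), tmp)

# Text column converted to a factor
as_factor <- read.delim(tmp, stringsAsFactors = TRUE)
class(as_factor$quality)  # "factor"

# Text column kept as plain character strings
as_text <- read.delim(tmp, stringsAsFactors = FALSE)
class(as_text$quality)    # "character"
```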
  • If your field separator is, for example, “|”, it’s possible to use the general function read.table() with additional arguments:
my_data <- read.table(file.choose(), 
                      sep ="|", header = TRUE, dec =".")
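As a runnable variant of the call above, the sketch below creates a small pipe-separated file in a temporary location instead of choosing one interactively. The file contents and column names (name, value) are assumptions made for the example.

```r
# Create a small "|"-separated file (illustrative data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("name|value", "x|1.5", "y|2.5"), tmp)

# Read it with the general function, giving the separator explicitly
d <- read.table(tmp, sep = "|", header = TRUE, dec = ".")
names(d)  # "name" "value"
dim(d)    # 2 2
```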

Reading a file from the internet

It’s possible to use the functions read.delim(), read.csv() and read.table() to import files from the web.

my_data <- read.delim("http://www.sthda.com/upload/boxplot_format.txt")
head(my_data)
   Nom variable Group
1 IND1       10     A
2 IND2        7     A
3 IND3       20     A
4 IND4       14     A
5 IND5       14     A
6 IND6       12     A
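Reading from the web can fail if the connection is down or the file has moved. One way to handle this, shown here only as a sketch, is to wrap the call in tryCatch() so that the script continues with a NULL result instead of stopping. The variable name data_url is arbitrary.

```r
data_url <- "http://www.sthda.com/upload/boxplot_format.txt"

# Try to read the remote file; return NULL with a message on failure
my_data <- tryCatch(
  read.delim(data_url),
  error = function(e) {
    message("Could not read ", data_url, ": ", conditionMessage(e))
    NULL
  }
)

# Only inspect the data if the download succeeded
if (!is.null(my_data)) {
  str(my_data)
}
```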

Summary


  • Import a local .txt file: read.delim(file.choose())

  • Import a local .csv file: read.csv(file.choose())

  • Import a file from the internet: read.delim(url) for a txt file or read.csv(url) for a csv file


Info

This analysis has been performed using R (ver. 3.2.3).

