From SRA to FASTQ file

This analysis was performed using R (ver. 3.1.0).


  • Automatically download sequencing data from the Sequence Read Archive (SRA)
  • Convert SRA files to FASTQ format

Download SRA file

The data from Popovic et al. (GEO accession number: GSE57478) were used in the following example. The SRA files are available here:

The download process can be automated using R.

STEP 1. Download a table of the metadata into a CSV file “SraRunInfo.csv”:

From the SRA web page:

  • Click on “Send to” (top right corner)
  • Select “File”
  • Select format “RunInfo”
  • Click on “Create File”

STEP 2. Read this CSV file “SraRunInfo.csv” into R:

The SRA files are automatically downloaded into the current working directory using the following R script:

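# Read the SRA run metadata
sri <- read.csv("SraRunInfo.csv", stringsAsFactors = FALSE)
# Local file names: the last path component of each download URL
files <- basename(sri$download_path)
for (i in seq_along(files)) download.file(sri$download_path[i], files[i])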
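
The loop above downloads every file again each time it runs and stops at the first error. A minimal sketch of a more defensive variant follows. It assumes the same download_path column; the helper name download_sra and its fetch argument are hypothetical and introduced only for illustration.

```r
# Hypothetical helper: download each URL unless the target file already exists.
# `fetch` defaults to download.file; it is a parameter only so that the logic
# can be tried without network access.
download_sra <- function(urls, dest = basename(urls), fetch = download.file) {
  status <- character(length(urls))
  for (i in seq_along(urls)) {
    if (file.exists(dest[i])) {
      status[i] <- "skipped"
      next
    }
    # mode = "wb" writes the file in binary mode, which matters on Windows
    ok <- tryCatch(fetch(urls[i], dest[i], mode = "wb") == 0,
                   error = function(e) FALSE)
    status[i] <- if (isTRUE(ok)) "downloaded" else "failed"
  }
  data.frame(file = dest, status = status, stringsAsFactors = FALSE)
}

# Possible usage with the metadata table read above:
# res <- download_sra(sri$download_path)
# res[res$status == "failed", ]
```

The returned table makes it easy to see which runs still need to be fetched before moving on to the conversion step.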

This article describes just one way to automate the download of SRA files from R. Users can also use wget (Unix/Linux) or curl (Mac OS X), or download the files from a web browser.


Convert SRA to FASTQ format

To convert the example data to FASTQ, use the fastq-dump command from the SRA Toolkit on each SRA file. To install SRA Toolkit click here.

R can be used to construct the required shell commands and to automate the process, starting from the “SraRunInfo.csv” metadata table, as follows:

# Ensure that all the files have been downloaded successfully
# Remember, the R object files was created in the previous code chunk
stopifnot( all(file.exists(files)) )
for(f in files) {
  cmd = paste("fastq-dump --split-3", f)
  cat(cmd,"\n") # print the current command
  system(cmd) # invoke command
}
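
The loop above does not check whether each command succeeded. A minimal sketch that records the exit status of every conversion is given below. It assumes fastq-dump is on the PATH; the helper name run_fastq_dump and its run argument are hypothetical.

```r
# Hypothetical helper: run fastq-dump on each file and record the exit status.
# `run` defaults to system2; it is a parameter only so that the logic can be
# tried without the SRA Toolkit installed.
run_fastq_dump <- function(files, run = system2) {
  status <- vapply(files, function(f) {
    # shQuote protects file names that contain spaces or shell metacharacters
    as.integer(run("fastq-dump", c("--split-3", shQuote(f))))
  }, integer(1))
  failed <- names(status)[status != 0]
  if (length(failed) > 0) {
    warning("fastq-dump failed for: ", paste(failed, collapse = ", "))
  }
  invisible(status)
}

# Possible usage with the files object created earlier:
# status <- run_fastq_dump(files)
```

A non-zero status marks the files whose conversion should be repeated.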

fastq-dump can also be run manually from the Unix shell:

fastq-dump --split-3 SRR1282056.sra 

Be sure to use the --split-3 option, which splits mate-pair reads into separate files. After this command, single-end and paired-end data will produce one or two FASTQ files, respectively. For paired-end data, the file names will be suffixed _1.fastq and _2.fastq; otherwise, a single file with the extension .fastq will be produced.
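
The naming rule described above can be sketched in R to predict which FASTQ files each run should produce, and then to check that they exist. The sketch assumes that the layout of each run is known as "SINGLE" or "PAIRED" (for example from a LibraryLayout column in the RunInfo table, which is an assumption here); the helper name expected_fastq and the run names in the example are hypothetical.

```r
# Hypothetical helper: list the FASTQ files expected for each run,
# given its library layout ("SINGLE" or "PAIRED").
expected_fastq <- function(run, layout) {
  paired <- toupper(layout) == "PAIRED"
  out <- lapply(seq_along(run), function(i) {
    if (paired[i]) {
      paste0(run[i], c("_1.fastq", "_2.fastq"))
    } else {
      paste0(run[i], ".fastq")
    }
  })
  names(out) <- run
  out
}

# Example with made-up run names and layouts
fq <- expected_fastq(c("RUN_A", "RUN_B"), c("PAIRED", "SINGLE"))
# Runs for which at least one expected file is absent from the working directory
incomplete <- names(Filter(function(f) !all(file.exists(f)), fq))
```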


