Entering edit mode
Hello everyone,
I am using ShortRead to do some pre-processing for 100bp Illumina
reads.
I used qa() to read to do an upfront analysis, and then used
trimEnds() to
trim my reads by quality, and now I would like to redo the qa() to see
a
'before and after' type of thing. After running it I get this error:
> qas <- lapply(seq_along(fls), function(i, fls) qa(readFastq(fls[i]),
names(fls)[i]), fls)
Error in density.default(qscore) : 'x' contains missing values
Calls: lapply ... .qa_qdensity -> density -> density ->
density.default
Execution halted
I ran the first 100 reads of one of the files and it worked just fine.
My scripts look like this:
To trim the reads
the first few lines are just parsing to get a nice looking name for
the
'newfls'
oldfls <-
c("/data/smcintur/RNASeq/NMseq/Sample_CRT1/CRT1_ATTCCT_L005_R1_001.fas
tq",
"/data/smcintur/RNASeq/NMseq/Sample_CRT1/CRT1_ATTCCT_L006_R1_001.fastq
",
"/data/smcintur/RNASeq/NMseq/Sample_CRT1/CRT1_ATTCCT_L007_R1_001.fastq
")
flsSplit <- strsplit(oldfls, "/")
FileSplit <- unlist(lapply(flsSplit, function(x) paste(x[1:6],
collapse =
"/")))
readFiles <- unlist(lapply(flsSplit, function(x) x[7]))
newfls <-
paste("/scratch/smcintur/Sample_CRT1/",unlist(lapply(strsplit(readFile
s,
"_"),
function(x) paste(x[1], "Clean",
x[3],
sep = "_"))),sep ="")
newfls <- paste(newfls, ".fastq", sep = "")
reads <- readFastq(oldfls[1])
treads <- trimEnds(reads, Cutoff)
rm(reads)
writeFastq(treads, file = newfls[1])
rm(treads)
and then the same thing for the following two reads.
And then to check quality
again, parsing up front for pretty names, and then a qa() call.
Note that this script worked just fine to process the reads before
they were
library(ShortRead)
fls <- list.files(pattern = ".fastq")
flsSplit <- strsplit(fls, split = "_")
Lib <- lapply(flsSplit, function(x) x[1])
Lane <- lapply(flsSplit, function(x) x[3])
names(fls) <- paste(Lib, Lane, sep = "_")
names(fls) <- gsub(".fastq", "", names(fls))
outname <- paste("QA_", Lib[1], ".rda", sep = "")
qas <- lapply(seq_along(fls), function(i, fls) qa(readFastq(fls[i]),
names(fls)[i]), fls)
QA <- do.call(rbind, qas)
save(QA, file = outname)
Does anyone have an idea of what is going on?
> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods
[8] base
other attached packages:
[1] ShortRead_1.18.0 latticeExtra_0.6-24 RColorBrewer_1.0-5
[4] Rsamtools_1.12.3 lattice_0.20-15 Biostrings_2.28.0
[7] GenomicRanges_1.12.4 IRanges_1.18.1 BiocGenerics_0.6.0
loaded via a namespace (and not attached):
[1] Biobase_2.20.0 bitops_1.0-5 grid_3.0.0 hwriter_1.3
stats4_3.0.0
[6] zlibbioc_1.6.0
--
Sam McInturf
[[alternative HTML version deleted]]