Davis, Wade
Hi Farida,
I used to use the PROcess package extensively, but I haven't used it much
in the past 2 years. I ran into the same problem that you did, so I wrote
a modified version of the rmBaseline function that fixes it and
does some other things that you may find handy later on. The parts
that should interest you most are the highcut and lowcut options.
The normed and rawout options are not used that often.
I last ran this code under R 2.6.1, so I am not sure if it will work
without a few tweaks.
Good luck,
Wade
rmBaseline2 <- function(fldr, bseoffrda = NULL, breaks = 200, qntl = 0,
                        method = "loess",
                        lowcut = 0, highcut = 195000, bw = 0.1,
                        rawout = FALSE, normed = FALSE,
                        SpecNames = list.files(fldr, pattern = "\\.*csv\\.*"))
{
  ##################################################
  ## modified BATCH function for baseline subtraction
  ##################################################
  # Modified version of the rmBaseline function in the PROcess package.
  # This version allows you to specify the mass range to consider
  # for baseline removal via the inputs lowcut and highcut.
  # This was written to account for minor differences in the spectra
  # length due to the laser firing for slightly different lengths of time.
  #
  # Use rawout=TRUE if you want all of the spectra read in and
  # stored in a matrix without actually baseline subtracting.
  # (This is useful for taking advantage of plotting routines
  # that were originally written for spectra after they had been
  # baseline subtracted.)
  #
  # The use of normed=TRUE is rarer. It was written as part of an
  # exploratory analysis I did to see if it made a difference if you
  # normalized, then baseline subtracted, rather than the traditional
  # process of baseline subtracting and then normalizing.
  # My analysis showed that it made no appreciable difference, so I am
  # sticking with the status quo.
  SpecNames.abbrev <- unlist(strsplit(SpecNames, split = " [0-9]{3} "))[seq(2, 2*length(SpecNames), 2)]
  if (normed == FALSE) {
    fs <- SpecNames
    n <- length(fs)
    # peek at dimensions to create empty matrix
    ftemp <- read.files(file.path(fldr, paste(SpecNames), fsep = "\\")[1])
    ftemp2 <- ftemp[ftemp[, 1] > lowcut & ftemp[, 1] < highcut, ]
    bseoffM <- matrix(data = 0.0123456, ncol = n, nrow = dim(ftemp2)[1])
    for (j in 1:n) {
      f1 <- read.files(file.path(fldr, paste(SpecNames), fsep = "\\")[j])
      fcut <- f1[f1[, 1] > lowcut & f1[, 1] < highcut, ]
      if (rawout == FALSE) {
        bseoffM[, j] <- bslnoff(fcut, breaks = breaks, qntl = qntl,
                                method = method, bw = bw)[, 2]
      }
      if (rawout == TRUE) {
        bseoffM[, j] <- fcut[, 2]
      }
      if (j == 1) {
        # row names are the m/z values of the first spectrum
        rownames(bseoffM) <- signif(bslnoff(fcut, breaks = breaks, qntl = qntl,
                                            method = method, bw = bw)[, 1], 6)
      }
    }
    colnames(bseoffM) <- SpecNames
  }
  if (normed == TRUE) {
    # here fldr is a matrix of spectra whose row names are the m/z values
    fs <- fldr
    n <- ncol(fs)
    for (j in 1:n) {
      f1 <- cbind(as.numeric(rownames(fs)), fs[, j])
      fcut <- f1[f1[, 1] > lowcut & f1[, 1] < highcut, ]
      bseoff <- bslnoff(fcut, breaks = breaks, qntl = qntl,
                        method = method, bw = bw)
      if (j > 1)
        bseoffM <- cbind(bseoffM, bseoff[, 2])
      else bseoffM <- bseoff[, 2]
    }
    dimnames(bseoffM) <- list(signif(bseoff[, 1], 6),
                              SpecNames = colnames(fldr))
  }
  if (!is.null(bseoffrda))
    save(bseoffM, file = bseoffrda)  # save() takes the object itself; list= expects names as strings
  bseoffM
}
##EXAMPLE
# rmBaseline2(fldr=seldipath(basedir="W:\\Master6\\Raw Specta",chiptype="IMAC",inten="high")
# ,breaks = 2
# ,qntl = 0
# ,method = "approx"
# ,bw = 0.1,
# highcut=50000
# )
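
To illustrate why the lowcut/highcut window helps, here is a minimal base-R sketch that does not need the PROcess package. It assumes two made-up spectra (specA and specB, hypothetical names and values) that share the same m/z grid but where one ran slightly longer, which is the situation described in the comments above. trim_window is a helper introduced only for this illustration; it mirrors the f1[f1[, 1] > lowcut & f1[, 1] < highcut, ] filter used in rmBaseline2.

```r
# Hypothetical helper (not part of PROcess): keep rows whose m/z (column 1)
# lies strictly inside (lowcut, highcut), as rmBaseline2 does.
trim_window <- function(spec, lowcut = 0, highcut = 195000) {
  spec[spec[, 1] > lowcut & spec[, 1] < highcut, , drop = FALSE]
}

# Made-up spectra on a shared m/z grid; specB has two extra points at the end.
mzA <- seq(1000, 10000, by = 1000)   # 10 points
mzB <- seq(1000, 12000, by = 1000)   # 12 points
specA <- cbind(mz = mzA, intensity = seq_along(mzA))
specB <- cbind(mz = mzB, intensity = 2 * seq_along(mzB))

# Untrimmed, the lengths differ, so the spectra cannot share one matrix.
nrow(specA) == nrow(specB)

# Trimmed to a common window, both have the same number of rows and
# the same m/z values, so they can be bound column-wise.
cutA <- trim_window(specA, lowcut = 0, highcut = 9500)
cutB <- trim_window(specB, lowcut = 0, highcut = 9500)
combined <- cbind(A = cutA[, 2], B = cutB[, 2])
rownames(combined) <- signif(cutA[, 1], 6)
combined
```

Note that trimming only equalizes the lengths when the spectra agree on the m/z values inside the window. Because the row names are taken from the first spectrum, they will not describe the other columns exactly if the m/z values themselves differ.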
-----Original Message-----
From: Farida Mostajabi [mailto:f0most01@louisville.edu]
Sent: Monday, November 30, 2009 2:48 PM
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] Question on PROcess package
To whom it may concern,
I am a student at the University of Louisville, USA. I am currently
doing some MALDI-TOF MS data analysis research
with the PROcess package.
I am trying to use the batch functionality of the package to
preprocess 286 spectra. The m/z values
are not exactly the same across the spectra, and I think identical
m/z values are an assumption of the PROcess package.
I used the code below to do baseline correction for one spectrum at
a time:
B.fs <- list.files(my.B.files, pattern = "\\.*csv\\.*", full.names = TRUE)
nb.file <- length(B.fs)
foo <- lapply(seq(nb.file), function(i) read.files(B.fs[i]))
f0 <- lapply(seq(nb.file), function(i) foo[[i]][foo[[i]][,1] > 0, ])
basecorr <- lapply(seq(nb.file), function(i) bslnoff(f0[[i]], method = "loess", bw = 0.1))
I could not use "rmBaseline" function since the row-names of the
returning matrix are the m/z values, which in my case, are not
identical.
Would you please give some suggestions on this issue?
Best Regards,
Farida