Agi4x44PreProcess: filtering probe names from GeneName
Maria Raeder ▴ 10
@maria-raeder-4550
Dear Mailing List,

I have been struggling for some time with some Agilent single-channel arrays, which I believe were scanned with an earlier version of AFE, because they do not contain the columns Sequence and chr coord. I have tried to use the Agi4x44PreProcess package, with some adjustments; please see below.

My main problem now is that I cannot remove the Agilent probe names which are embedded within the gene symbol column for some genes. The reason for doing this is to prepare files for GSEA analysis. The function for doing this in the Agi4x44PreProcess package, gsea.files, does not work, probably due to the columns I am lacking, and filter.probes also returns an error message, probably for the same reason.

I would be very grateful for any comments and help.

Thanks,
Maria

Here is the code:

library("Agi4x44PreProcess")
library("hgug4112a.db")
library("vsn")
library("convert")
library("GO.db")

setwd("/mydirectory")

# reading targets file
targets = read.targets(infile = "targets_ec3.txt")
targets[1:10, 1:5]
names(targets)
# many columns (skipped here, but they include FileName, Treatment and GErep)

# read in files with limma:
dd <- read.maimages(targets$FileName, source = "agilent",
    columns = list(G = "gMedianSignal", Gb = "gBGUsed",
                   R = "gProcessedSignal", Rb = "gBGMedianSignal"),
    annotation = c("Row", "Col", "FeatureNum", "ControlType", "ProbeName",
                   "ProbeUID", "GeneName", "SystematicName", "Description",
                   "gIsWellAboveBG", "gIsFound", "gIsSaturated",
                   "gIsFeatPopnOL", "gIsFeatNonUnifOL"))
# reads in 146 arrays

########## Quality control (skipped)

########## Background correction, normalization and log2 transformation:
library(vsn)
ddNORM = BGandNorm(dd, BGmethod = "half", NORMmethod = "quantile",
    foreground = "MeanSignal", background = "BGMedianSignal", offset = 50,
    makePLOTpre = FALSE, makePLOTpost = FALSE)

# filtering:
ddFILT = filter.probes(ddNORM, control = TRUE, wellaboveBG = TRUE,
    isfound = TRUE, wellaboveNEG = TRUE, sat = TRUE, PopnOL = TRUE,
    NonUnifOL = TRUE, nas = TRUE, limWellAbove = 75, limISF = 75,
    limNEG = 75, limSAT = 75, limPopnOL = 75, limNonUnifOL = 75,
    limNAS = 100, makePLOT = TRUE, annotation.package = "hgug4112a.db",
    flag.counts = FALSE, targets)

FILTERING PROBES BY FLAGS
FILTERING BY ControlType FLAG
Error in data.frame(PROBE_ID, as.character(probe.chr), as.character(probe.seq), :
  arguments imply differing number of rows: 43376, 0
Annotation probe Agi4x44PreProcess • 1.5k views
@wolfgang-huber-3550
EMBL European Molecular Biology Laborat…
Dear Maria,

I am not sure I understood your question. Anyway, would the 'strsplit' function of R perhaps help you? It allows you to split strings and then extract components. E.g. the idiom

sapply(strsplit(x, ","), "[", 2)

will extract the text between the first and second comma in each string within x.

Best wishes
Wolfgang

On Mar/18/11 2:28 PM, Maria Raeder wrote:
> [...] My main problem now is that I cannot remove the agilent probe names
> which are embedded within the genesymbol column for some genes [...]

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
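A minimal sketch of the strsplit idiom suggested above, plus one way the GeneName problem might be approached. All data here are made-up toy values, and the idea that unannotated features carry a probe ID of the form A_<digits>_P<digits> in GeneName is an assumption, not something confirmed in this thread.

```r
# Toy input (hypothetical): comma-separated strings
x <- c("a,b,c", "d,e,f")

# Split each string on commas, then take the 2nd piece of each result
second <- sapply(strsplit(x, ","), "[", 2)

# Toy stand-in (hypothetical) for the annotation columns of the arrays
genes <- data.frame(
  ProbeName = c("A_23_P100001", "A_23_P100011", "A_24_P100021"),
  GeneName  = c("TP53", "A_23_P100011", "BRCA1"),
  stringsAsFactors = FALSE
)

# Assumption: a GeneName is really a probe ID if it equals ProbeName
# or looks like A_<digits>_P<digits>
is_probe_id <- genes$GeneName == genes$ProbeName |
  grepl("^A_[0-9]+_P[0-9]+$", genes$GeneName)

# Keep only rows whose GeneName looks like a real gene symbol
genes_named <- genes[!is_probe_id, ]
```

With an object read by read.maimages, the annotation columns should be available in dd$genes, so the same logical vector could presumably be used to subset the rows of the data object itself (e.g. ddNORM[!is_probe_id, ]) before writing files for GSEA. This sidesteps gsea.files rather than fixing it.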