xps: hugene11 chip gives problems


Groot, Philip de ▴ 630


Hi Christian,

I am trying to do an analysis using xps and the hugene11 chip. However, I run into problems for which I need your help.

I created a small test script to demonstrate the problem:

    library(xps)
    scheme <- root.scheme("/local2/R-2.15.2/library/xps/schemes/hugene11stv1.root")
    x.xps <- import.data(scheme, "tmp_x", celdir = ".", celfiles = "G092_A05_01_1.1.CEL", verbose = TRUE)
    cat("The loaded .CEL-files are:\n");
    for (i in 1:length(rawCELName(x.xps, fullpath = FALSE)))
      cat(sprintf("%s\n", rawCELName(x.xps, fullpath = FALSE)[i]));

Upon execution, I get:

    > library(xps)

    Welcome to xps version 1.18.1
        an R wrapper for XPS - eXpression Profiling System
        (c) Copyright 2001-2012 by Christian Stratowa

    > scheme <- root.scheme("/local2/R-2.15.2/library/xps/schemes/hugene11stv1.root")
    > x.xps <- import.data(scheme, "tmp_x", celdir = ".", celfiles = "G092_A05_01_1.1.CEL", verbose = TRUE)
    Opening file </local2> in <read> mode...
    Creating new temporary file </mnt>...
    Importing <./G092_A05_01_1.1.CEL> as <g092_a05_01_1.1.cel>...
    hybridization statistics:
       1 cells with minimal intensity 19
       1 cells with maximal intensity 21364.4
    New dataset <dataset> is added to Content...
    >
    > cat("The loaded .CEL-files are:\n");
    The loaded .CEL-files are:
    > for (i in 1:length(rawCELName(x.xps, fullpath = FALSE)))
    +   cat(sprintf("%s\n", rawCELName(x.xps, fullpath = FALSE)[i]));
    Error: Tree set <> could not be found in file content
    Error: Tree set <> could not be found in file content
    NA

The weird thing is: I only have this problem with the hugene11 chip. As far as I can see, all other chips work properly (still na32 based).

This affects all other steps, because there is no "content" to normalise etc.

I created the root scheme as follows:

    scmdir <- paste(.path.package("xps"), "schemes/", sep = "/")
    scheme <- import.exon.scheme("hugene11stv1", filedir = scmdir,
                                 layoutfile = paste(libdir, "HuGene-1_1-st-v1.r4.clf", sep = "/"),
                                 schemefile = paste(libdir, "HuGene-1_1-st-v1.r4.pgf", sep = "/"),
                                 probeset = paste(anndir, "HuGene-1_1-st-v1.na33.1.hg19.probeset.csv", sep = "/"),
                                 transcript = paste(anndir, "HuGene-1_1-st-v1.na33.1.hg19.transcript.csv", sep = "/"),
                                 add.mask = TRUE)

(libdir and anndir are also defined, of course.)

I even updated the na32 annotation to the latest Affymetrix version (na33) to exclude a problem there. It does not fix the issue.

Please note that I am running root version 5.32/04, as version 5.32/01 is no longer available for download. Root works properly as far as I can see.

Do you have any clue where this problem originates from? Thank you!

sessionInfo():

    > sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] xps_1.18.1

    loaded via a namespace (and not attached):
    [1] tools_2.15.2

Regards,

Dr. Philip de Groot
Bioinformatician / Microarray analysis expert
Wageningen University / TIFN
Netherlands Nutrigenomics Center (NNC)
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
PO Box 8129, 6700 EV Wageningen
Visiting Address: "De Valk" ("Erfelijkheidsleer"), Building 304, Verbindingsweg 4, 6703 HC Wageningen
Room: 0052a
T: 0317 485786
F: 0317 483342
E-mail: Philip.deGroot@wur.nl
I: http://humannutrition.wur.nl
https://madmax.bioinformatics.nl
http://www.nutrigenomicsconsortium.nl
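A side note on the listing loop in the test script above: `1:length(x)` iterates over `c(1, 0)` when `x` is empty, so `seq_along()` or a vectorised `writeLines()` is usually the safer idiom. The following is only a minimal base-R sketch; `cel_names` is a hypothetical stand-in for the result of `rawCELName(x.xps, fullpath = FALSE)`, and `list_cel_names` is an illustrative helper, not part of xps.

```r
# Stand-in for rawCELName(x.xps, fullpath = FALSE); the value is illustrative only.
cel_names <- c("G092_A05_01_1.1.CEL")

# Hypothetical helper: build the listing without an index loop.
list_cel_names <- function(x) {
  x <- x[!is.na(x)]                 # drop NA entries such as the one printed in the failing run
  c("The loaded .CEL-files are:", x)
}

writeLines(list_cel_names(cel_names))

# seq_along() yields an empty loop for an empty vector, unlike 1:length().
for (i in seq_along(character(0))) stop("never reached")
```

This does not address the "Tree set" error itself; it only makes the diagnostic output robust when no names come back.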

Microarray Annotation xps • 1.5k views

ADD COMMENT • link updated 12.3 years ago by cstrato ★ 3.9k • written 12.3 years ago by Groot, Philip de ▴ 630


cstrato ★ 3.9k


Dear Philip,

I have just tried a subset of CEL-files from the Affymetrix "gene_1_1_st_ap_tissue_sample_data" for the HuGene_1.1 array, but I cannot reproduce the error you get. Here is my output for one CEL-file only:

    > library(xps)

    Welcome to xps version 1.19.1
        an R wrapper for XPS - eXpression Profiling System
        (c) Copyright 2001-2013 by Christian Stratowa

    > scheme <- root.scheme("./na33/hugene11stv1.root")
    > x.xps <- import.data(scheme, "tmp_x", celdir = "./cel", celfiles = "HumanBrain_1.CEL", verbose = TRUE)
    Opening file <./na33/hugene11stv1.root> in <read> mode...
    Creating new temporary file </volumes>...
    Importing <./cel/HumanBrain_1.CEL> as <humanbrain_1.cel>...
    hybridization statistics:
       1 cells with minimal intensity 17.5
       1 cells with maximal intensity 22402.1
    New dataset <dataset> is added to Content...
    > cat("The loaded .CEL-files are:\n");
    The loaded .CEL-files are:
    > for (i in 1:length(rawCELName(x.xps, fullpath = FALSE)))
    +   cat(sprintf("%s\n", rawCELName(x.xps, fullpath = FALSE)[i]));
    HumanBrain_1.CEL
    >
    > sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

    locale:
    [1] C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] xps_1.19.1

    loaded via a namespace (and not attached):
    [1] tools_2.15.0
    >

As you can see, everything is OK. I also ran the triplicates of the Brain and Prostate samples and could do RMA without problems.

Could you please try the following two options:

1. Could you try to use the CEL-files from the Affymetrix dataset, to make sure that there is no problem with your CEL-files.

2. I see that you created the ROOT scheme files in the directory:

    scmdir <- paste(.path.package("xps"), "schemes/", sep = "/")

I must admit that I have never tried to store the scheme files in the package directory, since I have the feeling that this may cause trouble, especially when you update R and/or the xps package to a new version. Could you please try to save your file "hugene11stv1.root" in a different directory such as '/home/degroot/schemes', or better, create the file in that directory, and then check whether you still get the problem.

Best regards,
Christian
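The suggestion to keep scheme files outside the package library could be sketched roughly as follows. This is an assumption-laden illustration in base R: `ensure_scheme_dir` is a hypothetical helper, and the demonstration uses a temporary location rather than a real path such as '/home/degroot/schemes'.

```r
# Hypothetical helper: create (if needed) and return a directory for ROOT scheme
# files that lives outside the R library tree, so package updates do not touch it.
ensure_scheme_dir <- function(path = file.path(path.expand("~"), "schemes")) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  normalizePath(path)
}

# Demonstrated on a temporary location so that nothing is written to the home directory.
scmdir <- ensure_scheme_dir(file.path(tempdir(), "schemes"))
print(scmdir)

# The scheme would then be created there instead of in the package directory, e.g.
# scheme <- import.exon.scheme("hugene11stv1", filedir = scmdir, ...)  # arguments as in the thread
```

The returned path can be passed as `filedir` when the scheme is created and used again later in the `root.scheme()` call.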

ADD COMMENT • link 12.3 years ago cstrato ★ 3.9k


Dear Philip,

Meanwhile I did another test and renamed my CEL-files to mimic your names. This is what I get:

    > celfiles <- c("Brain_01_1.1.CEL","Prostate_01_1.1.CEL")
    > data.genome11 <- import.data(scheme.hugene11, "tmp_HuBrPr", filedir=datdir, celdir=celdir, celfiles=celfiles)
    Opening file </volumes> in <read> mode...
    Creating new temporary file </volumes>...
    Importing </volumes> as <brain_01_1.1.cel>...
    hybridization statistics:
       1 cells with minimal intensity 17.5
       1 cells with maximal intensity 22402.1
    New dataset <dataset> is added to Content...
    Importing </volumes> as <prostate_01_1.1.cel>...
    hybridization statistics:
       2 cells with minimal intensity 14.5
       1 cells with maximal intensity 23266.3
    > for (i in 1:length(rawCELName(data.genome11, fullpath = FALSE)))
    +   cat(sprintf("%s\n", rawCELName(data.genome11, fullpath = FALSE)[i]))
    Error: Tree set <> could not be found in file content
    Error: Tree set <> could not be found in file content

As you can see, I can now replicate your error.

The solution is simple: use the parameter 'celnames'. Now the result is:

    > celfiles <- c("Brain_01_1.1.CEL","Prostate_01_1.1.CEL")
    > celnames <- c("Brain01","Prostate01")
    > data.genome11 <- import.data(scheme.hugene11, "tmp_HuBrPr", filedir=datdir, celdir=celdir, celfiles=celfiles, celnames=celnames)
    Opening file </volumes> in <read> mode...
    Creating new temporary file </volumes>...
    Importing </volumes> as <brain01.cel>...
    hybridization statistics:
       1 cells with minimal intensity 17.5
       1 cells with maximal intensity 22402.1
    New dataset <dataset> is added to Content...
    Importing </volumes> as <prostate01.cel>...
    hybridization statistics:
       2 cells with minimal intensity 14.5
       1 cells with maximal intensity 23266.3
    > for (i in 1:length(rawCELName(data.genome11, fullpath = FALSE)))
    +   cat(sprintf("%s\n", rawCELName(data.genome11, fullpath = FALSE)[i]))
    Brain_01_1.1.CEL
    Prostate_01_1.1.CEL

As you can see, now everything works fine. The parameter 'celnames' was introduced from the beginning to allow alternative names without the need to change the names of the original CEL-files, since CEL-files often had names such as 'Breast_tissue;24/08/1999;batch-1,lot-2.1.CEL'.

I hope that using the parameter 'celnames' solves your problem.

Best regards,
Christian
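For scripts that handle many CEL-files, the alternative names for 'celnames' could be derived automatically instead of typed by hand. The sketch below is base R only and rests on an assumption: the thread does not say which characters in a file name trigger the error, so the hypothetical helper `make_celnames` simply keeps letters and digits. It therefore produces different names from the hand-picked "Brain01" and "Prostate01" above.

```r
# Hypothetical helper: derive alternative names for the 'celnames' argument
# from CEL file names. Assumption: anything other than letters and digits
# is removed; the thread does not state which characters xps objects to.
make_celnames <- function(celfiles) {
  stem <- sub("\\.cel$", "", basename(celfiles), ignore.case = TRUE)  # drop the extension
  nm <- gsub("[^A-Za-z0-9]", "", stem)                                  # keep letters and digits only
  if (any(nm == "") || anyDuplicated(nm) > 0) {
    stop("derived celnames must be non-empty and unique")
  }
  nm
}

celfiles <- c("Brain_01_1.1.CEL", "Prostate_01_1.1.CEL")
celnames <- make_celnames(celfiles)
print(celnames)
```

The resulting vector would then be passed alongside the unchanged file names, as in the call above: `import.data(..., celfiles = celfiles, celnames = celnames)`.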

ADD REPLY • link 12.3 years ago cstrato ★ 3.9k


Dear Christian,

Thank you very much! I was thinking that it must have been something in the CEL-file itself, but it turns out to be the filename! I'll adapt the script on our production server to fix the issue. I should mention that we have been using xps for quite some years now, and we never encountered this issue before!

I worked through your recommendations from yesterday. I could indeed load the Affymetrix sample data properly, and changing the location of the root scheme did not fix the issue either. Fortunately, we understand this now!

And you are right: if xps is updated, I need to recreate the schemes too. This only needs to be done once every 6 months (usually) and is not a big problem. It also forces me to check the Affymetrix site for updated annotations etc. I just feel more comfortable if the schemes are created by the currently running version of xps.

Have a nice weekend.

Regards,

Dr. Philip de Groot Ph.D.
Bioinformatics Researcher
Wageningen University / TIFN
Nutrigenomics Consortium
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
PO Box 8129, 6700 EV Wageningen
Visiting Address: Erfelijkheidsleer: De Valk, Building 304, Dreijenweg 2, 6703 HA Wageningen
Room: 0052a
T: +31-317-485786
F: +31-317-483342
E-mail: Philip.deGroot at wur.nl
Internet: http://www.nutrigenomicsconsortium.nl
http://humannutrition.wur.nl/
https://madmax.bioinformatics.nl/
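The habit of recreating schemes whenever xps is updated could be automated in a production script by recording which package version produced a scheme file. The following is a hedged base-R sketch: the helpers `write_scheme_stamp` and `scheme_is_stale` and the `.version` side file are hypothetical conventions, not xps features.

```r
# Hypothetical helpers: remember which package version created a scheme file,
# so a production script can tell when the scheme should be recreated.
write_scheme_stamp <- function(scheme_file, version) {
  writeLines(as.character(version), paste0(scheme_file, ".version"))
}

scheme_is_stale <- function(scheme_file, current_version) {
  stamp <- paste0(scheme_file, ".version")
  if (!file.exists(stamp)) return(TRUE)            # no record: treat as stale
  recorded <- readLines(stamp, n = 1L)
  package_version(recorded) != package_version(as.character(current_version))
}

# Demonstration with a temporary path and version strings taken from the thread.
scheme_file <- file.path(tempdir(), "hugene11stv1.root")
write_scheme_stamp(scheme_file, "1.18.1")
print(scheme_is_stale(scheme_file, "1.18.1"))
print(scheme_is_stale(scheme_file, "1.19.1"))
```

In real use the current version would come from `packageVersion("xps")`, and a stale result would trigger the `import.exon.scheme()` call again.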

ADD REPLY • link 12.3 years ago Groot, Philip de ▴ 630


Dear Philip, I am glad to hear that using 'celnames' could solve your problem. It is interesting to hear that you have never had problems with names of CEL-files. Personally I prefer to change the names, especially the names of the CEL-files from GEO which are simply numbers with a prefix. Have a nice weekend, too. Christian On 1/11/13 10:34 PM, Groot, Philip de wrote: > Dear Christian, > > Thank you very much! I was thinking that it must have been something in the CEL-file itself, but it turns out to be the filename! I'll adapt the script on our production server to fix the issue. I have to mention that we use xps for quite some years now. We never encountered this issue before! > > I worked through your recommendations from yesterday. I could indeed properly load the affymetrix sample data. And changing the location of the root-scheme did not fix the issue either! Fortunately, we do understand this now! > > And you are right: if xps is updated, I need to recreate the schemes too. This needs only to be done once every 6 months (usually) and is not a big problem. And it also forces me to check the Affymetrix site for updated annotations etc. I just feel more comfortable if the schemes are created by the current running version of xps. > > Have a nice weekend. > > Regards, > > > Dr. Philip de Groot Ph.D. 
> Bioinformatics Researcher
>
> Wageningen University / TIFN
> Nutrigenomics Consortium
> Nutrition, Metabolism & Genomics Group
> Division of Human Nutrition
> PO Box 8129, 6700 EV Wageningen
> Visiting Address: Erfelijkheidsleer: De Valk, Building 304
> Dreijenweg 2, 6703 HA Wageningen
> Room: 0052a
> T: +31-317-485786
> F: +31-317-483342
> E-mail: Philip.deGroot at wur.nl
> Internet: http://www.nutrigenomicsconsortium.nl
> http://humannutrition.wur.nl/
> https://madmax.bioinformatics.nl/
> ________________________________________
> From: cstrato [cstrato at aon.at]
> Sent: 11 January 2013 21:05
> To: Groot, Philip de
> Cc: bioconductor at r-project.org
> Subject: Re: [BioC] xps: hugene11 chip gives problems
>
> Dear Philip,
>
> Meanwhile I did another test and renamed my CEL-files to mimic your names. This is what I get:
>
> > celfiles <- c("Brain_01_1.1.CEL","Prostate_01_1.1.CEL")
> > data.genome11 <- import.data(scheme.hugene11, "tmp_HuBrPr",
> filedir=datdir, celdir=celdir, celfiles=celfiles)
> Opening file </volumes> in <read> mode...
> Creating new temporary file </volumes>...
> Importing </volumes> as <brain_01_1.1.cel>...
> hybridization statistics:
> 1 cells with minimal intensity 17.5
> 1 cells with maximal intensity 22402.1
> New dataset <dataset> is added to Content...
> Importing </volumes> as <prostate_01_1.1.cel>...
> hybridization statistics:
> 2 cells with minimal intensity 14.5
> 1 cells with maximal intensity 23266.3
> > for (i in 1:length(rawCELName(data.genome11, fullpath = FALSE)))
> + cat(sprintf("%s\n", rawCELName(data.genome11, fullpath = FALSE)[i]))
> Error: Tree set <> could not be found in file content
> Error: Tree set <> could not be found in file content
>
> As you can see I can now replicate your error.
>
> The solution is simple, i.e. use parameter 'celnames'. Now the result is:
>
> > celfiles <- c("Brain_01_1.1.CEL","Prostate_01_1.1.CEL")
> > celnames <- c("Brain01","Prostate01")
> > data.genome11 <- import.data(scheme.hugene11, "tmp_HuBrPr",
> filedir=datdir, celdir=celdir, celfiles=celfiles, celnames=celnames)
> Opening file </volumes> in <read> mode...
> Creating new temporary file </volumes>...
> Importing </volumes> as <brain01.cel>...
> hybridization statistics:
> 1 cells with minimal intensity 17.5
> 1 cells with maximal intensity 22402.1
> New dataset <dataset> is added to Content...
> Importing </volumes> as <prostate01.cel>...
> hybridization statistics:
> 2 cells with minimal intensity 14.5
> 1 cells with maximal intensity 23266.3
> > for (i in 1:length(rawCELName(data.genome11, fullpath = FALSE)))
> + cat(sprintf("%s\n", rawCELName(data.genome11, fullpath = FALSE)[i]))
> Brain_01_1.1.CEL
> Prostate_01_1.1.CEL
>
> As you can see, now everything works fine. The reason for introducing parameter 'celnames' was from the beginning to allow alternative names w/o the need to change the names of the original CEL-files, since often CEL-files had names such as 'Breast_tissue;24/08/1999;batch-1,lot-2.1.CEL'.
>
> I hope that using parameter 'celnames' does solve your problem.
>
> Best regards,
> Christian
>
>
> On 1/10/13 9:10 PM, cstrato wrote:
>> Dear Philip,
>>
>> I have just tried a subset of CEL-files from the Affymetrix "gene_1_1_st_ap_tissue_sample_data" for HuGene_1.1 array, but I cannot repeat the error you get. Here is my output for one CEL-file only:
>>
>> > library(xps)
>>
>> Welcome to xps version 1.19.1
>> an R wrapper for XPS - eXpression Profiling System
>> (c) Copyright 2001-2013 by Christian Stratowa
>>
>> > scheme <- root.scheme("./na33/hugene11stv1.root")
>> > x.xps <- import.data(scheme, "tmp_x", celdir = "./cel", celfiles =
>> "HumanBrain_1.CEL", verbose = TRUE)
>> Opening file <./na33/hugene11stv1.root> in <read> mode...
>> Creating new temporary file </volumes>...
>> Importing <./cel/HumanBrain_1.CEL> as <humanbrain_1.cel>...
>> hybridization statistics:
>> 1 cells with minimal intensity 17.5
>> 1 cells with maximal intensity 22402.1
>> New dataset <dataset> is added to Content...
>> > cat("The loaded .CEL-files are:\n");
>> The loaded .CEL-files are:
>> > for (i in 1:length(rawCELName(x.xps, fullpath = FALSE)))
>> + cat(sprintf("%s\n", rawCELName(x.xps, fullpath = FALSE)[i]));
>> HumanBrain_1.CEL
>> >
>> > sessionInfo()
>> R version 2.15.0 (2012-03-30)
>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] xps_1.19.1
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.15.0
>> >
>>
>> As you see everything is ok. I did also run the triplicates of the Brain and Prostate samples and could do RMA w/o problems.
>>
>> Could you please try the following two options:
>>
>> 1, Could you try to use the CEL-files from the Affymetrix dataset to make sure that there is no problem with the CEL-files.
>>
>> 2, I see that you did create the ROOT scheme files in directory:
>> scmdir <- paste(.path.package("xps"), "schemes/", sep = "/")
>>
>> I must admit that I have never tried to store the scheme files in the package directory, since I have the feeling that this may cause troubles, especially when you update R and/or the xps package to a new version.
>> Could you please try to save your file "hugene11stv1.root" in a different directory such as '/home/degroot/schemes' or better to create this file in this directory, and then try if you still get the problem.
>>
>> Best regards,
>> Christian
>>
>>
>> On 1/10/13 1:03 PM, Groot, Philip de wrote:
>>> Hi Christian,
>>>
>>> I am trying to do an analysis using xps and the hugene11 chip. However,
>>> I run into problems for which I need your help.
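For readers with many CEL-files named like "Brain_01_1.1.CEL", typing 'celnames' by hand may be tedious. The following is only a minimal sketch in base R: `make_celnames` is a hypothetical helper, not part of xps, and it rests on the assumption (suggested by this thread, not confirmed by the xps documentation) that names containing only letters, digits and underscores are safe. It drops the ".CEL" extension and replaces every other character, including the extra dot, with an underscore.

```r
# Hypothetical helper (NOT part of xps): derive simple names from CEL filenames.
# Assumption: names made only of letters, digits and underscores avoid the
# "Tree set <> could not be found" error reported in this thread.
make_celnames <- function(celfiles) {
  # drop any directory part and the trailing ".CEL" extension (any case)
  stems <- sub("\\.cel$", "", basename(celfiles), ignore.case = TRUE)
  # replace each run of other characters (dots, semicolons, ...) by "_"
  safe <- gsub("[^A-Za-z0-9_]+", "_", stems)
  # remove underscores left at the start or end by the replacement
  safe <- gsub("^_+|_+$", "", safe)
  # refuse to continue if two files would end up with the same name;
  # compared case-insensitively to be conservative
  if (anyDuplicated(tolower(safe)) > 0) {
    stop("derived celnames are not unique")
  }
  safe
}

celfiles <- c("Brain_01_1.1.CEL", "Prostate_01_1.1.CEL")
celnames <- make_celnames(celfiles)
print(celnames)
```

The resulting vector could then be passed as `celnames=celnames` next to `celfiles=celfiles` in the `import.data()` call shown in the thread; the original CEL-files stay untouched on disk.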

12.3 years ago cstrato ★ 3.9k
