affxparser package question
@tae-hoon-chung-783
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070808/59059bfd/attachment.pl
@benilton-carvalho-1375
Brazil/Campinas/UNICAMP
Is there any specific reason not to use

library(oligo)
x = read.celfiles(list.celfiles())

and check the annotation packages?

b

On Aug 8, 2007, at 6:02 PM, Tae-Hoon Chung wrote:

> Hi;
>
> When one uses readCelUnits() to read SNP chip CEL files, how can one
> tell which values are for the forward strand or the backward strand, and
> which values are from the non-shifted probes or the shifted probes?
> For instance, in the following code chunk, which values are from
> forward/backward strand and from central/shifted probes?
>
> library("hapmap100kxba", lib.loc="~/Library/R64")
> library(affxparser, lib.loc="~/Library/R64")
>
> pth <- system.file('celFiles', package='hapmap100kxba')
> files <- list.files(path=pth, full.names=TRUE)
>
> chip.type <- readCelHeader(files[1])$chiptype ## Mapping50K_Xba240
> cels <- readCelUnits(files[1], cdf='~/Project/ProbeAnnot/Mapping50K_Xba240.cdf', stratifyBy='pm', addDimnames=TRUE)
> length(cels) ## 59015
> labs.test <- names(cels)[100:120]
> cels[[labs.test[1]]]
> ## $A
> ## $A$intensities
> ## [1] 7563 8050 9531 9292 11261
> ##
> ## $G
> ## $G$intensities
> ## [1] 6540 7639 9027 10512 11381
> ##
> ## $A
> ## $A$intensities
> ## [1] 4036 4144 3858 5170 3975
> ##
> ## $G
> ## $G$intensities
> ## [1] 4425 4291 3682 5912 5208
>
> Tae-Hoon Chung
>
> Post-Doctoral Researcher
> Computational Biology Division, TGEN
> 445 N 5th St. Phoenix, AZ 85004 USA
> O: 1-602-343-8724
> F: 1-602-343-8840
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070808/6d78c8b7/attachment.pl
I don't mean that it's going to be very hard. I mean that it might take you some work, and someone else has already figured it out for you, so you could be using your time for something else.

All you need is something like:

library(pd.mapping50k.xba240)
dbGetQuery(db(pd.mapping50k.xba240), "SELECT * from pmfeature LIMIT 5")

b

On Aug 8, 2007, at 6:15 PM, Tae-Hoon Chung wrote:

> Hi, Benilton;
>
> Does this mean that it's going to be tough to tell which ones are
> for the forward/backward strand or for central/shifted probes without
> involving the annotation package?
>
> Tae-Hoon Chung
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070808/e7df47d0/attachment.pl
What happens is pretty much the same as what you get when you call as.numeric() on a matrix.

There is a one-to-one mapping from X and Y coordinates (given that you know the dimensions of the chip) to an index. This is the nature of the xy2i function used a lot in the affy package and elsewhere in Bioconductor. And that's how we get the vector.

The basic idea behind the CEL file (over-simplifying here) is that it contains 3 columns that we use, and I could mimic it with something like:

ints <- rnorm(80)
fake.cel <- expand.grid(X=1:10, Y=1:8)
fake.cel <- cbind(fake.cel, intensity=ints)

If you take a few minutes, you'll see that there is an association between the X and Y columns and the row number. This row number is the "index" I referred to above, and it is the "fid" column in the annotation.

b

On Aug 8, 2007, at 6:33 PM, Tae-Hoon Chung wrote:

> Hi, Benilton;
>
> When I tried the 'low-level' method of using readCelUnits(), I got
> a list with individual probe intensities. However, when I tried
> read.celfiles() from the 'oligo' package, I got an object of
> 'SnpFeatureSet' class whose measurement values were retrieved
> through the exprs() method. The matrix retrieved by this method
> contained one vector for each sample, and it was hard to figure out
> how the low-level intensity values were transformed into a single
> vector. I am curious whether the low-level intensity values are
> still preserved in the SnpFeatureSet object or are more or less
> summarized in it.
>
> Tae-Hoon Chung
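The relationship between the X/Y coordinates and the row number described in the reply above can be sketched in base R. This is only a minimal sketch on the toy fake.cel table, not the affy implementation: the helper name xy_to_index and the 10 x 8 chip size are assumptions for illustration, and it uses 1-based coordinates to match expand.grid(X=1:10, Y=1:8). Real chips may store 0-based coordinates, which would shift the formula by one.

```r
# Toy "CEL file": X varies fastest, exactly as in the reply above.
set.seed(1)
ints <- rnorm(80)
fake.cel <- expand.grid(X = 1:10, Y = 1:8)
fake.cel <- cbind(fake.cel, intensity = ints)

# Hypothetical helper: map 1-based (x, y) coordinates to a row index,
# given the number of columns (the X extent) of the chip.
xy_to_index <- function(x, y, ncol) {
  x + (y - 1) * ncol
}

# The computed index reproduces the row number of every cell.
idx <- xy_to_index(fake.cel$X, fake.cel$Y, ncol = 10)
all(idx == seq_len(nrow(fake.cel)))

# Same idea as as.numeric() on a matrix: R stores matrices column by
# column, so flattening the 10 x 8 matrix gives back the intensity vector.
m <- matrix(fake.cel$intensity, nrow = 10, ncol = 8)
identical(as.numeric(m), fake.cel$intensity)

# Looking up one cell by coordinates or by index gives the same value.
m[3, 2] == fake.cel$intensity[xy_to_index(3, 2, ncol = 10)]
```

Under these assumptions, the value returned by xy_to_index plays the role of the "fid" column: it is the position of a cell in the flattened intensity vector.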
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070808/287d3546/attachment.pl
Try:

dbGetQuery(db(pd.mapping50k.xba240), "SELECT * FROM sequence LIMIT 5")

and let us know if that contains what you expected.

b

On Aug 8, 2007, at 7:04 PM, Tae-Hoon Chung wrote:

> Hi, Benilton;
>
> I can understand what you mean. Basically, the information from
> read.celfiles() is equivalent to that from readCelUnits(), except for
> the transformation from list to vectors needed to make a
> SnpFeatureSet object. Obviously, the forward/backward strand
> information can be retrieved from the annotation package. Still,
> there's no indication of which values are for central probes and which
> values are for shifted probes, right? When I looked at the
> annotation package, it contained the following information: fid,
> strand, allele, fsetid, pos, x, y. It seemed like the "pos"
> information was somehow related to central/shifted probes, but that
> was not clear.
>
> Tae-Hoon Chung
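Since the replies above identify "fid" as the row index into the intensity vector, the annotation columns mentioned in the thread (fid, strand, allele, fsetid, pos) can be attached to the intensities by plain indexing. The sketch below uses a made-up stand-in for the pmfeature table and a made-up intensity matrix shaped like the one-column-per-sample matrix described for exprs(). The numeric codes for strand, allele and pos are hypothetical, and the real coding should be checked in the annotation package.

```r
# Made-up stand-in for the pmfeature annotation table. Column names come
# from the thread; every value (and the meaning of each code) is hypothetical.
pmfeature <- data.frame(
  fid    = c(12L, 13L, 25L, 26L),  # row index into the intensity matrix
  fsetid = c(1L, 1L, 1L, 1L),      # one SNP (feature set)
  strand = c(0L, 0L, 1L, 1L),      # hypothetical strand coding
  allele = c(0L, 1L, 0L, 1L),      # hypothetical allele coding
  pos    = c(0L, 0L, 1L, 1L)       # hypothetical position code
)

# Made-up intensities: 80 cells (rows) for 2 samples (columns).
set.seed(1)
intensities <- matrix(rnorm(80 * 2, mean = 8), nrow = 80,
                      dimnames = list(NULL, c("s1", "s2")))

# Attach the annotation to the matching rows by indexing with fid.
annotated <- cbind(pmfeature, intensities[pmfeature$fid, , drop = FALSE])

# Group the probe-level values by strand.
by_strand <- split(annotated, annotated$strand)
by_strand
```

With the real annotation table in place of the toy data frame, the same indexing would label each probe intensity with its strand, allele and pos values.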
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070809/c197fc8b/attachment.pl