subset in XPS
Zhibin Lu ▴ 80
@zhibin-lu-2882
Last seen 10.2 years ago
Hi,

I am new to R/Bioconductor. I am using the xps package to analyze Affymetrix Gene ST 1.0 data. After loading the CEL files into a DataTreeSet and computing the expression levels with RMA, can I work on a subset of the data? Say I have 12 samples. After RMA, can I work on just 6 of them, divide them into two groups, and apply UniFilter to only these 6?

Thanks,

Zhibin
cstrato ★ 3.9k
@cstrato-908
Last seen 6.1 years ago
Austria
Dear Zhibin

Since you have already done RMA, you now have an ExprTreeSet, called e.g. "data.rma". You can see its structure with:
   > str(data.rma)

Since there is currently no direct way to use only a subset of an ExprTreeSet, you can create a new ExprTreeSet in the following way:

1. Make a subset of slot "data", which is a dataframe (assuming that you want to use samples 1,2,3,7,8,9):
   > subdata <- exprs(data.rma)
   > subdata <- subdata[,c(1:2,3:5, 9:11)]
   Please note that it is important to keep the first two columns, so sample n is found in column n+2.

2. Create a copy "sub.rma" of "data.rma":
   > sub.rma <- data.rma

3. Replace slot "data" with "subdata":
   > exprs(sub.rma) <- subdata

For the moment you need to replace slots "treenames" and "numtrees", too, which I will change in the future to be done automatically.

4. Replace slot "treenames" with the names of your subset:
   a, create a list containing the sub samples:
   > subtrees <- unlist(treeNames(data.rma))
   > subtrees <- as.list(subtrees[c(1:3,7:9)])
   b, check if the names are correct:
   > subtrees
   c, replace slot "treenames":
   > sub.rma@treenames <- subtrees

5. Replace slot "numtrees" with the number of subsamples:
   > sub.rma@numtrees <- length(subtrees)

6. Check if the new ExprTreeSet is correct:
   > str(sub.rma)

Now you can use the new ExprTreeSet "sub.rma" as input for method unifilter:
   > rma.ufr <- unifilter(sub.rma, .......)

If you want to take advantage of the advanced capabilities of package "limma", then you can create a Biobase class "ExpressionSet" containing only your 6 samples, as described in Appendix A.3 of the vignette xps.pdf:

1. Extract the normalized expression data:
   > subdata <- validData(data.rma)

2. Since "subdata" is a dataframe, simply create a subframe:
   > subdata <- subdata[,c(1:3,7:9)]

3. Create a Biobase class "ExpressionSet", called "subset":
   > subset <- new("ExpressionSet", exprs = as.matrix(subdata))

Now you have an ExpressionSet ready for use with "limma".

Please let me know if you succeeded with this info.
Best regards
Christian
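The column arithmetic in step 1 of the answer above is easy to get wrong, so here is a minimal base-R sketch on a mock data frame rather than a real ExprTreeSet. It assumes, as the answer states, that the first two columns are annotation columns followed by one column per sample. The column names and the helper `sample_cols` are hypothetical and not part of xps.

```r
# Mock stand-in for exprs(data.rma): 2 annotation columns + 12 sample columns.
# Column names are illustrative assumptions, not taken from a real ExprTreeSet.
n_annot <- 2
mock <- data.frame(UNIT_ID = 1:4, UnitName = paste0("probeset", 1:4))
for (i in 1:12) mock[[paste0("Sample", i, ".mdp_LEVEL")]] <- rnorm(4)

# Hypothetical helper: map sample numbers to column indices,
# always keeping the leading annotation columns.
sample_cols <- function(samples, n_annot = 2) {
  c(seq_len(n_annot), samples + n_annot)
}

keep <- c(1:3, 7:9)                  # samples 1,2,3,7,8,9
cols <- sample_cols(keep, n_annot)   # 1 2 3 4 5 9 10 11
subdata <- mock[, cols]
```

The resulting indices are the same as `c(1:2,3:5, 9:11)` in the answer, while the sample indices used for the treenames stay `c(1:3,7:9)`.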
Dear Christian,

Thanks so much for such a detailed explanation. I will try this when I get to work next week, and I do not see why I can not follow the direction.

Thanks again and have a nice weekend,

Zhibin

> Date: Sat, 28 Jun 2008 15:46:26 +0200
> From: cstrato@aon.at
> Subject: Re: [BioC] subset in XPS
Dear Zhibin

Meanwhile, I have uploaded a new version to BioC devel:
http://bioconductor.org/packages/2.3/bioc/html/xps.html
which simplifies your request as follows:

1. Get the expression values:
   > value <- exprs(data.rma)

2. Select the treenames of choice (no extension necessary):
   > treenames <- c("TestA2", "TestB1")

3. Make a copy of your object if you do not want to replace it:
   > sub.rma <- data.rma

4. Replace slot "data" with the subset:
   > exprs(sub.rma, treenames) <- value

5. Check if the new ExprTreeSet is correct:
   > str(sub.rma)

Best regards
Christian
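To illustrate what selecting by tree name amounts to, here is a minimal base-R sketch on a mock data frame. It does not call xps. The column naming scheme (tree name followed by an extension such as ".mdp_LEVEL") and the values are assumptions made for illustration only.

```r
# Mock expression data frame; column names and values are assumed for illustration.
mock <- data.frame(
  UNIT_ID  = 1:3,
  UnitName = c("ps1", "ps2", "ps3"),
  TestA1.mdp_LEVEL = c(5.1, 6.2, 7.3),
  TestA2.mdp_LEVEL = c(5.0, 6.1, 7.4),
  TestB1.mdp_LEVEL = c(4.9, 6.3, 7.2),
  TestB2.mdp_LEVEL = c(5.2, 6.0, 7.5)
)

treenames <- c("TestA2", "TestB1")

# Strip everything from the first dot onward to get the bare name of each column.
bare <- sub("\\..*$", "", names(mock))

# Keep the two annotation columns plus the columns of the chosen trees.
keep <- bare %in% c("UNIT_ID", "UnitName", treenames)
sub_mock <- mock[, keep]
names(sub_mock)
```

Selecting by name rather than by position avoids the off-by-two column bookkeeping of the index-based approach and does not depend on the order in which the samples were imported.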