Fold change GEO data

0

Entering edit mode

hemant ritturaj ▴ 150

@hemant-ritturaj-3016

Last seen 10.6 years ago

Dear All, I was trying to analyse the GEO data from GSE format Following is the code gse <- getGEO('gse6901')[[1]] exprs(gse) log2(exprs(gse)) set <- log2(exprs(gse)) target <- c('A','A','A','B','B','B','C','C','C','D','D','D') f <- factor(targets,levels=c('A','B','C','D')) design <- model.matrix(~0+f) contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) fit <- lmFit(set,design,method="robust") fit2 <- contrasts.fit(fit,contrast.matrix) fit2 <- ebayes(fit2) I am ineterested in log fold change between the contrast samples, but the as I see it is not there my question is how can i get the log fold change between the contrast samples I shall be very thankful for any help and suggesstion Regards Hemant [[alternative HTML version deleted]]

• 2.2k views

ADD COMMENT • link updated 16.6 years ago by Sean Davis 21k • written 16.6 years ago by hemant ritturaj ▴ 150

0

Entering edit mode

Sean Davis 21k

@sean-davis-490

Last seen 10 weeks ago

United States

On Tue, Sep 9, 2008 at 6:05 AM, hemant ritturaj <ritturajhemant at="" gmail.com=""> wrote: > Dear All, > > I was trying to analyse the GEO data from GSE format > > > Following is the code > > gse <- getGEO('gse6901')[[1]] > exprs(gse) > log2(exprs(gse)) > set <- log2(exprs(gse)) > target <- c('A','A','A','B','B','B','C','C','C','D','D','D') > f <- factor(targets,levels=c('A','B','C','D')) > design <- model.matrix(~0+f) > contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) > fit <- lmFit(set,design,method="robust") > fit2 <- contrasts.fit(fit,contrast.matrix) > fit2 <- ebayes(fit2) > > > I am ineterested in log fold change between the contrast samples, but the as > I see it is not there Hi, Hemant. When posting code, it is best to cut-and-paste it from you session to avoid typos (target should be targets above). When you say "I see it is not there", I'm not sure what you mean. Have you looked at the topTable output (you'll have to read the limma user guide and topTable help)? If you have done that, then could you complete your example and explain what is missing? Hope that helps. Sean

ADD COMMENT • link 16.6 years ago Sean Davis 21k

0

Entering edit mode

Dear Sean, My apologies for the errror. Here I am giving the complete error which i am getting > gse <- getGEO('gse6901')[[1]] > exprs(gse) > log2(exprs(gse)) > set <- log2(exprs(gse)) > targets <- c('A','A','A','B','B','B','C','C','C','D','D','D') > f <- factor(targets,levels=c('A','B','C','D')) > design <- model.matrix(~0+f) > contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) > fit <- lmFit(set,design,method="robust") > fit2 <- contrasts.fit(fit,contrast.matrix) > fit2 <- ebayes(fit2) > names(fit2) [1] "df.prior" "s2.prior" "s2.post" "t" "p.value" "var.prior" [7] "lods" topTable(fit2,n=10) Error in if (ncol(fit) > 1) return(topTableF(fit, number = number, genelist = genelist, : argument is of length zero I am not able to figure out logFC from the contrast matrix. also using the topTable gave me the above error. Also I am not able to understand why there are lesser variables than doing the same kind of analysis done using raw cel microarray files. Am I missing some things in my program ?? or it is the limitation using NCBI-GEO GSE format files. I am pasting one of the variable results done from the raw cel microarray files for your kind refernce names(fit2) [1] "coefficients" "rank" "assign" "qr" [5] "df.residual" "sigma" "cov.coefficients" "stdev.unscaled" [9] "genes" "Amean" "method" "design" [13] "contrasts" "df.prior" "s2.prior" "var.prior" [17] "proportion" "s2.post" "t" "p.value" [21] "lods" "F" "F.p.value" Thanks for helping Regards On Tue, Sep 9, 2008 at 8:11 PM, Sean Davis <sdavis2@mail.nih.gov> wrote: > On Tue, Sep 9, 2008 at 6:05 AM, hemant ritturaj > <ritturajhemant@gmail.com> wrote: > > Dear All, > > > > I was trying to analyse the GEO data from GSE format > > > > > > Following is the code > > > > gse <- getGEO('gse6901')[[1]] > > exprs(gse) > > log2(exprs(gse)) > > set <- log2(exprs(gse)) > > target <- c('A','A','A','B','B','B','C','C','C','D','D','D') > > f <- factor(targets,levels=c('A','B','C','D')) > > design <- model.matrix(~0+f) > > contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) > > fit <- lmFit(set,design,method="robust") > > fit2 <- contrasts.fit(fit,contrast.matrix) > > fit2 <- ebayes(fit2) > > > > > > I am ineterested in log fold change between the contrast samples, but the > as > > I see it is not there > > Hi, Hemant. When posting code, it is best to cut-and-paste it from > you session to avoid typos (target should be targets above). > > When you say "I see it is not there", I'm not sure what you mean. > Have you looked at the topTable output (you'll have to read the limma > user guide and topTable help)? If you have done that, then could you > complete your example and explain what is missing? > > Hope that helps. > > Sean > -- Hemant Ritturaj Kushwaha PhD Student Center for Computational Biology and Bioinformatics Jawaharlal Nehru University New Delhi-110067 Phone: 09868801604 [[alternative HTML version deleted]]

ADD REPLY • link 16.6 years ago hemant ritturaj ▴ 150

0

Entering edit mode

On Tue, Sep 9, 2008 at 1:31 PM, hemant ritturaj <ritturajhemant at="" gmail.com=""> wrote: > Dear Sean, > > My apologies for the errror. > > Here I am giving the complete error which i am getting > >> gse <- getGEO('gse6901')[[1]] >> exprs(gse) >> log2(exprs(gse)) >> set <- log2(exprs(gse)) >> targets <- c('A','A','A','B','B','B','C','C','C','D','D','D') >> f <- factor(targets,levels=c('A','B','C','D')) >> design <- model.matrix(~0+f) >> contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) >> fit <- lmFit(set,design,method="robust") >> fit2 <- contrasts.fit(fit,contrast.matrix) >> fit2 <- ebayes(fit2) > >> names(fit2) > [1] "df.prior" "s2.prior" "s2.post" "t" "p.value" "var.prior" > [7] "lods" > > topTable(fit2,n=10) > Error in if (ncol(fit) > 1) return(topTableF(fit, number = number, genelist > = genelist, : > argument is of length zero This works for me: gse <- getGEO('gse6901')[[1]] exprs(gse) <- log2(exprs(gse)+1) # avoid log2(0) since we are not normalizing targets <- c('A','A','A','B','B','B','C','C','C','D','D','D') f <- factor(targets,levels=c('A','B','C','D')) design <- model.matrix(~0+f) contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) fit <- lmFit(gse,design) # you can use an ExpressionSet just fine in limma fit2 <- contrasts.fit(fit,contrast.matrix) fit2 <- eBayes(fit2) names(fit2) [1] "coefficients" "rank" "assign" "qr" [5] "df.residual" "sigma" "cov.coefficients" "stdev.unscaled" [9] "genes" "Amean" "method" "design" [13] "contrasts" "df.prior" "s2.prior" "var.prior" [17] "proportion" "s2.post" "t" "p.value" [21] "lods" "F" "F.p.value" Again, start with a clean workspace, load GEOquery and limma, then paste the above commands into the window. Let us know if you have further questions.

ADD REPLY • link 16.6 years ago Sean Davis 21k

0

Entering edit mode

Dear Sean, Thanks for replying. It really worked the way you suggested. I have another question related to logFC I am getting the table fA...fB fA...fC fA...fD AveExpr F P.Value 35463 -7.708127 -7.938576 0.599991797 4.975526 543.0372 6.193852e-13 18194 -9.328894 -8.614855 0.071930586 7.696517 482.1491 1.244436e-12 23024 -8.227234 -6.507593 0.049250068 7.259353 425.1816 2.600162e-12 16492 -8.456246 -7.544640 -0.990340311 6.837342 404.4312 3.485316e-12 when I try to take any specific coefficient using topTable(fit2,number=10,coef="fA-fB") fA...fB fA...fC fA...fD AveExpr F P.Value 35463 -7.708127 -7.938576 0.599991797 4.975526 543.0372 6.193852e-13 18194 -9.328894 -8.614855 0.071930586 7.696517 482.1491 1.244436e-12 23024 -8.227234 -6.507593 0.049250068 7.259353 425.1816 2.600162e-12 16492 -8.456246 -7.544640 -0.990340311 6.837342 404.4312 3.485316e-12 I get this error Error in toptable(fit = fit[c("coefficients", "stdev.unscaled")], coef = coef, : subscript out of bounds Does that mean fA...fB is the logFC that i can consider directly ?? Is there any way that i can sort out the topTable error to figure out logFC for specific coeff ?? Regards On Wed, Sep 10, 2008 at 12:21 AM, Sean Davis <sdavis2@mail.nih.gov> wrote: > On Tue, Sep 9, 2008 at 1:31 PM, hemant ritturaj > <ritturajhemant@gmail.com> wrote: > > Dear Sean, > > > > My apologies for the errror. > > > > Here I am giving the complete error which i am getting > > > >> gse <- getGEO('gse6901')[[1]] > >> exprs(gse) > >> log2(exprs(gse)) > >> set <- log2(exprs(gse)) > >> targets <- c('A','A','A','B','B','B','C','C','C','D','D','D') > >> f <- factor(targets,levels=c('A','B','C','D')) > >> design <- model.matrix(~0+f) > >> contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) > >> fit <- lmFit(set,design,method="robust") > >> fit2 <- contrasts.fit(fit,contrast.matrix) > >> fit2 <- ebayes(fit2) > > > >> names(fit2) > > [1] "df.prior" "s2.prior" "s2.post" "t" "p.value" > "var.prior" > > [7] "lods" > > > > topTable(fit2,n=10) > > Error in if (ncol(fit) > 1) return(topTableF(fit, number = number, > genelist > > = genelist, : > > argument is of length zero > > This works for me: > > gse <- getGEO('gse6901')[[1]] > exprs(gse) <- log2(exprs(gse)+1) # avoid log2(0) since we are not > normalizing > targets <- c('A','A','A','B','B','B','C','C','C','D','D','D') > f <- factor(targets,levels=c('A','B','C','D')) > design <- model.matrix(~0+f) > contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) > fit <- lmFit(gse,design) # you can use an ExpressionSet just fine in limma > fit2 <- contrasts.fit(fit,contrast.matrix) > fit2 <- eBayes(fit2) > names(fit2) > > [1] "coefficients" "rank" "assign" "qr" > [5] "df.residual" "sigma" "cov.coefficients" > "stdev.unscaled" > [9] "genes" "Amean" "method" "design" > [13] "contrasts" "df.prior" "s2.prior" "var.prior" > [17] "proportion" "s2.post" "t" "p.value" > [21] "lods" "F" "F.p.value" > > Again, start with a clean workspace, load GEOquery and limma, then > paste the above commands into the window. Let us know if you have > further questions. > -- Hemant Ritturaj Kushwaha PhD Student Center for Computational Biology and Bioinformatics Jawaharlal Nehru University New Delhi-110067 Phone: 09868801604 [[alternative HTML version deleted]]

ADD REPLY • link 16.6 years ago hemant ritturaj ▴ 150

0

Entering edit mode

Hemant, The reason you are getting that error is because you need to match topTable's 'coef' argument with the column names of the your design/contrast matrix. Guessing from your output, probably you need to say coef="fA...fB" instead of coef="fA-fB", and so on for the others. Alternatively, you can specify coef=1 for the first column, coef=2, etc. Cheers, Mark > Dear Sean, > > Thanks for replying. > > It really worked the way you suggested. > > I have another question related to logFC > > I am getting the table > fA...fB fA...fC fA...fD AveExpr F > P.Value > 35463 -7.708127 -7.938576 0.599991797 4.975526 543.0372 6.193852e-13 > 18194 -9.328894 -8.614855 0.071930586 7.696517 482.1491 1.244436e-12 > 23024 -8.227234 -6.507593 0.049250068 7.259353 425.1816 2.600162e-12 > 16492 -8.456246 -7.544640 -0.990340311 6.837342 404.4312 3.485316e-12 > > when I try to take any specific coefficient using > > topTable(fit2,number=10,coef="fA-fB") > fA...fB fA...fC fA...fD AveExpr F P.Value > 35463 -7.708127 -7.938576 0.599991797 4.975526 543.0372 6.193852e-13 > 18194 -9.328894 -8.614855 0.071930586 7.696517 482.1491 1.244436e-12 > 23024 -8.227234 -6.507593 0.049250068 7.259353 425.1816 2.600162e-12 > 16492 -8.456246 -7.544640 -0.990340311 6.837342 404.4312 3.485316e-12 > > I get this error > > Error in toptable(fit = fit[c("coefficients", "stdev.unscaled")], coef = > coef, : > subscript out of bounds > > Does that mean fA...fB is the logFC that i can consider directly ?? > > Is there any way that i can sort out the topTable error to figure out > logFC > for specific coeff ?? > > Regards > > On Wed, Sep 10, 2008 at 12:21 AM, Sean Davis <sdavis2 at="" mail.nih.gov=""> wrote: > >> On Tue, Sep 9, 2008 at 1:31 PM, hemant ritturaj >> <ritturajhemant at="" gmail.com=""> wrote: >> > Dear Sean, >> > >> > My apologies for the errror. >> > >> > Here I am giving the complete error which i am getting >> > >> >> gse <- getGEO('gse6901')[[1]] >> >> exprs(gse) >> >> log2(exprs(gse)) >> >> set <- log2(exprs(gse)) >> >> targets <- c('A','A','A','B','B','B','C','C','C','D','D','D') >> >> f <- factor(targets,levels=c('A','B','C','D')) >> >> design <- model.matrix(~0+f) >> >> contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA- fD,levels=design) >> >> fit <- lmFit(set,design,method="robust") >> >> fit2 <- contrasts.fit(fit,contrast.matrix) >> >> fit2 <- ebayes(fit2) >> > >> >> names(fit2) >> > [1] "df.prior" "s2.prior" "s2.post" "t" "p.value" >> "var.prior" >> > [7] "lods" >> > >> > topTable(fit2,n=10) >> > Error in if (ncol(fit) > 1) return(topTableF(fit, number = number, >> genelist >> > = genelist, : >> > argument is of length zero >> >> This works for me: >> >> gse <- getGEO('gse6901')[[1]] >> exprs(gse) <- log2(exprs(gse)+1) # avoid log2(0) since we are not >> normalizing >> targets <- c('A','A','A','B','B','B','C','C','C','D','D','D') >> f <- factor(targets,levels=c('A','B','C','D')) >> design <- model.matrix(~0+f) >> contrast.matrix <- makeContrasts(fA-fB,fA-fC,fA-fD,levels=design) >> fit <- lmFit(gse,design) # you can use an ExpressionSet just fine in >> limma >> fit2 <- contrasts.fit(fit,contrast.matrix) >> fit2 <- eBayes(fit2) >> names(fit2) >> >> [1] "coefficients" "rank" "assign" "qr" >> [5] "df.residual" "sigma" "cov.coefficients" >> "stdev.unscaled" >> [9] "genes" "Amean" "method" "design" >> [13] "contrasts" "df.prior" "s2.prior" >> "var.prior" >> [17] "proportion" "s2.post" "t" "p.value" >> [21] "lods" "F" "F.p.value" >> >> Again, start with a clean workspace, load GEOquery and limma, then >> paste the above commands into the window. Let us know if you have >> further questions. >> > > > > -- > Hemant Ritturaj Kushwaha > PhD Student > Center for Computational Biology and Bioinformatics > Jawaharlal Nehru University > New Delhi-110067 > Phone: 09868801604 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor >

ADD REPLY • link 16.6 years ago Mark Robinson ★ 1.1k

0

Entering edit mode

Dear all, I am working on Rice microarray data from NCBI-GEO. I want to ask a question: Is there any way by which I can connect the Affymetrix probe expression data to the GO or TIGR annotations or Pfam ?? I shall be really thankful to anyone who helps in finding solution to the question. Regards Hemant Ritturaj Kushwaha [[alternative HTML version deleted]]

ADD REPLY • link 16.6 years ago hemant ritturaj ▴ 150

0

Entering edit mode

Dear All, My question is when i used the GSE data gse <- getGEO('GSE7530')[[1]] exprs(gse) <- log2(exprs(gse)+1) Warning message: In log(c(-2.50293388, -2.043776569, 0.552862779, 0.551170135, -2.433251302, : NaNs produced Quesion 1: I am not able to understand this error, is this mean that some of the point is not present in data, if yes how to tackle such problem ??? Keeping this problem in view would it be fine to go further with the further steps ?? I tried to save this expression information using > write.table(exprs(gse),file="gse_7530_exprssn",sep="\t") this file do have expression information but it doesn't have any probe name but jst the serial number of the probes, How can it be joined to the probe names. regards Hemant [[alternative HTML version deleted]]

ADD REPLY • link 16.6 years ago hemant ritturaj ▴ 150

0

Entering edit mode

On Sat, Sep 13, 2008 at 12:08 PM, hemant ritturaj <ritturajhemant at="" gmail.com=""> wrote: > Dear All, > > > My question is when i used the GSE data > > gse <- getGEO('GSE7530')[[1]] > exprs(gse) <- log2(exprs(gse)+1) > Warning message: > In log(c(-2.50293388, -2.043776569, 0.552862779, 0.551170135, -2.433251302, > : > NaNs produced > > Quesion 1: I am not able to understand this error, is this mean that some of > the point is not present in data, if yes how to tackle such problem ??? Note that this is not an error, but a warning. As for why, if you look at the raw data, you will see that it contains zeros. You can either add a small positive value to the expression values before taking the log (see the previous email that I sent to you) or you can use an approach like a glog (the bioc mailing list archive has a discussion of glog from Wolfgang Huber). > Keeping this problem in view would it be fine to go further with the > further steps ?? If your algorithms will deal with NaNs in the data, then you can probably go ahead. > I tried to save this expression information using >> write.table(exprs(gse),file="gse_7530_exprssn",sep="\t") > > this file do have expression information but it doesn't have any probe name > but jst the serial number of the probes, How can it be joined to the probe > names. You will need to write out the featureData as well. See the ExpressionSet help and Biobase vignettes for more information. Sean

ADD REPLY • link 16.6 years ago Sean Davis 21k

Login before adding your answer.