Simple affymetrix question (treated vs non-treated)

0

Entering edit mode

Wonjong Moon ▴ 80

@wonjong-moon-1900

Last seen 10.2 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20061005/ 42e0c5e0/attachment.pl

• 890 views

ADD COMMENT • link 18.2 years ago • updated 18.1 years ago Wonjong Moon ▴ 80

0

Entering edit mode

Morten Mattingsdal ▴ 190

@morten-mattingsdal-1907

Last seen 10.2 years ago

Hi Wonjong, I have some experience with affymetrix analysis in BioC, but im not a core member so my views are from a "end-user" point of view: 1. you design seems correct. If you want to double check it, use the affylmGUI library. Its a good way to check your design definitions. 2. Positive M values would mean up-regulated in group1 3. Yes. If you want to manually check probes to visually inspect their intensity values, use: Eset=as.matrix(eset) Eset["Pf.4.224.0_CDS_x_at",] 4. Ok. 5. Yes 6. If you have a relatively large experiment with many replicates, this is not necessary since weak spots tend to have a large variance and hence get poor p-values. Either way you can filter your data with modifications to the following code, where "data" is the return of the ReadAffy. This will keep probes with more than 3 present calls Eset=as.matrix(eset) calls=as.matrix(mas5calls(data)) calls.sum <- rowSums(calls=="P") keep <- names(calls.sum[calls.sum>=3]) new=Eset[keep,] then feed Limma the "new" object. Your toptable is only 10 probes long. You probably have some up-regulated probes as well further down the list ? To get all probes with positive B value use: top <- topTable(fit2,coef=1, adjust="BH", sort.by="B", number=20000) top <- top[top$B >0,] NB have you done any MA plots or boxplots of your data ? maybe that can give you any hints hope this helps morten Wonjong Moon wrote: > I am trying to analyze Affymetrix data (treated (group1) vs Non- Treated > (group2)) > I would like to know up-regulated probesets in group1 (treated)? > > 1. I would like to know that I set up the design matrix correctly. > 2. I did group1 (treated) - group1 (non-treated). So positive M values > means up-regulated in group1 (treated), is that right? Or I switched > treated to Non-treated? > 3. Output numbers look like to have opposite signs. > 4. If I go down to positive M values, then all P-values are 1. > 5 Does 4 mean there's no up-regulated probe sets at significant level? > > Data and R codes are available http://binf.gmu.edu/wmoon/diff. > > 6. How can I remove 'Absent' flagged probesets? Or is it OK to leave > them? > Thank you. > > Wonjong > > > Target file: 141PD.txt > > Name FileName Target > CSA1 1A-1_SA1_141PD.CEL CSA > CSA2 2A-1_SA2_141PD.CEL CSA > CSA3 3A-1_SA4_141PD.CEL CSA > CSA4 4A-1_SA5_141PD.CEL CSA > Non5 5A-1_Non1_141PD.CEL Non > Non6 6Ar-1_Non2_141PD.CEL Non > Non7 7A-1_Non4_141PD.CEL Non > Non8 8A-1_Non5_141PD.CEL Non > > > library(limma) # Loads limma library. > targets <- readTargets("141PD.txt") > library(affy); data <- ReadAffy(filenames=targets$FileName) # Reads CEL > files > eset <- rma(data) # Normalizes data with 'rma' > esign <- model.matrix(~ -1+factor(c(1,1,1,1,2,2,2,2))) > colnames(design) <- c("group1", "group2") > fit <- lmFit(eset, design) > contrast.matrix <- makeContrasts(group1-group2,levels=design) > fit2 <- contrasts.fit(fit, contrast.matrix) > fit2 <- eBayes(fit2) # Computes moderated t-statistics and log-odds > topTable(fit2, coef=1, adjust="fdr", sort.by="M", number=10) > > > output > ID M A t P.Value adj.P.Val B > 21231 Pf.4.224.0_CDS_x_at -4.49 3.99 -37.7 2.59e-10 5.90e-06 10.27 > 21230 Pf.4.223.0_CDS_x_at -4.32 4.10 -30.8 1.32e-09 1.50e-05 9.77 > 22728 X03144.1_at -3.57 4.39 -23.4 1.17e-08 5.74e-05 8.86 > 22101 Pf.7.64.0_CDS_a_at -5.03 4.64 -23.2 1.22e-08 5.74e-05 8.84 > 612 AF306408.1_RC_at -3.51 3.52 -22.6 1.50e-08 5.74e-05 8.73 > 20063 Pf.13_1.84.0_CDS_a_at -5.03 4.54 -22.6 1.51e-08 5.74e-05 8.73 > 20855 Pf.2.36.0_CDS_at -2.58 4.65 -20.2 3.73e-08 1.21e-04 8.24 > 22524 Pf.9.267.0_CDS_at -4.16 4.05 -18.5 7.27e-08 2.04e-04 7.84 > 21350 Pf.5.119.0_CDS_x_at -4.55 5.32 -18.3 8.07e-08 2.04e-04 7.78 > 20078 Pf.13_1.99.0_CDS_x_at -2.36 6.50 -17.9 9.52e-08 2.17e-04 7.67 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > > > . > >

ADD COMMENT • link 18.1 years ago Morten Mattingsdal ▴ 190

0

Entering edit mode

Wonjong Moon ▴ 80

@wonjong-moon-1900

Last seen 10.2 years ago

Thank you for your reply. Here, I have two different BioC codes A and B. I am comparaing Affymetrix data for 'CSA' and 'Non'. I used two different matrix design. Target file: 141PD.txt Name FileName Target CSA1 1A-1_SA1_141PD.CEL CSA CSA2 2A-1_SA2_141PD.CEL CSA CSA3 3A-1_SA4_141PD.CEL CSA CSA4 4A-1_SA5_141PD.CEL CSA Non5 5A-1_Non1_141PD.CEL Non Non6 6Ar-1_Non2_141PD.CEL Non Non7 7A-1_Non4_141PD.CEL Non Non8 8A-1_Non5_141PD.CEL Non Matrix design1 and design2 gave me the opposite sign with same B value (absolute value of M is exactly same), which means up-regulated genes in design1 became down-regulated in design2. I would like to know which one is correct for my purpose. My purpose is to know which matrix design gives me the up-regulated genes in 'CSA' with reasonable B or p values. Questions. 1. Positive M values in design1 mean up-regulated in CSA? 2. Positive M values in design2 mean up-regulated in CSA? A. matrix design1 library(affy) library(limma) # Loads limma library. targets <- readTargets("141PD.txt") # Reads targets information from file data <- ReadAffy(filenames=targets$FileName) # Reads CEL files (specified in 'targets') into AffyBatch object eset <- rma(data) # Normalizes data with 'rma' design <- model.matrix(~ -1+factor(c(1,1,1,1,2,2,2,2))) design colnames(design) <- c("CSA", "Non") fit <- lmFit(eset, design) contrast.matrix <- makeContrasts(CSA-Non,levels=design) fit2 <- contrasts.fit(fit, contrast.matrix) fit2 <- eBayes(fit2) topTable(fit2, coef=1, adjust="fdr", sort.by="B", number=10) ID M A t P.Value adj.P.Val B Pf.4.224.0_CDS_x_at -4.493635 3.989016 -37.73719 2.591029e-10 5.899515e-06 10.266739 Pf.4.223.0_CDS_x_at -4.321856 4.101203 -30.76139 1.319405e-09 1.502077e-05 9.771946 X03144.1_at -3.570154 4.388052 -23.35914 1.171606e-08 5.742442e-05 8.855552 Pf.7.64.0_CDS_a_at -5.031334 4.643776 -23.24412 1.218247e-08 5.742442e-05 8.836283 AF306408.1_RC_at -3.512501 3.516498 -22.64498 1.497590e-08 5.742442e-05 8.732638 Pf.13_1.84.0_CDS_a_at -5.032685 4.542793 -22.61523 1.513226e-08 5.742442e-05 8.727346 Pf.2.36.0_CDS_at -2.584731 4.651177 -20.17259 3.728248e-08 1.212693e-04 8.239339 Pf.9.267.0_CDS_at -4.158351 4.053352 -18.52982 7.270084e-08 2.041490e-04 7.841473 Pf.5.119.0_CDS_x_at -4.550460 5.321148 -18.28511 8.069483e-08 2.041490e-04 7.776541 Pf.13_1.99.0_CDS_x_at -2.364882 6.501028 -17.90327 9.521651e-08 2.167985e-04 7.672015 > design CSA Non 1 1 0 2 1 0 3 1 0 4 1 0 5 0 1 6 0 1 7 0 1 8 0 1 attr(,"assign") [1] 1 1 attr(,"contrasts") attr(,"contrasts")$`factor(c(1, 1, 1, 1, 2, 2, 2, 2))` [1] "contr.treatment" B. matrix design2 library(affy) library(limma) # Loads limma library. targets <- readTargets("141PD.txt") # Reads targets information from file data <- ReadAffy(filenames=targets$FileName) # Reads CEL files (specified in 'targets') into AffyBatch object eset <- rma(data) # Normalizes data with 'rma' pData(eset) chips <- c("CSA", "CSA", "CSA", "CSA", "Non", "Non", "Non", "Non") design <-model.matrix(~factor(chips)) colnames(design) <- c("CSA", "CSA vs Non") design fit <- lmFit(eset, design) fit <- eBayes(fit) options(digits=2) topTable(fit, coef=2, n=10, adjust="BH") ID M A t P.Value adj.P.Val B 21231 Pf.4.224.0_CDS_x_at 4.5 4.0 38 2.6e-10 5.9e-06 10.3 21230 Pf.4.223.0_CDS_x_at 4.3 4.1 31 1.3e-09 1.5e-05 9.8 22728 X03144.1_at 3.6 4.4 23 1.2e-08 5.7e-05 8.9 22101 Pf.7.64.0_CDS_a_at 5.0 4.6 23 1.2e-08 5.7e-05 8.8 612 AF306408.1_RC_at 3.5 3.5 23 1.5e-08 5.7e-05 8.7 20063 Pf.13_1.84.0_CDS_a_at 5.0 4.5 23 1.5e-08 5.7e-05 8.7 20855 Pf.2.36.0_CDS_at 2.6 4.7 20 3.7e-08 1.2e-04 8.2 22524 Pf.9.267.0_CDS_at 4.2 4.1 19 7.3e-08 2.0e-04 7.8 21350 Pf.5.119.0_CDS_x_at 4.6 5.3 18 8.1e-08 2.0e-04 7.8 20078 Pf.13_1.99.0_CDS_x_at 2.4 6.5 18 9.5e-08 2.2e-04 7.7 > design CSA CSA vs Non 1 1 0 2 1 0 3 1 0 4 1 0 5 1 1 6 1 1 7 1 1 8 1 1 attr(,"assign") [1] 0 1 attr(,"contrasts") attr(,"contrasts")$`factor(chips)` [1] "contr.treatment"

ADD COMMENT • link 18.1 years ago Wonjong Moon ▴ 80

0

Entering edit mode

Wonjong Moon wrote: > Thank you for your reply. > Here, I have two different BioC codes A and B. I am comparaing > Affymetrix data for 'CSA' and 'Non'. I used two different matrix design. > > Target file: 141PD.txt > > Name FileName Target > CSA1 1A-1_SA1_141PD.CEL CSA > CSA2 2A-1_SA2_141PD.CEL CSA > CSA3 3A-1_SA4_141PD.CEL CSA > CSA4 4A-1_SA5_141PD.CEL CSA > Non5 5A-1_Non1_141PD.CEL Non > Non6 6Ar-1_Non2_141PD.CEL Non > Non7 7A-1_Non4_141PD.CEL Non > Non8 8A-1_Non5_141PD.CEL Non > > > Matrix design1 and design2 gave me the opposite sign with same B value > (absolute value of M is exactly same), which means up-regulated genes in > design1 became down-regulated in design2. > I would like to know which one is correct for my purpose. My purpose is > to know which matrix design gives me the up-regulated genes in 'CSA' > with reasonable B or p values. > Questions. > 1. Positive M values in design1 mean up-regulated in CSA? You set up the design matrix specifically using CSA - Non, so that is the comparison you are making. Therefore, a positive value means up in CSA, and a negative means the opposite. > 2. Positive M values in design2 mean up-regulated in CSA? No. Coefficient 2 in that design is Non - CSA. HTH, Jim > > A. matrix design1 > library(affy) > > library(limma) # Loads limma library. > > targets <- readTargets("141PD.txt") # Reads targets information from > file > data <- ReadAffy(filenames=targets$FileName) # Reads CEL files > (specified in 'targets') into AffyBatch object > eset <- rma(data) # Normalizes data with 'rma' > design <- model.matrix(~ -1+factor(c(1,1,1,1,2,2,2,2))) > > design > colnames(design) <- c("CSA", "Non") > fit <- lmFit(eset, design) > > contrast.matrix <- makeContrasts(CSA-Non,levels=design) > > fit2 <- contrasts.fit(fit, contrast.matrix) > > fit2 <- eBayes(fit2) > > topTable(fit2, coef=1, adjust="fdr", sort.by="B", number=10) > > > ID M A t P.Value > adj.P.Val B > Pf.4.224.0_CDS_x_at -4.493635 3.989016 -37.73719 2.591029e-10 > 5.899515e-06 10.266739 > Pf.4.223.0_CDS_x_at -4.321856 4.101203 -30.76139 1.319405e-09 > 1.502077e-05 9.771946 > X03144.1_at -3.570154 4.388052 -23.35914 1.171606e-08 > 5.742442e-05 8.855552 > Pf.7.64.0_CDS_a_at -5.031334 4.643776 -23.24412 1.218247e-08 > 5.742442e-05 8.836283 > AF306408.1_RC_at -3.512501 3.516498 -22.64498 1.497590e-08 > 5.742442e-05 8.732638 > Pf.13_1.84.0_CDS_a_at -5.032685 4.542793 -22.61523 1.513226e-08 > 5.742442e-05 8.727346 > Pf.2.36.0_CDS_at -2.584731 4.651177 -20.17259 3.728248e-08 > 1.212693e-04 8.239339 > Pf.9.267.0_CDS_at -4.158351 4.053352 -18.52982 7.270084e-08 > 2.041490e-04 7.841473 > Pf.5.119.0_CDS_x_at -4.550460 5.321148 -18.28511 8.069483e-08 > 2.041490e-04 7.776541 > Pf.13_1.99.0_CDS_x_at -2.364882 6.501028 -17.90327 9.521651e-08 > 2.167985e-04 7.672015 > > > > >>design > > CSA Non > 1 1 0 > 2 1 0 > 3 1 0 > 4 1 0 > 5 0 1 > 6 0 1 > 7 0 1 > 8 0 1 > attr(,"assign") > [1] 1 1 > attr(,"contrasts") > attr(,"contrasts")$`factor(c(1, 1, 1, 1, 2, 2, 2, 2))` > [1] "contr.treatment" > > B. matrix design2 > library(affy) > > library(limma) # Loads limma library. > > targets <- readTargets("141PD.txt") # Reads targets information from > file > data <- ReadAffy(filenames=targets$FileName) # Reads CEL files > (specified in 'targets') into AffyBatch object > eset <- rma(data) # Normalizes data with 'rma' > pData(eset) > chips <- c("CSA", "CSA", "CSA", "CSA", "Non", "Non", "Non", "Non") > design <-model.matrix(~factor(chips)) > colnames(design) <- c("CSA", "CSA vs Non") > design > fit <- lmFit(eset, design) > fit <- eBayes(fit) > options(digits=2) > topTable(fit, coef=2, n=10, adjust="BH") > ID M A t P.Value adj.P.Val B > 21231 Pf.4.224.0_CDS_x_at 4.5 4.0 38 2.6e-10 5.9e-06 10.3 > 21230 Pf.4.223.0_CDS_x_at 4.3 4.1 31 1.3e-09 1.5e-05 9.8 > 22728 X03144.1_at 3.6 4.4 23 1.2e-08 5.7e-05 8.9 > 22101 Pf.7.64.0_CDS_a_at 5.0 4.6 23 1.2e-08 5.7e-05 8.8 > 612 AF306408.1_RC_at 3.5 3.5 23 1.5e-08 5.7e-05 8.7 > 20063 Pf.13_1.84.0_CDS_a_at 5.0 4.5 23 1.5e-08 5.7e-05 8.7 > 20855 Pf.2.36.0_CDS_at 2.6 4.7 20 3.7e-08 1.2e-04 8.2 > 22524 Pf.9.267.0_CDS_at 4.2 4.1 19 7.3e-08 2.0e-04 7.8 > 21350 Pf.5.119.0_CDS_x_at 4.6 5.3 18 8.1e-08 2.0e-04 7.8 > 20078 Pf.13_1.99.0_CDS_x_at 2.4 6.5 18 9.5e-08 2.2e-04 7.7 > > > >>design > > CSA CSA vs Non > 1 1 0 > 2 1 0 > 3 1 0 > 4 1 0 > 5 1 1 > 6 1 1 > 7 1 1 > 8 1 1 > attr(,"assign") > [1] 0 1 > attr(,"contrasts") > attr(,"contrasts")$`factor(chips)` > [1] "contr.treatment" > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.

ADD REPLY • link 18.1 years ago James W. MacDonald 67k

Login before adding your answer.