annotating microarray data with mogene10stv1

Jakub Stanislaw Nowak ▴ 70

@jakub-stanislaw-nowak-6656

Last seen 10.6 years ago

Hello everyone,

This is my first attempt, so it may not be a perfect email. I am not very advanced in bioinformatics, so I have tried to be very detailed. Basically, I am trying to annotate a microarray dataset from Affymetrix using Bioconductor.

I used the following steps:

1. Loading libraries

    > library(annotate)
    > library(limma)
    > library(mogene10sttranscriptcluster.db)
    > library(affy)

2. I read my target file into an annotated data frame using

    > adf <- read.AnnotatedDataFrame("target.txt", header=TRUE, row.names=1, as.is=TRUE)

3. I read the expression .CEL files listed in the target file using

    > mydata <- ReadAffy(filenames=pData(adf)$FileName, phenoData=adf)
    > mydata
    AffyBatch object
    size of arrays=1050x1050 features (19 kb)
    cdf=MoGene-1_0-st-v1 (34760 affyids)
    number of samples=6
    number of genes=34760
    annotation=mogene10stv1
    notes=

4. I normalised my data with rma

    > eset <- rma(mydata)

   When you check eset it looks OK:

    > eset
    ExpressionSet (storageMode: lockedEnvironment)
    assayData: 34760 features, 6 samples
      element names: exprs
    protocolData
      sampleNames: GSM910962.CEL GSM910963.CEL ... GSM910967.CEL (6 total)
      varLabels: ScanDate
      varMetadata: labelDescription
    phenoData
      sampleNames: GSM910962.CEL GSM910963.CEL ... GSM910967.CEL (6 total)
      varLabels: FileName Description
      varMetadata: labelDescription
    featureData
      featureNames: 10338001 10338003 ... 10608724 (34760 total)
      fvarLabels: ID Symbol Name
      fvarMetadata: labelDescription
    experimentData: use 'experimentData(object)'
    Annotation: mogene10stv1

5. I was able to generate a fold change vector for the samples of interest, but I have a problem annotating eset.

6. I tried to annotate eset with the annotation from above, using mogene10sttranscriptcluster.db or mogene10stprobes.db.

    # I build an annotation table
    ID <- featureNames(eset)
    Symbol <- getSYMBOL(ID, "mogene10sttranscriptcluster.db")
    Name <- as.character(lookUp(ID, "mogene10sttranscriptcluster.db", "GENENAME"))
    tmp <- data.frame(ID=ID, Symbol=Symbol, Name=Name, stringsAsFactors=FALSE)
    tmp[tmp=="NA"] <- NA  # fix padding with NA characters

The problem I have is that for a large number of IDs (all of the first 6500) I am getting Symbol and Name annotated as NA. Here is the output for some of them:

    > Name[6500:6550]
     [1] "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA"
    [22] "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA"
    [43] "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA"

    > Symbol[6500:6550]
    10344545 10344546 10344547 10344548 10344549 10344550 10344551 10344552 10344553 10344554 10344555 10344556
          NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA
    10344557 10344558 10344559 10344560 10344561 10344562 10344563 10344564 10344565 10344566 10344567 10344568
          NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA
    10344569 10344570 10344571 10344572 10344573 10344574 10344575 10344576 10344577 10344578 10344579 10344580
          NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA
    10344581 10344582 10344583 10344584 10344585 10344586 10344587 10344588 10344589 10344590 10344591 10344592
          NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA
    10344593 10344594 10344595
          NA       NA       NA

    # So if I assign it as feature data of the current eset
    fData(eset) <- tmp

    # and perform stats with limma
    # Build the design matrix
    design <- model.matrix(~-1+factor(c(1,1,2,2,3,3)))
    colnames(design) <- c("mock","siGFP","siLin28a")

    > design
      mock siGFP siLin28a
    1    1     0        0
    2    1     0        0
    3    0     1        0
    4    0     1        0
    5    0     0        1
    6    0     0        1
    attr(,"assign")
    [1] 1 1 1
    attr(,"contrasts")
    attr(,"contrasts")$`factor(c(1, 1, 2, 2, 3, 3))`
    [1] "contr.treatment"

    # instructs limma which comparisons to make
    contrastmatrix <- makeContrasts(mock-siGFP, mock-siLin28a, siGFP-siLin28a, levels=design)

    > contrastmatrix
              Contrasts
    Levels     mock - siGFP mock - siLin28a siGFP - siLin28a
      mock                1               1                0
      siGFP              -1               0                1
      siLin28a            0              -1               -1

    # make the contrasts
    fit <- lmFit(eset, design)
    fit2 <- contrasts.fit(fit, contrastmatrix)
    fit2 <- eBayes(fit2)

    # list the top differentially expressed genes
    > topTable(fit2, coef=1, adjust="fdr")
                   ID Symbol Name     logFC  AveExpr          t      P.Value adj.P.Val         B
    10342604 10342604   <NA> <NA>  1.831809 2.845863  14.037480 1.248257e-05 0.4338942 -3.694032
    10343224 10343224   <NA> <NA> -2.751868 2.658551 -10.992289 4.802754e-05 0.8347186 -3.715792
    10339733 10339733   <NA> <NA>  1.703917 3.426402   9.860457 8.674159e-05 1.0000000 -3.728996
    10341175 10341175   <NA> <NA> -2.861665 2.167471  -8.877976 1.526481e-04 1.0000000 -3.744350
    10340405 10340405   <NA> <NA> -1.368074 2.944242  -8.245297 2.263752e-04 1.0000000 -3.756919
    10343199 10343199   <NA> <NA> -1.238289 2.077839  -8.130892 2.437752e-04 1.0000000 -3.759472
    10339048 10339048   <NA> <NA>  1.101959 2.129288   7.949683 2.746238e-04 1.0000000 -3.763713
    10344413 10344413   <NA> <NA> -1.407850 2.259821  -7.810608 3.014011e-04 1.0000000 -3.767144
    10343919 10343919   <NA> <NA>  1.134134 2.650166   7.644642 3.374231e-04 1.0000000 -3.771452
    10338867 10338867   <NA> <NA>  1.162114 5.792621   7.454279 3.850651e-04 1.0000000 -3.776701

As you can see, even this small portion is missing the Symbol and Name information.

So my question is: am I using the correct .db library for annotation? It looks like a lot of IDs cannot be annotated. How can I fix that problem?

Many thanks,

Jakub

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
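One way to avoid typing the design column names by hand is to build the design from a factor of group labels. The following is a minimal sketch in base R, assuming the six arrays are ordered mock, mock, siGFP, siGFP, siLin28a, siLin28a as in the design matrix printed in the question; the object name `groups` is illustrative and does not come from the original post.

```r
# Hypothetical group labels for the six arrays (two replicates per condition).
groups <- factor(c("mock", "mock", "siGFP", "siGFP", "siLin28a", "siLin28a"),
                 levels = c("mock", "siGFP", "siLin28a"))

# "~ 0 + groups" drops the intercept, so there is one indicator column per group.
design <- model.matrix(~ 0 + groups)

# Replace the default names (groupsmock, ...) with the plain level names.
colnames(design) <- levels(groups)

print(design)
```

A design built this way always has exactly as many column names as groups, and it can be passed to makeContrasts(..., levels=design) in the same way as the hand-labelled version.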

Microarray Annotation annotate limma ASSIGN • 4.7k views

ADD COMMENT • link updated 10.7 years ago by James W. MacDonald 68k • written 10.8 years ago by Jakub Stanislaw Nowak ▴ 70

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 14 hours ago

United States

Hi Jakub,

On 7/19/2014 8:30 AM, Jakub Stanislaw Nowak wrote:

> 6. I tried to annotate eset with the annotation from above, using mogene10sttranscriptcluster.db or mogene10stprobes.db.
>
>     # I build an annotation table
>     ID <- featureNames(eset)
>     Symbol <- getSYMBOL(ID, "mogene10sttranscriptcluster.db")
>     Name <- as.character(lookUp(ID, "mogene10sttranscriptcluster.db", "GENENAME"))
>     tmp <- data.frame(ID=ID, Symbol=Symbol, Name=Name, stringsAsFactors=F)
>     tmp[tmp=="NA"] <- NA  # fix padding with NA characters

There are two issues here. First, all of the Gene ST arrays have a lot of control probes that tend to pollute your results. There is a function in my affycoretools package that you can use to remove them (getMainProbes).

Second, you don't want to annotate like that. The lookUp() function will by default ignore all probesets that have one-to-many mappings, and will return NA for them. Instead, use more current methods:

    tmp <- select(mogene10sttranscriptcluster.db, ID, c("SYMBOL","GENENAME","ENTREZID"))

You will then get a message stating that you have multiple-mapping probesets. How you deal with that is up to you. Alternatives include concatenating:

    tmp2 <- do.call("rbind", lapply(split(tmp, tmp$PROBEID), function(x)
        apply(x, 2, function(y) paste(y[!duplicated(y)], collapse = " | "))))

I usually don't do that because I push results out via ReportingTools, and I like to have links to the Gene IDs. Plus some of the probesets map to a huge number of genes, so you can get really messy results.

An alternative is to select genes at random, for those with multiple mappings:

    tmp3 <- t(sapply(split(tmp, tmp$PROBEID), function(x) x[sample(seq_len(nrow(x)), 1),]))

Or my personal favorite, the most naive thing possible:

    tmp4 <- tmp[!duplicated(tmp$PROBEID),]

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
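The strategies for multiple-mapping probesets can be tried out on a small table before running them on real select() output. The sketch below uses only base R and a made-up data frame with the same column layout that select() returns (PROBEID plus the requested columns). The probeset IDs are borrowed from the thread, but the gene symbols and Entrez IDs are invented for illustration, and `collapse_unique` is a hypothetical helper, not part of any package.

```r
# Toy stand-in for select() output; the annotation values are invented.
tmp <- data.frame(
  PROBEID  = c("10344545", "10344546", "10344546", "10344547"),
  SYMBOL   = c("GeneA", "GeneB", "GeneC", NA),
  ENTREZID = c("1001", "1002", "1003", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the distinct non-missing values with " | ",
# or return NA when a probeset has no annotation at all.
collapse_unique <- function(y) {
  y <- unique(y[!is.na(y)])
  if (length(y) == 0) NA_character_ else paste(y, collapse = " | ")
}

# Strategy 1: one row per probeset, with all mappings concatenated.
by_probe <- split(tmp, tmp$PROBEID)
collapsed <- data.frame(
  PROBEID  = names(by_probe),
  SYMBOL   = unname(vapply(by_probe, function(d) collapse_unique(d$SYMBOL), character(1))),
  ENTREZID = unname(vapply(by_probe, function(d) collapse_unique(d$ENTREZID), character(1))),
  stringsAsFactors = FALSE
)

# Strategy 2: keep only the first mapping listed for each probeset.
first_hit <- tmp[!duplicated(tmp$PROBEID), ]

print(collapsed)
print(first_hit)
```

Both results have exactly one row per probeset. The concatenated form keeps every mapping but yields values that no longer work as single gene identifiers, which is the trade-off described in the answer.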

ADD COMMENT • link 10.7 years ago James W. MacDonald 68k

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 14 hours ago

United States

Hi Jakub,

Please don't take questions off-list (use 'Reply-all' when responding).

On 7/22/2014 12:06 PM, Jakub Stanislaw Nowak wrote:

> Hi Jim,
>
> I think I have a couple of follow-up questions, as I got stuck trying to use the getMainProbes function. As I am still a beginner with R, my questions might sound quite naive.
>
> 1. The first question is about loading data using the oligo package. Which approach would you use, or do they both give the same output?
>
>     > celFiles <- list.celfiles()
>     > mydata <- read.celfiles(celFiles)
>     Platform design info loaded.
>     Reading in : GSM910962.CEL
>     Reading in : GSM910963.CEL
>     Reading in : GSM910964.CEL
>     Reading in : GSM910965.CEL
>     Reading in : GSM910966.CEL
>     Reading in : GSM910967.CEL
>
> or
>
>     > adf <- read.AnnotatedDataFrame("target.txt", row.names=1, header=TRUE, as.is=TRUE)
>     > mydata2 <- read.celfiles(filenames=pData(adf)$FileName, phenoData=adf)
>     Platform design info loaded.
>     Reading in : GSM910962.CEL
>     Reading in : GSM910963.CEL
>     Reading in : GSM910964.CEL
>     Reading in : GSM910965.CEL
>     Reading in : GSM910966.CEL
>     Reading in : GSM910967.CEL
>     Warning message:
>     In read.celfiles(filenames = pData(adf)$FileName, phenoData = adf) :
>       'channel' automatically added to varMetadata in phenoData.

There should be no difference between the two, other than the obvious difference in the phenoData slot.

> 2. How would I use the function getMainProbes?
>
> I tried this and I ended up getting an error:
>
>     > eset <- rma(mydata)
>     Background correcting
>     Normalizing
>     Calculating Expression
>
>     > ID <- getMainProbes(eset)
>     > ID
>     ExpressionSet (storageMode: lockedEnvironment)
>     assayData: 28858 features, 6 samples
>       element names: exprs
>     protocolData
>       rowNames: mock1 mock2 ... siLin28a2 (6 total)
>       varLabels: exprs dates
>       varMetadata: labelDescription channel
>     phenoData
>       rowNames: mock1 mock2 ... siLin28a2 (6 total)
>       varLabels: index
>       varMetadata: labelDescription channel
>     featureData: none
>     experimentData: use 'experimentData(object)'
>     Annotation: pd.mogene.1.0.st.v1

You didn't get an error. You were returned an ExpressionSet containing only the 28,858 main probes (you started with 35K or so, IIRC).

>     > symbol <- getSYMBOL(ID, "pd.mogene.1.0.st.v1")
>     Error in unlist(lookUp(x, data, "SYMBOL")) :
>       error in evaluating the argument 'x' in selecting a method for function 'unlist': Error in mget(x, envir = getAnnMap(what, chip = data, load = load), ifnotfound = NA) :
>       error in evaluating the argument 'envir' in selecting a method for function 'mget': Error in (function (classes, fdef, mtable) :
>       unable to find an inherited method for function 'columns' for signature '"AffyGenePDInfo"'
>
> I think getMainProbes and featureNames return different kinds of output, so maybe my reasoning is wrong when I try to obtain symbols.
> Also, what type of annotation would you use: pd.mogene.1.0.st.v1 or mogene10sttranscriptcluster.db?

I gave you a suggestion previously that you shouldn't be using getSYMBOL(), or lookUp(), or any of the old-style annotation functions. That suggestion still holds! Use select() instead!

Also, pd.mogene.1.0.st.v1 isn't an annotation package. It is similar in spirit to the cdf packages that you use with the affy package, and is used to map probes to probesets, among other things.

The annotation package for this array, when summarized at the 'core' level (which is the default for oligo::rma()), is the mogene10sttranscriptcluster.db package. Refer to my previous email to see how to use this package to annotate your data.

Best,

Jim

> I will be grateful if you can give me some suggestions.
>
> Thanks,
>
> Jakub

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099

ADD COMMENT • link 10.7 years ago James W. MacDonald 68k


ADD REPLY • link 10.7 years ago Jakub Stanislaw Nowak ▴ 70

Hi Jakub,

When you do ID <- getMainProbes(eset), the ID here is an ExpressionSet rather than a character vector. To extract the character vector, you can use featureNames(ID):

    select(mogene10sttranscriptcluster.db, featureNames(ID), c("SYMBOL","GENENAME","ENTREZID"))

Best,

Xiayu

From: bioconductor-bounces@r-project.org [mailto:bioconductor-bounces@r-project.org] On Behalf Of Jakub Stanislaw Nowak
Sent: Tuesday, July 22, 2014 2:42 PM
To: James W. MacDonald
Cc: bioconductor@r-project.org
Subject: Re: [BioC] annotating microarray data with mogene10stv1

Hi Jim,

Thanks for your suggestion. Somehow I overlooked the select function. Now I think I am getting closer.

I have a problem with applying select() to my probes. I think it may be because ID, which holds my probes, is of type ExpressionSet. As explained before, I first generated ID containing the main probes from my dataset:

    > ID <- getMainProbes(eset)
    > ID
    ExpressionSet (storageMode: lockedEnvironment)
    assayData: 28858 features, 6 samples
      element names: exprs
    protocolData
      rowNames: mock1 mock2 ... siLin28a2 (6 total)
      varLabels: exprs dates
      varMetadata: labelDescription channel
    phenoData
      rowNames: mock1 mock2 ... siLin28a2 (6 total)
      varLabels: index
      varMetadata: labelDescription channel
    featureData: none
    experimentData: use 'experimentData(object)'
    Annotation: pd.mogene.1.0.st.v1

Then I wanted to annotate using select(), and I am getting this error:

    > tmp <- select(mogene10sttranscriptcluster.db, ID, c("SYMBOL","GENENAME","ENTREZID"))
    Error in .testForValidKeys(x, keys, keytype) :
      'keys' must be a character vector

However, if I use an ID generated with featureNames(), select() works, but I think that with this approach I am not removing the control probes that you described before.

Is there a way to convert a value of type ExpressionSet to a character type? Or alternatively, what should I do to make it work?

Many thanks,

Jakub
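Once there is a one-row-per-probeset annotation table, its rows still have to line up with the feature names of the filtered ExpressionSet before it can be attached with fData(). The base R sketch below shows the alignment step with match(). The vector `ids` stands in for featureNames() of the filtered object and `annot` for a de-duplicated select() result; both are invented toy values and are not taken from the thread.

```r
# Hypothetical feature IDs, in the order they appear in the expression data.
ids <- c("10344547", "10344545", "10344546")

# Toy one-row-per-probeset annotation; one ID has no annotation row at all.
annot <- data.frame(
  PROBEID = c("10344545", "10344546"),
  SYMBOL  = c("GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

# match() gives, for each feature ID, the row of annot holding it (or NA).
aligned <- annot[match(ids, annot$PROBEID), ]

# Restore the ID column for unannotated features and use the IDs as row names,
# so the table has the same rows, in the same order, as the expression data.
aligned$PROBEID <- ids
rownames(aligned) <- ids

print(aligned)
```

Features without an annotation row keep their ID and get NA in the remaining columns, so no feature is silently dropped or shifted against the expression values.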

ADD REPLY • link 10.7 years ago Rao,Xiayu ▴ 550


ADD REPLY • link 10.7 years ago Jakub Stanislaw Nowak ▴ 70
