ExpressionSet class and problems with phenotype and metadata matrices
@sean-maceachern-2684
Hello,

I'm new to R and Bioconductor. I am trying to analyse a simple microarray experiment examining two lines, resistant (R) and susceptible (S), for differences in expression levels.

The data I have contains a file with expression values for 4 replicates of the R line and 3 replicates of the S line. I'm trying to create an ExpressionSet object so that I can do some exploratory clustering on the data set, and I have been following the vignette "An Introduction to Bioconductor's ExpressionSet Class" by Falcon et al.

I have read in my data:

> summary(AffyIn)
    lineA.1                    lineB.3
 Min.   :   2.0           Min.   :   2.0
 1st Qu.:  18.0           1st Qu.:  18.0
 Median :  38.0           Median :  42.0
 Mean   : 139.0           Mean   : 143.4
 3rd Qu.:  96.0           3rd Qu.: 105.0
 Max.   :6974.0   ......  Max.   :7417.0

> dim(AffyIn)
[1] 38483     7

Following the vignette, I have read in a simple phenotype text file containing seven rows, which relate to the 7 lines, with two phenotypes, R and S:

> dim(AffyPheno)
[1] 7 1

> summary(AffyPheno)
 Pheno
 R:4
 S:3

> all(rownames(AffyPheno) == colnames(AffyIn))
[1] TRUE

# However, it is after this that I start having some problems. As I am using my own data, I have modified some of the exercises in the vignette.

> AffyPheno[c(3,7),c("Pheno")]
[1] R S
Levels: R S

# I was expecting something like the following to be returned:
        Pheno
lineA.3     R
LineB.7     S

# Also, when I try the following command I get this error:

> AffyPheno[AffyPheno$Pheno == "R"]
Error in `[.data.frame`(AffyPheno, AffyPheno$Pheno == "R") :
  undefined columns selected

# My R programming knowledge is basic at best, so I assumed there was something wrong there and continued with the metadata and phenoData.

> metadata = data.frame(labelDescrition = c("Status"),rownames=c("Phenotype"))
> metadata
  labelDescrition  rownames
1          Status Phenotype

> phenoData=new("AnnotatedDataFrame", data = AffyPheno, varMetadata = metadata)
> phenoData
An object of class "AnnotatedDataFrame"
  rowNames: line6.1, line6.2, ..., line7.4 (7 total)
  varLabels and varMetadata description:
    Pheno: NA
  additional varMetadata: rownames, labelDescription

# As you can see, no error was thrown, but I was expecting something in the varLabels and varMetadata descriptions.

So I thought it was best to check with the list to see if anyone could point out any mistakes I've made before I continue.

While I'm here, I was also wondering whether anyone knows of an annotation package, like the one for the hgu95av2 chip, for annotating chicken Affy data?

Thanks in advance,

Sean MacEachern

R version 2.6.0 (2007-10-03)
i386-apple-darwin8.10.1
Biobase_1.16.3
Annotation Clustering hgu95av2
@james-w-macdonald-5106
United States
Hi Sean,

Sean MacEachern wrote:
> > AffyPheno[c(3,7),c("Pheno")]
> [1] R S
> Levels: R S
>
> # I was expecting something like the following to be returned:
>         Pheno
> lineA.3     R
> LineB.7     S

You shouldn't expect that. You might want to peruse 'An Introduction to R', which I believe covers this point. Because only one column is selected, the output is being coerced to a vector. You can override that with

AffyPheno[c(3,7), c("Pheno"), drop=FALSE]

> # Also, when I try the following command I get this error:
> > AffyPheno[AffyPheno$Pheno == "R"]
> Error in `[.data.frame`(AffyPheno, AffyPheno$Pheno == "R") :
>   undefined columns selected

The error is supposed to be helpful here. You are trying to select rows from a data.frame, but you aren't saying which columns you want, so R treats the single index as a column selection. The correct incantation looks like this:

AffyPheno[AffyPheno$Pheno == "R", ]

if you want all columns. This again is something that 'An Introduction to R' will help with.

> > metadata = data.frame(labelDescrition = c("Status"),rownames=c("Phenotype"))
> > phenoData=new("AnnotatedDataFrame", data = AffyPheno, varMetadata = metadata)
>
> # As you can see, no error was thrown, but I was expecting something in the
> # varLabels and varMetadata descriptions.

I'd have to check to be sure, but I believe what you want for your metadata is to explain what the 'Pheno' column contains. So something like

metadata = data.frame(labelDescription = c("Phenotype"), row.names = "Pheno")

is, IIRC, correct. I'm actually surprised you didn't get an error. Martin Morgan may respond as well, and he knows better than, well, everybody about the ExpressionSet class, so he will know for sure.

> While I'm here, I was also wondering whether anyone knows of an annotation
> package, like the one for the hgu95av2 chip, for annotating chicken Affy data?

Um, what? Not sure what you want here. The hgu95av2 chip is designed for analyzing human samples, so there is nothing in there for chickens. If you have chicken Affy data, then you might want to look at the chicken annotation package, which _does_ annotate that chip.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
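The two subsetting points in the reply can be tried on a toy table. This is only a sketch: the sample names below are made up, and the table is a stand-in for the poster's AffyPheno, not the real data. One extra caveat that seems worth checking on your own data: because this table has a single column, a row filter written with only the trailing comma is also simplified to a vector, so drop=FALSE is used there as well.

```r
# Toy stand-in for the phenotype table in the thread (sample names are made up).
AffyPheno <- data.frame(
  Pheno = factor(c("R", "R", "R", "R", "S", "S", "S")),
  row.names = paste0("sample", 1:7)
)

# Selecting a single column simplifies the result to a plain factor by default ...
as_vector <- AffyPheno[c(3, 7), "Pheno"]

# ... while drop = FALSE keeps a data.frame, row names included.
as_frame <- AffyPheno[c(3, 7), "Pheno", drop = FALSE]

# A single index selects columns, so a row filter needs the comma.
# drop = FALSE is added because this table has only one column.
resistant <- AffyPheno[AffyPheno$Pheno == "R", , drop = FALSE]

print(as_vector)
print(as_frame)
print(resistant)
```

With more than one phenotype column, the plain AffyPheno[AffyPheno$Pheno == "R", ] form already returns a data.frame.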
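For the metadata question, the following sketch shows the layout that the ExpressionSet vignette uses for varMetadata: one row per phenotype variable, with the row named after the column it describes, and the text in a column called labelDescription. Note that the metadata in the question spells that column "labelDescrition" and puts the name in an ordinary column called rownames instead of in row.names, which would explain the NA description. The toy table, the sample names, and the description string are assumptions for illustration, and the Biobase step is skipped when the package is not installed.

```r
# Toy phenotype table; sample names are placeholders, not the poster's data.
AffyPheno <- data.frame(
  Pheno = factor(c("R", "R", "R", "R", "S", "S", "S")),
  row.names = paste0("sample", 1:7)
)

# One row per phenotype variable, named after the column it describes,
# with the description stored in a column called labelDescription.
metadata <- data.frame(
  labelDescription = "Phenotype: resistant (R) or susceptible (S)",
  row.names = "Pheno"
)

# The metadata row names should line up with the phenotype column names.
stopifnot(identical(rownames(metadata), colnames(AffyPheno)))

# Build the Biobase object only if the package is available.
if (requireNamespace("Biobase", quietly = TRUE)) {
  phenoData <- Biobase::AnnotatedDataFrame(data = AffyPheno,
                                           varMetadata = metadata)
  print(Biobase::varMetadata(phenoData))
}
```

The constructor function is used here in place of the new("AnnotatedDataFrame", ...) call from the thread so that the sketch runs without attaching Biobase; with library(Biobase) loaded, either form should work.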