Scott Ochsner
Dear BioC,
I would like feedback on whether the following procedure is an
appropriate way to produce a set of 1000 random gene lists, each of
length 2000. The idea is to use these random lists to assess how often
a random gene list of the same size as myCuratedList can reproduce or
improve on the classification performance of myCuratedList.
#remove myCuratedList from the universe of possible genes. The "eset"
#object is your standard ExpressionSet object.
>length(myCuratedList)
[1] 2000
>Index<-setdiff(1:length(rownames(exprs(eset))),myCuratedList)
>length(Index)
[1] 20277
#generate 1000 random gene lists using the genes in Index. The code
#for resample is taken from the help pages for sample.
>randomMatrix<-replicate(1000,resample(Index,2000))
>dim(randomMatrix)
[1] 2000 1000
I've verified that no column contains repeated genes, as should be the
case because resample draws without replacement by default.
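
For anyone who wants to try the steps above without my data, here is a minimal self-contained sketch on toy sizes. The resample helper is the one-liner from the Examples section of ?sample; nGenes, curatedIdx and the list counts are made-up stand-ins (assumptions, not my real eset or myCuratedList), and the curated list is assumed to hold row indices.

```r
# Toy stand-ins for the real objects (assumptions, not the actual data).
set.seed(1)              # only to make the sketch reproducible
nGenes <- 500            # hypothetical number of rows in the ExpressionSet
curatedIdx <- 1:50       # hypothetical row indices of the curated list

# Helper from the Examples section of ?sample; it stays safe even when
# x has length 1, where plain sample(x) would sample from 1:x instead.
resample <- function(x, ...) x[sample.int(length(x), ...)]

# Universe of candidate rows with the curated genes removed.
Index <- setdiff(seq_len(nGenes), curatedIdx)

# 20 random lists of 50 genes each, one list per column.
randomMatrix <- replicate(20, resample(Index, 50))

# Each column should hold distinct indices, none from the curated list.
noDuplicates <- all(apply(randomMatrix, 2, anyDuplicated) == 0)
noCurated <- !any(randomMatrix %in% curatedIdx)
```

Both noDuplicates and noCurated should come out TRUE; with the real data the same two checks apply with 1000 lists of 2000 genes.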
Is there a standard procedure for doing the above or is what I've done
kosher?
Scott A. Ochsner, Ph.D.
NURSA Bioinformatics
Molecular and Cellular Biology
Baylor College of Medicine
Houston, TX. 77030
phone: 713-798-6227