Kasper Daniel Hansen
Adaikalavan Ramasamy <ramasamy@cancer.org.uk> writes:
> I do not know much about exprSet (please correct me if I am wrong), but I
> think of and treat an exprSet as a matrix. Indeed, in my previous message I
> was writing in the context of a matrix.
>
> data(affybatch.example)
> a <- rma(affybatch.example)
> m <- exprs(a)
>
> Then I work with 'm' which may or may not be what you want.
>
> If you want to force a matrix to exprSet, the examples in
> help("exprSet") might be helpful.
An exprSet is a matrix of expression values coupled with a data frame
of covariates. If you (original poster) look at the aforementioned
article, you will see that they use the original exprSet (let's call it
Edata) in the following way:
Xdata <- t(exprs(Edata))
Ydata <- pData(Edata)["y-values"]
So you do not really need the exprSet object, as it is only used to
get the matrix of expression values and the data frame of classes. Now,
given that you have a fit (which you have constructed using a training
data set with known classes), you predict the classes of the new samples
with something like
predict(fit, newdata=Xdata.test)
I suggest looking at the code and trying to separate the different
components.
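
To make the separation concrete, here is a minimal sketch with simulated data. The matrices below are stand-ins I made up for t(exprs(Edata)) and for the class column of pData(Edata); they are not from the article. It uses knn() from the class package, which takes the training matrix, the test matrix and the training labels directly, so the unlabelled samples never need a class or an exprSet. For a model with a fit object the pattern would be the predict(fit, newdata=Xdata.test) call above.

```r
library(class)  # recommended package, ships with standard R installations

set.seed(1)

# Simulated stand-ins (assumptions): 20 labelled training samples and
# 5 unlabelled samples, each measured on 50 genes. Rows are samples,
# which is the layout that t(exprs(Edata)) would give.
n.genes <- 50
Xdata.train <- matrix(rnorm(20 * n.genes), nrow = 20)
Ydata.train <- factor(rep(c("A", "B"), each = 10))

# Shift the first five genes in class B so there is something to learn.
in.B <- Ydata.train == "B"
Xdata.train[in.B, 1:5] <- Xdata.train[in.B, 1:5] + 3

# New samples: an expression matrix only, with no class labels anywhere.
Xdata.test <- matrix(rnorm(5 * n.genes), nrow = 5)
Xdata.test[, 1:5] <- Xdata.test[, 1:5] + 3  # simulated to resemble class B

# Classify each unlabelled sample by its 3 nearest training samples.
pred <- knn(train = Xdata.train, test = Xdata.test, cl = Ydata.train, k = 3)
print(pred)
```

The only thing the test samples must share with the training samples is the set of genes (columns), in the same order.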
/Kasper
> Regards, Adai.
>
>
> On Wed, 2004-07-28 at 14:09, Liu, Xin wrote:
>> Thanks Tom, Sean, Xavier for the reply, and especially Adai!
>> However I still have a problem. To put the microarray data into
>> these supervised clustering methods, the exprSet needs to be built. To
>> build the exprSet, you need to give the class of every sample. So when I
>> predict samples with unknown classes, how do I put them into the
>> exprSet? Thank you!
>>
>> Xin
>>
>>
>>
>> -----Original Message-----
>> From: Adaikalavan Ramasamy [mailto:ramasamy@cancer.org.uk]
>> Sent: 28 July 2004 13:00
>> To: Liu, Xin
>> Cc: Tom R. Fahland; BioConductor mailing list
>> Subject: Re: [BioC] KNN, SVM, and randomForest - How to predict test
>> without known categories
>>
>>
>> If algorithm 1 predicts "Yes", "Yes", "No", "No" for 4 samples and
>> algorithm 2 predicts "Yes", "No", "Yes", "No", how do you know which one
>> is the better algorithm? So you use test sets with known classes to do
>> this. You can do this by breaking your learning set (samples with known
>> classes) into training and test sets. Look up "cross validation".
>>
>> Some examples of built-in cross validation:
>> * knn.cv() is a leave-one-out cross-validation of knn()
>> * svm() in library(e1071) has an argument named 'cross' for cross
>> validation
>> In practice, I prefer to write my own wrapper for cross-validation to
>> ensure that the sampling method is the same across all algorithms.
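
A rough sketch of what such a wrapper might look like, on simulated data. The helper name cv.error() and the data are made up for illustration, and the two "algorithms" compared are just knn() with different k. The point is that the fold assignment is drawn once and reused, so every method is scored on identical splits.

```r
library(class)  # provides knn() and knn.cv()

set.seed(2)

# Simulated labelled learning set (assumption): 40 samples x 30 genes.
X <- matrix(rnorm(40 * 30), nrow = 40)
y <- factor(rep(c("No", "Yes"), each = 20))
X[y == "Yes", 1:4] <- X[y == "Yes", 1:4] + 2

# Assign every sample to one of 5 folds ONCE, so that each method
# is evaluated on exactly the same train/test splits.
n.folds <- 5
fold <- sample(rep(seq_len(n.folds), length.out = nrow(X)))

# cv.error() is a hypothetical helper: 'classify' must be a function of
# (train, cl, test) returning predicted classes for the rows of 'test'.
cv.error <- function(classify) {
  wrong <- 0
  for (f in seq_len(n.folds)) {
    held.out <- fold == f
    pred <- classify(X[!held.out, , drop = FALSE], y[!held.out],
                     X[held.out, , drop = FALSE])
    wrong <- wrong + sum(as.character(pred) != as.character(y[held.out]))
  }
  wrong / nrow(X)  # overall misclassification rate
}

err.knn1 <- cv.error(function(train, cl, test) knn(train, test, cl, k = 1))
err.knn5 <- cv.error(function(train, cl, test) knn(train, test, cl, k = 5))
print(c(knn1 = err.knn1, knn5 = err.knn5))

# Built-in leave-one-out alternative for knn only:
loo.err <- mean(as.character(knn.cv(X, y, k = 5)) != as.character(y))
```

Any other classifier can be plugged in by wrapping its fit and predict steps in a function with the same three arguments.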
>>
>> Once you have determined the best algorithm and features, you then use
>> predict() to predict samples with unknown classes.
>>
>> Regards, Adai.
>>
>>
>>
>> On Wed, 2004-07-28 at 09:18, Liu, Xin wrote:
>> > In R, before using KNN, SVM, and randomForest, an exprSet needs to be
>> > built, which requires the train set WITH known categories and the test
>> > set WITH known categories. However, by definition, in supervised
>> > learning you always train (with known categories), then predict the
>> > test WITHOUT known categories. I wonder how to implement this. Thank you!
>> >
>> > Xin
>> >
>> >
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: Tom R. Fahland [mailto:tfahland@genomatica.com]
>> > Sent: 27 July 2004 18:48
>> > To: Liu, Xin; bioconductor@stat.math.ethz.ch
>> > Subject: RE: [BioC] KNN, SVM,and randomForest - How to predict samples
>> > without category
>> >
>> >
>> > By definition, in supervised learning you always train (with known
>> > categories), then run your unbiased data through for prediction. Both CV
>> > and train/test partitions are good for choosing parameters and
>> > optimizing the algorithms. I have just completed a study predicting dose
>> > exposure with good results using different algorithms.
>> > Tom
>> >
>> > -----Original Message-----
>> > From: bioconductor-bounces@stat.math.ethz.ch
>> > [mailto:bioconductor-bounces@stat.math.ethz.ch] On Behalf Of Liu, Xin
>> > Sent: Tuesday, July 27, 2004 07:39
>> > To: bioconductor@stat.math.ethz.ch
>> > Subject: [BioC] KNN, SVM,and randomForest - How to predict samples
>> > without category
>> >
>> >
>> > Dear all,
>> >
>> > Supervised clusterings (KNN, SVM, and randomForest) use a test sample set
>> > and a train sample set to do prediction. To create the exprSet, the
>> > category is needed for each sample. However, sometimes we need to predict
>> > a sample without its category. Does anybody have a clue how to do this?
>> > Thank you very much!
>> >
>> > Best regards,
>> > Xin LIU
>> >
>> >
>> >
>> > This e-mail is from ArraGen Ltd.
>> >
>> > _______________________________________________
>> > Bioconductor mailing list
>> > Bioconductor@stat.math.ethz.ch
>> > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>> >
>>
>
--
Kasper Daniel Hansen, Research Assistant
Department of Biostatistics, University of Copenhagen