Daniel Brewer
Hello,
I am getting a bit confused about gene selection and machine learning, and I was wondering if you could help me out. I have a dataset that is classified into two groups, and my aim is to get a small number of genes (10-20) in a gene signature that I will, in theory, be able to apply to other datasets to optimally classify the samples. As I do not have separate test and training sets, I am using leave-one-out cross-validation to help determine the robustness. I have read that one should perform gene selection for each split of the samples, i.e.:
1) Select one group as the test set
2) On the remainder select genes
3) Apply machine learning algorithm
4) Test whether the test set is correctly classified
5) Go back to step 1 and repeat for the next split
If you do this, you might get different genes each time, so how do you
get your "final" optimal gene classifier?
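To make the procedure concrete, here is a minimal sketch of steps 1-5 in plain Python. Everything in it is an assumption for illustration only: the data are synthetic, the gene score (difference of class means scaled by spread) and the nearest-centroid classifier are simple stand-ins for whichever selection method and learning algorithm are actually used, and the function names are hypothetical. It also tallies how often each gene is picked across splits, which shows the "different genes each time" effect directly.

```python
import random
from collections import Counter
from statistics import mean, pstdev

def select_genes(X, y, k):
    """Rank genes by |mean difference| / spread between the two classes; return top-k indices."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        spread = pstdev(a) + pstdev(b) + 1e-9  # small constant avoids division by zero
        scores.append(abs(mean(a) - mean(b)) / spread)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]

def nearest_centroid_predict(X, y, genes, sample):
    """Assign the sample to the class whose centroid (on the selected genes) is closest."""
    centroids = {}
    for lab in set(y):
        rows = [row for row, l in zip(X, y) if l == lab]
        centroids[lab] = [mean(r[g] for r in rows) for g in genes]

    def sq_dist(lab):
        return sum((sample[g] - c) ** 2 for g, c in zip(genes, centroids[lab]))

    return min(centroids, key=sq_dist)

def loocv_with_selection(X, y, k):
    """Leave-one-out CV where gene selection is redone inside every split."""
    correct = 0
    selection_counts = Counter()
    for i in range(len(X)):
        # Step 1: hold out sample i as the test set
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        # Step 2: select genes using the training samples only
        genes = select_genes(X_train, y_train, k)
        selection_counts.update(genes)
        # Steps 3-4: fit the classifier and check the held-out sample
        if nearest_centroid_predict(X_train, y_train, genes, X[i]) == y[i]:
            correct += 1
    return correct / len(X), selection_counts

# Synthetic data (assumption): 20 samples, 50 genes, first 3 genes shifted in class 1
rng = random.Random(0)
n_genes, k = 50, 5
y = [0] * 10 + [1] * 10
X = [[rng.gauss(2.0 if (lab == 1 and g < 3) else 0.0, 1.0) for g in range(n_genes)]
     for lab in y]

accuracy, selection_counts = loocv_with_selection(X, y, k)
print("LOOCV accuracy:", accuracy)
print("Genes by number of splits in which they were selected:")
for gene, count in selection_counts.most_common():
    print(f"  gene {gene}: {count}/{len(X)}")
```

Because selection happens inside the loop, the accuracy is an estimate for the whole procedure (selection plus classifier) rather than for any one gene list, and the tally shows which genes are picked consistently and which only in some splits.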
Many thanks
Dan
--
**************************************************************
Daniel Brewer, Ph.D.
Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.brewer at icr.ac.uk
**************************************************************