Hi,
I have a matrix of mutation counts as follows:
Motif     T1   T2   T3   T4   ...   T100
CA A.A     1    0    2    1   ...
CA A.C     1    2    2    1   ...
CA A.G     1    0    0    0   ...
CA A.T     5    0    0    0   ...
CA C.A    10    0    1    1   ...
...
When I tried to find the best number of signatures based on the RSS and explained-variance plots, it was 5. When I normalized the counts by the total number of events observed in each sample, i.e. dividing each column by its column sum, the best number of signatures became 10. Any idea why?
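To make the normalization step concrete, here is a minimal sketch in plain Python of dividing each column by its column sum. It uses only the first four columns of the excerpt shown in this post, not the real ~100-sample matrix. The function names (`column_sums`, `normalize_columns`, `squared_mass`) are made up for this illustration and do not come from any package, and the sketch assumes every sample has at least one event so that no column sum is zero. It also prints the sum of squared entries per sample, which is one possible reason an RSS-based choice could shift: on raw counts, samples with many events carry far more weight than on normalized counts.

```python
# Toy counts: first four columns of the excerpt (rows = motifs, columns = T1..T4).
# This is illustration data only, not the full matrix discussed in the thread.
counts = [
    [1, 0, 2, 1],   # CA A.A
    [1, 2, 2, 1],   # CA A.C
    [1, 0, 0, 0],   # CA A.G
    [5, 0, 0, 0],   # CA A.T
    [10, 0, 1, 1],  # CA C.A
]


def column_sums(matrix):
    """Total number of events per sample (one value per column)."""
    return [sum(col) for col in zip(*matrix)]


def normalize_columns(matrix):
    """Divide every entry by its column sum (assumes no column sums to zero)."""
    totals = column_sums(matrix)
    return [[value / total for value, total in zip(row, totals)] for row in matrix]


def squared_mass(matrix):
    """Sum of squared entries per column, i.e. each sample's share of the
    total sum of squares that a residual sum of squares is measured against."""
    return [sum(value * value for value in col) for col in zip(*matrix)]


raw_totals = column_sums(counts)
normalized = normalize_columns(counts)

print("events per sample:       ", raw_totals)
print("column sums after norm.: ", column_sums(normalized))
print("squared mass, raw:       ", squared_mass(counts))
print("squared mass, normalized:", squared_mass(normalized))
```

In this toy excerpt T1 has 18 events and T2 only 2, so T1 dominates any squared-error measure on the raw counts, while after normalization every column sums to 1. Whether this explains the change from 5 to 10 signatures in the real data cannot be judged from the excerpt alone.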
Thank you very much.
Can you please have a look at http://www.bioconductor.org/help/support/posting-guide/ again and provide a reproducible example? The motif matrix you show in your question has data for 4 samples, and hence it is impossible to decompose this into more than 4 signatures, while you state that you have 5.
Hi Julian,
Thanks for the reply. The matrix above is just an example; the original matrix I am using has ~100 samples.
Any idea why the number of signatures that best describes the data is different after normalizing by the column sums?
Please note that you haven't yet provided a concrete example with details. Without this, addressing the issue will hardly be possible, and pushing for an answer will not be very helpful either.